From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A3F15A0C4A; Thu, 8 Jul 2021 14:51:40 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 88CFE4069C; Thu, 8 Jul 2021 14:51:40 +0200 (CEST) Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by mails.dpdk.org (Postfix) with ESMTP id A10F94014F for ; Thu, 8 Jul 2021 14:51:38 +0200 (CEST) Received: from dggeme756-chm.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4GLGLH3Kb8zbbZ0; Thu, 8 Jul 2021 20:48:23 +0800 (CST) Received: from [10.67.103.128] (10.67.103.128) by dggeme756-chm.china.huawei.com (10.3.19.102) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Thu, 8 Jul 2021 20:51:35 +0800 To: Andrew Rybchenko , Xiaoyun Li , Anatoly Burakov CC: , Lijun Ou , Ajit Khaparde , Ferruh Yigit References: <1615430867-29992-1-git-send-email-humin29@huawei.com> <20210702120906.705007-1-Andrew.Rybchenko@oktetlabs.ru> <3213dc8b-f40a-cfbe-aa69-7f8925773bfe@huawei.com> From: "Min Hu (Connor)" Message-ID: <236a2d9b-7756-f349-f5e2-f6cebdd80df0@huawei.com> Date: Thu, 8 Jul 2021 20:51:35 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.3.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 8bit X-Originating-IP: [10.67.103.128] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggeme756-chm.china.huawei.com (10.3.19.102) X-CFilter-Loop: Reflected Subject: Re: [dpdk-dev] [PATCH v15] app/testpmd: support multi-process X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi,Andrew , 在 2021/7/8 20:30, Andrew Rybchenko 写道: > On 7/8/21 3:20 PM, Min Hu (Connor) wrote: >> Hi, Andrew , >> >> 在 2021/7/2 20:47, Andrew Rybchenko 写道: >>> On 7/2/21 3:09 PM, Andrew Rybchenko wrote: >>>> From: "Min Hu (Connor)" >>>> >>>> For example the following commands run two testpmd processes: >>>> >>>>   * the primary process: >>>> >>>> ./dpdk-testpmd --proc-type=auto -l 0-1 -- -i \ >>>>     --rxq=4 --txq=4 --num-procs=2 --proc-id=0 >>>> >>>>   * the secondary process: >>>> >>>> ./dpdk-testpmd --proc-type=auto -l 2-3 -- -i \ >>>>     --rxq=4 --txq=4 --num-procs=2 --proc-id=1 >>>> >>>> Signed-off-by: Min Hu (Connor) >>>> Signed-off-by: Lijun Ou >>>> Signed-off-by: Andrew Rybchenko >>>> Acked-by: Xiaoyun Li >>>> Acked-by: Ajit Khaparde >>>> Reviewed-by: Ferruh Yigit >>> >>> [snip] >>> >>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c >>>> index 1cdd3cdd1..a5da0c272 100644 >>>> --- a/app/test-pmd/testpmd.c >>>> +++ b/app/test-pmd/testpmd.c >>>> @@ -520,6 +520,62 @@ enum rte_eth_rx_mq_mode rx_mq_mode = >>>> ETH_MQ_RX_VMDQ_DCB_RSS; >>>>    */ >>>>   uint32_t eth_link_speed; >>>>   +/* >>>> + * ID of the current process in multi-process, used to >>>> + * configure the queues to be polled. >>>> + */ >>>> +int proc_id; >>>> + >>>> +/* >>>> + * Number of processes in multi-process, used to >>>> + * configure the queues to be polled. >>>> + */ >>>> +unsigned int num_procs = 1; >>>> + >>>> +static int >>>> +eth_dev_configure_mp(uint16_t port_id, uint16_t nb_rx_q, uint16_t >>>> nb_tx_q, >>>> +              const struct rte_eth_conf *dev_conf) >>>> +{ >>>> +    if (is_proc_primary()) >>>> +        return rte_eth_dev_configure(port_id, nb_rx_q, nb_tx_q, >>>> +                    dev_conf); >>>> +    return 0; >>>> +} >>>> + >>>> +static int >>>> +eth_dev_start_mp(uint16_t port_id) >>>> +{ >>>> +    if (is_proc_primary()) >>>> +        return rte_eth_dev_start(port_id); >>>> + >>>> +    return 0; >>>> +} >>>> + >>>> +static int >>>> +eth_dev_stop_mp(uint16_t port_id) >>>> +{ >>>> +    if (is_proc_primary()) >>>> +        return rte_eth_dev_stop(port_id); >>>> + >>>> +    return 0; >>>> +} >>>> + >>>> +static void >>>> +mempool_free_mp(struct rte_mempool *mp) >>>> +{ >>>> +    if (is_proc_primary()) >>>> +        rte_mempool_free(mp); >>>> +} >>>> + >>>> +static int >>>> +eth_dev_set_mtu_mp(uint16_t port_id, uint16_t mtu) >>>> +{ >>>> +    if (is_proc_primary()) >>>> +        return rte_eth_dev_set_mtu(port_id, mtu); >>>> + >>>> +    return 0; >>>> +} >>>> + >>> >>> I think above functions should be removed and corresponding >>> checks should be done in caller directly since above functions >>> are used in single place only and just hide what actually >>> happens in the case of secondary process. It is very >>> misleading. >>> >> This was done as Ferruh suggested in V9, and this could reduce >> the complexity for testpmd when added by the multi-process support. >>> [snip] >>> >>>> @@ -2495,21 +2565,24 @@ start_port(portid_t pid) >>>>                   return -1; >>>>               } >>>>               /* configure port */ >>>> -            diag = rte_eth_dev_configure(pi, nb_rxq + nb_hairpinq, >>>> -                             nb_txq + nb_hairpinq, >>>> -                             &(port->dev_conf)); >>>> +            diag = eth_dev_configure_mp(pi, >>>> +                         nb_rxq + nb_hairpinq, >>>> +                         nb_txq + nb_hairpinq, >>>> +                         &(port->dev_conf)); >>>>               if (diag != 0) { >>>> -                if (rte_atomic16_cmpset(&(port->port_status), >>>> -                RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0) >>>> -                    printf("Port %d can not be set back " >>>> -                            "to stopped\n", pi); >>>> +                if (rte_atomic16_cmpset( >>>> +                        &(port->port_status), >>>> +                        RTE_PORT_HANDLING, >>>> +                        RTE_PORT_STOPPED) == 0) >>>> +                    printf("Port %d cannot be set back to stopped\n", >>>> +                        pi); >>> >>> Unrelated changes in the patch should be avoided since >>> it just makes the review harder.This will be fixed in v16. >>> >>>>                   printf("Fail to configure port %d\n", pi); >>>>                   /* try to reconfigure port next time */ >>>>                   port->need_reconfig = 1; >>>>                   return -1; >>>>               } >>>>           } >>>> -        if (port->need_reconfig_queues > 0) { >>>> +        if (port->need_reconfig_queues > 0 && is_proc_primary()) { >>>>               port->need_reconfig_queues = 0; >>>>               /* setup tx queues */ >>>>               for (qi = 0; qi < nb_txq; qi++) { >>>> @@ -2532,8 +2605,8 @@ start_port(portid_t pid) >>>>                   if (rte_atomic16_cmpset(&(port->port_status), >>>>                               RTE_PORT_HANDLING, >>>>                               RTE_PORT_STOPPED) == 0) >>>> -                    printf("Port %d can not be set back " >>>> -                            "to stopped\n", pi); >>>> +                    printf("Port %d cannot be set back to stopped\n", >>>> +                        pi); >>> >>> Unrelated changes in the patch should be avoided. >> This will be fixed in v16. >>> >>>>                   printf("Fail to configure port %d tx queues\n", >>>>                          pi); >>>>                   /* try to reconfigure queues next time */ >>>> @@ -2610,16 +2683,16 @@ start_port(portid_t pid) >>>>           cnt_pi++; >>>>             /* start port */ >>>> -        diag = rte_eth_dev_start(pi); >>>> +        diag = eth_dev_start_mp(pi); >>>>           if (diag < 0) { >>>>               printf("Fail to start port %d: %s\n", pi, >>>>                      rte_strerror(-diag)); >>>>                 /* Fail to setup rx queue, return */ >>>>               if (rte_atomic16_cmpset(&(port->port_status), >>>> -                RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0) >>>> -                printf("Port %d can not be set back to " >>>> -                            "stopped\n", pi); >>>> +            RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0) >>>> +                printf("Port %d cannot be set back to stopped\n", >>>> +                       pi); >>> >>> Unrelated changes in the patch should be avoided. >> This will be fixed in v16. >>> >>> [snip] >>> >>>> diff --git a/doc/guides/testpmd_app_ug/run_app.rst >>>> b/doc/guides/testpmd_app_ug/run_app.rst >>>> index eb4831835..348e5fcac 100644 >>>> --- a/doc/guides/testpmd_app_ug/run_app.rst >>>> +++ b/doc/guides/testpmd_app_ug/run_app.rst >>>> @@ -545,3 +545,85 @@ The command line options are: >>>>       bit 0 - two hairpin ports loop >>>>         The default value is 0. Hairpin will use single port mode and >>>> implicit Tx flow mode. >>>> + >>>> + >>>> +Testpmd Multi-Process Command-line Options >>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>> + >>>> +The following are the command-line options for testpmd multi-process >>>> support: >>>> + >>>> +*   primary process: >>>> + >>>> +.. code-block:: console >>>> + >>>> +       sudo ./dpdk-testpmd --proc-type=auto -l 0-1 -- -i --rxq=4 >>>> --txq=4 \ >>>> +            --num-procs=2 --proc-id=0 >>>> + >>>> +*   secondary process: >>>> + >>>> +.. code-block:: console >>>> + >>>> +       sudo ./dpdk-testpmd --proc-type=auto -l 2-3 -- -i --rxq=4 >>>> --txq=4 \ >>>> +            --num-procs=2 --proc-id=1 >>>> + >>>> +The command line options are: >>>> + >>>> +*   ``--num-procs=N`` >>>> + >>>> +    The number of processes which will be used. >>>> + >>>> +*   ``--proc-id=ID`` >>>> + >>>> +    The ID of the current process (ID < num-procs). ID should be >>>> different in >>>> +    primary process and secondary process, which starts from '0'. >>>> + >>>> +Calculation rule for queue: >>>> +All queues are allocated to different processes based on >>>> ``proc_num`` and >>>> +``proc_id``. >>>> +Calculation rule for the testpmd to allocate queues to each process: >>>> + >>>> +* start(queue start id) = proc_id * nb_q / num_procs >>>> + >>>> +* end(queue end id) = start + nb_q / num_procs >>>> + >>>> +For example, if testpmd is configured to have 4 Tx and Rx queues, >>>> +queues 0 and 1 will be used by the primary process and >>>> +queues 2 and 3 will be used by the secondary process. >>>> + >>>> +The number of queues should be a multiple of the number of >>>> processes. If not, >>>> +redundant queues will exist after queues are allocated to processes. >>>> If RSS >>>> +is enabled, packet loss occurs when traffic is sent to all processes >>>> at the same >>>> +time. Some traffic goes to redundant queues and cannot be forwarded. >>>> + >>>> +All the dev ops is supported in primary process. While secondary >>>> process is >>>> +not permitted to allocate or release shared memory, so some ops are >>>> not supported >>>> +as follows: >>>> + >>>> +- ``dev_configure`` >>>> +- ``dev_start`` >>>> +- ``dev_stop`` >>>> +- ``rx_queue_setup`` >>>> +- ``tx_queue_setup`` >>>> +- ``rx_queue_release`` >>>> +- ``tx_queue_release`` >>>> + >>>> +So, any command from testpmd which calls those APIs will not be >>>> supported in >>>> +secondary process, like: >>>> + >>>> +.. code-block:: console >>>> + >>>> +    port config all rxq|txq|rxd|txd >>>> +    port config rx_offload xxx on/off >>>> +    port config tx_offload xxx on/off >>>> + >>>> +etc. >>> >>> I did the formatting cleanup, but I still think that testpmd >>> guide should not dive into such level of details. It should >>> rather highlight multi-process behaviour specifics. >>> >>> Shouldn't testpmd store state in shared memory to avoid >>> problems when primary is stopped while secondary is running >> This could be taken into consideration in future. >> >>> >>> Some testpmd features rely on reconfigure (i.e. simply change >>> configuration and set flag that reconfigure is required), but >>> configure does nothing and will simply ignore new settings. >>> So, it could look very-very confusing from user point of view. >>> >>> I'm not sure that it is acceptable to apply the patch in such >>> state and open huge number of bugs in testpmd behaviour when >>> multi-process is used. >>> >>> I'd even consider to exclude unsupported commands from help >>> etc. However, such level of care about user could be excessive >>> for test tool. >> This has been done in doc. >>> >>> IMHO, it should be no requirement to repeat the primary >>> process command-line configuration in the second process >>> command line (see --rxq=4 --txq=4 above). The information >>> should be obtained from shared state. In theory primary >>> process could even change some settings in interactive >> We think keeping the command line in consistent between primary >> and secondary is easy to understand for users. While shared memory for >> keeping in order or communicating could be performed,but this could >> be done in future patch. >> >>> mode. I think testpmd should guarantee consistent behaviour >>> even in such conditions. I.e. do not allow to stop ports >>> used by forwarding running in secondary processes. >>> Run-time queues setup and deferred start should be very >>> carefully handled as well. >>  ``dev_stop`` is not allowed in secondary, which has described in doc. > > I'm talking about dev_stop in primary while secondary is > running. > >>>> + >>>> +Stats is supported, stats will not change when one quits and starts, >>>> as they >>>> +share the same buffer to store the stats. Flow rules are maintained >>>> in process >>>> +level: primary and secondary has its own flow list (but one flow >>>> list in HW). >>>> +The two can see all the queues, so setting the flow rules for the >>>> other is OK. >>>> +But in the testpmd primary process receiving or transmitting packets >>>> from the >>>> +queue allocated for secondary process is not permitted, and same for >>>> secondary >>>> +process. >>>> + >>>> +Flow API and RSS are supported. >>>> Thanks for your comment, >> This patch supports basic function for multi-process support in testpmd. >> I think other patches in future could enhance or optimize it, thanks. > > IMHO, as I state above, current state is insufficient to > consider is a start point to be applied. OK, what is the critical(the most important) as a start point to be applied? I will take action to fix it,as I will the patch could be applied before V21.08. > . >