DPDK patches and discussions
From: "lihuisong (C)" <lihuisong@huawei.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>,
	"Min Hu (Connor)" <humin29@huawei.com>, <dev@dpdk.org>
Cc: <thomas@monjalon.net>, <xiaoyun.li@intel.com>,
	Radu Nicolau <radu.nicolau@intel.com>,
	"Singh, Aman Deep" <aman.deep.singh@intel.com>
Subject: Re: [dpdk-dev] [PATCH 1/3] app/testpmd: fix port status of active slave device
Date: Tue, 8 Feb 2022 09:19:45 +0800
Message-ID: <c12c4c70-2c21-0e3a-2153-865be6e1f698@huawei.com>
In-Reply-To: <a793b6f1-4820-f4cd-7119-e14ccd7f6700@intel.com>


On 2022/2/4 20:07, Ferruh Yigit wrote:
> On 10/25/2021 7:39 AM, Min Hu (Connor) wrote:
>> From: Huisong Li <lihuisong@huawei.com>
>>
>> Stopping a bond device also stops all active slaves under the bond 
>> device.
>> If this port is bond device, we need to modify the port status of all
>> slaves from RTE_PORT_STARTED to RTE_PORT_STOPPED.
>>
>> Fixes: 0e545d3047fe ("app/testpmd: check stopping port is not in 
>> bonding")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
>> ---
>>   app/test-pmd/cmdline.c |  1 +
>>   app/test-pmd/testpmd.c | 49 +++++++++++++++++++++++++++++++++++++++---
>>   app/test-pmd/testpmd.h |  3 ++-
>>   3 files changed, 49 insertions(+), 4 deletions(-)
>>
>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
>> index 722f4fb9d9..5bfb4b509b 100644
>> --- a/app/test-pmd/cmdline.c
>> +++ b/app/test-pmd/cmdline.c
>> @@ -6639,6 +6639,7 @@ static void 
>> cmd_create_bonded_device_parsed(void *parsed_result,
>>                   "Failed to enable promiscuous mode for port %u: %s 
>> - ignore\n",
>>                   port_id, rte_strerror(-ret));
>>   +        ports[port_id].bond_flag = 1;
>>           ports[port_id].need_setup = 0;
>>           ports[port_id].port_status = RTE_PORT_STOPPED;
>>       }
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index af0e79fe6d..d6b9ebc4dd 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -65,6 +65,9 @@
>>   #ifdef RTE_EXEC_ENV_WINDOWS
>>   #include <process.h>
>>   #endif
>> +#ifdef RTE_NET_BOND
>> +#include <rte_eth_bond.h>
>> +#endif
>>     #include "testpmd.h"
>>   @@ -2986,6 +2989,35 @@ start_port(portid_t pid)
>>       return 0;
>>   }
>>   +#ifdef RTE_NET_BOND
>> +static void
>> +change_bonding_active_slave_port_status(portid_t bond_pid)
>
> The function sets the status explicitly to PORT_STOPPED, but function
> name is more generic, should we update the function name to reflect the
> functionality?
ok
>
>> +{
>> +    portid_t slave_pids[RTE_MAX_ETHPORTS];
>> +    struct rte_port *port;
>> +    int num_active_slaves;
>> +    portid_t slave_pid;
>> +    int i;
>> +
>> +    num_active_slaves = rte_eth_bond_active_slaves_get(bond_pid, 
>> slave_pids,
>> +                               RTE_MAX_ETHPORTS);
>> +    if (num_active_slaves < 0) {
>> +        fprintf(stderr, "Failed to get slave list for port = %u\n",
>> +            bond_pid);
>> +        return;
>> +    }
>> +
>> +    for (i = 0; i < num_active_slaves; i++) {
>> +        slave_pid = slave_pids[i];
>> +        port = &ports[slave_pid];
>> +        if (rte_atomic16_cmpset(&(port->port_status),
>> +            RTE_PORT_STARTED, RTE_PORT_STOPPED) == 0)
>> +            fprintf(stderr, "Port %u can not be set into stopped\n",
>> +                slave_pid);
>> +    }
>> +}
>> +#endif
>> +
>>   void
>>   stop_port(portid_t pid)
>>   {
>> @@ -3042,9 +3074,20 @@ stop_port(portid_t pid)
>>           if (port->flow_list)
>>               port_flow_flush(pi);
>>   -        if (eth_dev_stop_mp(pi) != 0)
>> -            RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port %u\n",
>> -                pi);
>
> Can you please remove the 'eth_dev_stop_mp()' function in this patch,
> which is removed in patch 2/3.
ok
>
>> +        if (is_proc_primary()) {
>> +#ifdef RTE_NET_BOND
>> +            /*
>> +             * Stopping a bond device also stops all active slaves
>> +             * under the bond device. If this port is bond device,
>> +             * we need to modify the port status of all slaves.
>> +             */
>> +            if (port->bond_flag == 1)
>> +                change_bonding_active_slave_port_status(pi);
>> +#endif
>> +            if (rte_eth_dev_stop(pi) != 0)
>> +                RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port 
>> %u\n",
>> +                    pi);
>
> Should we roll back the slave port status if 'rte_eth_dev_stop(pi)' 
> fails?
Yes, a rollback is necessary here, to cover the case where a slave fails to
execute dev_stop() in the bonding driver.
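
A minimal sketch of such a rollback, written as a standalone model rather
than as testpmd code. The port_status array, the PORT_* states, the helper
names and the dev_stop callback are all hypothetical stand-ins (the callback
plays the role of rte_eth_dev_stop()); only the idea of restoring the slave
status on failure comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for testpmd's port states (hypothetical). */
enum port_state { PORT_STOPPED, PORT_STARTED };

#define MAX_PORTS 8

static enum port_state port_status[MAX_PORTS];

/*
 * Mark every started slave as stopped and remember which ones were
 * changed, so that exactly those can be restored later.
 */
static size_t
mark_slaves_stopped(const uint16_t *slaves, size_t n, uint16_t *changed)
{
	size_t i, nb_changed = 0;

	for (i = 0; i < n; i++) {
		assert(slaves[i] < MAX_PORTS);
		if (port_status[slaves[i]] == PORT_STARTED) {
			port_status[slaves[i]] = PORT_STOPPED;
			changed[nb_changed++] = slaves[i];
		}
	}
	return nb_changed;
}

/* Undo mark_slaves_stopped() for the recorded slaves only. */
static void
restore_slaves_started(const uint16_t *changed, size_t nb_changed)
{
	size_t i;

	for (i = 0; i < nb_changed; i++)
		port_status[changed[i]] = PORT_STARTED;
}

/*
 * Stop a bonding port: update the slave status first, then roll it
 * back if the stop callback reports a failure.
 */
static int
stop_bond_with_rollback(const uint16_t *slaves, size_t n,
			int (*dev_stop)(void))
{
	uint16_t changed[MAX_PORTS];
	size_t nb_changed;
	int ret;

	assert(n <= MAX_PORTS);
	nb_changed = mark_slaves_stopped(slaves, n, changed);
	ret = dev_stop();
	if (ret != 0)
		restore_slaves_started(changed, nb_changed);
	return ret;
}

/* Hypothetical stop callbacks used to exercise both paths. */
static int stop_ok(void) { return 0; }
static int stop_fail(void) { return -1; }
```

Recording which slaves were actually changed keeps the rollback from
touching a slave that was already stopped before the bonding port was
stopped.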

Btw, while thinking about this, I found a behavior that does not seem
reasonable: only active slaves are stopped when a bonding device is stopped.
This can cause confusion about port status. For example, applications have to
set only the active slaves' status to RTE_PORT_STOPPED, while the status of
non-active slaves remains RTE_PORT_STARTED.
I think the bonding PMD should stop all slaves when a bonding device is
stopped.
I checked the modification history of this code in the bonding PMD. This
behavior was introduced by the following patch.

/*
commit 0911d4ec01839c9149a0df5758d00d9d57a47cea
Author: Radu Nicolau <radu.nicolau@intel.com>
Date:   Thu Nov 8 15:26:42 2018 +0000

     net/bonding: fix crash when stopping mode 4 port

     When stopping a bonded port all slaves are deactivated. Attempting
     to deactivate a slave that was never activated will result in a segfault
     when mode 4 is used.

     Fixes: 7486331308f6 ("net/bonding: stop and deactivate slaves on stop")
     Cc: stable@dpdk.org

     Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
     Acked-by: Chas Williams <chas3@att.com>
*/

The root cause of the crash described in the above patch is that, in mode 4,
the bonding PMD does not allocate rx/tx rings for non-active slave devices.
The call stack is as follows:
#0  0x0000000000b1250c in rte_ring_dequeue_bulk_elem (available=0x0, n=1, esize=8, obj_table=0xffffffff7c80, r=0x0) at ../dpdk-next-net/lib/ring/rte_ring_elem.h:380
#1  rte_ring_dequeue_elem (esize=8, obj_p=0xffffffff7c80, r=0x0) at ../dpdk-next-net/lib/ring/rte_ring_elem.h:476
#2  rte_ring_dequeue (obj_p=0xffffffff7c80, r=0x0) at ../dpdk-next-net/lib/ring/rte_ring.h:463
#3  bond_mode_8023ad_deactivate_slave (bond_dev=0x4753200 <rte_eth_devices+33024>, slave_id=0) at ../dpdk-next-net/drivers/net/bonding/rte_eth_bond_8023ad.c:1163
#4  0x0000000000b29e10 in deactivate_slave (eth_dev=0x4753200 <rte_eth_devices+33024>, port_id=0) at ../dpdk-next-net/drivers/net/bonding/rte_eth_bond_api.c:117
#5  0x0000000000b44208 in bond_ethdev_stop (eth_dev=0x4753200 <rte_eth_devices+33024>) at ../dpdk-next-net/drivers/net/bonding/rte_eth_bond_pmd.c:2103
#6  0x00000000007966fc in rte_eth_dev_stop (port_id=2) at ../dpdk-next-net/lib/ethdev/rte_ethdev.c:1894
#7  0x000000000055ea60 in eth_dev_stop_mp (port_id=2) at ../dpdk-next-net/app/test-pmd/testpmd.c:613
#8  0x0000000000565230 in stop_port (pid=2) at ../dpdk-next-net/app/test-pmd/testpmd.c:3059
#9  0x00000000004f7614 in cmd_operate_specific_port_parsed (parsed_result=0xffffffff91b0, cl=0x4829250, data=0x0) at ../dpdk-next-net/app/test-pmd/cmdline.c:1261
#10 0x000000000078be24 in cmdline_parse (cl=0x4829250, buf=0x4829298 "port stop 2\n") at ../dpdk-next-net/lib/cmdline/cmdline_parse.c:290
#11 0x0000000000789c34 in cmdline_valid_buffer (rdl=0x4829260, buf=0x4829298 "port stop 2\n", size=13) at ../dpdk-next-net/lib/cmdline/cmdline.c:26
#12 0x000000000078f160 in rdline_char_in (rdl=0x4829260, c=10 '\n') at ../dpdk-next-net/lib/cmdline/cmdline_rdline.c:446
#13 0x000000000078a0c8 in cmdline_in (cl=0x4829250, buf=0xfffffffff2e7 "\n", size=1) at ../dpdk-next-net/lib/cmdline/cmdline.c:148
#14 0x000000000078a3b4 in cmdline_interact (cl=0x4829250) at ../dpdk-next-net/lib/cmdline/cmdline.c:222
#15 0x000000000050bf98 in prompt () at ../dpdk-next-net/app/test-pmd/cmdline.c:18001
#16 0x00000000005687c4 in main (argc=4, argv=0xfffffffff510) at ../dpdk-next-net/app/test-pmd/testpmd.c:4268

For the problem Radu encountered, we only need to ensure that
non-active slaves are not deactivated.
I plan to add a patch to this patchset to fix this problem.
What do you think, Ferruh?
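
The idea can be sketched roughly as follows. This is a standalone model, not
the bonding PMD code: the deactivated[] and stopped[] arrays and both
function names are hypothetical, and the two assignments in the loop only
stand in for deactivate_slave() and for stopping the slave port. Every slave
is stopped, but only slaves found in the active list are deactivated, so the
mode 4 path that dequeues from a ring is never reached for a slave that was
never activated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 8

/* Records of what was done to each port (hypothetical bookkeeping). */
static bool deactivated[MAX_PORTS];
static bool stopped[MAX_PORTS];

/* Return true if port_id appears in the active slave list. */
static bool
slave_is_active(const uint16_t *active, size_t nb_active, uint16_t port_id)
{
	size_t i;

	for (i = 0; i < nb_active; i++)
		if (active[i] == port_id)
			return true;
	return false;
}

/*
 * Stop every slave of a bonding device, but deactivate only the
 * slaves that are currently active.
 */
static void
stop_all_slaves(const uint16_t *slaves, size_t nb_slaves,
		const uint16_t *active, size_t nb_active)
{
	size_t i;

	for (i = 0; i < nb_slaves; i++) {
		uint16_t id = slaves[i];

		assert(id < MAX_PORTS);
		if (slave_is_active(active, nb_active, id))
			deactivated[id] = true; /* stands in for deactivate_slave() */
		stopped[id] = true; /* stands in for stopping the slave port */
	}
}
```

With this split, the application can set every slave to RTE_PORT_STOPPED
after the bonding device is stopped, without having to distinguish active
from non-active slaves.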
>
>> +        }
>>             if (rte_atomic16_cmpset(&(port->port_status),
>>               RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
>> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
>> index e3995d24ab..ad3b4f875c 100644
>> --- a/app/test-pmd/testpmd.h
>> +++ b/app/test-pmd/testpmd.h
>> @@ -237,7 +237,8 @@ struct rte_port {
>>       struct rte_eth_txconf tx_conf[RTE_MAX_QUEUES_PER_PORT+1]; /**< 
>> per queue tx configuration */
>>       struct rte_ether_addr   *mc_addr_pool; /**< pool of multicast 
>> addrs */
>>       uint32_t                mc_addr_nb; /**< nb. of addr. in 
>> mc_addr_pool */
>> -    uint8_t                 slave_flag; /**< bonding slave port */
>> +    uint8_t                 slave_flag : 1, /**< bonding slave port */
>> +                bond_flag : 1; /**< port is bond device */
>
> Can't we detect if the port is a bonding port without introducing a new
> variable/state?
The bonding device is also an ethdev. I could not find an external API that
can be used to detect whether a port is a bonding port.
>
>>       struct port_flow        *flow_list; /**< Associated flows. */
>>       struct port_indirect_action *actions_list;
>>       /**< Associated indirect actions. */
>
> .
