From: "lihuisong (C)" <lihuisong@huawei.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>,
"Min Hu (Connor)" <humin29@huawei.com>, <dev@dpdk.org>
Cc: <thomas@monjalon.net>, <xiaoyun.li@intel.com>,
Radu Nicolau <radu.nicolau@intel.com>,
"Singh, Aman Deep" <aman.deep.singh@intel.com>
Subject: Re: [dpdk-dev] [PATCH 1/3] app/testpmd: fix port status of active slave device
Date: Tue, 8 Feb 2022 09:19:45 +0800
Message-ID: <c12c4c70-2c21-0e3a-2153-865be6e1f698@huawei.com>
In-Reply-To: <a793b6f1-4820-f4cd-7119-e14ccd7f6700@intel.com>
On 2022/2/4 20:07, Ferruh Yigit wrote:
> On 10/25/2021 7:39 AM, Min Hu (Connor) wrote:
>> From: Huisong Li <lihuisong@huawei.com>
>>
>> Stopping a bond device also stops all active slaves under the bond
>> device.
>> If this port is bond device, we need to modify the port status of all
>> slaves from RTE_PORT_STARTED to RTE_PORT_STOPPED.
>>
>> Fixes: 0e545d3047fe ("app/testpmd: check stopping port is not in
>> bonding")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
>> ---
>> app/test-pmd/cmdline.c | 1 +
>> app/test-pmd/testpmd.c | 49 +++++++++++++++++++++++++++++++++++++++---
>> app/test-pmd/testpmd.h | 3 ++-
>> 3 files changed, 49 insertions(+), 4 deletions(-)
>>
>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
>> index 722f4fb9d9..5bfb4b509b 100644
>> --- a/app/test-pmd/cmdline.c
>> +++ b/app/test-pmd/cmdline.c
>> @@ -6639,6 +6639,7 @@ static void
>> cmd_create_bonded_device_parsed(void *parsed_result,
>> "Failed to enable promiscuous mode for port %u: %s
>> - ignore\n",
>> port_id, rte_strerror(-ret));
>> + ports[port_id].bond_flag = 1;
>> ports[port_id].need_setup = 0;
>> ports[port_id].port_status = RTE_PORT_STOPPED;
>> }
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index af0e79fe6d..d6b9ebc4dd 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -65,6 +65,9 @@
>> #ifdef RTE_EXEC_ENV_WINDOWS
>> #include <process.h>
>> #endif
>> +#ifdef RTE_NET_BOND
>> +#include <rte_eth_bond.h>
>> +#endif
>> #include "testpmd.h"
>> @@ -2986,6 +2989,35 @@ start_port(portid_t pid)
>> return 0;
>> }
>> +#ifdef RTE_NET_BOND
>> +static void
>> +change_bonding_active_slave_port_status(portid_t bond_pid)
>
> The function sets the status explicitly to PORT_STOPPED, but function
> name is more generic, should we update the function name to reflect the
> functionality?
ok
>
>> +{
>> + portid_t slave_pids[RTE_MAX_ETHPORTS];
>> + struct rte_port *port;
>> + int num_active_slaves;
>> + portid_t slave_pid;
>> + int i;
>> +
>> + num_active_slaves = rte_eth_bond_active_slaves_get(bond_pid,
>> slave_pids,
>> + RTE_MAX_ETHPORTS);
>> + if (num_active_slaves < 0) {
>> + fprintf(stderr, "Failed to get slave list for port = %u\n",
>> + bond_pid);
>> + return;
>> + }
>> +
>> + for (i = 0; i < num_active_slaves; i++) {
>> + slave_pid = slave_pids[i];
>> + port = &ports[slave_pid];
>> + if (rte_atomic16_cmpset(&(port->port_status),
>> + RTE_PORT_STARTED, RTE_PORT_STOPPED) == 0)
>> + fprintf(stderr, "Port %u can not be set into stopped\n",
>> + slave_pid);
>> + }
>> +}
>> +#endif
>> +
>> void
>> stop_port(portid_t pid)
>> {
>> @@ -3042,9 +3074,20 @@ stop_port(portid_t pid)
>> if (port->flow_list)
>> port_flow_flush(pi);
>> - if (eth_dev_stop_mp(pi) != 0)
>> - RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port %u\n",
>> - pi);
>
> Can you please remove the 'eth_dev_stop_mp()' function in this patch,
> which is removed in patch 2/3.
ok
>
>> + if (is_proc_primary()) {
>> +#ifdef RTE_NET_BOND
>> + /*
>> + * Stopping a bond device also stops all active slaves
>> + * under the bond device. If this port is bond device,
>> + * we need to modify the port status of all slaves.
>> + */
>> + if (port->bond_flag == 1)
>> + change_bonding_active_slave_port_status(pi);
>> +#endif
>> + if (rte_eth_dev_stop(pi) != 0)
>> + RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port
>> %u\n",
>> + pi);
>
> Should we roll back the slave port status if 'rte_eth_dev_stop(pi)'
> fails?
Yes, a rollback is necessary here, for the case where a slave fails to
execute dev_stop() in the bonding driver.
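To make the rollback idea concrete, here is a minimal, self-contained sketch.
It is not testpmd code and only models the state transitions. The names
ports, MAX_PORTS, mark_slaves_stopped(), rollback_slaves_started() and the
stop_fn callback are hypothetical stand-ins for struct rte_port, the active
slave list and rte_eth_dev_stop().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for testpmd's port states and port table. */
enum port_status { PORT_STOPPED, PORT_STARTED };

#define MAX_PORTS 8
static enum port_status ports[MAX_PORTS];

/* Mark the given slaves stopped; record only the ones that changed. */
static size_t
mark_slaves_stopped(const int *slaves, size_t n, int *changed)
{
	size_t i, nb_changed = 0;

	for (i = 0; i < n; i++) {
		assert(slaves[i] >= 0 && slaves[i] < MAX_PORTS);
		if (ports[slaves[i]] == PORT_STARTED) {
			ports[slaves[i]] = PORT_STOPPED;
			changed[nb_changed++] = slaves[i];
		}
	}
	return nb_changed;
}

/* Undo mark_slaves_stopped() for exactly the ports it changed. */
static void
rollback_slaves_started(const int *changed, size_t nb_changed)
{
	size_t i;

	for (i = 0; i < nb_changed; i++)
		ports[changed[i]] = PORT_STARTED;
}

/* stop_fn models rte_eth_dev_stop(): 0 on success, negative on failure. */
static int
stop_bond_port(int bond_pid, const int *slaves, size_t n, int (*stop_fn)(int))
{
	int changed[MAX_PORTS];
	size_t nb_changed;
	int ret;

	assert(n <= MAX_PORTS);
	nb_changed = mark_slaves_stopped(slaves, n, changed);
	ret = stop_fn(bond_pid);
	if (ret != 0)
		rollback_slaves_started(changed, nb_changed);
	else
		ports[bond_pid] = PORT_STOPPED;
	return ret;
}

/* Two fake stop callbacks, used only to exercise both paths. */
static int stop_ok(int pid) { (void)pid; return 0; }
static int stop_fail(int pid) { (void)pid; return -1; }
```

Recording only the ports whose status actually changed means the rollback
does not flip a slave that was already stopped before the call.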
Btw, while thinking about this, I found a behavior that is not very
reasonable: only active slaves are stopped when a bonding device is stopped.
This can cause confusion in port status. For example, applications have to
set only the active slaves' status to RTE_PORT_STOPPED, while the status of
non-active slaves remains RTE_PORT_STARTED.
I think the bonding PMD should stop all slaves when a bonding device is
stopped.
I checked the modification history of this in the bonding PMD. This
behavior was introduced by the following patch.
/*
commit 0911d4ec01839c9149a0df5758d00d9d57a47cea
Author: Radu Nicolau <radu.nicolau@intel.com>
Date: Thu Nov 8 15:26:42 2018 +0000
net/bonding: fix crash when stopping mode 4 port
When stopping a bonded port all slaves are deactivated. Attempting
to deactivate a slave that was never activated will result in a
segfault
when mode 4 is used.
Fixes: 7486331308f6 ("net/bonding: stop and deactivate slaves on stop")
Cc: stable@dpdk.org
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Chas Williams <chas3@att.com>
*/
The root cause of the problem mentioned in the above patch is that, in mode 4,
the bonding PMD does not allocate rx/tx rings for non-active slave devices.
The call stack is as follows:
#0 0x0000000000b1250c in rte_ring_dequeue_bulk_elem (available=0x0,
n=1, esize=8, obj_table=0xffffffff7c80, r=0x0) at
../dpdk-next-net/lib/ring/rte_ring_elem.h:380
#1 rte_ring_dequeue_elem (esize=8, obj_p=0xffffffff7c80, r=0x0) at
../dpdk-next-net/lib/ring/rte_ring_elem.h:476
#2 rte_ring_dequeue (obj_p=0xffffffff7c80, r=0x0) at
../dpdk-next-net/lib/ring/rte_ring.h:463
#3 bond_mode_8023ad_deactivate_slave (bond_dev=0x4753200
<rte_eth_devices+33024>, slave_id=0) at
../dpdk-next-net/drivers/net/bonding/rte_eth_bond_8023ad.c:1163
#4 0x0000000000b29e10 in deactivate_slave (eth_dev=0x4753200
<rte_eth_devices+33024>, port_id=0) at
../dpdk-next-net/drivers/net/bonding/rte_eth_bond_api.c:117
#5 0x0000000000b44208 in bond_ethdev_stop (eth_dev=0x4753200
<rte_eth_devices+33024>) at
../dpdk-next-net/drivers/net/bonding/rte_eth_bond_pmd.c:2103
#6 0x00000000007966fc in rte_eth_dev_stop (port_id=2) at
../dpdk-next-net/lib/ethdev/rte_ethdev.c:1894
#7 0x000000000055ea60 in eth_dev_stop_mp (port_id=2) at
../dpdk-next-net/app/test-pmd/testpmd.c:613
#8 0x0000000000565230 in stop_port (pid=2) at
../dpdk-next-net/app/test-pmd/testpmd.c:3059
#9 0x00000000004f7614 in cmd_operate_specific_port_parsed
(parsed_result=0xffffffff91b0, cl=0x4829250, data=0x0) at
../dpdk-next-net/app/test-pmd/cmdline.c:1261
#10 0x000000000078be24 in cmdline_parse (cl=0x4829250, buf=0x4829298
"port stop 2\n") at ../dpdk-next-net/lib/cmdline/cmdline_parse.c:290
#11 0x0000000000789c34 in cmdline_valid_buffer (rdl=0x4829260,
buf=0x4829298 "port stop 2\n", size=13) at
../dpdk-next-net/lib/cmdline/cmdline.c:26
#12 0x000000000078f160 in rdline_char_in (rdl=0x4829260, c=10 '\n') at
../dpdk-next-net/lib/cmdline/cmdline_rdline.c:446
#13 0x000000000078a0c8 in cmdline_in (cl=0x4829250, buf=0xfffffffff2e7
"\n", size=1) at ../dpdk-next-net/lib/cmdline/cmdline.c:148
#14 0x000000000078a3b4 in cmdline_interact (cl=0x4829250) at
../dpdk-next-net/lib/cmdline/cmdline.c:222
#15 0x000000000050bf98 in prompt () at
../dpdk-next-net/app/test-pmd/cmdline.c:18001
#16 0x00000000005687c4 in main (argc=4, argv=0xfffffffff510) at
../dpdk-next-net/app/test-pmd/testpmd.c:4268
For the problem Radu encountered, we only need to ensure that
non-active slaves are not deactivated.
I plan to add a patch to this patchset to fix this problem.
What do you think, Ferruh?
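As a rough illustration of the behavior I have in mind (stop every slave,
but deactivate only those that were activated), here is a small
self-contained model. It is only a sketch under assumptions, not the bonding
PMD code: struct slave, struct bond, deactivate_slave_model() and
bond_stop_model() are hypothetical names, and the has_rings field stands in
for the mode 4 rings that are NULL (r=0x0) in the backtrace above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLAVES 8

/* Hypothetical model of a bonded device; not the bonding PMD's structs. */
struct slave {
	bool started;   /* models the slave ethdev being started */
	bool active;    /* models membership in the active slave list */
	bool has_rings; /* assumed: rings exist only once activated */
};

struct bond {
	struct slave slaves[MAX_SLAVES];
	size_t nb_slaves;
};

/* Models the deactivate path, which drains rings and so requires them. */
static void
deactivate_slave_model(struct slave *s)
{
	assert(s->has_rings); /* a missing ring is the modelled crash */
	s->has_rings = false;
	s->active = false;
}

/* Stop every slave, but deactivate only those that were activated. */
static void
bond_stop_model(struct bond *b)
{
	size_t i;

	for (i = 0; i < b->nb_slaves; i++) {
		struct slave *s = &b->slaves[i];

		if (s->active)
			deactivate_slave_model(s);
		s->started = false;
	}
}
```

With this shape, every slave ends up stopped, so an application can move all
slave ports to RTE_PORT_STOPPED without caring which ones were active.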
>
>> + }
>> if (rte_atomic16_cmpset(&(port->port_status),
>> RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
>> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
>> index e3995d24ab..ad3b4f875c 100644
>> --- a/app/test-pmd/testpmd.h
>> +++ b/app/test-pmd/testpmd.h
>> @@ -237,7 +237,8 @@ struct rte_port {
>> struct rte_eth_txconf tx_conf[RTE_MAX_QUEUES_PER_PORT+1]; /**<
>> per queue tx configuration */
>> struct rte_ether_addr *mc_addr_pool; /**< pool of multicast
>> addrs */
>> uint32_t mc_addr_nb; /**< nb. of addr. in
>> mc_addr_pool */
>> - uint8_t slave_flag; /**< bonding slave port */
>> + uint8_t slave_flag : 1, /**< bonding slave port */
>> + bond_flag : 1; /**< port is bond device */
>
> Can't we detect if the port is a bonding port without introducing a new
> variable/state?
The bonding device is also an ethdev. I could not find an external API that
can be used to detect whether a port is a bonding port.
>
>> struct port_flow *flow_list; /**< Associated flows. */
>> struct port_indirect_action *actions_list;
>> /**< Associated indirect actions. */
>
> .