DPDK patches and discussions
* [BUG] [bonding] bonding member delete bug
@ 2023-12-18  2:51 Simon Jones
  2023-12-18  6:37 ` Simon Jones
  0 siblings, 1 reply; 5+ messages in thread
From: Simon Jones @ 2023-12-18  2:51 UTC (permalink / raw)
  To: dev


Hi all,

I'm using DPDK 21.11 with OVS-DPDK.

I found a bug when deleting a bonding member.

1. How to reproduce

```
NOTE: bondctl is a tool I developed to control DPDK.

### step 1, Add bonding device bond0.
bondctl add bond0 mode active-backup

### step 2, Add member m1 into bond0.
bondctl set 0000:00:0a.0 master bond0

### step 3, Add bond0 into ovs bridge.
ovs-vsctl add-port brp0 bond0 -- set interface bond0 type=dpdk \
    options:dpdk-devargs=net_bonding-bond0
(This command eventually calls @bond_ethdev_start.)

### step 4, Delete bond0 from ovs bridge.
ovs-vsctl del-port br-phy bond0
(This command eventually calls @bond_ethdev_stop.)

### step 5, Delete m1 from bond0.
bondctl set 0000:00:0a.0 nomaster

### step 6, Delete bond0.
bondctl del bond0

### step 7, Add bond0.
bondctl add bond0 mode active-backup

### step 8, Add member m1 into bond0.
bondctl set 0000:00:0a.0 master bond0
(This command eventually calls @bond_ethdev_start.)

### The following errors are then logged.
2023-12-15T08:24:04.153Z|00017|dpdk|ERR|Port 0 must be stopped to allow configuration
2023-12-15T08:24:04.153Z|00018|dpdk|ERR|bond_cmd_set_master(581) - can not config slave 0000:00:0a.0!
```

2. Debug

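The cause is as follows. When bond0 is started while its member port is
DOWN, the start path sets "eth_dev->data->dev_started = 1;" and starts the
member, but the member is never added to the active member list. When bond0
is later stopped, @rte_eth_dev_stop is called only for active members, so
the member port is left started. Adding it to a new bond0 then fails.
In detail: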
```
### After step 1-3, add bond0 into ovs-dpdk
bond_ethdev_start
    eth_dev->data->dev_started = 1;
    for (i = 0; i < internals->slave_count; i++) {
        if (slave_configure(eth_dev, slave_ethdev) != 0) {
        if (slave_start(eth_dev, slave_ethdev) != 0) {
            rte_eth_dev_start

### NOTE: the member port is DOWN, so @activate_slave is never called
### and @active_slave_count stays 0.
bond_ethdev_lsc_event_callback
    activate_slave(bonded_eth_dev, port_id);

### After step 4, bond0 is deleted from ovs-dpdk. NOTE: @active_slave_count
### is 0, so @rte_eth_dev_stop is NOT called for the member.
bond_ethdev_stop
    for (i = 0; i < internals->slave_count; i++) {
        if (find_slave_by_id(internals->active_slaves,
                internals->active_slave_count, slave_id) !=
                        internals->active_slave_count) {
            ret = rte_eth_dev_stop(slave_id);

### Steps 5-7: delete bond0, then add bond0 again.

### Step 8: add m1 to bond0. @rte_eth_dev_stop was never called for m1,
### so the next attempt to configure and start it fails:
2023-12-15T08:24:04.153Z|00017|dpdk|ERR|Port 0 must be stopped to allow configuration

```
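
To make the sequence above easier to follow, here is a toy model in C. It is
only a sketch and NOT DPDK code: every type and function name (toy_port,
toy_bond, toy_reproduce, ...) is hypothetical, and the assumption that adding
a member configures it first is a simplification of what the log suggests.

```c
/* Toy model of the traced sequence. NOT DPDK code: all names here are
 * hypothetical and only mirror the control flow described above. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_MEMBERS 4

struct toy_port {
    bool started;  /* plays the role of dev_started on the member port */
    bool link_up;  /* false models the DOWN member from the report */
};

struct toy_bond {
    size_t member_count;
    size_t active_count;
    struct toy_port *members[TOY_MAX_MEMBERS];
    struct toy_port *active[TOY_MAX_MEMBERS];
};

/* A started port refuses reconfiguration ("must be stopped"). */
static int toy_port_configure(const struct toy_port *p)
{
    return p->started ? -EBUSY : 0;
}

/* Assumption: adding a member configures it first. */
static int toy_bond_add_member(struct toy_bond *b, struct toy_port *p)
{
    assert(b->member_count < TOY_MAX_MEMBERS);
    int ret = toy_port_configure(p);
    if (ret != 0)
        return ret;
    b->members[b->member_count++] = p;
    return 0;
}

/* Start every member; only members whose link is up become active. */
static void toy_bond_start(struct toy_bond *b)
{
    for (size_t i = 0; i < b->member_count; i++) {
        b->members[i]->started = true;
        if (b->members[i]->link_up)
            b->active[b->active_count++] = b->members[i];
    }
}

static bool toy_is_active(const struct toy_bond *b, const struct toy_port *p)
{
    for (size_t i = 0; i < b->active_count; i++)
        if (b->active[i] == p)
            return true;
    return false;
}

/* Stop as in the trace: only members found in the active list are stopped. */
static void toy_bond_stop_active_only(struct toy_bond *b)
{
    for (size_t i = 0; i < b->member_count; i++)
        if (toy_is_active(b, b->members[i]))
            b->members[i]->started = false;
    b->active_count = 0;
}

/* Steps 1-8 in miniature; returns the result of the second add. */
static int toy_reproduce(bool member_link_up)
{
    struct toy_port m1 = { .started = false, .link_up = member_link_up };
    struct toy_bond bond0 = { .member_count = 0 };

    if (toy_bond_add_member(&bond0, &m1) != 0)       /* steps 1-2 */
        return -1;
    toy_bond_start(&bond0);                          /* step 3 */
    toy_bond_stop_active_only(&bond0);               /* step 4 */

    struct toy_bond bond0_again = { .member_count = 0 };  /* steps 5-7 */
    return toy_bond_add_member(&bond0_again, &m1);   /* step 8 */
}
```

In this model, calling toy_reproduce(false) ends with the second add refused
because m1 is still marked started, while toy_reproduce(true) succeeds because
the active member was stopped in step 4.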

3. My question

Is this bug already fixed? If so, by which commit?

If not, how should it be fixed? I think it would be better to call
@rte_eth_dev_stop for every member, even if it is DOWN. What do you think?
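
A minimal sketch of that idea, written as a toy model rather than a patch
against the bonding PMD. The names (sketch_member, sketch_stop_all_members)
are hypothetical and nothing here is taken from upstream code; it only shows
the stop loop ignoring the active state.

```c
/* Toy model of the proposal; NOT an upstream patch. Names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_member {
    bool started;  /* member port was started by the bond */
    bool active;   /* set only after a link-up event was seen */
};

/* Stop every started member, whether or not it ever became active.
 * Returns the number of members that were actually stopped. */
static size_t sketch_stop_all_members(struct sketch_member *members,
                                      size_t count)
{
    size_t stopped = 0;

    assert(members != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        members[i].active = false;
        if (members[i].started) {
            members[i].started = false;
            stopped++;
        }
    }
    return stopped;
}
```

With this loop a DOWN member that was started but never activated is stopped
too, so a later configure on that port would no longer be refused.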

Thanks~


----
Simon Jones


* Re: [BUG] [bonding] bonding member delete bug
  2023-12-18  2:51 [BUG] [bonding] bonding member delete bug Simon Jones
@ 2023-12-18  6:37 ` Simon Jones
  2024-01-08 15:55   ` Ferruh Yigit
  0 siblings, 1 reply; 5+ messages in thread
From: Simon Jones @ 2023-12-18  6:37 UTC (permalink / raw)
  To: dev


Oh, it's fixed by 0911d4ec and f5e72e8e
----
Simon Jones



* Re: [BUG] [bonding] bonding member delete bug
  2023-12-18  6:37 ` Simon Jones
@ 2024-01-08 15:55   ` Ferruh Yigit
  2024-01-08 16:07     ` Kevin Traynor
  0 siblings, 1 reply; 5+ messages in thread
From: Ferruh Yigit @ 2024-01-08 15:55 UTC (permalink / raw)
  To: Simon Jones, dev; +Cc: Kevin Traynor, David Marchand

On 12/18/2023 6:37 AM, Simon Jones wrote:
> Oh, it's fixed by 0911d4ec and f5e72e8e
>

Thanks Simon for reporting.

Do you know if the above fixes were backported to the 21.11.x LTS release?



* Re: [BUG] [bonding] bonding member delete bug
  2024-01-08 15:55   ` Ferruh Yigit
@ 2024-01-08 16:07     ` Kevin Traynor
  2024-01-09 11:30       ` Ferruh Yigit
  0 siblings, 1 reply; 5+ messages in thread
From: Kevin Traynor @ 2024-01-08 16:07 UTC (permalink / raw)
  To: Ferruh Yigit, Simon Jones, dev; +Cc: David Marchand

On 08/01/2024 15:55, Ferruh Yigit wrote:
> On 12/18/2023 6:37 AM, Simon Jones wrote:
>> Oh, it's fixed by 0911d4ec and f5e72e8e
>>
> 
> Thanks Simon for reporting.
> 
> Do you know if the above fixes backported to the 21.11.x LTS release?
> 

Yes, 0911d4ec has been included since 18.11 [0], and f5e72e8e was backported
to the 21.11 branch and has been available since v21.11.2 [1].

[0]
https://git.dpdk.org/dpdk-stable/commit/?h=21.11&id=0911d4ec01839c9149a0df5758d00d9d57a47cea

[1]
https://git.dpdk.org/dpdk-stable/commit/?h=21.11&id=5a8afc69afabd3c69efbc1b0c048f31d06f7d875
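
For anyone who wants to guard against running an older 21.11.x build, a small
check along these lines may help. It is a hypothetical helper, not part of
DPDK. It assumes the version is available as a string shaped like
"DPDK 21.11.2" (I believe that is what rte_version() reports, but treat the
format as an assumption), and it only answers the question for the 21.11
branch.

```c
/* Hypothetical helper, not part of DPDK: decide from a version string such
 * as "DPDK 21.11.2" whether a 21.11.x build is at or past v21.11.2. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool lts_21_11_has_backport(const char *version)
{
    int year = 0, month = 0, minor = 0;

    assert(version != NULL);
    /* Accept an optional "DPDK " prefix, then year.month.minor. */
    if (sscanf(version, "DPDK %d.%d.%d", &year, &month, &minor) != 3 &&
        sscanf(version, "%d.%d.%d", &year, &month, &minor) != 3)
        return false;

    /* Only the 21.11 branch is covered; other branches return false. */
    return year == 21 && month == 11 && minor >= 2;
}
```

A false result for a non-21.11 version means "not covered by this check",
not "fix missing".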

thanks,
Kevin.


* Re: [BUG] [bonding] bonding member delete bug
  2024-01-08 16:07     ` Kevin Traynor
@ 2024-01-09 11:30       ` Ferruh Yigit
  0 siblings, 0 replies; 5+ messages in thread
From: Ferruh Yigit @ 2024-01-09 11:30 UTC (permalink / raw)
  To: Kevin Traynor, Simon Jones, dev; +Cc: David Marchand

On 1/8/2024 4:07 PM, Kevin Traynor wrote:
> On 08/01/2024 15:55, Ferruh Yigit wrote:
>> On 12/18/2023 6:37 AM, Simon Jones wrote:
>>> Oh, it's fixed by 0911d4ec and f5e72e8e
>>>
>>
>> Thanks Simon for reporting.
>>
>> Do you know if the above fixes backported to the 21.11.x LTS release?
>>
> 
> Yes, 0911d4ec as part of 18.11 [0] and f5e72e8e backported to 21.11
> branch since v21.11.2 [1]
> 

Thanks Kevin for confirming.

> [0]
> https://git.dpdk.org/dpdk-stable/commit/?h=21.11&id=0911d4ec01839c9149a0df5758d00d9d57a47cea
> 
> [1]
> https://git.dpdk.org/dpdk-stable/commit/?h=21.11&id=5a8afc69afabd3c69efbc1b0c048f31d06f7d875
> 

<...>
