DPDK patches and discussions
* [PATCH] net/bonding: failover of LACP with mode 4 takes long time
@ 2022-06-05  3:07 Gaoxiang Liu
  2022-06-06 14:34 ` [PATCH v2] " Gaoxiang Liu
  0 siblings, 1 reply; 5+ messages in thread
From: Gaoxiang Liu @ 2022-06-05  3:07 UTC
  To: chas3, humin29; +Cc: dev, liugaoxiang, Gaoxiang Liu, stable

When the primary port of a mode 4 (802.3ad) bond goes link-down, the
system ID of the other slave ports changes. This can cause some
switches to renegotiate, which takes a few seconds. That is not
acceptable for telcos; we need sub-second switchover time, as in Linux.

Use the MAC address of the bond port as the slave ports' actor system
ID to solve the problem.

Bugzilla ID: 551
Fixes: 46fb43683679 ("bond: add mode 4")
Cc: stable@dpdk.org

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---
 drivers/net/bonding/rte_eth_bond_8023ad.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index b3cddd8a20..b393148238 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -866,7 +866,6 @@ bond_mode_8023ad_periodic_cb(void *arg)
 	struct bond_dev_private *internals = bond_dev->data->dev_private;
 	struct port *port;
 	struct rte_eth_link link_info;
-	struct rte_ether_addr slave_addr;
 	struct rte_mbuf *lacp_pkt = NULL;
 	uint16_t slave_id;
 	uint16_t i;
@@ -893,7 +892,6 @@ bond_mode_8023ad_periodic_cb(void *arg)
 			key = 0;
 		}
 
-		rte_eth_macaddr_get(slave_id, &slave_addr);
 		port = &bond_mode_8023ad_ports[slave_id];
 
 		key = rte_cpu_to_be_16(key);
@@ -905,8 +903,8 @@ bond_mode_8023ad_periodic_cb(void *arg)
 			SM_FLAG_SET(port, NTT);
 		}
 
-		if (!rte_is_same_ether_addr(&port->actor.system, &slave_addr)) {
-			rte_ether_addr_copy(&slave_addr, &port->actor.system);
+		if (!rte_is_same_ether_addr(&internals->mode4.mac_addr, &port->actor.system)) {
+			rte_ether_addr_copy(&internals->mode4.mac_addr, &port->actor.system);
 			if (port->aggregator_port_id == slave_id)
 				SM_FLAG_SET(port, NTT);
 		}
@@ -1178,15 +1176,15 @@ bond_mode_8023ad_mac_address_update(struct rte_eth_dev *bond_dev)
 
 	bond_mode_8023ad_stop(bond_dev);
 
+	rte_eth_macadd_get(internals->port_id, &internals->mode4.mac_addr);
 	for (i = 0; i < internals->active_slave_count; i++) {
 		slave_id = internals->active_slaves[i];
 		slave = &bond_mode_8023ad_ports[slave_id];
-		rte_eth_macaddr_get(slave_id, &slave_addr);
 
-		if (rte_is_same_ether_addr(&slave_addr, &slave->actor.system))
+		if (rte_is_same_ether_addr(&internals->mode4.mac_addr, &slave->actor.system))
 			continue;
 
-		rte_ether_addr_copy(&slave_addr, &slave->actor.system);
+		rte_ether_addr_copy(&internals->mode4.mac_addr, &slave->actor.system);
 		/* Do nothing if this port is not an aggregator. In other case
 		 * Set NTT flag on every port that use this aggregator. */
 		if (slave->aggregator_port_id != slave_id)
-- 
2.32.0



* [PATCH v2] net/bonding: failover of LACP with mode 4 takes long time
  2022-06-05  3:07 [PATCH] net/bonding: failover of LACP with mode 4 takes long time Gaoxiang Liu
@ 2022-06-06 14:34 ` Gaoxiang Liu
  2022-06-09  8:28   ` Andrew Rybchenko
  2024-04-16 15:44   ` Ferruh Yigit
  0 siblings, 2 replies; 5+ messages in thread
From: Gaoxiang Liu @ 2022-06-06 14:34 UTC
  To: chas3, humin29; +Cc: dev, liugaoxiang, Gaoxiang Liu, stable

When the primary port of a mode 4 (802.3ad) bond goes link-down, the
system ID of the other slave ports changes. This can cause some
switches to renegotiate, which takes a few seconds. That is not
acceptable for telcos; we need sub-second switchover time, as in Linux.

Use the MAC address of the bond port as the slave ports' actor system
ID to solve the problem.
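
The idea can be illustrated with a small standalone model. This is only a
sketch, not driver code: `struct mac`, `struct member`, `struct bond` and
`refresh_system_ids()` are hypothetical stand-ins invented for this example,
and the model assumes the bond device's own MAC stays unchanged across the
failover. It shows that when the system ID comes from the bond MAC, all
members advertise the same ID and a refresh finds nothing to change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN     6
#define MAX_MEMBERS 8

/* Hypothetical stand-ins for the driver's structures; not DPDK types. */
struct mac {
	uint8_t bytes[MAC_LEN];
};

struct member {
	struct mac hw_addr;      /* the member port's own MAC */
	struct mac actor_system; /* system ID advertised in LACPDUs */
};

struct bond {
	struct mac bond_addr;    /* MAC of the bonding device */
	struct member members[MAX_MEMBERS];
	unsigned int n_members;
};

static bool mac_equal(const struct mac *a, const struct mac *b)
{
	return memcmp(a->bytes, b->bytes, MAC_LEN) == 0;
}

/*
 * Refresh every member's actor system ID and return how many changed.
 * use_bond_addr == false models the old behaviour (member's own MAC),
 * use_bond_addr == true models the patched behaviour (the bond MAC).
 * In the real driver a change is what sets the NTT flag; here it is
 * only counted.
 */
static unsigned int refresh_system_ids(struct bond *b, bool use_bond_addr)
{
	unsigned int changed = 0;

	assert(b->n_members <= MAX_MEMBERS);
	for (unsigned int i = 0; i < b->n_members; i++) {
		struct member *m = &b->members[i];
		const struct mac *want =
			use_bond_addr ? &b->bond_addr : &m->hw_addr;

		if (mac_equal(&m->actor_system, want))
			continue;
		m->actor_system = *want;
		changed++;
	}
	return changed;
}
```

In this model, calling `refresh_system_ids(&b, true)` a second time returns 0
regardless of which member is currently primary, while the `false` variant
ties each system ID to whatever MAC the member port currently holds.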

Bugzilla ID: 551
Fixes: 46fb43683679 ("bond: add mode 4")
Cc: stable@dpdk.org

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>

---
v2:
* Fixed compile issues.
---
 drivers/net/bonding/rte_eth_bond_8023ad.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index b3cddd8a20..8d5486dad3 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -866,7 +866,6 @@ bond_mode_8023ad_periodic_cb(void *arg)
 	struct bond_dev_private *internals = bond_dev->data->dev_private;
 	struct port *port;
 	struct rte_eth_link link_info;
-	struct rte_ether_addr slave_addr;
 	struct rte_mbuf *lacp_pkt = NULL;
 	uint16_t slave_id;
 	uint16_t i;
@@ -893,7 +892,6 @@ bond_mode_8023ad_periodic_cb(void *arg)
 			key = 0;
 		}
 
-		rte_eth_macaddr_get(slave_id, &slave_addr);
 		port = &bond_mode_8023ad_ports[slave_id];
 
 		key = rte_cpu_to_be_16(key);
@@ -905,8 +903,8 @@ bond_mode_8023ad_periodic_cb(void *arg)
 			SM_FLAG_SET(port, NTT);
 		}
 
-		if (!rte_is_same_ether_addr(&port->actor.system, &slave_addr)) {
-			rte_ether_addr_copy(&slave_addr, &port->actor.system);
+		if (!rte_is_same_ether_addr(&internals->mode4.mac_addr, &port->actor.system)) {
+			rte_ether_addr_copy(&internals->mode4.mac_addr, &port->actor.system);
 			if (port->aggregator_port_id == slave_id)
 				SM_FLAG_SET(port, NTT);
 		}
@@ -1172,21 +1170,20 @@ void
 bond_mode_8023ad_mac_address_update(struct rte_eth_dev *bond_dev)
 {
 	struct bond_dev_private *internals = bond_dev->data->dev_private;
-	struct rte_ether_addr slave_addr;
 	struct port *slave, *agg_slave;
 	uint16_t slave_id, i, j;
 
 	bond_mode_8023ad_stop(bond_dev);
 
+	rte_eth_macaddr_get(internals->port_id, &internals->mode4.mac_addr);
 	for (i = 0; i < internals->active_slave_count; i++) {
 		slave_id = internals->active_slaves[i];
 		slave = &bond_mode_8023ad_ports[slave_id];
-		rte_eth_macaddr_get(slave_id, &slave_addr);
 
-		if (rte_is_same_ether_addr(&slave_addr, &slave->actor.system))
+		if (rte_is_same_ether_addr(&internals->mode4.mac_addr, &slave->actor.system))
 			continue;
 
-		rte_ether_addr_copy(&slave_addr, &slave->actor.system);
+		rte_ether_addr_copy(&internals->mode4.mac_addr, &slave->actor.system);
 		/* Do nothing if this port is not an aggregator. In other case
 		 * Set NTT flag on every port that use this aggregator. */
 		if (slave->aggregator_port_id != slave_id)
-- 
2.32.0



* Re: [PATCH v2] net/bonding: failover of LACP with mode 4 takes long time
  2022-06-06 14:34 ` [PATCH v2] " Gaoxiang Liu
@ 2022-06-09  8:28   ` Andrew Rybchenko
  2024-04-16 15:44   ` Ferruh Yigit
  1 sibling, 0 replies; 5+ messages in thread
From: Andrew Rybchenko @ 2022-06-09  8:28 UTC
  To: Gaoxiang Liu, chas3, humin29; +Cc: dev, liugaoxiang, stable

On 6/6/22 17:34, Gaoxiang Liu wrote:
> When the primary port of a mode 4 (802.3ad) bond goes link-down, the
> system ID of the other slave ports changes. This can cause some
> switches to renegotiate, which takes a few seconds. That is not
> acceptable for telcos; we need sub-second switchover time, as in Linux.
> 
> Use the MAC address of the bond port as the slave ports' actor system
> ID to solve the problem.
> 
> Bugzilla ID: 551
> Fixes: 46fb43683679 ("bond: add mode 4")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>

Should I use the e-mail address from the Signed-off-by line as the
author of the patch? The e-mail From address does not match it.


* Re: [PATCH v2] net/bonding: failover of LACP with mode 4 takes long time
  2022-06-06 14:34 ` [PATCH v2] " Gaoxiang Liu
  2022-06-09  8:28   ` Andrew Rybchenko
@ 2024-04-16 15:44   ` Ferruh Yigit
  2024-04-17 13:16     ` Ferruh Yigit
  1 sibling, 1 reply; 5+ messages in thread
From: Ferruh Yigit @ 2024-04-16 15:44 UTC
  To: Gaoxiang Liu, chas3, humin29, kiran.kn80, nandu.rangaswamy
  Cc: dev, liugaoxiang, stable

On 6/6/2022 3:34 PM, Gaoxiang Liu wrote:
> When the primary port of a mode 4 (802.3ad) bond goes link-down, the
> system ID of the other slave ports changes. This can cause some
> switches to renegotiate, which takes a few seconds. That is not
> acceptable for telcos; we need sub-second switchover time, as in Linux.
> 
> Use the MAC address of the bond port as the slave ports' actor system
> ID to solve the problem.
> 
> Bugzilla ID: 551
> Fixes: 46fb43683679 ("bond: add mode 4")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
> 
>

This is another old patch. As far as I can see from Bugzilla, the issue
was reported by Kiran, the fix was proposed by Nandu, and the patch was
shared on the mailing list by Gaoxiang.

As it addresses two Bugzilla items, and we are at the beginning of the
release cycle, I am willing to take the risk and apply this patch
without an ack from a bonding maintainer.
If we find issues, we can revert it.

Chas, Connor, if you have any objection, or if you need time for review,
please let me know; otherwise I will merge the patch soon.

And if anyone can test it, such as Kiran who reported the issue or any
other stakeholder, that would help a lot to give confidence in the fix.


Thanks,
ferruh




* Re: [PATCH v2] net/bonding: failover of LACP with mode 4 takes long time
  2024-04-16 15:44   ` Ferruh Yigit
@ 2024-04-17 13:16     ` Ferruh Yigit
  0 siblings, 0 replies; 5+ messages in thread
From: Ferruh Yigit @ 2024-04-17 13:16 UTC
  To: Gaoxiang Liu, chas3, humin29, kiran.kn80, nandu.rangaswamy
  Cc: dev, liugaoxiang, stable

On 4/16/2024 4:44 PM, Ferruh Yigit wrote:
> On 6/6/2022 3:34 PM, Gaoxiang Liu wrote:
>> When the primary port of a mode 4 (802.3ad) bond goes link-down, the
>> system ID of the other slave ports changes. This can cause some
>> switches to renegotiate, which takes a few seconds. That is not
>> acceptable for telcos; we need sub-second switchover time, as in Linux.
>>
>> Use the MAC address of the bond port as the slave ports' actor system
>> ID to solve the problem.
>>
>> Bugzilla ID: 551
>> Fixes: 46fb43683679 ("bond: add mode 4")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
>>
>>
> 
> This is another old patch. As far as I can see from Bugzilla, the issue
> was reported by Kiran, the fix was proposed by Nandu, and the patch was
> shared on the mailing list by Gaoxiang.
> 
> As it addresses two Bugzilla items, and we are at the beginning of the
> release cycle, I am willing to take the risk and apply this patch
> without an ack from a bonding maintainer.
> If we find issues, we can revert it.
> 
> Chas, Connor, if you have any objection, or if you need time for review,
> please let me know; otherwise I will merge the patch soon.
> 
> And if anyone can test it, such as Kiran who reported the issue or any
> other stakeholder, that would help a lot to give confidence in the fix.
> 

Applied to dpdk-next-net/main, thanks.


Please help verify the patch.
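
One possible way to check the behaviour, sketched under assumptions: capture
the LACPDUs sent by two member ports (or by one port before and after a
failover) and compare the Actor_System field. The helpers below are
hypothetical and not part of DPDK; they assume untagged frames and the
standard LACPDU layout, in which Actor_System sits 20 bytes into the frame.
How the frames are captured is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN            6
#define ETH_TYPE_SLOW      0x8809 /* Slow Protocols EtherType */
#define SLOW_SUBTYPE_LACP  0x01
#define LACP_TLV_ACTOR     0x01

/*
 * Offset of Actor_System in an untagged LACPDU frame:
 * Ethernet header (14) + subtype (1) + version (1) +
 * TLV type (1) + TLV length (1) + actor system priority (2).
 */
#define ACTOR_SYSTEM_OFF   ((size_t)20)

/*
 * Copy the actor system ID of an untagged LACPDU into out.
 * Returns 0 on success, -1 if the frame is too short or is not an LACPDU.
 */
static int lacpdu_actor_system(const uint8_t *frame, size_t len,
			       uint8_t out[MAC_LEN])
{
	uint16_t ether_type;

	assert(out != NULL);
	if (frame == NULL || len < ACTOR_SYSTEM_OFF + MAC_LEN)
		return -1;

	ether_type = (uint16_t)((frame[12] << 8) | frame[13]);
	if (ether_type != ETH_TYPE_SLOW)
		return -1;
	if (frame[14] != SLOW_SUBTYPE_LACP || frame[16] != LACP_TLV_ACTOR)
		return -1;

	memcpy(out, frame + ACTOR_SYSTEM_OFF, MAC_LEN);
	return 0;
}

/*
 * Compare the actor system IDs of two captured LACPDUs.
 * Returns 1 if equal, 0 if different, -1 if either frame cannot be parsed.
 */
static int lacpdu_same_actor_system(const uint8_t *a, size_t a_len,
				    const uint8_t *b, size_t b_len)
{
	uint8_t sys_a[MAC_LEN];
	uint8_t sys_b[MAC_LEN];

	if (lacpdu_actor_system(a, a_len, sys_a) != 0 ||
	    lacpdu_actor_system(b, b_len, sys_b) != 0)
		return -1;
	return memcmp(sys_a, sys_b, MAC_LEN) == 0 ? 1 : 0;
}
```

With the patch applied, `lacpdu_same_actor_system()` would be expected to
return 1 for frames captured before and after the primary port goes down,
provided the bond MAC itself is unchanged.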

