* [dpdk-dev] [Bug 551] LACP failover with 802.3ad bond mode 4 takes long time
From: bugzilla @ 2020-10-09 18:43 UTC
To: dev
https://bugs.dpdk.org/show_bug.cgi?id=551
Bug ID: 551
Summary: LACP failover with 802.3ad bond mode 4 takes long time
Product: DPDK
Version: 20.11
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: major
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: kiran.kn80@gmail.com
Target Milestone: ---
When one of the slaves of an 802.3ad (mode 4) bond is disabled, the
switchover takes almost 6 seconds, which is not acceptable for telco
deployments. We need sub-second switchover time, as in Linux.
Tested with a Juniper QFX switch.
The reason is that the system ID changes (to that of the other slave device)
when one of the active slaves goes down. This triggers renegotiation, which
takes a long time to converge.
Is the system ID expected to be different for each link? Shouldn't it be the
same for all links?
As the gdb output below shows, the system ID of slave 1 is
{0xac, 0x1f, 0x6b, 0x8d, 0xd7, 0xbd}, while the system ID of slave 0 is
{0xac, 0x1f, 0x6b, 0x8d, 0xd7, 0xbc}.
Because of this, the system ID changes when the active slave goes down.
I have shown this to Declan Doherty <declan.doherty@intel.com>.
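To make the mismatch concrete, here is a minimal Python sketch that compares
the per-slave actor system IDs. The byte values are copied from the gdb output
in this report; the names SLAVE_SYSTEM_IDS, format_mac and distinct_system_ids
are hypothetical helpers for illustration only and are not part of DPDK.

```python
# Actor system IDs per slave, copied from the gdb dumps in this report
# (slave_id=0 prints ...:bc, slave_id=1 prints ...:bd).
SLAVE_SYSTEM_IDS = {
    0: bytes([0xAC, 0x1F, 0x6B, 0x8D, 0xD7, 0xBC]),
    1: bytes([0xAC, 0x1F, 0x6B, 0x8D, 0xD7, 0xBD]),
}


def format_mac(addr: bytes) -> str:
    """Format a 6-byte address the way the switch prints it."""
    return ":".join(f"{b:02x}" for b in addr)


def distinct_system_ids(ids_by_slave: dict) -> set:
    """Return the set of distinct system IDs across all slaves."""
    return {format_mac(addr) for addr in ids_by_slave.values()}


if __name__ == "__main__":
    ids = distinct_system_ids(SLAVE_SYSTEM_IDS)
    if len(ids) > 1:
        # More than one ID means the partner can observe a system ID change
        # when a different slave's ID starts being advertised.
        print("slaves carry different system IDs:", sorted(ids))
    else:
        print("all slaves share one system ID:", ids.pop())
```

If every slave carried the same system ID, the set would contain a single
entry and the switch would have no system ID change to react to.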
----- Logs from DPDK application -----
Breakpoint 1, rx_machine (internals=0x11409edf00, slave_id=1, lacp=0x13ae8b4be)
    at /root/contrail/third_party/dpdk/drivers/net/bonding/rte_eth_bond_8023ad.c:326
326     /root/contrail/third_party/dpdk/drivers/net/bonding/rte_eth_bond_8023ad.c: No such file or directory.
(gdb) p/x *port
$2 = {actor_state = 0x3f,
  actor = {system_priority = 0xffff,
    system = {addr_bytes = {0xac, 0x1f, 0x6b, 0x8d, 0xd7, 0xbd}},
    key = 0x2100, port_priority = 0xff00, port_number = 0x200},
  partner_state = 0x3f,
  partner = {system_priority = 0x7f00,
    system = {addr_bytes = {0x5c, 0x45, 0x27, 0x49, 0x64, 0x8c}},
    key = 0x1500, port_priority = 0x7f00, port_number = 0x1000},
  sm_flags = 0x202, selected = 0x2, forced_rx_flags = 0x1,
  current_while_timer = 0x23e52ecad117e2, periodic_timer = 0x23e52d82f5c5a2,
  wait_while_timer = 0x23dae7bc8488e2, tx_machine_timer = 0x23e52d4e8213a1,
  tx_marker_timer = 0x0, aggregator_port_id = 0x0,
  mbuf_pool = 0x113f5db580, rx_ring = 0x113fa00bc0, tx_ring = 0x113fa00980,
  rx_marker_timer = 0x0, warning_timer = 0x23db47b16ff07d,
  warnings_to_show = 0x10, slow_pool = 0x0}
Breakpoint 1, rx_machine (internals=0x11409edf00, slave_id=0, lacp=0x13d7ce87e)
    at /root/contrail/third_party/dpdk/drivers/net/bonding/rte_eth_bond_8023ad.c:326
326     in /root/contrail/third_party/dpdk/drivers/net/bonding/rte_eth_bond_8023ad.c
(gdb) p/x *port
$3 = {actor_state = 0x8f,
  actor = {system_priority = 0xffff,
    system = {addr_bytes = {0xac, 0x1f, 0x6b, 0x8d, 0xd7, 0xbc}},
    key = 0x2100, port_priority = 0xff00, port_number = 0x100},
  partner_state = 0x3f,
  partner = {system_priority = 0x7f00,
    system = {addr_bytes = {0x5c, 0x45, 0x27, 0x49, 0x64, 0x8c}},
    key = 0x1500, port_priority = 0x7f00, port_number = 0x1100},
  sm_flags = 0x202, selected = 0x2, forced_rx_flags = 0x1,
  current_while_timer = 0x23e537ad7f62bc, periodic_timer = 0x23e5369a1fcadb,
  wait_while_timer = 0x23da0985aca4f4, tx_machine_timer = 0x23e53665ac40d8,
  tx_marker_timer = 0x0, aggregator_port_id = 0x0,
  mbuf_pool = 0x114022b800, rx_ring = 0x1140607600, tx_ring = 0x11406073c0,
  rx_marker_timer = 0x0, warning_timer = 0x23e536a73cfc50,
  warnings_to_show = 0x10, slow_pool = 0x0}
---------
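The 16-bit fields in the gdb dumps (key, port_priority, port_number, partner
system_priority) look byte-swapped compared with the values the switch
reports. That is consistent with the fields being stored in network byte
order and printed by gdb on a little-endian host, which is an assumption
here since the report does not state the host endianness. A minimal Python
sketch of the conversion, using values copied from the slave_id=1 dump
(swap16 and the dictionary names are hypothetical):

```python
def swap16(value: int) -> int:
    """Swap the two bytes of a 16-bit value."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


# Raw values as printed by gdb for slave_id=1 (assumed to be big-endian
# fields read on a little-endian host).
GDB_ACTOR = {"key": 0x2100, "port_priority": 0xFF00, "port_number": 0x200}
GDB_PARTNER = {
    "system_priority": 0x7F00,
    "key": 0x1500,
    "port_priority": 0x7F00,
    "port_number": 0x1000,
}


def decode(fields: dict) -> dict:
    """Byte-swap every field so it can be compared with the switch output."""
    return {name: swap16(raw) for name, raw in fields.items()}


if __name__ == "__main__":
    print("actor  :", decode(GDB_ACTOR))
    print("partner:", decode(GDB_PARTNER))
```

Under that assumption the actor decodes to key 33, port priority 255, port 2,
and the partner to system priority 127, key 21, port priority 127, port 16,
which matches what the switch shows for xe-0/0/23.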
---- Logs from Juniper QFX switch ----
root@a6-qfx1# run show lacp interfaces ae20
Aggregated interface: ae20
LACP state:     Role     Exp  Def  Dist  Col  Syn  Aggr  Timeout  Activity
  xe-0/0/23     Actor    No   No   Yes   Yes  Yes  Yes   Fast     Active
  xe-0/0/23     Partner  No   No   Yes   Yes  Yes  Yes   Fast     Active
  xe-0/0/20     Actor    No   No   Yes   Yes  Yes  Yes   Fast     Active
  xe-0/0/20     Partner  No   No   Yes   Yes  Yes  Yes   Fast     Active
LACP protocol:  Receive State  Transmit State  Mux State
  xe-0/0/23     Current        Fast periodic   Collecting distributing
  xe-0/0/20     Current        Fast periodic   Collecting distributing
[edit]
root@a6-qfx1# run show interfaces ae20 extensive | find LACP
LACP info:      Role     System    System             Port      Port    Port
                         priority  identifier         priority  number  key
  xe-0/0/20.0   Actor    127       5c:45:27:49:64:8c  127       17      21
  xe-0/0/20.0   Partner  65535     ac:1f:6b:8d:d7:bc  255       1       33
  xe-0/0/23.0   Actor    127       5c:45:27:49:64:8c  127       16      21
  xe-0/0/23.0   Partner  65535     ac:1f:6b:8d:d7:bc  255       2       33
LACP Statistics:    LACP Rx    LACP Tx  Unknown Rx  Illegal Rx
  xe-0/0/20.0       1541355    1441147  0           0
  xe-0/0/23.0       1727402    1601884  0           0
Marker Statistics:  Marker Rx  Resp Tx  Unknown Rx  Illegal Rx
  xe-0/0/20.0       0          0        0           0
  xe-0/0/23.0       0          0        0           0
Protocol eth-switch, MTU: 9216, Generation: 164, Route table: 0
Flags: None
05:28:19.157776 In LACPv1, length 110
    Actor Information TLV (0x01), length 20
      System ac:1f:6b:8d:d7:bc, System Priority 65535, Key 33, Port 2, Port Priority 255
      State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
    Partner Information TLV (0x02), length 20
      System 5c:45:27:49:64:8c, System Priority 127, Key 21, Port 16, Port Priority 127
      State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
    Collector Information TLV (0x03), length 16
      Max Delay 0
    Terminator TLV (0x00), length 0
[edit]
root@a6-qfx1#
[edit]
root@a6-qfx1# set interfaces xe-0/0/20 disable
root@a6-qfx1# commit
commit complete
[edit]
root@a6-qfx1# run show lacp interfaces ae20
Aggregated interface: ae20
LACP state:     Role     Exp  Def  Dist  Col  Syn  Aggr  Timeout  Activity
  xe-0/0/23     Actor    No   No   Yes   Yes  Yes  Yes   Fast     Active
  xe-0/0/23     Partner  No   No   Yes   Yes  Yes  Yes   Fast     Active
  xe-0/0/20     Actor    No   Yes  No    No   No   Yes   Fast     Active
  xe-0/0/20     Partner  No   Yes  No    No   No   Yes   Fast     Passive
LACP protocol:  Receive State  Transmit State  Mux State
  xe-0/0/23     Current        Fast periodic   Collecting distributing
  xe-0/0/20     Port disabled  No periodic     Detached
[edit]
root@a6-qfx1# run show interfaces ae20 extensive | find LACP
LACP info:      Role     System    System             Port      Port    Port
                         priority  identifier         priority  number  key
  xe-0/0/20.0   Actor    127       5c:45:27:49:64:8c  127       17      21
  xe-0/0/20.0   Partner  65535     ac:1f:6b:8d:d7:bd  1         17      33
  xe-0/0/23.0   Actor    127       5c:45:27:49:64:8c  127       16      21
  xe-0/0/23.0   Partner  65535     ac:1f:6b:8d:d7:bd  255       2       33
    =>>> notice the change in Linux system-id, ":bc" to ":bd"
LACP Statistics:    LACP Rx    LACP Tx  Unknown Rx  Illegal Rx
  xe-0/0/20.0       1541397    1441186  0           0
  xe-0/0/23.0       1727471    1601948  0           0
Marker Statistics:  Marker Rx  Resp Tx  Unknown Rx  Illegal Rx
  xe-0/0/20.0       0          0        0           0
  xe-0/0/23.0       0          0        0           0
Protocol eth-switch, MTU: 9216, Generation: 164, Route table: 0
Flags: None
[edit]
root@a6-qfx1#
05:28:20.306105 In LACPv1, length 110
    Actor Information TLV (0x01), length 20
      System ac:1f:6b:8d:d7:bd, System Priority 65535, Key 33, Port 2, Port Priority 255
      State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting]
    Partner Information TLV (0x02), length 20
      System 5c:45:27:49:64:8c, System Priority 127, Key 21, Port 16, Port Priority 127
      State Flags [Activity, Timeout, Aggregation, Synchronization]
    Collector Information TLV (0x03), length 16
      Max Delay 0
    Terminator TLV (0x00), length 0
--------
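For readers comparing the actor_state bytes in the gdb dumps with the State
Flags lists in the packet captures, the following Python sketch maps between
the two. The bit order follows the LACP Actor_State field (bit 0 Activity
through bit 7 Expired); decode_state and encode_state are hypothetical helper
names, and the sketch says nothing about when each gdb dump was taken relative
to the link being disabled.

```python
# LACP state bits, least significant bit first.
STATE_BITS = [
    "Activity",
    "Timeout",
    "Aggregation",
    "Synchronization",
    "Collecting",
    "Distributing",
    "Defaulted",
    "Expired",
]


def decode_state(state: int) -> list:
    """Return the names of the flags set in an 8-bit LACP state value."""
    return [name for bit, name in enumerate(STATE_BITS) if state & (1 << bit)]


def encode_state(names: list) -> int:
    """Return the 8-bit LACP state value for a list of flag names."""
    return sum(1 << STATE_BITS.index(name) for name in names)


if __name__ == "__main__":
    # actor_state values from the two gdb dumps in this report.
    for raw in (0x3F, 0x8F):
        print(hex(raw), decode_state(raw))
    # Actor flags from the capture taken after xe-0/0/20 was disabled.
    after = ["Activity", "Timeout", "Aggregation", "Synchronization", "Collecting"]
    print(after, hex(encode_state(after)))
```

Decoded this way, 0x3f carries the six flags seen in the first capture, 0x8f
has Expired set with Collecting and Distributing clear, and the actor flags in
the second capture correspond to 0x1f, that is, Distributing has been dropped.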