From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla@dpdk.org
To: dev@dpdk.org
Date: Fri, 09 Oct 2020 18:43:15 +0000
Subject: [dpdk-dev] [Bug 551] LACP failover with 802.3ad bond mode 4 takes long time

https://bugs.dpdk.org/show_bug.cgi?id=551

            Bug ID: 551
           Summary: LACP failover with 802.3ad bond mode 4 takes long time
           Product: DPDK
           Version: 20.11
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: major
          Priority: Normal
         Component: ethdev
          Assignee: dev@dpdk.org
          Reporter: kiran.kn80@gmail.com
  Target Milestone: ---

When one of the bond slaves with 802.3ad is disabled, the switchover takes
almost 6 seconds, which is not acceptable for any telco. We need sub-second
switchover time, as with Linux. Testing was done with a Juniper QFX switch.

The reason is that the system ID changes (to that of the other slave device)
when one of the active slaves goes down. This causes renegotiation, which
takes a long time to converge.

Is the system ID expected to be different for each link? Shouldn't it be the
same for all links?

As you can see below:
system ID of slave 0 is {0xac, 0x1f, 0x6b, 0x8d, 0xd7, 0xbd}
system ID of slave 1 is {0xac, 0x1f, 0x6b, 0x8d, 0xd7, 0xbc}

Due to this, when the active slave goes down, the system ID changes.
This has been shown to Doherty, Declan.

----- Logs from DPDK application -----

Breakpoint 1, rx_machine (internals=0x11409edf00, slave_id=1, lacp=0x13ae8b4be)
    at /root/contrail/third_party/dpdk/drivers/net/bonding/rte_eth_bond_8023ad.c:326
326     /root/contrail/third_party/dpdk/drivers/net/bonding/rte_eth_bond_8023ad.c: No such file or directory.
(gdb) p/x *port
$2 = {actor_state = 0x3f, actor = {system_priority = 0xffff,
    system = {addr_bytes = {0xac, 0x1f, 0x6b, 0x8d, 0xd7, 0xbd}},
    key = 0x2100, port_priority = 0xff00, port_number = 0x200},
  partner_state = 0x3f, partner = {system_priority = 0x7f00,
    system = {addr_bytes = {0x5c, 0x45, 0x27, 0x49, 0x64, 0x8c}},
    key = 0x1500, port_priority = 0x7f00, port_number = 0x1000},
  sm_flags = 0x202, selected = 0x2, forced_rx_flags = 0x1,
  current_while_timer = 0x23e52ecad117e2, periodic_timer = 0x23e52d82f5c5a2,
  wait_while_timer = 0x23dae7bc8488e2, tx_machine_timer = 0x23e52d4e8213a1,
  tx_marker_timer = 0x0, aggregator_port_id = 0x0, mbuf_pool = 0x113f5db580,
  rx_ring = 0x113fa00bc0, tx_ring = 0x113fa00980, rx_marker_timer = 0x0,
  warning_timer = 0x23db47b16ff07d, warnings_to_show = 0x10, slow_pool = 0x0}

Breakpoint 1, rx_machine (internals=0x11409edf00, slave_id=0, lacp=0x13d7ce87e)
    at /root/contrail/third_party/dpdk/drivers/net/bonding/rte_eth_bond_8023ad.c:326
326     in /root/contrail/third_party/dpdk/drivers/net/bonding/rte_eth_bond_8023ad.c
(gdb) p/x *port
$3 = {actor_state = 0x8f, actor = {system_priority = 0xffff,
    system = {addr_bytes = {0xac, 0x1f, 0x6b, 0x8d, 0xd7, 0xbc}},
    key = 0x2100, port_priority = 0xff00, port_number = 0x100},
  partner_state = 0x3f, partner = {system_priority = 0x7f00,
    system = {addr_bytes = {0x5c, 0x45, 0x27, 0x49, 0x64, 0x8c}},
    key = 0x1500, port_priority = 0x7f00, port_number = 0x1100},
  sm_flags = 0x202, selected = 0x2, forced_rx_flags = 0x1,
  current_while_timer = 0x23e537ad7f62bc, periodic_timer = 0x23e5369a1fcadb,
  wait_while_timer = 0x23da0985aca4f4, tx_machine_timer = 0x23e53665ac40d8,
  tx_marker_timer = 0x0, aggregator_port_id = 0x0, mbuf_pool = 0x114022b800,
  rx_ring = 0x1140607600, tx_ring = 0x11406073c0, rx_marker_timer = 0x0,
  warning_timer = 0x23e536a73cfc50, warnings_to_show = 0x10, slow_pool = 0x0}

---------

---- Logs from Juniper QFX switch ----
=
=
root@a6-qfx1# run show lacp interfaces ae20
Aggregated interface: ae20
    LACP state:       Role   Exp   Def  Dist  Col  Syn  Aggr  Timeout  Activity
      xe-0/0/23      Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      xe-0/0/23    Partner    No    No   Yes  Yes  Yes   Yes     Fast    Active
      xe-0/0/20      Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      xe-0/0/20    Partner    No    No   Yes  Yes  Yes   Yes     Fast    Active
    LACP protocol:   Receive State   Transmit State   Mux State
      xe-0/0/23      Current         Fast periodic    Collecting distributing
      xe-0/0/20      Current         Fast periodic    Collecting distributing

[edit]
root@a6-qfx1# run show interfaces ae20 extensive | find LACP
    LACP info:        Role     System             System       Port     Port    Port
                             priority         identifier   priority   number     key
      xe-0/0/20.0    Actor        127  5c:45:27:49:64:8c        127       17      21
      xe-0/0/20.0  Partner      65535  ac:1f:6b:8d:d7:bc        255        1      33
      xe-0/0/23.0    Actor        127  5c:45:27:49:64:8c        127       16      21
      xe-0/0/23.0  Partner      65535  ac:1f:6b:8d:d7:bc        255        2      33
    LACP Statistics:       LACP Rx     LACP Tx   Unknown Rx   Illegal Rx
      xe-0/0/20.0          1541355     1441147            0            0
      xe-0/0/23.0          1727402     1601884            0            0
    Marker Statistics:   Marker Rx     Resp Tx   Unknown Rx   Illegal Rx
      xe-0/0/20.0                0           0            0            0
      xe-0/0/23.0                0           0            0            0

  Protocol eth-switch, MTU: 9216, Generation: 164, Route table: 0
    Flags: None

05:28:19.157776  In
    LACPv1, length 110
      Actor Information TLV (0x01), length 20
        System ac:1f:6b:8d:d7:bc, System Priority 65535, Key 33, Port 2, Port Priority 255
        State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
      Partner Information TLV (0x02), length 20
        System 5c:45:27:49:64:8c, System Priority 127, Key 21, Port 16, Port Priority 127
        State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
      Collector Information TLV (0x03), length 16
        Max Delay 0
      Terminator TLV (0x00), length 0

[edit]
root@a6-qfx1#

[edit]
root@a6-qfx1# set interfaces xe-0/0/20 disable

root@a6-qfx1# commit
commit complete

[edit]
root@a6-qfx1# run show lacp interfaces ae20
Aggregated interface: ae20
    LACP state:       Role   Exp   Def  Dist  Col  Syn  Aggr  Timeout  Activity
      xe-0/0/23      Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      xe-0/0/23    Partner    No    No   Yes  Yes  Yes   Yes     Fast    Active
      xe-0/0/20      Actor    No   Yes    No   No   No   Yes     Fast    Active
      xe-0/0/20    Partner    No   Yes    No   No   No   Yes     Fast   Passive
    LACP protocol:   Receive State   Transmit State   Mux State
      xe-0/0/23      Current         Fast periodic    Collecting distributing
      xe-0/0/20      Port disabled   No periodic      Detached

[edit]
root@a6-qfx1# run show interfaces ae20 extensive | find LACP
    LACP info:        Role     System             System       Port     Port    Port
                             priority         identifier   priority   number     key
      xe-0/0/20.0    Actor        127  5c:45:27:49:64:8c        127       17      21
      xe-0/0/20.0  Partner      65535  ac:1f:6b:8d:d7:bd          1       17      33
      xe-0/0/23.0    Actor        127  5c:45:27:49:64:8c        127       16      21
      xe-0/0/23.0  Partner      65535  ac:1f:6b:8d:d7:bd        255        2      33
=>>> notice the change in Linux system-id “:bc to :bd”
    LACP Statistics:       LACP Rx     LACP Tx   Unknown Rx   Illegal Rx
      xe-0/0/20.0          1541397     1441186            0            0
      xe-0/0/23.0          1727471     1601948            0            0
    Marker Statistics:   Marker Rx     Resp Tx   Unknown Rx   Illegal Rx
      xe-0/0/20.0                0           0            0            0
      xe-0/0/23.0                0           0            0            0

  Protocol eth-switch, MTU: 9216, Generation: 164, Route table: 0
    Flags: None

[edit]
root@a6-qfx1#

05:28:20.306105  In
    LACPv1, length 110
      Actor Information TLV (0x01), length 20
        System ac:1f:6b:8d:d7:bd, System Priority 65535, Key 33, Port 2, Port Priority 255
        State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting]
      Partner Information TLV (0x02), length 20
        System 5c:45:27:49:64:8c, System Priority 127, Key 21, Port 16, Port Priority 127
        State Flags [Activity, Timeout, Aggregation, Synchronization]
      Collector Information TLV (0x03), length 16
        Max Delay 0
      Terminator TLV (0x00), length 0
=
=
--------
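
For anyone cross-checking the gdb dumps against the switch output, the raw
values can be decoded with a few lines of Python. This is only an
illustrative sketch, not code from DPDK or from the report. It assumes the
IEEE 802.3ad state bit layout (bit 0 Activity up to bit 7 Expired) and that
the 16-bit fields in the dumps are network byte order values printed by a
little-endian host, which is consistent with key 0x2100 showing up as Key 33
on the switch. The helper names decode_state, swap16 and format_mac are
hypothetical.

```python
# LACP state bits per IEEE 802.3ad, least significant bit first.
STATE_BITS = [
    "Activity", "Timeout", "Aggregation", "Synchronization",
    "Collecting", "Distributing", "Defaulted", "Expired",
]


def decode_state(state):
    """Return the names of the flags set in an 8-bit LACP state value."""
    return [name for bit, name in enumerate(STATE_BITS) if state & (1 << bit)]


def swap16(value):
    """Byte-swap a 16-bit value (assumption: a network order field
    printed as-is by gdb on a little-endian host)."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


def format_mac(addr_bytes):
    """Format six address bytes the way the switch prints a system ID."""
    return ":".join(f"{b:02x}" for b in addr_bytes)


# Actor values copied from the two gdb dumps ($2 and $3) in the report.
dumps = {
    "$2": {"actor_state": 0x3F, "key": 0x2100, "port_number": 0x200,
           "system": [0xAC, 0x1F, 0x6B, 0x8D, 0xD7, 0xBD]},
    "$3": {"actor_state": 0x8F, "key": 0x2100, "port_number": 0x100,
           "system": [0xAC, 0x1F, 0x6B, 0x8D, 0xD7, 0xBC]},
}

for name, d in dumps.items():
    print(name,
          "system", format_mac(d["system"]),
          "key", swap16(d["key"]),
          "port", swap16(d["port_number"]),
          "flags", decode_state(d["actor_state"]))
```

Under these assumptions both dumps decode to key 33, matching the Partner
key on the switch, while the two actor system IDs differ in the last byte.
The actor_state of $3 (0x8f) has Expired set and neither Collecting nor
Distributing, whereas $2 (0x3f) has all six operational flags set.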