From: bugzilla@dpdk.org
To: dev@dpdk.org
Date: Thu, 21 May 2020 12:12:01 +0000
Subject: [dpdk-dev] [Bug 483] Bond 8023ad lacp handshake sometimes fail

https://bugs.dpdk.org/show_bug.cgi?id=483

Bug ID: 483
Summary: Bond 8023ad lacp handshake sometimes fail
Product: DPDK
Version: 19.11
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: iobeyond@126.com
Target Milestone: ---

There are two ports in my bond, and the two hosts are connected through a switch.
I enabled the DPDK debug output with the macro RTE_LIBRTE_BOND_DEBUG_8023AD.

Port 0 MAC: ac:f9:70:88:f3:26
Port 1 MAC: ac:f9:70:88:f3:27
BOND MAC: ac:f9:70:88:f3:26

When tx_machine sends LACP frames with the Port 1 MAC ac:f9:70:88:f3:27 as the actor system, the handshake fails.

When the LACP handshake fails, the log looks like this:
----------
997 [Port 0: rx_machine] -> INITIALIZE
997 [Port 0: periodic_machine] -> NO_PERIODIC ( begind LACP active )
997 [Port 0: mux_machine] -> DETACHED
997 [Port 0: selection_logic] -> SELECTED: ID= 1
        aggregator found aggregator ID= 1
997 [Port 0: mux_machine] DETACHED -> WAITING
1995 [Port 1: tx_machine] Sending LACP frame
bond_print_lacp(122) - LACP: {
  subtype= 01
  ver_num=01
  actor={ tlv=01, len=14
    pri=FFFF, system=AC:F9:70:88:F3:27, key=2100, p_pri=FF00 p_num=0200
    state={ ACT AGG DEF EXP }
  }
  partner={ tlv=02, len=14
    pri=FFFF, system=00:00:00:00:00:00, key=0100, p_pri=FF00 p_num=0000
    state={ ACT TIMEOUT AGG }
  }
  collector={info=03, length=10, max_delay=0000 , type_term=00, terminator_length = 00 }
1995 [Port 0: tx_machine] Sending LACP frame
bond_print_lacp(122) - LACP: {
  subtype= 01
  ver_num=01
  actor={ tlv=01, len=14
    pri=FFFF, system=AC:F9:70:88:F3:27, key=2100, p_pri=FF00 p_num=0100
    state={ ACT AGG DEF EXP }
  }
  partner={ tlv=02, len=14
    pri=FFFF, system=00:00:00:00:00:00, key=0100, p_pri=FF00 p_num=0000
    state={ ACT TIMEOUT AGG }
  }
  collector={info=03, length=10, max_delay=0000 , type_term=00, terminator_length = 00 }
2095 [Port 1: mux_machine] ATTACHED Entered
2594 [Port 1: tx_machine] Sending LACP frame
----------

When the LACP handshake succeeds, the log looks like this:
----------
0 [Port 0: rx_machine] -> INITIALIZE
0 [Port 0: periodic_machine] -> NO_PERIODIC ( begind LACP active )
0 [Port 0: mux_machine] -> DETACHED
99 [Port 0: mux_machine] DETACHED -> WAITING
Waiting for slaves to become active...
Port 2 MAC: ac:f9:70:88:f3:26
236 [Port 1: rx_machine] -> INITIALIZE
236 [Port 1: periodic_machine] -> NO_PERIODIC ( begind LACP active )
236 [Port 1: mux_machine] -> DETACHED
236 [Port 1: selection_logic] -> SELECTED: ID= 0
        aggregator found aggregator ID= 0
236 [Port 1: mux_machine] DETACHED -> WAITING
1034 [Port 0: tx_machine] Sending LACP frame
1034 [Port 0: tx_machine] Sending LACP frame
bond_print_lacp(122) - LACP: {
  subtype= 01
  ver_num=01
  actor={ tlv=01, len=14
    pri=FFFF, system=AC:F9:70:88:F3:26, key=2100, p_pri=FF00 p_num=0100
    state={ ACT AGG DEF EXP }
  }
  partner={ tlv=02, len=14
    pri=FFFF, system=00:00:00:00:00:00, key=0100, p_pri=FF00 p_num=0000
    state={ ACT TIMEOUT AGG }
  }
  collector={info=03, length=10, max_delay=0000 , type_term=00, terminator_length = 00 }
1234 [Port 1: tx_machine] Sending LACP frame
bond_print_lacp(122) - LACP: {
  subtype= 01
  ver_num=01
  actor={ tlv=01, len=14
    pri=FFFF, system=AC:F9:70:88:F3:26, key=2100, p_pri=FF00 p_num=0200
    state={ ACT AGG DEF EXP }
  }
  partner={ tlv=02, len=14
    pri=FFFF, system=00:00:00:00:00:00, key=0100, p_pri=FF00 p_num=0000
    state={ ACT TIMEOUT AGG }
  }
  collector={info=03, length=10, max_delay=0000 , type_term=00, terminator_length = 00 }
2032 [Port 0: tx_machine] Sending LACP frame
2332 [Port 1: rx_machine] LACP -> CURRENT
bond_print_lacp(122) - LACP: {
  subtype= 01
  ver_num=01
  actor={ tlv=01, len=14
    pri=0080, system=F8:98:EF:69:83:91, key=417F, p_pri=0080 p_num=0600
    state={ ACT TIMEOUT AGG }
  }
  partner={ tlv=02, len=14
    pri=FFFF, system=AC:F9:70:88:F3:26, key=2100, p_pri=FF00 p_num=0200
    state={ ACT AGG DEF EXP }
  }
  collector={info=03, length=10, max_delay=0000 , type_term=00, terminator_length = 00 }
2332 [Port 1: mux_machine] ATTACHED Entered
----------

From my observation: when the log prints "SELECTED: ID= 1", the bond uses the wrong MAC address to send LACP frames.
The selection_logic function chooses the wrong aggregator_port_id here.

rte_eth_bond_8023ad.c:749

    case AGG_STABLE:
        if (default_slave == slaves_count)
            new_agg_id = slaves[slave_id];
        else
            new_agg_id = slaves[default_slave]; // sometimes new_agg_id will be 1

Why does the LACP handshake succeed sometimes?
The "slaves" array is filled in a nondeterministic order by the function "activate_slave".
When port 0 fills slaves[0], it works correctly.