[dpdk-dev] [Bug 483] Bond 8023ad lacp handshake sometimes fail
From: bugzilla @ 2020-05-21 12:12 UTC
To: dev

https://bugs.dpdk.org/show_bug.cgi?id=483

            Bug ID: 483
           Summary: Bond 8023ad lacp handshake sometimes fail
           Product: DPDK
           Version: 19.11
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: ethdev
          Assignee: dev@dpdk.org
          Reporter: iobeyond@126.com
  Target Milestone: ---

My bond has two ports, and the two hosts are connected through a switch.
I enabled the DPDK debug output with the RTE_LIBRTE_BOND_DEBUG_8023AD macro.


Port 0 MAC: ac:f9:70:88:f3:26
Port 1 MAC: ac:f9:70:88:f3:27
BOND MAC: ac:f9:70:88:f3:26

When tx_machine sends LACP frames with the Port 1 MAC (ac:f9:70:88:f3:27) as
the actor system address, the handshake fails.
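
To spot the failing case quickly in a debug log, the actor "system=" value
printed by bond_print_lacp can be compared with the bond MAC. The following is
only a minimal sketch of such a check; mac_parse and
actor_system_matches_bond are hypothetical helper names and are not part of
DPDK.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical helper: parse "aa:bb:cc:dd:ee:ff" (upper or lower case)
 * into 6 bytes. Trailing characters after the address are not checked. */
static bool mac_parse(const char *text, uint8_t out[MAC_LEN])
{
	unsigned int b[MAC_LEN];
	int i;

	if (sscanf(text, "%2x:%2x:%2x:%2x:%2x:%2x",
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5]) != MAC_LEN)
		return false;

	for (i = 0; i < MAC_LEN; i++)
		out[i] = (uint8_t)b[i];
	return true;
}

/* Hypothetical helper: true when the actor system address taken from a
 * logged LACPDU equals the bond MAC. Returns false on a parse error. */
static bool actor_system_matches_bond(const char *actor_system,
				      const char *bond_mac)
{
	uint8_t a[MAC_LEN], b[MAC_LEN];

	if (!mac_parse(actor_system, a) || !mac_parse(bond_mac, b))
		return false;
	return memcmp(a, b, MAC_LEN) == 0;
}
```

With the addresses from this report, the check passes for
system=AC:F9:70:88:F3:26 and fails for system=AC:F9:70:88:F3:27.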


When the LACP handshake fails, the log looks like this:

----------

  997 [Port 0: rx_machine] -> INITIALIZE
   997 [Port 0: periodic_machine] -> NO_PERIODIC ( begind LACP active )
   997 [Port 0: mux_machine] -> DETACHED
   997 [Port 0: selection_logic] -> SELECTED: ID=  1
        aggregator found aggregator ID=  1
   997 [Port 0: mux_machine] DETACHED -> WAITING

1995 [Port 1: tx_machine] Sending LACP frame
bond_print_lacp(122) - LACP: {
  subtype= 01
  ver_num=01
  actor={ tlv=01, len=14
    pri=FFFF, system=AC:F9:70:88:F3:27, key=2100, p_pri=FF00 p_num=0200
       state={ ACT AGG DEF EXP }
  }
  partner={ tlv=02, len=14
    pri=FFFF, system=00:00:00:00:00:00, key=0100, p_pri=FF00 p_num=0000
       state={ ACT TIMEOUT AGG }
  }
  collector={info=03, length=10, max_delay=0000
, type_term=00, terminator_length = 00 }
  1995 [Port 0: tx_machine] Sending LACP frame
bond_print_lacp(122) - LACP: {
  subtype= 01
  ver_num=01
  actor={ tlv=01, len=14
    pri=FFFF, system=AC:F9:70:88:F3:27, key=2100, p_pri=FF00 p_num=0100
       state={ ACT AGG DEF EXP }
  }
  partner={ tlv=02, len=14
    pri=FFFF, system=00:00:00:00:00:00, key=0100, p_pri=FF00 p_num=0000
       state={ ACT TIMEOUT AGG }
  }
  collector={info=03, length=10, max_delay=0000
, type_term=00, terminator_length = 00 }
  2095 [Port 1: mux_machine] ATTACHED Entered
  2594 [Port 1: tx_machine] Sending LACP frame
----------

When the LACP handshake succeeds, the log looks like this:
----------

     0 [Port 0: rx_machine] -> INITIALIZE
     0 [Port 0: periodic_machine] -> NO_PERIODIC ( begind LACP active )
     0 [Port 0: mux_machine] -> DETACHED
    99 [Port 0: mux_machine] DETACHED -> WAITING
Waiting for slaves to become active...
Port 2 MAC: ac:f9:70:88:f3:26
   236 [Port 1: rx_machine] -> INITIALIZE
   236 [Port 1: periodic_machine] -> NO_PERIODIC ( begind LACP active )
   236 [Port 1: mux_machine] -> DETACHED
   236 [Port 1: selection_logic] -> SELECTED: ID=  0
        aggregator found aggregator ID=  0
   236 [Port 1: mux_machine] DETACHED -> WAITING
  1034 [Port 0: tx_machine] Sending LACP frame

1034 [Port 0: tx_machine] Sending LACP frame
bond_print_lacp(122) - LACP: {
  subtype= 01
  ver_num=01
  actor={ tlv=01, len=14
    pri=FFFF, system=AC:F9:70:88:F3:26, key=2100, p_pri=FF00 p_num=0100
       state={ ACT AGG DEF EXP }
  }
  partner={ tlv=02, len=14
    pri=FFFF, system=00:00:00:00:00:00, key=0100, p_pri=FF00 p_num=0000
       state={ ACT TIMEOUT AGG }
  }
  collector={info=03, length=10, max_delay=0000
, type_term=00, terminator_length = 00 }
  1234 [Port 1: tx_machine] Sending LACP frame
bond_print_lacp(122) - LACP: {
  subtype= 01
  ver_num=01
  actor={ tlv=01, len=14
    pri=FFFF, system=AC:F9:70:88:F3:26, key=2100, p_pri=FF00 p_num=0200
       state={ ACT AGG DEF EXP }
  }
  partner={ tlv=02, len=14
    pri=FFFF, system=00:00:00:00:00:00, key=0100, p_pri=FF00 p_num=0000
       state={ ACT TIMEOUT AGG }
  }
  collector={info=03, length=10, max_delay=0000
, type_term=00, terminator_length = 00 }
  2032 [Port 0: tx_machine] Sending LACP frame

2332 [Port 1: rx_machine] LACP -> CURRENT
bond_print_lacp(122) - LACP: {
  subtype= 01
  ver_num=01
  actor={ tlv=01, len=14
    pri=0080, system=F8:98:EF:69:83:91, key=417F, p_pri=0080 p_num=0600
       state={ ACT TIMEOUT AGG }
  }
  partner={ tlv=02, len=14
    pri=FFFF, system=AC:F9:70:88:F3:26, key=2100, p_pri=FF00 p_num=0200
       state={ ACT AGG DEF EXP }
  }
  collector={info=03, length=10, max_delay=0000
, type_term=00, terminator_length = 00 }
  2332 [Port 1: mux_machine] ATTACHED Entered

----------

From my observation, whenever the log prints "SELECTED: ID=  1", the wrong
MAC address is used to send LACP frames.

The selection_logic function chooses the wrong aggregator_port_id here:

rte_eth_bond_8023ad.c:749

        case AGG_STABLE:
                if (default_slave == slaves_count)
                        new_agg_id = slaves[slave_id];
                else
                        /* sometimes new_agg_id will be 1 */
                        new_agg_id = slaves[default_slave];

Why does the LACP handshake succeed sometimes?

The "slaves" array is filled in an unpredictable order by the activate_slave
function. When port 0 fills slaves[0], it works correctly.
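
The order dependence can be reproduced outside the driver with a small model.
This is only a sketch under assumptions: struct slave_list,
model_activate_slave and model_select_stable are hypothetical stand-ins that
mirror the AGG_STABLE branch quoted in this report, and default_slave is
assumed to be 0 when the branch is reached. It is not the actual driver code.

```c
#include <stdint.h>

#define MAX_SLAVES 8

/* Simplified, hypothetical stand-in for the bond's active slave list. */
struct slave_list {
	uint16_t ids[MAX_SLAVES];
	uint16_t count;
};

/* Append a port in the order it becomes active; this models the
 * activation-order dependence described in the report. */
static void model_activate_slave(struct slave_list *l, uint16_t port_id)
{
	if (l->count < MAX_SLAVES)
		l->ids[l->count++] = port_id;
}

/* Mirrors the AGG_STABLE branch quoted in the report: the result is an
 * entry of the slaves array, so it depends on the array order. */
static uint16_t model_select_stable(const struct slave_list *l,
				    uint16_t slave_id,
				    uint16_t default_slave)
{
	if (default_slave == l->count)
		return l->ids[slave_id];
	return l->ids[default_slave];
}
```

If port 0 is activated first, model_select_stable(&list, 1, 0) returns 0. If
port 1 is activated first, the same call returns 1, which corresponds to the
"SELECTED: ID=  1" case in the failing log.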

