From mboxrd@z Thu Jan 1 00:00:00 1970
To: Chas Williams <3chas3@gmail.com>
Cc: dev@dpdk.org, Declan Doherty, Ferruh Yigit
References: <1525867586-23328-1-git-send-email-radu.nicolau@intel.com>
 <1527783047-18201-1-git-send-email-radu.nicolau@intel.com>
From: Radu Nicolau
Message-ID: <83b445c2-a0a7-b6b5-b265-d3fdc3070790@intel.com>
Date: Fri, 1 Jun 2018 10:59:30 +0100
Subject: Re: [dpdk-dev] [PATCH v3] net/bonding: fix slave add for mode 4
List-Id: DPDK patches and discussions

On 6/1/2018 1:05 AM, Chas Williams wrote:
> It's not clear to me that the issue here is the bonding slave add.
> You can only add started PMDs.  When a PMD dev start is complete,
> the PMD should have a valid link state and the link properties should be
> valid.  A few of the PMDs are very good about this, particularly the
> ones with LSC interrupts.
> Those drivers often wait for the first
> link interrupt before setting their link status.  So there is a
> race where the link state isn't well defined.

Indeed, the source of the problem is that the link state is not properly
reflected across all the ports. The steps that lead to the issue are:

1. The user issues "port stop all" in testpmd; when a port is stopped, the
   internal link state bits are cleared (and the LSC interrupt will not run,
   from what I can see).
2. testpmd tries to update the link status on all the ports and reads the
   state of the first port, updating the bits that were cleared; with a
   stopped port, the ixgbe PMD (and probably others) sets the link_autoneg
   bit, but all other bits remain cleared.
3. Seeing a link down, testpmd stops checking; now the first port has the
   link_autoneg bit set, but all other ports have it cleared.
4. Trying to create a bonded port in mode 4 fails because of the
   link_autoneg bit.

To reiterate, the issue is creating a bonding port with stopped ports that
have the link_status bits cleared, where only the first port has had them
updated. My fix updated the bits on all the ports.

>
> And lastly, why do we care what the link state is when adding a
> slave?  If the link state changes to down, do we remove the slave?
> If the link speed of the slave changes, do we remove the slave?
> So this test doesn't make much sense.  For mode 4, you should be
> able to add a slave, but if the link state doesn't match what
> has been negotiated, then the slave should fail to activate.

You are right, I will send an updated patch that checks the slave link
status before activation. This should also solve the initial issue, as the
link state will already be updated.

>
> On Thu, May 31, 2018 at 12:10 PM, Radu Nicolau wrote:
> >
> > Add a call to rte_eth_link_get_nowait on every slave to update
> > the internal link status struct. Otherwise slave add will fail
> > for mode 4 if the ports are all stopped but only one of them checked.
> >
> > Fixes: b77d21cc2364 ("ethdev: add link status get/set helper functions")
> > Bugzilla ID: 52
> >
> > Signed-off-by: Radu Nicolau
> > ---
> > v3: updated commit msg
> > v2: add fix and Bugzilla references
> >
> >  drivers/net/bonding/rte_eth_bond_api.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/bonding/rte_eth_bond_api.c b/drivers/net/bonding/rte_eth_bond_api.c
> > index d558df8..cad08b9 100644
> > --- a/drivers/net/bonding/rte_eth_bond_api.c
> > +++ b/drivers/net/bonding/rte_eth_bond_api.c
> > @@ -296,6 +296,8 @@ __eth_bond_slave_add_lock_free(uint16_t bonded_port_id, uint16_t slave_port_id)
> >                 return -1;
> >         }
> >
> > +       rte_eth_link_get_nowait(slave_port_id, &link_props);
> > +
> >         slave_add(internals, slave_eth_dev);
> >
> >         /* We need to store slaves reta_size to be able to synchronize RETA for all
> > --
> > 2.7.5
> >
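
For reference, the four steps described in this reply can be modelled with a small standalone sketch. This is only a toy model, not DPDK code: every type and function name below (toy_link, toy_port, toy_check_all_links, toy_mode4_slave_add) is hypothetical. It also assumes that the mode 4 check simply compares the link_autoneg bit of the candidate slave against the first port, which is a simplification of what the bonding PMD does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the cached per-port link struct (NOT struct rte_eth_link). */
struct toy_link {
	bool status;   /* link up (true) or down (false) */
	bool autoneg;  /* models the link_autoneg bit */
};

struct toy_port {
	struct toy_link cached; /* link state as last written by the driver */
};

/* Step 1: stopping a port clears its cached link state bits. */
static void toy_port_stop(struct toy_port *p)
{
	p->cached.status = false;
	p->cached.autoneg = false;
}

/* Refreshing a stopped port sets autoneg and leaves the other bits
 * cleared, as described for ixgbe in the thread. */
static void toy_link_refresh(struct toy_port *p)
{
	p->cached.autoneg = true;
}

/* Steps 2 and 3: refresh ports in order, stop at the first link that is down. */
static void toy_check_all_links(struct toy_port *ports, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		toy_link_refresh(&ports[i]);
		if (!ports[i].cached.status)
			break; /* later ports keep their cleared bits */
	}
}

/* Step 4 (assumed check): reject a slave whose autoneg bit differs from
 * the first port. refresh_first models the extra link query in the patch. */
static int toy_mode4_slave_add(const struct toy_port *first,
			       struct toy_port *candidate, bool refresh_first)
{
	if (refresh_first)
		toy_link_refresh(candidate);
	return first->cached.autoneg == candidate->cached.autoneg ? 0 : -1;
}
```

In this model, stopping two ports and running toy_check_all_links() leaves only the first port with the autoneg bit set, so toy_mode4_slave_add() rejects the second port unless its state is refreshed first. Refreshing at add time is what the rte_eth_link_get_nowait() call in the quoted patch does.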