From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58]) by dpdk.org (Postfix) with ESMTP id 87FA1B150 for ; Fri, 13 Jun 2014 21:38:07 +0200 (CEST) Received: from hmsreliant.think-freely.org ([2001:470:8:a08:7aac:c0ff:fec2:933b] helo=localhost) by smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63) (envelope-from ) id 1WvXIm-0000Jn-T9; Fri, 13 Jun 2014 15:38:21 -0400 Date: Fri, 13 Jun 2014 15:38:15 -0400 From: Neil Horman To: "Doherty, Declan" Message-ID: <20140613193815.GE22451@hmsreliant.think-freely.org> References: <258914f35917ae07dddc991ac9726542964dce44.1402662300.git.declan.doherty@intel.com> <20140613160807.GD22451@hmsreliant.think-freely.org> <345C63BAECC1AD42A2EC8C63AFFC3ADC13D38EE9@IRSMSX101.ger.corp.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <345C63BAECC1AD42A2EC8C63AFFC3ADC13D38EE9@IRSMSX101.ger.corp.intel.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Score: -2.9 (--) X-Spam-Status: No Cc: "dev@dpdk.org" Subject: Re: [dpdk-dev] [PATCH v3 2/5] Link Bonding PMD Library (librte_eal/librte_ether link bonding support changes) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 Jun 2014 19:38:07 -0000 On Fri, Jun 13, 2014 at 06:34:04PM +0000, Doherty, Declan wrote: > > -----Original Message----- > > From: Neil Horman [mailto:nhorman@tuxdriver.com] > > Sent: Friday, June 13, 2014 5:08 PM > > To: Doherty, Declan > > Cc: dev@dpdk.org > > Subject: Re: [dpdk-dev] [PATCH v3 2/5] Link Bonding PMD Library > > (librte_eal/librte_ether link bonding support changes) > > > > On Fri, Jun 13, 2014 at 03:41:59PM +0100, Declan Doherty wrote: > > > Updating functionality in EAL to support adding link bonding > > > devices via –vdev option. Link bonding devices will be > > > initialized after all physical devices have been probed and > > > initialized. > > > > > > Signed-off-by: Declan Doherty > > > --- > > > lib/librte_eal/common/eal_common_dev.c | 66 > > +++++++++++++++++++++++++++-- > > > lib/librte_eal/common/eal_common_pci.c | 6 +++ > > > lib/librte_eal/common/include/eal_private.h | 7 +++ > > > lib/librte_eal/common/include/rte_dev.h | 1 + > > > lib/librte_ether/rte_ethdev.c | 34 +++++++++++++-- > > > lib/librte_ether/rte_ethdev.h | 7 ++- > > > lib/librte_pmd_pcap/rte_eth_pcap.c | 22 +++++----- > > > lib/librte_pmd_ring/rte_eth_ring.c | 32 +++++++------- > > > lib/librte_pmd_ring/rte_eth_ring.h | 3 +- > > > lib/librte_pmd_xenvirt/rte_eth_xenvirt.c | 2 +- > > > 10 files changed, 144 insertions(+), 36 deletions(-) > > > > > > diff --git a/lib/librte_eal/common/eal_common_dev.c > > b/lib/librte_eal/common/eal_common_dev.c > > > index eae5656..b50c908 100644 > > > --- a/lib/librte_eal/common/eal_common_dev.c > > > +++ b/lib/librte_eal/common/eal_common_dev.c > > > @@ -75,14 +75,28 @@ rte_eal_dev_init(void) > > > > > > /* call the init function for each virtual device */ > > > TAILQ_FOREACH(devargs, &devargs_list, next) { > > > + uint8_t bdev = 0; > > > > > > if (devargs->type != RTE_DEVTYPE_VIRTUAL) > > > continue; > > > > > > TAILQ_FOREACH(driver, &dev_driver_list, next) { > > > - if (driver->type != PMD_VDEV) > > > + /* RTE_DEVTYPE_VIRTUAL can only be a virtual or bonded > > device*/ > > > + if (driver->type != PMD_VDEV && driver->type != > > PMD_BDEV) > > > continue; > > > > > > + /* > > > + * Bonded devices are not initialize here, we do it later in > > > + * rte_eal_bonded_dev_init() after all physical devices > > have been > > > + * probed and initialized > > > + */ > > > + if (driver->type == PMD_BDEV && > > > + !strncmp(driver->name, devargs- > > >virtual.drv_name, > > > + strlen(driver->name))) { > > > + bdev = 1; > > > + break; > > > + } > > > + > > I really don't think you need to add a new device type for bonded devs. Its got > > no specific hardware that it drives, and you configure it with a --vdev command, > > so treat it as one here. I understand that you need to pass additional > > information about slaves to a bonded device, which is fine, but you can do that > > with kvargs pretty easily, at which point its just another vdev. The only other > > requirement is that you initilize the bonded vdev after the slave vdevs have > > been created, which you can do by any of several methods (a priority field to > > indicate that bonded drivers should be initilized last/later, a deferral return > > code from the init routine, or by dead reckoning via the careful construction of > > the application command line (placed the bonded --vdev option last on the > > command line argument list at run time). > > > It was my initial intent to do as you have describe above, but the physical devices > cause a real issue here, physical devices don't call through to rte_eth_dev_allocate until > during rte_eal_pci_probe call, so it's not possible to initialize the bonded device from > within rte_eal_dev_init as the physical devices have not been fully initialized at this > point, as a port_id has not been allocated and can't be added as bonding slaves. I don't > see away around this without changing the EAL API, which I've tried to avoid with this solution. > > Ordering isn't an issue, and can easily be solved if the above problem didn't exist, and although > a new device type isn't technically required, I think it's a cleaner solution than doing > string comparisons. > Ugh, you're right. That stinks. We still shouldn't just drop the init code in with the pci probe code though, especially if dpdk starts supporting more stacked network devices. A notification system for interface creation that you could monitor for pci_device addresses (B:D:F tupples) would be nice, but might be overkill since dpdk doesn't really support device freeing yet. what might be useful is a generic ordering mechanism for rte_eal_dev_init. If you added a priority field to the rte_driver structure, you could create an enumeration for the field to define when the init routine for that driver was called something like: typdef enum { PMD_INIT_FIRST = 0, PMD_INIT_POST_PROBE = 1, } Then you could make multiple calls to rte_eal_dev_init from rte_eal_init at the appropriate places in the initalization process, each time passing one of the above values. rte_eal_dev_init would only initalize those drivers whos priority matched the passed in priority. That would be helpful for interfaces like vlans, vxlans, tunnels, etc. > > > > > > +static int > > > +rte_eth_dev_name_unique(const char* name) > > > +{ > > > + unsigned i; > > > + > > > + for (i = 0; i < nb_ports; i++) { > > > + if (strcmp(rte_eth_devices[i].data->name, name) == 0) > > > + return -1; > > > + } > > > + > > > + return 0; > > > +} > > > + > > > struct rte_eth_dev * > > > -rte_eth_dev_allocate(void) > > > +rte_eth_dev_allocate(const char* name) > > > { > > > struct rte_eth_dev *eth_dev; > > > > > > @@ -165,23 +179,37 @@ rte_eth_dev_allocate(void) > > > if (rte_eth_dev_data == NULL) > > > rte_eth_dev_data_alloc(); > > > > > > + if (rte_eth_dev_name_unique(name)) { > > > + PMD_DEBUG_TRACE("Ethernet Device with name %s already > > allocated!\n"); > > > + return NULL; > > > + } > > > + > > This seems fairly racy if you allow dynamic device creation at run time from the > > application, if multiple threads attempt to create bonds in parallel. > > True but if this is an issue then there probably should be some locking around this allocation anyway. Agreed the allocation of the port id is also racy. > > > > > /** > > > diff --git a/lib/librte_pmd_pcap/rte_eth_pcap.c > > b/lib/librte_pmd_pcap/rte_eth_pcap.c > > Hmm, we're modifying other pmds for the naming feature, I think it would be best > > split out into a separate patch. Something entitled "support unique interface > > naming for virtual pmds" or something. > > True, but if I do that, I will need to break support for virtual devices in this patch set, and > I would prefer to keep this functionality working initial release of link bonding. Also another > patchset will be required at some point in the near future to tidy up/rationalize the current > handling of virtual devices identification and these changes do not change functionally how > these devices work. I'm not sure I'm following what you're saying here. If you split this into another patch set, and order earlier in this series, then you won't break anything. I'm not suggesting a functional change here, just a organizational one, so that this change doesn't do more than the changelog indicates. Neil