Date: Wed, 15 Mar 2017 15:25:37 +0100
From: Gaëtan Rivet
To: Thomas Monjalon
Cc: Bruce Richardson, Neil Horman, dev@dpdk.org, Adrien Mazarguil, techboard@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2 00/13] introduce fail-safe PMD
Message-ID: <20170315142537.GR908@bidouze.vm.6wind.com>
In-Reply-To: <2220671.FCXKFdQQ2A@xps13>
References: <20170314144947.GO908@bidouze.vm.6wind.com> <20170315032853.GA366048@bricha3-MOBL3.ger.corp.intel.com> <2220671.FCXKFdQQ2A@xps13>

On Wed, Mar 15, 2017 at 12:15:56PM +0100, Thomas Monjalon wrote:
>2017-03-15 03:28, Bruce Richardson:
>> On Tue, Mar 14, 2017 at 03:49:47PM +0100, Gaëtan Rivet wrote:
>> > - In the bonding, the init and configuration steps are still the
>> > responsibility of the application and no one else. The bonding PMD
>> > captures the device, re-applies its configuration upon dev_configure()
>> > which is actually re-applying part of the configuration already present
>> > within the slave eth_dev (cf rte_eth_dev_config_restore).
>> >
>> > - In the fail-safe, the init and configuration are both the
>> > responsibilities of the fail-safe PMD itself, not the application
>> > anymore. This handling of these responsibilities in lieu of the
>> > application is the whole point of the "deferred hot-plug" support, of
>> > proposing a simple implementation to the user.
>> >
>> > This change in responsibilities is the bulk of the fail-safe code. It
>> > would have to be added as-is to the bonding. Verifying the correctness
>> > of the sync of the initialization phase (acceptable states of a device
>> > following several events registered by the fail-safe PMD) and the
>> > configuration items between the state the application believes it is in
>> > and the fail-safe knows it is in, is the bulk of the fail-safe code.
>> >
>> > This function is not overlapping with that of the bonding. The reason I
>> > did not add this whole architecture to the bonding is that when I tried
>> > to do so, I found that I only had two possibilities:
>> >
>> > - The current slave handling path is kept, and we only add a new one
>> > with additional functionalities: full init and conf handling with
>> > extended parsing capabilities.
>> >
>> > - The current slave handling is scrapped and replaced entirely by the new
>> > slave management. The old capturing of existing devices is not done
>> > anymore.
>> >
>> > The first solution is not acceptable, because we effectively end up with
>> > a maintenance nightmare by having to validate two types of slaves with
>> > differing capabilities, differing initialization paths and differing
>> > configuration code. This is extremely awkward and architecturally
>> > unsound. This is essentially the same as having the exact code of the
>> > fail-safe as an aside in the bonding, maintaining exactly the same
>> > breadth of code while having muddier interfaces and organization.
>> >
>> > The second solution is not acceptable, because we are bending the whole
>> > existing bonding API to our whim. We could just as well simply rename
>> > the fail-safe PMD as bonding, add a few grouping capabilities and call
>> > it a day. This is not acceptable for users.
>> >
>> If the first solution is indeed not an option, why do you think this
>> second one would be unacceptable for users?
>> If the functionality remains the same, I don't see how it matters much
>> for users which driver provides it or where the code originates.
>>

The problem with the second solution is also that bonding is not only a
PMD. It exposes its own public API that existing applications rely on;
see the rte_eth_bond_*() definitions in rte_eth_bond.h. Although bonding
instances can be set up through command-line options, the target "users"
are mainly applications explicitly written to use it. This API must be
preserved, if for no other reason than that it has not been deprecated.

Also, trying to implement this API for the device failover function
would imply a device capture down to the devargs parsing level. This
means that a PMD could request taking over a device, messing with the
internals of the EAL: the devargs list and the buses' lists of devices.
This seems unacceptable.

The bonding API is thus in conflict with the concept of a device
failover in the context of the current DPDK architecture.

>> Despite all the discussion, it still just doesn't make sense to me to
>> have more than one DPDK driver to handle failover - be it link or
>> device. If nothing else, it's going to be awkward to explain to users
>> that if they want fail-over for when a link goes down they have to use
>> driver A, but if they want fail-over when a NIC gets hotplugged they use
>> driver B, and if they want both kinds of failover - which would surely
>> be the expected case - they need to use both drivers. The usability is
>> a problem here.

Having both kinds of failover in the same PMD will always lead to the
first solution in some form or another. I am sure we can document all
this in a way that does not cause users confusion, with the help of
community feedback such as yours.

Perhaps "net_failsafe" is a misnomer? We also thought about
"net_persistent" or "net_hotplug". Any other ideas?

It is also possible for me to remove the failover support from this
series, only providing deferred hot-plug handling at first.
I could then send the failover support as separate patches, to better
show that it is a useful, secondary feature that is essentially free to
implement.

>
>It seems everybody agrees on the need for the failsafe code.
>We are just discussing the right place to implement it.
>
>Gaetan, moving this code in the bonding PMD means replacing the bonding
>API design by the failsafe design, right?
>With the failsafe design in the bonding PMD, is it possible to keep other
>bonding features?

As seen previously, the bonding API is incompatible with device
failover. Having some features enabled solely for one kind of failover,
while having specific code paths for both, seems unnecessarily
complicated to me; this follows from my previous points about the first
solution.

>
>In case we do not have a consensus in the following days, I suggest to add
>this topic in the next techboard meeting agenda.

Regards,
-- 
Gaëtan Rivet
6WIND