From: "Dumitrescu, Cristian"
To: Thomas Monjalon, "Singh, Jasvinder"
CC: "dev@dpdk.org", "Yigit, Ferruh"
Date: Fri, 6 Oct 2017 10:40:43 +0000
Subject: Re: [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others
Message-ID: <3EB4FA525960D640B5BDFFD6A3D891267BACA9D0@IRSMSX108.ger.corp.intel.com>
References: <20170811124929.118564-2-jasvinder.singh@intel.com> <20170918091015.82824-1-jasvinder.singh@intel.com> <9843308.sSVeHNjL0n@xps>
In-Reply-To: <9843308.sSVeHNjL0n@xps>
List-Id: DPDK patches and discussions

Hi Thomas,

Thanks for taking the time to read through our rationale and provide quality comments on a topic where usually people are shouting but not listening!

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, September 20, 2017 4:36 PM
> To: Singh, Jasvinder; Dumitrescu, Cristian
> Cc: dev@dpdk.org; Yigit, Ferruh
> Subject: Re: [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for
> traffic mgmt and others
>
> Hi,
>
> 18/09/2017 11:10, Jasvinder Singh:
> > The SoftNIC PMD is intended to provide SW fall-back options for specific
> > ethdev APIs in a generic way to the NICs not supporting those features.
>
> I agree it is important to have a solution in DPDK to better manage
> SW fallbacks. One question is to know whether we can implement and
> maintain many solutions. We probably must choose only one solution.
>
> I have not read the code. I am just interested in the design for now.
> I think it is a smart idea but maybe less convenient than calling fallback
> from ethdev API glue code. My opinion has not changed since v1.
> Thanks for the detailed explanations. Let's discuss below.
>

Don't get me wrong, I would also like to have the single device solution (hard NIC augmented with SW-implemented features) as opposed to the current proposal, which requires two devices (hard device and soft device acting as app front-end for the hard device).

The problem is that right now the single device solution is not an option with the current librte_ether, as there are simply a lot of changes required that need more time to think through and get agreement, and likely several incremental stages are required to make it happen.
As detailed in the Dublin presentation, they mostly refer to:
- the need of the SW fall-back to maintain its own data structures and functions (per device, per RX queue, per TX queue)
- coexistence of all the features together
- how to bind an ethdev to one (or several) SW threads
- thread safety requirements between ethdev SW thread and app threads

Per our Dublin discussion, here is my proposal:

1. Get Soft NIC PMD into release 17.11.
   a) It is the imperfect 2-device solution, but it works and provides an interim solution.
   b) It allows us to make progress on the development for a few key features such as traffic management (on TX) and hopefully flow & metering (on RX) and get feedback on this code that we can later restructure into the final single-device solution.
   c) It is purely yet another PMD which we can melt away into the final solution later.
2. Start an RFC on librte_ether required changes to get the single-device solution in place.
   a) I can spend some time to summarize the objectives, requirements, current issues and potential approaches and send the first draft of this RFC in the next week or two?
   b) We can then discuss, poll for more ideas and hopefully draft an incremental path forward.

What do you think?

> [...]
> > * RX/TX: The app reads packets from/writes packets to the "soft" port
> >   instead of the "hard" port. The RX and TX queues of the "soft" port are
> >   thread safe, as any ethdev.
>
> "thread safe as any ethdev"?
> I would say the ethdev queues are not thread safe.
>
> [...]

Yes, per the Dublin presentation, the thread safety mentioned here is between the Soft NIC thread and the application thread(s).

> > * Meets the NFV vision: The app should be (almost) agnostic about the NIC
> >   implementation (different vendors/models, HW-SW mix), the app should not
> >   require changes to use different NICs, the app should use the same API
> >   for all NICs.
> >   If a NIC does not implement a specific feature, the HW
> >   should be augmented with SW to meet the functionality while still
> >   preserving the same API.
>
> This goal could also be achieved by adding the SW capability to the API.
> After getting capabilities of a hardware, the app could set the capability
> of some driver features to "SW fallback".
> So the capability would become a tristate:
> - not supported
> - HW supported
> - SW supported
>
> The unique API goal is failed if we must manage two ports,
> the HW port for some features and the softnic port for other features.
> You explain it in A5 below.
>

Yes, agree that the 2-device solution is not fully meeting this goal, but IMHO this is the best we can do today; hopefully we can come up with a path forward for the single-device solution.

> [...]
> > Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
> > feature with default settings:
> >   --vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
>
> So the app will use only the vdev net_softnic0 which will forward packets
> to 0000:04:00.1?
> Can we say in this example that net_softnic0 owns 0000:04:00.1?
> Probably not, because the config of the HW must be done separately (cf.
> Q5).
> See my "ownership proposal":
> http://dpdk.org/ml/archives/dev/2017-September/074656.html
>
> The issue I see in this example is that we must define how to enable
> every features. It should be equivalent to defining the ethdev capabilities.
> In this example, the option soft_tm=on is probably not enough fine-grain.
> We could support some parts of TM API in HW and other parts in SW.
>

There are optional parameters for each feature (i.e. only TM at this point) that are left on their default value for this simple example; they can easily be added on the command line for fine-grained tuning of each feature.

> [...]
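
For illustration only, here is a minimal stand-alone sketch of how a device argument string like the one in the example above breaks down into a device name plus key=value pairs. It is not the code from the patch set and it does not use the DPDK argument parsing helpers; the struct softnic_args type and the parse_softnic_args() and copy_field() functions are hypothetical names made up for this sketch, and treating hard_name as mandatory and rejecting unknown keys are assumptions as well.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the parsed arguments; not a DPDK structure. */
struct softnic_args {
    char dev_name[32];   /* e.g. "net_softnic0" */
    char hard_name[32];  /* e.g. "0000:04:00.1" */
    int soft_tm;         /* 1 if soft_tm=on, 0 otherwise */
};

/* Copy src into dst (capacity n); fail instead of truncating. */
static int copy_field(char *dst, size_t n, const char *src)
{
    size_t len = strlen(src);

    if (len >= n)
        return -1;
    memcpy(dst, src, len + 1);
    return 0;
}

/* Parse "name,key=value,key=value". Returns 0 on success, -1 on error. */
static int parse_softnic_args(const char *arg, struct softnic_args *out)
{
    char buf[128];
    char *tok, *next;

    if (arg == NULL || out == NULL || copy_field(buf, sizeof(buf), arg) != 0)
        return -1;
    memset(out, 0, sizeof(*out));

    /* The first comma-separated field is the virtual device name. */
    tok = buf;
    next = strchr(tok, ',');
    if (next != NULL)
        *next++ = '\0';
    if (tok[0] == '\0' ||
        copy_field(out->dev_name, sizeof(out->dev_name), tok) != 0)
        return -1;

    /* The remaining fields are key=value pairs. */
    while (next != NULL) {
        char *eq;

        tok = next;
        next = strchr(tok, ',');
        if (next != NULL)
            *next++ = '\0';

        eq = strchr(tok, '=');
        if (eq == NULL)
            return -1;
        *eq++ = '\0';

        if (strcmp(tok, "hard_name") == 0) {
            if (copy_field(out->hard_name, sizeof(out->hard_name), eq) != 0)
                return -1;
        } else if (strcmp(tok, "soft_tm") == 0) {
            out->soft_tm = (strcmp(eq, "on") == 0);
        } else {
            return -1; /* unknown key: rejected in this sketch */
        }
    }

    /* Assumption: the "hard" device name must always be given. */
    return out->hard_name[0] != '\0' ? 0 : -1;
}
```

Additional per-feature tuning parameters would simply be more key=value pairs handled in the same loop, each with its own default when absent.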
> > Q3: Why not change the "hard" device (and keep a single device) instead of
> >     creating a new "soft" device (and thus having two devices)?
> > A3: This is not possible with the current librte_ether ethdev
> >     implementation. The ethdev->dev_ops are defined as constant structure,
> >     so it cannot be changed per device (nor per PMD). The new ops also
> >     need memory space to store their context data structures, which
> >     requires updating the ethdev->data->dev_private of the existing
> >     device; at best, maybe a resize of ethdev->data->dev_private could be
> >     done, assuming that librte_ether will introduce a way to find out its
> >     size, but this cannot be done while device is running. Other side
> >     effects might exist, as the changes are very intrusive, plus it likely
> >     needs more changes in librte_ether.
>
> Q3 is about calling SW fallback from the driver code, right?
>

Yes, correct, but the answer is applicable to the Q4 as well.

> We must not implement fallbacks in drivers because it would hide
> it to the application.
> If a feature is not available in hardware, the application can choose
> to bypass this feature or integrate the fallback in its own workflow.
>

I agree.

> > Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
> >     devices which do not support the specific feature? If the device
> >     supports the capability, let's call its dev_ops, otherwise call the
> >     SW fall-back dev_ops.
> > A4: First, similar reasons to Q&A3. This fixes the need to change
> >     ethdev->dev_ops of the device, but it does not do anything to fix the
> >     other significant issue of where to store the context data structures
> >     needed by the SW fall-back functions (which, in this approach, are
> >     called implicitly by librte_ether).
> >     Second, the SW fall-back options should not be restricted arbitrarily
> >     by the librte_ether library, the decision should belong to the app.
> >     For example, the TM SW fall-back should not be limited to only
> >     librte_sched, which (like any SW fall-back) is limited to a specific
> >     hierarchy and feature set, it cannot do any possible hierarchy. If
> >     alternatives exist, the one to use should be picked by the app, not by
> >     the ethdev layer.
>
> Q4 is about calling SW callback from the API glue code, right?
>

Yes.

> We could summarize Q3/Q4 as "it could be done but we propose another
> way".
> I think we must consider the pros and cons of both approaches from
> a user perspective.
> I agree the application must decide which fallback to use.
> We could propose one fallback in ethdev which can be enabled explicitly
> (see my tristate capabilities proposal above).
>

My summary would be: it would be great to do it this way, but significant road blocks exist that need to be lifted first.

> > Q5: Why is the app required to continue to configure both the "hard" and
> >     the "soft" devices even after the "soft" device has been created? Why
> >     not hiding the "hard" device under the "soft" device and have the
> >     "soft" device configure the "hard" device under the hood?
> > A5: This was the approach tried in the V2 of this patch set (overlay
> >     "soft" device taking over the configuration of the underlay "hard"
> >     device) and eventually dropped due to increased complexity of having
> >     to keep the configuration of two distinct devices in sync with
> >     librte_ether implementation that is not friendly towards such
> >     approach. Basically, each ethdev API call for the overlay device
> >     needs to configure the overlay device, invoke the same configuration
> >     with possibly modified parameters for the underlay device, then resume
> >     the configuration of overlay device, turning this into a device
> >     emulation project.
> >     V2 minuses: increased complexity (deal with two devices at same time);
> >     need to implement every ethdev API, even those not needed for the scope
> >     of SW fall-back; intrusive; sometimes have to silently take decisions
> >     that should be left to the app.
> >     V3 pluses: lower complexity (only one device); only need to implement
> >     those APIs that are in scope of the SW fall-back; non-intrusive (deal
> >     with "hard" device through ethdev API); app decisions taken by the app
> >     in an explicit way.
>
> I think it is breaking what you call the NFV vision in several places.
>

Personally, I also agree with you here.

> [...]
> >   9. [rte_ring proliferation] Thread safety requirements for ethdev
> >      RX/TX queues require an rte_ring to be used for every RX/TX queue
> >      of each "soft" ethdev. This rte_ring proliferation unnecessarily
> >      increases the memory footprint and lowers performance, especially
> >      when each "soft" ethdev ends up on a different CPU core (ping-pong
> >      of cache lines).
>
> I am curious to understand why you consider thread safety as a requirement
> for queues. No need to reply here, the question is already asked
> at the beginning of this email ;)

Regards,
Cristian