From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx.bisdn.de (mx.bisdn.de [185.27.182.31]) by dpdk.org (Postfix) with ESMTP id 9900C5A36 for ; Wed, 20 May 2015 19:01:02 +0200 (CEST) Received: from [172.16.250.156] (unknown [172.16.250.156]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mx.bisdn.de (Postfix) with ESMTPSA id 50D58A1941; Wed, 20 May 2015 19:01:02 +0200 (CEST) Message-ID: <555CBDCE.5010004@bisdn.de> Date: Wed, 20 May 2015 19:01:02 +0200 From: Marc Sune User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.6.0 MIME-Version: 1.0 To: Neil Horman References: <1431361781-12621-1-git-send-email-bruce.richardson@intel.com> <20150519113112.GA10700@bricha3-MOBL3> <2340603.EB75GY9ZBE@xps13> <555C5C4C.80308@bisdn.de> <20150520102816.GA1775@hmsreliant.think-freely.org> In-Reply-To: <20150520102816.GA1775@hmsreliant.think-freely.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Cc: dev@dpdk.org Subject: Re: [dpdk-dev] [RFC PATCHv2 0/2] pktdev as wrapper type X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 May 2015 17:01:02 -0000 On 20/05/15 12:28, Neil Horman wrote: > On Wed, May 20, 2015 at 12:05:00PM +0200, Marc Sune wrote: >> >> On 20/05/15 10:31, Thomas Monjalon wrote: >>> 2015-05-19 12:31, Bruce Richardson: >>>> On Mon, May 11, 2015 at 05:29:39PM +0100, Bruce Richardson wrote: >>>>> Hi all, >>>>> >>>>> after a small amount of offline discussion with Marc Sune, here is an >>>>> alternative proposal for a higher-level interface - aka pktdev - to allow a >>>>> common Rx/Tx API across device types handling mbufs [for now, ethdev, ring >>>>> and KNI]. The key code is in the first patch fo the set - the second is an >>>>> example of a trivial usecase. >>>>> >>>>> What is different about this to previously: >>>>> * wrapper class, so no changes to any existing ring, ethdev implementations >>>>> * use of function pointers for RX/TX with an API that maps to ethdev >>>>> - this means there is little/no additional overhead for ethdev calls >>>>> - inline special case for rings, to accelerate that. Since we are at a >>>>> higher level, we can special case process some things if appropriate. This >>>>> means the impact to ring ops is one (predictable) branch per burst >>>>> * elimination of the queue abstraction. For the ring and KNI, there is no >>>>> concept of queues, so we just wrap the functions directly (no need even for >>>>> wrapper functions, the api's match so we can call directly). This also >>>>> means: >>>>> - adding in features per-queue, is far easier as we don't need to worry about >>>>> having arrays of multiple queues. For example: >>>>> - adding in buffering on TX (or RX) is easier since again we only have a >>>>> single queue. >>>>> * thread safety is made easier using a wrapper. For a MP ring, we can create >>>>> multiple pktdevs around it, and each thread will then be able to use their >>>>> own copy, with their own buffering etc. >>>>> >>>>> However, at this point, I'm just looking for general feedback on this as an >>>>> approach. I think it's quite flexible - even more so than the earlier proposal >>>>> we had. It's less proscriptive and doesn't make any demands on any other libs. >>>>> >>>>> Comments/thoughts welcome. >>>> Any comments on this RFC before I see about investing further time in it to clean >>>> it up a bit and submit as a non-RFC patchset for merge in 2.1? >>> I would say there are 2 possible approaches for KNI and ring handling: >>> 1/ You Bruce, Marc and Keith are advocating for a layer on top of ethdev, >>> ring, KNI and possibly other devices, which uses mbuf. The set of functions >>> is simpler than ethdev but the data structure is mbuf which is related to >>> ethdev layer. >>> 2/ Konstantin and Neil talked about keeping mbuf for ethdev layer and related >>> libs only. Ring and KNI could have an ethdev API with a reduced set of >>> implemented functions. Crypto devices could adopt a specific crypto API and >>> an ethdev API at the same time. >> I don't fully understand which APIs you meant by non-ethdev. This pktdev >> wrapper proposal abstracts RX and TX functions only, and all of these are >> using mbufs as the packet buffer abstraction right now anyway (ethdev). >> > He's referring to future device classes (like crypto devices), which ostensibly > would make use of the pktdev API. My argument (and I think Thomas') is that if > a bit of hardware can be made to operate as a packet sending/receiving device, > then its just as reasonable to use the existing ethdev api rather than some > other restricted version of it (pktdev) > >> This approach does not preclude that different libraries expose other API >> calls. In fact they will have to; setup the port/device ... It is just a >> higher level API, so that you don't have to check the type of port in your >> DPDK application I/O loop, minimizing user's code. >> > No argument there. But if thats the case (and I agree that it is), an > application will implicitly have to know what what type of device it is, because > it (the application) will need to understand the specific API it is writing to. > >> Or were you in 2) thinking about creating a different "packet buffer" >> abstraction, independent from the ethdev, and then map the different port >> specifics (e.g. mbuf) to this new abstraction? >> > My argument was to just leave the ethdev api alone. If a device class can be > made to look like a packet forwarding device, then use the existing ethdev api > to implement it. > >>> I feel it's cleaner, more generic and more maintainable to have drivers >>> implementing one or several stable APIs instead of having some restricted >>> wrappers to update. >> This would be a separate library _on top_ of the existing APIs, and it has >> the advantage to simplify the DPDK user's application code when an >> application needs to deal with several types of port, as shown in the >> example that Bruce provided in PATCH #2. >> > But thats already the purpose of the ethdev api. Different types of > hardware/software can be made to look like the same thing (an ethdev) from an > application standpoint. Adding this pktdev layer does nothing but that, add a > layer. If you want restricted functionality of an interface, thats ok, ethdev > offers that ability. unimplemented methods in a pmd cause the ethdev api to > return EOPNOTSUP to the calling application, so the application knows when a > given ethdev can't do some aspect of what an ethdev is. Hi Neil, Thanks for the clarifications. Now I understand the concern Thomas expressed. Using ethdev API (port-ids) was actually my first suggestion here: http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545 And to be honest, what I was expecting when I was reading for the first time DPDK's APIs. It is indeed an option. However, if we take a look at the API: http://www.dpdk.org/doc/api/rte__ethdev_8h.html none of the API calls, except the burst RX/TX and, perhaps, the callbacks, would be used by devices other than NICs. It seems going a bit too far using it, but ofc possible. In essence, rte_ether(rte_ethdev.h) right now has: i) NIC setup; general configuration, queue config, fdir, offloads, hw stuff like leds... ii) RX/TX routines and callbacks iii) Stats and queue stats iv) other utils for ethernet stuff (rte_ether.h) i) is clearly HW specific, and does only apply to NICs/ASICs (e.g. FM10k) while ii) and iii) are things that could be abstracted beyond NICs, like KNI, rte_ring, crypto... (iv could be moved into some utils/protocol parsing libraries). Perhaps these two groups could be split into two different libraries and then ii) and iii) together would be something like ~ rte_pktdev (stats are missing on the proposed patch), while i) would be rte_ether, or rte_nic if we think it is a better name. In any case, and I think we all agree here, I just think that one way or another this should be abstracted so that it simplifies (and reduces) a bit the code of DPDK applications. Marc > >> I don't see why this could limit us or make it less maintainable. Of course >> this is an RFC patch; appropriate tests are missing (Bruce I can help you on >> that) >> > It doesn't limit us, its just not a useful abstraction, because we already have > the abilities it provides. > > Neil >> Marc >> >>> Comments are welcome. >>