From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <keith.wiles@intel.com>
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
 by dpdk.org (Postfix) with ESMTP id 18760569A
 for <dev@dpdk.org>; Mon,  6 Apr 2015 00:20:13 +0200 (CEST)
Received: from orsmga003.jf.intel.com ([10.7.209.27])
 by fmsmga102.fm.intel.com with ESMTP; 05 Apr 2015 15:20:12 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.11,529,1422950400"; d="scan'208";a="551371384"
Received: from orsmsx102.amr.corp.intel.com ([10.22.225.129])
 by orsmga003.jf.intel.com with ESMTP; 05 Apr 2015 15:20:13 -0700
Received: from fmsmsx117.amr.corp.intel.com (10.18.116.17) by
 ORSMSX102.amr.corp.intel.com (10.22.225.129) with Microsoft SMTP Server (TLS)
 id 14.3.224.2; Sun, 5 Apr 2015 15:20:12 -0700
Received: from fmsmsx113.amr.corp.intel.com ([169.254.13.111]) by
 fmsmsx117.amr.corp.intel.com ([169.254.3.36]) with mapi id 14.03.0224.002;
 Sun, 5 Apr 2015 15:20:11 -0700
From: "Wiles, Keith" <keith.wiles@intel.com>
To: Neil Horman <nhorman@tuxdriver.com>
Thread-Topic: [dpdk-dev] [RFC] Adding multiple device types to DPDK.
Thread-Index: AQHQbHmnRJG0cJWo7kasKgozQeIWPp07+9aAgAAIvICAAUmqAP//zuIAgAIvUoD//9l9gA==
Date: Sun, 5 Apr 2015 22:20:10 +0000
Message-ID: <D1471BCF.1A63A%keith.wiles@intel.com>
References: <D1408516.1A07B%keith.wiles@intel.com>
 <20150403170043.GA17441@hmsreliant.think-freely.org>
 <D1447349.1A40F%keith.wiles@intel.com>
 <20150404131154.GA6035@neilslaptop.think-freely.org>
 <D145551A.1A522%keith.wiles@intel.com>
 <20150405193758.GA12554@neilslaptop.think-freely.org>
In-Reply-To: <20150405193758.GA12554@neilslaptop.think-freely.org>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.254.13.154]
Content-Type: text/plain; charset="iso-8859-2"
Content-ID: <1C5D171C5EB41C4AB7F11AE0CF82F7CF@intel.com>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [RFC] Adding multiple device types to DPDK.
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2015 22:20:15 -0000


On 4/5/15, 2:37 PM, "Neil Horman" <nhorman@tuxdriver.com> wrote:

>On Sat, Apr 04, 2015 at 03:16:08PM +0000, Wiles, Keith wrote:
>>=20
>>=20
>> On 4/4/15, 8:11 AM, "Neil Horman" <nhorman@tuxdriver.com> wrote:
>>=20
>> >On Fri, Apr 03, 2015 at 10:32:01PM +0000, Wiles, Keith wrote:
>> >> Hi Neil,
>> >>=20
>> >> On 4/3/15, 12:00 PM, "Neil Horman" <nhorman@tuxdriver.com> wrote:
>> >>=20
>> >> >On Wed, Apr 01, 2015 at 12:44:54PM +0000, Wiles, Keith wrote:
>> >> >> Hi all, (hoping format of the text is maintained)
>> >> >>=20
>> >> >> Bruce and myself are submitting this RFC in hopes of providing
>> >> >>discussion
>> >> >> points for the idea. Please do not get carried away with the code
>> >> >> included, it was to help everyone understand the proposal/RFC.
>> >> >>=20
>> >> >> The RFC is to describe a proposed change we are looking to make to
>> >>DPDK
>> >> >>to
>> >> >> add more device types. We would like to add in to DPDK the idea
>>of a
>> >> >> generic packet-device or ?pktdev?, which can be thought of as a
>>thin
>> >> >>layer
>> >> >> for all device classes. For other device types such as
>>potentially a
>> >> >> ?cryptodev? or ?dpidev?. One of the main goals is to not effect
>> >> >> performance and not require any current application to be
>>modified.
>> >>The
>> >> >> pktdev layer is providing a light framework for developers to add
>>a
>> >> >>device
>> >> >> to DPDK.
>> >> >>=20
>> >> >> Reason for Change
>> >> >> -----------------
>> >> >>=20
>> >> >> The reason why we are looking to introduce these concepts to DPDK
>> >>are:
>> >> >>=20
>> >> >> * Expand the scope of DPDK so that it can provide APIs for more
>>than
>> >> >>just
>> >> >> packet acquisition and transmission, but also provide APIs that
>>can
>> >>be
>> >> >> used to work with other hardware and software offloads, such as
>> >> >> cryptographic accelerators, or accelerated libraries for
>> >>cryptographic
>> >> >> functions. [The reason why both software and hardware are
>>mentioned
>> >>is
>> >> >>so
>> >> >> that the same APIs can be used whether or not a hardware
>>accelerator
>> >>is
>> >> >> actually available].
>> >> >> * Provide a minimal common basis for device abstraction in DPDK,
>>that
>> >> >>can
>> >> >> be used to unify the different types of packet I/O devices already
>> >> >> existing in DPDK. To this end, the ethdev APIs are a good starting
>> >> >>point,
>> >> >> but the ethdev library contains too many functions which are
>> >> >>NIC-specific
>> >> >> to be a general-purpose set of APIs across all devices.
>> >> >>      Note: The idea was previously touched on here:
>> >> >> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545
>> >> >>=20
>> >> >> Description of Proposed Change
>> >> >> ------------------------------
>> >> >>=20
>> >> >> The basic idea behind "pktdev" is to abstract out a few common
>> >>routines
>> >> >> and structures/members of structures by starting with ethdev
>> >>structures
>> >> >>as
>> >> >> a starting point, cut it down to little more than a few members in
>> >>each
>> >> >> structure then possible add just rx_burst and tx_burst. Then use
>>the
>> >> >> structures as a starting point for writing a device type.
>>Currently
>> >>we
>> >> >> have the rx_burst/tx_burst routines moved to the pktdev and it see
>> >>like
>> >> >> move a couple more common functions maybe resaonable. It could be
>>the
>> >> >> Rx/Tx routines in pktdev should be left as is, but in the code
>>below
>> >>is
>> >> >>a
>> >> >> possible reason to abstract a few routines into a common set of
>> >>files.
>> >> >>=20
>> >> >> >From there, we have the ethdev type which adds in the existing
>> >> >>functions
>> >> >> specific to Ethernet devices, and also, for example, a cryptodev
>> >>which
>> >> >>may
>> >> >> add in functions specific for cryptographic offload. As now, with
>>the
>> >> >> ethdev, the specific drivers provide concrete implementations of
>>the
>> >> >> functionality exposed by the interface. This hierarchy is shown in
>> >>the
>> >> >> diagram below, using the existing ethdev and ixgbe drivers as a
>> >> >>reference,
>> >> >> alongside a hypothetical cryptodev class and driver implementation
>> >> >> (catchingly called) "X":
>> >> >>=20
>> >> >>                     ,---------------------.
>> >> >>                     | struct rte_pktdev   |
>> >> >>                     +---------------------+
>> >> >>                     | rte_pkt_rx_burst()  |
>> >> >>             .-------| rte_pkt_tx_burst()  |-----------.
>> >> >>             |       `---------------------'           |
>> >> >>             |                                         |
>> >> >>             |                                         |
>> >> >>   ,-------------------------------.
>> >>,------------------------------.
>> >> >>   |    struct rte_ethdev          |    |      struct rte_cryptodev
>> >> |
>> >> >>   +-------------------------------+
>> >>+------------------------------+
>> >> >>   | rte_eth_dev_configure()       |    |
>> >>rte_crypto_init_sym_session()|
>> >> >>   | rte_eth_allmulticast_enable() |    |
>> >>rte_crypto_del_sym_session() |
>> >> >>   | rte_eth_filter_ctrl()         |    |
>> >> |
>> >> >>   `-------------------------------'
>> >>`---------------.--------------'
>> >> >>             |                                          |
>> >> >>             |                                          |
>> >> >>   ,---------'---------------------.
>> >>,---------------'--------------.
>> >> >>   |    struct rte_pmd_ixgbe       |    |      struct rte_pmd_X
>> >> |
>> >> >>   +-------------------------------+
>> >>+------------------------------+
>> >> >>   | .configure -> ixgbe_configure |    | .init_session ->
>> >>X_init_ses()|
>> >> >>   | .tx_burst  -> ixgbe_xmit_pkts |    | .tx_burst ->
>> >>X_handle_pkts() |
>> >> >>   `-------------------------------'
>> >>`------------------------------'
>> >> >>=20
>> >> >> We are not attempting to create a real class model here only
>>looking
>> >>at
>> >> >> creating a very basic common set of APIs and structures for other
>> >>device
>> >> >> types.
>> >> >>=20
>> >> >> In terms of code changes for this, we obviously need to add in new
>> >> >> interface libraries for pktdev and cryptodev. The pktdev library
>>can
>> >> >> define a skeleton structure for the first few elements of the
>>nested
>> >> >> structures to ensure consistency. Each of the defines below
>> >>illustrate
>> >> >>the
>> >> >> common members in device structures, which gives some basic
>>structure
>> >> >>the
>> >> >> device framework. Each of the defines are placed at the top of the
>> >> >>devices
>> >> >> matching structures and allows the devices to contain common and
>> >>private
>> >> >> data. The pkdev structures overlay the first common set of members
>> >>for
>> >> >> each device type.
>> >> >>=20
>> >> >
>> >> >
>> >> >Keith and I discussed this offline, and for the purposes of
>> >>completeness
>> >> >I'll
>> >> >offer my rebuttal to this proposal here.
>> >> >
>> >> >In short, I don't think the segregation of the transmit and receive
>> >> >routines
>> >> >into their own separate structure (and ostensibly their own
>> >>librte_pktdev
>> >> >library) is particularly valuable.  While it does provide some
>>minimal
>> >> >code
>> >> >savings when new device classes are introduced, the savings are not
>> >> >significant
>> >> >(approximlately 0.5kb per device class if the rte_ethdev generic tx
>> >>and rx
>> >> >routines are any sort of indicator).  It does however, come with
>> >> >significant
>> >> >costs in the sense that it binds a device class to using an I/O
>>model
>> >>(in
>> >> >this
>> >> >case packet based recieve and transmit) for which the device class
>>may
>> >> >not be
>> >> >suited.
>> >>=20
>> >> The only reason the we only have a 0.5Kb saving is you are only
>>looking
>> >>at
>> >> moving Rx/Tx routines into pktdev, but what happens if we decide to
>> >>move a
>> >> number of common functions like start/stop and others, then you
>>start to
>> >> see a much bigger saving.
>> >You're right of course, but lets just try the math here.  If we assume
>> >that,
>> >since all these libraries are just inlined redirection functions, we
>>can
>> >guess
>> >that each one is about .25kb of code (5.kb/2 functions).  So if you add
>> >two more
>> >functions you're looking at 1kb of code savings.  Of course if you
>>start
>> >doing
>> >so, you beg the two questions that I've been trying to pose here:
>> >
>> >1) How much 'common' API do you want to impose on a device class that
>>may
>> >not
>> >be condusive to the common set
>> >
>> >2) At what point does your common API just become an rte_ethdev?
>>=20
>> You are taking the code size to the extreme, which is not the real point
>> here. I agree it does not matter in footprint savings as most DPDK
>>systems
>> are pretty big. The reason for the common code was to help remove some
>>of
>> the code required for the developer to write in the new device type. If
>> the developer of the new device type writes his own set of Rx/Tx
>>routines
>> the community will determine if they are required or do the current APIs
>> work or do we require them to support both.
>>=20
>> Please try not getting hung up on the footprint savings and I am sorry
>>if
>> I was making this a huge point.
>>=20
>Hung up?  I'm not the one who made a big deal out of code size reduction
>being a
>major benefit there, be it source or binary, its miniscule any way you
>slice it.
>I'm happy to drop this though if you are.
>
>> >=20
>> >> Do we need this saving, maybe not, but it does
>> >I would say definately not, given that the DPDK houses entire API sets
>> >that are
>> >always-inline, and any savings from the above are easily eclipsed by
>>the
>> >fanout
>> >that occurs from use in multiple call sites.  Lets be honest,
>>performance
>> >is the
>> >DPDK's primary concern.  Code size is not a consideration.  Its only an
>> >argument
>> >here because it makes this proposal look favorable.  Its not a bad
>>thing
>> >mind
>> >you, but if this proposal caused any code bloat, it wouldn't even be
>> >mentioned.
>>=20
>> In the current design the pktdev APIs are still inline routines only the
>> debug routines are not.
>>=20
>Reread my comments above please.  You seem to have misread what I wrote.
>
>> >
>> >> provide a single call API rte_pkt_rx/tx_burst to use instead of the
>> >> application having to make sure it is calling the correct device
>>Rx/Tx
>> >> routines.
>> >Given that that is a compile time issue, I really don't see why that
>>is a
>> >big
>> >deal.  Especially if you, as I suggest, just use the rte_ethdev as your
>> >common
>> >dataplane device model.  Then by virtue of the fact that they're all
>> >rte_ethdevs, you get common API naming.
>> >
>> >> All that is required is passing in the device pointer and it is
>> >> handled for the application. Bruce added some code below to that
>>effect.
>> >Yes, and you have that now, its an rte_ethdev.
>> >
>> >> >
>> >> >To illustrate the difference in design ideas, currenty the dpdk data
>> >> >pipeline
>> >> >looks like this:
>> >> >
>> >> >+------------+   +----------+   +---------+
>> >> >|            |   |          |   |         |
>> >> >|  ARP       |   |  ethdev  |   |         |   +----------+
>> >> >|  handler   +-->+  api     +-->+  PMD    +-->+ Wire     |
>> >> >|            |   |          |   |         |   +----------+
>> >> >|            |   |          |   |         |
>> >> >+------------+   +----------+   +---------+
>> >>=20
>> >> You did not add the crypto to this picture as it is in the picture
>>below
>> >> to make them the same.
>> >>=20
>> >No I didn't, because it doesn't exist right now.  Thats what I mean by
>> >'currently'.  See my description below after you added your drawing.
>> >
>> >> +-----+  +---------+  +---------+  +------+
>> >> |     |  |         |  |         |  |      |  +------+
>> >> | ARP +--+ ethdev  +--+ crypto  +--+ PMD  +--+ Wire |
>> >> |     |  |         |  |         |  |      |  +------+
>> >> +-----+  +---------+  +---------+  +------+
>> >>=20
>> >>=20
>> >> >
>> >> >
>> >> >Where the ARP handler code is just some code that knows how to
>>manage
>> >>arp
>> >> >requests and responses, and only transmits and receives frames
>> >> >
>> >> >Keiths idea would introduce this new pktdev handler structure and
>>make
>> >>the
>> >> >dataplane pipeline look like this:
>> >> >
>> >> >+------------+ +------------+  +------------+  +--------+
>> >> >|            | |            |  |            |  |        |
>> >> >|  ARP       | | pktdev api |  | pktdev_api |  |        |
>>+---------+
>> >> >|  handler   +-+            +--+            +--+ PMD    +--+Wire
>> |
>> >> >|            | |            |  |            |  |        |
>>+---------+
>> >> >|            | |            |  |            |  |        |
>> >> >+------------+ |            |  |            |  |        |
>> >> >               |            |  |            |  +--------+
>> >> >               |            |  |            |
>> >> >               |            |  |            |
>> >> >               |            |  |            |
>> >> >               | rte_ethdev |  | rte_crypto |
>> >> >               |            |  |            |
>> >> >               |            |  |            |
>> >> >               +------------+  +------------+
>> >>=20
>> >> You are drawing this picture it appears trying to make the pktdev
>> >>another
>> >> function call layer when it is just a single macro in the rte_ethdev
>> >Not trying to imply a new function call layer, just trying to
>>illustrate
>> >your
>> >proposal, that you wish to make rte_ethdevs,and rte_cryptodevs all look
>> >like
>> >rte_pktdevs so that the dataplane has a common element for passing
>>mbufs
>> >around,
>> >regardless of its true device class.  Not sure where you see an extra
>> >function
>> >call layer here.
>>=20
>> Please see my comment above as I am not trying to make them all look the
>> same you are taking my words to the extreme. We need some type of
>> structure for adding more device types to DPDK and I am just trying to
>> suggest a few common parts as a solution.
>>=20
>You see that you're contradicting yourself here, right?  You state in the
>same
>paragraph that you are not trying to make all devices look the same, but
>are
>trying to find comment parts (the implication being that they will be
>interacted
>with in the same way).  I get that you're saying their only going to be
>'partially' the same, where the overlapping structures are made common in
>your
>model, so that code can be written generically, but if thats your goal,
>then the
>generic code you propose can only be generic if it only interfaces to the
>common
>parts, which means, to the generic application code you envision, every
>device
>looks the same.  Thats my point
>
>Note I'm not opposed to generic code being able to be written for multiple
>device types, I'm just asserting that that common device is the
>rte_ethdev,
>which we already have.  I'm proposing that dataplane elements are network
>interfaces codified as rte_ethdevs, and crypto devices are codified as
>some
>other API, and should you want crypto functionality, you write a pmd
>(interfaced
>to using rte_ethdev), that makes use of the to-be-written crypto library.
>
>> Some parts of the structures should be common and possible a few common
>> routines, yes a big part of each device will be different. The
>>rte_ethdev
>> structures is what we have today and was a good starting point, but a
>>big
>> part of the code will be different.
>>=20
>No, they shouldn't.  Reuse existing data structures where possible.  But
>don't
>try to take device specific interfaces and data, and where you see
>commonality,
>force commonality.  Thats what you're trying to do here.  I'd challenge
>you to
>find another implementation of device driver interfaces where what you are
>proposing is done.  It just doesn't make sense.
>
>> Should we allow someone to write a new device type that is completely
>> different down to typedefs for variable types and all new structures, no
>> we should not, giving them some guide lines by providing a set of
>>routines
>> and structures to start with, by all means lets do it.
>>=20
>At what point did you read that I said we can't use common data
>structures?  Of
>course use them where it makes sense.  Want to use mbufs as a common data
>type
>for both API's, fine!  That makes sense.  But don't go trying to mash
>device
>specific data types and methods into a common structure for some sort of
>imaginary savings.
>
>> >
>> >> header pointing to the rte_pktdev tx_burst routine. No function
>>function
>> >> overhead as the macro in rte_ethdev changes the rte_eth_tx_burst to
>> >> rte_pkt_tx_burst routine, which BTW is the same routine that was in
>> >> rte_ethdev today. The pktdev and ethdev are calling into the PMD tx
>> >> routine via the dev_ops function pointers structure. Which is also no
>> >> extra over head.
>> >>=20
>> >Never suggested it was extra overhead, where did you get that from?
>>I'm
>> >just
>> >illustrating what it is you want to do.  If I didn't get it right I
>> >apologize,
>> >please clarify.
>>=20
>> No need to clarify as we agree and maybe I read too much into your
>> comments.
>> >
>> >> If you are calling the rte_pkt_tx_burst routine directly it just
>>means
>> >>you
>> >> need to get the device pointer to pass instead of the port id value
>>in
>> >>the
>> >> rte_pkt_tx_burst routine. The above turns into something like this:
>> >>=20
>> >> +-----+  +---------+  +--------+  +-----+
>> >> |     |  | ethdev  |  |        |  |     |  +------+
>> >> | ARP +--+ map to  +--+ crypto +--+ PMD  +--+ Wire |
>> >> |     |  | pktdev  |  |        |  |     |  +------+
>> >> +-----+  +---------+  +--------+  +-----+
>> >>=20
>> >> So the path of the data is the same only a macro does a simple
>>rename of
>> >> the call to rte_eth_tx_burst routine. If you call the pktdev routine
>> >> directly then the macro is not used.
>> >>=20
>> >>=20
>> >Let me quash this by stipulating
>> >that you are absolutely correct, there is no additional overhead in
>>your
>> >design,
>> >it should provide the exact same performance that an rte_ethdev
>>currently
>> >does.
>> >I have no problem with that.  My point is: It provides the exact same
>> >performance that an rte_ethdev does, so lets just use rte_ethdev if
>>we're
>> >trying
>> >to model devices in the dataplane.
>>=20
>> The ethdev model does not give us a different device type it just forces
>> the new device type to look like an ethdev, this is what I am reading
>>from
>> your email here. I do not believe you are suggesting that ethdev look a
>> like is the answer to a new device type.
>>=20
>What!?  You need to re-read my last email, I'm not sure where you get the
>above
>from.  If you'll look back at the dataplane design, I proposed this:
>
>+------------+  +---------+ +---------+  +---------+  +--------+
>|            |  |         | |         |  |         |  |        |
>|            |  |ethdev   | | ipsec   |  |ethdev   +--+        |
>| ARP handler+->+api      +-+ tunnel  +->+api      |  | PMD
>|            |  |         | | PMD     |  |         |  |        |
>|            |  |         | |         |  |         |  |        |
>+------------+  +---------+ +---+-----+  +---------+  +--------+
>                                |
>                             +--+-----+
>                             |        |
>                             |crypto  |
>                             |api     |
>                             |        |
>                             |        |
>                             +--------+
>
>
>I'm clearly proposing two device models here:
>
>1) A dataplane element model, the interface for which is an rte_ethdev.
>This
>allows one to implement multiple dataplane elements and functionality (in
>the
>example above an ipsec tunnel).  This is a common and well understood
>device
>model that many network stacks use (read: easy for end users to
>understand)
>
>2) A crypto device model, implemented independently, reusing existing
>data types
>where possible (read: Appropriate for the device class and alternate use
>cases)
>
>(1) uses (2) to implement ipsec functionality.  if an application just
>wants to
>do crypto on some arbitrary data, then the crypto api is better suited to
>doing
>so (read: not bound by the pktdev api you propose)
>
>2 device instances, 2 device models, each best suited for thier purpose.
>
>> The bonding model allows you to link some devices together and that is
>> good in IMO. The bonding model is kind of like a pipeline model at the
>> device layer. The pipeline style at the application layer could have
>> provided the same feature as the bonding model just being done at two
>> different layers in the system (application and device layer).
>>=20
>Not sure what you're getting at here, but I'll keep reading...
>
>> Now the pipeline model is a bit more generic and may add some small
>> overhead compared to the bonding model. If we require every device type
>>to
>> support a bonding style then we would need them all to follow some type
>>of
>> common structure, correct?
>I'm not sure what you're driving at here, please clarify.  Are you trying
>to
>suggest that multiple pmds in the datapath require some pmd specific code
>to
>allow them to be stacked?  Yes, they currently do, which is less than
>ideal, and
>I would certainly support an API to enable ordering of pmds in a
>datapath, but
>thats outside the scope of this discussion I think, since we're talking
>about
>new device types here.
>
>Regarding your comment about multiple pmds requiring a common structure to
>interface to one another in the datapath (like bonding currently does),
>you're
>absolutely right, thats an rte_ethdev.  Thats the way it works today, and
>theres
>no reason to change that.
>
>> >
>> >> >
>> >> >The idea being that now all devices in the dataplane are pktdev
>>devices
>> >> >and code
>> >> >that transmits and receives frames only needs to know that a device
>>can
>> >> >transmit
>> >> >and receive frames.  The crypto device in this chain is ostensibly
>> >> >preforming
>> >> >some sort of ipsec functionality so that arp frames are properly
>> >> >encrypted and
>> >> >encapsulated for sending via a tunnel.
>> >> >
>> >> >On the surface this seems reasonable, and in a sense it is.
>>However,
>> >>my
>> >> >assertion is that we already have this functionality, and it is the
>> >> >rte_ethdev
>> >> >device.  To illustrate further, in my view  we can do the above
>> >>already:
>> >> >
>> >> >+------------+  +---------+ +---------+  +---------+  +--------+
>> >> >|            |  |         | |         |  |         |  |        |
>> >> >|            |  |ethdev   | | ipsec   |  |ethdev   +--+        |
>> >> >| ARP handler+->+api      +-+ tunnel  +->+api      |  | PMD
>> >> >|            |  |         | | PMD     |  |         |  |        |
>> >> >|            |  |         | |         |  |         |  |        |
>> >> >+------------+  +---------+ +---+-----+  +---------+  +--------+
>> >> >                                |
>> >> >                             +--+-----+
>> >> >                             |        |
>> >> >                             |crypto  |
>> >> >                             |api     |
>> >> >                             |        |
>> >> >                             |        |
>> >> >                             +--------+
>> >> >
>> >> >Using the rte_ethdev we can already codify the ipsec functionailty
>>as a
>> >> >pmd that
>> >> >registers an ethdev, and stack it with other pmds using methods
>> >>simmilar
>> >> >to what
>> >> >the bonding pmd does (or via some other more generalized dataplane
>> >> >indexing
>> >> >function).  This still leaves us with the creation of the crypto
>>api,
>> >> >which is
>> >> >adventageous because:
>> >>=20
>> >> The proposal does not remove the bonding method and can still be
>>used,
>> >> correct?
>> >Not sure I fully understand the question, but no, I'm not proposing any
>> >specific
>> >change to bonding here, just using it to illustrate that we have ways
>> >currently
>> >to stack PMD's on top of one another to create a multi-stage data
>> >pipeline.
>>=20
>> Look at the above comment for this one.
>
>Ditto.  I'm not making any comment about bonding, just using it as an
>example
>to show that its possible to stack ethdevs in a dataplane today.  The
>method of
>stacking can certainly be improved, but rte_ethdev is acting as a
>perfectly good
>interface for that purpose right now, no changes needed.
>
>> >
>> >> I do not see the different here using the pktdev style routines or
>>using
>> >> ethdev routines.
>> >Exactly my point!  There is no difference to the dataplane API, it
>>remains
>> >unchanged, just as in your proposal, except that there is no work
>> >required to
>> >create a pktdev device that doesn't change anything.  The advantage
>>that
>> >I'm
>> >proposing here is that my model separates whatever crypto device class
>> >API you want
>> >to design from the dataplane (rte_ethdev) api.  Doing this lets you
>> >create a
>> >crypto API that is more appropriately suited to the crypto class of
>> >device.  if
>> >there is overlap with rte_ethdev, thats fine, you can create simmilar
>> >functions
>> >(even giving them identical names to the rte_ethdev apis), and that
>>will
>> >be
>> >properly resolved at compile time.  But more importantly, where there
>>is
>> >not
>> >overlap, you're not forced into creating an API that inherits the
>>common
>> >API
>> >functions you propose, when they are not best suited to your device.
>>=20
>> I am still not trying to suggest we have the cryptodev be a copy of the
>> ethdev here. We need some type of structure and common guide lines for a
>> new device, which pktdev is a very reasonable plus very light framework.
>>=20
>
>So, perhaps we're talking past one another here, because I'm not
>suggesting that
>cryptodev be a copy of ethdev either, and I thought that was crystal
>clear in my
>drawings above.  I'm suggesting that cryptodev be completely independent
>of any
>ethdev design (i.e. no overlap with ethdev as  codified in your pktdev
>proposal).
>
>As for being lightweight, I don't see the relevance.  As we agreed above,
>code
>savings, either in binary our source format is going to be fairly
>irrelevant,
>and not the point of the discussion.  As for _needing_ commonality, I
>disagree,
>and cite other implementations in linux and bsd as evidence of that (the
>device
>models there, while reusing some datatypes, maintain independent method
>pointer
>structures).
>
>> I am a bit confused now it seems like you are suggesting a new device
>> should be ethdev and then you seem to be stating it should be completely
>> different=A9
>>=20
>I'm not, and if you would read my emails closely, I think you would see
>that.
>As I tried to state as clearly as I'm able above, I'm proposing two
>devices:=20
>
>1) A dataplane element model, the interface for which is an rte_ethdev.
>This
>allows one to implement multiple dataplane elements and functionality (in
>the
>example above an ipsec tunnel).  This is a common and well understood
>device
>model that many network stacks use (read: easy for end users to
>understand)
>
>2) A crypto device model, implemented independently, reusing existing
>data types
>where possible (read: Appropriate for the device class and alternate use
>cases)
>
>
>
>> Lets just agree some common parts are good between device types and some
>> are just not. Making all devices look like an ethdev is just not going
>>to
>> fly and making a new device type into something completely different is
>> not going to fly.
>>=20
>No, lets not, because thats exactly what I'm arguing against.
>Once again I'll try to re-iterate:
>
>I'm fine with reusing existing datatypes if they are applicable
>
>I would like to see a device model for cryptodev that implements its own
>methods
>that are condusive to the way that hardware works, and not bound to the
>way
>other device classes work
>
>I would like to see dataplane element functionality that implements other
>network elements (like an ipsec tunnel), be implemented using the
>rte_ethdev
>interface, and have it implement its internals using the to-be-created
>cryptodev
>api.
>
>
>> Then the only solution is to have somethings in common to help unify the
>> designs, but still allow the developer to add APIs to that device which
>> make the most sense.
>> >
>
>Please tell me, why do you think that unify device models is the _only_
>solution?  I've provided several examples that simply don't ever do that,
>and
>they are quite successfull
>
>> >> >
>> >> >1) It is not constrained by the i/o model of the dataplane (it may
>> >>include
>> >> >packet based i/o, but can build on more rudimentary (and performant)
>> >> >interfaces.
>> >> >For instance, in addition to async block based i/o, a crypto device
>>may
>> >> >also
>> >> >operate syncrhnously, meaning a call can be saved with each
>>transaction
>> >> >(2 calls
>> >> >for a tx/rx vs one for an encrypt operation).
>> >> >
>> >> >2) It is not constrained by use case.  That is to say the API can be
>> >> >constructed
>> >> >for more natural use with other functions (for instance encryptions
>>of
>> >> >files on
>> >> >disk or via a pipe to another process), which may not have any
>>relation
>> >> >to the
>> >> >data plane of DPDK.
>> >> >
>> >> >Neil
>> >>=20
>> >> Ok, you snipped the text of the email here an it makes the context
>>wrong
>> >I figured your illustration above was sufficient to make your point and
>> >mine.
>> >If your code tells a different story, I apologize.
>> >
>> >> without the rest of the code IMO. I will try to explain without the
>>text
>> >> that was omitted, but it would be best for anyone missing the
>>original
>> >> email to read it for more details. I know the formatting got messed
>>up a
>> >> bit :-(
>> >>=20
>> >> http://dpdk.org/ml/archives/dev/2015-April/016124.html
>> >>=20
>> >>=20
>> >> In the rest of the text it does show the points I wanted to make here
>> >>and
>> >> how little overhead it added.
>> >>=20
>> >As above, I'm stipulating to the fact that there is not performance
>> >overhead.
>> >That is not, and was never my point.  My point was/is that we already
>> >have an
>> >API to do what you want to do.  The commonality of functions that you
>> >seem to
>> >focus on, is already provided by the fact that we have a common device
>> >for the
>> >dataplane (the rte_ethdev).  You're proposal doesn't buy us anything
>> >beyond
>> >that.  By your own assertion, your proposal:
>> >
>> >1) Its just as performant as an rte_ethdev
>> >
>> >2) It doesn't require any code change to applications
>> >
>> >Given that, and the fact that it does require that new device classes
>> >inherit an
>>=20
>> Please do not bring inheritance into this again I wish Bruce had never
>> used that term :-( in hind site it is not a very good way to describe
>>the
>> design.
>why not, its a perfectly accurate way to describe what you want to do
>here.  You
>want to extract elements from rte_ethdev that you feel are going to be
>common to
>other device types, place them in their own structure, and have other
>device
>types include that common structure, so as to be able to have those
>methods be a
>part of the new device.  In UML terms I suppose the relationship would be
>a
>"has-a" rather than a "is-a" but the inheritance description is equally
>relevant
>when writing in C.
>
>> It seems like it at a very high level, but it seems to drag
>> everyone down to the bits and bytes way too much. We ever stated that
>>all
>> the APIs are common between ethdev (or are inherited) between a new
>>device
>> only a couple which seemed to make sense to me. The structures being a
>>bit
>> similar was my goal and to provide a good light framework.
>> =20
>Again with the light framework.  We're not talking about code savings
>anymore.
>Theres nothing more or less light about just using an rte_ethdev to
>implement a
>nework element.  Separate out the network element from the actual crypto
>functionality.
>
>As for not _having_ to use the common structure in all device types, I
>agree,
>you don't have to mandate its use, but doesn't that beg the question, why
>do this
>then?  If theres not any guaranteed functionality, why do this at all?
>
>
>> >API set that may not be relevant to said device class, my one and only
>> >question
>> >is: whats the advantage?
>>=20
>> I have been trying to layout the advantages:
>>  1) - we provide a very light framework for the developer of a new
>>device
>> type to start with in his design
>We agreed at the start of this thread that 'lightness' wasn't relevant.
>Or do
>you mean to describe something other than code size savings by the term=20
>'light'?
>
>>  2) - we look at making some routines common between devices if it make
>> sense, some non-performance APIs do make sense to be common, but the
>> structures need to be common in some way IMO
>
>I can't parse this sentance.  You're propsal at the start of this thread
>was to create common structures between device classes.  How is "we look=20
>at
>making some routines common between devices if it makes sense" an=20
>avantage to
>your proposal, which is exactly that?  Its like saying changing code is=20
>good,
>because we get to change code.
>
>>  3) - we make a few structures look similar using pktdev structures as=20
>>an
>> overlay to allow for a clean way to pass around these different device
>> types
>
>This I actually agree with, but it goes back to your first point.  I would
>argue that passing around rte_ethdevs is just as, if not more useful than
>passing around pktdevs, because it more fully describes an ethernet=20
>interface.
>As you note, the pktdev interface is 'lighter' which I take as a=20
>euphamism for
>'smaller or having less code'.  As we've discussed above however, the=20
>savings
>are minimal and not paricularly relevant to this debate, so why not just=20
>use
>rte_ethdev to pass around network element functionality and design a=20
>crypto
>device interfae independently (I'll remind you here that I'm proposing two
>devices here, an rte_ethdev and a cryptodev).
>
>>  4) - adopt standard APIs or translate those APIs into something similar
>> for DPDK (e.g. Linux kernel crypto APIs)
>>=20
>What standard API's have you seen that make common API methods between=20
>device
>types?  Can you provide an example?
>
>> >You claim that common API naming makes application
>>=20
>> I never claimed that common API naming make applications common, but=20
>>only
>> a few common constructs or API semantics could help.
>>=20
>What do you see as the difference?  How, specifically does creating common
>device model mehtods help an application?  I was under the impression=20
>that, in
>so doing, you could allow some generic application code to be written=20
>that could
>interface to either a real ethernet device, or to a crypto device.  What=20
>other
>advantage do you see?
>
>
>> >code generic, but we have that today.  If you want to do IPsec with a
>> >crypto
>> >offload devie, write a pmd that interfaces to the data pipeline via the
>> >rte_ethdev API and uses whatever crypto offload device API you want to
>> >construct
>> >to do the crypto work.  That way, if a given application wants to use=20
>>the
>> >crypto
>>=20
>> I am trying to suggest the crypto device can have other APIs better=20
>>suited
>> to how we want to use that device, again we can have a few common items
>> between ethdev and cryptodev, but it does not make sense to use all of
>> ethdev.
>>=20
>
>Ok, so this is a good thought.  I think what I hear you saying above is=20
>that you
>could envision a crypto device having two api sets, yes?  One, a raw API=20
>that
>applications could use to talk to crypto devices directly, and two, a=20
>network
>oriented API that can transmit and recieve network frames, right?
>
>If thats the case, I'm fine with that.  But if thats interesting to you,=20
>why
>can't the network interface just be an rte_ethdev that you register like=20
>other
>pmd's do?
>
>> I am getting confused here as you seem to be suggesting we use ethdev
>> model, but I am sure you are not, correct?
>>=20
>Correct, two device models, for two devices.  An ipsec device,=20
>implemented as a
>pmd that uses an rte_ethdev to interface to the network dataplane, and a=20
>crypto
>device model that the aforementioned pmd interfaces to to encrypt and=20
>decrypt
>data to implement ipsec functionality.
>
>> >hardware for something non-network related (say disk/file encryption),=
=20
>>it
>> >can
>> >use the crypto device layer in a way that is not packet/network=20
>>oriented
>> >more
>> >easily, at which point having a common API is completely irrelevant, as
>> >the
>> >application code is being specifically written to talk to a crypto=20
>>device.
>> >
>> >
>> >
>> >
>> >> Lets just say we do not move the TX/RX routines from rte_ethdev into
>> >> rte_pktdev and only have a header file with a few structures and=20
>>macros
>> >>to
>> >> help define some common parts between each of the new device types=20
>>being
>> >> added. I could see that as an option, but I do not see the big issues
>> >>you
>> >> are pointing out here.
>> >>=20
>> >My big issue is that I see no point in creating a common API.
>>=20
>> I am not trying to create a common API between all devices, it only=20
>>seemed
>> like a reasonable design to make Rx/Tx look similar. Also making some of
>> the structures look similar is a good thing IMO.
>>=20
>Ok, you're not trying to make them look identical, I get that.  But you=20
>are
>trying to get them to try to look identical in terms of how they do data=20
>i/o.
>I'm suggesting/asserting that such an interface isn't appropriate to all
>devices.  Even that little bit isn't worth making common, because it won't
>always be used.
>
>> >You're focused on finding a common set of API's that apply to all=20
>>device
>> >classes, and I'm asserting that there is no safe set of common API
>> >routines.
>> >While it may be possible to shoehorn multiple device classes into the
>> >same i/o
>> >API model, you are doing yourself a disservice by mandating that.
>>=20
>> I feel like you are taking the extreme view here again and that was not=
=20
>>my
>> intent to make everything the same only where it makes sense. A ethdev=20
>>and
>> cryptodev are two different animals, but they do have some similarities
>> and I want to make sure we investigate these similarities, right?
>>=20
>Yes, they are two different animals, and while they may have some=20
>similarities,
>I'm suggesting that its not worth codifying those similarities in code. =20
>crypto
>devices don't send and recieve packets.  They might operate on block=20
>data, so
>you might use mbufs in both apis, but you don't transmit data to a crypto=
=20
>dev
>and you don't receive data from a crypto device.  You could do that I=20
>suppose,
>but there are better ways, and existing api's show us the way to do that.
>Remember from above, I'm suggesting two device models here.  A crypto=20
>device
>that just does cryptography in the fastest/best way possible, and a=20
>optional
>network element (an ipsec tunnel in our example), that uses the crypto=20
>device to
>preform its function, but uses an rte_ethdev to transmit and receive=20
>network
>data.
>
>
>> >
>> >Put another way, ipsec devices and crypto devices are separate device
>> >classes
>> >and deserve to be modeled differently.  An IPsec device is a network
>> >element
>> >that is very nicely modeled as an ethernet interface, for which you=20
>>have
>> >a very
>> >robust API (the rte_ethdev).  Crypto offload devices are not network
>> >elements,
>> >and can preform i/o in a mutltitude of ways (synchronous, asynchronous,
>> >scatter/gather, etc), and have many non ethernet API facets.  The=20
>>latter
>> >deserves to have its own API design, not to be forced to inherit part=20
>>of
>> >an
>> >ethernet device (rte_pktdev) and be forced to work that way.
>> >
>> >> You did have some great comments about how crypto is used and the=20
>>APIs
>> >> from the Linux Kernel crypto model is proven and I do not disagree.
>> >>=20
>> >> In my email to my own email I tried to point our we could add=20
>>something
>> >Yeah, sorry about the broken thread, the in-reply-to field got somehow
>> >html-mangled.  Not sure how that happened as I use mutt as my MUA.
>> >
>> >> very similar to Linux Kernel Crypto API and it would be a model most
>> >>would
>> >> be able to understand. Creating my own crypto API is not my goal,=20
>>but to
>> >> use standards where it makes the most sense to DPDK.
>> >>=20
>> >> The email link above is the email I suggested the Linux Kernel Crypto
>> >>API
>> >> would be reasonable.
>> >>=20
>> >And thats good.  It doesn't need to be the linux kernel crypto API by=20
>>any
>> >stretch, and I don't expect it to be.  My take away from review of the
>> >linux
>> >crypto api is that, while it uses some simmilar notions to parts of the
>> >network
>> >API (scatter gather i/o, optional async operation, completion queues),=
=20
>>it
>> >doesn't create any sort of common API structure between it and the=20
>>network
>> >subsystem, because they're separate devices.  Even though crypto is
>> >largely used
>> >in the network datapath for various ipsec tunnels, crypto still uses=20
>>its
>> >own API
>> >methods and data types because its not exclusive to the use of
>> >networking, and
>> >doing so allows for more flexible use outside the datapath.  Thats what
>> >I'm
>> >trying to get at here.  That users might want to use crypto outside of=
=20
>>the
>> >network path, and they should have an API that is well suited to that
>> >task.
>>=20
>> We agree here, I only want some similarities between devices not an=20
>>exact
>> common API.=20
>>=20
>> I am afraid I am not following you now. At one point it seems like all=20
>>of
>> the devices must be completely different then in another place ethdev is
>> best way to go as a common API. I think we are saying the same thing=20
>>here
>> just from a different point of view and I have tried to understand your
>> view point. I have even relaxed my design goals to meet you part way. It
>> seems like we agree and disagree here, which does not make a lot of=20
>>sense
>> to me at this point.
>>=20
>
>I'll repeat what I've said above several times now:
>
>I'd like to see a crypto device api that is developed independently.  It=20
>can use
>common data types if its suitable to do so, but shouldn't be forced to use
>simmilar methods from existing device types.  The API here should be=20
>optimized
>to best reflect the behavior of the hardware
>
>If you would like to use the cryptodevice as part of a network dataplane
>element, you should create a software pmd to register a rte_ethdev. =20
>Internally
>that pmd should use the crypto dev api to do its encryption work.
>
>That way your 'common' device api in the dataplane is the rte_ethdev.  The
>cryptodev api is then independent of, and optimized for, the cryptodev=20
>hardware.
>
>> Maybe we need to get into a room and talk F2F to make sure we are both
>> being understood here. Maybe we can then make sure we are both being=20
>>heard
>> and come to some type of solution for DPDK. Maybe the community meeting=
=20
>>is
>> a good place, but I feel it is going to be too big and too short to
>> discuss this correctly. If you are up to a conf call or I can travel to
>> you if you feel a F2F is better.
>>=20
>I think we're on opposite coasts, so I don't think a face to face is=20
>going to
>happen just for this, and that excludes other interests from this=20
>discussion,
>which I would really like to hear.
>
>I've re-read this thread, and I have a suggestion:
>
>Implement it.
>
>We're clearly approaching this from different angles.  You seem to place=20
>alot of
>value on the benefits of commonality between device types, while I=20
>clearly dont.
>I've got several examples in support of my view, but in fairness, the=20
>dpdk isn't
>the linux or bsd kernel, so while I don't think so, its possible this
>commonality provides some level of value.  Likewise, I may show that=20
>creating
>this common structure doesn't provide any real added value. =20
>Additionally, we've
>both focused on the commonality of devices in this thread (which was the
>intent), but neither of us have spent any time actually designing the=20
>crypto
>API, which I think will shed alot of light on the problem.
>
>I would suggest choosing a crypto device, maybe the ixp4 crypto engine,=20
>since it
>has an existing open source implementation, and using it to flesh out=20
>crypto
>API design.  If you implement it independently of the dpdk (i.e. as part=20
>of its
>own library with its own API), then we take a step back and look at it,=20
>we can
>identify the true api overlap and consider the advantages and limitation=20
>of
>consolidation.
>
>Sound like an idea?

Instead of replying to all of your comments above (sorry) I will finish up=
=20
the PoC code and then we can poke at the patches.

I have created a PoC around the code fragments in these emails, but I did=20
not include them as patches. The Quick Assist code on 01.org is being used=
=20
for the PoC and someone else is doing that work, but I can look at the=20
non-packet API from a top level to be added to the PoC.

https://01.org/packet-processing/intel%C2%AE-quickassist-technology-drivers
-and-patches


Regards,
++Keith