From: Neil Horman <nhorman@tuxdriver.com>
To: "Wiles, Keith" <keith.wiles@intel.com>, dev@dpdk.org
Subject: Re: [dpdk-dev] [RFC] Adding multiple device types to DPDK.
Date: Fri, 3 Apr 2015 13:00:43 -0400
Message-ID: <20150403170043.GA17441@hmsreliant.think-freely.org>
In-Reply-To: <D1408516.1A07B%keith.wiles@intel.com>
On Wed, Apr 01, 2015 at 12:44:54PM +0000, Wiles, Keith wrote:
> Hi all, (hoping format of the text is maintained)
>
> Bruce and myself are submitting this RFC in hopes of providing discussion
> points for the idea. Please do not get carried away with the code
> included, it was to help everyone understand the proposal/RFC.
>
> The RFC is to describe a proposed change we are looking to make to DPDK to
> add more device types. We would like to add in to DPDK the idea of a
> generic packet-device or "pktdev", which can be thought of as a thin layer
> for all device classes, including other device types such as, potentially,
> a "cryptodev" or "dpidev". One of the main goals is to not affect
> performance and not require any current application to be modified. The
> pktdev layer provides a light framework for developers to add a device
> to DPDK.
>
> Reason for Change
> -----------------
>
> The reasons why we are looking to introduce these concepts to DPDK are:
>
> * Expand the scope of DPDK so that it can provide APIs for more than just
> packet acquisition and transmission, but also provide APIs that can be
> used to work with other hardware and software offloads, such as
> cryptographic accelerators, or accelerated libraries for cryptographic
> functions. [The reason why both software and hardware are mentioned is so
> that the same APIs can be used whether or not a hardware accelerator is
> actually available].
> * Provide a minimal common basis for device abstraction in DPDK, that can
> be used to unify the different types of packet I/O devices already
> existing in DPDK. To this end, the ethdev APIs are a good starting point,
> but the ethdev library contains too many functions which are NIC-specific
> to be a general-purpose set of APIs across all devices.
> Note: The idea was previously touched on here:
> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545
>
> Description of Proposed Change
> ------------------------------
>
> The basic idea behind "pktdev" is to abstract out a few common routines
> and structures/members of structures, using the ethdev structures as
> a starting point: cut them down to little more than a few members in each
> structure, then possibly add just rx_burst and tx_burst. Then use the
> structures as a starting point for writing a device type. Currently we
> have the rx_burst/tx_burst routines moved to the pktdev, and it seems like
> moving a couple more common functions may be reasonable. It could be the
> Rx/Tx routines in pktdev should be left as is, but in the code below is a
> possible reason to abstract a few routines into a common set of files.
>
> From there, we have the ethdev type which adds in the existing functions
> specific to Ethernet devices, and also, for example, a cryptodev which may
> add in functions specific for cryptographic offload. As now, with the
> ethdev, the specific drivers provide concrete implementations of the
> functionality exposed by the interface. This hierarchy is shown in the
> diagram below, using the existing ethdev and ixgbe drivers as a reference,
> alongside a hypothetical cryptodev class and driver implementation
> (catchingly called) "X":
>
>                     ,---------------------.
>                     | struct rte_pktdev   |
>                     +---------------------+
>                     | rte_pkt_rx_burst()  |
>             .-------| rte_pkt_tx_burst()  |-----------.
>             |       `---------------------'           |
>             |                                         |
>             |                                         |
>   ,-------------------------------.   ,------------------------------.
>   | struct rte_ethdev             |   | struct rte_cryptodev         |
>   +-------------------------------+   +------------------------------+
>   | rte_eth_dev_configure()       |   | rte_crypto_init_sym_session()|
>   | rte_eth_allmulticast_enable() |   | rte_crypto_del_sym_session() |
>   | rte_eth_filter_ctrl()         |   |                              |
>   `-------------------------------'   `---------------.--------------'
>             |                                         |
>             |                                         |
>   ,---------'---------------------.   ,---------------'--------------.
>   | struct rte_pmd_ixgbe          |   | struct rte_pmd_X             |
>   +-------------------------------+   +------------------------------+
>   | .configure -> ixgbe_configure |   | .init_session -> X_init_ses()|
>   | .tx_burst  -> ixgbe_xmit_pkts |   | .tx_burst -> X_handle_pkts() |
>   `-------------------------------'   `------------------------------'
>
> We are not attempting to create a real class model here, only looking at
> creating a very basic common set of APIs and structures for other device
> types.
>
> In terms of code changes for this, we obviously need to add in new
> interface libraries for pktdev and cryptodev. The pktdev library can
> define a skeleton structure for the first few elements of the nested
> structures to ensure consistency. Each of the defines below illustrates the
> common members in device structures, which gives some basic structure to
> the device framework. Each of the defines is placed at the top of the
> device's matching structure and allows the device to contain common and
> private data. The pktdev structures overlay the first common set of members
> for each device type.
>
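
For reference, the overlay described in the quoted text could be sketched
roughly as below. This is only an illustration under my own assumptions: the
defines from the RFC are not reproduced in this message, so every name here
(pktdev_common, ethdev_like, cryptodev_like, pkt_rx_burst, pkt_tx_burst and
the stub callbacks) is hypothetical and is not the RFC's code. It also swaps
the macro-expanded members for an embedded first member, since C guarantees
that a pointer to a structure can be converted to a pointer to its first
member.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical burst signature, loosely modelled on rx_burst/tx_burst. */
typedef uint16_t (*pkt_burst_fn)(void *queue, void **pkts, uint16_t nb_pkts);

/* Members shared by every device class (names are illustrative only). */
struct pktdev_common {
    pkt_burst_fn rx_burst;
    pkt_burst_fn tx_burst;
    void *queue;                 /* per-device queue state, opaque here */
};

/* Ethernet-like class: common part first, then class-specific members. */
struct ethdev_like {
    struct pktdev_common common;
    uint16_t mtu;
};

/* Crypto-like class: same prefix, different private members. */
struct cryptodev_like {
    struct pktdev_common common;
    uint32_t nb_sessions;
};

/* The overlay only works if the common part sits at offset 0. */
_Static_assert(offsetof(struct ethdev_like, common) == 0,
               "common part must come first");
_Static_assert(offsetof(struct cryptodev_like, common) == 0,
               "common part must come first");

/* Generic wrappers: accept any device whose first member is pktdev_common. */
static inline uint16_t pkt_rx_burst(void *dev, void **pkts, uint16_t nb_pkts)
{
    struct pktdev_common *c = dev;

    return c->rx_burst(c->queue, pkts, nb_pkts);
}

static inline uint16_t pkt_tx_burst(void *dev, void **pkts, uint16_t nb_pkts)
{
    struct pktdev_common *c = dev;

    return c->tx_burst(c->queue, pkts, nb_pkts);
}

/* Stub driver callbacks so the sketch can be exercised without hardware. */
static uint16_t stub_rx_none(void *queue, void **pkts, uint16_t nb_pkts)
{
    (void)queue; (void)pkts; (void)nb_pkts;
    return 0;                    /* nothing received */
}

static uint16_t stub_tx_all(void *queue, void **pkts, uint16_t nb_pkts)
{
    (void)queue; (void)pkts;
    return nb_pkts;              /* pretend everything was sent */
}
```

A caller would fill in the common part of, say, a struct ethdev_like with the
stub callbacks and pass its address to pkt_tx_burst(); the wrapper never looks
at the class-specific members.
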
Keith and I discussed this offline, and for the purposes of completeness I'll
offer my rebuttal to this proposal here.

In short, I don't think the segregation of the transmit and receive routines
into their own separate structure (and ostensibly their own librte_pktdev
library) is particularly valuable. While it does provide some minimal code
savings when new device classes are introduced, the savings are not significant
(approximately 0.5 KB per device class if the rte_ethdev generic tx and rx
routines are any sort of indicator). It does, however, come with significant
costs in the sense that it binds a device class to using an I/O model (in this
case packet based receive and transmit) for which the device class may not be
suited.

To illustrate the difference in design ideas, currently the DPDK data pipeline
looks like this:

+------------+   +----------+   +---------+
|            |   |          |   |         |
| ARP        |   | ethdev   |   |         |   +----------+
| handler    +-->+ api      +-->+ PMD     +-->+ Wire     |
|            |   |          |   |         |   +----------+
|            |   |          |   |         |
+------------+   +----------+   +---------+

Where the ARP handler code is just some code that knows how to manage arp
requests and responses, and only transmits and receives frames.

Keith's idea would introduce this new pktdev handler structure and make the
dataplane pipeline look like this:

+------------+  +------------+  +------------+  +--------+
|            |  |            |  |            |  |        |
| ARP        |  | pktdev api |  | pktdev_api |  |        |  +---------+
| handler    +--+            +--+            +--+ PMD    +--+Wire     |
|            |  |            |  |            |  |        |  +---------+
|            |  |            |  |            |  |        |
+------------+  |            |  |            |  |        |
                |            |  |            |  +--------+
                |            |  |            |
                |            |  |            |
                |            |  |            |
                | rte_ethdev |  | rte_crypto |
                |            |  |            |
                |            |  |            |
                +------------+  +------------+

The idea being that now all devices in the dataplane are pktdev devices, and
code that transmits and receives frames only needs to know that a device can
transmit and receive frames. The crypto device in this chain is ostensibly
performing some sort of ipsec functionality so that arp frames are properly
encrypted and encapsulated for sending via a tunnel.
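
To make that model concrete, here is a rough sketch of what "everything is a
pktdev" might look like from the caller's side. None of this comes from the
RFC: struct pkt, struct pktdev, the toy_* callbacks and pkt_forward() are all
hypothetical names, and the XOR is a placeholder rather than real encryption.
The point is only that the glue between stages sees nothing but the two burst
calls, and that pushing a frame through the crypto stage costs a submit and a
collect.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 32

/* Toy packet; stands in for an mbuf. len must not exceed sizeof(data). */
struct pkt {
    uint16_t len;
    uint8_t data[64];
};

/* Hypothetical generic device: callers only see the two burst calls. */
struct pktdev {
    uint16_t (*rx_burst)(struct pktdev *dev, struct pkt **pkts, uint16_t n);
    uint16_t (*tx_burst)(struct pktdev *dev, struct pkt **pkts, uint16_t n);
    struct pkt *done[TOY_RING_SIZE]; /* packets waiting to be collected */
    uint16_t nb_done;
    uint32_t sent;                   /* packets put on the "wire" */
};

/* Crypto device, submit side: transform in place and queue the result. */
static uint16_t toy_crypto_tx(struct pktdev *dev, struct pkt **pkts, uint16_t n)
{
    uint16_t i, b;

    for (i = 0; i < n && dev->nb_done < TOY_RING_SIZE; i++) {
        for (b = 0; b < pkts[i]->len; b++)
            pkts[i]->data[b] ^= 0x5a; /* placeholder, not real encryption */
        dev->done[dev->nb_done++] = pkts[i];
    }
    return i;
}

/* Crypto device, collect side: hand back processed packets in order. */
static uint16_t toy_crypto_rx(struct pktdev *dev, struct pkt **pkts, uint16_t n)
{
    uint16_t i, j;

    for (i = 0; i < n && i < dev->nb_done; i++)
        pkts[i] = dev->done[i];
    for (j = i; j < dev->nb_done; j++)    /* compact the toy queue */
        dev->done[j - i] = dev->done[j];
    dev->nb_done = (uint16_t)(dev->nb_done - i);
    return i;
}

/* Wire device: transmit just counts, receive never yields anything. */
static uint16_t toy_wire_tx(struct pktdev *dev, struct pkt **pkts, uint16_t n)
{
    (void)pkts;
    dev->sent += n;
    return n;
}

static uint16_t toy_wire_rx(struct pktdev *dev, struct pkt **pkts, uint16_t n)
{
    (void)dev; (void)pkts; (void)n;
    return 0;
}

/* Stage glue: knows nothing about the device class on either side. */
static uint16_t pkt_forward(struct pktdev *from, struct pktdev *to, uint16_t n)
{
    struct pkt *burst[TOY_RING_SIZE];
    uint16_t got;

    if (n > TOY_RING_SIZE)
        n = TOY_RING_SIZE;
    got = from->rx_burst(from, burst, n);
    return to->tx_burst(to, burst, got);
}
```

In this sketch the arp handler would call tx_burst() on the crypto device, and
something would then have to call pkt_forward(&crypto, &wire, n) to move the
processed frames on toward the wire.
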

On the surface this seems reasonable, and in a sense it is. However, my
assertion is that we already have this functionality, and it is the rte_ethdev
device. To illustrate further, in my view we can do the above already:

+------------+  +---------+ +---------+  +---------+  +--------+
|            |  |         | |         |  |         |  |        |
|            |  |ethdev   | | ipsec   |  |ethdev   +--+        |
| ARP handler+->+api      +-+ tunnel  +->+api      |  | PMD    |
|            |  |         | | PMD     |  |         |  |        |
|            |  |         | |         |  |         |  |        |
+------------+  +---------+ +---+-----+  +---------+  +--------+
                                |
                             +--+-----+
                             |        |
                             |crypto  |
                             |api     |
                             |        |
                             |        |
                             +--------+

Using the rte_ethdev we can already codify the ipsec functionality as a pmd
that registers an ethdev, and stack it with other pmds using methods similar to
what the bonding pmd does (or via some other more generalized dataplane
indexing function). This still leaves us with the creation of the crypto api,
which is advantageous because:

1) It is not constrained by the i/o model of the dataplane. It may include
packet based i/o, but it can also build on more rudimentary (and performant)
interfaces. For instance, in addition to async block based i/o, a crypto device
may also operate synchronously, meaning a call can be saved with each
transaction (2 calls for a tx/rx vs. one for an encrypt operation).

2) It is not constrained by use case. That is to say, the API can be
constructed for more natural use with other functions (for instance encryption
of files on disk or via a pipe to another process), which may not have any
relation to the data plane of DPDK.
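
A minimal sketch of the stacking and of both points above, under assumptions
of my own: struct ethdev here is a cut-down stand-in and not the real
rte_eth_dev, crypto_encrypt() and struct crypto_session are hypothetical names
for a synchronous crypto api, and the XOR again stands in for a real cipher.
The ipsec pmd presents an ordinary tx_burst, makes one synchronous crypto call
per frame, and hands the result to the lower device; the same crypto call also
works on a plain buffer that has nothing to do with packets.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy packet; stands in for an mbuf. len must not exceed sizeof(data). */
struct pkt {
    uint16_t len;
    uint8_t data[64];
};

/* Hypothetical synchronous crypto api: operates on plain byte buffers. */
struct crypto_session {
    uint8_t key;
};

static int crypto_encrypt(const struct crypto_session *s, uint8_t *buf,
                          size_t len)
{
    size_t i;

    if (s == NULL || buf == NULL)
        return -1;
    for (i = 0; i < len; i++)
        buf[i] ^= s->key;        /* placeholder, not real encryption */
    return 0;                    /* result is ready when the call returns */
}

/* Cut-down ethdev stand-in: only the transmit path is modelled. */
struct ethdev {
    uint16_t (*tx_burst)(struct ethdev *dev, struct pkt **pkts, uint16_t n);
    void *priv;
};

/* Lowest layer: a "wire" that just counts what it was given. */
struct wire_priv {
    uint32_t sent;
};

static uint16_t wire_tx(struct ethdev *dev, struct pkt **pkts, uint16_t n)
{
    struct wire_priv *w = dev->priv;

    (void)pkts;
    w->sent += n;
    return n;
}

/* ipsec tunnel pmd: looks like an ethdev, stacks on another ethdev. */
struct ipsec_priv {
    struct crypto_session session;
    struct ethdev *lower;
};

static uint16_t ipsec_tx(struct ethdev *dev, struct pkt **pkts, uint16_t n)
{
    struct ipsec_priv *p = dev->priv;
    uint16_t i;

    /* One synchronous call per frame; no separate collect step. */
    for (i = 0; i < n; i++)
        if (crypto_encrypt(&p->session, pkts[i]->data, pkts[i]->len) != 0)
            break;
    /* Pass on only the frames that were processed successfully. */
    return p->lower->tx_burst(p->lower, pkts, i);
}
```

The arp handler in this arrangement only ever calls tx_burst() on an ethdev,
while a non-dataplane user (encrypting a chunk of a file, say) can call
crypto_encrypt() directly on its own buffer.
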
Neil