* [dpdk-dev] [RFC] Adding multiple device types to DPDK.
@ 2015-04-01 12:44 Wiles, Keith
2015-04-03 17:00 ` Neil Horman
0 siblings, 1 reply; 9+ messages in thread
From: Wiles, Keith @ 2015-04-01 12:44 UTC (permalink / raw)
To: dev
Hi all, (hoping format of the text is maintained)
Bruce and myself are submitting this RFC in hopes of providing discussion
points for the idea. Please do not get carried away with the code
included, it was to help everyone understand the proposal/RFC.
The RFC is to describe a proposed change we are looking to make to DPDK to
add more device types. We would like to add in to DPDK the idea of a
generic packet-device or ³pktdev², which can be thought of as a thin layer
for all device classes. For other device types such as potentially a
³cryptodev² or ³dpidev². One of the main goals is to not effect
performance and not require any current application to be modified. The
pktdev layer is providing a light framework for developers to add a device
to DPDK.
Reason for Change
-----------------
The reason why we are looking to introduce these concepts to DPDK are:
* Expand the scope of DPDK so that it can provide APIs for more than just
packet acquisition and transmission, but also provide APIs that can be
used to work with other hardware and software offloads, such as
cryptographic accelerators, or accelerated libraries for cryptographic
functions. [The reason why both software and hardware are mentioned is so
that the same APIs can be used whether or not a hardware accelerator is
actually available].
* Provide a minimal common basis for device abstraction in DPDK, that can
be used to unify the different types of packet I/O devices already
existing in DPDK. To this end, the ethdev APIs are a good starting point,
but the ethdev library contains too many functions which are NIC-specific
to be a general-purpose set of APIs across all devices.
Note: The idea was previously touched on here:
http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545
Description of Proposed Change
------------------------------
The basic idea behind "pktdev" is to abstract out a few common routines
and structures/members of structures by starting with ethdev structures as
a starting point, cut it down to little more than a few members in each
structure then possible add just rx_burst and tx_burst. Then use the
structures as a starting point for writing a device type. Currently we
have the rx_burst/tx_burst routines moved to the pktdev and it see like
move a couple more common functions maybe resaonable. It could be the
Rx/Tx routines in pktdev should be left as is, but in the code below is a
possible reason to abstract a few routines into a common set of files.
>From there, we have the ethdev type which adds in the existing functions
specific to Ethernet devices, and also, for example, a cryptodev which may
add in functions specific for cryptographic offload. As now, with the
ethdev, the specific drivers provide concrete implementations of the
functionality exposed by the interface. This hierarchy is shown in the
diagram below, using the existing ethdev and ixgbe drivers as a reference,
alongside a hypothetical cryptodev class and driver implementation
(catchingly called) "X":
,---------------------.
| struct rte_pktdev |
+---------------------+
| rte_pkt_rx_burst() |
.-------| rte_pkt_tx_burst() |-----------.
| `---------------------' |
| |
| |
,-------------------------------. ,------------------------------.
| struct rte_ethdev | | struct rte_cryptodev |
+-------------------------------+ +------------------------------+
| rte_eth_dev_configure() | | rte_crypto_init_sym_session()|
| rte_eth_allmulticast_enable() | | rte_crypto_del_sym_session() |
| rte_eth_filter_ctrl() | | |
`-------------------------------' `---------------.--------------'
| |
| |
,---------'---------------------. ,---------------'--------------.
| struct rte_pmd_ixgbe | | struct rte_pmd_X |
+-------------------------------+ +------------------------------+
| .configure -> ixgbe_configure | | .init_session -> X_init_ses()|
| .tx_burst -> ixgbe_xmit_pkts | | .tx_burst -> X_handle_pkts() |
`-------------------------------' `------------------------------'
We are not attempting to create a real class model here only looking at
creating a very basic common set of APIs and structures for other device
types.
In terms of code changes for this, we obviously need to add in new
interface libraries for pktdev and cryptodev. The pktdev library can
define a skeleton structure for the first few elements of the nested
structures to ensure consistency. Each of the defines below illustrate the
common members in device structures, which gives some basic structure the
device framework. Each of the defines are placed at the top of the devices
matching structures and allows the devices to contain common and private
data. The pkdev structures overlay the first common set of members for
each device type.
For example:
------------
We are using macros to reduce code changes to DPDK, but nested structures
are a better solution:
#define RTE_PKT_COMMON_DEV(_t)
\
pkt_rx_burst_t rx_pkt_burst; /**< Pointer to PMD
receive function. */ \
pkt_tx_burst_t tx_pkt_burst; /**< Pointer to PMD
transmit function. */ \
struct rte_##_t##_dev_data *data; /**< Pointer to device
data */ \
const struct _t##_driver *driver; /**< Driver for this
device */ \
struct _t##_dev_ops *dev_ops; /**< Functions exported by
PMD */ \
struct rte_pci_device *pci_dev; /**< PCI info. supplied by
probing */ \
/** User application callback for interrupts if present */
\
struct rte_##_t##_dev_cb_list link_intr_cbs;
\
/**
\
* User-supplied functions called from rx_burst to post-process
\
* received packets before passing them to the user
\
*/
\
struct rte_##_t##_rxtx_callback **post_rx_burst_cbs;
\
/**
\
* User-supplied functions called from tx_burst to pre-process
\
* received packets before passing them to the driver for
transmission. \
*/
\
struct rte_##_t##_rxtx_callback **pre_tx_burst_cbs;
\
enum rte_pkt_dev_type dev_type; /**< Flag indicating the
device type */ \
uint8_t attached /**< Flag indicating the
port is attached */
/* Possible alignment or a hole in the structure */
#define RTE_PKT_NAME_MAX_LEN (32)
#define RTE_PKT_COMMON_DEV_DATA
\
char name[RTE_PKT_NAME_MAX_LEN]; /**< Unique identifier name */
\
\
void **rx_queues; /**< Array of pointers to RX queues.
*/ \
void **tx_queues; /**< Array of pointers to TX queues.
*/ \
uint16_t nb_rx_queues; /**< Number of RX queues. */
\
uint16_t nb_tx_queues; /**< Number of TX queues. */
\
\
uint16_t flags; /**< Bit fields for xyzdev's to use.
*/ \
uint16_t mtu; /**< Maximum Transmission Unit. */
\
uint8_t unit_id; /**< Unit ID for this instance */
\
uint8_t _filler[7]; /* alignment filler */
\
\
/* 64bit alignment starts here */
\
void *dev_private; /**< PMD-specific private data */
\
uint64_t rx_mbuf_alloc_failed; /**< RX ring mbuf allocation failures.
*/ \
uint32_t min_rx_buf_size; /**< Common rx buffer size handled by
all queues */ \
uint32_t _pad0
#define port_id unit_id
#define RTE_PKT_COMMON_DEV_INFO
\
struct rte_pci_device *pci_dev; /**< Device PCI information.
*/ \
const char *driver_name; /**< Device Driver name. */
\
unsigned int if_index; /**< Index to bound host
interface, or 0 if none. */ \
/* Use if_indextoname() to translate into an interface name. */
\
uint32_t _pad0
The above is attempting to collect the common members to be place into the
top of private device structures as we feel these members should be fairly
common among the device types.
/**
* @internal
* The generic data structure associated with each device.
*
* Pointers to burst-oriented packet receive and transmit functions are
* located at the beginning of the structure, along with the pointer to
* where all the data elements for the particular device are stored in
shared
* memory. This split allows the function pointer and driver data to be per-
* process, while the actual configuration data for the device is shared.
*/
struct rte_pkt_dev {
RTE_PKT_COMMON_DEV(pkt);
};
/**
* @internal
* The data part, with no function pointers, associated with each device.
*
* This structure is safe to place in shared memory to be common among
different
* processes in a multi-process configuration.
*/
struct rte_pkt_dev_data {
RTE_PKT_COMMON_DEV_DATA;
};
------
The existing ethdev code can then have a minor updates such as those shown
below:
struct rte_eth_dev_info {
RTE_PKT_COMMON_DEV_INFO;
/* Private device data maybe here */
uint32_t min_rx_bufsize; /**< Minimum size of RX buffer. */
uint32_t max_rx_pktlen; /**< Maximum configurable length of RX pkt. */
...
struct rte_eth_dev_data {
RTE_PKT_COMMON_DEV_DATA; /**< Define located in <rte_pkt.h> */
/* Private device data maybe here */
struct rte_eth_dev_sriov sriov; /**< SRIOV data */
struct rte_eth_link dev_link; /**< Link-level information & status */
...
struct rte_eth_dev {
RTE_PKT_COMMON_DEV(eth);
/* Private device data maybe here */
};
/* Bit defines for flags in common pkt structure */
#define promiscuous 0x0008 /**< RX promiscuous mode ON(1) / OFF(0).
*/
#define scattered_rx 0x0004 /**< RX of scattered packets is ON(1) /
OFF(0) */
#define all_multicast 0x0002 /**< RX all multicast mode ON(1) /
OFF(0). */
#define dev_started 0x0001 /**< Device state: STARTED(1)/STOPPED(0)
*/
The advantage of doing a common set of member is the existing ethdev
structures and APIs can remain exactly the same, but every ethdev is also
a pktdev, which can be used as either as appropriate. Similarly for a type
of crypto devices, or dpi devices (or software rings or KNI devices, if we
so desire), we can base them off this common minimal framework and use
them all in a similar manner.
Moving some basic common functions and structures into a common set of
files gives everyone a clean starting point for a new device plus adding a
light framework. The pktdev code is normally not called directly from the
application, but called from the device itself via a define in the device
header files. The pktdev RX/TX routines can be called from the
application, but the application needs to get the device structure pointer
based on the port id.
The cryptodev API maybe very different from other devices and following
some type of Open Crypto API. The goal is not to restrict the device API,
but try to give some type of structure to tghe design. Does it make sense
to have a mbuf based Rx/Tx API, maybe not. Could the mbuf based APIs be
hidden in the pktdev code, very possible. We have a lot of options here.
How the two Rx/Tx routines are defined:
---------------------------------------
/**
*
* Retrieve a burst of input packets from a receive queue of an Ethernet
* device.
*
<SNIP>
*/
#define rte_eth_rx_burst(_pid, _qid, _pkts, _nb_pkts) \
rte_pkt_rx_burst((struct rte_pkt_dev *)&rte_eth_devices[_pid], _qid,
_pkts, _nb_pkts)
/**
* Send a burst of output packets on a transmit queue of an Ethernet
device.
*
<SNIP>
*/
#define rte_eth_tx_burst(_pid, _qid, _pkts, _nb_pkts) \
rte_pkt_tx_burst((struct rte_pkt_dev *)&rte_eth_device[_pid], _qid,
_pkts, _nb_pkts)
A snip of code showing some advantages and use case of using pktdev API:
------------------------------------------------------------------------
Not the complete code and it has not been tested and is only an example
how one could use the design.
/*
* The lcore main. This is the main thread that does the work, reading from
* an input port and writing to an output port.
*/
static __attribute__((noreturn)) void
do_work(const struct pipeline_params *p)
{
printf("\nCore %u forwarding packets. %s -> %s\n",
rte_lcore_id(),
p->src->data->name,
p->dst->data->name);
/* Run until the application is quit or killed. */
for (;;) {
/*
* Receive packets on a src device and forward them on out
* the dst device.
*/
/* Get burst of RX packets, from first port of pair. */
struct rte_mbuf *bufs[BURST_SIZE];
const uint16_t nb_rx = rte_pkt_rx_burst(p->src, 0,
bufs, BURST_SIZE);
if (unlikely(nb_rx == 0))
continue;
/* Send burst of TX packets, to second port of pair. */
const uint16_t nb_tx = rte_pkt_tx_burst(p->dst, 0,
bufs, nb_rx);
/* Free any unsent packets. */
if (unlikely(nb_tx < nb_rx)) {
uint16_t buf;
for (buf = nb_tx; buf < nb_rx; buf++)
rte_pktmbuf_free(bufs[buf]);
}
}
}
/*
* The main function, which does initialization and calls the per-lcore
* functions.
*/
int
main(int argc, char *argv[])
{
struct pipeline_params p[RTE_MAX_LCORE];
struct rte_mempool *mbuf_pool;
unsigned nb_ports, lcore_id;
uint8_t portid;
/* Initialize the Environment Abstraction Layer (EAL). */
int ret = rte_eal_init(argc, argv);
if (ret < 0)
rte_exit(EXIT_FAILURE, "Error with EAL initialization\n");
argc -= ret;
argv += ret;
/* Check that there is an even number of ports to send/receive on. */
nb_ports = rte_eth_dev_count();
if (nb_ports < 2 || (nb_ports & 1))
rte_exit(EXIT_FAILURE, "Error: number of ports must be even\n");
/* Creates a new mempool in memory to hold the mbufs. */
mbuf_pool = rte_mempool_create("MBUF_POOL",
NUM_MBUFS * nb_ports,
MBUF_SIZE,
MBUF_CACHE_SIZE,
sizeof(struct rte_pktmbuf_pool_private),
rte_pktmbuf_pool_init, NULL,
rte_pktmbuf_init, NULL,
rte_socket_id(),
0);
if (mbuf_pool == NULL)
rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
/* Initialize all ports. */
for (portid = 0; portid < nb_ports; portid++)
if (port_init(portid, mbuf_pool) != 0)
rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n",
portid);
struct rte_pkt_dev *in = rte_eth_get_dev(0);
RTE_LCORE_FOREACH_SLAVE(lcore_id){
char name[RTE_RING_NAMESIZE];
snprintf(name, sizeof(name), "RING_from_%u", lcore_id);
struct rte_pkt_dev *out = rte_ring_get_dev(
rte_ring_create(name, 4096, rte_socket_id(), 0));
p[lcore_id].src = in;
p[lcore_id].dst = out;
rte_eal_remote_launch((lcore_function_t *)do_work,
&p[lcore_id], lcore_id);
in = out; // next pipeline stage reads from my output.
}
//now finish pipeline on master lcore
lcore_id = rte_lcore_id();
p[lcore_id].src = in;
p[lcore_id].dst = rte_eth_get_dev(1);
do_work(&p[lcore_id]);
return 0;
}
Changes to rte_ethdev.[ch]
--------------------------
The most changes to rte_ethdev.[ch] was to use the new defines from
rte_pkt.h. All of the references to the globals in ethdev had to be
replaced with a reference to a global structure in ethdev. Moving the
global or private data into a device specific structure seemed reasonable
to reduce name space issues with new devices. The rx_burst/tx_burst
routines were removed as they now exist in the rte_pktdev.c file. If we
use nested structures instead of macros then more of the code will need to
be converted or macros used to convert the members to address the nested
structures.
Example:
#define rx_pkt_burst dev_data.rx_pkt_burst
#define tx_pkt_burst dev_data.tx_pkt_burst
Impact to Existing Applications
-------------------------------
None. The existing APIs should all remain unchanged, only the underlying
library code needs to change. [Obviously changes to apps will be needed to
take advantage of new device classes as we make them available].
The crypto API could be similar to the Open Crypto APIs and they seem
reasonable, but also using mbufs to hold data is just trying to use that
container type to provide some common structure to the system. Some of the
crypto data with be in the form of packets and some in the form of chunks
of data, which the API should account for in the design.
My goal is to provide a light weight framework for adding more devices and
not try to make everthing look like Ethernet device.
Regards,
++Keith and Bruce
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [RFC] Adding multiple device types to DPDK. 2015-04-01 12:44 [dpdk-dev] [RFC] Adding multiple device types to DPDK Wiles, Keith @ 2015-04-03 17:00 ` Neil Horman 2015-04-03 22:32 ` Wiles, Keith 0 siblings, 1 reply; 9+ messages in thread From: Neil Horman @ 2015-04-03 17:00 UTC (permalink / raw) To: Wiles, Keith, dev On Wed, Apr 01, 2015 at 12:44:54PM +0000, Wiles, Keith wrote: > Hi all, (hoping format of the text is maintained) > > Bruce and myself are submitting this RFC in hopes of providing discussion > points for the idea. Please do not get carried away with the code > included, it was to help everyone understand the proposal/RFC. > > The RFC is to describe a proposed change we are looking to make to DPDK to > add more device types. We would like to add in to DPDK the idea of a > generic packet-device or ?pktdev?, which can be thought of as a thin layer > for all device classes. For other device types such as potentially a > ?cryptodev? or ?dpidev?. One of the main goals is to not effect > performance and not require any current application to be modified. The > pktdev layer is providing a light framework for developers to add a device > to DPDK. > > Reason for Change > ----------------- > > The reason why we are looking to introduce these concepts to DPDK are: > > * Expand the scope of DPDK so that it can provide APIs for more than just > packet acquisition and transmission, but also provide APIs that can be > used to work with other hardware and software offloads, such as > cryptographic accelerators, or accelerated libraries for cryptographic > functions. [The reason why both software and hardware are mentioned is so > that the same APIs can be used whether or not a hardware accelerator is > actually available]. > * Provide a minimal common basis for device abstraction in DPDK, that can > be used to unify the different types of packet I/O devices already > existing in DPDK. To this end, the ethdev APIs are a good starting point, > but the ethdev library contains too many functions which are NIC-specific > to be a general-purpose set of APIs across all devices. > Note: The idea was previously touched on here: > http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545 > > Description of Proposed Change > ------------------------------ > > The basic idea behind "pktdev" is to abstract out a few common routines > and structures/members of structures by starting with ethdev structures as > a starting point, cut it down to little more than a few members in each > structure then possible add just rx_burst and tx_burst. Then use the > structures as a starting point for writing a device type. Currently we > have the rx_burst/tx_burst routines moved to the pktdev and it see like > move a couple more common functions maybe resaonable. It could be the > Rx/Tx routines in pktdev should be left as is, but in the code below is a > possible reason to abstract a few routines into a common set of files. > > >From there, we have the ethdev type which adds in the existing functions > specific to Ethernet devices, and also, for example, a cryptodev which may > add in functions specific for cryptographic offload. As now, with the > ethdev, the specific drivers provide concrete implementations of the > functionality exposed by the interface. This hierarchy is shown in the > diagram below, using the existing ethdev and ixgbe drivers as a reference, > alongside a hypothetical cryptodev class and driver implementation > (catchingly called) "X": > > ,---------------------. > | struct rte_pktdev | > +---------------------+ > | rte_pkt_rx_burst() | > .-------| rte_pkt_tx_burst() |-----------. > | `---------------------' | > | | > | | > ,-------------------------------. ,------------------------------. > | struct rte_ethdev | | struct rte_cryptodev | > +-------------------------------+ +------------------------------+ > | rte_eth_dev_configure() | | rte_crypto_init_sym_session()| > | rte_eth_allmulticast_enable() | | rte_crypto_del_sym_session() | > | rte_eth_filter_ctrl() | | | > `-------------------------------' `---------------.--------------' > | | > | | > ,---------'---------------------. ,---------------'--------------. > | struct rte_pmd_ixgbe | | struct rte_pmd_X | > +-------------------------------+ +------------------------------+ > | .configure -> ixgbe_configure | | .init_session -> X_init_ses()| > | .tx_burst -> ixgbe_xmit_pkts | | .tx_burst -> X_handle_pkts() | > `-------------------------------' `------------------------------' > > We are not attempting to create a real class model here only looking at > creating a very basic common set of APIs and structures for other device > types. > > In terms of code changes for this, we obviously need to add in new > interface libraries for pktdev and cryptodev. The pktdev library can > define a skeleton structure for the first few elements of the nested > structures to ensure consistency. Each of the defines below illustrate the > common members in device structures, which gives some basic structure the > device framework. Each of the defines are placed at the top of the devices > matching structures and allows the devices to contain common and private > data. The pkdev structures overlay the first common set of members for > each device type. > Keith and I discussed this offline, and for the purposes of completeness I'll offer my rebuttal to this proposal here. In short, I don't think the segregation of the transmit and receive routines into their own separate structure (and ostensibly their own librte_pktdev library) is particularly valuable. While it does provide some minimal code savings when new device classes are introduced, the savings are not significant (approximlately 0.5kb per device class if the rte_ethdev generic tx and rx routines are any sort of indicator). It does however, come with significant costs in the sense that it binds a device class to using an I/O model (in this case packet based recieve and transmit) for which the device class may not be suited. To illustrate the difference in design ideas, currenty the dpdk data pipeline looks like this: +------------+ +----------+ +---------+ | | | | | | | ARP | | ethdev | | | +----------+ | handler +-->+ api +-->+ PMD +-->+ Wire | | | | | | | +----------+ | | | | | | +------------+ +----------+ +---------+ Where the ARP handler code is just some code that knows how to manage arp requests and responses, and only transmits and receives frames Keiths idea would introduce this new pktdev handler structure and make the dataplane pipeline look like this: +------------+ +------------+ +------------+ +--------+ | | | | | | | | | ARP | | pktdev api | | pktdev_api | | | +---------+ | handler +-+ +--+ +--+ PMD +--+Wire | | | | | | | | | +---------+ | | | | | | | | +------------+ | | | | | | | | | | +--------+ | | | | | | | | | | | | | rte_ethdev | | rte_crypto | | | | | | | | | +------------+ +------------+ The idea being that now all devices in the dataplane are pktdev devices and code that transmits and receives frames only needs to know that a device can transmit and receive frames. The crypto device in this chain is ostensibly preforming some sort of ipsec functionality so that arp frames are properly encrypted and encapsulated for sending via a tunnel. On the surface this seems reasonable, and in a sense it is. However, my assertion is that we already have this functionality, and it is the rte_ethdev device. To illustrate further, in my view we can do the above already: +------------+ +---------+ +---------+ +---------+ +--------+ | | | | | | | | | | | | |ethdev | | ipsec | |ethdev +--+ | | ARP handler+->+api +-+ tunnel +->+api | | PMD | | | | | PMD | | | | | | | | | | | | | | | +------------+ +---------+ +---+-----+ +---------+ +--------+ | +--+-----+ | | |crypto | |api | | | | | +--------+ Using the rte_ethdev we can already codify the ipsec functionailty as a pmd that registers an ethdev, and stack it with other pmds using methods simmilar to what the bonding pmd does (or via some other more generalized dataplane indexing function). This still leaves us with the creation of the crypto api, which is adventageous because: 1) It is not constrained by the i/o model of the dataplane (it may include packet based i/o, but can build on more rudimentary (and performant) interfaces. For instance, in addition to async block based i/o, a crypto device may also operate syncrhnously, meaning a call can be saved with each transaction (2 calls for a tx/rx vs one for an encrypt operation). 2) It is not constrained by use case. That is to say the API can be constructed for more natural use with other functions (for instance encryptions of files on disk or via a pipe to another process), which may not have any relation to the data plane of DPDK. Neil ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [RFC] Adding multiple device types to DPDK. 2015-04-03 17:00 ` Neil Horman @ 2015-04-03 22:32 ` Wiles, Keith 2015-04-04 13:11 ` Neil Horman 0 siblings, 1 reply; 9+ messages in thread From: Wiles, Keith @ 2015-04-03 22:32 UTC (permalink / raw) To: Neil Horman, dev Hi Neil, On 4/3/15, 12:00 PM, "Neil Horman" <nhorman@tuxdriver.com> wrote: >On Wed, Apr 01, 2015 at 12:44:54PM +0000, Wiles, Keith wrote: >> Hi all, (hoping format of the text is maintained) >> >> Bruce and myself are submitting this RFC in hopes of providing >>discussion >> points for the idea. Please do not get carried away with the code >> included, it was to help everyone understand the proposal/RFC. >> >> The RFC is to describe a proposed change we are looking to make to DPDK >>to >> add more device types. We would like to add in to DPDK the idea of a >> generic packet-device or ?pktdev?, which can be thought of as a thin >>layer >> for all device classes. For other device types such as potentially a >> ?cryptodev? or ?dpidev?. One of the main goals is to not effect >> performance and not require any current application to be modified. The >> pktdev layer is providing a light framework for developers to add a >>device >> to DPDK. >> >> Reason for Change >> ----------------- >> >> The reason why we are looking to introduce these concepts to DPDK are: >> >> * Expand the scope of DPDK so that it can provide APIs for more than >>just >> packet acquisition and transmission, but also provide APIs that can be >> used to work with other hardware and software offloads, such as >> cryptographic accelerators, or accelerated libraries for cryptographic >> functions. [The reason why both software and hardware are mentioned is >>so >> that the same APIs can be used whether or not a hardware accelerator is >> actually available]. >> * Provide a minimal common basis for device abstraction in DPDK, that >>can >> be used to unify the different types of packet I/O devices already >> existing in DPDK. To this end, the ethdev APIs are a good starting >>point, >> but the ethdev library contains too many functions which are >>NIC-specific >> to be a general-purpose set of APIs across all devices. >> Note: The idea was previously touched on here: >> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545 >> >> Description of Proposed Change >> ------------------------------ >> >> The basic idea behind "pktdev" is to abstract out a few common routines >> and structures/members of structures by starting with ethdev structures >>as >> a starting point, cut it down to little more than a few members in each >> structure then possible add just rx_burst and tx_burst. Then use the >> structures as a starting point for writing a device type. Currently we >> have the rx_burst/tx_burst routines moved to the pktdev and it see like >> move a couple more common functions maybe resaonable. It could be the >> Rx/Tx routines in pktdev should be left as is, but in the code below is >>a >> possible reason to abstract a few routines into a common set of files. >> >> >From there, we have the ethdev type which adds in the existing >>functions >> specific to Ethernet devices, and also, for example, a cryptodev which >>may >> add in functions specific for cryptographic offload. As now, with the >> ethdev, the specific drivers provide concrete implementations of the >> functionality exposed by the interface. This hierarchy is shown in the >> diagram below, using the existing ethdev and ixgbe drivers as a >>reference, >> alongside a hypothetical cryptodev class and driver implementation >> (catchingly called) "X": >> >> ,---------------------. >> | struct rte_pktdev | >> +---------------------+ >> | rte_pkt_rx_burst() | >> .-------| rte_pkt_tx_burst() |-----------. >> | `---------------------' | >> | | >> | | >> ,-------------------------------. ,------------------------------. >> | struct rte_ethdev | | struct rte_cryptodev | >> +-------------------------------+ +------------------------------+ >> | rte_eth_dev_configure() | | rte_crypto_init_sym_session()| >> | rte_eth_allmulticast_enable() | | rte_crypto_del_sym_session() | >> | rte_eth_filter_ctrl() | | | >> `-------------------------------' `---------------.--------------' >> | | >> | | >> ,---------'---------------------. ,---------------'--------------. >> | struct rte_pmd_ixgbe | | struct rte_pmd_X | >> +-------------------------------+ +------------------------------+ >> | .configure -> ixgbe_configure | | .init_session -> X_init_ses()| >> | .tx_burst -> ixgbe_xmit_pkts | | .tx_burst -> X_handle_pkts() | >> `-------------------------------' `------------------------------' >> >> We are not attempting to create a real class model here only looking at >> creating a very basic common set of APIs and structures for other device >> types. >> >> In terms of code changes for this, we obviously need to add in new >> interface libraries for pktdev and cryptodev. The pktdev library can >> define a skeleton structure for the first few elements of the nested >> structures to ensure consistency. Each of the defines below illustrate >>the >> common members in device structures, which gives some basic structure >>the >> device framework. Each of the defines are placed at the top of the >>devices >> matching structures and allows the devices to contain common and private >> data. The pkdev structures overlay the first common set of members for >> each device type. >> > > >Keith and I discussed this offline, and for the purposes of completeness >I'll >offer my rebuttal to this proposal here. > >In short, I don't think the segregation of the transmit and receive >routines >into their own separate structure (and ostensibly their own librte_pktdev >library) is particularly valuable. While it does provide some minimal >code >savings when new device classes are introduced, the savings are not >significant >(approximlately 0.5kb per device class if the rte_ethdev generic tx and rx >routines are any sort of indicator). It does however, come with >significant >costs in the sense that it binds a device class to using an I/O model (in >this >case packet based recieve and transmit) for which the device class may >not be >suited. The only reason the we only have a 0.5Kb saving is you are only looking at moving Rx/Tx routines into pktdev, but what happens if we decide to move a number of common functions like start/stop and others, then you start to see a much bigger saving. Do we need this saving, maybe not, but it does provide a single call API rte_pkt_rx/tx_burst to use instead of the application having to make sure it is calling the correct device Rx/Tx routines. All that is required is passing in the device pointer and it is handled for the application. Bruce added some code below to that effect. > >To illustrate the difference in design ideas, currenty the dpdk data >pipeline >looks like this: > >+------------+ +----------+ +---------+ >| | | | | | >| ARP | | ethdev | | | +----------+ >| handler +-->+ api +-->+ PMD +-->+ Wire | >| | | | | | +----------+ >| | | | | | >+------------+ +----------+ +---------+ You did not add the crypto to this picture as it is in the picture below to make them the same. +-----+ +---------+ +---------+ +------+ | | | | | | | | +------+ | ARP +--+ ethdev +--+ crypto +--+ PMD +--+ Wire | | | | | | | | | +------+ +-----+ +---------+ +---------+ +------+ > > >Where the ARP handler code is just some code that knows how to manage arp >requests and responses, and only transmits and receives frames > >Keiths idea would introduce this new pktdev handler structure and make the >dataplane pipeline look like this: > >+------------+ +------------+ +------------+ +--------+ >| | | | | | | | >| ARP | | pktdev api | | pktdev_api | | | +---------+ >| handler +-+ +--+ +--+ PMD +--+Wire | >| | | | | | | | +---------+ >| | | | | | | | >+------------+ | | | | | | > | | | | +--------+ > | | | | > | | | | > | | | | > | rte_ethdev | | rte_crypto | > | | | | > | | | | > +------------+ +------------+ You are drawing this picture it appears trying to make the pktdev another function call layer when it is just a single macro in the rte_ethdev header pointing to the rte_pktdev tx_burst routine. No function function overhead as the macro in rte_ethdev changes the rte_eth_tx_burst to rte_pkt_tx_burst routine, which BTW is the same routine that was in rte_ethdev today. The pktdev and ethdev are calling into the PMD tx routine via the dev_ops function pointers structure. Which is also no extra over head. If you are calling the rte_pkt_tx_burst routine directly it just means you need to get the device pointer to pass instead of the port id value in the rte_pkt_tx_burst routine. The above turns into something like this: +-----+ +---------+ +--------+ +-----+ | | | ethdev | | | | | +------+ | ARP +--+ map to +--+ crypto +--+ PMD +--+ Wire | | | | pktdev | | | | | +------+ +-----+ +---------+ +--------+ +-----+ So the path of the data is the same only a macro does a simple rename of the call to rte_eth_tx_burst routine. If you call the pktdev routine directly then the macro is not used. > >The idea being that now all devices in the dataplane are pktdev devices >and code >that transmits and receives frames only needs to know that a device can >transmit >and receive frames. The crypto device in this chain is ostensibly >preforming >some sort of ipsec functionality so that arp frames are properly >encrypted and >encapsulated for sending via a tunnel. > >On the surface this seems reasonable, and in a sense it is. However, my >assertion is that we already have this functionality, and it is the >rte_ethdev >device. To illustrate further, in my view we can do the above already: > >+------------+ +---------+ +---------+ +---------+ +--------+ >| | | | | | | | | | >| | |ethdev | | ipsec | |ethdev +--+ | >| ARP handler+->+api +-+ tunnel +->+api | | PMD >| | | | | PMD | | | | | >| | | | | | | | | | >+------------+ +---------+ +---+-----+ +---------+ +--------+ > | > +--+-----+ > | | > |crypto | > |api | > | | > | | > +--------+ > >Using the rte_ethdev we can already codify the ipsec functionailty as a >pmd that >registers an ethdev, and stack it with other pmds using methods simmilar >to what >the bonding pmd does (or via some other more generalized dataplane >indexing >function). This still leaves us with the creation of the crypto api, >which is >adventageous because: The proposal does not remove the bonding method and can still be used, correct? I do not see the different here using the pktdev style routines or using ethdev routines. > >1) It is not constrained by the i/o model of the dataplane (it may include >packet based i/o, but can build on more rudimentary (and performant) >interfaces. >For instance, in addition to async block based i/o, a crypto device may >also >operate syncrhnously, meaning a call can be saved with each transaction >(2 calls >for a tx/rx vs one for an encrypt operation). > >2) It is not constrained by use case. That is to say the API can be >constructed >for more natural use with other functions (for instance encryptions of >files on >disk or via a pipe to another process), which may not have any relation >to the >data plane of DPDK. > >Neil Ok, you snipped the text of the email here an it makes the context wrong without the rest of the code IMO. I will try to explain without the text that was omitted, but it would be best for anyone missing the original email to read it for more details. I know the formatting got messed up a bit :-( http://dpdk.org/ml/archives/dev/2015-April/016124.html In the rest of the text it does show the points I wanted to make here and how little overhead it added. Lets just say we do not move the TX/RX routines from rte_ethdev into rte_pktdev and only have a header file with a few structures and macros to help define some common parts between each of the new device types being added. I could see that as an option, but I do not see the big issues you are pointing out here. You did have some great comments about how crypto is used and the APIs from the Linux Kernel crypto model is proven and I do not disagree. In my email to my own email I tried to point our we could add something very similar to Linux Kernel Crypto API and it would be a model most would be able to understand. Creating my own crypto API is not my goal, but to use standards where it makes the most sense to DPDK. The email link above is the email I suggested the Linux Kernel Crypto API would be reasonable. Regards, ++Keith ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [RFC] Adding multiple device types to DPDK. 2015-04-03 22:32 ` Wiles, Keith @ 2015-04-04 13:11 ` Neil Horman 2015-04-04 15:16 ` Wiles, Keith 0 siblings, 1 reply; 9+ messages in thread From: Neil Horman @ 2015-04-04 13:11 UTC (permalink / raw) To: Wiles, Keith; +Cc: dev On Fri, Apr 03, 2015 at 10:32:01PM +0000, Wiles, Keith wrote: > Hi Neil, > > On 4/3/15, 12:00 PM, "Neil Horman" <nhorman@tuxdriver.com> wrote: > > >On Wed, Apr 01, 2015 at 12:44:54PM +0000, Wiles, Keith wrote: > >> Hi all, (hoping format of the text is maintained) > >> > >> Bruce and myself are submitting this RFC in hopes of providing > >>discussion > >> points for the idea. Please do not get carried away with the code > >> included, it was to help everyone understand the proposal/RFC. > >> > >> The RFC is to describe a proposed change we are looking to make to DPDK > >>to > >> add more device types. We would like to add in to DPDK the idea of a > >> generic packet-device or ?pktdev?, which can be thought of as a thin > >>layer > >> for all device classes. For other device types such as potentially a > >> ?cryptodev? or ?dpidev?. One of the main goals is to not effect > >> performance and not require any current application to be modified. The > >> pktdev layer is providing a light framework for developers to add a > >>device > >> to DPDK. > >> > >> Reason for Change > >> ----------------- > >> > >> The reason why we are looking to introduce these concepts to DPDK are: > >> > >> * Expand the scope of DPDK so that it can provide APIs for more than > >>just > >> packet acquisition and transmission, but also provide APIs that can be > >> used to work with other hardware and software offloads, such as > >> cryptographic accelerators, or accelerated libraries for cryptographic > >> functions. [The reason why both software and hardware are mentioned is > >>so > >> that the same APIs can be used whether or not a hardware accelerator is > >> actually available]. > >> * Provide a minimal common basis for device abstraction in DPDK, that > >>can > >> be used to unify the different types of packet I/O devices already > >> existing in DPDK. To this end, the ethdev APIs are a good starting > >>point, > >> but the ethdev library contains too many functions which are > >>NIC-specific > >> to be a general-purpose set of APIs across all devices. > >> Note: The idea was previously touched on here: > >> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545 > >> > >> Description of Proposed Change > >> ------------------------------ > >> > >> The basic idea behind "pktdev" is to abstract out a few common routines > >> and structures/members of structures by starting with ethdev structures > >>as > >> a starting point, cut it down to little more than a few members in each > >> structure then possible add just rx_burst and tx_burst. Then use the > >> structures as a starting point for writing a device type. Currently we > >> have the rx_burst/tx_burst routines moved to the pktdev and it see like > >> move a couple more common functions maybe resaonable. It could be the > >> Rx/Tx routines in pktdev should be left as is, but in the code below is > >>a > >> possible reason to abstract a few routines into a common set of files. > >> > >> >From there, we have the ethdev type which adds in the existing > >>functions > >> specific to Ethernet devices, and also, for example, a cryptodev which > >>may > >> add in functions specific for cryptographic offload. As now, with the > >> ethdev, the specific drivers provide concrete implementations of the > >> functionality exposed by the interface. This hierarchy is shown in the > >> diagram below, using the existing ethdev and ixgbe drivers as a > >>reference, > >> alongside a hypothetical cryptodev class and driver implementation > >> (catchingly called) "X": > >> > >> ,---------------------. > >> | struct rte_pktdev | > >> +---------------------+ > >> | rte_pkt_rx_burst() | > >> .-------| rte_pkt_tx_burst() |-----------. > >> | `---------------------' | > >> | | > >> | | > >> ,-------------------------------. ,------------------------------. > >> | struct rte_ethdev | | struct rte_cryptodev | > >> +-------------------------------+ +------------------------------+ > >> | rte_eth_dev_configure() | | rte_crypto_init_sym_session()| > >> | rte_eth_allmulticast_enable() | | rte_crypto_del_sym_session() | > >> | rte_eth_filter_ctrl() | | | > >> `-------------------------------' `---------------.--------------' > >> | | > >> | | > >> ,---------'---------------------. ,---------------'--------------. > >> | struct rte_pmd_ixgbe | | struct rte_pmd_X | > >> +-------------------------------+ +------------------------------+ > >> | .configure -> ixgbe_configure | | .init_session -> X_init_ses()| > >> | .tx_burst -> ixgbe_xmit_pkts | | .tx_burst -> X_handle_pkts() | > >> `-------------------------------' `------------------------------' > >> > >> We are not attempting to create a real class model here only looking at > >> creating a very basic common set of APIs and structures for other device > >> types. > >> > >> In terms of code changes for this, we obviously need to add in new > >> interface libraries for pktdev and cryptodev. The pktdev library can > >> define a skeleton structure for the first few elements of the nested > >> structures to ensure consistency. Each of the defines below illustrate > >>the > >> common members in device structures, which gives some basic structure > >>the > >> device framework. Each of the defines are placed at the top of the > >>devices > >> matching structures and allows the devices to contain common and private > >> data. The pkdev structures overlay the first common set of members for > >> each device type. > >> > > > > > >Keith and I discussed this offline, and for the purposes of completeness > >I'll > >offer my rebuttal to this proposal here. > > > >In short, I don't think the segregation of the transmit and receive > >routines > >into their own separate structure (and ostensibly their own librte_pktdev > >library) is particularly valuable. While it does provide some minimal > >code > >savings when new device classes are introduced, the savings are not > >significant > >(approximlately 0.5kb per device class if the rte_ethdev generic tx and rx > >routines are any sort of indicator). It does however, come with > >significant > >costs in the sense that it binds a device class to using an I/O model (in > >this > >case packet based recieve and transmit) for which the device class may > >not be > >suited. > > The only reason the we only have a 0.5Kb saving is you are only looking at > moving Rx/Tx routines into pktdev, but what happens if we decide to move a > number of common functions like start/stop and others, then you start to > see a much bigger saving. You're right of course, but lets just try the math here. If we assume that, since all these libraries are just inlined redirection functions, we can guess that each one is about .25kb of code (5.kb/2 functions). So if you add two more functions you're looking at 1kb of code savings. Of course if you start doing so, you beg the two questions that I've been trying to pose here: 1) How much 'common' API do you want to impose on a device class that may not be condusive to the common set 2) At what point does your common API just become an rte_ethdev? > Do we need this saving, maybe not, but it does I would say definately not, given that the DPDK houses entire API sets that are always-inline, and any savings from the above are easily eclipsed by the fanout that occurs from use in multiple call sites. Lets be honest, performance is the DPDK's primary concern. Code size is not a consideration. Its only an argument here because it makes this proposal look favorable. Its not a bad thing mind you, but if this proposal caused any code bloat, it wouldn't even be mentioned. > provide a single call API rte_pkt_rx/tx_burst to use instead of the > application having to make sure it is calling the correct device Rx/Tx > routines. Given that that is a compile time issue, I really don't see why that is a big deal. Especially if you, as I suggest, just use the rte_ethdev as your common dataplane device model. Then by virtue of the fact that they're all rte_ethdevs, you get common API naming. > All that is required is passing in the device pointer and it is > handled for the application. Bruce added some code below to that effect. Yes, and you have that now, its an rte_ethdev. > > > >To illustrate the difference in design ideas, currenty the dpdk data > >pipeline > >looks like this: > > > >+------------+ +----------+ +---------+ > >| | | | | | > >| ARP | | ethdev | | | +----------+ > >| handler +-->+ api +-->+ PMD +-->+ Wire | > >| | | | | | +----------+ > >| | | | | | > >+------------+ +----------+ +---------+ > > You did not add the crypto to this picture as it is in the picture below > to make them the same. > No I didn't, because it doesn't exist right now. Thats what I mean by 'currently'. See my description below after you added your drawing. > +-----+ +---------+ +---------+ +------+ > | | | | | | | | +------+ > | ARP +--+ ethdev +--+ crypto +--+ PMD +--+ Wire | > | | | | | | | | +------+ > +-----+ +---------+ +---------+ +------+ > > > > > > > >Where the ARP handler code is just some code that knows how to manage arp > >requests and responses, and only transmits and receives frames > > > >Keiths idea would introduce this new pktdev handler structure and make the > >dataplane pipeline look like this: > > > >+------------+ +------------+ +------------+ +--------+ > >| | | | | | | | > >| ARP | | pktdev api | | pktdev_api | | | +---------+ > >| handler +-+ +--+ +--+ PMD +--+Wire | > >| | | | | | | | +---------+ > >| | | | | | | | > >+------------+ | | | | | | > > | | | | +--------+ > > | | | | > > | | | | > > | | | | > > | rte_ethdev | | rte_crypto | > > | | | | > > | | | | > > +------------+ +------------+ > > You are drawing this picture it appears trying to make the pktdev another > function call layer when it is just a single macro in the rte_ethdev Not trying to imply a new function call layer, just trying to illustrate your proposal, that you wish to make rte_ethdevs,and rte_cryptodevs all look like rte_pktdevs so that the dataplane has a common element for passing mbufs around, regardless of its true device class. Not sure where you see an extra function call layer here. > header pointing to the rte_pktdev tx_burst routine. No function function > overhead as the macro in rte_ethdev changes the rte_eth_tx_burst to > rte_pkt_tx_burst routine, which BTW is the same routine that was in > rte_ethdev today. The pktdev and ethdev are calling into the PMD tx > routine via the dev_ops function pointers structure. Which is also no > extra over head. > Never suggested it was extra overhead, where did you get that from? I'm just illustrating what it is you want to do. If I didn't get it right I apologize, please clarify. > If you are calling the rte_pkt_tx_burst routine directly it just means you > need to get the device pointer to pass instead of the port id value in the > rte_pkt_tx_burst routine. The above turns into something like this: > > +-----+ +---------+ +--------+ +-----+ > | | | ethdev | | | | | +------+ > | ARP +--+ map to +--+ crypto +--+ PMD +--+ Wire | > | | | pktdev | | | | | +------+ > +-----+ +---------+ +--------+ +-----+ > > So the path of the data is the same only a macro does a simple rename of > the call to rte_eth_tx_burst routine. If you call the pktdev routine > directly then the macro is not used. > > Let me quash this by stipulating that you are absolutely correct, there is no additional overhead in your design, it should provide the exact same performance that an rte_ethdev currently does. I have no problem with that. My point is: It provides the exact same performance that an rte_ethdev does, so lets just use rte_ethdev if we're trying to model devices in the dataplane. > > > >The idea being that now all devices in the dataplane are pktdev devices > >and code > >that transmits and receives frames only needs to know that a device can > >transmit > >and receive frames. The crypto device in this chain is ostensibly > >preforming > >some sort of ipsec functionality so that arp frames are properly > >encrypted and > >encapsulated for sending via a tunnel. > > > >On the surface this seems reasonable, and in a sense it is. However, my > >assertion is that we already have this functionality, and it is the > >rte_ethdev > >device. To illustrate further, in my view we can do the above already: > > > >+------------+ +---------+ +---------+ +---------+ +--------+ > >| | | | | | | | | | > >| | |ethdev | | ipsec | |ethdev +--+ | > >| ARP handler+->+api +-+ tunnel +->+api | | PMD > >| | | | | PMD | | | | | > >| | | | | | | | | | > >+------------+ +---------+ +---+-----+ +---------+ +--------+ > > | > > +--+-----+ > > | | > > |crypto | > > |api | > > | | > > | | > > +--------+ > > > >Using the rte_ethdev we can already codify the ipsec functionailty as a > >pmd that > >registers an ethdev, and stack it with other pmds using methods simmilar > >to what > >the bonding pmd does (or via some other more generalized dataplane > >indexing > >function). This still leaves us with the creation of the crypto api, > >which is > >adventageous because: > > The proposal does not remove the bonding method and can still be used, > correct? Not sure I fully understand the question, but no, I'm not proposing any specific change to bonding here, just using it to illustrate that we have ways currently to stack PMD's on top of one another to create a multi-stage data pipeline. > I do not see the different here using the pktdev style routines or using > ethdev routines. Exactly my point! There is no difference to the dataplane API, it remains unchanged, just as in your proposal, except that there is no work required to create a pktdev device that doesn't change anything. The advantage that I'm proposing here is that my model separates whatever crypto device class API you want to design from the dataplane (rte_ethdev) api. Doing this lets you create a crypto API that is more appropriately suited to the crypto class of device. if there is overlap with rte_ethdev, thats fine, you can create simmilar functions (even giving them identical names to the rte_ethdev apis), and that will be properly resolved at compile time. But more importantly, where there is not overlap, you're not forced into creating an API that inherits the common API functions you propose, when they are not best suited to your device. > > > >1) It is not constrained by the i/o model of the dataplane (it may include > >packet based i/o, but can build on more rudimentary (and performant) > >interfaces. > >For instance, in addition to async block based i/o, a crypto device may > >also > >operate syncrhnously, meaning a call can be saved with each transaction > >(2 calls > >for a tx/rx vs one for an encrypt operation). > > > >2) It is not constrained by use case. That is to say the API can be > >constructed > >for more natural use with other functions (for instance encryptions of > >files on > >disk or via a pipe to another process), which may not have any relation > >to the > >data plane of DPDK. > > > >Neil > > Ok, you snipped the text of the email here an it makes the context wrong I figured your illustration above was sufficient to make your point and mine. If your code tells a different story, I apologize. > without the rest of the code IMO. I will try to explain without the text > that was omitted, but it would be best for anyone missing the original > email to read it for more details. I know the formatting got messed up a > bit :-( > > http://dpdk.org/ml/archives/dev/2015-April/016124.html > > > In the rest of the text it does show the points I wanted to make here and > how little overhead it added. > As above, I'm stipulating to the fact that there is not performance overhead. That is not, and was never my point. My point was/is that we already have an API to do what you want to do. The commonality of functions that you seem to focus on, is already provided by the fact that we have a common device for the dataplane (the rte_ethdev). You're proposal doesn't buy us anything beyond that. By your own assertion, your proposal: 1) Its just as performant as an rte_ethdev 2) It doesn't require any code change to applications Given that, and the fact that it does require that new device classes inherit an API set that may not be relevant to said device class, my one and only question is: whats the advantage? You claim that common API naming makes application code generic, but we have that today. If you want to do IPsec with a crypto offload devie, write a pmd that interfaces to the data pipeline via the rte_ethdev API and uses whatever crypto offload device API you want to construct to do the crypto work. That way, if a given application wants to use the crypto hardware for something non-network related (say disk/file encryption), it can use the crypto device layer in a way that is not packet/network oriented more easily, at which point having a common API is completely irrelevant, as the application code is being specifically written to talk to a crypto device. > Lets just say we do not move the TX/RX routines from rte_ethdev into > rte_pktdev and only have a header file with a few structures and macros to > help define some common parts between each of the new device types being > added. I could see that as an option, but I do not see the big issues you > are pointing out here. > My big issue is that I see no point in creating a common API. You're focused on finding a common set of API's that apply to all device classes, and I'm asserting that there is no safe set of common API routines. While it may be possible to shoehorn multiple device classes into the same i/o API model, you are doing yourself a disservice by mandating that. Put another way, ipsec devices and crypto devices are separate device classes and deserve to be modeled differently. An IPsec device is a network element that is very nicely modeled as an ethernet interface, for which you have a very robust API (the rte_ethdev). Crypto offload devices are not network elements, and can preform i/o in a mutltitude of ways (synchronous, asynchronous, scatter/gather, etc), and have many non ethernet API facets. The latter deserves to have its own API design, not to be forced to inherit part of an ethernet device (rte_pktdev) and be forced to work that way. > You did have some great comments about how crypto is used and the APIs > from the Linux Kernel crypto model is proven and I do not disagree. > > In my email to my own email I tried to point our we could add something Yeah, sorry about the broken thread, the in-reply-to field got somehow html-mangled. Not sure how that happened as I use mutt as my MUA. > very similar to Linux Kernel Crypto API and it would be a model most would > be able to understand. Creating my own crypto API is not my goal, but to > use standards where it makes the most sense to DPDK. > > The email link above is the email I suggested the Linux Kernel Crypto API > would be reasonable. > And thats good. It doesn't need to be the linux kernel crypto API by any stretch, and I don't expect it to be. My take away from review of the linux crypto api is that, while it uses some simmilar notions to parts of the network API (scatter gather i/o, optional async operation, completion queues), it doesn't create any sort of common API structure between it and the network subsystem, because they're separate devices. Even though crypto is largely used in the network datapath for various ipsec tunnels, crypto still uses its own API methods and data types because its not exclusive to the use of networking, and doing so allows for more flexible use outside the datapath. Thats what I'm trying to get at here. That users might want to use crypto outside of the network path, and they should have an API that is well suited to that task. Neil > Regards, > ++Keith > > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [RFC] Adding multiple device types to DPDK. 2015-04-04 13:11 ` Neil Horman @ 2015-04-04 15:16 ` Wiles, Keith 2015-04-05 19:37 ` Neil Horman 0 siblings, 1 reply; 9+ messages in thread From: Wiles, Keith @ 2015-04-04 15:16 UTC (permalink / raw) To: Neil Horman; +Cc: dev On 4/4/15, 8:11 AM, "Neil Horman" <nhorman@tuxdriver.com> wrote: >On Fri, Apr 03, 2015 at 10:32:01PM +0000, Wiles, Keith wrote: >> Hi Neil, >> >> On 4/3/15, 12:00 PM, "Neil Horman" <nhorman@tuxdriver.com> wrote: >> >> >On Wed, Apr 01, 2015 at 12:44:54PM +0000, Wiles, Keith wrote: >> >> Hi all, (hoping format of the text is maintained) >> >> >> >> Bruce and myself are submitting this RFC in hopes of providing >> >>discussion >> >> points for the idea. Please do not get carried away with the code >> >> included, it was to help everyone understand the proposal/RFC. >> >> >> >> The RFC is to describe a proposed change we are looking to make to >>DPDK >> >>to >> >> add more device types. We would like to add in to DPDK the idea of a >> >> generic packet-device or ?pktdev?, which can be thought of as a thin >> >>layer >> >> for all device classes. For other device types such as potentially a >> >> ?cryptodev? or ?dpidev?. One of the main goals is to not effect >> >> performance and not require any current application to be modified. >>The >> >> pktdev layer is providing a light framework for developers to add a >> >>device >> >> to DPDK. >> >> >> >> Reason for Change >> >> ----------------- >> >> >> >> The reason why we are looking to introduce these concepts to DPDK >>are: >> >> >> >> * Expand the scope of DPDK so that it can provide APIs for more than >> >>just >> >> packet acquisition and transmission, but also provide APIs that can >>be >> >> used to work with other hardware and software offloads, such as >> >> cryptographic accelerators, or accelerated libraries for >>cryptographic >> >> functions. [The reason why both software and hardware are mentioned >>is >> >>so >> >> that the same APIs can be used whether or not a hardware accelerator >>is >> >> actually available]. >> >> * Provide a minimal common basis for device abstraction in DPDK, that >> >>can >> >> be used to unify the different types of packet I/O devices already >> >> existing in DPDK. To this end, the ethdev APIs are a good starting >> >>point, >> >> but the ethdev library contains too many functions which are >> >>NIC-specific >> >> to be a general-purpose set of APIs across all devices. >> >> Note: The idea was previously touched on here: >> >> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545 >> >> >> >> Description of Proposed Change >> >> ------------------------------ >> >> >> >> The basic idea behind "pktdev" is to abstract out a few common >>routines >> >> and structures/members of structures by starting with ethdev >>structures >> >>as >> >> a starting point, cut it down to little more than a few members in >>each >> >> structure then possible add just rx_burst and tx_burst. Then use the >> >> structures as a starting point for writing a device type. Currently >>we >> >> have the rx_burst/tx_burst routines moved to the pktdev and it see >>like >> >> move a couple more common functions maybe resaonable. It could be the >> >> Rx/Tx routines in pktdev should be left as is, but in the code below >>is >> >>a >> >> possible reason to abstract a few routines into a common set of >>files. >> >> >> >> >From there, we have the ethdev type which adds in the existing >> >>functions >> >> specific to Ethernet devices, and also, for example, a cryptodev >>which >> >>may >> >> add in functions specific for cryptographic offload. As now, with the >> >> ethdev, the specific drivers provide concrete implementations of the >> >> functionality exposed by the interface. This hierarchy is shown in >>the >> >> diagram below, using the existing ethdev and ixgbe drivers as a >> >>reference, >> >> alongside a hypothetical cryptodev class and driver implementation >> >> (catchingly called) "X": >> >> >> >> ,---------------------. >> >> | struct rte_pktdev | >> >> +---------------------+ >> >> | rte_pkt_rx_burst() | >> >> .-------| rte_pkt_tx_burst() |-----------. >> >> | `---------------------' | >> >> | | >> >> | | >> >> ,-------------------------------. >>,------------------------------. >> >> | struct rte_ethdev | | struct rte_cryptodev >> | >> >> +-------------------------------+ >>+------------------------------+ >> >> | rte_eth_dev_configure() | | >>rte_crypto_init_sym_session()| >> >> | rte_eth_allmulticast_enable() | | >>rte_crypto_del_sym_session() | >> >> | rte_eth_filter_ctrl() | | >> | >> >> `-------------------------------' >>`---------------.--------------' >> >> | | >> >> | | >> >> ,---------'---------------------. >>,---------------'--------------. >> >> | struct rte_pmd_ixgbe | | struct rte_pmd_X >> | >> >> +-------------------------------+ >>+------------------------------+ >> >> | .configure -> ixgbe_configure | | .init_session -> >>X_init_ses()| >> >> | .tx_burst -> ixgbe_xmit_pkts | | .tx_burst -> >>X_handle_pkts() | >> >> `-------------------------------' >>`------------------------------' >> >> >> >> We are not attempting to create a real class model here only looking >>at >> >> creating a very basic common set of APIs and structures for other >>device >> >> types. >> >> >> >> In terms of code changes for this, we obviously need to add in new >> >> interface libraries for pktdev and cryptodev. The pktdev library can >> >> define a skeleton structure for the first few elements of the nested >> >> structures to ensure consistency. Each of the defines below >>illustrate >> >>the >> >> common members in device structures, which gives some basic structure >> >>the >> >> device framework. Each of the defines are placed at the top of the >> >>devices >> >> matching structures and allows the devices to contain common and >>private >> >> data. The pkdev structures overlay the first common set of members >>for >> >> each device type. >> >> >> > >> > >> >Keith and I discussed this offline, and for the purposes of >>completeness >> >I'll >> >offer my rebuttal to this proposal here. >> > >> >In short, I don't think the segregation of the transmit and receive >> >routines >> >into their own separate structure (and ostensibly their own >>librte_pktdev >> >library) is particularly valuable. While it does provide some minimal >> >code >> >savings when new device classes are introduced, the savings are not >> >significant >> >(approximlately 0.5kb per device class if the rte_ethdev generic tx >>and rx >> >routines are any sort of indicator). It does however, come with >> >significant >> >costs in the sense that it binds a device class to using an I/O model >>(in >> >this >> >case packet based recieve and transmit) for which the device class may >> >not be >> >suited. >> >> The only reason the we only have a 0.5Kb saving is you are only looking >>at >> moving Rx/Tx routines into pktdev, but what happens if we decide to >>move a >> number of common functions like start/stop and others, then you start to >> see a much bigger saving. >You're right of course, but lets just try the math here. If we assume >that, >since all these libraries are just inlined redirection functions, we can >guess >that each one is about .25kb of code (5.kb/2 functions). So if you add >two more >functions you're looking at 1kb of code savings. Of course if you start >doing >so, you beg the two questions that I've been trying to pose here: > >1) How much 'common' API do you want to impose on a device class that may >not >be condusive to the common set > >2) At what point does your common API just become an rte_ethdev? You are taking the code size to the extreme, which is not the real point here. I agree it does not matter in footprint savings as most DPDK systems are pretty big. The reason for the common code was to help remove some of the code required for the developer to write in the new device type. If the developer of the new device type writes his own set of Rx/Tx routines the community will determine if they are required or do the current APIs work or do we require them to support both. Please try not getting hung up on the footprint savings and I am sorry if I was making this a huge point. > >> Do we need this saving, maybe not, but it does >I would say definately not, given that the DPDK houses entire API sets >that are >always-inline, and any savings from the above are easily eclipsed by the >fanout >that occurs from use in multiple call sites. Lets be honest, performance >is the >DPDK's primary concern. Code size is not a consideration. Its only an >argument >here because it makes this proposal look favorable. Its not a bad thing >mind >you, but if this proposal caused any code bloat, it wouldn't even be >mentioned. In the current design the pktdev APIs are still inline routines only the debug routines are not. > >> provide a single call API rte_pkt_rx/tx_burst to use instead of the >> application having to make sure it is calling the correct device Rx/Tx >> routines. >Given that that is a compile time issue, I really don't see why that is a >big >deal. Especially if you, as I suggest, just use the rte_ethdev as your >common >dataplane device model. Then by virtue of the fact that they're all >rte_ethdevs, you get common API naming. > >> All that is required is passing in the device pointer and it is >> handled for the application. Bruce added some code below to that effect. >Yes, and you have that now, its an rte_ethdev. > >> > >> >To illustrate the difference in design ideas, currenty the dpdk data >> >pipeline >> >looks like this: >> > >> >+------------+ +----------+ +---------+ >> >| | | | | | >> >| ARP | | ethdev | | | +----------+ >> >| handler +-->+ api +-->+ PMD +-->+ Wire | >> >| | | | | | +----------+ >> >| | | | | | >> >+------------+ +----------+ +---------+ >> >> You did not add the crypto to this picture as it is in the picture below >> to make them the same. >> >No I didn't, because it doesn't exist right now. Thats what I mean by >'currently'. See my description below after you added your drawing. > >> +-----+ +---------+ +---------+ +------+ >> | | | | | | | | +------+ >> | ARP +--+ ethdev +--+ crypto +--+ PMD +--+ Wire | >> | | | | | | | | +------+ >> +-----+ +---------+ +---------+ +------+ >> >> >> > >> > >> >Where the ARP handler code is just some code that knows how to manage >>arp >> >requests and responses, and only transmits and receives frames >> > >> >Keiths idea would introduce this new pktdev handler structure and make >>the >> >dataplane pipeline look like this: >> > >> >+------------+ +------------+ +------------+ +--------+ >> >| | | | | | | | >> >| ARP | | pktdev api | | pktdev_api | | | +---------+ >> >| handler +-+ +--+ +--+ PMD +--+Wire | >> >| | | | | | | | +---------+ >> >| | | | | | | | >> >+------------+ | | | | | | >> > | | | | +--------+ >> > | | | | >> > | | | | >> > | | | | >> > | rte_ethdev | | rte_crypto | >> > | | | | >> > | | | | >> > +------------+ +------------+ >> >> You are drawing this picture it appears trying to make the pktdev >>another >> function call layer when it is just a single macro in the rte_ethdev >Not trying to imply a new function call layer, just trying to illustrate >your >proposal, that you wish to make rte_ethdevs,and rte_cryptodevs all look >like >rte_pktdevs so that the dataplane has a common element for passing mbufs >around, >regardless of its true device class. Not sure where you see an extra >function >call layer here. Please see my comment above as I am not trying to make them all look the same you are taking my words to the extreme. We need some type of structure for adding more device types to DPDK and I am just trying to suggest a few common parts as a solution. Some parts of the structures should be common and possible a few common routines, yes a big part of each device will be different. The rte_ethdev structures is what we have today and was a good starting point, but a big part of the code will be different. Should we allow someone to write a new device type that is completely different down to typedefs for variable types and all new structures, no we should not, giving them some guide lines by providing a set of routines and structures to start with, by all means lets do it. > >> header pointing to the rte_pktdev tx_burst routine. No function function >> overhead as the macro in rte_ethdev changes the rte_eth_tx_burst to >> rte_pkt_tx_burst routine, which BTW is the same routine that was in >> rte_ethdev today. The pktdev and ethdev are calling into the PMD tx >> routine via the dev_ops function pointers structure. Which is also no >> extra over head. >> >Never suggested it was extra overhead, where did you get that from? I'm >just >illustrating what it is you want to do. If I didn't get it right I >apologize, >please clarify. No need to clarify as we agree and maybe I read too much into your comments. > >> If you are calling the rte_pkt_tx_burst routine directly it just means >>you >> need to get the device pointer to pass instead of the port id value in >>the >> rte_pkt_tx_burst routine. The above turns into something like this: >> >> +-----+ +---------+ +--------+ +-----+ >> | | | ethdev | | | | | +------+ >> | ARP +--+ map to +--+ crypto +--+ PMD +--+ Wire | >> | | | pktdev | | | | | +------+ >> +-----+ +---------+ +--------+ +-----+ >> >> So the path of the data is the same only a macro does a simple rename of >> the call to rte_eth_tx_burst routine. If you call the pktdev routine >> directly then the macro is not used. >> >> >Let me quash this by stipulating >that you are absolutely correct, there is no additional overhead in your >design, >it should provide the exact same performance that an rte_ethdev currently >does. >I have no problem with that. My point is: It provides the exact same >performance that an rte_ethdev does, so lets just use rte_ethdev if we're >trying >to model devices in the dataplane. The ethdev model does not give us a different device type it just forces the new device type to look like an ethdev, this is what I am reading from your email here. I do not believe you are suggesting that ethdev look a like is the answer to a new device type. The bonding model allows you to link some devices together and that is good in IMO. The bonding model is kind of like a pipeline model at the device layer. The pipeline style at the application layer could have provided the same feature as the bonding model just being done at two different layers in the system (application and device layer). Now the pipeline model is a bit more generic and may add some small overhead compared to the bonding model. If we require every device type to support a bonding style then we would need them all to follow some type of common structure, correct? > >> > >> >The idea being that now all devices in the dataplane are pktdev devices >> >and code >> >that transmits and receives frames only needs to know that a device can >> >transmit >> >and receive frames. The crypto device in this chain is ostensibly >> >preforming >> >some sort of ipsec functionality so that arp frames are properly >> >encrypted and >> >encapsulated for sending via a tunnel. >> > >> >On the surface this seems reasonable, and in a sense it is. However, >>my >> >assertion is that we already have this functionality, and it is the >> >rte_ethdev >> >device. To illustrate further, in my view we can do the above >>already: >> > >> >+------------+ +---------+ +---------+ +---------+ +--------+ >> >| | | | | | | | | | >> >| | |ethdev | | ipsec | |ethdev +--+ | >> >| ARP handler+->+api +-+ tunnel +->+api | | PMD >> >| | | | | PMD | | | | | >> >| | | | | | | | | | >> >+------------+ +---------+ +---+-----+ +---------+ +--------+ >> > | >> > +--+-----+ >> > | | >> > |crypto | >> > |api | >> > | | >> > | | >> > +--------+ >> > >> >Using the rte_ethdev we can already codify the ipsec functionailty as a >> >pmd that >> >registers an ethdev, and stack it with other pmds using methods >>simmilar >> >to what >> >the bonding pmd does (or via some other more generalized dataplane >> >indexing >> >function). This still leaves us with the creation of the crypto api, >> >which is >> >adventageous because: >> >> The proposal does not remove the bonding method and can still be used, >> correct? >Not sure I fully understand the question, but no, I'm not proposing any >specific >change to bonding here, just using it to illustrate that we have ways >currently >to stack PMD's on top of one another to create a multi-stage data >pipeline. Look at the above comment for this one. > >> I do not see the different here using the pktdev style routines or using >> ethdev routines. >Exactly my point! There is no difference to the dataplane API, it remains >unchanged, just as in your proposal, except that there is no work >required to >create a pktdev device that doesn't change anything. The advantage that >I'm >proposing here is that my model separates whatever crypto device class >API you want >to design from the dataplane (rte_ethdev) api. Doing this lets you >create a >crypto API that is more appropriately suited to the crypto class of >device. if >there is overlap with rte_ethdev, thats fine, you can create simmilar >functions >(even giving them identical names to the rte_ethdev apis), and that will >be >properly resolved at compile time. But more importantly, where there is >not >overlap, you're not forced into creating an API that inherits the common >API >functions you propose, when they are not best suited to your device. I am still not trying to suggest we have the cryptodev be a copy of the ethdev here. We need some type of structure and common guide lines for a new device, which pktdev is a very reasonable plus very light framework. I am a bit confused now it seems like you are suggesting a new device should be ethdev and then you seem to be stating it should be completely differentŠ Lets just agree some common parts are good between device types and some are just not. Making all devices look like an ethdev is just not going to fly and making a new device type into something completely different is not going to fly. Then the only solution is to have somethings in common to help unify the designs, but still allow the developer to add APIs to that device which make the most sense. > >> > >> >1) It is not constrained by the i/o model of the dataplane (it may >>include >> >packet based i/o, but can build on more rudimentary (and performant) >> >interfaces. >> >For instance, in addition to async block based i/o, a crypto device may >> >also >> >operate syncrhnously, meaning a call can be saved with each transaction >> >(2 calls >> >for a tx/rx vs one for an encrypt operation). >> > >> >2) It is not constrained by use case. That is to say the API can be >> >constructed >> >for more natural use with other functions (for instance encryptions of >> >files on >> >disk or via a pipe to another process), which may not have any relation >> >to the >> >data plane of DPDK. >> > >> >Neil >> >> Ok, you snipped the text of the email here an it makes the context wrong >I figured your illustration above was sufficient to make your point and >mine. >If your code tells a different story, I apologize. > >> without the rest of the code IMO. I will try to explain without the text >> that was omitted, but it would be best for anyone missing the original >> email to read it for more details. I know the formatting got messed up a >> bit :-( >> >> http://dpdk.org/ml/archives/dev/2015-April/016124.html >> >> >> In the rest of the text it does show the points I wanted to make here >>and >> how little overhead it added. >> >As above, I'm stipulating to the fact that there is not performance >overhead. >That is not, and was never my point. My point was/is that we already >have an >API to do what you want to do. The commonality of functions that you >seem to >focus on, is already provided by the fact that we have a common device >for the >dataplane (the rte_ethdev). You're proposal doesn't buy us anything >beyond >that. By your own assertion, your proposal: > >1) Its just as performant as an rte_ethdev > >2) It doesn't require any code change to applications > >Given that, and the fact that it does require that new device classes >inherit an Please do not bring inheritance into this again I wish Bruce had never used that term :-( in hind site it is not a very good way to describe the design. It seems like it at a very high level, but it seems to drag everyone down to the bits and bytes way too much. We ever stated that all the APIs are common between ethdev (or are inherited) between a new device only a couple which seemed to make sense to me. The structures being a bit similar was my goal and to provide a good light framework. >API set that may not be relevant to said device class, my one and only >question >is: whats the advantage? I have been trying to layout the advantages: 1) - we provide a very light framework for the developer of a new device type to start with in his design 2) - we look at making some routines common between devices if it make sense, some non-performance APIs do make sense to be common, but the structures need to be common in some way IMO 3) - we make a few structures look similar using pktdev structures as an overlay to allow for a clean way to pass around these different device types 4) - adopt standard APIs or translate those APIs into something similar for DPDK (e.g. Linux kernel crypto APIs) >You claim that common API naming makes application I never claimed that common API naming make applications common, but only a few common constructs or API semantics could help. >code generic, but we have that today. If you want to do IPsec with a >crypto >offload devie, write a pmd that interfaces to the data pipeline via the >rte_ethdev API and uses whatever crypto offload device API you want to >construct >to do the crypto work. That way, if a given application wants to use the >crypto I am trying to suggest the crypto device can have other APIs better suited to how we want to use that device, again we can have a few common items between ethdev and cryptodev, but it does not make sense to use all of ethdev. I am getting confused here as you seem to be suggesting we use ethdev model, but I am sure you are not, correct? >hardware for something non-network related (say disk/file encryption), it >can >use the crypto device layer in a way that is not packet/network oriented >more >easily, at which point having a common API is completely irrelevant, as >the >application code is being specifically written to talk to a crypto device. > > > > >> Lets just say we do not move the TX/RX routines from rte_ethdev into >> rte_pktdev and only have a header file with a few structures and macros >>to >> help define some common parts between each of the new device types being >> added. I could see that as an option, but I do not see the big issues >>you >> are pointing out here. >> >My big issue is that I see no point in creating a common API. I am not trying to create a common API between all devices, it only seemed like a reasonable design to make Rx/Tx look similar. Also making some of the structures look similar is a good thing IMO. >You're focused on finding a common set of API's that apply to all device >classes, and I'm asserting that there is no safe set of common API >routines. >While it may be possible to shoehorn multiple device classes into the >same i/o >API model, you are doing yourself a disservice by mandating that. I feel like you are taking the extreme view here again and that was not my intent to make everything the same only where it makes sense. A ethdev and cryptodev are two different animals, but they do have some similarities and I want to make sure we investigate these similarities, right? > >Put another way, ipsec devices and crypto devices are separate device >classes >and deserve to be modeled differently. An IPsec device is a network >element >that is very nicely modeled as an ethernet interface, for which you have >a very >robust API (the rte_ethdev). Crypto offload devices are not network >elements, >and can preform i/o in a mutltitude of ways (synchronous, asynchronous, >scatter/gather, etc), and have many non ethernet API facets. The latter >deserves to have its own API design, not to be forced to inherit part of >an >ethernet device (rte_pktdev) and be forced to work that way. > >> You did have some great comments about how crypto is used and the APIs >> from the Linux Kernel crypto model is proven and I do not disagree. >> >> In my email to my own email I tried to point our we could add something >Yeah, sorry about the broken thread, the in-reply-to field got somehow >html-mangled. Not sure how that happened as I use mutt as my MUA. > >> very similar to Linux Kernel Crypto API and it would be a model most >>would >> be able to understand. Creating my own crypto API is not my goal, but to >> use standards where it makes the most sense to DPDK. >> >> The email link above is the email I suggested the Linux Kernel Crypto >>API >> would be reasonable. >> >And thats good. It doesn't need to be the linux kernel crypto API by any >stretch, and I don't expect it to be. My take away from review of the >linux >crypto api is that, while it uses some simmilar notions to parts of the >network >API (scatter gather i/o, optional async operation, completion queues), it >doesn't create any sort of common API structure between it and the network >subsystem, because they're separate devices. Even though crypto is >largely used >in the network datapath for various ipsec tunnels, crypto still uses its >own API >methods and data types because its not exclusive to the use of >networking, and >doing so allows for more flexible use outside the datapath. Thats what >I'm >trying to get at here. That users might want to use crypto outside of the >network path, and they should have an API that is well suited to that >task. We agree here, I only want some similarities between devices not an exact common API. I am afraid I am not following you now. At one point it seems like all of the devices must be completely different then in another place ethdev is best way to go as a common API. I think we are saying the same thing here just from a different point of view and I have tried to understand your view point. I have even relaxed my design goals to meet you part way. It seems like we agree and disagree here, which does not make a lot of sense to me at this point. Maybe we need to get into a room and talk F2F to make sure we are both being understood here. Maybe we can then make sure we are both being heard and come to some type of solution for DPDK. Maybe the community meeting is a good place, but I feel it is going to be too big and too short to discuss this correctly. If you are up to a conf call or I can travel to you if you feel a F2F is better. > >Neil >> Regards, >> ++Keith >> >> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [RFC] Adding multiple device types to DPDK. 2015-04-04 15:16 ` Wiles, Keith @ 2015-04-05 19:37 ` Neil Horman 2015-04-05 22:20 ` Wiles, Keith 0 siblings, 1 reply; 9+ messages in thread From: Neil Horman @ 2015-04-05 19:37 UTC (permalink / raw) To: Wiles, Keith; +Cc: dev On Sat, Apr 04, 2015 at 03:16:08PM +0000, Wiles, Keith wrote: > > > On 4/4/15, 8:11 AM, "Neil Horman" <nhorman@tuxdriver.com> wrote: > > >On Fri, Apr 03, 2015 at 10:32:01PM +0000, Wiles, Keith wrote: > >> Hi Neil, > >> > >> On 4/3/15, 12:00 PM, "Neil Horman" <nhorman@tuxdriver.com> wrote: > >> > >> >On Wed, Apr 01, 2015 at 12:44:54PM +0000, Wiles, Keith wrote: > >> >> Hi all, (hoping format of the text is maintained) > >> >> > >> >> Bruce and myself are submitting this RFC in hopes of providing > >> >>discussion > >> >> points for the idea. Please do not get carried away with the code > >> >> included, it was to help everyone understand the proposal/RFC. > >> >> > >> >> The RFC is to describe a proposed change we are looking to make to > >>DPDK > >> >>to > >> >> add more device types. We would like to add in to DPDK the idea of a > >> >> generic packet-device or ?pktdev?, which can be thought of as a thin > >> >>layer > >> >> for all device classes. For other device types such as potentially a > >> >> ?cryptodev? or ?dpidev?. One of the main goals is to not effect > >> >> performance and not require any current application to be modified. > >>The > >> >> pktdev layer is providing a light framework for developers to add a > >> >>device > >> >> to DPDK. > >> >> > >> >> Reason for Change > >> >> ----------------- > >> >> > >> >> The reason why we are looking to introduce these concepts to DPDK > >>are: > >> >> > >> >> * Expand the scope of DPDK so that it can provide APIs for more than > >> >>just > >> >> packet acquisition and transmission, but also provide APIs that can > >>be > >> >> used to work with other hardware and software offloads, such as > >> >> cryptographic accelerators, or accelerated libraries for > >>cryptographic > >> >> functions. [The reason why both software and hardware are mentioned > >>is > >> >>so > >> >> that the same APIs can be used whether or not a hardware accelerator > >>is > >> >> actually available]. > >> >> * Provide a minimal common basis for device abstraction in DPDK, that > >> >>can > >> >> be used to unify the different types of packet I/O devices already > >> >> existing in DPDK. To this end, the ethdev APIs are a good starting > >> >>point, > >> >> but the ethdev library contains too many functions which are > >> >>NIC-specific > >> >> to be a general-purpose set of APIs across all devices. > >> >> Note: The idea was previously touched on here: > >> >> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545 > >> >> > >> >> Description of Proposed Change > >> >> ------------------------------ > >> >> > >> >> The basic idea behind "pktdev" is to abstract out a few common > >>routines > >> >> and structures/members of structures by starting with ethdev > >>structures > >> >>as > >> >> a starting point, cut it down to little more than a few members in > >>each > >> >> structure then possible add just rx_burst and tx_burst. Then use the > >> >> structures as a starting point for writing a device type. Currently > >>we > >> >> have the rx_burst/tx_burst routines moved to the pktdev and it see > >>like > >> >> move a couple more common functions maybe resaonable. It could be the > >> >> Rx/Tx routines in pktdev should be left as is, but in the code below > >>is > >> >>a > >> >> possible reason to abstract a few routines into a common set of > >>files. > >> >> > >> >> >From there, we have the ethdev type which adds in the existing > >> >>functions > >> >> specific to Ethernet devices, and also, for example, a cryptodev > >>which > >> >>may > >> >> add in functions specific for cryptographic offload. As now, with the > >> >> ethdev, the specific drivers provide concrete implementations of the > >> >> functionality exposed by the interface. This hierarchy is shown in > >>the > >> >> diagram below, using the existing ethdev and ixgbe drivers as a > >> >>reference, > >> >> alongside a hypothetical cryptodev class and driver implementation > >> >> (catchingly called) "X": > >> >> > >> >> ,---------------------. > >> >> | struct rte_pktdev | > >> >> +---------------------+ > >> >> | rte_pkt_rx_burst() | > >> >> .-------| rte_pkt_tx_burst() |-----------. > >> >> | `---------------------' | > >> >> | | > >> >> | | > >> >> ,-------------------------------. > >>,------------------------------. > >> >> | struct rte_ethdev | | struct rte_cryptodev > >> | > >> >> +-------------------------------+ > >>+------------------------------+ > >> >> | rte_eth_dev_configure() | | > >>rte_crypto_init_sym_session()| > >> >> | rte_eth_allmulticast_enable() | | > >>rte_crypto_del_sym_session() | > >> >> | rte_eth_filter_ctrl() | | > >> | > >> >> `-------------------------------' > >>`---------------.--------------' > >> >> | | > >> >> | | > >> >> ,---------'---------------------. > >>,---------------'--------------. > >> >> | struct rte_pmd_ixgbe | | struct rte_pmd_X > >> | > >> >> +-------------------------------+ > >>+------------------------------+ > >> >> | .configure -> ixgbe_configure | | .init_session -> > >>X_init_ses()| > >> >> | .tx_burst -> ixgbe_xmit_pkts | | .tx_burst -> > >>X_handle_pkts() | > >> >> `-------------------------------' > >>`------------------------------' > >> >> > >> >> We are not attempting to create a real class model here only looking > >>at > >> >> creating a very basic common set of APIs and structures for other > >>device > >> >> types. > >> >> > >> >> In terms of code changes for this, we obviously need to add in new > >> >> interface libraries for pktdev and cryptodev. The pktdev library can > >> >> define a skeleton structure for the first few elements of the nested > >> >> structures to ensure consistency. Each of the defines below > >>illustrate > >> >>the > >> >> common members in device structures, which gives some basic structure > >> >>the > >> >> device framework. Each of the defines are placed at the top of the > >> >>devices > >> >> matching structures and allows the devices to contain common and > >>private > >> >> data. The pkdev structures overlay the first common set of members > >>for > >> >> each device type. > >> >> > >> > > >> > > >> >Keith and I discussed this offline, and for the purposes of > >>completeness > >> >I'll > >> >offer my rebuttal to this proposal here. > >> > > >> >In short, I don't think the segregation of the transmit and receive > >> >routines > >> >into their own separate structure (and ostensibly their own > >>librte_pktdev > >> >library) is particularly valuable. While it does provide some minimal > >> >code > >> >savings when new device classes are introduced, the savings are not > >> >significant > >> >(approximlately 0.5kb per device class if the rte_ethdev generic tx > >>and rx > >> >routines are any sort of indicator). It does however, come with > >> >significant > >> >costs in the sense that it binds a device class to using an I/O model > >>(in > >> >this > >> >case packet based recieve and transmit) for which the device class may > >> >not be > >> >suited. > >> > >> The only reason the we only have a 0.5Kb saving is you are only looking > >>at > >> moving Rx/Tx routines into pktdev, but what happens if we decide to > >>move a > >> number of common functions like start/stop and others, then you start to > >> see a much bigger saving. > >You're right of course, but lets just try the math here. If we assume > >that, > >since all these libraries are just inlined redirection functions, we can > >guess > >that each one is about .25kb of code (5.kb/2 functions). So if you add > >two more > >functions you're looking at 1kb of code savings. Of course if you start > >doing > >so, you beg the two questions that I've been trying to pose here: > > > >1) How much 'common' API do you want to impose on a device class that may > >not > >be condusive to the common set > > > >2) At what point does your common API just become an rte_ethdev? > > You are taking the code size to the extreme, which is not the real point > here. I agree it does not matter in footprint savings as most DPDK systems > are pretty big. The reason for the common code was to help remove some of > the code required for the developer to write in the new device type. If > the developer of the new device type writes his own set of Rx/Tx routines > the community will determine if they are required or do the current APIs > work or do we require them to support both. > > Please try not getting hung up on the footprint savings and I am sorry if > I was making this a huge point. > Hung up? I'm not the one who made a big deal out of code size reduction being a major benefit there, be it source or binary, its miniscule any way you slice it. I'm happy to drop this though if you are. > > > >> Do we need this saving, maybe not, but it does > >I would say definately not, given that the DPDK houses entire API sets > >that are > >always-inline, and any savings from the above are easily eclipsed by the > >fanout > >that occurs from use in multiple call sites. Lets be honest, performance > >is the > >DPDK's primary concern. Code size is not a consideration. Its only an > >argument > >here because it makes this proposal look favorable. Its not a bad thing > >mind > >you, but if this proposal caused any code bloat, it wouldn't even be > >mentioned. > > In the current design the pktdev APIs are still inline routines only the > debug routines are not. > Reread my comments above please. You seem to have misread what I wrote. > > > >> provide a single call API rte_pkt_rx/tx_burst to use instead of the > >> application having to make sure it is calling the correct device Rx/Tx > >> routines. > >Given that that is a compile time issue, I really don't see why that is a > >big > >deal. Especially if you, as I suggest, just use the rte_ethdev as your > >common > >dataplane device model. Then by virtue of the fact that they're all > >rte_ethdevs, you get common API naming. > > > >> All that is required is passing in the device pointer and it is > >> handled for the application. Bruce added some code below to that effect. > >Yes, and you have that now, its an rte_ethdev. > > > >> > > >> >To illustrate the difference in design ideas, currenty the dpdk data > >> >pipeline > >> >looks like this: > >> > > >> >+------------+ +----------+ +---------+ > >> >| | | | | | > >> >| ARP | | ethdev | | | +----------+ > >> >| handler +-->+ api +-->+ PMD +-->+ Wire | > >> >| | | | | | +----------+ > >> >| | | | | | > >> >+------------+ +----------+ +---------+ > >> > >> You did not add the crypto to this picture as it is in the picture below > >> to make them the same. > >> > >No I didn't, because it doesn't exist right now. Thats what I mean by > >'currently'. See my description below after you added your drawing. > > > >> +-----+ +---------+ +---------+ +------+ > >> | | | | | | | | +------+ > >> | ARP +--+ ethdev +--+ crypto +--+ PMD +--+ Wire | > >> | | | | | | | | +------+ > >> +-----+ +---------+ +---------+ +------+ > >> > >> > >> > > >> > > >> >Where the ARP handler code is just some code that knows how to manage > >>arp > >> >requests and responses, and only transmits and receives frames > >> > > >> >Keiths idea would introduce this new pktdev handler structure and make > >>the > >> >dataplane pipeline look like this: > >> > > >> >+------------+ +------------+ +------------+ +--------+ > >> >| | | | | | | | > >> >| ARP | | pktdev api | | pktdev_api | | | +---------+ > >> >| handler +-+ +--+ +--+ PMD +--+Wire | > >> >| | | | | | | | +---------+ > >> >| | | | | | | | > >> >+------------+ | | | | | | > >> > | | | | +--------+ > >> > | | | | > >> > | | | | > >> > | | | | > >> > | rte_ethdev | | rte_crypto | > >> > | | | | > >> > | | | | > >> > +------------+ +------------+ > >> > >> You are drawing this picture it appears trying to make the pktdev > >>another > >> function call layer when it is just a single macro in the rte_ethdev > >Not trying to imply a new function call layer, just trying to illustrate > >your > >proposal, that you wish to make rte_ethdevs,and rte_cryptodevs all look > >like > >rte_pktdevs so that the dataplane has a common element for passing mbufs > >around, > >regardless of its true device class. Not sure where you see an extra > >function > >call layer here. > > Please see my comment above as I am not trying to make them all look the > same you are taking my words to the extreme. We need some type of > structure for adding more device types to DPDK and I am just trying to > suggest a few common parts as a solution. > You see that you're contradicting yourself here, right? You state in the same paragraph that you are not trying to make all devices look the same, but are trying to find comment parts (the implication being that they will be interacted with in the same way). I get that you're saying their only going to be 'partially' the same, where the overlapping structures are made common in your model, so that code can be written generically, but if thats your goal, then the generic code you propose can only be generic if it only interfaces to the common parts, which means, to the generic application code you envision, every device looks the same. Thats my point Note I'm not opposed to generic code being able to be written for multiple device types, I'm just asserting that that common device is the rte_ethdev, which we already have. I'm proposing that dataplane elements are network interfaces codified as rte_ethdevs, and crypto devices are codified as some other API, and should you want crypto functionality, you write a pmd (interfaced to using rte_ethdev), that makes use of the to-be-written crypto library. > Some parts of the structures should be common and possible a few common > routines, yes a big part of each device will be different. The rte_ethdev > structures is what we have today and was a good starting point, but a big > part of the code will be different. > No, they shouldn't. Reuse existing data structures where possible. But don't try to take device specific interfaces and data, and where you see commonality, force commonality. Thats what you're trying to do here. I'd challenge you to find another implementation of device driver interfaces where what you are proposing is done. It just doesn't make sense. > Should we allow someone to write a new device type that is completely > different down to typedefs for variable types and all new structures, no > we should not, giving them some guide lines by providing a set of routines > and structures to start with, by all means lets do it. > At what point did you read that I said we can't use common data structures? Of course use them where it makes sense. Want to use mbufs as a common data type for both API's, fine! That makes sense. But don't go trying to mash device specific data types and methods into a common structure for some sort of imaginary savings. > > > >> header pointing to the rte_pktdev tx_burst routine. No function function > >> overhead as the macro in rte_ethdev changes the rte_eth_tx_burst to > >> rte_pkt_tx_burst routine, which BTW is the same routine that was in > >> rte_ethdev today. The pktdev and ethdev are calling into the PMD tx > >> routine via the dev_ops function pointers structure. Which is also no > >> extra over head. > >> > >Never suggested it was extra overhead, where did you get that from? I'm > >just > >illustrating what it is you want to do. If I didn't get it right I > >apologize, > >please clarify. > > No need to clarify as we agree and maybe I read too much into your > comments. > > > >> If you are calling the rte_pkt_tx_burst routine directly it just means > >>you > >> need to get the device pointer to pass instead of the port id value in > >>the > >> rte_pkt_tx_burst routine. The above turns into something like this: > >> > >> +-----+ +---------+ +--------+ +-----+ > >> | | | ethdev | | | | | +------+ > >> | ARP +--+ map to +--+ crypto +--+ PMD +--+ Wire | > >> | | | pktdev | | | | | +------+ > >> +-----+ +---------+ +--------+ +-----+ > >> > >> So the path of the data is the same only a macro does a simple rename of > >> the call to rte_eth_tx_burst routine. If you call the pktdev routine > >> directly then the macro is not used. > >> > >> > >Let me quash this by stipulating > >that you are absolutely correct, there is no additional overhead in your > >design, > >it should provide the exact same performance that an rte_ethdev currently > >does. > >I have no problem with that. My point is: It provides the exact same > >performance that an rte_ethdev does, so lets just use rte_ethdev if we're > >trying > >to model devices in the dataplane. > > The ethdev model does not give us a different device type it just forces > the new device type to look like an ethdev, this is what I am reading from > your email here. I do not believe you are suggesting that ethdev look a > like is the answer to a new device type. > What!? You need to re-read my last email, I'm not sure where you get the above from. If you'll look back at the dataplane design, I proposed this: +------------+ +---------+ +---------+ +---------+ +--------+ | | | | | | | | | | | | |ethdev | | ipsec | |ethdev +--+ | | ARP handler+->+api +-+ tunnel +->+api | | PMD | | | | | PMD | | | | | | | | | | | | | | | +------------+ +---------+ +---+-----+ +---------+ +--------+ | +--+-----+ | | |crypto | |api | | | | | +--------+ I'm clearly proposing two device models here: 1) A dataplane element model, the interface for which is an rte_ethdev. This allows one to implement multiple dataplane elements and functionality (in the example above an ipsec tunnel). This is a common and well understood device model that many network stacks use (read: easy for end users to understand) 2) A crypto device model, implemented independently, reusing existing data types where possible (read: Appropriate for the device class and alternate use cases) (1) uses (2) to implement ipsec functionality. if an application just wants to do crypto on some arbitrary data, then the crypto api is better suited to doing so (read: not bound by the pktdev api you propose) 2 device instances, 2 device models, each best suited for thier purpose. > The bonding model allows you to link some devices together and that is > good in IMO. The bonding model is kind of like a pipeline model at the > device layer. The pipeline style at the application layer could have > provided the same feature as the bonding model just being done at two > different layers in the system (application and device layer). > Not sure what you're getting at here, but I'll keep reading... > Now the pipeline model is a bit more generic and may add some small > overhead compared to the bonding model. If we require every device type to > support a bonding style then we would need them all to follow some type of > common structure, correct? I'm not sure what you're driving at here, please clarify. Are you trying to suggest that multiple pmds in the datapath require some pmd specific code to allow them to be stacked? Yes, they currently do, which is less than ideal, and I would certainly support an API to enable ordering of pmds in a datapath, but thats outside the scope of this discussion I think, since we're talking about new device types here. Regarding your comment about multiple pmds requiring a common structure to interface to one another in the datapath (like bonding currently does), you're absolutely right, thats an rte_ethdev. Thats the way it works today, and theres no reason to change that. > > > >> > > >> >The idea being that now all devices in the dataplane are pktdev devices > >> >and code > >> >that transmits and receives frames only needs to know that a device can > >> >transmit > >> >and receive frames. The crypto device in this chain is ostensibly > >> >preforming > >> >some sort of ipsec functionality so that arp frames are properly > >> >encrypted and > >> >encapsulated for sending via a tunnel. > >> > > >> >On the surface this seems reasonable, and in a sense it is. However, > >>my > >> >assertion is that we already have this functionality, and it is the > >> >rte_ethdev > >> >device. To illustrate further, in my view we can do the above > >>already: > >> > > >> >+------------+ +---------+ +---------+ +---------+ +--------+ > >> >| | | | | | | | | | > >> >| | |ethdev | | ipsec | |ethdev +--+ | > >> >| ARP handler+->+api +-+ tunnel +->+api | | PMD > >> >| | | | | PMD | | | | | > >> >| | | | | | | | | | > >> >+------------+ +---------+ +---+-----+ +---------+ +--------+ > >> > | > >> > +--+-----+ > >> > | | > >> > |crypto | > >> > |api | > >> > | | > >> > | | > >> > +--------+ > >> > > >> >Using the rte_ethdev we can already codify the ipsec functionailty as a > >> >pmd that > >> >registers an ethdev, and stack it with other pmds using methods > >>simmilar > >> >to what > >> >the bonding pmd does (or via some other more generalized dataplane > >> >indexing > >> >function). This still leaves us with the creation of the crypto api, > >> >which is > >> >adventageous because: > >> > >> The proposal does not remove the bonding method and can still be used, > >> correct? > >Not sure I fully understand the question, but no, I'm not proposing any > >specific > >change to bonding here, just using it to illustrate that we have ways > >currently > >to stack PMD's on top of one another to create a multi-stage data > >pipeline. > > Look at the above comment for this one. Ditto. I'm not making any comment about bonding, just using it as an example to show that its possible to stack ethdevs in a dataplane today. The method of stacking can certainly be improved, but rte_ethdev is acting as a perfectly good interface for that purpose right now, no changes needed. > > > >> I do not see the different here using the pktdev style routines or using > >> ethdev routines. > >Exactly my point! There is no difference to the dataplane API, it remains > >unchanged, just as in your proposal, except that there is no work > >required to > >create a pktdev device that doesn't change anything. The advantage that > >I'm > >proposing here is that my model separates whatever crypto device class > >API you want > >to design from the dataplane (rte_ethdev) api. Doing this lets you > >create a > >crypto API that is more appropriately suited to the crypto class of > >device. if > >there is overlap with rte_ethdev, thats fine, you can create simmilar > >functions > >(even giving them identical names to the rte_ethdev apis), and that will > >be > >properly resolved at compile time. But more importantly, where there is > >not > >overlap, you're not forced into creating an API that inherits the common > >API > >functions you propose, when they are not best suited to your device. > > I am still not trying to suggest we have the cryptodev be a copy of the > ethdev here. We need some type of structure and common guide lines for a > new device, which pktdev is a very reasonable plus very light framework. > So, perhaps we're talking past one another here, because I'm not suggesting that cryptodev be a copy of ethdev either, and I thought that was crystal clear in my drawings above. I'm suggesting that cryptodev be completely independent of any ethdev design (i.e. no overlap with ethdev as codified in your pktdev proposal). As for being lightweight, I don't see the relevance. As we agreed above, code savings, either in binary our source format is going to be fairly irrelevant, and not the point of the discussion. As for _needing_ commonality, I disagree, and cite other implementations in linux and bsd as evidence of that (the device models there, while reusing some datatypes, maintain independent method pointer structures). > I am a bit confused now it seems like you are suggesting a new device > should be ethdev and then you seem to be stating it should be completely > differentŠ > I'm not, and if you would read my emails closely, I think you would see that. As I tried to state as clearly as I'm able above, I'm proposing two devices: 1) A dataplane element model, the interface for which is an rte_ethdev. This allows one to implement multiple dataplane elements and functionality (in the example above an ipsec tunnel). This is a common and well understood device model that many network stacks use (read: easy for end users to understand) 2) A crypto device model, implemented independently, reusing existing data types where possible (read: Appropriate for the device class and alternate use cases) > Lets just agree some common parts are good between device types and some > are just not. Making all devices look like an ethdev is just not going to > fly and making a new device type into something completely different is > not going to fly. > No, lets not, because thats exactly what I'm arguing against. Once again I'll try to re-iterate: I'm fine with reusing existing datatypes if they are applicable I would like to see a device model for cryptodev that implements its own methods that are condusive to the way that hardware works, and not bound to the way other device classes work I would like to see dataplane element functionality that implements other network elements (like an ipsec tunnel), be implemented using the rte_ethdev interface, and have it implement its internals using the to-be-created cryptodev api. > Then the only solution is to have somethings in common to help unify the > designs, but still allow the developer to add APIs to that device which > make the most sense. > > Please tell me, why do you think that unify device models is the _only_ solution? I've provided several examples that simply don't ever do that, and they are quite successfull > >> > > >> >1) It is not constrained by the i/o model of the dataplane (it may > >>include > >> >packet based i/o, but can build on more rudimentary (and performant) > >> >interfaces. > >> >For instance, in addition to async block based i/o, a crypto device may > >> >also > >> >operate syncrhnously, meaning a call can be saved with each transaction > >> >(2 calls > >> >for a tx/rx vs one for an encrypt operation). > >> > > >> >2) It is not constrained by use case. That is to say the API can be > >> >constructed > >> >for more natural use with other functions (for instance encryptions of > >> >files on > >> >disk or via a pipe to another process), which may not have any relation > >> >to the > >> >data plane of DPDK. > >> > > >> >Neil > >> > >> Ok, you snipped the text of the email here an it makes the context wrong > >I figured your illustration above was sufficient to make your point and > >mine. > >If your code tells a different story, I apologize. > > > >> without the rest of the code IMO. I will try to explain without the text > >> that was omitted, but it would be best for anyone missing the original > >> email to read it for more details. I know the formatting got messed up a > >> bit :-( > >> > >> http://dpdk.org/ml/archives/dev/2015-April/016124.html > >> > >> > >> In the rest of the text it does show the points I wanted to make here > >>and > >> how little overhead it added. > >> > >As above, I'm stipulating to the fact that there is not performance > >overhead. > >That is not, and was never my point. My point was/is that we already > >have an > >API to do what you want to do. The commonality of functions that you > >seem to > >focus on, is already provided by the fact that we have a common device > >for the > >dataplane (the rte_ethdev). You're proposal doesn't buy us anything > >beyond > >that. By your own assertion, your proposal: > > > >1) Its just as performant as an rte_ethdev > > > >2) It doesn't require any code change to applications > > > >Given that, and the fact that it does require that new device classes > >inherit an > > Please do not bring inheritance into this again I wish Bruce had never > used that term :-( in hind site it is not a very good way to describe the > design. why not, its a perfectly accurate way to describe what you want to do here. You want to extract elements from rte_ethdev that you feel are going to be common to other device types, place them in their own structure, and have other device types include that common structure, so as to be able to have those methods be a part of the new device. In UML terms I suppose the relationship would be a "has-a" rather than a "is-a" but the inheritance description is equally relevant when writing in C. > It seems like it at a very high level, but it seems to drag > everyone down to the bits and bytes way too much. We ever stated that all > the APIs are common between ethdev (or are inherited) between a new device > only a couple which seemed to make sense to me. The structures being a bit > similar was my goal and to provide a good light framework. > Again with the light framework. We're not talking about code savings anymore. Theres nothing more or less light about just using an rte_ethdev to implement a nework element. Separate out the network element from the actual crypto functionality. As for not _having_ to use the common structure in all device types, I agree, you don't have to mandate its use, but doesn't that beg the question, why do this then? If theres not any guaranteed functionality, why do this at all? > >API set that may not be relevant to said device class, my one and only > >question > >is: whats the advantage? > > I have been trying to layout the advantages: > 1) - we provide a very light framework for the developer of a new device > type to start with in his design We agreed at the start of this thread that 'lightness' wasn't relevant. Or do you mean to describe something other than code size savings by the term 'light'? > 2) - we look at making some routines common between devices if it make > sense, some non-performance APIs do make sense to be common, but the > structures need to be common in some way IMO I can't parse this sentance. You're propsal at the start of this thread was to create common structures between device classes. How is "we look at making some routines common between devices if it makes sense" an avantage to your proposal, which is exactly that? Its like saying changing code is good, because we get to change code. > 3) - we make a few structures look similar using pktdev structures as an > overlay to allow for a clean way to pass around these different device > types This I actually agree with, but it goes back to your first point. I would argue that passing around rte_ethdevs is just as, if not more useful than passing around pktdevs, because it more fully describes an ethernet interface. As you note, the pktdev interface is 'lighter' which I take as a euphamism for 'smaller or having less code'. As we've discussed above however, the savings are minimal and not paricularly relevant to this debate, so why not just use rte_ethdev to pass around network element functionality and design a crypto device interfae independently (I'll remind you here that I'm proposing two devices here, an rte_ethdev and a cryptodev). > 4) - adopt standard APIs or translate those APIs into something similar > for DPDK (e.g. Linux kernel crypto APIs) > What standard API's have you seen that make common API methods between device types? Can you provide an example? > >You claim that common API naming makes application > > I never claimed that common API naming make applications common, but only > a few common constructs or API semantics could help. > What do you see as the difference? How, specifically does creating common device model mehtods help an application? I was under the impression that, in so doing, you could allow some generic application code to be written that could interface to either a real ethernet device, or to a crypto device. What other advantage do you see? > >code generic, but we have that today. If you want to do IPsec with a > >crypto > >offload devie, write a pmd that interfaces to the data pipeline via the > >rte_ethdev API and uses whatever crypto offload device API you want to > >construct > >to do the crypto work. That way, if a given application wants to use the > >crypto > > I am trying to suggest the crypto device can have other APIs better suited > to how we want to use that device, again we can have a few common items > between ethdev and cryptodev, but it does not make sense to use all of > ethdev. > Ok, so this is a good thought. I think what I hear you saying above is that you could envision a crypto device having two api sets, yes? One, a raw API that applications could use to talk to crypto devices directly, and two, a network oriented API that can transmit and recieve network frames, right? If thats the case, I'm fine with that. But if thats interesting to you, why can't the network interface just be an rte_ethdev that you register like other pmd's do? > I am getting confused here as you seem to be suggesting we use ethdev > model, but I am sure you are not, correct? > Correct, two device models, for two devices. An ipsec device, implemented as a pmd that uses an rte_ethdev to interface to the network dataplane, and a crypto device model that the aforementioned pmd interfaces to to encrypt and decrypt data to implement ipsec functionality. > >hardware for something non-network related (say disk/file encryption), it > >can > >use the crypto device layer in a way that is not packet/network oriented > >more > >easily, at which point having a common API is completely irrelevant, as > >the > >application code is being specifically written to talk to a crypto device. > > > > > > > > > >> Lets just say we do not move the TX/RX routines from rte_ethdev into > >> rte_pktdev and only have a header file with a few structures and macros > >>to > >> help define some common parts between each of the new device types being > >> added. I could see that as an option, but I do not see the big issues > >>you > >> are pointing out here. > >> > >My big issue is that I see no point in creating a common API. > > I am not trying to create a common API between all devices, it only seemed > like a reasonable design to make Rx/Tx look similar. Also making some of > the structures look similar is a good thing IMO. > Ok, you're not trying to make them look identical, I get that. But you are trying to get them to try to look identical in terms of how they do data i/o. I'm suggesting/asserting that such an interface isn't appropriate to all devices. Even that little bit isn't worth making common, because it won't always be used. > >You're focused on finding a common set of API's that apply to all device > >classes, and I'm asserting that there is no safe set of common API > >routines. > >While it may be possible to shoehorn multiple device classes into the > >same i/o > >API model, you are doing yourself a disservice by mandating that. > > I feel like you are taking the extreme view here again and that was not my > intent to make everything the same only where it makes sense. A ethdev and > cryptodev are two different animals, but they do have some similarities > and I want to make sure we investigate these similarities, right? > Yes, they are two different animals, and while they may have some similarities, I'm suggesting that its not worth codifying those similarities in code. crypto devices don't send and recieve packets. They might operate on block data, so you might use mbufs in both apis, but you don't transmit data to a crypto dev and you don't receive data from a crypto device. You could do that I suppose, but there are better ways, and existing api's show us the way to do that. Remember from above, I'm suggesting two device models here. A crypto device that just does cryptography in the fastest/best way possible, and a optional network element (an ipsec tunnel in our example), that uses the crypto device to preform its function, but uses an rte_ethdev to transmit and receive network data. > > > >Put another way, ipsec devices and crypto devices are separate device > >classes > >and deserve to be modeled differently. An IPsec device is a network > >element > >that is very nicely modeled as an ethernet interface, for which you have > >a very > >robust API (the rte_ethdev). Crypto offload devices are not network > >elements, > >and can preform i/o in a mutltitude of ways (synchronous, asynchronous, > >scatter/gather, etc), and have many non ethernet API facets. The latter > >deserves to have its own API design, not to be forced to inherit part of > >an > >ethernet device (rte_pktdev) and be forced to work that way. > > > >> You did have some great comments about how crypto is used and the APIs > >> from the Linux Kernel crypto model is proven and I do not disagree. > >> > >> In my email to my own email I tried to point our we could add something > >Yeah, sorry about the broken thread, the in-reply-to field got somehow > >html-mangled. Not sure how that happened as I use mutt as my MUA. > > > >> very similar to Linux Kernel Crypto API and it would be a model most > >>would > >> be able to understand. Creating my own crypto API is not my goal, but to > >> use standards where it makes the most sense to DPDK. > >> > >> The email link above is the email I suggested the Linux Kernel Crypto > >>API > >> would be reasonable. > >> > >And thats good. It doesn't need to be the linux kernel crypto API by any > >stretch, and I don't expect it to be. My take away from review of the > >linux > >crypto api is that, while it uses some simmilar notions to parts of the > >network > >API (scatter gather i/o, optional async operation, completion queues), it > >doesn't create any sort of common API structure between it and the network > >subsystem, because they're separate devices. Even though crypto is > >largely used > >in the network datapath for various ipsec tunnels, crypto still uses its > >own API > >methods and data types because its not exclusive to the use of > >networking, and > >doing so allows for more flexible use outside the datapath. Thats what > >I'm > >trying to get at here. That users might want to use crypto outside of the > >network path, and they should have an API that is well suited to that > >task. > > We agree here, I only want some similarities between devices not an exact > common API. > > I am afraid I am not following you now. At one point it seems like all of > the devices must be completely different then in another place ethdev is > best way to go as a common API. I think we are saying the same thing here > just from a different point of view and I have tried to understand your > view point. I have even relaxed my design goals to meet you part way. It > seems like we agree and disagree here, which does not make a lot of sense > to me at this point. > I'll repeat what I've said above several times now: I'd like to see a crypto device api that is developed independently. It can use common data types if its suitable to do so, but shouldn't be forced to use simmilar methods from existing device types. The API here should be optimized to best reflect the behavior of the hardware If you would like to use the cryptodevice as part of a network dataplane element, you should create a software pmd to register a rte_ethdev. Internally that pmd should use the crypto dev api to do its encryption work. That way your 'common' device api in the dataplane is the rte_ethdev. The cryptodev api is then independent of, and optimized for, the cryptodev hardware. > Maybe we need to get into a room and talk F2F to make sure we are both > being understood here. Maybe we can then make sure we are both being heard > and come to some type of solution for DPDK. Maybe the community meeting is > a good place, but I feel it is going to be too big and too short to > discuss this correctly. If you are up to a conf call or I can travel to > you if you feel a F2F is better. > I think we're on opposite coasts, so I don't think a face to face is going to happen just for this, and that excludes other interests from this discussion, which I would really like to hear. I've re-read this thread, and I have a suggestion: Implement it. We're clearly approaching this from different angles. You seem to place alot of value on the benefits of commonality between device types, while I clearly dont. I've got several examples in support of my view, but in fairness, the dpdk isn't the linux or bsd kernel, so while I don't think so, its possible this commonality provides some level of value. Likewise, I may show that creating this common structure doesn't provide any real added value. Additionally, we've both focused on the commonality of devices in this thread (which was the intent), but neither of us have spent any time actually designing the crypto API, which I think will shed alot of light on the problem. I would suggest choosing a crypto device, maybe the ixp4 crypto engine, since it has an existing open source implementation, and using it to flesh out crypto API design. If you implement it independently of the dpdk (i.e. as part of its own library with its own API), then we take a step back and look at it, we can identify the true api overlap and consider the advantages and limitation of consolidation. Sound like an idea? Neil ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [RFC] Adding multiple device types to DPDK. 2015-04-05 19:37 ` Neil Horman @ 2015-04-05 22:20 ` Wiles, Keith 2015-04-06 1:48 ` Neil Horman 0 siblings, 1 reply; 9+ messages in thread From: Wiles, Keith @ 2015-04-05 22:20 UTC (permalink / raw) To: Neil Horman; +Cc: dev On 4/5/15, 2:37 PM, "Neil Horman" <nhorman@tuxdriver.com> wrote: >On Sat, Apr 04, 2015 at 03:16:08PM +0000, Wiles, Keith wrote: >> >> >> On 4/4/15, 8:11 AM, "Neil Horman" <nhorman@tuxdriver.com> wrote: >> >> >On Fri, Apr 03, 2015 at 10:32:01PM +0000, Wiles, Keith wrote: >> >> Hi Neil, >> >> >> >> On 4/3/15, 12:00 PM, "Neil Horman" <nhorman@tuxdriver.com> wrote: >> >> >> >> >On Wed, Apr 01, 2015 at 12:44:54PM +0000, Wiles, Keith wrote: >> >> >> Hi all, (hoping format of the text is maintained) >> >> >> >> >> >> Bruce and myself are submitting this RFC in hopes of providing >> >> >>discussion >> >> >> points for the idea. Please do not get carried away with the code >> >> >> included, it was to help everyone understand the proposal/RFC. >> >> >> >> >> >> The RFC is to describe a proposed change we are looking to make to >> >>DPDK >> >> >>to >> >> >> add more device types. We would like to add in to DPDK the idea >>of a >> >> >> generic packet-device or ?pktdev?, which can be thought of as a >>thin >> >> >>layer >> >> >> for all device classes. For other device types such as >>potentially a >> >> >> ?cryptodev? or ?dpidev?. One of the main goals is to not effect >> >> >> performance and not require any current application to be >>modified. >> >>The >> >> >> pktdev layer is providing a light framework for developers to add >>a >> >> >>device >> >> >> to DPDK. >> >> >> >> >> >> Reason for Change >> >> >> ----------------- >> >> >> >> >> >> The reason why we are looking to introduce these concepts to DPDK >> >>are: >> >> >> >> >> >> * Expand the scope of DPDK so that it can provide APIs for more >>than >> >> >>just >> >> >> packet acquisition and transmission, but also provide APIs that >>can >> >>be >> >> >> used to work with other hardware and software offloads, such as >> >> >> cryptographic accelerators, or accelerated libraries for >> >>cryptographic >> >> >> functions. [The reason why both software and hardware are >>mentioned >> >>is >> >> >>so >> >> >> that the same APIs can be used whether or not a hardware >>accelerator >> >>is >> >> >> actually available]. >> >> >> * Provide a minimal common basis for device abstraction in DPDK, >>that >> >> >>can >> >> >> be used to unify the different types of packet I/O devices already >> >> >> existing in DPDK. To this end, the ethdev APIs are a good starting >> >> >>point, >> >> >> but the ethdev library contains too many functions which are >> >> >>NIC-specific >> >> >> to be a general-purpose set of APIs across all devices. >> >> >> Note: The idea was previously touched on here: >> >> >> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545 >> >> >> >> >> >> Description of Proposed Change >> >> >> ------------------------------ >> >> >> >> >> >> The basic idea behind "pktdev" is to abstract out a few common >> >>routines >> >> >> and structures/members of structures by starting with ethdev >> >>structures >> >> >>as >> >> >> a starting point, cut it down to little more than a few members in >> >>each >> >> >> structure then possible add just rx_burst and tx_burst. Then use >>the >> >> >> structures as a starting point for writing a device type. >>Currently >> >>we >> >> >> have the rx_burst/tx_burst routines moved to the pktdev and it see >> >>like >> >> >> move a couple more common functions maybe resaonable. It could be >>the >> >> >> Rx/Tx routines in pktdev should be left as is, but in the code >>below >> >>is >> >> >>a >> >> >> possible reason to abstract a few routines into a common set of >> >>files. >> >> >> >> >> >> >From there, we have the ethdev type which adds in the existing >> >> >>functions >> >> >> specific to Ethernet devices, and also, for example, a cryptodev >> >>which >> >> >>may >> >> >> add in functions specific for cryptographic offload. As now, with >>the >> >> >> ethdev, the specific drivers provide concrete implementations of >>the >> >> >> functionality exposed by the interface. This hierarchy is shown in >> >>the >> >> >> diagram below, using the existing ethdev and ixgbe drivers as a >> >> >>reference, >> >> >> alongside a hypothetical cryptodev class and driver implementation >> >> >> (catchingly called) "X": >> >> >> >> >> >> ,---------------------. >> >> >> | struct rte_pktdev | >> >> >> +---------------------+ >> >> >> | rte_pkt_rx_burst() | >> >> >> .-------| rte_pkt_tx_burst() |-----------. >> >> >> | `---------------------' | >> >> >> | | >> >> >> | | >> >> >> ,-------------------------------. >> >>,------------------------------. >> >> >> | struct rte_ethdev | | struct rte_cryptodev >> >> | >> >> >> +-------------------------------+ >> >>+------------------------------+ >> >> >> | rte_eth_dev_configure() | | >> >>rte_crypto_init_sym_session()| >> >> >> | rte_eth_allmulticast_enable() | | >> >>rte_crypto_del_sym_session() | >> >> >> | rte_eth_filter_ctrl() | | >> >> | >> >> >> `-------------------------------' >> >>`---------------.--------------' >> >> >> | | >> >> >> | | >> >> >> ,---------'---------------------. >> >>,---------------'--------------. >> >> >> | struct rte_pmd_ixgbe | | struct rte_pmd_X >> >> | >> >> >> +-------------------------------+ >> >>+------------------------------+ >> >> >> | .configure -> ixgbe_configure | | .init_session -> >> >>X_init_ses()| >> >> >> | .tx_burst -> ixgbe_xmit_pkts | | .tx_burst -> >> >>X_handle_pkts() | >> >> >> `-------------------------------' >> >>`------------------------------' >> >> >> >> >> >> We are not attempting to create a real class model here only >>looking >> >>at >> >> >> creating a very basic common set of APIs and structures for other >> >>device >> >> >> types. >> >> >> >> >> >> In terms of code changes for this, we obviously need to add in new >> >> >> interface libraries for pktdev and cryptodev. The pktdev library >>can >> >> >> define a skeleton structure for the first few elements of the >>nested >> >> >> structures to ensure consistency. Each of the defines below >> >>illustrate >> >> >>the >> >> >> common members in device structures, which gives some basic >>structure >> >> >>the >> >> >> device framework. Each of the defines are placed at the top of the >> >> >>devices >> >> >> matching structures and allows the devices to contain common and >> >>private >> >> >> data. The pkdev structures overlay the first common set of members >> >>for >> >> >> each device type. >> >> >> >> >> > >> >> > >> >> >Keith and I discussed this offline, and for the purposes of >> >>completeness >> >> >I'll >> >> >offer my rebuttal to this proposal here. >> >> > >> >> >In short, I don't think the segregation of the transmit and receive >> >> >routines >> >> >into their own separate structure (and ostensibly their own >> >>librte_pktdev >> >> >library) is particularly valuable. While it does provide some >>minimal >> >> >code >> >> >savings when new device classes are introduced, the savings are not >> >> >significant >> >> >(approximlately 0.5kb per device class if the rte_ethdev generic tx >> >>and rx >> >> >routines are any sort of indicator). It does however, come with >> >> >significant >> >> >costs in the sense that it binds a device class to using an I/O >>model >> >>(in >> >> >this >> >> >case packet based recieve and transmit) for which the device class >>may >> >> >not be >> >> >suited. >> >> >> >> The only reason the we only have a 0.5Kb saving is you are only >>looking >> >>at >> >> moving Rx/Tx routines into pktdev, but what happens if we decide to >> >>move a >> >> number of common functions like start/stop and others, then you >>start to >> >> see a much bigger saving. >> >You're right of course, but lets just try the math here. If we assume >> >that, >> >since all these libraries are just inlined redirection functions, we >>can >> >guess >> >that each one is about .25kb of code (5.kb/2 functions). So if you add >> >two more >> >functions you're looking at 1kb of code savings. Of course if you >>start >> >doing >> >so, you beg the two questions that I've been trying to pose here: >> > >> >1) How much 'common' API do you want to impose on a device class that >>may >> >not >> >be condusive to the common set >> > >> >2) At what point does your common API just become an rte_ethdev? >> >> You are taking the code size to the extreme, which is not the real point >> here. I agree it does not matter in footprint savings as most DPDK >>systems >> are pretty big. The reason for the common code was to help remove some >>of >> the code required for the developer to write in the new device type. If >> the developer of the new device type writes his own set of Rx/Tx >>routines >> the community will determine if they are required or do the current APIs >> work or do we require them to support both. >> >> Please try not getting hung up on the footprint savings and I am sorry >>if >> I was making this a huge point. >> >Hung up? I'm not the one who made a big deal out of code size reduction >being a >major benefit there, be it source or binary, its miniscule any way you >slice it. >I'm happy to drop this though if you are. > >> > >> >> Do we need this saving, maybe not, but it does >> >I would say definately not, given that the DPDK houses entire API sets >> >that are >> >always-inline, and any savings from the above are easily eclipsed by >>the >> >fanout >> >that occurs from use in multiple call sites. Lets be honest, >>performance >> >is the >> >DPDK's primary concern. Code size is not a consideration. Its only an >> >argument >> >here because it makes this proposal look favorable. Its not a bad >>thing >> >mind >> >you, but if this proposal caused any code bloat, it wouldn't even be >> >mentioned. >> >> In the current design the pktdev APIs are still inline routines only the >> debug routines are not. >> >Reread my comments above please. You seem to have misread what I wrote. > >> > >> >> provide a single call API rte_pkt_rx/tx_burst to use instead of the >> >> application having to make sure it is calling the correct device >>Rx/Tx >> >> routines. >> >Given that that is a compile time issue, I really don't see why that >>is a >> >big >> >deal. Especially if you, as I suggest, just use the rte_ethdev as your >> >common >> >dataplane device model. Then by virtue of the fact that they're all >> >rte_ethdevs, you get common API naming. >> > >> >> All that is required is passing in the device pointer and it is >> >> handled for the application. Bruce added some code below to that >>effect. >> >Yes, and you have that now, its an rte_ethdev. >> > >> >> > >> >> >To illustrate the difference in design ideas, currenty the dpdk data >> >> >pipeline >> >> >looks like this: >> >> > >> >> >+------------+ +----------+ +---------+ >> >> >| | | | | | >> >> >| ARP | | ethdev | | | +----------+ >> >> >| handler +-->+ api +-->+ PMD +-->+ Wire | >> >> >| | | | | | +----------+ >> >> >| | | | | | >> >> >+------------+ +----------+ +---------+ >> >> >> >> You did not add the crypto to this picture as it is in the picture >>below >> >> to make them the same. >> >> >> >No I didn't, because it doesn't exist right now. Thats what I mean by >> >'currently'. See my description below after you added your drawing. >> > >> >> +-----+ +---------+ +---------+ +------+ >> >> | | | | | | | | +------+ >> >> | ARP +--+ ethdev +--+ crypto +--+ PMD +--+ Wire | >> >> | | | | | | | | +------+ >> >> +-----+ +---------+ +---------+ +------+ >> >> >> >> >> >> > >> >> > >> >> >Where the ARP handler code is just some code that knows how to >>manage >> >>arp >> >> >requests and responses, and only transmits and receives frames >> >> > >> >> >Keiths idea would introduce this new pktdev handler structure and >>make >> >>the >> >> >dataplane pipeline look like this: >> >> > >> >> >+------------+ +------------+ +------------+ +--------+ >> >> >| | | | | | | | >> >> >| ARP | | pktdev api | | pktdev_api | | | >>+---------+ >> >> >| handler +-+ +--+ +--+ PMD +--+Wire >> | >> >> >| | | | | | | | >>+---------+ >> >> >| | | | | | | | >> >> >+------------+ | | | | | | >> >> > | | | | +--------+ >> >> > | | | | >> >> > | | | | >> >> > | | | | >> >> > | rte_ethdev | | rte_crypto | >> >> > | | | | >> >> > | | | | >> >> > +------------+ +------------+ >> >> >> >> You are drawing this picture it appears trying to make the pktdev >> >>another >> >> function call layer when it is just a single macro in the rte_ethdev >> >Not trying to imply a new function call layer, just trying to >>illustrate >> >your >> >proposal, that you wish to make rte_ethdevs,and rte_cryptodevs all look >> >like >> >rte_pktdevs so that the dataplane has a common element for passing >>mbufs >> >around, >> >regardless of its true device class. Not sure where you see an extra >> >function >> >call layer here. >> >> Please see my comment above as I am not trying to make them all look the >> same you are taking my words to the extreme. We need some type of >> structure for adding more device types to DPDK and I am just trying to >> suggest a few common parts as a solution. >> >You see that you're contradicting yourself here, right? You state in the >same >paragraph that you are not trying to make all devices look the same, but >are >trying to find comment parts (the implication being that they will be >interacted >with in the same way). I get that you're saying their only going to be >'partially' the same, where the overlapping structures are made common in >your >model, so that code can be written generically, but if thats your goal, >then the >generic code you propose can only be generic if it only interfaces to the >common >parts, which means, to the generic application code you envision, every >device >looks the same. Thats my point > >Note I'm not opposed to generic code being able to be written for multiple >device types, I'm just asserting that that common device is the >rte_ethdev, >which we already have. I'm proposing that dataplane elements are network >interfaces codified as rte_ethdevs, and crypto devices are codified as >some >other API, and should you want crypto functionality, you write a pmd >(interfaced >to using rte_ethdev), that makes use of the to-be-written crypto library. > >> Some parts of the structures should be common and possible a few common >> routines, yes a big part of each device will be different. The >>rte_ethdev >> structures is what we have today and was a good starting point, but a >>big >> part of the code will be different. >> >No, they shouldn't. Reuse existing data structures where possible. But >don't >try to take device specific interfaces and data, and where you see >commonality, >force commonality. Thats what you're trying to do here. I'd challenge >you to >find another implementation of device driver interfaces where what you are >proposing is done. It just doesn't make sense. > >> Should we allow someone to write a new device type that is completely >> different down to typedefs for variable types and all new structures, no >> we should not, giving them some guide lines by providing a set of >>routines >> and structures to start with, by all means lets do it. >> >At what point did you read that I said we can't use common data >structures? Of >course use them where it makes sense. Want to use mbufs as a common data >type >for both API's, fine! That makes sense. But don't go trying to mash >device >specific data types and methods into a common structure for some sort of >imaginary savings. > >> > >> >> header pointing to the rte_pktdev tx_burst routine. No function >>function >> >> overhead as the macro in rte_ethdev changes the rte_eth_tx_burst to >> >> rte_pkt_tx_burst routine, which BTW is the same routine that was in >> >> rte_ethdev today. The pktdev and ethdev are calling into the PMD tx >> >> routine via the dev_ops function pointers structure. Which is also no >> >> extra over head. >> >> >> >Never suggested it was extra overhead, where did you get that from? >>I'm >> >just >> >illustrating what it is you want to do. If I didn't get it right I >> >apologize, >> >please clarify. >> >> No need to clarify as we agree and maybe I read too much into your >> comments. >> > >> >> If you are calling the rte_pkt_tx_burst routine directly it just >>means >> >>you >> >> need to get the device pointer to pass instead of the port id value >>in >> >>the >> >> rte_pkt_tx_burst routine. The above turns into something like this: >> >> >> >> +-----+ +---------+ +--------+ +-----+ >> >> | | | ethdev | | | | | +------+ >> >> | ARP +--+ map to +--+ crypto +--+ PMD +--+ Wire | >> >> | | | pktdev | | | | | +------+ >> >> +-----+ +---------+ +--------+ +-----+ >> >> >> >> So the path of the data is the same only a macro does a simple >>rename of >> >> the call to rte_eth_tx_burst routine. If you call the pktdev routine >> >> directly then the macro is not used. >> >> >> >> >> >Let me quash this by stipulating >> >that you are absolutely correct, there is no additional overhead in >>your >> >design, >> >it should provide the exact same performance that an rte_ethdev >>currently >> >does. >> >I have no problem with that. My point is: It provides the exact same >> >performance that an rte_ethdev does, so lets just use rte_ethdev if >>we're >> >trying >> >to model devices in the dataplane. >> >> The ethdev model does not give us a different device type it just forces >> the new device type to look like an ethdev, this is what I am reading >>from >> your email here. I do not believe you are suggesting that ethdev look a >> like is the answer to a new device type. >> >What!? You need to re-read my last email, I'm not sure where you get the >above >from. If you'll look back at the dataplane design, I proposed this: > >+------------+ +---------+ +---------+ +---------+ +--------+ >| | | | | | | | | | >| | |ethdev | | ipsec | |ethdev +--+ | >| ARP handler+->+api +-+ tunnel +->+api | | PMD >| | | | | PMD | | | | | >| | | | | | | | | | >+------------+ +---------+ +---+-----+ +---------+ +--------+ > | > +--+-----+ > | | > |crypto | > |api | > | | > | | > +--------+ > > >I'm clearly proposing two device models here: > >1) A dataplane element model, the interface for which is an rte_ethdev. >This >allows one to implement multiple dataplane elements and functionality (in >the >example above an ipsec tunnel). This is a common and well understood >device >model that many network stacks use (read: easy for end users to >understand) > >2) A crypto device model, implemented independently, reusing existing >data types >where possible (read: Appropriate for the device class and alternate use >cases) > >(1) uses (2) to implement ipsec functionality. if an application just >wants to >do crypto on some arbitrary data, then the crypto api is better suited to >doing >so (read: not bound by the pktdev api you propose) > >2 device instances, 2 device models, each best suited for thier purpose. > >> The bonding model allows you to link some devices together and that is >> good in IMO. The bonding model is kind of like a pipeline model at the >> device layer. The pipeline style at the application layer could have >> provided the same feature as the bonding model just being done at two >> different layers in the system (application and device layer). >> >Not sure what you're getting at here, but I'll keep reading... > >> Now the pipeline model is a bit more generic and may add some small >> overhead compared to the bonding model. If we require every device type >>to >> support a bonding style then we would need them all to follow some type >>of >> common structure, correct? >I'm not sure what you're driving at here, please clarify. Are you trying >to >suggest that multiple pmds in the datapath require some pmd specific code >to >allow them to be stacked? Yes, they currently do, which is less than >ideal, and >I would certainly support an API to enable ordering of pmds in a >datapath, but >thats outside the scope of this discussion I think, since we're talking >about >new device types here. > >Regarding your comment about multiple pmds requiring a common structure to >interface to one another in the datapath (like bonding currently does), >you're >absolutely right, thats an rte_ethdev. Thats the way it works today, and >theres >no reason to change that. > >> > >> >> > >> >> >The idea being that now all devices in the dataplane are pktdev >>devices >> >> >and code >> >> >that transmits and receives frames only needs to know that a device >>can >> >> >transmit >> >> >and receive frames. The crypto device in this chain is ostensibly >> >> >preforming >> >> >some sort of ipsec functionality so that arp frames are properly >> >> >encrypted and >> >> >encapsulated for sending via a tunnel. >> >> > >> >> >On the surface this seems reasonable, and in a sense it is. >>However, >> >>my >> >> >assertion is that we already have this functionality, and it is the >> >> >rte_ethdev >> >> >device. To illustrate further, in my view we can do the above >> >>already: >> >> > >> >> >+------------+ +---------+ +---------+ +---------+ +--------+ >> >> >| | | | | | | | | | >> >> >| | |ethdev | | ipsec | |ethdev +--+ | >> >> >| ARP handler+->+api +-+ tunnel +->+api | | PMD >> >> >| | | | | PMD | | | | | >> >> >| | | | | | | | | | >> >> >+------------+ +---------+ +---+-----+ +---------+ +--------+ >> >> > | >> >> > +--+-----+ >> >> > | | >> >> > |crypto | >> >> > |api | >> >> > | | >> >> > | | >> >> > +--------+ >> >> > >> >> >Using the rte_ethdev we can already codify the ipsec functionailty >>as a >> >> >pmd that >> >> >registers an ethdev, and stack it with other pmds using methods >> >>simmilar >> >> >to what >> >> >the bonding pmd does (or via some other more generalized dataplane >> >> >indexing >> >> >function). This still leaves us with the creation of the crypto >>api, >> >> >which is >> >> >adventageous because: >> >> >> >> The proposal does not remove the bonding method and can still be >>used, >> >> correct? >> >Not sure I fully understand the question, but no, I'm not proposing any >> >specific >> >change to bonding here, just using it to illustrate that we have ways >> >currently >> >to stack PMD's on top of one another to create a multi-stage data >> >pipeline. >> >> Look at the above comment for this one. > >Ditto. I'm not making any comment about bonding, just using it as an >example >to show that its possible to stack ethdevs in a dataplane today. The >method of >stacking can certainly be improved, but rte_ethdev is acting as a >perfectly good >interface for that purpose right now, no changes needed. > >> > >> >> I do not see the different here using the pktdev style routines or >>using >> >> ethdev routines. >> >Exactly my point! There is no difference to the dataplane API, it >>remains >> >unchanged, just as in your proposal, except that there is no work >> >required to >> >create a pktdev device that doesn't change anything. The advantage >>that >> >I'm >> >proposing here is that my model separates whatever crypto device class >> >API you want >> >to design from the dataplane (rte_ethdev) api. Doing this lets you >> >create a >> >crypto API that is more appropriately suited to the crypto class of >> >device. if >> >there is overlap with rte_ethdev, thats fine, you can create simmilar >> >functions >> >(even giving them identical names to the rte_ethdev apis), and that >>will >> >be >> >properly resolved at compile time. But more importantly, where there >>is >> >not >> >overlap, you're not forced into creating an API that inherits the >>common >> >API >> >functions you propose, when they are not best suited to your device. >> >> I am still not trying to suggest we have the cryptodev be a copy of the >> ethdev here. We need some type of structure and common guide lines for a >> new device, which pktdev is a very reasonable plus very light framework. >> > >So, perhaps we're talking past one another here, because I'm not >suggesting that >cryptodev be a copy of ethdev either, and I thought that was crystal >clear in my >drawings above. I'm suggesting that cryptodev be completely independent >of any >ethdev design (i.e. no overlap with ethdev as codified in your pktdev >proposal). > >As for being lightweight, I don't see the relevance. As we agreed above, >code >savings, either in binary our source format is going to be fairly >irrelevant, >and not the point of the discussion. As for _needing_ commonality, I >disagree, >and cite other implementations in linux and bsd as evidence of that (the >device >models there, while reusing some datatypes, maintain independent method >pointer >structures). > >> I am a bit confused now it seems like you are suggesting a new device >> should be ethdev and then you seem to be stating it should be completely >> differentŠ >> >I'm not, and if you would read my emails closely, I think you would see >that. >As I tried to state as clearly as I'm able above, I'm proposing two >devices: > >1) A dataplane element model, the interface for which is an rte_ethdev. >This >allows one to implement multiple dataplane elements and functionality (in >the >example above an ipsec tunnel). This is a common and well understood >device >model that many network stacks use (read: easy for end users to >understand) > >2) A crypto device model, implemented independently, reusing existing >data types >where possible (read: Appropriate for the device class and alternate use >cases) > > > >> Lets just agree some common parts are good between device types and some >> are just not. Making all devices look like an ethdev is just not going >>to >> fly and making a new device type into something completely different is >> not going to fly. >> >No, lets not, because thats exactly what I'm arguing against. >Once again I'll try to re-iterate: > >I'm fine with reusing existing datatypes if they are applicable > >I would like to see a device model for cryptodev that implements its own >methods >that are condusive to the way that hardware works, and not bound to the >way >other device classes work > >I would like to see dataplane element functionality that implements other >network elements (like an ipsec tunnel), be implemented using the >rte_ethdev >interface, and have it implement its internals using the to-be-created >cryptodev >api. > > >> Then the only solution is to have somethings in common to help unify the >> designs, but still allow the developer to add APIs to that device which >> make the most sense. >> > > >Please tell me, why do you think that unify device models is the _only_ >solution? I've provided several examples that simply don't ever do that, >and >they are quite successfull > >> >> > >> >> >1) It is not constrained by the i/o model of the dataplane (it may >> >>include >> >> >packet based i/o, but can build on more rudimentary (and performant) >> >> >interfaces. >> >> >For instance, in addition to async block based i/o, a crypto device >>may >> >> >also >> >> >operate syncrhnously, meaning a call can be saved with each >>transaction >> >> >(2 calls >> >> >for a tx/rx vs one for an encrypt operation). >> >> > >> >> >2) It is not constrained by use case. That is to say the API can be >> >> >constructed >> >> >for more natural use with other functions (for instance encryptions >>of >> >> >files on >> >> >disk or via a pipe to another process), which may not have any >>relation >> >> >to the >> >> >data plane of DPDK. >> >> > >> >> >Neil >> >> >> >> Ok, you snipped the text of the email here an it makes the context >>wrong >> >I figured your illustration above was sufficient to make your point and >> >mine. >> >If your code tells a different story, I apologize. >> > >> >> without the rest of the code IMO. I will try to explain without the >>text >> >> that was omitted, but it would be best for anyone missing the >>original >> >> email to read it for more details. I know the formatting got messed >>up a >> >> bit :-( >> >> >> >> http://dpdk.org/ml/archives/dev/2015-April/016124.html >> >> >> >> >> >> In the rest of the text it does show the points I wanted to make here >> >>and >> >> how little overhead it added. >> >> >> >As above, I'm stipulating to the fact that there is not performance >> >overhead. >> >That is not, and was never my point. My point was/is that we already >> >have an >> >API to do what you want to do. The commonality of functions that you >> >seem to >> >focus on, is already provided by the fact that we have a common device >> >for the >> >dataplane (the rte_ethdev). You're proposal doesn't buy us anything >> >beyond >> >that. By your own assertion, your proposal: >> > >> >1) Its just as performant as an rte_ethdev >> > >> >2) It doesn't require any code change to applications >> > >> >Given that, and the fact that it does require that new device classes >> >inherit an >> >> Please do not bring inheritance into this again I wish Bruce had never >> used that term :-( in hind site it is not a very good way to describe >>the >> design. >why not, its a perfectly accurate way to describe what you want to do >here. You >want to extract elements from rte_ethdev that you feel are going to be >common to >other device types, place them in their own structure, and have other >device >types include that common structure, so as to be able to have those >methods be a >part of the new device. In UML terms I suppose the relationship would be >a >"has-a" rather than a "is-a" but the inheritance description is equally >relevant >when writing in C. > >> It seems like it at a very high level, but it seems to drag >> everyone down to the bits and bytes way too much. We ever stated that >>all >> the APIs are common between ethdev (or are inherited) between a new >>device >> only a couple which seemed to make sense to me. The structures being a >>bit >> similar was my goal and to provide a good light framework. >> >Again with the light framework. We're not talking about code savings >anymore. >Theres nothing more or less light about just using an rte_ethdev to >implement a >nework element. Separate out the network element from the actual crypto >functionality. > >As for not _having_ to use the common structure in all device types, I >agree, >you don't have to mandate its use, but doesn't that beg the question, why >do this >then? If theres not any guaranteed functionality, why do this at all? > > >> >API set that may not be relevant to said device class, my one and only >> >question >> >is: whats the advantage? >> >> I have been trying to layout the advantages: >> 1) - we provide a very light framework for the developer of a new >>device >> type to start with in his design >We agreed at the start of this thread that 'lightness' wasn't relevant. >Or do >you mean to describe something other than code size savings by the term >'light'? > >> 2) - we look at making some routines common between devices if it make >> sense, some non-performance APIs do make sense to be common, but the >> structures need to be common in some way IMO > >I can't parse this sentance. You're propsal at the start of this thread >was to create common structures between device classes. How is "we look >at >making some routines common between devices if it makes sense" an >avantage to >your proposal, which is exactly that? Its like saying changing code is >good, >because we get to change code. > >> 3) - we make a few structures look similar using pktdev structures as >>an >> overlay to allow for a clean way to pass around these different device >> types > >This I actually agree with, but it goes back to your first point. I would >argue that passing around rte_ethdevs is just as, if not more useful than >passing around pktdevs, because it more fully describes an ethernet >interface. >As you note, the pktdev interface is 'lighter' which I take as a >euphamism for >'smaller or having less code'. As we've discussed above however, the >savings >are minimal and not paricularly relevant to this debate, so why not just >use >rte_ethdev to pass around network element functionality and design a >crypto >device interfae independently (I'll remind you here that I'm proposing two >devices here, an rte_ethdev and a cryptodev). > >> 4) - adopt standard APIs or translate those APIs into something similar >> for DPDK (e.g. Linux kernel crypto APIs) >> >What standard API's have you seen that make common API methods between >device >types? Can you provide an example? > >> >You claim that common API naming makes application >> >> I never claimed that common API naming make applications common, but >>only >> a few common constructs or API semantics could help. >> >What do you see as the difference? How, specifically does creating common >device model mehtods help an application? I was under the impression >that, in >so doing, you could allow some generic application code to be written >that could >interface to either a real ethernet device, or to a crypto device. What >other >advantage do you see? > > >> >code generic, but we have that today. If you want to do IPsec with a >> >crypto >> >offload devie, write a pmd that interfaces to the data pipeline via the >> >rte_ethdev API and uses whatever crypto offload device API you want to >> >construct >> >to do the crypto work. That way, if a given application wants to use >>the >> >crypto >> >> I am trying to suggest the crypto device can have other APIs better >>suited >> to how we want to use that device, again we can have a few common items >> between ethdev and cryptodev, but it does not make sense to use all of >> ethdev. >> > >Ok, so this is a good thought. I think what I hear you saying above is >that you >could envision a crypto device having two api sets, yes? One, a raw API >that >applications could use to talk to crypto devices directly, and two, a >network >oriented API that can transmit and recieve network frames, right? > >If thats the case, I'm fine with that. But if thats interesting to you, >why >can't the network interface just be an rte_ethdev that you register like >other >pmd's do? > >> I am getting confused here as you seem to be suggesting we use ethdev >> model, but I am sure you are not, correct? >> >Correct, two device models, for two devices. An ipsec device, >implemented as a >pmd that uses an rte_ethdev to interface to the network dataplane, and a >crypto >device model that the aforementioned pmd interfaces to to encrypt and >decrypt >data to implement ipsec functionality. > >> >hardware for something non-network related (say disk/file encryption), >>it >> >can >> >use the crypto device layer in a way that is not packet/network >>oriented >> >more >> >easily, at which point having a common API is completely irrelevant, as >> >the >> >application code is being specifically written to talk to a crypto >>device. >> > >> > >> > >> > >> >> Lets just say we do not move the TX/RX routines from rte_ethdev into >> >> rte_pktdev and only have a header file with a few structures and >>macros >> >>to >> >> help define some common parts between each of the new device types >>being >> >> added. I could see that as an option, but I do not see the big issues >> >>you >> >> are pointing out here. >> >> >> >My big issue is that I see no point in creating a common API. >> >> I am not trying to create a common API between all devices, it only >>seemed >> like a reasonable design to make Rx/Tx look similar. Also making some of >> the structures look similar is a good thing IMO. >> >Ok, you're not trying to make them look identical, I get that. But you >are >trying to get them to try to look identical in terms of how they do data >i/o. >I'm suggesting/asserting that such an interface isn't appropriate to all >devices. Even that little bit isn't worth making common, because it won't >always be used. > >> >You're focused on finding a common set of API's that apply to all >>device >> >classes, and I'm asserting that there is no safe set of common API >> >routines. >> >While it may be possible to shoehorn multiple device classes into the >> >same i/o >> >API model, you are doing yourself a disservice by mandating that. >> >> I feel like you are taking the extreme view here again and that was not >>my >> intent to make everything the same only where it makes sense. A ethdev >>and >> cryptodev are two different animals, but they do have some similarities >> and I want to make sure we investigate these similarities, right? >> >Yes, they are two different animals, and while they may have some >similarities, >I'm suggesting that its not worth codifying those similarities in code. >crypto >devices don't send and recieve packets. They might operate on block >data, so >you might use mbufs in both apis, but you don't transmit data to a crypto >dev >and you don't receive data from a crypto device. You could do that I >suppose, >but there are better ways, and existing api's show us the way to do that. >Remember from above, I'm suggesting two device models here. A crypto >device >that just does cryptography in the fastest/best way possible, and a >optional >network element (an ipsec tunnel in our example), that uses the crypto >device to >preform its function, but uses an rte_ethdev to transmit and receive >network >data. > > >> > >> >Put another way, ipsec devices and crypto devices are separate device >> >classes >> >and deserve to be modeled differently. An IPsec device is a network >> >element >> >that is very nicely modeled as an ethernet interface, for which you >>have >> >a very >> >robust API (the rte_ethdev). Crypto offload devices are not network >> >elements, >> >and can preform i/o in a mutltitude of ways (synchronous, asynchronous, >> >scatter/gather, etc), and have many non ethernet API facets. The >>latter >> >deserves to have its own API design, not to be forced to inherit part >>of >> >an >> >ethernet device (rte_pktdev) and be forced to work that way. >> > >> >> You did have some great comments about how crypto is used and the >>APIs >> >> from the Linux Kernel crypto model is proven and I do not disagree. >> >> >> >> In my email to my own email I tried to point our we could add >>something >> >Yeah, sorry about the broken thread, the in-reply-to field got somehow >> >html-mangled. Not sure how that happened as I use mutt as my MUA. >> > >> >> very similar to Linux Kernel Crypto API and it would be a model most >> >>would >> >> be able to understand. Creating my own crypto API is not my goal, >>but to >> >> use standards where it makes the most sense to DPDK. >> >> >> >> The email link above is the email I suggested the Linux Kernel Crypto >> >>API >> >> would be reasonable. >> >> >> >And thats good. It doesn't need to be the linux kernel crypto API by >>any >> >stretch, and I don't expect it to be. My take away from review of the >> >linux >> >crypto api is that, while it uses some simmilar notions to parts of the >> >network >> >API (scatter gather i/o, optional async operation, completion queues), >>it >> >doesn't create any sort of common API structure between it and the >>network >> >subsystem, because they're separate devices. Even though crypto is >> >largely used >> >in the network datapath for various ipsec tunnels, crypto still uses >>its >> >own API >> >methods and data types because its not exclusive to the use of >> >networking, and >> >doing so allows for more flexible use outside the datapath. Thats what >> >I'm >> >trying to get at here. That users might want to use crypto outside of >>the >> >network path, and they should have an API that is well suited to that >> >task. >> >> We agree here, I only want some similarities between devices not an >>exact >> common API. >> >> I am afraid I am not following you now. At one point it seems like all >>of >> the devices must be completely different then in another place ethdev is >> best way to go as a common API. I think we are saying the same thing >>here >> just from a different point of view and I have tried to understand your >> view point. I have even relaxed my design goals to meet you part way. It >> seems like we agree and disagree here, which does not make a lot of >>sense >> to me at this point. >> > >I'll repeat what I've said above several times now: > >I'd like to see a crypto device api that is developed independently. It >can use >common data types if its suitable to do so, but shouldn't be forced to use >simmilar methods from existing device types. The API here should be >optimized >to best reflect the behavior of the hardware > >If you would like to use the cryptodevice as part of a network dataplane >element, you should create a software pmd to register a rte_ethdev. >Internally >that pmd should use the crypto dev api to do its encryption work. > >That way your 'common' device api in the dataplane is the rte_ethdev. The >cryptodev api is then independent of, and optimized for, the cryptodev >hardware. > >> Maybe we need to get into a room and talk F2F to make sure we are both >> being understood here. Maybe we can then make sure we are both being >>heard >> and come to some type of solution for DPDK. Maybe the community meeting >>is >> a good place, but I feel it is going to be too big and too short to >> discuss this correctly. If you are up to a conf call or I can travel to >> you if you feel a F2F is better. >> >I think we're on opposite coasts, so I don't think a face to face is >going to >happen just for this, and that excludes other interests from this >discussion, >which I would really like to hear. > >I've re-read this thread, and I have a suggestion: > >Implement it. > >We're clearly approaching this from different angles. You seem to place >alot of >value on the benefits of commonality between device types, while I >clearly dont. >I've got several examples in support of my view, but in fairness, the >dpdk isn't >the linux or bsd kernel, so while I don't think so, its possible this >commonality provides some level of value. Likewise, I may show that >creating >this common structure doesn't provide any real added value. >Additionally, we've >both focused on the commonality of devices in this thread (which was the >intent), but neither of us have spent any time actually designing the >crypto >API, which I think will shed alot of light on the problem. > >I would suggest choosing a crypto device, maybe the ixp4 crypto engine, >since it >has an existing open source implementation, and using it to flesh out >crypto >API design. If you implement it independently of the dpdk (i.e. as part >of its >own library with its own API), then we take a step back and look at it, >we can >identify the true api overlap and consider the advantages and limitation >of >consolidation. > >Sound like an idea? Instead of replying to all of your comments above (sorry) I will finish up the PoC code and then we can poke at the patches. I have created a PoC around the code fragments in these emails, but I did not include them as patches. The Quick Assist code on 01.org is being used for the PoC and someone else is doing that work, but I can look at the non-packet API from a top level to be added to the PoC. https://01.org/packet-processing/intel%C2%AE-quickassist-technology-drivers -and-patches Regards, ++Keith ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [RFC] Adding multiple device types to DPDK. 2015-04-05 22:20 ` Wiles, Keith @ 2015-04-06 1:48 ` Neil Horman 0 siblings, 0 replies; 9+ messages in thread From: Neil Horman @ 2015-04-06 1:48 UTC (permalink / raw) To: Wiles, Keith; +Cc: dev On Sun, Apr 05, 2015 at 10:20:10PM +0000, Wiles, Keith wrote: > > > On 4/5/15, 2:37 PM, "Neil Horman" <nhorman@tuxdriver.com> wrote: > > >On Sat, Apr 04, 2015 at 03:16:08PM +0000, Wiles, Keith wrote: > >> > >> > >> On 4/4/15, 8:11 AM, "Neil Horman" <nhorman@tuxdriver.com> wrote: > >> > >> >On Fri, Apr 03, 2015 at 10:32:01PM +0000, Wiles, Keith wrote: > >> >> Hi Neil, > >> >> > >> >> On 4/3/15, 12:00 PM, "Neil Horman" <nhorman@tuxdriver.com> wrote: > >> >> > >> >> >On Wed, Apr 01, 2015 at 12:44:54PM +0000, Wiles, Keith wrote: > >> >> >> Hi all, (hoping format of the text is maintained) > >> >> >> > >> >> >> Bruce and myself are submitting this RFC in hopes of providing > >> >> >>discussion > >> >> >> points for the idea. Please do not get carried away with the code > >> >> >> included, it was to help everyone understand the proposal/RFC. > >> >> >> > >> >> >> The RFC is to describe a proposed change we are looking to make to > >> >>DPDK > >> >> >>to > >> >> >> add more device types. We would like to add in to DPDK the idea > >>of a > >> >> >> generic packet-device or ?pktdev?, which can be thought of as a > >>thin > >> >> >>layer > >> >> >> for all device classes. For other device types such as > >>potentially a > >> >> >> ?cryptodev? or ?dpidev?. One of the main goals is to not effect > >> >> >> performance and not require any current application to be > >>modified. > >> >>The > >> >> >> pktdev layer is providing a light framework for developers to add > >>a > >> >> >>device > >> >> >> to DPDK. > >> >> >> > >> >> >> Reason for Change > >> >> >> ----------------- > >> >> >> > >> >> >> The reason why we are looking to introduce these concepts to DPDK > >> >>are: > >> >> >> > >> >> >> * Expand the scope of DPDK so that it can provide APIs for more > >>than > >> >> >>just > >> >> >> packet acquisition and transmission, but also provide APIs that > >>can > >> >>be > >> >> >> used to work with other hardware and software offloads, such as > >> >> >> cryptographic accelerators, or accelerated libraries for > >> >>cryptographic > >> >> >> functions. [The reason why both software and hardware are > >>mentioned > >> >>is > >> >> >>so > >> >> >> that the same APIs can be used whether or not a hardware > >>accelerator > >> >>is > >> >> >> actually available]. > >> >> >> * Provide a minimal common basis for device abstraction in DPDK, > >>that > >> >> >>can > >> >> >> be used to unify the different types of packet I/O devices already > >> >> >> existing in DPDK. To this end, the ethdev APIs are a good starting > >> >> >>point, > >> >> >> but the ethdev library contains too many functions which are > >> >> >>NIC-specific > >> >> >> to be a general-purpose set of APIs across all devices. > >> >> >> Note: The idea was previously touched on here: > >> >> >> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545 > >> >> >> > >> >> >> Description of Proposed Change > >> >> >> ------------------------------ > >> >> >> > >> >> >> The basic idea behind "pktdev" is to abstract out a few common > >> >>routines > >> >> >> and structures/members of structures by starting with ethdev > >> >>structures > >> >> >>as > >> >> >> a starting point, cut it down to little more than a few members in > >> >>each > >> >> >> structure then possible add just rx_burst and tx_burst. Then use > >>the > >> >> >> structures as a starting point for writing a device type. > >>Currently > >> >>we > >> >> >> have the rx_burst/tx_burst routines moved to the pktdev and it see > >> >>like > >> >> >> move a couple more common functions maybe resaonable. It could be > >>the > >> >> >> Rx/Tx routines in pktdev should be left as is, but in the code > >>below > >> >>is > >> >> >>a > >> >> >> possible reason to abstract a few routines into a common set of > >> >>files. > >> >> >> > >> >> >> >From there, we have the ethdev type which adds in the existing > >> >> >>functions > >> >> >> specific to Ethernet devices, and also, for example, a cryptodev > >> >>which > >> >> >>may > >> >> >> add in functions specific for cryptographic offload. As now, with > >>the > >> >> >> ethdev, the specific drivers provide concrete implementations of > >>the > >> >> >> functionality exposed by the interface. This hierarchy is shown in > >> >>the > >> >> >> diagram below, using the existing ethdev and ixgbe drivers as a > >> >> >>reference, > >> >> >> alongside a hypothetical cryptodev class and driver implementation > >> >> >> (catchingly called) "X": > >> >> >> > >> >> >> ,---------------------. > >> >> >> | struct rte_pktdev | > >> >> >> +---------------------+ > >> >> >> | rte_pkt_rx_burst() | > >> >> >> .-------| rte_pkt_tx_burst() |-----------. > >> >> >> | `---------------------' | > >> >> >> | | > >> >> >> | | > >> >> >> ,-------------------------------. > >> >>,------------------------------. > >> >> >> | struct rte_ethdev | | struct rte_cryptodev > >> >> | > >> >> >> +-------------------------------+ > >> >>+------------------------------+ > >> >> >> | rte_eth_dev_configure() | | > >> >>rte_crypto_init_sym_session()| > >> >> >> | rte_eth_allmulticast_enable() | | > >> >>rte_crypto_del_sym_session() | > >> >> >> | rte_eth_filter_ctrl() | | > >> >> | > >> >> >> `-------------------------------' > >> >>`---------------.--------------' > >> >> >> | | > >> >> >> | | > >> >> >> ,---------'---------------------. > >> >>,---------------'--------------. > >> >> >> | struct rte_pmd_ixgbe | | struct rte_pmd_X > >> >> | > >> >> >> +-------------------------------+ > >> >>+------------------------------+ > >> >> >> | .configure -> ixgbe_configure | | .init_session -> > >> >>X_init_ses()| > >> >> >> | .tx_burst -> ixgbe_xmit_pkts | | .tx_burst -> > >> >>X_handle_pkts() | > >> >> >> `-------------------------------' > >> >>`------------------------------' > >> >> >> > >> >> >> We are not attempting to create a real class model here only > >>looking > >> >>at > >> >> >> creating a very basic common set of APIs and structures for other > >> >>device > >> >> >> types. > >> >> >> > >> >> >> In terms of code changes for this, we obviously need to add in new > >> >> >> interface libraries for pktdev and cryptodev. The pktdev library > >>can > >> >> >> define a skeleton structure for the first few elements of the > >>nested > >> >> >> structures to ensure consistency. Each of the defines below > >> >>illustrate > >> >> >>the > >> >> >> common members in device structures, which gives some basic > >>structure > >> >> >>the > >> >> >> device framework. Each of the defines are placed at the top of the > >> >> >>devices > >> >> >> matching structures and allows the devices to contain common and > >> >>private > >> >> >> data. The pkdev structures overlay the first common set of members > >> >>for > >> >> >> each device type. > >> >> >> > >> >> > > >> >> > > >> >> >Keith and I discussed this offline, and for the purposes of > >> >>completeness > >> >> >I'll > >> >> >offer my rebuttal to this proposal here. > >> >> > > >> >> >In short, I don't think the segregation of the transmit and receive > >> >> >routines > >> >> >into their own separate structure (and ostensibly their own > >> >>librte_pktdev > >> >> >library) is particularly valuable. While it does provide some > >>minimal > >> >> >code > >> >> >savings when new device classes are introduced, the savings are not > >> >> >significant > >> >> >(approximlately 0.5kb per device class if the rte_ethdev generic tx > >> >>and rx > >> >> >routines are any sort of indicator). It does however, come with > >> >> >significant > >> >> >costs in the sense that it binds a device class to using an I/O > >>model > >> >>(in > >> >> >this > >> >> >case packet based recieve and transmit) for which the device class > >>may > >> >> >not be > >> >> >suited. > >> >> > >> >> The only reason the we only have a 0.5Kb saving is you are only > >>looking > >> >>at > >> >> moving Rx/Tx routines into pktdev, but what happens if we decide to > >> >>move a > >> >> number of common functions like start/stop and others, then you > >>start to > >> >> see a much bigger saving. > >> >You're right of course, but lets just try the math here. If we assume > >> >that, > >> >since all these libraries are just inlined redirection functions, we > >>can > >> >guess > >> >that each one is about .25kb of code (5.kb/2 functions). So if you add > >> >two more > >> >functions you're looking at 1kb of code savings. Of course if you > >>start > >> >doing > >> >so, you beg the two questions that I've been trying to pose here: > >> > > >> >1) How much 'common' API do you want to impose on a device class that > >>may > >> >not > >> >be condusive to the common set > >> > > >> >2) At what point does your common API just become an rte_ethdev? > >> > >> You are taking the code size to the extreme, which is not the real point > >> here. I agree it does not matter in footprint savings as most DPDK > >>systems > >> are pretty big. The reason for the common code was to help remove some > >>of > >> the code required for the developer to write in the new device type. If > >> the developer of the new device type writes his own set of Rx/Tx > >>routines > >> the community will determine if they are required or do the current APIs > >> work or do we require them to support both. > >> > >> Please try not getting hung up on the footprint savings and I am sorry > >>if > >> I was making this a huge point. > >> > >Hung up? I'm not the one who made a big deal out of code size reduction > >being a > >major benefit there, be it source or binary, its miniscule any way you > >slice it. > >I'm happy to drop this though if you are. > > > >> > > >> >> Do we need this saving, maybe not, but it does > >> >I would say definately not, given that the DPDK houses entire API sets > >> >that are > >> >always-inline, and any savings from the above are easily eclipsed by > >>the > >> >fanout > >> >that occurs from use in multiple call sites. Lets be honest, > >>performance > >> >is the > >> >DPDK's primary concern. Code size is not a consideration. Its only an > >> >argument > >> >here because it makes this proposal look favorable. Its not a bad > >>thing > >> >mind > >> >you, but if this proposal caused any code bloat, it wouldn't even be > >> >mentioned. > >> > >> In the current design the pktdev APIs are still inline routines only the > >> debug routines are not. > >> > >Reread my comments above please. You seem to have misread what I wrote. > > > >> > > >> >> provide a single call API rte_pkt_rx/tx_burst to use instead of the > >> >> application having to make sure it is calling the correct device > >>Rx/Tx > >> >> routines. > >> >Given that that is a compile time issue, I really don't see why that > >>is a > >> >big > >> >deal. Especially if you, as I suggest, just use the rte_ethdev as your > >> >common > >> >dataplane device model. Then by virtue of the fact that they're all > >> >rte_ethdevs, you get common API naming. > >> > > >> >> All that is required is passing in the device pointer and it is > >> >> handled for the application. Bruce added some code below to that > >>effect. > >> >Yes, and you have that now, its an rte_ethdev. > >> > > >> >> > > >> >> >To illustrate the difference in design ideas, currenty the dpdk data > >> >> >pipeline > >> >> >looks like this: > >> >> > > >> >> >+------------+ +----------+ +---------+ > >> >> >| | | | | | > >> >> >| ARP | | ethdev | | | +----------+ > >> >> >| handler +-->+ api +-->+ PMD +-->+ Wire | > >> >> >| | | | | | +----------+ > >> >> >| | | | | | > >> >> >+------------+ +----------+ +---------+ > >> >> > >> >> You did not add the crypto to this picture as it is in the picture > >>below > >> >> to make them the same. > >> >> > >> >No I didn't, because it doesn't exist right now. Thats what I mean by > >> >'currently'. See my description below after you added your drawing. > >> > > >> >> +-----+ +---------+ +---------+ +------+ > >> >> | | | | | | | | +------+ > >> >> | ARP +--+ ethdev +--+ crypto +--+ PMD +--+ Wire | > >> >> | | | | | | | | +------+ > >> >> +-----+ +---------+ +---------+ +------+ > >> >> > >> >> > >> >> > > >> >> > > >> >> >Where the ARP handler code is just some code that knows how to > >>manage > >> >>arp > >> >> >requests and responses, and only transmits and receives frames > >> >> > > >> >> >Keiths idea would introduce this new pktdev handler structure and > >>make > >> >>the > >> >> >dataplane pipeline look like this: > >> >> > > >> >> >+------------+ +------------+ +------------+ +--------+ > >> >> >| | | | | | | | > >> >> >| ARP | | pktdev api | | pktdev_api | | | > >>+---------+ > >> >> >| handler +-+ +--+ +--+ PMD +--+Wire > >> | > >> >> >| | | | | | | | > >>+---------+ > >> >> >| | | | | | | | > >> >> >+------------+ | | | | | | > >> >> > | | | | +--------+ > >> >> > | | | | > >> >> > | | | | > >> >> > | | | | > >> >> > | rte_ethdev | | rte_crypto | > >> >> > | | | | > >> >> > | | | | > >> >> > +------------+ +------------+ > >> >> > >> >> You are drawing this picture it appears trying to make the pktdev > >> >>another > >> >> function call layer when it is just a single macro in the rte_ethdev > >> >Not trying to imply a new function call layer, just trying to > >>illustrate > >> >your > >> >proposal, that you wish to make rte_ethdevs,and rte_cryptodevs all look > >> >like > >> >rte_pktdevs so that the dataplane has a common element for passing > >>mbufs > >> >around, > >> >regardless of its true device class. Not sure where you see an extra > >> >function > >> >call layer here. > >> > >> Please see my comment above as I am not trying to make them all look the > >> same you are taking my words to the extreme. We need some type of > >> structure for adding more device types to DPDK and I am just trying to > >> suggest a few common parts as a solution. > >> > >You see that you're contradicting yourself here, right? You state in the > >same > >paragraph that you are not trying to make all devices look the same, but > >are > >trying to find comment parts (the implication being that they will be > >interacted > >with in the same way). I get that you're saying their only going to be > >'partially' the same, where the overlapping structures are made common in > >your > >model, so that code can be written generically, but if thats your goal, > >then the > >generic code you propose can only be generic if it only interfaces to the > >common > >parts, which means, to the generic application code you envision, every > >device > >looks the same. Thats my point > > > >Note I'm not opposed to generic code being able to be written for multiple > >device types, I'm just asserting that that common device is the > >rte_ethdev, > >which we already have. I'm proposing that dataplane elements are network > >interfaces codified as rte_ethdevs, and crypto devices are codified as > >some > >other API, and should you want crypto functionality, you write a pmd > >(interfaced > >to using rte_ethdev), that makes use of the to-be-written crypto library. > > > >> Some parts of the structures should be common and possible a few common > >> routines, yes a big part of each device will be different. The > >>rte_ethdev > >> structures is what we have today and was a good starting point, but a > >>big > >> part of the code will be different. > >> > >No, they shouldn't. Reuse existing data structures where possible. But > >don't > >try to take device specific interfaces and data, and where you see > >commonality, > >force commonality. Thats what you're trying to do here. I'd challenge > >you to > >find another implementation of device driver interfaces where what you are > >proposing is done. It just doesn't make sense. > > > >> Should we allow someone to write a new device type that is completely > >> different down to typedefs for variable types and all new structures, no > >> we should not, giving them some guide lines by providing a set of > >>routines > >> and structures to start with, by all means lets do it. > >> > >At what point did you read that I said we can't use common data > >structures? Of > >course use them where it makes sense. Want to use mbufs as a common data > >type > >for both API's, fine! That makes sense. But don't go trying to mash > >device > >specific data types and methods into a common structure for some sort of > >imaginary savings. > > > >> > > >> >> header pointing to the rte_pktdev tx_burst routine. No function > >>function > >> >> overhead as the macro in rte_ethdev changes the rte_eth_tx_burst to > >> >> rte_pkt_tx_burst routine, which BTW is the same routine that was in > >> >> rte_ethdev today. The pktdev and ethdev are calling into the PMD tx > >> >> routine via the dev_ops function pointers structure. Which is also no > >> >> extra over head. > >> >> > >> >Never suggested it was extra overhead, where did you get that from? > >>I'm > >> >just > >> >illustrating what it is you want to do. If I didn't get it right I > >> >apologize, > >> >please clarify. > >> > >> No need to clarify as we agree and maybe I read too much into your > >> comments. > >> > > >> >> If you are calling the rte_pkt_tx_burst routine directly it just > >>means > >> >>you > >> >> need to get the device pointer to pass instead of the port id value > >>in > >> >>the > >> >> rte_pkt_tx_burst routine. The above turns into something like this: > >> >> > >> >> +-----+ +---------+ +--------+ +-----+ > >> >> | | | ethdev | | | | | +------+ > >> >> | ARP +--+ map to +--+ crypto +--+ PMD +--+ Wire | > >> >> | | | pktdev | | | | | +------+ > >> >> +-----+ +---------+ +--------+ +-----+ > >> >> > >> >> So the path of the data is the same only a macro does a simple > >>rename of > >> >> the call to rte_eth_tx_burst routine. If you call the pktdev routine > >> >> directly then the macro is not used. > >> >> > >> >> > >> >Let me quash this by stipulating > >> >that you are absolutely correct, there is no additional overhead in > >>your > >> >design, > >> >it should provide the exact same performance that an rte_ethdev > >>currently > >> >does. > >> >I have no problem with that. My point is: It provides the exact same > >> >performance that an rte_ethdev does, so lets just use rte_ethdev if > >>we're > >> >trying > >> >to model devices in the dataplane. > >> > >> The ethdev model does not give us a different device type it just forces > >> the new device type to look like an ethdev, this is what I am reading > >>from > >> your email here. I do not believe you are suggesting that ethdev look a > >> like is the answer to a new device type. > >> > >What!? You need to re-read my last email, I'm not sure where you get the > >above > >from. If you'll look back at the dataplane design, I proposed this: > > > >+------------+ +---------+ +---------+ +---------+ +--------+ > >| | | | | | | | | | > >| | |ethdev | | ipsec | |ethdev +--+ | > >| ARP handler+->+api +-+ tunnel +->+api | | PMD > >| | | | | PMD | | | | | > >| | | | | | | | | | > >+------------+ +---------+ +---+-----+ +---------+ +--------+ > > | > > +--+-----+ > > | | > > |crypto | > > |api | > > | | > > | | > > +--------+ > > > > > >I'm clearly proposing two device models here: > > > >1) A dataplane element model, the interface for which is an rte_ethdev. > >This > >allows one to implement multiple dataplane elements and functionality (in > >the > >example above an ipsec tunnel). This is a common and well understood > >device > >model that many network stacks use (read: easy for end users to > >understand) > > > >2) A crypto device model, implemented independently, reusing existing > >data types > >where possible (read: Appropriate for the device class and alternate use > >cases) > > > >(1) uses (2) to implement ipsec functionality. if an application just > >wants to > >do crypto on some arbitrary data, then the crypto api is better suited to > >doing > >so (read: not bound by the pktdev api you propose) > > > >2 device instances, 2 device models, each best suited for thier purpose. > > > >> The bonding model allows you to link some devices together and that is > >> good in IMO. The bonding model is kind of like a pipeline model at the > >> device layer. The pipeline style at the application layer could have > >> provided the same feature as the bonding model just being done at two > >> different layers in the system (application and device layer). > >> > >Not sure what you're getting at here, but I'll keep reading... > > > >> Now the pipeline model is a bit more generic and may add some small > >> overhead compared to the bonding model. If we require every device type > >>to > >> support a bonding style then we would need them all to follow some type > >>of > >> common structure, correct? > >I'm not sure what you're driving at here, please clarify. Are you trying > >to > >suggest that multiple pmds in the datapath require some pmd specific code > >to > >allow them to be stacked? Yes, they currently do, which is less than > >ideal, and > >I would certainly support an API to enable ordering of pmds in a > >datapath, but > >thats outside the scope of this discussion I think, since we're talking > >about > >new device types here. > > > >Regarding your comment about multiple pmds requiring a common structure to > >interface to one another in the datapath (like bonding currently does), > >you're > >absolutely right, thats an rte_ethdev. Thats the way it works today, and > >theres > >no reason to change that. > > > >> > > >> >> > > >> >> >The idea being that now all devices in the dataplane are pktdev > >>devices > >> >> >and code > >> >> >that transmits and receives frames only needs to know that a device > >>can > >> >> >transmit > >> >> >and receive frames. The crypto device in this chain is ostensibly > >> >> >preforming > >> >> >some sort of ipsec functionality so that arp frames are properly > >> >> >encrypted and > >> >> >encapsulated for sending via a tunnel. > >> >> > > >> >> >On the surface this seems reasonable, and in a sense it is. > >>However, > >> >>my > >> >> >assertion is that we already have this functionality, and it is the > >> >> >rte_ethdev > >> >> >device. To illustrate further, in my view we can do the above > >> >>already: > >> >> > > >> >> >+------------+ +---------+ +---------+ +---------+ +--------+ > >> >> >| | | | | | | | | | > >> >> >| | |ethdev | | ipsec | |ethdev +--+ | > >> >> >| ARP handler+->+api +-+ tunnel +->+api | | PMD > >> >> >| | | | | PMD | | | | | > >> >> >| | | | | | | | | | > >> >> >+------------+ +---------+ +---+-----+ +---------+ +--------+ > >> >> > | > >> >> > +--+-----+ > >> >> > | | > >> >> > |crypto | > >> >> > |api | > >> >> > | | > >> >> > | | > >> >> > +--------+ > >> >> > > >> >> >Using the rte_ethdev we can already codify the ipsec functionailty > >>as a > >> >> >pmd that > >> >> >registers an ethdev, and stack it with other pmds using methods > >> >>simmilar > >> >> >to what > >> >> >the bonding pmd does (or via some other more generalized dataplane > >> >> >indexing > >> >> >function). This still leaves us with the creation of the crypto > >>api, > >> >> >which is > >> >> >adventageous because: > >> >> > >> >> The proposal does not remove the bonding method and can still be > >>used, > >> >> correct? > >> >Not sure I fully understand the question, but no, I'm not proposing any > >> >specific > >> >change to bonding here, just using it to illustrate that we have ways > >> >currently > >> >to stack PMD's on top of one another to create a multi-stage data > >> >pipeline. > >> > >> Look at the above comment for this one. > > > >Ditto. I'm not making any comment about bonding, just using it as an > >example > >to show that its possible to stack ethdevs in a dataplane today. The > >method of > >stacking can certainly be improved, but rte_ethdev is acting as a > >perfectly good > >interface for that purpose right now, no changes needed. > > > >> > > >> >> I do not see the different here using the pktdev style routines or > >>using > >> >> ethdev routines. > >> >Exactly my point! There is no difference to the dataplane API, it > >>remains > >> >unchanged, just as in your proposal, except that there is no work > >> >required to > >> >create a pktdev device that doesn't change anything. The advantage > >>that > >> >I'm > >> >proposing here is that my model separates whatever crypto device class > >> >API you want > >> >to design from the dataplane (rte_ethdev) api. Doing this lets you > >> >create a > >> >crypto API that is more appropriately suited to the crypto class of > >> >device. if > >> >there is overlap with rte_ethdev, thats fine, you can create simmilar > >> >functions > >> >(even giving them identical names to the rte_ethdev apis), and that > >>will > >> >be > >> >properly resolved at compile time. But more importantly, where there > >>is > >> >not > >> >overlap, you're not forced into creating an API that inherits the > >>common > >> >API > >> >functions you propose, when they are not best suited to your device. > >> > >> I am still not trying to suggest we have the cryptodev be a copy of the > >> ethdev here. We need some type of structure and common guide lines for a > >> new device, which pktdev is a very reasonable plus very light framework. > >> > > > >So, perhaps we're talking past one another here, because I'm not > >suggesting that > >cryptodev be a copy of ethdev either, and I thought that was crystal > >clear in my > >drawings above. I'm suggesting that cryptodev be completely independent > >of any > >ethdev design (i.e. no overlap with ethdev as codified in your pktdev > >proposal). > > > >As for being lightweight, I don't see the relevance. As we agreed above, > >code > >savings, either in binary our source format is going to be fairly > >irrelevant, > >and not the point of the discussion. As for _needing_ commonality, I > >disagree, > >and cite other implementations in linux and bsd as evidence of that (the > >device > >models there, while reusing some datatypes, maintain independent method > >pointer > >structures). > > > >> I am a bit confused now it seems like you are suggesting a new device > >> should be ethdev and then you seem to be stating it should be completely > >> differentŠ > >> > >I'm not, and if you would read my emails closely, I think you would see > >that. > >As I tried to state as clearly as I'm able above, I'm proposing two > >devices: > > > >1) A dataplane element model, the interface for which is an rte_ethdev. > >This > >allows one to implement multiple dataplane elements and functionality (in > >the > >example above an ipsec tunnel). This is a common and well understood > >device > >model that many network stacks use (read: easy for end users to > >understand) > > > >2) A crypto device model, implemented independently, reusing existing > >data types > >where possible (read: Appropriate for the device class and alternate use > >cases) > > > > > > > >> Lets just agree some common parts are good between device types and some > >> are just not. Making all devices look like an ethdev is just not going > >>to > >> fly and making a new device type into something completely different is > >> not going to fly. > >> > >No, lets not, because thats exactly what I'm arguing against. > >Once again I'll try to re-iterate: > > > >I'm fine with reusing existing datatypes if they are applicable > > > >I would like to see a device model for cryptodev that implements its own > >methods > >that are condusive to the way that hardware works, and not bound to the > >way > >other device classes work > > > >I would like to see dataplane element functionality that implements other > >network elements (like an ipsec tunnel), be implemented using the > >rte_ethdev > >interface, and have it implement its internals using the to-be-created > >cryptodev > >api. > > > > > >> Then the only solution is to have somethings in common to help unify the > >> designs, but still allow the developer to add APIs to that device which > >> make the most sense. > >> > > > > >Please tell me, why do you think that unify device models is the _only_ > >solution? I've provided several examples that simply don't ever do that, > >and > >they are quite successfull > > > >> >> > > >> >> >1) It is not constrained by the i/o model of the dataplane (it may > >> >>include > >> >> >packet based i/o, but can build on more rudimentary (and performant) > >> >> >interfaces. > >> >> >For instance, in addition to async block based i/o, a crypto device > >>may > >> >> >also > >> >> >operate syncrhnously, meaning a call can be saved with each > >>transaction > >> >> >(2 calls > >> >> >for a tx/rx vs one for an encrypt operation). > >> >> > > >> >> >2) It is not constrained by use case. That is to say the API can be > >> >> >constructed > >> >> >for more natural use with other functions (for instance encryptions > >>of > >> >> >files on > >> >> >disk or via a pipe to another process), which may not have any > >>relation > >> >> >to the > >> >> >data plane of DPDK. > >> >> > > >> >> >Neil > >> >> > >> >> Ok, you snipped the text of the email here an it makes the context > >>wrong > >> >I figured your illustration above was sufficient to make your point and > >> >mine. > >> >If your code tells a different story, I apologize. > >> > > >> >> without the rest of the code IMO. I will try to explain without the > >>text > >> >> that was omitted, but it would be best for anyone missing the > >>original > >> >> email to read it for more details. I know the formatting got messed > >>up a > >> >> bit :-( > >> >> > >> >> http://dpdk.org/ml/archives/dev/2015-April/016124.html > >> >> > >> >> > >> >> In the rest of the text it does show the points I wanted to make here > >> >>and > >> >> how little overhead it added. > >> >> > >> >As above, I'm stipulating to the fact that there is not performance > >> >overhead. > >> >That is not, and was never my point. My point was/is that we already > >> >have an > >> >API to do what you want to do. The commonality of functions that you > >> >seem to > >> >focus on, is already provided by the fact that we have a common device > >> >for the > >> >dataplane (the rte_ethdev). You're proposal doesn't buy us anything > >> >beyond > >> >that. By your own assertion, your proposal: > >> > > >> >1) Its just as performant as an rte_ethdev > >> > > >> >2) It doesn't require any code change to applications > >> > > >> >Given that, and the fact that it does require that new device classes > >> >inherit an > >> > >> Please do not bring inheritance into this again I wish Bruce had never > >> used that term :-( in hind site it is not a very good way to describe > >>the > >> design. > >why not, its a perfectly accurate way to describe what you want to do > >here. You > >want to extract elements from rte_ethdev that you feel are going to be > >common to > >other device types, place them in their own structure, and have other > >device > >types include that common structure, so as to be able to have those > >methods be a > >part of the new device. In UML terms I suppose the relationship would be > >a > >"has-a" rather than a "is-a" but the inheritance description is equally > >relevant > >when writing in C. > > > >> It seems like it at a very high level, but it seems to drag > >> everyone down to the bits and bytes way too much. We ever stated that > >>all > >> the APIs are common between ethdev (or are inherited) between a new > >>device > >> only a couple which seemed to make sense to me. The structures being a > >>bit > >> similar was my goal and to provide a good light framework. > >> > >Again with the light framework. We're not talking about code savings > >anymore. > >Theres nothing more or less light about just using an rte_ethdev to > >implement a > >nework element. Separate out the network element from the actual crypto > >functionality. > > > >As for not _having_ to use the common structure in all device types, I > >agree, > >you don't have to mandate its use, but doesn't that beg the question, why > >do this > >then? If theres not any guaranteed functionality, why do this at all? > > > > > >> >API set that may not be relevant to said device class, my one and only > >> >question > >> >is: whats the advantage? > >> > >> I have been trying to layout the advantages: > >> 1) - we provide a very light framework for the developer of a new > >>device > >> type to start with in his design > >We agreed at the start of this thread that 'lightness' wasn't relevant. > >Or do > >you mean to describe something other than code size savings by the term > >'light'? > > > >> 2) - we look at making some routines common between devices if it make > >> sense, some non-performance APIs do make sense to be common, but the > >> structures need to be common in some way IMO > > > >I can't parse this sentance. You're propsal at the start of this thread > >was to create common structures between device classes. How is "we look > >at > >making some routines common between devices if it makes sense" an > >avantage to > >your proposal, which is exactly that? Its like saying changing code is > >good, > >because we get to change code. > > > >> 3) - we make a few structures look similar using pktdev structures as > >>an > >> overlay to allow for a clean way to pass around these different device > >> types > > > >This I actually agree with, but it goes back to your first point. I would > >argue that passing around rte_ethdevs is just as, if not more useful than > >passing around pktdevs, because it more fully describes an ethernet > >interface. > >As you note, the pktdev interface is 'lighter' which I take as a > >euphamism for > >'smaller or having less code'. As we've discussed above however, the > >savings > >are minimal and not paricularly relevant to this debate, so why not just > >use > >rte_ethdev to pass around network element functionality and design a > >crypto > >device interfae independently (I'll remind you here that I'm proposing two > >devices here, an rte_ethdev and a cryptodev). > > > >> 4) - adopt standard APIs or translate those APIs into something similar > >> for DPDK (e.g. Linux kernel crypto APIs) > >> > >What standard API's have you seen that make common API methods between > >device > >types? Can you provide an example? > > > >> >You claim that common API naming makes application > >> > >> I never claimed that common API naming make applications common, but > >>only > >> a few common constructs or API semantics could help. > >> > >What do you see as the difference? How, specifically does creating common > >device model mehtods help an application? I was under the impression > >that, in > >so doing, you could allow some generic application code to be written > >that could > >interface to either a real ethernet device, or to a crypto device. What > >other > >advantage do you see? > > > > > >> >code generic, but we have that today. If you want to do IPsec with a > >> >crypto > >> >offload devie, write a pmd that interfaces to the data pipeline via the > >> >rte_ethdev API and uses whatever crypto offload device API you want to > >> >construct > >> >to do the crypto work. That way, if a given application wants to use > >>the > >> >crypto > >> > >> I am trying to suggest the crypto device can have other APIs better > >>suited > >> to how we want to use that device, again we can have a few common items > >> between ethdev and cryptodev, but it does not make sense to use all of > >> ethdev. > >> > > > >Ok, so this is a good thought. I think what I hear you saying above is > >that you > >could envision a crypto device having two api sets, yes? One, a raw API > >that > >applications could use to talk to crypto devices directly, and two, a > >network > >oriented API that can transmit and recieve network frames, right? > > > >If thats the case, I'm fine with that. But if thats interesting to you, > >why > >can't the network interface just be an rte_ethdev that you register like > >other > >pmd's do? > > > >> I am getting confused here as you seem to be suggesting we use ethdev > >> model, but I am sure you are not, correct? > >> > >Correct, two device models, for two devices. An ipsec device, > >implemented as a > >pmd that uses an rte_ethdev to interface to the network dataplane, and a > >crypto > >device model that the aforementioned pmd interfaces to to encrypt and > >decrypt > >data to implement ipsec functionality. > > > >> >hardware for something non-network related (say disk/file encryption), > >>it > >> >can > >> >use the crypto device layer in a way that is not packet/network > >>oriented > >> >more > >> >easily, at which point having a common API is completely irrelevant, as > >> >the > >> >application code is being specifically written to talk to a crypto > >>device. > >> > > >> > > >> > > >> > > >> >> Lets just say we do not move the TX/RX routines from rte_ethdev into > >> >> rte_pktdev and only have a header file with a few structures and > >>macros > >> >>to > >> >> help define some common parts between each of the new device types > >>being > >> >> added. I could see that as an option, but I do not see the big issues > >> >>you > >> >> are pointing out here. > >> >> > >> >My big issue is that I see no point in creating a common API. > >> > >> I am not trying to create a common API between all devices, it only > >>seemed > >> like a reasonable design to make Rx/Tx look similar. Also making some of > >> the structures look similar is a good thing IMO. > >> > >Ok, you're not trying to make them look identical, I get that. But you > >are > >trying to get them to try to look identical in terms of how they do data > >i/o. > >I'm suggesting/asserting that such an interface isn't appropriate to all > >devices. Even that little bit isn't worth making common, because it won't > >always be used. > > > >> >You're focused on finding a common set of API's that apply to all > >>device > >> >classes, and I'm asserting that there is no safe set of common API > >> >routines. > >> >While it may be possible to shoehorn multiple device classes into the > >> >same i/o > >> >API model, you are doing yourself a disservice by mandating that. > >> > >> I feel like you are taking the extreme view here again and that was not > >>my > >> intent to make everything the same only where it makes sense. A ethdev > >>and > >> cryptodev are two different animals, but they do have some similarities > >> and I want to make sure we investigate these similarities, right? > >> > >Yes, they are two different animals, and while they may have some > >similarities, > >I'm suggesting that its not worth codifying those similarities in code. > >crypto > >devices don't send and recieve packets. They might operate on block > >data, so > >you might use mbufs in both apis, but you don't transmit data to a crypto > >dev > >and you don't receive data from a crypto device. You could do that I > >suppose, > >but there are better ways, and existing api's show us the way to do that. > >Remember from above, I'm suggesting two device models here. A crypto > >device > >that just does cryptography in the fastest/best way possible, and a > >optional > >network element (an ipsec tunnel in our example), that uses the crypto > >device to > >preform its function, but uses an rte_ethdev to transmit and receive > >network > >data. > > > > > >> > > >> >Put another way, ipsec devices and crypto devices are separate device > >> >classes > >> >and deserve to be modeled differently. An IPsec device is a network > >> >element > >> >that is very nicely modeled as an ethernet interface, for which you > >>have > >> >a very > >> >robust API (the rte_ethdev). Crypto offload devices are not network > >> >elements, > >> >and can preform i/o in a mutltitude of ways (synchronous, asynchronous, > >> >scatter/gather, etc), and have many non ethernet API facets. The > >>latter > >> >deserves to have its own API design, not to be forced to inherit part > >>of > >> >an > >> >ethernet device (rte_pktdev) and be forced to work that way. > >> > > >> >> You did have some great comments about how crypto is used and the > >>APIs > >> >> from the Linux Kernel crypto model is proven and I do not disagree. > >> >> > >> >> In my email to my own email I tried to point our we could add > >>something > >> >Yeah, sorry about the broken thread, the in-reply-to field got somehow > >> >html-mangled. Not sure how that happened as I use mutt as my MUA. > >> > > >> >> very similar to Linux Kernel Crypto API and it would be a model most > >> >>would > >> >> be able to understand. Creating my own crypto API is not my goal, > >>but to > >> >> use standards where it makes the most sense to DPDK. > >> >> > >> >> The email link above is the email I suggested the Linux Kernel Crypto > >> >>API > >> >> would be reasonable. > >> >> > >> >And thats good. It doesn't need to be the linux kernel crypto API by > >>any > >> >stretch, and I don't expect it to be. My take away from review of the > >> >linux > >> >crypto api is that, while it uses some simmilar notions to parts of the > >> >network > >> >API (scatter gather i/o, optional async operation, completion queues), > >>it > >> >doesn't create any sort of common API structure between it and the > >>network > >> >subsystem, because they're separate devices. Even though crypto is > >> >largely used > >> >in the network datapath for various ipsec tunnels, crypto still uses > >>its > >> >own API > >> >methods and data types because its not exclusive to the use of > >> >networking, and > >> >doing so allows for more flexible use outside the datapath. Thats what > >> >I'm > >> >trying to get at here. That users might want to use crypto outside of > >>the > >> >network path, and they should have an API that is well suited to that > >> >task. > >> > >> We agree here, I only want some similarities between devices not an > >>exact > >> common API. > >> > >> I am afraid I am not following you now. At one point it seems like all > >>of > >> the devices must be completely different then in another place ethdev is > >> best way to go as a common API. I think we are saying the same thing > >>here > >> just from a different point of view and I have tried to understand your > >> view point. I have even relaxed my design goals to meet you part way. It > >> seems like we agree and disagree here, which does not make a lot of > >>sense > >> to me at this point. > >> > > > >I'll repeat what I've said above several times now: > > > >I'd like to see a crypto device api that is developed independently. It > >can use > >common data types if its suitable to do so, but shouldn't be forced to use > >simmilar methods from existing device types. The API here should be > >optimized > >to best reflect the behavior of the hardware > > > >If you would like to use the cryptodevice as part of a network dataplane > >element, you should create a software pmd to register a rte_ethdev. > >Internally > >that pmd should use the crypto dev api to do its encryption work. > > > >That way your 'common' device api in the dataplane is the rte_ethdev. The > >cryptodev api is then independent of, and optimized for, the cryptodev > >hardware. > > > >> Maybe we need to get into a room and talk F2F to make sure we are both > >> being understood here. Maybe we can then make sure we are both being > >>heard > >> and come to some type of solution for DPDK. Maybe the community meeting > >>is > >> a good place, but I feel it is going to be too big and too short to > >> discuss this correctly. If you are up to a conf call or I can travel to > >> you if you feel a F2F is better. > >> > >I think we're on opposite coasts, so I don't think a face to face is > >going to > >happen just for this, and that excludes other interests from this > >discussion, > >which I would really like to hear. > > > >I've re-read this thread, and I have a suggestion: > > > >Implement it. > > > >We're clearly approaching this from different angles. You seem to place > >alot of > >value on the benefits of commonality between device types, while I > >clearly dont. > >I've got several examples in support of my view, but in fairness, the > >dpdk isn't > >the linux or bsd kernel, so while I don't think so, its possible this > >commonality provides some level of value. Likewise, I may show that > >creating > >this common structure doesn't provide any real added value. > >Additionally, we've > >both focused on the commonality of devices in this thread (which was the > >intent), but neither of us have spent any time actually designing the > >crypto > >API, which I think will shed alot of light on the problem. > > > >I would suggest choosing a crypto device, maybe the ixp4 crypto engine, > >since it > >has an existing open source implementation, and using it to flesh out > >crypto > >API design. If you implement it independently of the dpdk (i.e. as part > >of its > >own library with its own API), then we take a step back and look at it, > >we can > >identify the true api overlap and consider the advantages and limitation > >of > >consolidation. > > > >Sound like an idea? > > Instead of replying to all of your comments above (sorry) I will finish up > the PoC code and then we can poke at the patches. > > I have created a PoC around the code fragments in these emails, but I did > not include them as patches. The Quick Assist code on 01.org is being used > for the PoC and someone else is doing that work, but I can look at the > non-packet API from a top level to be added to the PoC. > > https://01.org/packet-processing/intel%C2%AE-quickassist-technology-drivers > -and-patches > That sounds like a reasonable approach. When you have them ready, please reply to the thread and I'll take a look at it. Best Neil > > Regards, > ++Keith > > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [RFC] Adding multiple device types to DPDK. @ 2015-04-02 14:16 Wiles, Keith 0 siblings, 0 replies; 9+ messages in thread From: Wiles, Keith @ 2015-04-02 14:16 UTC (permalink / raw) To: dev Hi All, just to make a comment on my own email :-) On 4/1/15, 7:44 AM, "Wiles, Keith" <keith.wiles@intel.com> wrote: >Hi all, (hoping format of the text is maintained) > >Bruce and myself are submitting this RFC in hopes of providing discussion >points for the idea. Please do not get carried away with the code >included, it was to help everyone understand the proposal/RFC. > >The RFC is to describe a proposed change we are looking to make to DPDK to >add more device types. We would like to add in to DPDK the idea of a >generic packet-device or ³pktdev², which can be thought of as a thin layer >for all device classes. For other device types such as potentially a >³cryptodev² or ³dpidev². One of the main goals is to not effect >performance and not require any current application to be modified. The >pktdev layer is providing a light framework for developers to add a device >to DPDK. > >Reason for Change >----------------- > >The reason why we are looking to introduce these concepts to DPDK are: > >* Expand the scope of DPDK so that it can provide APIs for more than just >packet acquisition and transmission, but also provide APIs that can be >used to work with other hardware and software offloads, such as >cryptographic accelerators, or accelerated libraries for cryptographic >functions. [The reason why both software and hardware are mentioned is so >that the same APIs can be used whether or not a hardware accelerator is >actually available]. >* Provide a minimal common basis for device abstraction in DPDK, that can >be used to unify the different types of packet I/O devices already >existing in DPDK. To this end, the ethdev APIs are a good starting point, >but the ethdev library contains too many functions which are NIC-specific >to be a general-purpose set of APIs across all devices. > Note: The idea was previously touched on here: >http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545 > >Description of Proposed Change >------------------------------ > >The basic idea behind "pktdev" is to abstract out a few common routines >and structures/members of structures by starting with ethdev structures as >a starting point, cut it down to little more than a few members in each >structure then possible add just rx_burst and tx_burst. Then use the >structures as a starting point for writing a device type. Currently we >have the rx_burst/tx_burst routines moved to the pktdev and it see like >move a couple more common functions maybe resaonable. It could be the >Rx/Tx routines in pktdev should be left as is, but in the code below is a >possible reason to abstract a few routines into a common set of files. > >From there, we have the ethdev type which adds in the existing functions >specific to Ethernet devices, and also, for example, a cryptodev which may >add in functions specific for cryptographic offload. As now, with the >ethdev, the specific drivers provide concrete implementations of the >functionality exposed by the interface. This hierarchy is shown in the >diagram below, using the existing ethdev and ixgbe drivers as a reference, >alongside a hypothetical cryptodev class and driver implementation >(catchingly called) "X": > > ,---------------------. > | struct rte_pktdev | > +---------------------+ > | rte_pkt_rx_burst() | > .-------| rte_pkt_tx_burst() |-----------. > | `---------------------' | > | | > | | > ,-------------------------------. ,------------------------------. > | struct rte_ethdev | | struct rte_cryptodev | > +-------------------------------+ +------------------------------+ > | rte_eth_dev_configure() | | rte_crypto_init_sym_session()| > | rte_eth_allmulticast_enable() | | rte_crypto_del_sym_session() | > | rte_eth_filter_ctrl() | | | > `-------------------------------' `---------------.--------------' > | | > | | > ,---------'---------------------. ,---------------'--------------. > | struct rte_pmd_ixgbe | | struct rte_pmd_X | > +-------------------------------+ +------------------------------+ > | .configure -> ixgbe_configure | | .init_session -> X_init_ses()| > | .tx_burst -> ixgbe_xmit_pkts | | .tx_burst -> X_handle_pkts() | > `-------------------------------' `------------------------------' > >We are not attempting to create a real class model here only looking at >creating a very basic common set of APIs and structures for other device >types. > >In terms of code changes for this, we obviously need to add in new >interface libraries for pktdev and cryptodev. The pktdev library can >define a skeleton structure for the first few elements of the nested >structures to ensure consistency. Each of the defines below illustrate the >common members in device structures, which gives some basic structure the >device framework. Each of the defines are placed at the top of the devices >matching structures and allows the devices to contain common and private >data. The pkdev structures overlay the first common set of members for >each device type. > >For example: >------------ > >We are using macros to reduce code changes to DPDK, but nested structures >are a better solution: > >#define RTE_PKT_COMMON_DEV(_t) > \ > pkt_rx_burst_t rx_pkt_burst; /**< Pointer to PMD >receive function. */ \ > pkt_tx_burst_t tx_pkt_burst; /**< Pointer to PMD >transmit function. */ \ > struct rte_##_t##_dev_data *data; /**< Pointer to device >data */ \ > const struct _t##_driver *driver; /**< Driver for this >device */ \ > struct _t##_dev_ops *dev_ops; /**< Functions exported by >PMD */ \ > struct rte_pci_device *pci_dev; /**< PCI info. supplied by >probing */ \ > /** User application callback for interrupts if present */ > \ > struct rte_##_t##_dev_cb_list link_intr_cbs; > \ > /** > \ > * User-supplied functions called from rx_burst to post-process > \ > * received packets before passing them to the user > \ > */ > \ > struct rte_##_t##_rxtx_callback **post_rx_burst_cbs; > \ > /** > \ > * User-supplied functions called from tx_burst to pre-process > \ > * received packets before passing them to the driver for >transmission. \ > */ > \ > struct rte_##_t##_rxtx_callback **pre_tx_burst_cbs; > \ > enum rte_pkt_dev_type dev_type; /**< Flag indicating the >device type */ \ > uint8_t attached /**< Flag indicating the >port is attached */ > /* Possible alignment or a hole in the structure */ > >#define RTE_PKT_NAME_MAX_LEN (32) > >#define RTE_PKT_COMMON_DEV_DATA > \ > char name[RTE_PKT_NAME_MAX_LEN]; /**< Unique identifier name */ > \ > > \ > void **rx_queues; /**< Array of pointers to RX queues. >*/ \ > void **tx_queues; /**< Array of pointers to TX queues. >*/ \ > uint16_t nb_rx_queues; /**< Number of RX queues. */ > \ > uint16_t nb_tx_queues; /**< Number of TX queues. */ > \ > > \ > uint16_t flags; /**< Bit fields for xyzdev's to use. >*/ \ > uint16_t mtu; /**< Maximum Transmission Unit. */ > \ > uint8_t unit_id; /**< Unit ID for this instance */ > \ > uint8_t _filler[7]; /* alignment filler */ > \ > > \ > /* 64bit alignment starts here */ > \ > void *dev_private; /**< PMD-specific private data */ > \ > uint64_t rx_mbuf_alloc_failed; /**< RX ring mbuf allocation failures. >*/ \ > uint32_t min_rx_buf_size; /**< Common rx buffer size handled by >all queues */ \ > uint32_t _pad0 > >#define port_id unit_id > >#define RTE_PKT_COMMON_DEV_INFO > \ > struct rte_pci_device *pci_dev; /**< Device PCI information. >*/ \ > const char *driver_name; /**< Device Driver name. */ > \ > unsigned int if_index; /**< Index to bound host >interface, or 0 if none. */ \ > /* Use if_indextoname() to translate into an interface name. */ > \ > uint32_t _pad0 > >The above is attempting to collect the common members to be place into the >top of private device structures as we feel these members should be fairly >common among the device types. > >/** >* @internal >* The generic data structure associated with each device. >* >* Pointers to burst-oriented packet receive and transmit functions are >* located at the beginning of the structure, along with the pointer to >* where all the data elements for the particular device are stored in >shared >* memory. This split allows the function pointer and driver data to be >per- >* process, while the actual configuration data for the device is shared. >*/ >struct rte_pkt_dev { > RTE_PKT_COMMON_DEV(pkt); >}; > >/** >* @internal >* The data part, with no function pointers, associated with each device. >* >* This structure is safe to place in shared memory to be common among >different >* processes in a multi-process configuration. >*/ >struct rte_pkt_dev_data { > RTE_PKT_COMMON_DEV_DATA; >}; > >------ > >The existing ethdev code can then have a minor updates such as those shown >below: > >struct rte_eth_dev_info { > RTE_PKT_COMMON_DEV_INFO; > > /* Private device data maybe here */ > uint32_t min_rx_bufsize; /**< Minimum size of RX buffer. */ > uint32_t max_rx_pktlen; /**< Maximum configurable length of RX pkt. */ > ... > >struct rte_eth_dev_data { > RTE_PKT_COMMON_DEV_DATA; /**< Define located in <rte_pkt.h> */ > > /* Private device data maybe here */ > struct rte_eth_dev_sriov sriov; /**< SRIOV data */ > > struct rte_eth_link dev_link; /**< Link-level information & status */ > ... > >struct rte_eth_dev { > RTE_PKT_COMMON_DEV(eth); > /* Private device data maybe here */ >}; > >/* Bit defines for flags in common pkt structure */ >#define promiscuous 0x0008 /**< RX promiscuous mode ON(1) / OFF(0). >*/ >#define scattered_rx 0x0004 /**< RX of scattered packets is ON(1) / >OFF(0) */ >#define all_multicast 0x0002 /**< RX all multicast mode ON(1) / >OFF(0). */ >#define dev_started 0x0001 /**< Device state: STARTED(1)/STOPPED(0) >*/ > >The advantage of doing a common set of member is the existing ethdev >structures and APIs can remain exactly the same, but every ethdev is also >a pktdev, which can be used as either as appropriate. Similarly for a type >of crypto devices, or dpi devices (or software rings or KNI devices, if we >so desire), we can base them off this common minimal framework and use >them all in a similar manner. > >Moving some basic common functions and structures into a common set of >files gives everyone a clean starting point for a new device plus adding a >light framework. The pktdev code is normally not called directly from the >application, but called from the device itself via a define in the device >header files. The pktdev RX/TX routines can be called from the >application, but the application needs to get the device structure pointer >based on the port id. > >The cryptodev API maybe very different from other devices and following >some type of Open Crypto API. The goal is not to restrict the device API, >but try to give some type of structure to tghe design. Does it make sense >to have a mbuf based Rx/Tx API, maybe not. Could the mbuf based APIs be >hidden in the pktdev code, very possible. We have a lot of options here. Adding some modified API looking very close to the OpenCrypto API or Linux Kernel Crypto API is a good way to extend and create a common API for crypto in DPDK. Following a standard set of APIs should help adoption. > >How the two Rx/Tx routines are defined: >--------------------------------------- > >/** > * > * Retrieve a burst of input packets from a receive queue of an Ethernet > * device. > * ><SNIP> > */ >#define rte_eth_rx_burst(_pid, _qid, _pkts, _nb_pkts) \ > rte_pkt_rx_burst((struct rte_pkt_dev *)&rte_eth_devices[_pid], _qid, >_pkts, _nb_pkts) > >/** > * Send a burst of output packets on a transmit queue of an Ethernet >device. > * ><SNIP> > */ >#define rte_eth_tx_burst(_pid, _qid, _pkts, _nb_pkts) \ > rte_pkt_tx_burst((struct rte_pkt_dev *)&rte_eth_device[_pid], _qid, >_pkts, _nb_pkts) > >A snip of code showing some advantages and use case of using pktdev API: >------------------------------------------------------------------------ > >Not the complete code and it has not been tested and is only an example >how one could use the design. > >/* > * The lcore main. This is the main thread that does the work, reading >from > * an input port and writing to an output port. > */ >static __attribute__((noreturn)) void >do_work(const struct pipeline_params *p) >{ > printf("\nCore %u forwarding packets. %s -> %s\n", > rte_lcore_id(), > p->src->data->name, > p->dst->data->name); > > /* Run until the application is quit or killed. */ > for (;;) { > /* > * Receive packets on a src device and forward them on out > * the dst device. > */ > /* Get burst of RX packets, from first port of pair. */ > struct rte_mbuf *bufs[BURST_SIZE]; > const uint16_t nb_rx = rte_pkt_rx_burst(p->src, 0, > bufs, BURST_SIZE); > > if (unlikely(nb_rx == 0)) > continue; > > /* Send burst of TX packets, to second port of pair. */ > const uint16_t nb_tx = rte_pkt_tx_burst(p->dst, 0, > bufs, nb_rx); > > /* Free any unsent packets. */ > if (unlikely(nb_tx < nb_rx)) { > uint16_t buf; > for (buf = nb_tx; buf < nb_rx; buf++) > rte_pktmbuf_free(bufs[buf]); > } > } >} > >/* > * The main function, which does initialization and calls the per-lcore > * functions. > */ >int >main(int argc, char *argv[]) >{ > struct pipeline_params p[RTE_MAX_LCORE]; > struct rte_mempool *mbuf_pool; > unsigned nb_ports, lcore_id; > uint8_t portid; > > /* Initialize the Environment Abstraction Layer (EAL). */ > int ret = rte_eal_init(argc, argv); > if (ret < 0) > rte_exit(EXIT_FAILURE, "Error with EAL initialization\n"); > > argc -= ret; > argv += ret; > > /* Check that there is an even number of ports to send/receive on. */ > nb_ports = rte_eth_dev_count(); > if (nb_ports < 2 || (nb_ports & 1)) > rte_exit(EXIT_FAILURE, "Error: number of ports must be even\n"); > > /* Creates a new mempool in memory to hold the mbufs. */ > mbuf_pool = rte_mempool_create("MBUF_POOL", > NUM_MBUFS * nb_ports, > MBUF_SIZE, > MBUF_CACHE_SIZE, > sizeof(struct rte_pktmbuf_pool_private), > rte_pktmbuf_pool_init, NULL, > rte_pktmbuf_init, NULL, > rte_socket_id(), > 0); > > if (mbuf_pool == NULL) > rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n"); > > /* Initialize all ports. */ > for (portid = 0; portid < nb_ports; portid++) > if (port_init(portid, mbuf_pool) != 0) > rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n", > portid); > > struct rte_pkt_dev *in = rte_eth_get_dev(0); > RTE_LCORE_FOREACH_SLAVE(lcore_id){ > char name[RTE_RING_NAMESIZE]; > snprintf(name, sizeof(name), "RING_from_%u", lcore_id); > struct rte_pkt_dev *out = rte_ring_get_dev( > rte_ring_create(name, 4096, rte_socket_id(), 0)); > > p[lcore_id].src = in; > p[lcore_id].dst = out; > rte_eal_remote_launch((lcore_function_t *)do_work, > &p[lcore_id], lcore_id); > in = out; // next pipeline stage reads from my output. > } > //now finish pipeline on master lcore > lcore_id = rte_lcore_id(); > p[lcore_id].src = in; > p[lcore_id].dst = rte_eth_get_dev(1); > do_work(&p[lcore_id]); > > return 0; >} > > >Changes to rte_ethdev.[ch] >-------------------------- > >The most changes to rte_ethdev.[ch] was to use the new defines from >rte_pkt.h. All of the references to the globals in ethdev had to be >replaced with a reference to a global structure in ethdev. Moving the >global or private data into a device specific structure seemed reasonable >to reduce name space issues with new devices. The rx_burst/tx_burst >routines were removed as they now exist in the rte_pktdev.c file. If we >use nested structures instead of macros then more of the code will need to >be converted or macros used to convert the members to address the nested >structures. > >Example: >#define rx_pkt_burst dev_data.rx_pkt_burst >#define tx_pkt_burst dev_data.tx_pkt_burst > > >Impact to Existing Applications >------------------------------- > >None. The existing APIs should all remain unchanged, only the underlying >library code needs to change. [Obviously changes to apps will be needed to >take advantage of new device classes as we make them available]. > >The crypto API could be similar to the Open Crypto APIs and they seem >reasonable, but also using mbufs to hold data is just trying to use that >container type to provide some common structure to the system. Some of the >crypto data with be in the form of packets and some in the form of chunks >of data, which the API should account for in the design. > >My goal is to provide a light weight framework for adding more devices and >not try to make everthing look like Ethernet device. > >Regards, >++Keith and Bruce > > ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2015-04-06 1:48 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-04-01 12:44 [dpdk-dev] [RFC] Adding multiple device types to DPDK Wiles, Keith 2015-04-03 17:00 ` Neil Horman 2015-04-03 22:32 ` Wiles, Keith 2015-04-04 13:11 ` Neil Horman 2015-04-04 15:16 ` Wiles, Keith 2015-04-05 19:37 ` Neil Horman 2015-04-05 22:20 ` Wiles, Keith 2015-04-06 1:48 ` Neil Horman 2015-04-02 14:16 Wiles, Keith
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).