From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
	by dpdk.org (Postfix) with ESMTP id DEDB52716
	for ; Tue, 21 Apr 2015 10:40:08 +0200 (CEST)
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
	by fmsmga101.fm.intel.com with ESMTP; 21 Apr 2015 01:40:07 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.11,614,1422950400"; d="scan'208";a="698444146"
Received: from irsmsx106.ger.corp.intel.com ([163.33.3.31])
	by fmsmga001.fm.intel.com with ESMTP; 21 Apr 2015 01:40:06 -0700
Received: from irsmsx105.ger.corp.intel.com ([169.254.7.178])
	by IRSMSX106.ger.corp.intel.com ([169.254.8.204])
	with mapi id 14.03.0224.002; Tue, 21 Apr 2015 09:40:06 +0100
From: "Ananyev, Konstantin"
To: "Richardson, Bruce"
Thread-Topic: [dpdk-dev] [RFC PATCH 1/4] Add example pktdev implementation
Thread-Index: AQHQeSGhmKJzJaSTmEGpWAjHdFUkNp1VwJrwgAAycoCAAJwXkA==
Date: Tue, 21 Apr 2015 08:40:05 +0000
Message-ID: <2601191342CEEE43887BDE71AB9772582141B5D1@irsmsx105.ger.corp.intel.com>
References: <1428954274-26944-1-git-send-email-keith.wiles@intel.com>
	<1429283804-28087-1-git-send-email-bruce.richardson@intel.com>
	<1429283804-28087-2-git-send-email-bruce.richardson@intel.com>
	<2601191342CEEE43887BDE71AB97725821416F4D@irsmsx105.ger.corp.intel.com>
	<20150420150236.GA9656@bricha3-MOBL3>
In-Reply-To: <20150420150236.GA9656@bricha3-MOBL3>
Accept-Language: en-IE, en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [163.33.239.182]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] [RFC PATCH 1/4] Add example pktdev implementation
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 21 Apr 2015 08:40:09 -0000

> -----Original Message-----
> From: Richardson, Bruce
> Sent: Monday, April 20, 2015 4:03 PM
> To: Ananyev, Konstantin
> Cc: dev@dpdk.org; Wiles, Keith
> Subject: Re: [dpdk-dev] [RFC PATCH 1/4] Add example pktdev implementation
>
> On Mon, Apr 20, 2015 at 12:26:43PM +0100, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> > > Sent: Friday, April 17, 2015 4:17 PM
> > > To: dev@dpdk.org; Wiles, Keith
> > > Subject: [dpdk-dev] [RFC PATCH 1/4] Add example pktdev implementation
> > >
> > > This commit demonstrates what a minimal API for all packet handling
> > > types would look like. It simply provides the necessary parts for
> > > receiving and transmiting packets, and is based off the ethdev
> > > implementation.
> > > ---
> > >  config/common_bsdapp           |   5 ++
> > >  config/common_linuxapp         |   5 ++
> > >  lib/Makefile                   |   1 +
> > >  lib/librte_pktdev/Makefile     |  56 ++++++++++++++++
> > >  lib/librte_pktdev/rte_pktdev.c |  35 ++++++++++
> > >  lib/librte_pktdev/rte_pktdev.h | 144 +++++++++++++++++++++++++++++++++++++++++
> > >  6 files changed, 246 insertions(+)
> > >  create mode 100644 lib/librte_pktdev/Makefile
> > >  create mode 100644 lib/librte_pktdev/rte_pktdev.c
> > >  create mode 100644 lib/librte_pktdev/rte_pktdev.h
> > >
> > > diff --git a/config/common_bsdapp b/config/common_bsdapp
> > > index 8ff4dc2..d2b932c 100644
> > > --- a/config/common_bsdapp
> > > +++ b/config/common_bsdapp
> > > @@ -132,6 +132,11 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
> > >  CONFIG_RTE_LIBRTE_KVARGS=y
> > >
> > >  #
> > > +# Compile generic packet handling device library
> > > +#
> > > +CONFIG_RTE_LIBRTE_PKTDEV=y
> > > +
> > > +#
> > >  # Compile generic ethernet library
> > >  #
> > >  CONFIG_RTE_LIBRTE_ETHER=y
> > > diff --git a/config/common_linuxapp b/config/common_linuxapp
> > > index 09a58ac..5bda416 100644
> > > --- a/config/common_linuxapp
> > > +++ b/config/common_linuxapp
> > > @@ -129,6 +129,11 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
> > >  CONFIG_RTE_LIBRTE_KVARGS=y
> > >
> > >  #
> > > +# Compile generic packet handling device library
> > > +#
> > > +CONFIG_RTE_LIBRTE_PKTDEV=y
> > > +
> > > +#
> > >  # Compile generic ethernet library
> > >  #
> > >  CONFIG_RTE_LIBRTE_ETHER=y
> > > diff --git a/lib/Makefile b/lib/Makefile
> > > index d94355d..4db5ee0 100644
> > > --- a/lib/Makefile
> > > +++ b/lib/Makefile
> > > @@ -32,6 +32,7 @@
> > >  include $(RTE_SDK)/mk/rte.vars.mk
> > >
> > >  DIRS-y += librte_compat
> > > +DIRS-$(CONFIG_RTE_LIBRTE_PKTDEV) += librte_pktdev
> > >  DIRS-$(CONFIG_RTE_LIBRTE_EAL) += librte_eal
> > >  DIRS-$(CONFIG_RTE_LIBRTE_MALLOC) += librte_malloc
> > >  DIRS-$(CONFIG_RTE_LIBRTE_RING) += librte_ring
> > > diff --git a/lib/librte_pktdev/Makefile b/lib/librte_pktdev/Makefile
> > > new file mode 100644
> > > index 0000000..2d3b3a1
> > > --- /dev/null
> > > +++ b/lib/librte_pktdev/Makefile
> > > @@ -0,0 +1,56 @@
> > > +#   BSD LICENSE
> > > +#
> > > +#   Copyright(c) 2015 Intel Corporation. All rights reserved.
> > > +#   All rights reserved.
> > > +#
> > > +#   Redistribution and use in source and binary forms, with or without
> > > +#   modification, are permitted provided that the following conditions
> > > +#   are met:
> > > +#
> > > +#     * Redistributions of source code must retain the above copyright
> > > +#       notice, this list of conditions and the following disclaimer.
> > > +#     * Redistributions in binary form must reproduce the above copyright
> > > +#       notice, this list of conditions and the following disclaimer in
> > > +#       the documentation and/or other materials provided with the
> > > +#       distribution.
> > > +#     * Neither the name of Intel Corporation nor the names of its
> > > +#       contributors may be used to endorse or promote products derived
> > > +#       from this software without specific prior written permission.
> > > +#
> > > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > > +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > > +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > > +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > > +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > > +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > > +
> > > +include $(RTE_SDK)/mk/rte.vars.mk
> > > +
> > > +#
> > > +# library name
> > > +#
> > > +LIB = libpktdev.a
> > > +
> > > +CFLAGS += -O3
> > > +CFLAGS += $(WERROR_FLAGS)
> > > +
> > > +EXPORT_MAP := rte_pktdev_version.map
> > > +
> > > +LIBABIVER := 1
> > > +
> > > +SRCS-y += rte_pktdev.c
> > > +
> > > +#
> > > +# Export include files
> > > +#
> > > +SYMLINK-y-include += rte_pktdev.h
> > > +
> > > +# this lib depends upon no others:
> > > +DEPDIRS-y +=
> > > +
> > > +include $(RTE_SDK)/mk/rte.lib.mk
> > > diff --git a/lib/librte_pktdev/rte_pktdev.c b/lib/librte_pktdev/rte_pktdev.c
> > > new file mode 100644
> > > index 0000000..4c32d86
> > > --- /dev/null
> > > +++ b/lib/librte_pktdev/rte_pktdev.c
> > > @@ -0,0 +1,36 @@
> > > +/*-
> > > + *   BSD LICENSE
> > > + *
> > > + *   Copyright(c) 2015 Intel Corporation. All rights reserved.
> > > + *   All rights reserved.
> > > + *
> > > + *   Redistribution and use in source and binary forms, with or without
> > > + *   modification, are permitted provided that the following conditions
> > > + *   are met:
> > > + *
> > > + *     * Redistributions of source code must retain the above copyright
> > > + *       notice, this list of conditions and the following disclaimer.
> > > + *     * Redistributions in binary form must reproduce the above copyright
> > > + *       notice, this list of conditions and the following disclaimer in
> > > + *       the documentation and/or other materials provided with the
> > > + *       distribution.
> > > + *     * Neither the name of Intel Corporation nor the names of its
> > > + *       contributors may be used to endorse or promote products derived
> > > + *       from this software without specific prior written permission.
> > > + *
> > > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > > + */
> > > +
> > > +#include "rte_pktdev.h"
> > > +
> > > +/* For future use */
> > > diff --git a/lib/librte_pktdev/rte_pktdev.h b/lib/librte_pktdev/rte_pktdev.h
> > > new file mode 100644
> > > index 0000000..8a5699a
> > > --- /dev/null
> > > +++ b/lib/librte_pktdev/rte_pktdev.h
> > > @@ -0,0 +1,144 @@
> > > +/*-
> > > + *   BSD LICENSE
> > > + *
> > > + *   Copyright(c) 2015 Intel Corporation. All rights reserved.
> > > + *   All rights reserved.
> > > + *
> > > + *   Redistribution and use in source and binary forms, with or without
> > > + *   modification, are permitted provided that the following conditions
> > > + *   are met:
> > > + *
> > > + *     * Redistributions of source code must retain the above copyright
> > > + *       notice, this list of conditions and the following disclaimer.
> > > + *     * Redistributions in binary form must reproduce the above copyright
> > > + *       notice, this list of conditions and the following disclaimer in
> > > + *       the documentation and/or other materials provided with the
> > > + *       distribution.
> > > + *     * Neither the name of Intel Corporation nor the names of its
> > > + *       contributors may be used to endorse or promote products derived
> > > + *       from this software without specific prior written permission.
> > > + *
> > > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > > + */
> > > +
> > > +#ifndef _RTE_PKTDEV_H_
> > > +#define _RTE_PKTDEV_H_
> > > +
> > > +#include
> > > +
> > > +/**
> > > + * @file
> > > + *
> > > + * RTE Packet Processing Device API
> > > + */
> > > +
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +/* forward definition of mbuf structure. We don't need full mbuf header here */
> > > +struct rte_mbuf;
> > > +
> > > +#define RTE_PKT_NAME_MAX_LEN (32)
> > > +
> > > +typedef uint16_t (*pkt_rx_burst_t)(void *rxq,
> > > +		struct rte_mbuf **rx_pkts,
> > > +		uint16_t nb_pkts);
> > > +/**< @internal Retrieve packets from a queue of a device. */
> > > +
> > > +typedef uint16_t (*pkt_tx_burst_t)(void *txq,
> > > +		struct rte_mbuf **tx_pkts,
> > > +		uint16_t nb_pkts);
> > > +/**< @internal Send packets on a queue of a device. */
> > > +
> > > +#define RTE_PKT_DEV_HDR(structname) struct { \
> > > +	pkt_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ \
> > > +	pkt_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ \
> > > +	struct structname ## _data *data; /**< Pointer to device data */ \
> > > +}
> > > +
> > > +#define RTE_PKT_DEV_DATA_HDR struct { \
> > > +	char name[RTE_PKT_NAME_MAX_LEN]; /**< Unique identifier name */ \
> > > +\
> > > +	void **rx_queues; /**< Array of pointers to RX queues. */ \
> > > +	void **tx_queues; /**< Array of pointers to TX queues. */ \
> > > +	uint16_t nb_rx_queues; /**< Number of RX queues. */ \
> > > +	uint16_t nb_tx_queues; /**< Number of TX queues. */ \
> > > +}
> > > +
> > > +struct rte_pkt_dev {
> > > +	RTE_PKT_DEV_HDR(rte_pkt_dev);
> > > +};
> > > +
> > > +struct rte_pkt_dev_data {
> > > +	RTE_PKT_DEV_DATA_HDR;
> > > +};
> > > +
> > > +/**
> > > + *
> > > + * Retrieve a burst of input packets from a receive queue of a
> > > + * device. The retrieved packets are stored in *rte_mbuf* structures whose
> > > + * pointers are supplied in the *rx_pkts* array.
> > > + *
> > > + * @param dev
> > > + *   The device to be polled for packets
> > > + * @param queue_id
> > > + *   The index of the receive queue from which to retrieve input packets.
> > > + * @param rx_pkts
> > > + *   The address of an array of pointers to *rte_mbuf* structures that
> > > + *   must be large enough to store *nb_pkts* pointers in it.
> > > + * @param nb_pkts
> > > + *   The maximum number of packets to retrieve.
> > > + * @return
> > > + *   The number of packets actually retrieved, which is the number
> > > + *   of pointers to *rte_mbuf* structures effectively supplied to the
> > > + *   *rx_pkts* array.
> > > + */
> > > +static inline uint16_t
> > > +rte_pkt_rx_burst(struct rte_pkt_dev *dev, uint16_t queue_id,
> > > +		struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
> > > +{
> > > +	return (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
> > > +			rx_pkts, nb_pkts);
> > > +}
> > > +
> > > +/**
> > > + * Send a burst of output packets on a transmit queue of a device.
> > > + *
> > > + * @param dev
> > > + *   The device to be given the packets.
> > > + * @param queue_id
> > > + *   The index of the queue through which output packets must be sent.
> > > + * @param tx_pkts
> > > + *   The address of an array of *nb_pkts* pointers to *rte_mbuf* structures
> > > + *   which contain the output packets.
> > > + * @param nb_pkts
> > > + *   The maximum number of packets to transmit.
> > > + * @return
> > > + *   The number of output packets actually stored in transmit descriptors of
> > > + *   the transmit ring. The return value can be less than the value of the
> > > + *   *tx_pkts* parameter when the transmit ring is full or has been filled up.
> > > + */
> > > +static inline uint16_t
> > > +rte_pkt_tx_burst(struct rte_pkt_dev *dev, uint16_t queue_id,
> > > +		struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> > > +{
> > > +	return (*dev->tx_pkt_burst)(dev->data->tx_queues[queue_id], tx_pkts, nb_pkts);
> > > +}
> >
> > That one looks much more lightweight than Keith's one :)
> > I have a question here:
> > Why are you guys so confident that all foreseeable devices would fit into the current eth_dev rx_burst/tx_burst API?
> > As I understand, QAT devices have HW request/response ring pairs, so the idea probably is to make tx_burst()
> > populate a request ring and rx_burst() read from a response ring, right?
> > Though right now rte_mbuf contains a lot of flags and data fields that are specific to ethdev, and probably make no sense for a crypto dev.
> > On the other side, for QAT devices, I suppose you'll need crypto-specific flags and data:
> > encrypt/decrypt, source and destination buffer addresses, some cipher-specific data, etc.
> > I wonder whether you plan to fit all that into the current rte_mbuf structure, or to have some superset structure
> > that would consist of rte_mbuf plus some extra stuff, or ... ?
> > Konstantin
> >
> >
>
> From my point of view, the purpose of a pktdev API is to define a common API
> for devices that read/write (i.e. RX/TX) mbufs. If other devices don't fit that
> model, then they don't use it. However, we already have a number of different
> device types that already read/write mbufs, and there are likely more in future,
> so it would be nice to have a common API that can be used across those types so
> the data path can be source-neutral when processing mbufs. So far, we have
> three proposals on how that might be achieved. :-)

OK, I probably misunderstood the previous RFCs about pktdev.
So pktdev is not going to be a generic abstraction for every foreseeable device type: eth, crypto, compress, dpi, etc.?
It would represent only a subset of dev types: real eth devices, plus SW-emulated ones
that can mimic rx_burst/tx_burst but don't support any other eth-specific ops (kni, pmd_ring)?
Basically what we support right now?
And for future device types (crypto, etc.) there would be something else?

>
> As for the crypto, there is still a lot of investigation work to be done here,
> and Keith is probably better placed than me to comment on it. But if you look at any
> application using DPDK for packet processing, the mbuf is still going to be at
> the heart of it, so whatever the actual crypto API is like, it's going to have
> to fit into an "mbuf world" anyway.
> If the crypto-dev APIs don't work
> with mbufs, then it's pushing work onto the app (or higher level libs) to extract
> the meaningful fields from the mbuf to pass to the dev.

As I understand it, the application (or some upper-layer lib) would have to do that work anyway:
RX the packet, find and read the crypto data related to that packet, and pass all that info to the crypto dev, right?
I can hardly imagine that in reality you would always be able to do:
n = rx_burst(eth_dev, pkts, n);
n = tx_burst(crypto_dev, pkts, n);
without any processing of the incoming packets in between.
The only difference is how that information gets passed to the crypto device:
would you need to update an mbuf, or fill in some other structure?

> Therefore, in a DPDK
> library, I would think it better to have the dev work with mbufs directly, as
> the rest of DPDK does.

I wouldn't say that all DPDK libs accept data as mbufs only.
Only those where it makes sense: PMDs, ip_frag/reassembly, pipeline fw.
Take rte_hash and rte_acl: they accept input data in a neutral way.
You wouldn't expect rte_hash_lookup() or rte_acl_classify() to accept pointer(s) to an mbuf.
If you had SW-implemented encrypt()/decrypt() routines, you probably wouldn't require an mbuf as an input parameter either.
Wouldn't it be more plausible to create a new struct rte_crypto_buf (or something like it) to pass information to/from crypto devices?
Then we could give it the ability to attach to/detach from the buf_addr(s) of rte_mbuf(s).
That way we could probably let a crypto_buf attach to 2 mbufs at once (source and destination).
Konstantin

> For the extra data, my suggestion would be to have the
> mbuf store a pointer to a crypto context, rather than trying to store it directly
> in the mbuf itself - there's no way we'd fit a crypto key in there! - and
> the context would be shared among a set of packets rather than per-mbuf.
>
> /Bruce
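
To make the rte_crypto_buf suggestion above a bit more concrete, here is a minimal sketch of what "a crypto buffer that attaches to a source and a destination mbuf" could look like. Everything in it is an assumption made for illustration: the sketch_* names are hypothetical, struct sketch_mbuf is a small stand-in and not the real rte_mbuf, and a one-byte XOR stands in for a real cipher. None of this is an existing DPDK API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the few mbuf fields used here; NOT the real rte_mbuf. */
struct sketch_mbuf {
	void *buf_addr;    /* start of the data buffer */
	uint16_t buf_len;  /* capacity of the buffer in bytes */
	uint16_t data_len; /* bytes of valid data currently in the buffer */
};

enum sketch_crypto_op { SKETCH_ENCRYPT, SKETCH_DECRYPT };

/* Hypothetical crypto buffer: refers to mbuf memory, never copies it. */
struct sketch_crypto_buf {
	enum sketch_crypto_op op;      /* direction of the operation */
	const void *ctx;               /* context (keys etc.), shareable by many packets */
	const struct sketch_mbuf *src; /* attached source mbuf, or NULL */
	struct sketch_mbuf *dst;       /* attached destination mbuf, or NULL */
};

/* Attach the crypto buffer to two mbufs at once: source and destination. */
static void
sketch_crypto_buf_attach(struct sketch_crypto_buf *cb,
		const struct sketch_mbuf *src, struct sketch_mbuf *dst)
{
	assert(src->data_len <= dst->buf_len); /* output must fit */
	cb->src = src;
	cb->dst = dst;
}

/* Drop the references; the mbufs themselves are left untouched. */
static void
sketch_crypto_buf_detach(struct sketch_crypto_buf *cb)
{
	cb->src = NULL;
	cb->dst = NULL;
}

/*
 * Toy "device": XOR every byte with a one-byte key taken from ctx.
 * XOR is symmetric, so op is carried along but not consulted here.
 * Returns the number of bytes written to the destination mbuf.
 */
static uint16_t
sketch_crypto_process(const struct sketch_crypto_buf *cb)
{
	const uint8_t key = *(const uint8_t *)cb->ctx;
	const uint8_t *in = cb->src->buf_addr;
	uint8_t *out = cb->dst->buf_addr;

	for (uint16_t i = 0; i < cb->src->data_len; i++)
		out[i] = (uint8_t)(in[i] ^ key);
	cb->dst->data_len = cb->src->data_len;
	return cb->dst->data_len;
}
```

In this sketch the application fills in the crypto buffer between rx_burst() and the crypto call, so the mbuf layout itself does not have to grow crypto-specific fields, and the ctx pointer can be shared across a set of packets in the way the quoted reply proposes for the crypto context.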