From: "Richardson, Bruce"
To: Stephen Hemminger
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
Date: Fri, 11 Jul 2014 16:07:13 +0000
Message-ID: <59AF69C657FD0841A61C55336867B5B0343ACDFF@IRSMSX103.ger.corp.intel.com>
In-Reply-To: <20140711081623.4c026199@samsung-9>
References: <1405024369-30058-1-git-send-email-linville@tuxdriver.com>
 <20140711061147.06c12136@samsung-9>
 <20140711144912.GA25478@tuxdriver.com>
 <59AF69C657FD0841A61C55336867B5B0343ACD8B@IRSMSX103.ger.corp.intel.com>
 <20140711081623.4c026199@samsung-9>
List-Id: patches and discussions about DPDK

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Friday, July 11, 2014 8:16 AM
> To: Richardson, Bruce
> Cc: John W. Linville; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
>
> On Fri, 11 Jul 2014 15:06:25 +0000
> "Richardson, Bruce" wrote:
>
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of John W. Linville
> > > Sent: Friday, July 11, 2014 7:49 AM
> > > To: Stephen Hemminger
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
> > >
> > > On Fri, Jul 11, 2014 at 06:11:47AM -0700, Stephen Hemminger wrote:
> > > > On Thu, 10 Jul 2014 16:32:49 -0400
> > > > "John W. Linville" wrote:
> > > >
> > > > > This is a Linux-specific virtual PMD driver backed by an AF_PACKET
> > > > > socket.  This implementation uses mmap'ed ring buffers to limit copying
> > > > > and user/kernel transitions.  The PACKET_FANOUT_HASH behavior of
> > > > > AF_PACKET is used for frame reception.  In the current implementation,
> > > > > Tx and Rx queues are always paired, and therefore are always equal
> > > > > in number -- changing this would be a Simple Matter Of Programming.
> > > > >
> > > > > Interfaces of this type are created with a command line option like
> > > > > "--vdev=eth_packet0,iface=...".
> > > > > There are a number of options available
> > > > > as arguments:
> > > > >
> > > > >  - Interface is chosen by "iface" (required)
> > > > >  - Number of queue pairs set by "qpairs" (optional, default: 16)
> > > > >  - AF_PACKET MMAP block size set by "blocksz" (optional, default: 4096)
> > > > >  - AF_PACKET MMAP frame size set by "framesz" (optional, default: 2048)
> > > > >  - AF_PACKET MMAP frame count set by "framecnt" (optional, default: 512)
> > > > >
> > > > > Signed-off-by: John W. Linville
> > > > > ---
> > > > > This PMD is intended to provide a means for using DPDK on a broad
> > > > > range of hardware without hardware-specific PMDs and (hopefully)
> > > > > with better performance than what PCAP offers in Linux.  This might
> > > > > be useful as a development platform for DPDK applications when
> > > > > DPDK-supported hardware is expensive or unavailable.
> > > > >
> > > > >  config/common_bsdapp                   |   5 +
> > > > >  config/common_linuxapp                 |   5 +
> > > > >  lib/Makefile                           |   1 +
> > > > >  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
> > > > >  lib/librte_pmd_packet/Makefile         |  60 +++
> > > > >  lib/librte_pmd_packet/rte_eth_packet.c | 826 +++++++++++++++++++++++++++++++++
> > > > >  lib/librte_pmd_packet/rte_eth_packet.h |  55 +++
> > > > >  mk/rte.app.mk                          |   4 +
> > > > >  8 files changed, 957 insertions(+)
> > > > >  create mode 100644 lib/librte_pmd_packet/Makefile
> > > > >  create mode 100644 lib/librte_pmd_packet/rte_eth_packet.c
> > > > >  create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h
> > > > >
> > > > > diff --git a/config/common_bsdapp b/config/common_bsdapp
> > > > > index 943dce8f1ede..c317f031278e 100644
> > > > > --- a/config/common_bsdapp
> > > > > +++ b/config/common_bsdapp
> > > > > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y
> > > > >  CONFIG_RTE_LIBRTE_PMD_BOND=y
> > > > >
> > > > >  #
> > > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)
> > > > > +#
> > > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=n
> > > > > +
> > > > > +#
> > > > >  # Do prefetch of packet data within PMD driver receive function
> > > > >  #
> > > > >  CONFIG_RTE_PMD_PACKET_PREFETCH=y
> > > > > diff --git a/config/common_linuxapp b/config/common_linuxapp
> > > > > index 7bf5d80d4e26..f9e7bc3015ec 100644
> > > > > --- a/config/common_linuxapp
> > > > > +++ b/config/common_linuxapp
> > > > > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n
> > > > >  CONFIG_RTE_LIBRTE_PMD_BOND=y
> > > > >
> > > > >  #
> > > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)
> > > > > +#
> > > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=y
> > > > > +
> > > > > +#
> > > > >  # Compile Xen PMD
> > > > >  #
> > > > >  CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
> > > > > diff --git a/lib/Makefile b/lib/Makefile
> > > > > index 10c5bb3045bc..930fadf29898 100644
> > > > > --- a/lib/Makefile
> > > > > +++ b/lib/Makefile
> > > > > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += librte_pmd_i40e
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap
> > > > > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt
> > > > > diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
> > > > > index 756d6b0c9301..feed24a63272 100644
> > > > > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > > > > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > > > > @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap
> > > > > +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt
> > > > >  CFLAGS += $(WERROR_FLAGS) -O3
> > > > >
> > > > > diff --git a/lib/librte_pmd_packet/Makefile b/lib/librte_pmd_packet/Makefile
> > > > > new file mode 100644
> > > > > index 000000000000..e1266fb992cd
> > > > > --- /dev/null
> > > > > +++ b/lib/librte_pmd_packet/Makefile
> > > > > @@ -0,0 +1,60 @@
> > > > > +#   BSD LICENSE
> > > > > +#
> > > > > +#   Copyright(c) 2014 John W. Linville
> > > > > +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > > > +#   Copyright(c) 2014 6WIND S.A.
> > > > > +#   All rights reserved.
> > > > > +#
> > > > > +#   Redistribution and use in source and binary forms, with or without
> > > > > +#   modification, are permitted provided that the following conditions
> > > > > +#   are met:
> > > > > +#
> > > > > +#     * Redistributions of source code must retain the above copyright
> > > > > +#       notice, this list of conditions and the following disclaimer.
> > > > > +#     * Redistributions in binary form must reproduce the above copyright
> > > > > +#       notice, this list of conditions and the following disclaimer in
> > > > > +#       the documentation and/or other materials provided with the
> > > > > +#       distribution.
> > > > > +#     * Neither the name of Intel Corporation nor the names of its
> > > > > +#       contributors may be used to endorse or promote products derived
> > > > > +#       from this software without specific prior written permission.
> > > > > +#
> > > > > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > > > > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > > > > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > > > > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > > > > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > > > > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > > > > +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > > > > +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > > > > +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > > > > +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > > > > +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > > > > +
> > > > > +include $(RTE_SDK)/mk/rte.vars.mk
> > > > > +
> > > > > +#
> > > > > +# library name
> > > > > +#
> > > > > +LIB = librte_pmd_packet.a
> > > > > +
> > > > > +CFLAGS += -O3
> > > > > +CFLAGS += $(WERROR_FLAGS)
> > > > > +
> > > > > +#
> > > > > +# all source are stored in SRCS-y
> > > > > +#
> > > > > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c
> > > > > +
> > > > > +#
> > > > > +# Export include files
> > > > > +#
> > > > > +SYMLINK-y-include += rte_eth_packet.h
> > > > > +
> > > > > +# this lib depends upon:
> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf
> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether
> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc
> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs
> > > > > +
> > > > > +include $(RTE_SDK)/mk/rte.lib.mk
> > > > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.c b/lib/librte_pmd_packet/rte_eth_packet.c
> > > > > new file mode 100644
> > > > > index 000000000000..fceb6258aad6
> > > > > --- /dev/null
> > > > > +++ b/lib/librte_pmd_packet/rte_eth_packet.c
> > > > > @@ -0,0 +1,826 @@
> > > > > +/*-
> > > > > + *   BSD LICENSE
> > > > > + *
> > > > > + *   Copyright(c) 2014 John W. Linville
> > > > > + *
> > > > > + *   Originally based upon librte_pmd_pcap code:
> > > > > + *
> > > > > + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > > > + *   Copyright(c) 2014 6WIND S.A.
> > > > > + *   All rights reserved.
> > > > > + *
> > > > > + *   Redistribution and use in source and binary forms, with or without
> > > > > + *   modification, are permitted provided that the following conditions
> > > > > + *   are met:
> > > > > + *
> > > > > + *     * Redistributions of source code must retain the above copyright
> > > > > + *       notice, this list of conditions and the following disclaimer.
> > > > > + *     * Redistributions in binary form must reproduce the above copyright
> > > > > + *       notice, this list of conditions and the following disclaimer in
> > > > > + *       the documentation and/or other materials provided with the
> > > > > + *       distribution.
> > > > > + *     * Neither the name of Intel Corporation nor the names of its
> > > > > + *       contributors may be used to endorse or promote products derived
> > > > > + *       from this software without specific prior written permission.
> > > > > + *
> > > > > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > > > > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > > > > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > > > > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > > > > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > > > > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > > > > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > > > > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > > > > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > > > > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > > > > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > > > > + */
> > > > > +
> > > > > +#include
> > > > > +#include
> > > > > +#include
> > > > > +#include
> > > > > +#include
> > > > > +
> > > > > +#include
> > > > > +#include
> > > > > +#include
> > > > > +#include
> > > > > +#include
> > > > > +#include
> > > > > +#include
> > > > > +#include
> > > > > +#include
> > > > > +#include
> > > > > +
> > > > > +#include "rte_eth_packet.h"
> > > > > +
> > > > > +#define ETH_PACKET_IFACE_ARG		"iface"
> > > > > +#define ETH_PACKET_NUM_Q_ARG		"qpairs"
> > > > > +#define ETH_PACKET_BLOCKSIZE_ARG	"blocksz"
> > > > > +#define ETH_PACKET_FRAMESIZE_ARG	"framesz"
> > > > > +#define ETH_PACKET_FRAMECOUNT_ARG	"framecnt"
> > > > > +
> > > > > +#define DFLT_BLOCK_SIZE		(1 << 12)
> > > > > +#define DFLT_FRAME_SIZE		(1 << 11)
> > > > > +#define DFLT_FRAME_COUNT	(1 << 9)
> > > > > +
> > > > > +struct pkt_rx_queue {
> > > > > +	int sockfd;
> > > > > +
> > > > > +	struct iovec *rd;
> > > > > +	uint8_t *map;
> > > > > +	unsigned int framecount;
> > > > > +	unsigned int framenum;
> > > > > +
> > > > > +	struct rte_mempool *mb_pool;
> > > > > +
> > > > > +	volatile unsigned long rx_pkts;
> > > > > +	volatile unsigned long err_pkts;
> > > >
> > > > Use of volatile will generate slow code, don't think
> > > > it is necessary, especially when only one CPU can use a queue
> > > > at a time.
> > >
> > > That is a good point, worth checking out.  FWIW, those lines are
> > > boilerplate originally copied from the pcap PMD. :-)
> > >
> >
> > Yes, I agree it's worth checking out if there is a performance impact, but if we
> > assume that the stats for RX/TX are possibly going to be read by another core,
> > they really should be volatile for correctness.
>
> Since only one core does update, that is not necessary. add will generate
> valid value. and reader will read a valid value.
> Only if two cpu's are using same queue would it be possible to for two add's
> to collide; but DPDK queue documentation specifically says queue's are not MP
> safe.

AFAIK adds colliding can occur whether volatile or not, unless atomic
operations are explicitly used. The volatile would just be a sanity check
to ensure the value isn't cached in registers on either the reader or the
writer core, so it's strictly necessary. I would also suspect it to have
minimal to no performance impact, as the value should be written to memory
anyway even without volatile (though there is no guarantee of this), and I
would hope the additional compiler ordering constraints imposed by volatile
don't affect things much.
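
For illustration, one way to get the cross-core visibility discussed above without relying on volatile is C11 relaxed atomics. This is only a minimal sketch under stated assumptions, not code from the patch: the names demo_queue_stats, demo_stats_add, demo_stats_read and demo_run are hypothetical, and it assumes a C11 compiler with <stdatomic.h> and exactly one writer per queue. The writer uses a relaxed load followed by a relaxed store rather than an atomic read-modify-write, so two writers on the same queue could still lose updates, which is the "adds colliding" case that needs explicit atomic operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-queue statistics; NOT the struct from the patch. */
struct demo_queue_stats {
	_Atomic uint64_t rx_pkts;
};

/*
 * Datapath side. Assumes a single writer per queue, so a relaxed load
 * plus a relaxed store is enough; no locked read-modify-write is used.
 * With two writers this would lose updates, volatile or not.
 */
static inline void demo_stats_add(struct demo_queue_stats *s, uint64_t n)
{
	uint64_t cur = atomic_load_explicit(&s->rx_pkts, memory_order_relaxed);

	atomic_store_explicit(&s->rx_pkts, cur + n, memory_order_relaxed);
}

/*
 * Stats side, which may run on another core. The atomic load forces a
 * real memory read, so the compiler cannot keep a stale copy in a register.
 */
static inline uint64_t demo_stats_read(struct demo_queue_stats *s)
{
	return atomic_load_explicit(&s->rx_pkts, memory_order_relaxed);
}

#define DEMO_BURSTS	100000u
#define DEMO_BURST_SIZE	32u

/* Single-threaded exercise of the two helpers; returns the final count. */
uint64_t demo_run(void)
{
	struct demo_queue_stats s;
	unsigned int i;

	atomic_init(&s.rx_pkts, 0);
	assert(demo_stats_read(&s) == 0);

	for (i = 0; i < DEMO_BURSTS; i++)
		demo_stats_add(&s, DEMO_BURST_SIZE);

	return demo_stats_read(&s);
}
```

In a real PMD the receive burst function would call something like demo_stats_add() once per burst, and the stats callback would call demo_stats_read(). Whether this is measurably cheaper or dearer than the volatile fields in the patch would have to be checked on the target compiler and CPU.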