DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Venkatesan, Venky" <venky.venkatesan@intel.com>
To: "Richardson, Bruce" <bruce.richardson@intel.com>,
	"John W. Linville" <linville@tuxdriver.com>,
	Stephen Hemminger <stephen@networkplumber.org>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
Date: Fri, 11 Jul 2014 15:29:17 +0000	[thread overview]
Message-ID: <1FD9B82B8BF2CF418D9A1000154491D974124465@ORSMSX102.amr.corp.intel.com> (raw)
In-Reply-To: <59AF69C657FD0841A61C55336867B5B0343ACD8B@IRSMSX103.ger.corp.intel.com>

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of John W. Linville
> Sent: Friday, July 11, 2014 7:49 AM
> To: Stephen Hemminger
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] librte_pmd_packet: add PMD for 
> AF_PACKET- based virtual devices
> 
> On Fri, Jul 11, 2014 at 06:11:47AM -0700, Stephen Hemminger wrote:
> > On Thu, 10 Jul 2014 16:32:49 -0400
> > "John W. Linville" <linville@tuxdriver.com> wrote:
> >
> > > This is a Linux-specific virtual PMD driver backed by an AF_PACKET 
> > > socket.  This implementation uses mmap'ed ring buffers to limit 
> > > copying and user/kernel transitions.  The PACKET_FANOUT_HASH 
> > > behavior of AF_PACKET is used for frame reception.  In the current 
> > > implementation, Tx and Rx queues are always paired, and therefore 
> > > are always equal in number -- changing this would be a Simple Matter Of Programming.
> > >
> > > Interfaces of this type are created with a command line option 
> > > like "--vdev=eth_packet0,iface=...".  There are a number of 
> > > options availabe as arguments:
> > >
> > >  - Interface is chosen by "iface" (required)
> > >  - Number of queue pairs set by "qpairs" (optional, default: 16)
> > >  - AF_PACKET MMAP block size set by "blocksz" (optional, default: 
> > > 4096)
> > >  - AF_PACKET MMAP frame size set by "framesz" (optional, default: 
> > > 2048)
> > >  - AF_PACKET MMAP frame count set by "framecnt" (optional, 
> > > default: 512)
> > >
> > > Signed-off-by: John W. Linville <linville@tuxdriver.com>
> > > ---
> > > This PMD is intended to provide a means for using DPDK on a broad 
> > > range of hardware without hardware-specific PMDs and (hopefully) 
> > > with better performance than what PCAP offers in Linux.  This 
> > > might be useful as a development platform for DPDK applications 
> > > when DPDK-supported hardware is expensive or unavailable.
> > >
> > >  config/common_bsdapp                   |   5 +
> > >  config/common_linuxapp                 |   5 +
> > >  lib/Makefile                           |   1 +
> > >  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
> > >  lib/librte_pmd_packet/Makefile         |  60 +++
> > >  lib/librte_pmd_packet/rte_eth_packet.c | 826
> +++++++++++++++++++++++++++++++++
> > >  lib/librte_pmd_packet/rte_eth_packet.h |  55 +++
> > >  mk/rte.app.mk                          |   4 +
> > >  8 files changed, 957 insertions(+)  create mode 100644 
> > > lib/librte_pmd_packet/Makefile  create mode 100644 
> > > lib/librte_pmd_packet/rte_eth_packet.c
> > >  create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h
> > >
> > > diff --git a/config/common_bsdapp b/config/common_bsdapp index 
> > > 943dce8f1ede..c317f031278e 100644
> > > --- a/config/common_bsdapp
> > > +++ b/config/common_bsdapp
> > > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y  
> > > CONFIG_RTE_LIBRTE_PMD_BOND=y
> > >
> > >  #
> > > +# Compile software PMD backed by AF_PACKET sockets (Linux only) # 
> > > +CONFIG_RTE_LIBRTE_PMD_PACKET=n
> > > +
> > > +#
> > >  # Do prefetch of packet data within PMD driver receive function  
> > > #  CONFIG_RTE_PMD_PACKET_PREFETCH=y diff --git 
> > > a/config/common_linuxapp b/config/common_linuxapp index 
> > > 7bf5d80d4e26..f9e7bc3015ec 100644
> > > --- a/config/common_linuxapp
> > > +++ b/config/common_linuxapp
> > > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n  
> > > CONFIG_RTE_LIBRTE_PMD_BOND=y
> > >
> > >  #
> > > +# Compile software PMD backed by AF_PACKET sockets (Linux only) # 
> > > +CONFIG_RTE_LIBRTE_PMD_PACKET=y
> > > +
> > > +#
> > >  # Compile Xen PMD
> > >  #
> > >  CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
> > > diff --git a/lib/Makefile b/lib/Makefile index 
> > > 10c5bb3045bc..930fadf29898 100644
> > > --- a/lib/Makefile
> > > +++ b/lib/Makefile
> > > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) +=
> librte_pmd_i40e
> > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond
> > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring
> > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap
> > > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet
> > >  DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio
> > >  DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
> > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt diff 
> > > --git a/lib/librte_eal/linuxapp/eal/Makefile
> b/lib/librte_eal/linuxapp/eal/Makefile
> > > index 756d6b0c9301..feed24a63272 100644
> > > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > > @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether  CFLAGS 
> > > += -I$(RTE_SDK)/lib/librte_ivshmem  CFLAGS += 
> > > -I$(RTE_SDK)/lib/librte_pmd_ring  CFLAGS += 
> > > -I$(RTE_SDK)/lib/librte_pmd_pcap
> > > +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt
> > >  CFLAGS += $(WERROR_FLAGS) -O3
> > >
> > > diff --git a/lib/librte_pmd_packet/Makefile
> b/lib/librte_pmd_packet/Makefile
> > > new file mode 100644
> > > index 000000000000..e1266fb992cd
> > > --- /dev/null
> > > +++ b/lib/librte_pmd_packet/Makefile
> > > @@ -0,0 +1,60 @@
> > > +#   BSD LICENSE
> > > +#
> > > +#   Copyright(c) 2014 John W. Linville <linville@redhat.com>
> > > +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > +#   Copyright(c) 2014 6WIND S.A.
> > > +#   All rights reserved.
> > > +#
> > > +#   Redistribution and use in source and binary forms, with or without
> > > +#   modification, are permitted provided that the following conditions
> > > +#   are met:
> > > +#
> > > +#     * Redistributions of source code must retain the above copyright
> > > +#       notice, this list of conditions and the following disclaimer.
> > > +#     * Redistributions in binary form must reproduce the above copyright
> > > +#       notice, this list of conditions and the following disclaimer in
> > > +#       the documentation and/or other materials provided with the
> > > +#       distribution.
> > > +#     * Neither the name of Intel Corporation nor the names of its
> > > +#       contributors may be used to endorse or promote products derived
> > > +#       from this software without specific prior written permission.
> > > +#
> > > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> > > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> > > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> > > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> > > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> > > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> > > +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> LOSS OF USE,
> > > +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> > > +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> > > +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
> OF THE USE
> > > +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> > > +
> > > +include $(RTE_SDK)/mk/rte.vars.mk
> > > +
> > > +#
> > > +# library name
> > > +#
> > > +LIB = librte_pmd_packet.a
> > > +
> > > +CFLAGS += -O3
> > > +CFLAGS += $(WERROR_FLAGS)
> > > +
> > > +#
> > > +# all source are stored in SRCS-y #
> > > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c
> > > +
> > > +#
> > > +# Export include files
> > > +#
> > > +SYMLINK-y-include += rte_eth_packet.h
> > > +
> > > +# this lib depends upon:
> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf
> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether
> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc
> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs
> > > +
> > > +include $(RTE_SDK)/mk/rte.lib.mk
> > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.c
> b/lib/librte_pmd_packet/rte_eth_packet.c
> > > new file mode 100644
> > > index 000000000000..fceb6258aad6
> > > --- /dev/null
> > > +++ b/lib/librte_pmd_packet/rte_eth_packet.c
> > > @@ -0,0 +1,826 @@
> > > +/*-
> > > + *   BSD LICENSE
> > > + *
> > > + *   Copyright(c) 2014 John W. Linville <linville@tuxdriver.com>
> > > + *
> > > + *   Originally based upon librte_pmd_pcap code:
> > > + *
> > > + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > + *   Copyright(c) 2014 6WIND S.A.
> > > + *   All rights reserved.
> > > + *
> > > + *   Redistribution and use in source and binary forms, with or without
> > > + *   modification, are permitted provided that the following conditions
> > > + *   are met:
> > > + *
> > > + *     * Redistributions of source code must retain the above copyright
> > > + *       notice, this list of conditions and the following disclaimer.
> > > + *     * Redistributions in binary form must reproduce the above copyright
> > > + *       notice, this list of conditions and the following disclaimer in
> > > + *       the documentation and/or other materials provided with the
> > > + *       distribution.
> > > + *     * Neither the name of Intel Corporation nor the names of its
> > > + *       contributors may be used to endorse or promote products derived
> > > + *       from this software without specific prior written permission.
> > > + *
> > > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> > > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> > > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> > > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> > > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> > > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
> BUT NOT
> > > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> LOSS OF USE,
> > > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> > > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> > > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
> OF THE USE
> > > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> > > + */
> > > +
> > > +#include <rte_mbuf.h>
> > > +#include <rte_ethdev.h>
> > > +#include <rte_malloc.h>
> > > +#include <rte_kvargs.h>
> > > +#include <rte_dev.h>
> > > +
> > > +#include <linux/if_ether.h>
> > > +#include <linux/if_packet.h>
> > > +#include <arpa/inet.h>
> > > +#include <net/if.h>
> > > +#include <sys/types.h>
> > > +#include <sys/socket.h>
> > > +#include <sys/ioctl.h>
> > > +#include <sys/mman.h>
> > > +#include <unistd.h>
> > > +#include <poll.h>
> > > +
> > > +#include "rte_eth_packet.h"
> > > +
> > > +#define ETH_PACKET_IFACE_ARG		"iface"
> > > +#define ETH_PACKET_NUM_Q_ARG		"qpairs"
> > > +#define ETH_PACKET_BLOCKSIZE_ARG	"blocksz"
> > > +#define ETH_PACKET_FRAMESIZE_ARG	"framesz"
> > > +#define ETH_PACKET_FRAMECOUNT_ARG	"framecnt"
> > > +
> > > +#define DFLT_BLOCK_SIZE		(1 << 12)
> > > +#define DFLT_FRAME_SIZE		(1 << 11)
> > > +#define DFLT_FRAME_COUNT	(1 << 9)
> > > +
> > > +struct pkt_rx_queue {
> > > +	int sockfd;
> > > +
> > > +	struct iovec *rd;
> > > +	uint8_t *map;
> > > +	unsigned int framecount;
> > > +	unsigned int framenum;
> > > +
> > > +	struct rte_mempool *mb_pool;
> > > +
> > > +	volatile unsigned long rx_pkts;
> > > +	volatile unsigned long err_pkts;
> >
> > Use of volatile will generate slow code, don't think it is 
> > necessary, especially when only one CPU can use a queue at a time.
> 
> That is a good point, worth checking out.  FWIW, those lines are 
> boilerplate originally copied from the pcap PMD. :-)
> 

> Yes, I agree it's worth checking out if there is a performance impact, but if we assume that the stats for RX/TX are possibly going to be read by another core, they really should be volatile for correctness

Accessing the rx_queue structure directly for stats is unlikely to happen from a second core; we should probably change the PCAP PMD as well (thanks for pointing that out John). 

  parent reply	other threads:[~2014-07-11 15:28 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-10 20:32 John W. Linville
2014-07-11 13:11 ` Stephen Hemminger
2014-07-11 14:49   ` John W. Linville
2014-07-11 15:06     ` Richardson, Bruce
2014-07-11 15:16       ` Stephen Hemminger
2014-07-11 16:07         ` Richardson, Bruce
2014-07-11 15:29       ` Venkatesan, Venky [this message]
2014-07-11 15:33         ` John W. Linville
2014-07-11 16:29           ` Venkatesan, Venky
2014-07-11 13:26 ` Thomas Monjalon
2014-07-11 14:51   ` John W. Linville
2014-07-11 15:04     ` Thomas Monjalon
2014-07-11 15:30       ` John W. Linville
2014-07-11 16:47         ` Thomas Monjalon
2014-07-11 17:38           ` Richardson, Bruce
2014-07-11 17:41             ` John W. Linville
2014-07-12 11:48           ` Neil Horman
     [not found] ` <D0158A423229094DA7ABF71CF2FA0DA3117D3A23@shsmsx102.ccr.corp.intel.com>
2014-07-11 17:20   ` Zhou, Danny
2014-07-11 17:40     ` John W. Linville
2014-07-11 18:01       ` Zhou, Danny
2014-07-11 18:46         ` John W. Linville
2014-07-12  0:42           ` Zhou, Danny
2014-07-14 13:45             ` John W. Linville
2014-07-11 19:04       ` Zhou, Danny
2014-07-11 19:31         ` John W. Linville
2014-07-11 20:27           ` Zhou, Danny
2014-07-11 20:31             ` Shaw, Jeffrey B
2014-07-11 20:35               ` Zhou, Danny
2014-07-11 20:40                 ` John W. Linville
2014-07-11 22:34       ` Thomas Monjalon
2014-07-14 13:46         ` John W. Linville
2014-07-15 21:27           ` Thomas Monjalon
2014-07-16 12:35             ` Neil Horman
2014-07-16 13:37               ` Thomas Monjalon
2014-07-16 14:07             ` John W. Linville
2014-07-16 14:26               ` Thomas Monjalon
2014-07-16 15:59                 ` Shaw, Jeffrey B
2014-07-11 22:30 ` Thomas Monjalon
2014-07-14 17:53   ` John W. Linville
2014-07-11 22:51 ` Bruce Richardson
2014-07-14 13:48   ` John W. Linville
2014-07-14 17:35     ` John W. Linville
2014-07-14 18:24 ` [dpdk-dev] [PATCH v2] " John W. Linville
2014-07-15  0:15   ` Zhou, Danny
2014-07-15 12:17     ` Neil Horman
2014-07-15 14:01       ` John W. Linville
2014-07-15 15:40         ` Zhou, Danny
2014-07-15 19:08           ` John W. Linville
2014-07-15 20:31         ` Neil Horman
2014-07-15 20:41           ` Zhou, Danny
2014-07-15 15:34       ` Zhou, Danny
2014-09-12 18:05   ` John W. Linville
2014-09-12 18:31     ` Zhou, Danny
2014-09-12 18:54       ` John W. Linville
2014-09-12 20:35         ` Zhou, Danny
2014-09-15 15:09           ` Neil Horman
2014-09-15 15:15             ` John W. Linville
2014-09-15 15:43             ` Zhou, Danny
2014-09-15 16:22               ` Neil Horman
2014-09-15 17:48                 ` John W. Linville
2014-09-15 19:11                   ` Zhou, Danny
2014-09-16 20:16     ` Neil Horman
2014-09-26  9:28       ` Thomas Monjalon
2014-09-26 14:08         ` Neil Horman
2014-09-29 10:05           ` Bruce Richardson
2014-10-08 15:57             ` Thomas Monjalon
2014-10-08 19:14               ` Neil Horman
2014-11-13 10:03                 ` Thomas Monjalon
2014-11-13 11:14                   ` Neil Horman
2014-11-13 11:57                     ` Thomas Monjalon
2014-11-14  0:42                       ` Neil Horman
2014-11-14 14:45                         ` John W. Linville
2014-11-17 15:57                           ` [dpdk-dev] [PATCH v3] librte_pmd_af_packet: " John W. Linville
2014-11-24 16:16                             ` Thomas Monjalon
2014-11-17 11:19                       ` [dpdk-dev] [PATCH v2] librte_pmd_packet: " Neil Horman
2014-11-17 11:22                         ` Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1FD9B82B8BF2CF418D9A1000154491D974124465@ORSMSX102.amr.corp.intel.com \
    --to=venky.venkatesan@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=linville@tuxdriver.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).