DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Richardson, Bruce" <bruce.richardson@intel.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
Date: Fri, 11 Jul 2014 16:07:13 +0000	[thread overview]
Message-ID: <59AF69C657FD0841A61C55336867B5B0343ACDFF@IRSMSX103.ger.corp.intel.com> (raw)
In-Reply-To: <20140711081623.4c026199@samsung-9>

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Friday, July 11, 2014 8:16 AM
> To: Richardson, Bruce
> Cc: John W. Linville; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] librte_pmd_packet: add PMD for AF_PACKET-
> based virtual devices
> 
> On Fri, 11 Jul 2014 15:06:25 +0000
> "Richardson, Bruce" <bruce.richardson@intel.com> wrote:
> 
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of John W. Linville
> > > Sent: Friday, July 11, 2014 7:49 AM
> > > To: Stephen Hemminger
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH] librte_pmd_packet: add PMD for
> AF_PACKET-
> > > based virtual devices
> > >
> > > On Fri, Jul 11, 2014 at 06:11:47AM -0700, Stephen Hemminger wrote:
> > > > On Thu, 10 Jul 2014 16:32:49 -0400
> > > > "John W. Linville" <linville@tuxdriver.com> wrote:
> > > >
> > > > > This is a Linux-specific virtual PMD driver backed by an AF_PACKET
> > > > > socket.  This implementation uses mmap'ed ring buffers to limit copying
> > > > > and user/kernel transitions.  The PACKET_FANOUT_HASH behavior of
> > > > > AF_PACKET is used for frame reception.  In the current implementation,
> > > > > Tx and Rx queues are always paired, and therefore are always equal
> > > > > in number -- changing this would be a Simple Matter Of Programming.
> > > > >
> > > > > Interfaces of this type are created with a command line option like
> > > > > "--vdev=eth_packet0,iface=...".  There are a number of options availabe
> > > > > as arguments:
> > > > >
> > > > >  - Interface is chosen by "iface" (required)
> > > > >  - Number of queue pairs set by "qpairs" (optional, default: 16)
> > > > >  - AF_PACKET MMAP block size set by "blocksz" (optional, default: 4096)
> > > > >  - AF_PACKET MMAP frame size set by "framesz" (optional, default: 2048)
> > > > >  - AF_PACKET MMAP frame count set by "framecnt" (optional, default:
> 512)
> > > > >
> > > > > Signed-off-by: John W. Linville <linville@tuxdriver.com>
> > > > > ---
> > > > > This PMD is intended to provide a means for using DPDK on a broad
> > > > > range of hardware without hardware-specific PMDs and (hopefully)
> > > > > with better performance than what PCAP offers in Linux.  This might
> > > > > be useful as a development platform for DPDK applications when
> > > > > DPDK-supported hardware is expensive or unavailable.
> > > > >
> > > > >  config/common_bsdapp                   |   5 +
> > > > >  config/common_linuxapp                 |   5 +
> > > > >  lib/Makefile                           |   1 +
> > > > >  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
> > > > >  lib/librte_pmd_packet/Makefile         |  60 +++
> > > > >  lib/librte_pmd_packet/rte_eth_packet.c | 826
> > > +++++++++++++++++++++++++++++++++
> > > > >  lib/librte_pmd_packet/rte_eth_packet.h |  55 +++
> > > > >  mk/rte.app.mk                          |   4 +
> > > > >  8 files changed, 957 insertions(+)
> > > > >  create mode 100644 lib/librte_pmd_packet/Makefile
> > > > >  create mode 100644 lib/librte_pmd_packet/rte_eth_packet.c
> > > > >  create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h
> > > > >
> > > > > diff --git a/config/common_bsdapp b/config/common_bsdapp
> > > > > index 943dce8f1ede..c317f031278e 100644
> > > > > --- a/config/common_bsdapp
> > > > > +++ b/config/common_bsdapp
> > > > > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y
> > > > >  CONFIG_RTE_LIBRTE_PMD_BOND=y
> > > > >
> > > > >  #
> > > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)
> > > > > +#
> > > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=n
> > > > > +
> > > > > +#
> > > > >  # Do prefetch of packet data within PMD driver receive function
> > > > >  #
> > > > >  CONFIG_RTE_PMD_PACKET_PREFETCH=y
> > > > > diff --git a/config/common_linuxapp b/config/common_linuxapp
> > > > > index 7bf5d80d4e26..f9e7bc3015ec 100644
> > > > > --- a/config/common_linuxapp
> > > > > +++ b/config/common_linuxapp
> > > > > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n
> > > > >  CONFIG_RTE_LIBRTE_PMD_BOND=y
> > > > >
> > > > >  #
> > > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)
> > > > > +#
> > > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=y
> > > > > +
> > > > > +#
> > > > >  # Compile Xen PMD
> > > > >  #
> > > > >  CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
> > > > > diff --git a/lib/Makefile b/lib/Makefile
> > > > > index 10c5bb3045bc..930fadf29898 100644
> > > > > --- a/lib/Makefile
> > > > > +++ b/lib/Makefile
> > > > > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) +=
> > > librte_pmd_i40e
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap
> > > > > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt
> > > > > diff --git a/lib/librte_eal/linuxapp/eal/Makefile
> > > b/lib/librte_eal/linuxapp/eal/Makefile
> > > > > index 756d6b0c9301..feed24a63272 100644
> > > > > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > > > > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > > > > @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap
> > > > > +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt
> > > > >  CFLAGS += $(WERROR_FLAGS) -O3
> > > > >
> > > > > diff --git a/lib/librte_pmd_packet/Makefile
> > > b/lib/librte_pmd_packet/Makefile
> > > > > new file mode 100644
> > > > > index 000000000000..e1266fb992cd
> > > > > --- /dev/null
> > > > > +++ b/lib/librte_pmd_packet/Makefile
> > > > > @@ -0,0 +1,60 @@
> > > > > +#   BSD LICENSE
> > > > > +#
> > > > > +#   Copyright(c) 2014 John W. Linville <linville@redhat.com>
> > > > > +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > > > +#   Copyright(c) 2014 6WIND S.A.
> > > > > +#   All rights reserved.
> > > > > +#
> > > > > +#   Redistribution and use in source and binary forms, with or without
> > > > > +#   modification, are permitted provided that the following conditions
> > > > > +#   are met:
> > > > > +#
> > > > > +#     * Redistributions of source code must retain the above copyright
> > > > > +#       notice, this list of conditions and the following disclaimer.
> > > > > +#     * Redistributions in binary form must reproduce the above copyright
> > > > > +#       notice, this list of conditions and the following disclaimer in
> > > > > +#       the documentation and/or other materials provided with the
> > > > > +#       distribution.
> > > > > +#     * Neither the name of Intel Corporation nor the names of its
> > > > > +#       contributors may be used to endorse or promote products derived
> > > > > +#       from this software without specific prior written permission.
> > > > > +#
> > > > > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> > > CONTRIBUTORS
> > > > > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
> BUT
> > > NOT
> > > > > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> > > FITNESS FOR
> > > > > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> > > COPYRIGHT
> > > > > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> > > INCIDENTAL,
> > > > > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
> BUT
> > > NOT
> > > > > +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> > > LOSS OF USE,
> > > > > +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
> CAUSED
> > > AND ON ANY
> > > > > +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
> OR
> > > TORT
> > > > > +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> OUT
> > > OF THE USE
> > > > > +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> > > DAMAGE.
> > > > > +
> > > > > +include $(RTE_SDK)/mk/rte.vars.mk
> > > > > +
> > > > > +#
> > > > > +# library name
> > > > > +#
> > > > > +LIB = librte_pmd_packet.a
> > > > > +
> > > > > +CFLAGS += -O3
> > > > > +CFLAGS += $(WERROR_FLAGS)
> > > > > +
> > > > > +#
> > > > > +# all source are stored in SRCS-y
> > > > > +#
> > > > > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c
> > > > > +
> > > > > +#
> > > > > +# Export include files
> > > > > +#
> > > > > +SYMLINK-y-include += rte_eth_packet.h
> > > > > +
> > > > > +# this lib depends upon:
> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf
> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether
> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc
> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs
> > > > > +
> > > > > +include $(RTE_SDK)/mk/rte.lib.mk
> > > > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.c
> > > b/lib/librte_pmd_packet/rte_eth_packet.c
> > > > > new file mode 100644
> > > > > index 000000000000..fceb6258aad6
> > > > > --- /dev/null
> > > > > +++ b/lib/librte_pmd_packet/rte_eth_packet.c
> > > > > @@ -0,0 +1,826 @@
> > > > > +/*-
> > > > > + *   BSD LICENSE
> > > > > + *
> > > > > + *   Copyright(c) 2014 John W. Linville <linville@tuxdriver.com>
> > > > > + *
> > > > > + *   Originally based upon librte_pmd_pcap code:
> > > > > + *
> > > > > + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > > > + *   Copyright(c) 2014 6WIND S.A.
> > > > > + *   All rights reserved.
> > > > > + *
> > > > > + *   Redistribution and use in source and binary forms, with or without
> > > > > + *   modification, are permitted provided that the following conditions
> > > > > + *   are met:
> > > > > + *
> > > > > + *     * Redistributions of source code must retain the above copyright
> > > > > + *       notice, this list of conditions and the following disclaimer.
> > > > > + *     * Redistributions in binary form must reproduce the above
> copyright
> > > > > + *       notice, this list of conditions and the following disclaimer in
> > > > > + *       the documentation and/or other materials provided with the
> > > > > + *       distribution.
> > > > > + *     * Neither the name of Intel Corporation nor the names of its
> > > > > + *       contributors may be used to endorse or promote products derived
> > > > > + *       from this software without specific prior written permission.
> > > > > + *
> > > > > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> > > CONTRIBUTORS
> > > > > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
> BUT
> > > NOT
> > > > > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> > > FITNESS FOR
> > > > > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> > > COPYRIGHT
> > > > > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> > > INCIDENTAL,
> > > > > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
> > > BUT NOT
> > > > > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> > > LOSS OF USE,
> > > > > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
> CAUSED
> > > AND ON ANY
> > > > > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
> OR
> > > TORT
> > > > > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> OUT
> > > OF THE USE
> > > > > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> > > DAMAGE.
> > > > > + */
> > > > > +
> > > > > +#include <rte_mbuf.h>
> > > > > +#include <rte_ethdev.h>
> > > > > +#include <rte_malloc.h>
> > > > > +#include <rte_kvargs.h>
> > > > > +#include <rte_dev.h>
> > > > > +
> > > > > +#include <linux/if_ether.h>
> > > > > +#include <linux/if_packet.h>
> > > > > +#include <arpa/inet.h>
> > > > > +#include <net/if.h>
> > > > > +#include <sys/types.h>
> > > > > +#include <sys/socket.h>
> > > > > +#include <sys/ioctl.h>
> > > > > +#include <sys/mman.h>
> > > > > +#include <unistd.h>
> > > > > +#include <poll.h>
> > > > > +
> > > > > +#include "rte_eth_packet.h"
> > > > > +
> > > > > +#define ETH_PACKET_IFACE_ARG		"iface"
> > > > > +#define ETH_PACKET_NUM_Q_ARG		"qpairs"
> > > > > +#define ETH_PACKET_BLOCKSIZE_ARG	"blocksz"
> > > > > +#define ETH_PACKET_FRAMESIZE_ARG	"framesz"
> > > > > +#define ETH_PACKET_FRAMECOUNT_ARG	"framecnt"
> > > > > +
> > > > > +#define DFLT_BLOCK_SIZE		(1 << 12)
> > > > > +#define DFLT_FRAME_SIZE		(1 << 11)
> > > > > +#define DFLT_FRAME_COUNT	(1 << 9)
> > > > > +
> > > > > +struct pkt_rx_queue {
> > > > > +	int sockfd;
> > > > > +
> > > > > +	struct iovec *rd;
> > > > > +	uint8_t *map;
> > > > > +	unsigned int framecount;
> > > > > +	unsigned int framenum;
> > > > > +
> > > > > +	struct rte_mempool *mb_pool;
> > > > > +
> > > > > +	volatile unsigned long rx_pkts;
> > > > > +	volatile unsigned long err_pkts;
> > > >
> > > > Use of volatile will generate slow code, don't think
> > > > it is necessary, especially when only one CPU can use a queue
> > > > at a time.
> > >
> > > That is a good point, worth checking out.  FWIW, those lines are
> > > boilerplate originally copied from the pcap PMD. :-)
> > >
> >
> >
> > Yes, I agree it's worth checking out if there is a performance impact, but if we
> assume that the stats for RX/TX are possibly going to be read by another core,
> they really should be volatile for correctness.
> 
> Since only one core does update, that is not necessary. add will generate
> valid value. and reader will read a valid value.
> Only if two cpu's are using same queue would it be possible to for two add's
> to collide; but DPDK queue documentation specifically says queue's are not MP
> safe.

AFAIK adds colliding can occur whether volatile or not, unless atomic operations are explicitly used. The volatile would just be a sanity check to ensure the value isn't cached in registers in either read or writer cores, so it's strictly necessary but I also would suspect it to have minimal to no performance impact as the value should be written to memory anyway even without volatile (though there is no guarantee of this), and the additional compiler ordering constraints imposed by volatile, I would hope shouldn't affect things much. 

  reply	other threads:[~2014-07-11 16:06 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-10 20:32 John W. Linville
2014-07-11 13:11 ` Stephen Hemminger
2014-07-11 14:49   ` John W. Linville
2014-07-11 15:06     ` Richardson, Bruce
2014-07-11 15:16       ` Stephen Hemminger
2014-07-11 16:07         ` Richardson, Bruce [this message]
2014-07-11 15:29       ` Venkatesan, Venky
2014-07-11 15:33         ` John W. Linville
2014-07-11 16:29           ` Venkatesan, Venky
2014-07-11 13:26 ` Thomas Monjalon
2014-07-11 14:51   ` John W. Linville
2014-07-11 15:04     ` Thomas Monjalon
2014-07-11 15:30       ` John W. Linville
2014-07-11 16:47         ` Thomas Monjalon
2014-07-11 17:38           ` Richardson, Bruce
2014-07-11 17:41             ` John W. Linville
2014-07-12 11:48           ` Neil Horman
     [not found] ` <D0158A423229094DA7ABF71CF2FA0DA3117D3A23@shsmsx102.ccr.corp.intel.com>
2014-07-11 17:20   ` Zhou, Danny
2014-07-11 17:40     ` John W. Linville
2014-07-11 18:01       ` Zhou, Danny
2014-07-11 18:46         ` John W. Linville
2014-07-12  0:42           ` Zhou, Danny
2014-07-14 13:45             ` John W. Linville
2014-07-11 19:04       ` Zhou, Danny
2014-07-11 19:31         ` John W. Linville
2014-07-11 20:27           ` Zhou, Danny
2014-07-11 20:31             ` Shaw, Jeffrey B
2014-07-11 20:35               ` Zhou, Danny
2014-07-11 20:40                 ` John W. Linville
2014-07-11 22:34       ` Thomas Monjalon
2014-07-14 13:46         ` John W. Linville
2014-07-15 21:27           ` Thomas Monjalon
2014-07-16 12:35             ` Neil Horman
2014-07-16 13:37               ` Thomas Monjalon
2014-07-16 14:07             ` John W. Linville
2014-07-16 14:26               ` Thomas Monjalon
2014-07-16 15:59                 ` Shaw, Jeffrey B
2014-07-11 22:30 ` Thomas Monjalon
2014-07-14 17:53   ` John W. Linville
2014-07-11 22:51 ` Bruce Richardson
2014-07-14 13:48   ` John W. Linville
2014-07-14 17:35     ` John W. Linville
2014-07-14 18:24 ` [dpdk-dev] [PATCH v2] " John W. Linville
2014-07-15  0:15   ` Zhou, Danny
2014-07-15 12:17     ` Neil Horman
2014-07-15 14:01       ` John W. Linville
2014-07-15 15:40         ` Zhou, Danny
2014-07-15 19:08           ` John W. Linville
2014-07-15 20:31         ` Neil Horman
2014-07-15 20:41           ` Zhou, Danny
2014-07-15 15:34       ` Zhou, Danny
2014-09-12 18:05   ` John W. Linville
2014-09-12 18:31     ` Zhou, Danny
2014-09-12 18:54       ` John W. Linville
2014-09-12 20:35         ` Zhou, Danny
2014-09-15 15:09           ` Neil Horman
2014-09-15 15:15             ` John W. Linville
2014-09-15 15:43             ` Zhou, Danny
2014-09-15 16:22               ` Neil Horman
2014-09-15 17:48                 ` John W. Linville
2014-09-15 19:11                   ` Zhou, Danny
2014-09-16 20:16     ` Neil Horman
2014-09-26  9:28       ` Thomas Monjalon
2014-09-26 14:08         ` Neil Horman
2014-09-29 10:05           ` Bruce Richardson
2014-10-08 15:57             ` Thomas Monjalon
2014-10-08 19:14               ` Neil Horman
2014-11-13 10:03                 ` Thomas Monjalon
2014-11-13 11:14                   ` Neil Horman
2014-11-13 11:57                     ` Thomas Monjalon
2014-11-14  0:42                       ` Neil Horman
2014-11-14 14:45                         ` John W. Linville
2014-11-17 15:57                           ` [dpdk-dev] [PATCH v3] librte_pmd_af_packet: " John W. Linville
2014-11-24 16:16                             ` Thomas Monjalon
2014-11-17 11:19                       ` [dpdk-dev] [PATCH v2] librte_pmd_packet: " Neil Horman
2014-11-17 11:22                         ` Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=59AF69C657FD0841A61C55336867B5B0343ACDFF@IRSMSX103.ger.corp.intel.com \
    --to=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).