From: William Tu <u9012063@gmail.com>
To: "Zhang, Qi Z" <qi.z.zhang@intel.com>
Cc: dev@dpdk.org, "Karlsson, Magnus" <magnus.karlsson@intel.com>,
"Björn Töpel" <bjorn.topel@intel.com>,
jingjing.wu@intel.com, xiaoyun.li@intel.com,
ferruh.yigit@intel.com
Subject: Re: [dpdk-dev] [PATCH v3 0/6] PMD driver for AF_XDP
Date: Thu, 23 Aug 2018 09:25:07 -0700 [thread overview]
Message-ID: <CALDO+SbVLosAARoHpQTtApj6m+dZ6mO4ZDVJ_VkWBuqgb4uhug@mail.gmail.com> (raw)
In-Reply-To: <20180816144321.17719-1-qi.z.zhang@intel.com>
Hi Zhang Qi,
I'm not familiar with the DPDK code, but I'm curious about the
benefits of using the AF_XDP PMD. Specifically, I have a couple of questions:
1) With zero-copy driver support, is the AF_XDP PMD expected to have
performance similar to other PMDs? Since AF_XDP still uses the
native device driver, isn't the interrupt still there, so it is not
really "poll-mode" anymore?
2) Does the patch set expect the user to customize the eBPF/XDP code,
so that this becomes another way to extend the DPDK datapath?
Thank you
William
On Thu, Aug 16, 2018 at 7:42 AM Qi Zhang <qi.z.zhang@intel.com> wrote:
>
> Overview
> ========
>
> The patch set adds a new PMD driver for AF_XDP, which is a proposed
> faster version of the AF_PACKET interface in Linux. See the links below
> for a detailed introduction to AF_XDP:
> https://lwn.net/Articles/750845/
> https://fosdem.org/2018/schedule/event/af_xdp/
>
> AF_XDP roadmap
> ==============
> - Kernel 4.18 is out and AF_XDP is included.
> https://kernelnewbies.org/Linux_4.18
> - So far no driver with zero-copy support has been merged, but some are
> on the way.
>
> Change logs
> ===========
>
> v3:
> - Re-worked based on AF_XDP's interface changes.
> - Support multiple queues: each dpdk queue has its own xdp socket.
> An xdp socket is always bound to a netdev queue.
> We assume all xdp sockets from the same ethdev are bound to the
> same netdev queue, though a netdev queue can still be bound to
> xdp sockets from different ethdev instances.
> Below is an example of the mapping.
> ------------------------------------------------------
> | dpdk q0 | dpdk q1 | dpdk q0 | dpdk q0 | dpdk q1 |
> ------------------------------------------------------
> | xsk A | xsk B | xsk C | xsk D | xsk E |<---|
> ------------------------------------------------------ |
> | ETHDEV 0 | ETHDEV 1 | ETHDEV 2 | | DPDK
> ------------------------------------------------------------------
> | netdev queue 0 | netdev queue 1 | | KERNEL
> ------------------------------------------------------ |
> | NETDEV eth0 | |
> ------------------------------------------------------ |
> | key xsk | |
> | ---------- -------------- | |
> | | | | 0 | xsk A | | |
> | | | -------------- | |
> | | | | 2 | xsk B | | |
> | | ebpf | ---------------------------------------
> | | | | 3 | xsk C | |
> | | redirect ->|-------------- |
> | | | | 4 | xsk D | |
> | | | -------------- |
> | |---------| | 5 | xsk E | |
> | -------------- |
> |-----------------------------------------------------
>
> - It is an open question how to load the eBPF program into the kernel and
> link it to a specific netdev from DPDK: should this be part of the PMD, or
> should it be handled by an independent tool? This patch set takes the
> second option: there is a "bind" stage before the AF_XDP PMD is started,
> which includes the steps below:
> a) Load the eBPF program into the kernel (the program must contain the
> logic to redirect packets to an xdp socket based on a redirect map).
> b) Link the eBPF program to a specific network interface.
> c) Expose the xdp socket redirect map id and number of entries to the
> user; these are passed to the PMD, and the PMD will create an xdp
> socket for each queue and update the redirect map accordingly.
> (example: --vdev,iface=eth0,xsk_map_id=53,xsk_map_key_base=0,xsk_map_key_count=4)
>
> v2:
> - fix license header
> - clean up bpf dependency; the bpf program is embedded, no "xdpsock_kern.o"
> required
> - clean up makefile, only linux_header is required
> - fix all compile warnings
> - fix the returned packet count in Tx
>
> How to try
> ==========
>
> 1. Take kernel v4.18 and make sure XDP sockets are turned on when compiling:
> Networking support -->
> Networking options -->
> [ * ] XDP sockets
> 2. In the kernel source tree, apply the patch below and compile the bpf
> sample code:
> #make samples/bpf/
> The sample xdpsock can then be used as a bind/unbind tool for the AF_XDP
> PMD. Sorry for this ugliness; in the future there could be a dedicated
> tool in DPDK, if we agree that bpf configuration in the kernel should be
> separated from the PMD.
>
> ~~~~~~~~~~~~~~~~~~~~~~~PATCH START~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
> index d69c8d78d3fd..44a6318043e7 100644
> --- a/samples/bpf/xdpsock_user.c
> +++ b/samples/bpf/xdpsock_user.c
> @@ -76,6 +76,8 @@ static int opt_poll;
> static int opt_shared_packet_buffer;
> static int opt_interval = 1;
> static u32 opt_xdp_bind_flags;
> +static int opt_bind;
> +static int opt_unbind;
>
> struct xdp_umem_uqueue {
> u32 cached_prod;
> @@ -662,6 +664,8 @@ static void usage(const char *prog)
> " -S, --xdp-skb=n Use XDP skb-mod\n"
> " -N, --xdp-native=n Enfore XDP native mode\n"
> " -n, --interval=n Specify statistics update interval (default 1 sec).\n"
> + " -b, --bind Bind only.\n"
> + " -u, --unbind Unbind only.\n"
> "\n";
> fprintf(stderr, str, prog);
> exit(EXIT_FAILURE);
> @@ -674,7 +678,7 @@ static void parse_command_line(int argc, char **argv)
> opterr = 0;
>
> for (;;) {
> - c = getopt_long(argc, argv, "rtli:q:psSNn:", long_options,
> + c = getopt_long(argc, argv, "rtli:q:psSNn:bu", long_options,
> &option_index);
> if (c == -1)
> break;
> @@ -711,6 +715,12 @@ static void parse_command_line(int argc, char **argv)
> case 'n':
> opt_interval = atoi(optarg);
> break;
> + case 'b':
> + opt_bind = 1;
> + break;
> + case 'u':
> + opt_unbind = 1;
> + break;
> default:
> usage(basename(argv[0]));
> }
> @@ -898,6 +908,12 @@ int main(int argc, char **argv)
> exit(EXIT_FAILURE);
> }
>
> + if (opt_unbind) {
> + bpf_set_link_xdp_fd(opt_ifindex, -1, opt_xdp_flags);
>
> ~~~~~~~~~~~~~~~~~~~~~~~PATCH END~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> 3. Bind:
> #./samples/bpf/xdpsock -i eth0 -b
>
> In this step, an eBPF binary xdpsock_kern.o is loaded into the kernel
> and linked to eth0. The eBPF source code is samples/bpf/xdpsock_kern.c;
> you can modify it and re-compile for a different test.
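For reference, the core redirect logic of such a program looks, in simplified form, roughly like the sketch below (not compiled here; assumes kernel v4.18 bpf headers and helpers, and omits details of the real xdpsock_kern.c):

```
/* Sketch only: an XSKMAP plus an XDP program that redirects each
 * packet to the xsk installed at the rx queue's map key. */
struct bpf_map_def SEC("maps") xsks_map = {
	.type        = BPF_MAP_TYPE_XSKMAP,
	.key_size    = sizeof(int),
	.value_size  = sizeof(int),
	.max_entries = 4,
};

SEC("xdp_sock")
int xdp_sock_prog(struct xdp_md *ctx)
{
	int index = ctx->rx_queue_index;

	/* Redirect to the xsk bound at map key 'index'. */
	return bpf_redirect_map(&xsks_map, index, 0);
}
```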
>
> 4. Dump the xdp socket map information:
> #./tools/bpf/bpftool/bpftool map -p
> You will see something like below:
>
> },{
> "id": 56,
> "type": "xskmap",
> "name": "xsks_map",
> "flags": 0,
> "bytes_key": 4,
> "bytes_value": 4,
> "max_entries": 4,
> "bytes_memlock": 4096
> }
>
> In this case, 56 is the map id and the map has 4 entries.
>
> 5. Start testpmd:
>
> ./build/app/testpmd -c 0xc -n 4 --vdev eth_af_xdp,iface=enp59s0f0,xsk_map_id=56,xsk_map_key_start=2,xsk_map_key_count=2 -- -i --rxq=2 --txq=2
>
> In this case, we reserved 2 entries (2 and 3) in the map; they will be mapped to queue 0 and queue 1.
>
> 6. Unbind after the test:
> ./samples/bpf/xdpsock -i eth0 -u
>
> Performance
> ===========
> Since no zero-copy driver is ready yet, this has so far only been tested
> with DRV and SKB mode on an i40e 25G NIC. The results are identical to
> those of the kernel sample "xdpsock".
>
> Qi Zhang (6):
> net/af_xdp: new PMD driver
> lib/mbuf: enable parse flags when create mempool
> lib/mempool: allow page size aligned mempool
> net/af_xdp: use mbuf mempool for buffer management
> net/af_xdp: enable zero copy
> app/testpmd: add mempool flags parameter
>
> app/test-pmd/parameters.c | 12 +
> app/test-pmd/testpmd.c | 15 +-
> app/test-pmd/testpmd.h | 1 +
> config/common_base | 5 +
> config/common_linuxapp | 1 +
> drivers/net/Makefile | 1 +
> drivers/net/af_xdp/Makefile | 30 +
> drivers/net/af_xdp/meson.build | 7 +
> drivers/net/af_xdp/rte_eth_af_xdp.c | 1345 +++++++++++++++++++++++++
> drivers/net/af_xdp/rte_pmd_af_xdp_version.map | 4 +
> lib/librte_mbuf/rte_mbuf.c | 15 +-
> lib/librte_mbuf/rte_mbuf.h | 8 +-
> lib/librte_mempool/rte_mempool.c | 3 +
> lib/librte_mempool/rte_mempool.h | 1 +
> mk/rte.app.mk | 1 +
> 15 files changed, 1439 insertions(+), 10 deletions(-)
> create mode 100644 drivers/net/af_xdp/Makefile
> create mode 100644 drivers/net/af_xdp/meson.build
> create mode 100644 drivers/net/af_xdp/rte_eth_af_xdp.c
> create mode 100644 drivers/net/af_xdp/rte_pmd_af_xdp_version.map
>
> --
> 2.13.6
>
Thread overview: 11+ messages
2018-08-16 14:43 Qi Zhang
2018-08-16 14:43 ` [dpdk-dev] [RFC v3 1/6] net/af_xdp: new PMD driver Qi Zhang
2018-08-16 14:43 ` [dpdk-dev] [RFC v3 2/6] lib/mbuf: enable parse flags when create mempool Qi Zhang
2018-08-16 14:43 ` [dpdk-dev] [RFC v3 3/6] lib/mempool: allow page size aligned mempool Qi Zhang
2018-08-19 6:56 ` Jerin Jacob
2018-08-16 14:43 ` [dpdk-dev] [RFC v3 4/6] net/af_xdp: use mbuf mempool for buffer management Qi Zhang
2018-08-16 14:43 ` [dpdk-dev] [RFC v3 5/6] net/af_xdp: enable zero copy Qi Zhang
2018-08-16 14:43 ` [dpdk-dev] [RFC v3 6/6] app/testpmd: add mempool flags parameter Qi Zhang
2018-08-23 16:25 ` William Tu [this message]
2018-08-28 14:11 ` [dpdk-dev] [PATCH v3 0/6] PMD driver for AF_XDP Zhang, Qi Z
2018-08-25 6:11 ` Zhang, Qi Z