From: Qi Zhang
To: dev@dpdk.org
Cc: magnus.karlsson@intel.com, bjorn.topel@intel.com,
    jingjing.wu@intel.com, xiaoyun.li@intel.com, ferruh.yigit@intel.com,
    Qi Zhang
Date: Thu, 16 Aug 2018 22:43:15 +0800
Message-Id: <20180816144321.17719-1-qi.z.zhang@intel.com>
Subject: [dpdk-dev] [PATCH v3 0/6] PMD driver for AF_XDP
List-Id: DPDK patches and discussions

Overview
========

This patch set adds a new PMD driver for AF_XDP, which is a proposed
faster version of the AF_PACKET interface in Linux. See the links below
for an introduction to AF_XDP:
https://lwn.net/Articles/750845/
https://fosdem.org/2018/schedule/event/af_xdp/

AF_XDP roadmap
==============
- Kernel 4.18 is out and AF_XDP is included.
  https://kernelnewbies.org/Linux_4.18
- So far no driver with zero-copy support has been merged, but some are
  on the way.

Change logs
===========

v3:
- Rework based on AF_XDP's interface changes.
- Support multiple queues; each dpdk queue has its own xdp socket.
  An xdp socket is always bound to a netdev queue.
  We assume all xdp sockets from the same ethdev are bound to the same
  netdev queue, though a netdev queue can still be bound by xdp sockets
  from different ethdev instances.
  Below is an example of the mapping.

    ---------------------------------------------------
    | dpdk q0 | dpdk q1 | dpdk q0 | dpdk q0 | dpdk q1 |
    ---------------------------------------------------
    |  xsk A  |  xsk B  |  xsk C  |  xsk D  |  xsk E  |<---|
    ---------------------------------------------------    |
    |     ETHDEV 0      | ETHDEV 1|     ETHDEV 2      |    |   DPDK
    ------------------------------------------------------------------
    |       netdev queue 0        |  netdev queue 1   |    |   KERNEL
    ---------------------------------------------------    |
    |                   NETDEV eth0                   |    |
    ---------------------------------------------------    |
    |                     key    xsk                  |    |
    |  --------------   ---------------               |    |
    |  |            |   |  0  | xsk A |               |    |
    |  |            |   ---------------               |    |
    |  |            |   |  2  | xsk B |               |    |
    |  |    ebpf    |   ------------------------------------
    |  |            |   |  3  | xsk C |               |
    |  | redirect ->|   ---------------               |
    |  |            |   |  4  | xsk D |               |
    |  |            |   ---------------               |
    |  --------------   |  5  | xsk E |               |
    |                   ---------------               |
    ---------------------------------------------------

- It is an open question how to load the ebpf program into the kernel
  and link it to a specific netdev in DPDK: should it be part of the
  PMD, or should it be handled by an independent tool? This patchset
  takes the second option: there is a "bind" stage before we start the
  AF_XDP PMD, which includes the steps below:
  a) load the ebpf program into the kernel (the ebpf program must
     contain the logic to redirect packets to an xdp socket based on a
     redirect map).
  b) link the ebpf program to a specific network interface.
  c) expose the xdp socket redirect map id and number of entries to the
     user, so they can be passed to the PMD, and the PMD will create an
     xdp socket for each queue and update the redirect map correctly.
     (example:
     --vdev,iface=eth0,xsk_map_id=53,xsk_map_key_base=0,xsk_map_key_count=4)

v2:
- fix license header
- clean up bpf dependency; the bpf program is embedded, no
  "xdpsock_kern.o" required
- clean up makefile, only linux_header is required
- fix all compile warnings.
- fix packet number returned in Tx.

How to try
==========

1. Take kernel v4.18.
   Make sure you turn on XDP sockets when compiling:
       Networking support -->
            Networking options -->
                    [ * ] XDP sockets

2. In the kernel source code, apply the patch below and compile the bpf
   sample code:
       #make samples/bpf/
   so the sample xdpsock can be used as a bind/unbind tool for the
   af_xdp PMD. Sorry for this ugly workaround; in the future there could
   be a dedicated tool in DPDK, if we agree with the idea that bpf
   configuration in the kernel should be separated from the PMD.

~~~~~~~~~~~~~~~~~~~~~~~PATCH START~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index d69c8d78d3fd..44a6318043e7 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -76,6 +76,8 @@ static int opt_poll;
 static int opt_shared_packet_buffer;
 static int opt_interval = 1;
 static u32 opt_xdp_bind_flags;
+static int opt_bind;
+static int opt_unbind;
 
 struct xdp_umem_uqueue {
 	u32 cached_prod;
@@ -662,6 +664,8 @@ static void usage(const char *prog)
 		" -S, --xdp-skb=n Use XDP skb-mod\n"
 		" -N, --xdp-native=n Enfore XDP native mode\n"
 		" -n, --interval=n Specify statistics update interval (default 1 sec).\n"
+		" -b, --bind Bind only.\n"
+		" -u, --unbind Unbind only.\n"
 		"\n";
 	fprintf(stderr, str, prog);
 	exit(EXIT_FAILURE);
@@ -674,7 +678,7 @@ static void parse_command_line(int argc, char **argv)
 	opterr = 0;
 
 	for (;;) {
-		c = getopt_long(argc, argv, "rtli:q:psSNn:", long_options,
+		c = getopt_long(argc, argv, "rtli:q:psSNn:bu", long_options,
 				&option_index);
 		if (c == -1)
 			break;
@@ -711,6 +715,12 @@ static void parse_command_line(int argc, char **argv)
 		case 'n':
 			opt_interval = atoi(optarg);
 			break;
+		case 'b':
+			opt_bind = 1;
+			break;
+		case 'u':
+			opt_unbind = 1;
+			break;
 		default:
 			usage(basename(argv[0]));
 		}
@@ -898,6 +908,12 @@ int main(int argc, char **argv)
 		exit(EXIT_FAILURE);
 	}
+	if (opt_unbind) {
+		bpf_set_link_xdp_fd(opt_ifindex, -1, opt_xdp_flags);
~~~~~~~~~~~~~~~~~~~~~~~PATCH END~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

3. bind
       #./samples/bpf/xdpsock -i eth0 -b
   In this step, an ebpf binary xdpsock_kern.o is loaded into the kernel
   and linked to eth0. The ebpf source code is
   /samples/bpf/xdpsock_kern.c; you can modify it and re-compile for a
   different test.

4. dump xdp socket map information.
       #./tools/bpf/bpftool/bpftool map -p
   You will see something like below.

   },{
        "id": 56,
        "type": "xskmap",
        "name": "xsks_map",
        "flags": 0,
        "bytes_key": 4,
        "bytes_value": 4,
        "max_entries": 4,
        "bytes_memlock": 4096
   }

   In this case 56 is the map id and it has 4 entries.

5. start testpmd

       ./build/app/testpmd -c 0xc -n 4 --vdev eth_af_xdp,iface=enp59s0f0,xsk_map_id=56,xsk_map_key_start=2,xsk_map_key_count=2 -- -i --rxq=2 --txq=2

   In this case, we reserved 2 entries (2,3) in the map, and they will
   be mapped to queue 0 and queue 1.

6. unbind after test
       ./samples/bpf/xdpsock -i eth0 -u

Performance
===========
No zero-copy driver is ready yet.
So far, testing has only covered DRV and SKB mode on an i40e 25G NIC;
the results are identical to those of the kernel sample "xdpsock".

Qi Zhang (6):
  net/af_xdp: new PMD driver
  lib/mbuf: enable parse flags when create mempool
  lib/mempool: allow page size aligned mempool
  net/af_xdp: use mbuf mempool for buffer management
  net/af_xdp: enable zero copy
  app/testpmd: add mempool flags parameter

 app/test-pmd/parameters.c                     |   12 +
 app/test-pmd/testpmd.c                        |   15 +-
 app/test-pmd/testpmd.h                        |    1 +
 config/common_base                            |    5 +
 config/common_linuxapp                        |    1 +
 drivers/net/Makefile                          |    1 +
 drivers/net/af_xdp/Makefile                   |   30 +
 drivers/net/af_xdp/meson.build                |    7 +
 drivers/net/af_xdp/rte_eth_af_xdp.c           | 1345 +++++++++++++++++++++++++
 drivers/net/af_xdp/rte_pmd_af_xdp_version.map |    4 +
 lib/librte_mbuf/rte_mbuf.c                    |   15 +-
 lib/librte_mbuf/rte_mbuf.h                    |    8 +-
 lib/librte_mempool/rte_mempool.c              |    3 +
 lib/librte_mempool/rte_mempool.h              |    1 +
 mk/rte.app.mk                                 |    1 +
 15 files changed, 1439 insertions(+), 10 deletions(-)
 create mode 100644 drivers/net/af_xdp/Makefile
 create mode 100644 drivers/net/af_xdp/meson.build
 create mode 100644 drivers/net/af_xdp/rte_eth_af_xdp.c
 create mode 100644 drivers/net/af_xdp/rte_pmd_af_xdp_version.map

-- 
2.13.6