From mboxrd@z Thu Jan 1 00:00:00 1970
From: Maryam Tahhan
To: ferruh.yigit@amd.com, stephen@networkplumber.org, lihuisong@huawei.com, fengchengwen@huawei.com, liuyonglong@huawei.com, david.marchand@redhat.com, shibin.koikkara.reeny@intel.com, ciara.loftus@intel.com
Cc: dev@dpdk.org, Maryam Tahhan
Subject: [v14 3/3] net/af_xdp: support AF_XDP DP pinned maps
Date: Mon, 8 Apr 2024 09:09:22 -0400
Message-ID: <20240408130924.232154-4-mtahhan@redhat.com>
In-Reply-To: <20240408130924.232154-1-mtahhan@redhat.com>
References: <20240408130924.232154-1-mtahhan@redhat.com>
List-Id: DPDK patches and discussions

Enable the AF_XDP PMD to retrieve the xskmap from a pinned eBPF map.
This map is expected to be pinned by an external entity like the
AF_XDP Device Plugin. This enables unprivileged pods to create and
use AF_XDP sockets.

Signed-off-by: Maryam Tahhan
---
 doc/guides/howto/af_xdp_dp.rst         | 35 ++++++++--
 doc/guides/nics/af_xdp.rst             | 34 ++++++++--
 doc/guides/rel_notes/release_24_07.rst | 10 +++
 drivers/net/af_xdp/rte_eth_af_xdp.c    | 93 ++++++++++++++++++++------
 4 files changed, 141 insertions(+), 31 deletions(-)

diff --git a/doc/guides/howto/af_xdp_dp.rst b/doc/guides/howto/af_xdp_dp.rst
index 4aa6b5499f..8b9b5ebbad 100644
--- a/doc/guides/howto/af_xdp_dp.rst
+++ b/doc/guides/howto/af_xdp_dp.rst
@@ -52,10 +52,21 @@ should be used when creating the socket to instruct libbpf not to load the
 default libbpf program on the netdev. Instead the loading is handled by the
 AF_XDP Device Plugin.
 
-The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` argument
-to explicitly tell the AF_XDP PMD where to find the UDS to interact with the
-AF_XDP Device Plugin. If this argument is not passed alongside the ``use_cni``
-argument then the AF_XDP PMD configures it internally.
+The EAL vdev argument ``use_pinned_map`` is used to indicate to the AF_XDP PMD to
+retrieve the XSKMAP fd from a pinned eBPF map. This map is expected to be pinned
+by an external entity like the AF_XDP Device Plugin. This enables unprivileged pods
+to create and use AF_XDP sockets. When this flag is set, the
+``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag is used by the AF_XDP PMD when
+creating the AF_XDP socket.
+
+The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` or ``use_pinned_map``
+arguments to explicitly tell the AF_XDP PMD where to find either:
+
+1. The UDS to interact with the AF_XDP Device Plugin. OR
+2. The pinned xskmap to use when creating AF_XDP sockets.
+
+If this argument is not passed alongside the ``use_cni`` or ``use_pinned_map`` arguments then
+the AF_XDP PMD sets it internally to the `AF_XDP Device Plugin for Kubernetes`_ defaults.
 
 .. note::
 
@@ -312,8 +323,18 @@ Run dpdk-testpmd with the AF_XDP Device Plugin + CNI
            --no-mlockall --in-memory \
            -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
 
+  Or
+
+  .. code-block:: console
+
+     kubectl exec -i --container -- \
+           //dpdk-testpmd -l 0,1 --no-pci \
+           --vdev=net_af_xdp0,use_pinned_map=1,iface=,dp_path="/tmp/afxdp_dp//xsks_map" \
+           --no-mlockall --in-memory \
+           -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
+
 .. note::
 
-   If the ``dp_path`` parameter isn't explicitly set (like the example above)
-   the AF_XDP PMD will set the parameter value to
-   ``/tmp/afxdp_dp/<>/afxdp.sock``.
+   If the ``dp_path`` parameter isn't explicitly set with ``use_cni`` or ``use_pinned_map``
+   the AF_XDP PMD will set the parameter value to the `AF_XDP Device Plugin for Kubernetes`_
+   defaults.
diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
index 7f8651beda..940bbf60f2 100644
--- a/doc/guides/nics/af_xdp.rst
+++ b/doc/guides/nics/af_xdp.rst
@@ -171,13 +171,35 @@ enable the `AF_XDP Device Plugin for Kubernetes`_ with a DPDK application/pod.
    so enabling and disabling of the promiscuous mode through the DPDK application
    is also not supported.
 
+use_pinned_map
+~~~~~~~~~~~~~~
+
+The EAL vdev argument ``use_pinned_map`` is used to indicate that the user wishes to
+load a pinned xskmap mounted by `AF_XDP Device Plugin for Kubernetes`_ in the DPDK
+application/pod.
+
+.. _AF_XDP Device Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
+
+.. code-block:: console
+
+   --vdev=net_af_xdp0,use_pinned_map=1
+
+.. note::
+
+    This feature can also be used with any external entity that can pin an eBPF map, not just
+    the `AF_XDP Device Plugin for Kubernetes`_.
+
 dp_path
 ~~~~~~~
 
-The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` argument
-to explicitly tell the AF_XDP PMD where to find the UDS to interact with the
-`AF_XDP Device Plugin for Kubernetes`_. If this argument is not passed
-alongside the ``use_cni`` argument then the AF_XDP PMD configures it internally.
+The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` or ``use_pinned_map``
+arguments to explicitly tell the AF_XDP PMD where to find either:
+
+1. The UDS to interact with the AF_XDP Device Plugin. OR
+2. The pinned xskmap to use when creating AF_XDP sockets.
+
+If this argument is not passed alongside the ``use_cni`` or ``use_pinned_map`` arguments then
+the AF_XDP PMD sets it internally to the `AF_XDP Device Plugin for Kubernetes`_ defaults.
 
 .. _AF_XDP Device Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
 
@@ -185,6 +207,10 @@ alongside the ``use_cni`` argument then the AF_XDP PMD configures it internally.
 
    --vdev=net_af_xdp0,use_cni=1,dp_path="/tmp/afxdp_dp/<>/afxdp.sock"
 
+.. code-block:: console
+
+   --vdev=net_af_xdp0,use_pinned_map=1,dp_path="/tmp/afxdp_dp/<>/xsks_map"
+
 Limitations
 -----------
 
diff --git a/doc/guides/rel_notes/release_24_07.rst b/doc/guides/rel_notes/release_24_07.rst
index 2b85ae55aa..ffc4e02944 100644
--- a/doc/guides/rel_notes/release_24_07.rst
+++ b/doc/guides/rel_notes/release_24_07.rst
@@ -63,6 +63,16 @@ New Features
   compatibility for any applications already using the ``use_cni`` vdev
   argument with the AF_XDP Device Plugin.
 
+* **Integrated AF_XDP PMD with AF_XDP Device Plugin eBPF map pinning support**.
+
+  The EAL vdev argument for the AF_XDP PMD ``use_pinned_map`` was added
+  to allow Kubernetes Pods to use AF_XDP with DPDK, and run with limited
+  privileges, without having to do a full handshake over a Unix Domain
+  Socket with the Device Plugin. This flag indicates that the AF_XDP PMD
+  will be used in unprivileged mode and will obtain the XSKMAP FD by calling
+  ``bpf_obj_get()`` for an xskmap pinned (by the AF_XDP DP) inside the
+  container.
+
 Removed Items
 -------------
 
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index dea3bab983..cdf5ca7b67 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -85,6 +85,7 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, NOTICE);
 
 #define DP_BASE_PATH			"/tmp/afxdp_dp"
 #define DP_UDS_SOCK			"afxdp.sock"
+#define DP_XSK_MAP			"xsks_map"
 #define MAX_LONG_OPT_SZ			64
 #define UDS_MAX_FD_NUM			2
 #define UDS_MAX_CMD_LEN			64
@@ -172,6 +173,7 @@ struct pmd_internals {
 	bool custom_prog_configured;
 	bool force_copy;
 	bool use_cni;
+	bool use_pinned_map;
 	char dp_path[PATH_MAX];
 	struct bpf_map *map;
 
@@ -193,6 +195,7 @@ struct pmd_process_private {
 #define ETH_AF_XDP_BUDGET_ARG			"busy_budget"
 #define ETH_AF_XDP_FORCE_COPY_ARG		"force_copy"
 #define ETH_AF_XDP_USE_CNI_ARG			"use_cni"
+#define ETH_AF_XDP_USE_PINNED_MAP_ARG		"use_pinned_map"
 #define ETH_AF_XDP_DP_PATH_ARG			"dp_path"
 
 static const char * const valid_arguments[] = {
@@ -204,6 +207,7 @@ static const char * const valid_arguments[] = {
 	ETH_AF_XDP_BUDGET_ARG,
 	ETH_AF_XDP_FORCE_COPY_ARG,
 	ETH_AF_XDP_USE_CNI_ARG,
+	ETH_AF_XDP_USE_PINNED_MAP_ARG,
 	ETH_AF_XDP_DP_PATH_ARG,
 	NULL
 };
@@ -1258,6 +1262,21 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals,
 }
 #endif
 
+static int
+get_pinned_map(const char *dp_path, int *map_fd)
+{
+	*map_fd = bpf_obj_get(dp_path);
+	if (*map_fd < 0) {
+		AF_XDP_LOG(ERR, "Failed to find xsks_map in %s\n", dp_path);
+		return -1;
+	}
+
+	AF_XDP_LOG(INFO, "Successfully retrieved map %s with fd %d\n",
+				dp_path, *map_fd);
+
+	return 0;
+}
+
 static int
 load_custom_xdp_prog(const char *prog_path, int if_index, struct bpf_map **map)
 {
@@ -1644,7 +1663,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 #endif
 
 	/* Disable libbpf from loading XDP program */
-	if (internals->use_cni)
+	if (internals->use_cni || internals->use_pinned_map)
 		cfg.libbpf_flags |= XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
 
 	if (strnlen(internals->prog_path, PATH_MAX)) {
@@ -1698,14 +1717,23 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 		}
 	}
 
-	if (internals->use_cni) {
+	if (internals->use_cni || internals->use_pinned_map) {
 		int err, map_fd;
 
-		/* get socket fd from AF_XDP Device Plugin */
-		map_fd = uds_get_xskmap_fd(internals->if_name, internals->dp_path);
-		if (map_fd < 0) {
-			AF_XDP_LOG(ERR, "Failed to receive xskmap fd from AF_XDP Device Plugin\n");
-			goto out_xsk;
+		if (internals->use_cni) {
+			/* get socket fd from AF_XDP Device Plugin */
+			map_fd = uds_get_xskmap_fd(internals->if_name, internals->dp_path);
+			if (map_fd < 0) {
+				AF_XDP_LOG(ERR, "Failed to receive xskmap fd from AF_XDP Device Plugin\n");
+				goto out_xsk;
+			}
+		} else {
+			/* get xskmap fd from the pinned eBPF map */
+			err = get_pinned_map(internals->dp_path, &map_fd);
+			if (err < 0 || map_fd < 0) {
+				AF_XDP_LOG(ERR, "Failed to retrieve pinned map fd\n");
+				goto out_xsk;
+			}
 		}
 
 		err = update_xskmap(rxq->xsk, map_fd, rxq->xsk_queue_idx);
@@ -2028,7 +2056,7 @@ static int
 parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 		 int *queue_cnt, int *shared_umem, char *prog_path,
 		 int *busy_budget, int *force_copy, int *use_cni,
-		 char *dp_path)
+		 int *use_pinned_map, char *dp_path)
 {
 	int ret;
 
@@ -2074,6 +2102,11 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 	if (ret < 0)
 		goto free_kvlist;
 
+	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_PINNED_MAP_ARG,
+				 &parse_integer_arg, use_pinned_map);
+	if (ret < 0)
+		goto free_kvlist;
+
 	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_DP_PATH_ARG,
 				 &parse_prog_arg, dp_path);
 	if (ret < 0)
@@ -2118,7 +2151,7 @@ static struct rte_eth_dev *
 init_internals(struct rte_vdev_device *dev, const char *if_name,
 	       int start_queue_idx, int queue_cnt, int shared_umem,
 	       const char *prog_path, int busy_budget, int force_copy,
-	       int use_cni, const char *dp_path)
+	       int use_cni, int use_pinned_map, const char *dp_path)
 {
 	const char *name = rte_vdev_device_name(dev);
 	const unsigned int numa_node = dev->device.numa_node;
@@ -2148,6 +2181,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
 	internals->shared_umem = shared_umem;
 	internals->force_copy = force_copy;
 	internals->use_cni = use_cni;
+	internals->use_pinned_map = use_pinned_map;
 	strlcpy(internals->dp_path, dp_path, PATH_MAX);
 
 	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
@@ -2207,7 +2241,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
 	eth_dev->data->dev_link = pmd_link;
 	eth_dev->data->mac_addrs = &internals->eth_addr;
 	eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;
-	if (!internals->use_cni)
+	if (!internals->use_cni && !internals->use_pinned_map)
 		eth_dev->dev_ops = &ops;
 	else
 		eth_dev->dev_ops = &ops_afxdp_dp;
@@ -2339,6 +2373,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 	int busy_budget = -1, ret;
 	int force_copy = 0;
 	int use_cni = 0;
+	int use_pinned_map = 0;
 	char dp_path[PATH_MAX] = {'\0'};
 	struct rte_eth_dev *eth_dev = NULL;
 	const char *name = rte_vdev_device_name(dev);
@@ -2382,20 +2417,29 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
 			     &xsk_queue_cnt, &shared_umem, prog_path,
-			     &busy_budget, &force_copy, &use_cni, dp_path) < 0) {
+			     &busy_budget, &force_copy, &use_cni, &use_pinned_map,
+			     dp_path) < 0) {
 		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
 		return -EINVAL;
 	}
 
-	if (use_cni && busy_budget > 0) {
+	if (use_cni && use_pinned_map) {
 		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n",
-			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_BUDGET_ARG);
+			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_PINNED_MAP_ARG);
 		return -EINVAL;
 	}
 
-	if (use_cni && strnlen(prog_path, PATH_MAX)) {
-		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n",
-			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_PROG_ARG);
+	if ((use_cni || use_pinned_map) && busy_budget > 0) {
+		AF_XDP_LOG(ERR, "When '%s' or '%s' parameter is used, '%s' parameter is not valid\n",
+			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_PINNED_MAP_ARG,
+			ETH_AF_XDP_BUDGET_ARG);
+		return -EINVAL;
+	}
+
+	if ((use_cni || use_pinned_map) && strnlen(prog_path, PATH_MAX)) {
+		AF_XDP_LOG(ERR, "When '%s' or '%s' parameter is used, '%s' parameter is not valid\n",
+			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_PINNED_MAP_ARG,
+			ETH_AF_XDP_PROG_ARG);
 		return -EINVAL;
 	}
 
@@ -2405,9 +2449,16 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 			ETH_AF_XDP_DP_PATH_ARG, dp_path);
 	}
 
-	if (!use_cni && strnlen(dp_path, PATH_MAX)) {
-		AF_XDP_LOG(ERR, "'%s' parameter is set, but '%s' was not enabled\n",
-			ETH_AF_XDP_DP_PATH_ARG, ETH_AF_XDP_USE_CNI_ARG);
+	if (use_pinned_map && !strnlen(dp_path, PATH_MAX)) {
+		snprintf(dp_path, sizeof(dp_path), "%s/%s/%s", DP_BASE_PATH, if_name, DP_XSK_MAP);
+		AF_XDP_LOG(INFO, "'%s' parameter not provided, setting value to '%s'\n",
+			ETH_AF_XDP_DP_PATH_ARG, dp_path);
+	}
+
+	if ((!use_cni && !use_pinned_map) && strnlen(dp_path, PATH_MAX)) {
+		AF_XDP_LOG(ERR, "'%s' parameter is set, but '%s' or '%s' were not enabled\n",
+			ETH_AF_XDP_DP_PATH_ARG, ETH_AF_XDP_USE_CNI_ARG,
+			ETH_AF_XDP_USE_PINNED_MAP_ARG);
 		return -EINVAL;
 	}
 
@@ -2434,7 +2485,8 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
 				 xsk_queue_cnt, shared_umem, prog_path,
-				 busy_budget, force_copy, use_cni, dp_path);
+				 busy_budget, force_copy, use_cni, use_pinned_map,
+				 dp_path);
 	if (eth_dev == NULL) {
 		AF_XDP_LOG(ERR, "Failed to init internals\n");
 		return -1;
@@ -2496,4 +2548,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
 			      "busy_budget= "
 			      "force_copy= "
 			      "use_cni= "
+			      "use_pinned_map= "
 			      "dp_path= ");
-- 
2.41.0
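
For reference, the argument rules that the probe path applies to use_cni,
use_pinned_map and dp_path can be summarised in a small standalone sketch.
This is not code from the patch: resolve_dp_path() is a hypothetical helper
written only for illustration, the define values are copied from the diff
above, and the truncation check is an added assumption (the patch itself
does not test the snprintf() return value).

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Values mirror the defines in the patch. */
#define DP_BASE_PATH "/tmp/afxdp_dp"
#define DP_UDS_SOCK  "afxdp.sock"
#define DP_XSK_MAP   "xsks_map"

/*
 * Hypothetical helper (not part of the patch): validate the combination of
 * vdev arguments and fill in a default dp_path when none was supplied.
 * Returns 0 on success and -EINVAL on an invalid combination.
 */
static int
resolve_dp_path(int use_cni, int use_pinned_map, const char *if_name,
		char *dp_path, size_t len)
{
	int n;

	/* The two modes are mutually exclusive. */
	if (use_cni && use_pinned_map)
		return -EINVAL;

	/* dp_path is only meaningful together with one of the two modes. */
	if (!use_cni && !use_pinned_map)
		return dp_path[0] != '\0' ? -EINVAL : 0;

	/* An explicitly provided dp_path is kept unchanged. */
	if (dp_path[0] != '\0')
		return 0;

	/* Default: <base>/<iface>/afxdp.sock or <base>/<iface>/xsks_map. */
	n = snprintf(dp_path, len, "%s/%s/%s", DP_BASE_PATH, if_name,
		     use_cni ? DP_UDS_SOCK : DP_XSK_MAP);

	/* Assumption for this sketch: treat truncation as an error. */
	if (n < 0 || (size_t)n >= len)
		return -EINVAL;

	return 0;
}
```

With an empty dp_path, use_pinned_map=1 and a placeholder interface name
such as "eth0", the sketch fills in "/tmp/afxdp_dp/eth0/xsks_map"; with
use_cni=1 it fills in "/tmp/afxdp_dp/eth0/afxdp.sock" instead.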