From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1C6F343883; Wed, 10 Jan 2024 15:58:43 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 0763140269; Wed, 10 Jan 2024 15:58:43 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 972474021E for ; Wed, 10 Jan 2024 15:58:41 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1704898721; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yuPp1PWo7aOvVUl5JV4Lhtiyn/+O1CqB7SjwzzNtTt8=; b=bpDUwewpCRNaDIeZAovajafHmT974FpsrvNOR9Ohd8oEI3TI7k9MuXcSOdFmS7S923mutV OBPzwqqC4IJowaJcEZchKbvMIpeH6dmiaPnL5rhTQ9tRbfDmpAPBRp9oR57EUK6pEuJEm6 fCOUArAnC08iYr5Vqg/gCSxXftFM/ko= Received: from mail-oi1-f197.google.com (mail-oi1-f197.google.com [209.85.167.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-496-xHwZkbvlO-qQmQZeE2Fvvg-1; Wed, 10 Jan 2024 09:58:39 -0500 X-MC-Unique: xHwZkbvlO-qQmQZeE2Fvvg-1 Received: by mail-oi1-f197.google.com with SMTP id 5614622812f47-3bca5524924so5668302b6e.0 for ; Wed, 10 Jan 2024 06:58:38 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1704898718; x=1705503518; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=yuPp1PWo7aOvVUl5JV4Lhtiyn/+O1CqB7SjwzzNtTt8=; b=eujddbhPkJtkkGC3cGw1ciO3Tjn3M76W6o4u+SC4K3SNdUy9aafYFCsv0SKBdOKiCW gXtQg2FiWIUt6N3R7ZWLlJAZueiL7IvsiurwAISIjjqStGx+W8D+q6mqK5eVNCs4GCsH SX9KySlPuF073F2cZogpP6NiXTFa2TclNJwcTp6hmi3IwnaWKiTSULlKa0KnDicT+LnE MvEZYdTQi3WlH7Ab3p/2pw3uaL6IeUILidg+mHQvyWSPucG8t3je+Z3WAuK4djq/XYaf iel4SB5gfaaU/vT/z2WdG2MS+31RUJ8+bvw9VChPBIMwk/qlNMBDlmy8jtUqjYoDoHrT k6ag== X-Gm-Message-State: AOJu0YyxWoHEgvDoCOR63AG+YcF/o9Bi+3BUOsmDo/qJhOZMDMDgBCMX bUZGkFYqF7gPXAEpmn7rYABTVrSN1i2wpGXFryZ7ck1OBf6a7HEWDGYGHEGfQMWkNvMK+xPcZFP +UZh/PV7Y5stCYWUKA3E= X-Received: by 2002:a05:6808:ec6:b0:3bb:bc99:1788 with SMTP id q6-20020a0568080ec600b003bbbc991788mr1046272oiv.46.1704898717471; Wed, 10 Jan 2024 06:58:37 -0800 (PST) X-Google-Smtp-Source: AGHT+IFb1y9HTwZbUsXrNieWtTqx7ld+RMK+HiDDLlqdzHf9+sG/FusxlPh3lo9N6ofDTu54AeT9Ig== X-Received: by 2002:a05:6808:ec6:b0:3bb:bc99:1788 with SMTP id q6-20020a0568080ec600b003bbbc991788mr1046254oiv.46.1704898716861; Wed, 10 Jan 2024 06:58:36 -0800 (PST) Received: from [192.168.116.69] ([89.19.88.29]) by smtp.gmail.com with ESMTPSA id k16-20020a0cebd0000000b0067eff6713a8sm1742083qvq.87.2024.01.10.06.58.33 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 10 Jan 2024 06:58:36 -0800 (PST) Message-ID: Date: Wed, 10 Jan 2024 14:58:32 +0000 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [v7 1/1] net/af_xdp: fix multi interface support for K8s To: ferruh.yigit@amd.com, stephen@networkplumber.org, lihuisong@huawei.com, fengchengwen@huawei.com, liuyonglong@huawei.com, david.marchand@redhat.com Cc: dev@dpdk.org, Ciara Loftus , Shibin Koikkara Reeny References: <20231222110441.2507650-1-mtahhan@redhat.com> From: Maryam Tahhan In-Reply-To: <20231222110441.2507650-1-mtahhan@redhat.com> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hi folks Just wondering if there's any other comments re this patch after all the review comments were addressed? Thanks Maryam On 22/12/2023 11:04, Maryam Tahhan wrote: > The original 'use_cni' implementation, was added > to enable support for the AF_XDP PMD in a K8s env > without any escalated privileges. > However 'use_cni' used a hardcoded socket rather > than a configurable one. If a DPDK pod is requesting > multiple net devices and these devices are from > different pools, then the AF_XDP PMD attempts to > mount all the netdev UDSes in the pod as /tmp/afxdp.sock. > Which means that at best only 1 netdev will handshake > correctly with the AF_XDP DP. This patch addresses > this by making the socket parameter configurable using > a new vdev param called 'uds_path' and removing the > previous 'use_cni' param. This change has been tested > with the AF_XDP DP PR 81[1], with both single and > multiple interfaces. This patch also renames the > af_xdp_cni.rst doc to af_xdp_dp.rst and changes > incorrect references to the DP as CNI. Lastly, > this patch adds this feature to the release notes. > > [1] https://github.com/intel/afxdp-plugins-for-kubernetes/pull/81 > > Signed-off-by: Maryam Tahhan > Reviewed-by: Ciara Loftus > Reviewed-by: Shibin Koikkara Reeny > --- > v7: > * Give a more descriptive commit msg headline. > * Fixup typos in documentation. > > v6: > * Add link to PR 81 in commit message > * Add release notes changes to this patchset > > v5: > * Fix alignment for ETH_AF_XDP_USE_DP_UDS_PATH_ARG > * Remove use_cni references in af_xdp.rst > > v4: > * Rename af_xdp_cni.rst to af_xdp_dp.rst > * Removed all incorrect references to CNI throughout af_xdp > PMD file. > * Fixed Typos in af_xdp_dp.rst > > v3: > * Remove `use_cni` vdev argument as it's no longer needed. > * Update incorrect CNI references for the AF_XDP DP in the > documentation. > * Update the documentation to run a simple example with the > AF_XDP DP plugin in K8s. > > v2: > * Rename sock_path to uds_path. > * Update documentation to reflect when CAP_BPF is needed. > * Fix testpmd arguments in the provided example for Pods. > * Use AF_XDP API to update the xskmap entry. > --- > doc/guides/howto/af_xdp_cni.rst | 253 ---------------------- > doc/guides/howto/af_xdp_dp.rst | 281 +++++++++++++++++++++++++ > doc/guides/howto/index.rst | 2 +- > doc/guides/nics/af_xdp.rst | 27 ++- > doc/guides/rel_notes/release_24_03.rst | 8 + > drivers/net/af_xdp/rte_eth_af_xdp.c | 100 +++++---- > 6 files changed, 356 insertions(+), 315 deletions(-) > delete mode 100644 doc/guides/howto/af_xdp_cni.rst > create mode 100644 doc/guides/howto/af_xdp_dp.rst > > diff --git a/doc/guides/howto/af_xdp_cni.rst b/doc/guides/howto/af_xdp_cni.rst > deleted file mode 100644 > index a1a6d5b99c..0000000000 > --- a/doc/guides/howto/af_xdp_cni.rst > +++ /dev/null > @@ -1,253 +0,0 @@ > -.. SPDX-License-Identifier: BSD-3-Clause > - Copyright(c) 2023 Intel Corporation. > - > -Using a CNI with the AF_XDP driver > -================================== > - > -Introduction > ------------- > - > -CNI, the Container Network Interface, is a technology for configuring > -container network interfaces > -and which can be used to setup Kubernetes networking. > -AF_XDP is a Linux socket Address Family that enables an XDP program > -to redirect packets to a memory buffer in userspace. > - > -This document explains how to enable the `AF_XDP Plugin for Kubernetes`_ within > -a DPDK application using the :doc:`../nics/af_xdp` to connect and use these technologies. > - > -.. _AF_XDP Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes > - > - > -Background > ----------- > - > -The standard :doc:`../nics/af_xdp` initialization process involves loading an eBPF program > -onto the kernel netdev to be used by the PMD. > -This operation requires root or escalated Linux privileges > -and thus prevents the PMD from working in an unprivileged container. > -The AF_XDP CNI plugin handles this situation > -by providing a device plugin that performs the program loading. > - > -At a technical level the CNI opens a Unix Domain Socket and listens for a client > -to make requests over that socket. > -A DPDK application acting as a client connects and initiates a configuration "handshake". > -The client then receives a file descriptor which points to the XSKMAP > -associated with the loaded eBPF program. > -The XSKMAP is a BPF map of AF_XDP sockets (XSK). > -The client can then proceed with creating an AF_XDP socket > -and inserting that socket into the XSKMAP pointed to by the descriptor. > - > -The EAL vdev argument ``use_cni`` is used to indicate that the user wishes > -to run the PMD in unprivileged mode and to receive the XSKMAP file descriptor > -from the CNI. > -When this flag is set, > -the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag > -should be used when creating the socket > -to instruct libbpf not to load the default libbpf program on the netdev. > -Instead the loading is handled by the CNI. > - > -.. note:: > - > - The Unix Domain Socket file path appear in the end user is "/tmp/afxdp.sock". > - > - > -Prerequisites > -------------- > - > -Docker and container prerequisites: > - > -* Set up the device plugin > - as described in the instructions for `AF_XDP Plugin for Kubernetes`_. > - > -* The Docker image should contain the libbpf and libxdp libraries, > - which are dependencies for AF_XDP, > - and should include support for the ``ethtool`` command. > - > -* The Pod should have enabled the capabilities ``CAP_NET_RAW`` and ``CAP_BPF`` > - for AF_XDP along with support for hugepages. > - > -* Increase locked memory limit so containers have enough memory for packet buffers. > - For example: > - > - .. code-block:: console > - > - cat << EOF | sudo tee /etc/systemd/system/containerd.service.d/limits.conf > - [Service] > - LimitMEMLOCK=infinity > - EOF > - > -* dpdk-testpmd application should have AF_XDP feature enabled. > - > - For further information see the docs for the: :doc:`../../nics/af_xdp`. > - > - > -Example > -------- > - > -Howto run dpdk-testpmd with CNI plugin: > - > -* Clone the CNI plugin > - > - .. code-block:: console > - > - # git clone https://github.com/intel/afxdp-plugins-for-kubernetes.git > - > -* Build the CNI plugin > - > - .. code-block:: console > - > - # cd afxdp-plugins-for-kubernetes/ > - # make build > - > - .. note:: > - > - CNI plugin has a dependence on the config.json. > - > - Sample Config.json > - > - .. code-block:: json > - > - { > - "logLevel":"debug", > - "logFile":"afxdp-dp-e2e.log", > - "pools":[ > - { > - "name":"e2e", > - "mode":"primary", > - "timeout":30, > - "ethtoolCmds" : ["-L -device- combined 1"], > - "devices":[ > - { > - "name":"ens785f0" > - } > - ] > - } > - ] > - } > - > - For further reference please use the `config.json`_ > - > - .. _config.json: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/config.json > - > -* Create the Network Attachment definition > - > - .. code-block:: console > - > - # kubectl create -f nad.yaml > - > - Sample nad.yml > - > - .. code-block:: yaml > - > - apiVersion: "k8s.cni.cncf.io/v1" > - kind: NetworkAttachmentDefinition > - metadata: > - name: afxdp-e2e-test > - annotations: > - k8s.v1.cni.cncf.io/resourceName: afxdp/e2e > - spec: > - config: '{ > - "cniVersion": "0.3.0", > - "type": "afxdp", > - "mode": "cdq", > - "logFile": "afxdp-cni-e2e.log", > - "logLevel": "debug", > - "ipam": { > - "type": "host-local", > - "subnet": "192.168.1.0/24", > - "rangeStart": "192.168.1.200", > - "rangeEnd": "192.168.1.216", > - "routes": [ > - { "dst": "0.0.0.0/0" } > - ], > - "gateway": "192.168.1.1" > - } > - }' > - > - For further reference please use the `nad.yaml`_ > - > - .. _nad.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/nad.yaml > - > -* Build the Docker image > - > - .. code-block:: console > - > - # docker build -t afxdp-e2e-test -f Dockerfile . > - > - Sample Dockerfile: > - > - .. code-block:: console > - > - FROM ubuntu:20.04 > - RUN apt-get update -y > - RUN apt install build-essential libelf-dev -y > - RUN apt-get install iproute2 acl -y > - RUN apt install python3-pyelftools ethtool -y > - RUN apt install libnuma-dev libjansson-dev libpcap-dev net-tools -y > - RUN apt-get install clang llvm -y > - COPY ./libbpf.tar.gz /tmp > - RUN cd /tmp && tar -xvmf libbpf.tar.gz && cd libbpf/src && make install > - COPY ./libxdp.tar.gz /tmp > - RUN cd /tmp && tar -xvmf libxdp.tar.gz && cd libxdp && make install > - > - .. note:: > - > - All the files that need to COPY-ed should be in the same directory as the Dockerfile > - > -* Run the Pod > - > - .. code-block:: console > - > - # kubectl create -f pod.yaml > - > - Sample pod.yaml: > - > - .. code-block:: yaml > - > - apiVersion: v1 > - kind: Pod > - metadata: > - name: afxdp-e2e-test > - annotations: > - k8s.v1.cni.cncf.io/networks: afxdp-e2e-test > - spec: > - containers: > - - name: afxdp > - image: afxdp-e2e-test:latest > - imagePullPolicy: Never > - env: > - - name: LD_LIBRARY_PATH > - value: /usr/lib64/:/usr/local/lib/ > - command: ["tail", "-f", "/dev/null"] > - securityContext: > - capabilities: > - add: > - - CAP_NET_RAW > - - CAP_BPF > - resources: > - requests: > - hugepages-2Mi: 2Gi > - memory: 2Gi > - afxdp/e2e: '1' > - limits: > - hugepages-2Mi: 2Gi > - memory: 2Gi > - afxdp/e2e: '1' > - > - For further reference please use the `pod.yaml`_ > - > - .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml > - > -* Run DPDK with a command like the following: > - > - .. code-block:: console > - > - kubectl exec -i --container -- \ > - //dpdk-testpmd -l 0,1 --no-pci \ > - --vdev=net_af_xdp0,use_cni=1,iface= \ > - -- --no-mlockall --in-memory > - > -For further reference please use the `e2e`_ test case in `AF_XDP Plugin for Kubernetes`_ > - > - .. _e2e: https://github.com/intel/afxdp-plugins-for-kubernetes/tree/v0.0.2/test/e2e > diff --git a/doc/guides/howto/af_xdp_dp.rst b/doc/guides/howto/af_xdp_dp.rst > new file mode 100644 > index 0000000000..96bb9c0337 > --- /dev/null > +++ b/doc/guides/howto/af_xdp_dp.rst > @@ -0,0 +1,281 @@ > +.. SPDX-License-Identifier: BSD-3-Clause > + Copyright(c) 2023 Intel Corporation. > + > +Using the AF_XDP Device Plugin with the AF_XDP driver > +===================================================== > + > +Introduction > +------------ > + > +The `AF_XDP Device Plugin for Kubernetes`_ is a project that provisions > +and advertises interfaces (that can be used with AF_XDP) to Kubernetes. > +The project also includes a `CNI`_. > + > +AF_XDP is a Linux socket Address Family that enables an XDP program > +to redirect packets to a memory buffer in userspace. > + > +This document explains how to use the `AF_XDP Device Plugin for Kubernetes`_ with > +a DPDK :doc:`../nics/af_xdp` based application running in a Pod. > + > +.. _AF_XDP Device Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes > +.. _CNI: https://github.com/containernetworking/cni > + > +Background > +---------- > + > +The standard :doc:`../nics/af_xdp` initialization process involves loading an eBPF program > +onto the Kernel netdev to be used by the PMD. > +This operation requires root or escalated Linux privileges > +and prevents the PMD from working in an unprivileged container. > +The AF_XDP Device Plugin (DP) addresses this situation > +by providing an entity that manages eBPF program > +lifecycle for Pod interfaces that wish to use AF_XDP, this in turn allows > +the pod to be used without privilege escalation. > + > +In order for the pod to run without privilege escalation, the AF_XDP DP > +creates a Unix Domain Socket (UDS) and listens for Pods to make requests > +for XSKMAP(s) File Descriptors (FDs) for interfaces in their network namespace. > +In other words, the DPDK application running in the Pod connects to this UDS and > +initiates a "handshake" to retrieve the XSKMAP(s) FD(s). Upon a successful "handshake", > +the DPDK application receives the FD(s) for the XSKMAP(s) associated with the relevant > +netdevs. The DPDK application can then create the AF_XDP socket(s), and attach > +the socket(s) to the netdev queue(s) by inserting the socket(s) into the XSKMAP(s). > + > +The EAL vdev argument ``uds_path`` is used to indicate that the user wishes > +to run the AF_XDP PMD in unprivileged mode and to receive the XSKMAP FD > +from the AF_XDP DP. > +When this param is used, the > +``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag > +is used when creating the AF_XDP socket > +to instruct libbpf/libxdp not to load the default eBPF redirect > +program for AF_XDP on the netdev. Instead the lifecycle management of the eBPF > +program is handled by the AF_XDP DP. > + > +.. note:: > + > + The UDS file path inside the pod appears at "/tmp/afxdp_dp//afxdp.sock". > + > +Prerequisites > +------------- > + > +Device Plugin and DPDK container prerequisites: > + > +* Create a DPDK container image. > + > +* Set up the device plugin and prepare the Pod Spec as described in > + the instructions for `AF_XDP Device Plugin for Kubernetes`_. > + > +* Increase locked memory limit so containers have enough memory for packet buffers. > + For example: > + > + .. code-block:: console > + > + cat << EOF | sudo tee /etc/systemd/system/containerd.service.d/limits.conf > + [Service] > + LimitMEMLOCK=infinity > + EOF > + > +* dpdk-testpmd application should have AF_XDP feature enabled. > + > + For further information see the docs for the: :doc:`../../nics/af_xdp`. > + > + > +Example > +------- > + > +How to run dpdk-testpmd with the AF_XDP Device plugin: > + > +* Clone the AF_XDP Device plugin > + > + .. code-block:: console > + > + # git clone https://github.com/intel/afxdp-plugins-for-kubernetes.git > + > +* Build the AF_XDP Device plugin and the CNI > + > + .. code-block:: console > + > + # cd afxdp-plugins-for-kubernetes/ > + # make image > + > +* Make sure to modify the image used by the `daemonset.yml`_ file in the deployments directory with > + the following configuration: > + > + .. _daemonset.yml : https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/deployments/daemonset.yml > + > + .. code-block:: yaml > + > + image: afxdp-device-plugin:latest > + > + .. note:: > + > + This will select the AF_XDP DP image that was built locally. Detailed configuration > + options can be found in the AF_XDP Device Plugin `readme`_ . > + > + .. _readme: https://github.com/intel/afxdp-plugins-for-kubernetes#readme > + > +* Deploy the AF_XDP Device Plugin and CNI > + > + .. code-block:: console > + > + # kubectl create -f deployments/daemonset.yml > + > +* Create a Network Attachment Definition (NAD) > + > + .. code-block:: console > + > + # kubectl create -f nad.yaml > + > + Sample nad.yml > + > + .. code-block:: yaml > + > + apiVersion: "k8s.cni.cncf.io/v1" > + kind: NetworkAttachmentDefinition > + metadata: > + name: afxdp-network > + annotations: > + k8s.v1.cni.cncf.io/resourceName: afxdp/myPool > + spec: > + config: '{ > + "cniVersion": "0.3.0", > + "type": "afxdp", > + "mode": "primary", > + "logFile": "afxdp-cni.log", > + "logLevel": "debug", > + "ethtoolCmds" : ["-N -device- rx-flow-hash udp4 fn", > + "-N -device- flow-type udp4 dst-port 2152 action 22" > + ], > + "ipam": { > + "type": "host-local", > + "subnet": "192.168.1.0/24", > + "rangeStart": "192.168.1.200", > + "rangeEnd": "192.168.1.220", > + "routes": [ > + { "dst": "0.0.0.0/0" } > + ], > + "gateway": "192.168.1.1" > + } > + }' > + > + For further reference please use the example provided by the AF_XDP DP `nad.yaml`_ > + > + .. _nad.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/examples/network-attachment-definition.yaml > + > +* Build a DPDK container image (using Docker) > + > + .. code-block:: console > + > + # docker build -t dpdk -f Dockerfile . > + > + Sample Dockerfile (should be placed in top level DPDK directory): > + > + .. code-block:: console > + > + FROM fedora:38 > + > + # Setup container to build DPDK applications > + RUN dnf -y upgrade && dnf -y install \ > + libbsd-devel \ > + numactl-libs \ > + libbpf-devel \ > + libbpf \ > + meson \ > + ninja-build \ > + libxdp-devel \ > + libxdp \ > + numactl-devel \ > + python3-pyelftools \ > + python38 \ > + iproute > + RUN dnf groupinstall -y 'Development Tools' > + > + # Create DPDK dir and copy over sources > + WORKDIR /dpdk > + COPY app app > + COPY builddir builddir > + COPY buildtools buildtools > + COPY config config > + COPY devtools devtools > + COPY drivers drivers > + COPY dts dts > + COPY examples examples > + COPY kernel kernel > + COPY lib lib > + COPY license license > + COPY MAINTAINERS MAINTAINERS > + COPY Makefile Makefile > + COPY meson.build meson.build > + COPY meson_options.txt meson_options.txt > + COPY usertools usertools > + COPY VERSION VERSION > + COPY ABI_VERSION ABI_VERSION > + COPY doc doc > + > + # Build DPDK > + RUN meson setup build > + RUN ninja -C build > + > + .. note:: > + > + Ensure the Dockerfile is placed in the top level DPDK directory. > + > +* Run the Pod > + > + .. code-block:: console > + > + # kubectl create -f pod.yaml > + > + Sample pod.yaml: > + > + .. code-block:: yaml > + > + apiVersion: v1 > + kind: Pod > + metadata: > + name: dpdk > + annotations: > + k8s.v1.cni.cncf.io/networks: afxdp-network > + spec: > + containers: > + - name: testpmd > + image: dpdk:latest > + command: ["tail", "-f", "/dev/null"] > + securityContext: > + capabilities: > + add: > + - NET_RAW > + - IPC_LOCK > + resources: > + requests: > + afxdp/myPool: '1' > + limits: > + hugepages-1Gi: 2Gi > + cpu: 2 > + memory: 256Mi > + afxdp/myPool: '1' > + volumeMounts: > + - name: hugepages > + mountPath: /dev/hugepages > + volumes: > + - name: hugepages > + emptyDir: > + medium: HugePages > + > + For further reference please use the `pod.yaml`_ > + > + .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/examples/pod-spec.yaml > + > +.. note:: > + > + For Kernel versions older than 5.19 `CAP_BPF` is also required in > + the container capabilities stanza. > + > +* Run DPDK with a command like the following: > + > + .. code-block:: console > + > + kubectl exec -i dpdk --container testpmd -- \ > + ./build/app/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \ > + --vdev net_af_xdp,iface=,start_queue=22,queue_count=1,uds_path=/tmp/afxdp_dp//afxdp.sock \ > + -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap; > diff --git a/doc/guides/howto/index.rst b/doc/guides/howto/index.rst > index 71a3381c36..a7692e8a97 100644 > --- a/doc/guides/howto/index.rst > +++ b/doc/guides/howto/index.rst > @@ -8,7 +8,7 @@ HowTo Guides > :maxdepth: 2 > :numbered: > > - af_xdp_cni > + af_xdp_dp > lm_bond_virtio_sriov > lm_virtio_vhost_user > flow_bifurcation > diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst > index 1932525d4d..5f3105cf2a 100644 > --- a/doc/guides/nics/af_xdp.rst > +++ b/doc/guides/nics/af_xdp.rst > @@ -151,25 +151,32 @@ instead of zero copy mode (if available). > > --vdev net_af_xdp,iface=ens786f1,force_copy=1 > > -use_cni > -~~~~~~~ > +uds_path > +~~~~~~~~ > > -The EAL vdev argument ``use_cni`` is used to indicate that the user wishes to > -enable the `AF_XDP Plugin for Kubernetes`_ within a DPDK application. > +The EAL vdev argument ``uds_path`` is used to indicate that the user wishes to > +use the `AF_XDP Plugin for Kubernetes`_ with a DPDK application running in a Pod. > > .. _AF_XDP Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes > > .. code-block:: console > > - --vdev=net_af_xdp0,use_cni=1 > + --vdev=net_af_xdp0,uds_path=/tmp/afxdp_dp//afxdp.sock > > .. note:: > > - When using `use_cni`_, both parameters `xdp_prog`_ and `busy_budget`_ are disabled > - as both of these will be handled by the AF_XDP plugin. > - Since the DPDK application is running in limited privileges > - so enabling and disabling of the promiscuous mode through the DPDK application > - is also not supported. > + The UDS ``afxdp.sock`` is available in the DPDK container through a > + volume mounted by the `AF_XDP Plugin for Kubernetes`_ at the path > + specified in the example above. > + > +.. note:: > + > + When using `uds_path`_, both parameters `xdp_prog`_ and `busy_budget`_ are disabled > + as both of these will be handled by the AF_XDP Device plugin (if required). > + Since the pod/container is running with limited privileges enabling and disabling > + of promiscuous mode through the DPDK application is also not supported. > + > +For more details please see: :doc:`../howto/af_xdp_dp` > > Limitations > ----------- > diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst > index 6f8ad27808..a736de5d3f 100644 > --- a/doc/guides/rel_notes/release_24_03.rst > +++ b/doc/guides/rel_notes/release_24_03.rst > @@ -55,6 +55,14 @@ New Features > Also, make sure to start the actual text at the margin. > ======================================================= > > +* **Integrated AF_XDP PMD with AF_XDP Device Plugin**. > + > + The EAL vdev argument for the AF_XDP PMD ``uds_path`` was added > + to allow Kubernetes Pods that which to use AF_XDP with DPDK to run > + with limited privileges. This flag indicates that the AF_XDP PMD > + will be used in unprivileged mode and will receive the XSKMAP FD from > + the AF_XDP Device Plugin. > + > > Removed Items > ------------- > diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c > index 353c8688ec..db6724b9e5 100644 > --- a/drivers/net/af_xdp/rte_eth_af_xdp.c > +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c > @@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, NOTICE); > #define UDS_MAX_CMD_LEN 64 > #define UDS_MAX_CMD_RESP 128 > #define UDS_XSK_MAP_FD_MSG "/xsk_map_fd" > -#define UDS_SOCK "/tmp/afxdp.sock" > #define UDS_CONNECT_MSG "/connect" > #define UDS_HOST_OK_MSG "/host_ok" > #define UDS_HOST_NAK_MSG "/host_nak" > @@ -170,7 +169,7 @@ struct pmd_internals { > char prog_path[PATH_MAX]; > bool custom_prog_configured; > bool force_copy; > - bool use_cni; > + char uds_path[PATH_MAX]; > struct bpf_map *map; > > struct rte_ether_addr eth_addr; > @@ -190,7 +189,7 @@ struct pmd_process_private { > #define ETH_AF_XDP_PROG_ARG "xdp_prog" > #define ETH_AF_XDP_BUDGET_ARG "busy_budget" > #define ETH_AF_XDP_FORCE_COPY_ARG "force_copy" > -#define ETH_AF_XDP_USE_CNI_ARG "use_cni" > +#define ETH_AF_XDP_USE_DP_UDS_PATH_ARG "uds_path" > > static const char * const valid_arguments[] = { > ETH_AF_XDP_IFACE_ARG, > @@ -200,7 +199,7 @@ static const char * const valid_arguments[] = { > ETH_AF_XDP_PROG_ARG, > ETH_AF_XDP_BUDGET_ARG, > ETH_AF_XDP_FORCE_COPY_ARG, > - ETH_AF_XDP_USE_CNI_ARG, > + ETH_AF_XDP_USE_DP_UDS_PATH_ARG, > NULL > }; > > @@ -1351,7 +1350,7 @@ configure_preferred_busy_poll(struct pkt_rx_queue *rxq) > } > > static int > -init_uds_sock(struct sockaddr_un *server) > +init_uds_sock(struct sockaddr_un *server, const char *uds_path) > { > int sock; > > @@ -1362,7 +1361,7 @@ init_uds_sock(struct sockaddr_un *server) > } > > server->sun_family = AF_UNIX; > - strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path)); > + strlcpy(server->sun_path, uds_path, sizeof(server->sun_path)); > > if (connect(sock, (struct sockaddr *)server, sizeof(struct sockaddr_un)) < 0) { > close(sock); > @@ -1382,7 +1381,7 @@ struct msg_internal { > }; > > static int > -send_msg(int sock, char *request, int *fd) > +send_msg(int sock, char *request, int *fd, const char *uds_path) > { > int snd; > struct iovec iov; > @@ -1393,7 +1392,7 @@ send_msg(int sock, char *request, int *fd) > > memset(&dst, 0, sizeof(dst)); > dst.sun_family = AF_UNIX; > - strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path)); > + strlcpy(dst.sun_path, uds_path, sizeof(dst.sun_path)); > > /* Initialize message header structure */ > memset(&msgh, 0, sizeof(msgh)); > @@ -1470,8 +1469,8 @@ read_msg(int sock, char *response, struct sockaddr_un *s, int *fd) > } > > static int > -make_request_cni(int sock, struct sockaddr_un *server, char *request, > - int *req_fd, char *response, int *out_fd) > +make_request_dp(int sock, struct sockaddr_un *server, char *request, > + int *req_fd, char *response, int *out_fd, const char *uds_path) > { > int rval; > > @@ -1483,7 +1482,7 @@ make_request_cni(int sock, struct sockaddr_un *server, char *request, > if (req_fd == NULL) > rval = write(sock, request, strlen(request)); > else > - rval = send_msg(sock, request, req_fd); > + rval = send_msg(sock, request, req_fd, uds_path); > > if (rval < 0) { > AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno)); > @@ -1507,7 +1506,7 @@ check_response(char *response, char *exp_resp, long size) > } > > static int > -get_cni_fd(char *if_name) > +get_xskmap_fd(char *if_name, const char *uds_path) > { > char request[UDS_MAX_CMD_LEN], response[UDS_MAX_CMD_RESP]; > char hostname[MAX_LONG_OPT_SZ], exp_resp[UDS_MAX_CMD_RESP]; > @@ -1520,14 +1519,14 @@ get_cni_fd(char *if_name) > return -1; > > memset(&server, 0, sizeof(server)); > - sock = init_uds_sock(&server); > + sock = init_uds_sock(&server, uds_path); > if (sock < 0) > return -1; > > - /* Initiates handshake to CNI send: /connect,hostname */ > + /* Initiates handshake to AF_XDP Device Plugin send: /connect,hostname */ > snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG, hostname); > memset(response, 0, sizeof(response)); > - if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) { > + if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) { > AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request); > goto err_close; > } > @@ -1541,7 +1540,7 @@ get_cni_fd(char *if_name) > /* Request for "/version" */ > strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN); > memset(response, 0, sizeof(response)); > - if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) { > + if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) { > AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request); > goto err_close; > } > @@ -1549,7 +1548,7 @@ get_cni_fd(char *if_name) > /* Request for file descriptor for netdev name*/ > snprintf(request, sizeof(request), "%s,%s", UDS_XSK_MAP_FD_MSG, if_name); > memset(response, 0, sizeof(response)); > - if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) { > + if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) { > AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request); > goto err_close; > } > @@ -1571,7 +1570,7 @@ get_cni_fd(char *if_name) > /* Initiate close connection */ > strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN); > memset(response, 0, sizeof(response)); > - if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) { > + if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) { > AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request); > goto err_close; > } > @@ -1640,7 +1639,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq, > #endif > > /* Disable libbpf from loading XDP program */ > - if (internals->use_cni) > + if (strnlen(internals->uds_path, PATH_MAX)) > cfg.libbpf_flags |= XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD; > > if (strnlen(internals->prog_path, PATH_MAX)) { > @@ -1694,18 +1693,17 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq, > } > } > > - if (internals->use_cni) { > - int err, fd, map_fd; > + if (strnlen(internals->uds_path, PATH_MAX)) { > + int err, map_fd; > > - /* get socket fd from CNI plugin */ > - map_fd = get_cni_fd(internals->if_name); > + /* get socket fd from AF_XDP Device plugin */ > + map_fd = get_xskmap_fd(internals->if_name, internals->uds_path); > if (map_fd < 0) { > - AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n"); > + AF_XDP_LOG(ERR, "Failed to receive AF_XDP Device plugin fd\n"); > goto out_xsk; > } > - /* get socket fd */ > - fd = xsk_socket__fd(rxq->xsk); > - err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0); > + > + err = xsk_socket__update_xskmap(rxq->xsk, map_fd); > if (err) { > AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in map.\n"); > goto out_xsk; > @@ -1881,13 +1879,13 @@ static const struct eth_dev_ops ops = { > .get_monitor_addr = eth_get_monitor_addr, > }; > > -/* CNI option works in unprivileged container environment > - * and ethernet device functionality will be reduced. So > - * additional customiszed eth_dev_ops struct is needed > - * for cni. Promiscuous enable and disable functionality > - * is removed. > - **/ > -static const struct eth_dev_ops ops_cni = { > +/* AF_XDP Device Plugin option works in unprivileged > + * container environment and ethernet device functionality > + * will be reduced. So additional customized eth_dev_ops > + * struct is needed for the AF_XDP Device Plugin. Promiscuous > + * enable and disable functionality is removed. > + */ > +static const struct eth_dev_ops ops_afxdp_dp = { > .dev_start = eth_dev_start, > .dev_stop = eth_dev_stop, > .dev_close = eth_dev_close, > @@ -1957,7 +1955,7 @@ parse_name_arg(const char *key __rte_unused, > > /** parse xdp prog argument */ > static int > -parse_prog_arg(const char *key __rte_unused, > +parse_path_arg(const char *key __rte_unused, > const char *value, void *extra_args) > { > char *path = extra_args; > @@ -2023,7 +2021,7 @@ xdp_get_channels_info(const char *if_name, int *max_queues, > static int > parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue, > int *queue_cnt, int *shared_umem, char *prog_path, > - int *busy_budget, int *force_copy, int *use_cni) > + int *busy_budget, int *force_copy, char *uds_path) > { > int ret; > > @@ -2050,7 +2048,7 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue, > goto free_kvlist; > > ret = rte_kvargs_process(kvlist, ETH_AF_XDP_PROG_ARG, > - &parse_prog_arg, prog_path); > + &parse_path_arg, prog_path); > if (ret < 0) > goto free_kvlist; > > @@ -2064,8 +2062,8 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue, > if (ret < 0) > goto free_kvlist; > > - ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_CNI_ARG, > - &parse_integer_arg, use_cni); > + ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_DP_UDS_PATH_ARG, > + &parse_path_arg, uds_path); > if (ret < 0) > goto free_kvlist; > > @@ -2108,7 +2106,7 @@ static struct rte_eth_dev * > init_internals(struct rte_vdev_device *dev, const char *if_name, > int start_queue_idx, int queue_cnt, int shared_umem, > const char *prog_path, int busy_budget, int force_copy, > - int use_cni) > + const char *uds_path) > { > const char *name = rte_vdev_device_name(dev); > const unsigned int numa_node = dev->device.numa_node; > @@ -2137,7 +2135,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, > #endif > internals->shared_umem = shared_umem; > internals->force_copy = force_copy; > - internals->use_cni = use_cni; > + strlcpy(internals->uds_path, uds_path, PATH_MAX); > > if (xdp_get_channels_info(if_name, &internals->max_queue_cnt, > &internals->combined_queue_cnt)) { > @@ -2196,10 +2194,10 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, > eth_dev->data->dev_link = pmd_link; > eth_dev->data->mac_addrs = &internals->eth_addr; > eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS; > - if (!internals->use_cni) > + if (!strnlen(internals->uds_path, PATH_MAX)) > eth_dev->dev_ops = &ops; > else > - eth_dev->dev_ops = &ops_cni; > + eth_dev->dev_ops = &ops_afxdp_dp; > > eth_dev->rx_pkt_burst = eth_af_xdp_rx; > eth_dev->tx_pkt_burst = eth_af_xdp_tx; > @@ -2327,7 +2325,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) > char prog_path[PATH_MAX] = {'\0'}; > int busy_budget = -1, ret; > int force_copy = 0; > - int use_cni = 0; > + char uds_path[PATH_MAX] = {'\0'}; > struct rte_eth_dev *eth_dev = NULL; > const char *name = rte_vdev_device_name(dev); > > @@ -2370,20 +2368,20 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) > > if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx, > &xsk_queue_cnt, &shared_umem, prog_path, > - &busy_budget, &force_copy, &use_cni) < 0) { > + &busy_budget, &force_copy, uds_path) < 0) { > AF_XDP_LOG(ERR, "Invalid kvargs value\n"); > return -EINVAL; > } > > - if (use_cni && busy_budget > 0) { > + if (strnlen(uds_path, PATH_MAX) && busy_budget > 0) { > AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n", > - ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_BUDGET_ARG); > + ETH_AF_XDP_USE_DP_UDS_PATH_ARG, ETH_AF_XDP_BUDGET_ARG); > return -EINVAL; > } > > - if (use_cni && strnlen(prog_path, PATH_MAX)) { > + if (strnlen(uds_path, PATH_MAX) && strnlen(prog_path, PATH_MAX)) { > AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n", > - ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_PROG_ARG); > + ETH_AF_XDP_USE_DP_UDS_PATH_ARG, ETH_AF_XDP_PROG_ARG); > return -EINVAL; > } > > @@ -2410,7 +2408,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) > > eth_dev = init_internals(dev, if_name, xsk_start_queue_idx, > xsk_queue_cnt, shared_umem, prog_path, > - busy_budget, force_copy, use_cni); > + busy_budget, force_copy, uds_path); > if (eth_dev == NULL) { > AF_XDP_LOG(ERR, "Failed to init internals\n"); > return -1; > @@ -2471,4 +2469,4 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp, > "xdp_prog= " > "busy_budget= " > "force_copy= " > - "use_cni= "); > + "uds_path= ");