DPDK patches and discussions
* [v6] net/af_xdp: enable uds_path instead of use_cni
@ 2023-12-15 15:39 Maryam Tahhan
  2023-12-21 10:44 ` Koikkara Reeny, Shibin
  2023-12-21 11:36 ` David Marchand
  0 siblings, 2 replies; 5+ messages in thread
From: Maryam Tahhan @ 2023-12-15 15:39 UTC
  To: ferruh.yigit, stephen, lihuisong, fengchengwen, liuyonglong,
	shibin.koikkara.reeny, ciara.loftus
  Cc: dev, Maryam Tahhan

The original 'use_cni' implementation used a hardcoded
socket path rather than a configurable one. If a DPDK pod
requests multiple net devices and these devices are from
different pools, the container attempts to mount all of
the netdev UDSes in the pod as /tmp/afxdp.sock. As a
result, at best only one netdev will handshake correctly
with the AF_XDP DP. This patch addresses the problem by
making the socket path configurable through a new vdev
param called 'uds_path' and by removing the previous
'use_cni' param. This patch also fixes incorrect
references to the AF_XDP DP as CNI and updates the
documentation with a working example. This change has
been tested with the AF_XDP DP PR 81 [1], with both
single and multiple interfaces.

[1] https://github.com/intel/afxdp-plugins-for-kubernetes/pull/81

v6:
* Add link to PR 81 in commit message
* Add release notes changes to this patchset

v5:
* Fix alignment for ETH_AF_XDP_USE_DP_UDS_PATH_ARG
* Remove use_cni references in af_xdp.rst

v4:
* Rename af_xdp_cni.rst to af_xdp_dp.rst
* Removed all incorrect references to CNI throughout af_xdp
  PMD file.
* Fixed Typos in af_xdp_dp.rst

v3:
* Remove `use_cni` vdev argument as it's no longer needed.
* Update incorrect CNI references for the AF_XDP DP in the
  documentation.
* Update the documentation to run a simple example with the
  AF_XDP DP plugin in K8s.

v2:
* Rename sock_path to uds_path.
* Update documentation to reflect when CAP_BPF is needed.
* Fix testpmd arguments in the provided example for Pods.
* Use AF_XDP API to update the xskmap entry.

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Reviewed-by: Ciara Loftus <ciara.loftus@intel.com>
---
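Note (not part of the commit): the per-netdev socket path that
'uds_path' carries can be sketched roughly as below. This is only an
illustration under stated assumptions, not code from the patch:
build_uds_path() and fill_uds_addr() are hypothetical helper names, and
the path layout is the one used in the documentation example. The sketch
also makes one point explicit: sun_path is much smaller than PATH_MAX,
so a long uds_path cannot be copied into it whole.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: format the per-netdev UDS path used in the
 * documentation example. Returns 0 on success, -1 if it does not fit.
 */
static int
build_uds_path(char *buf, size_t len, const char *netdev)
{
	int n;

	assert(buf != NULL && netdev != NULL);
	n = snprintf(buf, len, "/tmp/afxdp_dp/%s/afxdp.sock", netdev);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: fill a sockaddr_un from a configurable path.
 * Empty paths and paths that would not fit in sun_path are rejected
 * instead of being silently truncated.
 */
static int
fill_uds_addr(struct sockaddr_un *addr, const char *uds_path)
{
	size_t n;

	assert(addr != NULL && uds_path != NULL);
	n = strlen(uds_path);
	if (n == 0 || n >= sizeof(addr->sun_path))
		return -1;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	memcpy(addr->sun_path, uds_path, n + 1); /* include the NUL */
	return 0;
}
```

With one path per netdev, two devices from different pools resolve to
two distinct sockets, which is the situation the hardcoded
/tmp/afxdp.sock could not express.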
 doc/guides/howto/af_xdp_cni.rst        | 253 ----------------------
 doc/guides/howto/af_xdp_dp.rst         | 278 +++++++++++++++++++++++++
 doc/guides/howto/index.rst             |   2 +-
 doc/guides/nics/af_xdp.rst             |  27 ++-
 doc/guides/rel_notes/release_24_03.rst |   7 +
 drivers/net/af_xdp/rte_eth_af_xdp.c    | 100 +++++----
 6 files changed, 352 insertions(+), 315 deletions(-)
 delete mode 100644 doc/guides/howto/af_xdp_cni.rst
 create mode 100644 doc/guides/howto/af_xdp_dp.rst

diff --git a/doc/guides/howto/af_xdp_cni.rst b/doc/guides/howto/af_xdp_cni.rst
deleted file mode 100644
index a1a6d5b99c..0000000000
--- a/doc/guides/howto/af_xdp_cni.rst
+++ /dev/null
@@ -1,253 +0,0 @@
-.. SPDX-License-Identifier: BSD-3-Clause
-   Copyright(c) 2023 Intel Corporation.
-
-Using a CNI with the AF_XDP driver
-==================================
-
-Introduction
-------------
-
-CNI, the Container Network Interface, is a technology for configuring
-container network interfaces
-and which can be used to setup Kubernetes networking.
-AF_XDP is a Linux socket Address Family that enables an XDP program
-to redirect packets to a memory buffer in userspace.
-
-This document explains how to enable the `AF_XDP Plugin for Kubernetes`_ within
-a DPDK application using the :doc:`../nics/af_xdp` to connect and use these technologies.
-
-.. _AF_XDP Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
-
-
-Background
-----------
-
-The standard :doc:`../nics/af_xdp` initialization process involves loading an eBPF program
-onto the kernel netdev to be used by the PMD.
-This operation requires root or escalated Linux privileges
-and thus prevents the PMD from working in an unprivileged container.
-The AF_XDP CNI plugin handles this situation
-by providing a device plugin that performs the program loading.
-
-At a technical level the CNI opens a Unix Domain Socket and listens for a client
-to make requests over that socket.
-A DPDK application acting as a client connects and initiates a configuration "handshake".
-The client then receives a file descriptor which points to the XSKMAP
-associated with the loaded eBPF program.
-The XSKMAP is a BPF map of AF_XDP sockets (XSK).
-The client can then proceed with creating an AF_XDP socket
-and inserting that socket into the XSKMAP pointed to by the descriptor.
-
-The EAL vdev argument ``use_cni`` is used to indicate that the user wishes
-to run the PMD in unprivileged mode and to receive the XSKMAP file descriptor
-from the CNI.
-When this flag is set,
-the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag
-should be used when creating the socket
-to instruct libbpf not to load the default libbpf program on the netdev.
-Instead the loading is handled by the CNI.
-
-.. note::
-
-   The Unix Domain Socket file path appear in the end user is "/tmp/afxdp.sock".
-
-
-Prerequisites
--------------
-
-Docker and container prerequisites:
-
-* Set up the device plugin
-  as described in the instructions for `AF_XDP Plugin for Kubernetes`_.
-
-* The Docker image should contain the libbpf and libxdp libraries,
-  which are dependencies for AF_XDP,
-  and should include support for the ``ethtool`` command.
-
-* The Pod should have enabled the capabilities ``CAP_NET_RAW`` and ``CAP_BPF``
-  for AF_XDP along with support for hugepages.
-
-* Increase locked memory limit so containers have enough memory for packet buffers.
-  For example:
-
-  .. code-block:: console
-
-     cat << EOF | sudo tee /etc/systemd/system/containerd.service.d/limits.conf
-     [Service]
-     LimitMEMLOCK=infinity
-     EOF
-
-* dpdk-testpmd application should have AF_XDP feature enabled.
-
-  For further information see the docs for the: :doc:`../../nics/af_xdp`.
-
-
-Example
--------
-
-Howto run dpdk-testpmd with CNI plugin:
-
-* Clone the CNI plugin
-
-  .. code-block:: console
-
-     # git clone https://github.com/intel/afxdp-plugins-for-kubernetes.git
-
-* Build the CNI plugin
-
-  .. code-block:: console
-
-     # cd afxdp-plugins-for-kubernetes/
-     # make build
-
-  .. note::
-
-     CNI plugin has a dependence on the config.json.
-
-  Sample Config.json
-
-  .. code-block:: json
-
-     {
-        "logLevel":"debug",
-        "logFile":"afxdp-dp-e2e.log",
-        "pools":[
-           {
-              "name":"e2e",
-              "mode":"primary",
-              "timeout":30,
-              "ethtoolCmds" : ["-L -device- combined 1"],
-              "devices":[
-                 {
-                    "name":"ens785f0"
-                 }
-              ]
-           }
-        ]
-     }
-
-  For further reference please use the `config.json`_
-
-  .. _config.json: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/config.json
-
-* Create the Network Attachment definition
-
-  .. code-block:: console
-
-     # kubectl create -f nad.yaml
-
-  Sample nad.yml
-
-  .. code-block:: yaml
-
-      apiVersion: "k8s.cni.cncf.io/v1"
-      kind: NetworkAttachmentDefinition
-      metadata:
-        name: afxdp-e2e-test
-        annotations:
-          k8s.v1.cni.cncf.io/resourceName: afxdp/e2e
-      spec:
-        config: '{
-            "cniVersion": "0.3.0",
-            "type": "afxdp",
-            "mode": "cdq",
-            "logFile": "afxdp-cni-e2e.log",
-            "logLevel": "debug",
-            "ipam": {
-              "type": "host-local",
-              "subnet": "192.168.1.0/24",
-              "rangeStart": "192.168.1.200",
-              "rangeEnd": "192.168.1.216",
-              "routes": [
-                { "dst": "0.0.0.0/0" }
-              ],
-              "gateway": "192.168.1.1"
-            }
-          }'
-
-  For further reference please use the `nad.yaml`_
-
-  .. _nad.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/nad.yaml
-
-* Build the Docker image
-
-  .. code-block:: console
-
-     # docker build -t afxdp-e2e-test -f Dockerfile .
-
-  Sample Dockerfile:
-
-  .. code-block:: console
-
-     FROM ubuntu:20.04
-     RUN apt-get update -y
-     RUN apt install build-essential libelf-dev -y
-     RUN apt-get install iproute2  acl -y
-     RUN apt install python3-pyelftools ethtool -y
-     RUN apt install libnuma-dev libjansson-dev libpcap-dev net-tools -y
-     RUN apt-get install clang llvm -y
-     COPY ./libbpf<version>.tar.gz /tmp
-     RUN cd /tmp && tar -xvmf libbpf<version>.tar.gz && cd libbpf/src && make install
-     COPY ./libxdp<version>.tar.gz /tmp
-     RUN cd /tmp && tar -xvmf libxdp<version>.tar.gz && cd libxdp && make install
-
-  .. note::
-
-     All the files that need to COPY-ed should be in the same directory as the Dockerfile
-
-* Run the Pod
-
-  .. code-block:: console
-
-     # kubectl create -f pod.yaml
-
-  Sample pod.yaml:
-
-  .. code-block:: yaml
-
-     apiVersion: v1
-     kind: Pod
-     metadata:
-       name: afxdp-e2e-test
-       annotations:
-         k8s.v1.cni.cncf.io/networks: afxdp-e2e-test
-     spec:
-       containers:
-       - name: afxdp
-         image: afxdp-e2e-test:latest
-         imagePullPolicy: Never
-         env:
-         - name: LD_LIBRARY_PATH
-           value: /usr/lib64/:/usr/local/lib/
-         command: ["tail", "-f", "/dev/null"]
-         securityContext:
-          capabilities:
-             add:
-               - CAP_NET_RAW
-               - CAP_BPF
-         resources:
-           requests:
-             hugepages-2Mi: 2Gi
-             memory: 2Gi
-             afxdp/e2e: '1'
-           limits:
-             hugepages-2Mi: 2Gi
-             memory: 2Gi
-             afxdp/e2e: '1'
-
-  For further reference please use the `pod.yaml`_
-
-  .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml
-
-* Run DPDK with a command like the following:
-
-  .. code-block:: console
-
-     kubectl exec -i <Pod name> --container <containers name> -- \
-           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
-           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
-           -- --no-mlockall --in-memory
-
-For further reference please use the `e2e`_ test case in `AF_XDP Plugin for Kubernetes`_
-
-  .. _e2e: https://github.com/intel/afxdp-plugins-for-kubernetes/tree/v0.0.2/test/e2e
diff --git a/doc/guides/howto/af_xdp_dp.rst b/doc/guides/howto/af_xdp_dp.rst
new file mode 100644
index 0000000000..7717d59224
--- /dev/null
+++ b/doc/guides/howto/af_xdp_dp.rst
@@ -0,0 +1,278 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright(c) 2023 Intel Corporation.
+
+Using the AF_XDP Device Plugin with the AF_XDP driver
+======================================================
+
+Introduction
+------------
+
+The `AF_XDP Device Plugin for Kubernetes`_ is a project that provisions
+and advertises interfaces (that can be used with AF_XDP) to Kubernetes.
+The project also includes a `CNI`_.
+
+AF_XDP is a Linux socket Address Family that enables an XDP program
+to redirect packets to a memory buffer in userspace.
+
+This document explains how to use the `AF_XDP Device Plugin for Kubernetes`_ with
+a DPDK :doc:`../nics/af_xdp` based application running in a Pod.
+
+.. _AF_XDP Device Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
+.. _CNI: https://github.com/containernetworking/cni
+
+Background
+----------
+
+The standard :doc:`../nics/af_xdp` initialization process involves
+loading an eBPF program onto the Kernel netdev to be used by the PMD.
+This operation requires root or escalated Linux privileges and prevents
+the PMD from working in an unprivileged container. The AF_XDP Device Plugin (DP)
+addresses this situation by providing an entity that manages eBPF program
+lifecycle for Pod interfaces that wish to use AF_XDP. This in turn allows
+the pod to be used without privilege escalation.
+
+In order for the pod to run without privilege escalation, the AF_XDP DP
+creates a Unix Domain Socket (UDS) and listens for Pods to make requests
+for XSKMAP(s) File Descriptors (FDs) for interfaces in their network namespace.
+In other words, the DPDK application running in the Pod connects to this UDS and
+initiates a "handshake" to retrieve the XSKMAP(s) FD(s). Upon a successful "handshake",
+the DPDK application receives the FD(s) for the XSKMAP(s) associated with the relevant
+netdevs. The DPDK application can then create the AF_XDP socket(s), and attach
+the socket(s) to the netdev queue(s) by inserting the socket(s) into the XSKMAP(s).
+
+The EAL vdev argument ``uds_path`` is used to indicate that the user
+wishes to run the AF_XDP PMD in unprivileged mode and to receive the XSKMAP
+FD from the AF_XDP DP. When this param is used, the
+``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag is used when creating the
+AF_XDP socket to instruct libbpf/libxdp not to load the default eBPF redirect
+program for AF_XDP on the netdev. Instead the lifecycle management of the eBPF
+program is handled by the AF_XDP DP.
+
+.. note::
+
+   The UDS file path inside the pod appears at "/tmp/afxdp_dp/<netdev>/afxdp.sock".
+
+Prerequisites
+-------------
+
+Device Plugin and DPDK container prerequisites:
+
+* Create a DPDK container image.
+
+* Set up the device plugin and prepare the Pod Spec as described in
+  the instructions for `AF_XDP Device Plugin for Kubernetes`_.
+
+* Increase locked memory limit so containers have enough memory for packet buffers.
+  For example:
+
+  .. code-block:: console
+
+     cat << EOF | sudo tee /etc/systemd/system/containerd.service.d/limits.conf
+     [Service]
+     LimitMEMLOCK=infinity
+     EOF
+
+* The dpdk-testpmd application should have the AF_XDP feature enabled.
+
+  For further information, see the docs for the :doc:`../nics/af_xdp`.
+
+
+Example
+-------
+
+How to run dpdk-testpmd with the AF_XDP Device plugin:
+
+* Clone the AF_XDP Device plugin
+
+  .. code-block:: console
+
+     # git clone https://github.com/intel/afxdp-plugins-for-kubernetes.git
+
+* Build the AF_XDP Device plugin and the CNI
+
+  .. code-block:: console
+
+     # cd afxdp-plugins-for-kubernetes/
+     # make image
+
+* Make sure to modify the image used by the `daemonset.yml`_ file in the deployments directory with
+  the following configuration:
+
+  .. _daemonset.yml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/deployments/daemonset.yml
+
+  .. code-block:: yaml
+
+    image: afxdp-device-plugin:latest
+
+  .. note::
+
+    This will select the AF_XDP DP image that was built locally. Detailed configuration
+    options can be found in the AF_XDP Device Plugin `readme`_.
+
+  .. _readme: https://github.com/intel/afxdp-plugins-for-kubernetes#readme
+
+* Deploy the AF_XDP Device Plugin and CNI
+
+  .. code-block:: console
+
+    # kubectl create -f deployments/daemonset.yml
+
+* Create a Network Attachment Definition (NAD)
+
+  .. code-block:: console
+
+    # kubectl create -f nad.yaml
+
+  Sample nad.yaml:
+
+  .. code-block:: yaml
+
+    apiVersion: "k8s.cni.cncf.io/v1"
+    kind: NetworkAttachmentDefinition
+    metadata:
+      name: afxdp-network
+      annotations:
+        k8s.v1.cni.cncf.io/resourceName: afxdp/myPool
+    spec:
+      config: '{
+          "cniVersion": "0.3.0",
+          "type": "afxdp",
+          "mode": "primary",
+          "logFile": "afxdp-cni.log",
+          "logLevel": "debug",
+          "ethtoolCmds" : ["-N -device- rx-flow-hash udp4 fn",
+                           "-N -device- flow-type udp4 dst-port 2152 action 22"
+                        ],
+          "ipam": {
+            "type": "host-local",
+            "subnet": "192.168.1.0/24",
+            "rangeStart": "192.168.1.200",
+            "rangeEnd": "192.168.1.220",
+            "routes": [
+              { "dst": "0.0.0.0/0" }
+            ],
+            "gateway": "192.168.1.1"
+          }
+        }'
+
+  For further reference please use the example provided by the AF_XDP DP `nad.yaml`_
+
+  .. _nad.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/examples/network-attachment-definition.yaml
+
+* Build a DPDK container image (using Docker)
+
+  .. code-block:: console
+
+    # docker build -t dpdk -f Dockerfile .
+
+  Sample Dockerfile (should be placed in top level DPDK directory):
+
+  .. code-block:: console
+
+    FROM fedora:38
+
+    # Setup container to build DPDK applications
+    RUN dnf -y upgrade && dnf -y install \
+        libbsd-devel \
+        numactl-libs \
+        libbpf-devel \
+        libbpf \
+        meson \
+        ninja-build \
+        libxdp-devel \
+        libxdp \
+        numactl-devel \
+        python3-pyelftools \
+        python38 \
+        iproute
+    RUN dnf groupinstall -y 'Development Tools'
+
+    # Create DPDK dir and copy over sources
+    WORKDIR /dpdk
+    COPY app app
+    COPY builddir  builddir
+    COPY buildtools buildtools
+    COPY config config
+    COPY devtools devtools
+    COPY drivers drivers
+    COPY dts dts
+    COPY examples examples
+    COPY kernel kernel
+    COPY lib lib
+    COPY license license
+    COPY MAINTAINERS MAINTAINERS
+    COPY Makefile Makefile
+    COPY meson.build meson.build
+    COPY meson_options.txt meson_options.txt
+    COPY usertools usertools
+    COPY VERSION VERSION
+    COPY ABI_VERSION ABI_VERSION
+    COPY doc doc
+
+    # Build DPDK
+    RUN meson setup build
+    RUN ninja -C build
+
+  .. note::
+
+    Ensure the Dockerfile is placed in the top level DPDK directory.
+
+* Run the Pod
+
+  .. code-block:: console
+
+     # kubectl create -f pod.yaml
+
+  Sample pod.yaml:
+
+  .. code-block:: yaml
+
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: dpdk
+      annotations:
+        k8s.v1.cni.cncf.io/networks: afxdp-network
+    spec:
+      containers:
+      - name: testpmd
+        image: dpdk:latest
+        command: ["tail", "-f", "/dev/null"]
+        securityContext:
+          capabilities:
+            add:
+              - NET_RAW
+              - IPC_LOCK
+        resources:
+          requests:
+            afxdp/myPool: '1'
+          limits:
+            hugepages-1Gi: 2Gi
+            cpu: 2
+            memory: 256Mi
+            afxdp/myPool: '1'
+        volumeMounts:
+        - name: hugepages
+          mountPath: /dev/hugepages
+      volumes:
+      - name: hugepages
+        emptyDir:
+          medium: HugePages
+
+  For further reference please use the `pod.yaml`_
+
+  .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/examples/pod-spec.yaml
+
+  .. note::
+
+     For kernel versions older than 5.19, ``CAP_BPF`` is also required in
+     the container capabilities stanza.
+
+* Run DPDK with a command like the following:
+
+  .. code-block:: console
+
+     kubectl exec -i dpdk --container testpmd -- \
+           ./build/app/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \
+           --vdev net_af_xdp,iface=<interface-name>,start_queue=22,queue_count=1,uds_path=/tmp/afxdp_dp/<interface-name>/afxdp.sock \
+           -- -i -a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap
diff --git a/doc/guides/howto/index.rst b/doc/guides/howto/index.rst
index 71a3381c36..a7692e8a97 100644
--- a/doc/guides/howto/index.rst
+++ b/doc/guides/howto/index.rst
@@ -8,7 +8,7 @@ HowTo Guides
     :maxdepth: 2
     :numbered:
 
-    af_xdp_cni
+    af_xdp_dp
     lm_bond_virtio_sriov
     lm_virtio_vhost_user
     flow_bifurcation
diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
index 1932525d4d..0edc84328d 100644
--- a/doc/guides/nics/af_xdp.rst
+++ b/doc/guides/nics/af_xdp.rst
@@ -151,25 +151,32 @@ instead of zero copy mode (if available).
 
     --vdev net_af_xdp,iface=ens786f1,force_copy=1
 
-use_cni
-~~~~~~~
+uds_path
+~~~~~~~~
 
-The EAL vdev argument ``use_cni`` is used to indicate that the user wishes to
-enable the `AF_XDP Plugin for Kubernetes`_ within a DPDK application.
+The EAL vdev argument ``uds_path`` is used to indicate that the user wishes to
+use the `AF_XDP Plugin for Kubernetes`_ with a DPDK application running in a Pod.
 
 .. _AF_XDP Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
 
 .. code-block:: console
 
-   --vdev=net_af_xdp0,use_cni=1
+   --vdev=net_af_xdp0,uds_path=/tmp/afxdp_dp/<interface-name>/afxdp.sock
 
 .. note::
 
-   When using `use_cni`_, both parameters `xdp_prog`_ and `busy_budget`_ are disabled
-   as both of these will be handled by the AF_XDP plugin.
-   Since the DPDK application is running in limited privileges
-   so enabling and disabling of the promiscuous mode through the DPDK application
-   is also not supported.
+   The UDS ``afxdp.sock`` is available in the DPDK container through a
+   volume mounted by the `AF_XDP Plugin for Kubernetes`_ at the path
+   specified in the example above.
+
+.. note::
+
+   When using `uds_path`_, both parameters `xdp_prog`_ and `busy_budget`_ are disabled
+   as both of these will be handled by the AF_XDP Device Plugin (if required).
+   Since the pod/container is running with limited privileges, enabling and disabling
+   promiscuous mode through the DPDK application is also not supported.
+
+For more details please see: :doc:`../howto/af_xdp_dp`
 
 Limitations
 -----------
diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index 6f8ad27808..606cdf6316 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -55,6 +55,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Integration of AF_XDP PMD with AF_XDP Device Plugin**
+
+  The EAL vdev argument ``uds_path`` was added to the AF_XDP PMD
+  to allow Kubernetes Pods that wish to use AF_XDP with DPDK to run
+  with limited privileges. This argument indicates that the AF_XDP PMD
+  will be used in unprivileged mode and will receive the XSKMAP FD from
+  the AF_XDP Device Plugin.
 
 Removed Items
 -------------
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 353c8688ec..db6724b9e5 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, NOTICE);
 #define UDS_MAX_CMD_LEN			64
 #define UDS_MAX_CMD_RESP		128
 #define UDS_XSK_MAP_FD_MSG		"/xsk_map_fd"
-#define UDS_SOCK			"/tmp/afxdp.sock"
 #define UDS_CONNECT_MSG			"/connect"
 #define UDS_HOST_OK_MSG			"/host_ok"
 #define UDS_HOST_NAK_MSG		"/host_nak"
@@ -170,7 +169,7 @@ struct pmd_internals {
 	char prog_path[PATH_MAX];
 	bool custom_prog_configured;
 	bool force_copy;
-	bool use_cni;
+	char uds_path[PATH_MAX];
 	struct bpf_map *map;
 
 	struct rte_ether_addr eth_addr;
@@ -190,7 +189,7 @@ struct pmd_process_private {
 #define ETH_AF_XDP_PROG_ARG			"xdp_prog"
 #define ETH_AF_XDP_BUDGET_ARG			"busy_budget"
 #define ETH_AF_XDP_FORCE_COPY_ARG		"force_copy"
-#define ETH_AF_XDP_USE_CNI_ARG			"use_cni"
+#define ETH_AF_XDP_USE_DP_UDS_PATH_ARG		"uds_path"
 
 static const char * const valid_arguments[] = {
 	ETH_AF_XDP_IFACE_ARG,
@@ -200,7 +199,7 @@ static const char * const valid_arguments[] = {
 	ETH_AF_XDP_PROG_ARG,
 	ETH_AF_XDP_BUDGET_ARG,
 	ETH_AF_XDP_FORCE_COPY_ARG,
-	ETH_AF_XDP_USE_CNI_ARG,
+	ETH_AF_XDP_USE_DP_UDS_PATH_ARG,
 	NULL
 };
 
@@ -1351,7 +1350,7 @@ configure_preferred_busy_poll(struct pkt_rx_queue *rxq)
 }
 
 static int
-init_uds_sock(struct sockaddr_un *server)
+init_uds_sock(struct sockaddr_un *server, const char *uds_path)
 {
 	int sock;
 
@@ -1362,7 +1361,7 @@ init_uds_sock(struct sockaddr_un *server)
 	}
 
 	server->sun_family = AF_UNIX;
-	strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));
+	strlcpy(server->sun_path, uds_path, sizeof(server->sun_path));
 
 	if (connect(sock, (struct sockaddr *)server, sizeof(struct sockaddr_un)) < 0) {
 		close(sock);
@@ -1382,7 +1381,7 @@ struct msg_internal {
 };
 
 static int
-send_msg(int sock, char *request, int *fd)
+send_msg(int sock, char *request, int *fd, const char *uds_path)
 {
 	int snd;
 	struct iovec iov;
@@ -1393,7 +1392,7 @@ send_msg(int sock, char *request, int *fd)
 
 	memset(&dst, 0, sizeof(dst));
 	dst.sun_family = AF_UNIX;
-	strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));
+	strlcpy(dst.sun_path, uds_path, sizeof(dst.sun_path));
 
 	/* Initialize message header structure */
 	memset(&msgh, 0, sizeof(msgh));
@@ -1470,8 +1469,8 @@ read_msg(int sock, char *response, struct sockaddr_un *s, int *fd)
 }
 
 static int
-make_request_cni(int sock, struct sockaddr_un *server, char *request,
-		 int *req_fd, char *response, int *out_fd)
+make_request_dp(int sock, struct sockaddr_un *server, char *request,
+		int *req_fd, char *response, int *out_fd, const char *uds_path)
 {
 	int rval;
 
@@ -1483,7 +1482,7 @@ make_request_cni(int sock, struct sockaddr_un *server, char *request,
 	if (req_fd == NULL)
 		rval = write(sock, request, strlen(request));
 	else
-		rval = send_msg(sock, request, req_fd);
+		rval = send_msg(sock, request, req_fd, uds_path);
 
 	if (rval < 0) {
 		AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno));
@@ -1507,7 +1506,7 @@ check_response(char *response, char *exp_resp, long size)
 }
 
 static int
-get_cni_fd(char *if_name)
+get_xskmap_fd(char *if_name, const char *uds_path)
 {
 	char request[UDS_MAX_CMD_LEN], response[UDS_MAX_CMD_RESP];
 	char hostname[MAX_LONG_OPT_SZ], exp_resp[UDS_MAX_CMD_RESP];
@@ -1520,14 +1519,14 @@ get_cni_fd(char *if_name)
 		return -1;
 
 	memset(&server, 0, sizeof(server));
-	sock = init_uds_sock(&server);
+	sock = init_uds_sock(&server, uds_path);
 	if (sock < 0)
 		return -1;
 
-	/* Initiates handshake to CNI send: /connect,hostname */
+	/* Initiates handshake to AF_XDP Device Plugin send: /connect,hostname */
 	snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG, hostname);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1541,7 +1540,7 @@ get_cni_fd(char *if_name)
 	/* Request for "/version" */
 	strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1549,7 +1548,7 @@ get_cni_fd(char *if_name)
 	/* Request for file descriptor for netdev name*/
 	snprintf(request, sizeof(request), "%s,%s", UDS_XSK_MAP_FD_MSG, if_name);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1571,7 +1570,7 @@ get_cni_fd(char *if_name)
 	/* Initiate close connection */
 	strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1640,7 +1639,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 #endif
 
 	/* Disable libbpf from loading XDP program */
-	if (internals->use_cni)
+	if (strnlen(internals->uds_path, PATH_MAX))
 		cfg.libbpf_flags |= XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
 
 	if (strnlen(internals->prog_path, PATH_MAX)) {
@@ -1694,18 +1693,17 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 		}
 	}
 
-	if (internals->use_cni) {
-		int err, fd, map_fd;
+	if (strnlen(internals->uds_path, PATH_MAX)) {
+		int err, map_fd;
 
-		/* get socket fd from CNI plugin */
-		map_fd = get_cni_fd(internals->if_name);
+		/* get the XSKMAP fd from the AF_XDP Device Plugin */
+		map_fd = get_xskmap_fd(internals->if_name, internals->uds_path);
 		if (map_fd < 0) {
-			AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n");
+			AF_XDP_LOG(ERR, "Failed to receive xskmap fd from AF_XDP Device Plugin\n");
 			goto out_xsk;
 		}
-		/* get socket fd */
-		fd = xsk_socket__fd(rxq->xsk);
-		err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0);
+
+		err = xsk_socket__update_xskmap(rxq->xsk, map_fd);
 		if (err) {
 			AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in map.\n");
 			goto out_xsk;
@@ -1881,13 +1879,13 @@ static const struct eth_dev_ops ops = {
 	.get_monitor_addr = eth_get_monitor_addr,
 };
 
-/* CNI option works in unprivileged container environment
- * and ethernet device functionality will be reduced. So
- * additional customiszed eth_dev_ops struct is needed
- * for cni. Promiscuous enable and disable functionality
- * is removed.
- **/
-static const struct eth_dev_ops ops_cni = {
+/* AF_XDP Device Plugin option works in unprivileged
+ * container environment and ethernet device functionality
+ * will be reduced. So additional customized eth_dev_ops
+ * struct is needed for the AF_XDP Device Plugin. Promiscuous
+ * enable and disable functionality is removed.
+ */
+static const struct eth_dev_ops ops_afxdp_dp = {
 	.dev_start = eth_dev_start,
 	.dev_stop = eth_dev_stop,
 	.dev_close = eth_dev_close,
@@ -1957,7 +1955,7 @@ parse_name_arg(const char *key __rte_unused,
 
 /** parse xdp prog argument */
 static int
-parse_prog_arg(const char *key __rte_unused,
+parse_path_arg(const char *key __rte_unused,
 	       const char *value, void *extra_args)
 {
 	char *path = extra_args;
@@ -2023,7 +2021,7 @@ xdp_get_channels_info(const char *if_name, int *max_queues,
 static int
 parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 		 int *queue_cnt, int *shared_umem, char *prog_path,
-		 int *busy_budget, int *force_copy, int *use_cni)
+		 int *busy_budget, int *force_copy, char *uds_path)
 {
 	int ret;
 
@@ -2050,7 +2048,7 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 		goto free_kvlist;
 
 	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_PROG_ARG,
-				 &parse_prog_arg, prog_path);
+				 &parse_path_arg, prog_path);
 	if (ret < 0)
 		goto free_kvlist;
 
@@ -2064,8 +2062,8 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 	if (ret < 0)
 		goto free_kvlist;
 
-	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_CNI_ARG,
-				 &parse_integer_arg, use_cni);
+	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_DP_UDS_PATH_ARG,
+				 &parse_path_arg, uds_path);
 	if (ret < 0)
 		goto free_kvlist;
 
@@ -2108,7 +2106,7 @@ static struct rte_eth_dev *
 init_internals(struct rte_vdev_device *dev, const char *if_name,
 	       int start_queue_idx, int queue_cnt, int shared_umem,
 	       const char *prog_path, int busy_budget, int force_copy,
-	       int use_cni)
+	       const char *uds_path)
 {
 	const char *name = rte_vdev_device_name(dev);
 	const unsigned int numa_node = dev->device.numa_node;
@@ -2137,7 +2135,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
 #endif
 	internals->shared_umem = shared_umem;
 	internals->force_copy = force_copy;
-	internals->use_cni = use_cni;
+	strlcpy(internals->uds_path, uds_path, PATH_MAX);
 
 	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
 				  &internals->combined_queue_cnt)) {
@@ -2196,10 +2194,10 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
 	eth_dev->data->dev_link = pmd_link;
 	eth_dev->data->mac_addrs = &internals->eth_addr;
 	eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;
-	if (!internals->use_cni)
+	if (!strnlen(internals->uds_path, PATH_MAX))
 		eth_dev->dev_ops = &ops;
 	else
-		eth_dev->dev_ops = &ops_cni;
+		eth_dev->dev_ops = &ops_afxdp_dp;
 
 	eth_dev->rx_pkt_burst = eth_af_xdp_rx;
 	eth_dev->tx_pkt_burst = eth_af_xdp_tx;
@@ -2327,7 +2325,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 	char prog_path[PATH_MAX] = {'\0'};
 	int busy_budget = -1, ret;
 	int force_copy = 0;
-	int use_cni = 0;
+	char uds_path[PATH_MAX] = {'\0'};
 	struct rte_eth_dev *eth_dev = NULL;
 	const char *name = rte_vdev_device_name(dev);
 
@@ -2370,20 +2368,20 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
 			     &xsk_queue_cnt, &shared_umem, prog_path,
-			     &busy_budget, &force_copy, &use_cni) < 0) {
+			     &busy_budget, &force_copy, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
 		return -EINVAL;
 	}
 
-	if (use_cni && busy_budget > 0) {
+	if (strnlen(uds_path, PATH_MAX) && busy_budget > 0) {
 		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n",
-			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_BUDGET_ARG);
+			ETH_AF_XDP_USE_DP_UDS_PATH_ARG, ETH_AF_XDP_BUDGET_ARG);
 		return -EINVAL;
 	}
 
-	if (use_cni && strnlen(prog_path, PATH_MAX)) {
+	if (strnlen(uds_path, PATH_MAX) && strnlen(prog_path, PATH_MAX)) {
 		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n",
-			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_PROG_ARG);
+			ETH_AF_XDP_USE_DP_UDS_PATH_ARG, ETH_AF_XDP_PROG_ARG);
 			return -EINVAL;
 	}
 
@@ -2410,7 +2408,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
 				 xsk_queue_cnt, shared_umem, prog_path,
-				 busy_budget, force_copy, use_cni);
+				 busy_budget, force_copy, uds_path);
 	if (eth_dev == NULL) {
 		AF_XDP_LOG(ERR, "Failed to init internals\n");
 		return -1;
@@ -2471,4 +2469,4 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
 			      "xdp_prog=<string> "
 			      "busy_budget=<int> "
 			      "force_copy=<int> "
-			      "use_cni=<int> ");
+			      "uds_path=<string> ");
-- 
2.41.0
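
The probe-time rules in the patch above (a non-empty 'uds_path' rejects both a positive 'busy_budget' and a custom 'xdp_prog', and selects the reduced ops table) can be sketched in isolation. The block below is a minimal standalone illustration, not the driver code: it uses no DPDK headers, and `struct af_xdp_args`, `validate_args`, `select_ops`, `enum ops_kind` and `SKETCH_PATH_MAX` are hypothetical names invented for this example.

```c
#include <stddef.h>

/* Assumption: a fixed buffer size standing in for PATH_MAX in the driver. */
#define SKETCH_PATH_MAX 4096

/* Hypothetical container for the parsed vdev arguments (not a DPDK type). */
struct af_xdp_args {
	char uds_path[SKETCH_PATH_MAX];  /* empty string: AF_XDP DP not used */
	char prog_path[SKETCH_PATH_MAX]; /* empty string: no custom XDP program */
	int busy_budget;                 /* -1: not given on the command line */
};

/* Stand-ins for the two eth_dev_ops tables (ops and ops_afxdp_dp). */
enum ops_kind { OPS_STANDARD, OPS_AFXDP_DP };

/* Return 0 if the argument combination is accepted, -1 otherwise. */
static int
validate_args(const struct af_xdp_args *a)
{
	int use_dp = a->uds_path[0] != '\0';

	/* uds_path and a positive busy_budget are mutually exclusive. */
	if (use_dp && a->busy_budget > 0)
		return -1;

	/* uds_path and xdp_prog are mutually exclusive. */
	if (use_dp && a->prog_path[0] != '\0')
		return -1;

	return 0;
}

/* A non-empty uds_path selects the reduced, unprivileged ops table. */
static enum ops_kind
select_ops(const struct af_xdp_args *a)
{
	return a->uds_path[0] != '\0' ? OPS_AFXDP_DP : OPS_STANDARD;
}
```

In this sketch an empty string plays the role that `use_cni = 0` played before the patch, which is why no separate boolean is needed once the path itself is carried through to init_internals().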



* RE: [v6] net/af_xdp: enable uds_path instead of use_cni
  2023-12-15 15:39 [v6] net/af_xdp: enable uds_path instead of use_cni Maryam Tahhan
@ 2023-12-21 10:44 ` Koikkara Reeny, Shibin
  2023-12-21 11:36 ` David Marchand
  1 sibling, 0 replies; 5+ messages in thread
From: Koikkara Reeny, Shibin @ 2023-12-21 10:44 UTC (permalink / raw)
  To: Tahhan, Maryam, ferruh.yigit, stephen, lihuisong, fengchengwen,
	liuyonglong, Loftus, Ciara
  Cc: dev, Tahhan, Maryam

Thank you Maryam for the changes. 
We can merge the patch.

Regards,
Shibin

> -----Original Message-----
> From: Maryam Tahhan <mtahhan@redhat.com>
> Sent: Friday, December 15, 2023 3:40 PM
> To: ferruh.yigit@amd.com; stephen@networkplumber.org;
> lihuisong@huawei.com; fengchengwen@huawei.com;
> liuyonglong@huawei.com; Koikkara Reeny, Shibin
> <shibin.koikkara.reeny@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>
> Cc: dev@dpdk.org; Tahhan, Maryam <mtahhan@redhat.com>
> Subject: [v6] net/af_xdp: enable uds_path instead of use_cni
> 
> With the original 'use_cni' implementation, (using a hardcoded socket rather
> than a configurable one), if a DPDK pod is requesting multiple net devices
> and these devices are from different pools, then the container attempts to
> mount all the netdev UDSes in the pod as /tmp/afxdp.sock. Which means
> that at best only 1 netdev will handshake correctly with the AF_XDP DP. This
> patch addresses this by making the socket parameter configurable using a
> new vdev param called 'uds_path' and removing the previous 'use_cni'
> param.
> This patch also fixes incorrect references to the AF_XDP DP as CNI and
> updates the documentation with a working example. This change has been
> tested with the AF_XDP DP PR 81[1], with both single and multiple interfaces.
> 
> [1] https://github.com/intel/afxdp-plugins-for-kubernetes/pull/81
> 
> v6:
> * Add link to PR 81 in commit message
> * Add release notes changes to this patchset
> 
> v5:
> * Fix alignment for ETH_AF_XDP_USE_DP_UDS_PATH_ARG
> * Remove use_cni references in af_xdp.rst
> 
> v4:
> * Rename af_xdp_cni.rst to af_xdp_dp.rst
> * Removed all incorrect references to CNI throughout af_xdp
>   PMD file.
> * Fixed Typos in af_xdp_dp.rst
> 
> v3:
> * Remove `use_cni` vdev argument as it's no longer needed.
> * Update incorrect CNI references for the AF_XDP DP in the
>   documentation.
> * Update the documentation to run a simple example with the
>   AF_XDP DP plugin in K8s.
> 
> v2:
> * Rename sock_path to uds_path.
> * Update documentation to reflect when CAP_BPF is needed.
> * Fix testpmd arguments in the provided example for Pods.
> * Use AF_XDP API to update the xskmap entry.
> 
> Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
> Reviewed-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
>  doc/guides/howto/af_xdp_cni.rst        | 253 ----------------------
>  doc/guides/howto/af_xdp_dp.rst         | 278 +++++++++++++++++++++++++
>  doc/guides/howto/index.rst             |   2 +-
>  doc/guides/nics/af_xdp.rst             |  27 ++-
>  doc/guides/rel_notes/release_24_03.rst |   7 +
>  drivers/net/af_xdp/rte_eth_af_xdp.c    | 100 +++++----
>  6 files changed, 352 insertions(+), 315 deletions(-)
>  delete mode 100644 doc/guides/howto/af_xdp_cni.rst
>  create mode 100644 doc/guides/howto/af_xdp_dp.rst
> 
> diff --git a/doc/guides/howto/af_xdp_cni.rst b/doc/guides/howto/af_xdp_cni.rst
> deleted file mode 100644
> index a1a6d5b99c..0000000000
> --- a/doc/guides/howto/af_xdp_cni.rst
> +++ /dev/null
> @@ -1,253 +0,0 @@
> -.. SPDX-License-Identifier: BSD-3-Clause
> -   Copyright(c) 2023 Intel Corporation.
> -
> -Using a CNI with the AF_XDP driver
> -==================================
> -
> -Introduction
> -------------
> -
> -CNI, the Container Network Interface, is a technology for configuring
> -container network interfaces
> -and which can be used to setup Kubernetes networking.
> -AF_XDP is a Linux socket Address Family that enables an XDP program
> -to redirect packets to a memory buffer in userspace.
> -
> -This document explains how to enable the `AF_XDP Plugin for Kubernetes`_ within
> -a DPDK application using the :doc:`../nics/af_xdp` to connect and use these technologies.
> -
> -.. _AF_XDP Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
> -
> -
> -Background
> -----------
> -
> -The standard :doc:`../nics/af_xdp` initialization process involves loading an eBPF program
> -onto the kernel netdev to be used by the PMD.
> -This operation requires root or escalated Linux privileges
> -and thus prevents the PMD from working in an unprivileged container.
> -The AF_XDP CNI plugin handles this situation
> -by providing a device plugin that performs the program loading.
> -
> -At a technical level the CNI opens a Unix Domain Socket and listens for a client
> -to make requests over that socket.
> -A DPDK application acting as a client connects and initiates a configuration "handshake".
> -The client then receives a file descriptor which points to the XSKMAP
> -associated with the loaded eBPF program.
> -The XSKMAP is a BPF map of AF_XDP sockets (XSK).
> -The client can then proceed with creating an AF_XDP socket
> -and inserting that socket into the XSKMAP pointed to by the descriptor.
> -
> -The EAL vdev argument ``use_cni`` is used to indicate that the user wishes
> -to run the PMD in unprivileged mode and to receive the XSKMAP file descriptor
> -from the CNI.
> -When this flag is set,
> -the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag
> -should be used when creating the socket
> -to instruct libbpf not to load the default libbpf program on the netdev.
> -Instead the loading is handled by the CNI.
> -
> -.. note::
> -
> -   The Unix Domain Socket file path appear in the end user is "/tmp/afxdp.sock".
> -
> -
> -Prerequisites
> --------------
> -
> -Docker and container prerequisites:
> -
> -* Set up the device plugin
> -  as described in the instructions for `AF_XDP Plugin for Kubernetes`_.
> -
> -* The Docker image should contain the libbpf and libxdp libraries,
> -  which are dependencies for AF_XDP,
> -  and should include support for the ``ethtool`` command.
> -
> -* The Pod should have enabled the capabilities ``CAP_NET_RAW`` and ``CAP_BPF``
> -  for AF_XDP along with support for hugepages.
> -
> -* Increase locked memory limit so containers have enough memory for packet buffers.
> -  For example:
> -
> -  .. code-block:: console
> -
> -     cat << EOF | sudo tee /etc/systemd/system/containerd.service.d/limits.conf
> -     [Service]
> -     LimitMEMLOCK=infinity
> -     EOF
> -
> -* dpdk-testpmd application should have AF_XDP feature enabled.
> -
> -  For further information see the docs for the: :doc:`../../nics/af_xdp`.
> -
> -
> -Example
> --------
> -
> -Howto run dpdk-testpmd with CNI plugin:
> -
> -* Clone the CNI plugin
> -
> -  .. code-block:: console
> -
> -     # git clone https://github.com/intel/afxdp-plugins-for-kubernetes.git
> -
> -* Build the CNI plugin
> -
> -  .. code-block:: console
> -
> -     # cd afxdp-plugins-for-kubernetes/
> -     # make build
> -
> -  .. note::
> -
> -     CNI plugin has a dependence on the config.json.
> -
> -  Sample Config.json
> -
> -  .. code-block:: json
> -
> -     {
> -        "logLevel":"debug",
> -        "logFile":"afxdp-dp-e2e.log",
> -        "pools":[
> -           {
> -              "name":"e2e",
> -              "mode":"primary",
> -              "timeout":30,
> -              "ethtoolCmds" : ["-L -device- combined 1"],
> -              "devices":[
> -                 {
> -                    "name":"ens785f0"
> -                 }
> -              ]
> -           }
> -        ]
> -     }
> -
> -  For further reference please use the `config.json`_
> -
> -  .. _config.json: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/config.json
> -
> -* Create the Network Attachment definition
> -
> -  .. code-block:: console
> -
> -     # kubectl create -f nad.yaml
> -
> -  Sample nad.yml
> -
> -  .. code-block:: yaml
> -
> -      apiVersion: "k8s.cni.cncf.io/v1"
> -      kind: NetworkAttachmentDefinition
> -      metadata:
> -        name: afxdp-e2e-test
> -        annotations:
> -          k8s.v1.cni.cncf.io/resourceName: afxdp/e2e
> -      spec:
> -        config: '{
> -            "cniVersion": "0.3.0",
> -            "type": "afxdp",
> -            "mode": "cdq",
> -            "logFile": "afxdp-cni-e2e.log",
> -            "logLevel": "debug",
> -            "ipam": {
> -              "type": "host-local",
> -              "subnet": "192.168.1.0/24",
> -              "rangeStart": "192.168.1.200",
> -              "rangeEnd": "192.168.1.216",
> -              "routes": [
> -                { "dst": "0.0.0.0/0" }
> -              ],
> -              "gateway": "192.168.1.1"
> -            }
> -          }'
> -
> -  For further reference please use the `nad.yaml`_
> -
> -  .. _nad.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/nad.yaml
> -
> -* Build the Docker image
> -
> -  .. code-block:: console
> -
> -     # docker build -t afxdp-e2e-test -f Dockerfile .
> -
> -  Sample Dockerfile:
> -
> -  .. code-block:: console
> -
> -     FROM ubuntu:20.04
> -     RUN apt-get update -y
> -     RUN apt install build-essential libelf-dev -y
> -     RUN apt-get install iproute2  acl -y
> -     RUN apt install python3-pyelftools ethtool -y
> -     RUN apt install libnuma-dev libjansson-dev libpcap-dev net-tools -y
> -     RUN apt-get install clang llvm -y
> -     COPY ./libbpf<version>.tar.gz /tmp
> -     RUN cd /tmp && tar -xvmf libbpf<version>.tar.gz && cd libbpf/src && make install
> -     COPY ./libxdp<version>.tar.gz /tmp
> -     RUN cd /tmp && tar -xvmf libxdp<version>.tar.gz && cd libxdp && make install
> -
> -  .. note::
> -
> -     All the files that need to COPY-ed should be in the same directory as the Dockerfile
> -
> -* Run the Pod
> -
> -  .. code-block:: console
> -
> -     # kubectl create -f pod.yaml
> -
> -  Sample pod.yaml:
> -
> -  .. code-block:: yaml
> -
> -     apiVersion: v1
> -     kind: Pod
> -     metadata:
> -       name: afxdp-e2e-test
> -       annotations:
> -         k8s.v1.cni.cncf.io/networks: afxdp-e2e-test
> -     spec:
> -       containers:
> -       - name: afxdp
> -         image: afxdp-e2e-test:latest
> -         imagePullPolicy: Never
> -         env:
> -         - name: LD_LIBRARY_PATH
> -           value: /usr/lib64/:/usr/local/lib/
> -         command: ["tail", "-f", "/dev/null"]
> -         securityContext:
> -          capabilities:
> -             add:
> -               - CAP_NET_RAW
> -               - CAP_BPF
> -         resources:
> -           requests:
> -             hugepages-2Mi: 2Gi
> -             memory: 2Gi
> -             afxdp/e2e: '1'
> -           limits:
> -             hugepages-2Mi: 2Gi
> -             memory: 2Gi
> -             afxdp/e2e: '1'
> -
> -  For further reference please use the `pod.yaml`_
> -
> -  .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml
> -
> -* Run DPDK with a command like the following:
> -
> -  .. code-block:: console
> -
> -     kubectl exec -i <Pod name> --container <containers name> -- \
> -           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
> -           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
> -           -- --no-mlockall --in-memory
> -
> -For further reference please use the `e2e`_ test case in `AF_XDP Plugin for Kubernetes`_
> -
> -  .. _e2e: https://github.com/intel/afxdp-plugins-for-kubernetes/tree/v0.0.2/test/e2e
> diff --git a/doc/guides/howto/af_xdp_dp.rst b/doc/guides/howto/af_xdp_dp.rst
> new file mode 100644
> index 0000000000..7717d59224
> --- /dev/null
> +++ b/doc/guides/howto/af_xdp_dp.rst
> @@ -0,0 +1,278 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright(c) 2023 Intel Corporation.
> +
> +Using the AF_XDP Device Plugin with the AF_XDP driver
> +======================================================
> +
> +Introduction
> +------------
> +
> +The `AF_XDP Device Plugin for Kubernetes`_ is a project that provisions
> +and advertises interfaces (that can be used with AF_XDP) to Kubernetes.
> +The project also includes a `CNI`_.
> +
> +AF_XDP is a Linux socket Address Family that enables an XDP program to
> +redirect packets to a memory buffer in userspace.
> +
> +This document explains how to use the `AF_XDP Device Plugin for
> +Kubernetes`_ with a DPDK :doc:`../nics/af_xdp` based application running in a Pod.
> +
> +.. _AF_XDP Device Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
> +.. _CNI: https://github.com/containernetworking/cni
> +
> +Background
> +----------
> +
> +The standard :doc:`../nics/af_xdp` initialization process involves
> +loading an eBPF program onto the Kernel netdev to be used by the PMD.
> +This operation requires root or escalated Linux privileges and prevents
> +the PMD from working in an unprivileged container. The AF_XDP Device
> +Plugin (DP) addresses this situation by providing an entity that
> +manages the eBPF program lifecycle for Pod interfaces that wish to use
> +AF_XDP. This in turn allows the Pod to be used without privilege escalation.
> +
> +In order for the pod to run without privilege escalation, the AF_XDP DP
> +creates a Unix Domain Socket (UDS) and listens for Pods to make
> +requests for XSKMAP(s) File Descriptors (FDs) for interfaces in their network namespace.
> +In other words, the DPDK application running in the Pod connects to
> +this UDS and initiates a "handshake" to retrieve the XSKMAP(s) FD(s).
> +Upon a successful "handshake", the DPDK application receives the FD(s)
> +for the XSKMAP(s) associated with the relevant netdevs. The DPDK
> +application can then create the AF_XDP socket(s), and attach the socket(s)
> +to the netdev queue(s) by inserting the socket(s) into the XSKMAP(s).
> +
> +The EAL vdev argument ``uds_path`` is used to indicate that the user
> +wishes to run the AF_XDP PMD in unprivileged mode and to receive the
> +XSKMAP FD from the AF_XDP DP. When this param is used, the
> +``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag is used when
> +creating the AF_XDP socket to instruct libbpf/libxdp not to load the
> +default eBPF redirect program for AF_XDP on the netdev. Instead the
> +lifecycle management of the eBPF program is handled by the AF_XDP DP.
> +
> +.. note::
> +
> +   The UDS file path inside the pod appears at "/tmp/afxdp_dp/<netdev>/afxdp.sock".
> +
> +Prerequisites
> +-------------
> +
> +Device Plugin and DPDK container prerequisites:
> +
> +* Create a DPDK container image.
> +
> +* Set up the device plugin and prepare the Pod Spec as described in
> +  the instructions for `AF_XDP Device Plugin for Kubernetes`_.
> +
> +* Increase locked memory limit so containers have enough memory for packet buffers.
> +  For example:
> +
> +  .. code-block:: console
> +
> +     cat << EOF | sudo tee /etc/systemd/system/containerd.service.d/limits.conf
> +     [Service]
> +     LimitMEMLOCK=infinity
> +     EOF
> +
> +* dpdk-testpmd application should have AF_XDP feature enabled.
> +
> +  For further information, see :doc:`../nics/af_xdp`.
> +
> +
> +Example
> +-------
> +
> +How to run dpdk-testpmd with the AF_XDP Device plugin:
> +
> +* Clone the AF_XDP Device plugin
> +
> +  .. code-block:: console
> +
> +     # git clone https://github.com/intel/afxdp-plugins-for-kubernetes.git
> +
> +* Build the AF_XDP Device plugin and the CNI
> +
> +  .. code-block:: console
> +
> +     # cd afxdp-plugins-for-kubernetes/
> +     # make image
> +
> +* Make sure to modify the image used by the `daemonset.yml`_ file in the deployments directory with
> +  the following configuration:
> +
> +   .. _daemonset.yml : https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/deployments/daemonset.yml
> +
> +  .. code-block:: yaml
> +
> +    image: afxdp-device-plugin:latest
> +
> +  .. note::
> +
> +    This will select the AF_XDP DP image that was built locally. Detailed configuration
> +    options can be found in the AF_XDP Device Plugin `readme`_ .
> +
> +  .. _readme: https://github.com/intel/afxdp-plugins-for-kubernetes#readme
> +
> +* Deploy the AF_XDP Device Plugin and CNI
> +
> +  .. code-block:: console
> +
> +    # kubectl create -f deployments/daemonset.yml
> +
> +* Create a Network Attachment Definition (NAD)
> +
> +  .. code-block:: console
> +
> +    # kubectl create -f nad.yaml
> +
> +  Sample nad.yaml:
> +
> +  .. code-block:: yaml
> +
> +    apiVersion: "k8s.cni.cncf.io/v1"
> +    kind: NetworkAttachmentDefinition
> +    metadata:
> +      name: afxdp-network
> +      annotations:
> +        k8s.v1.cni.cncf.io/resourceName: afxdp/myPool
> +    spec:
> +      config: '{
> +          "cniVersion": "0.3.0",
> +          "type": "afxdp",
> +          "mode": "primary",
> +          "logFile": "afxdp-cni.log",
> +          "logLevel": "debug",
> +          "ethtoolCmds" : ["-N -device- rx-flow-hash udp4 fn",
> +                           "-N -device- flow-type udp4 dst-port 2152 action 22"
> +                        ],
> +          "ipam": {
> +            "type": "host-local",
> +            "subnet": "192.168.1.0/24",
> +            "rangeStart": "192.168.1.200",
> +            "rangeEnd": "192.168.1.220",
> +            "routes": [
> +              { "dst": "0.0.0.0/0" }
> +            ],
> +            "gateway": "192.168.1.1"
> +          }
> +        }'
> +
> +  For further reference please use the example provided by the AF_XDP DP `nad.yaml`_
> +
> +  .. _nad.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/examples/network-attachment-definition.yaml
> +
> +* Build a DPDK container image (using Docker)
> +
> +  .. code-block:: console
> +
> +    # docker build -t dpdk -f Dockerfile .
> +
> +  Sample Dockerfile (should be placed in top level DPDK directory):
> +
> +  .. code-block:: console
> +
> +    FROM fedora:38
> +
> +    # Setup container to build DPDK applications
> +    RUN dnf -y upgrade && dnf -y install \
> +        libbsd-devel \
> +        numactl-libs \
> +        libbpf-devel \
> +        libbpf \
> +        meson \
> +        ninja-build \
> +        libxdp-devel \
> +        libxdp \
> +        numactl-devel \
> +        python3-pyelftools \
> +        python38 \
> +        iproute
> +    RUN dnf groupinstall -y 'Development Tools'
> +
> +    # Create DPDK dir and copy over sources
> +    WORKDIR /dpdk
> +    COPY app app
> +    COPY builddir  builddir
> +    COPY buildtools buildtools
> +    COPY config config
> +    COPY devtools devtools
> +    COPY drivers drivers
> +    COPY dts dts
> +    COPY examples examples
> +    COPY kernel kernel
> +    COPY lib lib
> +    COPY license license
> +    COPY MAINTAINERS MAINTAINERS
> +    COPY Makefile Makefile
> +    COPY meson.build meson.build
> +    COPY meson_options.txt meson_options.txt
> +    COPY usertools usertools
> +    COPY VERSION VERSION
> +    COPY ABI_VERSION ABI_VERSION
> +    COPY doc doc
> +
> +    # Build DPDK
> +    RUN meson setup build
> +    RUN ninja -C build
> +
> +  .. note::
> +
> +    Ensure the Dockerfile is placed in the top level DPDK directory.
> +
> +* Run the Pod
> +
> +  .. code-block:: console
> +
> +     # kubectl create -f pod.yaml
> +
> +  Sample pod.yaml:
> +
> +  .. code-block:: yaml
> +
> +    apiVersion: v1
> +    kind: Pod
> +    metadata:
> +     name: dpdk
> +     annotations:
> +       k8s.v1.cni.cncf.io/networks: afxdp-network
> +    spec:
> +      containers:
> +      - name: testpmd
> +        image: dpdk:latest
> +        command: ["tail", "-f", "/dev/null"]
> +        securityContext:
> +          capabilities:
> +            add:
> +              - NET_RAW
> +              - IPC_LOCK
> +        resources:
> +          requests:
> +            afxdp/myPool: '1'
> +          limits:
> +            hugepages-1Gi: 2Gi
> +            cpu: 2
> +            memory: 256Mi
> +            afxdp/myPool: '1'
> +        volumeMounts:
> +        - name: hugepages
> +          mountPath: /dev/hugepages
> +      volumes:
> +      - name: hugepages
> +        emptyDir:
> +          medium: HugePages
> +
> +  For further reference please use the `pod.yaml`_
> +
> +  .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/examples/pod-spec.yaml
> +
> +.. note::
> +
> +   For Kernel versions older than 5.19 ``CAP_BPF`` is also required in
> +   the container capabilities stanza.
> +
> +* Run DPDK with a command like the following:
> +
> +  .. code-block:: console
> +
> +     kubectl exec -i dpdk --container testpmd -- \
> +           ./build/app/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \
> +           --vdev net_af_xdp,iface=<interface name>,start_queue=22,queue_count=1,uds_path=/tmp/afxdp_dp/<interface-name>/afxdp.sock \
> +           -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
> diff --git a/doc/guides/howto/index.rst b/doc/guides/howto/index.rst
> index 71a3381c36..a7692e8a97 100644
> --- a/doc/guides/howto/index.rst
> +++ b/doc/guides/howto/index.rst
> @@ -8,7 +8,7 @@ HowTo Guides
>      :maxdepth: 2
>      :numbered:
> 
> -    af_xdp_cni
> +    af_xdp_dp
>      lm_bond_virtio_sriov
>      lm_virtio_vhost_user
>      flow_bifurcation
> diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
> index 1932525d4d..0edc84328d 100644
> --- a/doc/guides/nics/af_xdp.rst
> +++ b/doc/guides/nics/af_xdp.rst
> @@ -151,25 +151,32 @@ instead of zero copy mode (if available).
> 
>      --vdev net_af_xdp,iface=ens786f1,force_copy=1
> 
> -use_cni
> -~~~~~~~
> +uds_path
> +~~~~~~~~~
> 
> -The EAL vdev argument ``use_cni`` is used to indicate that the user wishes to
> -enable the `AF_XDP Plugin for Kubernetes`_ within a DPDK application.
> +The EAL vdev argument ``uds_path`` is used to indicate that the user
> +wishes to use the `AF_XDP Plugin for Kubernetes`_ with a DPDK application running in a Pod.
> 
>  .. _AF_XDP Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
> 
>  .. code-block:: console
> 
> -   --vdev=net_af_xdp0,use_cni=1
> +   --vdev=net_af_xdp0,uds_path=/tmp/afxdp_dp/<interface-name>/afxdp.sock
> 
>  .. note::
> 
> -   When using `use_cni`_, both parameters `xdp_prog`_ and `busy_budget`_ are disabled
> -   as both of these will be handled by the AF_XDP plugin.
> -   Since the DPDK application is running in limited privileges
> -   so enabling and disabling of the promiscuous mode through the DPDK application
> -   is also not supported.
> +   The UDS ``afxdp.sock`` is available in the DPDK container through a
> +   volume mounted by the `AF_XDP Plugin for Kubernetes`_ at the path
> +   specified in the example above.
> +
> +.. note::
> +
> +   When using `uds_path`_, both parameters `xdp_prog`_ and `busy_budget`_ are disabled
> +   as both of these will be handled by the AF_XDP Device plugin (if required).
> +   Since the pod/container is running with limited privileges enabling and disabling
> +   of promiscuous mode through the DPDK application is also not supported.
> +
> +For more details please see: :doc:`../howto/af_xdp_dp`
> 
>  Limitations
>  -----------
> diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
> index 6f8ad27808..606cdf6316 100644
> --- a/doc/guides/rel_notes/release_24_03.rst
> +++ b/doc/guides/rel_notes/release_24_03.rst
> @@ -55,6 +55,13 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
> 
> +* **Integration of AF_XDP PMD with AF_XDP Device Plugin**
> +
> +  The EAL vdev argument for the AF_XDP PMD ``uds_path`` was added
> +  to allow Kubernetes Pods that wish to use AF_XDP with DPDK to run
> +  with limited privileges. This flag indicates that the AF_XDP PMD
> +  will be used in unprivileged mode and will receive the XSKMAP FD from
> +  the AF_XDP Device Plugin.
> 
>  Removed Items
>  -------------
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index 353c8688ec..db6724b9e5 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, NOTICE);
>  #define UDS_MAX_CMD_LEN			64
>  #define UDS_MAX_CMD_RESP		128
>  #define UDS_XSK_MAP_FD_MSG		"/xsk_map_fd"
> -#define UDS_SOCK			"/tmp/afxdp.sock"
>  #define UDS_CONNECT_MSG			"/connect"
>  #define UDS_HOST_OK_MSG			"/host_ok"
>  #define UDS_HOST_NAK_MSG		"/host_nak"
> @@ -170,7 +169,7 @@ struct pmd_internals {
>  	char prog_path[PATH_MAX];
>  	bool custom_prog_configured;
>  	bool force_copy;
> -	bool use_cni;
> +	char uds_path[PATH_MAX];
>  	struct bpf_map *map;
> 
>  	struct rte_ether_addr eth_addr;
> @@ -190,7 +189,7 @@ struct pmd_process_private {
>  #define ETH_AF_XDP_PROG_ARG			"xdp_prog"
>  #define ETH_AF_XDP_BUDGET_ARG			"busy_budget"
>  #define ETH_AF_XDP_FORCE_COPY_ARG		"force_copy"
> -#define ETH_AF_XDP_USE_CNI_ARG			"use_cni"
> +#define ETH_AF_XDP_USE_DP_UDS_PATH_ARG		"uds_path"
> 
>  static const char * const valid_arguments[] = {
>  	ETH_AF_XDP_IFACE_ARG,
> @@ -200,7 +199,7 @@ static const char * const valid_arguments[] = {
>  	ETH_AF_XDP_PROG_ARG,
>  	ETH_AF_XDP_BUDGET_ARG,
>  	ETH_AF_XDP_FORCE_COPY_ARG,
> -	ETH_AF_XDP_USE_CNI_ARG,
> +	ETH_AF_XDP_USE_DP_UDS_PATH_ARG,
>  	NULL
>  };
> 
> @@ -1351,7 +1350,7 @@ configure_preferred_busy_poll(struct pkt_rx_queue *rxq)
>  }
> 
>  static int
> -init_uds_sock(struct sockaddr_un *server)
> +init_uds_sock(struct sockaddr_un *server, const char *uds_path)
>  {
>  	int sock;
> 
> @@ -1362,7 +1361,7 @@ init_uds_sock(struct sockaddr_un *server)
>  	}
> 
>  	server->sun_family = AF_UNIX;
> -	strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));
> +	strlcpy(server->sun_path, uds_path, sizeof(server->sun_path));
> 
>  	if (connect(sock, (struct sockaddr *)server, sizeof(struct sockaddr_un)) < 0) {
>  		close(sock);
> @@ -1382,7 +1381,7 @@ struct msg_internal {
>  };
> 
>  static int
> -send_msg(int sock, char *request, int *fd)
> +send_msg(int sock, char *request, int *fd, const char *uds_path)
>  {
>  	int snd;
>  	struct iovec iov;
> @@ -1393,7 +1392,7 @@ send_msg(int sock, char *request, int *fd)
> 
>  	memset(&dst, 0, sizeof(dst));
>  	dst.sun_family = AF_UNIX;
> -	strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));
> +	strlcpy(dst.sun_path, uds_path, sizeof(dst.sun_path));
> 
>  	/* Initialize message header structure */
>  	memset(&msgh, 0, sizeof(msgh));
> @@ -1470,8 +1469,8 @@ read_msg(int sock, char *response, struct sockaddr_un *s, int *fd)
>  }
> 
>  static int
> -make_request_cni(int sock, struct sockaddr_un *server, char *request,
> -		 int *req_fd, char *response, int *out_fd)
> +make_request_dp(int sock, struct sockaddr_un *server, char *request,
> +		 int *req_fd, char *response, int *out_fd, const char *uds_path)
>  {
>  	int rval;
> 
> @@ -1483,7 +1482,7 @@ make_request_cni(int sock, struct sockaddr_un *server, char *request,
>  	if (req_fd == NULL)
>  		rval = write(sock, request, strlen(request));
>  	else
> -		rval = send_msg(sock, request, req_fd);
> +		rval = send_msg(sock, request, req_fd, uds_path);
> 
>  	if (rval < 0) {
>  		AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno));
> @@ -1507,7 +1506,7 @@ check_response(char *response, char *exp_resp, long size)
>  }
> 
>  static int
> -get_cni_fd(char *if_name)
> +get_xskmap_fd(char *if_name, const char *uds_path)
>  {
>  	char request[UDS_MAX_CMD_LEN], response[UDS_MAX_CMD_RESP];
>  	char hostname[MAX_LONG_OPT_SZ], exp_resp[UDS_MAX_CMD_RESP];
> @@ -1520,14 +1519,14 @@ get_cni_fd(char *if_name)
>  		return -1;
> 
>  	memset(&server, 0, sizeof(server));
> -	sock = init_uds_sock(&server);
> +	sock = init_uds_sock(&server, uds_path);
>  	if (sock < 0)
>  		return -1;
> 
> -	/* Initiates handshake to CNI send: /connect,hostname */
> +	/* Initiates handshake to AF_XDP Device Plugin send: /connect,hostname */
>  	snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG, hostname);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
> +	if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
>  		goto err_close;
>  	}
> @@ -1541,7 +1540,7 @@ get_cni_fd(char *if_name)
>  	/* Request for "/version" */
>  	strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
> +	if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
>  		goto err_close;
>  	}
> @@ -1549,7 +1548,7 @@ get_cni_fd(char *if_name)
>  	/* Request for file descriptor for netdev name*/
>  	snprintf(request, sizeof(request), "%s,%s", UDS_XSK_MAP_FD_MSG, if_name);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
> +	if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
>  		goto err_close;
>  	}
> @@ -1571,7 +1570,7 @@ get_cni_fd(char *if_name)
>  	/* Initiate close connection */
>  	strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
> +	if (make_request_dp(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
>  		goto err_close;
>  	}
> @@ -1640,7 +1639,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
>  #endif
> 
>  	/* Disable libbpf from loading XDP program */
> -	if (internals->use_cni)
> +	if (strnlen(internals->uds_path, PATH_MAX))
>  		cfg.libbpf_flags |= XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
> 
>  	if (strnlen(internals->prog_path, PATH_MAX)) {
> @@ -1694,18 +1693,17 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
>  		}
>  	}
> 
> -	if (internals->use_cni) {
> -		int err, fd, map_fd;
> +	if (strnlen(internals->uds_path, PATH_MAX)) {
> +		int err, map_fd;
> 
> -		/* get socket fd from CNI plugin */
> -		map_fd = get_cni_fd(internals->if_name);
> +		/* get socket fd from AF_XDP Device plugin */
> +		map_fd = get_xskmap_fd(internals->if_name, internals->uds_path);
>  		if (map_fd < 0) {
> -			AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n");
> +			AF_XDP_LOG(ERR, "Failed to receive AF_XDP Device plugin fd\n");
>  			goto out_xsk;
>  		}
> -		/* get socket fd */
> -		fd = xsk_socket__fd(rxq->xsk);
> -		err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0);
> +
> +		err = xsk_socket__update_xskmap(rxq->xsk, map_fd);
>  		if (err) {
>  			AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in map.\n");
>  			goto out_xsk;
> @@ -1881,13 +1879,13 @@ static const struct eth_dev_ops ops = {
>  	.get_monitor_addr = eth_get_monitor_addr,
>  };
> 
> -/* CNI option works in unprivileged container environment
> - * and ethernet device functionality will be reduced. So
> - * additional customiszed eth_dev_ops struct is needed
> - * for cni. Promiscuous enable and disable functionality
> - * is removed.
> - **/
> -static const struct eth_dev_ops ops_cni = {
> +/* AF_XDP Device Plugin option works in unprivileged
> + * container environment and ethernet device functionality
> + * will be reduced. So additional customized eth_dev_ops
> + * struct is needed for the AF_XDP Device Plugin. Promiscuous
> + * enable and disable functionality is removed.
> + */
> +static const struct eth_dev_ops ops_afxdp_dp = {
>  	.dev_start = eth_dev_start,
>  	.dev_stop = eth_dev_stop,
>  	.dev_close = eth_dev_close,
> @@ -1957,7 +1955,7 @@ parse_name_arg(const char *key __rte_unused,
> 
>  /** parse xdp prog argument */
>  static int
> -parse_prog_arg(const char *key __rte_unused,
> +parse_path_arg(const char *key __rte_unused,
>  	       const char *value, void *extra_args)
>  {
>  	char *path = extra_args;
> @@ -2023,7 +2021,7 @@ xdp_get_channels_info(const char *if_name, int *max_queues,
>  static int
>  parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
>  		 int *queue_cnt, int *shared_umem, char *prog_path,
> -		 int *busy_budget, int *force_copy, int *use_cni)
> +		 int *busy_budget, int *force_copy, char *uds_path)
>  {
>  	int ret;
> 
> @@ -2050,7 +2048,7 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
>  		goto free_kvlist;
> 
>  	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_PROG_ARG,
> -				 &parse_prog_arg, prog_path);
> +				 &parse_path_arg, prog_path);
>  	if (ret < 0)
>  		goto free_kvlist;
> 
> @@ -2064,8 +2062,8 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
>  	if (ret < 0)
>  		goto free_kvlist;
> 
> -	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_CNI_ARG,
> -				 &parse_integer_arg, use_cni);
> +	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_DP_UDS_PATH_ARG,
> +				 &parse_path_arg, uds_path);
>  	if (ret < 0)
>  		goto free_kvlist;
> 
> @@ -2108,7 +2106,7 @@ static struct rte_eth_dev *
>  init_internals(struct rte_vdev_device *dev, const char *if_name,
>  	       int start_queue_idx, int queue_cnt, int shared_umem,
>  	       const char *prog_path, int busy_budget, int force_copy,
> -	       int use_cni)
> +		   const char *uds_path)
>  {
>  	const char *name = rte_vdev_device_name(dev);
>  	const unsigned int numa_node = dev->device.numa_node;
> @@ -2137,7 +2135,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
>  #endif
>  	internals->shared_umem = shared_umem;
>  	internals->force_copy = force_copy;
> -	internals->use_cni = use_cni;
> +	strlcpy(internals->uds_path, uds_path, PATH_MAX);
> 
>  	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
>  				  &internals->combined_queue_cnt)) {
> @@ -2196,10 +2194,10 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
>  	eth_dev->data->dev_link = pmd_link;
>  	eth_dev->data->mac_addrs = &internals->eth_addr;
>  	eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;
> -	if (!internals->use_cni)
> +	if (!strnlen(internals->uds_path, PATH_MAX))
>  		eth_dev->dev_ops = &ops;
>  	else
> -		eth_dev->dev_ops = &ops_cni;
> +		eth_dev->dev_ops = &ops_afxdp_dp;
> 
>  	eth_dev->rx_pkt_burst = eth_af_xdp_rx;
>  	eth_dev->tx_pkt_burst = eth_af_xdp_tx;
> @@ -2327,7 +2325,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
>  	char prog_path[PATH_MAX] = {'\0'};
>  	int busy_budget = -1, ret;
>  	int force_copy = 0;
> -	int use_cni = 0;
> +	char uds_path[PATH_MAX] = {'\0'};
>  	struct rte_eth_dev *eth_dev = NULL;
>  	const char *name = rte_vdev_device_name(dev);
> 
> @@ -2370,20 +2368,20 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
> 
>  	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
>  			     &xsk_queue_cnt, &shared_umem, prog_path,
> -			     &busy_budget, &force_copy, &use_cni) < 0) {
> +				 &busy_budget, &force_copy, uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
>  		return -EINVAL;
>  	}
> 
> -	if (use_cni && busy_budget > 0) {
> +	if (strnlen(uds_path, PATH_MAX) && busy_budget > 0) {
>  		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n",
> -			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_BUDGET_ARG);
> +			ETH_AF_XDP_USE_DP_UDS_PATH_ARG, ETH_AF_XDP_BUDGET_ARG);
>  		return -EINVAL;
>  	}
> 
> -	if (use_cni && strnlen(prog_path, PATH_MAX)) {
> +	if (strnlen(uds_path, PATH_MAX) && strnlen(prog_path, PATH_MAX)) {
>  		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n",
> -			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_PROG_ARG);
> +			ETH_AF_XDP_USE_DP_UDS_PATH_ARG, ETH_AF_XDP_PROG_ARG);
>  			return -EINVAL;
>  	}
> 
> @@ -2410,7 +2408,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
> 
>  	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
>  				 xsk_queue_cnt, shared_umem, prog_path,
> -				 busy_budget, force_copy, use_cni);
> +				 busy_budget, force_copy, uds_path);
>  	if (eth_dev == NULL) {
>  		AF_XDP_LOG(ERR, "Failed to init internals\n");
>  		return -1;
> @@ -2471,4 +2469,4 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
>  			      "xdp_prog=<string> "
>  			      "busy_budget=<int> "
>  			      "force_copy=<int> "
> -			      "use_cni=<int> ");
> +			      "uds_path=<string> ");
> --
> 2.41.0



* Re: [v6] net/af_xdp: enable uds_path instead of use_cni
  2023-12-15 15:39 [v6] net/af_xdp: enable uds_path instead of use_cni Maryam Tahhan
  2023-12-21 10:44 ` Koikkara Reeny, Shibin
@ 2023-12-21 11:36 ` David Marchand
  2023-12-21 11:46   ` Maryam Tahhan
  1 sibling, 1 reply; 5+ messages in thread
From: David Marchand @ 2023-12-21 11:36 UTC (permalink / raw)
  To: Maryam Tahhan
  Cc: ferruh.yigit, stephen, lihuisong, fengchengwen, liuyonglong,
	shibin.koikkara.reeny, ciara.loftus, dev

Hello,

On Fri, Dec 15, 2023 at 4:40 PM Maryam Tahhan <mtahhan@redhat.com> wrote:
>
> With the original 'use_cni' implementation, (using a
> hardcoded socket rather than a configurable one),
> if a DPDK pod is requesting multiple net devices
> and these devices are from different pools, then
> the container attempts to mount all the netdev UDSes
> in the pod as /tmp/afxdp.sock. Which means that at best
> only 1 netdev will handshake correctly with the AF_XDP
> DP. This patch addresses this by making the socket
> parameter configurable using a new vdev param called
> 'uds_path' and removing the previous 'use_cni' param.
> This patch also fixes incorrect references to the
> AF_XDP DP as CNI and updates the documentation with a
> working example. This change has been tested with the

If there are fixes mixed in, please separate them in dedicated patches
so we can backport them to LTS releases.


> AF_XDP DP PR 81[1], with both single and multiple interfaces.
>
> [1] https://github.com/intel/afxdp-plugins-for-kubernetes/pull/81

On the patch title, mentioning internals like uds_path does not help a
user: it is hard to tell what this change is about at first glance.
Finding a good title is hard, but maybe something like: "net/af_xdp:
enhance multiple interfaces support"?


>
> v6:
> * Add link to PR 81 in commit message
> * Add release notes changes to this patchset
>
> v5:
> * Fix alignment for ETH_AF_XDP_USE_DP_UDS_PATH_ARG
> * Remove use_cni references in af_xdp.rst
>
> v4:
> * Rename af_xdp_cni.rst to af_xdp_dp.rst
> * Removed all incorrect references to CNI throughout af_xdp
>   PMD file.
> * Fixed Typos in af_xdp_dp.rst
>
> v3:
> * Remove `use_cni` vdev argument as it's no longer needed.
> * Update incorrect CNI references for the AF_XDP DP in the
>   documentation.
> * Update the documentation to run a simple example with the
>   AF_XDP DP plugin in K8s.
>
> v2:
> * Rename sock_path to uds_path.
> * Update documentation to reflect when CAP_BPF is needed.
> * Fix testpmd arguments in the provided example for Pods.
> * Use AF_XDP API to update the xskmap entry.

This patch history block above has no place in the commitlog.
It should be in the annotations part of the patch.
https://doc.dpdk.org/guides/contributing/patches.html#creating-patches
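
For illustration only, a rough sketch of the layout being asked for,
reusing the v6 notes from this patch. The lines in angle brackets are
placeholders, not real content:

```
net/af_xdp: <title>

<commit message body>

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Reviewed-by: Ciara Loftus <ciara.loftus@intel.com>
---
v6:
* Add link to PR 81 in commit message
* Add release notes changes to this patchset

<older version notes>

 doc/guides/howto/af_xdp_cni.rst        | 253 ----------------------
 <rest of diffstat and diff>
```

Text placed between the "---" marker and the diff is not kept in the
commit log when the patch is applied.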


>
> Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
> Reviewed-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
>  doc/guides/howto/af_xdp_cni.rst        | 253 ----------------------
>  doc/guides/howto/af_xdp_dp.rst         | 278 +++++++++++++++++++++++++

Renaming the file seems fine to me.
However, don't add extra unrelated whitespace/line-wrap changes that
make it hard for git to see it is a rename.

For example:

$ git show -M05 -- doc/
...
-The standard :doc:`../nics/af_xdp` initialization process involves loading an eBPF program
-onto the kernel netdev to be used by the PMD.
-This operation requires root or escalated Linux privileges
...
+The standard :doc:`../nics/af_xdp` initialization process involves
+loading an eBPF program onto the Kernel netdev to be used by the PMD.
+This operation requires root or escalated Linux privileges and prevents
...


>  doc/guides/howto/index.rst             |   2 +-
>  doc/guides/nics/af_xdp.rst             |  27 ++-
>  doc/guides/rel_notes/release_24_03.rst |   7 +
>  drivers/net/af_xdp/rte_eth_af_xdp.c    | 100 +++++----
>  6 files changed, 352 insertions(+), 315 deletions(-)
>  delete mode 100644 doc/guides/howto/af_xdp_cni.rst
>  create mode 100644 doc/guides/howto/af_xdp_dp.rst
>
> diff --git a/doc/guides/howto/af_xdp_cni.rst b/doc/guides/howto/af_xdp_cni.rst
> deleted file mode 100644
> index a1a6d5b99c..0000000000
> --- a/doc/guides/howto/af_xdp_cni.rst
> +++ /dev/null
> @@ -1,253 +0,0 @@
> -.. SPDX-License-Identifier: BSD-3-Clause
> -   Copyright(c) 2023 Intel Corporation.
> -
> -Using a CNI with the AF_XDP driver
> -==================================
> -
> -Introduction
> -------------
> -
> -CNI, the Container Network Interface, is a technology for configuring
> -container network interfaces
> -and which can be used to setup Kubernetes networking.
> -AF_XDP is a Linux socket Address Family that enables an XDP program
> -to redirect packets to a memory buffer in userspace.
> -
> -This document explains how to enable the `AF_XDP Plugin for Kubernetes`_ within
> -a DPDK application using the :doc:`../nics/af_xdp` to connect and use these technologies.
> -
> -.. _AF_XDP Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
> -
> -
> -Background
> -----------
> -
> -The standard :doc:`../nics/af_xdp` initialization process involves loading an eBPF program
> -onto the kernel netdev to be used by the PMD.
> -This operation requires root or escalated Linux privileges
> -and thus prevents the PMD from working in an unprivileged container.
> -The AF_XDP CNI plugin handles this situation
> -by providing a device plugin that performs the program loading.
> -
> -At a technical level the CNI opens a Unix Domain Socket and listens for a client
> -to make requests over that socket.
> -A DPDK application acting as a client connects and initiates a configuration "handshake".
> -The client then receives a file descriptor which points to the XSKMAP
> -associated with the loaded eBPF program.
> -The XSKMAP is a BPF map of AF_XDP sockets (XSK).
> -The client can then proceed with creating an AF_XDP socket
> -and inserting that socket into the XSKMAP pointed to by the descriptor.
> -
> -The EAL vdev argument ``use_cni`` is used to indicate that the user wishes
> -to run the PMD in unprivileged mode and to receive the XSKMAP file descriptor
> -from the CNI.
> -When this flag is set,
> -the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag
> -should be used when creating the socket
> -to instruct libbpf not to load the default libbpf program on the netdev.
> -Instead the loading is handled by the CNI.
> -
> -.. note::
> -
> -   The Unix Domain Socket file path appear in the end user is "/tmp/afxdp.sock".
> -
> -
> -Prerequisites
> --------------
> -
> -Docker and container prerequisites:
> -
> -* Set up the device plugin
> -  as described in the instructions for `AF_XDP Plugin for Kubernetes`_.
> -
> -* The Docker image should contain the libbpf and libxdp libraries,
> -  which are dependencies for AF_XDP,
> -  and should include support for the ``ethtool`` command.
> -
> -* The Pod should have enabled the capabilities ``CAP_NET_RAW`` and ``CAP_BPF``
> -  for AF_XDP along with support for hugepages.
> -
> -* Increase locked memory limit so containers have enough memory for packet buffers.
> -  For example:
> -
> -  .. code-block:: console
> -
> -     cat << EOF | sudo tee /etc/systemd/system/containerd.service.d/limits.conf
> -     [Service]
> -     LimitMEMLOCK=infinity
> -     EOF
> -
> -* dpdk-testpmd application should have AF_XDP feature enabled.
> -
> -  For further information see the docs for the: :doc:`../../nics/af_xdp`.
> -
> -
> -Example
> --------
> -
> -Howto run dpdk-testpmd with CNI plugin:
> -
> -* Clone the CNI plugin
> -
> -  .. code-block:: console
> -
> -     # git clone https://github.com/intel/afxdp-plugins-for-kubernetes.git
> -
> -* Build the CNI plugin
> -
> -  .. code-block:: console
> -
> -     # cd afxdp-plugins-for-kubernetes/
> -     # make build
> -
> -  .. note::
> -
> -     CNI plugin has a dependence on the config.json.
> -
> -  Sample Config.json
> -
> -  .. code-block:: json
> -
> -     {
> -        "logLevel":"debug",
> -        "logFile":"afxdp-dp-e2e.log",
> -        "pools":[
> -           {
> -              "name":"e2e",
> -              "mode":"primary",
> -              "timeout":30,
> -              "ethtoolCmds" : ["-L -device- combined 1"],
> -              "devices":[
> -                 {
> -                    "name":"ens785f0"
> -                 }
> -              ]
> -           }
> -        ]
> -     }
> -
> -  For further reference please use the `config.json`_
> -
> -  .. _config.json: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/config.json
> -
> -* Create the Network Attachment definition
> -
> -  .. code-block:: console
> -
> -     # kubectl create -f nad.yaml
> -
> -  Sample nad.yml
> -
> -  .. code-block:: yaml
> -
> -      apiVersion: "k8s.cni.cncf.io/v1"
> -      kind: NetworkAttachmentDefinition
> -      metadata:
> -        name: afxdp-e2e-test
> -        annotations:
> -          k8s.v1.cni.cncf.io/resourceName: afxdp/e2e
> -      spec:
> -        config: '{
> -            "cniVersion": "0.3.0",
> -            "type": "afxdp",
> -            "mode": "cdq",
> -            "logFile": "afxdp-cni-e2e.log",
> -            "logLevel": "debug",
> -            "ipam": {
> -              "type": "host-local",
> -              "subnet": "192.168.1.0/24",
> -              "rangeStart": "192.168.1.200",
> -              "rangeEnd": "192.168.1.216",
> -              "routes": [
> -                { "dst": "0.0.0.0/0" }
> -              ],
> -              "gateway": "192.168.1.1"
> -            }
> -          }'
> -
> -  For further reference please use the `nad.yaml`_
> -
> -  .. _nad.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/nad.yaml
> -
> -* Build the Docker image
> -
> -  .. code-block:: console
> -
> -     # docker build -t afxdp-e2e-test -f Dockerfile .
> -
> -  Sample Dockerfile:
> -
> -  .. code-block:: console
> -
> -     FROM ubuntu:20.04
> -     RUN apt-get update -y
> -     RUN apt install build-essential libelf-dev -y
> -     RUN apt-get install iproute2  acl -y
> -     RUN apt install python3-pyelftools ethtool -y
> -     RUN apt install libnuma-dev libjansson-dev libpcap-dev net-tools -y
> -     RUN apt-get install clang llvm -y
> -     COPY ./libbpf<version>.tar.gz /tmp
> -     RUN cd /tmp && tar -xvmf libbpf<version>.tar.gz && cd libbpf/src && make install
> -     COPY ./libxdp<version>.tar.gz /tmp
> -     RUN cd /tmp && tar -xvmf libxdp<version>.tar.gz && cd libxdp && make install
> -
> -  .. note::
> -
> -     All the files that need to COPY-ed should be in the same directory as the Dockerfile
> -
> -* Run the Pod
> -
> -  .. code-block:: console
> -
> -     # kubectl create -f pod.yaml
> -
> -  Sample pod.yaml:
> -
> -  .. code-block:: yaml
> -
> -     apiVersion: v1
> -     kind: Pod
> -     metadata:
> -       name: afxdp-e2e-test
> -       annotations:
> -         k8s.v1.cni.cncf.io/networks: afxdp-e2e-test
> -     spec:
> -       containers:
> -       - name: afxdp
> -         image: afxdp-e2e-test:latest
> -         imagePullPolicy: Never
> -         env:
> -         - name: LD_LIBRARY_PATH
> -           value: /usr/lib64/:/usr/local/lib/
> -         command: ["tail", "-f", "/dev/null"]
> -         securityContext:
> -          capabilities:
> -             add:
> -               - CAP_NET_RAW
> -               - CAP_BPF
> -         resources:
> -           requests:
> -             hugepages-2Mi: 2Gi
> -             memory: 2Gi
> -             afxdp/e2e: '1'
> -           limits:
> -             hugepages-2Mi: 2Gi
> -             memory: 2Gi
> -             afxdp/e2e: '1'
> -
> -  For further reference please use the `pod.yaml`_
> -
> -  .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml
> -
> -* Run DPDK with a command like the following:
> -
> -  .. code-block:: console
> -
> -     kubectl exec -i <Pod name> --container <containers name> -- \
> -           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
> -           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
> -           -- --no-mlockall --in-memory
> -
> -For further reference please use the `e2e`_ test case in `AF_XDP Plugin for Kubernetes`_
> -
> -  .. _e2e: https://github.com/intel/afxdp-plugins-for-kubernetes/tree/v0.0.2/test/e2e
> diff --git a/doc/guides/howto/af_xdp_dp.rst b/doc/guides/howto/af_xdp_dp.rst
> new file mode 100644
> index 0000000000..7717d59224
> --- /dev/null
> +++ b/doc/guides/howto/af_xdp_dp.rst
> @@ -0,0 +1,278 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright(c) 2023 Intel Corporation.
> +
> +Using the AF_XDP Device Plugin with the AF_XDP driver
> +======================================================

Too long by one =.


> +
> +Introduction
> +------------
> +
> +The `AF_XDP Device Plugin for Kubernetes`_ is a project that provisions
> +and advertises interfaces (that can be used with AF_XDP) to Kubernetes.
> +The project also includes a `CNI`_.
> +
> +AF_XDP is a Linux socket Address Family that enables an XDP program
> +to redirect packets to a memory buffer in userspace.
> +
> +This document explains how to use the `AF_XDP Device Plugin for Kubernetes`_ with
> +a DPDK :doc:`../nics/af_xdp` based application running in a Pod.
> +
> +.. _AF_XDP Device Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
> +.. _CNI: https://github.com/containernetworking/cni
> +
> +Background
> +----------
> +
> +The standard :doc:`../nics/af_xdp` initialization process involves
> +loading an eBPF program onto the Kernel netdev to be used by the PMD.
> +This operation requires root or escalated Linux privileges and prevents
> +the PMD from working in an unprivileged container. The AF_XDP Device Plugin (DP)
> +addresses this situation by providing an entity that manages eBPF program
> +lifecycle for Pod interfaces that wish to use AF_XDP, this in turn allows
> +the pod to be used without privilege escalation.
> +
> +In order for the pod to run without privilege escalation, the AF_XDP DP
> +creates a Unix Domain Socket (UDS) and listens for Pods to make requests
> +for XSKMAP(s) File Descriptors (FDs) for interfaces in their network namespace.
> +In other words, the DPDK application running in the Pod connects to this UDS and
> +initiates a "handshake" to retrieve the XSKMAP(s) FD(s). Upon a successful "handshake",
> +the DPDK application receives the FD(s) for the XSKMAP(s) associated with the relevant
> +netdevs. The DPDK application can then create the AF_XDP socket(s), and attach
> +the socket(s) to the netdev queue(s) by inserting the socket(s) into the XSKMAP(s).
> +
> +The EAL vdev argument ``uds_path`` is used to indicate that the user
> +wishes to run the AF_XDP PMD in unprivileged mode and to receive the XSKMAP
> +FD from the AF_XDP DP. When this param is used, the
> +``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag is used when creating the
> +AF_XDP socket to instruct libbpf/libxdp not to load the default eBPF redirect
> +program for AF_XDP on the netdev. Instead the lifecycle management of the eBPF
> +program is handled by the AF_XDP DP.
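
As an aside for readers of this thread, the mode selection described in
the paragraph above can be sketched roughly as below. This is only an
illustration under stated assumptions, not the patch's code: the struct,
the helper names and the flag value are hypothetical stand-ins, and only
the strnlen(uds_path, PATH_MAX) test mirrors what the diff does.

```c
#define _POSIX_C_SOURCE 200809L /* for strnlen() */
#include <limits.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096 /* fallback; normally provided by <limits.h> */
#endif

/* Hypothetical flag value; the real flag is defined by libbpf/libxdp. */
#define SKETCH_INHIBIT_PROG_LOAD (1u << 0)

/* Hypothetical stand-in for the PMD's private data. */
struct sketch_internals {
	char uds_path[PATH_MAX];
};

/* Non-zero when a uds_path was supplied, i.e. Device Plugin mode. */
static int
sketch_uses_device_plugin(const struct sketch_internals *internals)
{
	return strnlen(internals->uds_path, PATH_MAX) != 0;
}

/* Flags for socket creation: inhibit program load only in DP mode. */
static unsigned int
sketch_libbpf_flags(const struct sketch_internals *internals)
{
	unsigned int flags = 0;

	if (sketch_uses_device_plugin(internals))
		flags |= SKETCH_INHIBIT_PROG_LOAD;
	return flags;
}
```

An empty uds_path leaves the flags at zero (the standard path, where the
program is loaded by the library), while any non-empty path selects the
unprivileged Device Plugin path.
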
> +
> +.. note::
> +
> +   The UDS file path inside the pod appears at "/tmp/afxdp_dp/<netdev>/afxdp.sock".
> +
> +Prerequisites
> +-------------
> +
> +Device Plugin and DPDK container prerequisites:
> +
> +* Create a DPDK container image.
> +
> +* Set up the device plugin and prepare the Pod Spec as described in
> +  the instructions for `AF_XDP Device Plugin for Kubernetes`_.
> +
> +* Increase locked memory limit so containers have enough memory for packet buffers.
> +  For example:
> +
> +  .. code-block:: console
> +
> +     cat << EOF | sudo tee /etc/systemd/system/containerd.service.d/limits.conf
> +     [Service]
> +     LimitMEMLOCK=infinity
> +     EOF
> +
> +* dpdk-testpmd application should have AF_XDP feature enabled.
> +
> +  For further information see the docs for the: :doc:`../../nics/af_xdp`.
> +
> +
> +Example
> +-------
> +
> +How to run dpdk-testpmd with the AF_XDP Device plugin:
> +
> +* Clone the AF_XDP Device plugin
> +
> +  .. code-block:: console
> +
> +     # git clone https://github.com/intel/afxdp-plugins-for-kubernetes.git
> +
> +* Build the AF_XDP Device plugin and the CNI
> +
> +  .. code-block:: console
> +
> +     # cd afxdp-plugins-for-kubernetes/
> +     # make image
> +
> +* Make sure to modify the image used by the `daemonset.yml`_ file in the deployments directory with
> +  the following configuration:
> +
> +   .. _daemonset.yml : https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/deployments/daemonset.yml
> +
> +  .. code-block:: yaml
> +
> +    image: afxdp-device-plugin:latest
> +
> +  .. note::
> +
> +    This will select the AF_XDP DP image that was built locally. Detailed configuration
> +    options can be found in the AF_XDP Device Plugin `readme`_ .
> +
> +  .. _readme: https://github.com/intel/afxdp-plugins-for-kubernetes#readme
> +
> +* Deploy the AF_XDP Device Plugin and CNI
> +
> +  .. code-block:: console
> +
> +    # kubectl create -f deployments/daemonset.yml
> +
> +* Create a Network Attachment Definition (NAD)
> +
> +  .. code-block:: console
> +
> +    # kubectl create -f nad.yaml
> +
> +  Sample nad.yml
> +
> +  .. code-block:: yaml
> +
> +    apiVersion: "k8s.cni.cncf.io/v1"
> +    kind: NetworkAttachmentDefinition
> +    metadata:
> +      name: afxdp-network
> +      annotations:
> +        k8s.v1.cni.cncf.io/resourceName: afxdp/myPool
> +    spec:
> +      config: '{
> +          "cniVersion": "0.3.0",
> +          "type": "afxdp",
> +          "mode": "primary",
> +          "logFile": "afxdp-cni.log",
> +          "logLevel": "debug",
> +          "ethtoolCmds" : ["-N -device- rx-flow-hash udp4 fn",
> +                           "-N -device- flow-type udp4 dst-port 2152 action 22"
> +                        ],
> +          "ipam": {
> +            "type": "host-local",
> +            "subnet": "192.168.1.0/24",
> +            "rangeStart": "192.168.1.200",
> +            "rangeEnd": "192.168.1.220",
> +            "routes": [
> +              { "dst": "0.0.0.0/0" }
> +            ],
> +            "gateway": "192.168.1.1"
> +          }
> +        }'
> +
> +  For further reference please use the example provided by the AF_XDP DP `nad.yaml`_
> +
> +  .. _nad.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/examples/network-attachment-definition.yaml
> +
> +* Build a DPDK container image (using Docker)
> +
> +  .. code-block:: console
> +
> +    # docker build -t dpdk -f Dockerfile .
> +
> +  Sample Dockerfile (should be placed in top level DPDK directory):
> +
> +  .. code-block:: console
> +
> +    FROM fedora:38
> +
> +    # Setup container to build DPDK applications
> +    RUN dnf -y upgrade && dnf -y install \
> +        libbsd-devel \
> +        numactl-libs \
> +        libbpf-devel \
> +        libbpf \
> +        meson \
> +        ninja-build \
> +        libxdp-devel \
> +        libxdp \
> +        numactl-devel \
> +        python3-pyelftools \
> +        python38 \
> +        iproute
> +    RUN dnf groupinstall -y 'Development Tools'
> +
> +    # Create DPDK dir and copy over sources
> +    WORKDIR /dpdk
> +    COPY app app
> +    COPY builddir  builddir
> +    COPY buildtools buildtools
> +    COPY config config
> +    COPY devtools devtools
> +    COPY drivers drivers
> +    COPY dts dts
> +    COPY examples examples
> +    COPY kernel kernel
> +    COPY lib lib
> +    COPY license license
> +    COPY MAINTAINERS MAINTAINERS
> +    COPY Makefile Makefile
> +    COPY meson.build meson.build
> +    COPY meson_options.txt meson_options.txt
> +    COPY usertools usertools
> +    COPY VERSION VERSION
> +    COPY ABI_VERSION ABI_VERSION
> +    COPY doc doc
> +
> +    # Build DPDK
> +    RUN meson setup build
> +    RUN ninja -C build
> +
> +  .. note::
> +
> +    Ensure the Dockerfile is placed in the top level DPDK directory.
> +
> +* Run the Pod
> +
> +  .. code-block:: console
> +
> +     # kubectl create -f pod.yaml
> +
> +  Sample pod.yaml:
> +
> +  .. code-block:: yaml
> +
> +    apiVersion: v1
> +    kind: Pod
> +    metadata:
> +     name: dpdk
> +     annotations:
> +       k8s.v1.cni.cncf.io/networks: afxdp-network
> +    spec:
> +      containers:
> +      - name: testpmd
> +        image: dpdk:latest
> +        command: ["tail", "-f", "/dev/null"]
> +        securityContext:
> +          capabilities:
> +            add:
> +              - NET_RAW
> +              - IPC_LOCK
> +        resources:
> +          requests:
> +            afxdp/myPool: '1'
> +          limits:
> +            hugepages-1Gi: 2Gi
> +            cpu: 2
> +            memory: 256Mi
> +            afxdp/myPool: '1'
> +        volumeMounts:
> +        - name: hugepages
> +          mountPath: /dev/hugepages
> +      volumes:
> +      - name: hugepages
> +        emptyDir:
> +          medium: HugePages
> +
> +  For further reference please use the `pod.yaml`_
> +
> +  .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/examples/pod-spec.yaml
> +
> +.. note::
> +
> +   For Kernel versions older than 5.19 `CAP_BPF` is also required in
> +   the container capabilities stanza.
> +
> +* Run DPDK with a command like the following:
> +
> +  .. code-block:: console
> +
> +     kubectl exec -i dpdk --container testpmd -- \
> +           ./build/app/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \
> +           --vdev net_af_xdp,iface=<interface name>,start_queue=22,queue_count=1,uds_path=/tmp/afxdp_dp/<interface-name>/afxdp.sock \
> +           -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
> diff --git a/doc/guides/howto/index.rst b/doc/guides/howto/index.rst
> index 71a3381c36..a7692e8a97 100644
> --- a/doc/guides/howto/index.rst
> +++ b/doc/guides/howto/index.rst
> @@ -8,7 +8,7 @@ HowTo Guides
>      :maxdepth: 2
>      :numbered:
>
> -    af_xdp_cni
> +    af_xdp_dp
>      lm_bond_virtio_sriov
>      lm_virtio_vhost_user
>      flow_bifurcation
> diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
> index 1932525d4d..0edc84328d 100644
> --- a/doc/guides/nics/af_xdp.rst
> +++ b/doc/guides/nics/af_xdp.rst
> @@ -151,25 +151,32 @@ instead of zero copy mode (if available).
>
>      --vdev net_af_xdp,iface=ens786f1,force_copy=1
>
> -use_cni
> -~~~~~~~
> +uds_path
> +~~~~~~~~~

Too long by one ~.


>
> -The EAL vdev argument ``use_cni`` is used to indicate that the user wishes to
> -enable the `AF_XDP Plugin for Kubernetes`_ within a DPDK application.
> +The EAL vdev argument ``uds_path`` is used to indicate that the user wishes to
> +use the `AF_XDP Plugin for Kubernetes`_ with a DPDK application running in a Pod.
>
>  .. _AF_XDP Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
>
>  .. code-block:: console
>
> -   --vdev=net_af_xdp0,use_cni=1
> +   --vdev=net_af_xdp0,uds_path==/tmp/afxdp_dp/<interface-name>/afxdp.sock

I suppose we only need one =.


>
>  .. note::
>
> -   When using `use_cni`_, both parameters `xdp_prog`_ and `busy_budget`_ are disabled
> -   as both of these will be handled by the AF_XDP plugin.
> -   Since the DPDK application is running in limited privileges
> -   so enabling and disabling of the promiscuous mode through the DPDK application
> -   is also not supported.
> +   The UDS ``afxdp.sock`` is available in the DPDK container through a
> +   volume mounted by the `AF_XDP Plugin for Kubernetes`_ at the path
> +   specified in the example above.
> +
> +.. note::
> +
> +   When using `uds_path`_, both parameters `xdp_prog`_ and `busy_budget`_ are disabled
> +   as both of these will be handled by the AF_XDP Device plugin (if required).
> +   Since the pod/container is running with limited privileges enabling and disabling
> +   of promiscuous mode through the DPDK application is also not supported.
> +
> +For more details please see: :doc:`../howto/af_xdp_dp`
>
>  Limitations
>  -----------
> diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
> index 6f8ad27808..606cdf6316 100644
> --- a/doc/guides/rel_notes/release_24_03.rst
> +++ b/doc/guides/rel_notes/release_24_03.rst
> @@ -55,6 +55,13 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
>
> +* **Integration of AF_XDP PMD with AF_XDP Device Plugin**

Quoting the comments a few lines before in this doc:

   * **Add a title in the past tense with a full stop.**

> +
> +  The EAL vdev argument for the AF_XDP PMD ``uds_path`` was added
> +  to allow Kubernetes Pods that which to use AF_XDP with DPDK to run
> +  with limited privileges. This flag indicates that the AF_XDP PMD
> +  will be used in unprivileged mode and will receive the XSKMAP FD from
> +  the AF_XDP Device Plugin.

And double empty line before a new section in the RN.


>
>  Removed Items
>  -------------


Thanks.

-- 
David Marchand



* Re: [v6] net/af_xdp: enable uds_path instead of use_cni
  2023-12-21 11:36 ` David Marchand
@ 2023-12-21 11:46   ` Maryam Tahhan
  2023-12-21 11:57     ` David Marchand
  0 siblings, 1 reply; 5+ messages in thread
From: Maryam Tahhan @ 2023-12-21 11:46 UTC (permalink / raw)
  To: David Marchand
  Cc: ferruh.yigit, stephen, lihuisong, fengchengwen, liuyonglong,
	shibin.koikkara.reeny, ciara.loftus, dev

On 21/12/2023 11:36, David Marchand wrote:
> Hello,
>
> On Fri, Dec 15, 2023 at 4:40 PM Maryam Tahhan <mtahhan@redhat.com> wrote:
>> With the original 'use_cni' implementation, (using a
>> hardcoded socket rather than a configurable one),
>> if a DPDK pod is requesting multiple net devices
>> and these devices are from different pools, then
>> the container attempts to mount all the netdev UDSes
>> in the pod as /tmp/afxdp.sock. Which means that at best
>> only 1 netdev will handshake correctly with the AF_XDP
>> DP. This patch addresses this by making the socket
>> parameter configurable using a new vdev param called
>> 'uds_path' and removing the previous 'use_cni' param.
>> This patch also fixes incorrect references to the
>> AF_XDP DP as CNI and updates the documentation with a
>> working example. This change has been tested with the
> If there are fixes mixed in, please separate them in dedicated patches
> so we can backport them to LTS releases.

Thanks for the feedback, David. I don't quite understand the ask here: is
it to have a separate patch for the documentation fixes? Previous
feedback on the mailing list was to keep the documentation changes with
the fix.

There is one fix in this patchset (for broken socket behaviour). The
rest of the changes were fixing up all the documentation. So do I need
to create a patchset with multiple patches, one for the C code and
separate ones for the documentation?

* Re: [v6] net/af_xdp: enable uds_path instead of use_cni
  2023-12-21 11:46   ` Maryam Tahhan
@ 2023-12-21 11:57     ` David Marchand
  0 siblings, 0 replies; 5+ messages in thread
From: David Marchand @ 2023-12-21 11:57 UTC (permalink / raw)
  To: Maryam Tahhan
  Cc: ferruh.yigit, stephen, lihuisong, fengchengwen, liuyonglong,
	shibin.koikkara.reeny, ciara.loftus, dev

On Thu, Dec 21, 2023 at 12:46 PM Maryam Tahhan <mtahhan@redhat.com> wrote:
> > If there are fixes mixed in, please separate them in dedicated patches
> > so we can backport them to LTS releases.
>
> Thanks for the feedback, David. I don't quite understand the ask here: is it to have a separate patch for the documentation fixes? Previous feedback on the mailing list was to keep the documentation changes with the fix.
>
> There is one fix in this patchset (for broken socket behaviour). The rest of the changes were fixing up all the documentation. So do I need to create a patchset with multiple patches, one for the C code and separate ones for the documentation?

Ideally yes.

If the documentation fix relates to the code fix, it shall go as one patch.
Otherwise the doc fix can go as a separate patch.
And then after those fixes, the last patch of the series would be your
enhancement change.


Thanks.

-- 
David Marchand


Thread overview: 5+ messages
2023-12-15 15:39 [v6] net/af_xdp: enable uds_path instead of use_cni Maryam Tahhan
2023-12-21 10:44 ` Koikkara Reeny, Shibin
2023-12-21 11:36 ` David Marchand
2023-12-21 11:46   ` Maryam Tahhan
2023-12-21 11:57     ` David Marchand
