From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id EDEBF4366A;
	Mon,  4 Dec 2023 11:31:07 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id DC98940DDE;
	Mon,  4 Dec 2023 11:31:07 +0100 (CET)
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by mails.dpdk.org (Postfix) with ESMTP id CAC9440DD8
 for <dev@dpdk.org>; Mon,  4 Dec 2023 11:31:06 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1701685866;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding;
 bh=AmdkCgTPQaE22e7PvWQ580M7edLakH6qvLG0zz6BO14=;
 b=WmjkLgqsdbkvucrEBJCTfdkpZXQTN29riTTBFtYdIfrQy80RjhSpyZYXEpue6puZWFDBww
 TmcsNGSpUtg9pkBXfbbhY56cmYZhPHj4hXWXSyQHki9oR3cEvaTfunmZieHBEpBqQ0Vf0K
 GaKohl0EItezw267WExu04yxHKPTXuM=
Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com
 [209.85.222.199]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id
 us-mta-648-P57opxz5O46dmU_UdsnkGA-1; Mon, 04 Dec 2023 05:31:05 -0500
X-MC-Unique: P57opxz5O46dmU_UdsnkGA-1
Received: by mail-qk1-f199.google.com with SMTP id
 af79cd13be357-77d7b2e8623so643438185a.0
 for <dev@dpdk.org>; Mon, 04 Dec 2023 02:31:05 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1701685865; x=1702290665;
 h=content-transfer-encoding:mime-version:message-id:date:subject:cc
 :to:from:x-gm-message-state:from:to:cc:subject:date:message-id
 :reply-to;
 bh=AmdkCgTPQaE22e7PvWQ580M7edLakH6qvLG0zz6BO14=;
 b=Pa8ddc3lQKUwSxrLvmKz0ungN8rc9eW5nKXXgDwy26yIx5huvyzckOT/srMSx17Ofj
 KV3NDy5aTUYA/4fZBl4fQ6YfsbIIFaD1mwh1+MFbj+WD+EFxRNac+OBGxdZ9zdiSZDw/
 Eq/8r+Lx5lf+bqTQe2OU/Uak76UBzZY0Vk1s2SuV1fQMtu0ilo0Mv2SLFlAcHwsoU1Bp
 v5kimgkJVmP8JIV0Qzy4pttSr/CY7CYCm3SQRtezfdP5oN61oqIKWPjdbXu0P2ibhJcY
 DiFr/ExpmgcsVxuV30MsJVqHqratxHWjVqCvMd9utWvQbdewJ9Iqs9GSOCql7hMxCcCK
 faVQ==
X-Gm-Message-State: AOJu0YxMqh7VnUSvytnRFxdsHpcRkAD/yim31de20Dz5PmNeiFhsYMYo
 WrwpMNrges9o4p1kJsX2uvS3KFYqTXtN4UzOj7RVT1L51b+44xBhionauesTbHTCiM5pxr75AA7
 6kGE=
X-Received: by 2002:a05:620a:d55:b0:77e:fba3:759e with SMTP id
 o21-20020a05620a0d5500b0077efba3759emr3423694qkl.150.1701685864822; 
 Mon, 04 Dec 2023 02:31:04 -0800 (PST)
X-Google-Smtp-Source: AGHT+IHi6qpTgR9IFjzNU2QByafs+0xBU5r1DdsaLdAwyHe+ENbduLoG7amy6h+6RbxWKLQUPvnhjQ==
X-Received: by 2002:a05:620a:d55:b0:77e:fba3:759e with SMTP id
 o21-20020a05620a0d5500b0077efba3759emr3423678qkl.150.1701685864430; 
 Mon, 04 Dec 2023 02:31:04 -0800 (PST)
Received: from nfvsdn-06.redhat.com (nat-pool-232-132.redhat.com.
 [66.187.232.132]) by smtp.gmail.com with ESMTPSA id
 qh13-20020a05620a668d00b0077d85d22e89sm4131287qkn.63.2023.12.04.02.31.03
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Mon, 04 Dec 2023 02:31:04 -0800 (PST)
From: Maryam Tahhan <mtahhan@redhat.com>
To: ferruh.yigit@amd.com, stephen@networkplumber.org, lihuisong@huawei.com,
 fengchengwen@huawei.com, liuyonglong@huawei.com,
 shibin.koikkara.reeny@intel.com
Cc: dev@dpdk.org,
	Maryam Tahhan <mtahhan@redhat.com>
Subject: [v2] net/af_xdp: enable a sock path alongside use_cni
Date: Mon,  4 Dec 2023 05:31:01 -0500
Message-ID: <20231204103101.2124374-1-mtahhan@redhat.com>
X-Mailer: git-send-email 2.41.0
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="US-ASCII"; x-default=true
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

With the original 'use_cni' implementation, (using a
hardcoded socket rather than a configurable one),
if a single pod is requesting multiple net devices
and these devices are from different pools, then
the container attempts to mount all the netdev UDSes
in the pod as /tmp/afxdp.sock. Which means that at best
only 1 netdev will handshake correctly with the AF_XDP
DP. This patch addresses this by making the socket
parameter configurable alongside the 'use_cni' param.
Tested with the AF_XDP DP CNI PR 81.

v2:
* Rename sock_path to uds_path.
* Update documentation to reflect when CAP_BPF is needed.
* Fix testpmd arguments in the provided example for Pods.
* Use AF_XDP API to update the xskmap entry.

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
---
 doc/guides/howto/af_xdp_cni.rst     | 24 ++++++-----
 drivers/net/af_xdp/rte_eth_af_xdp.c | 62 ++++++++++++++++++-----------
 2 files changed, 54 insertions(+), 32 deletions(-)

diff --git a/doc/guides/howto/af_xdp_cni.rst b/doc/guides/howto/af_xdp_cni.rst
index a1a6d5b99c..7829526b40 100644
--- a/doc/guides/howto/af_xdp_cni.rst
+++ b/doc/guides/howto/af_xdp_cni.rst
@@ -38,9 +38,10 @@ The XSKMAP is a BPF map of AF_XDP sockets (XSK).
 The client can then proceed with creating an AF_XDP socket
 and inserting that socket into the XSKMAP pointed to by the descriptor.
 
-The EAL vdev argument ``use_cni`` is used to indicate that the user wishes
-to run the PMD in unprivileged mode and to receive the XSKMAP file descriptor
-from the CNI.
+The EAL vdev arguments ``use_cni`` and ``uds_path`` are used to indicate that
+the user wishes to run the PMD in unprivileged mode and to receive the XSKMAP
+file descriptor from the CNI.
+
 When this flag is set,
 the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag
 should be used when creating the socket
@@ -49,7 +50,7 @@ Instead the loading is handled by the CNI.
 
 .. note::
 
-   The Unix Domain Socket file path appear in the end user is "/tmp/afxdp.sock".
+   The Unix Domain Socket file path appears to the end user at "/tmp/afxdp_dp/<netdev>/afxdp.sock".
 
 
 Prerequisites
@@ -223,8 +224,7 @@ Howto run dpdk-testpmd with CNI plugin:
          securityContext:
           capabilities:
              add:
-               - CAP_NET_RAW
-               - CAP_BPF
+               - NET_RAW
          resources:
            requests:
              hugepages-2Mi: 2Gi
@@ -239,14 +239,20 @@ Howto run dpdk-testpmd with CNI plugin:
 
   .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml
 
+.. note::
+
+   For Kernel versions older than 5.19 `CAP_BPF` is also required in
+   the container capabilities stanza.
+
 * Run DPDK with a command like the following:
 
   .. code-block:: console
 
      kubectl exec -i <Pod name> --container <containers name> -- \
-           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
-           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
-           -- --no-mlockall --in-memory
+           /<Path>/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \
+           --vdev net_af_xdp0,iface=<interface name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock \
+           --vdev net_af_xdp1,iface=e<interface name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock \
+           -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
 
 For further reference please use the `e2e`_ test case in `AF_XDP Plugin for Kubernetes`_
 
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 353c8688ec..505ed6cf1e 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, NOTICE);
 #define UDS_MAX_CMD_LEN			64
 #define UDS_MAX_CMD_RESP		128
 #define UDS_XSK_MAP_FD_MSG		"/xsk_map_fd"
-#define UDS_SOCK			"/tmp/afxdp.sock"
 #define UDS_CONNECT_MSG			"/connect"
 #define UDS_HOST_OK_MSG			"/host_ok"
 #define UDS_HOST_NAK_MSG		"/host_nak"
@@ -171,6 +170,7 @@ struct pmd_internals {
 	bool custom_prog_configured;
 	bool force_copy;
 	bool use_cni;
+	char uds_path[PATH_MAX];
 	struct bpf_map *map;
 
 	struct rte_ether_addr eth_addr;
@@ -191,6 +191,7 @@ struct pmd_process_private {
 #define ETH_AF_XDP_BUDGET_ARG			"busy_budget"
 #define ETH_AF_XDP_FORCE_COPY_ARG		"force_copy"
 #define ETH_AF_XDP_USE_CNI_ARG			"use_cni"
+#define ETH_AF_XDP_USE_CNI_UDS_PATH_ARG	"uds_path"
 
 static const char * const valid_arguments[] = {
 	ETH_AF_XDP_IFACE_ARG,
@@ -201,6 +202,7 @@ static const char * const valid_arguments[] = {
 	ETH_AF_XDP_BUDGET_ARG,
 	ETH_AF_XDP_FORCE_COPY_ARG,
 	ETH_AF_XDP_USE_CNI_ARG,
+	ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
 	NULL
 };
 
@@ -1351,7 +1353,7 @@ configure_preferred_busy_poll(struct pkt_rx_queue *rxq)
 }
 
 static int
-init_uds_sock(struct sockaddr_un *server)
+init_uds_sock(struct sockaddr_un *server, const char *uds_path)
 {
 	int sock;
 
@@ -1362,7 +1364,7 @@ init_uds_sock(struct sockaddr_un *server)
 	}
 
 	server->sun_family = AF_UNIX;
-	strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));
+	strlcpy(server->sun_path, uds_path, sizeof(server->sun_path));
 
 	if (connect(sock, (struct sockaddr *)server, sizeof(struct sockaddr_un)) < 0) {
 		close(sock);
@@ -1382,7 +1384,7 @@ struct msg_internal {
 };
 
 static int
-send_msg(int sock, char *request, int *fd)
+send_msg(int sock, char *request, int *fd, const char *uds_path)
 {
 	int snd;
 	struct iovec iov;
@@ -1393,7 +1395,7 @@ send_msg(int sock, char *request, int *fd)
 
 	memset(&dst, 0, sizeof(dst));
 	dst.sun_family = AF_UNIX;
-	strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));
+	strlcpy(dst.sun_path, uds_path, sizeof(dst.sun_path));
 
 	/* Initialize message header structure */
 	memset(&msgh, 0, sizeof(msgh));
@@ -1471,7 +1473,7 @@ read_msg(int sock, char *response, struct sockaddr_un *s, int *fd)
 
 static int
 make_request_cni(int sock, struct sockaddr_un *server, char *request,
-		 int *req_fd, char *response, int *out_fd)
+		 int *req_fd, char *response, int *out_fd, const char *uds_path)
 {
 	int rval;
 
@@ -1483,7 +1485,7 @@ make_request_cni(int sock, struct sockaddr_un *server, char *request,
 	if (req_fd == NULL)
 		rval = write(sock, request, strlen(request));
 	else
-		rval = send_msg(sock, request, req_fd);
+		rval = send_msg(sock, request, req_fd, uds_path);
 
 	if (rval < 0) {
 		AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno));
@@ -1507,7 +1509,7 @@ check_response(char *response, char *exp_resp, long size)
 }
 
 static int
-get_cni_fd(char *if_name)
+get_cni_fd(char *if_name, const char *uds_path)
 {
 	char request[UDS_MAX_CMD_LEN], response[UDS_MAX_CMD_RESP];
 	char hostname[MAX_LONG_OPT_SZ], exp_resp[UDS_MAX_CMD_RESP];
@@ -1520,14 +1522,14 @@ get_cni_fd(char *if_name)
 		return -1;
 
 	memset(&server, 0, sizeof(server));
-	sock = init_uds_sock(&server);
+	sock = init_uds_sock(&server, uds_path);
 	if (sock < 0)
 		return -1;
 
 	/* Initiates handshake to CNI send: /connect,hostname */
 	snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG, hostname);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1541,7 +1543,7 @@ get_cni_fd(char *if_name)
 	/* Request for "/version" */
 	strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1549,7 +1551,7 @@ get_cni_fd(char *if_name)
 	/* Request for file descriptor for netdev name*/
 	snprintf(request, sizeof(request), "%s,%s", UDS_XSK_MAP_FD_MSG, if_name);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1571,7 +1573,7 @@ get_cni_fd(char *if_name)
 	/* Initiate close connection */
 	strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1695,17 +1697,16 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 	}
 
 	if (internals->use_cni) {
-		int err, fd, map_fd;
+		int err, map_fd;
 
 		/* get socket fd from CNI plugin */
-		map_fd = get_cni_fd(internals->if_name);
+		map_fd = get_cni_fd(internals->if_name, internals->uds_path);
 		if (map_fd < 0) {
 			AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n");
 			goto out_xsk;
 		}
-		/* get socket fd */
-		fd = xsk_socket__fd(rxq->xsk);
-		err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0);
+
+		err = xsk_socket__update_xskmap(rxq->xsk, map_fd);
 		if (err) {
 			AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in map.\n");
 			goto out_xsk;
@@ -2023,7 +2024,8 @@ xdp_get_channels_info(const char *if_name, int *max_queues,
 static int
 parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 		 int *queue_cnt, int *shared_umem, char *prog_path,
-		 int *busy_budget, int *force_copy, int *use_cni)
+		 int *busy_budget, int *force_copy, int *use_cni,
+		 char *uds_path)
 {
 	int ret;
 
@@ -2069,6 +2071,11 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 	if (ret < 0)
 		goto free_kvlist;
 
+	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
+				 &parse_prog_arg, uds_path);
+	if (ret < 0)
+		goto free_kvlist;
+
 free_kvlist:
 	rte_kvargs_free(kvlist);
 	return ret;
@@ -2108,7 +2115,7 @@ static struct rte_eth_dev *
 init_internals(struct rte_vdev_device *dev, const char *if_name,
 	       int start_queue_idx, int queue_cnt, int shared_umem,
 	       const char *prog_path, int busy_budget, int force_copy,
-	       int use_cni)
+		   int use_cni, const char *uds_path)
 {
 	const char *name = rte_vdev_device_name(dev);
 	const unsigned int numa_node = dev->device.numa_node;
@@ -2138,6 +2145,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
 	internals->shared_umem = shared_umem;
 	internals->force_copy = force_copy;
 	internals->use_cni = use_cni;
+	strlcpy(internals->uds_path, uds_path, PATH_MAX);
 
 	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
 				  &internals->combined_queue_cnt)) {
@@ -2328,6 +2336,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 	int busy_budget = -1, ret;
 	int force_copy = 0;
 	int use_cni = 0;
+	char uds_path[PATH_MAX] = {'\0'};
 	struct rte_eth_dev *eth_dev = NULL;
 	const char *name = rte_vdev_device_name(dev);
 
@@ -2370,7 +2379,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
 			     &xsk_queue_cnt, &shared_umem, prog_path,
-			     &busy_budget, &force_copy, &use_cni) < 0) {
+				 &busy_budget, &force_copy, &use_cni, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
 		return -EINVAL;
 	}
@@ -2387,6 +2396,12 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 			return -EINVAL;
 	}
 
+	if (use_cni && !strnlen(uds_path, PATH_MAX)) {
+		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' must also be provided\n",
+			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_CNI_UDS_PATH_ARG);
+			return -EINVAL;
+	}
+
 	if (strlen(if_name) == 0) {
 		AF_XDP_LOG(ERR, "Network interface must be specified\n");
 		return -EINVAL;
@@ -2410,7 +2425,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
 				 xsk_queue_cnt, shared_umem, prog_path,
-				 busy_budget, force_copy, use_cni);
+				 busy_budget, force_copy, use_cni, uds_path);
 	if (eth_dev == NULL) {
 		AF_XDP_LOG(ERR, "Failed to init internals\n");
 		return -1;
@@ -2471,4 +2486,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
 			      "xdp_prog=<string> "
 			      "busy_budget=<int> "
 			      "force_copy=<int> "
-			      "use_cni=<int> ");
+			      "use_cni=<int> "
+			      "uds_path=<string> ");
-- 
2.41.0