From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xueming Li
To: Maryam Tahhan
CC: Ciara Loftus, dpdk stable
Subject: patch 'net/af_xdp: fix multi-interface support for k8s' has been queued to stable release 23.11.2
Date: Fri, 12 Jul 2024 18:44:03 +0800
Message-ID: <20240712104528.308638-18-xuemingl@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240712104528.308638-1-xuemingl@nvidia.com>
References: <20240712104528.308638-1-xuemingl@nvidia.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
X-BeenThere: stable@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: patches for DPDK stable branches
Errors-To: stable-bounces@dpdk.org

Hi,

FYI, your patch has been queued to stable release 23.11.2

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 07/14/24. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(i.e. not only metadata diffs), please double check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
https://git.dpdk.org/dpdk-stable/log/?h=23.11-staging

This queued commit can be viewed at:
https://git.dpdk.org/dpdk-stable/commit/?h=23.11-staging&id=29bcdd02a3cf1673d9c4fe7502300c165e23c446

Thanks.

Xueming Li

---
>From 29bcdd02a3cf1673d9c4fe7502300c165e23c446 Mon Sep 17 00:00:00 2001
From: Maryam Tahhan
Date: Mon, 8 Apr 2024 09:09:21 -0400
Subject: [PATCH] net/af_xdp: fix multi-interface support for k8s
Cc: Xueming Li

[ upstream commit 9c1323736cf91aa46d43def8e8d2349f7498a203 ]

The original 'use_cni' implementation was added to enable support for the
AF_XDP PMD in a K8s env without any escalated privileges.
However, 'use_cni' used a hardcoded socket rather than a configurable one.
If a DPDK pod is requesting multiple net devices and these devices are from
different pools, then the AF_XDP PMD attempts to mount all the netdev UDSes
in the pod as /tmp/afxdp.sock, which means that at best only one netdev will
handshake correctly with the AF_XDP DP.

This patch addresses this by making the socket parameter configurable using a new vdev param called 'dp_path' alongside the original 'use_cni' param. If the 'dp_path' parameter is not set alongside the 'use_cni' parameter, then it's configured inside the AF_XDP PMD (transparently to the user). This change has been tested with the AF_XDP DP PR 81[1], with both single and multiple interfaces. [1] https://github.com/intel/afxdp-plugins-for-kubernetes/pull/81 Fixes: 7fc6ae50369d ("net/af_xdp: support CNI Integration") Signed-off-by: Maryam Tahhan Acked-by: Ciara Loftus --- doc/guides/howto/af_xdp_dp.rst | 55 ++++++++++------ doc/guides/nics/af_xdp.rst | 15 +++++ drivers/net/af_xdp/compat.h | 15 +++++ drivers/net/af_xdp/meson.build | 4 ++ drivers/net/af_xdp/rte_eth_af_xdp.c | 97 ++++++++++++++++++----------- 5 files changed, 131 insertions(+), 55 deletions(-) diff --git a/doc/guides/howto/af_xdp_dp.rst b/doc/guides/howto/af_xdp_dp.rst index 2f51b37f20..f0f82c79f6 100644 --- a/doc/guides/howto/af_xdp_dp.rst +++ b/doc/guides/howto/af_xdp_dp.rst @@ -54,33 +54,34 @@ should be used when creating the socket to instruct libbpf not to load the default libbpf program on the netdev. Instead the loading is handled by the AF_XDP Device Plugin. +The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` argument +to explicitly tell the AF_XDP PMD where to find the UDS +to interact with the AF_XDP Device Plugin. +If this argument is not passed alongside the ``use_cni`` argument +then the AF_XDP PMD configures it internally. -Limitations ------------ +.. note:: -For DPDK versions <= v23.11 the Unix Domain Socket file path -appears in the pod at "/tmp/afxdp.sock". -The handshake implementation in the AF_XDP PMD -is only compatible with the AF_XDP Device Plugin up to commit id `38317c2`_ -and the pod is limited to a single netdev. + DPDK AF_XDP PMD <= v23.11 will only work with + the AF_XDP Device Plugin <= commit id `38317c2`_. .. 
note:: - DPDK AF_XDP PMD <= v23.11 will not work with the latest version - of the AF_XDP Device Plugin. + DPDK AF_XDP PMD > v23.11 will work with latest version of the AF_XDP Device Plugin + through a combination of the ``dp_path`` and/or the ``use_cni`` parameter. + In these versions of the PMD if a user doesn't explicitly set the ``dp_path`` parameter + when using ``use_cni`` then that path is transparently configured in the AF_XDP PMD + to the default `AF_XDP Device Plugin for Kubernetes`_ mount point path. + The path can be overridden by explicitly setting the ``dp_path`` param. + +.. note:: -The issue is if a single pod requests different devices from different pools, -it results in multiple UDS servers serving the pod -with the container using only a single mount point for their UDS as ``/tmp/afxdp.sock``. -This means that at best one device might be able to complete the handshake. -This has been fixed in the AF_XDP Device Plugin so that the mount point in the pods -for the UDS appear at ``/tmp/afxdp_dp//afxdp.sock``. -Later versions of DPDK fix this hardcoded path in the PMD -alongside the ``use_cni`` parameter. + DPDK AF_XDP PMD > v23.11 is backwards compatible + with (older) versions of the AF_XDP DP <= commit id `38317c2`_ + by explicitly setting ``dp_path`` to ``/tmp/afxdp.sock``. .. _38317c2: https://github.com/intel/afxdp-plugins-for-kubernetes/commit/38317c256b5c7dfb39e013a0f76010c2ded03669 - Prerequisites ------------- @@ -291,7 +292,7 @@ Run dpdk-testpmd with the AF_XDP Device Plugin + CNI emptyDir: medium: HugePages - For further reference please use the `pod.yaml`_ + For further reference please see the `pod.yaml`_ .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/examples/pod-spec.yaml @@ -304,3 +305,19 @@ Run dpdk-testpmd with the AF_XDP Device Plugin + CNI --vdev=net_af_xdp0,use_cni=1,iface= \ --no-mlockall --in-memory \ -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap; + + Or + + .. 
code-block:: console + + kubectl exec -i --container -- \ + //dpdk-testpmd -l 0,1 --no-pci \ + --vdev=net_af_xdp0,use_cni=1,iface=,dp_path="/tmp/afxdp_dp//afxdp.sock" \ + --no-mlockall --in-memory \ + -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap; + +.. note:: + + If the ``dp_path`` parameter isn't explicitly set (like the example above), + the AF_XDP PMD will set the parameter value to + ``/tmp/afxdp_dp/<>/afxdp.sock``. diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst index 4dd9c73742..ec97d0155e 100644 --- a/doc/guides/nics/af_xdp.rst +++ b/doc/guides/nics/af_xdp.rst @@ -171,6 +171,21 @@ enable the `AF_XDP Device Plugin for Kubernetes`_ with a DPDK application/pod. so enabling and disabling of the promiscuous mode through the DPDK application is also not supported. +dp_path +~~~~~~~ + +The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` argument +to explicitly tell the AF_XDP PMD where to find the UDS +to interact with the `AF_XDP Device Plugin for Kubernetes`_. +If this argument is not passed alongside the ``use_cni`` argument +then the AF_XDP PMD configures it internally. + +.. _AF_XDP Device Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes + +.. 
code-block:: console + + --vdev=net_af_xdp0,use_cni=1,dp_path="/tmp/afxdp_dp/<>/afxdp.sock" + Limitations ----------- diff --git a/drivers/net/af_xdp/compat.h b/drivers/net/af_xdp/compat.h index 28ea64aeaa..3b5a5c1ed5 100644 --- a/drivers/net/af_xdp/compat.h +++ b/drivers/net/af_xdp/compat.h @@ -46,6 +46,21 @@ create_shared_socket(struct xsk_socket **xsk_ptr __rte_unused, } #endif +#ifdef ETH_AF_XDP_UPDATE_XSKMAP +static __rte_always_inline int +update_xskmap(struct xsk_socket *xsk, int map_fd, int xsk_queue_idx __rte_unused) +{ + return xsk_socket__update_xskmap(xsk, map_fd); +} +#else +static __rte_always_inline int +update_xskmap(struct xsk_socket *xsk, int map_fd, int xsk_queue_idx) +{ + int fd = xsk_socket__fd(xsk); + return bpf_map_update_elem(map_fd, &xsk_queue_idx, &fd, 0); +} +#endif + #ifdef XDP_USE_NEED_WAKEUP static int tx_syscall_needed(struct xsk_ring_prod *q) diff --git a/drivers/net/af_xdp/meson.build b/drivers/net/af_xdp/meson.build index 9f33e57fa2..280bfa8f80 100644 --- a/drivers/net/af_xdp/meson.build +++ b/drivers/net/af_xdp/meson.build @@ -77,6 +77,10 @@ if build dependencies : bpf_dep, args: cflags) cflags += ['-DRTE_NET_AF_XDP_LIBBPF_XDP_ATTACH'] endif + if cc.has_function('xsk_socket__update_xskmap', prefix : xsk_check_prefix, + dependencies : ext_deps, args: cflags) + cflags += ['-DETH_AF_XDP_UPDATE_XSKMAP'] + endif endif require_iova_in_mbuf = false diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 268a130c49..dea3bab983 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -83,12 +83,13 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, NOTICE); #define ETH_AF_XDP_MP_KEY "afxdp_mp_send_fds" +#define DP_BASE_PATH "/tmp/afxdp_dp" +#define DP_UDS_SOCK "afxdp.sock" #define MAX_LONG_OPT_SZ 64 #define UDS_MAX_FD_NUM 2 #define UDS_MAX_CMD_LEN 64 #define UDS_MAX_CMD_RESP 128 #define UDS_XSK_MAP_FD_MSG "/xsk_map_fd" -#define UDS_SOCK "/tmp/afxdp.sock" #define 
UDS_CONNECT_MSG "/connect" #define UDS_HOST_OK_MSG "/host_ok" #define UDS_HOST_NAK_MSG "/host_nak" @@ -171,6 +172,7 @@ struct pmd_internals { bool custom_prog_configured; bool force_copy; bool use_cni; + char dp_path[PATH_MAX]; struct bpf_map *map; struct rte_ether_addr eth_addr; @@ -191,6 +193,7 @@ struct pmd_process_private { #define ETH_AF_XDP_BUDGET_ARG "busy_budget" #define ETH_AF_XDP_FORCE_COPY_ARG "force_copy" #define ETH_AF_XDP_USE_CNI_ARG "use_cni" +#define ETH_AF_XDP_DP_PATH_ARG "dp_path" static const char * const valid_arguments[] = { ETH_AF_XDP_IFACE_ARG, @@ -201,6 +204,7 @@ static const char * const valid_arguments[] = { ETH_AF_XDP_BUDGET_ARG, ETH_AF_XDP_FORCE_COPY_ARG, ETH_AF_XDP_USE_CNI_ARG, + ETH_AF_XDP_DP_PATH_ARG, NULL }; @@ -1351,7 +1355,7 @@ err_prefer: } static int -init_uds_sock(struct sockaddr_un *server) +init_uds_sock(struct sockaddr_un *server, const char *dp_path) { int sock; @@ -1362,7 +1366,7 @@ init_uds_sock(struct sockaddr_un *server) } server->sun_family = AF_UNIX; - strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path)); + strlcpy(server->sun_path, dp_path, sizeof(server->sun_path)); if (connect(sock, (struct sockaddr *)server, sizeof(struct sockaddr_un)) < 0) { close(sock); @@ -1382,7 +1386,7 @@ struct msg_internal { }; static int -send_msg(int sock, char *request, int *fd) +send_msg(int sock, char *request, int *fd, const char *dp_path) { int snd; struct iovec iov; @@ -1393,7 +1397,7 @@ send_msg(int sock, char *request, int *fd) memset(&dst, 0, sizeof(dst)); dst.sun_family = AF_UNIX; - strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path)); + strlcpy(dst.sun_path, dp_path, sizeof(dst.sun_path)); /* Initialize message header structure */ memset(&msgh, 0, sizeof(msgh)); @@ -1470,8 +1474,8 @@ read_msg(int sock, char *response, struct sockaddr_un *s, int *fd) } static int -make_request_cni(int sock, struct sockaddr_un *server, char *request, - int *req_fd, char *response, int *out_fd) +make_request_dp(int sock, struct 
sockaddr_un *server, char *request, + int *req_fd, char *response, int *out_fd, const char *dp_path) { int rval; @@ -1483,7 +1487,7 @@ make_request_cni(int sock, struct sockaddr_un *server, char *request, if (req_fd == NULL) rval = write(sock, request, strlen(request)); else - rval = send_msg(sock, request, req_fd); + rval = send_msg(sock, request, req_fd, dp_path); if (rval < 0) { AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno)); @@ -1507,7 +1511,7 @@ check_response(char *response, char *exp_resp, long size) } static int -get_cni_fd(char *if_name) +uds_get_xskmap_fd(char *if_name, const char *dp_path) { char request[UDS_MAX_CMD_LEN], response[UDS_MAX_CMD_RESP]; char hostname[MAX_LONG_OPT_SZ], exp_resp[UDS_MAX_CMD_RESP]; @@ -1520,14 +1524,14 @@ get_cni_fd(char *if_name) return -1; memset(&server, 0, sizeof(server)); - sock = init_uds_sock(&server); + sock = init_uds_sock(&server, dp_path); if (sock < 0) return -1; - /* Initiates handshake to CNI send: /connect,hostname */ + /* Initiates handshake to the AF_XDP Device Plugin send: /connect,hostname */ snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG, hostname); memset(response, 0, sizeof(response)); - if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) { + if (make_request_dp(sock, &server, request, NULL, response, &out_fd, dp_path) < 0) { AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request); goto err_close; } @@ -1541,7 +1545,7 @@ get_cni_fd(char *if_name) /* Request for "/version" */ strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN); memset(response, 0, sizeof(response)); - if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) { + if (make_request_dp(sock, &server, request, NULL, response, &out_fd, dp_path) < 0) { AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request); goto err_close; } @@ -1549,7 +1553,7 @@ get_cni_fd(char *if_name) /* Request for file descriptor for netdev name*/ snprintf(request, sizeof(request), "%s,%s", UDS_XSK_MAP_FD_MSG, 
if_name); memset(response, 0, sizeof(response)); - if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) { + if (make_request_dp(sock, &server, request, NULL, response, &out_fd, dp_path) < 0) { AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request); goto err_close; } @@ -1571,7 +1575,7 @@ get_cni_fd(char *if_name) /* Initiate close connection */ strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN); memset(response, 0, sizeof(response)); - if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) { + if (make_request_dp(sock, &server, request, NULL, response, &out_fd, dp_path) < 0) { AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request); goto err_close; } @@ -1695,21 +1699,21 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq, } if (internals->use_cni) { - int err, fd, map_fd; + int err, map_fd; - /* get socket fd from CNI plugin */ - map_fd = get_cni_fd(internals->if_name); + /* get socket fd from AF_XDP Device Plugin */ + map_fd = uds_get_xskmap_fd(internals->if_name, internals->dp_path); if (map_fd < 0) { - AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n"); + AF_XDP_LOG(ERR, "Failed to receive xskmap fd from AF_XDP Device Plugin\n"); goto out_xsk; } - /* get socket fd */ - fd = xsk_socket__fd(rxq->xsk); - err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0); + + err = update_xskmap(rxq->xsk, map_fd, rxq->xsk_queue_idx); if (err) { - AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in map.\n"); + AF_XDP_LOG(ERR, "Failed to insert xsk in map.\n"); goto out_xsk; } + } else if (rxq->busy_budget) { ret = configure_preferred_busy_poll(rxq); if (ret) { @@ -1881,13 +1885,13 @@ static const struct eth_dev_ops ops = { .get_monitor_addr = eth_get_monitor_addr, }; -/* CNI option works in unprivileged container environment - * and ethernet device functionality will be reduced. So - * additional customiszed eth_dev_ops struct is needed - * for cni. 
Promiscuous enable and disable functionality - * is removed. +/* AF_XDP Device Plugin option works in unprivileged + * container environments and ethernet device functionality + * will be reduced. So additional customised eth_dev_ops + * struct is needed for the Device Plugin. Promiscuous + * enable and disable functionality is removed. **/ -static const struct eth_dev_ops ops_cni = { +static const struct eth_dev_ops ops_afxdp_dp = { .dev_start = eth_dev_start, .dev_stop = eth_dev_stop, .dev_close = eth_dev_close, @@ -2023,7 +2027,8 @@ xdp_get_channels_info(const char *if_name, int *max_queues, static int parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue, int *queue_cnt, int *shared_umem, char *prog_path, - int *busy_budget, int *force_copy, int *use_cni) + int *busy_budget, int *force_copy, int *use_cni, + char *dp_path) { int ret; @@ -2069,6 +2074,11 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue, if (ret < 0) goto free_kvlist; + ret = rte_kvargs_process(kvlist, ETH_AF_XDP_DP_PATH_ARG, + &parse_prog_arg, dp_path); + if (ret < 0) + goto free_kvlist; + free_kvlist: rte_kvargs_free(kvlist); return ret; @@ -2108,7 +2118,7 @@ static struct rte_eth_dev * init_internals(struct rte_vdev_device *dev, const char *if_name, int start_queue_idx, int queue_cnt, int shared_umem, const char *prog_path, int busy_budget, int force_copy, - int use_cni) + int use_cni, const char *dp_path) { const char *name = rte_vdev_device_name(dev); const unsigned int numa_node = dev->device.numa_node; @@ -2138,6 +2148,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, internals->shared_umem = shared_umem; internals->force_copy = force_copy; internals->use_cni = use_cni; + strlcpy(internals->dp_path, dp_path, PATH_MAX); if (xdp_get_channels_info(if_name, &internals->max_queue_cnt, &internals->combined_queue_cnt)) { @@ -2199,7 +2210,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, if 
(!internals->use_cni) eth_dev->dev_ops = &ops; else - eth_dev->dev_ops = &ops_cni; + eth_dev->dev_ops = &ops_afxdp_dp; eth_dev->rx_pkt_burst = eth_af_xdp_rx; eth_dev->tx_pkt_burst = eth_af_xdp_tx; @@ -2328,6 +2339,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) int busy_budget = -1, ret; int force_copy = 0; int use_cni = 0; + char dp_path[PATH_MAX] = {'\0'}; struct rte_eth_dev *eth_dev = NULL; const char *name = rte_vdev_device_name(dev); @@ -2370,7 +2382,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx, &xsk_queue_cnt, &shared_umem, prog_path, - &busy_budget, &force_copy, &use_cni) < 0) { + &busy_budget, &force_copy, &use_cni, dp_path) < 0) { AF_XDP_LOG(ERR, "Invalid kvargs value\n"); return -EINVAL; } @@ -2384,7 +2396,19 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) if (use_cni && strnlen(prog_path, PATH_MAX)) { AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n", ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_PROG_ARG); - return -EINVAL; + return -EINVAL; + } + + if (use_cni && !strnlen(dp_path, PATH_MAX)) { + snprintf(dp_path, sizeof(dp_path), "%s/%s/%s", DP_BASE_PATH, if_name, DP_UDS_SOCK); + AF_XDP_LOG(INFO, "'%s' parameter not provided, setting value to '%s'\n", + ETH_AF_XDP_DP_PATH_ARG, dp_path); + } + + if (!use_cni && strnlen(dp_path, PATH_MAX)) { + AF_XDP_LOG(ERR, "'%s' parameter is set, but '%s' was not enabled\n", + ETH_AF_XDP_DP_PATH_ARG, ETH_AF_XDP_USE_CNI_ARG); + return -EINVAL; } if (strlen(if_name) == 0) { @@ -2410,7 +2434,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) eth_dev = init_internals(dev, if_name, xsk_start_queue_idx, xsk_queue_cnt, shared_umem, prog_path, - busy_budget, force_copy, use_cni); + busy_budget, force_copy, use_cni, dp_path); if (eth_dev == NULL) { AF_XDP_LOG(ERR, "Failed to init internals\n"); return -1; @@ -2471,4 +2495,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp, "xdp_prog= " "busy_budget= " "force_copy= " - "use_cni= "); 
+ "use_cni= " + "dp_path= "); -- 2.34.1 --- Diff of the applied patch vs upstream commit (please double-check if non-empty: --- --- - 2024-07-12 18:40:15.072871011 +0800 +++ 0017-net-af_xdp-fix-multi-interface-support-for-k8s.patch 2024-07-12 18:40:13.946594246 +0800 @@ -1 +1 @@ -From 9c1323736cf91aa46d43def8e8d2349f7498a203 Mon Sep 17 00:00:00 2001 +From 29bcdd02a3cf1673d9c4fe7502300c165e23c446 Mon Sep 17 00:00:00 2001 @@ -4,0 +5,3 @@ +Cc: Xueming Li + +[ upstream commit 9c1323736cf91aa46d43def8e8d2349f7498a203 ] @@ -28 +30,0 @@ -Cc: stable@dpdk.org @@ -33,7 +35,6 @@ - doc/guides/howto/af_xdp_dp.rst | 55 ++++++++++----- - doc/guides/nics/af_xdp.rst | 15 ++++ - doc/guides/rel_notes/release_24_07.rst | 10 +++ - drivers/net/af_xdp/compat.h | 15 ++++ - drivers/net/af_xdp/meson.build | 4 ++ - drivers/net/af_xdp/rte_eth_af_xdp.c | 97 ++++++++++++++++---------- - 6 files changed, 141 insertions(+), 55 deletions(-) + doc/guides/howto/af_xdp_dp.rst | 55 ++++++++++------ + doc/guides/nics/af_xdp.rst | 15 +++++ + drivers/net/af_xdp/compat.h | 15 +++++ + drivers/net/af_xdp/meson.build | 4 ++ + drivers/net/af_xdp/rte_eth_af_xdp.c | 97 ++++++++++++++++++----------- + 5 files changed, 131 insertions(+), 55 deletions(-) @@ -153,21 +153,0 @@ -diff --git a/doc/guides/rel_notes/release_24_07.rst b/doc/guides/rel_notes/release_24_07.rst -index a69f24cf99..0b2e1d328d 100644 ---- a/doc/guides/rel_notes/release_24_07.rst -+++ b/doc/guides/rel_notes/release_24_07.rst -@@ -55,6 +55,16 @@ New Features - Also, make sure to start the actual text at the margin. - ======================================================= - -+* **Updated AF_XDP driver.** -+ -+ * Enabled multi-interface (UDS) support with AF_XDP Device Plugin. -+ -+ The vdev argument for the AF_XDP PMD ``use_cni`` previously limited -+ a pod to using only a single netdev/interface. 
-+ The latest changes (adding the ``dp_path`` parameter) remove this limitation -+ and maintain backward compatibility for any applications already using -+ the ``use_cni`` vdev argument with the AF_XDP Device Plugin. -+ - - Removed Items - ------------- @@ -216 +196 @@ -index 6ba455bb9b..dcd590569e 100644 +index 268a130c49..dea3bab983 100644 @@ -258 +238 @@ -@@ -1352,7 +1356,7 @@ err_prefer: +@@ -1351,7 +1355,7 @@ err_prefer: @@ -267 +247 @@ -@@ -1363,7 +1367,7 @@ init_uds_sock(struct sockaddr_un *server) +@@ -1362,7 +1366,7 @@ init_uds_sock(struct sockaddr_un *server) @@ -276 +256 @@ -@@ -1383,7 +1387,7 @@ struct msg_internal { +@@ -1382,7 +1386,7 @@ struct msg_internal { @@ -285 +265 @@ -@@ -1394,7 +1398,7 @@ send_msg(int sock, char *request, int *fd) +@@ -1393,7 +1397,7 @@ send_msg(int sock, char *request, int *fd) @@ -294 +274 @@ -@@ -1471,8 +1475,8 @@ read_msg(int sock, char *response, struct sockaddr_un *s, int *fd) +@@ -1470,8 +1474,8 @@ read_msg(int sock, char *response, struct sockaddr_un *s, int *fd) @@ -305 +285 @@ -@@ -1484,7 +1488,7 @@ make_request_cni(int sock, struct sockaddr_un *server, char *request, +@@ -1483,7 +1487,7 @@ make_request_cni(int sock, struct sockaddr_un *server, char *request, @@ -314 +294 @@ -@@ -1508,7 +1512,7 @@ check_response(char *response, char *exp_resp, long size) +@@ -1507,7 +1511,7 @@ check_response(char *response, char *exp_resp, long size) @@ -323 +303 @@ -@@ -1521,14 +1525,14 @@ get_cni_fd(char *if_name) +@@ -1520,14 +1524,14 @@ get_cni_fd(char *if_name) @@ -341 +321 @@ -@@ -1542,7 +1546,7 @@ get_cni_fd(char *if_name) +@@ -1541,7 +1545,7 @@ get_cni_fd(char *if_name) @@ -350 +330 @@ -@@ -1550,7 +1554,7 @@ get_cni_fd(char *if_name) +@@ -1549,7 +1553,7 @@ get_cni_fd(char *if_name) @@ -359 +339 @@ -@@ -1572,7 +1576,7 @@ get_cni_fd(char *if_name) +@@ -1571,7 +1575,7 @@ get_cni_fd(char *if_name) @@ -368 +348 @@ -@@ -1697,21 +1701,21 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq, +@@ -1695,21 
+1699,21 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq, @@ -398 +378 @@ -@@ -1883,13 +1887,13 @@ static const struct eth_dev_ops ops = { +@@ -1881,13 +1885,13 @@ static const struct eth_dev_ops ops = { @@ -418 +398 @@ -@@ -2025,7 +2029,8 @@ xdp_get_channels_info(const char *if_name, int *max_queues, +@@ -2023,7 +2027,8 @@ xdp_get_channels_info(const char *if_name, int *max_queues, @@ -428 +408 @@ -@@ -2071,6 +2076,11 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue, +@@ -2069,6 +2074,11 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue, @@ -440 +420 @@ -@@ -2110,7 +2120,7 @@ static struct rte_eth_dev * +@@ -2108,7 +2118,7 @@ static struct rte_eth_dev * @@ -449 +429 @@ -@@ -2140,6 +2150,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, +@@ -2138,6 +2148,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, @@ -457 +437 @@ -@@ -2201,7 +2212,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, +@@ -2199,7 +2210,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, @@ -466 +446 @@ -@@ -2330,6 +2341,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) +@@ -2328,6 +2339,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) @@ -474 +454 @@ -@@ -2372,7 +2384,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) +@@ -2370,7 +2382,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) @@ -483 +463 @@ -@@ -2386,7 +2398,19 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) +@@ -2384,7 +2396,19 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) @@ -504 +484 @@ -@@ -2412,7 +2436,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) +@@ -2410,7 +2434,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) @@ -513 +493 @@ -@@ -2473,4 +2497,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp, +@@ -2471,4 +2495,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
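
As a reading aid for the probe-time checks added by the patch above, the
'dp_path' handling can be sketched as a stand-alone C function. This is only
a minimal sketch under stated assumptions, not the driver code:
resolve_dp_path() is a hypothetical name, and the -ENAMETOOLONG check is an
addition for the sketch. Only DP_BASE_PATH, DP_UDS_SOCK and the two
validation rules ('dp_path' defaults when 'use_cni' is set, 'dp_path' without
'use_cni' is rejected) are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Values taken from the patch above. */
#define DP_BASE_PATH "/tmp/afxdp_dp"
#define DP_UDS_SOCK  "afxdp.sock"

/*
 * Hypothetical helper (not part of the driver): validate dp_path, or fill in
 * the default, following the rules added to rte_pmd_af_xdp_probe().
 *
 * Returns 0 on success, -EINVAL if dp_path is set without use_cni, and
 * -ENAMETOOLONG if the default path does not fit in the buffer (this last
 * check is an assumption of the sketch, it is not in the patch).
 */
static int
resolve_dp_path(bool use_cni, const char *if_name, char *dp_path, size_t len)
{
	int n;

	assert(if_name != NULL && dp_path != NULL && len > 0);

	if (!use_cni) {
		/* dp_path is only valid together with use_cni. */
		return dp_path[0] != '\0' ? -EINVAL : 0;
	}

	if (dp_path[0] != '\0')
		return 0; /* a user-supplied path is kept unchanged */

	/* Default: <base>/<interface name>/<socket file name>. */
	n = snprintf(dp_path, len, "%s/%s/%s", DP_BASE_PATH, if_name, DP_UDS_SOCK);
	if (n < 0 || (size_t)n >= len)
		return -ENAMETOOLONG;

	return 0;
}
```

Called with use_cni set, an empty dp_path buffer and an illustrative
interface name such as "eth0", the helper fills the buffer with a path under
DP_BASE_PATH. Passing "/tmp/afxdp.sock" explicitly leaves it untouched, which
matches the backwards-compatible case described in the documentation changes.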