From: Maryam Tahhan
To: ferruh.yigit@amd.com, stephen@networkplumber.org, lihuisong@huawei.com,
 fengchengwen@huawei.com, liuyonglong@huawei.com,
 shibin.koikkara.reeny@intel.com
Cc: dev@dpdk.org, Maryam Tahhan
Subject: [v2] net/af_xdp: enable a sock path alongside use_cni
Date: Mon, 4 Dec 2023 05:31:01 -0500
Message-ID: <20231204103101.2124374-1-mtahhan@redhat.com>
X-Mailer: git-send-email 2.41.0
List-Id: DPDK patches and discussions

With the original 'use_cni' implementation (using a hardcoded socket
rather than a configurable one), if a
single pod requests multiple net devices from different pools, the
container attempts to mount all the netdev UDSes in the pod as
/tmp/afxdp.sock, which means that at best only one netdev will
handshake correctly with the AF_XDP DP. This patch addresses this by
making the socket parameter configurable alongside the 'use_cni' param.
Tested with the AF_XDP DP CNI PR 81.

v2:
* Rename sock_path to uds_path.
* Update documentation to reflect when CAP_BPF is needed.
* Fix testpmd arguments in the provided example for Pods.
* Use AF_XDP API to update the xskmap entry.

Signed-off-by: Maryam Tahhan
---
 doc/guides/howto/af_xdp_cni.rst     | 24 ++++++-----
 drivers/net/af_xdp/rte_eth_af_xdp.c | 62 ++++++++++++++++++-----------
 2 files changed, 54 insertions(+), 32 deletions(-)

diff --git a/doc/guides/howto/af_xdp_cni.rst b/doc/guides/howto/af_xdp_cni.rst
index a1a6d5b99c..7829526b40 100644
--- a/doc/guides/howto/af_xdp_cni.rst
+++ b/doc/guides/howto/af_xdp_cni.rst
@@ -38,9 +38,10 @@ The XSKMAP is a BPF map of AF_XDP sockets (XSK).
 The client can then proceed with creating an AF_XDP socket
 and inserting that socket into the XSKMAP pointed to by the descriptor.
 
-The EAL vdev argument ``use_cni`` is used to indicate that the user wishes
-to run the PMD in unprivileged mode and to receive the XSKMAP file descriptor
-from the CNI.
+The EAL vdev arguments ``use_cni`` and ``uds_path`` are used to indicate that
+the user wishes to run the PMD in unprivileged mode and to receive the XSKMAP
+file descriptor from the CNI.
+
 When this flag is set,
 the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag
 should be used when creating the socket
@@ -49,7 +50,7 @@ Instead the loading is handled by the CNI.
 
 .. note::
 
-   The Unix Domain Socket file path appear in the end user is "/tmp/afxdp.sock".
+   The Unix Domain Socket file path appears to the end user at "/tmp/afxdp_dp//afxdp.sock".
 
 
 Prerequisites
@@ -223,8 +224,7 @@ Howto run dpdk-testpmd with CNI plugin:
          securityContext:
            capabilities:
              add:
-               - CAP_NET_RAW
-               - CAP_BPF
+               - NET_RAW
          resources:
            requests:
              hugepages-2Mi: 2Gi
@@ -239,14 +239,20 @@ Howto run dpdk-testpmd with CNI plugin:
 
 .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml
 
+.. note::
+
+   For Kernel versions older than 5.19 `CAP_BPF` is also required in
+   the container capabilities stanza.
+
 * Run DPDK with a command like the following:
 
 .. code-block:: console
 
     kubectl exec -i --container -- \
-      //dpdk-testpmd -l 0,1 --no-pci \
-      --vdev=net_af_xdp0,use_cni=1,iface= \
-      -- --no-mlockall --in-memory
+      //dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \
+      --vdev net_af_xdp0,iface=,use_cni=1,uds_path=/tmp/afxdp_dp//afxdp.sock \
+      --vdev net_af_xdp1,iface=e,use_cni=1,uds_path=/tmp/afxdp_dp//afxdp.sock \
+      -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
 
 For further reference please use the `e2e`_ test case in `AF_XDP Plugin for Kubernetes`_
 
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 353c8688ec..505ed6cf1e 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, NOTICE);
 #define UDS_MAX_CMD_LEN 64
 #define UDS_MAX_CMD_RESP 128
 #define UDS_XSK_MAP_FD_MSG "/xsk_map_fd"
-#define UDS_SOCK "/tmp/afxdp.sock"
 #define UDS_CONNECT_MSG "/connect"
 #define UDS_HOST_OK_MSG "/host_ok"
 #define UDS_HOST_NAK_MSG "/host_nak"
@@ -171,6 +170,7 @@ struct pmd_internals {
 	bool custom_prog_configured;
 	bool force_copy;
 	bool use_cni;
+	char uds_path[PATH_MAX];
 	struct bpf_map *map;
 
 	struct rte_ether_addr eth_addr;
@@ -191,6 +191,7 @@ struct pmd_process_private {
 #define ETH_AF_XDP_BUDGET_ARG "busy_budget"
 #define ETH_AF_XDP_FORCE_COPY_ARG "force_copy"
 #define ETH_AF_XDP_USE_CNI_ARG "use_cni"
+#define ETH_AF_XDP_USE_CNI_UDS_PATH_ARG "uds_path"
 
 static const char * const valid_arguments[] = {
 	ETH_AF_XDP_IFACE_ARG,
@@ -201,6 +202,7 @@ static const char * const valid_arguments[] = {
 	ETH_AF_XDP_BUDGET_ARG,
 	ETH_AF_XDP_FORCE_COPY_ARG,
 	ETH_AF_XDP_USE_CNI_ARG,
+	ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
 	NULL
 };
 
@@ -1351,7 +1353,7 @@ configure_preferred_busy_poll(struct pkt_rx_queue *rxq)
 }
 
 static int
-init_uds_sock(struct sockaddr_un *server)
+init_uds_sock(struct sockaddr_un *server, const char *uds_path)
 {
 	int sock;
 
@@ -1362,7 +1364,7 @@ init_uds_sock(struct sockaddr_un *server)
 	}
 
 	server->sun_family = AF_UNIX;
-	strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));
+	strlcpy(server->sun_path, uds_path, sizeof(server->sun_path));
 
 	if (connect(sock, (struct sockaddr *)server, sizeof(struct sockaddr_un)) < 0) {
 		close(sock);
@@ -1382,7 +1384,7 @@ struct msg_internal {
 };
 
 static int
-send_msg(int sock, char *request, int *fd)
+send_msg(int sock, char *request, int *fd, const char *uds_path)
 {
 	int snd;
 	struct iovec iov;
@@ -1393,7 +1395,7 @@ send_msg(int sock, char *request, int *fd)
 
 	memset(&dst, 0, sizeof(dst));
 	dst.sun_family = AF_UNIX;
-	strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));
+	strlcpy(dst.sun_path, uds_path, sizeof(dst.sun_path));
 
 	/* Initialize message header structure */
 	memset(&msgh, 0, sizeof(msgh));
@@ -1471,7 +1473,7 @@ read_msg(int sock, char *response, struct sockaddr_un *s, int *fd)
 
 static int
 make_request_cni(int sock, struct sockaddr_un *server, char *request,
-		 int *req_fd, char *response, int *out_fd)
+		 int *req_fd, char *response, int *out_fd, const char *uds_path)
 {
 	int rval;
 
@@ -1483,7 +1485,7 @@ make_request_cni(int sock, struct sockaddr_un *server, char *request,
 	if (req_fd == NULL)
 		rval = write(sock, request, strlen(request));
 	else
-		rval = send_msg(sock, request, req_fd);
+		rval = send_msg(sock, request, req_fd, uds_path);
 
 	if (rval < 0) {
 		AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno));
@@ -1507,7 +1509,7 @@ check_response(char *response, char *exp_resp, long size)
 }
 
 static int
-get_cni_fd(char *if_name)
+get_cni_fd(char *if_name, const char *uds_path)
 {
 	char request[UDS_MAX_CMD_LEN], response[UDS_MAX_CMD_RESP];
 	char hostname[MAX_LONG_OPT_SZ], exp_resp[UDS_MAX_CMD_RESP];
@@ -1520,14 +1522,14 @@ get_cni_fd(char *if_name)
 		return -1;
 
 	memset(&server, 0, sizeof(server));
-	sock = init_uds_sock(&server);
+	sock = init_uds_sock(&server, uds_path);
 	if (sock < 0)
 		return -1;
 
 	/* Initiates handshake to CNI send: /connect,hostname */
 	snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG, hostname);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1541,7 +1543,7 @@ get_cni_fd(char *if_name)
 	/* Request for "/version" */
 	strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1549,7 +1551,7 @@ get_cni_fd(char *if_name)
 	/* Request for file descriptor for netdev name*/
 	snprintf(request, sizeof(request), "%s,%s", UDS_XSK_MAP_FD_MSG, if_name);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1571,7 +1573,7 @@ get_cni_fd(char *if_name)
 	/* Initiate close connection */
 	strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1695,17 +1697,16 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 	}
 
 	if (internals->use_cni) {
-		int err, fd, map_fd;
+		int err, map_fd;
 
 		/* get socket fd from CNI plugin */
-		map_fd = get_cni_fd(internals->if_name);
+		map_fd = get_cni_fd(internals->if_name, internals->uds_path);
 		if (map_fd < 0) {
 			AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n");
 			goto out_xsk;
 		}
-		/* get socket fd */
-		fd = xsk_socket__fd(rxq->xsk);
-		err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0);
+
+		err = xsk_socket__update_xskmap(rxq->xsk, map_fd);
 		if (err) {
 			AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in map.\n");
 			goto out_xsk;
@@ -2023,7 +2024,8 @@ xdp_get_channels_info(const char *if_name, int *max_queues,
 static int
 parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 		 int *queue_cnt, int *shared_umem, char *prog_path,
-		 int *busy_budget, int *force_copy, int *use_cni)
+		 int *busy_budget, int *force_copy, int *use_cni,
+		 char *uds_path)
 {
 	int ret;
 
@@ -2069,6 +2071,11 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 	if (ret < 0)
 		goto free_kvlist;
 
+	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
+				 &parse_prog_arg, uds_path);
+	if (ret < 0)
+		goto free_kvlist;
+
 free_kvlist:
 	rte_kvargs_free(kvlist);
 	return ret;
@@ -2108,7 +2115,7 @@ static struct rte_eth_dev *
 init_internals(struct rte_vdev_device *dev, const char *if_name,
 	       int start_queue_idx, int queue_cnt, int shared_umem,
 	       const char *prog_path, int busy_budget, int force_copy,
-	       int use_cni)
+	       int use_cni, const char *uds_path)
 {
 	const char *name = rte_vdev_device_name(dev);
 	const unsigned int numa_node = dev->device.numa_node;
@@ -2138,6 +2145,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
 	internals->shared_umem = shared_umem;
 	internals->force_copy = force_copy;
 	internals->use_cni = use_cni;
+	strlcpy(internals->uds_path, uds_path, PATH_MAX);
 
 	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
 				  &internals->combined_queue_cnt)) {
@@ -2328,6 +2336,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 	int busy_budget = -1, ret;
 	int force_copy = 0;
 	int use_cni = 0;
+	char uds_path[PATH_MAX] = {'\0'};
 	struct rte_eth_dev *eth_dev = NULL;
 	const char *name = rte_vdev_device_name(dev);
 
@@ -2370,7 +2379,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
 			     &xsk_queue_cnt, &shared_umem, prog_path,
-			     &busy_budget, &force_copy, &use_cni) < 0) {
+			     &busy_budget, &force_copy, &use_cni, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
 		return -EINVAL;
 	}
@@ -2387,6 +2396,12 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 		return -EINVAL;
 	}
 
+	if (use_cni && !strnlen(uds_path, PATH_MAX)) {
+		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' must also be provided\n",
+			   ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_CNI_UDS_PATH_ARG);
+		return -EINVAL;
+	}
+
 	if (strlen(if_name) == 0) {
 		AF_XDP_LOG(ERR, "Network interface must be specified\n");
 		return -EINVAL;
@@ -2410,7 +2425,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
 				 xsk_queue_cnt, shared_umem, prog_path,
-				 busy_budget, force_copy, use_cni);
+				 busy_budget, force_copy, use_cni, uds_path);
 	if (eth_dev == NULL) {
 		AF_XDP_LOG(ERR, "Failed to init internals\n");
 		return -1;
@@ -2471,4 +2486,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
 			      "xdp_prog= "
 			      "busy_budget= "
 			      "force_copy= "
-			      "use_cni= ");
+			      "use_cni= "
+			      "uds_path= ");
-- 
2.41.0
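
Note for readers of this patch: the core of the change is that the UDS
address is now built from a caller-supplied path rather than a
compile-time constant. The standalone sketch below illustrates that
idea outside of DPDK. It is not code from the patch: the helper name
fill_uds_addr() is hypothetical, and the explicit truncation check is
an assumption added here because sun_path (108 bytes on Linux) is much
smaller than the PATH_MAX buffer used for uds_path, so an over-long
path would otherwise be shortened silently.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper, not part of the patch: fill a sockaddr_un from a
 * caller-supplied path instead of a hardcoded one.
 * Returns 0 on success, -1 if the path is missing, empty, or too long
 * to fit in sun_path without truncation.
 */
int
fill_uds_addr(struct sockaddr_un *addr, const char *uds_path)
{
	int n;

	if (addr == NULL || uds_path == NULL || uds_path[0] == '\0')
		return -1;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;

	/* snprintf reports the length it wanted to write, so truncation is detectable. */
	n = snprintf(addr->sun_path, sizeof(addr->sun_path), "%s", uds_path);
	if (n < 0 || (size_t)n >= sizeof(addr->sun_path))
		return -1;

	return 0;
}
```

With one such address per netdev, each vdev can connect() to its own
socket file, so two devices from different pools no longer compete for
a single /tmp/afxdp.sock mount point.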