From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0F89843411; Thu, 30 Nov 2023 15:30:56 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 94FDB42E8B; Thu, 30 Nov 2023 15:30:55 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id BE5E2402F0 for ; Thu, 30 Nov 2023 15:30:53 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1701354653; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vqSsUqE5LcjSzElkHe+8i8W9tnn95Ddr3AXkGtdO/1g=; b=RNKQnStQOF5KzOCW2d+DVIuIFdzk0yiREdH18pJCXKEOdKcw9D9LLP5upVfAs3wGw6QIdh Zqc5/BzUd/ubcku/1GonpxXHMBcEt3CGfi1aYsOuvrlLg3MIu2n1UC/mINeKUOtfe7yuLA MvKXATToCbgWkmy8ZZ/68hXud4n/3g8= Received: from mail-wr1-f71.google.com (mail-wr1-f71.google.com [209.85.221.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-594-biuB5coUPwaSURKUkani9w-1; Thu, 30 Nov 2023 09:30:51 -0500 X-MC-Unique: biuB5coUPwaSURKUkani9w-1 Received: by mail-wr1-f71.google.com with SMTP id ffacd0b85a97d-333127fecf9so870316f8f.2 for ; Thu, 30 Nov 2023 06:30:47 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701354647; x=1701959447; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=vqSsUqE5LcjSzElkHe+8i8W9tnn95Ddr3AXkGtdO/1g=; b=seBNXQVFcWSJnlZc4U9awhulOl7TUrTemDEGji+1IOD5zwTF7CzqYvchsaFT0dGP3x 9HEWbcQyn92OiFgtTLUvF5P2P0egZgiqnjoDzCibxa1IXzbS++PL7qGuXef4w15iwKS8 8ckrPAQObICZEXYUjitS1z3bZXZnfSq6wL9C2QyhrjTt+n823wrvizC7Y+AxScj6bhFQ 0r27jM8rk6l7NOciMhY/iSx0p8P4J1zqF9d4SEz5+KYzj1G6YGY/dAiMr90K1wlEdW1p 4LiHt87bdGwmyGxOO1ZZ0UR+QmraPcXXUOqEXdr3szgWUlpufsc0nz24sqBsCG5NiPMI SGlg== X-Gm-Message-State: AOJu0Yya83kqWVHMhOAIzRLMuK8GVwqSa45k7V446GeOni//jJ6fIUJr yH17L+lXXI9pn3tbvrqktjzIur+Q4a7QDvF0ACnVvYu7sirtw1Yg8D3Gkb6siI/FVnVJuc+uA2v 4EuU= X-Received: by 2002:a05:6000:372:b0:332:ffbc:dcbb with SMTP id f18-20020a056000037200b00332ffbcdcbbmr9534618wrf.23.1701354646948; Thu, 30 Nov 2023 06:30:46 -0800 (PST) X-Google-Smtp-Source: AGHT+IHlHU5t2NgqbFuZB5qtH4JZyvExEoO1sKQ0mFlPLyeedtAmOt4V+wxqIE07ulsJvXVbpNiV2Q== X-Received: by 2002:a05:6000:372:b0:332:ffbc:dcbb with SMTP id f18-20020a056000037200b00332ffbcdcbbmr9534594wrf.23.1701354646548; Thu, 30 Nov 2023 06:30:46 -0800 (PST) Received: from [192.168.0.12] ([78.18.22.228]) by smtp.gmail.com with ESMTPSA id a6-20020a5d4566000000b0032196c508e3sm1687002wrc.53.2023.11.30.06.30.46 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 30 Nov 2023 06:30:46 -0800 (PST) Message-ID: <9928ea49-e2df-41b0-8b4a-b965f9ead38b@redhat.com> Date: Thu, 30 Nov 2023 14:30:45 +0000 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [v1] net/af_xdp: enable a sock path alongside use_cni To: "Koikkara Reeny, Shibin" , "ferruh.yigit@amd.com" , "stephen@networkplumber.org" , "lihuisong@huawei.com" , "fengchengwen@huawei.com" , "liuyonglong@huawei.com" Cc: "dev@dpdk.org" References: <20231130091332.2315572-1-mtahhan@redhat.com> From: Maryam Tahhan In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hi Shibin I will add a more detailed note re CAP_BPF in the documentation on the V2. BR Maryam On 30/11/2023 13:56, Koikkara Reeny, Shibin wrote: > Hi Maryam, > > I have one more question. > > Regards, > Shibin > >> -----Original Message----- >> From: Koikkara Reeny, Shibin >> Sent: Thursday, November 30, 2023 12:14 PM >> To: Tahhan, Maryam ; ferruh.yigit@amd.com; >> stephen@networkplumber.org; lihuisong@huawei.com; >> fengchengwen@huawei.com; liuyonglong@huawei.com >> Cc: dev@dpdk.org; Tahhan, Maryam >> Subject: RE: [v1] net/af_xdp: enable a sock path alongside use_cni >> >> Hi Maryam, >> >> I have added some suggestion below. >> >> Regrads, >> Shibin >> >>> -----Original Message----- >>> From: Maryam Tahhan >>> Sent: Thursday, November 30, 2023 9:14 AM >>> To: ferruh.yigit@amd.com; stephen@networkplumber.org; >>> lihuisong@huawei.com; fengchengwen@huawei.com; >> liuyonglong@huawei.com >>> Cc: dev@dpdk.org; Tahhan, Maryam >>> Subject: [v1] net/af_xdp: enable a sock path alongside use_cni >>> >>> With the original 'use_cni' implementation, (using a hardcoded socket >>> rather than a configurable one), if a single pod is requesting >>> multiple net devices and these devices are from different pools, then >>> the container attempts to mount all the netdev UDSes in the pod as >>> /tmp/afxdp.sock. Which means that at best only 1 netdev will handshake >>> correctly with the AF_XDP DP. This patch addresses this by making the >>> socket parameter configurable alongside the 'use_cni' param. >>> Tested with the AF_XDP DP CNI PR 81. >>> >>> Signed-off-by: Maryam Tahhan >>> --- >>> doc/guides/howto/af_xdp_cni.rst | 18 +++++++--- >>> drivers/net/af_xdp/rte_eth_af_xdp.c | 56 >>> +++++++++++++++++++---------- >>> 2 files changed, 51 insertions(+), 23 deletions(-) >>> >>> diff --git a/doc/guides/howto/af_xdp_cni.rst >>> b/doc/guides/howto/af_xdp_cni.rst index a1a6d5b99c..a2d90c665d 100644 >>> --- a/doc/guides/howto/af_xdp_cni.rst >>> +++ b/doc/guides/howto/af_xdp_cni.rst >>> @@ -38,9 +38,10 @@ The XSKMAP is a BPF map of AF_XDP sockets (XSK). >>> The client can then proceed with creating an AF_XDP socket and >>> inserting that socket into the XSKMAP pointed to by the descriptor. >>> >>> -The EAL vdev argument ``use_cni`` is used to indicate that the user >>> wishes >>> +The EAL vdev arguments ``use_cni`` and ``sock`` are used to indicate >>> +that the user wishes >>> to run the PMD in unprivileged mode and to receive the XSKMAP file >>> descriptor from the CNI. >>> + >>> When this flag is set, >>> the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag should be >>> used when creating the socket @@ -49,7 +50,7 @@ Instead the loading is >>> handled by the CNI. >>> >>> .. note:: >>> >>> - The Unix Domain Socket file path appear in the end user is >>> "/tmp/afxdp.sock". >>> + The Unix Domain Socket file path appears to the end user at >>> "/tmp/afxdp_dp//afxdp.sock". >>> >>> >>> Prerequisites >>> @@ -224,7 +225,6 @@ Howto run dpdk-testpmd with CNI plugin: >>> capabilities: >>> add: >>> - CAP_NET_RAW >>> - - CAP_BPF > Why the CAP_BPF is removed? > >>> resources: >>> requests: >>> hugepages-2Mi: 2Gi >>> @@ -245,7 +245,17 @@ Howto run dpdk-testpmd with CNI plugin: >>> >>> kubectl exec -i --container -- \ >>> //dpdk-testpmd -l 0,1 --no-pci \ >>> - --vdev=net_af_xdp0,use_cni=1,iface= \ >>> + --vdev=net_af_xdp0,use_cni=1,iface=>> name>,sock=/tmp/afxdp_dp//afxdp.sock \ >>> + -- --no-mlockall --in-memory >>> + >>> +for multiple devices use: >>> + >>> + .. code-block:: console >>> + >>> + kubectl exec -i --container -- \ >>> + //dpdk-testpmd -l 0-2 --no-pci \ >>> + --vdev=net_af_xdp0,use_cni=1,iface=>> name>,sock=/tmp/afxdp_dp//afxdp.sock \ >>> + --vdev=net_af_xdp1,use_cni=1,iface=>> + name>,sock=/tmp/afxdp_dp//afxdp.sock \ >>> -- --no-mlockall --in-memory >>> >>> For further reference please use the `e2e`_ test case in `AF_XDP >>> Plugin for Kubernetes`_ diff --git >>> a/drivers/net/af_xdp/rte_eth_af_xdp.c >>> b/drivers/net/af_xdp/rte_eth_af_xdp.c >>> index 353c8688ec..f728dae2f9 100644 >>> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c >>> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c >>> @@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, >>> NOTICE); >>> #define UDS_MAX_CMD_LEN 64 >>> #define UDS_MAX_CMD_RESP 128 >>> #define UDS_XSK_MAP_FD_MSG "/xsk_map_fd" >>> -#define UDS_SOCK "/tmp/afxdp.sock" >>> #define UDS_CONNECT_MSG "/connect" >>> #define UDS_HOST_OK_MSG "/host_ok" >>> #define UDS_HOST_NAK_MSG "/host_nak" >>> @@ -171,6 +170,7 @@ struct pmd_internals { >>> bool custom_prog_configured; >>> bool force_copy; >>> bool use_cni; >>> + char sock_path[PATH_MAX]; >> I would recommend using variable name as "uds_path". >> >>> struct bpf_map *map; >>> >>> struct rte_ether_addr eth_addr; >>> @@ -191,6 +191,7 @@ struct pmd_process_private { >>> #define ETH_AF_XDP_BUDGET_ARG >> "busy_budget" >>> #define ETH_AF_XDP_FORCE_COPY_ARG "force_copy" >>> #define ETH_AF_XDP_USE_CNI_ARG "use_cni" >>> +#define ETH_AF_XDP_SOCK_ARG "sock" >> To make it clear would recommend using "sock_path" and also >> ETH_AF_XDP_CNI_UDS_PATH_ARG or ETH_AF_XDP_SOCK_PATH_ARG. >> >>> static const char * const valid_arguments[] = { >>> ETH_AF_XDP_IFACE_ARG, >>> @@ -201,6 +202,7 @@ static const char * const valid_arguments[] = { >>> ETH_AF_XDP_BUDGET_ARG, >>> ETH_AF_XDP_FORCE_COPY_ARG, >>> ETH_AF_XDP_USE_CNI_ARG, >>> + ETH_AF_XDP_SOCK_ARG, >>> NULL >>> }; >>> >>> @@ -1351,7 +1353,7 @@ configure_preferred_busy_poll(struct >>> pkt_rx_queue *rxq) } >>> >>> static int >>> -init_uds_sock(struct sockaddr_un *server) >>> +init_uds_sock(struct sockaddr_un *server, const char *sock_path) >>> { >>> int sock; >>> >>> @@ -1362,7 +1364,7 @@ init_uds_sock(struct sockaddr_un *server) >>> } >>> >>> server->sun_family = AF_UNIX; >>> - strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path)); >>> + strlcpy(server->sun_path, sock_path, sizeof(server->sun_path)); >>> >>> if (connect(sock, (struct sockaddr *)server, sizeof(struct >>> sockaddr_un)) < 0) { >>> close(sock); >>> @@ -1382,7 +1384,7 @@ struct msg_internal { }; >>> >>> static int >>> -send_msg(int sock, char *request, int *fd) >>> +send_msg(int sock, char *request, int *fd, const char *sock_path) >>> { >>> int snd; >>> struct iovec iov; >>> @@ -1393,7 +1395,7 @@ send_msg(int sock, char *request, int *fd) >>> >>> memset(&dst, 0, sizeof(dst)); >>> dst.sun_family = AF_UNIX; >>> - strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path)); >>> + strlcpy(dst.sun_path, sock_path, sizeof(dst.sun_path)); >>> >>> /* Initialize message header structure */ >>> memset(&msgh, 0, sizeof(msgh)); >>> @@ -1471,7 +1473,7 @@ read_msg(int sock, char *response, struct >>> sockaddr_un *s, int *fd) >>> >>> static int >>> make_request_cni(int sock, struct sockaddr_un *server, char *request, >>> - int *req_fd, char *response, int *out_fd) >>> + int *req_fd, char *response, int *out_fd, const char >>> *sock_path) >>> { >>> int rval; >>> >>> @@ -1483,7 +1485,7 @@ make_request_cni(int sock, struct sockaddr_un >>> *server, char *request, >>> if (req_fd == NULL) >>> rval = write(sock, request, strlen(request)); >>> else >>> - rval = send_msg(sock, request, req_fd); >>> + rval = send_msg(sock, request, req_fd, sock_path); >>> >>> if (rval < 0) { >>> AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno)); @@ >>> -1507,7 +1509,7 @@ check_response(char *response, char *exp_resp, >> long >>> size) } >>> >>> static int >>> -get_cni_fd(char *if_name) >>> +get_cni_fd(char *if_name, const char *sock_path) >>> { >>> char request[UDS_MAX_CMD_LEN], >>> response[UDS_MAX_CMD_RESP]; >>> char hostname[MAX_LONG_OPT_SZ], >>> exp_resp[UDS_MAX_CMD_RESP]; @@ -1520,14 +1522,14 @@ >> get_cni_fd(char >>> *if_name) >>> return -1; >>> >>> memset(&server, 0, sizeof(server)); >>> - sock = init_uds_sock(&server); >>> + sock = init_uds_sock(&server, sock_path); >>> if (sock < 0) >>> return -1; >>> >>> /* Initiates handshake to CNI send: /connect,hostname */ >>> snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG, >>> hostname); >>> memset(response, 0, sizeof(response)); >>> - if (make_request_cni(sock, &server, request, NULL, response, >>> &out_fd) < 0) { >>> + if (make_request_cni(sock, &server, request, NULL, response, >>> &out_fd, >>> +sock_path) < 0) { >> Why do we need to pass "sock_path" here as we have already connected >> the sock with sock_path in init_uds_sock()? >> >>> AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", >> request); >>> goto err_close; >>> } >>> @@ -1541,7 +1543,7 @@ get_cni_fd(char *if_name) >>> /* Request for "/version" */ >>> strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN); >>> memset(response, 0, sizeof(response)); >>> - if (make_request_cni(sock, &server, request, NULL, response, >>> &out_fd) < 0) { >>> + if (make_request_cni(sock, &server, request, NULL, response, >>> &out_fd, >>> +sock_path) < 0) { >> Same question as above. >> >>> AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", >> request); >>> goto err_close; >>> } >>> @@ -1549,7 +1551,7 @@ get_cni_fd(char *if_name) >>> /* Request for file descriptor for netdev name*/ >>> snprintf(request, sizeof(request), "%s,%s", >> UDS_XSK_MAP_FD_MSG, >>> if_name); >>> memset(response, 0, sizeof(response)); >>> - if (make_request_cni(sock, &server, request, NULL, response, >>> &out_fd) < 0) { >>> + if (make_request_cni(sock, &server, request, NULL, response, >>> &out_fd, >>> +sock_path) < 0) { >> Same question as above. >> >>> AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", >> request); >>> goto err_close; >>> } >>> @@ -1571,7 +1573,7 @@ get_cni_fd(char *if_name) >>> /* Initiate close connection */ >>> strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN); >>> memset(response, 0, sizeof(response)); >>> - if (make_request_cni(sock, &server, request, NULL, response, >>> &out_fd) < 0) { >>> + if (make_request_cni(sock, &server, request, NULL, response, >>> &out_fd, >>> +sock_path) < 0) { >> Same question as above. >> >>> AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", >> request); >>> goto err_close; >>> } >>> @@ -1698,7 +1700,7 @@ xsk_configure(struct pmd_internals *internals, >>> struct pkt_rx_queue *rxq, >>> int err, fd, map_fd; >>> >>> /* get socket fd from CNI plugin */ >>> - map_fd = get_cni_fd(internals->if_name); >>> + map_fd = get_cni_fd(internals->if_name, internals- >>>> sock_path); >>> if (map_fd < 0) { >>> AF_XDP_LOG(ERR, "Failed to receive CNI plugin >> fd\n"); >>> goto out_xsk; >>> @@ -2023,7 +2025,8 @@ xdp_get_channels_info(const char *if_name, int >>> *max_queues, static int parse_parameters(struct rte_kvargs *kvlist, >>> char *if_name, int *start_queue, >>> int *queue_cnt, int *shared_umem, char *prog_path, >>> - int *busy_budget, int *force_copy, int *use_cni) >>> + int *busy_budget, int *force_copy, int *use_cni, >>> + char *sock_path) >>> { >>> int ret; >>> >>> @@ -2069,6 +2072,11 @@ parse_parameters(struct rte_kvargs *kvlist, >>> char *if_name, int *start_queue, >>> if (ret < 0) >>> goto free_kvlist; >>> >>> + ret = rte_kvargs_process(kvlist, ETH_AF_XDP_SOCK_ARG, >>> + &parse_prog_arg, sock_path); >> Parse_prog_arg does 2 things copy the sock_arg value and also check the >> access to the socket. >> Checking access here has a chance of causing raise condition so I would >> recommend to skip this check here as this will be taken care in the >> init_uds_sock(). >> >>> + if (ret < 0) >>> + goto free_kvlist; >>> + >>> free_kvlist: >>> rte_kvargs_free(kvlist); >>> return ret; >>> @@ -2108,7 +2116,7 @@ static struct rte_eth_dev * >>> init_internals(struct rte_vdev_device *dev, const char *if_name, >>> int start_queue_idx, int queue_cnt, int shared_umem, >>> const char *prog_path, int busy_budget, int force_copy, >>> - int use_cni) >>> + int use_cni, const char *sock_path) >>> { >>> const char *name = rte_vdev_device_name(dev); >>> const unsigned int numa_node = dev->device.numa_node; @@ - >>> 2138,6 +2146,7 @@ init_internals(struct rte_vdev_device *dev, const >>> char *if_name, >>> internals->shared_umem = shared_umem; >>> internals->force_copy = force_copy; >>> internals->use_cni = use_cni; >>> + strlcpy(internals->sock_path, sock_path, PATH_MAX); >>> >>> if (xdp_get_channels_info(if_name, &internals->max_queue_cnt, >>> &internals->combined_queue_cnt)) { @@ - >>> 2328,6 +2337,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) >>> int busy_budget = -1, ret; >>> int force_copy = 0; >>> int use_cni = 0; >>> + char sock_path[PATH_MAX] = {'\0'}; >>> struct rte_eth_dev *eth_dev = NULL; >>> const char *name = rte_vdev_device_name(dev); >>> >>> @@ -2370,7 +2380,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device >>> *dev) >>> >>> if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx, >>> &xsk_queue_cnt, &shared_umem, prog_path, >>> - &busy_budget, &force_copy, &use_cni) < 0) { >>> + &busy_budget, &force_copy, &use_cni, >>> sock_path) < 0) { >>> AF_XDP_LOG(ERR, "Invalid kvargs value\n"); >>> return -EINVAL; >>> } >>> @@ -2387,6 +2397,13 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device >>> *dev) >>> return -EINVAL; >>> } >>> >>> + if (use_cni && !strnlen(sock_path, PATH_MAX)) { >>> + AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' must >>> also be provided\n", >>> + ETH_AF_XDP_USE_CNI_ARG, >>> ETH_AF_XDP_SOCK_ARG); >>> + return -EINVAL; >>> + } >>> + >>> + >>> if (strlen(if_name) == 0) { >>> AF_XDP_LOG(ERR, "Network interface must be >> specified\n"); >>> return -EINVAL; >>> @@ -2410,7 +2427,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device >>> *dev) >>> >>> eth_dev = init_internals(dev, if_name, xsk_start_queue_idx, >>> xsk_queue_cnt, shared_umem, prog_path, >>> - busy_budget, force_copy, use_cni); >>> + busy_budget, force_copy, use_cni, >>> sock_path); >>> if (eth_dev == NULL) { >>> AF_XDP_LOG(ERR, "Failed to init internals\n"); >>> return -1; >>> @@ -2471,4 +2488,5 @@ >>> RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp, >>> "xdp_prog= " >>> "busy_budget= " >>> "force_copy= " >>> - "use_cni= "); >>> + "use_cni= " >>> + "sock= "); >>> -- >>> 2.41.0