From: "Koikkara Reeny, Shibin" <shibin.koikkara.reeny@intel.com>
To: "Koikkara Reeny, Shibin" <shibin.koikkara.reeny@intel.com>,
"Tahhan, Maryam" <mtahhan@redhat.com>,
"ferruh.yigit@amd.com" <ferruh.yigit@amd.com>,
"stephen@networkplumber.org" <stephen@networkplumber.org>,
"lihuisong@huawei.com" <lihuisong@huawei.com>,
"fengchengwen@huawei.com" <fengchengwen@huawei.com>,
"liuyonglong@huawei.com" <liuyonglong@huawei.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Tahhan, Maryam" <mtahhan@redhat.com>
Subject: RE: [v1] net/af_xdp: enable a sock path alongside use_cni
Date: Thu, 30 Nov 2023 13:56:46 +0000
Message-ID: <DM6PR11MB3995A6E5195D52EB3559E4B9A282A@DM6PR11MB3995.namprd11.prod.outlook.com>
In-Reply-To: <DM6PR11MB3995F62FADF261A94832A1DBA282A@DM6PR11MB3995.namprd11.prod.outlook.com>
Hi Maryam,
I have one more question.
Regards,
Shibin
> -----Original Message-----
> From: Koikkara Reeny, Shibin <shibin.koikkara.reeny@intel.com>
> Sent: Thursday, November 30, 2023 12:14 PM
> To: Tahhan, Maryam <mtahhan@redhat.com>; ferruh.yigit@amd.com;
> stephen@networkplumber.org; lihuisong@huawei.com;
> fengchengwen@huawei.com; liuyonglong@huawei.com
> Cc: dev@dpdk.org; Tahhan, Maryam <mtahhan@redhat.com>
> Subject: RE: [v1] net/af_xdp: enable a sock path alongside use_cni
>
> Hi Maryam,
>
> I have added some suggestions below.
>
> Regards,
> Shibin
>
> > -----Original Message-----
> > From: Maryam Tahhan <mtahhan@redhat.com>
> > Sent: Thursday, November 30, 2023 9:14 AM
> > To: ferruh.yigit@amd.com; stephen@networkplumber.org;
> > lihuisong@huawei.com; fengchengwen@huawei.com;
> liuyonglong@huawei.com
> > Cc: dev@dpdk.org; Tahhan, Maryam <mtahhan@redhat.com>
> > Subject: [v1] net/af_xdp: enable a sock path alongside use_cni
> >
> > With the original 'use_cni' implementation, (using a hardcoded socket
> > rather than a configurable one), if a single pod is requesting
> > multiple net devices and these devices are from different pools, then
> > the container attempts to mount all the netdev UDSes in the pod as
> > /tmp/afxdp.sock. Which means that at best only 1 netdev will handshake
> > correctly with the AF_XDP DP. This patch addresses this by making the
> > socket parameter configurable alongside the 'use_cni' param.
> > Tested with the AF_XDP DP CNI PR 81.
> >
> > Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
> > ---
> > doc/guides/howto/af_xdp_cni.rst | 18 +++++++---
> > drivers/net/af_xdp/rte_eth_af_xdp.c | 56
> > +++++++++++++++++++----------
> > 2 files changed, 51 insertions(+), 23 deletions(-)
> >
> > diff --git a/doc/guides/howto/af_xdp_cni.rst
> > b/doc/guides/howto/af_xdp_cni.rst index a1a6d5b99c..a2d90c665d 100644
> > --- a/doc/guides/howto/af_xdp_cni.rst
> > +++ b/doc/guides/howto/af_xdp_cni.rst
> > @@ -38,9 +38,10 @@ The XSKMAP is a BPF map of AF_XDP sockets (XSK).
> > The client can then proceed with creating an AF_XDP socket and
> > inserting that socket into the XSKMAP pointed to by the descriptor.
> >
> > -The EAL vdev argument ``use_cni`` is used to indicate that the user
> > wishes
> > +The EAL vdev arguments ``use_cni`` and ``sock`` are used to indicate
> > +that the user wishes
> > to run the PMD in unprivileged mode and to receive the XSKMAP file
> > descriptor from the CNI.
> > +
> > When this flag is set,
> > the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag should be
> > used when creating the socket @@ -49,7 +50,7 @@ Instead the loading is
> > handled by the CNI.
> >
> > .. note::
> >
> > - The Unix Domain Socket file path appear in the end user is
> > "/tmp/afxdp.sock".
> > + The Unix Domain Socket file path appears to the end user at
> > "/tmp/afxdp_dp/<netdev>/afxdp.sock".
> >
> >
> > Prerequisites
> > @@ -224,7 +225,6 @@ Howto run dpdk-testpmd with CNI plugin:
> > capabilities:
> > add:
> > - CAP_NET_RAW
> > - - CAP_BPF
Why is CAP_BPF removed?
> > resources:
> > requests:
> > hugepages-2Mi: 2Gi
> > @@ -245,7 +245,17 @@ Howto run dpdk-testpmd with CNI plugin:
> >
> > kubectl exec -i <Pod name> --container <containers name> -- \
> > /<Path>/dpdk-testpmd -l 0,1 --no-pci \
> > - --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
> > + --vdev=net_af_xdp0,use_cni=1,iface=<interface
> > name>,sock=/tmp/afxdp_dp/<interface name>/afxdp.sock \
> > + -- --no-mlockall --in-memory
> > +
> > +for multiple devices use:
> > +
> > + .. code-block:: console
> > +
> > + kubectl exec -i <Pod name> --container <containers name> -- \
> > + /<Path>/dpdk-testpmd -l 0-2 --no-pci \
> > + --vdev=net_af_xdp0,use_cni=1,iface=<interface
> > name>,sock=/tmp/afxdp_dp/<interface name>/afxdp.sock \
> > + --vdev=net_af_xdp1,use_cni=1,iface=<interface
> > + name>,sock=/tmp/afxdp_dp/<interface name>/afxdp.sock \
> > -- --no-mlockall --in-memory
> >
> > For further reference please use the `e2e`_ test case in `AF_XDP
> > Plugin for Kubernetes`_ diff --git
> > a/drivers/net/af_xdp/rte_eth_af_xdp.c
> > b/drivers/net/af_xdp/rte_eth_af_xdp.c
> > index 353c8688ec..f728dae2f9 100644
> > --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> > +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> > @@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype,
> > NOTICE);
> > #define UDS_MAX_CMD_LEN 64
> > #define UDS_MAX_CMD_RESP 128
> > #define UDS_XSK_MAP_FD_MSG "/xsk_map_fd"
> > -#define UDS_SOCK "/tmp/afxdp.sock"
> > #define UDS_CONNECT_MSG "/connect"
> > #define UDS_HOST_OK_MSG "/host_ok"
> > #define UDS_HOST_NAK_MSG "/host_nak"
> > @@ -171,6 +170,7 @@ struct pmd_internals {
> > bool custom_prog_configured;
> > bool force_copy;
> > bool use_cni;
> > + char sock_path[PATH_MAX];
> I would recommend naming this variable "uds_path".
>
> > struct bpf_map *map;
> >
> > struct rte_ether_addr eth_addr;
> > @@ -191,6 +191,7 @@ struct pmd_process_private {
> > #define ETH_AF_XDP_BUDGET_ARG
> "busy_budget"
> > #define ETH_AF_XDP_FORCE_COPY_ARG "force_copy"
> > #define ETH_AF_XDP_USE_CNI_ARG "use_cni"
> > +#define ETH_AF_XDP_SOCK_ARG "sock"
> To make it clearer, I would recommend using "sock_path" for the argument
> string, and ETH_AF_XDP_CNI_UDS_PATH_ARG or ETH_AF_XDP_SOCK_PATH_ARG for the
> macro name.
>
> >
> > static const char * const valid_arguments[] = {
> > ETH_AF_XDP_IFACE_ARG,
> > @@ -201,6 +202,7 @@ static const char * const valid_arguments[] = {
> > ETH_AF_XDP_BUDGET_ARG,
> > ETH_AF_XDP_FORCE_COPY_ARG,
> > ETH_AF_XDP_USE_CNI_ARG,
> > + ETH_AF_XDP_SOCK_ARG,
> > NULL
> > };
> >
> > @@ -1351,7 +1353,7 @@ configure_preferred_busy_poll(struct
> > pkt_rx_queue *rxq) }
> >
> > static int
> > -init_uds_sock(struct sockaddr_un *server)
> > +init_uds_sock(struct sockaddr_un *server, const char *sock_path)
> > {
> > int sock;
> >
> > @@ -1362,7 +1364,7 @@ init_uds_sock(struct sockaddr_un *server)
> > }
> >
> > server->sun_family = AF_UNIX;
> > - strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));
> > + strlcpy(server->sun_path, sock_path, sizeof(server->sun_path));
> >
> > if (connect(sock, (struct sockaddr *)server, sizeof(struct
> > sockaddr_un)) < 0) {
> > close(sock);
> > @@ -1382,7 +1384,7 @@ struct msg_internal { };
> >
> > static int
> > -send_msg(int sock, char *request, int *fd)
> > +send_msg(int sock, char *request, int *fd, const char *sock_path)
> > {
> > int snd;
> > struct iovec iov;
> > @@ -1393,7 +1395,7 @@ send_msg(int sock, char *request, int *fd)
> >
> > memset(&dst, 0, sizeof(dst));
> > dst.sun_family = AF_UNIX;
> > - strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));
> > + strlcpy(dst.sun_path, sock_path, sizeof(dst.sun_path));
> >
> > /* Initialize message header structure */
> > memset(&msgh, 0, sizeof(msgh));
> > @@ -1471,7 +1473,7 @@ read_msg(int sock, char *response, struct
> > sockaddr_un *s, int *fd)
> >
> > static int
> > make_request_cni(int sock, struct sockaddr_un *server, char *request,
> > - int *req_fd, char *response, int *out_fd)
> > + int *req_fd, char *response, int *out_fd, const char
> > *sock_path)
> > {
> > int rval;
> >
> > @@ -1483,7 +1485,7 @@ make_request_cni(int sock, struct sockaddr_un
> > *server, char *request,
> > if (req_fd == NULL)
> > rval = write(sock, request, strlen(request));
> > else
> > - rval = send_msg(sock, request, req_fd);
> > + rval = send_msg(sock, request, req_fd, sock_path);
> >
> > if (rval < 0) {
> > AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno)); @@
> > -1507,7 +1509,7 @@ check_response(char *response, char *exp_resp,
> long
> > size) }
> >
> > static int
> > -get_cni_fd(char *if_name)
> > +get_cni_fd(char *if_name, const char *sock_path)
> > {
> > char request[UDS_MAX_CMD_LEN],
> > response[UDS_MAX_CMD_RESP];
> > char hostname[MAX_LONG_OPT_SZ],
> > exp_resp[UDS_MAX_CMD_RESP]; @@ -1520,14 +1522,14 @@
> get_cni_fd(char
> > *if_name)
> > return -1;
> >
> > memset(&server, 0, sizeof(server));
> > - sock = init_uds_sock(&server);
> > + sock = init_uds_sock(&server, sock_path);
> > if (sock < 0)
> > return -1;
> >
> > /* Initiates handshake to CNI send: /connect,hostname */
> > snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG,
> > hostname);
> > memset(response, 0, sizeof(response));
> > - if (make_request_cni(sock, &server, request, NULL, response,
> > &out_fd) < 0) {
> > + if (make_request_cni(sock, &server, request, NULL, response,
> > &out_fd,
> > +sock_path) < 0) {
> Why do we need to pass "sock_path" here as we have already connected
> the sock with sock_path in init_uds_sock()?
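To illustrate the question: on a connection-mode UNIX socket the peer is fixed by connect(), so sendmsg() should not need a destination address in msg_name. The sketch below is a standalone illustration only, not the PMD code; send_msg_connected() and recv_msg_fd() are hypothetical helper names, and the SOCK_SEQPACKET socket type is an assumption.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send `request` plus one fd over a connected socket. */
static int
send_msg_connected(int sock, const char *request, int fd)
{
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} control;
	struct msghdr msgh;
	struct cmsghdr *cmsg;
	struct iovec iov;

	memset(&msgh, 0, sizeof(msgh));
	memset(&control, 0, sizeof(control));

	/* msg_name is left NULL: the destination was set by connect(). */
	iov.iov_base = (char *)request;
	iov.iov_len = strlen(request);
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;
	msgh.msg_control = control.buf;
	msgh.msg_controllen = sizeof(control.buf);

	/* Attach the fd as SCM_RIGHTS ancillary data. */
	cmsg = CMSG_FIRSTHDR(&msgh);
	assert(cmsg != NULL);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msgh, 0) < 0 ? -1 : 0;
}

/* Hypothetical counterpart: read the text and, if present, the passed fd. */
static int
recv_msg_fd(int sock, char *response, size_t len, int *fd)
{
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} control;
	struct msghdr msgh;
	struct cmsghdr *cmsg;
	struct iovec iov;
	ssize_t n;

	assert(len > 0);
	memset(&msgh, 0, sizeof(msgh));
	memset(&control, 0, sizeof(control));

	iov.iov_base = response;
	iov.iov_len = len - 1; /* keep room for the terminating NUL */
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;
	msgh.msg_control = control.buf;
	msgh.msg_controllen = sizeof(control.buf);

	n = recvmsg(sock, &msgh, 0);
	if (n < 0)
		return -1;
	response[n] = '\0';

	*fd = -1;
	cmsg = CMSG_FIRSTHDR(&msgh);
	if (cmsg != NULL && cmsg->cmsg_level == SOL_SOCKET &&
			cmsg->cmsg_type == SCM_RIGHTS)
		memcpy(fd, CMSG_DATA(cmsg), sizeof(int));

	return (int)n;
}
```

If this holds for the socket type the PMD uses, send_msg() would not need the extra sock_path parameter, and the make_request_cni() signature could stay unchanged.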
>
> > AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
> request);
> > goto err_close;
> > }
> > @@ -1541,7 +1543,7 @@ get_cni_fd(char *if_name)
> > /* Request for "/version" */
> > strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN);
> > memset(response, 0, sizeof(response));
> > - if (make_request_cni(sock, &server, request, NULL, response,
> > &out_fd) < 0) {
> > + if (make_request_cni(sock, &server, request, NULL, response,
> > &out_fd,
> > +sock_path) < 0) {
> Same question as above.
>
> > AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
> request);
> > goto err_close;
> > }
> > @@ -1549,7 +1551,7 @@ get_cni_fd(char *if_name)
> > /* Request for file descriptor for netdev name*/
> > snprintf(request, sizeof(request), "%s,%s",
> UDS_XSK_MAP_FD_MSG,
> > if_name);
> > memset(response, 0, sizeof(response));
> > - if (make_request_cni(sock, &server, request, NULL, response,
> > &out_fd) < 0) {
> > + if (make_request_cni(sock, &server, request, NULL, response,
> > &out_fd,
> > +sock_path) < 0) {
>
> Same question as above.
>
> > AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
> request);
> > goto err_close;
> > }
> > @@ -1571,7 +1573,7 @@ get_cni_fd(char *if_name)
> > /* Initiate close connection */
> > strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN);
> > memset(response, 0, sizeof(response));
> > - if (make_request_cni(sock, &server, request, NULL, response,
> > &out_fd) < 0) {
> > + if (make_request_cni(sock, &server, request, NULL, response,
> > &out_fd,
> > +sock_path) < 0) {
>
> Same question as above.
>
> > AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
> request);
> > goto err_close;
> > }
> > @@ -1698,7 +1700,7 @@ xsk_configure(struct pmd_internals *internals,
> > struct pkt_rx_queue *rxq,
> > int err, fd, map_fd;
> >
> > /* get socket fd from CNI plugin */
> > - map_fd = get_cni_fd(internals->if_name);
> > + map_fd = get_cni_fd(internals->if_name, internals-
> > >sock_path);
> > if (map_fd < 0) {
> > AF_XDP_LOG(ERR, "Failed to receive CNI plugin
> fd\n");
> > goto out_xsk;
> > @@ -2023,7 +2025,8 @@ xdp_get_channels_info(const char *if_name, int
> > *max_queues, static int parse_parameters(struct rte_kvargs *kvlist,
> > char *if_name, int *start_queue,
> > int *queue_cnt, int *shared_umem, char *prog_path,
> > - int *busy_budget, int *force_copy, int *use_cni)
> > + int *busy_budget, int *force_copy, int *use_cni,
> > + char *sock_path)
> > {
> > int ret;
> >
> > @@ -2069,6 +2072,11 @@ parse_parameters(struct rte_kvargs *kvlist,
> > char *if_name, int *start_queue,
> > if (ret < 0)
> > goto free_kvlist;
> >
> > + ret = rte_kvargs_process(kvlist, ETH_AF_XDP_SOCK_ARG,
> > + &parse_prog_arg, sock_path);
>
> parse_prog_arg() does two things: it copies the sock argument value and it
> also checks access to the socket.
> Checking access here can cause a race condition, so I would recommend
> skipping the check here, as it will be taken care of in
> init_uds_sock().
>
> > + if (ret < 0)
> > + goto free_kvlist;
> > +
> > free_kvlist:
> > rte_kvargs_free(kvlist);
> > return ret;
> > @@ -2108,7 +2116,7 @@ static struct rte_eth_dev *
> > init_internals(struct rte_vdev_device *dev, const char *if_name,
> > int start_queue_idx, int queue_cnt, int shared_umem,
> > const char *prog_path, int busy_budget, int force_copy,
> > - int use_cni)
> > + int use_cni, const char *sock_path)
> > {
> > const char *name = rte_vdev_device_name(dev);
> > const unsigned int numa_node = dev->device.numa_node; @@ -
> > 2138,6 +2146,7 @@ init_internals(struct rte_vdev_device *dev, const
> > char *if_name,
> > internals->shared_umem = shared_umem;
> > internals->force_copy = force_copy;
> > internals->use_cni = use_cni;
> > + strlcpy(internals->sock_path, sock_path, PATH_MAX);
> >
> > if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
> > &internals->combined_queue_cnt)) { @@ -
> > 2328,6 +2337,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
> > int busy_budget = -1, ret;
> > int force_copy = 0;
> > int use_cni = 0;
> > + char sock_path[PATH_MAX] = {'\0'};
> > struct rte_eth_dev *eth_dev = NULL;
> > const char *name = rte_vdev_device_name(dev);
> >
> > @@ -2370,7 +2380,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
> > *dev)
> >
> > if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
> > &xsk_queue_cnt, &shared_umem, prog_path,
> > - &busy_budget, &force_copy, &use_cni) < 0) {
> > + &busy_budget, &force_copy, &use_cni,
> > sock_path) < 0) {
> > AF_XDP_LOG(ERR, "Invalid kvargs value\n");
> > return -EINVAL;
> > }
> > @@ -2387,6 +2397,13 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
> > *dev)
> > return -EINVAL;
> > }
> >
> > + if (use_cni && !strnlen(sock_path, PATH_MAX)) {
> > + AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' must
> > also be provided\n",
> > + ETH_AF_XDP_USE_CNI_ARG,
> > ETH_AF_XDP_SOCK_ARG);
> > + return -EINVAL;
> > + }
> > +
> > +
> > if (strlen(if_name) == 0) {
> > AF_XDP_LOG(ERR, "Network interface must be
> specified\n");
> > return -EINVAL;
> > @@ -2410,7 +2427,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
> > *dev)
> >
> > eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
> > xsk_queue_cnt, shared_umem, prog_path,
> > - busy_budget, force_copy, use_cni);
> > + busy_budget, force_copy, use_cni,
> > sock_path);
> > if (eth_dev == NULL) {
> > AF_XDP_LOG(ERR, "Failed to init internals\n");
> > return -1;
> > @@ -2471,4 +2488,5 @@
> > RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
> > "xdp_prog=<string> "
> > "busy_budget=<int> "
> > "force_copy=<int> "
> > - "use_cni=<int> ");
> > + "use_cni=<int> "
> > + "sock=<string> ");
> > --
> > 2.41.0