DPDK patches and discussions
 help / color / mirror / Atom feed
From: Maryam Tahhan <mtahhan@redhat.com>
To: "Koikkara Reeny, Shibin" <shibin.koikkara.reeny@intel.com>,
	"ferruh.yigit@amd.com" <ferruh.yigit@amd.com>,
	"stephen@networkplumber.org" <stephen@networkplumber.org>,
	"lihuisong@huawei.com" <lihuisong@huawei.com>,
	"fengchengwen@huawei.com" <fengchengwen@huawei.com>,
	"liuyonglong@huawei.com" <liuyonglong@huawei.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [v1] net/af_xdp: enable a sock path alongside use_cni
Date: Thu, 30 Nov 2023 14:30:45 +0000	[thread overview]
Message-ID: <9928ea49-e2df-41b0-8b4a-b965f9ead38b@redhat.com> (raw)
In-Reply-To: <DM6PR11MB3995A6E5195D52EB3559E4B9A282A@DM6PR11MB3995.namprd11.prod.outlook.com>

Hi Shibin

I will add a more detailed note re CAP_BPF in the documentation on the V2.

BR
Maryam


On 30/11/2023 13:56, Koikkara Reeny, Shibin wrote:
> Hi Maryam,
>
> I have one more question.
>
> Regards,
> Shibin
>
>> -----Original Message-----
>> From: Koikkara Reeny, Shibin <shibin.koikkara.reeny@intel.com>
>> Sent: Thursday, November 30, 2023 12:14 PM
>> To: Tahhan, Maryam <mtahhan@redhat.com>; ferruh.yigit@amd.com;
>> stephen@networkplumber.org; lihuisong@huawei.com;
>> fengchengwen@huawei.com; liuyonglong@huawei.com
>> Cc: dev@dpdk.org; Tahhan, Maryam <mtahhan@redhat.com>
>> Subject: RE: [v1] net/af_xdp: enable a sock path alongside use_cni
>>
>> Hi Maryam,
>>
>> I have added some suggestion below.
>>
>> Regrads,
>> Shibin
>>
>>> -----Original Message-----
>>> From: Maryam Tahhan <mtahhan@redhat.com>
>>> Sent: Thursday, November 30, 2023 9:14 AM
>>> To: ferruh.yigit@amd.com; stephen@networkplumber.org;
>>> lihuisong@huawei.com; fengchengwen@huawei.com;
>> liuyonglong@huawei.com
>>> Cc: dev@dpdk.org; Tahhan, Maryam <mtahhan@redhat.com>
>>> Subject: [v1] net/af_xdp: enable a sock path alongside use_cni
>>>
>>> With the original 'use_cni' implementation, (using a hardcoded socket
>>> rather than a configurable one), if a single pod is requesting
>>> multiple net devices and these devices are from different pools, then
>>> the container attempts to mount all the netdev UDSes in the pod as
>>> /tmp/afxdp.sock. Which means that at best only 1 netdev will handshake
>>> correctly with the AF_XDP DP. This patch addresses this by making the
>>> socket parameter configurable alongside the 'use_cni' param.
>>> Tested with the AF_XDP DP CNI PR 81.
>>>
>>> Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
>>> ---
>>>   doc/guides/howto/af_xdp_cni.rst     | 18 +++++++---
>>>   drivers/net/af_xdp/rte_eth_af_xdp.c | 56
>>> +++++++++++++++++++----------
>>>   2 files changed, 51 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/doc/guides/howto/af_xdp_cni.rst
>>> b/doc/guides/howto/af_xdp_cni.rst index a1a6d5b99c..a2d90c665d 100644
>>> --- a/doc/guides/howto/af_xdp_cni.rst
>>> +++ b/doc/guides/howto/af_xdp_cni.rst
>>> @@ -38,9 +38,10 @@ The XSKMAP is a BPF map of AF_XDP sockets (XSK).
>>>   The client can then proceed with creating an AF_XDP socket  and
>>> inserting that socket into the XSKMAP pointed to by the descriptor.
>>>
>>> -The EAL vdev argument ``use_cni`` is used to indicate that the user
>>> wishes
>>> +The EAL vdev arguments ``use_cni`` and ``sock`` are used to indicate
>>> +that the user wishes
>>>   to run the PMD in unprivileged mode and to receive the XSKMAP file
>>> descriptor  from the CNI.
>>> +
>>>   When this flag is set,
>>>   the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag  should be
>>> used when creating the socket @@ -49,7 +50,7 @@ Instead the loading is
>>> handled by the CNI.
>>>
>>>   .. note::
>>>
>>> -   The Unix Domain Socket file path appear in the end user is
>>> "/tmp/afxdp.sock".
>>> +   The Unix Domain Socket file path appears to the end user at
>>> "/tmp/afxdp_dp/<netdev>/afxdp.sock".
>>>
>>>
>>>   Prerequisites
>>> @@ -224,7 +225,6 @@ Howto run dpdk-testpmd with CNI plugin:
>>>             capabilities:
>>>                add:
>>>                  - CAP_NET_RAW
>>> -               - CAP_BPF
> Why the CAP_BPF is removed?
>
>>>            resources:
>>>              requests:
>>>                hugepages-2Mi: 2Gi
>>> @@ -245,7 +245,17 @@ Howto run dpdk-testpmd with CNI plugin:
>>>
>>>        kubectl exec -i <Pod name> --container <containers name> -- \
>>>              /<Path>/dpdk-testpmd -l 0,1 --no-pci \
>>> -           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
>>> +           --vdev=net_af_xdp0,use_cni=1,iface=<interface
>>> name>,sock=/tmp/afxdp_dp/<interface name>/afxdp.sock \
>>> +           -- --no-mlockall --in-memory
>>> +
>>> +for multiple devices use:
>>> +
>>> +  .. code-block:: console
>>> +
>>> +     kubectl exec -i <Pod name> --container <containers name> -- \
>>> +           /<Path>/dpdk-testpmd -l 0-2 --no-pci \
>>> +           --vdev=net_af_xdp0,use_cni=1,iface=<interface
>>> name>,sock=/tmp/afxdp_dp/<interface name>/afxdp.sock \
>>> +           --vdev=net_af_xdp1,use_cni=1,iface=<interface
>>> + name>,sock=/tmp/afxdp_dp/<interface name>/afxdp.sock \
>>>              -- --no-mlockall --in-memory
>>>
>>>   For further reference please use the `e2e`_ test case in `AF_XDP
>>> Plugin for Kubernetes`_ diff --git
>>> a/drivers/net/af_xdp/rte_eth_af_xdp.c
>>> b/drivers/net/af_xdp/rte_eth_af_xdp.c
>>> index 353c8688ec..f728dae2f9 100644
>>> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
>>> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
>>> @@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype,
>>> NOTICE);
>>>   #define UDS_MAX_CMD_LEN			64
>>>   #define UDS_MAX_CMD_RESP		128
>>>   #define UDS_XSK_MAP_FD_MSG		"/xsk_map_fd"
>>> -#define UDS_SOCK			"/tmp/afxdp.sock"
>>>   #define UDS_CONNECT_MSG			"/connect"
>>>   #define UDS_HOST_OK_MSG			"/host_ok"
>>>   #define UDS_HOST_NAK_MSG		"/host_nak"
>>> @@ -171,6 +170,7 @@ struct pmd_internals {
>>>   	bool custom_prog_configured;
>>>   	bool force_copy;
>>>   	bool use_cni;
>>> +	char sock_path[PATH_MAX];
>> I would recommend using variable name as "uds_path".
>>
>>>   	struct bpf_map *map;
>>>
>>>   	struct rte_ether_addr eth_addr;
>>> @@ -191,6 +191,7 @@ struct pmd_process_private {
>>>   #define ETH_AF_XDP_BUDGET_ARG
>> 	"busy_budget"
>>>   #define ETH_AF_XDP_FORCE_COPY_ARG		"force_copy"
>>>   #define ETH_AF_XDP_USE_CNI_ARG			"use_cni"
>>> +#define ETH_AF_XDP_SOCK_ARG			"sock"
>> To make it clear would  recommend using "sock_path" and also
>> ETH_AF_XDP_CNI_UDS_PATH_ARG or ETH_AF_XDP_SOCK_PATH_ARG.
>>
>>>   static const char * const valid_arguments[] = {
>>>   	ETH_AF_XDP_IFACE_ARG,
>>> @@ -201,6 +202,7 @@ static const char * const valid_arguments[] = {
>>>   	ETH_AF_XDP_BUDGET_ARG,
>>>   	ETH_AF_XDP_FORCE_COPY_ARG,
>>>   	ETH_AF_XDP_USE_CNI_ARG,
>>> +	ETH_AF_XDP_SOCK_ARG,
>>>   	NULL
>>>   };
>>>
>>> @@ -1351,7 +1353,7 @@ configure_preferred_busy_poll(struct
>>> pkt_rx_queue *rxq)  }
>>>
>>>   static int
>>> -init_uds_sock(struct sockaddr_un *server)
>>> +init_uds_sock(struct sockaddr_un *server, const char *sock_path)
>>>   {
>>>   	int sock;
>>>
>>> @@ -1362,7 +1364,7 @@ init_uds_sock(struct sockaddr_un *server)
>>>   	}
>>>
>>>   	server->sun_family = AF_UNIX;
>>> -	strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));
>>> +	strlcpy(server->sun_path, sock_path, sizeof(server->sun_path));
>>>
>>>   	if (connect(sock, (struct sockaddr *)server, sizeof(struct
>>> sockaddr_un)) < 0) {
>>>   		close(sock);
>>> @@ -1382,7 +1384,7 @@ struct msg_internal {  };
>>>
>>>   static int
>>> -send_msg(int sock, char *request, int *fd)
>>> +send_msg(int sock, char *request, int *fd, const char *sock_path)
>>>   {
>>>   	int snd;
>>>   	struct iovec iov;
>>> @@ -1393,7 +1395,7 @@ send_msg(int sock, char *request, int *fd)
>>>
>>>   	memset(&dst, 0, sizeof(dst));
>>>   	dst.sun_family = AF_UNIX;
>>> -	strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));
>>> +	strlcpy(dst.sun_path, sock_path, sizeof(dst.sun_path));
>>>
>>>   	/* Initialize message header structure */
>>>   	memset(&msgh, 0, sizeof(msgh));
>>> @@ -1471,7 +1473,7 @@ read_msg(int sock, char *response, struct
>>> sockaddr_un *s, int *fd)
>>>
>>>   static int
>>>   make_request_cni(int sock, struct sockaddr_un *server, char *request,
>>> -		 int *req_fd, char *response, int *out_fd)
>>> +		 int *req_fd, char *response, int *out_fd, const char
>>> *sock_path)
>>>   {
>>>   	int rval;
>>>
>>> @@ -1483,7 +1485,7 @@ make_request_cni(int sock, struct sockaddr_un
>>> *server, char *request,
>>>   	if (req_fd == NULL)
>>>   		rval = write(sock, request, strlen(request));
>>>   	else
>>> -		rval = send_msg(sock, request, req_fd);
>>> +		rval = send_msg(sock, request, req_fd, sock_path);
>>>
>>>   	if (rval < 0) {
>>>   		AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno)); @@
>>> -1507,7 +1509,7 @@ check_response(char *response, char *exp_resp,
>> long
>>> size)  }
>>>
>>>   static int
>>> -get_cni_fd(char *if_name)
>>> +get_cni_fd(char *if_name, const char *sock_path)
>>>   {
>>>   	char request[UDS_MAX_CMD_LEN],
>>> response[UDS_MAX_CMD_RESP];
>>>   	char hostname[MAX_LONG_OPT_SZ],
>>> exp_resp[UDS_MAX_CMD_RESP]; @@ -1520,14 +1522,14 @@
>> get_cni_fd(char
>>> *if_name)
>>>   		return -1;
>>>
>>>   	memset(&server, 0, sizeof(server));
>>> -	sock = init_uds_sock(&server);
>>> +	sock = init_uds_sock(&server, sock_path);
>>>   	if (sock < 0)
>>>   		return -1;
>>>
>>>   	/* Initiates handshake to CNI send: /connect,hostname */
>>>   	snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG,
>>> hostname);
>>>   	memset(response, 0, sizeof(response));
>>> -	if (make_request_cni(sock, &server, request, NULL, response,
>>> &out_fd) < 0) {
>>> +	if (make_request_cni(sock, &server, request, NULL, response,
>>> &out_fd,
>>> +sock_path) < 0) {
>> Why do we need to pass "sock_path" here as we have already connected
>> the sock with sock_path in init_uds_sock()?
>>
>>>   		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
>> request);
>>>   		goto err_close;
>>>   	}
>>> @@ -1541,7 +1543,7 @@ get_cni_fd(char *if_name)
>>>   	/* Request for "/version" */
>>>   	strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN);
>>>   	memset(response, 0, sizeof(response));
>>> -	if (make_request_cni(sock, &server, request, NULL, response,
>>> &out_fd) < 0) {
>>> +	if (make_request_cni(sock, &server, request, NULL, response,
>>> &out_fd,
>>> +sock_path) < 0) {
>> Same question as above.
>>
>>>   		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
>> request);
>>>   		goto err_close;
>>>   	}
>>> @@ -1549,7 +1551,7 @@ get_cni_fd(char *if_name)
>>>   	/* Request for file descriptor for netdev name*/
>>>   	snprintf(request, sizeof(request), "%s,%s",
>> UDS_XSK_MAP_FD_MSG,
>>> if_name);
>>>   	memset(response, 0, sizeof(response));
>>> -	if (make_request_cni(sock, &server, request, NULL, response,
>>> &out_fd) < 0) {
>>> +	if (make_request_cni(sock, &server, request, NULL, response,
>>> &out_fd,
>>> +sock_path) < 0) {
>> Same question as above.
>>
>>>   		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
>> request);
>>>   		goto err_close;
>>>   	}
>>> @@ -1571,7 +1573,7 @@ get_cni_fd(char *if_name)
>>>   	/* Initiate close connection */
>>>   	strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN);
>>>   	memset(response, 0, sizeof(response));
>>> -	if (make_request_cni(sock, &server, request, NULL, response,
>>> &out_fd) < 0) {
>>> +	if (make_request_cni(sock, &server, request, NULL, response,
>>> &out_fd,
>>> +sock_path) < 0) {
>> Same question as above.
>>
>>>   		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
>> request);
>>>   		goto err_close;
>>>   	}
>>> @@ -1698,7 +1700,7 @@ xsk_configure(struct pmd_internals *internals,
>>> struct pkt_rx_queue *rxq,
>>>   		int err, fd, map_fd;
>>>
>>>   		/* get socket fd from CNI plugin */
>>> -		map_fd = get_cni_fd(internals->if_name);
>>> +		map_fd = get_cni_fd(internals->if_name, internals-
>>>> sock_path);
>>>   		if (map_fd < 0) {
>>>   			AF_XDP_LOG(ERR, "Failed to receive CNI plugin
>> fd\n");
>>>   			goto out_xsk;
>>> @@ -2023,7 +2025,8 @@ xdp_get_channels_info(const char *if_name, int
>>> *max_queues,  static int  parse_parameters(struct rte_kvargs *kvlist,
>>> char *if_name, int *start_queue,
>>>   		 int *queue_cnt, int *shared_umem, char *prog_path,
>>> -		 int *busy_budget, int *force_copy, int *use_cni)
>>> +		 int *busy_budget, int *force_copy, int *use_cni,
>>> +		 char *sock_path)
>>>   {
>>>   	int ret;
>>>
>>> @@ -2069,6 +2072,11 @@ parse_parameters(struct rte_kvargs *kvlist,
>>> char *if_name, int *start_queue,
>>>   	if (ret < 0)
>>>   		goto free_kvlist;
>>>
>>> +	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_SOCK_ARG,
>>> +				 &parse_prog_arg, sock_path);
>> Parse_prog_arg does 2 things copy the sock_arg value and also check the
>> access to the socket.
>> Checking access here has a chance of causing raise condition so I would
>> recommend to skip this check here as this will be taken care in the
>> init_uds_sock().
>>
>>> +	if (ret < 0)
>>> +		goto free_kvlist;
>>> +
>>>   free_kvlist:
>>>   	rte_kvargs_free(kvlist);
>>>   	return ret;
>>> @@ -2108,7 +2116,7 @@ static struct rte_eth_dev *
>>> init_internals(struct rte_vdev_device *dev, const char *if_name,
>>>   	       int start_queue_idx, int queue_cnt, int shared_umem,
>>>   	       const char *prog_path, int busy_budget, int force_copy,
>>> -	       int use_cni)
>>> +	       int use_cni, const char *sock_path)
>>>   {
>>>   	const char *name = rte_vdev_device_name(dev);
>>>   	const unsigned int numa_node = dev->device.numa_node; @@ -
>>> 2138,6 +2146,7 @@ init_internals(struct rte_vdev_device *dev, const
>>> char *if_name,
>>>   	internals->shared_umem = shared_umem;
>>>   	internals->force_copy = force_copy;
>>>   	internals->use_cni = use_cni;
>>> +	strlcpy(internals->sock_path, sock_path, PATH_MAX);
>>>
>>>   	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
>>>   				  &internals->combined_queue_cnt)) { @@ -
>>> 2328,6 +2337,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
>>>   	int busy_budget = -1, ret;
>>>   	int force_copy = 0;
>>>   	int use_cni = 0;
>>> +	char sock_path[PATH_MAX] = {'\0'};
>>>   	struct rte_eth_dev *eth_dev = NULL;
>>>   	const char *name = rte_vdev_device_name(dev);
>>>
>>> @@ -2370,7 +2380,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
>>> *dev)
>>>
>>>   	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
>>>   			     &xsk_queue_cnt, &shared_umem, prog_path,
>>> -			     &busy_budget, &force_copy, &use_cni) < 0) {
>>> +			     &busy_budget, &force_copy, &use_cni,
>>> sock_path) < 0) {
>>>   		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
>>>   		return -EINVAL;
>>>   	}
>>> @@ -2387,6 +2397,13 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
>>> *dev)
>>>   			return -EINVAL;
>>>   	}
>>>
>>> +	if (use_cni && !strnlen(sock_path, PATH_MAX)) {
>>> +		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' must
>>> also be provided\n",
>>> +			ETH_AF_XDP_USE_CNI_ARG,
>>> ETH_AF_XDP_SOCK_ARG);
>>> +			return -EINVAL;
>>> +	}
>>> +
>>> +
>>>   	if (strlen(if_name) == 0) {
>>>   		AF_XDP_LOG(ERR, "Network interface must be
>> specified\n");
>>>   		return -EINVAL;
>>> @@ -2410,7 +2427,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
>>> *dev)
>>>
>>>   	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
>>>   				 xsk_queue_cnt, shared_umem, prog_path,
>>> -				 busy_budget, force_copy, use_cni);
>>> +				 busy_budget, force_copy, use_cni,
>>> sock_path);
>>>   	if (eth_dev == NULL) {
>>>   		AF_XDP_LOG(ERR, "Failed to init internals\n");
>>>   		return -1;
>>> @@ -2471,4 +2488,5 @@
>>> RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
>>>   			      "xdp_prog=<string> "
>>>   			      "busy_budget=<int> "
>>>   			      "force_copy=<int> "
>>> -			      "use_cni=<int> ");
>>> +			      "use_cni=<int> "
>>> +			      "sock=<string> ");
>>> --
>>> 2.41.0



  parent reply	other threads:[~2023-11-30 14:30 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-30  9:13 Maryam Tahhan
2023-11-30 12:14 ` Koikkara Reeny, Shibin
2023-11-30 12:32   ` Maryam Tahhan
2023-11-30 13:55     ` Koikkara Reeny, Shibin
2023-11-30 13:56   ` Koikkara Reeny, Shibin
2023-11-30 14:17     ` Maryam Tahhan
2023-12-01  9:55       ` Koikkara Reeny, Shibin
2023-12-01 10:20         ` Maryam Tahhan
2023-12-01 10:49           ` Koikkara Reeny, Shibin
2023-11-30 14:30     ` Maryam Tahhan [this message]
2023-12-01 10:26 ` David Marchand
2023-12-01 10:31   ` Maryam Tahhan
2023-12-01 10:33     ` Maryam Tahhan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9928ea49-e2df-41b0-8b4a-b965f9ead38b@redhat.com \
    --to=mtahhan@redhat.com \
    --cc=dev@dpdk.org \
    --cc=fengchengwen@huawei.com \
    --cc=ferruh.yigit@amd.com \
    --cc=lihuisong@huawei.com \
    --cc=liuyonglong@huawei.com \
    --cc=shibin.koikkara.reeny@intel.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).