From: Ferruh Yigit <ferruh.yigit@amd.com>
To: "Koikkara Reeny, Shibin" <shibin.koikkara.reeny@intel.com>,
"dev@dpdk.org" <dev@dpdk.org>,
"Zhang, Qi Z" <qi.z.zhang@intel.com>,
"Burakov, Anatoly" <anatoly.burakov@intel.com>,
"Richardson, Bruce" <bruce.richardson@intel.com>,
"Mcnamara, John" <john.mcnamara@intel.com>
Cc: "Loftus, Ciara" <ciara.loftus@intel.com>
Subject: Re: [PATCH v4] net/af_xdp: AF_XDP PMD CNI Integration
Date: Fri, 10 Feb 2023 20:50:29 +0000 [thread overview]
Message-ID: <8de21cfd-458f-17ed-5b32-013bde7636ca@amd.com> (raw)
In-Reply-To: <DM6PR11MB3995D33D1F722EF96A3EE2FCA2DE9@DM6PR11MB3995.namprd11.prod.outlook.com>
On 2/10/2023 3:38 PM, Koikkara Reeny, Shibin wrote:
>
>
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Friday, February 10, 2023 1:04 PM
>> To: Koikkara Reeny, Shibin <shibin.koikkara.reeny@intel.com>;
>> dev@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>; Burakov, Anatoly
>> <anatoly.burakov@intel.com>; Richardson, Bruce
>> <bruce.richardson@intel.com>; Mcnamara, John
>> <john.mcnamara@intel.com>
>> Cc: Loftus, Ciara <ciara.loftus@intel.com>
>> Subject: Re: [PATCH v4] net/af_xdp: AF_XDP PMD CNI Integration
>>
>> On 2/9/2023 12:05 PM, Shibin Koikkara Reeny wrote:
>>> Integrate support for the AF_XDP CNI and device plugin [1] so that the
>>> DPDK AF_XDP PMD can work in an unprivileged container environment.
>>> Part of the AF_XDP PMD initialization process involves loading an eBPF
>>> program onto the given netdev. This operation requires privileges,
>>> which prevents the PMD from being able to work in an unprivileged
>>> container (without root access). The plugin CNI handles the program
>>> loading. CNI open Unix Domain Socket (UDS) and waits listening for a
>>> client to make requests over that UDS. The client(DPDK) connects and a
>>> "handshake" occurs, then the File Descriptor which points to the
>>> XSKMAP associated with the loaded eBPF program is handed over to the
>>> client. The client can then proceed with creating an AF_XDP socket and
>>> inserting the socket into the XSKMAP pointed to by the FD received on
>>> the UDS.
>>>
>>> A new vdev arg "use_cni" is created to indicate user wishes to run the
>>> PMD in unprivileged mode and to receive the XSKMAP FD from the CNI.
>>> When this flag is set, the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf
>>> flag should be used when creating the socket, which tells libbpf not
>>> to load the default libbpf program on the netdev. We tell libbpf not
>>> to do this because the loading is handled by the CNI in this scenario.
>>>
>>> Patch include howto doc explain how to configure AF_XDP CNI to working
>>> with DPDK.
>>>
>>> [1]: https://github.com/intel/afxdp-plugins-for-kubernetes
>>>
>>> Signed-off-by: Shibin Koikkara Reeny <shibin.koikkara.reeny@intel.com>
>>
>>
>> Is Anatoly's tested-by tag still valid with this version?
>
> Yes it is still valid.
>
>>
>> <...>
>>
>>> @@ -1413,7 +1678,23 @@ xsk_configure(struct pmd_internals *internals,
>> struct pkt_rx_queue *rxq,
>>> }
>>> }
>>>
>>> - if (rxq->busy_budget) {
>>> + if (internals->use_cni) {
>>> + int err, fd, map_fd;
>>> +
>>> + /* get socket fd from CNI plugin */
>>> + map_fd = get_cni_fd(internals->if_name);
>>> + if (map_fd < 0) {
>>> + AF_XDP_LOG(ERR, "Failed to receive CNI plugin
>> fd\n");
>>> + goto out_xsk;
>>> + }
>>> + /* get socket fd */
>>> + fd = xsk_socket__fd(rxq->xsk);
>>> + err = bpf_map_update_elem(map_fd, &rxq-
>>> xsk_queue_idx, &fd, 0);
>>> + if (err) {
>>> + AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk
>> in map.\n");
>>> + goto out_xsk;
>>> + }
>>> + } else if (rxq->busy_budget) {
>>
>>
>> 'use_cni' argument is added as if-else, this result 'use_cni' parameter
>> automatically makes 'busy_budget' argument ineffective, is this intentional?
>> If so can you please describe why?
>> And can you please document this in the driver documentation that 'use_cni'
>> and 'busy_budget' paramters are mutually exclusive.
>> May be this condition can be checked and an error message sent in runtime,
>> not sure.
>>
>
> When we use "use_cni" option inorder to configure the busy_budget we need to send the request to the CNI plugin
> and CNI plugin will configure the busy_poll. As the dpdk is running inside a container with limited permissions.
>
>>
>> Similarly, another parameter check above this (not visible in this patch),
>> xdp_prog (custom_prog_configured) is calling same APIs
>> (bpf_map_update_elem()), if both paramters are provided, 'use_cni' will
>> overwrite previous one, is this intentional?
>> Are 'use_cni' & 'xdp_prog' paramters mutually exclusive?
>
> When we use "use_cni" we don't have the permission to load the xdp_prog. As our privileges are limited inside the container.
> CNI plugin handle the loading of the program.
Yes, but what happens if user provides 'xdp_prog' parameter?
>>
>>
>> Overall is the combination of 'use_cni' paramter with other parameters
>> tested?
>
> We have tested the communication with CNI plugin which load the program and traffic flow.
>
I got that, but is the combination of 'use_cni' parameter with other
parameters tested?
Like what happens if user provides both 'xdp_prog' & 'use_cni'?
There is no documentation for this condition or there is no check in the
code that can provide some log message to user.
>>
>>
>>> ret = configure_preferred_busy_poll(rxq);
>>> if (ret) {
>>> AF_XDP_LOG(ERR, "Failed configure busy
>> polling.\n"); @@ -1584,6
>>> +1865,27 @@ static const struct eth_dev_ops ops = {
>>> .get_monitor_addr = eth_get_monitor_addr, };
>>>
>>> +/* CNI option works in unprivileged container environment
>>> + * and ethernet device functionality will be reduced. So
>>> + * additional customiszed eth_dev_ops struct is needed
>>> + * for cni. Promiscuous enable and disable functionality
>>> + * is removed.
>>
>>
>> Why promiscuous enable and disable functionality can't be used with
>> 'use_cni'?
>
> When we use "use_cni" we are running dpdk_testpmd inside a docker and inside the docker we have only
> limited permissions only ie the reason I have written it as "unprivileged container environment"
> it the comment.
>>
>> Can you please document the limitation in the driver document, also if
>> possible briefly mention reason of the limitation?
>
> In the documentation as prerequisites we have added :
> +* The Pod should have enabled the capabilities ``CAP_NET_RAW`` and ``CAP_BPF``
> + for AF_XDP along with support for hugepages.
>
> In the Background:
> +The standard `AF_XDP PMD`_ initialization process involves loading an eBPF program
> +onto the kernel netdev to be used by the PMD. This operation requires root or
> +escalated Linux privileges and thus prevents the PMD from working in an
> +unprivileged container. The AF_XDP CNI plugin handles this situation by
> +providing a device plugin that performs the program loading.
>
> If you think we need to add more please let me know.
>
Hi Shibin,
Thanks for the update.
I think it would be good to update driver documentation,
'doc/guides/nics/af_xdp.rst', and update where 'use_cni' parameter
documented with following additional information:
- When 'use_cni' parameter is used, 'busy_budget' parameter is not valid
and has no impact
- When 'use_cni' parameter is used, 'xdp_prog' parameter is not valid
and ? (what happens when provided)
- enable and disable promiscuous mode is not supported, and describe
briefly why (I know code has comment for it but less put it in
documentation too).
next prev parent reply other threads:[~2023-02-10 20:50 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <to=20221214154102.1521489-1-shibin.koikkara.reeny@intel.com>
2023-02-02 16:55 ` [PATCH v3] " Shibin Koikkara Reeny
2023-02-03 9:46 ` Mcnamara, John
2023-02-08 8:23 ` Zhang, Qi Z
2023-02-08 18:30 ` Ferruh Yigit
2023-02-09 11:49 ` Koikkara Reeny, Shibin
2023-02-09 12:05 ` [PATCH v4] " Shibin Koikkara Reeny
2023-02-10 13:04 ` Ferruh Yigit
2023-02-10 15:38 ` Koikkara Reeny, Shibin
2023-02-10 20:50 ` Ferruh Yigit [this message]
2023-02-10 13:41 ` Ferruh Yigit
2023-02-10 15:48 ` [PATCH v5] net/af_xdp: support " Shibin Koikkara Reeny
2023-02-14 14:50 ` Mcnamara, John
2023-02-15 16:30 ` [PATCH v6] " Shibin Koikkara Reeny
2023-02-16 13:50 ` Ferruh Yigit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8de21cfd-458f-17ed-5b32-013bde7636ca@amd.com \
--to=ferruh.yigit@amd.com \
--cc=anatoly.burakov@intel.com \
--cc=bruce.richardson@intel.com \
--cc=ciara.loftus@intel.com \
--cc=dev@dpdk.org \
--cc=john.mcnamara@intel.com \
--cc=qi.z.zhang@intel.com \
--cc=shibin.koikkara.reeny@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).