DPDK patches and discussions
From: Bruce Richardson <bruce.richardson@intel.com>
To: Ferruh Yigit <ferruh.yigit@amd.com>
Cc: <dev@dpdk.org>, <stable@dpdk.org>,
	Padraig Connolly <padraig.j.connolly@intel.com>
Subject: Re: [PATCH] ethdev: fix device init without socket-local memory
Date: Fri, 19 Jul 2024 10:57:30 +0100
Message-ID: <Zpo4itRTrrQncmPu@bricha3-mobl1.ger.corp.intel.com>
In-Reply-To: <4f7e619a-0398-41cc-90a9-3c52b73d1c49@amd.com>

On Fri, Jul 19, 2024 at 09:59:50AM +0100, Ferruh Yigit wrote:
> On 7/11/2024 1:35 PM, Bruce Richardson wrote:
> > When allocating memory for an ethdev, the rte_malloc_socket call used
> > only allocates memory on the NUMA node/socket local to the device. This
> > means that even if the user wanted to, they could never use a remote NIC
> > without also having memory on that NIC's socket.
> > 
> > For example, if we change examples/skeleton/basicfwd.c to have
> > SOCKET_ID_ANY as the socket_id parameter for Rx and Tx rings, we should
> > be able to run the app cross-numa e.g. as below, where the two PCI
> > devices are on socket 1, and core 1 is on socket 0:
> > 
> >  ./build/examples/dpdk-skeleton -l 1 --legacy-mem --socket-mem=1024,0 \
> > 		-a a8:00.0 -a b8:00.0
> > 
> > This fails however, with the error:
> > 
> >   ETHDEV: failed to allocate private data
> >   PCI_BUS: Requested device 0000:a8:00.0 cannot be used
> > 
> > We can remove this restriction by doing a fallback call to general
> > rte_malloc after a call to rte_malloc_socket fails. This should be safe
> > to do because the later ethdev calls to setup Rx/Tx queues all take a
> > socket_id parameter, which can be used by applications to enforce the
> > requirement for local-only memory for a device, if so desired. [If
> > device-local memory is present it will be used as before, while if not
> > present the rte_eth_dev_configure call will now pass, but the subsequent
> > queue setup calls requesting local memory will fail].
> > 
> > Fixes: e489007a411c ("ethdev: add generic create/destroy ethdev APIs")
> > Fixes: dcd5c8112bc3 ("ethdev: add PCI driver helpers")
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > Signed-off-by: Padraig Connolly <padraig.j.connolly@intel.com>
> >
> 
> Hi Bruce,
> 
> If device-local memory is present, behavior will be same, so I agree
> this is low impact.
> 
> But for the case where device-local memory is NOT present, the question
> is whether we should enforce the HW setup. This can be beneficial for
> users new to DPDK.
> 

No, we should not, because if we do, the user has no way to opt in to
remote memory - the probe/init and even configure calls have NO socket_id
parameter, so the enforcement of local memory is an internal assumption of
the API which is not documented anywhere and which the user cannot
override.
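
To make the proposed behaviour concrete, the fallback described in the
commit message amounts to the pattern below. This is only a sketch, not the
patch itself: sim_malloc_socket() and socket_free[] are hypothetical
stand-ins for rte_malloc_socket() and for the per-socket memory reserved by
--socket-mem=1024,0, so that the logic can be compiled without DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SOCKET_ID_ANY (-1)  /* same value DPDK uses for "any socket" */
#define NUM_SOCKETS 2

/* Hypothetical model of free memory per socket (as with --socket-mem=1024,0). */
static size_t socket_free[NUM_SOCKETS] = { 1024, 0 };

/* Hypothetical stand-in for rte_malloc_socket(): NULL if the socket is empty. */
static void *sim_malloc_socket(size_t size, int socket)
{
	if (socket == SOCKET_ID_ANY) {
		/* Take the first socket that can satisfy the request. */
		for (int s = 0; s < NUM_SOCKETS; s++) {
			if (socket_free[s] >= size) {
				socket_free[s] -= size;
				return calloc(1, size);
			}
		}
		return NULL;
	}
	if (socket < 0 || socket >= NUM_SOCKETS || socket_free[socket] < size)
		return NULL;
	socket_free[socket] -= size;
	return calloc(1, size);
}

/* Prefer device-local memory; fall back to any socket if there is none. */
static void *alloc_private_data(size_t size, int dev_socket)
{
	void *p;

	assert(size > 0);
	p = sim_malloc_socket(size, dev_socket);
	if (p == NULL)
		p = sim_malloc_socket(size, SOCKET_ID_ANY);
	return p;
}
```

With memory only on socket 0, a strict request for socket 1 returns NULL,
while alloc_private_data(64, 1) succeeds using socket 0. If socket 1 did
have memory, the first call would succeed and nothing would change.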

> Probably 'dev_private' on its own has a small impact on performance
> (although it may depend on whether these fields are used in the datapath),
> but it may be a vehicle to enforce local memory.
> 

As I explain in the commit message above, there are already other ways to
enforce local memory - we don't need another one. If the user only wants to
use local memory for a port, they can do so by setting the socket_id
parameter of the rx and tx queue setup calls. Enforcing local memory in
probe doesn't add anything to that, and just prevents other use-cases.
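
As a rough illustration of where that choice lives, the sketch below models
an application picking its NUMA policy at queue setup time. It does not call
DPDK: sim_queue_setup(), socket_has_mem[] and enum numa_policy are
hypothetical names standing in for the socket_id handling of the real
rte_eth_rx_queue_setup()/rte_eth_tx_queue_setup() calls.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SOCKET_ID_ANY (-1)  /* same value DPDK uses for "any socket" */
#define NUM_SOCKETS 2

/* Hypothetical model: which sockets have memory (as with --socket-mem=1024,0). */
static const int socket_has_mem[NUM_SOCKETS] = { 1, 0 };

/* Hypothetical stand-in for the socket_id handling in queue setup. */
static int sim_queue_setup(int socket_id)
{
	if (socket_id == SOCKET_ID_ANY) {
		for (int s = 0; s < NUM_SOCKETS; s++)
			if (socket_has_mem[s])
				return 0;
		return -ENOMEM;
	}
	if (socket_id < 0 || socket_id >= NUM_SOCKETS)
		return -EINVAL;
	return socket_has_mem[socket_id] ? 0 : -ENOMEM;
}

/* Application-level policy, chosen by the user rather than by probe. */
enum numa_policy { NUMA_LOCAL_ONLY, NUMA_ANY };

static int setup_port_queues(int dev_socket, enum numa_policy policy)
{
	/* Local-only: ask for the device's own socket, so setup fails
	 * if that socket has no memory. Otherwise accept any socket. */
	int socket_id = (policy == NUMA_LOCAL_ONLY) ? dev_socket : SOCKET_ID_ANY;

	assert(policy == NUMA_LOCAL_ONLY || policy == NUMA_ANY);
	return sim_queue_setup(socket_id);
}
```

For a device on socket 1 with memory only on socket 0, NUMA_LOCAL_ONLY
fails with -ENOMEM while NUMA_ANY succeeds, which is the behaviour split
the user can already request today.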

> What is gained by enabling an app to run on cross-NUMA memory, since in
> production I expect users would like to use device-local memory for
> performance reasons anyway?
> 

Mostly users want socket-local memory, but with increasing speeds of NICs
we are already seeing cases where users want to run cross-NUMA. For
example, a multi-port NIC may have some ports in use on each socket.

> 
> Also I am not sure if this is a fix, or change of an intentional behavior.
> 

I suppose it can be viewed either way. However, for me this is a fix,
because right now it is impossible for many users to run their applications
with memory on a different socket to the ports. Nowhere is it documented in
DPDK that there is a hard restriction that ports must have local memory, so
any enforcement of such a policy is wrong.

Turning things the other way around - I can't see how anything will break
or even slow down with this patch applied. As I point out above, the user
can already enforce local memory by passing the required socket id when
allocating rx and tx rings - this patch only pushes the failure for
non-local memory a bit later in the initialization sequence, where the user
can actually specify the desired NUMA behaviour. Is there some
case I'm missing where you foresee this causing problems?

/Bruce


Thread overview: 12+ messages
2024-07-11 12:35 Bruce Richardson
2024-07-19  8:59 ` Ferruh Yigit
2024-07-19  9:57   ` Bruce Richardson [this message]
2024-07-19 11:10     ` Ferruh Yigit
2024-07-19 13:22       ` Bruce Richardson
2024-07-19 15:31         ` Ferruh Yigit
2024-07-19 16:10           ` Bruce Richardson
2024-07-21 22:56             ` Ferruh Yigit
2024-07-22 10:06               ` Bruce Richardson
2024-07-19 10:41   ` Bruce Richardson
2024-07-22 10:02 ` [PATCH v2] " Bruce Richardson
2024-07-22 13:24   ` Ferruh Yigit
