DPDK patches and discussions
From: Asaf Sinai <AsafSi@Radware.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: RE: DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes
Date: Mon, 25 Jul 2022 09:31:32 +0000
Message-ID: <DB9PR01MB9980A9F829F63D0A6187BC3ECC959@DB9PR01MB9980.eurprd01.prod.exchangelabs.com>
In-Reply-To: <b9265788-0072-f602-e9bf-29f6c96101f0@intel.com>

Hi Anatoly,

Thank you very much for the helpful information and support!

Regards,
Asaf

-----Original Message-----
From: Burakov, Anatoly <anatoly.burakov@intel.com> 
Sent: Monday, July 25, 2022 12:22
To: Asaf Sinai <AsafSi@Radware.com>; dev@dpdk.org
Subject: Re: DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes

On 18-Jul-22 12:58 PM, Asaf Sinai wrote:
> Hi Anatoly,
> 
> DPDK runs as root, and secondary processes have all the info.
> 
> The problem was as follows:
> 
> The external memory regions are not managed by the Linux OS (by using 
> "memmap=x" in 'grub.conf'). Therefore, the kernel cannot supply their 
> physical addresses.
> 
> So, we added these addresses in code, and now it works fine!
> 
> Thanks for your help!

Happy to hear that!
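
As a rough illustration of "adding the addresses in code": a region
reserved at boot with memmap= is a single physical range, so the per-page
IOVA table that rte_malloc_heap_memory_add() takes as iova_addrs[] can be
computed from the region's base physical address. This is only a sketch,
not the code Asaf used. The helper name build_iova_table is hypothetical,
uint64_t stands in for rte_iova_t so the snippet builds without DPDK
headers, and the base address and page size are assumed to be known to
the application.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a DPDK API): fill a table with one physical
 * address per page for a physically contiguous region that starts at
 * base_pa. Returns a malloc()ed table the caller must free(), or NULL
 * on invalid input or allocation failure.
 */
uint64_t *
build_iova_table(uint64_t base_pa, size_t len, size_t page_sz,
		 unsigned int *n_pages)
{
	size_t i, n;
	uint64_t *tbl;

	/* the region must be a whole number of pages */
	if (n_pages == NULL || page_sz == 0 || len == 0 || len % page_sz != 0)
		return NULL;

	n = len / page_sz;
	tbl = malloc(n * sizeof(*tbl));
	if (tbl == NULL)
		return NULL;

	/* page i of a contiguous region starts at base_pa + i * page_sz */
	for (i = 0; i < n; i++)
		tbl[i] = base_pa + (uint64_t)i * page_sz;

	*n_pages = (unsigned int)n;
	return tbl;
}
```

The resulting table and page count would then be passed, together with
the mapped virtual address, to rte_malloc_heap_memory_add() in the
primary process.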

> 
> We have several additional questions:
> 
> 1. Usage of huge pages in "rte_eal_init":
> 
> We see that "rte_eal_init" requires allocating huge pages for 
> configuring the drivers.
> 
> It seems impossible to use the external memory, as 
> "rte_malloc_heap_memory_add" API cannot be used yet.

Yes, there is currently no way to initialize anything using external 
memory. At initialization time, external memory has not yet been 
discovered and therefore cannot be acted upon. It /could/ be possible 
to implement this using an EAL plugin, but I have not looked into it 
and know very little about the EAL plugin infrastructure, so I cannot 
offer suggestions off the top of my head.

> 
> So, we tried to use regular pages instead of huge pages, using the 
> option of "--no-huge".
> 
> It failed with the following printouts:
> 
> ...
> 
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> 
> EAL: FATAL: Cannot use IOVA as 'PA' since physical addresses are not available
> 
> EAL: Cannot use IOVA as 'PA' since physical addresses are not available

This happens because no-huge will not attempt to find physical addresses 
of the memory backing the allocated segments.

> 
> ...
> 
> 1a. Is there a way to use the external memory for “rte_eal_init”?

There is currently no way to do that, no.

> 
> 1b. Why does using regular pages cause DPDK to complain that “physical 
> addresses are not available”?

We do not populate physical addresses in the nohuge case, as per 
eal_legacy_hugepage_init(). Technically it should be possible to do so 
using calls into rte_eal_virt2phys(), we just don't. I believe the 
rationale is that 1) we have no control over that memory and the kernel 
might change its PAs at any time, 2) the init would take a long time 
because there are quite a few pages even in small nohuge segments (and 
we'd need to query pagemap for every single one of them), and 3) nohuge 
is really meant to be a debug option and is not intended for production 
use, so this path is intentionally not heavily tested.
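
To make the per-page pagemap query concrete, here is a minimal sketch of
a virtual-to-physical lookup through /proc/self/pagemap on Linux. It is
not DPDK's implementation (the public EAL call I am aware of for this is
rte_mem_virt2phy()); the names virt_to_phys, pagemap_entry_to_phys and
BAD_PHYS_ADDR are made up for the example. It relies on the documented
pagemap layout: one 64-bit entry per virtual page, bit 63 set when the
page is present, bits 0-54 holding the page frame number. Since Linux
4.2 the frame number reads as zero without CAP_SYS_ADMIN, so this only
yields real addresses when run as root.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define BAD_PHYS_ADDR    UINT64_MAX
#define PAGEMAP_PRESENT  (1ULL << 63)        /* bit 63: page is in RAM */
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)  /* bits 0-54: frame number */

/* Decode one pagemap entry into a physical address, or BAD_PHYS_ADDR. */
uint64_t
pagemap_entry_to_phys(uint64_t entry, uint64_t page_sz, uint64_t offset)
{
	uint64_t pfn = entry & PAGEMAP_PFN_MASK;

	/* not present, or PFN hidden from an unprivileged reader */
	if (!(entry & PAGEMAP_PRESENT) || pfn == 0)
		return BAD_PHYS_ADDR;
	return pfn * page_sz + offset;
}

/* Look up the physical address backing one virtual address. */
uint64_t
virt_to_phys(const void *virt)
{
	uint64_t va = (uint64_t)(uintptr_t)virt;
	uint64_t entry;
	long page_sz = sysconf(_SC_PAGESIZE);
	ssize_t n;
	int fd;

	if (page_sz <= 0)
		return BAD_PHYS_ADDR;

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return BAD_PHYS_ADDR;

	/* each virtual page has one 8-byte entry, indexed by page number */
	if (lseek(fd, (off_t)(va / (uint64_t)page_sz * sizeof(entry)),
		  SEEK_SET) == (off_t)-1) {
		close(fd);
		return BAD_PHYS_ADDR;
	}
	n = read(fd, &entry, sizeof(entry));
	close(fd);
	if (n != (ssize_t)sizeof(entry))
		return BAD_PHYS_ADDR;

	return pagemap_entry_to_phys(entry, (uint64_t)page_sz,
				     va % (uint64_t)page_sz);
}
```

One such lookup is needed per page, which is where the init-time cost in
point 2) comes from: a 1 GB nohuge segment made of 4 KB pages means
262144 lookups.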

> 
> 1c. Why is the “--no-huge” option defined as one of "EAL options for DEBUG 
> use only" (in the “eal_common_usage” routine)?

That's kind of why it was created: to test DPDK without hugepages. The 
intended use case for DPDK is to be run using hugepages.

> 
> 2. Explanation for some details in the "create_extmem" routine:
> 
> 2a. What is the purpose of calling "mlock" before populating IOVA addresses?
> 
> 2b. Why is “munlock” not used afterwards?

We want these pages to stay pinned in memory (i.e. the kernel shouldn't 
be allowed to move them).

-- 
Thanks,
Anatoly
