DPDK usage discussions
* hugepage allocation mapping failure
@ 2024-09-04 22:23 Lombardo, Ed
  2024-09-06 13:42 ` hugepage mapping to memseg failure Lombardo, Ed
  0 siblings, 1 reply; 6+ messages in thread
From: Lombardo, Ed @ 2024-09-04 22:23 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: users


Hi Dmitry,
I hope you don't mind me reaching out to you about a hugepage memory mapping to memseg list issue that occurs intermittently.

We are occasionally seeing the DPDK allocation of hugepages fail.
DPDK version 22.11.2
Oracle 91 OS with kernel 5.14.0-284
The VM is configured with 32GB memory and 8 vCPU cores.
Setup for 2 x 1GB = 2GB hugepage total
We dynamically allocate the hugepages in a bash script before our application starts; it is not done through GRUB.
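
For reference, a minimal sketch of the kind of runtime reservation the script performs. This is an assumption, not the actual script (which is bash and not included here); the sysfs path is the standard kernel interface for 1 GB pages, and /mnt/huge matches the hugetlbfs mount point seen in the EAL log below.

/*
 * Illustrative sketch only: reserve N x 1 GB hugepages at runtime, the C
 * equivalent of echoing a page count into sysfs from a script. A hugetlbfs
 * mount is still needed, e.g. "mount -t hugetlbfs -o pagesize=1G none /mnt/huge".
 */
#include <stdio.h>

static int reserve_1g_hugepages(int count)
{
    FILE *f = fopen("/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages", "w");

    if (f == NULL) {
        perror("nr_hugepages");
        return -1;
    }
    fprintf(f, "%d\n", count);  /* ask the kernel for 'count' 1 GB pages */
    return fclose(f) == 0 ? 0 : -1;
}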

I turned on EAL debug in our application, which shows debug messages during EAL init.


Enable dpdk log EAL in nsprobe.
EAL: lib.eal log level changed from info to debug
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
EAL: Detected lcore 3 as core 0 on socket 0
EAL: Detected lcore 4 as core 0 on socket 0
EAL: Detected lcore 5 as core 0 on socket 0
EAL: Detected lcore 6 as core 0 on socket 0
EAL: Detected lcore 7 as core 0 on socket 0
EAL: Maximum logical cores by configuration: 128
EAL: Detected CPU lcores: 8
EAL: Detected NUMA nodes: 1
EAL: Checking presence of .so 'librte_eal.so.23.0'
EAL: Checking presence of .so 'librte_eal.so.23'
EAL: Checking presence of .so 'librte_eal.so'
EAL: Detected static linkage of DPDK
EAL: Ask a virtual area of 0x2000 bytes
EAL: Virtual area found at 0x100000000 (size = 0x2000)
[New Thread 0x7fed931ff640 (LWP 287600)]
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
[New Thread 0x7fed929fe640 (LWP 287601)]
EAL: PCI driver net_iavf for device 0000:00:05.0 wants IOVA as 'PA'
EAL: PCI driver net_ice_dcf for device 0000:00:05.0 wants IOVA as 'PA'
EAL: PCI driver net_iavf for device 0000:00:06.0 wants IOVA as 'PA'
EAL: PCI driver net_ice_dcf for device 0000:00:06.0 wants IOVA as 'PA'
EAL: Bus pci wants IOVA as 'PA'
EAL: Bus vdev wants IOVA as 'DC'
EAL: Selected IOVA mode 'PA'
EAL: Probing VFIO support...
EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
EAL: VFIO modules not loaded, skipping VFIO support...
EAL: Ask a virtual area of 0x2e000 bytes
EAL: Virtual area found at 0x100002000 (size = 0x2e000)
EAL: Setting up physically contiguous memory...
EAL: Setting maximum number of open files to 1024
EAL: Detected memory type: socket_id:0 hugepage_sz:1073741824
EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
EAL: Creating 1 segment lists: n_segs:2 socket_id:0 hugepage_sz:1073741824
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0x100030000 (size = 0x1000)
EAL: Memseg list allocated at socket 0, page size 0x100000kB
EAL: Ask a virtual area of 0x80000000 bytes
EAL: Virtual area found at 0x140000000 (size = 0x80000000)
EAL: VA reserved for memseg list at 0x140000000, size 80000000
EAL: Creating 1 segment lists: n_segs:1024 socket_id:0 hugepage_sz:2097152
EAL: Ask a virtual area of 0xd000 bytes
EAL: Virtual area found at 0x1c0000000 (size = 0xd000)
EAL: Memseg list allocated at socket 0, page size 0x800kB
EAL: Ask a virtual area of 0x80000000 bytes
EAL: Virtual area found at 0x1c0200000 (size = 0x80000000)
EAL: VA reserved for memseg list at 0x1c0200000, size 80000000
EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Restoring previous memory policy: 0
EAL: Hugepage /mnt/huge/rtemap_1 is on socket 0
EAL: Hugepage /mnt/huge/rtemap_0 is on socket 0
EAL: Requesting 2 pages of size 1024MB from socket 0    <<<< same on good and bad
EAL: Attempting to map 1024M on socket 0      <<<< on a good VM this line reads "Attempting to map 2048M on socket 0"; we have one NUMA node (one socket)
EAL: Allocated 1024M on socket 0                         <<<< the first 1024M was allocated on socket 0
EAL: Attempting to map 1024M on socket 0      <<<< attempts to map the last 1G on socket 0
EAL: Could not find space for memseg. Please increase 1024 and/or 2048 in configuration.   <<<<
EAL: Couldn't remap hugepage files into memseg lists      <<<<
EAL: FATAL: Cannot init memory
EAL: Cannot init memory


// good case
EAL: Hugepage /mnt/huge/rtemap_1 is on socket 0
EAL: Hugepage /mnt/huge/rtemap_0 is on socket 0
EAL: Requesting 2 pages of size 1024MB from socket 0
EAL: Attempting to map 2048M on socket 0
EAL: Allocated 2048M on socket 0
EAL: Added 2048M to heap on socket 0

Could it be that the hugepages are not physically contiguous, and a reboot clears the condition? I have not been able to confirm this.
I tried rebooting the VM 10 times and could not get it to fail.
I tried multiple VMs, and it sometimes fails.
The issue has been seen on both VMware and OpenStack VMs.

A few months back you helped me reduce the VIRT memory of our application.

I added the following before building the DPDK static libraries that are used in our application build.

#define DPDK_REDUCE_VIRT_8G   // selects the reduced memseg lists (MSL) and related reductions

#if defined(DPDK_ORIGINAL) // original, VIRT: 36.6 GB
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
#define RTE_MAX_MEMSEG_PER_TYPE 32768
#define RTE_MAX_MEM_MB_PER_TYPE 65536
#endif

#if defined(DPDK_REDUCE_VIRT_8G)  // VIRT: 5.9 GB
#define RTE_MAX_MEMSEG_LISTS 2
#define RTE_MAX_MEMSEG_PER_LIST 1024
#define RTE_MAX_MEM_MB_PER_LIST 2048
#define RTE_MAX_MEMSEG_PER_TYPE 1024
#define RTE_MAX_MEM_MB_PER_TYPE 2048
#endif

We provide the following arguments to rte_eal_init():
'app_name, -c0x2, -n4, --socket-mem=2048, --legacy-mem, --no-telemetry'
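
For clarity, a minimal sketch (not our actual application code; init_dpdk is just an illustrative name) of how these arguments are passed to rte_eal_init():

/* Sketch only: hand the EAL arguments listed above to rte_eal_init()
 * and abort if EAL initialization fails. */
#include <stdlib.h>

#include <rte_eal.h>
#include <rte_errno.h>
#include <rte_debug.h>

static int init_dpdk(void)
{
    char *eal_argv[] = {
        "app_name", "-c0x2", "-n4", "--socket-mem=2048",
        "--legacy-mem", "--no-telemetry"
    };
    int eal_argc = (int)(sizeof(eal_argv) / sizeof(eal_argv[0]));
    int ret = rte_eal_init(eal_argc, eal_argv);

    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Cannot init EAL: %s\n", rte_strerror(rte_errno));
    return ret;  /* number of EAL arguments consumed */
}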

What do you suggest to eliminate this intermittent memseg list mapping issue?

Thanks,
Ed



* hugepage mapping to memseg failure
  2024-09-04 22:23 hugepage allocation mapping failure Lombardo, Ed
@ 2024-09-06 13:42 ` Lombardo, Ed
  2024-09-07 20:35   ` Dmitry Kozlyuk
  0 siblings, 1 reply; 6+ messages in thread
From: Lombardo, Ed @ 2024-09-06 13:42 UTC (permalink / raw)
  To: users


Hi,
I hope someone can help. I am seeing DPDK EAL initialization intermittently fail when hugepage memory is mapped to the memseg lists.

DPDK version 22.11.2
Oracle 91 OS with kernel 5.14.0-284
The VM is configured with 32GB memory and 8 vCPU cores.
Setup for 2 x 1GB = 2GB hugepage total
We dynamically allocate the hugepages in a bash script before our application starts; it is not done through GRUB.

I turned on EAL debug in our application, which shows debug messages during EAL init.


Enable dpdk log EAL in nsprobe.
EAL: lib.eal log level changed from info to debug
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
EAL: Detected lcore 3 as core 0 on socket 0
EAL: Detected lcore 4 as core 0 on socket 0
EAL: Detected lcore 5 as core 0 on socket 0
EAL: Detected lcore 6 as core 0 on socket 0
EAL: Detected lcore 7 as core 0 on socket 0
EAL: Maximum logical cores by configuration: 128
EAL: Detected CPU lcores: 8
EAL: Detected NUMA nodes: 1
EAL: Checking presence of .so 'librte_eal.so.23.0'
EAL: Checking presence of .so 'librte_eal.so.23'
EAL: Checking presence of .so 'librte_eal.so'
EAL: Detected static linkage of DPDK
EAL: Ask a virtual area of 0x2000 bytes
EAL: Virtual area found at 0x100000000 (size = 0x2000)
[New Thread 0x7fed931ff640 (LWP 287600)]
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
[New Thread 0x7fed929fe640 (LWP 287601)]
EAL: PCI driver net_iavf for device 0000:00:05.0 wants IOVA as 'PA'
EAL: PCI driver net_ice_dcf for device 0000:00:05.0 wants IOVA as 'PA'
EAL: PCI driver net_iavf for device 0000:00:06.0 wants IOVA as 'PA'
EAL: PCI driver net_ice_dcf for device 0000:00:06.0 wants IOVA as 'PA'
EAL: Bus pci wants IOVA as 'PA'
EAL: Bus vdev wants IOVA as 'DC'
EAL: Selected IOVA mode 'PA'
EAL: Probing VFIO support...
EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
EAL: VFIO modules not loaded, skipping VFIO support...
EAL: Ask a virtual area of 0x2e000 bytes
EAL: Virtual area found at 0x100002000 (size = 0x2e000)
EAL: Setting up physically contiguous memory...
EAL: Setting maximum number of open files to 1024
EAL: Detected memory type: socket_id:0 hugepage_sz:1073741824
EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
EAL: Creating 1 segment lists: n_segs:2 socket_id:0 hugepage_sz:1073741824
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0x100030000 (size = 0x1000)
EAL: Memseg list allocated at socket 0, page size 0x100000kB
EAL: Ask a virtual area of 0x80000000 bytes
EAL: Virtual area found at 0x140000000 (size = 0x80000000)
EAL: VA reserved for memseg list at 0x140000000, size 80000000
EAL: Creating 1 segment lists: n_segs:1024 socket_id:0 hugepage_sz:2097152
EAL: Ask a virtual area of 0xd000 bytes
EAL: Virtual area found at 0x1c0000000 (size = 0xd000)
EAL: Memseg list allocated at socket 0, page size 0x800kB
EAL: Ask a virtual area of 0x80000000 bytes
EAL: Virtual area found at 0x1c0200000 (size = 0x80000000)
EAL: VA reserved for memseg list at 0x1c0200000, size 80000000
EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Restoring previous memory policy: 0


// bad case
EAL: Hugepage /mnt/huge/rtemap_1 is on socket 0
EAL: Hugepage /mnt/huge/rtemap_0 is on socket 0
EAL: Requesting 2 pages of size 1024MB from socket 0    <<<< same on good and bad
EAL: Attempting to map 1024M on socket 0      <<<< on a good VM this line reads "Attempting to map 2048M on socket 0"; we have one NUMA node (one socket)
EAL: Allocated 1024M on socket 0                         <<<< the first 1024M was allocated on socket 0
EAL: Attempting to map 1024M on socket 0      <<<< attempts to map the last 1G on socket 0
EAL: Could not find space for memseg. Please increase 1024 and/or 2048 in configuration.
EAL: Couldn't remap hugepage files into memseg lists
EAL: FATAL: Cannot init memory
EAL: Cannot init memory

// good case
EAL: Hugepage /mnt/huge/rtemap_1 is on socket 0
EAL: Hugepage /mnt/huge/rtemap_0 is on socket 0
EAL: Requesting 2 pages of size 1024MB from socket 0
EAL: Attempting to map 2048M on socket 0
EAL: Allocated 2048M on socket 0
EAL: Added 2048M to heap on socket 0

I tried rebooting the VM 10 times and could not get it to fail.
I tried multiple VMs, and it sometimes fails.
Stopping and restarting the application does not clear the error.
The issue has been seen on both VMware and OpenStack VMs.

I modified the DPDK #defines to reduce the VIRT memory of our application.

I added the following before building the DPDK static libraries that are used in our application build.

#define DPDK_REDUCE_VIRT_8G   // selects the reduced memseg lists (MSL) and related reductions

#if defined(DPDK_ORIGINAL) // original, VIRT: 36.6 GB
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
#define RTE_MAX_MEMSEG_PER_TYPE 32768
#define RTE_MAX_MEM_MB_PER_TYPE 65536
#endif

#if defined(DPDK_REDUCE_VIRT_8G)  // VIRT: 5.9 GB
#define RTE_MAX_MEMSEG_LISTS 2
#define RTE_MAX_MEMSEG_PER_LIST 1024
#define RTE_MAX_MEM_MB_PER_LIST 2048
#define RTE_MAX_MEMSEG_PER_TYPE 1024
#define RTE_MAX_MEM_MB_PER_TYPE 2048
#endif

The rte_eal_init() arguments are:
'app_name, -c0x2, -n4, --socket-mem=2048, --legacy-mem, --no-telemetry'

Could it be that the hugepages are not physically contiguous, and a reboot clears the condition? I have not been able to confirm this.

I need help eliminating this intermittent memseg list mapping issue.



Thanks,
Ed



* Re: hugepage mapping to memseg failure
  2024-09-06 13:42 ` hugepage mapping to memseg failure Lombardo, Ed
@ 2024-09-07 20:35   ` Dmitry Kozlyuk
  2024-09-10 20:42     ` Lombardo, Ed
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Kozlyuk @ 2024-09-07 20:35 UTC (permalink / raw)
  To: Lombardo, Ed; +Cc: users


Hi Ed,

On Fri, Sep 6, 2024, 16:43 Lombardo, Ed <Ed.Lombardo@netscout.com> wrote:

> The rte_eal_init() arguments are:
>
> ‘app_name, -c0x2, -n4, --socket-mem=2048, --legacy-mem, --no-telemetry’
>
>
>
> Could it be that the hugepages are not contiguous and reboot clears this
> issue, not able to confirm.
>
Yes, this is likely the root cause. Since you're building DPDK yourself,
you can print cur->physaddr around line 900 (see the link below). In legacy
mode, DPDK leaves "holes" (unused elements) in memory segment lists between
pages that are not physically contiguous. Because in your case the segment
list has only two elements, there is no room for two segments for two
hugepages plus a hole segment between them.
http://git.dpdk.org/dpdk/tree/lib/eal/linux/eal_memory.c#n832
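
For illustration, such a debug line might look like the following (cur and its
physaddr field are per that loop; the filepath field name is taken from struct
hugepage_file and is worth double-checking against the linked source):

/* Sketch only, to be dropped into the loop near line 900 of eal_memory.c:
 * print each hugepage's physical address. Consecutive 1 GB pages whose
 * addresses are not exactly 1 GB apart are not physically contiguous. */
RTE_LOG(DEBUG, EAL, "hugepage %s: physaddr=0x%" PRIx64 "\n",
        cur->filepath, (uint64_t)cur->physaddr);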

Your options then are:
- not using legacy memory mode;
- increasing *_MAX_MEM_MB_* constants to 3072 (will also increase VIRT).
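
For the second option, a sketch of your earlier defines with the
*_MAX_MEM_MB_* values raised to 3072 (the #if wrapper just mirrors the
snippet from your message; it is not part of stock DPDK):

#if defined(DPDK_REDUCE_VIRT_8G)  /* sketch: room for 2 x 1 GB pages plus a hole */
#define RTE_MAX_MEMSEG_LISTS 2
#define RTE_MAX_MEMSEG_PER_LIST 1024
#define RTE_MAX_MEM_MB_PER_LIST 3072
#define RTE_MAX_MEMSEG_PER_TYPE 1024
#define RTE_MAX_MEM_MB_PER_TYPE 3072
#endif

With 3072 MB per list, the 1 GB segment list should get three entries
(n_segs:3) instead of two, leaving room for two hugepages plus a hole
segment between them.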

>



* RE: hugepage mapping to memseg failure
  2024-09-07 20:35   ` Dmitry Kozlyuk
@ 2024-09-10 20:42     ` Lombardo, Ed
  2024-09-10 22:36       ` Dmitry Kozlyuk
  0 siblings, 1 reply; 6+ messages in thread
From: Lombardo, Ed @ 2024-09-10 20:42 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: users


Hi Dmitry,
If I use GRUB to reserve the hugepages, will the hugepages always be contiguous, so that we won't see the memseg mapping issue?

I am investigating the second option you provided, to see how much the VIRT memory increases.

Best Regards,
Ed

From: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Sent: Saturday, September 7, 2024 4:36 PM
To: Lombardo, Ed <Ed.Lombardo@netscout.com>
Cc: users <users@dpdk.org>
Subject: Re: hugepage mapping to memseg failure

Hi Ed,
On Fri, Sep 6, 2024, 16:43 Lombardo, Ed <Ed.Lombardo@netscout.com> wrote:
The rte_eal_init() arguments are:
‘app_name, -c0x2, -n4, --socket-mem=2048, --legacy-mem, --no-telemetry’

Could it be that the hugepages are not contiguous and reboot clears this issue, not able to confirm.
Yes, this is likely the root cause. Since you're building DPDK yourself, you can print cur->physaddr around line 900 (see the link below). In legacy mode, DPDK leaves "holes" (unused elements) in memory segment lists between pages that are not physically contiguous. Because in your case the segment list has only two elements, there is no room for two segments for two hugepages plus a hole segment between them.
http://git.dpdk.org/dpdk/tree/lib/eal/linux/eal_memory.c#n832

Your options then are:
- not using legacy memory mode;
- increasing *_MAX_MEM_MB_* constants to 3072 (will also increase VIRT).



* Re: hugepage mapping to memseg failure
  2024-09-10 20:42     ` Lombardo, Ed
@ 2024-09-10 22:36       ` Dmitry Kozlyuk
  2024-09-11  4:26         ` Lombardo, Ed
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Kozlyuk @ 2024-09-10 22:36 UTC (permalink / raw)
  To: Lombardo, Ed; +Cc: users

2024-09-10 20:42 (UTC+0000), Lombardo, Ed:
> Hi Dmitry,
> If I use grub for hugepages will the hugepages always be contiguous and we won’t see the mapping to memsegs issue?

There are no guarantees about physical addresses.
On bare metal, getting contiguous addresses at system startup is more likely.
On a VM, I think it is always less likely because host memory is fragmented.

> I am investigating your option 2 you provided to see how much VIRT memory increases.

There might be a third option.
If your HW and hypervisor permit accessing IOMMU from guests
and if the NIC can be bound to vfio-pci driver,
then you could use IOVA-as-VA (--iova-mode=va)
and have no issues with physical addresses ever.
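
For illustration only, the EAL arguments from your message would then gain
one flag; whether to also drop --legacy-mem (the first option from my
previous mail) is a separate decision:

/* Sketch: same EAL arguments as before plus IOVA-as-VA; requires the NIC
 * to be bound to vfio-pci and a working (v)IOMMU exposed to the guest. */
char *eal_argv[] = {
    "app_name", "-c0x2", "-n4", "--socket-mem=2048",
    "--legacy-mem", "--no-telemetry", "--iova-mode=va"
};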

Out of curiosity, why is legacy memory mode preferable for your app?


* RE: hugepage mapping to memseg failure
  2024-09-10 22:36       ` Dmitry Kozlyuk
@ 2024-09-11  4:26         ` Lombardo, Ed
  0 siblings, 0 replies; 6+ messages in thread
From: Lombardo, Ed @ 2024-09-11  4:26 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: users

Hi Dmitry,
Legacy memory mode was one way to reduce the VIRT memory drastically.  Our application restricts and locks down memory for performance purposes.  We need to continue to offer our customers our virtual application with a minimum of 16 GB of memory.  With DPDK 22.11 the VIRT memory jumped to 66 GB.
The VIRT memory jump caused problems with our application startup.  You had helped me reduce the VIRT memory to ~8 GB, which came close to what DPDK 17.11 provided us.

Thanks,
Ed

-----Original Message-----
From: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> 
Sent: Tuesday, September 10, 2024 6:37 PM
To: Lombardo, Ed <Ed.Lombardo@netscout.com>
Cc: users <users@dpdk.org>
Subject: Re: hugepage mapping to memseg failure


2024-09-10 20:42 (UTC+0000), Lombardo, Ed:
> Hi Dmitry,
> If I use grub for hugepages will the hugepages always be contiguous and we won’t see the mapping to memsegs issue?

There are no guarantees about physical addresses.
On bare metal, getting contiguous addresses at system startup is more likely.
On a VM, I think it is always less likely because host memory is fragmented.

> I am investigating your option 2 you provided to see how much VIRT memory increases.

There might be a third option.
If your HW and hypervisor permit accessing IOMMU from guests and if the NIC can be bound to vfio-pci driver, then you could use IOVA-as-VA (--iova-mode=va) and have no issues with physical addresses ever.

Out of curiosity, why is legacy memory mode preferable for your app?

