DPDK usage discussions
* Unable to start mlx5 PMD on NUMA node 1
From: Tomas Jansky @ 2024-06-25 10:42 UTC
  To: users

Hello,

I am experiencing issues with the DPDK (21.11) mlx5 PMD when hugepages are allocated only on NUMA node 1.

Card info:
  Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
  Subsystem: Mellanox Technologies Device 0020
  NUMA node: 1
  Driver: mlx5_core
  Version: 5.7-1.0.2
  Firmware-version: 26.36.1010 (DEL0000000031)

The card is attached to NUMA node 1, so I allocated 1 GB hugepages only on NUMA node 1:
cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
0
cat /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
4
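
The per-node counts shown above can also be collected in one pass. The following is a minimal sketch, assuming the standard Linux sysfs layout under /sys/devices/system/node; the function name hugepages_per_node and its sysfs_root parameter are hypothetical and exist only for illustration. It reads free_hugepages alongside nr_hugepages, since the EAL messages refer to free pages.

```python
import re
from pathlib import Path


def hugepages_per_node(sysfs_root="/sys/devices/system/node"):
    """Return {node_id: {page_size_kB: (nr_hugepages, free_hugepages)}}.

    Hypothetical helper: it only reads the per-node sysfs counters and
    does not change any allocation.
    """
    result = {}
    for node_dir in sorted(Path(sysfs_root).glob("node[0-9]*")):
        node_match = re.fullmatch(r"node(\d+)", node_dir.name)
        if node_match is None:
            continue
        sizes = {}
        for hp_dir in sorted((node_dir / "hugepages").glob("hugepages-*kB")):
            size_match = re.fullmatch(r"hugepages-(\d+)kB", hp_dir.name)
            if size_match is None:
                continue
            # Each counter file holds a single decimal number and a newline.
            total = int((hp_dir / "nr_hugepages").read_text())
            free = int((hp_dir / "free_hugepages").read_text())
            sizes[int(size_match.group(1))] = (total, free)
        result[int(node_match.group(1))] = sizes
    return result
```

Calling hugepages_per_node() on the affected host and printing the result should make it easy to confirm which node and page size actually have free pages before the application starts.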

However, when I run my DPDK application, it fails with the following output:
EAL: Detected CPU lcores: 24
EAL: Detected NUMA nodes: 2
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/0000:98:00.0/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No free 1048576 kB hugepages reported on node 0
EAL: No free 2048 kB hugepages reported on node 0
EAL: No free 2048 kB hugepages reported on node 1
EAL: No available 2048 kB hugepages reported
EAL: Probe PCI driver: mlx5_pci (15b3:101f) device: 0000:98:00.0 (socket 1)
mlx5_net: Failed to create ASO bits mem for MR.
EAL: Error: Invalid memory
mlx5_net: probe of PCI device 0000:98:00.0 aborted after encountering an error: Operation not permitted
mlx5_common: Failed to load driver mlx5_eth
EAL: Requested device 0000:98:00.0 cannot be used
EAL: Bus (pci) probe failed.

If I allocate hugepages only on NUMA node 0, it fails with:
mlx5_common: Failed to initialize global MR share cache.

So the only working configuration for me currently is to have hugepages allocated on both NUMA nodes, which is surprising, considering that, e.g., the i40e PMD works fine with hugepages on only a single NUMA node.

Is there a way to force the mlx5 PMD to allocate everything on a single specific NUMA node?

Any advice is appreciated.
Tomas



* Re: Unable to start mlx5 PMD on NUMA node 1
From: Erez Ferber @ 2024-06-25 14:29 UTC
  To: Tomas Jansky; +Cc: users

Make sure you're running with the fix below, which should have been merged in
v21.11.1 and later.

https://mails.dpdk.org/archives/stable/2022-February/036254.html
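
A quick way to compare a reported version against that threshold is sketched below. It assumes the version is available as a string such as "DPDK 21.11.0" or "21.11"; the names FIXED_IN, parse_dpdk_version and has_fix are hypothetical. The comparison is purely numeric, so it is only meaningful within the 21.11 series discussed here.

```python
import re

# First 21.11 release said in this thread to carry the fix.
FIXED_IN = (21, 11, 1)


def parse_dpdk_version(text):
    """Extract (major, minor, patch) from a string such as 'DPDK 21.11.0'.

    A missing patch component is treated as 0, so '21.11' parses as
    (21, 11, 0).
    """
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def has_fix(text, fixed_in=FIXED_IN):
    """Return True if the parsed version is at or above fixed_in."""
    return parse_dpdk_version(text) >= fixed_in
```

With the version from the original report, has_fix("21.11") is False, which is consistent with the suggestion to move to a later 21.11.x release.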

Regards,
Erez

