DPDK patches and discussions
* [Bug 1256] drivers/common/mlx5: mlx5_malloc() called on invalid socket ID when global MR cache is full and rte_extmem_* API is used
@ 2023-06-20 11:51 bugzilla
  2023-07-06  8:29 ` bugzilla
  2023-07-06  9:06 ` bugzilla
  0 siblings, 2 replies; 3+ messages in thread
From: bugzilla @ 2023-06-20 11:51 UTC (permalink / raw)
  To: dev

https://bugs.dpdk.org/show_bug.cgi?id=1256

            Bug ID: 1256
           Summary: drivers/common/mlx5: mlx5_malloc() called on invalid
                    socket ID when global MR cache is full and
                    rte_extmem_* API is used
           Product: DPDK
           Version: 21.11
          Hardware: x86
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: other
          Assignee: dev@dpdk.org
          Reporter: baciumariuscristian@yahoo.com
  Target Milestone: ---

Overview:
When the global Btree MR cache is full, an attempt to allocate a new mlx5 MR
entry ends up calling mlx5_malloc() with the EXTERNAL_HEAP_MIN_SOCKET_ID socket
ID. No external-memory heap has been created in this setup; the rte_extmem_*
API is used instead.

Steps to reproduce:
- start a primary DPDK process on a NIC compatible with the mlx5_core driver;
- use rte_extmem_register() to register more than 512 pages of 4 KB each;
- use rte_dev_dma_map() to DMA-map each page;
- call rte_eth_tx_burst() with an mbuf whose external buffer lies in the last
page of the registered memory (or in any page above index 512), i.e. at a
virtual address that will not be found in the global Btree cache.

Actual results:
mlx5_malloc() in mlx5_mr_create_primary() fails with "Unable to allocate memory
for a new MR". From this point on, packets never reach the other end.

Expected results:
MR entries should be successfully retrieved from backup or created when the
cache becomes full; calling mlx5_malloc() on an external-heap socket should not
be possible when the rte_extmem_* API is used. As stated in the DPDK
documentation [1], "Memory added this way will not be available for any regular
DPDK allocators".
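
One way to express this expectation is to never hand an external-memory socket
ID to the allocator. The sketch below is only an illustration under stated
assumptions, not the fix applied in DPDK: model_alloc_socket() is a hypothetical
helper, the value 256 for EXTERNAL_HEAP_MIN_SOCKET_ID is an assumption (the
real value depends on the build configuration), and falling back to "any
socket" (-1, as with DPDK's SOCKET_ID_ANY) is just one possible policy.

```c
/* Standalone model; none of these names exist in DPDK or mlx5. */
#define MODEL_SOCKET_ID_ANY          (-1)  /* same value as DPDK's SOCKET_ID_ANY */
#define MODEL_EXT_HEAP_MIN_SOCKET_ID 256   /* ASSUMED value, see text above */

/*
 * Hypothetical helper: pick the socket ID to allocate driver metadata on.
 * Socket IDs at or above the external-heap threshold describe external
 * memory, so they are replaced by "any socket" instead of being passed on.
 */
static int model_alloc_socket(int memseg_socket_id)
{
    if (memseg_socket_id >= MODEL_EXT_HEAP_MIN_SOCKET_ID)
        return MODEL_SOCKET_ID_ANY;
    return memseg_socket_id;
}
```

With such a helper, regular NUMA socket IDs pass through unchanged, while the
socket ID of a memseg list created by rte_extmem_register() is never used for
an internal allocation.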

Build Date & Hardware:
20 Jun 2023 on Debian GNU/Linux 4.18.0

[1]: https://doc.dpdk.org/guides-21.11/prog_guide/env_abstraction_layer.html

-- 
You are receiving this mail because:
You are the assignee for the bug.

* [Bug 1256] drivers/common/mlx5: mlx5_malloc() called on invalid socket ID when global MR cache is full and rte_extmem_* API is used
From: bugzilla @ 2023-07-06  8:29 UTC (permalink / raw)
  To: dev

https://bugs.dpdk.org/show_bug.cgi?id=1256

Raslan Darawsheh (rasland@nvidia.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Raslan Darawsheh (rasland@nvidia.com) ---
https://git.dpdk.org/dpdk/commit/?h=releases&id=147f6fb42bd7637b37a9180b0774275531c05f9b

* [Bug 1256] drivers/common/mlx5: mlx5_malloc() called on invalid socket ID when global MR cache is full and rte_extmem_* API is used
From: bugzilla @ 2023-07-06  9:06 UTC (permalink / raw)
  To: dev

https://bugs.dpdk.org/show_bug.cgi?id=1256

Marius-Cristian Baciu (baciumariuscristian@yahoo.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |---
             Status|RESOLVED                    |UNCONFIRMED

--- Comment #3 from Marius-Cristian Baciu (baciumariuscristian@yahoo.com) ---
Hi,

Unfortunately, that patch only targets a memory socket issue with the ASO
mechanism. In my setup, however, ASO is never an issue; I do not believe it is
even enabled.

To give a little more insight, the problem I am describing manifests on the
data path:
- rte_eth_tx_burst() is called;
- mlx5_tx_burst_*() is called;
- at some later point, in mr_lookup_caches(), mr_btree_lookup() returns
UINT32_MAX because all 256 entries in the cache are occupied and the last
memory registration did not find an empty slot;
- when mr_lookup_caches() fails, mlx5_mr_create() -> mlx5_mr_create_primary()
is called;
- mlx5_malloc() at line 723 fails because it is called with an inappropriate
socket ID: EXTERNAL_HEAP_MIN_SOCKET_ID, the socket ID of the memseg list
associated with the external buffer (previously registered with
rte_extmem_register()). No valid heap is associated with that socket ID, so no
memory can be allocated from it.
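
The lookup miss in this sequence can be modelled in isolation. The following is
a minimal, self-contained C sketch, not the mlx5 code: struct model_cache,
model_cache_insert() and model_cache_lookup() are hypothetical names, and only
the 256-entry capacity and the UINT32_MAX miss value are taken from the
description above. It assumes ranges are inserted in ascending,
non-overlapping order.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_CACHE_CAPACITY 256 /* the "256 entries" mentioned above */

/* Hypothetical, simplified entry: one registered range [start, end). */
struct model_entry {
    uintptr_t start;
    uintptr_t end;
};

/* Fixed-capacity table kept sorted by start address. */
struct model_cache {
    struct model_entry entries[MODEL_CACHE_CAPACITY];
    uint32_t len;
};

/*
 * Append a range. Returns 0 on success, or -1 when every slot is already
 * occupied, in which case the range is simply not cached.
 */
static int model_cache_insert(struct model_cache *c, uintptr_t start,
                              size_t size)
{
    if (c->len == MODEL_CACHE_CAPACITY)
        return -1;
    c->entries[c->len].start = start;
    c->entries[c->len].end = start + size;
    c->len++;
    return 0;
}

/*
 * Binary search for the range containing addr. Returns the entry index,
 * or UINT32_MAX when no cached range contains the address.
 */
static uint32_t model_cache_lookup(const struct model_cache *c,
                                   uintptr_t addr)
{
    uint32_t lo = 0;
    uint32_t hi = c->len;

    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;

        if (addr < c->entries[mid].start)
            hi = mid;
        else if (addr >= c->entries[mid].end)
            lo = mid + 1;
        else
            return mid;
    }
    return UINT32_MAX;
}
```

In this model, inserting 513 consecutive 4 KB pages fills the table after the
first 256, so a lookup for an address in the last page returns UINT32_MAX. In
the driver, that miss is the point at which mlx5_mr_create() gets called.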
