On 2025/1/1 20:34, Dmitry Kozlyuk wrote:
2024-12-26 16:10 (UTC+0800), Yang Ming:
Fix the issue where OS memory is mistakenly freed with rte_free
by setting the length (len) of unused memseg to 0.

When eal_legacy_hugepage_init releases the VA space for unused
memseg lists, it does not reset their length to 0. As a result,
mlx5_mem_is_rte may incorrectly identify OS memory as DPDK
memory. This can lead to mlx5_free calling rte_free on OS memory,
causing an "EAL: Error: Invalid memory" log and failing to free
the OS memory.

This issue is intermittent and occurs when the DPDK program's
memory map places the heap address range between 0 and len (32G).
In such cases, malloc may return an address less than len,
causing mlx5_mem_is_rte to incorrectly treat it as DPDK memory.

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: anatoly.burakov@intel.com
Cc: stable@dpdk.org

Signed-off-by: Yang Ming <ming.1.yang@nokia-sbell.com>
---
 lib/eal/linux/eal_memory.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
index 45879ca743..9dda60c0e1 100644
--- a/lib/eal/linux/eal_memory.c
+++ b/lib/eal/linux/eal_memory.c
@@ -1472,6 +1472,7 @@ eal_legacy_hugepage_init(void)
 		mem_sz = msl->len;
 		munmap(msl->base_va, mem_sz);
 		msl->base_va = NULL;
+		msl->len = 0;
 		msl->heap = 0;
 
 		/* destroy backing fbarray */
Hi Yang,

It seems the bug affects more than just mlx5 PMD.

Consider how the MSL with `base_va == NULL` ends up in `mlx5_mem_is_rte()`.
It comes from `rte_mem_virt2memseg_list()` which iterates MSLs
and checks that an address belongs to [`base_va`; `base_va+len`)
without checking whether `base_va == NULL`, i.e. whether the MSL is inactive.
Your patch also corrects `rte_mem_virt2memseg_list()` behavior.
Please mention this in the commit message.
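
To illustrate the point, here is a minimal sketch of the range check
described above. It is not the actual EAL code: `struct msl_sketch` and
`virt2msl_sketch()` are hypothetical stand-ins that keep only the
`base_va` and `len` fields, and the small length used in the usage note
is an arbitrary value rather than the real 32G.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memseg list (MSL). */
struct msl_sketch {
	void *base_va; /* NULL once the VA space has been released */
	size_t len;    /* length of the VA range; stale if not reset */
};

/*
 * Range check only, as described above: an address matches a list if it
 * falls in [base_va; base_va+len). There is no test for base_va == NULL,
 * so an inactive list with a stale len covers the range [0; len).
 */
static const struct msl_sketch *
virt2msl_sketch(const struct msl_sketch *lists, size_t n, const void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	size_t i;

	for (i = 0; i < n; i++) {
		uintptr_t start = (uintptr_t)lists[i].base_va;
		uintptr_t end = start + lists[i].len;

		if (a >= start && a < end)
			return &lists[i];
	}
	return NULL;
}
```

With `base_va == NULL` and `len == 0x1000`, a low OS address such as
0x800 matches the inactive list. With `len == 0`, the range is empty and
the lookup returns NULL, which is why resetting `len` is enough to
correct the lookup.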

Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

Hi Dmitry,

Thanks. I will send a new version (v2) of this patch that adds this to the commit log.