2024-12-26 16:10 (UTC+0800), Yang Ming:

Fix the issue where OS memory is mistakenly freed with rte_free by
setting the length (len) of unused memseg lists to 0.

When eal_legacy_hugepage_init releases the VA space for unused memseg
lists, it does not reset their length to 0. As a result,
mlx5_mem_is_rte may incorrectly identify OS memory as DPDK memory.
This can lead to mlx5_free calling rte_free on OS memory, causing an
"EAL: Error: Invalid memory" log and failing to free the OS memory.

This issue is occasional and occurs when the DPDK program’s memory map
places the heap address range between 0 and len (32G). In such cases,
malloc may return an address less than len, causing mlx5_mem_is_rte to
incorrectly treat it as DPDK memory.

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: anatoly.burakov@intel.com
Cc: stable@dpdk.org

Signed-off-by: Yang Ming <ming.1.yang@nokia-sbell.com>
---
 lib/eal/linux/eal_memory.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
index 45879ca743..9dda60c0e1 100644
--- a/lib/eal/linux/eal_memory.c
+++ b/lib/eal/linux/eal_memory.c
@@ -1472,6 +1472,7 @@ eal_legacy_hugepage_init(void)
 		mem_sz = msl->len;
 		munmap(msl->base_va, mem_sz);
 		msl->base_va = NULL;
+		msl->len = 0;
 		msl->heap = 0;
 
 		/* destroy backing fbarray */

Hi Yang,

It seems the bug affects more than just mlx5 PMD.
Consider how the MSL with `base_va == NULL` ends up in
`mlx5_mem_is_rte()`. It comes from `rte_mem_virt2memseg_list()`,
which iterates MSLs and checks that an address belongs to
[`base_va`; `base_va+len`) without checking whether `base_va == NULL`,
i.e. whether the MSL is inactive.
Your patch also corrects `rte_mem_virt2memseg_list()` behavior.
Please mention this in the commit message.

Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

Hi Dmitry,
Thanks. I will send a new version (v2) of this patch that adds this to the commit log.
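
For readers following the thread, the range check discussed above can be sketched in isolation. This is only an illustration, not the DPDK code: `struct sketch_msl` and the `sketch_*` functions are hypothetical stand-ins for `struct rte_memseg_list` and the lookup done by `rte_mem_virt2memseg_list()`, and the sketch assumes NULL converts to address 0, as it does on the platforms DPDK targets.

```c
#include <assert.h>	/* for callers that assert on the result */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_memseg_list. */
struct sketch_msl {
	void *base_va;	/* start of the VA range, NULL once released */
	size_t len;	/* length of the VA range in bytes */
};

/*
 * Range check as described in the review: the address is tested
 * against [base_va; base_va + len) with no check for base_va == NULL.
 */
static int
sketch_addr_in_msl(const struct sketch_msl *msl, const void *addr)
{
	uintptr_t start = (uintptr_t)msl->base_va;
	uintptr_t end = start + msl->len;
	uintptr_t a = (uintptr_t)addr;

	return a >= start && a < end;
}

/* Release that leaves len untouched (behaviour before the patch). */
static void
sketch_release_old(struct sketch_msl *msl)
{
	msl->base_va = NULL;
}

/* Release that also resets len (behaviour with the patch applied). */
static void
sketch_release_new(struct sketch_msl *msl)
{
	msl->base_va = NULL;
	msl->len = 0;
}
```

With a stale `len`, a released list effectively claims the range [0; len), so any low heap address matches it. Once `len` is reset, the range is empty and the same address no longer matches.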