DPDK patches and discussions
From: "Yan, Xiaoping (NSB - CN/Hangzhou)" <xiaoping.yan@nokia-sbell.com>
To: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] dpdk-pdump prints "EAL: Error: Invalid memory"
Date: Mon, 23 Aug 2021 03:56:25 +0000
Message-ID: <1f8ba365b3ea4bcfa3765a742500160d@nokia-sbell.com>
In-Reply-To: <a490537c31c44405bffe9cb7d2611fb3@nokia-sbell.com>

Hi,

Any comments on this issue?
Thank you.

Best regards
Yan Xiaoping

From: Yan, Xiaoping (NSB - CN/Hangzhou)
Sent: August 10, 2021 13:35
To: 'users@dpdk.org' <users@dpdk.org>
Subject: RE: dpdk-pdump prints "EAL: Error: Invalid memory"

Hi,

I modified eal_legacy_hugepage_init as shown below and the problem is solved.
Should this fix be added to DPDK upstream?

diff --git a/package/dpdk/dpdk-20.11/lib/librte_eal/linux/eal_memory.c b/package/dpdk/dpdk-20.11/lib/librte_eal/linux/eal_memory.c
index 03a4f2dd..89a13e91 100644
--- a/package/dpdk/dpdk-20.11/lib/librte_eal/linux/eal_memory.c
+++ b/package/dpdk/dpdk-20.11/lib/librte_eal/linux/eal_memory.c
@@ -1458,6 +1458,7 @@ eal_legacy_hugepage_init(void)
                mem_sz = msl->len;
                munmap(msl->base_va, mem_sz);
                msl->base_va = NULL;
+               msl->len = 0;
                msl->heap = 0;

                /* destroy backing fbarray */


Best regards
Yan Xiaoping

From: Yan, Xiaoping (NSB - CN/Hangzhou)
Sent: August 4, 2021 15:14
To: 'users@dpdk.org' <users@dpdk.org>
Subject: dpdk-pdump prints "EAL: Error: Invalid memory"

Hi,

After updating DPDK from 19.11 to 20.11,
dpdk-pdump prints the following error:
EAL: Error: Invalid memory
Port 7 MAC: 02 70 63 61 70 03
core (2), capture for (1) tuples
- port 0 device ((null)) queue 65535
^C

--legacy-mem is used for both the primary process and dpdk-pdump.
With some debugging, I found that mlx5_mem_is_rte incorrectly treats this address from OS memory (addr=0x4482b80) as an rte address, so mlx5_free
calls rte_free() to free it, which causes the error.
This seems to be because the len of some unused memseg lists is not set to 0 (so rte_mem_virt2memseg_list(0x4482b80) returns a memseg list).
Here are the memseg lists:
(gdb) p mcfg->memsegs
$3 = {{{base_va = 0x2aac0000000, addr_64 = 2932388921344}, page_sz = 1073741824, socket_id = 0,
    version = 0, len = 34359738368, external = 0, heap = 1, memseg_arr = {
      name = "memseg-1048576k-0-0", '\000' <repeats 44 times>, count = 5, len = 32,
      elt_sz = 48, data = 0x2aaa302e000, rwlock = {cnt = 0}}}, {{base_va = 0x0, addr_64 = 0},
    page_sz = 1073741824, socket_id = 0, version = 0, len = 34359738368, external = 0,
    heap = 0, memseg_arr = {name = '\000' <repeats 63 times>, count = 0, len = 0, elt_sz = 0,
      data = 0x0, rwlock = {cnt = 0}}}, {{base_va = 0x0, addr_64 = 0}, page_sz = 1073741824,
    socket_id = 1, version = 0, len = 34359738368, external = 0, heap = 0, memseg_arr = {
      name = '\000' <repeats 63 times>, count = 0, len = 0, elt_sz = 0, data = 0x0, rwlock = {
        cnt = 0}}}, {{base_va = 0x0, addr_64 = 0}, page_sz = 1073741824, socket_id = 1,
    version = 0, len = 34359738368, external = 0, heap = 0, memseg_arr = {
      name = '\000' <repeats 63 times>, count = 0, len = 0, elt_sz = 0, data = 0x0, rwlock = {
        cnt = 0}}}, {{base_va = 0x0, addr_64 = 0}, page_sz = 0, socket_id = 0, version = 0,
    len = 0, external = 0, heap = 0, memseg_arr = {name = '\000' <repeats 63 times>, count = 0,
      len = 0, elt_sz = 0, data = 0x0, rwlock = {cnt = 0}}} <repeats 124 times>}
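
The false match described above can be reproduced with a small stand-alone model. This is only a sketch: it assumes the lookup is a plain range check on [base_va, base_va + len), and model_msl, model_virt2msl and model_stale_len_demo are hypothetical names rather than DPDK APIs. The numeric values are copied from the gdb dump.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct rte_memseg_list: only the two fields the
 * range check needs. This is not the real DPDK definition. */
struct model_msl {
	uint64_t base_va; /* start of the reserved VA range, 0 if unmapped */
	uint64_t len;     /* length of the VA range in bytes */
};

/* Hypothetical model of the lookup: return the index of the first list
 * whose [base_va, base_va + len) range contains addr, or -1 if none. */
static int
model_virt2msl(const struct model_msl *lists, int n, uint64_t addr)
{
	for (int i = 0; i < n; i++) {
		uint64_t start = lists[i].base_va;
		uint64_t end = start + lists[i].len;

		if (addr >= start && addr < end)
			return i;
	}
	return -1;
}

/* Values from the gdb dump: list 0 is in use, list 1 has base_va = 0
 * but still carries len = 32 GiB. */
static void
model_stale_len_demo(void)
{
	const struct model_msl lists[] = {
		{ 0x2aac0000000ULL, 34359738368ULL },
		{ 0, 34359738368ULL },
	};

	/* 0x4482b80 lies inside [0, 32 GiB), so the stale list 1 matches. */
	assert(model_virt2msl(lists, 2, 0x4482b80) == 1);
}
```

Calling model_stale_len_demo() from main() passes silently, i.e. under this model the address is attributed to a list that no longer maps any memory.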

Here is the stack trace:
(gdb) bt
#0  mlx5_free (addr=0x4482b80) at ../dpdk-20.11/drivers/common/mlx5/mlx5_malloc.c:260
#1  0x0000000000706f5c in mlx5_mp_req_verbs_cmd_fd (mp_id=mp_id@entry=0x7ffcdb6d9e50)
    at ../dpdk-20.11/drivers/common/mlx5/mlx5_common_mp.c:140
#2  0x000000000050496f in mlx5_dev_spawn (config=0x7ffcdb6d9d70, spawn=0x2ab753799c0,
    dpdk_dev=0x4491400) at ../dpdk-20.11/drivers/net/mlx5/linux/mlx5_os.c:774
#3  mlx5_os_pci_probe (pci_drv=<optimized out>, pci_dev=<optimized out>)
    at ../dpdk-20.11/drivers/net/mlx5/linux/mlx5_os.c:2154
#4  0x0000000000708b5a in drivers_probe (user_classes=1, pci_dev=0x44913f0,
    pci_drv=0xe01800 <mlx5_pci_driver>, dev=0x2ab75379a80)
    at ../dpdk-20.11/drivers/common/mlx5/mlx5_common_pci.c:246
#5  mlx5_common_pci_probe (pci_drv=0xe01800 <mlx5_pci_driver>, pci_dev=0x44913f0)
    at ../dpdk-20.11/drivers/common/mlx5/mlx5_common_pci.c:308
#6  0x00000000004268f9 in rte_pci_probe_one_driver (dev=0x44913f0,
    dr=0xe01800 <mlx5_pci_driver>) at ../dpdk-20.11/drivers/bus/pci/pci_common.c:243
#7  pci_probe_all_drivers (dev=0x44913f0) at ../dpdk-20.11/drivers/bus/pci/pci_common.c:318
#8  pci_probe () at ../dpdk-20.11/drivers/bus/pci/pci_common.c:345
#9  0x00000000006bc4d3 in rte_bus_probe ()
    at ../dpdk-20.11/lib/librte_eal/common/eal_common_bus.c:72
#10 0x0000000000422304 in rte_eal_init (argc=argc@entry=16, argv=argv@entry=0x7ffcdb6da530)
    at ../dpdk-20.11/lib/librte_eal/linux/eal.c:1210
#11 0x000000000056fee9 in main (argc=16, argv=0x7ffcdb6da748)
    at ../dpdk-20.11/app/pdump/main.c:1118
(gdb) c
Continuing.

Thread 1 "dpdk-pdump" hit Breakpoint 5, rte_free (addr=0x4482b80)


It seems to me that the following code from eal_legacy_hugepage_init should also set len to 0?
              /* we're not going to allocate more pages, so release VA space for
              * unused memseg lists
              */
              for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
                             struct rte_memseg_list *msl = &mcfg->memsegs[i];
                             size_t mem_sz;

                             /* skip inactive lists */
                             if (msl->base_va == NULL)
                                           continue;
                             /* skip lists where there is at least one page allocated */
                             if (msl->memseg_arr.count > 0)
                                           continue;
                             /* this is an unused list, deallocate it */
                             mem_sz = msl->len;
                             munmap(msl->base_va, mem_sz);
                             msl->base_va = NULL;
                             // here, we should add msl->len = 0; ?
                             msl->heap = 0;

                             /* destroy backing fbarray */
                             rte_fbarray_destroy(&msl->memseg_arr);
              }
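
To illustrate the effect of the proposed change, here is a reduced model of the loop above together with a range-check lookup. It is a sketch under assumptions: sketch_msl, sketch_release_unused, sketch_lookup and sketch_fix_demo are hypothetical names, munmap() and rte_fbarray_destroy() are left out, and the pre-release base_va of the unused list (0x2b2c0000000) is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Reduced stand-in for struct rte_memseg_list (not the real layout). */
struct sketch_msl {
	uint64_t base_va;   /* 0 means the VA range is not mapped */
	uint64_t len;       /* size of the VA range in bytes */
	unsigned int count; /* pages in use, like memseg_arr.count */
	int heap;
};

/* Model of the release loop, without munmap() and rte_fbarray_destroy().
 * clear_len selects whether the proposed "msl->len = 0" is applied. */
static void
sketch_release_unused(struct sketch_msl *lists, int n, int clear_len)
{
	for (int i = 0; i < n; i++) {
		struct sketch_msl *msl = &lists[i];

		if (msl->base_va == 0) /* skip inactive lists */
			continue;
		if (msl->count > 0)    /* skip lists with pages allocated */
			continue;
		msl->base_va = 0;
		if (clear_len)
			msl->len = 0;  /* the proposed one-line fix */
		msl->heap = 0;
	}
}

/* Range-check lookup, assumed to resemble rte_mem_virt2memseg_list(). */
static int
sketch_lookup(const struct sketch_msl *lists, int n, uint64_t addr)
{
	for (int i = 0; i < n; i++) {
		if (addr >= lists[i].base_va &&
		    addr < lists[i].base_va + lists[i].len)
			return i;
	}
	return -1;
}

static void
sketch_fix_demo(void)
{
	/* List 0 has pages in use; list 1 is reserved but unused. */
	struct sketch_msl without_fix[] = {
		{ 0x2aac0000000ULL, 34359738368ULL, 5, 1 },
		{ 0x2b2c0000000ULL, 34359738368ULL, 0, 1 },
	};
	struct sketch_msl with_fix[] = {
		{ 0x2aac0000000ULL, 34359738368ULL, 5, 1 },
		{ 0x2b2c0000000ULL, 34359738368ULL, 0, 1 },
	};

	sketch_release_unused(without_fix, 2, 0);
	sketch_release_unused(with_fix, 2, 1);

	/* Stale len: the OS address is matched by the released list. */
	assert(sketch_lookup(without_fix, 2, 0x4482b80) == 1);
	/* len cleared: the released list covers an empty range, no match. */
	assert(sketch_lookup(with_fix, 2, 0x4482b80) == -1);
}
```

In this model the only difference between the two runs is the cleared len, which is what makes the released list stop matching low addresses.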

Any comments?
Thank you.


Best regards
Yan Xiaoping

