DPDK patches and discussions
From: bugzilla@dpdk.org
To: dev@dpdk.org
Subject: [dpdk-dev] [Bug 56] crash when freeing memory with no mlx5 device attached
Date: Wed, 30 May 2018 13:39:45 +0000
Message-ID: <bug-56-3@http.dpdk.org/tracker/>

https://dpdk.org/tracker/show_bug.cgi?id=56

            Bug ID: 56
           Summary: crash when freeing memory with no mlx5 device attached
           Product: DPDK
           Version: 18.05
          Hardware: All
                OS: All
            Status: CONFIRMED
          Severity: critical
          Priority: Normal
         Component: other
          Assignee: dev@dpdk.org
          Reporter: david.marchand@6wind.com
  Target Milestone: ---

The crash occurs when a memory free event reaches the mlx5 callback
before any mlx5 device has been initialised.

Looking at the code, the mlx5 driver always registers a memory callback:

RTE_INIT(rte_mlx5_pmd_init);
static void
rte_mlx5_pmd_init(void)
{
...
        rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
                                        mlx5_mr_mem_event_cb, NULL);
}
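
To illustrate why the callback is registered even when no device is ever
probed, here is a minimal standalone sketch. It assumes RTE_INIT() behaves
like a GCC/Clang constructor function that runs before main(). Every name in
it (demo_pmd_init, demo_mem_event_cb, registered_cb, device_probed) is
hypothetical and stands in for the driver and EAL, it is not DPDK code.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names are DPDK APIs. */
typedef void (*mem_event_cb_t)(int event, const void *addr, size_t len);

static mem_event_cb_t registered_cb; /* models the EAL callback list */
static int device_probed;            /* stays 0: nothing is probed here */

static void demo_mem_event_cb(int event, const void *addr, size_t len)
{
	(void)event;
	(void)addr;
	(void)len;
}

/* Runs before main(), which is the assumed behaviour of RTE_INIT(). */
__attribute__((constructor))
static void demo_pmd_init(void)
{
	registered_cb = demo_mem_event_cb;
}

/* Returns 1 when a callback is registered although nothing was probed. */
int callback_registered_without_probe(void)
{
	return registered_cb != NULL && !device_probed;
}
```

In this model, callback_registered_without_probe() is already true when
main() starts, so any later memory event reaches the callback regardless of
whether a probe happened.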

When invoked, this callback tries to take a lock:

void
mlx5_mr_mem_event_cb(enum rte_mem_event event_type, const void *addr,
                     size_t len, void *arg __rte_unused)
{
        struct priv *priv;
        struct mlx5_dev_list *dev_list = &mlx5_shared_data->mem_event_cb_list;

        switch (event_type) {
        case RTE_MEM_EVENT_FREE:
                rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
                /* Iterate all the existing mlx5 devices. */

But this lock is not initialised unless an mlx5 device has been probed, since
it is initialised in mlx5_prepare_shared_data(), which is called from
mlx5_pci_probe().
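
One possible way to avoid the crash is to have the callback return early
while the shared data is not set up. The sketch below is a single-threaded toy
model of that idea, not the mlx5 code and not necessarily the fix adopted for
this bug. It assumes the shared-data pointer is NULL until probe, which this
report does not state explicitly. All names (demo_shared_data, demo_probe,
demo_mem_free_cb) are hypothetical, and the "lock" is a plain counter rather
than a real rwlock.

```c
#include <stddef.h>

/* Toy lock: a negative count marks a writer (single-threaded model). */
struct demo_rwlock {
	int cnt;
};

/* Hypothetical model of the driver state; not the mlx5 definitions. */
struct demo_shared_data {
	struct demo_rwlock mem_event_rwlock;
	int free_events_seen;
};

static struct demo_shared_data demo_storage;
static struct demo_shared_data *demo_shared; /* assumed NULL until probe */

/* Models the probe path: the only place where the lock is initialised. */
int demo_probe(void)
{
	demo_storage.mem_event_rwlock.cnt = 0;
	demo_storage.free_events_seen = 0;
	demo_shared = &demo_storage;
	return 0;
}

/* Models the callback: 0 if handled, -1 if ignored (nothing probed). */
int demo_mem_free_cb(const void *addr, size_t len)
{
	(void)addr;
	(void)len;
	if (demo_shared == NULL)
		return -1; /* no device yet: do not touch the lock */
	demo_shared->mem_event_rwlock.cnt = -1; /* take the write lock */
	demo_shared->free_events_seen++;
	demo_shared->mem_event_rwlock.cnt = 0;  /* release it */
	return 0;
}
```

An alternative with the same effect would be to register the callback only
once the first device has been probed, so that it can never run against
uninitialised state.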


The issue is not straightforward to reproduce, so I forced an allocation
followed by a free in the testpmd code to make sure a free event would be
triggered:

root@ubuntu1604:~/dpdk# git diff
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 35cf266..79c9531 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2772,6 +2772,8 @@ main(int argc, char** argv)
        }
 #endif

+       rte_free(rte_malloc(NULL, 10000000, 0));
+
 #ifdef RTE_LIBRTE_CMDLINE
        if (strlen(cmdline_filename) != 0)
                cmdline_read_from_file(cmdline_filename);


Then:

root@ubuntu1604:~/dpdk# LD_LIBRARY_PATH=/root/rdma-core/build/lib ./build/app/testpmd --log-level .*,8 -c 0x6 -- -i --total-num-mbufs 2048
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
...
EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Restoring previous memory policy: 0
EAL: Calling mem event callback 'MLX5_MEM_EVENT_CB:(nil)'EAL: request:
mp_malloc_sync
EAL: Heap on socket 0 was expanded by 90MB
Interactive-mode selected
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=2048, size=2176,
socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Restoring previous memory policy: 0
EAL: alloc_pages_on_heap(): couldn't allocate physically contiguous space
EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Restoring previous memory policy: 0
EAL: Calling mem event callback 'MLX5_MEM_EVENT_CB:(nil)'EAL: request:
mp_malloc_sync
EAL: Heap on socket 0 was expanded by 8MB
Done
EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Restoring previous memory policy: 0
EAL: Calling mem event callback 'MLX5_MEM_EVENT_CB:(nil)'EAL: request:
mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback 'MLX5_MEM_EVENT_CB:(nil)'Segmentation fault
(core dumped)


root@ubuntu1604:~/dpdk# gdb ./build/app/testpmd core
...
Core was generated by `./build/app/testpmd --log-level .*,8 -c 0x6 -- -i
--total-num-mbufs 2048'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  rte_rwlock_write_lock (rwl=<optimized out>) at
/root/dpdk/build/include/generic/rte_rwlock.h:103
103                     x = rwl->cnt;
[Current thread is 1 (Thread 0x7f1871022c00 (LWP 5732))]
(gdb) bt
#0  rte_rwlock_write_lock (rwl=<optimized out>) at
/root/dpdk/build/include/generic/rte_rwlock.h:103
#1  mlx5_mr_mem_event_cb (event_type=RTE_MEM_EVENT_FREE, addr=0x7f1474a00000,
len=10485760, arg=<optimized out>) at /root/dpdk/drivers/net/mlx5/mlx5_mr.c:884
#2  0x000000000054ae86 in eal_memalloc_mem_event_notify ()
#3  0x0000000000558994 in malloc_heap_free ()
#4  0x000000000055445f in rte_free ()
#5  0x0000000000477231 in main ()

