DPDK patches and discussions
* [dpdk-dev] struct malloc_elem overrun/corruption
@ 2018-11-26 23:44 He Huang
  2018-11-27 10:38 ` Burakov, Anatoly
  0 siblings, 1 reply; 3+ messages in thread
From: He Huang @ 2018-11-26 23:44 UTC (permalink / raw)
  To: dev

Hi,

I’ve been troubleshooting a possible memory allocator corruption:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffefdf0700 (LWP 1079)]
0x00000000004794ee in malloc_elem_free_list_insert (elem=0x7ff82d265000) at dpdk/lib/librte_eal/common/malloc_elem.c:292
292             LIST_INSERT_HEAD(&elem->heap->free_head[idx], elem, free_list);
(gdb) bt
#0  0x00000000004794ee in malloc_elem_free_list_insert (elem=0x7ff82d265000) at dpdk/lib/librte_eal/common/malloc_elem.c:292
#1  0x0000000000479971 in malloc_elem_free (elem=0x7ff82d265000) at dpdk/lib/librte_eal/common/malloc_elem.c:448
#2  0x000000000047b054 in malloc_heap_free (elem=0x7ff82d265fc0) at dpdk/lib/librte_eal/common/malloc_heap.c:628
#3  0x00000000004787f5 in rte_free (addr=0x7ff82d266000) at dpdk/lib/librte_eal/common/rte_malloc.c:32

It looks like the first field of struct malloc_elem (the heap pointer, struct malloc_heap *heap) was corrupted: 0x9e0 is not a plausible address. Every other field looks sane:
(gdb) p *elem
$2 = {
  heap = 0x9e0,
  prev = 0x7ff82d254fc0,
  next = 0x7ff84ce9a000,
  free_list = {
    le_next = 0x7ff873c89000,
    le_prev = 0x7ff82bcbf018
  },
  msl = 0x7ffff7f3d07c,
  state = ELEM_FREE,
  pad = 0,
  size = 532893696
}
(gdb) p *elem->prev
$3 = {
  heap = 0x7ffff7f3f67c,
  prev = 0x7ff82ce14000,
  next = 0x7ff82d265000,
  free_list = {
    le_next = 0x0,
    le_prev = 0x0
  },
  msl = 0x7ffff7f3d07c,
  state = ELEM_BUSY,
  pad = 0,
  size = 65600
}

I haven’t completely ruled out that my own code has a buffer overrun that corrupted the first field of the malloc_elem object, but I’m beginning to suspect an internal DPDK corruption. The DPDK code isn’t the latest, but it includes the malloc fixes up to commit 9554dbb50a8a22942128a0e5bcb52243a4f723ab.

Any ideas or suggestions are greatly appreciated! By the way, this is DMA memory, so I can’t simply switch to malloc/free and debug with standard memory debuggers.

Thanks,
Isaac


* Re: [dpdk-dev] struct malloc_elem overrun/corruption
  2018-11-26 23:44 [dpdk-dev] struct malloc_elem overrun/corruption He Huang
@ 2018-11-27 10:38 ` Burakov, Anatoly
  2018-11-27 10:40   ` Burakov, Anatoly
  0 siblings, 1 reply; 3+ messages in thread
From: Burakov, Anatoly @ 2018-11-27 10:38 UTC (permalink / raw)
  To: He Huang, dev

Hi Isaac,

You might want to look into enabling malloc debug options.

-- 
Thanks,
Anatoly


* Re: [dpdk-dev] struct malloc_elem overrun/corruption
  2018-11-27 10:38 ` Burakov, Anatoly
@ 2018-11-27 10:40   ` Burakov, Anatoly
  0 siblings, 0 replies; 3+ messages in thread
From: Burakov, Anatoly @ 2018-11-27 10:40 UTC (permalink / raw)
  To: He Huang, dev

Also, it might be useful to apply this commit and check whether the crash still happens:

71aae4b421da9b741d1fb73a190a9facfec555b9

http://patches.dpdk.org/patch/48080/

-- 
Thanks,
Anatoly
