From: "Burakov, Anatoly"
To: He Huang , "dev@dpdk.org"
Date: Tue, 27 Nov 2018 10:40:58 +0000
Subject: Re: [dpdk-dev] struct malloc_elem overrun/corruption
Message-ID: <074277cf-51e6-65f3-45b9-a4e6e3c6f562@intel.com>
In-Reply-To: <22b4fc8e-8280-d699-f669-e7fdcffac82b@intel.com>
References: <6F164AE3-2B39-41CD-A70A-C9D2DE61E902@ddn.com> <22b4fc8e-8280-d699-f669-e7fdcffac82b@intel.com>

On 27-Nov-18 10:38 AM, Burakov, Anatoly wrote:
> On 26-Nov-18 11:44 PM, He Huang wrote:
>> Hi,
>>
>> I’ve been troubleshooting a possible memory allocator corruption:
>> Program received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 0x7fffefdf0700 (LWP 1079)]
>> 0x00000000004794ee in malloc_elem_free_list_insert
>> (elem=0x7ff82d265000) at dpdk/lib/librte_eal/common/malloc_elem.c:292
>> 292             LIST_INSERT_HEAD(&elem->heap->free_head[idx], elem,
>> free_list);
>> (gdb) bt
>> #0  0x00000000004794ee in malloc_elem_free_list_insert
>> (elem=0x7ff82d265000) at dpdk/lib/librte_eal/common/malloc_elem.c:292
>> #1  0x0000000000479971 in malloc_elem_free (elem=0x7ff82d265000) at
>> dpdk/lib/librte_eal/common/malloc_elem.c:448
>> #2  0x000000000047b054 in malloc_heap_free (elem=0x7ff82d265fc0) at
>> dpdk/lib/librte_eal/common/malloc_heap.c:628
>> #3  0x00000000004787f5 in rte_free (addr=0x7ff82d266000) at
>> dpdk/lib/librte_eal/common/rte_malloc.c:32
>>
>> Looked like the 1st field of struct malloc_elem (i.e. the heap
>> pointer: struct malloc_heap *heap) was corrupted. Everything else
>> looked good:
>> (gdb) p *elem
>> $2 = {
>>    heap = 0x9e0,
>>    prev = 0x7ff82d254fc0,
>>    next = 0x7ff84ce9a000,
>>    free_list = {
>>      le_next = 0x7ff873c89000,
>>      le_prev = 0x7ff82bcbf018
>>    },
>>    msl = 0x7ffff7f3d07c,
>>    state = ELEM_FREE,
>>    pad = 0,
>>    size = 532893696
>> }
>> (gdb) p *elem->prev
>> $3 = {
>>    heap = 0x7ffff7f3f67c,
>>    prev = 0x7ff82ce14000,
>>    next = 0x7ff82d265000,
>>    free_list = {
>>      le_next = 0x0,
>>      le_prev = 0x0
>>    },
>>    msl = 0x7ffff7f3d07c,
>>    state = ELEM_BUSY,
>>    pad = 0,
>>    size = 65600
>> }
>>
>> I haven’t completely ruled out my own code had a buffer overrun and
>> corrupted the first field of malloc_elem object yet, but I’m beginning
>> to look at it as a possible DPDK internal corruption. The DPDK code
>> isn’t the latest but it had malloc fixes up to commit
>> 9554dbb50a8a22942128a0e5bcb52243a4f723ab.
>>
>> Ideas/suggestions greatly appreciated! BTW it’s DMA memory so I
>> couldn’t just use malloc/free and debug with standard memory debuggers.
>>
>> Thanks,
>> Isaac
>>
>
> Hi Isaac,
>
> You might want to look into enabling malloc debug options.
>

Also, it might be useful to apply this commit and check if it still happens:

71aae4b421da9b741d1fb73a190a9facfec555b9

http://patches.dpdk.org/patch/48080/

--
Thanks,
Anatoly
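
For reference, the "everything else looked good" observation in the quoted gdb dump can be checked arithmetically. The sketch below is only illustrative: struct elem_snapshot and the helper names are made up for this example and are not DPDK's struct malloc_elem or any DPDK API. It copies the values printed by gdb and assumes that elements linked through prev/next belong to the same heap, so their heap pointers should match.

```c
#include <stdint.h>

/* Hypothetical snapshot of the fields gdb printed; this is NOT the real
 * struct malloc_elem layout, just the numbers needed for the checks. */
struct elem_snapshot {
    uint64_t addr;  /* address of the element header */
    uint64_t heap;  /* value of the heap pointer */
    uint64_t prev;  /* address of the previous element */
    uint64_t next;  /* address of the next element */
    uint64_t size;  /* element size in bytes, header included */
};

/* Values from "p *elem->prev" in the quoted dump. */
const struct elem_snapshot snap_prev = {
    .addr = 0x7ff82d254fc0, .heap = 0x7ffff7f3f67c,
    .prev = 0x7ff82ce14000, .next = 0x7ff82d265000,
    .size = 65600,
};

/* Values from "p *elem" in the quoted dump. */
const struct elem_snapshot snap_elem = {
    .addr = 0x7ff82d265000, .heap = 0x9e0,
    .prev = 0x7ff82d254fc0, .next = 0x7ff84ce9a000,
    .size = 532893696,
};

/* The two elements must point at each other, and the lower one must end
 * exactly where the higher one starts. */
int links_consistent(const struct elem_snapshot *lo,
                     const struct elem_snapshot *hi)
{
    return lo->next == hi->addr &&
           hi->prev == lo->addr &&
           lo->addr + lo->size == hi->addr;
}

/* An element's start plus its size should land on its next pointer. */
int ends_at_next(const struct elem_snapshot *e)
{
    return e->addr + e->size == e->next;
}

/* Assumption: neighbouring elements share one heap pointer. */
int same_heap(const struct elem_snapshot *a, const struct elem_snapshot *b)
{
    return a->heap == b->heap;
}
```

With the values from the dump, links_consistent(&snap_prev, &snap_elem) and ends_at_next() hold for both elements, and only same_heap() fails. That is consistent with just the first field of the element having been overwritten.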