From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga03.intel.com (mga03.intel.com [134.134.136.65])
 by dpdk.org (Postfix) with ESMTP id 4EDA71B3A4
 for ; Tue, 27 Nov 2018 11:38:11 +0100 (CET)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga003.jf.intel.com ([10.7.209.27])
 by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 27 Nov 2018 02:38:11 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.56,286,1539673200"; d="scan'208";a="103779579"
Received: from aburakov-mobl1.ger.corp.intel.com (HELO [10.237.220.109]) ([10.237.220.109])
 by orsmga003.jf.intel.com with ESMTP; 27 Nov 2018 02:38:10 -0800
To: He Huang , "dev@dpdk.org"
References: <6F164AE3-2B39-41CD-A70A-C9D2DE61E902@ddn.com>
From: "Burakov, Anatoly"
Message-ID: <22b4fc8e-8280-d699-f669-e7fdcffac82b@intel.com>
Date: Tue, 27 Nov 2018 10:38:09 +0000
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To: <6F164AE3-2B39-41CD-A70A-C9D2DE61E902@ddn.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Subject: Re: [dpdk-dev] struct malloc_elem overrun/corruption
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 27 Nov 2018 10:38:12 -0000

On 26-Nov-18 11:44 PM, He Huang wrote:
> Hi,
>
> I’ve been troubleshooting a possible memory allocator corruption:
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7fffefdf0700 (LWP 1079)]
> 0x00000000004794ee in malloc_elem_free_list_insert (elem=0x7ff82d265000) at dpdk/lib/librte_eal/common/malloc_elem.c:292
> 292         LIST_INSERT_HEAD(&elem->heap->free_head[idx], elem, free_list);
> (gdb) bt
> #0  0x00000000004794ee in malloc_elem_free_list_insert (elem=0x7ff82d265000) at dpdk/lib/librte_eal/common/malloc_elem.c:292
> #1  0x0000000000479971 in malloc_elem_free (elem=0x7ff82d265000) at dpdk/lib/librte_eal/common/malloc_elem.c:448
> #2  0x000000000047b054 in malloc_heap_free (elem=0x7ff82d265fc0) at dpdk/lib/librte_eal/common/malloc_heap.c:628
> #3  0x00000000004787f5 in rte_free (addr=0x7ff82d266000) at dpdk/lib/librte_eal/common/rte_malloc.c:32
>
> Looked like the 1st field of struct malloc_elem (i.e. the heap pointer: struct malloc_heap *heap) was corrupted. Everything else looked good:
> (gdb) p *elem
> $2 = {
>   heap = 0x9e0,
>   prev = 0x7ff82d254fc0,
>   next = 0x7ff84ce9a000,
>   free_list = {
>     le_next = 0x7ff873c89000,
>     le_prev = 0x7ff82bcbf018
>   },
>   msl = 0x7ffff7f3d07c,
>   state = ELEM_FREE,
>   pad = 0,
>   size = 532893696
> }
> (gdb) p *elem->prev
> $3 = {
>   heap = 0x7ffff7f3f67c,
>   prev = 0x7ff82ce14000,
>   next = 0x7ff82d265000,
>   free_list = {
>     le_next = 0x0,
>     le_prev = 0x0
>   },
>   msl = 0x7ffff7f3d07c,
>   state = ELEM_BUSY,
>   pad = 0,
>   size = 65600
> }
>
> I haven’t completely ruled out my own code had a buffer overrun and corrupted the first field of malloc_elem object yet, but I’m beginning to look at it as a possible DPDK internal corruption. The DPDK code isn’t the latest but it had malloc fixes up to commit 9554dbb50a8a22942128a0e5bcb52243a4f723ab.
>
> Ideas/suggestions greatly appreciated! BTW it’s DMA memory so I couldn’t just use malloc/free and debug with standard memory debuggers.
>
> Thanks,
> Isaac
>

Hi Isaac,

You might want to look into enabling malloc debug options.

--
Thanks,
Anatoly
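
For context on the suggestion above: allocator debug options of this kind usually work by placing known "cookie" values around each allocation and checking them later, so that a write past either end of a buffer is caught when the block is validated or freed. In make-based DPDK builds of this era the relevant build-time switch is, to the best of our knowledge, CONFIG_RTE_MALLOC_DEBUG. The sketch below illustrates the general technique only. It is not DPDK's implementation, it wraps plain malloc()/free() rather than DMA-capable memory, and every name and cookie value in it (guard_hdr, guarded_alloc, guarded_check, guarded_free) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cookie values chosen for this sketch; not DPDK's. */
#define HEADER_COOKIE  0x5ca1ab1e0ddba11ULL
#define TRAILER_COOKIE 0x0ddba115ca1ab1eULL

/* Hypothetical header stored immediately before each user buffer. */
struct guard_hdr {
    uint64_t cookie; /* equals HEADER_COOKIE while the block is intact */
    size_t size;     /* user-visible size in bytes */
};

/* Allocate size bytes with a cookie before and after the user region. */
void *guarded_alloc(size_t size)
{
    struct guard_hdr *h = malloc(sizeof(*h) + size + sizeof(uint64_t));
    if (h == NULL)
        return NULL;
    h->cookie = HEADER_COOKIE;
    h->size = size;
    /* The trailer may be unaligned, so copy it byte-wise. */
    uint64_t t = TRAILER_COOKIE;
    memcpy((char *)(h + 1) + size, &t, sizeof(t));
    return h + 1;
}

/*
 * Return 0 if both cookies are intact, -1 if the header cookie was
 * overwritten (underrun, or an overrun from the preceding block), and
 * -2 if the trailer cookie was overwritten (overrun of this block).
 */
int guarded_check(const void *ptr)
{
    const struct guard_hdr *h = (const struct guard_hdr *)ptr - 1;
    uint64_t t;

    if (h->cookie != HEADER_COOKIE)
        return -1;
    memcpy(&t, (const char *)ptr + h->size, sizeof(t));
    return t == TRAILER_COOKIE ? 0 : -2;
}

/* Free a block, failing loudly in debug builds if it was corrupted. */
void guarded_free(void *ptr)
{
    if (ptr == NULL)
        return;
    assert(guarded_check(ptr) == 0);
    free((struct guard_hdr *)ptr - 1);
}
```

Calling a check like guarded_check() on suspect buffers between operations, rather than only at free time, narrows down which write caused the damage. In the trace above only one header field (heap) looks wrong, which is the kind of partial overwrite a header cookie is meant to catch.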