From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id BCD842BF4 for ; Thu, 19 Jul 2018 11:48:34 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 19 Jul 2018 02:48:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,374,1526367600"; d="scan'208";a="73664723" Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.237.220.102]) ([10.237.220.102]) by fmsmga001.fm.intel.com with ESMTP; 19 Jul 2018 02:48:30 -0700 To: Stephen Hemminger , Andrew Rybchenko Cc: "dev@dpdk.org" , Sergio Gonzalez Monroy References: <8bc76811-ac29-d7f2-e4c3-12b50fd44dba@solarflare.com> <58e5044c-3d13-9171-4168-b4d6b1d61927@intel.com> <20180718135817.66728c37@xeon-e3> From: "Burakov, Anatoly" Message-ID: <71478802-911e-676a-93a3-82c550cc0ee9@intel.com> Date: Thu, 19 Jul 2018 10:48:29 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] Memory allocated using rte_zmalloc() has non-zeros X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Jul 2018 09:48:35 -0000 On 19-Jul-18 10:01 AM, Burakov, Anatoly wrote: > On 18-Jul-18 9:58 PM, Stephen Hemminger wrote: >> On Wed, 18 Jul 2018 22:52:12 +0300 >> Andrew Rybchenko wrote: >> >>> On 18.07.2018 20:18, Burakov, Anatoly wrote: >>>> On 18-Jul-18 4:20 PM, Andrew Rybchenko wrote: >>>>> Hi Anatoly, >>>>> >>>>> I'm investigating issue which finally comes to the fact that memory >>>>> allocated using >>>>> rte_zmalloc() has non zeros. >>>>> >>>>> If I add memset just after allocation, everything is perfect and >>>>> works fine. >>>>> >>>>> I've found out that memset was removed from rte_zmalloc_socket() some >>>>> time ago: >>>>>   >>> >>>>> commit b78c9175118f7d61022ddc5c62ce54a1bd73cea5 >>>>> Author: Sergio Gonzalez Monroy >>>>> Date:   Tue Jul 5 12:01:16 2016 +0100 >>>>> >>>>>       mem: do not zero out memory on zmalloc >>>>> >>>>>       Zeroing out memory on rte_zmalloc_socket is not required anymore >>>>> since all >>>>>       allocated memory is already zeroed. >>>>> >>>>>       Signed-off-by: Sergio Gonzalez Monroy >>>>> >>>>> <<< >>>>> >>>>> but may be something has changed now that made above statement false. >>>>> >>>>> I observe the problem when memory is reallocated. I.e. I configure 7 >>>>> queues, >>>>> start, stop, reconfigure to 3 queues, start. Memory is allocated on >>>>> start and >>>>> freed on stop, since we have less queues on the second start it is >>>>> allocated >>>>> in a different way and reuses previously allocated/freed memory. >>>>> >>>>> Do you have any ideas what could be wrong? >>>>> >>>>> Andrew. >>>>> >>>> >>>> Hi Andrew, >>>> >>>> I will look into it first thing tomorrow. In general, we memset(0) on >>>> free, and kernel gives us zeroed out pages initially, so the most >>>> likely point of failure is that i'm not overwring some malloc headers >>>> correctly on free. >>> >>> OK, at least now I know how it is supposed to work in theory. >>> >>> The following region was allocated  (the second number below is pointer >>> plus size) >>> ALLOC 0x7fffa3264080-0x7fffa32640b8 >>> >>> Not zerod address is 16 bytes before: >>> (gdb) p/x ((uint64_t *)0x7fffa3264070)[0] >>> $4 = 0x4000000002 >>> (gdb) p/x ((uint64_t  *)0x7fffa3264070)[1] >>> $5 = 0x80 >>> >>> then freed >>> FREE 0x7fffa3264080-0x7fffa32640b8 >>> >>> but above values (gdb) are still the same >>> then it is allocated as the part of bigger memory chunk >>> ALLOC 0x7fffa3245b80-0x7fffa3265fd8 >>> which should contain zeros, but above values are still the same. >>> >>> It is interesting that it looks like it was the first block freed on the >>> port stop. I'm not 100% sure since I've put printouts to my allocation >>> wrapper, not EAL. >>> >>> Many thanks, >>> Andrew. >> >> memset here is what is supposed to clear the data. >> >> struct malloc_elem * >> malloc_elem_free(struct malloc_elem *elem) >> { >>     void *ptr; >>     size_t data_len; >> >>     ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN + elem->pad); >>     data_len = elem->size - elem->pad - MALLOC_ELEM_OVERHEAD; >> >>     elem = malloc_elem_join_adjacent_free(elem); >> >>     malloc_elem_free_list_insert(elem); >> >>     elem->pad = 0; >> >>     /* decrease heap's count of allocated elements */ >>     elem->heap->alloc_count--; >> >>     memset(ptr, 0, data_len); >> >> Maybe data_len is not correct either because of bug, or your >> application clobbered >> the malloc reserved regions  in the element. >> >> More likely, gcc is incorrectly optimizing this away. >> >> https://wiki.sei.cmu.edu/confluence/display/c/MSC06-C.+Beware+of+compiler+optimizations >> >> https://www.cryptologie.net/article/419/zeroing-memory-compiler-optimizations-and-memset_s/ >> >> > > I tend to be very wary of blaming the compiler without exhausting any > other possibilities :) It used to work before without issues, so > presumably whatever is happening, our memset works correctly. > > Andrew, you write: > > > ALLOC 0x7fffa3264080-0x7fffa32640b8 > Not zerod address is 16 bytes before: > > > Of course the memory *before* your pointer would not be zero - it is > preceded by a 64-byte malloc header, so what you're seeing is the malloc > header data (which doesn't go away if you free it - it will go away only > if it is merged with an adjacent free malloc element). So, i'm failing > to see which problem you're describing, given that all memory regions > that are supposedly not free lie outside of your malloc-allocated memory. > > However, after careful analysis, i can see that there is one possibility > where memory is not zeroed on free - if the original malloc element was > padded, and there aren't any more adjacent free elements, then newly > allocated memory may contain old pad header. I'll submit a patch for you > to try shortly. > Patch: http://patches.dpdk.org/patch/43196/ -- Thanks, Anatoly