DPDK patches and discussions
From: Andrew Rybchenko <arybchenko@solarflare.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>,
	Stephen Hemminger <stephen@networkplumber.org>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Subject: Re: [dpdk-dev] Memory allocated using rte_zmalloc() has non-zeros
Date: Thu, 19 Jul 2018 19:44:43 +0300
Message-ID: <e37dae9d-5dbe-5c9a-958e-52e5e9146131@solarflare.com>
In-Reply-To: <71478802-911e-676a-93a3-82c550cc0ee9@intel.com>

On 19.07.2018 12:48, Burakov, Anatoly wrote:
> On 19-Jul-18 10:01 AM, Burakov, Anatoly wrote:
>> On 18-Jul-18 9:58 PM, Stephen Hemminger wrote:
>>> On Wed, 18 Jul 2018 22:52:12 +0300
>>> Andrew Rybchenko <arybchenko@solarflare.com> wrote:
>>>
>>>> On 18.07.2018 20:18, Burakov, Anatoly wrote:
>>>>> On 18-Jul-18 4:20 PM, Andrew Rybchenko wrote:
>>>>>> Hi Anatoly,
>>>>>>
>>>>>> I'm investigating an issue that comes down to the fact that memory
>>>>>> allocated using rte_zmalloc() contains non-zero bytes.
>>>>>>
>>>>>> If I add a memset just after allocation, everything works fine.
>>>>>>
>>>>>> I've found that the memset was removed from rte_zmalloc_socket() some
>>>>>> time ago:
>>>>>>   >>>
>>>>>> commit b78c9175118f7d61022ddc5c62ce54a1bd73cea5
>>>>>> Author: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
>>>>>> Date:   Tue Jul 5 12:01:16 2016 +0100
>>>>>>
>>>>>>       mem: do not zero out memory on zmalloc
>>>>>>
>>>>>>       Zeroing out memory on rte_zmalloc_socket is not required anymore
>>>>>>       since all allocated memory is already zeroed.
>>>>>>
>>>>>>       Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
>>>>>> <<<
>>>>>>
>>>>>> but maybe something has changed since then that makes the above
>>>>>> statement false.
>>>>>>
>>>>>> I observe the problem when memory is reallocated, i.e. I configure 7
>>>>>> queues, start, stop, reconfigure to 3 queues, start. Memory is allocated
>>>>>> on start and freed on stop. Since we have fewer queues on the second
>>>>>> start, it is allocated in a different way and reuses previously
>>>>>> allocated/freed memory.
>>>>>>
>>>>>> Do you have any ideas what could be wrong?
>>>>>>
>>>>>> Andrew.
>>>>>>
>>>>>
>>>>> Hi Andrew,
>>>>>
>>>>> I will look into it first thing tomorrow. In general, we memset(0) on
>>>>> free, and the kernel gives us zeroed-out pages initially, so the most
>>>>> likely point of failure is that I'm not overwriting some malloc headers
>>>>> correctly on free.
>>>>
>>>> OK, at least now I know how it is supposed to work in theory.
>>>>
>>>> The following region was allocated (the second number below is the
>>>> pointer plus size):
>>>> ALLOC 0x7fffa3264080-0x7fffa32640b8
>>>>
>>>> The non-zeroed address is 16 bytes before it:
>>>> (gdb) p/x ((uint64_t *)0x7fffa3264070)[0]
>>>> $4 = 0x4000000002
>>>> (gdb) p/x ((uint64_t  *)0x7fffa3264070)[1]
>>>> $5 = 0x80
>>>>
>>>> Then it is freed:
>>>> FREE 0x7fffa3264080-0x7fffa32640b8
>>>>
>>>> but the values above (from gdb) are still the same.
>>>> Then it is allocated as part of a bigger memory chunk:
>>>> ALLOC 0x7fffa3245b80-0x7fffa3265fd8
>>>> which should contain zeros, but the values above are still the same.
>>>>
>>>> It is interesting that it looks like it was the first block freed on
>>>> port stop. I'm not 100% sure, since I've put the printouts in my
>>>> allocation wrapper, not in EAL.
>>>>
>>>> Many thanks,
>>>> Andrew.
>>>
>>> memset here is what is supposed to clear the data.
>>>
>>> struct malloc_elem *
>>> malloc_elem_free(struct malloc_elem *elem)
>>> {
>>>     void *ptr;
>>>     size_t data_len;
>>>
>>>     ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN + elem->pad);
>>>     data_len = elem->size - elem->pad - MALLOC_ELEM_OVERHEAD;
>>>
>>>     elem = malloc_elem_join_adjacent_free(elem);
>>>
>>>     malloc_elem_free_list_insert(elem);
>>>
>>>     elem->pad = 0;
>>>
>>>     /* decrease heap's count of allocated elements */
>>>     elem->heap->alloc_count--;
>>>
>>>     memset(ptr, 0, data_len);
>>>
>>> Maybe data_len is not correct, either because of a bug or because your
>>> application clobbered the malloc reserved regions in the element.
>>>
>>> More likely, gcc is incorrectly optimizing this away.
>>>
>>> https://wiki.sei.cmu.edu/confluence/display/c/MSC06-C.+Beware+of+compiler+optimizations 
>>>
>>> https://www.cryptologie.net/article/419/zeroing-memory-compiler-optimizations-and-memset_s/ 
>>>
>>>
>>
>> I tend to be very wary of blaming the compiler without exhausting all
>> other possibilities :) It used to work before without issues, so
>> presumably, whatever is happening, our memset works correctly.
>>
>> Andrew, you write:
>>
>> <snip>
>> ALLOC 0x7fffa3264080-0x7fffa32640b8
>> The non-zeroed address is 16 bytes before it:
>> <snip>
>>
>> Of course the memory *before* your pointer would not be zero - it is
>> preceded by a 64-byte malloc header, so what you're seeing is the
>> malloc header data (which doesn't go away if you free it - it will go
>> away only if it is merged with an adjacent free malloc element). So,
>> I'm failing to see which problem you're describing, given that all
>> memory regions that are supposedly not zeroed lie outside of your
>> malloc-allocated memory.

I tried to highlight that the non-zeroed bytes belong to the malloc header of
the previously allocated memory region. Later that header ends up inside a
newly allocated region (a significantly bigger one, so merges happened):
 >>>
Then it is allocated as part of a bigger memory chunk:
ALLOC 0x7fffa3245b80-0x7fffa3265fd8
which should contain zeros, but the values above are still the same.
<<<

>> However, after careful analysis, I can see that there is one
>> possibility where memory is not zeroed on free - if the original
>> malloc element was padded, and there aren't any more adjacent free
>> elements, then newly allocated memory may contain the old pad header.
>> I'll submit a patch for you to try shortly.
>>
>
> Patch:
>
> http://patches.dpdk.org/patch/43196/

Yes, the patch fixes the problem I've observed. At least it passes the
simple test I used for debugging.
I'll run more automated tests tonight.

Many thanks,
Andrew.
