From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <stephen@networkplumber.org>
Received: from mail-pg1-f196.google.com (mail-pg1-f196.google.com
 [209.85.215.196]) by dpdk.org (Postfix) with ESMTP id 1AD6723C
 for <dev@dpdk.org>; Wed, 18 Jul 2018 22:58:20 +0200 (CEST)
Received: by mail-pg1-f196.google.com with SMTP id f1-v6so2526921pgq.12
 for <dev@dpdk.org>; Wed, 18 Jul 2018 13:58:20 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=networkplumber-org.20150623.gappssmtp.com; s=20150623;
 h=date:from:to:cc:subject:message-id:in-reply-to:references
 :mime-version:content-transfer-encoding;
 bh=t/WKtPVtLl8tBWYH9vuX4sW7e1kAuPW9kDhUm9LljGc=;
 b=Qf9HsBDXPfd7ERsC9bWB/mrID+hUVvEr7a/UGbKtEN/BKlkq4r3nfBoB0/gbFZxRG5
 NCHtGs0nDbTycWMxRvEESPZ0OzZRGkCqZk7FsC0MqI+FzsW9aabIZ6eakrO3dV42q8eL
 RgBUK57IuECgMsduWFKxP4BXtfkReka4/+G5zOyMVMyXr8erX6DdDmW7yyU9z7KtPIzn
 LdITRO2JTF0TnckWSVdJgTVWMeoDrSGZgc0gDE+XigIfUBrN24NhqzWpCRZNU2enwELR
 PJo4FkloTMXa+daeqEvGJG2B3ptUVxLfOgUoxA0hF3gacSOeFGMuaNH2UnX4SdZ/D8JX
 iPPw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to
 :references:mime-version:content-transfer-encoding;
 bh=t/WKtPVtLl8tBWYH9vuX4sW7e1kAuPW9kDhUm9LljGc=;
 b=UDgE46V22OVzfOWTm45E1hIVcxgbStuhG87mdrSbNg7iIoaQHlUBOJ7Ia5wL4zs2WR
 lB5uTqbRXasBbr1cD2AbwJgOWBNXGqnlUALB3qYOgBWAfo/BYSky028V42DGryeTYiTN
 Hc5Avcb5ML5ijrdsCg2lA94y/FSAIxd2TowTJK8qJtA6iAOD7tHG1aJGqjcX8h/3T/f/
 OgVfU0zT5H9Hcm/Z1gGQswyvjozf+7pWn5/HrKOiw2B34G5TtA+DWwrHhdbaR6haJ7fa
 XNPq5VLd6kopsNXvpY2CeFmjrm6cThiQWIkxWOCTqUOs1z54N9I7pUcKYk3tQp3WzF1m
 J+oA==
X-Gm-Message-State: AOUpUlFjCa9PF7nlo1oxjuC7JL2+UNuYlmrBFNAXXAyBJrevlwuo0hBL
 mFYpsOwSkj6DRMcG9fj2jTyz+A==
X-Google-Smtp-Source: AAOMgpd+3nrk9qENaKqBGns9sCY863d+kL0D5RdXSrtnEGiDztLuzUF549xI3dAOysqBME0uBwdZjQ==
X-Received: by 2002:a62:1157:: with SMTP id
 z84-v6mr6737594pfi.66.1531947500146; 
 Wed, 18 Jul 2018 13:58:20 -0700 (PDT)
Received: from xeon-e3 (204-195-22-127.wavecable.com. [204.195.22.127])
 by smtp.gmail.com with ESMTPSA id o65-v6sm6381221pfb.104.2018.07.18.13.58.19
 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);
 Wed, 18 Jul 2018 13:58:20 -0700 (PDT)
Date: Wed, 18 Jul 2018 13:58:17 -0700
From: Stephen Hemminger <stephen@networkplumber.org>
To: Andrew Rybchenko <arybchenko@solarflare.com>
Cc: "Burakov, Anatoly" <anatoly.burakov@intel.com>, "dev@dpdk.org"
 <dev@dpdk.org>, Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Message-ID: <20180718135817.66728c37@xeon-e3>
In-Reply-To: <a5f0915e-ccd1-9237-4337-3a0b0265c4cf@solarflare.com>
References: <8bc76811-ac29-d7f2-e4c3-12b50fd44dba@solarflare.com>
 <58e5044c-3d13-9171-4168-b4d6b1d61927@intel.com>
 <a5f0915e-ccd1-9237-4337-3a0b0265c4cf@solarflare.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [dpdk-dev] Memory allocated using rte_zmalloc() has non-zeros
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 18 Jul 2018 20:58:21 -0000

On Wed, 18 Jul 2018 22:52:12 +0300
Andrew Rybchenko <arybchenko@solarflare.com> wrote:

> On 18.07.2018 20:18, Burakov, Anatoly wrote:
> > On 18-Jul-18 4:20 PM, Andrew Rybchenko wrote:
> >> Hi Anatoly,
> >>
> >> I'm investigating issue which finally comes to the fact that memory
> >> allocated using
> >> rte_zmalloc() has non zeros.
> >>
> >> If I add memset just after allocation, everything is perfect and
> >> works fine.
> >>
> >> I've found out that memset was removed from rte_zmalloc_socket() some
> >> time ago:
> >>
> >> >>>
> >> commit b78c9175118f7d61022ddc5c62ce54a1bd73cea5
> >> Author: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> >> Date:   Tue Jul 5 12:01:16 2016 +0100
> >>
> >>     mem: do not zero out memory on zmalloc
> >>
> >>     Zeroing out memory on rte_zmalloc_socket is not required anymore
> >>     since all allocated memory is already zeroed.
> >>
> >>     Signed-off-by: Sergio Gonzalez Monroy
> >> <sergio.gonzalez.monroy@intel.com>
> >> <<<
> >>
> >> but may be something has changed now that made above statement false.
> >>
> >> I observe the problem when memory is reallocated. I.e. I configure 7
> >> queues,
> >> start, stop, reconfigure to 3 queues, start. Memory is allocated on
> >> start and
> >> freed on stop, since we have less queues on the second start it is
> >> allocated
> >> in a different way and reuses previously allocated/freed memory.
> >>
> >> Do you have any ideas what could be wrong?
> >>
> >> Andrew.
> >>
> >>
> >
> > Hi Andrew,
> >
> > I will look into it first thing tomorrow. In general, we memset(0) on
> > free, and kernel gives us zeroed out pages initially, so the most
> > likely point of failure is that I'm not overwriting some malloc headers
> > correctly on free.
>
> OK, at least now I know how it is supposed to work in theory.
>
> The following region was allocated (the second number below is pointer
> plus size)
> ALLOC 0x7fffa3264080-0x7fffa32640b8
>
> Not zeroed address is 16 bytes before:
> (gdb) p/x ((uint64_t *)0x7fffa3264070)[0]
> $4 = 0x4000000002
> (gdb) p/x ((uint64_t *)0x7fffa3264070)[1]
> $5 = 0x80
>
> then freed
> FREE 0x7fffa3264080-0x7fffa32640b8
>
> but above values (gdb) are still the same
> then it is allocated as the part of bigger memory chunk
> ALLOC 0x7fffa3245b80-0x7fffa3265fd8
> which should contain zeros, but above values are still the same.
>
> It is interesting that it looks like it was the first block freed on the
> port stop. I'm not 100% sure since I've put printouts to my allocation
> wrapper, not EAL.
>
> Many thanks,
> Andrew.

The memset() in malloc_elem_free() is what is supposed to clear the data:

struct malloc_elem *
malloc_elem_free(struct malloc_elem *elem)
{
	void *ptr;
	size_t data_len;

	ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN + elem->pad);
	data_len = elem->size - elem->pad - MALLOC_ELEM_OVERHEAD;

	elem = malloc_elem_join_adjacent_free(elem);

	malloc_elem_free_list_insert(elem);

	elem->pad = 0;

	/* decrease heap's count of allocated elements */
	elem->heap->alloc_count--;

	memset(ptr, 0, data_len);

Maybe data_len is not correct, either because of a bug or because your
application clobbered the malloc reserved regions in the element.
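
As a quick sanity check on the numbers in the report (arithmetic only, not
a diagnosis), the sketch below tests which of the quoted ALLOC ranges
contain the non-zero address. The macro names and addr_in_range() are
hypothetical helpers for illustration. It also assumes that ptr in the
excerpt above is the same pointer the application got back from
rte_zmalloc(), which I have not verified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Addresses copied from the gdb session and ALLOC/FREE lines quoted above. */
#define STALE_ADDR   UINT64_C(0x7fffa3264070) /* still holds non-zero data */
#define SMALL_START  UINT64_C(0x7fffa3264080) /* first (small) allocation */
#define SMALL_END    UINT64_C(0x7fffa32640b8)
#define BIG_START    UINT64_C(0x7fffa3245b80) /* later (bigger) allocation */
#define BIG_END      UINT64_C(0x7fffa3265fd8)

/* Hypothetical helper: true when addr lies in the half-open range [start, end). */
static bool
addr_in_range(uint64_t addr, uint64_t start, uint64_t end)
{
	return addr >= start && addr < end;
}

/* Hypothetical helper: assert the relationships between the quoted addresses. */
static void
check_report_addresses(void)
{
	/* The stale word sits 16 bytes in front of the small allocation... */
	assert(SMALL_START - STALE_ADDR == 16);
	assert(!addr_in_range(STALE_ADDR, SMALL_START, SMALL_END));
	/* ...but inside the bigger allocation made afterwards. */
	assert(addr_in_range(STALE_ADDR, BIG_START, BIG_END));
}
```

So under that assumption a memset() that starts at the small allocation's
data pointer would not touch 0x7fffa3264070 at all, whatever data_len is.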

More likely, gcc is incorrectly optimizing this away.

https://wiki.sei.cmu.edu/confluence/display/c/MSC06-C.+Beware+of+compiler+optimizations
https://www.cryptologie.net/article/419/zeroing-memory-compiler-optimizations-and-memset_s/
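
If the compiler really is dropping the memset(), one quick experiment is to
clear the region through a volatile pointer, along the lines of what those
links describe. This is only a minimal sketch: zero_no_elide() is a
hypothetical helper, not an existing DPDK function, and whether it changes
anything in this case is untested.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of DPDK): zero len bytes starting at ptr.
 * Every store goes through a volatile-qualified pointer, so the compiler
 * has to treat each one as an observable side effect and cannot remove
 * them as dead stores.
 */
static void
zero_no_elide(void *ptr, size_t len)
{
	volatile unsigned char *p = ptr;

	assert(ptr != NULL || len == 0);

	while (len--)
		*p++ = 0;
}
```

Swapping memset(ptr, 0, data_len) for zero_no_elide(ptr, data_len) in a test
build would show whether the stale values go away. A byte-wise loop is slower
than memset(), so this is for diagnosis rather than a proposed fix;
memset_s() (C11 Annex K) or explicit_bzero() are alternatives where the
platform provides them.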