From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
MIME-Version: 1.0
References: <20200420070508.645533-1-fengli@smartx.com>
 <20200423154302.2217041-1-fengli@smartx.com>
 <9d6dc63b-34f7-36b3-5c3f-df74b71d961c@intel.com>
 <CAJFAV8z-om+cG5-zw1o1nB1q7yWwZjy=VTcT9mxHnC2hak2ifw@mail.gmail.com>
 <083d248a-77dd-0b07-cb8b-f2703e8503f5@intel.com>
 <20200424091421.GB1440@bricha3-MOBL.ger.corp.intel.com>
 <CAEK8JBATc2S0-s7XekhiESEdy++idoQ6gksd491iY5ChNMGh2w@mail.gmail.com>
 <c86d84b1-4e0d-6861-72d9-bfe57ec5fc85@intel.com>
In-Reply-To: <c86d84b1-4e0d-6861-72d9-bfe57ec5fc85@intel.com>
From: Li Feng <fengli@smartx.com>
Date: Fri, 24 Apr 2020 20:03:15 +0800
Message-ID: <CAHckoCyU=xUdO6cyV4SZvSs1aaxRiFcR5GZLhZ=Pn29Dm=pm_w@mail.gmail.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Cc: Feng Li <lifeng1519@gmail.com>,
 Bruce Richardson <bruce.richardson@intel.com>, 
 David Marchand <david.marchand@redhat.com>, dev <dev@dpdk.org>,
 Kyle Zhang <kyle@smartx.com>, Yang Fan <fanyang@smartx.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Subject: Re: [dpdk-dev] [PATCH v2] eal: add madvise to avoid dump memory
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

Thanks,

Feng Li

On Fri, Apr 24, 2020 at 7:00 PM, Burakov, Anatoly <anatoly.burakov@intel.com> wrote:
>
> On 24-Apr-20 10:33 AM, Feng Li wrote:
> > On Fri, Apr 24, 2020 at 5:14 PM, Bruce Richardson <bruce.richardson@intel.com> wrote:
> >>
> >> On Fri, Apr 24, 2020 at 10:12:10AM +0100, Burakov, Anatoly wrote:
> >>> On 23-Apr-20 9:04 PM, David Marchand wrote:
> >>>> On Thu, Apr 23, 2020 at 6:34 PM Burakov, Anatoly
> >>>> <anatoly.burakov@intel.com> wrote:
> >>>>>> diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
> >>>>>> index cc7d54e0c..2d9564b28 100644
> >>>>>> --- a/lib/librte_eal/common/eal_common_memory.c
> >>>>>> +++ b/lib/librte_eal/common/eal_common_memory.c
> >>>>>> @@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
> >>>>>>                 after_len = RTE_PTR_DIFF(map_end, aligned_end);
> >>>>>>                 if (after_len > 0)
> >>>>>>                         munmap(aligned_end, after_len);
> >>>>>> +
> >>>>>> +             /*
> >>>>>> +              * Exclude this pages from a core dump.
> >>>>>> +              */
> >>>>>> +             if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
> >>>>>> +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> >>>>>> +                             strerror(errno));
> >>>>>> +       } else {
> >>>>>> +             /*
> >>>>>> +              * Exclude this pages from a core dump.
> >>>>>> +              */
> >>>>>> +             if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
> >>>>>> +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> >>>>>> +                             strerror(errno));
> >>>>>>         }
> >>>>>>
> >>>>>>         return aligned_addr;
> >>>>>>
> >>>>>
> >>>>> For the contents of this patch,
> >>>>
> >>>> MADV_DONTDUMP does not seem POSIX, but as I said [1], there seems to
> >>>> be a MADV_NOCORE option on FreeBSD.
> >>>> 1: http://inbox.dpdk.org/dev/CAJFAV8y9YtT-7njUz+mD6U8+3XUqYrgp28KD7jy2923EpAcXrg@mail.gmail.com/
> >>>>
> >>>>
> >>>
> >>> Oh, right, so this would probably not compile on FreeBSD. Perhaps this
> >>> function would have to be OS-specific after all (or call into an OS-specific
> >>> madvise() after reserving the memory area).
> >>>
> >>
> >> Is it just a differently named flag? If so, I think a single #ifdef macro
> >> won't kill us in the common code.
> >>
> > Just the flag name is different.
> > I should use RTE_EXEC_ENV_FREEBSD and RTE_EXEC_ENV_LINUX, right?
>
> Yes, but we need this in two places, so a function call is still necessary.
>
> >
> > Another question: in `eal_memalloc.c:alloc_seg`, should I undo the
> > MADV_DONTDUMP advice on the memory region?
> > @Anatoly
>
> I don't think it's necessary. When you map different memory into that
> region, madvise() flags no longer apply. To be sure, I just tested this
> by adding another mmap() call after madvise() (in your test app) and
> remapping the same memory with MAP_FIXED, and the core dump was back to
> 1GB in size. So, no, I don't think you should undo anything - the system
> does so automatically.
Got it.
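
For reference, the check described above can also be done without producing a core dump. What follows is only a minimal, Linux-only sketch, not the test app from this thread. It assumes the kernel exposes a VmFlags line in /proc/self/smaps, where "dd" marks a mapping excluded from core dumps. The helper names region_is_dontdump() and remap_clears_dontdump() are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Return 1 if the mapping containing addr carries the "dd" (don't dump)
 * VmFlag in /proc/self/smaps, 0 if it does not, -1 if it cannot be checked.
 */
static int
region_is_dontdump(const void *addr)
{
	unsigned long start, end, target = (unsigned long)addr;
	char line[512];
	int in_region = 0, result = -1;
	FILE *f = fopen("/proc/self/smaps", "r");

	if (f == NULL)
		return -1;
	while (fgets(line, sizeof(line), f) != NULL) {
		/* Only mapping header lines parse as "start-end". */
		if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
			in_region = (target >= start && target < end);
		} else if (in_region && strncmp(line, "VmFlags:", 8) == 0) {
			result = (strstr(line, " dd") != NULL);
			break;
		}
	}
	fclose(f);
	return result;
}

/*
 * Return 1 if MADV_DONTDUMP was visible after madvise() and gone after a
 * MAP_FIXED remap of the same range, 0 if the flag survived the remap,
 * -1 if the check could not be carried out on this system.
 */
static int
remap_clears_dontdump(void)
{
	const size_t len = 1 << 20;
	int before, after;
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;
	if (madvise(p, len, MADV_DONTDUMP) != 0) {
		munmap(p, len);
		return -1;
	}
	before = region_is_dontdump(p);

	/* Map fresh anonymous memory over the same range. */
	if (mmap(p, len, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) != p) {
		munmap(p, len);
		return -1;
	}
	after = region_is_dontdump(p);
	munmap(p, len);

	if (before != 1 || after < 0)
		return -1;
	return after == 0;
}
```

Calling remap_clears_dontdump() from a small main() and printing the result should be enough to confirm the behaviour on a given kernel. A return of -1 only means the flag could not be observed, not that the remap failed to reset it.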
>
> >
> > I have just prepared a patch for the OS-specific code:
> > --- a/lib/librte_eal/common/eal_private.h
> > +++ b/lib/librte_eal/common/eal_private.h
> > @@ -443,4 +443,20 @@ rte_option_usage(void);
> >   uint64_t
> >   eal_get_baseaddr(void);
> >
> > +/**
> > + * @internal
> > + * Exclude these pages from a core dump.
> > + *
> > + * @param addr
> > + *  Start of the memory region.
> > + *
> > + * @param len
> > + *  Length of the memory region.
> > + *
> > + * @return
> > + *  0 on success, -errno on failure.
> > + */
> > +int
> > +eal_madvise_dontdump(void* addr, size_t len);
> > +
> >   #endif /* _EAL_PRIVATE_H_ */
> > diff --git a/lib/librte_eal/freebsd/eal_memory.c
> > b/lib/librte_eal/freebsd/eal_memory.c
> > index a97d8f0f0..585042dde 100644
> > --- a/lib/librte_eal/freebsd/eal_memory.c
> > +++ b/lib/librte_eal/freebsd/eal_memory.c
> > @@ -534,3 +534,9 @@ rte_eal_memseg_init(void)
> >    memseg_primary_init() :
> >    memseg_secondary_init();
> >   }
> > +
> > +int
> > +eal_madvise_dontdump(void* addr, size_t len)
> > +{
> > + return madvise(addr, len, MADV_NOCORE);
> > +}
> > diff --git a/lib/librte_eal/linux/eal_memory.c
> > b/lib/librte_eal/linux/eal_memory.c
> > index 7a9c97ff8..cfdbfccfe 100644
> > --- a/lib/librte_eal/linux/eal_memory.c
> > +++ b/lib/librte_eal/linux/eal_memory.c
> > @@ -2479,3 +2479,9 @@ rte_eal_memseg_init(void)
> >   #endif
> >    memseg_secondary_init();
> >   }
> > +
> > +int
> > +eal_madvise_dontdump(void* addr, size_t len)
> > +{
> > + return madvise(addr, len, MADV_DONTDUMP);
> > +}
> >
>
> That would work as well (with added FreeBSD code of course), however if
> everyone else is OK with it, I'll settle for an #ifdef in common code.
>
> --
> Thanks,
> Anatoly
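
To make the #ifdef option concrete, here is a rough sketch of what a single helper in common code might look like. It is not the final patch. It keys off the flag macros themselves so that it compiles outside the DPDK tree; in-tree, the same switch could presumably use RTE_EXEC_ENV_LINUX / RTE_EXEC_ENV_FREEBSD instead. The EAL_MADV_NODUMP macro name is hypothetical. The sketch returns -errno on failure, as the doc comment in the draft above describes, whereas the draft functions return the raw madvise() result.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Pick whichever "exclude from core dump" advice the platform provides. */
#if defined(MADV_DONTDUMP)	/* Linux */
#define EAL_MADV_NODUMP MADV_DONTDUMP
#elif defined(MADV_NOCORE)	/* FreeBSD */
#define EAL_MADV_NODUMP MADV_NOCORE
#endif

/* Returns 0 on success, -errno on failure. */
static int
eal_madvise_dontdump(void *addr, size_t len)
{
#ifdef EAL_MADV_NODUMP
	if (madvise(addr, len, EAL_MADV_NODUMP) != 0)
		return -errno;
	return 0;
#else
	/* No known equivalent on this platform. */
	(void)addr;
	(void)len;
	return -ENOTSUP;
#endif
}
```

Both call sites in eal_get_virtual_area() could then call this helper and log a warning on a non-zero return, which keeps the OS-specific part down to the one macro.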
