DPDK patches and discussions
From: Stephen Hemminger <stephen@networkplumber.org>
To: Jerin Jacob <jerinjacobk@gmail.com>
Cc: Jerin Jacob <jerinj@marvell.com>, dpdk-dev <dev@dpdk.org>,
	Bruce Richardson <bruce.richardson@intel.com>,
	Ray Kinsella <mdr@ashroe.eu>,
	Thomas Monjalon <thomas@monjalon.net>,
	David Marchand <david.marchand@redhat.com>,
	Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>,
	Narcisa Ana Maria Vasile <navasile@linux.microsoft.com>,
	"Dmitry Malloy (MESHCHANINOV)" <dmitrym@microsoft.com>,
	Pallavi Kadam <pallavi.kadam@intel.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Ruifeng Wang (Arm Technology China)" <ruifeng.wang@arm.com>,
	Jan Viktorin <viktorin@rehivetech.com>,
	David Christensen <drc@linux.vnet.ibm.com>
Subject: Re: [dpdk-dev] [PATCH v2 1/6] eal: introduce oops handling API
Date: Wed, 18 Aug 2021 09:46:41 -0700
Message-ID: <20210818094641.2fe829ba@hermes.local>
In-Reply-To: <CALBAE1PJPE7jOQTgBsUXncTqoB5zoBk47rGptsoSj5-=2oEQJw@mail.gmail.com>

On Wed, 18 Aug 2021 15:07:25 +0530
Jerin Jacob <jerinjacobk@gmail.com> wrote:

> On Tue, Aug 17, 2021 at 9:22 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Tue, 17 Aug 2021 20:57:50 +0530
> > Jerin Jacob <jerinjacobk@gmail.com> wrote:
> >  
> > > On Tue, Aug 17, 2021 at 8:39 PM Stephen Hemminger
> > > <stephen@networkplumber.org> wrote:  
> > > >
> > > > On Tue, 17 Aug 2021 13:08:46 +0530
> > > > Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > > >  
> > > > > On Tue, Aug 17, 2021 at 9:23 AM Stephen Hemminger
> > > > > <stephen@networkplumber.org> wrote:  
> > > > > >
> > > > > > On Tue, 17 Aug 2021 08:57:18 +0530
> > > > > > <jerinj@marvell.com> wrote:
> > > > > >  
> > > > > > > From: Jerin Jacob <jerinj@marvell.com>
> > > > > > >
> > > > > > > Introducing oops handling API with following specification
> > > > > > > and enable stub implementation for Linux and FreeBSD.
> > > > > > >
> > > > > > > On rte_eal_init() invocation, the EAL library installs the
> > > > > > > oops handler for the essential signals.
> > > > > > > > The rte_oops_signals_enabled() API provides the list
> > > > > > > > of signals for which the EAL installed the handler.
> > > > > >
> > > > > > This is a big change, and many applications already handle these
> > > > > > signals themselves. Therefore adding this needs to be opt-in
> > > > > > and not enabled by default.  
> > > > >
> > > > > > To avoid making every application register this signal handler
> > > > > > explicitly, and to coexist with application-specific signal
> > > > > > handlers, the following design was chosen. (It is mentioned in the
> > > > > > commit log; I will describe it here for clarity.)
> > > > >
> > > > > > Case 1:
> > > > > > a) The application installs its signal handler prior to rte_eal_init().
> > > > > > b) The implementation saves the application-specific handler and
> > > > > > replaces it with the EAL oops handler.
> > > > > > c) When the application or DPDK gets a segfault, the EAL oops
> > > > > > handler is invoked.
> > > > > > d) After dumping the EAL-specific message, it calls the
> > > > > > application-specific signal handler installed by the application
> > > > > > in step a). This avoids breaking any contract with the application,
> > > > > > i.e. the behavior is the same as with the current EAL.
> > > > > > That is the reason for not using SA_RESETHAND (which would invoke
> > > > > > SIG_DFL after the EAL oops handler instead of the
> > > > > > application-specific handler).
> > > > >
> > > > > > Case 2:
> > > > > > a) The application installs its signal handler after rte_eal_init().
> > > > > > b) The EAL handler is replaced by the application handler; the
> > > > > > application can then call rte_oops_decode() to decode.
> > > > > >
> > > > > > To cater to the above use cases, rte_oops_signals_enabled() and
> > > > > > rte_oops_decode() are provided.
> > > > >
> > > > > Here we are not breaking any contract with the application.
> > > > > Do you have concerns about this design?  
> > > >
> > > > In our application as a service it is important not to do any backtrace
> > > > in production. We rely on other infrastructure to process coredumps.  
> > >
> > > Other infrastructure will still work. For example, with the standard
> > > Linux coredump infrastructure, in the current implementation:
> > > - The EAL handler dumps the DPDK oops on stderr, as the kernel does.
> > > - The implementation invokes SIG_DFL from the EAL oops handler.
> > > - That step creates the coredump or hands off to whatever other
> > > infrastructure you are using for coredumps.
> > >  
> > > >
> > > > This should be enabled by a command line argument.
> > >
> > > If we allow other coredump infrastructure to work as-is, why is
> > > an enable/disable control required from EAL?
> >
> > The addition of DPDK OOPS adds extra steps that cause every
> > fault to be attributed to the oops code.
> 
> Since we are using SA_ONSTACK, the original segfault info is not
> lost.
>
> I verified it as follows; please find the steps below.
>
> 0) Enable coredump infra in Linux using coredumpctl or so
> 1) Apply this series
> 2) Apply the following patch to create a segfault from the library.
> This tests that a segfault is caught by the EAL and forwarded to the
> default Linux signal handler.
> 
> [main]dell[dpdk.org] $ git diff
> diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
> index 3438a96b75..b935c32c98 100644
> --- a/lib/eal/linux/eal.c
> +++ b/lib/eal/linux/eal.c
> @@ -1338,6 +1338,8 @@ rte_eal_init(int argc, char **argv)
> 
>         eal_mcfg_complete();
> 
> +       /* Generate a segfault */
> +       *(volatile int *)0x05 = 0;
>         return fctret;
> 
>  }
> 3) Build
> meson --buildtype debug build
> ninja -C build
> 
> 4) Run
> $ ./build/app/test/dpdk-test --no-huge  -c 0x2
> 
> Please find oops dump[1] and gdb core dump backtrace[2].
> Gdb core dump trace preserves the original segfault cause and trace.
> 
> Any other concerns?
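
For reference, the forwarding step described above (report from the
handler, then let SIG_DFL take over so the normal coredump path still
runs) can be sketched in plain POSIX C. This is a minimal sketch, not
the code from the patch series: report_then_default() and
child_termination_signal() are hypothetical names, the fault is
simulated with raise(SIGSEGV) in a forked child rather than a real bad
access, and the core size limit is set to zero only so the demo leaves
no core file behind.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handler: emit a report, then hand the signal back to SIG_DFL. */
static void report_then_default(int sig)
{
	static const char msg[] = "report: fault caught, forwarding to SIG_DFL\n";
	ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
	(void)n;

	signal(sig, SIG_DFL);	/* restore the default disposition */
	raise(sig);		/* sig is blocked while the handler runs, so the
				 * default action fires once the handler returns */
}

/*
 * Run the scenario in a child process and return the signal that
 * terminated it, or -1 if the child did not die from a signal.
 */
int child_termination_signal(void)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		struct rlimit no_core = { 0, 0 };
		struct sigaction sa;

		/* Demo only: do not write a core file. */
		setrlimit(RLIMIT_CORE, &no_core);

		memset(&sa, 0, sizeof(sa));
		sa.sa_handler = report_then_default;
		sigemptyset(&sa.sa_mask);
		if (sigaction(SIGSEGV, &sa, NULL) != 0)
			_exit(1);

		raise(SIGSEGV);	/* simulated fault */
		_exit(0);	/* not reached if forwarding worked */
	}

	int status = 0;
	pid_t reaped = waitpid(pid, &status, 0);

	assert(reaped == pid);
	return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The parent sees the child terminated by SIGSEGV, i.e. the reporting
handler does not hide the signal from whatever is watching the process.
Whether the resulting core is equally useful to existing tooling is a
separate question that this sketch does not answer.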

Your new oops handling duplicates existing code in our application
(and I know of others that do the same). The problem is that an
application may install its own handlers before calling rte_eal_init(),
and your new code will break that.
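
To make the ordering issue concrete, here is a minimal sketch of the
two install orders (Case 1 and Case 2 in the quoted text), assuming
plain POSIX sigaction(). None of the names are DPDK APIs: lib_init()
stands in for a library that installs a handler at init time,
app_install() stands in for the application, and SIGUSR1 is used
instead of SIGSEGV so the demo can return normally.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical state: the action the library found when it was initialized. */
static struct sigaction saved_action;
static volatile sig_atomic_t lib_calls;
static volatile sig_atomic_t app_calls;

static void app_handler(int sig)
{
	(void)sig;
	app_calls++;
}

/* Library handler: report first, then chain to whatever was saved. */
static void lib_handler(int sig, siginfo_t *info, void *ctx)
{
	static const char msg[] = "lib: report would be printed here\n";
	ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
	(void)n;
	lib_calls++;

	if (saved_action.sa_flags & SA_SIGINFO)
		saved_action.sa_sigaction(sig, info, ctx);
	else if (saved_action.sa_handler != SIG_DFL &&
		 saved_action.sa_handler != SIG_IGN)
		saved_action.sa_handler(sig);
}

static int app_install(int sig)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = app_handler;
	sigemptyset(&sa.sa_mask);
	return sigaction(sig, &sa, NULL);
}

/* Stand-in for a library init that silently takes over the signal. */
static int lib_init(int sig)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = lib_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(sig, &sa, &saved_action);
}

static void reset_state(void)
{
	signal(SIGUSR1, SIG_DFL);
	lib_calls = 0;
	app_calls = 0;
}

/* Case 1: application installs first, library init afterwards. */
int demo_app_first(void)
{
	reset_state();
	int rc = app_install(SIGUSR1);
	assert(rc == 0);
	rc = lib_init(SIGUSR1);
	assert(rc == 0);
	raise(SIGUSR1);
	return (int)app_calls;
}

/* Case 2: library init first, application installs afterwards. */
int demo_lib_first(void)
{
	reset_state();
	int rc = lib_init(SIGUSR1);
	assert(rc == 0);
	rc = app_install(SIGUSR1);
	assert(rc == 0);
	raise(SIGUSR1);
	return (int)app_calls;
}
```

With the application installed first, both handlers run (library
first, then the saved application handler). With the library
initialized first, only the application handler runs. Note that in the
first order the handler registered for the signal is no longer the one
the application installed, so a later sigaction() query by the
application returns the library handler rather than its own.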

Therefore my recommendation is that the new oops handling should
not be a built-in feature of EAL.




