Date: Tue, 17 Aug 2021 08:52:31 -0700
From: Stephen Hemminger
To: Jerin Jacob
Cc: Jerin Jacob, dpdk-dev, Bruce Richardson, Ray Kinsella,
 Thomas Monjalon, David Marchand, Dmitry Kozlyuk,
 Narcisa Ana Maria Vasile, "Dmitry Malloy (MESHCHANINOV)",
 Pallavi Kadam, "Ananyev, Konstantin",
 "Ruifeng Wang (Arm Technology China)", Jan Viktorin, David Christensen
Message-ID: <20210817085231.16be26c5@hermes.local>
References: <20210730084938.2426128-2-jerinj@marvell.com>
 <20210817032723.3997054-1-jerinj@marvell.com>
 <20210817032723.3997054-2-jerinj@marvell.com>
 <20210816205345.6d686c7d@hermes.local>
 <20210817080924.7049fa2d@hermes.local>
Subject: Re: [dpdk-dev] [PATCH v2 1/6] eal: introduce oops handling API

On Tue, 17 Aug 2021 20:57:50 +0530
Jerin Jacob wrote:

> On Tue, Aug 17, 2021 at 8:39 PM Stephen Hemminger
> wrote:
> >
> > On Tue, 17 Aug 2021 13:08:46 +0530
> > Jerin Jacob wrote:
> >
> > > On Tue, Aug 17, 2021 at 9:23 AM Stephen Hemminger
> > > wrote:
> > > >
> > > > On Tue, 17 Aug 2021 08:57:18 +0530
> > > > wrote:
> > > >
> > > > > From: Jerin Jacob
> > > > >
> > > > > Introducing oops handling API with following specification
> > > > > and
> > > > > enable stub implementation for Linux and FreeBSD.
> > > > >
> > > > > On rte_eal_init() invocation, the EAL library installs the
> > > > > oops handler for the essential signals.
> > > > > The rte_oops_signals_enabled() API provides the list
> > > > > of signals installed by the EAL.
> > > >
> > > > This is a big change, and many applications already handle these
> > > > signals themselves. Therefore adding this needs to be opt-in
> > > > and not enabled by default.
> > >
> > > To avoid every application having to register this sighandler
> > > explicitly, and to cater to co-existing application-specific
> > > signal-handler usage, the following design has been chosen.
> > > (It is mentioned in the commit log; I will describe it here for
> > > more clarity.)
> > >
> > > Case 1:
> > > a) The application installs the signal handler prior to rte_eal_init().
> > > b) The implementation stores the application-specific signal handler
> > >    and replaces it with the EAL oops handler.
> > > c) When the application/DPDK gets a segfault, the default EAL oops
> > >    handler gets invoked.
> > > d) It dumps the EAL-specific message, then calls the
> > >    application-specific signal handler installed in step a) by the
> > >    application. This avoids breaking any contract with the
> > >    application, i.e. the behavior is the same as the current EAL.
> > >    That is the reason for not using SA_RESETHAND (which calls SIG_DFL
> > >    after the EAL oops handler instead of the application-specific
> > >    handler).
> > >
> > > Case 2:
> > > a) The application installs the signal handler after rte_eal_init().
> > > b) The EAL handler gets replaced with the application handler; the
> > >    application can then call rte_oops_decode() to decode.
> > >
> > > To cater to the above use cases, rte_oops_signals_enabled() and
> > > rte_oops_decode() are provided.
> > >
> > > Here we are not breaking any contract with the application.
> > > Do you have concerns about this design?
> >
> > In our application as a service it is important not to do any backtrace
> > in production. We rely on other infrastructure to process coredumps.
>
> Other infrastructure will work. For example, if we are using the standard
> coredump via Linux infrastructure, then in the current implementation:
> - The EAL handler dumps the DPDK oops, like the kernel does, on stderr.
> - The implementation calls SIG_DFL in the EAL oops handler.
> - The above step creates the coredump or redirects to any other
>   infrastructure you are using for coredumps.
>
> >
> > This should be enabled by a command line argument.
>
> If we allow other coredump infrastructure to work as-is, why is
> enable/disable required from EAL?

The addition of DPDK OOPS adds extra steps, which makes every fault get
identified as the oops code.
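
For reference, the Case 1 chaining discussed in this thread can be sketched
roughly as below. This is only an illustration under stated assumptions, not
the code from the patch: oops_install(), oops_handler(), prev_action[],
OOPS_NSIG and app_handler() are hypothetical names, and only standard POSIX
calls (sigaction, raise, write) are used. It shows the previous handler being
saved at install time and invoked after the oops message, with a fall back to
the default action so that a core dump can still be produced.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* All names here are hypothetical; none of them come from the patch. */
#define OOPS_NSIG 128                            /* assumed upper bound on signal numbers */

static struct sigaction prev_action[OOPS_NSIG];  /* handlers found at install time */
static volatile sig_atomic_t oops_seen;          /* times the oops handler ran */
static volatile sig_atomic_t app_seen;           /* times the application handler ran */

/* Stand-in for a handler the application registered before EAL init. */
static void app_handler(int sig)
{
	(void)sig;
	app_seen++;
}

static void oops_handler(int sig, siginfo_t *info, void *ctx)
{
	static const char msg[] = "oops: fault caught, chaining to previous handler\n";
	const struct sigaction *prev = &prev_action[sig];

	oops_seen++;
	/* write() is async-signal-safe; printf() is not. */
	if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0) {
		/* Nothing safe to do about a failed write here. */
	}

	if (prev->sa_flags & SA_SIGINFO) {
		prev->sa_sigaction(sig, info, ctx);  /* application's 3-argument handler */
	} else if (prev->sa_handler == SIG_IGN) {
		return;                              /* the application ignored this signal */
	} else if (prev->sa_handler == SIG_DFL) {
		/*
		 * Restore the default disposition and re-raise. The signal is
		 * blocked while this handler runs, so it is delivered with the
		 * default action (typically a core dump for SIGSEGV) on return.
		 */
		sigaction(sig, prev, NULL);
		raise(sig);
	} else {
		prev->sa_handler(sig);               /* application's 1-argument handler */
	}
}

/* Install the oops handler for one signal, remembering what was there before. */
static int oops_install(int sig)
{
	struct sigaction sa;

	if (sig <= 0 || sig >= OOPS_NSIG)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = oops_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(sig, &sa, &prev_action[sig]);
}
```

In this sketch the caller decides whether oops_install() is invoked at all,
which is one way the opt-in behavior requested above could be expressed. Note
that even when the previous handler is chained, the fault passes through the
extra handler frame first, which is the concern raised in this reply.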