From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0FE84A0548; Tue, 17 Aug 2021 12:24:44 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 9570240DF5; Tue, 17 Aug 2021 12:24:43 +0200 (CEST) Received: from mail-io1-f47.google.com (mail-io1-f47.google.com [209.85.166.47]) by mails.dpdk.org (Postfix) with ESMTP id 966374014E for ; Tue, 17 Aug 2021 12:24:42 +0200 (CEST) Received: by mail-io1-f47.google.com with SMTP id a13so26825302iol.5 for ; Tue, 17 Aug 2021 03:24:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=T1cNLu5XPrgF4icimO/3rc5BlYoUG5OhpLtru1klG8E=; b=uTeVgXC1ZRlvePCwSm8CDdYifKH+JbwEmMVm2OZglnQzJbFtsyurBiGvGT3XBEqA68 ucCNr6MWms02uk0OJql42RMgaZlStkz2VUqijgAE11x1bPg4Qy0MRqLvK9RHxsHszMNv i9TPpW6IN0PczrXrZ9XNILncvpaPaRtYm+fP+wgbNRFBTE4rNCWRyGa6DDBq+AJYFzml zsAAy13wYLTy0lgJW7RHfKmcm67a3UXJBsWShVxGo9DUhqiG5cPy/q3005MOswB5VjZn xDTHFnX3IfpnEPHv+J+r9Qx6gQ1CZ9YAk/gD0hE2MLx9Sib6FV6J/OyeN39bA0RFMAus h4XA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=T1cNLu5XPrgF4icimO/3rc5BlYoUG5OhpLtru1klG8E=; b=Uu37rLQT/EuR1QnCsa56QpLkllrU2trCzdUvIesAfJ1PWHCfuhG7DhxZcVbDeq2ti/ yyaih1Xjk+p3TozDzkx0dUbaXuYvqYHLRjziAWY0MLXmECWNlxxuoZ9OXqbiZ5I2hcnD NHbf1PD++UbXMAZvm9oPo0cHYwfObfNu9fZfBJrlwSIiKpegO5sbNRs6n9l19z3YGsiJ AVtKwZ5geTyARBM1TZ9qbVcX967iJGFf1Tg06HFSVCHYXOe3yv9qIVDCOr6chUAjXA7+ paPXHpdoMUyz43E3+hcGnmbHYm515mLZNFY1Av+O26xi3llADNiamQtRxwC+/vnPSWdR 7TRA== X-Gm-Message-State: AOAM532pASZ23g56LsH+0NgKE643AONwPjfHcB4XvD63MQXFodU3q9Xg CvpuAd0fIbMIWubvMpCGo3Al2ANC5XC40rb0Zzg= X-Google-Smtp-Source: ABdhPJwMKAI2pm3ngLwZAoeZPCqtE4t9GFLUnZK/VqwQBlpwlHe8Chp5hSE57oUZGXGJskWuYCgaDcghjp3S7ka06XA= X-Received: by 2002:a02:b88d:: with SMTP id p13mr2287546jam.104.1629195881980; Tue, 17 Aug 2021 03:24:41 -0700 (PDT) MIME-Version: 1.0 References: <20210730084938.2426128-2-jerinj@marvell.com> <20210817032723.3997054-1-jerinj@marvell.com> <20210817032723.3997054-3-jerinj@marvell.com> <20210816205235.3dd48286@hermes.local> In-Reply-To: <20210816205235.3dd48286@hermes.local> From: Jerin Jacob Date: Tue, 17 Aug 2021 15:54:15 +0530 Message-ID: To: Stephen Hemminger Cc: Jerin Jacob , dpdk-dev , Thomas Monjalon , David Marchand , "Richardson, Bruce" , Dmitry Kozlyuk , Narcisa Ana Maria Vasile , "Dmitry Malloy (MESHCHANINOV)" , Pallavi Kadam , "Ananyev, Konstantin" , "Ruifeng Wang (Arm Technology China)" , Jan Viktorin , David Christensen Content-Type: text/plain; charset="UTF-8" Subject: Re: [dpdk-dev] [PATCH v2 2/6] eal: oops handling API implementation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Tue, Aug 17, 2021 at 9:22 AM Stephen Hemminger wrote: > > On Tue, 17 Aug 2021 08:57:19 +0530 > wrote: > > > +#define oops_print(...) rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL, __VA_ARGS__) > > It is problematic to call rte_log from a signal handler. > The malloc pool maybe corrupted and rte_log can call functions that > use malloc. OK. What to use instead, fprint(stderr, ...)? > > Even rte_dump_stack() is unsafe from these signals. OK > > > + > > +static int oops_signals[] = {SIGSEGV, SIGBUS, SIGILL, SIGABRT, SIGFPE, SIGSYS}; > > Should be constant. Ack > > > + > > +struct oops_signal { > > + int sig; > > Redundant, you defined the oops_signals above. Ack. > > > + bool enabled; > > Redundant, you can just compare with action. Anyway, we need to database to hold the sigactions. This makes clean to implement rte_oops_signals_enabled(). Also != SIG_DFL is not enabled. > > > + struct sigaction sa; > > +}; > > + > > +static struct oops_signal signals_db[RTE_DIM(oops_signals)]; > > + > > +static void > > +back_trace_dump(ucontext_t *context) > > +{ > > + RTE_SET_USED(context); > > + > > + rte_dump_stack(); > > +} > > rte_dump_stack() is not safe in signal handler: > > Recommend backtrace_symbols_fd ?? > > Better yet use libunwind libunwind is an optional dependency. You can see in the next patch, back_trace_dump() will be implemented with libunwind based stack unwind, if the dependency is met. > > > +static void > > +siginfo_dump(int sig, siginfo_t *info) > > +{ > > + oops_print("PID: %" PRIdMAX "\n", (intmax_t)getpid()); > > + > > + if (info == NULL) > > + return; > > + if (sig != info->si_signo) > > + oops_print("Invalid signal info\n"); > > + > > + oops_print("Signal number: %d\n", info->si_signo); > > + oops_print("Fault address: %p\n", info->si_addr); > > +} > > + > > +static void > > +mem32_dump(void *ptr) > > Should be const Ack. > > > +{ > > + uint32_t *p = ptr; > > + int i; > > + > > + for (i = 0; i < 16; i++) > > + oops_print("%p: 0x%x\n", p + i, rte_be_to_cpu_32(p[i])); > > +} > > Why reinvent hexdump? Make sense. I can change to hexdump, But, it will use rte_log. Shouldn't we use fprint(stderr,..) variant. > > > + > > +static void > > +stack_dump_header(void) > > +{ > > + oops_print("Stack dump:\n"); > > + oops_print("----------\n"); > > +} > > + > > +static void > > +code_dump_header(void) > > +{ > > + oops_print("Code dump:\n"); > > + oops_print("----------\n"); > > +} > > + > > +static void > > +stack_code_dump(void *stack, void *code) > > +{ > > + if (stack == NULL || code == NULL) > > + return; > > + > > + oops_print("\n"); > > + stack_dump_header(); > > + mem32_dump(stack); > > + oops_print("\n"); > > + > > + code_dump_header(); > > + mem32_dump(code); > > + oops_print("\n"); > > +} > > +static void > > +archinfo_dump(ucontext_t *uc) > > { > > - RTE_SET_USED(sig); > > - RTE_SET_USED(info); > > RTE_SET_USED(uc); > > > > + stack_code_dump(NULL, NULL); > > +} > > + > > +static void > > +default_signal_handler_invoke(int sig) > > +{ > > + unsigned int idx; > > + > > + for (idx = 0; idx < RTE_DIM(oops_signals); idx++) { > > + /* Skip disabled signals */ > > + if (signals_db[idx].sig != sig) > > + continue; > > + if (!signals_db[idx].enabled) > > + continue; > > + /* Replace with stored handler */ > > + sigaction(sig, &signals_db[idx].sa, NULL); > > + kill(getpid(), sig); > > If you use SA_RESETHAND, you don't need this stuff. As mentioned in other 1/6 email reply, This is NOT the case where SIG_DFL handler called from eal oops handler, instead, it will be calling the signal handler which is registered prior to rte_eal_init() which is stored local database. > > > + } > > +} > > + > > +void > > +rte_oops_decode(int sig, siginfo_t *info, ucontext_t *uc) > > +{ > > + oops_print("Signal info:\n"); > > + oops_print("------------\n"); > > + siginfo_dump(sig, info); > > + oops_print("\n"); > > + > > + oops_print("Backtrace:\n"); > > + oops_print("----------\n"); > > + back_trace_dump(uc); > > + oops_print("\n"); > > + > > + oops_print("Arch info:\n"); > > + oops_print("----------\n"); > > + if (uc) > > + archinfo_dump(uc); > > +} > > + > > +static void > > +eal_oops_handler(int sig, siginfo_t *info, void *ctx) > > +{ > > + ucontext_t *uc = ctx; > > + > > + rte_oops_decode(sig, info, uc); > > + default_signal_handler_invoke(sig); > > If you use SA_RESETHAND, then just doing raise(sig) here. > > } > > > > int > > rte_oops_signals_enabled(int *signals) > > Why is this necessary and exported? Explained in 1/6 email reply.