From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 4D578A0548; Tue, 17 Aug 2021 05:52:40 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1206F4014E; Tue, 17 Aug 2021 05:52:40 +0200 (CEST) Received: from mail-pj1-f46.google.com (mail-pj1-f46.google.com [209.85.216.46]) by mails.dpdk.org (Postfix) with ESMTP id 3B5A940142 for ; Tue, 17 Aug 2021 05:52:39 +0200 (CEST) Received: by mail-pj1-f46.google.com with SMTP id u21-20020a17090a8915b02901782c36f543so2673278pjn.4 for ; Mon, 16 Aug 2021 20:52:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=DjYjisgavRkGuU+qGtAT73LcSHPAR8jfAJoDcpIvOCE=; b=1bkKpFHdL7wGhO7AGBeBcDIK/BswfScv8fN15GQsEFh+tY0yj5qVcQWcU6+n1Fp56C fXtH9FvtjdxGwM5Fm6ON+a7q6tot1ojEHln8GC2AclgQRw28aFjCwmmqs9zzI8tiqxXI BQIgqocgJqX8JpB+ObO2lxHkN0jtzf83+DE9NWQ8WQBFMrvKOtsnseeo8hZr1uB7DM6r 8AebSm9z+UYIJSC43KgM9HhgLvooW9k3DOtO/eOvCuIrJ65zcew4K2wpSnppC55rc+ww 44QcIIEUyBFyBAu7fqNtHbZaOCb0teX+w49XPAiSE8EblFowCA4cW33fxJw4W7atKBzk f5wg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=DjYjisgavRkGuU+qGtAT73LcSHPAR8jfAJoDcpIvOCE=; b=VD7+adChnZFQQMON9MEkZhMWD9pyiKV5NKpr4CEK48ccX/dnwyEdNOSed3MTUCd9JY syHdgl8L0ryY48dxMKuBTV8O3i1I6sUE/Ic+RJwzHM1OgiUfRgvFoNWdBpzBt2fBDchU 9AKmBxnI7e+Y81FCgQViYLueA0JQsx+ERIbXsqjo12ItinlBj/6hc6mKCsQDPHZY17ac cfVJNVvoAzlu/dXb1ghARb9Qp6tMLMs2b7XVZKWwArytb78RhfduAXmEZs/nEWYUiQux SpwH0aL2LCDqn9HLmtIK8c98JZWZngksX6boJQRwHdhzb7bbeCVcBhQV7oMjGf89snFE B2/w== X-Gm-Message-State: AOAM530EFSYjI87n7FELuehiAC1VH6YeI4Sja51VwEj/T9ZZ4J8w3nei mho5JpRbAPSljs/4BpK/ZusEzA== X-Google-Smtp-Source: ABdhPJx5dBaiTa3zFPTp2ShhGQEk1/luwyGEY0fpwia/ZlFER91SJu/6r1Bsp0+N8serPO33yH/K4w== X-Received: by 2002:a63:d34e:: with SMTP id u14mr1479703pgi.244.1629172358413; Mon, 16 Aug 2021 20:52:38 -0700 (PDT) Received: from hermes.local (204-195-33-123.wavecable.com. [204.195.33.123]) by smtp.gmail.com with ESMTPSA id y6sm505396pjr.48.2021.08.16.20.52.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 16 Aug 2021 20:52:38 -0700 (PDT) Date: Mon, 16 Aug 2021 20:52:35 -0700 From: Stephen Hemminger To: Cc: , , , , , , , , , , , Message-ID: <20210816205235.3dd48286@hermes.local> In-Reply-To: <20210817032723.3997054-3-jerinj@marvell.com> References: <20210730084938.2426128-2-jerinj@marvell.com> <20210817032723.3997054-1-jerinj@marvell.com> <20210817032723.3997054-3-jerinj@marvell.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v2 2/6] eal: oops handling API implementation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Tue, 17 Aug 2021 08:57:19 +0530 wrote: > +#define oops_print(...) rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL, __VA_ARGS__) It is problematic to call rte_log from a signal handler. The malloc pool maybe corrupted and rte_log can call functions that use malloc. Even rte_dump_stack() is unsafe from these signals. > + > +static int oops_signals[] = {SIGSEGV, SIGBUS, SIGILL, SIGABRT, SIGFPE, SIGSYS}; Should be constant. > + > +struct oops_signal { > + int sig; Redundant, you defined the oops_signals above. > + bool enabled; Redundant, you can just compare with action. > + struct sigaction sa; > +}; > + > +static struct oops_signal signals_db[RTE_DIM(oops_signals)]; > + > +static void > +back_trace_dump(ucontext_t *context) > +{ > + RTE_SET_USED(context); > + > + rte_dump_stack(); > +} rte_dump_stack() is not safe in signal handler: Recommend backtrace_symbols_fd ?? Better yet use libunwind > +static void > +siginfo_dump(int sig, siginfo_t *info) > +{ > + oops_print("PID: %" PRIdMAX "\n", (intmax_t)getpid()); > + > + if (info == NULL) > + return; > + if (sig != info->si_signo) > + oops_print("Invalid signal info\n"); > + > + oops_print("Signal number: %d\n", info->si_signo); > + oops_print("Fault address: %p\n", info->si_addr); > +} > + > +static void > +mem32_dump(void *ptr) Should be const > +{ > + uint32_t *p = ptr; > + int i; > + > + for (i = 0; i < 16; i++) > + oops_print("%p: 0x%x\n", p + i, rte_be_to_cpu_32(p[i])); > +} Why reinvent hexdump? > + > +static void > +stack_dump_header(void) > +{ > + oops_print("Stack dump:\n"); > + oops_print("----------\n"); > +} > + > +static void > +code_dump_header(void) > +{ > + oops_print("Code dump:\n"); > + oops_print("----------\n"); > +} > + > +static void > +stack_code_dump(void *stack, void *code) > +{ > + if (stack == NULL || code == NULL) > + return; > + > + oops_print("\n"); > + stack_dump_header(); > + mem32_dump(stack); > + oops_print("\n"); > + > + code_dump_header(); > + mem32_dump(code); > + oops_print("\n"); > +} > +static void > +archinfo_dump(ucontext_t *uc) > { > - RTE_SET_USED(sig); > - RTE_SET_USED(info); > RTE_SET_USED(uc); > > + stack_code_dump(NULL, NULL); > +} > + > +static void > +default_signal_handler_invoke(int sig) > +{ > + unsigned int idx; > + > + for (idx = 0; idx < RTE_DIM(oops_signals); idx++) { > + /* Skip disabled signals */ > + if (signals_db[idx].sig != sig) > + continue; > + if (!signals_db[idx].enabled) > + continue; > + /* Replace with stored handler */ > + sigaction(sig, &signals_db[idx].sa, NULL); > + kill(getpid(), sig); If you use SA_RESETHAND, you don't need this stuff. > + } > +} > + > +void > +rte_oops_decode(int sig, siginfo_t *info, ucontext_t *uc) > +{ > + oops_print("Signal info:\n"); > + oops_print("------------\n"); > + siginfo_dump(sig, info); > + oops_print("\n"); > + > + oops_print("Backtrace:\n"); > + oops_print("----------\n"); > + back_trace_dump(uc); > + oops_print("\n"); > + > + oops_print("Arch info:\n"); > + oops_print("----------\n"); > + if (uc) > + archinfo_dump(uc); > +} > + > +static void > +eal_oops_handler(int sig, siginfo_t *info, void *ctx) > +{ > + ucontext_t *uc = ctx; > + > + rte_oops_decode(sig, info, uc); > + default_signal_handler_invoke(sig); If you use SA_RESETHAND, then just doing raise(sig) here. > } > > int > rte_oops_signals_enabled(int *signals) Why is this necessary and exported? > { > - RTE_SET_USED(signals); > + int count = 0, sig[RTE_OOPS_SIGNALS_MAX]; > + unsigned int idx = 0; > > - return 0; > + for (idx = 0; idx < RTE_DIM(oops_signals); idx++) { > + if (signals_db[idx].enabled) { > + sig[count] = signals_db[idx].sig; > + count++; > + } > + } > + if (signals) > + memcpy(signals, sig, sizeof(*signals) * count); > + > + return count; > } > > int > eal_oops_init(void) > { > - return 0; > + unsigned int idx, rc = 0; > + struct sigaction sa; > + > + RTE_BUILD_BUG_ON(RTE_DIM(oops_signals) > RTE_OOPS_SIGNALS_MAX); > + > + sigemptyset(&sa.sa_mask); > + sa.sa_sigaction = &eal_oops_handler; > + sa.sa_flags = SA_RESTART | SA_SIGINFO | SA_ONSTACK; > + > + for (idx = 0; idx < RTE_DIM(oops_signals); idx++) { > + signals_db[idx].sig = oops_signals[idx]; > + /* Get exiting sigaction */ > + rc = sigaction(signals_db[idx].sig, NULL, &signals_db[idx].sa); > + if (rc) > + continue; > + /* Replace with oops handler */ > + rc = sigaction(signals_db[idx].sig, &sa, NULL); > + if (rc) > + continue; > + signals_db[idx].enabled = true; > + } > + return rc; > } > > void > eal_oops_fini(void) > { > + unsigned int idx; > + > + for (idx = 0; idx < RTE_DIM(oops_signals); idx++) { > + if (!signals_db[idx].enabled) > + continue; > + /* Replace with stored handler */ > + sigaction(signals_db[idx].sig, &signals_db[idx].sa, NULL); > + } > }