From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 8C5DCA00C3; Sat, 29 Jan 2022 09:25:29 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1088D4068B; Sat, 29 Jan 2022 09:25:29 +0100 (CET) Received: from smartserver.smartsharesystems.com (smartserver.smartsharesystems.com [77.243.40.215]) by mails.dpdk.org (Postfix) with ESMTP id D964540041 for ; Sat, 29 Jan 2022 09:25:27 +0100 (CET) Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Subject: RE: [RFC] eal_debug: do not use malloc in rte_dump_stack X-MimeOLE: Produced By Microsoft Exchange V6.5 Date: Sat, 29 Jan 2022 09:25:26 +0100 Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D86E59@smartserver.smartshare.dk> In-Reply-To: <20220129011039.264377-1-stephen@networkplumber.org> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [RFC] eal_debug: do not use malloc in rte_dump_stack Thread-Index: AdgUrQ5cTHRkdBG4Qh6GCa1PmYvvqgAPBCkw References: <20220129011039.264377-1-stephen@networkplumber.org> From: =?iso-8859-1?Q?Morten_Br=F8rup?= To: "Stephen Hemminger" , X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org > From: Stephen Hemminger [mailto:stephen@networkplumber.org] > Sent: Saturday, 29 January 2022 02.11 >=20 > The glibc backtrace_symbols() calls malloc which makes it > dangerous to use rte_dump_stack() in a signal handler that > is handling errors that maybe due to memory corruption. Yes. We have experienced that problem with backtrace_symbols(); so as a = workaround, our failure signal handler dumps all other information = first, and calls backtrace_symbols() last, in case it crashes. >=20 > Instead, use dladdr() to lookup up symbols incrementally. I took a brief look at the dladdr() source code, and it looks good to = me. >=20 > The format of the messages is based on what X org server > has been doing for many years. It changes from bottom up > to top down order. Good idea. Seems more logical. >=20 > Bugzilla ID: 929 > Signed-off-by: Stephen Hemminger > --- > lib/eal/linux/eal_debug.c | 45 = ++++++++++++++++++++++++++++----------- > 1 file changed, 32 insertions(+), 13 deletions(-) >=20 > diff --git a/lib/eal/linux/eal_debug.c b/lib/eal/linux/eal_debug.c > index 64dab4e0da24..bf232f72f402 100644 > --- a/lib/eal/linux/eal_debug.c > +++ b/lib/eal/linux/eal_debug.c > @@ -4,6 +4,7 @@ >=20 > #ifdef RTE_BACKTRACE > #include > +#include > #endif > #include > #include > @@ -18,26 +19,44 @@ >=20 > #define BACKTRACE_SIZE 256 >=20 > -/* dump the stack of the calling core */ > +/* Dump the stack of the calling core > + * > + * Note: this requires some careful usage in order to > + * stay safe in case where called from a signal > + * handler and the malloc pool may be corrupted. > + */ > void rte_dump_stack(void) > { > #ifdef RTE_BACKTRACE > void *func[BACKTRACE_SIZE]; > - char **symb =3D NULL; > - int size; > + int i, size; >=20 > size =3D backtrace(func, BACKTRACE_SIZE); > - symb =3D backtrace_symbols(func, size); > - > - if (symb =3D=3D NULL) > - return; >=20 > - while (size > 0) { > - rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL, > - "%d: [%s]\n", size, symb[size - 1]); > - size --; > + for (i =3D 0; i < size; i++) { > + void *pc =3D func[i]; > + const char *fname; > + Dl_info info; > + > + if (dladdr(pc, &info) =3D=3D 0) { > + rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL, > + "%d: ?? [%p]\n", i, pc); > + continue; > + } > + > + fname =3D (info.dli_fname && *info.dli_fname) ? > info.dli_fname : "(vdso)"; > + if (info.dli_saddr !=3D NULL) > + rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL, > + "%d: %s (%s+%#tx) [%p]\n", > + i, fname, info.dli_sname, > + (ptrdiff_t)((uintptr_t)pc - > (uintptr_t)info.dli_saddr), > + pc); > + else > + rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL, > + "%d: %s (%p+%#tx) [%p]\n", > + i, fname, info.dli_fbase, > + (ptrdiff_t)((uintptr_t)pc - > (uintptr_t)info.dli_fbase), > + pc); > } > - > free(symb); Probably something is lost in formatting here, but free(symb) must also = be removed. > #endif /* RTE_BACKTRACE */ > } > -- > 2.34.1 >=20 Great improvement, Stephen! Acked-by: Morten Br=F8rup