From: Bruce Richardson <bruce.richardson@intel.com>
To: zhichaox.zeng@intel.com
Cc: dev@dpdk.org, stable@dpdk.org, qiming.yang@intel.com,
david.marchand@redhat.com, stephen@networkplumber.org,
mb@smartsharesystems.com, Harman Kalra <hkalra@marvell.com>
Subject: Re: [PATCH v4] lib/eal: fix segfaults due to thread exit order
Date: Thu, 30 Jun 2022 13:24:31 +0100
Message-ID: <Yr2V/yvgbWB4l6xW@bricha3-MOBL.ger.corp.intel.com>
In-Reply-To: <20220615060154.6905-1-zhichaox.zeng@intel.com>

On Wed, Jun 15, 2022 at 02:01:54PM +0800, zhichaox.zeng@intel.com wrote:
> From: Zhichao Zeng <zhichaox.zeng@intel.com>
>
> The eal-intr-thread is not closed before memory cleanup in the
> process of exiting. There is a small probability that when the
> eal-intr-thread is about to use some pointers, the memory has
> just been cleaned up, which causes the segmentation fault caught by ASan.
>
> This patch closes the eal-intr-thread before memory cleanup when
> exiting to avoid the segmentation fault. It also adds some atomic
> operations to avoid executing rte_eal_cleanup in the child process
> spawned by fork() in some test cases, e.g. debug_autotest of dpdk-test.
>
> Cc: stable@dpdk.org
>
Hi,

some comments inline below.

/Bruce
> ---
> v2:
> add the same API for FreeBSD
> ---
> v3:
> fix rte_eal_cleanup crash in debug_autotest
> ---
> v4:
> shorten the prompt message and optimize the commit log
>
Please put this version changelog below the cutline ("---") that follows
the sign-offs, i.e. immediately before the diffstat.
> Suggested-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Zhichao Zeng <zhichaox.zeng@intel.com>
> ---
> lib/eal/common/eal_private.h | 7 +++++++
> lib/eal/freebsd/eal.c | 21 ++++++++++++++++++++-
> lib/eal/freebsd/eal_interrupts.c | 12 ++++++++++++
> lib/eal/linux/eal.c | 20 +++++++++++++++++++-
> lib/eal/linux/eal_interrupts.c | 12 ++++++++++++
> 5 files changed, 70 insertions(+), 2 deletions(-)
>
> diff --git a/lib/eal/common/eal_private.h b/lib/eal/common/eal_private.h
> index 44d14241f0..7adf41b7d7 100644
> --- a/lib/eal/common/eal_private.h
> +++ b/lib/eal/common/eal_private.h
> @@ -152,6 +152,13 @@ int rte_eal_tailqs_init(void);
> */
> int rte_eal_intr_init(void);
>
> +/**
> + * Destroy interrupt handling thread.
> + *
> + * This function is private to EAL.
> + */
> +void rte_eal_intr_destroy(void);
> +
> /**
> * Close the default log stream
> *
> diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
> index a6b20960f2..4882f27abd 100644
> --- a/lib/eal/freebsd/eal.c
> +++ b/lib/eal/freebsd/eal.c
> @@ -72,6 +72,8 @@ struct lcore_config lcore_config[RTE_MAX_LCORE];
> /* used by rte_rdtsc() */
> int rte_cycles_vmware_tsc_map;
>
> +/* used to judge the running status of the eal */
> +static uint32_t run_once;
>
I don't like just moving this variable out of the rte_eal_init function.
Inside rte_eal_init the name "run_once" made sense, as it tracked whether
the EAL init function had already been run. Now that it is a file-scope
variable that is also read by rte_eal_cleanup, the name "run_once" no
longer makes sense.

Two suggestions:
1. Keep run_once in EAL init as-is, and use a different variable or value
   to indicate to cleanup that DPDK is initialized.
2. Move the variable as you have here, but rename it to something more
   meaningful.
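
As a minimal sketch of option 2, assuming a hypothetical name such as
"eal_initialized" (not an existing EAL symbol) and stand-in functions
rather than the real rte_eal_init/rte_eal_cleanup bodies, something along
these lines would read more naturally. It relies on the GCC/Clang __atomic
builtins already used in the patch:

```c
#include <stdint.h>

/* Hypothetical name: non-zero once EAL-style init has completed. */
static uint32_t eal_initialized;

/* Stand-in for rte_eal_init(): returns 0 on first call, -1 if already run. */
int
demo_eal_init(void)
{
	uint32_t expected = 0;

	/* Claim the flag atomically so that only the first caller proceeds. */
	if (!__atomic_compare_exchange_n(&eal_initialized, &expected, 1, 0,
			__ATOMIC_RELAXED, __ATOMIC_RELAXED))
		return -1;
	return 0;
}

/*
 * Stand-in for rte_eal_cleanup(): returns 1 if teardown would run,
 * 0 if it is skipped because init never completed in this process.
 * (The real function returns 0 in both cases; the distinct return
 * values here are only to make the sketch observable.)
 */
int
demo_eal_cleanup(void)
{
	if (__atomic_load_n(&eal_initialized, __ATOMIC_RELAXED) == 0)
		return 0;
	/* ... real teardown would go here ... */
	return 1;
}
```

With a name like this, the check in cleanup explains itself without needing
a comment about what "run_once" means outside of init.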
> int
> eal_clean_runtime_dir(void)
> @@ -574,12 +576,22 @@ static void rte_eal_init_alert(const char *msg)
> RTE_LOG(ERR, EAL, "%s\n", msg);
> }
>
> +static void warn_parent(void)
> +{
> + RTE_LOG(WARNING, EAL, "DPDK won't work in the child process\n");
> +}
I wonder whether this message contains enough information. Can we briefly
identify which parts will or won't work or, if we just want to deny
everything, give a brief reason why?
> +
> +static void scratch_child(void)
> +{
> + /* Scratch run_once so that a call to rte_eal_cleanup won't crash... */
> + __atomic_store_n(&run_once, 0, __ATOMIC_RELAXED);
> +}
> +
I think the name of this function needs improvement. I'm not sure that
"scratch" is the best term to use. Something like "clear_eal_flag" is
probably better.
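
To illustrate the renamed handler, here is a small standalone sketch. It
assumes a hypothetical flag "eal_flag" and demo helper names, and is not
the EAL code itself. It registers the function as the child-side handler
(third argument of pthread_atfork), forks once, and reports the flag value
the child sees through its exit status. On some toolchains it may need to
be built with -pthread:

```c
#include <pthread.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical flag; starts at 1 to pretend that init has already run. */
static uint32_t eal_flag = 1;

/* Child-side fork handler: runs in the child before fork() returns there. */
static void
clear_eal_flag(void)
{
	__atomic_store_n(&eal_flag, 0, __ATOMIC_RELAXED);
}

/*
 * Fork once and return the flag value observed in the child (passed back
 * via the child's exit status), or -1 on any error.
 */
int
demo_flag_in_child(void)
{
	int status;
	pid_t pid;

	/* Arguments are (prepare, parent, child); only the child hook is used. */
	if (pthread_atfork(NULL, NULL, clear_eal_flag) != 0)
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit((int)__atomic_load_n(&eal_flag, __ATOMIC_RELAXED));

	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}

/* Flag value in the calling (parent) process; the handler never ran here. */
uint32_t
demo_flag_in_parent(void)
{
	return __atomic_load_n(&eal_flag, __ATOMIC_RELAXED);
}
```

The child sees the flag cleared while the parent's copy is untouched, which
is the behaviour the name "clear_eal_flag" is meant to convey.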
> /* Launch threads, called at application init(). */
> int
> rte_eal_init(int argc, char **argv)
> {
> int i, fctret, ret;
> - static uint32_t run_once;
> uint32_t has_run = 0;
> char cpuset[RTE_CPU_AFFINITY_STR_LEN];
> char thread_name[RTE_MAX_THREAD_NAME_LEN];
> @@ -883,6 +895,8 @@ rte_eal_init(int argc, char **argv)
>
> eal_mcfg_complete();
>
> + pthread_atfork(NULL, warn_parent, scratch_child);
> +
> return fctret;
> }
>
> @@ -891,8 +905,13 @@ rte_eal_cleanup(void)
> {
> struct internal_config *internal_conf =
> eal_get_internal_configuration();
> +
> + if (__atomic_load_n(&run_once, __ATOMIC_RELAXED) == 0)
> + return 0;
> +
> rte_service_finalize();
> rte_mp_channel_cleanup();
> + rte_eal_intr_destroy();
> /* after this point, any DPDK pointers will become dangling */
> rte_eal_memory_detach();
> rte_eal_alarm_cleanup();
<snip for brevity>