From: zhichaox.zeng@intel.com
To: dev@dpdk.org
Cc: stable@dpdk.org, qiming.yang@intel.com, david.marchand@redhat.com,
 stephen@networkplumber.org, mb@smartsharesystems.com,
 Zhichao Zeng, Bruce Richardson, Harman Kalra
Subject: [PATCH v4] lib/eal: fix segfaults due to thread exit order
Date: Wed, 15 Jun 2022 14:01:54 +0800
Message-Id: <20220615060154.6905-1-zhichaox.zeng@intel.com>
In-Reply-To: <20220530134738.488602-1-zhichaox.zeng@intel.com>
References: <20220530134738.488602-1-zhichaox.zeng@intel.com>
List-Id: DPDK patches and discussions

From: Zhichao Zeng

The eal-intr-thread is not stopped before memory cleanup when the
process exits. There is a small chance that the thread is about to use
pointers into memory that has just been cleaned up, which causes the
segmentation fault caught by ASan.

This patch stops the eal-intr-thread before memory cleanup on exit to
avoid the segmentation fault. It also adds atomic operations to avoid
executing rte_eal_cleanup in the child process spawned by fork() in
some test cases, e.g. debug_autotest of dpdk-test.

Cc: stable@dpdk.org

---
v2: add the same API for FreeBSD
---
v3: fix rte_eal_cleanup crash in debug_autotest
---
v4: shorten the prompt message and optimize the commit log

Suggested-by: David Marchand
Signed-off-by: Zhichao Zeng
---
 lib/eal/common/eal_private.h     |  7 +++++++
 lib/eal/freebsd/eal.c            | 21 ++++++++++++++++++++-
 lib/eal/freebsd/eal_interrupts.c | 12 ++++++++++++
 lib/eal/linux/eal.c              | 20 +++++++++++++++++++-
 lib/eal/linux/eal_interrupts.c   | 12 ++++++++++++
 5 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/lib/eal/common/eal_private.h b/lib/eal/common/eal_private.h
index 44d14241f0..7adf41b7d7 100644
--- a/lib/eal/common/eal_private.h
+++ b/lib/eal/common/eal_private.h
@@ -152,6 +152,13 @@ int rte_eal_tailqs_init(void);
  */
 int rte_eal_intr_init(void);
 
+/**
+ * Destroy interrupt handling thread.
+ *
+ * This function is private to EAL.
+ */
+void rte_eal_intr_destroy(void);
+
 /**
  * Close the default log stream
  *
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index a6b20960f2..4882f27abd 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -72,6 +72,8 @@ struct lcore_config lcore_config[RTE_MAX_LCORE];
 /* used by rte_rdtsc() */
 int rte_cycles_vmware_tsc_map;
 
+/* used to judge the running status of the eal */
+static uint32_t run_once;
 
 int
 eal_clean_runtime_dir(void)
@@ -574,12 +576,22 @@ static void rte_eal_init_alert(const char *msg)
 	RTE_LOG(ERR, EAL, "%s\n", msg);
 }
 
+static void warn_parent(void)
+{
+	RTE_LOG(WARNING, EAL, "DPDK won't work in the child process\n");
+}
+
+static void scratch_child(void)
+{
+	/* Scratch run_once so that a call to rte_eal_cleanup won't crash... */
+	__atomic_store_n(&run_once, 0, __ATOMIC_RELAXED);
+}
+
 /* Launch threads, called at application init(). */
 int
 rte_eal_init(int argc, char **argv)
 {
 	int i, fctret, ret;
-	static uint32_t run_once;
 	uint32_t has_run = 0;
 	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
 	char thread_name[RTE_MAX_THREAD_NAME_LEN];
@@ -883,6 +895,8 @@ rte_eal_init(int argc, char **argv)
 
 	eal_mcfg_complete();
 
+	pthread_atfork(NULL, warn_parent, scratch_child);
+
 	return fctret;
 }
 
@@ -891,8 +905,13 @@ rte_eal_cleanup(void)
 {
 	struct internal_config *internal_conf =
 		eal_get_internal_configuration();
+
+	if (__atomic_load_n(&run_once, __ATOMIC_RELAXED) == 0)
+		return 0;
+
 	rte_service_finalize();
 	rte_mp_channel_cleanup();
+	rte_eal_intr_destroy();
 	/* after this point, any DPDK pointers will become dangling */
 	rte_eal_memory_detach();
 	rte_eal_alarm_cleanup();
diff --git a/lib/eal/freebsd/eal_interrupts.c b/lib/eal/freebsd/eal_interrupts.c
index 9f720bdc8f..cac3859b06 100644
--- a/lib/eal/freebsd/eal_interrupts.c
+++ b/lib/eal/freebsd/eal_interrupts.c
@@ -648,6 +648,18 @@ rte_eal_intr_init(void)
 	return ret;
 }
 
+void
+rte_eal_intr_destroy(void)
+{
+	/* cancel the host thread to wait/handle the interrupt */
+	pthread_cancel(intr_thread);
+	pthread_join(intr_thread, NULL);
+
+	/* close kqueue */
+	close(kq);
+	kq = -1;
+}
+
 int
 rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd, int op,
 		unsigned int vec, void *data)
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 1ef263434a..effebb33a6 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -76,6 +76,8 @@ struct lcore_config lcore_config[RTE_MAX_LCORE];
 /* used by rte_rdtsc() */
 int rte_cycles_vmware_tsc_map;
 
+/* used to judge the running status of the eal */
+static uint32_t run_once;
 
 int
 eal_clean_runtime_dir(void)
@@ -857,12 +859,22 @@ is_iommu_enabled(void)
 	return n > 2;
 }
 
+static void warn_parent(void)
+{
+	RTE_LOG(WARNING, EAL, "DPDK won't work in the child process\n");
+}
+
+static void scratch_child(void)
+{
+	/* Scratch run_once so that a call to rte_eal_cleanup won't crash... */
+	__atomic_store_n(&run_once, 0, __ATOMIC_RELAXED);
+}
+
 /* Launch threads, called at application init(). */
 int
 rte_eal_init(int argc, char **argv)
 {
 	int i, fctret, ret;
-	static uint32_t run_once;
 	uint32_t has_run = 0;
 	const char *p;
 	static char logid[PATH_MAX];
@@ -1228,6 +1240,8 @@ rte_eal_init(int argc, char **argv)
 
 	eal_mcfg_complete();
 
+	pthread_atfork(NULL, warn_parent, scratch_child);
+
 	return fctret;
 }
 
@@ -1257,6 +1271,9 @@ rte_eal_cleanup(void)
 	struct internal_config *internal_conf =
 		eal_get_internal_configuration();
 
+	if (__atomic_load_n(&run_once, __ATOMIC_RELAXED) == 0)
+		return 0;
+
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
 			internal_conf->hugepage_file.unlink_existing)
 		rte_memseg_walk(mark_freeable, NULL);
@@ -1266,6 +1283,7 @@ rte_eal_cleanup(void)
 	vfio_mp_sync_cleanup();
 #endif
 	rte_mp_channel_cleanup();
+	rte_eal_intr_destroy();
 	/* after this point, any DPDK pointers will become dangling */
 	rte_eal_memory_detach();
 	eal_mp_dev_hotplug_cleanup();
diff --git a/lib/eal/linux/eal_interrupts.c b/lib/eal/linux/eal_interrupts.c
index d52ec8eb4c..7e9853e8e7 100644
--- a/lib/eal/linux/eal_interrupts.c
+++ b/lib/eal/linux/eal_interrupts.c
@@ -1199,6 +1199,18 @@ rte_eal_intr_init(void)
 	return ret;
 }
 
+void
+rte_eal_intr_destroy(void)
+{
+	/* cancel the host thread to wait/handle the interrupt */
+	pthread_cancel(intr_thread);
+	pthread_join(intr_thread, NULL);
+
+	/* close the pipe used by epoll */
+	close(intr_pipe.writefd);
+	close(intr_pipe.readfd);
+}
+
 static void
 eal_intr_proc_rxtx_intr(int fd, const struct rte_intr_handle *intr_handle)
 {
-- 
2.25.1
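
Note for readers of the thread: the two ideas in this patch can be tried
outside DPDK. One is to cancel and join the helper thread before freeing
the memory it dereferences; the other is to reset the "initialized" flag
in a fork()ed child so that cleanup becomes a no-op there. The sketch
below is illustrative only and is not DPDK code. Every demo_* name is
hypothetical, and it assumes a POSIX system with GCC or Clang for the
__atomic builtins.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-ins for EAL state; none of these exist in DPDK. */
static uint32_t demo_initialized;	/* plays the role of run_once */
static int *demo_shared;		/* memory the worker dereferences */
static pthread_t demo_worker;		/* plays the role of intr_thread */

static void *demo_worker_main(void *arg)
{
	const struct timespec ts = { 0, 1000000 };	/* 1 ms */

	(void)arg;
	for (;;) {
		/* Touch the shared memory, as the interrupt thread would. */
		(void)*(volatile int *)demo_shared;
		/* nanosleep() is a cancellation point. */
		nanosleep(&ts, NULL);
	}
}

/* Child-side atfork handler: the worker thread does not exist here. */
static void demo_child_after_fork(void)
{
	__atomic_store_n(&demo_initialized, 0, __ATOMIC_RELAXED);
}

/* Allocate the shared memory and start the worker. Call once. */
int demo_init(void)
{
	demo_shared = calloc(1, sizeof(*demo_shared));
	if (demo_shared == NULL)
		return -1;
	if (pthread_create(&demo_worker, NULL, demo_worker_main, NULL) != 0) {
		free(demo_shared);
		demo_shared = NULL;
		return -1;
	}
	__atomic_store_n(&demo_initialized, 1, __ATOMIC_RELAXED);
	pthread_atfork(NULL, NULL, demo_child_after_fork);
	return 0;
}

/* Returns 1 if cleanup ran, 0 if it was skipped. */
int demo_cleanup(void)
{
	int rc;

	if (__atomic_load_n(&demo_initialized, __ATOMIC_RELAXED) == 0)
		return 0;

	/* Stop the worker first, then release what it was using. */
	pthread_cancel(demo_worker);
	rc = pthread_join(demo_worker, NULL);
	assert(rc == 0);
	(void)rc;

	free(demo_shared);
	demo_shared = NULL;
	__atomic_store_n(&demo_initialized, 0, __ATOMIC_RELAXED);
	return 1;
}

/* Fork, call demo_cleanup() in the child, and return its result. */
int demo_cleanup_in_child(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(demo_cleanup());
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

A caller would run demo_init() once, after which demo_cleanup_in_child()
is expected to report that cleanup was skipped in the child, while
demo_cleanup() in the parent joins the worker before freeing. Without the
child handler, the child would try to cancel and join a thread that was
not carried across fork().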