From: "Burakov, Anatoly"
To: David Marchand, Harry van Haaren, "Ananyev, Konstantin"
Cc: dev, Aaron Conole, dpdk stable
Date: Mon, 6 Apr 2020 11:30:46 +0100
Subject: Re: [dpdk-dev] [PATCH v2] eal/service: fix exit by resetting service lcores
References: <20200310133304.39951-1-harry.van.haaren@intel.com>
 <20200311143927.76021-1-harry.van.haaren@intel.com>

On 13-Mar-20 10:04 AM, David Marchand wrote:
> On Wed, Mar 11, 2020 at 3:39 PM Harry van Haaren wrote:
>>
>> This commit releases all service cores from their role,
>> returning them to ROLE_RTE on rte_service_finalize().
>>
>> This may fix an issue relating to the service cores causing
>
> s/may fix/fixes/
>
>> a race-condition on eal_cleanup(), where the service core
>> could still be executing while the main thread has already
>> free-d the service memory, leading to a segfault.
>>
>> Fixes: 21698354c832 ("service: introduce service cores concept")
>
> Replaced with:
> Fixes: da23f0aa87d8 ("service: fix memory leak with new function")
>
>> Cc: stable@dpdk.org
>>
>> Reported-by: David Marchand
>> Reported-by: Aaron Conole
>> Signed-off-by: David Marchand
>> Signed-off-by: Harry van Haaren
>> Acked-by: Aaron Conole
>
> Applied, thanks.
>

This patch breaks a couple of apps (or rather, the apps were broken to
begin with, but the brokenness has been exposed by this patch).

A "good" way to handle a SIGINT is to catch it, set some kind of global
exit flag, and return from the signal handler, so that all of the
threads see the exit flag, stop spinning, exit their main loops and
proceed to shut down gracefully. That's what the majority of our apps
do.

A bad way to handle SIGINT is to call rte_exit() inside the signal
handler without setting any global exit flag. Since rte_exit() now waits
for all of the threads to stop, the exit will never actually happen:
the threads can't stop without an exit signal, and the signal handler
never provided one.

Affected apps:

* l3fwd-power (I'm preparing a patch)
* ip_reassembly (see main.c:988) - +Konstantin

There are also a bunch of apps that simply call exit(0) and do an
unclean shutdown without any DPDK cleanup, and also apps where I have no
idea what they're doing (calling kill() on themselves in the SIGINT
handler? l3fwd-cat does that, and so do a bunch of others), but this is
probably a bigger problem that should be addressed separately.

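To illustrate the "good" pattern described above, here is a minimal
sketch in plain C with pthreads standing in for lcores. It is not taken
from any of the apps mentioned; force_quit, worker_loop() and
shutdown_demo() are hypothetical names, and the raise(SIGINT) call only
simulates the user pressing Ctrl-C. The handler does nothing except set
the flag, and the cleanup happens on the main thread after the worker
has left its loop.

```c
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical global exit flag, shared by the handler and all workers. */
static atomic_int force_quit;

/* The handler only records the request; it does not call rte_exit(). */
static void signal_handler(int signum)
{
	(void)signum;
	atomic_store(&force_quit, 1);
}

/* Stand-in for an lcore main loop: spin until the exit flag is set. */
static void *worker_loop(void *arg)
{
	atomic_ulong *iterations = arg;

	while (!atomic_load(&force_quit))
		atomic_fetch_add(iterations, 1); /* real app: rx/process/tx */
	return NULL;
}

/*
 * Demo: start one worker, simulate Ctrl-C, then wait for the worker to
 * leave its loop. Returns 0 if the worker exited after the flag was set.
 */
int shutdown_demo(void)
{
	pthread_t worker;
	atomic_ulong iterations = 0;

	atomic_store(&force_quit, 0);
	if (signal(SIGINT, signal_handler) == SIG_ERR ||
	    signal(SIGTERM, signal_handler) == SIG_ERR)
		return -1;
	if (pthread_create(&worker, NULL, worker_loop, &iterations) != 0)
		return -1;

	raise(SIGINT); /* stands in for the user pressing Ctrl-C */

	if (pthread_join(worker, NULL) != 0)
		return -1;
	/* A DPDK app would now stop its ports and call rte_eal_cleanup(). */
	return atomic_load(&force_quit) == 1 ? 0 : -1;
}
```

In the "bad" variant, the handler would call rte_exit() directly and
worker_loop() would have nothing to observe, so the join (or the
equivalent wait inside the exit path) would block forever.
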
--
Thanks,
Anatoly