From: "Burakov, Anatoly"
To: Jeff Guo, stephen@networkplumber.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, shaopeng.he@intel.com, bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com, jerin.jacob@caviumnetworks.com
Cc: jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, helin.zhang@intel.com
Date: Tue, 2 Oct 2018 16:53:46 +0100
Subject: Re: [dpdk-dev] [PATCH v12 6/7] eal: add failure handle mechanism for hot-unplug
In-Reply-To: <1538483726-96411-7-git-send-email-jia.guo@intel.com>
References: <1498711073-42917-1-git-send-email-jia.guo@intel.com> <1538483726-96411-1-git-send-email-jia.guo@intel.com> <1538483726-96411-7-git-send-email-jia.guo@intel.com>

On 02-Oct-18 1:35 PM, Jeff Guo wrote:
> The mechanism can initially register the sigbus handler after the device
> event monitor is enabled. When a sigbus event is captured, it will check
> the failure address and accordingly handle the memory failure of the
> corresponding device by invoke the hot-unplug handler. It could prevent
> the application from crashing when a device is hot-unplugged.
>
> By this patch, users could call below new added APIs to enable/disable
> the device hotplug handle mechanism. Note that it just implement the
> hot-unplug handler in these functions, the other handler of hotplug, such
> as handler for hotplug binding, could be add in the future if need:
>    - rte_dev_hotplug_handle_enable
>    - rte_dev_hotplug_handle_disable
>
> Signed-off-by: Jeff Guo
> ---

> +static void sigbus_handler(int signum, siginfo_t *info,
> +				void *ctx __rte_unused)
> +{
> +	int ret;
> +
> +	RTE_LOG(INFO, EAL, "Thread[%d] catch SIGBUS, fault address:%p\n",
> +		(int)pthread_self(), info->si_addr);
> +
> +	rte_spinlock_lock(&failure_handle_lock);
> +	ret = rte_bus_sigbus_handler(info->si_addr);
> +	rte_spinlock_unlock(&failure_handle_lock);
> +	if (ret == -1) {
> +		rte_exit(EXIT_FAILURE,
> +			 "Failed to handle SIGBUS for hot-unplug, "
> +			 "(rte_errno: %s)!", strerror(rte_errno));

Do we really want to exit the application when handling the SIGBUS fails?

> +	} else if (ret == 1) {
> +		if (sigbus_action_old.sa_handler)
> +			(*(sigbus_action_old.sa_handler))(signum);
> +		else
> +			rte_exit(EXIT_FAILURE,
> +				 "Failed to handle generic SIGBUS!");
> +	}
> +
> +	RTE_LOG(INFO, EAL, "Success to handle SIGBUS for hot-unplug!\n");

Again, does all of this need to be logged at INFO level? IMO it should be
DEBUG.

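To make the rte_exit() question above concrete, here is a minimal sketch of
one alternative: when the bus layer does not claim the fault, delegate to
whatever SIGBUS action was installed before, instead of exiting. It uses
plain POSIX only. Every demo_* name is hypothetical, demo_bus_sigbus_handler()
is just a stand-in for rte_bus_sigbus_handler(), and treating an error (-1)
the same as "not ours" (1) is an assumption of this sketch, not something the
patch or this review settles.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for rte_bus_sigbus_handler():
 * 0 = handled as hot-unplug, 1 = not a device address, -1 = error.
 * In this sketch it always reports "not a device address". */
static int demo_bus_sigbus_handler(const void *addr)
{
	(void)addr;
	return 1;
}

/* Action that was installed before ours, saved at registration time. */
static struct sigaction demo_old_action;

/* Hand the signal to the previously installed action. */
static void demo_chain_to_old(int signum, siginfo_t *info, void *ctx)
{
	if (demo_old_action.sa_flags & SA_SIGINFO) {
		if (demo_old_action.sa_sigaction != NULL)
			demo_old_action.sa_sigaction(signum, info, ctx);
	} else if (demo_old_action.sa_handler == SIG_IGN) {
		return;
	} else if (demo_old_action.sa_handler == SIG_DFL) {
		/* Restore the default action and re-raise; the signal is
		 * delivered with default semantics once we return. */
		sigaction(signum, &demo_old_action, NULL);
		raise(signum);
	} else {
		demo_old_action.sa_handler(signum);
	}
}

static void demo_sigbus_handler(int signum, siginfo_t *info, void *ctx)
{
	if (demo_bus_sigbus_handler(info->si_addr) == 0)
		return;		/* hot-unplug fault was handled */
	/* Assumption: both "not ours" and "error" fall through to the
	 * previous action rather than terminating the process here. */
	demo_chain_to_old(signum, info, ctx);
}

int demo_sigbus_handler_register(void)
{
	struct sigaction action;

	memset(&action, 0, sizeof(action));
	sigemptyset(&action.sa_mask);
	action.sa_flags = SA_SIGINFO;
	action.sa_sigaction = demo_sigbus_handler;
	return sigaction(SIGBUS, &action, &demo_old_action);
}

/* Example application handler that exists before registration. */
static volatile sig_atomic_t demo_app_handler_calls;

static void demo_app_handler(int signum)
{
	(void)signum;
	demo_app_handler_calls++;
}

/* Usage: the application's handler still runs for a fault we do not own. */
int demo_usage(void)
{
	struct sigaction app;
	int before = demo_app_handler_calls;

	memset(&app, 0, sizeof(app));
	sigemptyset(&app.sa_mask);
	app.sa_handler = demo_app_handler;
	if (sigaction(SIGBUS, &app, NULL) != 0)
		return -1;
	if (demo_sigbus_handler_register() != 0)
		return -1;
	raise(SIGBUS);
	assert(demo_app_handler_calls == before + 1);
	return 0;
}
```

Comparing against SIG_DFL and SIG_IGN explicitly matters here because a bare
non-NULL test on sa_handler cannot tell SIG_IGN apart from a real function
pointer.
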
> +}
> +
> +static int cmp_dev_name(const struct rte_device *dev,
> +	const void *_name)
> +{
> +	const char *name = _name;
> +
> +	return strcmp(dev->name, name);
> +}
> +
>   static int
>
>   int __rte_experimental
> @@ -220,5 +320,67 @@ rte_dev_event_monitor_stop(void)
>   	close(intr_handle.fd);
>   	intr_handle.fd = -1;
>   	monitor_started = false;
> +
>   	return 0;

This looks like an unintended change.

>   }
> +
> +int __rte_experimental
> +rte_dev_sigbus_handler_register(void)
> +{
> +	sigset_t mask;
> +	struct sigaction action;
> +

> --- a/lib/librte_eal/rte_eal_version.map
> +++ b/lib/librte_eal/rte_eal_version.map
> @@ -281,6 +281,8 @@ EXPERIMENTAL {
>   	rte_dev_event_callback_unregister;
>   	rte_dev_event_monitor_start;
>   	rte_dev_event_monitor_stop;
> +	rte_dev_hotplug_handle_enable;
> +	rte_dev_hotplug_handle_disable;

Nitpicking - disable should be listed above enable, as E follows D in the
alphabet :)

>   	rte_dev_iterator_init;
>   	rte_dev_iterator_next;
>   	rte_devargs_add;
>

-- 
Thanks,
Anatoly