From: Thomas Monjalon
To: dev@dpdk.org
Cc: Luca Boccassi, maxime.coquelin@redhat.com, tiwei.bie@intel.com,
 yongwang@vmware.com, 3chas3@gmail.com, bruce.richardson@intel.com,
 jianfeng.tan@intel.com, anatoly.burakov@intel.com, llouis@vmware.com,
 brussell@vyatta.att-mail.com, stephen@networkplumber.org,
 jingjing.wu@intel.com, anatoly.burakov@intel.com, qi.z.zhang@intel.com
Date: Thu, 11 Oct 2018 12:32:20 +0200
Message-ID: <3749287.SKhoszCCOA@xps>
In-Reply-To: <20180919125757.17938-3-bluca@debian.org>
References: <20180816135032.28283-1-bluca@debian.org>
 <20180919125757.17938-1-bluca@debian.org>
 <20180919125757.17938-3-bluca@debian.org>
Subject: Re: [dpdk-dev] [PATCH v2 3/3] eal/linux: handle uio read failure in interrupt handler

Looking for someone to review this patch please

19/09/2018 14:57, Luca Boccassi:
> If a device is unplugged while an interrupt is pending, the
> read call to the uio device to remove it from the poll wait list
> can fail resulting in it being continually polled forever. This
> change checks for the read failing and if so, unregisters the device
> as an interrupt source and causes the wait list to be rebuilt.
>
> This race has been reported and observed in production.
>
> Fixes: 0a45657a6794 ("pci: rework interrupt handling")
> Cc: stable@dpdk.org
>
> Signed-off-by: Brian Russell
> Signed-off-by: Luca Boccassi
> ---
>  lib/librte_eal/linuxapp/eal/eal_interrupts.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> index 4076c6d6ca..34584db883 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> @@ -627,7 +627,7 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
>  	bool call = false;
>  	int n, bytes_read;
>  	struct rte_intr_source *src;
> -	struct rte_intr_callback *cb;
> +	struct rte_intr_callback *cb, *next;
>  	union rte_intr_read_buffer buf;
>  	struct rte_intr_callback active_cb;
>  
> @@ -701,6 +701,23 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
>  					"descriptor %d: %s\n",
>  					events[n].data.fd,
>  					strerror(errno));
> +				/*
> +				 * The device is unplugged or buggy, remove
> +				 * it as an interrupt source and return to
> +				 * force the wait list to be rebuilt.
> +				 */
> +				rte_spinlock_lock(&intr_lock);
> +				TAILQ_REMOVE(&intr_sources, src, next);
> +				rte_spinlock_unlock(&intr_lock);
> +
> +				for (cb = TAILQ_FIRST(&src->callbacks); cb;
> +						cb = next) {
> +					next = TAILQ_NEXT(cb, next);
> +					TAILQ_REMOVE(&src->callbacks, cb, next);
> +					free(cb);
> +				}
> +				free(src);
> +				return -1;
>  			} else if (bytes_read == 0)
>  				RTE_LOG(ERR, EAL, "Read nothing from file "
>  					"descriptor %d\n", events[n].data.fd);
>
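
For reviewers who want to check the callback cleanup loop in isolation, here is a minimal standalone sketch of the same "save the successor, then unlink and free" pattern using plain <sys/queue.h>. It is only an illustration, not DPDK code: struct callback, callback_add(), callback_drain() and callback_demo() are hypothetical names made up for this example, and it leaves out the intr_lock locking and the epoll wait list rebuild that the patch relies on.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a registered callback; not the DPDK struct. */
struct callback {
	TAILQ_ENTRY(callback) next;
	int id;
};

TAILQ_HEAD(callback_list, callback);

/*
 * Free every element of the list. The successor is saved before the
 * current element is unlinked and freed, so the loop never reads
 * freed memory. Returns the number of elements released.
 */
static int
callback_drain(struct callback_list *list)
{
	struct callback *cb, *next;
	int freed = 0;

	for (cb = TAILQ_FIRST(list); cb != NULL; cb = next) {
		next = TAILQ_NEXT(cb, next);
		TAILQ_REMOVE(list, cb, next);
		free(cb);
		freed++;
	}
	return freed;
}

/* Append a new callback; returns 0 on success, -1 if malloc fails. */
static int
callback_add(struct callback_list *list, int id)
{
	struct callback *cb = malloc(sizeof(*cb));

	if (cb == NULL)
		return -1;
	cb->id = id;
	TAILQ_INSERT_TAIL(list, cb, next);
	return 0;
}

/* Build a three-element list, drain it, and return the freed count. */
static int
callback_demo(void)
{
	struct callback_list list;
	int i, freed;

	TAILQ_INIT(&list);
	for (i = 0; i < 3; i++) {
		if (callback_add(&list, i) != 0) {
			callback_drain(&list);
			return -1;
		}
	}
	freed = callback_drain(&list);
	assert(TAILQ_EMPTY(&list));
	return freed;
}
```

Advancing with TAILQ_NEXT() on an element that has already been freed would be a use-after-free, which is presumably why the patch adds the extra "next" pointer instead of iterating with TAILQ_FOREACH.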