From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by dpdk.org (Postfix) with ESMTP id DCBC491 for ; Mon, 12 Nov 2018 02:35:46 +0100 (CET) Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailout.nyi.internal (Postfix) with ESMTP id 5838D20EA9; Sun, 11 Nov 2018 20:35:45 -0500 (EST) Received: from mailfrontend1 ([10.202.2.162]) by compute1.internal (MEProxy); Sun, 11 Nov 2018 20:35:45 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=monjalon.net; h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding:content-type; s=mesmtp; bh=vvZ2F3/9p3Nwf5pRszFzIYHDnZEGiswav8B3H5LCD6o=; b=k5OtEVPUK+jP 649A0X5rGQxh+7OKjHMkyWHX1Bx+fjNn459/axzbV5uHrKc9AzfGmyqusBdmuzdv oS1r4WfbVnBzPU9SBKPrvwlMTm/Wr8+S+LDF3zLmui4f0MVvXtbJz+/7Ae9OFr/9 YmrmRl9Fv11Egcjd6NJ/siuRU5ZDU8s= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; bh=vvZ2F3/9p3Nwf5pRszFzIYHDnZEGiswav8B3H5LCD 6o=; b=AhLzv59aWRCX+t/B+X7all8hg872pi3qFjCf1txIPvqwi6kVMRrXrbvJx JpsIwA6InfcEoIu9TQjio11aA5Mglw6KMoPUiYcivSxV7RLOQcgr5m94PhuG0Zxr cccyv4xZLgRaTxxtOzBkD516j5LuqfPGedltubDRcyhdoIoG2eMgoFjqrlSawK+J QB3qJVWT1UxuuEQC4gWCh512LfGp2nwJVruQknfvI0aJAREkZctWS+HVTy+qoztl MYGD0Hevdom/lvwyEtxyPdzyJgtyjJi9d2HT76zonDy8ODCDqdJPxh60V35pFQkT FfMkTdwOddjzb5QOZEcgQZx026jSg== X-ME-Sender: X-ME-Proxy: Received: from xps.localnet (184.203.134.77.rev.sfr.net [77.134.203.184]) by mail.messagingengine.com (Postfix) with ESMTPA id 78AEFE476E; Sun, 11 Nov 2018 20:35:43 -0500 (EST) From: Thomas Monjalon To: Jeff Guo Cc: dev@dpdk.org, Matan Azrad , "Ananyev, Konstantin" , "Burakov, Anatoly" , "Iremonger, Bernard" , "Wu, Jingjing" , "Lu, Wenzhuo" , "Yigit, Ferruh" , "Zhang, Helin" , "He, Shaopeng" Date: Mon, 12 Nov 2018 02:35:42 +0100 Message-ID: <4481513.QvlXUHQEVQ@xps> In-Reply-To: References: <1541484436-91320-1-git-send-email-jia.guo@intel.com> <8bac8c10-d30b-ab93-1d6c-03e7d93b97c3@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" Subject: Re: [dpdk-dev] [PATCH 3/3] app/testpmd: fix callback issue for hot-unplug X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Nov 2018 01:35:47 -0000 11/11/2018 08:31, Matan Azrad: > From: Jeff Guo > > On 11/9/2018 1:24 PM, Matan Azrad wrote: > > > From: Jeff Guo > > >> On 11/8/2018 5:35 PM, Matan Azrad wrote: > > >>> From: Jeff Guo > > >>>> On 11/8/2018 3:28 PM, Matan Azrad wrote: > > >>>>> From: Ananyev, Konstantin > > >>>>>> From: Guo, Jia > > >>>>>>> On 11/6/2018 2:36 PM, Matan Azrad wrote: > > >>>>>>>> Hi Jeff > > >>>>>>>> > > >>>>>>>> From: Jeff Guo > > >>>>>>>>> Before detach device when device be hot-unplugged, the failure > > >>>>>>>>> process in user space and kernel space both need to be > > >>>>>>>>> finished, such as eal interrupt callback need to be inactive > > >>>>>>>>> before the callback be unregistered when device is being > > >>>>>>>>> cleaned. This patch add rte alarm for device detaching, with > > >>>>>>>>> that it could finish interrupt callback soon and give time to > > >>>>>>>>> let the failure process done > > >>>>>> before detaching. > > >>>>>>>>> Fixes: 2049c5113fe8 ("app/testpmd: use hotplug failure > > >>>>>>>>> handler") > > >>>>>>>>> Signed-off-by: Jeff Guo > > >>>>>>>>> --- > > >>>>>>>>> app/test-pmd/testpmd.c | 13 ++++++++++++- > > >>>>>>>>> 1 file changed, 12 insertions(+), 1 deletion(-) > > >>>>>>>>> > > >>>>>>>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c > > >>>>>>>>> index 9c0edca..9c673cf 100644 > > >>>>>>>>> --- a/app/test-pmd/testpmd.c > > >>>>>>>>> +++ b/app/test-pmd/testpmd.c > > >>>>>>>>> @@ -2620,7 +2620,18 @@ eth_dev_event_callback(const char > > >>>>>>>>> *device_name, enum rte_dev_event_type type, > > >>>>>>>>> device_name); > > >>>>>>>>> return; > > >>>>>>>>> } > > >>>>>>>>> - rmv_event_callback((void *)(intptr_t)port_id); > > >>>>>>>>> + /* > > >>>>>>>>> + * Before detach device, the hot-unplug failure > > >>>> process in > > >>>>>>>>> + * user space and kernel space both need to be > > >>>> finished, > > >>>>>>>>> + * such as eal interrupt callback need to be inactive > > >>>> before > > >>>>>>>>> + * the callback be unregistered when device is being > > >>>> cleaned. > > >>>>>>>>> + * So finished interrupt callback soon here and give > > >>>> time to > > >>>>>>>>> + * let the work done before detaching. > > >>>>>>>>> + */ > > >>>>>>>>> + if (rte_eal_alarm_set(100000, > > >>>>>>>>> + rmv_event_callback, (void > > >>>>>>>>> *)(intptr_t)port_id)) > > >>>>>>>>> + RTE_LOG(ERR, EAL, > > >>>>>>>>> + "Could not set up deferred device > > >>>>>>>> It looks me strange to use callback and alarm to remove a device: > > >>>>>>>> Why not to use callback and that is it? > > >>>>>>>> > > >>>>>>>> I think that it's better to let the EAL to detach the device > > >>>>>>>> after all the > > >>>>>> callbacks were done and not to do it by the user callback. > > >>>>>>>> So the application\callback owners just need to clean its > > >>>>>>>> resources with understanding that after the callback the > > >>>>>>>> device(and the callback > > >>>>>>> itself) will be detached by the EAL. > > >>>>>>> > > >>>>>>> > > >>>>>>> Firstly, at the currently framework and solution, such as > > >>>>>>> callback for RTE_ETH_EVENT_INTR_RMV, still need to use the > > >>>>>>> deferred device > > >>>>>> removal, > > >>>>>>> we tend to give the control of detaching device to the > > >>>>>>> application, and the whole process is located on the user's > > >>>>>>> callback. Notify app to detach device by callback but make it > > >>>>>>> deferred, > > >> i think it is fine. > > >>>>> But the device must be detached in remove event, why not to do it > > >>>>> in > > >> EAL? > > >>>> I think it because of before detached the device, application > > >>>> should be stop the forwarding at first, then stop the device, then > > >>>> close > > >>>> > > >>>> the device, finally call eal unplug API to detach device. If eal > > >>>> directly detach device at the first step, there will be mountain > > >>>> user space error need to handle, so that is one reason why need to > > >>>> provider the remove notification to app, and let app to process it. > > >>> This is why the EAL need to detach the device only after all the > > >>> user > > >> callbacks were done. > > >> > > >> > > >> If i correctly got your meaning, you suppose to let eal to mandatory > > >> detach device but not app, app just need to stop/close port, right? > > > Yes, the app should stop,close,clean its own resources of the removed > > > device, Then, EAL to detach the device. > > > > > >> If so, it will need to modify rmv_event_callback by not invoke the > > >> detaching func and add some detaching logic to hotplug framework in eal. > > >> > > > rmv_event_callback is using by other hotplug mechanism (ETHDEV RMV > > event), so you need to use another one\ add parameter to it. > > > And yes, you need to detach the device from EAL, should be simple. > > > > > > I think rmv_event_callback is original use for other hotplug event (ETHDEV > > RMV event), but it still use the common hotplug mechanism(app callback and > > app detach), > > > > so i think it will still need to face this callback issue and you could check that > > eth_event_callback also use the rte alarm to detach device. > > The ETHDEV layer cannot know if more than one ethdev port is associated to the rte_device, > So, it is not correct to detach a rte_device by the ETHDEV layer. Hence, the application should decide in the ETHDEV event case. Yes, we must not detach a device (rte_dev_remove) if we are not sure all ports are closed. > But, in the EAL event case this is different as I explained before. > > Moreover, I think that the ethdev RMV event should be deprecated one day, after the EAL hot-plug mechanism will be stable. Yes, we may replace the ethdev RMV event by the EAL event RTE_DEV_EVENT_REMOVE. > > so my suggestion is that, you maybe propose a good idea but let we keep on > > current mechanism until we come to a final good solution agreement, before > > that, just let it functional. > > I still suggest to fix the issue by an EAL detaching. I don't understand the issue, but yes we can call rte_dev_remove in EAL, after the callback return. Ideally, the callback should be able to return the decision of cleaning the device or not. I suggest several steps (a roadmap for HW unplug): - in 18.11, remove call to detach_port_device() from rmv_event_callback() - in 18.11, call rte_dev_remove() in EAL after RTE_DEV_EVENT_REMOVE callback - in 18.11, remove call to rmv_event_callback() in eth_dev_event_callback() - in 18.11, rename eth_dev_event_callback to dev_event_callback - in 19.02, convert all PMDs to free port resources on rte_eth_dev_close() - in 19.05, automatically call rte_dev_remove() from PMD when closing last port - in 19.08, we may try to unbind the kernel driver in rte_dev_remove()