From: Jeff Guo
To: Matan Azrad, "Ananyev, Konstantin", "Burakov, Anatoly", Thomas Monjalon, "Iremonger, Bernard", "Wu, Jingjing", "Lu, Wenzhuo"
Cc: "Yigit, Ferruh", "dev@dpdk.org", "Zhang, Helin", "He, Shaopeng"
Date: Fri, 9 Nov 2018 14:17:07 +0800
Message-ID: <8bac8c10-d30b-ab93-1d6c-03e7d93b97c3@intel.com>
References: <1541484436-91320-1-git-send-email-jia.guo@intel.com>
 <1541484436-91320-4-git-send-email-jia.guo@intel.com>
 <2601191342CEEE43887BDE71AB9772580103069B86@irsmsx105.ger.corp.intel.com>
 <11af735e-7e8a-fb16-3ea8-2b269d8437b1@intel.com>
Subject: Re: [dpdk-dev] [PATCH 3/3] app/testpmd: fix callback issue for hot-unplug
List-Id: DPDK patches and discussions

On 11/9/2018 1:24 PM, Matan Azrad wrote:
>
> From: Jeff Guo
>> On 11/8/2018 5:35 PM, Matan Azrad wrote:
>>> From: Jeff Guo
>>>> On 11/8/2018 3:28 PM, Matan Azrad wrote:
>>>>> From: Ananyev, Konstantin
>>>>>>> -----Original Message-----
>>>>>>> From: Guo, Jia
>>>>>>> Sent: Wednesday, November 7, 2018 7:30 AM
>>>>>>> To: Matan Azrad; Ananyev, Konstantin; Burakov, Anatoly;
>>>>>>> Thomas Monjalon; Iremonger, Bernard; Wu, Jingjing; Lu, Wenzhuo
>>>>>>> Cc: Yigit, Ferruh; dev@dpdk.org; Zhang, Helin; He, Shaopeng
>>>>>>> Subject: Re: [PATCH 3/3] app/testpmd: fix callback issue for
>>>>>>> hot-unplug
>>>>>>>
>>>>>>> matan
>>>>>>>
>>>>>>> On 11/6/2018 2:36 PM, Matan Azrad wrote:
>>>>>>>> Hi Jeff
>>>>>>>>
>>>>>>>> From: Jeff Guo
>>>>>>>>> Before detach device when device be hot-unplugged, the failure
>>>>>>>>> process in user space and kernel space both need to be finished,
>>>>>>>>> such as eal interrupt callback need to be inactive before the
>>>>>>>>> callback be unregistered when device is being cleaned. This
>>>>>>>>> patch add rte alarm for device detaching, with that it could
>>>>>>>>> finish interrupt callback soon and give time to let the failure
>>>>>>>>> process done before detaching.
>>>>>>>>>
>>>>>>>>> Fixes: 2049c5113fe8 ("app/testpmd: use hotplug failure handler")
>>>>>>>>> Signed-off-by: Jeff Guo
>>>>>>>>> ---
>>>>>>>>>  app/test-pmd/testpmd.c | 13 ++++++++++++-
>>>>>>>>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>>>>>>>>> index 9c0edca..9c673cf 100644
>>>>>>>>> --- a/app/test-pmd/testpmd.c
>>>>>>>>> +++ b/app/test-pmd/testpmd.c
>>>>>>>>> @@ -2620,7 +2620,18 @@ eth_dev_event_callback(const char *device_name, enum rte_dev_event_type type,
>>>>>>>>>                  device_name);
>>>>>>>>>          return;
>>>>>>>>>      }
>>>>>>>>> -    rmv_event_callback((void *)(intptr_t)port_id);
>>>>>>>>> +    /*
>>>>>>>>> +     * Before detach device, the hot-unplug failure process in
>>>>>>>>> +     * user space and kernel space both need to be finished,
>>>>>>>>> +     * such as eal interrupt callback need to be inactive before
>>>>>>>>> +     * the callback be unregistered when device is being cleaned.
>>>>>>>>> +     * So finished interrupt callback soon here and give time to
>>>>>>>>> +     * let the work done before detaching.
>>>>>>>>> +     */
>>>>>>>>> +    if (rte_eal_alarm_set(100000,
>>>>>>>>> +            rmv_event_callback, (void *)(intptr_t)port_id))
>>>>>>>>> +        RTE_LOG(ERR, EAL,
>>>>>>>>> +            "Could not set up deferred device
>>>>>>>> It looks me strange to use callback and alarm to remove a device:
>>>>>>>> Why not to use callback and that is it?
>>>>>>>>
>>>>>>>> I think that it's better to let the EAL to detach the device
>>>>>>>> after all the callbacks were done and not to do it by the user
>>>>>>>> callback.
>>>>>>>> So the application\callback owners just need to clean its
>>>>>>>> resources with understanding that after the callback the
>>>>>>>> device (and the callback itself) will be detached by the EAL.
>>>>>>>
>>>>>>>
>>>>>>> Firstly, at the currently framework and solution, such as callback
>>>>>>> for RTE_ETH_EVENT_INTR_RMV, still need to use the deferred device
>>>>>>> removal, we tend to give the control of detaching device to the
>>>>>>> application, and the whole process is located on the user's
>>>>>>> callback. Notify app to detach device by callback but make it
>>>>>>> deferred, i think it is fine.
>>>>> But the device must be detached in remove event, why not to do it in
>>>>> EAL?
>>>> I think it because of before detached the device, application should
>>>> be stop the forwarding at first, then stop the device, then close
>>>> the device, finally call eal unplug API to detach device. If eal
>>>> directly detach device at the first step, there will be mountain user
>>>> space error need to handle, so that is one reason why need to
>>>> provider the remove notification to app, and let app to process it.
>>> This is why the EAL need to detach the device only after all the user
>>> callbacks were done.
>>
>>
>> If i correctly got your meaning, you suppose to let eal to mandatory detach
>> device but not app, app just need to stop/close port, right?
> Yes, the app should stop,close,clean its own resources of the removed device,
> Then, EAL to detach the device.
>
>> If so, it will need to modify rmv_event_callback by not invoke the detaching
>> func and add some detaching logic to hotplug framework in eal.
>>
> rmv_event_callback is using by other hotplug mechanism (ETHDEV RMV event), so you need to use another one\ add parameter to it.
> And yes, you need to detach the device from EAL, should be simple.

I think rmv_event_callback was originally used for another hotplug event
(the ETHDEV RMV event), but it still uses the common hotplug mechanism
(app callback and app detach), so I think it would still face this
callback issue. You can check that eth_event_callback also uses an rte
alarm to detach the device.
So my suggestion is this: you may be proposing a good idea, but let's
keep the current mechanism until we agree on a final solution. Until
then, let's just make it functional.

>> It is hardly say better or worse but this new propose need to discuss more in
>> public. And it might be better to use another specific patch to handler it.
>> What do you or other guys think so?
> Since you are fixing issue here, it can be done by a fix series.
>
> Other feedbacks are welcome all the time 😊
>
>>
>>>>>> It is also unclear to me my we need an alarm here.
>>>>>> First (probably wrong) impression we just try to hide some
>>>>>> synchronization Problem by introducing delay.
>>>>> Looks like, the issue is that the callback function memory will be
>>>>> removed from the function itself (by the detach call), no?
>>>>
>>>>
>>>> Answer here for both Konstantin and Matan.
>>>>
>>>> Yes, i think matan is right, the interrupt callback will be destroy
>>>> in the app callback itself, the sequence is that as below
>>>>
>>>> hot-unplug interrupt -> interrupt callback -> app callback(return to
>>>> finish interrupt callback, delay detaching) -> detach
>>>> device(unregister interrupt callback)
>>>>
>>>>
>>>>>> Konstantin
>>>>>>
>>>>>>> Secondly, the vfio is different with igb_uio for hot-unplug, it
>>>>>>> register/unregister hotplug interrupt callback for each device, so
>>>>>>> need to make the callback done before unregister the callback.
>>>>>>>
>>>>>>> So I think it should be considerate as an workaround here, before
>>>>>>> we find a better way.
>>>>>>>
>>>>>>>
>>>>>>>>> removal\n");
>>>>>>>>>          break;
>>>>>>>>>      case RTE_DEV_EVENT_ADD:
>>>>>>>>>          RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
>>>>>>>>> --
>>>>>>>>> 2.7.4
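
For reference, here is a minimal sketch of the sequence discussed in this
thread (hot-unplug interrupt -> interrupt callback -> app callback returns
and defers the detach -> detach unregisters the interrupt callback). It is
a stand-alone model, not testpmd or EAL code: every name in it
(register_cb, cb_unregister, alarm_set, run_alarms, app_remove_cb, ...) is
hypothetical, the "alarm" is a single slot that is run by an explicit call
rather than by a timer, and the assumption that unregistering is refused
while the callback is still running is only my reading of the problem
described above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; none of these are DPDK APIs. */
typedef void (*cb_fn)(void *arg);

struct intr_cb {            /* models one registered interrupt callback */
	cb_fn fn;
	void *arg;
	int active;         /* non-zero while fn is executing */
};

struct deferred {           /* models one pending alarm */
	cb_fn fn;
	void *arg;
};

static struct intr_cb *g_cb;     /* the single registered callback */
static struct deferred g_alarm;  /* the single pending alarm slot */
static int g_detached;           /* set once the device is detached */

int register_cb(cb_fn fn, void *arg)
{
	g_cb = malloc(sizeof(*g_cb));
	if (g_cb == NULL)
		return -1;
	g_cb->fn = fn;
	g_cb->arg = arg;
	g_cb->active = 0;
	return 0;
}

/* Assumption: a callback that is still running cannot be unregistered. */
int cb_unregister(void)
{
	if (g_cb == NULL)
		return -1;
	if (g_cb->active)
		return -2;  /* busy: called from inside the callback */
	free(g_cb);
	g_cb = NULL;
	return 0;
}

/* Detach step: only succeeds once the callback has become inactive. */
void detach_device(void *arg)
{
	(void)arg;
	if (cb_unregister() == 0)
		g_detached = 1;
}

/* Queue fn to run later, outside the interrupt callback context. */
int alarm_set(cb_fn fn, void *arg)
{
	if (g_alarm.fn != NULL)
		return -1;  /* slot already taken */
	g_alarm.fn = fn;
	g_alarm.arg = arg;
	return 0;
}

/* App callback: return quickly and defer the detach. */
void app_remove_cb(void *arg)
{
	if (alarm_set(detach_device, arg) != 0)
		fprintf(stderr, "could not set up deferred removal\n");
}

/* Models the interrupt thread invoking the registered callback. */
void raise_interrupt(void)
{
	assert(g_cb != NULL);
	g_cb->active = 1;
	g_cb->fn(g_cb->arg);
	g_cb->active = 0;
}

/* Models alarm expiry: run the pending deferred work, if any. */
void run_alarms(void)
{
	if (g_alarm.fn != NULL) {
		cb_fn fn = g_alarm.fn;
		void *arg = g_alarm.arg;

		g_alarm.fn = NULL;
		fn(arg);
	}
}
```

Calling register_cb(app_remove_cb, NULL), then raise_interrupt(), then
run_alarms() walks through the sequence: after raise_interrupt() the
device is still attached, and only run_alarms() completes the detach. If
app_remove_cb called detach_device() directly, cb_unregister() would
return the busy code in this model and the device would stay attached.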