From: Jeff Guo <jia.guo@intel.com>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: dev@dpdk.org, Matan Azrad <matan@mellanox.com>,
"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
"Burakov, Anatoly" <anatoly.burakov@intel.com>,
"Iremonger, Bernard" <bernard.iremonger@intel.com>,
"Wu, Jingjing" <jingjing.wu@intel.com>,
"Lu, Wenzhuo" <wenzhuo.lu@intel.com>,
"Yigit, Ferruh" <ferruh.yigit@intel.com>,
"Zhang, Helin" <helin.zhang@intel.com>,
"He, Shaopeng" <shaopeng.he@intel.com>
Subject: Re: [dpdk-dev] [PATCH 3/3] app/testpmd: fix callback issue for hot-unplug
Date: Wed, 14 Nov 2018 17:32:11 +0800 [thread overview]
Message-ID: <0c5b7cbe-59e8-1e85-3205-f046d5a6dd0e@intel.com> (raw)
In-Reply-To: <4481513.QvlXUHQEVQ@xps>
On 11/12/2018 9:35 AM, Thomas Monjalon wrote:
> 11/11/2018 08:31, Matan Azrad:
>> From: Jeff Guo
>>> On 11/9/2018 1:24 PM, Matan Azrad wrote:
>>>> From: Jeff Guo
>>>>> On 11/8/2018 5:35 PM, Matan Azrad wrote:
>>>>>> From: Jeff Guo
>>>>>>> On 11/8/2018 3:28 PM, Matan Azrad wrote:
>>>>>>>> From: Ananyev, Konstantin
>>>>>>>>> From: Guo, Jia
>>>>>>>>>> On 11/6/2018 2:36 PM, Matan Azrad wrote:
>>>>>>>>>>> Hi Jeff
>>>>>>>>>>>
>>>>>>>>>>> From: Jeff Guo <jia.guo@intel.com>
>>>>>>>>>>>> Before detach device when device be hot-unplugged, the failure
>>>>>>>>>>>> process in user space and kernel space both need to be
>>>>>>>>>>>> finished, such as eal interrupt callback need to be inactive
>>>>>>>>>>>> before the callback be unregistered when device is being
>>>>>>>>>>>> cleaned. This patch add rte alarm for device detaching, with
>>>>>>>>>>>> that it could finish interrupt callback soon and give time to
>>>>>>>>>>>> let the failure process done
>>>>>>>>> before detaching.
>>>>>>>>>>>> Fixes: 2049c5113fe8 ("app/testpmd: use hotplug failure
>>>>>>>>>>>> handler")
>>>>>>>>>>>> Signed-off-by: Jeff Guo <jia.guo@intel.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>> app/test-pmd/testpmd.c | 13 ++++++++++++-
>>>>>>>>>>>> 1 file changed, 12 insertions(+), 1 deletion(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>>>>>>>>>>>> index 9c0edca..9c673cf 100644
>>>>>>>>>>>> --- a/app/test-pmd/testpmd.c
>>>>>>>>>>>> +++ b/app/test-pmd/testpmd.c
>>>>>>>>>>>> @@ -2620,7 +2620,18 @@ eth_dev_event_callback(const char
>>>>>>>>>>>> *device_name, enum rte_dev_event_type type,
>>>>>>>>>>>> device_name);
>>>>>>>>>>>> return;
>>>>>>>>>>>> }
>>>>>>>>>>>> - rmv_event_callback((void *)(intptr_t)port_id);
>>>>>>>>>>>> + /*
>>>>>>>>>>>> + * Before detach device, the hot-unplug failure
>>>>>>> process in
>>>>>>>>>>>> + * user space and kernel space both need to be
>>>>>>> finished,
>>>>>>>>>>>> + * such as eal interrupt callback need to be inactive
>>>>>>> before
>>>>>>>>>>>> + * the callback be unregistered when device is being
>>>>>>> cleaned.
>>>>>>>>>>>> + * So finished interrupt callback soon here and give
>>>>>>> time to
>>>>>>>>>>>> + * let the work done before detaching.
>>>>>>>>>>>> + */
>>>>>>>>>>>> + if (rte_eal_alarm_set(100000,
>>>>>>>>>>>> + rmv_event_callback, (void
>>>>>>>>>>>> *)(intptr_t)port_id))
>>>>>>>>>>>> + RTE_LOG(ERR, EAL,
>>>>>>>>>>>> + "Could not set up deferred device
>>>>>>>>>>> It looks me strange to use callback and alarm to remove a device:
>>>>>>>>>>> Why not to use callback and that is it?
>>>>>>>>>>>
>>>>>>>>>>> I think that it's better to let the EAL to detach the device
>>>>>>>>>>> after all the
>>>>>>>>> callbacks were done and not to do it by the user callback.
>>>>>>>>>>> So the application\callback owners just need to clean its
>>>>>>>>>>> resources with understanding that after the callback the
>>>>>>>>>>> device(and the callback
>>>>>>>>>> itself) will be detached by the EAL.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Firstly, at the currently framework and solution, such as
>>>>>>>>>> callback for RTE_ETH_EVENT_INTR_RMV, still need to use the
>>>>>>>>>> deferred device
>>>>>>>>> removal,
>>>>>>>>>> we tend to give the control of detaching device to the
>>>>>>>>>> application, and the whole process is located on the user's
>>>>>>>>>> callback. Notify app to detach device by callback but make it
>>>>>>>>>> deferred,
>>>>> i think it is fine.
>>>>>>>> But the device must be detached in remove event, why not to do it
>>>>>>>> in
>>>>> EAL?
>>>>>>> I think it because of before detached the device, application
>>>>>>> should be stop the forwarding at first, then stop the device, then
>>>>>>> close
>>>>>>>
>>>>>>> the device, finally call eal unplug API to detach device. If eal
>>>>>>> directly detach device at the first step, there will be mountain
>>>>>>> user space error need to handle, so that is one reason why need to
>>>>>>> provider the remove notification to app, and let app to process it.
>>>>>> This is why the EAL need to detach the device only after all the
>>>>>> user
>>>>> callbacks were done.
>>>>>
>>>>>
>>>>> If i correctly got your meaning, you suppose to let eal to mandatory
>>>>> detach device but not app, app just need to stop/close port, right?
>>>> Yes, the app should stop,close,clean its own resources of the removed
>>>> device, Then, EAL to detach the device.
>>>>
>>>>> If so, it will need to modify rmv_event_callback by not invoke the
>>>>> detaching func and add some detaching logic to hotplug framework in eal.
>>>>>
>>>> rmv_event_callback is using by other hotplug mechanism (ETHDEV RMV
>>> event), so you need to use another one\ add parameter to it.
>>>> And yes, you need to detach the device from EAL, should be simple.
>>>
>>> I think rmv_event_callback is original use for other hotplug event (ETHDEV
>>> RMV event), but it still use the common hotplug mechanism(app callback and
>>> app detach),
>>>
>>> so i think it will still need to face this callback issue and you could check that
>>> eth_event_callback also use the rte alarm to detach device.
>> The ETHDEV layer cannot know if more than one ethdev port is associated to the rte_device,
>> So, it is not correct to detach a rte_device by the ETHDEV layer. Hence, the application should decide in the ETHDEV event case.
> Yes, we must not detach a device (rte_dev_remove) if we are not sure all ports are closed.
>
>> But, in the EAL event case this is different as I explained before.
>>
>> Moreover, I think that the ethdev RMV event should be deprecated one day, after the EAL hot-plug mechanism will be stable.
> Yes, we may replace the ethdev RMV event by the EAL event RTE_DEV_EVENT_REMOVE.
>
>>> so my suggestion is that, you maybe propose a good idea but let we keep on
>>> current mechanism until we come to a final good solution agreement, before
>>> that, just let it functional.
>> I still suggest to fix the issue by an EAL detaching.
> I don't understand the issue, but yes we can call rte_dev_remove in EAL,
> after the callback return.
> Ideally, the callback should be able to return the decision of cleaning
> the device or not.
>
> I suggest several steps (a roadmap for HW unplug):
>
>
I mainly agree on that we need to modify currently hotplug framework for
HW hot-unplug. But i think we might be think about that.
1) simple delete detach_port_device from rmv_event_callback will affect
both currently ethdev event and eal event hot-unplug mechanism.
2) If we remove call to rmv_event_callback in eth_dev_event_callback() ,
we also need to remove call to rmv_event_callback in
eth_event_callback(), since it also use deferred device removal to
waiting the interrupt callback be unregistered in
mlx5_dev_interrupt_handler_uninstall.
so i think that basically the multiple port free and the hotplug
framework integrate should be an RFC at the begin of 19.02, but should
not affect currently mechanism. And i suggest the several steps might be
better as.
- in 18.11, rename eth_dev_event_callback to dev_event_callback
- in 18.11, rename rmv_event_callback() to rmv_port_callback(), keep current function calling when detach one port both for ethdev event and eal event. (we could document it for the limitation.)
- in 19.02, remove call to detach_port_device() from rmv_event_callback()
- in 19.02, call rte_dev_remove() in EAL after RTE_DEV_EVENT_REMOVE callback
- in 19.02, remove call to rmv_event_callback() in eth_dev_event_callback()
- in 19.02, convert all PMDs to free port resources on rte_eth_dev_close()
- in 19.05, automatically call rte_dev_remove() from PMD when closing last port
- in 19.08, we may try to unbind the kernel driver in rte_dev_remove()
next prev parent reply other threads:[~2018-11-14 9:32 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-06 6:07 [dpdk-dev] [PATCH 0/3] fix vfio hot-unplug issue Jeff Guo
2018-11-06 6:07 ` [dpdk-dev] [PATCH 1/3] eal: fix lock issue for hot-unplug Jeff Guo
2018-11-06 6:22 ` Matan Azrad
2018-11-07 5:49 ` Jeff Guo
2018-11-08 7:08 ` Matan Azrad
2018-11-06 6:07 ` [dpdk-dev] [PATCH 2/3] vfio: fix to add handler lock " Jeff Guo
2018-11-06 6:23 ` Matan Azrad
2018-11-07 6:15 ` Jeff Guo
2018-11-08 7:20 ` Matan Azrad
2018-11-08 8:30 ` Jeff Guo
2018-11-06 6:07 ` [dpdk-dev] [PATCH 3/3] app/testpmd: fix callback issue " Jeff Guo
2018-11-06 6:36 ` Matan Azrad
2018-11-07 7:30 ` Jeff Guo
2018-11-07 11:05 ` Ananyev, Konstantin
2018-11-08 7:28 ` Matan Azrad
2018-11-08 8:49 ` Jeff Guo
2018-11-08 9:35 ` Matan Azrad
2018-11-09 3:55 ` Jeff Guo
2018-11-09 5:24 ` Matan Azrad
2018-11-09 6:17 ` Jeff Guo
[not found] ` <AM0PR0502MB401938411A7E2BA9E76576A2D2C00@AM0PR0502MB4019.eurprd05.prod.outlook.com>
2018-11-12 1:35 ` Thomas Monjalon
2018-11-14 9:32 ` Jeff Guo [this message]
2018-11-15 9:18 ` [dpdk-dev] [PATCH V2 0/3] fix vfio hot-unplug issue Jeff Guo
2018-11-15 9:18 ` Jeff Guo
2018-11-18 16:19 ` Thomas Monjalon
2018-11-15 9:18 ` [dpdk-dev] [PATCH V2 1/3] eal: fix lock issue for hot-unplug Jeff Guo
2018-11-15 9:18 ` [dpdk-dev] [PATCH V2 2/3] vfio: fix to add handler lock " Jeff Guo
2018-11-15 9:18 ` [dpdk-dev] [PATCH V2 3/3] app/testpmd: fix callback issue " Jeff Guo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0c5b7cbe-59e8-1e85-3205-f046d5a6dd0e@intel.com \
--to=jia.guo@intel.com \
--cc=anatoly.burakov@intel.com \
--cc=bernard.iremonger@intel.com \
--cc=dev@dpdk.org \
--cc=ferruh.yigit@intel.com \
--cc=helin.zhang@intel.com \
--cc=jingjing.wu@intel.com \
--cc=konstantin.ananyev@intel.com \
--cc=matan@mellanox.com \
--cc=shaopeng.he@intel.com \
--cc=thomas@monjalon.net \
--cc=wenzhuo.lu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).