From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: "Hu, Jiayu" <jiayu.hu@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "Xia, Chenbo" <chenbo.xia@intel.com>,
"Wang, Yinan" <yinan.wang@intel.com>,
"Pai G, Sunil" <sunil.pai.g@intel.com>,
"Jiang, Cheng1" <cheng1.jiang@intel.com>
Subject: Re: [dpdk-dev] [PATCH v2 3/4] vhost: avoid deadlock on async register
Date: Fri, 16 Apr 2021 10:33:51 +0200 [thread overview]
Message-ID: <07eb8178-9523-6db3-1056-e96b48e65139@redhat.com> (raw)
In-Reply-To: <4fc5b419864e4b829623099cd31de9aa@intel.com>
On 4/16/21 10:18 AM, Hu, Jiayu wrote:
> Hi Maxime,
>
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Friday, April 16, 2021 2:35 PM
>> To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
>> Cc: Xia, Chenbo <chenbo.xia@intel.com>; Wang, Yinan
>> <yinan.wang@intel.com>; Pai G, Sunil <sunil.pai.g@intel.com>; Jiang, Cheng1
>> <cheng1.jiang@intel.com>
>> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register
>>
>> Hi Jiayu,
>>
>> On 4/16/21 4:19 AM, Hu, Jiayu wrote:
>>> Hi Maxime,
>>>>>> On 4/14/21 3:40 AM, Hu, Jiayu wrote:
>>>>>>> Hi Maxime,
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>>> Sent: Tuesday, April 13, 2021 7:33 PM
>>>>>>>> To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
>>>>>>>> Cc: Xia, Chenbo <chenbo.xia@intel.com>; Wang, Yinan
>>>>>>>> <yinan.wang@intel.com>; Pai G, Sunil <sunil.pai.g@intel.com>; Jiang,
>>>>>> Cheng1
>>>>>>>> <cheng1.jiang@intel.com>
>>>>>>>> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 4/2/21 3:04 PM, Jiayu Hu wrote:
>>>>>>>>> Users can register async copy device in vring_state_changed(),
>>>>>>>>> when vhost queue is enabled. However, a deadlock occurs inside
>>>>>>>>> rte_vhost_async_channel_register(), if
>>>>>>>> VHOST_USER_F_PROTOCOL_FEATURES
>>>>>>>>> is not supported, as vhost_user_msg_handler() takes vq-
>>> access_lock
>>>>>>>>> before calling vhost_user_set_vring_kick().
>>>>>>>>>
>>>>>>>>> This patch avoids async register deadlock by removing calling
>>>>>>>>> vring_state_changed() in vhost_user_set_vring_kick(). It's safe
>>>>>>>>> as vhost_user_msg_handler() will call vring_state_changed()
>> anyway.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
>>>>>>>>> ---
>>>>>>>>> lib/librte_vhost/vhost_user.c | 3 ---
>>>>>>>>> 1 file changed, 3 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/lib/librte_vhost/vhost_user.c
>>>> b/lib/librte_vhost/vhost_user.c
>>>>>>>>> index 44c0452..8f0eba6 100644
>>>>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>>>>> @@ -1918,9 +1918,6 @@ vhost_user_set_vring_kick(struct
>> virtio_net
>>>>>>>> **pdev, struct VhostUserMsg *msg,
>>>>>>>>> */
>>>>>>>>> if (!(dev->features & (1ULL <<
>>>>>>>> VHOST_USER_F_PROTOCOL_FEATURES))) {
>>>>>>>>> vq->enabled = true;
>>>>>>>>> - if (dev->notify_ops->vring_state_changed)
>>>>>>>>> - dev->notify_ops->vring_state_changed(
>>>>>>>>> - dev->vid, file.index, 1);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> if (vq->ready) {
>>>>>>>>>
>>>>>>>>
>>>>>>>> As replied earlier on v1, I agree the call to vring_state_changed here
>>>>>>>> is not needed. But it might not be enough, there are other cases
>> where
>>>>>>>> you could have issues.
>>>>>>>
>>>>>>> vhost_user_notify_queue_state() can be called in three cases:
>>>>>>> 1. when vq ready status changes, vhost_user_msg_handler() calls it to
>>>>>> notify
>>>>>>> backend. But vhost_user_msg_handler() doesn't take lock before
>> calling
>>>> it.
>>>>>>> So in this case, no deadlock occurs in async register.
>>>>>>>
>>>>>>> 2. if vq->ready is true, vhost_user_set_vring_call() calls it to notify
>>>> backend
>>>>>>> vq is not enabled. Although vhost_user_set_vring_call() is protected by
>>>> lock,
>>>>>>> async register is called only if vq is enabled, so async register will not
>> be
>>>>>> called
>>>>>>> in this case.
>>>>>>>
>>>>>>> 3. If vq->ready is true, vhost_user_set_vring_kick() calls it to notify
>>>> backend
>>>>>>> vq is not enabled. Same as #2, async register is called only when vq is
>>>>>> enabled.
>>>>>>> Even if vhost_user_set_vring_kick() is protected by lock, there is no
>>>>>> deadlock in
>>>>>>> async register, as it will not be called in this case.
>>>>>>>
>>>>>>> In summary, I think there is no deadlock issue in async register if we
>>>>>>> can remove calling vring_state_change() in
>> vhost_user_set_vring_kick().
>>>>>>
>>>>>>
>>>>>> But unregister one could be called in theory no? Otherwise it would
>> look
>>>>>> unbalanced. At least on disabled notification, the app should make sure
>>>>>> the DMA transfers to and from the vring are stopped before it returns
>>>>>> from the callabck. Otherwise it could lead to undefined behavior.
>>>>>
>>>>> Right, users need to call unregister, but we cannot remove calling
>>>>> vhost_user_notify_queue_state() in case #2 and #3, IMHO. So to
>>>>> avoid deadlock, we recommended users to call async unregister in
>>>>> destroy_device(), instead of on vring disabled notification. Does it
>>>>> make sense to you?
>>>>
>>>> Calling async unregister in destroy device is fine by me. But I'm more
>>>> concerned about DMA transations not being stopped when the ring
>> becomes
>>>> disabled.
>>> If ring becomes disabled, no more new pkts go into async data path, as
>>> virtio_dev_rx_async_submit() returns if vq->enabled is false.
>>>
>>>>
>>>> I cannot say if you are doing it right, because the vhost example does
>>>> not implement the vring_state_changed callback.
>>>> It is not a problem with the sync datapath as we have the lock
>>>> protection + enabled variable that prevents to process the rings when it
>>>> gets stopped.
>>>> But for the async path, if you have programmed DMA transfers, you need
>>>> to rely on the vring_state_change to block the control path while the
>>>> transfers are cancelled or done.
>>>
>>> I am not sure if I understand your concern correctly, but for async data
>> path,
>>> enable variable can prevent it from enqueue new pkts when ring is
>> disabled, as
>>> virtio_dev_rx_async_submit() check enable variable before processing ring;
>>> in addition, lock protection can stop async data path as
>> virtio_dev_rx_async_submit()
>>> acquires lock too. For example, in the case that front-end updates kickfd,
>>> set_vring_kick() will notify backend that vring is stopped by calling
>>> vhost_user_notify_queue_state(). In this case, sync data path will stop as a
>> result of
>>> lock protection, and I think it's the same for async data path, as it acquires
>> lock before
>>> processing ring.
>>
>>
>> What about memory hotplug happening while the DMA transfers are on-
>> going
>> for example?
>>
>> In this case, the lock is enough for sync datapath, but is not for async
>> one.
>>
>> If a VHOST_USER_SET_MEM_TABLE request is received while
>> virtio_dev_rx_async_submit(), it will be blocked until
>> virtio_dev_rx_async_submit() returns but the DMA transfers may not be
>> finished yet.
>> When unblocked the control thread will call vhost_user_set_mem_table(),
>> which will mnumap the current memory regions before mmaping the new
>> ones while DMA transfers are on-going.
>>
>> To fix this, we need to call the vring_state_changed callback for all
>> the virtqueues with disable state. in your app, you need to stop DMA
>> transfers when disabled state notification happens, or block the
>> callback while the transfer is done.
>
> Let me summarize the problem:
> when frontend memory is hot-plug, host application needs to stop
> DMA transfer for all virtqueues, which is done by emptying in-flight
> DMA copies and unregister async inside vring_state_changed().
I don't think unregistering async channels is mandatory.
On disable notification, the callback has to block until all on-going
DMA transfers are done. New transfers won't be schedules since
virtio_dev_rx_async_submit() will be blocked by the lock while the
memory regions update is being done.
> But
> vhost library takes lock before entering vring_state_changed(), and
> async unregister and rte_vhost_poll_enqueue_completed() both
> acquire lock, so deadlock occurs inside rte_vhost_poll_enqueue_
> completed() or async unregister. (please correct me if am wrong)
>
> There may be two possible solutions:
> 1. remove lock before calling vring_state_change() in vhost library;
I don't think so. If you release the lock in vring_state_change(), it
will enable the app to enqueue/dequeue descriptors in the middle of
handling the request.
> 2. provide a new callback without acquiring lock for DMA accelerated
> case. The callback is used to notify backend that you need to stop
> DMA transfer.
No, in your app just wait for DMA transfers to be finished before
returning from the callback.
Please implement vring_state_changed callback in Vhost example, it is
now mandatory with async datapath, and it will help to have an example
on what real applications should do.
Thanks,
Maxime
> Thanks,
> Jiayu
>
next prev parent reply other threads:[~2021-04-16 8:34 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-17 12:56 [dpdk-dev] [PATCH 0/4] Refactor async vhost control path Jiayu Hu
2021-03-17 12:56 ` [dpdk-dev] [PATCH 1/4] vhost: fix uninitialized vhost queue Jiayu Hu
2021-03-26 15:14 ` Maxime Coquelin
2021-03-17 12:56 ` [dpdk-dev] [PATCH 2/4] vhost: remove unnecessary free Jiayu Hu
2021-03-29 15:03 ` Maxime Coquelin
2021-03-17 12:56 ` [dpdk-dev] [PATCH 3/4] vhost: avoid deadlock on async register Jiayu Hu
2021-03-29 15:19 ` Maxime Coquelin
2021-03-30 1:20 ` Hu, Jiayu
2021-04-13 9:37 ` Maxime Coquelin
2021-03-17 12:56 ` [dpdk-dev] [PATCH 4/4] doc: update async vhost register/unregister Jiayu Hu
2021-04-02 13:03 ` [dpdk-dev] [PATCH v2 0/4] Refactor async vhost control path Jiayu Hu
2021-04-02 8:06 ` Wang, Yinan
2021-04-02 13:03 ` [dpdk-dev] [PATCH v2 1/4] vhost: fix uninitialized vhost queue Jiayu Hu
2021-04-13 11:30 ` Maxime Coquelin
2021-04-02 13:04 ` [dpdk-dev] [PATCH v2 2/4] vhost: remove unnecessary free Jiayu Hu
2021-04-02 13:04 ` [dpdk-dev] [PATCH v2 3/4] vhost: avoid deadlock on async register Jiayu Hu
2021-04-13 11:33 ` Maxime Coquelin
2021-04-14 1:40 ` Hu, Jiayu
2021-04-14 10:08 ` Maxime Coquelin
2021-04-15 1:08 ` Hu, Jiayu
2021-04-15 7:09 ` Maxime Coquelin
2021-04-16 2:19 ` Hu, Jiayu
2021-04-16 6:35 ` Maxime Coquelin
2021-04-16 8:18 ` Hu, Jiayu
2021-04-16 8:33 ` Maxime Coquelin [this message]
2021-04-19 4:10 ` Hu, Jiayu
2021-04-19 7:13 ` Maxime Coquelin
2021-04-19 9:02 ` Hu, Jiayu
2021-04-02 13:04 ` [dpdk-dev] [PATCH v2 4/4] doc: update async vhost register/unregister Jiayu Hu
2021-04-20 8:57 ` [dpdk-dev] [PATCH v3 0/4] Refactor async vhost control path Jiayu Hu
2021-04-20 8:57 ` [dpdk-dev] [PATCH v3 1/4] vhost: fix uninitialized vhost queue Jiayu Hu
2021-04-20 8:57 ` [dpdk-dev] [PATCH v3 2/4] vhost: remove unnecessary free Jiayu Hu
2021-04-20 8:57 ` [dpdk-dev] [PATCH v3 3/4] vhost: fix unnecessary vring_state_changed call Jiayu Hu
2021-04-20 9:39 ` Maxime Coquelin
2021-04-21 1:36 ` Xia, Chenbo
2021-04-20 8:57 ` [dpdk-dev] [PATCH v3 4/4] doc: update async vhost register/unregister Jiayu Hu
2021-04-20 9:31 ` Maxime Coquelin
2021-04-28 2:05 ` [dpdk-dev] [PATCH v3 0/4] Refactor async vhost control path Xia, Chenbo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=07eb8178-9523-6db3-1056-e96b48e65139@redhat.com \
--to=maxime.coquelin@redhat.com \
--cc=chenbo.xia@intel.com \
--cc=cheng1.jiang@intel.com \
--cc=dev@dpdk.org \
--cc=jiayu.hu@intel.com \
--cc=sunil.pai.g@intel.com \
--cc=yinan.wang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).