From mboxrd@z Thu Jan 1 00:00:00 1970
To: "Hu, Jiayu" , "dev@dpdk.org"
Cc: "Xia, Chenbo" , "Wang, Yinan" , "Pai G, Sunil" , "Jiang, Cheng1"
References: <1615985773-406787-1-git-send-email-jiayu.hu@intel.com>
 <1617368642-131298-1-git-send-email-jiayu.hu@intel.com>
 <1617368642-131298-4-git-send-email-jiayu.hu@intel.com>
 <0b974e3cd6534947ba6568d93d9431ab@intel.com>
 <9eb47884f1254cc98b9f1c60dfe9714c@intel.com>
 <4fc5b419864e4b829623099cd31de9aa@intel.com>
From: Maxime Coquelin
Message-ID: <07eb8178-9523-6db3-1056-e96b48e65139@redhat.com>
Date: Fri, 16 Apr 2021 10:33:51 +0200
MIME-Version: 1.0
In-Reply-To: <4fc5b419864e4b829623099cd31de9aa@intel.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Subject: Re: [dpdk-dev] [PATCH v2 3/4] vhost: avoid deadlock on async register
List-Id: DPDK patches and discussions
Sender: "dev"

On 4/16/21 10:18 AM, Hu, Jiayu wrote:
> Hi Maxime,
>
>> -----Original Message-----
>> From: Maxime Coquelin
>> Sent: Friday, April 16, 2021 2:35 PM
>> To: Hu, Jiayu ; dev@dpdk.org
>> Cc: Xia, Chenbo ; Wang, Yinan ; Pai G, Sunil ; Jiang, Cheng1
>> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register
>>
>> Hi Jiayu,
>>
>> On 4/16/21 4:19 AM, Hu, Jiayu wrote:
>>> Hi Maxime,
>>>>>> On 4/14/21 3:40 AM, Hu, Jiayu wrote:
>>>>>>> Hi Maxime,
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Maxime Coquelin
>>>>>>>> Sent: Tuesday, April 13, 2021 7:33 PM
>>>>>>>> To: Hu, Jiayu ; dev@dpdk.org
>>>>>>>> Cc: Xia, Chenbo ; Wang, Yinan ; Pai G, Sunil ; Jiang, Cheng1
>>>>>>>> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register
>>>>>>>>
>>>>>>>> On 4/2/21 3:04 PM,
Jiayu Hu wrote:
>>>>>>>>> Users can register an async copy device in vring_state_changed(),
>>>>>>>>> when the vhost queue is enabled. However, a deadlock occurs inside
>>>>>>>>> rte_vhost_async_channel_register(), if VHOST_USER_F_PROTOCOL_FEATURES
>>>>>>>>> is not supported, as vhost_user_msg_handler() takes vq->access_lock
>>>>>>>>> before calling vhost_user_set_vring_kick().
>>>>>>>>>
>>>>>>>>> This patch avoids the async register deadlock by removing the call
>>>>>>>>> to vring_state_changed() in vhost_user_set_vring_kick(). This is
>>>>>>>>> safe, as vhost_user_msg_handler() will call vring_state_changed()
>>>>>>>>> anyway.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Jiayu Hu
>>>>>>>>> ---
>>>>>>>>>  lib/librte_vhost/vhost_user.c | 3 ---
>>>>>>>>>  1 file changed, 3 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>>>>>>>> index 44c0452..8f0eba6 100644
>>>>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>>>>> @@ -1918,9 +1918,6 @@ vhost_user_set_vring_kick(struct virtio_net **pdev, struct VhostUserMsg *msg,
>>>>>>>>>  	 */
>>>>>>>>>  	if (!(dev->features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES))) {
>>>>>>>>>  		vq->enabled = true;
>>>>>>>>> -		if (dev->notify_ops->vring_state_changed)
>>>>>>>>> -			dev->notify_ops->vring_state_changed(
>>>>>>>>> -					dev->vid, file.index, 1);
>>>>>>>>>  	}
>>>>>>>>>
>>>>>>>>>  	if (vq->ready) {
>>>>>>>>>
>>>>>>>>
>>>>>>>> As replied earlier on v1, I agree the call to vring_state_changed()
>>>>>>>> here is not needed. But it might not be enough; there are other
>>>>>>>> cases where you could have issues.
>>>>>>>
>>>>>>> vhost_user_notify_queue_state() can be called in three cases:
>>>>>>> 1. when the vq ready status changes, vhost_user_msg_handler() calls
>>>>>>> it to notify the backend. But vhost_user_msg_handler() doesn't take
>>>>>>> the lock before calling it, so in this case, no deadlock occurs in
>>>>>>> async register.
>>>>>>>
>>>>>>> 2. if vq->ready is true, vhost_user_set_vring_call() calls it to
>>>>>>> notify the backend that the vq is not enabled. Although
>>>>>>> vhost_user_set_vring_call() is protected by the lock, async register
>>>>>>> is called only if the vq is enabled, so async register will not be
>>>>>>> called in this case.
>>>>>>>
>>>>>>> 3. if vq->ready is true, vhost_user_set_vring_kick() calls it to
>>>>>>> notify the backend that the vq is not enabled. Same as #2, async
>>>>>>> register is called only when the vq is enabled. Even though
>>>>>>> vhost_user_set_vring_kick() is protected by the lock, there is no
>>>>>>> deadlock in async register, as it will not be called in this case.
>>>>>>>
>>>>>>> In summary, I think there is no deadlock issue in async register if
>>>>>>> we remove the call to vring_state_changed() in
>>>>>>> vhost_user_set_vring_kick().
>>>>>>
>>>>>> But the unregister one could be called in theory, no? Otherwise it
>>>>>> would look unbalanced. At least on a disabled notification, the app
>>>>>> should make sure the DMA transfers to and from the vring are stopped
>>>>>> before it returns from the callback. Otherwise it could lead to
>>>>>> undefined behavior.
>>>>>
>>>>> Right, users need to call unregister, but we cannot remove the call to
>>>>> vhost_user_notify_queue_state() in cases #2 and #3, IMHO. So to avoid
>>>>> deadlock, we recommend users to call async unregister in
>>>>> destroy_device(), instead of on the vring disabled notification. Does
>>>>> it make sense to you?
>>>>
>>>> Calling async unregister in destroy_device() is fine by me. But I'm
>>>> more concerned about DMA transactions not being stopped when the ring
>>>> becomes disabled.
>>>
>>> If the ring becomes disabled, no more new pkts go into the async data
>>> path, as virtio_dev_rx_async_submit() returns if vq->enabled is false.
>>>
>>>> I cannot say if you are doing it right, because the vhost example does
>>>> not implement the vring_state_changed callback.
>>>> It is not a problem with the sync datapath, as we have the lock
>>>> protection + the enabled variable that prevent processing the rings
>>>> when they get stopped.
>>>> But for the async path, if you have programmed DMA transfers, you need
>>>> to rely on vring_state_changed to block the control path while the
>>>> transfers are cancelled or done.
>>>
>>> I am not sure if I understand your concern correctly, but for the async
>>> data path, the enabled variable can prevent it from enqueueing new pkts
>>> when the ring is disabled, as virtio_dev_rx_async_submit() checks the
>>> enabled variable before processing the ring; in addition, the lock
>>> protection can stop the async data path, as
>>> virtio_dev_rx_async_submit() acquires the lock too. For example, in the
>>> case that the front-end updates the kickfd, set_vring_kick() will
>>> notify the backend that the vring is stopped by calling
>>> vhost_user_notify_queue_state(). In this case, the sync data path will
>>> stop as a result of the lock protection, and I think it's the same for
>>> the async data path, as it acquires the lock before processing the
>>> ring.
>>
>> What about memory hotplug happening while the DMA transfers are
>> on-going, for example?
>>
>> In this case, the lock is enough for the sync datapath, but it is not
>> for the async one.
>>
>> If a VHOST_USER_SET_MEM_TABLE request is received while
>> virtio_dev_rx_async_submit() is running, it will be blocked until
>> virtio_dev_rx_async_submit() returns, but the DMA transfers may not be
>> finished yet.
>> When unblocked, the control thread will call
>> vhost_user_set_mem_table(), which will munmap the current memory
>> regions before mmaping the new ones while DMA transfers are on-going.
>>
>> To fix this, we need to call the vring_state_changed callback for all
>> the virtqueues with disabled state. In your app, you need to stop DMA
>> transfers when the disabled state notification happens, or block the
>> callback while the transfers are done.
>
> Let me summarize the problem:
> when the frontend memory is hot-plugged, the host application needs to
> stop DMA transfers for all virtqueues, which is done by emptying the
> in-flight DMA copies and unregistering async inside
> vring_state_changed().

I don't think unregistering async channels is mandatory. On the disable
notification, the callback has to block until all on-going DMA transfers
are done. New transfers won't be scheduled since
virtio_dev_rx_async_submit() will be blocked by the lock while the
memory regions update is being done.

> But the vhost library takes the lock before entering
> vring_state_changed(), and async unregister and
> rte_vhost_poll_enqueue_completed() both acquire the lock, so a deadlock
> occurs inside rte_vhost_poll_enqueue_completed() or async unregister.
> (please correct me if I am wrong)
>
> There may be two possible solutions:
> 1. remove the lock before calling vring_state_changed() in the vhost
> library;

I don't think so. If you release the lock in vring_state_changed(), it
will enable the app to enqueue/dequeue descriptors in the middle of
handling the request.

> 2. provide a new callback without acquiring the lock for the DMA
> accelerated case. The callback is used to notify the backend that it
> needs to stop DMA transfers.

No, in your app just wait for DMA transfers to be finished before
returning from the callback.

Please implement the vring_state_changed callback in the Vhost example;
it is now mandatory with the async datapath, and it will help to have an
example of what real applications should do.

Thanks,
Maxime

> Thanks,
> Jiayu
>