To: "Hu, Jiayu", "dev@dpdk.org"
Cc: "Xia, Chenbo", "Wang, Yinan", "Jiang, Cheng1", "Pai G, Sunil"
References: <1615985773-406787-1-git-send-email-jiayu.hu@intel.com>
 <1615985773-406787-4-git-send-email-jiayu.hu@intel.com>
 <8cd04e21-8acd-7fe6-c0c3-dc162c137c4a@redhat.com>
 <75b23d7a233b46a9bcf56d3dde9c7bc7@intel.com>
From: Maxime Coquelin
Message-ID: <96029d21-de44-ffe5-c1b3-c6224fa9d41e@redhat.com>
Date: Tue, 13 Apr 2021 11:37:58 +0200
In-Reply-To: <75b23d7a233b46a9bcf56d3dde9c7bc7@intel.com>
Subject: Re: [dpdk-dev] [PATCH 3/4] vhost: avoid deadlock on async register
List-Id: DPDK patches and discussions

On 3/30/21 3:20 AM, Hu, Jiayu wrote:
> Hi Maxime,
>
>> -----Original Message-----
>> From: Maxime Coquelin
>> Sent: Monday, March 29, 2021 11:19 PM
>> To: Hu, Jiayu; dev@dpdk.org
>> Cc: Xia, Chenbo; Wang, Yinan; Jiang, Cheng1; Pai G, Sunil
>> Subject: Re: [PATCH 3/4] vhost: avoid deadlock on async register
>>
>>
>>
>> On 3/17/21 1:56 PM, Jiayu Hu wrote:
>>> Users register async copy device when vhost queue is enabled.
>>> However, if VHOST_USER_F_PROTOCOL_FEATURES is not supported,
>>> a deadlock occurs inside rte_vhost_async_channel_register(),
>>> as vhost_user_msg_handler() already takes vq->access_lock
>>> before processing VHOST_USER_SET_VRING_KICK message.
>>>
>>> This patch removes calling vring_state_changed() in
>>> vhost_user_set_vring_kick() to avoid deadlock on async register.
>>>
>>> Signed-off-by: Jiayu Hu
>>> ---
>>>  lib/librte_vhost/vhost_user.c | 3 ---
>>>  1 file changed, 3 deletions(-)
>>>
>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>> index 399675c..a319c1c 100644
>>> --- a/lib/librte_vhost/vhost_user.c
>>> +++ b/lib/librte_vhost/vhost_user.c
>>> @@ -1919,9 +1919,6 @@ vhost_user_set_vring_kick(struct virtio_net **pdev, struct VhostUserMsg *msg,
>>>           */
>>>          if (!(dev->features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES))) {
>>>                  vq->enabled = 1;
>>> -                if (dev->notify_ops->vring_state_changed)
>>> -                        dev->notify_ops->vring_state_changed(
>>> -                                dev->vid, file.index, 1);
>>
>> That looks very wrong, as:
>> 1. The apps want to receive this notification. It looks like breaking
>> existing apps in order to support the experimental async datapath. E.g.
>> OVS needs it to start polling the queues when protocol features is not
>> negotiated.
>
> IMHO, if protocol feature is not negotiated, vring_state_changed will also
> be called in vhost_user_msg_handler. In the case you mentioned,
> vq->enabled is set to true in set_vring_kick, and in vhost_user_msg_handler,
> "cur_ready != (vq && vq->ready)" is true, as vq->ready is false when init. So
> vhost_user_msg_handler will call vhost_user_notify_queue_state, which
> calls vring_state_changed inside.

OK, I agree, we can drop this one. But it is not enough, as
vhost_user_notify_queue_state() is called in several places with the
lock taken.

> In addition, calling vring_state_changed in set_vring_kick is protected by lock,
> but it's not in vhost_user_msg_handler. It looks confusing to me. Is there
> any special reason for this design?

I think we need the lock held every time the callback is called, to
avoid the case where an application calls a Vhost API that would modify
the vq struct. We could get undefined behavior if it happened.

>
>>
>> 2. The fix in your case seems to indicate that your app's
>> vring_state_changed callback called rte_vhost_async_channel_register.
>> And your fix consists in no longer calling the callback, and so no longer
>> calling rte_vhost_async_channel_register?
>
> rte_vhost_async_channel_register is recommended to be called in
> vring_state_changed, and vring_state_changed will be called
> by vhost_user_msg_handler.

You might want to schedule a thread to call channel registration. Maybe
using rte_eal_alarm_set()?

Regards,
Maxime

>
> Thanks,
> Jiayu
>>
>>>          }
>>>
>>>          if (vq->ready) {
>>>
>
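
For illustration only, here is a minimal sketch of the deferral idea discussed above. It does not use the vhost library: every demo_* name is hypothetical, and the atomic flag merely stands in for a non-recursive lock such as vq->access_lock. The point it tries to show is that a callback invoked with the lock held cannot take that lock again, whereas a callback that only records the request lets a later step (a thread, or an alarm callback scheduled with rte_eal_alarm_set()) do the registration once the handler has released the lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a virtqueue; NOT the real struct vhost_virtqueue. */
struct demo_vq {
    atomic_flag access_lock;    /* plays the role of vq->access_lock */
    bool async_registered;
    bool registration_pending;
};

void demo_vq_init(struct demo_vq *vq)
{
    atomic_flag_clear(&vq->access_lock);
    vq->async_registered = false;
    vq->registration_pending = false;
}

/* Returns true if the lock was free and is now held by the caller. */
bool demo_trylock(struct demo_vq *vq)
{
    return !atomic_flag_test_and_set(&vq->access_lock);
}

void demo_unlock(struct demo_vq *vq)
{
    atomic_flag_clear(&vq->access_lock);
}

/*
 * Stand-in for a registration API that takes the lock itself.
 * It uses trylock so that the sketch reports -EBUSY instead of
 * spinning forever, which is what a blocking lock would do here.
 */
int demo_async_register(struct demo_vq *vq)
{
    if (!demo_trylock(vq))
        return -EBUSY;
    vq->async_registered = true;
    demo_unlock(vq);
    return 0;
}

/* Callback variant 1: registers directly from the callback. */
int demo_cb_direct(struct demo_vq *vq)
{
    return demo_async_register(vq);
}

/* Callback variant 2: only records that registration is wanted. */
int demo_cb_deferred(struct demo_vq *vq)
{
    vq->registration_pending = true;
    return 0;
}

/* Stand-in for a message handler that invokes the callback with the lock held. */
int demo_msg_handler(struct demo_vq *vq, int (*cb)(struct demo_vq *))
{
    int rc;

    while (!demo_trylock(vq))
        ;   /* spin until the lock is free (uncontended in this sketch) */
    rc = cb(vq);
    demo_unlock(vq);
    return rc;
}

/* Deferred step: meant to run later, after the handler has dropped the lock. */
int demo_run_deferred(struct demo_vq *vq)
{
    int rc = 0;

    if (vq->registration_pending) {
        rc = demo_async_register(vq);
        if (rc == 0)
            vq->registration_pending = false;
    }
    return rc;
}
```

In this sketch, passing demo_cb_direct to demo_msg_handler() fails with -EBUSY, while passing demo_cb_deferred and then calling demo_run_deferred() after the handler returns completes the registration. How the deferred step is scheduled in a real application, and how it is synchronized with queue teardown, is left open here.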