From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 by dpdk.org (Postfix) with ESMTP id A7A061B90C;
 Fri, 8 Feb 2019 15:12:42 +0100 (CET)
Received: from smtp.corp.redhat.com
 (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id 961E0811C0;
 Fri, 8 Feb 2019 14:12:41 +0000 (UTC)
Received: from [10.36.112.52] (ovpn-112-52.ams2.redhat.com [10.36.112.52])
 by smtp.corp.redhat.com (Postfix) with ESMTPS id 4444C5DE6E;
 Fri, 8 Feb 2019 14:12:38 +0000 (UTC)
To: sunwenjie , dev@dpdk.org
Cc: stable@dpdk.org
References: <20190128065549.98266-1-findtheonlyway@gmail.com>
From: Maxime Coquelin
Message-ID: <341edf62-2e01-dcce-377c-11471b02c125@redhat.com>
Date: Fri, 8 Feb 2019 15:12:35 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.4.0
MIME-Version: 1.0
In-Reply-To: <20190128065549.98266-1-findtheonlyway@gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.5.110.27]); Fri, 08 Feb 2019 14:12:41 +0000 (UTC)
Subject: Re: [dpdk-dev] [PATCH] vhost: fix deadlock when vhost unregister
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
X-List-Received-Date: Fri, 08 Feb 2019 14:12:43 -0000

On 1/28/19 7:55 AM, sunwenjie wrote:
> When rte_vhost_driver_unregister delete the connection fd,
> fdset_try_del will always try and donot release the
> vhostuser.mutex if the fd is busy, but the fdset_event_dispatch
> will set the fd to busy and call vhost_user_msg_handler to get
> vhostuser.mutex, which will cause deadlock.
> Unlock the vhost_user.mutexif fdset_try_del fail and relock it when retry.

What about this wording:

In rte_vhost_driver_unregister(), the connection fd is removed from the
fdset using fdset_try_del(). A call to this function may fail if the
corresponding fd is in the busy state, indicating that the event
dispatcher is executing the read or write callback on this fd. When
that happens, rte_vhost_driver_unregister() keeps trying to remove the
fd from the set until it is no longer busy.

This situation causes a deadlock, because rte_vhost_driver_unregister()
keeps trying to remove the fd from the set with vhost_user.mutex held,
while the callback executed by the dispatcher, vhost_user_read_cb(),
also takes this mutex in numerous places.

The fix consists of releasing vhost_user.mutex between each retry in
rte_vhost_driver_unregister().

>
> Fixes: 8b4b949144b8 ("vhost: fix dead lock on closing in server mode")
> Cc: stable@dpdk.org
>
> Signed-off-by: sunwenjie

We need your real name for legal reasons:
Signed-off-by: Surname Lastname

No need to resubmit, I can handle the commit message fixup, and the fix
looks good to me:

Reviewed-by: Maxime Coquelin

As soon as I get your name in the above format, I will apply the patch
in the Virtio tree.

Thanks for submitting the fix.
Maxime

> ---
>  lib/librte_vhost/socket.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
> index 9cf34ad17..9883b0491 100644
> --- a/lib/librte_vhost/socket.c
> +++ b/lib/librte_vhost/socket.c
> @@ -961,13 +961,13 @@ rte_vhost_driver_unregister(const char *path)
>  	int count;
>  	struct vhost_user_connection *conn, *next;
>
> +again:
>  	pthread_mutex_lock(&vhost_user.mutex);
>
>  	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
>  		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
>
>  		if (!strcmp(vsocket->path, path)) {
> -again:
>  			pthread_mutex_lock(&vsocket->conn_mutex);
>  			for (conn = TAILQ_FIRST(&vsocket->conn_list);
>  			     conn != NULL;
> @@ -983,6 +983,7 @@ rte_vhost_driver_unregister(const char *path)
>  						  conn->connfd) == -1) {
>  					pthread_mutex_unlock(
>  							&vsocket->conn_mutex);
> +					pthread_mutex_unlock(&vhost_user.mutex);
>  					goto again;
>  				}
>
>
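
For illustration only, the retry pattern from the diff above can be
reproduced outside DPDK with plain pthreads. This is a minimal sketch,
not the vhost code: global_mutex stands in for vhost_user.mutex,
try_del() for fdset_try_del(), and dispatcher() for the event dispatcher
running its callback; all of these names, the sleep intervals, and
run_demo() are assumptions made up for the example. The short sleep
between retries is also an addition of the sketch (the patch retries
immediately); it only gives the other thread a chance to take the lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins: global_mutex plays the role of vhost_user.mutex,
 * fd_busy the role of the per-fd busy flag kept by the fdset. */
static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t busy_lock = PTHREAD_MUTEX_INITIALIZER;
static bool fd_busy;

static void sleep_ms(long ms)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = ms * 1000000L };
	nanosleep(&ts, NULL);
}

/* Mimics fdset_try_del(): fails with -1 while a callback is running. */
static int try_del(void)
{
	int ret;

	pthread_mutex_lock(&busy_lock);
	ret = fd_busy ? -1 : 0;
	pthread_mutex_unlock(&busy_lock);
	return ret;
}

/* Mimics the dispatcher: the fd is already busy, and the callback needs
 * the global mutex before it can finish and clear the busy flag. */
static void *dispatcher(void *arg)
{
	(void)arg;
	sleep_ms(10); /* let the unregister path grab the mutex first */
	pthread_mutex_lock(&global_mutex);
	pthread_mutex_unlock(&global_mutex);
	pthread_mutex_lock(&busy_lock);
	fd_busy = false;
	pthread_mutex_unlock(&busy_lock);
	return NULL;
}

/* The fixed pattern: drop the global mutex before every retry.
 * Returns the number of retries that were needed. */
static int unregister_with_retry(void)
{
	int retries = 0;

again:
	pthread_mutex_lock(&global_mutex);
	if (try_del() == -1) {
		/* Without this unlock the dispatcher could never take
		 * global_mutex, so fd_busy would never be cleared. */
		pthread_mutex_unlock(&global_mutex);
		retries++;
		sleep_ms(1);
		goto again;
	}
	pthread_mutex_unlock(&global_mutex);
	return retries;
}

/* Runs both sides once; returns the retry count, or -1 on setup failure. */
int run_demo(void)
{
	pthread_t tid;
	int retries, rc;

	fd_busy = true;
	if (pthread_create(&tid, NULL, dispatcher, NULL) != 0)
		return -1;
	retries = unregister_with_retry();
	rc = pthread_join(tid, NULL);
	assert(rc == 0);
	(void)rc;
	return retries;
}
```

Calling run_demo() from a main() and linking with the pthread library is
enough to try it. Moving the pthread_mutex_unlock(&global_mutex) out of
the retry branch (so the lock stays held across "goto again", with the
label placed after the lock) reproduces the hang described in the commit
message.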