From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id 3C3B623C; Fri, 4 May 2018 15:00:40 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9C8094276E87; Fri, 4 May 2018 13:00:39 +0000 (UTC) Received: from [10.36.112.52] (ovpn-112-52.ams2.redhat.com [10.36.112.52]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 7BC5E83B85; Fri, 4 May 2018 13:00:38 +0000 (UTC) To: xiangxia.m.yue@gmail.com, jianfeng.tan@intel.com, yliu@fridaylinux.org Cc: dev@dpdk.org, stable@dpdk.org References: <1524842385-61707-1-git-send-email-xiangxia.m.yue@gmail.com> From: Maxime Coquelin Message-ID: <7336c453-0d29-9353-8f10-3761c6a6fb5b@redhat.com> Date: Fri, 4 May 2018 15:00:37 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <1524842385-61707-1-git-send-email-xiangxia.m.yue@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.79 on 10.11.54.5 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Fri, 04 May 2018 13:00:39 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Fri, 04 May 2018 13:00:39 +0000 (UTC) for IP:'10.11.54.5' DOMAIN:'int-mx05.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: Re: [dpdk-dev] [PATCH 1/3] vhost: fix deadlock due to vhostuser socket and fdset X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 May 2018 13:00:40 -0000 On 04/27/2018 05:19 PM, xiangxia.m.yue@gmail.com wrote: > From: Tonghao Zhang > > When qemu close the unix socket fd of the vhostuser as a > server, and then immediately delete the vhostuser port on > openvswitch. There will be a deadlock. > > A thread (fdset event thread): B thread: > 1. fdset_event_dispatch rte_vhost_driver_unregister > 2. set the fd busy to 1. lock vsocket->conn_mutex > 3. vhost_user_read_cb fdset_del waits busy changed to 0. > 4. vhost peer closed, remove the > conn from vsocket->conn_list: > lock vsocket->conn_mutex > > 5. set the fd busy to 0 > > Fixes: 65388b43 ("vhost: fix fd leaks for vhost-user server mode") > Cc: stable@dpdk.org > Cc: Yuanhan Liu > Signed-off-by: Tonghao Zhang > --- > lib/librte_vhost/fd_man.c | 32 ++++++++++++++++++++++++++++++++ > lib/librte_vhost/fd_man.h | 1 + > lib/librte_vhost/socket.c | 13 ++++++++++++- > 3 files changed, 45 insertions(+), 1 deletion(-) > > diff --git a/lib/librte_vhost/fd_man.c b/lib/librte_vhost/fd_man.c > index 8590ee5..b24c27d 100644 > --- a/lib/librte_vhost/fd_man.c > +++ b/lib/librte_vhost/fd_man.c > @@ -174,6 +174,38 @@ > return dat; > } > > +/** > + * Unregister the fd from the fdset. > + * > + * If parameters are invalid, return directly -2. > + * And check whether fd is busy, if yes, return -1. > + * Otherwise, try to delete the fd from fdset and > + * return true. > + */ > +int > +fdset_try_del(struct fdset *pfdset, int fd) > +{ > + int i; > + > + if (pfdset == NULL || fd == -1) > + return -2; > + > + pthread_mutex_lock(&pfdset->fd_mutex); > + i = fdset_find_fd(pfdset, fd); > + if (i != -1 && pfdset->fd[i].busy) { > + pthread_mutex_unlock(&pfdset->fd_mutex); > + return -1; > + } > + > + if (i != -1) { > + pfdset->fd[i].fd = -1; > + pfdset->fd[i].rcb = pfdset->fd[i].wcb = NULL; > + pfdset->fd[i].dat = NULL; > + } > + > + pthread_mutex_unlock(&pfdset->fd_mutex); > + return 0; > +} > > /** > * This functions runs in infinite blocking loop until there is no fd in > diff --git a/lib/librte_vhost/fd_man.h b/lib/librte_vhost/fd_man.h > index 76a42fb..3331bcd 100644 > --- a/lib/librte_vhost/fd_man.h > +++ b/lib/librte_vhost/fd_man.h > @@ -44,6 +44,7 @@ int fdset_add(struct fdset *pfdset, int fd, > fd_cb rcb, fd_cb wcb, void *dat); > > void *fdset_del(struct fdset *pfdset, int fd); > +int fdset_try_del(struct fdset *pfdset, int fd); > > void *fdset_event_dispatch(void *arg); > > diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c > index 4a561ad..822db41 100644 > --- a/lib/librte_vhost/socket.c > +++ b/lib/librte_vhost/socket.c > @@ -922,13 +922,24 @@ struct vhost_user_reconnect_list { > vhost_user_remove_reconnect(vsocket); > } > > +again: > pthread_mutex_lock(&vsocket->conn_mutex); > for (conn = TAILQ_FIRST(&vsocket->conn_list); > conn != NULL; > conn = next) { > next = TAILQ_NEXT(conn, next); > > - fdset_del(&vhost_user.fdset, conn->connfd); > + /* > + * If r/wcb is executing, release the > + * conn_mutex lock, and try again since > + * the r/wcb may use the conn_mutex lock. > + */ > + if (fdset_try_del(&vhost_user.fdset, > + conn->connfd) == -1) { > + pthread_mutex_unlock(&vsocket->conn_mutex); > + goto again; > + } > + > RTE_LOG(INFO, VHOST_CONFIG, > "free connfd = %d for device '%s'\n", > conn->connfd, path); > It looks a bit fragile at first, but I don't have a better alternative in mind, so: Acked-by: Maxime Coquelin Thanks, Maxime