* [dpdk-dev] OVS+DPDK: deadlock and race condition, which leads to OVS deadlock or crash
@ 2018-01-23 10:26 王志克
0 siblings, 0 replies; only message in thread
From: 王志克 @ 2018-01-23 10:26 UTC (permalink / raw)
To: Yuanhan Liu; +Cc: dev, dev
Hi Yuanhan and all,
Thanks for the review. I have changed the title and am looping in more people.
The issues can be reproduced easily with OVS.
The steps are below:
1) Start the VM via virsh start.
2) Add the port via ovs-vsctl add-port.
3) Delete the port via ovs-vsctl del-port.
4) Shut down the VM via virsh destroy.
I run a script that repeatedly performs these steps for 2 VMs, and it takes about 1 hour to reproduce the issue.
However, I found more issues (which I can also reproduce):
1) See the code below:
static void
vhost_user_read_cb(int connfd, void *dat, int *remove)
{
	struct vhost_user_connection *conn = dat;
	struct vhost_user_socket *vsocket = conn->vsocket;
	int ret;

	ret = vhost_user_msg_handler(conn->vid, connfd);
	if (ret < 0) {
		close(connfd);
		*remove = 1;
		vhost_destroy_device(conn->vid);

		pthread_mutex_lock(&vsocket->conn_mutex);
		TAILQ_REMOVE(&vsocket->conn_list, conn, next);
		pthread_mutex_unlock(&vsocket->conn_mutex);

		free(conn);

		/* [Zhike Wang] At this point the vsocket is in neither conn_list
		 * nor reconn_list. If rte_vhost_driver_unregister() runs in this
		 * short window, the vsocket is freed, so vhost_user_create_client()
		 * below dereferences a dangling pointer and can crash. */
		if (vsocket->reconnect)
			vhost_user_create_client(vsocket);
	}
}
2) Regarding the above issue, I think we need one mutex to protect both conn_list and reconn_list, so that a vsocket is always in exactly one of conn_list or reconn_list, never both and never neither. So I used vhost_user.mutex:
diff --git a/dpdk-16.11.3/lib/librte_vhost/fd_man.c b/dpdk-16.11.3/lib/librte_vhost/fd_man.c
@@ -233,6 +234,7 @@
poll(pfdset->rwfds, numfds, 1000 /* millisecs */);
need_shrink = 0;
+ pthread_mutex_lock(&vhost_user.mutex);
for (i = 0; i < numfds; i++) {
pthread_mutex_lock(&pfdset->fd_mutex);
@@ -264,7 +266,10 @@
rcb(fd, dat, &remove1);
if (wcb && pfd->revents & (POLLOUT | FDPOLLERR))
wcb(fd, dat, &remove2);
+
+ pthread_mutex_lock(&pfdset->fd_mutex);
pfdentry->busy = 0;
+ pthread_mutex_unlock(&pfdset->fd_mutex);
/*
* fdset_del needs to check busy flag.
* We don't allow fdset_del to be called in callback
@@ -285,5 +290,6 @@
if (need_shrink)
fdset_shrink(pfdset);
+ pthread_mutex_unlock(&vhost_user.mutex);
@@ -390,6 +395,8 @@ struct vhost_user_reconnect_list {
struct vhost_user_reconnect *reconn, *next;
while (1) {
+
+ pthread_mutex_lock(&vhost_user.mutex);
pthread_mutex_lock(&reconn_list.mutex);
/*
@@ -417,11 +424,18 @@ struct vhost_user_reconnect_list {
"%s: connected\n", reconn->vsocket->path);
vhost_user_add_connection(reconn->fd, reconn->vsocket);
remove_fd:
pthread_mutex_unlock(&reconn_list.mutex);
+ pthread_mutex_unlock(&vhost_user.mutex);
sleep(1);
}
Then I hit a deadlock (another thread enters rte_vhost_driver_unregister() and waits for vhost_user.mutex):
(gdb) bt
#0 0x00007febfaf8ebcd in poll () from /lib64/libc.so.6
#1 0x00007fec029b6836 in poll (__timeout=<optimized out>, __nfds=2, __fds=0x7febe80008c0) at /usr/include/bits/poll2.h:46
#2 time_poll (pollfds=pollfds@entry=0x7febe80008c0, n_pollfds=2, handles=handles@entry=0x0, timeout_when=4098990819, elapsed=elapsed@entry=0x7febf7450290) at lib/timeval.c:305
#3 0x00007fec0299afea in poll_block () at lib/poll-loop.c:364
#4 0x00007fec0297da25 in ovsrcu_synchronize () at lib/ovs-rcu.c:231
#5 0x00007fec029e9477 in destroy_device (vid=<optimized out>) at lib/netdev-dpdk.c:2651
#6 0x00007fec0195db04 in vhost_destroy_device (vid=<optimized out>) at /lib/librte_vhost/vhost.c:277
#7 0x00007fec0195cbdc in vhost_user_read_cb (connfd=<optimized out>, dat=0x7febc80008c0, remove=0x7febf7451420) at /lib/librte_vhost/socket.c:269
#8 0x00007fec0195c200 in fdset_event_dispatch (pfdset=pfdset@entry=0x7fec01b70180 <vhost_user+8192>) at /lib/librte_vhost/fd_man.c:266
#9 0x00007fec0195d647 in rte_vhost_driver_session_start () at /lib/librte_vhost/socket.c:714
#10 0x00007fec029e661b in start_vhost_loop (dummy=<optimized out>) at lib/netdev-dpdk.c:2736
#11 0x00007fec0297f796 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:348
#12 0x00007febfb9b6dc5 in start_thread () from /lib64/libpthread.so.0
#13 0x00007febfaf9921d in clone () from /lib64/libc.so.6
(gdb) n
So now I believe it is hard to fix these issues inside the module itself, so I would like to present the identified issues and hear your proposals or fixes.
Br,
Zhike Wang
-----Original Message-----
From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
Sent: Thursday, January 18, 2018 10:04 PM
To: 王志克
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v3] lib/librte_vhost: move fdset_del out of conn_mutex
Hi,
Apologize for late review.
On Tue, Jan 02, 2018 at 02:08:36AM -0800, zhike wang wrote:
> From: wang zhike <wangzhike@jd.com>
>
> v3:
> * Fix duplicate variable name, which leads to unexpected memory write.
> v2:
> * Move fdset_del before conn destroy.
> * Fix coding style.
Note that we prefer to put the change logs after "---" below Signed-off-by,
so that those change logs won't be tracked in the git log history.
> This patch fixes below race condition:
> 1. one thread calls: rte_vhost_driver_unregister->lock conn_mutex
> ->fdset_del->loop to check fd.busy.
> 2. another thread calls fdset_event_dispatch, and the busy flag is
> changed AFTER handling on the fd, i.e, rcb(). However, the rcb,
> such as vhost_user_read_cb() would try to retrieve the conn_mutex.
>
> So issue is that the 1st thread will loop check the flag while holding
> the mutex, while the 2nd thread would be blocked by mutex and can not
> change the flag. Then dead lock is observed.
I then would change the title to "vhost: fix deadlock".
I'm also keen to know how you reproduce this issue with a real-life
app (say OVS) and how easy it is to reproduce.
> Signed-off-by: zhike wang <wangzhike@jd.com>
Again, you need fix your git config file about your name.
> ---
> lib/librte_vhost/socket.c | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
> index 422da00..ea01327 100644
> --- a/lib/librte_vhost/socket.c
> +++ b/lib/librte_vhost/socket.c
> @@ -749,6 +749,9 @@ struct vhost_user_reconnect_list {
> struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
>
> if (!strcmp(vsocket->path, path)) {
> + int del_fds[MAX_FDS];
> + int num_of_fds = 0, fd_index;
> +
I think the naming could be a bit shorter, like "fds, nr_fds (or nb_fds),
fd_idx".
> if (vsocket->is_server) {
> fdset_del(&vhost_user.fdset, vsocket->socket_fd);
> close(vsocket->socket_fd);
> @@ -757,13 +760,26 @@ struct vhost_user_reconnect_list {
> vhost_user_remove_reconnect(vsocket);
> }
>
> + /* fdset_del() must be called without conn_mutex. */
> + pthread_mutex_lock(&vsocket->conn_mutex);
> + for (conn = TAILQ_FIRST(&vsocket->conn_list);
> + conn != NULL;
> + conn = next) {
> + next = TAILQ_NEXT(conn, next);
> +
> + del_fds[num_of_fds++] = conn->connfd;
> + }
> + pthread_mutex_unlock(&vsocket->conn_mutex);
> +
> + for (fd_index = 0; fd_index < num_of_fds; fd_index++)
> + fdset_del(&vhost_user.fdset, del_fds[fd_index]);
> +
> pthread_mutex_lock(&vsocket->conn_mutex);
> for (conn = TAILQ_FIRST(&vsocket->conn_list);
> conn != NULL;
> conn = next) {
> next = TAILQ_NEXT(conn, next);
>
> - fdset_del(&vhost_user.fdset, conn->connfd);
If you log the fd here and invoke fdset_del() and close() after the loop,
you then could avoid one extra loop as you did above.
--yliu
> RTE_LOG(INFO, VHOST_CONFIG,
> "free connfd = %d for device '%s'\n",
> conn->connfd, path);
> --
> 1.8.3.1