From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ilya Maximets
To: dev@dpdk.org, Huawei Xie, Yuanhan Liu
Cc: Dyasly Sergey, Heetae Ahn, Thomas Monjalon, Ilya Maximets
Date: Thu, 21 Jul 2016 14:12:58 +0300
Message-id: <1469099578-31631-1-git-send-email-i.maximets@samsung.com>
X-Mailer: git-send-email 2.7.4
In-reply-to: <1469089275-15209-1-git-send-email-i.maximets@samsung.com>
References: <1469089275-15209-1-git-send-email-i.maximets@samsung.com>
Subject: [dpdk-dev] [PATCH v2] vhost: fix connect hang in client mode
List-Id: patches and discussions about DPDK

If something abnormal happens to QEMU, 'connect()' can block the calling
thread (e.g. the main thread of OVS) forever or for a really long time.
This can break the whole application or block the reconnection thread.

Example with OVS:

  ovs_rcu(urcu2)|WARN|blocked 512000 ms waiting for main to quiesce
  (gdb) bt
  #0  connect () from /lib64/libpthread.so.0
  #1  vhost_user_create_client (vsocket=0xa816e0)
  #2  rte_vhost_driver_register
  #3  netdev_dpdk_vhost_user_construct
  #4  netdev_open (name=0xa664b0 "vhost1")
  [...]
  #11 main

Fix that by putting client sockets into non-blocking mode while
connecting.

Fixes: 64ab701c3d1e ("vhost: add vhost-user client mode")

Signed-off-by: Ilya Maximets
---

This was reproduced with the current QEMU master branch
(commit 1ecfb24da987b862f) plus the patch set "vhost-user reconnect fixes"
(https://lists.nongnu.org/archive/html/qemu-devel/2016-07/msg01547.html).
OVS was patched to support client mode:
http://openvswitch.org/pipermail/dev/2016-July/074972.html

The following script forces QEMU to fail to initialize vhost, because the
disconnection occurs while the device is not fully configured:

  while true
  do
      ovs-vsctl set Interface vhost1 ofport_request=125
      ovs-vsctl set Interface vhost1 ofport_request=126
  done

As a result, QEMU keeps running, the network interface is broken, and the
OVS main thread is stalled inside 'connect()'.

Version 2:
	* EINPROGRESS is not checked. EISCONN is checked instead on the
	  next iteration of the reconnection loop.
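
For reviewers who want to try the idea outside of DPDK, here is a minimal
standalone sketch of the approach: the socket is made non-blocking, each
connection attempt returns immediately, and EISCONN on a later attempt is
treated as "already connected". The helper names make_nonblock_unix_socket()
and try_connect_once() are hypothetical and are not part of this patch. The
sketch also assumes Linux AF_UNIX stream sockets, and unlike the patch it does
not switch the socket back to blocking mode afterwards.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create an AF_UNIX stream socket in non-blocking mode. */
static int
make_nonblock_unix_socket(void)
{
	int fd, flags;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* Preserve the existing file status flags and add O_NONBLOCK. */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: make one connection attempt that never blocks.
 * Returns 0 when the socket is connected (EISCONN from an earlier attempt
 * counts as success) and -1 with errno set when the caller should retry.
 */
static int
try_connect_once(int fd, const char *path)
{
	struct sockaddr_un un;
	int so_error = 0;
	socklen_t len = sizeof(so_error);

	memset(&un, 0, sizeof(un));
	un.sun_family = AF_UNIX;
	strncpy(un.sun_path, path, sizeof(un.sun_path) - 1);

	if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0 &&
	    errno != EISCONN)
		return -1;

	/* Double-check that no error is pending on the socket. */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) < 0)
		return -1;
	if (so_error != 0) {
		errno = so_error;
		return -1;
	}
	return 0;
}
```

A caller would invoke try_connect_once() from its retry loop on the same fd
until it returns 0, which is roughly what the reconnection thread does with
the helper added by this patch.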
 lib/librte_vhost/vhost_user/vhost-net-user.c | 62 ++++++++++++++++++++++++++--
 1 file changed, 58 insertions(+), 4 deletions(-)

diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index 8c6a096..63e0840 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -43,6 +43,7 @@
 #include
 #include
 #include
+#include
 #include
 #include

@@ -449,6 +450,14 @@ create_unix_socket(const char *path, struct sockaddr_un *un, bool is_server)
 	RTE_LOG(INFO, VHOST_CONFIG, "vhost-user %s: socket created, fd: %d\n",
 		is_server ? "server" : "client", fd);

+	if (!is_server && fcntl(fd, F_SETFL, O_NONBLOCK)) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"vhost-user: can't set nonblocking mode for socket, fd: "
+			"%d (%s)\n", fd, strerror(errno));
+		close(fd);
+		return -1;
+	}
+
 	memset(un, 0, sizeof(*un));
 	un->sun_family = AF_UNIX;
 	strncpy(un->sun_path, path, sizeof(un->sun_path));
@@ -516,9 +525,43 @@ struct vhost_user_reconnect_list {
 static struct vhost_user_reconnect_list reconn_list;
 static pthread_t reconn_tid;

+static int
+vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz)
+{
+	int ret, flags, so_error;
+	socklen_t len = sizeof(so_error);
+
+	errno = EINVAL;
+
+	ret = connect(fd, un, sz);
+	if (ret < 0 && errno != EISCONN)
+		return -1;
+
+	ret = getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len);
+	if (ret < 0 || so_error) {
+		if (!ret)
+			errno = so_error;
+		return -1;
+	}
+
+	flags = fcntl(fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"can't get flags for connfd %d\n", fd);
+		return -2;
+	}
+	if ((flags & O_NONBLOCK) && fcntl(fd, F_SETFL, flags & ~O_NONBLOCK)) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"can't disable nonblocking on fd %d\n", fd);
+		return -2;
+	}
+	return 0;
+}
+
 static void *
 vhost_user_client_reconnect(void *arg __rte_unused)
 {
+	int ret;
 	struct vhost_user_reconnect *reconn, *next;

 	while (1) {
@@ -532,13 +575,23 @@ vhost_user_client_reconnect(void *arg __rte_unused)
 		     reconn != NULL; reconn = next) {
 			next = TAILQ_NEXT(reconn, next);

-			if (connect(reconn->fd, (struct sockaddr *)&reconn->un,
-				    sizeof(reconn->un)) < 0)
+			ret = vhost_user_connect_nonblock(reconn->fd,
+						(struct sockaddr *)&reconn->un,
+						sizeof(reconn->un));
+			if (ret == -2) {
+				close(reconn->fd);
+				RTE_LOG(ERR, VHOST_CONFIG,
+					"reconnection for fd %d failed\n",
+					reconn->fd);
+				goto remove_fd;
+			}
+			if (ret == -1)
 				continue;

 			RTE_LOG(INFO, VHOST_CONFIG,
 				"%s: connected\n", reconn->vsocket->path);
 			vhost_user_add_connection(reconn->fd, reconn->vsocket);
+remove_fd:
 			TAILQ_REMOVE(&reconn_list.head, reconn, next);
 			free(reconn);
 		}
@@ -579,7 +632,8 @@ vhost_user_create_client(struct vhost_user_socket *vsocket)
 	if (fd < 0)
 		return -1;

-	ret = connect(fd, (struct sockaddr *)&un, sizeof(un));
+	ret = vhost_user_connect_nonblock(fd, (struct sockaddr *)&un,
+					  sizeof(un));
 	if (ret == 0) {
 		vhost_user_add_connection(fd, vsocket);
 		return 0;
@@ -589,7 +643,7 @@ vhost_user_create_client(struct vhost_user_socket *vsocket)
 		"failed to connect to %s: %s\n",
 		path, strerror(errno));

-	if (!vsocket->reconnect) {
+	if (ret == -2 || !vsocket->reconnect) {
 		close(fd);
 		return -1;
 	}
-- 
2.7.4