From: Ilya Maximets
To: Yuanhan Liu
Cc: dev@dpdk.org, Huawei Xie, Dyasly Sergey, Heetae Ahn, Thomas Monjalon
Date: Thu, 21 Jul 2016 14:14:59 +0300
Subject: Re: [dpdk-dev] [PATCH] vhost: fix connect hang in client mode
Message-id: <5790AEB3.2010708@samsung.com>
In-reply-to: <5790A5D4.1090703@samsung.com>
References: <1469089275-15209-1-git-send-email-i.maximets@samsung.com> <20160721093714.GD28708@yliu-dev.sh.intel.com> <579099BC.9050603@samsung.com> <20160721101311.GE28708@yliu-dev.sh.intel.com> <5790A5D4.1090703@samsung.com>

On 21.07.2016 13:37, Ilya Maximets wrote:
>
>
> On 21.07.2016 13:13, Yuanhan Liu wrote:
>> On Thu, Jul 21, 2016 at 12:45:32PM +0300, Ilya Maximets wrote:
>>> On 21.07.2016 12:37, Yuanhan Liu wrote:
>>>> On Thu, Jul 21, 2016 at 11:21:15AM +0300, Ilya Maximets wrote:
>>>>> If something abnormal happened to QEMU, 'connect()' can block calling
>>>>> thread (e.g. main thread of OVS) forever or for a really long time.
>>>>> This can break whole application or block the reconnection thread.
>>>>>
>>>>> Example with OVS:
>>>>>
>>>>> ovs_rcu(urcu2)|WARN|blocked 512000 ms waiting for main to quiesce
>>>>> (gdb) bt
>>>>> #0 connect () from /lib64/libpthread.so.0
>>>>> #1 vhost_user_create_client (vsocket=0xa816e0)
>>>>> #2 rte_vhost_driver_register
>>>>> #3 netdev_dpdk_vhost_user_construct
>>>>> #4 netdev_open (name=0xa664b0 "vhost1")
>>>>> [...]
>>>>> #11 main
>>>>>
>>>>> Fix that by setting non-blocking mode for client sockets for connection.
>>>>>
>>>>> Fixes: 64ab701c3d1e ("vhost: add vhost-user client mode")
>>>>
>>>> Thanks for spotting and fixing yet another bug!
>>>>
>>>>>
>>>>> +static int
>>>>> +vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz)
>>>>
>>>> I don't quite understand why this is needed: connect() with O_NONBLOCK
>>>> flag set is not enough?
>>>
>>> There is a little issue with the non-blocking connect() call. Connection
>>> establishment may be started but '-1' returned with 'errno = EINPROGRESS'.
>>> In this case we must wait on the fd until it becomes writable.
>>> After that we need to check the current status of the connection using
>>> getsockopt().
>>>
>>> I'm not sure that we can actually get into such a situation, but it's
>>> documented, and, I think, we should handle it.
>>>
>>> See 'man connect' for details.
>>
>> I see. Thanks.
>>
>> But basically, I don't like the way of introducing yet another
>> fdset here. I'm wondering whether we could leverage the current fdset
>> code to achieve that. This might need some work though.
>>
>> So how about making it simple and stupid at this stage: sleep a
>> while (maybe 1ms, or maybe 1s) when that happens, and give up
>> when the connection is still not established?
>
> Hmm, how about this fixup:
> ------------------------------------------------------------------------------
> diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
> index 8626d13..b0f45e6 100644
> --- a/lib/librte_vhost/vhost_user/vhost-net-user.c
> +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
> @@ -537,18 +537,7 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz)
>  		errno = EINVAL;
>  
>  	ret = connect(fd, un, sz);
> -	if (ret == -1 && errno != EINPROGRESS)
> -		return -1;
> -	if (ret == 0)
> -		goto connected;
> -
> -	FD_ZERO(&fdset);
> -	FD_SET(fd, &fdset);
> -
> -	ret = select(fd + 1, NULL, &fdset, NULL, &tv);
> -	if (!ret)
> -		errno = ETIMEDOUT;
> -	if (ret != 1)
> +	if (ret < 0 && errno != EISCONN)
>  		return -1;
>  
>  	ret = getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len);
> @@ -558,7 +547,6 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz)
>  		return -1;
>  	}
>  
> -connected:
>  	flags = fcntl(fd, F_GETFL, 0);
>  	if (flags < 0) {
>  		RTE_LOG(ERR, VHOST_CONFIG,
> ------------------------------------------------------------------------------
> ?
>
> We will not check for EINPROGRESS, but a subsequent 'connect()' will return
> EISCONN if the connection is already established. getsockopt() is kept just
> in case. The subsequent 'connect()' will happen on the next iteration of
> the reconnection cycle (1 second sleep).

I've sent v2 with these changes.

Best regards,
Ilya Maximets.