From: "Loftus, Ciara"
To: "Michael S. Tsirkin", "Xie, Huawei"
CC: Yuanhan Liu, "dev@dpdk.org"
Date: Tue, 10 May 2016 17:17:50 +0000
Subject: Re: [dpdk-dev] [PATCH 3/6] vhost: add reconnect ability
Message-ID: <74F120C019F4A64C9B78E802F6AD4CC24F8A74B0@IRSMSX106.ger.corp.intel.com>
In-Reply-To: <20160510121350-mutt-send-email-mst@redhat.com>

> On Tue, May 10, 2016 at 09:00:45AM +0000, Xie, Huawei wrote:
> > On 5/10/2016 4:42 PM, Michael S. Tsirkin wrote:
> > > On Tue, May 10, 2016 at 08:07:00AM +0000, Xie, Huawei wrote:
> > >> On 5/10/2016 3:56 PM, Michael S. Tsirkin wrote:
> > >>> On Tue, May 10, 2016 at 07:24:10AM +0000, Xie, Huawei wrote:
> > >>>> On 5/10/2016 2:08 AM, Yuanhan Liu wrote:
> > >>>>> On Mon, May 09, 2016 at 04:47:02PM +0000, Xie, Huawei wrote:
> > >>>>>> On 5/7/2016 2:36 PM, Yuanhan Liu wrote:
> > >>>>>>> +static void *
> > >>>>>>> +vhost_user_client_reconnect(void *arg)
> > >>>>>>> +{
> > >>>>>>> +	struct reconnect_info *reconn = arg;
> > >>>>>>> +	int ret;
> > >>>>>>> +
> > >>>>>>> +	RTE_LOG(ERR, VHOST_CONFIG, "reconnecting...\n");
> > >>>>>>> +	while (1) {
> > >>>>>>> +		ret = connect(reconn->fd, (struct sockaddr *)&reconn->un,
> > >>>>>>> +				sizeof(reconn->un));
> > >>>>>>> +		if (ret == 0)
> > >>>>>>> +			break;
> > >>>>>>> +		sleep(1);
> > >>>>>>> +	}
> > >>>>>>> +
> > >>>>>>> +	vhost_user_add_connection(reconn->fd, reconn->vsocket);
> > >>>>>>> +	free(reconn);
> > >>>>>>> +
> > >>>>>>> +	return NULL;
> > >>>>>>> +}
> > >>>>>>> +
> > >>>>>> We could create hundreds of vhost-user ports in OVS. Without connections
> > >>>>>> to QEMU established, those ports are just inactive. This works fine in
> > >>>>>> server mode.
> > >>>>>> With client mode, do we need to create hundreds of vhost threads? This
> > >>>>>> would be too interruptible.
> > >>>>>> How about we create only one thread and do the reconnections for all the
> > >>>>>> unconnected sockets?
> > >>>>> Yes, good point and good suggestion. Will do it in v2.
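(To sketch the single-thread variant agreed on above: one thread can walk a list of
not-yet-connected client sockets and retry each periodically. This is an illustration
only, not the v2 patch: struct reconnect_info and vhost_user_add_connection() follow
the snippet quoted above, while the pending list, its mutex and MAX_VHOST_SOCKETS are
assumptions made for the sketch.)

#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

#define MAX_VHOST_SOCKETS 1024	/* illustrative bound, not from the patch */

struct vhost_user_socket;	/* opaque here; defined in the vhost library */
void vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket);

struct reconnect_info {		/* mirrors the layout used by the patch */
	struct sockaddr_un un;
	int fd;
	struct vhost_user_socket *vsocket;
};

static struct reconnect_info *reconn_list[MAX_VHOST_SOCKETS];
static int reconn_count;
static pthread_mutex_t reconn_mutex = PTHREAD_MUTEX_INITIALIZER;

/* One thread retries every pending client socket once per second and
 * hands each one over as soon as connect() succeeds. */
static void *
vhost_user_reconnect_thread(void *arg)
{
	int i;

	(void)arg;
	while (1) {
		pthread_mutex_lock(&reconn_mutex);
		for (i = 0; i < reconn_count; i++) {
			struct reconnect_info *reconn = reconn_list[i];

			if (connect(reconn->fd,
				    (struct sockaddr *)&reconn->un,
				    sizeof(reconn->un)) < 0)
				continue;	/* still no server; retry next pass */

			/* Connected: register the fd and drop the entry. */
			vhost_user_add_connection(reconn->fd, reconn->vsocket);
			reconn_list[i] = reconn_list[--reconn_count];
			free(reconn);
			i--;	/* re-examine the slot we just refilled */
		}
		pthread_mutex_unlock(&reconn_mutex);
		sleep(1);
	}
	return NULL;
}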
> > >>>> Hi Michael:
> > >>>> This reminds me of another, unrelated issue.
> > >>>> In OVS, currently for each vhost port, we create a unix domain socket,
> > >>>> and the QEMU vhost proxy connects to this socket, and we use this to
> > >>>> identify the connection. This works fine but is our workaround;
> > >>>> otherwise we have no way to identify the connection.
> > >>>> Do you think this is an issue?
> > >> Let us say vhost creates one unix domain socket, with path "sockpath",
> > >> and two virtio ports in two VMs both connect to the same socket with the
> > >> following command line:
> > >> -chardev socket,id=char0,path=sockpath
> > >> How could vhost identify the connection?
> > > getpeername(2)?
> >
> > getpeername returns host/port? Then it isn't useful.
>
> Maybe, but I'm still in the dark. Useful for what?
>
> > The typical scenario in my mind is:
> > we create an OVS port with the name "port1", and when we receive a
> > virtio connection with ID "port1", we attach this virtio interface to
> > the OVS port "port1".
>
> If you are going to listen on a socket, you can create ports
> and attach socket fds to it dynamically as long as you get connections.
> What is wrong with that?

Hi Michael,

I haven't reviewed the patchset fully, but to hopefully provide more clarity on how OVS uses vHost:

OVS with DPDK needs some way to distinguish vHost connections from one another so it can switch traffic to the correct port, depending on how the switch is programmed. At the moment this is achieved by:

1. The user provides a unique port name, e.g. 'vhost0' (this is normal behaviour in OVS - checks are put in place to avoid overlapping port names)
2. The DPDK vHost lib creates a socket called 'vhost0'
3. The VM is launched with the vhost0 socket // -chardev socket,id=char0,path=/path/to/vhost0
4. OVS receives the 'new_device' vhost callback and checks the name of the device (virtio_dev->ifname == vhost0?); if the name matches the name provided in step 1, OVS stores the virtio_net *dev pointer
5. OVS uses the *dev pointer to send and receive traffic to vhost0 via calls to vhost library functions, e.g. enqueue(*dev) / dequeue(*dev)
6. Repeat for multiple vhost devices

We need to make sure that there is still some way to distinguish devices from one another, as in step 4.
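To make step 4 concrete, the matching amounts to roughly the sketch below. It is a
simplified illustration, not the actual OVS code: the vhost_ports table, MAX_VHOST_PORTS
and struct vhost_port are made up for the example; only the new_device() callback
signature and the ifname field of struct virtio_net come from the DPDK vhost library.

#include <string.h>
#include <rte_virtio_net.h>

#define MAX_VHOST_PORTS 64	/* illustrative bound */

/* Hypothetical OVS-side port table, filled in at step 1. */
struct vhost_port {
	char name[IF_NAME_SZ];		/* name from step 1, as registered with the vhost lib */
	struct virtio_net *virtio_dev;	/* stored at step 4, used at step 5 */
};

static struct vhost_port vhost_ports[MAX_VHOST_PORTS];
static int n_vhost_ports;

/* Step 4: the vhost library calls this back when a QEMU connects; match
 * the reported ifname against the port names OVS knows about. */
static int
new_device(struct virtio_net *dev)
{
	int i;

	for (i = 0; i < n_vhost_ports; i++) {
		if (strncmp(dev->ifname, vhost_ports[i].name,
			    sizeof(dev->ifname)) == 0) {
			vhost_ports[i].virtio_dev = dev;
			return 0;	/* accept: traffic can now flow (step 5) */
		}
	}

	return -1;	/* no OVS port with this name: reject the device */
}

Step 5 then passes the stored pointer to the vhost library's burst functions
(rte_vhost_enqueue_burst()/rte_vhost_dequeue_burst()) for that port; the callback
itself is registered through rte_vhost_driver_callback_register().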
Let me know if you need any further clarification.

Thanks,
Ciara

>
>
> >
> >
> > >
> > >> Workarounds:
> > >> vhost creates two unix domain sockets, with paths "sockpath1" and
> > >> "sockpath2" respectively,
> > >> and the virtio ports in two VMs respectively connect to "sockpath1" and
> > >> "sockpath2".
> > >>
> > >> If we have some name string from QEMU over vhost, as you mentioned, we
> > >> could create only one socket with path "sockpath".
> > >> The user ensures that the names are unique, just as they do with multiple
> > >> sockets.
> > >>
> > > Seems rather fragile.
> >
> > From the scenario above, it is enough. That is also how it works today
> > in the DPDK OVS implementation with multiple sockets.
> > Any other idea?
> >
> >
> > >>> I'm sorry, I have trouble understanding what you wrote above.
> > >>> What is the issue you are trying to work around?
> > >>>
> > >>>> Do we have a plan to support identification in VHOST_USER_MESSAGE? With
> > >>>> the identification, if vhost acts as the server, we only need to create one
> > >>>> socket which receives multiple connections, and use the ID in the
> > >>>> message to identify the connection.
> > >>>>
> > >>>> /huawei
> > >>> Sending e.g. a -name string from qemu over vhost might be useful
> > >>> for debugging, but I'm not sure it's a good idea to
> > >>> rely on it being unique.
> > >>>
> > >>>>> Thanks.
> > >>>>>
> > >>>>> --yliu
> > >>>>>
> >