From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Loftus, Ciara" <ciara.loftus@intel.com>
Cc: "Xie, Huawei" <huawei.xie@intel.com>,
Yuanhan Liu <yuanhan.liu@linux.intel.com>,
"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH 3/6] vhost: add reconnect ability
Date: Thu, 12 May 2016 00:46:08 +0300
Message-ID: <20160512004240-mutt-send-email-mst@redhat.com>
In-Reply-To: <74F120C019F4A64C9B78E802F6AD4CC24F8A74B0@IRSMSX106.ger.corp.intel.com>
On Tue, May 10, 2016 at 05:17:50PM +0000, Loftus, Ciara wrote:
> > On Tue, May 10, 2016 at 09:00:45AM +0000, Xie, Huawei wrote:
> > > On 5/10/2016 4:42 PM, Michael S. Tsirkin wrote:
> > > > On Tue, May 10, 2016 at 08:07:00AM +0000, Xie, Huawei wrote:
> > > >> On 5/10/2016 3:56 PM, Michael S. Tsirkin wrote:
> > > >>> On Tue, May 10, 2016 at 07:24:10AM +0000, Xie, Huawei wrote:
> > > >>>> On 5/10/2016 2:08 AM, Yuanhan Liu wrote:
> > > >>>>> On Mon, May 09, 2016 at 04:47:02PM +0000, Xie, Huawei wrote:
> > > >>>>>> On 5/7/2016 2:36 PM, Yuanhan Liu wrote:
> > > >>>>>>> +static void *
> > > >>>>>>> +vhost_user_client_reconnect(void *arg)
> > > >>>>>>> +{
> > > >>>>>>> + struct reconnect_info *reconn = arg;
> > > >>>>>>> + int ret;
> > > >>>>>>> +
> > > >>>>>>> + RTE_LOG(ERR, VHOST_CONFIG, "reconnecting...\n");
> > > >>>>>>> + while (1) {
> > > >>>>>>> + ret = connect(reconn->fd, (struct sockaddr *)&reconn->un,
> > > >>>>>>> + sizeof(reconn->un));
> > > >>>>>>> + if (ret == 0)
> > > >>>>>>> + break;
> > > >>>>>>> + sleep(1);
> > > >>>>>>> + }
> > > >>>>>>> +
> > > >>>>>>> + vhost_user_add_connection(reconn->fd, reconn->vsocket);
> > > >>>>>>> + free(reconn);
> > > >>>>>>> +
> > > >>>>>>> + return NULL;
> > > >>>>>>> +}
> > > >>>>>>> +
> > > >>>>>> We could create hundreds of vhost-user ports in OVS. Without
> > > >>>>>> connections with QEMU established, those ports are just inactive.
> > > >>>>>> This works fine in server mode.
> > > >>>>>> With client mode, do we need to create hundreds of vhost threads?
> > > >>>>>> This would be too interruptible.
> > > >>>>>> How about we create only one thread and do the reconnections for
> > > >>>>>> all the unconnected sockets?
> > > >>>>> Yes, good point and good suggestion. Will do it in v2.
> > > >>>> Hi Michael:
> > > >>>> This reminds me of another, unrelated issue.
> > > >>>> In OVS, currently for each vhost port, we create a unix domain
> > > >>>> socket, and QEMU vhost proxy connects to this socket, and we use
> > > >>>> this to identify the connection. This works fine but is our
> > > >>>> workaround; otherwise we have no way to identify the connection.
> > > >>>> Do you think this is an issue?
> > > >> Let us say vhost creates one unix domain socket, with path as
> > > >> "sockpath", and two virtio ports in two VMs both connect to the
> > > >> same socket with the following command line
> > > >> -chardev socket,id=char0,path=sockpath
> > > >> How could vhost identify the connection?
> > > > getpeername(2)?
> > >
> > > getpeername returns host/port? Then it isn't useful.
> >
> > Maybe but I'm still in the dark. Useful for what?
> >
> > > The typical scenario in my mind is:
> > > We create an OVS port with the name "port1", and when we receive a
> > > virtio connection with ID "port1", we attach this virtio interface to
> > > the OVS port "port1".
> >
> > If you are going to listen on a socket, you can create ports
> > and attach socket fds to it dynamically as long as you get connections.
> > What is wrong with that?
>
> Hi Michael,
>
> I haven't reviewed the patchset fully, but to hopefully provide more clarity on how OVS uses vHost:
>
> OVS with DPDK needs some way to distinguish vHost connections from one another so it can switch traffic to the correct port depending on how the switch is programmed.
> At the moment this is achieved by:
> 1. user provides unique port name eg. 'vhost0' (this is normal behaviour in OVS - checks are put in place to avoid overlapping port names)
> 2. DPDK vHost lib creates socket called 'vhost0'
> 3. VM launched with vhost0 socket // -chardev socket,id=char0,path=/path/to/vhost0
> 4. OVS receives 'new_device' vhost callback, checks the name of the device (virtio_dev->ifname == vhost0?), if the name matches the name provided in step 1, OVS stores the virtio_net *dev pointer
> 5. OVS uses *dev pointer to send and receive traffic to vhost0 via calls to vhost library functions eg. enqueue(*dev) / dequeue(*dev)
> 6. Repeat for multiple vhost devices
>
> We need to make sure that there is still some way to distinguish devices from one another like in step 4. Let me know if you need any further clarification.
>
> Thanks,
> Ciara
I see. Personally, I think the path approach is better:
it is easier to secure, since you can give each QEMU access to
the correct path only.
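
To illustrate the per-path approach, here is a minimal sketch, not code
from the patch set or from OVS. The helper names port_socket_listen() and
port_socket_name() are hypothetical, and the 0600 mode is an assumption:
on Linux, connecting to a unix socket requires write permission on the
socket file, so in a real deployment the file would also need to be owned
by (or otherwise made accessible to) the one QEMU process meant to use it.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a listening unix socket at dir/port_name,
 * accessible to its owner only. Returns the fd, or -1 on error.
 * Note: there is a short window between bind() and chmod(); a real
 * implementation might set a restrictive umask before bind() instead.
 */
static int
port_socket_listen(const char *dir, const char *port_name)
{
	struct sockaddr_un un;
	int fd, n;

	memset(&un, 0, sizeof(un));
	un.sun_family = AF_UNIX;
	n = snprintf(un.sun_path, sizeof(un.sun_path), "%s/%s", dir, port_name);
	if (n < 0 || (size_t)n >= sizeof(un.sun_path))
		return -1;	/* path does not fit in sun_path */

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	unlink(un.sun_path);	/* drop a stale socket from a previous run */
	if (bind(fd, (struct sockaddr *)&un, sizeof(un)) < 0 ||
	    chmod(un.sun_path, 0600) < 0 ||
	    listen(fd, 1) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: recover the port name from a socket bound by
 * port_socket_listen(), i.e. the last component of its path.
 * Returns 0 on success, -1 on error.
 */
static int
port_socket_name(int fd, char *name, size_t len)
{
	struct sockaddr_un un;
	socklen_t slen = sizeof(un);
	const char *base;

	memset(&un, 0, sizeof(un));
	if (getsockname(fd, (struct sockaddr *)&un, &slen) < 0)
		return -1;

	base = strrchr(un.sun_path, '/');
	base = base ? base + 1 : un.sun_path;
	if (strlen(base) >= len)
		return -1;
	strcpy(name, base);
	return 0;
}
```

With one socket per port, the path itself carries the identity: a caller
could do port_socket_listen("/path/to", "vhost0") and later map the fd
back to "vhost0" with port_socket_name(), without relying on any name
sent by the peer.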
> >
> >
> > >
> > > >
> > > >
> > > >> Workarounds:
> > > >> vhost creates two unix domain sockets, with path as "sockpath1" and
> > > >> "sockpath2" respectively,
> > > >> and the virtio ports in two VMs respectively connect to "sockpath1" and
> > > >> "sockpath2".
> > > >>
> > > >> If we have some name string from QEMU over vhost, as you mentioned,
> > > >> we could create only one socket with path "sockpath".
> > > >> Users ensure that the names are unique, just as they do with
> > > >> multiple sockets.
> > > >>
> > > > Seems rather fragile.
> > >
> > > From the scenario above, it is enough. That is also how it works today
> > > in DPDK OVS implementation with multiple sockets.
> > > Any other idea?
> > >
> > > >
> > > >>> I'm sorry, I have trouble understanding what you wrote above.
> > > >>> What is the issue you are trying to work around?
> > > >>>
> > > >>>> Do we have a plan to support identification in VHOST_USER_MESSAGE?
> > > >>>> With the identification, if vhost acts as server, we only need to
> > > >>>> create one socket which receives multiple connections, and use the
> > > >>>> ID in the message to identify the connection.
> > > >>>>
> > > >>>> /huawei
> > > >>> Sending e.g. -name string from qemu over vhost might be useful
> > > >>> for debugging, but I'm not sure it's a good idea to
> > > >>> rely on it being unique.
> > > >>>
> > > >>>>> Thanks.
> > > >>>>>
> > > >>>>> --yliu
> > > >>>>>
> > >