From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
To: Patrik Andersson R <patrik.r.andersson@ericsson.com>,
Daniele Di Proietto <diproiettod@vmware.com>
Cc: "Xie, Huawei" <huawei.xie@intel.com>,
"dev@dpdk.org" <dev@dpdk.org>,
Thomas Monjalon <thomas.monjalon@6wind.com>,
"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
Yuanhan Liu <yuanhan.liu@linux.intel.com>
Subject: Re: [dpdk-dev] [RFC] vhost user: add error handling for fd > 1023
Date: Thu, 7 Apr 2016 16:49:27 +0200 [thread overview]
Message-ID: <CAATJJ0LTPKe6hbnWM-noXDbCJ0Y2H6SErg7N8UL1vWuLC1qqLg@mail.gmail.com> (raw)
In-Reply-To: <570379F9.6020306@ericsson.com>
Hi Patrik,
On Tue, Apr 5, 2016 at 10:40 AM, Patrik Andersson R <
patrik.r.andersson@ericsson.com> wrote:
>
> The described fault situation arises due to the fact that there is a bug
> in an OpenStack component, Neutron or Nova, that fails to release ports
> on VM deletion. This typically leads to an accumulation of 1-2 file
> descriptors per unreleased port. It could also arise when allocating a
> large number (~500?) of vhost user ports and connecting them all to VMs.
>
I can confirm that I'm able to trigger this without OpenStack, using
DPDK 2.2 and Open vSwitch 2.5.
Initially I had at least 2 guests attached to the first two ports, but that
seems unnecessary, which makes reproduction as easy as:
ovs-vsctl add-br ovsdpdkbr0 -- set bridge ovsdpdkbr0 datapath_type=netdev
ovs-vsctl add-port ovsdpdkbr0 dpdk0 -- set Interface dpdk0 type=dpdk
for idx in {1..1023}; do ovs-vsctl add-port ovsdpdkbr0 vhost-user-${idx} -- set Interface vhost-user-${idx} type=dpdkvhostuser; done
=> as soon as the associated fd is >1023 the vhost_user socket still gets
created, but just afterwards I see the crash mentioned by Patrik:
#0  0x00007f51cb187518 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007f51cb1890ea in __GI_abort () at abort.c:89
#2  0x00007f51cb1c98c4 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f51cb2e1584 "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f51cb26af94 in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x7f51cb2e1515 "buffer overflow detected") at fortify_fail.c:37
#4  0x00007f51cb268fa0 in __GI___chk_fail () at chk_fail.c:28
#5  0x00007f51cb26aee7 in __fdelt_chk (d=<optimized out>) at fdelt_chk.c:25
#6  0x00007f51cbd6d665 in fdset_fill (pfdset=0x7f51cc03dfa0 <g_vhost_server+8192>, wfset=0x7f51c78e4a30, rfset=0x7f51c78e49b0) at /build/dpdk-3lQdSB/dpdk-2.2.0/lib/librte_vhost/vhost_user/fd_man.c:110
#7  fdset_event_dispatch (pfdset=pfdset@entry=0x7f51cc03dfa0 <g_vhost_server+8192>) at /build/dpdk-3lQdSB/dpdk-2.2.0/lib/librte_vhost/vhost_user/fd_man.c:243
#8  0x00007f51cbdc1b00 in rte_vhost_driver_session_start () at /build/dpdk-3lQdSB/dpdk-2.2.0/lib/librte_vhost/vhost_user/vhost-net-user.c:525
#9  0x00000000005061ab in start_vhost_loop (dummy=<optimized out>) at ../lib/netdev-dpdk.c:2047
#10 0x00000000004c2c64 in ovsthread_wrapper (aux_=<optimized out>) at ../lib/ovs-thread.c:340
#11 0x00007f51cba346fa in start_thread (arg=0x7f51c78e5700) at pthread_create.c:333
#12 0x00007f51cb2592dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Like Patrik, I don't have a "pure" DPDK test yet, but at least OpenStack is
out of scope now, which should help.
[...]
> The key point, I think, is that more than one file descriptor is used per
> vhost user device. This means that there is no real relation between the
> number of devices and the number of file descriptors in use.
Well, it is "one per vhost_user device" as far as I've seen, but those are
not the only fds in use overall.
[...]
> In my opinion the problem is that the assumption: number of vhost
> user devices == number of file descriptors does not hold. What the actual
> relation might be is hard to determine with any certainty.
>
I totally agree that there is no deterministic rule for what to expect.
The only rule is that #fd is certainly always > #vhost_user devices.
In various setup variants I've crossed fd 1024 anywhere between 475 and 970
vhost_user ports.
Once the discussion continues and we have an updated version of the patch
with more agreement, I hope I can help test it.
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd
Thread overview: 8+ messages
2016-03-18 9:13 Patrik Andersson
2016-03-30 9:05 ` Xie, Huawei
2016-04-05 8:40 ` Patrik Andersson R
2016-04-07 14:49 ` Christian Ehrhardt [this message]
2016-04-08 6:47 ` Xie, Huawei
2016-04-11 6:06 ` Patrik Andersson R
2016-04-11 9:34 ` Christian Ehrhardt
2016-04-11 11:21 ` Patrik Andersson R