* [dpdk-dev] [RFC] vhost user: add error handling for fd > 1023
@ 2016-03-18  9:13 Patrik Andersson
  2016-03-30  9:05 ` Xie, Huawei
  0 siblings, 1 reply; 8+ messages in thread

From: Patrik Andersson @ 2016-03-18 9:13 UTC (permalink / raw)
To: dev; +Cc: Patrik Andersson

Protect against a DPDK crash when allocation of a listen fd >= 1023.
For events on fds > 1023, the current implementation triggers an
abort due to access outside of the allocated bit mask.

Corrections would include:

 * Match the fdset_add() signature in fd_man.c to fd_man.h
 * Handle the return codes from fdset_add()
 * Add a check of the fd number in fdset_add_fd()

---

The rationale behind the suggested code change is that
fdset_event_dispatch() could attempt access outside of the FD_SET
bit mask if there is an event on a file descriptor that in turn
looks up a virtio file descriptor with a value > 1023.
Such an attempt will lead to an abort() and a restart of any
vswitch using DPDK.

A discussion topic exists in the ovs-discuss mailing list that can
provide a little more background:

http://openvswitch.org/pipermail/discuss/2016-February/020243.html

Signed-off-by: Patrik Andersson <patrik.r.andersson@ericsson.com>
---
 lib/librte_vhost/vhost_user/fd_man.c         | 11 ++++++-----
 lib/librte_vhost/vhost_user/vhost-net-user.c | 22 ++++++++++++++++++++--
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/lib/librte_vhost/vhost_user/fd_man.c b/lib/librte_vhost/vhost_user/fd_man.c
index 087aaed..c691339 100644
--- a/lib/librte_vhost/vhost_user/fd_man.c
+++ b/lib/librte_vhost/vhost_user/fd_man.c
@@ -71,20 +71,22 @@ fdset_find_free_slot(struct fdset *pfdset)
 	return fdset_find_fd(pfdset, -1);
 }
 
-static void
+static int
 fdset_add_fd(struct fdset *pfdset, int idx, int fd,
 	fd_cb rcb, fd_cb wcb, void *dat)
 {
 	struct fdentry *pfdentry;
 
-	if (pfdset == NULL || idx >= MAX_FDS)
-		return;
+	if (pfdset == NULL || idx >= MAX_FDS || fd >= FD_SETSIZE)
+		return -1;
 
 	pfdentry = &pfdset->fd[idx];
 	pfdentry->fd = fd;
 	pfdentry->rcb = rcb;
 	pfdentry->wcb = wcb;
 	pfdentry->dat = dat;
+
+	return 0;
 }
 
 /**
@@ -150,12 +152,11 @@ fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat)
 
 	/* Find a free slot in the list. */
 	i = fdset_find_free_slot(pfdset);
-	if (i == -1) {
+	if (i == -1 || fdset_add_fd(pfdset, i, fd, rcb, wcb, dat) < 0) {
 		pthread_mutex_unlock(&pfdset->fd_mutex);
 		return -2;
 	}
 
-	fdset_add_fd(pfdset, i, fd, rcb, wcb, dat);
 	pfdset->num++;
 
 	pthread_mutex_unlock(&pfdset->fd_mutex);
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index df2bd64..778851d 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -288,6 +288,7 @@ vserver_new_vq_conn(int fd, void *dat, __rte_unused int *remove)
 	int fh;
 	struct vhost_device_ctx vdev_ctx = { (pid_t)0, 0 };
 	unsigned int size;
+	int ret;
 
 	conn_fd = accept(fd, NULL, NULL);
 	RTE_LOG(INFO, VHOST_CONFIG,
@@ -317,8 +318,15 @@ vserver_new_vq_conn(int fd, void *dat, __rte_unused int *remove)
 
 	ctx->vserver = vserver;
 	ctx->fh = fh;
-	fdset_add(&g_vhost_server.fdset,
+	ret = fdset_add(&g_vhost_server.fdset,
 		conn_fd, vserver_message_handler, NULL, ctx);
+	if (ret < 0) {
+		free(ctx);
+		close(conn_fd);
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"failed to add fd %d into vhost server fdset\n",
+			conn_fd);
+	}
 }
 
 /* callback when there is message on the connfd */
@@ -453,6 +461,7 @@ int
 rte_vhost_driver_register(const char *path)
 {
 	struct vhost_server *vserver;
+	int ret;
 
 	pthread_mutex_lock(&g_vhost_server.server_mutex);
 
@@ -478,8 +487,17 @@ rte_vhost_driver_register(const char *path)
 
 	vserver->path = strdup(path);
 
-	fdset_add(&g_vhost_server.fdset, vserver->listenfd,
+	ret = fdset_add(&g_vhost_server.fdset, vserver->listenfd,
 		vserver_new_vq_conn, NULL, vserver);
+	if (ret < 0) {
+		pthread_mutex_unlock(&g_vhost_server.server_mutex);
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"failed to add listen fd %d to vhost server fdset\n",
+			vserver->listenfd);
+		free(vserver->path);
+		free(vserver);
+		return -1;
+	}
 
 	g_vhost_server.server[g_vhost_server.vserver_cnt++] = vserver;
 	pthread_mutex_unlock(&g_vhost_server.server_mutex);
-- 
1.9.1
* Re: [dpdk-dev] [RFC] vhost user: add error handling for fd > 1023

From: Xie, Huawei @ 2016-03-30 9:05 UTC
To: Patrik Andersson, dev, Thomas Monjalon, Ananyev, Konstantin, Yuanhan Liu

On 3/18/2016 5:15 PM, Patrik Andersson wrote:
> Protect against DPDK crash when allocation of listen fd >= 1023.
> For events on fds > 1023, the current implementation will trigger
> an abort due to access outside of the allocated bit mask.
[...]
> A discussion topic exists in the ovs-discuss mailing list that can
> provide a little more background:
>
> http://openvswitch.org/pipermail/discuss/2016-February/020243.html

Thanks for catching this. Could you give more details on how the fds
accumulate?

The buggy code is based on the assumption that FD_SETSIZE limits the
number of file descriptors, which might be true on Windows. However,
the manual says clearly that it limits the *value* of file descriptors.

The use of select() is for portability. I have been wondering whether
that is truly so important. Using poll() could simplify the code a bit;
for example, we currently need a timeout on select() so that another
thread can insert/remove an fd into/from the monitored list.

Any comments on using poll/epoll?
> Signed-off-by: Patrik Andersson <patrik.r.andersson@ericsson.com>
> ---
[...]
> -static void
> +static int
>  fdset_add_fd(struct fdset *pfdset, int idx, int fd,
>  	fd_cb rcb, fd_cb wcb, void *dat)
>  {
>  	struct fdentry *pfdentry;
>
> -	if (pfdset == NULL || idx >= MAX_FDS)

It seems we had better change the definitions of MAX_FDS and
MAX_VHOST_SERVER to FD_SETSIZE, or add some build-time check.

> -		return;
> +	if (pfdset == NULL || idx >= MAX_FDS || fd >= FD_SETSIZE)
> +		return -1;
[...]
* Re: [dpdk-dev] [RFC] vhost user: add error handling for fd > 1023

From: Patrik Andersson R @ 2016-04-05 8:40 UTC
To: Xie, Huawei, dev, Thomas Monjalon, Ananyev, Konstantin, Yuanhan Liu

Hi,

thank you for the response on this.

The described fault situation arises due to a bug in an OpenStack
component, Neutron or Nova, that fails to release ports on VM deletion.
This typically leads to an accumulation of 1-2 file descriptors per
unreleased port. It could also arise when allocating a large number
(~500?) of vhost user ports and connecting them all to VMs.

Unfortunately I don't have a DPDK design test environment, so I have
not tried to reproduce the issue with DPDK only. But I assume that it
can be triggered by creating enough vhost user devices. File
descriptors would presumably be "consumed" by calls to both
rte_vhost_driver_register() and user_set_mem_table().

The key point, I think, is that more than one file descriptor is used
per vhost user device. This means that there is no strict relation
between the number of devices and the number of file descriptors in
use.

As for using select() or epoll(), I don't know the strength of the
portability argument. It is possible that epoll() would exhibit some
better properties, such as simpler code in the polling loop, but I have
as yet seen too little of the total picture to venture an opinion.

On 03/30/2016 11:05 AM, Xie, Huawei wrote:
> Thanks for catching this. Could you give more details on how the fds
> accumulate?
[...]
> seems we had better change the definition of MAX_FDS and
> MAX_VHOST_SERVER to FD_SETSIZE or add some build time check.

I'm not sure how build-time checks would help; in my build,
MAX_FDS == FD_SETSIZE == 1024. It is not actually possible to support
1024 vhost user devices here, because of the file descriptor
limitation. In my opinion the problem is that the assumption

    number of vhost user devices == number of file descriptors

does not hold. What the actual relation might be is hard to determine
with any certainty. Using epoll() instead of select() might relax the
checking to the number of vhost user devices only.

In addition, the return value of fdset_add_fd() should be checked
(note: in the corresponding include file the function signature is
"static int", not "static void" as in the code).
[...]
* Re: [dpdk-dev] [RFC] vhost user: add error handling for fd > 1023

From: Christian Ehrhardt @ 2016-04-07 14:49 UTC
To: Patrik Andersson R, Daniele Di Proietto
Cc: Xie, Huawei, dev, Thomas Monjalon, Ananyev, Konstantin, Yuanhan Liu

Hi Patrik,

On Tue, Apr 5, 2016 at 10:40 AM, Patrik Andersson R
<patrik.r.andersson@ericsson.com> wrote:
> The described fault situation arises due to the fact that there is a
> bug in an OpenStack component, Neutron or Nova, that fails to release
> ports on VM deletion. This typically leads to an accumulation of 1-2
> file descriptors per unreleased port. It could also arise when
> allocating a large number (~500?) of vhost user ports and connecting
> them all to VMs.

I can confirm that I'm able to trigger this without OpenStack, using
DPDK 2.2 and Open vSwitch 2.5. Initially I had at least 2 guests
attached to the first two ports, but that seems unnecessary, which
makes it as easy as:

  ovs-vsctl add-br ovsdpdkbr0 -- set bridge ovsdpdkbr0 datapath_type=netdev
  ovs-vsctl add-port ovsdpdkbr0 dpdk0 -- set Interface dpdk0 type=dpdk
  for idx in {1..1023}; do
      ovs-vsctl add-port ovsdpdkbr0 vhost-user-${idx} \
          -- set Interface vhost-user-${idx} type=dpdkvhostuser
  done

As soon as the associated fd is > 1023, the vhost_user socket still
gets created, but just afterwards I see the crash mentioned by Patrik:

#0  0x00007f51cb187518 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007f51cb1890ea in __GI_abort () at abort.c:89
#2  0x00007f51cb1c98c4 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f51cb2e1584 "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f51cb26af94 in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x7f51cb2e1515 "buffer overflow detected") at fortify_fail.c:37
#4  0x00007f51cb268fa0 in __GI___chk_fail () at chk_fail.c:28
#5  0x00007f51cb26aee7 in __fdelt_chk (d=<optimized out>) at fdelt_chk.c:25
#6  0x00007f51cbd6d665 in fdset_fill (pfdset=0x7f51cc03dfa0 <g_vhost_server+8192>, wfset=0x7f51c78e4a30, rfset=0x7f51c78e49b0) at /build/dpdk-3lQdSB/dpdk-2.2.0/lib/librte_vhost/vhost_user/fd_man.c:110
#7  fdset_event_dispatch (pfdset=pfdset@entry=0x7f51cc03dfa0 <g_vhost_server+8192>) at /build/dpdk-3lQdSB/dpdk-2.2.0/lib/librte_vhost/vhost_user/fd_man.c:243
#8  0x00007f51cbdc1b00 in rte_vhost_driver_session_start () at /build/dpdk-3lQdSB/dpdk-2.2.0/lib/librte_vhost/vhost_user/vhost-net-user.c:525
#9  0x00000000005061ab in start_vhost_loop (dummy=<optimized out>) at ../lib/netdev-dpdk.c:2047
#10 0x00000000004c2c64 in ovsthread_wrapper (aux_=<optimized out>) at ../lib/ovs-thread.c:340
#11 0x00007f51cba346fa in start_thread (arg=0x7f51c78e5700) at pthread_create.c:333
#12 0x00007f51cb2592dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Like Patrik I don't have a "pure" DPDK test yet, but at least OpenStack
is out of scope now, which should help.

[...]
> The key point, I think, is that more than one file descriptor is used
> per vhost user device. This means that there is no real relation
> between the number of devices and the number of file descriptors in
> use.

Well, it is "one per vhost_user device" as far as I've seen, but those
are not the only fds used overall.

[...]
> In my opinion the problem is that the assumption: number of vhost
> user devices == number of file descriptors does not hold. What the
> actual relation might be is hard to determine with any certainty.

I totally agree that there is no deterministic rule for what to expect.
The only rule is that #fd certainly always is > #vhost_user devices.
In various setup variants I've crossed fd 1024 anywhere between 475 and
970 vhost_user ports.

Once the discussion continues and we have an updated version of the
patch with some more agreement, I hope I can help to test it.

Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd
* Re: [dpdk-dev] [RFC] vhost user: add error handling for fd > 1023

From: Xie, Huawei @ 2016-04-08 6:47 UTC
To: Christian Ehrhardt, Patrik Andersson R, Daniele Di Proietto
Cc: dev, Thomas Monjalon, Ananyev, Konstantin, Yuanhan Liu

On 4/7/2016 10:52 PM, Christian Ehrhardt wrote:
> I totally agree that there is no deterministic rule for what to
> expect. The only rule is that #fd certainly always is > #vhost_user
> devices. In various setup variants I've crossed fd 1024 anywhere
> between 475 and 970 vhost_user ports.
>
> Once the discussion continues and we have an updated version of the
> patch with some more agreement, I hope I can help to test it.

Thanks. Let us first fix this problem temporarily for robustness, then
consider whether to upgrade to (e)poll.

I will check the patch in detail later. Basically it should work, but
we need to check whether extra fixes are needed elsewhere.
* Re: [dpdk-dev] [RFC] vhost user: add error handling for fd > 1023

From: Patrik Andersson R @ 2016-04-11 6:06 UTC
To: Xie, Huawei, Christian Ehrhardt, Daniele Di Proietto
Cc: dev, Thomas Monjalon, Ananyev, Konstantin, Yuanhan Liu

I fully agree with this course of action.

Thank you,

Patrik

On 04/08/2016 08:47 AM, Xie, Huawei wrote:
> Thanks. Let us first fix this problem temporarily for robustness,
> then consider whether to upgrade to (e)poll.
[...]
* Re: [dpdk-dev] [RFC] vhost user: add error handling for fd > 1023

From: Christian Ehrhardt @ 2016-04-11 9:34 UTC
To: Patrik Andersson R
Cc: Xie, Huawei, Daniele Di Proietto, dev, Thomas Monjalon, Ananyev,
    Konstantin, Yuanhan Liu

I like the approach as well: go for the robustness fix first.

I was accidentally able to find another test case that hits the same
root cause. Adding guests with 15 vhost_user based NICs each, while
having rxq for openvswitch-dpdk set to 4 and multiqueue for the guest
devices at 4, already breaks when adding the third such guest. That is
way earlier than I would have expected the fds to be exhausted, but it
is still the same root cause, so just another test for the same issue.

In preparation for the wider review of the patch, a minor question from
me: in the section of rte_vhost_driver_register() that now detects
failure, we might want to close the socket fd as well when bailing out.
Otherwise we would just have another source of fd leaks. Or would that
fd be reused later on, even now that we have freed vserver->path and
vserver itself?

	ret = fdset_add(&g_vhost_server.fdset, vserver->listenfd,
		vserver_new_vq_conn, NULL, vserver);
	if (ret < 0) {
		pthread_mutex_unlock(&g_vhost_server.server_mutex);
		RTE_LOG(ERR, VHOST_CONFIG,
			"failed to add listen fd %d to vhost server fdset\n",
			vserver->listenfd);
		free(vserver->path);
+		close(vserver->listenfd);
		free(vserver);
		return -1;
	}

Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

On Mon, Apr 11, 2016 at 8:06 AM, Patrik Andersson R
<patrik.r.andersson@ericsson.com> wrote:
> I fully agree with this course of action.
[...]
* Re: [dpdk-dev] [RFC] vhost user: add error handling for fd > 1023

From: Patrik Andersson R @ 2016-04-11 11:21 UTC
To: Christian Ehrhardt
Cc: Xie, Huawei, Daniele Di Proietto, dev, Thomas Monjalon, Ananyev,
    Konstantin, Yuanhan Liu

Yes, that is correct. Closing the socket on failure needs to be added.

Regards,

Patrik

On 04/11/2016 11:34 AM, Christian Ehrhardt wrote:
> I like the approach as well: go for the robustness fix first.
[...]
> In preparation for the wider review of the patch, a minor question
> from me: in the section of rte_vhost_driver_register() that now
> detects failure, we might want to close the socket fd as well when
> bailing out. Otherwise we would just have another source of fd leaks.
[...]