* [dpdk-dev] Memory leak when adding/removing vhost_user ports @ 2016-04-18 17:18 Christian Ehrhardt 2016-04-18 17:46 ` Yuanhan Liu 2016-04-21 11:01 ` [dpdk-dev] " Ilya Maximets 0 siblings, 2 replies; 22+ messages in thread From: Christian Ehrhardt @ 2016-04-18 17:18 UTC (permalink / raw) To: dev; +Cc: Daniele Di Proietto I assume there is a leak somewhere on adding/removing vhost_user ports. Although it could also be "only" a fragmentation issue. Reproduction is easy: I set up a pair of nicely working OVS-DPDK connected KVM Guests. Then in a loop I - add up to more 512 ports - test connectivity between the two guests - remove up to 512 ports Depending on memory and the amount of multiqueue/rxq I use it seems to slightly change when exactly it breaks. But for my default setup of 4 queues and 5G Hugepages initialized by DPDK it always breaks at the sixth iteration. Here a link to the stack trace indicating a memory shortage (TBC): https://launchpadlibrarian.net/253916410/apport-retrace.log Known Todos: - I want to track it down more, and will try to come up with a non openvswitch based looping testcase that might show it as well to simplify debugging. - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DPDK 16.04 and Openvswitch master is planned. I will go on debugging this and let you know, but I wanted to give a heads up to everyone. In case this is a known issue for some of you please let me know. Kind Regards, Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd P.S. I think it is a dpdk issue, but adding Daniele on CC to represent ovs-dpdk as well. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] Memory leak when adding/removing vhost_user ports 2016-04-18 17:18 [dpdk-dev] Memory leak when adding/removing vhost_user ports Christian Ehrhardt @ 2016-04-18 17:46 ` Yuanhan Liu 2016-04-18 18:14 ` Yuanhan Liu 2016-04-21 11:01 ` [dpdk-dev] " Ilya Maximets 1 sibling, 1 reply; 22+ messages in thread From: Yuanhan Liu @ 2016-04-18 17:46 UTC (permalink / raw) To: Christian Ehrhardt; +Cc: dev, Daniele Di Proietto On Mon, Apr 18, 2016 at 07:18:05PM +0200, Christian Ehrhardt wrote: > I assume there is a leak somewhere on adding/removing vhost_user ports. > Although it could also be "only" a fragmentation issue. > > Reproduction is easy: > I set up a pair of nicely working OVS-DPDK connected KVM Guests. > Then in a loop I > - add up to more 512 ports > - test connectivity between the two guests > - remove up to 512 ports > > Depending on memory and the amount of multiqueue/rxq I use it seems to > slightly change when exactly it breaks. But for my default setup of 4 > queues and 5G Hugepages initialized by DPDK it always breaks at the sixth > iteration. > Here a link to the stack trace indicating a memory shortage (TBC): > https://launchpadlibrarian.net/253916410/apport-retrace.log > > Known Todos: > - I want to track it down more, and will try to come up with a non > openvswitch based looping testcase that might show it as well to simplify > debugging. > - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DPDK 16.04 and > Openvswitch master is planned. > > I will go on debugging this and let you know, but I wanted to give a heads > up to everyone. Thanks for the report. > In case this is a known issue for some of you please let me know. Yeah, it might be. I'm wondering that virtio_net struct is not freed. It will be freed only (if I'm not mistaken) when guest quits, by far. BTW, could you dump the ovs-dpdk log? --yliu ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] Memory leak when adding/removing vhost_user ports 2016-04-18 17:46 ` Yuanhan Liu @ 2016-04-18 18:14 ` Yuanhan Liu 2016-04-19 16:33 ` Christian Ehrhardt 0 siblings, 1 reply; 22+ messages in thread From: Yuanhan Liu @ 2016-04-18 18:14 UTC (permalink / raw) To: Christian Ehrhardt; +Cc: dev, Daniele Di Proietto On Mon, Apr 18, 2016 at 10:46:50AM -0700, Yuanhan Liu wrote: > On Mon, Apr 18, 2016 at 07:18:05PM +0200, Christian Ehrhardt wrote: > > I assume there is a leak somewhere on adding/removing vhost_user ports. > > Although it could also be "only" a fragmentation issue. > > > > Reproduction is easy: > > I set up a pair of nicely working OVS-DPDK connected KVM Guests. > > Then in a loop I > > - add up to more 512 ports > > - test connectivity between the two guests > > - remove up to 512 ports > > > > Depending on memory and the amount of multiqueue/rxq I use it seems to > > slightly change when exactly it breaks. But for my default setup of 4 > > queues and 5G Hugepages initialized by DPDK it always breaks at the sixth > > iteration. > > Here a link to the stack trace indicating a memory shortage (TBC): > > https://launchpadlibrarian.net/253916410/apport-retrace.log > > > > Known Todos: > > - I want to track it down more, and will try to come up with a non > > openvswitch based looping testcase that might show it as well to simplify > > debugging. > > - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DPDK 16.04 and > > Openvswitch master is planned. > > > > I will go on debugging this and let you know, but I wanted to give a heads > > up to everyone. > > Thanks for the report. > > > In case this is a known issue for some of you please let me know. > > Yeah, it might be. I'm wondering that virtio_net struct is not freed. > It will be freed only (if I'm not mistaken) when guest quits, by far. Would you try following diff and to see if it fix your issue? --yliu --- lib/librte_vhost/vhost_user/vhost-net-user.c | 6 ++++++ lib/librte_vhost/vhost_user/vhost-net-user.h | 1 + 2 files changed, 7 insertions(+) diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c index df2bd64..8f7ebd7 100644 --- a/lib/librte_vhost/vhost_user/vhost-net-user.c +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c @@ -309,6 +309,7 @@ vserver_new_vq_conn(int fd, void *dat, __rte_unused int *remove) } vdev_ctx.fh = fh; + vserver->fh = fh; size = strnlen(vserver->path, PATH_MAX); vhost_set_ifname(vdev_ctx, vserver->path, size); @@ -501,6 +502,11 @@ rte_vhost_driver_unregister(const char *path) for (i = 0; i < g_vhost_server.vserver_cnt; i++) { if (!strcmp(g_vhost_server.server[i]->path, path)) { + struct vhost_device_ctx ctx; + + ctx.fh = g_vhost_server.server[i]->fh; + vhost_destroy_device(ctx); + fdset_del(&g_vhost_server.fdset, g_vhost_server.server[i]->listenfd); diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.h b/lib/librte_vhost/vhost_user/vhost-net-user.h index e3bb413..7cf21db 100644 --- a/lib/librte_vhost/vhost_user/vhost-net-user.h +++ b/lib/librte_vhost/vhost_user/vhost-net-user.h @@ -43,6 +43,7 @@ struct vhost_server { char *path; /**< The path the uds is bind to. */ int listenfd; /**< The listener sockfd. */ + uint32_t fh; }; /* refer to hw/virtio/vhost-user.c */ -- 1.9.3 ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] Memory leak when adding/removing vhost_user ports 2016-04-18 18:14 ` Yuanhan Liu @ 2016-04-19 16:33 ` Christian Ehrhardt 2016-04-20 5:04 ` Yuanhan Liu 2016-07-06 12:24 ` [dpdk-dev] [PATCH v2] " Christian Ehrhardt 0 siblings, 2 replies; 22+ messages in thread From: Christian Ehrhardt @ 2016-04-19 16:33 UTC (permalink / raw) To: Yuanhan Liu; +Cc: dev, Daniele Di Proietto Hi, thanks for the patch. I backported it this way to my DPDK 2.2 based environment for now (see below): With that applied one (and only one) of my two guests looses connectivity after removing the ports the first time. No traffic seems to pass, setting the device in the guest down/up doesn't get anything. But It isn't totally broken - stopping/starting the guest gets it working again. So openvswitch/dpdk is still somewhat working - it just seems the guest lost something, after tapping on that vhost_user interface again it works. I will check tomorrow and let you know: - if I'm more lucky with that patch on top of 16.04 - if it looses connectivity after the first or a certain amount of port removes If you find issues with my backport adaption let me know. --- Backport and reasoning: new fix relies on a lot of new code, vhost_destroy_device looks totally different from the former destroy_device. History on todays function content: 4796ad63 - original code moved from examples to lib a90ca1a1 - this replaces ops->destroy_device with vhost_destroy_device 71dc571e - simple check against null pointers 45ca9c6f - this changed the code from linked list to arrays New code cleans with: notify_ops->destroy_device (callback into the parent) cleanup_device was existing before even in 2.2 code free_device as well existing before even in 2.2 code Old code cleans with: notify_ops->destroy_device - still there rm_config_ll_entry -> eventually calls cleanup_device and free_device (just in the more complex linked list way) So the "only" adaption for backporting needed is to replace vhost_destroy_device with ops->destroy_device(ctx) Index: dpdk/lib/librte_vhost/vhost_user/vhost-net-user.c =================================================================== --- dpdk.orig/lib/librte_vhost/vhost_user/vhost-net-user.c +++ dpdk/lib/librte_vhost/vhost_user/vhost-net-user.c @@ -310,6 +310,7 @@ vserver_new_vq_conn(int fd, void *dat, _ } vdev_ctx.fh = fh; + vserver->fh = fh; size = strnlen(vserver->path, PATH_MAX); ops->set_ifname(vdev_ctx, vserver->path, size); @@ -516,6 +517,11 @@ rte_vhost_driver_unregister(const char * for (i = 0; i < g_vhost_server.vserver_cnt; i++) { if (!strcmp(g_vhost_server.server[i]->path, path)) { + struct vhost_device_ctx ctx; + + ctx.fh = g_vhost_server.server[i]->fh; + ops->destroy_device(ctx); + fdset_del(&g_vhost_server.fdset, g_vhost_server.server[i]->listenfd); Index: dpdk/lib/librte_vhost/vhost_user/vhost-net-user.h =================================================================== --- dpdk.orig/lib/librte_vhost/vhost_user/vhost-net-user.h +++ dpdk/lib/librte_vhost/vhost_user/vhost-net-user.h @@ -43,6 +43,7 @@ struct vhost_server { char *path; /**< The path the uds is bind to. */ int listenfd; /**< The listener sockfd. */ + uint32_t fh; }; /* refer to hw/virtio/vhost-user.c */ Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd On Mon, Apr 18, 2016 at 8:14 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote: > On Mon, Apr 18, 2016 at 10:46:50AM -0700, Yuanhan Liu wrote: > > On Mon, Apr 18, 2016 at 07:18:05PM +0200, Christian Ehrhardt wrote: > > > I assume there is a leak somewhere on adding/removing vhost_user ports. > > > Although it could also be "only" a fragmentation issue. > > > > > > Reproduction is easy: > > > I set up a pair of nicely working OVS-DPDK connected KVM Guests. > > > Then in a loop I > > > - add up to more 512 ports > > > - test connectivity between the two guests > > > - remove up to 512 ports > > > > > > Depending on memory and the amount of multiqueue/rxq I use it seems to > > > slightly change when exactly it breaks. But for my default setup of 4 > > > queues and 5G Hugepages initialized by DPDK it always breaks at the > sixth > > > iteration. > > > Here a link to the stack trace indicating a memory shortage (TBC): > > > https://launchpadlibrarian.net/253916410/apport-retrace.log > > > > > > Known Todos: > > > - I want to track it down more, and will try to come up with a non > > > openvswitch based looping testcase that might show it as well to > simplify > > > debugging. > > > - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DPDK > 16.04 and > > > Openvswitch master is planned. > > > > > > I will go on debugging this and let you know, but I wanted to give a > heads > > > up to everyone. > > > > Thanks for the report. > > > > > In case this is a known issue for some of you please let me know. > > > > Yeah, it might be. I'm wondering that virtio_net struct is not freed. > > It will be freed only (if I'm not mistaken) when guest quits, by far. > > Would you try following diff and to see if it fix your issue? > > --yliu > > --- > lib/librte_vhost/vhost_user/vhost-net-user.c | 6 ++++++ > lib/librte_vhost/vhost_user/vhost-net-user.h | 1 + > 2 files changed, 7 insertions(+) > > diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c > b/lib/librte_vhost/vhost_user/vhost-net-user.c > index df2bd64..8f7ebd7 100644 > --- a/lib/librte_vhost/vhost_user/vhost-net-user.c > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c > @@ -309,6 +309,7 @@ vserver_new_vq_conn(int fd, void *dat, __rte_unused > int *remove) > } > > vdev_ctx.fh = fh; > + vserver->fh = fh; > size = strnlen(vserver->path, PATH_MAX); > vhost_set_ifname(vdev_ctx, vserver->path, > size); > @@ -501,6 +502,11 @@ rte_vhost_driver_unregister(const char *path) > > for (i = 0; i < g_vhost_server.vserver_cnt; i++) { > if (!strcmp(g_vhost_server.server[i]->path, path)) { > + struct vhost_device_ctx ctx; > + > + ctx.fh = g_vhost_server.server[i]->fh; > + vhost_destroy_device(ctx); > + > fdset_del(&g_vhost_server.fdset, > g_vhost_server.server[i]->listenfd); > > diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.h > b/lib/librte_vhost/vhost_user/vhost-net-user.h > index e3bb413..7cf21db 100644 > --- a/lib/librte_vhost/vhost_user/vhost-net-user.h > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.h > @@ -43,6 +43,7 @@ > struct vhost_server { > char *path; /**< The path the uds is bind to. */ > int listenfd; /**< The listener sockfd. */ > + uint32_t fh; > }; > > /* refer to hw/virtio/vhost-user.c */ > -- > 1.9.3 > > > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] Memory leak when adding/removing vhost_user ports 2016-04-19 16:33 ` Christian Ehrhardt @ 2016-04-20 5:04 ` Yuanhan Liu 2016-04-20 6:18 ` Christian Ehrhardt 2016-07-06 12:24 ` [dpdk-dev] [PATCH v2] " Christian Ehrhardt 1 sibling, 1 reply; 22+ messages in thread From: Yuanhan Liu @ 2016-04-20 5:04 UTC (permalink / raw) To: Christian Ehrhardt; +Cc: dev, Daniele Di Proietto On Tue, Apr 19, 2016 at 06:33:50PM +0200, Christian Ehrhardt wrote: > Hi, > thanks for the patch. > I backported it this way to my DPDK 2.2 based environment for now (see below): > > With that applied one (and only one) of my two guests looses connectivity after > removing the ports the first time. Yeah, that's should be because I invoked the "->destroy_device()" callback. BTW, I'm curious how do you do the test? I saw you added 256 ports, but with 2 guests only? So, 254 of them are idle, just for testing the memory leak bug? And then you remove all of them, without stopping the guest. How that's gonna work? I mean, the vhost-user connection would be broken, and data flow would not work. --yliu > No traffic seems to pass, setting the device in the guest down/up doesn't get > anything. > But It isn't totally broken - stopping/starting the guest gets it working > again. > So openvswitch/dpdk is still somewhat working - it just seems the guest lost > something, after tapping on that vhost_user interface again it works. > > I will check tomorrow and let you know: > - if I'm more lucky with that patch on top of 16.04 > - if it looses connectivity after the first or a certain amount of port removes > > If you find issues with my backport adaption let me know. > > > --- > > Backport and reasoning: > > new fix relies on a lot of new code, vhost_destroy_device looks totally > different from the former destroy_device. > History on todays function content: > 4796ad63 - original code moved from examples to lib > a90ca1a1 - this replaces ops->destroy_device with vhost_destroy_device > 71dc571e - simple check against null pointers > 45ca9c6f - this changed the code from linked list to arrays > > New code cleans with: > notify_ops->destroy_device (callback into the parent) > cleanup_device was existing before even in 2.2 code > free_device as well existing before even in 2.2 code > Old code cleans with: > notify_ops->destroy_device - still there > rm_config_ll_entry -> eventually calls cleanup_device and free_device > (just in the more complex linked list way) > > So the "only" adaption for backporting needed is to replace > vhost_destroy_device > with ops->destroy_device(ctx) > > Index: dpdk/lib/librte_vhost/vhost_user/vhost-net-user.c > =================================================================== > --- dpdk.orig/lib/librte_vhost/vhost_user/vhost-net-user.c > +++ dpdk/lib/librte_vhost/vhost_user/vhost-net-user.c > @@ -310,6 +310,7 @@ vserver_new_vq_conn(int fd, void *dat, _ > } > > vdev_ctx.fh = fh; > + vserver->fh = fh; > size = strnlen(vserver->path, PATH_MAX); > ops->set_ifname(vdev_ctx, vserver->path, > size); > @@ -516,6 +517,11 @@ rte_vhost_driver_unregister(const char * > > for (i = 0; i < g_vhost_server.vserver_cnt; i++) { > if (!strcmp(g_vhost_server.server[i]->path, path)) { > + struct vhost_device_ctx ctx; > + > + ctx.fh = g_vhost_server.server[i]->fh; > + ops->destroy_device(ctx); > + > fdset_del(&g_vhost_server.fdset, > g_vhost_server.server[i]->listenfd); > > Index: dpdk/lib/librte_vhost/vhost_user/vhost-net-user.h > =================================================================== > --- dpdk.orig/lib/librte_vhost/vhost_user/vhost-net-user.h > +++ dpdk/lib/librte_vhost/vhost_user/vhost-net-user.h > @@ -43,6 +43,7 @@ > struct vhost_server { > char *path; /**< The path the uds is bind to. */ > int listenfd; /**< The listener sockfd. */ > + uint32_t fh; > }; > > /* refer to hw/virtio/vhost-user.c */ > > > > > Christian Ehrhardt > Software Engineer, Ubuntu Server > Canonical Ltd > > On Mon, Apr 18, 2016 at 8:14 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com> > wrote: > > On Mon, Apr 18, 2016 at 10:46:50AM -0700, Yuanhan Liu wrote: > > On Mon, Apr 18, 2016 at 07:18:05PM +0200, Christian Ehrhardt wrote: > > > I assume there is a leak somewhere on adding/removing vhost_user ports. > > > Although it could also be "only" a fragmentation issue. > > > > > > Reproduction is easy: > > > I set up a pair of nicely working OVS-DPDK connected KVM Guests. > > > Then in a loop I > > > - add up to more 512 ports > > > - test connectivity between the two guests > > > - remove up to 512 ports > > > > > > Depending on memory and the amount of multiqueue/rxq I use it seems to > > > slightly change when exactly it breaks. But for my default setup of 4 > > > queues and 5G Hugepages initialized by DPDK it always breaks at the > sixth > > > iteration. > > > Here a link to the stack trace indicating a memory shortage (TBC): > > > https://launchpadlibrarian.net/253916410/apport-retrace.log > > > > > > Known Todos: > > > - I want to track it down more, and will try to come up with a non > > > openvswitch based looping testcase that might show it as well to > simplify > > > debugging. > > > - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DPDK 16.04 > and > > > Openvswitch master is planned. > > > > > > I will go on debugging this and let you know, but I wanted to give a > heads > > > up to everyone. > > > > Thanks for the report. > > > > > In case this is a known issue for some of you please let me know. > > > > Yeah, it might be. I'm wondering that virtio_net struct is not freed. > > It will be freed only (if I'm not mistaken) when guest quits, by far. > > Would you try following diff and to see if it fix your issue? > > --yliu > > --- > lib/librte_vhost/vhost_user/vhost-net-user.c | 6 ++++++ > lib/librte_vhost/vhost_user/vhost-net-user.h | 1 + > 2 files changed, 7 insertions(+) > > diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/ > librte_vhost/vhost_user/vhost-net-user.c > index df2bd64..8f7ebd7 100644 > --- a/lib/librte_vhost/vhost_user/vhost-net-user.c > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c > @@ -309,6 +309,7 @@ vserver_new_vq_conn(int fd, void *dat, __rte_unused int > *remove) > } > > vdev_ctx.fh = fh; > + vserver->fh = fh; > size = strnlen(vserver->path, PATH_MAX); > vhost_set_ifname(vdev_ctx, vserver->path, > size); > @@ -501,6 +502,11 @@ rte_vhost_driver_unregister(const char *path) > > for (i = 0; i < g_vhost_server.vserver_cnt; i++) { > if (!strcmp(g_vhost_server.server[i]->path, path)) { > + struct vhost_device_ctx ctx; > + > + ctx.fh = g_vhost_server.server[i]->fh; > + vhost_destroy_device(ctx); > + > fdset_del(&g_vhost_server.fdset, > g_vhost_server.server[i]->listenfd); > > diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.h b/lib/ > librte_vhost/vhost_user/vhost-net-user.h > index e3bb413..7cf21db 100644 > --- a/lib/librte_vhost/vhost_user/vhost-net-user.h > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.h > @@ -43,6 +43,7 @@ > struct vhost_server { > char *path; /**< The path the uds is bind to. */ > int listenfd; /**< The listener sockfd. */ > + uint32_t fh; > }; > > /* refer to hw/virtio/vhost-user.c */ > -- > 1.9.3 > > > > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] Memory leak when adding/removing vhost_user ports 2016-04-20 5:04 ` Yuanhan Liu @ 2016-04-20 6:18 ` Christian Ehrhardt 2016-04-21 5:54 ` Yuanhan Liu 0 siblings, 1 reply; 22+ messages in thread From: Christian Ehrhardt @ 2016-04-20 6:18 UTC (permalink / raw) To: Yuanhan Liu; +Cc: dev, Daniele Di Proietto On Wed, Apr 20, 2016 at 7:04 AM, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote: > On Tue, Apr 19, 2016 at 06:33:50PM +0200, Christian Ehrhardt wrote: > [...] > > With that applied one (and only one) of my two guests looses > connectivity after > > removing the ports the first time. > > Yeah, that's should be because I invoked the "->destroy_device()" > callback. > Shouldn't that not only destroy the particular vhost_user device I remove? See below for some better details on the test to clarify that. BTW, I'm curious how do you do the test? I saw you added 256 ports, but > with 2 guests only? So, 254 of them are idle, just for testing the > memory leak bug? > Maybe I should describe it better: 1. Spawn some vhost-user ports (40 in my case) 2. Spawn a pair of guests that connect via four of those ports per guest 3. Guests only intialize one of that vhost_user based NICs 4. check connectivity between guests via the vhost_user based connection (working at this stage) LOOP 5-7: 5. add ports 41-512 6. remove ports 41-512 7. check connectivity between guests via the vhost_user based connection So the vhost_user ports the guests are using are never deleted. Only some extra (not even used) ports are added&removed in the loop to search for potential leaks over a longer lifetime of an openvswitch-dpdk based solution. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] Memory leak when adding/removing vhost_user ports 2016-04-20 6:18 ` Christian Ehrhardt @ 2016-04-21 5:54 ` Yuanhan Liu 2016-04-21 9:07 ` Christian Ehrhardt 0 siblings, 1 reply; 22+ messages in thread From: Yuanhan Liu @ 2016-04-21 5:54 UTC (permalink / raw) To: Christian Ehrhardt; +Cc: dev, Daniele Di Proietto On Wed, Apr 20, 2016 at 08:18:49AM +0200, Christian Ehrhardt wrote: > On Wed, Apr 20, 2016 at 7:04 AM, Yuanhan Liu <yuanhan.liu@linux.intel.com> > wrote: > > On Tue, Apr 19, 2016 at 06:33:50PM +0200, Christian Ehrhardt wrote: > > [...] > > > With that applied one (and only one) of my two guests looses connectivity > after > > removing the ports the first time. > > Yeah, that's should be because I invoked the "->destroy_device()" > callback. > > > Shouldn't that not only destroy the particular vhost_user device I remove? I assume the "not" is typed wrong here, then yes. Well, it turned out that I accidentally destroyed the first guest (with id 0) with following code: ctx.fh = g_vhost_server.server[i]->fh; vhost_destroy_device(ctx); server[i]->fh is initialized with 0 when no connection is established (check below for more info), and the first id is started with 0. Anyway, this could be fixed easily. > See below for some better details on the test to clarify that. > > > BTW, I'm curious how do you do the test? I saw you added 256 ports, but > with 2 guests only? So, 254 of them are idle, just for testing the > memory leak bug? > > > Maybe I should describe it better: > 1. Spawn some vhost-user ports (40 in my case) > 2. Spawn a pair of guests that connect via four of those ports per guest > 3. Guests only intialize one of that vhost_user based NICs > 4. check connectivity between guests via the vhost_user based connection > (working at this stage) > LOOP 5-7: > 5. add ports 41-512 > 6. remove ports 41-512 > 7. check connectivity between guests via the vhost_user based connection Yes, it's much clearer now. Thanks. I then don't see it's a leak from DPDK vhost-user, at least not the leak on "struct virtio_net" I have mentioned before. "struct virito_net" will not even be allocated for those ports never used (ports 41-512 in your case), as it will be allocated only when there is a connection established, aka, a guest is connected. BTW, will you be able to reproduce it without any connections? Say, all 512 ports are added, and then deleted. Thanks. --yliu > > So the vhost_user ports the guests are using are never deleted. > Only some extra (not even used) ports are added&removed in the loop to search > for potential leaks over a longer lifetime of an openvswitch-dpdk based > solution. > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] Memory leak when adding/removing vhost_user ports 2016-04-21 5:54 ` Yuanhan Liu @ 2016-04-21 9:07 ` Christian Ehrhardt 0 siblings, 0 replies; 22+ messages in thread From: Christian Ehrhardt @ 2016-04-21 9:07 UTC (permalink / raw) To: Yuanhan Liu; +Cc: dev, Daniele Di Proietto Hi, I can follow your argument that - and agree that in this case the leak can't be solved your patch. Still I found it useful to revise it along our discussion as eventually it will still be a good patch to have. I followed your suggestion and found: - rte_vhost_driver_register callocs vserver (implies fh=0) - later when on init when getting the callback to vserver_new_vq_conn it would get set by ops->new_device(vdev_ctx); - but as you pointed out that could be fh = 0 for the first - so I initialized vserver->fh with -1 in rte_vhost_driver_register - that won't ever be a real fh. - later on get_config_ll_entry won't find a device with that then on the call by destroy_device. - so the revised patch currently in use (still for DPDK 2.2) can be found here http://paste.ubuntu.com/15961394/ Also as you requested I tried with no guest attached at all - that way I can still reproduce it. Here is a new stacktrace, but to me it looks the same http://paste.ubuntu.com/15961185/ Also as you asked before a log of the vswitch, but it is 895MB since a lot of messages repeat on port add/remove. Even compressed still 27MB - I need to do something about verbosity there. Also the system journal of the same time. Therefore I only added links to bz2 files. The crash is at "2016-04-21T07:54:47.782Z" in the logs. => http://people.canonical.com/~paelzer/ovs-dpdk-vhost-add-remove-leak/mem-leak-addremove.journal.bz2 => http://people.canonical.com/~paelzer/ovs-dpdk-vhost-add-remove-leak/ovs-vswitchd.log.bz2 Kind Regards, Christian Ehrhardt Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd On Thu, Apr 21, 2016 at 7:54 AM, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote: > On Wed, Apr 20, 2016 at 08:18:49AM +0200, Christian Ehrhardt wrote: > > On Wed, Apr 20, 2016 at 7:04 AM, Yuanhan Liu < > yuanhan.liu@linux.intel.com> > > wrote: > > > > On Tue, Apr 19, 2016 at 06:33:50PM +0200, Christian Ehrhardt wrote: > > > > [...] > > > > > With that applied one (and only one) of my two guests looses > connectivity > > after > > > removing the ports the first time. > > > > Yeah, that's should be because I invoked the "->destroy_device()" > > callback. > > > > > > Shouldn't that not only destroy the particular vhost_user device I > remove? > > I assume the "not" is typed wrong here, then yes. Well, it turned > out that I accidentally destroyed the first guest (with id 0) with > following code: > > ctx.fh = g_vhost_server.server[i]->fh; > vhost_destroy_device(ctx); > > server[i]->fh is initialized with 0 when no connection is established > (check below for more info), and the first id is started with 0. Anyway, > this could be fixed easily. > > > See below for some better details on the test to clarify that. > > > > > > BTW, I'm curious how do you do the test? I saw you added 256 ports, > but > > with 2 guests only? So, 254 of them are idle, just for testing the > > memory leak bug? > > > > > > Maybe I should describe it better: > > 1. Spawn some vhost-user ports (40 in my case) > > 2. Spawn a pair of guests that connect via four of those ports per guest > > 3. Guests only intialize one of that vhost_user based NICs > > 4. check connectivity between guests via the vhost_user based connection > > (working at this stage) > > LOOP 5-7: > > 5. add ports 41-512 > > 6. remove ports 41-512 > > 7. check connectivity between guests via the vhost_user based > connection > > Yes, it's much clearer now. Thanks. > > I then don't see it's a leak from DPDK vhost-user, at least not the leak > on "struct virtio_net" I have mentioned before. "struct virito_net" will > not even be allocated for those ports never used (ports 41-512 in your > case), > as it will be allocated only when there is a connection established, aka, > a guest is connected. > > BTW, will you be able to reproduce it without any connections? Say, all > 512 ports are added, and then deleted. > > Thanks. > > --yliu > > > > > So the vhost_user ports the guests are using are never deleted. > > Only some extra (not even used) ports are added&removed in the loop to > search > > for potential leaks over a longer lifetime of an openvswitch-dpdk based > > solution. > > > ^ permalink raw reply [flat|nested] 22+ messages in thread
* [dpdk-dev] [PATCH v2] Memory leak when adding/removing vhost_user ports 2016-04-19 16:33 ` Christian Ehrhardt 2016-04-20 5:04 ` Yuanhan Liu @ 2016-07-06 12:24 ` Christian Ehrhardt 2016-07-06 12:24 ` [dpdk-dev] [PATCH v2] vhost_user: avoid crash when exeeding file descriptors Christian Ehrhardt ` (2 more replies) 1 sibling, 3 replies; 22+ messages in thread From: Christian Ehrhardt @ 2016-07-06 12:24 UTC (permalink / raw) To: christian.ehrhardt, patrik.r.andersson, thomas.monjalon, dev, yuanhan.liu, huawei.xie Hi, while checking for dpdk 16.07 what backports are accepted in the meantime so I can drop them I found this particular discussion has been silently forgotten by all of us. Back then we had the patch and discussion first in http://dpdk.org/dev/patchwork/patch/12103/ and then http://dpdk.org/dev/patchwork/patch/12118/ Things worked fine as I reported and I integrated the patch in our packaging as it fixed a severe issue. Since it was reported by someone else I do not seem to be the only one :-) So today I rebased the patch including my updates I made based on our discussion and I think it would make as much sense as it made back then to fix this. Christian Ehrhardt (1): vhost_user: avoid crash when exeeding file descriptors lib/librte_vhost/vhost_user/fd_man.c | 11 ++++++----- lib/librte_vhost/vhost_user/vhost-net-user.c | 19 +++++++++++++++++-- 2 files changed, 23 insertions(+), 7 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 22+ messages in thread
* [dpdk-dev] [PATCH v2] vhost_user: avoid crash when exeeding file descriptors 2016-07-06 12:24 ` [dpdk-dev] [PATCH v2] " Christian Ehrhardt @ 2016-07-06 12:24 ` Christian Ehrhardt 2016-07-12 8:37 ` Yuanhan Liu 2016-07-06 12:26 ` [dpdk-dev] [PATCH v2] Memory leak when adding/removing vhost_user ports Christian Ehrhardt 2016-07-06 13:08 ` Yuanhan Liu 2 siblings, 1 reply; 22+ messages in thread From: Christian Ehrhardt @ 2016-07-06 12:24 UTC (permalink / raw) To: christian.ehrhardt, patrik.r.andersson, thomas.monjalon, dev, yuanhan.liu, huawei.xie *update in v2* - refreshing for DPDK 16.07 - Close fd on vserver->listenfd as suggested in discussion Original From: From: Patrik Andersson <patrik.r.andersson@ericsson.com> Protect against DPDK crash when allocation of listen fd >= 1023. For events on fd:s >1023, the current implementation will trigger an abort due to access outside of allocated bit mask. Corrections would include: * Match fdset_add() signature in fd_man.c to fd_man.h * Handling of return codes from fdset_add() * Addition of check of fd number in fdset_add_fd() The rationale behind the suggested code change is that, fdset_event_dispatch() could attempt access outside of the FD_SET bitmask if there is an event on a file descriptor that in turn looks up a virtio file descriptor with a value > 1023. Such an attempt will lead to an abort() and a restart of any vswitch using DPDK. A discussion topic exist in the ovs-discuss mailing list that can provide a little more background: http://openvswitch.org/pipermail/discuss/2016-February/020243.html Fixes: 8f972312 ("vhost: support vhost-user") Signed-off-by: Patrik Andersson <patrik.r.andersson@ericsson.com> Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> --- lib/librte_vhost/vhost_user/fd_man.c | 11 ++++++----- lib/librte_vhost/vhost_user/vhost-net-user.c | 19 +++++++++++++++++-- 2 files changed, 23 insertions(+), 7 deletions(-) diff --git a/lib/librte_vhost/vhost_user/fd_man.c b/lib/librte_vhost/vhost_user/fd_man.c index 087aaed..c691339 100644 --- a/lib/librte_vhost/vhost_user/fd_man.c +++ b/lib/librte_vhost/vhost_user/fd_man.c @@ -71,20 +71,22 @@ fdset_find_free_slot(struct fdset *pfdset) return fdset_find_fd(pfdset, -1); } -static void +static int fdset_add_fd(struct fdset *pfdset, int idx, int fd, fd_cb rcb, fd_cb wcb, void *dat) { struct fdentry *pfdentry; - if (pfdset == NULL || idx >= MAX_FDS) - return; + if (pfdset == NULL || idx >= MAX_FDS || fd >= FD_SETSIZE) + return -1; pfdentry = &pfdset->fd[idx]; pfdentry->fd = fd; pfdentry->rcb = rcb; pfdentry->wcb = wcb; pfdentry->dat = dat; + + return 0; } /** @@ -150,12 +152,11 @@ fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) /* Find a free slot in the list. */ i = fdset_find_free_slot(pfdset); - if (i == -1) { + if (i == -1 || fdset_add_fd(pfdset, i, fd, rcb, wcb, dat) < 0) { pthread_mutex_unlock(&pfdset->fd_mutex); return -2; } - fdset_add_fd(pfdset, i, fd, rcb, wcb, dat); pfdset->num++; pthread_mutex_unlock(&pfdset->fd_mutex); diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c index 94f1b92..5743f52 100644 --- a/lib/librte_vhost/vhost_user/vhost-net-user.c +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c @@ -257,6 +257,7 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket) int vid; size_t size; struct vhost_user_connection *conn; + int ret; conn = malloc(sizeof(*conn)); if (conn == NULL) { @@ -278,7 +279,15 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket) conn->vsocket = vsocket; conn->vid = vid; - fdset_add(&vhost_user.fdset, fd, vhost_user_msg_handler, NULL, conn); + ret = fdset_add(&vhost_user.fdset, fd, vhost_user_msg_handler, + NULL, conn); + if (ret < 0) { + free(conn); + close(fd); + RTE_LOG(ERR, VHOST_CONFIG, + "failed to add fd %d into vhost server fdset\n", + fd); + } } /* call back when there is new vhost-user connection from client */ @@ -469,8 +478,14 @@ vhost_user_create_server(struct vhost_user_socket *vsocket) goto err; vsocket->listenfd = fd; - fdset_add(&vhost_user.fdset, fd, vhost_user_server_new_connection, + ret = fdset_add(&vhost_user.fdset, fd, vhost_user_server_new_connection, NULL, vsocket); + if (ret < 0) { + RTE_LOG(ERR, VHOST_CONFIG, + "failed to add listen fd %d to vhost server fdset\n", + fd); + goto err; + } return 0; -- 2.7.4 ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] [PATCH v2] vhost_user: avoid crash when exeeding file descriptors 2016-07-06 12:24 ` [dpdk-dev] [PATCH v2] vhost_user: avoid crash when exeeding file descriptors Christian Ehrhardt @ 2016-07-12 8:37 ` Yuanhan Liu 2016-07-15 19:46 ` Thomas Monjalon 0 siblings, 1 reply; 22+ messages in thread From: Yuanhan Liu @ 2016-07-12 8:37 UTC (permalink / raw) To: Christian Ehrhardt; +Cc: patrik.r.andersson, thomas.monjalon, dev, huawei.xie On Wed, Jul 06, 2016 at 02:24:58PM +0200, Christian Ehrhardt wrote: > *update in v2* > - refreshing for DPDK 16.07 > - Close fd on vserver->listenfd as suggested in discussion > > Original From: > From: Patrik Andersson <patrik.r.andersson@ericsson.com> > > Protect against DPDK crash when allocation of listen fd >= 1023. > For events on fd:s >1023, the current implementation will trigger > an abort due to access outside of allocated bit mask. Hmmm, I have no idea why I missed this email in the beginning, otherwise, it would have been in rc2 release. Thanks for the re-posting, and we should have had it in last release. Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> --yliu ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] [PATCH v2] vhost_user: avoid crash when exeeding file descriptors 2016-07-12 8:37 ` Yuanhan Liu @ 2016-07-15 19:46 ` Thomas Monjalon 0 siblings, 0 replies; 22+ messages in thread From: Thomas Monjalon @ 2016-07-15 19:46 UTC (permalink / raw) To: Christian Ehrhardt, patrik.r.andersson; +Cc: Yuanhan Liu, dev, huawei.xie 2016-07-12 16:37, Yuanhan Liu: > On Wed, Jul 06, 2016 at 02:24:58PM +0200, Christian Ehrhardt wrote: > > *update in v2* > > - refreshing for DPDK 16.07 > > - Close fd on vserver->listenfd as suggested in discussion > > > > Original From: > > From: Patrik Andersson <patrik.r.andersson@ericsson.com> > > > > Protect against DPDK crash when allocation of listen fd >= 1023. > > For events on fd:s >1023, the current implementation will trigger > > an abort due to access outside of allocated bit mask. > > Hmmm, I have no idea why I missed this email in the beginning, > otherwise, it would have been in rc2 release. > > Thanks for the re-posting, and we should have had it in last > release. > > Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Applied, thanks ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] [PATCH v2] Memory leak when adding/removing vhost_user ports 2016-07-06 12:24 ` [dpdk-dev] [PATCH v2] " Christian Ehrhardt 2016-07-06 12:24 ` [dpdk-dev] [PATCH v2] vhost_user: avoid crash when exeeding file descriptors Christian Ehrhardt @ 2016-07-06 12:26 ` Christian Ehrhardt 2016-07-06 12:30 ` Christian Ehrhardt 2016-07-06 13:08 ` Yuanhan Liu 2 siblings, 1 reply; 22+ messages in thread From: Christian Ehrhardt @ 2016-07-06 12:26 UTC (permalink / raw) To: Christian Ehrhardt, Patrik Andersson R, Thomas Monjalon, dev, Yuanhan Liu, Xie, Huawei Sorry, please ignore the two links, the cover letter has - they belong to a different issue I have to bring up again. Everything else still applies. Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd On Wed, Jul 6, 2016 at 2:24 PM, Christian Ehrhardt < christian.ehrhardt@canonical.com> wrote: > Hi, > while checking for dpdk 16.07 what backports are accepted in the meantime > so I > can drop them I found this particular discussion has been silently > forgotten by > all of us. > > Back then we had the patch and discussion first in > http://dpdk.org/dev/patchwork/patch/12103/ > and then > http://dpdk.org/dev/patchwork/patch/12118/ > > Things worked fine as I reported and I integrated the patch in our > packaging as > it fixed a severe issue. Since it was reported by someone else I do not > seem to > be the only one :-) > > So today I rebased the patch including my updates I made based on our > discussion > and I think it would make as much sense as it made back then to fix this. > > Christian Ehrhardt (1): > vhost_user: avoid crash when exeeding file descriptors > > lib/librte_vhost/vhost_user/fd_man.c | 11 ++++++----- > lib/librte_vhost/vhost_user/vhost-net-user.c | 19 +++++++++++++++++-- > 2 files changed, 23 insertions(+), 7 deletions(-) > > -- > 2.7.4 > > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] [PATCH v2] Memory leak when adding/removing vhost_user ports 2016-07-06 12:26 ` [dpdk-dev] [PATCH v2] Memory leak when adding/removing vhost_user ports Christian Ehrhardt @ 2016-07-06 12:30 ` Christian Ehrhardt 2016-07-06 12:37 ` Christian Ehrhardt 0 siblings, 1 reply; 22+ messages in thread From: Christian Ehrhardt @ 2016-07-06 12:30 UTC (permalink / raw) To: Christian Ehrhardt, Patrik Andersson R, Thomas Monjalon, dev, Yuanhan Liu, Xie, Huawei This is the series this really belongs to http://dpdk.org/dev/patchwork/patch/11581/ Message ID <1458292380-9258-1-git-send-email-patrik.r.andersson@ericsson.com > Should I wait for a v2 to point the patch at the right ID or do you prefer a fixed resubmit right away? Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd On Wed, Jul 6, 2016 at 2:26 PM, Christian Ehrhardt < christian.ehrhardt@canonical.com> wrote: > Sorry, > please ignore the two links, the cover letter has - they belong to a > different issue I have to bring up again. > Everything else still applies. > > Christian Ehrhardt > Software Engineer, Ubuntu Server > Canonical Ltd > > On Wed, Jul 6, 2016 at 2:24 PM, Christian Ehrhardt < > christian.ehrhardt@canonical.com> wrote: > >> Hi, >> while checking for dpdk 16.07 what backports are accepted in the meantime >> so I >> can drop them I found this particular discussion has been silently >> forgotten by >> all of us. >> >> Back then we had the patch and discussion first in >> http://dpdk.org/dev/patchwork/patch/12103/ >> and then >> http://dpdk.org/dev/patchwork/patch/12118/ >> >> Things worked fine as I reported and I integrated the patch in our >> packaging as >> it fixed a severe issue. Since it was reported by someone else I do not >> seem to >> be the only one :-) >> >> So today I rebased the patch including my updates I made based on our >> discussion >> and I think it would make as much sense as it made back then to fix this. >> >> Christian Ehrhardt (1): >> vhost_user: avoid crash when exeeding file descriptors >> >> lib/librte_vhost/vhost_user/fd_man.c | 11 ++++++----- >> lib/librte_vhost/vhost_user/vhost-net-user.c | 19 +++++++++++++++++-- >> 2 files changed, 23 insertions(+), 7 deletions(-) >> >> -- >> 2.7.4 >> >> > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] [PATCH v2] Memory leak when adding/removing vhost_user ports 2016-07-06 12:30 ` Christian Ehrhardt @ 2016-07-06 12:37 ` Christian Ehrhardt 0 siblings, 0 replies; 22+ messages in thread From: Christian Ehrhardt @ 2016-07-06 12:37 UTC (permalink / raw) To: Christian Ehrhardt, Patrik Andersson R, Thomas Monjalon, dev, Yuanhan Liu, Xie, Huawei Now let us really get to this old discussion. Once again - sorry for mixing this up :-/ Found too much at once today. ## Ok - refocus ## Back then in http://dpdk.org/dev/patchwork/patch/12103/ http://dpdk.org/dev/patchwork/patch/12118/ We came up with a nice solution for the leak. But it was never picked up upstream. There were a lot of changes to all of that code, especially vhost client/server. Lacking an openvswitch for DPDK 16.07 I can't test if the issue still shows up, but looking at the code suggests it still does. Unfortunately the internal structures and scopes are slightly different so that I couldn't come up with a working forward port of the old patch. Attached is a patch as close as I got (obviously not working). I'm convinced that Yuanhan Liu and Xie Huawei with their deep context knowledge of dpdk's vhost_user will quickly know: - if the root cause still applies - what the best new way of fixing this would be As this makes long term usage of dpdk aborting by a leak I hope we have a chance to get this into 16.07 still. Kind Regards, Christian Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd On Wed, Jul 6, 2016 at 2:30 PM, Christian Ehrhardt < christian.ehrhardt@canonical.com> wrote: > This is the series this really belongs to > http://dpdk.org/dev/patchwork/patch/11581/ > Message ID < > 1458292380-9258-1-git-send-email-patrik.r.andersson@ericsson.com> > > Should I wait for a v2 to point the patch at the right ID or do you prefer > a fixed resubmit right away? > > Christian Ehrhardt > Software Engineer, Ubuntu Server > Canonical Ltd > > On Wed, Jul 6, 2016 at 2:26 PM, Christian Ehrhardt < > christian.ehrhardt@canonical.com> wrote: > >> Sorry, >> please ignore the two links, the cover letter has - they belong to a >> different issue I have to bring up again. >> Everything else still applies. >> >> Christian Ehrhardt >> Software Engineer, Ubuntu Server >> Canonical Ltd >> >> On Wed, Jul 6, 2016 at 2:24 PM, Christian Ehrhardt < >> christian.ehrhardt@canonical.com> wrote: >> >>> Hi, >>> while checking for dpdk 16.07 what backports are accepted in the >>> meantime so I >>> can drop them I found this particular discussion has been silently >>> forgotten by >>> all of us. >>> >>> Back then we had the patch and discussion first in >>> http://dpdk.org/dev/patchwork/patch/12103/ >>> and then >>> http://dpdk.org/dev/patchwork/patch/12118/ >>> >>> Things worked fine as I reported and I integrated the patch in our >>> packaging as >>> it fixed a severe issue. Since it was reported by someone else I do not >>> seem to >>> be the only one :-) >>> >>> So today I rebased the patch including my updates I made based on our >>> discussion >>> and I think it would make as much sense as it made back then to fix this. >>> >>> Christian Ehrhardt (1): >>> vhost_user: avoid crash when exeeding file descriptors >>> >>> lib/librte_vhost/vhost_user/fd_man.c | 11 ++++++----- >>> lib/librte_vhost/vhost_user/vhost-net-user.c | 19 +++++++++++++++++-- >>> 2 files changed, 23 insertions(+), 7 deletions(-) >>> >>> -- >>> 2.7.4 >>> >>> >> > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] [PATCH v2] Memory leak when adding/removing vhost_user ports 2016-07-06 12:24 ` [dpdk-dev] [PATCH v2] " Christian Ehrhardt 2016-07-06 12:24 ` [dpdk-dev] [PATCH v2] vhost_user: avoid crash when exeeding file descriptors Christian Ehrhardt 2016-07-06 12:26 ` [dpdk-dev] [PATCH v2] Memory leak when adding/removing vhost_user ports Christian Ehrhardt @ 2016-07-06 13:08 ` Yuanhan Liu 2016-07-12 12:08 ` Yuanhan Liu 2 siblings, 1 reply; 22+ messages in thread From: Yuanhan Liu @ 2016-07-06 13:08 UTC (permalink / raw) To: Christian Ehrhardt; +Cc: patrik.r.andersson, thomas.monjalon, dev, huawei.xie On Wed, Jul 06, 2016 at 02:24:57PM +0200, Christian Ehrhardt wrote: > Hi, > while checking for dpdk 16.07 what backports are accepted in the meantime so I > can drop them I found this particular discussion has been silently forgotten by > all of us. Not really. As later I found that my patch was actually wrong (besides the issue you already found). I will revisit this issue thoroughly when get time, hopefully, next week. --yliu ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] [PATCH v2] Memory leak when adding/removing vhost_user ports 2016-07-06 13:08 ` Yuanhan Liu @ 2016-07-12 12:08 ` Yuanhan Liu 2016-07-19 13:50 ` Christian Ehrhardt 0 siblings, 1 reply; 22+ messages in thread From: Yuanhan Liu @ 2016-07-12 12:08 UTC (permalink / raw) To: Christian Ehrhardt; +Cc: patrik.r.andersson, thomas.monjalon, dev On Wed, Jul 06, 2016 at 09:08:12PM +0800, Yuanhan Liu wrote: > On Wed, Jul 06, 2016 at 02:24:57PM +0200, Christian Ehrhardt wrote: > > Hi, > > while checking for dpdk 16.07 what backports are accepted in the meantime so I > > can drop them I found this particular discussion has been silently forgotten by > > all of us. > > Not really. As later I found that my patch was actually wrong (besides > the issue you already found). I will revisit this issue thoroughly when > get time, hopefully, next week. Here it is: vhost_destroy() will be invoked when: - QEMU quits gracefully - QEMU terminates unexpectedly Meaning, there should be no memory leak. I think we are fine here (I maybe wrong though, feeling a bit dizzy now :( --yliu ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] [PATCH v2] Memory leak when adding/removing vhost_user ports 2016-07-12 12:08 ` Yuanhan Liu @ 2016-07-19 13:50 ` Christian Ehrhardt 0 siblings, 0 replies; 22+ messages in thread From: Christian Ehrhardt @ 2016-07-19 13:50 UTC (permalink / raw) To: Yuanhan Liu; +Cc: Patrik Andersson R, Thomas Monjalon, dev Hi, thanks for evaluating. I'll need a few days until I'm ready for OVS again and until OVS is ready for DPDK 16.07. Then I can run an adapted version of my old testcase to be sure. In the worst cases it will be in 16.11 and I'll backport to 16.07 as distributed. I'll let you know then. Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd On Tue, Jul 12, 2016 at 2:08 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote: > On Wed, Jul 06, 2016 at 09:08:12PM +0800, Yuanhan Liu wrote: > > On Wed, Jul 06, 2016 at 02:24:57PM +0200, Christian Ehrhardt wrote: > > > Hi, > > > while checking for dpdk 16.07 what backports are accepted in the > meantime so I > > > can drop them I found this particular discussion has been silently > forgotten by > > > all of us. > > > > Not really. As later I found that my patch was actually wrong (besides > > the issue you already found). I will revisit this issue thoroughly when > > get time, hopefully, next week. > > Here it is: vhost_destroy() will be invoked when: > > - QEMU quits gracefully > - QEMU terminates unexpectedly > > Meaning, there should be no memory leak. I think we are fine here (I > maybe wrong though, feeling a bit dizzy now :( > > --yliu > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] Memory leak when adding/removing vhost_user ports 2016-04-18 17:18 [dpdk-dev] Memory leak when adding/removing vhost_user ports Christian Ehrhardt 2016-04-18 17:46 ` Yuanhan Liu @ 2016-04-21 11:01 ` Ilya Maximets 2016-04-21 14:04 ` Christian Ehrhardt 2016-04-21 16:54 ` Yuanhan Liu 1 sibling, 2 replies; 22+ messages in thread From: Ilya Maximets @ 2016-04-21 11:01 UTC (permalink / raw) To: dev, christian.ehrhardt Cc: Daniele Di Proietto, dev, yuanhan.liu, Dyasly Sergey Hi, Christian. You're, likely, using tar archive with openvswitch from openvswitch.org. It doesn't contain many bug fixes from git/branch-2.5 unfortunately. The problem that you are facing has been solved in branch-2.5 by commit d9df7b9206831631ddbd90f9cbeef1b4fc5a8e89 Author: Ilya Maximets <i.maximets@samsung.com> Date: Thu Mar 3 11:30:06 2016 +0300 netdev-dpdk: Fix memory leak in netdev_dpdk_vhost_destruct(). Fixes: 4573fbd38fa1 ("netdev-dpdk: Add vhost-user multiqueue support") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Daniele Di Proietto <diproiettod@vmware.com> Best regards, Ilya Maximets. > I assume there is a leak somewhere on adding/removing vhost_user ports. > Although it could also be "only" a fragmentation issue. > > Reproduction is easy: > I set up a pair of nicely working OVS-DPDK connected KVM Guests. > Then in a loop I > - add up to more 512 ports > - test connectivity between the two guests > - remove up to 512 ports > > Depending on memory and the amount of multiqueue/rxq I use it seems to > slightly change when exactly it breaks. But for my default setup of 4 > queues and 5G Hugepages initialized by DPDK it always breaks at the sixth > iteration. > Here a link to the stack trace indicating a memory shortage (TBC): > https://launchpadlibrarian.net/253916410/apport-retrace.log > > Known Todos: > - I want to track it down more, and will try to come up with a non > openvswitch based looping testcase that might show it as well to simplify > debugging. > - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DPDK 16.04 and > Openvswitch master is planned. > > I will go on debugging this and let you know, but I wanted to give a heads > up to everyone. > In case this is a known issue for some of you please let me know. > > Kind Regards, > Christian Ehrhardt > Software Engineer, Ubuntu Server > Canonical Ltd > > P.S. I think it is a dpdk issue, but adding Daniele on CC to represent > ovs-dpdk as well. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] Memory leak when adding/removing vhost_user ports 2016-04-21 11:01 ` [dpdk-dev] " Ilya Maximets @ 2016-04-21 14:04 ` Christian Ehrhardt 2016-04-21 16:56 ` Yuanhan Liu 2016-04-21 16:54 ` Yuanhan Liu 1 sibling, 1 reply; 22+ messages in thread From: Christian Ehrhardt @ 2016-04-21 14:04 UTC (permalink / raw) To: Ilya Maximets; +Cc: dev, Daniele Di Proietto, dev, yuanhan.liu, Dyasly Sergey Thanks Ilya, yeah we usually wait for the point releases as they undergo some extra testing and verification. .1 shouldn't be too much into the future I guess. Thanks a lot for identifying. That said, I'd still go on with Yuanhan to finalize the dpdk side leak fix we identified, so we eventually get it committed. So Yuanhan, what do you think of my last revised version of your patch for upstream DPDK (there with the vhost_destroy_device then)? I mean it is essentially your patch plus a bit of polishing, not mine so I don't feel entitled to submit it as mine :-) Kind Regards, Christian Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd On Thu, Apr 21, 2016 at 1:01 PM, Ilya Maximets <i.maximets@samsung.com> wrote: > Hi, Christian. > You're, likely, using tar archive with openvswitch from openvswitch.org. > It doesn't contain many bug fixes from git/branch-2.5 unfortunately. > > The problem that you are facing has been solved in branch-2.5 by > > commit d9df7b9206831631ddbd90f9cbeef1b4fc5a8e89 > Author: Ilya Maximets <i.maximets@samsung.com> > Date: Thu Mar 3 11:30:06 2016 +0300 > > netdev-dpdk: Fix memory leak in netdev_dpdk_vhost_destruct(). > > Fixes: 4573fbd38fa1 ("netdev-dpdk: Add vhost-user multiqueue support") > Signed-off-by: Ilya Maximets <i.maximets@samsung.com> > Acked-by: Flavio Leitner <fbl@sysclose.org> > Acked-by: Daniele Di Proietto <diproiettod@vmware.com> > > Best regards, Ilya Maximets. > > > I assume there is a leak somewhere on adding/removing vhost_user ports. > > Although it could also be "only" a fragmentation issue. > > > > Reproduction is easy: > > I set up a pair of nicely working OVS-DPDK connected KVM Guests. > > Then in a loop I > > - add up to more 512 ports > > - test connectivity between the two guests > > - remove up to 512 ports > > > > Depending on memory and the amount of multiqueue/rxq I use it seems to > > slightly change when exactly it breaks. But for my default setup of 4 > > queues and 5G Hugepages initialized by DPDK it always breaks at the sixth > > iteration. > > Here a link to the stack trace indicating a memory shortage (TBC): > > https://launchpadlibrarian.net/253916410/apport-retrace.log > > > > Known Todos: > > - I want to track it down more, and will try to come up with a non > > openvswitch based looping testcase that might show it as well to simplify > > debugging. > > - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DPDK 16.04 > and > > Openvswitch master is planned. > > > > I will go on debugging this and let you know, but I wanted to give a > heads > > up to everyone. > > In case this is a known issue for some of you please let me know. > > > > Kind Regards, > > Christian Ehrhardt > > Software Engineer, Ubuntu Server > > Canonical Ltd > > > > P.S. I think it is a dpdk issue, but adding Daniele on CC to represent > > ovs-dpdk as well. > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] Memory leak when adding/removing vhost_user ports 2016-04-21 14:04 ` Christian Ehrhardt @ 2016-04-21 16:56 ` Yuanhan Liu 0 siblings, 0 replies; 22+ messages in thread From: Yuanhan Liu @ 2016-04-21 16:56 UTC (permalink / raw) To: Christian Ehrhardt Cc: Ilya Maximets, dev, Daniele Di Proietto, dev, Dyasly Sergey On Thu, Apr 21, 2016 at 04:04:03PM +0200, Christian Ehrhardt wrote: > Thanks Ilya, > yeah we usually wait for the point releases as they undergo some extra testing > and verification. > .1 shouldn't be too much into the future I guess. > Thanks a lot for identifying. > > That said, I'd still go on with Yuanhan to finalize the dpdk side leak fix we > identified, so we eventually get it committed. > So Yuanhan, what do you think of my last revised version of your patch for > upstream DPDK (there with the vhost_destroy_device then)? That's good. > I mean it is essentially your patch plus a bit of polishing, not mine so I > don't feel entitled to submit it as mine :-) Thanks. I will make and send out a formal patch later. --yliu ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [dpdk-dev] Memory leak when adding/removing vhost_user ports 2016-04-21 11:01 ` [dpdk-dev] " Ilya Maximets 2016-04-21 14:04 ` Christian Ehrhardt @ 2016-04-21 16:54 ` Yuanhan Liu 1 sibling, 0 replies; 22+ messages in thread From: Yuanhan Liu @ 2016-04-21 16:54 UTC (permalink / raw) To: Ilya Maximets Cc: dev, christian.ehrhardt, Daniele Di Proietto, dev, Dyasly Sergey On Thu, Apr 21, 2016 at 02:01:26PM +0300, Ilya Maximets wrote: > Hi, Christian. > You're, likely, using tar archive with openvswitch from openvswitch.org. > It doesn't contain many bug fixes from git/branch-2.5 unfortunately. > > The problem that you are facing has been solved in branch-2.5 by > > commit d9df7b9206831631ddbd90f9cbeef1b4fc5a8e89 > Author: Ilya Maximets <i.maximets@samsung.com> > Date: Thu Mar 3 11:30:06 2016 +0300 > > netdev-dpdk: Fix memory leak in netdev_dpdk_vhost_destruct(). > > Fixes: 4573fbd38fa1 ("netdev-dpdk: Add vhost-user multiqueue support") > Signed-off-by: Ilya Maximets <i.maximets@samsung.com> > Acked-by: Flavio Leitner <fbl@sysclose.org> > Acked-by: Daniele Di Proietto <diproiettod@vmware.com> Hi Ilya, Thanks for the info. And, I actually checked this peice of code. I was using new code, so, I didn't find anything wrong. --yliu ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2016-07-19 13:50 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-04-18 17:18 [dpdk-dev] Memory leak when adding/removing vhost_user ports Christian Ehrhardt 2016-04-18 17:46 ` Yuanhan Liu 2016-04-18 18:14 ` Yuanhan Liu 2016-04-19 16:33 ` Christian Ehrhardt 2016-04-20 5:04 ` Yuanhan Liu 2016-04-20 6:18 ` Christian Ehrhardt 2016-04-21 5:54 ` Yuanhan Liu 2016-04-21 9:07 ` Christian Ehrhardt 2016-07-06 12:24 ` [dpdk-dev] [PATCH v2] " Christian Ehrhardt 2016-07-06 12:24 ` [dpdk-dev] [PATCH v2] vhost_user: avoid crash when exeeding file descriptors Christian Ehrhardt 2016-07-12 8:37 ` Yuanhan Liu 2016-07-15 19:46 ` Thomas Monjalon 2016-07-06 12:26 ` [dpdk-dev] [PATCH v2] Memory leak when adding/removing vhost_user ports Christian Ehrhardt 2016-07-06 12:30 ` Christian Ehrhardt 2016-07-06 12:37 ` Christian Ehrhardt 2016-07-06 13:08 ` Yuanhan Liu 2016-07-12 12:08 ` Yuanhan Liu 2016-07-19 13:50 ` Christian Ehrhardt 2016-04-21 11:01 ` [dpdk-dev] " Ilya Maximets 2016-04-21 14:04 ` Christian Ehrhardt 2016-04-21 16:56 ` Yuanhan Liu 2016-04-21 16:54 ` Yuanhan Liu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).