DPDK patches and discussions
* [dpdk-dev] [PATCH] lib/librte_vhost: fix vid allocation race
@ 2021-01-29  7:35 Peng He
  2021-02-01  6:27 ` Xia, Chenbo
  0 siblings, 1 reply; 6+ messages in thread
From: Peng He @ 2021-01-29  7:35 UTC (permalink / raw)
  To: dev; +Cc: maxime.coquelin

From: "chenwei.0515" <chenwei.0515@bytedance.com>

vhost_new_devcie might be called in different threads at the same time.
thread 1(config thread)
            rte_vhost_driver_start
               ->vhost_user_start_client
                   ->vhost_user_add_connection
                     -> vhost_new_device

thread 2(vhost-events)
	vhost_user_read_cb
           ->vhost_user_msg_handler (return value < 0)
             -> vhost_user_start_client
                 -> vhost_new_device

So there could be a case that a same vid has been allocated twice, or
some vid might be lost in DPDK lib however still held by the upper
applications.

Reported-by: Peng He <hepeng.0320@bytedance.com>
Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
---
 lib/librte_vhost/vhost.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index efb136edd1..db11d293d2 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -26,6 +26,7 @@
 #include "vhost_user.h"
 
 struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
+pthread_mutex_t  vhost_dev_lock = PTHREAD_MUTEX_INITIALIZER;
 
 /* Called with iotlb_lock read-locked */
 uint64_t
@@ -645,6 +646,7 @@ vhost_new_device(void)
 	struct virtio_net *dev;
 	int i;
 
+	pthread_mutex_lock(&vhost_dev_lock);
 	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
 		if (vhost_devices[i] == NULL)
 			break;
@@ -653,6 +655,7 @@ vhost_new_device(void)
 	if (i == MAX_VHOST_DEVICE) {
 		VHOST_LOG_CONFIG(ERR,
 			"Failed to find a free slot for new device.\n");
+		pthread_mutex_unlock(&vhost_dev_lock);
 		return -1;
 	}
 
@@ -660,10 +663,13 @@ vhost_new_device(void)
 	if (dev == NULL) {
 		VHOST_LOG_CONFIG(ERR,
 			"Failed to allocate memory for new dev.\n");
+		pthread_mutex_unlock(&vhost_dev_lock);
 		return -1;
 	}
 
 	vhost_devices[i] = dev;
+	pthread_mutex_unlock(&vhost_dev_lock);
+
 	dev->vid = i;
 	dev->flags = VIRTIO_DEV_BUILTIN_VIRTIO_NET;
 	dev->slave_req_fd = -1;
-- 
2.23.0



* Re: [dpdk-dev] [PATCH] lib/librte_vhost: fix vid allocation race
  2021-01-29  7:35 [dpdk-dev] [PATCH] lib/librte_vhost: fix vid allocation race Peng He
@ 2021-02-01  6:27 ` Xia, Chenbo
  2021-02-01  8:48   ` [dpdk-dev] [PATCH v2] vhost: " Peng He
  2021-02-01  8:53   ` [dpdk-dev] [PATCH] lib/librte_vhost: " 贺鹏
  0 siblings, 2 replies; 6+ messages in thread
From: Xia, Chenbo @ 2021-02-01  6:27 UTC (permalink / raw)
  To: Peng He, dev; +Cc: maxime.coquelin, chenwei.0515, wangzhihong.wzh

Hi Peng & Fei,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Peng He
> Sent: Friday, January 29, 2021 3:36 PM
> To: dev@dpdk.org
> Cc: maxime.coquelin@redhat.com
> Subject: [dpdk-dev] [PATCH] lib/librte_vhost: fix vid allocation race

Fix the title to 'vhost: XXXXX'

> 
> From: "chenwei.0515" <chenwei.0515@bytedance.com>

This should not be here. You can just delete it, as the author is already
in the commit message.

> 
> vhost_new_devcie might be called in different threads at the same time.

s/devcie/device

> thread 1(config thread)
>             rte_vhost_driver_start
>                ->vhost_user_start_client
>                    ->vhost_user_add_connection
>                      -> vhost_new_device
> 
> thread 2(vhost-events)
> 	vhost_user_read_cb
>            ->vhost_user_msg_handler (return value < 0)
>              -> vhost_user_start_client
>                  -> vhost_new_device
> 
> So there could be a case that a same vid has been allocated twice, or
> some vid might be lost in DPDK lib however still held by the upper
> applications.

Good catch! I checked the code and found that there are cases where different threads
may allocate a vhost slot.

I also found that other functions which use the global vhost_devices[] may
hit the same problem. For example, vhost_destroy_device() could be called by different
threads. So I suggest fixing the problem completely in all related functions, such as
vhost_destroy_device() and get_device(). What do you think?

If you agree with the above, note that the title should also be changed.

Besides, please also add 'Fixes: $COMMIT_ID' and Cc stable@dpdk.org so it can be
fixed in LTS. You can check other commits for reference.

> 
> Reported-by: Peng He <hepeng.0320@bytedance.com>
> Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
> Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> ---
>  lib/librte_vhost/vhost.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
> index efb136edd1..db11d293d2 100644
> --- a/lib/librte_vhost/vhost.c
> +++ b/lib/librte_vhost/vhost.c
> @@ -26,6 +26,7 @@
>  #include "vhost_user.h"
> 
>  struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
> +pthread_mutex_t  vhost_dev_lock = PTHREAD_MUTEX_INITIALIZER;

There's a duplicate space between 'pthread_mutex_t' and 'vhost_dev_lock',
Let's just leave one 😊

Thanks,
Chenbo

> 
>  /* Called with iotlb_lock read-locked */
>  uint64_t
> @@ -645,6 +646,7 @@ vhost_new_device(void)
>  	struct virtio_net *dev;
>  	int i;
> 
> +	pthread_mutex_lock(&vhost_dev_lock);
>  	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
>  		if (vhost_devices[i] == NULL)
>  			break;
> @@ -653,6 +655,7 @@ vhost_new_device(void)
>  	if (i == MAX_VHOST_DEVICE) {
>  		VHOST_LOG_CONFIG(ERR,
>  			"Failed to find a free slot for new device.\n");
> +		pthread_mutex_unlock(&vhost_dev_lock);
>  		return -1;
>  	}
> 
> @@ -660,10 +663,13 @@ vhost_new_device(void)
>  	if (dev == NULL) {
>  		VHOST_LOG_CONFIG(ERR,
>  			"Failed to allocate memory for new dev.\n");
> +		pthread_mutex_unlock(&vhost_dev_lock);
>  		return -1;
>  	}
> 
>  	vhost_devices[i] = dev;
> +	pthread_mutex_unlock(&vhost_dev_lock);
> +
>  	dev->vid = i;
>  	dev->flags = VIRTIO_DEV_BUILTIN_VIRTIO_NET;
>  	dev->slave_req_fd = -1;
> --
> 2.23.0



* [dpdk-dev] [PATCH v2] vhost: fix vid allocation race
  2021-02-01  6:27 ` Xia, Chenbo
@ 2021-02-01  8:48   ` Peng He
  2021-02-03  2:44     ` Xia, Chenbo
  2021-02-03 17:21     ` Maxime Coquelin
  2021-02-01  8:53   ` [dpdk-dev] [PATCH] lib/librte_vhost: " 贺鹏
  1 sibling, 2 replies; 6+ messages in thread
From: Peng He @ 2021-02-01  8:48 UTC (permalink / raw)
  To: dev, chenbo.xia; +Cc: stable

vhost_new_device might be called in different threads at the same time.
thread 1(config thread)
            rte_vhost_driver_start
               ->vhost_user_start_client
                   ->vhost_user_add_connection
                     -> vhost_new_device

thread 2(vhost-events)
	vhost_user_read_cb
           ->vhost_user_msg_handler (return value < 0)
             -> vhost_user_start_client
                 -> vhost_new_device

So there could be a case that a same vid has been allocated twice, or
some vid might be lost in DPDK lib however still held by the upper
applications.

Another place where a race could happen is the function *vhost_destroy_device*,
but after a detailed investigation, the race does not exist as long as
no two devices have the same vid: calling vhost_destroy_device in
different threads with different vids is actually safe.

Fixes: a277c715987 ("vhost: refactor code structure")
Reported-by: Peng He <hepeng.0320@bytedance.com>
Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
---
 lib/librte_vhost/vhost.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index efb136edd1..52ab93d1ec 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -26,6 +26,7 @@
 #include "vhost_user.h"
 
 struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
+pthread_mutex_t vhost_dev_lock = PTHREAD_MUTEX_INITIALIZER;
 
 /* Called with iotlb_lock read-locked */
 uint64_t
@@ -645,6 +646,7 @@ vhost_new_device(void)
 	struct virtio_net *dev;
 	int i;
 
+	pthread_mutex_lock(&vhost_dev_lock);
 	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
 		if (vhost_devices[i] == NULL)
 			break;
@@ -653,6 +655,7 @@ vhost_new_device(void)
 	if (i == MAX_VHOST_DEVICE) {
 		VHOST_LOG_CONFIG(ERR,
 			"Failed to find a free slot for new device.\n");
+		pthread_mutex_unlock(&vhost_dev_lock);
 		return -1;
 	}
 
@@ -660,10 +663,13 @@ vhost_new_device(void)
 	if (dev == NULL) {
 		VHOST_LOG_CONFIG(ERR,
 			"Failed to allocate memory for new dev.\n");
+		pthread_mutex_unlock(&vhost_dev_lock);
 		return -1;
 	}
 
 	vhost_devices[i] = dev;
+	pthread_mutex_unlock(&vhost_dev_lock);
+
 	dev->vid = i;
 	dev->flags = VIRTIO_DEV_BUILTIN_VIRTIO_NET;
 	dev->slave_req_fd = -1;
-- 
2.23.0
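
The locking pattern in this patch can be sketched outside DPDK as a small
standalone fragment. This is only an illustration under assumptions:
demo_devices, demo_dev_lock, demo_new_device and DEMO_MAX_DEVICE are
hypothetical stand-ins for vhost_devices, vhost_dev_lock, vhost_new_device
and MAX_VHOST_DEVICE, and calloc stands in for whatever allocator the
library uses. The point it shows is that the free-slot scan and the slot
assignment sit in one critical section, so two threads cannot pick the
same index.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define DEMO_MAX_DEVICE 8	/* hypothetical; stands in for MAX_VHOST_DEVICE */

struct demo_dev {
	int vid;
};

static struct demo_dev *demo_devices[DEMO_MAX_DEVICE];
static pthread_mutex_t demo_dev_lock = PTHREAD_MUTEX_INITIALIZER;

/* Scan for a free slot and publish the new device under a single lock. */
static int
demo_new_device(void)
{
	struct demo_dev *dev;
	int i;

	pthread_mutex_lock(&demo_dev_lock);
	for (i = 0; i < DEMO_MAX_DEVICE; i++) {
		if (demo_devices[i] == NULL)
			break;
	}

	if (i == DEMO_MAX_DEVICE) {
		pthread_mutex_unlock(&demo_dev_lock);
		return -1;
	}

	dev = calloc(1, sizeof(*dev));
	if (dev == NULL) {
		pthread_mutex_unlock(&demo_dev_lock);
		return -1;
	}

	demo_devices[i] = dev;
	pthread_mutex_unlock(&demo_dev_lock);

	/* The slot is already reserved, so the rest needs no lock. */
	dev->vid = i;
	return i;
}

static void *
demo_worker(void *arg)
{
	int *vid = arg;

	*vid = demo_new_device();
	return NULL;
}

/* Start one thread per slot; return how many distinct vids were handed out. */
static int
demo_run(void)
{
	pthread_t tid[DEMO_MAX_DEVICE];
	int vids[DEMO_MAX_DEVICE];
	int seen[DEMO_MAX_DEVICE] = {0};
	int i, rc, distinct = 0;

	for (i = 0; i < DEMO_MAX_DEVICE; i++) {
		rc = pthread_create(&tid[i], NULL, demo_worker, &vids[i]);
		assert(rc == 0);
	}
	for (i = 0; i < DEMO_MAX_DEVICE; i++)
		pthread_join(tid[i], NULL);

	for (i = 0; i < DEMO_MAX_DEVICE; i++) {
		if (vids[i] >= 0 && seen[vids[i]] == 0) {
			seen[vids[i]] = 1;
			distinct++;
		}
	}
	return distinct;
}
```

With the lock in place, demo_run() should report DEMO_MAX_DEVICE distinct
vids. Without the lock, two workers could both observe the same NULL slot
and return the same index, which is the duplicate-vid case described in
the commit message.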



* Re: [dpdk-dev] [PATCH] lib/librte_vhost: fix vid allocation race
  2021-02-01  6:27 ` Xia, Chenbo
  2021-02-01  8:48   ` [dpdk-dev] [PATCH v2] vhost: " Peng He
@ 2021-02-01  8:53   ` 贺鹏
  1 sibling, 0 replies; 6+ messages in thread
From: 贺鹏 @ 2021-02-01  8:53 UTC (permalink / raw)
  To: Xia, Chenbo; +Cc: dev, maxime.coquelin, chenwei.0515, wangzhihong.wzh

Hi, Chenbo,

Thanks for the detailed review!


Xia, Chenbo <chenbo.xia@intel.com> wrote on Mon, Feb 1, 2021 at 2:27 PM:
>
> Hi Peng & Fei,
>
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Peng He
> > Sent: Friday, January 29, 2021 3:36 PM
> > To: dev@dpdk.org
> > Cc: maxime.coquelin@redhat.com
> > Subject: [dpdk-dev] [PATCH] lib/librte_vhost: fix vid allocation race
>
> Fix the title to 'vhost: XXXXX'
>
> >
> > From: "chenwei.0515" <chenwei.0515@bytedance.com>
>
> This should not be here.. you could just delete it as the author is already
> in commit message.
>
> >
> > vhost_new_devcie might be called in different threads at the same time.
>
> s/devcie/device
>

will fix it in v2.

> > thread 1(config thread)
> >             rte_vhost_driver_start
> >                ->vhost_user_start_client
> >                    ->vhost_user_add_connection
> >                      -> vhost_new_device
> >
> > thread 2(vhost-events)
> >       vhost_user_read_cb
> >            ->vhost_user_msg_handler (return value < 0)
> >              -> vhost_user_start_client
> >                  -> vhost_new_device
> >
> > So there could be a case that a same vid has been allocated twice, or
> > some vid might be lost in DPDK lib however still held by the upper
> > applications.
>
> Good catch! I checked the code and find there exists cases that different threads
> may allocate vhost slot.
>
> And I also find that other functions which use the global vhost_devices[] may also
> meet the same problem. For example, vhost_destroy_device() could be called by different
> thread. So I suggest to fix the problem completely in all related functions like
> vhost_destroy_device() and get_device(). What do you think?
>
> If you agree with above, note that the title should also be changed.
>

Yes, we've also investigated the other places where a race could exist.

In *vhost_destroy_device*, the only access to vhost_devices is to set
the specific slot to NULL.
As long as the vids differ, there is no race, and two threads will
not destroy the same vid at the same time.

We will add these notes to the commit message for clarity.


> Besides, please also add 'Fixes:$COMMID_ID' and cc to stable@dpdk.org so it could be
> fixed in LTS. You can check other commit for reference.

Will do it in v2.

>
> >
> > Reported-by: Peng He <hepeng.0320@bytedance.com>
> > Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
> > Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> > ---
> >  lib/librte_vhost/vhost.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
> > index efb136edd1..db11d293d2 100644
> > --- a/lib/librte_vhost/vhost.c
> > +++ b/lib/librte_vhost/vhost.c
> > @@ -26,6 +26,7 @@
> >  #include "vhost_user.h"
> >
> >  struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
> > +pthread_mutex_t  vhost_dev_lock = PTHREAD_MUTEX_INITIALIZER;
>
> There's a duplicate space between 'pthread_mutex_t' and 'vhost_dev_lock',
> Let's just leave one

will fix it in v2.

>
> Thanks,
> Chenbo
>
> >
> >  /* Called with iotlb_lock read-locked */
> >  uint64_t
> > @@ -645,6 +646,7 @@ vhost_new_device(void)
> >       struct virtio_net *dev;
> >       int i;
> >
> > +     pthread_mutex_lock(&vhost_dev_lock);
> >       for (i = 0; i < MAX_VHOST_DEVICE; i++) {
> >               if (vhost_devices[i] == NULL)
> >                       break;
> > @@ -653,6 +655,7 @@ vhost_new_device(void)
> >       if (i == MAX_VHOST_DEVICE) {
> >               VHOST_LOG_CONFIG(ERR,
> >                       "Failed to find a free slot for new device.\n");
> > +             pthread_mutex_unlock(&vhost_dev_lock);
> >               return -1;
> >       }
> >
> > @@ -660,10 +663,13 @@ vhost_new_device(void)
> >       if (dev == NULL) {
> >               VHOST_LOG_CONFIG(ERR,
> >                       "Failed to allocate memory for new dev.\n");
> > +             pthread_mutex_unlock(&vhost_dev_lock);
> >               return -1;
> >       }
> >
> >       vhost_devices[i] = dev;
> > +     pthread_mutex_unlock(&vhost_dev_lock);
> > +
> >       dev->vid = i;
> >       dev->flags = VIRTIO_DEV_BUILTIN_VIRTIO_NET;
> >       dev->slave_req_fd = -1;
> > --
> > 2.23.0
>


-- 
hepeng


* Re: [dpdk-dev] [PATCH v2] vhost: fix vid allocation race
  2021-02-01  8:48   ` [dpdk-dev] [PATCH v2] vhost: " Peng He
@ 2021-02-03  2:44     ` Xia, Chenbo
  2021-02-03 17:21     ` Maxime Coquelin
  1 sibling, 0 replies; 6+ messages in thread
From: Xia, Chenbo @ 2021-02-03  2:44 UTC (permalink / raw)
  To: Peng He, dev; +Cc: stable, Maxime Coquelin, chenwei.0515, Zhihong Wang

Hi Peng,

> -----Original Message-----
> From: Peng He <xnhp0320@gmail.com>
> Sent: Monday, February 1, 2021 4:49 PM
> To: dev@dpdk.org; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: stable@dpdk.org
> Subject: [PATCH v2] vhost: fix vid allocation race
> 
> vhost_new_device might be called in different threads at the same time.
> thread 1(config thread)
>             rte_vhost_driver_start
>                ->vhost_user_start_client
>                    ->vhost_user_add_connection
>                      -> vhost_new_device
> 
> thread 2(vhost-events)
> 	vhost_user_read_cb
>            ->vhost_user_msg_handler (return value < 0)
>              -> vhost_user_start_client
>                  -> vhost_new_device
> 
> So there could be a case that a same vid has been allocated twice, or
> some vid might be lost in DPDK lib however still held by the upper
> applications.
> 
> Another place where race would happen is at the func *vhost_destroy_device*,
> but after a detailed investigation, the race does not exist as long as
> no two devices have the same vid: Calling vhost_destroy_devices in
> different threads with different vids is actually safe.

I want to clarify another thing: vhost_destroy_device and get_device may have
a thread-safety problem. That is, vhost_user_read_cb() may destroy the device while
an app thread is calling a vhost API (with get_device in it) to use that device.

The good thing is that before vhost_user_read_cb() destroys the device, it notifies the app
thread, so the vhost app should make sure it avoids this kind of problem. Otherwise we may
need to lock all places that use the global vhost_devices, which is not good since
that would hurt data path performance a lot.

Anyway, your patch fixes the specific problem you mentioned.

For this patch:

Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>

> 
> Fixes: a277c715987 ("vhost: refactor code structure")
> Reported-by: Peng He <hepeng.0320@bytedance.com>
> Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
> Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> ---
>  lib/librte_vhost/vhost.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
> index efb136edd1..52ab93d1ec 100644
> --- a/lib/librte_vhost/vhost.c
> +++ b/lib/librte_vhost/vhost.c
> @@ -26,6 +26,7 @@
>  #include "vhost_user.h"
> 
>  struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
> +pthread_mutex_t vhost_dev_lock = PTHREAD_MUTEX_INITIALIZER;
> 
>  /* Called with iotlb_lock read-locked */
>  uint64_t
> @@ -645,6 +646,7 @@ vhost_new_device(void)
>  	struct virtio_net *dev;
>  	int i;
> 
> +	pthread_mutex_lock(&vhost_dev_lock);
>  	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
>  		if (vhost_devices[i] == NULL)
>  			break;
> @@ -653,6 +655,7 @@ vhost_new_device(void)
>  	if (i == MAX_VHOST_DEVICE) {
>  		VHOST_LOG_CONFIG(ERR,
>  			"Failed to find a free slot for new device.\n");
> +		pthread_mutex_unlock(&vhost_dev_lock);
>  		return -1;
>  	}
> 
> @@ -660,10 +663,13 @@ vhost_new_device(void)
>  	if (dev == NULL) {
>  		VHOST_LOG_CONFIG(ERR,
>  			"Failed to allocate memory for new dev.\n");
> +		pthread_mutex_unlock(&vhost_dev_lock);
>  		return -1;
>  	}
> 
>  	vhost_devices[i] = dev;
> +	pthread_mutex_unlock(&vhost_dev_lock);
> +
>  	dev->vid = i;
>  	dev->flags = VIRTIO_DEV_BUILTIN_VIRTIO_NET;
>  	dev->slave_req_fd = -1;
> --
> 2.23.0



* Re: [dpdk-dev] [PATCH v2] vhost: fix vid allocation race
  2021-02-01  8:48   ` [dpdk-dev] [PATCH v2] vhost: " Peng He
  2021-02-03  2:44     ` Xia, Chenbo
@ 2021-02-03 17:21     ` Maxime Coquelin
  1 sibling, 0 replies; 6+ messages in thread
From: Maxime Coquelin @ 2021-02-03 17:21 UTC (permalink / raw)
  To: Peng He, dev, chenbo.xia; +Cc: stable



On 2/1/21 9:48 AM, Peng He wrote:
> vhost_new_device might be called in different threads at the same time.
> thread 1(config thread)
>             rte_vhost_driver_start
>                ->vhost_user_start_client
>                    ->vhost_user_add_connection
>                      -> vhost_new_device
> 
> thread 2(vhost-events)
> 	vhost_user_read_cb
>            ->vhost_user_msg_handler (return value < 0)
>              -> vhost_user_start_client
>                  -> vhost_new_device
> 
> So there could be a case that a same vid has been allocated twice, or
> some vid might be lost in DPDK lib however still held by the upper
> applications.
> 
> Another place where race would happen is at the func *vhost_destroy_device*,
> but after a detailed investigation, the race does not exist as long as
> no two devices have the same vid: Calling vhost_destroy_devices in
> different threads with different vids is actually safe.
> 
> Fixes: a277c715987 ("vhost: refactor code structure")
> Reported-by: Peng He <hepeng.0320@bytedance.com>
> Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
> Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> ---
>  lib/librte_vhost/vhost.c | 6 ++++++
>  1 file changed, 6 insertions(+)

Applied to dpdk-next-virtio/main.

Thanks,
Maxime



