patches for DPDK stable branches
* [dpdk-stable] [PATCH v4 1/7] vhost: fix missing memory table NUMA realloc
       [not found] <20210617153739.178011-1-maxime.coquelin@redhat.com>
@ 2021-06-17 15:37 ` Maxime Coquelin
  2021-06-18  4:34   ` Xia, Chenbo
  2021-06-17 15:37 ` [dpdk-stable] [PATCH v4 2/7] vhost: fix missing guest pages " Maxime Coquelin
  2021-06-17 15:37 ` [dpdk-stable] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue Maxime Coquelin
  2 siblings, 1 reply; 10+ messages in thread
From: Maxime Coquelin @ 2021-06-17 15:37 UTC (permalink / raw)
  To: dev, david.marchand, chenbo.xia; +Cc: Maxime Coquelin, stable

When the guest allocates virtqueues on a different NUMA node
than the one on which the Vhost metadata are allocated, both
the Vhost device struct and the virtqueue structs are
reallocated.

However, reallocating the Vhost memory table was missing,
which likely causes at least one cross-NUMA access for every
burst of packets.

This patch reallocates this table on the same NUMA node as the
other metadata.

Fixes: 552e8fd3d2b4 ("vhost: simplify memory regions handling")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/vhost/vhost_user.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index 8f0eba6412..031e3bfa2f 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -557,6 +557,9 @@ numa_realloc(struct virtio_net *dev, int index)
 		goto out;
 	}
 	if (oldnode != newnode) {
+		struct rte_vhost_memory *old_mem;
+		ssize_t mem_size;
+
 		VHOST_LOG_CONFIG(INFO,
 			"reallocate dev from %d to %d node\n",
 			oldnode, newnode);
@@ -568,6 +571,18 @@ numa_realloc(struct virtio_net *dev, int index)
 
 		memcpy(dev, old_dev, sizeof(*dev));
 		rte_free(old_dev);
+
+		mem_size = sizeof(struct rte_vhost_memory) +
+			sizeof(struct rte_vhost_mem_region) * dev->mem->nregions;
+		old_mem = dev->mem;
+		dev->mem = rte_malloc_socket(NULL, mem_size, 0, newnode);
+		if (!dev->mem) {
+			dev->mem = old_mem;
+			goto out;
+		}
+
+		memcpy(dev->mem, old_mem, mem_size);
+		rte_free(old_mem);
 	}
 
 out:
-- 
2.31.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [dpdk-stable] [PATCH v4 2/7] vhost: fix missing guest pages table NUMA realloc
       [not found] <20210617153739.178011-1-maxime.coquelin@redhat.com>
  2021-06-17 15:37 ` [dpdk-stable] [PATCH v4 1/7] vhost: fix missing memory table NUMA realloc Maxime Coquelin
@ 2021-06-17 15:37 ` Maxime Coquelin
  2021-06-17 15:37 ` [dpdk-stable] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue Maxime Coquelin
  2 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-06-17 15:37 UTC (permalink / raw)
  To: dev, david.marchand, chenbo.xia; +Cc: Maxime Coquelin, stable

When the guest allocates virtqueues on a different NUMA node
than the one on which the Vhost metadata are allocated, both
the Vhost device struct and the virtqueue structs are
reallocated.

However, reallocating the guest pages table was missing,
which likely causes at least one cross-NUMA access for every
burst of packets.

This patch reallocates this table on the same NUMA node as the
other metadata.

Fixes: e246896178e6 ("vhost: get guest/host physical address mappings")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/vhost/vhost_user.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index 031e3bfa2f..cbfdf1b4d8 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -558,7 +558,8 @@ numa_realloc(struct virtio_net *dev, int index)
 	}
 	if (oldnode != newnode) {
 		struct rte_vhost_memory *old_mem;
-		ssize_t mem_size;
+		struct guest_page *old_gp;
+		ssize_t mem_size, gp_size;
 
 		VHOST_LOG_CONFIG(INFO,
 			"reallocate dev from %d to %d node\n",
@@ -583,6 +584,17 @@ numa_realloc(struct virtio_net *dev, int index)
 
 		memcpy(dev->mem, old_mem, mem_size);
 		rte_free(old_mem);
+
+		gp_size = dev->max_guest_pages * sizeof(*dev->guest_pages);
+		old_gp = dev->guest_pages;
+		dev->guest_pages = rte_malloc_socket(NULL, gp_size, RTE_CACHE_LINE_SIZE, newnode);
+		if (!dev->guest_pages) {
+			dev->guest_pages = old_gp;
+			goto out;
+		}
+
+		memcpy(dev->guest_pages, old_gp, gp_size);
+		rte_free(old_gp);
 	}
 
 out:
-- 
2.31.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [dpdk-stable] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
       [not found] <20210617153739.178011-1-maxime.coquelin@redhat.com>
  2021-06-17 15:37 ` [dpdk-stable] [PATCH v4 1/7] vhost: fix missing memory table NUMA realloc Maxime Coquelin
  2021-06-17 15:37 ` [dpdk-stable] [PATCH v4 2/7] vhost: fix missing guest pages " Maxime Coquelin
@ 2021-06-17 15:37 ` Maxime Coquelin
  2021-06-18  4:34   ` Xia, Chenbo
  2 siblings, 1 reply; 10+ messages in thread
From: Maxime Coquelin @ 2021-06-17 15:37 UTC (permalink / raw)
  To: dev, david.marchand, chenbo.xia; +Cc: Maxime Coquelin, stable

Since the Vhost-user device initialization has been reworked,
enabling the application to start using the device as soon as
the first queue pair is ready, NUMA reallocation no longer
happened on queue pairs other than the first one since
numa_realloc() was returning early if the device was running.

This patch fixes this issue by only preventing the device
metadata from being reallocated if the device is running. For
the virtqueues, a vring state change notification is sent to
notify the application of their disablement. Since the
callback is supposed to be blocking, it is safe to reallocate
them afterwards.

Fixes: d0fcc38f5fa4 ("vhost: improve device readiness notifications")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/vhost/vhost_user.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index 0e9e26ebe0..6e7b327ef8 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -488,9 +488,6 @@ numa_realloc(struct virtio_net *dev, int index)
 	struct batch_copy_elem *new_batch_copy_elems;
 	int ret;
 
-	if (dev->flags & VIRTIO_DEV_RUNNING)
-		return dev;
-
 	old_dev = dev;
 	vq = old_vq = dev->virtqueue[index];
 
@@ -506,6 +503,11 @@ numa_realloc(struct virtio_net *dev, int index)
 		return dev;
 	}
 	if (oldnode != newnode) {
+		if (vq->ready) {
+			vq->ready = false;
+			vhost_user_notify_queue_state(dev, index, 0);
+		}
+
 		VHOST_LOG_CONFIG(INFO,
 			"reallocate vq from %d to %d node\n", oldnode, newnode);
 		vq = rte_malloc_socket(NULL, sizeof(*vq), 0, newnode);
@@ -558,6 +560,9 @@ numa_realloc(struct virtio_net *dev, int index)
 		rte_free(old_vq);
 	}
 
+	if (dev->flags & VIRTIO_DEV_RUNNING)
+		goto out;
+
 	/* check if we need to reallocate dev */
 	ret = get_mempolicy(&oldnode, NULL, 0, old_dev,
 			    MPOL_F_NODE | MPOL_F_ADDR);
-- 
2.31.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-stable] [PATCH v4 1/7] vhost: fix missing memory table NUMA realloc
  2021-06-17 15:37 ` [dpdk-stable] [PATCH v4 1/7] vhost: fix missing memory table NUMA realloc Maxime Coquelin
@ 2021-06-18  4:34   ` Xia, Chenbo
  2021-06-18  7:40     ` Maxime Coquelin
  0 siblings, 1 reply; 10+ messages in thread
From: Xia, Chenbo @ 2021-06-18  4:34 UTC (permalink / raw)
  To: Maxime Coquelin, dev, david.marchand; +Cc: stable

Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Thursday, June 17, 2021 11:38 PM
> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; stable@dpdk.org
> Subject: [PATCH v4 1/7] vhost: fix missing memory table NUMA realloc
> 
> When the guest allocates virtqueues on a different NUMA node
> than the one the Vhost metadata are allocated, both the Vhost
> device struct and the virtqueues struct are reallocated.
> 
> However, reallocating the Vhost memory table was missing, which
> likely causes iat least one cross-NUMA accesses for every burst
> of packets.

'at least' ?

> 
> This patch reallocates this table on the same NUMA node as the
> other metadata.
> 
> Fixes: 552e8fd3d2b4 ("vhost: simplify memory regions handling")
> Cc: stable@dpdk.org
> 
> Reported-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>  lib/vhost/vhost_user.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index 8f0eba6412..031e3bfa2f 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -557,6 +557,9 @@ numa_realloc(struct virtio_net *dev, int index)

As we are reallocating more things now, the comment above numa_realloc()
should also be changed, like:

Reallocate related data structure to make them on the same numa node as
the memory of vring descriptor.

Thanks,
Chenbo

>  		goto out;
>  	}
>  	if (oldnode != newnode) {
> +		struct rte_vhost_memory *old_mem;
> +		ssize_t mem_size;
> +
>  		VHOST_LOG_CONFIG(INFO,
>  			"reallocate dev from %d to %d node\n",
>  			oldnode, newnode);
> @@ -568,6 +571,18 @@ numa_realloc(struct virtio_net *dev, int index)
> 
>  		memcpy(dev, old_dev, sizeof(*dev));
>  		rte_free(old_dev);
> +
> +		mem_size = sizeof(struct rte_vhost_memory) +
> +			sizeof(struct rte_vhost_mem_region) * dev->mem->nregions;
> +		old_mem = dev->mem;
> +		dev->mem = rte_malloc_socket(NULL, mem_size, 0, newnode);
> +		if (!dev->mem) {
> +			dev->mem = old_mem;
> +			goto out;
> +		}
> +
> +		memcpy(dev->mem, old_mem, mem_size);
> +		rte_free(old_mem);
>  	}
> 
>  out:
> --
> 2.31.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-stable] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
  2021-06-17 15:37 ` [dpdk-stable] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue Maxime Coquelin
@ 2021-06-18  4:34   ` Xia, Chenbo
  2021-06-18  8:01     ` Maxime Coquelin
  0 siblings, 1 reply; 10+ messages in thread
From: Xia, Chenbo @ 2021-06-18  4:34 UTC (permalink / raw)
  To: Maxime Coquelin, dev, david.marchand; +Cc: stable

Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Thursday, June 17, 2021 11:38 PM
> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; stable@dpdk.org
> Subject: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
> 
> Since the Vhost-user device initialization has been reworked,
> enabling the application to start using the device as soon as
> the first queue pair is ready, NUMA reallocation no more
> happened on queue pairs other than the first one since
> numa_realloc() was returning early if the device was running.
> 
> This patch fixes this issue by only preventing the device
> metadata to be allocated if the device is running. For the
> virtqueues, a vring state change notification is sent to
> notify the application of its disablement. Since the callback
> is supposed to be blocking, it is safe to reallocate it
> afterwards.

Is there a corner case? numa_realloc() may happen during the vhost-user
set_vring_addr/kick, set_mem_table and IOTLB messages, and the IOTLB
message does not take the vq access lock. Could it happen that
numa_realloc() runs on an IOTLB message while the app accesses the vq
in the meantime?

Thanks,
Chenbo

> 
> Fixes: d0fcc38f5fa4 ("vhost: improve device readiness notifications")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>  lib/vhost/vhost_user.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index 0e9e26ebe0..6e7b327ef8 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -488,9 +488,6 @@ numa_realloc(struct virtio_net *dev, int index)
>  	struct batch_copy_elem *new_batch_copy_elems;
>  	int ret;
> 
> -	if (dev->flags & VIRTIO_DEV_RUNNING)
> -		return dev;
> -
>  	old_dev = dev;
>  	vq = old_vq = dev->virtqueue[index];
> 
> @@ -506,6 +503,11 @@ numa_realloc(struct virtio_net *dev, int index)
>  		return dev;
>  	}
>  	if (oldnode != newnode) {
> +		if (vq->ready) {
> +			vq->ready = false;
> +			vhost_user_notify_queue_state(dev, index, 0);
> +		}
> +
>  		VHOST_LOG_CONFIG(INFO,
>  			"reallocate vq from %d to %d node\n", oldnode, newnode);
>  		vq = rte_malloc_socket(NULL, sizeof(*vq), 0, newnode);
> @@ -558,6 +560,9 @@ numa_realloc(struct virtio_net *dev, int index)
>  		rte_free(old_vq);
>  	}
> 
> +	if (dev->flags & VIRTIO_DEV_RUNNING)
> +		goto out;
> +
>  	/* check if we need to reallocate dev */
>  	ret = get_mempolicy(&oldnode, NULL, 0, old_dev,
>  			    MPOL_F_NODE | MPOL_F_ADDR);
> --
> 2.31.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-stable] [PATCH v4 1/7] vhost: fix missing memory table NUMA realloc
  2021-06-18  4:34   ` Xia, Chenbo
@ 2021-06-18  7:40     ` Maxime Coquelin
  0 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-06-18  7:40 UTC (permalink / raw)
  To: Xia, Chenbo, dev, david.marchand; +Cc: stable



On 6/18/21 6:34 AM, Xia, Chenbo wrote:
> Hi Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Thursday, June 17, 2021 11:38 PM
>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; stable@dpdk.org
>> Subject: [PATCH v4 1/7] vhost: fix missing memory table NUMA realloc
>>
>> When the guest allocates virtqueues on a different NUMA node
>> than the one the Vhost metadata are allocated, both the Vhost
>> device struct and the virtqueues struct are reallocated.
>>
>> However, reallocating the Vhost memory table was missing, which
>> likely causes iat least one cross-NUMA accesses for every burst
>> of packets.
> 
> 'at least' ?

yes.

>>
>> This patch reallocates this table on the same NUMA node as the
>> other metadata.
>>
>> Fixes: 552e8fd3d2b4 ("vhost: simplify memory regions handling")
>> Cc: stable@dpdk.org
>>
>> Reported-by: David Marchand <david.marchand@redhat.com>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>>  lib/vhost/vhost_user.c | 15 +++++++++++++++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
>> index 8f0eba6412..031e3bfa2f 100644
>> --- a/lib/vhost/vhost_user.c
>> +++ b/lib/vhost/vhost_user.c
>> @@ -557,6 +557,9 @@ numa_realloc(struct virtio_net *dev, int index)
> 
> As we are realloc more things now, the comment above 'numa_realloc(XXX)'
> should also be changed like:
> 
> Reallocate related data structure to make them on the same numa node as
> the memory of vring descriptor.

Agree, I'll put this:

"
Reallocate virtio_dev, vhost_virtqueue and related data structures to
make them on the same numa node as the memory of vring descriptor.
"

Thanks,
Maxime

> Thanks,
> Chenbo
> 
>>  		goto out;
>>  	}
>>  	if (oldnode != newnode) {
>> +		struct rte_vhost_memory *old_mem;
>> +		ssize_t mem_size;
>> +
>>  		VHOST_LOG_CONFIG(INFO,
>>  			"reallocate dev from %d to %d node\n",
>>  			oldnode, newnode);
>> @@ -568,6 +571,18 @@ numa_realloc(struct virtio_net *dev, int index)
>>
>>  		memcpy(dev, old_dev, sizeof(*dev));
>>  		rte_free(old_dev);
>> +
>> +		mem_size = sizeof(struct rte_vhost_memory) +
>> +			sizeof(struct rte_vhost_mem_region) * dev->mem->nregions;
>> +		old_mem = dev->mem;
>> +		dev->mem = rte_malloc_socket(NULL, mem_size, 0, newnode);
>> +		if (!dev->mem) {
>> +			dev->mem = old_mem;
>> +			goto out;
>> +		}
>> +
>> +		memcpy(dev->mem, old_mem, mem_size);
>> +		rte_free(old_mem);
>>  	}
>>
>>  out:
>> --
>> 2.31.1
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-stable] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
  2021-06-18  4:34   ` Xia, Chenbo
@ 2021-06-18  8:01     ` Maxime Coquelin
  2021-06-18  8:21       ` Xia, Chenbo
  0 siblings, 1 reply; 10+ messages in thread
From: Maxime Coquelin @ 2021-06-18  8:01 UTC (permalink / raw)
  To: Xia, Chenbo, dev, david.marchand; +Cc: stable



On 6/18/21 6:34 AM, Xia, Chenbo wrote:
> Hi Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Thursday, June 17, 2021 11:38 PM
>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; stable@dpdk.org
>> Subject: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
>>
>> Since the Vhost-user device initialization has been reworked,
>> enabling the application to start using the device as soon as
>> the first queue pair is ready, NUMA reallocation no more
>> happened on queue pairs other than the first one since
>> numa_realloc() was returning early if the device was running.
>>
>> This patch fixes this issue by only preventing the device
>> metadata to be allocated if the device is running. For the
>> virtqueues, a vring state change notification is sent to
>> notify the application of its disablement. Since the callback
>> is supposed to be blocking, it is safe to reallocate it
>> afterwards.
> 
> Is there a corner case? Numa_realloc may happen during vhost-user msg
> set_vring_addr/kick, set_mem_table and iotlb msg. And iotlb msg will
> not take vq access lock. It could happen when numa_realloc happens on
> iotlb msg and app accesses vq in the meantime?

I think we are safe wrt to numa_realloc(), because the app's
.vring_state_changed() callback is only returning when it is no more
processing the rings.


> Thanks,
> Chenbo
> 
>>
>> Fixes: d0fcc38f5fa4 ("vhost: improve device readiness notifications")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>>  lib/vhost/vhost_user.c | 11 ++++++++---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
>> index 0e9e26ebe0..6e7b327ef8 100644
>> --- a/lib/vhost/vhost_user.c
>> +++ b/lib/vhost/vhost_user.c
>> @@ -488,9 +488,6 @@ numa_realloc(struct virtio_net *dev, int index)
>>  	struct batch_copy_elem *new_batch_copy_elems;
>>  	int ret;
>>
>> -	if (dev->flags & VIRTIO_DEV_RUNNING)
>> -		return dev;
>> -
>>  	old_dev = dev;
>>  	vq = old_vq = dev->virtqueue[index];
>>
>> @@ -506,6 +503,11 @@ numa_realloc(struct virtio_net *dev, int index)
>>  		return dev;
>>  	}
>>  	if (oldnode != newnode) {
>> +		if (vq->ready) {
>> +			vq->ready = false;
>> +			vhost_user_notify_queue_state(dev, index, 0);
>> +		}
>> +
>>  		VHOST_LOG_CONFIG(INFO,
>>  			"reallocate vq from %d to %d node\n", oldnode, newnode);
>>  		vq = rte_malloc_socket(NULL, sizeof(*vq), 0, newnode);
>> @@ -558,6 +560,9 @@ numa_realloc(struct virtio_net *dev, int index)
>>  		rte_free(old_vq);
>>  	}
>>
>> +	if (dev->flags & VIRTIO_DEV_RUNNING)
>> +		goto out;
>> +
>>  	/* check if we need to reallocate dev */
>>  	ret = get_mempolicy(&oldnode, NULL, 0, old_dev,
>>  			    MPOL_F_NODE | MPOL_F_ADDR);
>> --
>> 2.31.1
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-stable] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
  2021-06-18  8:01     ` Maxime Coquelin
@ 2021-06-18  8:21       ` Xia, Chenbo
  2021-06-18  8:48         ` Maxime Coquelin
  0 siblings, 1 reply; 10+ messages in thread
From: Xia, Chenbo @ 2021-06-18  8:21 UTC (permalink / raw)
  To: Maxime Coquelin, dev, david.marchand; +Cc: stable

Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Friday, June 18, 2021 4:01 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>; dev@dpdk.org;
> david.marchand@redhat.com
> Cc: stable@dpdk.org
> Subject: Re: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
> 
> 
> 
> On 6/18/21 6:34 AM, Xia, Chenbo wrote:
> > Hi Maxime,
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> Sent: Thursday, June 17, 2021 11:38 PM
> >> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
> <chenbo.xia@intel.com>
> >> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; stable@dpdk.org
> >> Subject: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
> >>
> >> Since the Vhost-user device initialization has been reworked,
> >> enabling the application to start using the device as soon as
> >> the first queue pair is ready, NUMA reallocation no more
> >> happened on queue pairs other than the first one since
> >> numa_realloc() was returning early if the device was running.
> >>
> >> This patch fixes this issue by only preventing the device
> >> metadata to be allocated if the device is running. For the
> >> virtqueues, a vring state change notification is sent to
> >> notify the application of its disablement. Since the callback
> >> is supposed to be blocking, it is safe to reallocate it
> >> afterwards.
> >
> > Is there a corner case? Numa_realloc may happen during vhost-user msg
> > set_vring_addr/kick, set_mem_table and iotlb msg. And iotlb msg will
> > not take vq access lock. It could happen when numa_realloc happens on
> > iotlb msg and app accesses vq in the meantime?
> 
> I think we are safe wrt to numa_realloc(), because the app's
> .vring_state_changed() callback is only returning when it is no more
> processing the rings.

Yes, I think it should be. But in this iotlb msg case (take vhost pmd for example),
can't vhost pmd still access vq since the vq access lock is not taken? Am I missing something?

Thanks,
Chenbo

> 
> 
> > Thanks,
> > Chenbo
> >
> >>
> >> Fixes: d0fcc38f5fa4 ("vhost: improve device readiness notifications")
> >> Cc: stable@dpdk.org
> >>
> >> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> ---
> >>  lib/vhost/vhost_user.c | 11 ++++++++---
> >>  1 file changed, 8 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> >> index 0e9e26ebe0..6e7b327ef8 100644
> >> --- a/lib/vhost/vhost_user.c
> >> +++ b/lib/vhost/vhost_user.c
> >> @@ -488,9 +488,6 @@ numa_realloc(struct virtio_net *dev, int index)
> >>  	struct batch_copy_elem *new_batch_copy_elems;
> >>  	int ret;
> >>
> >> -	if (dev->flags & VIRTIO_DEV_RUNNING)
> >> -		return dev;
> >> -
> >>  	old_dev = dev;
> >>  	vq = old_vq = dev->virtqueue[index];
> >>
> >> @@ -506,6 +503,11 @@ numa_realloc(struct virtio_net *dev, int index)
> >>  		return dev;
> >>  	}
> >>  	if (oldnode != newnode) {
> >> +		if (vq->ready) {
> >> +			vq->ready = false;
> >> +			vhost_user_notify_queue_state(dev, index, 0);
> >> +		}
> >> +
> >>  		VHOST_LOG_CONFIG(INFO,
> >>  			"reallocate vq from %d to %d node\n", oldnode, newnode);
> >>  		vq = rte_malloc_socket(NULL, sizeof(*vq), 0, newnode);
> >> @@ -558,6 +560,9 @@ numa_realloc(struct virtio_net *dev, int index)
> >>  		rte_free(old_vq);
> >>  	}
> >>
> >> +	if (dev->flags & VIRTIO_DEV_RUNNING)
> >> +		goto out;
> >> +
> >>  	/* check if we need to reallocate dev */
> >>  	ret = get_mempolicy(&oldnode, NULL, 0, old_dev,
> >>  			    MPOL_F_NODE | MPOL_F_ADDR);
> >> --
> >> 2.31.1
> >


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-stable] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
  2021-06-18  8:21       ` Xia, Chenbo
@ 2021-06-18  8:48         ` Maxime Coquelin
  2021-06-24 10:49           ` Xia, Chenbo
  0 siblings, 1 reply; 10+ messages in thread
From: Maxime Coquelin @ 2021-06-18  8:48 UTC (permalink / raw)
  To: Xia, Chenbo, dev, david.marchand; +Cc: stable



On 6/18/21 10:21 AM, Xia, Chenbo wrote:
> Hi Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Friday, June 18, 2021 4:01 PM
>> To: Xia, Chenbo <chenbo.xia@intel.com>; dev@dpdk.org;
>> david.marchand@redhat.com
>> Cc: stable@dpdk.org
>> Subject: Re: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
>>
>>
>>
>> On 6/18/21 6:34 AM, Xia, Chenbo wrote:
>>> Hi Maxime,
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Sent: Thursday, June 17, 2021 11:38 PM
>>>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
>> <chenbo.xia@intel.com>
>>>> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; stable@dpdk.org
>>>> Subject: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
>>>>
>>>> Since the Vhost-user device initialization has been reworked,
>>>> enabling the application to start using the device as soon as
>>>> the first queue pair is ready, NUMA reallocation no more
>>>> happened on queue pairs other than the first one since
>>>> numa_realloc() was returning early if the device was running.
>>>>
>>>> This patch fixes this issue by only preventing the device
>>>> metadata to be allocated if the device is running. For the
>>>> virtqueues, a vring state change notification is sent to
>>>> notify the application of its disablement. Since the callback
>>>> is supposed to be blocking, it is safe to reallocate it
>>>> afterwards.
>>>
>>> Is there a corner case? Numa_realloc may happen during vhost-user msg
>>> set_vring_addr/kick, set_mem_table and iotlb msg. And iotlb msg will
>>> not take vq access lock. It could happen when numa_realloc happens on
>>> iotlb msg and app accesses vq in the meantime?
>>
>> I think we are safe wrt to numa_realloc(), because the app's
>> .vring_state_changed() callback is only returning when it is no more
>> processing the rings.
> 
> Yes, I think it should be. But in this iotlb msg case (take vhost pmd for example),
> can't vhost pmd still access vq since vq access lock is not took? Do I miss something?

Vhost PMD sends RTE_ETH_EVENT_QUEUE_STATE, and my assumption was that
the application would stop processing the rings when handling this
event and only return from the callback when it's done, but it seems
that's not done at least in testpmd. So we may not rely on that after
all :/.

We cannot rely on the VQ's access lock since the goal of numa_realloc()
is to reallocate the vhost_virtqueue itself, which contains the
access_lock. Relying on it would cause a use-after-free.

Maybe the safest thing to do is to just skip the reallocation if
vq->ready == true.

Having vq->ready == true means we already received all the vrings info
from QEMU, which means the driver has already initialized the device.

It should not change runtime behavior compared to this patch since it
would not reallocate anyway.

What do you think?

> Thanks,
> Chenbo
> 
>>
>>
>>> Thanks,
>>> Chenbo
>>>
>>>>
>>>> Fixes: d0fcc38f5fa4 ("vhost: improve device readiness notifications")
>>>> Cc: stable@dpdk.org
>>>>
>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> ---
>>>>  lib/vhost/vhost_user.c | 11 ++++++++---
>>>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
>>>> index 0e9e26ebe0..6e7b327ef8 100644
>>>> --- a/lib/vhost/vhost_user.c
>>>> +++ b/lib/vhost/vhost_user.c
>>>> @@ -488,9 +488,6 @@ numa_realloc(struct virtio_net *dev, int index)
>>>>  	struct batch_copy_elem *new_batch_copy_elems;
>>>>  	int ret;
>>>>
>>>> -	if (dev->flags & VIRTIO_DEV_RUNNING)
>>>> -		return dev;
>>>> -
>>>>  	old_dev = dev;
>>>>  	vq = old_vq = dev->virtqueue[index];
>>>>
>>>> @@ -506,6 +503,11 @@ numa_realloc(struct virtio_net *dev, int index)
>>>>  		return dev;
>>>>  	}
>>>>  	if (oldnode != newnode) {
>>>> +		if (vq->ready) {
>>>> +			vq->ready = false;
>>>> +			vhost_user_notify_queue_state(dev, index, 0);
>>>> +		}
>>>> +
>>>>  		VHOST_LOG_CONFIG(INFO,
>>>>  			"reallocate vq from %d to %d node\n", oldnode, newnode);
>>>>  		vq = rte_malloc_socket(NULL, sizeof(*vq), 0, newnode);
>>>> @@ -558,6 +560,9 @@ numa_realloc(struct virtio_net *dev, int index)
>>>>  		rte_free(old_vq);
>>>>  	}
>>>>
>>>> +	if (dev->flags & VIRTIO_DEV_RUNNING)
>>>> +		goto out;
>>>> +
>>>>  	/* check if we need to reallocate dev */
>>>>  	ret = get_mempolicy(&oldnode, NULL, 0, old_dev,
>>>>  			    MPOL_F_NODE | MPOL_F_ADDR);
>>>> --
>>>> 2.31.1
>>>
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-stable] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
  2021-06-18  8:48         ` Maxime Coquelin
@ 2021-06-24 10:49           ` Xia, Chenbo
  0 siblings, 0 replies; 10+ messages in thread
From: Xia, Chenbo @ 2021-06-24 10:49 UTC (permalink / raw)
  To: Maxime Coquelin, dev, david.marchand; +Cc: stable

Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Friday, June 18, 2021 4:48 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>; dev@dpdk.org;
> david.marchand@redhat.com
> Cc: stable@dpdk.org
> Subject: Re: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
> 
> 
> 
> On 6/18/21 10:21 AM, Xia, Chenbo wrote:
> > Hi Maxime,
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> Sent: Friday, June 18, 2021 4:01 PM
> >> To: Xia, Chenbo <chenbo.xia@intel.com>; dev@dpdk.org;
> >> david.marchand@redhat.com
> >> Cc: stable@dpdk.org
> >> Subject: Re: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
> >>
> >>
> >>
> >> On 6/18/21 6:34 AM, Xia, Chenbo wrote:
> >>> Hi Maxime,
> >>>
> >>>> -----Original Message-----
> >>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>> Sent: Thursday, June 17, 2021 11:38 PM
> >>>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
> >> <chenbo.xia@intel.com>
> >>>> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; stable@dpdk.org
> >>>> Subject: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
> >>>>
> >>>> Since the Vhost-user device initialization has been reworked,
> >>>> enabling the application to start using the device as soon as
> >>>> the first queue pair is ready, NUMA reallocation no more
> >>>> happened on queue pairs other than the first one since
> >>>> numa_realloc() was returning early if the device was running.
> >>>>
> >>>> This patch fixes this issue by only preventing the device
> >>>> metadata to be allocated if the device is running. For the
> >>>> virtqueues, a vring state change notification is sent to
> >>>> notify the application of its disablement. Since the callback
> >>>> is supposed to be blocking, it is safe to reallocate it
> >>>> afterwards.
> >>>
> >>> Is there a corner case? Numa_realloc may happen during vhost-user msg
> >>> set_vring_addr/kick, set_mem_table and iotlb msg. And iotlb msg will
> >>> not take vq access lock. It could happen when numa_realloc happens on
> >>> iotlb msg and app accesses vq in the meantime?
> >>
> >> I think we are safe wrt to numa_realloc(), because the app's
> >> .vring_state_changed() callback is only returning when it is no more
> >> processing the rings.
> >
> > Yes, I think it should be. But in this iotlb msg case (take vhost pmd for
> example),
> > can't vhost pmd still access vq since vq access lock is not took? Do I miss
> something?
> 
> Vhost PMD sends RTE_ETH_EVENT_QUEUE_STATE, and my assumption was that
> the application would stop processing the rings when handling this
> event and only return from the callback when it's done, but it seems
> that's not done at least in testpmd. So we may not rely on that after
> all :/.
> 
> We cannot rely on the VQ's access lock since the goal of numa_realloc is
> to reallocate the vhost_virtqueue itself, which contains the access_lock.
> Relying on it would cause a use after free.
> 
> Maybe the safest thing to do is to just skip the reallocation if
> vq->ready == true.
> 
> Having vq->ready == true means we already received all the vrings info
> from QEMU, which means the driver has already initialized the device.
> 
> It should not change runtime behavior compared to this patch since it
> would not reallocate anyway.
> 
> What do you think?

That sounds good to me 😊

Thanks,
Chenbo

> 
> > Thanks,
> > Chenbo
> >
> >>
> >>
> >>> Thanks,
> >>> Chenbo
> >>>
> >>>>
> >>>> Fixes: d0fcc38f5fa4 ("vhost: improve device readiness notifications")
> >>>> Cc: stable@dpdk.org
> >>>>
> >>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>> ---
> >>>>  lib/vhost/vhost_user.c | 11 ++++++++---
> >>>>  1 file changed, 8 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> >>>> index 0e9e26ebe0..6e7b327ef8 100644
> >>>> --- a/lib/vhost/vhost_user.c
> >>>> +++ b/lib/vhost/vhost_user.c
> >>>> @@ -488,9 +488,6 @@ numa_realloc(struct virtio_net *dev, int index)
> >>>>  	struct batch_copy_elem *new_batch_copy_elems;
> >>>>  	int ret;
> >>>>
> >>>> -	if (dev->flags & VIRTIO_DEV_RUNNING)
> >>>> -		return dev;
> >>>> -
> >>>>  	old_dev = dev;
> >>>>  	vq = old_vq = dev->virtqueue[index];
> >>>>
> >>>> @@ -506,6 +503,11 @@ numa_realloc(struct virtio_net *dev, int index)
> >>>>  		return dev;
> >>>>  	}
> >>>>  	if (oldnode != newnode) {
> >>>> +		if (vq->ready) {
> >>>> +			vq->ready = false;
> >>>> +			vhost_user_notify_queue_state(dev, index, 0);
> >>>> +		}
> >>>> +
> >>>>  		VHOST_LOG_CONFIG(INFO,
> >>>>  			"reallocate vq from %d to %d node\n", oldnode,
> newnode);
> >>>>  		vq = rte_malloc_socket(NULL, sizeof(*vq), 0, newnode);
> >>>> @@ -558,6 +560,9 @@ numa_realloc(struct virtio_net *dev, int index)
> >>>>  		rte_free(old_vq);
> >>>>  	}
> >>>>
> >>>> +	if (dev->flags & VIRTIO_DEV_RUNNING)
> >>>> +		goto out;
> >>>> +
> >>>>  	/* check if we need to reallocate dev */
> >>>>  	ret = get_mempolicy(&oldnode, NULL, 0, old_dev,
> >>>>  			    MPOL_F_NODE | MPOL_F_ADDR);
> >>>> --
> >>>> 2.31.1
> >>>
> >


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-06-24 10:49 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20210617153739.178011-1-maxime.coquelin@redhat.com>
2021-06-17 15:37 ` [dpdk-stable] [PATCH v4 1/7] vhost: fix missing memory table NUMA realloc Maxime Coquelin
2021-06-18  4:34   ` Xia, Chenbo
2021-06-18  7:40     ` Maxime Coquelin
2021-06-17 15:37 ` [dpdk-stable] [PATCH v4 2/7] vhost: fix missing guest pages " Maxime Coquelin
2021-06-17 15:37 ` [dpdk-stable] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue Maxime Coquelin
2021-06-18  4:34   ` Xia, Chenbo
2021-06-18  8:01     ` Maxime Coquelin
2021-06-18  8:21       ` Xia, Chenbo
2021-06-18  8:48         ` Maxime Coquelin
2021-06-24 10:49           ` Xia, Chenbo
