DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx
@ 2021-03-18  7:18 Feifei Wang
  2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 1/4] net/mlx4: fix rebuild bug for Memory Region cache Feifei Wang
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Feifei Wang @ 2021-03-18  7:18 UTC (permalink / raw)
  Cc: dev, nd, Feifei Wang

For net/mlx4 and net/mlx5, fix cache rebuild bug and replace SMP
barriers with atomic fence.

Feifei Wang (4):
  net/mlx4: fix rebuild bug for Memory Region cache
  net/mlx4: replace SMP barrier with C11 barriers
  net/mlx5: fix rebuild bug for Memory Region cache
  net/mlx5: replace SMP barriers with C11 barriers

 drivers/net/mlx4/mlx4_mr.c | 21 +++++++++----------
 drivers/net/mlx5/mlx5_mr.c | 41 ++++++++++++++++++--------------------
 2 files changed, 28 insertions(+), 34 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH v1 1/4] net/mlx4: fix rebuild bug for Memory Region cache
  2021-03-18  7:18 [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx Feifei Wang
@ 2021-03-18  7:18 ` Feifei Wang
  2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 2/4] net/mlx4: replace SMP barrier with C11 barriers Feifei Wang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Feifei Wang @ 2021-03-18  7:18 UTC (permalink / raw)
  To: Matan Azrad, Shahaf Shuler, Yongseok Koh
  Cc: dev, nd, Feifei Wang, stable, Ruifeng Wang

'dev_gen' is a variable to inform other cores to flush their local cache
when global cache is rebuilt.

However, if 'dev_gen' is updated after global cache is rebuilt, other
cores may load a wrong memory region lkey value from old local cache.

Timeslot        main core               worker core
  1         rebuild global cache
  2                                  load unchanged dev_gen
  3            update dev_gen
  4                                  look up old local cache

From the example above, we can see that though global cache is rebuilt,
due to that dev_gen is not updated, the worker core may look up old
cache table and receive a wrong memory region lkey value.

To fix this, updating 'dev_gen' should be moved before rebuilding global
cache to inform worker cores to flush their local cache when global
cache start rebuilding. And wmb can ensure the sequence of this process.

Fixes: 9797bfcce1c9 ("net/mlx4: add new memory region support")
Cc: stable@dpdk.org

Suggested-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/mlx4/mlx4_mr.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index 6b2f0cf18..cfd7d4a9c 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -946,20 +946,17 @@ mlx4_mr_mem_event_free_cb(struct rte_eth_dev *dev, const void *addr, size_t len)
 		rebuild = 1;
 	}
 	if (rebuild) {
-		mr_rebuild_dev_cache(dev);
-		/*
-		 * Flush local caches by propagating invalidation across cores.
-		 * rte_smp_wmb() is enough to synchronize this event. If one of
-		 * freed memsegs is seen by other core, that means the memseg
-		 * has been allocated by allocator, which will come after this
-		 * free call. Therefore, this store instruction (incrementing
-		 * generation below) will be guaranteed to be seen by other core
-		 * before the core sees the newly allocated memory.
-		 */
 		++priv->mr.dev_gen;
 		DEBUG("broadcasting local cache flush, gen=%d",
-		      priv->mr.dev_gen);
+			priv->mr.dev_gen);
+
+		/* Flush local caches by propagating invalidation across cores.
+		 * rte_smp_wmb is to keep the order that dev_gen updated before
+		 * rebuilding global cache. Therefore, other core can flush their
+		 * local cache on time.
+		 */
 		rte_smp_wmb();
+		mr_rebuild_dev_cache(dev);
 	}
 	rte_rwlock_write_unlock(&priv->mr.rwlock);
 #ifdef RTE_LIBRTE_MLX4_DEBUG
-- 
2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH v1 2/4] net/mlx4: replace SMP barrier with C11 barriers
  2021-03-18  7:18 [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx Feifei Wang
  2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 1/4] net/mlx4: fix rebuild bug for Memory Region cache Feifei Wang
@ 2021-03-18  7:18 ` Feifei Wang
  2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache Feifei Wang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Feifei Wang @ 2021-03-18  7:18 UTC (permalink / raw)
  To: Matan Azrad, Shahaf Shuler; +Cc: dev, nd, Feifei Wang, Ruifeng Wang

Replace SMP barrier with atomic thread fence.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/mlx4/mlx4_mr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index cfd7d4a9c..503e8a7bb 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -951,11 +951,11 @@ mlx4_mr_mem_event_free_cb(struct rte_eth_dev *dev, const void *addr, size_t len)
 			priv->mr.dev_gen);
 
 		/* Flush local caches by propagating invalidation across cores.
-		 * rte_smp_wmb is to keep the order that dev_gen updated before
+		 * release-fence is to keep the order that dev_gen updated before
 		 * rebuilding global cache. Therefore, other core can flush their
 		 * local cache on time.
 		 */
-		rte_smp_wmb();
+		rte_atomic_thread_fence(__ATOMIC_RELEASE);
 		mr_rebuild_dev_cache(dev);
 	}
 	rte_rwlock_write_unlock(&priv->mr.rwlock);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache
  2021-03-18  7:18 [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx Feifei Wang
  2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 1/4] net/mlx4: fix rebuild bug for Memory Region cache Feifei Wang
  2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 2/4] net/mlx4: replace SMP barrier with C11 barriers Feifei Wang
@ 2021-03-18  7:18 ` Feifei Wang
  2021-04-12  8:27   ` Slava Ovsiienko
  2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 4/4] net/mlx5: replace SMP barriers with C11 barriers Feifei Wang
  2021-04-07  1:45 ` [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx Alexander Kozyrev
  4 siblings, 1 reply; 8+ messages in thread
From: Feifei Wang @ 2021-03-18  7:18 UTC (permalink / raw)
  To: Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko, Yongseok Koh
  Cc: dev, nd, Feifei Wang, stable, Ruifeng Wang

'dev_gen' is a variable to inform other cores to flush their local cache
when global cache is rebuilt.

However, if 'dev_gen' is updated after global cache is rebuilt, other
cores may load a wrong memory region lkey value from old local cache.

Timeslot        main core               worker core
  1         rebuild global cache
  2                                  load unchanged dev_gen
  3            update dev_gen
  4                                  look up old local cache

From the example above, we can see that though global cache is rebuilt,
due to that dev_gen is not updated, the worker core may look up old
cache table and receive a wrong memory region lkey value.

To fix this, updating 'dev_gen' should be moved before rebuilding global
cache to inform worker cores to flush their local cache when global
cache start rebuilding. And wmb can ensure the sequence of this process.

Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Suggested-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/mlx5/mlx5_mr.c | 37 +++++++++++++++++--------------------
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index da4e91fc2..7ce1d3e64 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -103,20 +103,18 @@ mlx5_mr_mem_event_free_cb(struct mlx5_dev_ctx_shared *sh,
 		rebuild = 1;
 	}
 	if (rebuild) {
-		mlx5_mr_rebuild_cache(&sh->share_cache);
+		++sh->share_cache.dev_gen;
+		DEBUG("broadcasting local cache flush, gen=%d",
+			sh->share_cache.dev_gen);
+
 		/*
 		 * Flush local caches by propagating invalidation across cores.
-		 * rte_smp_wmb() is enough to synchronize this event. If one of
-		 * freed memsegs is seen by other core, that means the memseg
-		 * has been allocated by allocator, which will come after this
-		 * free call. Therefore, this store instruction (incrementing
-		 * generation below) will be guaranteed to be seen by other core
-		 * before the core sees the newly allocated memory.
+		 * rte_smp_wmb() is to keep the order that dev_gen updated before
+		 * rebuilding global cache. Therefore, other core can flush their
+		 * local cache on time.
 		 */
-		++sh->share_cache.dev_gen;
-		DEBUG("broadcasting local cache flush, gen=%d",
-		      sh->share_cache.dev_gen);
 		rte_smp_wmb();
+		mlx5_mr_rebuild_cache(&sh->share_cache);
 	}
 	rte_rwlock_write_unlock(&sh->share_cache.rwlock);
 }
@@ -407,20 +405,19 @@ mlx5_dma_unmap(struct rte_pci_device *pdev, void *addr,
 	mlx5_mr_free(mr, sh->share_cache.dereg_mr_cb);
 	DEBUG("port %u remove MR(%p) from list", dev->data->port_id,
 	      (void *)mr);
-	mlx5_mr_rebuild_cache(&sh->share_cache);
+
+	++sh->share_cache.dev_gen;
+	DEBUG("broadcasting local cache flush, gen=%d",
+		sh->share_cache.dev_gen);
+
 	/*
 	 * Flush local caches by propagating invalidation across cores.
-	 * rte_smp_wmb() is enough to synchronize this event. If one of
-	 * freed memsegs is seen by other core, that means the memseg
-	 * has been allocated by allocator, which will come after this
-	 * free call. Therefore, this store instruction (incrementing
-	 * generation below) will be guaranteed to be seen by other core
-	 * before the core sees the newly allocated memory.
+	 * rte_smp_wmb() is to keep the order that dev_gen updated before
+	 * rebuilding global cache. Therefore, other core can flush their
+	 * local cache on time.
 	 */
-	++sh->share_cache.dev_gen;
-	DEBUG("broadcasting local cache flush, gen=%d",
-	      sh->share_cache.dev_gen);
 	rte_smp_wmb();
+	mlx5_mr_rebuild_cache(&sh->share_cache);
 	rte_rwlock_read_unlock(&sh->share_cache.rwlock);
 	return 0;
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH v1 4/4] net/mlx5: replace SMP barriers with C11 barriers
  2021-03-18  7:18 [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx Feifei Wang
                   ` (2 preceding siblings ...)
  2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache Feifei Wang
@ 2021-03-18  7:18 ` Feifei Wang
  2021-04-07  1:45 ` [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx Alexander Kozyrev
  4 siblings, 0 replies; 8+ messages in thread
From: Feifei Wang @ 2021-03-18  7:18 UTC (permalink / raw)
  To: Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko
  Cc: dev, nd, Feifei Wang, Ruifeng Wang, Honnappa Nagarahalli

Replace SMP barrier with atomic thread fence.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
---
 drivers/net/mlx5/mlx5_mr.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 7ce1d3e64..650fe9093 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -109,11 +109,11 @@ mlx5_mr_mem_event_free_cb(struct mlx5_dev_ctx_shared *sh,
 
 		/*
 		 * Flush local caches by propagating invalidation across cores.
-		 * rte_smp_wmb() is to keep the order that dev_gen updated before
+		 * release-fence is to keep the order that dev_gen updated before
 		 * rebuilding global cache. Therefore, other core can flush their
 		 * local cache on time.
 		 */
-		rte_smp_wmb();
+		rte_atomic_thread_fence(__ATOMIC_RELEASE);
 		mlx5_mr_rebuild_cache(&sh->share_cache);
 	}
 	rte_rwlock_write_unlock(&sh->share_cache.rwlock);
@@ -412,11 +412,11 @@ mlx5_dma_unmap(struct rte_pci_device *pdev, void *addr,
 
 	/*
 	 * Flush local caches by propagating invalidation across cores.
-	 * rte_smp_wmb() is to keep the order that dev_gen updated before
+	 * release-fence is to keep the order that dev_gen updated before
 	 * rebuilding global cache. Therefore, other core can flush their
 	 * local cache on time.
 	 */
-	rte_smp_wmb();
+	rte_atomic_thread_fence(__ATOMIC_RELEASE);
 	mlx5_mr_rebuild_cache(&sh->share_cache);
 	rte_rwlock_read_unlock(&sh->share_cache.rwlock);
 	return 0;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx
  2021-03-18  7:18 [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx Feifei Wang
                   ` (3 preceding siblings ...)
  2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 4/4] net/mlx5: replace SMP barriers with C11 barriers Feifei Wang
@ 2021-04-07  1:45 ` Alexander Kozyrev
  4 siblings, 0 replies; 8+ messages in thread
From: Alexander Kozyrev @ 2021-04-07  1:45 UTC (permalink / raw)
  To: Feifei Wang; +Cc: dev, nd

Looks good to me. Thank you for fixing this bug for us.

Reviewed-by: Alexander Kozyrev <akozyrev@nvidia.com>

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Feifei Wang
> Sent: Thursday, March 18, 2021 3:19
> Cc: dev@dpdk.org; nd@arm.com; Feifei Wang <feifei.wang2@arm.com>
> Subject: [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx
> 
> For net/mlx4 and net/mlx5, fix cache rebuild bug and replace SMP
> barriers with atomic fence.
> 
> Feifei Wang (4):
>   net/mlx4: fix rebuild bug for Memory Region cache
>   net/mlx4: replace SMP barrier with C11 barriers
>   net/mlx5: fix rebuild bug for Memory Region cache
>   net/mlx5: replace SMP barriers with C11 barriers
> 
>  drivers/net/mlx4/mlx4_mr.c | 21 +++++++++----------
>  drivers/net/mlx5/mlx5_mr.c | 41 ++++++++++++++++++--------------------
>  2 files changed, 28 insertions(+), 34 deletions(-)
> 
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache
  2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache Feifei Wang
@ 2021-04-12  8:27   ` Slava Ovsiienko
  2021-04-13  5:20     ` [dpdk-dev] 回复: " Feifei Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Slava Ovsiienko @ 2021-04-12  8:27 UTC (permalink / raw)
  To: Feifei Wang, Matan Azrad, Shahaf Shuler, Yongseok Koh
  Cc: dev, nd, stable, Ruifeng Wang

Hi, Feifei

Sorry, I do not follow what this patch fixes. Do we have some issue/bug with MR cache in practice?

Each Tx queue has its own dedicated "local" cache for MRs to convert buffer address in mbufs being transmitted
to LKeys (HW-related entity handle) and the "global" cache for all MR registered on the device.

AFAIK, how conversion happens in datapath:
- check the local queue cache flush request
- lookup in local cache
- if not found:
- acquire lock for global cache read access
- lookup in global cache
- release lock for global cache

How cache update on memory freeing/unregistering happens:
- acquire lock for global cache write access 
- [a] remove relevant MRs from the global cache
- [b] set local caches flush request 
- free global cache lock

If I understand correctly, your patch swaps [a] and [b],
and local caches flush is requested earlier. What problem does it solve?
It is not supposed there are in datapath some mbufs referencing
to the memory being freed. Application must ensure this and must not allocate
new mbufs from this memory regions being freed. Hence, the lookups for these
MRs in caches should not occur.

For other side, the cache flush has negative effect - the local cache is
getting empty and can't provide translation for other valid (not being removed) MRs,
and the translation has to look up in the global cache, that is locked now
for rebuilding, this causes the delays in datapatch on acquiring global cache lock.
So, I see some potential performance impact. 

With best regards,
Slava

> -----Original Message-----
> From: Feifei Wang <feifei.wang2@arm.com>
> Sent: Thursday, March 18, 2021 9:19
> To: Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>;
> Yongseok Koh <yskoh@mellanox.com>
> Cc: dev@dpdk.org; nd@arm.com; Feifei Wang <feifei.wang2@arm.com>;
> stable@dpdk.org; Ruifeng Wang <ruifeng.wang@arm.com>
> Subject: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache
> 
> 'dev_gen' is a variable to inform other cores to flush their local cache when
> global cache is rebuilt.
> 
> However, if 'dev_gen' is updated after global cache is rebuilt, other cores
> may load a wrong memory region lkey value from old local cache.
> 
> Timeslot        main core               worker core
>   1         rebuild global cache
>   2                                  load unchanged dev_gen
>   3            update dev_gen
>   4                                  look up old local cache
> 
> From the example above, we can see that though global cache is rebuilt, due
> to that dev_gen is not updated, the worker core may look up old cache table
> and receive a wrong memory region lkey value.
> 
> To fix this, updating 'dev_gen' should be moved before rebuilding global
> cache to inform worker cores to flush their local cache when global cache
> start rebuilding. And wmb can ensure the sequence of this process.
> 
> Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> Cc: stable@dpdk.org
> 
> Suggested-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  drivers/net/mlx5/mlx5_mr.c | 37 +++++++++++++++++--------------------
>  1 file changed, 17 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c index
> da4e91fc2..7ce1d3e64 100644
> --- a/drivers/net/mlx5/mlx5_mr.c
> +++ b/drivers/net/mlx5/mlx5_mr.c
> @@ -103,20 +103,18 @@ mlx5_mr_mem_event_free_cb(struct
> mlx5_dev_ctx_shared *sh,
>  		rebuild = 1;
>  	}
>  	if (rebuild) {
> -		mlx5_mr_rebuild_cache(&sh->share_cache);
> +		++sh->share_cache.dev_gen;
> +		DEBUG("broadcasting local cache flush, gen=%d",
> +			sh->share_cache.dev_gen);
> +
>  		/*
>  		 * Flush local caches by propagating invalidation across cores.
> -		 * rte_smp_wmb() is enough to synchronize this event. If
> one of
> -		 * freed memsegs is seen by other core, that means the
> memseg
> -		 * has been allocated by allocator, which will come after this
> -		 * free call. Therefore, this store instruction (incrementing
> -		 * generation below) will be guaranteed to be seen by other
> core
> -		 * before the core sees the newly allocated memory.
> +		 * rte_smp_wmb() is to keep the order that dev_gen
> updated before
> +		 * rebuilding global cache. Therefore, other core can flush
> their
> +		 * local cache on time.
>  		 */
> -		++sh->share_cache.dev_gen;
> -		DEBUG("broadcasting local cache flush, gen=%d",
> -		      sh->share_cache.dev_gen);
>  		rte_smp_wmb();
> +		mlx5_mr_rebuild_cache(&sh->share_cache);
>  	}
>  	rte_rwlock_write_unlock(&sh->share_cache.rwlock);
>  }
> @@ -407,20 +405,19 @@ mlx5_dma_unmap(struct rte_pci_device *pdev,
> void *addr,
>  	mlx5_mr_free(mr, sh->share_cache.dereg_mr_cb);
>  	DEBUG("port %u remove MR(%p) from list", dev->data->port_id,
>  	      (void *)mr);
> -	mlx5_mr_rebuild_cache(&sh->share_cache);
> +
> +	++sh->share_cache.dev_gen;
> +	DEBUG("broadcasting local cache flush, gen=%d",
> +		sh->share_cache.dev_gen);
> +
>  	/*
>  	 * Flush local caches by propagating invalidation across cores.
> -	 * rte_smp_wmb() is enough to synchronize this event. If one of
> -	 * freed memsegs is seen by other core, that means the memseg
> -	 * has been allocated by allocator, which will come after this
> -	 * free call. Therefore, this store instruction (incrementing
> -	 * generation below) will be guaranteed to be seen by other core
> -	 * before the core sees the newly allocated memory.
> +	 * rte_smp_wmb() is to keep the order that dev_gen updated
> before
> +	 * rebuilding global cache. Therefore, other core can flush their
> +	 * local cache on time.
>  	 */
> -	++sh->share_cache.dev_gen;
> -	DEBUG("broadcasting local cache flush, gen=%d",
> -	      sh->share_cache.dev_gen);
>  	rte_smp_wmb();
> +	mlx5_mr_rebuild_cache(&sh->share_cache);
>  	rte_rwlock_read_unlock(&sh->share_cache.rwlock);
>  	return 0;
>  }
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] 回复: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache
  2021-04-12  8:27   ` Slava Ovsiienko
@ 2021-04-13  5:20     ` Feifei Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Feifei Wang @ 2021-04-13  5:20 UTC (permalink / raw)
  To: Slava Ovsiienko, Matan Azrad, Shahaf Shuler, yskoh
  Cc: dev, nd, stable, Ruifeng Wang, nd

Hi, Slava

Thanks very much for your attention.





Best Regards
Feifei

> -----邮件原件-----
> 发件人: Slava Ovsiienko <viacheslavo@nvidia.com>
> 发送时间: 2021年4月12日 16:28
> 收件人: Feifei Wang <Feifei.Wang2@arm.com>; Matan Azrad
> <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>;
> yskoh@mellanox.com
> 抄送: dev@dpdk.org; nd <nd@arm.com>; stable@dpdk.org; Ruifeng Wang
> <Ruifeng.Wang@arm.com>
> 主题: RE: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache
> 
> Hi, Feifei
> 
> Sorry, I do not follow what this patch fixes. Do we have some issue/bug with
> MR cache in practice?

This patch fixes the bug which is based on logical deduction, 
and it doesn't actually happen.

> 
> Each Tx queue has its own dedicated "local" cache for MRs to convert buffer
> address in mbufs being transmitted to LKeys (HW-related entity handle) and
> the "global" cache for all MR registered on the device.
> 
> AFAIK, how conversion happens in datapath:
> - check the local queue cache flush request
> - lookup in local cache
> - if not found:
> - acquire lock for global cache read access
> - lookup in global cache
> - release lock for global cache
> 
> How cache update on memory freeing/unregistering happens:
> - acquire lock for global cache write access
> - [a] remove relevant MRs from the global cache
> - [b] set local caches flush request
> - free global cache lock
> 
> If I understand correctly, your patch swaps [a] and [b], and local caches flush
> is requested earlier. What problem does it solve?
> It is not supposed there are in datapath some mbufs referencing to the
> memory being freed. Application must ensure this and must not allocate new
> mbufs from this memory regions being freed. Hence, the lookups for these
> MRs in caches should not occur.

For your first point that, application can take charge of preventing MR freed memory
being allocated to data path.

Does it means that If there is an emergency of MR fragment, such as hotplug, the application
must inform thedata path in advance, and this memory will not be allocated, and then the
control path will free this memory? If application  can do like this, I agree that this bug
cannot happen.

> For other side, the cache flush has negative effect - the local cache is getting
> empty and can't provide translation for other valid (not being removed) MRs,
> and the translation has to look up in the global cache, that is locked now for
> rebuilding, this causes the delays in datapatch on acquiring global cache lock.
> So, I see some potential performance impact.

If above assumption is true, we can go to your second point. I think this is a problem
of the tradeoff between cache coherence and performance.  

I can understand your meaning that though global cache has been changed, we should 
keep the valid MR in local cache as long as possible to ensure the fast searching speed. 
In the meanwhile, the local cache can be rebuilt later to reduce its waiting time for
acquiring the global cache lock.

However,  this mechanism just ensures the performance unchanged for  the first few mbufs. 
During the next mbufs lkey searching after 'dev_gen' updated, it is still necessary to update
the local cache. And the performance can firstly reduce and then returns. Thus, no matter
whether there is this patch or not,  the performance will jitter in a certain period of time. 

Finally, in conclusion, I tend to think that the bottom layer can do more things to ensure
the correct execution of the program, which may have a negative impact on the performance in
a short time, but in the long run, the performance will eventually come back.  Furthermore,
maybe we should pay attention to the performance in the stable period, and try our best to ensure the
correctness of the program in case of emergencies.

Best Regards
Feifei

> With best regards,
> Slava
> 
> > -----Original Message-----
> > From: Feifei Wang <feifei.wang2@arm.com>
> > Sent: Thursday, March 18, 2021 9:19
> > To: Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> > <shahafs@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>;
> > Yongseok Koh <yskoh@mellanox.com>
> > Cc: dev@dpdk.org; nd@arm.com; Feifei Wang <feifei.wang2@arm.com>;
> > stable@dpdk.org; Ruifeng Wang <ruifeng.wang@arm.com>
> > Subject: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region
> > cache
> >
> > 'dev_gen' is a variable to inform other cores to flush their local
> > cache when global cache is rebuilt.
> >
> > However, if 'dev_gen' is updated after global cache is rebuilt, other
> > cores may load a wrong memory region lkey value from old local cache.
> >
> > Timeslot        main core               worker core
> >   1         rebuild global cache
> >   2                                  load unchanged dev_gen
> >   3            update dev_gen
> >   4                                  look up old local cache
> >
> > From the example above, we can see that though global cache is
> > rebuilt, due to that dev_gen is not updated, the worker core may look
> > up old cache table and receive a wrong memory region lkey value.
> >
> > To fix this, updating 'dev_gen' should be moved before rebuilding
> > global cache to inform worker cores to flush their local cache when
> > global cache start rebuilding. And wmb can ensure the sequence of this
> process.
> >
> > Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> > Cc: stable@dpdk.org
> >
> > Suggested-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> >  drivers/net/mlx5/mlx5_mr.c | 37 +++++++++++++++++--------------------
> >  1 file changed, 17 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> > index
> > da4e91fc2..7ce1d3e64 100644
> > --- a/drivers/net/mlx5/mlx5_mr.c
> > +++ b/drivers/net/mlx5/mlx5_mr.c
> > @@ -103,20 +103,18 @@ mlx5_mr_mem_event_free_cb(struct
> > mlx5_dev_ctx_shared *sh,
> >  		rebuild = 1;
> >  	}
> >  	if (rebuild) {
> > -		mlx5_mr_rebuild_cache(&sh->share_cache);
> > +		++sh->share_cache.dev_gen;
> > +		DEBUG("broadcasting local cache flush, gen=%d",
> > +			sh->share_cache.dev_gen);
> > +
> >  		/*
> >  		 * Flush local caches by propagating invalidation across cores.
> > -		 * rte_smp_wmb() is enough to synchronize this event. If
> > one of
> > -		 * freed memsegs is seen by other core, that means the
> > memseg
> > -		 * has been allocated by allocator, which will come after this
> > -		 * free call. Therefore, this store instruction (incrementing
> > -		 * generation below) will be guaranteed to be seen by other
> > core
> > -		 * before the core sees the newly allocated memory.
> > +		 * rte_smp_wmb() is to keep the order that dev_gen
> > updated before
> > +		 * rebuilding global cache. Therefore, other core can flush
> > their
> > +		 * local cache on time.
> >  		 */
> > -		++sh->share_cache.dev_gen;
> > -		DEBUG("broadcasting local cache flush, gen=%d",
> > -		      sh->share_cache.dev_gen);
> >  		rte_smp_wmb();
> > +		mlx5_mr_rebuild_cache(&sh->share_cache);
> >  	}
> >  	rte_rwlock_write_unlock(&sh->share_cache.rwlock);
> >  }
> > @@ -407,20 +405,19 @@ mlx5_dma_unmap(struct rte_pci_device *pdev,
> void
> > *addr,
> >  	mlx5_mr_free(mr, sh->share_cache.dereg_mr_cb);
> >  	DEBUG("port %u remove MR(%p) from list", dev->data->port_id,
> >  	      (void *)mr);
> > -	mlx5_mr_rebuild_cache(&sh->share_cache);
> > +
> > +	++sh->share_cache.dev_gen;
> > +	DEBUG("broadcasting local cache flush, gen=%d",
> > +		sh->share_cache.dev_gen);
> > +
> >  	/*
> >  	 * Flush local caches by propagating invalidation across cores.
> > -	 * rte_smp_wmb() is enough to synchronize this event. If one of
> > -	 * freed memsegs is seen by other core, that means the memseg
> > -	 * has been allocated by allocator, which will come after this
> > -	 * free call. Therefore, this store instruction (incrementing
> > -	 * generation below) will be guaranteed to be seen by other core
> > -	 * before the core sees the newly allocated memory.
> > +	 * rte_smp_wmb() is to keep the order that dev_gen updated
> > before
> > +	 * rebuilding global cache. Therefore, other core can flush their
> > +	 * local cache on time.
> >  	 */
> > -	++sh->share_cache.dev_gen;
> > -	DEBUG("broadcasting local cache flush, gen=%d",
> > -	      sh->share_cache.dev_gen);
> >  	rte_smp_wmb();
> > +	mlx5_mr_rebuild_cache(&sh->share_cache);
> >  	rte_rwlock_read_unlock(&sh->share_cache.rwlock);
> >  	return 0;
> >  }
> > --
> > 2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-04-13  5:20 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-18  7:18 [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx Feifei Wang
2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 1/4] net/mlx4: fix rebuild bug for Memory Region cache Feifei Wang
2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 2/4] net/mlx4: replace SMP barrier with C11 barriers Feifei Wang
2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache Feifei Wang
2021-04-12  8:27   ` Slava Ovsiienko
2021-04-13  5:20     ` [dpdk-dev] 回复: " Feifei Wang
2021-03-18  7:18 ` [dpdk-dev] [PATCH v1 4/4] net/mlx5: replace SMP barriers with C11 barriers Feifei Wang
2021-04-07  1:45 ` [dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx Alexander Kozyrev

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror http://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ http://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git