patches for DPDK stable branches
* [dpdk-stable] [PATCH v2] net/i40e: add logic of processing continuous DD bits for Arm
       [not found] <20210604073405.14880-1-joyce.kong@arm.com>
@ 2021-06-23  8:43 ` Joyce Kong
  2021-06-30  1:14   ` Honnappa Nagarahalli
  2021-07-06  6:54 ` [dpdk-stable] [PATCH v3 0/2] fixes for i40e hw scan ring Joyce Kong
  1 sibling, 1 reply; 12+ messages in thread
From: Joyce Kong @ 2021-06-23  8:43 UTC (permalink / raw)
  To: beilei.xing, qi.z.zhang, ruifeng.wang, honnappa.nagarahalli,
	bruce.richardson, helin.zhang
  Cc: dev, stable, nd

For Arm platforms, reading descs can get re-ordered, then the
status of DD bits will be discontinuous, so add the logic to
only process continuous descs by checking DD bits.

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Cc: stable@dpdk.org

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/i40e/i40e_rxtx.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 6c58decec..86e2f083e 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -452,7 +452,7 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
 	uint16_t pkt_len;
 	uint64_t qword1;
 	uint32_t rx_status;
-	int32_t s[I40E_LOOK_AHEAD], nb_dd;
+	int32_t s[I40E_LOOK_AHEAD], var, nb_dd;
 	int32_t i, j, nb_rx = 0;
 	uint64_t pkt_flags;
 	uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl;
@@ -482,11 +482,22 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
 					I40E_RXD_QW1_STATUS_SHIFT;
 		}
 
-		rte_smp_rmb();
+		/* This barrier is to order loads of different words in the descriptor */
+		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
 
 		/* Compute how many status bits were set */
-		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++)
-			nb_dd += s[j] & (1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++) {
+			var = s[j] & (1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+#ifdef RTE_ARCH_ARM
+			/* For Arm platforms, only compute continuous status bits */
+			if (var)
+				nb_dd += 1;
+			else
+				break;
+#else
+			nb_dd += var;
+#endif
+		}
 
 		nb_rx += nb_dd;
 
-- 
2.17.1



* Re: [dpdk-stable] [PATCH v2] net/i40e: add logic of processing continuous DD bits for Arm
  2021-06-23  8:43 ` [dpdk-stable] [PATCH v2] net/i40e: add logic of processing continuous DD bits for Arm Joyce Kong
@ 2021-06-30  1:14   ` Honnappa Nagarahalli
  2021-07-05  3:41     ` Joyce Kong
  0 siblings, 1 reply; 12+ messages in thread
From: Honnappa Nagarahalli @ 2021-06-30  1:14 UTC (permalink / raw)
  To: Joyce Kong, beilei.xing, qi.z.zhang, Ruifeng Wang,
	bruce.richardson, helin.zhang
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

<snip>
> 
> For Arm platforms, reading descs can get re-ordered, then the status of DD
> bits will be discontinuous, so add the logic to only process continuous descs
> by checking DD bits.
> 
> Fixes: 4861cde46116 ("i40e: new poll mode driver")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  drivers/net/i40e/i40e_rxtx.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> index 6c58decec..86e2f083e 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -452,7 +452,7 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
>  	uint16_t pkt_len;
>  	uint64_t qword1;
>  	uint32_t rx_status;
> -	int32_t s[I40E_LOOK_AHEAD], nb_dd;
> +	int32_t s[I40E_LOOK_AHEAD], var, nb_dd;
>  	int32_t i, j, nb_rx = 0;
>  	uint64_t pkt_flags;
>  	uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl;
> @@ -482,11 +482,22 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
>  					I40E_RXD_QW1_STATUS_SHIFT;
>  		}
> 
> -		rte_smp_rmb();
> +		/* This barrier is to order loads of different words in the descriptor */
> +		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
I think this should go into a separate commit as the following change is unrelated.

> 
>  		/* Compute how many status bits were set */
> -		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++)
> -			nb_dd += s[j] & (1 << I40E_RX_DESC_STATUS_DD_SHIFT);
> +		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++) {
> +			var = s[j] & (1 << I40E_RX_DESC_STATUS_DD_SHIFT);
> +#ifdef RTE_ARCH_ARM
> +			/* For Arm platforms, only compute continuous status bits */
> +			if (var)
> +				nb_dd += 1;
> +			else
> +				break;
> +#else
> +			nb_dd += var;
> +#endif
> +		}
> 
>  		nb_rx += nb_dd;
> 
> --
> 2.17.1



* Re: [dpdk-stable] [PATCH v2] net/i40e: add logic of processing continuous DD bits for Arm
  2021-06-30  1:14   ` Honnappa Nagarahalli
@ 2021-07-05  3:41     ` Joyce Kong
  0 siblings, 0 replies; 12+ messages in thread
From: Joyce Kong @ 2021-07-05  3:41 UTC (permalink / raw)
  To: Honnappa Nagarahalli, beilei.xing, qi.z.zhang, Ruifeng Wang,
	bruce.richardson, helin.zhang
  Cc: dev, stable, nd



> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Wednesday, June 30, 2021 9:15 AM
> To: Joyce Kong <Joyce.Kong@arm.com>; beilei.xing@intel.com;
> qi.z.zhang@intel.com; Ruifeng Wang <Ruifeng.Wang@arm.com>;
> bruce.richardson@intel.com; helin.zhang@intel.com
> Cc: dev@dpdk.org; stable@dpdk.org; nd <nd@arm.com>; Honnappa
> Nagarahalli <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> Subject: RE: [PATCH v2] net/i40e: add logic of processing continuous DD bits
> for Arm
> 
> <snip>
> >
> > For Arm platforms, reading descs can get re-ordered, then the status
> > of DD bits will be discontinuous, so add the logic to only process
> > continuous descs by checking DD bits.
> >
> > Fixes: 4861cde46116 ("i40e: new poll mode driver")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> >  drivers/net/i40e/i40e_rxtx.c | 19 +++++++++++++++----
> >  1 file changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> > index 6c58decec..86e2f083e 100644
> > --- a/drivers/net/i40e/i40e_rxtx.c
> > +++ b/drivers/net/i40e/i40e_rxtx.c
> > @@ -452,7 +452,7 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
> >  	uint16_t pkt_len;
> >  	uint64_t qword1;
> >  	uint32_t rx_status;
> > -	int32_t s[I40E_LOOK_AHEAD], nb_dd;
> > +	int32_t s[I40E_LOOK_AHEAD], var, nb_dd;
> >  	int32_t i, j, nb_rx = 0;
> >  	uint64_t pkt_flags;
> >  	uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl;
> > @@ -482,11 +482,22 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
> >  					I40E_RXD_QW1_STATUS_SHIFT;
> >  		}
> >
> > -		rte_smp_rmb();
> > +		/* This barrier is to order loads of different words in the descriptor */
> > +		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> I think this should go into a separate commit as the following change is
> unrelated.

Will separate the two changes in v3.

> 
> >
> >  		/* Compute how many status bits were set */
> > -		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++)
> > -			nb_dd += s[j] & (1 << I40E_RX_DESC_STATUS_DD_SHIFT);
> > +		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++) {
> > +			var = s[j] & (1 << I40E_RX_DESC_STATUS_DD_SHIFT);
> > +#ifdef RTE_ARCH_ARM
> > +			/* For Arm platforms, only compute continuous status bits */
> > +			if (var)
> > +				nb_dd += 1;
> > +			else
> > +				break;
> > +#else
> > +			nb_dd += var;
> > +#endif
> > +		}
> >
> >  		nb_rx += nb_dd;
> >
> > --
> > 2.17.1



* [dpdk-stable] [PATCH v3 0/2] fixes for i40e hw scan ring
       [not found] <20210604073405.14880-1-joyce.kong@arm.com>
  2021-06-23  8:43 ` [dpdk-stable] [PATCH v2] net/i40e: add logic of processing continuous DD bits for Arm Joyce Kong
@ 2021-07-06  6:54 ` Joyce Kong
  2021-07-06  6:54   ` [dpdk-stable] [PATCH v3 1/2] net/i40e: add logic of processing continuous DD bits for Arm Joyce Kong
  2021-07-06  6:54   ` [dpdk-stable] [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence Joyce Kong
  1 sibling, 2 replies; 12+ messages in thread
From: Joyce Kong @ 2021-07-06  6:54 UTC (permalink / raw)
  To: beilei.xing, qi.z.zhang, ruifeng.wang, honnappa.nagarahalli,
	bruce.richardson, helin.zhang
  Cc: dev, stable, nd

This patchset contains two changes for the i40e PMD: one adds
logic to process only continuous DD bits on Arm platforms, and
the other replaces the SMP barrier with a thread fence.
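
For readers skimming the thread, the difference between the two counting
strategies can be sketched outside the driver as follows. This is only an
illustration under stated assumptions: LOOK_AHEAD and DD_SHIFT are
hypothetical stand-ins for I40E_LOOK_AHEAD and
I40E_RX_DESC_STATUS_DD_SHIFT, and both helper functions are invented for
this example; they are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's constants (assumed values). */
#define LOOK_AHEAD 8
#define DD_SHIFT   0

/* Count descriptors whose DD bit is set, stopping at the first gap. */
int count_contiguous_dd(const int32_t *s, int n)
{
	int j, nb_dd = 0;

	assert(s != NULL && n >= 0);
	for (j = 0; j < n; j++) {
		if (!(s[j] & (1 << DD_SHIFT)))
			break;
		nb_dd++;
	}
	return nb_dd;
}

/* Count every set DD bit, including ones that appear after a gap. */
int count_all_dd(const int32_t *s, int n)
{
	int j, nb_dd = 0;

	assert(s != NULL && n >= 0);
	for (j = 0; j < n; j++)
		nb_dd += (s[j] >> DD_SHIFT) & 1;
	return nb_dd;
}
```

With the status words {1, 1, 0, 1, 1, 0, 0, 0}, the contiguous count is 2
while the plain sum is 4. A caller that trusted the sum would treat the
third descriptor as done even though its DD bit was not observed.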

v3:
 Separate the commit changes into two parts. <Honnappa Nagarahalli>

v2:
 Only add the compile option for Arm and keep X86 intact. <Qi Zhang> 

v1:
 The initial version.

Joyce Kong (2):
  net/i40e: add logic of processing continuous DD bits for Arm
  net/i40e: replace SMP barrier with thread fence

 drivers/net/i40e/i40e_rxtx.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

-- 
2.17.1



* [dpdk-stable] [PATCH v3 1/2] net/i40e: add logic of processing continuous DD bits for Arm
  2021-07-06  6:54 ` [dpdk-stable] [PATCH v3 0/2] fixes for i40e hw scan ring Joyce Kong
@ 2021-07-06  6:54   ` Joyce Kong
  2021-07-09  3:05     ` Zhang, Qi Z
  2021-07-06  6:54   ` [dpdk-stable] [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence Joyce Kong
  1 sibling, 1 reply; 12+ messages in thread
From: Joyce Kong @ 2021-07-06  6:54 UTC (permalink / raw)
  To: beilei.xing, qi.z.zhang, ruifeng.wang, honnappa.nagarahalli,
	bruce.richardson, helin.zhang
  Cc: dev, stable, nd

For Arm platforms, reading descs can get re-ordered, then the
status of DD bits will be discontinuous, so add the logic to
only process continuous descs by checking DD bits.

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Cc: stable@dpdk.org

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/i40e/i40e_rxtx.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 6c58decec..9aaabfd92 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -452,7 +452,7 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
 	uint16_t pkt_len;
 	uint64_t qword1;
 	uint32_t rx_status;
-	int32_t s[I40E_LOOK_AHEAD], nb_dd;
+	int32_t s[I40E_LOOK_AHEAD], var, nb_dd;
 	int32_t i, j, nb_rx = 0;
 	uint64_t pkt_flags;
 	uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl;
@@ -485,8 +485,18 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
 		rte_smp_rmb();
 
 		/* Compute how many status bits were set */
-		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++)
-			nb_dd += s[j] & (1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++) {
+			var = s[j] & (1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+#ifdef RTE_ARCH_ARM
+			/* For Arm platforms, only compute continuous status bits */
+			if (var)
+				nb_dd += 1;
+			else
+				break;
+#else
+			nb_dd += var;
+#endif
+		}
 
 		nb_rx += nb_dd;
 
-- 
2.17.1



* [dpdk-stable] [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence
  2021-07-06  6:54 ` [dpdk-stable] [PATCH v3 0/2] fixes for i40e hw scan ring Joyce Kong
  2021-07-06  6:54   ` [dpdk-stable] [PATCH v3 1/2] net/i40e: add logic of processing continuous DD bits for Arm Joyce Kong
@ 2021-07-06  6:54   ` Joyce Kong
  2021-07-08 12:09     ` Zhang, Qi Z
  2021-07-13  0:46     ` [dpdk-stable] " Zhang, Qi Z
  1 sibling, 2 replies; 12+ messages in thread
From: Joyce Kong @ 2021-07-06  6:54 UTC (permalink / raw)
  To: beilei.xing, qi.z.zhang, ruifeng.wang, honnappa.nagarahalli,
	bruce.richardson, helin.zhang
  Cc: dev, stable, nd

Simply replace the SMP barrier with atomic thread fence for
i40e hw ring scan, if there is no synchronization point.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/i40e/i40e_rxtx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 9aaabfd92..86e2f083e 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -482,7 +482,8 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
 					I40E_RXD_QW1_STATUS_SHIFT;
 		}
 
-		rte_smp_rmb();
+		/* This barrier is to order loads of different words in the descriptor */
+		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
 
 		/* Compute how many status bits were set */
 		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++) {
-- 
2.17.1



* Re: [dpdk-stable] [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence
  2021-07-06  6:54   ` [dpdk-stable] [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence Joyce Kong
@ 2021-07-08 12:09     ` Zhang, Qi Z
  2021-07-08 13:51       ` [dpdk-stable] [dpdk-dev] " Lance Richardson
  2021-07-13  0:46     ` [dpdk-stable] " Zhang, Qi Z
  1 sibling, 1 reply; 12+ messages in thread
From: Zhang, Qi Z @ 2021-07-08 12:09 UTC (permalink / raw)
  To: Joyce Kong, Xing, Beilei, ruifeng.wang, honnappa.nagarahalli,
	Richardson, Bruce, Zhang, Helin
  Cc: dev, stable, nd



> -----Original Message-----
> From: Joyce Kong <joyce.kong@arm.com>
> Sent: Tuesday, July 6, 2021 2:54 PM
> To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> ruifeng.wang@arm.com; honnappa.nagarahalli@arm.com; Richardson, Bruce
> <bruce.richardson@intel.com>; Zhang, Helin <helin.zhang@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org; nd@arm.com
> Subject: [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence
> 
> Simply replace the SMP barrier with atomic thread fence for i40e hw ring scan,
> if there is no synchronization point.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  drivers/net/i40e/i40e_rxtx.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> index 9aaabfd92..86e2f083e 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -482,7 +482,8 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
>  					I40E_RXD_QW1_STATUS_SHIFT;
>  		}
> 
> -		rte_smp_rmb();
> +		/* This barrier is to order loads of different words in the descriptor */
> +		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);

For x86, this actually replaces a compiler barrier with a memory fence, which may have a performance impact that needs additional resources to investigate.
So there are 2 options:
1. If you want this patch to be merged into DPDK 21.08, please make this change for Arm only.
2. You can wait for our update for x86, but I guess it will miss 21.08.

What do you think?

Btw, for patch 1/2, I think I can merge it independently, right?


> 
>  		/* Compute how many status bits were set */
>  		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++) {
> --
> 2.17.1



* Re: [dpdk-stable] [dpdk-dev] [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence
  2021-07-08 12:09     ` Zhang, Qi Z
@ 2021-07-08 13:51       ` Lance Richardson
  2021-07-08 14:26         ` Zhang, Qi Z
  0 siblings, 1 reply; 12+ messages in thread
From: Lance Richardson @ 2021-07-08 13:51 UTC (permalink / raw)
  To: Zhang, Qi Z
  Cc: Joyce Kong, Xing, Beilei, ruifeng.wang, honnappa.nagarahalli,
	Richardson, Bruce, Zhang, Helin, dev, stable, nd


On Thu, Jul 8, 2021 at 8:09 AM Zhang, Qi Z <qi.z.zhang@intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Joyce Kong <joyce.kong@arm.com>
> > Sent: Tuesday, July 6, 2021 2:54 PM
> > To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> > ruifeng.wang@arm.com; honnappa.nagarahalli@arm.com; Richardson, Bruce
> > <bruce.richardson@intel.com>; Zhang, Helin <helin.zhang@intel.com>
> > Cc: dev@dpdk.org; stable@dpdk.org; nd@arm.com
> > Subject: [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence
> >
> > Simply replace the SMP barrier with atomic thread fence for i40e hw ring scan,
> > if there is no synchronization point.
> >
> > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> >  drivers/net/i40e/i40e_rxtx.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> > index 9aaabfd92..86e2f083e 100644
> > --- a/drivers/net/i40e/i40e_rxtx.c
> > +++ b/drivers/net/i40e/i40e_rxtx.c
> > @@ -482,7 +482,8 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
> >                                       I40E_RXD_QW1_STATUS_SHIFT;
> >               }
> >
> > -             rte_smp_rmb();
> > +             /* This barrier is to order loads of different words in the descriptor */
> > +             rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
>
> For x86, this actually replaces a compiler barrier with a memory fence, which may have a performance impact that needs additional resources to investigate.

No memory fence instruction is generated for
__ATOMIC_ACQUIRE on x86 for any version of gcc
or clang that I've tried, based on experiments here:

    https://godbolt.org/z/Yxr1vGhKP
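
A minimal pair of functions for repeating this kind of comparison in a
compiler explorer might look like the sketch below. It is not the code
behind the link above; the function names are invented for illustration,
and the empty asm statement is assumed to be equivalent to the compiler
barrier that rte_smp_rmb() reduces to on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Variant A: compiler-only barrier; no instruction is emitted for it. */
uint64_t read_with_compiler_barrier(const volatile uint64_t *w)
{
	uint64_t first, second;

	assert(w != NULL);
	first = w[0];
	__asm__ volatile("" ::: "memory");
	second = w[1];
	return first + second;
}

/* Variant B: acquire fence through the GCC/Clang __atomic builtin. */
uint64_t read_with_acquire_fence(const volatile uint64_t *w)
{
	uint64_t first, second;

	assert(w != NULL);
	first = w[0];
	__atomic_thread_fence(__ATOMIC_ACQUIRE);
	second = w[1];
	return first + second;
}
```

Compiled with optimization for x86-64, the two functions would be
expected to produce the same instructions. For AArch64, variant B would
be expected to contain a load barrier (typically dmb ishld) between the
two loads.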


* Re: [dpdk-stable] [dpdk-dev] [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence
  2021-07-08 13:51       ` [dpdk-stable] [dpdk-dev] " Lance Richardson
@ 2021-07-08 14:26         ` Zhang, Qi Z
  2021-07-08 14:44           ` Honnappa Nagarahalli
  0 siblings, 1 reply; 12+ messages in thread
From: Zhang, Qi Z @ 2021-07-08 14:26 UTC (permalink / raw)
  To: Lance Richardson
  Cc: Joyce Kong, Xing, Beilei, ruifeng.wang, honnappa.nagarahalli,
	Richardson, Bruce, Zhang, Helin, dev, stable, nd



> -----Original Message-----
> From: Lance Richardson <lance.richardson@broadcom.com>
> Sent: Thursday, July 8, 2021 9:51 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: Joyce Kong <joyce.kong@arm.com>; Xing, Beilei <beilei.xing@intel.com>;
> ruifeng.wang@arm.com; honnappa.nagarahalli@arm.com; Richardson, Bruce
> <bruce.richardson@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
> dev@dpdk.org; stable@dpdk.org; nd@arm.com
> Subject: Re: [dpdk-dev] [PATCH v3 2/2] net/i40e: replace SMP barrier with
> thread fence
> 
> On Thu, Jul 8, 2021 at 8:09 AM Zhang, Qi Z <qi.z.zhang@intel.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Joyce Kong <joyce.kong@arm.com>
> > > Sent: Tuesday, July 6, 2021 2:54 PM
> > > To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> > > ruifeng.wang@arm.com; honnappa.nagarahalli@arm.com; Richardson, Bruce
> > > <bruce.richardson@intel.com>; Zhang, Helin <helin.zhang@intel.com>
> > > Cc: dev@dpdk.org; stable@dpdk.org; nd@arm.com
> > > Subject: [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence
> > >
> > > Simply replace the SMP barrier with atomic thread fence for i40e hw ring scan,
> > > if there is no synchronization point.
> > >
> > > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > ---
> > >  drivers/net/i40e/i40e_rxtx.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> > > index 9aaabfd92..86e2f083e 100644
> > > --- a/drivers/net/i40e/i40e_rxtx.c
> > > +++ b/drivers/net/i40e/i40e_rxtx.c
> > > @@ -482,7 +482,8 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
> > >                                       I40E_RXD_QW1_STATUS_SHIFT;
> > >               }
> > >
> > > -             rte_smp_rmb();
> > > +             /* This barrier is to order loads of different words in the descriptor */
> > > +             rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> >
> > For x86, this actually replaces a compiler barrier with a memory fence,
> > which may have a performance impact that needs additional resources to
> > investigate.
> 
> No memory fence instruction is generated for
> __ATOMIC_ACQUIRE on x86 for any version of gcc
> or clang that I've tried, based on experiments here:
> 
>     https://godbolt.org/z/Yxr1vGhKP

Nice tool!
I tried writing some dummy code with and without __atomic_thread_fence(__ATOMIC_ACQUIRE),
but I didn't see any difference in the generated assembly code. Does that mean __atomic_thread_fence(__ATOMIC_ACQUIRE) does nothing on x86?



* Re: [dpdk-stable] [dpdk-dev] [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence
  2021-07-08 14:26         ` Zhang, Qi Z
@ 2021-07-08 14:44           ` Honnappa Nagarahalli
  0 siblings, 0 replies; 12+ messages in thread
From: Honnappa Nagarahalli @ 2021-07-08 14:44 UTC (permalink / raw)
  To: Zhang, Qi Z, Lance Richardson
  Cc: Joyce Kong, Xing, Beilei, Ruifeng Wang, Richardson, Bruce, Zhang,
	Helin, dev, stable, nd, Honnappa Nagarahalli, nd

<snip>

> > > >
> > > > Simply replace the SMP barrier with atomic thread fence for i40e
> > > > hw ring scan, if there is no synchronization point.
> > > >
> > > > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > > ---
> > > >  drivers/net/i40e/i40e_rxtx.c | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> > > > index 9aaabfd92..86e2f083e 100644
> > > > --- a/drivers/net/i40e/i40e_rxtx.c
> > > > +++ b/drivers/net/i40e/i40e_rxtx.c
> > > > @@ -482,7 +482,8 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
> > > >                                       I40E_RXD_QW1_STATUS_SHIFT;
> > > >               }
> > > >
> > > > -             rte_smp_rmb();
> > > > +             /* This barrier is to order loads of different words in the descriptor */
> > > > +             rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> > >
> > > For x86, this actually replaces a compiler barrier with a memory
> > > fence, which may have a performance impact that needs additional
> > > resources to investigate.
> >
> > No memory fence instruction is generated for __ATOMIC_ACQUIRE on x86
> > for any version of gcc or clang that I've tried, based on experiments
> > here:
> >
> >     https://godbolt.org/z/Yxr1vGhKP
> 
> Nice tool!
> I tried writing some dummy code with and without
> __atomic_thread_fence(__ATOMIC_ACQUIRE),
> but I didn't see any difference in the generated assembly code. Does that mean
> __atomic_thread_fence(__ATOMIC_ACQUIRE) does nothing on x86?
Yes, no barrier instructions should be generated for x86. At the same time, it still acts as a compiler barrier.
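
To make the role of the fence concrete, here is a small sketch of the
load-ordering pattern the patch comment describes: read the status word,
then fence, then read the other words of the descriptor. The struct and
function are hypothetical and do not match the real i40e descriptor
layout. The sketch also relies on the common driver practice of pairing
a plain volatile load with an acquire fence, which is an assumption
rather than something the C11 memory model strictly guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-word descriptor; not the real i40e layout. */
struct mock_desc {
	volatile uint64_t addr_word;   /* payload, written by the device first */
	volatile uint64_t status_word; /* bit 0 plays the role of the DD bit */
};

/* Return 1 and store the payload in *out if the descriptor is done, else 0. */
int mock_desc_poll(const struct mock_desc *d, uint64_t *out)
{
	uint64_t status;

	assert(d != NULL && out != NULL);
	status = d->status_word;
	if (!(status & 1))
		return 0;

	/* Keep the payload load from being hoisted above the status load. */
	__atomic_thread_fence(__ATOMIC_ACQUIRE);

	*out = d->addr_word;
	return 1;
}
```

On x86 the fence only constrains the compiler. On a weakly ordered CPU
such as Arm it also prevents the hardware from satisfying the payload
load before the status load.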



* Re: [dpdk-stable] [PATCH v3 1/2] net/i40e: add logic of processing continuous DD bits for Arm
  2021-07-06  6:54   ` [dpdk-stable] [PATCH v3 1/2] net/i40e: add logic of processing continuous DD bits for Arm Joyce Kong
@ 2021-07-09  3:05     ` Zhang, Qi Z
  0 siblings, 0 replies; 12+ messages in thread
From: Zhang, Qi Z @ 2021-07-09  3:05 UTC (permalink / raw)
  To: Joyce Kong, Xing, Beilei, ruifeng.wang, honnappa.nagarahalli,
	Richardson, Bruce, Zhang, Helin
  Cc: dev, stable, nd



> -----Original Message-----
> From: Joyce Kong <joyce.kong@arm.com>
> Sent: Tuesday, July 6, 2021 2:54 PM
> To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> ruifeng.wang@arm.com; honnappa.nagarahalli@arm.com; Richardson, Bruce
> <bruce.richardson@intel.com>; Zhang, Helin <helin.zhang@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org; nd@arm.com
> Subject: [PATCH v3 1/2] net/i40e: add logic of processing continuous DD bits for
> Arm
> 
> For Arm platforms, reading descs can get re-ordered, then the status of DD
> bits will be discontinuous, so add the logic to only process continuous descs by
> checking DD bits.
> 
> Fixes: 4861cde46116 ("i40e: new poll mode driver")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>

Applied to dpdk-next-net-intel.

Thanks
Qi



* Re: [dpdk-stable] [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence
  2021-07-06  6:54   ` [dpdk-stable] [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence Joyce Kong
  2021-07-08 12:09     ` Zhang, Qi Z
@ 2021-07-13  0:46     ` Zhang, Qi Z
  1 sibling, 0 replies; 12+ messages in thread
From: Zhang, Qi Z @ 2021-07-13  0:46 UTC (permalink / raw)
  To: Joyce Kong, Xing, Beilei, ruifeng.wang, honnappa.nagarahalli,
	Richardson, Bruce, Zhang, Helin
  Cc: dev, stable, nd



> -----Original Message-----
> From: Joyce Kong <joyce.kong@arm.com>
> Sent: Tuesday, July 6, 2021 2:54 PM
> To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> ruifeng.wang@arm.com; honnappa.nagarahalli@arm.com; Richardson, Bruce
> <bruce.richardson@intel.com>; Zhang, Helin <helin.zhang@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org; nd@arm.com
> Subject: [PATCH v3 2/2] net/i40e: replace SMP barrier with thread fence
> 
> Simply replace the SMP barrier with atomic thread fence for i40e hw ring scan,
> if there is no synchronization point.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>

Acked-by: Qi Zhang <qi.z.zhang@intel.com>

Applied to dpdk-next-net-intel.

Thanks
Qi


