DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic
@ 2020-12-21 14:23 Joyce Kong
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier Joyce Kong
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Joyce Kong @ 2020-12-21 14:23 UTC (permalink / raw)
  To: maxime.coquelin, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd

This patchset is to replace rte smp barriers in virtio with C11 atomic
built-ins.

The rte_smp_*mb APIs provide full barrier functionality. However, many
use cases do not require full barriers. To support such use cases, DPDK
will adopt C11 barrier semantics and provide wrappers using C11 atomic
built-ins.[1]

With this patch set, under 0.001% acceptable loss with 2 cores on vhost
side and 1 core on virtio side, PVP case(vhost-user + virtio-user) has
6.7% perf uplift for the split in_order path on ThunderX2 platform.

[1] http://code.dpdk.org/dpdk/latest/source/doc/guides/rel_notes/deprecation.rst

Joyce Kong (4):
  net/virtio: remove unnecessary rmb barrier
  net/virtio: replace smp barrier with IO barrier
  net/virtio: replace full barrier with relaxed barrier for Arm platform
  net/virtio: replace full barrier with thread fence

 drivers/net/virtio/virtio_ethdev.c | 10 ++++-----
 drivers/net/virtio/virtqueue.h     | 35 +++++++++++++++---------------
 2 files changed, 22 insertions(+), 23 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier
  2020-12-21 14:23 [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Joyce Kong
@ 2020-12-21 14:23 ` Joyce Kong
  2021-01-07 14:12   ` Maxime Coquelin
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier Joyce Kong
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Joyce Kong @ 2020-12-21 14:23 UTC (permalink / raw)
  To: maxime.coquelin, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd

As desc_is_used has a load-acquire or rte_io_rmb inside
and wait for used desc in virtqueue, it is ok to remove
virtio_rmb behind it.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/virtio/virtio_ethdev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 6c233b75b..0d91f7a50 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -209,12 +209,12 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
 	virtio_wmb(vq->hw->weak_barriers);
 	virtqueue_notify(vq);
 
-	/* wait for used descriptors in virtqueue */
+	/* wait for used desc in virtqueue
+	 * desc_is_used has a load-acquire or rte_io_rmb inside
+	 */
 	while (!desc_is_used(&desc[head], vq))
 		usleep(100);
 
-	virtio_rmb(vq->hw->weak_barriers);
-
 	/* now get used descriptors */
 	vq->vq_free_cnt += nb_descs;
 	vq->vq_used_cons_idx += nb_descs;
-- 
2.29.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier
  2020-12-21 14:23 [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Joyce Kong
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier Joyce Kong
@ 2020-12-21 14:23 ` Joyce Kong
  2021-01-07 14:42   ` Maxime Coquelin
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform Joyce Kong
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Joyce Kong @ 2020-12-21 14:23 UTC (permalink / raw)
  To: maxime.coquelin, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd

Replace rte_smp_wmb/rmb with rte_io_wmb/rmb as they are the same on x86
and ppc platforms. Then, for function virtqueue_fetch_flags_packed/
virtqueue_store_flags_packed, the if and else branch are still identical
for the platforms except Arm.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/virtio/virtqueue.h | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 42c4c9882..ac3d9e750 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -66,16 +66,15 @@ virtqueue_fetch_flags_packed(struct vring_packed_desc *dp,
 	uint16_t flags;
 
 	if (weak_barriers) {
-/* x86 prefers to using rte_smp_rmb over __atomic_load_n as it reports
+/* x86 prefers to using rte_io_rmb over __atomic_load_n as it reports
  * a better perf(~1.5%), which comes from the saved branch by the compiler.
- * The if and else branch are identical with the smp and io barriers both
- * defined as compiler barriers on x86.
+ * The if and else branch are identical  on the platforms except Arm.
  */
-#ifdef RTE_ARCH_X86_64
-		flags = dp->flags;
-		rte_smp_rmb();
-#else
+#ifdef RTE_ARCH_ARM
 		flags = __atomic_load_n(&dp->flags, __ATOMIC_ACQUIRE);
+#else
+		flags = dp->flags;
+		rte_io_rmb();
 #endif
 	} else {
 		flags = dp->flags;
@@ -90,22 +89,22 @@ virtqueue_store_flags_packed(struct vring_packed_desc *dp,
 			      uint16_t flags, uint8_t weak_barriers)
 {
 	if (weak_barriers) {
-/* x86 prefers to using rte_smp_wmb over __atomic_store_n as it reports
+/* x86 prefers to using rte_io_wmb over __atomic_store_n as it reports
  * a better perf(~1.5%), which comes from the saved branch by the compiler.
- * The if and else branch are identical with the smp and io barriers both
- * defined as compiler barriers on x86.
+ * The if and else branch are identical on the platforms except Arm.
  */
-#ifdef RTE_ARCH_X86_64
-		rte_smp_wmb();
-		dp->flags = flags;
-#else
+#ifdef RTE_ARCH_ARM
 		__atomic_store_n(&dp->flags, flags, __ATOMIC_RELEASE);
+#else
+		rte_io_wmb();
+		dp->flags = flags;
 #endif
 	} else {
 		rte_io_wmb();
 		dp->flags = flags;
 	}
 }
+
 #ifdef RTE_PMD_PACKET_PREFETCH
 #define rte_packet_prefetch(p)  rte_prefetch1(p)
 #else
-- 
2.29.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform
  2020-12-21 14:23 [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Joyce Kong
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier Joyce Kong
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier Joyce Kong
@ 2020-12-21 14:23 ` Joyce Kong
  2021-01-07 14:47   ` Maxime Coquelin
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence Joyce Kong
  2021-01-08  9:16 ` [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Maxime Coquelin
  4 siblings, 1 reply; 10+ messages in thread
From: Joyce Kong @ 2020-12-21 14:23 UTC (permalink / raw)
  To: maxime.coquelin, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd

Relax the full write barriers to one-way barriers for virtio
control path for Arm platform

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/virtio/virtio_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 0d91f7a50..b3e5cba70 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -203,8 +203,8 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
 		vq->vq_packed.cached_flags ^= VRING_PACKED_DESC_F_AVAIL_USED;
 	}
 
-	virtio_wmb(vq->hw->weak_barriers);
-	desc[head].flags = VRING_DESC_F_NEXT | flags;
+	virtqueue_store_flags_packed(&desc[head], VRING_DESC_F_NEXT | flags,
+			vq->hw->weak_barriers);
 
 	virtio_wmb(vq->hw->weak_barriers);
 	virtqueue_notify(vq);
-- 
2.29.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence
  2020-12-21 14:23 [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Joyce Kong
                   ` (2 preceding siblings ...)
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform Joyce Kong
@ 2020-12-21 14:23 ` Joyce Kong
  2021-01-07 16:28   ` Maxime Coquelin
  2021-01-08  9:16 ` [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Maxime Coquelin
  4 siblings, 1 reply; 10+ messages in thread
From: Joyce Kong @ 2020-12-21 14:23 UTC (permalink / raw)
  To: maxime.coquelin, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd

Replace the smp barriers with atomic thread fence for synchronization
between different threads, if there are no load/store operations.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/virtio/virtqueue.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index ac3d9e750..d78b94344 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -26,7 +26,7 @@ struct rte_mbuf;
 /*
  * Per virtio_ring.h in Linux.
  *     For virtio_pci on SMP, we don't need to order with respect to MMIO
- *     accesses through relaxed memory I/O windows, so smp_mb() et al are
+ *     accesses through relaxed memory I/O windows, so thread_fence is
  *     sufficient.
  *
  *     For using virtio to talk to real devices (eg. vDPA) we do need real
@@ -36,7 +36,7 @@ static inline void
 virtio_mb(uint8_t weak_barriers)
 {
 	if (weak_barriers)
-		rte_smp_mb();
+		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
 	else
 		rte_mb();
 }
@@ -45,7 +45,7 @@ static inline void
 virtio_rmb(uint8_t weak_barriers)
 {
 	if (weak_barriers)
-		rte_smp_rmb();
+		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
 	else
 		rte_io_rmb();
 }
@@ -54,7 +54,7 @@ static inline void
 virtio_wmb(uint8_t weak_barriers)
 {
 	if (weak_barriers)
-		rte_smp_wmb();
+		rte_atomic_thread_fence(__ATOMIC_RELEASE);
 	else
 		rte_io_wmb();
 }
-- 
2.29.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier Joyce Kong
@ 2021-01-07 14:12   ` Maxime Coquelin
  0 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-01-07 14:12 UTC (permalink / raw)
  To: Joyce Kong, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd



On 12/21/20 3:23 PM, Joyce Kong wrote:
> As desc_is_used has a load-acquire or rte_io_rmb inside
> and wait for used desc in virtqueue, it is ok to remove
> virtio_rmb behind it.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  drivers/net/virtio/virtio_ethdev.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> index 6c233b75b..0d91f7a50 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -209,12 +209,12 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
>  	virtio_wmb(vq->hw->weak_barriers);
>  	virtqueue_notify(vq);
>  
> -	/* wait for used descriptors in virtqueue */
> +	/* wait for used desc in virtqueue
> +	 * desc_is_used has a load-acquire or rte_io_rmb inside
> +	 */
>  	while (!desc_is_used(&desc[head], vq))
>  		usleep(100);
>  
> -	virtio_rmb(vq->hw->weak_barriers);
> -
>  	/* now get used descriptors */
>  	vq->vq_free_cnt += nb_descs;
>  	vq->vq_used_cons_idx += nb_descs;
> 

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier Joyce Kong
@ 2021-01-07 14:42   ` Maxime Coquelin
  0 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-01-07 14:42 UTC (permalink / raw)
  To: Joyce Kong, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd



On 12/21/20 3:23 PM, Joyce Kong wrote:
> Replace rte_smp_wmb/rmb with rte_io_wmb/rmb as they are the same on x86
> and ppc platforms. Then, for function virtqueue_fetch_flags_packed/
> virtqueue_store_flags_packed, the if and else branch are still identical
> for the platforms except Arm.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  drivers/net/virtio/virtqueue.h | 27 +++++++++++++--------------
>  1 file changed, 13 insertions(+), 14 deletions(-)
> 

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform Joyce Kong
@ 2021-01-07 14:47   ` Maxime Coquelin
  0 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-01-07 14:47 UTC (permalink / raw)
  To: Joyce Kong, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd



On 12/21/20 3:23 PM, Joyce Kong wrote:
> Relax the full write barriers to one-way barriers for virtio
> control path for Arm platform
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  drivers/net/virtio/virtio_ethdev.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> index 0d91f7a50..b3e5cba70 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -203,8 +203,8 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
>  		vq->vq_packed.cached_flags ^= VRING_PACKED_DESC_F_AVAIL_USED;
>  	}
>  
> -	virtio_wmb(vq->hw->weak_barriers);
> -	desc[head].flags = VRING_DESC_F_NEXT | flags;
> +	virtqueue_store_flags_packed(&desc[head], VRING_DESC_F_NEXT | flags,
> +			vq->hw->weak_barriers);
>  
>  	virtio_wmb(vq->hw->weak_barriers);
>  	virtqueue_notify(vq);
> 

Performance does not matter in the case of ctrl queue, but it is cleaner
to reuse existing helpers anyway:

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence Joyce Kong
@ 2021-01-07 16:28   ` Maxime Coquelin
  0 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-01-07 16:28 UTC (permalink / raw)
  To: Joyce Kong, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd



On 12/21/20 3:23 PM, Joyce Kong wrote:
> Replace the smp barriers with atomic thread fence for synchronization
> between different threads, if there are no load/store operations.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  drivers/net/virtio/virtqueue.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic
  2020-12-21 14:23 [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Joyce Kong
                   ` (3 preceding siblings ...)
  2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence Joyce Kong
@ 2021-01-08  9:16 ` Maxime Coquelin
  4 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-01-08  9:16 UTC (permalink / raw)
  To: Joyce Kong, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd



On 12/21/20 3:23 PM, Joyce Kong wrote:
> This patchset is to replace rte smp barriers in virtio with C11 atomic
> built-ins.
> 
> The rte_smp_*mb APIs provide full barrier functionality. However, many
> use cases do not require full barriers. To support such use cases, DPDK
> will adopt C11 barrier semantics and provide wrappers using C11 atomic
> built-ins.[1]
> 
> With this patch set, under 0.001% acceptable loss with 2 cores on vhost
> side and 1 core on virtio side, PVP case(vhost-user + virtio-user) has
> 6.7% perf uplift for the split in_order path on ThunderX2 platform.
> 
> [1] http://code.dpdk.org/dpdk/latest/source/doc/guides/rel_notes/deprecation.rst
> 
> Joyce Kong (4):
>   net/virtio: remove unnecessary rmb barrier
>   net/virtio: replace smp barrier with IO barrier
>   net/virtio: replace full barrier with relaxed barrier for Arm platform
>   net/virtio: replace full barrier with thread fence
> 
>  drivers/net/virtio/virtio_ethdev.c | 10 ++++-----
>  drivers/net/virtio/virtqueue.h     | 35 +++++++++++++++---------------
>  2 files changed, 22 insertions(+), 23 deletions(-)
> 

Series applied to dpdk-next-virtio/main.

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-01-08  9:16 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-21 14:23 [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Joyce Kong
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier Joyce Kong
2021-01-07 14:12   ` Maxime Coquelin
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier Joyce Kong
2021-01-07 14:42   ` Maxime Coquelin
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform Joyce Kong
2021-01-07 14:47   ` Maxime Coquelin
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence Joyce Kong
2021-01-07 16:28   ` Maxime Coquelin
2021-01-08  9:16 ` [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Maxime Coquelin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).