* [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier
2020-12-21 14:23 [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Joyce Kong
@ 2020-12-21 14:23 ` Joyce Kong
2021-01-07 14:12 ` Maxime Coquelin
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier Joyce Kong
` (3 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Joyce Kong @ 2020-12-21 14:23 UTC (permalink / raw)
To: maxime.coquelin, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd
As desc_is_used has a load-acquire or rte_io_rmb inside
and wait for used desc in virtqueue, it is ok to remove
virtio_rmb behind it.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
drivers/net/virtio/virtio_ethdev.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 6c233b75b..0d91f7a50 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -209,12 +209,12 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
virtio_wmb(vq->hw->weak_barriers);
virtqueue_notify(vq);
- /* wait for used descriptors in virtqueue */
+ /* wait for used desc in virtqueue
+ * desc_is_used has a load-acquire or rte_io_rmb inside
+ */
while (!desc_is_used(&desc[head], vq))
usleep(100);
- virtio_rmb(vq->hw->weak_barriers);
-
/* now get used descriptors */
vq->vq_free_cnt += nb_descs;
vq->vq_used_cons_idx += nb_descs;
--
2.29.2
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier Joyce Kong
@ 2021-01-07 14:12 ` Maxime Coquelin
0 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-01-07 14:12 UTC (permalink / raw)
To: Joyce Kong, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd
On 12/21/20 3:23 PM, Joyce Kong wrote:
> As desc_is_used has a load-acquire or rte_io_rmb inside
> and wait for used desc in virtqueue, it is ok to remove
> virtio_rmb behind it.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> drivers/net/virtio/virtio_ethdev.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> index 6c233b75b..0d91f7a50 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -209,12 +209,12 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
> virtio_wmb(vq->hw->weak_barriers);
> virtqueue_notify(vq);
>
> - /* wait for used descriptors in virtqueue */
> + /* wait for used desc in virtqueue
> + * desc_is_used has a load-acquire or rte_io_rmb inside
> + */
> while (!desc_is_used(&desc[head], vq))
> usleep(100);
>
> - virtio_rmb(vq->hw->weak_barriers);
> -
> /* now get used descriptors */
> vq->vq_free_cnt += nb_descs;
> vq->vq_used_cons_idx += nb_descs;
>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
^ permalink raw reply [flat|nested] 10+ messages in thread
* [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier
2020-12-21 14:23 [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Joyce Kong
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier Joyce Kong
@ 2020-12-21 14:23 ` Joyce Kong
2021-01-07 14:42 ` Maxime Coquelin
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform Joyce Kong
` (2 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Joyce Kong @ 2020-12-21 14:23 UTC (permalink / raw)
To: maxime.coquelin, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd
Replace rte_smp_wmb/rmb with rte_io_wmb/rmb as they are the same on x86
and ppc platforms. Then, for function virtqueue_fetch_flags_packed/
virtqueue_store_flags_packed, the if and else branch are still identical
for the platforms except Arm.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
drivers/net/virtio/virtqueue.h | 27 +++++++++++++--------------
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 42c4c9882..ac3d9e750 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -66,16 +66,15 @@ virtqueue_fetch_flags_packed(struct vring_packed_desc *dp,
uint16_t flags;
if (weak_barriers) {
-/* x86 prefers to using rte_smp_rmb over __atomic_load_n as it reports
+/* x86 prefers to using rte_io_rmb over __atomic_load_n as it reports
* a better perf(~1.5%), which comes from the saved branch by the compiler.
- * The if and else branch are identical with the smp and io barriers both
- * defined as compiler barriers on x86.
+ * The if and else branch are identical on the platforms except Arm.
*/
-#ifdef RTE_ARCH_X86_64
- flags = dp->flags;
- rte_smp_rmb();
-#else
+#ifdef RTE_ARCH_ARM
flags = __atomic_load_n(&dp->flags, __ATOMIC_ACQUIRE);
+#else
+ flags = dp->flags;
+ rte_io_rmb();
#endif
} else {
flags = dp->flags;
@@ -90,22 +89,22 @@ virtqueue_store_flags_packed(struct vring_packed_desc *dp,
uint16_t flags, uint8_t weak_barriers)
{
if (weak_barriers) {
-/* x86 prefers to using rte_smp_wmb over __atomic_store_n as it reports
+/* x86 prefers to using rte_io_wmb over __atomic_store_n as it reports
* a better perf(~1.5%), which comes from the saved branch by the compiler.
- * The if and else branch are identical with the smp and io barriers both
- * defined as compiler barriers on x86.
+ * The if and else branch are identical on the platforms except Arm.
*/
-#ifdef RTE_ARCH_X86_64
- rte_smp_wmb();
- dp->flags = flags;
-#else
+#ifdef RTE_ARCH_ARM
__atomic_store_n(&dp->flags, flags, __ATOMIC_RELEASE);
+#else
+ rte_io_wmb();
+ dp->flags = flags;
#endif
} else {
rte_io_wmb();
dp->flags = flags;
}
}
+
#ifdef RTE_PMD_PACKET_PREFETCH
#define rte_packet_prefetch(p) rte_prefetch1(p)
#else
--
2.29.2
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier Joyce Kong
@ 2021-01-07 14:42 ` Maxime Coquelin
0 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-01-07 14:42 UTC (permalink / raw)
To: Joyce Kong, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd
On 12/21/20 3:23 PM, Joyce Kong wrote:
> Replace rte_smp_wmb/rmb with rte_io_wmb/rmb as they are the same on x86
> and ppc platforms. Then, for function virtqueue_fetch_flags_packed/
> virtqueue_store_flags_packed, the if and else branch are still identical
> for the platforms except Arm.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> drivers/net/virtio/virtqueue.h | 27 +++++++++++++--------------
> 1 file changed, 13 insertions(+), 14 deletions(-)
>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform
2020-12-21 14:23 [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Joyce Kong
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier Joyce Kong
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier Joyce Kong
@ 2020-12-21 14:23 ` Joyce Kong
2021-01-07 14:47 ` Maxime Coquelin
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence Joyce Kong
2021-01-08 9:16 ` [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Maxime Coquelin
4 siblings, 1 reply; 10+ messages in thread
From: Joyce Kong @ 2020-12-21 14:23 UTC (permalink / raw)
To: maxime.coquelin, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd
Relax the full write barriers to one-way barriers for virtio
control path for Arm platform
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
drivers/net/virtio/virtio_ethdev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 0d91f7a50..b3e5cba70 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -203,8 +203,8 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
vq->vq_packed.cached_flags ^= VRING_PACKED_DESC_F_AVAIL_USED;
}
- virtio_wmb(vq->hw->weak_barriers);
- desc[head].flags = VRING_DESC_F_NEXT | flags;
+ virtqueue_store_flags_packed(&desc[head], VRING_DESC_F_NEXT | flags,
+ vq->hw->weak_barriers);
virtio_wmb(vq->hw->weak_barriers);
virtqueue_notify(vq);
--
2.29.2
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform Joyce Kong
@ 2021-01-07 14:47 ` Maxime Coquelin
0 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-01-07 14:47 UTC (permalink / raw)
To: Joyce Kong, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd
On 12/21/20 3:23 PM, Joyce Kong wrote:
> Relax the full write barriers to one-way barriers for virtio
> control path for Arm platform
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> drivers/net/virtio/virtio_ethdev.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> index 0d91f7a50..b3e5cba70 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -203,8 +203,8 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
> vq->vq_packed.cached_flags ^= VRING_PACKED_DESC_F_AVAIL_USED;
> }
>
> - virtio_wmb(vq->hw->weak_barriers);
> - desc[head].flags = VRING_DESC_F_NEXT | flags;
> + virtqueue_store_flags_packed(&desc[head], VRING_DESC_F_NEXT | flags,
> + vq->hw->weak_barriers);
>
> virtio_wmb(vq->hw->weak_barriers);
> virtqueue_notify(vq);
>
Performance does not matter in the case of ctrl queue, but it is cleaner
to reuse existing helpers anyway:
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
^ permalink raw reply [flat|nested] 10+ messages in thread
* [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence
2020-12-21 14:23 [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Joyce Kong
` (2 preceding siblings ...)
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform Joyce Kong
@ 2020-12-21 14:23 ` Joyce Kong
2021-01-07 16:28 ` Maxime Coquelin
2021-01-08 9:16 ` [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Maxime Coquelin
4 siblings, 1 reply; 10+ messages in thread
From: Joyce Kong @ 2020-12-21 14:23 UTC (permalink / raw)
To: maxime.coquelin, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd
Replace the smp barriers with atomic thread fence for synchronization
between different threads, if there are no load/store operations.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
drivers/net/virtio/virtqueue.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index ac3d9e750..d78b94344 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -26,7 +26,7 @@ struct rte_mbuf;
/*
* Per virtio_ring.h in Linux.
* For virtio_pci on SMP, we don't need to order with respect to MMIO
- * accesses through relaxed memory I/O windows, so smp_mb() et al are
+ * accesses through relaxed memory I/O windows, so thread_fence is
* sufficient.
*
* For using virtio to talk to real devices (eg. vDPA) we do need real
@@ -36,7 +36,7 @@ static inline void
virtio_mb(uint8_t weak_barriers)
{
if (weak_barriers)
- rte_smp_mb();
+ rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
else
rte_mb();
}
@@ -45,7 +45,7 @@ static inline void
virtio_rmb(uint8_t weak_barriers)
{
if (weak_barriers)
- rte_smp_rmb();
+ rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
else
rte_io_rmb();
}
@@ -54,7 +54,7 @@ static inline void
virtio_wmb(uint8_t weak_barriers)
{
if (weak_barriers)
- rte_smp_wmb();
+ rte_atomic_thread_fence(__ATOMIC_RELEASE);
else
rte_io_wmb();
}
--
2.29.2
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence Joyce Kong
@ 2021-01-07 16:28 ` Maxime Coquelin
0 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-01-07 16:28 UTC (permalink / raw)
To: Joyce Kong, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd
On 12/21/20 3:23 PM, Joyce Kong wrote:
> Replace the smp barriers with atomic thread fence for synchronization
> between different threads, if there are no load/store operations.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> drivers/net/virtio/virtqueue.h | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic
2020-12-21 14:23 [dpdk-dev] [PATCH v1 0/4] replace smp barriers in virtio with C11 atomic Joyce Kong
` (3 preceding siblings ...)
2020-12-21 14:23 ` [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence Joyce Kong
@ 2021-01-08 9:16 ` Maxime Coquelin
4 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2021-01-08 9:16 UTC (permalink / raw)
To: Joyce Kong, chenbo.xia, honnappa.nagarahalli, ruifeng.wang; +Cc: dev, nd
On 12/21/20 3:23 PM, Joyce Kong wrote:
> This patchset is to replace rte smp barriers in virtio with C11 atomic
> built-ins.
>
> The rte_smp_*mb APIs provide full barrier functionality. However, many
> use cases do not require full barriers. To support such use cases, DPDK
> will adopt C11 barrier semantics and provide wrappers using C11 atomic
> built-ins.[1]
>
> With this patch set, under 0.001% acceptable loss with 2 cores on vhost
> side and 1 core on virtio side, PVP case(vhost-user + virtio-user) has
> 6.7% perf uplift for the split in_order path on ThunderX2 platform.
>
> [1] http://code.dpdk.org/dpdk/latest/source/doc/guides/rel_notes/deprecation.rst
>
> Joyce Kong (4):
> net/virtio: remove unnecessary rmb barrier
> net/virtio: replace smp barrier with IO barrier
> net/virtio: replace full barrier with relaxed barrier for Arm platform
> net/virtio: replace full barrier with thread fence
>
> drivers/net/virtio/virtio_ethdev.c | 10 ++++-----
> drivers/net/virtio/virtqueue.h | 35 +++++++++++++++---------------
> 2 files changed, 22 insertions(+), 23 deletions(-)
>
Series applied to dpdk-next-virtio/main.
Thanks,
Maxime
^ permalink raw reply [flat|nested] 10+ messages in thread