DPDK patches and discussions
 help / color / mirror / Atom feed
From: David Marchand <david.marchand@redhat.com>
To: Radu Nicolau <radu.nicolau@intel.com>
Cc: dev <dev@dpdk.org>, Beilei Xing <beilei.xing@intel.com>,
	Jeff Guo <jia.guo@intel.com>,
	 Bruce Richardson <bruce.richardson@intel.com>,
	 "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	Jerin Jacob <jerinjacobk@gmail.com>,
	 "Trahe, Fiona" <fiona.trahe@intel.com>,
	Wei Zhao <wei.zhao1@intel.com>,
	 "Ruifeng Wang (Arm Technology China)" <ruifeng.wang@arm.com>
Subject: Re: [dpdk-dev] [PATCH v9 1/4] eal: add WC store functions
Date: Mon, 20 Jul 2020 14:20:32 +0200
Message-ID: <CAJFAV8wm2EBusAfWTKRTYQ1tyRYkMhx=DVu8NbPUBvXTbtnjyA@mail.gmail.com> (raw)
In-Reply-To: <1595236337-28230-2-git-send-email-radu.nicolau@intel.com>

On Mon, Jul 20, 2020 at 11:12 AM Radu Nicolau <radu.nicolau@intel.com> wrote:
>
> Add rte_write32_wc and rte_write32_wc_relaxed functions
> that implement 32bit stores using write combining memory protocol.
> Provided generic stubs and x86 implementation.

What is the difference of using this new API when compared to the
existing pci driver flag RTE_PCI_DRV_WC_ACTIVATE?
Do we have some overlap between the two?

This commitlog is quite short for something that touches performance.
I saw a question from Ruifeng, it is worth adding this to the commitlog.

Which x86 platforms will benefit from it?
What is the impact on performance for existing platforms that have no
MOVDIRI support?


>
> Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  lib/librte_eal/arm/include/rte_io_64.h  | 12 +++++++
>  lib/librte_eal/include/generic/rte_io.h | 48 ++++++++++++++++++++++++++++
>  lib/librte_eal/x86/include/rte_io.h     | 56 +++++++++++++++++++++++++++++++++
>  3 files changed, 116 insertions(+)
>
> diff --git a/lib/librte_eal/arm/include/rte_io_64.h b/lib/librte_eal/arm/include/rte_io_64.h
> index e534624..d07d9cb 100644
> --- a/lib/librte_eal/arm/include/rte_io_64.h
> +++ b/lib/librte_eal/arm/include/rte_io_64.h
> @@ -164,6 +164,18 @@ rte_write64(uint64_t value, volatile void *addr)
>         rte_write64_relaxed(value, addr);
>  }
>
> +static __rte_always_inline void
> +rte_write32_wc(uint32_t value, volatile void *addr)
> +{
> +       rte_write32(value, addr);
> +}
> +
> +static __rte_always_inline void
> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr)
> +{
> +       rte_write32_relaxed(value, addr);
> +}
> +

We were using a single knob RTE_OVERRIDE_IO_H for overriding the whole rte_io.h.
Now we would have a special case for an API for x86 and the code is
copy/pasted in the ARM header and keeping the "whole" override mode.

This leaves an unfinished taste.

Why did you not flag all relevant "native" helpers?
This would factor some code from the ARM header.


>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/librte_eal/include/generic/rte_io.h b/lib/librte_eal/include/generic/rte_io.h
> index da457f7..0669baa 100644
> --- a/lib/librte_eal/include/generic/rte_io.h
> +++ b/lib/librte_eal/include/generic/rte_io.h
> @@ -229,6 +229,40 @@ rte_write32(uint32_t value, volatile void *addr);
>  static inline void
>  rte_write64(uint64_t value, volatile void *addr);
>
> +/**
> + * Write a 32-bit value to I/O device memory address addr using write
> + * combining memory write protocol. Depending on the platform write combining
> + * may not be available and/or may be treated as a hint and the behavior may
> + * fallback to a regular store.
> + *
> + * @param value
> + *  Value to write
> + * @param addr
> + *  I/O memory address to write the value to
> + */
> +__rte_experimental
> +static inline void
> +rte_write32_wc(uint32_t value, volatile void *addr);
> +
> +/**
> + * Write a 32-bit value to I/O device memory address addr using write
> + * combining memory write protocol. Depending on the platform write combining
> + * may not be available and/or may be treated as a hint and the behavior may
> + * fallback to a regular store.
> + *
> + * The relaxed version does not have additional I/O memory barrier, useful in
> + * accessing the device registers of integrated controllers which implicitly
> + * strongly ordered with respect to memory access.

It might be just me, but I have trouble reading the last part of this sentence.
Maybe remove "with respect to"?


> + *
> + * @param value
> + *  Value to write
> + * @param addr
> + *  I/O memory address to write the value to
> + */
> +__rte_experimental
> +static inline void
> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr);
> +
>  #endif /* __DOXYGEN__ */
>
>  #ifndef RTE_OVERRIDE_IO_H
> @@ -345,6 +379,20 @@ rte_write64(uint64_t value, volatile void *addr)
>         rte_write64_relaxed(value, addr);
>  }
>
> +#ifndef RTE_NATIVE_WRITE32_WC
> +static __rte_always_inline void
> +rte_write32_wc(uint32_t value, volatile void *addr)
> +{
> +       rte_write32(value, addr);
> +}
> +
> +static __rte_always_inline void
> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr)
> +{
> +       rte_write32_relaxed(value, addr);
> +}
> +#endif /* RTE_NATIVE_WRITE32_WC */
> +
>  #endif /* RTE_OVERRIDE_IO_H */
>
>  #endif /* _RTE_IO_H_ */
> diff --git a/lib/librte_eal/x86/include/rte_io.h b/lib/librte_eal/x86/include/rte_io.h
> index 2db71b1..c95ed67 100644
> --- a/lib/librte_eal/x86/include/rte_io.h
> +++ b/lib/librte_eal/x86/include/rte_io.h
> @@ -9,8 +9,64 @@
>  extern "C" {
>  #endif
>
> +#include "rte_cpuflags.h"
> +
> +#define RTE_NATIVE_WRITE32_WC
>  #include "generic/rte_io.h"
>
> +/**
> + * @internal
> + * MOVDIRI wrapper.
> + */
> +static __rte_always_inline void
> +_rte_x86_movdiri(uint32_t value, volatile void *addr)
> +{
> +       asm volatile(
> +               /* MOVDIRI */
> +               ".byte 0x40, 0x0f, 0x38, 0xf9, 0x02"
> +               :
> +               : "a" (value), "d" (addr));
> +}
> +
> +static __rte_always_inline void
> +rte_write32_wc(uint32_t value, volatile void *addr)
> +{
> +       static int _x86_movdiri_flag = -1;
> +       if (_x86_movdiri_flag == 1) {
> +               rte_wmb();
> +               _rte_x86_movdiri(value, addr);
> +       } else if (_x86_movdiri_flag == 0) {
> +               rte_write32(value, addr);
> +       } else {
> +               _x86_movdiri_flag =
> +                       (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI) > 0);
> +               if (_x86_movdiri_flag == 1) {
> +                       rte_wmb();
> +                       _rte_x86_movdiri(value, addr);
> +               } else {
> +                       rte_write32(value, addr);
> +               }
> +       }
> +}
> +
> +static __rte_always_inline void
> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr)
> +{
> +       static int _x86_movdiri_flag = -1;
> +       if (_x86_movdiri_flag == 1) {
> +               _rte_x86_movdiri(value, addr);
> +       } else if (_x86_movdiri_flag == 0) {
> +               rte_write32_relaxed(value, addr);
> +       } else {
> +               _x86_movdiri_flag =
> +                       (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI) > 0);
> +               if (_x86_movdiri_flag == 1)
> +                       _rte_x86_movdiri(value, addr);
> +               else
> +                       rte_write32_relaxed(value, addr);
> +       }
> +}
> +

Repeating some comments I made earlier.

- If a single helper called by both rte_write32_wc and
rte_write32_wc_relaxed with a _constant_ flag is not to your liking (I
don't see where it would have an impact on performance), then maybe
rte_write32_wc() can be simply implemented as:

+static __rte_always_inline void
+rte_write32_wc(uint32_t value, volatile void *addr)
+{
+    rte_wmb();
+    rte_write32_wc_relaxed(value, addr);
+}
+

- Looking at this above suggestion, I wonder about the non-relaxed case.
Is rte_io_wmb() not enough?


- The cpuflag check can be resolved at init once and for all.
By this, I mean in lib/librte_eal/x86/rte_cpuflags.c:

+int rte_x86_movdiri_flag = -1;
+
+RTE_INIT(rte_x86_movdiri_init)
+{
+       rte_x86_movdiri_flag =
+               (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI) > 0);
+}

The variable can be exported in lib/librte_eal/x86/include/rte_cpuflags.h.

Then rte_write32_wc_relaxed() becomes:

+static __rte_always_inline void
+rte_write32_wc_relaxed(uint32_t value, volatile void *addr)
+{
+    if (rte_x86_movdiri_flag == 1) {
+        asm volatile(
+            /* MOVDIRI */
+            ".byte 0x40, 0x0f, 0x38, 0xf9, 0x02"
+            :
+            : "a" (value), "d" (addr));
+        return;
+    }
+
+    rte_write32_relaxed(value, addr);
+}
+


Thanks.

-- 
David Marchand


  reply	other threads:[~2020-07-20 12:20 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-11 10:11 [dpdk-dev] [PATCH v1 1/2] eal/x86: add WC store function Radu Nicolau
2020-06-11 10:11 ` [dpdk-dev] [PATCH v1 2/2] net/i40e: use movdiri to update queue tail registers Radu Nicolau
2020-06-11 12:23 ` [dpdk-dev] [PATCH v1 1/2] eal/x86: add WC store function Jerin Jacob
2020-06-11 13:56   ` Nicolau, Radu
2020-06-11 15:33     ` Jerin Jacob
2020-06-15 11:11 ` Ananyev, Konstantin
2020-06-19 12:06 ` [dpdk-dev] [PATCH v2 1/2] eal: add WC store functions Radu Nicolau
2020-06-19 12:06   ` [dpdk-dev] [PATCH v2 2/2] net/i40e: use WC store to update queue tail registers Radu Nicolau
2020-07-01 13:15     ` Bruce Richardson
2020-07-01 13:14   ` [dpdk-dev] [PATCH v2 1/2] eal: add WC store functions Bruce Richardson
2020-07-01 14:15 ` [dpdk-dev] [PATCH v3 0/2] " Radu Nicolau
2020-07-01 14:15   ` [dpdk-dev] [PATCH v3 1/2] " Radu Nicolau
2020-07-01 14:15   ` [dpdk-dev] [PATCH v3 2/2] net/i40e: use WC store to update queue tail registers Radu Nicolau
2020-07-02  9:23 ` [dpdk-dev] [PATCH v4 0/2] eal: add WC store functions Radu Nicolau
2020-07-02  9:23   ` [dpdk-dev] [PATCH v4 1/2] " Radu Nicolau
2020-07-03 15:19     ` David Marchand
2020-07-06  9:15       ` Nicolau, Radu
2020-07-02  9:23   ` [dpdk-dev] [PATCH v4 2/2] net/i40e: use WC store to update queue tail registers Radu Nicolau
2020-07-06 12:29 ` [dpdk-dev] [PATCH v5 0/2] eal: add WC store functions Radu Nicolau
2020-07-06 12:29   ` [dpdk-dev] [PATCH v5 1/2] " Radu Nicolau
2020-07-06 12:30   ` [dpdk-dev] [PATCH v5 2/2] net/i40e: use WC store to update queue tail registers Radu Nicolau
2020-07-13 12:27 ` [dpdk-dev] [PATCH v6 0/4] eal: add WC store functions Radu Nicolau
2020-07-13 12:27   ` [dpdk-dev] [PATCH v6 1/4] " Radu Nicolau
2020-07-13 12:27   ` [dpdk-dev] [PATCH v6 2/4] net/i40e: use WC store to update queue tail registers Radu Nicolau
2020-07-13 12:27   ` [dpdk-dev] [PATCH v6 3/4] qat: " Radu Nicolau
2020-07-13 12:44     ` Bruce Richardson
2020-07-13 12:52       ` Trahe, Fiona
2020-07-13 12:57         ` Bruce Richardson
2020-07-13 12:27   ` [dpdk-dev] [PATCH v6 4/4] net/ixgbe: use WC store to update doorbell register Radu Nicolau
2020-07-16 12:29 ` [dpdk-dev] [PATCH v7 0/4] eal: add WC store functions Radu Nicolau
2020-07-16 12:29   ` [dpdk-dev] [PATCH v7 1/4] " Radu Nicolau
2020-07-16 12:29   ` [dpdk-dev] [PATCH v7 2/4] net/i40e: use WC store to update queue tail registers Radu Nicolau
2020-07-16 12:29   ` [dpdk-dev] [PATCH v7 3/4] common/qat: " Radu Nicolau
2020-07-16 12:29   ` [dpdk-dev] [PATCH v7 4/4] net/ixgbe: use WC store to update doorbell register Radu Nicolau
2020-07-17 10:49 ` [dpdk-dev] [PATCH v8 0/4] eal: add WC store functions Radu Nicolau
2020-07-17 10:49   ` [dpdk-dev] [PATCH v8 1/4] " Radu Nicolau
2020-07-20  6:42     ` Ruifeng Wang
2020-07-20  8:52       ` Nicolau, Radu
2020-07-17 10:49   ` [dpdk-dev] [PATCH v8 2/4] net/i40e: use WC store to update queue tail registers Radu Nicolau
2020-07-20  6:46     ` Ruifeng Wang
2020-07-20  8:54       ` Nicolau, Radu
2020-07-17 10:49   ` [dpdk-dev] [PATCH v8 3/4] common/qat: " Radu Nicolau
2020-07-17 16:42     ` Trahe, Fiona
2020-07-17 10:49   ` [dpdk-dev] [PATCH v8 4/4] net/ixgbe: " Radu Nicolau
2020-07-17 11:18     ` Ananyev, Konstantin
2020-07-20  9:12 ` [dpdk-dev] [PATCH v9 0/4] eal: add WC store functions Radu Nicolau
2020-07-20  9:12   ` [dpdk-dev] [PATCH v9 1/4] " Radu Nicolau
2020-07-20 12:20     ` David Marchand [this message]
2020-07-21  8:56       ` Nicolau, Radu
2020-07-20  9:12   ` [dpdk-dev] [PATCH v9 2/4] net/i40e: use WC store to update queue tail registers Radu Nicolau
2020-07-20  9:12   ` [dpdk-dev] [PATCH v9 3/4] common/qat: " Radu Nicolau
2020-07-20  9:12   ` [dpdk-dev] [PATCH v9 4/4] net/ixgbe: " Radu Nicolau
2020-07-21 11:31 ` [dpdk-dev] [PATCH v10 0/4] eal: add WC store functions Radu Nicolau
2020-07-21 11:31   ` [dpdk-dev] [PATCH v10 1/4] " Radu Nicolau
2020-07-21 11:31   ` [dpdk-dev] [PATCH v10 2/4] net/i40e: use WC store to update queue tail registers Radu Nicolau
2020-07-21 11:31   ` [dpdk-dev] [PATCH v10 3/4] common/qat: " Radu Nicolau
2020-07-21 11:31   ` [dpdk-dev] [PATCH v10 4/4] net/ixgbe: " Radu Nicolau
2020-08-26  9:55 ` [dpdk-dev] [PATCH v11 0/5] eal: add WC store functions Radu Nicolau
2020-08-26  9:55   ` [dpdk-dev] [PATCH v11 1/5] " Radu Nicolau
2020-08-26  9:55   ` [dpdk-dev] [PATCH v11 2/5] net/i40e: use WC store to update queue tail registers Radu Nicolau
2020-09-23  1:19     ` Lu, Wenzhuo
2020-08-26  9:55   ` [dpdk-dev] [PATCH v11 3/5] common/qat: " Radu Nicolau
2020-08-26  9:55   ` [dpdk-dev] [PATCH v11 4/5] net/ixgbe: " Radu Nicolau
2020-09-23  1:20     ` Lu, Wenzhuo
2020-08-26  9:55   ` [dpdk-dev] [PATCH v11 5/5] net/ice: " Radu Nicolau
2020-09-23  1:20     ` Lu, Wenzhuo
2020-09-23 14:22 ` [dpdk-dev] [PATCH v12 0/5] eal: add WC store functions Radu Nicolau
2020-09-23 14:22   ` [dpdk-dev] [PATCH v12 1/5] " Radu Nicolau
2020-09-23 14:22   ` [dpdk-dev] [PATCH v12 2/5] net/i40e: use WC store to update queue tail registers Radu Nicolau
2020-09-23 14:22   ` [dpdk-dev] [PATCH v12 3/5] common/qat: " Radu Nicolau
2020-09-23 14:22   ` [dpdk-dev] [PATCH v12 4/5] net/ixgbe: " Radu Nicolau
2020-09-23 14:22   ` [dpdk-dev] [PATCH v12 5/5] net/ice: " Radu Nicolau
2020-10-08  7:28   ` [dpdk-dev] [PATCH v12 0/5] eal: add WC store functions David Marchand
2020-10-08  9:51     ` Nicolau, Radu
2020-10-13  8:57     ` Ferruh Yigit
2020-10-13 12:50   ` David Marchand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJFAV8wm2EBusAfWTKRTYQ1tyRYkMhx=DVu8NbPUBvXTbtnjyA@mail.gmail.com' \
    --to=david.marchand@redhat.com \
    --cc=beilei.xing@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=fiona.trahe@intel.com \
    --cc=jerinjacobk@gmail.com \
    --cc=jia.guo@intel.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=radu.nicolau@intel.com \
    --cc=ruifeng.wang@arm.com \
    --cc=wei.zhao1@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git