DPDK patches and discussions
* [PATCH] clarify purpose of empty cache lines
@ 2023-09-04  8:43 Morten Brørup
  2023-09-04  9:12 ` Bruce Richardson
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Morten Brørup @ 2023-09-04  8:43 UTC (permalink / raw)
  To: thomas, david.marchand, honnappa.nagarahalli,
	konstantin.v.ananyev, bruce.richardson, mattias.ronnblom
  Cc: olivier.matz, andrew.rybchenko, dev, Morten Brørup

This patch introduces the generic RTE_CACHE_GUARD macro into the EAL, and
replaces vaguely described empty cache lines in the rte_ring structure
with this macro.

Although the implementation of the rte_ring structure assumes that the
hardware speculatively prefetches 1 cache line, this number can be changed
at build time by modifying RTE_CACHE_GUARD_LINES in rte_config.h.

The background and the RFC were discussed in this thread:
http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35D87B39@smartserver.smartshare.dk/

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
 config/rte_config.h          |  1 +
 lib/eal/include/rte_common.h | 13 +++++++++++++
 lib/ring/rte_ring_core.h     |  6 +++---
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/config/rte_config.h b/config/rte_config.h
index 400e44e3cf..cfdf787724 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -37,6 +37,7 @@
 #define RTE_MAX_TAILQ 32
 #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
 #define RTE_MAX_VFIO_CONTAINERS 64
+#define RTE_CACHE_GUARD_LINES 1
 
 /* bsd module defines */
 #define RTE_CONTIGMEM_MAX_NUM_BUFS 64
diff --git a/lib/eal/include/rte_common.h b/lib/eal/include/rte_common.h
index 771c70f2c8..daf1866a32 100644
--- a/lib/eal/include/rte_common.h
+++ b/lib/eal/include/rte_common.h
@@ -527,6 +527,19 @@ rte_is_aligned(const void * const __rte_restrict ptr, const unsigned int align)
 /** Force minimum cache line alignment. */
 #define __rte_cache_min_aligned __rte_aligned(RTE_CACHE_LINE_MIN_SIZE)
 
+#define _RTE_CACHE_GUARD_HELPER2(unique) \
+		char cache_guard_ ## unique[RTE_CACHE_LINE_SIZE * RTE_CACHE_GUARD_LINES] \
+		__rte_cache_aligned
+#define _RTE_CACHE_GUARD_HELPER1(unique) _RTE_CACHE_GUARD_HELPER2(unique)
+/**
+ * Empty cache lines, to guard against false sharing-like effects
+ * on systems with a next-N-lines hardware prefetcher.
+ *
+ * Use as spacing between data accessed by different lcores,
+ * to prevent cache thrashing on hardware with speculative prefetching.
+ */
+#define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER1(__COUNTER__)
+
 /*********** PA/IOVA type definitions ********/
 
 /** Physical address */
diff --git a/lib/ring/rte_ring_core.h b/lib/ring/rte_ring_core.h
index d1e59bf9ad..327fdcf28f 100644
--- a/lib/ring/rte_ring_core.h
+++ b/lib/ring/rte_ring_core.h
@@ -126,7 +126,7 @@ struct rte_ring {
 	uint32_t mask;           /**< Mask (size-1) of ring. */
 	uint32_t capacity;       /**< Usable size of ring */
 
-	char pad0 __rte_cache_aligned; /**< empty cache line */
+	RTE_CACHE_GUARD;
 
 	/** Ring producer status. */
 	union {
@@ -135,7 +135,7 @@ struct rte_ring {
 		struct rte_ring_rts_headtail rts_prod;
 	}  __rte_cache_aligned;
 
-	char pad1 __rte_cache_aligned; /**< empty cache line */
+	RTE_CACHE_GUARD;
 
 	/** Ring consumer status. */
 	union {
@@ -144,7 +144,7 @@ struct rte_ring {
 		struct rte_ring_rts_headtail rts_cons;
 	}  __rte_cache_aligned;
 
-	char pad2 __rte_cache_aligned; /**< empty cache line */
+	RTE_CACHE_GUARD;
 };
 
 #define RING_F_SP_ENQ 0x0001 /**< The default enqueue is "single-producer". */
-- 
2.17.1
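
For readers following the thread, the layout effect of the patch can be checked with a small stand-alone sketch. It does not include any DPDK headers: DEMO_CACHE_LINE_SIZE, DEMO_CACHE_GUARD_LINES, DEMO_CACHE_GUARD, struct demo_headtail and struct demo_ring are hypothetical stand-ins that only mirror the shape of the macros in the patch. A 64-byte cache line is assumed, and both __COUNTER__ and __attribute__((aligned)) are compiler extensions (GCC and Clang).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for RTE_CACHE_LINE_SIZE and RTE_CACHE_GUARD_LINES. */
#define DEMO_CACHE_LINE_SIZE 64
#define DEMO_CACHE_GUARD_LINES 1
#define DEMO_CACHE_ALIGNED __attribute__((aligned(DEMO_CACHE_LINE_SIZE)))

/* Same two-level shape as in the patch, so every guard gets a unique name. */
#define DEMO_CACHE_GUARD_HELPER2(unique) \
	char cache_guard_ ## unique[DEMO_CACHE_LINE_SIZE * DEMO_CACHE_GUARD_LINES] \
	DEMO_CACHE_ALIGNED
#define DEMO_CACHE_GUARD_HELPER1(unique) DEMO_CACHE_GUARD_HELPER2(unique)
#define DEMO_CACHE_GUARD DEMO_CACHE_GUARD_HELPER1(__COUNTER__)

/* Simplified head/tail pair, standing in for the producer/consumer unions. */
struct demo_headtail {
	uint32_t head;
	uint32_t tail;
};

/* Simplified ring header: shared fields, then guarded per-role state. */
struct demo_ring {
	uint32_t mask;
	uint32_t capacity;

	DEMO_CACHE_GUARD;

	struct demo_headtail prod DEMO_CACHE_ALIGNED;

	DEMO_CACHE_GUARD;

	struct demo_headtail cons DEMO_CACHE_ALIGNED;

	DEMO_CACHE_GUARD;
};

/* Distance in bytes between the producer and the consumer state. */
static inline size_t demo_prod_cons_gap(void)
{
	return offsetof(struct demo_ring, cons) - offsetof(struct demo_ring, prod);
}

/* Compile-time checks: each role starts a cache line, and the two roles are
 * separated by their own line plus the configured number of guard lines. */
_Static_assert(offsetof(struct demo_ring, prod) % DEMO_CACHE_LINE_SIZE == 0,
	"producer state must start on a cache line boundary");
_Static_assert(offsetof(struct demo_ring, cons) - offsetof(struct demo_ring, prod) ==
	(1 + DEMO_CACHE_GUARD_LINES) * DEMO_CACHE_LINE_SIZE,
	"unexpected spacing between producer and consumer state");
```

With DEMO_CACHE_GUARD_LINES set to 1, the producer and consumer state end up two cache lines apart, so the line following the producer state is an unused guard line rather than the consumer state. Raising the value widens the gap by one cache line per guard line.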



* Re: [PATCH] clarify purpose of empty cache lines
  2023-09-04  8:43 [PATCH] clarify purpose of empty cache lines Morten Brørup
@ 2023-09-04  9:12 ` Bruce Richardson
  2023-10-11 15:59   ` Thomas Monjalon
  2023-09-05  6:41 ` Morten Brørup
  2023-10-11 23:43 ` Thomas Monjalon
  2 siblings, 1 reply; 7+ messages in thread
From: Bruce Richardson @ 2023-09-04  9:12 UTC (permalink / raw)
  To: Morten Brørup
  Cc: thomas, david.marchand, honnappa.nagarahalli,
	konstantin.v.ananyev, mattias.ronnblom, olivier.matz,
	andrew.rybchenko, dev

On Mon, Sep 04, 2023 at 10:43:49AM +0200, Morten Brørup wrote:
> This patch introduces the generic RTE_CACHE_GUARD macro into the EAL, and
> replaces vaguely described empty cache lines in the rte_ring structure
> with this macro.
> 
> Although the implementation of the rte_ring structure assumes that the
> hardware speculatively prefetches 1 cache line, this number can be changed
> at build time by modifying RTE_CACHE_GUARD_LINES in rte_config.h.
> 
> The background and the RFC were discussed in this thread:
> http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35D87B39@smartserver.smartshare.dk/
> 
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> ---
Seems fine to me.

Acked-by: Bruce Richardson <bruce.richardson@intel.com>


* RE: [PATCH] clarify purpose of empty cache lines
  2023-09-04  8:43 [PATCH] clarify purpose of empty cache lines Morten Brørup
  2023-09-04  9:12 ` Bruce Richardson
@ 2023-09-05  6:41 ` Morten Brørup
  2023-10-11 23:43 ` Thomas Monjalon
  2 siblings, 0 replies; 7+ messages in thread
From: Morten Brørup @ 2023-09-05  6:41 UTC (permalink / raw)
  To: mattias.ronnblom
  Cc: olivier.matz, andrew.rybchenko, dev, thomas, david.marchand,
	honnappa.nagarahalli, konstantin.v.ananyev, bruce.richardson

> From: Morten Brørup [mailto:mb@smartsharesystems.com]
> Sent: Monday, 4 September 2023 10.44
> 
> This patch introduces the generic RTE_CACHE_GUARD macro into the EAL, and
> replaces vaguely described empty cache lines in the rte_ring structure
> with this macro.
> 
> Although the implementation of the rte_ring structure assumes that the
> hardware speculatively prefetches 1 cache line, this number can be changed
> at build time by modifying RTE_CACHE_GUARD_LINES in rte_config.h.
> 
> The background and the RFC were discussed in this thread:
> http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35D87B39@smartserver.smartshare.dk/
> 
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> ---
>  config/rte_config.h          |  1 +
>  lib/eal/include/rte_common.h | 13 +++++++++++++
>  lib/ring/rte_ring_core.h     |  6 +++---
>  3 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/config/rte_config.h b/config/rte_config.h
> index 400e44e3cf..cfdf787724 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -37,6 +37,7 @@
>  #define RTE_MAX_TAILQ 32
>  #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
>  #define RTE_MAX_VFIO_CONTAINERS 64
> +#define RTE_CACHE_GUARD_LINES 1
> 
>  /* bsd module defines */
>  #define RTE_CONTIGMEM_MAX_NUM_BUFS 64
> diff --git a/lib/eal/include/rte_common.h b/lib/eal/include/rte_common.h
> index 771c70f2c8..daf1866a32 100644
> --- a/lib/eal/include/rte_common.h
> +++ b/lib/eal/include/rte_common.h
> @@ -527,6 +527,19 @@ rte_is_aligned(const void * const __rte_restrict ptr, const unsigned int align)
>  /** Force minimum cache line alignment. */
>  #define __rte_cache_min_aligned __rte_aligned(RTE_CACHE_LINE_MIN_SIZE)
> 
> +#define _RTE_CACHE_GUARD_HELPER2(unique) \
> +		char cache_guard_ ## unique[RTE_CACHE_LINE_SIZE * RTE_CACHE_GUARD_LINES] \
> +		__rte_cache_aligned
> +#define _RTE_CACHE_GUARD_HELPER1(unique) _RTE_CACHE_GUARD_HELPER2(unique)
> +/**
> + * Empty cache lines, to guard against false sharing-like effects
> + * on systems with a next-N-lines hardware prefetcher.
> + *
> + * Use as spacing between data accessed by different lcores,
> + * to prevent cache thrashing on hardware with speculative prefetching.
> + */
> +#define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER1(__COUNTER__)
> +
>  /*********** PA/IOVA type definitions ********/
> 
>  /** Physical address */
> diff --git a/lib/ring/rte_ring_core.h b/lib/ring/rte_ring_core.h
> index d1e59bf9ad..327fdcf28f 100644
> --- a/lib/ring/rte_ring_core.h
> +++ b/lib/ring/rte_ring_core.h
> @@ -126,7 +126,7 @@ struct rte_ring {
>  	uint32_t mask;           /**< Mask (size-1) of ring. */
>  	uint32_t capacity;       /**< Usable size of ring */
> 
> -	char pad0 __rte_cache_aligned; /**< empty cache line */
> +	RTE_CACHE_GUARD;
> 
>  	/** Ring producer status. */
>  	union {
> @@ -135,7 +135,7 @@ struct rte_ring {
>  		struct rte_ring_rts_headtail rts_prod;
>  	}  __rte_cache_aligned;
> 
> -	char pad1 __rte_cache_aligned; /**< empty cache line */
> +	RTE_CACHE_GUARD;
> 
>  	/** Ring consumer status. */
>  	union {
> @@ -144,7 +144,7 @@ struct rte_ring {
>  		struct rte_ring_rts_headtail rts_cons;
>  	}  __rte_cache_aligned;
> 
> -	char pad2 __rte_cache_aligned; /**< empty cache line */
> +	RTE_CACHE_GUARD;
>  };
> 
>  #define RING_F_SP_ENQ 0x0001 /**< The default enqueue is "single-producer". */
> --
> 2.17.1

I asked Mattias Rönnblom in the RFC discussion thread to ACK this patch, which he then did [1], so I'm copying his ACK here for patchwork to see.

[1]: http://inbox.dpdk.org/dev/66b862bc-ca22-02da-069c-a82c2d52ed49@lysator.liu.se/

Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>



* Re: [PATCH] clarify purpose of empty cache lines
  2023-09-04  9:12 ` Bruce Richardson
@ 2023-10-11 15:59   ` Thomas Monjalon
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Monjalon @ 2023-10-11 15:59 UTC (permalink / raw)
  To: Morten Brørup
  Cc: dev, david.marchand, honnappa.nagarahalli, konstantin.v.ananyev,
	mattias.ronnblom, olivier.matz, andrew.rybchenko,
	Bruce Richardson

04/09/2023 11:12, Bruce Richardson:
> On Mon, Sep 04, 2023 at 10:43:49AM +0200, Morten Brørup wrote:
> > This patch introduces the generic RTE_CACHE_GUARD macro into the EAL, and
> > replaces vaguely described empty cache lines in the rte_ring structure
> > with this macro.
> > 
> > Although the implementation of the rte_ring structure assumes that the
> > hardware speculatively prefetches 1 cache line, this number can be changed
> > at build time by modifying RTE_CACHE_GUARD_LINES in rte_config.h.
> > 
> > The background and the RFC were discussed in this thread:
> > http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35D87B39@smartserver.smartshare.dk/
> > 
> > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > ---
> Seems fine to me.
> 
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Applied, thanks.





* Re: [PATCH] clarify purpose of empty cache lines
  2023-09-04  8:43 [PATCH] clarify purpose of empty cache lines Morten Brørup
  2023-09-04  9:12 ` Bruce Richardson
  2023-09-05  6:41 ` Morten Brørup
@ 2023-10-11 23:43 ` Thomas Monjalon
  2023-10-12  6:00   ` Morten Brørup
  2023-10-12  6:07   ` Morten Brørup
  2 siblings, 2 replies; 7+ messages in thread
From: Thomas Monjalon @ 2023-10-11 23:43 UTC (permalink / raw)
  To: Morten Brørup
  Cc: david.marchand, honnappa.nagarahalli, konstantin.v.ananyev,
	bruce.richardson, mattias.ronnblom, dev, olivier.matz,
	andrew.rybchenko, dev

04/09/2023 10:43, Morten Brørup:
>  /** Force minimum cache line alignment. */
>  #define __rte_cache_min_aligned __rte_aligned(RTE_CACHE_LINE_MIN_SIZE)
>  
> +#define _RTE_CACHE_GUARD_HELPER2(unique) \
> +		char cache_guard_ ## unique[RTE_CACHE_LINE_SIZE * RTE_CACHE_GUARD_LINES] \
> +		__rte_cache_aligned
> +#define _RTE_CACHE_GUARD_HELPER1(unique) _RTE_CACHE_GUARD_HELPER2(unique)

What is the reason for this intermediate helper macro?

> +/**
> + * Empty cache lines, to guard against false sharing-like effects
> + * on systems with a next-N-lines hardware prefetcher.
> + *
> + * Use as spacing between data accessed by different lcores,
> + * to prevent cache thrashing on hardware with speculative prefetching.
> + */
> +#define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER1(__COUNTER__)






* RE: [PATCH] clarify purpose of empty cache lines
  2023-10-11 23:43 ` Thomas Monjalon
@ 2023-10-12  6:00   ` Morten Brørup
  2023-10-12  6:07   ` Morten Brørup
  1 sibling, 0 replies; 7+ messages in thread
From: Morten Brørup @ 2023-10-12  6:00 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: david.marchand, honnappa.nagarahalli, konstantin.v.ananyev,
	bruce.richardson, mattias.ronnblom, dev, olivier.matz,
	andrew.rybchenko, dev

> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, 12 October 2023 01.44
> 
> 04/09/2023 10:43, Morten Brørup:
> >  /** Force minimum cache line alignment. */
> >  #define __rte_cache_min_aligned __rte_aligned(RTE_CACHE_LINE_MIN_SIZE)
> >
> > +#define _RTE_CACHE_GUARD_HELPER2(unique) \
> > +		char cache_guard_ ## unique[RTE_CACHE_LINE_SIZE * RTE_CACHE_GUARD_LINES] \
> > +		__rte_cache_aligned
> > +#define _RTE_CACHE_GUARD_HELPER1(unique) _RTE_CACHE_GUARD_HELPER2(unique)
> 
> What is the reason for this intermediate helper macro?
> 
> > +/**
> > + * Empty cache lines, to guard against false sharing-like effects
> > + * on systems with a next-N-lines hardware prefetcher.
> > + *
> > + * Use as spacing between data accessed by different lcores,
> > + * to prevent cache thrashing on hardware with speculative prefetching.
> > + */
> > +#define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER1(__COUNTER__)

HELPER1 is required to convert __COUNTER__ to a number before HELPER2 concatenates it to cache_guard_.

If using HELPER2 directly,

#define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER2(__COUNTER__)

would expand to:

char cache_guard___COUNTER__[RTE_CACHE_LINE_SIZE * RTE_CACHE_GUARD_LINES]
	__rte_cache_aligned

This name is not unique, so RTE_CACHE_GUARD could not be used more than once in the same structure.



* RE: [PATCH] clarify purpose of empty cache lines
  2023-10-11 23:43 ` Thomas Monjalon
  2023-10-12  6:00   ` Morten Brørup
@ 2023-10-12  6:07   ` Morten Brørup
  1 sibling, 0 replies; 7+ messages in thread
From: Morten Brørup @ 2023-10-12  6:07 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: david.marchand, honnappa.nagarahalli, konstantin.v.ananyev,
	bruce.richardson, mattias.ronnblom, dev, olivier.matz,
	andrew.rybchenko, dev

> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, 12 October 2023 01.44
> 
> 04/09/2023 10:43, Morten Brørup:
> >  /** Force minimum cache line alignment. */
> >  #define __rte_cache_min_aligned __rte_aligned(RTE_CACHE_LINE_MIN_SIZE)
> >
> > +#define _RTE_CACHE_GUARD_HELPER2(unique) \
> > +		char cache_guard_ ## unique[RTE_CACHE_LINE_SIZE * RTE_CACHE_GUARD_LINES] \
> > +		__rte_cache_aligned
> > +#define _RTE_CACHE_GUARD_HELPER1(unique) _RTE_CACHE_GUARD_HELPER2(unique)
> 
> What is the reason for this intermediate helper macro?
> 
> > +/**
> > + * Empty cache lines, to guard against false sharing-like effects
> > + * on systems with a next-N-lines hardware prefetcher.
> > + *
> > + * Use as spacing between data accessed by different lcores,
> > + * to prevent cache thrashing on hardware with speculative prefetching.
> > + */
> > +#define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER1(__COUNTER__)

HELPER1 is required to convert __COUNTER__ to a number before HELPER2 concatenates it to cache_guard_.

If using HELPER2 directly,

#define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER2(__COUNTER__)

would expand to:

char cache_guard___COUNTER__[RTE_CACHE_LINE_SIZE * RTE_CACHE_GUARD_LINES]
	__rte_cache_aligned

This name is not unique, so RTE_CACHE_GUARD could not be used more than once in the same structure.
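
The difference between the one-level and the two-level expansion can be reproduced without DPDK. In the sketch below, DEMO_ID, DEMO_STR, NAME_DIRECT, NAME_INDIRECT, DEMO_GUARD and struct two_guards are hypothetical names chosen for illustration. A fixed DEMO_ID stands in for __COUNTER__ in the first half so that the pasted name is predictable, and the 64-byte guard size is an assumption.

```c
#include <stddef.h>

/* Two-level stringification, used only to make the pasted names visible. */
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

/* Fixed stand-in for __COUNTER__, so the result is predictable. */
#define DEMO_ID 7

/* One level: an operand of ## is not macro-expanded before pasting. */
#define NAME_DIRECT(unique) guard_ ## unique
/* Two levels: the argument is fully expanded here, then handed on. */
#define NAME_INDIRECT(unique) NAME_DIRECT(unique)

/* Returns the identifier produced by the one-level macro, as a string. */
static inline const char *direct_name(void)
{
	return DEMO_STR(NAME_DIRECT(DEMO_ID));
}

/* Returns the identifier produced by the two-level macro, as a string. */
static inline const char *indirect_name(void)
{
	return DEMO_STR(NAME_INDIRECT(DEMO_ID));
}

/* The same pattern with __COUNTER__ (a GCC/Clang/MSVC extension). */
#define DEMO_GUARD_HELPER2(unique) char guard_ ## unique[64]
#define DEMO_GUARD_HELPER1(unique) DEMO_GUARD_HELPER2(unique)
#define DEMO_GUARD DEMO_GUARD_HELPER1(__COUNTER__)

/* Compiles only because each DEMO_GUARD yields a distinct member name. */
struct two_guards {
	DEMO_GUARD;
	int payload;
	DEMO_GUARD;
};
```

Here direct_name() returns "guard_DEMO_ID" and indirect_name() returns "guard_7". If DEMO_GUARD were defined through DEMO_GUARD_HELPER2 directly, both members of struct two_guards would be named guard___COUNTER__ and the structure would fail to compile with a duplicate member error.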


