DPDK patches and discussions
 help / color / mirror / Atom feed
* [RFC PATCH] eventdev: ensure 16-byte alignment for events
@ 2023-10-05 11:51 Bruce Richardson
  2023-10-05 12:06 ` Bruce Richardson
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Bruce Richardson @ 2023-10-05 11:51 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, Jerin Jacob

The event structure in DPDK is 16-bytes in size, and events are
regularly passed as parameters directly rather than being passed as
pointers. To help compiler optimize correctly, we can explicitly request
16-byte alignment for events, which means that we should be able
to do aligned vector loads/stores (e.g. with SSE or Neon) when working
with those events.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/eventdev/rte_eventdev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index 2ba8a7b090..bb0d59b059 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -1344,7 +1344,7 @@ struct rte_event {
 		struct rte_event_vector *vec;
 		/**< Event vector pointer. */
 	};
-};
+} __rte_aligned(16);
 
 /* Ethdev Rx adapter capability bitmap flags */
 #define RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT	0x1
-- 
2.39.2


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH] eventdev: ensure 16-byte alignment for events
  2023-10-05 11:51 [RFC PATCH] eventdev: ensure 16-byte alignment for events Bruce Richardson
@ 2023-10-05 12:06 ` Bruce Richardson
  2023-10-05 13:11   ` Jerin Jacob
  2023-10-05 12:12 ` Morten Brørup
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2023-10-05 12:06 UTC (permalink / raw)
  To: dev; +Cc: Jerin Jacob

On Thu, Oct 05, 2023 at 12:51:00PM +0100, Bruce Richardson wrote:
> The event structure in DPDK is 16-bytes in size, and events are
> regularly passed as parameters directly rather than being passed as
> pointers. To help compiler optimize correctly, we can explicitly request
> 16-byte alignment for events, which means that we should be able
> to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> with those events.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  lib/eventdev/rte_eventdev.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> index 2ba8a7b090..bb0d59b059 100644
> --- a/lib/eventdev/rte_eventdev.h
> +++ b/lib/eventdev/rte_eventdev.h
> @@ -1344,7 +1344,7 @@ struct rte_event {
>  		struct rte_event_vector *vec;
>  		/**< Event vector pointer. */
>  	};
> -};
> +} __rte_aligned(16);
>  

Looking for feedback on this idea - hence the fact this is going as an RFC.
It seems to me something that should be done for performance reasons, but
I'm not sure if there are any negative consequences of doing this.

Since this is an ABI-affecting change, a decision on this needs to be made
for 23.11, or else it will be locked in for at least another year. Hence me
sending it now as an RFC late in the release cycle, rather than deferring
to next release.

/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [RFC PATCH] eventdev: ensure 16-byte alignment for events
  2023-10-05 11:51 [RFC PATCH] eventdev: ensure 16-byte alignment for events Bruce Richardson
  2023-10-05 12:06 ` Bruce Richardson
@ 2023-10-05 12:12 ` Morten Brørup
  2023-10-05 13:02   ` Bruce Richardson
  2023-10-06  9:37 ` [PATCH v2] " Bruce Richardson
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Morten Brørup @ 2023-10-05 12:12 UTC (permalink / raw)
  To: Bruce Richardson, dev; +Cc: Jerin Jacob

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Thursday, 5 October 2023 13.51
ure 16-byte alignment for events
> 
> The event structure in DPDK is 16-bytes in size, and events are
> regularly passed as parameters directly rather than being passed as
> pointers. To help compiler optimize correctly, we can explicitly request
> 16-byte alignment for events, which means that we should be able
> to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> with those events.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---

AFAICS it is not specified anywhere that the size of this struct must be 16 byte.

Consider adding this requirement to the struct documentation, and possibly a static_assert to verify.

Acked-by: Morten Brørup <mb@smartsharesystems.com>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH] eventdev: ensure 16-byte alignment for events
  2023-10-05 12:12 ` Morten Brørup
@ 2023-10-05 13:02   ` Bruce Richardson
  0 siblings, 0 replies; 27+ messages in thread
From: Bruce Richardson @ 2023-10-05 13:02 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev, Jerin Jacob

On Thu, Oct 05, 2023 at 02:12:10PM +0200, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Thursday, 5 October 2023 13.51
> ure 16-byte alignment for events
> > 
> > The event structure in DPDK is 16-bytes in size, and events are
> > regularly passed as parameters directly rather than being passed as
> > pointers. To help compiler optimize correctly, we can explicitly request
> > 16-byte alignment for events, which means that we should be able
> > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > with those events.
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> 
> AFAICS it is not specified anywhere that the size of this struct must be 16 byte.
> 
> Consider adding this requirement to the struct documentation, and possibly a static_assert to verify.
> 
Yes, good point to document and verify.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH] eventdev: ensure 16-byte alignment for events
  2023-10-05 12:06 ` Bruce Richardson
@ 2023-10-05 13:11   ` Jerin Jacob
  2023-10-05 13:15     ` Bruce Richardson
  0 siblings, 1 reply; 27+ messages in thread
From: Jerin Jacob @ 2023-10-05 13:11 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: dev, Jerin Jacob, Pavan Nikhilesh, Abdullah Sevincer,
	Shijith Thotton, Hemant Agrawal, Sachin Saxena, Van Haaren,
	Harry, Mattias Rönnblom, Liang Ma, Peter Mccarthy,
	Honnappa Nagarahalli

On Thu, Oct 5, 2023 at 6:01 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Thu, Oct 05, 2023 at 12:51:00PM +0100, Bruce Richardson wrote:
> > The event structure in DPDK is 16-bytes in size, and events are
> > regularly passed as parameters directly rather than being passed as
> > pointers. To help compiler optimize correctly, we can explicitly request
> > 16-byte alignment for events, which means that we should be able
> > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > with those events.
> >
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> >  lib/eventdev/rte_eventdev.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> > index 2ba8a7b090..bb0d59b059 100644
> > --- a/lib/eventdev/rte_eventdev.h
> > +++ b/lib/eventdev/rte_eventdev.h
> > @@ -1344,7 +1344,7 @@ struct rte_event {
> >               struct rte_event_vector *vec;
> >               /**< Event vector pointer. */
> >       };
> > -};
> > +} __rte_aligned(16);
> >
>

+ Eventdev driver maintainers for review and for performance testing.

> Looking for feedback on this idea - hence the fact this is going as an RFC.

Are you seeing any performance improvement ? Look like only DLB2
driver only using SEE or AVX512 instructions.

> It seems to me something that should be done for performance reasons, but
> I'm not sure if there are any negative consequences of doing this.

In general, it looks OK, However, We may need more testing.
I can only speculate the following as of now, Since event memory is
allocated from stack most case,
there may stack pointer fix up in code for desired alignment ie. some
add/sub instructions which comes for free most likely due to pipeline.

We will test on Marvel HW. Request others to test on their HWs.


>
> Since this is an ABI-affecting change, a decision on this needs to be made
> for 23.11, or else it will be locked in for at least another year. Hence me
> sending it now as an RFC late in the release cycle, rather than deferring
> to next release.
>
> /Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH] eventdev: ensure 16-byte alignment for events
  2023-10-05 13:11   ` Jerin Jacob
@ 2023-10-05 13:15     ` Bruce Richardson
  2023-10-06  7:19       ` Jerin Jacob
  0 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2023-10-05 13:15 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: dev, Jerin Jacob, Pavan Nikhilesh, Abdullah Sevincer,
	Shijith Thotton, Hemant Agrawal, Sachin Saxena, Van Haaren,
	Harry, Mattias Rönnblom, Liang Ma, Peter Mccarthy,
	Honnappa Nagarahalli

On Thu, Oct 05, 2023 at 06:41:34PM +0530, Jerin Jacob wrote:
> On Thu, Oct 5, 2023 at 6:01 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > On Thu, Oct 05, 2023 at 12:51:00PM +0100, Bruce Richardson wrote:
> > > The event structure in DPDK is 16-bytes in size, and events are
> > > regularly passed as parameters directly rather than being passed as
> > > pointers. To help compiler optimize correctly, we can explicitly request
> > > 16-byte alignment for events, which means that we should be able
> > > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > > with those events.
> > >
> > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > ---
> > >  lib/eventdev/rte_eventdev.h | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> > > index 2ba8a7b090..bb0d59b059 100644
> > > --- a/lib/eventdev/rte_eventdev.h
> > > +++ b/lib/eventdev/rte_eventdev.h
> > > @@ -1344,7 +1344,7 @@ struct rte_event {
> > >               struct rte_event_vector *vec;
> > >               /**< Event vector pointer. */
> > >       };
> > > -};
> > > +} __rte_aligned(16);
> > >
> >
> 
> + Eventdev driver maintainers for review and for performance testing.
> 
> > Looking for feedback on this idea - hence the fact this is going as an RFC.
> 
> Are you seeing any performance improvement ? Look like only DLB2
> driver only using SEE or AVX512 instructions.
> 

The idea would be that the driver code (and eventdev code) should not need
to use SSE directly. If we mark the event struct as aligned, it should help
encourage the compiler to use these instructions under-the-hood. For
example, when copying an event, the compiler should be emitting 128-bit
loads and stores for most platforms.

/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH] eventdev: ensure 16-byte alignment for events
  2023-10-05 13:15     ` Bruce Richardson
@ 2023-10-06  7:19       ` Jerin Jacob
  0 siblings, 0 replies; 27+ messages in thread
From: Jerin Jacob @ 2023-10-06  7:19 UTC (permalink / raw)
  To: Bruce Richardson, Thomas Monjalon
  Cc: dev, Jerin Jacob, Pavan Nikhilesh, Abdullah Sevincer,
	Shijith Thotton, Hemant Agrawal, Sachin Saxena, Van Haaren,
	Harry, Mattias Rönnblom, Liang Ma, Peter Mccarthy,
	Honnappa Nagarahalli

On Thu, Oct 5, 2023 at 6:52 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Thu, Oct 05, 2023 at 06:41:34PM +0530, Jerin Jacob wrote:
> > On Thu, Oct 5, 2023 at 6:01 PM Bruce Richardson
> > <bruce.richardson@intel.com> wrote:
> > >
> > > On Thu, Oct 05, 2023 at 12:51:00PM +0100, Bruce Richardson wrote:
> > > > The event structure in DPDK is 16-bytes in size, and events are
> > > > regularly passed as parameters directly rather than being passed as
> > > > pointers. To help compiler optimize correctly, we can explicitly request
> > > > 16-byte alignment for events, which means that we should be able
> > > > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > > > with those events.
> > > >
> > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > ---
> > > >  lib/eventdev/rte_eventdev.h | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> > > > index 2ba8a7b090..bb0d59b059 100644
> > > > --- a/lib/eventdev/rte_eventdev.h
> > > > +++ b/lib/eventdev/rte_eventdev.h
> > > > @@ -1344,7 +1344,7 @@ struct rte_event {
> > > >               struct rte_event_vector *vec;
> > > >               /**< Event vector pointer. */
> > > >       };
> > > > -};
> > > > +} __rte_aligned(16);
> > > >
> > >
> >
> > + Eventdev driver maintainers for review and for performance testing.
> >
> > > Looking for feedback on this idea - hence the fact this is going as an RFC.
> >
> > Are you seeing any performance improvement ? Look like only DLB2
> > driver only using SEE or AVX512 instructions.
> >
>
> The idea would be that the driver code (and eventdev code) should not need
> to use SSE directly. If we mark the event struct as aligned, it should help
> encourage the compiler to use these instructions under-the-hood. For
> example, when copying an event, the compiler should be emitting 128-bit
> loads and stores for most platforms.

With limited testing, there is no performance regression is seen. In
fact, the compiler is generating the
same instruction stream on both cases.

If there are no objections from others, Please send v1 with "ABI
change" update in doc/guides/rel_notes/release_23_11.rst.

With above change,
Acked-by: Jerin Jacob <jerinj@marvell.com>

NOTE: I already made PR to Thomas for rc1. Since is needs to part of
rc1, I need to sync with @Thomas Monjalon  as well.







>
> /Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2] eventdev: ensure 16-byte alignment for events
  2023-10-05 11:51 [RFC PATCH] eventdev: ensure 16-byte alignment for events Bruce Richardson
  2023-10-05 12:06 ` Bruce Richardson
  2023-10-05 12:12 ` Morten Brørup
@ 2023-10-06  9:37 ` Bruce Richardson
  2023-10-06  9:45 ` [PATCH v3] " Bruce Richardson
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Bruce Richardson @ 2023-10-06  9:37 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, Morten Brørup, Jerin Jacob

The event structure in DPDK is 16-bytes in size, and events are
regularly passed as parameters directly rather than being passed as
pointers. To help compiler optimize correctly, we can explicitly request
16-byte alignment for events, which means that we should be able
to do aligned vector loads/stores (e.g. with SSE or Neon) when working
with those events.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
---
 doc/guides/rel_notes/release_23_11.rst | 4 ++++
 lib/eventdev/rte_eventdev.h            | 6 +++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_23_11.rst b/doc/guides/rel_notes/release_23_11.rst
index 261594aacc..48c19ae52a 100644
--- a/doc/guides/rel_notes/release_23_11.rst
+++ b/doc/guides/rel_notes/release_23_11.rst
@@ -158,6 +158,10 @@ ABI Changes
    Also, make sure to start the actual text at the margin.
    =======================================================
 
+* The ``rte_event`` structure, used by eventdev library and DPDK "event" class drivers,
+  is now 16-byte aligned, as well as being 16-bytes in size.
+  In previous releases, the structure only required 8-byte alignement.
+
 
 Known Issues
 ------------
diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index 2ea98302b8..5b7c5b3399 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -1284,6 +1284,8 @@ struct rte_event_vector {
 /**
  * The generic *rte_event* structure to hold the event attributes
  * for dequeue and enqueue operation
+ *
+ * This structure must be kept at 16-bytes in size, and has 16-byte alignment.
  */
 struct rte_event {
 	/** WORD0 */
@@ -1356,7 +1358,9 @@ struct rte_event {
 		struct rte_event_vector *vec;
 		/**< Event vector pointer. */
 	};
-};
+} __rte_aligned(16);
+
+_Static_assert(sizeof(struct rte_event) == 16, "Event structure size is not 16-bytes in size");
 
 /* Ethdev Rx adapter capability bitmap flags */
 #define RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT	0x1
-- 
2.39.2


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3] eventdev: ensure 16-byte alignment for events
  2023-10-05 11:51 [RFC PATCH] eventdev: ensure 16-byte alignment for events Bruce Richardson
                   ` (2 preceding siblings ...)
  2023-10-06  9:37 ` [PATCH v2] " Bruce Richardson
@ 2023-10-06  9:45 ` Bruce Richardson
  2023-10-06 10:13   ` Morten Brørup
  2023-10-06 10:29 ` [PATCH v4] " Bruce Richardson
  2023-10-06 12:15 ` [RFC PATCH] " Mattias Rönnblom
  5 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2023-10-06  9:45 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, Morten Brørup, Jerin Jacob

The event structure in DPDK is 16-bytes in size, and events are
regularly passed as parameters directly rather than being passed as
pointers. To help compiler optimize correctly, we can explicitly request
16-byte alignment for events, which means that we should be able
to do aligned vector loads/stores (e.g. with SSE or Neon) when working
with those events.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>

---
v3: Fix spelling mistake in RN entry
    rebase on latest eventdev tree HEAD

v2: added release note entry for ABI change
    added structure comment on 16-byte size and alignment
    added static assert for structure size
---
 doc/guides/rel_notes/release_23_11.rst | 4 ++++
 lib/eventdev/rte_eventdev.h            | 6 +++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_23_11.rst b/doc/guides/rel_notes/release_23_11.rst
index 0903046b0c..33fed3e433 100644
--- a/doc/guides/rel_notes/release_23_11.rst
+++ b/doc/guides/rel_notes/release_23_11.rst
@@ -209,6 +209,10 @@ ABI Changes
   fields, to move ``rxq`` and ``txq`` fields, to change the size of
   ``reserved1`` and ``reserved2`` fields.
 
+* The ``rte_event`` structure, used by eventdev library and DPDK "event" class drivers,
+  is now 16-byte aligned, as well as being 16-bytes in size.
+  In previous releases, the structure only required 8-byte alignment.
+
 
 Known Issues
 ------------
diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index 2ea98302b8..5b7c5b3399 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -1284,6 +1284,8 @@ struct rte_event_vector {
 /**
  * The generic *rte_event* structure to hold the event attributes
  * for dequeue and enqueue operation
+ *
+ * This structure must be kept at 16-bytes in size, and has 16-byte alignment.
  */
 struct rte_event {
 	/** WORD0 */
@@ -1356,7 +1358,9 @@ struct rte_event {
 		struct rte_event_vector *vec;
 		/**< Event vector pointer. */
 	};
-};
+} __rte_aligned(16);
+
+_Static_assert(sizeof(struct rte_event) == 16, "Event structure size is not 16-bytes in size");
 
 /* Ethdev Rx adapter capability bitmap flags */
 #define RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT	0x1
-- 
2.39.2


^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v3] eventdev: ensure 16-byte alignment for events
  2023-10-06  9:45 ` [PATCH v3] " Bruce Richardson
@ 2023-10-06 10:13   ` Morten Brørup
  2023-10-06 10:16     ` Jerin Jacob
  2023-10-06 10:16     ` Bruce Richardson
  0 siblings, 2 replies; 27+ messages in thread
From: Morten Brørup @ 2023-10-06 10:13 UTC (permalink / raw)
  To: Bruce Richardson, dev; +Cc: Jerin Jacob

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Friday, 6 October 2023 11.45
> 
> The event structure in DPDK is 16-bytes in size, and events are
> regularly passed as parameters directly rather than being passed as
> pointers. To help compiler optimize correctly, we can explicitly request
> 16-byte alignment for events, which means that we should be able
> to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> with those events.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> Acked-by: Jerin Jacob <jerinj@marvell.com>
> 
> ---

[...]

> +_Static_assert(sizeof(struct rte_event) == 16, "Event structure size is not 16-bytes in size");

Thank you for adding this extra check. We should have more of these.

NB: _Static_assert is deprecated in C23 [1], so for forward compatibility, you could use static_assert (which is available in <assert.h>) instead. Nice to have; feel free to ignore this comment.

[1]: https://en.cppreference.com/w/c/language/_Static_assert


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] eventdev: ensure 16-byte alignment for events
  2023-10-06 10:13   ` Morten Brørup
@ 2023-10-06 10:16     ` Jerin Jacob
  2023-10-06 10:19       ` Bruce Richardson
  2023-10-06 10:16     ` Bruce Richardson
  1 sibling, 1 reply; 27+ messages in thread
From: Jerin Jacob @ 2023-10-06 10:16 UTC (permalink / raw)
  To: Morten Brørup; +Cc: Bruce Richardson, dev, Jerin Jacob

On Fri, Oct 6, 2023 at 3:44 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Friday, 6 October 2023 11.45
> >
> > The event structure in DPDK is 16-bytes in size, and events are
> > regularly passed as parameters directly rather than being passed as
> > pointers. To help compiler optimize correctly, we can explicitly request
> > 16-byte alignment for events, which means that we should be able
> > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > with those events.
> >
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > Acked-by: Jerin Jacob <jerinj@marvell.com>
> >
> > ---
>
> [...]
>
> > +_Static_assert(sizeof(struct rte_event) == 16, "Event structure size is not 16-bytes in size");
>
> Thank you for adding this extra check. We should have more of these.

Use existing RTE_BUILD_BUG_ON this on  .c file instead of header file.

>
> NB: _Static_assert is deprecated in C23 [1], so for forward compatibility, you could use static_assert (which is available in <assert.h>) instead. Nice to have; feel free to ignore this comment.
>
> [1]: https://en.cppreference.com/w/c/language/_Static_assert
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] eventdev: ensure 16-byte alignment for events
  2023-10-06 10:13   ` Morten Brørup
  2023-10-06 10:16     ` Jerin Jacob
@ 2023-10-06 10:16     ` Bruce Richardson
  2023-10-06 10:35       ` Morten Brørup
  1 sibling, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2023-10-06 10:16 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev, Jerin Jacob

On Fri, Oct 06, 2023 at 12:13:54PM +0200, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Friday, 6 October 2023 11.45
> > 
> > The event structure in DPDK is 16-bytes in size, and events are
> > regularly passed as parameters directly rather than being passed as
> > pointers. To help compiler optimize correctly, we can explicitly request
> > 16-byte alignment for events, which means that we should be able
> > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > with those events.
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > Acked-by: Jerin Jacob <jerinj@marvell.com>
> > 
> > ---
> 
> [...]
> 
> > +_Static_assert(sizeof(struct rte_event) == 16, "Event structure size is not 16-bytes in size");
> 
> Thank you for adding this extra check. We should have more of these.
> 
> NB: _Static_assert is deprecated in C23 [1], so for forward compatibility, you could use static_assert (which is available in <assert.h>) instead. Nice to have; feel free to ignore this comment.
> 
Is the availability in assert.h backward compatible with C11, since the
link you posted seems to imply that "static_assert" is only from C23
onwards?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] eventdev: ensure 16-byte alignment for events
  2023-10-06 10:16     ` Jerin Jacob
@ 2023-10-06 10:19       ` Bruce Richardson
  2023-10-06 10:24         ` Jerin Jacob
  0 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2023-10-06 10:19 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: Morten Brørup, dev, Jerin Jacob

On Fri, Oct 06, 2023 at 03:46:21PM +0530, Jerin Jacob wrote:
> On Fri, Oct 6, 2023 at 3:44 PM Morten Brørup <mb@smartsharesystems.com> wrote:
> >
> > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > Sent: Friday, 6 October 2023 11.45
> > >
> > > The event structure in DPDK is 16-bytes in size, and events are
> > > regularly passed as parameters directly rather than being passed as
> > > pointers. To help compiler optimize correctly, we can explicitly request
> > > 16-byte alignment for events, which means that we should be able
> > > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > > with those events.
> > >
> > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > Acked-by: Jerin Jacob <jerinj@marvell.com>
> > >
> > > ---
> >
> > [...]
> >
> > > +_Static_assert(sizeof(struct rte_event) == 16, "Event structure size is not 16-bytes in size");
> >
> > Thank you for adding this extra check. We should have more of these.
> 
> Use existing RTE_BUILD_BUG_ON this on  .c file instead of header file.
> 

Ok to move to a C file, but I think using static_asserts are better than
using the old macro tricks of negatively sized arrays.

> >
> > NB: _Static_assert is deprecated in C23 [1], so for forward compatibility, you could use static_assert (which is available in <assert.h>) instead. Nice to have; feel free to ignore this comment.
> >
> > [1]: https://en.cppreference.com/w/c/language/_Static_assert
> >

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] eventdev: ensure 16-byte alignment for events
  2023-10-06 10:19       ` Bruce Richardson
@ 2023-10-06 10:24         ` Jerin Jacob
  2023-10-06 10:27           ` Bruce Richardson
  0 siblings, 1 reply; 27+ messages in thread
From: Jerin Jacob @ 2023-10-06 10:24 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: Morten Brørup, dev, Jerin Jacob

On Fri, Oct 6, 2023 at 3:49 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Fri, Oct 06, 2023 at 03:46:21PM +0530, Jerin Jacob wrote:
> > On Fri, Oct 6, 2023 at 3:44 PM Morten Brørup <mb@smartsharesystems.com> wrote:
> > >
> > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > Sent: Friday, 6 October 2023 11.45
> > > >
> > > > The event structure in DPDK is 16-bytes in size, and events are
> > > > regularly passed as parameters directly rather than being passed as
> > > > pointers. To help compiler optimize correctly, we can explicitly request
> > > > 16-byte alignment for events, which means that we should be able
> > > > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > > > with those events.
> > > >
> > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > > Acked-by: Jerin Jacob <jerinj@marvell.com>
> > > >
> > > > ---
> > >
> > > [...]
> > >
> > > > +_Static_assert(sizeof(struct rte_event) == 16, "Event structure size is not 16-bytes in size");
> > >
> > > Thank you for adding this extra check. We should have more of these.
> >
> > Use existing RTE_BUILD_BUG_ON this on  .c file instead of header file.
> >
>
> Ok to move to a C file, but I think using static_asserts are better than
> using the old macro tricks of negatively sized arrays.

No strong opinion on the API. I just told because compatibility
discussions came in.
Key is to move to .c file.



>
> > >
> > > NB: _Static_assert is deprecated in C23 [1], so for forward compatibility, you could use static_assert (which is available in <assert.h>) instead. Nice to have; feel free to ignore this comment.
> > >
> > > [1]: https://en.cppreference.com/w/c/language/_Static_assert
> > >

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] eventdev: ensure 16-byte alignment for events
  2023-10-06 10:24         ` Jerin Jacob
@ 2023-10-06 10:27           ` Bruce Richardson
  0 siblings, 0 replies; 27+ messages in thread
From: Bruce Richardson @ 2023-10-06 10:27 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: Morten Brørup, dev, Jerin Jacob

On Fri, Oct 06, 2023 at 03:54:26PM +0530, Jerin Jacob wrote:
> On Fri, Oct 6, 2023 at 3:49 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > On Fri, Oct 06, 2023 at 03:46:21PM +0530, Jerin Jacob wrote:
> > > On Fri, Oct 6, 2023 at 3:44 PM Morten Brørup <mb@smartsharesystems.com> wrote:
> > > >
> > > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > > Sent: Friday, 6 October 2023 11.45
> > > > >
> > > > > The event structure in DPDK is 16-bytes in size, and events are
> > > > > regularly passed as parameters directly rather than being passed as
> > > > > pointers. To help compiler optimize correctly, we can explicitly request
> > > > > 16-byte alignment for events, which means that we should be able
> > > > > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > > > > with those events.
> > > > >
> > > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > > > Acked-by: Jerin Jacob <jerinj@marvell.com>
> > > > >
> > > > > ---
> > > >
> > > > [...]
> > > >
> > > > > +_Static_assert(sizeof(struct rte_event) == 16, "Event structure size is not 16-bytes in size");
> > > >
> > > > Thank you for adding this extra check. We should have more of these.
> > >
> > > Use existing RTE_BUILD_BUG_ON this on  .c file instead of header file.
> > >
> >
> > Ok to move to a C file, but I think using static_asserts are better than
> > using the old macro tricks of negatively sized arrays.
> 
> No strong opinion on the API. I just told because compatibility
> discussions came in.
> Key is to move to .c file.
> 
Thanks. V4 coming up....

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v4] eventdev: ensure 16-byte alignment for events
  2023-10-05 11:51 [RFC PATCH] eventdev: ensure 16-byte alignment for events Bruce Richardson
                   ` (3 preceding siblings ...)
  2023-10-06  9:45 ` [PATCH v3] " Bruce Richardson
@ 2023-10-06 10:29 ` Bruce Richardson
  2023-11-12  0:01   ` Stephen Hemminger
  2023-10-06 12:15 ` [RFC PATCH] " Mattias Rönnblom
  5 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2023-10-06 10:29 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, Morten Brørup, Jerin Jacob

The event structure in DPDK is 16-bytes in size, and events are
regularly passed as parameters directly rather than being passed as
pointers. To help compiler optimize correctly, we can explicitly request
16-byte alignment for events, which means that we should be able
to do aligned vector loads/stores (e.g. with SSE or Neon) when working
with those events.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>

---
v4: change _Static_assert to static_assert for fwd compatibility
    moved size check on event structure to C file from header file

v3: Fix spelling mistake in RN entry
    rebase on latest eventdev tree HEAD

v2: added release note entry for ABI change
    added structure comment on 16-byte size and alignment
    added static assert for structure size
---
 doc/guides/rel_notes/release_23_11.rst | 4 ++++
 lib/eventdev/rte_eventdev.c            | 3 +++
 lib/eventdev/rte_eventdev.h            | 4 +++-
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_23_11.rst b/doc/guides/rel_notes/release_23_11.rst
index 0903046b0c..33fed3e433 100644
--- a/doc/guides/rel_notes/release_23_11.rst
+++ b/doc/guides/rel_notes/release_23_11.rst
@@ -209,6 +209,10 @@ ABI Changes
   fields, to move ``rxq`` and ``txq`` fields, to change the size of
   ``reserved1`` and ``reserved2`` fields.
 
+* The ``rte_event`` structure, used by eventdev library and DPDK "event" class drivers,
+  is now 16-byte aligned, as well as being 16-bytes in size.
+  In previous releases, the structure only required 8-byte alignment.
+
 
 Known Issues
 ------------
diff --git a/lib/eventdev/rte_eventdev.c b/lib/eventdev/rte_eventdev.c
index 95373bbaad..adc9751cef 100644
--- a/lib/eventdev/rte_eventdev.c
+++ b/lib/eventdev/rte_eventdev.c
@@ -9,6 +9,7 @@
 #include <errno.h>
 #include <stdint.h>
 #include <inttypes.h>
+#include <assert.h>
 
 #include <rte_string_fns.h>
 #include <rte_log.h>
@@ -28,6 +29,8 @@
 #include "eventdev_pmd.h"
 #include "eventdev_trace.h"
 
+static_assert(sizeof(struct rte_event) == 16, "Event structure size is not 16-bytes in size");
+
 static struct rte_eventdev rte_event_devices[RTE_EVENT_MAX_DEVS];
 
 struct rte_eventdev *rte_eventdevs = rte_event_devices;
diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index 2ea98302b8..758ee83a3f 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -1284,6 +1284,8 @@ struct rte_event_vector {
 /**
  * The generic *rte_event* structure to hold the event attributes
  * for dequeue and enqueue operation
+ *
+ * This structure must be kept at 16-bytes in size, and has 16-byte alignment.
  */
 struct rte_event {
 	/** WORD0 */
@@ -1356,7 +1358,7 @@ struct rte_event {
 		struct rte_event_vector *vec;
 		/**< Event vector pointer. */
 	};
-};
+} __rte_aligned(16);
 
 /* Ethdev Rx adapter capability bitmap flags */
 #define RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT	0x1
-- 
2.39.2


^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v3] eventdev: ensure 16-byte alignment for events
  2023-10-06 10:16     ` Bruce Richardson
@ 2023-10-06 10:35       ` Morten Brørup
  2023-10-06 10:44         ` Bruce Richardson
  0 siblings, 1 reply; 27+ messages in thread
From: Morten Brørup @ 2023-10-06 10:35 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Jerin Jacob

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Friday, 6 October 2023 12.17
ev: ensure 16-byte alignment for events
> 
> On Fri, Oct 06, 2023 at 12:13:54PM +0200, Morten Brørup wrote:
> > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > Sent: Friday, 6 October 2023 11.45
> > >
> > > The event structure in DPDK is 16-bytes in size, and events are
> > > regularly passed as parameters directly rather than being passed as
> > > pointers. To help compiler optimize correctly, we can explicitly request
> > > 16-byte alignment for events, which means that we should be able
> > > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > > with those events.
> > >
> > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > Acked-by: Jerin Jacob <jerinj@marvell.com>
> > >
> > > ---
> >
> > [...]
> >
> > > +_Static_assert(sizeof(struct rte_event) == 16, "Event structure size is
> not 16-bytes in size");
> >
> > Thank you for adding this extra check. We should have more of these.
> >
> > NB: _Static_assert is deprecated in C23 [1], so for forward compatibility,
> you could use static_assert (which is available in <assert.h>) instead. Nice
> to have; feel free to ignore this comment.
> >
> Is the availability in assert.h backward compatible with C11, since the
> link you posted seems to imply that "static_assert" is only from C23
> onwards?

Yes, the link mentions "static_assert" being available for C11 as a convenience macro in assert.h.

I had to read the link very carefully to get this. I guess I'm not the only one. :-)

I don't object to moving it to the .c file. However, I think it's convenient for readability to have the static_assert close to the thing it checks, and/or close to any code that relies on the assumption it checks.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] eventdev: ensure 16-byte alignment for events
  2023-10-06 10:35       ` Morten Brørup
@ 2023-10-06 10:44         ` Bruce Richardson
  0 siblings, 0 replies; 27+ messages in thread
From: Bruce Richardson @ 2023-10-06 10:44 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev, Jerin Jacob

On Fri, Oct 06, 2023 at 12:35:26PM +0200, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Friday, 6 October 2023 12.17
> ev: ensure 16-byte alignment for events
> > 
> > On Fri, Oct 06, 2023 at 12:13:54PM +0200, Morten Brørup wrote:
> > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > Sent: Friday, 6 October 2023 11.45
> > > >
> > > > The event structure in DPDK is 16-bytes in size, and events are
> > > > regularly passed as parameters directly rather than being passed as
> > > > pointers. To help compiler optimize correctly, we can explicitly request
> > > > 16-byte alignment for events, which means that we should be able
> > > > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > > > with those events.
> > > >
> > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > > Acked-by: Jerin Jacob <jerinj@marvell.com>
> > > >
> > > > ---
> > >
> > > [...]
> > >
> > > > +_Static_assert(sizeof(struct rte_event) == 16, "Event structure size is
> > not 16-bytes in size");
> > >
> > > Thank you for adding this extra check. We should have more of these.
> > >
> > > NB: _Static_assert is deprecated in C23 [1], so for forward compatibility,
> > you could use static_assert (which is available in <assert.h>) instead. Nice
> > to have; feel free to ignore this comment.
> > >
> > Is the availability in assert.h backward compatible with C11, since the
> > link you posted seems to imply that "static_assert" is only from C23
> > onwards?
> 
> Yes, the link mentions "static_assert" being available for C11 as a convenience macro in assert.h.
> 
> I had to read the link very carefully to get this. I guess I'm not the only one. :-)
> 

I missed that in the link, but I did have a read of the header file itself
to see that there was a macro there with C11 guards. :-)

> I don't object to moving it to the .c file. However, I think it's convenient for readability to have the static_assert close to the thing it checks, and/or close to any code that relies on the assumption it checks.
> 
I'm ok either way, and happy enough to have the check in the C file. I
suppose it saves cluttering up the public header file with checks that
should not be relevant to the end-user, and are only for DPDK developers.

Anyway, v4 sent, hopefully with all concerns addressed.

/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH] eventdev: ensure 16-byte alignment for events
  2023-10-05 11:51 [RFC PATCH] eventdev: ensure 16-byte alignment for events Bruce Richardson
                   ` (4 preceding siblings ...)
  2023-10-06 10:29 ` [PATCH v4] " Bruce Richardson
@ 2023-10-06 12:15 ` Mattias Rönnblom
  2023-10-06 12:19   ` Bruce Richardson
  2024-01-19 22:30   ` Stephen Hemminger
  5 siblings, 2 replies; 27+ messages in thread
From: Mattias Rönnblom @ 2023-10-06 12:15 UTC (permalink / raw)
  To: Bruce Richardson, dev; +Cc: Jerin Jacob

On 2023-10-05 13:51, Bruce Richardson wrote:
> The event structure in DPDK is 16-bytes in size, and events are
> regularly passed as parameters directly rather than being passed as
> pointers.

When are events passed by-value, rather than by-reference? There are no 
such examples in the public eventdev API.

To help compiler optimize correctly, we can explicitly request
> 16-byte alignment for events, which means that we should be able
> to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> with those events.
> 

That change is both helping and sabotaging the optimizer's work. Now 
every stack allocation needs to be 2-byte aligned - in DPDK code, and in 
the application.

The effect this change has on an eventdev app using DSW is a ~3 
cycle/event performance degradation on an AMD Zen 3 system, and a ~4 
cycle/event performance degradation on a Skylake-generation Intel CPU.

What scenarios do you have in mind, where this change would improve the 
generated code? Something where there are no unaligned loads available 
in the ISA, or they are much slower than their aligned counterparts?

When I looked into the same issue for the DPDK IP checksumming routines, 
there basically were no such. Not that I could find.

> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>   lib/eventdev/rte_eventdev.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> index 2ba8a7b090..bb0d59b059 100644
> --- a/lib/eventdev/rte_eventdev.h
> +++ b/lib/eventdev/rte_eventdev.h
> @@ -1344,7 +1344,7 @@ struct rte_event {
>   		struct rte_event_vector *vec;
>   		/**< Event vector pointer. */
>   	};
> -};
> +} __rte_aligned(16);
>   
>   /* Ethdev Rx adapter capability bitmap flags */
>   #define RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT	0x1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH] eventdev: ensure 16-byte alignment for events
  2023-10-06 12:15 ` [RFC PATCH] " Mattias Rönnblom
@ 2023-10-06 12:19   ` Bruce Richardson
  2023-10-06 12:29     ` Mattias Rönnblom
  2024-01-19 22:30   ` Stephen Hemminger
  1 sibling, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2023-10-06 12:19 UTC (permalink / raw)
  To: Mattias Rönnblom; +Cc: dev, Jerin Jacob

On Fri, Oct 06, 2023 at 02:15:00PM +0200, Mattias Rönnblom wrote:
> On 2023-10-05 13:51, Bruce Richardson wrote:
> > The event structure in DPDK is 16-bytes in size, and events are
> > regularly passed as parameters directly rather than being passed as
> > pointers.
> 
> When are events passed by-value, rather than by-reference? There are no such
> examples in the public eventdev API.
> 
> To help compiler optimize correctly, we can explicitly request
> > 16-byte alignment for events, which means that we should be able
> > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > with those events.
> > 
> 
> That change is both helping and sabotaging the optimizer's work. Now every
> stack allocation needs to be 2-byte aligned - in DPDK code, and in the
> application.
> 
> The effect this change has on an eventdev app using DSW is a ~3 cycle/event
> performance degradation on an AMD Zen 3 system, and a ~4 cycle/event
> performance degradation on a Skylake-generation Intel CPU.
> 

Thanks for checking - this is the sort of feedback needed alright. In SW
eventdev we copy events around alot without using pointers, so I felt that
alignment would be helpful to avoid issues with events spanning cachelines.

However, since it has negative impacts, I'm quite happy to drop the idea,
and keep things as they are. I'll mark the change as rejected in patchwork.

/Bruce


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH] eventdev: ensure 16-byte alignment for events
  2023-10-06 12:19   ` Bruce Richardson
@ 2023-10-06 12:29     ` Mattias Rönnblom
  0 siblings, 0 replies; 27+ messages in thread
From: Mattias Rönnblom @ 2023-10-06 12:29 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Jerin Jacob

On 2023-10-06 14:19, Bruce Richardson wrote:
> On Fri, Oct 06, 2023 at 02:15:00PM +0200, Mattias Rönnblom wrote:
>> On 2023-10-05 13:51, Bruce Richardson wrote:
>>> The event structure in DPDK is 16-bytes in size, and events are
>>> regularly passed as parameters directly rather than being passed as
>>> pointers.
>>
>> When are events passed by-value, rather than by-reference? There are no such
>> examples in the public eventdev API.
>>
>> To help compiler optimize correctly, we can explicitly request
>>> 16-byte alignment for events, which means that we should be able
>>> to do aligned vector loads/stores (e.g. with SSE or Neon) when working
>>> with those events.
>>>
>>
>> That change is both helping and sabotaging the optimizer's work. Now every
>> stack allocation needs to be 2-byte aligned - in DPDK code, and in the
>> application. >>
>> The effect this change has on an eventdev app using DSW is a ~3 cycle/event
>> performance degradation on an AMD Zen 3 system, and a ~4 cycle/event
>> performance degradation on a Skylake-generation Intel CPU.
>>
> 
> Thanks for checking - this is the sort of feedback needed alright. In SW
> eventdev we copy events around alot without using pointers, so I felt that
> alignment would be helpful to avoid issues with events spanning cachelines.
> 
> However, since it has negative impacts, I'm quite happy to drop the idea,
> and keep things as they are. I'll mark the change as rejected in patchwork.
> 

I think this was an excellent idea, it just didn't happen to have the 
desired effect. At least not in the particular cases where I evaluated it.

Given the complexity of compilers and CPUs, it's almost impossible to 
predict the outcome, performance-wise, of a particular change.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] eventdev: ensure 16-byte alignment for events
  2023-10-06 10:29 ` [PATCH v4] " Bruce Richardson
@ 2023-11-12  0:01   ` Stephen Hemminger
  2023-11-12  8:30     ` Morten Brørup
  0 siblings, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2023-11-12  0:01 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Morten Brørup, Jerin Jacob

On Fri,  6 Oct 2023 11:29:32 +0100
Bruce Richardson <bruce.richardson@intel.com> wrote:

> diff --git a/lib/eventdev/rte_eventdev.c b/lib/eventdev/rte_eventdev.c
> index 95373bbaad..adc9751cef 100644
> --- a/lib/eventdev/rte_eventdev.c
> +++ b/lib/eventdev/rte_eventdev.c
> @@ -9,6 +9,7 @@
>  #include <errno.h>
>  #include <stdint.h>
>  #include <inttypes.h>
> +#include <assert.h>
>  
>  #include <rte_string_fns.h>
>  #include <rte_log.h>
> @@ -28,6 +29,8 @@
>  #include "eventdev_pmd.h"
>  #include "eventdev_trace.h"
>  
> +static_assert(sizeof(struct rte_event) == 16, "Event structure size is not 16-bytes in size");
> +
>  static struct rte_eventdev rte_event_devices[RTE_EVENT_MAX_DEVS];

Please don't reinvent RTE_BUILD_BUG_ON().
Instead fix that to be a static_assert()

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v4] eventdev: ensure 16-byte alignment for events
  2023-11-12  0:01   ` Stephen Hemminger
@ 2023-11-12  8:30     ` Morten Brørup
  2023-11-12 23:31       ` Stephen Hemminger
  0 siblings, 1 reply; 27+ messages in thread
From: Morten Brørup @ 2023-11-12  8:30 UTC (permalink / raw)
  To: Stephen Hemminger, Bruce Richardson; +Cc: dev, Jerin Jacob

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Sunday, 12 November 2023 01.01
> 
> On Fri,  6 Oct 2023 11:29:32 +0100
> Bruce Richardson <bruce.richardson@intel.com> wrote:
> 
> > diff --git a/lib/eventdev/rte_eventdev.c
> b/lib/eventdev/rte_eventdev.c
> > index 95373bbaad..adc9751cef 100644
> > --- a/lib/eventdev/rte_eventdev.c
> > +++ b/lib/eventdev/rte_eventdev.c
> > @@ -9,6 +9,7 @@
> >  #include <errno.h>
> >  #include <stdint.h>
> >  #include <inttypes.h>
> > +#include <assert.h>
> >
> >  #include <rte_string_fns.h>
> >  #include <rte_log.h>
> > @@ -28,6 +29,8 @@
> >  #include "eventdev_pmd.h"
> >  #include "eventdev_trace.h"
> >
> > +static_assert(sizeof(struct rte_event) == 16, "Event structure size
> is not 16-bytes in size");
> > +
> >  static struct rte_eventdev rte_event_devices[RTE_EVENT_MAX_DEVS];
> 
> Please don't reinvent RTE_BUILD_BUG_ON().
> Instead fix that to be a static_assert()

I would say the opposite:
With our upgrade to the C11 standard, let's get rid of the RTE_BUILD_BUG_ON() workaround for the lack of static_assert() in older C standards.

Unfortunately, the static_assert(expression) variant without the "message" parameter, which would make our RTE_BUILD_BUG_ON() macro completely obsolete, requires C23. And I don't see how we can make this variant available with C11. So we probably have to wait until DPDK requires C23.

Until then, let's gradually phase out the DPDK-specific RTE_BUILD_BUG_ON() in favor of standard C's static_assert(), and live with the inconvenience of having to provide a message parameter for it.

Please also note that static_assert() can be used outside code blocks, which makes it handy for use in header files.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] eventdev: ensure 16-byte alignment for events
  2023-11-12  8:30     ` Morten Brørup
@ 2023-11-12 23:31       ` Stephen Hemminger
  2023-11-13  7:58         ` Morten Brørup
  0 siblings, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2023-11-12 23:31 UTC (permalink / raw)
  To: Morten Brørup; +Cc: Bruce Richardson, dev, Jerin Jacob

On Sun, 12 Nov 2023 09:30:24 +0100
Morten Brørup <mb@smartsharesystems.com> wrote:

> > > +static_assert(sizeof(struct rte_event) == 16, "Event structure size  
> > is not 16-bytes in size");  
> > > +
> > >  static struct rte_eventdev rte_event_devices[RTE_EVENT_MAX_DEVS];  
> > 
> > Please don't reinvent RTE_BUILD_BUG_ON().
> > Instead fix that to be a static_assert()  
> 
> I would say the opposite:
> With our upgrade to the C11 standard, let's get rid of the RTE_BUILD_BUG_ON() workaround for the lack of static_assert() in older C standards.
> 
> Unfortunately, the static_assert(expression) variant without the "message" parameter, which would make our RTE_BUILD_BUG_ON() macro completely obsolete, requires C23. And I don't see how we can make this variant available with C11. So we probably have to wait until DPDK requires C23.
> 
> Until then, let's gradually phase out the DPDK-specific RTE_BUILD_BUG_ON() in favor of standard C's static_assert(), and live with the inconvenience of having to provide a message parameter for it.
> 
> Please also note that static_assert() can be used outside code blocks, which makes it handy for use in header files.

If you look at my RFC, the message is just as good as the one in this code.
It ends up being stringified version of the expression. Which is more exact than the wording used in some other places.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v4] eventdev: ensure 16-byte alignment for events
  2023-11-12 23:31       ` Stephen Hemminger
@ 2023-11-13  7:58         ` Morten Brørup
  2024-01-19 21:05           ` Tyler Retzlaff
  0 siblings, 1 reply; 27+ messages in thread
From: Morten Brørup @ 2023-11-13  7:58 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Bruce Richardson, dev, Jerin Jacob

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Monday, 13 November 2023 00.32
> 
> On Sun, 12 Nov 2023 09:30:24 +0100
> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> > > > +static_assert(sizeof(struct rte_event) == 16, "Event structure
> size
> > > is not 16-bytes in size");
> > > > +
> > > >  static struct rte_eventdev
> rte_event_devices[RTE_EVENT_MAX_DEVS];
> > >
> > > Please don't reinvent RTE_BUILD_BUG_ON().
> > > Instead fix that to be a static_assert()
> >
> > I would say the opposite:
> > With our upgrade to the C11 standard, let's get rid of the
> RTE_BUILD_BUG_ON() workaround for the lack of static_assert() in older
> C standards.
> >
> > Unfortunately, the static_assert(expression) variant without the
> "message" parameter, which would make our RTE_BUILD_BUG_ON() macro
> completely obsolete, requires C23. And I don't see how we can make this
> variant available with C11. So we probably have to wait until DPDK
> requires C23.
> >
> > Until then, let's gradually phase out the DPDK-specific
> RTE_BUILD_BUG_ON() in favor of standard C's static_assert(), and live
> with the inconvenience of having to provide a message parameter for it.
> >
> > Please also note that static_assert() can be used outside code
> blocks, which makes it handy for use in header files.
> 
> If you look at my RFC, the message is just as good as the one in this
> code.
> It ends up being stringified version of the expression. Which is more
> exact than the wording used in some other places.

I agree that the output of your RFC is better than that of static_assert(expr, msg).
I'm arguing that RTE_BUILD_BUG_ON() would be considered reinventing the wheel if introduced today, because the C11 standard already offers something similar. So we should prefer that over RTE_BUILD_BUG_ON(), not the other way around.

It's a difference in opinion. Both RTE_BUILD_BUG_ON() and static_assert() are viable; you seem to prefer the first, I prefer the latter.

Both occur in existing DPDK code, and Bruce happened to chose static_assert() for this patch.

The overall question is a choice between three options:
1. prefer using RTE_BUILD_BUG_ON() (using your RFC implementation),
2. phase out RTE_BUILD_BUG_ON() in favor of the C11 standard's static_assert(), or
3. continue using both?

I'm not strongly opposed to either of the three, as long as the community agrees.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] eventdev: ensure 16-byte alignment for events
  2023-11-13  7:58         ` Morten Brørup
@ 2024-01-19 21:05           ` Tyler Retzlaff
  0 siblings, 0 replies; 27+ messages in thread
From: Tyler Retzlaff @ 2024-01-19 21:05 UTC (permalink / raw)
  To: Morten Brørup; +Cc: Stephen Hemminger, Bruce Richardson, dev, Jerin Jacob

On Mon, Nov 13, 2023 at 08:58:19AM +0100, Morten Brørup wrote:
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Monday, 13 November 2023 00.32
> > 
> > On Sun, 12 Nov 2023 09:30:24 +0100
> > Morten Brørup <mb@smartsharesystems.com> wrote:
> > 
> > > > > +static_assert(sizeof(struct rte_event) == 16, "Event structure
> > size
> > > > is not 16-bytes in size");
> > > > > +
> > > > >  static struct rte_eventdev
> > rte_event_devices[RTE_EVENT_MAX_DEVS];
> > > >
> > > > Please don't reinvent RTE_BUILD_BUG_ON().
> > > > Instead fix that to be a static_assert()
> > >
> > > I would say the opposite:
> > > With our upgrade to the C11 standard, let's get rid of the
> > RTE_BUILD_BUG_ON() workaround for the lack of static_assert() in older
> > C standards.
> > >
> > > Unfortunately, the static_assert(expression) variant without the
> > "message" parameter, which would make our RTE_BUILD_BUG_ON() macro
> > completely obsolete, requires C23. And I don't see how we can make this
> > variant available with C11. So we probably have to wait until DPDK
> > requires C23.
> > >
> > > Until then, let's gradually phase out the DPDK-specific
> > RTE_BUILD_BUG_ON() in favor of standard C's static_assert(), and live
> > with the inconvenience of having to provide a message parameter for it.
> > >
> > > Please also note that static_assert() can be used outside code
> > blocks, which makes it handy for use in header files.
> > 
> > If you look at my RFC, the message is just as good as the one in this
> > code.
> > It ends up being stringified version of the expression. Which is more
> > exact than the wording used in some other places.
> 
> I agree that the output of your RFC is better than that of static_assert(expr, msg).
> I'm arguing that RTE_BUILD_BUG_ON() would be considered reinventing the wheel if introduced today, because the C11 standard already offers something similar. So we should prefer that over RTE_BUILD_BUG_ON(), not the other way around.
> 
> It's a difference in opinion. Both RTE_BUILD_BUG_ON() and static_assert() are viable; you seem to prefer the first, I prefer the latter.
> 
> Both occur in existing DPDK code, and Bruce happened to chose static_assert() for this patch.
> 
> The overall question is a choice between three options:
> 1. prefer using RTE_BUILD_BUG_ON() (using your RFC implementation),
> 2. phase out RTE_BUILD_BUG_ON() in favor of the C11 standard's static_assert(), or
> 3. continue using both?
> 
> I'm not strongly opposed to either of the three, as long as the community agrees.
> 

not a strong preference, but i see no reason to maintain macros when the
standard offers nearly equivalent.

* keep RTE_BUILD_BUG_ON
* discourage new additions using RTE_BUILD_BUG_ON with checkpatches
* convert to static_assert gradually over time

glad to see the conversion picked up real problems though, thank you
stephen for doing this.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH] eventdev: ensure 16-byte alignment for events
  2023-10-06 12:15 ` [RFC PATCH] " Mattias Rönnblom
  2023-10-06 12:19   ` Bruce Richardson
@ 2024-01-19 22:30   ` Stephen Hemminger
  1 sibling, 0 replies; 27+ messages in thread
From: Stephen Hemminger @ 2024-01-19 22:30 UTC (permalink / raw)
  To: Mattias Rönnblom; +Cc: Bruce Richardson, dev, Jerin Jacob

On Fri, 6 Oct 2023 14:15:00 +0200
Mattias Rönnblom <hofors@lysator.liu.se> wrote:

> On 2023-10-05 13:51, Bruce Richardson wrote:
> > The event structure in DPDK is 16-bytes in size, and events are
> > regularly passed as parameters directly rather than being passed as
> > pointers.  
> 
> When are events passed by-value, rather than by-reference? There are no 
> such examples in the public eventdev API.
> 
> To help compiler optimize correctly, we can explicitly request
> > 16-byte alignment for events, which means that we should be able
> > to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> > with those events.
> >   
> 
> That change is both helping and sabotaging the optimizer's work. Now 
> every stack allocation needs to be 2-byte aligned - in DPDK code, and in 
> the application.
> 
> The effect this change has on an eventdev app using DSW is a ~3 
> cycle/event performance degradation on an AMD Zen 3 system, and a ~4 
> cycle/event performance degradation on a Skylake-generation Intel CPU.
> 
> What scenarios do you have in mind, where this change would improve the 
> generated code? Something where there are no unaligned loads available 
> in the ISA, or they are much slower than their aligned counterparts?
> 
> When I looked into the same issue for the DPDK IP checksumming routines, 
> there basically were no such. Not that I could find.
> 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Don't understand what the issue is. Isn't the event structure typically
on stack. Adding padding there would not help.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2024-01-19 22:30 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-05 11:51 [RFC PATCH] eventdev: ensure 16-byte alignment for events Bruce Richardson
2023-10-05 12:06 ` Bruce Richardson
2023-10-05 13:11   ` Jerin Jacob
2023-10-05 13:15     ` Bruce Richardson
2023-10-06  7:19       ` Jerin Jacob
2023-10-05 12:12 ` Morten Brørup
2023-10-05 13:02   ` Bruce Richardson
2023-10-06  9:37 ` [PATCH v2] " Bruce Richardson
2023-10-06  9:45 ` [PATCH v3] " Bruce Richardson
2023-10-06 10:13   ` Morten Brørup
2023-10-06 10:16     ` Jerin Jacob
2023-10-06 10:19       ` Bruce Richardson
2023-10-06 10:24         ` Jerin Jacob
2023-10-06 10:27           ` Bruce Richardson
2023-10-06 10:16     ` Bruce Richardson
2023-10-06 10:35       ` Morten Brørup
2023-10-06 10:44         ` Bruce Richardson
2023-10-06 10:29 ` [PATCH v4] " Bruce Richardson
2023-11-12  0:01   ` Stephen Hemminger
2023-11-12  8:30     ` Morten Brørup
2023-11-12 23:31       ` Stephen Hemminger
2023-11-13  7:58         ` Morten Brørup
2024-01-19 21:05           ` Tyler Retzlaff
2023-10-06 12:15 ` [RFC PATCH] " Mattias Rönnblom
2023-10-06 12:19   ` Bruce Richardson
2023-10-06 12:29     ` Mattias Rönnblom
2024-01-19 22:30   ` Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).