DPDK patches and discussions
* [dpdk-dev] mbuf next field belongs in the first cacheline
@ 2021-06-15 12:16 Morten Brørup
  2021-06-15 13:05 ` Bruce Richardson
  0 siblings, 1 reply; 3+ messages in thread
From: Morten Brørup @ 2021-06-15 12:16 UTC (permalink / raw)
  To: Olivier Matz, Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko; +Cc: dev

MBUF and MLX5 maintainers,

I'm picking up an old discussion that you might consider pursuing. Feel free to ignore this if you consider the discussion irrelevant or already settled.

The Techboard has previously discussed the organization of the mbuf fields. Ref: http://mails.dpdk.org/archives/dev/2020-November/191859.html

It was concluded that there was no measured performance difference whether the "pool" or the "next" field was in the first cacheline, so it was decided to put the "pool" field there. Further optimization of the mbuf field organization could be reconsidered later.

I have been looking at it. In theory, it should not be necessary to touch the "pool" field at RX, but the "next" field must be written for segmented packets.

I think you could achieve an RX performance gain in the MLX5 driver if the mbuf structure was changed so that the "next" and "pool" fields were swapped (i.e. putting "next" in the first cacheline), and /drivers/net/mlx5/mlx5_rx.c line 821 was modified to replace "rep = rte_mbuf_raw_alloc(seg->pool)" with something conceptually like "rep = rte_mbuf_raw_alloc(rxq->pool)". RX would then no longer have to touch the mbuf's "pool" field (which would reside in the second cacheline after this change); it would only touch the mbuf's first cacheline.
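
To make the suggestion concrete, here is a minimal sketch in plain C. It assumes a 64-byte cache line and uses hypothetical toy types (toy_mbuf, toy_pool, toy_rxq, toy_raw_alloc, toy_rx_append) that only loosely mirror the real structures; it is not the actual struct rte_mbuf layout or MLX5 driver code. It shows "next" placed in the first cache line, "pool" pushed to the second, and an RX path that takes the pool from the queue instead of from the mbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_LINE 64              /* assumed cache line size */
#define TOY_POOL_SIZE  4               /* arbitrary size for this sketch */

struct toy_pool { int id; };           /* hypothetical stand-in for a mempool */

/* Hypothetical, heavily simplified mbuf; NOT the real struct rte_mbuf. */
struct toy_mbuf {
    /* first cache line: fields written on RX */
    void *buf_addr;
    uint32_t pkt_len;
    uint16_t data_len;
    uint16_t nb_segs;
    struct toy_mbuf *next;             /* proposed location: first cache line */
    /* second cache line: not needed on RX in this sketch */
    _Alignas(TOY_CACHE_LINE) struct toy_pool *pool;
};

static_assert(offsetof(struct toy_mbuf, next) < TOY_CACHE_LINE,
              "next must sit in the first cache line");
static_assert(offsetof(struct toy_mbuf, pool) >= TOY_CACHE_LINE,
              "pool must sit in the second cache line");

struct toy_rxq {
    struct toy_pool *pool;             /* pool remembered by the queue ("rxq->pool") */
};

static struct toy_mbuf toy_storage[TOY_POOL_SIZE];
static unsigned toy_used;

/* Toy allocator: hands out buffers from a static array, NULL when exhausted. */
static struct toy_mbuf *toy_raw_alloc(struct toy_pool *mp)
{
    (void)mp;                          /* a real allocator would draw from mp */
    return toy_used < TOY_POOL_SIZE ? &toy_storage[toy_used++] : NULL;
}

/* Append one segment to a packet; only first-cache-line mbuf fields are accessed. */
static struct toy_mbuf *toy_rx_append(struct toy_rxq *rxq, struct toy_mbuf *head,
                                      struct toy_mbuf *tail, uint16_t len)
{
    struct toy_mbuf *seg = toy_raw_alloc(rxq->pool);   /* not tail->pool */
    if (seg == NULL)
        return NULL;
    seg->data_len = len;
    seg->next = NULL;
    tail->next = seg;                  /* the write that motivates moving "next" */
    head->nb_segs++;
    head->pkt_len += len;
    return seg;
}
```

The two static_assert lines are the point of the sketch: the compiler verifies the intended placement, so a later field reshuffle that pushes "next" out of the first cache line would fail to build rather than silently regress.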

My suggested optimization might be purely theoretical: Many applications touch the mbuf's second cacheline shortly after RX anyway.

If you don't pursue this mbuf reorganization, the comment on the mbuf's cacheline1 field is incorrect and should be updated:
- /* second cache line - fields only used in slow path or on TX */
+ /* second cache line - fields mainly used in slow path or on TX */

-Morten



* Re: [dpdk-dev] mbuf next field belongs in the first cacheline
  2021-06-15 12:16 [dpdk-dev] mbuf next field belongs in the first cacheline Morten Brørup
@ 2021-06-15 13:05 ` Bruce Richardson
  2021-06-15 13:40   ` Morten Brørup
  0 siblings, 1 reply; 3+ messages in thread
From: Bruce Richardson @ 2021-06-15 13:05 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Olivier Matz, Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko, dev

On Tue, Jun 15, 2021 at 02:16:27PM +0200, Morten Brørup wrote:
> MBUF and MLX5 maintainers,
> 
> I'm picking up an old discussion, which you might consider pursuing. Feel free to ignore, if you consider this discussion irrelevant or already closed and done with.
> 
> The Techboard has previously discussed the organization of the mbuf fields. Ref: http://mails.dpdk.org/archives/dev/2020-November/191859.html
> 
> It was concluded that there was no measured performance difference if the "pool" or "next" field was in the first cacheline, so it was decided to put the "pool" field in the first cacheline. And further optimizing the mbuf field organization could be reconsidered later.
> 
> I have been looking at it. In theory it should not be required to touch the "pool" field at RX. But the "next" field must be written for segmented packets.
> 
Question: are there cases where segmented packets are used, but they aren't
big packets, and so need a high packets-per-second value? The thinking when
designing the mbuf was that any application which could handle high packets
per second for medium/small packets would be fine with a few extra cycles
penalty for big ones, since the overall PPS for the driver would be much
lower.
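
The packets-per-second argument can be checked with a little arithmetic. The sketch below assumes standard Ethernet framing (20 bytes of preamble, start-of-frame delimiter and inter-frame gap per frame on the wire); the helper name line_rate_pps and the link speeds and frame sizes used with it are illustrative choices, not figures from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Per-frame wire overhead in bytes: preamble + SFD + inter-frame gap. */
#define ETH_WIRE_OVERHEAD 20u

static_assert(ETH_WIRE_OVERHEAD == 7 + 1 + 12,
              "preamble (7) + SFD (1) + inter-frame gap (12)");

/*
 * Upper bound on frames per second at a given link speed.
 * frame_bytes is the Ethernet frame size including the FCS.
 * The result is truncated to a whole number of frames.
 */
static uint64_t line_rate_pps(uint64_t link_bps, uint32_t frame_bytes)
{
    return link_bps / (8ull * ((uint64_t)frame_bytes + ETH_WIRE_OVERHEAD));
}
```

Under these assumptions, line_rate_pps(100000000000, 64) gives roughly 148.8 million frames per second, while line_rate_pps(100000000000, 1518) gives roughly 8.1 million, so a driver handling only large (typically segmented) packets has about 18 times as many cycles available per packet.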

/Bruce


* Re: [dpdk-dev] mbuf next field belongs in the first cacheline
  2021-06-15 13:05 ` Bruce Richardson
@ 2021-06-15 13:40   ` Morten Brørup
  0 siblings, 0 replies; 3+ messages in thread
From: Morten Brørup @ 2021-06-15 13:40 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Olivier Matz, Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko, dev

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Tuesday, 15 June 2021 15.05
> 
> On Tue, Jun 15, 2021 at 02:16:27PM +0200, Morten Brørup wrote:
> > MBUF and MLX5 maintainers,
> >
> > I'm picking up an old discussion, which you might consider pursuing. Feel free to ignore, if you consider this discussion irrelevant or already closed and done with.
> >
> > The Techboard has previously discussed the organization of the mbuf fields. Ref: http://mails.dpdk.org/archives/dev/2020-November/191859.html
> >
> > It was concluded that there was no measured performance difference if the "pool" or "next" field was in the first cacheline, so it was decided to put the "pool" field in the first cacheline. And further optimizing the mbuf field organization could be reconsidered later.
> >
> > I have been looking at it. In theory it should not be required to touch the "pool" field at RX. But the "next" field must be written for segmented packets.
> >
> Question: are there cases where segmented packets are used, but they aren't
> big packets, and so need a high packets-per-second value? The thinking when
> designing the mbuf was that any application which could handle high packets
> per second for medium/small packets would be fine with a few extra cycles
> penalty for big ones, since the overall PPS for the driver would be much
> lower.

A reality check is always good! :-)

I recall a proposal from NVIDIA that introduced a feature to split RX packets into multiple small segments from a list of mbuf pools; basically a variant of "header split". Here it is: https://patchwork.dpdk.org/project/dpdk/list/?series=13070&state=%2A&archive=both

I don't know whether swapping the "next" and "pool" fields would make a performance difference when RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is in use.
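
As a rough illustration of why buffer split is the interesting case, here is a toy model in plain C. The types and the function toy_buffer_split are hypothetical and do not use the real ethdev API; the sketch only assumes that a packet longer than the configured header length is delivered as two chained segments, one per pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal segment descriptor; NOT the real struct rte_mbuf. */
struct toy_seg {
    struct toy_seg *next;
    uint16_t data_len;
    uint8_t pool_id;                   /* toy pool the segment was taken from */
};

/*
 * Split a packet of pkt_len bytes at hdr_len: the first segment holds the
 * header bytes, the second holds the remainder.
 * Returns the number of segments used (1 or 2).
 */
static unsigned toy_buffer_split(struct toy_seg *hdr, struct toy_seg *payload,
                                 uint16_t pkt_len, uint16_t hdr_len)
{
    assert(hdr != NULL && payload != NULL);

    hdr->pool_id = 0;                  /* header pool */
    if (pkt_len <= hdr_len) {          /* packet fits in the header segment */
        hdr->data_len = pkt_len;
        hdr->next = NULL;
        return 1;
    }
    hdr->data_len = hdr_len;
    hdr->next = payload;               /* "next" is written even for a small packet */
    payload->pool_id = 1;              /* payload pool */
    payload->data_len = (uint16_t)(pkt_len - hdr_len);
    payload->next = NULL;
    return 2;
}
```

In this model, a 64-byte packet split at a 14-byte header already occupies two segments, so the write to "next" would happen at small-packet rates rather than only for large packets.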

Otherwise, you are correct: The performance gain is mostly theoretical.

So in reality, it would be a very big change for an insignificant improvement.

It's mainly the principle that annoys me: The DPDK documentation mentions that the mbuf structure is designed so that the second cache line is not touched by RX. If that is no longer the guiding principle, the documentation needs to be updated.



