From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Garrett D'Amore" <garrett@damore.org>,
"Bruce Richardson" <bruce.richardson@intel.com>,
"Stephen Hemminger" <stephen@networkplumber.org>
Cc: <dev@dpdk.org>, "Parthakumar Roy" <Parthakumar.Roy@ibm.com>
Subject: RE: meson option to customize RTE_PKTMBUF_HEADROOM patch
Date: Tue, 26 Mar 2024 09:05:37 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35E9F32D@smartserver.smartshare.dk>
In-Reply-To: <d426a895-ddc6-465d-9c8d-d713167ae2b0@Spark>
Interesting requirement. I can easily imagine how a (non-forwarding, i.e. traffic terminating) application, which doesn’t really care about the preceding headers, can benefit from having its actual data at a specific offset for alignment purposes. I don’t consider this very exotic. (Even the Linux kernel uses this trick to achieve improved IP header alignment on RX.)
I think the proper solution would be to add a new offload parameter to rte_eth_rxconf to specify how many bytes the driver should subtract from RTE_PKTMBUF_HEADROOM when writing the RX descriptor to the NIC hardware. Depending on driver support, this would make it configurable per device and per RX queue.
If this parameter is set, the driver should adjust m->data_off accordingly on RX, so rte_pktmbuf_mtod[_offset]() and rte_pktmbuf_iova[_offset]() still point to the Ethernet header.
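To make the proposal above concrete, here is a minimal sketch of the bookkeeping a driver would do. It is not DPDK code: struct rxq_conf_sketch, its headroom_reduce field, struct mbuf_sketch and the three helper functions are hypothetical names made up for illustration, and no such field exists in rte_eth_rxconf today. The value 128 is the default RTE_PKTMBUF_HEADROOM.

```c
#include <assert.h>
#include <stdint.h>

/* Build-time headroom; 128 is the DPDK default for RTE_PKTMBUF_HEADROOM. */
#define PKTMBUF_HEADROOM 128

/* Hypothetical per-queue RX setting (NOT an existing rte_eth_rxconf field). */
struct rxq_conf_sketch {
    uint16_t headroom_reduce; /* bytes to subtract from the build-time headroom */
};

/* Minimal stand-in for the two rte_mbuf fields that matter here. */
struct mbuf_sketch {
    uint64_t buf_iova; /* bus address of the start of the buffer */
    uint16_t data_off; /* offset of the first packet byte within the buffer */
};

/* Effective headroom for a queue, or -1 if the reduction exceeds the headroom. */
static int rxq_headroom(const struct rxq_conf_sketch *conf)
{
    if (conf->headroom_reduce > PKTMBUF_HEADROOM)
        return -1;
    return PKTMBUF_HEADROOM - conf->headroom_reduce;
}

/* Address the driver would write into the RX descriptor for this buffer. */
static uint64_t rx_desc_addr(const struct mbuf_sketch *m, uint16_t headroom)
{
    return m->buf_iova + headroom;
}

/* On RX completion, record where the NIC actually placed the frame, so that
 * accessors based on data_off still point to the Ethernet header. */
static void rx_fixup(struct mbuf_sketch *m, uint16_t headroom)
{
    m->data_off = headroom;
}
```

With headroom_reduce = 20, the descriptor address and m->data_off both move from offset 128 to offset 108, and the application sees the Ethernet header at the usual accessor.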
Med venlig hilsen / Kind regards,
-Morten Brørup
From: Garrett D'Amore [mailto:garrett@damore.org]
Sent: Monday, 25 March 2024 23.56
So we need (for reasons that I don't want to go into in too much detail) our UDP payload headers to be at a specific offset in the packet.
This was not a problem as long as we only used IPv4. (We have configured 40 bytes of headroom, which is more than any of our PMDs need by a hefty margin.)
Now that we're extending to support IPv6, we need to reduce that headroom by 20 bytes, to preserve our UDP payload offset.
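The arithmetic behind the 20-byte reduction can be sketched as follows. The assumptions are untagged Ethernet (14-byte header, no VLAN tag), IPv4 without options (20 bytes) and IPv6 without extension headers (40 bytes); the helper name udp_payload_offset is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Header lengths under the stated assumptions: no VLAN tag, no IPv4 options,
 * no IPv6 extension headers. */
enum {
    ETH_HDR_LEN  = 14,
    IPV4_HDR_LEN = 20,
    IPV6_HDR_LEN = 40,
    UDP_HDR_LEN  = 8
};

/* Offset of the UDP payload from the start of the mbuf data buffer. */
static uint16_t udp_payload_offset(uint16_t headroom, uint16_t ip_hdr_len)
{
    return (uint16_t)(headroom + ETH_HDR_LEN + ip_hdr_len + UDP_HDR_LEN);
}
```

With 40 bytes of headroom, an IPv4 frame puts the UDP payload at 40 + 14 + 20 + 8 = 82. An IPv6 frame needs 20 bytes of headroom to land at the same place: 20 + 14 + 40 + 8 = 82.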
This has big ramifications for how we fragment our own upper layer messages, and it has been determined that updating the PMDs to allow us to change the headroom for this use case (on a per port basis, as we will have some ports on IPv4 and others on IPv6) is the least effort, by a large margin. (Well, copying the frames via memcpy would be less development effort, but would be a performance catastrophe.)
For transmit side we don't need this, as we can simply adjust the packet as needed. But for the receive side, we are kind of stuck, as the PMDs rely on the hard coded RTE_PKTMBUF_HEADROOM to program receive locations.
As for header splitting, that would indeed be a much much nicer solution.
I haven't looked in the latest code to see if header splitting is even an option -- the version of the DPDK I'm working with is a little older (20.11) -- we have to update but we have other local changes and so updating is one of the things that we still have to do.
At any rate, the version I did look at doesn't seem to support header splits on any device other than FM10K. That's not terrifically interesting for us. We use Mellanox, E810 (ICE), bnxt, cloud NICs (all of them really -- ENA, virtio-net, etc.) We also have a fair amount of ixgbe and i40e on client systems in the field.
We also, unfortunately, have an older DPDK 18 with Mellanox contributions for IPoverIB.... though I'm not sure we will try to support IPv6 there. (We are working towards replacing that part of the stack with UCX.)
Unless header splitting will work on all of this (excepting the IPoIB piece), then it's not something we can really use.
On Mar 25, 2024 at 10:20 AM -0700, Stephen Hemminger <stephen@networkplumber.org>, wrote:
On Mon, 25 Mar 2024 10:01:52 +0000
Bruce Richardson <bruce.richardson@intel.com> wrote:
On Sat, Mar 23, 2024 at 01:51:25PM -0700, Garrett D'Amore wrote:
> So we right now (at WEKA) have a somewhat older version of DPDK that we
> have customized heavily, and I am going to need to make the
> headroom *dynamic* (passed in at run time, and per port.)
> We have this requirement because we need payload to be at a specific
> offset, but have to deal with different header lengths for IPv4 and now
> IPv6.
> My reason for pointing this out, is that I would dearly like if we
> could collaborate on this -- this change is going to touch pretty much
> every PMD (we don't need it on all of them as we only support a subset
> of PMDs, but it's still a significant set.)
> I'm not sure if anyone else has considered such a need -- this
> particular message caught my eye as I'm looking specifically in this
> area right now.
>
Hi
thanks for reaching out. Can you clarify a little more as to the need for
this requirement? Can you not just set the headroom value to the max needed
value for any port and use that? Is there an issue with having blank space
at the start of a buffer?
Thanks,
/Bruce
If you have to make such a deep change across all PMDs then maybe
it is not the best solution. What about being able to do some form of buffer
chaining or pullup?