DPDK patches and discussions
 help / color / mirror / Atom feed
From: Olivier Matz <olivier.matz@6wind.com>
To: "Wiles, Keith" <keith.wiles@intel.com>
Cc: dpdk dev community <dev@dpdk.org>,
	Stephen Hemminger <stephen@networkplumber.org>
Subject: Re: [dpdk-dev] [RFC] mbuf: support dynamic fields and flags
Date: Fri, 12 Jul 2019 11:06:43 +0200	[thread overview]
Message-ID: <20190712090643.z3mpl6e4smara3eo@platinum> (raw)
In-Reply-To: <B48AF229-7386-441D-B5F4-CB27A60DF4DF@intel.com>

Hi,

On Thu, Jul 11, 2019 at 02:37:23PM +0000, Wiles, Keith wrote:
> 
> 
> > On Jul 11, 2019, at 2:53 AM, Olivier Matz <olivier.matz@6wind.com> wrote:
> > 
> > Hi Keith,
> > 
> > On Wed, Jul 10, 2019 at 06:12:16PM +0000, Wiles, Keith wrote:
> >> 
> >> 
> >>> On Jul 10, 2019, at 12:49 PM, Stephen Hemminger <stephen@networkplumber.org> wrote:
> >>> 
> >>> On Wed, 10 Jul 2019 11:29:07 +0200
> >>> Olivier Matz <olivier.matz@6wind.com> wrote:
> >>> 
> >>>> /**
> >>>> * Indicate that the metadata field in the mbuf is in use.
> >>>> @@ -738,6 +741,8 @@ struct rte_mbuf {
> >>>> 	 */
> >>>> 	struct rte_mbuf_ext_shared_info *shinfo;
> >>>> 
> >>>> +	uint64_t dynfield1; /**< Reserved for dynamic fields. */
> >>>> +	uint64_t dynfield2; /**< Reserved for dynamic fields. */
> >>>> } __rte_cache_aligned;
> >>> 
> >>> Growing mbuf is a fundamental ABI break and this needs
> >>> higher level approval.  Why not one pointer?
> >>> 
> >>> It looks like you are creating something like FreeBSD m_tag.
> >>> Why not use that?
> >> 
> >> Changing the mbuf structure causes a big problem for a number reasons as Stephen states.
> > 
> > Can you elaborate?
> > 
> > This is indeed an ABI break, but I think this is only due to the adding
> > of rte_mbuf_dynfield_copy() in rte_pktmbuf_attach(). The size of the
> > mbuf does not change and the fields are not initialized when creating a
> > new mbuf. So I think there is no ABI change for code that is not using
> > rte_pktmbuf_attach().
> > 
> > I don't think it's a problem to have one ABI change, if it avoids many
> > others in the future.
> > 
> >> If we leave the mbuf stucture alone and add this feature to the
> >> headroom space between the mbuf structure and the packet. When setting
> >> up the mempool/mbuf pool we define a headroom to hold the extra data
> >> when the mbuf pool is created or just use the current headroom
> >> space. Using this method we can eliminate the mbuf structure change
> >> and add the data to the packet buffer. We can do away with dynfield1
> >> and 2 as we know where headroom space begins and ends. Just a thought.
> > 
> > The size of the mbuf metadata (between the mbuf structure and the
> > buffer) is configured per pool, so it can be different accross
> > mbufs. So, the access to the dynamic field would be slower:
> > *(mbuf + dynfield_offset + metadata_size(mbuf))
> 

> We can force that space to be a minimum size when the mempool is
> created in the case of a cloned mbuf. The cloned mbuf is a small use
> case, but am important one and increasing the size for those special
> mbufs by a cache line should not be a huge problem.
> 
> I think most allocations do not change the size from the default value
> of the headroom (128). The mbuf + buffer are normally rounded to 2K or
> a bit bigger, which gives a bit more space in those cases of a packet
> size of 1518-1522. Jumbo frames are the same. Using the headroom size
> for an application needs to be defined and setup for the max size
> anyway for the application needs, so normally all mbuf creates should
> contain the same size to account for mbuf moments within the system.

If we want more room for dynamic fields, we can do something like
this. But I don't think this is something that will happen soon: we
already have 16 bytes available, and I'm sure we can get another 16
bytes very easily by just converting fields like timestamp or sequence
numbers.

To attach larger amount of data to mbufs, the metadata feature still
exists. We can imagine to extend the dynamic fields feature to be able
to use more space after the mbuf structure (in metadata?), but it is
more complex.

I don't think that using headroom or tailroom is a good idea. That's
true that mbufs are usually a bit more than 2K, and some space is lost
when mtu is 1500. But smaller mbufs are perfectly legal too, except that
some drivers do not support it. Anyway, headroom and tailroom must be
used for what they are designed: reserving room to prepend or append
data. If we want more space for dynamic fields, let's add a specific
location for it, when it will be needed.


  reply	other threads:[~2019-07-12  9:06 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-10  9:29 Olivier Matz
2019-07-10 17:14 ` Wang, Haiyue
2019-07-11  7:26   ` Olivier Matz
2019-07-11  8:04     ` Wang, Haiyue
2019-07-11  8:20       ` Olivier Matz
2019-07-11  8:34         ` Wang, Haiyue
2019-07-11 15:31     ` Stephen Hemminger
2019-07-12  9:18       ` Olivier Matz
2019-07-10 17:49 ` Stephen Hemminger
2019-07-10 18:12   ` Wiles, Keith
2019-07-11  7:53     ` Olivier Matz
2019-07-11 14:37       ` Wiles, Keith
2019-07-12  9:06         ` Olivier Matz [this message]
2019-07-11  7:36   ` Olivier Matz
2019-07-12 12:23     ` Jerin Jacob Kollanukkaran
2019-07-16  9:39       ` Olivier Matz
2019-07-16 14:43         ` Stephen Hemminger
2019-07-11  9:24 ` Thomas Monjalon
2019-07-12 14:54 ` Andrew Rybchenko
2019-07-16  9:49   ` Olivier Matz
2019-07-16 11:31     ` [dpdk-dev] ***Spam*** " Andrew Rybchenko
2019-09-18 16:54 ` [dpdk-dev] [PATCH] " Olivier Matz
2019-09-21  4:54   ` Wang, Haiyue
2019-09-23  8:31     ` Olivier Matz
2019-09-23 11:01       ` Wang, Haiyue
2019-09-21  8:28   ` Wiles, Keith
2019-09-23  8:56     ` Morten Brørup
2019-09-23  9:41       ` Olivier Matz
2019-09-23  9:13     ` Olivier Matz
2019-09-23 15:14       ` Wiles, Keith
2019-09-23 16:16         ` Olivier Matz
2019-09-23 17:14           ` Wiles, Keith
2019-09-23 16:09       ` Wiles, Keith
2019-10-01 10:49   ` Ananyev, Konstantin
2019-10-17  7:54     ` Olivier Matz
2019-10-17 11:58       ` Ananyev, Konstantin
2019-10-17 12:58         ` Olivier Matz
2019-10-17 14:42 ` [dpdk-dev] [PATCH v2] " Olivier Matz
2019-10-18  2:47   ` Wang, Haiyue
2019-10-18  7:53     ` Olivier Matz
2019-10-18  8:28       ` Wang, Haiyue
2019-10-18  9:47         ` Olivier Matz
2019-10-18 11:24           ` Wang, Haiyue
2019-10-22 22:51   ` Ananyev, Konstantin
2019-10-23  3:16     ` Wang, Haiyue
2019-10-23 10:21       ` Olivier Matz
2019-10-23 15:00         ` Stephen Hemminger
2019-10-23 15:12           ` Wang, Haiyue
2019-10-23 10:19     ` Olivier Matz
2019-10-23 11:45       ` Olivier Matz
2019-10-23 11:49         ` Ananyev, Konstantin
2019-10-23 12:00   ` Shahaf Shuler
2019-10-23 13:33     ` Olivier Matz
2019-10-24  4:54       ` Shahaf Shuler
2019-10-24  7:07         ` Olivier Matz
2019-10-24  7:38   ` Slava Ovsiienko
2019-10-24  7:56     ` Olivier Matz
2019-10-24  8:13 ` [dpdk-dev] [PATCH v3] " Olivier Matz
2019-10-24 15:30   ` Stephen Hemminger
2019-10-24 15:44     ` Thomas Monjalon
2019-10-24 17:07       ` Stephen Hemminger
2019-10-24 16:40   ` Thomas Monjalon
2019-10-26 12:39 ` [dpdk-dev] [PATCH v4] " Olivier Matz
2019-10-26 17:04   ` Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190712090643.z3mpl6e4smara3eo@platinum \
    --to=olivier.matz@6wind.com \
    --cc=dev@dpdk.org \
    --cc=keith.wiles@intel.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).