From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id B70C3A00E6
	for <public@inbox.dpdk.org>; Fri, 12 Jul 2019 11:06:49 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 6C61A1B9AA;
	Fri, 12 Jul 2019 11:06:48 +0200 (CEST)
Received: from mail.droids-corp.org (zoll.droids-corp.org [94.23.50.67])
 by dpdk.org (Postfix) with ESMTP id 29D3A1B9A7
 for <dev@dpdk.org>; Fri, 12 Jul 2019 11:06:47 +0200 (CEST)
Received: from lfbn-lil-1-176-160.w90-45.abo.wanadoo.fr ([90.45.26.160]
 helo=droids-corp.org)
 by mail.droids-corp.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.89) (envelope-from <olivier.matz@6wind.com>)
 id 1hlrYv-0001t9-AB; Fri, 12 Jul 2019 11:09:54 +0200
Received: by droids-corp.org (sSMTP sendmail emulation);
 Fri, 12 Jul 2019 11:06:43 +0200
Date: Fri, 12 Jul 2019 11:06:43 +0200
From: Olivier Matz <olivier.matz@6wind.com>
To: "Wiles, Keith" <keith.wiles@intel.com>
Cc: dpdk dev community <dev@dpdk.org>,
 Stephen Hemminger <stephen@networkplumber.org>
Message-ID: <20190712090643.z3mpl6e4smara3eo@platinum>
References: <20190710092907.5565-1-olivier.matz@6wind.com>
 <20190710104917.73bc9201@hermes.lan>
 <3117C805-492A-4DA9-BE8F-E119E057C80D@intel.com>
 <20190711075320.woswl6fhl6szgdqb@glumotte.dev.6wind.com>
 <B48AF229-7386-441D-B5F4-CB27A60DF4DF@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <B48AF229-7386-441D-B5F4-CB27A60DF4DF@intel.com>
User-Agent: NeoMutt/20180716
Subject: Re: [dpdk-dev] [RFC] mbuf: support dynamic fields and flags
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

Hi,

On Thu, Jul 11, 2019 at 02:37:23PM +0000, Wiles, Keith wrote:
> 
> 
> > On Jul 11, 2019, at 2:53 AM, Olivier Matz <olivier.matz@6wind.com> wrote:
> > 
> > Hi Keith,
> > 
> > On Wed, Jul 10, 2019 at 06:12:16PM +0000, Wiles, Keith wrote:
> >> 
> >> 
> >>> On Jul 10, 2019, at 12:49 PM, Stephen Hemminger <stephen@networkplumber.org> wrote:
> >>> 
> >>> On Wed, 10 Jul 2019 11:29:07 +0200
> >>> Olivier Matz <olivier.matz@6wind.com> wrote:
> >>> 
> >>>> /**
> >>>> * Indicate that the metadata field in the mbuf is in use.
> >>>> @@ -738,6 +741,8 @@ struct rte_mbuf {
> >>>> 	 */
> >>>> 	struct rte_mbuf_ext_shared_info *shinfo;
> >>>> 
> >>>> +	uint64_t dynfield1; /**< Reserved for dynamic fields. */
> >>>> +	uint64_t dynfield2; /**< Reserved for dynamic fields. */
> >>>> } __rte_cache_aligned;
> >>> 
> >>> Growing mbuf is a fundamental ABI break and this needs
> >>> higher level approval.  Why not one pointer?
> >>> 
> >>> It looks like you are creating something like FreeBSD m_tag.
> >>> Why not use that?
> >> 
> >> Changing the mbuf structure causes a big problem for a number reasons as Stephen states.
> > 
> > Can you elaborate?
> > 
> > This is indeed an ABI break, but I think this is only due to the adding
> > of rte_mbuf_dynfield_copy() in rte_pktmbuf_attach(). The size of the
> > mbuf does not change and the fields are not initialized when creating a
> > new mbuf. So I think there is no ABI change for code that is not using
> > rte_pktmbuf_attach().
> > 
> > I don't think it's a problem to have one ABI change, if it avoids many
> > others in the future.
> > 
> >> If we leave the mbuf stucture alone and add this feature to the
> >> headroom space between the mbuf structure and the packet. When setting
> >> up the mempool/mbuf pool we define a headroom to hold the extra data
> >> when the mbuf pool is created or just use the current headroom
> >> space. Using this method we can eliminate the mbuf structure change
> >> and add the data to the packet buffer. We can do away with dynfield1
> >> and 2 as we know where headroom space begins and ends. Just a thought.
> > 
> > The size of the mbuf metadata (between the mbuf structure and the
> > buffer) is configured per pool, so it can be different accross
> > mbufs. So, the access to the dynamic field would be slower:
> > *(mbuf + dynfield_offset + metadata_size(mbuf))
> 

> We can force that space to be a minimum size when the mempool is
> created in the case of a cloned mbuf. The cloned mbuf is a small use
> case, but am important one and increasing the size for those special
> mbufs by a cache line should not be a huge problem.
> 
> I think most allocations do not change the size from the default value
> of the headroom (128). The mbuf + buffer are normally rounded to 2K or
> a bit bigger, which gives a bit more space in those cases of a packet
> size of 1518-1522. Jumbo frames are the same. Using the headroom size
> for an application needs to be defined and setup for the max size
> anyway for the application needs, so normally all mbuf creates should
> contain the same size to account for mbuf moments within the system.

If we want more room for dynamic fields, we can do something like
this. But I don't think this is something that will happen soon: we
already have 16 bytes available, and I'm sure we can get another 16
bytes very easily by just converting fields like timestamp or sequence
numbers.

To attach larger amount of data to mbufs, the metadata feature still
exists. We can imagine to extend the dynamic fields feature to be able
to use more space after the mbuf structure (in metadata?), but it is
more complex.

I don't think that using headroom or tailroom is a good idea. That's
true that mbufs are usually a bit more than 2K, and some space is lost
when mtu is 1500. But smaller mbufs are perfectly legal too, except that
some drivers do not support it. Anyway, headroom and tailroom must be
used for what they are designed: reserving room to prepend or append
data. If we want more space for dynamic fields, let's add a specific
location for it, when it will be needed.