From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <olivier.matz@6wind.com>
Received: from mail.droids-corp.org (zoll.droids-corp.org [94.23.50.67])
 by dpdk.org (Postfix) with ESMTP id 5E84E4CC0
 for <dev@dpdk.org>; Tue, 24 Apr 2018 18:02:49 +0200 (CEST)
Received: from lfbn-lil-1-700-92.w81-254.abo.wanadoo.fr ([81.254.37.92]
 helo=droids-corp.org)
 by mail.droids-corp.org with esmtpsa (TLS1.0:RSA_AES_256_CBC_SHA1:256)
 (Exim 4.89) (envelope-from <olivier.matz@6wind.com>)
 id 1fB0P5-0005bR-Ps; Tue, 24 Apr 2018 18:02:53 +0200
Received: by droids-corp.org (sSMTP sendmail emulation);
 Tue, 24 Apr 2018 18:02:44 +0200
Date: Tue, 24 Apr 2018 18:02:44 +0200
From: Olivier Matz <olivier.matz@6wind.com>
To: Andrew Rybchenko <arybchenko@solarflare.com>
Cc: Yongseok Koh <yskoh@mellanox.com>, wenzhuo.lu@intel.com,
 jingjing.wu@intel.com, dev@dpdk.org, konstantin.ananyev@intel.com,
 adrien.mazarguil@6wind.com, nelio.laranjeiro@6wind.com
Message-ID: <20180424160244.bggifhilvadxcjb2@neon>
References: <20180310012532.15809-1-yskoh@mellanox.com>
 <20180424013854.33749-1-yskoh@mellanox.com>
 <934e714e-3cba-7f5d-9fcf-4f96611d758f@solarflare.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <934e714e-3cba-7f5d-9fcf-4f96611d758f@solarflare.com>
User-Agent: NeoMutt/20170113 (1.7.2)
Subject: Re: [dpdk-dev] [PATCH v4 1/2] mbuf: support attaching external
 buffer to mbuf
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 24 Apr 2018 16:02:49 -0000

Hi Andrew, Yongseok,

On Tue, Apr 24, 2018 at 03:28:33PM +0300, Andrew Rybchenko wrote:
> On 04/24/2018 04:38 AM, Yongseok Koh wrote:
> > This patch introduces a new way of attaching an external buffer to a mbuf.
> > 
> > Attaching an external buffer is quite similar to mbuf indirection in
> > replacing buffer addresses and length of a mbuf, but a few differences:
> >    - When an indirect mbuf is attached, refcnt of the direct mbuf would be
> >      2 as long as the direct mbuf itself isn't freed after the attachment.
> >      In such cases, the buffer area of a direct mbuf must be read-only. But
> >      external buffer has its own refcnt and it starts from 1. Unless
> >      multiple mbufs are attached to a mbuf having an external buffer, the
> >      external buffer is writable.
> >    - There's no need to allocate buffer from a mempool. Any buffer can be
> >      attached with appropriate free callback.
> >    - Smaller metadata is required to maintain shared data such as refcnt.
> 
> Really useful. Many thanks. See my notes below.
> 
> It worries me that detach is more expensive than it really required since it
> requires to restore mbuf as direct. If mbuf mempool is used for mbufs
> as headers for external buffers only all these actions are absolutely
> useless.

I agree on the principle. And we have the same issue with indirect mbuf.
Currently, the assumption is that a free mbuf (inside a mempool) is
initialized as a direct mbuf. We can think about optimizations here,
but I'm not sure it should be in this patchset.

[...]

> > @@ -688,14 +704,33 @@ rte_mbuf_to_baddr(struct rte_mbuf *md)
> >   }
> >   /**
> > + * Returns TRUE if given mbuf is cloned by mbuf indirection, or FALSE
> > + * otherwise.
> > + *
> > + * If a mbuf has its data in another mbuf and references it by mbuf
> > + * indirection, this mbuf can be defined as a cloned mbuf.
> > + */
> > +#define RTE_MBUF_CLONED(mb)     ((mb)->ol_flags & IND_ATTACHED_MBUF)
> > +
> > +/**
> >    * Returns TRUE if given mbuf is indirect, or FALSE otherwise.
> >    */
> > -#define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
> > +#define RTE_MBUF_INDIRECT(mb)   RTE_MBUF_CLONED(mb)
> 
> It is still confusing that INDIRECT != !DIRECT.
> May be we have no good options right now, but I'd suggest to at least
> deprecate
> RTE_MBUF_INDIRECT() and completely remove it in the next release.

Agree. I may have missed something, but is my previous suggestion
not doable?

- direct = embeds its own data      (and indirect = !direct)
- clone (or another name) = data is another mbuf
- extbuf = data is in an external buffer

Deprecating the macro is a good idea.

> > +	m->buf_addr = buf_addr;
> > +	m->buf_iova = buf_iova;
> > +
> > +	if (shinfo == NULL) {
> > +		shinfo = RTE_PTR_ALIGN_FLOOR(RTE_PTR_SUB(buf_end,
> > +					sizeof(*shinfo)), sizeof(uintptr_t));
> > +		if ((void *)shinfo <= buf_addr)
> > +			return NULL;
> > +
> > +		m->buf_len = RTE_PTR_DIFF(shinfo, buf_addr);
> > +	} else {
> > +		m->buf_len = buf_len;
> > +	}
> > +
> > +	m->data_len = 0;
> > +
> > +	rte_pktmbuf_reset_headroom(m);
> 
> I would suggest to make data_off one more parameter.
> If I have a buffer with data which I'd like to attach to an mbuf, I'd like
> to control data_off.

Another option is to set the headroom to 0.
Because the after attaching the mbuf to an external buffer, we will
still require to set the length.

A user can do something like this:

	rte_pktmbuf_attach_extbuf(m, buf_va, buf_iova, buf_len, shinfo,
		free_cb, free_cb_arg);
	rte_pktmbuf_append(m, data_len + headroom);
	rte_pktmbuf_adj(m, headroom);

> 
> > +	m->ol_flags |= EXT_ATTACHED_MBUF;
> > +	m->shinfo = shinfo;
> > +
> > +	rte_mbuf_ext_refcnt_set(shinfo, 1);
> 
> Why is assignment used here? Cannot we attach extbuf already attached to
> other mbuf?

In rte_pktmbuf_attach(), this is true. That's not illogical to
keep the same approach here. Maybe an assert could be added?

> May be shinfo should be initialized only if it is not provided (shinfo ==
> NULL on input)?

I don't get why, can you explain please?


Thanks,
Olivier