DPDK patches and discussions
 help / color / mirror / Atom feed
From: Olivier Matz <olivier.matz@6wind.com>
To: Andrew Rybchenko <arybchenko@solarflare.com>
Cc: Yongseok Koh <yskoh@mellanox.com>,
	wenzhuo.lu@intel.com, jingjing.wu@intel.com, dev@dpdk.org,
	konstantin.ananyev@intel.com, adrien.mazarguil@6wind.com,
	nelio.laranjeiro@6wind.com, Thomas Monjalon <thomas@monjalon.net>
Subject: Re: [dpdk-dev] [PATCH v4 1/2] mbuf: support attaching external buffer to mbuf
Date: Tue, 24 Apr 2018 21:15:38 +0200	[thread overview]
Message-ID: <20180424191538.exjgzoif4odhndew@neon> (raw)
In-Reply-To: <ab4fface-d25c-553c-ed93-5398424ce53c@solarflare.com>

On Tue, Apr 24, 2018 at 09:21:00PM +0300, Andrew Rybchenko wrote:
> On 04/24/2018 07:02 PM, Olivier Matz wrote:
> > Hi Andrew, Yongseok,
> > 
> > On Tue, Apr 24, 2018 at 03:28:33PM +0300, Andrew Rybchenko wrote:
> > > On 04/24/2018 04:38 AM, Yongseok Koh wrote:
> > > > This patch introduces a new way of attaching an external buffer to a mbuf.
> > > > 
> > > > Attaching an external buffer is quite similar to mbuf indirection in
> > > > replacing buffer addresses and length of a mbuf, but a few differences:
> > > >     - When an indirect mbuf is attached, refcnt of the direct mbuf would be
> > > >       2 as long as the direct mbuf itself isn't freed after the attachment.
> > > >       In such cases, the buffer area of a direct mbuf must be read-only. But
> > > >       external buffer has its own refcnt and it starts from 1. Unless
> > > >       multiple mbufs are attached to a mbuf having an external buffer, the
> > > >       external buffer is writable.
> > > >     - There's no need to allocate buffer from a mempool. Any buffer can be
> > > >       attached with appropriate free callback.
> > > >     - Smaller metadata is required to maintain shared data such as refcnt.
> > > Really useful. Many thanks. See my notes below.
> > > 
> > > It worries me that detach is more expensive than it really required since it
> > > requires to restore mbuf as direct. If mbuf mempool is used for mbufs
> > > as headers for external buffers only all these actions are absolutely
> > > useless.
> > I agree on the principle. And we have the same issue with indirect mbuf.
> > Currently, the assumption is that a free mbuf (inside a mempool) is
> > initialized as a direct mbuf. We can think about optimizations here,
> > but I'm not sure it should be in this patchset.
> 
> I agree that it should be addressed separately.
> 
> > [...]
> > 
> > > > @@ -688,14 +704,33 @@ rte_mbuf_to_baddr(struct rte_mbuf *md)
> > > >    }
> > > >    /**
> > > > + * Returns TRUE if given mbuf is cloned by mbuf indirection, or FALSE
> > > > + * otherwise.
> > > > + *
> > > > + * If a mbuf has its data in another mbuf and references it by mbuf
> > > > + * indirection, this mbuf can be defined as a cloned mbuf.
> > > > + */
> > > > +#define RTE_MBUF_CLONED(mb)     ((mb)->ol_flags & IND_ATTACHED_MBUF)
> > > > +
> > > > +/**
> > > >     * Returns TRUE if given mbuf is indirect, or FALSE otherwise.
> > > >     */
> > > > -#define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
> > > > +#define RTE_MBUF_INDIRECT(mb)   RTE_MBUF_CLONED(mb)
> > > It is still confusing that INDIRECT != !DIRECT.
> > > May be we have no good options right now, but I'd suggest to at least
> > > deprecate
> > > RTE_MBUF_INDIRECT() and completely remove it in the next release.
> > Agree. I may have missed something, but is my previous suggestion
> > not doable?
> > 
> > - direct = embeds its own data      (and indirect = !direct)
> > - clone (or another name) = data is another mbuf
> > - extbuf = data is in an external buffer
> 
> I guess the problem that it changes INDIRECT semantics since EXTBUF
> is added as well. I think strictly speaking it is an API change.
> Is it OK to make it without announcement?

In any case, there will be an ABI change, because an application
compiled for 18.02 will not be able to handle these new kind of
mbuf.

So unfortunatly yes, I think this kind of changes should first be
announced.

Thomas, what do you think?


> > Deprecating the macro is a good idea.
> > 
> > > > +	m->buf_addr = buf_addr;
> > > > +	m->buf_iova = buf_iova;
> > > > +
> > > > +	if (shinfo == NULL) {
> > > > +		shinfo = RTE_PTR_ALIGN_FLOOR(RTE_PTR_SUB(buf_end,
> > > > +					sizeof(*shinfo)), sizeof(uintptr_t));
> > > > +		if ((void *)shinfo <= buf_addr)
> > > > +			return NULL;
> > > > +
> > > > +		m->buf_len = RTE_PTR_DIFF(shinfo, buf_addr);
> > > > +	} else {
> > > > +		m->buf_len = buf_len;
> > > > +	}
> > > > +
> > > > +	m->data_len = 0;
> > > > +
> > > > +	rte_pktmbuf_reset_headroom(m);
> > > I would suggest to make data_off one more parameter.
> > > If I have a buffer with data which I'd like to attach to an mbuf, I'd like
> > > to control data_off.
> > Another option is to set the headroom to 0.
> > Because the after attaching the mbuf to an external buffer, we will
> > still require to set the length.
> > 
> > A user can do something like this:
> > 
> > 	rte_pktmbuf_attach_extbuf(m, buf_va, buf_iova, buf_len, shinfo,
> > 		free_cb, free_cb_arg);
> > 	rte_pktmbuf_append(m, data_len + headroom);
> > 	rte_pktmbuf_adj(m, headroom);
> > 
> > > > +	m->ol_flags |= EXT_ATTACHED_MBUF;
> > > > +	m->shinfo = shinfo;
> > > > +
> > > > +	rte_mbuf_ext_refcnt_set(shinfo, 1);
> > > Why is assignment used here? Cannot we attach extbuf already attached to
> > > other mbuf?
> > In rte_pktmbuf_attach(), this is true. That's not illogical to
> > keep the same approach here. Maybe an assert could be added?
> > 
> > > May be shinfo should be initialized only if it is not provided (shinfo ==
> > > NULL on input)?
> > I don't get why, can you explain please?
> 
> May be I misunderstand how it should look like when one huge buffer
> is partitioned. I thought that it should be only one shinfo per huge buffer
> to control when it is not used any more by any mbufs with extbuf.

OK I got it.

I think both approach could make sense:
- one shinfo per huge buffer
- or one shinfo per mbuf, and use the callback to manage another refcnt
  (like what Yongseok described)

So I agree with your proposal, shinfo should be initialized by
the caller if it is != NULL, else it can be initialized by
rte_pktmbuf_attach_extbuf().


> Other option is to have shinfo per small buf plus reference counter
> per huge buf (which is decremented when small buf reference counter
> becomes zero and free callback is executed). I guess it is assumed above.
> My fear is that it is too much reference counters:
>  1. mbuf reference counter
>  2. small buf reference counter
>  3. huge buf reference counter
> May be it is possible use (1) for (2) as well?

I would prefer to have only 2 reference counters, one in the mbuf
and one in the shinfo.

  reply	other threads:[~2018-04-24 19:15 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-10  1:25 [dpdk-dev] [PATCH v1 0/6] net/mlx5: add Multi-Packet Rx support Yongseok Koh
2018-03-10  1:25 ` [dpdk-dev] [PATCH v1 1/6] mbuf: add buffer offset field for flexible indirection Yongseok Koh
2018-03-10  1:25 ` [dpdk-dev] [PATCH v1 2/6] net/mlx5: separate filling Rx flags Yongseok Koh
2018-03-10  1:25 ` [dpdk-dev] [PATCH v1 3/6] net/mlx5: add a function to rdma-core glue Yongseok Koh
2018-03-12  9:13   ` Nélio Laranjeiro
2018-03-10  1:25 ` [dpdk-dev] [PATCH v1 4/6] net/mlx5: add Multi-Packet Rx support Yongseok Koh
2018-03-12  9:20   ` Nélio Laranjeiro
2018-03-10  1:25 ` [dpdk-dev] [PATCH v1 5/6] net/mlx5: release Tx queue resource earlier than Rx Yongseok Koh
2018-03-10  1:25 ` [dpdk-dev] [PATCH v1 6/6] app/testpmd: conserve mbuf indirection flag Yongseok Koh
2018-04-02 18:50 ` [dpdk-dev] [PATCH v2 0/6] net/mlx5: add Multi-Packet Rx support Yongseok Koh
2018-04-02 18:50   ` [dpdk-dev] [PATCH v2 1/6] mbuf: add buffer offset field for flexible indirection Yongseok Koh
2018-04-03  8:26     ` Olivier Matz
2018-04-04  0:12       ` Yongseok Koh
2018-04-09 16:04         ` Olivier Matz
2018-04-10  1:59           ` Yongseok Koh
2018-04-11  0:25             ` Ananyev, Konstantin
2018-04-11  5:33               ` Yongseok Koh
2018-04-11 11:39                 ` Ananyev, Konstantin
2018-04-11 14:02                   ` Andrew Rybchenko
2018-04-11 17:18                     ` Yongseok Koh
2018-04-11 17:08                   ` Yongseok Koh
2018-04-12 16:34                     ` Ananyev, Konstantin
2018-04-12 18:58                       ` Yongseok Koh
2018-04-02 18:50   ` [dpdk-dev] [PATCH v2 2/6] net/mlx5: separate filling Rx flags Yongseok Koh
2018-04-02 18:50   ` [dpdk-dev] [PATCH v2 3/6] net/mlx5: add a function to rdma-core glue Yongseok Koh
2018-04-02 18:50   ` [dpdk-dev] [PATCH v2 4/6] net/mlx5: add Multi-Packet Rx support Yongseok Koh
2018-04-02 18:50   ` [dpdk-dev] [PATCH v2 5/6] net/mlx5: release Tx queue resource earlier than Rx Yongseok Koh
2018-04-02 18:50   ` [dpdk-dev] [PATCH v2 6/6] app/testpmd: conserve mbuf indirection flag Yongseok Koh
2018-04-19  1:11 ` [dpdk-dev] [PATCH v3 1/2] mbuf: support attaching external buffer to mbuf Yongseok Koh
2018-04-19  1:11   ` [dpdk-dev] [PATCH v3 2/2] app/testpmd: conserve offload flags of mbuf Yongseok Koh
2018-04-23 11:53   ` [dpdk-dev] [PATCH v3 1/2] mbuf: support attaching external buffer to mbuf Ananyev, Konstantin
2018-04-24  2:04     ` Yongseok Koh
2018-04-25 13:16       ` Ananyev, Konstantin
2018-04-25 16:44         ` Yongseok Koh
2018-04-25 18:05           ` Ananyev, Konstantin
2018-04-23 16:18   ` Olivier Matz
2018-04-24  1:29     ` Yongseok Koh
2018-04-24 15:36       ` Olivier Matz
2018-04-24  1:38 ` [dpdk-dev] [PATCH v4 " Yongseok Koh
2018-04-24  1:38   ` [dpdk-dev] [PATCH v4 2/2] app/testpmd: conserve offload flags of mbuf Yongseok Koh
2018-04-24  5:01   ` [dpdk-dev] [PATCH v4 1/2] mbuf: support attaching external buffer to mbuf Stephen Hemminger
2018-04-24 11:47     ` Yongseok Koh
2018-04-24 12:28   ` Andrew Rybchenko
2018-04-24 16:02     ` Olivier Matz
2018-04-24 18:21       ` [dpdk-dev] ***Spam*** " Andrew Rybchenko
2018-04-24 19:15         ` Olivier Matz [this message]
2018-04-24 20:22           ` [dpdk-dev] " Thomas Monjalon
2018-04-24 21:53             ` Yongseok Koh
2018-04-24 22:15               ` Thomas Monjalon
2018-04-25  8:21               ` Olivier Matz
2018-04-25 15:06             ` Stephen Hemminger
2018-04-24 23:34           ` Yongseok Koh
2018-04-25 14:45             ` Andrew Rybchenko
2018-04-25 17:40               ` Yongseok Koh
2018-04-25  8:28       ` Olivier Matz
2018-04-25  9:08         ` Yongseok Koh
2018-04-25  9:19           ` Yongseok Koh
2018-04-25 20:00             ` Olivier Matz
2018-04-25 22:54               ` Yongseok Koh
2018-04-24 22:30     ` Yongseok Koh
2018-04-25  2:53 ` [dpdk-dev] [PATCH v5 " Yongseok Koh
2018-04-25  2:53   ` [dpdk-dev] [PATCH v5 2/2] app/testpmd: conserve offload flags of mbuf Yongseok Koh
2018-04-25 13:31   ` [dpdk-dev] [PATCH v5 1/2] mbuf: support attaching external buffer to mbuf Ananyev, Konstantin
2018-04-25 17:06     ` Yongseok Koh
2018-04-25 17:23       ` Ananyev, Konstantin
2018-04-25 18:02         ` Yongseok Koh
2018-04-25 18:22           ` Yongseok Koh
2018-04-25 18:30             ` Yongseok Koh
2018-04-26  1:10 ` [dpdk-dev] [PATCH v6 " Yongseok Koh
2018-04-26  1:10   ` [dpdk-dev] [PATCH v6 2/2] app/testpmd: conserve offload flags of mbuf Yongseok Koh
2018-04-26 11:39   ` [dpdk-dev] [PATCH v6 1/2] mbuf: support attaching external buffer to mbuf Ananyev, Konstantin
2018-04-26 16:05   ` Andrew Rybchenko
2018-04-26 16:10     ` Thomas Monjalon
2018-04-26 19:42       ` Olivier Matz
2018-04-26 19:58         ` Thomas Monjalon
2018-04-26 20:07           ` Olivier Matz
2018-04-26 20:24             ` Thomas Monjalon
2018-04-26 17:18     ` Yongseok Koh
2018-04-26 19:45       ` Olivier Matz
2018-04-27  0:01 ` [dpdk-dev] [PATCH v7 " Yongseok Koh
2018-04-27  0:01   ` [dpdk-dev] [PATCH v7 2/2] app/testpmd: conserve offload flags of mbuf Yongseok Koh
2018-04-27  8:00     ` Andrew Rybchenko
2018-04-27  7:22   ` [dpdk-dev] [PATCH v7 1/2] mbuf: support attaching external buffer to mbuf Andrew Rybchenko
2018-04-27 17:22 ` [dpdk-dev] [PATCH v8 " Yongseok Koh
2018-04-27 17:22   ` [dpdk-dev] [PATCH v8 2/2] app/testpmd: conserve offload flags of mbuf Yongseok Koh
2018-04-27 18:09   ` [dpdk-dev] [PATCH v8 1/2] mbuf: support attaching external buffer to mbuf Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180424191538.exjgzoif4odhndew@neon \
    --to=olivier.matz@6wind.com \
    --cc=adrien.mazarguil@6wind.com \
    --cc=arybchenko@solarflare.com \
    --cc=dev@dpdk.org \
    --cc=jingjing.wu@intel.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=nelio.laranjeiro@6wind.com \
    --cc=thomas@monjalon.net \
    --cc=wenzhuo.lu@intel.com \
    --cc=yskoh@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).