From: Vadim Suraev <vadim.suraev@gmail.com>
To: Olivier MATZ <olivier.matz@6wind.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] rte_mbuf: scattered pktmbufs freeing optimization
Date: Fri, 27 Feb 2015 19:09:32 +0200 [thread overview]
Message-ID: <CAJ0CJ8=0LCSrBE4QYA1bdKaXRGdNDDhzhpbwJ4oSj_tT+t=CQA@mail.gmail.com> (raw)
In-Reply-To: <54F06F3A.40401@6wind.com>
Hi, Olivier, Hi Konstantin,
>Indeed, this function looks useful, and I also have a work in progress
>on this topic, but currently it is not well tested.
I'm sorry, I didn't know. I'll not interfere with my patch))
>About the inlining, I have no objection now, although Stephen may be
>right. I think we can consider un-inline some functions, based on
>performance measurements.
I've also noticed in many cases it makes no difference. Seems to be some
trade-off.
>- clarify the difference between raw_alloc/raw_free and
> mempool_get/mempool_put: For instance, I think that the reference
> counter initialization should be managed by rte_pktmbuf_reset() like
> the rest of the fields, therefore raw_alloc/raw_free could be replaced
It looks useful to me since not all of the fields are used in every
particular application so
if the allocation is decoupled from reset, one can save some cycles.
Regards,
Vadim.
On Fri, Feb 27, 2015 at 3:20 PM, Olivier MATZ <olivier.matz@6wind.com>
wrote:
> Hi Vadim, Hi Konstantin,
>
> On 02/27/2015 01:18 PM, Vadim Suraev wrote:
> > Hi, Konstantin,
> >
> >> Seems really useful.
>
> Indeed, this function looks useful, and I also have a work in progress
> on this topic, but currently it is not well tested.
>
> As we are on the mbuf subject, for 2.1, I would like to discuss some
> points of mbuf API:
>
> - support mbuf and segment bulk allocation/free
> - clarify the difference between raw_alloc/raw_free and
> mempool_get/mempool_put: For instance, I think that the reference
> counter initialization should be managed by rte_pktmbuf_reset() like
> the rest of the fields, therefore raw_alloc/raw_free could be replaced
> by mempool functions (this would imply some modifications in drivers).
> - clarify which functions are internal and which are not. Some could be
> made public and hence should be documented: from what I see, the
> vhost example currently uses internal APIs and I suspect it may
> not work when MBUF_DEBUG=y.
> - factorize rte_rxmbuf_alloc() which is duplicated in almost any driver
> - discuss if driver should be calling pktmbuf_reset().
> It would avoid some bugs where fields are not properly initialized.
> - check if rte_pktmbuf_free() is called on mbufs with m->next or
> refcount not initialized
> - support clones of clones
> - support attach/detach of mbufs embedding a private part
> - move m->next in the first cache line as it's initialized in rx
> - rename "__rte_pktmbuf_prefree_seg" in something clearer like "decref".
>
> To avoid doing the work twice, please note that I already have an
> implementation for:
> - rte_mbuf_raw_free_bulk(): each mbuf return in its own pool
> - rte_mbuf_free_seg_bulk(): similar to call rte_pktmbuf_free_seg()
> on each mbuf
> but they are not tested at all now.
>
> I will submit a RFC series in the coming weeks that addresses these
> issues. It could be used as a basis for a discussion.
>
> Sorry Vadim for polluting the topic, but I think the subjects are
> quite linked together. Please find some comments on your patch below.
>
> >> One thought - why to introduce the limitation that all mbufs have to be
> > from the same mempool?
> >> I think you can reorder it a bit, so it can handle situation when
> chained
> > mbufs belong to different mempools.
> >
> > I had a doubt, my concern was how practical is that (multiple mempools)
> > case?
>
> I think having several mempool is common on multi-socket machines.
>
> > Do you think there should be two versions: lightweight (with the
> > restriction) and generic?
>
> I think having the version that supports different mempools is more
> important. Checking if the mempool is the same may not cost too much
> compared to the rest of the work. But if we find a good use case
> where having the lightweight version is preferable, we could have
> 2 versions (ex: if performance is really different).
>
> >>> +/* This macro defines the size of max bulk of mbufs to free for
> >> rte_pktmbuf_free_bulk */
> >>> +#define MAX_MBUF_FREE_SIZE 32
>
> I wonder if it's required to have this macro.
>
> In my implementation, I use a dynamic-sized table:
>
> static inline void
> rte_mbuf_free_seg_bulk(struct rte_mbuf **m_table, unsigned n)
> {
> void *obj_table[n];
> [...]
>
>
> >>> + *
> >>> + * @param head
> >>> + * The head of mbufs to be freed chain
> >>> + */
> >>> +
> >>> +static inline void __attribute__((always_inline))
> >>> +rte_pktmbuf_free_bulk(struct rte_mbuf *head)
>
> About the inlining, I have no objection now, although Stephen may be
> right. I think we can consider un-inline some functions, based on
> performance measurements.
>
> >>> +{
> >>> + void *mbufs[MAX_MBUF_FREE_SIZE];
> >>> + unsigned mbufs_count = 0;
> >>> + struct rte_mbuf *next;
> >>> +
> >>> + RTE_MBUF_MEMPOOL_CHECK1(head);
> >>> +
> >>> + while(head) {
> >>> + next = head->next;
> >>> + head->next = NULL;
> >>> + if(__rte_pktmbuf_prefree_seg(head)) {
> >>> + RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(head) == 0);
> >>> + RTE_MBUF_MEMPOOL_CHECK2(head);
> >>> + mbufs[mbufs_count++] = head;
> >>> + }
> >>> + head = next;
> >>> + if(mbufs_count == MAX_MBUF_FREE_SIZE) {
> >>> + rte_mempool_put_bulk(((struct rte_mbuf
> >> *)mbufs[0])->pool,mbufs,mbufs_count);
> >>> + mbufs_count = 0;
> >>> + }
> >>> + }
> >>> + if(mbufs_count > 0) {
> >>> + rte_mempool_put_bulk(((struct rte_mbuf
> >> *)mbufs[0])->pool,mbufs,mbufs_count);
> >>> + }
> >>> +}
> >>> +
>
> Also, I think a unit test should be added.
>
> Thanks,
> Olivier
>
next prev parent reply other threads:[~2015-02-27 17:09 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-02-26 23:15 vadim.suraev
2015-02-27 0:49 ` Stephen Hemminger
2015-02-27 11:17 ` Ananyev, Konstantin
2015-02-27 12:18 ` Vadim Suraev
[not found] ` <2601191342CEEE43887BDE71AB977258213F2E3D@irsmsx105.ger.corp.intel.com>
2015-02-27 13:10 ` Ananyev, Konstantin
2015-02-27 13:20 ` Olivier MATZ
2015-02-27 17:09 ` Vadim Suraev [this message]
2015-03-04 8:54 ` Olivier MATZ
2015-03-06 23:24 ` Vadim Suraev
2015-03-09 8:38 ` Olivier MATZ
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAJ0CJ8=0LCSrBE4QYA1bdKaXRGdNDDhzhpbwJ4oSj_tT+t=CQA@mail.gmail.com' \
--to=vadim.suraev@gmail.com \
--cc=dev@dpdk.org \
--cc=olivier.matz@6wind.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).