From: Thomas Monjalon <thomas.monjalon@6wind.com>
To: Panu Matilainen <pmatilai@redhat.com>
Cc: dev@dpdk.org, "dprovan@bivio.net" <dprovan@bivio.net>
Subject: Re: [dpdk-dev] [PATCH v6 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API
Date: Mon, 29 Feb 2016 17:14:15 +0100
Message-ID: <1837798.vOsM3O1Uf3@xps13>
In-Reply-To: <56D42296.6090603@redhat.com>

2016-02-29 12:51, Panu Matilainen:
> On 02/24/2016 03:23 PM, Ananyev, Konstantin wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Panu Matilainen
> >> On 02/23/2016 07:35 AM, Xie, Huawei wrote:
> >>> On 2/22/2016 10:52 PM, Xie, Huawei wrote:
> >>>> On 2/4/2016 1:24 AM, Olivier MATZ wrote:
> >>>>> On 01/27/2016 02:56 PM, Panu Matilainen wrote:
> >>>>>> Since rte_pktmbuf_alloc_bulk() is an inline function, it is not part of
> >>>>>> the library ABI and should not be listed in the version map.
> >>>>>>
> >>>>>> I assume it's inline for performance reasons, but then you lose the
> >>>>>> benefits of dynamic linking, such as the ability to fix bugs and/or improve
> >>>>>> it by just updating the library. Since the point of having a bulk API is
> >>>>>> to improve performance by reducing the number of calls required, does it
> >>>>>> really have to be inline? As in, have you actually measured the
> >>>>>> difference between inline and non-inline and decided it's worth all the
> >>>>>> downsides?
> >>>>> Agree with Panu. It would be interesting to compare the performance
> >>>>> between inline and non-inline to decide whether to inline it or not.
> >>>> Will update after I have gathered more data. Inline could show an obvious
> >>>> performance difference in some cases.
> >>>
> >>> Panu and Olivier:
> >>> I wrote a simple benchmark. It runs 10M rounds; in each round,
> >>> 8 mbufs are allocated through the bulk API and then freed.
> >>> These are the CPU cycles measured (Intel(R) Xeon(R) CPU E5-2680 0 @
> >>> 2.70GHz, CPU isolated, timer interrupt disabled, RCU offloaded).
> >>> Btw, I have removed some outlying data points, which occur roughly
> >>> 1 time in 10. Sometimes the observed user usage suddenly disappeared; no clue
> >>> what happened.
> >>>
> >>> With 8 mbufs allocated, there is about a 6% performance increase using inline.
> >> [...]
> >>>
> >>> With 16 mbufs allocated, we could still observe a clear performance
> >>> difference, though only 1%-2%.
> >>>
> >> [...]
> >>>
> >>> With 32/64 mbufs allocated, the deviation of the data itself would hide
> >>> the performance difference.
> >>> So we prefer using inline for performance.
> >>
> >> At least I was more after real-world performance in a real-world
> >> use-case rather than CPU cycles in a microbenchmark. We know function
> >> calls have a cost, but the benefits tend to outweigh the cons.
> >>
> >> Inline functions have their place and they're far less evil in
> >> project-internal use, but in a library's public API they are BAD and should be ...
> >> well, not banned because there are exceptions to every rule, but highly
> >> discouraged.
> >
> > Why is that?
>
> For all the reasons static linking is bad, and what's worse it forces
> the static linking badness into dynamically linked builds.
>
> If there's a bug (security or otherwise) in a library, a distro wants to
> supply an updated package which fixes that bug and be done with it. But
> if that bug is in inlined code, supplying an update is not enough:
> you also need to recompile everything using that code, and somehow
> inform customers possibly using that code that they need to not only
> update the library but recompile their apps as well. That is
> precisely the reason distros go to great lengths to avoid *any*
> statically linked apps and libs in the distro, completely regardless of
> the performance overhead.
>
> In addition, inlined code complicates ABI compatibility issues because
> some of the code is on the "wrong" side, and worse, it bypasses all the
> other ABI compatibility safeguards like soname and symbol versioning.
>
> As I said, inlined code is fine for internal consumption, but incredibly
> bad for public interfaces. And of course, the more complicated a
> function is, the greater the potential for needing bugfixes.
>
> Mind you, none of this is specific to this particular
> function, except in the sense that bulk operations offer a better route to
> performance improvements than just inlining everything.
>
> > As you can see, right now we have all mbuf alloc/free routines as static inline.
> > And I think we would like to keep it like that.
> > So why should that particular function be different?
>
> Because there's much less need to have it inlined since the function
> call overhead is "amortized" by the fact it's doing bulk operations. "We
> always did it that way" is not a very good reason :)
>
> > After all, that function is nothing more than a wrapper
> > around rte_mempool_get_bulk() plus a loop, unrolled by 4, of rte_pktmbuf_reset() calls.
> > So unless the mempool get/put API changes, I can hardly see how there could be any ABI
> > breakage in the future.
> > About 'real world' performance gain - it was a 'real world' performance problem,
> > that we tried to solve by introducing that function:
> > http://dpdk.org/ml/archives/dev/2015-May/017633.html
> >
> > And according to the user feedback, it does help:
> > http://dpdk.org/ml/archives/dev/2016-February/033203.html
>
> The question is not whether the function is useful, not at all. The
> question is whether the real-world case sees any measurable difference
> in performance if the function is made non-inline.

This is a valid question, and it applies to a large part of DPDK.
But it's something to measure and change more globally than just
a new function.

Generally speaking, any effort to reduce the size of the exported headers
will be welcome.

That said, this patch won't be blocked.
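
For readers following the thread, the shape of the function under discussion
(a bulk get from a pool followed by a reset loop unrolled by 4) can be sketched
as below. This is only an illustration under stated assumptions, not the DPDK
code: toy_pool, toy_buf and every toy_* function are hypothetical stand-ins for
the mempool, the mbuf, rte_mempool_get_bulk() and rte_pktmbuf_reset(), and the
unrolling is written as a plain 4-wide loop plus a remainder loop.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POOL_SIZE 64

/* Hypothetical stand-in for an mbuf; not DPDK's struct rte_mbuf. */
struct toy_buf {
    unsigned data_len;
    unsigned refcnt;
    struct toy_buf *next;
};

/* Hypothetical stand-in for a mempool: a LIFO stack of free buffers. */
struct toy_pool {
    struct toy_buf storage[TOY_POOL_SIZE];
    struct toy_buf *free_stack[TOY_POOL_SIZE];
    unsigned free_count;
};

/* Put every buffer of the backing storage on the free stack. */
static inline void toy_pool_init(struct toy_pool *p)
{
    for (unsigned i = 0; i < TOY_POOL_SIZE; i++)
        p->free_stack[i] = &p->storage[i];
    p->free_count = TOY_POOL_SIZE;
}

/* All-or-nothing bulk get: 0 on success, -1 if fewer than n buffers are free. */
static inline int toy_pool_get_bulk(struct toy_pool *p, struct toy_buf **objs,
                                    unsigned n)
{
    if (n > p->free_count)
        return -1;
    for (unsigned i = 0; i < n; i++)
        objs[i] = p->free_stack[--p->free_count];
    return 0;
}

/* Return n buffers to the pool. */
static inline void toy_pool_put_bulk(struct toy_pool *p, struct toy_buf **objs,
                                     unsigned n)
{
    assert(p->free_count + n <= TOY_POOL_SIZE);
    for (unsigned i = 0; i < n; i++)
        p->free_stack[p->free_count++] = objs[i];
}

/* Bring one buffer back to a freshly allocated state. */
static inline void toy_buf_reset(struct toy_buf *b)
{
    b->data_len = 0;
    b->refcnt = 1;
    b->next = NULL;
}

/*
 * The wrapper: one bulk get, then reset each buffer, four per iteration.
 * Whether this is "static inline" in a public header or an exported
 * function in the library is exactly the trade-off debated in this thread.
 */
static inline int toy_buf_alloc_bulk(struct toy_pool *p, struct toy_buf **bufs,
                                     unsigned count)
{
    unsigned i = 0;

    if (toy_pool_get_bulk(p, bufs, count) != 0)
        return -1;

    /* Main loop, unrolled by 4. */
    for (; i + 4 <= count; i += 4) {
        toy_buf_reset(bufs[i]);
        toy_buf_reset(bufs[i + 1]);
        toy_buf_reset(bufs[i + 2]);
        toy_buf_reset(bufs[i + 3]);
    }
    /* Remainder: 0 to 3 buffers. */
    for (; i < count; i++)
        toy_buf_reset(bufs[i]);

    return 0;
}
```

A caller would initialise a pool with toy_pool_init(), request a batch with
toy_buf_alloc_bulk() and hand the buffers back with toy_pool_put_bulk(). The
function-call cost is paid once per batch rather than once per buffer, which is
the "amortized" overhead Panu refers to; whether moving such a body out of the
header costs anything measurable in a real workload is the open question.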