DPDK patches and discussions
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
	"Konstantin Ananyev" <konstantin.v.ananyev@yandex.ru>,
	"Andrew Rybchenko" <andrew.rybchenko@oktetlabs.ru>,
	"Jerin Jacob" <jerinjacobk@gmail.com>
Cc: "dpdk-dev" <dev@dpdk.org>, <techboard@dpdk.org>,
	"nd" <nd@arm.com>, "nd" <nd@arm.com>
Subject: RE: Optimizations are not features
Date: Thu, 30 Jun 2022 17:39:57 +0200	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D8719A@smartserver.smartshare.dk> (raw)
In-Reply-To: <DBAPR08MB5814C8A0A365FA629EE3B26398BB9@DBAPR08MB5814.eurprd08.prod.outlook.com>

> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Wednesday, 29 June 2022 22.44
> 
> <snip>
> 
> >
> > 04/06/2022 13:51, Andrew Rybchenko пишет:
> > > On 6/4/22 15:19, Morten Brørup wrote:
> > >>> From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > >>> Sent: Saturday, 4 June 2022 13.10
> > >>>
> > >>> On Sat, Jun 4, 2022 at 3:30 PM Andrew Rybchenko
> > >>> <andrew.rybchenko@oktetlabs.ru> wrote:
> > >>>>
> > >>>> On 6/4/22 12:33, Jerin Jacob wrote:
> > >>>>> On Sat, Jun 4, 2022 at 2:39 PM Morten Brørup
> > >>> <mb@smartsharesystems.com> wrote:
> > >>>>>>
> > >>>>>> I would like the DPDK community to change its view on compile
> > >>>>>> time
> > >>> options. Here is why:
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> Application specific performance micro-optimizations like
> > >>>>>> “fast mbuf free” and “mbuf direct re-arm” are being added to
> > >>>>>> DPDK and presented as features.
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> They are not features, but optimizations, and I don’t understand
> > >>>>>> the need for them to be available at run-time!
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> Instead of adding a bunch of exotic exceptions to the fast path
> > >>>>>> of the PMDs, they should be compile time options. This will
> > >>>>>> improve performance by avoiding branches in the fast path, both
> > >>>>>> for the applications using them, and for generic applications
> > >>>>>> (where the exotic code is omitted).
> > >>>>>
> > >>>>> Agree. I think, keeping the best of both worlds would be
> > >>>>>
> > >>>>> - Enable the feature/optimization at runtime
> > >>>>> - Have a compile-time option to disable the feature/optimization
> > >>>>>   as an override.
> > >>>>
> > >>>> It is hard to find the right balance, but in general compile time
> > >>>> options are a nightmare for maintenance. The number of required
> > >>>> builds will grow exponentially.
> > >>
> > >> Test combinations are exponential for N features, regardless of
> > >> whether they are runtime or compile time options.
> > >
> > > But since I'm talking about build checks I don't care about
> > > exponential growth in run time. Yes, testing should care, but it is
> > > a separate story.
> > >
> > >>
> > >>>> Of course, we can limit the number of checked combinations, but it
> > >>>> will result in a flow of patches to fix the build in other cases.
> > >>>
> > >>> The build breakage can be fixed if we use (2) vs (1)
> > >>>
> > >>> 1)
> > >>> #ifdef ...
> > >>> My feature
> > >>> #endif
> > >>>
> > >>> 2)
> > >>> static __rte_always_inline int
> > >>> rte_has_xyz_feature(void)
> > >>> {
> > >>> #ifdef RTE_LIBRTE_XYZ_FEATURE
> > >>>         return RTE_LIBRTE_XYZ_FEATURE;
> > >>> #else
> > >>>         return 0;
> > >>> #endif
> > >>> }
> > >>>
> > >>> if (rte_has_xyz_feature()) {
> > >>>         /* My feature code */
> > >>> }
> > >>>
> > >
> > > Jerin, thanks, very good example.
> > >
> > >> I'm not sure all the features can be covered by that, e.g. added
> > >> fields in structures.
> > >
> > > +1
> > >
> > >>
> > >> Also, I would consider such features "opt in" at compile time only.
> > >> As such, they could be allowed to break the ABI/API.
> > >>
> > >>>
> > >>>
> > >>>> Also compile time options tend to make code less readable which
> > >>>> makes all aspects of the development harder.
> > >>>>
> > >>>> Yes, compile time is nice for micro optimizations, but I have
> > >>>> great concerns whether it is the right way to go.
> > >>>>
> > >>>>>> Please note that I am only talking about the performance
> > >>>>>> optimizations that are limited to application specific use cases.
> > >>>>>> I think it makes sense to require that performance optimizing an
> > >>>>>> application also requires recompiling the performance critical
> > >>>>>> libraries used by it.
> > >>>>>>
> > >>>>>>
> > >>>>>> Allowing compile time options for application specific performance
> > >>>>>> optimizations in DPDK would also open a path for other
> > >>>>>> optimizations, which can only be achieved at compile time, such as
> > >>>>>> “no fragmented packets”, “no attached mbufs” and “single mbuf pool”.
> > >>>>>> And even more exotic optimizations, such as the “indexed mempool
> > >>>>>> cache”, which was rejected due to ABI violations – they could be
> > >>>>>> marked as “risky and untested” or similar, but still be part of
> > >>>>>> the DPDK main repository.
> > >>>>>>
> >
> >
> > Thanks Morten for bringing it up, it is an interesting topic.
> > Though I look at it from a different angle.
> > All optimizations you mentioned above introduce new limitations:
> > MBUF_FAST_FREE - no indirect mbufs and no multiple mempools;
> > mempool object indexes - mempool size is limited to 4GB;
> > direct rearm - drop the ability to stop/reconfigure the TX queue
> > while the RX queue is still running; etc.
> > Note that none of these limitations are forced by HW.
> > All of them are pure SW limitations that developers forced in (or
> > tried to) to get a bit of extra performance.
> > That's a concerning tendency.
> >
> > As more and more such 'optimizations via limitation' come in:
> > - The DPDK feature list will become more and more fragmented.
> > - It would cause more and more confusion for the users.
> > - Unmet expectations - the difference in performance between the
> >   'default' and 'optimized' versions of DPDK will become bigger and
> >   bigger.

I strongly disagree with this bullet!

We should not limit the performance to only what is possible with all features enabled.

An application developer should have the ability to disable performance-costly features not being used.
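
For reference, this is roughly how the existing run-time knob is selected
from the application side today. A minimal sketch, following the common
pattern in the DPDK example applications; the helper name and the
single-queue configure call are illustrative:

#include <rte_ethdev.h>

/*
 * Enable "fast mbuf free" only when the application knows it meets the
 * documented constraints: all mbufs transmitted on the port come from a
 * single mempool and are direct with refcnt == 1.
 */
static int
configure_port_with_fast_free(uint16_t port_id, struct rte_eth_conf *port_conf)
{
	struct rte_eth_dev_info dev_info;
	int ret;

	ret = rte_eth_dev_info_get(port_id, &dev_info);
	if (ret != 0)
		return ret;

	if (dev_info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE)
		port_conf->txmode.offloads |= RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE;

	/* 1 RX queue and 1 TX queue; adjust to the application's needs. */
	return rte_eth_dev_configure(port_id, 1, 1, port_conf);
}

Even when the application opts in like this, the PMD fast path still
carries the run-time check; the point above is that a compile time option
could remove the unused branch from the binary entirely.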

> > - As Andrew already mentioned, maintaining all these 'sub-flavours'
> >    of DPDK will become more and more difficult.
> The point that we need to remember is that these features/optimizations
> are introduced after seeing performance issues in practical use cases.
> DPDK is not being used in just one use case, it is being used in
> several use cases which have their own unique requirements. Is 4GB
> enough for packet buffers - yes, it is enough in certain use cases. Are
> there NICs with a single port - yes, there are. HW is being created
> because use cases and business cases exist. It is obvious that as DPDK
> gets adopted on more platforms that differ largely, the features will
> increase and it will become complex. Complexity should not be used as a
> criterion to reject patches.
> 
> There is a different perspective to what you are calling 'limitations'.
> I can argue that multiple mempools, and stopping/reconfiguring a TX
> queue while the RX queue is still running, are exotic. Just because
> those are allowed currently (probably accidentally) does not mean they
> are being used. Are there use cases that make use of these features?
> 
> The base/existing design for DPDK was done with one particular HW
> architecture in mind where there was an abundance of resources.
> Unfortunately, that HW architecture is fast evolving and DPDK is
> adopted in use cases where those kinds of resources are not available.
> For example, efficiency cores are being introduced by every CPU vendor
> now. Soon enough, we will see big-little architecture in networking as
> well. The existing PMD design introduces 512B of stores (256B for
> copying to a stack variable and 256B to store the lcore cache) and 256B
> of load/store on the RX side every 32 packets, back to back. It doesn't
> make sense to have that kind of memcpy for little/efficiency cores just
> for the driver code.
> 
> >
> > So, probably instead of making such changes easier, we need somehow to
> > persuade developers to think more about optimizations that would be
> > generic and transparent to the user.
> Or maybe we need to think of creating alternative ways of programming.

Exactly what I was hoping to achieve with this discussion.

> 
> > I do realize that it is not always possible due to various reasons
> > (HW limitations, external dependencies, etc.) but that's another story.
> >
> > Let's take MBUF_FAST_FREE as an example.
> > In fact, I am not sure that we need it as a tx offload flag at all.
> > The PMD TX-path has all the necessary information to decide at run-time
> > whether it can do fast_free() or not:
> > At tx_burst() the PMD can check whether all mbufs satisfy these
> > conditions (same mempool, refcnt==1) and update some fields and/or
> > counters inside the TXQ to reflect it.
> > Then, at tx_free() we can use this info to decide between fast_free()
> > and normal_free().
> > As tx_burst() reads the mbuf fields anyway, the impact of this extra
> > step would, I guess, be minimal.
> > Yes, most likely, it wouldn't be as fast as with the current TX offload
> > flag, or with the conditional compilation approach.
> > But it might still be significantly faster than normal_free(), plus such
> > an approach would be generic and transparent to the user.
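
For illustration, a rough and untested sketch of the kind of run-time
detection described above. The txq state and helper names are invented
for the example (no real driver looks like this), and a real
implementation would also need to re-evaluate the state per burst:

#include <stdbool.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

/* Hypothetical per-TX-queue bookkeeping. */
struct txq_state {
	struct rte_mempool *mp; /* mempool seen so far, NULL if none yet */
	bool fast_free_ok;      /* all mbufs so far: direct, refcnt 1, same pool */
};

/* Called from tx_burst() while the mbuf fields are being read anyway. */
static inline void
txq_track_fast_free(struct txq_state *txq, struct rte_mbuf *m)
{
	if (!txq->fast_free_ok)
		return;
	if (!RTE_MBUF_DIRECT(m) || rte_mbuf_refcnt_read(m) != 1 ||
			(txq->mp != NULL && txq->mp != m->pool)) {
		txq->fast_free_ok = false;
		return;
	}
	txq->mp = m->pool;
}

/* Called when completed TX descriptors are cleaned up. */
static inline void
txq_free_completed(struct txq_state *txq, struct rte_mbuf **pkts,
		unsigned int n)
{
	unsigned int i;

	if (txq->fast_free_ok && txq->mp != NULL) {
		/* fast_free(): bulk return to the single mempool */
		rte_mempool_put_bulk(txq->mp, (void **)pkts, n);
	} else {
		/* normal_free(): handles clones and multiple mempools */
		for (i = 0; i < n; i++)
			rte_pktmbuf_free_seg(pkts[i]);
	}
}

Whether the extra per-mbuf checks in tx_burst() are cheap enough is
exactly the trade-off being discussed here.
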
> IMO, this depends on the philosophy that we want to adopt. I would
> prefer to make control plane complex for performance gains on the data
> plane. The performance on the data plane has a multiplying effect due
> to the ratio of number of cores assigned for data plane vs control
> plane.

Yes. And if some performance-costly feature cannot be moved out of the data plane and into the control plane, it should be compile time optional.

And please note that I don't buy the argument that "it will be caught by branch prediction". You are not allowed to fill up my branch predictor table with cruft!
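
To make that concrete, here is a sketch of the compile time alternative,
using Jerin's pattern (2) from earlier in the thread. The build option
and helper names are hypothetical; the point is that with the option
disabled the check becomes a constant zero, so the compiler removes the
block and no branch predictor entry is spent on it:

#include <stdint.h>
#include <rte_common.h>

/* Hypothetical build-time option, e.g. defined via rte_config.h. */
#ifdef RTE_PMD_XYZ_EXOTIC_OPTIMIZATION
#define PMD_XYZ_EXOTIC 1
#else
#define PMD_XYZ_EXOTIC 0
#endif

static __rte_always_inline int
pmd_has_exotic_optimization(void)
{
	return PMD_XYZ_EXOTIC;
}

/*
 * Illustrative fast-path helper. With the option disabled, the first
 * branch folds to constant-false and disappears: no run-time branch,
 * no branch predictor entry, no extra icache footprint.
 */
static inline uint16_t
pmd_tx_batch_size(uint16_t nb_pkts)
{
	if (pmd_has_exotic_optimization()) {
		/* Application-specific shortcut, compiled in on request. */
		return RTE_MIN(nb_pkts, (uint16_t)64);
	}

	/* Generic path, always present. */
	return RTE_MIN(nb_pkts, (uint16_t)32);
}

The same gate can hide a larger block of feature code, exactly as in the
rte_has_xyz_feature() example quoted above.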

> 
> I am not against evaluating alternatives, but the alternative
> approaches need to have similar (not the same) performance.
> 
> >
> > Konstantin


Thread overview: 14+ messages
2022-06-04  9:09 Morten Brørup
2022-06-04  9:33 ` Jerin Jacob
2022-06-04 10:00   ` Andrew Rybchenko
2022-06-04 11:10     ` Jerin Jacob
2022-06-04 12:19       ` Morten Brørup
2022-06-04 12:51         ` Andrew Rybchenko
2022-06-05  8:15           ` Morten Brørup
2022-06-05 16:05           ` Stephen Hemminger
2022-06-06  9:35           ` Konstantin Ananyev
2022-06-29 20:44             ` Honnappa Nagarahalli
2022-06-30 15:39               ` Morten Brørup [this message]
2022-07-03 19:38               ` Konstantin Ananyev
2022-07-04 16:33                 ` Stephen Hemminger
2022-07-04 22:06                   ` Morten Brørup
