RE: Optimizations are not features


From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Stephen Hemminger" <stephen@networkplumber.org>,
	"Konstantin Ananyev" <konstantin.v.ananyev@yandex.ru>
Cc: "Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
	"Andrew Rybchenko" <andrew.rybchenko@oktetlabs.ru>,
	"Jerin Jacob" <jerinjacobk@gmail.com>, "dpdk-dev" <dev@dpdk.org>,
	<techboard@dpdk.org>, "nd" <nd@arm.com>
Subject: RE: Optimizations are not features
Date: Tue, 5 Jul 2022 00:06:49 +0200
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D871A8@smartserver.smartshare.dk>
In-Reply-To: <20220704093311.0582d592@hermes.local>

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Monday, 4 July 2022 18.33
> 
> On Sun, 3 Jul 2022 20:38:21 +0100
> Konstantin Ananyev <konstantin.v.ananyev@yandex.ru> wrote:
> 
> > >
> > > The base/existing design for DPDK was done with one particular HW
> > > architecture in mind where there was an abundance of resources.
> > > Unfortunately, that HW architecture is fast evolving and DPDK is
> > > adopted in use cases where that kind of resources are not available.
> > > For ex: efficiency cores are being introduced by every CPU vendor now.
> > > Soon enough, we will see big-little architecture in networking as well.
> > > The existing PMD design introduces 512B of stores (256B for copying to
> > > stack variable and 256B to store lcore cache) and 256B load/store on RX
> > > side every 32 packets back to back. It doesn't make sense to have that
> > > kind of memcopy for little/efficiency cores just for the driver code.
> > >
> > I don't object about specific use-case optimizations.
> > Specially if the use-case is a common one.

Or exotic, but high-volume, use cases! Those usually get a lot of attention from sales and product management people. :-)

DPDK needs to support those in the mainline, or we will end up with forks like Qualcomm's QSDK fork of the Linux kernel. (The QSDK fork from Qualcomm, a leading Wi-Fi chipset vendor, bypasses a lot of the Linux kernel's IP stack to provide much higher throughput for one specific use case, which is quite a high-volume one: a Wi-Fi Access Point.)

> > But I think such changes has to be transparent to the user as
> > much as possible and shouldn't cause further DPDK code fragmentation
> > (new CONFIG options, etc.).
> > I understand that it is not always possible, but for pure SW based
> > optimizations, I think it is a reasonable expectation.
> 
> Great discussion.
> 
> Also, if you look back at the mailing list history, you can see that
> lots of users just
> use DPDK because it is "go fast" secret sauce and have not
> understanding of the internals.

Certainly, DPDK should still do that!

I just want DPDK to be able to go faster for experts.

Car analogy: If you buy a fast car, it will go fast. If you bring it to a tuning specialist, it will go faster. Similarly, DPDK should go "fast", but also accept that specialists can make it go "faster".

> 
> My concern, is that if one untestable optimization goes in for one
> hardware platform then
> users will enable it all the time thinking it makes any and all uses
> cases faster.
> Try explaining to a Linux user that the real-time kernel is *not*
> faster than
> the normal kernel...

Yes, because of the common misconception that faster equals higher bandwidth. But the real-time kernel does provide lower latency (under certain conditions), which means faster to some of us. I'm sorry... working with latency as one of our KPIs, I just couldn't resist! ;-)

Seriously, DPDK cannot be limited to catering to everyone on Stack Overflow!

Jokes aside...

When we started using DPDK at SmartShare Systems, DPDK was a highly optimized development kit for embedded network appliances, perfect for our SmartShare StraightShaper WAN optimization appliances and future roadmap. Over time, DPDK has morphed into a packet processing library for Ubuntu and Red Hat, with a lot of added features we don't use, and no ability to remove those added features. Those added features potentially degrade the fast path performance, and increase the risk of bugs at system level.

Some software optimizations have been proposed to DPDK, to support some specific high-volume use cases. "mbuf fast free" got accepted, but "direct re-arm" is getting a lot of push-back, and the most recent "IOVA VA only mode" is another new optimization suggestion being discussed.

In theory, it would be nice if all software optimizations could be supported at run-time, but that adds at least one branch to the fast path for every optimization, eventually slowing down the fast path significantly. And some of the optimizations just make much more sense at compile time than at runtime, e.g. the "IOVA VA only mode".

So, I think we should start thinking about such optimizations differently: If someone needs to optimize something for a specific use case, it can be done at compile time; there is no need to do it at runtime. Which is what I meant by the subject of my email: Don't offer optimizations as runtime features; they are use case specific, and should be chosen at compile time only.
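
To illustrate the difference, here is a minimal sketch in C. All names in it (APP_FAST_FREE, struct pkt, the free_*() functions) are hypothetical and made up for this example; they are not DPDK APIs or build options. The runtime variant pays for a branch on every call, while the compile time variant leaves only the selected path in the binary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical build-time switch, not a real DPDK option.
 * Build with -DAPP_FAST_FREE=1 to select the optimized path. */
#ifndef APP_FAST_FREE
#define APP_FAST_FREE 0
#endif

/* Hypothetical packet buffer with a reference count. */
struct pkt {
	uint32_t refcnt;
};

/* Generic path: honours the reference count.
 * Returns 1 if the buffer was released, 0 if it is still referenced. */
static inline int free_generic(struct pkt *p)
{
	if (p->refcnt > 1) {
		p->refcnt--;
		return 0;
	}
	p->refcnt = 0;
	return 1;
}

/* Optimized path: assumes the caller guarantees refcnt == 1,
 * so the check is skipped. This is the "other limitation" that
 * typically comes with such an optimization. */
static inline int free_fast(struct pkt *p)
{
	p->refcnt = 0;
	return 1;
}

/* Runtime feature: the flag is tested on every call,
 * i.e. one extra branch in the fast path. */
static inline int free_runtime(struct pkt *p, bool fast_free_enabled)
{
	return fast_free_enabled ? free_fast(p) : free_generic(p);
}

/* Compile time option: the preprocessor picks one path,
 * so no flag is tested at runtime. */
static inline int free_buildtime(struct pkt *p)
{
#if APP_FAST_FREE
	return free_fast(p);
#else
	return free_generic(p);
#endif
}
```

With the default APP_FAST_FREE=0, free_buildtime() behaves exactly like free_generic(); an application that can guarantee unshared buffers rebuilds with the option set and gets the shorter path without any change to its own code.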

Referring to the Linux kernel as the gold standard, it even has "make menuconfig": a menu-driven interface for compile time configuration. Why must DPDK have every exotic option available at runtime, when the Linux kernel considers it perfectly acceptable to have some things configurable at compile time only?

With this discussion, I am only asking for software optimizations (which usually also imply some other limitations) to be compile time options, rather than runtime options. Any application can achieve exactly the same without those optimizations enabled, but it will be faster with the optimizations enabled.

I would love to go back to the good old days, where DPDK had a lot of compile time options to disable cruft we're not using, but I know that game was lost a long time ago! So I'm trying to find some middle ground that keeps all features in the "DPDK library for distros", but also allows hard core developers to tune the performance for their individual use cases.

Offering software optimizations as compile time options only should also reduce the amount of push-back for such software optimizations.

Reading all the feedback from the thread, it seems that the major concern is testing. And for some mysterious reason, compile-testing 2^N feature combinations causes more concern than run-time testing the same 2^N combinations. I get the sense that run-time testing the various feature combinations is not happening today. :-(
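
For reference, the 2^N figure is simply the number of on/off combinations of N independent options, whether they are toggled at compile time or at runtime. A small sketch in C that enumerates them; the option labels are only borrowed from this thread for illustration and are not real build or runtime flags:

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative labels only; not actual DPDK option names. */
static const char *const opt_names[] = {
	"fast_free", "direct_rearm", "iova_va_only"
};
#define N_OPTS (sizeof(opt_names) / sizeof(opt_names[0]))

/* Number of on/off combinations of n independent options: 2^n.
 * Valid for n < 64. */
static inline uint64_t combo_count(unsigned int n)
{
	return UINT64_C(1) << n;
}

/* Print one line per combination, treating each bit of the mask
 * as one option. Returns the number of combinations printed. */
static uint64_t list_combos(void)
{
	uint64_t total = combo_count((unsigned int)N_OPTS);
	uint64_t mask;
	size_t i;

	for (mask = 0; mask < total; mask++) {
		for (i = 0; i < N_OPTS; i++)
			printf("%s=%d ", opt_names[i], (int)((mask >> i) & 1));
		printf("\n");
	}
	return total;
}
```

The test matrix has the same size either way; the only difference is whether each row needs a rebuild or a different set of runtime parameters.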
