From: Bruce Richardson <bruce.richardson@intel.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
Date: Fri, 1 Aug 2014 14:08:22 -0700 [thread overview]
Message-ID: <20140801210821.GF28495@localhost.localdomain> (raw)
In-Reply-To: <20140801204352.GF31979@hmsreliant.think-freely.org>
On Fri, Aug 01, 2014 at 04:43:52PM -0400, Neil Horman wrote:
> On Fri, Aug 01, 2014 at 12:22:22PM -0700, Bruce Richardson wrote:
> > On Fri, Aug 01, 2014 at 11:06:29AM -0400, Neil Horman wrote:
> > > On Thu, Jul 31, 2014 at 01:25:06PM -0700, Bruce Richardson wrote:
> > > > On Thu, Jul 31, 2014 at 04:10:18PM -0400, Neil Horman wrote:
> > > > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote:
> > > > > > On Thu, Jul 31, 2014 at 02:10:32PM -0400, Neil Horman wrote:
> > > > > > > On Thu, Jul 31, 2014 at 10:32:28AM -0400, Neil Horman wrote:
> > > > > > > > On Thu, Jul 31, 2014 at 03:26:45PM +0200, Thomas Monjalon wrote:
> > > > > > > > > 2014-07-31 09:13, Neil Horman:
> > > > > > > > > > On Wed, Jul 30, 2014 at 02:09:20PM -0700, Bruce Richardson wrote:
> > > > > > > > > > > On Wed, Jul 30, 2014 at 03:28:44PM -0400, Neil Horman wrote:
> > > > > > > > > > > > On Wed, Jul 30, 2014 at 11:59:03AM -0700, Bruce Richardson wrote:
> > > > > > > > > > > > > On Tue, Jul 29, 2014 at 04:24:24PM -0400, Neil Horman wrote:
> > > > > > > > > > > > > > Hey all-
> > > > > >
> > > > > > With regards to the general approach for runtime detection of software
> > > > > > functions, I wonder if something like this can be handled by the
> > > > > > packaging system? Is it possible to ship out a set of shared libs
> > > > > > compiled up for different instruction sets, and then at rpm install
> > > > > > time, symlink the appropriate library? This would push the whole issue
> > > > > > of detection of code paths outside of code, work across all our
> > > > > > libraries and ensure each user got the best performance they could get
> > > > > > from a binary?
> > > > > > Has something like this been done before? The building of all the
> > > > > > libraries could be scripted easy enough, just do multiple builds using
> > > > > > different EXTRA_CFLAGS each time, and move and rename the .so's after
> > > > > > each run.
> > > > > >
> > > > >
> > > > > Sorry, I missed this in my last reply.
> > > > >
> > > > > In answer to your question, the short version is that such a thing is roughly
> > > > > possible from a packaging standpoint, but completely unworkable from a
> > > > > distribution standpoint. We could certainly build the dpdk multiple times and
> > > > > rename all the shared objects to some variant name representative of the
> > > > > optimizations we build in for certain cpu flags, but then we would be shipping X
> > > > > versions of the dpdk, and any application (say OVS) that made use of the dpdk
> > > > > would need to provide a version linked against each variant to be useful when
> > > > > making a product, and each end user would need to manually select (or run a
> > > > > script to select) which variant is most optimized for the system at hand. It's
> > > > > just not a reasonable way to package a library.
> > > >
> > > > Sorry, perhaps I was not clear, having the user have to select the
> > > > appropriate library was not what I was suggesting. Instead, I was
> > > > suggesting that the rpm install "librte_pmd_ixgbe.so.generic",
> > > > "librte_pmd_ixgbe.so.sse42" and "librte_pmd_ixgbe.so.avx". Then the rpm
> > > > post-install script would look at the cpuflags in cpuinfo and then
> > > > symlink librte_pmd_ixgbe.so to the best-match version. That way the user
> > > > only has to link against "librte_pmd_ixgbe.so" and depending on the
> > > > system it's run on, the loader will automatically resolve the symbols
> > > > from the appropriate instruction-set specific .so file.
> > > >
> > >
> > > This is an absolute packaging nightmare, it will potentially break all sorts of
> > > corner cases, and support processes. To cite a few examples:
> > >
> > > 1) Upgrade support - What if the minimum cpu requirements for dpdk are advanced
> > > at some point in the future? The above strategy has no way to know that a given
> > > update has more advanced requirements than a previous update, and when the
> > > update is installed, the previously linked library for the old base will
> > > disappear, leaving broken applications behind.
> >
> > Firstly, I didn't know we could actually specify minimum cpu
> > requirements for packaging, that is something that could be useful :-)
> You misread my comment :). I didn't say we could specify minimum cpu
> requirements at packaging (you can't, beyond general arch), I said "what if the
> dpdk's cpu requirements were raised?". Completely different thing. Currently
> the default, lowest common denominator system that dpdk appears to build for is
> core2 (as listed in the old default config). What if at some point you raise
> those requirements and decide that SSE4.2 really is required to achieve maximum
> performance. Using the above strategy any system that doesn't meet the new
> requirements will silently break on such an update. That's not acceptable.
Core2 (the Core microarchitecture) was the first Intel chip family to have
the x86_64 instruction set, so that's why it's listed as the
minimum - it's the same thing as generic x86_64 support. :-)
>
> > Secondly, what is the normal case for handling something like this,
> > where an upgrade has enhanced requirements compared to the previous
> > version? Presumably you either need to prevent the upgrade from
> > happening or else accept a broken app. Can the same mechanism not also
> > be used to prevent upgrades using a multi-lib scheme?
> >
> The case for handling something like this is: Don't do it. When you package
> something for Fedora (or any distro), you provide an implicit guarantee that it
> will run (or fail gracefully) on all supported systems. You can add support for
> systems as you go forward, but you can't deprecate support for systems within a
> major release. That is to say, if something runs on F20 now, it's got to keep
> running on F20 for the lifetime of F20. If it stops running, that's a
> regression, the user opens a bug and you fix it.
>
> The DPDK is way off the reservation in regards to this. Application packages,
> as a general rule don't build with specific cpu features in mind, because
> performance, while important, isn't on the same scale as what you're trying to do
> in the dpdk. A process getting scheduled off the cpu while we handle an
> interrupt wipes out any speedup gains made by any micro-optimizations, so there's
> no point in doing so. The DPDK is different, I understand that, but the
> drawback is that it (the DPDK) needs to make optimizations that really aren't
> considered particularly important to the rest of user space. I'm trying to
> opportunistically make the DPDK as fast as possible, but I need to do it in a
> single binary, that works on a lowest common denominator system.
>
> > >
> > > 2) Debugging - It's going to be near impossible to support an application built
> > > with a package put together this way, because you'll never be sure as to which
> > > version of the library was running when the crash occurred. You can figure it
> > > out for certain, but for support/development people to need to remember to
> > > figure this out is going to be a major turn off for them, and the result will be
> > > that they simply won't use the dpdk. It's anathema to the expectations of linux
> > > user space.
> >
> > Sorry, I just don't see this as being any harder to support than
> > multiple code paths for the same functionality. In fact, it will surely make
> > debugging easier, since you only have the one code path, just compiled
> > up in different ways.
> >
>
> Well, then by all means, become a Fedora packager, and you can take over the DPDK
> maintenance there :). Until then, you'll just have to trust me. If you have
> multiple optional code paths (especially if they're limited to isolated features)
> it's manageable. But regardless of how you look at it, building the same
> source multiple times with different cpu support means completely different
> binaries. The assembly and optimization are just plain different. They may be
> close, but they're not the same, and they need to be QA-ed independently. With
> a single build and optional code paths, all the common code is executed no
> matter what system you're running on, and it's always the same. Multiple builds
> with different instruction support means that code that is identical at a source
> level may well be significantly different at a binary level, and that's not
> something I can sanely manage in a general purpose environment.
>
> >
> > > 3) QA - Building multiple versions of a library means needing to QA multiple
> > > versions of a library. If you have to have 4 builds to support different levels
> > > of optimization, you've created a 4x increase in the amount of testing you need
> > > to do to ensure consistent behavior. You need to be aware of how many different
> > > builds are available in the single rpm at all times, and find systems on which
> > > to QA which will ensure that all of the builds get tested (as they are in fact,
> > > unique builds). While you may not hit all code paths in a single build, you
> > > will at least test all the common paths.
> >
> > Again, the exact same QA conditions will also apply to an approach using
> > multiple code paths bundled into the same library. Given a choice
> > between one code path with multiple compiles, vs multiple code paths
> > each compiled only once, the multiple code paths option leaves far
> > greater scope for bugs, and when bugs do occur means that you always
> > have to find out what specific hardware it was being run on. Using the
> > exact same code multiply compiled, the vast, vast majority of bugs are
> > going to occur across all platforms and systems so you should rarely
> > need to ask what the specific platform being used is.
> >
>
> No, they won't (see above). Enabling instructions will enable the compiler to
> emit and optimize common paths differently, so identical source code will lead
> to different binary code. I need to have a single binary so that I know what
> I'm working with when someone opens a bug. I don't have that using a multiple
> binary approach. At least with multiple runtime paths (especially/specifically
> with the run time paths we've been discussing, the ixgbe rx vector path and the
> acl library, which are isolated), I know that, if I get a bug report and the
> backtrace ends in either location, I know I'm specifically dealing with that
> code. With your multiple binary approach, if I get a crash in, say
> rte_eal_init, I need to figure out if this crash happened in the sse3 compiled
> binary, the sse4.2 compiled binary, the avx binary, the avx512 binary, or the
> core2 binary. You can say that's easy, but it's easy to say that when you're not
> the one that has to support it.
>
> > >
> > > The bottom line is that Distribution packaging is all about consistency and
> > > commonality. If you install something for an arch on multiple systems, it's the
> > > same thing on each system, and it works in the same way, all the time. This
> > > strategy breaks that. That's why we do run time checks for things.
> >
> > If you want to have the best tuned code running for each instruction
> > set, then commonality and consistency goes out the window anyway,
> So, this is perhaps where communication is breaking down. I don't want to have the
> best tuned code running for each instruction set. What I want is for the dpdk
> to run on a lowest common denominator platform, and be able to opportunistically
> take advantage of accelerated code paths that require advanced cpu features.
>
>
> Let's take the ixgbe code as an example. Note I didn't add any code paths there,
> at all (in fact I didn't add any anywhere). The ixgbe rx_burst method gets set
> according to compile time configuration. You can pick the bulk_alloc rx method,
> or the vectorized rx method at compile time (or some others I think, but that's
> not relevant). As it happened the vectorized rx path option had an implicit
> dependency on SSE4.2. Instead of requiring that all cpus that run the dpdk have
> SSE4.2, I instead chose to move that compile time decision to a run time
> decision, by building only the vectorized path with sse4.2 and only using it if
> we see that the cpu supports sse4.2 at run time. No new paths created, no new
> support requirements, you're still supporting the same options upstream, the
> only difference is I was able to include them both in a single binary. That's
> better for our end users because the single binary still works everywhere.
> That's better for our QA group because for whatever set of tests they perform,
> they only need an sse4.2 enabled system to test the one isolated path for that
> vector rx code. The rest of their tests can be conducted once, on any system,
> because the binary is exactly the same. If we compile multiple binaries,
> testing on one system doesn't mean we've tested all the code.
>
> > because two different machines calling the same function are going to
> > execute different sets of instructions. The decision then becomes:
> But that's not at all what I wanted. I want two different machines calling the
> same function to execute the same instructions 99.9999% of the time. The only
> time I want to diverge from that is in isolated paths where we can take
> advantage of a feature that we otherwise could not (i.e. the ixgbe and acl
> code). I look at it like the alternatives code in linux. There are these
> isolated areas where you have limited bits of code that at run time are
> re-written to use available cpu features. 99.9% of the code is identical, but
> in these little spots it's ok to diverge from similarity because they're isolated
> and easily identifiable.
>
> > a) whether you need multiple sets of instructions - if no then you pay
> > with lack of performance
> > b) how you get those multiple sets of instructions
> > c) how you validate those multiple sets of instructions.
> >
> > As is clear by now :-), my preference by far is to have multiple sets of
> > instructions come from a single code base, as less code means less
> > maintenance, and above all, fewer bugs. If that can't be done, then we
> > need to look carefully at each code path being added and do a
> > cost-benefit analysis on it.
> >
>
> Yes, it's quite clear :), I think it's equally clear that I need a single binary,
> and would like to opportunistically enhance it where possible without losing the
> fact that it's a single binary.
>
> I suppose it's all somewhat moot at this point though. The reduction to sse3 for
> ixgbe seems agreeable to everyone, and it lets me preserve single binary builds
> there. I'm currently working on the ACL library; as you noted that's a tougher
> nut to crack. I think I'll have it done early next week (though I'm sure my
> translation of the instruction set reference to C will need some thorough testing
> :)). I'll post it when it's ready.
>
Agreed, let's get everything working to a common baseline anyway. In
terms of the number of RX and TX functions you mentioned, I'd hope in future
we could cut the number of them down a bit as we make the vector versions more
generally applicable, but that's a whole discussion for another day.
/Bruce