Date: Fri, 1 Aug 2014 16:43:52 -0400
From: Neil Horman
To: Bruce Richardson
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
Message-ID: <20140801204352.GF31979@hmsreliant.think-freely.org>
In-Reply-To: <20140801192221.GE28495@localhost.localdomain>

On Fri, Aug 01, 2014 at 12:22:22PM -0700, Bruce Richardson wrote:
> On Fri, Aug 01, 2014 at 11:06:29AM -0400, Neil Horman wrote:
> > On Thu, Jul 31, 2014 at 01:25:06PM -0700, Bruce Richardson wrote:
> > > On Thu, Jul 31, 2014 at 04:10:18PM -0400, Neil Horman wrote:
> > > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote:
> > > > > On Thu, Jul 31, 2014 at 02:10:32PM -0400, Neil Horman wrote:
> > > > > > On Thu, Jul 31, 2014 at 10:32:28AM -0400, Neil Horman wrote:
> > > > > > > On Thu, Jul 31, 2014 at 03:26:45PM +0200, Thomas Monjalon wrote:
> > > > > > > > 2014-07-31 09:13, Neil Horman:
> > > > > > > > > On Wed, Jul 30, 2014 at 02:09:20PM -0700, Bruce Richardson wrote:
> > > > > > > > > > On Wed, Jul 30, 2014 at 03:28:44PM -0400, Neil Horman wrote:
> > > > > > > > > > > On Wed, Jul 30, 2014 at 11:59:03AM -0700, Bruce Richardson wrote:
> > > > > > > > > > > > On Tue, Jul 29, 2014 at 04:24:24PM -0400, Neil Horman wrote:
> > > > > > > > > > > > > Hey all-
> > > > > With regards to the general approach for runtime detection of software
> > > > > functions, I wonder if something like this can be handled by the
> > > > > packaging system? Is it possible to ship out a set of shared libs
> > > > > compiled up for different instruction sets, and then at rpm install
> > > > > time, symlink the appropriate library? This would push the whole issue
> > > > > of detection of code paths outside of code, work across all our
> > > > > libraries and ensure each user got the best performance they could get
> > > > > from a binary?
> > > > > Has something like this been done before? The building of all the
> > > > > libraries could be scripted easily enough, just do multiple builds using
> > > > > different EXTRA_CFLAGS each time, and move and rename the .so's after
> > > > > each run.
> > > > >
> > > > Sorry, I missed this in my last reply.
> > > >
> > > > In answer to your question, the short version is that such a thing is roughly
> > > > possible from a packaging standpoint, but completely unworkable from a
> > > > distribution standpoint. We could certainly build the dpdk multiple times and
> > > > rename all the shared objects to some variant name representative of the
> > > > optimizations we build in for certain cpu flags, but then we would be shipping X
> > > > versions of the dpdk, and any application (say OVS) that made use of the dpdk
> > > > would need to provide a version linked against each variant to be useful when
> > > > making a product, and each end user would need to manually select (or run a
> > > > script to select) which variant is most optimized for the system at hand. It's
> > > > just not a reasonable way to package a library.
> > >
> > > Sorry, perhaps I was not clear, having the user have to select the
> > > appropriate library was not what I was suggesting. Instead, I was
> > > suggesting that the rpm install "librte_pmd_ixgbe.so.generic",
> > > "librte_pmd_ixgbe.so.sse42" and "librte_pmd_ixgbe.so.avx". Then the rpm
> > > post-install script would look at the cpuflags in cpuinfo and then
> > > symlink librte_pmd_ixgbe.so to the best-match version. That way the user
> > > only has to link against "librte_pmd_ixgbe.so" and, depending on the
> > > system it's run on, the loader will automatically resolve the symbols
> > > from the appropriate instruction-set specific .so file.
> > >
> >
> > This is an absolute packaging nightmare, it will potentially break all sorts of
> > corner cases and support processes. To cite a few examples:
> >
> > 1) Upgrade support - What if the minimum cpu requirements for dpdk are advanced
> > at some point in the future? The above strategy has no way to know that a given
> > update has more advanced requirements than a previous update, and when the
> > update is installed, the previously linked library for the old base will
> > disappear, leaving broken applications behind.
>
> Firstly, I didn't know we could actually specify minimum cpu
> requirements for packaging, that is something that could be useful :-)

You misread my comment :). I didn't say we could specify minimum cpu
requirements at packaging time (you can't, beyond the general arch), I said
"what if the dpdk's cpu requirements were raised?". Completely different
thing. Currently the default, lowest-common-denominator system that dpdk
appears to build for is core2 (as listed in the old default config). What if
at some point you raise those requirements and decide that SSE4.2 really is
required to achieve maximum performance? Using the above strategy, any system
that doesn't meet the new requirements will silently break on such an update.
That's not acceptable.

> Secondly, what is the normal case for handling something like this,
> where an upgrade has enhanced requirements compared to the previous
> version? Presumably you either need to prevent the upgrade from
> happening or else accept a broken app. Can the same mechanism not also
> be used to prevent upgrades using a multi-lib scheme?
>

The way you handle something like this is: don't do it. When you package
something for Fedora (or any distro), you provide an implicit guarantee that
it will run (or fail gracefully) on all supported systems. You can add
support for systems as you go forward, but you can't deprecate support for
systems within a major release. That is to say, if something runs on F20 now,
it's got to keep running on F20 for the lifetime of F20.
If it stops running, that's a regression: the user opens a bug and you fix
it. The DPDK is way off the reservation in this regard. Application packages,
as a general rule, aren't built with specific cpu features in mind, because
performance, while important, isn't on the same scale as what you're trying
to do in the dpdk. A process getting scheduled off the cpu while we handle an
interrupt wipes out any speedup gained from micro-optimizations, so there's
no point in doing so. The DPDK is different, I understand that, but the
drawback is that it (the DPDK) needs to make optimizations that really aren't
considered particularly important to the rest of user space. I'm trying to
opportunistically make the DPDK as fast as possible, but I need to do it in a
single binary that works on a lowest-common-denominator system.

> > 2) Debugging - It's going to be near impossible to support an application built
> > with a package put together this way, because you'll never be sure as to which
> > version of the library was running when the crash occurred. You can figure it
> > out for certain, but for support/development people to need to remember to
> > figure this out is going to be a major turn-off for them, and the result will be
> > that they simply won't use the dpdk. It's anathema to the expectations of linux
> > user space.
>
> Sorry, I just don't see this as being any harder to support than
> multiple code paths for the same functionality. In fact, it will surely make
> debugging easier, since you only have the one code path, just compiled
> up in different ways.
>

Well, then by all means, become a Fedora packager and you can take over the
DPDK maintenance there :). Until then, you'll just have to trust me. If you
have multiple optional code paths (especially if they're limited to isolated
features), it's manageable. But regardless of how you look at it, building
the same source multiple times with different cpu support means completely
different binaries. The assembly and optimization are just plain different.
They may be close, but they're not the same, and they need to be QA-ed
independently. With a single build and optional code paths, all the common
code is executed no matter what system you're running on, and it's always the
same. Multiple builds with different instruction support mean that code which
is identical at the source level may well be significantly different at the
binary level, and that's not something I can sanely manage in a
general-purpose environment.

> > 3) QA - Building multiple versions of a library means needing to QA multiple
> > versions of a library. If you have to have 4 builds to support different levels
> > of optimization, you've created a 4x increase in the amount of testing you need
> > to do to ensure consistent behavior. You need to be aware of how many different
> > builds are available in the single rpm at all times, and find systems on which
> > to QA which will ensure that all of the builds get tested (as they are, in fact,
> > unique builds). While you may not hit all code paths in a single build, you
> > will at least test all the common paths.
>
> Again, the exact same QA conditions will also apply to an approach using
> multiple code paths bundled into the same library. Given a choice
> between one code path with multiple compiles, vs multiple code paths
> each compiled only once, the multiple code paths option leaves far
> greater scope for bugs, and when bugs do occur means that you always
> have to find out what specific hardware it was being run on.
> Using the
> exact same code multiply compiled, the vast, vast majority of bugs are
> going to occur across all platforms and systems so you should rarely
> need to ask what the specific platform being used is.
>

No, they won't (see above). Enabling new instructions lets the compiler emit
and optimize the common paths differently, so identical source code will lead
to different binary code. I need to have a single binary so that I know what
I'm working with when someone opens a bug. I don't have that with a
multiple-binary approach. At least with multiple runtime paths
(especially/specifically with the run-time paths we've been discussing, the
ixgbe rx vector path and the acl library, which are isolated), I know that if
I get a bug report and the backtrace ends in either location, I'm
specifically dealing with that code. With your multiple-binary approach, if I
get a crash in, say, rte_eal_init, I need to figure out if the crash happened
in the sse3 compiled binary, the sse4.2 compiled binary, the avx binary, the
avx512 binary, or the core2 binary. You can say that's easy, but it's easy to
say that when you're not the one who has to support it.

> > The bottom line is that distribution packaging is all about consistency and
> > commonality. If you install something for an arch on multiple systems, it's the
> > same thing on each system, and it works in the same way, all the time. This
> > strategy breaks that. That's why we do run-time checks for things.
>
> If you want to have the best tuned code running for each instruction
> set, then commonality and consistency go out the window anyway,

So, this is perhaps where communication is breaking down. I don't want to
have the best tuned code running for each instruction set. What I want is for
the dpdk to run on a lowest-common-denominator platform, and to be able to
opportunistically take advantage of accelerated code paths that require
advanced cpu features.

Let's take the ixgbe code as an example. Note I didn't add any code paths
there at all (in fact I didn't add any anywhere). The ixgbe rx_burst method
gets set according to compile-time configuration. You can pick the bulk_alloc
rx method or the vectorized rx method at compile time (or some others I
think, but that's not relevant). As it happened, the vectorized rx path
option had an implicit dependency on SSE4.2. Instead of requiring that all
cpus that run the dpdk have SSE4.2, I instead chose to move that compile-time
decision to a run-time decision, by building only the vectorized path with
sse4.2 and only using it if we see that the cpu supports sse4.2 at run time
(a rough sketch of what that kind of run-time selection looks like is
appended at the bottom of this mail). No new paths created, no new support
requirements, you're still supporting the same options upstream; the only
difference is I was able to include them both in a single binary. That's
better for our end users because the single binary still works everywhere.
That's better for our QA group because, for whatever set of tests they
perform, they only need an sse4.2-enabled system to test the one isolated
path for that vector rx code. The rest of their tests can be conducted once,
on any system, because the binary is exactly the same. If we compile multiple
binaries, testing on one system doesn't mean we've tested all the code.

> because two different machines calling the same function are going to
> execute different sets of instructions. The decision then becomes:

But that's not at all what I wanted. I want two different machines calling
the same function to execute the same instructions 99.9999% of the time.
The only time I want to diverge from that is in isolated paths where we can
take advantage of a feature that we otherwise could not (i.e. the ixgbe and
acl code). I look at it like the alternatives code in linux: there are these
isolated areas where you have limited bits of code that at run time are
re-written to use available cpu features. 99.9% of the code is identical, but
in these little spots it's ok to diverge from similarity because they're
isolated and easily identifiable.

> a) whether you need multiple sets of instructions - if no then you pay
> with lack of performance
> b) how you get those multiple sets of instructions
> c) how you validate those multiple sets of instructions.
>
> As is clear by now :-), my preference by far is to have multiple sets of
> instructions come from a single code base, as less code means less
> maintenance, and above all, fewer bugs. If that can't be done, then we
> need to look carefully at each code path being added and do a
> cost-benefit analysis on it.
>

Yes, it's quite clear :). I think it's equally clear that I need a single
binary, and I would like to opportunistically enhance it where possible
without losing the fact that it's a single binary. I suppose it's all
somewhat moot at this point, though. The reduction to sse3 for ixgbe seems
agreeable to everyone, and it lets me preserve single-binary builds there.
I'm currently working on the ACL library; as you noted, that's a tougher nut
to crack. I think I'll have it done early next week (though I'm sure my
translation of the instruction set reference to C will need some thorough
testing :)). I'll post it when it's ready.

> Regards,
> /Bruce
>
> >
> > Neil
> >
> > > >
> > > > When packaging software, the only consideration given to code variance at package
> > > > time is architecture (x86/x86_64/ppc/s390/etc). If you install a package for
> > > > a given architecture, it's expected to run on that architecture. Optional
> > > > code paths are just that, optional, and executed based on run-time tests. It's a
> > > > requirement that we build for the lowest common denominator system that is
> > > > supported, and enable accelerated code paths optionally at run time when the
> > > > cpu indicates support for them.
> > > >
> > > > Neil
> > > >
> > > >
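
P.S. Since it keeps coming up, here is roughly the shape of the run-time
selection I'm describing, reduced to a toy example. To be clear, this is only
an illustrative sketch, not the actual ixgbe patch: the function and type
names are made up, and the cpu check uses a gcc builtin purely to keep the
example self-contained (inside the tree the EAL cpu-flag helpers would be the
natural fit).

/*
 * Toy sketch of the run-time dispatch described above.  Build everything
 * for the baseline cpu, compile only the isolated fast path with -msse4.2
 * (in its own object file), and pick the implementation once at init time.
 * All names here are invented for illustration.
 */
#include <stdint.h>
#include <stdio.h>

struct rxq;     /* stand-in for the real rx queue structure */
typedef uint16_t (*rx_burst_fn)(struct rxq *q, void **pkts, uint16_t n);

/* baseline path, built with lowest-common-denominator flags */
static uint16_t rx_burst_generic(struct rxq *q, void **pkts, uint16_t n)
{
        (void)q; (void)pkts;
        return n;       /* pretend we filled n packet buffers */
}

/* isolated fast path; only this object would be built with -msse4.2 */
static uint16_t rx_burst_vec_sse42(struct rxq *q, void **pkts, uint16_t n)
{
        (void)q; (void)pkts;
        return n;
}

/* selected once at init, so the hot path is a plain indirect call */
static rx_burst_fn rx_burst = rx_burst_generic;

static void select_rx_burst(void)
{
        /* gcc builtin used to keep the sketch standalone; in the dpdk the
         * EAL cpu-flag helpers would do the same job */
        __builtin_cpu_init();
        if (__builtin_cpu_supports("sse4.2"))
                rx_burst = rx_burst_vec_sse42;
}

int main(void)
{
        select_rx_burst();
        printf("using %s rx path\n",
               rx_burst == rx_burst_vec_sse42 ? "sse4.2 vector" : "generic");
        return 0;
}

The common code stays identical on every system; only the one isolated entry
point changes depending on what the cpu reports at run time.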