Date: Fri, 1 Aug 2014 12:22:22 -0700
From: Bruce Richardson
To: Neil Horman
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
Message-ID: <20140801192221.GE28495@localhost.localdomain>
In-Reply-To: <20140801150629.GD31979@hmsreliant.think-freely.org>

On Fri, Aug 01, 2014 at 11:06:29AM -0400, Neil Horman wrote:
> On Thu, Jul 31, 2014 at 01:25:06PM -0700, Bruce Richardson wrote:
> > On Thu, Jul 31, 2014 at 04:10:18PM -0400, Neil Horman wrote:
> > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote:
> > > > On Thu, Jul 31, 2014 at 02:10:32PM -0400, Neil Horman wrote:
> > > > > On Thu, Jul 31, 2014 at 10:32:28AM -0400, Neil Horman wrote:
> > > > > > On Thu, Jul 31, 2014 at 03:26:45PM +0200, Thomas Monjalon wrote:
> > > > > > > 2014-07-31 09:13, Neil Horman:
> > > > > > > > On Wed, Jul 30, 2014 at 02:09:20PM -0700, Bruce Richardson wrote:
> > > > > > > > > On Wed, Jul 30, 2014 at 03:28:44PM -0400, Neil Horman wrote:
> > > > > > > > > > On Wed, Jul 30, 2014 at 11:59:03AM -0700, Bruce Richardson wrote:
> > > > > > > > > > > On Tue, Jul 29, 2014 at 04:24:24PM -0400, Neil Horman wrote:
> > > > > > > > > > > > Hey all-
> > > >
> > > > With regards to the general approach for runtime detection of
> > > > software functions, I wonder if something like this can be
> > > > handled by the packaging system? Is it possible to ship out a set
> > > > of shared libs compiled up for different instruction sets, and
> > > > then at rpm install time symlink the appropriate library? This
> > > > would push the whole issue of detection of code paths outside of
> > > > the code, work across all our libraries, and ensure each user got
> > > > the best performance they could get from a binary. Has something
> > > > like this been done before? The building of all the libraries
> > > > could be scripted easily enough: just do multiple builds using
> > > > different EXTRA_CFLAGS each time, and move and rename the .so's
> > > > after each run.
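
To make the above concrete, the build loop I have in mind would be
something along these lines - just an untested sketch, assuming a
shared-library build, with the target name, variant names, flags and
output path all purely illustrative:

    #!/bin/sh
    # Build the libraries once per instruction-set variant and
    # rename the resulting .so files after each pass.
    T=x86_64-native-linuxapp-gcc
    for variant in generic:-mtune=generic sse42:-msse4.2 avx:-mavx; do
        name=${variant%%:*}     # e.g. "sse42"
        flags=${variant#*:}     # e.g. "-msse4.2"
        make clean
        make install T=$T EXTRA_CFLAGS="$flags"
        for lib in $T/lib/*.so; do
            mv "$lib" "$lib.$name"  # librte_foo.so -> librte_foo.so.sse42
        done
    done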
> > >
> > > Sorry, I missed this in my last reply.
> > >
> > > In answer to your question, the short version is that such a thing
> > > is roughly possible from a packaging standpoint, but completely
> > > unworkable from a distribution standpoint. We could certainly build
> > > the dpdk multiple times and rename all the shared objects to some
> > > variant name representative of the optimizations we built in for
> > > certain cpu flags, but then we would be shipping X versions of the
> > > dpdk, and any application (say OVS) that made use of the dpdk would
> > > need to provide a version linked against each variant to be useful
> > > when making a product, and each end user would need to manually
> > > select (or run a script to select) which variant is most optimized
> > > for the system at hand. It's just not a reasonable way to package a
> > > library.
> >
> > Sorry, perhaps I was not clear: having the user select the
> > appropriate library was not what I was suggesting. Instead, I was
> > suggesting that the rpm install "librte_pmd_ixgbe.so.generic",
> > "librte_pmd_ixgbe.so.sse42" and "librte_pmd_ixgbe.so.avx", and that
> > the rpm post-install script then look at the cpuflags in cpuinfo and
> > symlink librte_pmd_ixgbe.so to the best-match version. That way the
> > user only has to link against "librte_pmd_ixgbe.so" and, depending
> > on the system it's run on, the loader will automatically resolve the
> > symbols from the appropriate instruction-set-specific .so file.
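
The post-install selection then only needs a few lines of shell in the
%post scriptlet - again just a sketch, with the flag names taken from
/proc/cpuinfo and the install path and .so suffixes illustrative:

    # Symlink each plain .so name to the most advanced variant the
    # running cpu supports.
    best=generic
    grep -qw sse4_2 /proc/cpuinfo && best=sse42
    grep -qw avx /proc/cpuinfo && best=avx
    for lib in /usr/lib64/librte_*.so.generic; do
        base=${lib%.generic}    # e.g. /usr/lib64/librte_pmd_ixgbe.so
        ln -sf "$base.$best" "$base"
    done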
>
> This is an absolute packaging nightmare; it will potentially break
> all sorts of corner cases and support processes. To cite a few
> examples:
>
> 1) Upgrade support - What if the minimum cpu requirements for dpdk
> are advanced at some point in the future? The above strategy has no
> way to know that a given update has more advanced requirements than a
> previous update, and when the update is installed, the previously
> linked library for the old base will disappear, leaving broken
> applications behind.

Firstly, I didn't know we could actually specify minimum cpu
requirements for packaging - that is something that could be useful
:-) Secondly, what is the normal way of handling something like this,
where an upgrade has enhanced requirements compared to the previous
version? Presumably you either need to prevent the upgrade from
happening or else accept a broken app. Can the same mechanism not also
be used to prevent such upgrades under a multi-lib scheme?

> 2) Debugging - It's going to be near impossible to support an
> application built with a package put together this way, because
> you'll never be sure which version of the library was running when
> the crash occurred. You can figure it out for certain, but needing to
> remember to figure this out is going to be a major turn-off for
> support and development people, and the result will be that they
> simply won't use the dpdk. It's anathema to the expectations of linux
> user space.

Sorry, I just don't see this as being any harder to support than
multiple code paths for the same functionality. In fact, it will
surely make debugging easier, since you only have the one code path,
just compiled up in different ways.

> 3) QA - Building multiple versions of a library means needing to QA
> multiple versions of a library. If you have to have 4 builds to
> support different levels of optimization, you've created a 4x
> increase in the amount of testing you need to do to ensure consistent
> behavior. You need to be aware of how many different builds are
> available in the single rpm at all times, and find systems on which
> to QA which will ensure that all of the builds get tested (as they
> are, in fact, unique builds). While you may not hit all code paths in
> a single build, you will at least test all the common paths.

Again, the exact same QA conditions apply to an approach using
multiple code paths bundled into the same library. Given a choice
between one code path with multiple compiles and multiple code paths
each compiled only once, the multiple-code-paths option leaves far
greater scope for bugs, and when bugs do occur it means you always
have to find out what specific hardware was in use. With the exact
same code multiply compiled, the vast majority of bugs are going to
occur across all platforms and systems, so you should rarely need to
ask which specific platform is being used.

>
> The bottom line is that distribution packaging is all about
> consistency and commonality. If you install something for an arch on
> multiple systems, it's the same thing on each system, and it works in
> the same way, all the time. This strategy breaks that. That's why we
> do run time checks for things.

If you want to have the best-tuned code running for each instruction
set, then commonality and consistency go out the window anyway,
because two different machines calling the same function are going to
execute different sets of instructions. The decision then becomes:
a) whether you need multiple sets of instructions at all - if not,
   then you pay with a lack of performance
b) how you get those multiple sets of instructions
c) how you validate those multiple sets of instructions
As is clear by now :-), my preference by far is to have the multiple
sets of instructions come from a single code base, as less code means
less maintenance and, above all, fewer bugs. If that can't be done,
then we need to look carefully at each code path being added and do a
cost-benefit analysis on it.

Regards,
/Bruce

>
> Neil
>
> > > When packaging software, the only consideration given to code
> > > variance at package time is architecture (x86/x86_64/ppc/s390/etc).
> > > If you install a package for a given architecture, it's expected
> > > to run on that architecture. Optional code paths are just that,
> > > optional, and executed based on run time tests. It's a requirement
> > > that we build for the lowest common denominator system that is
> > > supported, and enable accelerated code paths optionally at run
> > > time when the cpu indicates support for them.
> > >
> > > Neil