DPDK patches and discussions
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Richardson, Bruce" <bruce.richardson@intel.com>,
	Neil Horman <nhorman@tuxdriver.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
Date: Thu, 31 Jul 2014 11:36:34 +0000
Message-ID: <2601191342CEEE43887BDE71AB97725821345B53@IRSMSX105.ger.corp.intel.com>
In-Reply-To: <20140730210920.GB6420@localhost.localdomain>


Hi Bruce,

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Wednesday, July 30, 2014 10:09 PM
> To: Neil Horman
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
> 
> On Wed, Jul 30, 2014 at 03:28:44PM -0400, Neil Horman wrote:
> > On Wed, Jul 30, 2014 at 11:59:03AM -0700, Bruce Richardson wrote:
> > > On Tue, Jul 29, 2014 at 04:24:24PM -0400, Neil Horman wrote:
> > > > Hey all-
> > > >         I've been trying to update the fedora dpdk package to support VFIO
> > > > enabled drivers and ran into a problem in which ixgbe didn't compile because the
> > > > rxtx_vec code uses sse4.2 instruction intrinsics, which aren't supported in the
> > > > default config I have.  I tried to remedy this by replacing the intrinsics with
> > > > the __builtin macros, but it was pointed out (correctly), that this doesn't work
> > > > properly.  So this is my second attempt, which I actually like a bit better.  I
> > > > noted that code that uses intrinsics (ixgbe and the acl library), don't need to
> > > > have those instructions turned on build-wide.  Rather, we can just enable the
> > > > instructions in the specific code we want to build with support for that, and
> > > > test for instruction support dynamically at run time.  This allows me to build
> > > > the dpdk for a generic platform, but in such a way that some optimizations can
> > > > be used if the executing cpu supports them at run time.
> > > >
> > > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > > > CC: Thomas Monjalon <thomas.monjalon@6wind.com>
> > > >
> > > I'd prefer if a solution could be found based off your original patch
> > > set, as it gives us more chance to deprecate the older code paths in
> > > future. Looking at the Intel Intrinsics Guide site online, it shows that
> > > the _mm_shuffle_epi8 intrinsic came in with SSSE3, rather than SSE4.x,
> > > and so should be available on all 64-bit systems, I believe. The
> > > popcount intrinsic is newer, but it's a much more basic instruction so
> > > hopefully the __builtin should work for that.
> > >
> > Yes, but as I look at it, that's somewhat counter to my goal, which is to offer
> > accelerated code paths on systems that can make use of them at run time.  If we
> > use the __builtin compiler functions, we will either:
> >
> > 1) Build those code paths with advanced instructions that won't work on older
> > systems (i.e. crash)
> >
> > 2) Build those code paths with less advanced instructions, meaning that we won't
> > speedup execution on systems that are capable of using the more advanced
> > instructions.
> >
> > Using this run time check, we can, at least in these situations, make use of the
> > accelerated paths when the instructions are available, and ignore them when
> > they're not, at run time.
> >
> > What would be ideal, would be an alternative type macro, like the linux kernel
> > employs, but implementing that would require some pretty significant work and
> > testing.  This seems like a much simpler approach.
> >
> 
> Ok, I understand where you are coming from indeed. However, within that,
> I'd like to see us reduce the amount of code that's needed for
> maintenance.
> 
> What we should really aim for, is to have common code, with perhaps some
> small ifdefs or __builtins, and then compile that code multiple times
> for multiple different architectures. So in this case, it would be nice
> to use the __builtin, and then compile that code up with and without SSE
> and select at runtime the code path to be used. Ideally, this could be
> done at the driver level.
> 
> However, once you get down this path, you are dealing with more than
> just SSE. If I compile up the PMD on my system, which has a chip based
> on Sandy Bridge uarch, I find that there are multiple instructions
> starting with "vp" which means that they are actually AVX instructions.
> Even though the code is written using intrinsics which correspond to SSE
> operations, the compiler is free to use AVX instructions where necessary
> to improve performance. Therefore, if we go down this road, we need to
> look to compile up the code for all microarchitectures, rather than just
> assuming that we will get equivalent performance to "native" by turning
> on the instruction set indicated by the primitives in the code. This is
> where having one codepath recompiled multiple times will work far better
> than having multiple code paths.

Using your example: as long as we specify '-mavx', the compiler can (and does) use AVX instructions
even for 'scalar' code (code without any SIMD intrinsics).
And yes, that probably affects performance.
So, as I understand your suggestion, we would then need to divide our code into:
- generic code, compiled to run on all supported platforms
- performance-critical code, recompiled for each supported platform.
The generic code would then have to decide at run time which version of the recompiled code to use.
And that would be needed for each PMD and for every other performance-critical DPDK library.
That looks like too much hassle to me.
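
For illustration only, the split described above could look roughly like the
sketch below. It assumes GCC or Clang on x86 and uses their
__builtin_cpu_supports() check and per-function target attribute; the
function names (count_bits, count_bits_generic, count_bits_popcnt,
select_count_bits) are made up for this example and are not DPDK APIs. On
non-x86 builds only the generic version is compiled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic version: built with the baseline flags, runs on every CPU. */
static uint32_t count_bits_generic(const uint64_t *v, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        for (uint64_t x = v[i]; x != 0; x &= x - 1) /* clear lowest set bit */
            total++;
    return total;
}

#if defined(__x86_64__) || defined(__i386__)
/* Same logic, but only this one function is compiled with POPCNT enabled. */
__attribute__((target("popcnt")))
static uint32_t count_bits_popcnt(const uint64_t *v, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += (uint32_t)__builtin_popcountll(v[i]);
    return total;
}
#endif

typedef uint32_t (*count_bits_fn)(const uint64_t *, size_t);

/* The "generic code decides at run time" step: pick a version for this CPU. */
static count_bits_fn select_count_bits(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("popcnt"))
        return count_bits_popcnt;
#endif
    return count_bits_generic;
}

/* Public entry point; the choice is made on the first call and then reused. */
uint32_t count_bits(const uint64_t *v, size_t n)
{
    static count_bits_fn impl;
    assert(v != NULL || n == 0);
    if (impl == NULL)
        impl = select_count_bits();
    return impl(v, n);
}
```

Callers only ever see count_bits(). The lazy initialisation here is not
synchronised, so a real library would more likely resolve the pointer once at
init time. Every performance-critical routine would need the same treatment,
which is the maintenance cost in question.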
After all, if someone needs a package with binaries optimised for different architectures,
they can provide multiple DPDK binaries (built for different architectures) and a small install script
that decides which binary is most appropriate for the given platform.
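
As a rough sketch of that install-time choice, written in C rather than as a
shell script, the selection could be as small as the code below. It again
assumes GCC or Clang on x86; the variant names ("avx2", "avx", "sse4.2",
"generic") and the "dpdk-" prefix are hypothetical and do not describe how any
real package is laid out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Return the name of the most capable build variant this CPU can run.
 * The variant names are placeholders chosen for this example. */
const char *pick_build_variant(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    /* Check the newest instruction set first and fall back step by step. */
    if (__builtin_cpu_supports("avx2"))
        return "avx2";
    if (__builtin_cpu_supports("avx"))
        return "avx";
    if (__builtin_cpu_supports("sse4.2"))
        return "sse4.2";
#endif
    return "generic";
}

/* Write a hypothetical binary name such as "dpdk-avx" into buf.
 * Returns the snprintf() result (the length the full name needs). */
int format_binary_name(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "dpdk-%s", pick_build_variant());
}
```

The check runs once at install time, so the installed binaries themselves
need no dispatch logic at all.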

Konstantin
   

  parent reply	other threads:[~2014-07-31 11:35 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-29 20:24 Neil Horman
2014-07-29 20:24 ` [dpdk-dev] [PATCH 1/2] ixgbe: test sse4.2 support at runtime for vectorized receive operations Neil Horman
2014-07-29 20:24 ` [dpdk-dev] [PATCH 2/2] acl: Preform dynamic sse4.2 support check Neil Horman
2014-07-30 12:07 ` [dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features Ananyev, Konstantin
2014-07-30 13:01   ` Neil Horman
2014-07-30 13:44     ` Ananyev, Konstantin
2014-07-30 14:49 ` [dpdk-dev] [PATCH v2 " Neil Horman
2014-07-30 14:49   ` [dpdk-dev] [PATCH v2 1/2] ixgbe: test sse4.2 support at runtime for vectorized receive operations Neil Horman
2014-07-30 14:49   ` [dpdk-dev] [PATCH v2 2/2] acl: Preform dynamic sse4.2 support check Neil Horman
2014-07-30 15:36   ` [dpdk-dev] [PATCH v2 0/2] dpdk: Allow for dynamic enablement of some isolated features Ananyev, Konstantin
2014-07-30 19:03   ` Venky Venkatesan
2014-07-30 19:17     ` Neil Horman
2014-07-30 19:34     ` Neil Horman
2014-07-30 18:59 ` [dpdk-dev] [PATCH " Bruce Richardson
2014-07-30 19:28   ` Neil Horman
2014-07-30 21:09     ` Bruce Richardson
2014-07-31  9:30       ` Thomas Monjalon
2014-07-31 11:36       ` Ananyev, Konstantin [this message]
2014-07-31 13:13       ` Neil Horman
2014-07-31 13:26         ` Thomas Monjalon
2014-07-31 14:32           ` Neil Horman
2014-07-31 18:10             ` Neil Horman
2014-07-31 18:36               ` Bruce Richardson
2014-07-31 19:01                 ` Neil Horman
2014-07-31 20:19                   ` Bruce Richardson
2014-08-01 13:36                     ` Neil Horman
2014-08-01 13:56                       ` Ananyev, Konstantin
2014-08-01 14:26                         ` Venkatesan, Venky
2014-08-01 14:27                         ` Neil Horman
2014-07-31 19:58                 ` John W. Linville
2014-07-31 20:20                   ` Bruce Richardson
2014-07-31 20:32                     ` John W. Linville
2014-08-01  8:46                       ` Vincent JARDIN
2014-08-01 14:06                         ` Neil Horman
2014-08-01 14:57                           ` Vincent JARDIN
2014-08-01 15:19                             ` Neil Horman
2014-07-31 20:10                 ` Neil Horman
2014-07-31 20:25                   ` Bruce Richardson
2014-08-01 15:06                     ` Neil Horman
2014-08-01 19:22                       ` Bruce Richardson
2014-08-01 20:43                         ` Neil Horman
2014-08-01 21:08                           ` Bruce Richardson
2014-08-02 12:56                             ` Neil Horman
2014-07-31 21:53               ` Thomas Monjalon
2014-07-31 21:25             ` Thomas Monjalon
