DPDK patches and discussions
From: "EDMISON, Kelvin (Kelvin)" <kelvin.edmison@alcatel-lucent.com>
To: "Wang, Zhihong" <zhihong.wang@intel.com>,
	Stephen Hemminger <stephen@networkplumber.org>,
	Neil Horman <nhorman@tuxdriver.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH 0/4] DPDK memcpy optimization
Date: Wed, 28 Jan 2015 21:48:09 +0000
Message-ID: <D0EE79A7.42BCB%kelvin.edmison@alcatel-lucent.com>
In-Reply-To: <F60F360A2500CD45ACDB1D700268892D0E761378@SHSMSX101.ccr.corp.intel.com>


On 2015-01-27, 3:22 AM, "Wang, Zhihong" <zhihong.wang@intel.com> wrote:

>
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of EDMISON, Kelvin
>> (Kelvin)
>> Sent: Friday, January 23, 2015 2:22 AM
>> To: dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH 0/4] DPDK memcpy optimization
>> 
>> 
>> 
>> On 2015-01-21, 3:54 PM, "Neil Horman" <nhorman@tuxdriver.com> wrote:
>> 
>> >On Wed, Jan 21, 2015 at 11:49:47AM -0800, Stephen Hemminger wrote:
>> >> On Wed, 21 Jan 2015 13:26:20 +0000
>> >> Bruce Richardson <bruce.richardson@intel.com> wrote:
>> >>
[..trim...]
>> >> One issue I have is that as a vendor we need to ship one binary, not
>> >> different distributions for each Intel chip variant. There is some
>> >> support for multi-chip version functions, but only in the latest GCC,
>> >> which isn't in Debian stable. And the multi-chip version of functions
>> >> is going to be more expensive than inlining. For some cases, I have
>> >> seen that the overhead of fancy instructions looks good but has nasty
>> >> side effects like CPU stalls and/or increased power consumption, which
>> >> turns off turbo boost.
>> >>
>> >>
>> >> Distros in general have the same problem with special-case
>> >> optimizations.
>> >>
>> >What we really need is to do something like borrow the alternatives
>> >mechanism from the kernel so that we can dynamically replace
>> >instructions at run time based on cpu flags.  That way we could make the
>> >choice at run time, and wouldn't have to do a lot of special case
>> >jumping about.
>> >Neil
>> 
>> +1.
>> 
>> I think it should be an anti-requirement that the build machine be the
>> exact same chip as the deployment platform.
>> 
>> I like the cpu flag inspection approach.  It would help in the case where
>> DPDK is in a VM and an odd set of CPU flags has been exposed.
>> 
>> If that approach doesn't work though, then perhaps DPDK memcpy could go
>> through a benchmarking at app startup time and select the most performant
>> option out of a set, like mdraid's raid6 implementation does.  To give an
>> example, this is what my systems print out at boot time re: raid6
>> algorithm selection.
>> raid6: sse2x1    3171 MB/s
>> raid6: sse2x2    3925 MB/s
>> raid6: sse2x4    4523 MB/s
>> raid6: using algorithm sse2x4 (4523 MB/s)
>> 
>> Regards,
>>    Kelvin
>> 
>
>Thanks for the proposal!
>
>For DPDK, performance is always the most important concern. We need to
>utilize new architecture features to achieve that, so a solution per arch
>is necessary.
>Even a few extra cycles can lead to bad performance if they're in a hot
>loop.
>For instance, let's assume DPDK takes 60 cycles to process a packet on
>average; then 3 more cycles here means a 5% performance drop.
>
>The dynamic solution is doable but comes with performance penalties, even
>if they could be small. It may also bring extra complexity, which can lead
>to unpredictable behavior and side effects.
>For example, the dynamic solution won't have inline unrolling, which can
>bring a significant performance benefit for small copies with constant
>length, like eth_addr.
>
>We can investigate the VM scenario more.
>
>Zhihong (John)

John,

  Thanks for taking the time to answer my newbie question. I deeply
appreciate the attention paid to performance in DPDK. I have a follow-up
though.

I'm trying to figure out what requirements this approach creates for the
software build environment.  If we want to build optimized versions for
Haswell, Ivy Bridge, Sandy Bridge, etc., does this mean that we must have
one of each micro-architecture available for running the builds, or is
there a way of cross-compiling for all micro-architectures from just one
build environment?

Thanks,
  Kelvin 
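
To make the build-environment question concrete, here is a minimal sketch,
assuming GCC or Clang on x86. It is not taken from the patch set. The names
copy32_avx2, copy32_generic and select_copy32 are hypothetical. The target
attribute asks the compiler to emit AVX2 code for one function whatever CPU
the build machine has, and __builtin_cpu_supports makes the choice on the
deployment machine at run time, along the lines of the cpu flag inspection
discussed above.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical AVX2 variant. The target attribute enables AVX2 code
 * generation for this function only, independent of the build host's CPU
 * and of the global -march setting. */
__attribute__((target("avx2")))
static void copy32_avx2(void *dst, const void *src)
{
    __m256i v = _mm256_loadu_si256((const __m256i *)src);
    _mm256_storeu_si256((__m256i *)dst, v);
}
#endif

/* Hypothetical portable fallback for CPUs without AVX2. */
static void copy32_generic(void *dst, const void *src)
{
    memcpy(dst, src, 32);
}

typedef void (*copy32_fn)(void *dst, const void *src);

/* Pick an implementation once, based on the CPU the program runs on. */
static copy32_fn select_copy32(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return copy32_avx2;
#endif
    return copy32_generic;
}
```

The selected function is reached through a pointer, so the compiler cannot
inline or unroll it at the call site. That is the cost John describes for
small constant-length copies.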

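The startup-benchmark idea (raid6 style) could look roughly like the sketch
below. Everything in it is an assumption for illustration: the candidate
functions, the 4096-byte buffer, the round count and the use of clock() are
all hypothetical, and none of it comes from DPDK or mdraid.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Hypothetical candidate: one byte at a time. */
static void *copy_bytewise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical candidate: 8 bytes at a time, then the tail. */
static void *copy_wordwise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (; n >= 8; n -= 8, d += 8, s += 8) {
        uint64_t w;
        memcpy(&w, s, 8);
        memcpy(d, &w, 8);
    }
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical candidate: whatever the C library provides. */
static void *copy_libc(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

struct candidate { const char *name; copy_fn fn; };

static const struct candidate candidates[] = {
    { "bytewise", copy_bytewise },
    { "wordwise", copy_wordwise },
    { "libc",     copy_libc },
};

/* Time each candidate once at startup and return the fastest one.
 * A real benchmark would also need to stop the optimizer from
 * discarding the timed copies. */
static const struct candidate *select_fastest_copy(void)
{
    static unsigned char src[4096], dst[4096];
    const struct candidate *best = &candidates[0];
    clock_t best_ticks = 0;

    for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        clock_t t0 = clock();
        for (int r = 0; r < 10000; r++)
            candidates[i].fn(dst, src, sizeof(src));
        clock_t ticks = clock() - t0;
        if (i == 0 || ticks < best_ticks) {
            best_ticks = ticks;
            best = &candidates[i];
        }
    }
    return best;
}
```

An application would call select_fastest_copy() once during initialization
and keep the returned pointer, for example printing best->name to get output
similar in spirit to the raid6 boot messages quoted earlier.
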

Thread overview: 48+ messages
2015-01-19  1:53 zhihong.wang
2015-01-19  1:53 ` [dpdk-dev] [PATCH 1/4] app/test: Disabled VTA for memcpy test in app/test/Makefile zhihong.wang
2015-01-19  1:53 ` [dpdk-dev] [PATCH 2/4] app/test: Removed unnecessary test cases in test_memcpy.c zhihong.wang
2015-01-19  1:53 ` [dpdk-dev] [PATCH 3/4] app/test: Extended test coverage in test_memcpy_perf.c zhihong.wang
2015-01-19  1:53 ` [dpdk-dev] [PATCH 4/4] lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX platforms zhihong.wang
2015-01-20 17:15   ` Stephen Hemminger
2015-01-20 19:16     ` Neil Horman
2015-01-21  3:18       ` Wang, Zhihong
2015-01-25 20:02     ` Jim Thompson
2015-01-26 14:43   ` Wodkowski, PawelX
2015-01-27  5:12     ` Wang, Zhihong
2015-01-19 13:02 ` [dpdk-dev] [PATCH 0/4] DPDK memcpy optimization Neil Horman
2015-01-20  3:01   ` Wang, Zhihong
2015-01-20 15:11     ` Neil Horman
2015-01-20 16:14       ` Bruce Richardson
2015-01-21  3:44         ` Wang, Zhihong
2015-01-21 11:40           ` Bruce Richardson
2015-01-21 12:02           ` Ananyev, Konstantin
2015-01-21 12:38             ` Neil Horman
2015-01-23  3:26               ` Wang, Zhihong
2015-01-21 12:36           ` Marc Sune
2015-01-21 13:02             ` Bruce Richardson
2015-01-21 13:21               ` Marc Sune
2015-01-21 13:26                 ` Bruce Richardson
2015-01-21 19:49                   ` Stephen Hemminger
2015-01-21 20:54                     ` Neil Horman
2015-01-21 21:25                       ` Jim Thompson
2015-01-22  0:53                         ` Stephen Hemminger
2015-01-22  9:06                         ` Luke Gorrie
2015-01-22 13:29                           ` Jay Rolette
2015-01-22 18:27                             ` Luke Gorrie
2015-01-22 19:36                               ` Jay Rolette
2015-01-22 18:21                       ` EDMISON, Kelvin (Kelvin)
2015-01-27  8:22                         ` Wang, Zhihong
2015-01-28 21:48                           ` EDMISON, Kelvin (Kelvin) [this message]
2015-01-29  1:53                             ` Wang, Zhihong
2015-01-23  6:52                   ` Wang, Zhihong
2015-01-26 18:29                     ` Ananyev, Konstantin
2015-01-27  1:42                       ` Wang, Zhihong
2015-01-27 11:30                         ` Ananyev, Konstantin
2015-01-27 12:19                           ` Ananyev, Konstantin
2015-01-28  2:06                             ` Wang, Zhihong
2015-01-25 14:50 ` Luke Gorrie
2015-01-26  1:30   ` Wang, Zhihong
2015-01-26  8:03     ` Luke Gorrie
2015-01-27  7:19       ` Wang, Zhihong
2015-01-27 13:57         ` [dpdk-dev] [snabb-devel] " Luke Gorrie
2015-01-29  3:42 ` [dpdk-dev] " Fu, JingguoX
