From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <rolette@infiniteio.com>
Received: from mail-yk0-f180.google.com (mail-yk0-f180.google.com
 [209.85.160.180]) by dpdk.org (Postfix) with ESMTP id 24E9C5AB1
 for <dev@dpdk.org>; Thu, 22 Jan 2015 20:36:27 +0100 (CET)
Received: by mail-yk0-f180.google.com with SMTP id 131so1503240ykp.11
 for <dev@dpdk.org>; Thu, 22 Jan 2015 11:36:26 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:in-reply-to:references:date
 :message-id:subject:from:to:cc:content-type;
 bh=JdlFmmk+cNRYDfIID2mtQ7XHFON3icQifAzSuKQG6GM=;
 b=VkMP+qS29wdNWmPOJ+QzHAov5hEmhJ+10hc6DoTEtsPzGpRYxnCQ8EEtsFD6WsKJEF
 SzGOq6Kthj1eqHqHEQ4ICyzGx4eocvIU/hqklvEBbA0T1ChRYqxBONFSnv9om4OJMdwS
 Lxtr2dfP7IjSO+Ao072sygFv27IfS9XL+UiZaoFvzSTsM9fPJ3clu3YeatMTC2rRgN1C
 QOzMEjKspVA26b3KXOM1DDIynsjLh0qsAH1Sfu3csMLVt7NzAIlaAgLiygX16Ff3SxMh
 ne0Eq3tmxmFENuxNMST9r40a989hSRLAVcKqA0N5tyhxsapCAj9imb3GrAGZ5Fj6DskR
 /nCA==
X-Gm-Message-State: ALoCoQkDyTAmSaWt12lTavLq5PSPJo7xZmVaRi1HzhDd3u0yyWOSmhNYnkZBHLKj0I+rXI4+tcYM
MIME-Version: 1.0
X-Received: by 10.170.39.70 with SMTP id 67mr1921810ykh.36.1421955386543; Thu,
 22 Jan 2015 11:36:26 -0800 (PST)
Received: by 10.170.54.73 with HTTP; Thu, 22 Jan 2015 11:36:26 -0800 (PST)
In-Reply-To: <CAA2XHbfcFj=RE-__=6Gjetthp+XyxRRuj91G6FD=j4B=f9Je=Q@mail.gmail.com>
References: <20150119130221.GB21790@hmsreliant.think-freely.org>
 <F60F360A2500CD45ACDB1D700268892D0E75EFFE@SHSMSX101.ccr.corp.intel.com>
 <20150120151118.GD18449@hmsreliant.think-freely.org>
 <20150120161453.GA5316@bricha3-MOBL3>
 <F60F360A2500CD45ACDB1D700268892D0E75F664@SHSMSX101.ccr.corp.intel.com>
 <54BF9D59.7070104@bisdn.de> <20150121130234.GB10756@bricha3-MOBL3>
 <54BFA7D5.7020106@bisdn.de> <20150121132620.GC10756@bricha3-MOBL3>
 <20150121114947.0753ae87@urahara>
 <20150121205404.GB32617@hmsreliant.think-freely.org>
 <53D2253B-DE20-486E-ADF0-DA02AAB1EF35@netgate.com>
 <CAA2XHbcG4kZzOiMibQhjRxjg_aCJpZ4djgXbQf=FECgZropbCw@mail.gmail.com>
 <CADNuJVrzFzT6WOWM8W13xvv8ad5b2GMO8C12EFYRb1vQZGyTBA@mail.gmail.com>
 <CAA2XHbfcFj=RE-__=6Gjetthp+XyxRRuj91G6FD=j4B=f9Je=Q@mail.gmail.com>
Date: Thu, 22 Jan 2015 13:36:26 -0600
Message-ID: <CADNuJVp9Z-Cp2iJ72SwZPLcq+tJeg797E2-_D0aPmWBYuoO0rw@mail.gmail.com>
From: Jay Rolette <rolette@infiniteio.com>
To: Luke Gorrie <luke@snabb.co>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.15
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH 0/4] DPDK memcpy optimization
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 22 Jan 2015 19:36:27 -0000

On Thu, Jan 22, 2015 at 12:27 PM, Luke Gorrie <luke@snabb.co> wrote:

> On 22 January 2015 at 14:29, Jay Rolette <rolette@infiniteio.com> wrote:
>
>> Microseconds matter. Scaling up to 100GbE, nanoseconds matter.
>>
>
> True. Is there a cut-off point though?
>

There are always engineering trade-offs that have to be made. If I'm
optimizing something today, I'm certainly not starting with something that
takes 1ns in an app that is doing L4-7 processing. It's all about
profiling and figuring out where the bottlenecks are.

For past networking products I've built, there was a lot of traffic that
the software didn't have to do much with. Minimal L2/L3 checks, then forward
the packet. It didn't even have to parse the headers because that was
offloaded on an FPGA. The only way to make those packets faster was to turn
them around in the FPGA and not send them to the CPU at all. That change
improved small packet performance by ~30%. That was on high-end network
processors that are significantly faster than Intel processors for packet
handling.

It seems strange when you realize that just getting the packets into the
CPU is expensive, never mind what you do with them after that.

> Does one nanosecond matter?
>

You just have to be careful when talking about things like a nanosecond.
It sounds really small, but the inter-packet gap (IPG) for a 10G link is
only 9.6ns. It's all relative.
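
As a rough check on that figure, here is a minimal sketch of the arithmetic,
assuming standard Ethernet framing (96-bit-time IPG, 8-byte preamble/SFD,
64-byte minimum frame). The function and constant names are made up for
illustration; they don't come from DPDK or any other library.

```python
# Standard Ethernet framing constants, in bytes.
PREAMBLE_SFD_BYTES = 8   # 7-byte preamble + 1-byte start-of-frame delimiter
MIN_FRAME_BYTES = 64     # minimum frame size, including the FCS
IPG_BYTES = 12           # inter-packet gap: 96 bit times


def wire_time_ns(num_bytes, link_gbps):
    """Time to put num_bytes on the wire; 1 Gb/s is 1 bit per nanosecond."""
    return num_bytes * 8 / link_gbps


def min_frame_slot_ns(link_gbps):
    """Per-packet time budget at line rate with minimum-size frames."""
    slot_bytes = PREAMBLE_SFD_BYTES + MIN_FRAME_BYTES + IPG_BYTES
    return wire_time_ns(slot_bytes, link_gbps)


if __name__ == "__main__":
    for gbps in (10, 100):
        print(f"{gbps}G: IPG = {wire_time_ns(IPG_BYTES, gbps):.2f} ns, "
              f"64-byte slot = {min_frame_slot_ns(gbps):.2f} ns")
    # 10G: IPG = 9.60 ns, 64-byte slot = 67.20 ns
    # 100G: IPG = 0.96 ns, 64-byte slot = 6.72 ns
```

At 100G the whole budget for a minimum-size packet is under 7ns, so a
nanosecond is a noticeable fraction of it.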

> AVX512 will fit a 64-byte packet in one register and move that to or from
> memory with one instruction. L1/L2 cache bandwidth per server is growing on
> a double-exponential curve (both bandwidth per core and cores per CPU). I
> wonder if moving data around in cache will soon be too cheap for us to
> justify worrying about.
>

Adding cores helps with aggregate performance, but doesn't really help with
latency on a single packet. That said, I'll take advantage of anything I
can from the hardware to either let me scale up how much traffic I can
handle or the number of features I can add at the same performance level!
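
The throughput-versus-latency distinction can be sketched with a toy model.
It assumes perfect scaling (each core processes packets independently, no
contention) and a hypothetical 100ns of processing per packet; neither
number is a measurement.

```python
def aggregate_pps(cores, per_packet_ns):
    """Packets per second across all cores, assuming perfect scaling."""
    return cores * 1e9 / per_packet_ns


def single_packet_latency_ns(cores, per_packet_ns):
    """Processing latency of one packet.

    In this model a packet is handled by exactly one core, so the core
    count does not appear in the result.
    """
    return per_packet_ns


if __name__ == "__main__":
    PER_PACKET_NS = 100  # hypothetical per-packet processing cost
    for cores in (1, 8):
        print(f"{cores} core(s): "
              f"{aggregate_pps(cores, PER_PACKET_NS) / 1e6:.0f} Mpps, "
              f"{single_packet_latency_ns(cores, PER_PACKET_NS)} ns per packet")
```

Going from 1 to 8 cores multiplies the packet rate by 8 in this model while
the per-packet time stays at 100ns; only making the per-packet work cheaper
moves the latency.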

Jay