From: Thomas Monjalon
To: "Damjan Marion (damarion)"
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] rte_mbuf.next in 2nd cacheline
Date: Wed, 17 Jun 2015 18:32:24 +0200
Message-ID: <5029156.q8l1qJC5K0@xps13>
In-Reply-To: <56928EA5-A3DB-44B3-B0ED-54E6FC0AE361@cisco.com>
References: <87110795-201A-4A1E-A4CC-A778AA7C8218@cisco.com> <20150617140648.GC8208@bricha3-MOBL3> <56928EA5-A3DB-44B3-B0ED-54E6FC0AE361@cisco.com>
Organization: 6WIND

2015-06-17 14:23, Damjan Marion:
> 
> > On 17 Jun 2015, at 16:06, Bruce Richardson wrote:
> > 
> > On Wed, Jun 17, 2015 at 01:55:57PM +0000, Damjan Marion (damarion) wrote:
> >> 
> >>> On 15 Jun 2015, at 16:12, Bruce Richardson wrote:
> >>> 
> >>> The next pointer always starts out as NULL when the mbuf pool is created. The
> >>> only time it is set to non-NULL is when we have chained mbufs. If we never have
> >>> any chained mbufs, we never need to touch the next field, or even read it - since
> >>> we have the num-segments count in the first cache line. If we do have a
> >>> multi-segment mbuf, it's likely to be a big packet, so we have more processing
> >>> time available and we can then take the hit of setting the next pointer.
> >> 
> >> There are applications which are not using rx offload, but they deal with chained
> >> mbufs. Why are they less important than ones using rx offload? This is something
> >> people should be able to configure at build time.
> > 
> > It's not that they are less important, it's that the packet processing cycle count
> > budget is going to be greater. A packet which is 64 bytes, or 128 bytes, in size
> > can make use of a number of RX offloads to reduce its processing time.
> > However, a 64/128-byte packet is not going to be split across multiple buffers
> > [unless we are dealing with a very unusual setup!].
> > 
> > To handle 64-byte packets at 40G line rate, one has 50 cycles per core per packet
> > when running at 3GHz [3000000000 cycles / 59.5 mpps]. If we assume that we are
> > dealing with fairly small buffers here, and that anything greater than 1k is
> > chained, we still have 626 cycles per 3GHz core per packet to work with for that
> > 1k packet. Given that "normal" DPDK buffers are 2k in size, we have over a
> > thousand cycles per packet for any packet that is split.
> > 
> > In summary, packets spread across multiple buffers are large packets, and so have
> > larger packet cycle count budgets and so can much better absorb the cost of
> > touching a second cache line in the mbuf than a 64-byte packet can. Therefore,
> > we optimize for the 64B packet case.
> 
> This makes sense if there is no other work to do on the same core.
> Otherwise it is better to spend those cycles on actual work instead of waiting
> for the 2nd cache line...

You're probably right.
I wonder whether this flexibility can be implemented only in static lib builds?
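
For readers checking the cycle budgets quoted above, here is a minimal
back-of-the-envelope calculation. It assumes a 3 GHz core and the standard
Ethernet wire overhead of 20 bytes per frame (7-byte preamble + 1-byte SFD +
12-byte inter-frame gap); the frame sizes include the 4-byte CRC.

    /* Reproduces the ~50 cycles/packet for 64-byte frames and the
     * ~626 cycles/packet for 1k frames mentioned in the thread. */
    #include <stdio.h>

    int main(void)
    {
        const double link_bps = 40e9;   /* 40 Gbit/s link */
        const double cpu_hz   = 3e9;    /* 3 GHz core */
        const double overhead = 20.0;   /* preamble + SFD + inter-frame gap */
        const double sizes[]  = { 64.0, 1024.0 };
        unsigned i;

        for (i = 0; i < 2; i++) {
            double pps = link_bps / ((sizes[i] + overhead) * 8.0);
            printf("%4.0f-byte frames: %.1f Mpps, %.0f cycles/packet\n",
                   sizes[i], pps / 1e6, cpu_hz / pps);
        }
        return 0;   /* ~59.5 Mpps / ~50 cycles and ~4.8 Mpps / ~626 cycles */
    }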
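
The point about keeping the segment count in the first cache line can be made
concrete with a small sketch. This is purely illustrative, not a patch from
this thread, and free_mbuf_fast() is a hypothetical helper: when nb_segs == 1
the free path never reads m->next (which sits in the second cache line), and
only chained, i.e. large, packets pay for that extra access.

    #include <rte_branch_prediction.h>
    #include <rte_mbuf.h>

    /* Hypothetical helper sketching the trade-off discussed above: the
     * single-segment case is decided from nb_segs (first cache line) and
     * never dereferences m->next (second cache line). */
    static inline void
    free_mbuf_fast(struct rte_mbuf *m)
    {
        if (likely(m->nb_segs == 1)) {
            /* Common case: one segment, free it without walking a chain. */
            rte_pktmbuf_free_seg(m);
        } else {
            /* Chained (typically >1k) packet: rte_pktmbuf_free() walks the
             * chain via m->next, touching the second cache line, but the
             * larger per-packet cycle budget absorbs that cost. */
            rte_pktmbuf_free(m);
        }
    }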