From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 48789A0521;
	Tue,  3 Nov 2020 14:50:41 +0100 (CET)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 0FE00CA54;
	Tue,  3 Nov 2020 14:50:39 +0100 (CET)
Received: from mga12.intel.com (mga12.intel.com [192.55.52.136])
 by dpdk.org (Postfix) with ESMTP id DBDECCA3B;
 Tue,  3 Nov 2020 14:50:35 +0100 (CET)
IronPort-SDR: vCmuBFTiVbL+1JLNBrjTQQG4CxRgIh9bxSmBcZRU8oYMXEWQIpYQXDKwHXLK56QQuHMy5IW8Pt
 iHf6lc8douRA==
X-IronPort-AV: E=McAfee;i="6000,8403,9793"; a="148333431"
X-IronPort-AV: E=Sophos;i="5.77,448,1596524400"; d="scan'208";a="148333431"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 03 Nov 2020 05:50:33 -0800
IronPort-SDR: Q7cRLjtXh3dnXHayFg75PtlGvL8CZtmZlpNsWuolnjuKlXQtsasM7uLDHTnbVqLzrJLWWBataX
 tihB9TOYTRYA==
X-IronPort-AV: E=Sophos;i="5.77,448,1596524400"; d="scan'208";a="470802509"
Received: from bricha3-mobl.ger.corp.intel.com ([10.249.45.202])
 by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-SHA;
 03 Nov 2020 05:50:28 -0800
Date: Tue, 3 Nov 2020 13:50:25 +0000
From: Bruce Richardson <bruce.richardson@intel.com>
To: Morten =?iso-8859-1?Q?Br=F8rup?= <mb@smartsharesystems.com>
Cc: Thomas Monjalon <thomas@monjalon.net>, dev@dpdk.org, techboard@dpdk.org,
 Ajit Khaparde <ajit.khaparde@broadcom.com>,
 "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
 Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
 "Yigit, Ferruh" <ferruh.yigit@intel.com>, david.marchand@redhat.com,
 olivier.matz@6wind.com, jerinj@marvell.com, viacheslavo@nvidia.com,
 honnappa.nagarahalli@arm.com, maxime.coquelin@redhat.com,
 stephen@networkplumber.org, hemant.agrawal@nxp.com,
 Matan Azrad <matan@nvidia.com>, Shahaf Shuler <shahafs@nvidia.com>
Message-ID: <20201103135025.GD1144@bricha3-MOBL.ger.corp.intel.com>
References: <20201029092751.3837177-1-thomas@monjalon.net>
 <3086227.yllCKDRCEA@thomas>
 <98CBD80474FA8B44BF855DF32C47DC35C613CD@smartserver.smartshare.dk>
 <13044489.RHGIMAnax8@thomas>
 <98CBD80474FA8B44BF855DF32C47DC35C613DB@smartserver.smartshare.dk>
 <20201103122547.GB1144@bricha3-MOBL.ger.corp.intel.com>
 <98CBD80474FA8B44BF855DF32C47DC35C613DC@smartserver.smartshare.dk>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35C613DC@smartserver.smartshare.dk>
Subject: Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first
 half
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

On Tue, Nov 03, 2020 at 02:46:17PM +0100, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Tuesday, November 3, 2020 1:26 PM
> > 
> > On Tue, Nov 03, 2020 at 01:10:05PM +0100, Morten Brørup wrote:
> > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > Sent: Monday, November 2, 2020 4:58 PM
> > > >
> > > > +Cc techboard
> > > >
> > > > We need benchmark numbers in order to take a decision.
> > > > Please all, prepare some arguments and numbers so we can discuss
> > > > the mbuf layout in the next techboard meeting.
> > >
> > > I propose that the techboard considers this from two angles:
> > >
> > > 1. Long term goals and their relative priority. I.e. what can be
> > > achieved with wide-ranging modifications, requiring yet another ABI
> > > break and due notices.
> > >
> > > 2. Short term goals, i.e. what can be achieved for this release.
> > >
> > >
> > > My suggestions follow...
> > >
> > > 1. Regarding long term goals:
> > >
> > > I have argued that simple forwarding of non-segmented packets using
> > > only the first mbuf cache line can be achieved by making three
> > > modifications:
> > >
> > > a) Move m->tx_offload to the first cache line.
> > > b) Use an 8 bit pktmbuf mempool index in the first cache line,
> > >    instead of the 64 bit m->pool pointer in the second cache line.
> > > c) Do not access m->next when we know that it is NULL.
> > >    We can use m->nb_segs == 1 or some other invariant as the gate.
> > >    It can be implemented by adding an m->next accessor function:
> > >    struct rte_mbuf * rte_mbuf_next(struct rte_mbuf * m)
> > >    {
> > >        return m->nb_segs == 1 ? NULL : m->next;
> > >    }
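
To make the three modifications listed above concrete, here is a minimal
sketch in C. It does not use the real struct rte_mbuf: the sketch_mbuf
layout, the 64-byte cache line, the sketch_pool_table[] lookup and every
sketch_* name are hypothetical and exist only for illustration. The
nb_segs gate is assumed to be applied to the head segment of a packet
only, since that is where nb_segs is meaningful.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE 64   /* assumed cache line size */
#define SKETCH_MAX_POOLS  256  /* all values an 8-bit index can take */

/* Stand-in for a mempool; hypothetical, for illustration only. */
struct sketch_pool {
    const char *name;
};

/* Hypothetical mbuf layout, NOT the real struct rte_mbuf. */
struct sketch_mbuf {
    /* first cache line */
    uint64_t tx_offload;   /* (a) TX offload fields kept in line 0 */
    uint16_t nb_segs;      /* number of segments, valid in the head */
    uint8_t  pool_idx;     /* (b) 8-bit index instead of a pointer */
    /* second cache line */
    _Alignas(SKETCH_CACHE_LINE) struct sketch_mbuf *next;
};

/* Check at compile time that the fields sit where the sketch claims. */
static_assert(offsetof(struct sketch_mbuf, tx_offload) < SKETCH_CACHE_LINE,
              "tx_offload must be in the first cache line");
static_assert(offsetof(struct sketch_mbuf, pool_idx) < SKETCH_CACHE_LINE,
              "pool_idx must be in the first cache line");
static_assert(offsetof(struct sketch_mbuf, next) >= SKETCH_CACHE_LINE,
              "next is expected in the second cache line");

/* (b) Table translating the 8-bit index back to a pool pointer. */
static struct sketch_pool *sketch_pool_table[SKETCH_MAX_POOLS];

static inline struct sketch_pool *
sketch_mbuf_pool(const struct sketch_mbuf *m)
{
    return sketch_pool_table[m->pool_idx];
}

/*
 * (c) Gated accessor: for a single-segment packet the answer comes from
 * nb_segs in the first cache line and m->next is never read.
 * Assumption: m is the head segment of the packet.
 */
static inline struct sketch_mbuf *
sketch_mbuf_next(const struct sketch_mbuf *m)
{
    return m->nb_segs == 1 ? NULL : m->next;
}
```

With this layout, freeing or forwarding a non-segmented packet would only
need sketch_mbuf_pool() and sketch_mbuf_next() on the head mbuf, and both
read first-cache-line fields only. Whether an extra table lookup is
cheaper than touching the second cache line is exactly the kind of thing
the benchmark numbers would have to show.
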
> > >
> > > Regarding the priority of this goal, I guess that simple forwarding
> > > of non-segmented packets is probably the path taken by the majority
> > > of packets handled by DPDK.
> > >
> > >
> > > An alternative goal could be:
> > > Do not touch the second cache line during RX.
> > > A comment in the mbuf structure says so, but it is not true anymore.
> > >
> > 
> > The comment should be true for non-scattered RX, I believe.
> 
> You are correct.
> 
> My suggestion was unclear: Extend this remark to include segmented packets.
> 
> This could be a priority if the techboard considers RX segmented packets more important than my suggestion for single cache line forwarding of non-segmented packets.
> 
> 
> > I'm not aware of any use of the second cacheline in the fast-path RX of many drivers.
> > Am I missing something that has changed recently here?
> 
> Check out eth_igb_recv_pkts() in the E1000 driver: rxm->next = NULL;
> Or pmd_rx_burst() in the TAP driver: new_tail->next = seg->next;
> 
> Perhaps the documentation should describe best practices for implementing RX and TX functions in drivers, including allocating/freeing mbufs. Or an example dummy Ethernet driver could do it.
> 

Yes, perhaps I should have been clearer about what I meant by "fast-path":
I was thinking of the optimized RX/TX paths for NICs at 10G and above.
The documentation should probably be updated to clarify this, since using
only the first cacheline is possible, but not mandatory, for simple RX.

/Bruce