From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <olivier.matz@6wind.com>
Received: from mail-wr0-f178.google.com (mail-wr0-f178.google.com
 [209.85.128.178]) by dpdk.org (Postfix) with ESMTP id 5A69139EA
 for <dev@dpdk.org>; Thu, 16 Feb 2017 17:14:13 +0100 (CET)
Received: by mail-wr0-f178.google.com with SMTP id c4so15011715wrd.2
 for <dev@dpdk.org>; Thu, 16 Feb 2017 08:14:13 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=6wind-com.20150623.gappssmtp.com; s=20150623;
 h=date:from:to:cc:subject:message-id:in-reply-to:references
 :mime-version:content-transfer-encoding;
 bh=iGlNLq3Z7zfDwFRNc0UIMCV/k5RvSu4bBMXNEy6XVsA=;
 b=FThXmx7eSGe/qc+br7ccCXKHbg/myPtr09teqsWeuoEdIXVS4Wb2Oyhza/IOB+GFM0
 /NNSi2TgzYqYkVkbbCP+xQnvt1hIG5I84Mlf89af0O1cDK84LAzj5PLA4jywh5dOJqCt
 WkDtkJU93fiZmQv4pMX4HzJCw8uYOuWaXyTPFWFm4SWEV6BlgRSvNeyqiZ8fWIj93421
 9mHn28w1bE/Ps+/PsBjpztsmlz8NYSoU4EgXOjk40Vqe/oha6LMUY6l+l93QZQGWkb/2
 VEZJOA2A15B9mUaEpTzPz9ljQBEhknz+rj+SzjKMFxWOogcwq+8Lvf8naNVDsfwsIKSD
 Plqw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to
 :references:mime-version:content-transfer-encoding;
 bh=iGlNLq3Z7zfDwFRNc0UIMCV/k5RvSu4bBMXNEy6XVsA=;
 b=It95W5SpxRAVyFFBRPxpxXdYYDvz1AjceEJdC77WbP8rTwY2Qo3rIBekqr3Lu9uE0c
 Pa2y6WxFuRUiVr0NAgOjyqTV0MSqnsNykUX0xtnPESFphrMJFhadCyH0A1UaH4RHqYz9
 wH2koFsI7XYPr13Asa7KiukP3ddkwd7+/EMV4rpi/uKRReqNpQBpaJ/UT4sGbv2rhtEl
 rP4JOfSgHaWr7pyDn9z9DjeRkWCACf7gUoBl1rR2YcZa3pHflNH0mfSa4lSp+7EF5LDq
 vMmDfl4T+6o2BUYNIv7LPFyvRdiMleoSWrFvrKq9B4b/c51Dsb2KnY2kuXk6weI/gQ33
 7gVg==
X-Gm-Message-State: AMke39kKltGxO5MkhxjW+8LFZk/Rracb2OPjqFmeDjhN7gHMxexQA4THY8w9//kWS8uZwbcZ
X-Received: by 10.223.175.71 with SMTP id z65mr3473236wrc.84.1487261652863;
 Thu, 16 Feb 2017 08:14:12 -0800 (PST)
Received: from platinum (2a01cb0c03c651000226b0fffeed02fc.ipv6.abo.wanadoo.fr.
 [2a01:cb0c:3c6:5100:226:b0ff:feed:2fc])
 by smtp.gmail.com with ESMTPSA id b87sm788591wmi.0.2017.02.16.08.14.12
 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);
 Thu, 16 Feb 2017 08:14:12 -0800 (PST)
Date: Thu, 16 Feb 2017 17:14:10 +0100
From: Olivier Matz <olivier.matz@6wind.com>
To: Bruce Richardson <bruce.richardson@intel.com>
Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "dev@dpdk.org"
 <dev@dpdk.org>
Message-ID: <20170216171410.57bff4ed@platinum>
In-Reply-To: <20170216154619.GA115208@bricha3-MOBL3.ger.corp.intel.com>
References: <1485271173-13408-1-git-send-email-olivier.matz@6wind.com>
 <2601191342CEEE43887BDE71AB9772583F111A29@irsmsx105.ger.corp.intel.com>
 <20170216144807.7add2c71@platinum>
 <20170216154619.GA115208@bricha3-MOBL3.ger.corp.intel.com>
X-Mailer: Claws Mail 3.14.1 (GTK+ 2.24.31; x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [RFC 0/8] mbuf: structure reorganization
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Feb 2017 16:14:13 -0000

On Thu, 16 Feb 2017 15:46:19 +0000, Bruce Richardson
<bruce.richardson@intel.com> wrote:
> On Thu, Feb 16, 2017 at 02:48:07PM +0100, Olivier Matz wrote:
> > Hi Konstantin,
> > 
> > Thanks for the feedback.
> > Comments inline.
> > 
> > 
> > On Mon, 6 Feb 2017 18:41:27 +0000, "Ananyev, Konstantin"
> > <konstantin.ananyev@intel.com> wrote:  
> > > Hi Olivier,
> > > Looks good in general, some comments from me below.
> > > Thanks
> > > Konstantin
> > >   
> > > > 
> > > > The main changes are:
> > > > - reorder structure to increase vector performance on some
> > > > non-ia platforms.
> > > > - add a 64bits timestamp field in the 1st cache line    
> > > 
> > > Wonder why it deserves to be in first cache line?
> > > How it differs from seqn below (pure SW stuff right now).  
> > 
> > In case the timestamp is set from a NIC value, it is set in the Rx
> > path. So that's why I think it deserve to be located in the 1st
> > cache line.
> > 
> > As you said, the seqn is a pure sw stuff right: it is set in a lib,
> > not in a PMD rx path.
> >   
> > > > - m->next, m->nb_segs, and m->refcnt are always initialized for
> > > > mbufs in the pool, avoiding the need of setting m->next
> > > > (located in the 2nd cache line) in the Rx path for mono-segment
> > > > packets.
> > > > - change port and nb_segs to 16 bits    
> > > 
> > > Not that I am completely against it,
> > > but changing nb_segs to 16 bits seems like an overkill to me.
> > > I think we can keep and extra 8bits for something more useful in
> > > future.  
> > 
> > In my case, I use the m->next field to chain more than 256 segments
> > for L4 socket buffers. It also updates nb_seg that can overflow.
> > It's not a big issue since at the end, nb_seg is decremented for
> > each segment. On the other hand, if I enable some sanity checks on
> > mbufs, it complains because the number of segments is not equal to
> > nb_seg.
> > 
> > There is also another use case with fragmentation as discussed
> > recently: http://dpdk.org/dev/patchwork/patch/19819/
> > 
> > Of course, dealing with a long mbuf list is not that efficient,
> > but the application can maintain another structure to accelerate the
> > access to the middle/end of the list.
> > 
> > Finally, we have other ideas to get additional 8 bits if required in
> > the future, so I don't think it's really a problem.
> > 
> >   
> > >   
> > > > - move seqn in the 2nd cache line
> > > > 
> > > > Things discussed but not done in the patchset:
> > > > - move refcnt and nb_segs to the 2nd cache line: many drivers
> > > > sets them in the Rx path, so it could introduce a performance
> > > > regression, or    
> > > 
> > > I wonder can refcnt only be moved into the 2-nd cacheline?
> > > As I understand thanks to other change (from above) m->refcnt 
> > > will already be initialized, so RX code don't need to touch it.
> > > Though yes, it still would require changes in all PMDs.  
> > 
> > Yes, I agree, some fields could be moved in the 2nd cache line once
> > all PMDs stop to write them in RX path. I propose to issue some
> > guidelines to PMD maintainers at the same time the patchset is
> > pushed. Then we can consider changing it in a future version, in
> > case we need more room in the 1st mbuf cache line.
> >  
> 
> If we are changing things, we should really do all that now, rather
> than storing up future breaks to mbuf. Worst case, we should plan for
> it immediately after the release where we make these changes. Have two
> releases that break mbuf immediately after each other - and flagged as
> such, but keep it stable thereafter. I don't like having technical
> debt on mbuf just after we supposedly "fix" it.

I don't think we need to make this change now. I am also not comfortable
with the idea of a patchset that updates all the PMDs to remove the
access to a field because it moved to the 2nd cache line (especially
for the vector PMDs).

That's why I think the plan could be:
- push an updated version of this patchset quickly
- tell PMD maintainers: "you don't need to set m->next, m->refcnt,
  and m->nb_segs in the RX path, please update your drivers"
- later, if we need more room in the 1st cache line of the mbuf, we
  can move refcnt and nb_segs, probably without impacting
  performance.
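
To illustrate the second point, here is a minimal sketch of what the
guideline means for a mono-segment RX path. It assumes a simplified,
hypothetical structure (toy_mbuf) and hypothetical helper names; it is
not the real struct rte_mbuf layout nor code from any PMD. The pool side
restores the three fields, so the RX side only writes per-packet fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an mbuf; NOT the real layout. */
struct toy_mbuf {
	/* Fields the RX path has to write for each packet. */
	uint16_t port;
	uint16_t data_len;
	uint32_t pkt_len;
	/* Fields kept at known values while the mbuf sits in the pool. */
	uint16_t refcnt;
	uint16_t nb_segs;
	struct toy_mbuf *next; /* would live in the 2nd cache line */
};

/*
 * Assumed pool-side hook: run when an mbuf is created in the pool or
 * returned to it, so that a freshly allocated mbuf always describes a
 * single, unshared segment.
 */
static void toy_mbuf_pool_reset(struct toy_mbuf *m)
{
	m->next = NULL;
	m->nb_segs = 1;
	m->refcnt = 1;
}

/*
 * Mono-segment RX fill: relies on the pool invariants instead of
 * rewriting next, nb_segs and refcnt. The assert is only a debug-time
 * sanity check of those invariants.
 */
static void toy_rx_fill(struct toy_mbuf *m, uint16_t port, uint16_t len)
{
	assert(m->next == NULL && m->nb_segs == 1 && m->refcnt == 1);
	m->port = port;
	m->data_len = len;
	m->pkt_len = len;
}
```

With this split, the RX function never touches next, which is what
would later allow refcnt and nb_segs to move next to it.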


Olivier