Date: Mon, 15 Jun 2015 15:12:58 +0100
From: Bruce Richardson
To: Olivier MATZ
Cc: "dev@dpdk.org", "Damjan Marion (damarion)"
Subject: Re: [dpdk-dev] rte_mbuf.next in 2nd cacheline
Message-ID: <20150615141258.GA580@bricha3-MOBL3>
References: <87110795-201A-4A1E-A4CC-A778AA7C8218@cisco.com> <557ED116.7040508@6wind.com> <20150615134409.GA7500@bricha3-MOBL3> <2601191342CEEE43887BDE71AB97725836A0A838@irsmsx105.ger.corp.intel.com> <557EDB91.9010503@6wind.com>
In-Reply-To: <557EDB91.9010503@6wind.com>

On Mon, Jun 15, 2015 at 04:05:05PM +0200, Olivier MATZ wrote:
> Hi,
>
> On 06/15/2015 03:54 PM, Ananyev, Konstantin wrote:
> >
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> >> Sent: Monday, June 15, 2015 2:44 PM
> >> To: Olivier MATZ
> >> Cc: dev@dpdk.org; Damjan Marion (damarion)
> >> Subject: Re: [dpdk-dev] rte_mbuf.next in 2nd cacheline
> >>
> >> On Mon, Jun 15, 2015 at 03:20:22PM +0200, Olivier MATZ wrote:
> >>> Hi Damjan,
> >>>
> >>> On 06/10/2015 11:47 PM, Damjan Marion (damarion) wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> We noticed a 7% performance improvement by simply moving the
> >>>> rte_mbuf.next field to the 1st cache line.
> >>>>
> >>>> Currently it falls under /* second cache line - fields only used in
> >>>> slow path or on TX */, but it is actually used in several places in
> >>>> the RX fast path (e.g. i40e_rx_alloc_bufs() sets that field to NULL).
> >>>>
> >>>> Is there anything we can do here (stop using the next field, or move
> >>>> it to the 1st cache line)?
> >>>
> >>> Agree, this is also something I noticed, see:
> >>> http://dpdk.org/ml/archives/dev/2015-February/014400.html
> >>>
> >>> I did not have the time to do performance testing, but it's something
> >>> I'd like to do as soon as I can. I don't see any obvious reason not to
> >>> do it.
> >>>
> >>> It seems we currently have just enough room to do it (8 bytes remain
> >>> in the first cache line when compiled for 64 bits).
> >>
> >> This, to me, is the obvious reason not to do it! It prevents us from
> >> taking any other RX fast-path offload fields into the mbuf.
> >>
> >> That being said, I can see why we might want to look at moving it - but
> >> from the work done in the ixgbe driver, I'd be hopeful we can get as
> >> much performance with it on the second cache line for most cases,
> >> through judicious use of prefetching, or otherwise.
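
(As an aside, for anyone not staring at rte_mbuf.h while reading: the
layout under discussion looks roughly like the sketch below - a heavily
abridged field subset, with names as I recall them from the 2.x header,
not the real definition.)

#include <stdint.h>

/*
 * Heavily abridged sketch of the 2.x-era struct rte_mbuf layout
 * (assumed field subset; see rte_mbuf.h for the real definition).
 * On a 64-bit build the RX fast-path fields fill the first 64-byte
 * cache line and "next" lives in the second one.
 */
struct mbuf_sketch {
        /* first cache line - RX fast path */
        void *buf_addr;        /* virtual address of segment buffer */
        uint64_t buf_physaddr; /* physical address of segment buffer */
        uint16_t buf_len;      /* length of segment buffer */
        uint8_t nb_segs;       /* segment count: living in line 0, it
                                * lets the fast path spot a single-segment
                                * packet without ever reading "next" */
        /* ... port, ol_flags, packet_type, pkt_len, data_len, vlan_tci,
         * hash, seqn, ... (omitted) */

        /* second cache line - fields only used in slow path or on TX */
        __attribute__((aligned(64)))
        struct mbuf_sketch *next; /* next segment of scattered packet; the
                                   * write in i40e_rx_alloc_bufs() is what
                                   * drags this line into the RX path */
        /* ... pool, tx offload fields, ... (omitted) */
};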
> >>
> >> It took a lot of work and investigation to get free space in the mbuf -
> >> especially in cache line 0 - and I'd like to avoid just filling the
> >> cache line up again as long as we possibly can!
> >
> > Yep, agree with Bruce here.
> > Plus, with packet_type going to be 4B and vlan_tci_outer,
> > we just don't have 8 free bytes in the first cache line any more.
>
> I don't understand why m->next would not be a better candidate than the
> rx offload fields for a place in the first cache line. For instance,
> m->next is mandatory and must be initialized when allocating an mbuf
> (compare with m->seqn, for instance, which is also in the first cache
> line). So if we want to make some room in the first cache line, I think
> we can.
>
> To me, the only reason for not doing it now is that we don't have a full
> performance test report (several use cases, drivers, ...) showing it's
> better.
>

Because the "next" field does not have to be set at initialization. It can
instead be set only when needed, and cleared on free if it was used.

The next pointers always start out as NULL when the mbuf pool is created.
The only time the field is set to non-NULL is when we have chained mbufs.
If we never have any chained mbufs, we never need to touch the next field,
or even read it - since we have the num-segments count in the first cache
line. If we do have a multi-segment mbuf, it's likely to be a big packet,
so we have more processing time available and we can then take the hit of
setting the next pointer. Whenever we go to free the mbufs for that packet,
the code doing the freeing obviously needs to read the next pointer in
order to free all the buffers in the chain, and it can reset that pointer
to NULL in the same pass.

In this way, we can ensure that the next pointer on cache line 1 is not a
problem in our fast path.

/Bruce
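
P.S. A minimal sketch of the free-path discipline described above, using
a stripped-down stand-in structure and a hypothetical mempool_put()
helper rather than the real DPDK API:

#include <stdint.h>
#include <stddef.h>

struct mbuf {
        uint8_t nb_segs;   /* first cache line in the real structure */
        struct mbuf *next; /* second cache line in the real structure */
};

/* Hypothetical stand-in for handing one buffer back to its mempool. */
extern void mempool_put(struct mbuf *m);

/*
 * Free a possibly-chained packet while restoring the pool invariant
 * that every free mbuf has next == NULL (and nb_segs == 1). The free
 * path has to read "next" anyway to walk the chain, so clearing it
 * here costs nothing extra, while allocation and single-segment RX
 * never touch the field at all.
 */
static void
mbuf_free_chain(struct mbuf *m)
{
        while (m != NULL) {
                struct mbuf *nxt = m->next; /* the unavoidable read */
                m->next = NULL;             /* re-establish the invariant */
                m->nb_segs = 1;
                mempool_put(m);
                m = nxt;
        }
}

The trade-off is that every free has to go through a chain-aware path,
but that code needs to walk the chain in any case.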