From: Bruce Richardson
To: Jerin Jacob
Cc: dev@dpdk.org, thomas.monjalon@6wind.com, konstantin.ananyev@intel.com,
	viktorin@rehivetech.com, jianbo.liu@linaro.org
Date: Wed, 18 May 2016 17:43:00 +0100
Subject: Re: [dpdk-dev] [PATCH] mbuf: make rearm_data address naturally aligned
Message-ID: <20160518164300.GA12324@bricha3-MOBL3>
In-Reply-To: <1463579863-32053-1-git-send-email-jerin.jacob@caviumnetworks.com>

On Wed, May 18, 2016 at 07:27:43PM +0530, Jerin Jacob wrote:
> To avoid multiple stores on the fast path, Ethernet drivers
> aggregate the writes to data_off, refcnt, nb_segs and port
> into a single uint64_t value and write it in one shot
> through a uint64_t * at the &mbuf->rearm_data address.
>
> Some non-IA platforms have a store operation overhead
> if the store address is not naturally aligned. This patch
> fixes the performance issue on those targets.
>
> Signed-off-by: Jerin Jacob
> ---
>
> Tested this patch on IA and non-IA (ThunderX) platforms.
> This patch shows a 400Kpps/core improvement in a ThunderX + ixgbe + vector
> environment, and it adds no overhead on the IA platform.
>
> I also tried another, similar approach: replacing "buf_len" with "pad"
> (in this patch's context). Since that has the additional overhead of
> reading and then masking to keep "buf_len" intact, it did not show much
> improvement.
> ref: http://dpdk.org/ml/archives/dev/2016-May/038914.html
>
> ---

While this will work and, from your tests, doesn't seem to have a performance
impact, I'm not sure I particularly like it. It's extending out the end of
cacheline0 of the mbuf by 16 bytes, though I suppose it's not technically
using up any more space in it.

What I'm wondering about, though, is whether we have any use cases where we
need a variable buf_len for packets on RX. These mbufs come directly from a
mempool, which is generally understood to be a set of fixed-size buffers. I
realise that this change was made in the past after some discussion, but one
of the key points there [at least on my reading] was that - even though
nobody actually made a concrete case where they had variable-sized buffers -
having support for them made no performance difference. The latter part of
that has now changed: supporting variable-sized mbufs from an mbuf pool now
has a perf impact. Do we definitely need that functionality? The easiest fix
here would be just to move the rxrearm marker back above buf_len, as it was
originally in releases like 1.8.
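For anyone following the thread, the store-merging pattern in question looks
roughly like the sketch below. This is only an illustration, loosely modelled
on the mbuf_initializer setup used by the vector PMDs: the function names are
made up, and it assumes that data_off, refcnt, nb_segs and port are the eight
bytes starting at the rearm_data marker, as in the current struct rte_mbuf
layout.

	#include <stdint.h>
	#include <string.h>
	#include <rte_mbuf.h>

	/*
	 * Capture data_off/refcnt/nb_segs/port from a template mbuf as a
	 * single 64-bit value, as the vector PMDs do at queue setup time.
	 */
	static uint64_t
	make_mbuf_initializer(uint8_t port_id)
	{
		struct rte_mbuf mb_def;
		uintptr_t p;

		memset(&mb_def, 0, sizeof(mb_def));
		mb_def.nb_segs = 1;
		mb_def.data_off = RTE_PKTMBUF_HEADROOM;
		mb_def.port = port_id;
		rte_mbuf_refcnt_set(&mb_def, 1);

		p = (uintptr_t)&mb_def.rearm_data;
		return *(uint64_t *)p;
	}

	/*
	 * Rearm one RX mbuf with a single 8-byte store instead of four
	 * narrow stores. On some non-IA CPUs this store is only cheap when
	 * &m->rearm_data is naturally (8-byte) aligned, which is the point
	 * of the patch.
	 */
	static inline void
	rearm_mbuf(struct rte_mbuf *m, uint64_t mbuf_initializer)
	{
		*(uint64_t *)&m->rearm_data = mbuf_initializer;
	}

If I'm reading the pre-patch layout right, rearm_data currently follows the
16-bit buf_len and so lands on a 2-byte boundary rather than an 8-byte one;
either this patch or moving the marker back above buf_len would make that
final store naturally aligned.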
Regards,
/Bruce

Ref: http://dpdk.org/ml/archives/dev/2014-December/009432.html