From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id E252FA0503;
	Fri,  6 May 2022 18:38:17 +0200 (CEST)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 8D6DA40395;
	Fri,  6 May 2022 18:38:17 +0200 (CEST)
Received: from mga05.intel.com (mga05.intel.com [192.55.52.43])
 by mails.dpdk.org (Postfix) with ESMTP id 9F6034014F
 for <dev@dpdk.org>; Fri,  6 May 2022 18:38:16 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
 d=intel.com; i=@intel.com; q=dns/txt; s=Intel;
 t=1651855096; x=1683391096;
 h=date:from:to:cc:subject:message-id:references:
 mime-version:in-reply-to;
 bh=wyuxo6LcnhNT2nOIFRjCnh1hApsIhu3PHFsrHxk5qts=;
 b=d3HWhvHo6lqVLzey9yu1WB7gkLS1xEFoYkUJsnlMPaS4L32tVitHMdiv
 FvGtj/A6hWmm4cysB5EDh/4yEpSVL3lqlhxpVdQDKmn8afaVR/JmzCL0u
 1vrRRIxyr1BDdHRGgR79/haORO9VBtjjaBBhtWBWXGBnpd6Zz4e6LrAiv
 1zsxYLzqtSimiNmnQySg0mWEYoSL05vPCzWtC/9Mo6z4AjZvf/G1Ygymh
 RBSn3OiNCibkuNKmaKA6Uoa0ySU3+Bx4/2we3tDsN731EonzAtS/9N/SX
 dATdnXlhx+dj2bB3hQu5BCs9hHAzZNLo5sI0M5FOPfAb4VsnevEqnnPYR g==;
X-IronPort-AV: E=McAfee;i="6400,9594,10339"; a="354948342"
X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="354948342"
Received: from orsmga001.jf.intel.com ([10.7.209.18])
 by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 06 May 2022 09:38:15 -0700
X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="600626100"
Received: from bricha3-mobl.ger.corp.intel.com ([10.252.0.201])
 by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-SHA;
 06 May 2022 09:38:13 -0700
Date: Fri, 6 May 2022 17:38:10 +0100
From: Bruce Richardson <bruce.richardson@intel.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: Tyler Retzlaff <roretzla@linux.microsoft.com>,
 Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
 "dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>
Subject: Re: [RFC] rte_ring: don't use always inline
Message-ID: <YnVO8nqi2hIcEBv/@bricha3-MOBL.ger.corp.intel.com>
References: <20220505224547.394253-1-stephen@networkplumber.org>
 <DBAPR08MB5814589717AA4AE8B537B8D698C29@DBAPR08MB5814.eurprd08.prod.outlook.com>
 <20220506072434.GA19777@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
 <20220506084112.5bcc3000@hermes.local>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20220506084112.5bcc3000@hermes.local>
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

On Fri, May 06, 2022 at 08:41:12AM -0700, Stephen Hemminger wrote:
> On Fri, 6 May 2022 00:24:34 -0700
> Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
> 
> > On Thu, May 05, 2022 at 10:59:32PM +0000, Honnappa Nagarahalli wrote:
> > > Thanks Stephen. Do you see any performance difference with this change?  
> > 
> > as a matter of due diligence i think a comparison should be made just
> > to be confident nothing is regressing.
> > 
> > i support this change in principal since it is generally accepted best
> > practice to not force inlining since it can remove more valuable
> > optimizations that the compiler may make that the human can't see.
> > the optimizations may vary depending on compiler implementation.
> > 
> > force inlining should be used as a targeted measure rather than blanket
> > on every function and when in use probably needs to be periodically
> > reviewed and potentially removed as the code / compiler evolves.
> > 
> > also one other consideration is the impact of a particular compiler's
> > force inlining intrinsic/builtin is that it may permit inlining of
> > functions when not declared in a header. i.e. a function from one
> > library may be able to be inlined to another binary as a link time
> > optimization. although everything here is in a header so it's a bit
> > moot.
> > 
> > i'd like to see this change go in if possible.
> > 
> > thanks
> > 
> 
> 
> Some quick numbers from Gcc 10.3 and 2.7G AMD and ring_perf_autotest
> 
> Looks like always inline is faster on second run but just inline is
> slightly faster on first run. Maybe the icache gets loaded for second run,
> but on first pass the smaller code size helps.
>
Interesting, I haven't observed that effect.

The main trouble with the unit tests is that there are so many possible
numbers to compare. We probably need to focus on a few to make sense of it
all. Running on an "Intel(R) Xeon(R) Gold 6330N CPU @ 2.20GHz" with Ubuntu
20.04 (GCC 9.4), I scanned through the numbers looking for signicant
percentage differences. This one caught my eye due to the %age difference:

Basline value:
sudo ./build-baseline/app/test/dpdk-test -- ring_perf_autotest | grep "SP/SC: single:"
...
legacy APIs: SP/SC: single: 9.08
elem APIs: element size 16B: SP/SC: single: 11.13

With patch:
sudo ./build/app/test/dpdk-test -- ring_perf_autotest | grep "SP/SC: single:"
...
legacy APIs: SP/SC: single: 15.81
elem APIs: element size 16B: SP/SC: single: 21.14

So the SP/SC element enqueue cost went from 9-11 cycles to 15-21 cycles.
Percentage-wise, this seems a lot, though in absolute terms it may not be.
Therefore, I think we'll need a decent amount of varied testing before
taking this patch.

/Bruce