Date: Fri, 6 May 2022 17:39:22 +0100
From: Bruce Richardson
To: Stephen Hemminger
Cc: Honnappa Nagarahalli, Tyler Retzlaff, "dev@dpdk.org", nd, "Ananyev, Konstantin"
Subject: Re: [RFC] rte_ring: don't use always inline
References: <20220505224547.394253-1-stephen@networkplumber.org>
 <20220506072434.GA19777@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
 <20220506093341.785086a7@hermes.local>
In-Reply-To: <20220506093341.785086a7@hermes.local>
List-Id: DPDK patches and discussions

On Fri, May 06, 2022 at 09:33:41AM -0700, Stephen Hemminger wrote:
> On Fri, 6 May 2022 16:28:41 +0100
> Bruce Richardson wrote:
>
> > On Fri, May 06, 2022 at 03:12:32PM +0000, Honnappa Nagarahalli wrote:
> > >
> > > >
> > > > On Thu, May 05, 2022 at 10:59:32PM +0000, Honnappa Nagarahalli wrote:
> > > > > Thanks Stephen. Do you see any performance difference with this change?
> > > >
> > > > as a matter of due diligence i think a comparison should be made just to be
> > > > confident nothing is regressing.
> > > >
> > > > i support this change in principle since it is generally accepted best practice to
> > > > not force inlining since it can remove more valuable optimizations that the
> > > > compiler may make that the human can't see.
> > > > the optimizations may vary depending on compiler implementation.
> > > >
> > > > force inlining should be used as a targeted measure rather than blanket on
> > > > every function and when in use probably needs to be periodically reviewed and
> > > > potentially removed as the code / compiler evolves.
> > > >
> > > > also one other consideration is that a particular compiler's force inlining
> > > > intrinsic/builtin may permit inlining of functions when not declared in a
> > > > header. i.e. a function from one library may be able to be inlined
> > > > to another binary as a link time optimization. although everything here is in a
> > > > header so it's a bit moot.
> > > >
> > > > i'd like to see this change go in if possible.
> > > Like Stephen mentions below, I am sure we will have a for and against discussion here.
> > > As a DPDK community we have put performance front and center, I would prefer to go down that route first.
> > >
> > >
> >
> > I ran some initial numbers with this patch, and the very quick summary of
> > what I've seen so far:
> >
> > * Unit tests show no major differences, and while it depends on what
> >   specific number you are interested in, most seem within margin of error.
> > * Within unit tests, the one number I mostly look at when considering
> >   inlining is the "empty poll" cost, since I believe we should look to keep
> >   that as close to zero as possible. In the past I've seen that number jump
> >   from 3 cycles to 12 cycles due to missed inlining. In this case, it seems
> >   fine.
> > * Ran a quick test with the eventdev_pipeline example app using SW eventdev,
> >   as a test of an actual app which is fairly ring-heavy [used 8 workers
> >   with 1000 cycles per packet hop]. (Thanks to Harry vH for this suggestion
> >   of a workload)
> >   * GCC 8 build - no difference observed
> >   * GCC 11 build - approx 2% perf reduction observed
> >
> > As I said, these are just some quick rough numbers, and I'll try and get
> > some more numbers on a couple of different platforms, see if the small
> > reduction seen is consistent or not. I may also test a few different
> > combinations/options in the eventdev test. It would be good if others also
> > tested on a few platforms available to them.
> >
> > /Bruce
>
> I wonder if a mixed approach might help where some key bits were marked
> as more important to inline? Or setting compiler flags in build infra?

Yep, could be a number of options.
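
To make the "mixed approach" idea a little more concrete, here is a rough
sketch of what it might look like. This is not taken from the patch and is
not the rte_ring API: every DEMO_/demo_ name is hypothetical, the ring is a
single-threaded toy with no memory ordering, and the only real thing it
relies on is the GCC/Clang always_inline attribute. The tiny count helper
that the empty-poll check depends on stays force-inlined, the burst
functions are left to the compiler unless the build defines
DEMO_RING_FORCE_INLINE (e.g. via -D in the build infra).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macros for illustration only. */
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define DEMO_ALWAYS_INLINE inline
#endif

/* Build-time switch: force everything, or leave it to the compiler. */
#ifdef DEMO_RING_FORCE_INLINE
#define DEMO_RING_INLINE DEMO_ALWAYS_INLINE
#else
#define DEMO_RING_INLINE inline
#endif

#define DEMO_RING_SIZE 8u /* must be a power of two */
#define DEMO_RING_MASK (DEMO_RING_SIZE - 1u)

struct demo_ring {
	uint32_t head; /* free-running producer index */
	uint32_t tail; /* free-running consumer index */
	void *slots[DEMO_RING_SIZE];
};

/* Key bit: small, on every poll, so always force-inlined. */
static DEMO_ALWAYS_INLINE uint32_t
demo_ring_count(const struct demo_ring *r)
{
	return r->head - r->tail;
}

/* Larger body: inlining decision follows the build-time switch. */
static DEMO_RING_INLINE unsigned int
demo_ring_enqueue_burst(struct demo_ring *r, void *const *objs, unsigned int n)
{
	uint32_t free_slots = DEMO_RING_SIZE - demo_ring_count(r);
	unsigned int i;

	if (n > free_slots)
		n = free_slots;
	for (i = 0; i < n; i++)
		r->slots[(r->head + i) & DEMO_RING_MASK] = objs[i];
	r->head += n;
	return n;
}

static DEMO_RING_INLINE unsigned int
demo_ring_dequeue_burst(struct demo_ring *r, void **objs, unsigned int n)
{
	uint32_t avail = demo_ring_count(r);
	unsigned int i;

	if (avail == 0)
		return 0; /* empty poll: the cost we want close to zero */
	if (n > avail)
		n = avail;
	for (i = 0; i < n; i++)
		objs[i] = r->slots[(r->tail + i) & DEMO_RING_MASK];
	r->tail += n;
	return n;
}
```

Whether something like this helps at all would still need measuring on the
same unit tests and eventdev_pipeline workload as above.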