From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id B0A68A0503;
	Fri,  6 May 2022 18:33:46 +0200 (CEST)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 5837940395;
	Fri,  6 May 2022 18:33:46 +0200 (CEST)
Received: from mail-pj1-f47.google.com (mail-pj1-f47.google.com
 [209.85.216.47]) by mails.dpdk.org (Postfix) with ESMTP id 854B64014F
 for <dev@dpdk.org>; Fri,  6 May 2022 18:33:45 +0200 (CEST)
Received: by mail-pj1-f47.google.com with SMTP id
 cu23-20020a17090afa9700b001d98d8e53b7so8138943pjb.0
 for <dev@dpdk.org>; Fri, 06 May 2022 09:33:45 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=networkplumber-org.20210112.gappssmtp.com; s=20210112;
 h=date:from:to:cc:subject:message-id:in-reply-to:references
 :mime-version:content-transfer-encoding;
 bh=XuwOyQIjUf0XYGyhmGZ3r1gt7IxBYU9zPxniNze7SQM=;
 b=eUxw8azaMZWhzAB1WoJpTEwE8k7Wf8ALllWItGMFuBhvy5BSH/SbiJ3D+UA26x4QKJ
 BCIT3lfkbZ82NP+CpADYdVWnrYQHOW79YbWYoUslqcmBmQL1zqPXFeQGnUNSU6GsuWVI
 w7IzGTOKwh0a1OqauibqJAvjVEsIFiSZcWw5KI9jX7+8ZIboXwqPJfCzss72hpoALs40
 j3x52asnoU7Gx11u9XNTIDkiGFIqUO+XOgvD29IRm10OlWRVNLyBSAiwi/q/erJ44oJG
 JG1PyvSCM1apjF3o6NbuEdy0qRBBMEmR4DXIuv8dq5mx2IWZipTiT06B1aOT5WjSsjfZ
 XVUQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to
 :references:mime-version:content-transfer-encoding;
 bh=XuwOyQIjUf0XYGyhmGZ3r1gt7IxBYU9zPxniNze7SQM=;
 b=kzuoVu/Ni6GUk9qGaj2og1i9Pq4eYJAPPHNtE0mKCbUQJcH6cQy9l/4bCCm0djlmXr
 PKEHt9q64V/3mLxOWB8OvA0fzCuFHjhRk8WQkov+FF7UM0MO9FVNbAaCbJe2+a0BXV04
 2VoDVDUBXrXfYa7rDNz+LJxM/3J01jmAIBVlsTzoXPM9nO3rd2vEMZLG0QfxFbsz+qYE
 GTadTCu4FR0v0dUXEq/nHpgdaBbb+NxYMwxZ27sc9xw5dS3raFIWRFQtuNVokG+Si4ot
 M/ePt0uKnD1N7AeSqQA3AO6Uie18wnsQqaVDDtF1+X7OXHEbwIYY0bjx6Udy1W+nCodG
 j5UQ==
X-Gm-Message-State: AOAM5303MfnfV4JIvUjhsz38RVpELv5SzzaNBIzDixrtKoAd1VwH32Ww
 GZm1UOuQz7gfoGKo+7pGCS/RCA==
X-Google-Smtp-Source: ABdhPJzdytMD3yP8HoaVfixGbRoUp39cvPJryBcoYbD8vdwQDgT6ayn30DOp/q0IM5uNMb2+TtF62w==
X-Received: by 2002:a17:902:e5cc:b0:15e:84d0:decb with SMTP id
 u12-20020a170902e5cc00b0015e84d0decbmr4349842plf.91.1651854824484; 
 Fri, 06 May 2022 09:33:44 -0700 (PDT)
Received: from hermes.local (204-195-112-199.wavecable.com. [204.195.112.199])
 by smtp.gmail.com with ESMTPSA id
 e21-20020a170902d39500b0015e8d4eb2aesm1943566pld.248.2022.05.06.09.33.43
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Fri, 06 May 2022 09:33:44 -0700 (PDT)
Date: Fri, 6 May 2022 09:33:41 -0700
From: Stephen Hemminger <stephen@networkplumber.org>
To: Bruce Richardson <bruce.richardson@intel.com>
Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>, Tyler Retzlaff
 <roretzla@linux.microsoft.com>, "dev@dpdk.org" <dev@dpdk.org>, nd
 <nd@arm.com>, "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Subject: Re: [RFC] rte_ring: don't use always inline
Message-ID: <20220506093341.785086a7@hermes.local>
In-Reply-To: <YnU+qQu+Rkqu+WM1@bricha3-MOBL.ger.corp.intel.com>
References: <20220505224547.394253-1-stephen@networkplumber.org>
 <DBAPR08MB5814589717AA4AE8B537B8D698C29@DBAPR08MB5814.eurprd08.prod.outlook.com>
 <20220506072434.GA19777@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
 <DBAPR08MB58148A2387457A31E13B5C1D98C59@DBAPR08MB5814.eurprd08.prod.outlook.com>
 <YnU+qQu+Rkqu+WM1@bricha3-MOBL.ger.corp.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

On Fri, 6 May 2022 16:28:41 +0100
Bruce Richardson <bruce.richardson@intel.com> wrote:

> On Fri, May 06, 2022 at 03:12:32PM +0000, Honnappa Nagarahalli wrote:
> > <snip>  
> > > 
> > > On Thu, May 05, 2022 at 10:59:32PM +0000, Honnappa Nagarahalli wrote:  
> > > > Thanks Stephen. Do you see any performance difference with this change?  
> > > 
> > > as a matter of due diligence i think a comparison should be made just to be
> > > confident nothing is regressing.
> > > 
> > > i support this change in principle since it is generally accepted best practice to
> > > not force inlining since it can remove more valuable optimizations that the
> > > compiler may make that the human can't see.
> > > the optimizations may vary depending on compiler implementation.
> > > 
> > > force inlining should be used as a targeted measure rather than blanket on
> > > every function and when in use probably needs to be periodically reviewed and
> > > potentially removed as the code / compiler evolves.
> > > 
> > > also one other consideration is the impact of a particular compiler's force
> > > inlining intrinsic/builtin is that it may permit inlining of functions when not
> > > declared in a header. i.e. a function from one library may be able to be inlined
> > > to another binary as a link time optimization. although everything here is in a
> > > header so it's a bit moot.
> > > 
> > > i'd like to see this change go in if possible.  
> > Like Stephen mentions below, I am sure we will have a for and against discussion here.
> > As a DPDK community we have put performance front and center, I would prefer to go down that route first.
> >  
> 
> I ran some initial numbers with this patch, and the very quick summary of
> what I've seen so far:
> 
> * Unit tests show no major differences, and while it depends on what
>   specific number you are interested in, most seem within margin of error.
> * Within unit tests, the one number I mostly look at when considering
>   inlining is the "empty poll" cost, since I believe we should look to keep
>   that as close to zero as possible. In the past I've seen that number jump
>   from 3 cycles to 12 cycles due to missed inlining. In this case, it seems
>   fine.
> * Ran a quick test with the eventdev_pipeline example app using SW eventdev,
>   as a test of an actual app which is fairly ring-heavy [used 8 workers
>   with 1000 cycles per packet hop]. (Thanks to Harry vH for this suggestion
>   of a workload)
>   * GCC 8 build - no difference observed
>   * GCC 11 build - approx 2% perf reduction observed
> 
> As I said, these are just some quick rough numbers, and I'll try and get
> some more numbers on a couple of different platforms, see if the small
> reduction seen is consistent or not. I may also test a few different
> combinations/options in the eventdev test.  It would be good if others also
> tested on a few platforms available to them.
> 
> /Bruce

I wonder if a mixed approach might help, where only a few key bits are
marked as more important to inline and the rest is left to the compiler?
Or could we set compiler flags in the build infra instead?
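
To make the mixed approach concrete, here is a rough sketch. It is not
the rte_ring code: the demo_ring type, the HOT_INLINE macro and all
function names are made up for illustration, and the ring is
single-threaded with no atomics. The idea is only that the empty check
(the "empty poll" path Bruce measures) is forced inline, while enqueue
and dequeue are plain static inline and left to the compiler's
heuristics.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical macro for this sketch only. DPDK has its own
 * __rte_always_inline; this stands in for it so the example
 * builds without DPDK headers. */
#if defined(__GNUC__) || defined(__clang__)
#define HOT_INLINE static inline __attribute__((always_inline))
#else
#define HOT_INLINE static inline
#endif

#define DEMO_RING_SIZE 8u /* must be a power of two */

/* Toy single-producer/single-consumer ring, no atomics or barriers. */
struct demo_ring {
	uint32_t head;               /* next slot to write */
	uint32_t tail;               /* next slot to read */
	void *slots[DEMO_RING_SIZE];
};

/* Key bit: the empty check is forced inline. */
HOT_INLINE bool
demo_ring_empty(const struct demo_ring *r)
{
	return r->head == r->tail;
}

/* Not forced: the compiler decides whether inlining pays off. */
static inline unsigned int
demo_ring_enqueue(struct demo_ring *r, void *obj)
{
	if (r->head - r->tail == DEMO_RING_SIZE)
		return 0; /* ring is full */
	r->slots[r->head & (DEMO_RING_SIZE - 1)] = obj;
	r->head++;
	return 1;
}

/* Not forced either, but its empty check still inlines. */
static inline unsigned int
demo_ring_dequeue(struct demo_ring *r, void **obj)
{
	if (demo_ring_empty(r))
		return 0; /* nothing to dequeue */
	*obj = r->slots[r->tail & (DEMO_RING_SIZE - 1)];
	r->tail++;
	return 1;
}
```

Which functions deserve the forced attribute would have to come out of
measurements like the ones above, not out of this sketch.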