From: Mattias Rönnblom
To: Morten Brørup, Maxime Coquelin, Stephen Hemminger, Andrey Ignatov
Cc: dev@dpdk.org, Chenbo Xia, Wei Shen, techboard@dpdk.org
Subject: Re: The effect of inlining
Date: Mon, 1 Apr 2024 17:20:07 +0200
References: <20240328233338.56544-1-rdna@apple.com> <20240328164426.5b600cd1@hermes.local> <20240328195353.0dc838be@hermes.local> <319d86b3-c860-4231-b263-732aa4051531@redhat.com> <98CBD80474FA8B44BF855DF32C47DC35E9F33E@smartserver.smartshare.dk>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9F33E@smartserver.smartshare.dk>
List-Id: DPDK patches and discussions

On 2024-03-29 14:42, Morten Brørup wrote:
> +CC techboard
>
>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>> Sent: Friday, 29 March 2024 14.05
>>
>> Hi Stephen,
>>
>> On 3/29/24 03:53, Stephen Hemminger wrote:
>>> On Thu, 28 Mar 2024 17:10:42 -0700
>>> Andrey Ignatov wrote:
>>>
>>>>>
>>>>> You don't need always inline, the compiler will do it anyway.
>>>>
>>>> I can remove it in v2, but it's not completely obvious to me how is it
>>>> decided when to specify it explicitly and when not?
>>>>
>>>> I see plenty of __rte_always_inline in this file:
>>>>
>>>> % git grep -c '^static __rte_always_inline' lib/vhost/virtio_net.c
>>>> lib/vhost/virtio_net.c:66
>>>
>>>
>>> Cargo cult really.
>>>
>>
>> Cargo cult... really?
>>
>> Well, I just did a quick test by comparing IO forwarding with testpmd
>> between main branch and with adding a patch that removes all the
>> inline/noinline in lib/vhost/virtio_net.c [0].
>>
>> main branch: 14.63Mpps
>> main branch - inline/noinline: 10.24Mpps
>
> Thank you for testing this, Maxime. Very interesting!
>
> It is sometimes suggested on techboard meetings that we should convert more inline functions to non-inline for improved API/ABI stability, with the argument that the performance of inlining is negligible.
>

I think you are mixing two different (but related) things here.
1) marking functions with the inline family of keywords/attributes
2) keeping function definitions in header files

1) does not affect the ABI, while 2) does. Neither 1) nor 2) affects the
API (i.e., source-level compatibility).

2) *allows* for function inlining even in non-LTO builds, but doesn't
force it.

If you don't believe 2) makes a difference performance-wise, it follows
that you also don't believe LTO makes much of a difference. Both have
the same effect: allowing the compiler to reason over a larger chunk of
your program.

Allowing the compiler to inline small, often-called functions is crucial
for performance, in my experience. If the target symbol tends to be in a
shared object, the difference is even larger. It's also quite common
that you see no effect of LTO (other than a reduction of code footprint).

As LTO becomes more practical to use, 2) loses much of its appeal.

If PGO ever becomes practical to use, maybe 1) will as well.

> I think this test proves that the sum of many small (negligible) performance differences is not negligible!
>
>>
>> Andrey, thanks for the patch, I'll have a look at it next week.
>>
>> Maxime
>>
>> [0]: https://pastebin.com/72P2npZ0
>