Message-ID: <0a8dd454-976c-4f17-a870-09ba2d90c717@lysator.liu.se>
Date: Fri, 13 Sep 2024 08:47:29 +0200
Subject: Re: [PATCH v3 3/7] eal: add lcore variable performance test
From: Mattias Rönnblom
To: Jerin Jacob
Cc: Mattias Rönnblom, dev@dpdk.org, Morten Brørup, Stephen Hemminger, Konstantin Ananyev, David Marchand, Jerin Jacob
References: <20240911170430.701685-2-mattias.ronnblom@ericsson.com> <20240912084429.703405-1-mattias.ronnblom@ericsson.com> <20240912084429.703405-4-mattias.ronnblom@ericsson.com> <88a778d3-e157-41cd-9da7-2d06864a654d@lysator.liu.se>

On 2024-09-12 17:11, Jerin Jacob wrote:
> On Thu, Sep 12, 2024 at 6:50 PM Mattias Rönnblom wrote:
>>
>> On 2024-09-12 15:09, Jerin Jacob wrote:
>>> On Thu, Sep 12, 2024 at 2:34 PM Mattias Rönnblom
>>> wrote:
>>>>
>>>> Add basic micro benchmark for lcore variables, in an attempt to assure
>>>> that the overhead isn't significantly greater than alternative
>>>> approaches, in scenarios where the benefits aren't expected to show up
>>>> (i.e., when plenty of cache is available compared to the working set
>>>> size of the per-lcore data).
>>>>
>>>> Signed-off-by: Mattias Rönnblom
>>>> ---
>>>> app/test/meson.build | 1 +
>>>> app/test/test_lcore_var_perf.c | 160 +++++++++++++++++++++++++++++++++
>>>> 2 files changed, 161 insertions(+)
>>>> create mode 100644 app/test/test_lcore_var_perf.c
>>>
>>>
>>>> +static double
>>>> +benchmark_access_method(void (*init_fun)(void), void (*update_fun)(void))
>>>> +{
>>>> +	uint64_t i;
>>>> +	uint64_t start;
>>>> +	uint64_t end;
>>>> +	double latency;
>>>> +
>>>> +	init_fun();
>>>> +
>>>> +	start = rte_get_timer_cycles();
>>>> +
>>>> +	for (i = 0; i < ITERATIONS; i++)
>>>> +		update_fun();
>>>> +
>>>> +	end = rte_get_timer_cycles();
>>>
>>> Use precise variant. rte_rdtsc_precise() or so to be accurate
>>
>> With 1e7 iterations, do you need rte_rdtsc_precise()? I suspect not.
>
> I was thinking in another way, with 1e7 iteration, the additional
> barrier on precise will be amortized, and we get more _deterministic_
> behavior e.s.p in case if we print cycles and if we need to catch
> regressions.

If you time a section of code which spends ~40000000 cycles, it doesn't
matter if you add or remove a few cycles at the beginning and the end.

The rte_rdtsc_precise() is both better (more precise in the sense of
more serialization), and worse (because it's more costly, and thus more
intrusive).

You can use rte_rdtsc_precise(), rte_rdtsc(), or gettimeofday(). It
doesn't matter.

> Furthermore, you may consider replacing rte_random() in fast path to
> running number or so if it is not deterministic in cycle computation.

rte_rand() is not used in the fast path. I don't understand what you
mean by "running number".
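
To make the amortization argument concrete, here is a minimal sketch in
plain C. It is not the code from the patch: clock_gettime() stands in for
rte_get_timer_cycles() so the example builds without DPDK, and
dummy_update(), now_ns(), benchmark() and timer_overhead_fraction() are
hypothetical names made up for illustration. The 100-cycle timer overhead
used in the note after the code is likewise an assumed, deliberately
pessimistic figure.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() under strict -std= modes */
#endif

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the per-lcore state updated in the fast path. */
static volatile uint64_t counter;

static void
dummy_update(void)
{
	counter++;
}

/*
 * Monotonic clock in nanoseconds. Assumption: used here in place of
 * rte_get_timer_cycles() only to keep the sketch free of DPDK headers.
 */
static uint64_t
now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Time 'iterations' calls of update_fun() with a single start/stop pair
 * and return the average latency per call, in nanoseconds. The cost of
 * the two clock reads is spread across all iterations.
 */
static double
benchmark(void (*update_fun)(void), uint64_t iterations)
{
	uint64_t i;
	uint64_t start;
	uint64_t end;

	assert(iterations > 0);

	start = now_ns();

	for (i = 0; i < iterations; i++)
		update_fun();

	end = now_ns();

	return (double)(end - start) / (double)iterations;
}

/*
 * Fraction of a measured interval that a fixed start/stop overhead
 * represents. Both arguments must use the same unit (e.g., cycles).
 */
static double
timer_overhead_fraction(double overhead, double total)
{
	return overhead / total;
}
```

With the figure from above, timer_overhead_fraction(100.0, 40000000.0)
is 2.5e-06: even an assumed hundred cycles of extra start/stop cost moves
the result by a few parts per million, which is why the choice of timer
call has no practical effect on a loop of this length.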