From: Mattias Rönnblom
Date: Fri, 2 Sep 2022 19:17:58 +0200
Subject: Re: [PATCH v3 1/2] test/service: add perf measurements for with stats mode
To: Harry van Haaren, dev@dpdk.org
Cc: Mattias Rönnblom, Honnappa Nagarahalli, Morten Brørup
Message-ID: <66cf8b28-ab44-8eff-3e9a-cc5b37f2bc6f@lysator.liu.se>
In-Reply-To: <20220711131825.3373195-1-harry.van.haaren@intel.com>

On 2022-07-11 15:18, Harry van Haaren wrote:
> This commit improves the performance reporting of the service
> cores polling loop to show both with and without statistics
> collection modes. Collecting cycle statistics is costly, due
> to calls to rte_rdtsc() per service iteration.

That is true for a service deployed on only a single core. For
multi-core services, non-rdtsc-related overhead dominates, since every
lcore mapped to the service updates the same shared statistics. For
example, if the service is deployed on 11 cores, the extra
statistics-related overhead is ~1000 cc/service call on x86_64, whereas
2x rdtsc shouldn't cost more than ~50 cc. (A sketch illustrating this
is included further down.)

>
> Reported-by: Mattias Rönnblom
> Suggested-by: Honnappa Nagarahalli
> Suggested-by: Morten Brørup
> Signed-off-by: Harry van Haaren
>
> ---
>
> This is split out as a separate patch from the fix to allow
> measuring the before/after of the service stats atomic fixup.
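To put some code behind the comment above: here is a minimal,
stand-alone sketch (my own illustration, not the rte_service.c
implementation; x86-only, threads left unpinned, and all names such as
service_stats and worker are made up) of N cores updating one shared
statistics struct. The two relaxed atomic adds per call bounce the
stats cache line between the cores, and that contention, not the two
rdtsc reads, is presumably where most of the ~1000 cc/call goes.

#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_CORES 11
#define CALLS_PER_CORE (1 << 20)

/* One stats struct shared by all cores running the service. */
struct service_stats {
	_Atomic uint64_t calls;
	_Atomic uint64_t cycles;
};

static struct service_stats shared_stats;

static inline uint64_t
rdtsc(void)
{
	uint32_t lo, hi;

	__asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
	return ((uint64_t)hi << 32) | lo;
}

static void *
worker(void *arg)
{
	(void)arg;

	for (uint32_t i = 0; i < CALLS_PER_CORE; i++) {
		uint64_t start = rdtsc();
		/* ...the service callback would run here... */
		uint64_t diff = rdtsc() - start;

		/* The contended part: two RMWs on a shared cache line. */
		atomic_fetch_add_explicit(&shared_stats.calls, 1,
				memory_order_relaxed);
		atomic_fetch_add_explicit(&shared_stats.cycles, diff,
				memory_order_relaxed);
	}
	return NULL;
}

int
main(void)
{
	pthread_t threads[NUM_CORES];
	int i;

	uint64_t start = rdtsc();
	for (i = 0; i < NUM_CORES; i++)
		pthread_create(&threads[i], NULL, worker, NULL);
	for (i = 0; i < NUM_CORES; i++)
		pthread_join(threads[i], NULL);

	/* All threads run concurrently, so wall-clock cycles divided
	 * by per-core calls approximates the per-call cost. */
	printf("~%.0f cycles/call\n",
			(double)(rdtsc() - start) / CALLS_PER_CORE);
	return 0;
}

Build with "gcc -O2 -pthread". Giving each core its own stats struct
(on its own cache line) makes most of that overhead disappear, which is
essentially what keeping the statistics per lcore buys you.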
> ---
>  app/test/test_service_cores.c | 36 ++++++++++++++++++++++++-----------
>  1 file changed, 25 insertions(+), 11 deletions(-)
>
> diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
> index ced6ed0081..7415b6b686 100644
> --- a/app/test/test_service_cores.c
> +++ b/app/test/test_service_cores.c
> @@ -777,6 +777,22 @@ service_run_on_app_core_func(void *arg)
>  	return rte_service_run_iter_on_app_lcore(*delay_service_id, 1);
>  }
>
> +static float
> +service_app_lcore_perf_measure(uint32_t id)
> +{
> +	/* Performance test: call in a loop, and measure tsc() */
> +	const uint32_t perf_iters = (1 << 12);
> +	uint64_t start = rte_rdtsc();
> +	uint32_t i;
> +	for (i = 0; i < perf_iters; i++) {
> +		int err = service_run_on_app_core_func(&id);

In a real-world scenario, the latency of this function isn't
representative of the overall service core overhead. Consider, for
example, an lcore with a single service mapped to it: rte_service.c
will call service_run() 64 times per loop iteration, but only one of
those calls is a "hit" that actually runs the service. One iteration of
the service loop costs ~600 cc on a machine where this benchmark
reports 128 cc (both with statistics disabled). For low-latency
services, that is a significant overhead. (A sketch of this loop shape
follows after the quoted diff.)

> +		TEST_ASSERT_EQUAL(0, err, "perf test: returned run failure");
> +	}
> +	uint64_t end = rte_rdtsc();
> +
> +	return (end - start)/(float)perf_iters;
> +}
> +
>  static int
>  service_app_lcore_poll_impl(const int mt_safe)
>  {
> @@ -828,17 +844,15 @@ service_app_lcore_poll_impl(const int mt_safe)
>  			"MT Unsafe: App core1 didn't return -EBUSY");
>  	}
>
> -	/* Performance test: call in a loop, and measure tsc() */
> -	const uint32_t perf_iters = (1 << 12);
> -	uint64_t start = rte_rdtsc();
> -	uint32_t i;
> -	for (i = 0; i < perf_iters; i++) {
> -		int err = service_run_on_app_core_func(&id);
> -		TEST_ASSERT_EQUAL(0, err, "perf test: returned run failure");
> -	}
> -	uint64_t end = rte_rdtsc();
> -	printf("perf test for %s: %0.1f cycles per call\n", mt_safe ?
> -			"MT Safe" : "MT Unsafe", (end - start)/(float)perf_iters);
> +	/* Measure performance of no-stats and with-stats. */
> +	float cyc_no_stats = service_app_lcore_perf_measure(id);
> +
> +	TEST_ASSERT_EQUAL(0, rte_service_set_stats_enable(id, 1),
> +			"failed to enable stats for service.");
> +	float cyc_with_stats = service_app_lcore_perf_measure(id);
> +
> +	printf("perf test for %s, no stats: %0.1f, with stats %0.1f cycles/call\n",
> +		mt_safe ? "MT Safe" : "MT Unsafe", cyc_no_stats, cyc_with_stats);
>
>  	unregister_all();
>  	return TEST_SUCCESS;
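For reference, a rough sketch of the service loop shape my comment
above refers to (not the actual rte_service.c code; the struct layout
and names are hypothetical, and all bookkeeping is omitted):

#include <stdint.h>

#define RTE_SERVICE_NUM_MAX 64 /* as in rte_service.h */

struct service {
	void (*callback)(void *args);
	void *args;
};

static struct service services[RTE_SERVICE_NUM_MAX];

/*
 * One iteration of the per-lcore service loop: all 64 slots are
 * visited, but with a single mapped service only one is a "hit".
 * The 63 misses are cheap individually, but together they are why a
 * full loop iteration costs ~600 cc while the run_iter path this
 * benchmark exercises reports ~128 cc.
 */
static void
service_loop_iter(uint64_t mapped_mask)
{
	uint32_t i;

	for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
		if (!(mapped_mask & (1ULL << i)))
			continue; /* miss: not mapped to this lcore */
		services[i].callback(services[i].args);
	}
}

static void
low_latency_service(void *args)
{
	(void)args; /* a short service body; loop overhead dominates */
}

int
main(void)
{
	services[5].callback = low_latency_service;
	service_loop_iter(1ULL << 5); /* one mapped service, 64 slots */
	return 0;
}

A per-call benchmark like the one in this patch measures only the hit
path, so it understates the per-service-call cost a deployed
low-latency service actually pays.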