From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 026D3A0544; Fri, 23 Sep 2022 15:08:06 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A08664003F; Fri, 23 Sep 2022 15:08:06 +0200 (CEST) Received: from smartserver.smartsharesystems.com (smartserver.smartsharesystems.com [77.243.40.215]) by mails.dpdk.org (Postfix) with ESMTP id 37BAF4003C for ; Fri, 23 Sep 2022 15:08:05 +0200 (CEST) X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Subject: O/S noise histogram Date: Fri, 23 Sep 2022 15:08:04 +0200 Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D87352@smartserver.smartshare.dk> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: O/S noise histogram Thread-Index: AdjPTYQYZpaLJm/mTGGyiWOo4w9Sfg== From: =?iso-8859-1?Q?Morten_Br=F8rup?= To: Cc: =?iso-8859-1?Q?Felix_M=F6=DFbauer?= , "Henning Schild" X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Dear DPDK community, We have been using the code below to measure O/S noise affecting our = data plane threads, so we can tune the kernel parameters. The = measurement method is based on dedicating an lcore to run the = experimental_thread() function, which repeatedly reads the TSC and = stores the difference between each TSC interval in a histogram. Someone might find it useful, so I thought I would share. It might also = be a neat companion to the l2reflect application, to show how much of = the jitter could possibly be attributed to noise from the O/S itself. Output looks like this: experimental cycles_hz=3D3407991799 cycles=3D[ 64.. 127] ns=3D[ 19.. 37] cycles_avg=3D = 80 ns_avg=3D 24, calls=3D102974652359, focus_calls=3D100.00%, = cycles=3D8261507529060, focus_cycles=3D100.00% cycles=3D[ 128.. 255] ns=3D[ 38.. 75] cycles_avg=3D = 243 ns_avg=3D 71, calls=3D126867, focus_calls=3D0.00%, = cycles=3D30767472, focus_cycles=3D0.00% cycles=3D[ 256.. 511] ns=3D[ 75.. 150] cycles_avg=3D = 341 ns_avg=3D 100, calls=3D24751, focus_calls=3D0.00%, = cycles=3D8437950, focus_cycles=3D0.00% cycles=3D[ 512.. 1023] ns=3D[ 150.. 300] cycles_avg=3D = 597 ns_avg=3D 175, calls=3D3996, focus_calls=3D0.00%, = cycles=3D2385340, focus_cycles=3D0.00% cycles=3D[ 1024.. 2047] ns=3D[ 300.. 601] cycles_avg=3D = 1312 ns_avg=3D 385, calls=3D190, focus_calls=3D0.00%, = cycles=3D249302, focus_cycles=3D0.00% cycles=3D[ 2048.. 4095] ns=3D[ 601.. 1202] cycles_avg=3D = 2456 ns_avg=3D 721, calls=3D27, focus_calls=3D0.00%, = cycles=3D66302, focus_cycles=3D0.00% cycles=3D[ 4096.. 8191] ns=3D[ 1202.. 2403] cycles_avg=3D = 7474 ns_avg=3D 2193, calls=3D1941, focus_calls=3D0.00%, = cycles=3D14507354, focus_cycles=3D0.00% cycles=3D[ 8192.. 16383] ns=3D[ 2404.. 4807] cycles_avg=3D = 8592 ns_avg=3D 2521, calls=3D490, focus_calls=3D0.00%, = cycles=3D4209962, focus_cycles=3D0.00% bool control_running =3D true; /* Set to false to stop the threads. */ static uint64_t experimental_calls[65] __rte_cache_aligned; static uint64_t experimental_cycles[65] __rte_cache_aligned; void experimental_thread(void); void experimental_thread(void) { uint64_t tsc_now; uint64_t tsc_before; uint64_t diff; int msb_pos; printf("experimental_thread running on lcore %u\n", rte_lcore_id()); /* Warm up the cache to avoid cache misses. */ memset(experimental_calls, 0, sizeof(experimental_calls)); memset(experimental_cycles, 0, sizeof(experimental_cycles)); tsc_before =3D rte_rdtsc_precise(); while (control_running) { tsc_now =3D rte_rdtsc_precise(); diff =3D tsc_now - tsc_before; tsc_before =3D tsc_now; msb_pos =3D rte_fls_u64(diff); /* Return the last = (most-significant) bit set: 64..1, or 0 if none. */ experimental_calls[msb_pos]++; experimental_cycles[msb_pos] +=3D diff; } } void experimental_dump(FILE *f); void experimental_dump(FILE *f) { unsigned int idx; uint64_t from =3D 0, to =3D 0; uint64_t calls =3D 0; uint64_t cycles =3D 0; uint64_t all_calls =3D 0; uint64_t all_cycles =3D 0; for (idx =3D 0; idx <=3D 64; idx++) { all_calls +=3D experimental_calls[idx]; all_cycles +=3D experimental_cycles[idx]; } fprintf(f, "experimental\n"); fprintf(f, " cycles_hz=3D%"PRIu64"\n", rte_get_tsc_hz()); for (idx =3D 0; idx <=3D 64; idx++) { calls =3D experimental_calls[idx]; cycles =3D experimental_cycles[idx]; if (calls =3D=3D 0) continue; if (idx !=3D 0) { from =3D 1 << (idx - 1); to =3D from | (from - 1); } fprintf(f, " cycles=3D[%8"PRIu64"..%8"PRIu64"] = ns=3D[%8.0f..%8.0f] cycles_avg=3D%8"PRIu64" ns_avg=3D%8.0f, = calls=3D%"PRIu64", focus_calls=3D%.02f%%, cycles=3D%"PRIu64", = focus_cycles=3D%.02f%%\n", from, to, (float)from * (float)1000000000 / = (float)rte_get_tsc_hz(), (float)to * (float)1000000000 / (float)rte_get_tsc_hz(), (cycles + calls / 2) / calls, (float)cycles / (float)calls * (float)1000000000 / = (float)rte_get_tsc_hz(), calls, (float)calls / (float)all_calls * (float)100, cycles, (float)cycles / (float)all_cycles * (float)100); } Med venlig hilsen / Kind regards, -Morten Br=F8rup