From mboxrd@z Thu Jan  1 00:00:00 1970
From: Kevin Laatz
Date: Mon, 29 Aug 2022 14:16:34 +0100
Subject: Re: [PATCH v3 1/3] eal: add lcore poll busyness telemetry
To: Mattias Rönnblom
Cc: Conor Walsh, David Hunt, Bruce Richardson, Nicolas Chautru, Fan Zhang,
 Ashish Gupta, Akhil Goyal, Chengwen Feng, Ray Kinsella, Thomas Monjalon,
 Ferruh Yigit, Andrew Rybchenko, Jerin Jacob, Sachin Saxena,
 Hemant Agrawal, Ori Kam, Honnappa Nagarahalli, Konstantin Ananyev
Message-ID: <0813cdfe-47ac-45ce-8f33-2ed0d4f19d45@intel.com>
References: <24c49429394294cfbf0d9c506b205029bac77c8b.1657890378.git.anatoly.burakov@intel.com>
 <20220825152852.1231849-1-kevin.laatz@intel.com>
 <20220825152852.1231849-2-kevin.laatz@intel.com>
Content-Type: text/plain; charset="UTF-8"; format=flowed
X-BeenThere: dev@dpdk.org
List-Id: DPDK patches and discussions

On 26/08/2022 23:06, Mattias Rönnblom wrote:
> On 2022-08-25 17:28, Kevin Laatz wrote:
>> From: Anatoly Burakov
>>
>> Currently, there is no way to measure lcore poll busyness in a passive
>> way, without any modifications to the application. This patch adds a
>> new EAL API that will be able to passively track core polling busyness.
>
> There's no generic way, but the DSW event device keeps track of lcore
> utilization (i.e., the fraction of cycles used to perform actual work,
> as opposed to just polling empty queues), and it does so with the same
> basic principles as, from what it seems after a quick look, are used in
> this patch.
>
>>
>> The poll busyness is calculated by relying on the fact that most DPDK
>> APIs will poll for packets. Empty polls can be counted as "idle", while
>
> Lcore worker threads poll for work. Packets, timeouts, completions,
> event device events, etc.
Yes, the wording was too restrictive here - the patch includes changes to
drivers and libraries such as dmadev, eventdev, ring, etc. that poll for
work and would want to mark it as "idle" or "busy".

>
>> non-empty polls can be counted as busy. To measure lcore poll
>> busyness, we
>
> I guess what is meant here is that cycles spent after non-empty polls
> can be counted as busy (useful) cycles? Potentially including the
> cycles spent for the actual poll operation. ("Poll busyness" is a very
> vague term, in my opinion.)
>
> Similarly, cycles spent after an empty poll would not be counted.

Correct, the generic functionality works this way. Any cycles between a
"busy poll" and the next "idle poll" will be counted as busy/useful work
(and vice versa).

>
>> simply call the telemetry timestamping function with the number of
>> polls a particular code section has processed, and count the number of
>> cycles we've spent processing empty bursts. The more empty bursts we
>> encounter, the fewer cycles we spend in the "busy" state, and the less
>> core poll busyness will be reported.
>>
>
> Is this the same scheme as DSW? Where a non-zero burst in the idle
> state means a transition from idle to busy? And a zero-burst poll in
> the busy state means a transition from busy to idle?
>
> The issue with this scheme is that you might potentially end up with a
> state transition for every iteration of the application's main loop,
> if packets (or other items of work) only come in on one of the lcore's
> potentially many RX queues (or other input queues, such as eventdev
> ports). That means an rdtsc for every loop, which isn't too bad, but
> still might be noticeable.
>
> An application that gathers items of work from multiple sources before
> actually doing anything breaks this model. For example, consider an
> lcore worker owning two RX queues, performing rte_eth_rx_burst() on
> both, before attempting to process any of the received packets.
> If the last poll is empty, the cycles spent will be considered idle,
> even though they were busy.
>
> An lcore worker might also decide to poll the same RX queue multiple
> times (until it hits an empty poll, or reaches some high upper bound),
> before initiating processing of the packets.

Yes, more complex applications will need to be modified to gain a more
fine-grained busyness metric. In order to achieve this level of
accuracy, application context is required. The
'RTE_LCORE_POLL_BUSYNESS_TIMESTAMP()' macro can be used within the
application to mark sections as "busy" or "not busy" to do so. Using
your example above, the application could keep track of multiple bursts
(whether they have work or not) and call the macro before initiating the
processing to signal that there is, in fact, work to be done. There's a
section in the documentation update in this patchset that describes it.
It might need more work if it's not clear :-)

>
> I didn't read your code in detail, so I might be jumping to conclusions.
>
>> In order for all of the above to work without modifications to the
>> application, the library code needs to be instrumented with calls to
>> the lcore telemetry busyness timestamping function. The following
>> parts of DPDK are instrumented with lcore telemetry calls:
>>
>> - All major driver APIs:
>>    - ethdev
>>    - cryptodev
>>    - compressdev
>>    - regexdev
>>    - bbdev
>>    - rawdev
>>    - eventdev
>>    - dmadev
>> - Some additional libraries:
>>    - ring
>>    - distributor
>
> In the past, I've suggested this kind of functionality should go into
> the service framework instead, with the service function explicitly
> signaling whether or not the cycles were spent on something useful.
>
> That seems to me like a more straightforward and more accurate
> solution, but it does require the application to deploy everything as
> services, and also requires a change to the service function signature.
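For what it's worth, the accounting described above (cycles between a
non-empty poll and the next empty poll are credited as busy, and vice
versa), together with the gather-then-mark pattern for multi-queue
workers, can be sketched in standalone C. All names below are
illustrative stand-ins, not the actual EAL implementation from this
patchset:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins only; not the EAL implementation. */
struct poll_acct {
	uint64_t last_ts;   /* timestamp of the previous report (0 = none) */
	bool last_busy;     /* did the previous report carry work? */
	uint64_t busy_cycles;
	uint64_t idle_cycles;
};

/* Stand-in for the timestamping call: credit the cycles elapsed since
 * the previous report to the state the lcore was in, then record the
 * new state based on whether this report carried any work. */
static void
poll_timestamp(struct poll_acct *a, uint64_t now, unsigned int nb_work)
{
	if (a->last_ts != 0) {
		uint64_t delta = now - a->last_ts;

		if (a->last_busy)
			a->busy_cycles += delta;
		else
			a->idle_cycles += delta;
	}
	a->last_ts = now;
	a->last_busy = (nb_work > 0);
}

/* The gather-then-mark pattern: poll every queue first, then report the
 * combined burst once, so a trailing empty poll cannot flag genuinely
 * busy cycles as idle. rx_q0/rx_q1 stand in for per-queue poll results. */
static unsigned int
gather_and_mark(struct poll_acct *a, uint64_t now,
		unsigned int rx_q0, unsigned int rx_q1)
{
	unsigned int total = rx_q0 + rx_q1;

	poll_timestamp(a, now, total);
	return total;
}
```

With per-queue reporting, an empty poll on the second queue would flip
the state to idle; reporting the combined count once avoids that.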
>
>> To avoid performance impact from having lcore telemetry support, a
>> global variable is exported by EAL, and a call to the timestamping
>> function is wrapped into a macro, so that whenever telemetry is
>> disabled, it only takes one
>
> Use a static inline function if you don't need the additional
> expressive power of a macro.
>
> I suggest you also mention the performance implications, when this
> function is enabled.

Sure, I can add a note in the next revision.

>
>> additional branch and no function calls are performed. It is also
>> possible to disable it at compile time by commenting out
>> RTE_LCORE_BUSYNESS from the build config.
>>
>> This patch also adds a telemetry endpoint to report lcore poll
>> busyness, as well as telemetry endpoints to enable/disable lcore
>> telemetry. A documentation entry has been added to the howto guides to
>> explain the usage of the new telemetry endpoints and API.
>>
>
> Should there really be a dependency from the EAL to the telemetry
> library? A cycle. Maybe some dependency inversion would be in order?
> The telemetry library could instead register an interest in getting
> busy/idle cycle reports from lcores.
>
>> Signed-off-by: Kevin Laatz
>> Signed-off-by: Conor Walsh
>> Signed-off-by: David Hunt
>> Signed-off-by: Anatoly Burakov
>>
>> ---
>> v3:
>>    * Fix missed renaming to poll busyness
>>    * Fix clang compilation
>>    * Fix arm compilation
>>
>> v2:
>>    * Use rte_get_tsc_hz() to adjust the telemetry period
>>    * Rename to reflect polling busyness vs general busyness
>>    * Fix segfault when calling telemetry timestamp from an unregistered
>>      non-EAL thread.
>>    * Minor cleanup
>> ---
>>   config/meson.build                          |   1 +
>>   config/rte_config.h                         |   1 +
>>   lib/bbdev/rte_bbdev.h                       |  17 +-
>>   lib/compressdev/rte_compressdev.c           |   2 +
>>   lib/cryptodev/rte_cryptodev.h               |   2 +
>>   lib/distributor/rte_distributor.c           |  21 +-
>>   lib/distributor/rte_distributor_single.c    |  14 +-
>>   lib/dmadev/rte_dmadev.h                     |  15 +-
>>   lib/eal/common/eal_common_lcore_telemetry.c | 293 ++++++++++++++++++++
>>   lib/eal/common/meson.build                  |   1 +
>>   lib/eal/include/rte_lcore.h                 |  80 ++++++
>>   lib/eal/meson.build                         |   3 +
>>   lib/eal/version.map                         |   7 +
>>   lib/ethdev/rte_ethdev.h                     |   2 +
>>   lib/eventdev/rte_eventdev.h                 |  10 +-
>>   lib/rawdev/rte_rawdev.c                     |   6 +-
>>   lib/regexdev/rte_regexdev.h                 |   5 +-
>>   lib/ring/rte_ring_elem_pvt.h                |   1 +
>>   meson_options.txt                           |   2 +
>>   19 files changed, 459 insertions(+), 24 deletions(-)
>>   create mode 100644 lib/eal/common/eal_common_lcore_telemetry.c
>>
>> diff --git a/lib/eal/common/eal_common_lcore_telemetry.c
>> b/lib/eal/common/eal_common_lcore_telemetry.c
>> new file mode 100644
>> index 0000000000..bba0afc26d
>> --- /dev/null
>> +++ b/lib/eal/common/eal_common_lcore_telemetry.c
>> @@ -0,0 +1,293 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2010-2014 Intel Corporation
>> + */
>> +
>> +#include
>> +#include
>> +#include
>> +
>> +#include
>> +#include
>> +#include
>> +#include
>> +
>> +#ifdef RTE_LCORE_POLL_BUSYNESS
>> +#include
>> +#endif
>> +
>> +int __rte_lcore_telemetry_enabled;
>
> Is "telemetry" really the term to use here? Isn't this just another
> piece of statistics? It can be used for telemetry, or in some other
> fashion.
>
> (Use bool, not int.)

Can rename to '__rte_lcore_stats_enabled' in the next revision.

>
>> +
>> +#ifdef RTE_LCORE_POLL_BUSYNESS
>> +
>> +struct lcore_telemetry {
>> +    int poll_busyness;
>> +    /**< Calculated poll busyness (gets set/returned by the API) */
>> +    int raw_poll_busyness;
>> +    /**< Calculated poll busyness times 100. */
>> +    uint64_t interval_ts;
>> +    /**< when previous telemetry interval started */
>> +    uint64_t empty_cycles;
>> +    /**< empty cycle count since last interval */
>> +    uint64_t last_poll_ts;
>> +    /**< last poll timestamp */
>> +    bool last_empty;
>> +    /**< if last poll was empty */
>> +    unsigned int contig_poll_cnt;
>> +    /**< contiguous (always empty/non empty) poll counter */
>> +} __rte_cache_aligned;
>> +
>> +static struct lcore_telemetry *telemetry_data;
>> +
>> +#define LCORE_POLL_BUSYNESS_MAX 100
>> +#define LCORE_POLL_BUSYNESS_NOT_SET -1
>> +#define LCORE_POLL_BUSYNESS_MIN 0
>> +
>> +#define SMOOTH_COEFF 5
>> +#define STATE_CHANGE_OPT 32
>> +
>> +static void lcore_config_init(void)
>> +{
>> +    int lcore_id;
>> +
>> +    telemetry_data = calloc(RTE_MAX_LCORE, sizeof(telemetry_data[0]));
>> +    if (telemetry_data == NULL)
>> +        rte_panic("Could not init lcore telemetry data: Out of memory\n");
>> +
>> +    RTE_LCORE_FOREACH(lcore_id) {
>> +        struct lcore_telemetry *td = &telemetry_data[lcore_id];
>> +
>> +        td->interval_ts = 0;
>> +        td->last_poll_ts = 0;
>> +        td->empty_cycles = 0;
>> +        td->last_empty = true;
>> +        td->contig_poll_cnt = 0;
>> +        td->poll_busyness = LCORE_POLL_BUSYNESS_NOT_SET;
>> +        td->raw_poll_busyness = 0;
>> +    }
>> +}
>> +
>> +int rte_lcore_poll_busyness(unsigned int lcore_id)
>> +{
>> +    const uint64_t active_thresh = rte_get_tsc_hz() *
>> RTE_LCORE_POLL_BUSYNESS_PERIOD_MS;
>> +    struct lcore_telemetry *tdata;
>> +
>> +    if (lcore_id >= RTE_MAX_LCORE)
>> +        return -EINVAL;
>> +    tdata =
>> &telemetry_data[lcore_id];
>> +
>> +    /* if the lcore is not active */
>> +    if (tdata->interval_ts == 0)
>> +        return LCORE_POLL_BUSYNESS_NOT_SET;
>> +    /* if the core hasn't been active in a while */
>> +    else if ((rte_rdtsc() - tdata->interval_ts) > active_thresh)
>> +        return LCORE_POLL_BUSYNESS_NOT_SET;
>> +
>> +    /* this core is active, report its poll busyness */
>> +    return telemetry_data[lcore_id].poll_busyness;
>> +}
>> +
>> +int rte_lcore_poll_busyness_enabled(void)
>> +{
>> +    return __rte_lcore_telemetry_enabled;
>> +}
>> +
>> +void rte_lcore_poll_busyness_enabled_set(int enable)
>
> Use bool.
>
>> +{
>> +    __rte_lcore_telemetry_enabled = !!enable;
>
> !!Another reason to use bool!! :)
>
> Are you allowed to call this function during operation? If so, you'll
> need an atomic store here (and an atomic load on the read side).

Ack

>
>> +
>> +    if (!enable)
>> +        lcore_config_init();
>> +}
>> +
>> +static inline int calc_raw_poll_busyness(const struct lcore_telemetry *tdata,
>> +                    const uint64_t empty, const uint64_t total)
>> +{
>> +    /*
>> +     * we don't want to use floating point math here, but we want for
>> +     * our poll busyness to react smoothly to sudden changes, while
>> +     * still keeping the accuracy and making sure that over time the
>> +     * average follows poll busyness as measured just-in-time.
>> +     * therefore, we will calculate the average poll busyness using
>> +     * integer math, but shift the decimal point two places to the
>> +     * right, so that 100.0 becomes 10000. this allows us to report
>> +     * integer values (0..100) while still allowing ourselves to
>> +     * follow the just-in-time measurements when we calculate our
>> +     * averages.
>> +     */
>> +    const int max_raw_idle = LCORE_POLL_BUSYNESS_MAX * 100;
>> +
>
> Why not just store/manage the number of busy (or idle, or both)
> cycles?
> Then the user can decide what time period to average over, to what
> extent the lcore utilization from previous periods should be factored
> in, etc.

There's an option 'RTE_LCORE_POLL_BUSYNESS_PERIOD_MS' added to
rte_config.h which allows the user to define the time period over which
the utilization should be reported. We only do this calculation if that
time interval has elapsed.

>
> In DSW, I initially presented only a load statistic (which averaged
> over 250 us, with some contribution from the previous period). I later
> came to realize that just exposing the number of busy cycles gave the
> calling application many more options. For example, to present the
> average load during 1 s, you needed to have some control thread
> sampling the load statistic during that time period, whereas once the
> busy cycles statistic was introduced, it just had to read that value
> twice (at the beginning of the period and at the end) and compare it
> with the amount of wallclock time passed.
>
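The read-twice scheme described here is simple enough to sketch: the
lcore only publishes a monotonically increasing busy-cycle counter, and
any observer derives utilization over a window of its choosing by
sampling that counter and the TSC at the window boundaries. Names and
numbers below are illustrative, not an API from this patchset:

```c
#include <assert.h>
#include <stdint.h>

/* Derive percent utilization from two samples of a busy-cycle counter
 * and the corresponding wallclock (TSC) samples. The averaging window
 * is entirely the caller's choice. */
static unsigned int
utilization_pct(uint64_t busy_before, uint64_t busy_after,
		uint64_t tsc_before, uint64_t tsc_after)
{
	uint64_t wall = tsc_after - tsc_before;

	if (wall == 0)
		return 0;
	return (unsigned int)((busy_after - busy_before) * 100 / wall);
}
```

A control thread wanting a 1 s average would simply sample at t and
t + 1 s and feed the four values in; no per-period state is needed in
the lcore itself.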