From: Tomasz Duszynski <tduszynski@marvell.com>
To: Morten Brørup, dev@dpdk.org
Cc: thomas@monjalon.net, Jerin Jacob Kollanukkaran, zhoumin@loongson.cn, mattias.ronnblom@ericsson.com
Subject: RE: [PATCH v4 1/4] eal: add generic support for reading PMU events
Date: Thu, 5 Jan 2023 21:14:18 +0000
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D8759B@smartserver.smartshare.dk>
References: <20221129092821.1304853-1-tduszynski@marvell.com>
 <20221213104350.3218167-1-tduszynski@marvell.com>
 <20221213104350.3218167-2-tduszynski@marvell.com>
 <98CBD80474FA8B44BF855DF32C47DC35D8758C@smartserver.smartshare.dk>
 <98CBD80474FA8B44BF855DF32C47DC35D8759B@smartserver.smartshare.dk>
Hi Morten,

A few comments inline.

>-----Original Message-----
>From: Morten Brørup
>Sent: Wednesday, December 14, 2022 11:41 AM
>To: Tomasz Duszynski <tduszynski@marvell.com>; dev@dpdk.org
>Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran; zhoumin@loongson.cn;
>mattias.ronnblom@ericsson.com
>Subject: [EXT] RE: [PATCH v4 1/4] eal: add generic support for reading PMU events
>
>+CC: Mattias, see my comment below about per-thread constructor for this
>
>> From: Tomasz Duszynski [mailto:tduszynski@marvell.com]
>> Sent: Wednesday, 14 December 2022 10.39
>>
>> Hello Morten,
>>
>> Thanks for the review. Answers inline.
>>
>> [...]
>>
>> > > +/**
>> > > + * @file
>> > > + *
>> > > + * PMU event tracing operations
>> > > + *
>> > > + * This file defines the generic API and types necessary to set up the PMU and
>> > > + * read selected counters at runtime.
>> > > + */
>> > > +
>> > > +/**
>> > > + * A structure describing a group of events.
>> > > + */
>> > > +struct rte_pmu_event_group {
>> > > +	int *fds; /**< array of event descriptors */
>> > > +	void **mmap_pages; /**< array of pointers to mmapped perf_event_attr structures */
>> >
>> > There seems to be a lot of indirection involved here. Why are these arrays not statically sized,
>> > instead of dynamically allocated?
>> >
>>
>> Different architectures/PMUs impose limits on the number of simultaneously
>> enabled counters. So, in order to relieve the pain of thinking about it and
>> adding macros for each and every arch, I decided to dynamically allocate
>> whatever number the user asks for. The assumption also holds that the user
>> knows about the tradeoffs of using too many counters, hence will not enable
>> too many events at once.
>
>The DPDK convention is to use fixed size arrays (with a maximum size, e.g. RTE_MAX_ETHPORTS) in the
>fast path, for performance reasons.
>
>Please use fixed size arrays instead of dynamically allocated arrays.
>

I do agree that from a maintenance angle fixed arrays are much more convenient,
but once optimization kicks in that statement does not necessarily hold true anymore.
For example, in this case performance dropped by ~0.3%, which is insignificant imo.
So, given the simpler code, the next patchset will use fixed arrays.
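Just so we are aligned on the shape, here is a minimal sketch of the fixed-size
variant I have in mind. MAX_NUM_GROUP_EVENTS is a placeholder cap made up for
this sketch; the real limit is arch/PMU specific:

#include <stdbool.h>
#include <linux/perf_event.h> /* struct perf_event_mmap_page */

/* Placeholder compile-time cap on events per group. */
#define MAX_NUM_GROUP_EVENTS 8

struct rte_pmu_event_group {
	/* fixed-size arrays instead of dynamically allocated ones */
	struct perf_event_mmap_page *mmap_pages[MAX_NUM_GROUP_EVENTS]; /**< mmapped user pages, one per event */
	int fds[MAX_NUM_GROUP_EVENTS]; /**< perf event file descriptors */
	bool enabled; /**< true if group was enabled on particular lcore */
};

Using struct perf_event_mmap_page directly here would also address your other
point below, since the void ** indirection goes away.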
>>
>> > Also, what is the reason for hiding the struct perf_event_mmap_page **mmap_pages
>> > type behind an opaque void **mmap_pages instead?
>>
>> I think the part doing mmap/munmap was written first, hence void **
>> was chosen in the first place.
>
>Please update it, so the actual type is reflected here.
>
>>
>> >
>> > > +	bool enabled; /**< true if group was enabled on particular lcore */
>> > > +};
>> > > +
>> > > +/**
>> > > + * A structure describing an event.
>> > > + */
>> > > +struct rte_pmu_event {
>> > > +	char *name; /**< name of an event */
>> > > +	int index; /**< event index into fds/mmap_pages */
>> > > +	TAILQ_ENTRY(rte_pmu_event) next; /**< list entry */
>> > > +};
>> > > +
>> > > +/**
>> > > + * A PMU state container.
>> > > + */
>> > > +struct rte_pmu {
>> > > +	char *name; /**< name of core PMU listed under /sys/bus/event_source/devices */
>> > > +	struct rte_pmu_event_group group[RTE_MAX_LCORE]; /**< per lcore event group data */
>> > > +	int num_group_events; /**< number of events in a group */
>> > > +	TAILQ_HEAD(, rte_pmu_event) event_list; /**< list of matching events */
>
>The event_list is used in the slow path only, so it can remain a list - i.e. no change requested here.
>:-)
>
>> > > +};
>> > > +
>> > > +/** Pointer to the PMU state container */
>> > > +extern struct rte_pmu *rte_pmu;
>> >
>> > Again, why not just extern struct rte_pmu, instead of dynamic allocation?
>> >
>>
>> No strong opinions here since this is a matter of personal preference.
>> It can be removed in the next version.
>
>Yes, please.
>
>>
>> > > +
>> > > +/** Each architecture supporting PMU needs to provide its own version */
>> > > +#ifndef rte_pmu_pmc_read
>> > > +#define rte_pmu_pmc_read(index) ({ 0; })
>> > > +#endif
>> > > +
>> > > +/**
>> > > + * @internal
>> > > + *
>> > > + * Read PMU counter.
>> > > + *
>> > > + * @param pc
>> > > + *   Pointer to the mmapped user page.
>> > > + * @return
>> > > + *   Counter value read from hardware.
>> > > + */
>> > > +__rte_internal
>> > > +static __rte_always_inline uint64_t
>> > > +rte_pmu_read_userpage(struct perf_event_mmap_page *pc)
>> > > +{
>> > > +	uint64_t offset, width, pmc = 0;
>> > > +	uint32_t seq, index;
>> > > +	int tries = 100;
>> > > +
>> > > +	for (;;) {
>
>As a matter of personal preference, I would write this loop differently:
>
>+	for (tries = 100; tries != 0; tries--) {
>
>> > > +		seq = pc->lock;
>> > > +		rte_compiler_barrier();
>> > > +		index = pc->index;
>> > > +		offset = pc->offset;
>> > > +		width = pc->pmc_width;
>> > > +
>> > > +		if (likely(pc->cap_user_rdpmc && index)) {
>
>Why "&& index"? The way I read [man perf_event_open], index 0 is perfectly valid.
>

A valid index starts at 1; 0 means that the hw counter is either stopped or
isn't active. Maybe this is not immediately clear from the man page, but there
is an example later on showing how to get the actual number.

>[man perf_event_open]: https://man7.org/linux/man-pages/man2/perf_event_open.2.html
>
>> > > +			pmc = rte_pmu_pmc_read(index - 1);
>> > > +			pmc <<= 64 - width;
>> > > +			pmc >>= 64 - width;
>> > > +		}
>> > > +
>> > > +		rte_compiler_barrier();
>> > > +
>> > > +		if (likely(pc->lock == seq))
>> > > +			return pmc + offset;
>> > > +
>> > > +		if (--tries == 0) {
>> > > +			RTE_LOG(DEBUG, EAL, "failed to get perf_event_mmap_page lock\n");
>> > > +			break;
>> > > +		}
>
>- Remove the 4 lines of code above, and move the debug log message to the end of the function
>instead.
>
>> > > +	}
>
>+	RTE_LOG(DEBUG, EAL, "failed to get perf_event_mmap_page lock\n");
>
>> > > +
>> > > +	return 0;
>> > > +}
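To make sure I read that correctly: with both of those changes folded in, the
function would look roughly like this. A sketch only, assuming the same
rte_pmu.h context for rte_pmu_pmc_read(), likely() and rte_compiler_barrier();
the next revision may still differ:

static __rte_always_inline uint64_t
rte_pmu_read_userpage(struct perf_event_mmap_page *pc)
{
	uint64_t offset, width, pmc = 0;
	uint32_t seq, index;
	int tries;

	for (tries = 100; tries != 0; tries--) {
		seq = pc->lock;
		rte_compiler_barrier();
		index = pc->index;
		offset = pc->offset;
		width = pc->pmc_width;

		/* index 0 means counter stopped/inactive; the hw register is index - 1 */
		if (likely(pc->cap_user_rdpmc && index)) {
			pmc = rte_pmu_pmc_read(index - 1);
			/* keep only the low pmc_width bits of the raw value */
			pmc <<= 64 - width;
			pmc >>= 64 - width;
		}

		rte_compiler_barrier();

		/* consistent snapshot; otherwise the kernel updated the page mid-read, so retry */
		if (likely(pc->lock == seq))
			return pmc + offset;
	}

	RTE_LOG(DEBUG, EAL, "failed to get perf_event_mmap_page lock\n");

	return 0;
}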
>> > > +
>> > > +/**
>> > > + * @internal
>> > > + *
>> > > + * Enable group of events for a given lcore.
>> > > + *
>> > > + * @param lcore_id
>> > > + *   The identifier of the lcore.
>> > > + * @return
>> > > + *   0 in case of success, negative value otherwise.
>> > > + */
>> > > +__rte_internal
>> > > +int
>> > > +rte_pmu_enable_group(int lcore_id);
>> > > +
>> > > +/**
>> > > + * @warning
>> > > + * @b EXPERIMENTAL: this API may change without prior notice
>> > > + *
>> > > + * Add event to the group of enabled events.
>> > > + *
>> > > + * @param name
>> > > + *   Name of an event listed under /sys/bus/event_source/devices/pmu/events.
>> > > + * @return
>> > > + *   Event index in case of success, negative value otherwise.
>> > > + */
>> > > +__rte_experimental
>> > > +int
>> > > +rte_pmu_add_event(const char *name);
>> > > +
>> > > +/**
>> > > + * @warning
>> > > + * @b EXPERIMENTAL: this API may change without prior notice
>> > > + *
>> > > + * Read hardware counter configured to count occurrences of an event.
>> > > + *
>> > > + * @param index
>> > > + *   Index of an event to be read.
>> > > + * @return
>> > > + *   Event value read from register. In case of errors or lack of support,
>> > > + *   0 is returned. In other words, a stream of zeros in a trace file
>> > > + *   indicates a problem with reading a particular PMU event register.
>> > > + */
>> > > +__rte_experimental
>> > > +static __rte_always_inline uint64_t
>> > > +rte_pmu_read(int index)
>
>The index type can be changed from int to uint32_t. This also eliminates the "(index < 0" part of
>the comparison further below in this function.
>
That's true.

>> > > +{
>> > > +	int lcore_id = rte_lcore_id();
>> > > +	struct rte_pmu_event_group *group;
>> > > +	int ret;
>> > > +
>> > > +	if (!rte_pmu)
>> > > +		return 0;
>> > > +
>> > > +	group = &rte_pmu->group[lcore_id];
>> > > +	if (!group->enabled) {
>
>Optimized: if (unlikely(!group->enabled)) {
>

The compiler will optimize the branch itself correctly. An extra hint is not required.

>> > > +		ret = rte_pmu_enable_group(lcore_id);
>> > > +		if (ret)
>> > > +			return 0;
>> > > +
>> > > +		group->enabled = true;
>> > > +	}
>> >
>> > Why is the group not enabled in the setup function, rte_pmu_add_event(), instead of here, in the
>> > hot path?
>> >
>>
>> When this is executed for the very first time the cpu will obviously have
>> more work to do, but afterwards the setup path is not taken, hence far
>> fewer cpu cycles are required.
>>
>> Setup is executed solely by the main lcore, before the other lcores are
>> launched, hence some info passed to the SYS_perf_event_open ioctl() is
>> missing, pid (via rte_gettid()) being an example here.
>
>OK. Thank you for the explanation. Since it is impossible at setup, it has to be done at runtime.
>
>@Mattias: Another good example of something that would belong in per-thread constructors, as my
>suggested feature creep in [1].
>
>[1]: http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35D87553@smartserver.smartshare.dk/
>
>>
>> > > +
>> > > +	if (index < 0 || index >= rte_pmu->num_group_events)
>
>Optimized: if (unlikely(index >= rte_pmu.num_group_events))
>
>> > > +		return 0;
>> > > +
>> > > +	return rte_pmu_read_userpage((struct perf_event_mmap_page *)group->mmap_pages[index]);
>> >
>> > Using fixed size arrays instead of multiple indirections via pointers is faster. It could be:
>> >
>> > return rte_pmu_read_userpage((struct perf_event_mmap_page
>> > *)rte_pmu.group[lcore_id].mmap_pages[index]);
>> >
>> > With or without the suggested performance improvements...
>> >
>> > Series-acked-by: Morten Brørup
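And for the archives, this is roughly the usage model the series aims at.
"cpu-cycles" is only an example of an event name (whatever the PMU lists under
/sys/bus/event_source/devices/<pmu>/events) and do_work() stands in for the
measured code:

/* main lcore, during setup, before worker lcores are launched */
int idx = rte_pmu_add_event("cpu-cycles"); /* negative value means failure */

/* worker lcore, hot path; the first rte_pmu_read() lazily enables the group there */
uint64_t before = rte_pmu_read(idx);
do_work();
uint64_t delta = rte_pmu_read(idx) - before;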