From: Tomasz Duszynski
To: Morten Brørup, "dev@dpdk.org"
CC: "thomas@monjalon.net", Jerin Jacob Kollanukkaran, "Ruifeng.Wang@arm.com", "mattias.ronnblom@ericsson.com", "zhoumin@loongson.cn"
Subject: RE: [PATCH v5 1/4] eal: add generic support for reading PMU events
Date: Wed, 11 Jan 2023 16:20:45 +0000
References: <20221213104350.3218167-1-tduszynski@marvell.com> <20230110234642.1188550-1-tduszynski@marvell.com> <20230110234642.1188550-2-tduszynski@marvell.com> <98CBD80474FA8B44BF855DF32C47DC35D8764D@smartserver.smartshare.dk>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D8764D@smartserver.smartshare.dk>
List-Id: DPDK patches and discussions

>-----Original Message-----
>From: Morten Brørup
>Sent: Wednesday, January 11, 2023 10:06 AM
>To: Tomasz Duszynski; dev@dpdk.org
>Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran; Ruifeng.Wang@arm.com;
>mattias.ronnblom@ericsson.com; zhoumin@loongson.cn
>Subject: [EXT] RE: [PATCH v5 1/4] eal: add generic support for reading PMU events
>
>External Email
>
>----------------------------------------------------------------------
>> From: Tomasz Duszynski [mailto:tduszynski@marvell.com]
>> Sent: Wednesday, 11 January 2023 00.47
>>
>> Add support for programming PMU counters and reading their values in
>> runtime bypassing kernel completely.
>>
>> This is especially useful in cases where CPU cores are isolated
>> (nohz_full) i.e run dedicated tasks. In such cases one cannot use
>> standard perf utility without sacrificing latency and performance.
>>
>> Signed-off-by: Tomasz Duszynski
>> ---
>
>[...]
>
>> +static int
>> +do_perf_event_open(uint64_t config[3], unsigned int lcore_id, int group_fd)
>> +{
>> +	struct perf_event_attr attr = {
>> +		.size = sizeof(struct perf_event_attr),
>> +		.type = PERF_TYPE_RAW,
>> +		.exclude_kernel = 1,
>> +		.exclude_hv = 1,
>> +		.disabled = 1,
>> +	};
>> +
>> +	pmu_arch_fixup_config(config);
>> +
>> +	attr.config = config[0];
>> +	attr.config1 = config[1];
>> +	attr.config2 = config[2];
>> +
>> +	return syscall(SYS_perf_event_open, &attr, 0, rte_lcore_to_cpu_id(lcore_id), group_fd, 0);
>> +}
>
>If SYS_perf_event_open() must be called from the worker thread itself, then lcore_id must not be
>passed as a parameter to do_perf_event_open(). Otherwise, I would expect to be able to call
>do_perf_event_open() from the main thread and pass any lcore_id of a worker thread.
>This comment applies to all functions that must be called from the worker thread itself. It also
>applies to the functions that call such functions.
>
lcore_id is being passed around so that we don't need to call rte_lcore_id() each and every time.
>[...]
>
>> +/**
>> + * A structure describing a group of events.
>> + */
>> +struct rte_pmu_event_group {
>> +	int fds[MAX_NUM_GROUP_EVENTS]; /**< array of event descriptors */
>> +	struct perf_event_mmap_page *mmap_pages[MAX_NUM_GROUP_EVENTS]; /**< array of user pages */
>> +	bool enabled; /**< true if group was enabled on particular lcore */
>> +};
>> +
>> +/**
>> + * A structure describing an event.
>> + */
>> +struct rte_pmu_event {
>> +	char *name; /** name of an event */
>> +	unsigned int index; /** event index into fds/mmap_pages */
>> +	TAILQ_ENTRY(rte_pmu_event) next; /** list entry */
>> +};
>
>Move the "enabled" field up, making it the first field in this structure. This might reduce the
>number of instructions required to check (!group->enabled) in rte_pmu_read().
>
This will be called once, and no, this will not produce more instructions. Why should it?
In both cases the compiler will need to load data at some offset, and archs do have instructions for that.
>Also, each instance of the structure is used individually per lcore, so the structure should be
>cache line aligned to avoid unnecessarily crossing cache lines.
>
>I.e.:
>
>struct rte_pmu_event_group {
>	bool enabled; /**< true if group was enabled on particular lcore */
>	int fds[MAX_NUM_GROUP_EVENTS]; /**< array of event descriptors */
>	struct perf_event_mmap_page *mmap_pages[MAX_NUM_GROUP_EVENTS]; /**< array of user pages */
>} __rte_cache_aligned;
Yes, this can be aligned. While at it, I'd be more inclined to move mmap_pages up instead of enabled.
>
>> +
>> +/**
>> + * A PMU state container.
>> + */
>> +struct rte_pmu {
>> +	char *name; /** name of core PMU listed under /sys/bus/event_source/devices */
>> +	struct rte_pmu_event_group group[RTE_MAX_LCORE]; /**< per lcore event group data */
>> +	unsigned int num_group_events; /**< number of events in a group */
>> +	TAILQ_HEAD(, rte_pmu_event) event_list; /**< list of matching events */
>> +};
>> +
>> +/** Pointer to the PMU state container */
>> +extern struct rte_pmu rte_pmu;
>
>Just "The PMU state container". It is not a pointer anymore. :-)
>
Good catch.
>[...]
>
>> +/**
>> + * @internal
>> + *
>> + * Read PMU counter.
>> + *
>> + * @param pc
>> + *   Pointer to the mmapped user page.
>> + * @return
>> + *   Counter value read from hardware.
>> + */
>> +__rte_internal
>> +static __rte_always_inline uint64_t
>> +rte_pmu_read_userpage(struct perf_event_mmap_page *pc)
>> +{
>> +	uint64_t width, offset;
>> +	uint32_t seq, index;
>> +	int64_t pmc;
>> +
>> +	for (;;) {
>> +		seq = pc->lock;
>> +		rte_compiler_barrier();
>> +		index = pc->index;
>> +		offset = pc->offset;
>> +		width = pc->pmc_width;
>> +
>
>Please add a comment here about the special meaning of index == 0.
Okay.
>
>> +		if (likely(pc->cap_user_rdpmc && index)) {
>> +			pmc = rte_pmu_pmc_read(index - 1);
>> +			pmc <<= 64 - width;
>> +			pmc >>= 64 - width;
>> +			offset += pmc;
>> +		}
>> +
>> +		rte_compiler_barrier();
>> +
>> +		if (likely(pc->lock == seq))
>> +			return offset;
>> +	}
>> +
>> +	return 0;
>> +}
>
>[...]
>
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice
>> + *
>> + * Read hardware counter configured to count occurrences of an event.
>> + *
>> + * @param index
>> + *   Index of an event to be read.
>> + * @return
>> + *   Event value read from register. In case of errors or lack of support
>> + *   0 is returned. In other words, stream of zeros in a trace file
>> + *   indicates problem with reading particular PMU event register.
>> + */
>> +__rte_experimental
>> +static __rte_always_inline uint64_t
>> +rte_pmu_read(unsigned int index)
>> +{
>> +	struct rte_pmu_event_group *group;
>> +	int ret, lcore_id = rte_lcore_id();
>> +
>> +	group = &rte_pmu.group[lcore_id];
>> +	if (unlikely(!group->enabled)) {
>> +		ret = rte_pmu_enable_group(lcore_id);
>> +		if (ret)
>> +			return 0;
>> +
>> +		group->enabled = true;
>
>Group->enabled should be set inside rte_pmu_enable_group(), not here.
>
This is easier to follow IMO and is not against the coding guidelines, so I prefer to leave it as is.
>> +	}
>> +
>> +	if (unlikely(index >= rte_pmu.num_group_events))
>> +		return 0;
>> +
>> +	return rte_pmu_read_userpage(group->mmap_pages[index]);
>> +}
>
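
For reference, the retry loop and counter sign extension in rte_pmu_read_userpage() can be modelled outside of DPDK roughly as below. This is only a sketch under assumptions, not the patch code: struct fake_user_page, fake_pmc_read(), sign_extend() and read_userpage() are hypothetical stand-ins for struct perf_event_mmap_page, rte_pmu_pmc_read() and the loop quoted above. The barrier macro assumes GCC or Clang. It also shows the index == 0 case: the hardware read is skipped and only the kernel-maintained offset is returned.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax), similar in spirit to rte_compiler_barrier(). */
#define compiler_barrier() __asm__ volatile("" ::: "memory")

/* Hypothetical stand-in for the perf_event_mmap_page fields used by the loop. */
struct fake_user_page {
	uint32_t lock;      /* sequence counter, changes whenever the page is updated */
	uint32_t index;     /* hardware counter index + 1; 0 means no user-space read */
	int64_t offset;     /* count accumulated so far by the kernel */
	uint16_t pmc_width; /* width of the hardware counter in bits */
	int cap_user_rdpmc; /* non-zero if user-space counter reads are permitted */
};

/* Hypothetical replacement for rte_pmu_pmc_read(): returns a canned raw value. */
static uint64_t fake_pmc_raw;

static uint64_t
fake_pmc_read(uint32_t hw_index)
{
	(void)hw_index;
	return fake_pmc_raw;
}

/*
 * Sign-extend the low `width` bits of `raw`. The left shift is done on an
 * unsigned value; the right shift relies on arithmetic shifting of signed
 * integers, which GCC and Clang provide.
 */
static int64_t
sign_extend(uint64_t raw, unsigned int width)
{
	assert(width >= 1 && width <= 64);
	return (int64_t)(raw << (64 - width)) >> (64 - width);
}

/* Seqlock-style read: retry until the page did not change while it was read. */
static uint64_t
read_userpage(const volatile struct fake_user_page *pc)
{
	for (;;) {
		uint32_t seq = pc->lock;

		compiler_barrier();
		uint32_t index = pc->index;
		uint64_t count = (uint64_t)pc->offset;
		unsigned int width = pc->pmc_width;

		/* index == 0: skip the hardware read, the offset alone is the result */
		if (pc->cap_user_rdpmc && index)
			count += (uint64_t)sign_extend(fake_pmc_read(index - 1), width);

		compiler_barrier();
		if (pc->lock == seq)
			return count;
	}
}
```

With a 48-bit counter, a raw value of 0xFFFFFFFFFFFF is interpreted as -1 and therefore decrements the returned count, which is the reason for the shift pair in the patch.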