From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 4A0B4418FD; Sun, 8 Jan 2023 17:30:24 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id E4DE240687; Sun, 8 Jan 2023 17:30:23 +0100 (CET) Received: from smartserver.smartsharesystems.com (smartserver.smartsharesystems.com [77.243.40.215]) by mails.dpdk.org (Postfix) with ESMTP id 329FA4067C for ; Sun, 8 Jan 2023 17:30:22 +0100 (CET) Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Subject: RE: [PATCH v4 1/4] eal: add generic support for reading PMU events Date: Sun, 8 Jan 2023 17:30:17 +0100 X-MimeOLE: Produced By Microsoft Exchange V6.5 Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D8763C@smartserver.smartshare.dk> In-Reply-To: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [PATCH v4 1/4] eal: add generic support for reading PMU events Thread-Index: AQHZDt/QEhRtFHSga0yHSOMIy5Zzs65rtOuAgAFVOCCAABlQ4IAjRt0AgAAVgdCAAM6BwIADjC0Q References: <20221129092821.1304853-1-tduszynski@marvell.com> <20221213104350.3218167-1-tduszynski@marvell.com> <20221213104350.3218167-2-tduszynski@marvell.com> <98CBD80474FA8B44BF855DF32C47DC35D8758C@smartserver.smartshare.dk> <98CBD80474FA8B44BF855DF32C47DC35D8759B@smartserver.smartshare.dk> A <98CBD80474FA8B44BF855DF32C47DC35D8762C@smartserver.smartshare.dk> From: =?iso-8859-1?Q?Morten_Br=F8rup?= To: "Tomasz Duszynski" , Cc: , "Jerin Jacob Kollanukkaran" , , X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org > From: Tomasz Duszynski [mailto:tduszynski@marvell.com] > Sent: Sunday, 8 January 2023 16.41 >=20 > >From: Morten Br=F8rup > >Sent: Thursday, January 5, 2023 11:08 PM > > > >> From: Tomasz Duszynski [mailto:tduszynski@marvell.com] > >> Sent: Thursday, 5 January 2023 22.14 > >> > >> Hi Morten, > >> > >> A few comments inline. > >> > >> >From: Morten Br=F8rup > >> >Sent: Wednesday, December 14, 2022 11:41 AM > >> > > >> >External Email > >> > > >> = >------------------------------------------------------------------- > -- > >> >- > >> >+CC: Mattias, see my comment below about per-thread constructor = for > >> this > >> > > >> >> From: Tomasz Duszynski [mailto:tduszynski@marvell.com] > >> >> Sent: Wednesday, 14 December 2022 10.39 > >> >> > >> >> Hello Morten, > >> >> > >> >> Thanks for review. Answers inline. > >> >> [...] > >> >> > > +{ > >> >> > > + int lcore_id =3D rte_lcore_id(); > >> >> > > + struct rte_pmu_event_group *group; > >> >> > > + int ret; > >> >> > > + > >> >> > > + if (!rte_pmu) > >> >> > > + return 0; > >> >> > > + > >> >> > > + group =3D &rte_pmu->group[lcore_id]; > >> >> > > + if (!group->enabled) { > >> > > >> >Optimized: if (unlikely(!group->enabled)) { > >> > > >> > >> Compiler will optimize the branch itself correctly. Extra hint is > not > >> required. > > > >I haven't reviewed the output from this, so I'll take your word for > it. I suggested the unlikely() > >because I previously tested some very simple code, and it optimized > for taking the "if": > > > >void testb(bool b) > >{ > > if (!b) > > exit(1); > > > > exit(99); > >} > > > >I guess I should experiment with more realistic code, and update my > optimization notes! > > >=20 > I think this may be too simple to draw far-reaching conclusions from > it. Compiler will make the > fall-through path more likely. If I recall Intel Optimization = Reference > Manual has some more > info on this. IIRC, the Intel Optimization Reference Manual discusses branch = optimization for assembler, not C. >=20 > Lets take a different example. >=20 > int main(int argc, char *argv[]) > { > int *p; >=20 > p =3D malloc(sizeof(*p)); > if (!p) > return 1; > *p =3D atoi(argv[1]); > if (*p < 0) > return 2; > free(p); >=20 > return 0; > } >=20 > If compiled with -O3 and disassembled I got below. >=20 > 00000000000010a0
: > 10a0: f3 0f 1e fa endbr64 > 10a4: 55 push %rbp > 10a5: bf 04 00 00 00 mov $0x4,%edi > 10aa: 53 push %rbx > 10ab: 48 89 f3 mov %rsi,%rbx > 10ae: 48 83 ec 08 sub $0x8,%rsp > 10b2: e8 d9 ff ff ff call 1090 > 10b7: 48 85 c0 test %rax,%rax > 10ba: 74 31 je 10ed > 10bc: 48 8b 7b 08 mov 0x8(%rbx),%rdi > 10c0: ba 0a 00 00 00 mov $0xa,%edx > 10c5: 31 f6 xor %esi,%esi > 10c7: 48 89 c5 mov %rax,%rbp > 10ca: e8 b1 ff ff ff call 1080 > 10cf: 49 89 c0 mov %rax,%r8 > 10d2: b8 02 00 00 00 mov $0x2,%eax > 10d7: 45 85 c0 test %r8d,%r8d > 10da: 78 0a js 10e6 > 10dc: 48 89 ef mov %rbp,%rdi > 10df: e8 8c ff ff ff call 1070 > 10e4: 31 c0 xor %eax,%eax > 10e6: 48 83 c4 08 add $0x8,%rsp > 10ea: 5b pop %rbx > 10eb: 5d pop %rbp > 10ec: c3 ret > 10ed: b8 01 00 00 00 mov $0x1,%eax > 10f2: eb f2 jmp 10e6 >=20 > Looking at both 10ba and 10da suggests that code was laid out in a way > that jumping is frowned upon. Also > potentially lest frequently executed code (at 10ed) is pushed further > down the memory hence optimizing cache line usage. In my notes, I have (ptr =3D=3D NULL) marked as considered unlikely, but = (int =3D=3D 0) marked as considered likely. Since group->enabled is = bool, I guessed the compiler would treat it like int and consider = (!group->enabled) as likely. Like in your example here, I also have (int < 0) marked as considered = unlikely. >=20 > That said, each and every scenario needs analysis on its own. Agree. Theory is good, validation is better. ;-) >=20 > >You could add the unlikely() for readability purposes. ;-) > > >=20 > Sure. That won't hurt performance. I think we are both in agreement about the intentions here, so I won't = hold you back with further academic discussions at this point. I might = resume the discussion with your next patch version, though. ;-)