From: "Maslekar, Omkar"
To: "Richardson, Bruce"
CC: "dev@dpdk.org", "Loftus, Ciara"
Date: Tue, 22 Sep 2020 21:53:25 +0000
References: <1599700614-22809-1-git-send-email-omkar.maslekar@intel.com> <1600739967-6499-1-git-send-email-omkar.maslekar@intel.com> <1600739967-6499-2-git-send-email-omkar.maslekar@intel.com> <20200922082801.GA1604@bricha3-MOBL.ger.corp.intel.com>
In-Reply-To: <20200922082801.GA1604@bricha3-MOBL.ger.corp.intel.com>
Subject: Re: [dpdk-dev] [PATCH v4] eal: add cache-line demote support

Hi Bruce,

My comments are inline

>-----Original Message-----
>From: Bruce Richardson
>Sent: Tuesday, September 22, 2020 1:28 AM
>To: Maslekar, Omkar
>Cc: dev@dpdk.org; Loftus, Ciara
>Subject: Re: [PATCH v4] eal: add cache-line demote support
>
>On Mon, Sep 21, 2020 at 06:59:27PM -0700, Omkar Maslekar wrote:
>> rte_cldemote is similar to a prefetch hint - in reverse.
>> cldemote(addr) enables software to hint to hardware that line is likely to be shared.
>> Useful in core-to-core communications where cache-line is likely to be
>> shared. ARM and PPC implementation is provided with NOP and can be
>> added if any equivalent instructions could be used for implementation
>> on those architectures.
>>
>> Signed-off-by: Omkar Maslekar
>>
>Few minor suggestions below. With those fixed, feel free to add my ack to
>future versions of this patch.
>
>Acked-by: Bruce Richardson
>
>> ---
>> v4: updated bold text for title and fixed margin in release notes
>> *
>> v3: fixed warning regarding whitespace
>> *
>> v2: documentation updated
>> ---
>> ---
>>  doc/guides/rel_notes/release_20_11.rst        |  6 ++++++
>>  lib/librte_eal/arm/include/rte_prefetch_32.h  |  5 +++++
>>  lib/librte_eal/arm/include/rte_prefetch_64.h  |  5 +++++
>>  lib/librte_eal/include/generic/rte_prefetch.h | 13 +++++++++++++
>>  lib/librte_eal/ppc/include/rte_prefetch.h     |  5 +++++
>>  lib/librte_eal/x86/include/rte_prefetch.h     |  9 +++++++++
>>  6 files changed, 43 insertions(+)
>>
>> diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
>> index df227a1..b844b96 100644
>> --- a/doc/guides/rel_notes/release_20_11.rst
>> +++ b/doc/guides/rel_notes/release_20_11.rst
>> @@ -55,6 +55,12 @@ New Features
>>       Also, make sure to start the actual text at the margin.
>>       =======================================================
>>
>> +* **Added new function rte_cldemote in rte_prefetch.h.**
>> +
>> +  Added a hardware hint CLDEMOTE, which is similar to prefetch in reverse.
>> +  CLDEMOTE moves the cache line to the more remote cache, where it
>> +  expects sharing to be efficient. Moving the cache line to a level
>> +  more distant from the processor helps to accelerate core-to-core communication.
>>
>
>I think you need two blank lines between sections here, not just one.

[OM] You are right, I will fix this in v5.

>
>> Removed Items
>> -------------
>> diff --git a/lib/librte_eal/arm/include/rte_prefetch_32.h b/lib/librte_eal/arm/include/rte_prefetch_32.h
>> index e53420a..ad91edd 100644
>> --- a/lib/librte_eal/arm/include/rte_prefetch_32.h
>> +++ b/lib/librte_eal/arm/include/rte_prefetch_32.h
>> @@ -33,6 +33,11 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
>>  	rte_prefetch0(p);
>>  }
>>
>> +static inline void rte_cldemote(const volatile void *p) {
>> +	RTE_SET_USED(p);
>> +}
>> +
>>  #ifdef __cplusplus
>>  }
>>  #endif
>> diff --git a/lib/librte_eal/arm/include/rte_prefetch_64.h b/lib/librte_eal/arm/include/rte_prefetch_64.h
>> index fc2b391..35d278a 100644
>> --- a/lib/librte_eal/arm/include/rte_prefetch_64.h
>> +++ b/lib/librte_eal/arm/include/rte_prefetch_64.h
>> @@ -32,6 +32,11 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
>>  	asm volatile ("PRFM PLDL1STRM, [%0]" : : "r" (p)); }
>>
>> +static inline void rte_cldemote(const volatile void *p) {
>> +	RTE_SET_USED(p);
>> +}
>> +
>>  #ifdef __cplusplus
>>  }
>>  #endif
>> diff --git a/lib/librte_eal/include/generic/rte_prefetch.h b/lib/librte_eal/include/generic/rte_prefetch.h
>> index 6e47bdf..8742412 100644
>> --- a/lib/librte_eal/include/generic/rte_prefetch.h
>> +++ b/lib/librte_eal/include/generic/rte_prefetch.h
>> @@ -51,4 +51,17 @@
>>   */
>>  static inline void rte_prefetch_non_temporal(const volatile void *p);
>>
>> +/**
>> + * Demote a cache line to a more distant level of cache from the processor.
>> + *
>> + * CLDEMOTE hints to hardware to move (demote) a cache line from the closest to
>> + * the processor to a level more distant from the processor. It is a hint and
>> + * not guarantee. rte_cldemote is intended to speed up things at the producer,
>> + * in the producer-consumer case.
>> + *
>
>Two thoughts here:
>1. Is it not more the consumer who benefits more since they are the ones
>receiving the demoted value, while the producer pays a higher cost since
>they have to demote the value on send?

[OM] CLDEMOTE benefits the consumer. My statement "speed up things at the producer" indicates proximity, where the distance is reduced. I will make it simpler and more readable.

>2. Rather than talking about producer consumer case specifically, I think it
>would be good to replace the last sentence with what you have in the cover
>letter about it being for sharing, and to indicate that a line may be accessed
>by a different core in the future.

[OM] Good point, there could be many other cores that can benefit instead of just a single consumer. I will update this.
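
To make the sharing case discussed above concrete, here is a rough sketch of how a writer might use the hint. It is not part of the patch: sketch_slot, sketch_produce, sketch_consume and the 64-byte line size are hypothetical names and assumptions, and sketch_demote is a no-op stand-in for rte_cldemote (like the ARM/PPC stubs) so the sketch builds anywhere.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_LINE 64 /* assumed cache-line size, not taken from the patch */

/* Stand-in for rte_cldemote() from the patch; a no-op here, like the
 * ARM/PPC stubs, so the sketch does not depend on DPDK headers. */
static inline void sketch_demote(const volatile void *p)
{
	(void)p;
}

/* Hypothetical message slot sized to fit one (assumed) cache line. */
struct sketch_slot {
	_Atomic uint32_t ready;
	uint8_t data[SKETCH_LINE - sizeof(uint32_t)];
} __attribute__((aligned(SKETCH_LINE)));

/* Writer side: fill the line, publish it, then hint that another core
 * is likely to read it next. The hint never affects correctness. */
static void sketch_produce(struct sketch_slot *s, const void *msg, size_t len)
{
	assert(len <= sizeof(s->data));
	memcpy(s->data, msg, len);
	atomic_store_explicit(&s->ready, 1, memory_order_release);
	sketch_demote(s);
}

/* Reader side: any other core may pick the line up later.
 * Returns 1 if a message was copied out, 0 if the slot was empty. */
static int sketch_consume(struct sketch_slot *s, void *out, size_t len)
{
	assert(len <= sizeof(s->data));
	if (atomic_load_explicit(&s->ready, memory_order_acquire) == 0)
		return 0;
	memcpy(out, s->data, len);
	atomic_store_explicit(&s->ready, 0, memory_order_relaxed);
	return 1;
}
```

The release/acquire pair carries the ordering; the demote call is purely a performance hint placed after the line has been fully written, so dropping it (or running on hardware that ignores it) leaves the behaviour unchanged.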
>
>> + * @param p
>> + *   Address to demote
>> + */
>> +static inline void rte_cldemote(const volatile void *p);
>> +
>>  #endif /* _RTE_PREFETCH_H_ */
>> diff --git a/lib/librte_eal/ppc/include/rte_prefetch.h b/lib/librte_eal/ppc/include/rte_prefetch.h
>> index 9ba07c8..3fe9655 100644
>> --- a/lib/librte_eal/ppc/include/rte_prefetch.h
>> +++ b/lib/librte_eal/ppc/include/rte_prefetch.h
>> @@ -34,6 +34,11 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
>>  	rte_prefetch0(p);
>>  }
>>
>> +static inline void rte_cldemote(const volatile void *p) {
>> +	RTE_SET_USED(p);
>> +}
>> +
>>  #ifdef __cplusplus
>>  }
>>  #endif
>> diff --git a/lib/librte_eal/x86/include/rte_prefetch.h b/lib/librte_eal/x86/include/rte_prefetch.h
>> index 384c6b3..029d06e 100644
>> --- a/lib/librte_eal/x86/include/rte_prefetch.h
>> +++ b/lib/librte_eal/x86/include/rte_prefetch.h
>> @@ -32,6 +32,15 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
>>  	asm volatile ("prefetchnta %[p]" : : [p] "m" (*(const volatile char *)p)); }
>>
>> +/*
>> + * we're using raw byte codes for now as only the newest compiler
>> + * versions support this instruction natively.
>> + */
>> +static inline void rte_cldemote(const volatile void *p) {
>> +	asm volatile(".byte 0x0f, 0x1c, 0x06" :: "S" (p)); }
>> +
>>  #ifdef __cplusplus
>>  }
>>  #endif
>> --
>> 1.8.3.1
>>
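
For reference, the raw-byte approach in the x86 hunk could later sit next to the compiler intrinsic. The following is only a sketch under assumptions, not the patch itself: it assumes GCC or Clang, uses _mm_cldemote only when the compiler defines __CLDEMOTE__ (for example with -mcldemote), falls back to the same byte encoding as the patch on other x86 builds, and is a no-op elsewhere. sketch_cldemote, sketch_cldemote_range and the 64-byte line size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__CLDEMOTE__)
#include <immintrin.h>
#endif

#define SKETCH_LINE 64 /* assumed cache-line size */

/* Hypothetical wrapper: intrinsic when the compiler supports it,
 * raw bytes (as in the patch) on other x86 builds, no-op elsewhere. */
static inline void sketch_cldemote(const volatile void *p)
{
#if defined(__CLDEMOTE__)
	/* Cast drops the qualifiers so the call fits both GCC and Clang. */
	_mm_cldemote((void *)(uintptr_t)p);
#elif defined(__x86_64__) || defined(__i386__)
	/* 0F 1C /0 with ModRM 0x06: the operand is the line addressed by
	 * rsi/esi, which is why the "S" constraint is used. */
	__asm__ volatile(".byte 0x0f, 0x1c, 0x06" :: "S" (p));
#else
	(void)p;
#endif
}

/* Hypothetical helper: hint every line touched by [buf, buf + len).
 * Returns the number of lines hinted. */
static inline unsigned int
sketch_cldemote_range(const volatile void *buf, size_t len)
{
	uintptr_t start = (uintptr_t)buf & ~(uintptr_t)(SKETCH_LINE - 1);
	uintptr_t end = (uintptr_t)buf + len;
	unsigned int n = 0;

	if (len == 0)
		return 0;
	for (uintptr_t a = start; a < end; a += SKETCH_LINE) {
		sketch_cldemote((const volatile void *)a);
		n++;
	}
	return n;
}
```

The byte sequence falls in the x86 hint-NOP opcode space, so processors without CLDEMOTE treat it as a NOP, which is what makes the raw-byte fallback safe to emit unconditionally on x86.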