From mboxrd@z Thu Jan 1 00:00:00 1970
From: Omkar Maslekar <omkar.maslekar@intel.com>
To: dev@dpdk.org
Cc: bruce.richardson@intel.com, ciara.loftus@intel.com, omkar.maslekar@intel.com
Date: Fri, 11 Sep 2020 14:22:35 -0700
Message-Id: <1599859355-9264-2-git-send-email-omkar.maslekar@intel.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1599859355-9264-1-git-send-email-omkar.maslekar@intel.com>
References: <1599700614-22809-1-git-send-email-omkar.maslekar@intel.com>
 <1599859355-9264-1-git-send-email-omkar.maslekar@intel.com>
Subject: [dpdk-dev] [PATCH v3] EAL: An addition of cache line demote (CLDEMOTE) in rte_prefetch.h
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Archive: <http://mails.dpdk.org/archives/dev/>

rte_cldemote is similar to a prefetch hint, but in reverse.
cldemote(addr) lets software hint to hardware that a cache line is
likely to be shared. This is useful in core-to-core communication,
where the cache line is likely to be shared. On ARM and PPC,
rte_cldemote is implemented as a NOP; a real implementation can be
added if equivalent instructions are available on those architectures.

Signed-off-by: Omkar Maslekar <omkar.maslekar@intel.com>
---
v3: fixed warning regarding whitespace
* v2: documentation updated
---
---
 doc/guides/rel_notes/release_20_11.rst        |  6 ++++++
 lib/librte_eal/arm/include/rte_prefetch_32.h  |  5 +++++
 lib/librte_eal/arm/include/rte_prefetch_64.h  |  5 +++++
 lib/librte_eal/include/generic/rte_prefetch.h | 13 +++++++++++++
 lib/librte_eal/ppc/include/rte_prefetch.h     |  5 +++++
 lib/librte_eal/x86/include/rte_prefetch.h     |  9 +++++++++
 6 files changed, 43 insertions(+)

diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index df227a1..248f2a8 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+ EAL: Added new function rte_cldemote in rte_prefetch.h.
+
+ Added a hardware hint CLDEMOTE, which is similar to prefetch in reverse.
+ CLDEMOTE moves the cache line to the more remote cache, where it expects
+ sharing to be efficient. Moving the cache line to a level more distant from
+ the processor helps to accelerate core-to-core communication.
 
 Removed Items
 -------------
diff --git a/lib/librte_eal/arm/include/rte_prefetch_32.h b/lib/librte_eal/arm/include/rte_prefetch_32.h
index e53420a..ad91edd 100644
--- a/lib/librte_eal/arm/include/rte_prefetch_32.h
+++ b/lib/librte_eal/arm/include/rte_prefetch_32.h
@@ -33,6 +33,11 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
 	rte_prefetch0(p);
 }
 
+static inline void rte_cldemote(const volatile void *p)
+{
+	RTE_SET_USED(p);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/arm/include/rte_prefetch_64.h b/lib/librte_eal/arm/include/rte_prefetch_64.h
index fc2b391..35d278a 100644
--- a/lib/librte_eal/arm/include/rte_prefetch_64.h
+++ b/lib/librte_eal/arm/include/rte_prefetch_64.h
@@ -32,6 +32,11 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
 	asm volatile ("PRFM PLDL1STRM, [%0]" : : "r" (p));
 }
 
+static inline void rte_cldemote(const volatile void *p)
+{
+	RTE_SET_USED(p);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/include/generic/rte_prefetch.h b/lib/librte_eal/include/generic/rte_prefetch.h
index 6e47bdf..8742412 100644
--- a/lib/librte_eal/include/generic/rte_prefetch.h
+++ b/lib/librte_eal/include/generic/rte_prefetch.h
@@ -51,4 +51,17 @@
  */
 static inline void rte_prefetch_non_temporal(const volatile void *p);
 
+/**
+ * Demote a cache line to a more distant level of cache from the processor.
+ *
+ * CLDEMOTE hints to hardware to move (demote) a cache line from the closest to
+ * the processor to a level more distant from the processor. It is a hint and
+ * not a guarantee. rte_cldemote is intended to speed up things at the producer,
+ * in the producer-consumer case.
+ *
+ * @param p
+ *  Address to demote
+ */
+static inline void rte_cldemote(const volatile void *p);
+
 #endif /* _RTE_PREFETCH_H_ */
diff --git a/lib/librte_eal/ppc/include/rte_prefetch.h b/lib/librte_eal/ppc/include/rte_prefetch.h
index 9ba07c8..3fe9655 100644
--- a/lib/librte_eal/ppc/include/rte_prefetch.h
+++ b/lib/librte_eal/ppc/include/rte_prefetch.h
@@ -34,6 +34,11 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
 	rte_prefetch0(p);
 }
 
+static inline void rte_cldemote(const volatile void *p)
+{
+	RTE_SET_USED(p);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/x86/include/rte_prefetch.h b/lib/librte_eal/x86/include/rte_prefetch.h
index 384c6b3..029d06e 100644
--- a/lib/librte_eal/x86/include/rte_prefetch.h
+++ b/lib/librte_eal/x86/include/rte_prefetch.h
@@ -32,6 +32,15 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
 	asm volatile ("prefetchnta %[p]" : : [p] "m" (*(const volatile char *)p));
 }
 
+/*
+ * we're using raw byte codes for now as only the newest compiler
+ * versions support this instruction natively.
+ */
+static inline void rte_cldemote(const volatile void *p)
+{
+	asm volatile(".byte 0x0f, 0x1c, 0x06" :: "S" (p));
+}
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.8.3.1
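
For illustration only, here is a minimal sketch of the producer-side
usage that the doc comment for rte_cldemote describes. It is not part of
the patch. The names demo_cldemote, demo_slot, demo_produce and
DEMO_CACHE_LINE are hypothetical, and the 64-byte line size is an
assumption. demo_cldemote is a no-op stand-in that mirrors the ARM/PPC
fallback so the sketch builds without DPDK; with the patch applied, the
call would be rte_cldemote(slot) instead. Publishing the slot to the
consumer (ring index update, memory ordering) is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CACHE_LINE 64 /* assumed cache-line size; not taken from the patch */

/*
 * Hypothetical stand-in for rte_cldemote() so the sketch builds without DPDK.
 * It mirrors the ARM/PPC fallback in the patch (a no-op); with the patch
 * applied, an application would call rte_cldemote(p) here instead.
 */
static inline void demo_cldemote(const volatile void *p)
{
	(void)p;
}

/* One message slot, sized and aligned to a single (assumed) cache line. */
struct demo_slot {
	uint32_t len;
	uint8_t payload[DEMO_CACHE_LINE - sizeof(uint32_t)];
} __attribute__((aligned(DEMO_CACHE_LINE)));

_Static_assert(sizeof(struct demo_slot) == DEMO_CACHE_LINE,
	       "slot must occupy exactly one cache line");

/*
 * Producer side: fill the slot, then hint that the line can move to a
 * more distant cache level. Returns 0 on success, or -1 if the message
 * does not fit in the payload.
 */
static int demo_produce(struct demo_slot *slot, const void *msg, uint32_t len)
{
	assert(slot != NULL && msg != NULL);
	if (len > sizeof(slot->payload))
		return -1;

	memcpy(slot->payload, msg, len);
	slot->len = len;

	/* Only a hint: correctness must not depend on it taking effect. */
	demo_cldemote(slot);
	return 0;
}
```

The demote call comes after the last store to the slot, since the
producer is done with the line at that point. Because the hint may be
ignored, or compiled to nothing on ARM and PPC, the code has to behave
identically with or without it.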