From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id A8B85A0540; Mon, 20 Jul 2020 10:52:50 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 806741DBB; Mon, 20 Jul 2020 10:52:50 +0200 (CEST)
Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id 16C5A1023 for ; Mon, 20 Jul 2020 10:52:48 +0200 (CEST)
IronPort-SDR: K1A5GFW0CiHcWsiKLNyOIUMxUBxQVOabusYkiAZDcLtEX9XxmMTEwpUZQuqJC2TgU75AY0gHw9 nOJvktHnh3lQ==
X-IronPort-AV: E=McAfee;i="6000,8403,9687"; a="149859944"
X-IronPort-AV: E=Sophos;i="5.75,374,1589266800"; d="scan'208";a="149859944"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Jul 2020 01:52:48 -0700
IronPort-SDR: z+xLI0YGqGSmQTRySPmEhbT1g7Dckcc7e5JeaJ8MxtU+Pxl6Xcc4QyY0P7tdzmuxD4eR/1RWl/ C3WGit1aLHsQ==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.75,374,1589266800"; d="scan'208";a="361959023"
Received: from unknown (HELO [10.251.87.74]) ([10.251.87.74]) by orsmga001.jf.intel.com with ESMTP; 20 Jul 2020 01:52:45 -0700
To: Ruifeng Wang , "dev@dpdk.org"
Cc: "beilei.xing@intel.com" , "jia.guo@intel.com" , "bruce.richardson@intel.com" , "konstantin.ananyev@intel.com" , "jerinjacobk@gmail.com" , "david.marchand@redhat.com" , "fiona.trahe@intel.com" , "wei.zhao1@intel.com" , nd
References: <1591870283-7776-1-git-send-email-radu.nicolau@intel.com> <1594982985-31551-1-git-send-email-radu.nicolau@intel.com> <1594982985-31551-2-git-send-email-radu.nicolau@intel.com>
From: "Nicolau, Radu"
Message-ID: <9bfd46ab-f0a5-6148-2b84-b6c596926d56@intel.com>
Date: Mon, 20 Jul 2020 09:52:44 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain;
charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Language: en-GB
Subject: Re: [dpdk-dev] [PATCH v8 1/4] eal: add WC store functions
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

On 7/20/2020 7:42 AM, Ruifeng Wang wrote:
>> -----Original Message-----
>> From: Radu Nicolau
>> Sent: Friday, July 17, 2020 6:50 PM
>> To: dev@dpdk.org
>> Cc: beilei.xing@intel.com; jia.guo@intel.com; bruce.richardson@intel.com;
>> konstantin.ananyev@intel.com; jerinjacobk@gmail.com;
>> david.marchand@redhat.com; fiona.trahe@intel.com; wei.zhao1@intel.com;
>> Ruifeng Wang ; Radu Nicolau
>>
>> Subject: [PATCH v8 1/4] eal: add WC store functions
>>
>> Add rte_write32_wc and rte_write32_wc_relaxed functions that implement
>> 32bit stores using write combining memory protocol.
>> Provided generic stubs and x86 implementation.
>>
>> Signed-off-by: Radu Nicolau
>> Acked-by: Bruce Richardson
>> ---
>> lib/librte_eal/arm/include/rte_io_64.h | 12 +++++++
>> lib/librte_eal/include/generic/rte_io.h | 48
>> ++++++++++++++++++++++++++++
>> lib/librte_eal/x86/include/rte_io.h | 56
>> +++++++++++++++++++++++++++++++++
>> 3 files changed, 116 insertions(+)
>>
>> diff --git a/lib/librte_eal/arm/include/rte_io_64.h
>> b/lib/librte_eal/arm/include/rte_io_64.h
>> index e534624..d07d9cb 100644
>> --- a/lib/librte_eal/arm/include/rte_io_64.h
>> +++ b/lib/librte_eal/arm/include/rte_io_64.h
>> @@ -164,6 +164,18 @@ rte_write64(uint64_t value, volatile void *addr)
>> 	rte_write64_relaxed(value, addr);
>> }
>>
>> +static __rte_always_inline void
>> +rte_write32_wc(uint32_t value, volatile void *addr) {
>> +	rte_write32(value, addr);
>> +}
>> +
>> +static __rte_always_inline void
>> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr) {
>> +	rte_write32_relaxed(value, addr);
>> +}
>> +
>> #ifdef __cplusplus
>> }
>> #endif
>> diff --git a/lib/librte_eal/include/generic/rte_io.h
>> b/lib/librte_eal/include/generic/rte_io.h
>> index da457f7..0669baa 100644
>> --- a/lib/librte_eal/include/generic/rte_io.h
>> +++ b/lib/librte_eal/include/generic/rte_io.h
>> @@ -229,6 +229,40 @@ rte_write32(uint32_t value, volatile void *addr);
>> static inline void rte_write64(uint64_t value, volatile void *addr);
>>
>> +/**
>> + * Write a 32-bit value to I/O device memory address addr using write
>> + * combining memory write protocol. Depending on the platform write
>> +combining
>> + * may not be available and/or may be treated as a hint and the
>> +behavior may
>> + * fallback to a regular store.
> I'm trying to understand write combining use cases here.
> Is it applicable for all MMIO writes?

It depends on the architecture and the specific use case, but updating the tail registers is generally a good use case.
It has some particularities that prevent it from being a general replacement for MMIO writes: it is weakly ordered and it bypasses the cache hierarchy.

> How to identify where to use rte_write32_wc(_relaxed)?

The relaxed version can be used in sections of the code that already have the proper fencing, so as to avoid a redundant memory fence, or when no memory fence is needed at all.

>
> Thanks.
> /Ruifeng
>> + *
>> + * @param value
>> + * Value to write
>> + * @param addr
>> + * I/O memory address to write the value to */ __rte_experimental
>> +static inline void rte_write32_wc(uint32_t value, volatile void *addr);
>> +
>> +/**
>> + * Write a 32-bit value to I/O device memory address addr using write
>> + * combining memory write protocol. Depending on the platform write
>> +combining
>> + * may not be available and/or may be treated as a hint and the
>> +behavior may
>> + * fallback to a regular store.
>> + *
>> + * The relaxed version does not have additional I/O memory barrier,
>> +useful in
>> + * accessing the device registers of integrated controllers which
>> +implicitly
>> + * strongly ordered with respect to memory access.
>> + *
>> + * @param value
>> + * Value to write
>> + * @param addr
>> + * I/O memory address to write the value to */ __rte_experimental
>> +static inline void rte_write32_wc_relaxed(uint32_t value, volatile void
>> +*addr);
>> +
>> #endif /* __DOXYGEN__ */
>>
>> #ifndef RTE_OVERRIDE_IO_H
>> @@ -345,6 +379,20 @@ rte_write64(uint64_t value, volatile void *addr)
>> 	rte_write64_relaxed(value, addr);
>> }
>>
>> +#ifndef RTE_NATIVE_WRITE32_WC
>> +static __rte_always_inline void
>> +rte_write32_wc(uint32_t value, volatile void *addr) {
>> +	rte_write32(value, addr);
>> +}
>> +
>> +static __rte_always_inline void
>> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr) {
>> +	rte_write32_relaxed(value, addr);
>> +}
>> +#endif /* RTE_NATIVE_WRITE32_WC */
>> +
>> #endif /* RTE_OVERRIDE_IO_H */
>>
>> #endif /* _RTE_IO_H_ */
>> diff --git a/lib/librte_eal/x86/include/rte_io.h
>> b/lib/librte_eal/x86/include/rte_io.h
>> index 2db71b1..c95ed67 100644
>> --- a/lib/librte_eal/x86/include/rte_io.h
>> +++ b/lib/librte_eal/x86/include/rte_io.h
>> @@ -9,8 +9,64 @@
>> extern "C" {
>> #endif
>>
>> +#include "rte_cpuflags.h"
>> +
>> +#define RTE_NATIVE_WRITE32_WC
>> #include "generic/rte_io.h"
>>
>> +/**
>> + * @internal
>> + * MOVDIRI wrapper.
>> + */
>> +static __rte_always_inline void
>> +_rte_x86_movdiri(uint32_t value, volatile void *addr) {
>> +	asm volatile(
>> +		/* MOVDIRI */
>> +		".byte 0x40, 0x0f, 0x38, 0xf9, 0x02"
>> +		:
>> +		: "a" (value), "d" (addr));
>> +}
>> +
>> +static __rte_always_inline void
>> +rte_write32_wc(uint32_t value, volatile void *addr) {
>> +	static int _x86_movdiri_flag = -1;
>> +	if (_x86_movdiri_flag == 1) {
>> +		rte_wmb();
>> +		_rte_x86_movdiri(value, addr);
>> +	} else if (_x86_movdiri_flag == 0) {
>> +		rte_write32(value, addr);
>> +	} else {
>> +		_x86_movdiri_flag =
>> +
>> (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI) > 0);
>> +		if (_x86_movdiri_flag == 1) {
>> +			rte_wmb();
>> +			_rte_x86_movdiri(value, addr);
>> +		} else {
>> +			rte_write32(value, addr);
>> +		}
>> +	}
>> +}
>> +
>> +static __rte_always_inline void
>> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr) {
>> +	static int _x86_movdiri_flag = -1;
>> +	if (_x86_movdiri_flag == 1) {
>> +		_rte_x86_movdiri(value, addr);
>> +	} else if (_x86_movdiri_flag == 0) {
>> +		rte_write32_relaxed(value, addr);
>> +	} else {
>> +		_x86_movdiri_flag =
>> +
>> (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI) > 0);
>> +		if (_x86_movdiri_flag == 1)
>> +			_rte_x86_movdiri(value, addr);
>> +		else
>> +			rte_write32_relaxed(value, addr);
>> +	}
>> +}
>> +
>> #ifdef __cplusplus
>> }
>> #endif
>> --
>> 2.7.4
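
To illustrate the fenced versus relaxed usage described in the reply above, here is a minimal self-contained sketch of a tail register update. It does not use DPDK: demo_wmb(), demo_write32_wc() and demo_write32_wc_relaxed() are hypothetical stand-ins that model only the generic fallback behaviour (a plain store, with a C11 release fence standing in for rte_wmb()), and struct demo_queue is an invented device model, so treat it as an assumption-laden illustration rather than driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u /* hypothetical ring depth (power of two) */

/* Invented device model: descriptor ring plus a 32-bit tail doorbell. */
struct demo_queue {
	uint64_t desc[DEMO_RING_SIZE];
	volatile uint32_t tail_reg; /* would be an MMIO register on real hardware */
	uint32_t tail;              /* software copy of the tail index */
	unsigned int fences;        /* fences issued so far, for illustration */
};

/* Stand-in for rte_wmb(): a C11 release fence, counted for visibility. */
static inline void demo_wmb(struct demo_queue *q)
{
	atomic_thread_fence(memory_order_release);
	q->fences++;
}

/* Stand-in for rte_write32_wc_relaxed(): the store only, no barrier. */
static inline void demo_write32_wc_relaxed(uint32_t value, volatile void *addr)
{
	*(volatile uint32_t *)addr = value;
}

/* Stand-in for rte_write32_wc(): barrier first, then the store. */
static inline void demo_write32_wc(struct demo_queue *q, uint32_t value,
				   volatile void *addr)
{
	demo_wmb(q);
	demo_write32_wc_relaxed(value, addr);
}

/* Write n descriptors into the ring and advance the software tail. */
static void demo_fill(struct demo_queue *q, const uint64_t *pkts, uint32_t n)
{
	assert(n <= DEMO_RING_SIZE);
	for (uint32_t i = 0; i < n; i++) {
		q->desc[q->tail & (DEMO_RING_SIZE - 1)] = pkts[i];
		q->tail++;
	}
}

/* The caller has no fence of its own, so the fenced variant orders the
 * descriptor writes before the doorbell. */
static void demo_post(struct demo_queue *q, const uint64_t *pkts, uint32_t n)
{
	demo_fill(q, pkts, n);
	demo_write32_wc(q, q->tail, &q->tail_reg);
}

/* The caller already issues a fence, so the relaxed variant avoids
 * adding a second, redundant one. */
static void demo_post_prefenced(struct demo_queue *q, const uint64_t *pkts,
				uint32_t n)
{
	demo_fill(q, pkts, n);
	demo_wmb(q);
	demo_write32_wc_relaxed(q->tail, &q->tail_reg);
}
```

Both paths issue exactly one fence per burst. Calling the fenced variant after the explicit demo_wmb() in demo_post_prefenced() would issue two, which is the redundancy the relaxed version is meant to avoid.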
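
The x86 hunk quoted above probes for MOVDIRI on first use and caches the answer in a function-local static. A compact sketch of that probe-once pattern follows. It assumes a hypothetical demo_cpu_has_direct_store() in place of rte_cpu_get_flag_enabled() and plain stores in place of the MOVDIRI wrapper, and it folds the patch's three branches into two, so it is a restructured illustration and not the patch's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int demo_probe_calls; /* how often the hypothetical probe ran */

/* Hypothetical stand-in for rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI);
 * in this sketch it always reports the feature as absent. */
static int demo_cpu_has_direct_store(void)
{
	demo_probe_calls++;
	return 0;
}

/* Regular store path, as the fallback would use. */
static void demo_plain_store(uint32_t value, volatile void *addr)
{
	*(volatile uint32_t *)addr = value;
}

/* Placeholder for the direct-store path; real code would issue MOVDIRI. */
static void demo_direct_store(uint32_t value, volatile void *addr)
{
	*(volatile uint32_t *)addr = value;
}

/* Probe once, cache the answer, then dispatch on the cached flag. */
static void demo_write32_cached(uint32_t value, volatile void *addr)
{
	static int flag = -1; /* -1: not probed, 0: absent, 1: present */

	assert(addr != NULL);
	if (flag < 0)
		flag = (demo_cpu_has_direct_store() > 0);
	if (flag)
		demo_direct_store(value, addr);
	else
		demo_plain_store(value, addr);
}
```

After two calls to demo_write32_cached() the probe counter stays at 1, since only the first call pays for detection. The cached flag in this sketch is not synchronized between threads; it assumes every caller would compute the same value.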