From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 07CF742913; Mon, 10 Apr 2023 22:58:52 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8845F40DDC; Mon, 10 Apr 2023 22:58:51 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id CECBB40A81 for ; Mon, 10 Apr 2023 22:58:49 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1086) id D97F321345FB; Mon, 10 Apr 2023 13:58:48 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com D97F321345FB DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1681160328; bh=Q8/0SL6Hlh30k6xl87Z6yYcHo0/dmeDTzMmvtXTRns0=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=SVSu8DzNHsdC7kHvS4nKaPQt6cXcu0Ie9uwh3EqZHwzVXDB0NfB3RlZHEjbFYedFA PGSuAMVGxtaiE7A0cx2tFps9/t1+NDL/aKoEe76jrQ4PjqNYirMM4u1cSG86bmk3mG 1cvoOrf+YPge/A9d4+Jh5EpmZjCLp5BCp9/BZu3w= Date: Mon, 10 Apr 2023 13:58:48 -0700 From: Tyler Retzlaff To: Konstantin Ananyev Cc: Konstantin Ananyev , "dev@dpdk.org" , "bruce.richardson@intel.com" , "david.marchand@redhat.com" , "thomas@monjalon.net" , "mb@smartsharesystems.com" Subject: Re: [PATCH 3/9] eal: use barrier intrinsics when compiling with msvc Message-ID: <20230410205848.GB4798@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> References: <1680558751-17931-1-git-send-email-roretzla@linux.microsoft.com> <1680558751-17931-4-git-send-email-roretzla@linux.microsoft.com> <754a877d020a4a199a5310b469e876a7@huawei.com> <20230404154906.GC23247@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <20230405000441.GA24769@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <20230406000740.GB2287@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <624ce08e-6f67-e078-3f37-c9dab7b094f4@yandex.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <624ce08e-6f67-e078-3f37-c9dab7b094f4@yandex.ru> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Mon, Apr 10, 2023 at 09:02:00PM +0100, Konstantin Ananyev wrote: > 06/04/2023 01:07, Tyler Retzlaff пишет: > >On Wed, Apr 05, 2023 at 10:57:02AM +0000, Konstantin Ananyev wrote: > >> > >>>>>>>Inline assembly is not supported for msvc x64 instead use > >>>>>>>_{Read,Write,ReadWrite}Barrier() intrinsics. > >>>>>>> > >>>>>>>Signed-off-by: Tyler Retzlaff > >>>>>>>--- > >>>>>>> lib/eal/include/generic/rte_atomic.h | 4 ++++ > >>>>>>> lib/eal/x86/include/rte_atomic.h | 10 +++++++++- > >>>>>>> 2 files changed, 13 insertions(+), 1 deletion(-) > >>>>>>> > >>>>>>>diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h > >>>>>>>index 234b268..e973184 100644 > >>>>>>>--- a/lib/eal/include/generic/rte_atomic.h > >>>>>>>+++ b/lib/eal/include/generic/rte_atomic.h > >>>>>>>@@ -116,9 +116,13 @@ > >>>>>>> * Guarantees that operation reordering does not occur at compile time > >>>>>>> * for operations directly before and after the barrier. > >>>>>>> */ > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC > >>>>>>> #define rte_compiler_barrier() do { \ > >>>>>>> asm volatile ("" : : : "memory"); \ > >>>>>>> } while(0) > >>>>>>>+#else > >>>>>>>+#define rte_compiler_barrier() _ReadWriteBarrier() > >>>>>>>+#endif > >>>>>>> > >>>>>>> /** > >>>>>>> * Synchronization fence between threads based on the specified memory order. > >>>>>>>diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h > >>>>>>>index f2ee1a9..5cce9ba 100644 > >>>>>>>--- a/lib/eal/x86/include/rte_atomic.h > >>>>>>>+++ b/lib/eal/x86/include/rte_atomic.h > >>>>>>>@@ -27,9 +27,13 @@ > >>>>>>> > >>>>>>> #define rte_rmb() _mm_lfence() > >>>>>>> > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC > >>>>>>> #define rte_smp_wmb() rte_compiler_barrier() > >>>>>>>- > >>>>>>> #define rte_smp_rmb() rte_compiler_barrier() > >>>>>>>+#else > >>>>>>>+#define rte_smp_wmb() _WriteBarrier() > >>>>>>>+#define rte_smp_rmb() _ReadBarrier() > >>>>>>>+#endif > >>>>>>> > >>>>>>> /* > >>>>>>> * From Intel Software Development Manual; Vol 3; > >>>>>>>@@ -66,11 +70,15 @@ > >>>>>>> static __rte_always_inline void > >>>>>>> rte_smp_mb(void) > >>>>>>> { > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC > >>>>>>> #ifdef RTE_ARCH_I686 > >>>>>>> asm volatile("lock addl $0, -128(%%esp); " ::: "memory"); > >>>>>>> #else > >>>>>>> asm volatile("lock addl $0, -128(%%rsp); " ::: "memory"); > >>>>>>> #endif > >>>>>>>+#else > >>>>>>>+ rte_compiler_barrier(); > >>>>>>>+#endif > >>>>>> > >>>>>>It doesn't look right to me: compiler_barrier is not identical to LOCK-ed operation, > >>>>>>and is not enough to serve as a proper memory barrier for SMP. > >>>>> > >>>>>i think i'm confused by the macro naming here. i'll take another look > >>>>>thank you for raising it. > >>>>> > >>>>>> > >>>>>>Another ore generic comment - do we really need to pollute all that code with RTE_TOOLCHAIN_MSVC ifdefs? > >>>>>>Right now we have ability to have subdir per arch (x86/arm/etc.). > >>>>>>Can we treat x86+windows+msvc as a special arch? > >>>>> > >>>>>i asked this question previously and confirmed in the technical board > >>>>>meeting. the answer i received was that the community did not want new > >>>>>directory/headers introduced for compiler support matrix and i should > >>>>>use #ifdef in the existing headers. > >>>> > >>>>Ok, can I then ask at least to minimize number of ifdefs to absolute > >>>>minimum? > >>> > >>>in principal no objection at all, one question though is what to do with > >>>comment based documentation attached to macros? e.g. > >>> > >>>#ifdef SOME_FOO > >>>/* some documentation */ > >>>#define some_macro > >>>#else > >>>#define some_macro > >>>#endif > >>> > >>>#ifdef SOME_FOO > >>>/* some documentation 2 */ > >>>#define some_macro2 > >>>#else > >>>#define some_macro2 > >>>#endif > >>> > >>>i can either duplicate the documentation for every define so it stays > >>>"attached" or i can only document the first expansion. let me know what > >>>you expect. > >>> > >>>so something like this? > >>> > >>>#ifdef SOME_FOO > >>>/* some documentation */ > >>>#define some_macro > >>>/* some documentation 2 */ > >>>#define some_macro2 > >>>#else > >>>#define some_macro > >>>#define some_macro2 > >>>#endif > >>> > >>>or should all documentation be duplicated? which can become a teadious > >>>redundancy for anyone maintaining it. keep in mind we might have to make > >>>an exception for rte_common.h because it seems doing this would be > >>>really ugly there. take a look let me know. > >> > >>My personal preference would be to keep one documentation block for both cases > >>(yes, I suppose it needs to be updated if required): > >> > >>/* some documentation, probably for both SOME_FOO on/off */ > >>#ifdef SOME_FOO > >>#define some_macro > >>#else > >>#define some_macro > >>#endif > >> > >> > >>> > >>>>It is really hard to read an follow acode that is heavily ifdefed. > >>>>Let say above we probably don't need to re-define > >>>>rte_smp_rmb/rte_smp_wmb, as both are boiled down to > >>>>compiler_barrier(), which is already redefined. > >>> > >>>can you take a look at v2 of the patch and re-prescribe your advise > >>>here? in v2 only the intel macros expand to the compiler barrier. though > >>>i find this vexing since as you pointed out it seems they aren't > >>>supposed to be compiler only barriers according to the documentation in > >>>generic/rte_atomic.h they are intended to be memory barriers. > >> > >>Commented, pls check if I explained my thoughts clear enough there. > >>> > >>>please help me if i've goofed up in this regard. > >>> > >>>>Another question - could it be visa-versa approach: > >>>>can we replace some inline assembly with common instincts whenever possible? > >>> > >>>msvc has only intrinsics and the conditional expansion for msvc is to > >>>use those intrinsics, gcc doesn't generally define intrinsics for processor > >>>specific code does it? > >> > >>AFAIK latest gcc (and clang) versions do support majority of these instincts: __rdtsc, _xbegin/_xend, etc. > >>So my thought was - might be we can use same instincts for all compilers... > >>One implication I can think about - older versions of gcc. > >>But might be we can re-order things and have inlines only for these oldere gcc versions? > > > >i'm going to propose if we do this we do it as a separate change later. > > > >i fear it could turn into the following dance which seems not a lot > >better given i'm sure some people will argue there is no benefit to > >removing inline assembly for gcc/clang. my preference is not to get > >side-tracked on refactoring with the short merge window. > > > >#if (defined(__clang__) && clang version < x) || > > (defined(__GNUC__) && gcc version < x) > >__asm(whatever... > >#else > >__rdtsc() > >#endif > > > Played a bit with https://godbolt.org/. > It seems that __rdtsc() and RMT builtins (xbegin/xend/xtest) are > supported all way down to gcc 4.8.1. Do you know what version of clang comes with RHEL 7. Because for 23.07 release at least I need to know it won't break clang compile on RHEL 7 either (which is what makes this painful). I played around with clang 7 (which i'm not sure is the right version) it had __rdtsc() but it did not have xbegin/xend/xtest. For the various instructions they're appearing in different versions of the compilers. I'm starting to wonder if i should drop the intrinsics changes into a separate series?