From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7E87F428D0; Thu, 6 Apr 2023 02:07:42 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 57D2D40F18; Thu, 6 Apr 2023 02:07:42 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 7DDE240DF6 for ; Thu, 6 Apr 2023 02:07:41 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1086) id A5393210DEF7; Wed, 5 Apr 2023 17:07:40 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com A5393210DEF7 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1680739660; bh=dWGnjsABnWvlInkDgxNKl+9NMn6lca+sEp4JcTO9MWQ=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=N2v0B2IH3vMA0ztBwNnlwEP8TgINllihHVRgwlS01gAovBLB7Y3SVLZj6f1MgsPBF hzd4CPEd5BTs0bOOjctG1DDM8b1bVLI/CIyR9m4CNA99P9pFhDnIfbRfL8DMF6sBP6 T8/j+m3Xs8rnzdohhur1jIbHDaqWsp912Q/Jlcy8= Date: Wed, 5 Apr 2023 17:07:40 -0700 From: Tyler Retzlaff To: Konstantin Ananyev Cc: Konstantin Ananyev , "dev@dpdk.org" , "bruce.richardson@intel.com" , "david.marchand@redhat.com" , "thomas@monjalon.net" , "mb@smartsharesystems.com" Subject: Re: [PATCH 3/9] eal: use barrier intrinsics when compiling with msvc Message-ID: <20230406000740.GB2287@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> References: <1680558751-17931-1-git-send-email-roretzla@linux.microsoft.com> <1680558751-17931-4-git-send-email-roretzla@linux.microsoft.com> <754a877d020a4a199a5310b469e876a7@huawei.com> <20230404154906.GC23247@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <20230405000441.GA24769@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Wed, Apr 05, 2023 at 10:57:02AM +0000, Konstantin Ananyev wrote: > > > > >>>Inline assembly is not supported for msvc x64 instead use > > > >>>_{Read,Write,ReadWrite}Barrier() intrinsics. > > > >>> > > > >>>Signed-off-by: Tyler Retzlaff > > > >>>--- > > > >>> lib/eal/include/generic/rte_atomic.h | 4 ++++ > > > >>> lib/eal/x86/include/rte_atomic.h | 10 +++++++++- > > > >>> 2 files changed, 13 insertions(+), 1 deletion(-) > > > >>> > > > >>>diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h > > > >>>index 234b268..e973184 100644 > > > >>>--- a/lib/eal/include/generic/rte_atomic.h > > > >>>+++ b/lib/eal/include/generic/rte_atomic.h > > > >>>@@ -116,9 +116,13 @@ > > > >>> * Guarantees that operation reordering does not occur at compile time > > > >>> * for operations directly before and after the barrier. > > > >>> */ > > > >>>+#ifndef RTE_TOOLCHAIN_MSVC > > > >>> #define rte_compiler_barrier() do { \ > > > >>> asm volatile ("" : : : "memory"); \ > > > >>> } while(0) > > > >>>+#else > > > >>>+#define rte_compiler_barrier() _ReadWriteBarrier() > > > >>>+#endif > > > >>> > > > >>> /** > > > >>> * Synchronization fence between threads based on the specified memory order. > > > >>>diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h > > > >>>index f2ee1a9..5cce9ba 100644 > > > >>>--- a/lib/eal/x86/include/rte_atomic.h > > > >>>+++ b/lib/eal/x86/include/rte_atomic.h > > > >>>@@ -27,9 +27,13 @@ > > > >>> > > > >>> #define rte_rmb() _mm_lfence() > > > >>> > > > >>>+#ifndef RTE_TOOLCHAIN_MSVC > > > >>> #define rte_smp_wmb() rte_compiler_barrier() > > > >>>- > > > >>> #define rte_smp_rmb() rte_compiler_barrier() > > > >>>+#else > > > >>>+#define rte_smp_wmb() _WriteBarrier() > > > >>>+#define rte_smp_rmb() _ReadBarrier() > > > >>>+#endif > > > >>> > > > >>> /* > > > >>> * From Intel Software Development Manual; Vol 3; > > > >>>@@ -66,11 +70,15 @@ > > > >>> static __rte_always_inline void > > > >>> rte_smp_mb(void) > > > >>> { > > > >>>+#ifndef RTE_TOOLCHAIN_MSVC > > > >>> #ifdef RTE_ARCH_I686 > > > >>> asm volatile("lock addl $0, -128(%%esp); " ::: "memory"); > > > >>> #else > > > >>> asm volatile("lock addl $0, -128(%%rsp); " ::: "memory"); > > > >>> #endif > > > >>>+#else > > > >>>+ rte_compiler_barrier(); > > > >>>+#endif > > > >> > > > >>It doesn't look right to me: compiler_barrier is not identical to LOCK-ed operation, > > > >>and is not enough to serve as a proper memory barrier for SMP. > > > > > > > >i think i'm confused by the macro naming here. i'll take another look > > > >thank you for raising it. > > > > > > > >> > > > >>Another ore generic comment - do we really need to pollute all that code with RTE_TOOLCHAIN_MSVC ifdefs? > > > >>Right now we have ability to have subdir per arch (x86/arm/etc.). > > > >>Can we treat x86+windows+msvc as a special arch? > > > > > > > >i asked this question previously and confirmed in the technical board > > > >meeting. the answer i received was that the community did not want new > > > >directory/headers introduced for compiler support matrix and i should > > > >use #ifdef in the existing headers. > > > > > > Ok, can I then ask at least to minimize number of ifdefs to absolute > > > minimum? > > > > in principal no objection at all, one question though is what to do with > > comment based documentation attached to macros? e.g. > > > > #ifdef SOME_FOO > > /* some documentation */ > > #define some_macro > > #else > > #define some_macro > > #endif > > > > #ifdef SOME_FOO > > /* some documentation 2 */ > > #define some_macro2 > > #else > > #define some_macro2 > > #endif > > > > i can either duplicate the documentation for every define so it stays > > "attached" or i can only document the first expansion. let me know what > > you expect. > > > > so something like this? > > > > #ifdef SOME_FOO > > /* some documentation */ > > #define some_macro > > /* some documentation 2 */ > > #define some_macro2 > > #else > > #define some_macro > > #define some_macro2 > > #endif > > > > or should all documentation be duplicated? which can become a teadious > > redundancy for anyone maintaining it. keep in mind we might have to make > > an exception for rte_common.h because it seems doing this would be > > really ugly there. take a look let me know. > > My personal preference would be to keep one documentation block for both cases > (yes, I suppose it needs to be updated if required): > > /* some documentation, probably for both SOME_FOO on/off */ > #ifdef SOME_FOO > #define some_macro > #else > #define some_macro > #endif > > > > > > > It is really hard to read an follow acode that is heavily ifdefed. > > > Let say above we probably don't need to re-define > > > rte_smp_rmb/rte_smp_wmb, as both are boiled down to > > > compiler_barrier(), which is already redefined. > > > > can you take a look at v2 of the patch and re-prescribe your advise > > here? in v2 only the intel macros expand to the compiler barrier. though > > i find this vexing since as you pointed out it seems they aren't > > supposed to be compiler only barriers according to the documentation in > > generic/rte_atomic.h they are intended to be memory barriers. > > Commented, pls check if I explained my thoughts clear enough there. > > > > > please help me if i've goofed up in this regard. > > > > > Another question - could it be visa-versa approach: > > > can we replace some inline assembly with common instincts whenever possible? > > > > msvc has only intrinsics and the conditional expansion for msvc is to > > use those intrinsics, gcc doesn't generally define intrinsics for processor > > specific code does it? > > AFAIK latest gcc (and clang) versions do support majority of these instincts: __rdtsc, _xbegin/_xend, etc. > So my thought was - might be we can use same instincts for all compilers... > One implication I can think about - older versions of gcc. > But might be we can re-order things and have inlines only for these oldere gcc versions? i'm going to propose if we do this we do it as a separate change later. i fear it could turn into the following dance which seems not a lot better given i'm sure some people will argue there is no benefit to removing inline assembly for gcc/clang. my preference is not to get side-tracked on refactoring with the short merge window. #if (defined(__clang__) && clang version < x) || (defined(__GNUC__) && gcc version < x) __asm(whatever... #else __rdtsc() #endif