From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 66A0D4291F; Tue, 11 Apr 2023 23:22:09 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 5426240E2D; Tue, 11 Apr 2023 23:22:09 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 0CB3640DFD for ; Tue, 11 Apr 2023 23:22:08 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1086) id 5CD912171FA3; Tue, 11 Apr 2023 14:22:07 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 5CD912171FA3 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1681248127; bh=l4pT6IW+4xvhyBV2OhsciXv/XfjsZhQ5DrLkutCxVzk=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=ecDzWLWqdjWfJKILKlLh2Tg7CnDNXYs7GWMVxcOjfHqtldJ/plLs0YC6zvcd6IqUK GYiOlR8FfpnaI0wAN4zESQhbakX/Q0eD0bu6yRIqObIed9bsnspDs9u8z2YB166D7e ezBhug7E4C7HBi0phjgL+DkpbgqicFttYPyJDx3w= Date: Tue, 11 Apr 2023 14:22:07 -0700 From: Tyler Retzlaff To: Bruce Richardson Cc: Konstantin Ananyev , Konstantin Ananyev , "dev@dpdk.org" , "david.marchand@redhat.com" , "thomas@monjalon.net" , "mb@smartsharesystems.com" Subject: Re: [PATCH 3/9] eal: use barrier intrinsics when compiling with msvc Message-ID: <20230411212207.GB9591@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> References: <1680558751-17931-4-git-send-email-roretzla@linux.microsoft.com> <754a877d020a4a199a5310b469e876a7@huawei.com> <20230404154906.GC23247@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <20230405000441.GA24769@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <20230406000740.GB2287@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <624ce08e-6f67-e078-3f37-c9dab7b094f4@yandex.ru> <20230410205848.GB4798@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Tue, Apr 11, 2023 at 10:10:48AM +0100, Bruce Richardson wrote: > On Mon, Apr 10, 2023 at 01:58:48PM -0700, Tyler Retzlaff wrote: > > On Mon, Apr 10, 2023 at 09:02:00PM +0100, Konstantin Ananyev wrote: > > > 06/04/2023 01:07, Tyler Retzlaff пишет: > > > >On Wed, Apr 05, 2023 at 10:57:02AM +0000, Konstantin Ananyev wrote: > > > >> > > > >>>>>>>Inline assembly is not supported for msvc x64 instead use > > > >>>>>>>_{Read,Write,ReadWrite}Barrier() intrinsics. > > > >>>>>>> > > > >>>>>>>Signed-off-by: Tyler Retzlaff > > > >>>>>>>--- > > > >>>>>>> lib/eal/include/generic/rte_atomic.h | 4 ++++ > > > >>>>>>> lib/eal/x86/include/rte_atomic.h | 10 +++++++++- > > > >>>>>>> 2 files changed, 13 insertions(+), 1 deletion(-) > > > >>>>>>> > > > >>>>>>>diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h > > > >>>>>>>index 234b268..e973184 100644 > > > >>>>>>>--- a/lib/eal/include/generic/rte_atomic.h > > > >>>>>>>+++ b/lib/eal/include/generic/rte_atomic.h > > > >>>>>>>@@ -116,9 +116,13 @@ > > > >>>>>>> * Guarantees that operation reordering does not occur at compile time > > > >>>>>>> * for operations directly before and after the barrier. > > > >>>>>>> */ > > > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC > > > >>>>>>> #define rte_compiler_barrier() do { \ > > > >>>>>>> asm volatile ("" : : : "memory"); \ > > > >>>>>>> } while(0) > > > >>>>>>>+#else > > > >>>>>>>+#define rte_compiler_barrier() _ReadWriteBarrier() > > > >>>>>>>+#endif > > > >>>>>>> > > > >>>>>>> /** > > > >>>>>>> * Synchronization fence between threads based on the specified memory order. > > > >>>>>>>diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h > > > >>>>>>>index f2ee1a9..5cce9ba 100644 > > > >>>>>>>--- a/lib/eal/x86/include/rte_atomic.h > > > >>>>>>>+++ b/lib/eal/x86/include/rte_atomic.h > > > >>>>>>>@@ -27,9 +27,13 @@ > > > >>>>>>> > > > >>>>>>> #define rte_rmb() _mm_lfence() > > > >>>>>>> > > > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC > > > >>>>>>> #define rte_smp_wmb() rte_compiler_barrier() > > > >>>>>>>- > > > >>>>>>> #define rte_smp_rmb() rte_compiler_barrier() > > > >>>>>>>+#else > > > >>>>>>>+#define rte_smp_wmb() _WriteBarrier() > > > >>>>>>>+#define rte_smp_rmb() _ReadBarrier() > > > >>>>>>>+#endif > > > >>>>>>> > > > >>>>>>> /* > > > >>>>>>> * From Intel Software Development Manual; Vol 3; > > > >>>>>>>@@ -66,11 +70,15 @@ > > > >>>>>>> static __rte_always_inline void > > > >>>>>>> rte_smp_mb(void) > > > >>>>>>> { > > > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC > > > >>>>>>> #ifdef RTE_ARCH_I686 > > > >>>>>>> asm volatile("lock addl $0, -128(%%esp); " ::: "memory"); > > > >>>>>>> #else > > > >>>>>>> asm volatile("lock addl $0, -128(%%rsp); " ::: "memory"); > > > >>>>>>> #endif > > > >>>>>>>+#else > > > >>>>>>>+ rte_compiler_barrier(); > > > >>>>>>>+#endif > > > >>>>>> > > > >>>>>>It doesn't look right to me: compiler_barrier is not identical to LOCK-ed operation, > > > >>>>>>and is not enough to serve as a proper memory barrier for SMP. > > > >>>>> > > > >>>>>i think i'm confused by the macro naming here. i'll take another look > > > >>>>>thank you for raising it. > > > >>>>> > > > >>>>>> > > > >>>>>>Another ore generic comment - do we really need to pollute all that code with RTE_TOOLCHAIN_MSVC ifdefs? > > > >>>>>>Right now we have ability to have subdir per arch (x86/arm/etc.). > > > >>>>>>Can we treat x86+windows+msvc as a special arch? > > > >>>>> > > > >>>>>i asked this question previously and confirmed in the technical board > > > >>>>>meeting. the answer i received was that the community did not want new > > > >>>>>directory/headers introduced for compiler support matrix and i should > > > >>>>>use #ifdef in the existing headers. > > > >>>> > > > >>>>Ok, can I then ask at least to minimize number of ifdefs to absolute > > > >>>>minimum? > > > >>> > > > >>>in principal no objection at all, one question though is what to do with > > > >>>comment based documentation attached to macros? e.g. > > > >>> > > > >>>#ifdef SOME_FOO > > > >>>/* some documentation */ > > > >>>#define some_macro > > > >>>#else > > > >>>#define some_macro > > > >>>#endif > > > >>> > > > >>>#ifdef SOME_FOO > > > >>>/* some documentation 2 */ > > > >>>#define some_macro2 > > > >>>#else > > > >>>#define some_macro2 > > > >>>#endif > > > >>> > > > >>>i can either duplicate the documentation for every define so it stays > > > >>>"attached" or i can only document the first expansion. let me know what > > > >>>you expect. > > > >>> > > > >>>so something like this? > > > >>> > > > >>>#ifdef SOME_FOO > > > >>>/* some documentation */ > > > >>>#define some_macro > > > >>>/* some documentation 2 */ > > > >>>#define some_macro2 > > > >>>#else > > > >>>#define some_macro > > > >>>#define some_macro2 > > > >>>#endif > > > >>> > > > >>>or should all documentation be duplicated? which can become a teadious > > > >>>redundancy for anyone maintaining it. keep in mind we might have to make > > > >>>an exception for rte_common.h because it seems doing this would be > > > >>>really ugly there. take a look let me know. > > > >> > > > >>My personal preference would be to keep one documentation block for both cases > > > >>(yes, I suppose it needs to be updated if required): > > > >> > > > >>/* some documentation, probably for both SOME_FOO on/off */ > > > >>#ifdef SOME_FOO > > > >>#define some_macro > > > >>#else > > > >>#define some_macro > > > >>#endif > > > >> > > > >> > > > >>> > > > >>>>It is really hard to read an follow acode that is heavily ifdefed. > > > >>>>Let say above we probably don't need to re-define > > > >>>>rte_smp_rmb/rte_smp_wmb, as both are boiled down to > > > >>>>compiler_barrier(), which is already redefined. > > > >>> > > > >>>can you take a look at v2 of the patch and re-prescribe your advise > > > >>>here? in v2 only the intel macros expand to the compiler barrier. though > > > >>>i find this vexing since as you pointed out it seems they aren't > > > >>>supposed to be compiler only barriers according to the documentation in > > > >>>generic/rte_atomic.h they are intended to be memory barriers. > > > >> > > > >>Commented, pls check if I explained my thoughts clear enough there. > > > >>> > > > >>>please help me if i've goofed up in this regard. > > > >>> > > > >>>>Another question - could it be visa-versa approach: > > > >>>>can we replace some inline assembly with common instincts whenever possible? > > > >>> > > > >>>msvc has only intrinsics and the conditional expansion for msvc is to > > > >>>use those intrinsics, gcc doesn't generally define intrinsics for processor > > > >>>specific code does it? > > > >> > > > >>AFAIK latest gcc (and clang) versions do support majority of these instincts: __rdtsc, _xbegin/_xend, etc. > > > >>So my thought was - might be we can use same instincts for all compilers... > > > >>One implication I can think about - older versions of gcc. > > > >>But might be we can re-order things and have inlines only for these oldere gcc versions? > > > > > > > >i'm going to propose if we do this we do it as a separate change later. > > > > > > > >i fear it could turn into the following dance which seems not a lot > > > >better given i'm sure some people will argue there is no benefit to > > > >removing inline assembly for gcc/clang. my preference is not to get > > > >side-tracked on refactoring with the short merge window. > > > > > > > >#if (defined(__clang__) && clang version < x) || > > > > (defined(__GNUC__) && gcc version < x) > > > >__asm(whatever... > > > >#else > > > >__rdtsc() > > > >#endif > > > > > > > > > Played a bit with https://godbolt.org/. > > > It seems that __rdtsc() and RMT builtins (xbegin/xend/xtest) are > > > supported all way down to gcc 4.8.1. > > > > Do you know what version of clang comes with RHEL 7. Because for 23.07 > > release at least I need to know it won't break clang compile on RHEL 7 > > either (which is what makes this painful). > > > > I don't know if we ever stated that we need to support clang from RHEL 7. I > always assumed that it was just gcc 4.8 compiler, but I suppose we never > stated either that we didn't need to support it.... > > > I played around with clang 7 (which i'm not sure is the right version) > > it had __rdtsc() but it did not have xbegin/xend/xtest. For the various > > instructions they're appearing in different versions of the compilers. > > > > Running Centos 7 in a VM, I see Clang v3.4.2. Konstantin, Bruce okay, i'm trying this. i sent a new series version where i use the gcc intrinsics if they are available in gcc 4.9. for intrinsics that were not available in 4.9 they have been left as conditional compile for now. if the CI blows up i will revert back to conditional compile. this series isn't supposed to be about cleaning up existing code so i'm trying to be helpful here but please check out the latest and ack if you're happy. if we want refactor / cleanups beyond this i'll commit to making the changes in subsequent series that aren't *blocking* this work. thanks