From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 3C5CC42913; Mon, 10 Apr 2023 22:02:08 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A47EB40DDC; Mon, 10 Apr 2023 22:02:07 +0200 (CEST) Received: from forward205c.mail.yandex.net (forward205c.mail.yandex.net [178.154.239.216]) by mails.dpdk.org (Postfix) with ESMTP id 2FF0240A81 for ; Mon, 10 Apr 2023 22:02:06 +0200 (CEST) Received: from mail-nwsmtp-smtp-production-main-57.myt.yp-c.yandex.net (mail-nwsmtp-smtp-production-main-57.myt.yp-c.yandex.net [IPv6:2a02:6b8:c12:1488:0:640:3dc6:0]) by forward205c.mail.yandex.net (Yandex) with ESMTP id 784314231D; Mon, 10 Apr 2023 23:02:05 +0300 (MSK) Received: by mail-nwsmtp-smtp-production-main-57.myt.yp-c.yandex.net (smtp/Yandex) with ESMTPSA id 02bAlr9Wqa60-xQkjyPDX; Mon, 10 Apr 2023 23:02:04 +0300 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1681156924; bh=ReJAXrWHbmIeo6/F8Bf9o99sK01IcDMRCtb++pp7ZAE=; h=From:In-Reply-To:Cc:Date:References:To:Subject:Message-ID; b=ltiOdYN4UHT4wwGmWyAYsKKfpdqStzaE9mmKyarstjP0XV8pAaqnstXPUrxfE/clN UgUbWNF5vVvs0P7MyLuf+uUXqe3s0tzHIGZt6JHMPqsaFhBNH6fYc5eNtzOdgDd8cT JjzovCyynxN86+o4P30F7F602KwK46IAaqFeoTJ0= Authentication-Results: mail-nwsmtp-smtp-production-main-57.myt.yp-c.yandex.net; dkim=pass header.i=@yandex.ru Message-ID: <624ce08e-6f67-e078-3f37-c9dab7b094f4@yandex.ru> Date: Mon, 10 Apr 2023 21:02:00 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0 Subject: Re: [PATCH 3/9] eal: use barrier intrinsics when compiling with msvc Content-Language: en-US To: Tyler Retzlaff , Konstantin Ananyev Cc: "dev@dpdk.org" , "bruce.richardson@intel.com" , "david.marchand@redhat.com" , "thomas@monjalon.net" , "mb@smartsharesystems.com" References: <1680558751-17931-1-git-send-email-roretzla@linux.microsoft.com> <1680558751-17931-4-git-send-email-roretzla@linux.microsoft.com> <754a877d020a4a199a5310b469e876a7@huawei.com> <20230404154906.GC23247@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <20230405000441.GA24769@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <20230406000740.GB2287@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> From: Konstantin Ananyev In-Reply-To: <20230406000740.GB2287@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org 06/04/2023 01:07, Tyler Retzlaff пишет: > On Wed, Apr 05, 2023 at 10:57:02AM +0000, Konstantin Ananyev wrote: >> >>>>>>> Inline assembly is not supported for msvc x64 instead use >>>>>>> _{Read,Write,ReadWrite}Barrier() intrinsics. >>>>>>> >>>>>>> Signed-off-by: Tyler Retzlaff >>>>>>> --- >>>>>>> lib/eal/include/generic/rte_atomic.h | 4 ++++ >>>>>>> lib/eal/x86/include/rte_atomic.h | 10 +++++++++- >>>>>>> 2 files changed, 13 insertions(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h >>>>>>> index 234b268..e973184 100644 >>>>>>> --- a/lib/eal/include/generic/rte_atomic.h >>>>>>> +++ b/lib/eal/include/generic/rte_atomic.h >>>>>>> @@ -116,9 +116,13 @@ >>>>>>> * Guarantees that operation reordering does not occur at compile time >>>>>>> * for operations directly before and after the barrier. >>>>>>> */ >>>>>>> +#ifndef RTE_TOOLCHAIN_MSVC >>>>>>> #define rte_compiler_barrier() do { \ >>>>>>> asm volatile ("" : : : "memory"); \ >>>>>>> } while(0) >>>>>>> +#else >>>>>>> +#define rte_compiler_barrier() _ReadWriteBarrier() >>>>>>> +#endif >>>>>>> >>>>>>> /** >>>>>>> * Synchronization fence between threads based on the specified memory order. >>>>>>> diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h >>>>>>> index f2ee1a9..5cce9ba 100644 >>>>>>> --- a/lib/eal/x86/include/rte_atomic.h >>>>>>> +++ b/lib/eal/x86/include/rte_atomic.h >>>>>>> @@ -27,9 +27,13 @@ >>>>>>> >>>>>>> #define rte_rmb() _mm_lfence() >>>>>>> >>>>>>> +#ifndef RTE_TOOLCHAIN_MSVC >>>>>>> #define rte_smp_wmb() rte_compiler_barrier() >>>>>>> - >>>>>>> #define rte_smp_rmb() rte_compiler_barrier() >>>>>>> +#else >>>>>>> +#define rte_smp_wmb() _WriteBarrier() >>>>>>> +#define rte_smp_rmb() _ReadBarrier() >>>>>>> +#endif >>>>>>> >>>>>>> /* >>>>>>> * From Intel Software Development Manual; Vol 3; >>>>>>> @@ -66,11 +70,15 @@ >>>>>>> static __rte_always_inline void >>>>>>> rte_smp_mb(void) >>>>>>> { >>>>>>> +#ifndef RTE_TOOLCHAIN_MSVC >>>>>>> #ifdef RTE_ARCH_I686 >>>>>>> asm volatile("lock addl $0, -128(%%esp); " ::: "memory"); >>>>>>> #else >>>>>>> asm volatile("lock addl $0, -128(%%rsp); " ::: "memory"); >>>>>>> #endif >>>>>>> +#else >>>>>>> + rte_compiler_barrier(); >>>>>>> +#endif >>>>>> >>>>>> It doesn't look right to me: compiler_barrier is not identical to LOCK-ed operation, >>>>>> and is not enough to serve as a proper memory barrier for SMP. >>>>> >>>>> i think i'm confused by the macro naming here. i'll take another look >>>>> thank you for raising it. >>>>> >>>>>> >>>>>> Another ore generic comment - do we really need to pollute all that code with RTE_TOOLCHAIN_MSVC ifdefs? >>>>>> Right now we have ability to have subdir per arch (x86/arm/etc.). >>>>>> Can we treat x86+windows+msvc as a special arch? >>>>> >>>>> i asked this question previously and confirmed in the technical board >>>>> meeting. the answer i received was that the community did not want new >>>>> directory/headers introduced for compiler support matrix and i should >>>>> use #ifdef in the existing headers. >>>> >>>> Ok, can I then ask at least to minimize number of ifdefs to absolute >>>> minimum? >>> >>> in principal no objection at all, one question though is what to do with >>> comment based documentation attached to macros? e.g. >>> >>> #ifdef SOME_FOO >>> /* some documentation */ >>> #define some_macro >>> #else >>> #define some_macro >>> #endif >>> >>> #ifdef SOME_FOO >>> /* some documentation 2 */ >>> #define some_macro2 >>> #else >>> #define some_macro2 >>> #endif >>> >>> i can either duplicate the documentation for every define so it stays >>> "attached" or i can only document the first expansion. let me know what >>> you expect. >>> >>> so something like this? >>> >>> #ifdef SOME_FOO >>> /* some documentation */ >>> #define some_macro >>> /* some documentation 2 */ >>> #define some_macro2 >>> #else >>> #define some_macro >>> #define some_macro2 >>> #endif >>> >>> or should all documentation be duplicated? which can become a teadious >>> redundancy for anyone maintaining it. keep in mind we might have to make >>> an exception for rte_common.h because it seems doing this would be >>> really ugly there. take a look let me know. >> >> My personal preference would be to keep one documentation block for both cases >> (yes, I suppose it needs to be updated if required): >> >> /* some documentation, probably for both SOME_FOO on/off */ >> #ifdef SOME_FOO >> #define some_macro >> #else >> #define some_macro >> #endif >> >> >>> >>>> It is really hard to read an follow acode that is heavily ifdefed. >>>> Let say above we probably don't need to re-define >>>> rte_smp_rmb/rte_smp_wmb, as both are boiled down to >>>> compiler_barrier(), which is already redefined. >>> >>> can you take a look at v2 of the patch and re-prescribe your advise >>> here? in v2 only the intel macros expand to the compiler barrier. though >>> i find this vexing since as you pointed out it seems they aren't >>> supposed to be compiler only barriers according to the documentation in >>> generic/rte_atomic.h they are intended to be memory barriers. >> >> Commented, pls check if I explained my thoughts clear enough there. >> >>> >>> please help me if i've goofed up in this regard. >>> >>>> Another question - could it be visa-versa approach: >>>> can we replace some inline assembly with common instincts whenever possible? >>> >>> msvc has only intrinsics and the conditional expansion for msvc is to >>> use those intrinsics, gcc doesn't generally define intrinsics for processor >>> specific code does it? >> >> AFAIK latest gcc (and clang) versions do support majority of these instincts: __rdtsc, _xbegin/_xend, etc. >> So my thought was - might be we can use same instincts for all compilers... >> One implication I can think about - older versions of gcc. >> But might be we can re-order things and have inlines only for these oldere gcc versions? > > i'm going to propose if we do this we do it as a separate change later. > > i fear it could turn into the following dance which seems not a lot > better given i'm sure some people will argue there is no benefit to > removing inline assembly for gcc/clang. my preference is not to get > side-tracked on refactoring with the short merge window. > > #if (defined(__clang__) && clang version < x) || > (defined(__GNUC__) && gcc version < x) > __asm(whatever... > #else > __rdtsc() > #endif Played a bit with https://godbolt.org/. It seems that __rdtsc() and RMT builtins (xbegin/xend/xtest) are supported all way down to gcc 4.8.1.