From: "Ananyev, Konstantin"
To: Wang Dong, "dev@dpdk.org"
Date: Thu, 7 May 2015 16:34:01 +0000
Message-ID: <2601191342CEEE43887BDE71AB977258214255E7@irsmsx105.ger.corp.intel.com>
References: <2601191342CEEE43887BDE71AB97725821424E84@irsmsx105.ger.corp.intel.com>
Subject: Re: [dpdk-dev] [PATCH] librte_eal:Using compiler memory barrier for IA processor's rte_wmb/rte_rmb.

Hi Dong,

> -----Original Message-----
> From: Wang Dong [mailto:dong.wang.pro@hotmail.com]
> Sent: Thursday, May 07, 2015 4:28 PM
> To: Ananyev, Konstantin; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] librte_eal:Using compiler memory barrier for IA processor's rte_wmb/rte_rmb.
>
> Hi Konstantin,
>
> > Hi Dong,
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of WangDong
> >> Sent: Tuesday, May 05, 2015 4:38 PM
> >> To: dev@dpdk.org
> >> Subject: [dpdk-dev] [PATCH] librte_eal:Using compiler memory barrier for IA processor's rte_wmb/rte_rmb.
> >>
> >> The current implementation of rte_wmb/rte_rmb for x86 uses a processor memory barrier. It's unnecessary for IA processors;
> >> a compiler memory barrier is enough.
> >
> > I wouldn't say they are 'unnecessary'.
> > There are situations, even on IA, when you need _fence_ instructions.
> > So, please leave rte_*mb() macros unmodified.
> OK, leave them unmodified, but I really can't find a situation to use
> sfence and lfence instructions.

For example:
http://bartoszmilewski.com/2008/11/05/who-ordered-memory-fences-on-an-x86/
http://dpdk.org/ml/archives/dev/2014-May/002613.html

>
>
> > I still think that we need to create a new set of architecture dependent macros, as discussed before.
> > Probably, by analogy with the linux kernel, rte_smp_*mb() is a good name for them.
> > Though if you have some better name in mind, I am open to suggestions here.
> What about rte_dma_*mb()? I find dma_*mb() in linux-4.0.1, it looks good ~~

Hmm, but why _dma_?
We need the same thing for multi-core communication too.
If rte_smp_ is not good enough, might be: rte_arch_?
>
> >
> >> But if DPDK is running on an AMD processor, maybe we should use a processor memory barrier.
> >
> > As far as I remember, amd has the same memory ordering model.
> It's too hard to find an AMD software developer manual.....

There, for example:
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/24593_APM_v21.pdf
?

Konstantin

>
> Dong
>
> > So, I don't think we need #ifdef RTE_ARCH_X86_IA here.
> >
> > Konstantin
> >
> >> I add a macro to distinguish them: if we compile DPDK for an IA processor, adding the macro (RTE_ARCH_X86_IA) can improve performance
> >> with a compiler memory barrier. Or we can add RTE_ARCH_X86_AMD for using a processor memory barrier; in this case, if the macro is not
> >> added, the memory ordering will not be guaranteed. Which macro is better?
> >> If this patch is applied, the PMD's old implementation of compiler memory barrier (some volatile variable) can be fixed with rte_rmb()
> >> and rte_wmb() for any architecture.
> >>
> >> ---
> >> lib/librte_eal/common/include/arch/x86/rte_atomic.h | 10 ++++++++++
> >> 1 file changed, 10 insertions(+)
> >>
> >> diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
> >> index e93e8ee..52b1e81 100644
> >> --- a/lib/librte_eal/common/include/arch/x86/rte_atomic.h
> >> +++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
> >> @@ -49,10 +49,20 @@ extern "C" {
> >>
> >> #define rte_mb() _mm_mfence()
> >>
> >> +#ifdef RTE_ARCH_X86_IA
> >> +
> >> +#define rte_wmb() rte_compiler_barrier()
> >> +
> >> +#define rte_rmb() rte_compiler_barrier()
> >> +
> >> +#else
> >> +
> >> #define rte_wmb() _mm_sfence()
> >>
> >> #define rte_rmb() _mm_lfence()
> >>
> >> +#endif
> >> +
> >> /*------------------------- 16 bit atomic operations -------------------------*/
> >>
> >> #ifndef RTE_FORCE_INTRINSICS
> >> --
> >> 1.9.1
> >
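
To make the rte_smp_*mb() idea from this thread a bit more concrete, here is a minimal sketch of a separate set of SMP-only barriers that leaves rte_wmb()/rte_rmb() untouched. Everything named sketch_* is hypothetical and not DPDK API. The sketch assumes a C11 compiler and uses <stdatomic.h> fences instead of hand-written per-architecture macros; on x86, mainstream compilers typically emit no fence instruction for the release and acquire fences (they act as compiler barriers only), while the seq_cst fence still produces a real full barrier.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical SMP barrier macros, by analogy with the Linux kernel's
 * smp_*mb(). These are NOT DPDK definitions, only an illustration.
 */
#define sketch_smp_wmb() atomic_thread_fence(memory_order_release)
#define sketch_smp_rmb() atomic_thread_fence(memory_order_acquire)
#define sketch_smp_mb()  atomic_thread_fence(memory_order_seq_cst)

/* A one-slot mailbox shared between a producer core and a consumer core. */
struct sketch_mailbox {
	uint32_t payload;   /* plain data written by the producer */
	atomic_uint ready;  /* set to 1 once payload is valid */
};

/* Producer side: write the data first, then raise the flag. */
static inline void
sketch_publish(struct sketch_mailbox *m, uint32_t value)
{
	m->payload = value;
	sketch_smp_wmb();   /* payload store is ordered before the flag store */
	atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Consumer side: returns 1 and fills *out if a value was published, else 0. */
static inline int
sketch_try_consume(struct sketch_mailbox *m, uint32_t *out)
{
	if (atomic_load_explicit(&m->ready, memory_order_relaxed) == 0)
		return 0;
	sketch_smp_rmb();   /* flag load is ordered before the payload load */
	*out = m->payload;
	return 1;
}
```

This only covers ordering between cores through ordinary cacheable memory. It says nothing about the cases where a real sfence/lfence is still wanted, which is the reason for keeping the existing rte_*mb() macros as they are.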