From: "Ananyev, Konstantin"
To: Olivier MATZ, "dev@dpdk.org"
Date: Mon, 26 May 2014 13:57:25 +0000
Subject: Re: [dpdk-dev] [PATCH] atomic: clarify use of memory barriers

Hi Olivier,

>> So with the following fragment of code:
>>
>>   extern int *x;
>>   extern __m128i a, *p;
>>   L0:
>>   _mm_stream_si128(p, a);
>>   rte_compiler_barrier();
>>   L1:
>>   *x = 0;
>>
>> There is no guarantee that the store at L0 will always be finished
>> before the store at L1.

> This code fragment looks very similar to what is done in
> __rte_ring_sp_do_enqueue():
>
>   [...]
>   ENQUEUE_PTRS(); /* I expect it is converted to an SSE store */
>   rte_compiler_barrier();
>   [...]
>   r->prod.tail = prod_next;
>
> So, according to your previous explanation, I understand that
> this code would require a write memory barrier in place of the
> compiler barrier. Am I wrong?

No, right now a compiler barrier is enough here.
ENQUEUE_PTRS() doesn't use non-temporal stores (MOVNT*), so write order should be guaranteed.
Though, if in the future we change ENQUEUE_PTRS() to use non-temporal stores, we'll have to use sfence (or mfence).

> Moreover, if I understand well, a real wmb() is needed only if
> an SSE store is issued. But the programmer may not control that,
> it's the job of the compiler.

'Normal' SIMD writes are not reordered, so it is ok for the compiler to use them if appropriate.

>> But now there seems to be a confusion: everyone has to remember that
>> smp_mb() and smp_wmb() are 'real' fences, while smp_rmb() is not.
>> That's why my suggestion was to simply keep using compiler_barrier()
>> for all cases when we don't need a real fence.

> I'm not sure the programmer has to know which smp_*mb() is a real fence
> or not. He just expects that it generates the proper CPU instructions
> that guarantee the effectiveness of the memory barrier.

In most cases just a compiler barrier is enough, but there are a few exceptions.
Always using fence instructions means introducing an unnecessary slowdown for cases when order is already guaranteed.
Not using fences in cases when they are needed means introducing a race window and possible data corruption.
That's why right now people can use either rte_compiler_barrier() or mb/rmb/wmb, whichever is appropriate for the particular case.

Konstantin