From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR01-HE1-obe.outbound.protection.outlook.com (mail-he1eur01on0085.outbound.protection.outlook.com [104.47.0.85]) by dpdk.org (Postfix) with ESMTP id 3FD6A275D for ; Fri, 15 Jul 2016 07:40:58 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=coriant.onmicrosoft.com; s=selector1-coriant-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=XN8BERzRjWPzF6T0x9fP4KdBNp7Q03iUn9OsJzmR+nA=; b=Dl3zjkAvJoseUJvAUhtGQbTL5P51A75rGUf16TbSdf69lMPSLfYqNvIkxDIZQBxKRcj1jczmOaG+Mi2130Avt01dwY/bygqfzBjIzO4cTcW8D+LzV3y7SvWSGw0N60t4Dwl8mCYmtLpVumailyzFyr0gqe6xyhvwRnDQif/cwjI= Received: from HE1PR04MB1337.eurprd04.prod.outlook.com (10.163.175.23) by HE1PR04MB1337.eurprd04.prod.outlook.com (10.163.175.23) with Microsoft SMTP Server (TLS) id 15.1.539.14; Fri, 15 Jul 2016 05:40:57 +0000 Received: from HE1PR04MB1337.eurprd04.prod.outlook.com ([10.163.175.23]) by HE1PR04MB1337.eurprd04.prod.outlook.com ([10.163.175.23]) with mapi id 15.01.0539.019; Fri, 15 Jul 2016 05:40:57 +0000 From: "Kuusisaari, Juhamatti" To: "Ananyev, Konstantin" , "'dev@dpdk.org'" Thread-Topic: [dpdk-dev] [PATCH] lib: move rte_ring read barrier to correct location Thread-Index: AQHR214MoG1fIZBOPEC9vEAt0w2bq6ATCsYAgAAIC6CAABd0gIABBgWwgAByaYCAAHRrAIAAs/QAgACLSgCAAPhwgIAAmKiAgAELgcA= Date: Fri, 15 Jul 2016 05:40:56 +0000 Message-ID: References: <20160711102055.14855-1-juhamatti.kuusisaari@coriant.com> <2601191342CEEE43887BDE71AB97725836B7C858@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB97725836B7C916@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB97725836B7D20B@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB97725836B7D628@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB97725836B7D850@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB97725836B7DD85@irsmsx105.ger.corp.intel.com> In-Reply-To: <2601191342CEEE43887BDE71AB97725836B7DD85@irsmsx105.ger.corp.intel.com> Accept-Language: fi-FI, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=Juhamatti.Kuusisaari@coriant.com; x-originating-ip: [138.111.130.175] x-ms-office365-filtering-correlation-id: dcac303f-2c48-4b34-45af-08d3ac729878 x-microsoft-exchange-diagnostics: 1; HE1PR04MB1337; 6:JQIWjOffJu6R8FZbywYx9u2zoiGGOVcyt8fKDiPSJXSOHt1TIVAhfZ2JPuKJGAZdq4IJTTeTOsJ98s1X+QW9CflkG6VAK8ekcCwGvt2T7UBR0Njnfm7AxQy7BuN5Cte2N/I1bCMbtbsvfkylnszK+wwe3i2t2Rjpkin7sIhVgb7bDW7mLM0ywl2081QFbCu84Q1ukHlg9CiwmbfibizZwb9Myk+FK5ADsh+T1iJ0TBHAhiwlYG9uYvsoHoMyFKDkfdAN0V71jajuSMRetL0Fq8seFrBqueGuG7yoPtcgkVuv7hWXmqSAOolXPs8cq6Zp0Sh6WMfsu9eAKsKKi8CQtQ==; 5:bQEgNKZLztjQy63tu2HUmQwPeG9/vvQv0OlE39E5gtFdstmIEcBIkgrQK0Kq50HEYaF7afLD+vTmKM6u0hqkwScyU7BBkx/182rC0vmzwXEaqX7WICjnnzsnzmauRa9SvkTIu5qGfhIQ2ZHf8k4Mdw==; 24:0o5marF4xe/dfSriqi+2hh07WRAJiTSJ3KQFIyygSz5Hx1WyWSD+16r7uxpbVUMYSn1Ntg4HSfwWQTABwuBlCFQ9p1ipdLceV3BqqgEuJf0=; 7:zlF+kyBoR/qNksa9BemPx+DcQodIt0WCMy5oz6R9w91iVK0ilIkgGV+v009DkHjxqhEE8hOE8H4KPeL0Krfr9zS/K0bVJybXgZWuynzymSpWb/9qlZ1is7i6s9QvhS8ZKXIAs393uboZ0iAw6aj7EQkhoySaznGZ1aohyw0g66YY89acfUwR+g/IX32Cw2N0EC8FseR7lgVGRIReVjYc69HP0Ud0DY8+ma2qAQD9NA4d8eFAPI3Q9ZSR2oGEd8cz x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:HE1PR04MB1337; x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(51653755401839); x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(601004)(2401047)(8121501046)(5005006)(3002001)(10201501046)(6055026); SRVR:HE1PR04MB1337; BCL:0; PCL:0; RULEID:; SRVR:HE1PR04MB1337; x-forefront-prvs: 00046D390F x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(6009001)(7916002)(51444003)(189002)(199003)(305945005)(19580395003)(8936002)(19580405001)(87936001)(105586002)(106356001)(81166006)(74316002)(11100500001)(7736002)(7846002)(5003600100003)(93886004)(86362001)(7696003)(101416001)(5002640100001)(6116002)(122556002)(102836003)(81156014)(586003)(3846002)(106116001)(9686002)(107886002)(2900100001)(77096005)(10400500002)(2950100001)(3660700001)(76576001)(68736007)(3280700002)(2906002)(189998001)(92566002)(76176999)(50986999)(97736004)(561944003)(33656002)(5001770100001)(66066001)(54356999)(491001); DIR:OUT; SFP:1101; SCL:1; SRVR:HE1PR04MB1337; H:HE1PR04MB1337.eurprd04.prod.outlook.com; FPR:; SPF:None; PTR:InfoNoRecords; MX:1; A:1; LANG:en; received-spf: None (protection.outlook.com: coriant.com does not designate permitted sender hosts) spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: coriant.com X-MS-Exchange-CrossTenant-originalarrivaltime: 15 Jul 2016 05:40:56.7094 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 76595477-907e-4695-988b-a6b39087332d X-MS-Exchange-Transport-CrossTenantHeadersStamped: HE1PR04MB1337 Subject: Re: [dpdk-dev] [PATCH] lib: move rte_ring read barrier to correct location X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jul 2016 05:40:58 -0000 Hi Konstantin, > Hi Juhamatti, >=20 > > > > Hi Konstantin, > > > > > > > > > It is quite safe to move the barrier before DEQUEUE because > > > > > > > after the DEQUEUE there is nothing really that we would want > > > > > > > to protect > > > with a > > > > > read barrier. > > > > > > > > > > > > I don't think so. > > > > > > If you remove barrier after DEQUEUE(), that means on systems > > > > > > with relaxed memory ordering cons.tail could be updated before > > > > > > DEQUEUE() will be finished and producer can overwrite queue > > > > > > entries that were > > > not > > > > > yet dequeued. > > > > > > So if cpu can really do such speculative out of order loads, > > > > > > then we do need for __rte_ring_sc_do_dequeue() something like: > > > > > > > > > > > > rte_smp_rmb(); > > > > > > DEQUEUE_PTRS(); > > > > > > rte_smp_rmb(); > > > > > > > > You have a valid point here, there needs to be a guarantee that > > > > cons_tail > > > cannot > > > > be updated before DEQUEUE is completed. Nevertheless, my point was > > > that it is > > > > not guaranteed with a read barrier anyway. The implementation has > > > > the > > > following > > > > sequence > > > > > > > > DEQUEUE_PTRS(); (i.e. READ/LOAD) > > > > rte_smp_rmb(); > > > > .. > > > > r->cons.tail =3D cons_next; (i.e WRITE/STORE) > > > > > > > > Above read barrier does not guarantee any ordering for the > > > > following > > > writes/stores. > > > > As a guarantee is needed, I think we in fact need to change the > > > > read barrier > > > on the > > > > dequeue to a full barrier, which guarantees the read+write order, > > > > as > > > follows > > > > > > > > DEQUEUE_PTRS(); > > > > rte_smp_mb(); > > > > .. > > > > r->cons.tail =3D cons_next; > > > > > > > > If you agree, I can for sure prepare another patch for this issue. > > > > > > Hmm, I think for __rte_ring_mc_do_dequeue() we are ok with > > > smp_rmb(), as we have to read cons.tail anyway. > > > > Are you certain that this read creates strong enough dependency between > read of cons.tail and the write of it on the mc_do_dequeue()? >=20 > Yes, I believe so. >=20 > > I think it does not really create any control dependency there as the n= ext > write is not dependent of the result of the read. >=20 > I think it is dependent: cons.tail can be updated only if it's current va= lue is > eual to precomputed before cons_head. > So cpu has to read cons.tail value first. I was thinking that it might match to this processing pattern: S1. if (a =3D=3D b) S2. a =3D a + b S3. b =3D a + b Above S3 has no control dependency to S1 and can be in theory executed in parallel.=20 In the (simplified) implementation we have: X1. while (a !=3D b) X2. { } X3. a =3D c If we would consider S3 and X3 equal, there might be a problem with coheren= ce=20 without a write barrier after the while(). It may of course be purely theor= etical,=20 depending on the HW. This is anyway the reason for my suggestion to have a= =20 write-barrier also on the mc_do_dequeue() just before tail update. -- Juhamatti=20 > > The CPU also > > knows already the value that will be written to cons.tail and that > > value does not depend on the previous read either. The CPU does not > know we are planning to do a spinlock there, so it might do things out-of= - > order without proper dependencies. > > > > > For __rte_ring_sc_do_dequeue(), I think you right, we might need > > > something stronger. > > > I don't want to put rte_smp_mb() here as it would cause full HW > > > barrier even on machines with strong memory order (IA). > > > I think that rte_smp_wmb() might be enough here: > > > it would force cpu to wait till writes in DEQUEUE_PTRS() are become > > > visible, which means reads have to be completed too. > > > > In practice I think that rte_smp_wmb() would work fine, even though it > > is not strictly according to the book. Below solution would be my > > proposal as a fix to the issue of sc dequeueing (and also to mc dequeue= ing, > if we have the problem of CPU completely ignoring the spinlock in reality > there): > > > > DEQUEUE_PTRS(); > > .. > > rte_smp_wmb(); > > r->cons.tail =3D cons_next; >=20 > As I said in previous email - it looks good for me for > _rte_ring_sc_do_dequeue(), but I am interested to hear what ARM and PPC > maintainers think about it. > Jan, Jerin do you have any comments on it? > Chao, sorry but I still not sure why PPC is considered as architecture wi= th > strong memory ordering? > Might be I am missing something obvious here. > Thank > Konstantin >=20 > > > > -- > > Juhamatti > > > > > Another option would be to define a new macro: rte_weak_mb() or so, > > > that would be expanded into CB on boxes with strong memory model, > > > and to full MB on machines with relaxed ones. > > > Interested to hear what ARM and PPC guys think. > > > Konstantin > > > > > > P.S. Another thing a bit off-topic - for PPC guys: > > > As I can see smp_rmb/smp_wmb are just a complier barriers: > > > find lib/librte_eal/common/include/arch/ppc_64/ -type f | xargs grep > > > smp_ lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h:#define > > > rte_smp_mb() rte_mb() > > > lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h:#define > > > rte_smp_wmb() rte_compiler_barrier() > > > lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h:#define > > > rte_smp_rmb() rte_compiler_barrier() My knowledge about PPC > > > architecture is rudimental, but is that really enough? > > > > > > > > > > > Thanks, > > > > -- > > > > Juhamatti > > > > > > > > > > Konstantin > > > > > > > > > > > > > The read > > > > > > > barrier is mapped to a compiler barrier on strong memory > > > > > > > model systems and this works fine too as the order of the > > > > > > > head,tail updates is still guaranteed on the new location. > > > > > > > Even if the problem would be theoretical on most systems, it > > > > > > > is worth fixing as the risk for > > > > > problems is very low. > > > > > > > > > > > > > > -- > > > > > > > Juhamatti > > > > > > > > > > > > > > > Konstantin > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Juhamatti Kuusisaari > > > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > > > > lib/librte_ring/rte_ring.h | 6 ++++-- > > > > > > > > > > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > > > > > diff --git a/lib/librte_ring/rte_ring.h > > > > > > > > > > > b/lib/librte_ring/rte_ring.h index eb45e41..a923e49 > > > > > > > > > > > 100644 > > > > > > > > > > > --- a/lib/librte_ring/rte_ring.h > > > > > > > > > > > +++ b/lib/librte_ring/rte_ring.h > > > > > > > > > > > @@ -662,9 +662,10 @@ > __rte_ring_mc_do_dequeue(struct > > > > > > > > > > > rte_ring *r, > > > > > > > > > > void **obj_table, > > > > > > > > > > > cons_ne= xt); > > > > > > > > > > > } while (unlikely(success =3D=3D 0)); > > > > > > > > > > > > > > > > > > > > > > + rte_smp_rmb(); > > > > > > > > > > > + > > > > > > > > > > > /* copy in table */ > > > > > > > > > > > DEQUEUE_PTRS(); > > > > > > > > > > > - rte_smp_rmb(); > > > > > > > > > > > > > > > > > > > > > > /* > > > > > > > > > > > * If there are other dequeues in progress > > > > > > > > > > > that preceded us, @@ -746,9 +747,10 @@ > > > > > > > > > > > __rte_ring_sc_do_dequeue(struct rte_ring *r, > > > > > > > > > > void **obj_table, > > > > > > > > > > > cons_next =3D cons_head + n; > > > > > > > > > > > r->cons.head =3D cons_next; > > > > > > > > > > > > > > > > > > > > > > + rte_smp_rmb(); > > > > > > > > > > > + > > > > > > > > > > > /* copy in table */ > > > > > > > > > > > DEQUEUE_PTRS(); > > > > > > > > > > > - rte_smp_rmb(); > > > > > > > > > > > > > > > > > > > > > > __RING_STAT_ADD(r, deq_success, n); > > > > > > > > > > > r->cons.tail =3D cons_next; > > > > > > > > > > > -- > > > > > > > > > > > 2.9.0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D > > > > > > > > > > =3D=3D > > > > > > > > > > > The information contained in this message may be > > > > > > > > > > > privileged and confidential and protected from > > > > > > > > > > > disclosure. If the reader of this message is not the > > > > > > > > > > > intended recipient, or an employee or agent > > > > > > > > > > > responsible for delivering this message to the > > > > > > > > > > > intended recipient, you are hereby notified that any > > > > > > > > > > > reproduction, dissemination or distribution of this > > > > > > > > > > > communication is strictly prohibited. If you have > > > > > > > > > > > received this communication in error, please notify > > > > > > > > > > > us immediately by replying to the message and > > > > > > > > > > > deleting it from your > > > > > > > > computer. Thank you. > > > > > > > > > > > Coriant-Tellabs > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D > > > > > > > > > > =3D=3D