From: Diogo Behrens
To: Phil Yang
Cc: dev@dpdk.org, nd, Honnappa Nagarahalli
Date: Thu, 27 Aug 2020 08:56:46 +0000
Subject: Re: [dpdk-dev] [PATCH] librte_eal: fix mcslock hang on weak memory

Hi Phil,

thanks for
taking a look at this. Indeed, the cmpxchg will have release semantics, but
implicit barriers do not provide the same ordering guarantees as DMB ISH,
i.e., atomic_thread_fence(__ATOMIC_ACQ_REL).

Let me try to explain the scenario. Here is pseudo-ARM assembly with the
instructions involved:

    STR me->locked = 1      // 1: initialize the node
    LDAXR pred = tail       // 2: cmpxchg
    STLXR tail = me         // 3: cmpxchg
    STR pred->next = me     // 4: the problematic store

Now, ARM allows instructions 1 and 2 to be reordered, and also instructions
3 and 4 to be reordered:

    LDAXR pred = tail       // 2: cmpxchg (acquires can be moved up)
    STR me->locked = 1      // 1: initialize the node
    STR pred->next = me     // 4: the problematic store
    STLXR tail = me         // 3: cmpxchg (releases can be moved down)

And since stores 1 and 4 can be reordered too, the following execution is
possible:

    LDAXR pred = tail       // 2: cmpxchg
    STR pred->next = me     // 4: the problematic store
    STR me->locked = 1      // 1: initialize the node
    STLXR tail = me         // 3: cmpxchg

In this subtle situation, the locking thread can overwrite the store to
next->locked of the unlocking thread (if it occurs between 4 and 1), and
subsequently hang waiting for me->locked to become 0.

We couldn't reproduce this with our available hardware, but we "reproduced"
it with the RMEM model checker, which precisely models the ARM memory model
(https://github.com/rems-project/rmem). The GenMC model checker
(https://github.com/MPI-SWS/genmc) also finds the issue, but its
intermediate memory model is slightly more general than the ARM model.

The release attached to store (4) prevents (1) from being reordered past it.

Please let me know if that makes sense or if I am missing something.
Thanks,
-Diogo

-----Original Message-----
From: Phil Yang [mailto:Phil.Yang@arm.com]
Sent: Wednesday, August 26, 2020 12:17 PM
To: Diogo Behrens
Cc: dev@dpdk.org; nd; Honnappa Nagarahalli; nd
Subject: RE: [PATCH] librte_eal: fix mcslock hang on weak memory

Diogo Behrens writes:

> Subject: [PATCH] librte_eal: fix mcslock hang on weak memory
>
> The initialization me->locked=1 in lock() must happen before
> next->locked=0 in unlock(), otherwise a thread may hang forever,
> waiting for me->locked to become 0. On weak memory systems (such as
> ARMv8), the current implementation allows me->locked=1 to be reordered
> with announcing the node (pred->next=me) and, consequently, to be
> reordered with next->locked=0 in unlock().
>
> This fix adds a release barrier to pred->next=me, forcing
> me->locked=1 to happen before this operation.
>
> Signed-off-by: Diogo Behrens
> ---
>  lib/librte_eal/include/generic/rte_mcslock.h | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_eal/include/generic/rte_mcslock.h
> b/lib/librte_eal/include/generic/rte_mcslock.h
> index 2bef28351..ce553f547 100644
> --- a/lib/librte_eal/include/generic/rte_mcslock.h
> +++ b/lib/librte_eal/include/generic/rte_mcslock.h
> @@ -68,7 +68,14 @@ rte_mcslock_lock(rte_mcslock_t **msl, rte_mcslock_t *me)
>  	 */
>  		return;
>  	}
> -	__atomic_store_n(&prev->next, me, __ATOMIC_RELAXED);
> +	/* The store to me->next above should also complete before the node is
> +	 * visible to the predecessor thread releasing the lock. Hence, the
> +	 * store to prev->next also requires release semantics. Note that, for
> +	 * example, on ARM, the release semantics in the exchange operation is
> +	 * not as strong as a release fence and is not sufficient to enforce
> +	 * the desired order here.

Hi Diogo,

I didn't quite understand why the exchange operation with ACQ_REL memory
ordering is not sufficient.
It emits 'stlxr' on armv8.0-a and 'swpal' on armv8.1-a (with LSE support).
Both of these instructions carry release semantics.
Please check this URL for the details: https://gcc.godbolt.org/z/Mc4373

BTW, if you captured a deadlock issue on your testbed, could you please
share your test case here?

Thanks,
Phil

> +	 */
> +	__atomic_store_n(&prev->next, me, __ATOMIC_RELEASE);
>
>  	/* The while-load of me->locked should not move above the previous
>  	 * store to prev->next. Otherwise it will cause a deadlock. Need a
> --
> 2.28.0
>