From: Diogo Behrens
To: Thomas Monjalon, Phil Yang, Honnappa Nagarahalli
CC: "dev@dpdk.org", nd
Date: Wed, 7 Oct 2020 09:55:27 +0000
Message-ID: <05e6a3569608493abbcc4dba618c5c2c@huawei.com>
References: <20200826092002.19395-1-diogo.behrens@huawei.com> <1947647.zX4bR4m4Ni@thomas>
In-Reply-To: <1947647.zX4bR4m4Ni@thomas>
Subject: Re: [dpdk-dev] [PATCH] librte_eal: fix mcslock hang on weak memory
List-Id: DPDK patches and discussions

Hi Thomas,

we are still waiting for the comments from Honnappa. In our understanding, the missing barrier is a bug according to the model. We reproduced the scenario in herd7, which represents the authoritative memory model: https://developer.arm.com/architectures/cpu-architecture/a-profile/memory-model-tool

Here is a litmus test showing that the XCHG (when compiled to LDAXR and STLXR) is not atomic with respect to memory updates to other locations:
-----
AArch64 XCHG-nonatomic
{
0:X1=locked; 0:X3=next;
1:X1=locked; 1:X3=next; 1:X5=tail;
}
 P0           | P1                 ;
 LDR W0, [X3] | MOV W0, #1         ;
 CBZ W0, end  | STR W0, [X1]       ; (* init locked *)
 MOV W2, #2   | MOV W2, #0         ;
 STR W2, [X1] | xchg:              ;
 end:         | LDAXR W6, [X5]     ;
 NOP          | STLXR W4, W0, [X5] ;
 NOP          | CBNZ W4, xchg      ;
 NOP          | STR W0, [X3]       ; (* set next *)
exists
(0:X2=2 /\ locked=1)
-----
(web version of herd7: http://diy.inria.fr/www/?record=aarch64)

P1 is trying to acquire the lock:
- initializes locked
- does the xchg on the tail of the mcslock
- sets the next

P0 is releasing the lock:
- if next is not set, just terminates
- if next is set, stores 2 in locked

The initialization of locked should never overwrite the store of 2 to locked, but it does.

To prevent that reordering, the last store of P1 needs "release" semantics, i.e., it should be an STLR.

This is equivalent to the reordering occurring in the mcslock of librte_eal.

Best regards,
-Diogo

-----Original Message-----
From: Thomas Monjalon [mailto:thomas@monjalon.net]
Sent: Tuesday, October 6, 2020 11:50 PM
To: Phil Yang; Diogo Behrens; Honnappa Nagarahalli
Cc: dev@dpdk.org; nd
Subject: Re: [dpdk-dev] [PATCH] librte_eal: fix mcslock hang on weak memory

31/08/2020 20:45, Honnappa Nagarahalli:
>
> Hi Diogo,
>
> Thanks for your explanation.
>
> As documented in https://developer.arm.com/documentation/ddi0487/fc B2.9.5 Load-Exclusive and Store-Exclusive instruction usage restrictions:
> " Between the Load-Exclusive and the Store-Exclusive, there are no
> explicit memory accesses, preloads, direct or indirect System register
> writes, address translation instructions, cache or TLB maintenance instructions, exception generating instructions, exception returns, or indirect branches."
> [Honnappa] This is a requirement on the software, not on the micro-architecture.
> We are having few discussions internally, will get back soon.
>
> So it is not allowed to insert (1) & (4) between (2, 3). The cmpxchg operation is atomic.

Please what is the conclusion?
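
For readers following the thread, the ordering at stake in the litmus test (init locked, xchg on tail, set next) can be sketched in C11 atomics. This is a minimal illustration under stated assumptions, not the librte_eal code: the names mcs_node_t, mcs_tail_t, mcs_lock and mcs_unlock are hypothetical, and the choice of memory orders is one way to express the "release on the store that sets next" proposed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical MCS node and lock types, for illustration only. */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next; /* successor in the queue, if any */
    atomic_int locked;               /* 1 while this waiter must spin  */
} mcs_node_t;

typedef _Atomic(mcs_node_t *) mcs_tail_t;

static void mcs_lock(mcs_tail_t *tail, mcs_node_t *me)
{
    assert(tail != NULL && me != NULL);

    /* "init locked" in P1: a plain store, free to be reordered
     * unless something later orders it. */
    atomic_store_explicit(&me->locked, 1, memory_order_relaxed);
    atomic_store_explicit(&me->next, NULL, memory_order_relaxed);

    /* "xchg on the tail": the LDAXR/STLXR loop in the litmus test. */
    mcs_node_t *prev =
        atomic_exchange_explicit(tail, me, memory_order_acq_rel);
    if (prev == NULL)
        return; /* queue was empty: lock acquired */

    /* "set next": a release store, so the init of locked above is
     * visible before the predecessor can observe this node. With a
     * relaxed store here, the init could land after the unlock. */
    atomic_store_explicit(&prev->next, me, memory_order_release);

    /* Spin until the predecessor hands the lock over. */
    while (atomic_load_explicit(&me->locked, memory_order_acquire))
        ;
}

static void mcs_unlock(mcs_tail_t *tail, mcs_node_t *me)
{
    assert(tail != NULL && me != NULL);

    mcs_node_t *next =
        atomic_load_explicit(&me->next, memory_order_acquire);
    if (next == NULL) {
        /* No visible successor: try to reset the tail to empty. */
        mcs_node_t *expected = me;
        if (atomic_compare_exchange_strong_explicit(
                tail, &expected, NULL,
                memory_order_release, memory_order_relaxed))
            return;
        /* A successor did the xchg but has not set next yet. */
        while ((next = atomic_load_explicit(&me->next,
                                            memory_order_acquire)) == NULL)
            ;
    }
    /* Hand over: this is the store that must not be overwritten
     * by the successor's late "init locked". */
    atomic_store_explicit(&next->locked, 0, memory_order_release);
}
```

In this sketch the release store to prev->next pairs with the acquire load of me->next in mcs_unlock, which is what keeps the waiter's initialization of locked from overtaking the hand-over store. Whether the actual librte_eal patch uses exactly these orders should be checked against the patch itself.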