From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Gavin Hu (Arm Technology China)"
To: Jerin Jacob
CC: "dev@dpdk.org", Honnappa Nagarahalli, Steve Capper, Ola Liljedahl, nd, "stable@dpdk.org"
Date: Fri, 5 Oct 2018 00:47:28 +0000
Subject: Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load
References: <20180807031943.5331-1-gavin.hu@arm.com> <1537172244-64874-1-git-send-email-gavin.hu@arm.com> <20180929104857.GA30457@jerin>
In-Reply-To: <20180929104857.GA30457@jerin>
List-Id: DPDK patches and discussions

Hi Jerin,

Thanks for your review. The inline comments below come from our internal discussions.

BR.
Gavin

> -----Original Message-----
> From: Jerin Jacob
> Sent: Saturday, September 29, 2018 6:49 PM
> To: Gavin Hu (Arm Technology China)
> Cc: dev@dpdk.org; Honnappa Nagarahalli; Steve Capper; Ola Liljedahl; nd;
> stable@dpdk.org
> Subject: Re: [PATCH v3 1/3] ring: read tail using atomic load
>
> -----Original Message-----
> > Date: Mon, 17 Sep 2018 16:17:22 +0800
> > From: Gavin Hu
> > To: dev@dpdk.org
> > CC: gavin.hu@arm.com, Honnappa.Nagarahalli@arm.com,
> >  steve.capper@arm.com, Ola.Liljedahl@arm.com,
> >  jerin.jacob@caviumnetworks.com, nd@arm.com, stable@dpdk.org
> > Subject: [PATCH v3 1/3] ring: read tail using atomic load
> > X-Mailer: git-send-email 2.7.4
> >
> > In update_tail, read ht->tail using __atomic_load. Although the
> > compiler currently seems to be doing the right thing even without
> > __atomic_load, we don't want to give the compiler freedom to optimise
> > what should be an atomic load; it should not be arbitrarily moved
> > around.
> >
> > Fixes: 39368ebfc6 ("ring: introduce C11 memory model barrier option")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Gavin Hu
> > Reviewed-by: Honnappa Nagarahalli
> > Reviewed-by: Steve Capper
> > Reviewed-by: Ola Liljedahl
> > ---
> >  lib/librte_ring/rte_ring_c11_mem.h | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
> > index 94df3c4..234fea0 100644
> > --- a/lib/librte_ring/rte_ring_c11_mem.h
> > +++ b/lib/librte_ring/rte_ring_c11_mem.h
> > @@ -21,7 +21,8 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
> >          * we need to wait for them to complete
> >          */
> >         if (!single)
> > -               while (unlikely(ht->tail != old_val))
> > +               while (unlikely(old_val != __atomic_load_n(&ht->tail,
> > +                                       __ATOMIC_RELAXED)))
> >                         rte_pause();
>
> Since it is a while loop with rte_pause(), IMO, there is no scope for false
> compiler optimization.
> IMO, this change may not be required, though I don't see any performance
> difference with the two-core ring_perf_autotest test. With more cores it
> may have an effect. IMO, if it is not absolutely required, we can avoid
> this change.
>

Using __atomic_load_n() has two purposes:
1) The old code only works because ht->tail is declared volatile, which is not a requirement for C11 or for the use of the __atomic builtins. If ht->tail were not declared volatile and __atomic_load_n() were not used, the compiler would likely hoist the load out of the loop.
2) I think all memory locations used for synchronization should use __atomic operations for access, in order to clearly indicate that these locations (and these accesses) are used for synchronization.

The read of ht->tail needs to be atomic; a non-atomic read would not be correct. But there are no memory ordering requirements (with regard to other loads and/or stores by this thread), so relaxed memory order is sufficient.
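
To make the point about the spin-wait concrete, here is a minimal standalone sketch, not the DPDK code itself. The names tail, wait_and_update(), worker() and run_demo() are hypothetical, and sched_yield() merely stands in for rte_pause(). The shared variable is deliberately not volatile, so the only thing forcing a fresh read on every iteration is the relaxed __atomic_load_n().

```c
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

/* Hypothetical stand-in for ht->tail; deliberately NOT declared volatile. */
static uint32_t tail;

/* Same shape as update_tail(): wait for our turn, then publish new_val. */
static void wait_and_update(uint32_t old_val, uint32_t new_val)
{
    /* Relaxed is enough here: the load must be atomic and must be
     * repeated on every iteration, but it orders nothing else. */
    while (old_val != __atomic_load_n(&tail, __ATOMIC_RELAXED))
        sched_yield(); /* assumption: stands in for rte_pause() */

    /* Release store, as in the patched function. */
    __atomic_store_n(&tail, new_val, __ATOMIC_RELEASE);
}

static void *worker(void *arg)
{
    uint32_t id = (uint32_t)(uintptr_t)arg;

    wait_and_update(id, id + 1);
    return NULL;
}

/* Start the threads in reverse order so that most of them have to spin
 * until their predecessor has moved the tail. Returns the final tail. */
static uint32_t run_demo(void)
{
    enum { N = 4 };
    pthread_t t[N];

    __atomic_store_n(&tail, 0, __ATOMIC_RELAXED);
    for (int i = N - 1; i >= 0; i--)
        pthread_create(&t[i], NULL, worker, (void *)(uintptr_t)i);
    for (int i = 0; i < N; i++)
        pthread_join(t[i], NULL);

    return __atomic_load_n(&tail, __ATOMIC_RELAXED);
}
```

Calling run_demo() from a main() should leave the tail at 4, since each thread only advances it after its predecessor has done so. If the loop read tail with a plain load, the compiler would be allowed to read it once and spin forever.
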

Another aspect of using __atomic_load_n() is that the compiler cannot "optimise" this load (e.g. combine, hoist etc), it has to be done as specified in the source code which is also what we need here.

One point worth mentioning though is that this change is for the rte_ring_c11_mem.h file, not the legacy ring. It may be worth persisting with getting the C11 code right when people are less excited about sending a release out?

We can explain that for C11 we would prefer to do loads and stores as per the C11 memory model. In the case of rte_ring, the code is separated cleanly into C11 specific files anyway.

I think reading ht->tail using __atomic_load_n() is the most appropriate way. We show that ht->tail is used for synchronization, we acknowledge that ht->tail may be written by other threads without any other kind of synchronization (e.g. no lock involved) and we require an atomic load (any write to ht->tail must also be atomic).

Using volatile and explicit compiler (or processor) memory barriers (fences) is the legacy pre-C11 way of accomplishing these things. There's a reason why C11/C++11 moved away from the old ways.

> >
> >         __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
> > --
> > 2.7.4
> >
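
As a rough illustration of the C11 style argued for above, the sketch below hands a value from one thread to another using only __atomic builtins. The names payload, ready, producer(), consume() and run_handoff() are hypothetical and nothing here is taken from rte_ring; sched_yield() again stands in for rte_pause(). The comment in producer() indicates what the legacy volatile-plus-barrier form would look like, assuming rte_smp_wmb() as the write barrier.

```c
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

static uint32_t payload; /* plain data, published through 'ready' */
static uint32_t ready;   /* synchronization variable, not volatile */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Legacy style would be roughly:
     *     payload = 42; rte_smp_wmb(); ready = 1;   (ready volatile)
     * The C11 style attaches the ordering to the store itself. */
    __atomic_store_n(&ready, 1, __ATOMIC_RELEASE);
    return NULL;
}

static uint32_t consume(void)
{
    /* Acquire pairs with the release store above, so once 'ready' is
     * seen as 1 the write to 'payload' is visible as well. */
    while (!__atomic_load_n(&ready, __ATOMIC_ACQUIRE))
        sched_yield(); /* assumption: stands in for rte_pause() */
    return payload;
}

/* Run one producer thread and consume its value on the calling thread. */
static uint32_t run_handoff(void)
{
    pthread_t t;
    uint32_t v;

    payload = 0;
    __atomic_store_n(&ready, 0, __ATOMIC_RELAXED);
    pthread_create(&t, NULL, producer, NULL);
    v = consume();
    pthread_join(t, NULL);
    return v;
}
```

Here the accesses to ready are visibly the synchronization points, and the required ordering is stated on each access rather than implied by a separate fence.
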