From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: Gage Eads <gage.eads@intel.com>, dev@dpdk.org
CC: olivier.matz@6wind.com, arybchenko@solarflare.com,
 bruce.richardson@intel.com, konstantin.ananyev@intel.com,
 Gavin Hu (Arm Technology China), nd, thomas@monjalon.net
Subject: Re: [dpdk-dev] [PATCH v3 6/8] stack: add C11 atomic implementation
Date: Thu, 28 Mar 2019 23:27:05 +0000
Message-ID: <20190328232705.kLBlZOYKMGVS1n9XQlLUFqK8aRN8QGONOorFN9RbVms@z>
In-Reply-To: <20190306144559.391-7-gage.eads@intel.com>
References: <20190305164256.2367-1-gage.eads@intel.com>
 <20190306144559.391-1-gage.eads@intel.com>
 <20190306144559.391-7-gage.eads@intel.com>

> This commit adds an implementation of the lock-free stack push, pop, and
> length functions that use __atomic builtins, for systems that benefit from
> the finer-grained memory ordering control.
>
> Signed-off-by: Gage Eads <gage.eads@intel.com>
> ---
>  lib/librte_stack/Makefile            |   3 +-
>  lib/librte_stack/meson.build         |   3 +-
>  lib/librte_stack/rte_stack.h         |   4 +
>  lib/librte_stack/rte_stack_c11_mem.h | 175 +++++++++++++++++++++++++++++++++++
>  4 files changed, 183 insertions(+), 2 deletions(-)
>  create mode 100644 lib/librte_stack/rte_stack_c11_mem.h
>
> diff --git a/lib/librte_stack/Makefile b/lib/librte_stack/Makefile
> index 3ecddf033..94a7c1476 100644
> --- a/lib/librte_stack/Makefile
> +++ b/lib/librte_stack/Makefile
> @@ -19,6 +19,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_STACK) := rte_stack.c
>
>  # install includes
>  SYMLINK-$(CONFIG_RTE_LIBRTE_STACK)-include := rte_stack.h \
> -					rte_stack_generic.h
> +					rte_stack_generic.h \
> +					rte_stack_c11_mem.h
>
>  include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_stack/meson.build b/lib/librte_stack/meson.build
> index 99d7f9ec5..7e2d1dbb8 100644
> --- a/lib/librte_stack/meson.build
> +++ b/lib/librte_stack/meson.build
> @@ -6,4 +6,5 @@ allow_experimental_apis = true
>  version = 1
>  sources = files('rte_stack.c')
>  headers = files('rte_stack.h',
> -		'rte_stack_generic.h')
> +		'rte_stack_generic.h',
> +		'rte_stack_c11_mem.h')
> diff --git a/lib/librte_stack/rte_stack.h b/lib/librte_stack/rte_stack.h
> index b484313bb..de16f8fff 100644
> --- a/lib/librte_stack/rte_stack.h
> +++ b/lib/librte_stack/rte_stack.h
> @@ -91,7 +91,11 @@ struct rte_stack {
>   */
>  #define RTE_STACK_F_LF 0x0001
>
> +#ifdef RTE_USE_C11_MEM_MODEL
> +#include "rte_stack_c11_mem.h"
> +#else
>  #include "rte_stack_generic.h"
> +#endif
>
>  /**
>   * @internal Push several objects on the lock-free stack (MT-safe).
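
The RTE_USE_C11_MEM_MODEL switch above keeps the implementation choice out
of application code. To make that concrete for other reviewers, below is a
minimal usage sketch; it is illustrative only and assumes the rte_stack
create/push/pop/free API from the earlier patches in this series. The same
code runs on either rte_stack_c11_mem.h or rte_stack_generic.h, depending
on how the library was built.

	#include <rte_stack.h>

	/* Hypothetical smoke test, not part of this patch: create a
	 * lock-free stack, push four pointers, and pop them back.
	 */
	static int
	lf_stack_smoke_test(int socket_id)
	{
		void *objs[4] = { (void *)1, (void *)2, (void *)3, (void *)4 };
		void *popped[4];
		struct rte_stack *s;

		s = rte_stack_create("lf_test", 64, socket_id, RTE_STACK_F_LF);
		if (s == NULL)
			return -1;

		/* Push and pop are MT-safe; a single thread is used here
		 * only to keep the example short.
		 */
		if (rte_stack_push(s, objs, 4) != 4 ||
		    rte_stack_pop(s, popped, 4) != 4) {
			rte_stack_free(s);
			return -1;
		}

		rte_stack_free(s);
		return 0;
	}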
> diff --git a/lib/librte_stack/rte_stack_c11_mem.h b/lib/librte_stack/rte_stack_c11_mem.h
> new file mode 100644
> index 000000000..44f9ece6e
> --- /dev/null
> +++ b/lib/librte_stack/rte_stack_c11_mem.h
> @@ -0,0 +1,175 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2019 Intel Corporation
> + */
> +
> +#ifndef _RTE_STACK_C11_MEM_H_
> +#define _RTE_STACK_C11_MEM_H_
> +
> +#include <rte_branch_prediction.h>
> +#include <rte_prefetch.h>
> +
> +static __rte_always_inline unsigned int
> +rte_stack_lf_len(struct rte_stack *s)
> +{
> +	/* stack_lf_push() and stack_lf_pop() do not update the list's contents
> +	 * and stack_lf->len atomically, which can cause the list to appear
> +	 * shorter than it actually is if this function is called while other
> +	 * threads are modifying the list.
> +	 *
> +	 * However, given the inherently approximate nature of the get_count
> +	 * callback -- even if the list and its size were updated atomically,
> +	 * the size could change between when get_count executes and when the
> +	 * value is returned to the caller -- this is acceptable.
> +	 *
> +	 * The stack_lf->len updates are placed such that the list may appear to
> +	 * have fewer elements than it does, but will never appear to have more
> +	 * elements. If the mempool is near-empty to the point that this is a
> +	 * concern, the user should consider increasing the mempool size.
> +	 */
> +	return (unsigned int)__atomic_load_n(&s->stack_lf.used.len.cnt,
> +					     __ATOMIC_RELAXED);
> +}
> +
> +static __rte_always_inline void
> +__rte_stack_lf_push(struct rte_stack_lf_list *list,
> +		    struct rte_stack_lf_elem *first,
> +		    struct rte_stack_lf_elem *last,
> +		    unsigned int num)
> +{
> +#ifndef RTE_ARCH_X86_64
> +	RTE_SET_USED(first);
> +	RTE_SET_USED(last);
> +	RTE_SET_USED(list);
> +	RTE_SET_USED(num);
> +#else
> +	struct rte_stack_lf_head old_head;
> +	int success;
> +
> +	old_head = list->head;

This can be a torn read (the same issue you mention in __rte_stack_lf_pop).
I suggest we use an acquire thread fence here as well (please see the
comments in __rte_stack_lf_pop).

> +
> +	do {
> +		struct rte_stack_lf_head new_head;
> +

We can add __atomic_thread_fence(__ATOMIC_ACQUIRE) here (please see the
comments in __rte_stack_lf_pop).

> +		/* Swing the top pointer to the first element in the list and
> +		 * make the last element point to the old top.
> +		 */
> +		new_head.top = first;
> +		new_head.cnt = old_head.cnt + 1;
> +
> +		last->next = old_head.top;
> +
> +		/* Use the release memmodel to ensure the writes to the LF LIFO
> +		 * elements are visible before the head pointer write.
> +		 */
> +		success = rte_atomic128_cmp_exchange(
> +				(rte_int128_t *)&list->head,
> +				(rte_int128_t *)&old_head,
> +				(rte_int128_t *)&new_head,
> +				1, __ATOMIC_RELEASE,
> +				__ATOMIC_RELAXED);

The success memory order can be RELAXED, since the store to list->len.cnt
below is RELEASE.

> +	} while (success == 0);
> +
> +	/* Ensure the stack modifications are not reordered with respect
> +	 * to the LIFO len update.
> +	 */
> +	__atomic_add_fetch(&list->len.cnt, num, __ATOMIC_RELEASE);
> +#endif
> +}
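
To make these push-side suggestions concrete, the loop I have in mind looks
roughly like this (a sketch only, reusing the declarations from the
function above):

	old_head = list->head;	/* a torn read is acceptable here */

	do {
		struct rte_stack_lf_head new_head;

		/* Suggested: ensure the old_head read (the initial one, or
		 * the reload performed by a failed CAS) completes before the
		 * element writes below.
		 */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);

		new_head.top = first;
		new_head.cnt = old_head.cnt + 1;

		last->next = old_head.top;

		success = rte_atomic128_cmp_exchange(
				(rte_int128_t *)&list->head,
				(rte_int128_t *)&old_head,
				(rte_int128_t *)&new_head,
				1, __ATOMIC_RELAXED, /* suggested: was RELEASE */
				__ATOMIC_RELAXED);
	} while (success == 0);

	/* The RELEASE order here is what publishes the new elements */
	__atomic_add_fetch(&list->len.cnt, num, __ATOMIC_RELEASE);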
> +
> +static __rte_always_inline struct rte_stack_lf_elem *
> +__rte_stack_lf_pop(struct rte_stack_lf_list *list,
> +		   unsigned int num,
> +		   void **obj_table,
> +		   struct rte_stack_lf_elem **last)
> +{
> +#ifndef RTE_ARCH_X86_64
> +	RTE_SET_USED(obj_table);
> +	RTE_SET_USED(last);
> +	RTE_SET_USED(list);
> +	RTE_SET_USED(num);
> +
> +	return NULL;
> +#else
> +	struct rte_stack_lf_head old_head;
> +	int success;
> +
> +	/* Reserve num elements, if available */
> +	while (1) {
> +		uint64_t len = __atomic_load_n(&list->len.cnt,
> +					       __ATOMIC_ACQUIRE);

This load can be moved outside the loop; a failed compare-exchange below
refreshes 'len' with the current value.

> +
> +		/* Does the list contain enough elements? */
> +		if (unlikely(len < num))
> +			return NULL;
> +
> +		if (__atomic_compare_exchange_n(&list->len.cnt,
> +						&len, len - num,
> +						0, __ATOMIC_RELAXED,
> +						__ATOMIC_RELAXED))
> +			break;
> +	}
> +
> +#ifndef RTE_ARCH_X86_64
> +	/* Use the acquire memmodel to ensure the reads to the LF LIFO elements
> +	 * are properly ordered with respect to the head pointer read.
> +	 *
> +	 * Note that for aarch64, GCC's implementation of __atomic_load_16 in
> +	 * libatomic uses locks, and so this function should be replaced by
> +	 * a new function (e.g. "rte_atomic128_load()").

aarch64 does not have 'pure' atomic 128b load instructions; they have to be
implemented using load/store exclusives.

> +	 */
> +	__atomic_load((volatile __int128 *)&list->head,
> +		      &old_head,
> +		      __ATOMIC_ACQUIRE);

Since we know that x86/aarch64 (POWER?) cannot implement pure atomic 128b
loads, should we just use relaxed reads and accept the possibility of torn
reads on all architectures? Then we can use an acquire fence to prevent the
reordering (see below).

> +#else
> +	/* x86-64 does not require an atomic load here; if a torn read occurs,

IMO, we should not make architecture-specific distinctions here, as this
algorithm is based on the C11 memory model.

> +	 * the CAS will fail and set old_head to the correct/latest value.
> +	 */
> +	old_head = list->head;
> +#endif
> +
> +	/* Pop num elements */
> +	do {
> +		struct rte_stack_lf_head new_head;
> +		struct rte_stack_lf_elem *tmp;
> +		unsigned int i;
> +

We can add __atomic_thread_fence(__ATOMIC_ACQUIRE) here.

> +		rte_prefetch0(old_head.top);
> +
> +		tmp = old_head.top;
> +
> +		/* Traverse the list to find the new head. A next pointer will
> +		 * either point to another element or NULL; if a thread
> +		 * encounters a pointer that has already been popped, the CAS
> +		 * will fail.
> +		 */
> +		for (i = 0; i < num && tmp != NULL; i++) {
> +			rte_prefetch0(tmp->next);
> +			if (obj_table)
> +				obj_table[i] = tmp->data;
> +			if (last)
> +				*last = tmp;
> +			tmp = tmp->next;
> +		}
> +
> +		/* If NULL was encountered, the list was modified while
> +		 * traversing it. Retry.
> +		 */
> +		if (i != num)
> +			continue;
> +
> +		new_head.top = tmp;
> +		new_head.cnt = old_head.cnt + 1;
> +
> +		success = rte_atomic128_cmp_exchange(
> +				(rte_int128_t *)&list->head,
> +				(rte_int128_t *)&old_head,
> +				(rte_int128_t *)&new_head,
> +				1, __ATOMIC_ACQUIRE,
> +				__ATOMIC_ACQUIRE);

The success order should be __ATOMIC_RELEASE, as the write to list->len.cnt
above is RELAXED. The failure order can be __ATOMIC_RELAXED if the thread
fence is added.

> +	} while (success == 0);
> +
> +	return old_head.top;
> +#endif
> +}
> +
> +#endif /* _RTE_STACK_C11_MEM_H_ */
> --
> 2.13.6
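
Putting the pop-side comments together, the reservation and retry loop I
have in mind look roughly like this (a sketch only, reusing the
declarations from __rte_stack_lf_pop above and eliding the list traversal,
which is unchanged):

	/* Suggested: load once, outside the loop; on failure, the CAS
	 * below writes the current value back into 'len'.
	 */
	uint64_t len = __atomic_load_n(&list->len.cnt, __ATOMIC_ACQUIRE);

	/* Reserve num elements, if available */
	while (1) {
		if (unlikely(len < num))
			return NULL;

		if (__atomic_compare_exchange_n(&list->len.cnt,
						&len, len - num,
						0, __ATOMIC_RELAXED,
						__ATOMIC_RELAXED))
			break;
	}

	/* Suggested: a plain (possibly torn) read on all architectures; a
	 * torn value simply makes the CAS below fail and reload old_head.
	 */
	old_head = list->head;

	do {
		struct rte_stack_lf_head new_head;

		/* Suggested: order the old_head read before the traversal
		 * of old_head.top, on the first pass and on every retry.
		 */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);

		/* ... traverse the list and fill obj_table, *last, and
		 * new_head exactly as in the patch ...
		 */

		success = rte_atomic128_cmp_exchange(
				(rte_int128_t *)&list->head,
				(rte_int128_t *)&old_head,
				(rte_int128_t *)&new_head,
				1, __ATOMIC_RELEASE, /* len.cnt update is RELAXED */
				__ATOMIC_RELAXED);   /* the fence provides acquire */
	} while (success == 0);

	return old_head.top;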