From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by dpdk.space (Postfix) with ESMTP id 8014BA0679
	for <public@inbox.dpdk.org>; Mon,  1 Apr 2019 22:21:47 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 526064C99;
	Mon,  1 Apr 2019 22:21:47 +0200 (CEST)
Received: from mga12.intel.com (mga12.intel.com [192.55.52.136])
 by dpdk.org (Postfix) with ESMTP id E90EA4C96
 for <dev@dpdk.org>; Mon,  1 Apr 2019 22:21:44 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga008.fm.intel.com ([10.253.24.58])
 by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 01 Apr 2019 13:21:44 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.60,297,1549958400"; d="scan'208";a="136682339"
Received: from fmsmsx106.amr.corp.intel.com ([10.18.124.204])
 by fmsmga008.fm.intel.com with ESMTP; 01 Apr 2019 13:21:44 -0700
Received: from fmsmsx161.amr.corp.intel.com (10.18.125.9) by
 FMSMSX106.amr.corp.intel.com (10.18.124.204) with Microsoft SMTP Server (TLS)
 id 14.3.408.0; Mon, 1 Apr 2019 13:21:43 -0700
Received: from fmsmsx108.amr.corp.intel.com ([169.254.9.216]) by
 FMSMSX161.amr.corp.intel.com ([169.254.12.31]) with mapi id 14.03.0415.000;
 Mon, 1 Apr 2019 13:21:43 -0700
From: "Eads, Gage" <gage.eads@intel.com>
To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>, "'dev@dpdk.org'"
 <dev@dpdk.org>
CC: "'olivier.matz@6wind.com'" <olivier.matz@6wind.com>,
 "'arybchenko@solarflare.com'" <arybchenko@solarflare.com>, "Richardson,
 Bruce" <bruce.richardson@intel.com>, "Ananyev, Konstantin"
 <konstantin.ananyev@intel.com>, "Gavin Hu (Arm Technology China)"
 <Gavin.Hu@arm.com>, nd <nd@arm.com>, "thomas@monjalon.net"
 <thomas@monjalon.net>, nd <nd@arm.com>
Thread-Topic: [PATCH v3 6/8] stack: add C11 atomic implementation
Thread-Index: AQHU1CtbzfcoFG40VUqutGZl2oW8e6Yhl7KggAFxNMCAA4rc0IABN+yggAAU1yA=
Date: Mon, 1 Apr 2019 20:21:42 +0000
Message-ID:
 <9184057F7FC11744A2107296B6B8EB1E5420E6E0@FMSMSX108.amr.corp.intel.com>
References: <20190305164256.2367-1-gage.eads@intel.com>
 <20190306144559.391-1-gage.eads@intel.com>
 <20190306144559.391-7-gage.eads@intel.com>
 <VE1PR08MB5149F2B5E1A3AC800BC71D8698590@VE1PR08MB5149.eurprd08.prod.outlook.com>
 <9184057F7FC11744A2107296B6B8EB1E5420D940@FMSMSX108.amr.corp.intel.com>
 <9184057F7FC11744A2107296B6B8EB1E5420DDF2@FMSMSX108.amr.corp.intel.com>
 <VE1PR08MB51499D8D3DB5C866E29866C998550@VE1PR08MB5149.eurprd08.prod.outlook.com>
In-Reply-To: <VE1PR08MB51499D8D3DB5C866E29866C998550@VE1PR08MB5149.eurprd08.prod.outlook.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiZWI4Njc2MTctMzQ3ZC00NzRmLTgxNmEtNjU4YjI3NDkzYjBjIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE3LjEwLjE4MDQuNDkiLCJUcnVzdGVkTGFiZWxIYXNoIjoidHFCUnRyWHAwQTk2TDBqQkQxTWk4cUw0a213M0Z4K2RGXC9iaFRWQzJ1Yjd3N3o0OERTQ3NuMENwSGJ6OXp4WXcifQ==
x-ctpclassification: CTP_NT
dlp-product: dlpe-windows
dlp-version: 11.0.400.15
dlp-reaction: no-action
x-originating-ip: [10.1.200.106]
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH v3 6/8] stack: add C11 atomic implementation
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>
Message-ID: <20190401202142.aTGqckiKkbFEaZJFQgI4uyx7wxJP_3vx7ru5XMAy_2U@z>



> -----Original Message-----
> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Monday, April 1, 2019 2:07 PM
> To: Eads, Gage <gage.eads@intel.com>; 'dev@dpdk.org' <dev@dpdk.org>
> Cc: 'olivier.matz@6wind.com' <olivier.matz@6wind.com>;
> 'arybchenko@solarflare.com' <arybchenko@solarflare.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Gavin Hu (Arm Technology China)
> <Gavin.Hu@arm.com>; nd <nd@arm.com>; thomas@monjalon.net; nd
> <nd@arm.com>
> Subject: RE: [PATCH v3 6/8] stack: add C11 atomic implementation
>
> > > Subject: RE: [PATCH v3 6/8] stack: add C11 atomic implementation
> > >
> > > [snip]
> > >
> > > > > > +static __rte_always_inline void
> > > > > > +__rte_stack_lf_push(struct rte_stack_lf_list *list,
> > > > > +		    struct rte_stack_lf_elem *first,
> > > > > +		    struct rte_stack_lf_elem *last,
> > > > > +		    unsigned int num)
> > > > > +{
> > > > > +#ifndef RTE_ARCH_X86_64
> > > > > +	RTE_SET_USED(first);
> > > > > +	RTE_SET_USED(last);
> > > > > +	RTE_SET_USED(list);
> > > > > +	RTE_SET_USED(num);
> > > > > +#else
> > > > > +	struct rte_stack_lf_head old_head;
> > > > > +	int success;
> > > > > +
> > > > > +	old_head =3D list->head;
> > > > This can be a torn read (same as you have mentioned in
> > > > __rte_stack_lf_pop). I suggest we use acquire thread fence here as
> > > > well (please see the comments in __rte_stack_lf_pop).
> > >
> > > Agreed. I'll add the acquire fence.
> > >
> >
> > On second thought, an acquire fence isn't necessary. The acquire fence
> > in __rte_stack_lf_pop() ensures the list->head is ordered before the list
> > element reads. That isn't necessary here; we need to ensure that the
> > last->next write occurs (and is observed) before the list->head write,
> > which the CAS's RELEASE success memorder accomplishes.
> >
> > If a torn read occurs, the CAS will fail and will atomically re-load
> > &old_head.
>
> Following is my understanding:
> The general guideline is there should be a load-acquire for every store-
> release. In both xxx_lf_pop and xxx_lf_push, the head is store-released,
> hence the load of the head should be load-acquire.
> From the code (for ex: in function _xxx_lf_push), you can notice that
> there is
> dependency from 'old_head to new_head to list->head(in
> compare_exchange)'. When such a dependency exists, if the memory
> orderings have to be avoided, one needs to use __ATOMIC_CONSUME.
> Currently, the compilers will use a stronger memory order (which is
> __ATOMIC_ACQUIRE) as __ATOMIC_CONSUME is not well defined. Please
> refer to [1] and [2] for more info.
>
> IMO, since, for 128b, we do not have a pure load-acquire, I suggest we
> use thread_fence with acquire semantics. It is a heavier barrier, but I
> think it is a safer code which will adhere to C11 memory model.
>
> [1] https://preshing.com/20140709/the-purpose-of-memory_order_consume-in-cpp11/
> [2] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0750r1.html

Thanks for those two links, they're good resources.

I agree with your understanding. I admit I'm not fully convinced that a
synchronizes-with relationship is needed between pop's list->head store
and push's list->head load (or between push's list->head store and its
own list->head load), but it's better to err on the side of caution to
ensure it's functionally correct... at least until I can manage to
convince you :).

I'll send out a V6 with the acquire thread fence.
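
For reference, here is a minimal sketch of the push pattern discussed
above. It is not the patch itself, and it makes some simplifying
assumptions: the names lf_elem, lf_list and lf_push are hypothetical, and
the head is a single pointer rather than the 128-bit {top, cnt} pair, so
it builds without 128-bit CAS support (and therefore has no ABA counter).
It only illustrates where the acquire fence sits relative to the head
load and the release CAS, using the GCC/Clang __atomic builtins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types: the real head is a 128-bit
 * {top, cnt} pair; a bare pointer is used here for illustration only.
 */
struct lf_elem {
	struct lf_elem *next;
	int data;
};

struct lf_list {
	struct lf_elem *head;
};

/* Push the chain first..last onto the list. */
static void
lf_push(struct lf_list *list, struct lf_elem *first, struct lf_elem *last)
{
	struct lf_elem *old_head;

	assert(first != NULL && last != NULL);

	/* Relaxed load of the head. In the 128-bit case this is a plain
	 * read that may be torn; a torn value makes the CAS below fail.
	 */
	old_head = __atomic_load_n(&list->head, __ATOMIC_RELAXED);

	do {
		/* Acquire fence after the head load (or after the reload
		 * done by a failed CAS), pairing with the store-release
		 * of the head performed by a successful CAS elsewhere.
		 */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);

		/* Make the last new element point to the old top. */
		last->next = old_head;

		/* RELEASE on success orders the last->next write before
		 * the head update. On failure, old_head is reloaded with
		 * the current head value and the loop retries.
		 */
	} while (!__atomic_compare_exchange_n(&list->head, &old_head, first,
			0, __ATOMIC_RELEASE, __ATOMIC_RELAXED));
}
```

Placing the fence inside the loop means it also covers the value
reloaded into old_head when the CAS fails, at the cost of repeating the
barrier on each retry.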

Thanks,
Gage