From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mo6.mail-out.ovh.net (16.mo6.mail-out.ovh.net [87.98.139.208]) by dpdk.org (Postfix) with ESMTP id 69FFF591D for ; Fri, 10 Jan 2014 19:03:34 +0100 (CET) Received: from mail435.ha.ovh.net (b6.ovh.net [213.186.33.56]) by mo6.mail-out.ovh.net (Postfix) with SMTP id C70F7FFAD61 for ; Fri, 10 Jan 2014 19:08:37 +0100 (CET) Received: from b0.ovh.net (HELO queueout) (213.186.33.50) by b0.ovh.net with SMTP; 10 Jan 2014 20:07:02 +0200 Received: from lneuilly-152-23-9-75.w193-252.abo.wanadoo.fr (HELO pcdeff) (ff@ozog.com@193.252.40.75) by ns0.ovh.net with SMTP; 10 Jan 2014 20:07:00 +0200 From: =?iso-8859-1?Q?Fran=E7ois-Fr=E9d=E9ric_Ozog?= To: "'Thomas Monjalon'" References: <1387582656-1892-1-git-send-email-thomas.monjalon@6wind.com> In-Reply-To: <1387582656-1892-1-git-send-email-thomas.monjalon@6wind.com> Date: Fri, 10 Jan 2014 19:02:18 +0100 Message-ID: <019301cf0e2e$1b0af0a0$5120d1e0$@com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Mailer: Microsoft Office Outlook 12.0 Thread-Index: Ac793IBzoP6rf279SDSeOys7Rmww/gQSpwUA Content-Language: fr X-Ovh-Tracer-Id: 3946279173837150425 X-Ovh-Remote: 193.252.40.75 (lneuilly-152-23-9-75.w193-252.abo.wanadoo.fr) X-Ovh-Local: 213.186.33.20 (ns0.ovh.net) X-OVH-SPAMSTATE: OK X-OVH-SPAMSCORE: -100 X-OVH-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrfeejtddrfedvucetufdoteggodetrfcurfhrohhfihhlvgemucfqggfjnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd X-Spam-Check: DONE|U 0.5/N X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrfeejtddrfedvucetufdoteggodetrfcurfhrohhfihhlvgemucfqggfjnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd Cc: dev@dpdk.org Subject: Re: [dpdk-dev] [PATCH] spinlock: fix atomic and out of order execution X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Jan 2014 18:03:34 -0000 Hi Thomas, I am afraid I introduced unnecessary complexity in the discussion as the spinlock issues I mentioned are connected to a work in progress on my = side (implement a Chelsio cxgb5 PMD) but *not* to the general DPDK.=20 I'll explain some aspects of the context and how critical sections has = to be handled in the case of the Chelsio cxgb5 PMD. 1) WC memory Most memory is cached, some is not cached at all, and some is tagged = (via paging attributes) to be "Write Combining". This means that writes to WC memory locations do *NOT* go through the cache but rather to a memory = "write buffer" internal to the processor. The processor delays write to memory = as it please, trying to minimize the actual number of writes to DRAM. This = type of memory is used by some devices as a fast interface, for instance = Chelsio cxgb5 write queue can filled using this method (http://lwn.net/Articles/542643/). Public DPDK cannot declare such = memory type as it requires kernel support. 2) fencing and out of order execution Out of order execution is related to the processor that may actually = change the order of the instructions of a program as it has been produced by = the compiler. In general, this relates only to performance but for critical sections, it is absolutely necessary to ensure that the processor does = NOT postpone critical section instructions AFTER the unlock. So there is a need of a "fence" to tell the processor that some = instruction blocks out of order execution: no previous instruction can be postponed after the "fence". 3) Public DPDK critical sections The DPDK implements lock/unlock with xchg instruction which is atomic = and serializing for "normal" cached memory. It includes an implicit lock = prefix, so no need to add such prefix (your patch proposal). This (implicit) = lock prefix is actually acting as an out of order execution fence, thus = ensuring that critical section is working as expected. Nothing has to be added or changed for public DPDK to have correct critical sections. 4) "Extended" DPDK critical sections Because of Chelsio cxgb5 PMD driver I am developing, I added kernel = support to DPDK to declare some memory regions as Write Combining, hence to be = able to leverage the output queue fast path. Because of this, I also had to add proper fencing for critical sections. The relation is the fact that writes to WC memory does NOT go through = the cache AND that the lock prefix is implemented as a cache protocol transaction. In other words, a lock prefix is NOT an out of order instruction fence for instructions that deal with WC memory. So, the processor may decide to postpone the WC related instruction ATFER the unlock. To create a proper out of order execution fence, the processor has a set = of explicit memory fences that do the job. So before a rte_unlock, I have to add a rte_mb(). So: - for people willing to develop applications on top of DPDK, my comments = can be disregarded, they are not relevant, there is no need to understand = them.=20 - for people willing to develop PMD drivers, the fencing comments should = be clear to use the proper fencing. I am sorry for any confusion I may have introduced in the community. Fran=E7ois-Fr=E9d=E9ric > -----Message d'origine----- > De=A0: dev [mailto:dev-bounces@dpdk.org] De la part de Thomas Monjalon > Envoy=E9=A0: samedi 21 d=E9cembre 2013 00:38 > =C0=A0: dev@dpdk.org > Objet=A0: [dpdk-dev] [PATCH] spinlock: fix atomic and out of order = execution >=20 > From: Damien Millescamps >=20 > Add lock prefix before xchg instructions in order to be atomic and = flush > speculative values to ensure effective execution order (as an acquire > barrier). >=20 > MPLOCKED is a "lock" in multicore case. >=20 > Signed-off-by: Damien Millescamps > Signed-off-by: Thomas Monjalon > --- > lib/librte_eal/common/include/rte_spinlock.h | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) >=20 > diff --git a/lib/librte_eal/common/include/rte_spinlock.h > b/lib/librte_eal/common/include/rte_spinlock.h > index f7a245a..8edb971 100644 > --- a/lib/librte_eal/common/include/rte_spinlock.h > +++ b/lib/librte_eal/common/include/rte_spinlock.h > @@ -51,6 +51,7 @@ > extern "C" { > #endif >=20 > +#include > #include > #ifdef RTE_FORCE_INTRINSICS > #include > @@ -93,7 +94,7 @@ rte_spinlock_lock(rte_spinlock_t *sl) > int lock_val =3D 1; > asm volatile ( > "1:\n" > - "xchg %[locked], %[lv]\n" > + MPLOCKED "xchg %[locked], %[lv]\n" > "test %[lv], %[lv]\n" > "jz 3f\n" > "2:\n" > @@ -124,7 +125,7 @@ rte_spinlock_unlock (rte_spinlock_t *sl) #ifndef > RTE_FORCE_INTRINSICS > int unlock_val =3D 0; > asm volatile ( > - "xchg %[locked], %[ulv]\n" > + MPLOCKED "xchg %[locked], %[ulv]\n" > : [locked] "=3Dm" (sl->locked), [ulv] "=3Dq" (unlock_val) > : "[ulv]" (unlock_val) > : "memory"); > @@ -148,7 +149,7 @@ rte_spinlock_trylock (rte_spinlock_t *sl) > int lockval =3D 1; >=20 > asm volatile ( > - "xchg %[locked], %[lockval]" > + MPLOCKED "xchg %[locked], %[lockval]" > : [locked] "=3Dm" (sl->locked), [lockval] "=3Dq" (lockval) > : "[lockval]" (lockval) > : "memory"); > -- > 1.7.10.4