From: François-Frédéric Ozog
To: "'Thomas Monjalon'", "'Pashupati Kumar'"
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] Bit spinlocks in DPDK
Date: Sat, 7 Dec 2013 18:54:57 +0100
Message-ID: <01d001cef375$7167e300$5437a900$@com>
In-Reply-To: <4656219.tgqzelRNOJ@x220>
References: <6895EAE0CA8DEE40B92D7CA88BB521F332BA572E6B@HQ1-EXCH02.corp.brocade.com>
 <6733914.61HpXdIraN@x220>
 <6895EAE0CA8DEE40B92D7CA88BB521F332BA572E7A@HQ1-EXCH02.corp.brocade.com>
 <4656219.tgqzelRNOJ@x220>
List-Id: patches and discussions about DPDK

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Friday, 6 December 2013 23:24
> To: Pashupati Kumar
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] Bit spinlocks in DPDK
>
> 06/12/2013 14:12, Pashupati Kumar :
> > From: Thomas Monjalon
> > > 06/12/2013 13:04, Pashupati Kumar :
> > > > We use bit spinlocks extensively to have compact data structures.
> > > > Are there any plans for adding them to DPDK in some future release?
> > >
> > > Not sure to understand your request.
> > > Are you looking for that?
> > > http://dpdk.org/doc/api/rte__spinlock_8h.html
> >
> > I am looking for spinlocks that use a single bit (bit 31) of a 32 bit
> > word for locking. The rest of the bits in the word are left
> > undisturbed. This enables more compact data structures as only 1 bit
> > is consumed for the lock.
>
> Oh yes, like test_and_set_bit_lock() in Linux:
> http://lxr.free-electrons.com/source/arch/ia64/include/asm/bitops.h?v=3.12#L205
>
> I think that a patch would be appreciated :)
>
> PS: please try to answer below the question. It's far easier to read.
> --
> Thomas

Hi,

I assume you mean the x64 version, not the ia64 one:
http://lxr.free-electrons.com/source/arch/x86/include/asm/bitops.h.

I agree with Pash: compacting data structures is of the highest
importance for high performance networking (HPN).

Last week I gave a training course on some aspects of multiprocessing,
locking in particular. That put me in the mood to check the DPDK
implementation, and I have a few remarks:

1) If the critical section deals with weakly ordered loads, then
explicit fencing MUST be used: if not, out-of-order execution will just
kill your idea of a critical section.
Weakly ordered loads occur in a number of situations involving Write
Combining memory (say, memory-mapped IO). See the Intel programming
guide, volume 3, 8.1.2.2 Software Controlled Bus Locking and 12.10.3
Streaming Load Hint Instruction.
So use rte_mb(), rte_wmb() or rte_rmb() where appropriate. I recommend
that the rte_spinlock_unlock code and documentation explain the
out-of-order execution issues and the conditions under which they have
to be mitigated with rte_*mb(). I wonder whether an explicit mfence in
rte_spinlock_unlock isn't simply necessary to avoid "hairy" bugs. In
addition, we would have rte_spinlock_unlock_no_mb, used internally for
performance reasons and usable externally by advanced users.

2) Code that uses rte spinlocks is subject to starvation. A simple
program with one producer and one consumer demonstrates the issue: I saw
either the producer not giving the consumer a single chance after 1M
loops, or the consumer grabbing the lock right after the first
production and looping forever, testing whether the producer had
produced something (the unlock/lock cycle is too fast)! Introducing
"random" rte_pause() calls (actually lcore_id pauses) solves the problem
in an ugly, unpredictable manner. It may not create any visible problem
under light load, but may end up being a real issue with several million
packets per second. There are a number of scenarios where we don't care
if the lock algorithm creates starvation, but sometimes we do. So I will
provide the community with a ticketlock implementation that is
starvation free.

What is the policy for sharing a sample program: inline in a mail or as
an attachment?
What is the policy for submitting a patch?

François-Frédéric
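
For readers following the thread, the bit spinlock requested above (bit
31 of a 32 bit word as the lock, the other 31 bits left undisturbed) can
be sketched roughly as below. This is only an illustration written with
C11 atomics: the names bit_spin_trylock, bit_spin_lock, bit_spin_unlock
and BIT_LOCK_MASK are hypothetical and are not DPDK or Linux APIs. The
acquire/release ordering used here covers ordinary write-back memory
only; it does not address the Write Combining case raised in remark 1.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; none of these are DPDK APIs. */
#define BIT_LOCK_MASK (UINT32_C(1) << 31)   /* bit 31 is the lock bit */

/* Try once to take the lock; returns true on success. */
static inline bool bit_spin_trylock(_Atomic uint32_t *word)
{
    /* fetch_or sets bit 31 and returns the previous value. The lock was
     * free only if bit 31 was clear before. Other bits are untouched. */
    uint32_t old = atomic_fetch_or_explicit(word, BIT_LOCK_MASK,
                                            memory_order_acquire);
    return (old & BIT_LOCK_MASK) == 0;
}

static inline void bit_spin_lock(_Atomic uint32_t *word)
{
    while (!bit_spin_trylock(word)) {
        /* Spin on a plain load until the bit looks clear, so waiters do
         * not hammer the cache line with locked read-modify-writes. */
        while (atomic_load_explicit(word, memory_order_relaxed)
               & BIT_LOCK_MASK)
            ;
    }
}

static inline void bit_spin_unlock(_Atomic uint32_t *word)
{
    /* Clear only bit 31; release ordering publishes the critical
     * section's ordinary stores before the lock appears free. */
    atomic_fetch_and_explicit(word, ~BIT_LOCK_MASK, memory_order_release);
}
```

Because the lock bit shares its word with payload bits, any update of
the payload has to be an atomic read-modify-write that preserves bit 31;
a plain store of the whole word could silently drop a waiter's attempt
or release the lock.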
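
The starvation-free ticketlock mentioned in remark 2 is not included in
this mail. As a rough idea of the technique, a minimal ticket lock could
look like the sketch below. It is written with C11 atomics under the
assumption of ordinary write-back memory; ticket_lock_t,
ticket_lock_init, ticket_lock and ticket_unlock are hypothetical names,
not the implementation the author later proposed and not an existing
DPDK API.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical type and function names; not an existing DPDK API. */
typedef struct {
    _Atomic uint32_t next;     /* next ticket to hand out */
    _Atomic uint32_t serving;  /* ticket currently allowed in */
} ticket_lock_t;

static inline void ticket_lock_init(ticket_lock_t *tl)
{
    atomic_init(&tl->next, 0);
    atomic_init(&tl->serving, 0);
}

static inline void ticket_lock(ticket_lock_t *tl)
{
    /* Take a ticket; this atomic increment fixes the arrival order. */
    uint32_t me = atomic_fetch_add_explicit(&tl->next, 1,
                                            memory_order_relaxed);
    /* Wait for our turn; acquire pairs with the release in unlock. */
    while (atomic_load_explicit(&tl->serving, memory_order_acquire) != me)
        ;
}

static inline void ticket_unlock(ticket_lock_t *tl)
{
    /* Only the lock holder writes 'serving', so load + store suffices. */
    uint32_t cur = atomic_load_explicit(&tl->serving,
                                        memory_order_relaxed);
    atomic_store_explicit(&tl->serving, cur + 1, memory_order_release);
}
```

Waiters are served strictly in ticket order, so a thread that releases
the lock and immediately asks for it again queues behind anyone already
waiting, which is what removes the producer/consumer starvation
described above. The cost is two words per lock instead of one bit, and
a DPDK version would probably call rte_pause() inside the wait loop.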