DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] Bit spinlocks in DPDK
@ 2013-12-06 21:04 Pashupati Kumar
  2013-12-06 22:02 ` Thomas Monjalon
  0 siblings, 1 reply; 10+ messages in thread
From: Pashupati Kumar @ 2013-12-06 21:04 UTC (permalink / raw)
  To: dev

Hi,
We use bit spinlocks extensively to have compact data structures.  Are there any plans for adding them to DPDK in some future release?


Thanks
Pash

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] Bit spinlocks in DPDK
  2013-12-06 21:04 [dpdk-dev] Bit spinlocks in DPDK Pashupati Kumar
@ 2013-12-06 22:02 ` Thomas Monjalon
  2013-12-06 22:12   ` Pashupati Kumar
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Monjalon @ 2013-12-06 22:02 UTC (permalink / raw)
  To: Pashupati Kumar; +Cc: dev

Hello,

06/12/2013 13:04, Pashupati Kumar :
> We use bit spinlocks extensively to have compact data structures.  Are there
> any plans for adding them to DPDK in some future release?

Not sure to understand your request.
Are you looking for that?
	http://dpdk.org/doc/api/rte__spinlock_8h.html

-- 
Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] Bit spinlocks in DPDK
  2013-12-06 22:02 ` Thomas Monjalon
@ 2013-12-06 22:12   ` Pashupati Kumar
  2013-12-06 22:24     ` Thomas Monjalon
  0 siblings, 1 reply; 10+ messages in thread
From: Pashupati Kumar @ 2013-12-06 22:12 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

I am looking for spinlocks that use a single bit (bit 31) of a 32 bit word for locking. The rest of the bits in the word are left undisturbed.  This enables more compact data structures as only 1 bit is consumed for the lock.

Thanks
Pash

-----Original Message-----
From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] 
Sent: Friday, December 06, 2013 2:02 PM
To: Pashupati Kumar
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] Bit spinlocks in DPDK

Hello,

06/12/2013 13:04, Pashupati Kumar :
> We use bit spinlocks extensively to have compact data structures.  Are 
> there any plans for adding them to DPDK in some future release?

Not sure to understand your request.
Are you looking for that?
	http://dpdk.org/doc/api/rte__spinlock_8h.html

--
Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] Bit spinlocks in DPDK
  2013-12-06 22:12   ` Pashupati Kumar
@ 2013-12-06 22:24     ` Thomas Monjalon
  2013-12-06 22:54       ` Pashupati Kumar
  2013-12-07 17:54       ` François-Frédéric Ozog
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Monjalon @ 2013-12-06 22:24 UTC (permalink / raw)
  To: Pashupati Kumar; +Cc: dev

06/12/2013 14:12, Pashupati Kumar :
> From: Thomas Monjalon
> > 06/12/2013 13:04, Pashupati Kumar :
> > > We use bit spinlocks extensively to have compact data structures.  Are
> > > there any plans for adding them to DPDK in some future release?
> > 
> > Not sure to understand your request.
> > Are you looking for that?
> > 	http://dpdk.org/doc/api/rte__spinlock_8h.html
>
> I am looking for spinlocks that use a single bit (bit 31) of a 32 bit word
> for locking. The rest of the bits in the word are left undisturbed.  This
> enables more compact data structures as only 1 bit is consumed for the
> lock.

Oh yes, like test_and_set_bit_lock() in Linux:
	http://lxr.free-electrons.com/source/arch/ia64/include/asm/bitops.h?v=3.12#L205

I think that a patch would be appreciated :)

PS: please try to answer below the question. It's far easier to read.
-- 
Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] Bit spinlocks in DPDK
  2013-12-06 22:24     ` Thomas Monjalon
@ 2013-12-06 22:54       ` Pashupati Kumar
  2013-12-07 17:54       ` François-Frédéric Ozog
  1 sibling, 0 replies; 10+ messages in thread
From: Pashupati Kumar @ 2013-12-06 22:54 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

> 06/12/2013 14:12, Pashupati Kumar :
> > From: Thomas Monjalon
> > > 06/12/2013 13:04, Pashupati Kumar :
> > > > We use bit spinlocks extensively to have compact data structures.
> > > > Are there any plans for adding them to DPDK in some future release?
> > >
> > > Not sure to understand your request.
> > > Are you looking for that?
> > > 	http://dpdk.org/doc/api/rte__spinlock_8h.html
> >
> > I am looking for spinlocks that use a single bit (bit 31) of a 32 bit
> > word for locking. The rest of the bits in the word are left
> > undisturbed.  This enables more compact data structures as only 1 bit
> > is consumed for the lock.
> 
> Oh yes, like test_and_set_bit_lock() in Linux:
> 	http://lxr.free-
> electrons.com/source/arch/ia64/include/asm/bitops.h?v=3.12#L205
> 
> I think that a patch would be appreciated :)
> 
> PS: please try to answer below the question. It's far easier to read.
> --
> Thomas

Yes. Thank you.

Pash

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] Bit spinlocks in DPDK
  2013-12-06 22:24     ` Thomas Monjalon
  2013-12-06 22:54       ` Pashupati Kumar
@ 2013-12-07 17:54       ` François-Frédéric Ozog
  2013-12-19 16:41         ` Thomas Monjalon
  2013-12-20 15:39         ` Thomas Monjalon
  1 sibling, 2 replies; 10+ messages in thread
From: François-Frédéric Ozog @ 2013-12-07 17:54 UTC (permalink / raw)
  To: 'Thomas Monjalon', 'Pashupati Kumar'; +Cc: dev


> -----Message d'origine-----
> De : dev [mailto:dev-bounces@dpdk.org] De la part de Thomas Monjalon
> Envoyé : vendredi 6 décembre 2013 23:24
> À : Pashupati Kumar
> Cc : dev@dpdk.org
> Objet : Re: [dpdk-dev] Bit spinlocks in DPDK
> 
> 06/12/2013 14:12, Pashupati Kumar :
> > From: Thomas Monjalon
> > > 06/12/2013 13:04, Pashupati Kumar :
> > > > We use bit spinlocks extensively to have compact data structures.
> > > > Are there any plans for adding them to DPDK in some future release?
> > >
> > > Not sure to understand your request.
> > > Are you looking for that?
> > > 	http://dpdk.org/doc/api/rte__spinlock_8h.html
> >
> > I am looking for spinlocks that use a single bit (bit 31) of a 32 bit
> > word for locking. The rest of the bits in the word are left
> > undisturbed.  This enables more compact data structures as only 1 bit
> > is consumed for the lock.
> 
> Oh yes, like test_and_set_bit_lock() in Linux:
> 	http://lxr.free-
> electrons.com/source/arch/ia64/include/asm/bitops.h?v=3.12#L205
> 
> I think that a patch would be appreciated :)
> 
> PS: please try to answer below the question. It's far easier to read.
> --
> Thomas

Hi,

I assume you mean the x64 version, not the ia64:
http://lxr.free-electrons.com/source/arch/x86/include/asm/bitops.h.

I agree with Pash, the goal of compacting data structures is of the highest
importance for high performance networking (HPN).

Last week I gave a training on some aspects of multiprocessing, and in
particular locking. So I got the mood to check DPDK implementation and I
have a few remarks:

1) If the critical section deals with weakly ordered loads then explicit
fencing MUST be used: if not, out of order execution will just kill your
idea of critical section. Weakly order loads occur on a number of situations
involving Write Combining memory (say memory mapped IO). see Intel
programming guide 3, 8.1.2.2 Software Controlled Bus Locking and 12.10.3
Streaming Load Hint Instruction.

So use rte_mb() or rte_wmb() or rte_rmb() where appropriate. I recommend the
rte_unlock code and documentation explains the out of order execution issues
and the conditions they have to be mitigated with rte*mb(). I wonder if
having an explicit mfence in rte_sinlock_unlock wouldn't be just necessary
to avoid "hairy" bugs. In addition, we would have rte_sinlock_unlock_no_mb
used internally for performance reasons, and usable externally by advanced
users.

2) code that use rte spinlocks are subject to starvation. A simple code with
one producer and one consumer demonstrate the issue: I saw either the
producer not giving a single chance to a consumer after 1M loops or the
consumer grabs the lock right after the first production and loops for ever
testing if the producer had produced something (the cycle of unlock/lock is
too fast)! Introducing "random" rte_pause() (actually lcore_id number of
pauses) solves the problem in a ugly unpredictable manner. It may not create
any visible problem on light loads, but may end up being a real issue with
several million packets per second. There are a number of scenarios were we
don't care if lock algorithm create starvation, but sometimes we do. So I
will provide the community with a ticketlock implementation that is
starvation free. 


What is the policy to share a sample program: inline in a mail or as a
attachment?
What is the policy to submit a patch?


François-Frédéric

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] Bit spinlocks in DPDK
  2013-12-07 17:54       ` François-Frédéric Ozog
@ 2013-12-19 16:41         ` Thomas Monjalon
  2013-12-20 15:39         ` Thomas Monjalon
  1 sibling, 0 replies; 10+ messages in thread
From: Thomas Monjalon @ 2013-12-19 16:41 UTC (permalink / raw)
  To: François-Frédéric Ozog; +Cc: dev

07/12/2013 18:54, François-Frédéric Ozog :
> > De Thomas Monjalon
> > 06/12/2013 14:12, Pashupati Kumar :
> > > I am looking for spinlocks that use a single bit (bit 31) of a 32 bit
> > > word for locking. The rest of the bits in the word are left
> > > undisturbed.  This enables more compact data structures as only 1 bit
> > > is consumed for the lock.
> > 
> > Oh yes, like test_and_set_bit_lock() in Linux:
> > 	http://lxr.free-electrons.com/source/arch/ia64/include/asm/bitops.h?v=3.12#L205
> > 
> > I think that a patch would be appreciated :)
> 
> I assume you mean the x64 version, not the ia64:
> http://lxr.free-electrons.com/source/arch/x86/include/asm/bitops.h.

Yes you're right.

> What is the policy to share a sample program: inline in a mail or as a
> attachment?

Inline should be fine. It allows to comment.

> What is the policy to submit a patch?

It's now explained here:
http://dpdk.org/dev#send

-- 
Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] Bit spinlocks in DPDK
  2013-12-07 17:54       ` François-Frédéric Ozog
  2013-12-19 16:41         ` Thomas Monjalon
@ 2013-12-20 15:39         ` Thomas Monjalon
  2013-12-20 16:00           ` François-Frédéric Ozog
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Monjalon @ 2013-12-20 15:39 UTC (permalink / raw)
  To: François-Frédéric Ozog; +Cc: dev

Hello,

07/12/2013 18:54, François-Frédéric Ozog :
> 1) If the critical section deals with weakly ordered loads then explicit
> fencing MUST be used: if not, out of order execution will just kill your
> idea of critical section.
[...]
> So use rte_mb() or rte_wmb() or rte_rmb() where appropriate. I recommend
> the rte_unlock code and documentation explains the out of order execution
> issues and the conditions they have to be mitigated with rte*mb(). I
> wonder if having an explicit mfence in rte_sinlock_unlock wouldn't be just
> necessary to avoid "hairy" bugs. In addition, we would have
> rte_sinlock_unlock_no_mb used internally for performance reasons, and
> usable externally by advanced users.

Using lock prefix is lighter than using memory barrier and have the same 
effects.
But you're right about the bug in spinlocks.
I am going to send a patch for this.

-- 
Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] Bit spinlocks in DPDK
  2013-12-20 15:39         ` Thomas Monjalon
@ 2013-12-20 16:00           ` François-Frédéric Ozog
  2013-12-20 16:36             ` Stephen Hemminger
  0 siblings, 1 reply; 10+ messages in thread
From: François-Frédéric Ozog @ 2013-12-20 16:00 UTC (permalink / raw)
  To: 'Thomas Monjalon'; +Cc: dev



> -----Message d'origine-----
> De : Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Envoyé : vendredi 20 décembre 2013 16:39
> À : François-Frédéric Ozog
> Cc : dev@dpdk.org
> Objet : Re: [dpdk-dev] Bit spinlocks in DPDK
> 
> Hello,
> 
> 07/12/2013 18:54, François-Frédéric Ozog :
> > 1) If the critical section deals with weakly ordered loads then
> > explicit fencing MUST be used: if not, out of order execution will
> > just kill your idea of critical section.
> [...]
> > So use rte_mb() or rte_wmb() or rte_rmb() where appropriate. I
> > recommend the rte_unlock code and documentation explains the out of
> > order execution issues and the conditions they have to be mitigated
> > with rte*mb(). I wonder if having an explicit mfence in
> > rte_sinlock_unlock wouldn't be just necessary to avoid "hairy" bugs.
> > In addition, we would have rte_sinlock_unlock_no_mb used internally
> > for performance reasons, and usable externally by advanced users.
> 
> Using lock prefix is lighter than using memory barrier and have the same
> effects.

Well, in general yes BUT Intel states "../.. locked operations serialize all
outstanding load and store operations ../.. with one exception. Load
operations that reference weakly ordered memory types (such as the WC memory
type) may *not* be serialized" in 8.1.2.2 Software Controlled Bus Locking;
particularly if streaming loads are used (may happen on certain devices
memory mapped I/O accesses and the compiler generating streaming loads).

So this comment is essentially for the PMD writers: use the fencing where
appropriate, even if the lock prefix is there. As I will be the one
forgetting the rule, I like to have that in the documentation/comments as
reminders to keep things neat.

François-Frédéric

> But you're right about the bug in spinlocks.
> I am going to send a patch for this.
> 
> --
> Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] Bit spinlocks in DPDK
  2013-12-20 16:00           ` François-Frédéric Ozog
@ 2013-12-20 16:36             ` Stephen Hemminger
  0 siblings, 0 replies; 10+ messages in thread
From: Stephen Hemminger @ 2013-12-20 16:36 UTC (permalink / raw)
  To: François-Frédéric Ozog; +Cc: dev

On Fri, 20 Dec 2013 17:00:43 +0100
François-Frédéric Ozog <ff@ozog.com> wrote:

> 
> 
> > -----Message d'origine-----
> > De : Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > Envoyé : vendredi 20 décembre 2013 16:39
> > À : François-Frédéric Ozog
> > Cc : dev@dpdk.org
> > Objet : Re: [dpdk-dev] Bit spinlocks in DPDK
> > 
> > Hello,
> > 
> > 07/12/2013 18:54, François-Frédéric Ozog :
> > > 1) If the critical section deals with weakly ordered loads then
> > > explicit fencing MUST be used: if not, out of order execution will
> > > just kill your idea of critical section.
> > [...]
> > > So use rte_mb() or rte_wmb() or rte_rmb() where appropriate. I
> > > recommend the rte_unlock code and documentation explains the out of
> > > order execution issues and the conditions they have to be mitigated
> > > with rte*mb(). I wonder if having an explicit mfence in
> > > rte_sinlock_unlock wouldn't be just necessary to avoid "hairy" bugs.
> > > In addition, we would have rte_sinlock_unlock_no_mb used internally
> > > for performance reasons, and usable externally by advanced users.
> > 
> > Using lock prefix is lighter than using memory barrier and have the same
> > effects.
> 
> Well, in general yes BUT Intel states "../.. locked operations serialize all
> outstanding load and store operations ../.. with one exception. Load
> operations that reference weakly ordered memory types (such as the WC memory
> type) may *not* be serialized" in 8.1.2.2 Software Controlled Bus Locking;
> particularly if streaming loads are used (may happen on certain devices
> memory mapped I/O accesses and the compiler generating streaming loads).
> 
> So this comment is essentially for the PMD writers: use the fencing where
> appropriate, even if the lock prefix is there. As I will be the one
> forgetting the rule, I like to have that in the documentation/comments as
> reminders to keep things neat.
> 
> François-Frédéric

I recommend anyone who needs more information read:
  Documentation/memory-barriers.txt
in the Linux kernel source.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-12-20 16:35 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-06 21:04 [dpdk-dev] Bit spinlocks in DPDK Pashupati Kumar
2013-12-06 22:02 ` Thomas Monjalon
2013-12-06 22:12   ` Pashupati Kumar
2013-12-06 22:24     ` Thomas Monjalon
2013-12-06 22:54       ` Pashupati Kumar
2013-12-07 17:54       ` François-Frédéric Ozog
2013-12-19 16:41         ` Thomas Monjalon
2013-12-20 15:39         ` Thomas Monjalon
2013-12-20 16:00           ` François-Frédéric Ozog
2013-12-20 16:36             ` Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).