DPDK patches and discussions
From: "François-Frédéric Ozog" <ff@ozog.com>
To: "'Thomas Monjalon'" <thomas.monjalon@6wind.com>,
	"'Pashupati Kumar'" <kumarp@brocade.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] Bit spinlocks in DPDK
Date: Sat, 7 Dec 2013 18:54:57 +0100
Message-ID: <01d001cef375$7167e300$5437a900$@com>
In-Reply-To: <4656219.tgqzelRNOJ@x220>


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Friday, 6 December 2013 23:24
> To: Pashupati Kumar
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] Bit spinlocks in DPDK
> 
> 06/12/2013 14:12, Pashupati Kumar :
> > From: Thomas Monjalon
> > > 06/12/2013 13:04, Pashupati Kumar :
> > > > We use bit spinlocks extensively to have compact data structures.
> > > > Are there any plans for adding them to DPDK in some future release?
> > >
> > > Not sure I understand your request.
> > > Are you looking for that?
> > > 	http://dpdk.org/doc/api/rte__spinlock_8h.html
> >
> > I am looking for spinlocks that use a single bit (bit 31) of a 32 bit
> > word for locking. The rest of the bits in the word are left
> > undisturbed.  This enables more compact data structures as only 1 bit
> > is consumed for the lock.
> 
> Oh yes, like test_and_set_bit_lock() in Linux:
> 	http://lxr.free-electrons.com/source/arch/ia64/include/asm/bitops.h?v=3.12#L205
> 
> I think that a patch would be appreciated :)
> 
> PS: please try to answer below the question. It's far easier to read.
> --
> Thomas

Hi,

I assume you mean the x86-64 version, not the ia64 one:
http://lxr.free-electrons.com/source/arch/x86/include/asm/bitops.h

I agree with Pash, the goal of compacting data structures is of the highest
importance for high performance networking (HPN).
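
For illustration, a bit spinlock of the kind Pash describes (bit 31 of a
32-bit word as the lock, the other 31 bits left undisturbed) could be
sketched roughly as below. This is only a sketch written with C11 atomics,
not an existing DPDK API: the names BIT_LOCK_MASK, bit_spin_trylock,
bit_spin_lock, bit_spin_unlock and bit_lock_demo are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; none of these exist in DPDK. */
#define BIT_LOCK_MASK (UINT32_C(1) << 31)   /* bit 31 is the lock bit */

/* Try once to take the lock; returns true on success. */
static inline bool bit_spin_trylock(_Atomic uint32_t *word)
{
    /* fetch_or sets bit 31 and returns the previous value. We own the
     * lock only if the bit was clear before. The low 31 bits are kept. */
    uint32_t old = atomic_fetch_or_explicit(word, BIT_LOCK_MASK,
                                            memory_order_acquire);
    return (old & BIT_LOCK_MASK) == 0;
}

static inline void bit_spin_lock(_Atomic uint32_t *word)
{
    while (!bit_spin_trylock(word)) {
        /* Spin on plain loads until the bit looks free, then retry. */
        while (atomic_load_explicit(word, memory_order_relaxed) &
               BIT_LOCK_MASK)
            ;
    }
}

static inline void bit_spin_unlock(_Atomic uint32_t *word)
{
    /* Clear only bit 31; release ordering publishes the critical section. */
    atomic_fetch_and_explicit(word, ~BIT_LOCK_MASK, memory_order_release);
}

/* Single-threaded check: the payload bits survive a lock/unlock cycle. */
static int bit_lock_demo(void)
{
    _Atomic uint32_t w = 0x1234;            /* 31 payload bits + lock bit */

    bit_spin_lock(&w);
    assert(atomic_load(&w) == (0x1234u | BIT_LOCK_MASK));
    assert(!bit_spin_trylock(&w));          /* already held */
    bit_spin_unlock(&w);
    return atomic_load(&w) == 0x1234u ? 0 : -1;
}
```

The sketch uses an atomic OR rather than a compare-and-swap so that
concurrent changes to the payload bits cannot make the lock attempt fail
spuriously. Whether that is the right trade-off depends on how the other
bits are updated.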

Last week I gave a training session on some aspects of multiprocessing, and
locking in particular. That put me in the mood to check the DPDK
implementation, and I have a few remarks:

1) If the critical section deals with weakly ordered loads, then explicit
fencing MUST be used: otherwise, out-of-order execution will simply defeat
the critical section. Weakly ordered loads occur in a number of situations
involving Write Combining memory (say, memory-mapped I/O). See Intel
programming guide 3, 8.1.2.2 Software Controlled Bus Locking and 12.10.3
Streaming Load Hint Instruction.

So use rte_mb(), rte_wmb() or rte_rmb() where appropriate. I recommend that
the rte_spinlock_unlock code and documentation explain the out-of-order
execution issues and the conditions under which they have to be mitigated
with rte_*mb(). I wonder whether an explicit mfence in rte_spinlock_unlock
is simply necessary to avoid "hairy" bugs. In addition, we would have
rte_spinlock_unlock_no_mb, used internally for performance reasons and
usable externally by advanced users.
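
To make the two-variant idea concrete, here is a rough sketch of an unlock
with a full barrier next to one without. It is an assumption-laden
illustration, not the rte_spinlock code: demo_spinlock_t and the demo_*
functions are hypothetical, and the C11 fence merely stands in for rte_mb()
or an explicit mfence. Whether a C11 fence actually orders write-combining
accesses depends on the compiler and platform, and this sketch does not
guarantee it.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical minimal lock type for illustration; not rte_spinlock_t. */
typedef struct {
    atomic_int locked;   /* 0 = free, 1 = held */
} demo_spinlock_t;

static inline void demo_spinlock_lock(demo_spinlock_t *sl)
{
    /* exchange returns the old value: 0 means we just took the lock. */
    while (atomic_exchange_explicit(&sl->locked, 1, memory_order_acquire))
        ;
}

/* Unlock with release ordering only: no full barrier is issued. Intended
 * for critical sections that touch only ordinary write-back memory. */
static inline void demo_spinlock_unlock_no_mb(demo_spinlock_t *sl)
{
    atomic_store_explicit(&sl->locked, 0, memory_order_release);
}

/* Unlock preceded by a full fence. The C11 fence is a stand-in for
 * rte_mb() / mfence (an assumption, see the text above). */
static inline void demo_spinlock_unlock(demo_spinlock_t *sl)
{
    atomic_thread_fence(memory_order_seq_cst);
    demo_spinlock_unlock_no_mb(sl);
}

/* Single-threaded check: returns the lock state after a lock/unlock. */
static int demo_unlock_check(void)
{
    demo_spinlock_t sl = { 0 };

    demo_spinlock_lock(&sl);
    assert(atomic_load(&sl.locked) == 1);
    demo_spinlock_unlock(&sl);
    return atomic_load(&sl.locked);
}
```

The default variant pays for the barrier on every unlock; the _no_mb
variant leaves the choice to callers who know what kind of memory their
critical section touches.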

2) Code that uses rte spinlocks is subject to starvation. A simple program
with one producer and one consumer demonstrates the issue: I saw either the
producer not giving the consumer a single chance after 1M loops, or the
consumer grabbing the lock right after the first production and looping
forever, testing whether the producer had produced something (the
unlock/lock cycle is too fast)! Introducing "random" rte_pause() calls
(actually lcore_id pauses) solves the problem in an ugly, unpredictable
manner. Starvation may not create any visible problem under light load, but
may end up being a real issue at several million packets per second. There
are a number of scenarios where we don't care whether the lock algorithm
creates starvation, but sometimes we do. So I will provide the community
with a ticketlock implementation that is starvation-free.
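
For readers unfamiliar with the technique, a generic textbook ticket lock
can be sketched as follows. This is not the implementation promised above,
only an illustration with C11 atomics. The type demo_ticketlock_t and the
demo_* functions are hypothetical names, and a real version would put a
pause hint such as rte_pause() in the wait loop.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical type for illustration; not a DPDK structure. */
typedef struct {
    atomic_uint next;      /* next ticket to hand out */
    atomic_uint serving;   /* ticket currently allowed in */
} demo_ticketlock_t;

static inline void demo_ticketlock_lock(demo_ticketlock_t *tl)
{
    /* Draw a ticket; tickets are granted strictly in FIFO order. */
    unsigned int me = atomic_fetch_add_explicit(&tl->next, 1,
                                                memory_order_relaxed);

    /* Wait until our number comes up. */
    while (atomic_load_explicit(&tl->serving, memory_order_acquire) != me)
        ;   /* a real implementation would pause here */
}

static inline void demo_ticketlock_unlock(demo_ticketlock_t *tl)
{
    /* Hand the lock to the holder of the next ticket. */
    atomic_fetch_add_explicit(&tl->serving, 1, memory_order_release);
}

/* Single-threaded check of the ticket bookkeeping; returns the final
 * value of the serving counter. */
static int demo_ticket_order(void)
{
    demo_ticketlock_t tl = { 0, 0 };

    demo_ticketlock_lock(&tl);             /* draws ticket 0, enters */
    assert(atomic_load(&tl.next) == 1);
    assert(atomic_load(&tl.serving) == 0);
    demo_ticketlock_unlock(&tl);           /* now serving ticket 1 */

    demo_ticketlock_lock(&tl);             /* draws ticket 1, enters */
    demo_ticketlock_unlock(&tl);           /* now serving ticket 2 */
    return (int)atomic_load(&tl.serving);
}
```

Because each waiter enters in the order it drew its ticket, a thread that
releases and immediately re-requests the lock goes to the back of the
queue, which is what removes the starvation described above. The cost is
two counters instead of one word.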


What is the policy for sharing a sample program: inline in a mail or as an
attachment?
What is the policy for submitting a patch?


François-Frédéric
