DPDK patches and discussions
 help / color / mirror / Atom feed
From: Jerin Jacob <jerin.jacob@caviumnetworks.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "Kuusisaari, Juhamatti" <Juhamatti.Kuusisaari@coriant.com>,
	"'dev@dpdk.org'" <dev@dpdk.org>,
	"Jan Viktorin (viktorin@rehivetech.com)"
	<viktorin@rehivetech.com>,
	"Chao Zhu (bjzhuc@cn.ibm.com)" <bjzhuc@cn.ibm.com>
Subject: Re: [dpdk-dev] [PATCH] lib: move rte_ring read barrier to correct location
Date: Fri, 15 Jul 2016 11:59:06 +0530	[thread overview]
Message-ID: <20160715062905.GA6473@localhost.localdomain> (raw)
In-Reply-To: <2601191342CEEE43887BDE71AB97725836B7DD85@irsmsx105.ger.corp.intel.com>

On Thu, Jul 14, 2016 at 12:56:11PM +0000, Ananyev, Konstantin wrote:
> 
> 
> > The CPU also
> > knows already the value that will be written to cons.tail and that value does not depend on the previous read either. The CPU does not
> > know we are planning to do a spinlock there, so it might do things out-of-order without proper dependencies.
> > 
> > > For  __rte_ring_sc_do_dequeue(), I think you right, we might need
> > > something stronger.
> > > I don't want to put rte_smp_mb() here as it would cause full HW
> > > barrier even on machines with strong memory order (IA).
> > > I think that rte_smp_wmb() might be enough here:
> > > it would force cpu to wait till writes in DEQUEUE_PTRS() are become
> > > visible, which means reads have to be completed too.
> > 
> > In practice I think that rte_smp_wmb() would work fine, even though it is not strictly according to the book. Below solution would be my
> > proposal as a fix to the issue of sc dequeueing (and also to mc dequeueing, if we have the problem of CPU completely ignoring the spinlock
> > in reality there):
> > 
> > DEQUEUE_PTRS();
> > ..
> > rte_smp_wmb();
> > r->cons.tail = cons_next;
> 
> As I said in previous email - it looks good for me for _rte_ring_sc_do_dequeue(),
> but I am interested to hear what  ARM and PPC maintainers think about it.
> Jan, Jerin do you have any comments on it?

Actually it is NOT performance effective and difficult to capture the
ORDER dependency with plane store and load barriers on WEAK ordered machines.
Beyond plane store and load barriers, We need to express  #LoadLoad,
#LoadStore,#StoreStore barrier dependency with Acquire and Release
Semantics in Arch neutral code(Looks like this is compiler barrier on IA)
http://preshing.com/20120913/acquire-and-release-semantics/

For instance, Full barrier CAS(__sync_bool_compare_and_swap) will not be
required for weak ordered machine in MP case.
I can send out a RFC version of ring implementation changes required
with acquire-and-release-semantics.
If it has performance degradation on IA then we can separate it out
through conditional compilation flag.

GCC Built-in Functions for Memory Model Aware Atomic Operations
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

Thoughts ?

Jerin

> Chao, sorry but I still not sure why PPC is considered as architecture with strong memory ordering?
> Might be I am missing something obvious here.
> Thank
> Konstantin
> 

  parent reply	other threads:[~2016-07-15  6:29 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-11 10:20 Juhamatti Kuusisaari
2016-07-11 10:41 ` Ananyev, Konstantin
2016-07-11 11:22   ` Kuusisaari, Juhamatti
2016-07-11 11:40     ` Olivier Matz
2016-07-12  4:10       ` Kuusisaari, Juhamatti
2016-07-11 12:34     ` Ananyev, Konstantin
2016-07-12  5:27       ` Kuusisaari, Juhamatti
2016-07-12 11:01         ` Ananyev, Konstantin
2016-07-12 17:58           ` Ananyev, Konstantin
2016-07-13  5:27             ` Kuusisaari, Juhamatti
2016-07-13 13:00               ` Ananyev, Konstantin
2016-07-14  4:17                 ` Kuusisaari, Juhamatti
2016-07-14 12:56                   ` Ananyev, Konstantin
2016-07-15  5:40                     ` Kuusisaari, Juhamatti
2016-07-15  6:29                     ` Jerin Jacob [this message]
2016-07-15 10:34                       ` Ananyev, Konstantin
2016-07-18  2:47                         ` Jerin Jacob

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160715062905.GA6473@localhost.localdomain \
    --to=jerin.jacob@caviumnetworks.com \
    --cc=Juhamatti.Kuusisaari@coriant.com \
    --cc=bjzhuc@cn.ibm.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    --cc=viktorin@rehivetech.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).