From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <stephen@networkplumber.org>
Received: from mail-pf0-f178.google.com (mail-pf0-f178.google.com
 [209.85.192.178]) by dpdk.org (Postfix) with ESMTP id EC9471B257
 for <dev@dpdk.org>; Thu, 12 Oct 2017 18:15:39 +0200 (CEST)
Received: by mail-pf0-f178.google.com with SMTP id p87so5517501pfj.3
 for <dev@dpdk.org>; Thu, 12 Oct 2017 09:15:39 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=networkplumber-org.20150623.gappssmtp.com; s=20150623;
 h=date:from:to:cc:subject:message-id:in-reply-to:references
 :mime-version:content-transfer-encoding;
 bh=Y27uNFB+/Zsk61N3KqL6snWgH/GNSvLKYuyngkOD4Hk=;
 b=B1+gcz6Eqqx8l+sTkNXDm1HDct4k7FfkILY1Jvri5eQ7b6S2pzuT/odic7qr/GbAFT
 ph+Y0KUDcpYsj5MuwZIX16fSafnX72QRMi+S30+1foWXbcS5Ocfh5OMQVFmtPqTE9Tef
 9Hl7B40TKoq0IPPNQvBR6ldi67LKSnUGiz31C4rHvNsRne+XsmDVnLG4EPJnRlEBIx91
 qfzSXt1eyzbACPCLPYcY6zwpIayCjhaflH4U0gUxB6xNxPgE4tLOU8ieIFegYW0ekcrD
 Ab7951I7ePptDzFEnP9D1D/7Rz1oqshcWsWSQ4zJfuM0+voSS/f2cXUoBy6ZKZ1q+rCZ
 Gyhw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to
 :references:mime-version:content-transfer-encoding;
 bh=Y27uNFB+/Zsk61N3KqL6snWgH/GNSvLKYuyngkOD4Hk=;
 b=YeQm9t+bByVMFHWqXGt8gY2SrGR8jCYhUAf2Ll4nHmidImR5tPnYPSG6RGv+DyLlns
 5hBtdO8zjUdgXnRWdIKdAeAm5ezi1XkPN8NzR6F4vwJ78PJeE6k4ObZbrzTsW4yJfp/M
 1rt22T9ErLlYU7ZrBowbEi1zMyBjbYBDN0OCElqyrGKprN/g37EuIzZaxOigBrMeWo9Z
 Nnj/+t7StUkHa9kcEMHbBiDhNYeoEkOW8gBUiWM2q+Xjl2zaHSYcJGoDr29sSD8ZT2cR
 4YJP1RPmoZLtFUNc7agzS2x41cHviymp683pbhbH3QVe/WPv/Id8sUp2X7yq/Pi9kQFr
 JzTQ==
X-Gm-Message-State: AMCzsaX3W4Iumoj9f490TdBEpcxYqVrPNKJzsSnqUpszSzx0VbvLYhyb
 i2XnnHP9qOaoH0NkvQ+H7yGMNA==
X-Google-Smtp-Source: AOwi7QA3hB2n6bfzlrI157D13tWlQz0nV2koUCjCtVvzkoX45JxTxL7nx06lvxcG/8mcYFU8iJxGhw==
X-Received: by 10.98.62.17 with SMTP id l17mr2686830pfa.210.1507824939111;
 Thu, 12 Oct 2017 09:15:39 -0700 (PDT)
Received: from xeon-e3 (76-14-207-240.or.wavecable.com. [76.14.207.240])
 by smtp.gmail.com with ESMTPSA id 77sm30903922pfl.133.2017.10.12.09.15.38
 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);
 Thu, 12 Oct 2017 09:15:39 -0700 (PDT)
Date: Thu, 12 Oct 2017 09:15:31 -0700
From: Stephen Hemminger <stephen@networkplumber.org>
To: Olivier MATZ <olivier.matz@6wind.com>
Cc: Jia He <hejianet@gmail.com>, dev@dpdk.org, jia.he@hxt-semitech.com,
 jie2.liu@hxt-semitech.com, bing.zhao@hxt-semitech.com,
 konstantin.ananyev@intel.com, jerin.jacob@caviumnetworks.com
Message-ID: <20171012091531.26178988@xeon-e3>
In-Reply-To: <20171012155350.j34ddtivxzd27pag@platinum>
References: <20171010095636.4507-1-hejianet@gmail.com>
 <20171012155350.j34ddtivxzd27pag@platinum>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [PATCH] ring: guarantee ordering of cons/prod
 loading when doing enqueue/dequeue
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 12 Oct 2017 16:15:40 -0000

On Thu, 12 Oct 2017 17:53:51 +0200
Olivier MATZ <olivier.matz@6wind.com> wrote:

> Hi,
> 
> On Tue, Oct 10, 2017 at 05:56:36PM +0800, Jia He wrote:
> > Before this patch:
> > In __rte_ring_move_cons_head()
> > ...
> >         do {
> >                 /* Restore n as it may change every loop */
> >                 n = max;
> > 
> >                 *old_head = r->cons.head;                //1st load
> >                 const uint32_t prod_tail = r->prod.tail; //2nd load
> > 
> > In weak memory order architectures(powerpc,arm), the 2nd load might be
> > reodered before the 1st load, that makes *entries is bigger than we wanted.
> > This nasty reording messed enque/deque up.
> > 
> > cpu1(producer)          cpu2(consumer)          cpu3(consumer)
> >                         load r->prod.tail
> > in enqueue:
> > load r->cons.tail
> > load r->prod.head
> > 
> > store r->prod.tail
> > 
> >                                                 load r->cons.head
> >                                                 load r->prod.tail
> >                                                 ...
> >                                                 store r->cons.{head,tail}
> >                         load r->cons.head
> > 
> > THEN,r->cons.head will be bigger than prod_tail, then make *entries very big
> > 
> > After this patch, the old cons.head will be recaculated after failure of
> > rte_atomic32_cmpset
> > 
> > There is no such issue in X86 cpu, because X86 is strong memory order model
> > 
> > Signed-off-by: Jia He <hejianet@gmail.com>
> > Signed-off-by: jia.he@hxt-semitech.com
> > Signed-off-by: jie2.liu@hxt-semitech.com
> > Signed-off-by: bing.zhao@hxt-semitech.com
> > 
> > ---
> >  lib/librte_ring/rte_ring.h | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > index 5e9b3b7..15c72e2 100644
> > --- a/lib/librte_ring/rte_ring.h
> > +++ b/lib/librte_ring/rte_ring.h
> > @@ -409,6 +409,10 @@ __rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
> >  		n = max;
> >  
> >  		*old_head = r->prod.head;
> > +
> > +		/* load of prod.tail can't be reordered before cons.head */
> > +		rte_smp_rmb();
> > +
> >  		const uint32_t cons_tail = r->cons.tail;
> >  		/*
> >  		 *  The subtraction is done between two unsigned 32bits value
> > @@ -517,6 +521,10 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> >  		n = max;
> >  
> >  		*old_head = r->cons.head;
> > +
> > +		/* load of prod.tail can't be reordered before cons.head */
> > +		rte_smp_rmb();
> > +
> >  		const uint32_t prod_tail = r->prod.tail;
> >  		/* The subtraction is done between two unsigned 32bits value
> >  		 * (the result is always modulo 32 bits even if we have
> > -- 
> > 2.7.4
> >   
> 
> The explanation convinces me.
> 
> However, since it's in a critical path, it would be good to have other
> opinions. This patch reminds me this discussion, that was also related to
> memory barrier, but at another place:
> http://dpdk.org/ml/archives/dev/2016-July/043765.html
> Lead to that patch: http://dpdk.org/browse/dpdk/commit/?id=ecc7d10e448e
> But finally reverted: http://dpdk.org/browse/dpdk/commit/?id=c3acd92746c3
> 
> Konstatin, Jerin, do you have any comment?

On strongly ordered architectures like x86 rte_smp_rmb() turns into compiler_barrier().
Compiler barriers generate no instructions, they just tell the compiler to
not do data flow optimization across.  Since the pointers are mostly volatile
it shouldn't even change the generated code.