From: Olivier MATZ
To: Jia He
Cc: dev@dpdk.org, jia.he@hxt-semitech.com, jie2.liu@hxt-semitech.com,
 bing.zhao@hxt-semitech.com, konstantin.ananyev@intel.com,
 jerin.jacob@caviumnetworks.com
Date: Thu, 12 Oct 2017 17:53:51 +0200
Message-ID: <20171012155350.j34ddtivxzd27pag@platinum>
In-Reply-To: <20171010095636.4507-1-hejianet@gmail.com>
Subject: Re: [dpdk-dev] [PATCH] ring: guarantee ordering of cons/prod loading when doing enqueue/dequeue

Hi,

On Tue, Oct 10, 2017 at 05:56:36PM +0800, Jia He wrote:
> Before this patch:
> In __rte_ring_move_cons_head()
> ...
>         do {
>                 /* Restore n as it may change every loop */
>                 n = max;
>
>                 *old_head = r->cons.head;                 // 1st load
>                 const uint32_t prod_tail = r->prod.tail;  // 2nd load
>
> On weak memory order architectures (powerpc, arm), the 2nd load might be
> reordered before the 1st load, which makes *entries bigger than we wanted.
> This nasty reordering messed enqueue/dequeue up.
>
> cpu1(producer)          cpu2(consumer)          cpu3(consumer)
>                         load r->prod.tail
> in enqueue:
> load r->cons.tail
> load r->prod.head
> store r->prod.tail
>                                                 load r->cons.head
>                                                 load r->prod.tail
>                                                 ...
>                                                 store r->cons.{head,tail}
>                         load r->cons.head
>
> THEN, r->cons.head will be bigger than prod_tail, which makes *entries
> very big.
>
> After this patch, the old cons.head will be recalculated after a failure
> of rte_atomic32_cmpset.
>
> There is no such issue on x86, because x86 has a strong memory order
> model.
>
> Signed-off-by: Jia He
> Signed-off-by: jia.he@hxt-semitech.com
> Signed-off-by: jie2.liu@hxt-semitech.com
> Signed-off-by: bing.zhao@hxt-semitech.com
>
> ---
>  lib/librte_ring/rte_ring.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index 5e9b3b7..15c72e2 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -409,6 +409,10 @@ __rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
>  		n = max;
>  
>  		*old_head = r->prod.head;
> +
> +		/* load of prod.tail can't be reordered before cons.head */
> +		rte_smp_rmb();
> +
>  		const uint32_t cons_tail = r->cons.tail;
>  		/*
>  		 * The subtraction is done between two unsigned 32bits value
> @@ -517,6 +521,10 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
>  		n = max;
>  
>  		*old_head = r->cons.head;
> +
> +		/* load of prod.tail can't be reordered before cons.head */
> +		rte_smp_rmb();
> +
>  		const uint32_t prod_tail = r->prod.tail;
>  		/* The subtraction is done between two unsigned 32bits value
>  		 * (the result is always modulo 32 bits even if we have
> -- 
> 2.7.4
> 

The explanation convinces me. However, since this is in a critical path,
it would be good to have other opinions.
This patch reminds me of a previous discussion that was also related to
a memory barrier, but at another place:
http://dpdk.org/ml/archives/dev/2016-July/043765.html

It led to this patch:
http://dpdk.org/browse/dpdk/commit/?id=ecc7d10e448e

But it was finally reverted:
http://dpdk.org/browse/dpdk/commit/?id=c3acd92746c3

Konstantin, Jerin, do you have any comment?