From mboxrd@z Thu Jan  1 00:00:00 1970
From: Thomas Monjalon <thomas@monjalon.net>
To: Takeshi Yoshimura <t.yoshimura8869@gmail.com>
Cc: stable@dpdk.org, dev@dpdk.org, olivier.matz@6wind.com,
 chaozhu@linux.vnet.ibm.com, konstantin.ananyev@intel.com,
 Jerin Jacob <jerin.jacob@caviumnetworks.com>, honnappa.nagarahalli@arm.com
Date: Wed, 24 Mar 2021 22:45:16 +0100
Message-ID: <7532216.RsKJNViV3k@thomas>
In-Reply-To: <20180717033410.GA3344@jerin>
References: <20180712024414.4756-1-t.yoshimura8869@gmail.com>
 <CALBPOTWYxCN3+DL_5Ozx3parn+hyo582BJgrjRzicPOEwR1CcA@mail.gmail.com>
 <20180717033410.GA3344@jerin>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"
Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] rte_ring: fix racy
 dequeue/enqueue in ppc64

No reply after more than 2 years.
Unfortunately, it is probably outdated now.
Classified as "Changes Requested".


17/07/2018 05:34, Jerin Jacob:
> From: Takeshi Yoshimura <t.yoshimura8869@gmail.com>
> 
> Cc: olivier.matz@6wind.com
> Cc: chaozhu@linux.vnet.ibm.com
> Cc: konstantin.ananyev@intel.com
> 
> > 
> > > Adding rte_smp_rmb() causes a performance regression on non-x86 platforms.
> > > Having said that, a load-load barrier can be expressed very well with the
> > > C11 memory model. I guess ppc64 supports the C11 memory model. If so,
> > > could you try CONFIG_RTE_RING_USE_C11_MEM_MODEL=y for ppc64 and check
> > > the original issue?
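
For context, the C11 memory model can express this kind of load-load ordering
with an acquire load of the producer's tail rather than a standalone read
barrier, which is what the CONFIG_RTE_RING_USE_C11_MEM_MODEL path is meant to
rely on. A minimal sketch of the idea, assuming GCC/Clang __atomic builtins;
the function name is illustrative and this is not the actual rte_ring code:

    #include <stdint.h>

    /*
     * Sketch: the consumer reads the producer tail with acquire semantics.
     * Later loads of the ring array cannot be hoisted above this load, so
     * no separate rte_smp_rmb() is needed on this path.
     */
    static inline uint32_t
    load_prod_tail_acquire(const uint32_t *prod_tail)
    {
            return __atomic_load_n(prod_tail, __ATOMIC_ACQUIRE);
    }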
> > 
> > Yes, the performance regression happens on non-x86 with a single
> > producer/consumer. The average latency of an enqueue increased from
> > 21 nsec to 24 nsec in my simple experiment, but I think it is worth it.
> 
> That varies from machine to machine. What is the burst size, etc.?
> 
> > 
> > 
> > I also tested the C11 rte_ring; however, it caused the same race condition
> > on ppc64. I tried to fix the C11 problem as well, but I also found that the
> > C11 rte_ring had other potentially incorrect choices of memory orders, which
> > caused another race condition on ppc64.
> 
> Does it happen on all ppc64 machines, or on a specific machine?
> Are the following tests passing on your system without the patch?
> test/test/test_ring_perf.c
> test/test/test_ring.c
> 
> > 
> > For example,
> > __ATOMIC_ACQUIRE is passed to __atomic_compare_exchange_n(), but
> > I am not sure why the load-acquire is used for the compare exchange.
> 
> It is correct as per C11 acquire and release semantics.
> 
> > Also, in update_tail, the pause can be called before the data copy because
> > ht->tail is loaded without __atomic_load_n.
> > 
> > Memory ordering is simply difficult, so it might take a bit longer to check
> > whether the code is correct. I think I can fix the C11 rte_ring in another
> > patch.
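
To make the two concerns above concrete, here is a rough sketch of the pattern
being discussed, assuming GCC/Clang __atomic builtins; the names are
illustrative and this is not the code in rte_ring_c11_mem.h:

    #include <stdint.h>
    #include <rte_pause.h>

    /* Which memory order does the compare-exchange on the head need? */
    static inline int
    move_head_sketch(uint32_t *head, uint32_t old_head, uint32_t new_head)
    {
            return __atomic_compare_exchange_n(head, &old_head, new_head,
                            0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
    }

    /*
     * The update_tail concern: the tail is spun on before the new value is
     * stored. Whether that spin must use __atomic_load_n(), and with which
     * memory order, is exactly what is questioned above; the release store
     * is what publishes the copied ring entries.
     */
    static inline void
    update_tail_sketch(uint32_t *tail, uint32_t old_val, uint32_t new_val)
    {
            while (__atomic_load_n(tail, __ATOMIC_RELAXED) != old_val)
                    rte_pause();

            __atomic_store_n(tail, new_val, __ATOMIC_RELEASE);
    }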
> > 
> > >>
> > >> SPDK blobfs encountered a crash around rte_ring dequeues in ppc64.
> > >> It uses a single consumer and multiple producers for a rte_ring.
> > >> The problem was a load-load reorder in rte_ring_sc_dequeue_bulk().
> > >
> > > Adding rte_smp_rmb() causes a performance regression on non-x86 platforms.
> > > Having said that, a load-load barrier can be expressed very well with the
> > > C11 memory model. I guess ppc64 supports the C11 memory model. If so,
> > > could you try CONFIG_RTE_RING_USE_C11_MEM_MODEL=y for ppc64 and check
> > > the original issue?
> > >
> > >>
> > >> The reordered loads happened on r->prod.tail in
> 
> There is rte_smp_rmb() just before reading r->prod.tail in 
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> _rte_ring_move_cons_head(). Would that not satisfy the requirement?
> 
> Can you try adding a compiler barrier and see whether the compiler is
> reordering the loads?
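
One way to run that experiment, shown here as a variant of the hunk in the
patch below (a debugging sketch only, not a proposed fix): replace the proposed
rte_smp_rmb() with DPDK's pure compiler fence. If the crash disappears with
only the compiler barrier, the compiler was reordering the loads; if it
persists, it is a CPU-level reorder on ppc64 and a real read barrier is needed.

    if (is_sc) {
            rte_compiler_barrier();  /* compiler-only fence, no CPU barrier */
            r->cons.head = *new_head, success = 1;
    } else
            success = rte_atomic32_cmpset(&r->cons.head, *old_head,
                            *new_head);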
> 
> DPDK's ring implementation is based on FreeBSD's ring implementation; I
> don't see a need for such a barrier:
> 
> https://github.com/freebsd/freebsd/blob/master/sys/sys/buf_ring.h
> 
> If it is something specific to ppc64, or to a specific ppc64 machine, we
> could add an arch-specific compile option to avoid a performance impact on
> other architectures.
> 
> > >> __rte_ring_move_cons_head() (rte_ring_generic.h) and ring[idx] in
> > >> DEQUEUE_PTRS() (rte_ring.h). They have a load-load control
> > >> dependency, but the code does not enforce it. Note that they are
> > >> not reordered when __rte_ring_move_cons_head() is called with is_sc != 1,
> > >> because cmpset invokes a read barrier.
> > >>
> > >> The stores paired with these loads are in ENQUEUE_PTRS() and
> > >> update_tail(). Simplified code around the reorder is the following:
> > >>
> > >> Consumer             Producer
> > >> load idx[ring]
> > >>                      store idx[ring]
> > >>                      store r->prod.tail
> > >> load r->prod.tail
> > >>
> > >> In this case, the consumer loads the old idx[ring] yet sees the new
> > >> r->prod.tail, so the stale load appears valid.
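
Putting the timeline above into code: a heavily simplified single-consumer
dequeue, with the logic of __rte_ring_move_cons_head() and DEQUEUE_PTRS()
collapsed into one function (a sketch with illustrative names, not the actual
DPDK code). The two loads marked below are the ones that get reordered:

    #include <stdint.h>
    #include <rte_atomic.h>  /* rte_smp_rmb() */

    static inline int
    sc_dequeue_one_sketch(void * const *ring, uint32_t mask,
                    const volatile uint32_t *prod_tail,
                    uint32_t cons_head, void **obj)
    {
            uint32_t tail = *prod_tail;      /* load 1: producer progress */

            if (tail == cons_head)
                    return 0;                /* nothing published yet */

            /*
             * Without a read barrier here, ppc64 may perform load 2 before
             * load 1, so a stale ring slot is validated by a fresh tail.
             */
            rte_smp_rmb();

            *obj = ring[cons_head & mask];   /* load 2: the ring slot */
            return 1;
    }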
> > >>
> > >> I added a read barrier in the case where __IS_SC is passed to
> > >> __rte_ring_move_cons_head(). I also fixed __rte_ring_move_prod_head()
> > >> to avoid similar problems with a single producer.
> > >>
> > >> Cc: stable@dpdk.org
> > >>
> > >> Signed-off-by: Takeshi Yoshimura <tyos@jp.ibm.com>
> > >> ---
> > >>  lib/librte_ring/rte_ring_generic.h | 10 ++++++----
> > >>  1 file changed, 6 insertions(+), 4 deletions(-)
> > >>
> > >> diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
> > >> index ea7dbe5b9..477326180 100644
> > >> --- a/lib/librte_ring/rte_ring_generic.h
> > >> +++ b/lib/librte_ring/rte_ring_generic.h
> > >> @@ -90,9 +90,10 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
> > >>                         return 0;
> > >>
> > >>                 *new_head = *old_head + n;
> > >> -               if (is_sp)
> > >> +               if (is_sp) {
> > >> +                       rte_smp_rmb();
> > >>                         r->prod.head = *new_head, success = 1;
> > >> -               else
> > >> +               } else
> > >>                         success = rte_atomic32_cmpset(&r->prod.head,
> > >>                                         *old_head, *new_head);
> > >>         } while (unlikely(success == 0));
> > >> @@ -158,9 +159,10 @@ __rte_ring_move_cons_head(struct rte_ring *r, unsigned int is_sc,
> > >>                         return 0;
> > >>
> > >>                 *new_head = *old_head + n;
> > >> -               if (is_sc)
> > >> +               if (is_sc) {
> > >> +                       rte_smp_rmb();
> > >>                         r->cons.head = *new_head, success = 1;
> > >> -               else
> > >> +               } else
> > >>                         success = rte_atomic32_cmpset(&r->cons.head, *old_head,
> > >>                                         *new_head);
> > >>         } while (unlikely(success == 0));
> > >> --
> > >> 2.17.1