DPDK patches and discussions
 help / color / mirror / Atom feed
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: Feifei Wang <Feifei.Wang2@arm.com>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	Takeshi Yoshimura <t.yoshimura8869@gmail.com>
Cc: "stable@dpdk.org" <stable@dpdk.org>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	"chaozhu@linux.vnet.ibm.com" <chaozhu@linux.vnet.ibm.com>,
	"konstantin.ananyev@intel.com" <konstantin.ananyev@intel.com>,
	"jerinj@marvell.com" <jerinj@marvell.com>, nd <nd@arm.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	nd <nd@arm.com>
Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] rte_ring: fix racy dequeue/enqueue in ppc64
Date: Wed, 16 Jun 2021 16:37:31 +0000
Message-ID: <DBAPR08MB58146BE07EE82497AF4B1D65980F9@DBAPR08MB5814.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <AM9PR08MB691571CAF56CF5D58E1B0C72C80F9@AM9PR08MB6915.eurprd08.prod.outlook.com>

<snip>

> 
> Hi, everyone
> 
> This patch can be closed with the following reasons.
> 
> > -----邮件原件-----
> > 发件人: dev <dev-bounces@dpdk.org> 代表 Honnappa Nagarahalli
> > 发送时间: 2021年3月28日 9:00
> > 收件人: thomas@monjalon.net; Takeshi Yoshimura
> > <t.yoshimura8869@gmail.com>
> > 抄送: stable@dpdk.org; dev@dpdk.org; olivier.matz@6wind.com;
> > chaozhu@linux.vnet.ibm.com; konstantin.ananyev@intel.com; Jerin Jacob
> > <jerin.jacob@caviumnetworks.com>; nd <nd@arm.com>; nd
> <nd@arm.com>
> > 主题: Re: [dpdk-dev] [dpdk-stable] [PATCH] rte_ring: fix racy
> > dequeue/enqueue in ppc64
> >
> > <snip>
> >
> > > Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH] rte_ring: fix racy
> > > dequeue/enqueue in ppc64
> > >
> > > No reply after more than 2 years.
> > > Unfortunately it is probably outdated now.
> > > Classified as "Changes Requested".
> > Looking at the code, I think this patch in fact fixes a bug.
> > Appreciate rebasing this patch.
> >
> > The problem is already fixed in '__rte_ring_move_cons_head' but needs
> > to be fixed in '__rte_ring_move_prod_head'.
> > This problem is fixed for C11 version due to acquire load of cons.tail
> > and prod.tail.
> 
> First, for consumer in dequeue:
> the reason for that adding a rmb in move_cons_head of “generic” is based on
> this patch:
> http://patches.dpdk.org/project/dpdk/patch/1552409933-45684-2-git-send-
> email-gavin.hu@arm.com/
> 
> Slot                        Consumer                                                           Producer
> 1                     dequeue elements
> 2                                                                                                  update prod_tail
> 3                   load new prod_tail
> 4                   check room is enough(n < entries)
> 
> Dequeue elements maybe before load updated prod_tail, so consumer can
> load incorrect elements value.
> For dequeue multiple consumers case, ‘rte_atomic32_cmpset’ with acquire
> and release order can prevent dequeue before load prod_tail, no extra rmb is
> needed.
> 
> Second, for single producer in enqueue:
> 
> Slot                        Producer                                                             Consumer
> 1                     enqueue elements(not commited)
> 2                                                                                                  update
> consumer_tail
> 3                   load new consumer_tail
> 4                   check room is enough(n < entries)
> 5                   enqueued elements is committed
> 
> Though enqueue elements maybe reorder before load consumer_tail, these
> elements will not be committed until ‘check’ has finished. So from load to
> write control dependency is reliable and rmb is not needed here.
> [1] https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf (page:15)
> 
> As a result, it is unnecessary to add a rmb for enqueue single producer due to
> control dependency. And this patch can be closed.
Thanks Feifei, I did not consider the control dependency from load to store which is reliable in my comments below.
Agree, we can reject this patch.

> 
> Best Regards
> Feifei
> >
> > >
> > >
> > > 17/07/2018 05:34, Jerin Jacob:
> > > > From: Takeshi Yoshimura <t.yoshimura8869@gmail.com>
> > > >
> > > > Cc: olivier.matz@6wind.com
> > > > Cc: chaozhu@linux.vnet.ibm.com
> > > > Cc: konstantin.ananyev@intel.com
> > > >
> > > > >
> > > > > > Adding rte_smp_rmb() cause performance regression on non x86
> > > platforms.
> > > > > > Having said that, load-load barrier can be expressed very
> > > > > > well with C11 memory model. I guess ppc64 supports C11 memory
> model.
> > > > > > If so, Could you try CONFIG_RTE_RING_USE_C11_MEM_MODEL=y
> > for
> > > > > > ppc64 and check original issue?
> > > > >
> > > > > Yes, the performance regression happens on non-x86 with single
> > > > > producer/consumer.
> > > > > The average latency of an enqueue was increased from 21 nsec to
> > > > > 24 nsec in my simple experiment. But, I think it is worth it.
> > > >
> > > > That varies to machine to machine. What is the burst size etc.
> > > >
> > > > >
> > > > >
> > > > > I also tested C11 rte_ring, however, it caused the same race
> > > > > condition in
> > > ppc64.
> > > > > I tried to fix the C11 problem as well, but I also found the C11
> > > > > rte_ring had other potential incorrect choices of memory orders,
> > > > > which caused another race condition in ppc64.
> > > >
> > > > Does it happens on all ppc64 machines? Or on a specific machine?
> > > > Is following tests are passing on your system without the patch?
> > > > test/test/test_ring_perf.c
> > > > test/test/test_ring.c
> > > >
> > > > >
> > > > > For example,
> > > > > __ATOMIC_ACQUIRE is passed to __atomic_compare_exchange_n(),
> > but I
> > > > > am not sure why the load-acquire is used for the compare exchange.
> > > >
> > > > It correct as per C11 acquire and release semantics.
> > > >
> > > > > Also in update_tail, the pause can be called before the data
> > > > > copy because of ht->tail load without atomic_load_n.
> > > > >
> > > > > The memory order is simply difficult, so it might take a bit
> > > > > longer time to check if the code is correct. I think I can fix
> > > > > the
> > > > > C11 rte_ring as another patch.
> > > > >
> > > > > >>
> > > > > >> SPDK blobfs encountered a crash around rte_ring dequeues in
> ppc64.
> > > > > >> It uses a single consumer and multiple producers for a rte_ring.
> > > > > >> The problem was a load-load reorder in
> > rte_ring_sc_dequeue_bulk().
> > > > > >
> > > > > > Adding rte_smp_rmb() cause performance regression on non x86
> > > platforms.
> > > > > > Having said that, load-load barrier can be expressed very
> > > > > > well with C11 memory model. I guess ppc64 supports C11 memory
> model.
> > > > > > If so, Could you try CONFIG_RTE_RING_USE_C11_MEM_MODEL=y
> > for
> > > > > > ppc64 and check original issue?
> > > > > >
> > > > > >>
> > > > > >> The reordered loads happened on r->prod.tail in
> > > >
> > > > There is rte_smp_rmb() just before reading r->prod.tail in
> > > >         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > _rte_ring_move_cons_head(). Would that not suffice the requirement?
> > > >
> > > > Can you check adding compiler barrier and see is compiler is
> > > > reordering the stuff?
> > > >
> > > > DPDK's ring implementation is based freebsd's ring implementation,
> > > > I don't see need for such barrier
> > > >
> > > > https://github.com/freebsd/freebsd/blob/master/sys/sys/buf_ring.h
> > > >
> > > > If it is something specific to ppc64 or a specific ppc64 machine,
> > > > we could add a compile option as it is arch specific to avoid
> > > > performance impact on other architectures.
> > > >
> > > > > >> __rte_ring_move_cons_head() (rte_ring_generic.h) and
> > > > > >> ring[idx] in
> > > > > >> DEQUEUE_PTRS() (rte_ring.h). They have a load-load control
> > > > > >> dependency, but the code does not satisfy it. Note that they
> > > > > >> are not reordered if __rte_ring_move_cons_head() with is_sc
> > > > > >> !=
> > > > > >> 1 because cmpset invokes a read barrier.
> > > > > >>
> > > > > >> The paired stores on these loads are in ENQUEUE_PTRS() and
> > > > > >> update_tail(). Simplified code around the reorder is the following.
> > > > > >>
> > > > > >> Consumer             Producer
> > > > > >> load idx[ring]
> > > > > >>                      store idx[ring]
> > > > > >>                      store r->prod.tail load r->prod.tail
> > > > > >>
> > > > > >> In this case, the consumer loads old idx[ring] and confirms
> > > > > >> the load is valid with the new r->prod.tail.
> > > > > >>
> > > > > >> I added a read barrier in the case where __IS_SC is passed to
> > > > > >> __rte_ring_move_cons_head(). I also fixed
> > > > > >> __rte_ring_move_prod_head() to avoid similar problems with a
> > > > > >> single
> > > producer.
> > > > > >>
> > > > > >> Cc: stable@dpdk.org
> > > > > >>
> > > > > >> Signed-off-by: Takeshi Yoshimura <tyos@jp.ibm.com>
> > > > > >> ---
> > > > > >>  lib/librte_ring/rte_ring_generic.h | 10 ++++++----
> > > > > >>  1 file changed, 6 insertions(+), 4 deletions(-)
> > > > > >>
> > > > > >> diff --git a/lib/librte_ring/rte_ring_generic.h
> > > > > >> b/lib/librte_ring/rte_ring_generic.h
> > > > > >> index ea7dbe5b9..477326180 100644
> > > > > >> --- a/lib/librte_ring/rte_ring_generic.h
> > > > > >> +++ b/lib/librte_ring/rte_ring_generic.h
> > > > > >> @@ -90,9 +90,10 @@ __rte_ring_move_prod_head(struct rte_ring
> > > > > >> *r,
> > > unsigned int is_sp,
> > > > > >>                         return 0;
> > > > > >>
> > > > > >>                 *new_head = *old_head + n;
> > > > > >> -               if (is_sp)
> > > > > >> +               if (is_sp) {
> > > > > >> +                       rte_smp_rmb();
> > > > > >>                         r->prod.head = *new_head, success = 1;
> > > > > >> -               else
> > > > > >> +               } else
> > > > > >>                         success = rte_atomic32_cmpset(&r->prod.head,
> > > > > >>                                         *old_head, *new_head);
> > > > > >>         } while (unlikely(success == 0)); @@ -158,9 +159,10
> > > > > >> @@ __rte_ring_move_cons_head(struct rte_ring *r, unsigned int
> is_sc,
> > > > > >>                         return 0;
> > > > > >>
> > > > > >>                 *new_head = *old_head + n;
> > > > > >> -               if (is_sc)
> > > > > >> +               if (is_sc) {
> > > > > >> +                       rte_smp_rmb();
> > > > > >>                         r->cons.head = *new_head, success = 1;
> > > > > >> -               else
> > > > > >> +               } else
> > > > > >>                         success =
> > > > > >> rte_atomic32_cmpset(&r->cons.head,
> > *old_head,
> > > > > >>                                         *new_head);
> > > > > >>         } while (unlikely(success == 0));
> > > > > >> --
> > > > > >> 2.17.1
> > >
> > >


      reply	other threads:[~2021-06-16 16:38 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-12  2:44 [dpdk-dev] " Takeshi Yoshimura
2018-07-12 17:08 ` Jerin Jacob
2018-07-17  2:54   ` Takeshi Yoshimura
2018-07-17  3:34     ` Jerin Jacob
2021-03-24 21:45       ` [dpdk-dev] [dpdk-stable] " Thomas Monjalon
2021-03-28  1:00         ` Honnappa Nagarahalli
2021-06-16  7:14           ` [dpdk-dev] 回复: " Feifei Wang
2021-06-16 16:37             ` Honnappa Nagarahalli [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DBAPR08MB58146BE07EE82497AF4B1D65980F9@DBAPR08MB5814.eurprd08.prod.outlook.com \
    --to=honnappa.nagarahalli@arm.com \
    --cc=Feifei.Wang2@arm.com \
    --cc=chaozhu@linux.vnet.ibm.com \
    --cc=dev@dpdk.org \
    --cc=jerinj@marvell.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=nd@arm.com \
    --cc=olivier.matz@6wind.com \
    --cc=stable@dpdk.org \
    --cc=t.yoshimura8869@gmail.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git