DPDK patches and discussions
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
	"Harry van Haaren" <harry.van.haaren@intel.com>, <dev@dpdk.org>
Cc: "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>,
	nd <nd@arm.com>, nd <nd@arm.com>
Subject: RE: [PATCH 2/2] service: fix potential stats race-condition on MT services
Date: Fri, 8 Jul 2022 20:08:26 +0200
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D871BA@smartserver.smartshare.dk>
In-Reply-To: <DBAPR08MB581412228A468F7156FB87D498829@DBAPR08MB5814.eurprd08.prod.outlook.com>

> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Friday, 8 July 2022 19.40
> 
> <snip>
> > > > > >
> > > > > > This commit fixes a potential racy add that could occur if
> > > > > > multiple service-lcores were executing the same MT-safe
> > > > > > service at the same time, with service statistics collection
> > > > > > enabled.
> > > > > >
> > > > > > Because multiple threads can run and execute the service, the
> > > > > > stats values can have multiple writer threads, resulting in
> > > > > > the requirement of using atomic addition for correctness.
> > > > > >
> > > > > > Note that when a MT unsafe service is executed, a spinlock is
> > > > > > held, so the stats increments are protected. This fact is used
> > > > > > to avoid executing atomic add instructions when not required.
> > > > > >
> > > > > > This patch causes a 1.25x increase in cycle-cost for polling a
> > > > > > MT safe service when statistics are enabled. No change was
> > > > > > seen for MT unsafe services, or when statistics are disabled.
> > > > > >
> > > > > > Reported-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> > > > > > Suggested-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > > > > Suggested-by: Morten Brørup <mb@smartsharesystems.com>
> > > > > > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> > > > > >
> > > > > > ---
> > > > > > ---
> > > > > >  lib/eal/common/rte_service.c | 10 ++++++++--
> > > > > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
> > > > > > index ef31b1f63c..f045e74ef3 100644
> > > > > > --- a/lib/eal/common/rte_service.c
> > > > > > +++ b/lib/eal/common/rte_service.c
> > > > > > @@ -363,9 +363,15 @@ service_runner_do_callback(struct rte_service_spec_impl *s,
> > > > > >  		uint64_t start = rte_rdtsc();
> > > > > >  		s->spec.callback(userdata);
> > > > > >  		uint64_t end = rte_rdtsc();
> > > > > > -		s->cycles_spent += end - start;
> > > > > > +		uint64_t cycles = end - start;
> > > > > >  		cs->calls_per_service[service_idx]++;
> > > > > > -		s->calls++;
> > > > > > +		if (service_mt_safe(s)) {
> > > > > > +			__atomic_fetch_add(&s->cycles_spent, cycles, __ATOMIC_RELAXED);
> > > > > > +			__atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
> > > > > > +		} else {
> > > > > > +			s->cycles_spent += cycles;
> > > > > > +			s->calls++;
> > > > > This is still a problem from a reader perspective. It is
> > > > > possible that the writes could be split while a reader is
> > > > > reading the stats. These need to be atomic adds.
> > > >
> > > > I don't understand what you suggest can go wrong here, Honnappa.
> > > > If you are talking about 64 bit counters on 32 bit architectures,
> > > > then I understand the problem (and have many years of direct
> > > > experience with it myself). Otherwise, I hope you can elaborate or
> > > > direct me to educational material about the issue, considering
> > > > this a learning opportunity. :-)
> > > I am thinking of the case where the 64b write is split into two
> > > 32b (or more) write operations either by the compiler or the
> > > micro-architecture. If this were to happen, it causes race
> > > conditions with the reader.
> > >
> > > As far as I understand, the compiler does not provide any
> > > guarantees on generating non-tearing stores unless an atomic
> > > builtin/function is used.
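
A minimal sketch of the non-tearing access described above, assuming the GCC/Clang __atomic builtins and a single writer (for example, a writer that already holds the service spinlock). The struct and function names are hypothetical and are not the rte_service internals. The writer avoids an atomic read-modify-write but still uses an atomic store, so a concurrent reader using an atomic load should never observe a half-written 64b value. On 32 bit targets these builtins may be lowered to heavier sequences or to libatomic calls.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stats block; not the layout of rte_service_spec_impl. */
struct svc_stats {
	uint64_t calls;
	uint64_t cycles_spent;
};

/* Single-writer update: relaxed atomic load + relaxed atomic store.
 * This is not an atomic add; it is only safe when exactly one thread
 * writes at a time (assumption), but each store is indivisible. */
static inline void
stats_add(struct svc_stats *s, uint64_t cycles)
{
	uint64_t c = __atomic_load_n(&s->cycles_spent, __ATOMIC_RELAXED);
	uint64_t n = __atomic_load_n(&s->calls, __ATOMIC_RELAXED);

	__atomic_store_n(&s->cycles_spent, c + cycles, __ATOMIC_RELAXED);
	__atomic_store_n(&s->calls, n + 1, __ATOMIC_RELAXED);
}

/* Reader side, possibly on another thread: relaxed atomic load. */
static inline uint64_t
stats_read_calls(struct svc_stats *s)
{
	return __atomic_load_n(&s->calls, __ATOMIC_RELAXED);
}

static inline uint64_t
stats_read_cycles(struct svc_stats *s)
{
	return __atomic_load_n(&s->cycles_spent, __ATOMIC_RELAXED);
}
```

Note that the two counters are still updated independently, so a reader may see a new cycles_spent together with an old calls value; only tearing within one counter is addressed here.
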
> >
> > This seems like a generic problem for all 64b statistics counters
> > in DPDK, and any other C code using 64 bit counters. Being a generic
> > C problem, there is probably a generic solution to it.
> Browsing through the code, I see similar problems elsewhere.
> 
> >
> > Is any compiler going to do something that stupid (i.e. tearing a
> > store into multiple write operations) to a simple 64b counter on any
> > 64 bit architecture (assuming the counter is 64b aligned)? Otherwise,
> > we might only need to take special precautions for 32 bit
> > architectures.
> It is always a debate on who is stupid, compiler or programmer 😊

Compilers will never stop surprising me. Thankfully, they are not as unreliable and bug-ridden as they were 25 years ago. :-)

> 
> Though not the same case, you can look at this discussion where the
> compiler generated torn stores [1] when we all thought it had been
> generating a 64b store.
> 
> [1] http://inbox.dpdk.org/dev/d5d563ab-0411-3faf-39ec-4994f2bc9f6f@intel.com/

Good reference.

Technically, the code in that discussion sets a bunch of fields in the rte_lpm_tbl_entry structure (which happens to be 32b in total size), so it is not completely unreasonable for the compiler to store those fields individually. I wonder if using a union to access the rte_lpm_tbl_entry struct as a uint32_t (and ensuring 32b alignment) would have solved the problem, so that the __atomic_store() could be avoided?
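
To make the union idea concrete, here is a rough sketch. The struct below is a hypothetical stand-in loosely modelled on a 32b table entry; the field names and widths are assumptions, not the real rte_lpm_tbl_entry definition. The new entry is built in a local copy first and then published with one uint32_t assignment. A compiler will typically emit a single 32b store for that, but the C standard does not guarantee it for a plain assignment, so this only illustrates the question and is not a proven replacement for __atomic_store().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32b entry; bit-field layout is illustrative only. */
struct tbl_entry {
	uint32_t next_hop    : 24;
	uint32_t valid       : 1;
	uint32_t valid_group : 1;
	uint32_t depth       : 6;
};

/* Overlay the entry with a plain 32b word (also gives 32b alignment). */
union tbl_entry_u {
	struct tbl_entry e;
	uint32_t raw;
};

_Static_assert(sizeof(union tbl_entry_u) == sizeof(uint32_t),
	"entry must fit in exactly one 32b word");

/* Build the entry locally, then publish it as one 32b assignment
 * instead of one assignment per bit-field. Whether this becomes a
 * single store instruction is up to the compiler (assumption). */
static inline void
tbl_entry_write(union tbl_entry_u *dst, struct tbl_entry val)
{
	union tbl_entry_u tmp = { .e = val };

	dst->raw = tmp.raw;
}
```

Reading a union member other than the one last written is permitted type punning in C, which is what makes the overlay usable here.
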

> 
> >
> > > If we have to ensure the micro-architecture does not generate
> > > split writes, we need to be careful that future code additions do
> > > not change the alignment of the stats.
> >
> > Unless the structure containing the stats counters is packed, the
> > contained 64b counters will be 64b aligned (on 64 bit architecture).
> > So we should not worry about alignment, except perhaps on 32 bit
> > architectures.
> Agree, future code changes need to be aware of these issues and DPDK
> supports 32b architectures.
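
One way to keep future code changes honest about alignment is a compile-time check, sketched below. The struct names are hypothetical, and __attribute__((packed)) is a GCC/Clang extension. The explicit alignas(8) is there because some 32 bit ABIs only align uint64_t struct members to 4 bytes; without it, the static_assert would be expected to fail on such a target, which is exactly the case we would want to catch.

```c
#include <assert.h>   /* static_assert, assert */
#include <stdalign.h> /* alignas, alignof */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical stats block with the counters forced to 64b alignment. */
struct svc_counters {
	uint32_t flags;
	alignas(8) uint64_t calls;
	alignas(8) uint64_t cycles_spent;
};

/* Counterexample: packing removes the padding after 'flags', so the
 * 64b counter lands on a 4 byte boundary. */
struct svc_counters_packed {
	uint32_t flags;
	uint64_t calls;
} __attribute__((packed));

/* The build breaks if a later change moves a counter off 64b alignment. */
static_assert(offsetof(struct svc_counters, calls) % 8 == 0,
	"calls must be 64b aligned");
static_assert(offsetof(struct svc_counters, cycles_spent) % 8 == 0,
	"cycles_spent must be 64b aligned");
```
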


Thread overview: 43+ messages
2022-07-08 12:56 [PATCH 1/2] test/service: add perf measurements for with stats mode Harry van Haaren
2022-07-08 12:56 ` [PATCH 2/2] service: fix potential stats race-condition on MT services Harry van Haaren
2022-07-08 13:23   ` Morten Brørup
2022-07-08 13:44     ` Van Haaren, Harry
2022-07-08 14:14       ` Morten Brørup
2022-07-08 13:48   ` Mattias Rönnblom
2022-07-08 15:16   ` Honnappa Nagarahalli
2022-07-08 15:31     ` Van Haaren, Harry
2022-07-08 16:21       ` Bruce Richardson
2022-07-08 16:33         ` Honnappa Nagarahalli
2022-07-08 20:02         ` Mattias Rönnblom
2022-07-08 16:29     ` Morten Brørup
2022-07-08 16:45       ` Honnappa Nagarahalli
2022-07-08 17:22         ` Morten Brørup
2022-07-08 17:39           ` Honnappa Nagarahalli
2022-07-08 18:08             ` Morten Brørup [this message]
2022-09-06 16:13   ` [PATCH 1/6] service: reduce statistics overhead for parallel services Mattias Rönnblom
2022-09-06 16:13     ` [PATCH 2/6] service: introduce per-lcore cycles counter Mattias Rönnblom
2022-09-06 16:13     ` [PATCH 3/6] service: reduce average case service core overhead Mattias Rönnblom
2022-10-03 13:33       ` Van Haaren, Harry
2022-10-03 14:32         ` Mattias Rönnblom
2022-09-06 16:13     ` [PATCH 4/6] service: tweak cycle statistics semantics Mattias Rönnblom
2022-09-07  8:41       ` Morten Brørup
2022-10-03 13:45         ` Van Haaren, Harry
2022-09-06 16:13     ` [PATCH 5/6] event/sw: report idle when no work is performed Mattias Rönnblom
2022-09-06 16:13     ` [PATCH 6/6] service: provide links to functions in documentation Mattias Rönnblom
2022-10-03  8:06     ` [PATCH 1/6] service: reduce statistics overhead for parallel services David Marchand
2022-10-03  8:40       ` Mattias Rönnblom
2022-10-03  9:53         ` David Marchand
2022-10-03 11:37           ` Mattias Rönnblom
2022-10-03 13:03             ` Van Haaren, Harry
2022-10-03 13:33     ` Van Haaren, Harry
2022-10-03 14:37       ` Mattias Rönnblom
2022-10-05  9:16     ` [PATCH v2 0/6] Service cores performance and statistics improvements Mattias Rönnblom
2022-10-05  9:16       ` [PATCH v2 1/6] service: reduce statistics overhead for parallel services Mattias Rönnblom
2022-10-05  9:16       ` [PATCH v2 2/6] service: introduce per-lcore cycles counter Mattias Rönnblom
2022-10-05  9:16       ` [PATCH v2 3/6] service: reduce average case service core overhead Mattias Rönnblom
2022-10-05  9:16       ` [PATCH v2 4/6] service: tweak cycle statistics semantics Mattias Rönnblom
2022-10-05  9:16       ` [PATCH v2 5/6] event/sw: report idle when no work is performed Mattias Rönnblom
2022-10-05  9:16       ` [PATCH v2 6/6] service: provide links to functions in documentation Mattias Rönnblom
2022-10-05  9:49       ` [PATCH v2 0/6] Service cores performance and statistics improvements Morten Brørup
2022-10-05 10:14         ` Mattias Rönnblom
2022-10-05 13:39       ` David Marchand
