DPDK patches and discussions
From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
To: David Marchand <david.marchand@redhat.com>
Cc: Aaron Conole <aconole@redhat.com>, dev <dev@dpdk.org>
Subject: Re: [dpdk-dev] [RFC] service: stop lcore threads before 'finalize'
Date: Tue, 10 Mar 2020 13:27:59 +0000
Message-ID: <MWHPR1101MB21571507B22DECEDD7D34841D7FF0@MWHPR1101MB2157.namprd11.prod.outlook.com>
In-Reply-To: <CAJFAV8zTOMpEBxB_jHFwpRJ-Hac1L4VO6t4srFTpdQM6P+vEcw@mail.gmail.com>

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, March 10, 2020 1:05 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: Aaron Conole <aconole@redhat.com>; dev <dev@dpdk.org>
> Subject: Re: [RFC] service: stop lcore threads before 'finalize'
> 
> On Fri, Feb 21, 2020 at 1:28 PM Van Haaren, Harry
> <harry.van.haaren@intel.com> wrote:
<snip>
> >
> > Hi David,
> >
> > I have been attempting to reproduce, unfortunately without success.
> >
> > Attempted your suggested meson test approach (thanks for suggesting!), but
> > I haven't had a segfault with that approach (yet, and it's done a lot of
> > iterations..)
> 
> I reproduced it on the first try, just now.
> Travis catches it every once in a while (look at the ovsrobot).
> 
> For the reproduction, this is on my laptop (core i7-8650U), baremetal,
> no fancy stuff.
> FWIW, the cores are ruled by the "powersave" governor.
> I can see the frequency oscillates between 3.5GHz and 3.7GHz while the
> max frequency is 4.2GHz.
> 
> Travis runs virtual machines with 2 cores, and there must be quite
> some overprovisioning on those servers.
> We can expect some cycles being stolen or at least something happening
> on the various cores.
> 
> 
> >
> > I've made the service-cores unit tests delay before exit, in an attempt
> > to have them access previously rte_free()-ed memory, no luck to reproduce.
> 
> Ok, let's forget about the segfault, what do you think of the
> backtrace I caught?
> A service lcore thread is still in the service loop.
> The master thread of the application is in the libc exiting code.
> 
> This is what I get in all crashes.

Hi,

I was actually coding up the above as a patch to send to ML for testing.
I've tried to reproduce - it doesn't happen here. I don't like sending
patches for fixes that I haven't been able to reliably reproduce and fix
locally - but in this case I don't see any other option.

I'll post the fix patch to the mailing list ASAP, your and Aaron's
help in testing would be greatly appreciated.
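
To make the ordering concrete, here is a minimal sketch of the idea behind
calling rte_service_lcore_reset_all() from rte_service_finalize(): ask the
polling thread to stop, wait for it to leave its loop, and only then free the
state it reads. This is not the DPDK code and not the patch itself. It uses
plain pthreads so it builds without DPDK, and every name in it
(sketch_lcore_state, sketch_init, sketch_finalize, and so on) is hypothetical
and only there for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-lcore service state; not DPDK's struct. */
struct sketch_lcore_state {
	atomic_int runstate;    /* 1 = keep polling, 0 = stop requested */
	atomic_ulong loops;     /* iterations of the service loop so far */
};

static struct sketch_lcore_state *sketch_state;
static pthread_t sketch_worker;

/* Stand-in for the loop a service lcore thread sits in. */
static void *sketch_service_loop(void *arg)
{
	struct sketch_lcore_state *s = arg;

	/* Every iteration reads the shared state, so it must stay valid. */
	while (atomic_load(&s->runstate))
		atomic_fetch_add(&s->loops, 1);
	return NULL;
}

/* Allocate the shared state and start one polling thread. */
int sketch_init(void)
{
	sketch_state = calloc(1, sizeof(*sketch_state));
	if (sketch_state == NULL)
		return -1;
	atomic_init(&sketch_state->runstate, 1);
	atomic_init(&sketch_state->loops, 0);
	if (pthread_create(&sketch_worker, NULL, sketch_service_loop,
			   sketch_state) != 0) {
		free(sketch_state);
		sketch_state = NULL;
		return -1;
	}
	return 0;
}

/* Stop the polling thread first, then free what it was reading. */
int sketch_finalize(void)
{
	assert(sketch_state != NULL);

	/* Step 1: request the stop (the "reset all lcores" part). */
	atomic_store(&sketch_state->runstate, 0);
	/* Step 2: wait until the thread has really left the loop. */
	if (pthread_join(sketch_worker, NULL) != 0)
		return -1;
	/* Step 3: only now is it safe to release the memory. */
	free(sketch_state);
	sketch_state = NULL;
	return 0;
}
```

Calling sketch_init() followed by sketch_finalize() from main() exercises the
sequence (build with -pthread). Freeing the state without steps 1 and 2 would
leave the thread polling freed memory, which matches the backtrace described
above: a service lcore still in its loop while the master thread is exiting.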


> > Thinking perhaps we need it on exit, I've also POCed a unit test that leaves
> > service cores active on exit on purpose, to try to have them poll after exit,
> > still no luck.
> >
> > Simplifying the problem, and using hello-world sample app with a rte_eal_cleanup()
> > call at the end also doesn't easily aggravate the problem.
> >
> > From code inspection, I agree there is an issue. It seems like a call to
> > rte_service_lcore_reset_all() from rte_service_finalize() is enough...
> > But without reproducing it is hard to have good confidence in a fix.
> 
> You promised a doc update on the services API.
> Thanks.

Yes, I heard there are some questions around what service cores are useful for.
Having reviewed the programmer guide and doxygen of the API, I'm not sure
what needs to change. Do you have specific questions you'd like to see
addressed here, or what do you feel needs to change?

https://doc.dpdk.org/guides/prog_guide/service_cores.html
http://doc.dpdk.org/api/rte__service_8h.html


Regards, -Harry
