DPDK patches and discussions
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [RFC 0/4] DPDK multiprocess rework
Date: Mon, 10 Jul 2017 10:18:12 +0000
Message-ID: <C6ECDF3AB251BE4894318F4E45123697822593B3@IRSMSX109.ger.corp.intel.com>
In-Reply-To: <1495211986-15177-1-git-send-email-anatoly.burakov@intel.com>

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Anatoly Burakov
> Sent: Friday, May 19, 2017 5:40 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [RFC 0/4] DPDK multiprocess rework
> 
> This is a proof-of-concept proposal for rework of how DPDK secondary
> processes work. While the code has some limitations, it works well enough to
> demonstrate the concept, and it can successfully run all existing multiprocess
> applications.
> 
> Current problems with DPDK secondary processes:
> * ASLR interferes with mappings
>   * "Fixed" by disabling ASLR, but not really a solution
> * Secondary process may map something at the address where we want to map
>   shared memory
>   * _Almost_ works with --base-virtaddr, but unreliable and tedious
> * Function pointers don't work (so e.g. hash library is broken)
> 
> Proposed solution:
> 
> Instead of running secondary process and mapping resources from primary
> process, the following is done:
> 0) compile all applications as position-independent executables, compile DPDK
>    as a shared library
> 1) fork() from primary process
> 2) dlopen() secondary process binary
> 3) use dlsym() to find entry point
> 4) run the application code while having all resources already mapped
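
(Illustration only, not code from the patch set: steps 1) to 4) could look
roughly like the C sketch below. The names run_secondary() and
secondary_entry_t and the exit code 127 are made up for this example, and it
assumes the binary at `path` is something dlopen() is willing to load. Older
glibc versions may need linking with -ldl.)

```c
#include <dlfcn.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical entry-point type: same shape as main(). */
typedef int (*secondary_entry_t)(int argc, char **argv);

/*
 * Fork, then load the binary at `path` in the child and call `symbol`.
 * Returns the child's exit status, 127 if loading or symbol lookup failed,
 * and -1 if fork() or waitpid() failed.
 */
static int
run_secondary(const char *path, const char *symbol, int argc, char **argv)
{
	pid_t pid = fork();			/* step 1 */
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: everything the parent had mapped is mapped here too. */
		void *handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);	/* step 2 */
		if (handle == NULL) {
			fprintf(stderr, "dlopen: %s\n", dlerror());
			_exit(127);
		}

		secondary_entry_t entry =
			(secondary_entry_t)dlsym(handle, symbol);	/* step 3 */
		if (entry == NULL) {
			fprintf(stderr, "dlsym: %s\n", dlerror());
			_exit(127);
		}

		_exit(entry(argc, argv) & 0xff);			/* step 4 */
	}

	/* Parent: wait for the child and report how it exited. */
	int status;
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would do something like run_secondary("./secondary_app", "main",
argc, argv), where the path and symbol name are again placeholders.
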
> 
> Benefits:
> * No more ASLR issues
> * No need for --base-virtaddr
> * Function pointers from primary process will work in secondaries
>   * Hash library (and any other library that uses function pointers internally)
>     will work correctly in multi-process scenario
>   * ethdev data can be moved to shared memory
>   * Primary process interrupt callbacks can be run by secondary process
> * More secure as all applications are compiled as position-independent binaries
>   (default on Fedora)
> 
> Potential drawbacks (that we could think of):
> * Kind of a hack
> * Puts some code restrictions on secondary processes
>   * Anything happening before EAL init will be run twice
> * Some use cases are no longer possible (attaching to a dead primary)
> * May impact binaries compiled to use a lot (kilobytes) of thread-local
>   storage[1]
> * Likely wouldn't work for static linking
> 
> There are also a number of issues that need to be resolved, but those are
> implementation details and are out of scope for this RFC.
> 
> What is explicitly out of scope:
> * Fixing interrupts in secondary processes
> * Fixing hotplug in secondary processes
> 
> These currently do not work in secondary processes, and this proposal does
> nothing to change that. They are better addressed by a dedicated
> EAL-internal IPC proposal.
> 
> 
> Technical nitty-gritty
> 
> Things quickly get confusing, so terminology:
> - Original Primary is the normal DPDK primary process
> - Forked Primary is a "clean slate" primary process, from which all secondary
>   processes will be forked (threads and fork don't mix well, so the fork is
>   done after all the hugepage and PCI data is mapped, but before all the
>   threads are spun up)
> - Original Secondary is a process that connects to Forked Primary, sends some
>   data and triggers a fork
> - Forked Secondary is the _actual_ secondary process (forked from Forked
>   Primary)
> 
> Timeline:
> - Original Primary starts
> - Forked Primary is forked from Original Primary
> - Original Secondary starts and connects to Forked Primary
> - Forked Primary forks into Forked Secondary
> - Original Secondary waits until Forked Secondary dies
> 
> During EAL init, Original Primary does a fork() to form a Forked Primary - a
> "clean slate" starting point for secondary processes. Forked Primary opens a
> local socket (a la VFIO) and starts listening for incoming connections.
> 
> Original Secondary process connects to Forked Primary, sends stdout/log
> fds, command-line parameters, etc. over the local socket, and sits around
> waiting for Forked Secondary to die, then exits (Original Secondary does _not_
> map anything or do any EAL init; it rte_exit()'s from inside rte_eal_init()).
> Forked Secondary process then executes main(), passing all command-line
> arguments, and execution of the secondary process resumes.
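
(Illustration only: passing an fd such as stdout over a local socket is
normally done with SCM_RIGHTS ancillary data. The helpers send_fd() and
recv_fd() below are hypothetical names, not functions from EAL or from the
patches, and they assume an already connected AF_UNIX socket. One byte of
ordinary data is sent along with the fd, as is common practice.)

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send `fd` over the connected AF_UNIX socket `sock`. Returns 0 or -1. */
static int
send_fd(int sock, int fd)
{
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from `sock`. Returns the new descriptor, or -1 on error. */
static int
recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	int fd = -1;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The receiving side gets a new descriptor number that refers to the same open
file, so after dup2()-ing it over its own stdout the forked process would
print to the sender's terminal.
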
> 
> Why pre-fork and not pthread like VFIO?
> 
> Pthreads and fork() don't mix well, because fork() duplicates only the calling
> thread (all other threads disappear in the child, leaving behind thread
> stacks, locks and possibly inconsistent state of both app data and system
> libraries). On the other hand, forking from a single-threaded context is safe.
> The current implementation doesn't _exactly_ fork from a single-threaded
> context, but this can be fixed later by rearranging EAL init.
> 
> [1]: https://www.redhat.com/archives/phil-list/2003-February/msg00077.html
> 

Ping


Thread overview: 7+ messages
2017-05-19 16:39 Anatoly Burakov
2017-05-19 16:39 ` [dpdk-dev] [RFC 1/4] vfio: refactor sockets into separate files Anatoly Burakov
2017-05-19 16:39 ` [dpdk-dev] [RFC 2/4] eal: enable experimental dlopen()-based secondary process support Anatoly Burakov
2017-05-19 17:39   ` Stephen Hemminger
2017-05-19 16:39 ` [dpdk-dev] [RFC 3/4] apps: enable new secondary process support in multiprocess apps Anatoly Burakov
2017-05-19 16:39 ` [dpdk-dev] [RFC 4/4] mk: default to compiling shared libraries Anatoly Burakov
2017-07-10 10:18 ` Burakov, Anatoly [this message]
