patches for DPDK stable branches
From: Stephen Hemminger <stephen@networkplumber.org>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: anatoly.burakov@intel.com, stable@dpdk.org, dev@dpdk.org,
	david.marchand@redhat.com
Subject: Re: [PATCH] eal: fix data race in multi-process support
Date: Thu, 14 Apr 2022 13:28:34 -0700
Message-ID: <20220414132834.5c073dad@hermes.local>
In-Reply-To: <9400637.ag9G3TJQzC@thomas>

On Sun, 13 Feb 2022 12:39:59 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:

> 17/12/2021 19:29, Stephen Hemminger:
> > If DPDK is built with thread sanitizer it reports a race
> > in setting of multiprocess file descriptor. The fix is to
> > use atomic operations when updating mp_fd.  
> 
> Could you please explain the condition of the race in more detail?
> Is it between init and cleanup of the same file descriptor?
> How does making it atomic help here?
> 
> 
> > 
> > Simple example:
> > $ dpdk-testpmd -l 1-3 --no-huge
> > ...
> > EAL: Error - exiting with code: 1
> >   Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory
> > ==================
> > WARNING: ThreadSanitizer: data race (pid=83054)
> >   Write of size 4 at 0x55e3b7fce450 by main thread:
> >     #0 rte_mp_channel_cleanup <null> (dpdk-testpmd+0x160d79c)
> >     #1 rte_eal_cleanup <null> (dpdk-testpmd+0x1614fb5)
> >     #2 rte_exit <null> (dpdk-testpmd+0x15ec97a)
> >     #3 mbuf_pool_create.cold <null> (dpdk-testpmd+0x242e1a)
> >     #4 main <null> (dpdk-testpmd+0x5ab05d)
> > 
> >   Previous read of size 4 at 0x55e3b7fce450 by thread T2:
> >     #0 mp_handle <null> (dpdk-testpmd+0x160c979)
> >     #1 ctrl_thread_init <null> (dpdk-testpmd+0x15ff76e)
> > 
> >   As if synchronized via sleep:
> >     #0 nanosleep ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:362 (libtsan.so.0+0x5cd8e)
> >     #1 get_tsc_freq <null> (dpdk-testpmd+0x1622889)
> >     #2 set_tsc_freq <null> (dpdk-testpmd+0x15ffb9c)
> >     #3 rte_eal_timer_init <null> (dpdk-testpmd+0x1622a34)
> >     #4 rte_eal_init.cold <null> (dpdk-testpmd+0x26b314)
> >     #5 main <null> (dpdk-testpmd+0x5aab45)
> > 
> >   Location is global 'mp_fd' of size 4 at 0x55e3b7fce450 (dpdk-testpmd+0x0000027c7450)
> > 
> >   Thread T2 'rte_mp_handle' (tid=83057, running) created by main thread at:
> >     #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x58ba2)
> >     #1 rte_ctrl_thread_create <null> (dpdk-testpmd+0x15ff870)
> >     #2 rte_mp_channel_init.cold <null> (dpdk-testpmd+0x269986)
> >     #3 rte_eal_init <null> (dpdk-testpmd+0x1615b28)
> >     #4 main <null> (dpdk-testpmd+0x5aab45)  
> 
> 
> 

The issue is that two threads share a global variable without barriers or atomics.
The variable mp_fd is written by the main thread in rte_mp_channel_init() and
rte_mp_channel_cleanup() (see the report above), but read by the control thread
that handles multiprocess messages (mp_handle).

Sharing global data without a barrier or lock is a data race, which is undefined
behavior, and it can break on weakly ordered CPUs such as ARM.

I am somewhat surprised we have not hit this bug already: the compiler could decide
that mp_fd is invariant inside mp_handle(), skip re-testing it, and leave the thread
running forever.
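
To make the idea concrete, here is a minimal standalone sketch of the pattern,
not the actual patch. It uses C11 <stdatomic.h> and pthreads; handler() and
run_demo() are hypothetical stand-ins for mp_handle() and the EAL init/cleanup
path, and the release/acquire ordering is my assumption about what is sufficient.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Stand-in for the EAL global mp_fd; -1 means "channel closed". */
static atomic_int mp_fd = -1;

static void sleep_ms(long ms)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = ms * 1000000L };

	nanosleep(&ts, NULL);
}

/* Hypothetical stand-in for mp_handle(): loop until the fd is invalidated. */
static void *handler(void *arg)
{
	(void)arg;

	/* The atomic load is performed on every iteration, so the compiler
	 * cannot treat mp_fd as loop-invariant and hoist the test. */
	while (atomic_load_explicit(&mp_fd, memory_order_acquire) >= 0)
		sleep_ms(1);

	return NULL;
}

/* Returns 0 if the handler thread observed the close and exited. */
static int run_demo(void)
{
	pthread_t tid;

	/* "init": publish a valid (here: fake) descriptor value. */
	atomic_store_explicit(&mp_fd, 3, memory_order_release);
	if (pthread_create(&tid, NULL, handler, NULL) != 0)
		return -1;

	sleep_ms(10);

	/* "cleanup": invalidate the descriptor from the main thread. */
	atomic_store_explicit(&mp_fd, -1, memory_order_release);

	/* pthread_join() only returns once the handler loop has ended. */
	if (pthread_join(tid, NULL) != 0)
		return -1;

	assert(atomic_load_explicit(&mp_fd, memory_order_acquire) == -1);
	return 0;
}
```

Calling run_demo() from main() and building with -fsanitize=thread -pthread
should produce no data race report for mp_fd. Declaring the variable as a plain
int with plain loads and stores instead gives the same kind of report as quoted
above.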

This is a bug from the beginning of MP support in DPDK.



Thread overview: 6+ messages
2021-12-17 18:16 Stephen Hemminger
2021-12-17 18:29 ` Stephen Hemminger
2022-02-13 11:39   ` Thomas Monjalon
2022-04-14 20:28     ` Stephen Hemminger [this message]
2022-04-20 15:13   ` Burakov, Anatoly
     [not found]   ` <20220906164522.91776-1-stephen@networkplumber.org>
2022-10-09 23:53     ` [PATCH v2] " Thomas Monjalon
