DPDK patches and discussions
From: Tal Shnaiderman <talshn@nvidia.com>
To: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>,
	"Dmitry Malloy (MESHCHANINOV)" <dmitrym@microsoft.com>,
	Narcisa Ana Maria Vasile <Narcisa.Vasile@microsoft.com>
Cc: Eilon Greenstein <eilong@nvidia.com>,
	Omar Cardona <ocardona@microsoft.com>,
	Rani Sharoni <ranish@nvidia.com>, Odi Assli <odia@nvidia.com>,
	Harini Ramakrishnan <Harini.Ramakrishnan@microsoft.com>,
	NBU-Contact-Thomas Monjalon <thomas@monjalon.net>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: [dpdk-dev] Windows DPDK real-time priority threads causing thread starvation
Date: Wed, 9 Dec 2020 14:15:30 +0000
Message-ID: <DM6PR12MB39455643CEE4FF76DCA6743BA4CC0@DM6PR12MB3945.namprd12.prod.outlook.com>


During our verification tests on Windows DPDK we've noticed that DPDK polling threads, which run in REALTIME_PRIORITY_CLASS, starve other OS threads that need to change affinity and run at a lower priority.

After an application has been running for a while, we see an OS thread wait for 2:30 minutes and then raise a bugcheck. Below is an example of such a flow:

1) A DPDK thread runs on core-0 in polling mode at real-time priority (24).
2) This thread blocks another thread, which is executing the system function NtSetSystemInformation
   (ExpUpdateTimerConfiguration), from switching to core-0 via KeSetSystemGroupAffinityThread,
   because the calling thread runs at priority 15.
3) NtSetSystemInformation has exclusively acquired a system-wide lock (ExpTimeRefreshLock), so
   it in turn blocks other threads (e.g. those calling NtQuerySystemInformation).
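
For reference, the two priority values in the flow above (24 and 15) can be placed in the base-priority table that Microsoft documents under "Scheduling Priorities". The sketch below reproduces that table in portable C. The enum and function names (toy_class, toy_level, toy_base_priority) are made up for illustration and are not part of any Windows or DPDK API. The extra real-time levels (-7..-3 and 3..6) are left out.

```c
/* Hypothetical names; only the numeric table comes from the Windows docs. */
enum toy_class {
    CLS_IDLE, CLS_BELOW_NORMAL, CLS_NORMAL,
    CLS_ABOVE_NORMAL, CLS_HIGH, CLS_REALTIME, CLS_COUNT
};

enum toy_level {
    LVL_IDLE, LVL_LOWEST, LVL_BELOW_NORMAL, LVL_NORMAL,
    LVL_ABOVE_NORMAL, LVL_HIGHEST, LVL_TIME_CRITICAL, LVL_COUNT
};

/* Base priority for each (priority class, thread priority level) pair. */
static const int toy_base[CLS_COUNT][LVL_COUNT] = {
    /* IDLE_PRIORITY_CLASS         */ {  1,  2,  3,  4,  5,  6, 15 },
    /* BELOW_NORMAL_PRIORITY_CLASS */ {  1,  4,  5,  6,  7,  8, 15 },
    /* NORMAL_PRIORITY_CLASS       */ {  1,  6,  7,  8,  9, 10, 15 },
    /* ABOVE_NORMAL_PRIORITY_CLASS */ {  1,  8,  9, 10, 11, 12, 15 },
    /* HIGH_PRIORITY_CLASS         */ {  1, 11, 12, 13, 14, 15, 15 },
    /* REALTIME_PRIORITY_CLASS     */ { 16, 22, 23, 24, 25, 26, 31 },
};

/* Look up the base priority for a class/level pair. */
static int toy_base_priority(enum toy_class cls, enum toy_level lvl)
{
    return toy_base[cls][lvl];
}

/* Highest base priority reachable without REALTIME_PRIORITY_CLASS. */
static int toy_max_non_realtime(void)
{
    int max = 0;
    for (int c = 0; c < CLS_REALTIME; c++)
        for (int l = 0; l < LVL_COUNT; l++)
            if (toy_base[c][l] > max)
                max = toy_base[c][l];
    return max;
}
```

In this table, REALTIME_PRIORITY_CLASS combined with THREAD_PRIORITY_NORMAL gives 24, matching step 1, and 15 is the ceiling for every non-real-time class, so a priority-15 thread can never preempt the polling thread on the same core.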

We've seen this behavior only when running on Windows 2019 VMs; perhaps on native machines the OS schedules such a flow differently?

Below is the usage explanation from the documentation of SetPriorityClass [1]:

Process that has the highest possible priority. The threads of the process preempt the threads of all other processes, including operating system processes performing important tasks. For example, a real-time process that executes for more than a very brief interval can cause disk caches not to flush or cause the mouse to be unresponsive. 

So I assume that running this kind of thread for a long period, as we do, can cause unstable behavior.

How do you think we can resolve this? Are there similar cases on Linux?
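
To illustrate why a long-running poller is a problem, here is a toy model of strict-priority scheduling on a single core. It is only a sketch under simplifying assumptions: it is not the Windows scheduler, it ignores quantums, priority boosting and migration to other cores, and every name in it (toy_thread, toy_pick, toy_run) is hypothetical.

```c
#include <stddef.h>

/* One schedulable entity in the toy model. */
struct toy_thread {
    int priority;       /* higher value wins */
    int ready;          /* non-zero while runnable */
    int blocks_after;   /* ticks until it sleeps; negative = busy-polls forever */
    unsigned long ran;  /* ticks of CPU time received */
};

/* Pick the highest-priority ready thread, or NULL if none is ready. */
static struct toy_thread *toy_pick(struct toy_thread *t, size_t n)
{
    struct toy_thread *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (t[i].ready && (best == NULL || t[i].priority > best->priority))
            best = &t[i];
    return best;
}

/* Simulate one core for `ticks` ticks with strict priority selection. */
static void toy_run(struct toy_thread *t, size_t n, unsigned long ticks)
{
    for (unsigned long k = 0; k < ticks; k++) {
        struct toy_thread *cur = toy_pick(t, n);
        if (cur == NULL)
            break;                      /* core is idle */
        cur->ran++;
        /* A thread with a positive budget goes to sleep once it is spent. */
        if (cur->blocks_after > 0 && --cur->blocks_after == 0)
            cur->ready = 0;
    }
}
```

With a priority-24 thread that never blocks (blocks_after < 0) and a ready priority-15 thread on the same core, the lower-priority thread receives zero ticks no matter how long the model runs. If the priority-24 thread instead sleeps after some ticks, the priority-15 thread gets the rest of the time.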

[1] - https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setpriorityclass




Thread overview: 5+ messages
2020-12-09 14:15 Tal Shnaiderman [this message]
2020-12-09 16:08 ` John Alexander
2020-12-09 16:08 ` Stephen Hemminger
2020-12-09 16:12   ` [dpdk-dev] [EXTERNAL] " Dmitry Malloy (MESHCHANINOV)
2020-12-16 14:53     ` Tal Shnaiderman
