* [PATCH] eal/unix: allow creating thread with real-time priority
@ 2023-10-24 12:54 Thomas Monjalon
  2023-10-24 13:55 ` Morten Brørup
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-24 12:54 UTC (permalink / raw)
  To: dev
  Cc: David Marchand, stable, Anatoly Burakov, Stephen Hemminger,
	Narcisa Vasile, Tyler Retzlaff, Dmitry Kozlyuk,
	Konstantin Ananyev, Andrew Rybchenko, Morten Brørup
When adding an API for creating threads,
the real-time priority has been forbidden on Unix.
There is a known issue with ring behaviour,
but it should not be completely forbidden.
Fixes: ca04c78b6262 ("eal: get/set thread priority per thread identifier")
Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 app/test/test_threads.c                         | 9 ---------
 doc/guides/prog_guide/env_abstraction_layer.rst | 4 +++-
 lib/eal/include/rte_thread.h                    | 2 +-
 lib/eal/unix/rte_thread.c                       | 9 ---------
 4 files changed, 4 insertions(+), 20 deletions(-)
diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index 4ac3f2671a..4f7d10e074 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -97,21 +97,12 @@ test_thread_priority(void)
 		"Priority set mismatches priority get");
 
 	priority = RTE_THREAD_PRIORITY_REALTIME_CRITICAL;
-#ifndef RTE_EXEC_ENV_WINDOWS
-	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == ENOTSUP,
-		"Priority set to critical should fail");
-	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
-		"Failed to get thread priority");
-	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_NORMAL,
-		"Failed set to critical should have retained normal");
-#else
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
 		"Priority set to critical should succeed");
 	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
 		"Failed to get thread priority");
 	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL,
 		"Priority set mismatches priority get");
-#endif
 
 	priority = RTE_THREAD_PRIORITY_NORMAL;
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 6debf54efb..d1f7cae7cd 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -815,7 +815,9 @@ Known Issues
 
   4. It MAY be used by preemptible multi-producer and/or preemptible multi-consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE or SCHED_BATCH. User SHOULD be aware of the performance penalty before using it.
 
-  5. It MUST not be used by multi-producer/consumer pthreads, whose scheduling policies are SCHED_FIFO or SCHED_RR.
+  5. It MUST not be used by multi-producer/consumer pthreads
+     whose scheduling policies are ``SCHED_FIFO``
+     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).
 
   Alternatively, applications can use the lock-free stack mempool handler. When
   considering this handler, note that:
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index 8da9d4d3fb..5e624015e4 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -59,7 +59,7 @@ enum rte_thread_priority {
 	RTE_THREAD_PRIORITY_NORMAL            = 0,
 	/**< normal thread priority, the default */
 	RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
-	/**< highest thread priority allowed */
+	/**< highest thread priority, use with caution */
 };
 
 /**
diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
index 36a21ab2f9..0ff291c6f1 100644
--- a/lib/eal/unix/rte_thread.c
+++ b/lib/eal/unix/rte_thread.c
@@ -153,11 +153,6 @@ rte_thread_create(rte_thread_t *thread_id,
 			goto cleanup;
 		}
 
-		if (thread_attr->priority ==
-				RTE_THREAD_PRIORITY_REALTIME_CRITICAL) {
-			ret = ENOTSUP;
-			goto cleanup;
-		}
 		ret = thread_map_priority_to_os_value(thread_attr->priority,
 				¶m.sched_priority, &policy);
 		if (ret != 0)
@@ -275,10 +270,6 @@ rte_thread_set_priority(rte_thread_t thread_id,
 	int policy;
 	int ret;
 
-	/* Realtime priority can cause crashes on non-Windows platforms. */
-	if (priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL)
-		return ENOTSUP;
-
 	ret = thread_map_priority_to_os_value(priority, ¶m.sched_priority,
 		&policy);
 	if (ret != 0)
-- 
2.42.0
^ permalink raw reply	[flat|nested] 27+ messages in thread
* RE: [PATCH] eal/unix: allow creating thread with real-time priority
  2023-10-24 12:54 [PATCH] eal/unix: allow creating thread with real-time priority Thomas Monjalon
@ 2023-10-24 13:55 ` Morten Brørup
  2023-10-24 16:04   ` Stephen Hemminger
  2023-10-25 15:13 ` [PATCH v2] " Thomas Monjalon
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Morten Brørup @ 2023-10-24 13:55 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: David Marchand, stable, Anatoly Burakov, Stephen Hemminger,
	Narcisa Vasile, Tyler Retzlaff, Dmitry Kozlyuk,
	Konstantin Ananyev, Andrew Rybchenko
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Tuesday, 24 October 2023 14.54
> 
> When adding an API for creating threads,
> the real-time priority has been forbidden on Unix.
> 
> There is a known issue with ring behaviour,
> but it should not be completely forbidden.
> 
> Fixes: ca04c78b6262 ("eal: get/set thread priority per thread identifier")
> Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
> Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
[...]
> @@ -815,7 +815,9 @@ Known Issues
> 
>    4. It MAY be used by preemptible multi-producer and/or preemptible multi-
> consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE
> or SCHED_BATCH. User SHOULD be aware of the performance penalty before using
> it.
> 
> -  5. It MUST not be used by multi-producer/consumer pthreads, whose
> scheduling policies are SCHED_FIFO or SCHED_RR.
> +  5. It MUST not be used by multi-producer/consumer pthreads
> +     whose scheduling policies are ``SCHED_FIFO``
> +     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).
Do the RTS or HTS ring modes make any difference here?
Anyway, I agree that real-time priority should not be forbidden on Unix.
Acked-by: Morten Brørup <mb@smartsharesystems.com>
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH] eal/unix: allow creating thread with real-time priority
  2023-10-24 13:55 ` Morten Brørup
@ 2023-10-24 16:04   ` Stephen Hemminger
  2023-10-25 13:15     ` Thomas Monjalon
  0 siblings, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2023-10-24 16:04 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Thomas Monjalon, dev, David Marchand, stable, Anatoly Burakov,
	Narcisa Vasile, Tyler Retzlaff, Dmitry Kozlyuk,
	Konstantin Ananyev, Andrew Rybchenko
On Tue, 24 Oct 2023 15:55:13 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:
> > 
> >    4. It MAY be used by preemptible multi-producer and/or preemptible multi-
> > consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE
> > or SCHED_BATCH. User SHOULD be aware of the performance penalty before using
> > it.
> > 
> > -  5. It MUST not be used by multi-producer/consumer pthreads, whose
> > scheduling policies are SCHED_FIFO or SCHED_RR.
> > +  5. It MUST not be used by multi-producer/consumer pthreads
> > +     whose scheduling policies are ``SCHED_FIFO``
> > +     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).  
> 
> Do the RTS or HTS ring modes make any difference here?
> 
> Anyway, I agree that real-time priority should not be forbidden on Unix.
> 
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
Please add a big warning message in the rte_thread.c and the documentation
to describe the problem. Need to have the "you have been warned" action.
Use of RT priority is incompatible with 100% poll mode as is typically done
in DPDK applications. A real time thread has higher priority than other necessary
kernel threads on the same CPU. Therefore if the RT thread never sleeps, critical
system actions such as delayed writes, network packet processing and timer updates
will not happen which makes the system unstable.
Multiple DPDK users have learned this the hard way.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH] eal/unix: allow creating thread with real-time priority
  2023-10-24 16:04   ` Stephen Hemminger
@ 2023-10-25 13:15     ` Thomas Monjalon
  2023-10-25 13:34       ` Bruce Richardson
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-25 13:15 UTC (permalink / raw)
  To: Morten Brørup, Stephen Hemminger, Min Zhou
  Cc: stable, dev, David Marchand, stable, Anatoly Burakov,
	Narcisa Vasile, Tyler Retzlaff, Dmitry Kozlyuk,
	Konstantin Ananyev, Andrew Rybchenko
24/10/2023 18:04, Stephen Hemminger:
> On Tue, 24 Oct 2023 15:55:13 +0200
> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> > > 
> > >    4. It MAY be used by preemptible multi-producer and/or preemptible multi-
> > > consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE
> > > or SCHED_BATCH. User SHOULD be aware of the performance penalty before using
> > > it.
> > > 
> > > -  5. It MUST not be used by multi-producer/consumer pthreads, whose
> > > scheduling policies are SCHED_FIFO or SCHED_RR.
> > > +  5. It MUST not be used by multi-producer/consumer pthreads
> > > +     whose scheduling policies are ``SCHED_FIFO``
> > > +     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).  
> > 
> > Do the RTS or HTS ring modes make any difference here?
> > 
> > Anyway, I agree that real-time priority should not be forbidden on Unix.
> > 
> > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> 
> Please add a big warning message in the rte_thread.c and the documentation
> to describe the problem. Need to have the "you have been warned" action.
Yes I can add more warnings.
> Use of RT priority is incompatible with 100% poll mode as is typically done
> in DPDK applications. A real time thread has higher priority than other necessary
> kernel threads on the same CPU. Therefore if the RT thread never sleeps, critical
> system actions such as delayed writes, network packet processing and timer updates
> will not happen which makes the system unstable.
Yes, and it is shown by the test on loongarch:
DPDK:fast-tests / threads_autotest        TIMEOUT        80.01s
http://mails.dpdk.org/archives/test-report/2023-October/488760.html
I'll try to pass the test by adding a sleep in the test thread.
> Multiple DPDK users have learned this the hard way.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH] eal/unix: allow creating thread with real-time priority
  2023-10-25 13:15     ` Thomas Monjalon
@ 2023-10-25 13:34       ` Bruce Richardson
  2023-10-25 13:44         ` Thomas Monjalon
  0 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2023-10-25 13:34 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Morten Brørup, Stephen Hemminger, Min Zhou, stable, dev,
	David Marchand, Anatoly Burakov, Narcisa Vasile, Tyler Retzlaff,
	Dmitry Kozlyuk, Konstantin Ananyev, Andrew Rybchenko
On Wed, Oct 25, 2023 at 03:15:49PM +0200, Thomas Monjalon wrote:
> 24/10/2023 18:04, Stephen Hemminger:
> > On Tue, 24 Oct 2023 15:55:13 +0200
> > Morten Brørup <mb@smartsharesystems.com> wrote:
> > 
> > > > 
> > > >    4. It MAY be used by preemptible multi-producer and/or preemptible multi-
> > > > consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE
> > > > or SCHED_BATCH. User SHOULD be aware of the performance penalty before using
> > > > it.
> > > > 
> > > > -  5. It MUST not be used by multi-producer/consumer pthreads, whose
> > > > scheduling policies are SCHED_FIFO or SCHED_RR.
> > > > +  5. It MUST not be used by multi-producer/consumer pthreads
> > > > +     whose scheduling policies are ``SCHED_FIFO``
> > > > +     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).  
> > > 
> > > Do the RTS or HTS ring modes make any difference here?
> > > 
> > > Anyway, I agree that real-time priority should not be forbidden on Unix.
> > > 
> > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > 
> > Please add a big warning message in the rte_thread.c and the documentation
> > to describe the problem. Need to have the "you have been warned" action.
> 
> Yes I can add more warnings.
> 
> > Use of RT priority is incompatible with 100% poll mode as is typically done
> > in DPDK applications. A real time thread has higher priority than other necessary
> > kernel threads on the same CPU. Therefore if the RT thread never sleeps, critical
> > system actions such as delayed writes, network packet processing and timer updates
> > will not happen which makes the system unstable.
> 
> Yes, and it is shown by the test on loongarch:
> DPDK:fast-tests / threads_autotest        TIMEOUT        80.01s
> http://mails.dpdk.org/archives/test-report/2023-October/488760.html
> 
> I'll try to pass the test by adding a sleep in the test thread.
> 
"sched_yield()" rather than sleep perhaps? Might better convey the
intention of the call.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH] eal/unix: allow creating thread with real-time priority
  2023-10-25 13:34       ` Bruce Richardson
@ 2023-10-25 13:44         ` Thomas Monjalon
  2023-10-25 15:08           ` Stephen Hemminger
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-25 13:44 UTC (permalink / raw)
  To: Bruce Richardson, Tyler Retzlaff
  Cc: dev, Morten Brørup, Stephen Hemminger, Min Zhou, stable,
	dev, David Marchand, Anatoly Burakov, Narcisa Vasile,
	Tyler Retzlaff, Dmitry Kozlyuk, Konstantin Ananyev,
	Andrew Rybchenko
25/10/2023 15:34, Bruce Richardson:
> On Wed, Oct 25, 2023 at 03:15:49PM +0200, Thomas Monjalon wrote:
> > 24/10/2023 18:04, Stephen Hemminger:
> > > On Tue, 24 Oct 2023 15:55:13 +0200
> > > Morten Brørup <mb@smartsharesystems.com> wrote:
> > > 
> > > > > 
> > > > >    4. It MAY be used by preemptible multi-producer and/or preemptible multi-
> > > > > consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE
> > > > > or SCHED_BATCH. User SHOULD be aware of the performance penalty before using
> > > > > it.
> > > > > 
> > > > > -  5. It MUST not be used by multi-producer/consumer pthreads, whose
> > > > > scheduling policies are SCHED_FIFO or SCHED_RR.
> > > > > +  5. It MUST not be used by multi-producer/consumer pthreads
> > > > > +     whose scheduling policies are ``SCHED_FIFO``
> > > > > +     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).  
> > > > 
> > > > Do the RTS or HTS ring modes make any difference here?
> > > > 
> > > > Anyway, I agree that real-time priority should not be forbidden on Unix.
> > > > 
> > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > 
> > > Please add a big warning message in the rte_thread.c and the documentation
> > > to describe the problem. Need to have the "you have been warned" action.
> > 
> > Yes I can add more warnings.
> > 
> > > Use of RT priority is incompatible with 100% poll mode as is typically done
> > > in DPDK applications. A real time thread has higher priority than other necessary
> > > kernel threads on the same CPU. Therefore if the RT thread never sleeps, critical
> > > system actions such as delayed writes, network packet processing and timer updates
> > > will not happen which makes the system unstable.
> > 
> > Yes, and it is shown by the test on loongarch:
> > DPDK:fast-tests / threads_autotest        TIMEOUT        80.01s
> > http://mails.dpdk.org/archives/test-report/2023-October/488760.html
> > 
> > I'll try to pass the test by adding a sleep in the test thread.
> > 
> 
> "sched_yield()" rather than sleep perhaps? Might better convey the
> intention of the call.
Do we have sched_yield on Windows?
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH] eal/unix: allow creating thread with real-time priority
  2023-10-25 13:44         ` Thomas Monjalon
@ 2023-10-25 15:08           ` Stephen Hemminger
  2023-10-25 15:14             ` Bruce Richardson
  0 siblings, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2023-10-25 15:08 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Bruce Richardson, Tyler Retzlaff, dev, Morten Brørup,
	Min Zhou, stable, David Marchand, Anatoly Burakov,
	Narcisa Vasile, Dmitry Kozlyuk, Konstantin Ananyev,
	Andrew Rybchenko
On Wed, 25 Oct 2023 15:44:25 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 
> > > I'll try to pass the test by adding a sleep in the test thread.
> > >   
> > 
> > "sched_yield()" rather than sleep perhaps? Might better convey the
> > intention of the call.  
> 
> Do we have sched_yield on Windows?
Windows has an equivalent but sched_yield() won't work here.
Since the DPDK thread is still higher priority than the kernel thread,
the scheduler will reschedule the DPDK thread. You need to sleep
to let kthread run.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* [PATCH v2] eal/unix: allow creating thread with real-time priority
  2023-10-24 12:54 [PATCH] eal/unix: allow creating thread with real-time priority Thomas Monjalon
  2023-10-24 13:55 ` Morten Brørup
@ 2023-10-25 15:13 ` Thomas Monjalon
  2023-10-25 15:37   ` Stephen Hemminger
       [not found] ` <20231025163352.1076755-1-thomas@monjalon.net>
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-25 15:13 UTC (permalink / raw)
  To: dev
  Cc: David Marchand, stable, Morten Brørup, Anatoly Burakov,
	Dmitry Kozlyuk, Narcisa Ana Maria Vasile, Dmitry Malloy,
	Pallavi Kadam, Stephen Hemminger, Tyler Retzlaff,
	Andrew Rybchenko, Konstantin Ananyev
When adding an API for creating threads,
the real-time priority has been forbidden on Unix.
There is a known issue with ring behaviour,
but it should not be completely forbidden.
Real-time thread can block some kernel threads on the same core,
making the system unstable.
That's why a pause is added in the test thread.
This pause is a new API function rte_thread_yield(),
compatible with both Unix and Windows.
Fixes: ca04c78b6262 ("eal: get/set thread priority per thread identifier")
Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 app/test/test_threads.c                       | 11 +---------
 .../prog_guide/env_abstraction_layer.rst      |  4 +++-
 lib/eal/include/rte_thread.h                  | 13 ++++++++++--
 lib/eal/unix/rte_thread.c                     | 21 +++++++++++--------
 lib/eal/version.map                           |  3 +++
 lib/eal/windows/rte_thread.c                  |  6 ++++++
 6 files changed, 36 insertions(+), 22 deletions(-)
diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index 4ac3f2671a..9a449ba9c5 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -22,7 +22,7 @@ thread_main(void *arg)
 	__atomic_store_n(&thread_id_ready, 1, __ATOMIC_RELEASE);
 
 	while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 1)
-		;
+		rte_thread_yield(); /* required in case of real-time priority */
 
 	return 0;
 }
@@ -97,21 +97,12 @@ test_thread_priority(void)
 		"Priority set mismatches priority get");
 
 	priority = RTE_THREAD_PRIORITY_REALTIME_CRITICAL;
-#ifndef RTE_EXEC_ENV_WINDOWS
-	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == ENOTSUP,
-		"Priority set to critical should fail");
-	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
-		"Failed to get thread priority");
-	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_NORMAL,
-		"Failed set to critical should have retained normal");
-#else
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
 		"Priority set to critical should succeed");
 	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
 		"Failed to get thread priority");
 	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL,
 		"Priority set mismatches priority get");
-#endif
 
 	priority = RTE_THREAD_PRIORITY_NORMAL;
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 6debf54efb..d1f7cae7cd 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -815,7 +815,9 @@ Known Issues
 
   4. It MAY be used by preemptible multi-producer and/or preemptible multi-consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE or SCHED_BATCH. User SHOULD be aware of the performance penalty before using it.
 
-  5. It MUST not be used by multi-producer/consumer pthreads, whose scheduling policies are SCHED_FIFO or SCHED_RR.
+  5. It MUST not be used by multi-producer/consumer pthreads
+     whose scheduling policies are ``SCHED_FIFO``
+     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).
 
   Alternatively, applications can use the lock-free stack mempool handler. When
   considering this handler, note that:
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index 8da9d4d3fb..eeccc40532 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -56,10 +56,11 @@ typedef uint32_t (*rte_thread_func) (void *arg);
  * Thread priority values.
  */
 enum rte_thread_priority {
+	/** Normal thread priority, the default. */
 	RTE_THREAD_PRIORITY_NORMAL            = 0,
-	/**< normal thread priority, the default */
+	/** Highest thread priority, use with caution.
+	 *  WARNING: System may be unstable because of a real-time busy loop. */
 	RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
-	/**< highest thread priority allowed */
 };
 
 /**
@@ -183,6 +184,14 @@ int rte_thread_join(rte_thread_t thread_id, uint32_t *value_ptr);
  */
 int rte_thread_detach(rte_thread_t thread_id);
 
+/**
+ * Allow another thread to run on the same CPU core.
+ *
+ * Especially useful in real-time thread priority.
+ * @see RTE_THREAD_PRIORITY_REALTIME_CRITICAL
+ */
+void rte_thread_yield(void);
+
 /**
  * Get the id of the calling thread.
  *
diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
index 36a21ab2f9..399acf2fa0 100644
--- a/lib/eal/unix/rte_thread.c
+++ b/lib/eal/unix/rte_thread.c
@@ -5,6 +5,7 @@
 
 #include <errno.h>
 #include <pthread.h>
+#include <sched.h>
 #include <stdbool.h>
 #include <stdlib.h>
 #include <string.h>
@@ -49,6 +50,11 @@ thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int *os_pri,
 			sched_get_priority_max(SCHED_OTHER)) / 2;
 		break;
 	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
+		/*
+		 * WARNING: Real-time busy loop takes priority on kernel threads,
+		 *          making the system unstable.
+		 *          There is also a known issue when using rte_ring.
+		 */
 		*pol = SCHED_RR;
 		*os_pri = sched_get_priority_max(SCHED_RR);
 		break;
@@ -153,11 +159,6 @@ rte_thread_create(rte_thread_t *thread_id,
 			goto cleanup;
 		}
 
-		if (thread_attr->priority ==
-				RTE_THREAD_PRIORITY_REALTIME_CRITICAL) {
-			ret = ENOTSUP;
-			goto cleanup;
-		}
 		ret = thread_map_priority_to_os_value(thread_attr->priority,
 				¶m.sched_priority, &policy);
 		if (ret != 0)
@@ -227,6 +228,12 @@ rte_thread_detach(rte_thread_t thread_id)
 	return pthread_detach((pthread_t)thread_id.opaque_id);
 }
 
+void
+rte_thread_yield(void)
+{
+	sched_yield();
+}
+
 int
 rte_thread_equal(rte_thread_t t1, rte_thread_t t2)
 {
@@ -275,10 +282,6 @@ rte_thread_set_priority(rte_thread_t thread_id,
 	int policy;
 	int ret;
 
-	/* Realtime priority can cause crashes on non-Windows platforms. */
-	if (priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL)
-		return ENOTSUP;
-
 	ret = thread_map_priority_to_os_value(priority, ¶m.sched_priority,
 		&policy);
 	if (ret != 0)
diff --git a/lib/eal/version.map b/lib/eal/version.map
index e00a844805..0b4a503c5f 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -413,6 +413,9 @@ EXPERIMENTAL {
 	# added in 23.07
 	rte_memzone_max_get;
 	rte_memzone_max_set;
+
+	# added in 23.11
+	rte_thread_yield;
 };
 
 INTERNAL {
diff --git a/lib/eal/windows/rte_thread.c b/lib/eal/windows/rte_thread.c
index acf648456c..b0373b1a55 100644
--- a/lib/eal/windows/rte_thread.c
+++ b/lib/eal/windows/rte_thread.c
@@ -304,6 +304,12 @@ rte_thread_detach(rte_thread_t thread_id)
 	return 0;
 }
 
+void
+rte_thread_yield(void)
+{
+	Sleep(0);
+}
+
 int
 rte_thread_equal(rte_thread_t t1, rte_thread_t t2)
 {
-- 
2.42.0
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH] eal/unix: allow creating thread with real-time priority
  2023-10-25 15:08           ` Stephen Hemminger
@ 2023-10-25 15:14             ` Bruce Richardson
  2023-10-25 15:18               ` Thomas Monjalon
  0 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2023-10-25 15:14 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Thomas Monjalon, Tyler Retzlaff, dev, Morten Brørup,
	Min Zhou, stable, David Marchand, Anatoly Burakov,
	Narcisa Vasile, Dmitry Kozlyuk, Konstantin Ananyev,
	Andrew Rybchenko
On Wed, Oct 25, 2023 at 08:08:52AM -0700, Stephen Hemminger wrote:
> On Wed, 25 Oct 2023 15:44:25 +0200
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > > > 
> > > > I'll try to pass the test by adding a sleep in the test thread.
> > > >   
> > > 
> > > "sched_yield()" rather than sleep perhaps? Might better convey the
> > > intention of the call.  
> > 
> > Do we have sched_yield on Windows?
> 
> Windows has an equivalent but sched_yield() won't work here.
> Since the DPDK thread is still higher priority than the kernel thread,
> the scheduler will reschedule the DPDK thread. You need to sleep
> to let kthread run.
Interesting. Thanks for clarifying the situation.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH] eal/unix: allow creating thread with real-time priority
  2023-10-25 15:14             ` Bruce Richardson
@ 2023-10-25 15:18               ` Thomas Monjalon
  2023-10-25 15:32                 ` Thomas Monjalon
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-25 15:18 UTC (permalink / raw)
  To: Stephen Hemminger, Bruce Richardson
  Cc: Tyler Retzlaff, dev, Morten Brørup, Min Zhou, stable,
	David Marchand, Anatoly Burakov, Narcisa Vasile, Dmitry Kozlyuk,
	Konstantin Ananyev, Andrew Rybchenko
25/10/2023 17:14, Bruce Richardson:
> On Wed, Oct 25, 2023 at 08:08:52AM -0700, Stephen Hemminger wrote:
> > On Wed, 25 Oct 2023 15:44:25 +0200
> > Thomas Monjalon <thomas@monjalon.net> wrote:
> > 
> > > > > 
> > > > > I'll try to pass the test by adding a sleep in the test thread.
> > > > >   
> > > > 
> > > > "sched_yield()" rather than sleep perhaps? Might better convey the
> > > > intention of the call.  
> > > 
> > > Do we have sched_yield on Windows?
> > 
> > Windows has an equivalent but sched_yield() won't work here.
> > Since the DPDK thread is still higher priority than the kernel thread,
> > the scheduler will reschedule the DPDK thread. You need to sleep
> > to let kthread run.
> 
> Interesting. Thanks for clarifying the situation.
Indeed interesting.
I've just sent a v2 before reading this.
So I should try a v3 with a sleep.
But then I need to find a better name than rte_thread_yield.
Ideas?
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH] eal/unix: allow creating thread with real-time priority
  2023-10-25 15:18               ` Thomas Monjalon
@ 2023-10-25 15:32                 ` Thomas Monjalon
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-25 15:32 UTC (permalink / raw)
  To: Stephen Hemminger, Bruce Richardson, Tyler Retzlaff
  Cc: dev, dev, Morten Brørup, Min Zhou, stable, David Marchand,
	Anatoly Burakov, Narcisa Vasile, Dmitry Kozlyuk,
	Konstantin Ananyev, Andrew Rybchenko
25/10/2023 17:18, Thomas Monjalon:
> 25/10/2023 17:14, Bruce Richardson:
> > On Wed, Oct 25, 2023 at 08:08:52AM -0700, Stephen Hemminger wrote:
> > > On Wed, 25 Oct 2023 15:44:25 +0200
> > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 
> > > > > > 
> > > > > > I'll try to pass the test by adding a sleep in the test thread.
> > > > > >   
> > > > > 
> > > > > "sched_yield()" rather than sleep perhaps? Might better convey the
> > > > > intention of the call.  
> > > > 
> > > > Do we have sched_yield on Windows?
> > > 
> > > Windows has an equivalent but sched_yield() won't work here.
> > > Since the DPDK thread is still higher priority than the kernel thread,
> > > the scheduler will reschedule the DPDK thread. You need to sleep
> > > to let kthread run.
> > 
> > Interesting. Thanks for clarifying the situation.
> 
> Indeed interesting.
> I've just sent a v2 before reading this.
> 
> So I should try a v3 with a sleep.
> But then I need to find a better name than rte_thread_yield.
> Ideas?
I will go with rte_thread_yield_realtime().
Any sleep will suffice on Linux? What about a nanosleep?
I suppose Sleep(0) is OK on Windows?
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH v2] eal/unix: allow creating thread with real-time priority
  2023-10-25 15:13 ` [PATCH v2] " Thomas Monjalon
@ 2023-10-25 15:37   ` Stephen Hemminger
  2023-10-25 16:46     ` Thomas Monjalon
  0 siblings, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2023-10-25 15:37 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, David Marchand, stable, Morten Brørup, Anatoly Burakov,
	Dmitry Kozlyuk, Narcisa Ana Maria Vasile, Dmitry Malloy,
	Pallavi Kadam, Tyler Retzlaff, Andrew Rybchenko,
	Konstantin Ananyev
On Wed, 25 Oct 2023 17:13:14 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:
>  	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
> +		/*
> +		 * WARNING: Real-time busy loop takes priority on kernel threads,
> +		 *          making the system unstable.
> +		 *          There is also a known issue when using rte_ring.
> +		 */
I was thinking something like:
	static bool warned;
	if (!warned) {
		RTE_LOG(NOTICE, EAL, "Real time priority is unstable when thread is polling without sleep\n");
		warned = true;
	}
^ permalink raw reply	[flat|nested] 27+ messages in thread
* [PATCH v3 2/2] eal/unix: allow creating thread with real-time priority
       [not found] ` <20231025163352.1076755-1-thomas@monjalon.net>
@ 2023-10-25 16:31   ` Thomas Monjalon
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-25 16:31 UTC (permalink / raw)
  To: dev
  Cc: David Marchand, stable, Morten Brørup, Anatoly Burakov,
	Stephen Hemminger, Tyler Retzlaff, Narcisa Vasile,
	Dmitry Kozlyuk, Andrew Rybchenko, Konstantin Ananyev
When adding an API for creating threads,
the real-time priority has been forbidden on Unix.
There is a known issue with ring behaviour,
but it should not be completely forbidden.
Real-time thread can block some kernel threads on the same core,
making the system unstable.
That's why a pause is added in the test thread.
Fixes: ca04c78b6262 ("eal: get/set thread priority per thread identifier")
Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 app/test/test_threads.c                         | 11 +----------
 doc/guides/prog_guide/env_abstraction_layer.rst |  4 +++-
 lib/eal/include/rte_thread.h                    |  7 +++++--
 lib/eal/unix/rte_thread.c                       | 14 +++++---------
 4 files changed, 14 insertions(+), 22 deletions(-)
diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index 4ac3f2671a..c14d39fc83 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -22,7 +22,7 @@ thread_main(void *arg)
 	__atomic_store_n(&thread_id_ready, 1, __ATOMIC_RELEASE);
 
 	while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 1)
-		;
+		rte_thread_yield_realtime(); /* required for RT priority */
 
 	return 0;
 }
@@ -97,21 +97,12 @@ test_thread_priority(void)
 		"Priority set mismatches priority get");
 
 	priority = RTE_THREAD_PRIORITY_REALTIME_CRITICAL;
-#ifndef RTE_EXEC_ENV_WINDOWS
-	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == ENOTSUP,
-		"Priority set to critical should fail");
-	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
-		"Failed to get thread priority");
-	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_NORMAL,
-		"Failed set to critical should have retained normal");
-#else
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
 		"Priority set to critical should succeed");
 	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
 		"Failed to get thread priority");
 	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL,
 		"Priority set mismatches priority get");
-#endif
 
 	priority = RTE_THREAD_PRIORITY_NORMAL;
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 6debf54efb..d1f7cae7cd 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -815,7 +815,9 @@ Known Issues
 
   4. It MAY be used by preemptible multi-producer and/or preemptible multi-consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE or SCHED_BATCH. User SHOULD be aware of the performance penalty before using it.
 
-  5. It MUST not be used by multi-producer/consumer pthreads, whose scheduling policies are SCHED_FIFO or SCHED_RR.
+  5. It MUST not be used by multi-producer/consumer pthreads
+     whose scheduling policies are ``SCHED_FIFO``
+     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).
 
   Alternatively, applications can use the lock-free stack mempool handler. When
   considering this handler, note that:
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index 139cafac96..1952a10155 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -56,10 +56,13 @@ typedef uint32_t (*rte_thread_func) (void *arg);
  * Thread priority values.
  */
 enum rte_thread_priority {
+	/** Normal thread priority, the default. */
 	RTE_THREAD_PRIORITY_NORMAL            = 0,
-	/**< normal thread priority, the default */
+	/**
+	 * Highest thread priority, use with caution.
+	 * WARNING: System may be unstable because of a real-time busy loop.
+	 */
 	RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
-	/**< highest thread priority allowed */
 };
 
 /**
diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
index 92b4e53adb..87ddf25f1c 100644
--- a/lib/eal/unix/rte_thread.c
+++ b/lib/eal/unix/rte_thread.c
@@ -51,6 +51,11 @@ thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int *os_pri,
 			sched_get_priority_max(SCHED_OTHER)) / 2;
 		break;
 	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
+		/*
+		 * WARNING: Real-time busy loop takes priority on kernel threads,
+		 *          making the system unstable.
+		 *          There is also a known issue when using rte_ring.
+		 */
 		*pol = SCHED_RR;
 		*os_pri = sched_get_priority_max(SCHED_RR);
 		break;
@@ -155,11 +160,6 @@ rte_thread_create(rte_thread_t *thread_id,
 			goto cleanup;
 		}
 
-		if (thread_attr->priority ==
-				RTE_THREAD_PRIORITY_REALTIME_CRITICAL) {
-			ret = ENOTSUP;
-			goto cleanup;
-		}
 		ret = thread_map_priority_to_os_value(thread_attr->priority,
 				¶m.sched_priority, &policy);
 		if (ret != 0)
@@ -291,10 +291,6 @@ rte_thread_set_priority(rte_thread_t thread_id,
 	int policy;
 	int ret;
 
-	/* Realtime priority can cause crashes on non-Windows platforms. */
-	if (priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL)
-		return ENOTSUP;
-
 	ret = thread_map_priority_to_os_value(priority, ¶m.sched_priority,
 		&policy);
 	if (ret != 0)
-- 
2.42.0
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH v2] eal/unix: allow creating thread with real-time priority
  2023-10-25 15:37   ` Stephen Hemminger
@ 2023-10-25 16:46     ` Thomas Monjalon
  2023-10-25 17:54       ` Morten Brørup
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-25 16:46 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, David Marchand, stable, Morten Brørup, Anatoly Burakov,
	Dmitry Kozlyuk, Narcisa Ana Maria Vasile, Dmitry Malloy,
	Pallavi Kadam, Tyler Retzlaff, Andrew Rybchenko,
	Konstantin Ananyev
25/10/2023 17:37, Stephen Hemminger:
> On Wed, 25 Oct 2023 17:13:14 +0200
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> >  	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
> > +		/*
> > +		 * WARNING: Real-time busy loop takes priority on kernel threads,
> > +		 *          making the system unstable.
> > +		 *          There is also a known issue when using rte_ring.
> > +		 */
> 
> I was thinking something like:
> 
> 	static bool warned;
> 	if (!warned) {
> 		RTE_LOG(NOTICE, EAL, "Real time priority is unstable when thread is polling without sleep\n");
> 		warned = true;
> 	}
I'm not sure about bothering users.
They can fear something is wrong even if the developer took care of it.
I think doc warnings for developers are more appropriate.
I've added notes in the API.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* RE: [PATCH v2] eal/unix: allow creating thread with real-time priority
  2023-10-25 16:46     ` Thomas Monjalon
@ 2023-10-25 17:54       ` Morten Brørup
  2023-10-25 21:33         ` Stephen Hemminger
  2023-10-26  0:00         ` Stephen Hemminger
  0 siblings, 2 replies; 27+ messages in thread
From: Morten Brørup @ 2023-10-25 17:54 UTC (permalink / raw)
  To: Thomas Monjalon, Stephen Hemminger
  Cc: dev, David Marchand, stable, Anatoly Burakov, Dmitry Kozlyuk,
	Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam,
	Tyler Retzlaff, Andrew Rybchenko, Konstantin Ananyev
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, 25 October 2023 18.46
> 
> 25/10/2023 17:37, Stephen Hemminger:
> > On Wed, 25 Oct 2023 17:13:14 +0200
> > Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > >  	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
> > > +		/*
> > > +		 * WARNING: Real-time busy loop takes priority on kernel
> threads,
> > > +		 *          making the system unstable.
> > > +		 *          There is also a known issue when using
> rte_ring.
> > > +		 */
> >
> > I was thinking something like:
> >
> > 	static bool warned;
> > 	if (!warned) {
> > 		RTE_LOG(NOTICE, EAL, "Real time priority is unstable when
> thread is polling without sleep\n");
> > 		warned = true;
> > 	}
> 
> I'm not sure about bothering users.
> They can fear something is wrong even if the developer took care of it.
> I think doc warnings for developers are more appropriate.
> I've added notes in the API.
I agree with Thomas on this.
If you want the log message, please degrade it to INFO or DEBUG level. It is only relevant when chasing problems, not for normal production - and thus NOTICE is too high.
Someone might build a kernel with options to keep non-dataplane threads off some dedicated CPU cores, so they can be used for guaranteed low-latency dataplane threads. We do. We don't use real-time priority, though.
For reference, we did some experiments (using this custom built kernel) with a dedicated thread doing nothing but a loop calling rte_rdtsc_precise() and registering the delta. Although the overwhelming majority is ca. CPU 80 cycles, there are some big outliers at ca. 9,000 CPU cycles. (Order of magnitude: ca. 45 of these big outliers per minute.) Apparently some kernel threads steal some cycles from this thread, regardless of our customizations. We haven't bothered analyzing and optimizing it further.
I think our experiment supports the need to allow kernel threads to run, e.g. by calling sleep() or similar, when an EAL thread has real-time priority.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH v2] eal/unix: allow creating thread with real-time priority
  2023-10-25 17:54       ` Morten Brørup
@ 2023-10-25 21:33         ` Stephen Hemminger
  2023-10-26  7:33           ` Morten Brørup
  2023-10-26  0:00         ` Stephen Hemminger
  1 sibling, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2023-10-25 21:33 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Thomas Monjalon, dev, David Marchand, stable, Anatoly Burakov,
	Dmitry Kozlyuk, Narcisa Ana Maria Vasile, Dmitry Malloy,
	Pallavi Kadam, Tyler Retzlaff, Andrew Rybchenko,
	Konstantin Ananyev
On Wed, 25 Oct 2023 19:54:06 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:
> I agree with Thomas on this.
> 
> If you want the log message, please degrade it to INFO or DEBUG level. It is only relevant when chasing problems, not for normal production - and thus NOTICE is too high.
I don't want the message to be hidden.
If we get any bug reports want to be able to say "read the log, don't do that".
> Someone might build a kernel with options to keep non-dataplane threads off some dedicated CPU cores, so they can be used for guaranteed low-latency dataplane threads. We do. We don't use real-time priority, though.
This is really, hard to do. Isolated CPU's are not isolated from interrupts and other sources which end up scheduling work as kernel threads. Plus there is the behavior where kernel decides to turn a soft irq into a kernel thread, then starve
itself. Under starvation, disk corruption is likely if interrupts never get processed :-(
> For reference, we did some experiments (using this custom built kernel) with a dedicated thread doing nothing but a loop calling rte_rdtsc_precise() and registering the delta. Although the overwhelming majority is ca. CPU 80 cycles, there are some big outliers at ca. 9,000 CPU cycles. (Order of magnitude: ca. 45 of these big outliers per minute.) Apparently some kernel threads steal some cycles from this thread, regardless of our customizations. We haven't bothered analyzing and optimizing it further.
Was this on isolated CPU?
Did you check that that CPU was excluded from the smp_affinty mask on all devices?
Did you enable the kernel feature to avoid clock ticks if CPU is dedicated?
Same thing for RCU, need to adjust parameters?
Also, on many systems there can be SMI BIOS hidden execution that will cause big outliers.
Lastly never try and use CPU 0. The kernel uses CPU 0 as catch all in lots of places.
> I think our experiment supports the need to allow kernel threads to run, e.g. by calling sleep() or similar, when an EAL thread has real-time priority.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH v2] eal/unix: allow creating thread with real-time priority
  2023-10-25 17:54       ` Morten Brørup
  2023-10-25 21:33         ` Stephen Hemminger
@ 2023-10-26  0:00         ` Stephen Hemminger
  1 sibling, 0 replies; 27+ messages in thread
From: Stephen Hemminger @ 2023-10-26  0:00 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Thomas Monjalon, dev, David Marchand, stable, Anatoly Burakov,
	Dmitry Kozlyuk, Narcisa Ana Maria Vasile, Dmitry Malloy,
	Pallavi Kadam, Tyler Retzlaff, Andrew Rybchenko,
	Konstantin Ananyev
On Wed, 25 Oct 2023 19:54:06 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:
> Someone might build a kernel with options to keep non-dataplane threads off some dedicated CPU cores, so they can be used for guaranteed low-latency dataplane threads. We do. We don't use real-time priority, though.
> 
> For reference, we did some experiments (using this custom built kernel) with a dedicated thread doing nothing but a loop calling rte_rdtsc_precise() and registering the delta. Although the overwhelming majority is ca. CPU 80 cycles, there are some big outliers at ca. 9,000 CPU cycles. (Order of magnitude: ca. 45 of these big outliers per minute.) Apparently some kernel threads steal some cycles from this thread, regardless of our customizations. We haven't bothered analyzing and optimizing it further.
> 
> I think our experiment supports the need to allow kernel threads to run, e.g. by calling sleep() or similar, when an EAL thread has real-time priority.
First. We need to dispel the myth that on Linux real time is faster.
It isn't, you can ask the RT kernel developers if you need more support.
The purpose of RT is to have user process respond in a deterministic fashion
to a kernel event (ie. usually HW interrupt or IPC). In most cases, this is
not how DPDK applications are written. It is possible to use hardware interrupts
in DPDK, but getting it right is hard, and the latency of HW to kernel to DPDK
in userspace is a long time. RT will make it more consistent, but it won't
remove the overhead; i.e less long tail with RT, but average latency will
still be too long for most network applications.
Users say "but my polling thread is getting latency", then you need to make sure
your application is running on dedicated cores and the system is configured to
not use those cores for HW events.  Doing RT won't fix that.
Then some user say "but I want to run multiple polling threads on a single core".
That plain won't work no matter what the priority.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* RE: [PATCH v2] eal/unix: allow creating thread with real-time priority
  2023-10-25 21:33         ` Stephen Hemminger
@ 2023-10-26  7:33           ` Morten Brørup
  2023-10-26 16:32             ` Stephen Hemminger
  0 siblings, 1 reply; 27+ messages in thread
From: Morten Brørup @ 2023-10-26  7:33 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Thomas Monjalon, dev, David Marchand, stable, Anatoly Burakov,
	Dmitry Kozlyuk, Narcisa Ana Maria Vasile, Dmitry Malloy,
	Pallavi Kadam, Tyler Retzlaff, Andrew Rybchenko,
	Konstantin Ananyev
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, 25 October 2023 23.33
> 
> On Wed, 25 Oct 2023 19:54:06 +0200
> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> > I agree with Thomas on this.
> >
> > If you want the log message, please degrade it to INFO or DEBUG level. It is
> only relevant when chasing problems, not for normal production - and thus
> NOTICE is too high.
> 
> I don't want the message to be hidden.
> If we get any bug reports want to be able to say "read the log, don't do
> that".
Since Stephen is arguing so strongly for it, I have changed my mind, and now support Stephen's suggestion.
It's a tradeoff: Noise for carefully designed systems, vs. important bug hunting information for systems under development (or casually developed systems).
As Stephen points out, it is a good starting point to check for bug reports possibly related to this. And, I suppose the experienced users who really understands it will not be seriously confused by such a NOTICE message in the log.
> 
> > Someone might build a kernel with options to keep non-dataplane threads off
> some dedicated CPU cores, so they can be used for guaranteed low-latency
> dataplane threads. We do. We don't use real-time priority, though.
> 
> This is really, hard to do.
As my kids would say: This is really, really, really, really, really hard to do!
We have not been able to find an authoritative source of documentation describing how to do it. :-(
And our experiment shows that we didn't 100 % succeed doing it. But we got close enough for our purposes. Outliers of max 9,000 CPU cycles on a 3+ GHz CPU corresponds to max 3 microseconds of added worst-case latency.
It would be great for latency-sensitive applications if the DPDK documentation went more into detail on this topic. However, if the DPDK runs on top of a Linux distro, it essentially depends on the distro, and should be documented there. And if running on top of a custom built Linux Kernel, it essentially depends on the kernel, and should be documented there. In other words: Such information should be contributed there, and not in the DPDK documentation. ;-)
> Isolated CPU's are not isolated from interrupts
> and other sources which end up scheduling work as kernel threads. Plus there
> is the behavior where kernel decides to turn a soft irq into a kernel thread,
> then starve itself.
We have configured the kernel to put all of this on CPU 0. (Details further below.)
> Under starvation, disk corruption is likely if interrupts never get
> processed :-(
> 
> > For reference, we did some experiments (using this custom built kernel) with
> a dedicated thread doing nothing but a loop calling rte_rdtsc_precise() and
> registering the delta. Although the overwhelming majority is ca. CPU 80
> cycles, there are some big outliers at ca. 9,000 CPU cycles. (Order of
> magnitude: ca. 45 of these big outliers per minute.) Apparently some kernel
> threads steal some cycles from this thread, regardless of our customizations.
> We haven't bothered analyzing and optimizing it further.
> 
> Was this on isolated CPU?
Yes. We isolate all CPUs but CPU 0.
> Did you check that that CPU was excluded from the smp_affinty mask on all
> devices?
Not sure how to do that?
NB: We are currently only using single-socket hardware - this makes some things easier. Perhaps this is one of those things?
> Did you enable the kernel feature to avoid clock ticks if CPU is dedicated?
Yes:
# Timers subsystem
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
CONFIG_NO_HZ_FULL=y
CONFIG_NO_HZ_FULL_ALL=y
CONFIG_CMDLINE="isolcpus=1-32 irqaffinity=0 rcu_nocb_poll"
> Same thing for RCU, need to adjust parameters?
Yes:
# RCU Subsystem
CONFIG_TREE_RCU=y
CONFIG_SRCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_CONTEXT_TRACKING=y
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_ALL=y
> 
> Also, on many systems there can be SMI BIOS hidden execution that will cause
> big outliers.
Yes, this is a big surprise to many people, when it happens. Our hardware doesn't suffer from that.
> 
> Lastly never try and use CPU 0. The kernel uses CPU 0 as catch all in lots of
> places.
Yes, this is very important! We treat CPU 0 as if any random process or interrupt handler can take it away at any time.
> 
> > I think our experiment supports the need to allow kernel threads to run,
> e.g. by calling sleep() or similar, when an EAL thread has real-time priority.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* [PATCH v4 2/2] eal/unix: allow creating thread with real-time priority
       [not found] ` <20231026134313.1165954-1-thomas@monjalon.net>
@ 2023-10-26 13:37   ` Thomas Monjalon
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-26 13:37 UTC (permalink / raw)
  To: dev
  Cc: David Marchand, stable, Morten Brørup, Anatoly Burakov,
	Stephen Hemminger, Narcisa Vasile, Tyler Retzlaff,
	Dmitry Kozlyuk, Andrew Rybchenko, Konstantin Ananyev
When adding an API for creating threads,
the real-time priority has been forbidden on Unix.
There is a known issue with ring behaviour,
but it should not be completely forbidden.
Real-time thread can block some kernel threads on the same core,
making the system unstable.
That's why a pause is added in the test thread.
Fixes: ca04c78b6262 ("eal: get/set thread priority per thread identifier")
Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 app/test/test_threads.c                       | 11 +---------
 .../prog_guide/env_abstraction_layer.rst      |  4 +++-
 lib/eal/include/rte_thread.h                  |  8 +++++--
 lib/eal/unix/rte_thread.c                     | 21 +++++++++++--------
 4 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index 4ac3f2671a..c14d39fc83 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -22,7 +22,7 @@ thread_main(void *arg)
 	__atomic_store_n(&thread_id_ready, 1, __ATOMIC_RELEASE);
 
 	while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 1)
-		;
+		rte_thread_yield_realtime(); /* required for RT priority */
 
 	return 0;
 }
@@ -97,21 +97,12 @@ test_thread_priority(void)
 		"Priority set mismatches priority get");
 
 	priority = RTE_THREAD_PRIORITY_REALTIME_CRITICAL;
-#ifndef RTE_EXEC_ENV_WINDOWS
-	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == ENOTSUP,
-		"Priority set to critical should fail");
-	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
-		"Failed to get thread priority");
-	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_NORMAL,
-		"Failed set to critical should have retained normal");
-#else
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
 		"Priority set to critical should succeed");
 	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
 		"Failed to get thread priority");
 	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL,
 		"Priority set mismatches priority get");
-#endif
 
 	priority = RTE_THREAD_PRIORITY_NORMAL;
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 6debf54efb..d1f7cae7cd 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -815,7 +815,9 @@ Known Issues
 
   4. It MAY be used by preemptible multi-producer and/or preemptible multi-consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE or SCHED_BATCH. User SHOULD be aware of the performance penalty before using it.
 
-  5. It MUST not be used by multi-producer/consumer pthreads, whose scheduling policies are SCHED_FIFO or SCHED_RR.
+  5. It MUST not be used by multi-producer/consumer pthreads
+     whose scheduling policies are ``SCHED_FIFO``
+     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).
 
   Alternatively, applications can use the lock-free stack mempool handler. When
   considering this handler, note that:
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index f2581fe152..7ff031e1b2 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -56,10 +56,14 @@ typedef uint32_t (*rte_thread_func) (void *arg);
  * Thread priority values.
  */
 enum rte_thread_priority {
+	/** Normal thread priority, the default. */
 	RTE_THREAD_PRIORITY_NORMAL            = 0,
-	/**< normal thread priority, the default */
+	/**
+	 * Highest thread priority, use with caution.
+	 * WARNING: System may be unstable because of a real-time busy loop.
+	 *          @see rte_thread_yield_realtime().
+	 */
 	RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
-	/**< highest thread priority allowed */
 };
 
 /**
diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
index d0758f23bb..e9967e7e41 100644
--- a/lib/eal/unix/rte_thread.c
+++ b/lib/eal/unix/rte_thread.c
@@ -51,6 +51,18 @@ thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int *os_pri,
 			sched_get_priority_max(SCHED_OTHER)) / 2;
 		break;
 	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
+		/*
+		 * WARNING: Real-time busy loop takes priority on kernel threads,
+		 *          making the system unstable.
+		 *          There is also a known issue when using rte_ring.
+		 */
+		static bool warned;
+		if (!warned) {
+			RTE_LOG(NOTICE, EAL,
+					"Real-time thread is unstable if polling without sleep.\n");
+			warned = true;
+		}
+
 		*pol = SCHED_RR;
 		*os_pri = sched_get_priority_max(SCHED_RR);
 		break;
@@ -155,11 +167,6 @@ rte_thread_create(rte_thread_t *thread_id,
 			goto cleanup;
 		}
 
-		if (thread_attr->priority ==
-				RTE_THREAD_PRIORITY_REALTIME_CRITICAL) {
-			ret = ENOTSUP;
-			goto cleanup;
-		}
 		ret = thread_map_priority_to_os_value(thread_attr->priority,
 				¶m.sched_priority, &policy);
 		if (ret != 0)
@@ -292,10 +299,6 @@ rte_thread_set_priority(rte_thread_t thread_id,
 	int policy;
 	int ret;
 
-	/* Realtime priority can cause crashes on non-Windows platforms. */
-	if (priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL)
-		return ENOTSUP;
-
 	ret = thread_map_priority_to_os_value(priority, ¶m.sched_priority,
 		&policy);
 	if (ret != 0)
-- 
2.42.0
^ permalink raw reply	[flat|nested] 27+ messages in thread
* [PATCH v5 2/2] eal/unix: allow creating thread with real-time priority
       [not found] ` <20231026142749.1174372-1-thomas@monjalon.net>
@ 2023-10-26 14:19   ` Thomas Monjalon
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-26 14:19 UTC (permalink / raw)
  To: dev
  Cc: David Marchand, stable, Morten Brørup, Anatoly Burakov,
	Narcisa Vasile, Tyler Retzlaff, Stephen Hemminger,
	Dmitry Kozlyuk, Konstantin Ananyev, Andrew Rybchenko
When adding an API for creating threads,
the real-time priority has been forbidden on Unix.
There is a known issue with ring behaviour,
but it should not be completely forbidden.
Real-time thread can block some kernel threads on the same core,
making the system unstable.
That's why a sleep is added in the test thread,
and a warning is logged when using real-time priority.
Fixes: ca04c78b6262 ("eal: get/set thread priority per thread identifier")
Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 app/test/test_threads.c                       | 11 +---------
 .../prog_guide/env_abstraction_layer.rst      |  4 +++-
 lib/eal/include/rte_thread.h                  |  8 +++++--
 lib/eal/unix/rte_thread.c                     | 22 +++++++++++--------
 4 files changed, 23 insertions(+), 22 deletions(-)
diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index 4ac3f2671a..c14d39fc83 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -22,7 +22,7 @@ thread_main(void *arg)
 	__atomic_store_n(&thread_id_ready, 1, __ATOMIC_RELEASE);
 
 	while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 1)
-		;
+		rte_thread_yield_realtime(); /* required for RT priority */
 
 	return 0;
 }
@@ -97,21 +97,12 @@ test_thread_priority(void)
 		"Priority set mismatches priority get");
 
 	priority = RTE_THREAD_PRIORITY_REALTIME_CRITICAL;
-#ifndef RTE_EXEC_ENV_WINDOWS
-	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == ENOTSUP,
-		"Priority set to critical should fail");
-	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
-		"Failed to get thread priority");
-	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_NORMAL,
-		"Failed set to critical should have retained normal");
-#else
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
 		"Priority set to critical should succeed");
 	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
 		"Failed to get thread priority");
 	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL,
 		"Priority set mismatches priority get");
-#endif
 
 	priority = RTE_THREAD_PRIORITY_NORMAL;
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 6debf54efb..d1f7cae7cd 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -815,7 +815,9 @@ Known Issues
 
   4. It MAY be used by preemptible multi-producer and/or preemptible multi-consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE or SCHED_BATCH. User SHOULD be aware of the performance penalty before using it.
 
-  5. It MUST not be used by multi-producer/consumer pthreads, whose scheduling policies are SCHED_FIFO or SCHED_RR.
+  5. It MUST not be used by multi-producer/consumer pthreads
+     whose scheduling policies are ``SCHED_FIFO``
+     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).
 
   Alternatively, applications can use the lock-free stack mempool handler. When
   considering this handler, note that:
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index f2581fe152..7ff031e1b2 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -56,10 +56,14 @@ typedef uint32_t (*rte_thread_func) (void *arg);
  * Thread priority values.
  */
 enum rte_thread_priority {
+	/** Normal thread priority, the default. */
 	RTE_THREAD_PRIORITY_NORMAL            = 0,
-	/**< normal thread priority, the default */
+	/**
+	 * Highest thread priority, use with caution.
+	 * WARNING: System may be unstable because of a real-time busy loop.
+	 *          @see rte_thread_yield_realtime().
+	 */
 	RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
-	/**< highest thread priority allowed */
 };
 
 /**
diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
index 278d8d342d..17ffb86c17 100644
--- a/lib/eal/unix/rte_thread.c
+++ b/lib/eal/unix/rte_thread.c
@@ -33,6 +33,8 @@ static int
 thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int *os_pri,
 	int *pol)
 {
+	static bool warned;
+
 	/* Clear the output parameters. */
 	*os_pri = sched_get_priority_min(SCHED_OTHER) - 1;
 	*pol = -1;
@@ -51,6 +53,17 @@ thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int *os_pri,
 			sched_get_priority_max(SCHED_OTHER)) / 2;
 		break;
 	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
+		/*
+		 * WARNING: Real-time busy loop takes priority on kernel threads,
+		 *          making the system unstable.
+		 *          There is also a known issue when using rte_ring.
+		 */
+		if (!warned) {
+			RTE_LOG(NOTICE, EAL,
+					"Real-time thread is unstable if polling without sleep.\n");
+			warned = true;
+		}
+
 		*pol = SCHED_RR;
 		*os_pri = sched_get_priority_max(SCHED_RR);
 		break;
@@ -155,11 +168,6 @@ rte_thread_create(rte_thread_t *thread_id,
 			goto cleanup;
 		}
 
-		if (thread_attr->priority ==
-				RTE_THREAD_PRIORITY_REALTIME_CRITICAL) {
-			ret = ENOTSUP;
-			goto cleanup;
-		}
 		ret = thread_map_priority_to_os_value(thread_attr->priority,
 				¶m.sched_priority, &policy);
 		if (ret != 0)
@@ -291,10 +299,6 @@ rte_thread_set_priority(rte_thread_t thread_id,
 	int policy;
 	int ret;
 
-	/* Realtime priority can cause crashes on non-Windows platforms. */
-	if (priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL)
-		return ENOTSUP;
-
 	ret = thread_map_priority_to_os_value(priority, ¶m.sched_priority,
 		&policy);
 	if (ret != 0)
-- 
2.42.0
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH v2] eal/unix: allow creating thread with real-time priority
  2023-10-26  7:33           ` Morten Brørup
@ 2023-10-26 16:32             ` Stephen Hemminger
  2023-10-26 17:07               ` Morten Brørup
  0 siblings, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2023-10-26 16:32 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Thomas Monjalon, dev, David Marchand, stable, Anatoly Burakov,
	Dmitry Kozlyuk, Narcisa Ana Maria Vasile, Dmitry Malloy,
	Pallavi Kadam, Tyler Retzlaff, Andrew Rybchenko,
	Konstantin Ananyev
On Thu, 26 Oct 2023 09:33:42 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Wednesday, 25 October 2023 23.33
> > 
> > On Wed, 25 Oct 2023 19:54:06 +0200
> > Morten Brørup <mb@smartsharesystems.com> wrote:
> >   
> > > I agree with Thomas on this.
> > >
> > > If you want the log message, please degrade it to INFO or DEBUG level. It is  
> > only relevant when chasing problems, not for normal production - and thus
> > NOTICE is too high.
> > 
> > I don't want the message to be hidden.
> > If we get any bug reports want to be able to say "read the log, don't do
> > that".  
> 
> Since Stephen is arguing so strongly for it, I have changed my mind, and now support Stephen's suggestion.
> 
> It's a tradeoff: Noise for carefully designed systems, vs. important bug hunting information for systems under development (or casually developed systems).
> As Stephen points out, it is a good starting point to check for bug reports possibly related to this. And, I suppose the experienced users who really understands it will not be seriously confused by such a NOTICE message in the log.
> 
> >   
> > > Someone might build a kernel with options to keep non-dataplane threads off  
> > some dedicated CPU cores, so they can be used for guaranteed low-latency
> > dataplane threads. We do. We don't use real-time priority, though.
> > 
> > This is really, hard to do.  
> 
> As my kids would say: This is really, really, really, really, really hard to do!
> 
> We have not been able to find an authoritative source of documentation describing how to do it. :-(
> 
> And our experiment shows that we didn't 100 % succeed doing it. But we got close enough for our purposes. Outliers of max 9,000 CPU cycles on a 3+ GHz CPU corresponds to max 3 microseconds of added worst-case latency.
> 
> It would be great for latency-sensitive applications if the DPDK documentation went more into detail on this topic. However, if the DPDK runs on top of a Linux distro, it essentially depends on the distro, and should be documented there. And if running on top of a custom built Linux Kernel, it essentially depends on the kernel, and should be documented there. In other words: Such information should be contributed there, and not in the DPDK documentation. ;-)
> 
> > Isolated CPU's are not isolated from interrupts
> > and other sources which end up scheduling work as kernel threads. Plus there
> > is the behavior where kernel decides to turn a soft irq into a kernel thread,
> > then starve itself.  
> 
> We have configured the kernel to put all of this on CPU 0. (Details further below.)
> 
> > Under starvation, disk corruption is likely if interrupts never get
> > processed :-(
> >   
> > > For reference, we did some experiments (using this custom built kernel) with  
> > a dedicated thread doing nothing but a loop calling rte_rdtsc_precise() and
> > registering the delta. Although the overwhelming majority is ca. CPU 80
> > cycles, there are some big outliers at ca. 9,000 CPU cycles. (Order of
> > magnitude: ca. 45 of these big outliers per minute.) Apparently some kernel
> > threads steal some cycles from this thread, regardless of our customizations.
> > We haven't bothered analyzing and optimizing it further.
> > 
> > Was this on isolated CPU?  
> 
> Yes. We isolate all CPUs but CPU 0.
> 
> > Did you check that that CPU was excluded from the smp_affinty mask on all
> > devices?  
> 
> Not sure how to do that?
> 
> NB: We are currently only using single-socket hardware - this makes some things easier. Perhaps this is one of those things?
> 
> > Did you enable the kernel feature to avoid clock ticks if CPU is dedicated?  
> 
> Yes:
> # Timers subsystem
> CONFIG_TICK_ONESHOT=y
> CONFIG_NO_HZ_COMMON=y
> CONFIG_NO_HZ_FULL=y
> CONFIG_NO_HZ_FULL_ALL=y
> 
> CONFIG_CMDLINE="isolcpus=1-32 irqaffinity=0 rcu_nocb_poll"
> 
> > Same thing for RCU, need to adjust parameters?  
> 
> Yes:
> # RCU Subsystem
> CONFIG_TREE_RCU=y
> CONFIG_SRCU=y
> CONFIG_RCU_STALL_COMMON=y
> CONFIG_CONTEXT_TRACKING=y
> CONFIG_RCU_NOCB_CPU=y
> CONFIG_RCU_NOCB_CPU_ALL=y
> 
> > 
> > Also, on many systems there can be SMI BIOS hidden execution that will cause
> > big outliers.  
> 
> Yes, this is a big surprise to many people, when it happens. Our hardware doesn't suffer from that.
> 
> > 
> > Lastly never try and use CPU 0. The kernel uses CPU 0 as catch all in lots of
> > places.  
> 
> Yes, this is very important! We treat CPU 0 as if any random process or interrupt handler can take it away at any time.
> 
> >   
> > > I think our experiment supports the need to allow kernel threads to run,  
> > e.g. by calling sleep() or similar, when an EAL thread has real-time priority.  
> 
One benefit of doing real-time thread is that kernel will be more precise in
any calls to sleep. If you do small sleep in normal thread, the kernel will round
up the timer to try and avoid reprogramming timer chip and to save power (less wakeups from idle).
With RT thread it will do "you wanted 21us, ok for you will do 21us"
The project that was originally Vyatta, has a script that tries to isolate interrupts etc.
I started it but they have worked on it since then.
   https://github.com/danos/vyatta-cpu-shield
It adjust kernel workers, softirq, cgroups etc
^ permalink raw reply	[flat|nested] 27+ messages in thread
* RE: [PATCH v2] eal/unix: allow creating thread with real-time priority
  2023-10-26 16:32             ` Stephen Hemminger
@ 2023-10-26 17:07               ` Morten Brørup
  0 siblings, 0 replies; 27+ messages in thread
From: Morten Brørup @ 2023-10-26 17:07 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Thomas Monjalon, dev, David Marchand, stable, Anatoly Burakov,
	Dmitry Kozlyuk, Narcisa Ana Maria Vasile, Dmitry Malloy,
	Pallavi Kadam, Tyler Retzlaff, bruce.richardson,
	Andrew Rybchenko, Konstantin Ananyev
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, 26 October 2023 18.32
> 
> On Thu, 26 Oct 2023 09:33:42 +0200
> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> > > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > > Sent: Wednesday, 25 October 2023 23.33
> > >
> > > On Wed, 25 Oct 2023 19:54:06 +0200
> > > Morten Brørup <mb@smartsharesystems.com> wrote:
[...]
> > > > Someone might build a kernel with options to keep non-dataplane
> threads off
> > > some dedicated CPU cores, so they can be used for guaranteed low-
> latency
> > > dataplane threads. We do. We don't use real-time priority, though.
> > >
> > > This is really, hard to do.
> >
> > As my kids would say: This is really, really, really, really, really
> hard to do!
> >
> > We have not been able to find an authoritative source of
> documentation describing how to do it. :-(
[...]
> One benefit of doing real-time thread is that kernel will be more
> precise in
> any calls to sleep. If you do small sleep in normal thread, the kernel
> will round
> up the timer to try and avoid reprogramming timer chip and to save
> power (less wakeups from idle).
> With RT thread it will do "you wanted 21us, ok for you will do 21us"
So we don't need PR_SET_TIMERSLACK with RT threads?
> 
> The project that was originally Vyatta, has a script that tries to
> isolate interrupts etc.
> I started it but they have worked on it since then.
> 
>    https://github.com/danos/vyatta-cpu-shield
> 
> It adjust kernel workers, softirq, cgroups etc
This script looks very interesting. Thank you, Stephen!
^ permalink raw reply	[flat|nested] 27+ messages in thread
* [PATCH v6 1/1] eal/unix: allow creating thread with real-time priority
  2023-10-24 12:54 [PATCH] eal/unix: allow creating thread with real-time priority Thomas Monjalon
                   ` (4 preceding siblings ...)
       [not found] ` <20231026142749.1174372-1-thomas@monjalon.net>
@ 2023-10-27  8:08 ` Thomas Monjalon
  2023-10-27  8:45   ` Morten Brørup
  2024-10-07 19:27   ` Stephen Hemminger
  5 siblings, 2 replies; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-27  8:08 UTC (permalink / raw)
  To: dev
  Cc: David Marchand, stable, Morten Brørup, Anatoly Burakov,
	Tyler Retzlaff, Narcisa Vasile, Stephen Hemminger,
	Dmitry Kozlyuk, Konstantin Ananyev, Andrew Rybchenko
When adding an API for creating threads,
the real-time priority has been forbidden on Unix.
There is a known issue with ring behaviour,
but it should not be completely forbidden.
Real-time thread can block some kernel threads on the same core,
making the system unstable.
That's why a sleep is added in the test thread,
and a warning is logged when using real-time priority.
Fixes: ca04c78b6262 ("eal: get/set thread priority per thread identifier")
Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
v1: no yield at all
v2: more comments, sched_yield() and Sleep(0) on Windows
v3: 2 yield functions with sleep in realtime version
v4: runtime warning, longer sleep on Unix and lighter yield on Windows
v5: fix build and increase Unix sleep to 1 ms
v6: no yield helper, use a big sleep for testing purpose
It is not clear how to implement helpers which would work
on all systems (different configuration, arch and OS).
So the first patch adding yield functions is dropped for now.
---
 app/test/test_threads.c                       | 12 ++--------
 .../prog_guide/env_abstraction_layer.rst      |  4 +++-
 lib/eal/include/rte_thread.h                  |  8 +++++--
 lib/eal/unix/rte_thread.c                     | 22 +++++++++++--------
 4 files changed, 24 insertions(+), 22 deletions(-)
diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index 4ac3f2671a..a863214137 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -5,6 +5,7 @@
 #include <string.h>
 
 #include <rte_thread.h>
+#include <rte_cycles.h>
 #include <rte_debug.h>
 
 #include "test.h"
@@ -22,7 +23,7 @@ thread_main(void *arg)
 	__atomic_store_n(&thread_id_ready, 1, __ATOMIC_RELEASE);
 
 	while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 1)
-		;
+		rte_delay_us_sleep(1000); /* required for RT priority */
 
 	return 0;
 }
@@ -97,21 +98,12 @@ test_thread_priority(void)
 		"Priority set mismatches priority get");
 
 	priority = RTE_THREAD_PRIORITY_REALTIME_CRITICAL;
-#ifndef RTE_EXEC_ENV_WINDOWS
-	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == ENOTSUP,
-		"Priority set to critical should fail");
-	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
-		"Failed to get thread priority");
-	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_NORMAL,
-		"Failed set to critical should have retained normal");
-#else
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
 		"Priority set to critical should succeed");
 	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
 		"Failed to get thread priority");
 	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL,
 		"Priority set mismatches priority get");
-#endif
 
 	priority = RTE_THREAD_PRIORITY_NORMAL;
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 6debf54efb..d1f7cae7cd 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -815,7 +815,9 @@ Known Issues
 
   4. It MAY be used by preemptible multi-producer and/or preemptible multi-consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE or SCHED_BATCH. User SHOULD be aware of the performance penalty before using it.
 
-  5. It MUST not be used by multi-producer/consumer pthreads, whose scheduling policies are SCHED_FIFO or SCHED_RR.
+  5. It MUST not be used by multi-producer/consumer pthreads
+     whose scheduling policies are ``SCHED_FIFO``
+     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).
 
   Alternatively, applications can use the lock-free stack mempool handler. When
   considering this handler, note that:
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index f2581fe152..7ff031e1b2 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -56,10 +56,14 @@ typedef uint32_t (*rte_thread_func) (void *arg);
  * Thread priority values.
  */
 enum rte_thread_priority {
+	/** Normal thread priority, the default. */
 	RTE_THREAD_PRIORITY_NORMAL            = 0,
-	/**< normal thread priority, the default */
+	/**
+	 * Highest thread priority, use with caution.
+	 * WARNING: System may be unstable because of a real-time busy loop.
+	 *          @see rte_thread_yield_realtime().
+	 */
 	RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
-	/**< highest thread priority allowed */
 };
 
 /**
diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
index 278d8d342d..17ffb86c17 100644
--- a/lib/eal/unix/rte_thread.c
+++ b/lib/eal/unix/rte_thread.c
@@ -33,6 +33,8 @@ static int
 thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int *os_pri,
 	int *pol)
 {
+	static bool warned;
+
 	/* Clear the output parameters. */
 	*os_pri = sched_get_priority_min(SCHED_OTHER) - 1;
 	*pol = -1;
@@ -51,6 +53,17 @@ thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int *os_pri,
 			sched_get_priority_max(SCHED_OTHER)) / 2;
 		break;
 	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
+		/*
+		 * WARNING: Real-time busy loop takes priority on kernel threads,
+		 *          making the system unstable.
+		 *          There is also a known issue when using rte_ring.
+		 */
+		if (!warned) {
+			RTE_LOG(NOTICE, EAL,
+					"Real-time thread is unstable if polling without sleep.\n");
+			warned = true;
+		}
+
 		*pol = SCHED_RR;
 		*os_pri = sched_get_priority_max(SCHED_RR);
 		break;
@@ -155,11 +168,6 @@ rte_thread_create(rte_thread_t *thread_id,
 			goto cleanup;
 		}
 
-		if (thread_attr->priority ==
-				RTE_THREAD_PRIORITY_REALTIME_CRITICAL) {
-			ret = ENOTSUP;
-			goto cleanup;
-		}
 		ret = thread_map_priority_to_os_value(thread_attr->priority,
 				¶m.sched_priority, &policy);
 		if (ret != 0)
@@ -291,10 +299,6 @@ rte_thread_set_priority(rte_thread_t thread_id,
 	int policy;
 	int ret;
 
-	/* Realtime priority can cause crashes on non-Windows platforms. */
-	if (priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL)
-		return ENOTSUP;
-
 	ret = thread_map_priority_to_os_value(priority, ¶m.sched_priority,
 		&policy);
 	if (ret != 0)
-- 
2.42.0
^ permalink raw reply	[flat|nested] 27+ messages in thread
* RE: [PATCH v6 1/1] eal/unix: allow creating thread with real-time priority
  2023-10-27  8:08 ` [PATCH v6 1/1] " Thomas Monjalon
@ 2023-10-27  8:45   ` Morten Brørup
  2023-10-27  9:11     ` Thomas Monjalon
  2023-10-27 18:15     ` Stephen Hemminger
  2024-10-07 19:27   ` Stephen Hemminger
  1 sibling, 2 replies; 27+ messages in thread
From: Morten Brørup @ 2023-10-27  8:45 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: David Marchand, stable, Anatoly Burakov, Tyler Retzlaff,
	Narcisa Vasile, Stephen Hemminger, Dmitry Kozlyuk,
	Konstantin Ananyev, Andrew Rybchenko
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Friday, 27 October 2023 10.09
> 
> When adding an API for creating threads,
> the real-time priority has been forbidden on Unix.
> 
> There is a known issue with ring behaviour,
> but it should not be completely forbidden.
> 
> Real-time thread can block some kernel threads on the same core,
> making the system unstable.
> That's why a sleep is added in the test thread,
> and a warning is logged when using real-time priority.
> 
> Fixes: ca04c78b6262 ("eal: get/set thread priority per thread
> identifier")
> Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
> Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> ---
[...]
>  enum rte_thread_priority {
> +	/** Normal thread priority, the default. */
>  	RTE_THREAD_PRIORITY_NORMAL            = 0,
> -	/**< normal thread priority, the default */
> +	/**
> +	 * Highest thread priority, use with caution.
> +	 * WARNING: System may be unstable because of a real-time busy
> loop.
> +	 *          @see rte_thread_yield_realtime().
Please remove the reference to the now non-existing function.
Also, I'd prefer to move the warning comments (about real-time threads having priority over kernel threads, and issues with rte_ring) up here, so it goes into the public API documentation.
> +	 */
>  	RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
> -	/**< highest thread priority allowed */
>  };
> 
>  /**
> diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
> index 278d8d342d..17ffb86c17 100644
> --- a/lib/eal/unix/rte_thread.c
> +++ b/lib/eal/unix/rte_thread.c
> @@ -33,6 +33,8 @@ static int
>  thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int
> *os_pri,
>  	int *pol)
>  {
> +	static bool warned;
> +
>  	/* Clear the output parameters. */
>  	*os_pri = sched_get_priority_min(SCHED_OTHER) - 1;
>  	*pol = -1;
> @@ -51,6 +53,17 @@ thread_map_priority_to_os_value(enum
> rte_thread_priority eal_pri, int *os_pri,
>  			sched_get_priority_max(SCHED_OTHER)) / 2;
>  		break;
>  	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
> +		/*
> +		 * WARNING: Real-time busy loop takes priority on kernel
> threads,
> +		 *          making the system unstable.
> +		 *          There is also a known issue when using
> rte_ring.
> +		 */
> +		if (!warned) {
> +			RTE_LOG(NOTICE, EAL,
> +					"Real-time thread is unstable if polling
> without sleep.\n");
> +			warned = true;
> +		}
Is it 100 % certain that the system becomes unstable if not sleeping or using blocking system calls from a real-time thread?
And technically, it's not the thread itself that becomes unstable.
How about:
"System may be unstable unless real-time thread uses blocking system calls or sleeps."
or:
"Real-time thread usually requires the use of blocking system calls or sleeps."
or something else.
My ACK is still valid.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH v6 1/1] eal/unix: allow creating thread with real-time priority
  2023-10-27  8:45   ` Morten Brørup
@ 2023-10-27  9:11     ` Thomas Monjalon
  2023-10-27 18:15     ` Stephen Hemminger
  1 sibling, 0 replies; 27+ messages in thread
From: Thomas Monjalon @ 2023-10-27  9:11 UTC (permalink / raw)
  To: Morten Brørup
  Cc: dev, David Marchand, stable, Anatoly Burakov, Tyler Retzlaff,
	Narcisa Vasile, Stephen Hemminger, Dmitry Kozlyuk,
	Konstantin Ananyev, Andrew Rybchenko
27/10/2023 10:45, Morten Brørup:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Friday, 27 October 2023 10.09
> > 
> > When adding an API for creating threads,
> > the real-time priority has been forbidden on Unix.
> > 
> > There is a known issue with ring behaviour,
> > but it should not be completely forbidden.
> > 
> > Real-time thread can block some kernel threads on the same core,
> > making the system unstable.
> > That's why a sleep is added in the test thread,
> > and a warning is logged when using real-time priority.
> > 
> > Fixes: ca04c78b6262 ("eal: get/set thread priority per thread
> > identifier")
> > Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
> > Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > ---
> 
> [...]
> 
> >  enum rte_thread_priority {
> > +	/** Normal thread priority, the default. */
> >  	RTE_THREAD_PRIORITY_NORMAL            = 0,
> > -	/**< normal thread priority, the default */
> > +	/**
> > +	 * Highest thread priority, use with caution.
> > +	 * WARNING: System may be unstable because of a real-time busy
> > loop.
> > +	 *          @see rte_thread_yield_realtime().
> 
> Please remove the reference to the now non-existing function.
> 
> Also, I'd prefer to move the warning comments (about real-time threads having priority over kernel threads, and issues with rte_ring) up here, so it goes into the public API documentation.
Yes OK, thanks for the careful review.
> 
> > +	 */
> >  	RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
> > -	/**< highest thread priority allowed */
> >  };
> > 
> >  /**
> > diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
> > index 278d8d342d..17ffb86c17 100644
> > --- a/lib/eal/unix/rte_thread.c
> > +++ b/lib/eal/unix/rte_thread.c
> > @@ -33,6 +33,8 @@ static int
> >  thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int
> > *os_pri,
> >  	int *pol)
> >  {
> > +	static bool warned;
> > +
> >  	/* Clear the output parameters. */
> >  	*os_pri = sched_get_priority_min(SCHED_OTHER) - 1;
> >  	*pol = -1;
> > @@ -51,6 +53,17 @@ thread_map_priority_to_os_value(enum
> > rte_thread_priority eal_pri, int *os_pri,
> >  			sched_get_priority_max(SCHED_OTHER)) / 2;
> >  		break;
> >  	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
> > +		/*
> > +		 * WARNING: Real-time busy loop takes priority on kernel
> > threads,
> > +		 *          making the system unstable.
> > +		 *          There is also a known issue when using
> > rte_ring.
> > +		 */
> > +		if (!warned) {
> > +			RTE_LOG(NOTICE, EAL,
> > +					"Real-time thread is unstable if polling
> > without sleep.\n");
> > +			warned = true;
> > +		}
> 
> Is it 100 % certain that the system becomes unstable if not sleeping or using blocking system calls from a real-time thread?
> And technically, it's not the thread itself that becomes unstable.
> 
> How about:
> "System may be unstable unless real-time thread uses blocking system calls or sleeps."
> or:
> "Real-time thread usually requires the use of blocking system calls or sleeps."
> or something else.
Yes something like that looks better.
I will try to find a short sentence.
> 
> My ACK is still valid.
> 
> 
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH v6 1/1] eal/unix: allow creating thread with real-time priority
  2023-10-27  8:45   ` Morten Brørup
  2023-10-27  9:11     ` Thomas Monjalon
@ 2023-10-27 18:15     ` Stephen Hemminger
  1 sibling, 0 replies; 27+ messages in thread
From: Stephen Hemminger @ 2023-10-27 18:15 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Thomas Monjalon, dev, David Marchand, stable, Anatoly Burakov,
	Tyler Retzlaff, Narcisa Vasile, Dmitry Kozlyuk,
	Konstantin Ananyev, Andrew Rybchenko
On Fri, 27 Oct 2023 10:45:03 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:
> Is it 100 % certain that the system becomes unstable if not sleeping or using blocking system calls from a real-time thread?
> And technically, it's not the thread itself that becomes unstable.
My experience is that the goal of real time threads is "do not let kernel have higher priority than this thread".
That means that if/when a kernel worker is needed on this core, instability happens.
It is very hard but possible to ensure that a kernel worker thread never runs on that core.
But if you are able to do configure that, then there is no point in using RT threads.
The best design is to avoid the kernel needing to run, and if it does then let it run.
^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [PATCH v6 1/1] eal/unix: allow creating thread with real-time priority
  2023-10-27  8:08 ` [PATCH v6 1/1] " Thomas Monjalon
  2023-10-27  8:45   ` Morten Brørup
@ 2024-10-07 19:27   ` Stephen Hemminger
  1 sibling, 0 replies; 27+ messages in thread
From: Stephen Hemminger @ 2024-10-07 19:27 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, David Marchand, stable, Morten Brørup, Anatoly Burakov,
	Tyler Retzlaff, Narcisa Vasile, Dmitry Kozlyuk,
	Konstantin Ananyev, Andrew Rybchenko
On Fri, 27 Oct 2023 10:08:52 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:
> When adding an API for creating threads,
> the real-time priority has been forbidden on Unix.
> 
> There is a known issue with ring behaviour,
> but it should not be completely forbidden.
> 
> Real-time thread can block some kernel threads on the same core,
> making the system unstable.
> That's why a sleep is added in the test thread,
> and a warning is logged when using real-time priority.
> 
> Fixes: ca04c78b6262 ("eal: get/set thread priority per thread identifier")
> Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
> Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> ---
If RT threads are allowed on Linux, then a couple more things are necessary.
1. A functional test, maybe something with two threads exchanging packets
   ping-pong or something similar and show that the test does not crash
   or hang the CI system.
2. Sufficient documentation in programmers guide. Probably need a whole
   chapter on what is RT, when to use RT, and caveats.
   Make it clear to users that RT is not a faster mode, it is more about
   working with existing RT applications.
^ permalink raw reply	[flat|nested] 27+ messages in thread
end of thread, other threads:[~2024-10-07 19:27 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-24 12:54 [PATCH] eal/unix: allow creating thread with real-time priority Thomas Monjalon
2023-10-24 13:55 ` Morten Brørup
2023-10-24 16:04   ` Stephen Hemminger
2023-10-25 13:15     ` Thomas Monjalon
2023-10-25 13:34       ` Bruce Richardson
2023-10-25 13:44         ` Thomas Monjalon
2023-10-25 15:08           ` Stephen Hemminger
2023-10-25 15:14             ` Bruce Richardson
2023-10-25 15:18               ` Thomas Monjalon
2023-10-25 15:32                 ` Thomas Monjalon
2023-10-25 15:13 ` [PATCH v2] " Thomas Monjalon
2023-10-25 15:37   ` Stephen Hemminger
2023-10-25 16:46     ` Thomas Monjalon
2023-10-25 17:54       ` Morten Brørup
2023-10-25 21:33         ` Stephen Hemminger
2023-10-26  7:33           ` Morten Brørup
2023-10-26 16:32             ` Stephen Hemminger
2023-10-26 17:07               ` Morten Brørup
2023-10-26  0:00         ` Stephen Hemminger
     [not found] ` <20231025163352.1076755-1-thomas@monjalon.net>
2023-10-25 16:31   ` [PATCH v3 2/2] " Thomas Monjalon
     [not found] ` <20231026134313.1165954-1-thomas@monjalon.net>
2023-10-26 13:37   ` [PATCH v4 " Thomas Monjalon
     [not found] ` <20231026142749.1174372-1-thomas@monjalon.net>
2023-10-26 14:19   ` [PATCH v5 " Thomas Monjalon
2023-10-27  8:08 ` [PATCH v6 1/1] " Thomas Monjalon
2023-10-27  8:45   ` Morten Brørup
2023-10-27  9:11     ` Thomas Monjalon
2023-10-27 18:15     ` Stephen Hemminger
2024-10-07 19:27   ` Stephen Hemminger
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).