DPDK patches and discussions
* [PATCH] eal/linux: enhanced error handling for affinity
@ 2024-04-23  3:02 Jianyue Wu
  2024-04-24 15:50 ` Stephen Hemminger
  0 siblings, 1 reply; 6+ messages in thread
From: Jianyue Wu @ 2024-04-23  3:02 UTC (permalink / raw)
  Cc: dev, Jianyue Wu

Improve the robustness of setting thread affinity in DPDK
by adding detailed error logging.

Changes:
1. Check the return value of pthread_setaffinity_np() and log an error
if the call fails.
2. Include the current thread name, the intended CPU set, and a detailed
error message in the log.

Sample prints:
EAL: Cannot set affinity for thread dpdk-test with cpus 0,
ret: 22, errno: 0, error description: Success
EAL: Cannot set affinity for thread dpdk-worker1 with cpus 1,
ret: 22, errno: 0, error description: Success

Signed-off-by: Jianyue Wu <wujianyue000@163.com>
---
 lib/eal/unix/rte_thread.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
index 1b4c73f58e..8f9eaf0dcf 100644
--- a/lib/eal/unix/rte_thread.c
+++ b/lib/eal/unix/rte_thread.c
@@ -369,8 +369,26 @@ int
 rte_thread_set_affinity_by_id(rte_thread_t thread_id,
 		const rte_cpuset_t *cpuset)
 {
-	return pthread_setaffinity_np((pthread_t)thread_id.opaque_id,
-		sizeof(*cpuset), cpuset);
+	int ret;
+	char cpus_str[RTE_CPU_AFFINITY_STR_LEN] = {'\0'};
+	char thread_name[RTE_MAX_THREAD_NAME_LEN] = {'\0'};
+
+	errno = 0;
+	ret = pthread_setaffinity_np((pthread_t)thread_id.opaque_id,
+				sizeof(*cpuset), cpuset);
+	if (ret != 0) {
+		if (pthread_getname_np((pthread_t)thread_id.opaque_id,
+					thread_name, sizeof(thread_name)) != 0)
+			EAL_LOG(ERR, "pthread_getname_np failed!");
+		if (eal_thread_dump_affinity(cpuset, cpus_str, RTE_CPU_AFFINITY_STR_LEN) != 0)
+			EAL_LOG(ERR, "eal_thread_dump_affinity failed!");
+		EAL_LOG(ERR, "Cannot set affinity for thread %s with cpus %s, "
+			"ret: %d, errno: %d, error description: %s",
+			thread_name, cpus_str,
+			ret, errno, strerror(errno));
+	}
+
+	return ret;
 }
 
 int
-- 
2.34.1
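
Note on the sample prints in the commit message: pthread_setaffinity_np()
reports failure through its return value and does not set errno. That is
why the output shows "ret: 22" next to "errno: 0, error description:
Success". The following is a minimal sketch of logging the returned code
instead, assuming Linux with glibc. The helper name try_set_affinity is
hypothetical, and fprintf() stands in for EAL_LOG(); this is not the
posted patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes pthread_setaffinity_np() and the CPU_* macros */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of DPDK: apply a CPU set to a thread and
 * log the failure reason taken from the return value.
 */
static int
try_set_affinity(pthread_t thread, const cpu_set_t *cpuset)
{
	int ret;

	assert(cpuset != NULL);
	ret = pthread_setaffinity_np(thread, sizeof(*cpuset), cpuset);
	if (ret != 0) {
		/* ret is already an errno-style code; errno is not the source. */
		fprintf(stderr, "Cannot set affinity: %s (%d)\n",
			strerror(ret), ret);
	}
	return ret;
}
```

Calling the helper with an empty CPU set is one way to exercise the error
path; the Linux man page lists EINVAL for a mask that contains no permitted
processors.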


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] eal/linux: enhanced error handling for affinity
  2024-04-23  3:02 [PATCH] eal/linux: enhanced error handling for affinity Jianyue Wu
@ 2024-04-24 15:50 ` Stephen Hemminger
  2024-04-25  1:08   ` 吴剑跃
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2024-04-24 15:50 UTC (permalink / raw)
  To: Jianyue Wu; +Cc: dev

On Tue, 23 Apr 2024 11:02:43 +0800
Jianyue Wu <wujianyue000@163.com> wrote:

> Improve the robustness of setting thread affinity in DPDK
> by adding detailed error logging.

Is this an error you saw in your application or something inside DPDK?

> Changes:
> 1. Check the return value of pthread_setaffinity_np() and log an error
> if the call fails.

Not sure this is necessary. The rte_thread functions are intended to
be an OS-independent wrapper for threads. Does it need to be this chatty?

> 2. Include the current thread name, the intended CPU set, and a detailed
> error message in the log.

This introduces more code and ends up being Linux/BSD-specific, only
for the case where the application did something wrong.


* Re:Re: [PATCH] eal/linux: enhanced error handling for affinity
  2024-04-24 15:50 ` Stephen Hemminger
@ 2024-04-25  1:08   ` 吴剑跃
  2024-04-25  5:40     ` 吴剑跃
  0 siblings, 1 reply; 6+ messages in thread
From: 吴剑跃 @ 2024-04-25  1:08 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

Hello, Stephen,

Good day.

The issue is not caused by DPDK itself. It arises when the DPDK worker process attempts to set affinity to a cpuset that exceeds the limits imposed by the cgroup cpuset settings.
The original error output was:
     PANIC in rte_eal_init():
     Cannot set affinity
     # Callstacks.

Finding the detailed reason for the failure was difficult, so I added extra print statements to help diagnose the issue.

I understand your concern about keeping the rte_thread functions OS-independent. This change aims to provide more context when errors occur, so that troubleshooting is quicker. I agree that it introduces more code and could be seen as platform-specific. Perhaps we could implement it conditionally, only for platforms where such detailed logging is supported and useful.
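
The failure described above can be anticipated before the affinity call is made. The following is a minimal sketch, assuming Linux with glibc: it compares a requested CPU set with the set the calling thread is currently allowed to run on, as reported by sched_getaffinity(). The helper name count_permitted_cpus is hypothetical. The current mask is only an approximation of the cgroup cpuset limit, because a thread that has already narrowed its own affinity reports a smaller set.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes sched_getaffinity() and the CPU_* macros */
#endif
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper: return how many CPUs in 'wanted' are also in the
 * calling thread's current affinity mask, or -1 if the mask cannot be read.
 */
static int
count_permitted_cpus(const cpu_set_t *wanted)
{
	cpu_set_t allowed, overlap;

	assert(wanted != NULL);
	/* pid 0 means "the calling thread". */
	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return -1;

	/* overlap = wanted AND allowed */
	CPU_AND(&overlap, wanted, &allowed);
	return CPU_COUNT(&overlap);
}
```

A result of 0 suggests that none of the requested CPUs is usable, which is the situation in which the affinity call would be expected to fail with EINVAL.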



* Re:Re:Re: [PATCH] eal/linux: enhanced error handling for affinity
  2024-04-25  1:08   ` 吴剑跃
@ 2024-04-25  5:40     ` 吴剑跃
  2024-04-25 15:04       ` Stephen Hemminger
  0 siblings, 1 reply; 6+ messages in thread
From: 吴剑跃 @ 2024-04-25  5:40 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

After reviewing the code, I believe that checking both the __linux__ and _GNU_SOURCE macros is enough to confirm that the pthread_getname_np() API can be used. I will add the following guard. Thank you.
#if defined(__linux__) && defined(_GNU_SOURCE)
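
A guard of this kind could be applied roughly as follows. This is a minimal sketch, not the revised patch: the helper name thread_name_or_unknown and the "unknown" placeholder are hypothetical, and _GNU_SOURCE is defined at the top only so the example builds on its own (in DPDK the build system is assumed to provide it).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes pthread_getname_np() on glibc */
#endif
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy a thread's name into buf, or the placeholder
 * "unknown" when the name cannot be obtained on this platform.
 */
static void
thread_name_or_unknown(pthread_t thread, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
#if defined(__linux__) && defined(_GNU_SOURCE)
	/* glibc expects a buffer of at least 16 bytes for this call. */
	if (pthread_getname_np(thread, buf, len) == 0)
		return;
#else
	(void)thread; /* unused when the name lookup is compiled out */
#endif
	snprintf(buf, len, "%s", "unknown");
}
```

With this shape the caller always receives a printable string, so the logging code does not need its own platform check.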



* Re: [PATCH] eal/linux: enhanced error handling for affinity
  2024-04-25  5:40     ` 吴剑跃
@ 2024-04-25 15:04       ` Stephen Hemminger
  2024-04-26  3:14         ` Jianyue Wu
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2024-04-25 15:04 UTC (permalink / raw)
  To: 吴剑跃; +Cc: dev

On Thu, 25 Apr 2024 13:40:21 +0800 (CST)
吴剑跃 <wujianyue000@163.com> wrote:

> After reviewing the code, I believe that the combination of the __linux__ and _GNU_SOURCE macros effectively confirms whether the pthread_getname_np() API can be utilized. I will proceed with adding them. Thank you~
> #if defined(__linux__) && defined(_GNU_SOURCE)

My point is that just giving the kernel error should be sufficient, rather than having
to reformat the incoming arguments. The arguments are coming from the command line, and what I
would do is look at the error and the command line arguments to the application, as well as
any kernel logs.


* Re:Re: [PATCH] eal/linux: enhanced error handling for affinity
  2024-04-25 15:04       ` Stephen Hemminger
@ 2024-04-26  3:14         ` Jianyue Wu
  0 siblings, 0 replies; 6+ messages in thread
From: Jianyue Wu @ 2024-04-26  3:14 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

Hello, Stephen,

Understood. Yesterday I added new changes to the patch. How can I withdraw that patch?

Thank you.

