* [PATCH] eal/linux: enhanced error handling for affinity
From: Jianyue Wu @ 2024-04-23 3:02 UTC
Cc: dev, Jianyue Wu
Improve the robustness of setting thread affinity in DPDK
by adding detailed error logging.
Changes:
1. Check the return value of pthread_setaffinity_np() and log an error
if the call fails.
2. Include the current thread name, the intended CPU set, and a detailed
error message in the log.
Sample prints:
EAL: Cannot set affinity for thread dpdk-test with cpus 0,
ret: 22, errno: 0, error description: Success
EAL: Cannot set affinity for thread dpdk-worker1 with cpus 1,
ret: 22, errno: 0, error description: Success
Signed-off-by: Jianyue Wu <wujianyue000@163.com>
---
lib/eal/unix/rte_thread.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
index 1b4c73f58e..8f9eaf0dcf 100644
--- a/lib/eal/unix/rte_thread.c
+++ b/lib/eal/unix/rte_thread.c
@@ -369,8 +369,26 @@ int
 rte_thread_set_affinity_by_id(rte_thread_t thread_id,
 		const rte_cpuset_t *cpuset)
 {
-	return pthread_setaffinity_np((pthread_t)thread_id.opaque_id,
-		sizeof(*cpuset), cpuset);
+	int ret;
+	char cpus_str[RTE_CPU_AFFINITY_STR_LEN] = {'\0'};
+	char thread_name[RTE_MAX_THREAD_NAME_LEN] = {'\0'};
+
+	errno = 0;
+	ret = pthread_setaffinity_np((pthread_t)thread_id.opaque_id,
+		sizeof(*cpuset), cpuset);
+	if (ret != 0) {
+		if (pthread_getname_np((pthread_t)thread_id.opaque_id,
+				thread_name, sizeof(thread_name)) != 0)
+			EAL_LOG(ERR, "pthread_getname_np failed!");
+		if (eal_thread_dump_affinity(cpuset, cpus_str, RTE_CPU_AFFINITY_STR_LEN) != 0)
+			EAL_LOG(ERR, "eal_thread_dump_affinity failed!");
+		EAL_LOG(ERR, "Cannot set affinity for thread %s with cpus %s, "
+			"ret: %d, errno: %d, error description: %s",
+			thread_name, cpus_str,
+			ret, errno, strerror(errno));
+	}
+
+	return ret;
 }

 int
--
2.34.1
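A note on the sample output in this patch: it shows "ret: 22" next to "errno: 0" and "Success". pthread_setaffinity_np(), like other pthread functions, reports failure through its return value and does not set errno, so strerror(errno) describes the wrong number. Below is a minimal sketch of building the message from the return value instead. The helper name format_affinity_error() and the message layout are hypothetical, not part of the patch or of DPDK.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a DPDK API): build a log message from the
 * value returned by a pthread_* call. pthread functions return the error
 * number directly and leave errno alone, so the description has to come
 * from strerror(ret), not strerror(errno).
 */
static int
format_affinity_error(int ret, const char *thread_name,
		char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	/* snprintf() returns the length the full message needs. */
	return snprintf(buf, len,
		"Cannot set affinity for thread %s, ret: %d (%s)",
		thread_name, ret, strerror(ret));
}
```

Called with ret = 22 (EINVAL on Linux), the description comes from the failing call itself rather than from whatever errno last held.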
* Re: [PATCH] eal/linux: enhanced error handling for affinity
From: Stephen Hemminger @ 2024-04-24 15:50 UTC
To: Jianyue Wu; +Cc: dev
On Tue, 23 Apr 2024 11:02:43 +0800
Jianyue Wu <wujianyue000@163.com> wrote:
> Improve the robustness of setting thread affinity in DPDK
> by adding detailed error logging.
Is this an error you saw in your application or something inside DPDK?
> Changes:
> 1. Check the return value of pthread_setaffinity_np() and log an error
> if the call fails.
Not sure this is necessary. The rte_thread functions are intended to
be an OS-independent wrapper for threads. Does it need to be this chatty?
> 2. Include the current thread name, the intended CPU set, and a detailed
> error message in the log.
This introduces more code and ends up being Linux/BSD-specific, only
for the case where the application did something wrong.
* Re:Re: [PATCH] eal/linux: enhanced error handling for affinity
From: 吴剑跃 @ 2024-04-25 1:08 UTC
To: Stephen Hemminger; +Cc: dev
Hello, Stephen,
Good day
The issue is not caused by DPDK itself; it arises when the DPDK worker process attempts to set affinity to CPUs outside the set allowed by the cgroup cpuset settings.
The original error output is:
PANIC in rte_eal_init():
Cannot set affinity
# Callstacks.
Finding the detailed reason for the failure was challenging, so I added extra print statements to help diagnose the issue.
I understand your concern about maintaining OS independence with the rte_thread functions. This change aims to provide more context when errors occur, facilitating quicker troubleshooting. I agree that this introduces more code and could be seen as platform-specific. Perhaps we could implement this conditionally, only for platforms where such detailed logging is supported and useful.
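To make the failure mode concrete: on Linux the kernel intersects the requested mask with the CPUs that the cgroup cpuset permits, and the call fails with EINVAL when nothing is left. A rough pre-check could parse a CPU list in the kernel's "0-3,8" list format and test whether a given CPU is in it. This is a sketch only. The helper name cpu_in_list() is hypothetical, and where the list comes from (for example the cgroup v2 file cpuset.cpus.effective) is an assumption about the deployment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return true if `cpu` appears in `list`, a
 * comma-separated list of CPU numbers and inclusive ranges such as
 * "0-3,8". A trailing newline is tolerated; malformed input ends the
 * scan and yields false.
 */
static bool
cpu_in_list(const char *list, long cpu)
{
	const char *p = list;

	assert(list != NULL);

	while (*p != '\0' && *p != '\n') {
		char *end;
		long lo = strtol(p, &end, 10);
		long hi = lo;

		if (end == p)
			return false;	/* no digits where a number was expected */
		p = end;
		if (*p == '-') {
			/* Range "lo-hi": parse the upper bound. */
			p++;
			hi = strtol(p, &end, 10);
			if (end == p)
				return false;
			p = end;
		}
		if (cpu >= lo && cpu <= hi)
			return true;
		if (*p == ',')
			p++;
		else if (*p != '\0' && *p != '\n')
			return false;	/* unexpected separator */
	}
	return false;
}
```

Checking each requested lcore against such a list before calling the affinity API would let the application report the offending CPU number directly.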
* Re:Re:Re: [PATCH] eal/linux: enhanced error handling for affinity
From: 吴剑跃 @ 2024-04-25 5:40 UTC
To: Stephen Hemminger; +Cc: dev
After reviewing the code, I believe that the combination of the __linux__ and _GNU_SOURCE macros effectively confirms whether the pthread_getname_np() API can be utilized. I will proceed with adding them. Thank you~
#if defined(__linux__) && defined(_GNU_SOURCE)
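A sketch of how that guard might be used. The wrapper name thread_name_or_unknown() and the "unknown" fallback string are hypothetical, and the guard is assumed to match how glibc exposes the function. The point is only that pthread_getname_np() is compiled in where it is declared, and every other build still gets a usable string.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/*
 * Hypothetical wrapper: copy the name of thread `tid` into `buf`.
 * pthread_getname_np() is a non-portable extension, so it is used only
 * under the proposed guard; every other build, and any failed lookup,
 * falls back to a fixed placeholder.
 */
static void
thread_name_or_unknown(pthread_t tid, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

#if defined(__linux__) && defined(_GNU_SOURCE)
	if (pthread_getname_np(tid, buf, len) == 0 && buf[0] != '\0')
		return;
#else
	(void)tid;	/* unused when the extension is unavailable */
#endif
	snprintf(buf, len, "%s", "unknown");
}
```

Because the fallback path never fails, the caller can log the name unconditionally instead of emitting a second error when the lookup does not work.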
* Re: [PATCH] eal/linux: enhanced error handling for affinity
From: Stephen Hemminger @ 2024-04-25 15:04 UTC
To: 吴剑跃; +Cc: dev
On Thu, 25 Apr 2024 13:40:21 +0800 (CST)
吴剑跃 <wujianyue000@163.com> wrote:
> After reviewing the code, I believe that the combination of the __linux__ and _GNU_SOURCE macros effectively confirms whether the pthread_getname_np() API can be utilized. I will proceed with adding them. Thank you~
> #if defined(__linux__) && defined(_GNU_SOURCE)
>
>
> At 2024-04-25 09:08:59, "吴剑跃" <wujianyue000@163.com> wrote:
>
> Hello, Stephen,
>
>
>
> Good day
> The issue is not caused by DPDK itself, but arises when the DPDK worker process attempts to set affinity to a cpuset that exceeds the limits set by the cgroup cpuset settings.
> Original error prints are:
> PANIC in rte_eal_init():
> Cannot set affinity
> # Callstacks.
>
>
> Finding the detailed reason for the failure was challenging, so I added extra print statements to help diagnose the issue.
> I understand your concern about maintaining OS independence with the rte_thread functions. This change aims to provide more context when errors occur, facilitating quicker troubleshooting. I agree that this introduces more code and could be seen as platform-specific. Perhaps we could implement this conditionally, only for platforms where such detailed logging is supported and useful.
>
My point is that just giving the kernel error should be sufficient, rather than having
to reformat the incoming arguments. The arguments are coming from the command line, and what I
would do is look at the error and the command line arguments to the application, as well as
any kernel logs.
* Re:Re: [PATCH] eal/linux: enhanced error handling for affinity
From: Jianyue Wu @ 2024-04-26 3:14 UTC
To: Stephen Hemminger; +Cc: dev
Hello, Stephen,
Understood. Yesterday I added new changes to the patch; how can I recall that patch?
Thank you~