DPDK patches and discussions
 help / color / mirror / Atom feed
* help with pthread_t deprecation / api changes
@ 2022-11-30 22:54 Tyler Retzlaff
  2022-12-02  1:12 ` Tyler Retzlaff
  0 siblings, 1 reply; 19+ messages in thread
From: Tyler Retzlaff @ 2022-11-30 22:54 UTC (permalink / raw)
  To: dev; +Cc: thomas

hi folks,

i'd like to continue work moving to our platform abstracted rte_thread
but ran into a hiccup. for some recent and not so recent apis it appears
they managed to slip in without ever being __experimental. 

as a function of the dpdk project api/abi policy this means we can't
change or remove some of these functions without following the
deprecation process.

the apis that are causing me immediate difficulty are
rte_thread_setname and rte_ctrl_thread_create.

i think the least painful path forward to deprecating and removing these
apis is probably just to introduce the replacements with new names.

1. introduce functions with the following names marked as
   __experimental.

   rte_control_thread_create(rte_thread_t *, ...)
   rte_thread_set_name(rte_thread_t, ...)
   rte_thread_get_name(rte_thread_t, ...)

   along with the new functions, new unit tests will be included.

2. update dpdk internal implementation to use the new functions.

3. immediately remove the following functions from the public headers
   and issue an api deprecation notice for the functions not marked
   experimental.

   rte_ctrl_thread_create(pthread_t *, ...)
   rte_thread_setname(pthread_t *, ...)

4. when the new functions have their __experimental marking removed
   issue an abi deprecation notice for the functions from (2).

i'm open to feedback/suggestions of a better approach if anyone has one
to offer.

thanks!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-11-30 22:54 help with pthread_t deprecation / api changes Tyler Retzlaff
@ 2022-12-02  1:12 ` Tyler Retzlaff
  2022-12-02  8:03   ` Morten Brørup
  0 siblings, 1 reply; 19+ messages in thread
From: Tyler Retzlaff @ 2022-12-02  1:12 UTC (permalink / raw)
  To: dev, thomas, david.marchand, mb, stephen

On Wed, Nov 30, 2022 at 02:54:27PM -0800, Tyler Retzlaff wrote:
> hi folks,
> 
> i'd like to continue work moving to our platform abstracted rte_thread
> but ran into a hiccup. for some recent and not so recent apis it appears
> they managed to slip in without ever being __experimental. 
> 
> as a function of the dpdk project api/abi policy this means we can't
> change or remove some of these functions without following the
> deprecation process.
> 
> the apis that are causing me immediate difficulty are
> rte_thread_setname and rte_ctrl_thread_create.

after looking in more detail at our current implementations of these
functions i would like to backtrack a little and limit the scope of 
discussion to rte_thread_setname and rte_thread_getname.

as eal functions they aren't doing a good job in abstracting the
environment for applications, meaning an application would have to wrap
their use in platform conditional checks.

current status.

rte_thread_getname
  * freebsd, no implementation and it appears no possible support
  * linux, implementation conditional on __GLIBC_PREREQ(2, 12)
  * windows, can be implemented but isn't, noop success
  * fortunately is marked __rte_experimental
  * called in 1 place only from eal (logging)

i would propose to present a consistent abstraction the best thing to do
here is just remove rte_thread_getname. providing a version that
requires an application to do conditional dances / compilation based on
platform gains nothing.

rte_thread_setname
  * freebsd, implemented, imposes no limit on name length, suppresses errors
  * linux, implementation conditional on __GLIBC_PREREQ(2, 12), imposes
    limit of 16 (including terminating NUL) on name length, may return an
    error
  * windows, can be implemented, no explicit limit on name length, may
    return errors
  * unfortunately not marked __rte_experimental

i would propose to provide a replacement with the name
rte_thread_set_name with more consistent behavior across the 3 platforms.

  * returns errors for platforms that return errors, but the caller
    is free to ignore them.
  * explicit limit of 16 (including terminating NUL) on name length,
    names that are provided that exceed the limit are truncated without
    error.

your feedback would be appreciated.

thanks

> i think the least painful path forward to deprecating and removing these
> apis is probably just to introduce the replacements with new names.
> 
> 1. introduce functions with the following names marked as
>    __experimental.
> 
>    rte_control_thread_create(rte_thread_t *, ...)
>    rte_thread_set_name(rte_thread_t, ...)
>    rte_thread_get_name(rte_thread_t, ...)
> 
>    along with the new functions, new unit tests will be included.
> 
> 2. update dpdk internal implementation to use the new functions.
> 
> 3. immediately remove the following functions from the public headers
>    and issue an api deprecation notice for the functions not marked
>    experimental.
> 
>    rte_ctrl_thread_create(pthread_t *, ...)
>    rte_thread_setname(pthread_t *, ...)
> 
> 4. when the new functions have their __experimental marking removed
>    issue an abi deprecation notice for the functions from (2).
> 
> i'm open to feedback/suggestions of a better approach if anyone has one
> to offer.
> 
> thanks!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: help with pthread_t deprecation / api changes
  2022-12-02  1:12 ` Tyler Retzlaff
@ 2022-12-02  8:03   ` Morten Brørup
  2022-12-02 19:57     ` Tyler Retzlaff
  0 siblings, 1 reply; 19+ messages in thread
From: Morten Brørup @ 2022-12-02  8:03 UTC (permalink / raw)
  To: Tyler Retzlaff, dev, thomas, david.marchand, stephen, Bruce Richardson

+Bruce, FreeBSD EAL maintainer

> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> Sent: Friday, 2 December 2022 02.12
> 
> On Wed, Nov 30, 2022 at 02:54:27PM -0800, Tyler Retzlaff wrote:
> > hi folks,
> >
> > i'd like to continue work moving to our platform abstracted
> rte_thread
> > but ran into a hiccup. for some recent and not so recent apis it
> appears
> > they managed to slip in without ever being __experimental.
> >
> > as a function of the dpdk project api/abi policy this means we can't
> > change or remove some of these functions without following the
> > deprecation process.
> >
> > the apis that are causing me immediate difficulty are
> > rte_thread_setname and rte_ctrl_thread_create.
> 
> after looking in more detail at our current implementations of these
> functions i would like to backtrack a little and limit the scope of
> discussion to rte_thread_setname and rte_thread_getname.
> 
> as eal functions they aren't doing a good job in abstracting the
> environment for applications, meaning an application would have to wrap
> their use in platform conditional checks.

<rant>
This is one of the consequences of a bloated EAL.

How is an application supposed to run on top of an EAL that isn't fully implemented by the underlying environments?

I have complained about this before, but am not afraid to repeat it:
I wish the EAL wasn't treated as some catch-all library where it is easy to throw in new features, which really belong in separate libraries!
</rant>

> 
> current status.
> 
> rte_thread_getname
>   * freebsd, no implementation and it appears no possible support
>   * linux, implementation conditional on __GLIBC_PREREQ(2, 12)
>   * windows, can be implemented but isn't, noop success
>   * fortunately is marked __rte_experimental
>   * called in 1 place only from eal (logging)
> 
> i would propose to present a consistent abstraction the best thing to
> do
> here is just remove rte_thread_getname. providing a version that
> requires an application to do conditional dances / compilation based on
> platform gains nothing.

My initial thought was that it should be provided for symmetry reasons.
However, thinking about it, it is probably only used for debugging, trace, and similar. It is probably not used in any meaningful way by applications. So I won't oppose to removing it.

Alternatively:
If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.

> 
> rte_thread_setname
>   * freebsd, implemented, imposes no limit on name length, suppresses
> errors
>   * linux, implementation conditional on __GLIBC_PREREQ(2, 12), imposes
>     limit of 16 (including terminating NUL) on name length, may return
> an
>     error
>   * windows, can be implemented, no explicit limit on name length, may
>     return errors
>   * unfortunately not marked __rte_experimental
> 
> i would propose to provide a replacement with the name
> rte_thread_set_name with more consistent behavior across the 3
> platforms.
> 
>   * returns errors for platforms that return errors, but the caller
>     is free to ignore them.
>   * explicit limit of 16 (including terminating NUL) on name length,
>     names that are provided that exceed the limit are truncated without
>     error.

The function should not silently modify the provided data (i.e. truncate the name). I would prefer doing both: Return ERANGE (like pthread_setname_np()), but use the truncated name anyway. Then the application can choose to ignore or deal with the return code.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-02  8:03   ` Morten Brørup
@ 2022-12-02 19:57     ` Tyler Retzlaff
  2022-12-09  7:53       ` Thomas Monjalon
  0 siblings, 1 reply; 19+ messages in thread
From: Tyler Retzlaff @ 2022-12-02 19:57 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev, thomas, david.marchand, stephen, Bruce Richardson

On Fri, Dec 02, 2022 at 09:03:25AM +0100, Morten Brørup wrote:
> +Bruce, FreeBSD EAL maintainer
> 
> 
> <rant>
> This is one of the consequences of a bloated EAL.
> 
> How is an application supposed to run on top of an EAL that isn't fully implemented by the underlying environments?

yep, i'm aware of your other thread and i am in complete agreement with
you. i think as a rule of thumb if we can't reasonably provide an
abstraction that behaves consistently then it isn't something that
should be in the eal at all.

unfortunately i'm dealing with things that pre-date that enlightenment.

> I have complained about this before, but am not afraid to repeat it:
> I wish the EAL wasn't treated as some catch-all library where it is easy to throw in new features, which really belong in separate libraries!
> </rant>
> 
> > 
> > current status.
> > 
> > rte_thread_getname
> >   * freebsd, no implementation and it appears no possible support
> >   * linux, implementation conditional on __GLIBC_PREREQ(2, 12)
> >   * windows, can be implemented but isn't, noop success
> >   * fortunately is marked __rte_experimental
> >   * called in 1 place only from eal (logging)
> > 
> > i would propose to present a consistent abstraction the best thing to
> > do
> > here is just remove rte_thread_getname. providing a version that
> > requires an application to do conditional dances / compilation based on
> > platform gains nothing.
> 
> My initial thought was that it should be provided for symmetry reasons.
> However, thinking about it, it is probably only used for debugging, trace, and similar. It is probably not used in any meaningful way by applications. So I won't oppose to removing it.
> 
> Alternatively:
> If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.

i think this raises a good question. is the purpose of setting a thread name
meant to be something we can use from the application or is it something that
is for debugging diagnostics and may be a best effort?

right now it is trying to be both and the both is trying to build on
unaligned platform facilities. one thing we could do here is decouple
the two.

if the purpose was to allow a name to be get,set by the application it
could just be stored with dpdk lcore state and does not need to be fed
into pthread_{set,get}name_np.  it can continue to be consumed by
logging/the application. extending that we could say there is a
'side-effect' that internally if your platform does support
pthread_{set,get}name_np then it will be called as well, but if it fails
we don't care.

> 
> > 
> > rte_thread_setname
> >   * freebsd, implemented, imposes no limit on name length, suppresses
> > errors
> >   * linux, implementation conditional on __GLIBC_PREREQ(2, 12), imposes
> >     limit of 16 (including terminating NUL) on name length, may return
> > an
> >     error
> >   * windows, can be implemented, no explicit limit on name length, may
> >     return errors
> >   * unfortunately not marked __rte_experimental
> > 
> > i would propose to provide a replacement with the name
> > rte_thread_set_name with more consistent behavior across the 3
> > platforms.
> > 
> >   * returns errors for platforms that return errors, but the caller
> >     is free to ignore them.
> >   * explicit limit of 16 (including terminating NUL) on name length,
> >     names that are provided that exceed the limit are truncated without
> >     error.
> 
> The function should not silently modify the provided data (i.e. truncate the name). I would prefer doing both: Return ERANGE (like pthread_setname_np()), but use the truncated name anyway. Then the application can choose to ignore or deal with the return code.

i agree with this, but the current linux implementation is already doing
it, i'd be inclined to say if we have a limit we should just fail the
set.

i'm beginning to think that complete removal of set,get thread name is
what we want here and if we need names for things like logging we
provide a facility not tied to the underlying platform threading.

others feel free to chime in.

thanks!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-02 19:57     ` Tyler Retzlaff
@ 2022-12-09  7:53       ` Thomas Monjalon
  2022-12-09 16:48         ` Stephen Hemminger
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2022-12-09  7:53 UTC (permalink / raw)
  To: Morten Brørup, Tyler Retzlaff
  Cc: dev, david.marchand, stephen, Bruce Richardson

02/12/2022 20:57, Tyler Retzlaff:
> On Fri, Dec 02, 2022 at 09:03:25AM +0100, Morten Brørup wrote:
> > +Bruce, FreeBSD EAL maintainer
> > 
> > 
> > <rant>
> > This is one of the consequences of a bloated EAL.
> > 
> > How is an application supposed to run on top of an EAL that isn't fully implemented by the underlying environments?
> 
> yep, i'm aware of your other thread and i am in complete agreement with
> you. i think as a rule of thumb if we can't reasonably provide an
> abstraction that behaves consistently then it isn't something that
> should be in the eal at all.
> 
> unfortunately i'm dealing with things that pre-date that enlightenment.
> 
> > I have complained about this before, but am not afraid to repeat it:
> > I wish the EAL wasn't treated as some catch-all library where it is easy to throw in new features, which really belong in separate libraries!
> > </rant>
> > 
> > > 
> > > current status.
> > > 
> > > rte_thread_getname
> > >   * freebsd, no implementation and it appears no possible support
> > >   * linux, implementation conditional on __GLIBC_PREREQ(2, 12)
> > >   * windows, can be implemented but isn't, noop success
> > >   * fortunately is marked __rte_experimental
> > >   * called in 1 place only from eal (logging)
> > > 
> > > i would propose to present a consistent abstraction the best thing to
> > > do
> > > here is just remove rte_thread_getname. providing a version that
> > > requires an application to do conditional dances / compilation based on
> > > platform gains nothing.
> > 
> > My initial thought was that it should be provided for symmetry reasons.
> > However, thinking about it, it is probably only used for debugging, trace, and similar. It is probably not used in any meaningful way by applications. So I won't oppose to removing it.
> > 
> > Alternatively:
> > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.
> 
> i think this raises a good question. is the purpose of setting a thread name
> meant to be something we can use from the application or is it something that
> is for debugging diagnostics and may be a best effort?

I think yes it is only for debugging.
So best effort looks to be a good approach.
I'm not sure you need to replace the functions.
Can you just complete the implementations?

> right now it is trying to be both and the both is trying to build on
> unaligned platform facilities. one thing we could do here is decouple
> the two.
> 
> if the purpose was to allow a name to be get,set by the application it
> could just be stored with dpdk lcore state and does not need to be fed
> into pthread_{set,get}name_np.  it can continue to be consumed by
> logging/the application.

In rte_ctrl_thread_create() the name is provided and limited to 16.

> extending that we could say there is a
> 'side-effect' that internally if your platform does support
> pthread_{set,get}name_np then it will be called as well, but if it fails
> we don't care.
> 
> > 
> > > 
> > > rte_thread_setname
> > >   * freebsd, implemented, imposes no limit on name length, suppresses
> > > errors
> > >   * linux, implementation conditional on __GLIBC_PREREQ(2, 12), imposes
> > >     limit of 16 (including terminating NUL) on name length, may return
> > > an
> > >     error
> > >   * windows, can be implemented, no explicit limit on name length, may
> > >     return errors
> > >   * unfortunately not marked __rte_experimental
> > > 
> > > i would propose to provide a replacement with the name
> > > rte_thread_set_name with more consistent behavior across the 3
> > > platforms.
> > > 
> > >   * returns errors for platforms that return errors, but the caller
> > >     is free to ignore them.
> > >   * explicit limit of 16 (including terminating NUL) on name length,
> > >     names that are provided that exceed the limit are truncated without
> > >     error.
> > 
> > The function should not silently modify the provided data (i.e. truncate the name). I would prefer doing both: Return ERANGE (like pthread_setname_np()), but use the truncated name anyway. Then the application can choose to ignore or deal with the return code.
> 
> i agree with this, but the current linux implementation is already doing
> it, i'd be inclined to say if we have a limit we should just fail the
> set.
> 
> i'm beginning to think that complete removal of set,get thread name is
> what we want here and if we need names for things like logging we
> provide a facility not tied to the underlying platform threading.
> 
> others feel free to chime in.
> 
> thanks!
> 






^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-09  7:53       ` Thomas Monjalon
@ 2022-12-09 16:48         ` Stephen Hemminger
  2022-12-09 20:06           ` Tyler Retzlaff
  2022-12-09 21:14           ` Thomas Monjalon
  0 siblings, 2 replies; 19+ messages in thread
From: Stephen Hemminger @ 2022-12-09 16:48 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Morten Brørup, Tyler Retzlaff, dev, david.marchand,
	Bruce Richardson

On Fri, 09 Dec 2022 08:53:57 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:

> > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.  
> > 
> > i think this raises a good question. is the purpose of setting a thread name
> > meant to be something we can use from the application or is it something that
> > is for debugging diagnostics and may be a best effort?  
> 
> I think yes it is only for debugging.
> So best effort looks to be a good approach.
> I'm not sure you need to replace the functions.
> Can you just complete the implementations?


Surprisingly, thread names are not preserved in core dumps.
The core dump standard used by Linux does not put thread name in the image.
Since this is a ELF ABI unlikely to be ever be added.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-09 16:48         ` Stephen Hemminger
@ 2022-12-09 20:06           ` Tyler Retzlaff
  2022-12-09 21:13             ` Thomas Monjalon
  2022-12-09 21:14           ` Thomas Monjalon
  1 sibling, 1 reply; 19+ messages in thread
From: Tyler Retzlaff @ 2022-12-09 20:06 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Thomas Monjalon, Morten Brørup, dev, david.marchand,
	Bruce Richardson

hey,

combining the reply to both Thomas and Stephen since i think this series
http://mails.dpdk.org/archives/dev/2022-December/257238.html addresses
both feedback comments.

On Fri, Dec 09, 2022 at 08:48:14AM -0800, Stephen Hemminger wrote:
> On Fri, 09 Dec 2022 08:53:57 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.  
> > > 
> > > i think this raises a good question. is the purpose of setting a thread name
> > > meant to be something we can use from the application or is it something that
> > > is for debugging diagnostics and may be a best effort?  
> > 

> Thomas Monjalon <thomas@monjalon.net> wrote:

> I think yes it is only for debugging.
> So best effort looks to be a good approach.
> I'm not sure you need to replace the functions.
> Can you just complete the implementations?

the patch series put forward allows a set / get name per-lcore, where
you get implicit (but not exposed via the eal api) call to underlying
platform thread setname.

the intent here is if you have it and it works you'll get it and if you
don't you won't but the eal doesn't force the application to deal with it
conditionally on a per-platform basis.

> Stephen wrote:
> 
> Surprisingly, thread names are not preserved in core dumps.
> The core dump standard used by Linux does not put thread name in the image.
> Since this is a ELF ABI unlikely to be ever be added.

the patchset addresses this by actually keeping a copy of the name set,
so it will be available in the coredump.

the intent here is to make it available for use by the application i.e.
get that works on all platforms, but also you can actually pull the name
out under a debugger or a dump and does not require any conditional
dancing per-platform by the application.

as an aside there are 2 series up for review that finally clean the
remaining platform-specific thread references from the eal public
interface.

http://mails.dpdk.org/archives/dev/2022-December/257238.html
http://mails.dpdk.org/archives/dev/2022-December/257413.html

the set get name api patch series i'm preparing a v2 for due to some
minor things caught by the ci and an issue with mingw but otherwise if
we can get these in it will unblock a lot of the internal detail
cleanups we've been trying to accomplish.

really appreciate it guys.

thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-09 20:06           ` Tyler Retzlaff
@ 2022-12-09 21:13             ` Thomas Monjalon
  2022-12-09 23:49               ` Tyler Retzlaff
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2022-12-09 21:13 UTC (permalink / raw)
  To: Tyler Retzlaff
  Cc: Stephen Hemminger, dev, Morten Brørup, david.marchand,
	Bruce Richardson

09/12/2022 21:06, Tyler Retzlaff:
> On Fri, Dec 09, 2022 at 08:48:14AM -0800, Stephen Hemminger wrote:
> > On Fri, 09 Dec 2022 08:53:57 +0100
> > Thomas Monjalon <thomas@monjalon.net> wrote:
> > 
> > > > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.  
> > > > 
> > > > i think this raises a good question. is the purpose of setting a thread name
> > > > meant to be something we can use from the application or is it something that
> > > > is for debugging diagnostics and may be a best effort?  
> > > 
> 
> > Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > I think yes it is only for debugging.
> > So best effort looks to be a good approach.
> > I'm not sure you need to replace the functions.
> > Can you just complete the implementations?
> 
> the patch series put forward allows a set / get name per-lcore, where
> you get implicit (but not exposed via the eal api) call to underlying
> platform thread setname.

I don't understand how lcore ID and thread ID are connected.
You can run multiple control threads on a single lcore.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-09 16:48         ` Stephen Hemminger
  2022-12-09 20:06           ` Tyler Retzlaff
@ 2022-12-09 21:14           ` Thomas Monjalon
  2022-12-09 22:38             ` Stephen Hemminger
  1 sibling, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2022-12-09 21:14 UTC (permalink / raw)
  To: Stephen Hemminger, Tyler Retzlaff
  Cc: Morten Brørup, dev, david.marchand, Bruce Richardson

09/12/2022 17:48, Stephen Hemminger:
> On Fri, 09 Dec 2022 08:53:57 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.  
> > > 
> > > i think this raises a good question. is the purpose of setting a thread name
> > > meant to be something we can use from the application or is it something that
> > > is for debugging diagnostics and may be a best effort?  
> > 
> > I think yes it is only for debugging.
> > So best effort looks to be a good approach.
> > I'm not sure you need to replace the functions.
> > Can you just complete the implementations?
> 
> 
> Surprisingly, thread names are not preserved in core dumps.
> The core dump standard used by Linux does not put thread name in the image.
> Since this is a ELF ABI unlikely to be ever be added.

What is missing exactly to have thread name in the core dump?



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-09 21:14           ` Thomas Monjalon
@ 2022-12-09 22:38             ` Stephen Hemminger
  2022-12-09 23:55               ` Tyler Retzlaff
  0 siblings, 1 reply; 19+ messages in thread
From: Stephen Hemminger @ 2022-12-09 22:38 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Tyler Retzlaff, Morten Brørup, dev, david.marchand,
	Bruce Richardson

On Fri, 09 Dec 2022 22:14:33 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:

> 09/12/2022 17:48, Stephen Hemminger:
> > On Fri, 09 Dec 2022 08:53:57 +0100
> > Thomas Monjalon <thomas@monjalon.net> wrote:
> >   
> > > > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.    
> > > > 
> > > > i think this raises a good question. is the purpose of setting a thread name
> > > > meant to be something we can use from the application or is it something that
> > > > is for debugging diagnostics and may be a best effort?    
> > > 
> > > I think yes it is only for debugging.
> > > So best effort looks to be a good approach.
> > > I'm not sure you need to replace the functions.
> > > Can you just complete the implementations?  
> > 
> > 
> > Surprisingly, thread names are not preserved in core dumps.
> > The core dump standard used by Linux does not put thread name in the image.
> > Since this is a ELF ABI unlikely to be ever be added.  
> 
> What is missing exactly to have thread name in the core dump?
> 
> 

Linux core dump file format is ELF.
The thread info is storewd in the file notes as NT_PRPSINFO
which contains info but not the thread name. In the kernel
thread name is under comm.


typedef struct prpsinfo {       /* Information about process                 */
  unsigned char  pr_state;      /* Numeric process state                     */
  char           pr_sname;      /* Char for pr_state                         */
  unsigned char  pr_zomb;       /* Zombie                                    */
  signed char    pr_nice;       /* Nice val                                  */
  unsigned long  pr_flag;       /* Flags                                     */

  uint32_t       pr_uid;        /* User ID                                   */
  uint32_t       pr_gid;        /* Group ID                                  */

  pid_t          pr_pid;        /* Process ID                                */
  pid_t          pr_ppid;       /* Parent's process ID                       */
  pid_t          pr_pgrp;       /* Group ID                                  */
  pid_t          pr_sid;        /* Session ID                                */
  char           pr_fname[16];  /* Filename of executable                    */
  char           pr_psargs[80]; /* Initial part of arg list                  */
} prpsinfo;


Stack Overflow leads to this pages
https://www.gabriel.urdhr.fr/2015/05/29/core-file/
https://uhlo.blogspot.com/2012/05/brief-look-into-core-dumps.html

Only know this because of investigating how to get thread names
to show up in Azure with Watson.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-09 21:13             ` Thomas Monjalon
@ 2022-12-09 23:49               ` Tyler Retzlaff
  2022-12-11  7:50                 ` Thomas Monjalon
  0 siblings, 1 reply; 19+ messages in thread
From: Tyler Retzlaff @ 2022-12-09 23:49 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Stephen Hemminger, dev, Morten Brørup, david.marchand,
	Bruce Richardson

On Fri, Dec 09, 2022 at 10:13:44PM +0100, Thomas Monjalon wrote:
> 09/12/2022 21:06, Tyler Retzlaff:
> > On Fri, Dec 09, 2022 at 08:48:14AM -0800, Stephen Hemminger wrote:
> > > On Fri, 09 Dec 2022 08:53:57 +0100
> > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 
> > > > > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.  
> > > > > 
> > > > > i think this raises a good question. is the purpose of setting a thread name
> > > > > meant to be something we can use from the application or is it something that
> > > > > is for debugging diagnostics and may be a best effort?  
> > > > 
> > 
> > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > 
> > > I think yes it is only for debugging.
> > > So best effort looks to be a good approach.
> > > I'm not sure you need to replace the functions.
> > > Can you just complete the implementations?
> > 
> > the patch series put forward allows a set / get name per-lcore, where
> > you get implicit (but not exposed via the eal api) call to underlying
> > platform thread setname.
> 
> I don't understand how lcore ID and thread ID are connected.
> You can run multiple control threads on a single lcore.
> 

correct.

the new public api allows the set of a name on an lcore only. as a
side-effect if the platform supports it the name is also set on the
thread_id associated with the lcore (from lcore_config[].thread_id).

for control threads you just get the side-effect if the platform
supports it, otherwise it is a noop.

if we want set / get name at all this seemed the most usable balance
between application consumption with debug use where available. if this
isn't acceptable then i would suggest we simply remove both
rte_thread_{get,set}name because as a debugging facility we cannot offer
a consistent abstracted api which means it shouldn't be in the eal at
all.

thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-09 22:38             ` Stephen Hemminger
@ 2022-12-09 23:55               ` Tyler Retzlaff
  0 siblings, 0 replies; 19+ messages in thread
From: Tyler Retzlaff @ 2022-12-09 23:55 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Thomas Monjalon, Morten Brørup, dev, david.marchand,
	Bruce Richardson

On Fri, Dec 09, 2022 at 02:38:49PM -0800, Stephen Hemminger wrote:
> On Fri, 09 Dec 2022 22:14:33 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > 09/12/2022 17:48, Stephen Hemminger:
> > > On Fri, 09 Dec 2022 08:53:57 +0100
> > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > >   
> > > > > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.    
> > > > > 
> > > > > i think this raises a good question. is the purpose of setting a thread name
> > > > > meant to be something we can use from the application or is it something that
> > > > > is for debugging diagnostics and may be a best effort?    
> > > > 
> > > > I think yes it is only for debugging.
> > > > So best effort looks to be a good approach.
> > > > I'm not sure you need to replace the functions.
> > > > Can you just complete the implementations?  
> > > 
> > > 
> > > Surprisingly, thread names are not preserved in core dumps.
> > > The core dump standard used by Linux does not put thread name in the image.
> > > Since this is a ELF ABI unlikely to be ever be added.  
> > 
> > What is missing exactly to have thread name in the core dump?
> > 
> > 
> 
> Linux core dump file format is ELF.
> The thread info is storewd in the file notes as NT_PRPSINFO
> which contains info but not the thread name. In the kernel
> thread name is under comm.
> 
> 
> typedef struct prpsinfo {       /* Information about process                 */
>   unsigned char  pr_state;      /* Numeric process state                     */
>   char           pr_sname;      /* Char for pr_state                         */
>   unsigned char  pr_zomb;       /* Zombie                                    */
>   signed char    pr_nice;       /* Nice val                                  */
>   unsigned long  pr_flag;       /* Flags                                     */
> 
>   uint32_t       pr_uid;        /* User ID                                   */
>   uint32_t       pr_gid;        /* Group ID                                  */
> 
>   pid_t          pr_pid;        /* Process ID                                */
>   pid_t          pr_ppid;       /* Parent's process ID                       */
>   pid_t          pr_pgrp;       /* Group ID                                  */
>   pid_t          pr_sid;        /* Session ID                                */
>   char           pr_fname[16];  /* Filename of executable                    */
>   char           pr_psargs[80]; /* Initial part of arg list                  */
> } prpsinfo;
> 
> 
> Stack Overflow leads to this pages
> https://www.gabriel.urdhr.fr/2015/05/29/core-file/
> https://uhlo.blogspot.com/2012/05/brief-look-into-core-dumps.html
> 
> Only know this because of investigating how to get thread names
> to show up in Azure with Watson.

from a dpdk specific perspective if we want to consistently have a
thread name in a dump / coredump then we have a better chance of
success just storing it in our applications namespace ourselves.
relying on platform-specific facilities to preserve it seems hit and
miss at best.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-09 23:49               ` Tyler Retzlaff
@ 2022-12-11  7:50                 ` Thomas Monjalon
  2022-12-12 17:45                   ` Tyler Retzlaff
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2022-12-11  7:50 UTC (permalink / raw)
  To: Tyler Retzlaff
  Cc: Stephen Hemminger, dev, Morten Brørup, david.marchand,
	Bruce Richardson

10/12/2022 00:49, Tyler Retzlaff:
> On Fri, Dec 09, 2022 at 10:13:44PM +0100, Thomas Monjalon wrote:
> > 09/12/2022 21:06, Tyler Retzlaff:
> > > On Fri, Dec 09, 2022 at 08:48:14AM -0800, Stephen Hemminger wrote:
> > > > On Fri, 09 Dec 2022 08:53:57 +0100
> > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 
> > > > > > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.  
> > > > > > 
> > > > > > i think this raises a good question. is the purpose of setting a thread name
> > > > > > meant to be something we can use from the application or is it something that
> > > > > > is for debugging diagnostics and may be a best effort?  
> > > > > 
> > > 
> > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 
> > > > I think yes it is only for debugging.
> > > > So best effort looks to be a good approach.
> > > > I'm not sure you need to replace the functions.
> > > > Can you just complete the implementations?
> > > 
> > > the patch series put forward allows a set / get name per-lcore, where
> > > you get implicit (but not exposed via the eal api) call to underlying
> > > platform thread setname.
> > 
> > I don't understand how lcore ID and thread ID are connected.
> > You can run multiple control threads on a single lcore.
> > 
> 
> correct.
> 
> the new public api allows the set of a name on an lcore only. as a
> side-effect if the platform supports it the name is also set on the
> thread_id associated with the lcore (from lcore_config[].thread_id).
> 
> for control threads you just get the side-effect if the platform
> supports it, otherwise it is a noop.

What does it mean? which side effect? what must be supported?

> if we want set / get name at all this seemed the most usable balance
> between application consumption with debug use where available. if this
> isn't acceptable then i would suggest we simply remove both
> rte_thread_{get,set}name because as a debugging facility we cannot offer
> a consistent abstracted api which means it shouldn't be in the eal at
> all.

Why it cannot be consistent?
Please be more precise.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-11  7:50                 ` Thomas Monjalon
@ 2022-12-12 17:45                   ` Tyler Retzlaff
  2022-12-13  8:32                     ` Thomas Monjalon
  0 siblings, 1 reply; 19+ messages in thread
From: Tyler Retzlaff @ 2022-12-12 17:45 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Stephen Hemminger, dev, Morten Brørup, david.marchand,
	Bruce Richardson

On Sun, Dec 11, 2022 at 08:50:48AM +0100, Thomas Monjalon wrote:
> 10/12/2022 00:49, Tyler Retzlaff:
> > On Fri, Dec 09, 2022 at 10:13:44PM +0100, Thomas Monjalon wrote:
> > > 09/12/2022 21:06, Tyler Retzlaff:
> > > > On Fri, Dec 09, 2022 at 08:48:14AM -0800, Stephen Hemminger wrote:
> > > > > On Fri, 09 Dec 2022 08:53:57 +0100
> > > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > 
> > > > > > > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.  
> > > > > > > 
> > > > > > > i think this raises a good question. is the purpose of setting a thread name
> > > > > > > meant to be something we can use from the application or is it something that
> > > > > > > is for debugging diagnostics and may be a best effort?  
> > > > > > 
> > > > 
> > > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 
> > > > > I think yes it is only for debugging.
> > > > > So best effort looks to be a good approach.
> > > > > I'm not sure you need to replace the functions.
> > > > > Can you just complete the implementations?
> > > > 
> > > > the patch series put forward allows a set / get name per-lcore, where
> > > > you get implicit (but not exposed via the eal api) call to underlying
> > > > platform thread setname.
> > > 
> > > I don't understand how lcore ID and thread ID are connected.
> > > You can run multiple control threads on a single lcore.
> > > 
> > 
> > correct.
> > 
> > the new public api allows the set of a name on an lcore only. as a
> > side-effect if the platform supports it the name is also set on the
> > thread_id associated with the lcore (from lcore_config[].thread_id).
> > 
> > for control threads you just get the side-effect if the platform
> > supports it, otherwise it is a noop.
> 
> What does it mean? which side effect? what must be supported?
> 
> > if we want set / get name at all this seemed the most usable balance
> > between application consumption with debug use where available. if this
> > isn't acceptable then i would suggest we simply remove both
> > rte_thread_{get,set}name because as a debugging facility we cannot offer
> > a consistent abstracted api which means it shouldn't be in the eal at
> > all.
> 
> Why it cannot be consistent?
> Please be more precise.
> 

sorry i thought you had been looking at our implementation, let me
summarize.

here are the differences between the underlying platform capabilities
that prohibit both get and set. it's not a matter of just providing missing
implementation for a specific platform.

set thread name:
  freebsd
    - set reports no failure, but may silently fail
    - uncertain what name length limit is
  linux
    - set not available in older glibc
    - current rte wrapper silently truncates name length
  windows
    - set always available
    - uncertain what name length limit is

get thread name:
  freebsd
    - not available at all
  linux
    - get not available in older glibc
    - get can fail
  windows
    - get always available
    - get can fail

keep in mind the purpose of an abstraction is that the application *does
not* have to do conditional evaluation on a per-platform basis around
call sites. once you start putting #ifdef RTE_EXEC_ENV_XXX into code you
failed and i presume we want none of the use of eal to be adorned with
that.

at first glance you might think oh, well if get isn't supported then
just return some default string or an empty string but even that is a
violation of the abstraction that leaks implementation detail.

i.e. assuming success set() success  get(set()) should also succeed
without conditional compilation of the code.

the common abstraction that can be reasonably provided explicitly
operating a thread is something like.

  * for set you can provide a watered down version that doesn't report
    failure and silently truncates and ignores errors within the
    implementation. if it works it works if it doesn't you just don't
    know i.e. best effort.
  * for get it cannot be provided consistently, the platforms simply
    aren't providing what is needed.

for background one of the request to expose the native platform thread
id was to access the best effort behavior for the thread associated with
an lcore. discussion on list highlighted the constraint that this should
be done without exposing platform specific detail via the eal api.

http://mails.dpdk.org/archives/dev/2022-October/253411.html

as mentioned in a previous mail i provided a series that accomplishes
this as a side effect of an api that can be consistently provided when
available on the platform, it has 2 benefits.
  * it does not expose the platform-specific native thread handle.
  * for lcores it keeps the name in the application memory so it is
    available in crash/coredumps.

so we have 2 options.

1. a watered down set (no get) that is fire and forget and reports no
   failure and maybe it works, maybe it doesn't depending on your platform.
2. the lcore set / get which is basically (1) for the threads associated
   with lcores but provides some additional features that are supportable
   in the api surface. (set/get, stored in application namespace).

great discussion, please let me know your thoughts and what direction
we'd like to go i'll adjust patches as appropriate.

thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-12 17:45                   ` Tyler Retzlaff
@ 2022-12-13  8:32                     ` Thomas Monjalon
  2022-12-13 17:38                       ` Tyler Retzlaff
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2022-12-13  8:32 UTC (permalink / raw)
  To: Tyler Retzlaff
  Cc: Stephen Hemminger, dev, Morten Brørup, david.marchand,
	Bruce Richardson

12/12/2022 18:45, Tyler Retzlaff:
> On Sun, Dec 11, 2022 at 08:50:48AM +0100, Thomas Monjalon wrote:
> > 10/12/2022 00:49, Tyler Retzlaff:
> > > On Fri, Dec 09, 2022 at 10:13:44PM +0100, Thomas Monjalon wrote:
> > > > 09/12/2022 21:06, Tyler Retzlaff:
> > > > > On Fri, Dec 09, 2022 at 08:48:14AM -0800, Stephen Hemminger wrote:
> > > > > > On Fri, 09 Dec 2022 08:53:57 +0100
> > > > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > 
> > > > > > > > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.  
> > > > > > > > 
> > > > > > > > i think this raises a good question. is the purpose of setting a thread name
> > > > > > > > meant to be something we can use from the application or is it something that
> > > > > > > > is for debugging diagnostics and may be a best effort?  
> > > > > > > 
> > > > > 
> > > > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > 
> > > > > > I think yes it is only for debugging.
> > > > > > So best effort looks to be a good approach.
> > > > > > I'm not sure you need to replace the functions.
> > > > > > Can you just complete the implementations?
> > > > > 
> > > > > the patch series put forward allows a set / get name per-lcore, where
> > > > > you get implicit (but not exposed via the eal api) call to underlying
> > > > > platform thread setname.
> > > > 
> > > > I don't understand how lcore ID and thread ID are connected.
> > > > You can run multiple control threads on a single lcore.
> > > > 
> > > 
> > > correct.
> > > 
> > > the new public api allows the set of a name on an lcore only. as a
> > > side-effect if the platform supports it the name is also set on the
> > > thread_id associated with the lcore (from lcore_config[].thread_id).
> > > 
> > > for control threads you just get the side-effect if the platform
> > > supports it, otherwise it is a noop.
> > 
> > What does it mean? which side effect? what must be supported?
> > 
> > > if we want set / get name at all this seemed the most usable balance
> > > between application consumption with debug use where available. if this
> > > isn't acceptable then i would suggest we simply remove both
> > > rte_thread_{get,set}name because as a debugging facility we cannot offer
> > > a consistent abstracted api which means it shouldn't be in the eal at
> > > all.
> > 
> > Why it cannot be consistent?
> > Please be more precise.
> > 
> 
> sorry i thought you had been looking at our implementation, let me
> summarize.
> 
> here are the differences between the underlying platform capabilities
> that prohibit both get and set. it's not a matter of just providing missing
> implementation for a specific platform.
> 
> set thread name:
>   freebsd
>     - set reports no failure, but may silently fail
>     - uncertain what name length limit is
>   linux
>     - set not available in older glibc
>     - current rte wrapper silently truncates name length
>   windows
>     - set always available
>     - uncertain what name length limit is
> 
> get thread name:
>   freebsd
>     - not available at all
>   linux
>     - get not available in older glibc
>     - get can fail
>   windows
>     - get always available
>     - get can fail
> 
> keep in mind the purpose of an abstraction is that the application *does
> not* have to do conditional evaluation on a per-platform basis around
> call sites. once you start putting #ifdef RTE_EXEC_ENV_XXX into code you
> failed and i presume we want none of the use of eal to be adorned with
> that.
> 
> at first glance you might think oh, well if get isn't supported then
> just return some default string or an empty string but even that is a
> violation of the abstraction that leaks implementation detail.
> 
> i.e. assuming success set() success  get(set()) should also succeed
> without conditional compilation of the code.
> 
> the common abstraction that can be reasonably provided explicitly
> operating a thread is something like.
> 
>   * for set you can provide a watered down version that doesn't report
>     failure and silently truncates and ignores errors within the
>     implementation. if it works it works if it doesn't you just don't
>     know i.e. best effort.
>   * for get it cannot be provided consistently, the platforms simply
>     aren't providing what is needed.
> 
> for background one of the request to expose the native platform thread
> id was to access the best effort behavior for the thread associated with
> an lcore. discussion on list highlighted the constraint that this should
> be done without exposing platform specific detail via the eal api.
> 
> http://mails.dpdk.org/archives/dev/2022-October/253411.html
> 
> as mentioned in a previous mail i provided a series that accomplishes
> this as a side effect of an api that can be consistently provided when
> available on the platform, it has 2 benefits.
>   * it does not expose the platform-specific native thread handle.
>   * for lcores it keeps the name in the application memory so it is
>     available in crash/coredumps.
> 
> so we have 2 options.
> 
> 1. a watered down set (no get) that is fire and forget and reports no
>    failure and maybe it works, maybe it doesn't depending on your platform.
> 2. the lcore set / get which is basically (1) for the threads associated
>    with lcores but provides some additional features that are supportable
>    in the api surface. (set/get, stored in application namespace).

Given thread name is not critical at all, I think best effort is OK.
We could make clear in the API documentation that it is not reliable.

I don't think implementing thread name in the specific case of
datapath lcore is a real improvement.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-13  8:32                     ` Thomas Monjalon
@ 2022-12-13 17:38                       ` Tyler Retzlaff
  2022-12-13 19:34                         ` Thomas Monjalon
  0 siblings, 1 reply; 19+ messages in thread
From: Tyler Retzlaff @ 2022-12-13 17:38 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Stephen Hemminger, dev, Morten Brørup, david.marchand,
	Bruce Richardson

On Tue, Dec 13, 2022 at 09:32:06AM +0100, Thomas Monjalon wrote:
> 12/12/2022 18:45, Tyler Retzlaff:
> > On Sun, Dec 11, 2022 at 08:50:48AM +0100, Thomas Monjalon wrote:
> > > 10/12/2022 00:49, Tyler Retzlaff:
> > > > On Fri, Dec 09, 2022 at 10:13:44PM +0100, Thomas Monjalon wrote:
> > > > > 09/12/2022 21:06, Tyler Retzlaff:
> > > > > > On Fri, Dec 09, 2022 at 08:48:14AM -0800, Stephen Hemminger wrote:
> > > > > > > On Fri, 09 Dec 2022 08:53:57 +0100
> > > > > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > > 
> > > > > > > > > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.  
> > > > > > > > > 
> > > > > > > > > i think this raises a good question. is the purpose of setting a thread name
> > > > > > > > > meant to be something we can use from the application or is it something that
> > > > > > > > > is for debugging diagnostics and may be a best effort?  
> > > > > > > > 
> > > > > > 
> > > > > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > 
> > > > > > > I think yes it is only for debugging.
> > > > > > > So best effort looks to be a good approach.
> > > > > > > I'm not sure you need to replace the functions.
> > > > > > > Can you just complete the implementations?
> > > > > > 
> > > > > > the patch series put forward allows a set / get name per-lcore, where
> > > > > > you get implicit (but not exposed via the eal api) call to underlying
> > > > > > platform thread setname.
> > > > > 
> > > > > I don't understand how lcore ID and thread ID are connected.
> > > > > You can run multiple control threads on a single lcore.
> > > > > 
> > > > 
> > > > correct.
> > > > 
> > > > the new public api allows the set of a name on an lcore only. as a
> > > > side-effect if the platform supports it the name is also set on the
> > > > thread_id associated with the lcore (from lcore_config[].thread_id).
> > > > 
> > > > for control threads you just get the side-effect if the platform
> > > > supports it, otherwise it is a noop.
> > > 
> > > What does it mean? which side effect? what must be supported?
> > > 
> > > > if we want set / get name at all this seemed the most usable balance
> > > > between application consumption with debug use where available. if this
> > > > isn't acceptable then i would suggest we simply remove both
> > > > rte_thread_{get,set}name because as a debugging facility we cannot offer
> > > > a consistent abstracted api which means it shouldn't be in the eal at
> > > > all.
> > > 
> > > Why it cannot be consistent?
> > > Please be more precise.
> > > 
> > 
> > sorry i thought you had been looking at our implementation, let me
> > summarize.
> > 
> > here are the differences between the underlying platform capabilities
> > that prohibit both get and set. it's not a matter of just providing missing
> > implementation for a specific platform.
> > 
> > set thread name:
> >   freebsd
> >     - set reports no failure, but may silently fail
> >     - uncertain what name length limit is
> >   linux
> >     - set not available in older glibc
> >     - current rte wrapper silently truncates name length
> >   windows
> >     - set always available
> >     - uncertain what name length limit is
> > 
> > get thread name:
> >   freebsd
> >     - not available at all
> >   linux
> >     - get not available in older glibc
> >     - get can fail
> >   windows
> >     - get always available
> >     - get can fail
> > 
> > keep in mind the purpose of an abstraction is that the application *does
> > not* have to do conditional evaluation on a per-platform basis around
> > call sites. once you start putting #ifdef RTE_EXEC_ENV_XXX into code you
> > failed and i presume we want none of the use of eal to be adorned with
> > that.
> > 
> > at first glance you might think oh, well if get isn't supported then
> > just return some default string or an empty string but even that is a
> > violation of the abstraction that leaks implementation detail.
> > 
> > i.e. assuming success set() success  get(set()) should also succeed
> > without conditional compilation of the code.
> > 
> > the common abstraction that can be reasonably provided explicitly
> > operating a thread is something like.
> > 
> >   * for set you can provide a watered down version that doesn't report
> >     failure and silently truncates and ignores errors within the
> >     implementation. if it works it works if it doesn't you just don't
> >     know i.e. best effort.
> >   * for get it cannot be provided consistently, the platforms simply
> >     aren't providing what is needed.
> > 
> > for background one of the request to expose the native platform thread
> > id was to access the best effort behavior for the thread associated with
> > an lcore. discussion on list highlighted the constraint that this should
> > be done without exposing platform specific detail via the eal api.
> > 
> > http://mails.dpdk.org/archives/dev/2022-October/253411.html
> > 
> > as mentioned in a previous mail i provided a series that accomplishes
> > this as a side effect of an api that can be consistently provided when
> > available on the platform, it has 2 benefits.
> >   * it does not expose the platform-specific native thread handle.
> >   * for lcores it keeps the name in the application memory so it is
> >     available in crash/coredumps.
> > 
> > so we have 2 options.
> > 
> > 1. a watered down set (no get) that is fire and forget and reports no
> >    failure and maybe it works, maybe it doesn't depending on your platform.
> > 2. the lcore set / get which is basically (1) for the threads associated
> >    with lcores but provides some additional features that are supportable
> >    in the api surface. (set/get, stored in application namespace).
> 
> Given thread name is not critical at all, I think best effort is OK.
> We could make clear in the API documentation that it is not reliable.
> 
> I don't think implementing thread name in the specific case of
> datapath lcore is a real improvement.

Okay, just one final confirmation. This is what we would like?

* completely remove the existing rte_thread_getname api.
    - by implication this means remove the 1 use of it in eal in
      logging.

* introduce a new void rte_thread_set_name(rte_thread_t, const char *name)
  that:
    - returns void (does not fail), but in cases it can be detected will
      log a DEBUG level log message.
    - quietly truncates the name (if longer) to RTE_MAX_THREAD_NAME_LEN on
      all platforms.
    - document that it is best effort and only works if the stars align
      for the target platform.

* there will be no unit test, since the set doesn't fail and there is no
  get to validate the set.

once i get confirmation i'll update the series.

thanks

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-13 17:38                       ` Tyler Retzlaff
@ 2022-12-13 19:34                         ` Thomas Monjalon
  2022-12-13 20:39                           ` Morten Brørup
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2022-12-13 19:34 UTC (permalink / raw)
  To: Tyler Retzlaff
  Cc: Stephen Hemminger, dev, Morten Brørup, david.marchand,
	Bruce Richardson

13/12/2022 18:38, Tyler Retzlaff:
> On Tue, Dec 13, 2022 at 09:32:06AM +0100, Thomas Monjalon wrote:
> > 12/12/2022 18:45, Tyler Retzlaff:
> > > On Sun, Dec 11, 2022 at 08:50:48AM +0100, Thomas Monjalon wrote:
> > > > 10/12/2022 00:49, Tyler Retzlaff:
> > > > > On Fri, Dec 09, 2022 at 10:13:44PM +0100, Thomas Monjalon wrote:
> > > > > > 09/12/2022 21:06, Tyler Retzlaff:
> > > > > > > On Fri, Dec 09, 2022 at 08:48:14AM -0800, Stephen Hemminger wrote:
> > > > > > > > On Fri, 09 Dec 2022 08:53:57 +0100
> > > > > > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > > > 
> > > > > > > > > > > If some execution environment doesn't support thread names, it could return a string that makes it possible for a human to identify the thread, e.g. the tread id. Again, this is assuming that it is only used for debugging, trace, and similar.  
> > > > > > > > > > 
> > > > > > > > > > i think this raises a good question. is the purpose of setting a thread name
> > > > > > > > > > meant to be something we can use from the application or is it something that
> > > > > > > > > > is for debugging diagnostics and may be a best effort?  
> > > > > > > > > 
> > > > > > > 
> > > > > > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > > 
> > > > > > > > I think yes it is only for debugging.
> > > > > > > > So best effort looks to be a good approach.
> > > > > > > > I'm not sure you need to replace the functions.
> > > > > > > > Can you just complete the implementations?
> > > > > > > 
> > > > > > > the patch series put forward allows a set / get name per-lcore, where
> > > > > > > you get implicit (but not exposed via the eal api) call to underlying
> > > > > > > platform thread setname.
> > > > > > 
> > > > > > I don't understand how lcore ID and thread ID are connected.
> > > > > > You can run multiple control threads on a single lcore.
> > > > > > 
> > > > > 
> > > > > correct.
> > > > > 
> > > > > the new public api allows the set of a name on an lcore only. as a
> > > > > side-effect if the platform supports it the name is also set on the
> > > > > thread_id associated with the lcore (from lcore_config[].thread_id).
> > > > > 
> > > > > for control threads you just get the side-effect if the platform
> > > > > supports it, otherwise it is a noop.
> > > > 
> > > > What does it mean? which side effect? what must be supported?
> > > > 
> > > > > if we want set / get name at all this seemed the most usable balance
> > > > > between application consumption with debug use where available. if this
> > > > > isn't acceptable then i would suggest we simply remove both
> > > > > rte_thread_{get,set}name because as a debugging facility we cannot offer
> > > > > a consistent abstracted api which means it shouldn't be in the eal at
> > > > > all.
> > > > 
> > > > Why it cannot be consistent?
> > > > Please be more precise.
> > > > 
> > > 
> > > sorry i thought you had been looking at our implementation, let me
> > > summarize.
> > > 
> > > here are the differences between the underlying platform capabilities
> > > that prohibit both get and set. it's not a matter of just providing missing
> > > implementation for a specific platform.
> > > 
> > > set thread name:
> > >   freebsd
> > >     - set reports no failure, but may silently fail
> > >     - uncertain what name length limit is
> > >   linux
> > >     - set not available in older glibc
> > >     - current rte wrapper silently truncates name length
> > >   windows
> > >     - set always available
> > >     - uncertain what name length limit is
> > > 
> > > get thread name:
> > >   freebsd
> > >     - not available at all
> > >   linux
> > >     - get not available in older glibc
> > >     - get can fail
> > >   windows
> > >     - get always available
> > >     - get can fail
> > > 
> > > keep in mind the purpose of an abstraction is that the application *does
> > > not* have to do conditional evaluation on a per-platform basis around
> > > call sites. once you start putting #ifdef RTE_EXEC_ENV_XXX into code you
> > > failed and i presume we want none of the use of eal to be adorned with
> > > that.
> > > 
> > > at first glance you might think oh, well if get isn't supported then
> > > just return some default string or an empty string but even that is a
> > > violation of the abstraction that leaks implementation detail.
> > > 
> > > i.e. assuming success set() success  get(set()) should also succeed
> > > without conditional compilation of the code.
> > > 
> > > the common abstraction that can be reasonably provided explicitly
> > > operating a thread is something like.
> > > 
> > >   * for set you can provide a watered down version that doesn't report
> > >     failure and silently truncates and ignores errors within the
> > >     implementation. if it works it works if it doesn't you just don't
> > >     know i.e. best effort.
> > >   * for get it cannot be provided consistently, the platforms simply
> > >     aren't providing what is needed.
> > > 
> > > for background one of the request to expose the native platform thread
> > > id was to access the best effort behavior for the thread associated with
> > > an lcore. discussion on list highlighted the constraint that this should
> > > be done without exposing platform specific detail via the eal api.
> > > 
> > > http://mails.dpdk.org/archives/dev/2022-October/253411.html
> > > 
> > > as mentioned in a previous mail i provided a series that accomplishes
> > > this as a side effect of an api that can be consistently provided when
> > > available on the platform, it has 2 benefits.
> > >   * it does not expose the platform-specific native thread handle.
> > >   * for lcores it keeps the name in the application memory so it is
> > >     available in crash/coredumps.
> > > 
> > > so we have 2 options.
> > > 
> > > 1. a watered down set (no get) that is fire and forget and reports no
> > >    failure and maybe it works, maybe it doesn't depending on your platform.
> > > 2. the lcore set / get which is basically (1) for the threads associated
> > >    with lcores but provides some additional features that are supportable
> > >    in the api surface. (set/get, stored in application namespace).
> > 
> > Given thread name is not critical at all, I think best effort is OK.
> > We could make clear in the API documentation that it is not reliable.
> > 
> > I don't think implementing thread name in the specific case of
> > datapath lcore is a real improvement.
> 
> Okay, just one final confirmation. This is what we would like?
> 
> * completely remove the existing rte_thread_getname api.
>     - by implication this means remove the 1 use of it in eal in
>       logging.
> 
> * introduce a new void rte_thread_set_name(rte_thread_t, const char *name)
>   that:
>     - returns void (does not fail), but in cases it can be detected will
>       log a DEBUG level log message.
>     - quietly truncates the name (if longer) to RTE_MAX_THREAD_NAME_LEN on
>       all platforms.
>     - document that it is best effort and only works if the stars align
>       for the target platform.
> 
> * there will be no unit test, since the set doesn't fail and there is no
>   get to validate the set.
> 
> once i get confirmation i'll update the series.

Just my opinion: this proposal is my preference, yes.
What others think?





^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: help with pthread_t deprecation / api changes
  2022-12-13 19:34                         ` Thomas Monjalon
@ 2022-12-13 20:39                           ` Morten Brørup
  2022-12-14  0:16                             ` Tyler Retzlaff
  0 siblings, 1 reply; 19+ messages in thread
From: Morten Brørup @ 2022-12-13 20:39 UTC (permalink / raw)
  To: Thomas Monjalon, Tyler Retzlaff
  Cc: Stephen Hemminger, dev, david.marchand, Bruce Richardson

> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Tuesday, 13 December 2022 20.34
> 
> 13/12/2022 18:38, Tyler Retzlaff:
> > Okay, just one final confirmation. This is what we would like?
> >
> > * completely remove the existing rte_thread_getname api.
> >     - by implication this means remove the 1 use of it in eal in
> >       logging.
> >
> > * introduce a new void rte_thread_set_name(rte_thread_t, const char
> *name)
> >   that:
> >     - returns void (does not fail), but in cases it can be detected
> will
> >       log a DEBUG level log message.
> >     - quietly truncates the name (if longer) to
> RTE_MAX_THREAD_NAME_LEN on
> >       all platforms.

Consider also DEBUG logging if truncating the name. Your choice - do or don't is fine with me.

> >     - document that it is best effort and only works if the stars
> align
> >       for the target platform.
> >
> > * there will be no unit test, since the set doesn't fail and there is
> no
> >   get to validate the set.
> >
> > once i get confirmation i'll update the series.
> 
> Just my opinion: this proposal is my preference, yes.
> What others think?

LGTM.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: help with pthread_t deprecation / api changes
  2022-12-13 20:39                           ` Morten Brørup
@ 2022-12-14  0:16                             ` Tyler Retzlaff
  0 siblings, 0 replies; 19+ messages in thread
From: Tyler Retzlaff @ 2022-12-14  0:16 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Thomas Monjalon, Stephen Hemminger, dev, david.marchand,
	Bruce Richardson

On Tue, Dec 13, 2022 at 09:39:24PM +0100, Morten Brørup wrote:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Tuesday, 13 December 2022 20.34
> > 
> > 13/12/2022 18:38, Tyler Retzlaff:
> > > Okay, just one final confirmation. This is what we would like?
> > >
> > > * completely remove the existing rte_thread_getname api.
> > >     - by implication this means remove the 1 use of it in eal in
> > >       logging.
> > >
> > > * introduce a new void rte_thread_set_name(rte_thread_t, const char
> > *name)
> > >   that:
> > >     - returns void (does not fail), but in cases it can be detected
> > will
> > >       log a DEBUG level log message.
> > >     - quietly truncates the name (if longer) to
> > RTE_MAX_THREAD_NAME_LEN on
> > >       all platforms.
> 
> Consider also DEBUG logging if truncating the name. Your choice - do or don't is fine with me.

will do.

> 
> > >     - document that it is best effort and only works if the stars
> > align
> > >       for the target platform.
> > >
> > > * there will be no unit test, since the set doesn't fail and there is
> > no
> > >   get to validate the set.
> > >
> > > once i get confirmation i'll update the series.
> > 
> > Just my opinion: this proposal is my preference, yes.
> > What others think?
> 
> LGTM.

okay, i'll re-spin the series and send a v2.

thanks

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-12-14  0:16 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-30 22:54 help with pthread_t deprecation / api changes Tyler Retzlaff
2022-12-02  1:12 ` Tyler Retzlaff
2022-12-02  8:03   ` Morten Brørup
2022-12-02 19:57     ` Tyler Retzlaff
2022-12-09  7:53       ` Thomas Monjalon
2022-12-09 16:48         ` Stephen Hemminger
2022-12-09 20:06           ` Tyler Retzlaff
2022-12-09 21:13             ` Thomas Monjalon
2022-12-09 23:49               ` Tyler Retzlaff
2022-12-11  7:50                 ` Thomas Monjalon
2022-12-12 17:45                   ` Tyler Retzlaff
2022-12-13  8:32                     ` Thomas Monjalon
2022-12-13 17:38                       ` Tyler Retzlaff
2022-12-13 19:34                         ` Thomas Monjalon
2022-12-13 20:39                           ` Morten Brørup
2022-12-14  0:16                             ` Tyler Retzlaff
2022-12-09 21:14           ` Thomas Monjalon
2022-12-09 22:38             ` Stephen Hemminger
2022-12-09 23:55               ` Tyler Retzlaff

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).