* [RFC] Telemetry enhancements and Windows support @ 2022-04-01 22:59 Dmitry Kozlyuk 2022-04-02 0:55 ` Dmitry Kozlyuk 2022-04-04 10:03 ` Bruce Richardson 0 siblings, 2 replies; 6+ messages in thread From: Dmitry Kozlyuk @ 2022-04-01 22:59 UTC (permalink / raw) To: dev, Vipin Varghese, Ciara Power Cc: Sivaprasad Tummala, Tyler Retzlaff, Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam, Bruce Richardson Vipin from AMD expressed demand for telemetry support on Windows in order to collect port statistics. Implementing a PoC, he stumbled several issues. Together we have designed a solution that eliminates these issues. It affects telemetry operation for all platforms, at least code-wise, We would like community input if this is acceptable or if there are some better suggestions. Telemetry today (talking only about v2): * Telemetry library starts a thread to listen for connections. It is affinitized to the main lcore. A thread is spawned to serve each client connection. * A listening AF_UNIX socket is created in DPDK runtime directory. This guarantees there will be no clash between sockets created by different DPDK processes. It has a fixed name within this directory, so it's trivial to discover. * The socket is SOCK_SEQPACKET. This allows the code to simply use write() and let the socket preserve message bounds and order. dpdk-telemetry.py relies on this, and probably external clients too. #define MAX_OUTPUT_LEN (1024 * 16). * The protocol is JSON objects, one per packet. * Telemetry can be enabled (default) or disabled with an EAL option. Windows issues: * Threading API is implemented in EAL. Currently this is a pthread.h shim, there are plans to replace it with rte_thread API [1]. Hence, there's a circular dependency: EAL -> telemetry -> threading API in EAL. There's a similar issue logging called by this shim. Bruce pointed out that a similar issue is with EAL logs and maybe logging should be moved its own library, on which EAL would depend. * There is no AF_UNIX and no simple way to select a unique free endpoint. * There is no SOCK_SEQPACKET and SOCK_DGRAM size limitation is too small. The only viable option is SOCK_STREAM, but it does not preserve message boundaries. Proposal: * Move threading outside of telemetry, let EAL manage the threads. eal_telemetry_client_thread(conn) { rte_telemetry_serve(conn); } eal_telemetry_server_thread(tm) { pthread_set_affinity_np(...); while (rte_telemetry_listen(tm)) { conn = rte_telemetry_accept(tm); pthread_create(eal_telemetry_client_thread, conn); } } rte_eal_init() { tm_logtype = rte_log_type_register(); tm = rte_telemetry_init(internal_conf.tm_kvargs, tm_logtype); pthread_create(eal_telementry_server_thread, tm); } Among Vipin, Micorosft engineers, and me there is a consensus that libraries should be passive and not create threads by themselves. * The logging issue can be solved differently: a) by factoring logging into a library, either as Bruce suggested; b) by replacing logging with simple print in Windows shim; c) by integrating and using the new threading API [1, 2]. * Allow to select listening endpoint and protocol. Different options may be supported on each system and be the default. rte_kvargs syntax and data structure can be used. --telemetry Default settings. Also kept for compatibility. --telemetry transport=seqpacket,protocol=message AF_UNIX+SOCK_SEQPACKET, no delimiters between objects (same as default for Unices). --telemetry transport=tcp,protocol=line,endpoint=:64000 Line-oriented JSON over a TCP socket at local port 64000 (may be same as --telemetry <no argument> on Windows). --no-telemetry Disable telemetry. Default on Windows. Parameter names and set are subject to discussion. An last example if a possible future extension as one of the reasons not to fix combinations of "transport=" and "protocol=" (similar to socket API): --telemetry transport=pipe,protocol=line,endpoint=some-name Stream of objects over a named pipe (fifo), for example. Initial support should cover: 1) "transport=seqpacket,protocol=message" on Unices; 2) "transport=tcp,protocol=line,endpoint=" at least on Windows, but this variant is really applicable everywhere. * dpdk-telemetry.py must support all combinations DPDK supports. Defaults don't change, so backward compatibility is preserved. * Because Windows cannot automatically select and endpoint that does not conflict with anything, the default would be to disable telemetry. We also propose to always log telemetry socket path. * It is technically possible to accept remote connections. However, it's a new attack surface and a security risk. OTOH, there's no need to allow remote connections: anyone who needs this can setup a tunnel or something. Local socket must be used by default. [1]: http://patchwork.dpdk.org/project/dpdk/list/?series=22319 [2]: http://patchwork.dpdk.org/project/dpdk/list/?series=20472 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] Telemetry enhancements and Windows support 2022-04-01 22:59 [RFC] Telemetry enhancements and Windows support Dmitry Kozlyuk @ 2022-04-02 0:55 ` Dmitry Kozlyuk 2022-04-04 10:03 ` Bruce Richardson 1 sibling, 0 replies; 6+ messages in thread From: Dmitry Kozlyuk @ 2022-04-02 0:55 UTC (permalink / raw) To: dev, Vipin Varghese, Ciara Power Cc: Sivaprasad Tummala, Tyler Retzlaff, Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam, Bruce Richardson 2022-04-02 01:59 (UTC+0300), Dmitry Kozlyuk: > Telemetry today (talking only about v2): > > * Telemetry library starts a thread to listen for connections. > It is affinitized to the main lcore. Should be: "to the control thread CPU set". > A thread is spawned to serve each client connection. Note: the number of active connections is limited to MAX_CONNECTIONS = 10. No change is proposed here, although the new kvargs syntax allows to add telemetry tunables easily. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] Telemetry enhancements and Windows support 2022-04-01 22:59 [RFC] Telemetry enhancements and Windows support Dmitry Kozlyuk 2022-04-02 0:55 ` Dmitry Kozlyuk @ 2022-04-04 10:03 ` Bruce Richardson 2022-04-06 21:50 ` Dmitry Kozlyuk 1 sibling, 1 reply; 6+ messages in thread From: Bruce Richardson @ 2022-04-04 10:03 UTC (permalink / raw) To: Dmitry Kozlyuk Cc: dev, Vipin Varghese, Ciara Power, Sivaprasad Tummala, Tyler Retzlaff, Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam On Sat, Apr 02, 2022 at 01:59:01AM +0300, Dmitry Kozlyuk wrote: > Vipin from AMD expressed demand for telemetry support on Windows > in order to collect port statistics. > Implementing a PoC, he stumbled several issues. > Together we have designed a solution that eliminates these issues. > It affects telemetry operation for all platforms, at least code-wise, > We would like community input if this is acceptable > or if there are some better suggestions. > > Telemetry today (talking only about v2): > > * Telemetry library starts a thread to listen for connections. > It is affinitized to the main lcore. > A thread is spawned to serve each client connection. > > * A listening AF_UNIX socket is created in DPDK runtime directory. > This guarantees there will be no clash between sockets > created by different DPDK processes. > It has a fixed name within this directory, so it's trivial to discover. > > * The socket is SOCK_SEQPACKET. > This allows the code to simply use write() > and let the socket preserve message bounds and order. > dpdk-telemetry.py relies on this, and probably external clients too. > #define MAX_OUTPUT_LEN (1024 * 16). > > * The protocol is JSON objects, one per packet. > > * Telemetry can be enabled (default) or disabled with an EAL option. > > Windows issues: > > * Threading API is implemented in EAL. > Currently this is a pthread.h shim, > there are plans to replace it with rte_thread API [1]. > Hence, there's a circular dependency: > EAL -> telemetry -> threading API in EAL. > There's a similar issue logging called by this shim. > > Bruce pointed out that a similar issue is with EAL logs > and maybe logging should be moved its own library, > on which EAL would depend. > > * There is no AF_UNIX and no simple way to select a unique free endpoint. > > * There is no SOCK_SEQPACKET and SOCK_DGRAM size limitation is too small. > The only viable option is SOCK_STREAM, > but it does not preserve message boundaries. > > Proposal: > > * Move threading outside of telemetry, let EAL manage the threads. > > eal_telemetry_client_thread(conn) { > rte_telemetry_serve(conn); > } > > eal_telemetry_server_thread(tm) { > pthread_set_affinity_np(...); > while (rte_telemetry_listen(tm)) { > conn = rte_telemetry_accept(tm); > pthread_create(eal_telemetry_client_thread, > conn); > } > } > > rte_eal_init() { > tm_logtype = rte_log_type_register(); > tm = rte_telemetry_init(internal_conf.tm_kvargs, > tm_logtype); > pthread_create(eal_telementry_server_thread, tm); > } > > Among Vipin, Micorosft engineers, and me there is a consensus > that libraries should be passive and not create threads by themselves. > Having EAL be the only one to create threads seems reasonable. However, I'm a little uncertain about the scope of change and tying telemetry and EAL together a lot more. Another alternative might be to change telemetry to use only one thread for all connections and using "select" or "poll" to wait on input from all sockets simultaneously. Would that work on Windows? > * The logging issue can be solved differently: > > a) by factoring logging into a library, either as Bruce suggested; > b) by replacing logging with simple print in Windows shim; > c) by integrating and using the new threading API [1, 2]. > What is the issue with the current logging in telemetry on windows? The current implementation has EAL pass in the logging function to telemetry on init, allowing use of EAL logging without requiring a link-time dependency. Theoretically, the same solution could also be used for the threading, though I'm wary about doing so as using the same solution multiple times could lead to very awkward APIs. Overall, I tend to think the proper solution for all this is to split EAL - something that was discussed previously but never done because it's difficult and the result may be uncertain. Theoretically, most low-level services in EAL, such as logging, should be in one library, while the EAL init code remains in a later, higher-level one. > * Allow to select listening endpoint and protocol. > Different options may be supported on each system and be the default. > rte_kvargs syntax and data structure can be used. > > --telemetry > Default settings. Also kept for compatibility. > --telemetry transport=seqpacket,protocol=message > AF_UNIX+SOCK_SEQPACKET, no delimiters between objects > (same as default for Unices). > --telemetry transport=tcp,protocol=line,endpoint=:64000 > Line-oriented JSON over a TCP socket at local port 64000 > (may be same as --telemetry <no argument> on Windows). > --no-telemetry > Disable telemetry. Default on Windows. > > Parameter names and set are subject to discussion. > > An last example if a possible future extension > as one of the reasons not to fix combinations of "transport=" > and "protocol=" (similar to socket API): > > --telemetry transport=pipe,protocol=line,endpoint=some-name > Stream of objects over a named pipe (fifo), for example. > > Initial support should cover: > > 1) "transport=seqpacket,protocol=message" on Unices; > 2) "transport=tcp,protocol=line,endpoint=" at least on Windows, > but this variant is really applicable everywhere. > > * dpdk-telemetry.py must support all combinations DPDK supports. > Defaults don't change, so backward compatibility is preserved. > > * Because Windows cannot automatically select and endpoint that does not > conflict with anything, the default would be to disable telemetry. > We also propose to always log telemetry socket path. > > * It is technically possible to accept remote connections. > However, it's a new attack surface and a security risk. > OTOH, there's no need to allow remote connections: > anyone who needs this can setup a tunnel or something. > Local socket must be used by default. > Completely agree on not accepting non-local connections. However, I'd also point out that another reason why a unix socket was chosen over other TCP/UDP/etc socket options was that it also gave local-user protection. Right now, if a DPDK process is run as root, no non-root user can connect to the telemetry socket, unless root explicitly gives permissions. For TCP or similar sockets, that would not be the case. Not sure how big a deal that would be. /Bruce ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] Telemetry enhancements and Windows support 2022-04-04 10:03 ` Bruce Richardson @ 2022-04-06 21:50 ` Dmitry Kozlyuk 2022-04-06 21:57 ` Stephen Hemminger 0 siblings, 1 reply; 6+ messages in thread From: Dmitry Kozlyuk @ 2022-04-06 21:50 UTC (permalink / raw) To: Bruce Richardson Cc: dev, Vipin Varghese, Ciara Power, Sivaprasad Tummala, Tyler Retzlaff, Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam 2022-04-04 11:03 (UTC+0100), Bruce Richardson: [...] > Having EAL be the only one to create threads seems reasonable. However, I'm > a little uncertain about the scope of change and tying telemetry and EAL > together a lot more. Scope is a strong valid argument. What do you think if telemetry would create threads using Win32 API just like on Unices it creates threads using pthread (external component that is always present)? > Another alternative might be to change telemetry to > use only one thread for all connections and using "select" or "poll" to > wait on input from all sockets simultaneously. Would that work on Windows? Yes, there are select() and "poll()" with slightly different naming. > > * The logging issue can be solved differently: > > > > a) by factoring logging into a library, either as Bruce suggested; > > b) by replacing logging with simple print in Windows shim; > > c) by integrating and using the new threading API [1, 2]. > > > > What is the issue with the current logging in telemetry on windows? pthread.h shim contains calls to rte_log(). > The > current implementation has EAL pass in the logging function to telemetry on > init, allowing use of EAL logging without requiring a link-time dependency. > Theoretically, the same solution could also be used for the threading, > though I'm wary about doing so as using the same solution multiple times > could lead to very awkward APIs. Thinking out of scope of this RFC, telemetry *library* API should be: - register handler - JSON helpers - create session - process command in a session - destroy session All IO should be outside. Otherwise it's not a library but a framework. rte_telemetry_init() and all IO it starts is a higher-level helper for the very common case that doesn't need to be in lib/telemetry itself and even doesn't need to be a public API (but can be). > Overall, I tend to think the proper solution for all this is to split EAL - > something that was discussed previously but never done because it's > difficult and the result may be uncertain. Theoretically, most low-level > services in EAL, such as logging, should be in one library, while the EAL > init code remains in a later, higher-level one. I'd even say: utilities (log, trace, UUID, ...), environment abstraction, and runtime (init, tasks, multiprocess). It would be interesting to have a topic on that. [...] > > * It is technically possible to accept remote connections. > > However, it's a new attack surface and a security risk. > > OTOH, there's no need to allow remote connections: > > anyone who needs this can setup a tunnel or something. > > Local socket must be used by default. > > > > Completely agree on not accepting non-local connections. However, I'd also > point out that another reason why a unix socket was chosen over other > TCP/UDP/etc socket options was that it also gave local-user protection. > Right now, if a DPDK process is run as root, no non-root user can connect > to the telemetry socket, unless root explicitly gives permissions. For TCP > or similar sockets, that would not be the case. Not sure how big a deal > that would be. Windows also provides named pipes: 1. Pipes are subject to access control. 2. Pipes can work in message mode, like SOCK_SEQPACKET. 3. Pipes are addressed by path of form \\.\pipe\NAME that is not visible in the file system, like abstract Unix sockets, but this namespace is shared, so collision issue remains. 4. Pipes can be accessed as files from Python. Message mode can be activated with a Win32 API call that can be done using ctypes and msvcrt bundled modules. 5. Pipes can be configured to reject remote connections and to automatically limit the number of clients if needed. Another advantage is that message framing will not be needed this time, which limits the scope. Drawbacks: 1. There will be less shared code between Windows and Unix. Unix domain sockets and named pipes provide the same "accept client, send message, receive message, disconnect" API, so it's easy to share all non-transport logic. dpdk-telemetry.py will need a few Windows-specific lines. 2. On Windows, select() and WSAPoll() are only applicable to sockets. If we decide to multiplex IO in lib/telemetry and use named pipes, multiplexers will be different. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] Telemetry enhancements and Windows support 2022-04-06 21:50 ` Dmitry Kozlyuk @ 2022-04-06 21:57 ` Stephen Hemminger 2022-04-06 22:03 ` Dmitry Kozlyuk 0 siblings, 1 reply; 6+ messages in thread From: Stephen Hemminger @ 2022-04-06 21:57 UTC (permalink / raw) To: Dmitry Kozlyuk Cc: Bruce Richardson, dev, Vipin Varghese, Ciara Power, Sivaprasad Tummala, Tyler Retzlaff, Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam On Thu, 7 Apr 2022 00:50:29 +0300 Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> wrote: > 2022-04-04 11:03 (UTC+0100), Bruce Richardson: > [...] > > Having EAL be the only one to create threads seems reasonable. However, I'm > > a little uncertain about the scope of change and tying telemetry and EAL > > together a lot more. > > Scope is a strong valid argument. > What do you think if telemetry would create threads using Win32 API > just like on Unices it creates threads using pthread > (external component that is always present)? Why isn't telemetry using existing rte_ctrl_thread API? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] Telemetry enhancements and Windows support 2022-04-06 21:57 ` Stephen Hemminger @ 2022-04-06 22:03 ` Dmitry Kozlyuk 0 siblings, 0 replies; 6+ messages in thread From: Dmitry Kozlyuk @ 2022-04-06 22:03 UTC (permalink / raw) To: Stephen Hemminger Cc: Bruce Richardson, dev, Vipin Varghese, Ciara Power, Sivaprasad Tummala, Tyler Retzlaff, Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam 2022-04-06 14:57 (UTC-0700), Stephen Hemminger: > On Thu, 7 Apr 2022 00:50:29 +0300 > Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> wrote: > > > 2022-04-04 11:03 (UTC+0100), Bruce Richardson: > > [...] > > > Having EAL be the only one to create threads seems reasonable. However, I'm > > > a little uncertain about the scope of change and tying telemetry and EAL > > > together a lot more. > > > > Scope is a strong valid argument. > > What do you think if telemetry would create threads using Win32 API > > just like on Unices it creates threads using pthread > > (external component that is always present)? > > Why isn't telemetry using existing rte_ctrl_thread API? It would depend on EAL at link time then. But EAL already depends on telemetry and there would be a loop. ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2022-04-06 22:03 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-04-01 22:59 [RFC] Telemetry enhancements and Windows support Dmitry Kozlyuk 2022-04-02 0:55 ` Dmitry Kozlyuk 2022-04-04 10:03 ` Bruce Richardson 2022-04-06 21:50 ` Dmitry Kozlyuk 2022-04-06 21:57 ` Stephen Hemminger 2022-04-06 22:03 ` Dmitry Kozlyuk
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).