DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts
@ 2021-09-15 14:10 Bruce Richardson
  2021-09-22  8:43 ` Power, Ciara
                   ` (8 more replies)
  0 siblings, 9 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-09-15 14:10 UTC (permalink / raw)
  To: dev
  Cc: ciara.power, anatoly.burakov, Bruce Richardson, stable, David Marchand

When DPDK is run with --in-memory mode, multiple processes can run
simultaneously using the same runtime dir. This leads to each process
removing another process' telemetry socket as it started up, giving
unexpected behaviour.

This patch changes that behaviour to first check if the existing socket
is active. If not, it's an old socket to be cleaned up and can be
removed. If it is active, telemetry initialization fails and an error
message is printed out giving instructions on how to remove the error;
either by using file-prefix to have a different runtime dir (and
therefore socket path) or by disabling telemetry if it not needed.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/telemetry/telemetry.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 8665db8d03..5be2834757 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -421,15 +421,30 @@ create_socket(char *path)

 	struct sockaddr_un sun = {.sun_family = AF_UNIX};
 	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-	unlink(sun.sun_path);
+
 	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 		struct stat st;

-		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+		/* first check if we have a runtime dir */
+		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-		sun.sun_path[0] = 0;
-		goto error;
+			goto error;
+		}
+
+		/* check if current socket is active */
+		if (connect(sock, &sun, sizeof(sun)) == 0) {
+			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
+			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
+			path[0] = 0;
+			goto error;
+		}
+
+		/* socket is not active, delete and attempt rebind */
+		unlink(sun.sun_path);
+		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+			goto error;
+		}
 	}

 	if (listen(sock, 1) < 0) {
--
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts
  2021-09-15 14:10 [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts Bruce Richardson
@ 2021-09-22  8:43 ` Power, Ciara
  2021-09-24 16:18 ` [dpdk-dev] [PATCH v2] " Bruce Richardson
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 68+ messages in thread
From: Power, Ciara @ 2021-09-22  8:43 UTC (permalink / raw)
  To: Richardson, Bruce, dev; +Cc: Burakov, Anatoly, stable, David Marchand

Hi Bruce,


>-----Original Message-----
>From: Richardson, Bruce <bruce.richardson@intel.com>
>Sent: Wednesday 15 September 2021 15:11
>To: dev@dpdk.org
>Cc: Power, Ciara <ciara.power@intel.com>; Burakov, Anatoly
><anatoly.burakov@intel.com>; Richardson, Bruce
><bruce.richardson@intel.com>; stable@dpdk.org; David Marchand
><david.marchand@redhat.com>
>Subject: [PATCH] telemetry: fix "in-memory" process socket conflicts
>
>When DPDK is run with --in-memory mode, multiple processes can run
>simultaneously using the same runtime dir. This leads to each process
>removing another process' telemetry socket as it started up, giving
>unexpected behaviour.
>
>This patch changes that behaviour to first check if the existing socket is active.
>If not, it's an old socket to be cleaned up and can be removed. If it is active,
>telemetry initialization fails and an error message is printed out giving
>instructions on how to remove the error; either by using file-prefix to have a
>different runtime dir (and therefore socket path) or by disabling telemetry if it
>not needed.
>
>Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
>Cc: stable@dpdk.org
>
>Reported-by: David Marchand <david.marchand@redhat.com>
>Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>---
> lib/telemetry/telemetry.c | 25 ++++++++++++++++++++-----
> 1 file changed, 20 insertions(+), 5 deletions(-)
>
<snip>

Patch looks good overall, although I see CI is reporting some problems for FreeBSD:

../lib/telemetry/telemetry.c:435:21: error: incompatible pointer types passing 'struct sockaddr_un *' to parameter of type 'const struct sockaddr *' [-Werror,-Wincompatible-pointer-types]
                if (connect(sock, &sun, sizeof(sun)) == 0) {
                                  ^~~~
/usr/include/sys/socket.h:662:41: note: passing argument to parameter here
int     connect(int, const struct sockaddr *, socklen_t);
                                            ^
1 error generated.

Thanks,
Ciara

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v2] telemetry: fix "in-memory" process socket conflicts
  2021-09-15 14:10 [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts Bruce Richardson
  2021-09-22  8:43 ` Power, Ciara
@ 2021-09-24 16:18 ` Bruce Richardson
  2021-09-29  8:50   ` Power, Ciara
  2021-09-29 12:28   ` Kevin Traynor
  2021-09-29 13:54 ` [dpdk-dev] [PATCH v3] " Bruce Richardson
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-09-24 16:18 UTC (permalink / raw)
  To: dev
  Cc: ciara.power, anatoly.burakov, Bruce Richardson, stable, David Marchand

When DPDK is run with --in-memory mode, multiple processes can run
simultaneously using the same runtime dir. This leads to each process
removing another process' telemetry socket as it started up, giving
unexpected behaviour.

This patch changes that behaviour to first check if the existing socket
is active. If not, it's an old socket to be cleaned up and can be
removed. If it is active, telemetry initialization fails and an error
message is printed out giving instructions on how to remove the error;
either by using file-prefix to have a different runtime dir (and
therefore socket path) or by disabling telemetry if it not needed.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
v2: fix build error on FreeBSD
---
 lib/telemetry/telemetry.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 8304fbf6e9..78508c1a1d 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -457,15 +457,30 @@ create_socket(char *path)
 
 	struct sockaddr_un sun = {.sun_family = AF_UNIX};
 	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-	unlink(sun.sun_path);
+
 	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 		struct stat st;
 
-		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+		/* first check if we have a runtime dir */
+		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-		sun.sun_path[0] = 0;
-		goto error;
+			goto error;
+		}
+
+		/* check if current socket is active */
+		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
+			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
+			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
+			path[0] = 0;
+			goto error;
+		}
+
+		/* socket is not active, delete and attempt rebind */
+		unlink(sun.sun_path);
+		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+			goto error;
+		}
 	}
 
 	if (listen(sock, 1) < 0) {
-- 
2.32.0


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v2] telemetry: fix "in-memory" process socket conflicts
  2021-09-24 16:18 ` [dpdk-dev] [PATCH v2] " Bruce Richardson
@ 2021-09-29  8:50   ` Power, Ciara
  2021-09-29 12:28   ` Kevin Traynor
  1 sibling, 0 replies; 68+ messages in thread
From: Power, Ciara @ 2021-09-29  8:50 UTC (permalink / raw)
  To: Richardson, Bruce, dev; +Cc: Burakov, Anatoly, stable, David Marchand

Hi Bruce,

>-----Original Message-----
>From: Richardson, Bruce <bruce.richardson@intel.com>
>Sent: Friday 24 September 2021 17:19
>To: dev@dpdk.org
>Cc: Power, Ciara <ciara.power@intel.com>; Burakov, Anatoly
><anatoly.burakov@intel.com>; Richardson, Bruce
><bruce.richardson@intel.com>; stable@dpdk.org; David Marchand
><david.marchand@redhat.com>
>Subject: [PATCH v2] telemetry: fix "in-memory" process socket conflicts
>
>When DPDK is run with --in-memory mode, multiple processes can run
>simultaneously using the same runtime dir. This leads to each process
>removing another process' telemetry socket as it started up, giving
>unexpected behaviour.
>
>This patch changes that behaviour to first check if the existing socket is active.
>If not, it's an old socket to be cleaned up and can be removed. If it is active,
>telemetry initialization fails and an error message is printed out giving
>instructions on how to remove the error; either by using file-prefix to have a
>different runtime dir (and therefore socket path) or by disabling telemetry if it
>not needed.
>
>Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
>Cc: stable@dpdk.org
>
>Reported-by: David Marchand <david.marchand@redhat.com>
>Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>---
>v2: fix build error on FreeBSD
>---
<snip>

Acked-by: Ciara Power <ciara.power@intel.com>

Thanks!

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v2] telemetry: fix "in-memory" process socket conflicts
  2021-09-24 16:18 ` [dpdk-dev] [PATCH v2] " Bruce Richardson
  2021-09-29  8:50   ` Power, Ciara
@ 2021-09-29 12:28   ` Kevin Traynor
  2021-09-29 13:32     ` Bruce Richardson
  1 sibling, 1 reply; 68+ messages in thread
From: Kevin Traynor @ 2021-09-29 12:28 UTC (permalink / raw)
  To: Bruce Richardson, dev
  Cc: ciara.power, anatoly.burakov, stable, David Marchand

Hi Bruce,

On 24/09/2021 17:18, Bruce Richardson wrote:
> When DPDK is run with --in-memory mode, multiple processes can run
> simultaneously using the same runtime dir. This leads to each process
> removing another process' telemetry socket as it started up, giving
> unexpected behaviour.
> 
> This patch changes that behaviour to first check if the existing socket
> is active. If not, it's an old socket to be cleaned up and can be
> removed. If it is active, telemetry initialization fails and an error
> message is printed out giving instructions on how to remove the error;
> either by using file-prefix to have a different runtime dir (and
> therefore socket path) or by disabling telemetry if it not needed.
> 

telemetry is enabled by default but it may not be used by the 
application. Hitting this issue will cause rte_eal_init() to fail which 
will probably stop or severely limit the application.

So it could change a working application to a non-working one (albeit 
one that doesn't interfere with other process' sockets).

Can it just print a warning that telemetry will not be enabled and 
continue so it's not returning an rte_eal_init failure?

A more minor thing, I see it changes the behaviour from, last one runs 
with telemetry, to, first one runs with telemetry. Though it can be 
figured from the commit message, it might be worth calling that change 
out explicitly.

thanks,
Kevin.

> Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
> Cc: stable@dpdk.org
> 
> Reported-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
> v2: fix build error on FreeBSD
> ---
>   lib/telemetry/telemetry.c | 25 ++++++++++++++++++++-----
>   1 file changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
> index 8304fbf6e9..78508c1a1d 100644
> --- a/lib/telemetry/telemetry.c
> +++ b/lib/telemetry/telemetry.c
> @@ -457,15 +457,30 @@ create_socket(char *path)
>   
>   	struct sockaddr_un sun = {.sun_family = AF_UNIX};
>   	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
> -	unlink(sun.sun_path);
> +
>   	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
>   		struct stat st;
>   
> -		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
> -		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
> +		/* first check if we have a runtime dir */
> +		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
>   			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
> -		sun.sun_path[0] = 0;
> -		goto error;
> +			goto error;
> +		}
> +
> +		/* check if current socket is active */
> +		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
> +			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
> +			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
> +			path[0] = 0;
> +			goto error;
> +		}
> +
> +		/* socket is not active, delete and attempt rebind */
> +		unlink(sun.sun_path);
> +		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
> +			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
> +			goto error;
> +		}
>   	}
>   
>   	if (listen(sock, 1) < 0) {
> 


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v2] telemetry: fix "in-memory" process socket conflicts
  2021-09-29 12:28   ` Kevin Traynor
@ 2021-09-29 13:32     ` Bruce Richardson
  2021-09-29 13:51       ` Bruce Richardson
  2021-09-29 14:54       ` Kevin Traynor
  0 siblings, 2 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-09-29 13:32 UTC (permalink / raw)
  To: Kevin Traynor; +Cc: dev, ciara.power, anatoly.burakov, stable, David Marchand

On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
> Hi Bruce,
> 
> On 24/09/2021 17:18, Bruce Richardson wrote:
> > When DPDK is run with --in-memory mode, multiple processes can run
> > simultaneously using the same runtime dir. This leads to each process
> > removing another process' telemetry socket as it started up, giving
> > unexpected behaviour.
> > 
> > This patch changes that behaviour to first check if the existing socket
> > is active. If not, it's an old socket to be cleaned up and can be
> > removed. If it is active, telemetry initialization fails and an error
> > message is printed out giving instructions on how to remove the error;
> > either by using file-prefix to have a different runtime dir (and
> > therefore socket path) or by disabling telemetry if it not needed.
> > 
> 
> telemetry is enabled by default but it may not be used by the application.
> Hitting this issue will cause rte_eal_init() to fail which will probably
> stop or severely limit the application.
> 
> So it could change a working application to a non-working one (albeit one
> that doesn't interfere with other process' sockets).
> 
> Can it just print a warning that telemetry will not be enabled and continue
> so it's not returning an rte_eal_init failure?
> 

For a backported fix, yes, that would probably be better behaviour, but for
the latest branch, I think returning error and having the user explicitly
choose the resolution they want to occur is best. I'll see about doing a
separate backport patch for 20.11.

> A more minor thing, I see it changes the behaviour from, last one runs with
> telemetry, to, first one runs with telemetry. Though it can be figured from
> the commit message, it might be worth calling that change out explicitly.
> 

Sure. I'll resubmit a new version of this without stable CC'ed and include
that behaviour change explicitly in the commit log.

/Bruce


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v2] telemetry: fix "in-memory" process socket conflicts
  2021-09-29 13:32     ` Bruce Richardson
@ 2021-09-29 13:51       ` Bruce Richardson
  2021-09-29 14:54       ` Kevin Traynor
  1 sibling, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-09-29 13:51 UTC (permalink / raw)
  To: Kevin Traynor; +Cc: dev, ciara.power, anatoly.burakov, stable, David Marchand

On Wed, Sep 29, 2021 at 02:32:02PM +0100, Bruce Richardson wrote:
> On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
> > Hi Bruce,
> > 
> > On 24/09/2021 17:18, Bruce Richardson wrote:
> > > When DPDK is run with --in-memory mode, multiple processes can run
> > > simultaneously using the same runtime dir. This leads to each process
> > > removing another process' telemetry socket as it started up, giving
> > > unexpected behaviour.
> > > 
> > > This patch changes that behaviour to first check if the existing socket
> > > is active. If not, it's an old socket to be cleaned up and can be
> > > removed. If it is active, telemetry initialization fails and an error
> > > message is printed out giving instructions on how to remove the error;
> > > either by using file-prefix to have a different runtime dir (and
> > > therefore socket path) or by disabling telemetry if it not needed.
> > > 
> > 
> > telemetry is enabled by default but it may not be used by the application.
> > Hitting this issue will cause rte_eal_init() to fail which will probably
> > stop or severely limit the application.
> > 
> > So it could change a working application to a non-working one (albeit one
> > that doesn't interfere with other process' sockets).
> > 
> > Can it just print a warning that telemetry will not be enabled and continue
> > so it's not returning an rte_eal_init failure?
> > 
> 
> For a backported fix, yes, that would probably be better behaviour, but for
> the latest branch, I think returning error and having the user explicitly
> choose the resolution they want to occur is best. I'll see about doing a
> separate backport patch for 20.11.
> 
> > A more minor thing, I see it changes the behaviour from, last one runs with
> > telemetry, to, first one runs with telemetry. Though it can be figured from
> > the commit message, it might be worth calling that change out explicitly.
> > 
> 
> Sure. I'll resubmit a new version of this without stable CC'ed and include
> that behaviour change explicitly in the commit log.
> 
Actually, subtle behaviour change would be in the backport version that
doesn't error out, so I'll note it there when doing that patch, not in the
v3 of this one.

/Bruce

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v3] telemetry: fix "in-memory" process socket conflicts
  2021-09-15 14:10 [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts Bruce Richardson
  2021-09-22  8:43 ` Power, Ciara
  2021-09-24 16:18 ` [dpdk-dev] [PATCH v2] " Bruce Richardson
@ 2021-09-29 13:54 ` Bruce Richardson
  2021-10-05 11:47   ` Ferruh Yigit
  2021-10-01 11:15 ` [dpdk-dev] [PATCH v4 0/5] improve telemetry support with in-memory mode Bruce Richardson
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-09-29 13:54 UTC (permalink / raw)
  To: dev; +Cc: Kevin Traynor, Bruce Richardson, David Marchand, Ciara Power

When DPDK is run with --in-memory mode, multiple processes can run
simultaneously using the same runtime dir. This leads to each process,
as it starts up, removing the telemetry socket of another process,
giving unexpected behaviour.

This patch changes that behaviour to first check if the existing socket
is active. If not, it's an old socket to be cleaned up and can be
removed. If it is active, telemetry initialization fails and an error
message is printed out giving instructions on how to remove the error;
either by using file-prefix to have a different runtime dir (and
therefore socket path) or by disabling telemetry if it not needed.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
---
V3: Drop CC stable, as will have separate backport patch which does not
error out, so avoiding causing problems with currently running application

V2: fix build error on FreeBSD
---
 lib/telemetry/telemetry.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 8304fbf6e9..78508c1a1d 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -457,15 +457,30 @@ create_socket(char *path)

 	struct sockaddr_un sun = {.sun_family = AF_UNIX};
 	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-	unlink(sun.sun_path);
+
 	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 		struct stat st;

-		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+		/* first check if we have a runtime dir */
+		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-		sun.sun_path[0] = 0;
-		goto error;
+			goto error;
+		}
+
+		/* check if current socket is active */
+		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
+			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
+			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
+			path[0] = 0;
+			goto error;
+		}
+
+		/* socket is not active, delete and attempt rebind */
+		unlink(sun.sun_path);
+		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+			goto error;
+		}
 	}

 	if (listen(sock, 1) < 0) {
--
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v2] telemetry: fix "in-memory" process socket conflicts
  2021-09-29 13:32     ` Bruce Richardson
  2021-09-29 13:51       ` Bruce Richardson
@ 2021-09-29 14:54       ` Kevin Traynor
  2021-09-29 15:24         ` Bruce Richardson
  1 sibling, 1 reply; 68+ messages in thread
From: Kevin Traynor @ 2021-09-29 14:54 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: dev, ciara.power, anatoly.burakov, stable, David Marchand

On 29/09/2021 14:32, Bruce Richardson wrote:
> On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
>> Hi Bruce,
>>
>> On 24/09/2021 17:18, Bruce Richardson wrote:
>>> When DPDK is run with --in-memory mode, multiple processes can run
>>> simultaneously using the same runtime dir. This leads to each process
>>> removing another process' telemetry socket as it started up, giving
>>> unexpected behaviour.
>>>
>>> This patch changes that behaviour to first check if the existing socket
>>> is active. If not, it's an old socket to be cleaned up and can be
>>> removed. If it is active, telemetry initialization fails and an error
>>> message is printed out giving instructions on how to remove the error;
>>> either by using file-prefix to have a different runtime dir (and
>>> therefore socket path) or by disabling telemetry if it not needed.
>>>
>>
>> telemetry is enabled by default but it may not be used by the application.
>> Hitting this issue will cause rte_eal_init() to fail which will probably
>> stop or severely limit the application.
>>
>> So it could change a working application to a non-working one (albeit one
>> that doesn't interfere with other process' sockets).
>>
>> Can it just print a warning that telemetry will not be enabled and continue
>> so it's not returning an rte_eal_init failure?
>>
> 
> For a backported fix, yes, that would probably be better behaviour, but for
> the latest branch, I think returning error and having the user explicitly
> choose the resolution they want to occur is best. I'll see about doing a
> separate backport patch for 20.11.
> 

But this is a runtime message dependent on runtime environment. The user 
may not have access or know how to change eal parameters.

In the case where the application doesn't care about telemetry, they 
have gone from not having telemetry to rte_eal_init() failing, which 
probably has severe consequence.

I could maybe agree if telemetry was default disable and the application 
had set the --telemetry flag indicating that they want/need it. As it 
is, it feels like it's possibly a worse outcome for the user.

thanks,
Kevin.

>> A more minor thing, I see it changes the behaviour from, last one runs with
>> telemetry, to, first one runs with telemetry. Though it can be figured from
>> the commit message, it might be worth calling that change out explicitly.
>>
> 
> Sure. I'll resubmit a new version of this without stable CC'ed and include
> that behaviour change explicitly in the commit log.
> 
> /Bruce
> 


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v2] telemetry: fix "in-memory" process socket conflicts
  2021-09-29 14:54       ` Kevin Traynor
@ 2021-09-29 15:24         ` Bruce Richardson
  2021-09-29 15:31           ` Bruce Richardson
  0 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-09-29 15:24 UTC (permalink / raw)
  To: Kevin Traynor; +Cc: dev, ciara.power, anatoly.burakov, stable, David Marchand

On Wed, Sep 29, 2021 at 03:54:48PM +0100, Kevin Traynor wrote:
> On 29/09/2021 14:32, Bruce Richardson wrote:
> > On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
> > > Hi Bruce,
> > > 
> > > On 24/09/2021 17:18, Bruce Richardson wrote:
> > > > When DPDK is run with --in-memory mode, multiple processes can run
> > > > simultaneously using the same runtime dir. This leads to each process
> > > > removing another process' telemetry socket as it started up, giving
> > > > unexpected behaviour.
> > > > 
> > > > This patch changes that behaviour to first check if the existing socket
> > > > is active. If not, it's an old socket to be cleaned up and can be
> > > > removed. If it is active, telemetry initialization fails and an error
> > > > message is printed out giving instructions on how to remove the error;
> > > > either by using file-prefix to have a different runtime dir (and
> > > > therefore socket path) or by disabling telemetry if it not needed.
> > > > 
> > > 
> > > telemetry is enabled by default but it may not be used by the application.
> > > Hitting this issue will cause rte_eal_init() to fail which will probably
> > > stop or severely limit the application.
> > > 
> > > So it could change a working application to a non-working one (albeit one
> > > that doesn't interfere with other process' sockets).
> > > 
> > > Can it just print a warning that telemetry will not be enabled and continue
> > > so it's not returning an rte_eal_init failure?
> > > 
> > 
> > For a backported fix, yes, that would probably be better behaviour, but for
> > the latest branch, I think returning error and having the user explicitly
> > choose the resolution they want to occur is best. I'll see about doing a
> > separate backport patch for 20.11.
> > 
> 
> But this is a runtime message dependent on runtime environment. The user may
> not have access or know how to change eal parameters.

True. But on the other hand, this problem only occurs with non-default EAL
parameters anyway, so someone must have configured this with the
--in-memory flag.

> 
> In the case where the application doesn't care about telemetry, they have
> gone from not having telemetry to rte_eal_init() failing, which probably has
> severe consequence.
> 

Yes, I agree, which I why I would suggest that for any backport of this
fix, the error be made non-fatal as you suggest. [Having looked into it,
having it as a non-fatal error is rather awkward, so it may be best just
left unfixed and the current behaviour documented as known-issue].

However, for any application being updated and rebuilt against 21.11, I
would have thought it reasonable to flag this as an error, as any such
application would require revalidation anyway.

> I could maybe agree if telemetry was default disable and the application had
> set the --telemetry flag indicating that they want/need it. As it is, it
> feels like it's possibly a worse outcome for the user.
> 

Perhaps, but I believe the only case of there being an issue would be where:
1) a user who cannot modify the EAL parameters
2) runs an application which has been updated and rebuilt against 21.11
3) where that application is hard-coded to use in-memory mode and
4) has never been verified with two or more instances of that running?
Or am I missing something here?

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v2] telemetry: fix "in-memory" process socket conflicts
  2021-09-29 15:24         ` Bruce Richardson
@ 2021-09-29 15:31           ` Bruce Richardson
  2021-09-29 16:01             ` Kevin Traynor
  0 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-09-29 15:31 UTC (permalink / raw)
  To: Kevin Traynor; +Cc: dev, ciara.power, anatoly.burakov, stable, David Marchand

On Wed, Sep 29, 2021 at 04:24:06PM +0100, Bruce Richardson wrote:
> On Wed, Sep 29, 2021 at 03:54:48PM +0100, Kevin Traynor wrote:
> > On 29/09/2021 14:32, Bruce Richardson wrote:
> > > On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
> > > > Hi Bruce,
> > > > 
> > > > On 24/09/2021 17:18, Bruce Richardson wrote:
> > > > > When DPDK is run with --in-memory mode, multiple processes can run
> > > > > simultaneously using the same runtime dir. This leads to each process
> > > > > removing another process' telemetry socket as it started up, giving
> > > > > unexpected behaviour.
> > > > > 
> > > > > This patch changes that behaviour to first check if the existing socket
> > > > > is active. If not, it's an old socket to be cleaned up and can be
> > > > > removed. If it is active, telemetry initialization fails and an error
> > > > > message is printed out giving instructions on how to remove the error;
> > > > > either by using file-prefix to have a different runtime dir (and
> > > > > therefore socket path) or by disabling telemetry if it not needed.
> > > > > 
> > > > 
> > > > telemetry is enabled by default but it may not be used by the application.
> > > > Hitting this issue will cause rte_eal_init() to fail which will probably
> > > > stop or severely limit the application.
> > > > 
> > > > So it could change a working application to a non-working one (albeit one
> > > > that doesn't interfere with other process' sockets).
> > > > 
> > > > Can it just print a warning that telemetry will not be enabled and continue
> > > > so it's not returning an rte_eal_init failure?
> > > > 
> > > 
> > > For a backported fix, yes, that would probably be better behaviour, but for
> > > the latest branch, I think returning error and having the user explicitly
> > > choose the resolution they want to occur is best. I'll see about doing a
> > > separate backport patch for 20.11.
> > > 
> > 
> > But this is a runtime message dependent on runtime environment. The user may
> > not have access or know how to change eal parameters.
> 
> True. But on the other hand, this problem only occurs with non-default EAL
> parameters anyway, so someone must have configured this with the
> --in-memory flag.
> 
> > 
> > In the case where the application doesn't care about telemetry, they have
> > gone from not having telemetry to rte_eal_init() failing, which probably has
> > severe consequence.
> > 
> 
> Yes, I agree, which I why I would suggest that for any backport of this
> fix, the error be made non-fatal as you suggest. [Having looked into it,
> having it as a non-fatal error is rather awkward, so it may be best just
> left unfixed and the current behaviour documented as known-issue].
> 
> However, for any application being updated and rebuilt against 21.11, I
> would have thought it reasonable to flag this as an error, as any such
> application would require revalidation anyway.
> 
> > I could maybe agree if telemetry was default disable and the application had
> > set the --telemetry flag indicating that they want/need it. As it is, it
> > feels like it's possibly a worse outcome for the user.
> > 
> 
> Perhaps, but I believe the only case of there being an issue would be where:
> 1) a user who cannot modify the EAL parameters
> 2) runs an application which has been updated and rebuilt against 21.11
> 3) where that application is hard-coded to use in-memory mode and
> 4) has never been verified with two or more instances of that running?
> Or am I missing something here?
> 

Let me also go back to the drawing board on the solution here a bit, and
see if I can come up with something better. If I can find a reasonable way
to make it so that we can always create a socket in in-memory mode, despite
other processes running, it would sidestep this problem completely. Not
sure if it's possible, but let me see if I can come up with some ideas.
[One idea I did try is using abstract sockets on linux, but with those we
lose out on the permissions/protection we get from having a filesystem
path, so were a no-go for me because of that]

/Bruce

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v2] telemetry: fix "in-memory" process socket conflicts
  2021-09-29 15:31           ` Bruce Richardson
@ 2021-09-29 16:01             ` Kevin Traynor
  0 siblings, 0 replies; 68+ messages in thread
From: Kevin Traynor @ 2021-09-29 16:01 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: dev, ciara.power, anatoly.burakov, stable, David Marchand

On 29/09/2021 16:31, Bruce Richardson wrote:
> On Wed, Sep 29, 2021 at 04:24:06PM +0100, Bruce Richardson wrote:
>> On Wed, Sep 29, 2021 at 03:54:48PM +0100, Kevin Traynor wrote:
>>> On 29/09/2021 14:32, Bruce Richardson wrote:
>>>> On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
>>>>> Hi Bruce,
>>>>>
>>>>> On 24/09/2021 17:18, Bruce Richardson wrote:
>>>>>> When DPDK is run with --in-memory mode, multiple processes can run
>>>>>> simultaneously using the same runtime dir. This leads to each process
>>>>>> removing another process' telemetry socket as it started up, giving
>>>>>> unexpected behaviour.
>>>>>>
>>>>>> This patch changes that behaviour to first check if the existing socket
>>>>>> is active. If not, it's an old socket to be cleaned up and can be
>>>>>> removed. If it is active, telemetry initialization fails and an error
>>>>>> message is printed out giving instructions on how to remove the error;
>>>>>> either by using file-prefix to have a different runtime dir (and
>>>>>> therefore socket path) or by disabling telemetry if it not needed.
>>>>>>
>>>>>
>>>>> telemetry is enabled by default but it may not be used by the application.
>>>>> Hitting this issue will cause rte_eal_init() to fail which will probably
>>>>> stop or severely limit the application.
>>>>>
>>>>> So it could change a working application to a non-working one (albeit one
>>>>> that doesn't interfere with other process' sockets).
>>>>>
>>>>> Can it just print a warning that telemetry will not be enabled and continue
>>>>> so it's not returning an rte_eal_init failure?
>>>>>
>>>>
>>>> For a backported fix, yes, that would probably be better behaviour, but for
>>>> the latest branch, I think returning error and having the user explicitly
>>>> choose the resolution they want to occur is best. I'll see about doing a
>>>> separate backport patch for 20.11.
>>>>
>>>
>>> But this is a runtime message dependent on runtime environment. The user may
>>> not have access or know how to change eal parameters.
>>
>> True. But on the other hand, this problem only occurs with non-default EAL
>> parameters anyway, so someone must have configured this with the
>> --in-memory flag.
>>
>>>
>>> In the case where the application doesn't care about telemetry, they have
>>> gone from not having telemetry to rte_eal_init() failing, which probably has
>>> severe consequence.
>>>
>>
>> Yes, I agree, which I why I would suggest that for any backport of this
>> fix, the error be made non-fatal as you suggest. [Having looked into it,
>> having it as a non-fatal error is rather awkward, so it may be best just
>> left unfixed and the current behaviour documented as known-issue].
>>
>> However, for any application being updated and rebuilt against 21.11, I
>> would have thought it reasonable to flag this as an error, as any such
>> application would require revalidation anyway.
>>
>>> I could maybe agree if telemetry was default disable and the application had
>>> set the --telemetry flag indicating that they want/need it. As it is, it
>>> feels like it's possibly a worse outcome for the user.
>>>
>>
>> Perhaps, but I believe the only case of there being an issue would be where:
>> 1) a user who cannot modify the EAL parameters
>> 2) runs an application which has been updated and rebuilt against 21.11
>> 3) where that application is hard-coded to use in-memory mode and >> 4) has never been verified with two or more instances of that running?

That's a reasonable point that if it has in-memory hardcoded you might 
expect it to be tested with two or more, and if it's not hardcoded, it 
is added by the user so they are able to set eal params.

I still see an extra step for the user but I agree if they can set eal 
params then it is a lot less impactful. For OVS, a user could update the 
dpdk-extra ovsdb entry for the additional eal flags.

>> Or am I missing something here?
>>
> 
> Let me also go back to the drawing board on the solution here a bit, and
> see if I can come up with something better. If I can find a reasonable way
> to make it so that we can always create a socket in in-memory mode, despite
> other processes running, it would sidestep this problem completely. Not
> sure if it's possible, but let me see if I can come up with some ideas.
> [One idea I did try is using abstract sockets on linux, but with those we
> lose out on the permissions/protection we get from having a filesystem
> path, so were a no-go for me because of that]
> 

ok, thanks Bruce. I think you got the concerns anyway. I suppose a part 
of it goes back to: telemetry is default, but does that imply that it is 
required and dpdk should error out if it is not available or not.

Kevin.

> /Bruce
> 


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v4 0/5] improve telemetry support with in-memory mode
  2021-09-15 14:10 [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts Bruce Richardson
                   ` (2 preceding siblings ...)
  2021-09-29 13:54 ` [dpdk-dev] [PATCH v3] " Bruce Richardson
@ 2021-10-01 11:15 ` Bruce Richardson
  2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 1/5] eal: limit telemetry to primary processes Bruce Richardson
                     ` (4 more replies)
  2021-10-01 16:22 ` [dpdk-dev] [PATCH v5 0/5] improve telemetry support with in-memory mode Bruce Richardson
                   ` (4 subsequent siblings)
  8 siblings, 5 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 11:15 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

This patchset cleans up telemetry support for "in-memory" mode, so that
multiple independent processes can be run using that mode and still have
telemetry support. It also removes problems of one process removing the
socket of another - which was the original issue reported. The main changes
in this set are to:

* disable telemetry for secondary processes, which prevents any socket
  conflicts in multi-process cases.
* add a suffix to the telemetry sockets of any process run using
  "in-memory" mode. This suffix is the PID and ensures we don't get path
  conflicts on the sockets
* inform the user of any path conflicts we do encounter for some reason,
  and present them with the options to solve it, instead of just removing
  the conflicting socket and continuing.
  [This was the original patch in V1-V3]
* update the telemetry script and documentation to allow it to connect to
  in-memory DPDK processes.

---
V4: Move from simple-fix patch to proper fix patchset

V3: Drop CC stable, as will have separate backport patch which does not
error out, so avoiding causing problems with currently running application

V2: fix build error on FreeBSD

Bruce Richardson (5):
  eal: limit telemetry to primary processes
  telemetry: fix deletion of active sockets
  telemetry: use unique socket paths for in-memory mode
  usertools/dpdk-telemetry: connect to in-memory processes
  usertools/dpdk-telemetry: provide info on available sockets

 doc/guides/howto/telemetry.rst     | 26 ++++++++++++++++-
 lib/eal/freebsd/eal.c              |  3 +-
 lib/eal/linux/eal.c                |  3 +-
 lib/telemetry/telemetry.c          | 40 ++++++++++++++++++++------
 lib/telemetry/telemetry_internal.h |  3 +-
 usertools/dpdk-telemetry.py        | 45 +++++++++++++++++++++++++++---
 6 files changed, 104 insertions(+), 16 deletions(-)

--
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v4 1/5] eal: limit telemetry to primary processes
  2021-10-01 11:15 ` [dpdk-dev] [PATCH v4 0/5] improve telemetry support with in-memory mode Bruce Richardson
@ 2021-10-01 11:15   ` Bruce Richardson
  2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 2/5] telemetry: fix deletion of active sockets Bruce Richardson
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 11:15 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

Telemetry interface should be exposed for primary processes only, since
secondary processes will conflict on socket creation, and since all
data in secondary process is generally available to primary. For
example, all device stats for ethdevs, cryptodevs, etc. will all be
common across processes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/eal/freebsd/eal.c | 2 +-
 lib/eal/linux/eal.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 6cee5ae369..b06a2c1662 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -946,7 +946,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 3577eaeaa4..0d0fc66668 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1320,7 +1320,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v4 2/5] telemetry: fix deletion of active sockets
  2021-10-01 11:15 ` [dpdk-dev] [PATCH v4 0/5] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 1/5] eal: limit telemetry to primary processes Bruce Richardson
@ 2021-10-01 11:15   ` Bruce Richardson
  2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 11:15 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When DPDK is run with --in-memory mode, multiple processes can run
simultaneously using the same runtime dir. This leads to each process,
as it starts up, removing the telemetry socket of another process,
giving unexpected behaviour.

This patch changes that behaviour to first check if the existing socket
is active. If not, it's an old socket to be cleaned up and can be
removed. If it is active, telemetry initialization fails and an error
message is printed out giving instructions on how to remove the error;
either by using file-prefix to have a different runtime dir (and
therefore socket path) or by disabling telemetry if it not needed.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
---
 lib/telemetry/telemetry.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 8304fbf6e9..78508c1a1d 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -457,15 +457,30 @@ create_socket(char *path)
 
 	struct sockaddr_un sun = {.sun_family = AF_UNIX};
 	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-	unlink(sun.sun_path);
+
 	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 		struct stat st;
 
-		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+		/* first check if we have a runtime dir */
+		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-		sun.sun_path[0] = 0;
-		goto error;
+			goto error;
+		}
+
+		/* check if current socket is active */
+		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
+			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
+			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
+			path[0] = 0;
+			goto error;
+		}
+
+		/* socket is not active, delete and attempt rebind */
+		unlink(sun.sun_path);
+		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+			goto error;
+		}
 	}
 
 	if (listen(sock, 1) < 0) {
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v4 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-01 11:15 ` [dpdk-dev] [PATCH v4 0/5] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 1/5] eal: limit telemetry to primary processes Bruce Richardson
  2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 2/5] telemetry: fix deletion of active sockets Bruce Richardson
@ 2021-10-01 11:15   ` Bruce Richardson
  2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 4/5] usertools/dpdk-telemetry: connect to in-memory processes Bruce Richardson
  2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 11:15 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When DPDK is run using "in-memory" flag, multiple processes can be run
using the same file-prefix and hence the same runtime directory. To
avoid problems with conflicting telemetry unix socket paths, we can
put the pid of the process into the socket name. As with the existing
telemetry socket files, these sockets are removed on normal program
exit.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/howto/telemetry.rst     | 17 ++++++++++++++++-
 lib/eal/freebsd/eal.c              |  1 +
 lib/eal/linux/eal.c                |  1 +
 lib/telemetry/telemetry.c          | 15 ++++++++++++---
 lib/telemetry/telemetry_internal.h |  3 ++-
 5 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index 8f4fa1a510..8a61302459 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -13,12 +13,27 @@ ethdev port list, and eal parameters.
 Telemetry Interface
 -------------------
 
-The :doc:`../prog_guide/telemetry_lib` opens a socket with path
+For applications run normally, i.e. without the `--in-memory` EAL flag,
+the :doc:`../prog_guide/telemetry_lib` opens a socket with path
 *<runtime_directory>/dpdk_telemetry.<version>*. The version represents the
 telemetry version, the latest is v2. For example, a client would connect to a
 socket with path  */var/run/dpdk/\*/dpdk_telemetry.v2* (when the primary process
 is run by a root user).
 
+For applications run with the `--in-memory` EAL flag,
+the socket file is created with an additional suffix of the process PID.
+This is because multiple independent DPDK processes can be run simultaneously
+using the same runtime directory when *in-memory* mode is used.
+For example, when a user with UID 1000 runs processes with in-memory mode,
+we would find sockets available such as::
+
+  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1982
+  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1935
+
+Where `/run/user/<uid>` is the runtime directory for the user given by the
+`$XDG_RUNTIME_DIR` environment variable,
+and `rte` is the default DPDK file prefix used for a runtime directory.
+
 
 Telemetry Initialization
 ------------------------
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index b06a2c1662..ed39d10b4e 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -952,6 +952,7 @@ rte_eal_init(int argc, char **argv)
 		if (tlog < 0)
 			tlog = RTE_LOGTYPE_EAL;
 		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
+				internal_conf->in_memory | internal_conf->no_shconf,
 				rte_version(),
 				&internal_conf->ctrl_cpuset, rte_log, tlog) != 0)
 			return -1;
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 0d0fc66668..9db4eb7913 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1326,6 +1326,7 @@ rte_eal_init(int argc, char **argv)
 		if (tlog < 0)
 			tlog = RTE_LOGTYPE_EAL;
 		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
+				internal_conf->in_memory | internal_conf->no_shconf,
 				rte_version(),
 				&internal_conf->ctrl_cpuset, rte_log, tlog) != 0)
 			return -1;
diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 78508c1a1d..0e5ef29fff 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -51,6 +51,7 @@ static struct socket v1_socket; /* socket for v1 telemetry */
 
 static const char *telemetry_version; /* save rte_version */
 static const char *socket_dir;        /* runtime directory */
+static bool socket_uses_pid;          /* for in-memory mode, we need different socket paths */
 static rte_cpuset_t *thread_cpuset;
 static rte_log_fn rte_log_ptr;
 static uint32_t logtype;
@@ -432,8 +433,14 @@ static inline char *
 get_socket_path(const char *runtime_dir, const int version)
 {
 	static char path[PATH_MAX];
-	snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d",
-			strlen(runtime_dir) ? runtime_dir : "/tmp", version);
+	if (!socket_uses_pid)
+		snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d",
+				strlen(runtime_dir) ? runtime_dir : "/tmp", version);
+	else
+		snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d.%u",
+				strlen(runtime_dir) ? runtime_dir : "/tmp",
+				version,
+				(unsigned int)getpid());
 	return path;
 }
 
@@ -587,11 +594,13 @@ telemetry_v2_init(void)
 #endif /* !RTE_EXEC_ENV_WINDOWS */
 
 int32_t
-rte_telemetry_init(const char *runtime_dir, const char *rte_version, rte_cpuset_t *cpuset,
+rte_telemetry_init(const char *runtime_dir, bool in_memory,
+		const char *rte_version, rte_cpuset_t *cpuset,
 		rte_log_fn log_fn, uint32_t registered_logtype)
 {
 	telemetry_version = rte_version;
 	socket_dir = runtime_dir;
+	socket_uses_pid = in_memory; /* for in-memory mode use pid in sock path for uniqueness */
 	thread_cpuset = cpuset;
 	rte_log_ptr = log_fn;
 	logtype = registered_logtype;
diff --git a/lib/telemetry/telemetry_internal.h b/lib/telemetry/telemetry_internal.h
index d085c492dc..d8fb37a633 100644
--- a/lib/telemetry/telemetry_internal.h
+++ b/lib/telemetry/telemetry_internal.h
@@ -109,7 +109,8 @@ typedef int (*rte_log_fn)(uint32_t level, uint32_t logtype, const char *format,
  */
 __rte_internal
 int
-rte_telemetry_init(const char *runtime_dir, const char *rte_version, rte_cpuset_t *cpuset,
+rte_telemetry_init(const char *runtime_dir, bool in_memory,
+		const char *rte_version, rte_cpuset_t *cpuset,
 		rte_log_fn log_fn, uint32_t registered_logtype);
 
 #endif
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v4 4/5] usertools/dpdk-telemetry: connect to in-memory processes
  2021-10-01 11:15 ` [dpdk-dev] [PATCH v4 0/5] improve telemetry support with in-memory mode Bruce Richardson
                     ` (2 preceding siblings ...)
  2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
@ 2021-10-01 11:15   ` Bruce Richardson
  2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 11:15 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

Allow connecting to an in-memory process via "-p <pid>" flag, which can
be used to identify the in-memory process to which to connect.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/howto/telemetry.rst | 6 ++++++
 usertools/dpdk-telemetry.py    | 7 ++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index 8a61302459..c3adca9504 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -63,6 +63,12 @@ and query information using the telemetry client python script.
 
       ./usertools/dpdk-telemetry.py
 
+   .. note::
+
+     When connecting to a process run with `--in-memory` EAL flag,
+     one must specify the PID of the process to connect to using the `-p` flag.
+     This is because there may be multiple such instances.
+
 #. When connected, the script displays the following, waiting for user input::
 
      Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index e04aa04702..780aee0c59 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -104,6 +104,11 @@ def get_dpdk_runtime_dir(fp):
 parser = argparse.ArgumentParser()
 parser.add_argument('-f', '--file-prefix', \
         help='Provide file-prefix for DPDK runtime directory', default='rte')
+parser.add_argument('-p', '--pid', \
+        help='Connect to DPDK process with the given pid')
 args = parser.parse_args()
 rdir = get_dpdk_runtime_dir(args.file_prefix)
-handle_socket(os.path.join(rdir, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)))
+sock_path = os.path.join(rdir, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+if args.pid:
+    sock_path += ".{}".format(args.pid)
+handle_socket(sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v4 5/5] usertools/dpdk-telemetry: provide info on available sockets
  2021-10-01 11:15 ` [dpdk-dev] [PATCH v4 0/5] improve telemetry support with in-memory mode Bruce Richardson
                     ` (3 preceding siblings ...)
  2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 4/5] usertools/dpdk-telemetry: connect to in-memory processes Bruce Richardson
@ 2021-10-01 11:15   ` Bruce Richardson
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 11:15 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When a user runs the dpdk-telemetry script and fails to connect because
the socket path does not exist, run a scan for possible sockets that
could be connected to and inform the user of the command needed to
connect to those.

For example, when running the script without any parameters, but there
are DPDK processes running with in-memory mode (so they need to be
identified by PID), the error message will include the details of how to
connect to each process:

  $ ./usertools/dpdk-telemetry.py
  Connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2
  Error connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2

  Other DPDK telemetry sockets found:
  - dpdk_telemetry.v2.20755  # Connect with './usertools/dpdk-telemetry.py -p 20755'
  - dpdk_telemetry.v2.20451  # Connect with './usertools/dpdk-telemetry.py -p 20451'

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/howto/telemetry.rst |  3 +++
 usertools/dpdk-telemetry.py    | 46 ++++++++++++++++++++++++++++------
 2 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index c3adca9504..cdde57cb3b 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -68,6 +68,9 @@ and query information using the telemetry client python script.
      When connecting to a process run with `--in-memory` EAL flag,
      one must specify the PID of the process to connect to using the `-p` flag.
      This is because there may be multiple such instances.
+     If there are only *in-memory* DPDK processes to connect to,
+     and no PID, or an invalid PID parameter, is provided,
+     the telemetry script will list any available telemetry sockets and how to connect to them.
 
 #. When connected, the script displays the following, waiting for user input::
 
diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index 780aee0c59..08d5bb5fab 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -9,6 +9,7 @@
 
 import socket
 import os
+import sys
 import glob
 import json
 import errno
@@ -17,6 +18,8 @@
 
 # global vars
 TELEMETRY_VERSION = "v2"
+SOCKET_NAME = 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)
+DEFAULT_PREFIX = 'rte'
 CMDS = []
 
 
@@ -48,7 +51,28 @@ def get_app_name(pid):
     return None
 
 
-def handle_socket(path):
+def find_sockets(path):
+    """ Find any possible sockets to connect to and return them """
+    return glob.glob(os.path.join(path, SOCKET_NAME + '*'))
+
+
+def print_socket_options(args, paths):
+    """ Given a set of socket paths, give the commands needed to connect """
+    cmd = sys.argv[0]
+    if args.file_prefix != DEFAULT_PREFIX:
+        cmd += " -f " + args.file_prefix
+    for s in paths:
+        sock_name = os.path.basename(s)
+        if sock_name.endswith(TELEMETRY_VERSION):
+            print("- {}  # Connect with '{}'".format(os.path.basename(s),
+                                                     cmd))
+        else:
+            print("- {}  # Connect with '{} -p {}'".format(os.path.basename(s),
+                                                           cmd,
+                                                           s.split('.')[-1]))
+
+
+def handle_socket(args, path):
     """ Connect to socket and handle user input """
     sock = socket.socket(socket.AF_UNIX, socket.SOCK_SEQPACKET)
     global CMDS
@@ -58,6 +82,14 @@ def handle_socket(path):
     except OSError:
         print("Error connecting to " + path)
         sock.close()
+        if os.path.exists(path):
+            return  # if socket exists but is bad, just return
+        # if user didn't give a valid socket path, but there are
+        # some sockets, help the user out by printing how to connect
+        socks = find_sockets(os.path.dirname(path))
+        if socks:
+            print("\nOther DPDK telemetry sockets found:")
+            print_socket_options(args, socks)
         return
     json_reply = read_socket(sock, 1024)
     output_buf_len = json_reply["max_output_len"]
@@ -102,13 +134,13 @@ def get_dpdk_runtime_dir(fp):
 readline.set_completer_delims(readline.get_completer_delims().replace('/', ''))
 
 parser = argparse.ArgumentParser()
-parser.add_argument('-f', '--file-prefix', \
-        help='Provide file-prefix for DPDK runtime directory', default='rte')
-parser.add_argument('-p', '--pid', \
-        help='Connect to DPDK process with the given pid')
+parser.add_argument('-f', '--file-prefix', default=DEFAULT_PREFIX,
+                    help='Provide file-prefix for DPDK runtime directory')
+parser.add_argument('-p', '--pid',
+                    help='Connect to DPDK process with the given pid')
 args = parser.parse_args()
 rdir = get_dpdk_runtime_dir(args.file_prefix)
-sock_path = os.path.join(rdir, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+sock_path = os.path.join(rdir, SOCKET_NAME)
 if args.pid:
     sock_path += ".{}".format(args.pid)
-handle_socket(sock_path)
+handle_socket(args, sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v5 0/5] improve telemetry support with in-memory mode
  2021-09-15 14:10 [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts Bruce Richardson
                   ` (3 preceding siblings ...)
  2021-10-01 11:15 ` [dpdk-dev] [PATCH v4 0/5] improve telemetry support with in-memory mode Bruce Richardson
@ 2021-10-01 16:22 ` Bruce Richardson
  2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 1/5] eal: limit telemetry to primary processes Bruce Richardson
                     ` (4 more replies)
  2021-10-05 13:59 ` [dpdk-dev] [PATCH v6 0/5] improve telemetry support with in-memory mode Bruce Richardson
                   ` (3 subsequent siblings)
  8 siblings, 5 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 16:22 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

This patchset cleans up telemetry support for "in-memory" mode, so that
multiple independent processes can be run using that mode and still have
telemetry support. It also removes problems of one process removing the
socket of another - which was the original issue reported. The main changes
in this set are to:

* disable telemetry for secondary processes, which prevents any socket
  conflicts in multi-process cases.
* add a suffix to the telemetry sockets of any process run using
  "in-memory" mode. This suffix is the PID and ensures we don't get path
  conflicts on the sockets
* inform the user of any path conflicts we do encounter for some reason,
  and present them with the options to solve it, instead of just removing
  the conflicting socket and continuing.
  [This was the original patch in V1-V3]
* update the telemetry script and documentation to allow it to connect to
  in-memory DPDK processes.

---
V5: Rebase on latest main after other script cleanups were merged

V4: Move from simple-fix patch to proper fix patchset

V3: Drop CC stable, as will have separate backport patch which does not
error out, so avoiding causing problems with currently running application

V2: fix build error on FreeBSD


Bruce Richardson (5):
  eal: limit telemetry to primary processes
  telemetry: fix deletion of active sockets
  telemetry: use unique socket paths for in-memory mode
  usertools/dpdk-telemetry: connect to in-memory processes
  usertools/dpdk-telemetry: provide info on available sockets

 doc/guides/howto/telemetry.rst     | 26 +++++++++++++++++-
 lib/eal/freebsd/eal.c              |  3 +-
 lib/eal/linux/eal.c                |  3 +-
 lib/telemetry/telemetry.c          | 40 +++++++++++++++++++++------
 lib/telemetry/telemetry_internal.h |  3 +-
 usertools/dpdk-telemetry.py        | 44 ++++++++++++++++++++++++++++--
 6 files changed, 104 insertions(+), 15 deletions(-)

--
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v5 1/5] eal: limit telemetry to primary processes
  2021-10-01 16:22 ` [dpdk-dev] [PATCH v5 0/5] improve telemetry support with in-memory mode Bruce Richardson
@ 2021-10-01 16:22   ` Bruce Richardson
  2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 2/5] telemetry: fix deletion of active sockets Bruce Richardson
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 16:22 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

Telemetry interface should be exposed for primary processes only, since
secondary processes will conflict on socket creation, and since all
data in secondary process is generally available to primary. For
example, all device stats for ethdevs, cryptodevs, etc. will all be
common across processes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/eal/freebsd/eal.c | 2 +-
 lib/eal/linux/eal.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 6cee5ae369..b06a2c1662 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -946,7 +946,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 3577eaeaa4..0d0fc66668 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1320,7 +1320,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v5 2/5] telemetry: fix deletion of active sockets
  2021-10-01 16:22 ` [dpdk-dev] [PATCH v5 0/5] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 1/5] eal: limit telemetry to primary processes Bruce Richardson
@ 2021-10-01 16:22   ` Bruce Richardson
  2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 16:22 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When DPDK is run with --in-memory mode, multiple processes can run
simultaneously using the same runtime dir. This leads to each process,
as it starts up, removing the telemetry socket of another process,
giving unexpected behaviour.

This patch changes that behaviour to first check if the existing socket
is active. If not, it's an old socket to be cleaned up and can be
removed. If it is active, telemetry initialization fails and an error
message is printed out giving instructions on how to remove the error;
either by using file-prefix to have a different runtime dir (and
therefore socket path) or by disabling telemetry if it not needed.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
---
 lib/telemetry/telemetry.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 48f4c7ba46..6676322991 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -457,15 +457,30 @@ create_socket(char *path)
 
 	struct sockaddr_un sun = {.sun_family = AF_UNIX};
 	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-	unlink(sun.sun_path);
+
 	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 		struct stat st;
 
-		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+		/* first check if we have a runtime dir */
+		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-		sun.sun_path[0] = 0;
-		goto error;
+			goto error;
+		}
+
+		/* check if current socket is active */
+		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
+			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
+			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
+			path[0] = 0;
+			goto error;
+		}
+
+		/* socket is not active, delete and attempt rebind */
+		unlink(sun.sun_path);
+		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+			goto error;
+		}
 	}
 
 	if (listen(sock, 1) < 0) {
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v5 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-01 16:22 ` [dpdk-dev] [PATCH v5 0/5] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 1/5] eal: limit telemetry to primary processes Bruce Richardson
  2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 2/5] telemetry: fix deletion of active sockets Bruce Richardson
@ 2021-10-01 16:22   ` Bruce Richardson
  2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 4/5] usertools/dpdk-telemetry: connect to in-memory processes Bruce Richardson
  2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 16:22 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When DPDK is run using "in-memory" flag, multiple processes can be run
using the same file-prefix and hence the same runtime directory. To
avoid problems with conflicting telemetry unix socket paths, we can
put the pid of the process into the socket name. As with the existing
telemetry socket files, these sockets are removed on normal program
exit.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/howto/telemetry.rst     | 17 ++++++++++++++++-
 lib/eal/freebsd/eal.c              |  1 +
 lib/eal/linux/eal.c                |  1 +
 lib/telemetry/telemetry.c          | 15 ++++++++++++---
 lib/telemetry/telemetry_internal.h |  3 ++-
 5 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index 8f4fa1a510..8a61302459 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -13,12 +13,27 @@ ethdev port list, and eal parameters.
 Telemetry Interface
 -------------------
 
-The :doc:`../prog_guide/telemetry_lib` opens a socket with path
+For applications run normally, i.e. without the `--in-memory` EAL flag,
+the :doc:`../prog_guide/telemetry_lib` opens a socket with path
 *<runtime_directory>/dpdk_telemetry.<version>*. The version represents the
 telemetry version, the latest is v2. For example, a client would connect to a
 socket with path  */var/run/dpdk/\*/dpdk_telemetry.v2* (when the primary process
 is run by a root user).
 
+For applications run with the `--in-memory` EAL flag,
+the socket file is created with an additional suffix of the process PID.
+This is because multiple independent DPDK processes can be run simultaneously
+using the same runtime directory when *in-memory* mode is used.
+For example, when a user with UID 1000 runs processes with in-memory mode,
+we would find sockets available such as::
+
+  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1982
+  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1935
+
+Where `/run/user/<uid>` is the runtime directory for the user given by the
+`$XDG_RUNTIME_DIR` environment variable,
+and `rte` is the default DPDK file prefix used for a runtime directory.
+
 
 Telemetry Initialization
 ------------------------
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index b06a2c1662..ed39d10b4e 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -952,6 +952,7 @@ rte_eal_init(int argc, char **argv)
 		if (tlog < 0)
 			tlog = RTE_LOGTYPE_EAL;
 		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
+				internal_conf->in_memory | internal_conf->no_shconf,
 				rte_version(),
 				&internal_conf->ctrl_cpuset, rte_log, tlog) != 0)
 			return -1;
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 0d0fc66668..9db4eb7913 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1326,6 +1326,7 @@ rte_eal_init(int argc, char **argv)
 		if (tlog < 0)
 			tlog = RTE_LOGTYPE_EAL;
 		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
+				internal_conf->in_memory | internal_conf->no_shconf,
 				rte_version(),
 				&internal_conf->ctrl_cpuset, rte_log, tlog) != 0)
 			return -1;
diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 6676322991..0ec6f4883e 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -51,6 +51,7 @@ static struct socket v1_socket; /* socket for v1 telemetry */
 
 static const char *telemetry_version; /* save rte_version */
 static const char *socket_dir;        /* runtime directory */
+static bool socket_uses_pid;          /* for in-memory mode, we need different socket paths */
 static rte_cpuset_t *thread_cpuset;
 static rte_log_fn rte_log_ptr;
 static uint32_t logtype;
@@ -432,8 +433,14 @@ static inline char *
 get_socket_path(const char *runtime_dir, const int version)
 {
 	static char path[PATH_MAX];
-	snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d",
-			strlen(runtime_dir) ? runtime_dir : "/tmp", version);
+	if (!socket_uses_pid)
+		snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d",
+				strlen(runtime_dir) ? runtime_dir : "/tmp", version);
+	else
+		snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d.%u",
+				strlen(runtime_dir) ? runtime_dir : "/tmp",
+				version,
+				(unsigned int)getpid());
 	return path;
 }
 
@@ -589,11 +596,13 @@ telemetry_v2_init(void)
 #endif /* !RTE_EXEC_ENV_WINDOWS */
 
 int32_t
-rte_telemetry_init(const char *runtime_dir, const char *rte_version, rte_cpuset_t *cpuset,
+rte_telemetry_init(const char *runtime_dir, bool in_memory,
+		const char *rte_version, rte_cpuset_t *cpuset,
 		rte_log_fn log_fn, uint32_t registered_logtype)
 {
 	telemetry_version = rte_version;
 	socket_dir = runtime_dir;
+	socket_uses_pid = in_memory; /* for in-memory mode use pid in sock path for uniqueness */
 	thread_cpuset = cpuset;
 	rte_log_ptr = log_fn;
 	logtype = registered_logtype;
diff --git a/lib/telemetry/telemetry_internal.h b/lib/telemetry/telemetry_internal.h
index d085c492dc..d8fb37a633 100644
--- a/lib/telemetry/telemetry_internal.h
+++ b/lib/telemetry/telemetry_internal.h
@@ -109,7 +109,8 @@ typedef int (*rte_log_fn)(uint32_t level, uint32_t logtype, const char *format,
  */
 __rte_internal
 int
-rte_telemetry_init(const char *runtime_dir, const char *rte_version, rte_cpuset_t *cpuset,
+rte_telemetry_init(const char *runtime_dir, bool in_memory,
+		const char *rte_version, rte_cpuset_t *cpuset,
 		rte_log_fn log_fn, uint32_t registered_logtype);
 
 #endif
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v5 4/5] usertools/dpdk-telemetry: connect to in-memory processes
  2021-10-01 16:22 ` [dpdk-dev] [PATCH v5 0/5] improve telemetry support with in-memory mode Bruce Richardson
                     ` (2 preceding siblings ...)
  2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
@ 2021-10-01 16:22   ` Bruce Richardson
  2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 16:22 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

Allow connecting to an in-memory process via "-p <pid>" flag, which can
be used to identify the in-memory process to which to connect.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/howto/telemetry.rst | 6 ++++++
 usertools/dpdk-telemetry.py    | 7 ++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index 8a61302459..c3adca9504 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -63,6 +63,12 @@ and query information using the telemetry client python script.
 
       ./usertools/dpdk-telemetry.py
 
+   .. note::
+
+     When connecting to a process run with `--in-memory` EAL flag,
+     one must specify the PID of the process to connect to using the `-p` flag.
+     This is because there may be multiple such instances.
+
 #. When connected, the script displays the following, waiting for user input::
 
      Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index 2974a64732..690014ba9a 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -112,6 +112,11 @@ def get_dpdk_runtime_dir(fp):
 parser = argparse.ArgumentParser()
 parser.add_argument('-f', '--file-prefix', default='rte',
                     help='Provide file-prefix for DPDK runtime directory')
+parser.add_argument('-p', '--pid',
+                    help='Connect to DPDK process with the given pid')
 args = parser.parse_args()
 rd = get_dpdk_runtime_dir(args.file_prefix)
-handle_socket(os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)))
+sock_path = os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+if args.pid:
+    sock_path += ".{}".format(args.pid)
+handle_socket(sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v5 5/5] usertools/dpdk-telemetry: provide info on available sockets
  2021-10-01 16:22 ` [dpdk-dev] [PATCH v5 0/5] improve telemetry support with in-memory mode Bruce Richardson
                     ` (3 preceding siblings ...)
  2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 4/5] usertools/dpdk-telemetry: connect to in-memory processes Bruce Richardson
@ 2021-10-01 16:22   ` Bruce Richardson
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-01 16:22 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When a user runs the dpdk-telemetry script and fails to connect because
the socket path does not exist, run a scan for possible sockets that
could be connected to and inform the user of the command needed to
connect to those.

For example, when running the script without any parameters, but there
are DPDK processes running with in-memory mode (so they need to be
identified by PID), the error message will include the details of how to
connect to each process:

  $ ./usertools/dpdk-telemetry.py
  Connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2
  Error connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2

  Other DPDK telemetry sockets found:
  - dpdk_telemetry.v2.20755  # Connect with './usertools/dpdk-telemetry.py -p 20755'
  - dpdk_telemetry.v2.20451  # Connect with './usertools/dpdk-telemetry.py -p 20451'

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/howto/telemetry.rst |  3 +++
 usertools/dpdk-telemetry.py    | 41 ++++++++++++++++++++++++++++++----
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index c3adca9504..cdde57cb3b 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -68,6 +68,9 @@ and query information using the telemetry client python script.
      When connecting to a process run with `--in-memory` EAL flag,
      one must specify the PID of the process to connect to using the `-p` flag.
      This is because there may be multiple such instances.
+     If there are only *in-memory* DPDK processes to connect to,
+     and no PID, or an invalid PID parameter, is provided,
+     the telemetry script will list any available telemetry sockets and how to connect to them.
 
 #. When connected, the script displays the following, waiting for user input::
 
diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index 690014ba9a..15a8233cfc 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -10,6 +10,7 @@
 import socket
 import os
 import sys
+import glob
 import json
 import errno
 import readline
@@ -17,6 +18,8 @@
 
 # global vars
 TELEMETRY_VERSION = "v2"
+SOCKET_NAME = 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)
+DEFAULT_PREFIX = 'rte'
 CMDS = []
 
 
@@ -48,7 +51,28 @@ def get_app_name(pid):
     return None
 
 
-def handle_socket(path):
+def find_sockets(path):
+    """ Find any possible sockets to connect to and return them """
+    return glob.glob(os.path.join(path, SOCKET_NAME + '*'))
+
+
+def print_socket_options(args, paths):
+    """ Given a set of socket paths, give the commands needed to connect """
+    cmd = sys.argv[0]
+    if args.file_prefix != DEFAULT_PREFIX:
+        cmd += " -f " + args.file_prefix
+    for s in paths:
+        sock_name = os.path.basename(s)
+        if sock_name.endswith(TELEMETRY_VERSION):
+            print("- {}  # Connect with '{}'".format(os.path.basename(s),
+                                                     cmd))
+        else:
+            print("- {}  # Connect with '{} -p {}'".format(os.path.basename(s),
+                                                           cmd,
+                                                           s.split('.')[-1]))
+
+
+def handle_socket(args, path):
     """ Connect to socket and handle user input """
     prompt = ''  # this evaluates to false in conditions
     sock = socket.socket(socket.AF_UNIX, socket.SOCK_SEQPACKET)
@@ -62,6 +86,15 @@ def handle_socket(path):
     except OSError:
         print("Error connecting to " + path)
         sock.close()
+        # if socket exists but is bad, or if non-interactive just return
+        if os.path.exists(path) or not prompt:
+            return
+        # if user didn't give a valid socket path, but there are
+        # some sockets, help the user out by printing how to connect
+        socks = find_sockets(os.path.dirname(path))
+        if socks:
+            print("\nOther DPDK telemetry sockets found:")
+            print_socket_options(args, socks)
         return
     json_reply = read_socket(sock, 1024, prompt)
     output_buf_len = json_reply["max_output_len"]
@@ -110,13 +143,13 @@ def get_dpdk_runtime_dir(fp):
 readline.set_completer_delims(readline.get_completer_delims().replace('/', ''))
 
 parser = argparse.ArgumentParser()
-parser.add_argument('-f', '--file-prefix', default='rte',
+parser.add_argument('-f', '--file-prefix', default=DEFAULT_PREFIX,
                     help='Provide file-prefix for DPDK runtime directory')
 parser.add_argument('-p', '--pid',
                     help='Connect to DPDK process with the given pid')
 args = parser.parse_args()
 rd = get_dpdk_runtime_dir(args.file_prefix)
-sock_path = os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+sock_path = os.path.join(rd, SOCKET_NAME)
 if args.pid:
     sock_path += ".{}".format(args.pid)
-handle_socket(sock_path)
+handle_socket(args, sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v3] telemetry: fix "in-memory" process socket conflicts
  2021-09-29 13:54 ` [dpdk-dev] [PATCH v3] " Bruce Richardson
@ 2021-10-05 11:47   ` Ferruh Yigit
  0 siblings, 0 replies; 68+ messages in thread
From: Ferruh Yigit @ 2021-10-05 11:47 UTC (permalink / raw)
  To: Bruce Richardson, dev
  Cc: Kevin Traynor, David Marchand, Ciara Power, techboard, Thomas Monjalon

On 9/29/2021 2:54 PM, Bruce Richardson wrote:
> When DPDK is run with --in-memory mode, multiple processes can run
> simultaneously using the same runtime dir. This leads to each process,
> as it starts up, removing the telemetry socket of another process,
> giving unexpected behaviour.
> 
> This patch changes that behaviour to first check if the existing socket
> is active. If not, it's an old socket to be cleaned up and can be
> removed. If it is active, telemetry initialization fails and an error
> message is printed out giving instructions on how to remove the error;
> either by using file-prefix to have a different runtime dir (and
> therefore socket path) or by disabling telemetry if it not needed.
> 
> Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
> 
> Reported-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Ciara Power <ciara.power@intel.com>

Off the topic.

This is the patch 100.000 in patchwork! Wow, that is huge.

This may be something to celebrate on next face to face meeting.
And I assume Bruce will be buying as the owner of the patch 100000 :)

https://patches.dpdk.org/api/1.2/patches/100000/

The historical numbers from DPDK patchwork:
100000 - Sept. 29, 2021 (184 days) [ 6 months / 26 weeks and 2 days ]
  90000 - March 29, 2021 (172 days)
  80000 - Oct.   8, 2020 (153 days)
  70000 - May    8, 2020 (224 days)
  60000 - Sept. 27, 2019 (248 days)
  50000 - Jan.  22, 2019 (253 days)
  40000 - May   14, 2018 (217 days)
  30000 - Oct.   9, 2017 (258 days)
  20000 - Jan.  25, 2017 (372 days)
  10000 - Jan.  20, 2016 (645 days)
  00001 - April 16, 2014

Last two years average is 10K patch in ~185 days, that is more patch per day
comparing to previous years.
Not sure if this is Covid-19 effect or simply DPDK is getting more popular.

Thanks again to all contributors.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v6 0/5] improve telemetry support with in-memory mode
  2021-09-15 14:10 [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts Bruce Richardson
                   ` (4 preceding siblings ...)
  2021-10-01 16:22 ` [dpdk-dev] [PATCH v5 0/5] improve telemetry support with in-memory mode Bruce Richardson
@ 2021-10-05 13:59 ` Bruce Richardson
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 1/5] eal: limit telemetry to primary processes Bruce Richardson
                     ` (4 more replies)
  2021-10-08 17:18 ` [dpdk-dev] [PATCH v7 0/5] improve telemetry support with in-memory mode Bruce Richardson
                   ` (2 subsequent siblings)
  8 siblings, 5 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-05 13:59 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

This patchset cleans up telemetry support for "in-memory" mode, so that
multiple independent processes can be run using that mode and still have
telemetry support. It also removes problems of one process removing the
socket of another - which was the original issue reported. The main changes
in this set are to:

* disable telemetry for secondary processes, which prevents any socket
  conflicts in multi-process cases.
* add a suffix to the telemetry sockets of any process run using
  "in-memory" mode. This suffix is the PID and ensures we don't get path
  conflicts on the sockets
* inform the user of any path conflicts we do encounter for some reason,
  and present them with the options to solve it, instead of just removing
  the conflicting socket and continuing.
  [This was the original patch in V1-V3]
* update the telemetry script and documentation to allow it to connect to
  in-memory DPDK processes.

---
V6: fixed issue whereby the failing of the legacy telemetry init would roll-back
init of the v2 telemetry, causing the socket to be deleted, even though it was
still necessary.

V5: Rebase on latest main after other script cleanups were merged

V4: Move from simple-fix patch to proper fix patchset

V3: Drop CC stable, as will have separate backport patch which does not
error out, so avoiding causing problems with currently running application

V2: fix build error on FreeBSD

Bruce Richardson (5):
  eal: limit telemetry to primary processes
  telemetry: fix deletion of active sockets
  telemetry: use unique socket paths for in-memory mode
  usertools/dpdk-telemetry: connect to in-memory processes
  usertools/dpdk-telemetry: provide info on available sockets

 doc/guides/howto/telemetry.rst     | 26 +++++++++++++++++-
 lib/eal/freebsd/eal.c              |  3 +-
 lib/eal/linux/eal.c                |  3 +-
 lib/telemetry/telemetry.c          | 43 +++++++++++++++++++++++------
 lib/telemetry/telemetry_internal.h |  3 +-
 usertools/dpdk-telemetry.py        | 44 ++++++++++++++++++++++++++++--
 6 files changed, 106 insertions(+), 16 deletions(-)

--
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v6 1/5] eal: limit telemetry to primary processes
  2021-10-05 13:59 ` [dpdk-dev] [PATCH v6 0/5] improve telemetry support with in-memory mode Bruce Richardson
@ 2021-10-05 13:59   ` Bruce Richardson
  2021-10-07 13:11     ` Power, Ciara
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 2/5] telemetry: fix deletion of active sockets Bruce Richardson
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-05 13:59 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

Telemetry interface should be exposed for primary processes only, since
secondary processes will conflict on socket creation, and since all
data in secondary process is generally available to primary. For
example, all device stats for ethdevs, cryptodevs, etc. will all be
common across processes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/eal/freebsd/eal.c | 2 +-
 lib/eal/linux/eal.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 6cee5ae369..b06a2c1662 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -946,7 +946,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 3577eaeaa4..0d0fc66668 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1320,7 +1320,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v6 2/5] telemetry: fix deletion of active sockets
  2021-10-05 13:59 ` [dpdk-dev] [PATCH v6 0/5] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 1/5] eal: limit telemetry to primary processes Bruce Richardson
@ 2021-10-05 13:59   ` Bruce Richardson
  2021-10-05 15:18     ` Conor Walsh
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-05 13:59 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When DPDK is run with --in-memory mode, multiple processes can run
simultaneously using the same runtime dir. This leads to each process,
as it starts up, removing the telemetry socket of another process,
giving unexpected behaviour.

This patch changes that behaviour to first check if the existing socket
is active. If not, it's an old socket to be cleaned up and can be
removed. If it is active, telemetry initialization fails and an error
message is printed out giving instructions on how to remove the error;
either by using file-prefix to have a different runtime dir (and
therefore socket path) or by disabling telemetry if it not needed.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
---
 lib/telemetry/telemetry.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 9021025a1b..bd804a25a9 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -457,19 +457,35 @@ create_socket(char *path)
 
 	struct sockaddr_un sun = {.sun_family = AF_UNIX};
 	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-	unlink(sun.sun_path);
+	TMTY_LOG(DEBUG, "Attempting socket bind to path '%s'\n", path);
+
 	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 		struct stat st;
 
-		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+		/* first check if we have a runtime dir */
+		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-		sun.sun_path[0] = 0;
-		goto error;
+			goto error;
+		}
+
+		/* check if current socket is active */
+		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
+			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
+			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
+			goto error;
+		}
+
+		/* socket is not active, delete and attempt rebind */
+		unlink(sun.sun_path);
+		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+			goto error;
+		}
 	}
 
 	if (listen(sock, 1) < 0) {
 		TMTY_LOG(ERR, "Error calling listen for socket: %s\n", strerror(errno));
+		unlink(sun.sun_path);
 		goto error;
 	}
 
@@ -477,7 +493,7 @@ create_socket(char *path)
 
 error:
 	close(sock);
-	unlink_sockets();
+	path[0] = 0;
 	return -1;
 }
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v6 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-05 13:59 ` [dpdk-dev] [PATCH v6 0/5] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 1/5] eal: limit telemetry to primary processes Bruce Richardson
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 2/5] telemetry: fix deletion of active sockets Bruce Richardson
@ 2021-10-05 13:59   ` Bruce Richardson
  2021-10-05 14:41     ` Kevin Traynor
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 4/5] usertools/dpdk-telemetry: connect to in-memory processes Bruce Richardson
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
  4 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-05 13:59 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When DPDK is run using "in-memory" flag, multiple processes can be run
using the same file-prefix and hence the same runtime directory. To
avoid problems with conflicting telemetry unix socket paths, we can
put the pid of the process into the socket name. As with the existing
telemetry socket files, these sockets are removed on normal program
exit.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/howto/telemetry.rst     | 17 ++++++++++++++++-
 lib/eal/freebsd/eal.c              |  1 +
 lib/eal/linux/eal.c                |  1 +
 lib/telemetry/telemetry.c          | 15 ++++++++++++---
 lib/telemetry/telemetry_internal.h |  3 ++-
 5 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index 8f4fa1a510..8a61302459 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -13,12 +13,27 @@ ethdev port list, and eal parameters.
 Telemetry Interface
 -------------------
 
-The :doc:`../prog_guide/telemetry_lib` opens a socket with path
+For applications run normally, i.e. without the `--in-memory` EAL flag,
+the :doc:`../prog_guide/telemetry_lib` opens a socket with path
 *<runtime_directory>/dpdk_telemetry.<version>*. The version represents the
 telemetry version, the latest is v2. For example, a client would connect to a
 socket with path  */var/run/dpdk/\*/dpdk_telemetry.v2* (when the primary process
 is run by a root user).
 
+For applications run with the `--in-memory` EAL flag,
+the socket file is created with an additional suffix of the process PID.
+This is because multiple independent DPDK processes can be run simultaneously
+using the same runtime directory when *in-memory* mode is used.
+For example, when a user with UID 1000 runs processes with in-memory mode,
+we would find sockets available such as::
+
+  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1982
+  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1935
+
+Where `/run/user/<uid>` is the runtime directory for the user given by the
+`$XDG_RUNTIME_DIR` environment variable,
+and `rte` is the default DPDK file prefix used for a runtime directory.
+
 
 Telemetry Initialization
 ------------------------
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index b06a2c1662..ed39d10b4e 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -952,6 +952,7 @@ rte_eal_init(int argc, char **argv)
 		if (tlog < 0)
 			tlog = RTE_LOGTYPE_EAL;
 		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
+				internal_conf->in_memory | internal_conf->no_shconf,
 				rte_version(),
 				&internal_conf->ctrl_cpuset, rte_log, tlog) != 0)
 			return -1;
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 0d0fc66668..9db4eb7913 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1326,6 +1326,7 @@ rte_eal_init(int argc, char **argv)
 		if (tlog < 0)
 			tlog = RTE_LOGTYPE_EAL;
 		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
+				internal_conf->in_memory | internal_conf->no_shconf,
 				rte_version(),
 				&internal_conf->ctrl_cpuset, rte_log, tlog) != 0)
 			return -1;
diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index bd804a25a9..24441d100b 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -51,6 +51,7 @@ static struct socket v1_socket; /* socket for v1 telemetry */
 
 static const char *telemetry_version; /* save rte_version */
 static const char *socket_dir;        /* runtime directory */
+static bool socket_uses_pid;          /* for in-memory mode, we need different socket paths */
 static rte_cpuset_t *thread_cpuset;
 static rte_log_fn rte_log_ptr;
 static uint32_t logtype;
@@ -432,8 +433,14 @@ static inline char *
 get_socket_path(const char *runtime_dir, const int version)
 {
 	static char path[PATH_MAX];
-	snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d",
-			strlen(runtime_dir) ? runtime_dir : "/tmp", version);
+	if (!socket_uses_pid)
+		snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d",
+				strlen(runtime_dir) ? runtime_dir : "/tmp", version);
+	else
+		snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d.%u",
+				strlen(runtime_dir) ? runtime_dir : "/tmp",
+				version,
+				(unsigned int)getpid());
 	return path;
 }
 
@@ -591,11 +598,13 @@ telemetry_v2_init(void)
 #endif /* !RTE_EXEC_ENV_WINDOWS */
 
 int32_t
-rte_telemetry_init(const char *runtime_dir, const char *rte_version, rte_cpuset_t *cpuset,
+rte_telemetry_init(const char *runtime_dir, bool in_memory,
+		const char *rte_version, rte_cpuset_t *cpuset,
 		rte_log_fn log_fn, uint32_t registered_logtype)
 {
 	telemetry_version = rte_version;
 	socket_dir = runtime_dir;
+	socket_uses_pid = in_memory; /* for in-memory mode use pid in sock path for uniqueness */
 	thread_cpuset = cpuset;
 	rte_log_ptr = log_fn;
 	logtype = registered_logtype;
diff --git a/lib/telemetry/telemetry_internal.h b/lib/telemetry/telemetry_internal.h
index d085c492dc..d8fb37a633 100644
--- a/lib/telemetry/telemetry_internal.h
+++ b/lib/telemetry/telemetry_internal.h
@@ -109,7 +109,8 @@ typedef int (*rte_log_fn)(uint32_t level, uint32_t logtype, const char *format,
  */
 __rte_internal
 int
-rte_telemetry_init(const char *runtime_dir, const char *rte_version, rte_cpuset_t *cpuset,
+rte_telemetry_init(const char *runtime_dir, bool in_memory,
+		const char *rte_version, rte_cpuset_t *cpuset,
 		rte_log_fn log_fn, uint32_t registered_logtype);
 
 #endif
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v6 4/5] usertools/dpdk-telemetry: connect to in-memory processes
  2021-10-05 13:59 ` [dpdk-dev] [PATCH v6 0/5] improve telemetry support with in-memory mode Bruce Richardson
                     ` (2 preceding siblings ...)
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
@ 2021-10-05 13:59   ` Bruce Richardson
  2021-10-05 15:19     ` Conor Walsh
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
  4 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-05 13:59 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

Allow connecting to an in-memory process via "-p <pid>" flag, which can
be used to identify the in-memory process to which to connect.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/howto/telemetry.rst | 6 ++++++
 usertools/dpdk-telemetry.py    | 7 ++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index 8a61302459..c3adca9504 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -63,6 +63,12 @@ and query information using the telemetry client python script.
 
       ./usertools/dpdk-telemetry.py
 
+   .. note::
+
+     When connecting to a process run with `--in-memory` EAL flag,
+     one must specify the PID of the process to connect to using the `-p` flag.
+     This is because there may be multiple such instances.
+
 #. When connected, the script displays the following, waiting for user input::
 
      Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index 2974a64732..690014ba9a 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -112,6 +112,11 @@ def get_dpdk_runtime_dir(fp):
 parser = argparse.ArgumentParser()
 parser.add_argument('-f', '--file-prefix', default='rte',
                     help='Provide file-prefix for DPDK runtime directory')
+parser.add_argument('-p', '--pid',
+                    help='Connect to DPDK process with the given pid')
 args = parser.parse_args()
 rd = get_dpdk_runtime_dir(args.file_prefix)
-handle_socket(os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)))
+sock_path = os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+if args.pid:
+    sock_path += ".{}".format(args.pid)
+handle_socket(sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v6 5/5] usertools/dpdk-telemetry: provide info on available sockets
  2021-10-05 13:59 ` [dpdk-dev] [PATCH v6 0/5] improve telemetry support with in-memory mode Bruce Richardson
                     ` (3 preceding siblings ...)
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 4/5] usertools/dpdk-telemetry: connect to in-memory processes Bruce Richardson
@ 2021-10-05 13:59   ` Bruce Richardson
  2021-10-05 15:19     ` Conor Walsh
  4 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-05 13:59 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When a user runs the dpdk-telemetry script and fails to connect because
the socket path does not exist, run a scan for possible sockets that
could be connected to and inform the user of the command needed to
connect to those.

For example, when running the script without any parameters, but there
are DPDK processes running with in-memory mode (so they need to be
identified by PID), the error message will include the details of how to
connect to each process:

  $ ./usertools/dpdk-telemetry.py
  Connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2
  Error connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2

  Other DPDK telemetry sockets found:
  - dpdk_telemetry.v2.20755  # Connect with './usertools/dpdk-telemetry.py -p 20755'
  - dpdk_telemetry.v2.20451  # Connect with './usertools/dpdk-telemetry.py -p 20451'

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/howto/telemetry.rst |  3 +++
 usertools/dpdk-telemetry.py    | 41 ++++++++++++++++++++++++++++++----
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index c3adca9504..cdde57cb3b 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -68,6 +68,9 @@ and query information using the telemetry client python script.
      When connecting to a process run with `--in-memory` EAL flag,
      one must specify the PID of the process to connect to using the `-p` flag.
      This is because there may be multiple such instances.
+     If there are only *in-memory* DPDK processes to connect to,
+     and no PID, or an invalid PID parameter, is provided,
+     the telemetry script will list any available telemetry sockets and how to connect to them.
 
 #. When connected, the script displays the following, waiting for user input::
 
diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index 690014ba9a..7f22e21828 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -10,6 +10,7 @@
 import socket
 import os
 import sys
+import glob
 import json
 import errno
 import readline
@@ -17,6 +18,8 @@
 
 # global vars
 TELEMETRY_VERSION = "v2"
+SOCKET_NAME = 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)
+DEFAULT_PREFIX = 'rte'
 CMDS = []
 
 
@@ -48,7 +51,28 @@ def get_app_name(pid):
     return None
 
 
-def handle_socket(path):
+def find_sockets(path):
+    """ Find any possible sockets to connect to and return them """
+    return glob.glob(os.path.join(path, SOCKET_NAME + '*'))
+
+
+def print_socket_options(prefix, paths):
+    """ Given a set of socket paths, give the commands needed to connect """
+    cmd = sys.argv[0]
+    if prefix != DEFAULT_PREFIX:
+        cmd += " -f " + prefix
+    for s in paths:
+        sock_name = os.path.basename(s)
+        if sock_name.endswith(TELEMETRY_VERSION):
+            print("- {}  # Connect with '{}'".format(os.path.basename(s),
+                                                     cmd))
+        else:
+            print("- {}  # Connect with '{} -p {}'".format(os.path.basename(s),
+                                                           cmd,
+                                                           s.split('.')[-1]))
+
+
+def handle_socket(args, path):
     """ Connect to socket and handle user input """
     prompt = ''  # this evaluates to false in conditions
     sock = socket.socket(socket.AF_UNIX, socket.SOCK_SEQPACKET)
@@ -62,6 +86,15 @@ def handle_socket(path):
     except OSError:
         print("Error connecting to " + path)
         sock.close()
+        # if socket exists but is bad, or if non-interactive just return
+        if os.path.exists(path) or not prompt:
+            return
+        # if user didn't give a valid socket path, but there are
+        # some sockets, help the user out by printing how to connect
+        socks = find_sockets(os.path.dirname(path))
+        if socks:
+            print("\nOther DPDK telemetry sockets found:")
+            print_socket_options(args.file_prefix, socks)
         return
     json_reply = read_socket(sock, 1024, prompt)
     output_buf_len = json_reply["max_output_len"]
@@ -110,13 +143,13 @@ def get_dpdk_runtime_dir(fp):
 readline.set_completer_delims(readline.get_completer_delims().replace('/', ''))
 
 parser = argparse.ArgumentParser()
-parser.add_argument('-f', '--file-prefix', default='rte',
+parser.add_argument('-f', '--file-prefix', default=DEFAULT_PREFIX,
                     help='Provide file-prefix for DPDK runtime directory')
 parser.add_argument('-p', '--pid',
                     help='Connect to DPDK process with the given pid')
 args = parser.parse_args()
 rd = get_dpdk_runtime_dir(args.file_prefix)
-sock_path = os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+sock_path = os.path.join(rd, SOCKET_NAME)
 if args.pid:
     sock_path += ".{}".format(args.pid)
-handle_socket(sock_path)
+handle_socket(args, sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v6 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
@ 2021-10-05 14:41     ` Kevin Traynor
  2021-10-05 14:52       ` Bruce Richardson
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin Traynor @ 2021-10-05 14:41 UTC (permalink / raw)
  To: Bruce Richardson, dev; +Cc: Ciara Power, David Marchand, Anatoly Burakov

Hi Bruce, I started reviewing v5 before v6 was pushed, so these are just 
comments from v5,

On 05/10/2021 14:59, Bruce Richardson wrote:
> When DPDK is run using "in-memory" flag, multiple processes can be run
> using the same file-prefix and hence the same runtime directory. To
> avoid problems with conflicting telemetry unix socket paths, we can
> put the pid of the process into the socket name. As with the existing
> telemetry socket files, these sockets are removed on normal program
> exit.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>   doc/guides/howto/telemetry.rst     | 17 ++++++++++++++++-
>   lib/eal/freebsd/eal.c              |  1 +
>   lib/eal/linux/eal.c                |  1 +
>   lib/telemetry/telemetry.c          | 15 ++++++++++++---
>   lib/telemetry/telemetry_internal.h |  3 ++-
>   5 files changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
> index 8f4fa1a510..8a61302459 100644
> --- a/doc/guides/howto/telemetry.rst
> +++ b/doc/guides/howto/telemetry.rst
> @@ -13,12 +13,27 @@ ethdev port list, and eal parameters.
>   Telemetry Interface
>   -------------------
>   
> -The :doc:`../prog_guide/telemetry_lib` opens a socket with path
> +For applications run normally, i.e. without the `--in-memory` EAL flag,
> +the :doc:`../prog_guide/telemetry_lib` opens a socket with path
>   *<runtime_directory>/dpdk_telemetry.<version>*. The version represents the
>   telemetry version, the latest is v2. For example, a client would connect to a
>   socket with path  */var/run/dpdk/\*/dpdk_telemetry.v2* (when the primary process
>   is run by a root user).
>   
> +For applications run with the `--in-memory` EAL flag,
> +the socket file is created with an additional suffix of the process PID.
> +This is because multiple independent DPDK processes can be run simultaneously
> +using the same runtime directory when *in-memory* mode is used.
> +For example, when a user with UID 1000 runs processes with in-memory mode,
> +we would find sockets available such as::
> +
> +  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1982
> +  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1935
> +

It seems an unnecessary step unless there is multiple process. As a 
suggestion, how about "simplifying" by always adding a check for an 
active socket with the default name. If it is not found, create it with 
the default (pre patches) name. If it is found and active, create a new 
one with process id appended. e.g.

First:
/run/user/1000/dpdk/rte/dpdk_telemetry.v2

Next:
/run/user/1000/dpdk/rte/dpdk_telemetry.v2.1982
/run/user/1000/dpdk/rte/dpdk_telemetry.v2.1935

It means that existing socket can still be used by anything using 
telemetry. I think it is a nice default to keep as it doesn't require 
any changes for anything that will connect (e.g. collectd?)

The downside is that one will have a different name but it seems an 
acceptable trade-off to keep compatibility for single process case.

WDYT?

> +Where `/run/user/<uid>` is the runtime directory for the user given by the
> +`$XDG_RUNTIME_DIR` environment variable,
> +and `rte` is the default DPDK file prefix used for a runtime directory.
> +
>   
>   Telemetry Initialization
>   ------------------------
> diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
> index b06a2c1662..ed39d10b4e 100644
> --- a/lib/eal/freebsd/eal.c
> +++ b/lib/eal/freebsd/eal.c
> @@ -952,6 +952,7 @@ rte_eal_init(int argc, char **argv)
>   		if (tlog < 0)
>   			tlog = RTE_LOGTYPE_EAL;
>   		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
> +				internal_conf->in_memory | internal_conf->no_shconf,
>   				rte_version(),
>   				&internal_conf->ctrl_cpuset, rte_log, tlog) != 0)
>   			return -1;
> diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
> index 0d0fc66668..9db4eb7913 100644
> --- a/lib/eal/linux/eal.c
> +++ b/lib/eal/linux/eal.c
> @@ -1326,6 +1326,7 @@ rte_eal_init(int argc, char **argv)
>   		if (tlog < 0)
>   			tlog = RTE_LOGTYPE_EAL;
>   		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
> +				internal_conf->in_memory | internal_conf->no_shconf,

should be logical OR

>   				rte_version(),
>   				&internal_conf->ctrl_cpuset, rte_log, tlog) != 0)
>   			return -1;
> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
> index bd804a25a9..24441d100b 100644
> --- a/lib/telemetry/telemetry.c
> +++ b/lib/telemetry/telemetry.c
> @@ -51,6 +51,7 @@ static struct socket v1_socket; /* socket for v1 telemetry */
>   
>   static const char *telemetry_version; /* save rte_version */
>   static const char *socket_dir;        /* runtime directory */
> +static bool socket_uses_pid;          /* for in-memory mode, we need different socket paths */
>   static rte_cpuset_t *thread_cpuset;
>   static rte_log_fn rte_log_ptr;
>   static uint32_t logtype;
> @@ -432,8 +433,14 @@ static inline char *
>   get_socket_path(const char *runtime_dir, const int version)
>   {
>   	static char path[PATH_MAX];
> -	snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d",
> -			strlen(runtime_dir) ? runtime_dir : "/tmp", version);
> +	if (!socket_uses_pid)
> +		snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d",
> +				strlen(runtime_dir) ? runtime_dir : "/tmp", version);
> +	else
> +		snprintf(path, sizeof(path), "%s/dpdk_telemetry.v%d.%u",
> +				strlen(runtime_dir) ? runtime_dir : "/tmp",
> +				version,
> +				(unsigned int)getpid());
>   	return path;
>   }
>   
> @@ -591,11 +598,13 @@ telemetry_v2_init(void)
>   #endif /* !RTE_EXEC_ENV_WINDOWS */
>   
>   int32_t
> -rte_telemetry_init(const char *runtime_dir, const char *rte_version, rte_cpuset_t *cpuset,
> +rte_telemetry_init(const char *runtime_dir, bool in_memory,
> +		const char *rte_version, rte_cpuset_t *cpuset,
>   		rte_log_fn log_fn, uint32_t registered_logtype)
>   {
>   	telemetry_version = rte_version;
>   	socket_dir = runtime_dir;
> +	socket_uses_pid = in_memory; /* for in-memory mode use pid in sock path for uniqueness */
>   	thread_cpuset = cpuset;
>   	rte_log_ptr = log_fn;
>   	logtype = registered_logtype;
> diff --git a/lib/telemetry/telemetry_internal.h b/lib/telemetry/telemetry_internal.h
> index d085c492dc..d8fb37a633 100644
> --- a/lib/telemetry/telemetry_internal.h
> +++ b/lib/telemetry/telemetry_internal.h
> @@ -109,7 +109,8 @@ typedef int (*rte_log_fn)(uint32_t level, uint32_t logtype, const char *format,
>    */
>   __rte_internal
>   int
> -rte_telemetry_init(const char *runtime_dir, const char *rte_version, rte_cpuset_t *cpuset,
> +rte_telemetry_init(const char *runtime_dir, bool in_memory,
> +		const char *rte_version, rte_cpuset_t *cpuset,
>   		rte_log_fn log_fn, uint32_t registered_logtype);
>   
>   #endif
> 


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v6 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-05 14:41     ` Kevin Traynor
@ 2021-10-05 14:52       ` Bruce Richardson
  2021-10-05 15:14         ` Kevin Traynor
  2021-10-05 15:19         ` Conor Walsh
  0 siblings, 2 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-05 14:52 UTC (permalink / raw)
  To: Kevin Traynor; +Cc: dev, Ciara Power, David Marchand, Anatoly Burakov

On Tue, Oct 05, 2021 at 03:41:17PM +0100, Kevin Traynor wrote:
> Hi Bruce, I started reviewing v5 before v6 was pushed, so these are just
> comments from v5,
> 

No problem. Changes V6-v5 are fairly small anyway. Thanks for the review.

> On 05/10/2021 14:59, Bruce Richardson wrote:
> > When DPDK is run using "in-memory" flag, multiple processes can be run
> > using the same file-prefix and hence the same runtime directory. To
> > avoid problems with conflicting telemetry unix socket paths, we can
> > put the pid of the process into the socket name. As with the existing
> > telemetry socket files, these sockets are removed on normal program
> > exit.
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> >   doc/guides/howto/telemetry.rst     | 17 ++++++++++++++++-
> >   lib/eal/freebsd/eal.c              |  1 +
> >   lib/eal/linux/eal.c                |  1 +
> >   lib/telemetry/telemetry.c          | 15 ++++++++++++---
> >   lib/telemetry/telemetry_internal.h |  3 ++-
> >   5 files changed, 32 insertions(+), 5 deletions(-)
> > 
> > diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
> > index 8f4fa1a510..8a61302459 100644
> > --- a/doc/guides/howto/telemetry.rst
> > +++ b/doc/guides/howto/telemetry.rst
> > @@ -13,12 +13,27 @@ ethdev port list, and eal parameters.
> >   Telemetry Interface
> >   -------------------
> > -The :doc:`../prog_guide/telemetry_lib` opens a socket with path
> > +For applications run normally, i.e. without the `--in-memory` EAL flag,
> > +the :doc:`../prog_guide/telemetry_lib` opens a socket with path
> >   *<runtime_directory>/dpdk_telemetry.<version>*. The version represents the
> >   telemetry version, the latest is v2. For example, a client would connect to a
> >   socket with path  */var/run/dpdk/\*/dpdk_telemetry.v2* (when the primary process
> >   is run by a root user).
> > +For applications run with the `--in-memory` EAL flag,
> > +the socket file is created with an additional suffix of the process PID.
> > +This is because multiple independent DPDK processes can be run simultaneously
> > +using the same runtime directory when *in-memory* mode is used.
> > +For example, when a user with UID 1000 runs processes with in-memory mode,
> > +we would find sockets available such as::
> > +
> > +  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1982
> > +  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1935
> > +
> 
> It seems an unnecessary step unless there is multiple process. As a
> suggestion, how about "simplifying" by always adding a check for an active
> socket with the default name. If it is not found, create it with the default
> (pre patches) name. If it is found and active, create a new one with process
> id appended. e.g.
> 
> First:
> /run/user/1000/dpdk/rte/dpdk_telemetry.v2
> 
> Next:
> /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1982
> /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1935
> 
> It means that existing socket can still be used by anything using telemetry.
> I think it is a nice default to keep as it doesn't require any changes for
> anything that will connect (e.g. collectd?)
> 
> The downside is that one will have a different name but it seems an
> acceptable trade-off to keep compatibility for single process case.
> 
> WDYT?
> 

Yes, that is an interesting idea, and probably not a bad one.

Taking things further, I wonder if using the pid is the best choice for a
suffix for the non-single-process case. If we used .1, .2 etc. it would
make things more predictable, perhaps easing integration for other tools.
Each process starting up would use the lowest free suffix, and any
external monitoring tools could just be set up to monitor the
first/second/third instance, with reproducable names across process
restarts.

> > +Where `/run/user/<uid>` is the runtime directory for the user given by the
> > +`$XDG_RUNTIME_DIR` environment variable,
> > +and `rte` is the default DPDK file prefix used for a runtime directory.
> > +
> >   Telemetry Initialization
> >   ------------------------
> > diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
> > index b06a2c1662..ed39d10b4e 100644
> > --- a/lib/eal/freebsd/eal.c
> > +++ b/lib/eal/freebsd/eal.c
> > @@ -952,6 +952,7 @@ rte_eal_init(int argc, char **argv)
> >   		if (tlog < 0)
> >   			tlog = RTE_LOGTYPE_EAL;
> >   		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
> > +				internal_conf->in_memory | internal_conf->no_shconf,
> >   				rte_version(),
> >   				&internal_conf->ctrl_cpuset, rte_log, tlog) != 0)
> >   			return -1;
> > diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
> > index 0d0fc66668..9db4eb7913 100644
> > --- a/lib/eal/linux/eal.c
> > +++ b/lib/eal/linux/eal.c
> > @@ -1326,6 +1326,7 @@ rte_eal_init(int argc, char **argv)
> >   		if (tlog < 0)
> >   			tlog = RTE_LOGTYPE_EAL;
> >   		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
> > +				internal_conf->in_memory | internal_conf->no_shconf,
> 
> should be logical OR
> 

I don't think it actually matters since any non-zero value will come
through as "true". However, will change it in v7.

/Bruce

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v6 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-05 14:52       ` Bruce Richardson
@ 2021-10-05 15:14         ` Kevin Traynor
  2021-10-07 13:39           ` Power, Ciara
  2021-10-05 15:19         ` Conor Walsh
  1 sibling, 1 reply; 68+ messages in thread
From: Kevin Traynor @ 2021-10-05 15:14 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Ciara Power, David Marchand, Anatoly Burakov

On 05/10/2021 15:52, Bruce Richardson wrote:
> On Tue, Oct 05, 2021 at 03:41:17PM +0100, Kevin Traynor wrote:
>> Hi Bruce, I started reviewing v5 before v6 was pushed, so these are just
>> comments from v5,
>>
> 
> No problem. Changes V6-v5 are fairly small anyway. Thanks for the review.
> 
>> On 05/10/2021 14:59, Bruce Richardson wrote:
>>> When DPDK is run using "in-memory" flag, multiple processes can be run
>>> using the same file-prefix and hence the same runtime directory. To
>>> avoid problems with conflicting telemetry unix socket paths, we can
>>> put the pid of the process into the socket name. As with the existing
>>> telemetry socket files, these sockets are removed on normal program
>>> exit.
>>>
>>> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>>> ---
>>>    doc/guides/howto/telemetry.rst     | 17 ++++++++++++++++-
>>>    lib/eal/freebsd/eal.c              |  1 +
>>>    lib/eal/linux/eal.c                |  1 +
>>>    lib/telemetry/telemetry.c          | 15 ++++++++++++---
>>>    lib/telemetry/telemetry_internal.h |  3 ++-
>>>    5 files changed, 32 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
>>> index 8f4fa1a510..8a61302459 100644
>>> --- a/doc/guides/howto/telemetry.rst
>>> +++ b/doc/guides/howto/telemetry.rst
>>> @@ -13,12 +13,27 @@ ethdev port list, and eal parameters.
>>>    Telemetry Interface
>>>    -------------------
>>> -The :doc:`../prog_guide/telemetry_lib` opens a socket with path
>>> +For applications run normally, i.e. without the `--in-memory` EAL flag,
>>> +the :doc:`../prog_guide/telemetry_lib` opens a socket with path
>>>    *<runtime_directory>/dpdk_telemetry.<version>*. The version represents the
>>>    telemetry version, the latest is v2. For example, a client would connect to a
>>>    socket with path  */var/run/dpdk/\*/dpdk_telemetry.v2* (when the primary process
>>>    is run by a root user).
>>> +For applications run with the `--in-memory` EAL flag,
>>> +the socket file is created with an additional suffix of the process PID.
>>> +This is because multiple independent DPDK processes can be run simultaneously
>>> +using the same runtime directory when *in-memory* mode is used.
>>> +For example, when a user with UID 1000 runs processes with in-memory mode,
>>> +we would find sockets available such as::
>>> +
>>> +  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1982
>>> +  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1935
>>> +
>>
>> It seems an unnecessary step unless there is multiple process. As a
>> suggestion, how about "simplifying" by always adding a check for an active
>> socket with the default name. If it is not found, create it with the default
>> (pre patches) name. If it is found and active, create a new one with process
>> id appended. e.g.
>>
>> First:
>> /run/user/1000/dpdk/rte/dpdk_telemetry.v2
>>
>> Next:
>> /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1982
>> /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1935
>>
>> It means that existing socket can still be used by anything using telemetry.
>> I think it is a nice default to keep as it doesn't require any changes for
>> anything that will connect (e.g. collectd?)
>>
>> The downside is that one will have a different name but it seems an
>> acceptable trade-off to keep compatibility for single process case.
>>
>> WDYT?
>>
> 
> Yes, that is an interesting idea, and probably not a bad one.
> 
> Taking things further, I wonder if using the pid is the best choice for a
> suffix for the non-single-process case. If we used .1, .2 etc. it would
> make things more predictable, perhaps easing integration for other tools.
> Each process starting up would use the lowest free suffix, and any
> external monitoring tools could just be set up to monitor the
> first/second/third instance, with reproducable names across process
> restarts.
> 

ok, cool - that sounds better again. Probably you can post and see if 
anyone else finds a hole in it or has a better idea for the next few days.

>>> +Where `/run/user/<uid>` is the runtime directory for the user given by the
>>> +`$XDG_RUNTIME_DIR` environment variable,
>>> +and `rte` is the default DPDK file prefix used for a runtime directory.
>>> +
>>>    Telemetry Initialization
>>>    ------------------------
>>> diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
>>> index b06a2c1662..ed39d10b4e 100644
>>> --- a/lib/eal/freebsd/eal.c
>>> +++ b/lib/eal/freebsd/eal.c
>>> @@ -952,6 +952,7 @@ rte_eal_init(int argc, char **argv)
>>>    		if (tlog < 0)
>>>    			tlog = RTE_LOGTYPE_EAL;
>>>    		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
>>> +				internal_conf->in_memory | internal_conf->no_shconf,
>>>    				rte_version(),
>>>    				&internal_conf->ctrl_cpuset, rte_log, tlog) != 0)
>>>    			return -1;
>>> diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
>>> index 0d0fc66668..9db4eb7913 100644
>>> --- a/lib/eal/linux/eal.c
>>> +++ b/lib/eal/linux/eal.c
>>> @@ -1326,6 +1326,7 @@ rte_eal_init(int argc, char **argv)
>>>    		if (tlog < 0)
>>>    			tlog = RTE_LOGTYPE_EAL;
>>>    		if (rte_telemetry_init(rte_eal_get_runtime_dir(),
>>> +				internal_conf->in_memory | internal_conf->no_shconf,
>>
>> should be logical OR
>>
> 
> I don't think it actually matters since any non-zero value will come
> through as "true". However, will change it in v7.
> 

It was just a nit-pick, I admit it :-)

> /Bruce
> 


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v6 2/5] telemetry: fix deletion of active sockets
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 2/5] telemetry: fix deletion of active sockets Bruce Richardson
@ 2021-10-05 15:18     ` Conor Walsh
  0 siblings, 0 replies; 68+ messages in thread
From: Conor Walsh @ 2021-10-05 15:18 UTC (permalink / raw)
  To: dev

On 05/10/2021 14:59, Bruce Richardson wrote:
> When DPDK is run with --in-memory mode, multiple processes can run
> simultaneously using the same runtime dir. This leads to each process,
> as it starts up, removing the telemetry socket of another process,
> giving unexpected behaviour.
>
> This patch changes that behaviour to first check if the existing socket
> is active. If not, it's an old socket to be cleaned up and can be
> removed. If it is active, telemetry initialization fails and an error
> message is printed out giving instructions on how to remove the error;
> either by using file-prefix to have a different runtime dir (and
> therefore socket path) or by disabling telemetry if it not needed.
>
> Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
>
> Reported-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Ciara Power <ciara.power@intel.com>
> ---

Tested-by: Conor Walsh <conor.walsh@intel.com>


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v6 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-05 14:52       ` Bruce Richardson
  2021-10-05 15:14         ` Kevin Traynor
@ 2021-10-05 15:19         ` Conor Walsh
  1 sibling, 0 replies; 68+ messages in thread
From: Conor Walsh @ 2021-10-05 15:19 UTC (permalink / raw)
  To: dev


On 05/10/2021 15:52, Bruce Richardson wrote:
> On Tue, Oct 05, 2021 at 03:41:17PM +0100, Kevin Traynor wrote:
>> Hi Bruce, I started reviewing v5 before v6 was pushed, so these are just
>> comments from v5,
>>
> No problem. Changes V6-v5 are fairly small anyway. Thanks for the review.
>
>> On 05/10/2021 14:59, Bruce Richardson wrote:
>>> When DPDK is run using "in-memory" flag, multiple processes can be run
>>> using the same file-prefix and hence the same runtime directory. To
>>> avoid problems with conflicting telemetry unix socket paths, we can
>>> put the pid of the process into the socket name. As with the existing
>>> telemetry socket files, these sockets are removed on normal program
>>> exit.
>>>
>>> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>>> ---

Legacy telemetry does not prevent multiple v2 sockets anymore.

Thanks,

Conor.

Tested-by: Conor Walsh <conor.walsh@intel.com>

<snip>


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v6 4/5] usertools/dpdk-telemetry: connect to in-memory processes
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 4/5] usertools/dpdk-telemetry: connect to in-memory processes Bruce Richardson
@ 2021-10-05 15:19     ` Conor Walsh
  0 siblings, 0 replies; 68+ messages in thread
From: Conor Walsh @ 2021-10-05 15:19 UTC (permalink / raw)
  To: dev

On 05/10/2021 14:59, Bruce Richardson wrote:
> Allow connecting to an in-memory process via "-p <pid>" flag, which can
> be used to identify the in-memory process to which to connect.
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---

Tested-by: Conor Walsh <conor.walsh@intel.com>


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v6 5/5] usertools/dpdk-telemetry: provide info on available sockets
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
@ 2021-10-05 15:19     ` Conor Walsh
  0 siblings, 0 replies; 68+ messages in thread
From: Conor Walsh @ 2021-10-05 15:19 UTC (permalink / raw)
  To: dev

On 05/10/2021 14:59, Bruce Richardson wrote:
> When a user runs the dpdk-telemetry script and fails to connect because
> the socket path does not exist, run a scan for possible sockets that
> could be connected to and inform the user of the command needed to
> connect to those.
>
> For example, when running the script without any parameters, but there
> are DPDK processes running with in-memory mode (so they need to be
> identified by PID), the error message will include the details of how to
> connect to each process:
>
>    $ ./usertools/dpdk-telemetry.py
>    Connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2
>    Error connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2
>
>    Other DPDK telemetry sockets found:
>    - dpdk_telemetry.v2.20755  # Connect with './usertools/dpdk-telemetry.py -p 20755'
>    - dpdk_telemetry.v2.20451  # Connect with './usertools/dpdk-telemetry.py -p 20451'
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---

Tested-by: Conor Walsh <conor.walsh@intel.com>


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v6 1/5] eal: limit telemetry to primary processes
  2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 1/5] eal: limit telemetry to primary processes Bruce Richardson
@ 2021-10-07 13:11     ` Power, Ciara
  0 siblings, 0 replies; 68+ messages in thread
From: Power, Ciara @ 2021-10-07 13:11 UTC (permalink / raw)
  To: Richardson, Bruce, dev; +Cc: David Marchand, Burakov, Anatoly, Kevin Traynor

Hi Bruce,

>-----Original Message-----
>From: Richardson, Bruce <bruce.richardson@intel.com>
>Sent: Tuesday 5 October 2021 14:59
>To: dev@dpdk.org
>Cc: Power, Ciara <ciara.power@intel.com>; David Marchand
><david.marchand@redhat.com>; Burakov, Anatoly
><anatoly.burakov@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
>Richardson, Bruce <bruce.richardson@intel.com>
>Subject: [PATCH v6 1/5] eal: limit telemetry to primary processes
>
>Telemetry interface should be exposed for primary processes only, since
>secondary processes will conflict on socket creation, and since all data in
>secondary process is generally available to primary. For example, all device stats
>for ethdevs, cryptodevs, etc. will all be common across processes.
>
>Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>---
> lib/eal/freebsd/eal.c | 2 +-
> lib/eal/linux/eal.c   | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c index
>6cee5ae369..b06a2c1662 100644
>--- a/lib/eal/freebsd/eal.c
>+++ b/lib/eal/freebsd/eal.c
>@@ -946,7 +946,7 @@ rte_eal_init(int argc, char **argv)
> 		rte_eal_init_alert("Cannot clear runtime directory");
> 		return -1;
> 	}
>-	if (!internal_conf->no_telemetry) {
>+	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
>+!internal_conf->no_telemetry) {
> 		int tlog = rte_log_register_type_and_pick_level(
> 				"lib.telemetry", RTE_LOG_WARNING);
> 		if (tlog < 0)
>diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c index 3577eaeaa4..0d0fc66668
>100644
>--- a/lib/eal/linux/eal.c
>+++ b/lib/eal/linux/eal.c
>@@ -1320,7 +1320,7 @@ rte_eal_init(int argc, char **argv)
> 		rte_eal_init_alert("Cannot clear runtime directory");
> 		return -1;
> 	}
>-	if (!internal_conf->no_telemetry) {
>+	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
>+!internal_conf->no_telemetry) {
> 		int tlog = rte_log_register_type_and_pick_level(
> 				"lib.telemetry", RTE_LOG_WARNING);
> 		if (tlog < 0)
>--
>2.30.2

Good idea, thanks!

Acked-by:  Ciara Power <ciara.power@intel.com>


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v6 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-05 15:14         ` Kevin Traynor
@ 2021-10-07 13:39           ` Power, Ciara
  0 siblings, 0 replies; 68+ messages in thread
From: Power, Ciara @ 2021-10-07 13:39 UTC (permalink / raw)
  To: Kevin Traynor, Richardson, Bruce; +Cc: dev, David Marchand, Burakov, Anatoly


>-----Original Message-----
>From: Kevin Traynor <ktraynor@redhat.com>
>Sent: Tuesday 5 October 2021 16:15
>To: Richardson, Bruce <bruce.richardson@intel.com>
>Cc: dev@dpdk.org; Power, Ciara <ciara.power@intel.com>; David Marchand
><david.marchand@redhat.com>; Burakov, Anatoly <anatoly.burakov@intel.com>
>Subject: Re: [PATCH v6 3/5] telemetry: use unique socket paths for in-memory
>mode
>
>On 05/10/2021 15:52, Bruce Richardson wrote:
>> On Tue, Oct 05, 2021 at 03:41:17PM +0100, Kevin Traynor wrote:
>>> Hi Bruce, I started reviewing v5 before v6 was pushed, so these are
>>> just comments from v5,
>>>
>>
>> No problem. Changes V6-v5 are fairly small anyway. Thanks for the review.
>>
>>> On 05/10/2021 14:59, Bruce Richardson wrote:
>>>> When DPDK is run using "in-memory" flag, multiple processes can be
>>>> run using the same file-prefix and hence the same runtime directory.
>>>> To avoid problems with conflicting telemetry unix socket paths, we
>>>> can put the pid of the process into the socket name. As with the
>>>> existing telemetry socket files, these sockets are removed on normal
>>>> program exit.
>>>>
>>>> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>>>> ---
>>>>    doc/guides/howto/telemetry.rst     | 17 ++++++++++++++++-
>>>>    lib/eal/freebsd/eal.c              |  1 +
>>>>    lib/eal/linux/eal.c                |  1 +
>>>>    lib/telemetry/telemetry.c          | 15 ++++++++++++---
>>>>    lib/telemetry/telemetry_internal.h |  3 ++-
>>>>    5 files changed, 32 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/doc/guides/howto/telemetry.rst
>>>> b/doc/guides/howto/telemetry.rst index 8f4fa1a510..8a61302459 100644
>>>> --- a/doc/guides/howto/telemetry.rst
>>>> +++ b/doc/guides/howto/telemetry.rst
>>>> @@ -13,12 +13,27 @@ ethdev port list, and eal parameters.
>>>>    Telemetry Interface
>>>>    -------------------
>>>> -The :doc:`../prog_guide/telemetry_lib` opens a socket with path
>>>> +For applications run normally, i.e. without the `--in-memory` EAL
>>>> +flag, the :doc:`../prog_guide/telemetry_lib` opens a socket with
>>>> +path
>>>>    *<runtime_directory>/dpdk_telemetry.<version>*. The version represents
>the
>>>>    telemetry version, the latest is v2. For example, a client would connect to a
>>>>    socket with path  */var/run/dpdk/\*/dpdk_telemetry.v2* (when the
>primary process
>>>>    is run by a root user).
>>>> +For applications run with the `--in-memory` EAL flag, the socket
>>>> +file is created with an additional suffix of the process PID.
>>>> +This is because multiple independent DPDK processes can be run
>>>> +simultaneously using the same runtime directory when *in-memory* mode
>is used.
>>>> +For example, when a user with UID 1000 runs processes with
>>>> +in-memory mode, we would find sockets available such as::
>>>> +
>>>> +  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1982
>>>> +  /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1935
>>>> +
>>>
>>> It seems an unnecessary step unless there is multiple process. As a
>>> suggestion, how about "simplifying" by always adding a check for an
>>> active socket with the default name. If it is not found, create it
>>> with the default (pre patches) name. If it is found and active,
>>> create a new one with process id appended. e.g.
>>>
>>> First:
>>> /run/user/1000/dpdk/rte/dpdk_telemetry.v2
>>>
>>> Next:
>>> /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1982
>>> /run/user/1000/dpdk/rte/dpdk_telemetry.v2.1935
>>>
>>> It means that existing socket can still be used by anything using telemetry.
>>> I think it is a nice default to keep as it doesn't require any
>>> changes for anything that will connect (e.g. collectd?)
>>>
>>> The downside is that one will have a different name but it seems an
>>> acceptable trade-off to keep compatibility for single process case.
>>>
>>> WDYT?
>>>
>>
>> Yes, that is an interesting idea, and probably not a bad one.
>>
>> Taking things further, I wonder if using the pid is the best choice
>> for a suffix for the non-single-process case. If we used .1, .2 etc.
>> it would make things more predictable, perhaps easing integration for other
>tools.
>> Each process starting up would use the lowest free suffix, and any
>> external monitoring tools could just be set up to monitor the
>> first/second/third instance, with reproducable names across process
>> restarts.
>>
>
>ok, cool - that sounds better again. Probably you can post and see if anyone else
>finds a hole in it or has a better idea for the next few days.

+1 for this approach of using a counter suffix, would be much simpler to use.

Thanks,
Ciara

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v7 0/5] improve telemetry support with in-memory mode
  2021-09-15 14:10 [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts Bruce Richardson
                   ` (5 preceding siblings ...)
  2021-10-05 13:59 ` [dpdk-dev] [PATCH v6 0/5] improve telemetry support with in-memory mode Bruce Richardson
@ 2021-10-08 17:18 ` Bruce Richardson
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 1/5] eal: limit telemetry to primary processes Bruce Richardson
                     ` (4 more replies)
  2021-10-12 16:39 ` [dpdk-dev] [PATCH v8 0/4] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-14 10:49 ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode Bruce Richardson
  8 siblings, 5 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-08 17:18 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

This patchset cleans up telemetry support for "in-memory" mode, so that
multiple independent processes can be run using that mode and still have
telemetry support. It also removes problems of one process removing the
socket of another - which was the original issue reported. The main changes
in this set are to:

* disable telemetry for secondary processes, which prevents any socket
  conflicts in multi-process cases.
* when multiple processes are run using the same runtime directory (i.e.
  "in-memory" mode or similar), add a counter suffix to the socket names to
  avoid conflicts over the socket. Each process will use the lowest available
  suffix, with the first process using the directory, not adding any suffix.
* update the telemetry script and documentation to allow it to connect to
  in-memory DPDK processes.

---
V7: Change from adding a pid suffix generally in "in-memory" mode, to adding an
increasing counter as a suffix in case of name conflicts generally. This
achieves the same result in terms of connectivity, but keeps compatibility of
behaviour for the case of a single in-memory process, while also providing
predictable more socket names for each process i.e. 4 running in-memory
instances they will always use suffixes 1-3 for the extra 3 sockets, even across
restarts.

V6: fixed issue whereby the failing of the legacy telemetry init would roll-back
init of the v2 telemetry, causing the socket to be deleted, even though it was
still necessary.

V5: Rebase on latest main after other script cleanups were merged

V4: Move from simple-fix patch to proper fix patchset

V3: Drop CC stable, as will have separate backport patch which does not
error out, so avoiding causing problems with currently running application

V2: fix build error on FreeBSD

Bruce Richardson (5):
  eal: limit telemetry to primary processes
  telemetry: fix deletion of active sockets
  telemetry: use unique socket paths for in-memory mode
  usertools/dpdk-telemetry: connect to separate instances
  usertools/dpdk-telemetry: provide info on available sockets

 doc/guides/howto/telemetry.rst | 41 +++++++++++++++++++++
 lib/eal/freebsd/eal.c          |  2 +-
 lib/eal/linux/eal.c            |  2 +-
 lib/telemetry/telemetry.c      | 65 +++++++++++++++++++++++++---------
 usertools/dpdk-telemetry.py    | 45 ++++++++++++++++++++---
 5 files changed, 133 insertions(+), 22 deletions(-)

--
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v7 1/5] eal: limit telemetry to primary processes
  2021-10-08 17:18 ` [dpdk-dev] [PATCH v7 0/5] improve telemetry support with in-memory mode Bruce Richardson
@ 2021-10-08 17:18   ` Bruce Richardson
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 2/5] telemetry: fix deletion of active sockets Bruce Richardson
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-08 17:18 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

Telemetry interface should be exposed for primary processes only, since
secondary processes will conflict on socket creation, and since all
data in secondary process is generally available to primary. For
example, all device stats for ethdevs, cryptodevs, etc. will all be
common across processes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by:  Ciara Power <ciara.power@intel.com>
---
 lib/eal/freebsd/eal.c | 2 +-
 lib/eal/linux/eal.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 6cee5ae369..b06a2c1662 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -946,7 +946,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 3577eaeaa4..0d0fc66668 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1320,7 +1320,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v7 2/5] telemetry: fix deletion of active sockets
  2021-10-08 17:18 ` [dpdk-dev] [PATCH v7 0/5] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 1/5] eal: limit telemetry to primary processes Bruce Richardson
@ 2021-10-08 17:18   ` Bruce Richardson
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-08 17:18 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson, Conor Walsh

When DPDK is run with --in-memory mode, multiple processes can run
simultaneously using the same runtime dir. This leads to each process,
as it starts up, removing the telemetry socket of another process,
giving unexpected behaviour.

This patch changes that behaviour to first check if the existing socket
is active. If not, it's an old socket to be cleaned up and can be
removed. If it is active, telemetry initialization fails and an error
message is printed out giving instructions on how to remove the error;
either by using file-prefix to have a different runtime dir (and
therefore socket path) or by disabling telemetry if it not needed.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
---
 lib/telemetry/telemetry.c | 32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 48f4c7ba46..5d38e90bcc 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -457,27 +457,47 @@ create_socket(char *path)
 
 	struct sockaddr_un sun = {.sun_family = AF_UNIX};
 	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-	unlink(sun.sun_path);
+	TMTY_LOG(DEBUG, "Attempting socket bind to path '%s'\n", path);
+
 	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 		struct stat st;
 
-		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+		TMTY_LOG(DEBUG, "Initial bind to socket '%s' failed.\n", path);
+
+		/* first check if we have a runtime dir */
+		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-		sun.sun_path[0] = 0;
-		goto error;
+			goto error;
+		}
+
+		/* check if current socket is active */
+		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
+			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
+			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
+			goto error;
+		}
+
+		/* socket is not active, delete and attempt rebind */
+		TMTY_LOG(DEBUG, "Attempting unlink and retrying bind\n");
+		unlink(sun.sun_path);
+		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+			goto error;
+		}
 	}
 
 	if (listen(sock, 1) < 0) {
 		TMTY_LOG(ERR, "Error calling listen for socket: %s\n", strerror(errno));
+		unlink(sun.sun_path);
 		goto error;
 	}
+	TMTY_LOG(DEBUG, "Socket creation and binding ok\n");
 
 	return sock;
 
 error:
 	close(sock);
-	unlink_sockets();
+	path[0] = 0;
 	return -1;
 }
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v7 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-08 17:18 ` [dpdk-dev] [PATCH v7 0/5] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 1/5] eal: limit telemetry to primary processes Bruce Richardson
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 2/5] telemetry: fix deletion of active sockets Bruce Richardson
@ 2021-10-08 17:18   ` Bruce Richardson
  2021-10-12 15:37     ` Power, Ciara
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 4/5] usertools/dpdk-telemetry: connect to separate instances Bruce Richardson
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
  4 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-08 17:18 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When running in in-memory mode, multiple processes can use the same
runtime dir, leading to conflicts with the telemetry sockets in that
directory. We can resolve this by appending a suffix to each socket
beyond the first, with the suffix being an increasing counter value.
Each process uses the first unused socket counter value.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/telemetry/telemetry.c | 45 +++++++++++++++++++++++++--------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 5d38e90bcc..a7483167d4 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -467,14 +467,14 @@ create_socket(char *path)
 		/* first check if we have a runtime dir */
 		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-			goto error;
+			close(sock);
+			return -ENOENT;
 		}
 
 		/* check if current socket is active */
 		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
-			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
-			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
-			goto error;
+			close(sock);
+			return -EADDRINUSE;
 		}
 
 		/* socket is not active, delete and attempt rebind */
@@ -482,23 +482,20 @@ create_socket(char *path)
 		unlink(sun.sun_path);
 		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-			goto error;
+			close(sock);
+			return -errno; /* if unlink failed, this will be -EADDRINUSE as above */
 		}
 	}
 
 	if (listen(sock, 1) < 0) {
 		TMTY_LOG(ERR, "Error calling listen for socket: %s\n", strerror(errno));
 		unlink(sun.sun_path);
-		goto error;
+		close(sock);
+		return -errno;
 	}
 	TMTY_LOG(DEBUG, "Socket creation and binding ok\n");
 
 	return sock;
-
-error:
-	close(sock);
-	path[0] = 0;
-	return -1;
 }
 
 static void
@@ -531,8 +528,10 @@ telemetry_legacy_init(void)
 		return -1;
 	}
 	v1_socket.sock = create_socket(v1_socket.path);
-	if (v1_socket.sock < 0)
+	if (v1_socket.sock < 0) {
+		v1_socket.path[0] = '\0';
 		return -1;
+	}
 	rc = pthread_create(&t_old, NULL, socket_listener, &v1_socket);
 	if (rc != 0) {
 		TMTY_LOG(ERR, "Error with create legcay socket thread: %s\n",
@@ -553,7 +552,9 @@ telemetry_legacy_init(void)
 static int
 telemetry_v2_init(void)
 {
+	char spath[sizeof(v2_socket.path)];
 	pthread_t t_new;
+	short suffix = 0;
 	int rc;
 
 	v2_socket.num_clients = &v2_clients;
@@ -564,15 +565,27 @@ telemetry_v2_init(void)
 	rte_telemetry_register_cmd("/help", command_help,
 			"Returns help text for a command. Parameters: string command");
 	v2_socket.fn = client_handler;
-	if (strlcpy(v2_socket.path, get_socket_path(socket_dir, 2),
-			sizeof(v2_socket.path)) >= sizeof(v2_socket.path)) {
+	if (strlcpy(spath, get_socket_path(socket_dir, 2), sizeof(spath)) >= sizeof(spath)) {
 		TMTY_LOG(ERR, "Error with socket binding, path too long\n");
 		return -1;
 	}
+	memcpy(v2_socket.path, spath, sizeof(v2_socket.path));
 
 	v2_socket.sock = create_socket(v2_socket.path);
-	if (v2_socket.sock < 0)
-		return -1;
+	while (v2_socket.sock < 0) {
+		/* bail out on unexpected error, or suffix wrap-around */
+		if (v2_socket.sock != -EADDRINUSE || suffix < 0) {
+			v2_socket.path[0] = '\0'; /* clear socket path */
+			return -1;
+		}
+		/* add a suffix to the path if the basic version fails */
+		if (snprintf(v2_socket.path, sizeof(v2_socket.path), "%s:%d",
+				spath, ++suffix) >= (int)sizeof(v2_socket.path)) {
+			TMTY_LOG(ERR, "Error with socket binding, path too long\n");
+			return -1;
+		}
+		v2_socket.sock = create_socket(v2_socket.path);
+	}
 	rc = pthread_create(&t_new, NULL, socket_listener, &v2_socket);
 	if (rc != 0) {
 		TMTY_LOG(ERR, "Error with create socket thread: %s\n",
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v7 4/5] usertools/dpdk-telemetry: connect to separate instances
  2021-10-08 17:18 ` [dpdk-dev] [PATCH v7 0/5] improve telemetry support with in-memory mode Bruce Richardson
                     ` (2 preceding siblings ...)
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
@ 2021-10-08 17:18   ` Bruce Richardson
  2021-10-12 15:37     ` Power, Ciara
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
  4 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-08 17:18 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

For processes run using "in-memory" mode sharing the same runtime dir,
we add support for connecting to the separate instance sockets created
using ":1", ":2" etc. via new "-i" or "--instance" argument. Add details
on connecting to separate instances to the telemetry howto document.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/howto/telemetry.rst | 41 ++++++++++++++++++++++++++++++++++
 usertools/dpdk-telemetry.py    |  7 +++++-
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index 8f4fa1a510..c9b14a37b7 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -87,3 +87,44 @@ and query information using the telemetry client python script.
        --> /help,/ethdev/xstats
        {"/help": {"/ethdev/xstats": "Returns the extended stats for a port.
        Parameters: int port_id"}}
+
+
+Connecting to Different DPDK Processes
+--------------------------------------
+
+When multiple DPDK process instances are running on a system, the user will
+naturally wish to be able to select the instance to which the connection is
+being made. The method to select the instance depends on how the individual
+instances are run:
+
+* For DPDK processes run using a non-default file-prefix,
+  i.e. using the `--file-prefix` EAL option flag,
+  the file-prefix for the process should be passed via the `-f` or `--file-prefix` script flag.
+
+  For example, to connect to testpmd run as::
+
+     $ ./build/app/dpdk-testpmd -l 2,3 --file-prefix="tpmd"
+
+  One would use the telemetry script command::
+
+     $ ./usertools/dpdk-telemetry -f "tpmd"
+
+* For the case where multiple processes are run using the `--in-memory` EAL flag,
+  but no `-file-prefix` flag, or the same `-file-prefix` flag,
+  those processes will all share the same runtime directory.
+  In this case,
+  each process after the first will add an increasing count suffix to the telemetry socket name,
+  with the processees taking the first available free socket name.
+  This suffix count can be passed to the telemetry script using the `-i` or `--instance` flag.
+
+  For example, if the following two applications are run in separate terminals::
+
+     $ ./build/app/dpdk-testpmd -l 2,3 --in-memory    # will use socket "dpdk_telemetry.v2"
+
+     $ ./build/app/test/dpdk-test -l 4,5 --in-memory  # will use "dpdk_telemetry.v2:1"
+
+  The following telemetry script commands would allow one to connect to each binary::
+
+     $ ./usertools/dpdk-telemetry.py       # will connect to testpmd
+
+     $ ./usertools/dpdk-telemetry.py -i 1  # will connect to test binary
diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index 2974a64732..ce27548c3e 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -112,6 +112,11 @@ def get_dpdk_runtime_dir(fp):
 parser = argparse.ArgumentParser()
 parser.add_argument('-f', '--file-prefix', default='rte',
                     help='Provide file-prefix for DPDK runtime directory')
+parser.add_argument('-i', '--instance', default='0', type=int,
+                    help='Provide file-prefix for DPDK runtime directory')
 args = parser.parse_args()
 rd = get_dpdk_runtime_dir(args.file_prefix)
-handle_socket(os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)))
+sock_path = os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+if args.instance > 0:
+    sock_path += ":{}".format(args.instance)
+handle_socket(sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v7 5/5] usertools/dpdk-telemetry: provide info on available sockets
  2021-10-08 17:18 ` [dpdk-dev] [PATCH v7 0/5] improve telemetry support with in-memory mode Bruce Richardson
                     ` (3 preceding siblings ...)
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 4/5] usertools/dpdk-telemetry: connect to separate instances Bruce Richardson
@ 2021-10-08 17:18   ` Bruce Richardson
  2021-10-12 15:37     ` Power, Ciara
  4 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-08 17:18 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When a user runs the dpdk-telemetry script and fails to connect because
the socket path does not exist, run a scan for possible sockets that
could be connected to and inform the user of the command needed to
connect to those.

For example:

  $ ./usertools/dpdk-telemetry.py -i4
  Connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2:4
  Error connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2:4

  Other DPDK telemetry sockets found:
  - dpdk_telemetry.v2  # Connect with './usertools/dpdk-telemetry.py'
  - dpdk_telemetry.v2:2  # Connect with './usertools/dpdk-telemetry.py -i 2'
  - dpdk_telemetry.v2:1  # Connect with './usertools/dpdk-telemetry.py -i 1'

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 usertools/dpdk-telemetry.py | 42 ++++++++++++++++++++++++++++++++-----
 1 file changed, 37 insertions(+), 5 deletions(-)

diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index ce27548c3e..da3ba60430 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -10,6 +10,7 @@
 import socket
 import os
 import sys
+import glob
 import json
 import errno
 import readline
@@ -17,6 +18,8 @@
 
 # global vars
 TELEMETRY_VERSION = "v2"
+SOCKET_NAME = 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)
+DEFAULT_PREFIX = 'rte'
 CMDS = []
 
 
@@ -48,7 +51,28 @@ def get_app_name(pid):
     return None
 
 
-def handle_socket(path):
+def find_sockets(path):
+    """ Find any possible sockets to connect to and return them """
+    return glob.glob(os.path.join(path, SOCKET_NAME + '*'))
+
+
+def print_socket_options(prefix, paths):
+    """ Given a set of socket paths, give the commands needed to connect """
+    cmd = sys.argv[0]
+    if prefix != DEFAULT_PREFIX:
+        cmd += " -f " + prefix
+    for s in paths:
+        sock_name = os.path.basename(s)
+        if sock_name.endswith(TELEMETRY_VERSION):
+            print("- {}  # Connect with '{}'".format(os.path.basename(s),
+                                                     cmd))
+        else:
+            print("- {}  # Connect with '{} -i {}'".format(os.path.basename(s),
+                                                           cmd,
+                                                           s.split(':')[-1]))
+
+
+def handle_socket(args, path):
     """ Connect to socket and handle user input """
     prompt = ''  # this evaluates to false in conditions
     sock = socket.socket(socket.AF_UNIX, socket.SOCK_SEQPACKET)
@@ -62,6 +86,15 @@ def handle_socket(path):
     except OSError:
         print("Error connecting to " + path)
         sock.close()
+        # if socket exists but is bad, or if non-interactive just return
+        if os.path.exists(path) or not prompt:
+            return
+        # if user didn't give a valid socket path, but there are
+        # some sockets, help the user out by printing how to connect
+        socks = find_sockets(os.path.dirname(path))
+        if socks:
+            print("\nOther DPDK telemetry sockets found:")
+            print_socket_options(args.file_prefix, socks)
         return
     json_reply = read_socket(sock, 1024, prompt)
     output_buf_len = json_reply["max_output_len"]
@@ -110,13 +143,12 @@ def get_dpdk_runtime_dir(fp):
 readline.set_completer_delims(readline.get_completer_delims().replace('/', ''))
 
 parser = argparse.ArgumentParser()
-parser.add_argument('-f', '--file-prefix', default='rte',
+parser.add_argument('-f', '--file-prefix', default=DEFAULT_PREFIX,
                     help='Provide file-prefix for DPDK runtime directory')
 parser.add_argument('-i', '--instance', default='0', type=int,
                     help='Provide file-prefix for DPDK runtime directory')
 args = parser.parse_args()
-rd = get_dpdk_runtime_dir(args.file_prefix)
-sock_path = os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+sock_path = os.path.join(get_dpdk_runtime_dir(args.file_prefix), SOCKET_NAME)
 if args.instance > 0:
     sock_path += ":{}".format(args.instance)
-handle_socket(sock_path)
+handle_socket(args, sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v7 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
@ 2021-10-12 15:37     ` Power, Ciara
  2021-10-12 15:40       ` Bruce Richardson
  0 siblings, 1 reply; 68+ messages in thread
From: Power, Ciara @ 2021-10-12 15:37 UTC (permalink / raw)
  To: Richardson, Bruce, dev; +Cc: David Marchand, Burakov, Anatoly, Kevin Traynor

Hi Bruce,

>-----Original Message-----
>From: Richardson, Bruce <bruce.richardson@intel.com>
>Sent: Friday 8 October 2021 18:19
>To: dev@dpdk.org
>Cc: Power, Ciara <ciara.power@intel.com>; David Marchand
><david.marchand@redhat.com>; Burakov, Anatoly
><anatoly.burakov@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
>Richardson, Bruce <bruce.richardson@intel.com>
>Subject: [PATCH v7 3/5] telemetry: use unique socket paths for in-memory
>mode
>
>When running in in-memory mode, multiple processes can use the same
>runtime dir, leading to conflicts with the telemetry sockets in that directory.
>We can resolve this by appending a suffix to each socket beyond the first,
>with the suffix being an increasing counter value.
>Each process uses the first unused socket counter value.
>
>Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>---
<snip>

Thanks,

Acked-by: Ciara Power <ciara.power@intel.com>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v7 4/5] usertools/dpdk-telemetry: connect to separate instances
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 4/5] usertools/dpdk-telemetry: connect to separate instances Bruce Richardson
@ 2021-10-12 15:37     ` Power, Ciara
  0 siblings, 0 replies; 68+ messages in thread
From: Power, Ciara @ 2021-10-12 15:37 UTC (permalink / raw)
  To: Richardson, Bruce, dev; +Cc: David Marchand, Burakov, Anatoly, Kevin Traynor

>-----Original Message-----
>From: Richardson, Bruce <bruce.richardson@intel.com>
>Sent: Friday 8 October 2021 18:19
>To: dev@dpdk.org
>Cc: Power, Ciara <ciara.power@intel.com>; David Marchand
><david.marchand@redhat.com>; Burakov, Anatoly
><anatoly.burakov@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
>Richardson, Bruce <bruce.richardson@intel.com>
>Subject: [PATCH v7 4/5] usertools/dpdk-telemetry: connect to separate
>instances
>
>For processes run using "in-memory" mode sharing the same runtime dir, we
>add support for connecting to the separate instance sockets created using
>":1", ":2" etc. via new "-i" or "--instance" argument. Add details on connecting
>to separate instances to the telemetry howto document.
>
>Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>---
<snip>

Acked-by: Ciara Power <ciara.power@intel.com>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v7 5/5] usertools/dpdk-telemetry: provide info on available sockets
  2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
@ 2021-10-12 15:37     ` Power, Ciara
  0 siblings, 0 replies; 68+ messages in thread
From: Power, Ciara @ 2021-10-12 15:37 UTC (permalink / raw)
  To: Richardson, Bruce, dev; +Cc: David Marchand, Burakov, Anatoly, Kevin Traynor

>-----Original Message-----
>From: Richardson, Bruce <bruce.richardson@intel.com>
>Sent: Friday 8 October 2021 18:19
>To: dev@dpdk.org
>Cc: Power, Ciara <ciara.power@intel.com>; David Marchand
><david.marchand@redhat.com>; Burakov, Anatoly
><anatoly.burakov@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
>Richardson, Bruce <bruce.richardson@intel.com>
>Subject: [PATCH v7 5/5] usertools/dpdk-telemetry: provide info on available
>sockets
>
>When a user runs the dpdk-telemetry script and fails to connect because the
>socket path does not exist, run a scan for possible sockets that could be
>connected to and inform the user of the command needed to connect to
>those.
>
>For example:
>
>  $ ./usertools/dpdk-telemetry.py -i4
>  Connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2:4
>  Error connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2:4
>
>  Other DPDK telemetry sockets found:
>  - dpdk_telemetry.v2  # Connect with './usertools/dpdk-telemetry.py'
>  - dpdk_telemetry.v2:2  # Connect with './usertools/dpdk-telemetry.py -i 2'
>  - dpdk_telemetry.v2:1  # Connect with './usertools/dpdk-telemetry.py -i 1'
>
>Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>---
<snip>

Acked-by: Ciara Power <ciara.power@intel.com>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v7 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-12 15:37     ` Power, Ciara
@ 2021-10-12 15:40       ` Bruce Richardson
  2021-10-12 15:47         ` Power, Ciara
  0 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-12 15:40 UTC (permalink / raw)
  To: Power, Ciara; +Cc: dev, David Marchand, Burakov, Anatoly, Kevin Traynor

On Tue, Oct 12, 2021 at 04:37:07PM +0100, Power, Ciara wrote:
> Hi Bruce,
> 
> >-----Original Message-----
> >From: Richardson, Bruce <bruce.richardson@intel.com>
> >Sent: Friday 8 October 2021 18:19
> >To: dev@dpdk.org
> >Cc: Power, Ciara <ciara.power@intel.com>; David Marchand
> ><david.marchand@redhat.com>; Burakov, Anatoly
> ><anatoly.burakov@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
> >Richardson, Bruce <bruce.richardson@intel.com>
> >Subject: [PATCH v7 3/5] telemetry: use unique socket paths for in-memory
> >mode
> >
> >When running in in-memory mode, multiple processes can use the same
> >runtime dir, leading to conflicts with the telemetry sockets in that directory.
> >We can resolve this by appending a suffix to each socket beyond the first,
> >with the suffix being an increasing counter value.
> >Each process uses the first unused socket counter value.
> >
> >Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> >---
> <snip>
> 
> Thanks,
> 
> Acked-by: Ciara Power <ciara.power@intel.com>

Thanks.

I'm actually in two minds as to whether to this patch should be merged with
the previous one. It was already reviewed and acked as a bugfix, but this
patch also affects a lot of the same areas of code. What do you think?

/Bruce

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v7 3/5] telemetry: use unique socket paths for in-memory mode
  2021-10-12 15:40       ` Bruce Richardson
@ 2021-10-12 15:47         ` Power, Ciara
  0 siblings, 0 replies; 68+ messages in thread
From: Power, Ciara @ 2021-10-12 15:47 UTC (permalink / raw)
  To: Richardson, Bruce; +Cc: dev, David Marchand, Burakov, Anatoly, Kevin Traynor

>-----Original Message-----
>From: Richardson, Bruce <bruce.richardson@intel.com>
>Sent: Tuesday 12 October 2021 16:41
>To: Power, Ciara <ciara.power@intel.com>
>Cc: dev@dpdk.org; David Marchand <david.marchand@redhat.com>;
>Burakov, Anatoly <anatoly.burakov@intel.com>; Kevin Traynor
><ktraynor@redhat.com>
>Subject: Re: [PATCH v7 3/5] telemetry: use unique socket paths for in-
>memory mode
>
>On Tue, Oct 12, 2021 at 04:37:07PM +0100, Power, Ciara wrote:
>> Hi Bruce,
>>
>> >-----Original Message-----
>> >From: Richardson, Bruce <bruce.richardson@intel.com>
>> >Sent: Friday 8 October 2021 18:19
>> >To: dev@dpdk.org
>> >Cc: Power, Ciara <ciara.power@intel.com>; David Marchand
>> ><david.marchand@redhat.com>; Burakov, Anatoly
>> ><anatoly.burakov@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
>> >Richardson, Bruce <bruce.richardson@intel.com>
>> >Subject: [PATCH v7 3/5] telemetry: use unique socket paths for
>> >in-memory mode
>> >
>> >When running in in-memory mode, multiple processes can use the same
>> >runtime dir, leading to conflicts with the telemetry sockets in that
>directory.
>> >We can resolve this by appending a suffix to each socket beyond the
>> >first, with the suffix being an increasing counter value.
>> >Each process uses the first unused socket counter value.
>> >
>> >Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>> >---
>> <snip>
>>
>> Thanks,
>>
>> Acked-by: Ciara Power <ciara.power@intel.com>
>
>Thanks.
>
>I'm actually in two minds as to whether to this patch should be merged with
>the previous one. It was already reviewed and acked as a bugfix, but this
>patch also affects a lot of the same areas of code. What do you think?
>
>/Bruce

Yes, I was actually wondering about that myself.
I think it would make more sense to merge them, rather than adding some code in patch 2 that then gets removed in patch 3.

Thanks,
Ciara



^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v8 0/4] improve telemetry support with in-memory mode
  2021-09-15 14:10 [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts Bruce Richardson
                   ` (6 preceding siblings ...)
  2021-10-08 17:18 ` [dpdk-dev] [PATCH v7 0/5] improve telemetry support with in-memory mode Bruce Richardson
@ 2021-10-12 16:39 ` Bruce Richardson
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 1/4] eal: limit telemetry to primary processes Bruce Richardson
                     ` (3 more replies)
  2021-10-14 10:49 ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode Bruce Richardson
  8 siblings, 4 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-12 16:39 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

This patchset cleans up telemetry support for "in-memory" mode, so that
multiple independent processes can be run using that mode and still have
telemetry support. It also removes problems of one process removing the
socket of another - which was the original issue reported. The main changes
in this set are to:

* disable telemetry for secondary processes, which prevents any socket
  conflicts in multi-process cases.
* when multiple processes are run using the same runtime directory (i.e.
  "in-memory" mode or similar), add a counter suffix to the socket names to
  avoid conflicts over the socket. Each process will use the lowest available
  suffix, with the first process using the directory, not adding any suffix.
* update the telemetry script and documentation to allow it to connect to
  in-memory DPDK processes.

---
V8: Merged patches 2 & 3 of the set together. Fixed some checkpatch warnings
flagged by the CI.

V7: Change from adding a pid suffix generally in "in-memory" mode, to adding an
increasing counter as a suffix in case of name conflicts generally. This
achieves the same result in terms of connectivity, but keeps compatibility of
behaviour for the case of a single in-memory process, while also providing
predictable more socket names for each process i.e. 4 running in-memory
instances they will always use suffixes 1-3 for the extra 3 sockets, even across
restarts.

V6: fixed issue whereby the failing of the legacy telemetry init would roll-back
init of the v2 telemetry, causing the socket to be deleted, even though it was
still necessary.

V5: Rebase on latest main after other script cleanups were merged

V4: Move from simple-fix patch to proper fix patchset

V3: Drop CC stable, as will have separate backport patch which does not
error out, so avoiding causing problems with currently running application

V2: fix build error on FreeBSD

Bruce Richardson (4):
  eal: limit telemetry to primary processes
  telemetry: fix socket path conflicts for in-memory mode
  usertools/dpdk-telemetry: connect to separate instances
  usertools/dpdk-telemetry: provide info on available sockets

 doc/guides/howto/telemetry.rst | 41 +++++++++++++++++++++
 lib/eal/freebsd/eal.c          |  2 +-
 lib/eal/linux/eal.c            |  2 +-
 lib/telemetry/telemetry.c      | 65 +++++++++++++++++++++++++---------
 usertools/dpdk-telemetry.py    | 45 ++++++++++++++++++++---
 5 files changed, 133 insertions(+), 22 deletions(-)

--
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v8 1/4] eal: limit telemetry to primary processes
  2021-10-12 16:39 ` [dpdk-dev] [PATCH v8 0/4] improve telemetry support with in-memory mode Bruce Richardson
@ 2021-10-12 16:39   ` Bruce Richardson
  2021-10-13 13:15     ` Walsh, Conor
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 2/4] telemetry: fix socket path conflicts for in-memory mode Bruce Richardson
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-12 16:39 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

Telemetry interface should be exposed for primary processes only, since
secondary processes will conflict on socket creation, and since all
data in secondary process is generally available to primary. For
example, all device stats for ethdevs, cryptodevs, etc. will all be
common across processes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
---
 lib/eal/freebsd/eal.c | 2 +-
 lib/eal/linux/eal.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 6cee5ae369..b06a2c1662 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -946,7 +946,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 3577eaeaa4..0d0fc66668 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1320,7 +1320,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v8 2/4] telemetry: fix socket path conflicts for in-memory mode
  2021-10-12 16:39 ` [dpdk-dev] [PATCH v8 0/4] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 1/4] eal: limit telemetry to primary processes Bruce Richardson
@ 2021-10-12 16:39   ` Bruce Richardson
  2021-10-13 13:15     ` Walsh, Conor
  2021-10-14  9:40     ` Kevin Traynor
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 3/4] usertools/dpdk-telemetry: connect to separate instances Bruce Richardson
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 4/4] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
  3 siblings, 2 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-12 16:39 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When running using in-memory mode, multiple processes can use the same
runtime dir, leading to conflicts with the telemetry sockets in that
directory. We can resolve this by appending a suffix to each socket
beyond the first, with the suffix being an increasing counter value.
Each process uses the first unused socket counter value.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
---
 lib/telemetry/telemetry.c | 65 +++++++++++++++++++++++++++++----------
 1 file changed, 49 insertions(+), 16 deletions(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 48f4c7ba46..a7483167d4 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -457,28 +457,45 @@ create_socket(char *path)
 
 	struct sockaddr_un sun = {.sun_family = AF_UNIX};
 	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-	unlink(sun.sun_path);
+	TMTY_LOG(DEBUG, "Attempting socket bind to path '%s'\n", path);
+
 	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 		struct stat st;
 
-		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+		TMTY_LOG(DEBUG, "Initial bind to socket '%s' failed.\n", path);
+
+		/* first check if we have a runtime dir */
+		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-		sun.sun_path[0] = 0;
-		goto error;
+			close(sock);
+			return -ENOENT;
+		}
+
+		/* check if current socket is active */
+		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
+			close(sock);
+			return -EADDRINUSE;
+		}
+
+		/* socket is not active, delete and attempt rebind */
+		TMTY_LOG(DEBUG, "Attempting unlink and retrying bind\n");
+		unlink(sun.sun_path);
+		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+			close(sock);
+			return -errno; /* if unlink failed, this will be -EADDRINUSE as above */
+		}
 	}
 
 	if (listen(sock, 1) < 0) {
 		TMTY_LOG(ERR, "Error calling listen for socket: %s\n", strerror(errno));
-		goto error;
+		unlink(sun.sun_path);
+		close(sock);
+		return -errno;
 	}
+	TMTY_LOG(DEBUG, "Socket creation and binding ok\n");
 
 	return sock;
-
-error:
-	close(sock);
-	unlink_sockets();
-	return -1;
 }
 
 static void
@@ -511,8 +528,10 @@ telemetry_legacy_init(void)
 		return -1;
 	}
 	v1_socket.sock = create_socket(v1_socket.path);
-	if (v1_socket.sock < 0)
+	if (v1_socket.sock < 0) {
+		v1_socket.path[0] = '\0';
 		return -1;
+	}
 	rc = pthread_create(&t_old, NULL, socket_listener, &v1_socket);
 	if (rc != 0) {
 		TMTY_LOG(ERR, "Error with create legcay socket thread: %s\n",
@@ -533,7 +552,9 @@ telemetry_legacy_init(void)
 static int
 telemetry_v2_init(void)
 {
+	char spath[sizeof(v2_socket.path)];
 	pthread_t t_new;
+	short suffix = 0;
 	int rc;
 
 	v2_socket.num_clients = &v2_clients;
@@ -544,15 +565,27 @@ telemetry_v2_init(void)
 	rte_telemetry_register_cmd("/help", command_help,
 			"Returns help text for a command. Parameters: string command");
 	v2_socket.fn = client_handler;
-	if (strlcpy(v2_socket.path, get_socket_path(socket_dir, 2),
-			sizeof(v2_socket.path)) >= sizeof(v2_socket.path)) {
+	if (strlcpy(spath, get_socket_path(socket_dir, 2), sizeof(spath)) >= sizeof(spath)) {
 		TMTY_LOG(ERR, "Error with socket binding, path too long\n");
 		return -1;
 	}
+	memcpy(v2_socket.path, spath, sizeof(v2_socket.path));
 
 	v2_socket.sock = create_socket(v2_socket.path);
-	if (v2_socket.sock < 0)
-		return -1;
+	while (v2_socket.sock < 0) {
+		/* bail out on unexpected error, or suffix wrap-around */
+		if (v2_socket.sock != -EADDRINUSE || suffix < 0) {
+			v2_socket.path[0] = '\0'; /* clear socket path */
+			return -1;
+		}
+		/* add a suffix to the path if the basic version fails */
+		if (snprintf(v2_socket.path, sizeof(v2_socket.path), "%s:%d",
+				spath, ++suffix) >= (int)sizeof(v2_socket.path)) {
+			TMTY_LOG(ERR, "Error with socket binding, path too long\n");
+			return -1;
+		}
+		v2_socket.sock = create_socket(v2_socket.path);
+	}
 	rc = pthread_create(&t_new, NULL, socket_listener, &v2_socket);
 	if (rc != 0) {
 		TMTY_LOG(ERR, "Error with create socket thread: %s\n",
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v8 3/4] usertools/dpdk-telemetry: connect to separate instances
  2021-10-12 16:39 ` [dpdk-dev] [PATCH v8 0/4] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 1/4] eal: limit telemetry to primary processes Bruce Richardson
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 2/4] telemetry: fix socket path conflicts for in-memory mode Bruce Richardson
@ 2021-10-12 16:39   ` Bruce Richardson
  2021-10-13 13:15     ` Walsh, Conor
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 4/4] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
  3 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-12 16:39 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

For processes run using "in-memory" mode sharing the same runtime dir,
we add support for connecting to the separate instance sockets created
using ":1", ":2" etc. via new "-i" or "--instance" argument. Add details
on connecting to separate instances to the telemetry howto document.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
---
 doc/guides/howto/telemetry.rst | 41 ++++++++++++++++++++++++++++++++++
 usertools/dpdk-telemetry.py    |  7 +++++-
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index 8f4fa1a510..e4edb53fa4 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -87,3 +87,44 @@ and query information using the telemetry client python script.
        --> /help,/ethdev/xstats
        {"/help": {"/ethdev/xstats": "Returns the extended stats for a port.
        Parameters: int port_id"}}
+
+
+Connecting to Different DPDK Processes
+--------------------------------------
+
+When multiple DPDK process instances are running on a system, the user will
+naturally wish to be able to select the instance to which the connection is
+being made. The method to select the instance depends on how the individual
+instances are run:
+
+* For DPDK processes run using a non-default file-prefix,
+  i.e. using the `--file-prefix` EAL option flag,
+  the file-prefix for the process should be passed via the `-f` or `--file-prefix` script flag.
+
+  For example, to connect to testpmd run as::
+
+     $ ./build/app/dpdk-testpmd -l 2,3 --file-prefix="tpmd"
+
+  One would use the telemetry script command::
+
+     $ ./usertools/dpdk-telemetry -f "tpmd"
+
+* For the case where multiple processes are run using the `--in-memory` EAL flag,
+  but no `-file-prefix` flag, or the same `-file-prefix` flag,
+  those processes will all share the same runtime directory.
+  In this case,
+  each process after the first will add an increasing count suffix to the telemetry socket name,
+  with each one taking the first available free socket name.
+  This suffix count can be passed to the telemetry script using the `-i` or `--instance` flag.
+
+  For example, if the following two applications are run in separate terminals::
+
+     $ ./build/app/dpdk-testpmd -l 2,3 --in-memory    # will use socket "dpdk_telemetry.v2"
+
+     $ ./build/app/test/dpdk-test -l 4,5 --in-memory  # will use "dpdk_telemetry.v2:1"
+
+  The following telemetry script commands would allow one to connect to each binary::
+
+     $ ./usertools/dpdk-telemetry.py       # will connect to testpmd
+
+     $ ./usertools/dpdk-telemetry.py -i 1  # will connect to test binary
diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index 2974a64732..ce27548c3e 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -112,6 +112,11 @@ def get_dpdk_runtime_dir(fp):
 parser = argparse.ArgumentParser()
 parser.add_argument('-f', '--file-prefix', default='rte',
                     help='Provide file-prefix for DPDK runtime directory')
+parser.add_argument('-i', '--instance', default='0', type=int,
+                    help='Provide file-prefix for DPDK runtime directory')
 args = parser.parse_args()
 rd = get_dpdk_runtime_dir(args.file_prefix)
-handle_socket(os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)))
+sock_path = os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+if args.instance > 0:
+    sock_path += ":{}".format(args.instance)
+handle_socket(sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v8 4/4] usertools/dpdk-telemetry: provide info on available sockets
  2021-10-12 16:39 ` [dpdk-dev] [PATCH v8 0/4] improve telemetry support with in-memory mode Bruce Richardson
                     ` (2 preceding siblings ...)
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 3/4] usertools/dpdk-telemetry: connect to separate instances Bruce Richardson
@ 2021-10-12 16:39   ` Bruce Richardson
  2021-10-13 13:15     ` Walsh, Conor
  3 siblings, 1 reply; 68+ messages in thread
From: Bruce Richardson @ 2021-10-12 16:39 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

When a user runs the dpdk-telemetry script and fails to connect because
the socket path does not exist, run a scan for possible sockets that
could be connected to and inform the user of the command needed to
connect to those.

For example:

  $ ./dpdk-telemetry.py -i4
  Connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2:4
  Error connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2:4

  Other DPDK telemetry sockets found:
  - dpdk_telemetry.v2  # Connect with './dpdk-telemetry.py'
  - dpdk_telemetry.v2:2  # Connect with './dpdk-telemetry.py -i 2'
  - dpdk_telemetry.v2:1  # Connect with './dpdk-telemetry.py -i 1'

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
---
 usertools/dpdk-telemetry.py | 42 ++++++++++++++++++++++++++++++++-----
 1 file changed, 37 insertions(+), 5 deletions(-)

diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index ce27548c3e..da3ba60430 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -10,6 +10,7 @@
 import socket
 import os
 import sys
+import glob
 import json
 import errno
 import readline
@@ -17,6 +18,8 @@
 
 # global vars
 TELEMETRY_VERSION = "v2"
+SOCKET_NAME = 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)
+DEFAULT_PREFIX = 'rte'
 CMDS = []
 
 
@@ -48,7 +51,28 @@ def get_app_name(pid):
     return None
 
 
-def handle_socket(path):
+def find_sockets(path):
+    """ Find any possible sockets to connect to and return them """
+    return glob.glob(os.path.join(path, SOCKET_NAME + '*'))
+
+
+def print_socket_options(prefix, paths):
+    """ Given a set of socket paths, give the commands needed to connect """
+    cmd = sys.argv[0]
+    if prefix != DEFAULT_PREFIX:
+        cmd += " -f " + prefix
+    for s in paths:
+        sock_name = os.path.basename(s)
+        if sock_name.endswith(TELEMETRY_VERSION):
+            print("- {}  # Connect with '{}'".format(os.path.basename(s),
+                                                     cmd))
+        else:
+            print("- {}  # Connect with '{} -i {}'".format(os.path.basename(s),
+                                                           cmd,
+                                                           s.split(':')[-1]))
+
+
+def handle_socket(args, path):
     """ Connect to socket and handle user input """
     prompt = ''  # this evaluates to false in conditions
     sock = socket.socket(socket.AF_UNIX, socket.SOCK_SEQPACKET)
@@ -62,6 +86,15 @@ def handle_socket(path):
     except OSError:
         print("Error connecting to " + path)
         sock.close()
+        # if socket exists but is bad, or if non-interactive just return
+        if os.path.exists(path) or not prompt:
+            return
+        # if user didn't give a valid socket path, but there are
+        # some sockets, help the user out by printing how to connect
+        socks = find_sockets(os.path.dirname(path))
+        if socks:
+            print("\nOther DPDK telemetry sockets found:")
+            print_socket_options(args.file_prefix, socks)
         return
     json_reply = read_socket(sock, 1024, prompt)
     output_buf_len = json_reply["max_output_len"]
@@ -110,13 +143,12 @@ def get_dpdk_runtime_dir(fp):
 readline.set_completer_delims(readline.get_completer_delims().replace('/', ''))
 
 parser = argparse.ArgumentParser()
-parser.add_argument('-f', '--file-prefix', default='rte',
+parser.add_argument('-f', '--file-prefix', default=DEFAULT_PREFIX,
                     help='Provide file-prefix for DPDK runtime directory')
 parser.add_argument('-i', '--instance', default='0', type=int,
                     help='Provide file-prefix for DPDK runtime directory')
 args = parser.parse_args()
-rd = get_dpdk_runtime_dir(args.file_prefix)
-sock_path = os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+sock_path = os.path.join(get_dpdk_runtime_dir(args.file_prefix), SOCKET_NAME)
 if args.instance > 0:
     sock_path += ":{}".format(args.instance)
-handle_socket(sock_path)
+handle_socket(args, sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v8 1/4] eal: limit telemetry to primary processes
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 1/4] eal: limit telemetry to primary processes Bruce Richardson
@ 2021-10-13 13:15     ` Walsh, Conor
  0 siblings, 0 replies; 68+ messages in thread
From: Walsh, Conor @ 2021-10-13 13:15 UTC (permalink / raw)
  To: Richardson, Bruce, dev
  Cc: Power, Ciara, David Marchand, Burakov, Anatoly, Kevin Traynor,
	Richardson, Bruce

> From: dev <dev-bounces@dpdk.org> On Behalf Of Bruce Richardson
> Sent: Tuesday 12 October 2021 17:39
> To: dev@dpdk.org
> Cc: Power, Ciara <ciara.power@intel.com>; David Marchand
> <david.marchand@redhat.com>; Burakov, Anatoly
> <anatoly.burakov@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
> Richardson, Bruce <bruce.richardson@intel.com>
> Subject: [dpdk-dev] [PATCH v8 1/4] eal: limit telemetry to primary processes
> 
> Telemetry interface should be exposed for primary processes only, since
> secondary processes will conflict on socket creation, and since all
> data in secondary process is generally available to primary. For
> example, all device stats for ethdevs, cryptodevs, etc. will all be
> common across processes.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Ciara Power <ciara.power@intel.com>

Tested-by: Conor Walsh <conor.walsh@intel.com>

<snip>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v8 2/4] telemetry: fix socket path conflicts for in-memory mode
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 2/4] telemetry: fix socket path conflicts for in-memory mode Bruce Richardson
@ 2021-10-13 13:15     ` Walsh, Conor
  2021-10-14  9:40     ` Kevin Traynor
  1 sibling, 0 replies; 68+ messages in thread
From: Walsh, Conor @ 2021-10-13 13:15 UTC (permalink / raw)
  To: Richardson, Bruce, dev
  Cc: Power, Ciara, David Marchand, Burakov, Anatoly, Kevin Traynor,
	Richardson, Bruce

> From: dev <dev-bounces@dpdk.org> On Behalf Of Bruce Richardson
> Sent: Tuesday 12 October 2021 17:39
> To: dev@dpdk.org
> Cc: Power, Ciara <ciara.power@intel.com>; David Marchand
> <david.marchand@redhat.com>; Burakov, Anatoly
> <anatoly.burakov@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
> Richardson, Bruce <bruce.richardson@intel.com>
> Subject: [dpdk-dev] [PATCH v8 2/4] telemetry: fix socket path conflicts for in-
> memory mode
> 
> When running using in-memory mode, multiple processes can use the same
> runtime dir, leading to conflicts with the telemetry sockets in that
> directory. We can resolve this by appending a suffix to each socket
> beyond the first, with the suffix being an increasing counter value.
> Each process uses the first unused socket counter value.
> 
> Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
> 
> Reported-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Ciara Power <ciara.power@intel.com>
> ---

Tested-by: Conor Walsh <conor.walsh@intel.com>

<snip>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v8 3/4] usertools/dpdk-telemetry: connect to separate instances
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 3/4] usertools/dpdk-telemetry: connect to separate instances Bruce Richardson
@ 2021-10-13 13:15     ` Walsh, Conor
  0 siblings, 0 replies; 68+ messages in thread
From: Walsh, Conor @ 2021-10-13 13:15 UTC (permalink / raw)
  To: Richardson, Bruce, dev
  Cc: Power, Ciara, David Marchand, Burakov, Anatoly, Kevin Traynor,
	Richardson, Bruce

> From: dev <dev-bounces@dpdk.org> On Behalf Of Bruce Richardson
> Sent: Tuesday 12 October 2021 17:39
> To: dev@dpdk.org
> Cc: Power, Ciara <ciara.power@intel.com>; David Marchand
> <david.marchand@redhat.com>; Burakov, Anatoly
> <anatoly.burakov@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
> Richardson, Bruce <bruce.richardson@intel.com>
> Subject: [dpdk-dev] [PATCH v8 3/4] usertools/dpdk-telemetry: connect to
> separate instances
> 
> For processes run using "in-memory" mode sharing the same runtime dir,
> we add support for connecting to the separate instance sockets created
> using ":1", ":2" etc. via new "-i" or "--instance" argument. Add details
> on connecting to separate instances to the telemetry howto document.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Ciara Power <ciara.power@intel.com>
> ---

Tested-by: Conor Walsh <conor.walsh@intel.com>

<snip>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v8 4/4] usertools/dpdk-telemetry: provide info on available sockets
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 4/4] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
@ 2021-10-13 13:15     ` Walsh, Conor
  0 siblings, 0 replies; 68+ messages in thread
From: Walsh, Conor @ 2021-10-13 13:15 UTC (permalink / raw)
  To: Richardson, Bruce, dev
  Cc: Power, Ciara, David Marchand, Burakov, Anatoly, Kevin Traynor,
	Richardson, Bruce

> From: dev <dev-bounces@dpdk.org> On Behalf Of Bruce Richardson
> Sent: Tuesday 12 October 2021 17:39
> To: dev@dpdk.org
> Cc: Power, Ciara <ciara.power@intel.com>; David Marchand
> <david.marchand@redhat.com>; Burakov, Anatoly
> <anatoly.burakov@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
> Richardson, Bruce <bruce.richardson@intel.com>
> Subject: [dpdk-dev] [PATCH v8 4/4] usertools/dpdk-telemetry: provide info
> on available sockets
> 
> When a user runs the dpdk-telemetry script and fails to connect because
> the socket path does not exist, run a scan for possible sockets that
> could be connected to and inform the user of the command needed to
> connect to those.
> 
> For example:
> 
>   $ ./dpdk-telemetry.py -i4
>   Connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2:4
>   Error connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2:4
> 
>   Other DPDK telemetry sockets found:
>   - dpdk_telemetry.v2  # Connect with './dpdk-telemetry.py'
>   - dpdk_telemetry.v2:2  # Connect with './dpdk-telemetry.py -i 2'
>   - dpdk_telemetry.v2:1  # Connect with './dpdk-telemetry.py -i 1'
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Ciara Power <ciara.power@intel.com>
> ---

<snip>

> +def print_socket_options(prefix, paths):
> +    """ Given a set of socket paths, give the commands needed to connect
> """
> +    cmd = sys.argv[0]
> +    if prefix != DEFAULT_PREFIX:
> +        cmd += " -f " + prefix
> +    for s in paths:

When I tested this the sockets were in the wrong order e.g. 2, 1, 0.
If the above paths variable was wrapped with sorted() it would ensure the list
was always in the correct order.

With or without this change:
Reviewed-by: Conor Walsh <conor.walsh@intel.com>

> +        sock_name = os.path.basename(s)
> +        if sock_name.endswith(TELEMETRY_VERSION):
> +            print("- {}  # Connect with '{}'".format(os.path.basename(s),
> +                                                     cmd))
> +        else:
> +            print("- {}  # Connect with '{} -i {}'".format(os.path.basename(s),
> +                                                           cmd,
> +                                                           s.split(':')[-1]))

<snip>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v8 2/4] telemetry: fix socket path conflicts for in-memory mode
  2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 2/4] telemetry: fix socket path conflicts for in-memory mode Bruce Richardson
  2021-10-13 13:15     ` Walsh, Conor
@ 2021-10-14  9:40     ` Kevin Traynor
  1 sibling, 0 replies; 68+ messages in thread
From: Kevin Traynor @ 2021-10-14  9:40 UTC (permalink / raw)
  To: Bruce Richardson, dev; +Cc: Ciara Power, David Marchand, Anatoly Burakov

On 12/10/2021 17:39, Bruce Richardson wrote:
> When running using in-memory mode, multiple processes can use the same
> runtime dir, leading to conflicts with the telemetry sockets in that
> directory. We can resolve this by appending a suffix to each socket
> beyond the first, with the suffix being an increasing counter value.
> Each process uses the first unused socket counter value.
> 
> Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
> 
> Reported-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Ciara Power <ciara.power@intel.com>

Thanks Bruce, lgtm. Nice solution to keep existing name where possible 
and to give as predictable as you can names for others.

Acked-by: Kevin Traynor <ktraynor@redhat.com>

> ---
>   lib/telemetry/telemetry.c | 65 +++++++++++++++++++++++++++++----------
>   1 file changed, 49 insertions(+), 16 deletions(-)
> 
> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
> index 48f4c7ba46..a7483167d4 100644
> --- a/lib/telemetry/telemetry.c
> +++ b/lib/telemetry/telemetry.c
> @@ -457,28 +457,45 @@ create_socket(char *path)
>   
>   	struct sockaddr_un sun = {.sun_family = AF_UNIX};
>   	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
> -	unlink(sun.sun_path);
> +	TMTY_LOG(DEBUG, "Attempting socket bind to path '%s'\n", path);
> +
>   	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
>   		struct stat st;
>   
> -		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
> -		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
> +		TMTY_LOG(DEBUG, "Initial bind to socket '%s' failed.\n", path);
> +
> +		/* first check if we have a runtime dir */
> +		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
>   			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
> -		sun.sun_path[0] = 0;
> -		goto error;
> +			close(sock);
> +			return -ENOENT;
> +		}
> +
> +		/* check if current socket is active */
> +		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
> +			close(sock);
> +			return -EADDRINUSE;
> +		}
> +
> +		/* socket is not active, delete and attempt rebind */
> +		TMTY_LOG(DEBUG, "Attempting unlink and retrying bind\n");
> +		unlink(sun.sun_path);
> +		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
> +			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
> +			close(sock);
> +			return -errno; /* if unlink failed, this will be -EADDRINUSE as above */
> +		}
>   	}
>   
>   	if (listen(sock, 1) < 0) {
>   		TMTY_LOG(ERR, "Error calling listen for socket: %s\n", strerror(errno));
> -		goto error;
> +		unlink(sun.sun_path);
> +		close(sock);
> +		return -errno;
>   	}
> +	TMTY_LOG(DEBUG, "Socket creation and binding ok\n");
>   
>   	return sock;
> -
> -error:
> -	close(sock);
> -	unlink_sockets();
> -	return -1;
>   }
>   
>   static void
> @@ -511,8 +528,10 @@ telemetry_legacy_init(void)
>   		return -1;
>   	}
>   	v1_socket.sock = create_socket(v1_socket.path);
> -	if (v1_socket.sock < 0)
> +	if (v1_socket.sock < 0) {
> +		v1_socket.path[0] = '\0';
>   		return -1;
> +	}
>   	rc = pthread_create(&t_old, NULL, socket_listener, &v1_socket);
>   	if (rc != 0) {
>   		TMTY_LOG(ERR, "Error with create legcay socket thread: %s\n",
> @@ -533,7 +552,9 @@ telemetry_legacy_init(void)
>   static int
>   telemetry_v2_init(void)
>   {
> +	char spath[sizeof(v2_socket.path)];
>   	pthread_t t_new;
> +	short suffix = 0;
>   	int rc;
>   
>   	v2_socket.num_clients = &v2_clients;
> @@ -544,15 +565,27 @@ telemetry_v2_init(void)
>   	rte_telemetry_register_cmd("/help", command_help,
>   			"Returns help text for a command. Parameters: string command");
>   	v2_socket.fn = client_handler;
> -	if (strlcpy(v2_socket.path, get_socket_path(socket_dir, 2),
> -			sizeof(v2_socket.path)) >= sizeof(v2_socket.path)) {
> +	if (strlcpy(spath, get_socket_path(socket_dir, 2), sizeof(spath)) >= sizeof(spath)) {
>   		TMTY_LOG(ERR, "Error with socket binding, path too long\n");
>   		return -1;
>   	}
> +	memcpy(v2_socket.path, spath, sizeof(v2_socket.path));
>   
>   	v2_socket.sock = create_socket(v2_socket.path);
> -	if (v2_socket.sock < 0)
> -		return -1;
> +	while (v2_socket.sock < 0) {
> +		/* bail out on unexpected error, or suffix wrap-around */
> +		if (v2_socket.sock != -EADDRINUSE || suffix < 0) {
> +			v2_socket.path[0] = '\0'; /* clear socket path */
> +			return -1;
> +		}
> +		/* add a suffix to the path if the basic version fails */
> +		if (snprintf(v2_socket.path, sizeof(v2_socket.path), "%s:%d",
> +				spath, ++suffix) >= (int)sizeof(v2_socket.path)) {
> +			TMTY_LOG(ERR, "Error with socket binding, path too long\n");
> +			return -1;
> +		}
> +		v2_socket.sock = create_socket(v2_socket.path);
> +	}
>   	rc = pthread_create(&t_new, NULL, socket_listener, &v2_socket);
>   	if (rc != 0) {
>   		TMTY_LOG(ERR, "Error with create socket thread: %s\n",
> 


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode
  2021-09-15 14:10 [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts Bruce Richardson
                   ` (7 preceding siblings ...)
  2021-10-12 16:39 ` [dpdk-dev] [PATCH v8 0/4] improve telemetry support with in-memory mode Bruce Richardson
@ 2021-10-14 10:49 ` Bruce Richardson
  2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 1/4] eal: limit telemetry to primary processes Bruce Richardson
                     ` (4 more replies)
  8 siblings, 5 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-14 10:49 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson

This patchset cleans up telemetry support for "in-memory" mode, so that
multiple independent processes can be run using that mode and still have
telemetry support. It also removes problems of one process removing the
socket of another - which was the original issue reported. The main changes
in this set are to:

* disable telemetry for secondary processes, which prevents any socket
  conflicts in multi-process cases.
* when multiple processes are run using the same runtime directory (i.e.
  "in-memory" mode or similar), add a counter suffix to the socket names to
  avoid conflicts over the socket. Each process will use the lowest available
  suffix, with the first process using the directory, not adding any suffix.
* update the telemetry script and documentation to allow it to connect to
  in-memory DPDK processes.

---
V9: sort output lines in help text in script

V8: Merged patches 2 & 3 of the set together. Fixed some checkpatch warnings
flagged by the CI.

V7: Change from adding a pid suffix generally in "in-memory" mode, to adding an
increasing counter as a suffix in case of name conflicts generally. This
achieves the same result in terms of connectivity, but keeps compatibility of
behaviour for the case of a single in-memory process, while also providing
predictable more socket names for each process i.e. 4 running in-memory
instances they will always use suffixes 1-3 for the extra 3 sockets, even across
restarts.

V6: fixed issue whereby the failing of the legacy telemetry init would roll-back
init of the v2 telemetry, causing the socket to be deleted, even though it was
still necessary.

V5: Rebase on latest main after other script cleanups were merged

V4: Move from simple-fix patch to proper fix patchset

V3: Drop CC stable, as will have separate backport patch which does not
error out, so avoiding causing problems with currently running application

V2: fix build error on FreeBSD

Bruce Richardson (4):
  eal: limit telemetry to primary processes
  telemetry: fix socket path conflicts for in-memory mode
  usertools/dpdk-telemetry: connect to separate instances
  usertools/dpdk-telemetry: provide info on available sockets

 doc/guides/howto/telemetry.rst | 41 +++++++++++++++++++++
 lib/eal/freebsd/eal.c          |  2 +-
 lib/eal/linux/eal.c            |  2 +-
 lib/telemetry/telemetry.c      | 65 +++++++++++++++++++++++++---------
 usertools/dpdk-telemetry.py    | 45 ++++++++++++++++++++---
 5 files changed, 133 insertions(+), 22 deletions(-)

--
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v9 1/4] eal: limit telemetry to primary processes
  2021-10-14 10:49 ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode Bruce Richardson
@ 2021-10-14 10:49   ` Bruce Richardson
  2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 2/4] telemetry: fix socket path conflicts for in-memory mode Bruce Richardson
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-14 10:49 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson, Conor Walsh

Telemetry interface should be exposed for primary processes only, since
secondary processes will conflict on socket creation, and since all
data in secondary process is generally available to primary. For
example, all device stats for ethdevs, cryptodevs, etc. will all be
common across processes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
---
 lib/eal/freebsd/eal.c | 2 +-
 lib/eal/linux/eal.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index fb734012a4..56a60f13e9 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -950,7 +950,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 3577eaeaa4..0d0fc66668 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1320,7 +1320,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot clear runtime directory");
 		return -1;
 	}
-	if (!internal_conf->no_telemetry) {
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY && !internal_conf->no_telemetry) {
 		int tlog = rte_log_register_type_and_pick_level(
 				"lib.telemetry", RTE_LOG_WARNING);
 		if (tlog < 0)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v9 2/4] telemetry: fix socket path conflicts for in-memory mode
  2021-10-14 10:49 ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 1/4] eal: limit telemetry to primary processes Bruce Richardson
@ 2021-10-14 10:49   ` Bruce Richardson
  2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 3/4] usertools/dpdk-telemetry: connect to separate instances Bruce Richardson
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-14 10:49 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson, Conor Walsh

When running using in-memory mode, multiple processes can use the same
runtime dir, leading to conflicts with the telemetry sockets in that
directory. We can resolve this by appending a suffix to each socket
beyond the first, with the suffix being an increasing counter value.
Each process uses the first unused socket counter value.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
---
 lib/telemetry/telemetry.c | 65 +++++++++++++++++++++++++++++----------
 1 file changed, 49 insertions(+), 16 deletions(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 48f4c7ba46..a7483167d4 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -457,28 +457,45 @@ create_socket(char *path)
 
 	struct sockaddr_un sun = {.sun_family = AF_UNIX};
 	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-	unlink(sun.sun_path);
+	TMTY_LOG(DEBUG, "Attempting socket bind to path '%s'\n", path);
+
 	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 		struct stat st;
 
-		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+		TMTY_LOG(DEBUG, "Initial bind to socket '%s' failed.\n", path);
+
+		/* first check if we have a runtime dir */
+		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-		sun.sun_path[0] = 0;
-		goto error;
+			close(sock);
+			return -ENOENT;
+		}
+
+		/* check if current socket is active */
+		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
+			close(sock);
+			return -EADDRINUSE;
+		}
+
+		/* socket is not active, delete and attempt rebind */
+		TMTY_LOG(DEBUG, "Attempting unlink and retrying bind\n");
+		unlink(sun.sun_path);
+		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+			close(sock);
+			return -errno; /* if unlink failed, this will be -EADDRINUSE as above */
+		}
 	}
 
 	if (listen(sock, 1) < 0) {
 		TMTY_LOG(ERR, "Error calling listen for socket: %s\n", strerror(errno));
-		goto error;
+		unlink(sun.sun_path);
+		close(sock);
+		return -errno;
 	}
+	TMTY_LOG(DEBUG, "Socket creation and binding ok\n");
 
 	return sock;
-
-error:
-	close(sock);
-	unlink_sockets();
-	return -1;
 }
 
 static void
@@ -511,8 +528,10 @@ telemetry_legacy_init(void)
 		return -1;
 	}
 	v1_socket.sock = create_socket(v1_socket.path);
-	if (v1_socket.sock < 0)
+	if (v1_socket.sock < 0) {
+		v1_socket.path[0] = '\0';
 		return -1;
+	}
 	rc = pthread_create(&t_old, NULL, socket_listener, &v1_socket);
 	if (rc != 0) {
 		TMTY_LOG(ERR, "Error with create legcay socket thread: %s\n",
@@ -533,7 +552,9 @@ telemetry_legacy_init(void)
 static int
 telemetry_v2_init(void)
 {
+	char spath[sizeof(v2_socket.path)];
 	pthread_t t_new;
+	short suffix = 0;
 	int rc;
 
 	v2_socket.num_clients = &v2_clients;
@@ -544,15 +565,27 @@ telemetry_v2_init(void)
 	rte_telemetry_register_cmd("/help", command_help,
 			"Returns help text for a command. Parameters: string command");
 	v2_socket.fn = client_handler;
-	if (strlcpy(v2_socket.path, get_socket_path(socket_dir, 2),
-			sizeof(v2_socket.path)) >= sizeof(v2_socket.path)) {
+	if (strlcpy(spath, get_socket_path(socket_dir, 2), sizeof(spath)) >= sizeof(spath)) {
 		TMTY_LOG(ERR, "Error with socket binding, path too long\n");
 		return -1;
 	}
+	memcpy(v2_socket.path, spath, sizeof(v2_socket.path));
 
 	v2_socket.sock = create_socket(v2_socket.path);
-	if (v2_socket.sock < 0)
-		return -1;
+	while (v2_socket.sock < 0) {
+		/* bail out on unexpected error, or suffix wrap-around */
+		if (v2_socket.sock != -EADDRINUSE || suffix < 0) {
+			v2_socket.path[0] = '\0'; /* clear socket path */
+			return -1;
+		}
+		/* add a suffix to the path if the basic version fails */
+		if (snprintf(v2_socket.path, sizeof(v2_socket.path), "%s:%d",
+				spath, ++suffix) >= (int)sizeof(v2_socket.path)) {
+			TMTY_LOG(ERR, "Error with socket binding, path too long\n");
+			return -1;
+		}
+		v2_socket.sock = create_socket(v2_socket.path);
+	}
 	rc = pthread_create(&t_new, NULL, socket_listener, &v2_socket);
 	if (rc != 0) {
 		TMTY_LOG(ERR, "Error with create socket thread: %s\n",
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v9 3/4] usertools/dpdk-telemetry: connect to separate instances
  2021-10-14 10:49 ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode Bruce Richardson
  2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 1/4] eal: limit telemetry to primary processes Bruce Richardson
  2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 2/4] telemetry: fix socket path conflicts for in-memory mode Bruce Richardson
@ 2021-10-14 10:49   ` Bruce Richardson
  2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 4/4] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
  2021-10-14 19:00   ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode David Marchand
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-14 10:49 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson, Conor Walsh

For processes run using "in-memory" mode sharing the same runtime dir,
we add support for connecting to the separate instance sockets created
using ":1", ":2" etc. via new "-i" or "--instance" argument. Add details
on connecting to separate instances to the telemetry howto document.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
---
 doc/guides/howto/telemetry.rst | 41 ++++++++++++++++++++++++++++++++++
 usertools/dpdk-telemetry.py    |  7 +++++-
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/doc/guides/howto/telemetry.rst b/doc/guides/howto/telemetry.rst
index 8f4fa1a510..e4edb53fa4 100644
--- a/doc/guides/howto/telemetry.rst
+++ b/doc/guides/howto/telemetry.rst
@@ -87,3 +87,44 @@ and query information using the telemetry client python script.
        --> /help,/ethdev/xstats
        {"/help": {"/ethdev/xstats": "Returns the extended stats for a port.
        Parameters: int port_id"}}
+
+
+Connecting to Different DPDK Processes
+--------------------------------------
+
+When multiple DPDK process instances are running on a system, the user will
+naturally wish to be able to select the instance to which the connection is
+being made. The method to select the instance depends on how the individual
+instances are run:
+
+* For DPDK processes run using a non-default file-prefix,
+  i.e. using the `--file-prefix` EAL option flag,
+  the file-prefix for the process should be passed via the `-f` or `--file-prefix` script flag.
+
+  For example, to connect to testpmd run as::
+
+     $ ./build/app/dpdk-testpmd -l 2,3 --file-prefix="tpmd"
+
+  One would use the telemetry script command::
+
+     $ ./usertools/dpdk-telemetry -f "tpmd"
+
+* For the case where multiple processes are run using the `--in-memory` EAL flag,
+  but no `-file-prefix` flag, or the same `-file-prefix` flag,
+  those processes will all share the same runtime directory.
+  In this case,
+  each process after the first will add an increasing count suffix to the telemetry socket name,
+  with each one taking the first available free socket name.
+  This suffix count can be passed to the telemetry script using the `-i` or `--instance` flag.
+
+  For example, if the following two applications are run in separate terminals::
+
+     $ ./build/app/dpdk-testpmd -l 2,3 --in-memory    # will use socket "dpdk_telemetry.v2"
+
+     $ ./build/app/test/dpdk-test -l 4,5 --in-memory  # will use "dpdk_telemetry.v2:1"
+
+  The following telemetry script commands would allow one to connect to each binary::
+
+     $ ./usertools/dpdk-telemetry.py       # will connect to testpmd
+
+     $ ./usertools/dpdk-telemetry.py -i 1  # will connect to test binary
diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index 2974a64732..ce27548c3e 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -112,6 +112,11 @@ def get_dpdk_runtime_dir(fp):
 parser = argparse.ArgumentParser()
 parser.add_argument('-f', '--file-prefix', default='rte',
                     help='Provide file-prefix for DPDK runtime directory')
+parser.add_argument('-i', '--instance', default='0', type=int,
+                    help='Provide file-prefix for DPDK runtime directory')
 args = parser.parse_args()
 rd = get_dpdk_runtime_dir(args.file_prefix)
-handle_socket(os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)))
+sock_path = os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+if args.instance > 0:
+    sock_path += ":{}".format(args.instance)
+handle_socket(sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [dpdk-dev] [PATCH v9 4/4] usertools/dpdk-telemetry: provide info on available sockets
  2021-10-14 10:49 ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode Bruce Richardson
                     ` (2 preceding siblings ...)
  2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 3/4] usertools/dpdk-telemetry: connect to separate instances Bruce Richardson
@ 2021-10-14 10:49   ` Bruce Richardson
  2021-10-14 19:00   ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode David Marchand
  4 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-14 10:49 UTC (permalink / raw)
  To: dev
  Cc: Ciara Power, David Marchand, Anatoly Burakov, Kevin Traynor,
	Bruce Richardson, Conor Walsh

When a user runs the dpdk-telemetry script and fails to connect because
the socket path does not exist, run a scan for possible sockets that
could be connected to and inform the user of the command needed to
connect to those.

For example:

  $ ./dpdk-telemetry.py -i4
  Connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2:4
  Error connecting to /run/user/1000/dpdk/rte/dpdk_telemetry.v2:4

  Other DPDK telemetry sockets found:
  - dpdk_telemetry.v2  # Connect with './dpdk-telemetry.py'
  - dpdk_telemetry.v2:2  # Connect with './dpdk-telemetry.py -i 2'
  - dpdk_telemetry.v2:1  # Connect with './dpdk-telemetry.py -i 1'

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
---
 usertools/dpdk-telemetry.py | 42 ++++++++++++++++++++++++++++++++-----
 1 file changed, 37 insertions(+), 5 deletions(-)

diff --git a/usertools/dpdk-telemetry.py b/usertools/dpdk-telemetry.py
index ce27548c3e..8f7d59d139 100755
--- a/usertools/dpdk-telemetry.py
+++ b/usertools/dpdk-telemetry.py
@@ -10,6 +10,7 @@
 import socket
 import os
 import sys
+import glob
 import json
 import errno
 import readline
@@ -17,6 +18,8 @@
 
 # global vars
 TELEMETRY_VERSION = "v2"
+SOCKET_NAME = 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)
+DEFAULT_PREFIX = 'rte'
 CMDS = []
 
 
@@ -48,7 +51,28 @@ def get_app_name(pid):
     return None
 
 
-def handle_socket(path):
+def find_sockets(path):
+    """ Find any possible sockets to connect to and return them """
+    return glob.glob(os.path.join(path, SOCKET_NAME + '*'))
+
+
+def print_socket_options(prefix, paths):
+    """ Given a set of socket paths, give the commands needed to connect """
+    cmd = sys.argv[0]
+    if prefix != DEFAULT_PREFIX:
+        cmd += " -f " + prefix
+    for s in sorted(paths):
+        sock_name = os.path.basename(s)
+        if sock_name.endswith(TELEMETRY_VERSION):
+            print("- {}  # Connect with '{}'".format(os.path.basename(s),
+                                                     cmd))
+        else:
+            print("- {}  # Connect with '{} -i {}'".format(os.path.basename(s),
+                                                           cmd,
+                                                           s.split(':')[-1]))
+
+
+def handle_socket(args, path):
     """ Connect to socket and handle user input """
     prompt = ''  # this evaluates to false in conditions
     sock = socket.socket(socket.AF_UNIX, socket.SOCK_SEQPACKET)
@@ -62,6 +86,15 @@ def handle_socket(path):
     except OSError:
         print("Error connecting to " + path)
         sock.close()
+        # if socket exists but is bad, or if non-interactive just return
+        if os.path.exists(path) or not prompt:
+            return
+        # if user didn't give a valid socket path, but there are
+        # some sockets, help the user out by printing how to connect
+        socks = find_sockets(os.path.dirname(path))
+        if socks:
+            print("\nOther DPDK telemetry sockets found:")
+            print_socket_options(args.file_prefix, socks)
         return
     json_reply = read_socket(sock, 1024, prompt)
     output_buf_len = json_reply["max_output_len"]
@@ -110,13 +143,12 @@ def get_dpdk_runtime_dir(fp):
 readline.set_completer_delims(readline.get_completer_delims().replace('/', ''))
 
 parser = argparse.ArgumentParser()
-parser.add_argument('-f', '--file-prefix', default='rte',
+parser.add_argument('-f', '--file-prefix', default=DEFAULT_PREFIX,
                     help='Provide file-prefix for DPDK runtime directory')
 parser.add_argument('-i', '--instance', default='0', type=int,
                     help='Provide file-prefix for DPDK runtime directory')
 args = parser.parse_args()
-rd = get_dpdk_runtime_dir(args.file_prefix)
-sock_path = os.path.join(rd, 'dpdk_telemetry.{}'.format(TELEMETRY_VERSION))
+sock_path = os.path.join(get_dpdk_runtime_dir(args.file_prefix), SOCKET_NAME)
 if args.instance > 0:
     sock_path += ":{}".format(args.instance)
-handle_socket(sock_path)
+handle_socket(args, sock_path)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode
  2021-10-14 10:49 ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode Bruce Richardson
                     ` (3 preceding siblings ...)
  2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 4/4] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
@ 2021-10-14 19:00   ` David Marchand
  2021-10-15  8:18     ` Bruce Richardson
  4 siblings, 1 reply; 68+ messages in thread
From: David Marchand @ 2021-10-14 19:00 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Ciara Power, Anatoly Burakov, Kevin Traynor

On Thu, Oct 14, 2021 at 12:50 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> This patchset cleans up telemetry support for "in-memory" mode, so that
> multiple independent processes can be run using that mode and still have
> telemetry support. It also removes problems of one process removing the
> socket of another - which was the original issue reported. The main changes
> in this set are to:
>
> * disable telemetry for secondary processes, which prevents any socket
>   conflicts in multi-process cases.
> * when multiple processes are run using the same runtime directory (i.e.
>   "in-memory" mode or similar), add a counter suffix to the socket names to
>   avoid conflicts over the socket. Each process will use the lowest available
>   suffix, with the first process using the directory, not adding any suffix.
> * update the telemetry script and documentation to allow it to connect to
>   in-memory DPDK processes.
>

Thanks a lot for working on this.
Reading the updated doc, I fixed some -file-prefix to --file-prefix in doc.

Series applied, thanks.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode
  2021-10-14 19:00   ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode David Marchand
@ 2021-10-15  8:18     ` Bruce Richardson
  0 siblings, 0 replies; 68+ messages in thread
From: Bruce Richardson @ 2021-10-15  8:18 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, Ciara Power, Anatoly Burakov, Kevin Traynor

On Thu, Oct 14, 2021 at 09:00:09PM +0200, David Marchand wrote:
> On Thu, Oct 14, 2021 at 12:50 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > This patchset cleans up telemetry support for "in-memory" mode, so that
> > multiple independent processes can be run using that mode and still have
> > telemetry support. It also removes problems of one process removing the
> > socket of another - which was the original issue reported. The main changes
> > in this set are to:
> >
> > * disable telemetry for secondary processes, which prevents any socket
> >   conflicts in multi-process cases.
> > * when multiple processes are run using the same runtime directory (i.e.
> >   "in-memory" mode or similar), add a counter suffix to the socket names to
> >   avoid conflicts over the socket. Each process will use the lowest available
> >   suffix, with the first process using the directory, not adding any suffix.
> > * update the telemetry script and documentation to allow it to connect to
> >   in-memory DPDK processes.
> >
> 
> Thanks a lot for working on this.
> Reading the updated doc, I fixed some -file-prefix to --file-prefix in doc.
> 
Thanks for the fixups.

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2021-10-15  8:19 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-15 14:10 [dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts Bruce Richardson
2021-09-22  8:43 ` Power, Ciara
2021-09-24 16:18 ` [dpdk-dev] [PATCH v2] " Bruce Richardson
2021-09-29  8:50   ` Power, Ciara
2021-09-29 12:28   ` Kevin Traynor
2021-09-29 13:32     ` Bruce Richardson
2021-09-29 13:51       ` Bruce Richardson
2021-09-29 14:54       ` Kevin Traynor
2021-09-29 15:24         ` Bruce Richardson
2021-09-29 15:31           ` Bruce Richardson
2021-09-29 16:01             ` Kevin Traynor
2021-09-29 13:54 ` [dpdk-dev] [PATCH v3] " Bruce Richardson
2021-10-05 11:47   ` Ferruh Yigit
2021-10-01 11:15 ` [dpdk-dev] [PATCH v4 0/5] improve telemetry support with in-memory mode Bruce Richardson
2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 1/5] eal: limit telemetry to primary processes Bruce Richardson
2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 2/5] telemetry: fix deletion of active sockets Bruce Richardson
2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 4/5] usertools/dpdk-telemetry: connect to in-memory processes Bruce Richardson
2021-10-01 11:15   ` [dpdk-dev] [PATCH v4 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
2021-10-01 16:22 ` [dpdk-dev] [PATCH v5 0/5] improve telemetry support with in-memory mode Bruce Richardson
2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 1/5] eal: limit telemetry to primary processes Bruce Richardson
2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 2/5] telemetry: fix deletion of active sockets Bruce Richardson
2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 4/5] usertools/dpdk-telemetry: connect to in-memory processes Bruce Richardson
2021-10-01 16:22   ` [dpdk-dev] [PATCH v5 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
2021-10-05 13:59 ` [dpdk-dev] [PATCH v6 0/5] improve telemetry support with in-memory mode Bruce Richardson
2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 1/5] eal: limit telemetry to primary processes Bruce Richardson
2021-10-07 13:11     ` Power, Ciara
2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 2/5] telemetry: fix deletion of active sockets Bruce Richardson
2021-10-05 15:18     ` Conor Walsh
2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
2021-10-05 14:41     ` Kevin Traynor
2021-10-05 14:52       ` Bruce Richardson
2021-10-05 15:14         ` Kevin Traynor
2021-10-07 13:39           ` Power, Ciara
2021-10-05 15:19         ` Conor Walsh
2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 4/5] usertools/dpdk-telemetry: connect to in-memory processes Bruce Richardson
2021-10-05 15:19     ` Conor Walsh
2021-10-05 13:59   ` [dpdk-dev] [PATCH v6 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
2021-10-05 15:19     ` Conor Walsh
2021-10-08 17:18 ` [dpdk-dev] [PATCH v7 0/5] improve telemetry support with in-memory mode Bruce Richardson
2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 1/5] eal: limit telemetry to primary processes Bruce Richardson
2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 2/5] telemetry: fix deletion of active sockets Bruce Richardson
2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 3/5] telemetry: use unique socket paths for in-memory mode Bruce Richardson
2021-10-12 15:37     ` Power, Ciara
2021-10-12 15:40       ` Bruce Richardson
2021-10-12 15:47         ` Power, Ciara
2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 4/5] usertools/dpdk-telemetry: connect to separate instances Bruce Richardson
2021-10-12 15:37     ` Power, Ciara
2021-10-08 17:18   ` [dpdk-dev] [PATCH v7 5/5] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
2021-10-12 15:37     ` Power, Ciara
2021-10-12 16:39 ` [dpdk-dev] [PATCH v8 0/4] improve telemetry support with in-memory mode Bruce Richardson
2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 1/4] eal: limit telemetry to primary processes Bruce Richardson
2021-10-13 13:15     ` Walsh, Conor
2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 2/4] telemetry: fix socket path conflicts for in-memory mode Bruce Richardson
2021-10-13 13:15     ` Walsh, Conor
2021-10-14  9:40     ` Kevin Traynor
2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 3/4] usertools/dpdk-telemetry: connect to separate instances Bruce Richardson
2021-10-13 13:15     ` Walsh, Conor
2021-10-12 16:39   ` [dpdk-dev] [PATCH v8 4/4] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
2021-10-13 13:15     ` Walsh, Conor
2021-10-14 10:49 ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode Bruce Richardson
2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 1/4] eal: limit telemetry to primary processes Bruce Richardson
2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 2/4] telemetry: fix socket path conflicts for in-memory mode Bruce Richardson
2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 3/4] usertools/dpdk-telemetry: connect to separate instances Bruce Richardson
2021-10-14 10:49   ` [dpdk-dev] [PATCH v9 4/4] usertools/dpdk-telemetry: provide info on available sockets Bruce Richardson
2021-10-14 19:00   ` [dpdk-dev] [PATCH v9 0/4] improve telemetry support with in-memory mode David Marchand
2021-10-15  8:18     ` Bruce Richardson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).