* [PATCH] telemetry: fix autotest failures on Alpine
@ 2023-03-10 18:18 Bruce Richardson
2023-03-10 19:08 ` Stephen Hemminger
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Bruce Richardson @ 2023-03-10 18:18 UTC (permalink / raw)
To: dev; +Cc: Bruce Richardson, stable, Ciara Power, Declan Doherty, Radu Nicolau
On Alpine linux, the telemetry_data_autotest was failing for the
test where we had dictionaries embedded in other dictionaries up
to three levels deep. Indications are that this issue is due to
excess data being stored on the stack, so replace stack-allocated
buffer data with dynamically allocated data in the case where we
are doing recursive processing of telemetry data structures into
json.
Bugzilla ID: 1177
Fixes: c933bb5177ca ("telemetry: support array values in data object")
Fixes: d2671e642a8e ("telemetry: support dict of dicts")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
lib/telemetry/telemetry.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 7bceadcee7..34d371ab8a 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -208,7 +208,9 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len)
break;
case RTE_TEL_CONTAINER:
{
- char temp[buf_len];
+ char *temp = malloc(buf_len);
+ if (temp == NULL)
+ break;
const struct container *cont =
&v->value.container;
if (container_to_json(cont->data,
@@ -219,6 +221,7 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len)
v->name, temp);
if (!cont->keep)
rte_tel_data_free(cont->data);
+ free(temp);
break;
}
}
@@ -275,7 +278,9 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
break;
case RTE_TEL_CONTAINER:
{
- char temp[buf_len];
+ char *temp = malloc(buf_len);
+ if (temp == NULL)
+ break;
const struct container *cont =
&v->value.container;
if (container_to_json(cont->data,
@@ -286,6 +291,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
v->name, temp);
if (!cont->keep)
rte_tel_data_free(cont->data);
+ free(temp);
}
}
}
@@ -311,7 +317,9 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
buf_len, used,
d->data.array[i].uval);
else if (d->type == TEL_ARRAY_CONTAINER) {
- char temp[buf_len];
+ char *temp = malloc(buf_len);
+ if (temp == NULL)
+ break;
const struct container *rec_data =
&d->data.array[i].container;
if (container_to_json(rec_data->data,
@@ -321,6 +329,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
buf_len, used, temp);
if (!rec_data->keep)
rte_tel_data_free(rec_data->data);
+ free(temp);
}
break;
}
--
2.37.2
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] telemetry: fix autotest failures on Alpine
2023-03-10 18:18 [PATCH] telemetry: fix autotest failures on Alpine Bruce Richardson
@ 2023-03-10 19:08 ` Stephen Hemminger
2023-03-13 9:38 ` Bruce Richardson
[not found] ` <20230405154414.183915-1-bruce.richardson@intel.com>
[not found] ` <20230405160326.186921-1-bruce.richardson@intel.com>
2 siblings, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2023-03-10 19:08 UTC (permalink / raw)
To: Bruce Richardson; +Cc: dev, stable, Ciara Power, Declan Doherty, Radu Nicolau
On Fri, 10 Mar 2023 18:18:36 +0000
Bruce Richardson <bruce.richardson@intel.com> wrote:
> On Alpine linux, the telemetry_data_autotest was failing for the
> test where we had dictionaries embedded in other dictionaries up
> to three levels deep. Indications are that this issue is due to
> excess data being stored on the stack, so replace stack-allocated
> buffer data with dynamically allocated data in the case where we
> are doing recursive processing of telemetry data structures into
> json.
>
> Bugzilla ID: 1177
> Fixes: c933bb5177ca ("telemetry: support array values in data object")
> Fixes: d2671e642a8e ("telemetry: support dict of dicts")
> Cc: stable@dpdk.org
Looking at the telemetry code:
- why so many temporary buffers, could this be streamed or redesigned
so that an allocated buffer is returned.
- why is rte_tel_json_XXX all inline? These should just be internal
functions and not in a .h file.
FYI - if this library reused existing json writer it would have
been much simpler.
https://github.com/shemminger/iproute2/blob/main/lib/json_writer.c
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] telemetry: fix autotest failures on Alpine
2023-03-10 19:08 ` Stephen Hemminger
@ 2023-03-13 9:38 ` Bruce Richardson
0 siblings, 0 replies; 8+ messages in thread
From: Bruce Richardson @ 2023-03-13 9:38 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, stable, Ciara Power, Declan Doherty, Radu Nicolau
On Fri, Mar 10, 2023 at 11:08:32AM -0800, Stephen Hemminger wrote:
> On Fri, 10 Mar 2023 18:18:36 +0000
> Bruce Richardson <bruce.richardson@intel.com> wrote:
>
> > On Alpine linux, the telemetry_data_autotest was failing for the
> > test where we had dictionaries embedded in other dictionaries up
> > to three levels deep. Indications are that this issue is due to
> > excess data being stored on the stack, so replace stack-allocated
> > buffer data with dynamically allocated data in the case where we
> > are doing recursive processing of telemetry data structures into
> > json.
> >
> > Bugzilla ID: 1177
> > Fixes: c933bb5177ca ("telemetry: support array values in data object")
> > Fixes: d2671e642a8e ("telemetry: support dict of dicts")
> > Cc: stable@dpdk.org
>
> Looking at the telemetry code:
>
> - why so many temporary buffers, could this be streamed or redesigned
> so that an allocated buffer is returned.
>
This is largely what the fix patch does. The temporary buffers are used to
ensure we never end up with a partially written buffer from any truncated
snprintf, since that could cause us to emit invalid json. I think as a
scheme it works well enough for what it was designed for, but this
particular test case has more levels of recursion than was expected, so the
limit of the design are showing. The downside of using memory allocation
using malloc as here, is just more failure cases.
> - why is rte_tel_json_XXX all inline? These should just be internal
> functions and not in a .h file.
>
Sure. Since these are all just internal functions used by the library, I'm
not sure it really matters.
> FYI - if this library reused existing json writer it would have
> been much simpler.
> https://github.com/shemminger/iproute2/blob/main/lib/json_writer.c
Yep. Looked at that previously, but it's at a lower level than what we
have. IMHO, the complicated bit of producing json output is not the
formatting characters for each data-type but ensuring correct termination
of the json string even in case of errors, so each function in our
telemetry json code always includes correct terminators at all points, i.e.
we have no separate functions to be called for terminating objects, arrays
etc. - they are always included in the output of the items. This is why so
many temporary buffers are used - the input string to a function is
well-formed json, and we only actually append to that buffer by copying
from the temporary buffer once we are sure that the output will also be
similarly well-formed.
That said, if you want to replace the current json implementation with a
better one, I'm not opposed to it. [though I'd definitely rather no
external dependencies which would mean we could no longer rely on it always
being available in default builds]
/Bruce
^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <20230405154414.183915-1-bruce.richardson@intel.com>]
* [PATCH v2 1/5] telemetry: fix autotest failures on Alpine
[not found] ` <20230405154414.183915-1-bruce.richardson@intel.com>
@ 2023-04-05 15:44 ` Bruce Richardson
2023-04-07 19:21 ` Tyler Retzlaff
0 siblings, 1 reply; 8+ messages in thread
From: Bruce Richardson @ 2023-04-05 15:44 UTC (permalink / raw)
To: dev; +Cc: ciara.power, roretzla, Bruce Richardson, stable
On Alpine linux, the telemetry_data_autotest was failing for the
test where we had dictionaries embedded in other dictionaries up
to three levels deep. Indications are that this issue is due to
excess data being stored on the stack, so replace stack-allocated
buffer data with dynamically allocated data in the case where we
are doing recursive processing of telemetry data structures into
json.
Bugzilla ID: 1177
Fixes: c933bb5177ca ("telemetry: support array values in data object")
Fixes: d2671e642a8e ("telemetry: support dict of dicts")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
V2:
set '\0' in newly malloc'ed buffer to ensure it always has valid
string data.
---
lib/telemetry/telemetry.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 7bceadcee7..728a0bffd4 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -208,7 +208,11 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len)
break;
case RTE_TEL_CONTAINER:
{
- char temp[buf_len];
+ char *temp = malloc(buf_len);
+ if (temp == NULL)
+ break;
+ *temp = '\0'; /* ensure valid string */
+
const struct container *cont =
&v->value.container;
if (container_to_json(cont->data,
@@ -219,6 +223,7 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len)
v->name, temp);
if (!cont->keep)
rte_tel_data_free(cont->data);
+ free(temp);
break;
}
}
@@ -275,7 +280,11 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
break;
case RTE_TEL_CONTAINER:
{
- char temp[buf_len];
+ char *temp = malloc(buf_len);
+ if (temp == NULL)
+ break;
+ *temp = '\0'; /* ensure valid string */
+
const struct container *cont =
&v->value.container;
if (container_to_json(cont->data,
@@ -286,6 +295,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
v->name, temp);
if (!cont->keep)
rte_tel_data_free(cont->data);
+ free(temp);
}
}
}
@@ -311,7 +321,11 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
buf_len, used,
d->data.array[i].uval);
else if (d->type == TEL_ARRAY_CONTAINER) {
- char temp[buf_len];
+ char *temp = malloc(buf_len);
+ if (temp == NULL)
+ break;
+ *temp = '\0'; /* ensure valid string */
+
const struct container *rec_data =
&d->data.array[i].container;
if (container_to_json(rec_data->data,
@@ -321,6 +335,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
buf_len, used, temp);
if (!rec_data->keep)
rte_tel_data_free(rec_data->data);
+ free(temp);
}
break;
}
--
2.37.2
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/5] telemetry: fix autotest failures on Alpine
2023-04-05 15:44 ` [PATCH v2 1/5] " Bruce Richardson
@ 2023-04-07 19:21 ` Tyler Retzlaff
2023-04-11 8:43 ` Bruce Richardson
0 siblings, 1 reply; 8+ messages in thread
From: Tyler Retzlaff @ 2023-04-07 19:21 UTC (permalink / raw)
To: Bruce Richardson; +Cc: dev, ciara.power, stable
On Wed, Apr 05, 2023 at 04:44:10PM +0100, Bruce Richardson wrote:
> On Alpine linux, the telemetry_data_autotest was failing for the
> test where we had dictionaries embedded in other dictionaries up
> to three levels deep. Indications are that this issue is due to
> excess data being stored on the stack, so replace stack-allocated
> buffer data with dynamically allocated data in the case where we
> are doing recursive processing of telemetry data structures into
> json.
>
> Bugzilla ID: 1177
> Fixes: c933bb5177ca ("telemetry: support array values in data object")
> Fixes: d2671e642a8e ("telemetry: support dict of dicts")
> Cc: stable@dpdk.org
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>
> ---
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
(one observation below)
> V2:
> set '\0' in newly malloc'ed buffer to ensure it always has valid
> string data.
> ---
> lib/telemetry/telemetry.c | 21 ++++++++++++++++++---
> 1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
> index 7bceadcee7..728a0bffd4 100644
> --- a/lib/telemetry/telemetry.c
> +++ b/lib/telemetry/telemetry.c
> @@ -208,7 +208,11 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len)
> break;
> case RTE_TEL_CONTAINER:
> {
> - char temp[buf_len];
> + char *temp = malloc(buf_len);
> + if (temp == NULL)
> + break;
> + *temp = '\0'; /* ensure valid string */
> +
> const struct container *cont =
> &v->value.container;
> if (container_to_json(cont->data,
> @@ -219,6 +223,7 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len)
> v->name, temp);
> if (!cont->keep)
> rte_tel_data_free(cont->data);
> + free(temp);
> break;
> }
> }
> @@ -275,7 +280,11 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
> break;
> case RTE_TEL_CONTAINER:
> {
> - char temp[buf_len];
> + char *temp = malloc(buf_len);
> + if (temp == NULL)
> + break;
> + *temp = '\0'; /* ensure valid string */
> +
> const struct container *cont =
> &v->value.container;
> if (container_to_json(cont->data,
> @@ -286,6 +295,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
> v->name, temp);
> if (!cont->keep)
> rte_tel_data_free(cont->data);
> + free(temp);
not expressing a preference just noticing that when
RTE_TEL_CONTAINER cases are the last case in the switch sometimes there
is an explicit break; and sometimes not.
> }
> }
> }
> @@ -311,7 +321,11 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
> buf_len, used,
> d->data.array[i].uval);
> else if (d->type == TEL_ARRAY_CONTAINER) {
> - char temp[buf_len];
> + char *temp = malloc(buf_len);
> + if (temp == NULL)
> + break;
> + *temp = '\0'; /* ensure valid string */
> +
> const struct container *rec_data =
> &d->data.array[i].container;
> if (container_to_json(rec_data->data,
> @@ -321,6 +335,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
> buf_len, used, temp);
> if (!rec_data->keep)
> rte_tel_data_free(rec_data->data);
> + free(temp);
> }
> break;
> }
> --
> 2.37.2
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/5] telemetry: fix autotest failures on Alpine
2023-04-07 19:21 ` Tyler Retzlaff
@ 2023-04-11 8:43 ` Bruce Richardson
0 siblings, 0 replies; 8+ messages in thread
From: Bruce Richardson @ 2023-04-11 8:43 UTC (permalink / raw)
To: Tyler Retzlaff; +Cc: dev, ciara.power, stable
On Fri, Apr 07, 2023 at 12:21:16PM -0700, Tyler Retzlaff wrote:
> On Wed, Apr 05, 2023 at 04:44:10PM +0100, Bruce Richardson wrote:
> > On Alpine linux, the telemetry_data_autotest was failing for the
> > test where we had dictionaries embedded in other dictionaries up
> > to three levels deep. Indications are that this issue is due to
> > excess data being stored on the stack, so replace stack-allocated
> > buffer data with dynamically allocated data in the case where we
> > are doing recursive processing of telemetry data structures into
> > json.
> >
> > Bugzilla ID: 1177
> > Fixes: c933bb5177ca ("telemetry: support array values in data object")
> > Fixes: d2671e642a8e ("telemetry: support dict of dicts")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> >
> > ---
> Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
>
> (one observation below)
>
> > V2:
> > set '\0' in newly malloc'ed buffer to ensure it always has valid
> > string data.
> > ---
<snip>
> > @@ -286,6 +295,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
> > v->name, temp);
> > if (!cont->keep)
> > rte_tel_data_free(cont->data);
> > + free(temp);
>
> not expressing a preference just noticing that when
> RTE_TEL_CONTAINER cases are the last case in the switch sometimes there
> is an explicit break; and sometimes not.
>
I won't do a new patch revision just for that, but if I end up doing one
for other reasons I'll try and remember to make it more consistent.
thanks,
/Bruce
^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <20230405160326.186921-1-bruce.richardson@intel.com>]
* [PATCH v3 1/5] telemetry: fix autotest failures on Alpine
[not found] ` <20230405160326.186921-1-bruce.richardson@intel.com>
@ 2023-04-05 16:03 ` Bruce Richardson
2023-04-07 19:22 ` Tyler Retzlaff
0 siblings, 1 reply; 8+ messages in thread
From: Bruce Richardson @ 2023-04-05 16:03 UTC (permalink / raw)
To: dev; +Cc: ciara.power, roretzla, Bruce Richardson, stable
On Alpine linux, the telemetry_data_autotest was failing for the
test where we had dictionaries embedded in other dictionaries up
to three levels deep. Indications are that this issue is due to
excess data being stored on the stack, so replace stack-allocated
buffer data with dynamically allocated data in the case where we
are doing recursive processing of telemetry data structures into
json.
Bugzilla ID: 1177
Fixes: c933bb5177ca ("telemetry: support array values in data object")
Fixes: d2671e642a8e ("telemetry: support dict of dicts")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
V2:
set '\0' in newly malloc'ed buffer to ensure it always has valid
string data.
---
lib/telemetry/telemetry.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 7bceadcee7..728a0bffd4 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -208,7 +208,11 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len)
break;
case RTE_TEL_CONTAINER:
{
- char temp[buf_len];
+ char *temp = malloc(buf_len);
+ if (temp == NULL)
+ break;
+ *temp = '\0'; /* ensure valid string */
+
const struct container *cont =
&v->value.container;
if (container_to_json(cont->data,
@@ -219,6 +223,7 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len)
v->name, temp);
if (!cont->keep)
rte_tel_data_free(cont->data);
+ free(temp);
break;
}
}
@@ -275,7 +280,11 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
break;
case RTE_TEL_CONTAINER:
{
- char temp[buf_len];
+ char *temp = malloc(buf_len);
+ if (temp == NULL)
+ break;
+ *temp = '\0'; /* ensure valid string */
+
const struct container *cont =
&v->value.container;
if (container_to_json(cont->data,
@@ -286,6 +295,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
v->name, temp);
if (!cont->keep)
rte_tel_data_free(cont->data);
+ free(temp);
}
}
}
@@ -311,7 +321,11 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
buf_len, used,
d->data.array[i].uval);
else if (d->type == TEL_ARRAY_CONTAINER) {
- char temp[buf_len];
+ char *temp = malloc(buf_len);
+ if (temp == NULL)
+ break;
+ *temp = '\0'; /* ensure valid string */
+
const struct container *rec_data =
&d->data.array[i].container;
if (container_to_json(rec_data->data,
@@ -321,6 +335,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
buf_len, used, temp);
if (!rec_data->keep)
rte_tel_data_free(rec_data->data);
+ free(temp);
}
break;
}
--
2.37.2
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3 1/5] telemetry: fix autotest failures on Alpine
2023-04-05 16:03 ` [PATCH v3 " Bruce Richardson
@ 2023-04-07 19:22 ` Tyler Retzlaff
0 siblings, 0 replies; 8+ messages in thread
From: Tyler Retzlaff @ 2023-04-07 19:22 UTC (permalink / raw)
To: Bruce Richardson; +Cc: dev, ciara.power, stable
On Wed, Apr 05, 2023 at 05:03:22PM +0100, Bruce Richardson wrote:
> On Alpine linux, the telemetry_data_autotest was failing for the
> test where we had dictionaries embedded in other dictionaries up
> to three levels deep. Indications are that this issue is due to
> excess data being stored on the stack, so replace stack-allocated
> buffer data with dynamically allocated data in the case where we
> are doing recursive processing of telemetry data structures into
> json.
>
> Bugzilla ID: 1177
> Fixes: c933bb5177ca ("telemetry: support array values in data object")
> Fixes: d2671e642a8e ("telemetry: support dict of dicts")
> Cc: stable@dpdk.org
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>
> ---
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
(ack on latest version)
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-04-11 8:44 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-10 18:18 [PATCH] telemetry: fix autotest failures on Alpine Bruce Richardson
2023-03-10 19:08 ` Stephen Hemminger
2023-03-13 9:38 ` Bruce Richardson
[not found] ` <20230405154414.183915-1-bruce.richardson@intel.com>
2023-04-05 15:44 ` [PATCH v2 1/5] " Bruce Richardson
2023-04-07 19:21 ` Tyler Retzlaff
2023-04-11 8:43 ` Bruce Richardson
[not found] ` <20230405160326.186921-1-bruce.richardson@intel.com>
2023-04-05 16:03 ` [PATCH v3 " Bruce Richardson
2023-04-07 19:22 ` Tyler Retzlaff
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).