From: Bruce Richardson <bruce.richardson@intel.com>
To: Tyler Retzlaff <roretzla@linux.microsoft.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>, <dev@dpdk.org>,
<ciara.power@intel.com>, <david.marchand@redhat.com>,
<thomas@monjalon.net>
Subject: Re: [PATCH 1/2] telemetry: use malloc instead of variable length array
Date: Tue, 4 Apr 2023 18:25:42 +0100
Message-ID: <ZCxdlsxiO67/QI3c@bricha3-MOBL.ger.corp.intel.com>
In-Reply-To: <20230404164446.GF18560@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
On Tue, Apr 04, 2023 at 09:44:46AM -0700, Tyler Retzlaff wrote:
> On Tue, Apr 04, 2023 at 05:28:29PM +0100, Bruce Richardson wrote:
> > On Tue, Apr 04, 2023 at 09:24:44AM -0700, Tyler Retzlaff wrote:
> > > On Tue, Apr 04, 2023 at 09:47:21AM +0100, Bruce Richardson wrote:
> > > > On Mon, Apr 03, 2023 at 01:19:12PM -0700, Stephen Hemminger wrote:
> > > > > On Mon, 3 Apr 2023 09:30:23 -0700
> > > > > Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
> > > > >
> > > > > > __json_snprintf(char *buf, const int len, const char *format, ...)
> > > > > > {
> > > > > > - char tmp[len];
> > > > > > + char *tmp = malloc(len);
> > > > > > va_list ap;
> > > > > > - int ret;
> > > > > > + int ret = 0;
> > > > > > +
> > > > > > + if (tmp == NULL)
> > > > > > + return ret;
> > > > > >
> > > > > > va_start(ap, format);
> > > > > > ret = vsnprintf(tmp, sizeof(tmp), format, ap);
> > > > > > va_end(ap);
> > > > > > if (ret > 0 && ret < (int)sizeof(tmp) && ret < len) {
> > > > > > strcpy(buf, tmp);
> > > > > > - return ret;
> > > > > > }
> > > > > > - return 0; /* nothing written or modified */
> > > > > > +
> > > > > > + free(tmp);
> > > > > > +
> > > > > > + return ret;
> > > > > > }
> > > > >
> > > > > Not sure why it needs a tmp buffer anyway?
> > > >
> > > > The temporary buffer is to ensure that in the case that the data doesn't
> > > > fit in the buffer, the buffer remains unmodified. The reason for this is
> > > > that when building up the json response we always have a valid json string.
> > >
> > > i guessed this, but you've now confirmed it. it makes sense in general
> > > that if the callee signals an error to the caller, the caller should
> > > not observe any side effects; relying on them means taking a dependency
> > > on what is, more often than not, an internal implementation detail.
> > >
> > > >
> > > > For example, suppose we are preparing a response with an array of two
> > > > strings. After the first string has been processed, the output buffer
> > > > contains: '["string1"]'. When json_snprintf is being called to add string2,
> > > > there are a couple of things to note:
> > > > * the text to be inserted will be put not at the end of the string, but
> > > > before the closing "]".
> > > > * the actual text to be inserted will be ',"string2"]', so ensuring that
> > > > the final buffer is valid.
> > > > However, the error case is problematic. While we can catch the case where
> > > > the string to be inserted overflows/has been truncated, doing a regular
> > > > snprintf means that our output buffer could contain invalid json, as our
> > > > end-terminator would have been overwritten, e.g. '["string1","string2'
> > > > To guarantee the output from telemetry is always valid json, even in case
> > > > of truncation, we use a temporary buffer to do the write initially, and if
> > > > it doesn't get truncated, we then copy that to the final buffer.
> > > >
> > > > That's the logic for this temporary buffer. Now, thinking about it
> > > > yesterday evening, there are other ways in which we can do this, which can
> > > > avoid this temporary buffer.
> > > > 1. We can do the initial snprintf to an empty buffer to get the length that
> > > > way. This will still be slower, as it means that we need to do printf
> > > > processing twice rather than using memcpy to copy the result. However, it's
> > > > probably less overhead than malloc and free.
> > > > 2. AFAIK, the normal case for this function being called is with a single
> > > > terminator at the end of the string. We can take advantage of that, by
> > > > checking if the '\0' is just one character into the string we are
> > > > printing, and, if so, storing that one character. If we have a snprintf
> > > > error leading to truncation, it then allows us to restore the original
> > > > string.
> > > >
> > > > My suggestion is to use a combination of these methods. In json_snprintf
> > > > check if the input buffer is empty or has only one character in it, and use
> > > > method #2 if so. If that's not the case, then fallback to method #1 and do
> > > > a double snprintf.
> > > >
> > > > Make sense? Any other suggestions?
> > >
> > > your suggestion seems okay to me, aside from that there's always using
> > > some fixed sized buffer but i'm guessing this being json it's difficult
> > > to choose a reasonable constant size for a stack allocated buffer.
> > >
> > Yes, choosing a reasonable size is very difficult. We could be snprintf-ing
> > a string containing a json-ized object a couple of KB long.
>
> haven't checked recently, but i wonder what our normal usermode stack
> frame size limit is, which is why alloca() would be scary.
>
> >
> > I think suggestion #2 above should cover most cases, in which case using
> > your original suggestion of malloc would be ok too for the rare case (if
> > ever) where we don't just have one terminator on the end.
>
> maybe a dumbed-down compromise is to have a fixed stack limit and then,
> if it is exceeded, always just go to malloc/free?
>
Perhaps. If you like, I can have a try at implementing my own suggestions
above tomorrow. I'd like it if we could get the "single-character-saving"
option working, because that would be the most efficient method of all.
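
For reference, a rough sketch of how the combination of the two methods
might look. It is not tested against the real telemetry code; the name
json_snprintf_sketch and the return convention (0 on failure, with the
string in buf put back as it was) are assumptions for illustration only.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical sketch, not the DPDK implementation.
 * Formats into buf (capacity len). Returns the number of characters
 * written, or 0 if the output is empty, fails or would not fit; on a 0
 * return the string held in buf is the same as on entry.
 */
static int
json_snprintf_sketch(char *buf, int len, const char *format, ...)
{
	va_list ap;
	int ret;

	if (buf == NULL || len <= 0)
		return 0;

	/* Method #2: buf holds at most one character before its '\0'. */
	if (buf[0] == '\0' || (len > 1 && buf[1] == '\0')) {
		char first = buf[0]; /* the only byte we need to remember */

		va_start(ap, format);
		ret = vsnprintf(buf, (size_t)len, format, ap);
		va_end(ap);
		if (ret > 0 && ret < len)
			return ret;

		/* Error or truncation: restore the original short string. */
		buf[0] = first;
		if (first != '\0')
			buf[1] = '\0';
		return 0;
	}

	/* Method #1: measure with a zero-length write, then print for real. */
	va_start(ap, format);
	ret = vsnprintf(NULL, 0, format, ap);
	va_end(ap);
	if (ret <= 0 || ret >= len)
		return 0; /* would not fit, buf has not been touched */

	va_start(ap, format);
	ret = vsnprintf(buf, (size_t)len, format, ap);
	va_end(ap);
	return ret;
}
```

With an 8-byte buffer holding just "]", a call that tries to print
',"string2"]' does not fit, so it returns 0 and the buffer still reads "]".
The fallback path costs a second pass through the printf machinery, but
needs neither a VLA nor malloc/free.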
/Bruce