From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id C837E428C6;
	Tue, 4 Apr 2023 19:34:03 +0200 (CEST)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 5567640EE3;
	Tue, 4 Apr 2023 19:34:03 +0200 (CEST)
Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182])
	by mails.dpdk.org (Postfix) with ESMTP id 2124640A7E
	for ; Tue, 4 Apr 2023 19:34:02 +0200 (CEST)
Received: by linux.microsoft.com (Postfix, from userid 1086)
	id 41DE2210DDAA; Tue, 4 Apr 2023 10:34:01 -0700 (PDT)
DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 41DE2210DDAA
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com;
	s=default; t=1680629641;
	bh=WqQqqsp0okZkVO1FYremJSyBqHtbpF+nMNvEpcO78gQ=;
	h=Date:From:To:Cc:Subject:References:In-Reply-To:From;
	b=lUtZF/m71dHK3APK9OFy6uU+IsXNVm4GUhsNgOW1LVdV4eerZAj0/gXVMBd5Vrz24
	 3RF0Ntr74PPYN4WkYcJzq+WXXYNhOfzUbnRt1cKT8koc4nqG6XQH2tVpogOp4UkjxF
	 ypAdl9EsNmRXkTGJScinSuShXPlquhSiLk5tHnso=
Date: Tue, 4 Apr 2023 10:34:01 -0700
From: Tyler Retzlaff
To: Bruce Richardson
Cc: Stephen Hemminger , dev@dpdk.org, ciara.power@intel.com,
	david.marchand@redhat.com, thomas@monjalon.net
Subject: Re: [PATCH 1/2] telemetry: use malloc instead of variable length array
Message-ID: <20230404173401.GA32118@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
References: <1680539424-20255-1-git-send-email-roretzla@linux.microsoft.com>
	<1680539424-20255-2-git-send-email-roretzla@linux.microsoft.com>
	<20230403131913.0aec54ce@hermes.local>
	<20230404162444.GB18560@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
	<20230404164446.GF18560@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org

On Tue, Apr 04, 2023 at 06:25:42PM +0100, Bruce Richardson wrote:
> On Tue, Apr 04, 2023 at 09:44:46AM -0700, Tyler Retzlaff wrote:
> > On Tue, Apr 04, 2023 at 05:28:29PM +0100, Bruce Richardson wrote:
> > > On Tue, Apr 04, 2023 at 09:24:44AM -0700, Tyler Retzlaff wrote:
> > > > On Tue, Apr 04, 2023 at 09:47:21AM +0100, Bruce Richardson wrote:
> > > > > On Mon, Apr 03, 2023 at 01:19:12PM -0700, Stephen Hemminger wrote:
> > > > > > On Mon, 3 Apr 2023 09:30:23 -0700
> > > > > > Tyler Retzlaff wrote:
> > > > > >
> > > > > > >  __json_snprintf(char *buf, const int len, const char *format, ...)
> > > > > > >  {
> > > > > > > -	char tmp[len];
> > > > > > > +	char *tmp = malloc(len);
> > > > > > >  	va_list ap;
> > > > > > > -	int ret;
> > > > > > > +	int ret = 0;
> > > > > > > +
> > > > > > > +	if (tmp == NULL)
> > > > > > > +		return ret;
> > > > > > >
> > > > > > >  	va_start(ap, format);
> > > > > > >  	ret = vsnprintf(tmp, sizeof(tmp), format, ap);
> > > > > > >  	va_end(ap);
> > > > > > >  	if (ret > 0 && ret < (int)sizeof(tmp) && ret < len) {
> > > > > > >  		strcpy(buf, tmp);
> > > > > > > -		return ret;
> > > > > > >  	}
> > > > > > > -	return 0; /* nothing written or modified */
> > > > > > > +
> > > > > > > +	free(tmp);
> > > > > > > +
> > > > > > > +	return ret;
> > > > > > >  }
> > > > > > >
> > > > > > Not sure why it needs a tmp buffer anyway?
> > > > > >
> > > > > The temporary buffer is to ensure that in the case that the data doesn't
> > > > > fit in the buffer, the buffer remains unmodified. The reason for this is
> > > > > that when building up the json response we always have a valid json string.
> > > > >
> > > > i guessed this but you've now confirmed it. it makes sense in general
> > > > that if the callee signals an error to the caller that the caller shall
> > > > not observe any side-effects to do so is to take a dependency on what is
> > > > more often than not an internal implementation detail.
> > > >
> > > > >
> > > > > For example, suppose we are preparing a response with an array of two
> > > > > strings. After the first string has been processed, the output buffer
> > > > > contains: '["string1"]'. When json_snprintf is being called to add string2,
> > > > > there are a couple of things to note:
> > > > > * the text to be inserted will be put not at the end of the string, but
> > > > >   before the closing "]".
> > > > > * the actual text to be inserted will be ',"string2"]', so ensuring that
> > > > >   the final buffer is valid.
> > > > > However, the error case is problematic. While we can catch the case where
> > > > > the string to be inserted overflows/has been truncated, doing a regular
> > > > > snprintf means that our output buffer could contain invalid json, as our
> > > > > end-terminator would have been overwritten, e.g. '["string1","string2'
> > > > > To guarantee the output from telemetry is always valid json, even in case
> > > > > of truncation, we use a temporary buffer to do the write initially, and if
> > > > > it doesn't get truncated, we then copy that to the final buffer.
> > > > >
> > > > > That's the logic for this temporary buffer. Now, thinking about it
> > > > > yesterday evening, there are other ways in which we can do this, which can
> > > > > avoid this temporary buffer.
> > > > > 1. We can do the initial snprintf to an empty buffer to get the length that
> > > > > way. This will still be slower, as it means that we need to do printf
> > > > > processing twice rather than using memcpy to copy the result. However, it's
> > > > > probably less overhead than malloc and free.
> > > > > 2. AFAIK, the normal case for this function being called is with a single
> > > > > terminator at the end of the string. We can take advantage of that, by
> > > > > checking if the '\0' just one character into the string we are printing,
> > > > > and, if so, to store that one character. If we have a snprintf error
> > > > > leading to truncation, it then allows us to restore the original string.
> > > > >
> > > > > My suggestion is to use a combination of these methods. In json_snprintf
> > > > > check if the input buffer is empty or has only one character in it, and use
> > > > > method #2 if so. If that's not the case, then fallback to method #1 and do
> > > > > a double snprintf.
> > > > >
> > > > > Make sense? Any other suggestions?
> > > > >
> > > > your suggestion seems okay to me, aside from that there's always using
> > > > some fixed sized buffer but i'm guessing this being json it's difficult
> > > > to choose a reasonable constant size for a stack allocated buffer.
> > > >
> > > Yes, choosing a reasonable size is very difficult. We could be snprintf-ing
> > > a string containing a json-ized object a couple of KB long.
> > >
> > haven't checked recently, but i wonder what our normal usermode stack
> > frame size limit is, which is why alloca() would be scary.
> >
> > >
> > > I think suggestion #2 above should cover most cases, in which case using
> > > your original suggestion of malloc would be ok too for the rare case (if
> > > ever) where we don't just have one terminator on the end.
> > >
> > maybe a dumb'd down compromise is to have a fixed stack limit and then
> > if it is exceeded always just go to malloc/free?
> >
> Perhaps. If you like, I can have a try at implementing my own suggestions
> above tomorrow. I'd like if we can get the "single-character-saving" option
> working, because that would be the most efficient method of all.

that would be great, i'll take the help i can get.

>
> /Bruce
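
For illustration, the combination proposed in the quoted text (method #2 when the destination holds at most one character, otherwise method #1) might look roughly like the sketch below. This is not the code that went into DPDK. The name json_snprintf_sketch is hypothetical, and the sketch assumes that buf holds a NUL-terminated string, that len is the space available from buf onward, and that none of the format arguments point into buf.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch, not the DPDK implementation.
 * Returns the number of characters written, or 0 if the output did not
 * fit, in which case the string held in buf reads the same as before.
 * Assumes buf is NUL-terminated and no format argument aliases buf.
 */
static int
json_snprintf_sketch(char *buf, int len, const char *format, ...)
{
	va_list ap;
	int ret;

	if (buf == NULL || len <= 0)
		return 0;

	if (buf[0] == '\0' || (len > 1 && buf[1] == '\0')) {
		/* Method #2: at most one character can be lost, so save it
		 * and print straight into the destination. */
		const char saved = buf[0];

		va_start(ap, format);
		ret = vsnprintf(buf, (size_t)len, format, ap);
		va_end(ap);
		if (ret > 0 && ret < len)
			return ret;

		/* Truncated or failed: restore the original string. Bytes
		 * after the restored terminator may still have changed. */
		buf[0] = saved;
		if (saved != '\0')
			buf[1] = '\0';
		return 0;
	}

	/* Method #1: format once with no destination to learn the length,
	 * and only write for real if the result fits. */
	va_start(ap, format);
	ret = vsnprintf(NULL, 0, format, ap);
	va_end(ap);
	if (ret <= 0 || ret >= len)
		return 0;

	va_start(ap, format);
	ret = vsnprintf(buf, (size_t)len, format, ap);
	va_end(ap);
	return ret;
}
```

In the array example from the thread, a caller would pass a pointer to the closing "]" as buf, so the method #2 branch is taken: on truncation only that one character and its terminator need to be put back, and no heap or variable-length stack allocation is involved.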