From: Morten Brørup
To: "Bruce Richardson"
Subject: RE: [RFC PATCH 2/6] telemetry: fix escaping of invalid json characters
Date: Thu, 23 Jun 2022 20:34:07 +0200
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D8716B@smartserver.smartshare.dk>
In-Reply-To: <20220623164245.561371-3-bruce.richardson@intel.com>
References: <20220623164245.561371-1-bruce.richardson@intel.com> <20220623164245.561371-3-bruce.richardson@intel.com>
List-Id: DPDK patches and discussions

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Thursday, 23 June 2022 18.43
>
> For string values returned from telemetry, escape any values that cannot
> normally appear in a json string. According to the json spec[1], the
> characters than need to be handled are control chars (char value < 0x20)
> and '"' and '\' characters.

Correct. Other chars are optional to escape.

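For reference, a minimal sketch of that rule as a predicate. The name json_must_escape is made up for illustration and is not part of the patch; taking the byte as unsigned char is my own choice, so that bytes >= 0x80 are never mistaken for control chars.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of the patch: true if RFC 8259 requires
 * this byte to be escaped inside a JSON string. */
static bool
json_must_escape(unsigned char c)
{
	return c < 0x20 || c == '"' || c == '\\';
}

/* Usage example: '/' and DEL (0x7f) may be escaped, but need not be. */
static void
json_must_escape_check(void)
{
	assert(json_must_escape('\n'));
	assert(json_must_escape('"'));
	assert(json_must_escape('\\'));
	assert(!json_must_escape('/'));
	assert(!json_must_escape(0x7f));
}
```
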
>
> To handle this, we replace the snprintf call with a separate string
> copying and encapsulation routine which checks each character as it
> copies it to the final array.
>
> [1] https://www.rfc-editor.org/rfc/rfc8259.txt
>
> Signed-off-by: Bruce Richardson
> ---
>  lib/telemetry/telemetry_json.h | 48 +++++++++++++++++++++++++++++++++-
>  1 file changed, 47 insertions(+), 1 deletion(-)
>
> diff --git a/lib/telemetry/telemetry_json.h b/lib/telemetry/telemetry_json.h
> index db70690274..13df5d07e3 100644
> --- a/lib/telemetry/telemetry_json.h
> +++ b/lib/telemetry/telemetry_json.h
> @@ -44,6 +44,52 @@ __json_snprintf(char *buf, const int len, const char *format, ...)
>  	return 0; /* nothing written or modified */
>  }
>
> +static const char control_chars[0x20] = {
> +	['\n'] = 'n',
> +	['\r'] = 'r',
> +	['\t'] = 't',
> +};
> +
> +/**
> + * @internal
> + * Does the same as __json_snprintf(buf, len, "\"%s\"", str)
> + * except that it does proper escaping as necessary.
> + * Drops any invalid characters we don't support
> + */
> +static inline int
> +__json_format_str(char *buf, const int len, const char *str)
> +{
> +	char tmp[len];
> +	int tmpidx = 0;
> +
> +	tmp[tmpidx++] = '"';
> +	while (*str != '\0') {
> +		if (*str < (int)RTE_DIM(control_chars)) {

I would prefer the more explicit 0x20, directly copied from the RFC. RTE_DIM(control_chars) hints that it could change.

> +			int idx = *str;  /* compilers don't like char type as index */
> +			if (control_chars[idx] != 0) {
> +				tmp[tmpidx++] = '\\';
> +				tmp[tmpidx++] = control_chars[idx];
> +			}

Consider support for other control characters:

+			else {
+				tmp[tmpidx++] = '\\';
+				tmp[tmpidx++] = 'u';
+				tmp[tmpidx++] = '0';
+				tmp[tmpidx++] = '0';
+				tmp[tmpidx++] = hexchar(idx >> 4);
+				tmp[tmpidx++] = hexchar(idx & 0xf);
+			}

Or just drop them, as you mention in the function's description.

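The hexchar() used in the suggestion is only a placeholder. One possible definition is sketched here, assuming lowercase digits are acceptable (RFC 8259 allows either case in these escapes). The wrapper name json_emit_u00 is made up for illustration.

```c
#include <assert.h>

/* Assumed definition of the hexchar() placeholder: maps 0..15 to a
 * lowercase hexadecimal digit. */
static char
hexchar(unsigned int nibble)
{
	assert(nibble < 16);
	return "0123456789abcdef"[nibble];
}

/* Hypothetical wrapper: writes the six-byte unicode escape (backslash,
 * 'u', "00", two hex digits) for a control character c < 0x20.
 * out must have room for 6 bytes. Returns the number of bytes written. */
static int
json_emit_u00(char *out, unsigned char c)
{
	assert(c < 0x20);
	out[0] = '\\';
	out[1] = 'u';
	out[2] = '0';
	out[3] = '0';
	out[4] = hexchar(c >> 4);
	out[5] = hexchar(c & 0xf);
	return 6;
}
```

Each such character costs six output bytes instead of two, which matters for the space check further down.
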
> +		} else if (*str == '"' || *str == '\\') {
> +			tmp[tmpidx++] = '\\';
> +			tmp[tmpidx++] = *str;
> +		} else
> +			tmp[tmpidx++] = *str;
> +		/* we always need space for closing quote and null character.
> +		 * Ensuring at least two free characters also means we can always take an
> +		 * escaped character like "\n" without overflowing
> +		 */
> +		if (tmpidx > len - 2)

If supporting the \u00XX encoding, you need to reserve more than 2 characters here and in related code.

> +			return 0;
> +		str++;
> +	}
> +	tmp[tmpidx++] = '"';
> +	tmp[tmpidx] = '\0';
> +
> +	strcpy(buf, tmp);
> +	return tmpidx;
> +}
> +
>  /* Copies an empty array into the provided buffer. */
>  static inline int
>  rte_tel_json_empty_array(char *buf, const int len, const int used)
> @@ -62,7 +108,7 @@ rte_tel_json_empty_obj(char *buf, const int len, const int used)
>  static inline int
>  rte_tel_json_str(char *buf, const int len, const int used, const char *str)
>  {
> -	return used + __json_snprintf(buf + used, len - used, "\"%s\"", str);
> +	return used + __json_format_str(buf + used, len - used, str);
>  }
>
>  /* Appends a string into the JSON array in the provided buffer. */
> --
> 2.34.1
>
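
To make the space reservation concrete, here is a standalone sketch. It is not the patch's code: it sizes the output in a first pass instead of copying through a temporary array, and the names json_escaped_len and json_format_str_sketch are made up. Reading bytes as unsigned char and passing bytes >= 0x80 through unchanged are my assumptions. The "return 0 and leave buf untouched when it does not fit" convention mirrors the quoted patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: output bytes needed for c inside a JSON string. */
static int
json_escaped_len(unsigned char c)
{
	if (c == '"' || c == '\\' || c == '\n' || c == '\r' || c == '\t')
		return 2; /* backslash plus one character */
	if (c < 0x20)
		return 6; /* backslash, 'u', "00", two hex digits */
	return 1;
}

/* Writes str as a quoted, escaped JSON string into buf (len bytes).
 * Returns the bytes written excluding the NUL, or 0 if it does not fit,
 * in which case buf is not modified. */
static int
json_format_str_sketch(char *buf, int len, const char *str)
{
	static const char hex[] = "0123456789abcdef";
	const unsigned char *us = (const unsigned char *)str;
	const unsigned char *s;
	int need = 3; /* opening quote, closing quote, NUL */
	int n = 0;

	/* First pass: size the output, stopping as soon as it cannot fit. */
	for (s = us; ; s++) {
		if (need > len)
			return 0;
		if (*s == '\0')
			break;
		need += json_escaped_len(*s);
	}

	/* Second pass: every write below is known to be in bounds. */
	buf[n++] = '"';
	for (s = us; *s != '\0'; s++) {
		unsigned char c = *s;

		switch (json_escaped_len(c)) {
		case 1:
			buf[n++] = (char)c;
			break;
		case 2:
			buf[n++] = '\\';
			buf[n++] = c == '\n' ? 'n' : c == '\r' ? 'r' :
					c == '\t' ? 't' : (char)c;
			break;
		default:
			memcpy(buf + n, "\\u00", 4);
			buf[n + 4] = hex[c >> 4];
			buf[n + 5] = hex[c & 0xf];
			n += 6;
			break;
		}
	}
	buf[n++] = '"';
	buf[n] = '\0';
	assert(n + 1 == need);
	return n;
}
```

With a 4-byte buffer, "a" fits (three bytes plus the NUL), while a lone newline needs five bytes, so that call returns 0 and leaves the buffer untouched.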