From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 46EA4A0032; Fri, 24 Jun 2022 11:12:10 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 94AC2415D7; Fri, 24 Jun 2022 11:12:08 +0200 (CEST) Received: from smartserver.smartsharesystems.com (smartserver.smartsharesystems.com [77.243.40.215]) by mails.dpdk.org (Postfix) with ESMTP id 015AC4280E for ; Fri, 24 Jun 2022 11:12:06 +0200 (CEST) Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Subject: RE: [RFC PATCH 0/6] add json string escaping to telemetry X-MimeOLE: Produced By Microsoft Exchange V6.5 Date: Fri, 24 Jun 2022 11:12:05 +0200 Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D8716F@smartserver.smartshare.dk> In-Reply-To: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [RFC PATCH 0/6] add json string escaping to telemetry Thread-Index: AdiHomNjZ8QBPcGYT5GJXGCQ6f6lpQABAjAw References: <20220623164245.561371-1-bruce.richardson@intel.com> <98CBD80474FA8B44BF855DF32C47DC35D8716D@smartserver.smartshare.dk> From: =?iso-8859-1?Q?Morten_Br=F8rup?= To: "Bruce Richardson" Cc: , , X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org > From: Bruce Richardson [mailto:bruce.richardson@intel.com] > Sent: Friday, 24 June 2022 10.14 >=20 > On Thu, Jun 23, 2022 at 09:04:31PM +0200, Morten Br=F8rup wrote: > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com] > > > Sent: Thursday, 23 June 2022 18.43 > > > > > > This RFC shows one possible approach for escaping strings for the > json > > > output of telemetry library. For now this RFC supports escaping > strings > > > for the cases of returning a single string, or returning an array > of > > > strings. Not done is escaping of strings in objs/dicts [see more > below > > > on TODO] > > > > Very good initiative. > > > > > > > > As well as telemetry lib changes, this patchset includes unit = tests > for > > > the above and also little bit of cleanup to the json tests. > > > > > > TODO: > > > Beyond what is here in this RFC: > > > > > > 1. we need to decide what to do about name/value pairs. = Personally, > I > > > think we should add the restriction to the > "rte_tel_data_add_obj_*" > > > APIs > > > to only allow a defined subset of characters in names: e.g. > > > alphanumeric > > > chars, underscore and dash. That means that we only need to > escape > > > the data part in the case of string returns. > > > > I agree about only allowing a subset of characters in names, so JSON > (and other) encoding is not required. > > > > However, I think we should be less restrictive, and also allow > characters commonly used for separation, indexing and wildcard, such = as > '/', '[', ']', and '*', '?' or '%'. > > > > Obviously, we should disallow characters requiring escaping in not > just JSON, but also other foreseeable encodings and protocols. So > please bring your crystal ball to the discussion. ;-) > > > Exactly why I am looking for feedback - and why I'm looking to have an > explicit allowed list of characters rather than trying to just block > the > known-bad in json ones. >=20 > For your suggestions: +1 to separators and indexing, i.e. '[', ']' and > '/', > though I would probably also add ',' and maybe '.' (unless it's likely > to > cause issues with some protocol we are likely to want to use). After having slept on it, I think we should also allow characters that = could appear in IP and MAC addresses, i.e. '.' and ':' (and '/' for = subnetting). > For the wildcarding, I find it hard to see why we would want those? Initially, I thought a wildcard might be useful as a placeholder in = templates. But it might also be useful for partial IP or MAC addresses. E.g.: - The SmartShare Systems OUI could be represented by the MAC address = "00:1F:B4:??:??:??". - A default gateway address in a template configuration could be = "192.168.*.1". On the other hand, wildcard characters could be disallowed or require = escaping in other (non-JSON) protocols. So I'm just being a bit creative here, throwing out ideas in our search = for the right balance in the restrictions. >=20 > The other advantage of using an allowlist of characters is that it > makes it > possible to expand over time, compared to a blocklist which always = runs > the > risk of breaking something if you expand it. Therefore I suggest we > keep > the list as small as we need right now, and expand it only as we need. +1 >=20 > > > 2. once agreed, need to implement a patch to escape strings in > > > dicts/objs > > > > Yes. > > > > > > > > 3. need to add a patch to escape the input command if it contains > > > invalid chars > > > > What do you mean here? You mean unescape JSON encoded input = (arriving > on the JSON telemetry socket) to a proper binary string? > > >=20 > The thing with the telemetry socket interface right now is that the > input > requests are not-json. The reasons for that is that they be kept as > simple > as possible, and to avoid needing a full json parser inside DPDK. > Therefore, the input sent by the user could contain invalid characters > for > json output so we need to: > 1. Guarantee that no command registered with the telemetry library > contains > invalid json characters (though why someone would do so, I don't > know!) > 2. When we return the command back in the reply, properly escape any > invalid characters in the error case. >=20 > #1 is very important for sanity checking, but now that I think about = it > #2 > is probably optional, since if any user does start sending invalid > garbage > input that breaks their json parser on return, they are only hurting > themselves and not affecting anything else on the system. >=20 > > > 4. some small refactoring of the main telemetry.c json-encoding > > > function may be possible. > > > > Perhaps. > > > I saw some options for cleanup when I was working on the code, so > including > this as a note-to-self as much as anything else for feedback. :-) >=20 > /Bruce