From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9573FA0032; Fri, 24 Jun 2022 11:17:11 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 3F10740A87; Fri, 24 Jun 2022 11:17:11 +0200 (CEST) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by mails.dpdk.org (Postfix) with ESMTP id 027554069D for ; Fri, 24 Jun 2022 11:17:09 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1656062230; x=1687598230; h=date:from:to:cc:subject:message-id:references: mime-version:content-transfer-encoding:in-reply-to; bh=Vcku+VBiLp7pWWCb9PJ+R2Y6NZXbAlOkbzffLY/v0yI=; b=kp/Vd/AlHoWVlGxG5M6svGNcvPxg+SJAV5RCiPLFA5ZITzmC9lfhzJvB U4yK7qACrj6RziGx+M1ExNAxoioFvZIjtUiD0lnzbnhWjLnasWiWDmvqW DfUT3MhQp3tcMLlq0kNvo+egFhlYnmfi+TCfRXlEQEJqnQTEpqOCCsTdA Ywj6U+ZGmJV15y9dYdtXIrNOOgx3ssuE/YIuU/UnF4qkZYl/Gd0rMAGV4 zO/vR9jKKSYWXyoL+W7LzGnFjOPDmQmxKkAv+0UhtYQk/0hIf5Aqzto+N y7OX8DiSENIXpcHi+cVuEFbnNz9zzzskIU14IBOxXbYaj1g7nbP14zjWp g==; X-IronPort-AV: E=McAfee;i="6400,9594,10387"; a="281696240" X-IronPort-AV: E=Sophos;i="5.92,218,1650956400"; d="scan'208";a="281696240" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jun 2022 02:17:08 -0700 X-IronPort-AV: E=Sophos;i="5.92,218,1650956400"; d="scan'208";a="593145585" Received: from bricha3-mobl.ger.corp.intel.com ([10.252.25.171]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-SHA; 24 Jun 2022 02:17:07 -0700 Date: Fri, 24 Jun 2022 10:17:04 +0100 From: Bruce Richardson To: Morten =?iso-8859-1?Q?Br=F8rup?= Cc: dev@dpdk.org, ciara.power@intel.com, fengchengwen@huawei.com Subject: Re: [RFC PATCH 0/6] add json string escaping to telemetry Message-ID: References: <20220623164245.561371-1-bruce.richardson@intel.com> <98CBD80474FA8B44BF855DF32C47DC35D8716D@smartserver.smartshare.dk> <98CBD80474FA8B44BF855DF32C47DC35D8716F@smartserver.smartshare.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D8716F@smartserver.smartshare.dk> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Fri, Jun 24, 2022 at 11:12:05AM +0200, Morten Brørup wrote: > > From: Bruce Richardson [mailto:bruce.richardson@intel.com] > > Sent: Friday, 24 June 2022 10.14 > > > > On Thu, Jun 23, 2022 at 09:04:31PM +0200, Morten Brørup wrote: > > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com] > > > > Sent: Thursday, 23 June 2022 18.43 > > > > > > > > This RFC shows one possible approach for escaping strings for the > > json > > > > output of telemetry library. For now this RFC supports escaping > > strings > > > > for the cases of returning a single string, or returning an array > > of > > > > strings. Not done is escaping of strings in objs/dicts [see more > > below > > > > on TODO] > > > > > > Very good initiative. > > > > > > > > > > > As well as telemetry lib changes, this patchset includes unit tests > > for > > > > the above and also little bit of cleanup to the json tests. > > > > > > > > TODO: > > > > Beyond what is here in this RFC: > > > > > > > > 1. we need to decide what to do about name/value pairs. Personally, > > I > > > > think we should add the restriction to the > > "rte_tel_data_add_obj_*" > > > > APIs > > > > to only allow a defined subset of characters in names: e.g. > > > > alphanumeric > > > > chars, underscore and dash. That means that we only need to > > escape > > > > the data part in the case of string returns. > > > > > > I agree about only allowing a subset of characters in names, so JSON > > (and other) encoding is not required. > > > > > > However, I think we should be less restrictive, and also allow > > characters commonly used for separation, indexing and wildcard, such as > > '/', '[', ']', and '*', '?' or '%'. > > > > > > Obviously, we should disallow characters requiring escaping in not > > just JSON, but also other foreseeable encodings and protocols. So > > please bring your crystal ball to the discussion. ;-) > > > > > Exactly why I am looking for feedback - and why I'm looking to have an > > explicit allowed list of characters rather than trying to just block > > the > > known-bad in json ones. > > > > For your suggestions: +1 to separators and indexing, i.e. '[', ']' and > > '/', > > though I would probably also add ',' and maybe '.' (unless it's likely > > to > > cause issues with some protocol we are likely to want to use). > > After having slept on it, I think we should also allow characters that could appear in IP and MAC addresses, i.e. '.' and ':' (and '/' for subnetting). > > > For the wildcarding, I find it hard to see why we would want those? > > Initially, I thought a wildcard might be useful as a placeholder in templates. > > But it might also be useful for partial IP or MAC addresses. E.g.: > - The SmartShare Systems OUI could be represented by the MAC address "00:1F:B4:??:??:??". > - A default gateway address in a template configuration could be "192.168.*.1". > > On the other hand, wildcard characters could be disallowed or require escaping in other (non-JSON) protocols. > > So I'm just being a bit creative here, throwing out ideas in our search for the right balance in the restrictions. > I could see those characters certainly being needed in data values, but do you foresee them being required in the names of fields? > > > > The other advantage of using an allowlist of characters is that it > > makes it > > possible to expand over time, compared to a blocklist which always runs > > the > > risk of breaking something if you expand it. Therefore I suggest we > > keep > > the list as small as we need right now, and expand it only as we need. > > +1 > >From previous on-list discussion, I take it that SNMP is a possible target protocol you might have in mind. Any other protocols you can think of and what restrictions (if any) would SNMP or those other protocols add? /Bruce