From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 9573FA0032;
	Fri, 24 Jun 2022 11:17:11 +0200 (CEST)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 3F10740A87;
	Fri, 24 Jun 2022 11:17:11 +0200 (CEST)
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
 by mails.dpdk.org (Postfix) with ESMTP id 027554069D
 for <dev@dpdk.org>; Fri, 24 Jun 2022 11:17:09 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
 d=intel.com; i=@intel.com; q=dns/txt; s=Intel;
 t=1656062230; x=1687598230;
 h=date:from:to:cc:subject:message-id:references:
 mime-version:content-transfer-encoding:in-reply-to;
 bh=Vcku+VBiLp7pWWCb9PJ+R2Y6NZXbAlOkbzffLY/v0yI=;
 b=kp/Vd/AlHoWVlGxG5M6svGNcvPxg+SJAV5RCiPLFA5ZITzmC9lfhzJvB
 U4yK7qACrj6RziGx+M1ExNAxoioFvZIjtUiD0lnzbnhWjLnasWiWDmvqW
 DfUT3MhQp3tcMLlq0kNvo+egFhlYnmfi+TCfRXlEQEJqnQTEpqOCCsTdA
 Ywj6U+ZGmJV15y9dYdtXIrNOOgx3ssuE/YIuU/UnF4qkZYl/Gd0rMAGV4
 zO/vR9jKKSYWXyoL+W7LzGnFjOPDmQmxKkAv+0UhtYQk/0hIf5Aqzto+N
 y7OX8DiSENIXpcHi+cVuEFbnNz9zzzskIU14IBOxXbYaj1g7nbP14zjWp g==;
X-IronPort-AV: E=McAfee;i="6400,9594,10387"; a="281696240"
X-IronPort-AV: E=Sophos;i="5.92,218,1650956400"; d="scan'208";a="281696240"
Received: from fmsmga007.fm.intel.com ([10.253.24.52])
 by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 24 Jun 2022 02:17:08 -0700
X-IronPort-AV: E=Sophos;i="5.92,218,1650956400"; d="scan'208";a="593145585"
Received: from bricha3-mobl.ger.corp.intel.com ([10.252.25.171])
 by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-SHA;
 24 Jun 2022 02:17:07 -0700
Date: Fri, 24 Jun 2022 10:17:04 +0100
From: Bruce Richardson <bruce.richardson@intel.com>
To: Morten =?iso-8859-1?Q?Br=F8rup?= <mb@smartsharesystems.com>
Cc: dev@dpdk.org, ciara.power@intel.com, fengchengwen@huawei.com
Subject: Re: [RFC PATCH 0/6] add json string escaping to telemetry
Message-ID: <YrWBEBzueSeV+Snd@bricha3-MOBL.ger.corp.intel.com>
References: <20220623164245.561371-1-bruce.richardson@intel.com>
 <98CBD80474FA8B44BF855DF32C47DC35D8716D@smartserver.smartshare.dk>
 <YrVyRgsDUZvlaHIX@bricha3-MOBL.ger.corp.intel.com>
 <98CBD80474FA8B44BF855DF32C47DC35D8716F@smartserver.smartshare.dk>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D8716F@smartserver.smartshare.dk>
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

On Fri, Jun 24, 2022 at 11:12:05AM +0200, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Friday, 24 June 2022 10.14
> > 
> > On Thu, Jun 23, 2022 at 09:04:31PM +0200, Morten Brørup wrote:
> > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > Sent: Thursday, 23 June 2022 18.43
> > > >
> > > > This RFC shows one possible approach for escaping strings for the
> > json
> > > > output of telemetry library. For now this RFC supports escaping
> > strings
> > > > for the cases of returning a single string, or returning an array
> > of
> > > > strings. Not done is escaping of strings in objs/dicts [see more
> > below
> > > > on TODO]
> > >
> > > Very good initiative.
> > >
> > > >
> > > > As well as telemetry lib changes, this patchset includes unit tests
> > for
> > > > the above and also little bit of cleanup to the json tests.
> > > >
> > > > TODO:
> > > > Beyond what is here in this RFC:
> > > >
> > > > 1. we need to decide what to do about name/value pairs. Personally,
> > I
> > > >    think we should add the restriction to the
> > "rte_tel_data_add_obj_*"
> > > > APIs
> > > >    to only allow a defined subset of characters in names: e.g.
> > > > alphanumeric
> > > >    chars, underscore and dash. That means that we only need to
> > escape
> > > >    the data part in the case of string returns.
> > >
> > > I agree about only allowing a subset of characters in names, so JSON
> > (and other) encoding is not required.
> > >
> > > However, I think we should be less restrictive, and also allow
> > characters commonly used for separation, indexing and wildcard, such as
> > '/', '[', ']', and '*', '?' or '%'.
> > >
> > > Obviously, we should disallow characters requiring escaping in not
> > just JSON, but also other foreseeable encodings and protocols. So
> > please bring your crystal ball to the discussion. ;-)
> > >
> > Exactly why I am looking for feedback - and why I'm looking to have an
> > explicit allowed list of characters rather than trying to just block
> > the
> > known-bad in json ones.
> > 
> > For your suggestions: +1 to separators and indexing, i.e. '[', ']' and
> > '/',
> > though I would probably also add ',' and maybe '.' (unless it's likely
> > to
> > cause issues with some protocol we are likely to want to use).
> 
> After having slept on it, I think we should also allow characters that could appear in IP and MAC addresses, i.e. '.' and ':' (and '/' for subnetting).
> 
> > For the wildcarding, I find it hard to see why we would want those?
> 
> Initially, I thought a wildcard might be useful as a placeholder in templates.
> 
> But it might also be useful for partial IP or MAC addresses. E.g.:
> - The SmartShare Systems OUI could be represented by the MAC address "00:1F:B4:??:??:??".
> - A default gateway address in a template configuration could be "192.168.*.1".
> 
> On the other hand, wildcard characters could be disallowed or require escaping in other (non-JSON) protocols.
> 
> So I'm just being a bit creative here, throwing out ideas in our search for the right balance in the restrictions.
> 

I could see those characters certainly being needed in data values, but do
you foresee them being required in the names of fields?

> > 
> > The other advantage of using an allowlist of characters is that it
> > makes it
> > possible to expand over time, compared to a blocklist which always runs
> > the
> > risk of breaking something if you expand it. Therefore I suggest we
> > keep
> > the list as small as we need right now, and expand it only as we need.
> 
> +1
>

>From previous on-list discussion, I take it that SNMP is a possible target
protocol you might have in mind. Any other protocols you can think of and
what restrictions (if any) would SNMP or those other protocols add?

/Bruce