From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id DE892A04FD; Wed, 22 Jun 2022 11:19:57 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id CA2504069C; Wed, 22 Jun 2022 11:19:57 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by mails.dpdk.org (Postfix) with ESMTP id 6B83C40689 for ; Wed, 22 Jun 2022 11:19:55 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655889595; x=1687425595; h=date:from:to:cc:subject:message-id:references: mime-version:content-transfer-encoding:in-reply-to; bh=GRLI+WWLM2hmXEkcEjXI7UE15XUU+WKjDxBShxekpY0=; b=JvOTVfgTjcul20wl11kjZtTrs8kjB6bBraeDKrqEo4I0/mMli+FYc4aq iqvWvTk09xAlw64xr7GiPYYbhXPxyibjh7iFFd6/E4Q0864GJX7JpkAzt x7T1SAy00lWgbcbJtjyydbWKdW0rZy560/nV88cqQ5gTOR6SLkw7bnhu8 KZ80+Q9tU/8Co5IyBcJz5RyeCdAH1Li5M6qawEP7jm5q59SYaVrAxtSVk LQETWvBcStbGd/V1FJYfVOkx5rx/kviT7Jd9LJSkwPtnGB2kGNIpvFq1b k6uaTp7qo/dspRjsPRvInIzYqLwQhCLf2jqoHtgQhoIK3MtNQA9AUPcSL A==; X-IronPort-AV: E=McAfee;i="6400,9594,10385"; a="366688839" X-IronPort-AV: E=Sophos;i="5.92,212,1650956400"; d="scan'208";a="366688839" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Jun 2022 02:19:54 -0700 X-IronPort-AV: E=Sophos;i="5.92,212,1650956400"; d="scan'208";a="592092493" Received: from bricha3-mobl.ger.corp.intel.com ([10.55.133.62]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-SHA; 22 Jun 2022 02:19:51 -0700 Date: Wed, 22 Jun 2022 10:19:48 +0100 From: Bruce Richardson To: "Power, Ciara" Cc: Morten =?iso-8859-1?Q?Br=F8rup?= , fengchengwen , Stephen Hemminger , "thomas@monjalon.net" , "ferruh.yigit@xilinx.com" , "Laatz, Kevin" , "andrew.rybchenko@oktetlabs.ru" , "jerinj@marvell.com" , "sachin.saxena@oss.nxp.com" , "hemant.agrawal@nxp.com" , "dev@dpdk.org" Subject: Re: [PATCH v2 1/5] telemetry: escape special char when tel string Message-ID: References: <20220615073915.14041-1-fengchengwen@huawei.com> <20220617094624.17578-1-fengchengwen@huawei.com> <20220617094624.17578-2-fengchengwen@huawei.com> <98CBD80474FA8B44BF855DF32C47DC35D8713C@smartserver.smartshare.dk> <20220617100514.5a2df62c@hermes.local> <507d1942-868b-4e60-6921-6b420190e5de@huawei.com> <98CBD80474FA8B44BF855DF32C47DC35D8713F@smartserver.smartshare.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Wed, Jun 22, 2022 at 08:57:43AM +0100, Power, Ciara wrote: > Hi folks, > > > -----Original Message----- > > From: Morten Brørup > > Sent: Saturday 18 June 2022 10:59 > > To: fengchengwen ; Stephen Hemminger > > ; Richardson, Bruce > > > > Cc: thomas@monjalon.net; ferruh.yigit@xilinx.com; Laatz, Kevin > > ; andrew.rybchenko@oktetlabs.ru; > > jerinj@marvell.com; sachin.saxena@oss.nxp.com; > > hemant.agrawal@nxp.com; dev@dpdk.org; Power, Ciara > > > > Subject: RE: [PATCH v2 1/5] telemetry: escape special char when tel string > > > > +CC: Ciara Power, Telemetry library maintainer > > > > > From: fengchengwen [mailto:fengchengwen@huawei.com] > > > Sent: Saturday, 18 June 2022 05.52 > > > > > > On 2022/6/18 1:05, Stephen Hemminger wrote: > > > > On Fri, 17 Jun 2022 12:25:04 +0100 > > > > Bruce Richardson wrote: > > > > > > > >> On Fri, Jun 17, 2022 at 01:16:08PM +0200, Morten Brørup wrote: > > > >>>> From: Chengwen Feng [mailto:fengchengwen@huawei.com] > > > >>>> Sent: Friday, 17 June 2022 11.46 > > > >>>> > > > >>>> This patch supports escape special characters (including: > > > \",\\,/,\b, > > > >>>> /f,/n,/r,/t) when telemetry string. > > > >>>> This patch is used to support telemetry xxx-dump commands which > > > the > > > >>>> string may include special characters. > > > >>>> > > > >>>> Signed-off-by: Chengwen Feng > > > >>>> --- > > > >>>> lib/telemetry/telemetry.c | 96 > > > +++++++++++++++++++++++++++++++++++++-- > > > >>>> 1 file changed, 93 insertions(+), 3 deletions(-) > > > >>>> > > > >>>> diff --git a/lib/telemetry/telemetry.c > > > >>>> b/lib/telemetry/telemetry.c index c6fd03a5ab..0f762f633e 100644 > > > >>>> --- a/lib/telemetry/telemetry.c > > > >>>> +++ b/lib/telemetry/telemetry.c > > > >>>> @@ -215,6 +215,94 @@ container_to_json(const struct rte_tel_data > > > *d, > > > >>>> char *out_buf, size_t buf_len) > > > >>>> return used; > > > >>>> } > > > >>>> > > > >>>> +static bool > > > >>>> +json_is_special_char(char ch) > > > >>>> +{ > > > >>>> + static unsigned char is_spec[256] = { 0 }; > > > >>>> + static bool init_once; > > > >>>> + > > > >>>> + if (!init_once) { > > > >>>> + is_spec['\"'] = 1; > > > >>>> + is_spec['\\'] = 1; > > > >>>> + is_spec['/'] = 1; > > > >>>> + is_spec['\b'] = 1; > > > >>>> + is_spec['\f'] = 1; > > > >>>> + is_spec['\n'] = 1; > > > >>>> + is_spec['\r'] = 1; > > > >>>> + is_spec['\t'] = 1; > > > >>>> + init_once = true; > > > >>>> + } > > > >>>> + > > > >>>> + return (bool)is_spec[(unsigned char)ch]; } > > > >> > > > >> According to the json spec at [1], the characters that need to be > > > escaped > > > >> are: > > > >> a) any characters <0x20 > > > >> b) inverted commas/quote character \" > > > >> c) the "reverse solidus character", better known to you and I as > > > >> the back-slash. > > > >> > > > >> Therefore, I think this table generation could be simplified, but > > > also > > > >> expanded using this. For completeness we should also see about > > > handling all > > > >> control characters if they are encountered. > > > >> > > > >> [1] https://www.rfc-editor.org/rfc/rfc8259.txt > > > >> > > > >> /Bruce > > > > > > > > Since it is trivial could be initializer? > > > > > > > > static const uint8_t is_spec[256] = { > > > > [0 ... 0x20] = 1, > > > > ['\"' ] = 1, > > > > ['\\' ] = 1, > > > > ['/'] = 1, > > > > > > > > etc > > > > > > > > Or we could change the telemetry API to disallow control characters? > > > > > > I was thinking about converting 0~0x20, but I don't think there's a > > > scenario. > > > > > > I prefer change the telemetry API to disallow control characters, and > > > this may not be a violation of the ABI, if yes, the dpdk-telemetry.py > > > will returns an error. > > > > I agree with Chengwen Feng. The telemetry data type is STRING, not BLOB. > > > > So we need to define exactly what the STRING type contains. > > > > I hope we can all agree that control characters should be disallowed. > > > > The more complicated question is: Do we want to use the ASCII character set > > only, or do we want to use UTF-8 encoded Unicode? > > > > Personally, think UTF-8 encoded Unicode is more future proof, and would > > vote for that. > > > > But I wouldn't reject limiting it to ASCII, and perhaps in the future introduce > > another data type for UTF-8 strings. > > > > UTF-8 is the modern choice, but it is incompatible with old stuff, e.g. many > > SNMP MIBs. > > > [CP] > > Just from looking at the spec [1] , I would say UTF-8, as it seems to suggest its use for JSON (section 8.1). > > [1] https://www.rfc-editor.org/rfc/rfc8259.txt > > > > > > > So I think we could add declaring and checking functions to make sure > > > telemetry string do not allow control characters. > [CP] > > I am not sure why we don't want these at all - I thought we do want some of them, like tab (\u0009) for example. > > > > In general, I think Bruce's suggestion of using a customised printf function instead of snprintf would be a good way forward, to scan the chars as they are being copied in. > I'm hoping to have some time to try and prototype this myself soon, and send out a draft patch to this mailing list for consideration. /Bruce