From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v5 2/2] ethdev: support xstats reset telemetry command
To: Morten Brørup, Thomas Monjalon, Dongdong Liu
CC: Bruce Richardson, Ferruh Yigit
From: fengchengwen
Date: Tue, 4 Jul 2023 14:41:27 +0800
References: <20221219090723.29356-1-fengchengwen@huawei.com> <12802674.ZYm5mLc6kN@thomas> <18761354.fAMKPKieAE@thomas> <98CBD80474FA8B44BF855DF32C47DC35D87A5C@smartserver.smartshare.dk>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D87A5C@smartserver.smartshare.dk>
List-Id: DPDK patches and discussions

Hi Thomas and Morten,

On 2023/7/3 21:44, Morten Brørup wrote:
>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
>> Sent: Monday, 3 July 2023 09.20
>>
>> 03/07/2023 05:58, fengchengwen:
>>>
>>> On 2023/2/20 21:05, Thomas Monjalon wrote:
>>>> 17/02/2023 10:44, fengchengwen:
>>>>> On 2023/2/16 20:54, Bruce Richardson wrote:
>>>>>> On Thu, Feb 16, 2023 at 08:42:34PM +0800, fengchengwen wrote:
>>>>>>> On 2023/2/16 20:06, Ferruh Yigit wrote:
>>>>>>>> On 2/16/2023 11:53 AM, fengchengwen wrote:
>>>>>>>>> On 2023/2/15 11:19, Dongdong Liu wrote:
>>>>>>>>>> Hi Chengwen
>>>>>>>>>>
>>>>>>>>>> On 2023/2/9 10:32, Chengwen Feng wrote:
>>>>>>>>>>> The xstats reset is useful for debugging, so add it to the ethdev
>>>>>>>>>>> telemetry command lists.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Chengwen Feng
>>>>>>>>>> This patch looks good, so
>>>>>>>>>> Reviewed-by: Dongdong Liu
>>>>>>>>>>
>>>>>>>>>> A minor question
>>>>>>>>>> Do we need to support stats reset ?
>>>>>>>>>
>>>>>>>>> Stats is contained by xstats, and future direction I think is xstats.
>>>>>>>>> So I think we don't need to support stats reset.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I have similar question with Dongdong, readonly values are safe for
>>>>>>>> telemetry, but modifying data can be more tricky since we don't have
>>>>>>>> locking in ethdev APIs, this can cause concurrency issues.
>>>>>>>
>>>>>>> Yes, it indeed has concurrency issues.
>>>>>>>
>>>>>>>>
>>>>>>>> Overall do we want telemetry go that way and become something that
>>>>>>>> alters ethdev data/config?
>>>>>>>
>>>>>>> There are at least two parts of data: config and status.
>>>>>>> Stats (which belong to status data) could help for debugging, so I think it's acceptable.
>>>>>>>
>>>>>>> As for concurrency issues: people should know what to do and when to do it, just like
>>>>>>> they don't invoke config APIs (e.g. dev_configure/dev_start/...) concurrently.
>>>>>>>
>>>>>> While this is probably ok for now, I think in next release we should look
>>>>>> to add some sort of support for locking for destructive ops in a future
>>>>>> release. For example, we could:
>>>>>>
>>>>>> 1. Add support for marking a callback as "destructive" and only allow it to
>>>>>> be called if only one connection is present or
>>>>>>
>>>>>> 2. Make it possible for callbacks to query the number of connections so
>>>>>> that the callback itself is non-destructive if more than one connection is
>>>>>> open.
>>>>>>
>>>>>> [Both of these will require locking support so that new connections aren't
>>>>>> opened when the callback is in-flight!]
>>>>>
>>>>> Besides telemetry, the application may have other consoles that could execute DPDK APIs.
>>>>> So I think we should try to keep it simple: it's up to the user to invoke.
>>>>
>>>> No, the user should not be responsible for concurrency issues.
>>>> We can ask the app developer to take care,
>>>> but not to the user who has no control on what happens in the app.
>>>>
>>>> On a more general note, I feel the expansion of telemetry is not controlled enough.
>>>> I would like to stop on adding more telemetry until we have a clear guideline
>>>> about what is telemetry for and how to use it.
>>>
>>> Hi Thomas,
>>>
>>> Should this be discussed on TB?
>>
>> What would be your question exactly?
>
> A general comment about telemetry:
>
> If an application exposes telemetry through an end user facing API, e.g. http(s) REST, it would be nice if non-read-only telemetry paths are easy to identify by following some DPDK standard convention, so the application does not need to manually maintain an allow-list of read-only paths.

+1 for this point.

>
> Bruce's documentation about trace/log/telemetry/dump might also need to be updated regarding non-read-only telemetry actions.

I just checked Bruce's patch [1] and noticed that it requires telemetry callbacks to be 'read-only':
(Telemetry callbacks should not modify any program state, but be "read-only").

From internal product usage, we find xstats-reset valuable for identifying problems, but this callback is not read-only.

We think telemetry callbacks should not be limited to 'read-only'. Perhaps we could develop some strategy to better manage
non-read-only callbacks (along the lines of Morten's advice).

[1]: https://patchwork.dpdk.org/project/dpdk/patch/20230620170728.74117-3-bruce.richardson@intel.com/
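
To make the idea a bit more concrete, here is a rough sketch of one possible strategy, combining Morten's point (non-read-only paths should be easy to identify) with Bruce's option 1 (a "destructive" callback is only allowed when a single connection is present). This is not an existing DPDK API: the flag, the table, the functions and the "/ethdev/xstats_reset" path name are all hypothetical and only for illustration. The locking Bruce mentioned is not covered here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-command flags; NOT part of the DPDK telemetry API. */
enum cmd_flags {
	CMD_READ_ONLY   = 0,
	CMD_DESTRUCTIVE = 1 << 0, /* callback modifies program state */
};

struct cmd_entry {
	const char *path;   /* telemetry-style command path */
	unsigned int flags; /* bitmask of enum cmd_flags */
};

/* Illustrative table; the xstats reset path name is an assumption. */
static const struct cmd_entry cmd_table[] = {
	{ "/ethdev/xstats",       CMD_READ_ONLY },
	{ "/ethdev/xstats_reset", CMD_DESTRUCTIVE },
};

/* Return the table entry for a path, or NULL if it is not registered. */
static const struct cmd_entry *
cmd_lookup(const char *path)
{
	size_t i;

	assert(path != NULL);
	for (i = 0; i < sizeof(cmd_table) / sizeof(cmd_table[0]); i++) {
		if (strcmp(cmd_table[i].path, path) == 0)
			return &cmd_table[i];
	}
	return NULL;
}

/*
 * Policy check: read-only commands are always allowed. Destructive
 * commands are allowed only if the application opted in AND exactly
 * one connection is open. Unknown paths are rejected.
 */
static bool
cmd_is_allowed(const char *path, bool allow_destructive,
	       unsigned int nb_connections)
{
	const struct cmd_entry *e = cmd_lookup(path);

	if (e == NULL)
		return false;
	if ((e->flags & CMD_DESTRUCTIVE) == 0)
		return true;
	return allow_destructive && nb_connections == 1;
}
```

With something like this, an application that exposes telemetry over REST could call cmd_is_allowed(path, false, n) for end-user requests and would not need to maintain its own allow-list of read-only paths.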