From: Morten Brørup
To: "Dmitry Kozlyuk", "Stephen Hemminger"
Subject: RE: [RFC] Dynamic log/trace control via telemetry
Date: Wed, 17 Aug 2022 17:34:49 +0200
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D8727A@smartserver.smartshare.dk>

> From: Dmitry Kozlyuk [mailto:dmitry.kozliuk@gmail.com]
> Sent: Wednesday, 17 August 2022 17.15
>
> 2022-08-16 19:08 (UTC-0700), Stephen Hemminger:
> > Not sure if turning telemetry into a do all control api makes sense.
>
> I'm sure it doesn't, for "do all".
> Controlling diagnostic collection and output, however,
> is directly related to the telemetry purpose.
>
> > This seems like a different API.
I agree with Stephen regarding not making the telemetry library a "do all" control API. A separate API would be preferable.

A wrapper around that API can then be provided through the telemetry interface. Best of both worlds. :-)

> > Also, the default would have to be disabled for application safety
> > reasons.
>
> This feature would be for collecting additional info
> in case the collection was not planned and a restart is not desired.
> If it is disabled by default, it is likely to be off when it's needed.

All tracing, logging, etc. MUST be disabled by default. You are suggesting the opposite, which will definitely impact performance.

And if all of it is enabled by default, performance will become a valid argument for not adding more trace/logging to libraries.

And my usual rant: I hope all of this can be disabled at build time, for maximum performance.

>
> Let's consider how exactly can safety be compromised.
>
> 1. Securing telemetry socket access is out of scope for DPDK,
> that is, any successful access is considered trusted.
>
> 2. Even read-only telemetry still comes at cost, for example,
> memory telemetry takes a global lock that blocks all allocations,
> so affecting the app performance is already possible.
>
> 3. Important logs and traces enabled at startup may be disabled
> dynamically.
> If it's an issue, the API can refuse to disable them.
>
> 4. Bogus logs may flood the output and slow down the app.
> Bogus traces can exhaust disk space.
> Logs should be monitored automatically, so flooding is just an
> annoyance.
> Disk space can have a quota.
> Since the user is trusted (item 1), even if they do it by mistake,
> they can quickly correct themselves using the same API.
>

Here's a thought:

Add an API to set an "unlock key", so applications that don't want to allow these features for unauthorized users can prevent those users from enabling them. Authorized users can use an API to unlock these features by providing the key.
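
To make the idea a bit more concrete, here is a rough sketch in C of how a separate control API with an unlock key, plus a thin command wrapper on top of it, might fit together. All names (diag_ctl_*) are hypothetical and are not existing DPDK APIs; the single integer log level stands in for whatever log/trace settings would really be controlled, and the sketch assumes that an application which never sets a key leaves control open.

```c
#include <assert.h>   /* static_assert */
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DIAG_KEY_MAX 64
static_assert(DIAG_KEY_MAX > 1, "key buffer must hold at least one character");

/* Hypothetical state; none of this is an existing DPDK symbol. */
static char diag_key[DIAG_KEY_MAX];
static size_t diag_key_len;   /* 0: the application has not set a key */
static bool diag_unlocked;
static int diag_log_level;    /* 0: disabled, which is the default */

/* Called by the application at startup to lock dynamic control. */
int diag_ctl_set_unlock_key(const char *key)
{
	size_t len = key != NULL ? strlen(key) : 0;

	if (len == 0 || len >= DIAG_KEY_MAX)
		return -EINVAL;
	memcpy(diag_key, key, len + 1);
	diag_key_len = len;
	diag_unlocked = false;
	return 0;
}

/* Compare without an early exit on the first mismatching byte. */
static bool diag_key_matches(const char *candidate)
{
	size_t len = strlen(candidate);
	unsigned char diff = (unsigned char)(len != diag_key_len);

	for (size_t i = 0; i < diag_key_len; i++) {
		unsigned char c = i < len ? (unsigned char)candidate[i] : 0;
		diff |= (unsigned char)(c ^ (unsigned char)diag_key[i]);
	}
	return diff == 0;
}

/* Called on behalf of an authorized user to unlock the features. */
int diag_ctl_unlock(const char *key)
{
	if (key == NULL)
		return -EINVAL;
	if (diag_key_len == 0)
		return 0; /* no key set: nothing to unlock */
	if (!diag_key_matches(key))
		return -EACCES;
	diag_unlocked = true;
	return 0;
}

/* The separate control API: refuses changes while locked. */
int diag_ctl_set_log_level(int level)
{
	if (diag_key_len != 0 && !diag_unlocked)
		return -EPERM;
	if (level < 0)
		return -EINVAL;
	diag_log_level = level;
	return 0;
}

int diag_ctl_get_log_level(void)
{
	return diag_log_level;
}

/* Thin wrapper that a telemetry command handler could forward to. */
int diag_ctl_handle_cmd(const char *cmd, const char *params)
{
	if (cmd == NULL || params == NULL)
		return -EINVAL;
	if (strcmp(cmd, "unlock") == 0)
		return diag_ctl_unlock(params);
	if (strcmp(cmd, "log_level") == 0) {
		char *end;
		long level = strtol(params, &end, 10);

		if (end == params || *end != '\0' || level < 0 || level > INT_MAX)
			return -EINVAL;
		return diag_ctl_set_log_level((int)level);
	}
	return -ENOTSUP;
}
```

In this sketch the application would call diag_ctl_set_unlock_key() during initialization. Until someone supplies the same key through diag_ctl_unlock(), or through the "unlock" command, any attempt to change the level returns -EPERM. The policy lives entirely in the separate API, so the telemetry side stays a thin forwarder.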