From: "Anthony Harivel"
To: "Robin Jarry"
Date: Mon, 22 Apr 2024 09:17:42 +0200
Subject: Re: [PATCH v2] usertools: add telemetry exporter
In-Reply-To: <20240416134620.64277-3-rjarry@redhat.com>
References: <20230926163442.844006-2-rjarry@redhat.com> <20240416134620.64277-3-rjarry@redhat.com>
List-Id: DPDK patches and discussions

Hi Robin,

I've tested your patch and this is all good for me.
The errors are better handled.

The most common one is when the telemetry socket is closed and you
get:

2024-04-19 15:22:00 ERROR 192.168.122.116 GET /metrics HTTP/1.1: telemetry socket not available
Traceback (most recent call last):
  File "/usr/bin/dpdk-telemetry-exporter.py", line 312, in do_GET
    with TelemetrySocket(self.server.dpdk_socket_path) as sock:
  File "/usr/bin/dpdk-telemetry-exporter.py", line 165, in __init__
    self.sock.connect(path)
FileNotFoundError: [Errno 2] No such file or directory

You get the Python traceback, which is a bit useless for the user, but
at least the first line gives the root cause:
"telemetry socket not available", which is IMO the most important.

Thanks for your patch!

Tested-by: Anthony Harivel

Regards,
Anthony

Robin Jarry, Apr 16, 2024 at 15:46:
> For now the telemetry socket is local to the machine running a DPDK
> application. Also, there is no official "schema" for the exposed
> metrics. Add a framework and a script to collect and expose these
> metrics to telemetry and observability aggregators such as Prometheus,
> Carbon or Influxdb. The exposed data must be done with end-users in
> mind, some DPDK terminology or internals may not make sense to everyone.
>
> The script only serves as an entry point and does not know anything
> about any specific metrics nor JSON data structures exposed in the
> telemetry socket.
>
> It uses dynamically loaded endpoint exporters which are basic python
> files that must implement two functions:
>
>     def info() -> dict[MetricName, MetricInfo]:
>         Mapping of metric names to their description and type.
>
>     def metrics(sock: TelemetrySocket) -> list[MetricValue]:
>         Request data from sock and return it as metric values. A metric
>         value is a 3-tuple: (name: str, value: any, labels: dict). Each
>         name must be present in info().
>
> The sock argument passed to metrics() has a single method:
>
>     def cmd(self, uri: str, arg: any = None) -> dict | list:
>         Request JSON data to the telemetry socket and parse it to python
>         values.
>
> The main script invokes endpoints and exports the data into an output
> format. For now, only two formats are implemented:
>
> * openmetrics/prometheus: text based format exported via a local HTTP
>   server.
> * carbon/graphite: binary (python pickle) format exported to a distant
>   carbon TCP server.
>
> As a starting point, 3 built-in endpoints are implemented:
>
> * counters: ethdev hardware counters
> * cpu: lcore usage
> * memory: overall memory usage
>
> The goal is to keep all built-in endpoints in the DPDK repository so
> that they can be updated along with the telemetry JSON data structures.
>
> Example output for the openmetrics:// format:
>
>   ~# dpdk-telemetry-exporter.py -o openmetrics://:9876 &
>   INFO using endpoint: counters (from .../telemetry-endpoints/counters.py)
>   INFO using endpoint: cpu (from .../telemetry-endpoints/cpu.py)
>   INFO using endpoint: memory (from .../telemetry-endpoints/memory.py)
>   INFO listening on port 9876
>   [1] 838829
>
>   ~$ curl http://127.0.0.1:9876/
>   # HELP dpdk_cpu_total_cycles Total number of CPU cycles.
>   # TYPE dpdk_cpu_total_cycles counter
>   # HELP dpdk_cpu_busy_cycles Number of busy CPU cycles.
>   # TYPE dpdk_cpu_busy_cycles counter
>   dpdk_cpu_total_cycles{cpu="73", numa="0"} 4353385274702980
>   dpdk_cpu_busy_cycles{cpu="73", numa="0"} 6215932860
>   dpdk_cpu_total_cycles{cpu="9", numa="0"} 4353385274745740
>   dpdk_cpu_busy_cycles{cpu="9", numa="0"} 6215932860
>   dpdk_cpu_total_cycles{cpu="8", numa="0"} 4353383451895540
>   dpdk_cpu_busy_cycles{cpu="8", numa="0"} 6171923160
>   dpdk_cpu_total_cycles{cpu="72", numa="0"} 4353385274817320
>   dpdk_cpu_busy_cycles{cpu="72", numa="0"} 6215932860
>   # HELP dpdk_memory_total_bytes The total size of reserved memory in bytes.
>   # TYPE dpdk_memory_total_bytes gauge
>   # HELP dpdk_memory_used_bytes The currently used memory in bytes.
>   # TYPE dpdk_memory_used_bytes gauge
>   dpdk_memory_total_bytes 1073741824
>   dpdk_memory_used_bytes 794197376
>
> Link: https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format
> Link: https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#text-format
> Link: https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-pickle-protocol
> Link: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/prometheus
> Signed-off-by: Robin Jarry
> ---
>
> Notes:
>     v2:
>
>     * Refuse to run if no endpoints are enabled.
>     * Handle endpoint errors gracefully without failing the whole query.
>
>  usertools/dpdk-telemetry-exporter.py      | 405 ++++++++++++++++++++++
>  usertools/meson.build                     |   6 +
>  usertools/telemetry-endpoints/counters.py |  47 +++
>  usertools/telemetry-endpoints/cpu.py      |  29 ++
>  usertools/telemetry-endpoints/memory.py   |  37 ++
>  5 files changed, 524 insertions(+)
>  create mode 100755 usertools/dpdk-telemetry-exporter.py
>  create mode 100644 usertools/telemetry-endpoints/counters.py
>  create mode 100644 usertools/telemetry-endpoints/cpu.py
>  create mode 100644 usertools/telemetry-endpoints/memory.py
>
> diff --git a/usertools/dpdk-telemetry-exporter.py b/usertools/dpdk-telemetry-exporter.py
> new file mode 100755
> index 000000000000..f8d873ad856c
> --- /dev/null
> +++ b/usertools/dpdk-telemetry-exporter.py
> @@ -0,0 +1,405 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright (c) 2023 Robin Jarry
> +
> +r'''
> +DPDK telemetry exporter.
> +
> +It uses dynamically loaded endpoint exporters which are basic python files that
> +must implement two functions:
> +
> +    def info() -> dict[MetricName, MetricInfo]:
> +        """
> +        Mapping of metric names to their description and type.
> +        """
> +
> +    def metrics(sock: TelemetrySocket) -> list[MetricValue]:
> +        """
> +        Request data from sock and return it as metric values. A metric value
> +        is a 3-tuple: (name: str, value: any, labels: dict). Each name must be
> +        present in info().
> +        """
> +
> +The sock argument passed to metrics() has a single method:
> +
> +    def cmd(self, uri, arg=None) -> dict | list:
> +        """
> +        Request JSON data to the telemetry socket and parse it to python
> +        values.
> +        """
> +
> +See existing endpoints for examples.
> +
> +The exporter supports multiple output formats:
> +
> +prometheus://ADDRESS:PORT
> +openmetrics://ADDRESS:PORT
> +  Expose the enabled endpoints via a local HTTP server listening on the
> +  specified address and port. GET requests on that server are served with
> +  text/plain responses in the prometheus/openmetrics format.
> +
> +  More details:
> +  https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format
> +
> +carbon://ADDRESS:PORT
> +graphite://ADDRESS:PORT
> +  Export all enabled endpoints to the specified TCP ADDRESS:PORT in the pickle
> +  carbon format.
> +
> +  More details:
> +  https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-pickle-protocol
> +'''
> +
> +import argparse
> +import importlib.util
> +import json
> +import logging
> +import os
> +import pickle
> +import re
> +import socket
> +import struct
> +import sys
> +import time
> +import typing
> +from http import HTTPStatus, server
> +from urllib.parse import urlparse
> +
> +LOG = logging.getLogger(__name__)
> +# Use local endpoints path only when running from source
> +LOCAL = os.path.join(os.path.dirname(__file__), "telemetry-endpoints")
> +DEFAULT_LOAD_PATHS = []
> +if os.path.isdir(LOCAL):
> +    DEFAULT_LOAD_PATHS.append(LOCAL)
> +DEFAULT_LOAD_PATHS += [
> +    "/usr/local/share/dpdk/telemetry-endpoints",
> +    "/usr/share/dpdk/telemetry-endpoints",
> +]
> +DEFAULT_OUTPUT = "openmetrics://:9876"
> +
> +
> +def main():
> +    logging.basicConfig(
> +        stream=sys.stdout,
> +        level=logging.INFO,
> +        format="%(asctime)s %(levelname)s %(message)s",
> +        datefmt="%Y-%m-%d %H:%M:%S",
> +    )
> +    parser = argparse.ArgumentParser(
> +        description=__doc__,
> +        formatter_class=argparse.RawDescriptionHelpFormatter,
> +    )
> +    parser.add_argument(
> +        "-o",
> +        "--output",
> +        metavar="FORMAT://PARAMETERS",
> +        default=urlparse(DEFAULT_OUTPUT),
> +        type=urlparse,
> +        help=f"""
> +        Output format (default: "{DEFAULT_OUTPUT}"). Depending on the format,
> +        URL elements have different meanings. By default, the exporter starts a
> +        local HTTP server on port 9876 that serves requests in the
> +        prometheus/openmetrics plain text format.
> +        """,
> +    )
> +    parser.add_argument(
> +        "-p",
> +        "--load-path",
> +        dest="load_paths",
> +        type=lambda v: v.split(os.pathsep),
> +        default=DEFAULT_LOAD_PATHS,
> +        help=f"""
> +        The list of paths from which to discover endpoints.
> +        (default: "{os.pathsep.join(DEFAULT_LOAD_PATHS)}").
> +        """,
> +    )
> +    parser.add_argument(
> +        "-e",
> +        "--endpoint",
> +        dest="endpoints",
> +        metavar="ENDPOINT",
> +        action="append",
> +        help="""
> +        Telemetry endpoint to export (by default, all discovered endpoints are
> +        enabled). This option can be specified more than once.
> +        """,
> +    )
> +    parser.add_argument(
> +        "-l",
> +        "--list",
> +        action="store_true",
> +        help="""
> +        Only list detected endpoints and exit.
> +        """,
> +    )
> +    parser.add_argument(
> +        "-s",
> +        "--socket-path",
> +        default="/run/dpdk/rte/dpdk_telemetry.v2",
> +        help="""
> +        The DPDK telemetry socket path (default: "%(default)s").
> +        """,
> +    )
> +    args = parser.parse_args()
> +    output = OUTPUT_FORMATS.get(args.output.scheme)
> +    if output is None:
> +        parser.error(f"unsupported output format: {args.output.scheme}://")
> +
> +    try:
> +        endpoints = load_endpoints(args.load_paths, args.endpoints)
> +        if args.list:
> +            return
> +    except Exception as e:
> +        parser.error(str(e))
> +
> +    output(args, endpoints)
> +
> +
> +class TelemetrySocket:
> +    """
> +    Abstraction of the DPDK telemetry socket.
> +    """
> +
> +    def __init__(self, path: str):
> +        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_SEQPACKET)
> +        self.sock.connect(path)
> +        data = json.loads(self.sock.recv(1024).decode())
> +        self.max_output_len = data["max_output_len"]
> +
> +    def cmd(
> +        self, uri: str, arg: typing.Any = None
> +    ) -> typing.Optional[typing.Union[dict, list]]:
> +        """
> +        Request JSON data to the telemetry socket and parse it to python
> +        values.
> +        """
> +        if arg is not None:
> +            u = f"{uri},{arg}"
> +        else:
> +            u = uri
> +        self.sock.send(u.encode("utf-8"))
> +        data = self.sock.recv(self.max_output_len)
> +        return json.loads(data.decode("utf-8"))[uri]
> +
> +    def __enter__(self):
> +        return self
> +
> +    def __exit__(self, *args, **kwargs):
> +        self.sock.close()
> +
> +
> +MetricDescription = str
> +MetricType = str
> +MetricName = str
> +MetricLabels = typing.Dict[str, typing.Any]
> +MetricInfo = typing.Tuple[MetricDescription, MetricType]
> +MetricValue = typing.Tuple[MetricName, typing.Any, MetricLabels]
> +
> +
> +class TelemetryEndpoint:
> +    """
> +    Placeholder class only used for typing annotations.
> +    """
> +
> +    @staticmethod
> +    def info() -> typing.Dict[MetricName, MetricInfo]:
> +        """
> +        Mapping of metric names to their description and type.
> +        """
> +        raise NotImplementedError()
> +
> +    @staticmethod
> +    def metrics(sock: TelemetrySocket) -> typing.List[MetricValue]:
> +        """
> +        Request data from sock and return it as metric values. Each metric
> +        name must be present in info().
> +        """
> +        raise NotImplementedError()
> +
> +
> +def load_endpoints(
> +    paths: typing.List[str], names: typing.List[str]
> +) -> typing.List[TelemetryEndpoint]:
> +    """
> +    Load selected telemetry endpoints from the specified paths.
> +    """
> +
> +    endpoints = {}
> +    dwb = sys.dont_write_bytecode
> +    sys.dont_write_bytecode = True  # never generate .pyc files for endpoints
> +
> +    for p in paths:
> +        if not os.path.isdir(p):
> +            continue
> +        for fname in os.listdir(p):
> +            f = os.path.join(p, fname)
> +            if os.path.isdir(f):
> +                continue
> +            try:
> +                name, _ = os.path.splitext(fname)
> +                if names is not None and name not in names:
> +                    # not selected by user
> +                    continue
> +                if name in endpoints:
> +                    # endpoint with same name already loaded
> +                    continue
> +                spec = importlib.util.spec_from_file_location(name, f)
> +                module = importlib.util.module_from_spec(spec)
> +                spec.loader.exec_module(module)
> +                endpoints[name] = module
> +            except Exception:
> +                LOG.exception("parsing endpoint: %s", f)
> +
> +    if not endpoints:
> +        raise Exception("no telemetry endpoints detected/selected")
> +
> +    sys.dont_write_bytecode = dwb
> +
> +    modules = []
> +    info = {}
> +    for name, module in sorted(endpoints.items()):
> +        LOG.info("using endpoint: %s (from %s)", name, module.__file__)
> +        try:
> +            for metric, (description, type_) in module.info().items():
> +                info[(name, metric)] = (description, type_)
> +            modules.append(module)
> +        except Exception:
> +            LOG.exception("getting endpoint info: %s", name)
> +    return modules
> +
> +
> +def serve_openmetrics(
> +    args: argparse.Namespace, endpoints: typing.List[TelemetryEndpoint]
> +):
> +    """
> +    Start an HTTP server and serve requests in the openmetrics/prometheus
> +    format.
> +    """
> +    listen = (args.output.hostname or "", int(args.output.port or 80))
> +    with server.HTTPServer(listen, OpenmetricsHandler) as httpd:
> +        httpd.dpdk_socket_path = args.socket_path
> +        httpd.telemetry_endpoints = endpoints
> +        LOG.info("listening on port %s", httpd.server_port)
> +        try:
> +            httpd.serve_forever()
> +        except KeyboardInterrupt:
> +            LOG.info("shutting down")
> +
> +
> +class OpenmetricsHandler(server.BaseHTTPRequestHandler):
> +    """
> +    Basic HTTP handler that returns prometheus/openmetrics formatted responses.
> +    """
> +
> +    CONTENT_TYPE = "text/plain; version=0.0.4; charset=utf-8"
> +
> +    def escape(self, value: typing.Any) -> str:
> +        """
> +        Escape a metric label value.
> +        """
> +        value = str(value)
> +        value = value.replace('"', '\\"')
> +        value = value.replace("\\", "\\\\")
> +        return value.replace("\n", "\\n")
> +
> +    def do_GET(self):
> +        """
> +        Called upon GET requests.
> +        """
> +        try:
> +            lines = []
> +            metrics_names = set()
> +            with TelemetrySocket(self.server.dpdk_socket_path) as sock:
> +                for e in self.server.telemetry_endpoints:
> +                    info = e.info()
> +                    metrics_lines = []
> +                    try:
> +                        metrics = e.metrics(sock)
> +                    except Exception:
> +                        LOG.exception("%s: metrics collection failed", e.__name__)
> +                        continue
> +                    for name, value, labels in metrics:
> +                        fullname = re.sub(r"\W", "_", f"dpdk_{e.__name__}_{name}")
> +                        labels = ", ".join(
> +                            f'{k}="{self.escape(v)}"' for k, v in labels.items()
> +                        )
> +                        if labels:
> +                            labels = f"{{{labels}}}"
> +                        metrics_lines.append(f"{fullname}{labels} {value}")
> +                        if fullname not in metrics_names:
> +                            metrics_names.add(fullname)
> +                            desc, metric_type = info[name]
> +                            lines += [
> +                                f"# HELP {fullname} {desc}",
> +                                f"# TYPE {fullname} {metric_type}",
> +                            ]
> +                    lines += metrics_lines
> +            if not lines:
> +                self.send_error(HTTPStatus.INTERNAL_SERVER_ERROR)
> +                LOG.error(
> +                    "%s %s: no metrics collected",
> +                    self.address_string(),
> +                    self.requestline,
> +                )
> +            body = "\n".join(lines).encode("utf-8") + b"\n"
> +            self.send_response(HTTPStatus.OK)
> +            self.send_header("Content-Type", self.CONTENT_TYPE)
> +            self.send_header("Content-Length", str(len(body)))
> +            self.end_headers()
> +            self.wfile.write(body)
> +            LOG.info("%s %s", self.address_string(), self.requestline)
> +
> +        except (FileNotFoundError, ConnectionRefusedError):
> +            self.send_error(HTTPStatus.SERVICE_UNAVAILABLE)
> +            LOG.exception(
> +                "%s %s: telemetry socket not available",
> +                self.address_string(),
> +                self.requestline,
> +            )
> +        except Exception:
> +            self.send_error(HTTPStatus.INTERNAL_SERVER_ERROR)
> +            LOG.exception("%s %s", self.address_string(), self.requestline)
> +
> +    def log_message(self, fmt, *args):
> +        pass  # disable built-in logger
> +
> +
> +def export_carbon(args: argparse.Namespace, endpoints: typing.List[TelemetryEndpoint]):
> +    """
> +    Collect all metrics and export them to a carbon server in the pickle format.
> +    """
> +    addr = (args.output.hostname or "", int(args.output.port or 80))
> +    with TelemetrySocket(args.socket_path) as dpdk:
> +        with socket.socket() as carbon:
> +            carbon.connect(addr)
> +            all_metrics = []
> +            for e in endpoints:
> +                try:
> +                    metrics = e.metrics(dpdk)
> +                except Exception:
> +                    LOG.exception("%s: metrics collection failed", e.__name__)
> +                    continue
> +                for name, value, labels in metrics:
> +                    fullname = re.sub(r"\W", ".", f"dpdk.{e.__name__}.{name}")
> +                    for key, val in labels.items():
> +                        val = str(val).replace(";", "")
> +                        fullname += f";{key}={val}"
> +                    all_metrics.append((fullname, (time.time(), value)))
> +            if not all_metrics:
> +                raise Exception("no metrics collected")
> +            payload = pickle.dumps(all_metrics, protocol=2)
> +            header = struct.pack("!L", len(payload))
> +            buf = header + payload
> +            carbon.sendall(buf)
> +
> +
> +OUTPUT_FORMATS = {
> +    "openmetrics": serve_openmetrics,
> +    "prometheus": serve_openmetrics,
> +    "carbon": export_carbon,
> +    "graphite": export_carbon,
> +}
> +
> +
> +if __name__ == "__main__":
> +    main()
> diff --git a/usertools/meson.build b/usertools/meson.build
> index 740b4832f36d..eb48e2f4403f 100644
> --- a/usertools/meson.build
> +++ b/usertools/meson.build
> @@ -11,5 +11,11 @@ install_data([
>              'dpdk-telemetry.py',
>              'dpdk-hugepages.py',
>              'dpdk-rss-flows.py',
> +            'dpdk-telemetry-exporter.py',
>          ],
>          install_dir: 'bin')
> +
> +install_subdir(
> +        'telemetry-endpoints',
> +        install_dir: 'share/dpdk',
> +        strip_directory: false)
> diff --git a/usertools/telemetry-endpoints/counters.py b/usertools/telemetry-endpoints/counters.py
> new file mode 100644
> index 000000000000..e17cffb43b2c
> --- /dev/null
> +++ b/usertools/telemetry-endpoints/counters.py
> @@ -0,0 +1,47 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright (c) 2023 Robin Jarry
> +
> +RX_PACKETS = "rx_packets"
> +RX_BYTES = "rx_bytes"
> +RX_MISSED = "rx_missed"
> +RX_NOMBUF = "rx_nombuf"
> +RX_ERRORS = "rx_errors"
> +TX_PACKETS = "tx_packets"
> +TX_BYTES = "tx_bytes"
> +TX_ERRORS = "tx_errors"
> +
> +
> +def info() -> "dict[Name, tuple[Description, Type]]":
> +    return {
> +        RX_PACKETS: ("Number of successfully received packets.", "counter"),
> +        RX_BYTES: ("Number of successfully received bytes.", "counter"),
> +        RX_MISSED: (
> +            "Number of packets dropped by the HW because Rx queues are full.",
> +            "counter",
> +        ),
> +        RX_NOMBUF: ("Number of Rx mbuf allocation failures.", "counter"),
> +        RX_ERRORS: ("Number of erroneous received packets.", "counter"),
> +        TX_PACKETS: ("Number of successfully transmitted packets.", "counter"),
> +        TX_BYTES: ("Number of successfully transmitted bytes.", "counter"),
> +        TX_ERRORS: ("Number of packet transmission failures.", "counter"),
> +    }
> +
> +
> +def metrics(sock: "TelemetrySocket") -> "list[tuple[Name, Value, Labels]]":
> +    out = []
> +    for port_id in sock.cmd("/ethdev/list"):
> +        port = sock.cmd("/ethdev/info", port_id)
> +        stats = sock.cmd("/ethdev/stats", port_id)
> +        labels = {"port": port["name"]}
> +        out += [
> +            (RX_PACKETS, stats["ipackets"], labels),
> +            (RX_PACKETS, stats["ipackets"], labels),
> +            (RX_BYTES, stats["ibytes"], labels),
> +            (RX_MISSED, stats["imissed"], labels),
> +            (RX_NOMBUF, stats["rx_nombuf"], labels),
> +            (RX_ERRORS, stats["ierrors"], labels),
> +            (TX_PACKETS, stats["opackets"], labels),
> +            (TX_BYTES, stats["obytes"], labels),
> +            (TX_ERRORS, stats["oerrors"], labels),
> +        ]
> +    return out
> diff --git a/usertools/telemetry-endpoints/cpu.py b/usertools/telemetry-endpoints/cpu.py
> new file mode 100644
> index 000000000000..d38d8d6e2558
> --- /dev/null
> +++ b/usertools/telemetry-endpoints/cpu.py
> @@ -0,0 +1,29 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright (c) 2023 Robin Jarry
> +
> +CPU_TOTAL = "total_cycles"
> +CPU_BUSY = "busy_cycles"
> +
> +
> +def info() -> "dict[Name, tuple[Description, Type]]":
> +    return {
> +        CPU_TOTAL: ("Total number of CPU cycles.", "counter"),
> +        CPU_BUSY: ("Number of busy CPU cycles.", "counter"),
> +    }
> +
> +
> +def metrics(sock: "TelemetrySocket") -> "list[tuple[Name, Value, Labels]]":
> +    out = []
> +    for lcore_id in sock.cmd("/eal/lcore/list"):
> +        lcore = sock.cmd("/eal/lcore/info", lcore_id)
> +        cpu = ",".join(str(c) for c in lcore.get("cpuset", []))
> +        total = lcore.get("total_cycles")
> +        busy = lcore.get("busy_cycles", 0)
> +        if not (cpu and total):
> +            continue
> +        labels = {"cpu": cpu, "numa": lcore.get("socket", 0)}
> +        out += [
> +            (CPU_TOTAL, total, labels),
> +            (CPU_BUSY, busy, labels),
> +        ]
> +    return out
> diff --git a/usertools/telemetry-endpoints/memory.py b/usertools/telemetry-endpoints/memory.py
> new file mode 100644
> index 000000000000..32cce1e59382
> --- /dev/null
> +++ b/usertools/telemetry-endpoints/memory.py
> @@ -0,0 +1,37 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright (c) 2023 Robin Jarry
> +
> +MEM_TOTAL = "total_bytes"
> +MEM_USED = "used_bytes"
> +
> +
> +def info() -> "dict[Name, tuple[Description, Type]]":
> +    return {
> +        MEM_TOTAL: ("The total size of reserved memory in bytes.", "gauge"),
> +        MEM_USED: ("The currently used memory in bytes.", "gauge"),
> +    }
> +
> +
> +def metrics(sock: "TelemetrySocket") -> "list[tuple[Name, Value, Labels]]":
> +    zones = {}
> +    used = 0
> +    for zone in sock.cmd("/eal/memzone_list") or []:
> +        z = sock.cmd("/eal/memzone_info", zone)
> +        start = int(z["Hugepage_base"], 16)
> +        end = start + (z["Hugepage_size"] * z["Hugepage_used"])
> +        used += z["Length"]
> +        for s, e in list(zones.items()):
> +            if s < start < e < end:
> +                zones[s] = end
> +                break
> +            if start < s < end < e:
> +                del zones[s]
> +                zones[start] = e
> +                break
> +        else:
> +            zones[start] = end
> +
> +    return [
> +        (MEM_TOTAL, sum(end - start for (start, end) in zones.items()), {}),
> +        (MEM_USED, max(0, used), {}),
> +    ]
> -- 
> 2.44.0
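
For anyone writing an extra endpoint against the info()/metrics()
contract described in the quoted commit message, here is a minimal
sketch; it is not part of the patch. The metric name total_packets and
the FakeSocket stub with its canned replies are hypothetical. Only the
/ethdev/list, /ethdev/info and /ethdev/stats commands and the
ipackets/opackets keys are taken from counters.py in the patch.

```python
# Hypothetical endpoint module, e.g. a file dropped into one of the
# telemetry-endpoints load paths. The metric name is an assumption.
TOTAL_PACKETS = "total_packets"


def info():
    """Mapping of metric names to their (description, type)."""
    return {
        TOTAL_PACKETS: ("Packets received plus packets transmitted.", "counter"),
    }


def metrics(sock):
    """Return one (name, value, labels) tuple per ethdev port."""
    out = []
    for port_id in sock.cmd("/ethdev/list"):
        port = sock.cmd("/ethdev/info", port_id)
        stats = sock.cmd("/ethdev/stats", port_id)
        total = stats["ipackets"] + stats["opackets"]
        out.append((TOTAL_PACKETS, total, {"port": port["name"]}))
    return out


class FakeSocket:
    """Stand-in for TelemetrySocket; the replies are made-up test data."""

    REPLIES = {
        ("/ethdev/list", None): [0],
        ("/ethdev/info", 0): {"name": "0000:00:00.0"},
        ("/ethdev/stats", 0): {"ipackets": 10, "opackets": 5},
    }

    def cmd(self, uri, arg=None):
        # Same call shape as TelemetrySocket.cmd(uri, arg) in the patch.
        return self.REPLIES[(uri, arg)]


if __name__ == "__main__":
    # Exercise the endpoint without a running DPDK application.
    for name, value, labels in metrics(FakeSocket()):
        assert name in info()  # every metric name must be declared
        print(name, value, labels)
```

Running the file directly checks that each returned name is declared in
info(), which is the one constraint the exporter relies on when it
emits the HELP/TYPE lines.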