From: Nicholas Pratte
Date: Fri, 14 Jun 2024 13:39:15 -0400
Subject: Re: [PATCH v5 3/5] dts: add parsing utility module
To: Luca Vizzarro
Cc: dev@dpdk.org, Jeremy Spewock, Juraj Linkeš, Paul Szczepanek
In-Reply-To: <20240606213420.254260-4-luca.vizzarro@arm.com>
References: <20240412111136.3470304-1-luca.vizzarro@arm.com>
 <20240606213420.254260-1-luca.vizzarro@arm.com>
 <20240606213420.254260-4-luca.vizzarro@arm.com>

Tested-by: Nicholas Pratte
Reviewed-by: Nicholas Pratte

On Thu, Jun 6, 2024 at 5:34 PM Luca Vizzarro wrote:
>
> Adds a utility for parsing text into a custom dataclass. It provides a
> new `TextParser` dataclass to be inherited. This implements the `parse`
> method which, combined with the parser functions, can automatically
> parse the value for each field.
>
> This new utility will facilitate and simplify the parsing of complex
> command outputs, while ensuring that the codebase does not get bloated
> and stays flexible.
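
For anyone skimming the thread, the mechanism described above can be tried outside DTS with a minimal standalone sketch. It is not the patch's code: `MiniParser`, `find_first`, `PortInfo` and the `"parser_fn"` metadata key are hypothetical names chosen for illustration, and `ValueError` stands in for the DTS exception. It only shows the idea of storing a parser callable in each field's metadata and letting a factory classmethod run it against the text.

```python
import re
from dataclasses import MISSING, dataclass, field, fields
from typing import Any, Callable


def find_first(pattern: str) -> dict[str, Callable[[str], Any]]:
    """Hypothetical helper: build field metadata holding a one-group regex parser."""
    compiled = re.compile(pattern)

    def _parse(text: str) -> Any:
        m = compiled.search(text)
        return m.group(1) if m else None

    # "parser_fn" is an illustrative key, not the key used by the patch.
    return {"parser_fn": _parse}


@dataclass
class MiniParser:
    """Hypothetical stand-in for the base class introduced by the patch."""

    @classmethod
    def parse(cls, text: str):
        values = {}
        for f in fields(cls):
            parser = f.metadata.get("parser_fn")
            if parser is None:
                continue  # no parser: the dataclass default must cover this field
            value = parser(text)
            if value is not None:
                values[f.name] = value
            elif f.default is MISSING and f.default_factory is MISSING:
                # ValueError is used here only to keep the sketch self-contained.
                raise ValueError(f"no match for field {f.name} and no default")
        return cls(**values)


@dataclass
class PortInfo(MiniParser):
    name: str = field(metadata=find_first(r"Port (\w+)"))
    speed: str = field(default="unknown", metadata=find_first(r"speed (\d+)"))


port = PortInfo.parse("Port eth0 is up")
print(port)  # PortInfo(name='eth0', speed='unknown')
```

Here `speed` falls back to its default because its pattern finds nothing, while a text with no `Port ...` match would raise, since `name` has no default.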
>
> Signed-off-by: Luca Vizzarro
> Reviewed-by: Paul Szczepanek
> ---
>  dts/framework/exception.py |   9 ++
>  dts/framework/parser.py    | 229 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 238 insertions(+)
>  create mode 100644 dts/framework/parser.py
>
> diff --git a/dts/framework/exception.py b/dts/framework/exception.py
> index cce1e0231a..d9d690037d 100644
> --- a/dts/framework/exception.py
> +++ b/dts/framework/exception.py
> @@ -31,6 +31,8 @@ class ErrorSeverity(IntEnum):
>      #:
>      SSH_ERR = 4
>      #:
> +    INTERNAL_ERR = 5
> +    #:
>      DPDK_BUILD_ERR = 10
>      #:
>      TESTCASE_VERIFY_ERR = 20
> @@ -192,3 +194,10 @@ def __init__(self, suite_name: str) -> None:
>      def __str__(self) -> str:
>          """Add some context to the string representation."""
>          return f"Blocking suite {self._suite_name} failed."
> +
> +
> +class InternalError(DTSError):
> +    """An internal error or bug has occurred in DTS."""
> +
> +    #:
> +    severity: ClassVar[ErrorSeverity] = ErrorSeverity.INTERNAL_ERR
> diff --git a/dts/framework/parser.py b/dts/framework/parser.py
> new file mode 100644
> index 0000000000..741dfff821
> --- /dev/null
> +++ b/dts/framework/parser.py
> @@ -0,0 +1,229 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2024 Arm Limited
> +
> +"""Parsing utility module.
> +
> +This module provides :class:`~TextParser` which can be used to model any dataclass to a block of
> +text.
> +"""
> +
> +import re
> +from abc import ABC
> +from dataclasses import MISSING, dataclass, fields
> +from functools import partial
> +from typing import Any, Callable, TypedDict, cast
> +
> +from typing_extensions import Self
> +
> +from framework.exception import InternalError
> +
> +
> +class ParserFn(TypedDict):
> +    """Parser function in a dict compatible with the :func:`dataclasses.field` metadata param."""
> +
> +    #:
> +    TextParser_fn: Callable[[str], Any]
> +
> +
> +@dataclass
> +class TextParser(ABC):
> +    r"""Helper abstract dataclass that parses a text according to the fields' rules.
> +
> +    In order to enable text parsing in a dataclass, subclass it with :class:`TextParser`.
> +
> +    The provided `parse` method is a factory which parses the supplied text and creates an instance
> +    with populated dataclass fields. This takes text as an argument and for each field in the
> +    dataclass, the field's parser function is run against the whole text. The returned value is then
> +    assigned to the field of the new instance. If the field does not have a parser function its
> +    default value or factory is used instead. If no default is available either, an exception is
> +    raised.
> +
> +    This class provides a selection of parser functions and a function to wrap parser functions with
> +    generic functions. Parser functions are designed to be passed to the fields' metadata param. The
> +    most commonly used parser function is expected to be the `find` method, which runs a regular
> +    expression against the text to find matches.
> +
> +    Example:
> +        The following example makes use of and demonstrates every parser function available:
> +
> +        .. code:: python
> +
> +            from dataclasses import dataclass, field
> +            from enum import Enum
> +            from framework.parser import TextParser
> +
> +            class Colour(Enum):
> +                BLACK = 1
> +                WHITE = 2
> +
> +                @classmethod
> +                def from_str(cls, text: str):
> +                    match text:
> +                        case "black":
> +                            return cls.BLACK
> +                        case "white":
> +                            return cls.WHITE
> +                        case _:
> +                            return None # unsupported colour
> +
> +                @classmethod
> +                def make_parser(cls):
> +                    # make a parser function that finds a match and
> +                    # then makes it a Colour object through Colour.from_str
> +                    return TextParser.wrap(TextParser.find(r"is a (\w+)"), cls.from_str)
> +
> +            @dataclass
> +            class Animal(TextParser):
> +                kind: str = field(metadata=TextParser.find(r"is a \w+ (\w+)"))
> +                name: str = field(metadata=TextParser.find(r"^(\w+)"))
> +                colour: Colour = field(metadata=Colour.make_parser())
> +                age: int = field(metadata=TextParser.find_int(r"aged (\d+)"))
> +
> +            steph = Animal.parse("Stephanie is a white cat aged 10")
> +            print(steph) # Animal(kind='cat', name='Stephanie', colour=<Colour.WHITE: 2>, age=10)
> +    """
> +
> +    """============ BEGIN PARSER FUNCTIONS ============"""
> +
> +    @staticmethod
> +    def wrap(parser_fn: ParserFn, wrapper_fn: Callable) -> ParserFn:
> +        """Makes a wrapped parser function.
> +
> +        `parser_fn` is called and if a non-None value is returned, `wrapper_fn` is called with
> +        it. Otherwise the function returns early with None. In pseudo-code:
> +
> +            intermediate_value := parser_fn(input)
> +            if intermediate_value is None then
> +                output := None
> +            else
> +                output := wrapper_fn(intermediate_value)
> +
> +        Args:
> +            parser_fn: The dictionary storing the parser function to be wrapped.
> +            wrapper_fn: The function that wraps `parser_fn`.
> +
> +        Returns:
> +            ParserFn: A dictionary for the `dataclasses.field` metadata argument containing the
> +                newly wrapped parser function.
> +        """
> +        inner_fn = parser_fn["TextParser_fn"]
> +
> +        def _composite_parser_fn(text: str) -> Any:
> +            intermediate_value = inner_fn(text)
> +            if intermediate_value is None:
> +                return None
> +            return wrapper_fn(intermediate_value)
> +
> +        return ParserFn(TextParser_fn=_composite_parser_fn)
> +
> +    @staticmethod
> +    def find(
> +        pattern: str | re.Pattern[str],
> +        flags: re.RegexFlag = re.RegexFlag(0),
> +        named: bool = False,
> +    ) -> ParserFn:
> +        """Makes a parser function that finds a regular expression match in the text.
> +
> +        If the pattern has any capturing groups, it returns None if no match was found, otherwise a
> +        tuple containing the values per each group is returned. If the pattern has only one
> +        capturing group and a match was found, its value is returned. If the pattern has no
> +        capturing groups then either True or False is returned if the pattern had a match or not.
> +
> +        Args:
> +            pattern: The regular expression pattern.
> +            flags: The regular expression flags. Ignored if the given pattern is already compiled.
> +            named: If set to True only the named capturing groups will be returned, as a dictionary.
> +
> +        Returns:
> +            ParserFn: A dictionary for the `dataclasses.field` metadata argument containing the find
> +                parser function.
> +        """
> +        if isinstance(pattern, str):
> +            pattern = re.compile(pattern, flags)
> +
> +        def _find(text: str) -> Any:
> +            m = pattern.search(text)
> +            if m is None:
> +                return None if pattern.groups > 0 else False
> +
> +            if pattern.groups == 0:
> +                return True
> +
> +            if named:
> +                return m.groupdict()
> +
> +            matches = m.groups()
> +            if len(matches) == 1:
> +                return matches[0]
> +
> +            return matches
> +
> +        return ParserFn(TextParser_fn=_find)
> +
> +    @staticmethod
> +    def find_int(
> +        pattern: str | re.Pattern[str],
> +        flags: re.RegexFlag = re.RegexFlag(0),
> +        int_base: int = 0,
> +    ) -> ParserFn:
> +        """Makes a parser function that converts the match of :meth:`~find` to int.
> +
> +        This function is compatible only with a pattern containing one capturing group.
> +
> +        Args:
> +            pattern: The regular expression pattern.
> +            flags: The regular expression flags. Ignored if the given pattern is already compiled.
> +            int_base: The base of the number to convert from.
> +
> +        Raises:
> +            InternalError: If the pattern does not have exactly one capturing group.
> +
> +        Returns:
> +            ParserFn: A dictionary for the `dataclasses.field` metadata argument containing the
> +                :meth:`~find` parser function wrapped by the int built-in.
> +        """
> +        if isinstance(pattern, str):
> +            pattern = re.compile(pattern, flags)
> +
> +        if pattern.groups != 1:
> +            raise InternalError("only one capturing group is allowed with this parser function")
> +
> +        return TextParser.wrap(TextParser.find(pattern), partial(int, base=int_base))
> +
> +    """============ END PARSER FUNCTIONS ============"""
> +
> +    @classmethod
> +    def parse(cls, text: str) -> Self:
> +        """Creates a new instance of the class from the given text.
> +
> +        A new class instance is created with all the fields that have a parser function in their
> +        metadata. Fields without one are ignored and are expected to have a default value, otherwise
> +        the class initialization will fail.
> +
> +        A field is populated with the value returned by its corresponding parser function.
> +
> +        Args:
> +            text: the text to parse
> +
> +        Raises:
> +            InternalError: if the parser did not find a match and the field does not have a default
> +                value or default factory.
> +
> +        Returns:
> +            A new instance of the class.
> +        """
> +        fields_values = {}
> +        for field in fields(cls):
> +            parse = cast(ParserFn, field.metadata).get("TextParser_fn")
> +            if parse is None:
> +                continue
> +
> +            value = parse(text)
> +            if value is not None:
> +                fields_values[field.name] = value
> +            elif field.default is MISSING and field.default_factory is MISSING:
> +                raise InternalError(
> +                    f"parser for field {field.name} returned None, but the field has no default"
> +                )
> +
> +        return cls(**fields_values)
> --
> 2.34.1
>
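
As a quick reference for the return-value rules in the `find` docstring and the `int_base=0` default of `find_int`, here is a small sketch that uses only the standard `re` module and `functools.partial`, so it runs without the DTS framework. It mirrors what the quoted `_find` does with `Pattern.search`, `Match.groups` and `Match.groupdict`, but it is an illustration of those stdlib calls and not the module itself; the variable names are mine.

```python
import re
from functools import partial

text = "Stephanie is a white cat aged 10"

# No capturing groups: the outcome is a plain bool (match or no match).
has_match = re.compile(r"white cat").search(text) is not None   # True

# Exactly one capturing group: the single captured string is handed back.
age_str = re.compile(r"aged (\d+)").search(text).groups()[0]    # "10"

# Several capturing groups: a tuple with one value per group.
many = re.compile(r"is a (\w+) (\w+)").search(text).groups()    # ("white", "cat")

# named=True corresponds to groupdict(): only named groups, as a dictionary.
named = re.compile(r"is a (?P<colour>\w+) (?P<kind>\w+)").search(text).groupdict()

# find_int wraps the match with partial(int, base=int_base). With base 0 the
# string is interpreted like a Python integer literal, so prefixes are honoured.
to_int = partial(int, base=0)
age = to_int(age_str)        # 10
hex_value = to_int("0x1f")   # 31
```

One consequence of base 0 worth keeping in mind: a zero-padded decimal such as "010" is not a valid Python integer literal, so `int("010", base=0)` raises `ValueError`. Command outputs that pad numbers with zeros would need `int_base=10` passed explicitly.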