From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <20240412111136.3470304-1-luca.vizzarro@arm.com>
 <20240606213420.254260-1-luca.vizzarro@arm.com>
 <20240606213420.254260-4-luca.vizzarro@arm.com>
In-Reply-To: <20240606213420.254260-4-luca.vizzarro@arm.com>
From: Nicholas Pratte <npratte@iol.unh.edu>
Date: Fri, 14 Jun 2024 13:39:15 -0400
Message-ID: <CAKXZ7egYphLfMhKF23baxk5kSFbW6is6YUn+Jcd3RYh=V1dSoA@mail.gmail.com>
Subject: Re: [PATCH v5 3/5] dts: add parsing utility module
To: Luca Vizzarro <luca.vizzarro@arm.com>
Cc: dev@dpdk.org, Jeremy Spewock <jspewock@iol.unh.edu>, 
 =?UTF-8?Q?Juraj_Linke=C5=A1?= <juraj.linkes@pantheon.tech>, 
 Paul Szczepanek <paul.szczepanek@arm.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Tested-by: Nicholas Pratte <npratte@iol.unh.edu>
Reviewed-by: Nicholas Pratte <npratte@iol.unh.edu>

On Thu, Jun 6, 2024 at 5:34 PM Luca Vizzarro <luca.vizzarro@arm.com> wrote:
>
> Adds a utility for parsing text into a custom dataclass. It provides a new
> `TextParser` dataclass to be inherited. This implements the `parse`
> method, which, combined with the parser functions, automatically
> parses the value for each field.
>
> This new utility will facilitate and simplify the parsing of complex
> command outputs, while ensuring that the codebase does not get bloated
> and stays flexible.
>
> Signed-off-by: Luca Vizzarro <luca.vizzarro@arm.com>
> Reviewed-by: Paul Szczepanek <paul.szczepanek@arm.com>
> ---
>  dts/framework/exception.py |   9 ++
>  dts/framework/parser.py    | 229 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 238 insertions(+)
>  create mode 100644 dts/framework/parser.py
>
> diff --git a/dts/framework/exception.py b/dts/framework/exception.py
> index cce1e0231a..d9d690037d 100644
> --- a/dts/framework/exception.py
> +++ b/dts/framework/exception.py
> @@ -31,6 +31,8 @@ class ErrorSeverity(IntEnum):
>      #:
>      SSH_ERR = 4
>      #:
> +    INTERNAL_ERR = 5
> +    #:
>      DPDK_BUILD_ERR = 10
>      #:
>      TESTCASE_VERIFY_ERR = 20
> @@ -192,3 +194,10 @@ def __init__(self, suite_name: str) -> None:
>      def __str__(self) -> str:
>          """Add some context to the string representation."""
>          return f"Blocking suite {self._suite_name} failed."
> +
> +
> +class InternalError(DTSError):
> +    """An internal error or bug has occurred in DTS."""
> +
> +    #:
> +    severity: ClassVar[ErrorSeverity] = ErrorSeverity.INTERNAL_ERR
> diff --git a/dts/framework/parser.py b/dts/framework/parser.py
> new file mode 100644
> index 0000000000..741dfff821
> --- /dev/null
> +++ b/dts/framework/parser.py
> @@ -0,0 +1,229 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2024 Arm Limited
> +
> +"""Parsing utility module.
> +
> +This module provides :class:`~TextParser` which can be used to model any dataclass to a block of
> +text.
> +"""
> +
> +import re
> +from abc import ABC
> +from dataclasses import MISSING, dataclass, fields
> +from functools import partial
> +from typing import Any, Callable, TypedDict, cast
> +
> +from typing_extensions import Self
> +
> +from framework.exception import InternalError
> +
> +
> +class ParserFn(TypedDict):
> +    """Parser function in a dict compatible with the :func:`dataclasses.field` metadata param."""
> +
> +    #:
> +    TextParser_fn: Callable[[str], Any]
> +
> +
> +@dataclass
> +class TextParser(ABC):
> +    r"""Helper abstract dataclass that parses a text according to the fields' rules.
> +
> +    In order to enable text parsing in a dataclass, subclass it with :class:`TextParser`.
> +
> +    The provided `parse` method is a factory which parses the supplied text and creates an instance
> +    with populated dataclass fields. This takes text as an argument and for each field in the
> +    dataclass, the field's parser function is run against the whole text. The returned value is then
> +    assigned to the field of the new instance. If the field does not have a parser function its
> +    default value or factory is used instead. If no default is available either, an exception is
> +    raised.
> +
> +    This class provides a selection of parser functions and a function to wrap parser functions with
> +    generic functions. Parser functions are designed to be passed to the fields' metadata param. The
> +    most commonly used parser function is expected to be the `find` method, which runs a regular
> +    expression against the text to find matches.
> +
> +    Example:
> +        The following example makes use of and demonstrates every parser function available:
> +
> +        .. code:: python
> +
> +            from dataclasses import dataclass, field
> +            from enum import Enum
> +            from framework.parser import TextParser
> +
> +            class Colour(Enum):
> +                BLACK = 1
> +                WHITE = 2
> +
> +                @classmethod
> +                def from_str(cls, text: str):
> +                    match text:
> +                        case "black":
> +                            return cls.BLACK
> +                        case "white":
> +                            return cls.WHITE
> +                        case _:
> +                            return None # unsupported colour
> +
> +                @classmethod
> +                def make_parser(cls):
> +                    # make a parser function that finds a match and
> +                    # then makes it a Colour object through Colour.from_str
> +                    return TextParser.wrap(TextParser.find(r"is a (\w+)"), cls.from_str)
> +
> +            @dataclass
> +            class Animal(TextParser):
> +                kind: str = field(metadata=TextParser.find(r"is a \w+ (\w+)"))
> +                name: str = field(metadata=TextParser.find(r"^(\w+)"))
> +                colour: Colour = field(metadata=Colour.make_parser())
> +                age: int = field(metadata=TextParser.find_int(r"aged (\d+)"))
> +
> +            steph = Animal.parse("Stephanie is a white cat aged 10")
> +            print(steph) # Animal(kind='cat', name='Stephanie', colour=<Colour.WHITE: 2>, age=10)
> +    """
> +
> +    """============ BEGIN PARSER FUNCTIONS ============"""
> +
> +    @staticmethod
> +    def wrap(parser_fn: ParserFn, wrapper_fn: Callable) -> ParserFn:
> +        """Makes a wrapped parser function.
> +
> +        `parser_fn` is called and if a non-None value is returned, `wrapper_fn` is called with
> +        it. Otherwise the function returns early with None. In pseudo-code:
> +
> +            intermediate_value := parser_fn(input)
> +            if intermediate_value is None then
> +                output := None
> +            else
> +                output := wrapper_fn(intermediate_value)
> +
> +        Args:
> +            parser_fn: The dictionary storing the parser function to be wrapped.
> +            wrapper_fn: The function that wraps `parser_fn`.
> +
> +        Returns:
> +            ParserFn: A dictionary for the `dataclasses.field` metadata argument containing the
> +                newly wrapped parser function.
> +        """
> +        inner_fn = parser_fn["TextParser_fn"]
> +
> +        def _composite_parser_fn(text: str) -> Any:
> +            intermediate_value = inner_fn(text)
> +            if intermediate_value is None:
> +                return None
> +            return wrapper_fn(intermediate_value)
> +
> +        return ParserFn(TextParser_fn=_composite_parser_fn)
> +
> +    @staticmethod
> +    def find(
> +        pattern: str | re.Pattern[str],
> +        flags: re.RegexFlag = re.RegexFlag(0),
> +        named: bool = False,
> +    ) -> ParserFn:
> +        """Makes a parser function that finds a regular expression match in the text.
> +
> +        If the pattern has any capturing groups, it returns None if no match was found, otherwise a
> +        tuple containing the values per each group is returned. If the pattern has only one
> +        capturing group and a match was found, its value is returned. If the pattern has no
> +        capturing groups then either True or False is returned if the pattern had a match or not.
> +
> +        Args:
> +            pattern: The regular expression pattern.
> +            flags: The regular expression flags. Ignored if the given pattern is already compiled.
> +            named: If set to True only the named capturing groups will be returned, as a dictionary.
> +
> +        Returns:
> +            ParserFn: A dictionary for the `dataclasses.field` metadata argument containing the find
> +                parser function.
> +        """
> +        if isinstance(pattern, str):
> +            pattern = re.compile(pattern, flags)
> +
> +        def _find(text: str) -> Any:
> +            m = pattern.search(text)
> +            if m is None:
> +                return None if pattern.groups > 0 else False
> +
> +            if pattern.groups == 0:
> +                return True
> +
> +            if named:
> +                return m.groupdict()
> +
> +            matches = m.groups()
> +            if len(matches) == 1:
> +                return matches[0]
> +
> +            return matches
> +
> +        return ParserFn(TextParser_fn=_find)
> +
> +    @staticmethod
> +    def find_int(
> +        pattern: str | re.Pattern[str],
> +        flags: re.RegexFlag = re.RegexFlag(0),
> +        int_base: int = 0,
> +    ) -> ParserFn:
> +        """Makes a parser function that converts the match of :meth:`~find` to int.
> +
> +        This function is compatible only with a pattern containing one capturing group.
> +
> +        Args:
> +            pattern: The regular expression pattern.
> +            flags: The regular expression flags. Ignored if the given pattern is already compiled.
> +            int_base: The base of the number to convert from.
> +
> +        Raises:
> +            InternalError: If the pattern does not have exactly one capturing group.
> +
> +        Returns:
> +            ParserFn: A dictionary for the `dataclasses.field` metadata argument containing the
> +                :meth:`~find` parser function wrapped by the int built-in.
> +        """
> +        if isinstance(pattern, str):
> +            pattern = re.compile(pattern, flags)
> +
> +        if pattern.groups != 1:
> +            raise InternalError("only one capturing group is allowed with this parser function")
> +
> +        return TextParser.wrap(TextParser.find(pattern), partial(int, base=int_base))
> +
> +    """============ END PARSER FUNCTIONS ============"""
> +
> +    @classmethod
> +    def parse(cls, text: str) -> Self:
> +        """Creates a new instance of the class from the given text.
> +
> +        A new class instance is created with all the fields that have a parser function in their
> +        metadata. Fields without one are ignored and are expected to have a default value, otherwise
> +        the class initialization will fail.
> +
> +        A field is populated with the value returned by its corresponding parser function.
> +
> +        Args:
> +            text: the text to parse
> +
> +        Raises:
> +            InternalError: if the parser did not find a match and the field does not have a default
> +                value or default factory.
> +
> +        Returns:
> +            A new instance of the class.
> +        """
> +        fields_values = {}
> +        for field in fields(cls):
> +            parse = cast(ParserFn, field.metadata).get("TextParser_fn")
> +            if parse is None:
> +                continue
> +
> +            value = parse(text)
> +            if value is not None:
> +                fields_values[field.name] = value
> +            elif field.default is MISSING and field.default_factory is MISSING:
> +                raise InternalError(
> +                    f"parser for field {field.name} returned None, but the field has no default"
> +                )
> +                )
> +
> +        return cls(**fields_values)
> --
> 2.34.1
>
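
For anyone following the thread, here is a minimal, self-contained sketch of
the mechanism the patch describes: a parser function stored in the field
metadata is run against the whole text, and the dataclass default is used when
a grouped pattern finds nothing. It does not import framework.parser. The
`find` and `parse` functions below are simplified stand-ins written only for
illustration (unnamed groups only, ValueError in place of InternalError), and
the `Port` class and its sample input are hypothetical.

```python
import re
from dataclasses import MISSING, dataclass, field, fields
from typing import Any, Callable


def find(pattern: str) -> dict[str, Callable[[str], Any]]:
    """Simplified stand-in for TextParser.find (unnamed groups only)."""
    compiled = re.compile(pattern)

    def _find(text: str) -> Any:
        m = compiled.search(text)
        if m is None:
            # With groups, None means "no value"; without groups, report False.
            return None if compiled.groups > 0 else False
        if compiled.groups == 0:
            return True
        groups = m.groups()
        # A single group is unwrapped; several groups come back as a tuple.
        return groups[0] if len(groups) == 1 else groups

    return {"TextParser_fn": _find}


def parse(cls, text: str):
    """Simplified stand-in for TextParser.parse, written as a free function."""
    values = {}
    for f in fields(cls):
        fn = f.metadata.get("TextParser_fn")
        if fn is None:
            continue  # no parser: the dataclass default must supply the value
        value = fn(text)
        if value is not None:
            values[f.name] = value
        elif f.default is MISSING and f.default_factory is MISSING:
            raise ValueError(f"no match and no default for field {f.name}")
    return cls(**values)


@dataclass
class Port:  # hypothetical example class, not part of the patch
    name: str = field(metadata=find(r"^(\w+)"))
    speed: str = field(default="unknown", metadata=find(r"speed (\d+)"))
    link_up: bool = field(default=False, metadata=find(r"link up"))


port = parse(Port, "eth0 link up")
print(port)  # Port(name='eth0', speed='unknown', link_up=True)
```

One detail worth keeping in mind when writing fields this way: a pattern
without capturing groups returns False rather than None when it does not
match, so the field is assigned False and its default is never consulted.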