From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9294E459BD; Tue, 17 Sep 2024 13:13:15 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2C2C040A6F; Tue, 17 Sep 2024 13:13:15 +0200 (CEST) Received: from mail-wm1-f48.google.com (mail-wm1-f48.google.com [209.85.128.48]) by mails.dpdk.org (Postfix) with ESMTP id C2BF840652 for ; Tue, 17 Sep 2024 13:13:13 +0200 (CEST) Received: by mail-wm1-f48.google.com with SMTP id 5b1f17b1804b1-42cba6cdf32so45638295e9.1 for ; Tue, 17 Sep 2024 04:13:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pantheon.tech; s=google; t=1726571593; x=1727176393; darn=dpdk.org; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:subject:from:user-agent:mime-version:date:message-id:from:to :cc:subject:date:message-id:reply-to; bh=ihZgJf+SQJoJumiMQkVJFj+zn8IWXqC342v7g1Sicak=; b=aLBSVzXI81w7cH/QQDGZ0tN3TalrecfZiUFv+iJc7ee6cWhCX6WQh6+fkq5Qm5NieN ar8pDI6P4nWh0MN0WZcAN8BuPMdPizlQ4hGRTJJljS95hcIrpzlCF9KwvfTLyWA99I67 LHMLzsoZ3mnevyx4A8nitj4YK2SAKt1e5DZ/+60spLMm26vb3HJtZoRLO2lU5ll3XIiF 1aSnRttKMYZQRozM1Z+kXpX/rHj7D9+CWgS7ncIEWsBkeefHhp+No6vduhQ+FC1uD/rW OPc3jA/zJdBmuL/IAThL8Aj7xT0rHOea0xwXdGazfxkEKcw7Q/UT5rRJMqFYJa09JsBi 8zPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726571593; x=1727176393; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:subject:from:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ihZgJf+SQJoJumiMQkVJFj+zn8IWXqC342v7g1Sicak=; b=QdjVrQLTOOKLcWs5JTYXr4EEK34ucLTCcs6oD0rX7kKluACPmQ99xiZqt3DnyICkk0 8/JKVq/2qyrgTkjZJ2USAvD/2dlQep8JyrLJ+gP3z4nV0ym+nokdi42fiiqOJj5yP79p Kh6fgklHfLIaJAjxWDu9kd3sEruB618BD80ccx3FEkBJXaVagRItB+ddZGkCBe7UqGYh tgAOrNaN3m0MJr4ImedlcDZwaiXfVW33LdPSCO1iMtnfwDUpkqW37JAKBqoVw1xCrlY1 f8dGyz1IgvmOeHhM5V8VPcA46Sck6eVkB0W27FIBA2cvh7irf4HW/3yP1VuBgxAihuMw VNTA== X-Forwarded-Encrypted: i=1; AJvYcCUQAYNH61irYFVqAGRdGuG4SVkDOdydxzda6Fxogh1DhgQ9sya/JqbnBrQUQNalB5gdTms=@dpdk.org X-Gm-Message-State: AOJu0YzVZPGjtimOD7B/IsAiqO82euOYzKzW2za0E8xBKZyQsG6gbh/4 iEcgu98nL1wDOrvN+Mfdgx/EIddqnkphwGBgcmmB5FyTDYeIEXtUVYCuNj8YB2w= X-Google-Smtp-Source: AGHT+IGxmaEjQ6qZLJJuJ1/zog8HFZzbZ4fMooeN97HsYQq2Q45Nz79umS5on4dKwNe5s/OJtgv+ow== X-Received: by 2002:a05:600c:4fcf:b0:42c:b16e:7a22 with SMTP id 5b1f17b1804b1-42cdb547d65mr117265085e9.12.1726571592958; Tue, 17 Sep 2024 04:13:12 -0700 (PDT) Received: from [10.12.0.236] (81.89.53.154.host.vnet.sk. [81.89.53.154]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-42da22b8b65sm100745465e9.8.2024.09.17.04.13.12 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 17 Sep 2024 04:13:12 -0700 (PDT) Message-ID: <955abe98-9b96-44c5-8ec2-969a8da20365@pantheon.tech> Date: Tue, 17 Sep 2024 13:13:11 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird From: =?UTF-8?Q?Juraj_Linke=C5=A1?= Subject: Re: [PATCH 3/5] dts: use Pydantic in the configuration To: Luca Vizzarro , dev@dpdk.org Cc: Honnappa Nagarahalli , Paul Szczepanek References: <20240822163941.1390326-1-luca.vizzarro@arm.com> <20240822163941.1390326-4-luca.vizzarro@arm.com> Content-Language: en-US In-Reply-To: <20240822163941.1390326-4-luca.vizzarro@arm.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 22. 8. 2024 18:39, Luca Vizzarro wrote: > This change brings in Pydantic in place of Warlock. Pydantic offers > a built-in model validation system in the classes, which allows for > a more resilient and simpler code. As a consequence of this change: > > - most validation is now built-in > - further validation is added to verify: > - cross referencing of node names and ports > - test suite and test cases names > - dictionaries representing the config schema are removed > - the config schema is no longer used for validation but kept as an > alternative format for the developer If it's not used, we should remove it right away (in this patch). I see that it's updated in v5, but we can just add it back. > - the config schema can now be generated automatically from the > Pydantic models > - the TrafficGeneratorType enum has been changed from inheriting > StrEnum to the native str and Enum. This change was necessary to > enable the discriminator for object unions > - the structure of the classes has been slightly changed to perfectly > match the structure of the configuration files > - updates the test suite argument to catch the ValidationError that > TestSuiteConfig can now raise Passive voice is not used here, but the rest of the bullet points are using it. > delete mode 100644 dts/framework/config/types.py A note, don't forget to remove this from doc sources if those get merged before this. > > diff --git a/dts/framework/config/__init__.py b/dts/framework/config/__init__.py > @@ -2,17 +2,19 @@ > -the YAML test run configuration file > -and validates it according to :download:`the schema `. > +the YAML test run configuration file and validates it against the :class:`Configuration` Pydantic > +dataclass model. The Pydantic model is also available as > +:download:`JSON schema `. This second sentence should be moved to the last patch. > @@ -33,29 +35,31 @@ > +) > +from pydantic.config import JsonDict > +from pydantic.dataclasses import dataclass We should probably distinguish between built-in dataclasses and pydantic dataclasses (as pydantic adds the extra argument). Importing them as pydantic_dataclass seems like the easiest way to achieve this. > @@ -116,14 +120,14 @@ class Compiler(StrEnum): > @unique > -class TrafficGeneratorType(StrEnum): > +class TrafficGeneratorType(str, Enum): > """The supported traffic generators.""" > > #: > - SCAPY = auto() > + SCAPY = "SCAPY" Do discriminators not work with auto()? > -@dataclass(slots=True, frozen=True) > +@dataclass(slots=True, frozen=True, kw_only=True, config=ConfigDict(extra="forbid")) Is there any special reason for kw_only? Maybe we should add the reason for this (and also the config arg) to the module dosctring and commit msg. > @@ -136,12 +140,17 @@ class HugepageConfiguration: > -@dataclass(slots=True, frozen=True) > +PciAddress = Annotated[ > + str, StringConstraints(pattern=r"^[\da-fA-F]{4}:[\da-fA-F]{2}:[\da-fA-F]{2}.\d:?\w*$") We have a pattern for this in utils.py. We can reuse and maybe update it if needed. > +] > +"""A constrained string type representing a PCI address.""" This should be above the var and I think regular comment (with #) should suffice. > @@ -150,69 +159,53 @@ class PortConfig: > +TrafficGeneratorConfigTypes = Annotated[ScapyTrafficGeneratorConfig, Field(discriminator="type")] > > -@dataclass(slots=True, frozen=True) > -class ScapyTrafficGeneratorConfig(TrafficGeneratorConfig): > - """Scapy traffic generator specific configuration.""" > > - pass > +LogicalCores = Annotated[ > + str, > + StringConstraints(pattern=r"^(([0-9]+|([0-9]+-[0-9]+))(,([0-9]+|([0-9]+-[0-9]+)))*)?$"), > + Field( > + description="Comma-separated list of logical cores to use. " > + "An empty string means use all lcores.", > + examples=["1,2,3,4,5,18-22", "10-15"], > + ), > +] These two types don't have have a docstring, but others have. > @@ -232,69 +225,25 @@ class NodeConfiguration: > arch: Architecture > os: OS Adding the descriptions to all fields would be beneficial. Do we want to do that in this patch? > @@ -313,10 +264,14 @@ class TGNodeConfiguration(NodeConfiguration): > - traffic_generator: TrafficGeneratorConfig > + traffic_generator: TrafficGeneratorConfigTypes > + > > +NodeConfigurationTypes = TGNodeConfiguration | SutNodeConfiguration > +"""Union type for all the node configuration types.""" Same note as with PciAddress. > @@ -405,31 +369,63 @@ class TestSuiteConfig: > + test_suite_name: str = Field( > + title="Test suite name", > + description="The identifying name of the test suite.", I think we need to update this to mention that it's the test suite module name. Maybe we can also update the field, as it's only used in this object. > + alias="test_suite", > + ) > + test_cases_names: list[str] = Field( > + default_factory=list, > + title="Test cases by name", > + description="The identifying name of the test cases of the test suite.", > + alias="test_cases", > + ) The attributes under Attributes need to be updated. > + > + @cached_property > + def test_suite_spec(self) -> "TestSuiteSpec": > + """The specification of the requested test suite.""" > + from framework.test_suite import find_by_name > + > + test_suite_spec = find_by_name(self.test_suite_name) > + assert test_suite_spec is not None, f"{self.test_suite_name} is not a valid test suite name" Doesn't end with a dot; the message should also mention that we're dealing with module name. > + return test_suite_spec > + > + @model_validator(mode="before") I think it makes sense to exlude these from docs. I tried putting :meta private: into a docstring and it seems to be working. > @classmethod > - def from_dict( > - cls, > - entry: str | TestSuiteConfigDict, > - ) -> Self: > - """Create an instance from two different types. > + def convert_from_string(cls, data: Any) -> Any: > + """Convert the string representation into a valid mapping.""" > + if isinstance(data, str): > + [test_suite, *test_cases] = data.split() > + return dict(test_suite=test_suite, test_cases=test_cases) > + return data > + Why is this here? To unify the format with the one accepted by the --test-suite argument? Do we want to add an alternative format? If so, we need to make sure we document clearly that there are two alternatives and that they're equivalent. > + @model_validator(mode="after") > + def validate_names(self) -> Self: > + """Validate the supplied test suite and test cases names.""" In Configuration.validate_test_runs_with_nodes() the docstring mentions the use of the cached property, let's also do that here. > + available_test_cases = map(lambda t: t.name, self.test_suite_spec.test_cases) > + for requested_test_case in self.test_cases_names: > + assert requested_test_case in available_test_cases, ( > + f"{requested_test_case} is not a valid test case " > + f"for test suite {self.test_suite_name}" for test suite -> of test suite; also end with a dot. The dot is missing in a lot of places (and capital letters where the message doesn't start with a var value). > @@ -442,143 +438,132 @@ class TestRunConfiguration: > -@dataclass(slots=True, frozen=True) > + > +@dataclass(frozen=True, kw_only=True) > class Configuration: > """DTS testbed and test configuration. > > - The node configuration is not stored in this object. Rather, all used node configurations > - are stored inside the test run configuration where the nodes are actually used. > - I think it makes sense to explain the extra validation (with the @*_validator decorators) that's being done in the docstring (if we remove the validation methods from the generated docs). The docstring should be updated for each model that doing the extra validation. > Attributes: > test_runs: Test run configurations. > + nodes: Node configurations. > """ > > - test_runs: list[TestRunConfiguration] > + test_runs: list[TestRunConfiguration] = Field(min_length=1) > + nodes: list[NodeConfigurationTypes] = Field(min_length=1) > > + @field_validator("nodes") > @classmethod > - def from_dict(cls, d: ConfigurationDict) -> Self: > - """A convenience method that processes the inputs before creating an instance. > + def validate_node_names(cls, nodes: list[NodeConfiguration]) -> list[NodeConfiguration]: > + """Validate that the node names are unique.""" > + nodes_by_name: dict[str, int] = {} > + for node_no, node in enumerate(nodes): > + assert node.name not in nodes_by_name, ( > + f"node {node_no} cannot have the same name as node {nodes_by_name[node.name]} " > + f"({node.name})" > + ) > + nodes_by_name[node.name] = node_no > + > + return nodes > + > + @model_validator(mode="after") > + def validate_ports(self) -> Self: > + """Validate that the ports are all linked to valid ones.""" > + port_links: dict[tuple[str, str], Literal[False] | tuple[int, int]] = { > + (node.name, port.pci): False for node in self.nodes for port in node.ports > + } > + > + for node_no, node in enumerate(self.nodes): I could see why we're use enumeration for nodes in validate_node_names, but here we can just use node names in assert messages. At least that should be the case if nodes get validated before this model validator runs - it that the case? > + for port_no, port in enumerate(node.ports): > + peer_port_identifier = (port.peer_node, port.peer_pci) > + peer_port = port_links.get(peer_port_identifier, None) > + assert peer_port is not None, ( > + "invalid peer port specified for " f"nodes.{node_no}.ports.{port_no}" > + ) > + assert peer_port is False, ( > + f"the peer port specified for nodes.{node_no}.ports.{port_no} " > + f"is already linked to nodes.{peer_port[0]}.ports.{peer_port[1]}" > + ) > + port_links[peer_port_identifier] = (node_no, port_no) > > + @cached_property > + def test_runs_with_nodes(self) -> list[TestRunWithNodesConfiguration]: Let's move the property to be the first member of the class, to unify the order it with TestSuiteConfig. > + """List test runs with the associated nodes.""" This doesn't list the test runs. I think the docstring should say a bit more to make it obvious that this is the main attribute to use with this class. Or maybe that could be in the the class's docstring. We're also missing the Returns: section. > + test_runs_with_nodes = [] > > - Returns: > - The whole configuration instance. > - """ > - nodes: list[SutNodeConfiguration | TGNodeConfiguration] = list( > - map(NodeConfiguration.from_dict, d["nodes"]) > - ) > - assert len(nodes) > 0, "There must be a node to test" > + for test_run_no, test_run in enumerate(self.test_runs): > + sut_node_name = test_run.system_under_test_node.node_name > + sut_node = next(filter(lambda n: n.name == sut_node_name, self.nodes), None) There are a number of these instead of a list comprehension (I mentioned this in a previous patch). I still don't really see a reason to not use list comprehensions in all these cases. > + > + > +ConfigurationType = TypeAdapter(Configuration) This new transformed class exists only for validation purposes, right? I think we can move this to load_config, as it's not going to be used anywhere else. Also I'd rename it to something else, it's not a type. Maybe ConfigurationAdapter or PydanticConfiguration or ConfigurationModel (as the adapter adds some methods from BaseModel). Or something else, but the type in the name confused me. > diff --git a/dts/framework/runner.py b/dts/framework/runner.py > @@ -231,10 +234,10 @@ def _get_test_suites_with_cases( > test_suites_with_cases = [] > > for test_suite_config in test_suite_configs: > - test_suite_class = self._get_test_suite_class(test_suite_config.test_suite) > + test_suite_class = self._get_test_suite_class(test_suite_config.test_suite_name) We've already done all the validation and importing at this point and we should be able to use test_suite_config.test_suite_spec, right? The same is true for TestSuiteWithCases, which holds the same information. Looks like you removed _get_test_suite_class in a subsequent patch, but we should think about getting rid of TestSuiteWithCases, as it was conceived to do what TestSuiteSpec is doing.