DPDK CI discussions
From: Owen Hilyard <ohilyard@iol.unh.edu>
To: ci@dpdk.org
Cc: Thomas Monjalon <thomas@monjalon.net>
Subject: Re: [dpdk-ci] [PATCH] ci: added patch parser for patch files
Date: Thu, 14 Jan 2021 11:53:04 -0500
Message-ID: <CAHx6DYD1qRCaM=ZzEYXzxp2AB++v_ikHZ1Y8-2dHqhAx2RD4QA@mail.gmail.com>
In-Reply-To: <20201204194512.14666-1-ohilyard@iol.unh.edu>


A bit of fuzz testing found some edge cases where this script crashes or
fails to properly parse the patch file. I am currently working on a
rewrite using a dedicated library to avoid these and similar issues.

On Fri, Dec 4, 2020 at 2:45 PM Owen Hilyard <ohilyard@iol.unh.edu> wrote:

> This commit contains a script, patch_parser.py, and a config file,
> patch_parser.cfg. These are tooling that the UNH CI team has been
> testing in order to reduce the number of tests that need to be run
> per patch. This resulted from our push to increase the number of
> functional tests running in the CI. While working on expanding test
> coverage, we found that DTS could easily take over 6 hours to run, so
> we decided to begin work on tagging patches and then only running the
> required tests.
>
> The script takes the path to the config file followed by a list of
> patch files. It parses the patches and produces a list of tags for
> them based on the config file. The config file maps each base path to
> a set of tags. It also contains an ordered list of tag priorities so
> that it can also be used by hierarchical tools rather than modular
> ones.
>
> The UNH team's intention in giving this tooling to the wider DPDK
> community is to have people more familiar with the internal
> functionality of DPDK provide most of the tagging. This would allow
> UNH to have a better turnaround time for testing by eliminating
> unnecessary tests, while still increasing the number of tests in the
> CI.
>
> The different patch tags are currently defined as follows:
>
> core:
>     Core DPDK functionality. Examples include kernel modules and
>     librte_eal. This tag should be used sparingly, as it signals to
>     automated test suites that most of the DPDK tests must be run,
>     which will consume CI resources for a long period of time.
>
> driver:
>     For NIC drivers and other hardware interface code. This should be
>     used as a generic tag with each driver getting its own tag.
>
> application:
>     Used in a similar manner to "driver". This tag is intended for
>     code used only in applications that DPDK provides, such as
>     testpmd or helloworld. This tag should be accompanied by a tag
>     which denotes which application specifically has been changed.
>
> documentation:
>     This is intended to be used as a tag for paths which only contain
>     documentation, such as "doc/". Its intended use is as a way to
>     trigger the automatic re-building of the documentation website.
>
> Signed-off-by: Owen Hilyard <ohilyard@iol.unh.edu>
> ---
>  config/patch_parser.cfg | 25 ++++++++++++++++
>  tools/patch_parser.py   | 64 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 89 insertions(+)
>  create mode 100644 config/patch_parser.cfg
>  create mode 100755 tools/patch_parser.py
>
> diff --git a/config/patch_parser.cfg b/config/patch_parser.cfg
> new file mode 100644
> index 0000000..5757f9a
> --- /dev/null
> +++ b/config/patch_parser.cfg
> @@ -0,0 +1,25 @@
> +# Description of the categories as initially designed
> +
> +[Paths]
> +drivers =
> +    driver,
> +    core
> +kernel = core
> +doc = documentation
> +lib = core
> +meson_options.txt = core
> +examples = application
> +app = application
> +license = documentation
> +VERSION = documentation
> +build = core
> +
> +# This is an ordered list of the importance of each patch classification.
> +# It should be used to determine which classification to use on tools which
> +# do not support multiple patch classifications.
> +[Priority]
> +priority_list =
> +    core,
> +    driver,
> +    application,
> +    documentation
> diff --git a/tools/patch_parser.py b/tools/patch_parser.py
> new file mode 100755
> index 0000000..01fc55d
> --- /dev/null
> +++ b/tools/patch_parser.py
> @@ -0,0 +1,64 @@
> +#!/usr/bin/env python3
> +
> +import itertools
> +import sys
> +from configparser import ConfigParser
> +from typing import List, Dict, Set
> +
> +
> +def get_patch_files(patch_file: str) -> List[str]:
> +    with open(patch_file, 'r') as f:
> +        lines = list(itertools.takewhile(
> +            lambda line: line.strip().endswith('+') or line.strip().endswith('-'),
> +            itertools.dropwhile(
> +                lambda line: not line.strip().startswith("---"),
> +                f.readlines()
> +            )
> +        ))
> +        filenames = map(lambda line: line.strip().split(' ')[0], lines)
> +        # takewhile includes the --- which starts the filenames
> +        return list(filenames)[1:]
> +
> +
> +def get_all_files_from_patches(patch_files: List[str]) -> Set[str]:
> +    return set(itertools.chain.from_iterable(map(get_patch_files, patch_files)))
> +
> +
> +def parse_comma_delimited_list_from_string(mod_str: str) -> List[str]:
> +    return list(map(str.strip, mod_str.split(',')))
> +
> +
> +def get_dictionary_attributes_from_config_file(conf_obj: ConfigParser) -> Dict[str, Set[str]]:
> +    return {
> +        directory: parse_comma_delimited_list_from_string(module_string) for directory, module_string in
> +        conf_obj['Paths'].items()
> +    }
> +
> +
> +def get_tags_for_patch_file(patch_file: str, dir_attrs: Dict[str, Set[str]]) -> Set[str]:
> +    return set(itertools.chain.from_iterable(
> +        tags for directory, tags in dir_attrs.items() if patch_file.startswith(directory)
> +    ))
> +
> +
> +def get_tags_for_patches(patch_files: Set[str], dir_attrs: Dict[str, Set[str]]) -> Set[str]:
> +    return set(itertools.chain.from_iterable(
> +        map(lambda patch_file: get_tags_for_patch_file(patch_file, dir_attrs), patch_files)
> +    ))
> +
> +
> +if len(sys.argv) < 3:
> +    print("usage: patch_parser.py <path to patch_parser.cfg> <patch file>...")
> +    exit(1)
> +
> +conf_obj = ConfigParser()
> +conf_obj.read(sys.argv[1])
> +
> +patch_files = get_all_files_from_patches(sys.argv[2:])
> +dir_attrs = get_dictionary_attributes_from_config_file(conf_obj)
> +priority_list = parse_comma_delimited_list_from_string(conf_obj['Priority']['priority_list'])
> +
> +unordered_tags: Set[str] = get_tags_for_patches(patch_files, dir_attrs)
> +ordered_tags: List[str] = [tag for tag in priority_list if tag in unordered_tags]
> +
> +print("\n".join(ordered_tags))
> --
> 2.27.0
>
>
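
As a rough sketch of the path-to-tag mapping and priority ordering
described in the quoted patch, the following reads a config in the same
layout and returns the matching tags in priority order. It differs from
the posted script in two ways, both of which are assumptions rather
than anything proposed in this thread. First, it sets optionxform so
that ConfigParser keeps option names as written (by default it
lower-cases them, so an entry such as VERSION becomes version). Second,
it compares whole path components instead of using str.startswith, so
"lib" does not match "library/". The names load_config, split_list and
tags_for_paths are hypothetical.

```python
from configparser import ConfigParser
from pathlib import PurePosixPath
from typing import Iterable, List, Set


def load_config(text: str) -> ConfigParser:
    parser = ConfigParser()
    # Keep option names case-sensitive so "VERSION" stays "VERSION".
    parser.optionxform = str
    parser.read_string(text)
    return parser


def split_list(value: str) -> List[str]:
    """Split a comma-delimited (possibly multi-line) value into items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def tags_for_paths(paths: Iterable[str], parser: ConfigParser) -> List[str]:
    """Return the tags matching any path, ordered by the priority list."""
    path_tags = {
        base: split_list(value) for base, value in parser["Paths"].items()
    }
    found: Set[str] = set()
    for path in paths:
        parts = PurePosixPath(path).parts
        for base, tags in path_tags.items():
            base_parts = PurePosixPath(base).parts
            # Match only when the base is a leading run of path components.
            if parts[:len(base_parts)] == base_parts:
                found.update(tags)
    priority = split_list(parser["Priority"]["priority_list"])
    return [tag for tag in priority if tag in found]
```

A tool that supports only one classification per patch could take the
first element of the returned list, since the list follows the order
given in priority_list.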


Thread overview: 9+ messages
2020-12-04 19:45 Owen Hilyard
2021-01-14 16:53 ` Owen Hilyard [this message]
2021-01-14 17:59   ` [dpdk-ci] [PATCH V2] patch-tagging: Added tool to convert tags into dts execution file ohilyard
2021-01-14 18:19     ` Owen Hilyard
2021-04-15 20:42       ` Aaron Conole
2021-04-15 21:14         ` Owen Hilyard
2021-04-15 20:38 ` [dpdk-ci] [PATCH] ci: added patch parser for patch files Aaron Conole
2021-04-15 21:11   ` [dpdk-ci] [PATCH] patch parsing: Added library for parsing " ohilyard
2021-04-15 21:37     ` Aaron Conole
