DPDK CI discussions
 help / color / Atom feed
* [dpdk-ci] [PATCH v2] add script to decide best tree match for patches
@ 2019-02-12 14:48 Ali Alnubani
  2019-02-15 21:28 ` Jeremy Plsek
  2019-02-16 16:03 ` [dpdk-ci] [PATCH v3] " Ali Alnubani
  0 siblings, 2 replies; 15+ messages in thread
From: Ali Alnubani @ 2019-02-12 14:48 UTC (permalink / raw)
  To: ci; +Cc: Thomas Monjalon, ferruh.yigit, jplsek, Ori Kam

The script can be used to get the trees that best match
a patch or a series.

Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Signed-off-by: Ori Kam <orika@mellanox.com>
---
Changes in v2:
	- Renamed script.
	- Updated license.

 tools/guess-git-tree.py | 244 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 244 insertions(+)
 create mode 100755 tools/guess-git-tree.py

diff --git a/tools/guess-git-tree.py b/tools/guess-git-tree.py
new file mode 100755
index 0000000..f4caef5
--- /dev/null
+++ b/tools/guess-git-tree.py
@@ -0,0 +1,244 @@
+#!/usr/bin/env python
+
+# SPDX-License-Identifier: (BSD-3-Clause AND GPL-2.0-or-later AND MIT)
+# Copyright 2019 Mellanox Technologies, Ltd
+
+import os
+import sys
+import re
+import argparse
+import copy
+import fnmatch
+
+from requests.exceptions import HTTPError
+
+from git_pw import config
+from git_pw import api
+from git_pw import utils
+
+"""
+This script uses the git-pw API to retrieve Patchwork's series/patches,
+and find a list of trees/repos that best match the series/patch.
+
+The rules on which matches are based, are taken from the MAINTAINERS file,
+and currently only based on the paths of the changed files. Results can be
+improved by adding more information to the MAINTAINERS file.
+
+TODO:
+    - Match using the subject of the patch/series.
+    - Add a configuration file to specify the priority of each tree.
+
+Configurations:
+The script uses tokens for authentication.
+If the arguments pw_{server,project,token} aren't passed, the environment
+variables PW_{SERVER,PROJECT,TOKEN} should be set. If not, the script will try
+to load the git configurations pw.{server,project,token}.
+
+Example usage:
+    ./guess-git-tree.py --command list_trees_for_series 2054
+    ./guess-git-tree.py --command list_trees_for_patch 2054
+
+The output will be a list of trees sorted based on number of matches,
+with the first line having the highest count.
+"""
+
+CONF = config.CONF
+CONF.debug = False
+
+MAINTAINERS_FILE_PATH = os.environ.get('MAINTAINERS_FILE_PATH')
+if not MAINTAINERS_FILE_PATH:
+    print('MAINTAINERS_FILE_PATH is not set.')
+    sys.exit(1)
+RULES = {}
+
+ignored_files_re = re.compile(r'^doc/|\.sh$|\.py$')
+
+def configure_git_pw(args=None):
+    """Configure git-pw."""
+    conf = {}
+    conf_keys = ['server', 'project', 'token']
+    for key in conf_keys:
+        value =  getattr(args, 'pw_{}'.format(key))
+        if not value:
+            print('--{} is a required git-pw configuration'.format(arg))
+            sys.exit(1)
+        else:
+            setattr(CONF, key, value)
+
+def find_filenames(diff):
+    """Find file changes in a given diff.
+
+    Source: https://github.com/getpatchwork/patchwork/blob/master/patchwork/parser.py
+    Changes from source:
+        - Moved _filename_re into the method.
+        - Reduced newlines.
+    """
+    _filename_re = re.compile(r'^(---|\+\+\+) (\S+)')
+    # normalise spaces
+    diff = diff.replace('\r', '')
+    diff = diff.strip() + '\n'
+    filenames = {}
+    for line in diff.split('\n'):
+        if len(line) <= 0:
+            continue
+        filename_match = _filename_re.match(line)
+        if not filename_match:
+            continue
+        filename = filename_match.group(2)
+        if filename.startswith('/dev/null'):
+            continue
+        filename = '/'.join(filename.split('/')[1:])
+        filenames[filename] = True
+    filenames = sorted(filenames.keys())
+    return filenames
+
+def construct_rules():
+    """Build a dictionary of rules from the MAINTAINERS file."""
+    with open(MAINTAINERS_FILE_PATH) as fd:
+        maintainers = fd.read()
+    # Split into blocks of text for easier search.
+    maintainers = maintainers.split('\n\n')
+
+    # Extract blocks that have a tree and files.
+    tree_file_blocks = [_item for _item in maintainers \
+            if 'T: git://dpdk.org' in _item and 'F: ' in _item]
+    _dict = {}
+    for _item in tree_file_blocks:
+        # Get the tree url.
+        tree_match = re.search(r'T: (git://dpdk\.org[^\n]+)', _item)
+        if tree_match:
+            tree = tree_match.group(1)
+        else:
+            continue
+        if tree not in _dict:
+            _dict[tree] = {}
+            _dict[tree]['paths'] = []
+        paths = re.findall(r'F: ([^\n]+)',  _item)
+        _paths = copy.deepcopy(paths)
+        for path in paths:
+            # Remove don't-care paths
+            if ignored_files_re.search(path):
+                _paths.remove(path)
+        _dict[tree]['paths'] += _paths
+    return _dict
+
+def get_subject(resource):
+    """Get subject from patch/series object,
+    remove its prefix and strip it.
+    """
+    name = resource['name']
+    return re.sub('^\[.*\]', '', name).strip()
+
+def find_matches(files):
+    """Find trees that the changed files in a patch match,
+    and stop at first match for each file."""
+    matches = []
+    for _file in files:
+        if ignored_files_re.search(_file):
+            continue
+        match_found = False
+        for tree in RULES.keys():
+            for rule in RULES[tree]['paths']:
+                if rule.endswith('/'):
+                    rule = '{}*'.format(rule)
+                if fnmatch.fnmatch(_file, rule):
+                    matches.append(tree)
+                    match_found = True
+                    break
+            if match_found:
+                break
+    return matches
+
+def get_ordered_matches(matches):
+    """Order matches by occurrences."""
+    match_counts = {item:matches.count(item) for item in matches}
+    return sorted(match_counts, key=match_counts.get, reverse=True)
+
+def list_trees_for_patch(patch):
+    """Find matching trees for a specific patch.
+    For a patch to match a tree, both its subject and
+    at least one changed path has to match the tree.
+    """
+    subject = get_subject(patch)
+    files = find_filenames(patch['diff'])
+
+    matches = find_matches(files)
+    return matches
+
+def list_trees_for_series(series):
+    """Find matching trees for a series."""
+    patch_list = series['patches']
+
+    matches = []
+
+    for patch in patch_list:
+        matches = matches + \
+                list_trees_for_patch(api_get('patches', patch['id']))
+
+    return matches
+
+def parse_args():
+    """Parse command-line arguments."""
+    parser = argparse.ArgumentParser()
+    git_pw_conf_parser = parser.add_argument_group('git-pw configurations')
+    options_parser = parser.add_argument_group('optional arguments')
+
+    options_parser.add_argument('--command',
+            choices=('list_trees_for_patch',
+                     'list_trees_for_series'),
+            required=True, help='command to perform on patch/series')
+
+    git_pw_conf_parser.add_argument('--pw_server', type=str,
+            default=os.environ.get('PW_SERVER', utils.git_config('pw.server')),
+            help='PW.SERVER')
+    git_pw_conf_parser.add_argument('--pw_project', type=str,
+            default=os.environ.get('PW_PROJECT', utils.git_config('pw.project')),
+            help='PW.PROJECT')
+    git_pw_conf_parser.add_argument('--pw_token', type=str,
+            default=os.environ.get('PW_TOKEN', utils.git_config('pw.token')),
+            help='PW.TOKEN')
+
+    parser.add_argument('id', type=int,
+            help='patch/series id')
+
+    args = parser.parse_args()
+
+    return args
+
+def main():
+    """Main procedure."""
+    args = parse_args()
+    configure_git_pw(args)
+    
+    command = args.command
+    _id = args.id
+
+    global RULES
+    RULES = construct_rules()
+
+    tree_list = []
+
+    if command == 'list_trees_for_patch':
+        patch = api_get('patches', _id)
+        tree_list = list_trees_for_patch(patch)
+
+    elif command == 'list_trees_for_series':
+        series = api_get('series', _id)
+        tree_list = list_trees_for_series(series)
+
+    tree_list = get_ordered_matches(tree_list)
+
+    print('{}'.format('\n'.join(tree_list)))
+
+def api_get(resource_type, resource_id):
+    """Retrieve an API resource."""
+    try:
+        return api.detail(resource_type, resource_id)
+    except HTTPError as err:
+        if '404' in str(err):
+            sys.exit(1)
+        else:
+            raise
+
+if __name__ == '__main__':
+    main()
-- 
2.11.0

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v2] add script to decide best tree match for patches
  2019-02-12 14:48 [dpdk-ci] [PATCH v2] add script to decide best tree match for patches Ali Alnubani
@ 2019-02-15 21:28 ` Jeremy Plsek
  2019-02-15 23:08   ` Thomas Monjalon
  2019-02-16 16:03 ` [dpdk-ci] [PATCH v3] " Ali Alnubani
  1 sibling, 1 reply; 15+ messages in thread
From: Jeremy Plsek @ 2019-02-15 21:28 UTC (permalink / raw)
  To: Ali Alnubani; +Cc: ci, Thomas Monjalon, ferruh.yigit, Ori Kam

Thanks for the script! I tested it locally and seems to work.
Review inline. Extra comments at the end.

On Tue, Feb 12, 2019 at 9:48 AM Ali Alnubani <alialnu@mellanox.com> wrote:
>
> The script can be used to get the trees that best match
> a patch or a series.
>
> Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
> Signed-off-by: Ori Kam <orika@mellanox.com>
> ---
> Changes in v2:
>         - Renamed script.
>         - Updated license.
>
>  tools/guess-git-tree.py | 244 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 244 insertions(+)
>  create mode 100755 tools/guess-git-tree.py
>
> diff --git a/tools/guess-git-tree.py b/tools/guess-git-tree.py
> new file mode 100755
> index 0000000..f4caef5
> --- /dev/null
> +++ b/tools/guess-git-tree.py
> @@ -0,0 +1,244 @@
> +#!/usr/bin/env python
> +
> +# SPDX-License-Identifier: (BSD-3-Clause AND GPL-2.0-or-later AND MIT)
> +# Copyright 2019 Mellanox Technologies, Ltd
> +
> +import os
> +import sys
> +import re
> +import argparse
> +import copy
> +import fnmatch
> +
> +from requests.exceptions import HTTPError
> +
> +from git_pw import config
> +from git_pw import api
> +from git_pw import utils
> +
> +"""
> +This script uses the git-pw API to retrieve Patchwork's series/patches,
> +and find a list of trees/repos that best match the series/patch.
> +
> +The rules on which matches are based, are taken from the MAINTAINERS file,
> +and currently only based on the paths of the changed files. Results can be
> +improved by adding more information to the MAINTAINERS file.
> +
> +TODO:
> +    - Match using the subject of the patch/series.
> +    - Add a configuration file to specify the priority of each tree.
> +
> +Configurations:
> +The script uses tokens for authentication.
> +If the arguments pw_{server,project,token} aren't passed, the environment
> +variables PW_{SERVER,PROJECT,TOKEN} should be set. If not, the script will try
> +to load the git configurations pw.{server,project,token}.
> +
> +Example usage:
> +    ./guess-git-tree.py --command list_trees_for_series 2054
> +    ./guess-git-tree.py --command list_trees_for_patch 2054
> +
> +The output will be a list of trees sorted based on number of matches,
> +with the first line having the highest count.
> +"""
> +
> +CONF = config.CONF
> +CONF.debug = False
> +
> +MAINTAINERS_FILE_PATH = os.environ.get('MAINTAINERS_FILE_PATH')
> +if not MAINTAINERS_FILE_PATH:
> +    print('MAINTAINERS_FILE_PATH is not set.')
> +    sys.exit(1)
> +RULES = {}
> +
> +ignored_files_re = re.compile(r'^doc/|\.sh$|\.py$')
> +
> +def configure_git_pw(args=None):
> +    """Configure git-pw."""
> +    conf = {}
> +    conf_keys = ['server', 'project', 'token']

See comment in the argument parser section.

> +    for key in conf_keys:
> +        value =  getattr(args, 'pw_{}'.format(key))

There are two spaces between = and getattr,

> +        if not value:
> +            print('--{} is a required git-pw configuration'.format(arg))

I think this was meant to be '--pw_{}' instead of '--{}'.

> +            sys.exit(1)
> +        else:
> +            setattr(CONF, key, value)
> +
> +def find_filenames(diff):
> +    """Find file changes in a given diff.
> +
> +    Source: https://github.com/getpatchwork/patchwork/blob/master/patchwork/parser.py
> +    Changes from source:
> +        - Moved _filename_re into the method.
> +        - Reduced newlines.
> +    """
> +    _filename_re = re.compile(r'^(---|\+\+\+) (\S+)')
> +    # normalise spaces
> +    diff = diff.replace('\r', '')
> +    diff = diff.strip() + '\n'
> +    filenames = {}
> +    for line in diff.split('\n'):
> +        if len(line) <= 0:
> +            continue
> +        filename_match = _filename_re.match(line)
> +        if not filename_match:
> +            continue
> +        filename = filename_match.group(2)
> +        if filename.startswith('/dev/null'):
> +            continue
> +        filename = '/'.join(filename.split('/')[1:])
> +        filenames[filename] = True
> +    filenames = sorted(filenames.keys())
> +    return filenames
> +
> +def construct_rules():
> +    """Build a dictionary of rules from the MAINTAINERS file."""
> +    with open(MAINTAINERS_FILE_PATH) as fd:
> +        maintainers = fd.read()
> +    # Split into blocks of text for easier search.
> +    maintainers = maintainers.split('\n\n')
> +
> +    # Extract blocks that have a tree and files.
> +    tree_file_blocks = [_item for _item in maintainers \
> +            if 'T: git://dpdk.org' in _item and 'F: ' in _item]
> +    _dict = {}
> +    for _item in tree_file_blocks:
> +        # Get the tree url.
> +        tree_match = re.search(r'T: (git://dpdk\.org[^\n]+)', _item)
> +        if tree_match:
> +            tree = tree_match.group(1)
> +        else:
> +            continue
> +        if tree not in _dict:
> +            _dict[tree] = {}
> +            _dict[tree]['paths'] = []
> +        paths = re.findall(r'F: ([^\n]+)',  _item)
> +        _paths = copy.deepcopy(paths)
> +        for path in paths:
> +            # Remove don't-care paths
> +            if ignored_files_re.search(path):
> +                _paths.remove(path)
> +        _dict[tree]['paths'] += _paths
> +    return _dict
> +
> +def get_subject(resource):
> +    """Get subject from patch/series object,
> +    remove its prefix and strip it.
> +    """
> +    name = resource['name']
> +    return re.sub('^\[.*\]', '', name).strip()
> +
> +def find_matches(files):
> +    """Find trees that the changed files in a patch match,
> +    and stop at first match for each file."""
> +    matches = []
> +    for _file in files:
> +        if ignored_files_re.search(_file):
> +            continue
> +        match_found = False
> +        for tree in RULES.keys():
> +            for rule in RULES[tree]['paths']:
> +                if rule.endswith('/'):
> +                    rule = '{}*'.format(rule)
> +                if fnmatch.fnmatch(_file, rule):
> +                    matches.append(tree)
> +                    match_found = True
> +                    break
> +            if match_found:
> +                break
> +    return matches
> +
> +def get_ordered_matches(matches):
> +    """Order matches by occurrences."""
> +    match_counts = {item:matches.count(item) for item in matches}
> +    return sorted(match_counts, key=match_counts.get, reverse=True)
> +
> +def list_trees_for_patch(patch):
> +    """Find matching trees for a specific patch.
> +    For a patch to match a tree, both its subject and
> +    at least one changed path has to match the tree.
> +    """
> +    subject = get_subject(patch)

subject is not used and can be removed.

> +    files = find_filenames(patch['diff'])
> +
> +    matches = find_matches(files)
> +    return matches
> +
> +def list_trees_for_series(series):
> +    """Find matching trees for a series."""
> +    patch_list = series['patches']
> +
> +    matches = []
> +
> +    for patch in patch_list:
> +        matches = matches + \
> +                list_trees_for_patch(api_get('patches', patch['id']))
> +
> +    return matches
> +
> +def parse_args():
> +    """Parse command-line arguments."""
> +    parser = argparse.ArgumentParser()
> +    git_pw_conf_parser = parser.add_argument_group('git-pw configurations')
> +    options_parser = parser.add_argument_group('optional arguments')
> +
> +    options_parser.add_argument('--command',
> +            choices=('list_trees_for_patch',
> +                     'list_trees_for_series'),
> +            required=True, help='command to perform on patch/series')
> +
> +    git_pw_conf_parser.add_argument('--pw_server', type=str,
> +            default=os.environ.get('PW_SERVER', utils.git_config('pw.server')),
> +            help='PW.SERVER')
> +    git_pw_conf_parser.add_argument('--pw_project', type=str,
> +            default=os.environ.get('PW_PROJECT', utils.git_config('pw.project')),
> +            help='PW.PROJECT')
> +    git_pw_conf_parser.add_argument('--pw_token', type=str,
> +            default=os.environ.get('PW_TOKEN', utils.git_config('pw.token')),
> +            help='PW.TOKEN')

Depending on the action taken from
https://github.com/getpatchwork/git-pw/pull/37 this argument may be
removed, since the API calls don't require authentication. Then
"token" can be removed from conf_keys. You'll also probably need to
add and set CONF.token/username/password to None, similar to the
CONF.debug = False line. But since authentication in the python
library is required now, this part can be done later.

> +
> +    parser.add_argument('id', type=int,
> +            help='patch/series id')
> +
> +    args = parser.parse_args()
> +
> +    return args
> +
> +def main():
> +    """Main procedure."""
> +    args = parse_args()
> +    configure_git_pw(args)
> +
> +    command = args.command
> +    _id = args.id
> +
> +    global RULES
> +    RULES = construct_rules()
> +
> +    tree_list = []
> +
> +    if command == 'list_trees_for_patch':
> +        patch = api_get('patches', _id)
> +        tree_list = list_trees_for_patch(patch)
> +
> +    elif command == 'list_trees_for_series':
> +        series = api_get('series', _id)
> +        tree_list = list_trees_for_series(series)
> +
> +    tree_list = get_ordered_matches(tree_list)
> +
> +    print('{}'.format('\n'.join(tree_list)))
> +
> +def api_get(resource_type, resource_id):
> +    """Retrieve an API resource."""
> +    try:
> +        return api.detail(resource_type, resource_id)
> +    except HTTPError as err:
> +        if '404' in str(err):
> +            sys.exit(1)
> +        else:
> +            raise
> +
> +if __name__ == '__main__':
> +    main()
> --
> 2.11.0
>

There should also be two blank lines above/under method definitions
outside of classes to follow pep8 more closely.

We have a python script that we use to apply patches. With this
current script, it's a little bit awkward to use inside our apply
script.
Right now it would look something like this inside of our apply script:
(ggt = guess-git-tree)
ggt.configure_git_pw(args)
ggt.RULES = construct_rules()
series = ggt.api_get('series', id)
tree_list = list_trees_for_series(series)
tree_list = get_ordered_matches(tree_list)
# now attempt to apply patch to each tree until successful

I'd rather see this implemented in a class, so that things like the
construct_rules() and configure_git_pw() happens in the constructor,
then I'd like to call something like get_ordered_tree_for_series(id)
to get the list. The constructor arguments would probably contain the
server address and project. (Or you could make the server address also
an environment variable like PATCHWORK_API_URL =
os.getenv("PATCHWORK_API_URL", "http://patches.dpdk.org/api/1.0/"),
which is what we do in our scripts.)

As a note, this my first in-email review. If I was supposed to do
something different, please let me know!

Thanks
-- 
Jeremy Plsek
UNH InterOperability Laboratory

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v2] add script to decide best tree match for patches
  2019-02-15 21:28 ` Jeremy Plsek
@ 2019-02-15 23:08   ` Thomas Monjalon
  2019-02-17  8:09     ` Ali Alnubani
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Monjalon @ 2019-02-15 23:08 UTC (permalink / raw)
  To: Jeremy Plsek; +Cc: Ali Alnubani, ci, ferruh.yigit, Ori Kam

Hi Jeremy,
Thanks for reviewing.
Some comments below:

15/02/2019 22:28, Jeremy Plsek:
> We have a python script that we use to apply patches. With this
> current script, it's a little bit awkward to use inside our apply
> script.
> Right now it would look something like this inside of our apply script:
> (ggt = guess-git-tree)
> ggt.configure_git_pw(args)
> ggt.RULES = construct_rules()
> series = ggt.api_get('series', id)
> tree_list = list_trees_for_series(series)
> tree_list = get_ordered_matches(tree_list)
> # now attempt to apply patch to each tree until successful

I think you should call this script as a black box.

> I'd rather see this implemented in a class, so that things like the
> construct_rules() and configure_git_pw() happens in the constructor,
> then I'd like to call something like get_ordered_tree_for_series(id)
> to get the list.

The idea behind dpdk-ci.git is to allow building a CI infrastructure
by assembling a collection of scripts.
If possible, we should keep each script independent and simple to use,
in the Unix spirit.

> The constructor arguments would probably contain the
> server address and project. (Or you could make the server address also
> an environment variable like PATCHWORK_API_URL =
> os.getenv("PATCHWORK_API_URL", "http://patches.dpdk.org/api/1.0/"),
> which is what we do in our scripts.)

The URL is not going to change anytime soon.
What is the benefit of reading it from an environment variable?
I see one benefit: allow to test with a test server.
I would be in favor of providing a default URL and allow to override
with an environment variable.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [dpdk-ci] [PATCH v3] add script to decide best tree match for patches
  2019-02-12 14:48 [dpdk-ci] [PATCH v2] add script to decide best tree match for patches Ali Alnubani
  2019-02-15 21:28 ` Jeremy Plsek
@ 2019-02-16 16:03 ` " Ali Alnubani
  2019-04-09 16:12   ` [dpdk-ci] [PATCH v4] " Ali Alnubani
  1 sibling, 1 reply; 15+ messages in thread
From: Ali Alnubani @ 2019-02-16 16:03 UTC (permalink / raw)
  To: ci; +Cc: Thomas Monjalon, ferruh.yigit, jplsek, Ori Kam

The script can be used to get the trees that best match
a patch or a series.

Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Signed-off-by: Ori Kam <orika@mellanox.com>
---
Changes in v3:
	- Fixed PEP8 violations.
	- Fixed variable names.
	- Removed unused get_subject() method.

 tools/guess-git-tree.py | 254 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 254 insertions(+)
 create mode 100755 tools/guess-git-tree.py

diff --git a/tools/guess-git-tree.py b/tools/guess-git-tree.py
new file mode 100755
index 0000000..f3f69df
--- /dev/null
+++ b/tools/guess-git-tree.py
@@ -0,0 +1,254 @@
+#!/usr/bin/env python
+
+# SPDX-License-Identifier: (BSD-3-Clause AND GPL-2.0-or-later AND MIT)
+# Copyright 2019 Mellanox Technologies, Ltd
+
+import os
+import sys
+import re
+import argparse
+import copy
+import fnmatch
+
+from requests.exceptions import HTTPError
+
+from git_pw import config
+from git_pw import api
+from git_pw import utils
+
+"""
+This script uses the git-pw API to retrieve Patchwork's series/patches,
+and find a list of trees/repos that best match the series/patch.
+
+The rules on which matches are based, are taken from the MAINTAINERS file,
+and currently only based on the paths of the changed files. Results can be
+improved by adding more information to the MAINTAINERS file.
+
+TODO:
+    - Match using the subject of the patch/series.
+    - Add a configuration file to specify the priority of each tree.
+
+Configurations:
+The script uses tokens for authentication.
+If the arguments pw_{server,project,token} aren't passed, the environment
+variables PW_{SERVER,PROJECT,TOKEN} should be set. If not, the script will try
+to load the git configurations pw.{server,project,token}.
+
+Example usage:
+    ./guess-git-tree.py --command list_trees_for_series 2054
+    ./guess-git-tree.py --command list_trees_for_patch 2054
+
+The output will be a list of trees sorted based on number of matches,
+with the first line having the highest count.
+"""
+
+CONF = config.CONF
+CONF.debug = False
+
+MAINTAINERS_FILE_PATH = os.environ.get('MAINTAINERS_FILE_PATH')
+if not MAINTAINERS_FILE_PATH:
+    print('MAINTAINERS_FILE_PATH is not set.')
+    sys.exit(1)
+RULES = {}
+
+ignored_files_re = re.compile(r'^doc/|\.sh$|\.py$')
+
+
+def configure_git_pw(args=None):
+    """Configure git-pw."""
+    conf = {}
+    conf_keys = ['server', 'project', 'token']
+    for key in conf_keys:
+        value = getattr(args, 'pw_{}'.format(key))
+        if not value:
+            print('--pw_{} is a required git-pw configuration'.format(key))
+            sys.exit(1)
+        else:
+            setattr(CONF, key, value)
+
+
+def find_filenames(diff):
+    """Find file changes in a given diff.
+
+    Source: https://github.com/getpatchwork/patchwork/blob/master/patchwork/parser.py
+    Changes from source:
+        - Moved _filename_re into the method.
+        - Reduced newlines.
+    """
+    _filename_re = re.compile(r'^(---|\+\+\+) (\S+)')
+    # normalise spaces
+    diff = diff.replace('\r', '')
+    diff = diff.strip() + '\n'
+    filenames = {}
+    for line in diff.split('\n'):
+        if len(line) <= 0:
+            continue
+        filename_match = _filename_re.match(line)
+        if not filename_match:
+            continue
+        filename = filename_match.group(2)
+        if filename.startswith('/dev/null'):
+            continue
+        filename = '/'.join(filename.split('/')[1:])
+        filenames[filename] = True
+    filenames = sorted(filenames.keys())
+    return filenames
+
+
+def construct_rules():
+    """Build a dictionary of rules from the MAINTAINERS file."""
+    with open(MAINTAINERS_FILE_PATH) as fd:
+        maintainers = fd.read()
+    # Split into blocks of text for easier search.
+    maintainers = maintainers.split('\n\n')
+
+    # Extract blocks that have a tree and files.
+    tree_file_blocks = [_item for _item in maintainers
+                        if 'T: git://dpdk.org' in _item and 'F: ' in _item]
+    _dict = {}
+    for _item in tree_file_blocks:
+        # Get the tree url.
+        tree_match = re.search(r'T: (git://dpdk\.org[^\n]+)', _item)
+        if tree_match:
+            tree = tree_match.group(1)
+        else:
+            continue
+        if tree not in _dict:
+            _dict[tree] = {}
+            _dict[tree]['paths'] = []
+        paths = re.findall(r'F: ([^\n]+)',  _item)
+        _paths = copy.deepcopy(paths)
+        for path in paths:
+            # Remove don't-care paths
+            if ignored_files_re.search(path):
+                _paths.remove(path)
+        _dict[tree]['paths'] += _paths
+    return _dict
+
+
+def find_matches(files):
+    """Find trees that the changed files in a patch match,
+    and stop at first match for each file."""
+    matches = []
+    for _file in files:
+        if ignored_files_re.search(_file):
+            continue
+        match_found = False
+        for tree in RULES.keys():
+            for rule in RULES[tree]['paths']:
+                if rule.endswith('/'):
+                    rule = '{}*'.format(rule)
+                if fnmatch.fnmatch(_file, rule):
+                    matches.append(tree)
+                    match_found = True
+                    break
+            if match_found:
+                break
+    return matches
+
+
+def get_ordered_matches(matches):
+    """Order matches by occurrences."""
+    match_counts = {item: matches.count(item) for item in matches}
+    return sorted(match_counts, key=match_counts.get, reverse=True)
+
+
+def list_trees_for_patch(patch):
+    """Find matching trees for a specific patch.
+    For a patch to match a tree, at least one changed
+    path has to match that tree.
+    """
+    files = find_filenames(patch['diff'])
+
+    matches = find_matches(files)
+    return matches
+
+
+def list_trees_for_series(series):
+    """Find matching trees for a series."""
+    patch_list = series['patches']
+
+    matches = []
+
+    for patch in patch_list:
+        matches = matches + \
+                list_trees_for_patch(api_get('patches', patch['id']))
+
+    return matches
+
+
+def parse_args():
+    """Parse command-line arguments."""
+    parser = argparse.ArgumentParser()
+    git_pw_conf_parser = parser.add_argument_group('git-pw configurations')
+    options_parser = parser.add_argument_group('optional arguments')
+
+    options_parser.add_argument(
+            '--command',
+            choices=(
+                'list_trees_for_patch',
+                'list_trees_for_series'),
+            required=True, help='command to perform on patch/series')
+
+    git_pw_conf_parser.add_argument(
+            '--pw_server', type=str,
+            default=os.environ.get(
+                'PW_SERVER', utils.git_config('pw.server')),
+            help='PW.SERVER')
+    git_pw_conf_parser.add_argument(
+            '--pw_project', type=str,
+            default=os.environ.get(
+                'PW_PROJECT', utils.git_config('pw.project')),
+            help='PW.PROJECT')
+    git_pw_conf_parser.add_argument(
+            '--pw_token', type=str,
+            default=os.environ.get('PW_TOKEN', utils.git_config('pw.token')),
+            help='PW.TOKEN')
+
+    parser.add_argument(
+            'id', type=int, help='patch/series id')
+
+    args = parser.parse_args()
+
+    return args
+
+
+def main():
+    """Main procedure."""
+    args = parse_args()
+    configure_git_pw(args)
+
+    command = args.command
+    _id = args.id
+
+    global RULES
+    RULES = construct_rules()
+
+    tree_list = []
+
+    if command == 'list_trees_for_patch':
+        patch = api_get('patches', _id)
+        tree_list = list_trees_for_patch(patch)
+
+    elif command == 'list_trees_for_series':
+        series = api_get('series', _id)
+        tree_list = list_trees_for_series(series)
+
+    tree_list = get_ordered_matches(tree_list)
+
+    print('{}'.format('\n'.join(tree_list)))
+
+
+def api_get(resource_type, resource_id):
+    """Retrieve an API resource."""
+    try:
+        return api.detail(resource_type, resource_id)
+    except HTTPError as err:
+        if '404' in str(err):
+            sys.exit(1)
+        else:
+            raise
+
+
+if __name__ == '__main__':
+    main()
-- 
2.11.0

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v2] add script to decide best tree match for patches
  2019-02-15 23:08   ` Thomas Monjalon
@ 2019-02-17  8:09     ` Ali Alnubani
  0 siblings, 0 replies; 15+ messages in thread
From: Ali Alnubani @ 2019-02-17  8:09 UTC (permalink / raw)
  To: Thomas Monjalon, Jeremy Plsek; +Cc: ci, ferruh.yigit, Ori Kam

Hi,
Thanks for the feedback. I sent v3.

Other comments are inline.

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Saturday, February 16, 2019 1:08 AM
> To: Jeremy Plsek <jplsek@iol.unh.edu>
> Cc: Ali Alnubani <alialnu@mellanox.com>; ci@dpdk.org;
> ferruh.yigit@intel.com; Ori Kam <orika@mellanox.com>
> Subject: Re: [PATCH v2] add script to decide best tree match for patches
> 
> Hi Jeremy,
> Thanks for reviewing.
> Some comments below:
> 
> 15/02/2019 22:28, Jeremy Plsek:
> > We have a python script that we use to apply patches. With this
> > current script, it's a little bit awkward to use inside our apply
> > script.
> > Right now it would look something like this inside of our apply script:
> > (ggt = guess-git-tree)
> > ggt.configure_git_pw(args)
> > ggt.RULES = construct_rules()
> > series = ggt.api_get('series', id)
> > tree_list = list_trees_for_series(series) tree_list =
> > get_ordered_matches(tree_list) # now attempt to apply patch to each
> > tree until successful
> 
> I think you should call this script as a black box.
Yes, the script was meant to be run as a black box.
> 
> > I'd rather see this implemented in a class, so that things like the
> > construct_rules() and configure_git_pw() happens in the constructor,
> > then I'd like to call something like get_ordered_tree_for_series(id)
> > to get the list.
> 
> The idea behind dpdk-ci.git is to allow building a CI infrastructure by
> assembling a collection of scripts.
> If possible, we should keep each script independent and simple to use, in the
> Unix spirit.
> 
> > The constructor arguments would probably contain the server address
> > and project. (Or you could make the server address also an environment
> > variable like PATCHWORK_API_URL = os.getenv("PATCHWORK_API_URL",
> >
<url removed>
> > which is what we do in our scripts.)
> 
> The URL is not going to change anytime soon.
> What is the benefit of reading it from an environment variable?
> I see one benefit: allow to test with a test server.
> I would be in favor of providing a default URL and allow to override with an
> environment variable.
> 

The script follows the git-pw configuration style,
You can either set the environment variable PW_SERVER, or add the git configuration pw.server.

Regards,
Ali

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [dpdk-ci] [PATCH v4] add script to decide best tree match for patches
  2019-02-16 16:03 ` [dpdk-ci] [PATCH v3] " Ali Alnubani
@ 2019-04-09 16:12   ` " Ali Alnubani
  2019-04-18  8:16     ` Ali Alnubani
  2019-04-19 20:39     ` Thomas Monjalon
  0 siblings, 2 replies; 15+ messages in thread
From: Ali Alnubani @ 2019-04-09 16:12 UTC (permalink / raw)
  To: ci; +Cc: Thomas Monjalon, ferruh.yigit, jplsek, Ori Kam

The information provided in the MAINTAINERS file is used to find
the tree that matches the files changed in patches as follows:
 - For each file, it first tries to find a matching unix shell-style
   pattern. If one was found, it looks for a tree specified in the
   subsection containing the pattern, and that tree is returned.
   If no tree was found in that subsection, the script tries to
   find a tree specified under the name of the section containing
   the pattern.
 - If more than one tree was matched (for multiple files),
   a tree matching the common prefix of all these trees will be
   returned.
 - The main repo 'dpdk' will be returned if no other match was found.

Results can be further improved by adding more information to
the MAINTAINERS file or by using the subject of the series/patch
for matching too.

Bugzilla ID: 166

Suggested-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Signed-off-by: Ori Kam <orika@mellanox.com>
---
Changes in v2:
  - Refactored the script to use classes, which makes it easier
    to be used inside other scripts.
  - Renamed the script so that it can be imported.
  - The script will always return a single tree.
    If multiple trees were matched (for multiple files),
    a tree matching the common prefix of all these trees will be
    returned.
  - Main repo is now reported if no other matches were found.
  - Only tree name will be returned without the full url.
  - Updated description and usage info.

 tools/guess_git_tree.py | 291 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 291 insertions(+)
 create mode 100755 tools/guess_git_tree.py

diff --git a/tools/guess_git_tree.py b/tools/guess_git_tree.py
new file mode 100755
index 0000000..ccf6fbe
--- /dev/null
+++ b/tools/guess_git_tree.py
@@ -0,0 +1,291 @@
+#!/usr/bin/env python
+
+# SPDX-License-Identifier: (BSD-3-Clause AND GPL-2.0-or-later AND MIT)
+# Copyright 2019 Mellanox Technologies, Ltd
+
+import os
+import sys
+import re
+import argparse
+import copy
+import fnmatch
+
+from requests.exceptions import HTTPError
+
+from git_pw import config
+from git_pw import api
+from git_pw import utils
+
+"""
+Description:
+This script uses the git-pw API to retrieve Patchwork's
+series/patches, and then find a tree/repo that best matches them,
+depending on which files were changed in their diffs and using
+the rules specified in the MAINTAINERS file.
+If more than one tree was matched, a common tree (based on the
+longest common prefix) will be chosen from the list.
+
+Configurations:
+The script uses tokens for authentication.
+If the arguments pw_{server,project,token} aren't passed, the environment
+variables PW_{SERVER,PROJECT,TOKEN} should be set. If not, the script will try
+to load the git configurations pw.{server,project,token}.
+
+Example usage:
+    ./guess-git-tree.py --command list_trees_for_series 2054
+    ./guess-git-tree.py --command list_trees_for_patch 2054
+
+Or if you want to use inside other scripts:
+
+    import os
+    from guess_git_tree import (Maintainers, GitPW, Diff)
+    _git_pw = GitPW({
+        'pw_server': os.environ.get('PW_SERVER'),
+        'pw_project': os.environ.get('PW_PROJECT'),
+        'pw_token': os.environ.get('PW_TOKEN')})
+
+    maintainers = Maintainers()
+    patch_id = 52199
+    files = Diff.find_filenames(_git_pw.api_get('patches', patch_id)['diff'])
+    tree = maintainers.get_tree(files)
+"""
+
+
+MAINTAINERS_FILE_PATH = os.environ.get('MAINTAINERS_FILE_PATH')
+if not MAINTAINERS_FILE_PATH:
+    print('MAINTAINERS_FILE_PATH is not set.')
+    sys.exit(1)
+
+
+class GitPW(object):
+    CONF = config.CONF
+    CONF.debug = False
+
+    def __init__(self, conf_obj=None):
+        # Configure git-pw.
+        conf_keys = ['server', 'project', 'token']
+        for key in conf_keys:
+            value = conf_obj.get('pw_{}'.format(key))
+            if not value:
+                print('--pw_{} is a required git-pw configuration'.format(key))
+                sys.exit(1)
+            else:
+                setattr(self.CONF, key, value)
+
+    def api_get(self, resource_type, resource_id):
+        """Retrieve an API resource."""
+        try:
+            return api.detail(resource_type, resource_id)
+        except HTTPError as err:
+            if '404' in str(err):
+                sys.exit(1)
+            else:
+                raise
+
+
+class Diff(object):
+
+    @staticmethod
+    def find_filenames(diff):
+        """Find file changes in a given diff.
+
+        Source: https://github.com/getpatchwork/patchwork/blob/master/patchwork/parser.py
+        Changes from source:
+            - Moved _filename_re into the method.
+            - Reduced newlines.
+        """
+        _filename_re = re.compile(r'^(---|\+\+\+) (\S+)')
+        # normalise spaces
+        diff = diff.replace('\r', '')
+        diff = diff.strip() + '\n'
+        filenames = {}
+        for line in diff.split('\n'):
+            if len(line) <= 0:
+                continue
+            filename_match = _filename_re.match(line)
+            if not filename_match:
+                continue
+            filename = filename_match.group(2)
+            if filename.startswith('/dev/null'):
+                continue
+            filename = '/'.join(filename.split('/')[1:])
+            filenames[filename] = True
+        filenames = sorted(filenames.keys())
+        return filenames
+
+
+class Maintainers(object):
+
+    file_regex = r'F:\s(.*)'
+    tree_regex = r'T: git:\/\/dpdk\.org(?:\/next)*\/(.*)'
+    section_regex = r'([^\n]*)\n-+.*?(?=([^\n]*\n-+)|\Z)'
+    subsection_regex = r'[^\n](?:(?!\n{{2}}).)*?^F: {}(?:(?!\n{{2}}).)*'
+
+    def __init__(self):
+        with open(MAINTAINERS_FILE_PATH) as fd:
+            self.maintainers_txt = fd.read()
+        # Add wildcard symbol at the end of lines where missing.
+        self.maintainers_txt = re.sub(
+                r'/$', '/*', self.maintainers_txt,
+                count=0, flags=re.MULTILINE)
+        # This matches the whole section that starts with:
+        # Section Name
+        # ------------
+        self.sections = list(re.finditer(
+                self.section_regex,
+                self.maintainers_txt,
+                re.DOTALL | re.MULTILINE))
+        # This matches all the file patterns in the maintainers file.
+        self.file_patterns = re.findall(
+                self.file_regex, self.maintainers_txt, re.MULTILINE)
+        # Save already matched patterns.
+        self.matched = {}
+
+    def get_tree(self, files):
+        """
+        Return a git tree that matches a list of files."""
+        tree_list = []
+        for _file in files:
+            _tree = self._get_tree(_file)
+            if _tree:
+                tree_list.append(_tree)
+        tree = self.get_common_denominator(tree_list)
+        if tree == '':
+            tree = 'dpdk'
+        return tree
+
+    def _get_tree(self, filename):
+        """
+        Find a git tree that matches a filename from the maintainers file.
+        The search stops at the first match.
+        """
+        tree = None
+        # Check if we already tried to match with this pattern.
+        for pat in self.matched.keys():
+            if fnmatch.fnmatch(filename, pat):
+                return self.matched[pat]
+
+        # Find a file matching pattern.
+        matching_pattern = None
+        for pat in self.file_patterns:
+            # This regex matches a lot of files and trees. Ignore it.
+            if 'doc/*' in pat:
+                continue
+            if fnmatch.fnmatch(filename, pat):
+                matching_pattern = pat
+                break
+        if not matching_pattern:
+            return None
+
+        found_match = False
+        # Find the block containing filename.
+        regex = self.subsection_regex.format(re.escape(matching_pattern))
+        subsection_match = re.findall(
+                regex,
+                self.maintainers_txt,
+                re.DOTALL | re.MULTILINE)
+        if len(subsection_match):
+            subsection = subsection_match[-1]
+            # Look for a tree around the file path.
+            tree_match = re.search(
+                    self.tree_regex, subsection)
+            if tree_match:
+                tree = tree_match.group(1)
+                self.matched[matching_pattern] = tree
+                found_match = True
+
+        # If no tree was specified in the subsection containing filename,
+        # try to find a tree after the section name.
+        if not found_match:
+            for section in self.sections:
+                if re.search(re.escape(matching_pattern), section.group(0)):
+                    tree_match = re.search(
+                            self.tree_regex,
+                            section.group(0).split('\n\n')[0])
+                    if tree_match:
+                        tree = tree_match.group(1)
+
+        self.matched[matching_pattern] = tree
+        return tree
+
+    def get_common_denominator(self, tree_list):
+        """Finds a common tree by finding the longest common prefix.
+        Examples for expected output:
+          dpdk-next-virtio + dpdk = dpdk
+          dpdk-next-net-intel + dpdk = dpdk
+          dpdk-next-crypto + dpdk-next-virtio = dpdk
+          dpdk-next-net-intel + dpdk-next-net-mlx = dpdk-next-net
+        """
+        # Make sure the list is unique.
+        tree_list = list(set(tree_list))
+
+        # Rename dpdk-next-virtio internally to match dpdk-next-net
+        _tree_list = [
+                tree.replace('dpdk-next-virtio', 'dpdk-next-net-virtio')
+                for tree in tree_list]
+        common_prefix = \
+            os.path.commonprefix(_tree_list).rstrip('-').replace(
+                    'dpdk-next-net-virtio', 'dpdk-next-virtio')
+        # There is no 'dpdk-next' named tree.
+        if common_prefix == 'dpdk-next':
+            common_prefix = 'dpdk'
+        return common_prefix
+
+
+if __name__ == '__main__':
+    """Main procedure."""
+    parser = argparse.ArgumentParser()
+    git_pw_conf_parser = parser.add_argument_group('git-pw configurations')
+    options_parser = parser.add_argument_group('optional arguments')
+
+    options_parser.add_argument(
+            '--command',
+            choices=(
+                'list_trees_for_patch',
+                'list_trees_for_series'),
+            required=True, help='Command to perform')
+
+    git_pw_conf_parser.add_argument(
+            '--pw_server', type=str,
+            default=os.environ.get(
+                'PW_SERVER', utils.git_config('pw.server')),
+            help='Patchwork server')
+    git_pw_conf_parser.add_argument(
+            '--pw_project', type=str,
+            default=os.environ.get(
+                'PW_PROJECT', utils.git_config('pw.project')),
+            help='Patchwork project')
+    git_pw_conf_parser.add_argument(
+            '--pw_token', type=str,
+            default=os.environ.get('PW_TOKEN', utils.git_config('pw.token')),
+            help='Authentication token')
+
+    parser.add_argument(
+            'id', type=int, help='patch/series id')
+
+    args = parser.parse_args()
+
+    command = args.command
+    _id = args.id
+
+    # Pass the needed configurations to git-pw.
+    conf_obj = {
+            key: value for key, value in args.__dict__.items() if
+            key.startswith('pw_')}
+    _git_pw = GitPW(conf_obj)
+
+    maintainers = Maintainers()
+
+    patch_list = []
+    if command == 'list_trees_for_patch':
+        patch_list.append(_git_pw.api_get('patches', _id))
+    elif command == 'list_trees_for_series':
+        series = _git_pw.api_get('series', _id)
+        patch_list = [
+                _git_pw.api_get('patches', patch['id'])
+                for patch in series['patches']]
+
+    files = []
+    for patch in patch_list:
+        files += Diff.find_filenames(patch['diff'])
+    print(maintainers.get_tree(files))
-- 
2.11.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v4] add script to decide best tree match for patches
  2019-04-09 16:12   ` [dpdk-ci] [PATCH v4] " Ali Alnubani
@ 2019-04-18  8:16     ` Ali Alnubani
  2019-04-19 16:21       ` Jeremy Plsek
  2019-04-19 20:39     ` Thomas Monjalon
  1 sibling, 1 reply; 15+ messages in thread
From: Ali Alnubani @ 2019-04-18  8:16 UTC (permalink / raw)
  To: jplsek, ferruh.yigit; +Cc: Thomas Monjalon, ci

Hi,

> -----Original Message-----
> From: ci <ci-bounces@dpdk.org> On Behalf Of Ali Alnubani
> Sent: Tuesday, April 9, 2019 7:13 PM
> To: ci@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> jplsek@iol.unh.edu; Ori Kam <orika@mellanox.com>
> Subject: [dpdk-ci] [PATCH v4] add script to decide best tree match for
> patches
> 
> The information provided in the MAINTAINERS file is used to find the tree
> that matches the files changed in patches as follows:
>  - For each file, it first tries to find a matching unix shell-style
>    pattern. If one was found, it looks for a tree specified in the
>    subsection containing the pattern, and that tree is returned.
>    If no tree was found in that subsection, the script tries to
>    find a tree specified under the name of the section containing
>    the pattern.
>  - If more than one tree was matched (for multiple files),
>    a tree matching the common prefix of all these trees will be
>    returned.
>  - The main repo 'dpdk' will be returned if no other match was found.
> 
> Results can be further improved by adding more information to the
> MAINTAINERS file or by using the subject of the series/patch for matching
> too.
> 
> Bugzilla ID: 166
> 
> Suggested-by: Thomas Monjalon <thomas@monjalon.net>
> Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
> Signed-off-by: Ori Kam <orika@mellanox.com>
> ---
> Changes in v2:
Typo.
s/v2/v4/
>   - Refactored the script to use classes, which makes it easier
>     to be used inside other scripts.
>   - Renamed the script so that it can be imported.
>   - The script will always return a single tree.
>     If multiple trees were matched (for multiple files),
>     a tree matching the common prefix of all these trees will be
>     returned.
>   - Main repo is now reported if no other matches were found.
>   - Only tree name will be returned without the full url.
>   - Updated description and usage info.
> 

Did you guys get a chance to test this version?

Thanks,
Ali

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v4] add script to decide best tree match for patches
  2019-04-18  8:16     ` Ali Alnubani
@ 2019-04-19 16:21       ` Jeremy Plsek
  2019-04-19 16:24         ` Thomas Monjalon
  0 siblings, 1 reply; 15+ messages in thread
From: Jeremy Plsek @ 2019-04-19 16:21 UTC (permalink / raw)
  To: Ali Alnubani; +Cc: ferruh.yigit, Thomas Monjalon, ci

We've switched to using it a few days ago.
Some examples of it switching branches:
https://lab.dpdk.org/results/dashboard/patchsets/5503/
https://lab.dpdk.org/results/dashboard/patchsets/5499/
https://lab.dpdk.org/results/dashboard/patchsets/5498/

On Thu, Apr 18, 2019 at 4:16 AM Ali Alnubani <alialnu@mellanox.com> wrote:
>
> Hi,
>
> > -----Original Message-----
> > From: ci <ci-bounces@dpdk.org> On Behalf Of Ali Alnubani
> > Sent: Tuesday, April 9, 2019 7:13 PM
> > To: ci@dpdk.org
> > Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> > jplsek@iol.unh.edu; Ori Kam <orika@mellanox.com>
> > Subject: [dpdk-ci] [PATCH v4] add script to decide best tree match for
> > patches
> >
> > The information provided in the MAINTAINERS file is used to find the tree
> > that matches the files changed in patches as follows:
> >  - For each file, it first tries to find a matching unix shell-style
> >    pattern. If one was found, it looks for a tree specified in the
> >    subsection containing the pattern, and that tree is returned.
> >    If no tree was found in that subsection, the script tries to
> >    find a tree specified under the name of the section containing
> >    the pattern.
> >  - If more than one tree was matched (for multiple files),
> >    a tree matching the common prefix of all these trees will be
> >    returned.
> >  - The main repo 'dpdk' will be returned if no other match was found.
> >
> > Results can be further improved by adding more information to the
> > MAINTAINERS file or by using the subject of the series/patch for matching
> > too.
> >
> > Bugzilla ID: 166
> >
> > Suggested-by: Thomas Monjalon <thomas@monjalon.net>
> > Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
> > Signed-off-by: Ori Kam <orika@mellanox.com>
> > ---
> > Changes in v2:
> Typo.
> s/v2/v4/
> >   - Refactored the script to use classes, which makes it easier
> >     to be used inside other scripts.
> >   - Renamed the script so that it can be imported.
> >   - The script will always return a single tree.
> >     If multiple trees were matched (for multiple files),
> >     a tree matching the common prefix of all these trees will be
> >     returned.
> >   - Main repo is now reported if no other matches were found.
> >   - Only tree name will be returned without the full url.
> >   - Updated description and usage info.
> >
>
> Did you guys get a chance to test this version?
>
> Thanks,
> Ali



-- 
Jeremy Plsek
UNH InterOperability Laboratory

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v4] add script to decide best tree match for patches
  2019-04-19 16:21       ` Jeremy Plsek
@ 2019-04-19 16:24         ` Thomas Monjalon
  2019-04-19 17:33           ` Jeremy Plsek
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Monjalon @ 2019-04-19 16:24 UTC (permalink / raw)
  To: Jeremy Plsek; +Cc: Ali Alnubani, ferruh.yigit, ci

19/04/2019 18:21, Jeremy Plsek:
> On Thu, Apr 18, 2019 at 4:16 AM Ali Alnubani <alialnu@mellanox.com> wrote:
> > Did you guys get a chance to test this version?
> 
> We've switched to using it a few days ago.
> Some examples of it switching branches:
> https://lab.dpdk.org/results/dashboard/patchsets/5503/
> https://lab.dpdk.org/results/dashboard/patchsets/5499/
> https://lab.dpdk.org/results/dashboard/patchsets/5498/

Is it working as you expect?
May I merge it in dpdk-ci?



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v4] add script to decide best tree match for patches
  2019-04-19 16:24         ` Thomas Monjalon
@ 2019-04-19 17:33           ` Jeremy Plsek
  2019-04-19 17:55             ` Thomas Monjalon
  0 siblings, 1 reply; 15+ messages in thread
From: Jeremy Plsek @ 2019-04-19 17:33 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Ali Alnubani, ferruh.yigit, ci

I thinks so. Only more patchsets will tell, but that could be improved
after the fact. So I think it's fine to merge it in.

At first glance, the only part that I don't think is implemented is
mentioned here: https://bugs.dpdk.org/show_bug.cgi?id=166#c35
> We must match the common prefix of the git trees.
> Examples:
> dpdk-next-net-intel + dpdk = dpdk
> dpdk-next-net-intel + dpdk-next-net-mlx = dpdk-next-net
Since some of the patches are being set to dpdk-next-net-mlx instead
of dpdk-next-net. But I'm fine with how it is right now and wouldn't
mind it getting changed to this later on.

On Fri, Apr 19, 2019 at 12:24 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 19/04/2019 18:21, Jeremy Plsek:
> > On Thu, Apr 18, 2019 at 4:16 AM Ali Alnubani <alialnu@mellanox.com> wrote:
> > > Did you guys get a chance to test this version?
> >
> > We've switched to using it a few days ago.
> > Some examples of it switching branches:
> > https://lab.dpdk.org/results/dashboard/patchsets/5503/
> > https://lab.dpdk.org/results/dashboard/patchsets/5499/
> > https://lab.dpdk.org/results/dashboard/patchsets/5498/
>
> Is it working as you expect?
> May I merge it in dpdk-ci?
>
>


-- 
Jeremy Plsek
UNH InterOperability Laboratory

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v4] add script to decide best tree match for patches
  2019-04-19 17:33           ` Jeremy Plsek
@ 2019-04-19 17:55             ` Thomas Monjalon
  2019-04-19 18:06               ` Jeremy Plsek
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Monjalon @ 2019-04-19 17:55 UTC (permalink / raw)
  To: Jeremy Plsek; +Cc: Ali Alnubani, ferruh.yigit, ci

19/04/2019 19:33, Jeremy Plsek:
> I thinks so. Only more patchsets will tell, but that could be improved
> after the fact. So I think it's fine to merge it in.
> 
> At first glance, the only part that I don't think is implemented is
> mentioned here: https://bugs.dpdk.org/show_bug.cgi?id=166#c35
> > We must match the common prefix of the git trees.
> > Examples:
> > dpdk-next-net-intel + dpdk = dpdk
> > dpdk-next-net-intel + dpdk-next-net-mlx = dpdk-next-net
> Since some of the patches are being set to dpdk-next-net-mlx instead
> of dpdk-next-net. But I'm fine with how it is right now and wouldn't
> mind it getting changed to this later on.

It is supposed to be fixed.
Please could you give an example of a misbehaviour?

PS: please avoid top-post, it is making thread hard to read.


> On Fri, Apr 19, 2019 at 12:24 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 19/04/2019 18:21, Jeremy Plsek:
> > > On Thu, Apr 18, 2019 at 4:16 AM Ali Alnubani <alialnu@mellanox.com> wrote:
> > > > Did you guys get a chance to test this version?
> > >
> > > We've switched to using it a few days ago.
> > > Some examples of it switching branches:
> > > https://lab.dpdk.org/results/dashboard/patchsets/5503/
> > > https://lab.dpdk.org/results/dashboard/patchsets/5499/
> > > https://lab.dpdk.org/results/dashboard/patchsets/5498/
> >
> > Is it working as you expect?
> > May I merge it in dpdk-ci?






^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v4] add script to decide best tree match for patches
  2019-04-19 17:55             ` Thomas Monjalon
@ 2019-04-19 18:06               ` Jeremy Plsek
  2019-04-19 19:41                 ` Thomas Monjalon
  0 siblings, 1 reply; 15+ messages in thread
From: Jeremy Plsek @ 2019-04-19 18:06 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Ali Alnubani, ferruh.yigit, ci

On Fri, Apr 19, 2019 at 1:55 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 19/04/2019 19:33, Jeremy Plsek:
> > I thinks so. Only more patchsets will tell, but that could be improved
> > after the fact. So I think it's fine to merge it in.
> >
> > At first glance, the only part that I don't think is implemented is
> > mentioned here: https://bugs.dpdk.org/show_bug.cgi?id=166#c35
> > > We must match the common prefix of the git trees.
> > > Examples:
> > > dpdk-next-net-intel + dpdk = dpdk
> > > dpdk-next-net-intel + dpdk-next-net-mlx = dpdk-next-net
> > Since some of the patches are being set to dpdk-next-net-mlx instead
> > of dpdk-next-net. But I'm fine with how it is right now and wouldn't
> > mind it getting changed to this later on.
>
> It is supposed to be fixed.
> Please could you give an example of a misbehaviour?

The most recent example is series 4380. For me, that returned dpdk-next-net-mlx.

> PS: please avoid top-post, it is making thread hard to read.

Sure. It would be great if it wasn't the default for Gmail ;)

>
> > On Fri, Apr 19, 2019 at 12:24 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 19/04/2019 18:21, Jeremy Plsek:
> > > > On Thu, Apr 18, 2019 at 4:16 AM Ali Alnubani <alialnu@mellanox.com> wrote:
> > > > > Did you guys get a chance to test this version?
> > > >
> > > > We've switched to using it a few days ago.
> > > > Some examples of it switching branches:
> > > > https://lab.dpdk.org/results/dashboard/patchsets/5503/
> > > > https://lab.dpdk.org/results/dashboard/patchsets/5499/
> > > > https://lab.dpdk.org/results/dashboard/patchsets/5498/
> > >
> > > Is it working as you expect?
> > > May I merge it in dpdk-ci?
>
>
>
>
>


-- 
Jeremy Plsek
UNH InterOperability Laboratory

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v4] add script to decide best tree match for patches
  2019-04-19 18:06               ` Jeremy Plsek
@ 2019-04-19 19:41                 ` Thomas Monjalon
  2019-04-19 19:45                   ` Jeremy Plsek
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Monjalon @ 2019-04-19 19:41 UTC (permalink / raw)
  To: Jeremy Plsek; +Cc: Ali Alnubani, ferruh.yigit, ci

19/04/2019 20:06, Jeremy Plsek:
> On Fri, Apr 19, 2019 at 1:55 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > 19/04/2019 19:33, Jeremy Plsek:
> > > I thinks so. Only more patchsets will tell, but that could be improved
> > > after the fact. So I think it's fine to merge it in.
> > >
> > > At first glance, the only part that I don't think is implemented is
> > > mentioned here: https://bugs.dpdk.org/show_bug.cgi?id=166#c35
> > > > We must match the common prefix of the git trees.
> > > > Examples:
> > > > dpdk-next-net-intel + dpdk = dpdk
> > > > dpdk-next-net-intel + dpdk-next-net-mlx = dpdk-next-net
> > > Since some of the patches are being set to dpdk-next-net-mlx instead
> > > of dpdk-next-net. But I'm fine with how it is right now and wouldn't
> > > mind it getting changed to this later on.
> >
> > It is supposed to be fixed.
> > Please could you give an example of a misbehaviour?
> 
> The most recent example is series 4380. For me, that returned dpdk-next-net-mlx.

The series 4380 is mlx only, so it fine to match dpdk-next-net-mlx.
Why do you expect something else?




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v4] add script to decide best tree match for patches
  2019-04-19 19:41                 ` Thomas Monjalon
@ 2019-04-19 19:45                   ` Jeremy Plsek
  0 siblings, 0 replies; 15+ messages in thread
From: Jeremy Plsek @ 2019-04-19 19:45 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Ali Alnubani, ferruh.yigit, ci

On Fri, Apr 19, 2019 at 3:41 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 19/04/2019 20:06, Jeremy Plsek:
> > On Fri, Apr 19, 2019 at 1:55 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 19/04/2019 19:33, Jeremy Plsek:
> > > > I thinks so. Only more patchsets will tell, but that could be improved
> > > > after the fact. So I think it's fine to merge it in.
> > > >
> > > > At first glance, the only part that I don't think is implemented is
> > > > mentioned here: https://bugs.dpdk.org/show_bug.cgi?id=166#c35
> > > > > We must match the common prefix of the git trees.
> > > > > Examples:
> > > > > dpdk-next-net-intel + dpdk = dpdk
> > > > > dpdk-next-net-intel + dpdk-next-net-mlx = dpdk-next-net
> > > > Since some of the patches are being set to dpdk-next-net-mlx instead
> > > > of dpdk-next-net. But I'm fine with how it is right now and wouldn't
> > > > mind it getting changed to this later on.
> > >
> > > It is supposed to be fixed.
> > > Please could you give an example of a misbehaviour?
> >
> > The most recent example is series 4380. For me, that returned dpdk-next-net-mlx.
>
> The series 4380 is mlx only, so it fine to match dpdk-next-net-mlx.
> Why do you expect something else?
>
Ok, I guess I was misunderstanding the comment. I thought that
anything under dpdk-next-net-* would just be lumped into
dpdk-next-net. The scripts are good to go then.



--
Jeremy Plsek
UNH InterOperability Laboratory

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-ci] [PATCH v4] add script to decide best tree match for patches
  2019-04-09 16:12   ` [dpdk-ci] [PATCH v4] " Ali Alnubani
  2019-04-18  8:16     ` Ali Alnubani
@ 2019-04-19 20:39     ` Thomas Monjalon
  1 sibling, 0 replies; 15+ messages in thread
From: Thomas Monjalon @ 2019-04-19 20:39 UTC (permalink / raw)
  To: Ali Alnubani; +Cc: ci, ferruh.yigit, jplsek, Ori Kam

09/04/2019 18:12, Ali Alnubani:
> The information provided in the MAINTAINERS file is used to find
> the tree that matches the files changed in patches as follows:
>  - For each file, it first tries to find a matching unix shell-style
>    pattern. If one was found, it looks for a tree specified in the
>    subsection containing the pattern, and that tree is returned.
>    If no tree was found in that subsection, the script tries to
>    find a tree specified under the name of the section containing
>    the pattern.
>  - If more than one tree was matched (for multiple files),
>    a tree matching the common prefix of all these trees will be
>    returned.
>  - The main repo 'dpdk' will be returned if no other match was found.
> 
> Results can be further improved by adding more information to
> the MAINTAINERS file or by using the subject of the series/patch
> for matching too.
> 
> Bugzilla ID: 166
> 
> Suggested-by: Thomas Monjalon <thomas@monjalon.net>
> Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
> Signed-off-by: Ori Kam <orika@mellanox.com>

Applied (without unused "import copy"), thanks



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, back to index

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-12 14:48 [dpdk-ci] [PATCH v2] add script to decide best tree match for patches Ali Alnubani
2019-02-15 21:28 ` Jeremy Plsek
2019-02-15 23:08   ` Thomas Monjalon
2019-02-17  8:09     ` Ali Alnubani
2019-02-16 16:03 ` [dpdk-ci] [PATCH v3] " Ali Alnubani
2019-04-09 16:12   ` [dpdk-ci] [PATCH v4] " Ali Alnubani
2019-04-18  8:16     ` Ali Alnubani
2019-04-19 16:21       ` Jeremy Plsek
2019-04-19 16:24         ` Thomas Monjalon
2019-04-19 17:33           ` Jeremy Plsek
2019-04-19 17:55             ` Thomas Monjalon
2019-04-19 18:06               ` Jeremy Plsek
2019-04-19 19:41                 ` Thomas Monjalon
2019-04-19 19:45                   ` Jeremy Plsek
2019-04-19 20:39     ` Thomas Monjalon

DPDK CI discussions

Archives are clonable:
	git clone --mirror http://inbox.dpdk.org/ci/0 ci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ci ci/ http://inbox.dpdk.org/ci \
		ci@dpdk.org
	public-inbox-index ci


Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.ci


AGPL code for this site: git clone https://public-inbox.org/ public-inbox