From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ali Alnubani
To: "ci@dpdk.org"
CC: Thomas Monjalon, "ferruh.yigit@intel.com", Ori Kam
Date: Mon, 4 Feb 2019 14:19:09 +0000
Message-ID: <20190204141840.20715-1-alialnu@mellanox.com>
X-Mailer: git-send-email 2.11.0
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Subject: [dpdk-ci] [PATCH] add script to decide best tree match for patches
List-Id: DPDK CI discussions

The script can be used to get the trees that best match a patch or a series.

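As a rough illustration of the ranking idea (a minimal sketch under stated
assumptions, not the script itself): each changed file is matched against
per-tree path rules, and trees are then listed by how many files they matched.
The rule table, the sample paths, and the `rank_trees` helper below are
hypothetical and exist only for this example.

```python
import fnmatch
from collections import Counter

# Hypothetical rules: tree URL -> path globs, in the spirit of the
# "T:" and "F:" entries of a MAINTAINERS file (values are made up).
RULES = {
    'git://dpdk.org/next/dpdk-next-net': ['drivers/net/', 'lib/librte_ethdev/'],
    'git://dpdk.org/next/dpdk-next-crypto': ['drivers/crypto/'],
}


def rank_trees(files, rules):
    """Return trees ordered by how many changed files they match."""
    counts = Counter()
    for path in files:
        for tree, globs in rules.items():
            # A rule ending in '/' covers everything below that directory.
            patterns = [g + '*' if g.endswith('/') else g for g in globs]
            if any(fnmatch.fnmatch(path, p) for p in patterns):
                counts[tree] += 1
                break  # stop at the first tree matching this file
    return [tree for tree, _ in counts.most_common()]


changed = [
    'drivers/net/mlx5/mlx5.c',
    'drivers/net/mlx5/mlx5_flow.c',
    'drivers/crypto/openssl/rte_openssl_pmd.c',
]
print('\n'.join(rank_trees(changed, RULES)))
```

With these sample inputs the tree matching two files is printed first. Ties
keep the order in which trees were first matched, so a priority scheme would
be needed to break them deliberately.
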
Signed-off-by: Ali Alnubani
Signed-off-by: Ori Kam
---
 tools/get-tree.py | 245 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 245 insertions(+)
 create mode 100755 tools/get-tree.py

diff --git a/tools/get-tree.py b/tools/get-tree.py
new file mode 100755
index 0000000..3727439
--- /dev/null
+++ b/tools/get-tree.py
@@ -0,0 +1,245 @@
+#!/usr/bin/env python
+
+# SPDX-License-Identifier: (BSD-3-Clause AND GPL-2.0-or-later AND MIT)
+# Copyright 2019 6WIND S.A.
+# Copyright 2019 Mellanox Technologies, Ltd
+
+import os
+import sys
+import re
+import argparse
+import copy
+import fnmatch
+
+from requests.exceptions import HTTPError
+
+from git_pw import config
+from git_pw import api
+from git_pw import utils
+
+"""
+This script uses the git-pw API to retrieve Patchwork's series/patches,
+and find a list of trees/repos that best match the series/patch.
+
+The matching rules are taken from the MAINTAINERS file, and are currently
+based only on the paths of the changed files. Results can be improved by
+adding more information to the MAINTAINERS file.
+
+TODO:
+    - Match using the subject of the patch/series.
+    - Add a configuration file to specify the priority of each tree.
+
+Configurations:
+The script uses tokens for authentication.
+If the arguments pw_{server,project,token} aren't passed, the environment
+variables PW_{SERVER,PROJECT,TOKEN} should be set. If not, the script will try
+to load the git configurations pw.{server,project,token}.
+
+Example usage:
+    ./get-tree.py --command list_trees_for_series 2054
+    ./get-tree.py --command list_trees_for_patch 2054
+
+The output will be a list of trees sorted based on number of matches,
+with the first line having the highest count.
+"""
+
+CONF = config.CONF
+CONF.debug = False
+
+MAINTAINERS_FILE_PATH = os.environ.get('MAINTAINERS_FILE_PATH')
+if not MAINTAINERS_FILE_PATH:
+    print('MAINTAINERS_FILE_PATH is not set.')
+    sys.exit(1)
+RULES = {}
+
+ignored_files_re = re.compile(r'^doc/|\.sh$|\.py$')
+
+def configure_git_pw(args=None):
+    """Configure git-pw."""
+    conf = {}
+    conf_keys = ['server', 'project', 'token']
+    for key in conf_keys:
+        value = getattr(args, 'pw_{}'.format(key))
+        if not value:
+            print('--pw_{} is a required git-pw configuration'.format(key))
+            sys.exit(1)
+        else:
+            setattr(CONF, key, value)
+
+def find_filenames(diff):
+    """Find file changes in a given diff.
+
+    Source: https://github.com/getpatchwork/patchwork/blob/master/patchwork/parser.py
+    Changes from source:
+        - Moved _filename_re into the method.
+        - Reduced newlines.
+    """
+    _filename_re = re.compile(r'^(---|\+\+\+) (\S+)')
+    # normalise spaces
+    diff = diff.replace('\r', '')
+    diff = diff.strip() + '\n'
+    filenames = {}
+    for line in diff.split('\n'):
+        if len(line) <= 0:
+            continue
+        filename_match = _filename_re.match(line)
+        if not filename_match:
+            continue
+        filename = filename_match.group(2)
+        if filename.startswith('/dev/null'):
+            continue
+        filename = '/'.join(filename.split('/')[1:])
+        filenames[filename] = True
+    filenames = sorted(filenames.keys())
+    return filenames
+
+def construct_rules():
+    """Build a dictionary of rules from the MAINTAINERS file."""
+    with open(MAINTAINERS_FILE_PATH) as fd:
+        maintainers = fd.read()
+    # Split into blocks of text for easier search.
+    maintainers = maintainers.split('\n\n')
+
+    # Extract blocks that have a tree and files.
+    tree_file_blocks = [_item for _item in maintainers \
+            if 'T: git://dpdk.org' in _item and 'F: ' in _item]
+    _dict = {}
+    for _item in tree_file_blocks:
+        # Get the tree url.
+        tree_match = re.search(r'T: (git://dpdk\.org[^\n]+)', _item)
+        if tree_match:
+            tree = tree_match.group(1)
+        else:
+            continue
+        if tree not in _dict:
+            _dict[tree] = {}
+            _dict[tree]['paths'] = []
+        paths = re.findall(r'F: ([^\n]+)', _item)
+        _paths = copy.deepcopy(paths)
+        for path in paths:
+            # Remove don't-care paths
+            if ignored_files_re.search(path):
+                _paths.remove(path)
+        _dict[tree]['paths'] += _paths
+    return _dict
+
+def get_subject(resource):
+    """Get subject from patch/series object,
+    remove its prefix and strip it.
+    """
+    name = resource['name']
+    return re.sub(r'^\[.*\]', '', name).strip()
+
+def find_matches(files):
+    """Find trees that the changed files in a patch match,
+    and stop at first match for each file."""
+    matches = []
+    for _file in files:
+        if ignored_files_re.search(_file):
+            continue
+        match_found = False
+        for tree in RULES.keys():
+            for rule in RULES[tree]['paths']:
+                if rule.endswith('/'):
+                    rule = '{}*'.format(rule)
+                if fnmatch.fnmatch(_file, rule):
+                    matches.append(tree)
+                    match_found = True
+                    break
+            if match_found:
+                break
+    return matches
+
+def get_ordered_matches(matches):
+    """Order matches by occurrences."""
+    match_counts = {item:matches.count(item) for item in matches}
+    return sorted(match_counts, key=match_counts.get, reverse=True)
+
+def list_trees_for_patch(patch):
+    """Find matching trees for a specific patch.
+    Matching is currently based on the changed paths only;
+    the subject is extracted but not used yet (see TODO).
+    """
+    subject = get_subject(patch)
+    files = find_filenames(patch['diff'])
+
+    matches = find_matches(files)
+    return matches
+
+def list_trees_for_series(series):
+    """Find matching trees for a series."""
+    patch_list = series['patches']
+
+    matches = []
+
+    for patch in patch_list:
+        matches = matches + \
+            list_trees_for_patch(api_get('patches', patch['id']))
+
+    return matches
+
+def parse_args():
+    """Parse command-line arguments."""
+    parser = argparse.ArgumentParser()
+    git_pw_conf_parser = parser.add_argument_group('git-pw configurations')
+    options_parser = parser.add_argument_group('optional arguments')
+
+    options_parser.add_argument('--command',
+            choices=('list_trees_for_patch',
+                'list_trees_for_series'),
+            required=True, help='command to perform on patch/series')
+
+    git_pw_conf_parser.add_argument('--pw_server', type=str,
+            default=os.environ.get('PW_SERVER', utils.git_config('pw.server')),
+            help='PW.SERVER')
+    git_pw_conf_parser.add_argument('--pw_project', type=str,
+            default=os.environ.get('PW_PROJECT', utils.git_config('pw.project')),
+            help='PW.PROJECT')
+    git_pw_conf_parser.add_argument('--pw_token', type=str,
+            default=os.environ.get('PW_TOKEN', utils.git_config('pw.token')),
+            help='PW.TOKEN')
+
+    parser.add_argument('id', type=int,
+            help='patch/series id')
+
+    args = parser.parse_args()
+
+    return args
+
+def main():
+    """Main procedure."""
+    args = parse_args()
+    configure_git_pw(args)
+
+    command = args.command
+    _id = args.id
+
+    global RULES
+    RULES = construct_rules()
+
+    tree_list = []
+
+    if command == 'list_trees_for_patch':
+        patch = api_get('patches', _id)
+        tree_list = list_trees_for_patch(patch)
+
+    elif command == 'list_trees_for_series':
+        series = api_get('series', _id)
+        tree_list = list_trees_for_series(series)
+
+    tree_list = get_ordered_matches(tree_list)
+
+    print('{}'.format('\n'.join(tree_list)))
+
+def api_get(resource_type, resource_id):
+    """Retrieve an API resource."""
+    try:
+        return api.detail(resource_type, resource_id)
+    except HTTPError as err:
+        if '404' in str(err):
+            sys.exit(1)
+        else:
+            raise
+
+if __name__ == '__main__':
+    main()
-- 
2.11.0
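
For readers who want to try the path extraction without a Patchwork server,
here is a standalone sketch of the same header-line approach that
find_filenames() takes. It is an illustration only: the `changed_paths` name
and the sample diff are made up for this example, and nothing here calls
git-pw.

```python
import re

# Same idea as find_filenames(): only the '---' / '+++' header lines
# of a unified diff are inspected.
FILENAME_RE = re.compile(r'^(---|\+\+\+) (\S+)')


def changed_paths(diff):
    """Return the sorted, de-duplicated paths touched by a unified diff."""
    paths = set()
    for line in diff.replace('\r', '').split('\n'):
        match = FILENAME_RE.match(line)
        if not match:
            continue
        name = match.group(2)
        if name.startswith('/dev/null'):
            continue  # added or deleted file: no path on this side
        # Drop the leading 'a/' or 'b/' component.
        paths.add('/'.join(name.split('/')[1:]))
    return sorted(paths)


# Hypothetical diff, shortened for the example.
SAMPLE_DIFF = """\
diff --git a/tools/get-tree.py b/tools/get-tree.py
new file mode 100755
--- /dev/null
+++ b/tools/get-tree.py
@@ -0,0 +1,2 @@
+#!/usr/bin/env python
+import os
"""

print(changed_paths(SAMPLE_DIFF))
```

One limitation of this approach is that a content line inside a hunk which
happens to start with `--- ` or `+++ ` is also treated as a file header;
ruling that out would need hunk-aware parsing.
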