From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ali Alnubani
To: "ci@dpdk.org"
CC: Thomas Monjalon, "ferruh.yigit@intel.com", "jplsek@iol.unh.edu", Ori Kam
Date: Tue, 12 Feb 2019 14:48:46 +0000
Message-ID: <20190212144828.18122-1-alialnu@mellanox.com>
Subject: [dpdk-ci] [PATCH v2] add script to decide best tree match for patches
List-Id: DPDK CI discussions

The script can be used to get the trees that best match a patch or a series.

Signed-off-by: Ali Alnubani
Signed-off-by: Ori Kam
---
Changes in v2:
- Renamed script.
- Updated license.
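
For readers who want the gist before the diff, here is a minimal sketch of
the matching idea. It is not the script itself. It assumes rules shaped like
the output of construct_rules() in the patch; the RULES contents, the
rank_trees() name and the sample file paths are made up for illustration,
and collections.Counter is swapped in for the script's own counting. The
filtering of doc/ and script files is left out.

```python
import fnmatch
from collections import Counter

# Hypothetical rules, shaped like the dictionary construct_rules() builds:
# one entry per tree, each with a list of MAINTAINERS-style "F:" paths.
RULES = {
    'git://dpdk.org/next/dpdk-next-net': {
        'paths': ['drivers/net/', 'lib/librte_ethdev/'],
    },
    'git://dpdk.org/next/dpdk-next-crypto': {
        'paths': ['drivers/crypto/'],
    },
}


def rank_trees(changed_files, rules):
    """Return trees ordered by how many changed files they match."""
    counts = Counter()
    for path in changed_files:
        for tree, rule in rules.items():
            # A rule ending in '/' covers everything below that directory.
            patterns = [p + '*' if p.endswith('/') else p
                        for p in rule['paths']]
            if any(fnmatch.fnmatch(path, pat) for pat in patterns):
                counts[tree] += 1
                break  # stop at the first tree that matches this file
    return [tree for tree, _ in counts.most_common()]


files = [
    'drivers/net/mlx5/mlx5.c',
    'drivers/net/mlx5/mlx5_rxq.c',
    'drivers/crypto/openssl/rte_openssl_pmd.c',
]
print('\n'.join(rank_trees(files, RULES)))
```

Trees with more matching files sort first, which mirrors the ordering the
script prints.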
 tools/guess-git-tree.py | 244 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 244 insertions(+)
 create mode 100755 tools/guess-git-tree.py

diff --git a/tools/guess-git-tree.py b/tools/guess-git-tree.py
new file mode 100755
index 0000000..f4caef5
--- /dev/null
+++ b/tools/guess-git-tree.py
@@ -0,0 +1,244 @@
+#!/usr/bin/env python
+
+# SPDX-License-Identifier: (BSD-3-Clause AND GPL-2.0-or-later AND MIT)
+# Copyright 2019 Mellanox Technologies, Ltd
+
+import os
+import sys
+import re
+import argparse
+import copy
+import fnmatch
+
+from requests.exceptions import HTTPError
+
+from git_pw import config
+from git_pw import api
+from git_pw import utils
+
+"""
+This script uses the git-pw API to retrieve Patchwork's series/patches,
+and find a list of trees/repos that best match the series/patch.
+
+The rules on which matches are based, are taken from the MAINTAINERS file,
+and currently only based on the paths of the changed files. Results can be
+improved by adding more information to the MAINTAINERS file.
+
+TODO:
+  - Match using the subject of the patch/series.
+  - Add a configuration file to specify the priority of each tree.
+
+Configurations:
+The script uses tokens for authentication.
+If the arguments pw_{server,project,token} aren't passed, the environment
+variables PW_{SERVER,PROJECT,TOKEN} should be set. If not, the script will try
+to load the git configurations pw.{server,project,token}.
+
+Example usage:
+    ./guess-git-tree.py --command list_trees_for_series 2054
+    ./guess-git-tree.py --command list_trees_for_patch 2054
+
+The output will be a list of trees sorted based on number of matches,
+with the first line having the highest count.
+"""
+
+CONF = config.CONF
+CONF.debug = False
+
+MAINTAINERS_FILE_PATH = os.environ.get('MAINTAINERS_FILE_PATH')
+if not MAINTAINERS_FILE_PATH:
+    print('MAINTAINERS_FILE_PATH is not set.')
+    sys.exit(1)
+RULES = {}
+
+ignored_files_re = re.compile(r'^doc/|\.sh$|\.py$')
+
+def configure_git_pw(args=None):
+    """Configure git-pw."""
+    conf = {}
+    conf_keys = ['server', 'project', 'token']
+    for key in conf_keys:
+        value = getattr(args, 'pw_{}'.format(key))
+        if not value:
+            print('--pw_{} is a required git-pw configuration'.format(key))
+            sys.exit(1)
+        else:
+            setattr(CONF, key, value)
+
+def find_filenames(diff):
+    """Find file changes in a given diff.
+
+    Source: https://github.com/getpatchwork/patchwork/blob/master/patchwork/parser.py
+    Changes from source:
+    - Moved _filename_re into the method.
+    - Reduced newlines.
+    """
+    _filename_re = re.compile(r'^(---|\+\+\+) (\S+)')
+    # normalise spaces
+    diff = diff.replace('\r', '')
+    diff = diff.strip() + '\n'
+    filenames = {}
+    for line in diff.split('\n'):
+        if len(line) <= 0:
+            continue
+        filename_match = _filename_re.match(line)
+        if not filename_match:
+            continue
+        filename = filename_match.group(2)
+        if filename.startswith('/dev/null'):
+            continue
+        filename = '/'.join(filename.split('/')[1:])
+        filenames[filename] = True
+    filenames = sorted(filenames.keys())
+    return filenames
+
+def construct_rules():
+    """Build a dictionary of rules from the MAINTAINERS file."""
+    with open(MAINTAINERS_FILE_PATH) as fd:
+        maintainers = fd.read()
+    # Split into blocks of text for easier search.
+    maintainers = maintainers.split('\n\n')
+
+    # Extract blocks that have a tree and files.
+    tree_file_blocks = [_item for _item in maintainers \
+            if 'T: git://dpdk.org' in _item and 'F: ' in _item]
+    _dict = {}
+    for _item in tree_file_blocks:
+        # Get the tree url.
+        tree_match = re.search(r'T: (git://dpdk\.org[^\n]+)', _item)
+        if tree_match:
+            tree = tree_match.group(1)
+        else:
+            continue
+        if tree not in _dict:
+            _dict[tree] = {}
+            _dict[tree]['paths'] = []
+        paths = re.findall(r'F: ([^\n]+)', _item)
+        _paths = copy.deepcopy(paths)
+        for path in paths:
+            # Remove don't-care paths
+            if ignored_files_re.search(path):
+                _paths.remove(path)
+        _dict[tree]['paths'] += _paths
+    return _dict
+
+def get_subject(resource):
+    """Get subject from patch/series object,
+    remove its prefix and strip it.
+    """
+    name = resource['name']
+    return re.sub(r'^\[.*\]', '', name).strip()
+
+def find_matches(files):
+    """Find trees that the changed files in a patch match,
+    and stop at first match for each file."""
+    matches = []
+    for _file in files:
+        if ignored_files_re.search(_file):
+            continue
+        match_found = False
+        for tree in RULES.keys():
+            for rule in RULES[tree]['paths']:
+                if rule.endswith('/'):
+                    rule = '{}*'.format(rule)
+                if fnmatch.fnmatch(_file, rule):
+                    matches.append(tree)
+                    match_found = True
+                    break
+            if match_found:
+                break
+    return matches
+
+def get_ordered_matches(matches):
+    """Order matches by occurrences."""
+    match_counts = {item:matches.count(item) for item in matches}
+    return sorted(match_counts, key=match_counts.get, reverse=True)
+
+def list_trees_for_patch(patch):
+    """Find matching trees for a specific patch.
+    For a patch to match a tree, at least one changed path has to
+    match the tree. Matching on the subject is still a TODO.
+    """
+    subject = get_subject(patch)
+    files = find_filenames(patch['diff'])
+
+    matches = find_matches(files)
+    return matches
+
+def list_trees_for_series(series):
+    """Find matching trees for a series."""
+    patch_list = series['patches']
+
+    matches = []
+
+    for patch in patch_list:
+        matches = matches + \
+                list_trees_for_patch(api_get('patches', patch['id']))
+
+    return matches
+
+def parse_args():
+    """Parse command-line arguments."""
+    parser = argparse.ArgumentParser()
+    git_pw_conf_parser = parser.add_argument_group('git-pw configurations')
+    options_parser = parser.add_argument_group('optional arguments')
+
+    options_parser.add_argument('--command',
+            choices=('list_trees_for_patch',
+                'list_trees_for_series'),
+            required=True, help='command to perform on patch/series')
+
+    git_pw_conf_parser.add_argument('--pw_server', type=str,
+            default=os.environ.get('PW_SERVER', utils.git_config('pw.server')),
+            help='PW.SERVER')
+    git_pw_conf_parser.add_argument('--pw_project', type=str,
+            default=os.environ.get('PW_PROJECT', utils.git_config('pw.project')),
+            help='PW.PROJECT')
+    git_pw_conf_parser.add_argument('--pw_token', type=str,
+            default=os.environ.get('PW_TOKEN', utils.git_config('pw.token')),
+            help='PW.TOKEN')
+
+    parser.add_argument('id', type=int,
+            help='patch/series id')
+
+    args = parser.parse_args()
+
+    return args
+
+def main():
+    """Main procedure."""
+    args = parse_args()
+    configure_git_pw(args)
+
+    command = args.command
+    _id = args.id
+
+    global RULES
+    RULES = construct_rules()
+
+    tree_list = []
+
+    if command == 'list_trees_for_patch':
+        patch = api_get('patches', _id)
+        tree_list = list_trees_for_patch(patch)
+
+    elif command == 'list_trees_for_series':
+        series = api_get('series', _id)
+        tree_list = list_trees_for_series(series)
+
+    tree_list = get_ordered_matches(tree_list)
+
+    print('{}'.format('\n'.join(tree_list)))
+
+def api_get(resource_type, resource_id):
+    """Retrieve an API resource."""
+    try:
+        return api.detail(resource_type, resource_id)
+    except HTTPError as err:
+        if '404' in str(err):
+            sys.exit(1)
+        else:
+            raise
+
+if __name__ == '__main__':
+    main()
-- 
2.11.0