From: Stephen Hemminger
To: dev@dpdk.org
Cc: Stephen Hemminger
Subject: [PATCH] devtools: add script to check for non inclusive naming
Date: Wed, 19 Apr 2023 08:00:20 -0700
Message-Id: <20230419150020.65805-1-stephen@networkplumber.org>
In-Reply-To: <0230331200824.195294-1-stephen@networkplumber.org>
References: <0230331200824.195294-1-stephen@networkplumber.org>

Add a script to find words that should not be used.
It is just a wrapper around the git grep command.
By default it prints the matching lines.

It uses the word lists from the Inclusive Naming Initiative,
see https://inclusivenaming.org/word-lists/

Examples:

$ ./devtools/check-inclusive-naming.py -c
app/test-compress-perf/comp_perf_test_cyclecount.c:1
app/test-compress-perf/comp_perf_test_throughput.c:1
app/test-compress-perf/comp_perf_test_verify.c:1
app/test/test_common.c:1
...

$ ./devtools/check-inclusive-naming.py lib/pcapng
lib/pcapng/rte_pcapng.c: /* sanity check that is really a pcapng mbuf */

$ ./devtools/check-inclusive-naming.py -l lib/eal
lib/eal/common/eal_common_memory.c
lib/eal/common/eal_common_proc.c
lib/eal/common/eal_common_trace.c
lib/eal/common/eal_memcfg.h
lib/eal/common/rte_malloc.c
lib/eal/freebsd/eal.c
lib/eal/include/generic/rte_power_intrinsics.h
lib/eal/include/generic/rte_rwlock.h
lib/eal/include/generic/rte_spinlock.h
lib/eal/include/rte_debug.h
lib/eal/include/rte_seqlock.h
lib/eal/linux/eal.c
lib/eal/windows/eal.c
lib/eal/x86/include/rte_rtm.h
lib/eal/x86/include/rte_spinlock.h
lib/eal/x86/rte_power_intrinsics.c

$ ./devtools/check-inclusive-naming.py -h
usage: check-inclusive-naming.py [-h] [-c] [-d] [-l] [-t {0,1,2,3}]
                                 [-x EXCLUDE] [--url URL]
                                 [paths ...]

Identify word usage not aligned with inclusive naming

positional arguments:
  paths                 files and directory to scan

options:
  -h, --help            show this help message and exit
  -c, --count           Show the number of lines that match
  -d, --debug           Debug this script
  -l, --files-with-matches
                        Show only names of files with hits
  -t {0,1,2,3}, --tier {0,1,2,3}
                        Show non-conforming words of particular tier
  -x EXCLUDE, --exclude EXCLUDE
                        Exclude path from scan
  --url URL             URL for the non-inclusive naming word list

Signed-off-by: Stephen Hemminger
---
v4 - fix python lint warnings and spelling

 MAINTAINERS                        |   1 +
 devtools/check-inclusive-naming.py | 137 +++++++++++++++++++++++++++++
 2 files changed, 138 insertions(+)
 create mode 100755 devtools/check-inclusive-naming.py

diff --git a/MAINTAINERS b/MAINTAINERS
index 8df23e50999f..141d70596020 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -89,6 +89,7 @@ F: devtools/check-doc-vs-code.sh
 F: devtools/check-dup-includes.sh
 F: devtools/check-maintainers.sh
 F: devtools/check-forbidden-tokens.awk
+F: devtools/check-inclusive-naming.py
 F: devtools/check-git-log.sh
 F: devtools/check-spdx-tag.sh
 F: devtools/check-symbol-change.sh
diff --git a/devtools/check-inclusive-naming.py b/devtools/check-inclusive-naming.py
new file mode 100755
index 000000000000..916d49aa2ecf
--- /dev/null
+++ b/devtools/check-inclusive-naming.py
@@ -0,0 +1,137 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2023 Stephen Hemminger
+#
+# This script scans the source tree and creates a list of files
+# containing words that are recommended to be avoided by the
+# Inclusive Naming Initiative.
+# See: https://inclusivenaming.org/word-lists/
+
+"""Script to run git grep to find strings in inclusive naming word list."""
+
+import argparse
+import json
+import subprocess
+import sys
+from urllib.request import urlopen
+
+WORDLIST_URL = 'https://inclusivenaming.org/word-lists/index.json'
+
+# These give false positives
+dont_scan = [
+    'doc/guides/rel_notes/',
+    'doc/guides/contributing/coding_style.rst',
+    'doc/guides/prog_guide/glossary.rst'
+]
+
+
+def args_parse():
+    "parse arguments and return the argument object back to main"
+
+    parser = argparse.ArgumentParser(
+        description="Identify word usage not aligned with inclusive naming")
+    parser.add_argument("-c",
+                        "--count",
+                        help="Show the number of lines that match",
+                        action='store_true')
+    parser.add_argument("-d",
+                        "--debug",
+                        default=False,
+                        help="Debug this script",
+                        action='store_true')
+    parser.add_argument("-l",
+                        "--files-with-matches",
+                        help="Show only names of files with hits",
+                        action='store_true')
+    parser.add_argument("-n",
+                        "--line-number",
+                        help="Prefix matching lines with their line number",
+                        action='store_true')
+    # note: tier 0 is "OK to use"
+    parser.add_argument("-t",
+                        "--tier",
+                        type=int,
+                        choices=range(0, 4),
+                        action='append',
+                        help="Show non-conforming words of particular tier")
+    parser.add_argument('-x',
+                        "--exclude",
+                        default=dont_scan,
+                        action='append',
+                        help="Exclude path from scan")
+    parser.add_argument('--url',
+                        default=WORDLIST_URL,
+                        help="URL for the non-inclusive naming word list")
+    parser.add_argument('paths', nargs='*',
+                        help='files and directory to scan')
+
+    return parser.parse_args()
+
+
+def fetch_wordlist(url, tiers):
+    "Read list of words from inclusivenaming.org"
+
+    # The word list is returned as JSON like:
+    # {
+    #     "data" :
+    #     [
+    #         {
+    #             "term": "abort",
+    #             "tier" : "1",
+    #             "recommendation": "Replace when possible.",
+    #     ...
+    with urlopen(url) as response:
+        entries = json.loads(response.read())['data']
+
+    wordlist = []
+    for item in entries:
+        tier = int(item['tier'])
+        if tier in tiers:
+            # match hyphenated terms written with a hyphen or a space
+            pattern = item['term'].replace('-', '[- ]')
+            wordlist.append(pattern.lower())
+    return wordlist
+
+
+def git_args(args):
+    "Construct command line based on args"
+
+    # Default to Tier 1, 2 and 3.
+    if args.tier:
+        tiers = args.tier
+    else:
+        tiers = list(range(1, 4))
+
+    wordlist = fetch_wordlist(args.url, tiers)
+    if args.debug:
+        print(f"Matching on: {wordlist}")
+
+    cmd = ['git', 'grep', '-i']
+    if args.files_with_matches:
+        cmd.append('-l')
+    if args.count:
+        cmd.append('-c')
+    if args.line_number:
+        cmd.append('-n')
+    for word in wordlist:
+        cmd.append('-e')
+        cmd.append(word)
+    cmd.append('--')
+    # turn the excluded paths (defaults plus any -x) into exclude pathspecs
+    for path in args.exclude:
+        cmd.append(f":^{path}")
+    cmd += args.paths
+    if args.debug:
+        print(cmd)
+    return cmd
+
+
+def main():
+    "Parse command line arguments, then run git grep with them"
+
+    grep = subprocess.run(git_args(args_parse()), check=False)
+    sys.exit(grep.returncode)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.39.2
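
For reviewers who want to see what the wrapper hands to git grep without
touching the network, here is a minimal sketch of the same two steps
(filter terms by tier, then assemble the argument list). It is not part of
the patch. The helper names build_patterns and build_command, the SAMPLE
data and the terms "foo-bar" and "Baz" are hypothetical and only stand in
for the JSON that the script fetches from the word list URL; "abort" is the
only term taken from the patch.

```python
import json

# Hypothetical sample shaped like the JSON described in fetch_wordlist();
# the real list comes from inclusivenaming.org and will differ.
SAMPLE = json.dumps({
    "data": [
        {"term": "abort", "tier": "1"},
        {"term": "foo-bar", "tier": "2"},   # made-up hyphenated term
        {"term": "Baz", "tier": "0"},       # made-up tier 0 ("OK to use") term
    ]
})


def build_patterns(raw_json, tiers):
    """Return lower-cased grep patterns for terms in the selected tiers."""
    patterns = []
    for item in json.loads(raw_json)["data"]:
        if int(item["tier"]) in tiers:
            # A hyphenated term should also match when written with a space.
            patterns.append(item["term"].replace("-", "[- ]").lower())
    return patterns


def build_command(patterns, excludes, paths):
    """Assemble the git grep argument list without running it."""
    cmd = ["git", "grep", "-i"]
    for pattern in patterns:
        cmd += ["-e", pattern]
    cmd.append("--")
    # ':^path' is git pathspec shorthand for excluding a path.
    cmd += [f":^{path}" for path in excludes]
    cmd += paths
    return cmd


if __name__ == "__main__":
    words = build_patterns(SAMPLE, tiers=[1, 2, 3])
    print(build_command(words, ["doc/guides/rel_notes/"], ["lib/eal"]))
```

Keeping the command as a list passed to subprocess.run, as the patch does,
avoids shell quoting problems with patterns such as "foo[- ]bar".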