From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 384D8A0548; Mon, 26 Apr 2021 16:49:03 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id ADA0F41110; Mon, 26 Apr 2021 16:49:02 +0200 (CEST) Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by mails.dpdk.org (Postfix) with ESMTP id 93D0C41104 for ; Mon, 26 Apr 2021 16:49:00 +0200 (CEST) IronPort-SDR: SlPSHYTBisXNGUH609ijV6twnNzwStPpXoPodTCvlVlUhiejYCDFIvtlbSZ/rWilAL4nCkU+oa lPxVd/KpOgkQ== X-IronPort-AV: E=McAfee;i="6200,9189,9966"; a="196406139" X-IronPort-AV: E=Sophos;i="5.82,252,1613462400"; d="scan'208";a="196406139" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Apr 2021 07:48:58 -0700 IronPort-SDR: Wa4Faf+tWrPbIOgNP3yzso4UxqrodbPPWDL1UHdxCpo2tynK7yNZKQVYv1HNCAB3sPcauaMKUN NNkeqjzVfBfQ== X-IronPort-AV: E=Sophos;i="5.82,252,1613462400"; d="scan'208";a="536248219" Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.213.222.4]) ([10.213.222.4]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Apr 2021 07:48:57 -0700 To: Bruce Richardson Cc: dev@dpdk.org, thomas@monjalon.net References: <20210422090211.320855-1-bruce.richardson@intel.com> <20210426105403.226004-1-bruce.richardson@intel.com> <32b16ecf-be20-1ad2-43b1-df0f4864f681@intel.com> From: "Burakov, Anatoly" Message-ID: <58a49a64-62c5-d633-08c1-d57e5e9acd4c@intel.com> Date: Mon, 26 Apr 2021 15:48:54 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.10.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v2 1/2] devtools: script to check meson indentation of lists X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 26-Apr-21 3:05 PM, Bruce Richardson wrote: > On Mon, Apr 26, 2021 at 02:40:25PM +0100, Burakov, Anatoly wrote: >> On 26-Apr-21 11:54 AM, Bruce Richardson wrote: >>> This is a script to fix up minor formatting issues in meson files. >>> It scans for, and can optionally fix, indentation issues and missing >>> trailing commas in the lists in meson.build files. It also detects, >>> and can fix, multi-line lists where more than one entry appears on a >>> line. >>> >>> Signed-off-by: Bruce Richardson >>> --- >> >> >> >>> +def split_code_comments(line): >>> + 'splits a line into a code part and a comment part, returns (code, comment) tuple' >>> + if line.lstrip().startswith('#'): >>> + return ('', line) >>> + elif '#' in line and '#include' not in line: # catch 99% of cases, not 100% > + idx = line.index('#') >>> + while (line[idx - 1].isspace()): >>> + idx -= 1 >>> + return line[:idx], line[idx:] >> >> >> I think this could be simplified with regex: >> >> # find any occurrences of '#' but only if it's not an '#include' >> if not re.search(r'#(?!include)', line) >> return line, '' >> return line.split('#', maxsplit=1) > > Not sure that is simpler, and just splitting on '#' is actually not what we > want either. > > Firstly, while r'#(?!include)' is not a massively complex regex, just > checking for "'#' in line and '#include' not in line" is just easier to > read for most mortals. > > In terms of the split, I did initially do as you have here and split on > '#', but we don't actually want that, because we want to preserve the > whitespace in the line before the comment too - as part of the comment, not > the code. This is why after finding the '#' we walk backwards to find the > end of the code and find that as the split point. It then saves us worrying > about any strips() breaking any comment alignment the user has explicitly > set up. Not using split also means that we can just merge the strings back > with '+' rather than having to use "'#'.join()". Fair enough so! > >> >>> + else: >>> + return (line, '') >>> + >>> + >>> +def setline(contents, index, value): >>> + 'sets the contents[index] to value. Returns the line, along with code and comments part' >>> + line = contents[index] = value >>> + code, comments = split_code_comments(line) >>> + return line, code, comments >>> + >>> + >>> +def check_indentation(filename, contents): >>> + '''check that a list or files() is correctly indented''' >>> + infiles = False >>> + inlist = False >>> + edit_count = 0 >>> + for lineno, line in enumerate(contents): >>> + code, comments = split_code_comments(line) >> >> Nitpicking, but maybe instead of calling strip() all over the place, just >> count the number of spaces and strip right at the outset? E.g. something >> like: >> >> stripped = code.strip() >> line_indent = len(code) - len(stripped) >> >> You can then reason about indent levels by comparing stripped to code >> afterwards, and avoid doing this: >> >>> + # skip further subarrays or lists >>> + if '[' in code or ']' in code: >>> + continue >>> + if not code.startswith(indent) or code[len(indent)] == ' ': >> >> Opting to just check the indent size you calculated initially. Unless i'm >> missing something :) >> >> You could also increment edit_count if `calculated indent + stripped` is >> equal to `code`. Seems easier logic than raw string manipulation you're >> going for here... >> >> > > Interesting. That could be a good approach alright. If I do a V3 (not > guaranteed for this release) I can try taking that idea on board. > >> >>> +def process_file(filename, fix): >>> + '''run checks on file "filename"''' >>> + if VERBOSE: >>> + print(f'Processing {filename}') >>> + with open(filename) as f: >>> + contents = [ln.rstrip() for ln in f.readlines()] >> >> So any trailing whitespace gets automatically and silently fixed? >> > Hadn't actually thought of that, but yes, that will happen if --fix is > given and other changes are made to the file. Ideally, that should be fixed > to "non-silently" do so, but I'd view it as low priority since other tools > tend to be good at flagging trailing whitespace issues anyway. Yeah, i think there's not a lot of trailing whitespace there in the first place, so probably a non issue. > >>> + >>> + if check_indentation(filename, contents) > 0 and fix: >>> + print(f"Fixing {filename}") >>> + with open(filename, 'w') as f: >>> + f.writelines([f'{ln}\n' for ln in contents]) >> >> Something seems suspect here. So, if `fix` is *not* specified, the script >> just opens the file, reads it, and... does nothing else? >> > No, it prints out all the errors without actually fixing them. > Oh, right, check_indentation will run first. Never mind! In any case, Reviewed-by: Anatoly Burakov -- Thanks, Anatoly