DPDK patches and discussions
 help / color / mirror / Atom feed
From: Bruce Richardson <bruce.richardson@intel.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Cc: dev@dpdk.org, thomas@monjalon.net
Subject: Re: [dpdk-dev] [PATCH v2 1/2] devtools: script to check meson indentation of lists
Date: Mon, 26 Apr 2021 15:05:59 +0100	[thread overview]
Message-ID: <YIbIx8H6p2kMz9RA@bricha3-MOBL.ger.corp.intel.com> (raw)
In-Reply-To: <32b16ecf-be20-1ad2-43b1-df0f4864f681@intel.com>

On Mon, Apr 26, 2021 at 02:40:25PM +0100, Burakov, Anatoly wrote:
> On 26-Apr-21 11:54 AM, Bruce Richardson wrote:
> > This is a script to fix up minor formatting issues in meson files.
> > It scans for, and can optionally fix, indentation issues and missing
> > trailing commas in the lists in meson.build files. It also detects,
> > and can fix, multi-line lists where more than one entry appears on a
> > line.
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> 
> <snip>
> 
> > +def split_code_comments(line):
> > +    'splits a line into a code part and a comment part, returns (code, comment) tuple'
> > +    if line.lstrip().startswith('#'):
> > +        return ('', line)
> > +    elif '#' in line and '#include' not in line:  # catch 99% of cases, not 100% > +        idx = line.index('#')
> > +        while (line[idx - 1].isspace()):
> > +            idx -= 1
> > +        return line[:idx], line[idx:]
> 
> 
> I think this could be simplified with regex:
> 
> # find any occurrences of '#' but only if it's not an '#include'
> if not re.search(r'#(?!include)', line)
>     return line, ''
> return line.split('#', maxsplit=1)

Not sure that is simpler, and just splitting on '#' is actually not what we
want either.

Firstly, while r'#(?!include)' is not a massively complex regex, just
checking for "'#' in line and '#include' not in line" is just easier to
read for most mortals.

In terms of the split, I did initially do as you have here and split on
'#', but we don't actually want that, because we want to preserve the
whitespace in the line before the comment too - as part of the comment, not
the code. This is why after finding the '#' we walk backwards to find the
end of the code and find that as the split point. It then saves us worrying
about any strips() breaking any comment alignment the user has explicitly
set up. Not using split also means that we can just merge the strings back
with '+' rather than having to use "'#'.join()".

> 
> > +    else:
> > +        return (line, '')
> > +
> > +
> > +def setline(contents, index, value):
> > +    'sets the contents[index] to value. Returns the line, along with code and comments part'
> > +    line = contents[index] = value
> > +    code, comments = split_code_comments(line)
> > +    return line, code, comments
> > +
> > +
> > +def check_indentation(filename, contents):
> > +    '''check that a list or files() is correctly indented'''
> > +    infiles = False
> > +    inlist = False
> > +    edit_count = 0
> > +    for lineno, line in enumerate(contents):
> > +        code, comments = split_code_comments(line)
> 
> Nitpicking, but maybe instead of calling strip() all over the place, just
> count the number of spaces and strip right at the outset? E.g. something
> like:
> 
> stripped = code.strip()
> line_indent = len(code) - len(stripped)
> 
> You can then reason about indent levels by comparing stripped to code
> afterwards, and avoid doing this:
> 
> > +            # skip further subarrays or lists
> > +            if '[' in code or ']' in code:
> > +                continue
> > +            if not code.startswith(indent) or code[len(indent)] == ' ':
> 
> Opting to just check the indent size you calculated initially. Unless i'm
> missing something :)
> 
> You could also increment edit_count if `calculated indent + stripped` is
> equal to `code`. Seems easier logic than raw string manipulation you're
> going for here...
> 
> <snip>

Interesting. That could be a good approach alright. If I do a V3 (not
guaranteed for this release) I can try taking that idea on board.

> 
> > +def process_file(filename, fix):
> > +    '''run checks on file "filename"'''
> > +    if VERBOSE:
> > +        print(f'Processing {filename}')
> > +    with open(filename) as f:
> > +        contents = [ln.rstrip() for ln in f.readlines()]
> 
> So any trailing whitespace gets automatically and silently fixed?
> 
Hadn't actually thought of that, but yes, that will happen if --fix is
given and other changes are made to the file. Ideally, that should be fixed
to "non-silently" do so, but I'd view it as low priority since other tools
tend to be good at flagging trailing whitespace issues anyway.

> > +
> > +    if check_indentation(filename, contents) > 0 and fix:
> > +        print(f"Fixing {filename}")
> > +        with open(filename, 'w') as f:
> > +            f.writelines([f'{ln}\n' for ln in contents])
> 
> Something seems suspect here. So, if `fix` is *not* specified, the script
> just opens the file, reads it, and... does nothing else?
> 
No, it prints out all the errors without actually fixing them.

Regards,
/Bruce

  reply	other threads:[~2021-04-26 14:06 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-21 22:03 [dpdk-dev] [PATCH] drivers: fix indentation in build files Thomas Monjalon
2021-04-22  8:39 ` Bruce Richardson
2021-04-22  9:20   ` Thomas Monjalon
2021-04-26 10:56     ` Bruce Richardson
2021-04-22  9:02 ` [dpdk-dev] [RFC PATCH] devtools: script to check meson indentation of lists Bruce Richardson
2021-04-22  9:40   ` Burakov, Anatoly
2021-04-22  9:58     ` Bruce Richardson
2021-04-22 10:21       ` Burakov, Anatoly
2021-04-22 10:31         ` Bruce Richardson
2021-04-26 10:54   ` [dpdk-dev] [PATCH v2 1/2] " Bruce Richardson
2021-04-26 10:54     ` [dpdk-dev] [PATCH v2 2/2] build: style fixup on meson files Bruce Richardson
2021-04-26 13:40     ` [dpdk-dev] [PATCH v2 1/2] devtools: script to check meson indentation of lists Burakov, Anatoly
2021-04-26 14:05       ` Bruce Richardson [this message]
2021-04-26 14:48         ` Burakov, Anatoly
2021-05-04 13:05     ` Thomas Monjalon
2021-05-04 13:34       ` David Marchand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YIbIx8H6p2kMz9RA@bricha3-MOBL.ger.corp.intel.com \
    --to=bruce.richardson@intel.com \
    --cc=anatoly.burakov@intel.com \
    --cc=dev@dpdk.org \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).