From: Bruce Richardson <bruce.richardson@intel.com>
To: Robin Jarry <rjarry@redhat.com>
Cc: <dev@dpdk.org>, <marat.khalili@huawei.com>
Subject: Re: [PATCH v2 1/2] devtools/mailmap_ctl: script to work with mailmap
Date: Fri, 17 Oct 2025 14:35:06 +0100
Message-ID: <aPJGCjDJ5Gu3UQkg@bricha3-mobl1.ger.corp.intel.com>
In-Reply-To: <DDIYNM2UVM02.I4Z8DDUEEU68@redhat.com>
On Wed, Oct 15, 2025 at 04:20:57PM +0200, Robin Jarry wrote:
> Hi Bruce, see my comments inline.
>
Thanks for the review. Taking nearly all of the feedback in v3. See inline below.
/Bruce
> Bruce Richardson, Aug 08, 2025 at 23:08:
> > Add a script to easily add entries to, check and sort the mailmap file.
> >
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> > devtools/mailmap_ctl.py | 212 ++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 212 insertions(+)
> > create mode 100755 devtools/mailmap_ctl.py
> >
> > diff --git a/devtools/mailmap_ctl.py b/devtools/mailmap_ctl.py
> > new file mode 100755
> > index 0000000000..15548c54cd
> > --- /dev/null
> > +++ b/devtools/mailmap_ctl.py
> > @@ -0,0 +1,212 @@
> > +#!/usr/bin/env python3
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2025 Intel Corporation
> > +
> > +"""
> > +A tool for manipulating the .mailmap file in DPDK repository.
> > +
> > +This script supports three operations:
> > +- add: adds a new entry to the mailmap file in the correct position
> > +- check: validates mailmap entries are sorted and correctly formatted
> > +- sort: sorts the mailmap entries alphabetically by name
>
> You can remove the second paragraph and the bullet points. See at the
> end.
>
Ack, will remove in v3
> > +"""
> > +
> > +import sys
> > +import re
> > +import argparse
> > +import itertools
> > +import unicodedata
> > +from pathlib import Path
> > +from dataclasses import dataclass
> > +from typing import List, Optional
>
> That's a matter of preference, but for such a small script I prefer to
> only import top level modules, e.g.:
>
Given that the script is short, I prefer the opposite: there is so little
context that there is no need to keep the module namespace on names like
Path or dataclass.
> import pathlib
> import dataclasses
>
> By the way, since python 3.10, you can use the builtin symbols instead
> of importing typing symbols. See below for more details.
>
Ack. Not sure what our minimum Python version is, but this is not a regular
user tool, so requiring 3.10 should be fine for any devs or maintainers.
> > +
> > +
> > +@dataclass
> > +class MailmapEntry:
> > + """Represents a single mailmap entry."""
> > +
> > + name: str
> > + name_for_sorting: str
> > + email1: str
> > + email2: Optional[str]
>
> Replace with python 3.10 syntax:
>
> email2: str | None
>
Will fix in v3
> > + line_number: int
> > +
> > + def __str__(self) -> str:
> > + """Format the entry back to mailmap string format."""
> > + return f"{self.name} <{self.email1}>" + (f" <{self.email2}>" if self.email2 else "")
> > +
> > + @staticmethod
> > + def _get_name_for_sorting(name) -> str:
> > + """Normalize a name for sorting purposes."""
> > + # Remove accents/diacritics. Separate accented chars into two - so accent is separate,
> > + # then remove the accent.
> > + normalized = unicodedata.normalize("NFD", name)
> > + normalized = "".join(c for c in normalized if unicodedata.category(c) != "Mn")
> > +
> > + return normalized.lower()
> > +
> > + @classmethod
> > + def parse(cls, line: str, line_number: int = 0):
> > + """
> > + Parse a mailmap line and create a MailmapEntry instance.
> > +
> > + Valid formats:
> > + - Name <email>
> > + - Name <primary_email> <secondary_email>
> > + """
> > + # Pattern to match mailmap entries
> > + # Group 1: Name, Group 2: first email, Group 3: optional second email
> > + pattern = r"^([^<]+?)\s*<([^>]+)>(?:\s*<([^>]+)>)?$"
> > + match = re.match(pattern, line.strip())
> > + if not match:
> > + raise argparse.ArgumentTypeError(f"Invalid entry format: '{line}'")
> > +
> > + name = match.group(1).strip()
> > + primary_email = match.group(2).strip()
> > + secondary_email = match.group(3).strip() if match.group(3) else None
> > +
> > + return cls(
> > + name=name,
> > + name_for_sorting=cls._get_name_for_sorting(name),
> > + email1=primary_email,
> > + email2=secondary_email,
> > + line_number=line_number,
> > + )
> > +
> > +
> > +def read_and_parse_mailmap(mailmap_path: Path, fail_on_err: bool) -> List[MailmapEntry]:
>
> Replace with python 3.9 builtin list and use pathlib module directly:
>
> def read_and_parse_mailmap(mailmap_path: pathlib.Path, fail_on_err: bool) -> list[MailmapEntry]:
>
Ack for using builtin lists. Keep un-namespaced Path.
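For illustration, a signature in that style could look like the sketch
below: builtin list for the return annotation (Python 3.9+) and Path
imported directly from pathlib. The helper name read_entry_lines is
hypothetical and does not appear in the patch:

```python
from pathlib import Path


def read_entry_lines(mailmap_path: Path) -> list[str]:
    """Return stripped, non-blank, non-comment lines (hypothetical helper)."""
    with open(mailmap_path, "r", encoding="utf-8") as f:
        stripped = (line.strip() for line in f)
        # The list is built while the file is still open.
        return [s for s in stripped if s and not s.startswith("#")]
```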
> > + """Read and parse a mailmap file, returning entries."""
> > + try:
> > + with open(mailmap_path, "r", encoding="utf-8") as f:
> > + lines = f.readlines()
> > + except IOError as e:
> > + print(f"Error reading {mailmap_path}: {e}", file=sys.stderr)
> > + sys.exit(1)
> > +
> > + entries = []
> > + for line_num, line in enumerate(lines, 1):
> > + stripped_line = line.strip()
> > +
> > + # Skip empty lines and comments
> > + if not stripped_line or stripped_line.startswith("#"):
> > + continue
> > +
> > + try:
> > + entry = MailmapEntry.parse(stripped_line, line_num)
> > + except argparse.ArgumentTypeError as e:
> > + print(f"Line {line_num}: {e}", file=sys.stderr)
> > + if fail_on_err:
> > + sys.exit(1)
> > + continue
> > +
> > + entries.append(entry)
> > + return entries
> > +
> > +
> > +def write_entries_to_file(mailmap_path: Path, entries: List[MailmapEntry]):
>
> Same:
>
> def write_entries_to_file(mailmap_path: pathlib.Path, entries: list[MailmapEntry]):
>
> > + """Write entries to mailmap file."""
> > + try:
> > + with open(mailmap_path, "w", encoding="utf-8") as f:
> > + for entry in entries:
> > + f.write(str(entry) + "\n")
> > + except IOError as e:
> > + print(f"Error writing {mailmap_path}: {e}", file=sys.stderr)
> > + sys.exit(1)
> > +
> > +
> > +def check_mailmap(mailmap_path, _):
> > + """Check that mailmap entries are correctly sorted and formatted."""
> > + entries = read_and_parse_mailmap(mailmap_path, False)
> > +
> > + errors = 0
> > + for e1, e2 in itertools.pairwise(entries):
> > + if e1.name_for_sorting > e2.name_for_sorting:
> > + print(
> > + f"Line {e2.line_number}: '{e2.name}' should come before '{e1.name}'",
> > + file=sys.stderr,
> > + )
> > + errors += 1
> > +
> > + if errors:
> > + sys.exit(1)
> > +
> > +
> > +def sort_mailmap(mailmap_path, _):
> > + """Sort the mailmap entries alphabetically by name."""
> > + entries = read_and_parse_mailmap(mailmap_path, True)
> > +
> > + entries.sort(key=lambda x: x.name_for_sorting)
> > + write_entries_to_file(mailmap_path, entries)
> > +
> > +
> > +def add_entry(mailmap_path, args):
> > + """Add a new entry to the mailmap file in the correct alphabetical position."""
> > + if not args.entry:
> > + print("Error: 'add' operation requires an entry argument", file=sys.stderr)
> > + sys.exit(1)
> > +
> > + new_entry = args.entry
> > + entries = read_and_parse_mailmap(mailmap_path, True)
> > +
> > + # Check if entry already exists, checking email2 only if it's specified
> > + if (
> > + not new_entry.email2
> > + and any(e.name == new_entry.name and e.email1 == new_entry.email1 for e in entries)
> > + ) or any(
> > + e.name == new_entry.name and e.email1 == new_entry.email1 and e.email2 == new_entry.email2
> > + for e in entries
> > + ):
> > + print(
> > + f"Error: Duplicate entry - '{new_entry.name} <{new_entry.email1}>' already exists",
> > + file=sys.stderr,
> > + )
> > + sys.exit(1)
> > +
> > + for n, entry in enumerate(entries):
> > + if entry.name_for_sorting > new_entry.name_for_sorting:
> > + entries.insert(n, new_entry)
> > + break
> > + else:
> > + entries.append(new_entry)
> > + write_entries_to_file(mailmap_path, entries)
> > +
> > +
> > +def main():
> > + """Main function."""
> > + # ops and functions implementing them
> > + operations = {"add": add_entry, "check": check_mailmap, "sort": sort_mailmap}
> > +
> > + parser = argparse.ArgumentParser(
> > + description=__doc__,
> > + formatter_class=argparse.RawDescriptionHelpFormatter,
> > + epilog="NOTE:\n for operations which write .mailmap, any comments or blank lines in the file will be removed",
> > + )
> > + parser.add_argument("operation", choices=operations.keys(), help="Operation to perform")
> > + parser.add_argument("--mailmap", help="Path to .mailmap file (default: search up tree)")
> > + parser.add_argument(
> > + "entry",
> > + nargs="?",
> > + type=MailmapEntry.parse,
> > + help='Entry to add. Format: "Name <email@domain.com>"',
> > + )
>
> Here you can use sub parsers instead:
>
> parser = argparse.ArgumentParser(
> description=__doc__,
> epilog="NOTE: for operations which write .mailmap, any comments or blank lines in the file will be removed",
> )
> parser.add_argument("--mailmap", help="Path to .mailmap file (default: search up tree)")
> sub = parser.add_subparsers(title="sub-command help", metavar="SUB_COMMAND")
> sub.required = True
> add = sub.add_parser("add", description=add_entry.__doc__, help=add_entry.__doc__)
> add.add_argument(
> "entry",
> type=MailmapEntry.parse,
> help='Entry to add. Format: "Name <email@domain.com>"',
> )
> add.set_defaults(callback=add_entry)
> check = sub.add_parser("check", description=check_mailmap.__doc__, help=check_mailmap.__doc__)
> check.set_defaults(callback=check_mailmap)
> sort = sub.add_parser("sort", description=sort_mailmap.__doc__, help=sort_mailmap.__doc__)
> sort.set_defaults(callback=sort_mailmap)
>
Very neat, thanks. Adding to v3.
> > +
> > + args = parser.parse_args()
> > +
> > + if args.mailmap:
> > + mailmap_path = Path(args.mailmap)
>
> args.mailmap = pathlib.Path(args.mailmap)
>
> > + else:
> > + # Find mailmap file
> > + mailmap_path = Path(".").resolve()
> > + while not (mailmap_path / ".mailmap").exists():
> > + if mailmap_path == mailmap_path.parent:
> > + print("Error: No .mailmap file found", file=sys.stderr)
> > + sys.exit(1)
> > + mailmap_path = mailmap_path.parent
> > + mailmap_path = mailmap_path / ".mailmap"
>
> args.mailmap = mailmap_path / ".mailmap"
> > +
> > + # call appropriate operation
> > + operations[args.operation](mailmap_path, args)
>
> And there (you'll need to adjust all callbacks to take a single args
> parameter):
>
> args.callback(args)
>
Changed in v3.
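A rough, self-contained sketch of the resulting flow, where each
sub-command registers a callback taking the single args namespace. The
callbacks here are stubs that return strings so the dispatch is easy to
see, the ".mailmap" default and the build_parser/run helpers are
assumptions for this example, and none of this is the actual v3 code:

```python
import argparse


def add_entry(args: argparse.Namespace) -> str:
    """Add a new entry (stub for illustration)."""
    return f"add {args.entry} to {args.mailmap}"


def check_mailmap(args: argparse.Namespace) -> str:
    """Check the mailmap (stub for illustration)."""
    return f"check {args.mailmap}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="mailmap helper (sketch)")
    # Assumed default; the patch searches up the tree instead.
    parser.add_argument("--mailmap", default=".mailmap")
    sub = parser.add_subparsers(title="sub-command help", metavar="SUB_COMMAND")
    sub.required = True
    add = sub.add_parser("add", help=add_entry.__doc__)
    add.add_argument("entry")
    add.set_defaults(callback=add_entry)
    check = sub.add_parser("check", help=check_mailmap.__doc__)
    check.set_defaults(callback=check_mailmap)
    return parser


def run(argv: list[str]) -> str:
    args = build_parser().parse_args(argv)
    # Every callback takes the whole namespace, so no dispatch table is needed.
    return args.callback(args)
```

Calling run(["check"]) dispatches to check_mailmap, while
run(["--mailmap", "m", "add", "Name <email@domain.com>"]) dispatches to
add_entry with args.entry set.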
> > +
> > +
> > +if __name__ == "__main__":
> > + main()
>
>
> --
> Robin
>
> > Do not use or store near heat or open flame.
>