Re: [PATCH] tools: add branch rebase support to recheck script

DPDK CI discussions

From: zhoumin <zhoumin@loongson.cn>
To: Patrick Robb <probb@iol.unh.edu>, aconole@redhat.com
Cc: ci@dpdk.org, ahassick@iol.unh.edu, shaibran@amazon.com
Subject: Re: [PATCH] tools: add branch rebase support to recheck script
Date: Tue, 17 Jun 2025 12:16:01 +0800
Message-ID: <974fd110-bb60-ced2-405f-3cad65f64ab4@loongson.cn>
In-Reply-To: <20250611205849.72165-1-probb@iol.unh.edu>

Tested-by: Min Zhou <zhoumin@loongson.cn>

On 2025/6/12 4:58AM, Patrick Robb wrote:
> Adding support for key value parameters to the recheck framework. This is
> being done specifically to support the branch rebase feature. With this
> commit, the rerun_requests.json includes a new arguments section which
> currently stores the rebase value, and can store future key value pairs
> in the future. This commit does not add a requirement that the user uses
> the rebase argument. It is optional.
>
> There are also some small quality of life changes which have been added.
>
> Signed-off-by: Adam Hassick <ahassick@iol.unh.edu>
> Signed-off-by: Patrick Robb <probb@iol.unh.edu>
> ---
>   tools/get_reruns.py | 151 +++++++++++++++++++++++++-------------------
>   1 file changed, 87 insertions(+), 64 deletions(-)
>
> diff --git a/tools/get_reruns.py b/tools/get_reruns.py
> index ab4d900..7c27650 100755
> --- a/tools/get_reruns.py
> +++ b/tools/get_reruns.py
> @@ -7,17 +7,20 @@ import argparse
>   import datetime
>   import json
>   import re
> -import requests
> -from typing import Dict, List, Optional, Set
> +from json import JSONEncoder
> +from typing import Dict, List, Set, Optional, Tuple
>   
> -DPDK_PATCHWORK_EVENTS_API_URL = "http://patches.dpdk.org/api/events/"
> +import requests
>   
>   
> -class JSONSetEncoder(json.JSONEncoder):
> +class JSONSetEncoder(JSONEncoder):
>       """Custom JSON encoder to handle sets.
>   
>       Pythons json module cannot serialize sets so this custom encoder converts
>       them into lists.
> +
> +    Args:
> +        JSONEncoder: JSON encoder from the json python module.
>       """
>   
>       def default(self, input_object):
> @@ -33,12 +36,10 @@ class RerunProcessor:
>       The idea of this class is to use regex to find certain patterns that
>       represent desired contexts to rerun.
>   
> -    Args:
> +    Arguments:
>           desired_contexts: List of all contexts to search for in the bodies of
>               the comments
>           time_since: Get all comments since this timestamp
> -        pw_api_url: URL for events endpoint of the patchwork API to use for collecting
> -            comments and comment data
>   
>       Attributes:
>           collection_of_retests: A dictionary that maps patch series IDs to the
> @@ -47,38 +48,45 @@ class RerunProcessor:
>           last_comment_timestamp: timestamp of the most recent comment that was
>               processed
>       """
> +    _VALID_ARGS: Set[str] = set(["rebase"])
>   
>       _desired_contexts: List[str]
>       _time_since: str
> -    _pw_api_url: str
>       collection_of_retests: Dict[str, Dict[str, Set]] = {}
>       last_comment_timestamp: Optional[str] = None
> -    # The tag we search for in comments must appear at the start of the line
> -    # and is case sensitive. After this tag we expect a comma separated list
> -    # of valid DPDK patchwork contexts.
> -    #
> +    # ^ is start of line
> +    # ((?:(?:[\\w-]+=)?[\\w-]+(?:, ?\n?)?)+) is a capture group that gets all
> +    #   test labels and key-value pairs after "Recheck-request: "
> +    #   (?:[\\w-]+=)? optionally grabs a key followed by an equals sign
> +    #       (no space)
> +    #       [\\w-] (expanded to "(:?[a-zA-Z0-9-_]+)" ) means 1 more of any
> +    #           character in the ranges a-z, A-Z, 0-9, or the characters
> +    #               '-' or '_'
> +    #       (?:, ?\n?)? means 1 or none of this match group which expects
> +    #           exactly 1 comma followed by 1 or no spaces followed by
> +    #           1 or no newlines.
>       # VALID MATCHES:
>       #   Recheck-request: iol-unit-testing, iol-something-else, iol-one-more,
>       #   Recheck-request: iol-unit-testing,iol-something-else, iol-one-more
>       #   Recheck-request: iol-unit-testing, iol-example, iol-another-example,
>       #   more-intel-testing
> +    #   Recheck-request: x=y, rebase=latest, iol-unit-testing, iol-additional-example
>       # INVALID MATCHES:
>       #   Recheck-request: iol-unit-testing,  intel-example-testing
>       #   Recheck-request: iol-unit-testing iol-something-else,iol-one-more,
> +    #   Recheck-request: iol-unit-testing, rebase = latest
>       #   Recheck-request: iol-unit-testing,iol-something-else,iol-one-more,
> -    #
>       #   more-intel-testing
> -    regex: str = "^Recheck-request: ((?:[a-zA-Z0-9-_]+(?:, ?\n?)?)+)"
> +    regex: str = "^Recheck-request: ((?:(?:[\\w-]+=)?[\\w-]+(?:, ?\n?)?)+)"
> +    last_comment_timestamp: str
>   
> -    def __init__(
> -        self, desired_contexts: List[str], time_since: str, pw_api_url: str
> -    ) -> None:
> +    def __init__(self, desired_contexts: List[str], time_since: str, multipage: bool) -> None:
>           self._desired_contexts = desired_contexts
>           self._time_since = time_since
> -        self._pw_api_url = pw_api_url
> +        self._multipage = multipage
>   
>       def process_reruns(self) -> None:
> -        patchwork_url = f"{self._pw_api_url}?since={self._time_since}"
> +        patchwork_url = f"http://patches.dpdk.org/api/events/?since={self._time_since}"
>           comment_request_info = []
>           for item in [
>               "&category=cover-comment-created",
> @@ -87,6 +95,12 @@ class RerunProcessor:
>               response = requests.get(patchwork_url + item)
>               response.raise_for_status()
>               comment_request_info.extend(response.json())
> +
> +            while 'next' in response.links and self._multipage:
> +                response = requests.get(response.links['next']['url'])
> +                response.raise_for_status()
> +                comment_request_info.extend(response.json())
> +
>           rerun_processor.process_comment_info(comment_request_info)
>   
>       def process_comment_info(self, list_of_comment_blobs: List[Dict]) -> None:
> @@ -138,54 +152,69 @@ class RerunProcessor:
>               comment_info.raise_for_status()
>               content = comment_info.json()["content"]
>   
> -            labels_to_rerun = self.get_test_names(content)
> +            (args, labels_to_rerun) = self.get_test_names_and_parameters(content)
> +
> +            # Accept either filtered labels or arguments.
> +            if labels_to_rerun or (args and self._VALID_ARGS.issuperset(args.keys())):
> +                # Get or insert a new retest request into the dict.
> +                self.collection_of_retests[patch_id] = \
> +                    self.collection_of_retests.get(
> +                        patch_id, {"contexts": set(), "arguments": dict()}
> +                    )
>   
> -            # appending to the list if it already exists, or creating it if it
> -            # doesn't
> -            if labels_to_rerun:
> -                self.collection_of_retests[patch_id] = self.collection_of_retests.get(
> -                    patch_id, {"contexts": set()}
> -                )
> -                self.collection_of_retests[patch_id]["contexts"].update(labels_to_rerun)
> +                req = self.collection_of_retests[patch_id]
>   
> -    def get_test_names(self, email_body: str) -> Set[str]:
> +                # Update the fields.
> +                req["contexts"].update(labels_to_rerun)
> +                req["arguments"].update(args)
> +
> +    def get_test_names_and_parameters(
> +        self, email_body: str
> +    ) -> Tuple[Dict[str, str], Set[str]]:
>           """Uses the regex in the class to get the information from the email.
>   
> -        When it gets the test names from the email, it will all be in one
> -        capture group. We expect a comma separated list of patchwork labels
> -        to be retested.
> +        When it gets the test names from the email, it will be split into two
> +        capture groups. We expect a comma separated list of patchwork labels
> +        to be retested, and another comma separated list of key-value pairs
> +        which are arguments for the retest.
>   
>           Returns:
>               A set of contexts found in the email that match your list of
>               desired contexts to capture. We use a set here to avoid duplicate
>               contexts.
>           """
> -        rerun_section = re.findall(self.regex, email_body, re.MULTILINE)
> -        if not rerun_section:
> -            return set()
> -        rerun_list = list(map(str.strip, rerun_section[0].split(",")))
> -        return set(filter(lambda x: x and x in self._desired_contexts, rerun_list))
> +        rerun_list: Set[str] = set()
> +        params_dict: Dict[str, str] = dict()
> +
> +        match: List[str] = re.findall(self.regex, email_body, re.MULTILINE)
> +        if match:
> +            items: List[str] = list(map(str.strip, match[0].split(",")))
> +
> +            for item in items:
> +                if '=' in item:
> +                    sides = item.split('=')
> +                    params_dict[sides[0]] = sides[1]
> +                else:
> +                    rerun_list.add(item)
> +
> +        return (params_dict, set(filter(lambda x: x in self._desired_contexts, rerun_list)))
>   
> -    def write_output(self, file_name: str) -> None:
> -        """Output class information.
> +    def write_to_output_file(self, file_name: str) -> None:
> +        """Write class information to a JSON file.
>   
>           Takes the collection_of_retests and last_comment_timestamp and outputs
> -        them into either a json file or stdout.
> +        them into a json file.
>   
>           Args:
> -            file_name: Name of the file to write the output to. If this is set
> -            to "-" then it will output to stdout.
> +            file_name: Name of the file to write the output to.
>           """
>   
>           output_dict = {
>               "retests": self.collection_of_retests,
>               "last_comment_timestamp": self.last_comment_timestamp,
>           }
> -        if file_name == "-":
> -            print(json.dumps(output_dict, indent=4, cls=JSONSetEncoder))
> -        else:
> -            with open(file_name, "w") as file:
> -                file.write(json.dumps(output_dict, indent=4, cls=JSONSetEncoder))
> +        with open(file_name, "w") as file:
> +            file.write(json.dumps(output_dict, indent=4, cls=JSONSetEncoder))
>   
>   
>   if __name__ == "__main__":
> @@ -195,39 +224,33 @@ if __name__ == "__main__":
>           "--time-since",
>           dest="time_since",
>           required=True,
> -        help='Get all patches since this timestamp (yyyy-mm-ddThh:mm:ss.SSSSSS).',
> +        help="Get all patches since this many days ago (default: 5)",
>       )
>       parser.add_argument(
>           "--contexts",
>           dest="contexts_to_capture",
>           nargs="*",
>           required=True,
> -        help='List of patchwork contexts you would like to capture.',
> +        help="List of patchwork contexts you would like to capture",
>       )
>       parser.add_argument(
>           "-o",
>           "--out-file",
>           dest="out_file",
>           help=(
> -            'Output file where the list of reruns and the timestamp of the '
> -            'last comment in the list of comments is sent. If this is set '
> -            'to "-" then it will output to stdout (default: -).'
> +            "Output file where the list of reruns and the timestamp of the"
> +            "last comment in the list of comments"
> +            "(default: rerun_requests.json)."
>           ),
> -        default="-",
> +        default="rerun_requests.json",
>       )
>       parser.add_argument(
> -        "-u",
> -        "--patchwork-url",
> -        dest="pw_url",
> -        help=(
> -            'URL for the events endpoint of the patchwork API that will be used to '
> -            f'collect retest requests (default: {DPDK_PATCHWORK_EVENTS_API_URL})'
> -        ),
> -        default=DPDK_PATCHWORK_EVENTS_API_URL
> +        "-m",
> +        "--multipage",
> +        action="store_true",
> +        help="When set, searches all pages of patch/cover comments in the query."
>       )
>       args = parser.parse_args()
> -    rerun_processor = RerunProcessor(
> -        args.contexts_to_capture, args.time_since, args.pw_url
> -    )
> +    rerun_processor = RerunProcessor(args.contexts_to_capture, args.time_since, args.multipage)
>       rerun_processor.process_reruns()
> -    rerun_processor.write_output(args.out_file)
> +    rerun_processor.write_to_output_file(args.out_file)
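
For readers following the parsing change, here is a minimal standalone sketch of how the new pattern separates key=value arguments from test labels. It is an illustration under stated assumptions, not the patch itself: parse_recheck_request and build_request are hypothetical names, the pattern is copied from the quoted diff, and the label list is sorted only to make the printed JSON stable (the patch converts sets through JSONSetEncoder instead).

```python
import json
import re
from typing import Dict, Iterable, List, Set, Tuple

# Same pattern as the `regex` attribute in the quoted patch.
RECHECK_REGEX = "^Recheck-request: ((?:(?:[\\w-]+=)?[\\w-]+(?:, ?\n?)?)+)"


def parse_recheck_request(
    email_body: str, desired_contexts: Iterable[str]
) -> Tuple[Dict[str, str], Set[str]]:
    """Hypothetical stand-in for get_test_names_and_parameters()."""
    wanted = set(desired_contexts)
    params: Dict[str, str] = {}
    labels: Set[str] = set()

    match: List[str] = re.findall(RECHECK_REGEX, email_body, re.MULTILINE)
    if match:
        # As in the patch, only the first matching line is considered.
        for item in map(str.strip, match[0].split(",")):
            if "=" in item:
                # The pattern admits at most one "=" per item, so this is
                # equivalent to the split("=") used in the patch.
                key, _, value = item.partition("=")
                params[key] = value
            elif item in wanted:
                labels.add(item)
    return params, labels


def build_request(params: Dict[str, str], labels: Set[str]) -> Dict:
    """Shape one "retests" entry; sorting is an assumption for stable output."""
    return {"contexts": sorted(labels), "arguments": params}


if __name__ == "__main__":
    body = "Recheck-request: rebase=latest, iol-unit-testing, iol-other\n"
    params, labels = parse_recheck_request(body, ["iol-unit-testing"])
    print(json.dumps(build_request(params, labels), indent=4))
```

One consequence of the pattern worth noting: for a line such as "Recheck-request: iol-unit-testing, rebase = latest", the match stops at the space before "=", so "rebase" is captured as a plain label and no argument is recorded.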

Thread overview: 2+ messages
2025-06-11 20:58 Patrick Robb
2025-06-17  4:16 ` zhoumin [this message]
