From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ci-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 3ECC0468DA;
	Wed, 11 Jun 2025 23:04:03 +0200 (CEST)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 1143E402D0;
	Wed, 11 Jun 2025 23:04:03 +0200 (CEST)
Received: from mail-qk1-f173.google.com (mail-qk1-f173.google.com
 [209.85.222.173])
 by mails.dpdk.org (Postfix) with ESMTP id 456C040156
 for <ci@dpdk.org>; Wed, 11 Jun 2025 23:04:01 +0200 (CEST)
Received: by mail-qk1-f173.google.com with SMTP id
 af79cd13be357-7c56a3def84so27974885a.0
 for <ci@dpdk.org>; Wed, 11 Jun 2025 14:04:01 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=iol.unh.edu; s=unh-iol; t=1749675840; x=1750280640; darn=dpdk.org;
 h=content-transfer-encoding:mime-version:message-id:date:subject:cc
 :to:from:from:to:cc:subject:date:message-id:reply-to;
 bh=I1MfKNQKJsu1tsEX/Ip2evHmxuJgnudKQkI/On/TR4s=;
 b=d+LUFBgGtYwYS4gYTnADj1nmJv/XFukpUFcmZF046VgM9qiu1dZEGloqwd0CZoZRuH
 sSfxtHwVkNCaKOEv50ZDsWnzXmSMSHZflLsm2CvjSCGa9kVHDmZbMVFRzzD3zMuULwzY
 jsAQPnxlvGCsKFJeE37JQyWi3hf/adMaW4fgo=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1749675840; x=1750280640;
 h=content-transfer-encoding:mime-version:message-id:date:subject:cc
 :to:from:x-gm-message-state:from:to:cc:subject:date:message-id
 :reply-to;
 bh=I1MfKNQKJsu1tsEX/Ip2evHmxuJgnudKQkI/On/TR4s=;
 b=MN2zUO23yUDswbpQqC5G7bFjcrcXpi7YVelL4ajoRPQIhQy6ej5t5u5A0KBeQjw0Gd
 riHbb4yzfxBujZ2Q2IBNiBnip9pMqI2MCbVBovmcuxnc3gMlLn/oKuy4kqNewyjj0cwm
 SzgcvWxI97ulNynoLLeW/5gy8jixxDOcDnt5HQo7ooDkgQJQckIvU5caMXO841LI8vCg
 MJ2hj+Gprviy7Ka1PL+X4kf9P+SACKo0y4/S/v4wDC9zQkJe3bLoFYL8btw6PGPoiUai
 nOhJwlL1EOMFqB7+kLZWlHZwmI1aHSrHYxhB3Ku4PBgrX+FWzaXuVTzWydi9OOYtniLX
 GYzQ==
X-Gm-Message-State: AOJu0Yx17/kYTotOQG5Jlh2RrKuA1y0NKeEW++sfu1cYSx6v4rjO1/xw
 1dUdQgZCvv4yER6c+L557WZddHpBRo1ETtMQ3Rq4P+/nK4SbuF5j645jPoheFu55iWM=
X-Gm-Gg: ASbGnctI0IFDlBoDsn0EjwGXAD5EFadZyh5ipXnrxcucngDRwZLP1zpcTQG10RoCviG
 M9NjK/T9EJXoS0jx8G2R/RBcQnfEa4DCrPNX7/V3xgk0AEHzS58dMUOvpvzErsxG3uW9Yjcbz6T
 Tuwvb63vYxnY053IymsdQB844r3FET/gIZhNKIxVOTJTbgEjErpSk3MddqLunhVIJBxPOn1Lccz
 GXVMlrub5nZxaeaLEDozqq7UEFzoZvpL24lYKug5C7SlnDDCpBZFmJojCdHhILnZZuFPgrIMDz2
 MpJTV0NVWQtfs0yHgROtI1jNg9CImEEn0ZltEZZviXiFTro7jzbDxlVL1rvzNYwl+cjjbXdKQ51
 vvFe2Yg==
X-Google-Smtp-Source: AGHT+IFlaQszS0O1IFFSYvc7OUG1ZJr+QrFHmLPbqGqMudUwAhUETpxf0mz7GJNXNmuLULpgMy5EUQ==
X-Received: by 2002:a05:620a:6289:b0:7d2:107c:4228 with SMTP id
 af79cd13be357-7d3a88316b0mr705174185a.18.1749675840344; 
 Wed, 11 Jun 2025 14:04:00 -0700 (PDT)
Received: from patrick-laptop.iol.unh.edu
 ([2606:4100:3880:1271:8c70:5b9e:ecd3:14e])
 by smtp.gmail.com with ESMTPSA id
 af79cd13be357-7d3b525f638sm6930685a.69.2025.06.11.14.03.59
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 11 Jun 2025 14:03:59 -0700 (PDT)
From: Patrick Robb <probb@iol.unh.edu>
To: aconole@redhat.com
Cc: ci@dpdk.org, ahassick@iol.unh.edu, zhoumin@loongson.cn,
 shaibran@amazon.com, Patrick Robb <probb@iol.unh.edu>
Subject: [PATCH] tools: add branch rebase support to recheck script
Date: Wed, 11 Jun 2025 16:58:49 -0400
Message-ID: <20250611205849.72165-1-probb@iol.unh.edu>
X-Mailer: git-send-email 2.49.0
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-BeenThere: ci@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK CI discussions <ci.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/ci>,
 <mailto:ci-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/ci/>
List-Post: <mailto:ci@dpdk.org>
List-Help: <mailto:ci-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/ci>,
 <mailto:ci-request@dpdk.org?subject=subscribe>
Errors-To: ci-bounces@dpdk.org

Add support for key-value parameters to the recheck framework. This is
being done specifically to support the branch rebase feature. With this
commit, rerun_requests.json includes a new "arguments" section, which
currently stores the rebase value and can hold additional key-value
pairs later. The rebase argument is optional; this commit does not
require users to supply it.

There are also some small quality-of-life changes, such as a --multipage
option that follows paginated patch/cover comment results.

Signed-off-by: Adam Hassick <ahassick@iol.unh.edu>
Signed-off-by: Patrick Robb <probb@iol.unh.edu>
---
 tools/get_reruns.py | 151 +++++++++++++++++++++++++-------------------
 1 file changed, 87 insertions(+), 64 deletions(-)
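
For reviewers, here is a minimal standalone sketch of how a
Recheck-request line is expected to split into arguments and contexts
under this patch. It reuses the regex from RerunProcessor, written here
as a raw string. The helper name parse_recheck_request and the sample
comment are hypothetical and exist only for this illustration; the
script itself does this work in get_test_names_and_parameters().

```python
import json
import re
from typing import Dict, List, Set, Tuple

# Same pattern as RerunProcessor.regex in this patch (raw-string form).
RECHECK_REGEX = r"^Recheck-request: ((?:(?:[\w-]+=)?[\w-]+(?:, ?\n?)?)+)"


def parse_recheck_request(
    body: str, desired_contexts: List[str]
) -> Tuple[Dict[str, str], Set[str]]:
    """Hypothetical helper: split a comment body into (arguments, contexts)."""
    arguments: Dict[str, str] = {}
    contexts: Set[str] = set()
    matches = re.findall(RECHECK_REGEX, body, re.MULTILINE)
    if matches:
        for item in map(str.strip, matches[0].split(",")):
            if "=" in item:
                # key=value items become arguments for the retest
                key, value = item.split("=", 1)
                arguments[key] = value
            elif item in desired_contexts:
                # bare labels are kept only if they were requested
                contexts.add(item)
    return arguments, contexts


# Sample comment body, made up for this sketch.
comment = "Recheck-request: rebase=latest, iol-unit-testing, not-a-known-label\n"
arguments, contexts = parse_recheck_request(comment, ["iol-unit-testing"])

# Per-series entry layout used under "retests" in rerun_requests.json.
entry = {"contexts": sorted(contexts), "arguments": arguments}
print(json.dumps(entry, indent=4))
```

Note that a spaced form such as "rebase = latest" is not parsed as an
argument, because the regex only accepts key=value with no spaces
around the equals sign.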

diff --git a/tools/get_reruns.py b/tools/get_reruns.py
index ab4d900..7c27650 100755
--- a/tools/get_reruns.py
+++ b/tools/get_reruns.py
@@ -7,17 +7,20 @@ import argparse
 import datetime
 import json
 import re
-import requests
-from typing import Dict, List, Optional, Set
+from json import JSONEncoder
+from typing import Dict, List, Set, Optional, Tuple
 
-DPDK_PATCHWORK_EVENTS_API_URL = "http://patches.dpdk.org/api/events/"
+import requests
 
 
-class JSONSetEncoder(json.JSONEncoder):
+class JSONSetEncoder(JSONEncoder):
     """Custom JSON encoder to handle sets.
 
     Pythons json module cannot serialize sets so this custom encoder converts
     them into lists.
+
+    Args:
+        JSONEncoder: JSON encoder from the json python module.
     """
 
     def default(self, input_object):
@@ -33,12 +36,10 @@ class RerunProcessor:
     The idea of this class is to use regex to find certain patterns that
     represent desired contexts to rerun.
 
-    Args:
+    Arguments:
         desired_contexts: List of all contexts to search for in the bodies of
             the comments
         time_since: Get all comments since this timestamp
-        pw_api_url: URL for events endpoint of the patchwork API to use for collecting
-            comments and comment data
 
     Attributes:
         collection_of_retests: A dictionary that maps patch series IDs to the
@@ -47,38 +48,45 @@ class RerunProcessor:
         last_comment_timestamp: timestamp of the most recent comment that was
             processed
     """
+    _VALID_ARGS: Set[str] = set(["rebase"])
 
     _desired_contexts: List[str]
     _time_since: str
-    _pw_api_url: str
     collection_of_retests: Dict[str, Dict[str, Set]] = {}
     last_comment_timestamp: Optional[str] = None
-    # The tag we search for in comments must appear at the start of the line
-    # and is case sensitive. After this tag we expect a comma separated list
-    # of valid DPDK patchwork contexts.
-    #
+    # ^ is start of line
+    # ((?:(?:[\\w-]+=)?[\\w-]+(?:, ?\n?)?)+) is a capture group that gets all
+    #   test labels and key-value pairs after "Recheck-request: "
+    #   (?:[\\w-]+=)? optionally grabs a key followed by an equals sign
+    #       (no space)
+    #       [\\w-]+ (expanded to "(?:[a-zA-Z0-9-_]+)") means 1 or more of any
+    #           character in the ranges a-z, A-Z, 0-9, or the characters
+    #               '-' or '_'
+    #       (?:, ?\n?)? means 1 or none of this match group which expects
+    #           exactly 1 comma followed by 1 or no spaces followed by
+    #           1 or no newlines.
     # VALID MATCHES:
     #   Recheck-request: iol-unit-testing, iol-something-else, iol-one-more,
     #   Recheck-request: iol-unit-testing,iol-something-else, iol-one-more
     #   Recheck-request: iol-unit-testing, iol-example, iol-another-example,
     #   more-intel-testing
+    #   Recheck-request: x=y, rebase=latest, iol-unit-testing, iol-additional-example
     # INVALID MATCHES:
     #   Recheck-request: iol-unit-testing,  intel-example-testing
     #   Recheck-request: iol-unit-testing iol-something-else,iol-one-more,
+    #   Recheck-request: iol-unit-testing, rebase = latest
     #   Recheck-request: iol-unit-testing,iol-something-else,iol-one-more,
-    #
     #   more-intel-testing
-    regex: str = "^Recheck-request: ((?:[a-zA-Z0-9-_]+(?:, ?\n?)?)+)"
+    regex: str = "^Recheck-request: ((?:(?:[\\w-]+=)?[\\w-]+(?:, ?\n?)?)+)"
+    last_comment_timestamp: str
 
-    def __init__(
-        self, desired_contexts: List[str], time_since: str, pw_api_url: str
-    ) -> None:
+    def __init__(self, desired_contexts: List[str], time_since: str, multipage: bool) -> None:
         self._desired_contexts = desired_contexts
         self._time_since = time_since
-        self._pw_api_url = pw_api_url
+        self._multipage = multipage
 
     def process_reruns(self) -> None:
-        patchwork_url = f"{self._pw_api_url}?since={self._time_since}"
+        patchwork_url = f"http://patches.dpdk.org/api/events/?since={self._time_since}"
         comment_request_info = []
         for item in [
             "&category=cover-comment-created",
@@ -87,6 +95,12 @@ class RerunProcessor:
             response = requests.get(patchwork_url + item)
             response.raise_for_status()
             comment_request_info.extend(response.json())
+
+            while 'next' in response.links and self._multipage:
+                response = requests.get(response.links['next']['url'])
+                response.raise_for_status()
+                comment_request_info.extend(response.json())
+
         rerun_processor.process_comment_info(comment_request_info)
 
     def process_comment_info(self, list_of_comment_blobs: List[Dict]) -> None:
@@ -138,54 +152,69 @@ class RerunProcessor:
             comment_info.raise_for_status()
             content = comment_info.json()["content"]
 
-            labels_to_rerun = self.get_test_names(content)
+            (args, labels_to_rerun) = self.get_test_names_and_parameters(content)
+
+            # Accept either filtered labels or arguments.
+            if labels_to_rerun or (args and self._VALID_ARGS.issuperset(args.keys())):
+                # Get or insert a new retest request into the dict.
+                self.collection_of_retests[patch_id] = \
+                    self.collection_of_retests.get(
+                        patch_id, {"contexts": set(), "arguments": dict()}
+                    )
 
-            # appending to the list if it already exists, or creating it if it
-            # doesn't
-            if labels_to_rerun:
-                self.collection_of_retests[patch_id] = self.collection_of_retests.get(
-                    patch_id, {"contexts": set()}
-                )
-                self.collection_of_retests[patch_id]["contexts"].update(labels_to_rerun)
+                req = self.collection_of_retests[patch_id]
 
-    def get_test_names(self, email_body: str) -> Set[str]:
+                # Update the fields.
+                req["contexts"].update(labels_to_rerun)
+                req["arguments"].update(args)
+
+    def get_test_names_and_parameters(
+        self, email_body: str
+    ) -> Tuple[Dict[str, str], Set[str]]:
         """Uses the regex in the class to get the information from the email.
 
-        When it gets the test names from the email, it will all be in one
-        capture group. We expect a comma separated list of patchwork labels
-        to be retested.
+        The regex returns everything after the tag in a single capture
+        group. We expect a comma separated list that can mix patchwork
+        labels to be retested with key=value pairs, which are arguments
+        for the retest. The items are split apart after matching.
 
         Returns:
             A set of contexts found in the email that match your list of
             desired contexts to capture. We use a set here to avoid duplicate
             contexts.
         """
-        rerun_section = re.findall(self.regex, email_body, re.MULTILINE)
-        if not rerun_section:
-            return set()
-        rerun_list = list(map(str.strip, rerun_section[0].split(",")))
-        return set(filter(lambda x: x and x in self._desired_contexts, rerun_list))
+        rerun_list: Set[str] = set()
+        params_dict: Dict[str, str] = dict()
+
+        match: List[str] = re.findall(self.regex, email_body, re.MULTILINE)
+        if match:
+            items: List[str] = list(map(str.strip, match[0].split(",")))
+
+            for item in items:
+                if '=' in item:
+                    sides = item.split('=', 1)
+                    params_dict[sides[0]] = sides[1]
+                else:
+                    rerun_list.add(item)
+
+        return (params_dict, set(filter(lambda x: x in self._desired_contexts, rerun_list)))
 
-    def write_output(self, file_name: str) -> None:
-        """Output class information.
+    def write_to_output_file(self, file_name: str) -> None:
+        """Write class information to a JSON file.
 
         Takes the collection_of_retests and last_comment_timestamp and outputs
-        them into either a json file or stdout.
+        them into a json file.
 
         Args:
-            file_name: Name of the file to write the output to. If this is set
-            to "-" then it will output to stdout.
+            file_name: Name of the file to write the output to.
         """
 
         output_dict = {
             "retests": self.collection_of_retests,
             "last_comment_timestamp": self.last_comment_timestamp,
         }
-        if file_name == "-":
-            print(json.dumps(output_dict, indent=4, cls=JSONSetEncoder))
-        else:
-            with open(file_name, "w") as file:
-                file.write(json.dumps(output_dict, indent=4, cls=JSONSetEncoder))
+        with open(file_name, "w") as file:
+            file.write(json.dumps(output_dict, indent=4, cls=JSONSetEncoder))
 
 
 if __name__ == "__main__":
@@ -195,39 +224,33 @@ if __name__ == "__main__":
         "--time-since",
         dest="time_since",
         required=True,
-        help='Get all patches since this timestamp (yyyy-mm-ddThh:mm:ss.SSSSSS).',
+        help="Get all patches since this timestamp (yyyy-mm-ddThh:mm:ss.SSSSSS)",
     )
     parser.add_argument(
         "--contexts",
         dest="contexts_to_capture",
         nargs="*",
         required=True,
-        help='List of patchwork contexts you would like to capture.',
+        help="List of patchwork contexts you would like to capture",
     )
     parser.add_argument(
         "-o",
         "--out-file",
         dest="out_file",
         help=(
-            'Output file where the list of reruns and the timestamp of the '
-            'last comment in the list of comments is sent. If this is set '
-            'to "-" then it will output to stdout (default: -).'
+            "Output file where the list of reruns and the timestamp of the "
+            "last comment in the list of comments are written "
+            "(default: rerun_requests.json)."
         ),
-        default="-",
+        default="rerun_requests.json",
     )
     parser.add_argument(
-        "-u",
-        "--patchwork-url",
-        dest="pw_url",
-        help=(
-            'URL for the events endpoint of the patchwork API that will be used to '
-            f'collect retest requests (default: {DPDK_PATCHWORK_EVENTS_API_URL})'
-        ),
-        default=DPDK_PATCHWORK_EVENTS_API_URL
+        "-m",
+        "--multipage",
+        action="store_true",
+        help="When set, searches all pages of patch/cover comments in the query."
     )
     args = parser.parse_args()
-    rerun_processor = RerunProcessor(
-        args.contexts_to_capture, args.time_since, args.pw_url
-    )
+    rerun_processor = RerunProcessor(args.contexts_to_capture, args.time_since, args.multipage)
     rerun_processor.process_reruns()
-    rerun_processor.write_output(args.out_file)
+    rerun_processor.write_to_output_file(args.out_file)
-- 
2.49.0