Hello all,

Aaron and I talked briefly during the CI meeting about the script for this and how we were going to handle it, so I figured I would raise it on the mailing list so anyone can provide input. One of the points we mentioned was whether we want to use pwclient for this or instead write a separate script dedicated to collecting retest requests. I'm not sure which would be the better option, but as I mentioned during the meeting, I wrote a draft of a dedicated script that I thought was worth bringing up. The code below is a simple Python script that lets you pass in the names of the contexts you would like to collect for retesting, and it follows the event schema listed in the previous email. The idea is that every lab could maintain a list of the labels it wants to capture, along with the timestamp of the last time it ran the script and gathered retest requests, and use those to run the script periodically.
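
As a rough illustration of the per-lab bookkeeping described above, here is a minimal sketch of a state file that holds the last-run timestamp and the list of labels. The file layout, the function names load_state and save_state, and the timestamp format are all assumptions for illustration and are not part of the draft script.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def load_state(path):
    """Return the saved state, or an empty default on the first run."""
    path = Path(path)
    if not path.exists():
        return {"last_run": None, "contexts": []}
    return json.loads(path.read_text())


def save_state(path, contexts, now=None):
    """Record the contexts and the time of this run for the next invocation."""
    now = now or datetime.now(timezone.utc)
    state = {
        # Assumed timestamp format; adjust to whatever the events API expects.
        "last_run": now.strftime("%Y-%m-%dT%H:%M:%S"),
        "contexts": list(contexts),
    }
    Path(path).write_text(json.dumps(state))
    return state
```

A periodic job could call load_state() first, pass "last_run" and "contexts" to the script, and call save_state() once the requests have been gathered.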

There are a few things to settle before this script is polished, such as where to send the output and how people want to handle it, or whether we even want to use it rather than write something that lets pwclient collect these requests. I also wasn't sure whether a comma- or space-separated list would be preferred for input, so I'm open to suggestions on that if we decide to use this script. I wrote something to handle both comma and space delimiters, but it required flattening the list and a little extra complexity, so I left it out of the code below. Let me know if anyone has thoughts on how we want to handle collecting the requests.
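
For reference, accepting both delimiters could look roughly like the sketch below. The helper name split_contexts is hypothetical, and this is not necessarily the version that was left out of the draft, just one way the flattening might be done on the values argparse collects with nargs='*'.

```python
import re


def split_contexts(raw_args):
    """Flatten CLI values such as ["a,b", "c"] into ["a", "b", "c"]."""
    contexts = []
    for arg in raw_args:
        # Split on any run of commas and/or whitespace, dropping empty pieces.
        contexts.extend(piece for piece in re.split(r"[,\s]+", arg) if piece)
    return contexts
```

With this, "--contexts iol-unit-testing,iol-abi-testing intel-Functional" and "--contexts iol-unit-testing iol-abi-testing intel-Functional" would produce the same list.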

Thanks,
Jeremy

+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2023 University of New Hampshire
+
+import argparse
+import re
+
+import requests
+
+
+class RerunProcessor:
+    """Class for finding reruns inside an email using the Patchwork events API.
+
+    The idea of this class is to use regex to find certain patterns that represent
+    desired contexts to rerun.
+    """
+    desired_contexts: list[str] = []
+    collection_of_retests: dict = {}
+    # ^ is start of line and $ is end
+    # ((?:[a-zA-Z-]+(?:, ?\n?)?)+) is a capture group that gets all test labels after "Recheck-request: "
+    #   (?:[a-zA-Z-]+(?:, ?\n?)?)+ means 1 or more of the first match group
+    #       [a-zA-Z-]+ means 1 or more of any character in the ranges a-z or A-Z, or the character '-'
+    #       (?:, ?\n?)? means 1 or none of this match group which expects exactly 1 comma followed by
+    #                   1 or no spaces followed by 1 or no newlines.
+    # VALID MATCHES:
+    #   Recheck-request: iol-unit-testing, iol-something-else, iol-one-more, intel-example-testing
+    #   Recheck-request: iol-unit-testing,iol-something-else,iol-one-more, intel-example-testing,
+    #   Recheck-request: iol-unit-testing, iol-example, iol-another-example, intel-example-testing,
+    #   more-intel-testing
+    # INVALID MATCHES:
+    #   Recheck-request: iol-unit-testing, iol-something-else,iol-one-more,  intel-example-testing
+    #   Recheck-request: iol-unit-testing iol-something-else,iol-one-more, intel-example-testing,
+    #   Recheck-request: iol-unit-testing,iol-something-else,iol-one-more, intel-example-testing,
+    #
+    #   more-intel-testing
+    regex: str = r"^Recheck-request: ((?:[a-zA-Z-]+(?:, ?\n?)?)+)"
+
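+    def __init__(self, desired_contexts: list[str]) -> None:
+        self.desired_contexts = desired_contexts
+        # Per-instance dict so results are not shared through the class attribute
+        self.collection_of_retests = {}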
+
+    def process_comment_info(self, list_of_comment_blobs: list[dict]) -> None:
+        """Takes the list of json blobs of comment information and associates them
+        with their patches.
+
+        Collects retest labels from a list of comments on patches represented in
+        list_of_comment_blobs and creates a dictionary that associates them with their
+        corresponding patch series ID. The labels that need to be retested are collected
+        by passing the comment's body into the get_test_names() method.
+
+        Args:
+            list_of_comment_blobs: a list of JSON blobs that represent comment information
+        """
+        for comment in list_of_comment_blobs:
+            # The comment URL lives under "payload" in the event schema
+            comment_info = requests.get(
+                comment["payload"]["comment"]["url"]
+            )
+            labels_to_rerun = self.get_test_names(comment_info.json()["content"])
+            patch_id = comment["payload"]["patch"]["id"]
+            # appending to the list if it already exists, or creating it if it doesn't
+            self.collection_of_retests[patch_id] = [*self.collection_of_retests.get(patch_id, []), *labels_to_rerun]
+
+    def get_test_names(self, email_body:str) -> list[str]:
+        """Uses the regex in the class to get the information from the email.
+
+        When it gets the test names from the email, it will all be in one capture group.
+        We expect a comma separated list of patchwork labels to be retested.
+
+        Returns:
+            A list of contexts found in the email that match your list of desired
+            contexts to capture
+        """
+        rerun_section = re.findall(self.regex, email_body, re.MULTILINE)
+        if not rerun_section:  # no "Recheck-request:" line in this comment
+            return []
+        rerun_list = list(map(str.strip, rerun_section[0].split(",")))
+        valid_test_labels = []
+        for test_name in rerun_list:
+            if not test_name: #handle the capturing of empty string
+                continue
+            if test_name in self.desired_contexts:
+                valid_test_labels.append(test_name)
+        return valid_test_labels
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description="Help text for getting reruns")
+    parser.add_argument('-ts', '--time-since', dest="time_since", required=True, help="Get all patches since this timestamp")
+    parser.add_argument('--no-cover-comment', dest="no_cover_comment", action="store_true", default=False, help="Option to ignore comments on cover letters")
+    parser.add_argument('--no-patch-comment', dest="no_patch_comment", action="store_true", default=False, help="Option to ignore comments on patch emails")
+    parser.add_argument('--contexts', dest="contexts_to_capture", nargs='*', required=True, help="List of patchwork contexts you would like to capture")
+    args = parser.parse_args()
+    rerun_processor = RerunProcessor(args.contexts_to_capture)
+    patchwork_url = f"http://patches.dpdk.org/api/events/?since={args.time_since}"
+    if args.no_cover_comment and args.no_patch_comment:
+        exit(0)
+    if not args.no_cover_comment:
+        patchwork_url += "&category=cover-comment-created"
+    if not args.no_patch_comment:
+        patchwork_url += "&category=patch-comment-created"
+    comment_request_info = requests.get(patchwork_url)
+    rerun_processor.process_comment_info(comment_request_info.json())
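
To sanity-check the pattern without touching the Patchwork API, something like the following could be run on a sample comment body. It is a standalone sketch, not part of the patch: extract_labels is a hypothetical helper, and the character class is written as a-z on the assumption that this is what the comments in the class intend.

```python
import re

# Pattern as described in the class comments, compiled once with MULTILINE
# so that ^ matches at the start of every line of the email body.
RECHECK_RE = re.compile(
    r"^Recheck-request: ((?:[a-zA-Z-]+(?:, ?\n?)?)+)", re.MULTILINE
)


def extract_labels(body, desired):
    """Return the requested labels in body that also appear in desired."""
    match = RECHECK_RE.search(body)
    if match is None:
        return []  # the comment contains no recheck request
    labels = [label.strip() for label in match.group(1).split(",")]
    return [label for label in labels if label in desired]


body = (
    "Looks like a flaky failure.\n"
    "\n"
    "Recheck-request: iol-unit-testing, intel-example-testing,\n"
    "more-intel-testing\n"
)
print(extract_labels(body, ["iol-unit-testing", "more-intel-testing"]))
# prints ['iol-unit-testing', 'more-intel-testing']
```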

On Wed, Jun 21, 2023 at 12:21 PM Ali Alnubani <alialnu@nvidia.com> wrote:
On 6/6/2023 7:57 PM, Patrick Robb wrote:
> Hello all,
>
> I'd like to revive the conversation about a request from the community
> for an email based re-testing framework. The idea is that using one
> standardized format, dpdk developers could email the test-report mailing
> list, requesting a rerun on their patch series for "X" set of tests at
> "Y" lab. I think that since patchwork testing labels (ie.
> iol-broadcom-Performance, github-robot: build, loongarch-compilation)
> are already visible on patch pages on patchwork, those labels are the
> most reasonable ones to expect developers to use when requesting a
> re-test. We probably wouldn't want to get any more general than that,
> like, say, rerunning all CI testing for a specific patch series at a
> specific lab, since it would result in a significant amount of "wasted"
> testing capacity.
>
> The standard email format those of us at the Community Lab are thinking
> of is like below. Developers would request retests by emailing the
> test-report mailing list with email bodies like:
>
> [RETEST UNH-IOL]
> iol-abi-testing
> iol-broadcom-Performance
>
> [RETEST Intel]
> intel-Functional
>
> [RETEST Loongson]
> loongarch-compilation
>
> [RETEST GHA]
> github-robot: build
>
> From there, it would be up to the various labs to poll the test-report
> mailing list archive (or use a similar method) to check for such
> requests, and trigger a CI testing rerun based on the labels provided in
> the re-test email. If there is interest from other labs, UNH might also
> be able to host the entire set of re-test requests, allowing other labs
> to poll a curated list hosted by UNH. One simple approach would be for
> labs to download all emails sent to test-report and parse with regex to
> determine the re-test list for their specific lab. But, if anyone has
> any better ideas for aggregating the emails to be parsed, suggestions
> are welcome! If this approach sounds reasonable to everyone, we could
> determine a timeline by which labs would implement the functionality
> needed to trigger re-tests. Or, we can just add re-testing for various
> labs if/when they add this functionality - whatever is better. Happy to
> discuss at the CI meeting on Thursday.
>

Hello,

For context, and as discussed in the last community CI meeting, going through every new patch to look for new comments that trigger retests might take too long and potentially slow down the server.

I will upgrade Patchwork to v3.1 right after the v23.07 release.
The new version adds two new events to the /events API: cover-comment-created and patch-comment-created.[1]

An example event schema:

"""
{
        [..]
        "category": "patch-comment-created",
        "project": {
            [..]
        },
        "date": "string",
        "actor": {
            [..]
        },
        "payload": {
            "patch": {
                [..]
            },
            "comment": {
                [..]
                "url": "https://patches.dpdk.org/api/patches/X/comments/Y/",
                [..]
            }
        }
    }
"""

The comment's body can be extracted from the "content" property after fetching the comment's API URL. Example schema:

"""
{
    [..]
    "subject": "string",
    "submitter": {
        [..]
    },
    "content": "string",
    "headers": {
        [..]
    },
    "addressed": null
}
"""

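To make the two-step lookup concrete, here is a minimal sketch that pulls the comment URL out of an event and then reads "content" from the fetched comment. The function names are hypothetical, the fetch uses only the standard library without authentication, and the field names are taken from the two example schemas above.

```python
import json
from urllib.request import urlopen


def comment_url(event):
    """Return the comment API URL carried in an event's payload."""
    return event["payload"]["comment"]["url"]


def fetch_json(url):
    """GET a URL and decode the JSON response."""
    with urlopen(url) as response:
        return json.load(response)


def comment_body(event, fetch=fetch_json):
    """Fetch the comment referenced by an event and return its "content".

    The fetch parameter exists so the lookup can be exercised offline
    by passing a stand-in function instead of hitting the server.
    """
    return fetch(comment_url(event))["content"]
```
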
[1] https://patchwork.readthedocs.io/en/latest/releases/hessian/#relnotes-v3-1-0-stable-3-1-new-features
Also see:
https://patchwork.readthedocs.io/en/latest/api/rest/schemas/v1.2/#get--api-1.2-events-
https://patchwork.readthedocs.io/en/latest/api/rest/schemas/v1.2/#get--api-1.2-patches-id-comments-

Let me know if you have any questions.

Regards,
Ali