DPDK CI discussions
From: Aaron Conole <aconole@redhat.com>
To: Michael Santana <msantana@redhat.com>
Cc: Dumitru Ceara <dceara@redhat.com>,
	 David Marchand <dmarchan@redhat.com>,
	 Ilya Maximets <imaximet@redhat.com>,
	 ci@dpdk.org
Subject: Re: [RFC 2/2] recheck: Add a recheck parser for patchwork comments
Date: Wed, 01 Nov 2023 15:16:59 -0400
Message-ID: <f7til6l707o.fsf@redhat.com>
In-Reply-To: <CABVNPRr7J80F4ySEsocc+J+m6A_uGBM5ryZvO0=FDcU5akxz=g@mail.gmail.com> (Michael Santana's message of "Wed, 1 Nov 2023 12:57:37 -0400")

Michael Santana <msantana@redhat.com> writes:

> On Fri, Oct 27, 2023 at 9:06 AM Aaron Conole <aconole@redhat.com> wrote:
>>
>> Add a recheck parsing tool that will allow for labs to build a
>> recheck workflow based on specific recheck labels and projects,
>> with an associated state machine and querying interface.
>>
>> The output of the recheck parsing tool is json and can be fed to
>> jq or other json parsing utilities for better field support.
>>
>> Signed-off-by: Aaron Conole <aconole@redhat.com>
> Thank you Aaron for the patch. It looks like you spent a lot of time on it
>
> Overall I like the patch and don't have major concerns other than the
> questions I have asked inline
>
> Thanks!
>> ---
>>  pw_mon           | 59 +++++++++++++++++++++++++++++-
>>  recheck_tool     | 93 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  series_db_lib.sh | 62 +++++++++++++++++++++++++++++++-
>>  3 files changed, 212 insertions(+), 2 deletions(-)
>>  create mode 100755 recheck_tool
>>
>> diff --git a/pw_mon b/pw_mon
>> index 01bdd25..d5ad8f5 100755
>> --- a/pw_mon
>> +++ b/pw_mon
>> @@ -1,6 +1,6 @@
>>  #!/bin/sh
>>  # SPDX-Identifier: gpl-2.0-or-later
>> -# Copyright (C) 2018, Red Hat, Inc.
>> +# Copyright (C) 2018-2023 Red Hat, Inc.
>>  #
>>  # Monitors a project on a patchwork instance for new series submissions
>>  # Records the submissions in the series database (and emits them on the
>> @@ -44,6 +44,7 @@ if [ "$1" != "" ]; then
>>      fi
>>  fi
>>
>> +recheck_filter=""
>>
>>  while [ "$1" != "" ]; do
>>      if echo "$1" | grep -E ^--pw-project= >/dev/null 2>&1; then
>> @@ -59,6 +60,13 @@ while [ "$1" != "" ]; do
>>          echo "    --add-filter-recheck=filter  Adds a filter to flag that a recheck needs to be done"
>>          echo ""
>>          exit 0
>> +    elif echo "$1" | grep -E ^--add-filter-recheck= >/dev/null 2>&1; then
>> +        filter_str=$(echo "$1" | sed s/--add-filter-recheck=//)
>> +        recheck_filter="$filter_str $recheck_filter"
>> +        shift
>> +    else
>> +        echo "Uknown option: $1"
> s/Uknown/Unknown/

d'oh - will fix these.

>> +        exit 1
>>      fi
>>  done
>>
>> @@ -179,7 +187,56 @@ function check_superseded_series() {
>>      done
>>  }
>>
>> +function run_recheck() {
>> +    local recheck_list=$(echo "$7" | sed -e 's/^Recheck-request: // ' -e 's/,/ /g')
>> +
>> +    for filter in $recheck_filter; do
>> +        for check in $recheck_list; do
>> +            if [ "$filter" == "$check" ]; then
>> +                insert_recheck_request_if_needed "$1" "$3" "$8" "$check" "$2"
>> +            fi
>> +        done
>> +    done
>> +}
>> +
>> +function check_patch_for_retest_request() {
>> +    local patch_url="$1"
>> +
>> +    local patch_comments_url=$(curl -s "$userpw" "$patch_url" | jq -rc '.comments')
>> +    if [ "Xnull" != "X$patch_comments_url" ]; then
>> +        local comments_json=$(curl -s "$userpw" "$patch_comments_url")
>> +
>> +        local seq_end=$(echo "$comments_json" | jq -rc 'length')
>> +        if [ "$seq_end" -a $seq_end -gt 0 ]; then
> Isn't this just a longer way of saying " if [ $seq_end -gt 0 ] " ?

No - in case the jq call fails (let's say the web server had an issue),
the seq_end value will be empty.  That would cause an error if we try to
run the '-gt' operation.  We avoid it by first testing that there is at
least something there.
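
As a minimal sketch of that guard (not the code from the patch:
has_comments is a hypothetical helper, and it joins two separate tests
with && instead of using -a, which POSIX marks as obsolescent):

```shell
# Hypothetical helper: succeeds only when $1 is a non-empty count above zero.
has_comments() {
    seq_end="$1"
    # The -n test fails first when jq printed nothing, so the numeric
    # comparison is never attempted on an empty string.
    [ -n "$seq_end" ] && [ "$seq_end" -gt 0 ]
}

has_comments "3" && echo "3: process comments"
has_comments ""  || echo "empty: skip"
has_comments "0" || echo "0: skip"
```

All three echo lines run: "3" passes both tests, "" stops at the first
test, and "0" passes the first test but fails the comparison.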

>> +            seq_end=$((seq_end-1))
>> +            for comment_id in $(seq 0 $seq_end); do
>> +                local recheck_requested=$(echo "$comments_json" | jq -rc ".[$comment_id].content" | grep "^Recheck-request: ")
>> +                if [ "X$recheck_requested" != "X" ]; then
>> +                    local msgid=$(echo "$comments_json" | jq -rc ".[$comment_id].msgid")
>> +                    run_recheck "$pw_instance" "$series_id" "$project" "$url" "$repo" "$branchname" "$recheck_requested" "$msgid"
>> +                fi
>> +            done
>> +        fi
>> +    fi
>> +}
>> +
>> +function check_series_needs_retest() {
>> +    local pw_instance="$1"
>> +
>> +    series_get_active_branches "$pw_instance" | while IFS=\| read -r series_id project url repo branchname; do
>> +        local patch_comments_url=$(curl -s "$userpw" "$url" | jq -rc '.patches[] | .url')
>> +
>> +        for patch in $patch_comments_url; do
>> +            check_patch_for_retest_request $patch
>> +        done
>> +    done
>> +}
>> +
>>  check_undownloaded_series "$pw_instance" "$pw_project"
>>  check_completed_series "$pw_instance" "$pw_project"
>>  check_new_series "$pw_instance" "$pw_project"
>>  check_superseded_series "$pw_instance"
>> +
>> +# check for retest requests after a series is still passing all the
>> +# checks above
>> +check_series_needs_retest "$pw_instance"
> Okay, I am trying to understand what the workflow here is. I think I
> understand that this script will automatically go and check the series
> for comments that match "Recheck-request:". This string means that
> someone asked the bot to recheck the patch/series.

Correct - if someone sends something like:

 Recheck-request: a,b,c,d

and this pw_mon script is called with filters for 'a' and 'b', it will
flag those patches in the DB.
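
Roughly, the matching works like the sketch below.  It mirrors the
loop in run_recheck but is a standalone illustration: match_recheck is
a hypothetical name, and it prints the matching labels instead of
inserting rows into the DB.

```shell
# Print each requested label that matches one of the configured filters.
# $1: the comment line, $2: space-separated filter list (illustrative inputs).
match_recheck() {
    # Strip the tag and turn the comma-separated labels into words.
    requested=$(echo "$1" | sed -e 's/^Recheck-request: *//' -e 's/,/ /g')
    for filter in $2; do
        for check in $requested; do
            if [ "$filter" = "$check" ]; then
                echo "$check"
            fi
        done
    done
}

# Filters 'a' and 'b' are configured; 'c' and 'd' are ignored.
match_recheck "Recheck-request: a,b,c,d" "a b"
```

With those inputs it prints 'a' and 'b', one per line.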

> This script will call into insert_recheck_request_if_needed(), which
> will do a check in the database to see if we have already done a
> recheck to avoid repeating checking the same patches over and over.
> This is done by checking the $recheck_msgid value in the database

Correct.
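
The dedup idea can be sketched as below.  This is only an
illustration: a temporary text file stands in for the recheck_requests
table, and record_once and the example message IDs are made up.

```shell
# Stand-in for the recheck_requests table: one message ID per line (assumption).
seen_db=$(mktemp)

# Record a request only if its message ID has not been seen before.
# Prints "inserted" or "skipped" so the caller can tell what happened.
record_once() {
    if grep -Fqx -- "$1" "$seen_db"; then
        echo "skipped"
    else
        echo "$1" >> "$seen_db"
        echo "inserted"
    fi
}

record_once "<msg-1@example.com>"   # first sighting of the comment
record_once "<msg-1@example.com>"   # same comment seen on a later poll
record_once "<msg-2@example.com>"   # a different comment
```

Because the key is the message ID of the comment, polling the same
comments repeatedly does not queue the same recheck twice.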

> I like this workflow. The only thing that I do not like is that you
> have to check every comment on every patch. That seems like an
> expensive operation, but honestly I do not think there is a better way
> to accomplish this. So if there is no better way to do it then it's
> okay, let's move forward with it

There isn't a different way to do it for now, but I hope to switch to
using the events API, which should mean we only need to look at the most
recent events that come in.

>> diff --git a/recheck_tool b/recheck_tool
>> new file mode 100755
>> index 0000000..f346e1c
>> --- /dev/null
>> +++ b/recheck_tool
> I guess it wasn't very obvious to me, but what is the purpose of this
> script? Is it for us to manually add an entry in the database to run a
> recheck?

No - this script cannot insert an entry.  However, it can modify
existing entries.  This lets us define a set of state transitions for
each operation that a given type of recheck might need.  For instance,
with GitHub, we need to kick off the new testing, then monitor for the
new results, and then report the new results.

These could be represented by individual states.  But, for example, if
there's a future where we add an option to re-apply to a new tree or
something, we can insert a state to handle that.
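
To make that concrete, here is a minimal sketch of such a transition
table.  Only state 0 comes from the patch (it is the value written when
a request is first recorded); the other state numbers, their meanings,
and the next_state helper are assumptions for illustration.

```shell
# Hypothetical state numbering; the patch only fixes 0 as the initial state.
next_state() {
    case "$1" in
        0) echo 1 ;;    # requested -> testing kicked off
        1) echo 2 ;;    # testing kicked off -> results available
        2) echo 3 ;;    # results available -> results reported
        *) return 1 ;;  # unknown or final state: no transition
    esac
}

state=0
while new=$(next_state "$state"); do
    # A lab script would do the real work here, then record the move with
    # recheck_tool --update --state=$state --new-state=$new (plus the
    # --series-id, --msgid and --filter options from the help text).
    echo "state $state -> $new"
    state=$new
done
```

Adding a new step later (say, re-applying to a new tree) would then
mean inserting one more state and renumbering or extending the table.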

>> @@ -0,0 +1,93 @@
>> +#!/bin/sh
>> +# SPDX-Identifier: gpl-2.0-or-later
>> +# Copyright (C) 2023 Red Hat, Inc.
>> +#
>> +# Licensed under the terms of the GNU General Public License as published
>> +# by the Free Software Foundation; either version 2 of the License, or
>> +# (at your option) any later version.  You may obtain a copy of the
>> +# license at
>> +#
>> +#    https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
>> +#
>> +# Unless required by applicable law or agreed to in writing, software
>> +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
>> +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
>> +# License for the specific language governing permissions and limitations
>> +# under the License.
>> +
>> +mode="select"
>> +
>> +while [ "$1" != "" ]; do
>> +    if echo "$1" | grep -E ^--help >/dev/null 2>&1; then
>> +        echo "recheck / retest state machine script"
>> +        echo ""
>> +        echo "$0:"
>> +        echo " --pw-project=<proj>:    Patchwork project."
>> +        echo " --pw-instance=<inst>:   Patchwork instance."
>> +        echo " --filter=<str>:         Job / request for recheck."
>> +        echo " --state=<0..>:          Resync state ID."
>> +        echo " --msgid=<msgid>:                Message ID to select."
>> +        echo " --update:               Set tool in update mode"
>> +        echo " --new-state=<0..>:      New state ID to set"
>> +        echo " --series-id=<..>:       Series ID"
>> +        echo ""
>> +        echo "Will spit out a parsable json for each db line when selecting"
>> +        exit 0
>> +    elif echo "$1" | grep -E ^--pw-project= >/dev/null 2>&1; then
>> +        pw_project=$(echo "$1" | sed s/--pw-project=//)
>> +    elif echo "$1" | grep -E ^--pw-instance= >/dev/null 2>&1; then
>> +        pw_instance=$(echo "$1" | sed s/--pw-instance=//)
>> +    elif echo "$1" | grep -E ^--filter= >/dev/null 2>&1; then
>> +        filter=$(echo "$1" | sed s/--filter=//)
>> +    elif echo "$1" | grep -E ^--state= >/dev/null 2>&1; then
>> +        recheck_state=$(echo "$1" | sed s/--state=//)
>> +    elif echo "$1" | grep -E ^--msgid= >/dev/null 2>&1; then
>> +        message_id=$(echo "$1" | sed s/--msgid=//)
>> +    elif echo "$1" | grep -E ^--update >/dev/null 2>&1; then
>> +        mode="update"
>> +    elif echo "$1" | grep -E ^--new-state= >/dev/null 2>&1; then
>> +        new_recheck_state=$(echo "$1" | sed s/--new-state=//)
>> +    elif echo "$1" | grep -E ^--series-id= >/dev/null 2>&1; then
>> +        series_id=$(echo "$1" | sed s/--series-id=//)
>> +    else
>> +        echo "unknown option $1"
>> +        exit 1
>> +    fi
>> +    shift
>> +done
>> +
>> +source $(dirname $0)/series_db_lib.sh
>> +
>> +if [ "$mode" == "select" ]; then
>> +    echo "{\"rechecks\":["
>> +    for request in $(get_recheck_requests_by_project "$pw_instance" \
>> +                                                     "$pw_project" \
>> +                                                     "$recheck_state" \
>> +                                                     "$filter"); do
>> +        message_id=$(echo $request | cut -d\| -f1)
>> +        series_id=$(echo $request | cut -d\| -f2)
>> +        echo "{\"pw_instance\": \"$pw_instance\", \"series_id\":$series_id, \"msg_id\":\"$message_id\", \"state\":\"$recheck_state\", \"requested\": \"$filter\"}"
>> +    done
>> +    echo "]}"
>> +elif [ "$mode" == "update" ]; then
>> +    if [ "X$new_recheck_state" == "X" -o "X$series_id" == "X" ]; then
>> +        echo "Need to set a series-id and a new recheck state when updating."
>> +        exit 1
>> +    fi
>> +
>> +    request=$(get_recheck_request "$pw_instance" "$pw_project" "$message_id" \
>> +                                  "$filter" "$series_id" "$recheck_state")
>> +    if [ "X$request" == "X" ]; then
>> +        echo "{\"result\":\"notfound\"}"
>> +        exit 0
>> +    fi
>> +
>> +    set_recheck_request_state "$pw_instance" "$pw_project" "$message_id" \
>> +                              "$filter" "$series_id" "$new_recheck_state"
>> +
>> +    echo "{\"result\":\"executed\",\"recheck\":{\"pw_instance\": \"$pw_instance\", \"series_id\":$series_id, \"msg_id\":\"$message_id\", \"state\":\"$new_recheck_state\", \"requested\": \"$filter\"}}"
>> +else
>> +    echo "Uknown state: $mode"
> s/Uknown/Unknown/

ACK

>> +    exit 1
>> +fi
>> +
>> diff --git a/series_db_lib.sh b/series_db_lib.sh
>> index 6c2d98e..a729337 100644
>> --- a/series_db_lib.sh
>> +++ b/series_db_lib.sh
>> @@ -1,6 +1,6 @@
>>  #!/bin/sh
>>  # SPDX-Identifier: gpl-2.0-or-later
>> -# Copyright (C) 2018,2019 Red Hat, Inc.
>> +# Copyright (C) 2018-2023 Red Hat, Inc.
>>  #
>>  # Licensed under the terms of the GNU General Public License as published
>>  # by the Free Software Foundation; either version 2 of the License, or
>> @@ -114,6 +114,21 @@ EOF
>>          run_db_command "INSERT INTO series_schema_version(id) values (7);"
>>      fi
>>
>> +    run_db_command "select * from series_schema_version;" | egrep '^8$' > /dev/null 2>&1
>> +    if [ $? -eq 1 ]; then
>> +        sqlite3 ${HOME}/.series-db <<EOF
>> +CREATE TABLE recheck_requests (
>> +recheck_id INTEGER,
>> +recheck_message_id STRING,
>> +recheck_requested_by STRING,
>> +recheck_series STRING,
>> +patchwork_instance STRING,
>> +patchwork_project STRING,
>> +recheck_sync INTEGER
>> +);
>> +EOF
>> +        run_db_command "INSERT INTO series_schema_version(id) values (8);"
>> +    fi
>>  }
>>
>>  function series_db_exists() {
>> @@ -390,3 +405,48 @@ function get_patch_id_by_series_id_and_sha() {
>>
>>      echo "select patch_id from git_builds where patchwork_instance=\"$instance\" and series_id=$series_id and sha=\"$sha\";" | series_db_execute
>>  }
>> +
>> +function get_recheck_requests_by_project() {
>> +    local recheck_instance="$1"
>> +    local recheck_project="$2"
>> +    local recheck_state="$3"
>> +    local recheck_requested_by="$4"
>> +
>> +    series_db_exists
>> +
>> +    echo "select recheck_message_id,recheck_series from recheck_requests where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_sync=$recheck_state and recheck_requested_by=\"$recheck_requested_by\";" | series_db_execute
>> +}
>> +
>> +function insert_recheck_request_if_needed() {
>> +    local recheck_instance="$1"
>> +    local recheck_project="$2"
>> +    local recheck_msgid="$3"
>> +    local recheck_requested_by="$4"
>> +    local recheck_series="$5"
>> +
>> +    if ! echo "select * from recheck_requests where recheck_message_id=\"$recheck_msgid\";" | series_db_execute | grep $recheck_msgid >/dev/null 2>&1; then
>> +        echo "INSERT INTO recheck_requests (recheck_message_id, recheck_requested_by, recheck_series, patchwork_instance, patchwork_project, recheck_sync) values (\"$recheck_msgid\", \"$recheck_requested_by\", \"$recheck_series\", \"$recheck_instance\", \"$recheck_project\", 0);" | series_db_execute
>> +    fi
>> +}
>> +
>> +function get_recheck_request() {
>> +    local recheck_instance="$1"
>> +    local recheck_project="$2"
>> +    local recheck_msgid="$3"
>> +    local recheck_requested_by="$4"
>> +    local recheck_series="$5"
>> +    local recheck_state="$6"
>> +
>> +    echo "select * from recheck_requests where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_requested_by=\"$recheck_requested_by\" and recheck_series=\"$recheck_series\" and recheck_message_id=\"$recheck_msgid\" and recheck_sync=$recheck_state;" | series_db_execute
>> +}
>> +
>> +function set_recheck_request_state() {
>> +    local recheck_instance="$1"
>> +    local recheck_project="$2"
>> +    local recheck_msgid="$3"
>> +    local recheck_requested_by="$4"
>> +    local recheck_series="$5"
>> +    local recheck_state="$6"
>> +
>> +    echo "UPDATE recheck_requests set recheck_sync=$recheck_state where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_requested_by=\"$recheck_requested_by\" and recheck_series=\"$recheck_series\";" | series_db_execute
>> +}
>> --
>> 2.41.0
>>


