From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ci-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 0666F439A2;
	Tue, 23 Jan 2024 06:49:48 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 00747402B8;
	Tue, 23 Jan 2024 06:49:47 +0100 (CET)
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by mails.dpdk.org (Postfix) with ESMTP id DB2F94025D
 for <ci@dpdk.org>; Tue, 23 Jan 2024 06:49:46 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1705988986;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=xl/+tpraUVFQqQPqf1yfPiatoiw0cPk0bC1mi4+eaZ8=;
 b=dTdeI3slHgfb7NH5liRfQLBArzLtpePdT52atbPRHpM261sKgHhTRDUYt0yn+xLkt0iJ1E
 9dBSCG2PTgOhYpz9FCqJzGKetG5Rkc69R0XirBl4eNT2vARDSTISkvHcozzKRbfN1+JHZB
 etx7omUZnO3TYt1WWdcbZZPvM66V3r0=
Received: from mail-ej1-f71.google.com (mail-ej1-f71.google.com
 [209.85.218.71]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id
 us-mta-306-DblbFZB5NRu7q2ChSkeaqA-1; Tue, 23 Jan 2024 00:49:44 -0500
X-MC-Unique: DblbFZB5NRu7q2ChSkeaqA-1
Received: by mail-ej1-f71.google.com with SMTP id
 a640c23a62f3a-a2c4e9cb449so218919266b.1
 for <ci@dpdk.org>; Mon, 22 Jan 2024 21:49:44 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1705988983; x=1706593783;
 h=content-transfer-encoding:cc:to:subject:message-id:date:from
 :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc
 :subject:date:message-id:reply-to;
 bh=xl/+tpraUVFQqQPqf1yfPiatoiw0cPk0bC1mi4+eaZ8=;
 b=Spqr4E3ccTVUrj50yV3j/es3X0HWzkkelHrE17g2i1niZLIroTV33MgniMTjm6voJ9
 GO0BC+M2PNFGmvhTrvMrfAlUZtlAlSsb51XY89EBqx9C5FFJpSZkccA+B9rTXh1HK1AM
 LSHbBqL1+I/7344DUokRZkj7nVmf+qnw6QzMhuOgCi6bI03HrqnEywzwuWM8pkQWHPPr
 asRZUQdoJKs7ErIODyJQYdlPYLZS9mibDtpe46ASvoVDkt0FDGZ4N3FBBOa/z7ZswZoT
 FHyZm4jcgxCdP8UZfbvhnLYOtky/rztlj96PHXOzvyJQAnIHqaaXkwIVd9f+zNXGakw8
 LyPw==
X-Gm-Message-State: AOJu0YzsT42KQYuNGv+Dy6AcZSELpJ5ZHNiKMGE07GUqhmosXYRBOkGF
 yqEB1jcIKT7J+nPgovn6OddIOT38rbeQjfOFHFJ2fxiY/4lSMB9b8n/M9vdId3BVwegRuOesCCs
 57Q6/rdFlCrqN3wfxiVFG7zqHJ+/iRqOqHDvsVtQCiYieO3q/Ox5T+SQsi9BSQIT1yhM/eplugk
 MZXNP/0nD/vtzS7fawibx6SgBD
X-Received: by 2002:a17:907:c207:b0:a30:494:75e6 with SMTP id
 ti7-20020a170907c20700b00a30049475e6mr1097398ejc.182.1705988982927; 
 Mon, 22 Jan 2024 21:49:42 -0800 (PST)
X-Google-Smtp-Source: AGHT+IGf6k/AFie7Srqd0Y8f88AywuIl1x3FI/BOWTWKrPVSQ/B04N4haLrV8B1U4bS19MVBq1tXt+pFZ9uiedXL2CI=
X-Received: by 2002:a17:907:c207:b0:a30:494:75e6 with SMTP id
 ti7-20020a170907c20700b00a30049475e6mr1097395ejc.182.1705988982590; Mon, 22
 Jan 2024 21:49:42 -0800 (PST)
MIME-Version: 1.0
References: <20240122234034.3883647-1-aconole@redhat.com>
 <20240122234034.3883647-3-aconole@redhat.com>
In-Reply-To: <20240122234034.3883647-3-aconole@redhat.com>
From: Michael Santana <msantana@redhat.com>
Date: Tue, 23 Jan 2024 00:49:31 -0500
Message-ID: <CABVNPRqrftpQQBZ2LkeNzy8cAOXLwV5ywnawc4+g6yBAxU9RGQ@mail.gmail.com>
Subject: Re: [PATCH v3 2/2] post_pw: Store submitted checks locally as well
To: Aaron Conole <aconole@redhat.com>
Cc: ci@dpdk.org, Ilya Maximets <i.maximets@ovn.org>,
 Jeremy Kerr <jk@ozlabs.org>
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-BeenThere: ci@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK CI discussions <ci.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/ci>,
 <mailto:ci-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/ci/>
List-Post: <mailto:ci@dpdk.org>
List-Help: <mailto:ci-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/ci>,
 <mailto:ci-request@dpdk.org?subject=subscribe>
Errors-To: ci-bounces@dpdk.org

On Mon, Jan 22, 2024 at 6:40=E2=80=AFPM Aaron Conole <aconole@redhat.com> w=
rote:
>
> Jeremy Kerr reports that our PW checks reporting submitted 43000 API call=
s
> in just a single day.  That is alarmingly unacceptable.  We can store the
> URLs we've already submitted and then just skip over any additional
> processing at least on the PW side.
>
> This patch does two things to try and mitigate this issue:
>
> 1. Store each patch ID and URL in the series DB to show that we reported
>    the check.  This means we don't need to poll patchwork for check statu=
s
>
> 2. Store the last modified time of the reports mailing list.  This means
>    we only poll the mailing list when a new email has surely landed.
>
> Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Michael Santana <msantana@redhat.com>
> ---
> v2: fixed up the Last-Modified grep and storage
> v3: Simplified the logic of creating the last-access file
>
>  post_pw.sh       | 35 ++++++++++++++++++++++++++++++++++-
>  series_db_lib.sh | 25 +++++++++++++++++++++++++
>  2 files changed, 59 insertions(+), 1 deletion(-)
>
> diff --git a/post_pw.sh b/post_pw.sh
> index 9163ea1..a8111ff 100755
> --- a/post_pw.sh
> +++ b/post_pw.sh
> @@ -20,6 +20,7 @@
>  # License for the specific language governing permissions and limitation=
s
>  # under the License.
>
> +[ -f "$(dirname $0)/series_db_lib.sh" ] && source "$(dirname $0)/series_=
db_lib.sh" || exit 1
>  [ -f "${HOME}/.mail_patchwork_sync.rc" ] && source "${HOME}/.mail_patchw=
ork_sync.rc"
>
>  # Patchwork instance to update with new reports from mailing list
> @@ -75,6 +76,13 @@ send_post() {
>      if [ -z "$context" -o -z "$state" -o -z "$description" -o -z "$patch=
_id" ]; then
>          echo "Skpping \"$link\" due to missing context, state, descripti=
on," \
>               "or patch_id" 1>&2
> +        # Just don't want to even bother seeing these "bad" patches as w=
ell.
> +        add_check_scanned_url "$patch_id" "$target_url"
> +        return 0
> +    fi
> +
> +    if check_id_exists "$patch_id" "$target_url" ; then
> +        echo "Skipping \"$link\" - already reported." 1>&2
>          return 0
>      fi
>
> @@ -84,6 +92,7 @@ send_post() {
>          "$api_url")"
>      if [ $? -ne 0 ]; then
>          echo "Failed to get proper server response on link ${api_url}" 1=
>&2
> +        # Don't store these as processed in case the server has a tempor=
ary issue.
>          return 0
>      fi
>
> @@ -95,6 +104,9 @@ send_post() {
>      jq -e "[.[].target_url] | contains([\"$mail_url\"])" >/dev/null
>      then
>          echo "Report ${target_url} already pushed to patchwork. Skipping=
." 1>&2
> +        # Somehow this was not stored (for example, first time we apply =
the tracking
> +        # feature).  Store it now.
> +        add_check_scanned_url "$patch_id" "$target_url"
>          return 0
>      fi
>
> @@ -114,12 +126,31 @@ send_post() {
>      if [ $? -ne 0 ]; then
>          echo -e "Failed to push retults based on report ${link} to the"\
>                  "patchwork instance ${pw_instance} using the following R=
EST"\
> -                "API Endpoint ${api_url} with the following data:\n$data=
\n"
> +                "API Endpoint ${api_url} with the following data:\n$data=
\n" 1>&2
>          return 0
>      fi
> +
> +    add_check_scanned_url "$patch_id" "$target_url"
>  }
>
> +# Collect the date.  NOTE: this needs some accommodation to catch the =
month change-overs
>  year_month=3D"$(date +"%Y-%B")"
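On the month change-over NOTE: a report that lands late on the last day
of a month could be missed once the script starts looking at the new
month's archive.  A minimal sketch of one way to cover that, assuming
GNU date is available.  The helper name archive_months is hypothetical
and not part of this patch; it prints the archive directory name for a
given day and for the month before it, so both could be scanned.

```shell
#!/bin/bash
# Hypothetical helper, not part of the patch: print the archive
# directory names for the month of date $1 (YYYY-MM-DD) and for the
# month before it.  Assumes GNU date (-d with relative items).
archive_months() {
    date -d "$1" +%Y-%B
    # Pin the day to the 15th so "-1 month" cannot overshoot when
    # the given day is the 29th through the 31st.
    date -d "${1%-*}-15 -1 month" +%Y-%B
}
```

Calling archive_months "$(date +%Y-%m-%d)" would give the two names to
loop over.  The month names follow the current locale, same as the
existing year_month line.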
> +
> +# Get the last modified time
> +report_last_mod=3D$(curl --head -A "(pw-ci) pw-post" -sSf "${mail_archiv=
e}${year_month}/thread.html" | grep -i Last-Modified)
> +
> +mailing_list_save_file=3D$(echo ".post_pw_${mail_archive}${year_month}" =
| sed -e "s@/@_@g" -e "s@:@_@g" -e "s,@,_,g")
> +
> +if [ -e "${HOME}/${mailing_list_save_file}" ]; then
> +    last_read_date=3D$(cat "${HOME}/${mailing_list_save_file}")
> +    if [ "$last_read_date" -a "$last_read_date" =3D=3D "$report_last_mod=
" ]; then
> +        echo "Last modified times match.  Skipping list parsing."
> +        exit 0
> +    fi
> +fi
> +
> +last_read_date=3D"$report_last_mod"
> +
>  reports=3D"$(curl -A "(pw-ci) pw-post" -sSf "${mail_archive}${year_month=
}/thread.html" | \
>           grep -i 'HREF=3D' | sed -e 's@[0-9]*<LI><A HREF=3D"@\|@' -e 's@=
">@\|@')"
>  if [ $? -ne 0 ]; then
> @@ -132,3 +163,5 @@ echo "$reports" | while IFS=3D'|' read -r blank link =
title; do
>          send_post "${mail_archive}${year_month}/$link"
>      fi
>  done
> +
> +echo "$last_read_date" > "${HOME}/${mailing_list_save_file}"
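For reference, the skip decision above can be read as one small
predicate.  This is only a sketch under my own naming (list_unchanged is
hypothetical, not something the patch defines): it succeeds when the
saved Last-Modified line matches the fresh one, and it forces a re-scan
when the header is empty or the save file does not exist yet.

```shell
#!/bin/bash
# Hypothetical helper, not part of the patch: return 0 when the
# Last-Modified line saved in file $1 matches the fresh header $2.
list_unchanged() {
    # An empty header (for example a failed curl --head) or a missing
    # save file always forces a re-scan.
    [ -n "$2" ] && [ -f "$1" ] && printf '%s\n' "$2" | cmp -s - "$1"
}
```

The printf adds the same trailing newline that the echo into the save
file writes, so the byte-wise cmp lines up with what the script stores.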
> diff --git a/series_db_lib.sh b/series_db_lib.sh
> index c5f42e0..0635469 100644
> --- a/series_db_lib.sh
> +++ b/series_db_lib.sh
> @@ -130,6 +130,17 @@ recheck_sync INTEGER
>  EOF
>          run_db_command "INSERT INTO series_schema_version(id) values (8)=
;"
>      fi
> +
> +    run_db_command "select * from series_schema_version;" | egrep '^9$' =
> /dev/null 2>&1
> +    if [ $? -eq 1 ]; then
> +        sqlite3 ${HOME}/.series-db <<EOF
> +CREATE TABLE check_id_scanned (
> +check_patch_id INTEGER,
> +check_url STRING
> +)
> +EOF
> +        run_db_command "INSERT INTO series_schema_version(id) values (9)=
;"
> +    fi
>  }
>
>  function series_db_exists() {
> @@ -468,3 +479,17 @@ function set_recheck_request_state() {
>
>      echo "UPDATE recheck_requests set recheck_sync=3D$recheck_state wher=
e patchwork_instance=3D\"$recheck_instance\" and patchwork_project=3D\"$rec=
heck_project\" and recheck_requested_by=3D\"$recheck_requested_by\" and rec=
heck_series=3D\"$recheck_series\";" | series_db_execute
>  }
> +
> +function add_check_scanned_url() {
> +    local patch_id=3D"$1"
> +    local url=3D"$2"
> +
> +    echo "INSERT into check_id_scanned (check_patch_id, check_url) value=
s (${patch_id}, \"$url\");" | series_db_execute
> +}
> +
> +function check_id_exists() {
> +    local patch_id=3D"$1"
> +    local url=3D"$2"
> +
> +    echo "select * from check_id_scanned where check_patch_id=3D$patch_i=
d and check_url=3D\"$url\";" | series_db_execute | grep "$url" >/dev/null 2=
>&1
> +}
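One thing that may be worth a follow-up: the "missing context, state,
description, or patch_id" branch in send_post calls
add_check_scanned_url even when patch_id is the empty field, and the
INSERT would then read "values (, ...)", which sqlite should reject.
A minimal sketch of a guard, with a hypothetical helper name
(scanned_url_sql) that is not part of this patch.  It only builds the
statement, and it uses a single-quoted SQL string with doubled quotes,
which is my assumption about how the URL should be escaped.

```shell
#!/bin/bash
# Hypothetical helper, not part of the patch: print the INSERT for
# check_id_scanned, or fail when the patch ID is empty or non-numeric.
scanned_url_sql() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # would produce invalid SQL
    esac
    # Double any single quote so the URL stays one SQL string literal.
    printf "INSERT INTO check_id_scanned (check_patch_id, check_url) "
    printf "VALUES (%s, '%s');\n" "$1" \
        "$(printf '%s' "$2" | sed "s/'/''/g")"
}
```

The output could be piped into series_db_execute the same way the
functions above do, and a failing return would let the caller skip the
store step.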
> --
> 2.41.0
>