From: Aaron Conole
To: ci@dpdk.org
Cc: Michael Santana, Ilya Maximets, Jeremy Kerr
Subject: [PATCH v3 2/2] post_pw: Store submitted checks locally as well
Date: Mon, 22 Jan 2024 18:40:34 -0500
Message-ID: <20240122234034.3883647-3-aconole@redhat.com>
In-Reply-To: <20240122234034.3883647-1-aconole@redhat.com>
References: <20240122234034.3883647-1-aconole@redhat.com>
List-Id: DPDK CI discussions

Jeremy Kerr reports that our PW check reporting submitted 43000 API
calls in a single day. That is alarmingly unacceptable. We can store
the URLs we've already submitted and then skip any additional
processing, at least on the PW side.

This patch does two things to try to mitigate this issue:

1. Store each patch ID and URL in the series DB to show that we
   reported the check. This means we don't need to poll patchwork
   for check status.

2. Store the last modified time of the reports mailing list. This
   means we only parse the mailing list archive when a new email
   has landed.

Signed-off-by: Aaron Conole
---
v2: fixed up the Last-Modified grep and storage
v3: Simplified the logic of creating the last-access file

 post_pw.sh       | 35 ++++++++++++++++++++++++++++++++++-
 series_db_lib.sh | 25 +++++++++++++++++++++++++
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/post_pw.sh b/post_pw.sh
index 9163ea1..a8111ff 100755
--- a/post_pw.sh
+++ b/post_pw.sh
@@ -20,6 +20,7 @@
 # License for the specific language governing permissions and limitations
 # under the License.
 
+[ -f "$(dirname $0)/series_db_lib.sh" ] && source "$(dirname $0)/series_db_lib.sh" || exit 1
 [ -f "${HOME}/.mail_patchwork_sync.rc" ] && source "${HOME}/.mail_patchwork_sync.rc"
 
 # Patchwork instance to update with new reports from mailing list
@@ -75,6 +76,13 @@ send_post() {
     if [ -z "$context" -o -z "$state" -o -z "$description" -o -z "$patch_id" ]; then
         echo "Skpping \"$link\" due to missing context, state, description," \
              "or patch_id" 1>&2
+        # Just don't want to even bother seeing these "bad" patches as well.
+        add_check_scanned_url "$patch_id" "$target_url"
+        return 0
+    fi
+
+    if check_id_exists "$patch_id" "$target_url" ; then
+        echo "Skipping \"$link\" - already reported." 1>&2
         return 0
     fi
 
@@ -84,6 +92,7 @@ send_post() {
              "$api_url")"
     if [ $? -ne 0 ]; then
         echo "Failed to get proper server response on link ${api_url}" 1>&2
+        # Don't store these as processed in case the server has a temporary issue.
         return 0
     fi
 
@@ -95,6 +104,9 @@ send_post() {
        jq -e "[.[].target_url] | contains([\"$mail_url\"])" >/dev/null
     then
         echo "Report ${target_url} already pushed to patchwork. Skipping." 1>&2
+        # Somehow this was not stored (for example, first time we apply the tracking
+        # feature). Store it now.
+        add_check_scanned_url "$patch_id" "$target_url"
         return 0
     fi
 
@@ -114,12 +126,31 @@ send_post() {
     if [ $? -ne 0 ]; then
         echo -e "Failed to push retults based on report ${link} to the"\
                 "patchwork instance ${pw_instance} using the following REST"\
-                "API Endpoint ${api_url} with the following data:\n$data\n"
+                "API Endpoint ${api_url} with the following data:\n$data\n" 1>&2
         return 0
     fi
+
+    add_check_scanned_url "$patch_id" "$target_url"
 }
 
+# Collect the date. NOTE: this needs some accommodation to catch the month change-overs
 year_month="$(date +"%Y-%B")"
+
+# Get the last modified time
+report_last_mod=$(curl --head -A "(pw-ci) pw-post" -sSf "${mail_archive}${year_month}/thread.html" | grep -i Last-Modified)
+
+mailing_list_save_file=$(echo ".post_pw_${mail_archive}${year_month}" | sed -e "s@/@_@g" -e "s@:@_@g" -e "s,@,_,g")
+
+if [ -e "${HOME}/${mailing_list_save_file}" ]; then
+    last_read_date=$(cat "${HOME}/${mailing_list_save_file}")
+    if [ "$last_read_date" -a "$last_read_date" == "$report_last_mod" ]; then
+        echo "Last modified times match. Skipping list parsing."
+        exit 0
+    fi
+fi
+
+last_read_date="$report_last_mod"
+
 reports="$(curl -A "(pw-ci) pw-post" -sSf "${mail_archive}${year_month}/thread.html" | \
          grep -i 'HREF=' | sed -e 's@[0-9]* • @\|@')"
 if [ $? -ne 0 ]; then
@@ -132,3 +163,5 @@ echo "$reports" | while IFS='|' read -r blank link title; do
         send_post "${mail_archive}${year_month}/$link"
     fi
 done
+
+echo "$last_read_date" > "${HOME}/${mailing_list_save_file}"
diff --git a/series_db_lib.sh b/series_db_lib.sh
index c5f42e0..0635469 100644
--- a/series_db_lib.sh
+++ b/series_db_lib.sh
@@ -130,6 +130,17 @@ recheck_sync INTEGER
 EOF
         run_db_command "INSERT INTO series_schema_version(id) values (8);"
     fi
+
+    run_db_command "select * from series_schema_version;" | egrep '^9$' > /dev/null 2>&1
+    if [ $? -eq 1 ]; then
+        sqlite3 ${HOME}/.series-db </dev/null 2>&1
+}
-- 
2.41.0
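
A minimal sketch of the idea behind item 1 of the commit message:
remember each (patch ID, report URL) pair once it has been handled, and
skip it on later runs. To stay self-contained it substitutes a flat
file for the sqlite series DB that the patch uses. SEEN_FILE, seen_add,
seen_has and report_once are hypothetical names that do not exist in
pw-ci, and the echo calls stand in for the REST request.

```shell
#!/bin/sh
# Illustrative only: a flat file replaces the sqlite series DB.
# SEEN_FILE, seen_add, seen_has and report_once are hypothetical names.
SEEN_FILE="${SEEN_FILE:-$(mktemp)}"

# Record that a (patch_id, target_url) pair has been handled.
seen_add() {
    printf '%s|%s\n' "$1" "$2" >> "$SEEN_FILE"
}

# Succeed only if this exact pair was recorded earlier.
# -F: fixed string, -x: whole-line match, -q: no output.
seen_has() {
    [ -f "$SEEN_FILE" ] && grep -Fxq -- "$1|$2" "$SEEN_FILE"
}

# Report a check at most once. The echo stands in for the API call.
report_once() {
    if seen_has "$1" "$2"; then
        echo "skip $1"
        return 0
    fi
    echo "post $1"
    seen_add "$1" "$2"
}
```

Calling report_once twice with the same patch ID and URL posts only the
first time. The same patch ID with a different report URL is still
posted, since the pair is the key.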