From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 670204399F; Mon, 22 Jan 2024 20:32:39 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2F91340DDA; Mon, 22 Jan 2024 20:32:39 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id A65EE40298 for ; Mon, 22 Jan 2024 20:32:38 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1705951958; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=f+q+Av/vy0qfHm/S51gI7w/E8ci7ipHMeOBI7EoPq5U=; b=gBXN/mGO0O1A446RidM2gNEAXeCz0Anlg/ME/XebvIrPaTDZEblDy+y6R1Zi2QocWegyI4 VS0XdEWw4RMqXzMJnxcv/uB0TPsix8OPqye7t5afXgRRUoVKf36UvWatzbNkJhNOZ8WlrI wRanshc55Z0LwN/SXzgeHxfPRTYWZhY= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-208-CgIxeZ5hOjSd94kNEq99mA-1; Mon, 22 Jan 2024 14:32:33 -0500 X-MC-Unique: CgIxeZ5hOjSd94kNEq99mA-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 65385185A780; Mon, 22 Jan 2024 19:32:33 +0000 (UTC) Received: from RHTPC1VM0NT.redhat.com (unknown [10.22.33.141]) by smtp.corp.redhat.com (Postfix) with ESMTP id 308C4492BC7; Mon, 22 Jan 2024 19:32:33 +0000 (UTC) From: Aaron Conole To: ci@dpdk.org Cc: Michael Santana , Ilya Maximets , Jeremy Kerr Subject: [PATCH v2 2/2] post_pw: Store submitted checks locally as well Date: Mon, 22 Jan 2024 14:32:32 -0500 Message-ID: <20240122193232.3734371-3-aconole@redhat.com> In-Reply-To: <20240122193232.3734371-1-aconole@redhat.com> References: <20240122193232.3734371-1-aconole@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.9 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="US-ASCII"; x-default=true X-BeenThere: ci@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK CI discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ci-bounces@dpdk.org Jeremy Kerr reports that our PW checks reporting submitted 43000 API calls in just a single day. That is alarmingly unacceptable. We can store the URLs we've already submitted and then just skip over any additional processing at least on the PW side. This patch does two things to try and mitigate this issue: 1. Store each patch ID and URL in the series DB to show that we reported the check. This means we don't need to poll patchwork for check status 2. Store the last modified time of the reports mailing list. This means we only poll the mailing list when a new email has surely landed. Signed-off-by: Aaron Conole --- v2: fixed up the Last-Modified grep and storage post_pw.sh | 38 +++++++++++++++++++++++++++++++++++++- series_db_lib.sh | 25 +++++++++++++++++++++++++ 2 files changed, 62 insertions(+), 1 deletion(-) diff --git a/post_pw.sh b/post_pw.sh index 9163ea1..a23bdc5 100755 --- a/post_pw.sh +++ b/post_pw.sh @@ -20,6 +20,7 @@ # License for the specific language governing permissions and limitations # under the License. +[ -f "$(dirname $0)/series_db_lib.sh" ] && source "$(dirname $0)/series_db_lib.sh" || exit 1 [ -f "${HOME}/.mail_patchwork_sync.rc" ] && source "${HOME}/.mail_patchwork_sync.rc" # Patchwork instance to update with new reports from mailing list @@ -75,6 +76,13 @@ send_post() { if [ -z "$context" -o -z "$state" -o -z "$description" -o -z "$patch_id" ]; then echo "Skpping \"$link\" due to missing context, state, description," \ "or patch_id" 1>&2 + # Just don't want to even bother seeing these "bad" patches as well. + add_check_scanned_url "$patch_id" "$target_url" + return 0 + fi + + if check_id_exists "$patch_id" "$target_url" ; then + echo "Skipping \"$link\" - already reported." 1>&2 return 0 fi @@ -84,6 +92,7 @@ send_post() { "$api_url")" if [ $? -ne 0 ]; then echo "Failed to get proper server response on link ${api_url}" 1>&2 + # Don't store these as processed in case the server has a temporary issue. return 0 fi @@ -95,6 +104,9 @@ send_post() { jq -e "[.[].target_url] | contains([\"$mail_url\"])" >/dev/null then echo "Report ${target_url} already pushed to patchwork. Skipping." 1>&2 + # Somehow this was not stored (for example, first time we apply the tracking + # feature). Store it now. + add_check_scanned_url "$patch_id" "$target_url" return 0 fi @@ -114,12 +126,34 @@ send_post() { if [ $? -ne 0 ]; then echo -e "Failed to push retults based on report ${link} to the"\ "patchwork instance ${pw_instance} using the following REST"\ - "API Endpoint ${api_url} with the following data:\n$data\n" + "API Endpoint ${api_url} with the following data:\n$data\n" 1>&2 return 0 fi + + add_check_scanned_url "$patch_id" "$target_url" } +# Collect the date. NOTE: this needs some accomodate to catch the month change-overs year_month="$(date +"%Y-%B")" + +# Get the last modified time +report_last_mod=$(curl --head -A "(pw-ci) pw-post" -sSf "${mail_archive}${year_month}/thread.html" | grep -i Last-Modified) + +mailing_list_save_file=$(echo ".post_pw_${mail_archive}${year_month}" | sed -e "s@/@_@g" -e "s@:@_@g" -e "s,@,_,g") + +if [ -e "${HOME}/${mailing_list_save_file}" ]; then + last_read_date=$(cat "${HOME}/${mailing_list_save_file}") + if [ "$last_read_date" -a "$last_read_date" == "$report_last_mod" ]; then + echo "Last modified times match. Skipping list parsing." + exit 0 + else + last_read_date="$report_last_mod" + fi +else + last_read_date="Failed curl." + touch "${HOME}/${mailing_list_save_file}" +fi + reports="$(curl -A "(pw-ci) pw-post" -sSf "${mail_archive}${year_month}/thread.html" | \ grep -i 'HREF=' | sed -e 's@[0-9]*
  • @\|@')" if [ $? -ne 0 ]; then @@ -132,3 +166,5 @@ echo "$reports" | while IFS='|' read -r blank link title; do send_post "${mail_archive}${year_month}/$link" fi done + +echo "$last_read_date" > "${HOME}/${mailing_list_save_file}" diff --git a/series_db_lib.sh b/series_db_lib.sh index c5f42e0..0635469 100644 --- a/series_db_lib.sh +++ b/series_db_lib.sh @@ -130,6 +130,17 @@ recheck_sync INTEGER EOF run_db_command "INSERT INTO series_schema_version(id) values (8);" fi + + run_db_command "select * from series_schema_version;" | egrep '^9$' > /dev/null 2>&1 + if [ $? -eq 1 ]; then + sqlite3 ${HOME}/.series-db </dev/null 2>&1 +} -- 2.41.0