From: Aaron Conole
To: ci@dpdk.org
Cc: Michael Santana,
    Ilya Maximets, Jeremy Kerr
Subject: [PATCH 2/2] post_pw: Store submitted checks locally as well
Date: Mon, 22 Jan 2024 12:26:35 -0500
Message-ID: <20240122172635.3641078-3-aconole@redhat.com>
In-Reply-To: <20240122172635.3641078-1-aconole@redhat.com>
References: <20240122172635.3641078-1-aconole@redhat.com>
List-Id: DPDK CI discussions

Jeremy Kerr reports that our PW check reporting made 43000 API calls
in a single day. That is unacceptable. We can store the URLs we have
already submitted and skip any further processing for them, at least
on the PW side.

This patch does two things to mitigate the issue:

1. Store each patch ID and URL in the series DB to record that we
   reported the check. This means we don't need to poll patchwork
   for check status.

2. Store the last modified time of the reports mailing list. This
   means we only fetch and parse the mailing list index when a new
   email has landed.

Signed-off-by: Aaron Conole
---
 post_pw.sh       | 35 ++++++++++++++++++++++++++++++++++-
 series_db_lib.sh | 25 +++++++++++++++++++++++++
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/post_pw.sh b/post_pw.sh
index fe2f41c..3e3a493 100755
--- a/post_pw.sh
+++ b/post_pw.sh
@@ -20,6 +20,7 @@
 # License for the specific language governing permissions and limitations
 # under the License.
 
+[ -f "$(dirname $0)/series_db_lib.sh" ] && source "$(dirname $0)/series_db_lib.sh" || exit 1
 [ -f "${HOME}/.mail_patchwork_sync.rc" ] && source "${HOME}/.mail_patchwork_sync.rc"
 
 # Patchwork instance to update with new reports from mailing list
@@ -75,6 +76,13 @@ send_post() {
     if [ -z "$context" -o -z "$state" -o -z "$description" -o -z "$patch_id" ]; then
         echo "Skpping \"$link\" due to missing context, state, description," \
              "or patch_id" 1>&2
+        # Record these "bad" reports too, so that we never reprocess them.
+        add_check_scanned_url "$patch_id" "$target_url"
+        return 0
+    fi
+
+    if check_id_exists "$patch_id" "$target_url" ; then
+        echo "Skipping \"$link\" - already reported." 1>&2
         return 0
     fi
 
@@ -84,6 +92,7 @@ send_post() {
         "$api_url")"
     if [ $? -ne 0 ]; then
         echo "Failed to get proper server response on link ${api_url}" 1>&2
+        # Don't store these as processed in case the server has a temporary issue.
         return 0
     fi
 
@@ -95,6 +104,9 @@ send_post() {
        jq -e "[.[].target_url] | contains([\"$mail_url\"])" >/dev/null
     then
         echo "Report ${target_url} already pushed to patchwork. Skipping." 1>&2
+        # Somehow this was not stored (for example, the first time we apply the
+        # tracking feature). Store it now.
+        add_check_scanned_url "$patch_id" "$target_url"
         return 0
     fi
 
@@ -114,12 +126,31 @@ send_post() {
     if [ $? -ne 0 ]; then
         echo -e "Failed to push retults based on report ${link} to the"\
                 "patchwork instance ${pw_instance} using the following REST"\
-                "API Endpoint ${api_url} with the following data:\n$data\n"
+                "API Endpoint ${api_url} with the following data:\n$data\n" 1>&2
         return 0
     fi
+
+    add_check_scanned_url "$patch_id" "$target_url"
 }
 
+# Collect the date. NOTE: this still needs to accommodate month change-overs.
 year_month="$(date +"%Y-%B")"
+
+# Get the last modified time (HEAD request; curl prints no headers otherwise)
+report_last_mod=$(curl -A "pw-post" -sSfI "${mail_archive}${year_month}/thread.html" | grep -i '^Last-Modified')
+
+mailing_list_save_file=$(echo ".post_pw_${mail_archive}${year_month}" | sed -e "s@/@_@g" -e "s@:@_@g" -e "s,@,_,g")
+
+if [ -e "${HOME}/${mailing_list_save_file}" ]; then
+    last_read_date=$(cat "${HOME}/${mailing_list_save_file}")
+    if [ "$last_read_date" == "$report_last_mod" ]; then
+        echo "Last modified times match. Skipping list parsing."
+        exit 0
+    fi
+else
+    touch "${HOME}/${mailing_list_save_file}"
+fi
+
 reports="$(curl -A "pw-post" -sSf "${mail_archive}${year_month}/thread.html" | \
          grep -i 'HREF=' | sed -e 's@[0-9]*<LI><A HREF="@\|@' -e 's@">@\|@')"
 if [ $? -ne 0 ]; then
@@ -132,3 +163,5 @@ echo "$reports" | while IFS='|' read -r blank link title; do
         send_post "${mail_archive}${year_month}/$link"
     fi
 done
+
+echo "$report_last_mod" > "${HOME}/${mailing_list_save_file}"
diff --git a/series_db_lib.sh b/series_db_lib.sh
index c5f42e0..0635469 100644
--- a/series_db_lib.sh
+++ b/series_db_lib.sh
@@ -130,6 +130,17 @@ recheck_sync INTEGER
 EOF
         run_db_command "INSERT INTO series_schema_version(id) values (8);"
     fi
+
+    run_db_command "select * from series_schema_version;" | egrep '^9$' > /dev/null 2>&1
+    if [ $? -eq 1 ]; then
+        sqlite3 ${HOME}/.series-db </dev/null 2>&1
+}
-- 
2.41.0
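
Point 2 of the commit message (only parse the list when its last modified time has changed) can be looked at in isolation. The following is a minimal sketch and not the code in this patch: `list_changed` and `record_last_mod` are hypothetical helper names, and the header value is passed in as an argument instead of being fetched with curl, so the comparison can be exercised without network access. Treating an empty header as "changed" is an assumption made here so that a failed fetch never suppresses a scan.

```shell
#!/bin/bash
# Hypothetical helpers (not part of post_pw.sh) sketching the
# "skip list parsing when Last-Modified is unchanged" idea.

# list_changed STATE_FILE HEADER
# Returns 0 (true) when the list should be parsed: HEADER is empty,
# no state has been saved yet, or HEADER differs from the saved value.
list_changed() {
    local state_file="$1"
    local header="$2"

    # Unknown state (for example, a failed fetch): do not skip the scan.
    [ -z "$header" ] && return 0
    # First run: nothing saved to compare against.
    [ -f "$state_file" ] || return 0
    # Parse only if the saved value differs from the current one.
    [ "$(cat "$state_file")" != "$header" ]
}

# record_last_mod STATE_FILE HEADER
# Saves HEADER for the next run; an empty HEADER is never recorded.
record_last_mod() {
    local state_file="$1"
    local header="$2"

    if [ -n "$header" ]; then
        printf '%s\n' "$header" > "$state_file"
    fi
}
```

In this arrangement a caller would check `list_changed` before parsing and call `record_last_mod` only after the report loop has finished, so that an interrupted run is retried on the next invocation.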