From: Aaron Conole
To: Patrick Robb
Cc: Ferruh Yigit, ci@dpdk.org, "Tu, Lijuan", zhoumin, Michael Santana, Lincoln Lavoie
Subject: Re: Email Based Re-Testing Framework
In-Reply-To: (Patrick Robb's message of "Tue, 13 Jun 2023 09:28:00 -0400")
References: <3fa6546b-8152-e317-30f0-30d5118b9fc4@amd.com>
Date: Tue, 20 Jun 2023 10:01:30 -0400
List-Id: DPDK CI discussions

Patrick Robb writes:

> 1. Ephemeral lab/network failures, or flaky unit tests that sometimes
>    fail. In this case, we probably just want to re-run the tree as-is.
>
> 2. Failing tree before apply. In this case, we have applied the series
>    to a tree, but the tree isn't in a good state and will fail
>    regardless of the series being applied.
>
> I had originally thought of this only as rerunning the tree as-is,
> though if the community wants the 2nd option to be available that works
> too. It does increase the scope of the task and complexity of the
> request text format to be understood for submitters. Personally, where I
> fall is that there isn't enough additional value to justify doing more
> than rerunning as-is. Plus, if we do end up with a bad tree, submitters
> can resubmit their patches when the tree is in a good state again,
> right? Or alternatively, lab managers may be aware of this situation and
> be able to order a rerun for the last x days (duration of bad tree) on
> the most up to date tree.

Re: submitters resubmitting, that is currently one way to get an
automatic "retest" right?  The thought is to prevent cluttering up even
more series in patchwork, but I can see that it might not make sense
just yet.
It's probably fine to start with the retest "as-is," and add the
feature on later.

> On Mon, Jun 12, 2023 at 11:01 AM Aaron Conole wrote:
>
> Patrick Robb writes:
>
> > On Wed, Jun 7, 2023 at 8:53 AM Aaron Conole wrote:
> >
> > Patrick Robb writes:
> >
> > > Also it can be useful to run daily sub-tree testing by request, if
> > > possible.
> > >
> > > That wouldn't be too difficult. I'll make a ticket for this.
> > > Although, for testing on the daily sub-trees, since that's UNH-IOL
> > > specific, that wouldn't necessarily have to be done via an email
> > > based testing request framework. We could also just add a button to
> > > our dashboard which triggers a sub-tree ci run. That would help keep
> > > the scope of the email based retesting framework narrow. But either
> > > email or a dashboard button would work.
> >
> > We had discussed this long ago - including agreeing on a format, IIRC.
> >
> > See the thread starting here:
> > https://mails.dpdk.org/archives/ci/2021-May/001189.html
> >
> > The idea was to have a line like:
> >
> > Recheck-request:
> >
> > I like using this simpler format which is easier to parse. As Thomas
> > pointed out, specifying labs does not really provide extra information
> > if we are already going to request by label/context, which already
> > specifies the lab.
>
> One thing we haven't discussed or determined is if we should have the
> ability to re-apply the series or simply to rerun the patches based on
> the original sha sum. There are two cases I can think of:
>
> 1. Ephemeral lab/network failures, or flaky unit tests that sometimes
>    fail. In this case, we probably just want to re-run the tree as-is.
>
> 2. Failing tree before apply. In this case, we have applied the series
>    to a tree, but the tree isn't in a good state and will fail
>    regardless of the series being applied.
>
> WDYT? Does (2) case warrant any consideration as a possible reason to
> retest?
> If so, what is the right way of handling that situation?
>
> > where was the tests in the check labels. In fact, what
> > started the discussion was a patch for the pw-ci scripts that
> > implemented part of it.
> >
> > I don't see how to make your proposal as easily parsed.
> >
> > WDYT? Can you re-read that thread and come up with comments?
> >
> > Will do. And thanks, this thread is very informative.
> >
> > It is important to use the 'msgid' field to distinguish recheck
> > requests. Otherwise, we will continuously reparse the same
> > recheck request and loop forever. Additionally, we've discussed using a
> > counter to limit the recheck requests to a single 'recheck' per test
> > name.
> >
> > We can track message ids to avoid considering a single retest request
> > twice. Perhaps we can accomplish the same thing by tracking retested
> > patchseries ids and their total number of requested retests (which
> > could be 1 retest per patchseries).
> >
> > +function check_series_needs_retest() {
> >
> > +    local pw_instance="$1"
> > +
> > +    series_get_active_branches "$pw_instance" | while IFS=\| read -r series_id project url repo branchname; do
> > +        local patch_comments_url=$(curl -s "$userpw" "$url" | jq -rc '.comments')
> > +        if [ "Xnull" != "X$patch_comments_url" ]; then
> > +            local comments_json=$(curl -s "$userpw" "$patch_comments_url")
> > +            local seq_end=$(echo "$comments_json" | jq -rc 'length')
> > +            if [ "$seq_end" -a $seq_end -gt 0 ]; then
> > +                seq_end=$((seq_end-1))
> > +                for comment_id in $(seq 0 $seq_end); do
> > +                    local recheck_requested=$(echo "$comments_json" | jq -rc ".[$comment_id].content" | grep "^Recheck-request: ")
> > +                    if [ "X$recheck_requested" != "X" ]; then
> > +                        local msgid=$(echo "$comments_json" | jq -rc ".[$comment_id].msgid")
> > +                        run_recheck "$pw_instance" "$series_id" "$project" "$url" "$repo" "$branchname" "$recheck_requested" "$msgid"
> > +                    fi
> > +                done
> > +            fi
> > +        fi
> > +    done
> > +}
> >
> > This is already a superior approach to what I had in mind for
> > acquiring comments. Unless you're opposed, I think at the community
> > lab we can experiment based on this starting point to verify the
> > process is sound, but I don't see any problems here.
> >
> > I think that if we're able to specify multiple contexts, then there's
> > not really any reason to run multiple rechecks per patchset.
> >
> > Agreed.
> >
> > > > There was also an ask on filtering requesters (only maintainers and
> > > > patch authors can ask for a recheck).
> >
> > If we can use the maintainers file as a single source of truth, that
> > is convenient and stable as the list of maintainers changes. But I
> > also think retesting request permission should be extended to the
> > submitter too. They may want to initiate a re-run without engaging a
> > maintainer. It's not likely to cause a big increase in test load for
> > us or other labs, so there's no harm there.
> >
> > > > No, an explicit list is actually better.
> > > > When a new check is added, for someone looking at the mails (maybe 2/3
> > > > weeks later), and reading just "all", he would have to know what
> > > > checks were available at the time.
> >
> > Context/Labels rarely change, so I don't think this concern is too
> > serious. But, if people don't mind comma separating an entire list of
> > contexts, that's fine.
> >
> > Thanks,
> > Patrick
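
For illustration only, here is a minimal sketch of two pieces discussed
in the quoted thread: splitting a "Recheck-request:" line into its
comma-separated contexts, and remembering each 'msgid' so that the same
request is not handled twice. The helper names (parse_recheck_contexts,
claim_msgid) and the SEEN_FILE state file are assumptions made for this
sketch; they are not part of the pw-ci patch quoted above, and a real
implementation would probably keep this state in the series database
instead of a flat file.

```shell
#!/bin/sh
# Hypothetical state file holding one already-handled msgid per line.
SEEN_FILE="${SEEN_FILE:-$(mktemp)}"

# Print each context named on a "Recheck-request: a, b" line, one per line.
# Returns non-zero if the line does not start with the expected prefix.
parse_recheck_contexts() {
    case "$1" in
        "Recheck-request: "*) ;;
        *) return 1 ;;
    esac
    # Split on commas, trim surrounding whitespace, drop empty entries.
    printf '%s\n' "${1#Recheck-request: }" | tr ',' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Succeed (and record the msgid) only the first time a msgid is seen.
claim_msgid() {
    # -x: whole-line match, -F: fixed string, so msgids are compared literally.
    if grep -qxF -e "$1" "$SEEN_FILE" 2>/dev/null; then
        return 1
    fi
    printf '%s\n' "$1" >> "$SEEN_FILE"
}
```

A caller could guard the recheck with something like
`claim_msgid "$msgid" && parse_recheck_contexts "$recheck_requested"`,
so a comment that has already been processed is skipped on the next
polling pass. The per-test retry counter mentioned in the thread is not
covered here.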