From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ci-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 32CC8A0524
	for <public@inbox.dpdk.org>; Tue, 13 Apr 2021 15:50:22 +0200 (CEST)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 27587161000;
	Tue, 13 Apr 2021 15:50:22 +0200 (CEST)
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by mails.dpdk.org (Postfix) with ESMTP id 1C07D1608EA
 for <ci@dpdk.org>; Tue, 13 Apr 2021 15:50:20 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1618321819;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type;
 bh=GuE4xEBycCi2QXVg48+DQBj087SCdCJX9ZvVfavxBNk=;
 b=auSjq6SdKu2tz+cusN25Xx/sa1m3K2tiuToK/y/JnlR+JjgQr5LSvZ4iS/ZaoTpBaUuzvs
 CWIOqaE7E05AwCrZnnC+ewBrITAgXwJgPEGu+adv/Wtn5tWZO1BRDGA/GEqUKsAaRepWBZ
 dMkB3JLFDJiPIQNSAeK8wz2yUIL82b0=
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com
 [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-468-N9kV2RWSN6201rw4kOG77Q-1; Tue, 13 Apr 2021 09:50:18 -0400
X-MC-Unique: N9kV2RWSN6201rw4kOG77Q-1
Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com
 [10.5.11.23])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mimecast-mx01.redhat.com (Postfix) with ESMTPS id CB4D41008074;
 Tue, 13 Apr 2021 13:50:09 +0000 (UTC)
Received: from dhcp-25.97.bos.redhat.com (ovpn-115-147.rdu2.redhat.com
 [10.10.115.147])
 by smtp.corp.redhat.com (Postfix) with ESMTPS id 006F119D9F;
 Tue, 13 Apr 2021 13:50:08 +0000 (UTC)
From: Aaron Conole <aconole@redhat.com>
To: dev@dpdk.org, ci@dpdk.org
Cc: Michael Santana <msantana@redhat.com>,
 Lincoln Lavoie <lylavoie@iol.unh.edu>, dpdklab <dpdklab@iol.unh.edu>
Date: Tue, 13 Apr 2021 09:50:08 -0400
Message-ID: <f7teefefg4v.fsf@dhcp-25.97.bos.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23
Authentication-Results: relay.mimecast.com;
 auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=aconole@redhat.com
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain
Subject: [dpdk-ci] [RFC] Proposal for allowing rerun of tests
X-BeenThere: ci@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK CI discussions <ci.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/ci>,
 <mailto:ci-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/ci/>
List-Post: <mailto:ci@dpdk.org>
List-Help: <mailto:ci-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/ci>,
 <mailto:ci-request@dpdk.org?subject=subscribe>
Errors-To: ci-bounces@dpdk.org
Sender: "ci" <ci-bounces@dpdk.org>

Greetings,

During the various CI pipelines, sometimes a test setup or lab will
have an internal failure unrelated to the specific patch.  Perhaps
'master' branch (or the associated -next branch) is broken and we cannot
get a successful run anyway.  Perhaps a network outage occurs during
infrastructure setup.  Perhaps some other transient error clobbers the
setup.  In all of these cases the report to the mailing list flags the
patch as 'FAIL'.

It would be very helpful if maintainers had the ability to tell various
CI infrastructures to restart / rerun patch tests.  For now, this has to
be done by the individual managers of those labs.  For some labs, it
isn't possible.  For others, it's possible, but restarting a test case
is very time-consuming.  In all cases, a maintainer needs to spend time
communicating with a lab manager.  This could be made a bit nicer.

One proposal we (Michael and I) have toyed with for our lab is having
the infrastructure monitor patchwork comments for a restart flag, and
kick off based on that information.  Patchwork tracks all of the
comments for each patch / series so we could look at the series that
are still in a state for 'merging' (new, assigned, etc.) and check the
patch .comments API for new comments.  Getting the data from PW should
be pretty simple - but I think that knowing whether to kick off the
test might be more difficult.  We have concerns about which messages we
should accept (for example, can anyone ask for a series to be rerun?),
and we'll need to track which rerun messages we've already accepted.
The convention needs to be something we all can work with (e.g. a
single line in the email such as "/Re-check: [checkname]").
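
To make the filtering concern a bit more concrete, here is a rough
Python sketch of how a lab might decide which comments count as rerun
requests.  Everything in it is an assumption for illustration only: the
function name find_rerun_requests, the allow-list of maintainer
addresses, and the comment layout (dicts with "id", "submitter" and
"content" keys) are hypothetical, and the "Re-check:" pattern is just
the placeholder convention floated above, not an agreed format.  It
does no network access; fetching the comments from PW is left out.

```python
import re

# Hypothetical convention: a line containing only "Re-check: <checkname>",
# with an optional leading slash.  Quoted lines ("> Re-check: ...") do not
# match, so replies that quote a request should not trigger another rerun.
RECHECK_RE = re.compile(r"^/?Re-check:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def find_rerun_requests(comments, allowed_emails, seen_ids):
    """Return (comment_id, check_name) pairs for new, authorized requests.

    comments       -- iterable of dicts (assumed layout, see lead-in)
    allowed_emails -- set of lower-case addresses allowed to ask for reruns
    seen_ids       -- set of comment ids already processed; updated in place
    """
    requests = []
    for comment in comments:
        comment_id = comment["id"]
        if comment_id in seen_ids:
            continue  # handled on an earlier poll
        seen_ids.add(comment_id)
        if comment["submitter"]["email"].lower() not in allowed_emails:
            continue  # sender is not allowed to request reruns
        for match in RECHECK_RE.finditer(comment["content"]):
            requests.append((comment_id, match.group(1)))
    return requests
```

A lab would call this on each poll with a persistent seen_ids set, so
that a request is only acted on once.  Whether an allow-list is the
right policy at all is exactly the open question above.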

This is just a start to identify and explain the concern.  Maybe there
are other issues we've not considered, or maybe folks think this is a
terrible idea not worth spending any time developing.  I think there's
enough use for it that I am raising it here, and we can discuss it.

Thanks,
-Aaron