From: Lincoln Lavoie
Date: Tue, 26 May 2020 16:27:10 -0400
To: Thomas Monjalon
Cc: ci@dpdk.org, James Hendergart
Subject: Re: [dpdk-ci] CI reliability
In-Reply-To: <5232496.1u8oYCttyy@thomas>
List-Id: DPDK CI discussions

Hi Thomas,

This has been fixed as of yesterday. The failure was caused by a commit to
the SPDK repos that changed how they pull in their dependencies, which was
done in a way that is not compatible with docker. The team created a
workaround so that case is fixed, but there is always a risk that other
commits of that type could cause a failure in the containers.

I asked Brandon to change the scripts that run the testing in the containers
to try to catch failures from docker separately, so they can be flagged as
infrastructure failures, as distinct from failures of the build.

I'm also very surprised this was not raised during the CI meeting, or by
anyone else. I'm wondering if this is caused by the actual error logs being
a little abstracted from the emails, i.e. they are a link and a zip file away
from the actual email text, so maybe folks are not really looking into the
output as closely as they should be. Is this something we can make better by
including more detail in the email text, so issues are caught more quickly?

Cheers,
Lincoln

On Sun, May 24, 2020 at 5:50 AM Thomas Monjalon <thomas@monjalon.net> wrote:

> Hi all,
>
> I think we have a CI reliability issue in general.
> Perhaps we lack some alert mechanism warning test platform maintainers
> when too many tests are failing.
>
> Recent example: the community lab compilation test has been failing on
> Fedora 31 for at least 2 weeks, and I don't see any action to fix it:
>         https://lab.dpdk.org/results/dashboard/patchsets/11040/
>
> Because of such recurring errors, the whole CI becomes irrelevant.
> Please, we need to take action to avoid such issues in the near future.

--
Lincoln Lavoie
Senior Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
lylavoie@iol.unh.edu
https://www.iol.unh.edu
+1-603-674-2755 (m)
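
One possible shape for the script change mentioned above (flagging docker
failures as infrastructure rather than build failures) is to run the
container setup and the build as two separate stages and classify the result
by which stage failed. This is only a minimal sketch under assumptions: the
names Result, run_stage and classify are hypothetical, the commands are
placeholders, and nothing here reflects the lab's actual scripts.

```python
import subprocess
from enum import Enum


class Result(Enum):
    PASS = "pass"
    BUILD_FAILURE = "build failure"
    INFRA_FAILURE = "infrastructure failure"


def run_stage(cmd):
    """Run one command; return True on exit status 0, False otherwise."""
    try:
        return subprocess.run(cmd, capture_output=True).returncode == 0
    except OSError:
        # The executable itself could not be started (e.g. not installed).
        return False


def classify(setup_cmd, build_cmd):
    """Classify a test run by the stage that failed.

    setup_cmd: hypothetical command that prepares the container
               (image build, dependency fetch); a failure here is
               reported as infrastructure.
    build_cmd: hypothetical command that compiles the patch inside the
               container; a failure here is reported against the build.
    """
    if not run_stage(setup_cmd):
        return Result.INFRA_FAILURE
    if not run_stage(build_cmd):
        return Result.BUILD_FAILURE
    return Result.PASS
```

The build stage only runs once the setup stage has succeeded, so a broken
dependency fetch never shows up as a compilation failure against a patch.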