DPDK CI discussions
* [dpdk-ci] UNH CI failing
@ 2021-05-19  7:36 Thomas Monjalon
  2021-05-19 13:05 ` Aaron Conole
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Monjalon @ 2021-05-19  7:36 UTC (permalink / raw)
  To: ci, dpdklab; +Cc: aconole

It seems the IOL CI is failing today:

https://patches.dpdk.org/project/dpdk/patch/1621406749-15536-1-git-send-email-changpeng.liu@intel.com/
https://patches.dpdk.org/project/dpdk/patch/20210519032745.707639-1-stevex.yang@intel.com/

That's especially embarrassing for closing the release.




* Re: [dpdk-ci] UNH CI failing
  2021-05-19  7:36 [dpdk-ci] UNH CI failing Thomas Monjalon
@ 2021-05-19 13:05 ` Aaron Conole
  2021-05-19 13:21   ` Lincoln Lavoie
  0 siblings, 1 reply; 5+ messages in thread
From: Aaron Conole @ 2021-05-19 13:05 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: ci, dpdklab

Thomas Monjalon <thomas@monjalon.net> writes:

> It seems the IOL CI is failing today:
>
> https://patches.dpdk.org/project/dpdk/patch/1621406749-15536-1-git-send-email-changpeng.liu@intel.com/
> https://patches.dpdk.org/project/dpdk/patch/20210519032745.707639-1-stevex.yang@intel.com/
>
> That's especially embarrassing for closing the release.

I don't see any useful logs in the failures.

What changed?



* Re: [dpdk-ci] UNH CI failing
  2021-05-19 13:05 ` Aaron Conole
@ 2021-05-19 13:21   ` Lincoln Lavoie
  2021-05-19 15:54     ` Aaron Conole
  0 siblings, 1 reply; 5+ messages in thread
From: Lincoln Lavoie @ 2021-05-19 13:21 UTC (permalink / raw)
  To: Aaron Conole; +Cc: Thomas Monjalon, ci, dpdklab


As far as I can tell, it looks like one of the clocks fell out of sync on
the container runners, which caused the builds to fail.

Also, as far as I can tell from an initial look, it impacted the two patches
Thomas cited.  Two patches that are running now (i.e. they don't have a
full set of results yet) look like they are running fine.  So, it was a
transient issue.  Obviously we need to track down its root cause. I suspect
something happened with NTP, which should be keeping the runners and bare
metal systems synced.  I'm looking into that now.

For the patches with the failed jobs, we will queue those for rerun today.

Cheers,
Lincoln




-- 
*Lincoln Lavoie*
Principal Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
lylavoie@iol.unh.edu
https://www.iol.unh.edu
+1-603-674-2755 (m)
<https://www.iol.unh.edu>


* Re: [dpdk-ci] UNH CI failing
  2021-05-19 13:21   ` Lincoln Lavoie
@ 2021-05-19 15:54     ` Aaron Conole
  2021-05-19 16:06       ` Lincoln Lavoie
  0 siblings, 1 reply; 5+ messages in thread
From: Aaron Conole @ 2021-05-19 15:54 UTC (permalink / raw)
  To: Lincoln Lavoie; +Cc: Thomas Monjalon, ci, dpdklab, David Marchand

Lincoln Lavoie <lylavoie@iol.unh.edu> writes:

> As far as I can tell, it looks like one of the clocks fell out of sync on the container runners, which caused the
> builds to fail. 

I think that's also been causing some failures with the alarm_test and
cycles_test unit tests.  If the time source is making adjustments to
time, we will probably fail these tests as well.
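
To illustrate the concern above, here is a minimal Python sketch of how a
clock step can break an interval check. It does not reproduce what
alarm_test or cycles_test actually do, and I have not checked which clock
those tests read. The function names, the 1.0 s interval, the 0.1 s
tolerance and the 0.5 s step are all made up for the example.

```python
import time


def timer_fired_on_time(start, end, expected, tolerance):
    """Check an interval the way a timing test might: is it close to expected?"""
    return abs((end - start) - expected) <= tolerance


def measure(work, clock=time.monotonic):
    """Time `work()` with a clock that is not affected by wall-clock steps."""
    start = clock()
    work()
    return clock() - start


# Hypothetical readings for a 1.0 s timer on a steady clock.
steady_start, steady_end = 100.0, 101.0

# Same timer, but the wall clock was stepped back 0.5 s while waiting
# (an assumed adjustment, not a value taken from the lab).
stepped_start, stepped_end = 100.0, 101.0 - 0.5

steady_ok = timer_fired_on_time(steady_start, steady_end, 1.0, 0.1)
stepped_ok = timer_fired_on_time(stepped_start, stepped_end, 1.0, 0.1)

print("steady clock passes:", steady_ok)
print("stepped clock passes:", stepped_ok)
print("monotonic measurement of a no-op:", measure(lambda: None))
```

The timer itself behaved identically in both cases; only the readings
differ. That is why a check based on a steppable clock can fail on a
runner whose time is being corrected, while a monotonic reading, as in
measure(), would not see the step.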




* Re: [dpdk-ci] UNH CI failing
  2021-05-19 15:54     ` Aaron Conole
@ 2021-05-19 16:06       ` Lincoln Lavoie
  0 siblings, 0 replies; 5+ messages in thread
From: Lincoln Lavoie @ 2021-05-19 16:06 UTC (permalink / raw)
  To: Aaron Conole; +Cc: Thomas Monjalon, ci, dpdklab, David Marchand


I'm continuing to hunt. I did confirm all of those systems are syncing time
from the same master, using chronyd, and look to be configured correctly.
I'm not sure what would cause the time to jump like that, unless maybe the
"master", which is our IPA server, synced its time and that caused some
sort of ripple to the other downstream systems.
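
As a rough sketch of how the chronyd check above could be scripted across
systems, the Python below parses the "System time" line printed by
`chronyc tracking` and returns a signed offset. The sample text, the
ipa.example.test host name and the 0.1 s threshold are invented for
illustration and are not taken from the lab; the sign convention
(negative means the system clock is slow) is my own choice.

```python
import re
import subprocess

# Matches e.g. "System time     : 0.000006523 seconds slow of NTP time"
SYSTEM_TIME_RE = re.compile(
    r"^System time\s*:\s*([0-9.]+) seconds (fast|slow) of NTP time",
    re.MULTILINE,
)


def system_time_offset(tracking_text):
    """Return the offset in seconds: positive if fast, negative if slow."""
    match = SYSTEM_TIME_RE.search(tracking_text)
    if match is None:
        raise ValueError("no 'System time' line found")
    seconds = float(match.group(1))
    return seconds if match.group(2) == "fast" else -seconds


def read_tracking():
    """Run `chronyc tracking` locally and return its output (needs chrony)."""
    result = subprocess.run(
        ["chronyc", "tracking"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Made-up sample output, trimmed to a few fields.
SAMPLE = """\
Reference ID    : C0000201 (ipa.example.test)
Stratum         : 3
System time     : 0.250000000 seconds slow of NTP time
Last offset     : -0.000012345 seconds
"""

MAX_OFFSET = 0.1  # hypothetical alert threshold, in seconds

offset = system_time_offset(SAMPLE)
print(f"offset {offset:+.3f} s, within limit: {abs(offset) <= MAX_OFFSET}")
```

On a real runner, system_time_offset(read_tracking()) would take the
place of the sample text. Logging that value periodically on each system
might show whether a jump starts at the master or only on some of the
downstream systems.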

The patches have been rerun, with most results coming in,
https://lab.dpdk.org/results/dashboard/patchsets/17130/, and are passing
without issue.

The only failures I see in recent patches are on "bugfix for Kunpeng SVE
compile" (https://lab.dpdk.org/results/dashboard/patchsets/17135/), with
unit tests failing on two OSes.

There is also the intermittent Dynamic Config failure.  We think we have
tracked this down to a patch we sent to DTS, which was merged but then
reverted a few days ago (
https://git.dpdk.org/tools/dts/commit/?id=90f460df240b3020191916b15705abe208a14694).
I've asked Lijuan why it was reverted.

Cheers,
Lincoln


