From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D44ABA0C5C; Fri, 19 Nov 2021 18:25:57 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 6C41A40143; Fri, 19 Nov 2021 18:25:57 +0100 (CET) Received: from mail-ed1-f47.google.com (mail-ed1-f47.google.com [209.85.208.47]) by mails.dpdk.org (Postfix) with ESMTP id B4F0A40140 for ; Fri, 19 Nov 2021 18:25:56 +0100 (CET) Received: by mail-ed1-f47.google.com with SMTP id e3so45624862edu.4 for ; Fri, 19 Nov 2021 09:25:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=iol.unh.edu; s=unh-iol; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=FVe8l8ht35IG6xA1RhBSgEhCqqY7/37a3cnRhEJX0S8=; b=VvLTMLGqcvjUZ2n4xSJaivmHDHNbd90bYmSFV4imWnZUkmmjDk1vVndaLl94GGGUrW 2Ck6uVkTtWgvaP/SW98Bqz4zzfEvM+IqSmBtXIm9YiM+Gf1MY/INU4VKQGX20AWPyzom ffsKh69QjMLXeqh6umkf0HQAq1TruTO4xj7wg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=FVe8l8ht35IG6xA1RhBSgEhCqqY7/37a3cnRhEJX0S8=; b=ryTm+U2tiiEer6W74ytk2FQkVY8L9gm77iW4SsfWwghWOqs5smehXqqrcrT1yxjjxo ZD3hNbB7HcShQ88j7K5LS5578SQ5nJG9VuNYTaNb/7dkt5vi0xPJSfBqIIiThxeSVe/O ztNgNUsQomRWsLfYWv2B4VT72xgdiKNBIu7mYRW6NkMfkHJNx7L6qkjdm8JCqjtH5C7G hFipAMZFEJRysF13blQGsb4jk/2t+axkDdmbTpdyEau2XTZH1cqQXRbYShZIfsZP8vmn 7UfBqkgC9GZs8BClM/gVU0RLmKtOY3FoK3ABm1oo54G1h0hKMLlwve24TYtOqRUL2+z8 j29A== X-Gm-Message-State: AOAM531oWEMZbqk/WwuTwkbS6yYuXxvqoLQgO53Mjl8zZeeztjdfw0th 3YlTAnfG/DeqG2N4tKa3hm8Qy+L1H+yiSgBtYj/7DQ== X-Google-Smtp-Source: ABdhPJzhRxbkOjvm7C4/Cwk2Xy3KdCRttEOdnNbQs0ozr7BvV3JnU2kEkVVHsH0kgPy3BWDs6gNF94dljEKvLWCM290= X-Received: by 2002:a17:907:7e88:: with SMTP id qb8mr9668892ejc.535.1637342756258; Fri, 19 Nov 2021 09:25:56 -0800 (PST) MIME-Version: 1.0 References: 
<1928246.j4tpOohVRJ@thomas> In-Reply-To: From: Lincoln Lavoie Date: Fri, 19 Nov 2021 12:25:44 -0500 Message-ID: Subject: Re: [dpdk-dev] [Bug 826] red_autotest random failures To: "Dumitrescu, Cristian" Cc: Thomas Monjalon , David Marchand , "Ajmera, Megha" , "Singh, Jasvinder" , "Liguzinski, WojciechX" , dev , Aaron Conole , "Yigit, Ferruh" , "ci@dpdk.org" , "Zegota, AnnaX" Content-Type: multipart/alternative; boundary="0000000000005cdd1d05d127917f" X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org

--0000000000005cdd1d05d127917f
Content-Type: text/plain; charset="UTF-8"

Hi All,

I'm not sure if it will help, but this is an example of a failing case in
the CI: https://lab.dpdk.org/results/dashboard/patchsets/20222/

The test is running within a Docker container. CI is set up to only allow
one active unit test at a time, so the host might be running compile jobs,
but not other unit tests. This ensures there isn't "competition" for
resources like hugepages between two running unit test jobs. The host is
actually a VM running on VMware vCenter, not a bare-metal host; the VM's
sole purpose is running the Docker jobs.

The command to start the unit test run is pretty generic (script is below).

#!/bin/bash

####################################################
# $1 argument: extra arguments to send to meson test
####################################################

# Exit on first command failure
set -e

# Extract dpdk.tar.gz
tar xzfm dpdk.tar.gz

# Compile DPDK
cd dpdk
meson build --werror
ninja -C build install

# Unit test
cd build
meson test --suite fast-tests -t 60 $1

I think a starting point is to understand if the unit test expects or makes
assumptions about the system / environment: for example, whether it has
sole access to a CPU core, a minimum number of hugepages, etc. If it would
help, I can also give you the Dockerfile to build the container (note the
RHEL images have to be built on a licensed Red Hat server so that the
required packages can be installed).

Cheers,
Lincoln

On Fri, Nov 19, 2021 at 11:54 AM Dumitrescu, Cristian <cristian.dumitrescu@intel.com> wrote:

>
>
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Friday, November 19, 2021 7:26 AM
> > To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; David Marchand
> > <david.marchand@redhat.com>; Lincoln Lavoie <lylavoie@iol.unh.edu>;
> > Ajmera, Megha <megha.ajmera@intel.com>; Singh, Jasvinder
> > <jasvinder.singh@intel.com>; Liguzinski, WojciechX
> > <wojciechx.liguzinski@intel.com>
> > Cc: dev <dev@dpdk.org>; Aaron Conole <aconole@redhat.com>; Yigit,
> > Ferruh <ferruh.yigit@intel.com>; ci@dpdk.org; Zegota, AnnaX
> > <annax.zegota@intel.com>
> > Subject: Re: [dpdk-dev] [Bug 826] red_autotest random failures
> >
> > 18/11/2021 23:10, Liguzinski, WojciechX:
> > > Hi,
> > >
> > > I was trying to reproduce this test failure, but for me RED tests are
> > > passing.
> > > I was running the exact test command like the one described in Bug 826 -
> > > 'red_autotest' on the current main branch.
> >
> > The test is not always failing.
> > There are some failing conditions, please find them.
> > I think you should try in a container with more limited resources.
> >
>
> Hi Thomas,
>
> This is not a fair request IMO. We want to avoid wasting everybody's time,
> including Wojciech's time. Can the bug originator provide the details on
> the setup to reproduce the failure, please? Thank you!
>
> On a different point, we should probably tweak our autotests to
> differentiate between logical failures and those failures related to
> resources not being available, and flag the test result accordingly in the
> report. For example, if memory allocation fails, the test should be flagged
> as "Not enough resources" instead of simply "Failed". In the first case,
> the next step should be fixing the test setup, while in the second case the
> next step should be fixing the code. What do people think on this?
>
> Regards,
> Cristian
>

-- 
*Lincoln Lavoie*
Principal Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
lylavoie@iol.unh.edu
https://www.iol.unh.edu
+1-603-674-2755 (m)

--0000000000005cdd1d05d127917f
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi All,

I'm not sure if it will help, but this is an example of a failing case in the CI: https://lab.dpdk.org/results/dashboard/patchsets/20222/

The test is running within a Docker container. CI is set up to only allow one active unit test at a time, so the host might be running compile jobs, but not other unit tests. This ensures there isn't "competition" for resources like hugepages between two running unit test jobs. The host is actually a VM running on VMware vCenter, not a bare-metal host; the VM's sole purpose is running the Docker jobs.

The command to start the unit test run is pretty generic (script is below).

#!/bin/bash

####################################################
# $1 argument: extra arguments to send to meson test
####################################################

# Exit on first command failure
set -e

# Extract dpdk.tar.gz
tar xzfm dpdk.tar.gz

# Compile DPDK
cd dpdk
meson build --werror
ninja -C build install

# Unit test
cd build
meson test --suite fast-tests -t 60 $1

I think a starting point is to understand if the unit test expects or makes assumptions about the system / environment: for example, whether it has sole access to a CPU core, a minimum number of hugepages, etc. If it would help, I can also give you the Dockerfile to build the container (note the RHEL images have to be built on a licensed Red Hat server so that the required packages can be installed).
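
One way to make such assumptions visible would be a small pre-flight check run before `meson test`. The following is only a sketch under stated assumptions: it reads the standard Linux `HugePages_Free` field from `/proc/meminfo` and uses `nproc`, the function names `free_hugepages` and `check_env` are made up for illustration, and the thresholds passed in are hypothetical rather than actual requirements of red_autotest. Note that `nproc` reports the CPUs usable by the process, which says nothing about whether a core is exclusively ours.

```shell
#!/bin/bash
# Pre-flight sketch: report resources a unit test might silently depend on.
# Function names and thresholds are hypothetical, not DPDK requirements.

# Print the number of free hugepages listed in a meminfo-style file.
free_hugepages() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^HugePages_Free:/ { print $2; found = 1 }
         END { if (!found) print 0 }' "$meminfo"
}

# Return 0 if at least $1 hugepages are free and at least $2 CPUs are usable.
check_env() {
    local min_pages="$1" min_cpus="$2" meminfo="${3:-/proc/meminfo}"
    local pages cpus
    pages=$(free_hugepages "$meminfo")
    cpus=$(nproc)
    echo "free hugepages: $pages (want >= $min_pages), cpus: $cpus (want >= $min_cpus)"
    [ "$pages" -ge "$min_pages" ] && [ "$cpus" -ge "$min_cpus" ]
}
```

In the CI script this could be called as `check_env 64 2 || echo "environment below assumed minimum"` just before the `meson test` line, so the log shows what the container actually had when a run fails.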

Cheers,
Lincoln


On Fri, Nov 19, 2021 at 11:54 AM Dumitrescu, Cristian <cristian.dumitrescu@intel.com> wrote:


> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, November 19, 2021 7:26 AM
> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; David Marchand
> <david.marchand@redhat.com>; Lincoln Lavoie <lylavoie@iol.unh.edu>;
> Ajmera, Megha <megha.ajmera@intel.com>; Singh, Jasvinder
> <jasvinder.singh@intel.com>; Liguzinski, WojciechX
> <wojciechx.liguzinski@intel.com>
> Cc: dev <dev@dpdk.org>; Aaron Conole <aconole@redhat.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; ci@dpdk.org; Zegota, AnnaX
> <annax.zegota@intel.com>
> Subject: Re: [dpdk-dev] [Bug 826] red_autotest random failures
>
> 18/11/2021 23:10, Liguzinski, WojciechX:
> > Hi,
> >
> > I was trying to reproduce this test failure, but for me RED tests are passing.
> > I was running the exact test command like the one described in Bug 826 -
> > 'red_autotest' on the current main branch.
>
> The test is not always failing.
> There are some failing conditions, please find them.
> I think you should try in a container with more limited resources.
>

Hi Thomas,

This is not a fair request IMO. We want to avoid wasting everybody's time, including Wojciech's time. Can the bug originator provide the details on the setup to reproduce the failure, please? Thank you!

On a different point, we should probably tweak our autotests to differentiate between logical failures and those failures related to resources not being available, and flag the test result accordingly in the report. For example, if memory allocation fails, the test should be flagged as "Not enough resources" instead of simply "Failed". In the first case, the next step should be fixing the test setup, while in the second case the next step should be fixing the code. What do people think on this?

Regards,
Cristian
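
As a rough illustration of the split Cristian proposes above, a wrapper could classify a finished test from its exit status and log. This is a sketch only, not how the DPDK autotests work today: the function name `classify_result` and the log patterns are hypothetical examples rather than actual DPDK messages. The one external convention relied on is that Meson reports a test exiting with status 77 as skipped.

```shell
#!/bin/bash
# Sketch: separate "not enough resources" from logical failures.
# The log patterns below are hypothetical, not actual DPDK messages.

SKIP_CODE=77   # Meson reports a test exiting with 77 as skipped

# classify_result STATUS LOGFILE
# Prints PASSED, NOT_ENOUGH_RESOURCES or FAILED.
classify_result() {
    local status="$1" log="$2"
    if [ "$status" -eq 0 ]; then
        echo "PASSED"
    elif [ "$status" -eq "$SKIP_CODE" ] ||
         grep -Eqi 'cannot allocate|out of memory|no free hugepages' "$log"; then
        echo "NOT_ENOUGH_RESOURCES"
    else
        echo "FAILED"
    fi
}
```

A test that detects the shortage itself and exits with 77 is the cleaner variant, since Meson then flags it without any log scraping; the `grep` fallback is only there for tests that cannot be changed right away.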


--
Lincoln Lavoie
Principal Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
+1-603-674-2755 (m)

--0000000000005cdd1d05d127917f--