DPDK patches and discussions
* [dpdk-dev] [Bug 826] red_autotest random failures
@ 2021-10-08  7:23 bugzilla
  2021-11-12 13:51 ` David Marchand
  0 siblings, 1 reply; 16+ messages in thread
From: bugzilla @ 2021-10-08  7:23 UTC
  To: dev

https://bugs.dpdk.org/show_bug.cgi?id=826

            Bug ID: 826
           Summary: red_autotest random failures
           Product: DPDK
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: other
          Assignee: cristian.dumitrescu@intel.com
          Reporter: david.marchand@redhat.com
                CC: dev@dpdk.org, jasvinder.singh@intel.com
  Target Milestone: ---

A recent failure can be found at:
https://lab.dpdk.org/results/dashboard/patchsets/19223/

50/94 DPDK:fast-tests / red_autotest                   FAIL              0.86s 
 exit status 1
23:08:32 MALLOC_PERTURB_=116 DPDK_TEST=red_autotest /home-local/jenkins-local/jenkins-agent/workspace/Generic-Unit-Test-DPDK/dpdk/build/app/test/dpdk-test '-l 0-15' --file-prefix=red_autotest
----------------------------------- output -----------------------------------
stdout:
RTE>>red_autotest

--------------------------------------------------------------------------------
functional test 1 : use one rte_red configuration,
                    increase average queue size to various levels,
                    compare drop rate to drop probability

                avg queue size enqueued       dropped        drop prob %    drop rate %    diff %         tolerance %
                6              10000          0              0.0000         0.0000         0.0000         50.0000
                12             10000          0              0.0000         0.0000         0.0000         50.0000
                18             10000          0              0.0000         0.0000         0.0000         50.0000
                24             10000          0              0.0000         0.0000         0.0000         50.0000
                30             10000          0              0.0000         0.0000         0.0000         50.0000
                36             9954           46             0.4167         0.4600         0.0000         50.0000
                42             9901           99             1.0417         0.9900         0.0000         50.0000
                48             9831           169            1.6667         1.6900         0.0000         50.0000
                54             9761           239            2.2917         2.3900         0.0000         50.0000
                60             9701           299            2.9167         2.9900         0.0000         50.0000
                66             9618           382            3.5417         3.8200         0.0000         50.0000
                72             9581           419            4.1667         4.1900         0.0000         50.0000
                78             9490           510            4.7917         5.1000         0.0000         50.0000
                84             9460           540            5.4167         5.4000         0.0000         50.0000
                90             9398           602            6.0417         6.0200         0.0000         50.0000
                96             9371           629            6.6667         6.2900         0.0000         50.0000
                102            9259           741            7.2917         7.4100         0.0000         50.0000
                108            9211           789            7.9167         7.8900         0.0000         50.0000
                114            9133           867            8.5417         8.6700         0.0000         50.0000
                120            9058           942            9.1667         9.4200         0.0000         50.0000
                126            8987           1013           9.7917         10.1300        0.0000         50.0000
                132            0              10000          100.0000       100.0000       0.0000         50.0000
                138            0              10000          100.0000       100.0000       0.0000         50.0000
                144            0              10000          100.0000       100.0000       0.0000         50.0000
-------------------------------------<pass>-------------------------------------

--------------------------------------------------------------------------------
functional test 2 : use several RED configurations,
                    increase average queue size to just below maximum threshold,
                    compare drop rate to drop probability

RED config     avg queue size min threshold  max threshold  drop prob %    drop rate %    diff %         tolerance %
0              127            32             128            9.8958         9.9400         0.0000         50.0000
1              127            32             128            4.9479         4.8100         0.0000         50.0000
2              127            32             128            3.2986         2.6800         0.0000         50.0000
3              127            32             128            2.4740         1.6300         0.0000         50.0000
4              127            32             128            1.9792         1.2700         0.0000         50.0000
5              127            32             128            1.6493         1.0300         0.0000         50.0000
6              127            32             128            1.4137         0.8300         0.0000         50.0000
7              127            32             128            1.2370         0.7300         0.0000         50.0000
8              127            32             128            1.0995         0.6200         0.0000         50.0000
9              127            32             128            0.9896         0.5500         0.0000         50.0000
-------------------------------------<pass>-------------------------------------
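
For readers checking the "drop prob %" column: the printed values are consistent with the classic RED linear ramp between the minimum and maximum thresholds. The sketch below is not taken from the DPDK sources; the function name is hypothetical, and the per-configuration inverse mark probability (10, 20, ... 100) is an assumption inferred from the numbers in the table above.

```python
def red_drop_prob_percent(avg, min_th, max_th, maxp_inv):
    """Classic RED drop probability in percent.

    Below min_th nothing is dropped, at or above max_th everything is
    dropped, and in between the probability rises linearly up to
    1/maxp_inv.
    """
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 100.0
    return 100.0 * (avg - min_th) / (max_th - min_th) / maxp_inv


# Functional test 2: avg queue size 127, thresholds 32 and 128.
for config in range(10):
    maxp_inv = 10 * (config + 1)  # assumption inferred from the table
    prob = red_drop_prob_percent(127, 32, 128, maxp_inv)
    print(f"{config:<15}{prob:.4f}")
```

With maxp_inv fixed at 10, the same function also matches the functional test 1 column (for example 0.4167 at an average queue size of 36, and 100 at 132 and above).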

--------------------------------------------------------------------------------
functional test 3 : use one RED configuration,
                    increase average queue size to target level,
                    dequeue all packets until queue is empty,
                    confirm that average queue size is computed correctly while queue is empty

q avg before   q avg after    expected       difference %   tolerance %    result
1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass
1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass
1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass
1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass
1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass
-------------------------------------<pass>-------------------------------------

--------------------------------------------------------------------------------
functional test 5 : use several queues (each with its own run-time data),
                    use several RED configurations (such that each configuration is shared by multiple queues),
                    increase average queue size to just below maximum threshold,
                    compare drop rate to drop probability,
                    (this is a larger scale version of functional test 2)

queue          config         avg queue size min threshold  max threshold  drop prob %    drop rate %    diff %         tolerance %
0              0              127            32             128            9.8958         9.9700         0.0000         50.0000
1              0              127            32             128            9.8958         9.7200         0.0000         50.0000
2              1              127            32             128            4.9479         4.8900         0.0000         50.0000
3              1              127            32             128            4.9479         4.8200         0.0000         50.0000
-------------------------------------<pass>-------------------------------------

--------------------------------------------------------------------------------
functional test 6 : use several queues (each with its own run-time data),
                    use several RED configurations (such that each configuration is sharte_red by multiple queues),
                    increase average queue size to target level,
                    dequeue all packets until queue is empty,
                    confirm that average queue size is computed correctly while queue is empty
                    (this is a larger scale version of functional test 3)

queue          config         q avg before   q avg after    expected       difference %   tolerance %    result
0              0              1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass
1              0              1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass
2              1              1022.0000      1022.0000      1010.1483      1.1733         5.0000         pass
3              1              1022.0000      937.1660       1010.1483      7.2249         5.0000         fail
-------------------------------------<fail>-------------------------------------
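
The "difference %" column above matches a plain relative error of "q avg after" against "expected". The sketch below recomputes it from the printed values; the function name is hypothetical, and whether the real test compares with "<=" or "<" against the tolerance is an assumption (it makes no difference for these rows).

```python
def diff_percent(actual, expected):
    """Relative difference between actual and expected, in percent."""
    return abs(actual - expected) / expected * 100.0


TOLERANCE_PERCENT = 5.0

# (queue, q avg after, expected) copied from the functional test 6 table.
rows = [
    (0, 1022.0000, 1016.0627),
    (1, 1022.0000, 1016.0627),
    (2, 1022.0000, 1010.1483),
    (3, 937.1660, 1010.1483),
]

for queue, after, expected in rows:
    diff = diff_percent(after, expected)
    # Assumption: a row passes when the difference does not exceed the tolerance.
    result = "pass" if diff <= TOLERANCE_PERCENT else "fail"
    print(f"{queue:<15}{diff:<15.4f}{result}")
```

Only queue 3 falls outside the 5% tolerance: its average after the empty period came out at 937.1660 instead of staying near 1022, while the other three queues are identical to "q avg before".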

--------------------------------------------------------------------------------
overflow test 1 : use one RED configuration,
                  increase average queue size to target level,
                  check maximum number of bits requirte_red to represent avg_s

avg queue size  wq_log2  fraction bits  max queue avg  num bits  enqueued  dropped   drop prob %  drop rate %
1023            12       10             0xffc00000     32        0         941366    100.00       100.00
-------------------------------------<pass>-------------------------------------
[total: 6, pass: 5, fail: 1]
Test Failed
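
As a side note on the overflow test row: the "max queue avg" and "num bits" values are what one gets by storing the average queue size as a fixed-point number shifted left by wq_log2 plus the fraction bits. This is a reading of the printed columns, not a statement about how rte_red stores the value internally; the function name below is hypothetical.

```python
def scaled_queue_avg(avg_queue_size, wq_log2, fraction_bits):
    """Scale an average queue size into fixed-point form.

    Assumption: the scaling is a left shift by (wq_log2 + fraction_bits),
    which reproduces the value printed by overflow test 1.
    """
    return avg_queue_size << (wq_log2 + fraction_bits)


value = scaled_queue_avg(1023, 12, 10)
print(hex(value), value.bit_length())
```

For the row above this gives 0xffc00000, which needs exactly 32 bits, so the scaled average still fits in an unsigned 32-bit integer.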

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* Re: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-10-08  7:23 [dpdk-dev] [Bug 826] red_autotest random failures bugzilla
@ 2021-11-12 13:51 ` David Marchand
  2021-11-12 14:10   ` Lincoln Lavoie
  0 siblings, 1 reply; 16+ messages in thread
From: David Marchand @ 2021-11-12 13:51 UTC
  To: Cristian Dumitrescu; +Cc: dev, Aaron Conole, Thomas Monjalon, Yigit, Ferruh, ci

On Fri, Oct 8, 2021 at 9:24 AM <bugzilla@dpdk.org> wrote:
>
> https://bugs.dpdk.org/show_bug.cgi?id=826
>
>             Bug ID: 826
>            Summary: red_autotest random failures
>            Product: DPDK
>            Version: unspecified
>           Hardware: All
>                 OS: All
>             Status: UNCONFIRMED
>           Severity: normal
>           Priority: Normal
>          Component: other
>           Assignee: cristian.dumitrescu@intel.com
>           Reporter: david.marchand@redhat.com
>                 CC: dev@dpdk.org, jasvinder.singh@intel.com
>   Target Milestone: ---
>
> A recent failure can be found at:
> https://lab.dpdk.org/results/dashboard/patchsets/19223/
>
> 50/94 DPDK:fast-tests / red_autotest                   FAIL              0.86s
>  exit status 1


functional test 6 : use several queues (each with its own run-time data),
            use several RED configurations (such that each configuration is sharte_red by multiple queues),
            increase average queue size to target level,
            dequeue all packets until queue is empty,
            confirm that average queue size is computed correctly while queue is empty
            (this is a larger scale version of functional test 3)

queue          config         q avg before   q avg after    expected       difference %   tolerance %    result
0              0              1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass
1              0              1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass
2              1              1022.0000      1022.0000      1010.1483      1.1733         5.0000         pass
3              1              1022.0000      937.1660       1010.1483      7.2249         5.0000         fail
-------------------------------------<fail>-------------------------------------



This failure keeps popping up in the CI.
The bug report is one month old, with no reply.


I sent a proposal to remove red_autotest from the list executed by the CI.
https://patchwork.dpdk.org/project/dpdk/patch/20211027140458.2502-2-david.marchand@redhat.com/

It might be the best solution while waiting for an analysis.


-- 
David Marchand



* Re: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-12 13:51 ` David Marchand
@ 2021-11-12 14:10   ` Lincoln Lavoie
  2021-11-12 14:15     ` David Marchand
  0 siblings, 1 reply; 16+ messages in thread
From: Lincoln Lavoie @ 2021-11-12 14:10 UTC
  To: David Marchand
  Cc: Cristian Dumitrescu, dev, Aaron Conole, Thomas Monjalon, Yigit,
	Ferruh, ci


On Fri, Nov 12, 2021 at 8:52 AM David Marchand <david.marchand@redhat.com>
wrote:

> On Fri, Oct 8, 2021 at 9:24 AM <bugzilla@dpdk.org> wrote:
> >
> > https://bugs.dpdk.org/show_bug.cgi?id=826
> >
> >             Bug ID: 826
> >            Summary: red_autotest random failures
> >            Product: DPDK
> >            Version: unspecified
> >           Hardware: All
> >                 OS: All
> >             Status: UNCONFIRMED
> >           Severity: normal
> >           Priority: Normal
> >          Component: other
> >           Assignee: cristian.dumitrescu@intel.com
> >           Reporter: david.marchand@redhat.com
> >                 CC: dev@dpdk.org, jasvinder.singh@intel.com
> >   Target Milestone: ---
> >
> > A recent failure can be found at:
> > https://lab.dpdk.org/results/dashboard/patchsets/19223/
> >
> > 50/94 DPDK:fast-tests / red_autotest                   FAIL
> 0.86s
> >  exit status 1
>
>
> functional test 6 : use several queues (each with its own run-time data),
>             use several RED configurations (such that each configuration is sharte_red by multiple queues),
>             increase average queue size to target level,
>             dequeue all packets until queue is empty,
>             confirm that average queue size is computed correctly while queue is empty
>             (this is a larger scale version of functional test 3)
>
> queue          config         q avg before   q avg after    expected       difference %   tolerance %    result
> 0              0              1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass
> 1              0              1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass
> 2              1              1022.0000      1022.0000      1010.1483      1.1733         5.0000         pass
> 3              1              1022.0000      937.1660       1010.1483      7.2249         5.0000         fail
>
> -------------------------------------<fail>-------------------------------------
>
>
>
> This failure keeps on popping in the CI.
> The bug report is one month old, with no reply.
>
>
> I sent a proposal of removing red_autotest from the list executed by the
> CI.
>
> https://patchwork.dpdk.org/project/dpdk/patch/20211027140458.2502-2-david.marchand@redhat.com/
>
> It might be the best solution waiting for an analysis.
>
>
> --
> David Marchand
>
>
Hi David,

My understanding is that removing the test would require removing it from
the DPDK unit tests; we are just running the fast-tests suite for the unit
tests.  DPDK's unit test structure / framework does not allow removing or
customizing the set of tests beyond the suites.

In the lab, Brandon has been trying different configurations for running
the tests within the containers, along the lines of the CPU pinning
requirements that might be assumed by the unit tests. So far, everything
he has tried has still produced similar failures / issues. We are still
looking into it, so the bug is not sitting without action; there is just
no final resolution yet.

Cheers,
Lincoln
-- 
*Lincoln Lavoie*
Principal Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
lylavoie@iol.unh.edu
https://www.iol.unh.edu
+1-603-674-2755 (m)



* Re: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-12 14:10   ` Lincoln Lavoie
@ 2021-11-12 14:15     ` David Marchand
  2021-11-15 11:51       ` Dumitrescu, Cristian
  0 siblings, 1 reply; 16+ messages in thread
From: David Marchand @ 2021-11-12 14:15 UTC
  To: Lincoln Lavoie
  Cc: Cristian Dumitrescu, dev, Aaron Conole, Thomas Monjalon, Yigit,
	Ferruh, ci

On Fri, Nov 12, 2021 at 3:11 PM Lincoln Lavoie <lylavoie@iol.unh.edu> wrote:
>> This failure keeps on popping in the CI.
>> The bug report is one month old, with no reply.
>>
>>
>> I sent a proposal of removing red_autotest from the list executed by the CI.
>> https://patchwork.dpdk.org/project/dpdk/patch/20211027140458.2502-2-david.marchand@redhat.com/
>>
>> It might be the best solution waiting for an analysis.
>>
>>
>> --
>> David Marchand
>>
>
> Hi David,
>
> My understanding is, removing the test would require removing it from the DPDK unit tests, we are just running the fast-tests suite for the unit tests.  DPDK's unit test structure / framework does not allow removing or customizing the suite of tests beyond the suites.

https://patchwork.dpdk.org/project/dpdk/patch/20211027140458.2502-2-david.marchand@redhat.com/


>
> In the lab, Brandon has been looking into and trying different configurations for running the tests within the containers along the lines of the CPU pinning requirements that might be assumed by the unit tests. So far, everything he has tried has still had the similar failures / issues.  We are still looking into it, so the bug is not sitting without action, just no final resolution.

The mail I sent was not a comment on the investigation on the UNH side.
The ask is for Cristian to have a look too.


-- 
David Marchand



* RE: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-12 14:15     ` David Marchand
@ 2021-11-15 11:51       ` Dumitrescu, Cristian
  2021-11-15 17:26         ` Liguzinski, WojciechX
  0 siblings, 1 reply; 16+ messages in thread
From: Dumitrescu, Cristian @ 2021-11-15 11:51 UTC
  To: David Marchand, Lincoln Lavoie, Liguzinski, WojciechX, Ajmera,
	Megha, Singh, Jasvinder
  Cc: dev, Aaron Conole, Thomas Monjalon, Yigit, Ferruh, ci



> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, November 12, 2021 2:16 PM
> To: Lincoln Lavoie <lylavoie@iol.unh.edu>
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; dev
> <dev@dpdk.org>; Aaron Conole <aconole@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> ci@dpdk.org
> Subject: Re: [dpdk-dev] [Bug 826] red_autotest random failures
> 
> On Fri, Nov 12, 2021 at 3:11 PM Lincoln Lavoie <lylavoie@iol.unh.edu> wrote:
> >> This failure keeps on popping in the CI.
> >> The bug report is one month old, with no reply.
> >>
> >>
> >> I sent a proposal of removing red_autotest from the list executed by the
> CI.
> >> https://patchwork.dpdk.org/project/dpdk/patch/20211027140458.2502-
> 2-david.marchand@redhat.com/
> >>
> >> It might be the best solution waiting for an analysis.
> >>
> >>
> >> --
> >> David Marchand
> >>
> >
> > Hi David,
> >
> > My understanding is, removing the test would require removing it from the
> DPDK unit tests, we are just running the fast-tests suite for the unit tests.
> DPDK's unit test structure / framework does not allow removing or
> customizing the suite of tests beyond the suites.
> 
> https://patchwork.dpdk.org/project/dpdk/patch/20211027140458.2502-2-
> david.marchand@redhat.com/
> 
> 
> >
> > In the lab, Brandon has been looking into and trying different
> configurations for running the tests within the containers along the lines of
> the CPU pinning requirements that might be assumed by the unit tests. So
> far, everything he has tried has still had the similar failures / issues.  We are
> still looking into it, so the bug is not sitting without action, just no final
> resolution.
> 
> The mail I sent was not a comment for the investigation on UNH side.
> The ask is for Cristian to have a look too.
> 
> 
> --
> David Marchand

Wojciech, Megha,

Are you able to take a look at why the RED autotest is failing, please?

Thanks,
Cristian




* RE: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-15 11:51       ` Dumitrescu, Cristian
@ 2021-11-15 17:26         ` Liguzinski, WojciechX
  2021-11-18 22:10           ` Liguzinski, WojciechX
  0 siblings, 1 reply; 16+ messages in thread
From: Liguzinski, WojciechX @ 2021-11-15 17:26 UTC
  To: Dumitrescu, Cristian, David Marchand, Lincoln Lavoie, Ajmera,
	Megha, Singh, Jasvinder
  Cc: dev, Aaron Conole, Thomas Monjalon, Yigit, Ferruh, ci

Hi,

Sure, I will have a look.

Best Regards,
Wojciech




* RE: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-15 17:26         ` Liguzinski, WojciechX
@ 2021-11-18 22:10           ` Liguzinski, WojciechX
  2021-11-19  7:26             ` Thomas Monjalon
  0 siblings, 1 reply; 16+ messages in thread
From: Liguzinski, WojciechX @ 2021-11-18 22:10 UTC
  To: Dumitrescu, Cristian, David Marchand, Lincoln Lavoie, Ajmera,
	Megha, Singh, Jasvinder
  Cc: dev, Aaron Conole, Thomas Monjalon, Yigit, Ferruh, ci, Zegota, AnnaX

Hi,

I tried to reproduce this test failure, but for me the RED tests are passing.
I ran the same test command as the one described in Bug 826 ('red_autotest') on the current main branch.
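
Since the failure is intermittent, a single passing run says little; repeating the same command many times and counting non-zero exit codes gives a rough failure rate. The helper below is a minimal sketch of that idea, not part of DPDK or its CI. The function name and the commented-out binary path are hypothetical; the DPDK_TEST variable and the command-line arguments are the ones shown in the bug report.

```python
import os
import subprocess


def count_failures(cmd, runs, extra_env=None):
    """Run cmd `runs` times and return how many runs exited non-zero."""
    env = dict(os.environ)
    env.update(extra_env or {})
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures


# Example invocation (hypothetical path, adjust to the local build tree):
# failed = count_failures(
#     ["./build/app/test/dpdk-test", "-l", "0-15", "--file-prefix=red_autotest"],
#     runs=100,
#     extra_env={"DPDK_TEST": "red_autotest"},
# )
# print(f"{failed} failures out of 100 runs")
```

Running this inside the same kind of container and CPU configuration as the CI is likely to matter more than the number of iterations, given that the lab sees the failure and a bare-metal run does not.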

Here is an example where DPDK is built without RTE_SCHED_CMAN enabled; with this flag set to true, the tests do not fail either.

root@silpixa00400629:~/wojtek/dpdk/build/app/test# ./dpdk-test '-l 0-15' --file-prefix=red_autotest
EAL: Detected CPU lcores: 96
EAL: Detected NUMA nodes: 2
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/red_autotest/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: VFIO support initialized
TELEMETRY: No legacy callbacks, legacy socket not created
APP: HPET is not enabled, using TSC as default timer
RTE>>red_autotest

--------------------------------------------------------------------------------
functional test 1 : use one rte_red configuration,
                    increase average queue size to various levels,
                    compare drop rate to drop probability

                avg queue size enqueued       dropped        drop prob %    drop rate %    diff %         tolerance %    
                6              10000          0              0.0000         0.0000         0.0000         50.0000        
                12             10000          0              0.0000         0.0000         0.0000         50.0000        
                18             10000          0              0.0000         0.0000         0.0000         50.0000        
                24             10000          0              0.0000         0.0000         0.0000         50.0000        
                30             10000          0              0.0000         0.0000         0.0000         50.0000        
                36             9961           39             0.4167         0.3900         0.0000         50.0000        
                42             9898           102            1.0417         1.0200         0.0000         50.0000        
                48             9835           165            1.6667         1.6500         0.0000         50.0000        
                54             9785           215            2.2917         2.1500         0.0000         50.0000        
                60             9703           297            2.9167         2.9700         0.0000         50.0000        
                66             9627           373            3.5417         3.7300         0.0000         50.0000        
                72             9580           420            4.1667         4.2000         0.0000         50.0000        
                78             9511           489            4.7917         4.8900         0.0000         50.0000        
                84             9462           538            5.4167         5.3800         0.0000         50.0000        
                90             9398           602            6.0417         6.0200         0.0000         50.0000        
                96             9366           634            6.6667         6.3400         0.0000         50.0000        
                102            9267           733            7.2917         7.3300         0.0000         50.0000        
                108            9212           788            7.9167         7.8800         0.0000         50.0000        
                114            9146           854            8.5417         8.5400         0.0000         50.0000        
                120            9102           898            9.1667         8.9800         0.0000         50.0000        
                126            8984           1016           9.7917         10.1600        0.0000         50.0000        
                132            0              10000          100.0000       100.0000       0.0000         50.0000        
                138            0              10000          100.0000       100.0000       0.0000         50.0000        
                144            0              10000          100.0000       100.0000       0.0000         50.0000        
-------------------------------------<pass>-------------------------------------

--------------------------------------------------------------------------------
functional test 2 : use several RED configurations,
                    increase average queue size to just below maximum threshold,
                    compare drop rate to drop probability

RED config     avg queue size min threshold  max threshold  drop prob %    drop rate %    diff %         tolerance %    
0              127            32             128            9.8958         10.0100        0.0000         50.0000        
1              127            32             128            4.9479         4.9700         0.0000         50.0000        
2              127            32             128            3.2986         2.6800         0.0000         50.0000        
3              127            32             128            2.4740         1.7000         0.0000         50.0000        
4              127            32             128            1.9792         1.2700         0.0000         50.0000        
5              127            32             128            1.6493         1.0500         0.0000         50.0000        
6              127            32             128            1.4137         0.8100         0.0000         50.0000        
7              127            32             128            1.2370         0.7100         0.0000         50.0000        
8              127            32             128            1.0995         0.6200         0.0000         50.0000        
9              127            32             128            0.9896         0.5600         0.0000         50.0000        
-------------------------------------<pass>-------------------------------------

--------------------------------------------------------------------------------
functional test 3 : use one RED configuration,
                    increase average queue size to target level,
                    dequeue all packets until queue is empty,
                    confirm that average queue size is computed correctly while queue is empty

q avg before   q avg after    expected       difference %   tolerance %    result        
1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass           
1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass           
1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass           
1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass           
1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass           
-------------------------------------<pass>-------------------------------------

--------------------------------------------------------------------------------
functional test 5 : use several queues (each with its own run-time data),
                    use several RED configurations (such that each configuration is shared by multiple queues),
                    increase average queue size to just below maximum threshold,
                    compare drop rate to drop probability,
                    (this is a larger scale version of functional test 2)

queue          config         avg queue size min threshold  max threshold  drop prob %    drop rate %    diff %         tolerance %    
0              0              127            32             128            9.8958         9.9200         0.0000         50.0000        
1              0              127            32             128            9.8958         9.9700         0.0000         50.0000        
2              1              127            32             128            4.9479         4.8600         0.0000         50.0000        
3              1              127            32             128            4.9479         4.9400         0.0000         50.0000        
-------------------------------------<pass>-------------------------------------

--------------------------------------------------------------------------------
functional test 6 : use several queues (each with its own run-time data),
                    use several RED configurations (such that each configuration is shared by multiple queues),
                    increase average queue size to target level,
                    dequeue all packets until queue is empty,
                    confirm that average queue size is computed correctly while queue is empty
                    (this is a larger scale version of functional test 3)

queue          config         q avg before   q avg after    expected       difference %   tolerance %    result  
0              0              1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass           
1              0              1022.0000      1022.0000      1016.0627      0.5843         5.0000         pass           
2              1              1022.0000      1022.0000      1010.1483      1.1733         5.0000         pass           
3              1              1022.0000      1022.0000      1010.1483      1.1733         5.0000         pass           
-------------------------------------<pass>-------------------------------------

--------------------------------------------------------------------------------
overflow test 1 : use one RED configuration,
                  increase average queue size to target level,
                  check maximum number of bits requirte_red to represent avg_s

avg queue size  wq_log2  fraction bits  max queue avg  num bits  enqueued  dropped   drop prob %  drop rate %  
1023            12       10             0xffc00000     32        0         941366    100.00       100.00       
-------------------------------------<pass>-------------------------------------
[total: 6, pass: 6]
Test OK
RTE>>quit


Kind Regards,
Wojtek


-----Original Message-----
From: Liguzinski, WojciechX 
Sent: Monday, November 15, 2021 6:27 PM
To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; David Marchand <david.marchand@redhat.com>; Lincoln Lavoie <lylavoie@iol.unh.edu>; Ajmera, Megha <megha.ajmera@intel.com>; Singh, Jasvinder <jasvinder.singh@intel.com>
Cc: dev <dev@dpdk.org>; Aaron Conole <aconole@redhat.com>; Thomas Monjalon <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>; ci@dpdk.org
Subject: RE: [dpdk-dev] [Bug 826] red_autotest random failures

Hi,

Sure, I will have a look.

Best Regards,
Wojciech


-----Original Message-----
From: Dumitrescu, Cristian <cristian.dumitrescu@intel.com> 
Sent: Monday, November 15, 2021 12:51 PM
To: David Marchand <david.marchand@redhat.com>; Lincoln Lavoie <lylavoie@iol.unh.edu>; Liguzinski, WojciechX <wojciechx.liguzinski@intel.com>; Ajmera, Megha <megha.ajmera@intel.com>; Singh, Jasvinder <jasvinder.singh@intel.com>
Cc: dev <dev@dpdk.org>; Aaron Conole <aconole@redhat.com>; Thomas Monjalon <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>; ci@dpdk.org
Subject: RE: [dpdk-dev] [Bug 826] red_autotest random failures



> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, November 12, 2021 2:16 PM
> To: Lincoln Lavoie <lylavoie@iol.unh.edu>
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; dev 
> <dev@dpdk.org>; Aaron Conole <aconole@redhat.com>; Thomas Monjalon 
> <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>; 
> ci@dpdk.org
> Subject: Re: [dpdk-dev] [Bug 826] red_autotest random failures
> 
> On Fri, Nov 12, 2021 at 3:11 PM Lincoln Lavoie <lylavoie@iol.unh.edu> wrote:
> >> This failure keeps on popping in the CI.
> >> The bug report is one month old, with no reply.
> >>
> >>
> >> I sent a proposal of removing red_autotest from the list executed
> >> by the CI.
> >> https://patchwork.dpdk.org/project/dpdk/patch/20211027140458.2502-2-david.marchand@redhat.com/
> >>
> >> It might be the best solution waiting for an analysis.
> >>
> >>
> >> --
> >> David Marchand
> >>
> >
> > Hi David,
> >
> > My understanding is, removing the test would require removing it
> > from the DPDK unit tests, we are just running the fast-tests suite
> > for the unit tests. DPDK's unit test structure / framework does not
> > allow removing or customizing the suite of tests beyond the suites.
> 
> https://patchwork.dpdk.org/project/dpdk/patch/20211027140458.2502-2-david.marchand@redhat.com/
> 
> 
> >
> > In the lab, Brandon has been looking into and trying different
> > configurations for running the tests within the containers along the
> > lines of the CPU pinning requirements that might be assumed by the
> > unit tests. So far, everything he has tried has still produced similar
> > failures / issues.  We are still looking into it, so the bug is not
> > sitting without action, there is just no final resolution yet.
> 
> The mail I sent was not a comment for the investigation on UNH side.
> The ask is for Cristian to have a look too.
> 
> 
> --
> David Marchand

Wojciech, Megha,

Are you able to take a look at why the RED autotest is failing, please?

Thanks,
Cristian




* Re: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-18 22:10           ` Liguzinski, WojciechX
@ 2021-11-19  7:26             ` Thomas Monjalon
  2021-11-19 16:53               ` Dumitrescu, Cristian
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Monjalon @ 2021-11-19  7:26 UTC (permalink / raw)
  To: Dumitrescu, Cristian, David Marchand, Lincoln Lavoie, Ajmera,
	Megha, Singh, Jasvinder, Liguzinski, WojciechX
  Cc: dev, Aaron Conole, Yigit, Ferruh, ci, Zegota, AnnaX

18/11/2021 23:10, Liguzinski, WojciechX:
> Hi,
> 
> I was trying to reproduce this test failure, but for me RED tests are passing. 
> I was running the exact test command like the one described in Bug 826 - 'red_autotest' on the current main branch.

The test is not always failing.
There are some failing conditions, please find them.
I think you should try in a container with more limited resources.
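
Since the failure is intermittent, one way to look for the failing conditions is to repeat the run many times and count how often it fails. A minimal sketch, assuming a POSIX shell; the helper name count_failures is hypothetical, and the example command in the comment is copied from the invocation shown in the bug report, with a relative path that assumes the build directory:

```shell
# count_failures RUNS CMD [ARGS...]
# Runs CMD the given number of times and prints how many runs exited non-zero.
# The body is a subshell "( ... )" so its variables stay local to the function.
count_failures() (
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Any non-zero exit status is counted as one failed run.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
)

# Example (assumption: run from the meson build directory):
#   count_failures 50 env DPDK_TEST=red_autotest \
#       ./app/test/dpdk-test -l 0-15 --file-prefix=red_autotest
```

Comparing the count across environments (bare metal, VM, container) would at least show whether the failure rate depends on the setup.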




* RE: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-19  7:26             ` Thomas Monjalon
@ 2021-11-19 16:53               ` Dumitrescu, Cristian
  2021-11-19 17:25                 ` Lincoln Lavoie
  2021-11-22  8:17                 ` David Marchand
  0 siblings, 2 replies; 16+ messages in thread
From: Dumitrescu, Cristian @ 2021-11-19 16:53 UTC (permalink / raw)
  To: Thomas Monjalon, David Marchand, Lincoln Lavoie, Ajmera, Megha,
	Singh, Jasvinder, Liguzinski, WojciechX
  Cc: dev, Aaron Conole, Yigit, Ferruh, ci, Zegota, AnnaX



> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, November 19, 2021 7:26 AM
> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; David Marchand
> <david.marchand@redhat.com>; Lincoln Lavoie <lylavoie@iol.unh.edu>;
> Ajmera, Megha <megha.ajmera@intel.com>; Singh, Jasvinder
> <jasvinder.singh@intel.com>; Liguzinski, WojciechX
> <wojciechx.liguzinski@intel.com>
> Cc: dev <dev@dpdk.org>; Aaron Conole <aconole@redhat.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; ci@dpdk.org; Zegota, AnnaX
> <annax.zegota@intel.com>
> Subject: Re: [dpdk-dev] [Bug 826] red_autotest random failures
> 
> 18/11/2021 23:10, Liguzinski, WojciechX:
> > Hi,
> >
> > I was trying to reproduce this test failure, but for me RED tests are passing.
> > I was running the exact test command like the one described in Bug 826 -
> > 'red_autotest' on the current main branch.
> 
> The test is not always failing.
> There are some failing conditions, please find them.
> I think you should try in a container with more limited resources.
> 

Hi Thomas,

This is not a fair request IMO. We want to avoid wasting everybody's time, including Wojciech's time. Can the bug originator provide the details on the setup to reproduce the failure, please? Thank you!

On a different point, we should probably tweak our autotests to differentiate between logical failures and those failures related to resources not being available, and flag the test result accordingly in the report. For example, if memory allocation fails, the test should be flagged as "Not enough resources" instead of simply "Failed". In the first case, the next step should be fixing the test setup, while in the second case the next step should be fixing the code. What do people think about this?

Regards,
Cristian


* Re: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-19 16:53               ` Dumitrescu, Cristian
@ 2021-11-19 17:25                 ` Lincoln Lavoie
  2021-11-24  7:48                   ` Liguzinski, WojciechX
  2021-11-22  8:17                 ` David Marchand
  1 sibling, 1 reply; 16+ messages in thread
From: Lincoln Lavoie @ 2021-11-19 17:25 UTC (permalink / raw)
  To: Dumitrescu, Cristian
  Cc: Thomas Monjalon, David Marchand, Ajmera, Megha, Singh, Jasvinder,
	Liguzinski, WojciechX, dev, Aaron Conole, Yigit, Ferruh, ci,
	Zegota, AnnaX

Hi All,

I'm not sure if it will help, but this is an example of a failing case in
the CI: https://lab.dpdk.org/results/dashboard/patchsets/20222/

The test is running within a docker container.  CI is set up to only allow
one active unit test at a time, so the host might be running compile jobs,
but not other unit tests.  This ensures there isn't "competition" for
resources like hugepages between two running unit test jobs.  The host is
actually a VM running on VMware vCenter, not a bare-metal host, the VM's
sole purpose is running the docker jobs.

The command to start the unit test run is pretty generic (script is below).

#!/bin/bash

####################################################
# $1 argument: extra arguments to send to meson test
####################################################

# Exit on first command failure
set -e

# Extract dpdk.tar.gz
tar xzfm dpdk.tar.gz

# Compile DPDK
cd dpdk
meson build --werror
ninja -C build install

# Unit test
cd build
meson test --suite fast-tests -t 60 $1

I think a starting point is to understand if the unit test expects or makes
assumptions on the system / environment.  If it has sole access to a CPU
core, minimum number of hugepages, etc.  If it would help, I can also give
you the DockerFile to build the container (note the RHEL images have to be
built on a licensed Redhat server, based on being able to install the
required packages).
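
To compare what the environment offers in a passing and a failing setup, the relevant values can be dumped just before the test starts. This is only a sketch: the helper names meminfo_field and report_env are hypothetical, and it assumes a Linux host where /proc/meminfo and nproc are available.

```shell
# meminfo_field KEY [FILE]
# Prints the numeric value of KEY from a /proc/meminfo-style file
# (default: /proc/meminfo). Subshell body keeps the variables local.
meminfo_field() (
    key=$1
    file=${2:-/proc/meminfo}
    awk -v k="$key:" '$1 == k { print $2 }' "$file"
)

# report_env
# Prints the CPU count and the memory / hugepage figures the test may rely on.
report_env() {
    echo "online CPUs:      $(nproc)"
    echo "MemAvailable kB:  $(meminfo_field MemAvailable)"
    echo "HugePages_Total:  $(meminfo_field HugePages_Total)"
    echo "HugePages_Free:   $(meminfo_field HugePages_Free)"
    echo "Hugepagesize kB:  $(meminfo_field Hugepagesize)"
}

# Example: call report_env right before "meson test" in the script above.
```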

Cheers,
Lincoln


On Fri, Nov 19, 2021 at 11:54 AM Dumitrescu, Cristian <
cristian.dumitrescu@intel.com> wrote:

>
>
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Friday, November 19, 2021 7:26 AM
> > To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; David Marchand
> > <david.marchand@redhat.com>; Lincoln Lavoie <lylavoie@iol.unh.edu>;
> > Ajmera, Megha <megha.ajmera@intel.com>; Singh, Jasvinder
> > <jasvinder.singh@intel.com>; Liguzinski, WojciechX
> > <wojciechx.liguzinski@intel.com>
> > Cc: dev <dev@dpdk.org>; Aaron Conole <aconole@redhat.com>; Yigit,
> > Ferruh <ferruh.yigit@intel.com>; ci@dpdk.org; Zegota, AnnaX
> > <annax.zegota@intel.com>
> > Subject: Re: [dpdk-dev] [Bug 826] red_autotest random failures
> >
> > 18/11/2021 23:10, Liguzinski, WojciechX:
> > > Hi,
> > >
> > > I was trying to reproduce this test failure, but for me RED tests are
> > > passing.
> > > I was running the exact test command like the one described in Bug 826 -
> > > 'red_autotest' on the current main branch.
> >
> > The test is not always failing.
> > There are some failing conditions, please find them.
> > I think you should try in a container with more limited resources.
> >
>
> Hi Thomas,
>
> This is not a fair request IMO. We want to avoid wasting everybody's time,
> including Wojciech's time. Can the bug originator provide the details on
> the setup to reproduce the failure, please? Thank you!
>
> On a different point, we should probably tweak our autotests to
> differentiate between logical failures and those failures related to
> resources not being available, and flag the test result accordingly in the
> report. For example, if memory allocation fails, the test should be flagged
> as "Not enough resources" instead of simply "Failed". In the first case,
> the next step should be fixing the test setup, while in the second case the
> next step should be fixing the code. What do people think on this?
>
> Regards,
> Cristian
>


-- 
*Lincoln Lavoie*
Principal Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
lylavoie@iol.unh.edu
https://www.iol.unh.edu
+1-603-674-2755 (m)
<https://www.iol.unh.edu>


* Re: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-19 16:53               ` Dumitrescu, Cristian
  2021-11-19 17:25                 ` Lincoln Lavoie
@ 2021-11-22  8:17                 ` David Marchand
  2021-11-22 13:34                   ` Lincoln Lavoie
  1 sibling, 1 reply; 16+ messages in thread
From: David Marchand @ 2021-11-22  8:17 UTC (permalink / raw)
  To: Dumitrescu, Cristian
  Cc: Thomas Monjalon, Lincoln Lavoie, Ajmera, Megha, Singh, Jasvinder,
	Liguzinski, WojciechX, dev, Aaron Conole, Yigit, Ferruh, ci,
	Zegota, AnnaX

On Fri, Nov 19, 2021 at 5:54 PM Dumitrescu, Cristian
<cristian.dumitrescu@intel.com> wrote:
> On a different point, we should probably tweak our autotests to differentiate between logical failures and those failures related to resources not being available, and flag the test result accordingly in the report. For example, if memory allocation fails, the test should be flagged as "Not enough resources" instead of simply "Failed". In the first case, the next step should be fixing the test setup, while in the second case the next step should be fixing the code. What do people think on this?

In such a case, the test must return TEST_SKIPPED.

I did a pass for cores count / specific hw requirements, some time ago.
See https://git.dpdk.org/dpdk/commit/?id=e0f4a0ed4237
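
For reference, Meson's default test protocol reports a test that exits with status 77 as skipped rather than failed. The sketch below shows the same idea at the wrapper level, assuming a hugepage shortage is the resource problem being detected. The function name, the threshold of 512 pages and the wrapper usage are hypothetical, and that TEST_SKIPPED maps to the same exit value is my understanding rather than something stated in this thread.

```shell
# Exit status that Meson's default test protocol reports as "skipped".
SKIP_EXIT_CODE=77

# classify_free_hugepages FREE NEEDED
# Prints "run" if at least NEEDED hugepages are free, otherwise "skip".
classify_free_hugepages() (
    free=$1
    needed=$2
    if [ "$free" -ge "$needed" ]; then
        echo run
    else
        echo skip
    fi
)

# Example wrapper usage (hypothetical threshold of 512 pages):
#   free=$(awk '$1 == "HugePages_Free:" { print $2 }' /proc/meminfo)
#   if [ "$(classify_free_hugepages "$free" 512)" = skip ]; then
#       exit "$SKIP_EXIT_CODE"
#   fi
```

A skip of this kind points at the test setup, while a plain failure still points at the code.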


-- 
David Marchand



* Re: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-22  8:17                 ` David Marchand
@ 2021-11-22 13:34                   ` Lincoln Lavoie
  0 siblings, 0 replies; 16+ messages in thread
From: Lincoln Lavoie @ 2021-11-22 13:34 UTC (permalink / raw)
  To: David Marchand
  Cc: Dumitrescu, Cristian, Thomas Monjalon, Ajmera, Megha, Singh,
	Jasvinder, Liguzinski, WojciechX, dev, Aaron Conole, Yigit,
	Ferruh, ci, Zegota, AnnaX

On Mon, Nov 22, 2021 at 3:17 AM David Marchand <david.marchand@redhat.com>
wrote:

> On Fri, Nov 19, 2021 at 5:54 PM Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com> wrote:
> > On a different point, we should probably tweak our autotests to
> differentiate between logical failures and those failures related to
> resources not being available, and flag the test result accordingly in the
> report. For example, if memory allocation fails, the test should be flagged
> as "Not enough resources" instead of simply "Failed". In the first case,
> the next step should be fixing the test setup, while in the second case the
> next step should be fixing the code. What do people think on this?
>
> In such case, the test must return TEST_SKIPPED.
>

If the purpose of the component / function being tested is to get / create
/ reserve the resource(s), the failure might be valid. So it can't be
applied across the board.  But in places where the test is checking other
functionality, this might at least prevent some failures that are transient
(i.e. based on what the test could "get" from the system at that moment in
time).



> I did a pass for cores count / specific hw requirements, some time ago.
> See https://git.dpdk.org/dpdk/commit/?id=e0f4a0ed4237
>
>
> --
> David Marchand
>
>

-- 
*Lincoln Lavoie*
Principal Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
lylavoie@iol.unh.edu
https://www.iol.unh.edu
+1-603-674-2755 (m)
<https://www.iol.unh.edu>


* RE: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-19 17:25                 ` Lincoln Lavoie
@ 2021-11-24  7:48                   ` Liguzinski, WojciechX
  2021-11-29 17:58                     ` Brandon Lo
  0 siblings, 1 reply; 16+ messages in thread
From: Liguzinski, WojciechX @ 2021-11-24  7:48 UTC (permalink / raw)
  To: Lincoln Lavoie, Dumitrescu, Cristian
  Cc: Thomas Monjalon, David Marchand, Ajmera, Megha, Singh, Jasvinder,
	dev, Aaron Conole, Yigit, Ferruh, ci, Zegota, AnnaX


Hi,

Thanks Lincoln, I will also give that script a try.

Cheers,
Wojciech


* Re: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-24  7:48                   ` Liguzinski, WojciechX
@ 2021-11-29 17:58                     ` Brandon Lo
  2021-11-30  7:51                       ` Liguzinski, WojciechX
  0 siblings, 1 reply; 16+ messages in thread
From: Brandon Lo @ 2021-11-29 17:58 UTC (permalink / raw)
  To: Liguzinski, WojciechX
  Cc: Lincoln Lavoie, Dumitrescu, Cristian, Thomas Monjalon,
	David Marchand, Ajmera, Megha, Singh, Jasvinder, dev,
	Aaron Conole, Yigit, Ferruh, ci, Zegota, AnnaX

On Wed, Nov 24, 2021 at 2:48 AM Liguzinski, WojciechX <
wojciechx.liguzinski@intel.com> wrote:

> Hi,
>
>
>
> Thanks Lincoln, I will also have a try with such script.
>
>
>
> Cheers,
>
> Wojciech
>
>
Hello Wojciech,

I also recommend trying to run the test with around 4GB of RAM and 2GB of
hugepages to see if it fails. That is roughly the amount of resources we
have per machine that is completely dedicated to unit tests. The amount of
RAM available can sometimes increase depending on how many jobs are running
per machine, but 4GB is the lowest it can go for the unit test job.
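
As a rough sketch of how such a setup could be approximated, the helper below converts a hugepage budget into a page count. The function name hugepages_needed is hypothetical. The commented commands assume root access for the kernel's nr_hugepages setting and Docker's --memory limit as the way to cap RAM, neither of which is confirmed as what the lab uses.

```shell
# hugepages_needed TARGET_MB PAGE_KB
# Prints how many hugepages of PAGE_KB kB are needed to cover TARGET_MB MB,
# rounded up. Subshell body keeps the variables local.
hugepages_needed() (
    target_kb=$(( $1 * 1024 ))
    page_kb=$2
    echo $(( (target_kb + page_kb - 1) / page_kb ))
)

# Example (assumptions: 2 MB hugepages, root access, Docker available):
#   hugepages_needed 2048 2048 | sudo tee /proc/sys/vm/nr_hugepages
#   docker run --memory=4g ... <image>
```

With 2 MB pages, 2GB of hugepages corresponds to 1024 pages; with 1 GB pages it is 2.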

Thanks,
Brandon



-- 
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu


* RE: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-29 17:58                     ` Brandon Lo
@ 2021-11-30  7:51                       ` Liguzinski, WojciechX
  2021-12-10 13:31                         ` Liguzinski, WojciechX
  0 siblings, 1 reply; 16+ messages in thread
From: Liguzinski, WojciechX @ 2021-11-30  7:51 UTC (permalink / raw)
  To: Brandon Lo
  Cc: Lincoln Lavoie, Dumitrescu, Cristian, Thomas Monjalon,
	David Marchand, Ajmera, Megha, Singh, Jasvinder, dev,
	Aaron Conole, Yigit, Ferruh, ci, Zegota, AnnaX

Ok, thanks Brandon for the tip :)
Let’s see if I can set up the machine with such a configuration.

Cheers,
Wojciech


* RE: [dpdk-dev] [Bug 826] red_autotest random failures
  2021-11-30  7:51                       ` Liguzinski, WojciechX
@ 2021-12-10 13:31                         ` Liguzinski, WojciechX
  0 siblings, 0 replies; 16+ messages in thread
From: Liguzinski, WojciechX @ 2021-12-10 13:31 UTC (permalink / raw)
  To: Brandon Lo
  Cc: Lincoln Lavoie, Dumitrescu, Cristian, Thomas Monjalon,
	David Marchand, Ajmera, Megha, Singh, Jasvinder, dev,
	Aaron Conole, Yigit, Ferruh, ci, Zegota, AnnaX, Danilewicz,
	MarcinX

Hi,

Unfortunately, I haven’t been able to move the investigation much further.
I have been running those tests on machines with more than 4GB of RAM, but with hugepages set to 1GB, using the script provided by Lincoln.
Over several runs, red_autotest didn’t fail even once, so I still have no clue as to what is causing the failures on CI.

+Adding Marcin Danilewicz
To let you know, Marcin Danilewicz will be taking over my tasks, so please include him in, or direct to him, any further messages on this topic.

Best Regards,
Wojciech



Thread overview: 16+ messages
2021-10-08  7:23 [dpdk-dev] [Bug 826] red_autotest random failures bugzilla
2021-11-12 13:51 ` David Marchand
2021-11-12 14:10   ` Lincoln Lavoie
2021-11-12 14:15     ` David Marchand
2021-11-15 11:51       ` Dumitrescu, Cristian
2021-11-15 17:26         ` Liguzinski, WojciechX
2021-11-18 22:10           ` Liguzinski, WojciechX
2021-11-19  7:26             ` Thomas Monjalon
2021-11-19 16:53               ` Dumitrescu, Cristian
2021-11-19 17:25                 ` Lincoln Lavoie
2021-11-24  7:48                   ` Liguzinski, WojciechX
2021-11-29 17:58                     ` Brandon Lo
2021-11-30  7:51                       ` Liguzinski, WojciechX
2021-12-10 13:31                         ` Liguzinski, WojciechX
2021-11-22  8:17                 ` David Marchand
2021-11-22 13:34                   ` Lincoln Lavoie
