DPDK CI discussions
From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
To: David Marchand <david.marchand@redhat.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"dpdklab@iol.unh.edu" <dpdklab@iol.unh.edu>,
	"ci@dpdk.org" <ci@dpdk.org>,
	"Honnappa.Nagarahalli@arm.com" <Honnappa.Nagarahalli@arm.com>,
	"mattias.ronnblom" <mattias.ronnblom@ericsson.com>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"Morten Brørup" <mb@smartsharesystems.com>,
	"Tyler Retzlaff" <roretzla@linux.microsoft.com>,
	"Aaron Conole" <aconole@redhat.com>
Subject: RE: [PATCH v3] test/service: fix spurious failures by extending timeout
Date: Fri, 3 Feb 2023 15:03:38 +0000
Message-ID: <BN0PR11MB5712D936D06832DBF4FDAC27D7D79@BN0PR11MB5712.namprd11.prod.outlook.com>
In-Reply-To: <BN0PR11MB57126014E08124D900D5ABC2D7D09@BN0PR11MB5712.namprd11.prod.outlook.com>

> -----Original Message-----
> From: Van Haaren, Harry
> Sent: Tuesday, January 31, 2023 5:25 PM
> To: David Marchand <david.marchand@redhat.com>
> Cc: dev@dpdk.org; dpdklab@iol.unh.edu; ci@dpdk.org;
> Honnappa.Nagarahalli@arm.com; mattias.ronnblom
> <mattias.ronnblom@ericsson.com>; thomas@monjalon.net; Morten Brørup
> <mb@smartsharesystems.com>; Tyler Retzlaff <roretzla@linux.microsoft.com>;
> Aaron Conole <aconole@redhat.com>
> Subject: RE: [PATCH v3] test/service: fix spurious failures by extending timeout

<snip>

> <snip>
> > The timeout approach just does not have its place in a functional test.
> > Either this test is rewritten, or it must go to the performance tests
> > list so that we stop getting false positives.
> > Can you work on this?
> 
> I'll investigate various approaches on Thursday and reply here with suggested
> next steps.

I've identified 3 checks that fail in CI (from the above log outputs). The 3 cases
have different delays: 100 ms, 200 ms and 1000 ms.
In the CI, the service-core simply hasn't been scheduled (yet) when the delay expires, and that causes the "failure".

Option 1)
One option is to loop with while(1), waiting for the service-thread to be scheduled. This can be
seen as "increasing the timeout", however in this case the test-case would fail
not in the test-code, but in the meson-test runner as a timeout (with a 10sec default?).
The benefit here is that massively increasing the wait (from ~1sec or less to 10 sec) will cover all/many
of the CI timeouts.
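
As a rough sketch of the polling idea in Option 1 (this is not the actual test code), the
snippet below uses plain pthreads and C11 atomics instead of the DPDK service API. The names
service_calls, service_thread and wait_for_service are hypothetical stand-ins, and the
deadline_ms parameter is an assumption added so the loop cannot spin forever; the while(1)
variant described above would drop it and rely on the test runner's timeout instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the per-service call counter. */
static atomic_uint_fast64_t service_calls;

/* Hypothetical stand-in for the service function run on the service core. */
static void *service_thread(void *arg)
{
	(void)arg;
	atomic_fetch_add(&service_calls, 1);
	return NULL;
}

/* Monotonic clock in milliseconds, so wall-clock changes do not matter. */
static uint64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

/*
 * Poll until the service has run at least once, sleeping 1 ms between
 * polls. Returns 0 once the counter is non-zero, or -1 if deadline_ms
 * elapses first (deadline_ms is an assumption, see above).
 */
static int wait_for_service(uint64_t deadline_ms)
{
	const struct timespec poll = { .tv_sec = 0, .tv_nsec = 1000000 };
	uint64_t start = now_ms();

	while (atomic_load(&service_calls) == 0) {
		if (now_ms() - start >= deadline_ms)
			return -1;
		nanosleep(&poll, NULL);
	}
	return 0;
}
```

A test would start the thread and then call wait_for_service(10000) in place of a fixed
100 ms / 200 ms / 1000 ms delay. On a quiet machine it returns after about a millisecond,
and only a badly starved CI runner gets anywhere near the deadline.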

Option 2)
Move to perf-tests, and not run these in a noisy-CI environment where the results are not
consistent enough to have value. This would mean that the 3 checks in question are not run in CI.
The 3 checks are below, they all *require* the service core to be scheduled:
service_attr_get() -> requires service core to run for service stats to increment
service_lcore_attr_get() -> requires service core to run for lcore stats to increment
service_lcore_start_stop() -> requires service to run to ensure service-func itself executes.

I don't see how we can "improve" option 2 to not require the service-thread to be scheduled by the OS.
And the only way to make the OS schedule it in the CI more consistently is to give it more time?

Thoughts and input welcome. I'm happy to make the code changes myself, it's a small effort
for both option 1 & 2.

Regards, -Harry



Thread overview: 19+ messages
2022-10-06  8:17 [PATCH] " Harry van Haaren
2022-10-06  8:28 ` [PATCH v2] " Harry van Haaren
2022-10-06  8:39   ` David Marchand
2022-10-06  8:54     ` Mattias Rönnblom
2022-10-06  8:37 ` [PATCH] " Mattias Rönnblom
2022-10-06 12:52 ` [PATCH v3] " Harry van Haaren
2022-10-06 13:27   ` Morten Brørup
2022-10-06 19:33     ` David Marchand
2023-01-26  9:29       ` David Marchand
2023-01-31 17:24         ` Van Haaren, Harry
2023-02-03 15:03           ` Van Haaren, Harry [this message]
2023-02-03 15:12             ` Bruce Richardson
2023-02-23 20:10               ` Thomas Monjalon
2023-02-27  8:41                 ` Van Haaren, Harry
2023-02-03 15:16             ` Thomas Monjalon
2023-02-03 16:09               ` Van Haaren, Harry
2023-02-23 20:15                 ` Thomas Monjalon
2023-02-27  8:41                   ` Van Haaren, Harry
2022-10-06 14:00   ` Mattias Rönnblom
