From: David Marchand <david.marchand@redhat.com>
To: dev <dev@dpdk.org>
Cc: Aaron Conole <aconole@redhat.com>, dpdk stable <stable@dpdk.org>,
Harry van Haaren <harry.van.haaren@intel.com>,
Kevin Laatz <kevin.laatz@intel.com>
Subject: Re: [dpdk-stable] [PATCH] test/service: fix race in attr check
Date: Tue, 12 Oct 2021 20:49:20 +0200
Message-ID: <CAJFAV8yz4NZdto+M4AUVvuS47vbtHqRiHo0WPuGoDPsurW8aXw@mail.gmail.com>
In-Reply-To: <20211011145430.6587-1-david.marchand@redhat.com>
On Mon, Oct 11, 2021 at 4:54 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> The CI reported rare (and cryptic) failures like:
>
> RTE>>service_autotest
> + ------------------------------------------------------- +
> + Test Suite : service core test suite
> + ------------------------------------------------------- +
> + TestCase [ 0] : unregister_all succeeded
> + TestCase [ 1] : service_name succeeded
> + TestCase [ 2] : service_get_by_name succeeded
> Service dummy_service Summary
> dummy_service: stats 1 calls 0 cycles 0 avg: 0
> Service dummy_service Summary
> dummy_service: stats 0 calls 0 cycles 0 avg: 0
> + TestCase [ 3] : service_dump succeeded
> + TestCase [ 4] : service_attr_get failed
> + TestCase [ 5] : service_lcore_attr_get succeeded
> + TestCase [ 6] : service_probe_capability succeeded
> + TestCase [ 7] : service_start_stop succeeded
> + TestCase [ 8] : service_lcore_add_del succeeded
> + TestCase [ 9] : service_lcore_start_stop succeeded
> + TestCase [10] : service_lcore_en_dis_able succeeded
> + TestCase [11] : service_mt_unsafe_poll succeeded
> + TestCase [12] : service_mt_safe_poll succeeded
> perf test for MT Safe: 42.7 cycles per call
> + TestCase [13] : service_app_lcore_mt_safe succeeded
> perf test for MT Unsafe: 73.3 cycles per call
> + TestCase [14] : service_app_lcore_mt_unsafe succeeded
> + TestCase [15] : service_may_be_active succeeded
> + TestCase [16] : service_active_two_cores succeeded
> + ------------------------------------------------------- +
> + Test Suite Summary : service core test suite
> + ------------------------------------------------------- +
> + Tests Total : 17
> + Tests Skipped : 0
> + Tests Executed : 17
> + Tests Unsupported: 0
> + Tests Passed : 16
> + Tests Failed : 1
> + ------------------------------------------------------- +
> Test Failed
> RTE>>
> stderr:
> EAL: Detected CPU lcores: 16
> EAL: Detected NUMA nodes: 2
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /var/run/dpdk/service_autotest/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: No available 1048576 kB hugepages reported
> EAL: VFIO support initialized
> EAL: Device 0000:03:00.0 is not NUMA-aware, defaulting socket to 0
> APP: HPET is not enabled, using TSC as default timer
> EAL: Test assert service_attr_get line 340 failed: attr_get() call didn't
> get call count (zero)
>
> According to the API, a service lcore cannot be stopped while it is the
> only lcore associated with a service.
> Attempting this makes rte_service_lcore_stop() return -EBUSY, which the
> service_attr_get subtest was not checking.
> This left the service lcore running, so it raced with the main lcore
> reading the service attributes, which triggered this CI failure.
>
> To fix this, dissociate the service lcore from the current service.
>
> With this first issue fixed, a race still exists, because the
> wait_slcore_inactive helper added in a previous fix was not
> paired with a check that the service lcore _did_ stop.
>
> Add the missing check on rte_service_lcore_may_be_active.
>
> Fixes: 4d55194d76a4 ("service: add attribute get function")
> Fixes: 52bb6be259ff ("test/service: fix race condition on stopping lcore")
> Cc: stable@dpdk.org
>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Applied, thanks.
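
For anyone reading this thread later, here is a minimal sketch of the
ordering the commit message describes. It is not the applied patch and it
does not call DPDK: every mock_* name and the struct are hypothetical
stand-ins, and the only behaviour modelled is the one stated above, namely
that stopping a service lcore is refused with -EBUSY while it is the only
lcore associated with a service.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for one service lcore; not a DPDK type. */
struct mock_slcore {
	bool running;      /* lcore is still executing its service loop */
	bool sole_mapping; /* it is the only lcore associated with a service */
};

/* Models the refusal described above: -EBUSY while sole_mapping is set. */
int mock_lcore_stop(struct mock_slcore *lc)
{
	if (lc->sole_mapping)
		return -EBUSY;
	lc->running = false;
	return 0;
}

/* Models dissociating the lcore from the service. */
void mock_unmap(struct mock_slcore *lc)
{
	lc->sole_mapping = false;
}

/* Models the "may be active" query: 1 while running, 0 once stopped. */
int mock_may_be_active(const struct mock_slcore *lc)
{
	return lc->running ? 1 : 0;
}

/* Old pattern: the stop result is ignored, so the lcore may keep running.
 * Returns 1 if the lcore is still active afterwards. */
int stop_unchecked(struct mock_slcore *lc)
{
	(void)mock_lcore_stop(lc);
	return mock_may_be_active(lc);
}

/* Fixed ordering: dissociate first, check the stop result, then confirm
 * the lcore really is inactive. Returns 0 on success. */
int stop_checked(struct mock_slcore *lc)
{
	int ret;

	mock_unmap(lc);
	ret = mock_lcore_stop(lc);
	if (ret != 0)
		return ret;
	return mock_may_be_active(lc) ? -EBUSY : 0;
}
```

In the real test the equivalent steps would go through the service API
(rte_service_lcore_stop() and rte_service_lcore_may_be_active() as named
above), with the wait_slcore_inactive helper between the stop and the check.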
--
David Marchand