From: Thomas Monjalon <thomas@monjalon.net>
To: Van Haaren Harry <harry.van.haaren@intel.com>,
 Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Cc: David Marchand <david.marchand@redhat.com>, dpdklab <dpdklab@iol.unh.edu>,
 ci@dpdk.org, Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
 Morten Brørup <mb@smartsharesystems.com>,
 Aaron Conole <aconole@redhat.com>, dev <dev@dpdk.org>,
 "ci@dpdk.org" <ci@dpdk.org>
Subject: Re: rte_service unit test failing randomly
Date: Wed, 05 Oct 2022 22:52:43 +0200
Message-ID: <3000673.mvXUDI8C0e@thomas>
In-Reply-To: <739ee0ca-ccbe-5918-c2af-18e77327a898@ericsson.com>
References: <CAJFAV8wcXZ_XhML=64HcV6Zs06PrB=UP9q5M6LWQhGvfLHmjEQ@mail.gmail.com>
 <739ee0ca-ccbe-5918-c2af-18e77327a898@ericsson.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="iso-8859-1"

05/10/2022 22:33, Mattias Rönnblom:
> On 2022-10-05 21:14, David Marchand wrote:
> > Hello,
> >
> > The service_autotest unit test has been failing randomly.
> > This is not something new.
> > We have been fixing this unit test and the service code, here and there.
> > For some time we were "fine": the failures were rare.
> >
> > But recently (for the last two weeks at least), it started failing more
> > frequently in the UNH lab.
> >
> > The symptoms are linked to places where the unit test code is "waiting
> > for some time":
> >
> > -  service_lcore_attr_get:
> > + TestCase [ 5] : service_lcore_attr_get failed
> > EAL: Test assert service_lcore_attr_get line 422 failed: Service lcore
> > not stopped after waiting.
> >
> >
> > -  service_may_be_active:
> > + TestCase [15] : service_may_be_active failed
> > ...
> > EAL: Test assert service_may_be_active line 960 failed: Error: Service
> > not stopped after 100ms
> >
> > Ideas?
> >
> >
> > Thanks.
>
> Do you run the test suite in a controlled environment? I.e., one where
> you can trust that the lcore threads aren't interrupted for long periods
> of time.
>
> 100 ms is not a long time if a SCHED_OTHER lcore thread competes for the
> CPU with other threads.

You mean the tests cannot be interrupted?
Then it looks very fragile.
Could you please help make it more robust?
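
To illustrate the kind of robustness meant here: one common approach is to
poll for the condition up to a generous deadline, instead of asserting after
a single fixed delay. The sketch below is plain C and only an assumption about
what could work; wait_until() and fake_lcore_stopped() are hypothetical names,
not functions from the DPDK test suite, and the timeout would need tuning for
the CI machines.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical predicate type: returns true once the awaited state is reached. */
typedef bool (*cond_fn)(void *arg);

/* Monotonic time in milliseconds, unaffected by wall-clock adjustments. */
static uint64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

/*
 * Poll cond() every poll_ms until it returns true or timeout_ms has elapsed.
 * Returns 0 as soon as the condition holds, -1 on timeout.
 */
static int wait_until(cond_fn cond, void *arg,
		unsigned int timeout_ms, unsigned int poll_ms)
{
	const uint64_t deadline = now_ms() + timeout_ms;
	const struct timespec delay = {
		.tv_sec = poll_ms / 1000,
		.tv_nsec = (long)(poll_ms % 1000) * 1000000L,
	};

	for (;;) {
		if (cond(arg))
			return 0;
		if (now_ms() >= deadline)
			return -1;
		nanosleep(&delay, NULL);
	}
}

/* Stand-in for the real check: pretends the lcore stops after a few polls. */
struct fake_lcore {
	int polls_left;
};

static bool fake_lcore_stopped(void *arg)
{
	struct fake_lcore *lcore = arg;

	if (lcore->polls_left > 0)
		lcore->polls_left--;
	return lcore->polls_left == 0;
}
```

With such a helper, the test returns as soon as the service has stopped, so a
timeout of several seconds costs nothing in the normal case. It does not
remove the dependency on scheduling, it only makes a spurious failure much
less likely when the lcore thread is preempted.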