From: Thomas Monjalon
To: Van Haaren Harry, Mattias Rönnblom
Cc: David Marchand, dpdklab, ci@dpdk.org, Honnappa Nagarahalli, Morten Brørup, Aaron Conole, dev, "ci@dpdk.org"
Subject: Re: rte_service unit test failing randomly
Date: Wed, 05 Oct 2022 22:52:43 +0200
Message-ID: <3000673.mvXUDI8C0e@thomas>
In-Reply-To: <739ee0ca-ccbe-5918-c2af-18e77327a898@ericsson.com>
References: <739ee0ca-ccbe-5918-c2af-18e77327a898@ericsson.com>
List-Id: DPDK CI discussions

05/10/2022 22:33, Mattias Rönnblom:
> On 2022-10-05 21:14, David Marchand wrote:
> > Hello,
> >
> > The service_autotest unit test has been failing randomly.
> > This is not something new.
> > We have been fixing this unit test and the service code, here and there.
> > For some time we were "fine": the failures were rare.
> >
> > But recently (for the last two weeks at least), it started failing more
> > frequently in UNH lab.
> >
> > The symptoms are linked to places where the unit test code is "waiting
> > for some time":
> >
> > - service_lcore_attr_get:
> > + TestCase [ 5] : service_lcore_attr_get failed
> > EAL: Test assert service_lcore_attr_get line 422 failed: Service lcore
> > not stopped after waiting.
> >
> >
> > - service_may_be_active:
> > + TestCase [15] : service_may_be_active failed
> > ...
> > EAL: Test assert service_may_be_active line 960 failed: Error: Service
> > not stopped after 100ms
> >
> > Ideas?
> >
> >
> > Thanks.
>
> Do you run the test suite in a controlled environment? I.e., one where
> you can trust that the lcore threads aren't interrupted for long periods
> of time.
>
> 100 ms is not a long time if a SCHED_OTHER lcore thread competes for the
> CPU with other threads.

You mean the tests cannot be interrupted?
Then it looks very fragile.
Could you please help make it more robust?
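
To illustrate what "more robust" could mean here, below is a minimal sketch
of one possible approach. It is not the actual test code: it uses plain POSIX
calls rather than DPDK APIs, and the names wait_until, cond_fn, flag_is_set
and countdown_done are hypothetical. The idea is to poll the condition until
a generous deadline instead of asserting after one fixed, short wait, so that
a preempted thread only delays the test rather than failing it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical predicate type: returns true once the awaited state is reached. */
typedef bool (*cond_fn)(void *arg);

/* Monotonic time in milliseconds; not affected by wall-clock adjustments. */
static uint64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}

/*
 * Poll cond(arg) every poll_ms milliseconds until it returns true or
 * timeout_ms has elapsed. Returns true as soon as the condition holds, so a
 * large timeout costs nothing on an idle machine; it only matters when the
 * observed thread has been descheduled for a while.
 */
static bool wait_until(cond_fn cond, void *arg, uint64_t timeout_ms,
		       uint64_t poll_ms)
{
	assert(cond != NULL);

	const uint64_t deadline = now_ms() + timeout_ms;
	const struct timespec pause = {
		.tv_sec = (time_t)(poll_ms / 1000u),
		.tv_nsec = (long)((poll_ms % 1000u) * 1000000u),
	};

	for (;;) {
		if (cond(arg))
			return true;
		if (now_ms() >= deadline)
			return cond(arg); /* one last check after the deadline */
		nanosleep(&pause, NULL);
	}
}

/* Example predicate: true when the int flag pointed to by arg is non-zero. */
static bool flag_is_set(void *arg)
{
	return *(volatile int *)arg != 0;
}

/*
 * Example predicate standing in for a slow stop: returns false and
 * decrements the counter until it reaches zero, then returns true.
 */
static bool countdown_done(void *arg)
{
	int *left = arg;

	if (*left > 0) {
		(*left)--;
		return false;
	}
	return true;
}
```

With such a helper, a fixed 100 ms wait could become something like
wait_until(pred, arg, 5000, 1), where pred is whatever check the test already
performs. The 5000 ms figure is only an example value; the right bound
depends on how noisy the CI machines are.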