From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Marchand
To: dev@dpdk.org
Cc: aconole@redhat.com, stable@dpdk.org, Harry van Haaren, Kevin Laatz
Date: Mon, 11 Oct 2021 16:54:30 +0200
Message-Id: <20211011145430.6587-1-david.marchand@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="US-ASCII"
Subject: [dpdk-dev] [PATCH] test/service: fix race in attr check

The CI reported rare (and cryptic) failures like:

RTE>>service_autotest
 + ------------------------------------------------------- +
 + Test Suite : service core test suite
 + ------------------------------------------------------- +
 + TestCase [ 0] : unregister_all succeeded
 + TestCase [ 1] : service_name succeeded
 + TestCase [ 2] : service_get_by_name succeeded
Service dummy_service Summary
  dummy_service: stats 1 calls 0 cycles 0 avg: 0
Service dummy_service Summary
  dummy_service: stats 0 calls 0 cycles 0 avg: 0
 + TestCase [ 3] : service_dump succeeded
 + TestCase [ 4] : service_attr_get failed
 + TestCase [ 5] : service_lcore_attr_get succeeded
 + TestCase [ 6] : service_probe_capability succeeded
 + TestCase [ 7] : service_start_stop succeeded
 + TestCase [ 8] : service_lcore_add_del succeeded
 + TestCase [ 9] : service_lcore_start_stop succeeded
 + TestCase [10] : service_lcore_en_dis_able succeeded
 + TestCase [11] : service_mt_unsafe_poll succeeded
 + TestCase [12] : service_mt_safe_poll succeeded
perf test for MT Safe: 42.7 cycles per call
 + TestCase [13] : service_app_lcore_mt_safe succeeded
perf test for MT Unsafe: 73.3 cycles per call
 + TestCase [14] : service_app_lcore_mt_unsafe succeeded
 + TestCase [15] : service_may_be_active succeeded
 + TestCase [16] : service_active_two_cores succeeded
 + ------------------------------------------------------- +
 + Test Suite Summary : service core test suite
 + ------------------------------------------------------- +
 + Tests Total : 17
 + Tests Skipped : 0
 + Tests Executed : 17
 + Tests Unsupported: 0
 + Tests Passed : 16
 + Tests Failed : 1
 + ------------------------------------------------------- +
Test Failed
RTE>>
stderr:
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 2
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/service_autotest/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No available 1048576 kB hugepages reported
EAL: VFIO support initialized
EAL: Device 0000:03:00.0 is not NUMA-aware, defaulting socket to 0
APP: HPET is not enabled, using TSC as default timer
EAL: Test assert service_attr_get line 340 failed: attr_get() call didn't get call count (zero)

According to the API, a service lcore cannot be stopped if it is the
only lcore associated with a service. Attempting this makes
rte_service_lcore_stop() return -EBUSY, which the service_attr_get
subtest was not checking.
This left the service lcore running, and the main lcore then raced with
it when checking the service attributes, which triggered this CI
failure.

To fix this, dissociate the service lcore from the current service
before stopping it.

With this first issue fixed, a race still exists, because the
wait_slcore_inactive helper added in a previous fix was not paired with
a check that the service lcore _did_ stop.
Add the missing check on rte_service_lcore_may_be_active().
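
The first failure mode can be illustrated with a small stand-alone
model. This is only a sketch under stated assumptions: struct svc_model
and the model_* helpers are hypothetical names that do not exist in
DPDK, and the -EBUSY rule is reduced to one service and one lcore. It
shows why ignoring the return value leaves the lcore active, and why
unmapping first lets the stop succeed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified model of one service and one service lcore.
 * None of these names exist in DPDK. */
struct svc_model {
	bool service_running; /* service run state is "running" */
	bool lcore_mapped;    /* lcore is mapped to the service */
	bool lcore_active;    /* lcore is still polling services */
};

/* Models the rule from the commit message: the only lcore mapped to a
 * running service cannot be stopped; the call reports -EBUSY instead. */
static int model_lcore_stop(struct svc_model *m)
{
	if (m->service_running && m->lcore_mapped)
		return -EBUSY;
	m->lcore_active = false;
	return 0;
}

/* Models mapping (enable=true) or unmapping (enable=false) the lcore. */
static int model_map_lcore_set(struct svc_model *m, bool enable)
{
	m->lcore_mapped = enable;
	return 0;
}

/* Old test sequence: stop without unmapping, return value ignored.
 * Returns whether the lcore is still active afterwards. */
static bool old_sequence_leaves_lcore_active(void)
{
	struct svc_model m = { true, true, true };

	(void)model_lcore_stop(&m); /* -EBUSY goes unnoticed */
	return m.lcore_active;
}

/* Fixed sequence: unmap first, then stop and check each return code.
 * Returns whether the lcore is still active afterwards. */
static bool fixed_sequence_leaves_lcore_active(void)
{
	struct svc_model m = { true, true, true };
	int rc;

	rc = model_map_lcore_set(&m, false);
	assert(rc == 0);
	rc = model_lcore_stop(&m);
	assert(rc == 0);
	(void)rc; /* keeps the build warning-free when NDEBUG is set */
	return m.lcore_active;
}
```

In this model the old sequence reports the lcore as still active, while
the fixed sequence reports it as stopped. The real test additionally
waits for the lcore and asserts on rte_service_lcore_may_be_active(),
which the model does not attempt to reproduce.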

Fixes: 4d55194d76a4 ("service: add attribute get function")
Fixes: 52bb6be259ff ("test/service: fix race condition on stopping lcore")
Cc: stable@dpdk.org

Signed-off-by: David Marchand
---
 app/test/test_service_cores.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index ece104054e..6821dca0e6 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -318,10 +318,16 @@ service_attr_get(void)
 	TEST_ASSERT_EQUAL(1, cycles_gt_zero,
 			"attr_get() failed to get cycles (expected > zero)");
 
-	rte_service_lcore_stop(slcore_id);
+	TEST_ASSERT_EQUAL(0, rte_service_map_lcore_set(id, slcore_id, 0),
+			"Disabling valid service and core failed");
+	TEST_ASSERT_EQUAL(0, rte_service_lcore_stop(slcore_id),
+			"Failed to stop service lcore");
 
 	wait_slcore_inactive(slcore_id);
 
+	TEST_ASSERT_EQUAL(0, rte_service_lcore_may_be_active(slcore_id),
+			"Service lcore not stopped after waiting.");
+
 	TEST_ASSERT_EQUAL(0, rte_service_attr_get(id, attr_calls, &attr_value),
 			"Valid attr_get() call didn't return success");
 	TEST_ASSERT_EQUAL(1, (attr_value > 0),
-- 
2.23.0