From: David Marchand
Date: Tue, 12 Oct 2021 20:49:20 +0200
To: dev
Cc: Aaron Conole, dpdk stable, Harry van Haaren, Kevin Laatz
Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] test/service: fix race in attr check
References: <20211011145430.6587-1-david.marchand@redhat.com>
In-Reply-To: <20211011145430.6587-1-david.marchand@redhat.com>
List-Id: DPDK patches and discussions

On Mon, Oct 11, 2021 at 4:54 PM David Marchand wrote:
>
> The CI reported rare (and cryptic) failures like:
>
> RTE>>service_autotest
>  + ------------------------------------------------------- +
>  + Test Suite : service core test suite
>  + ------------------------------------------------------- +
>  + TestCase [ 0] : unregister_all succeeded
>  + TestCase [ 1] : service_name succeeded
>  + TestCase [ 2] : service_get_by_name succeeded
> Service dummy_service Summary
>   dummy_service: stats 1 calls 0 cycles 0 avg: 0
> Service dummy_service Summary
>   dummy_service: stats 0 calls 0 cycles 0 avg: 0
>  + TestCase [ 3] : service_dump succeeded
>  + TestCase [ 4] : service_attr_get failed
>  + TestCase [ 5] : service_lcore_attr_get succeeded
>  + TestCase [ 6] : service_probe_capability succeeded
>  + TestCase [ 7] : service_start_stop succeeded
>  + TestCase [ 8] : service_lcore_add_del succeeded
>  + TestCase [ 9] : service_lcore_start_stop succeeded
>  + TestCase [10] : service_lcore_en_dis_able succeeded
>  + TestCase [11] : service_mt_unsafe_poll succeeded
>  + TestCase [12] : service_mt_safe_poll succeeded
> perf test for MT Safe: 42.7 cycles per call
>  + TestCase [13] : service_app_lcore_mt_safe succeeded
> perf test for MT Unsafe: 73.3 cycles per call
>  + TestCase [14] : service_app_lcore_mt_unsafe succeeded
>  + TestCase [15] : service_may_be_active succeeded
>  + TestCase [16] : service_active_two_cores succeeded
>  + ------------------------------------------------------- +
>  + Test Suite Summary : service core test suite
>  + ------------------------------------------------------- +
>  + Tests Total : 17
>  + Tests Skipped : 0
>  + Tests Executed : 17
>  + Tests Unsupported: 0
>  + Tests Passed : 16
>  + Tests Failed : 1
>  + ------------------------------------------------------- +
> Test Failed
> RTE>>
> stderr:
> EAL: Detected CPU lcores: 16
> EAL: Detected NUMA nodes: 2
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /var/run/dpdk/service_autotest/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: No available 1048576 kB hugepages reported
> EAL: VFIO support initialized
> EAL: Device 0000:03:00.0 is not NUMA-aware, defaulting socket to 0
> APP: HPET is not enabled, using TSC as default timer
> EAL: Test assert service_attr_get line 340 failed: attr_get() call didn't
> get call count (zero)
>
> According to API, trying to stop a service lcore is not possible if this
> lcore is the only one associated to a service.
> Doing this will result in a -EBUSY return code from
> rte_service_lcore_stop() which the service_attr_get subtest was not
> checking.
> This left the service lcore running, and a race existed with the main
> lcore on checking the service attributes which triggered this CI
> failure.
>
> To fix this, dissociate the service lcore with current service.
>
> Once fixed this first issue, a race still exists, because the
> wait_slcore_inactive helper added in a previous fix was not
> paired with a check that the service lcore _did_ stop.
>
> Add missing check on rte_service_lcore_may_be_active.
>
> Fixes: 4d55194d76a4 ("service: add attribute get function")
> Fixes: 52bb6be259ff ("test/service: fix race condition on stopping lcore")
> Cc: stable@dpdk.org
>
> Signed-off-by: David Marchand

Acked-by: Aaron Conole
Acked-by: Harry van Haaren

Applied, thanks.

--
David Marchand
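
For readers who want to see the stop sequence from the commit message in isolation, here is a minimal sketch in C. It is a toy model, not DPDK code and not the applied patch: every `model_*` name, `stop_unchecked` and `stop_checked` are hypothetical, and the model only encodes the rule quoted above (stopping the only lcore associated to a running service yields -EBUSY). In DPDK itself the dissociation step would presumably go through `rte_service_map_lcore_set()`, but the diff is not quoted in this message, so that is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model (hypothetical, NOT the DPDK implementation): one service and a
 * handful of service lcores, enough to show why an unchecked stop can leave
 * the lcore running. */
#define MODEL_MAX_LCORES 4

struct model {
    bool service_running;                 /* runstate of the single service */
    bool mapped[MODEL_MAX_LCORES];        /* lcore is associated to the service */
    bool lcore_running[MODEL_MAX_LCORES]; /* service lcore has been started */
};

/* Starting state of the failing subtest: one running service, served by
 * exactly one running service lcore (lcore 0). */
static struct model model_single_lcore(void)
{
    struct model m = { .service_running = true };
    m.mapped[0] = true;
    m.lcore_running[0] = true;
    return m;
}

static int model_mapped_count(const struct model *m)
{
    int n = 0;
    for (unsigned int i = 0; i < MODEL_MAX_LCORES; i++)
        n += m->mapped[i] ? 1 : 0;
    return n;
}

/* Associate (enable) or dissociate (disable) an lcore and the service. */
static int model_map_lcore_set(struct model *m, unsigned int lcore, bool enable)
{
    if (lcore >= MODEL_MAX_LCORES)
        return -EINVAL;
    m->mapped[lcore] = enable;
    return 0;
}

/* Encodes the rule from the commit message: refuse with -EBUSY when this
 * lcore is the only one associated to a running service. */
static int model_lcore_stop(struct model *m, unsigned int lcore)
{
    if (lcore >= MODEL_MAX_LCORES)
        return -EINVAL;
    if (!m->lcore_running[lcore])
        return -EALREADY;
    if (m->mapped[lcore] && m->service_running && model_mapped_count(m) == 1)
        return -EBUSY;
    m->lcore_running[lcore] = false;
    return 0;
}

/* 1 if the lcore may still be running services, 0 otherwise. */
static int model_lcore_may_be_active(const struct model *m, unsigned int lcore)
{
    assert(lcore < MODEL_MAX_LCORES);
    return m->lcore_running[lcore] ? 1 : 0;
}

/* Old behaviour: the return code of stop is ignored, so -EBUSY goes
 * unnoticed. Returns whether the lcore may still be active afterwards. */
static int stop_unchecked(struct model *m, unsigned int lcore)
{
    (void)model_lcore_stop(m, lcore);
    return model_lcore_may_be_active(m, lcore);
}

/* Sequence described in the commit message: dissociate first, check the
 * return code of stop, then confirm that the lcore really did stop. */
static int stop_checked(struct model *m, unsigned int lcore)
{
    int ret = model_map_lcore_set(m, lcore, false);
    if (ret != 0)
        return ret;
    ret = model_lcore_stop(m, lcore);
    if (ret != 0)
        return ret;
    return model_lcore_may_be_active(m, lcore) == 0 ? 0 : -EBUSY;
}
```

Starting from `model_single_lcore()`, `stop_unchecked(&m, 0)` reports that lcore 0 may still be active, which corresponds to the window in which the main lcore raced with it on the service attributes. `stop_checked(&m, 0)` returns 0 only once the lcore is known to be inactive. The real test additionally has to wait for the lcore to wind down (the `wait_slcore_inactive` helper mentioned above), which this single-threaded model does not need.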