From: David Marchand
Date: Tue, 12 Oct 2021 20:49:20 +0200
To: dev
Cc: Aaron Conole, dpdk stable, Harry van Haaren, Kevin Laatz
Subject: Re: [dpdk-stable] [PATCH] test/service: fix race in attr check
In-Reply-To: <20211011145430.6587-1-david.marchand@redhat.com>

On Mon, Oct 11, 2021 at 4:54 PM David Marchand wrote:
>
> The CI reported rare (and cryptic) failures like:
>
> RTE>>service_autotest
> + ------------------------------------------------------- +
> + Test Suite : service core test suite
> + ------------------------------------------------------- +
> + TestCase [ 0] : unregister_all succeeded
> + TestCase [ 1] : service_name succeeded
> + TestCase [ 2] : service_get_by_name succeeded
> Service dummy_service Summary
> dummy_service: stats 1 calls 0 cycles 0 avg: 0
> Service dummy_service Summary
> dummy_service: stats 0 calls 0 cycles 0 avg: 0
> + TestCase [ 3] : service_dump succeeded
> + TestCase [ 4] : service_attr_get failed
> + TestCase [ 5] : service_lcore_attr_get succeeded
> + TestCase [ 6] : service_probe_capability succeeded
> + TestCase [ 7] : service_start_stop succeeded
> + TestCase [ 8] : service_lcore_add_del succeeded
> + TestCase [ 9] : service_lcore_start_stop succeeded
> + TestCase [10] : service_lcore_en_dis_able succeeded
> + TestCase [11] : service_mt_unsafe_poll succeeded
> + TestCase [12] : service_mt_safe_poll succeeded
> perf test for MT Safe: 42.7 cycles per call
> + TestCase [13] : service_app_lcore_mt_safe succeeded
> perf test for MT Unsafe: 73.3 cycles per call
> + TestCase [14] : service_app_lcore_mt_unsafe succeeded
> + TestCase [15] : service_may_be_active succeeded
> + TestCase [16] : service_active_two_cores succeeded
> + ------------------------------------------------------- +
> + Test Suite Summary : service core test suite
> + ------------------------------------------------------- +
> + Tests Total : 17
> + Tests Skipped : 0
> + Tests Executed : 17
> + Tests Unsupported: 0
> + Tests Passed : 16
> + Tests Failed : 1
> + ------------------------------------------------------- +
> Test Failed
> RTE>>
> stderr:
> EAL: Detected CPU lcores: 16
> EAL: Detected NUMA nodes: 2
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /var/run/dpdk/service_autotest/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: No available 1048576 kB hugepages reported
> EAL: VFIO support initialized
> EAL: Device 0000:03:00.0 is not NUMA-aware, defaulting socket to 0
> APP: HPET is not enabled, using TSC as default timer
> EAL: Test assert service_attr_get line 340 failed: attr_get() call didn't
> get call count (zero)
>
> According to API, trying to stop a service lcore is not possible if this
> lcore is the only one associated to a service.
> Doing this will result in a -EBUSY return code from
> rte_service_lcore_stop() which the service_attr_get subtest was not
> checking.
> This left the service lcore running, and a race existed with the main
> lcore on checking the service attributes which triggered this CI
> failure.
>
> To fix this, dissociate the service lcore from the current service.
>
> Once this first issue is fixed, a race still exists, because the
> wait_slcore_inactive helper added in a previous fix was not
> paired with a check that the service lcore _did_ stop.
>
> Add missing check on rte_service_lcore_may_be_active.
>
> Fixes: 4d55194d76a4 ("service: add attribute get function")
> Fixes: 52bb6be259ff ("test/service: fix race condition on stopping lcore")
> Cc: stable@dpdk.org
>
> Signed-off-by: David Marchand

Acked-by: Aaron Conole
Acked-by: Harry van Haaren

Applied, thanks.

--
David Marchand