From: David Marchand
Date: Fri, 11 Oct 2024 10:50:21 +0200
Subject: Re: [PATCH v2] service: fix deadlock on worker lcore exit
To: "Van Haaren, Harry", ci@dpdk.org
Cc: Mattias Rönnblom, "dev@dpdk.org", "stephen@networkplumber.org", "suanmingm@nvidia.com", "thomas@monjalon.net", "stable@dpdk.org", Tyler Retzlaff, Aaron Conole
References: <20241001162603.793853-1-mattias.ronnblom@ericsson.com> <20241003065702.3051158-1-david.marchand@redhat.com>

On Thu, Oct 3, 2024 at
5:50 PM Van Haaren, Harry wrote:
> > From: David Marchand
> > Sent: Thursday, October 3, 2024 10:13 AM
> > To: Mattias Rönnblom; Van Haaren, Harry
> > Cc: dev@dpdk.org; stephen@networkplumber.org; suanmingm@nvidia.com; thomas@monjalon.net; stable@dpdk.org; Tyler Retzlaff; Aaron Conole
> > Subject: Re: [PATCH v2] service: fix deadlock on worker lcore exit
> >
> > On Thu, Oct 3, 2024 at 8:57 AM David Marchand wrote:
> >
> > > From: Mattias Rönnblom
> > >
> > > Calling rte_exit() from a worker lcore thread causes a deadlock in
> > > rte_service_finalize().
> > >
> > > This patch makes rte_service_finalize() deadlock-free by avoiding the
> > > need to synchronize with service lcore threads, which in turn is
> > > achieved by moving service and per-lcore state from the heap to being
> > > statically allocated.
> > >
> > > The BSS segment increases with ~156 kB (on x86_64 with default
> > > RTE_MAX_LCORE and RTE_SERVICE_NUM_MAX).
> > >
> > > According to the service perf autotest, this change also results in a
> > > slight reduction of service framework overhead.
> > >
> > > Fixes: 33666b448f15 ("service: fix crash on exit")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Mattias Rönnblom
> > > Acked-by: Tyler Retzlaff
> > > ---
> > > Changes since v1:
> > > - rebased,
> >
> > I can't merge this patch in its current state.
> >
> > At the moment, two CI report a problem with the
> > eal_flags_file_prefix_autotest unit test.
> >
> > -------------------------------------stdout-------------------------------------
> > RTE>>eal_flags_file_prefix_autotest
> > Running binary with argv[]:'/home/zhoumin/gh_dpdk/build/app/dpdk-test'
> > '--proc-type=secondary' '-m' '18' '--file-prefix=memtest'
> > Running binary with argv[]:'/home/zhoumin/gh_dpdk/build/app/dpdk-test'
> > '-m' '18' '--file-prefix=memtest1'
> > Error - hugepage files for memtest1 were not deleted!
> > Test Failed
> > RTE>>
> >
> > Can you have a look?
>
> Not sure how the code change in question is relating to the eal-flags failure, but I can reproduce the failure here.
> Reproducing issue on *all* of the below tags; this indicates it's likely a board-config issue, and not a true issue (unless it's been there since 23.11??).
>
> Tested commits were all bad:
> b3485f4293 (HEAD, tag: v24.07) version: 24.07.0
> a9778aad62 (HEAD, tag: v24.03) version: 24.03.0
> eeb0605f11 (HEAD, tag: v23.11) version: 23.11.0
>
> So I'm pretty sure this is a board/runner config issue, with the error output as follows here:
> RTE>>eal_flags_file_prefix_autotest
> Running binary with argv[]:'./app/test/dpdk-test' '--proc-type=secondary' '-m' '18' '--file-prefix=memtest'
> EAL: Detected CPU lcores: 64
> EAL: Detected NUMA nodes: 2
> EAL: Detected static linkage of DPDK
> EAL: Cannot open '/var/run/dpdk/memtest/config' for rte_mem_config
> EAL: FATAL: Cannot init config
> EAL: Cannot init config
>
> FAIL:
> DPDK_TEST=eal_flags_file_prefix_autotest ./app/test/dpdk-test --no-pci
>
> PASS:
> DPDK_TEST=eal_flags_file_prefix_autotest ./app/test/dpdk-test
>
> So seems like the eal-flags test is NOT able to handle args like "--no-pci"? I tend to run tests in no PCI mode to speed up things :)

Well, speeding up, or hiding the issue, I guess.

> In short, this service-cores patch is not the root cause. Perhaps some of the CI folks can confirm if there's extra args passed to the runner?

To be clear, I can't merge this patch because of this (systematic)
failure in many CI environments (GHA, LoongArch, UNH).

Adding the CI mailing list to the loop.

-- 
David Marchand