From: David Marchand
Date: Fri, 11 Oct 2024 10:50:21 +0200
Subject: Re: [PATCH v2] service: fix deadlock on worker lcore exit
To: "Van Haaren, Harry", ci@dpdk.org
Cc: Mattias Rönnblom, dev@dpdk.org, stephen@networkplumber.org, suanmingm@nvidia.com, thomas@monjalon.net, stable@dpdk.org, Tyler Retzlaff, Aaron Conole
References: <20241001162603.793853-1-mattias.ronnblom@ericsson.com> <20241003065702.3051158-1-david.marchand@redhat.com>
List-Id: DPDK patches and discussions

On Thu, Oct 3, 2024 at 5:50 PM Van Haaren,
Harry wrote:
> > From: David Marchand
> > Sent: Thursday, October 3, 2024 10:13 AM
> > To: Mattias Rönnblom; Van Haaren, Harry
> > Cc: dev@dpdk.org; stephen@networkplumber.org; suanmingm@nvidia.com; thomas@monjalon.net; stable@dpdk.org; Tyler Retzlaff; Aaron Conole
> > Subject: Re: [PATCH v2] service: fix deadlock on worker lcore exit
> >
> > On Thu, Oct 3, 2024 at 8:57 AM David Marchand wrote:
> > >
> > > From: Mattias Rönnblom
> > >
> > > Calling rte_exit() from a worker lcore thread causes a deadlock in
> > > rte_service_finalize().
> > >
> > > This patch makes rte_service_finalize() deadlock-free by avoiding the
> > > need to synchronize with service lcore threads, which in turn is
> > > achieved by moving service and per-lcore state from the heap to being
> > > statically allocated.
> > >
> > > The BSS segment increases with ~156 kB (on x86_64 with default
> > > RTE_MAX_LCORE and RTE_SERVICE_NUM_MAX).
> > >
> > > According to the service perf autotest, this change also results in a
> > > slight reduction of service framework overhead.
> > >
> > > Fixes: 33666b448f15 ("service: fix crash on exit")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Mattias Rönnblom
> > > Acked-by: Tyler Retzlaff
> > > ---
> > > Changes since v1:
> > > - rebased,
> >
> > I can't merge this patch in its current state.
> >
> > At the moment, two CI report a problem with the
> > eal_flags_file_prefix_autotest unit test.
> >
> > -------------------------------------stdout-------------------------------------
> > RTE>>eal_flags_file_prefix_autotest
> > Running binary with argv[]:'/home/zhoumin/gh_dpdk/build/app/dpdk-test'
> > '--proc-type=secondary' '-m' '18' '--file-prefix=memtest'
> > Running binary with argv[]:'/home/zhoumin/gh_dpdk/build/app/dpdk-test'
> > '-m' '18' '--file-prefix=memtest1'
> > Error - hugepage files for memtest1 were not deleted!
> > Test Failed
> > RTE>>
> >
> > Can you have a look?
>
> Not sure how the code change in question is relating to the eal-flags failure, but I can reproduce the failure here.
> Reproducing issue on *all* of the below tags; this indicates its likely a board-config issue, and not a true issue (unless its been there since 23.11??).
>
> Tested commits were all bad:
> b3485f4293 (HEAD, tag: v24.07) version: 24.07.0
> a9778aad62 (HEAD, tag: v24.03) version: 24.03.0
> eeb0605f11 (HEAD, tag: v23.11) version: 23.11.0
>
> So I'm pretty sure this is a board/runner config issue, with the error output as follows here:
> RTE>>eal_flags_file_prefix_autotest
> Running binary with argv[]:'./app/test/dpdk-test' '--proc-type=secondary' '-m' '18' '--file-prefix=memtest'
> EAL: Detected CPU lcores: 64
> EAL: Detected NUMA nodes: 2
> EAL: Detected static linkage of DPDK
> EAL: Cannot open '/var/run/dpdk/memtest/config' for rte_mem_config
> EAL: FATAL: Cannot init config
> EAL: Cannot init config
>
> FAIL:
> DPDK_TEST=eal_flags_file_prefix_autotest ./app/test/dpdk-test --no-pci
>
> PASS:
> DPDK_TEST=eal_flags_file_prefix_autotest ./app/test/dpdk-test
>
> So seems like the eal-flags test is NOT able to handle args like "--no-pci"? I tend to run tests in no PCI mode to speed up things :)

Well, speeding up, or hiding the issue, I guess.

> In short, this service-cores patch is not the root cause. Perhaps some of the CI folks can confirm if there's extra args passed to the runner?

To be clear, I can't merge this patch because of this (systematic) failure in many CI env (GHA, LoongArch, UNH).
Adding CI ml in the loop.

-- 
David Marchand
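
For readers following the quoted patch description, here is a minimal C sketch of the general idea of keeping per-lcore state in static storage instead of on the heap, so that a finalize step has nothing to free and does not need to wait for worker threads. Every name in it (the sketch_* functions, struct sketch_core_state, SKETCH_MAX_LCORE) is hypothetical and is not the DPDK service API; the actual patch may be structured differently.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical limit; in DPDK the real bound comes from the build config. */
#define SKETCH_MAX_LCORE 128

/* Hypothetical per-lcore state, not the actual DPDK structure. */
struct sketch_core_state {
    bool is_service_core;
    unsigned long calls;
};

/* Static storage: placed in BSS, zero-initialized at load, never freed. */
static struct sketch_core_state sketch_states[SKETCH_MAX_LCORE];
static bool sketch_initialized;

/* Reset the state table; no allocation can fail because none happens. */
int sketch_init(void)
{
    memset(sketch_states, 0, sizeof(sketch_states));
    sketch_initialized = true;
    return 0;
}

/* Worker-side update. The memory stays valid for the whole process
 * lifetime, so a worker that is still running during shutdown never
 * touches freed memory. */
int sketch_record_call(unsigned int lcore)
{
    if (lcore >= SKETCH_MAX_LCORE)
        return -1;
    sketch_states[lcore].calls++;
    return 0;
}

/* Read back the counter for one lcore (0 for out-of-range ids). */
unsigned long sketch_calls(unsigned int lcore)
{
    return lcore < SKETCH_MAX_LCORE ? sketch_states[lcore].calls : 0;
}

/* Finalize only flips a flag. With nothing to free, there is no need
 * to join or otherwise synchronize with worker threads here. */
void sketch_finalize(void)
{
    sketch_initialized = false;
}
```

The trade-off matches the one the patch description mentions: the table is sized for the maximum lcore count up front, which grows the BSS segment, in exchange for a finalize path that cannot block on other threads.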