From: David Marchand
Date: Fri, 11 Oct 2024 10:50:21 +0200
Subject: Re: [PATCH v2] service: fix deadlock on worker lcore exit
To: "Van Haaren, Harry", ci@dpdk.org
Cc: Mattias Rönnblom, "dev@dpdk.org", "stephen@networkplumber.org", "suanmingm@nvidia.com", "thomas@monjalon.net", "stable@dpdk.org", Tyler Retzlaff, Aaron Conole
References: <20241001162603.793853-1-mattias.ronnblom@ericsson.com> <20241003065702.3051158-1-david.marchand@redhat.com>

On Thu, Oct 3, 2024 at 5:50 PM Van Haaren, Harry wrote:
> > From: David Marchand
> > Sent: Thursday, October 3, 2024 10:13 AM
> > To: Mattias Rönnblom; Van Haaren, Harry
> > Cc: dev@dpdk.org; stephen@networkplumber.org; suanmingm@nvidia.com; thomas@monjalon.net; stable@dpdk.org; Tyler Retzlaff; Aaron Conole
> > Subject: Re: [PATCH v2] service: fix deadlock on worker lcore exit
> >
> > On Thu, Oct 3, 2024 at 8:57 AM David Marchand wrote:
> > >
> > > From: Mattias Rönnblom
> > >
> > > Calling rte_exit() from a worker lcore thread causes a deadlock in
> > > rte_service_finalize().
> > >
> > > This patch makes rte_service_finalize() deadlock-free by avoiding the
> > > need to synchronize with service lcore threads, which in turn is
> > > achieved by moving service and per-lcore state from the heap to being
> > > statically allocated.
> > >
> > > The BSS segment increases with ~156 kB (on x86_64 with default
> > > RTE_MAX_LCORE and RTE_SERVICE_NUM_MAX).
> > >
> > > According to the service perf autotest, this change also results in a
> > > slight reduction of service framework overhead.
> > >
> > > Fixes: 33666b448f15 ("service: fix crash on exit")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Mattias Rönnblom
> > > Acked-by: Tyler Retzlaff
> > > ---
> > > Changes since v1:
> > > - rebased,
> >
> > I can't merge this patch in its current state.
> >
> > At the moment, two CI report a problem with the
> > eal_flags_file_prefix_autotest unit test.
> >
> > -------------------------------------stdout-------------------------------------
> > RTE>>eal_flags_file_prefix_autotest
> > Running binary with argv[]:'/home/zhoumin/gh_dpdk/build/app/dpdk-test'
> > '--proc-type=secondary' '-m' '18' '--file-prefix=memtest'
> > Running binary with argv[]:'/home/zhoumin/gh_dpdk/build/app/dpdk-test'
> > '-m' '18' '--file-prefix=memtest1'
> > Error - hugepage files for memtest1 were not deleted!
> > Test Failed
> > RTE>>
> >
> > Can you have a look?
>
> Not sure how the code change in question is relating to the eal-flags failure, but I can reproduce the failure here.
> Reproducing issue on *all* of the below tags; this indicates it's likely a board-config issue, and not a true issue (unless it's been there since 23.11??).
>
> Tested commits were all bad:
> b3485f4293 (HEAD, tag: v24.07) version: 24.07.0
> a9778aad62 (HEAD, tag: v24.03) version: 24.03.0
> eeb0605f11 (HEAD, tag: v23.11) version: 23.11.0
>
> So I'm pretty sure this is a board/runner config issue, with the error output as follows here:
> RTE>>eal_flags_file_prefix_autotest
> Running binary with argv[]:'./app/test/dpdk-test' '--proc-type=secondary' '-m' '18' '--file-prefix=memtest'
> EAL: Detected CPU lcores: 64
> EAL: Detected NUMA nodes: 2
> EAL: Detected static linkage of DPDK
> EAL: Cannot open '/var/run/dpdk/memtest/config' for rte_mem_config
> EAL: FATAL: Cannot init config
> EAL: Cannot init config
>
> FAIL:
> DPDK_TEST=eal_flags_file_prefix_autotest ./app/test/dpdk-test --no-pci
>
> PASS:
> DPDK_TEST=eal_flags_file_prefix_autotest ./app/test/dpdk-test
>
> So seems like the eal-flags test is NOT able to handle args like "--no-pci"? I tend to run tests in no PCI mode to speed up things :)

Well, speeding up, or hiding the issue, I guess.

> In short, this service-cores patch is not the root cause. Perhaps some of the CI folks can confirm if there's extra args passed to the runner?

To be clear, I can't merge this patch because of this (systematic)
failure in many CI environments (GHA, LoongArch, UNH).

Adding the CI mailing list to the loop.

-- 
David Marchand