From: Patrick Robb
Date: Wed, 16 Aug 2023 14:29:32 -0400
Subject: Re: [PATCH v5 04/10] app/test: build using per-file dependency matrix
To: David Marchand
Cc: Bruce Richardson, dev@dpdk.org, ci@dpdk.org, Morten Brørup, Honnappa Nagarahalli, "Ruifeng Wang (Arm Technology China)", Thomas Monjalon, Aaron Conole
References: <20230721115125.55137-1-bruce.richardson@intel.com> <20230815151053.996469-1-bruce.richardson@intel.com> <20230815151053.996469-5-bruce.richardson@intel.com>
List-Id: DPDK CI discussions

On Wed, Aug 16, 2023 at 10:40 AM David Marchand <david.marchand@redhat.com> wrote:

> Patrick, Bruce,
>
> If it was reported, I either missed it or forgot about it, sorry.
> Can you (re)share the context?
>
> > Does the test suite pass if the mlx5 driver is disabled in the build? That
> > could confirm or refute the suspicion of where the issue is, and also
> > provide a temporary workaround while this set is merged (possibly including
> > support for disabling specific tests, as I suggested in my other email).
>
> Or disabling the driver as Bruce proposes.

Okay, we ran the test with the mlx5 driver disabled, and it still fails. So,
this might be more of an ARM architecture issue. Ruifeng, are you still seeing
this on your test bed?

@David, you didn't miss anything: we had a unicast thread with ARM when setting
up the new arm container runners for unit testing a few months back. Ruifeng
also noticed the same issue and speculated about mlx5 memory leaks. He raised
the possibility of disabling the mlx5 driver too, but that option isn't great
since we want a uniform build process (as much as possible) for our unit
testing. Anyway, now we know that the mlx5 driver isn't the cause. I'll forward
the thread to you in any case; let me know if you have any ideas.

> > /Bruce
> >
> > PS: Are there any other workarounds inside the test/DTS/CI systems that
> > involve patching sources? If so, it would be good to get a list that we can
> > work through removing by putting place proper fixes or workarounds, as
> > changing sources for testing like this blocks future patch acceptance.
>
> Patching sources from the test tool is a poor solution.
> In general, developers won't be aware of source patching and will
> waste time trying to understand why they can't reproduce what the CI
> reports (it happened to me with DTS on the interrupt stuff with vhost,
> at least).
>
> For this specific case of skipping a test, if nobody can fix the
> issue, I prefer if the CI can skip some "known broken in my lab" tests
> via some meson configuration.
> And, such configuration should be easy to catch in the test report.

I strongly agree on all points, which is why I said it was probably a good
thing anyhow for us to lose the ability to patch sources from CI. The disabled
fast-test for arm was a new discovery that came from adding new environments,
not a regression introduced by a patch, and I don't think it made sense at the
time to block the introduction of the entire unit test coverage for arm while
they looked into the issue. If it's possible to introduce meson configure
functionality to disable specific tests, that does give us more flexibility.
And it's obviously a better process than us doing it at the CI end.

We don't currently patch source in any other way in our CI testing.
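
To make the "skip via configuration, and make it easy to catch in the report" idea a bit more concrete, here is a rough sketch of the behaviour being asked for. It is only an illustration under stated assumptions, not how DPDK, meson, or our lab scripts work today: the function names (plan_tests, format_report), the TestPlan type, and the test names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TestPlan:
    """Outcome of applying a configured skip list to a test suite."""
    to_run: List[str]
    skipped: List[str]
    unknown_skips: List[str]


def plan_tests(all_tests: List[str], skip_list: List[str]) -> TestPlan:
    """Split the suite into tests to run and tests skipped by configuration.

    Nothing in the sources is modified; the skip list is plain input data.
    """
    skip = set(skip_list)
    to_run = [name for name in all_tests if name not in skip]
    skipped = [name for name in all_tests if name in skip]
    # Skip entries that match no test are kept and reported, so a stale
    # skip list does not go unnoticed.
    unknown = sorted(skip - set(all_tests))
    return TestPlan(to_run, skipped, unknown)


def format_report(plan: TestPlan) -> str:
    """Render a plain-text report in which every skip is explicit."""
    lines = [f"RUN   {name}" for name in plan.to_run]
    lines += [f"SKIP  {name} (disabled by lab configuration)"
              for name in plan.skipped]
    lines += [f"WARN  {name} (in skip list but not in suite)"
              for name in plan.unknown_skips]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical test names, for illustration only.
    plan = plan_tests(["test_a", "test_b", "test_c"], ["test_b"])
    print(format_report(plan))
```

The point of the sketch is that a skipped test shows up by name in the report, so a developer trying to reproduce a CI result can see that the lab configuration, and not the sources, was changed.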
