From: Patrick Robb
Date: Wed, 16 Aug 2023 14:29:32 -0400
Subject: Re: [PATCH v5 04/10] app/test: build using per-file dependency matrix
To: David Marchand
Cc: Bruce Richardson, dev@dpdk.org, ci@dpdk.org, Morten Brørup, Honnappa Nagarahalli, "Ruifeng Wang (Arm Technology China)", Thomas Monjalon, Aaron Conole
References: <20230721115125.55137-1-bruce.richardson@intel.com> <20230815151053.996469-1-bruce.richardson@intel.com> <20230815151053.996469-5-bruce.richardson@intel.com>
List-Id: DPDK patches and discussions

On Wed, Aug 16, 2023 at 10:40 AM David Marchand <david.marchand@redhat.com> wrote:

> Patrick, Bruce,
>
> If it was reported, I either missed it or forgot about it, sorry.
> Can you (re)share the context?
>
> > Does the test suite pass if the mlx5 driver is disabled in the build? That
> > could confirm or refute the suspicion of where the issue is, and also
> > provide a temporary workaround while this set is merged (possibly including
> > support for disabling specific tests, as I suggested in my other email).
>
> Or disabling the driver as Bruce proposes.

Okay, we ran the test with the mlx5 driver disabled, and it still fails. So, this might be more of an ARM architecture issue. Ruifeng, are you still seeing this on your test bed?

@David you didn't miss anything, we had a unicast with ARM when setting up the new arm container runners for unit testing a few months back. Ruifeng also noticed the same issue and speculated about mlx5 memory leaks. He raised the possibility of disabling the mlx5 driver too, but that option isn't great since we want to have a uniform build process (as much as possible) for our unit testing. Anyways, now we know that that isn't relevant. I'll forward the thread to you in any case - let me know if you have any ideas.

> > /Bruce
> >
> > PS: Are there any other workarounds inside the test/DTS/CI systems that
> > involve patching sources? If so, it would be good to get a list that we can
> > work through removing by putting place proper fixes or workarounds, as
> > changing sources for testing like this blocks future patch acceptance.
>
> Patching sources from the test tool is a poor solution.
> In general, developers won't be aware of source patching and will
> waste time trying to understand why they can't reproduce what the CI
> reports (it happened to me with DTS on the interrupt stuff with vhost,
> at least).
>
> For this specific case of skipping a test, if nobody can fix the
> issue, I prefer if the CI can skip some "known broken in my lab" tests
> via some meson configuration.
> And, such configuration should be easy to catch in the test report.

I strongly agree on all points, which is why I said it was probably a good thing anyhow for us to lose this ability. In the case of the disabled fast-test for arm, that was a new discovery coming from adding new environments, not a regression introduced by a patch, and I don't think it made sense then to block the introduction of the entire unit test coverage for arm while they looked into this issue. If it's possible to introduce meson configure functionality to disable specific tests, that does give us more flexibility. And it's obviously a better process than us doing it at the CI end.

We don't currently patch source in any other way in our CI testing.