From: Patrick Robb <probb@iol.unh.edu>
Date: Thu, 3 Jul 2025 13:38:07 -0400
Subject: Re: Intel E810 Performance Regression - ARM Grace Server
To: "Richardson, Bruce" <bruce.richardson@intel.com>
Cc: Manit Mahajan <mmahajan@iol.unh.edu>, "Nagarahalli, Honnappa" <Honnappa.Nagarahalli@arm.com>, "Burakov, Anatoly" <anatoly.burakov@intel.com>, ci@dpdk.org, Wathsala Wathawana Vithanage <wathsala.vithanage@arm.com>, Paul Szczepanek <Paul.Szczepanek@arm.com>, "Mcnamara, John" <john.mcnamara@intel.com>
List-Id: DPDK CI discussions

We finished up on Slack, but I'm just noting for the CI mailing list that this is resolved now that we are using the new flag in the test. Thanks.
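
For reference, the build step for the single core forwarding test now amounts to roughly the following (the build directory name and the ninja step here are illustrative, not our exact CI invocation); per Bruce's suggestion below, keeping the old define alongside the new one is harmless:

    # Define both the old and the renamed 16-byte descriptor flag so the
    # same test definition works before and after commit d1a350c089e0.
    meson setup build \
        -Dc_args="-DRTE_LIBRTE_ICE_16BYTE_RX_DESC -DRTE_NET_INTEL_USE_16BYTE_DESC"
    ninja -C build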
On Thu, Jul 3, 2025 at 11:39 AM Richardson, Bruce <bruce.richardson@intel.com> wrote:

> Suggest testing with both flag names set, just for safety and backward compatibility. Having the old flag still defined is harmless.
>
> *From:* Manit Mahajan <mmahajan@iol.unh.edu>
> *Sent:* Thursday, July 3, 2025 4:31 PM
> *To:* Patrick Robb <probb@iol.unh.edu>
> *Cc:* Richardson, Bruce <bruce.richardson@intel.com>; Nagarahalli, Honnappa <Honnappa.Nagarahalli@arm.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org; Wathsala Wathawana Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek <Paul.Szczepanek@arm.com>; Mcnamara, John <john.mcnamara@intel.com>
> *Subject:* Re: Intel E810 Performance Regression - ARM Grace Server
>
> Hi Bruce,
>
> I looked at the commit and I see that it changes RTE_LIBRTE_ICE_16BYTE_RX_DESC to RTE_NET_INTEL_USE_16BYTE_DESC. The test I ran does the meson setup with the flag -Dc_args=-DRTE_LIBRTE_ICE_16BYTE_RX_DESC. I will run another test with the new flag name.
>
> Thanks,
> Manit
>
> On Thu, Jul 3, 2025 at 11:26 AM Patrick Robb <probb@iol.unh.edu> wrote:
>
> Hi Bruce,
>
> When the NIC is E810, the test runs the meson setup with the flag -Dc_args=-DRTE_LIBRTE_ICE_16BYTE_RX_DESC
>
> I think that is what you mean? Is this setup correct?
>
> On Thu, Jul 3, 2025 at 11:22 AM Richardson, Bruce <bruce.richardson@intel.com> wrote:
>
> Is the test you are running setting the 16B descriptor flag, and does it need updating to take account of the new flag name?
>
> *From:* Manit Mahajan <mmahajan@iol.unh.edu>
> *Sent:* Thursday, July 3, 2025 4:22 PM
> *To:* Richardson, Bruce <bruce.richardson@intel.com>
> *Cc:* Patrick Robb <probb@iol.unh.edu>; Nagarahalli, Honnappa <Honnappa.Nagarahalli@arm.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org; Wathsala Wathawana Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek <Paul.Szczepanek@arm.com>; Mcnamara, John <john.mcnamara@intel.com>
> *Subject:* Re: Intel E810 Performance Regression - ARM Grace Server
>
> Hi Bruce,
>
> This morning I was able to narrow down the performance issue to a specific commit. I ran performance tests on the following two commits:
>
> - d1a350c089e0 – net/ice: rename 16-byte descriptor flag
> - 4c4b9ce017fe – net/i40e: rename 16-byte descriptor flag
>
> The net/i40e commit directly precedes the net/ice commit. I observed a significant drop in Mpps beginning with commit d1a350c089e0, confirming that this commit introduced the regression.
>
> Thanks,
> Manit
>
> On Thu, Jul 3, 2025 at 9:12 AM Richardson, Bruce <bruce.richardson@intel.com> wrote:
>
> Thanks Patrick, I'm planning on checking some performance numbers again on our end too.
>
> My thoughts on the ring size are that the total number of ring slots across all rings should be enough to ride out an expected stall. So back in the 10G days (max packet arrival rate of ~67 ns), we would use ring sizes of 512 entries, which would give us just short of 35 usec of buffering. Even with a 4k ring size, at 100G we only have 27.5 usec of buffering. Now, admittedly, CPUs are faster too, so they should be less likely to stop polling for that amount of time, but they aren't 10x as fast as in the 10G days, so I find a 512-entry ring a little small. For 100G, I would expect 2k to be a reasonable minimum ring size to test with – if testing a single queue. Obviously, the more queues and cores we test with, the smaller each ring can be, since the arrival rate per ring should be lower.
>
> /Bruce
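
For reference, the arithmetic behind the buffering figures quoted above, assuming 64-byte frames plus the 20 bytes of preamble and inter-frame gap on the wire (the 2k line just extends the same sum):

    64B frame on the wire    = 64 + 20 bytes = 84 bytes = 672 bits
    arrival interval at 10G  = 672 bits / 10 Gbps  ~ 67 ns  -> 512 slots  ~ 34 usec of buffering
    arrival interval at 100G = 672 bits / 100 Gbps ~ 6.7 ns -> 4096 slots ~ 27.5 usec
                                                            -> 2048 slots ~ 13.8 usec
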
> *From:* Patrick Robb <probb@iol.unh.edu>
> *Sent:* Thursday, July 3, 2025 1:53 PM
> *To:* Richardson, Bruce <bruce.richardson@intel.com>
> *Cc:* Nagarahalli, Honnappa <Honnappa.Nagarahalli@arm.com>; Manit Mahajan <mmahajan@iol.unh.edu>; Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org; Wathsala Wathawana Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek <Paul.Szczepanek@arm.com>
> *Subject:* Re: Intel E810 Performance Regression - ARM Grace Server
>
> Hi Bruce,
>
> Manit can identify the specific commit this morning.
>
> You raise a good point about the descriptor count. It is worth us assessing performance with a broader set of descriptor counts and deciding which test configurations will yield helpful results for developers going forward. By my understanding, we want to test with descriptor counts that are appropriate for the given traffic flow, not the other way around. We will gather more info this morning and share it back to you.
>
> On Thu, Jul 3, 2025 at 4:43 AM Richardson, Bruce <bruce.richardson@intel.com> wrote:
>
> Hi Manit,
>
> Can you identify exactly which patch within the series is causing the regression? We were not expecting performance to change with the patchset, but obviously something got missed.
> I will follow up on our end to see if we see any regressions.
>
> I must say, though, that 512 entries is a pretty small ring size to use for 100G traffic. The slightest stall would cause those rings to overflow. What is perf like at other ring sizes, e.g. 1k or 2k?
>
> /Bruce
>
> > -----Original Message-----
> > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > Sent: Thursday, July 3, 2025 8:03 AM
> > To: Manit Mahajan <mmahajan@iol.unh.edu>
> > Cc: Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; Wathsala Wathawana Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek <Paul.Szczepanek@arm.com>
> > Subject: Re: Intel E810 Performance Regression - ARM Grace Server
> >
> > + Wathsala, Paul
> >
> > > On Jul 2, 2025, at 10:09 PM, Manit Mahajan <mmahajan@iol.unh.edu> wrote:
> > >
> > > Hi, we have an update about the single core forwarding test on the ARM Grace server with the E810 100G Ice card. There was an Intel PMD series merged a week ago which had some performance failures when it was going through the CI:
> > > https://patches.dpdk.org/project/dpdk/patch/01c94afcb0b1c2795c031afc872a8faf3f0db2b5.1749229651.git.anatoly.burakov@intel.com/
> > >
> > > and: http://mails.dpdk.org/archives/test-report/2025-June/883654.html
> > >
> > > As you can see, it causes roughly a 6% decrease in packets forwarded in the single core forwarding test with 64-byte frames and 512 txd/rxd. The delta tolerance on the single core forwarding test is 5%, so a 6% reduction in Mpps forwarded is a failure.
> > >
> > > This was merged into mainline 6 days ago, which is why some failures started to come in this week for the E810 Grace test.
> > >
> > > To double-check this, I checked DPDK out at:
> > >
> > > test/event: fix event vector adapter timeouts (2eca0f4cd5daf6cd54b8705f6f76f3003c923912), which directly precedes the Intel PMD patch series, and ran the test, and it forwarded the pre-regression Mpps that we expected.
> > >
> > > Then I checked out net/intel: add common Tx mbuf recycle (f5fd081c86ae415515ab55cbacf10c9c50536ca1)
> > >
> > > and I ran the test and it had the 6% reduction in Mpps forwarded.
> > >
> > > Another thing to note is that, regrettably, the ARM Grace E810 test did not get run on the v7 (the final version) of this series, which meant the failure was not displayed on that version, and that's probably why it was merged. We will look back into our job history and see why this test failed to report.
> > >
> > > Please let me know if you have any questions about the test, the testbed environment info, or anything else.
> >
> > Thanks Manit for looking into this. Adding a few folks from Arm to follow up.
> >
> > > Thanks,
> > > Manit Mahajan
> >
> > IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.