From: Patrick Robb <probb@iol.unh.edu>
Date: Thu, 3 Jul 2025 11:20:38 -0400
Subject: Re: Intel E810 Performance Regression - ARM Grace Server
To: "Richardson, Bruce" <bruce.richardson@intel.com>
Cc: Manit Mahajan <mmahajan@iol.unh.edu>, "Nagarahalli, Honnappa" <Honnappa.Nagarahalli@arm.com>, "Burakov, Anatoly" <anatoly.burakov@intel.com>, ci@dpdk.org, Wathsala Wathawana Vithanage <wathsala.vithanage@arm.com>, Paul Szczepanek <Paul.Szczepanek@arm.com>, "Mcnamara, John" <john.mcnamara@intel.com>
List-Id: DPDK CI discussions

Hi Bruce,

When the NIC is E810, the test runs meson setup with the flag
-Dc_args=-DRTE_LIBRTE_ICE_16BYTE_RX_DESC. I think that is what you
mean? Is this setup correct?
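For reference, the full configure step the harness runs in that case is
along these lines (build directory name illustrative):

    meson setup build -Dc_args=-DRTE_LIBRTE_ICE_16BYTE_RX_DESC
    ninja -C build

If the rename series retired RTE_LIBRTE_ICE_16BYTE_RX_DESC in favor of a
common net/intel define (presumably something like
RTE_NET_INTEL_USE_16BYTE_DESC; we should confirm the exact name against
the merged series), then the -D above no longer matches anything in the
driver, and the test has silently been building the default 32-byte
descriptor path, which would line up with the Mpps drop Manit measured.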
On Thu, Jul 3, 2025 at 11:22 AM Richardson, Bruce <bruce.richardson@intel.com> wrote:

> Is the test you are running setting the 16B descriptor flag, and does it
> need updating to take account of the new flag name?
>
> From: Manit Mahajan <mmahajan@iol.unh.edu>
> Sent: Thursday, July 3, 2025 4:22 PM
> To: Richardson, Bruce <bruce.richardson@intel.com>
> Cc: Patrick Robb <probb@iol.unh.edu>; Nagarahalli, Honnappa <Honnappa.Nagarahalli@arm.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org; Wathsala Wathawana Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek <Paul.Szczepanek@arm.com>; Mcnamara, John <john.mcnamara@intel.com>
> Subject: Re: Intel E810 Performance Regression - ARM Grace Server
>
> Hi Bruce,
>
> This morning, I was able to narrow down the performance issue to a
> specific commit. I ran performance tests on the following two commits:
>
> - d1a350c089e0 – net/ice: rename 16-byte descriptor flag
> - 4c4b9ce017fe – net/i40e: rename 16-byte descriptor flag
>
> The net/i40e commit directly precedes the net/ice commit. I observed a
> significant drop in Mpps beginning with commit d1a350c089e0, confirming
> that this commit introduced the regression.
>
> Thanks,
> Manit
>
> On Thu, Jul 3, 2025 at 9:12 AM Richardson, Bruce <bruce.richardson@intel.com> wrote:
>
> Thanks Patrick, I'm planning on checking some performance numbers again
> on our end too.
>
> My thoughts on the ring size are that the total number of ring slots
> across all rings should be enough to ride out an expected stall. So back
> in the 10G days (packets arriving as fast as one every ~67 ns), we would
> use ring sizes of 512 entries, which would give us just short of 35 usec
> of buffering. Even with a 4k ring size, at 100G we only have 27.5 usec
> of buffering. Now, admittedly, CPUs are faster too, so they should be
> less likely to stop polling for that amount of time, but they aren't 10x
> as fast as in the 10G days, so I find a ring size of 512 a little small.
> For 100G, I would expect 2k to be a reasonable minimum ring size to test
> with, if testing a single queue. Obviously, the more queues and cores we
> test with, the smaller each ring can be, since the arrival rate per ring
> should be lower.
>
> /Bruce
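For concreteness, the arithmetic behind those buffering figures, assuming
64-byte frames plus the standard 20 bytes of preamble/IFG overhead, i.e.
84 bytes or 672 bits per packet on the wire:

    10G:  672 bits / 10 Gbps  = 67.2 ns between packets
          512 slots  x 67.2 ns ~= 34.4 usec of buffering
    100G: 672 bits / 100 Gbps = 6.72 ns between packets
          512 slots  x 6.72 ns ~= 3.4 usec
          4096 slots x 6.72 ns ~= 27.5 usec

In other words, at 100G a 512-entry ring rides out a stall only about a
tenth as long as the same ring did at 10G, which is why 2k entries looks
like a more realistic single-queue minimum.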
> From: Patrick Robb <probb@iol.unh.edu>
> Sent: Thursday, July 3, 2025 1:53 PM
> To: Richardson, Bruce <bruce.richardson@intel.com>
> Cc: Nagarahalli, Honnappa <Honnappa.Nagarahalli@arm.com>; Manit Mahajan <mmahajan@iol.unh.edu>; Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org; Wathsala Wathawana Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek <Paul.Szczepanek@arm.com>
> Subject: Re: Intel E810 Performance Regression - ARM Grace Server
>
> Hi Bruce,
>
> Manit can identify the specific commit this morning.
>
> You raise a good point about the descriptor count. It is worth us
> assessing the performance with a broader set of descriptor counts and
> deciding what set of test configurations will yield helpful results for
> developers going forward. By my understanding, we want to test with a
> set of descriptor counts which are basically appropriate for the given
> traffic flow, not the other way around. We will gather more info this
> morning and share it back to you.
>
> On Thu, Jul 3, 2025 at 4:43 AM Richardson, Bruce <bruce.richardson@intel.com> wrote:
>
> Hi Manit,
>
> Can you identify which patch exactly within the series is causing the
> regression? We were not expecting performance to change with the
> patchset, but obviously something got missed. I will follow up on our
> end to see if we see any regressions.
>
> I must say, though, that 512 entries is a pretty small ring size to use
> for 100G traffic. The slightest stall would cause those rings to
> overflow. What is perf like at other ring sizes, e.g. 1k or 2k?
>
> /Bruce
>
> > -----Original Message-----
> > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > Sent: Thursday, July 3, 2025 8:03 AM
> > To: Manit Mahajan <mmahajan@iol.unh.edu>
> > Cc: Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org;
> > Richardson, Bruce <bruce.richardson@intel.com>; Wathsala Wathawana
> > Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek
> > <Paul.Szczepanek@arm.com>
> > Subject: Re: Intel E810 Performance Regression - ARM Grace Server
> >
> > + Wathsala, Paul
> >
> > > On Jul 2, 2025, at 10:09 PM, Manit Mahajan <mmahajan@iol.unh.edu> wrote:
> > >
> > > Hi, we have an update about the single-core forwarding test on the
> > > ARM Grace server with the E810 100G ice card. There was an Intel PMD
> > > series merged a week ago which had some performance failures while
> > > it was going through CI:
> > > https://patches.dpdk.org/project/dpdk/patch/01c94afcb0b1c2795c031afc872a8faf3f0db2b5.1749229651.git.anatoly.burakov@intel.com/
> > >
> > > and: http://mails.dpdk.org/archives/test-report/2025-June/883654.html
> > >
> > > As you can see, it causes roughly a 6% decrease in packets forwarded
> > > in the single-core forwarding test with 64-byte frames and 512
> > > txd/rxd. The delta tolerance on the single-core forwarding test is
> > > 5%, so a 6% reduction in Mpps forwarded is a failure.
> > >
> > > This was merged into mainline 6 days ago, which is why some failures
> > > started to come in this week for the E810 Grace test.
> > >
> > > To double-check this, I checked DPDK out at:
> > >
> > > test/event: fix event vector adapter timeouts
> > > (2eca0f4cd5daf6cd54b8705f6f76f3003c923912), which directly precedes
> > > the Intel PMD patch series, ran the test, and it forwarded the
> > > pre-regression Mpps that we expected.
> > >
> > > Then I checked out net/intel: add common Tx mbuf recycle
> > > (f5fd081c86ae415515ab55cbacf10c9c50536ca1), ran the test, and it had
> > > the 6% reduction in Mpps forwarded.
> > >
> > > Another thing to note is that, regrettably, the ARM Grace E810 test
> > > did not get run on the v7 (the final version) of this series, which
> > > meant the failure was not displayed on that version, and that is
> > > probably why it was merged. We will look back into our job history
> > > and see why this test failed to report.
> > >
> > > Please let me know if you have any questions about the test, the
> > > testbed environment info, or anything else.
> >
> > Thanks Manit for looking into this. Adding a few folks from Arm to
> > follow up.
> >
> > > Thanks,
> > > Manit Mahajan
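As a side note, Manit's two-checkout comparison extends naturally to a
full bisection over the series; a sketch, using the known-good and
known-bad commits from his report (check_mpps.sh is hypothetical,
standing in for a build-and-measure step that exits non-zero when Mpps
falls more than 5% below the baseline):

    git bisect start
    git bisect bad  f5fd081c86ae415515ab55cbacf10c9c50536ca1  # net/intel: add common Tx mbuf recycle
    git bisect good 2eca0f4cd5daf6cd54b8705f6f76f3003c923912  # test/event: fix event vector adapter timeouts
    git bisect run ./check_mpps.sh

This is effectively what Manit did by hand when he landed on
d1a350c089e0 as the first bad commit.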