Suggest testing with both flag names set, just for safety and backward compatibility. Having the old flag still defined is harmless.
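As a rough sketch of what "both flag names set" could look like in the CI job, the helper below builds a meson setup command that defines the old and the new macro together. The build directory name `build` and the helper name `meson_setup_cmd` are made up for illustration, and the sketch assumes meson splits a space-separated `c_args` value into separate compiler arguments; the real CI script may pass its options differently.

```python
import shlex

# Macro names taken from this thread: the pre-rename and post-rename flags.
OLD_FLAG = "RTE_LIBRTE_ICE_16BYTE_RX_DESC"
NEW_FLAG = "RTE_NET_INTEL_USE_16BYTE_DESC"


def meson_setup_cmd(build_dir="build", macros=(OLD_FLAG, NEW_FLAG)):
    """Return an argv list for `meson setup` that defines every macro in `macros`.

    Hypothetical helper: `build_dir` and the argument layout are assumptions,
    not the actual CI invocation.
    """
    c_args = " ".join(f"-D{macro}" for macro in macros)
    return ["meson", "setup", build_dir, f"-Dc_args={c_args}"]


if __name__ == "__main__":
    # Print a shell-quoted form that can be pasted into a terminal.
    print(shlex.join(meson_setup_cmd()))
```

Defining both macros means the 16-byte descriptor path is requested whichever name the checked-out commit looks for, so the same job can be run on commits before and after the rename.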

 

From: Manit Mahajan <mmahajan@iol.unh.edu>
Sent: Thursday, July 3, 2025 4:31 PM
To: Patrick Robb <probb@iol.unh.edu>
Cc: Richardson, Bruce <bruce.richardson@intel.com>; Nagarahalli, Honnappa <Honnappa.Nagarahalli@arm.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org; Wathsala Wathawana Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek <Paul.Szczepanek@arm.com>; Mcnamara, John <john.mcnamara@intel.com>
Subject: Re: Intel E810 Performance Regression - ARM Grace Server

 

Hi Bruce, 

I looked at the commit and I see that it changes RTE_LIBRTE_ICE_16BYTE_RX_DESC to RTE_NET_INTEL_USE_16BYTE_DESC. The test I ran runs the meson setup with flag -Dc_args=-DRTE_LIBRTE_ICE_16BYTE_RX_DESC. I will run another test with the new flag name. 

Thanks,
Manit

 

On Thu, Jul 3, 2025 at 11:26AM Patrick Robb <probb@iol.unh.edu> wrote:

Hi Bruce, 

 

When the NIC is E810, the test runs the meson setup with flag -Dc_args=-DRTE_LIBRTE_ICE_16BYTE_RX_DESC

I think that is what you mean? Is this setup correct?

 

On Thu, Jul 3, 2025 at 11:22AM Richardson, Bruce <bruce.richardson@intel.com> wrote:

Is the test you are running setting the 16B descriptor flag, and does it need updating to take account of the new flag name?

 

From: Manit Mahajan <mmahajan@iol.unh.edu>
Sent: Thursday, July 3, 2025 4:22 PM
To: Richardson, Bruce <bruce.richardson@intel.com>
Cc: Patrick Robb <probb@iol.unh.edu>; Nagarahalli, Honnappa <Honnappa.Nagarahalli@arm.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org; Wathsala Wathawana Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek <Paul.Szczepanek@arm.com>; Mcnamara, John <john.mcnamara@intel.com>
Subject: Re: Intel E810 Performance Regression - ARM Grace Server

 

Hi Bruce, 

This morning, I was able to narrow down the performance issue to a specific commit. I ran performance tests on the following two commits:

  • d1a350c089e0 – net/ice: rename 16-byte descriptor flag
  • 4c4b9ce017fe – net/i40e: rename 16-byte descriptor flag

The net/i40e commit directly precedes the net/ice commit. I observed a significant drop in MPPS beginning with commit d1a350c089e0, confirming that this commit introduced the regression.

Thanks,
Manit

 

On Thu, Jul 3, 2025 at 9:12AM Richardson, Bruce <bruce.richardson@intel.com> wrote:

Thanks Patrick, I’m planning on checking some performance numbers again on our end too.

 

My thinking on ring size is that the total number of ring slots across all rings should be enough to ride out an expected stall. So back in the 10G days (minimum packet inter-arrival time of ~67ns), we would use ring sizes of 512 entries, which would give us just short of 35usec of buffering. Even with a ring size of 4k, at 100G we only have 27.5 usec of buffering. Now, admittedly CPUs are faster too, so they should be less likely to stop polling for that amount of time, but they aren’t 10x as fast as in the 10G days, so I find a ring size of 512 a little small. For 100G, I would expect 2k to be a reasonable min ring size to test with, if testing a single queue. Obviously the more queues and cores we test with, the smaller each ring can be, since the arrival rate per-ring should be lower.
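The buffering figures above can be reproduced with a short calculation. This is only a sketch: it assumes 64-byte frames plus 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap), which is what yields the ~67ns figure at 10G; the function names are illustrative.

```python
# Per-frame overhead on the wire: 7B preamble + 1B SFD + 12B inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def packet_interval_ns(link_gbps, frame_bytes=64):
    """Minimum time between back-to-back frames at line rate, in nanoseconds."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    # bits divided by Gbit/s gives nanoseconds directly.
    return bits_on_wire / link_gbps


def ring_buffering_us(ring_entries, link_gbps, frame_bytes=64):
    """How long a full-rate burst takes to fill an empty ring, in microseconds."""
    return ring_entries * packet_interval_ns(link_gbps, frame_bytes) / 1000.0


if __name__ == "__main__":
    for gbps, entries in [(10, 512), (100, 512), (100, 2048), (100, 4096)]:
        print(f"{gbps:>3}G, {entries:>4} entries: "
              f"{ring_buffering_us(entries, gbps):.1f} usec of buffering")
```

With these assumptions, 512 entries at 10G comes to about 34.4 usec and 4096 entries at 100G to about 27.5 usec, in line with the numbers quoted above.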

 

/Bruce

 

From: Patrick Robb <probb@iol.unh.edu>
Sent: Thursday, July 3, 2025 1:53 PM
To: Richardson, Bruce <bruce.richardson@intel.com>
Cc: Nagarahalli, Honnappa <Honnappa.Nagarahalli@arm.com>; Manit Mahajan <mmahajan@iol.unh.edu>; Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org; Wathsala Wathawana Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek <Paul.Szczepanek@arm.com>
Subject: Re: Intel E810 Performance Regression - ARM Grace Server

 

Hi Bruce,

 

Manit can identify the specific commit this morning.

 

You raise a good point about the descriptor count. It is worth us assessing the performance with a broader set of descriptor counts and deciding what set of test configurations will yield helpful results for developers going forward. By my understanding, we want to test with a set of descriptor counts which are basically appropriate for the given traffic flow, not the other way around. We will gather more info this morning and share it back to you.

 

On Thu, Jul 3, 2025 at 4:43AM Richardson, Bruce <bruce.richardson@intel.com> wrote:

Hi Manit,

Can you identify which patch exactly within the series is causing the regression? We were not expecting performance to change with the patchset, but obviously something got missed.
I will follow up on our end to see if we see any regressions.

I must say, though, that 512 entries is a pretty small ring size to use for 100G traffic. The slightest stall would cause those rings to overflow. What is perf like at other ring sizes, e.g. 1k or 2k?

/Bruce


> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Thursday, July 3, 2025 8:03 AM
> To: Manit Mahajan <mmahajan@iol.unh.edu>
> Cc: Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org; Richardson,
> Bruce <bruce.richardson@intel.com>; Wathsala Wathawana Vithanage
> <wathsala.vithanage@arm.com>; Paul Szczepanek
> <Paul.Szczepanek@arm.com>
> Subject: Re: Intel E810 Performance Regression - ARM Grace Server
>
> + Wathsala, Paul
>
> > On Jul 2, 2025, at 10:09PM, Manit Mahajan <mmahajan@iol.unh.edu>
> wrote:
> >
> > Hi, we have an update about the single core forwarding test on the ARM
> Grace server with the E810 100G Ice card. There was an Intel PMD series that
> was merged a week ago which had some performance failures when it was
> going through the CI:
> https://patches.dpdk.org/project/dpdk/patch/01c94afcb0b1c2795c031afc872a8faf3f0db2b5.1749229651.git.anatoly.burakov@intel.com/
> >
> > and: http://mails.dpdk.org/archives/test-report/2025-June/883654.html
> >
> > As you can see it causes roughly a 6% decrease in packets forwarded in the
> single core forwarding test with 64Byte frames and 512 txd/rxd. The delta
> tolerance on the single core forwarding test is 5%, so a 6% reduction in MPPS
> forwarded is a failure.
> >
> > This was merged into mainline 6 days ago, which is why some failures started
> to come in this week for the E810 Grace test.
> >
> > To double check this, in DPDK I checked out:
> >
> > test/event: fix event vector adapter timeouts
> (2eca0f4cd5daf6cd54b8705f6f76f3003c923912), which directly precedes the
> Intel PMD patch series, ran the test, and it forwarded the pre-regression
> MPPS that we expected.
> >
> > Then I checked out net/intel: add common Tx mbuf recycle
> (f5fd081c86ae415515ab55cbacf10c9c50536ca1),
> >
> > ran the test, and it had the 6% reduction in MPPS forwarded.
> >
> > Another thing to note is that regrettably the ARM Grace E810 test did not get
> run on the v7 (the final version) of this series, which meant the failure was not
> displayed on that version and that's probably why it was merged. We will look
> back into our job history and see why this test failed to report.
> >
> > Please let me know if you have any questions about the test, the testbed
> environment info, or anything else.
> Thanks, Manit, for looking into this. Adding a few folks from Arm to follow up.
>
> >
> > Thanks,
> > Manit Mahajan
>