Hi Bruce,

I looked at the commit and I see that it renames RTE_LIBRTE_ICE_16BYTE_RX_DESC
to RTE_NET_INTEL_USE_16BYTE_DESC. The test I ran invokes meson setup with
-Dc_args=-DRTE_LIBRTE_ICE_16BYTE_RX_DESC, so I will run another test with the
new flag name.
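For reference, I expect the updated invocation to look roughly like the
following (a sketch only; the build directory name and the compile step are
our harness's conventions, not something taken from the commit itself):

    meson setup build -Dc_args=-DRTE_NET_INTEL_USE_16BYTE_DESC
    ninja -C build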
Thanks,
Manit

On Thu, Jul 3, 2025 at 11:26 AM Patrick Robb wrote:

> Hi Bruce,
>
> When the NIC is E810, the test runs meson setup with the flag
> -Dc_args=-DRTE_LIBRTE_ICE_16BYTE_RX_DESC
>
> I think that is what you mean? Is this setup correct?
>
> On Thu, Jul 3, 2025 at 11:22 AM Richardson, Bruce <bruce.richardson@intel.com> wrote:
>
>> Is the test you are running setting the 16B descriptor flag, and does it
>> need updating to take account of the new flag name?
>>
>> *From:* Manit Mahajan
>> *Sent:* Thursday, July 3, 2025 4:22 PM
>> *To:* Richardson, Bruce
>> *Cc:* Patrick Robb; Nagarahalli, Honnappa <Honnappa.Nagarahalli@arm.com>;
>> Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org; Wathsala
>> Wathawana Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek;
>> Mcnamara, John
>> *Subject:* Re: Intel E810 Performance Regression - ARM Grace Server
>>
>> Hi Bruce,
>>
>> This morning I was able to narrow down the performance issue to a
>> specific commit. I ran performance tests on the following two commits:
>>
>> - d1a350c089e0 – net/ice: rename 16-byte descriptor flag
>> - 4c4b9ce017fe – net/i40e: rename 16-byte descriptor flag
>>
>> The net/i40e commit directly precedes the net/ice commit. I observed a
>> significant drop in mpps beginning with commit d1a350c089e0, confirming
>> that this commit introduced the regression.
>>
>> Thanks,
>> Manit
>>
>> On Thu, Jul 3, 2025 at 9:12 AM Richardson, Bruce <bruce.richardson@intel.com> wrote:
>>
>> Thanks Patrick, I'm planning on checking some performance numbers again
>> on our end too.
>>
>> My thoughts on the ring size: the total number of ring slots across all
>> rings should be enough to ride out an expected stall. Back in the 10G
>> days (a minimum packet arrival interval of ~67 ns), we would use ring
>> sizes of 512 entries, which gave us just short of 35 usec of buffering.
>> Even with a 4k ring, at 100G we only have 27.5 usec of buffering. Now,
>> admittedly, CPUs are faster too, so they should be less likely to stop
>> polling for that long, but they aren't 10x as fast as in the 10G days,
>> so I find a ring size of 512 a little small. For 100G, I would expect 2k
>> to be a reasonable minimum ring size to test with, if testing single
>> queue. Obviously the more queues and cores we test with, the smaller
>> each ring can be, since the arrival rate per ring should be lower.
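>>
>> To make that arithmetic concrete, here is a quick back-of-the-envelope
>> sketch (assuming 64-byte frames, which occupy 84 bytes on the wire once
>> the 8-byte preamble and 12-byte inter-frame gap are counted):
>>
>>     #include <stdio.h>
>>
>>     int main(void)
>>     {
>>         /* 64B frame + 8B preamble + 12B inter-frame gap = 84B = 672 bits */
>>         const double bits_per_pkt = 84 * 8;
>>         const struct { double gbps; unsigned ring; } cfg[] = {
>>             { 10.0, 512 },   /* the old 10G sizing */
>>             { 100.0, 4096 }, /* a 4k ring at 100G */
>>         };
>>
>>         for (unsigned i = 0; i < 2; i++) {
>>             /* bits divided by Gbit/s comes out directly in nanoseconds */
>>             double ns_per_pkt = bits_per_pkt / cfg[i].gbps;
>>             printf("%.0fG, %u slots: %.1f ns/pkt -> %.1f usec of buffering\n",
>>                    cfg[i].gbps, cfg[i].ring, ns_per_pkt,
>>                    cfg[i].ring * ns_per_pkt / 1000.0);
>>         }
>>         return 0;
>>     }
>>
>> That prints 67.2 ns/pkt and ~34.4 usec of buffering for 512 slots at
>> 10G, versus 6.7 ns/pkt and ~27.5 usec for 4096 slots at 100G, matching
>> the numbers above.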
>>
>> /Bruce
>>
>> *From:* Patrick Robb
>> *Sent:* Thursday, July 3, 2025 1:53 PM
>> *To:* Richardson, Bruce
>> *Cc:* Nagarahalli, Honnappa; Manit Mahajan; Burakov, Anatoly
>> <anatoly.burakov@intel.com>; ci@dpdk.org; Wathsala Wathawana Vithanage
>> <wathsala.vithanage@arm.com>; Paul Szczepanek
>> *Subject:* Re: Intel E810 Performance Regression - ARM Grace Server
>>
>> Hi Bruce,
>>
>> Manit can identify the specific commit this morning.
>>
>> You raise a good point about the descriptor count. It is worth assessing
>> performance with a broader set of descriptor counts and deciding which
>> test configurations will yield helpful results for developers going
>> forward. My understanding is that we want to test with descriptor counts
>> appropriate for the given traffic flow, rather than shaping the traffic
>> to fit the descriptor counts. We will gather more information this
>> morning and share it with you.
>>
>> On Thu, Jul 3, 2025 at 4:43 AM Richardson, Bruce <bruce.richardson@intel.com> wrote:
>>
>> Hi Manit,
>>
>> Can you identify which patch exactly within the series is causing the
>> regression? We were not expecting performance to change with the
>> patchset, but obviously something got missed. I will follow up on our
>> end to see if we see any regressions.
>>
>> I must say, though, that 512 entries is a pretty small ring size to use
>> for 100G traffic. The slightest stall would cause those rings to
>> overflow. What is perf like at other ring sizes, e.g. 1k or 2k?
>>
>> /Bruce
>>
>> > -----Original Message-----
>> > From: Honnappa Nagarahalli
>> > Sent: Thursday, July 3, 2025 8:03 AM
>> > To: Manit Mahajan
>> > Cc: Burakov, Anatoly <anatoly.burakov@intel.com>; ci@dpdk.org;
>> > Richardson, Bruce <bruce.richardson@intel.com>; Wathsala Wathawana
>> > Vithanage <wathsala.vithanage@arm.com>; Paul Szczepanek
>> > Subject: Re: Intel E810 Performance Regression - ARM Grace Server
>> >
>> > + Wathsala, Paul
>> >
>> > > On Jul 2, 2025, at 10:09 PM, Manit Mahajan wrote:
>> > >
>> > > Hi, we have an update about the single core forwarding test on the
>> > > ARM Grace server with the E810 100G Ice card. An Intel PMD series
>> > > merged a week ago had some performance failures while it was going
>> > > through the CI:
>> > > https://patches.dpdk.org/project/dpdk/patch/01c94afcb0b1c2795c031afc872a8faf3f0db2b5.1749229651.git.anatoly.burakov@intel.com/
>> > >
>> > > and: http://mails.dpdk.org/archives/test-report/2025-June/883654.html
>> > >
>> > > As you can see, it causes roughly a 6% decrease in packets forwarded
>> > > in the single core forwarding test with 64-byte frames and 512
>> > > txd/rxd. The delta tolerance on the single core forwarding test is
>> > > 5%, so a 6% reduction in MPPS forwarded is a failure.
>> > >
>> > > This was merged into mainline 6 days ago, which is why some failures
>> > > started to come in this week for the E810 Grace test.
>> > >
>> > > To double-check this, I checked DPDK out at:
>> > >
>> > > test/event: fix event vector adapter timeouts
>> > > (2eca0f4cd5daf6cd54b8705f6f76f3003c923912), which directly precedes
>> > > the Intel PMD patch series, ran the test, and it forwarded the
>> > > pre-regression MPPS that we expected.
>> > >
>> > > Then I checked out net/intel: add common Tx mbuf recycle
>> > > (f5fd081c86ae415515ab55cbacf10c9c50536ca1), ran the test, and it had
>> > > the 6% reduction in MPPS forwarded.
>> > >
>> > > Another thing to note: regrettably, the ARM Grace E810 test did not
>> > > get run on the v7 (the final version) of this series, which meant
>> > > the failure was not displayed on that version, and that is probably
>> > > why it was merged. We will look back into our job history and see
>> > > why this test failed to report.
>> > >
>> > > Please let me know if you have any questions about the test, the
>> > > testbed environment info, or anything else.
>> > Thanks Manit for looking into this. Adding a few folks from Arm to
>> > follow up.
>> >
>> > > Thanks,
>> > > Manit Mahajan
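>> > >
>> > > For reference, the single core forwarding test described above maps
>> > > to roughly this kind of testpmd invocation (a sketch only; the core
>> > > list, memory channel count, and PCI address are illustrative
>> > > placeholders, not values taken from this thread):
>> > >
>> > >     dpdk-testpmd -l 0-1 -n 4 -a 0000:01:00.0 -- \
>> > >         --nb-cores=1 --rxq=1 --txq=1 --rxd=512 --txd=512
>> > >
>> > > The --rxd/--txd values correspond to the 512 txd/rxd configuration
>> > > the regression was reported against.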