* [PATCH] examples/l3fwd: optimize packet prefetch
From: Dengdui Huang @ 2024-12-25 7:53 UTC
To: dev
Cc: wathsala.vithanage, stephen, liuyonglong, fengchengwen, haijie1,
lihuisong
The prefetch window depends on the hardware platform, so the current prefetch
policy may not be applicable to all platforms. In most cases, the number of
packets received by an Rx burst is small (64 is used in most performance
reports). In l3fwd, the burst size cannot exceed 512. Therefore, prefetching
all packets before processing can achieve better performance.
Signed-off-by: Dengdui Huang <huangdengdui@huawei.com>
---
examples/l3fwd/l3fwd_lpm_neon.h | 42 ++++-----------------------------
1 file changed, 5 insertions(+), 37 deletions(-)
diff --git a/examples/l3fwd/l3fwd_lpm_neon.h b/examples/l3fwd/l3fwd_lpm_neon.h
index 3c1f827424..0b51782b8c 100644
--- a/examples/l3fwd/l3fwd_lpm_neon.h
+++ b/examples/l3fwd/l3fwd_lpm_neon.h
@@ -91,53 +91,21 @@ l3fwd_lpm_process_packets(int nb_rx, struct rte_mbuf **pkts_burst,
const int32_t k = RTE_ALIGN_FLOOR(nb_rx, FWDSTEP);
const int32_t m = nb_rx % FWDSTEP;
- if (k) {
- for (i = 0; i < FWDSTEP; i++) {
- rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i],
- void *));
- }
- for (j = 0; j != k - FWDSTEP; j += FWDSTEP) {
- for (i = 0; i < FWDSTEP; i++) {
- rte_prefetch0(rte_pktmbuf_mtod(
- pkts_burst[j + i + FWDSTEP],
- void *));
- }
+ /* The number of packets is small. Prefetch all packets. */
+ for (i = 0; i < nb_rx; i++)
+ rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i], void *));
+ if (k) {
+ for (j = 0; j != k; j += FWDSTEP) {
processx4_step1(&pkts_burst[j], &dip, &ipv4_flag);
processx4_step2(qconf, dip, ipv4_flag, portid,
&pkts_burst[j], &dst_port[j]);
if (do_step3)
processx4_step3(&pkts_burst[j], &dst_port[j]);
}
-
- processx4_step1(&pkts_burst[j], &dip, &ipv4_flag);
- processx4_step2(qconf, dip, ipv4_flag, portid, &pkts_burst[j],
- &dst_port[j]);
- if (do_step3)
- processx4_step3(&pkts_burst[j], &dst_port[j]);
-
- j += FWDSTEP;
}
if (m) {
- /* Prefetch last up to 3 packets one by one */
- switch (m) {
- case 3:
- rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j],
- void *));
- j++;
- /* fallthrough */
- case 2:
- rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j],
- void *));
- j++;
- /* fallthrough */
- case 1:
- rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j],
- void *));
- j++;
- }
- j -= m;
/* Classify last up to 3 packets one by one */
switch (m) {
case 3:
--
2.33.0
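
For readers without a DPDK tree at hand, the control flow that the patch leaves
behind can be sketched in plain C. This is only an illustration under stated
assumptions, not the l3fwd code: struct pkt, prefetch0() and classify() are
hypothetical stand-ins for struct rte_mbuf, rte_prefetch0() and the
processx4_* steps, FWDSTEP is assumed to be 4 (as the processx4_* names
suggest), and __builtin_prefetch is the GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

#define FWDSTEP 4 /* assumed group size, mirroring the processx4_* helpers */

/* Hypothetical stand-in for struct rte_mbuf; only the payload matters here. */
struct pkt {
	uint8_t data[64];
};

/* Stand-in for rte_prefetch0(); __builtin_prefetch is a GCC/Clang builtin. */
static inline void prefetch0(const void *p)
{
	__builtin_prefetch(p, 0, 3);
}

/* Stand-in for the per-packet classification work: reads one payload byte. */
static inline uint32_t classify(const struct pkt *p)
{
	return p->data[0];
}

/*
 * Shape of the patched loop: prefetch the whole burst first, then process
 * k packets in groups of FWDSTEP and the remaining m packets one by one.
 */
static uint32_t process_prefetch_all(struct pkt **burst, int nb_rx)
{
	uint32_t acc = 0;
	int i, j, k;

	assert(nb_rx >= 0);
	k = nb_rx - nb_rx % FWDSTEP; /* floor nb_rx to a multiple of FWDSTEP */

	/* Issue every prefetch up front, as the patch does. */
	for (i = 0; i < nb_rx; i++)
		prefetch0(burst[i]->data);

	/* Full groups of FWDSTEP packets. */
	for (j = 0; j != k; j += FWDSTEP)
		for (i = 0; i < FWDSTEP; i++)
			acc += classify(burst[j + i]);

	/* Tail of up to FWDSTEP - 1 packets. */
	for (; j < nb_rx; j++)
		acc += classify(burst[j]);

	return acc;
}
```

The sketch only shows where the prefetches are issued relative to the work;
whether issuing them all up front helps depends on the platform and the burst
size, which is the question the thread discusses.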
* Re: [PATCH] examples/l3fwd: optimize packet prefetch
From: Stephen Hemminger @ 2024-12-25 21:21 UTC
To: Dengdui Huang
Cc: dev, wathsala.vithanage, liuyonglong, fengchengwen, haijie1, lihuisong
On Wed, 25 Dec 2024 15:53:02 +0800
Dengdui Huang <huangdengdui@huawei.com> wrote:
> From: Dengdui Huang <huangdengdui@huawei.com>
> To: <dev@dpdk.org>
> CC: <wathsala.vithanage@arm.com>, <stephen@networkplumber.org>, <liuyonglong@huawei.com>, <fengchengwen@huawei.com>, <haijie1@huawei.com>, <lihuisong@huawei.com>
> Subject: [PATCH] examples/l3fwd: optimize packet prefetch
> Date: Wed, 25 Dec 2024 15:53:02 +0800
> X-Mailer: git-send-email 2.33.0
>
> The prefetch window depending on the hardware platform. The current prefetch
> policy may not be applicable to all platforms. In most cases, the number of
> packets received by Rx burst is small (64 is used in most performance reports).
> In L3fwd, the maximum value cannot exceed 512. Therefore, prefetching all
> packets before processing can achieve better performance.
>
> Signed-off-by: Dengdui Huang <huangdengdui@huawei.com>
> ---
I think VPP had a good description of how to unroll and deal with prefetch.
With larger burst sizes you don't want to prefetch the whole burst.
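
One common way to avoid prefetching the whole burst is to keep the prefetches a
fixed distance ahead of the packet being processed. The following is a minimal
generic sketch of that bounded-lookahead pattern, not VPP's or DPDK's actual
code: struct buf, prefetch_read(), touch() and the lookahead parameter are
hypothetical names chosen for illustration, and __builtin_prefetch is the
GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packet buffer; stands in for an mbuf payload. */
struct buf {
	uint8_t data[64];
};

/* Read-prefetch hint; __builtin_prefetch is a GCC/Clang builtin. */
static inline void prefetch_read(const void *p)
{
	__builtin_prefetch(p, 0, 3);
}

/* Stand-in for per-packet work: reads one payload byte. */
static inline uint32_t touch(const struct buf *b)
{
	return b->data[0];
}

/*
 * Process a burst while keeping at most `lookahead` prefetches in flight
 * ahead of the current packet, regardless of how large the burst is.
 */
static uint32_t process_with_lookahead(struct buf **burst, int nb_rx,
				       int lookahead)
{
	uint32_t acc = 0;
	int i;

	assert(nb_rx >= 0 && lookahead >= 0);

	/* Warm-up: prefetch the first `lookahead` packets, or fewer. */
	for (i = 0; i < lookahead && i < nb_rx; i++)
		prefetch_read(burst[i]->data);

	for (i = 0; i < nb_rx; i++) {
		/* Stay `lookahead` packets ahead while any remain. */
		if (i + lookahead < nb_rx)
			prefetch_read(burst[i + lookahead]->data);
		acc += touch(burst[i]);
	}

	return acc;
}
```

With a small lookahead the number of outstanding prefetches stays bounded even
for a 512-packet burst; the right distance is platform-dependent and would need
to be measured.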