DPDK patches and discussions
From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
To: huangdengdui <huangdengdui@huawei.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "wathsala.vithanage@arm.com" <wathsala.vithanage@arm.com>,
	"stephen@networkplumber.org" <stephen@networkplumber.org>,
	liuyonglong <liuyonglong@huawei.com>,
	Fengchengwen <fengchengwen@huawei.com>,
	haijie <haijie1@huawei.com>,
	"lihuisong (C)" <lihuisong@huawei.com>
Subject: RE: [PATCH] examples/l3fwd: optimize packet prefetch
Date: Wed, 8 Jan 2025 13:42:20 +0000
Message-ID: <a966e66c538946d9b22ed337c77d9b25@huawei.com>
In-Reply-To: <20241225075302.353013-1-huangdengdui@huawei.com>



> 
> The prefetch window depends on the hardware platform. The current prefetch
> policy may not be applicable to all platforms. In most cases, the number of
> packets received by an Rx burst is small (64 is used in most performance reports).
> In l3fwd, the burst size cannot exceed 512. Therefore, prefetching all
> packets before processing can achieve better performance.

As you mentioned, 'prefetch' behavior differs a lot from one HW platform to another.
So it could easily be that the changes you are suggesting will cause a performance
boost on one platform and a degradation on another.
In fact, right now 'prefetch' usage in l3fwd is a bit of a mess:
- l3fwd_lpm_neon.h uses FWDSTEP as a prefetch window
- l3fwd_fib.c uses FIB_PREFETCH_OFFSET for that purpose
- the rest of the code either uses PREFETCH_OFFSET or doesn't use 'prefetch' at all

Probably what we need here is some unified approach:
a prefetch_window_size, configurable at run-time, that all code paths will obey.
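
To make the idea a bit more concrete, below is a rough sketch of a burst loop
that obeys a run-time prefetch window. It is not l3fwd code: 'struct pkt',
prefetch0(), process_burst() and count_pkt() are hypothetical stand-ins, and
__builtin_prefetch() (a GCC/Clang builtin) is used in place of rte_prefetch0()
so that the snippet builds without DPDK. A real version would presumably
prefetch rte_pktmbuf_mtod(pkts_burst[i], void *) instead.

```c
#include <stdint.h>

/* Stand-in for struct rte_mbuf; only the packet data matters here. */
struct pkt {
	uint8_t data[64];
};

/* Hypothetical per-packet callback, e.g. one lookup + forward step. */
typedef void (*pkt_handler_t)(struct pkt *p, void *ctx);

/* Stand-in for rte_prefetch0(): read access, keep in all cache levels. */
static inline void
prefetch0(const void *p)
{
	__builtin_prefetch(p, 0, 3);
}

/*
 * Process a burst while keeping up to 'window' packets prefetched ahead.
 *   window == 0     : no prefetch at all
 *   window >= nb_rx : prefetch the whole burst up front
 * Returns the number of prefetches issued (only to make the sketch testable).
 */
static uint32_t
process_burst(struct pkt **pkts, uint32_t nb_rx, uint32_t window,
	      pkt_handler_t handle, void *ctx)
{
	uint32_t i, n_prefetch = 0;
	const uint32_t ahead = window < nb_rx ? window : nb_rx;

	/* Warm-up: prefetch the first 'window' packets of the burst. */
	for (i = 0; i < ahead; i++) {
		prefetch0(pkts[i]->data);
		n_prefetch++;
	}

	for (i = 0; i < nb_rx; i++) {
		/* Steady state: stay 'window' packets ahead of packet i.
		 * 'window < nb_rx - i' avoids overflow of 'i + window'. */
		if (window != 0 && window < nb_rx - i) {
			prefetch0(pkts[i + window]->data);
			n_prefetch++;
		}
		handle(pkts[i], ctx);
	}

	return n_prefetch;
}

/* Example handler: just counts the packets it sees. */
static void
count_pkt(struct pkt *p, void *ctx)
{
	(void)p;
	*(uint32_t *)ctx += 1;
}
```

With something like this, a window equal to or larger than the burst size
gives the "prefetch everything first" behavior from the patch, a small window
gives a sliding window similar to what the scalar paths do today, and 0 turns
prefetching off, so each platform could pick the value that works best for it.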

> Signed-off-by: Dengdui Huang <huangdengdui@huawei.com>
> ---
>  examples/l3fwd/l3fwd_lpm_neon.h | 42 ++++-----------------------------
>  1 file changed, 5 insertions(+), 37 deletions(-)
> 
> diff --git a/examples/l3fwd/l3fwd_lpm_neon.h b/examples/l3fwd/l3fwd_lpm_neon.h
> index 3c1f827424..0b51782b8c 100644
> --- a/examples/l3fwd/l3fwd_lpm_neon.h
> +++ b/examples/l3fwd/l3fwd_lpm_neon.h
> @@ -91,53 +91,21 @@ l3fwd_lpm_process_packets(int nb_rx, struct rte_mbuf **pkts_burst,
>  	const int32_t k = RTE_ALIGN_FLOOR(nb_rx, FWDSTEP);
>  	const int32_t m = nb_rx % FWDSTEP;
> 
> -	if (k) {
> -		for (i = 0; i < FWDSTEP; i++) {
> -			rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i],
> -							void *));
> -		}
> -		for (j = 0; j != k - FWDSTEP; j += FWDSTEP) {
> -			for (i = 0; i < FWDSTEP; i++) {
> -				rte_prefetch0(rte_pktmbuf_mtod(
> -						pkts_burst[j + i + FWDSTEP],
> -						void *));
> -			}
> +	/* The number of packets is small. Prefetch all packets. */
> +	for (i = 0; i < nb_rx; i++)
> +		rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i], void *));
> 
> +	if (k) {
> +		for (j = 0; j != k; j += FWDSTEP) {
>  			processx4_step1(&pkts_burst[j], &dip, &ipv4_flag);
>  			processx4_step2(qconf, dip, ipv4_flag, portid,
>  					&pkts_burst[j], &dst_port[j]);
>  			if (do_step3)
>  				processx4_step3(&pkts_burst[j], &dst_port[j]);
>  		}
> -
> -		processx4_step1(&pkts_burst[j], &dip, &ipv4_flag);
> -		processx4_step2(qconf, dip, ipv4_flag, portid, &pkts_burst[j],
> -				&dst_port[j]);
> -		if (do_step3)
> -			processx4_step3(&pkts_burst[j], &dst_port[j]);
> -
> -		j += FWDSTEP;
>  	}
> 
>  	if (m) {
> -		/* Prefetch last up to 3 packets one by one */
> -		switch (m) {
> -		case 3:
> -			rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j],
> -							void *));
> -			j++;
> -			/* fallthrough */
> -		case 2:
> -			rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j],
> -							void *));
> -			j++;
> -			/* fallthrough */
> -		case 1:
> -			rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j],
> -							void *));
> -			j++;
> -		}
> -		j -= m;
>  		/* Classify last up to 3 packets one by one */
>  		switch (m) {
>  		case 3:
> --
> 2.33.0


