DPDK patches and discussions
 help / color / mirror / Atom feed
From: Bruce Richardson <bruce.richardson@intel.com>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>,
	Stanislaw Kardach <kda@semihalf.com>,
	Vladimir Medvedkin <vladimir.medvedkin@intel.com>,
	Michal Mazurek <maz@semihalf.com>,
	dev@dpdk.org, Frank Zhao <Frank.Zhao@starfivetech.com>,
	Sam Grove <sam.grove@sifive.com>,
	mw@semihalf.com, upstream@semihalf.com
Subject: Re: [PATCH v2 2/2] lpm: add a scalar version of lookupx4 function
Date: Mon, 30 May 2022 11:42:00 +0100	[thread overview]
Message-ID: <YpSfeORa9Y3lLXxx@bricha3-MOBL.ger.corp.intel.com> (raw)
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D870C0@smartserver.smartshare.dk>

On Mon, May 30, 2022 at 10:00:34AM +0200, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Monday, 30 May 2022 09.52
> > 
> > On Fri, May 27, 2022 at 01:15:20PM -0700, Stephen Hemminger wrote:
> > > On Fri, 27 May 2022 20:18:22 +0200
> > > Stanislaw Kardach <kda@semihalf.com> wrote:
> > >
> > > > +static inline void
> > > > +rte_lpm_lookupx4(const struct rte_lpm *lpm, xmm_t ip, uint32_t
> > hop[4],
> > > > +		uint32_t defv)
> > > > +{
> > > > +	uint32_t nh;
> > > > +	int i, ret;
> > > > +
> > > > +	for (i = 0; i < 4; i++) {
> > > > +		ret = rte_lpm_lookup(lpm, ((rte_xmm_t)ip).u32[i], &nh);
> > > > +		hop[i] = (ret == 0) ? nh : defv;
> > > > +	}
> > > > +}
> > >
> > > For performance, manually unroll the loop.
> > 
> > Given a constant 4x iterations, will compilers not unroll this
> > automatically. I think the loop is a little clearer if it can be kept
> > 
> > /Bruce
> 
> If in doubt, add this and look at the assembler output:
> 
> #define REVIEW_INLINE_FUNCTIONS 1
> 
> #if REVIEW_INLINE_FUNCTIONS /* For compiler output review purposes only. */
> #pragma GCC diagnostic push
> #pragma GCC diagnostic ignored "-Wmissing-prototypes"
> void review_rte_lpm_lookupx4(const struct rte_lpm *lpm, xmm_t ip, uint32_t hop[4], uint32_t defv)
> {
> 	rte_lpm_lookupx4(lpm, ip, hop, defv);
> }
> #pragma GCC diagnostic pop
> #endif /* REVIEW_INLINE_FUNCTIONS */
> 

Used godbolt.org to check and indeed the function is not unrolled.
(Gcc 11.2, with flags "-O3 -march=icelake-server").

Manually unrolling changes the assembly generated in interesting ways. For
example, it appears to generate more cmov-type instructions for the
miss/default-value case rather than using branches as in the looped
version. Whether this is better or not may depend upon usecase - if one
expects most lpm lookup entries to hit, then having (predictable) branches
may well be cheaper.

In any case, I'll withdraw any object to unrolling, but I'm still not
convinced it's necessary.

/Bruce

  reply	other threads:[~2022-05-30 10:42 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-10 11:58 [PATCH 1/1] " Stanislaw Kardach
2022-05-19 17:02 ` Medvedkin, Vladimir
2022-05-24 16:28   ` Stanisław Kardach
2022-05-27 11:16     ` Stanisław Kardach
2022-05-27 13:16       ` Medvedkin, Vladimir
2022-05-27 18:18 ` [PATCH v2 1/2] lpm: add const to lpm arg of rte_lpm_lookup Stanislaw Kardach
2022-05-27 18:18   ` [PATCH v2 2/2] lpm: add a scalar version of lookupx4 function Stanislaw Kardach
2022-05-27 20:15     ` Stephen Hemminger
2022-05-30  7:52       ` Bruce Richardson
2022-05-30  8:00         ` Morten Brørup
2022-05-30 10:42           ` Bruce Richardson [this message]
2022-05-30 11:20             ` Stanisław Kardach
2022-05-30 12:46               ` Bruce Richardson
2022-05-30 18:24   ` [PATCH v3 1/2] lpm: add const to lpm arg of rte_lpm_lookup Stanislaw Kardach
2022-05-30 18:24     ` [PATCH v3 2/2] lpm: add a scalar version of lookupx4 function Stanislaw Kardach
2022-06-01  9:41       ` Medvedkin, Vladimir
2022-06-01 10:32         ` Stanisław Kardach
2022-05-30 20:38     ` [PATCH v3 1/2] lpm: add const to lpm arg of rte_lpm_lookup Stephen Hemminger
2022-06-01  9:35     ` Medvedkin, Vladimir

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YpSfeORa9Y3lLXxx@bricha3-MOBL.ger.corp.intel.com \
    --to=bruce.richardson@intel.com \
    --cc=Frank.Zhao@starfivetech.com \
    --cc=dev@dpdk.org \
    --cc=kda@semihalf.com \
    --cc=maz@semihalf.com \
    --cc=mb@smartsharesystems.com \
    --cc=mw@semihalf.com \
    --cc=sam.grove@sifive.com \
    --cc=stephen@networkplumber.org \
    --cc=upstream@semihalf.com \
    --cc=vladimir.medvedkin@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).