Subject: RE: [PATCH v2 2/2] lpm: add a scalar version of lookupx4 function
Date: Mon, 30 May 2022 10:00:34 +0200
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D870C0@smartserver.smartshare.dk>
References: <20220510115824.457885-1-kda@semihalf.com> <20220527181822.716758-1-kda@semihalf.com> <20220527181822.716758-2-kda@semihalf.com> <20220527131520.23d9f544@hermes.local>
From: Morten Brørup
To: "Bruce Richardson", "Stephen Hemminger"
Cc: "Stanislaw Kardach", "Vladimir Medvedkin", "Michal Mazurek", "Frank Zhao", "Sam Grove"
List-Id: DPDK patches and discussions

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Monday, 30 May 2022 09.52
>
> On Fri, May 27, 2022 at 01:15:20PM -0700, Stephen Hemminger wrote:
> > On Fri, 27 May 2022 20:18:22 +0200
> > Stanislaw Kardach wrote:
> >
> > > +static inline void
> > > +rte_lpm_lookupx4(const struct rte_lpm *lpm, xmm_t ip, uint32_t hop[4],
> > > +	uint32_t defv)
> > > +{
> > > +	uint32_t nh;
> > > +	int i, ret;
> > > +
> > > +	for (i = 0; i < 4; i++) {
> > > +		ret = rte_lpm_lookup(lpm, ((rte_xmm_t)ip).u32[i], &nh);
> > > +		hop[i] = (ret == 0) ? nh : defv;
> > > +	}
> > > +}
> >
> > For performance, manually unroll the loop.
>
> Given a constant 4x iterations, will compilers not unroll this
> automatically. I think the loop is a little clearer if it can be kept
>
> /Bruce

If in doubt, add this and look at the assembler output:

#define REVIEW_INLINE_FUNCTIONS 1

#if REVIEW_INLINE_FUNCTIONS /* For compiler output review purposes only. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmissing-prototypes"
void review_rte_lpm_lookupx4(const struct rte_lpm *lpm, xmm_t ip,
		uint32_t hop[4], uint32_t defv)
{
	rte_lpm_lookupx4(lpm, ip, hop, defv);
}
#pragma GCC diagnostic pop
#endif /* REVIEW_INLINE_FUNCTIONS */
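
For anyone who wants to try the comparison without a DPDK build tree, here is a minimal standalone sketch of the same idea. Everything prefixed toy_ is hypothetical and not part of DPDK: toy_lookup() is only a stand-in for rte_lpm_lookup() (returns 0 and sets the next hop on a hit, nonzero on a miss), and a plain uint32_t[4] replaces xmm_t. It puts the loop form and a manually unrolled form behind non-inline wrappers so that both bodies show up in the assembler output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_lpm_lookup(): pretends 10.0.0.0/8 is
 * routed and uses the low byte of the address as the next hop. */
static inline int
toy_lookup(uint32_t ip, uint32_t *nh)
{
	if ((ip >> 24) == 10) {
		*nh = ip & 0xff;
		return 0;
	}
	return -1;
}

/* Loop form, mirroring the shape of the function in the patch. */
static inline void
toy_lookupx4_loop(const uint32_t ip[4], uint32_t hop[4], uint32_t defv)
{
	uint32_t nh = 0; /* initialized only to keep this toy warning-free */
	int i, ret;

	for (i = 0; i < 4; i++) {
		ret = toy_lookup(ip[i], &nh);
		hop[i] = (ret == 0) ? nh : defv;
	}
}

/* Manually unrolled form, as suggested in the review. */
static inline void
toy_lookupx4_unrolled(const uint32_t ip[4], uint32_t hop[4], uint32_t defv)
{
	uint32_t nh = 0;

	hop[0] = (toy_lookup(ip[0], &nh) == 0) ? nh : defv;
	hop[1] = (toy_lookup(ip[1], &nh) == 0) ? nh : defv;
	hop[2] = (toy_lookup(ip[2], &nh) == 0) ? nh : defv;
	hop[3] = (toy_lookup(ip[3], &nh) == 0) ? nh : defv;
}

/* Prototypes, so the wrappers below do not trip -Wmissing-prototypes. */
void review_toy_lookupx4_loop(const uint32_t ip[4], uint32_t hop[4],
		uint32_t defv);
void review_toy_lookupx4_unrolled(const uint32_t ip[4], uint32_t hop[4],
		uint32_t defv);
void toy_selfcheck(void);

/* Non-inline wrappers: the compiler must emit a body for each one. */
void
review_toy_lookupx4_loop(const uint32_t ip[4], uint32_t hop[4], uint32_t defv)
{
	toy_lookupx4_loop(ip, hop, defv);
}

void
review_toy_lookupx4_unrolled(const uint32_t ip[4], uint32_t hop[4],
		uint32_t defv)
{
	toy_lookupx4_unrolled(ip, hop, defv);
}

/* Sanity check that both forms compute the same next hops. */
void
toy_selfcheck(void)
{
	const uint32_t ip[4] = { 0x0a000001u, 0xc0a80001u, 0x0a0000ffu, 0u };
	uint32_t a[4], b[4];
	int i;

	review_toy_lookupx4_loop(ip, a, 99);
	review_toy_lookupx4_unrolled(ip, b, 99);
	for (i = 0; i < 4; i++)
		assert(a[i] == b[i]);
}
```

Compiling this with something like "gcc -O2 -S" and comparing the two review_toy_lookupx4_* bodies should show whether the loop form still contains a branch back to the loop head. The result depends on the compiler version, target and optimization level, so it is worth checking with the same flags the real build uses.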