From: Stanisław Kardach
Date: Mon, 30 May 2022 13:20:50 +0200
Subject: Re: [PATCH v2 2/2] lpm: add a scalar version of lookupx4 function
To: Bruce Richardson
Cc: Morten Brørup, Stephen Hemminger, Vladimir Medvedkin, Michal Mazurek, dev, Frank Zhao, Sam Grove, Marcin Wojtas, upstream@semihalf.com

On Mon, May 30, 2022 at 12:42 PM Bruce Richardson wrote:
>
> On Mon, May 30, 2022 at 10:00:34AM +0200, Morten Brørup wrote:
> > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > Sent: Monday, 30 May 2022 09.52
> > >
> > > On Fri, May 27, 2022 at 01:15:20PM -0700, Stephen Hemminger wrote:
> > > > On Fri, 27 May 2022 20:18:22 +0200
> > > > Stanislaw Kardach wrote:
> > > >
> > > > > +static inline void
> > > > > +rte_lpm_lookupx4(const struct rte_lpm *lpm, xmm_t ip, uint32_t hop[4],
> > > > > +        uint32_t defv)
> > > > > +{
> > > > > +    uint32_t nh;
> > > > > +    int i, ret;
> > > > > +
> > > > > +    for (i = 0; i < 4; i++) {
> > > > > +        ret = rte_lpm_lookup(lpm, ((rte_xmm_t)ip).u32[i], &nh);
> > > > > +        hop[i] = (ret == 0) ? nh : defv;
> > > > > +    }
> > > > > +}
> > > >
> > > > For performance, manually unroll the loop.
> > >
> > > Given a constant 4x iterations, will compilers not unroll this
> > > automatically? I think the loop is a little clearer if it can be kept.
> > >
> > > /Bruce
> >
> > If in doubt, add this and look at the assembler output:
> >
> > #define REVIEW_INLINE_FUNCTIONS 1
> >
> > #if REVIEW_INLINE_FUNCTIONS /* For compiler output review purposes only. */
> > #pragma GCC diagnostic push
> > #pragma GCC diagnostic ignored "-Wmissing-prototypes"
> > void review_rte_lpm_lookupx4(const struct rte_lpm *lpm, xmm_t ip, uint32_t hop[4], uint32_t defv)
> > {
> >     rte_lpm_lookupx4(lpm, ip, hop, defv);
> > }
> > #pragma GCC diagnostic pop
> > #endif /* REVIEW_INLINE_FUNCTIONS */
> >
>
> Used godbolt.org to check and indeed the function is not unrolled.
> (Gcc 11.2, with flags "-O3 -march=icelake-server").
>
> Manually unrolling changes the assembly generated in interesting ways. For
> example, it appears to generate more cmov-type instructions for the
> miss/default-value case rather than using branches as in the looped
> version. Whether this is better or not may depend upon usecase - if one
> expects most lpm lookup entries to hit, then having (predictable) branches
> may well be cheaper.
>
> In any case, I'll withdraw any objection to unrolling, but I'm still not
> convinced it's necessary.
>
> /Bruce

Interestingly enough, until I defined unlikely() in godbolt, I did not
get any automatic unrolling there (with either x86 or RISC-V GCC).
Did you get any compilation warnings?
That said, the unrolling only happens at -O3, since that level implies
-fpeel-loops. -O3 is the default for DPDK.
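
For anyone who wants to compare the two forms outside of a DPDK build, here
is a minimal sketch. Everything named toy_* is hypothetical and is not DPDK
API: toy_lookup() only mimics the return convention of rte_lpm_lookup()
(0 on hit, negative errno on miss), and a plain uint32_t[4] stands in for
xmm_t. The unrolled body is just one possible way to write it, not the code
from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_lpm_lookup(): 0 on hit, -ENOENT on miss.
 * Assumption for illustration: only 10.0.0.0/8 is "routed". */
static inline int
toy_lookup(uint32_t ip, uint32_t *next_hop)
{
	if ((ip >> 24) == 10) {
		*next_hop = ip & 0xff;
		return 0;
	}
	return -ENOENT;
}

/* Loop form, same shape as the function under review.
 * nh is initialised only to keep -Wmaybe-uninitialized quiet. */
static inline void
toy_lookupx4_loop(const uint32_t ip[4], uint32_t hop[4], uint32_t defv)
{
	uint32_t nh = 0;
	int i, ret;

	for (i = 0; i < 4; i++) {
		ret = toy_lookup(ip[i], &nh);
		hop[i] = (ret == 0) ? nh : defv;
	}
}

/* Manually unrolled form: four independent lookups, no loop counter. */
static inline void
toy_lookupx4_unrolled(const uint32_t ip[4], uint32_t hop[4], uint32_t defv)
{
	uint32_t nh = 0;

	hop[0] = (toy_lookup(ip[0], &nh) == 0) ? nh : defv;
	hop[1] = (toy_lookup(ip[1], &nh) == 0) ? nh : defv;
	hop[2] = (toy_lookup(ip[2], &nh) == 0) ? nh : defv;
	hop[3] = (toy_lookup(ip[3], &nh) == 0) ? nh : defv;
}

/* Checks that both forms agree on a mixed hit/miss input. */
static void
toy_lookupx4_self_check(void)
{
	/* 10.0.0.1 (hit), 192.168.0.1 (miss), 10.0.0.254 (hit), 8.8.8.8 (miss) */
	const uint32_t ip[4] = { 0x0a000001, 0xc0a80001, 0x0a0000fe, 0x08080808 };
	uint32_t a[4], b[4];
	int i;

	toy_lookupx4_loop(ip, a, UINT32_MAX);
	toy_lookupx4_unrolled(ip, b, UINT32_MAX);
	for (i = 0; i < 4; i++)
		assert(a[i] == b[i]);
}
```

Both forms are functionally identical, so the only thing left to compare is
the generated code. Wrapping each one in a non-inline function, as Morten
suggested above, and building at -O2 and at -O3 should show whether the
compiler unrolls the loop form on its own for a given target.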