From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 53347A0350; Wed, 24 Jun 2020 15:19:00 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id AC1311D926; Wed, 24 Jun 2020 15:18:59 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 183351D73A for ; Wed, 24 Jun 2020 15:18:56 +0200 (CEST) IronPort-SDR: kKrhz3MAYqz3UEbhJduzrvkLTVcYbB5G4h1kEpEqHCxXQycJpF8iNDsRAIPGXYCVmN4ak/Q6kI 0CysBVSgfZxg== X-IronPort-AV: E=McAfee;i="6000,8403,9661"; a="132892872" X-IronPort-AV: E=Sophos;i="5.75,275,1589266800"; d="scan'208";a="132892872" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jun 2020 06:18:56 -0700 IronPort-SDR: q2SRS2eAoR1LN0dWif1cjnJSr/6YUXZXjVJ+4E6VjjvvLMlWk2mOlr5Th+F6uEUibvNNLnL+W+ +IsONI2ZDhpA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,275,1589266800"; d="scan'208";a="291645907" Received: from orsmsx109.amr.corp.intel.com ([10.22.240.7]) by orsmga007.jf.intel.com with ESMTP; 24 Jun 2020 06:18:56 -0700 Received: from orsmsx603.amr.corp.intel.com (10.22.229.16) by ORSMSX109.amr.corp.intel.com (10.22.240.7) with Microsoft SMTP Server (TLS) id 14.3.439.0; Wed, 24 Jun 2020 06:18:55 -0700 Received: from orsmsx603.amr.corp.intel.com (10.22.229.16) by ORSMSX603.amr.corp.intel.com (10.22.229.16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5; Wed, 24 Jun 2020 06:18:55 -0700 Received: from ORSEDG002.ED.cps.intel.com (10.7.248.5) by orsmsx603.amr.corp.intel.com (10.22.229.16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.1713.5 via Frontend Transport; Wed, 24 Jun 2020 06:18:55 -0700 Received: from NAM02-BL2-obe.outbound.protection.outlook.com (104.47.38.50) by edgegateway.intel.com (134.134.137.101) with Microsoft SMTP Server (TLS) id 14.3.439.0; Wed, 24 Jun 2020 06:18:54 -0700 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=gODWIEukrArMoxaMqqX31Fi0v6EoV6JCupvkSC7dWxcvoGlr7GvgNKvQOjO5Nv6496kimzc+ctHvl6UWZziMhqxjoEHVmJu05gt0+00BYnIyl3Gd05/s9HHXR6hLb206QHibntbIyivKAOnJZmLCh/coKsqprIlIpL2j7/325YYPUhBRw296Y+SjCA6tdSRq8VO2H/ILMpT+c8ufhXCnYbGyVRQmkjWgC5Lfc0ZGsvjtqx1gcYkX4TVZUyOSPeWBQxYPEDXTh8Ria5XfiEkEld0dQItgr+iTzFGkSh5I038RRh+QOp9kwnCTMjjUHfPcBLeweOy5wxNzsMKs6BlImQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=G9hS8e5b4Py8tr7BNxSEqYV62Ka2ZQD+VrD7hRH/+pc=; b=YHDku4N/nbm4Jnij57KZMPCwYSAw3rcsvwbdXdAkZYvMX9nIViffWDm07oKFKPn37EThqwe0EvXF9z/Ny6PZ3q4ZTOqUB5RC8O7dRuE95msgpPR681+nDVq14UaPkd74lzVY6uxLjo8Y9eZ3LSoWdR1gR6Yf8MBkUFT9KE6m6+aSGGQkUclTw91/v07D6VGO8SFE6Pnt52e+GSP8EmqWyBSJ7qHbj173bdhBUeaqv+10BMFHipO6eaTr7vYF6VfgTGQK+vDBrWY6BdbWzwDga3C0eNGSMi1F8y5NYZuSW4g9W2KxL0oxtLuk7jia8yFO9aJsYyeTJ5nSg3ViR6lmJQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel.onmicrosoft.com; s=selector2-intel-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=G9hS8e5b4Py8tr7BNxSEqYV62Ka2ZQD+VrD7hRH/+pc=; b=CPy7CWJOUzUbv0w1YoJxku2jj2OFTLOYScK9PhDh+TtOzdbZpT4+c07q/w4c4a4oFzBLprqDh2Ze9UL0i3meAhDrXMEd/WktC+12kMnjDaF1shZ77BoBtirJaKbmB7MWmfko65l7+YGZsnlw3XHsZ3a3fCws1Inh+XQ4ix5wzEQ= Received: from BYAPR11MB3301.namprd11.prod.outlook.com (2603:10b6:a03:7f::26) by BYAPR11MB3352.namprd11.prod.outlook.com (2603:10b6:a03:1d::26) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3131.21; Wed, 24 Jun 2020 13:18:51 +0000 Received: from BYAPR11MB3301.namprd11.prod.outlook.com ([fe80::f160:29ab:b8f9:4189]) by BYAPR11MB3301.namprd11.prod.outlook.com ([fe80::f160:29ab:b8f9:4189%6]) with mapi id 15.20.3109.027; Wed, 24 Jun 2020 13:18:51 +0000 From: "Ananyev, Konstantin" To: "Medvedkin, Vladimir" , "dev@dpdk.org" CC: "Richardson, Bruce" Thread-Topic: [PATCH v3 4/8] fib: introduce AVX512 lookup Thread-Index: AQHWLdb0R3LasdNm5U+933vMo3Fvzajn92DA Date: Wed, 24 Jun 2020 13:18:50 +0000 Message-ID: References: In-Reply-To: Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-reaction: no-action dlp-version: 11.2.0.6 authentication-results: intel.com; dkim=none (message not signed) header.d=none;intel.com; dmarc=none action=none header.from=intel.com; x-originating-ip: [192.198.151.184] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 64225256-62a0-4dc6-947e-08d818412310 x-ms-traffictypediagnostic: BYAPR11MB3352: x-ld-processed: 46c98d88-e344-4ed4-8496-4ed7712e255d,ExtAddr x-ms-exchange-transport-forked: True x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:4714; x-forefront-prvs: 0444EB1997 x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: n1lo11SutyRoKq5QnU1vsP7iwcTNff9A3TPP1HW57tRTFkxg7OhMSDGygPfjq41Pu4vNnJwPKo1Tv2EYHgLQ3XusFKUs6xh2X8cHMNdT8hk7UMQ0NwmeYMPquNid01wlB4Cu2+REexvAQl/4LjwuEyxRPF/BEnQ1He8gY/p7rT/07ApfE3diq+ERq90EdCy9QBqHz6LF76vozO0PpRI8ixvbwrYTAONKYItgsJDajU1Fijp4ABCEfjMs8YNT6rLheSSkOH/IUENwxgD1qhfqHjZyNYdSCZ2LlaLsFMh5GSYl3YXjiBpWFyMH2aBMYJ30 x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:BYAPR11MB3301.namprd11.prod.outlook.com; PTR:; CAT:NONE; SFTY:; SFS:(4636009)(346002)(396003)(376002)(366004)(136003)(39860400002)(71200400001)(5660300002)(52536014)(4326008)(83380400001)(55016002)(9686003)(107886003)(30864003)(8676002)(8936002)(86362001)(33656002)(478600001)(66446008)(66556008)(66946007)(2906002)(64756008)(66476007)(6506007)(76116006)(316002)(110136005)(26005)(7696005)(186003); DIR:OUT; SFP:1102; x-ms-exchange-antispam-messagedata: Lwg9nFQ6l8YPdQgpRdzoa0UcIMvRMN2e1FHVSAvyYcuaxaKwGL6vPB//7mq0W1MkGTkrr2DGmmpKu88Fc8wPZgLSkwdrQturAVbcjEdHeLDZBFY8szj38D95U7wAatknRVuWWi1ShkUh9e+dLxpY7MDEQCi2mRx7W0UJfV6r39QHnhOIFk1Dq3Y0SsQdCthzoGljHIPLPtSynTF80hQCtfWbv377og6U9R2RJATL4a4SylV4bC/Kt5guXsIasi/NUevnhsGYizyauEi/Ty6msw7E1R8EFXFP1gFnaEqueJUEt+MDLCmvcF67/4XGO77fUn1E4n0nPX+2oo3kKd67szzcfs7MrPuayAOLaJSeJ12HnnR4hnTuwkaNbWN9wFztFeEBmVEtQVlYHH0N3bNGscLAiO+fqYp2ZZ1X1qTaUvj25OveKZP8Hi6v0eEHw7jenP9ySdV5aPS5PUZOP3ww6PT8XuW6wOYLnMf3sRGVZTbYEhUDq/59uZGTJvYwgfqo Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: BYAPR11MB3301.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 64225256-62a0-4dc6-947e-08d818412310 X-MS-Exchange-CrossTenant-originalarrivaltime: 24 Jun 2020 13:18:50.8487 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: zfdmRVODuJAsllD4H4oGuRFPzaUTL6ijpwzmo1evxw0FUJVR9E9CK5h5x/7SpdigoPVfYL3CV2CjYgNKvIOEBvqnNa+TiGDwL9ukr94Khbc= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BYAPR11MB3352 X-OriginatorOrg: intel.com Subject: Re: [dpdk-dev] [PATCH v3 4/8] fib: introduce AVX512 lookup X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > Add new lookup implementation for DIR24_8 algorithm using > AVX512 instruction set >=20 > Signed-off-by: Vladimir Medvedkin > --- > lib/librte_fib/Makefile | 14 ++++ > lib/librte_fib/dir24_8.c | 24 ++++++ > lib/librte_fib/dir24_8_avx512.c | 165 ++++++++++++++++++++++++++++++++++= ++++++ > lib/librte_fib/dir24_8_avx512.h | 24 ++++++ > lib/librte_fib/meson.build | 11 +++ > lib/librte_fib/rte_fib.h | 3 +- > 6 files changed, 240 insertions(+), 1 deletion(-) > create mode 100644 lib/librte_fib/dir24_8_avx512.c > create mode 100644 lib/librte_fib/dir24_8_avx512.h >=20 > diff --git a/lib/librte_fib/Makefile b/lib/librte_fib/Makefile > index 1dd2a49..3958da1 100644 > --- a/lib/librte_fib/Makefile > +++ b/lib/librte_fib/Makefile > @@ -19,4 +19,18 @@ SRCS-$(CONFIG_RTE_LIBRTE_FIB) :=3D rte_fib.c rte_fib6.= c dir24_8.c trie.c > # install this header file > SYMLINK-$(CONFIG_RTE_LIBRTE_FIB)-include :=3D rte_fib.h rte_fib6.h >=20 > +CC_AVX512F_SUPPORT=3D$(shell $(CC) -mavx512f -dM -E - &1 | = \ > +grep -q __AVX512F__ && echo 1) > + > +CC_AVX512DQ_SUPPORT=3D$(shell $(CC) -mavx512dq -dM -E - &1 = | \ > +grep -q __AVX512DQ__ && echo 1) > + > +ifeq ($(CC_AVX512F_SUPPORT), 1) > + ifeq ($(CC_AVX512DQ_SUPPORT), 1) > + SRCS-$(CONFIG_RTE_LIBRTE_FIB) +=3D dir24_8_avx512.c > + CFLAGS_dir24_8_avx512.o +=3D -mavx512f > + CFLAGS_dir24_8_avx512.o +=3D -mavx512dq > + CFLAGS_dir24_8.o +=3D -DCC_DIR24_8_AVX512_SUPPORT > + endif > +endif > include $(RTE_SDK)/mk/rte.lib.mk > diff --git a/lib/librte_fib/dir24_8.c b/lib/librte_fib/dir24_8.c > index 9d74653..0a1c53f 100644 > --- a/lib/librte_fib/dir24_8.c > +++ b/lib/librte_fib/dir24_8.c > @@ -18,6 +18,12 @@ > #include > #include "dir24_8.h" >=20 > +#ifdef CC_DIR24_8_AVX512_SUPPORT > + > +#include "dir24_8_avx512.h" > + > +#endif /* CC_DIR24_8_AVX512_SUPPORT */ > + > #define DIR24_8_NAMESIZE 64 >=20 > #define ROUNDUP(x, y) RTE_ALIGN_CEIL(x, (1 << (32 - y))) > @@ -62,6 +68,24 @@ dir24_8_get_lookup_fn(void *p, enum rte_fib_dir24_8_lo= okup_type type) > } > case RTE_FIB_DIR24_8_SCALAR_UNI: > return dir24_8_lookup_bulk_uni; > +#ifdef CC_DIR24_8_AVX512_SUPPORT > + case RTE_FIB_DIR24_8_VECTOR: > + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) <=3D 0) > + return NULL; > + > + switch (nh_sz) { > + case RTE_FIB_DIR24_8_1B: > + return rte_dir24_8_vec_lookup_bulk_1b; > + case RTE_FIB_DIR24_8_2B: > + return rte_dir24_8_vec_lookup_bulk_2b; > + case RTE_FIB_DIR24_8_4B: > + return rte_dir24_8_vec_lookup_bulk_4b; > + case RTE_FIB_DIR24_8_8B: > + return rte_dir24_8_vec_lookup_bulk_8b; > + default: > + return NULL; > + } > +#endif > default: > return NULL; > } > diff --git a/lib/librte_fib/dir24_8_avx512.c b/lib/librte_fib/dir24_8_avx= 512.c > new file mode 100644 > index 0000000..43dba28 > --- /dev/null > +++ b/lib/librte_fib/dir24_8_avx512.c > @@ -0,0 +1,165 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2020 Intel Corporation > + */ > + > +#include > +#include > + > +#include "dir24_8.h" > +#include "dir24_8_avx512.h" > + > +static __rte_always_inline void > +dir24_8_vec_lookup_x16(void *p, const uint32_t *ips, > + uint64_t *next_hops, int size) > +{ > + struct dir24_8_tbl *dp =3D (struct dir24_8_tbl *)p; > + __mmask16 msk_ext; > + __mmask16 exp_msk =3D 0x5555; > + __m512i ip_vec, idxes, res, bytes; > + const __m512i zero =3D _mm512_set1_epi32(0); > + const __m512i lsb =3D _mm512_set1_epi32(1); > + const __m512i lsbyte_msk =3D _mm512_set1_epi32(0xff); > + __m512i tmp1, tmp2, res_msk; > + __m256i tmp256; > + /* used to mask gather values if size is 1/2 (8/16 bit next hops) */ > + if (size =3D=3D sizeof(uint8_t)) > + res_msk =3D _mm512_set1_epi32(UINT8_MAX); > + else if (size =3D=3D sizeof(uint16_t)) > + res_msk =3D _mm512_set1_epi32(UINT16_MAX); > + > + ip_vec =3D _mm512_loadu_si512(ips); > + /* mask 24 most significant bits */ > + idxes =3D _mm512_srli_epi32(ip_vec, 8); > + > + /** > + * lookup in tbl24 > + * Put it inside branch to make compiler happy with -O0 > + */ > + if (size =3D=3D sizeof(uint8_t)) { > + res =3D _mm512_i32gather_epi32(idxes, (const int *)dp->tbl24, 1); > + res =3D _mm512_and_epi32(res, res_msk); > + } else if (size =3D=3D sizeof(uint16_t)) { > + res =3D _mm512_i32gather_epi32(idxes, (const int *)dp->tbl24, 2); > + res =3D _mm512_and_epi32(res, res_msk); > + } else > + res =3D _mm512_i32gather_epi32(idxes, (const int *)dp->tbl24, 4); > + > + /* get extended entries indexes */ > + msk_ext =3D _mm512_test_epi32_mask(res, lsb); > + > + if (msk_ext !=3D 0) { > + idxes =3D _mm512_srli_epi32(res, 1); > + idxes =3D _mm512_slli_epi32(idxes, 8); > + bytes =3D _mm512_and_epi32(ip_vec, lsbyte_msk); > + idxes =3D _mm512_maskz_add_epi32(msk_ext, idxes, bytes); > + if (size =3D=3D sizeof(uint8_t)) { > + idxes =3D _mm512_mask_i32gather_epi32(zero, msk_ext, > + idxes, (const int *)dp->tbl8, 1); > + idxes =3D _mm512_and_epi32(idxes, res_msk); > + } else if (size =3D=3D sizeof(uint16_t)) { > + idxes =3D _mm512_mask_i32gather_epi32(zero, msk_ext, > + idxes, (const int *)dp->tbl8, 2); > + idxes =3D _mm512_and_epi32(idxes, res_msk); > + } else > + idxes =3D _mm512_mask_i32gather_epi32(zero, msk_ext, > + idxes, (const int *)dp->tbl8, 4); > + > + res =3D _mm512_mask_blend_epi32(msk_ext, res, idxes); > + } > + > + res =3D _mm512_srli_epi32(res, 1); > + tmp1 =3D _mm512_maskz_expand_epi32(exp_msk, res); > + tmp256 =3D _mm512_extracti32x8_epi32(res, 1); > + tmp2 =3D _mm512_maskz_expand_epi32(exp_msk, > + _mm512_castsi256_si512(tmp256)); > + _mm512_storeu_si512(next_hops, tmp1); > + _mm512_storeu_si512(next_hops + 8, tmp2); > +} > + > +static __rte_always_inline void > +dir24_8_vec_lookup_x8_8b(void *p, const uint32_t *ips, > + uint64_t *next_hops) > +{ > + struct dir24_8_tbl *dp =3D (struct dir24_8_tbl *)p; > + const __m512i zero =3D _mm512_set1_epi32(0); > + const __m512i lsbyte_msk =3D _mm512_set1_epi64(0xff); > + const __m512i lsb =3D _mm512_set1_epi64(1); > + __m512i res, idxes, bytes; > + __m256i idxes_256, ip_vec; > + __mmask8 msk_ext; > + > + ip_vec =3D _mm256_loadu_si256((const void *)ips); > + /* mask 24 most significant bits */ > + idxes_256 =3D _mm256_srli_epi32(ip_vec, 8); > + > + /* lookup in tbl24 */ > + res =3D _mm512_i32gather_epi64(idxes_256, (const void *)dp->tbl24, 8); > + > + /* get extended entries indexes */ > + msk_ext =3D _mm512_test_epi64_mask(res, lsb); > + > + if (msk_ext !=3D 0) { > + bytes =3D _mm512_cvtepi32_epi64(ip_vec); > + idxes =3D _mm512_srli_epi64(res, 1); > + idxes =3D _mm512_slli_epi64(idxes, 8); > + bytes =3D _mm512_and_epi64(bytes, lsbyte_msk); > + idxes =3D _mm512_maskz_add_epi64(msk_ext, idxes, bytes); > + idxes =3D _mm512_mask_i64gather_epi64(zero, msk_ext, idxes, > + (const void *)dp->tbl8, 8); > + > + res =3D _mm512_mask_blend_epi64(msk_ext, res, idxes); > + } > + > + res =3D _mm512_srli_epi64(res, 1); > + _mm512_storeu_si512(next_hops, res); > +} > + > +void > +rte_dir24_8_vec_lookup_bulk_1b(void *p, const uint32_t *ips, > + uint64_t *next_hops, const unsigned int n) > +{ > + uint32_t i; > + for (i =3D 0; i < (n / 16); i++) > + dir24_8_vec_lookup_x16(p, ips + i * 16, next_hops + i * 16, > + sizeof(uint8_t)); > + Just curious: if for reminder, instead of calling scalar lookup, Introduce a masked version of avx512 lookup - would it be slower? > + dir24_8_lookup_bulk_1b(p, ips + i * 16, next_hops + i * 16, > + n - i * 16); > +} > + > +void > +rte_dir24_8_vec_lookup_bulk_2b(void *p, const uint32_t *ips, > + uint64_t *next_hops, const unsigned int n) > +{ > + uint32_t i; > + for (i =3D 0; i < (n / 16); i++) > + dir24_8_vec_lookup_x16(p, ips + i * 16, next_hops + i * 16, > + sizeof(uint16_t)); > + > + dir24_8_lookup_bulk_2b(p, ips + i * 16, next_hops + i * 16, > + n - i * 16); > +} > + > +void > +rte_dir24_8_vec_lookup_bulk_4b(void *p, const uint32_t *ips, > + uint64_t *next_hops, const unsigned int n) > +{ > + uint32_t i; > + for (i =3D 0; i < (n / 16); i++) > + dir24_8_vec_lookup_x16(p, ips + i * 16, next_hops + i * 16, > + sizeof(uint32_t)); > + > + dir24_8_lookup_bulk_4b(p, ips + i * 16, next_hops + i * 16, > + n - i * 16); > +} > + > +void > +rte_dir24_8_vec_lookup_bulk_8b(void *p, const uint32_t *ips, > + uint64_t *next_hops, const unsigned int n) > +{ > + uint32_t i; > + for (i =3D 0; i < (n / 8); i++) > + dir24_8_vec_lookup_x8_8b(p, ips + i * 8, next_hops + i * 8); > + > + dir24_8_lookup_bulk_8b(p, ips + i * 8, next_hops + i * 8, n - i * 8); > +} > diff --git a/lib/librte_fib/dir24_8_avx512.h b/lib/librte_fib/dir24_8_avx= 512.h > new file mode 100644 > index 0000000..1d3c2b9 > --- /dev/null > +++ b/lib/librte_fib/dir24_8_avx512.h > @@ -0,0 +1,24 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2020 Intel Corporation > + */ > + > +#ifndef _DIR248_AVX512_H_ > +#define _DIR248_AVX512_H_ > + > +void > +rte_dir24_8_vec_lookup_bulk_1b(void *p, const uint32_t *ips, > + uint64_t *next_hops, const unsigned int n); > + > +void > +rte_dir24_8_vec_lookup_bulk_2b(void *p, const uint32_t *ips, > + uint64_t *next_hops, const unsigned int n); > + > +void > +rte_dir24_8_vec_lookup_bulk_4b(void *p, const uint32_t *ips, > + uint64_t *next_hops, const unsigned int n); > + > +void > +rte_dir24_8_vec_lookup_bulk_8b(void *p, const uint32_t *ips, > + uint64_t *next_hops, const unsigned int n); > + > +#endif /* _DIR248_AVX512_H_ */ > diff --git a/lib/librte_fib/meson.build b/lib/librte_fib/meson.build > index 771828f..0963f3c 100644 > --- a/lib/librte_fib/meson.build > +++ b/lib/librte_fib/meson.build > @@ -5,3 +5,14 @@ > sources =3D files('rte_fib.c', 'rte_fib6.c', 'dir24_8.c', 'trie.c') > headers =3D files('rte_fib.h', 'rte_fib6.h') > deps +=3D ['rib'] > + > +if dpdk_conf.has('RTE_ARCH_X86') and cc.has_argument('-mavx512f') > + if cc.has_argument('-mavx512dq') > + dir24_8_avx512_tmp =3D static_library('dir24_8_avx512_tmp', > + 'dir24_8_avx512.c', > + dependencies: static_rte_eal, > + c_args: cflags + ['-mavx512f'] + ['-mavx512dq']) > + objs +=3D dir24_8_avx512_tmp.extract_objects('dir24_8_avx512.c') > + cflags +=3D '-DCC_DIR24_8_AVX512_SUPPORT' > + endif > +endif > diff --git a/lib/librte_fib/rte_fib.h b/lib/librte_fib/rte_fib.h > index db35685..2919d13 100644 > --- a/lib/librte_fib/rte_fib.h > +++ b/lib/librte_fib/rte_fib.h > @@ -54,7 +54,8 @@ enum rte_fib_dir24_8_nh_sz { > enum rte_fib_dir24_8_lookup_type { > RTE_FIB_DIR24_8_SCALAR_MACRO, > RTE_FIB_DIR24_8_SCALAR_INLINE, > - RTE_FIB_DIR24_8_SCALAR_UNI > + RTE_FIB_DIR24_8_SCALAR_UNI, > + RTE_FIB_DIR24_8_VECTOR > }; >=20 > /** FIB configuration structure */ > -- Acked-by: Konstantin Ananyev > 2.7.4