From: "Ananyev, Konstantin"
To: Jan Viktorin, Thomas Monjalon, "Hunt, David", "dev@dpdk.org"
Cc: Vlastimil Kosar
Subject: Re: [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86
Date: Tue, 27 Oct 2015 15:31:44 +0000
Message-ID: <2601191342CEEE43887BDE71AB97725836AB58ED@irsmsx105.ger.corp.intel.com>
References: <1445877458-31052-1-git-send-email-viktorin@rehivetech.com> <1445877458-31052-16-git-send-email-viktorin@rehivetech.com>
In-Reply-To: <1445877458-31052-16-git-send-email-viktorin@rehivetech.com>

Hi Jan,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On
Behalf Of Jan Viktorin
> Sent: Monday, October 26, 2015 4:38 PM
> To: Thomas Monjalon; Hunt, David; dev@dpdk.org
> Cc: Vlastimil Kosar
> Subject: [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86
>
> From: Vlastimil Kosar
>
> LPM function rte_lpm_lookupx4() uses i686/x86_64 SIMD intrinsics. Therefore,
> the function is reimplemented using non-vector operations for non-x86
> architectures. In the future, each architecture should have vectorized code.
> This patch includes rudimentary emulation of intrinsic functions _mm_set_epi32(),
> _mm_loadu_si128() and _mm_load_si128() for easy portability of existing
> applications.
>
> LPM builds now when on ARM.
>
> FIXME: to be reworked
>
> Signed-off-by: Vlastimil Kosar
> Signed-off-by: Jan Viktorin
> ---
>  config/defconfig_arm-armv7-a-linuxapp-gcc |  1 -
>  lib/librte_lpm/rte_lpm.h                  | 71 +++++++++++++++++++++++++++++++
>  2 files changed, 71 insertions(+), 1 deletion(-)
>
> diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
> index 5b582a8..33afb33 100644
> --- a/config/defconfig_arm-armv7-a-linuxapp-gcc
> +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> @@ -58,7 +58,6 @@ CONFIG_XMM_SIZE=16
>
>  # fails to compile on ARM
>  CONFIG_RTE_LIBRTE_ACL=n
> -CONFIG_RTE_LIBRTE_LPM=n
>
>  # cannot use those on ARM
>  CONFIG_RTE_KNI_KMOD=n
> diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
> index c299ce2..4619992 100644
> --- a/lib/librte_lpm/rte_lpm.h
> +++ b/lib/librte_lpm/rte_lpm.h
> @@ -47,7 +47,9 @@
>  #include
>  #include
>  #include
> +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
>  #include
> +#endif
>
>  #ifdef __cplusplus
>  extern "C" {
> @@ -358,6 +360,7 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
>  	return 0;
>  }
>
> +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
>  /* Mask four results.
*/
>  #define RTE_LPM_MASKX4_RES	UINT64_C(0x00ff00ff00ff00ff)
>
> @@ -472,6 +475,74 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
>  	hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
>  	hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
>  }
> +#else

Probably better to create a lib/librte_eal/common/include/arch/arm/rte_vect.h
and move all of this x86 vector support emulation there?
Konstantin

> +// TODO: this code should be reworked.
> +
> +typedef struct {
> +	union uint128 {
> +		uint8_t uint8[16];
> +		uint32_t uint32[4];
> +	} val;
> +} __m128i;
> +
> +static inline __m128i
> +_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
> +{
> +	__m128i res;
> +	res.val.uint32[0] = v0;
> +	res.val.uint32[1] = v1;
> +	res.val.uint32[2] = v2;
> +	res.val.uint32[3] = v3;
> +	return res;
> +}
> +
> +static inline __m128i
> +_mm_loadu_si128(__m128i * v)
> +{
> +	__m128i res;
> +	res = *v;
> +	return res;
> +}
> +
> +static inline __m128i
> +_mm_load_si128(__m128i * v)
> +{
> +	__m128i res;
> +	res = *v;
> +	return res;
> +}
> +
> +/**
> + * Lookup four IP addresses in an LPM table.
> + *
> + * @param lpm
> + *   LPM object handle
> + * @param ip
> + *   Four IPs to be looked up in the LPM table
> + * @param hop
> + *   Next hop of the most specific rule found for IP (valid on lookup hit only).
> + *   This is an 4 elements array of two byte values.
> + *   If the lookup was succesfull for the given IP, then least significant byte
> + *   of the corresponding element is the actual next hop and the most
> + *   significant byte is zero.
> + *   If the lookup for the given IP failed, then corresponding element would
> + *   contain default value, see description of then next parameter.
> + * @param defv
> + *   Default value to populate into corresponding element of hop[] array,
> + *   if lookup would fail.
> + */
> +static inline void
> +rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
> +	uint16_t defv)
> +{
> +	rte_lpm_lookup_bulk(lpm, ip.val.uint32, hop, 4);
> +
> +	hop[0] = (hop[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[0] : defv;
> +	hop[1] = (hop[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[1] : defv;
> +	hop[2] = (hop[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[2] : defv;
> +	hop[3] = (hop[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[3] : defv;
> +}
> +#endif
>
>  #ifdef __cplusplus
>  }
> --
> 2.6.1
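
To make the rte_vect.h suggestion above a bit more concrete, here is a rough
sketch of what such an ARM header might contain. It is not taken from the DPDK
tree: the include guard name is an assumption, and the type and function names
simply mirror the emulation in the quoted patch. One deliberate difference is
that _mm_set_epi32() here follows the x86 convention of placing its last
argument in lane 0, whereas the quoted patch stores the first argument there.

```c
/*
 * Hypothetical sketch of lib/librte_eal/common/include/arch/arm/rte_vect.h.
 * Illustration only; the guard name and layout are assumptions.
 */
#ifndef _RTE_VECT_ARM_H_
#define _RTE_VECT_ARM_H_

#include <assert.h>	/* static_assert (C11) */
#include <stdint.h>

/* 128-bit value emulated with plain integers; no NEON is assumed. */
typedef struct {
	union uint128 {
		uint8_t  uint8[16];
		uint32_t uint32[4];
	} val;
} __m128i;

static_assert(sizeof(__m128i) == 16, "__m128i emulation must be 128 bits");

/*
 * On x86 the LAST argument ends up in the lowest 32-bit lane.
 * This sketch keeps that order (assumption: x86 parity is wanted),
 * which differs from the quoted patch.
 */
static inline __m128i
_mm_set_epi32(uint32_t e3, uint32_t e2, uint32_t e1, uint32_t e0)
{
	__m128i res;

	res.val.uint32[0] = e0;
	res.val.uint32[1] = e1;
	res.val.uint32[2] = e2;
	res.val.uint32[3] = e3;
	return res;
}

/* Unaligned load: a plain struct copy is enough for the emulated type. */
static inline __m128i
_mm_loadu_si128(const __m128i *v)
{
	return *v;
}

/* Aligned load: identical to the unaligned one in this emulation. */
static inline __m128i
_mm_load_si128(const __m128i *v)
{
	return *v;
}

#endif /* _RTE_VECT_ARM_H_ */
```

With something along these lines, rte_lpm.h could presumably include rte_vect.h
on every architecture and keep only the scalar rte_lpm_lookupx4() fallback
under the #else branch.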