From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jerin Jacob
Date: Fri, 29 Jan 2016 09:40:44 +0530
Message-ID: <1454040645-23864-3-git-send-email-jerin.jacob@caviumnetworks.com>
X-Mailer: git-send-email 2.1.0
In-Reply-To: <1454040645-23864-1-git-send-email-jerin.jacob@caviumnetworks.com>
References: <1449242086-19051-1-git-send-email-jerin.jacob@caviumnetworks.com>
 <1454040645-23864-1-git-send-email-jerin.jacob@caviumnetworks.com>
MIME-Version: 1.0
Content-Type: text/plain
Cc: viktorin@rehivetech.com
Subject: [dpdk-dev] [PATCH v3 2/3] lpm: add support for NEON
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
X-List-Received-Date: Fri, 29 Jan 2016 04:11:59 -0000

Enabled CONFIG_RTE_LIBRTE_LPM, CONFIG_RTE_LIBRTE_TABLE,
CONFIG_RTE_LIBRTE_PIPELINE libraries for arm and arm64

TABLE, PIPELINE libraries were disabled due to LPM library dependency.

Signed-off-by: Jerin Jacob
Signed-off-by: Jianbo Liu
---
 app/test/test_xmmt_ops.h                   |  20 ++++
 config/defconfig_arm-armv7a-linuxapp-gcc   |   3 -
 config/defconfig_arm64-armv8a-linuxapp-gcc |   3 -
 lib/librte_lpm/Makefile                    |   4 +
 lib/librte_lpm/rte_lpm.h                   |   4 +
 lib/librte_lpm/rte_lpm_neon.h              | 148 +++++++++++++++++++++++++++++
 6 files changed, 176 insertions(+), 6 deletions(-)
 create mode 100644 lib/librte_lpm/rte_lpm_neon.h

diff --git a/app/test/test_xmmt_ops.h b/app/test/test_xmmt_ops.h
index c055912..c18fc12 100644
--- a/app/test/test_xmmt_ops.h
+++ b/app/test/test_xmmt_ops.h
@@ -36,6 +36,24 @@
 
 #include
 
+#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64)
+
+/* vect_* abstraction implementation using NEON */
+
+/* loads the xmm_t value from address p(does not need to be 16-byte aligned)*/
+#define vect_loadu_sil128(p) vld1q_s32((const int32_t *)p)
+
+/* sets the 4 signed 32-bit integer values and returns the xmm_t variable */
+static inline xmm_t __attribute__((always_inline))
+vect_set_epi32(int i3, int i2, int i1, int i0)
+{
+	int32_t data[4] = {i0, i1, i2, i3};
+
+	return vld1q_s32(data);
+}
+
+#else
+
 /* vect_* abstraction implementation using SSE */
 
 /* loads the xmm_t value from address p(does not need to be 16-byte aligned)*/
@@ -44,4 +62,6 @@
 /* sets the 4 signed 32-bit integer values and returns the xmm_t variable */
 #define vect_set_epi32(i3, i2, i1, i0) _mm_set_epi32(i3, i2, i1, i0)
 
+#endif
+
 #endif /* _TEST_XMMT_OPS_H_ */
diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc
index cbebd64..efffa1f 100644
--- a/config/defconfig_arm-armv7a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7a-linuxapp-gcc
@@ -53,9 +53,6 @@ CONFIG_RTE_LIBRTE_KNI=n
 CONFIG_RTE_EAL_IGB_UIO=n
 
 # fails to compile on ARM
-CONFIG_RTE_LIBRTE_LPM=n
-CONFIG_RTE_LIBRTE_TABLE=n
-CONFIG_RTE_LIBRTE_PIPELINE=n
 CONFIG_RTE_SCHED_VECTOR=n
 
 # cannot use those on ARM
diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc
index 504f3ed..57f7941 100644
--- a/config/defconfig_arm64-armv8a-linuxapp-gcc
+++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
@@ -51,7 +51,4 @@ CONFIG_RTE_LIBRTE_IVSHMEM=n
 CONFIG_RTE_LIBRTE_FM10K_PMD=n
 CONFIG_RTE_LIBRTE_I40E_PMD=n
 
-CONFIG_RTE_LIBRTE_LPM=n
-CONFIG_RTE_LIBRTE_TABLE=n
-CONFIG_RTE_LIBRTE_PIPELINE=n
 CONFIG_RTE_SCHED_VECTOR=n
diff --git a/lib/librte_lpm/Makefile b/lib/librte_lpm/Makefile
index ce3a1d1..7f93006 100644
--- a/lib/librte_lpm/Makefile
+++ b/lib/librte_lpm/Makefile
@@ -47,7 +47,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_LPM) := rte_lpm.c rte_lpm6.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_LPM)-include := rte_lpm.h rte_lpm6.h
+ifneq ($(filter y,$(CONFIG_RTE_ARCH_ARM) $(CONFIG_RTE_ARCH_ARM64)),)
+SYMLINK-$(CONFIG_RTE_LIBRTE_LPM)-include += rte_lpm_neon.h
+else
 SYMLINK-$(CONFIG_RTE_LIBRTE_LPM)-include += rte_lpm_sse.h
+endif
 
 # this lib needs eal
 DEPDIRS-$(CONFIG_RTE_LIBRTE_LPM) += lib/librte_eal
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index dfe1378..0c892de 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -384,7 +384,11 @@ static inline void
 rte_lpm_lookupx4(const struct rte_lpm *lpm, xmm_t ip, uint16_t hop[4],
 	uint16_t defv);
 
+#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64)
+#include "rte_lpm_neon.h"
+#else
 #include "rte_lpm_sse.h"
+#endif
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_lpm/rte_lpm_neon.h b/lib/librte_lpm/rte_lpm_neon.h
new file mode 100644
index 0000000..fcd2a8a
--- /dev/null
+++ b/lib/librte_lpm/rte_lpm_neon.h
@@ -0,0 +1,148 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 Cavium Networks. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Derived rte_lpm_lookupx4 implementation from lib/librte_lpm/rte_lpm_sse.h
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium Networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_LPM_NEON_H_
+#define _RTE_LPM_NEON_H_
+
+#include
+#include
+#include
+#include
+#include
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+static inline void
+rte_lpm_lookupx4(const struct rte_lpm *lpm, xmm_t ip, uint16_t hop[4],
+	uint16_t defv)
+{
+	uint32x4_t i24;
+	rte_xmm_t i8;
+	uint16_t tbl[4];
+	uint64_t idx, pt;
+
+	const uint32_t mask = UINT8_MAX;
+	const int32x4_t mask8 = vdupq_n_s32(mask);
+
+	/*
+	 * RTE_LPM_VALID_EXT_ENTRY_BITMASK for 4 LPM entries
+	 * as one 64-bit value (0x0300030003000300).
+	 */
+	const uint64_t mask_xv =
+		((uint64_t)RTE_LPM_VALID_EXT_ENTRY_BITMASK |
+		(uint64_t)RTE_LPM_VALID_EXT_ENTRY_BITMASK << 16 |
+		(uint64_t)RTE_LPM_VALID_EXT_ENTRY_BITMASK << 32 |
+		(uint64_t)RTE_LPM_VALID_EXT_ENTRY_BITMASK << 48);
+
+	/*
+	 * RTE_LPM_LOOKUP_SUCCESS for 4 LPM entries
+	 * as one 64-bit value (0x0100010001000100).
+	 */
+	const uint64_t mask_v =
+		((uint64_t)RTE_LPM_LOOKUP_SUCCESS |
+		(uint64_t)RTE_LPM_LOOKUP_SUCCESS << 16 |
+		(uint64_t)RTE_LPM_LOOKUP_SUCCESS << 32 |
+		(uint64_t)RTE_LPM_LOOKUP_SUCCESS << 48);
+
+	/* get 4 indexes for tbl24[]. */
+	i24 = vshrq_n_u32((uint32x4_t)ip, CHAR_BIT);
+
+	/* extract values from tbl24[] */
+	idx = vgetq_lane_u64((uint64x2_t)i24, 0);
+
+	tbl[0] = *(const uint16_t *)&lpm->tbl24[(uint32_t)idx];
+	tbl[1] = *(const uint16_t *)&lpm->tbl24[idx >> 32];
+
+	idx = vgetq_lane_u64((uint64x2_t)i24, 1);
+
+	tbl[2] = *(const uint16_t *)&lpm->tbl24[(uint32_t)idx];
+	tbl[3] = *(const uint16_t *)&lpm->tbl24[idx >> 32];
+
+	/* get 4 indexes for tbl8[]. */
+	i8.x = vandq_s32(ip, mask8);
+
+	pt = (uint64_t)tbl[0] |
+		(uint64_t)tbl[1] << 16 |
+		(uint64_t)tbl[2] << 32 |
+		(uint64_t)tbl[3] << 48;
+
+	/* search successfully finished for all 4 IP addresses. */
+	if (likely((pt & mask_xv) == mask_v)) {
+		uintptr_t ph = (uintptr_t)hop;
+		*(uint64_t *)ph = pt & RTE_LPM_MASKX4_RES;
+		return;
+	}
+
+	if (unlikely((pt & RTE_LPM_VALID_EXT_ENTRY_BITMASK) ==
+			RTE_LPM_VALID_EXT_ENTRY_BITMASK)) {
+		i8.u32[0] = i8.u32[0] +
+			(uint8_t)tbl[0] * RTE_LPM_TBL8_GROUP_NUM_ENTRIES;
+		tbl[0] = *(const uint16_t *)&lpm->tbl8[i8.u32[0]];
+	}
+	if (unlikely((pt >> 16 & RTE_LPM_VALID_EXT_ENTRY_BITMASK) ==
+			RTE_LPM_VALID_EXT_ENTRY_BITMASK)) {
+		i8.u32[1] = i8.u32[1] +
+			(uint8_t)tbl[1] * RTE_LPM_TBL8_GROUP_NUM_ENTRIES;
+		tbl[1] = *(const uint16_t *)&lpm->tbl8[i8.u32[1]];
+	}
+	if (unlikely((pt >> 32 & RTE_LPM_VALID_EXT_ENTRY_BITMASK) ==
+			RTE_LPM_VALID_EXT_ENTRY_BITMASK)) {
+		i8.u32[2] = i8.u32[2] +
+			(uint8_t)tbl[2] * RTE_LPM_TBL8_GROUP_NUM_ENTRIES;
+		tbl[2] = *(const uint16_t *)&lpm->tbl8[i8.u32[2]];
+	}
+	if (unlikely((pt >> 48 & RTE_LPM_VALID_EXT_ENTRY_BITMASK) ==
+			RTE_LPM_VALID_EXT_ENTRY_BITMASK)) {
+		i8.u32[3] = i8.u32[3] +
+			(uint8_t)tbl[3] * RTE_LPM_TBL8_GROUP_NUM_ENTRIES;
+		tbl[3] = *(const uint16_t *)&lpm->tbl8[i8.u32[3]];
+	}
+
+	hop[0] = (tbl[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[0] : defv;
+	hop[1] = (tbl[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[1] : defv;
+	hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
+	hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LPM_NEON_H_ */
-- 
2.1.0
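
For readers following the NEON lookup above, a scalar sketch of what each of the four lanes computes may help. This is an illustration only and not part of the patch. The sketch_* names are hypothetical, the tables are plain uint16_t arrays rather than the rte_lpm structures, the flag values follow the masks quoted in the patch comments (0x0100 and 0x0300), and the tbl8 group size of 256 entries is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants; the values follow the masks quoted in the patch. */
#define SKETCH_LOOKUP_SUCCESS     0x0100u /* entry holds a valid next hop */
#define SKETCH_VALID_EXT_ENTRY    0x0300u /* valid and extended into tbl8 */
#define SKETCH_TBL8_GROUP_ENTRIES 256u    /* assumed tbl8 group size */

/*
 * Scalar equivalent of one lane of the 4-wide lookup: index tbl24 with the
 * top 24 bits of the address, follow the entry into tbl8 when it is marked
 * as extended, then return the low byte as next hop or defv on a miss.
 */
static inline uint16_t
sketch_lookup1(const uint16_t *tbl24, const uint16_t *tbl8,
	       uint32_t ip, uint16_t defv)
{
	uint16_t e;

	assert(tbl24 && tbl8);	/* debug-only sanity check, not in the patch */

	e = tbl24[ip >> 8];	/* top 24 bits select the tbl24 entry */

	if ((e & SKETCH_VALID_EXT_ENTRY) == SKETCH_VALID_EXT_ENTRY) {
		/* low byte of the entry is the tbl8 group, low byte of ip the slot */
		uint32_t i8 = (ip & 0xffu) +
			(uint8_t)e * SKETCH_TBL8_GROUP_ENTRIES;
		e = tbl8[i8];
	}

	return (e & SKETCH_LOOKUP_SUCCESS) ? (uint8_t)e : defv;
}
```

The vector version differs mainly in that it packs the four 16-bit tbl24 entries into one 64-bit word so the common case (all four valid, none extended) can be detected with a single compare and stored with a single 64-bit write.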