From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 2 Nov 2015 10:27:29 +0530
From: Jerin Jacob <Jerin.Jacob@caviumnetworks.com>
To: David Hunt <david.hunt@intel.com>
Message-ID: <20151102045728.GB16413@localhost.localdomain>
References: <1446212959-19832-1-git-send-email-david.hunt@intel.com>
 <1446212959-19832-2-git-send-email-david.hunt@intel.com>
In-Reply-To: <1446212959-19832-2-git-send-email-david.hunt@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
User-Agent: Mutt/1.5.23 (2014-03-12)
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h

On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote:
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>  .../common/include/arch/arm/rte_memcpy.h           |   4 +
>  .../common/include/arch/arm/rte_memcpy_64.h        | 308 +++++++++++++++++++++
>  2 files changed, 312 insertions(+)
>  create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
>
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
> index d9f5bf1..1d562c3 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
> @@ -33,6 +33,10 @@
>  #ifndef _RTE_MEMCPY_ARM_H_
>  #define _RTE_MEMCPY_ARM_H_
>
> +#ifdef RTE_ARCH_64
> +#include <rte_memcpy_64.h>
> +#else
>  #include <rte_memcpy_32.h>
> +#endif
>
>  #endif /* _RTE_MEMCPY_ARM_H_ */
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> new file mode 100644
> index 0000000..6d85113
> --- /dev/null
> +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> @@ -0,0 +1,308 @@
> +/*
> + *   BSD LICENSE
> + *
> + *   Copyright (C) IBM Corporation 2014.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of IBM Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +*/
> +
> +#ifndef _RTE_MEMCPY_ARM_64_H_
> +#define _RTE_MEMCPY_ARM_64_H_
> +
> +#include <stdint.h>
> +#include <string.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include "generic/rte_memcpy.h"
> +
> +#ifdef __ARM_NEON_FP

SIMD is not optional in the armv8 spec, so every armv8 machine will have
SIMD instructions, unlike armv7. Moreover, the LDP/STP instructions are
not part of SIMD, so this check is not required; alternatively, it could
be replaced with a check that selects memcpy from either libc or this
specific implementation (a sketch of such a guard is given further down).

> +
> +/* ARM NEON Intrinsics are used to copy data */
> +#include <arm_neon.h>
> +
> +static inline void
> +rte_mov16(uint8_t *dst, const uint8_t *src)
> +{
> +        asm volatile("LDP d0, d1, [%0]\n\t"
> +                     "STP d0, d1, [%1]\n\t"
> +                     : : "r" (src), "r" (dst) :
> +        );
> +}

IMO, there is no need to hardcode the registers used for the memory move
(d0, d1). Let the compiler schedule the registers for better performance.
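A minimal sketch (untested) using the standard arm_neon.h intrinsics,
which leaves register allocation entirely to the compiler:

#include <arm_neon.h>

static inline void
rte_mov16(uint8_t *dst, const uint8_t *src)
{
        /* 16-byte vector load and store; the compiler picks the
         * vector registers instead of pinning d0/d1. */
        vst1q_u8(dst, vld1q_u8(src));
}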
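And returning to the __ARM_NEON_FP point above, a rough sketch of the
suggested alternative guard; RTE_ARCH_ARM64_MEMCPY is a hypothetical
build-time option, named here purely for illustration:

#include <string.h>

/* Hypothetical config knob: use the hand-written copy routines only
 * when explicitly enabled, otherwise trust libc. */
#ifdef RTE_ARCH_ARM64_MEMCPY
/* hand-optimized LDP/STP based routines, as in this patch */
#else
static inline void *
rte_memcpy(void *dst, const void *src, size_t n)
{
        return memcpy(dst, src, n);
}
#endif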
> +
> +static inline void
> +rte_mov32(uint8_t *dst, const uint8_t *src)
> +{
> +        asm volatile("LDP q0, q1, [%0]\n\t"
> +                     "STP q0, q1, [%1]\n\t"
> +                     : : "r" (src), "r" (dst) :
> +        );
> +}
> +
> +static inline void
> +rte_mov48(uint8_t *dst, const uint8_t *src)
> +{
> +        asm volatile("LDP q0, q1, [%0]\n\t"
> +                     "STP q0, q1, [%1]\n\t"
> +                     "LDP d0, d1, [%0 , #32]\n\t"
> +                     "STP d0, d1, [%1 , #32]\n\t"
> +                     : : "r" (src), "r" (dst) :
> +        );
> +}
> +
> +static inline void
> +rte_mov64(uint8_t *dst, const uint8_t *src)
> +{
> +        asm volatile("LDP q0, q1, [%0]\n\t"
> +                     "STP q0, q1, [%1]\n\t"
> +                     "LDP q0, q1, [%0 , #32]\n\t"
> +                     "STP q0, q1, [%1 , #32]\n\t"
> +                     : : "r" (src), "r" (dst) :
> +        );
> +}
> +
> +static inline void
> +rte_mov128(uint8_t *dst, const uint8_t *src)
> +{
> +        asm volatile("LDP q0, q1, [%0]\n\t"
> +                     "STP q0, q1, [%1]\n\t"
> +                     "LDP q0, q1, [%0 , #32]\n\t"
> +                     "STP q0, q1, [%1 , #32]\n\t"
> +                     "LDP q0, q1, [%0 , #64]\n\t"
> +                     "STP q0, q1, [%1 , #64]\n\t"
> +                     "LDP q0, q1, [%0 , #96]\n\t"
> +                     "STP q0, q1, [%1 , #96]\n\t"
> +                     : : "r" (src), "r" (dst) :
> +        );
> +}
> +
> +static inline void
> +rte_mov256(uint8_t *dst, const uint8_t *src)
> +{
> +        asm volatile("LDP q0, q1, [%0]\n\t"
> +                     "STP q0, q1, [%1]\n\t"
> +                     "LDP q0, q1, [%0 , #32]\n\t"
> +                     "STP q0, q1, [%1 , #32]\n\t"
> +                     "LDP q0, q1, [%0 , #64]\n\t"
> +                     "STP q0, q1, [%1 , #64]\n\t"
> +                     "LDP q0, q1, [%0 , #96]\n\t"
> +                     "STP q0, q1, [%1 , #96]\n\t"
> +                     "LDP q0, q1, [%0 , #128]\n\t"
> +                     "STP q0, q1, [%1 , #128]\n\t"
> +                     "LDP q0, q1, [%0 , #160]\n\t"
> +                     "STP q0, q1, [%1 , #160]\n\t"
> +                     "LDP q0, q1, [%0 , #192]\n\t"
> +                     "STP q0, q1, [%1 , #192]\n\t"
> +                     "LDP q0, q1, [%0 , #224]\n\t"
> +                     "STP q0, q1, [%1 , #224]\n\t"
> +                     : : "r" (src), "r" (dst) :
> +        );
> +}
> +
> +#define rte_memcpy(dst, src, n)              \
> +        ({ (__builtin_constant_p(n)) ?       \
> +        memcpy((dst), (src), (n)) :          \
> +        rte_memcpy_func((dst), (src), (n)); })
> +
> +static inline void *
> +rte_memcpy_func(void *dst, const void *src, size_t n)
> +{
> +        void *ret = dst;
> +
> +        /* We can't copy < 16 bytes using XMM registers so do it manually. */
> +        if (n < 16) {
> +                if (n & 0x01) {
> +                        *(uint8_t *)dst = *(const uint8_t *)src;
> +                        dst = (uint8_t *)dst + 1;
> +                        src = (const uint8_t *)src + 1;
> +                }
> +                if (n & 0x02) {
> +                        *(uint16_t *)dst = *(const uint16_t *)src;
> +                        dst = (uint16_t *)dst + 1;
> +                        src = (const uint16_t *)src + 1;
> +                }
> +                if (n & 0x04) {
> +                        *(uint32_t *)dst = *(const uint32_t *)src;
> +                        dst = (uint32_t *)dst + 1;
> +                        src = (const uint32_t *)src + 1;
> +                }
> +                if (n & 0x08)
> +                        *(uint64_t *)dst = *(const uint64_t *)src;
> +                return ret;
> +        }
> +
> +        /* Special fast cases for <= 128 bytes */
> +        if (n <= 32) {
> +                rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +                rte_mov16((uint8_t *)dst - 16 + n,
> +                        (const uint8_t *)src - 16 + n);
> +                return ret;
> +        }
> +
> +        if (n <= 64) {
> +                rte_mov32((uint8_t *)dst, (const uint8_t *)src);
> +                rte_mov32((uint8_t *)dst - 32 + n,
> +                        (const uint8_t *)src - 32 + n);
> +                return ret;
> +        }
> +
> +        if (n <= 128) {
> +                rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +                rte_mov64((uint8_t *)dst - 64 + n,
> +                        (const uint8_t *)src - 64 + n);
> +                return ret;
> +        }
> +
> +        /*
> +         * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
> +         * copies was found to be faster than doing 128 and 32 byte copies as
> +         * well.
> +         */
> +        for ( ; n >= 256; n -= 256) {

There is room for prefetching the next cache line here, based on the
cache line size.
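For example, a rough sketch with GCC's __builtin_prefetch, assuming a
64-byte cache line (so four hints cover the next 256-byte block):

        for ( ; n >= 256; n -= 256) {
                /* Hint the next source block into cache, one hint per
                 * assumed 64-byte line; on aarch64 a prefetch that runs
                 * a little past the end of the buffer cannot fault. */
                __builtin_prefetch((const uint8_t *)src + 256, 0, 3);
                __builtin_prefetch((const uint8_t *)src + 320, 0, 3);
                __builtin_prefetch((const uint8_t *)src + 384, 0, 3);
                __builtin_prefetch((const uint8_t *)src + 448, 0, 3);
                rte_mov256((uint8_t *)dst, (const uint8_t *)src);
                dst = (uint8_t *)dst + 256;
                src = (const uint8_t *)src + 256;
        }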
> +                rte_mov256((uint8_t *)dst, (const uint8_t *)src);
> +                dst = (uint8_t *)dst + 256;
> +                src = (const uint8_t *)src + 256;
> +        }
> +
> +        /*
> +         * We split the remaining bytes (which will be less than 256) into
> +         * 64byte (2^6) chunks.
> +         * Using incrementing integers in the case labels of a switch statement
> +         * enourages the compiler to use a jump table. To get incrementing
> +         * integers, we shift the 2 relevant bits to the LSB position to first
> +         * get decrementing integers, and then subtract.
> +         */
> +        switch (3 - (n >> 6)) {
> +        case 0x00:
> +                rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +                n -= 64;
> +                dst = (uint8_t *)dst + 64;
> +                src = (const uint8_t *)src + 64;      /* fallthrough */
> +        case 0x01:
> +                rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +                n -= 64;
> +                dst = (uint8_t *)dst + 64;
> +                src = (const uint8_t *)src + 64;      /* fallthrough */
> +        case 0x02:
> +                rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +                n -= 64;
> +                dst = (uint8_t *)dst + 64;
> +                src = (const uint8_t *)src + 64;      /* fallthrough */
> +        default:
> +                break;
> +        }
> +
> +        /*
> +         * We split the remaining bytes (which will be less than 64) into
> +         * 16byte (2^4) chunks, using the same switch structure as above.
> +         */
> +        switch (3 - (n >> 4)) {
> +        case 0x00:
> +                rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +                n -= 16;
> +                dst = (uint8_t *)dst + 16;
> +                src = (const uint8_t *)src + 16;      /* fallthrough */
> +        case 0x01:
> +                rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +                n -= 16;
> +                dst = (uint8_t *)dst + 16;
> +                src = (const uint8_t *)src + 16;      /* fallthrough */
> +        case 0x02:
> +                rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +                n -= 16;
> +                dst = (uint8_t *)dst + 16;
> +                src = (const uint8_t *)src + 16;      /* fallthrough */
> +        default:
> +                break;
> +        }
> +
> +        /* Copy any remaining bytes, without going beyond end of buffers */
> +        if (n != 0)
> +                rte_mov16((uint8_t *)dst - 16 + n,
> +                        (const uint8_t *)src - 16 + n);
> +        return ret;
> +}
> +
> +#else
> +
> +static inline void
> +rte_mov16(uint8_t *dst, const uint8_t *src)
> +{
> +        memcpy(dst, src, 16);
> +}
> +
> +static inline void
> +rte_mov32(uint8_t *dst, const uint8_t *src)
> +{
> +        memcpy(dst, src, 32);
> +}
> +
> +static inline void
> +rte_mov48(uint8_t *dst, const uint8_t *src)
> +{
> +        memcpy(dst, src, 48);
> +}
> +
> +static inline void
> +rte_mov64(uint8_t *dst, const uint8_t *src)
> +{
> +        memcpy(dst, src, 64);
> +}
> +
> +static inline void
> +rte_mov128(uint8_t *dst, const uint8_t *src)
> +{
> +        memcpy(dst, src, 128);
> +}
> +
> +static inline void
> +rte_mov256(uint8_t *dst, const uint8_t *src)
> +{
> +        memcpy(dst, src, 256);
> +}
> +
> +static inline void *
> +rte_memcpy(void *dst, const void *src, size_t n)
> +{
> +        return memcpy(dst, src, n);
> +}
> +
> +static inline void *
> +rte_memcpy_func(void *dst, const void *src, size_t n)
> +{
> +        return memcpy(dst, src, n);
> +}
> +
> +#endif /* __ARM_NEON_FP */
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_MEMCPY_ARM_64_H_ */
> --
> 1.9.1
>