From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 18 Dec 2017 13:13:51 +0530
From: Jerin Jacob
To: Herbert Guan
Cc: dev@dpdk.org
Message-ID: <20171218074349.GA16659@jerin>
References: <1511768985-21639-1-git-send-email-herbert.guan@arm.com>
 <1513565664-19509-1-git-send-email-herbert.guan@arm.com>
In-Reply-To: <1513565664-19509-1-git-send-email-herbert.guan@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.9.2 (2017-12-15)
Subject: Re: [dpdk-dev] [PATCH v3] arch/arm: optimization for memcpy on AArch64
List-Id: DPDK patches and discussions

-----Original Message-----
> Date: Mon, 18 Dec 2017 10:54:24 +0800
> From: Herbert Guan
> To: dev@dpdk.org, jerin.jacob@caviumnetworks.com
> CC: Herbert Guan
> Subject: [PATCH v3] arch/arm: optimization for memcpy on AArch64
> X-Mailer: git-send-email 1.8.3.1
>
> Signed-off-by: Herbert Guan
> ---
>  config/common_armv8a_linuxapp                      |   6 +
>  .../common/include/arch/arm/rte_memcpy_64.h        | 292 +++++++++++++++++++++
>  2 files changed, 298 insertions(+)
>
> diff --git a/config/common_armv8a_linuxapp b/config/common_armv8a_linuxapp
> index 6732d1e..8f0cbed 100644
> --- a/config/common_armv8a_linuxapp
> +++ b/config/common_armv8a_linuxapp
> @@ -44,6 +44,12 @@ CONFIG_RTE_FORCE_INTRINSICS=y
>  # to address minimum DMA alignment across all arm64 implementations.
>  CONFIG_RTE_CACHE_LINE_SIZE=128
>
> +# Accelarate rte_memcpy.  Be sure to run unit test to determine the

There is an additional space before "Be". Also, rather than just mentioning
the unit test, give the exact test case name (memcpy_perf_autotest).

> +# best threshold in code.  Refer to notes in source file

There is an additional space before "Refer".

> +# (lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h) for more
> +# info.
> +CONFIG_RTE_ARCH_ARM64_MEMCPY=n
> +
>  CONFIG_RTE_LIBRTE_FM10K_PMD=n
>  CONFIG_RTE_LIBRTE_SFC_EFX_PMD=n
>  CONFIG_RTE_LIBRTE_AVP_PMD=n
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> index b80d8ba..1ea275d 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> @@ -42,6 +42,296 @@
>
>  #include "generic/rte_memcpy.h"
>
> +#ifdef RTE_ARCH_ARM64_MEMCPY

See the comment below at the "(GCC_VERSION < 50400)" check.

> +#include
> +#include
> +
> +/*
> + * The memory copy performance differs on different AArch64 micro-architectures.
> + * And the most recent glibc (e.g. 2.23 or later) can provide a better memcpy()
> + * performance compared to old glibc versions. It's always suggested to use a
> + * more recent glibc if possible, from which the entire system can get benefit.
> + *
> + * This implementation improves memory copy on some aarch64 micro-architectures,
> + * when an old glibc (e.g. 2.19, 2.17...) is being used. It is disabled by
> + * default and needs "RTE_ARCH_ARM64_MEMCPY" defined to activate. It's not
> + * always providing better performance than memcpy() so users need to run unit
> + * test "memcpy_perf_autotest" and customize parameters in customization section
> + * below for best performance.
> + *
> + * Compiler version will also impact the rte_memcpy() performance. It's observed
> + * on some platforms and with the same code, GCC 7.2.0 compiled binaries can
> + * provide better performance than GCC 4.8.5 compiled binaries.
> + */
> +
> +/**************************************
> + * Beginning of customization section
> + **************************************/
> +#define ALIGNMENT_MASK 0x0F

This symbol will be included in the public rte_memcpy.h for arm64 DPDK
builds. Please use an RTE_ prefix to avoid multiple definitions
(RTE_ARCH_ARM64_ALIGN_MASK ?
or any shorter name)

> +#ifndef RTE_ARCH_ARM64_MEMCPY_STRICT_ALIGN
> +/* Only src unalignment will be treaed as unaligned copy */
> +#define IS_UNALIGNED_COPY(dst, src) ((uintptr_t)(dst) & ALIGNMENT_MASK)
> +#else
> +/* Both dst and src unalignment will be treated as unaligned copy */
> +#define IS_UNALIGNED_COPY(dst, src) \
> +	(((uintptr_t)(dst) | (uintptr_t)(src)) & ALIGNMENT_MASK)
> +#endif
> +
> +
> +/*
> + * If copy size is larger than threshold, memcpy() will be used.
> + * Run "memcpy_perf_autotest" to determine the proper threshold.
> + */
> +#define ALIGNED_THRESHOLD ((size_t)(0xffffffff))
> +#define UNALIGNED_THRESHOLD ((size_t)(0xffffffff))

Same as the above comment (use an RTE_ prefix).

> +
> +/**************************************
> + * End of customization section
> + **************************************/
> +#ifdef RTE_TOOLCHAIN_GCC
> +#if (GCC_VERSION < 50400)
> +#warning "The GCC version is quite old, which may result in sub-optimal \
> +performance of the compiled code. It is suggested that at least GCC 5.4.0 \
> +be used."

Even though it is only a warning, it will generate an error depending on
where this file gets included (see below). How about selecting the optimized
memcpy only when RTE_ARCH_ARM64_MEMCPY is defined and GCC_VERSION >= 50400?

  CC eal_common_options.o
In file included from /home/jerin/dpdk.org/build/include/rte_memcpy.h:37:0,
                 from /home/jerin/dpdk.org/lib/librte_eal/common/eal_common_options.c:53:
/home/jerin/dpdk.org/build/include/rte_memcpy_64.h:93:2: error: #warning
                                                                ^^^^^^^^
"The GCC version is quite old, which may result in sub-optimal performance of
the compiled code. It is suggested that at least GCC 5.4.0 be used." [-Werror=cpp]
                                                                     ^^^^^^^^^^^^^^
 #warning "The GCC version is quite old, which may result in sub-optimal \
  ^

> +#endif
> +#endif
> +
> +
> +#if RTE_CACHE_LINE_SIZE >= 128

We can remove this conditional compilation check, i.e.
it can be compiled in both cases, but it will be used only when
RTE_CACHE_LINE_SIZE >= 128.

> +static __rte_always_inline void
> +rte_memcpy_ge16_lt128
> +(uint8_t *restrict dst, const uint8_t *restrict src, size_t n)
> +{
> +	if (n < 64) {
> +		if (n == 16) {
> +			rte_mov16(dst, src);
> +		} else if (n <= 32) {
> +			rte_mov16(dst, src);
> +			rte_mov16(dst - 16 + n, src - 16 + n);
> +		} else if (n <= 48) {
> +			rte_mov32(dst, src);
> +			rte_mov16(dst - 16 + n, src - 16 + n);
> +		} else {
> +			rte_mov48(dst, src);
> +			rte_mov16(dst - 16 + n, src - 16 + n);
> +		}
> +	} else {
> +		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +		if (n > 48 + 64)
> +			rte_mov64(dst - 64 + n, src - 64 + n);
> +		else if (n > 32 + 64)
> +			rte_mov48(dst - 48 + n, src - 48 + n);
> +		else if (n > 16 + 64)
> +			rte_mov32(dst - 32 + n, src - 32 + n);
> +		else if (n > 64)
> +			rte_mov16(dst - 16 + n, src - 16 + n);
> +	}
> +}
> +
> +
> +#else

Same as the above comment (the conditional compilation check can be removed).

> +static __rte_always_inline void
> +rte_memcpy_ge16_lt64
> +(uint8_t *restrict dst, const uint8_t *restrict src, size_t n)
> +{
> +	if (n == 16) {
> +		rte_mov16(dst, src);
> +	} else if (n <= 32) {
> +		rte_mov16(dst, src);
> +		rte_mov16(dst - 16 + n, src - 16 + n);
> +	} else if (n <= 48) {
> +		rte_mov32(dst, src);
> +		rte_mov16(dst - 16 + n, src - 16 + n);
> +	} else {
> +		rte_mov48(dst, src);
> +		rte_mov16(dst - 16 + n, src - 16 + n);
> +	}
> +}
> +
> +
> +static __rte_always_inline void *
> +rte_memcpy(void *restrict dst, const void *restrict src, size_t n)
> +{
> +	if (n < 16) {
> +		rte_memcpy_lt16((uint8_t *)dst, (const uint8_t *)src, n);
> +		return dst;
> +	}
> +#if RTE_CACHE_LINE_SIZE >= 128
> +	if (n < 128) {
> +		rte_memcpy_ge16_lt128((uint8_t *)dst, (const uint8_t *)src, n);
> +		return dst;
> +	}
> +#else
> +	if (n < 64) {
> +		rte_memcpy_ge16_lt64((uint8_t *)dst, (const uint8_t *)src, n);
> +		return dst;
> +	}
> +#endif
> +	__builtin_prefetch(src, 0, 0);
> +	__builtin_prefetch(dst, 1, 0);
> +	if (likely(
> +		  (!IS_UNALIGNED_COPY(dst, src) && n <= ALIGNED_THRESHOLD)
> +		  || (IS_UNALIGNED_COPY(dst, src) && n <= UNALIGNED_THRESHOLD)
> +		  )) {
> +#if RTE_CACHE_LINE_SIZE >= 128
> +		rte_memcpy_ge128((uint8_t *)dst, (const uint8_t *)src, n);
> +#else
> +		rte_memcpy_ge64((uint8_t *)dst, (const uint8_t *)src, n);
> +#endif

Can we remove this #ifdef clutter? (We have two of them in the same function.)

I suggest removing the clutter by having separate routines, i.e.:

1)
#if RTE_CACHE_LINE_SIZE >= 128
rte_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
}
#else
rte_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
}
#endif

2) Have a separate inline function that resolves the following logic, and
use it in both variants:

	if (likely(
	  (!IS_UNALIGNED_COPY(dst, src) && n <= ALIGNED_THRESHOLD)
	  || (IS_UNALIGNED_COPY(dst, src) && n <= UNALIGNED_THRESHOLD)
	  )) {

With the above changes:
Acked-by: Jerin Jacob
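
For reference, suggestions 1) and 2) could look roughly like the sketch
below. It is only an illustration under stated assumptions, not the code from
the patch: the sketch_* names and the RTE_ARCH_ARM64_*_THRESHOLD and
RTE_ARCH_ARM64_IS_UNALIGNED_COPY names are hypothetical, and a plain byte loop
stands in for the rte_memcpy_lt16(), rte_memcpy_ge16_lt128(),
rte_memcpy_ge128() and rte_memcpy_ge64() helpers so that the snippet compiles
on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: normally supplied by the DPDK build configuration. */
#ifndef RTE_CACHE_LINE_SIZE
#define RTE_CACHE_LINE_SIZE 128
#endif

/* Hypothetical RTE_-prefixed names for the customization macros. */
#define RTE_ARCH_ARM64_ALIGN_MASK 0x0F
#define RTE_ARCH_ARM64_ALIGNED_THRESHOLD   ((size_t)(0xffffffff))
#define RTE_ARCH_ARM64_UNALIGNED_THRESHOLD ((size_t)(0xffffffff))

/* Same test as the non-strict IS_UNALIGNED_COPY() in the quoted patch. */
#define RTE_ARCH_ARM64_IS_UNALIGNED_COPY(dst, src) \
	((uintptr_t)(dst) & RTE_ARCH_ARM64_ALIGN_MASK)

/* Suggestion 2): resolve the threshold logic once, in a single helper. */
static inline int
sketch_use_inline_copy(const void *dst, const void *src, size_t n)
{
	(void)src; /* only needed by a strict-alignment variant of the macro */
	if (RTE_ARCH_ARM64_IS_UNALIGNED_COPY(dst, src))
		return n <= RTE_ARCH_ARM64_UNALIGNED_THRESHOLD;
	return n <= RTE_ARCH_ARM64_ALIGNED_THRESHOLD;
}

/* Stand-in for the rte_memcpy_*() helpers of the patch: a plain byte loop. */
static inline void
sketch_copy_bytes(uint8_t *restrict dst, const uint8_t *restrict src, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = src[i];
}

/* Suggestion 1): one complete routine per cache line size. */
#if RTE_CACHE_LINE_SIZE >= 128
static inline void *
sketch_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
	if (n < 128) {
		/* patch: rte_memcpy_lt16() or rte_memcpy_ge16_lt128() */
		sketch_copy_bytes(dst, src, n);
		return dst;
	}
	if (!sketch_use_inline_copy(dst, src, n))
		return memcpy(dst, src, n);
	sketch_copy_bytes(dst, src, n); /* patch: rte_memcpy_ge128() */
	return dst;
}
#else
static inline void *
sketch_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
	if (n < 64) {
		/* patch: rte_memcpy_lt16() or rte_memcpy_ge16_lt64() */
		sketch_copy_bytes(dst, src, n);
		return dst;
	}
	if (!sketch_use_inline_copy(dst, src, n))
		return memcpy(dst, src, n);
	sketch_copy_bytes(dst, src, n); /* patch: rte_memcpy_ge64() */
	return dst;
}
#endif

/* Copies several sizes and returns how many sizes were exercised. */
static inline int
sketch_selfcheck(void)
{
	uint8_t src[300], dst[300];
	size_t i, n;
	int checked = 0;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (uint8_t)i;
	for (n = 0; n <= sizeof(src); n += 30) {
		void *ret;

		memset(dst, 0, sizeof(dst));
		ret = sketch_memcpy(dst, src, n);
		assert(ret == dst);
		assert(memcmp(dst, src, n) == 0);
		(void)ret;
		checked++;
	}
	return checked;
}
```

With this split, each variant of the copy routine reads top to bottom without
an #if inside the function body, and the aligned/unaligned threshold decision
lives in one place that both variants call.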