From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 0D6D0A04B6;
	Mon, 12 Oct 2020 16:52:03 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id C74641BB9A;
	Mon, 12 Oct 2020 16:52:00 +0200 (CEST)
Received: from mga14.intel.com (mga14.intel.com [192.55.52.115])
 by dpdk.org (Postfix) with ESMTP id 1A8E41B9F0
 for <dev@dpdk.org>; Mon, 12 Oct 2020 16:51:57 +0200 (CEST)
IronPort-SDR: UlNwE5TQzV0w1rtBHmWNgvya6QRSq1U243FQKa0ZzBCuCVpn57KUJMLRdbq2+EIXxsZvUof3mn
 ssLjBOC6Tl5g==
X-IronPort-AV: E=McAfee;i="6000,8403,9771"; a="164964380"
X-IronPort-AV: E=Sophos;i="5.77,367,1596524400"; d="scan'208";a="164964380"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga003.fm.intel.com ([10.253.24.29])
 by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 12 Oct 2020 07:51:57 -0700
IronPort-SDR: qu6b58a4pza7AZkw579lEw3+vFssMAeYYSDHdo6NxF5H3MelvGKN7BOznhKU7jdXuPuttFh/FI
 xfEWFYruEFVg==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.77,367,1596524400"; d="scan'208";a="355817826"
Received: from silpixa00399126.ir.intel.com ([10.237.222.4])
 by FMSMGA003.fm.intel.com with ESMTP; 12 Oct 2020 07:51:56 -0700
From: Bruce Richardson <bruce.richardson@intel.com>
To: dev@dpdk.org
Cc: yingyax.han@intel.com, konstantin.ananyev@intel.com, lijuan.tu@intel.com,
 Bruce Richardson <bruce.richardson@intel.com>
Date: Mon, 12 Oct 2020 15:51:48 +0100
Message-Id: <20201012145148.290451-1-bruce.richardson@intel.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [dpdk-dev] [PATCH] build: fix memcpy behaviour regression
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

When testing on some x86 platforms, code compiled with meson was observed
running at a different power-license level to that compiled with make. This
is because meson auto-detects the instruction sets available on the system
and enables the AVX512 rte_memcpy path whenever AVX512 is available, while
with make, a build-time AVX-512 flag had to be set explicitly to enable
that AVX512 rte_memcpy code path.

In the absence of runtime path selection for rte_memcpy - which is
complicated by it being a static inline function in a header file - we can
fix this behaviour regression by similarly having a build-time option which
must be set to enable the AVX-512 memcpy path.

Fixes: a25a650be5f0 ("build: add infrastructure for meson and ninja builds")
Fixes: 3e1bb55fd6ef ("build/x86: add SSE flags")

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

---
NOTE: This patch is not suitable for backporting, as it will break
make builds without additional makefile changes.
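
For illustration only, and not part of the patch: a minimal sketch of how
the new guard is expected to behave at compile time. Only the two macro
names are taken from the patch; selected_memcpy_path() is a hypothetical
helper, not a DPDK API.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not a DPDK API): reports which rte_memcpy code
 * path the guard introduced by this patch would select for the current
 * translation unit.
 */
static const char *
selected_memcpy_path(void)
{
#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512
	/* Compiler targets AVX512F and the user opted in explicitly. */
	return "avx512";
#else
	/* Either AVX512F is not targeted or the opt-in macro is absent. */
	return "non-avx512";
#endif
}

/* Print the selected path; call this from the application's main(). */
static void
report_memcpy_path(void)
{
	printf("rte_memcpy path: %s\n", selected_memcpy_path());
}
```

Built with -mavx512f alone, the helper returns "non-avx512"; the AVX-512
branch is taken only when -DRTE_MEMCPY_AVX512 is also present in CFLAGS,
or the macro is defined before the header is included.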
---
 lib/librte_eal/include/generic/rte_memcpy.h | 4 ++++
 lib/librte_eal/x86/include/rte_memcpy.h     | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/include/generic/rte_memcpy.h b/lib/librte_eal/include/generic/rte_memcpy.h
index 701e550c3..e7f0f8eaa 100644
--- a/lib/librte_eal/include/generic/rte_memcpy.h
+++ b/lib/librte_eal/include/generic/rte_memcpy.h
@@ -95,6 +95,10 @@ rte_mov256(uint8_t *dst, const uint8_t *src);
  * @note This is implemented as a macro, so it's address should not be taken
  * and care is needed as parameter expressions may be evaluated multiple times.
  *
+ * @note For x86 platforms to enable the AVX-512 memcpy implementation, set
+ * -DRTE_MEMCPY_AVX512 macro in CFLAGS, or define the RTE_MEMCPY_AVX512 macro
+ * explicitly in the source file before including the rte_memcpy header file.
+ *
  * @param dst
  *   Pointer to the destination of the data.
  * @param src
diff --git a/lib/librte_eal/x86/include/rte_memcpy.h b/lib/librte_eal/x86/include/rte_memcpy.h
index 008a3de67..79f381dd9 100644
--- a/lib/librte_eal/x86/include/rte_memcpy.h
+++ b/lib/librte_eal/x86/include/rte_memcpy.h
@@ -45,7 +45,7 @@ extern "C" {
 static __rte_always_inline void *
 rte_memcpy(void *dst, const void *src, size_t n);
 
-#ifdef __AVX512F__
+#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512
 
 #define ALIGNMENT_MASK 0x3F
 
-- 
2.25.1