From: Xiaoyun Li
To: thomas@monjalon.net, konstantin.ananyev@intel.com
Cc: dev@dpdk.org, bruce.richardson@intel.com, wenzhuo.lu@intel.com, helin.zhang@intel.com, Xiaoyun Li
Date: Fri, 13 Oct 2017 17:01:46 +0800
Message-Id: <1507885309-165144-1-git-send-email-xiaoyun.li@intel.com>
In-Reply-To: <1507206794-79941-1-git-send-email-xiaoyun.li@intel.com>
References: <1507206794-79941-1-git-send-email-xiaoyun.li@intel.com>
Subject: [dpdk-dev] [PATCH v8 0/3] run-time Linking support

This patchset selects functions at run time based on the CPU flags that
the current machine supports. It modifies memcpy, the memcpy perf test
and x86 EFD to use function pointers that are bound at constructor time.
In a cloud environment, users can then compile once for a minimum target
such as 'haswell' (not 'native'), run on different platforms (haswell or
above), and still get ISA optimizations based on the running CPU.

Xiaoyun Li (3):
  eal/x86: run-time dispatch over memcpy
  app/test: run-time dispatch over memcpy perf test
  efd: run-time dispatch over x86 EFD functions

---
v2
* Use gcc function multi-versioning to avoid compilation issues.
* Add macros for AVX512 and AVX2. The AVX512 code is compiled only if
  users enable AVX512 and the compiler supports it. The AVX2 code is
  compiled only if the compiler supports AVX2.

v3
* Reduce function calls by keeping only rte_memcpy_xxx.
* Add a condition: when the copy size is small, use the inline code
  path. Otherwise, use the dynamic code path.
* To support attribute target, clang version must be greater than 3.7.
  Otherwise the SSE/AVX code path is chosen, the same as before.
* Move two macro functions to the top of the code since they are used
  in both the inline SSE/AVX and the dynamic SSE/AVX code.

v4
* Split rte_memcpy.h into several .c files and modify makefiles to
  compile the AVX2 and AVX512 files.

v5
* Delete redundant repeated code of rte_memcpy_xxx.
* Modify makefiles to enable reuse of existing rte_memcpy.
* Delete redundant code of rte_efd_x86.h in v4. Move it into a .c file
  and enable compilation with -mavx2 for it in the makefile since it is
  already chosen at run-time.

v6
* Fix shared target build failure.
* Safely remove redundant efd x86 avx2 code since the file is compiled
  with -mavx2.

v7
* Modify the added version map code in v6 to be more reasonable.
* Safely remove redundant efd x86 avx2 code since the file is compiled
  with -mavx2.

v8
* Move added .c files to .../common/arch/x86 directory.
* Fix patchset warnings.
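
For illustration only, the scheme described in the cover letter (function
pointers chosen from CPU flags at constructor time, plus the v3 rule that
small copies stay on the inline path) can be sketched as below. This is a
minimal sketch and not the code in the patches: copy_fn_t, copy_generic,
copy_avx2, copy_select, my_copy and the 128-byte INLINE_COPY_MAX threshold
are hypothetical names and values, the two copy bodies are plain memcpy
stand-ins, and the GCC/clang builtin __builtin_cpu_supports() is assumed
here in place of whatever CPU-flag query the patches actually use.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature shared by every copy variant. */
typedef void *(*copy_fn_t)(void *dst, const void *src, size_t n);

/* Stand-in for the baseline (SSE) implementation. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Stand-in for a body that would live in a file built with -mavx2. */
static void *copy_avx2(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Dispatch pointer; defaults to the baseline until the constructor runs. */
static copy_fn_t copy_ptr = copy_generic;

/* Runs before main(): bind the pointer according to the running CPU. */
__attribute__((constructor))
static void copy_select(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        copy_ptr = copy_avx2;
#endif
}

/* Assumed threshold: at or below it, skip the indirect call. */
#define INLINE_COPY_MAX 128

static inline void *my_copy(void *dst, const void *src, size_t n)
{
    if (n <= INLINE_COPY_MAX)
        return copy_generic(dst, src, n); /* inline code path */
    return copy_ptr(dst, src, n);         /* dynamic code path */
}
```

Binding once in a constructor keeps the per-call cost to a single indirect
call, and the size check avoids even that for small copies, where the call
overhead would matter most.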

 lib/librte_eal/bsdapp/eal/Makefile                |  18 +
 lib/librte_eal/bsdapp/eal/rte_eal_version.map     |   1 +
 lib/librte_eal/common/arch/x86/rte_memcpy.c       |  59 ++
 lib/librte_eal/common/arch/x86/rte_memcpy_avx2.c  |  44 +
 .../common/arch/x86/rte_memcpy_avx512f.c          |  44 +
 lib/librte_eal/common/arch/x86/rte_memcpy_sse.c   |  40 +
 .../common/include/arch/x86/rte_memcpy.h          | 861 +-----------------
 .../common/include/arch/x86/rte_memcpy_internal.h | 966 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/Makefile              |  18 +
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |   1 +
 lib/librte_efd/Makefile                           |   6 +
 lib/librte_efd/rte_efd_x86.c                      |  77 ++
 lib/librte_efd/rte_efd_x86.h                      |  48 +-
 mk/rte.cpuflags.mk                                |  14 +
 test/test/test_memcpy_perf.c                      |  50 +-
 15 files changed, 1342 insertions(+), 905 deletions(-)
 create mode 100644 lib/librte_eal/common/arch/x86/rte_memcpy.c
 create mode 100644 lib/librte_eal/common/arch/x86/rte_memcpy_avx2.c
 create mode 100644 lib/librte_eal/common/arch/x86/rte_memcpy_avx512f.c
 create mode 100644 lib/librte_eal/common/arch/x86/rte_memcpy_sse.c
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_internal.h
 create mode 100644 lib/librte_efd/rte_efd_x86.c

--
2.7.4