From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Li, Xiaoyun" <xiaoyun.li@intel.com>,
"Richardson, Bruce" <bruce.richardson@intel.com>
Cc: "Lu, Wenzhuo" <wenzhuo.lu@intel.com>,
"Zhang, Helin" <helin.zhang@intel.com>,
"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v5 0/3] run-time Linking support
Date: Wed, 4 Oct 2017 17:56:17 +0000 [thread overview]
Message-ID: <2601191342CEEE43887BDE71AB9772585FAA4014@IRSMSX103.ger.corp.intel.com> (raw)
In-Reply-To: <1507042796-86318-1-git-send-email-xiaoyun.li@intel.com>
Hi Xiaouyn,
> -----Original Message-----
> From: Li, Xiaoyun
> Sent: Tuesday, October 3, 2017 4:00 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org; Li, Xiaoyun <xiaoyun.li@intel.com>
> Subject: [PATCH v5 0/3] run-time Linking support
>
> This patchset dynamically selects functions at run-time based on CPU flags
> that current machine supports.This patchset modifies mempcy, memcpy perf
> test and x86 EFD, using function pointers and bind them at constructor time.
> Then in the cloud environment, users can compiler once for the minimum target
> such as 'haswell'(not 'native') and run on different platforms (equal or above
> haswell) and can get ISA optimization based on running CPU.
>
> Xiaoyun Li (3):
> eal/x86: run-time dispatch over memcpy
> app/test: run-time dispatch over memcpy perf test
> efd: run-time dispatch over x86 EFD functions
>
> ---
> v2
> * Use gcc function multi-versioning to avoid compilation issues.
> * Add macros for AVX512 and AVX2. Only if users enable AVX512 and the compiler
> supports it, the AVX512 codes would be compiled. Only if the compiler supports
> AVX2, the AVX2 codes would be compiled.
>
> v3
> * Reduce function calls via only keep rte_memcpy_xxx.
> * Add conditions that when copy size is small, use inline code path.
> Otherwise, use dynamic code path.
> * To support attribute target, clang version must be greater than 3.7.
> Otherwise, would choose SSE/AVX code path, the same as before.
> * Move two mocro functions to the top of the code since they would be used in
> inline SSE/AVX and dynamic SSE/AVX codes.
>
> v4
> * Modify rte_memcpy.h to several .c files and modify makefiles to compile
> AVX2 and AVX512 files.
>
> v5
> * Delete redundant repeated codes of rte_memcpy_xxx.
> * Modify makefiles to enable reuse of existing rte_memcpy.
> * Delete redundant codes of rte_efd_x86.h in v4. Move it into .c file and enable
> compilation -mavx2 for it in makefile since it is already chosen at run-time.
>
Generally looks good, just two things to fix below.
Konstantin
1. [dpdk-dev,v5,1/3] eal/x86: run-time dispatch over memcpy
Shared target build fails:
http://dpdk.org/ml/archives/test-report/2017-October/031032.html
I think you need to include rte_memcpy_ptr into the:
lib/librte_eal/linuxapp/eal/rte_eal_version.map
lib/librte_eal/bsdapp/eal/rte_eal_version.map
to fix it.
2. [dpdk-dev,v5,3/3] efd: run-time dispatch over x86 EFD functions
/lib/librte_efd/rte_efd_x86.c
....
+efd_value_t
+efd_lookup_internal_avx2(const efd_hashfunc_t *group_hash_idx,
+ const efd_lookuptbl_t *group_lookup_table,
+ const uint32_t hash_val_a, const uint32_t hash_val_b)
+{
+#ifdef CC_SUPPORT_AVX2
+ efd_value_t value = 0;
+ uint32_t i = 0;
+ __m256i vhash_val_a = _mm256_set1_epi32(hash_val_a);
+ __m256i vhash_val_b = _mm256_set1_epi32(hash_val_b);
+
+ for (; i < RTE_EFD_VALUE_NUM_BITS; i += 8) {
+ __m256i vhash_idx =
+ _mm256_cvtepu16_epi32(EFD_LOAD_SI128(
+ (__m128i const *) &group_hash_idx[i]));
+ __m256i vlookup_table = _mm256_cvtepu16_epi32(
+ EFD_LOAD_SI128((__m128i const *)
+ &group_lookup_table[i]));
+ __m256i vhash = _mm256_add_epi32(vhash_val_a,
+ _mm256_mullo_epi32(vhash_idx, vhash_val_b));
+ __m256i vbucket_idx = _mm256_srli_epi32(vhash,
+ EFD_LOOKUPTBL_SHIFT);
+ __m256i vresult = _mm256_srlv_epi32(vlookup_table,
+ vbucket_idx);
+
+ value |= (_mm256_movemask_ps(
+ (__m256) _mm256_slli_epi32(vresult, 31))
+ & ((1 << (RTE_EFD_VALUE_NUM_BITS - i)) - 1)) << i;
+ }
+
+ return value;
+#else
We always build that file with AVX2 option, so I think we can safely remove
The #ifdef CC_SUPPORT_AVX2 and the code below.
+ RTE_SET_USED(group_hash_idx);
+ RTE_SET_USED(group_lookup_table);
+ RTE_SET_USED(hash_val_a);
+ RTE_SET_USED(hash_val_b);
+ /* Return dummy value, only to avoid compilation breakage */
+ return 0;
+#endif
+
+}
> lib/librte_eal/bsdapp/eal/Makefile | 19 +
> .../common/include/arch/x86/rte_memcpy.c | 59 ++
> .../common/include/arch/x86/rte_memcpy.h | 861 +------------------
> .../common/include/arch/x86/rte_memcpy_avx2.c | 44 +
> .../common/include/arch/x86/rte_memcpy_avx512f.c | 44 +
> .../common/include/arch/x86/rte_memcpy_internal.h | 909 +++++++++++++++++++++
> .../common/include/arch/x86/rte_memcpy_sse.c | 40 +
> lib/librte_eal/linuxapp/eal/Makefile | 19 +
> lib/librte_efd/Makefile | 6 +
> lib/librte_efd/rte_efd_x86.c | 87 ++
> lib/librte_efd/rte_efd_x86.h | 48 +-
> mk/rte.cpuflags.mk | 14 +
> test/test/test_memcpy_perf.c | 40 +-
> 13 files changed, 1285 insertions(+), 905 deletions(-)
> create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy.c
> create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_avx2.c
> create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_avx512f.c
> create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_internal.h
> create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_sse.c
> create mode 100644 lib/librte_efd/rte_efd_x86.c
>
> --
> 2.7.4
next prev parent reply other threads:[~2017-10-04 17:56 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-26 7:41 [dpdk-dev] [PATCH v3 0/3] dynamic linking support Xiaoyun Li
2017-09-26 7:41 ` [dpdk-dev] [PATCH v3 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-01 23:41 ` Ananyev, Konstantin
2017-10-02 0:12 ` Li, Xiaoyun
2017-09-26 7:41 ` [dpdk-dev] [PATCH v3 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-09-26 7:41 ` [dpdk-dev] [PATCH v3 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-02 0:08 ` Ananyev, Konstantin
2017-10-02 0:09 ` Li, Xiaoyun
2017-10-02 9:35 ` Ananyev, Konstantin
2017-10-02 16:13 ` [dpdk-dev] [PATCH v4 0/3] run-time Linking support Xiaoyun Li
2017-10-02 16:13 ` [dpdk-dev] [PATCH v4 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-02 16:39 ` Ananyev, Konstantin
2017-10-02 23:10 ` Li, Xiaoyun
2017-10-03 11:15 ` Ananyev, Konstantin
2017-10-03 11:39 ` Li, Xiaoyun
2017-10-03 12:12 ` Ananyev, Konstantin
2017-10-03 12:23 ` Li, Xiaoyun
2017-10-02 16:13 ` [dpdk-dev] [PATCH v4 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-02 16:13 ` [dpdk-dev] [PATCH v4 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-02 16:52 ` Ananyev, Konstantin
2017-10-03 8:15 ` Li, Xiaoyun
2017-10-03 11:23 ` Ananyev, Konstantin
2017-10-03 11:27 ` Li, Xiaoyun
2017-10-03 14:59 ` [dpdk-dev] [PATCH v5 0/3] run-time Linking support Xiaoyun Li
2017-10-03 14:59 ` [dpdk-dev] [PATCH v5 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-03 14:59 ` [dpdk-dev] [PATCH v5 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-03 14:59 ` [dpdk-dev] [PATCH v5 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-04 17:56 ` Ananyev, Konstantin [this message]
2017-10-04 22:33 ` [dpdk-dev] [PATCH v5 0/3] run-time Linking support Li, Xiaoyun
2017-10-04 22:58 ` [dpdk-dev] [PATCH v6 " Xiaoyun Li
2017-10-04 22:58 ` [dpdk-dev] [PATCH v6 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-05 9:37 ` Ananyev, Konstantin
2017-10-05 9:38 ` Ananyev, Konstantin
2017-10-05 11:19 ` Li, Xiaoyun
2017-10-05 11:26 ` Richardson, Bruce
2017-10-05 11:26 ` Li, Xiaoyun
2017-10-05 12:12 ` Ananyev, Konstantin
2017-10-04 22:58 ` [dpdk-dev] [PATCH v6 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-04 22:58 ` [dpdk-dev] [PATCH v6 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-05 9:40 ` Ananyev, Konstantin
2017-10-05 10:23 ` Li, Xiaoyun
2017-10-05 12:33 ` [dpdk-dev] [PATCH v7 0/3] run-time Linking support Xiaoyun Li
2017-10-05 12:33 ` [dpdk-dev] [PATCH v7 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-09 17:47 ` Thomas Monjalon
2017-10-13 1:06 ` Li, Xiaoyun
2017-10-13 7:21 ` Thomas Monjalon
2017-10-13 7:30 ` Li, Xiaoyun
2017-10-13 7:31 ` Ananyev, Konstantin
2017-10-13 7:36 ` Thomas Monjalon
2017-10-13 7:41 ` Li, Xiaoyun
2017-10-05 12:33 ` [dpdk-dev] [PATCH v7 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-05 12:33 ` [dpdk-dev] [PATCH v7 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-05 13:24 ` [dpdk-dev] [PATCH v7 0/3] run-time Linking support Ananyev, Konstantin
2017-10-09 17:40 ` Thomas Monjalon
2017-10-13 0:58 ` Li, Xiaoyun
2017-10-13 9:01 ` [dpdk-dev] [PATCH v8 " Xiaoyun Li
2017-10-13 9:01 ` [dpdk-dev] [PATCH v8 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-13 9:28 ` Thomas Monjalon
2017-10-13 10:26 ` Ananyev, Konstantin
2017-10-17 21:24 ` Thomas Monjalon
2017-10-18 2:21 ` Li, Xiaoyun
2017-10-18 6:22 ` Li, Xiaoyun
2017-10-19 2:45 ` Li, Xiaoyun
2017-10-19 6:58 ` Thomas Monjalon
2017-10-19 7:51 ` Li, Xiaoyun
2017-10-19 8:33 ` Thomas Monjalon
2017-10-19 8:50 ` Li, Xiaoyun
2017-10-19 8:59 ` Ananyev, Konstantin
2017-10-19 9:00 ` Thomas Monjalon
2017-10-19 9:29 ` Bruce Richardson
2017-10-20 1:02 ` Li, Xiaoyun
2017-10-25 6:55 ` Li, Xiaoyun
2017-10-25 7:25 ` Thomas Monjalon
2017-10-29 8:49 ` Thomas Monjalon
2017-11-02 10:22 ` Wang, Zhihong
2017-11-02 10:44 ` Thomas Monjalon
2017-11-02 10:58 ` Li, Xiaoyun
2017-11-02 12:15 ` Thomas Monjalon
2017-11-03 7:47 ` Yao, Lei A
2017-10-25 8:50 ` Ananyev, Konstantin
2017-10-25 8:54 ` Li, Xiaoyun
2017-10-25 9:00 ` Thomas Monjalon
2017-10-25 10:32 ` Li, Xiaoyun
2017-10-25 9:14 ` Ananyev, Konstantin
2017-10-13 9:01 ` [dpdk-dev] [PATCH v8 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-13 9:01 ` [dpdk-dev] [PATCH v8 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-13 13:13 ` [dpdk-dev] [PATCH v8 0/3] run-time Linking support Thomas Monjalon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2601191342CEEE43887BDE71AB9772585FAA4014@IRSMSX103.ger.corp.intel.com \
--to=konstantin.ananyev@intel.com \
--cc=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
--cc=helin.zhang@intel.com \
--cc=wenzhuo.lu@intel.com \
--cc=xiaoyun.li@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).