From: "Li, Xiaoyun" <xiaoyun.li@intel.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
"Richardson, Bruce" <bruce.richardson@intel.com>
Cc: "Lu, Wenzhuo" <wenzhuo.lu@intel.com>,
"Zhang, Helin" <helin.zhang@intel.com>,
"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v5 0/3] run-time Linking support
Date: Wed, 4 Oct 2017 22:33:42 +0000 [thread overview]
Message-ID: <B9E724F4CB7543449049E7AE7669D82F46443E@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <2601191342CEEE43887BDE71AB9772585FAA4014@IRSMSX103.ger.corp.intel.com>
OK. Will send it later. Many thanks!
> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Thursday, October 5, 2017 01:56
> To: Li, Xiaoyun <xiaoyun.li@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>
> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang, Helin
> <helin.zhang@intel.com>; dev@dpdk.org
> Subject: RE: [PATCH v5 0/3] run-time Linking support
>
> Hi Xiaouyn,
>
> > -----Original Message-----
> > From: Li, Xiaoyun
> > Sent: Tuesday, October 3, 2017 4:00 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>
> > Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang, Helin
> > <helin.zhang@intel.com>; dev@dpdk.org; Li, Xiaoyun
> > <xiaoyun.li@intel.com>
> > Subject: [PATCH v5 0/3] run-time Linking support
> >
> > This patchset dynamically selects functions at run-time based on CPU
> > flags that current machine supports.This patchset modifies mempcy,
> > memcpy perf test and x86 EFD, using function pointers and bind them at
> constructor time.
> > Then in the cloud environment, users can compiler once for the minimum
> > target such as 'haswell'(not 'native') and run on different platforms
> > (equal or above
> > haswell) and can get ISA optimization based on running CPU.
> >
> > Xiaoyun Li (3):
> > eal/x86: run-time dispatch over memcpy
> > app/test: run-time dispatch over memcpy perf test
> > efd: run-time dispatch over x86 EFD functions
> >
> > ---
> > v2
> > * Use gcc function multi-versioning to avoid compilation issues.
> > * Add macros for AVX512 and AVX2. Only if users enable AVX512 and the
> > compiler supports it, the AVX512 codes would be compiled. Only if the
> > compiler supports AVX2, the AVX2 codes would be compiled.
> >
> > v3
> > * Reduce function calls via only keep rte_memcpy_xxx.
> > * Add conditions that when copy size is small, use inline code path.
> > Otherwise, use dynamic code path.
> > * To support attribute target, clang version must be greater than 3.7.
> > Otherwise, would choose SSE/AVX code path, the same as before.
> > * Move two mocro functions to the top of the code since they would be
> > used in inline SSE/AVX and dynamic SSE/AVX codes.
> >
> > v4
> > * Modify rte_memcpy.h to several .c files and modify makefiles to
> > compile
> > AVX2 and AVX512 files.
> >
> > v5
> > * Delete redundant repeated codes of rte_memcpy_xxx.
> > * Modify makefiles to enable reuse of existing rte_memcpy.
> > * Delete redundant codes of rte_efd_x86.h in v4. Move it into .c file
> > and enable compilation -mavx2 for it in makefile since it is already chosen
> at run-time.
> >
>
> Generally looks good, just two things to fix below.
> Konstantin
>
> 1. [dpdk-dev,v5,1/3] eal/x86: run-time dispatch over memcpy
>
> Shared target build fails:
> http://dpdk.org/ml/archives/test-report/2017-October/031032.html
>
> I think you need to include rte_memcpy_ptr into the:
> lib/librte_eal/linuxapp/eal/rte_eal_version.map
> lib/librte_eal/bsdapp/eal/rte_eal_version.map
> to fix it.
>
> 2. [dpdk-dev,v5,3/3] efd: run-time dispatch over x86 EFD functions
>
> /lib/librte_efd/rte_efd_x86.c
> ....
> +efd_value_t
> +efd_lookup_internal_avx2(const efd_hashfunc_t *group_hash_idx,
> + const efd_lookuptbl_t *group_lookup_table,
> + const uint32_t hash_val_a, const uint32_t hash_val_b)
> { #ifdef
> +CC_SUPPORT_AVX2
> + efd_value_t value = 0;
> + uint32_t i = 0;
> + __m256i vhash_val_a = _mm256_set1_epi32(hash_val_a);
> + __m256i vhash_val_b = _mm256_set1_epi32(hash_val_b);
> +
> + for (; i < RTE_EFD_VALUE_NUM_BITS; i += 8) {
> + __m256i vhash_idx =
> + _mm256_cvtepu16_epi32(EFD_LOAD_SI128(
> + (__m128i const *) &group_hash_idx[i]));
> + __m256i vlookup_table = _mm256_cvtepu16_epi32(
> + EFD_LOAD_SI128((__m128i const *)
> + &group_lookup_table[i]));
> + __m256i vhash = _mm256_add_epi32(vhash_val_a,
> + _mm256_mullo_epi32(vhash_idx,
> vhash_val_b));
> + __m256i vbucket_idx = _mm256_srli_epi32(vhash,
> + EFD_LOOKUPTBL_SHIFT);
> + __m256i vresult = _mm256_srlv_epi32(vlookup_table,
> + vbucket_idx);
> +
> + value |= (_mm256_movemask_ps(
> + (__m256) _mm256_slli_epi32(vresult, 31))
> + & ((1 << (RTE_EFD_VALUE_NUM_BITS - i)) - 1)) << i;
> + }
> +
> + return value;
> +#else
>
> We always build that file with AVX2 option, so I think we can safely remove
> The #ifdef CC_SUPPORT_AVX2 and the code below.
>
> + RTE_SET_USED(group_hash_idx);
> + RTE_SET_USED(group_lookup_table);
> + RTE_SET_USED(hash_val_a);
> + RTE_SET_USED(hash_val_b);
> + /* Return dummy value, only to avoid compilation breakage */
> + return 0;
> +#endif
> +
> +}
>
>
> > lib/librte_eal/bsdapp/eal/Makefile | 19 +
> > .../common/include/arch/x86/rte_memcpy.c | 59 ++
> > .../common/include/arch/x86/rte_memcpy.h | 861 +------------------
> > .../common/include/arch/x86/rte_memcpy_avx2.c | 44 +
> > .../common/include/arch/x86/rte_memcpy_avx512f.c | 44 +
> > .../common/include/arch/x86/rte_memcpy_internal.h | 909
> +++++++++++++++++++++
> > .../common/include/arch/x86/rte_memcpy_sse.c | 40 +
> > lib/librte_eal/linuxapp/eal/Makefile | 19 +
> > lib/librte_efd/Makefile | 6 +
> > lib/librte_efd/rte_efd_x86.c | 87 ++
> > lib/librte_efd/rte_efd_x86.h | 48 +-
> > mk/rte.cpuflags.mk | 14 +
> > test/test/test_memcpy_perf.c | 40 +-
> > 13 files changed, 1285 insertions(+), 905 deletions(-) create mode
> > 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy.c
> > create mode 100644
> > lib/librte_eal/common/include/arch/x86/rte_memcpy_avx2.c
> > create mode 100644
> > lib/librte_eal/common/include/arch/x86/rte_memcpy_avx512f.c
> > create mode 100644
> > lib/librte_eal/common/include/arch/x86/rte_memcpy_internal.h
> > create mode 100644
> > lib/librte_eal/common/include/arch/x86/rte_memcpy_sse.c
> > create mode 100644 lib/librte_efd/rte_efd_x86.c
> >
> > --
> > 2.7.4
next prev parent reply other threads:[~2017-10-04 22:33 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-26 7:41 [dpdk-dev] [PATCH v3 0/3] dynamic linking support Xiaoyun Li
2017-09-26 7:41 ` [dpdk-dev] [PATCH v3 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-01 23:41 ` Ananyev, Konstantin
2017-10-02 0:12 ` Li, Xiaoyun
2017-09-26 7:41 ` [dpdk-dev] [PATCH v3 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-09-26 7:41 ` [dpdk-dev] [PATCH v3 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-02 0:08 ` Ananyev, Konstantin
2017-10-02 0:09 ` Li, Xiaoyun
2017-10-02 9:35 ` Ananyev, Konstantin
2017-10-02 16:13 ` [dpdk-dev] [PATCH v4 0/3] run-time Linking support Xiaoyun Li
2017-10-02 16:13 ` [dpdk-dev] [PATCH v4 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-02 16:39 ` Ananyev, Konstantin
2017-10-02 23:10 ` Li, Xiaoyun
2017-10-03 11:15 ` Ananyev, Konstantin
2017-10-03 11:39 ` Li, Xiaoyun
2017-10-03 12:12 ` Ananyev, Konstantin
2017-10-03 12:23 ` Li, Xiaoyun
2017-10-02 16:13 ` [dpdk-dev] [PATCH v4 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-02 16:13 ` [dpdk-dev] [PATCH v4 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-02 16:52 ` Ananyev, Konstantin
2017-10-03 8:15 ` Li, Xiaoyun
2017-10-03 11:23 ` Ananyev, Konstantin
2017-10-03 11:27 ` Li, Xiaoyun
2017-10-03 14:59 ` [dpdk-dev] [PATCH v5 0/3] run-time Linking support Xiaoyun Li
2017-10-03 14:59 ` [dpdk-dev] [PATCH v5 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-03 14:59 ` [dpdk-dev] [PATCH v5 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-03 14:59 ` [dpdk-dev] [PATCH v5 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-04 17:56 ` [dpdk-dev] [PATCH v5 0/3] run-time Linking support Ananyev, Konstantin
2017-10-04 22:33 ` Li, Xiaoyun [this message]
2017-10-04 22:58 ` [dpdk-dev] [PATCH v6 " Xiaoyun Li
2017-10-04 22:58 ` [dpdk-dev] [PATCH v6 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-05 9:37 ` Ananyev, Konstantin
2017-10-05 9:38 ` Ananyev, Konstantin
2017-10-05 11:19 ` Li, Xiaoyun
2017-10-05 11:26 ` Richardson, Bruce
2017-10-05 11:26 ` Li, Xiaoyun
2017-10-05 12:12 ` Ananyev, Konstantin
2017-10-04 22:58 ` [dpdk-dev] [PATCH v6 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-04 22:58 ` [dpdk-dev] [PATCH v6 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-05 9:40 ` Ananyev, Konstantin
2017-10-05 10:23 ` Li, Xiaoyun
2017-10-05 12:33 ` [dpdk-dev] [PATCH v7 0/3] run-time Linking support Xiaoyun Li
2017-10-05 12:33 ` [dpdk-dev] [PATCH v7 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-09 17:47 ` Thomas Monjalon
2017-10-13 1:06 ` Li, Xiaoyun
2017-10-13 7:21 ` Thomas Monjalon
2017-10-13 7:30 ` Li, Xiaoyun
2017-10-13 7:31 ` Ananyev, Konstantin
2017-10-13 7:36 ` Thomas Monjalon
2017-10-13 7:41 ` Li, Xiaoyun
2017-10-05 12:33 ` [dpdk-dev] [PATCH v7 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-05 12:33 ` [dpdk-dev] [PATCH v7 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-05 13:24 ` [dpdk-dev] [PATCH v7 0/3] run-time Linking support Ananyev, Konstantin
2017-10-09 17:40 ` Thomas Monjalon
2017-10-13 0:58 ` Li, Xiaoyun
2017-10-13 9:01 ` [dpdk-dev] [PATCH v8 " Xiaoyun Li
2017-10-13 9:01 ` [dpdk-dev] [PATCH v8 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-13 9:28 ` Thomas Monjalon
2017-10-13 10:26 ` Ananyev, Konstantin
2017-10-17 21:24 ` Thomas Monjalon
2017-10-18 2:21 ` Li, Xiaoyun
2017-10-18 6:22 ` Li, Xiaoyun
2017-10-19 2:45 ` Li, Xiaoyun
2017-10-19 6:58 ` Thomas Monjalon
2017-10-19 7:51 ` Li, Xiaoyun
2017-10-19 8:33 ` Thomas Monjalon
2017-10-19 8:50 ` Li, Xiaoyun
2017-10-19 8:59 ` Ananyev, Konstantin
2017-10-19 9:00 ` Thomas Monjalon
2017-10-19 9:29 ` Bruce Richardson
2017-10-20 1:02 ` Li, Xiaoyun
2017-10-25 6:55 ` Li, Xiaoyun
2017-10-25 7:25 ` Thomas Monjalon
2017-10-29 8:49 ` Thomas Monjalon
2017-11-02 10:22 ` Wang, Zhihong
2017-11-02 10:44 ` Thomas Monjalon
2017-11-02 10:58 ` Li, Xiaoyun
2017-11-02 12:15 ` Thomas Monjalon
2017-11-03 7:47 ` Yao, Lei A
2017-10-25 8:50 ` Ananyev, Konstantin
2017-10-25 8:54 ` Li, Xiaoyun
2017-10-25 9:00 ` Thomas Monjalon
2017-10-25 10:32 ` Li, Xiaoyun
2017-10-25 9:14 ` Ananyev, Konstantin
2017-10-13 9:01 ` [dpdk-dev] [PATCH v8 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-13 9:01 ` [dpdk-dev] [PATCH v8 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-13 13:13 ` [dpdk-dev] [PATCH v8 0/3] run-time Linking support Thomas Monjalon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=B9E724F4CB7543449049E7AE7669D82F46443E@SHSMSX101.ccr.corp.intel.com \
--to=xiaoyun.li@intel.com \
--cc=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
--cc=helin.zhang@intel.com \
--cc=konstantin.ananyev@intel.com \
--cc=wenzhuo.lu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).