DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Li, Xiaoyun" <xiaoyun.li@intel.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>
Cc: "Lu, Wenzhuo" <wenzhuo.lu@intel.com>,
	"Zhang, Helin" <helin.zhang@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v4 1/3] eal/x86: run-time dispatch over memcpy
Date: Tue, 3 Oct 2017 11:39:43 +0000	[thread overview]
Message-ID: <B9E724F4CB7543449049E7AE7669D82F463EC2@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <2601191342CEEE43887BDE71AB9772585FAA35AA@IRSMSX103.ger.corp.intel.com>

Hi
You mean just use rte_memcpy_internal in rte_memcpy_avx2, rte_memcpy_avx512?
But if RTE_MACHINE_CPUFLAGS_AVX2 means only whether the compiler supports avx2, then internal would only compiled
With avx2 codes, then cannot choose other code path. What if the HW cannot support avx2?
If RTE_MACHINE_CPUFLAGS_AVX2 means as before, suggests whether both compiler and HW supports avx2. Then the function
has no difference right now.
The mocro is determined at compilation time. But selection is hoped to be at runtime.
Did I consider something wrong?

Best Regards,
Xiaoyun Li




> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Tuesday, October 3, 2017 19:16
> To: Li, Xiaoyun <xiaoyun.li@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>
> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang, Helin
> <helin.zhang@intel.com>; dev@dpdk.org
> Subject: RE: [PATCH v4 1/3] eal/x86: run-time dispatch over memcpy
> 
> Hi,
> 
> >
> > Hi
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Tuesday, October 3, 2017 00:39
> > > To: Li, Xiaoyun <xiaoyun.li@intel.com>; Richardson, Bruce
> > > <bruce.richardson@intel.com>
> > > Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang, Helin
> > > <helin.zhang@intel.com>; dev@dpdk.org
> > > Subject: RE: [PATCH v4 1/3] eal/x86: run-time dispatch over memcpy
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Li, Xiaoyun
> > > > Sent: Monday, October 2, 2017 5:13 PM
> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > > Richardson,
> > > Bruce <bruce.richardson@intel.com>
> > > > Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang, Helin
> > > <helin.zhang@intel.com>; dev@dpdk.org; Li, Xiaoyun
> > > <xiaoyun.li@intel.com>
> > > > Subject: [PATCH v4 1/3] eal/x86: run-time dispatch over memcpy
> > > >
> > > > This patch dynamically selects functions of memcpy at run-time
> > > > based on CPU flags that current machine supports. This patch uses
> > > > function pointers which are bind to the relative functions at constrctor
> time.
> > > > In addition, AVX512 instructions set would be compiled only if
> > > > users config it enabled and the compiler supports it.
> > > >
> > > > Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
> > > > ---
> > > > v2
> > > > * Use gcc function multi-versioning to avoid compilation issues.
> > > > * Add macros for AVX512 and AVX2. Only if users enable AVX512 and
> > > > the compiler supports it, the AVX512 codes would be compiled. Only
> > > > if the compiler supports AVX2, the AVX2 codes would be compiled.
> > > >
> > > > v3
> > > > * Reduce function calls via only keep rte_memcpy_xxx.
> > > > * Add conditions that when copy size is small, use inline code path.
> > > > Otherwise, use dynamic code path.
> > > > * To support attribute target, clang version must be greater than 3.7.
> > > > Otherwise, would choose SSE/AVX code path, the same as before.
> > > > * Move two mocro functions to the top of the code since they would
> > > > be used in inline SSE/AVX and dynamic SSE/AVX codes.
> > > >
> > > > v4
> > > > * Modify rte_memcpy.h to several .c files and modify makefiles to
> > > > compile
> > > > AVX2 and AVX512 files.
> > >
> > > Could you explain to me why instead of reusing existing rte_memcpy()
> > > code to generate _sse/_avx2/ax512f flavors you keep pushing changes
> > > with 3 separate implementations?
> > > Obviously that is much more expensive in terms of maintenance and
> > > doesn't look like feasible solution to me.
> > > Is existing rte_memcpy() implementation is not good enough in terms
> > > of functionality and/or performance?
> > > If so, can you outline these problems and try to fix them first.
> > > Konstantin
> > >
> >
> > I just change many small functions to one function in those 3 separate
> functions.
> 
> Yes, so with what you suggest  we'll have 4 implementations  for rte_memcpy
> to support.
> That's very expensive terms of maintenance and I believe totally unnecessary.
> 
> > Because the existing codes are totally inline, including rte_memcpy()
> > itself. So the compilation will change all rte_memcpy() calls into the basic
> codes like xmm0=xxx.
> >
> > The existing codes in this way are OK.
> 
> Good.
> 
> >But when run-time, it will bring lots of function calls  and cause perf
> >drop.
> 
> I believe it wouldn't if we do it properly.
> All internal functions (mov16, mov32, etc.) will still be unlined by the
> compiler for each flavor (sse/avx2/etc.) - have a look at the patch I sent.
> 
> Konstantin

  reply	other threads:[~2017-10-03 11:39 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-26  7:41 [dpdk-dev] [PATCH v3 0/3] dynamic linking support Xiaoyun Li
2017-09-26  7:41 ` [dpdk-dev] [PATCH v3 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-01 23:41   ` Ananyev, Konstantin
2017-10-02  0:12     ` Li, Xiaoyun
2017-09-26  7:41 ` [dpdk-dev] [PATCH v3 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-09-26  7:41 ` [dpdk-dev] [PATCH v3 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-02  0:08   ` Ananyev, Konstantin
2017-10-02  0:09     ` Li, Xiaoyun
2017-10-02  9:35     ` Ananyev, Konstantin
2017-10-02 16:13 ` [dpdk-dev] [PATCH v4 0/3] run-time Linking support Xiaoyun Li
2017-10-02 16:13   ` [dpdk-dev] [PATCH v4 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-02 16:39     ` Ananyev, Konstantin
2017-10-02 23:10       ` Li, Xiaoyun
2017-10-03 11:15         ` Ananyev, Konstantin
2017-10-03 11:39           ` Li, Xiaoyun [this message]
2017-10-03 12:12             ` Ananyev, Konstantin
2017-10-03 12:23               ` Li, Xiaoyun
2017-10-02 16:13   ` [dpdk-dev] [PATCH v4 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-02 16:13   ` [dpdk-dev] [PATCH v4 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-02 16:52     ` Ananyev, Konstantin
2017-10-03  8:15       ` Li, Xiaoyun
2017-10-03 11:23         ` Ananyev, Konstantin
2017-10-03 11:27           ` Li, Xiaoyun
2017-10-03 14:59   ` [dpdk-dev] [PATCH v5 0/3] run-time Linking support Xiaoyun Li
2017-10-03 14:59     ` [dpdk-dev] [PATCH v5 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-03 14:59     ` [dpdk-dev] [PATCH v5 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-03 14:59     ` [dpdk-dev] [PATCH v5 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-04 17:56     ` [dpdk-dev] [PATCH v5 0/3] run-time Linking support Ananyev, Konstantin
2017-10-04 22:33       ` Li, Xiaoyun
2017-10-04 22:58     ` [dpdk-dev] [PATCH v6 " Xiaoyun Li
2017-10-04 22:58       ` [dpdk-dev] [PATCH v6 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-05  9:37         ` Ananyev, Konstantin
2017-10-05  9:38           ` Ananyev, Konstantin
2017-10-05 11:19           ` Li, Xiaoyun
2017-10-05 11:26             ` Richardson, Bruce
2017-10-05 11:26             ` Li, Xiaoyun
2017-10-05 12:12               ` Ananyev, Konstantin
2017-10-04 22:58       ` [dpdk-dev] [PATCH v6 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-04 22:58       ` [dpdk-dev] [PATCH v6 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-05  9:40         ` Ananyev, Konstantin
2017-10-05 10:23           ` Li, Xiaoyun
2017-10-05 12:33       ` [dpdk-dev] [PATCH v7 0/3] run-time Linking support Xiaoyun Li
2017-10-05 12:33         ` [dpdk-dev] [PATCH v7 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-09 17:47           ` Thomas Monjalon
2017-10-13  1:06             ` Li, Xiaoyun
2017-10-13  7:21               ` Thomas Monjalon
2017-10-13  7:30                 ` Li, Xiaoyun
2017-10-13  7:31                 ` Ananyev, Konstantin
2017-10-13  7:36                   ` Thomas Monjalon
2017-10-13  7:41                     ` Li, Xiaoyun
2017-10-05 12:33         ` [dpdk-dev] [PATCH v7 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-05 12:33         ` [dpdk-dev] [PATCH v7 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-05 13:24         ` [dpdk-dev] [PATCH v7 0/3] run-time Linking support Ananyev, Konstantin
2017-10-09 17:40         ` Thomas Monjalon
2017-10-13  0:58           ` Li, Xiaoyun
2017-10-13  9:01         ` [dpdk-dev] [PATCH v8 " Xiaoyun Li
2017-10-13  9:01           ` [dpdk-dev] [PATCH v8 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-13  9:28             ` Thomas Monjalon
2017-10-13 10:26               ` Ananyev, Konstantin
2017-10-17 21:24             ` Thomas Monjalon
2017-10-18  2:21               ` Li, Xiaoyun
2017-10-18  6:22                 ` Li, Xiaoyun
2017-10-19  2:45                   ` Li, Xiaoyun
2017-10-19  6:58                     ` Thomas Monjalon
2017-10-19  7:51                       ` Li, Xiaoyun
2017-10-19  8:33                         ` Thomas Monjalon
2017-10-19  8:50                           ` Li, Xiaoyun
2017-10-19  8:59                             ` Ananyev, Konstantin
2017-10-19  9:00                             ` Thomas Monjalon
2017-10-19  9:29                               ` Bruce Richardson
2017-10-20  1:02                                 ` Li, Xiaoyun
2017-10-25  6:55                                   ` Li, Xiaoyun
2017-10-25  7:25                                     ` Thomas Monjalon
2017-10-29  8:49                                       ` Thomas Monjalon
2017-11-02 10:22                                         ` Wang, Zhihong
2017-11-02 10:44                                           ` Thomas Monjalon
2017-11-02 10:58                                             ` Li, Xiaoyun
2017-11-02 12:15                                               ` Thomas Monjalon
2017-11-03  7:47                                             ` Yao, Lei A
2017-10-25  8:50                                     ` Ananyev, Konstantin
2017-10-25  8:54                                       ` Li, Xiaoyun
2017-10-25  9:00                                         ` Thomas Monjalon
2017-10-25 10:32                                           ` Li, Xiaoyun
2017-10-25  9:14                                         ` Ananyev, Konstantin
2017-10-13  9:01           ` [dpdk-dev] [PATCH v8 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-13  9:01           ` [dpdk-dev] [PATCH v8 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-13 13:13           ` [dpdk-dev] [PATCH v8 0/3] run-time Linking support Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=B9E724F4CB7543449049E7AE7669D82F463EC2@SHSMSX101.ccr.corp.intel.com \
    --to=xiaoyun.li@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=helin.zhang@intel.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=wenzhuo.lu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).