DPDK patches and discussions
From: "Li, Xiaoyun" <xiaoyun.li@intel.com>
To: "Wang, Liang-min" <liang-min.wang@intel.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "Zhang, Qi Z" <qi.z.zhang@intel.com>,
	"Lu, Wenzhuo" <wenzhuo.lu@intel.com>,
	"Zhang, Helin" <helin.zhang@intel.com>,
	"pierre@emutex.com" <pierre@emutex.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v2 1/3] eal/x86: run-time dispatch over memcpy
Date: Tue, 12 Sep 2017 02:27:05 +0000
Message-ID: <B9E724F4CB7543449049E7AE7669D82F443E45@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <B9E724F4CB7543449049E7AE7669D82F442FE6@SHSMSX101.ccr.corp.intel.com>

Hi all,

After investigating, I found that most DPDK code already uses run-time dispatch. Only rte_memcpy chooses the ISA at build time.

There are two ways to change memcpy: the first is function pointers, and the other is function multi-versioning in GCC.

But memcpy has been heavily optimized and benefits from being fully inlined. If it is changed to run-time dispatch via function pointers, the perf drops a lot, especially when the copy size is small.
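To make the function-pointer option concrete, here is a minimal sketch of what such dispatch can look like. All names (copy_fn_t, copy_select, copy_dispatch and the two stand-in implementations) are hypothetical and are not the rte_memcpy API or the code in the patch; the CPU check assumes the GCC/Clang builtins __builtin_cpu_init and __builtin_cpu_supports.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration only; not the rte_memcpy API. */
typedef void *(*copy_fn_t)(void *dst, const void *src, size_t n);

/* Stand-ins for ISA-specific implementations. */
static void *copy_baseline(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

static void *copy_avx2(void *dst, const void *src, size_t n)
{
    /* A real variant would use AVX2 loads and stores here. */
    return memcpy(dst, src, n);
}

/* Defaults to the baseline; every later copy goes through this pointer. */
static copy_fn_t copy_ptr = copy_baseline;

/* Pick the implementation once, e.g. at startup. */
static void copy_select(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        copy_ptr = copy_avx2;
#endif
}

/* Indirect call: the compiler cannot inline the target or fold a
 * compile-time constant size into it. */
static void *copy_dispatch(void *dst, const void *src, size_t n)
{
    return copy_ptr(dst, src, n);
}
```

The indirect call in copy_dispatch is where the cost for small sizes comes from: the call overhead is fixed, so it weighs most when only a few bytes are copied.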

And function multi-versioning in GCC only works for C++. GCC 6 is said to support it for C, but in my trial it did not work for C.



The attachment contains the memcpy perf results with my patch, without my patch, and with the original DPDK code but with inlining removed.

It's just for comparison, so for now I have only tested on Broadwell, using AVX2.

The results are from running test/test/test_memcpy_perf.c.

(C = compile-time constant)

/* Do aligned tests where size is a variable */

/* Do aligned tests where size is a compile-time constant */

/* Do unaligned tests where size is a variable */

/* Do unaligned tests where size is a compile-time constant */



In the results, "4-7" means DPDK takes time 4 and glibc takes time 7.

For sizes smaller than 128 bytes, the perf with this patch is bad, even worse than glibc.

As the size grows, the perf is better than glibc but worse than the original DPDK.

And when the size grows above about 1024 bytes, it performs similarly to the original DPDK.

Furthermore, if inlining is removed from the original DPDK, its perf is similar to the perf with the patch.

The different situations (4 types, such as cache to cache) perform differently, but the trend is the same (as the size grows, the perf improves).



So if run-time dispatch is needed, we have to sacrifice some perf and compile for the minimum target (e.g. compile for target AVX, then run on AVX, AVX2, or AVX512F).
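As a rough illustration of "compile for the minimum target", the sketch below builds the file for a lower baseline and enables AVX2 for a single function only, which is then called only when the running CPU reports AVX2. It assumes the GCC/Clang per-function target attribute and CPU builtins; copy32 and copy32_avx2 are hypothetical names, not part of the patch, and the CPU check is repeated on every call only to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Only this function is compiled with AVX2 enabled; the rest of the
 * file keeps the (lower) baseline given on the command line. */
__attribute__((target("avx2")))
static void copy32_avx2(uint8_t *dst, const uint8_t *src)
{
    __m256i v = _mm256_loadu_si256((const __m256i *)src);
    _mm256_storeu_si256((__m256i *)dst, v);
}
#endif

/* Copies exactly 32 bytes, taking the AVX2 path only when the CPU
 * reports support for it at run time. */
static void copy32(uint8_t *dst, const uint8_t *src)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        copy32_avx2(dst, src);
        return;
    }
#endif
    memcpy(dst, src, 32); /* baseline path for older CPUs */
}
```

The same binary then runs on a CPU without AVX2, at the price of a run-time check and a call that cannot be inlined into the AVX2 variant from baseline code.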



Thus, I think this feature shouldn't be delivered in this release.



Best Regards,

Xiaoyun Li

Thread overview: 22+ messages
2017-08-25  2:06 [dpdk-dev] [PATCH 0/3] dynamic linking support Xiaoyun Li
2017-08-25  2:06 ` [dpdk-dev] [PATCH 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-08-30 14:56   ` Ananyev, Konstantin
2017-08-30 17:51     ` Bruce Richardson
2017-08-31  1:21       ` Lu, Wenzhuo
2017-08-30 18:00   ` Stephen Hemminger
2017-08-31  1:23     ` Lu, Wenzhuo
2017-08-31  5:05       ` Stephen Hemminger
2017-08-31  5:24         ` Li, Xiaoyun
2017-08-25  2:06 ` [dpdk-dev] [PATCH 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-08-25  2:06 ` [dpdk-dev] [PATCH 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-09-01  8:56 ` [dpdk-dev] [PATCH v2 0/3] dynamic linking support Xiaoyun Li
2017-09-01  8:57   ` [dpdk-dev] [PATCH v2 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-09-01  9:16     ` Ananyev, Konstantin
2017-09-01  9:28       ` Li, Xiaoyun
2017-09-01 10:38         ` Ananyev, Konstantin
2017-09-04  1:41           ` Li, Xiaoyun
     [not found]             ` <B9E724F4CB7543449049E7AE7669D82F44216E@SHSMSX101.ccr.corp.intel.com>
     [not found]               ` <B9E724F4CB7543449049E7AE7669D82F442FE6@SHSMSX101.ccr.corp.intel.com>
2017-09-12  2:27                 ` Li, Xiaoyun [this message]
2017-09-20  6:57                   ` Li, Xiaoyun
2017-09-01 15:34     ` Stephen Hemminger
2017-09-01  8:57   ` [dpdk-dev] [PATCH v2 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-09-01  8:57   ` [dpdk-dev] [PATCH v2 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
