DPDK patches and discussions
 help / color / mirror / Atom feed
From: Thomas Monjalon <thomas@monjalon.net>
To: "Li, Xiaoyun" <xiaoyun.li@intel.com>
Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	dev@dpdk.org, "Lu, Wenzhuo" <wenzhuo.lu@intel.com>,
	"Zhang, Helin" <helin.zhang@intel.com>,
	"ophirmu@mellanox.com" <ophirmu@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH v8 1/3] eal/x86: run-time dispatch over memcpy
Date: Thu, 19 Oct 2017 10:33:36 +0200	[thread overview]
Message-ID: <3438028.jIYWTcBuhA@xps> (raw)
In-Reply-To: <B9E724F4CB7543449049E7AE7669D82F47FE41@SHSMSX101.ccr.corp.intel.com>

19/10/2017 09:51, Li, Xiaoyun:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 19/10/2017 04:45, Li, Xiaoyun:
> > > Hi
> > > > > >
> > > > > > The significant change of this patch is to call a function
> > > > > > pointer for packet size > 128 (RTE_X86_MEMCPY_THRESH).
> > > > > The perf drop is due to function call replacing inline.
> > > > >
> > > > > > Please could you provide some benchmark numbers?
> > > > > I ran memcpy_perf_test which would show the time cost of memcpy. I
> > > > > ran it on broadwell with sse and avx2.
> > > > > But I just draw pictures and looked at the trend not computed the
> > > > > exact percentage. Sorry about that.
> > > > > The picture shows results of copy size of 2, 4, 6, 8, 9, 12, 16,
> > > > > 32, 64, 128, 192, 256, 320, 384, 448, 512, 768, 1024, 1518, 1522,
> > > > > 1536, 1600, 2048, 2560, 3072, 3584, 4096, 4608, 5120, 5632, 6144,
> > > > > 6656, 7168,
> > > > 7680, 8192.
> > > > > In my test, the size grows, the drop degrades. (Using copy time
> > > > > indicates the
> > > > > perf.) From the trend picture, when the size is smaller than 128
> > > > > bytes, the perf drops a lot, almost 50%. And above 128 bytes, it
> > > > > approaches the original dpdk.
> > > > > I computed it right now, it shows that when greater than 128 bytes
> > > > > and smaller than 1024 bytes, the perf drops about 15%. When above
> > > > > 1024 bytes, the perf drops about 4%.
> > > > >
> > > > > > From a test done at Mellanox, there might be a performance
> > > > > > degradation of about 15% in testpmd txonly with AVX2.
> > > >
> > >
> > > I did tests on X710, XXV710, X540 and MT27710 but didn't see
> > performance degradation.
> > >
> > > I used command "./x86_64-native-linuxapp-gcc/app/testpmd -c 0xf -n 4 -- -
> > I" and set fwd txonly.
> > > I tested it on v17.11-rc1, then revert my patch and tested it again.
> > > Show port stats all and see the throughput pps. But the results are similar
> > and no drop.
> > >
> > > Did I miss something?
> > 
> > I do not understand. Yesterday you confirmed a 15% drop with buffers
> > between
> > 128 and 1024 bytes.
> > But you do not see this drop in your txonly tests, right?
> > 
> Yes. The drop is using test.
> Using command "make test -j" and then " ./build/app/test -c f -n 4 " 
> Then run "memcpy_perf_autotest"
> The results are the cycles that memory copy costs.
> But I just use it to show the trend because I heard that it's not recommended to use micro benchmarks like test_memcpy_perf for memcpy performance report as they aren't likely able to reflect performance of real world applications.

Yes real applications can hide the memcpy cost.
Sometimes, the cost appear for real :)

> Details can be seen at https://software.intel.com/en-us/articles/performance-optimization-of-memcpy-in-dpdk
> 
> And I didn't see drop in testpmd txonly test. Maybe it's because not a lot memcpy calls.

It has been seen in a mlx4 use-case using more memcpy.
I think 15% in micro-benchmark is too much.
What can we do? Raise the threshold?

> > > > Another thing, I will test testpmd txonly with intel nics and
> > > > mellanox these days.
> > > > And try adjusting the RTE_X86_MEMCPY_THRESH to see if there is any
> > > > improvement.
> > > >
> > > > > > Is there someone else seeing a performance degradation?

  reply	other threads:[~2017-10-19  8:33 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-26  7:41 [dpdk-dev] [PATCH v3 0/3] dynamic linking support Xiaoyun Li
2017-09-26  7:41 ` [dpdk-dev] [PATCH v3 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-01 23:41   ` Ananyev, Konstantin
2017-10-02  0:12     ` Li, Xiaoyun
2017-09-26  7:41 ` [dpdk-dev] [PATCH v3 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-09-26  7:41 ` [dpdk-dev] [PATCH v3 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-02  0:08   ` Ananyev, Konstantin
2017-10-02  0:09     ` Li, Xiaoyun
2017-10-02  9:35     ` Ananyev, Konstantin
2017-10-02 16:13 ` [dpdk-dev] [PATCH v4 0/3] run-time Linking support Xiaoyun Li
2017-10-02 16:13   ` [dpdk-dev] [PATCH v4 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-02 16:39     ` Ananyev, Konstantin
2017-10-02 23:10       ` Li, Xiaoyun
2017-10-03 11:15         ` Ananyev, Konstantin
2017-10-03 11:39           ` Li, Xiaoyun
2017-10-03 12:12             ` Ananyev, Konstantin
2017-10-03 12:23               ` Li, Xiaoyun
2017-10-02 16:13   ` [dpdk-dev] [PATCH v4 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-02 16:13   ` [dpdk-dev] [PATCH v4 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-02 16:52     ` Ananyev, Konstantin
2017-10-03  8:15       ` Li, Xiaoyun
2017-10-03 11:23         ` Ananyev, Konstantin
2017-10-03 11:27           ` Li, Xiaoyun
2017-10-03 14:59   ` [dpdk-dev] [PATCH v5 0/3] run-time Linking support Xiaoyun Li
2017-10-03 14:59     ` [dpdk-dev] [PATCH v5 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-03 14:59     ` [dpdk-dev] [PATCH v5 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-03 14:59     ` [dpdk-dev] [PATCH v5 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-04 17:56     ` [dpdk-dev] [PATCH v5 0/3] run-time Linking support Ananyev, Konstantin
2017-10-04 22:33       ` Li, Xiaoyun
2017-10-04 22:58     ` [dpdk-dev] [PATCH v6 " Xiaoyun Li
2017-10-04 22:58       ` [dpdk-dev] [PATCH v6 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-05  9:37         ` Ananyev, Konstantin
2017-10-05  9:38           ` Ananyev, Konstantin
2017-10-05 11:19           ` Li, Xiaoyun
2017-10-05 11:26             ` Richardson, Bruce
2017-10-05 11:26             ` Li, Xiaoyun
2017-10-05 12:12               ` Ananyev, Konstantin
2017-10-04 22:58       ` [dpdk-dev] [PATCH v6 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-04 22:58       ` [dpdk-dev] [PATCH v6 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-05  9:40         ` Ananyev, Konstantin
2017-10-05 10:23           ` Li, Xiaoyun
2017-10-05 12:33       ` [dpdk-dev] [PATCH v7 0/3] run-time Linking support Xiaoyun Li
2017-10-05 12:33         ` [dpdk-dev] [PATCH v7 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-09 17:47           ` Thomas Monjalon
2017-10-13  1:06             ` Li, Xiaoyun
2017-10-13  7:21               ` Thomas Monjalon
2017-10-13  7:30                 ` Li, Xiaoyun
2017-10-13  7:31                 ` Ananyev, Konstantin
2017-10-13  7:36                   ` Thomas Monjalon
2017-10-13  7:41                     ` Li, Xiaoyun
2017-10-05 12:33         ` [dpdk-dev] [PATCH v7 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-05 12:33         ` [dpdk-dev] [PATCH v7 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-05 13:24         ` [dpdk-dev] [PATCH v7 0/3] run-time Linking support Ananyev, Konstantin
2017-10-09 17:40         ` Thomas Monjalon
2017-10-13  0:58           ` Li, Xiaoyun
2017-10-13  9:01         ` [dpdk-dev] [PATCH v8 " Xiaoyun Li
2017-10-13  9:01           ` [dpdk-dev] [PATCH v8 1/3] eal/x86: run-time dispatch over memcpy Xiaoyun Li
2017-10-13  9:28             ` Thomas Monjalon
2017-10-13 10:26               ` Ananyev, Konstantin
2017-10-17 21:24             ` Thomas Monjalon
2017-10-18  2:21               ` Li, Xiaoyun
2017-10-18  6:22                 ` Li, Xiaoyun
2017-10-19  2:45                   ` Li, Xiaoyun
2017-10-19  6:58                     ` Thomas Monjalon
2017-10-19  7:51                       ` Li, Xiaoyun
2017-10-19  8:33                         ` Thomas Monjalon [this message]
2017-10-19  8:50                           ` Li, Xiaoyun
2017-10-19  8:59                             ` Ananyev, Konstantin
2017-10-19  9:00                             ` Thomas Monjalon
2017-10-19  9:29                               ` Bruce Richardson
2017-10-20  1:02                                 ` Li, Xiaoyun
2017-10-25  6:55                                   ` Li, Xiaoyun
2017-10-25  7:25                                     ` Thomas Monjalon
2017-10-29  8:49                                       ` Thomas Monjalon
2017-11-02 10:22                                         ` Wang, Zhihong
2017-11-02 10:44                                           ` Thomas Monjalon
2017-11-02 10:58                                             ` Li, Xiaoyun
2017-11-02 12:15                                               ` Thomas Monjalon
2017-11-03  7:47                                             ` Yao, Lei A
2017-10-25  8:50                                     ` Ananyev, Konstantin
2017-10-25  8:54                                       ` Li, Xiaoyun
2017-10-25  9:00                                         ` Thomas Monjalon
2017-10-25 10:32                                           ` Li, Xiaoyun
2017-10-25  9:14                                         ` Ananyev, Konstantin
2017-10-13  9:01           ` [dpdk-dev] [PATCH v8 2/3] app/test: run-time dispatch over memcpy perf test Xiaoyun Li
2017-10-13  9:01           ` [dpdk-dev] [PATCH v8 3/3] efd: run-time dispatch over x86 EFD functions Xiaoyun Li
2017-10-13 13:13           ` [dpdk-dev] [PATCH v8 0/3] run-time Linking support Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3438028.jIYWTcBuhA@xps \
    --to=thomas@monjalon.net \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=helin.zhang@intel.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=ophirmu@mellanox.com \
    --cc=wenzhuo.lu@intel.com \
    --cc=xiaoyun.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).