DPDK usage discussions
From: "Kinsella, Ray" <ray.kinsella@intel.com>
To: Thomas Monjalon <thomas@monjalon.net>, Jared Brown <cpu-dpdk@mail.com>
Cc: "users@dpdk.org" <users@dpdk.org>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>
Subject: RE: [dpdk-users] DPDK CPU selection criteria?
Date: Thu, 30 Sep 2021 09:26:33 +0000
Message-ID: <PH0PR11MB477667B1A8A958D43725B83490AA9@PH0PR11MB4776.namprd11.prod.outlook.com>
In-Reply-To: <1988781.SuJIhfhpoT@thomas>

Hi Jared,

Replies inline.

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Wednesday 29 September 2021 11:30
> To: Jared Brown <cpu-dpdk@mail.com>
> Cc: users@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>;
> Kinsella, Ray <ray.kinsella@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>
> Subject: Re: [dpdk-users] DPDK CPU selection criteria?
> 
> 17/09/2021 17:23, Jared Brown:
> > Hello everybody!
> >
> > Is there some canonical resource or at least a recommendation on how
> to evaluate different CPUs for suitability for use with DPDK?
> 
> There are some performance reports with details:
> https://core.dpdk.org/perf-reports/
> 
> It is difficult to give any generic info because it depends on CPU
> architecture, use case, and application.
> Please keep in mind there are hundreds of API functions so we cannot
> have clues for all combinations.
> 
> > My use case is software routers (for example DANOS, 6WIND, TNRS), so I
> am mainly interested in forwarding performance in terms of Mpps.
> >
> > What I am looking for is to develop some kind of heuristics to
> evaluate CPUs in terms of $/Mpps without having to purchase hundreds of
> SKUs and running tests on them.

Maciek Konstantynowicz & Shrikant Shah have done extensive work in this area.
I recommend reading the following.

https://www.lfnetworking.org/wp-content/uploads/sites/55/2019/03/benchmarking_sw_data_planes_skx_bdx_mar07_2019.pdf
https://fd.io/docs/whitepapers/performance_analysis_sw_data_planes_dec21_2017.pdf

Both papers are Maciek and Shrikant's analysis of the performance of DPDK (OVS and FD.io).
I'd focus on their methodology here; it should resonate with you.

Once you have done this, you can take a look at my recent presentation with Maciek for more pointers.

How to build secure Terabit Network Services with FD.io technologies | Maciek Konstantynowicz & Ray Kinsella 
Slides: https://drive.google.com/drive/folders/1KKxfu45xD785UeUra3-eBsn37OI_syxp?usp=sharing
Video: https://www.youtube.com/watch?v=lZ_GqidsnTw
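
For the $/Mpps heuristic asked about above, a minimal sketch of the bookkeeping might look like the following. The `CpuSample` type, the `cpu_a`/`cpu_b` labels and all numbers are hypothetical placeholders, not measurements from any of the papers, and the calculation assumes throughput scales linearly with the number of forwarding cores, which you would need to verify for your workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuSample:
    name: str              # hypothetical label, not a real SKU
    price_usd: float       # what you would pay for the part
    cores: int             # cores dedicated to forwarding
    mpps_per_core: float   # measured or published per-core forwarding rate


def dollars_per_mpps(sample: CpuSample) -> float:
    """Cost per Mpps, ASSUMING throughput scales linearly with core count."""
    total_mpps = sample.cores * sample.mpps_per_core
    if total_mpps <= 0:
        raise ValueError("forwarding rate must be positive")
    return sample.price_usd / total_mpps


def rank_by_cost(samples):
    """Return the samples ordered from cheapest to most expensive $/Mpps."""
    return sorted(samples, key=dollars_per_mpps)


if __name__ == "__main__":
    # Placeholder numbers only; substitute your own measurements.
    candidates = [
        CpuSample("cpu_a", price_usd=400.0, cores=4, mpps_per_core=10.0),
        CpuSample("cpu_b", price_usd=900.0, cores=8, mpps_per_core=15.0),
    ]
    for s in rank_by_cost(candidates):
        print(f"{s.name}: {dollars_per_mpps(s):.2f} $/Mpps")
```

The per-core rate is the input that matters, and it only means something if it was measured with the methodology described in the papers above.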

> >
> > The official DPDK documentation[0] states thus:
> >
> > "7.1. Hardware and Memory Requirements
> >
> > For best performance use an Intel Xeon class server system such as Ivy
> Bridge, Haswell or newer."
> >
> > This is somewhat... vague.
> 
> This is only for Intel platforms.
> DPDK is supported on AMD and Arm CPUs as well.
> 
> Are you interested only in Intel CPU?
> All your questions below are interesting but are very hardware-specific.
> +Cc few Intel engineers for specific questions.
> 
> > I suppose one could take [1] as a baseline, which states on page 2
> that an Ivy Bridge Xeon E3-1230 V2 is able to forward unidirectional
> flows at linerate using 10G NICs at all frequencies above 1.6 GHz and
> bidirectional flows at linerate using 10G NICs at 3.3 GHz.
> >
> > This however pales compared with [2] that on page 23 shows that a 3rd
> Generation Scalable Xeon 8380 manages to very nearly saturate a 100G NIC
> at all packet sizes.
> >
> > As there is almost a magnitude in difference in forwarding performance
> per core, you can perhaps understand that I am somewhat at a loss when
> trying to gauge the performance of a particular CPU model.

Well, I'd point out that there are a number of generations between those microprocessors.
I'd also point out that considering microprocessors such as Ivy Bridge (9 years old)
and even Skylake (6 years old) in your analysis might be kind of dated.
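
To put the 10G and 100G figures quoted above on the same scale, it can help to convert line rate into packets per second and then into a per-packet cycle budget. The sketch below assumes standard Ethernet framing (20 bytes of preamble, start-of-frame delimiter and inter-frame gap on top of each frame) and, for the cycle budget, that a single core absorbs the whole link; the function names are mine, not from DPDK.

```python
# Per-frame overhead on the wire: 7 B preamble + 1 B SFD + 12 B inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def line_rate_mpps(link_gbps: float, frame_bytes: int = 64) -> float:
    """Maximum frames per second (in millions) a link can carry."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire / 1e6


def cycles_per_packet(core_ghz: float, mpps: float) -> float:
    """Cycle budget per packet if ONE core handles `mpps` (an assumption)."""
    return core_ghz * 1e9 / (mpps * 1e6)


if __name__ == "__main__":
    ten_g = line_rate_mpps(10)        # 64-byte frames on 10GbE
    hundred_g = line_rate_mpps(100)   # 64-byte frames on 100GbE
    print(f"10G:  {ten_g:.2f} Mpps")       # 14.88 Mpps
    print(f"100G: {hundred_g:.2f} Mpps")   # 148.81 Mpps
    # Budget at 1.6 GHz for 10G line rate with 64-byte frames.
    print(f"{cycles_per_packet(1.6, ten_g):.2f} cycles/packet")  # 107.52
```

Comparing cycles per packet rather than raw Mpps takes clock frequency out of the picture, so what remains is the difference between generations and software versions.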

Having reviewed your other questions below,
I'd point out that there are two things at work here, which might be confusing.

1. Platform improvement over time (Microprocessor & Network Card etc).
2. Software optimization over time, addition of AVX-512 instructions and so on.
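
One way to keep these two effects apart is to measure three points: the old platform with the old software, the new platform with the same old software, and the new platform with the new software. A rough sketch, assuming the two effects combine multiplicatively (a simplification) and using a hypothetical helper name and placeholder numbers:

```python
def split_speedup(old_hw_old_sw: float,
                  new_hw_old_sw: float,
                  new_hw_new_sw: float):
    """Split a total speedup into platform and software factors.

    Inputs are forwarding rates (e.g. Mpps per core) for three test points.
    ASSUMES the two effects multiply: total = platform * software.
    """
    for rate in (old_hw_old_sw, new_hw_old_sw, new_hw_new_sw):
        if rate <= 0:
            raise ValueError("rates must be positive")
    platform = new_hw_old_sw / old_hw_old_sw   # same software, newer hardware
    software = new_hw_new_sw / new_hw_old_sw   # same hardware, newer software
    return platform, software, platform * software


if __name__ == "__main__":
    # Placeholder rates only; not taken from any report.
    platform, software, total = split_speedup(10.0, 20.0, 26.0)
    print(f"platform x{platform:.2f}, software x{software:.2f}, total x{total:.2f}")
```

If the old software cannot run on the new platform at all, the split is not measurable this way and only the combined factor can be reported.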

> >
> > Reading [3] one learns that several aspects of the CPU affect the
> forwarding performance, but very little light is shed on how much each
> feature on its own contributes. On page 172 one learns that CPU
> frequency has a linear impact on the performance. This is borne out by
> [1], but does not take into consideration inter-generational gaps as
> witnessed by [2].
> >
> > This begs the question, what are those inter-generational differences
> made of?
> >
> > - L3 cache latency (p. 54) as an upper limit on Mpps. Do newer
> generations have decidedly lower cache latencies and is this the
> defining performance factor?
> >
> > - Direct Data I/O (p. 69)? Is DDIO combined with lower L3 cache
> latency a major force multiplier? Or is prefetch sufficient to keep
> caches hot? This is somewhat confusing, as [3] states on page 62 that
> DPDK can get one core to handle up to 33 Mpps, on average. On one hand
> this is the performance [1] demonstrated the better part of a decade
> earlier, but on the other hand [2] demonstrates an order of magnitude
> larger performance per core.
> >
> > - New instructions? On page 171 [3] notes that the AVX512 instruction
> can move 64 bytes per cycle which [2] indicates has an almost 30% effect
> on Mpps on page 22. How important is Transactional Synchronization
> Extensions (TSX) support (see page 119 of [3]) for forwarding
> performance?
> >
> > - Other factors are mentioned, such as memory frequency, memory size,
> memory channels and cache sizes, but nothing is said how each of these
> affect forwarding performance in terms of Mpps. The official
> documentation [0] only states that: "Ensure that each memory channel has
> at least one memory DIMM inserted, and that the memory size for each is
> at least 4GB. Note: this has one of the most direct effects on
> performance."
> >
> > - Turbo boost and hyperthreading? Are these supposed to be enabled or
> disabled? I am getting conflicting information.  Results listed in [2]
> show increased Mpps by enabling, but [1] notes that they were disabled
> due to them introducing measurement artifacts. I recall some
> documentation recommending disabling, since enabling increases latency
> and variance.
> >
> > - Xeon D, W, E3, E5, E7 and Scalable. Are these different processor
> siblings observably different from each other from the perspective of
> DPDK? Atoms certainly are as [3] notes on page 57, because they only
> perform at 50% compared to an equivalent Xeon core. A reason isn't given,
> but perhaps it is due to the missing L3 cache?
> >
> > - Something entirely else? Am I missing something completely obvious
> that explains the inter-generational differences between CPUs in terms
> of forwarding performance?
> >
> >
> > So, given all this, how can I perform the mundane task of comparing
> for example the Xeon W-1250P with the Xeon W-1350P?
> >
> > The 1250 is older, but has a larger L2 cache and a higher frequency.
> >
> > The 1350 is newer, uses faster memory, has a higher max memory
> bandwidth, PCIe4.0, more PCI lanes and AVX-512.
> >
> > Or any other CPU model comparison, for that matter?
> >
> > - Jared
> >
> >
> > [0] https://doc.dpdk.org/guides-16.04/linux_gsg/nic_perf_intel_platform.html
> > [1] https://www.net.in.tum.de/fileadmin/bibtex/publications/papers/ICN2015.pdf
> > [2] http://fast.dpdk.org/doc/perf/DPDK_21_05_Intel_NIC_performance_report.pdf
> > [3] https://www.routledge.com/Data-Plane-Development-Kit-DPDK-A-Software-Optimization-Guide-to-the/Zhu/p/book/9780367373955
> >
> 
> 
> 
> 


Thread overview: 5+ messages
2021-09-17 15:23 Jared Brown
2021-09-17 15:51 ` Stephen Hemminger
2021-09-17 16:22   ` Jared Brown
2021-09-29 10:30 ` Thomas Monjalon
2021-09-30  9:26   ` Kinsella, Ray [this message]
