DPDK usage discussions
From: "Kinsella, Ray" <ray.kinsella@intel.com>
To: Antonio Di Bacco <a.dibacco.ks@gmail.com>,
	Stephen Hemminger <stephen@networkplumber.org>
Cc: "users@dpdk.org" <users@dpdk.org>
Subject: RE: Optimizing memory access with DPDK allocated memory
Date: Wed, 25 May 2022 10:55:20 +0000	[thread overview]
Message-ID: <PH0PR11MB47762B5DB43B098BE8420E8E90D69@PH0PR11MB4776.namprd11.prod.outlook.com> (raw)
In-Reply-To: <CAO8pfF=fbLCiWsWyfXv1Ns2pRX3=abB4t+_NeYsz+0ET6ASzig@mail.gmail.com>

Hi Antonio,

If it is an Intel platform you are using, you can take a look at the
Intel Memory Latency Checker:
https://www.intel.com/content/www/us/en/developer/articles/tool/intelr-memory-latency-checker.html

(Don't be fooled by the name, it measures bandwidth as well as latency.)
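For what it's worth, a typical invocation looks like this (flag names as I recall them from the MLC readme, so check ./mlc --help to be sure):

  ./mlc --peak_injection_bandwidth    # peak bandwidth for several read/write mixes
  ./mlc --bandwidth_matrix            # bandwidth from each socket to each socket

Running ./mlc with no arguments runs the whole suite, both latency and bandwidth.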

Ray K

-----Original Message-----
From: Antonio Di Bacco <a.dibacco.ks@gmail.com> 
Sent: Wednesday 25 May 2022 08:30
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: users@dpdk.org
Subject: Re: Optimizing memory access with DPDK allocated memory

Just to add some more info that could possibly be useful to someone.
Even if a processor has many memory channels, there is another parameter to take into consideration: a single "core" cannot exploit all of the available memory bandwidth.
For example, for DDR4-2933 with 4 channels the theoretical bandwidth is
2933 MT/s x 8 bytes (bus width) x 4 channels = 93,866.88 MB/s, or roughly 94 GB/s, but a single core (according to my tests with a DPDK process writing a 1GB hugepage, using a block size exceeding the L3 cache size) manages only about 12 GB/s.

Can anyone confirm that?
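
For anyone who wants to reproduce the single-core figure, here is a minimal sketch of the kind of test meant here (plain malloc and memset instead of the 1GB hugepage and DPDK allocator used in the actual test program, so only an approximation):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE (1UL << 30)   /* 1 GiB, well beyond any L3 cache */
#define ITERS    10

int main(void)
{
    /* plain malloc'd memory; the real test used a 1GB hugepage */
    char *buf = malloc(BUF_SIZE);
    if (buf == NULL)
        return 1;
    memset(buf, 0, BUF_SIZE);          /* fault the pages in first */

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITERS; i++)
        memset(buf, i, BUF_SIZE);      /* single-core streaming writes */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("single-core write bandwidth: %.2f GB/s\n",
           (double)BUF_SIZE * ITERS / sec / 1e9);
    free(buf);
    return 0;
}

Note that glibc's memset may switch to non-temporal stores for buffers this large, which is roughly what a streaming writer does anyway.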

On Mon, May 23, 2022 at 3:16 PM Antonio Di Bacco <a.dibacco.ks@gmail.com> wrote:
>
> Got feedback from a guy working on HPC with DPDK and he told me that 
> with dpdk mem-test (don't know where to find it) I should be doing 
> 16GB/s with DDR4 (2666) per channel. In my case with 6 channels I 
> should be doing 90GB/s .... that would be amazing!
>
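(As a sanity check on those figures: DDR4-2666 has a theoretical peak of 2666 MT/s x 8 bytes ~ 21.3 GB/s per channel, so 16 GB/s would be roughly 75% of peak on one channel; six channels give about 128 GB/s theoretical, so ~90 GB/s sustained would again be around 70%.)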
> On Sat, May 21, 2022 at 11:42 AM Antonio Di Bacco 
> <a.dibacco.ks@gmail.com> wrote:
> >
> > I read a couple of articles
> > (https://www.thomas-krenn.com/en/wiki/Optimize_memory_performance_of_Intel_Xeon_Scalable_systems?xtxsearchselecthit=1
> > and this
> > https://www.exxactcorp.com/blog/HPC/balance-memory-guidelines-for-intel-xeon-scalable-family-processors)
> > and I understood a little bit more.
> >
> > If the Xeon memory controller is able to spread contiguous memory
> > accesses onto different channels in hardware (as Stephen correctly
> > stated), then how can DPDK with the -n option benefit an application?
> > I also coded a test application that writes a 1GB hugepage and
> > measures the time needed, but after equipping two additional DIMMs
> > on two unused channels of my six-channel motherboard (X11DPi-NT)
> > I didn't observe any improvement. This is strange, because adding
> > two channels to the 4 already populated should make a noticeable
> > difference.
> >
> > For reference this is the small program for allocating and writing memory.
> > https://github.com/adibacco/simple_mp_mem_2
> > and the results with 4 memory channels:
> > https://docs.google.com/spreadsheets/d/1mDoKYLMhMMKDaOS3RuGEnpPgRNKuZOy4lMIhG-1N7B8/edit?usp=sharing
> >
> >
> > On Fri, May 20, 2022 at 5:48 PM Stephen Hemminger 
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Fri, 20 May 2022 10:34:46 +0200 Antonio Di Bacco 
> > > <a.dibacco.ks@gmail.com> wrote:
> > >
> > > > Let us say I have two memory channels, each with its own 16GB
> > > > memory module. I suppose the first memory channel will be used
> > > > when addressing physical memory in the range 0 to 0x4 0000 0000,
> > > > and the second when addressing physical memory in the range
> > > > 0x4 0000 0000 to 0x7 ffff ffff. Correct?
> > > > Now, I need a 2GB buffer with one "writer" and one "reader": the
> > > > writer writes to one half of the buffer (call it A) while, in the
> > > > meantime, the reader reads from the other half (B). When the
> > > > writer finishes writing its half (A), it signals the reader and
> > > > they swap: the reader starts to read from A and the writer starts
> > > > to write to B.
> > > > If I allocate the whole buffer (on two 1GB hugepages) across the
> > > > two memory channels, so that one half of the buffer sits at the
> > > > end of the first channel and the other half at the start of the
> > > > second channel, would this increase performance compared to the
> > > > whole buffer being allocated within a single memory channel?
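
A minimal sketch (an editorial illustration, not code from the thread) of the ping-pong scheme described above, using plain malloc and C11 atomics rather than DPDK hugepages; names and sizes are illustrative only:

#include <inttypes.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HALF   (1UL << 30)    /* two 1GB halves, as in the question */
#define ROUNDS 8

static uint8_t *half_buf[2];
static atomic_uint produced;  /* halves completely written */
static atomic_uint consumed;  /* halves completely read */

static void *writer(void *arg)
{
    (void)arg;
    for (unsigned r = 0; r < ROUNDS; r++) {
        /* don't overwrite a half the reader hasn't finished with yet */
        while (r >= 2 && atomic_load(&consumed) < r - 1)
            ;
        memset(half_buf[r & 1], (int)r, HALF);   /* "produce" data */
        atomic_fetch_add(&produced, 1);          /* publish half r&1 */
    }
    return NULL;
}

static void *reader(void *arg)
{
    (void)arg;
    uint64_t sum = 0;
    for (unsigned r = 0; r < ROUNDS; r++) {
        while (atomic_load(&produced) < r + 1)   /* wait for the writer */
            ;
        for (size_t i = 0; i < HALF; i += 64)    /* touch each cache line */
            sum += half_buf[r & 1][i];
        atomic_fetch_add(&consumed, 1);          /* hand the half back */
    }
    printf("checksum %" PRIu64 "\n", sum);
    return NULL;
}

int main(void)
{
    half_buf[0] = malloc(HALF);
    half_buf[1] = malloc(HALF);
    if (half_buf[0] == NULL || half_buf[1] == NULL)
        return 1;

    pthread_t w, rd;
    pthread_create(&w, NULL, writer, NULL);
    pthread_create(&rd, NULL, reader, NULL);
    pthread_join(w, NULL);
    pthread_join(rd, NULL);
    return 0;
}

The two counters provide the back-pressure the scheme needs: the writer never starts overwriting a half until the reader has handed it back, which is the swap-on-signal behaviour described above.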
> > >
> > > Most systems just interleave memory chips based on the number of filled slots.
> > > This is handled by the BIOS before the kernel even starts.
> > > DPDK has a number-of-memory-channels parameter, and what it does
> > > is try to optimize memory allocation by spreading.
> > >
> > > Looks like you are inventing your own limited version of what memif does.
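
To make the -n discussion concrete, here is a minimal, hypothetical sketch of passing the channel count to the EAL and reserving a hugepage-backed region; the comment about mempool padding reflects my understanding of how DPDK uses the value, so verify it against the EAL and rte_mempool documentation:

/* build against DPDK (pkg-config --cflags --libs libdpdk); run e.g.:
 *   ./mem_test -l 0-1 -n 4 --socket-mem 2048
 * -n tells the EAL how many memory channels the platform has; DPDK cannot
 * change BIOS interleaving, and the value is mainly used (e.g. by
 * rte_mempool) to pad object sizes so consecutive objects spread across
 * channels rather than all starting on the same one. */
#include <stdio.h>
#include <rte_eal.h>
#include <rte_memzone.h>

int main(int argc, char **argv)
{
    if (rte_eal_init(argc, argv) < 0) {   /* parses -l, -n, --socket-mem */
        fprintf(stderr, "EAL init failed\n");
        return 1;
    }

    /* reserve 1GB of hugepage-backed memory on socket 0 */
    const struct rte_memzone *mz =
        rte_memzone_reserve("bw_test", 1UL << 30, 0, RTE_MEMZONE_1GB);
    if (mz != NULL)
        printf("reserved %zu bytes at %p\n", mz->len, mz->addr);

    return rte_eal_cleanup();
}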

Thread overview: 7+ messages
2022-05-20  8:34 Antonio Di Bacco
2022-05-20 15:48 ` Stephen Hemminger
2022-05-21  9:42   ` Antonio Di Bacco
2022-05-23 13:16     ` Antonio Di Bacco
2022-05-25  7:30       ` Antonio Di Bacco
2022-05-25 10:55         ` Kinsella, Ray [this message]
2022-05-25 13:33           ` Antonio Di Bacco
