From: Paul Emmerich <emmericp@net.in.tum.de>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] TX performance regression caused by the mbuf cacheline split
Date: Mon, 15 Feb 2016 20:15:23 +0100
Message-ID: <56C223CB.9080901@net.in.tum.de>
In-Reply-To: <2601191342CEEE43887BDE71AB9772582142EB46@irsmsx105.ger.corp.intel.com>

Hi,

here's a rather late follow-up. I've only recently found the need to
seriously address DPDK 2.x support in MoonGen, mostly for better support
of XL710 NICs (which I still dislike, but people are using them...).
On 13.05.15 11:03, Ananyev, Konstantin wrote:
> Before starting to discuss your findings, there is one thing in your test app that looks strange to me:
> You use BATCH_SIZE==64 for TX packets, but your mempool cache_size==32.
> This is not really a good choice, as it means that on each iteration your mempool cache will be exhausted,
> and you'll end up doing ring_dequeue().
> I'd suggest you use something like '2 * BATCH_SIZE' for the mempool cache size;
> that should improve your numbers (at least it did for me).
Thanks for pointing that out. However, my real app did not have this bug,
and I also saw the performance improvement there.
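
For reference, creating the pool along those lines (cache sized to two full
TX batches) would look roughly like the sketch below; the pool name, pool
size, and the create_tx_pool() helper are just illustrative values, not taken
from my actual code:

#include <rte_mbuf.h>
#include <rte_mempool.h>

#define BATCH_SIZE 64
#define NUM_MBUFS  8191  /* example pool size, (2^13 - 1) */

/* Cache sized to a multiple of the TX batch so that a full burst can be
 * served from the per-lcore cache without falling back to ring_dequeue(). */
static struct rte_mempool *
create_tx_pool(int socket_id)
{
	return rte_pktmbuf_pool_create("tx_pool", NUM_MBUFS,
	                               2 * BATCH_SIZE,            /* cache_size */
	                               0,                         /* priv_size */
	                               RTE_MBUF_DEFAULT_BUF_SIZE, /* data_room_size */
	                               socket_id);
}
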
> Though, I suppose that scenario might be improved without a manual 'prefetch' - by reordering the code a bit.
> Below are 2 small patches that introduce rte_pktmbuf_bulk_alloc() and modify your test app to use it.
> Could you give it a try and see whether it helps to close the gap between 1.7.1 and 2.0?
> I don't have a box with the same setup off-hand, but on my IVB box the results are quite promising:
> at 1.2 GHz for simple_tx there is practically no difference in the results (-0.33%);
> for full_tx the drop is reduced to 2%.
> That's comparing DPDK 1.7.1 + testapp with cache_size=2*batch_size vs
> latest DPDK + testapp with cache_size=2*batch_size + bulk_alloc.
The bulk_alloc patch is great and helps. I'd love to see such a function
in DPDK.
I agree that this is a better solution than prefetching. I also can't
see a difference with/without prefetching when using bulk alloc.
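
For anyone who wants to try this in the meantime: a minimal sketch of such a
helper, built only on the existing rte_mempool_get_bulk()/rte_pktmbuf_reset()
API, is below. The name and the details are illustrative and not necessarily
identical to Konstantin's patch:

#include <rte_branch_prediction.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

/* Allocate 'count' mbufs with a single mempool operation instead of
 * 'count' individual rte_pktmbuf_alloc() calls.
 * Returns 0 on success, a negative value otherwise (nothing allocated). */
static inline int
pktmbuf_bulk_alloc(struct rte_mempool *pool,
                   struct rte_mbuf **mbufs, unsigned count)
{
	unsigned i;
	int rc = rte_mempool_get_bulk(pool, (void **)mbufs, count);

	if (unlikely(rc != 0))
		return rc;

	for (i = 0; i < count; i++) {
		rte_mbuf_refcnt_set(mbufs[i], 1);
		rte_pktmbuf_reset(mbufs[i]);
	}
	return 0;
}

The TX path can then fetch a whole batch in one call, e.g.
pktmbuf_bulk_alloc(pool, bufs, BATCH_SIZE), before filling in the descriptors.
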
Paul
Thread overview: 10+ messages
2015-05-11 0:14 Paul Emmerich
2015-05-11 9:13 ` Luke Gorrie
2015-05-11 10:16 ` Paul Emmerich
2015-05-11 22:32 ` Paul Emmerich
2015-05-11 23:18 ` Paul Emmerich
2015-05-12 0:28 ` Marc Sune
2015-05-12 0:38 ` Marc Sune
2015-05-13 9:03 ` Ananyev, Konstantin
2016-02-15 19:15 ` Paul Emmerich [this message]
2016-02-19 12:31 ` Olivier MATZ