DPDK usage discussions
From: Kyle Larose <klarose@sandvine.com>
To: Shihabur Rahman Chowdhury <shihab.buet@gmail.com>,
	Shahaf Shuler <shahafs@mellanox.com>
Cc: Dave Wallace <dwallacelf@gmail.com>,
	Olga Shern <olgas@mellanox.com>,
	Adrien Mazarguil <adrien.mazarguil@6wind.com>,
	"Wiles, Keith" <keith.wiles@intel.com>,
	"users@dpdk.org" <users@dpdk.org>
Subject: Re: [dpdk-users] Low Rx throughput when using Mellanox ConnectX-3 card with DPDK
Date: Thu, 13 Apr 2017 15:49:02 +0000	[thread overview]
Message-ID: <D76BBBCF97F57144BB5FCF08007244A7705A6B96@wtl-exchp-1.sandvine.com> (raw)
In-Reply-To: <CAMGVCn4LNOyWP5BxyMQ9s34JTQSNs_vYAyQZt_z38jRC3TgHjw@mail.gmail.com>

Hey Shihab,


> -----Original Message-----
> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Shihabur Rahman
> Chowdhury
> Sent: Thursday, April 13, 2017 10:21 AM
> To: Shahaf Shuler
> Cc: Dave Wallace; Olga Shern; Adrien Mazarguil; Wiles, Keith; users@dpdk.org
> Subject: Re: [dpdk-users] Low Rx throughput when using Mellanox ConnectX-3
> card with DPDK
> 
>
> To give a bit more context, we are developing a set of packet processors
> that can be deployed as separate processes and scaled out independently.
> A batch of packets goes through a sequence of processes until, at some
> point, it is written to a Tx queue or dropped because of a processing
> decision. These packet processors run as secondary DPDK processes, and Rx
> takes place in a primary process (since the Mellanox PMD does not allow Rx
> from a secondary process). In this example configuration, one primary
> process does the Rx and hands the packets to a secondary process through a
> shared ring; that secondary process swaps the MACs and writes the packets
> to the Tx queue. We expect some performance drop because of cache
> invalidation across lcores (we also cannot run different secondary
> processes on the same lcore without corrupting the mempool cache), but
> again, 7.3 Mpps is ~30+% overhead.
> 
> As you suggested, we tried run-to-completion processing in the primary
> process (i.e., Rx and Tx are now on the same lcore). We also configured
> pktgen to handle Rx and Tx on the same lcore. With that we now get
> ~9.9-10 Mpps with 64B packets; with our multi-process setup it drops to
> ~8.4 Mpps. So it seems pktgen was not configured properly before. That is
> a bit counter-intuitive: on pktgen's side, doing Rx and Tx on different
> lcores should not cause any cache invalidation (the sets of Rx and Tx
> packets are disjoint), so using different lcores should in theory be
> better than handling both Rx and Tx on the same lcore. Am I missing
> something here?
> 
> Thanks
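
Just to make sure I follow the setup, here is a minimal sketch of the handoff you describe, assuming a plain rte_ring shared between the primary and the secondary. The names, burst size and port/queue ids are placeholders, and the 4-argument ring calls are the DPDK 17.05 signatures (older releases omit the trailing NULL):

#include <rte_ethdev.h>
#include <rte_ether.h>
#include <rte_mbuf.h>
#include <rte_ring.h>

#define BURST 32

/* Primary process: Rx only, since the mlx4 PMD requires Rx in the primary.
 * Received mbufs are pushed into a ring shared with the secondary. */
static void
primary_rx_loop(uint8_t port, struct rte_ring *ring)
{
	struct rte_mbuf *bufs[BURST];

	for (;;) {
		uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs, BURST);
		if (nb_rx == 0)
			continue;
		unsigned int nb_q = rte_ring_enqueue_burst(ring,
				(void **)bufs, nb_rx, NULL);
		while (nb_q < nb_rx)	/* drop what the ring cannot absorb */
			rte_pktmbuf_free(bufs[nb_q++]);
	}
}

/* Secondary process: pulls from the same ring (found via rte_ring_lookup()),
 * swaps source/destination MACs and transmits. */
static void
secondary_worker_loop(uint8_t port, struct rte_ring *ring)
{
	struct rte_mbuf *bufs[BURST];
	unsigned int i, nb;

	for (;;) {
		nb = rte_ring_dequeue_burst(ring, (void **)bufs, BURST, NULL);
		for (i = 0; i < nb; i++) {
			struct ether_hdr *eth =
				rte_pktmbuf_mtod(bufs[i], struct ether_hdr *);
			struct ether_addr tmp;

			ether_addr_copy(&eth->s_addr, &tmp);
			ether_addr_copy(&eth->d_addr, &eth->s_addr);
			ether_addr_copy(&tmp, &eth->d_addr);
		}
		uint16_t nb_tx = rte_eth_tx_burst(port, 0, bufs, nb);
		while (nb_tx < nb)	/* free anything Tx could not take */
			rte_pktmbuf_free(bufs[nb_tx++]);
	}
}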

It sounds to me like your bottleneck is the primary -- the packet distributor. Consider Shahaf's earlier comment: the best Mellanox was able to achieve with testpmd (which is extremely simple) is 10 Mpps per core. I've always found that receiving is more expensive than transmitting, which means that if you're splitting your work along those lines, you'll need to allocate more CPU to the receiver than to the transmitter. This may be one of the reasons run-to-completion works out -- the lower Tx load on that core offsets the higher Rx load.
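
For contrast, the run-to-completion variant keeps Rx, the MAC swap and Tx on one lcore, so no mbufs cross a core boundary. swap_mac() below is just shorthand for the three ether_addr_copy() calls in the sketch above; port and queue ids are again placeholders:

static void
run_to_completion_loop(uint8_t port)
{
	struct rte_mbuf *bufs[BURST];
	uint16_t i, nb_rx, nb_tx;

	for (;;) {
		nb_rx = rte_eth_rx_burst(port, 0, bufs, BURST);
		for (i = 0; i < nb_rx; i++)
			swap_mac(bufs[i]);	/* MAC swap as in the sketch above */
		nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_rx);
		while (nb_tx < nb_rx)
			rte_pktmbuf_free(bufs[nb_tx++]);
	}
}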

If you want to continue using the packet distribution model, why don't you try using RSS/multiqueue on the distributor, and allocate two cores to it? You'll need some entropy in the packets for it to distribute well, but hopefully that's not a problem. :)
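
Something along these lines on the distributor port could be a starting point. It is purely illustrative: the queue and descriptor counts and the rss_hf mask are guesses, the macro names are the current (pre-18.x) ones, and you should check what the mlx4 PMD actually supports:

#include <rte_ethdev.h>

/* Enable RSS with two Rx queues so two distributor lcores can each poll
 * their own queue and push into the shared ring(s). */
static int
configure_rss_two_queues(uint8_t port, struct rte_mempool *mp)
{
	struct rte_eth_conf conf = {
		.rxmode = {
			.mq_mode = ETH_MQ_RX_RSS,
		},
		.rx_adv_conf = {
			.rss_conf = {
				.rss_key = NULL,	/* default key */
				.rss_hf = ETH_RSS_IP | ETH_RSS_UDP | ETH_RSS_TCP,
			},
		},
	};
	uint16_t q;
	int ret;

	ret = rte_eth_dev_configure(port, 2 /* Rx queues */, 1 /* Tx queue */,
			&conf);
	if (ret < 0)
		return ret;

	for (q = 0; q < 2; q++) {
		ret = rte_eth_rx_queue_setup(port, q, 512,
				rte_eth_dev_socket_id(port), NULL, mp);
		if (ret < 0)
			return ret;
	}

	ret = rte_eth_tx_queue_setup(port, 0, 512,
			rte_eth_dev_socket_id(port), NULL);
	if (ret < 0)
		return ret;

	/* Each distributor lcore then calls rte_eth_rx_burst(port, q, ...)
	 * on its own queue index. */
	return rte_eth_dev_start(port);
}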

Thanks,

Kyle

Thread overview: 12+ messages
2017-04-12 21:00 Shihabur Rahman Chowdhury
2017-04-12 22:41 ` Wiles, Keith
2017-04-13  0:06   ` Shihabur Rahman Chowdhury
2017-04-13  1:56     ` Dave Wallace
2017-04-13  1:57       ` Shihabur Rahman Chowdhury
2017-04-13  5:19         ` Shahaf Shuler
2017-04-13 14:21           ` Shihabur Rahman Chowdhury
2017-04-13 15:49             ` Kyle Larose [this message]
2017-04-17 17:43               ` Shihabur Rahman Chowdhury
2017-04-13 13:49     ` Wiles, Keith
2017-04-13 14:22       ` Shihabur Rahman Chowdhury
2017-04-13 14:47         ` Wiles, Keith
