* [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Michael Quicquaro @ 2014-01-09 19:28 UTC
To: dev; +Cc: mayhan

Hello,
My hardware is a Dell PowerEdge R820:
4x Intel Xeon E5-4620 2.20GHz 8 core
16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
Intel X520 DP 10Gb DA/SFP+

So in summary 32 cores @ 2.20GHz and 256GB RAM

... plenty of horsepower.

I've reserved 16 1GB Hugepages

I am configuring only one interface and using testpmd in rx_only mode to
first see if I can receive at line rate.

I am generating traffic on a different system which is running the netmap
pkt-gen program - generating 64 byte packets at close to line rate.

I am only able to receive approx. 75% of line rate and I see the Rx-errors
in the port stats going up proportionally.
I have verified that all receive queues are being used, but strangely
enough, it doesn't matter how many queues more than 2 that I use, the
throughput is the same. I have verified with 'mpstat -P ALL' that all
specified cores are used. The utilization of each core is only roughly 25%.

Here is my command line:
testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
--nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512 --rxq=8
--txq=8 --interactive

What can I do to trace down this problem? It seems very similar to a
thread on this list back in May titled "Best example for showing
throughput?" where no resolution was ever mentioned in the thread.

Thanks for any help.
- Michael
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: François-Frédéric Ozog @ 2014-01-09 21:21 UTC
To: 'Michael Quicquaro', dev; +Cc: mayhan

Hi,

Can you check that the threads you use for handling the queues are on the
same socket as the card?

cat /sys/class/net/<interface name>/device/numa_node

will give you the node.

François-Frédéric

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On behalf of Michael Quicquaro
> Sent: Thursday, January 9, 2014 20:28
> To: dev@dpdk.org
> Cc: mayhan@mayhan.org
> Subject: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)
>
> Hello,
> My hardware is a Dell PowerEdge R820:
> 4x Intel Xeon E5-4620 2.20GHz 8 core
> 16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
> Intel X520 DP 10Gb DA/SFP+
>
> So in summary 32 cores @ 2.20GHz and 256GB RAM
>
> ... plenty of horsepower.
>
> I've reserved 16 1GB Hugepages
>
> I am configuring only one interface and using testpmd in rx_only mode to
> first see if I can receive at line rate.
>
> I am generating traffic on a different system which is running the netmap
> pkt-gen program - generating 64 byte packets at close to line rate.
>
> I am only able to receive approx. 75% of line rate and I see the Rx-errors
> in the port stats going up proportionally.
> I have verified that all receive queues are being used, but strangely
> enough, it doesn't matter how many queues more than 2 that I use, the
> throughput is the same. I have verified with 'mpstat -P ALL' that all
> specified cores are used. The utilization of each core is only roughly
> 25%.
>
> Here is my command line:
> testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
> --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512 --rxq=8
> --txq=8 --interactive
>
> What can I do to trace down this problem? It seems very similar to a
> thread on this list back in May titled "Best example for showing
> throughput?" where no resolution was ever mentioned in the thread.
>
> Thanks for any help.
> - Michael
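A minimal illustrative sketch (not from any message in this thread) of
doing the same affinity check from inside a DPDK application rather than
from sysfs, using the standard rte_eth_dev_socket_id() and
rte_lcore_to_socket_id() helpers; the port id and the warning format are
assumptions:

    /* Warn when a polling lcore sits on a different NUMA socket than the
     * NIC port it is meant to serve. */
    #include <stdio.h>
    #include <rte_ethdev.h>
    #include <rte_lcore.h>

    static void check_numa_affinity(uint8_t port_id)
    {
        int nic_socket = rte_eth_dev_socket_id(port_id); /* -1 if unknown */
        unsigned lcore;

        RTE_LCORE_FOREACH(lcore) {
            unsigned lcore_socket = rte_lcore_to_socket_id(lcore);
            if (nic_socket >= 0 && lcore_socket != (unsigned)nic_socket)
                printf("lcore %u is on socket %u, but port %u is on socket %d\n",
                       lcore, lcore_socket, port_id, nic_socket);
        }
    }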
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Dmitry Vyal @ 2014-01-22 14:52 UTC
To: Michael Quicquaro, dev; +Cc: mayhan

Hello Michael,

I suggest you check average burst sizes on receive queues. It looks like I
stumbled upon a similar issue several times. If you are calling
rte_eth_rx_burst too frequently, the NIC begins losing packets no matter
how much CPU horsepower you have (the more you have, the more it loses,
actually). In my case this situation occurred when the average burst size
was less than 20 packets or so. I'm not sure what the reason for this
behavior is, but I observed it in several applications on Intel 82599 10Gb
cards.

Regards, Dmitry


On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
> Hello,
> My hardware is a Dell PowerEdge R820:
> 4x Intel Xeon E5-4620 2.20GHz 8 core
> 16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
> Intel X520 DP 10Gb DA/SFP+
>
> So in summary 32 cores @ 2.20GHz and 256GB RAM
>
> ... plenty of horsepower.
>
> I've reserved 16 1GB Hugepages
>
> I am configuring only one interface and using testpmd in rx_only mode to
> first see if I can receive at line rate.
>
> I am generating traffic on a different system which is running the netmap
> pkt-gen program - generating 64 byte packets at close to line rate.
>
> I am only able to receive approx. 75% of line rate and I see the Rx-errors
> in the port stats going up proportionally.
> I have verified that all receive queues are being used, but strangely
> enough, it doesn't matter how many queues more than 2 that I use, the
> throughput is the same. I have verified with 'mpstat -P ALL' that all
> specified cores are used. The utilization of each core is only roughly 25%.
>
> Here is my command line:
> testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
> --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512 --rxq=8
> --txq=8 --interactive
>
> What can I do to trace down this problem? It seems very similar to a
> thread on this list back in May titled "Best example for showing
> throughput?" where no resolution was ever mentioned in the thread.
>
> Thanks for any help.
> - Michael
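A rough sketch (not from the thread) of how the average burst size Dmitry
suggests watching could be measured in an rx-only polling loop; the burst
size of 512 matches the testpmd command line above, while the counters and
the reporting interval are illustrative:

    /* Accumulate packet counts and non-empty polls so the average rx burst
     * size can be printed while the test runs. */
    #include <stdio.h>
    #include <stdint.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST_SIZE 512

    static uint64_t total_pkts;
    static uint64_t nonempty_polls;

    static void poll_rx_queue(uint8_t port, uint16_t queue)
    {
        struct rte_mbuf *pkts[BURST_SIZE];
        uint16_t i, n = rte_eth_rx_burst(port, queue, pkts, BURST_SIZE);

        if (n == 0)
            return;
        nonempty_polls++;
        total_pkts += n;
        for (i = 0; i < n; i++)
            rte_pktmbuf_free(pkts[i]); /* rx-only: just drop the packets */
        if ((nonempty_polls & 0xfffff) == 0)
            printf("average burst size: %.1f packets\n",
                   (double)total_pkts / nonempty_polls);
    }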
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Wang, Shawn @ 2014-01-22 17:46 UTC
To: Dmitry Vyal, Michael Quicquaro, dev; +Cc: mayhan

Does your NIC connect directly to your socket? If not, the packet might go
through QPI, which will cause additional latency. Check your motherboard.

Wang, Shawn

On 1/22/14, 6:52 AM, "Dmitry Vyal" <dmitryvyal@gmail.com> wrote:

>Hello Michael,
>
>I suggest you check average burst sizes on receive queues. It looks like
>I stumbled upon a similar issue several times. If you are calling
>rte_eth_rx_burst too frequently, the NIC begins losing packets no matter
>how much CPU horsepower you have (the more you have, the more it loses,
>actually). In my case this situation occurred when the average burst size
>was less than 20 packets or so. I'm not sure what the reason for this
>behavior is, but I observed it in several applications on Intel 82599
>10Gb cards.
>
>Regards, Dmitry
>
>
>On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
>> Hello,
>> My hardware is a Dell PowerEdge R820:
>> 4x Intel Xeon E5-4620 2.20GHz 8 core
>> 16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
>> Intel X520 DP 10Gb DA/SFP+
>>
>> So in summary 32 cores @ 2.20GHz and 256GB RAM
>>
>> ... plenty of horsepower.
>>
>> I've reserved 16 1GB Hugepages
>>
>> I am configuring only one interface and using testpmd in rx_only mode to
>> first see if I can receive at line rate.
>>
>> I am generating traffic on a different system which is running the
>>netmap
>> pkt-gen program - generating 64 byte packets at close to line rate.
>>
>> I am only able to receive approx. 75% of line rate and I see the
>>Rx-errors
>> in the port stats going up proportionally.
>> I have verified that all receive queues are being used, but strangely
>> enough, it doesn't matter how many queues more than 2 that I use, the
>> throughput is the same. I have verified with 'mpstat -P ALL' that all
>> specified cores are used. The utilization of each core is only roughly
>>25%.
>>
>> Here is my command line:
>> testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
>> --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512 --rxq=8
>> --txq=8 --interactive
>>
>> What can I do to trace down this problem? It seems very similar to a
>> thread on this list back in May titled "Best example for showing
>> throughput?" where no resolution was ever mentioned in the thread.
>>
>> Thanks for any help.
>> - Michael
>
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Michael Quicquaro @ 2014-01-27 20:00 UTC
To: Dmitry Vyal; +Cc: dev

Dmitry,
I cannot thank you enough for this information. This too was my main
problem. I put a "small" unmeasured delay before the call to
rte_eth_rx_burst() and suddenly it starts returning bursts of 512 packets
vs. 4!!

Best Regards,
Mike


On Wed, Jan 22, 2014 at 9:52 AM, Dmitry Vyal <dmitryvyal@gmail.com> wrote:

> Hello Michael,
>
> I suggest you check average burst sizes on receive queues. It looks like I
> stumbled upon a similar issue several times. If you are calling
> rte_eth_rx_burst too frequently, the NIC begins losing packets no matter
> how much CPU horsepower you have (the more you have, the more it loses,
> actually). In my case this situation occurred when the average burst size
> was less than 20 packets or so. I'm not sure what the reason for this
> behavior is, but I observed it in several applications on Intel 82599 10Gb
> cards.
>
> Regards, Dmitry
>
>
> On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
>
>> Hello,
>> My hardware is a Dell PowerEdge R820:
>> 4x Intel Xeon E5-4620 2.20GHz 8 core
>> 16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
>> Intel X520 DP 10Gb DA/SFP+
>>
>> So in summary 32 cores @ 2.20GHz and 256GB RAM
>>
>> ... plenty of horsepower.
>>
>> I've reserved 16 1GB Hugepages
>>
>> I am configuring only one interface and using testpmd in rx_only mode to
>> first see if I can receive at line rate.
>>
>> I am generating traffic on a different system which is running the netmap
>> pkt-gen program - generating 64 byte packets at close to line rate.
>>
>> I am only able to receive approx. 75% of line rate and I see the Rx-errors
>> in the port stats going up proportionally.
>> I have verified that all receive queues are being used, but strangely
>> enough, it doesn't matter how many queues more than 2 that I use, the
>> throughput is the same. I have verified with 'mpstat -P ALL' that all
>> specified cores are used. The utilization of each core is only roughly
>> 25%.
>>
>> Here is my command line:
>> testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
>> --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512 --rxq=8
>> --txq=8 --interactive
>>
>> What can I do to trace down this problem? It seems very similar to a
>> thread on this list back in May titled "Best example for showing
>> throughput?" where no resolution was ever mentioned in the thread.
>>
>> Thanks for any help.
>> - Michael
>>
>
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Dmitry Vyal @ 2014-01-28 8:31 UTC
To: Michael Quicquaro; +Cc: dev

On 01/28/2014 12:00 AM, Michael Quicquaro wrote:
> Dmitry,
> I cannot thank you enough for this information. This too was my main
> problem. I put a "small" unmeasured delay before the call to
> rte_eth_rx_burst() and suddenly it starts returning bursts of 512
> packets vs. 4!!
> Best Regards,
> Mike
>

Thanks for confirming my guesses! By the way, make sure the number of
packets you receive in a single burst is less than the configured queue
size, or you will lose packets too. Maybe your "small" delay is not so
small :) For my own purposes I use a delay of about 150 usecs.

P.S. I wonder why this issue is not mentioned in the documentation. Is it
evident to everyone doing network programming?

>
> On Wed, Jan 22, 2014 at 9:52 AM, Dmitry Vyal <dmitryvyal@gmail.com
> <mailto:dmitryvyal@gmail.com>> wrote:
>
>     Hello Michael,
>
>     I suggest you check average burst sizes on receive queues. It looks
>     like I stumbled upon a similar issue several times. If you are
>     calling rte_eth_rx_burst too frequently, the NIC begins losing
>     packets no matter how much CPU horsepower you have (the more you
>     have, the more it loses, actually). In my case this situation
>     occurred when the average burst size was less than 20 packets or so.
>     I'm not sure what the reason for this behavior is, but I observed it
>     in several applications on Intel 82599 10Gb cards.
>
>     Regards, Dmitry
>
>
>     On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
>
>         Hello,
>         My hardware is a Dell PowerEdge R820:
>         4x Intel Xeon E5-4620 2.20GHz 8 core
>         16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
>         Intel X520 DP 10Gb DA/SFP+
>
>         So in summary 32 cores @ 2.20GHz and 256GB RAM
>
>         ... plenty of horsepower.
>
>         I've reserved 16 1GB Hugepages
>
>         I am configuring only one interface and using testpmd in rx_only
>         mode to first see if I can receive at line rate.
>
>         I am generating traffic on a different system which is running
>         the netmap pkt-gen program - generating 64 byte packets at close
>         to line rate.
>
>         I am only able to receive approx. 75% of line rate and I see the
>         Rx-errors in the port stats going up proportionally.
>         I have verified that all receive queues are being used, but
>         strangely enough, it doesn't matter how many queues more than 2
>         that I use, the throughput is the same. I have verified with
>         'mpstat -P ALL' that all specified cores are used. The
>         utilization of each core is only roughly 25%.
>
>         Here is my command line:
>         testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
>         --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512
>         --rxq=8 --txq=8 --interactive
>
>         What can I do to trace down this problem? It seems very similar
>         to a thread on this list back in May titled "Best example for
>         showing throughput?" where no resolution was ever mentioned in
>         the thread.
>
>         Thanks for any help.
>         - Michael
>
>
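A minimal sketch of the workaround being discussed: pause briefly before
each poll so the hardware can accumulate a larger burst, and keep the burst
bound below the configured ring size. The 150 us figure is Dmitry's, and
the 512-packet burst and 2048-descriptor ring match Michael's testpmd
command line; the loop itself is illustrative, not code from the thread.

    #include <rte_ethdev.h>
    #include <rte_cycles.h>
    #include <rte_mbuf.h>

    #define BURST_MAX     512   /* keep well below the 2048-descriptor ring */
    #define POLL_DELAY_US 150   /* let descriptors accumulate between polls */

    static void rx_loop(uint8_t port, uint16_t queue)
    {
        struct rte_mbuf *pkts[BURST_MAX];
        uint16_t i, n;

        for (;;) {
            rte_delay_us(POLL_DELAY_US);                 /* busy-wait */
            n = rte_eth_rx_burst(port, queue, pkts, BURST_MAX);
            for (i = 0; i < n; i++)
                rte_pktmbuf_free(pkts[i]);               /* rx-only: drop */
        }
    }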
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Jun Han @ 2014-02-10 17:34 UTC
To: Dmitry Vyal; +Cc: dev

Hi Michael,

We are also trying to purchase an IXIA traffic generator. Could you let us
know which chassis + load modules you are using so we can use that as a
reference to look for the model we need? There seems to be quite a number
of different models.

Thank you.


On Tue, Jan 28, 2014 at 9:31 AM, Dmitry Vyal <dmitryvyal@gmail.com> wrote:

> On 01/28/2014 12:00 AM, Michael Quicquaro wrote:
>
>> Dmitry,
>> I cannot thank you enough for this information. This too was my main
>> problem. I put a "small" unmeasured delay before the call to
>> rte_eth_rx_burst() and suddenly it starts returning bursts of 512 packets
>> vs. 4!!
>> Best Regards,
>> Mike
>>
>
> Thanks for confirming my guesses! By the way, make sure the number of
> packets you receive in a single burst is less than the configured queue
> size, or you will lose packets too. Maybe your "small" delay is not so
> small :) For my own purposes I use a delay of about 150 usecs.
>
> P.S. I wonder why this issue is not mentioned in the documentation. Is it
> evident to everyone doing network programming?
>
>
>> On Wed, Jan 22, 2014 at 9:52 AM, Dmitry Vyal <dmitryvyal@gmail.com
>> <mailto:dmitryvyal@gmail.com>> wrote:
>>
>>     Hello Michael,
>>
>>     I suggest you check average burst sizes on receive queues. It looks
>>     like I stumbled upon a similar issue several times. If you are
>>     calling rte_eth_rx_burst too frequently, the NIC begins losing
>>     packets no matter how much CPU horsepower you have (the more you
>>     have, the more it loses, actually). In my case this situation
>>     occurred when the average burst size was less than 20 packets or so.
>>     I'm not sure what the reason for this behavior is, but I observed it
>>     in several applications on Intel 82599 10Gb cards.
>>
>>     Regards, Dmitry
>>
>>
>>     On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
>>
>>         Hello,
>>         My hardware is a Dell PowerEdge R820:
>>         4x Intel Xeon E5-4620 2.20GHz 8 core
>>         16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
>>         Intel X520 DP 10Gb DA/SFP+
>>
>>         So in summary 32 cores @ 2.20GHz and 256GB RAM
>>
>>         ... plenty of horsepower.
>>
>>         I've reserved 16 1GB Hugepages
>>
>>         I am configuring only one interface and using testpmd in rx_only
>>         mode to first see if I can receive at line rate.
>>
>>         I am generating traffic on a different system which is running
>>         the netmap pkt-gen program - generating 64 byte packets at close
>>         to line rate.
>>
>>         I am only able to receive approx. 75% of line rate and I see the
>>         Rx-errors in the port stats going up proportionally.
>>         I have verified that all receive queues are being used, but
>>         strangely enough, it doesn't matter how many queues more than 2
>>         that I use, the throughput is the same. I have verified with
>>         'mpstat -P ALL' that all specified cores are used. The
>>         utilization of each core is only roughly 25%.
>>
>>         Here is my command line:
>>         testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
>>         --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512
>>         --rxq=8 --txq=8 --interactive
>>
>>         What can I do to trace down this problem? It seems very similar
>>         to a thread on this list back in May titled "Best example for
>>         showing throughput?" where no resolution was ever mentioned in
>>         the thread.
>>
>>         Thanks for any help.
>>         - Michael
>>
>>
>
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Robert Sanford @ 2014-01-22 20:38 UTC
To: Michael Quicquaro; +Cc: dev, mayhan

Hi Michael,

> What can I do to trace down this problem?

May I suggest that you try to be more selective in the core masks on the
command line. The test app may choose some cores from "other" CPU sockets.
Only enable cores of the one socket to which the NIC is attached.


> It seems very similar to a
> thread on this list back in May titled "Best example for showing
> throughput?" where no resolution was ever mentioned in the thread.

After re-reading *that* thread, it appears that their problem may have
been trying to achieve ~40 Gbits/s of bandwidth (2 ports x 10 Gb Rx + 2
ports x 10 Gb Tx), plus overhead, over a typical dual-port NIC whose total
bus bandwidth is a maximum of 32 Gbits/s (PCI express 2.1 x8).

--
Regards,
Robert
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Michael Quicquaro @ 2014-01-23 23:22 UTC
To: Robert Sanford; +Cc: dev, mayhan

Thank you, everyone, for all of your suggestions, but unfortunately I'm
still having the problem.

I have reduced the test down to using 2 cores (one is the master core),
both of which are on the socket to which the NIC's PCI slot is connected.
I am running in rxonly mode, so I am basically just counting the packets.
I've tried all different burst sizes. Nothing seems to make any
difference.

Since my original post, I have acquired an IXIA tester so I have better
control over my testing. I send 250,000,000 packets to the interface. I
am getting roughly 25,000,000 Rx-errors with every run. I have verified
that the number of Rx-errors is consistent with the value in the RXMPC
register of the NIC.

Just for sanity's sake, I tried switching the cores to the other socket
and ran the same test. As expected I got more packet loss - roughly
87,000,000.

I am running Red Hat 6.4, which uses kernel 2.6.32-358.

This is a NUMA-supported system, but whether or not I use --numa doesn't
seem to make a difference.

Looking at the Intel documentation it appears that I should be able to
easily do what I am trying to do. Actually, the documentation implies that
I should be able to do roughly 40 Gbps with a single 2.x GHz processor
core with other configuration (memory, os, etc.) similar to my system. It
appears to me that many of the details of these benchmarks are missing.

Can someone on this list actually verify for me that what I am trying to
do is possible and that they have done it with success?

Much appreciation for all the help.
- Michael


On Wed, Jan 22, 2014 at 3:38 PM, Robert Sanford <rsanford@prolexic.com> wrote:

> Hi Michael,
>
> > What can I do to trace down this problem?
>
> May I suggest that you try to be more selective in the core masks on the
> command line. The test app may choose some cores from "other" CPU sockets.
> Only enable cores of the one socket to which the NIC is attached.
>
>
> > It seems very similar to a
> > thread on this list back in May titled "Best example for showing
> > throughput?" where no resolution was ever mentioned in the thread.
>
> After re-reading *that* thread, it appears that their problem may have
> been trying to achieve ~40 Gbits/s of bandwidth (2 ports x 10 Gb Rx + 2
> ports x 10 Gb Tx), plus overhead, over a typical dual-port NIC whose total
> bus bandwidth is a maximum of 32 Gbits/s (PCI express 2.1 x8).
>
> --
> Regards,
> Robert
>
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: François-Frédéric Ozog @ 2014-01-24 9:18 UTC
To: 'Michael Quicquaro'; +Cc: dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On behalf of Michael Quicquaro
> Sent: Friday, January 24, 2014 00:23
> To: Robert Sanford
> Cc: dev@dpdk.org; mayhan@mayhan.org
> Subject: Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)
>
> Thank you, everyone, for all of your suggestions, but unfortunately I'm
> still having the problem.
>
> I have reduced the test down to using 2 cores (one is the master core),
> both of which are on the socket to which the NIC's PCI slot is connected.
> I am running in rxonly mode, so I am basically just counting the packets.
> I've tried all different burst sizes. Nothing seems to make any
> difference.
>
> Since my original post, I have acquired an IXIA tester so I have better
> control over my testing. I send 250,000,000 packets to the interface. I
> am getting roughly 25,000,000 Rx-errors with every run. I have verified
> that the number of Rx-errors is consistent with the value in the RXMPC
> register of the NIC.
>
> Just for sanity's sake, I tried switching the cores to the other socket
> and ran the same test. As expected I got more packet loss - roughly
> 87,000,000.
>
> I am running Red Hat 6.4, which uses kernel 2.6.32-358.
>
> This is a NUMA-supported system, but whether or not I use --numa doesn't
> seem to make a difference.
>

Is the BIOS configured for NUMA? If not, the BIOS may program System
Address Decoding so that the memory address space is interleaved between
sockets on 64MB boundaries (you may have a look at the Xeon 7500 datasheet
volume 2 - a public document - §4.4 for an "explanation" of this). In
general you don't want memory interleaving: QPI bandwidth tops out at
16GB/s on the latest processors while single-node aggregated memory
bandwidth can be over 60GB/s.

> Looking at the Intel documentation it appears that I should be able to
> easily do what I am trying to do. Actually, the documentation implies
> that I should be able to do roughly 40 Gbps with a single 2.x GHz
> processor core with other configuration (memory, os, etc.) similar to my
> system. It appears to me that many of the details of these benchmarks
> are missing.
>
> Can someone on this list actually verify for me that what I am trying to
> do is possible and that they have done it with success?

I have done a NAT64 proof of concept that handled 40Gbps throughput on a
single Xeon E5 2697v2. The Intel NIC chip was 82599ES (if I recall
correctly, I don't have the card handy anymore), 4 rx queues and 4 tx
queues per port, 32768 descriptors per queue, Intel DCA on, Ethernet pause
parameters OFF: 14.8Mpps per port, no packet loss. However, this was with
a kernel-based proprietary packet framework. I expect DPDK to achieve the
same results.

>
> Much appreciation for all the help.
> - Michael
>
>
> On Wed, Jan 22, 2014 at 3:38 PM, Robert Sanford
> <rsanford@prolexic.com> wrote:
>
> > Hi Michael,
> >
> > > What can I do to trace down this problem?
> >
> > May I suggest that you try to be more selective in the core masks on
> > the command line. The test app may choose some cores from "other" CPU
> > sockets.
> > Only enable cores of the one socket to which the NIC is attached.
> >
> >
> > > It seems very similar to a
> > > thread on this list back in May titled "Best example for showing
> > > throughput?" where no resolution was ever mentioned in the thread.
> >
> > After re-reading *that* thread, it appears that their problem may have
> > been trying to achieve ~40 Gbits/s of bandwidth (2 ports x 10 Gb Rx +
> > 2 ports x 10 Gb Tx), plus overhead, over a typical dual-port NIC whose
> > total bus bandwidth is a maximum of 32 Gbits/s (PCI express 2.1 x8).

PCIe is "32Gbps" full duplex, meaning in each direction. On a single
dual-port card you have 20Gbps inbound traffic (below 32Gbps) and 20Gbps
outbound traffic (below 32Gbps).

A 10Gbps port runs at 10,000,000,000bps (10^10 bps, *not* a power of two).
A 64 byte frame (incl. CRC) also carries a preamble and interframe gap, so
on the wire there are 7+1+64+12 = 84 bytes = 672 bits. The max packet rate
is thus 10^10 / 672 = 14,880,952 pps.

On the PCI Express side there will be 60 bytes (frame excluding CRC)
transferred in a single DMA transaction with additional overhead, plus
8b/10b encoding per packet: (60 + 8 + 16) = 84 bytes (fits into a 128 byte
typical max payload) or 840 "bits" after 8b/10b encoding. An 8-lane 5GT/s
link (GigaTransaction = 5*10^9 "transactions" per second, i.e. a "bit"
every 200 picoseconds) can be viewed as a 40GT/s link, so we can have
4*10^10 / 840 = 47,619,047 pps per direction (PCIe is full duplex). So two
fully loaded ports generate 29,761,904 pps in each direction, which can be
absorbed on a PCI Express Gen2 x8 link even taking into account the
overhead of the DMA machinery.

> >
> > --
> > Regards,
> > Robert
> >
> >
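The wire-rate and PCIe arithmetic above, restated compactly (these are the
same figures quoted in the message, with no new assumptions):

    \text{bits per 64-byte frame on the wire} = (7 + 1 + 64 + 12) \times 8 = 672,
    \qquad R_{\text{wire}} = \frac{10^{10}}{672} \approx 14{,}880{,}952\ \text{pps}

    \text{PCIe Gen2 x8:}\quad \frac{4 \times 10^{10}}{840} \approx 47{,}619{,}047\ \text{pps per direction}
    \;>\; 2 \times 14{,}880{,}952 = 29{,}761{,}904\ \text{pps}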