* [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Michael Quicquaro @ 2014-01-09 19:28 UTC
To: dev; +Cc: mayhan

Hello,
My hardware is a Dell PowerEdge R820:
4x Intel Xeon E5-4620 2.20GHz 8 core
16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
Intel X520 DP 10Gb DA/SFP+

So in summary 32 cores @ 2.20GHz and 256GB RAM

... plenty of horsepower.

I've reserved 16 1GB Hugepages

I am configuring only one interface and using testpmd in rx_only mode to
first see if I can receive at line rate.

I am generating traffic on a different system which is running the netmap
pkt-gen program - generating 64 byte packets at close to line rate.

I am only able to receive approx. 75% of line rate and I see the Rx-errors
in the port stats going up proportionally.
I have verified that all receive queues are being used, but strangely
enough, it doesn't matter how many queues more than 2 that I use, the
throughput is the same. I have verified with 'mpstat -P ALL' that all
specified cores are used. The utilization of each core is only roughly 25%.

Here is my command line:
testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
--nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512 --rxq=8
--txq=8 --interactive

What can I do to trace down this problem? It seems very similar to a
thread on this list back in May titled "Best example for showing
throughput?" where no resolution was ever mentioned in the thread.

Thanks for any help.
- Michael
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: François-Frédéric Ozog @ 2014-01-09 21:21 UTC
To: 'Michael Quicquaro', dev; +Cc: mayhan

Hi,

Can you check that the threads you use for handling the queues are on the
same socket as the card?

cat /sys/class/net/<interface name>/device/numa_node

will give you the node.

François-Frédéric

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On behalf of Michael Quicquaro
> Sent: Thursday, January 9, 2014 20:28
> To: dev@dpdk.org
> Cc: mayhan@mayhan.org
> Subject: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)
>
> Hello,
> My hardware is a Dell PowerEdge R820:
> 4x Intel Xeon E5-4620 2.20GHz 8 core
> 16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
> Intel X520 DP 10Gb DA/SFP+
>
> So in summary 32 cores @ 2.20GHz and 256GB RAM
>
> ... plenty of horsepower.
>
> I've reserved 16 1GB Hugepages
>
> I am configuring only one interface and using testpmd in rx_only mode to
> first see if I can receive at line rate.
>
> I am generating traffic on a different system which is running the netmap
> pkt-gen program - generating 64 byte packets at close to line rate.
>
> I am only able to receive approx. 75% of line rate and I see the Rx-errors
> in the port stats going up proportionally.
> I have verified that all receive queues are being used, but strangely
> enough, it doesn't matter how many queues more than 2 that I use, the
> throughput is the same. I have verified with 'mpstat -P ALL' that all
> specified cores are used. The utilization of each core is only roughly
> 25%.
>
> Here is my command line:
> testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
> --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512 --rxq=8
> --txq=8 --interactive
>
> What can I do to trace down this problem? It seems very similar to a
> thread on this list back in May titled "Best example for showing
> throughput?" where no resolution was ever mentioned in the thread.
>
> Thanks for any help.
> - Michael
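A minimal illustrative sketch (not from any message in this thread) of
doing the same affinity check from inside a DPDK application rather than
from sysfs, using the standard rte_eth_dev_socket_id() and
rte_lcore_to_socket_id() helpers; the port id and the warning format are
assumptions:

    /* Warn when a polling lcore sits on a different NUMA socket than the
     * NIC port it is meant to serve. */
    #include <stdio.h>
    #include <rte_ethdev.h>
    #include <rte_lcore.h>

    static void check_numa_affinity(uint8_t port_id)
    {
        int nic_socket = rte_eth_dev_socket_id(port_id); /* -1 if unknown */
        unsigned lcore;

        RTE_LCORE_FOREACH(lcore) {
            unsigned lcore_socket = rte_lcore_to_socket_id(lcore);
            if (nic_socket >= 0 && lcore_socket != (unsigned)nic_socket)
                printf("lcore %u is on socket %u, but port %u is on socket %d\n",
                       lcore, lcore_socket, port_id, nic_socket);
        }
    }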
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Dmitry Vyal @ 2014-01-22 14:52 UTC
To: Michael Quicquaro, dev; +Cc: mayhan

Hello Michael,

I suggest you check average burst sizes on receive queues. It looks like I
stumbled upon a similar issue several times. If you are calling
rte_eth_rx_burst too frequently, the NIC begins losing packets no matter
how much CPU horsepower you have (the more you have, the more it loses,
actually). In my case this situation occurred when the average burst size
was less than 20 packets or so. I'm not sure what the reason for this
behavior is, but I observed it in several applications on Intel 82599 10Gb
cards.

Regards, Dmitry


On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
> Hello,
> My hardware is a Dell PowerEdge R820:
> 4x Intel Xeon E5-4620 2.20GHz 8 core
> 16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
> Intel X520 DP 10Gb DA/SFP+
>
> So in summary 32 cores @ 2.20GHz and 256GB RAM
>
> ... plenty of horsepower.
>
> I've reserved 16 1GB Hugepages
>
> I am configuring only one interface and using testpmd in rx_only mode to
> first see if I can receive at line rate.
>
> I am generating traffic on a different system which is running the netmap
> pkt-gen program - generating 64 byte packets at close to line rate.
>
> I am only able to receive approx. 75% of line rate and I see the Rx-errors
> in the port stats going up proportionally.
> I have verified that all receive queues are being used, but strangely
> enough, it doesn't matter how many queues more than 2 that I use, the
> throughput is the same. I have verified with 'mpstat -P ALL' that all
> specified cores are used. The utilization of each core is only roughly 25%.
>
> Here is my command line:
> testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
> --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512 --rxq=8
> --txq=8 --interactive
>
> What can I do to trace down this problem? It seems very similar to a
> thread on this list back in May titled "Best example for showing
> throughput?" where no resolution was ever mentioned in the thread.
>
> Thanks for any help.
> - Michael
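A rough sketch (not from the thread) of how the average burst size Dmitry
suggests watching could be measured in an rx-only polling loop; the burst
size of 512 matches the testpmd command line above, while the counters and
the reporting interval are illustrative:

    /* Accumulate packet counts and non-empty polls so the average rx burst
     * size can be printed while the test runs. */
    #include <stdio.h>
    #include <stdint.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST_SIZE 512

    static uint64_t total_pkts;
    static uint64_t nonempty_polls;

    static void poll_rx_queue(uint8_t port, uint16_t queue)
    {
        struct rte_mbuf *pkts[BURST_SIZE];
        uint16_t i, n = rte_eth_rx_burst(port, queue, pkts, BURST_SIZE);

        if (n == 0)
            return;
        nonempty_polls++;
        total_pkts += n;
        for (i = 0; i < n; i++)
            rte_pktmbuf_free(pkts[i]); /* rx-only: just drop the packets */
        if ((nonempty_polls & 0xfffff) == 0)
            printf("average burst size: %.1f packets\n",
                   (double)total_pkts / nonempty_polls);
    }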
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Wang, Shawn @ 2014-01-22 17:46 UTC
To: Dmitry Vyal, Michael Quicquaro, dev; +Cc: mayhan

Does your NIC connect directly to your socket? If not, the packet might go
through QPI, which will cause additional latency. Check your motherboard.

Wang, Shawn

On 1/22/14, 6:52 AM, "Dmitry Vyal" <dmitryvyal@gmail.com> wrote:

>Hello Michael,
>
>I suggest you check average burst sizes on receive queues. It looks like
>I stumbled upon a similar issue several times. If you are calling
>rte_eth_rx_burst too frequently, the NIC begins losing packets no matter
>how much CPU horsepower you have (the more you have, the more it loses,
>actually). In my case this situation occurred when the average burst size
>was less than 20 packets or so. I'm not sure what the reason for this
>behavior is, but I observed it in several applications on Intel 82599
>10Gb cards.
>
>Regards, Dmitry
>
>
>On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
>> Hello,
>> My hardware is a Dell PowerEdge R820:
>> 4x Intel Xeon E5-4620 2.20GHz 8 core
>> 16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
>> Intel X520 DP 10Gb DA/SFP+
>>
>> So in summary 32 cores @ 2.20GHz and 256GB RAM
>>
>> ... plenty of horsepower.
>>
>> I've reserved 16 1GB Hugepages
>>
>> I am configuring only one interface and using testpmd in rx_only mode to
>> first see if I can receive at line rate.
>>
>> I am generating traffic on a different system which is running the
>>netmap
>> pkt-gen program - generating 64 byte packets at close to line rate.
>>
>> I am only able to receive approx. 75% of line rate and I see the
>>Rx-errors
>> in the port stats going up proportionally.
>> I have verified that all receive queues are being used, but strangely
>> enough, it doesn't matter how many queues more than 2 that I use, the
>> throughput is the same. I have verified with 'mpstat -P ALL' that all
>> specified cores are used. The utilization of each core is only roughly
>>25%.
>>
>> Here is my command line:
>> testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
>> --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512 --rxq=8
>> --txq=8 --interactive
>>
>> What can I do to trace down this problem? It seems very similar to a
>> thread on this list back in May titled "Best example for showing
>> throughput?" where no resolution was ever mentioned in the thread.
>>
>> Thanks for any help.
>> - Michael
>
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Michael Quicquaro @ 2014-01-27 20:00 UTC
To: Dmitry Vyal; +Cc: dev

Dmitry,
I cannot thank you enough for this information. This too was my main
problem. I put a "small" unmeasured delay before the call to
rte_eth_rx_burst() and suddenly it starts returning bursts of 512 packets
vs. 4!!

Best Regards,
Mike


On Wed, Jan 22, 2014 at 9:52 AM, Dmitry Vyal <dmitryvyal@gmail.com> wrote:

> Hello Michael,
>
> I suggest you check average burst sizes on receive queues. It looks like I
> stumbled upon a similar issue several times. If you are calling
> rte_eth_rx_burst too frequently, the NIC begins losing packets no matter
> how much CPU horsepower you have (the more you have, the more it loses,
> actually). In my case this situation occurred when the average burst size
> was less than 20 packets or so. I'm not sure what the reason for this
> behavior is, but I observed it in several applications on Intel 82599 10Gb
> cards.
>
> Regards, Dmitry
>
>
> On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
>
>> Hello,
>> My hardware is a Dell PowerEdge R820:
>> 4x Intel Xeon E5-4620 2.20GHz 8 core
>> 16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
>> Intel X520 DP 10Gb DA/SFP+
>>
>> So in summary 32 cores @ 2.20GHz and 256GB RAM
>>
>> ... plenty of horsepower.
>>
>> I've reserved 16 1GB Hugepages
>>
>> I am configuring only one interface and using testpmd in rx_only mode to
>> first see if I can receive at line rate.
>>
>> I am generating traffic on a different system which is running the netmap
>> pkt-gen program - generating 64 byte packets at close to line rate.
>>
>> I am only able to receive approx. 75% of line rate and I see the Rx-errors
>> in the port stats going up proportionally.
>> I have verified that all receive queues are being used, but strangely
>> enough, it doesn't matter how many queues more than 2 that I use, the
>> throughput is the same. I have verified with 'mpstat -P ALL' that all
>> specified cores are used. The utilization of each core is only roughly
>> 25%.
>>
>> Here is my command line:
>> testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
>> --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512 --rxq=8
>> --txq=8 --interactive
>>
>> What can I do to trace down this problem? It seems very similar to a
>> thread on this list back in May titled "Best example for showing
>> throughput?" where no resolution was ever mentioned in the thread.
>>
>> Thanks for any help.
>> - Michael
>>
>
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Dmitry Vyal @ 2014-01-28 8:31 UTC
To: Michael Quicquaro; +Cc: dev

On 01/28/2014 12:00 AM, Michael Quicquaro wrote:
> Dmitry,
> I cannot thank you enough for this information. This too was my main
> problem. I put a "small" unmeasured delay before the call to
> rte_eth_rx_burst() and suddenly it starts returning bursts of 512
> packets vs. 4!!
> Best Regards,
> Mike
>

Thanks for confirming my guesses! By the way, make sure the number of
packets you receive in a single burst is less than the configured queue
size, or you will lose packets too. Maybe your "small" delay is not so
small :) For my own purposes I use a delay of about 150 usecs.

P.S. I wonder why this issue is not mentioned in the documentation. Is it
evident to everyone doing network programming?

>
> On Wed, Jan 22, 2014 at 9:52 AM, Dmitry Vyal <dmitryvyal@gmail.com
> <mailto:dmitryvyal@gmail.com>> wrote:
>
>     Hello Michael,
>
>     I suggest you check average burst sizes on receive queues. It looks
>     like I stumbled upon a similar issue several times. If you are
>     calling rte_eth_rx_burst too frequently, the NIC begins losing
>     packets no matter how much CPU horsepower you have (the more you
>     have, the more it loses, actually). In my case this situation
>     occurred when the average burst size was less than 20 packets or so.
>     I'm not sure what the reason for this behavior is, but I observed it
>     in several applications on Intel 82599 10Gb cards.
>
>     Regards, Dmitry
>
>
>     On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
>
>         Hello,
>         My hardware is a Dell PowerEdge R820:
>         4x Intel Xeon E5-4620 2.20GHz 8 core
>         16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
>         Intel X520 DP 10Gb DA/SFP+
>
>         So in summary 32 cores @ 2.20GHz and 256GB RAM
>
>         ... plenty of horsepower.
>
>         I've reserved 16 1GB Hugepages
>
>         I am configuring only one interface and using testpmd in rx_only
>         mode to first see if I can receive at line rate.
>
>         I am generating traffic on a different system which is running
>         the netmap pkt-gen program - generating 64 byte packets at close
>         to line rate.
>
>         I am only able to receive approx. 75% of line rate and I see the
>         Rx-errors in the port stats going up proportionally.
>         I have verified that all receive queues are being used, but
>         strangely enough, it doesn't matter how many queues more than 2
>         that I use, the throughput is the same. I have verified with
>         'mpstat -P ALL' that all specified cores are used. The
>         utilization of each core is only roughly 25%.
>
>         Here is my command line:
>         testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
>         --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512
>         --rxq=8 --txq=8 --interactive
>
>         What can I do to trace down this problem? It seems very similar
>         to a thread on this list back in May titled "Best example for
>         showing throughput?" where no resolution was ever mentioned in
>         the thread.
>
>         Thanks for any help.
>         - Michael
>
>
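A minimal sketch of the workaround being discussed: pause briefly before
each poll so the hardware can accumulate a larger burst, and keep the burst
bound below the configured ring size. The 150 us figure is Dmitry's, and
the 512-packet burst and 2048-descriptor ring match Michael's testpmd
command line; the loop itself is illustrative, not code from the thread.

    #include <rte_ethdev.h>
    #include <rte_cycles.h>
    #include <rte_mbuf.h>

    #define BURST_MAX     512   /* keep well below the 2048-descriptor ring */
    #define POLL_DELAY_US 150   /* let descriptors accumulate between polls */

    static void rx_loop(uint8_t port, uint16_t queue)
    {
        struct rte_mbuf *pkts[BURST_MAX];
        uint16_t i, n;

        for (;;) {
            rte_delay_us(POLL_DELAY_US);                 /* busy-wait */
            n = rte_eth_rx_burst(port, queue, pkts, BURST_MAX);
            for (i = 0; i < n; i++)
                rte_pktmbuf_free(pkts[i]);               /* rx-only: drop */
        }
    }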
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Jun Han @ 2014-02-10 17:34 UTC
To: Dmitry Vyal; +Cc: dev

Hi Michael,

We are also trying to purchase an IXIA traffic generator. Could you let us
know which chassis + load modules you are using so we can use that as a
reference to look for the model we need? There seems to be quite a number
of different models.

Thank you.


On Tue, Jan 28, 2014 at 9:31 AM, Dmitry Vyal <dmitryvyal@gmail.com> wrote:

> On 01/28/2014 12:00 AM, Michael Quicquaro wrote:
>
>> Dmitry,
>> I cannot thank you enough for this information. This too was my main
>> problem. I put a "small" unmeasured delay before the call to
>> rte_eth_rx_burst() and suddenly it starts returning bursts of 512 packets
>> vs. 4!!
>> Best Regards,
>> Mike
>>
>
> Thanks for confirming my guesses! By the way, make sure the number of
> packets you receive in a single burst is less than the configured queue
> size, or you will lose packets too. Maybe your "small" delay is not so
> small :) For my own purposes I use a delay of about 150 usecs.
>
> P.S. I wonder why this issue is not mentioned in the documentation. Is it
> evident to everyone doing network programming?
>
>
>> On Wed, Jan 22, 2014 at 9:52 AM, Dmitry Vyal <dmitryvyal@gmail.com
>> <mailto:dmitryvyal@gmail.com>> wrote:
>>
>>     Hello Michael,
>>
>>     I suggest you check average burst sizes on receive queues. It looks
>>     like I stumbled upon a similar issue several times. If you are
>>     calling rte_eth_rx_burst too frequently, the NIC begins losing
>>     packets no matter how much CPU horsepower you have (the more you
>>     have, the more it loses, actually). In my case this situation
>>     occurred when the average burst size was less than 20 packets or so.
>>     I'm not sure what the reason for this behavior is, but I observed it
>>     in several applications on Intel 82599 10Gb cards.
>>
>>     Regards, Dmitry
>>
>>
>>     On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
>>
>>         Hello,
>>         My hardware is a Dell PowerEdge R820:
>>         4x Intel Xeon E5-4620 2.20GHz 8 core
>>         16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
>>         Intel X520 DP 10Gb DA/SFP+
>>
>>         So in summary 32 cores @ 2.20GHz and 256GB RAM
>>
>>         ... plenty of horsepower.
>>
>>         I've reserved 16 1GB Hugepages
>>
>>         I am configuring only one interface and using testpmd in rx_only
>>         mode to first see if I can receive at line rate.
>>
>>         I am generating traffic on a different system which is running
>>         the netmap pkt-gen program - generating 64 byte packets at close
>>         to line rate.
>>
>>         I am only able to receive approx. 75% of line rate and I see the
>>         Rx-errors in the port stats going up proportionally.
>>         I have verified that all receive queues are being used, but
>>         strangely enough, it doesn't matter how many queues more than 2
>>         that I use, the throughput is the same. I have verified with
>>         'mpstat -P ALL' that all specified cores are used. The
>>         utilization of each core is only roughly 25%.
>>
>>         Here is my command line:
>>         testpmd -c 0xffffffff -n 4 -- --nb-ports=1 --coremask=0xfffffffe
>>         --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512
>>         --rxq=8 --txq=8 --interactive
>>
>>         What can I do to trace down this problem? It seems very similar
>>         to a thread on this list back in May titled "Best example for
>>         showing throughput?" where no resolution was ever mentioned in
>>         the thread.
>>
>>         Thanks for any help.
>>         - Michael
>>
>>
>
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Robert Sanford @ 2014-01-22 20:38 UTC
To: Michael Quicquaro; +Cc: dev, mayhan

Hi Michael,

> What can I do to trace down this problem?

May I suggest that you try to be more selective in the core masks on the
command line. The test app may choose some cores from "other" CPU sockets.
Only enable cores of the one socket to which the NIC is attached.


> It seems very similar to a
> thread on this list back in May titled "Best example for showing
> throughput?" where no resolution was ever mentioned in the thread.

After re-reading *that* thread, it appears that their problem may have
been trying to achieve ~40 Gbits/s of bandwidth (2 ports x 10 Gb Rx + 2
ports x 10 Gb Tx), plus overhead, over a typical dual-port NIC whose total
bus bandwidth is a maximum of 32 Gbits/s (PCI express 2.1 x8).

--
Regards,
Robert
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: Michael Quicquaro @ 2014-01-23 23:22 UTC
To: Robert Sanford; +Cc: dev, mayhan

Thank you, everyone, for all of your suggestions, but unfortunately I'm
still having the problem.

I have reduced the test down to using 2 cores (one is the master core),
both of which are on the socket to which the NIC's PCI slot is connected.
I am running in rxonly mode, so I am basically just counting the packets.
I've tried all different burst sizes. Nothing seems to make any
difference.

Since my original post, I have acquired an IXIA tester so I have better
control over my testing. I send 250,000,000 packets to the interface. I
am getting roughly 25,000,000 Rx-errors with every run. I have verified
that the number of Rx-errors is consistent with the value in the RXMPC
register of the NIC.

Just for sanity's sake, I tried switching the cores to the other socket
and ran the same test. As expected I got more packet loss - roughly
87,000,000.

I am running Red Hat 6.4, which uses kernel 2.6.32-358.

This is a NUMA-supported system, but whether or not I use --numa doesn't
seem to make a difference.

Looking at the Intel documentation it appears that I should be able to
easily do what I am trying to do. Actually, the documentation implies that
I should be able to do roughly 40 Gbps with a single 2.x GHz processor
core with other configuration (memory, os, etc.) similar to my system. It
appears to me that many of the details of these benchmarks are missing.

Can someone on this list actually verify for me that what I am trying to
do is possible and that they have done it with success?

Much appreciation for all the help.
- Michael


On Wed, Jan 22, 2014 at 3:38 PM, Robert Sanford <rsanford@prolexic.com> wrote:

> Hi Michael,
>
> > What can I do to trace down this problem?
>
> May I suggest that you try to be more selective in the core masks on the
> command line. The test app may choose some cores from "other" CPU sockets.
> Only enable cores of the one socket to which the NIC is attached.
>
>
> > It seems very similar to a
> > thread on this list back in May titled "Best example for showing
> > throughput?" where no resolution was ever mentioned in the thread.
>
> After re-reading *that* thread, it appears that their problem may have
> been trying to achieve ~40 Gbits/s of bandwidth (2 ports x 10 Gb Rx + 2
> ports x 10 Gb Tx), plus overhead, over a typical dual-port NIC whose total
> bus bandwidth is a maximum of 32 Gbits/s (PCI express 2.1 x8).
>
> --
> Regards,
> Robert
>
* Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

From: François-Frédéric Ozog @ 2014-01-24 9:18 UTC
To: 'Michael Quicquaro'; +Cc: dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On behalf of Michael Quicquaro
> Sent: Friday, January 24, 2014 00:23
> To: Robert Sanford
> Cc: dev@dpdk.org; mayhan@mayhan.org
> Subject: Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)
>
> Thank you, everyone, for all of your suggestions, but unfortunately I'm
> still having the problem.
>
> I have reduced the test down to using 2 cores (one is the master core),
> both of which are on the socket to which the NIC's PCI slot is connected.
> I am running in rxonly mode, so I am basically just counting the packets.
> I've tried all different burst sizes. Nothing seems to make any
> difference.
>
> Since my original post, I have acquired an IXIA tester so I have better
> control over my testing. I send 250,000,000 packets to the interface. I
> am getting roughly 25,000,000 Rx-errors with every run. I have verified
> that the number of Rx-errors is consistent with the value in the RXMPC
> register of the NIC.
>
> Just for sanity's sake, I tried switching the cores to the other socket
> and ran the same test. As expected I got more packet loss - roughly
> 87,000,000.
>
> I am running Red Hat 6.4, which uses kernel 2.6.32-358.
>
> This is a NUMA-supported system, but whether or not I use --numa doesn't
> seem to make a difference.
>

Is the BIOS configured for NUMA? If not, the BIOS may program System
Address Decoding so that the memory address space is interleaved between
sockets on 64MB boundaries (you may have a look at the Xeon 7500 datasheet
volume 2 - a public document - §4.4 for an "explanation" of this). In
general you don't want memory interleaving: QPI bandwidth tops out at
16GB/s on the latest processors while single-node aggregated memory
bandwidth can be over 60GB/s.

> Looking at the Intel documentation it appears that I should be able to
> easily do what I am trying to do. Actually, the documentation implies
> that I should be able to do roughly 40 Gbps with a single 2.x GHz
> processor core with other configuration (memory, os, etc.) similar to my
> system. It appears to me that many of the details of these benchmarks
> are missing.
>
> Can someone on this list actually verify for me that what I am trying to
> do is possible and that they have done it with success?

I have done a NAT64 proof of concept that handled 40Gbps throughput on a
single Xeon E5 2697v2. The Intel NIC chip was 82599ES (if I recall
correctly, I don't have the card handy anymore), 4 rx queues and 4 tx
queues per port, 32768 descriptors per queue, Intel DCA on, Ethernet pause
parameters OFF: 14.8Mpps per port, no packet loss. However, this was with
a kernel-based proprietary packet framework. I expect DPDK to achieve the
same results.

>
> Much appreciation for all the help.
> - Michael
>
>
> On Wed, Jan 22, 2014 at 3:38 PM, Robert Sanford
> <rsanford@prolexic.com> wrote:
>
> > Hi Michael,
> >
> > > What can I do to trace down this problem?
> >
> > May I suggest that you try to be more selective in the core masks on
> > the command line. The test app may choose some cores from "other" CPU
> > sockets.
> > Only enable cores of the one socket to which the NIC is attached.
> >
> >
> > > It seems very similar to a
> > > thread on this list back in May titled "Best example for showing
> > > throughput?" where no resolution was ever mentioned in the thread.
> >
> > After re-reading *that* thread, it appears that their problem may have
> > been trying to achieve ~40 Gbits/s of bandwidth (2 ports x 10 Gb Rx +
> > 2 ports x 10 Gb Tx), plus overhead, over a typical dual-port NIC whose
> > total bus bandwidth is a maximum of 32 Gbits/s (PCI express 2.1 x8).

PCIe is "32Gbps" full duplex, meaning in each direction. On a single
dual-port card you have 20Gbps inbound traffic (below 32Gbps) and 20Gbps
outbound traffic (below 32Gbps).

A 10Gbps port runs at 10,000,000,000bps (10^10 bps, *not* a power of two).
A 64 byte frame (incl. CRC) also carries a preamble and interframe gap, so
on the wire there are 7+1+64+12 = 84 bytes = 672 bits. The max packet rate
is thus 10^10 / 672 = 14,880,952 pps.

On the PCI Express side there will be 60 bytes (frame excluding CRC)
transferred in a single DMA transaction with additional overhead, plus
8b/10b encoding per packet: (60 + 8 + 16) = 84 bytes (fits into a 128 byte
typical max payload) or 840 "bits" after 8b/10b encoding. An 8-lane 5GT/s
link (GigaTransaction = 5*10^9 "transactions" per second, i.e. a "bit"
every 200 picoseconds) can be viewed as a 40GT/s link, so we can have
4*10^10 / 840 = 47,619,047 pps per direction (PCIe is full duplex). So two
fully loaded ports generate 29,761,904 pps in each direction, which can be
absorbed on a PCI Express Gen2 x8 link even taking into account the
overhead of the DMA machinery.

> >
> > --
> > Regards,
> > Robert
> >
> >
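The wire-rate and PCIe arithmetic above, restated compactly (these are the
same figures quoted in the message, with no new assumptions):

    \text{bits per 64-byte frame on the wire} = (7 + 1 + 64 + 12) \times 8 = 672,
    \qquad R_{\text{wire}} = \frac{10^{10}}{672} \approx 14{,}880{,}952\ \text{pps}

    \text{PCIe Gen2 x8:}\quad \frac{4 \times 10^{10}}{840} \approx 47{,}619{,}047\ \text{pps per direction}
    \;>\; 2 \times 14{,}880{,}952 = 29{,}761{,}904\ \text{pps}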