From: François-Frédéric Ozog <ff@ozog.com>
To: 'Michael Quicquaro'
Cc: dev@dpdk.org
Date: Fri, 24 Jan 2014 10:18:05 +0100
Subject: Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Michael Quicquaro
> Sent: Friday, January 24, 2014 00:23
> To: Robert Sanford
> Cc: dev@dpdk.org; mayhan@mayhan.org
> Subject: Re: [dpdk-dev] Rx-errors with testpmd (only 75% line rate)
>
> Thank you, everyone, for all of your suggestions, but unfortunately I'm
> still having the problem.
>
> I have reduced the test down to using 2 cores (one is the master core),
> both of which are on the socket to which the NIC's PCI slot is connected.
> I am running in rxonly mode, so I am basically just counting the packets.
> I've tried all different burst sizes. Nothing seems to make any
> difference.
>
> Since my original post, I have acquired an IXIA tester, so I have better
> control over my testing. I send 250,000,000 packets to the interface. I
> am getting roughly 25,000,000 Rx-errors with every run. I have verified
> that the number of Rx-errors is consistent with the value in the RXMPC
> register of the NIC.
>
> Just for sanity's sake, I tried switching the cores to the other socket
> and ran the same test. As expected, I got more packet loss: roughly
> 87,000,000.
>
> I am running Red Hat 6.4, which uses kernel 2.6.32-358.
>
> This is a NUMA-supported system, but whether or not I use --numa doesn't
> seem to make a difference.
>

Is the BIOS configured for NUMA?
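
Independently of the BIOS setting, it is also worth confirming from inside
the application that the rx lcore really sits on the NIC's socket. A rough
sketch of such a check (not from the testpmd sources; it assumes the
rte_eth_dev_socket_id() and rte_lcore_to_socket_id() helpers found in the
DPDK headers, called after rte_eal_init() and port setup) could look like
this:

/* Warn when the rx lcore and the port sit on different NUMA sockets. */
#include <stdio.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>

static void
check_rx_affinity(uint8_t port_id, unsigned rx_lcore)
{
        int port_socket = rte_eth_dev_socket_id(port_id); /* -1 if unknown */
        unsigned lcore_socket = rte_lcore_to_socket_id(rx_lcore);

        if (port_socket >= 0 && (unsigned)port_socket != lcore_socket)
                printf("warning: port %u on socket %d, rx lcore %u on socket %u\n",
                       (unsigned)port_id, port_socket, rx_lcore, lcore_socket);
}

If that warning never fires with your restricted core mask, the cross-socket
problem is at the platform level rather than in the application.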

If it is not, the BIOS may program System Address Decoding so that the
memory address space is interleaved between sockets on 64 MB boundaries
(you may have a look at the Xeon 7500 datasheet, volume 2 - a public
document - §4.4 for an "explanation" of this).

In general you do not want memory interleaving: QPI bandwidth tops out at
16 GB/s on the latest processors, while the aggregated memory bandwidth of
a single node can be over 60 GB/s.

> Looking at the Intel documentation, it appears that I should be able to
> easily do what I am trying to do. Actually, the documentation implies
> that I should be able to do roughly 40 Gbps with a single 2.x GHz
> processor core with other configuration (memory, OS, etc.) similar to my
> system. It appears to me that many of the details of these benchmarks
> are missing.
>
> Can someone on this list actually verify for me that what I am trying to
> do is possible and that they have done it with success?

I have done a NAT64 proof of concept that handled 40 Gbps throughput on a
single Xeon E5-2697 v2. The Intel NIC chip was an 82599ES (if I recall
correctly; I don't have the card handy anymore), with 4 rx queues and 4 tx
queues per port, 32768 descriptors per queue, Intel DCA on, and Ethernet
pause parameters off: 14.8 Mpps per port, no packet loss.

However, this was with a kernel-based proprietary packet framework. I
expect DPDK to achieve the same results.

>
> Much appreciation for all the help.
> - Michael
>
>
> On Wed, Jan 22, 2014 at 3:38 PM, Robert Sanford wrote:
>
> > Hi Michael,
> >
> > > What can I do to trace down this problem?
> >
> > May I suggest that you try to be more selective in the core masks on
> > the command line. The test app may choose some cores from "other" CPU
> > sockets. Only enable cores of the one socket to which the NIC is
> > attached.
> >
> > > It seems very similar to a
> > > thread on this list back in May titled "Best example for showing
> > > throughput?" where no resolution was ever mentioned in the thread.
> >
> > After re-reading *that* thread, it appears that their problem may have
> > been trying to achieve ~40 Gbits/s of bandwidth (2 ports x 10 Gb Rx +
> > 2 ports x 10 Gb Tx), plus overhead, over a typical dual-port NIC whose
> > total bus bandwidth is a maximum of 32 Gbits/s (PCI Express 2.1 x8).

PCIe x8 Gen2 is "32 Gbps" full duplex, meaning 32 Gbps in each direction.
On a single dual-port card you have 20 Gbps of inbound traffic (below
32 Gbps) and 20 Gbps of outbound traffic (below 32 Gbps).

A 10 Gbps port runs at 10,000,000,000 bps (10^10 bps, *not* a power of
two). A 64-byte frame (including CRC) is preceded by a preamble and
followed by an inter-frame gap, so on the wire there are
7 + 1 + 64 + 12 = 84 bytes = 672 bits per frame. The maximum packet rate
is thus 10^10 / 672 = 14,880,952 pps.

On the PCI Express side, 60 bytes (the frame excluding CRC) are
transferred in a single DMA transaction with additional overhead, plus
8b/10b encoding, per packet: 60 + 8 + 16 = 84 bytes (which fits into a
typical 128-byte max payload), or 840 "bits" once 8b/10b encoded.

An 8-lane 5 GT/s link (GigaTransfers = 5*10^9 "transfers" per second per
lane, i.e. a symbol bit every 200 picoseconds) can be viewed as a 40 GT/s
link, so we can have 4*10^10 / 840 = 47,619,047 pps per direction (PCIe is
full duplex).

So two fully loaded ports generate 29,761,904 pps in each direction, which
the PCI Express Gen2 x8 link can absorb even taking the DMA overhead into
account (the small program after the quoted text below reproduces this
arithmetic).

> >
> > --
> > Regards,
> > Robert
> >
> >
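
For completeness, here is a small stand-alone program that reproduces the
arithmetic above. The 84-byte wire cost per frame is standard Ethernet
framing; the 60 + 8 + 16 byte PCIe cost per packet is the assumption made
above rather than a measured value, so treat the PCIe figure as a rough
upper bound.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
        /* 10GbE wire cost of a 64-byte frame:
         * preamble 7 + SFD 1 + frame 64 + IFG 12 = 84 bytes */
        const uint64_t wire_bits = (7 + 1 + 64 + 12) * 8;       /* 672 bits   */
        const uint64_t line_pps  = 10000000000ULL / wire_bits;  /* 14,880,952 */

        /* PCIe Gen2 x8: 8 lanes * 5 GT/s = 40e9 symbol bits/s per direction.
         * Per packet: 60-byte payload + ~24 bytes of assumed DMA/header
         * overhead, 8b/10b encoded -> 84 * 10 = 840 symbol bits. */
        const uint64_t pcie_bits = 84 * 10;                      /* 840        */
        const uint64_t pcie_pps  = 40000000000ULL / pcie_bits;   /* 47,619,047 */

        printf("10GbE line rate, 64B frames : %llu pps per port\n",
               (unsigned long long)line_pps);
        printf("PCIe Gen2 x8 packet budget  : %llu pps per direction\n",
               (unsigned long long)pcie_pps);
        printf("Two ports at line rate      : %llu pps (< PCIe budget)\n",
               (unsigned long long)(2 * line_pps));
        return 0;
}

Compiled and run, it prints 14,880,952 pps per port on the wire against a
budget of roughly 47.6 Mpps per direction on the PCIe link, so two ports at
line rate (29,761,904 pps) fit with headroom.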