From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pavel Odintsov
To: Anuj Kalia
Cc: "dev@dpdk.org"
Date: Fri, 3 Jul 2015 11:35:45 +0300
References: <20150701125918.GA6960@bricha3-MOBL3>
Subject: Re: [dpdk-dev] Could not achieve wire speed for 40GE with any DPDK version on XL710 NIC's
List-Id: patches and discussions about DPDK

Hello, folks!

We have found the root of the issue: Intel does not offer wire speed for
64-byte packets on the XL710 at all. As mentioned in the product brief
http://www.intel.ru/content/dam/www/public/us/en/documents/product-briefs/xl710-10-40-gbe-controller-brief.pdf
we have:

  Small packet performance: Maintains wire-rate throughput on smaller
  payload sizes (>128 Bytes for 40 GbE and >64 Bytes for 10 GbE).

Could anybody recommend NICs that can truly achieve wire rate for 40GbE?
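For reference, here is a minimal sketch of the line-rate arithmetic. The
20 bytes of per-frame overhead (preamble, SFD and inter-frame gap) is my
own assumption of standard Ethernet framing, not a figure from the brief;
the 64-byte frame already includes the 4-byte FCS.

/* Rough sketch: theoretical packet rate of 40GbE at 64-byte frames. */
#include <stdio.h>

int main(void)
{
    const double link_bps   = 40e9; /* 40GbE line rate, bits/s        */
    const double frame_size = 64.0; /* minimum Ethernet frame, bytes  */
    const double overhead   = 20.0; /* preamble + SFD + IFG, bytes    */

    double pps = link_bps / ((frame_size + overhead) * 8.0);
    printf("40GbE @ %.0fB frames: %.2f Mpps\n", frame_size, pps / 1e6);
    /* prints ~59.52 Mpps, the target the PCIe estimates below are
     * compared against */
    return 0;
}

So wire rate for 64-byte frames at 40GbE is roughly 59.5 Mpps.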
On Wed, Jul 1, 2015 at 9:01 PM, Anuj Kalia wrote:
> Thanks for the comments.
>
> On Wed, Jul 1, 2015 at 1:32 PM, Vladimir Medvedkin wrote:
>> Hi Anuj,
>>
>> Thanks for the fixes!
>> I have two comments:
>> - from i40e_ethdev.h: #define I40E_DEFAULT_RX_WTHRESH 0
>> - (26 + 32) / 4 (batched descriptor writeback) should be
>>   (26 + 4 * 32) / 4 (batched descriptor writeback),
>>   thus we have 135 bytes/packet.
>>
>> This corresponds to 58.8 Mpps.
>>
>> Regards,
>> Vladimir
>>
>> 2015-07-01 17:22 GMT+03:00 Anuj Kalia :
>>>
>>> Vladimir,
>>>
>>> A few possible fixes to your PCIe analysis (let me know if I'm wrong):
>>> - ECRC is probably disabled (check using sudo lspci -vvv | grep
>>>   CGenEn-), so the TLP header is 26 bytes
>>> - Descriptor writeback can be batched using a high value of WTHRESH,
>>>   which is what DPDK uses by default
>>> - The read request contains a full TLP header (26 bytes)
>>>
>>> Assuming WTHRESH = 4, bytes transferred from NIC to host per packet =
>>> 26 + 64 (packet itself) +
>>> (26 + 32) / 4 (batched descriptor writeback) +
>>> (26 / 4) (read request for new descriptors) =
>>> 111 bytes / packet
>>>
>>> This corresponds to 70.9 Mpps over PCIe 3.0 x8. Assuming 5% DLLP
>>> overhead, the rate is 67.4 Mpps.
>>>
>>> --Anuj
>>>
>>> On Wed, Jul 1, 2015 at 9:40 AM, Vladimir Medvedkin wrote:
>>> > In the SYN-flood case you should take into account the return
>>> > SYN-ACK traffic, which generates PCIe DLLPs from NIC to host, so the
>>> > PCIe bandwidth is exhausted faster. And don't forget about the DLLPs
>>> > generated by RX traffic, which saturate the host-to-NIC bus.
>>> >
>>> > 2015-07-01 16:05 GMT+03:00 Pavel Odintsov :
>>> >
>>> >> Yes, Bruce, we understand this. But we are working on processing
>>> >> huge SYN attacks, and those packets are 64 bytes only :(
>>> >>
>>> >> On Wed, Jul 1, 2015 at 3:59 PM, Bruce Richardson wrote:
>>> >> > On Wed, Jul 01, 2015 at 03:44:57PM +0300, Pavel Odintsov wrote:
>>> >> >> Thanks for the answer, Vladimir! So we need to look for an x16
>>> >> >> NIC if we want to achieve 40GbE line rate...
>>> >> >>
>>> >> > Note that this would only apply to your minimal, i.e. 64-byte,
>>> >> > packet sizes. Once you go up to larger packets, e.g. 128B, your
>>> >> > PCIe bandwidth requirements are lower and you can more easily
>>> >> > achieve line rate.
>>> >> >
>>> >> > /Bruce
>>> >> >
>>> >> >> On Wed, Jul 1, 2015 at 3:06 PM, Vladimir Medvedkin
>>> >> >> <medvedkinv@gmail.com> wrote:
>>> >> >> > Hi Pavel,
>>> >> >> >
>>> >> >> > Looks like you ran into a PCIe bottleneck. So let's calculate
>>> >> >> > the XL710 RX-only case.
>>> >> >> > Assume we have 32-byte descriptors (if we want more offloads).
>>> >> >> > DMA makes one PCIe transaction with the packet payload, one
>>> >> >> > descriptor writeback, and one memory request for free
>>> >> >> > descriptors for every 4 packets. For a Transaction Layer
>>> >> >> > Packet (TLP) there is 30 bytes of overhead (4 PHY + 6 DLL +
>>> >> >> > 16 header + 4 ECRC). So for 1 RX packet DMA sends 30 +
>>> >> >> > 64 (packet itself) + 30 + 32 (writeback descriptor) +
>>> >> >> > (16 / 4) (read request for new descriptors). Note that we do
>>> >> >> > not take into account PCIe ACK/NACK/FC Update DLLPs. So we
>>> >> >> > have 160 bytes per packet. One PCIe 3.0 lane transmits 1 byte
>>> >> >> > in 1 ns, so x8 transmits 8 bytes in 1 ns, and 1 packet is
>>> >> >> > transmitted in 20 ns. Thus, in theory, PCIe 3.0 x8 may
>>> >> >> > transfer no more than 50 Mpps.
>>> >> >> > Correct me if I'm wrong.
>>> >> >> >
>>> >> >> > Regards,
>>> >> >> > Vladimir
>>> >>
>>> >> --
>>> >> Sincerely yours, Pavel Odintsov

--
Sincerely yours, Pavel Odintsov
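To make the PCIe accounting in this thread easier to check, here is a small
sketch that redoes the arithmetic from the quoted messages. The 30/26-byte
TLP overheads, the 32-byte descriptor, and the WTHRESH = 4 batching come
from the thread; the effective PCIe 3.0 x8 bandwidth (8 GT/s per lane with
128b/130b encoding) and the 5% DLLP allowance are my assumptions, so the
printed rates only approximate the 50 / 70.9 / 67.4 / 58.8 Mpps figures
quoted above.

/* Sketch of the per-packet PCIe byte accounting for the XL710 RX-only
 * case discussed in this thread.  Figures are approximate. */
#include <stdio.h>

/* Assumed effective PCIe 3.0 x8 bandwidth:
 * 8 GT/s per lane, 128b/130b encoding, 8 lanes. */
static const double pcie3_x8_bytes_per_sec = 8e9 * (128.0 / 130.0) / 8.0 * 8.0;

static double mpps(double bytes_per_pkt, double dllp_overhead)
{
    return pcie3_x8_bytes_per_sec * (1.0 - dllp_overhead)
           / bytes_per_pkt / 1e6;
}

int main(void)
{
    /* Vladimir's first estimate: 30B TLP overhead (ECRC on), per-packet
     * 32B descriptor writeback, one read request per 4 packets. */
    double first = 30 + 64 + (30 + 32) + 16.0 / 4;            /* 160 B/pkt */

    /* Anuj's fixes: ECRC off (26B TLP overhead), writeback batched with
     * WTHRESH = 4, read request carries a full 26B TLP header. */
    double batched = 26 + 64 + (26 + 32) / 4.0 + 26 / 4.0;    /* 111 B/pkt */

    /* Vladimir's correction: a batched writeback carries 4 descriptors,
     * i.e. (26 + 4 * 32) bytes per 4 packets. */
    double fixed = 26 + 64 + (26 + 4 * 32) / 4.0 + 26 / 4.0;  /* 135 B/pkt */

    printf("%.0f B/pkt -> %.1f Mpps\n", first, mpps(first, 0.0));
    printf("%.0f B/pkt -> %.1f Mpps (%.1f Mpps with 5%% DLLP overhead)\n",
           batched, mpps(batched, 0.0), mpps(batched, 0.05));
    printf("%.0f B/pkt -> %.1f Mpps\n", fixed, mpps(fixed, 0.0));
    return 0;
}

Even the corrected 135 bytes/packet estimate lands slightly below the
~59.5 Mpps that 64-byte wire rate at 40GbE requires, which is consistent
with the conclusion at the top of this mail.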