From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20150701125918.GA6960@bricha3-MOBL3>
Date: Wed, 1 Jul 2015 14:01:27 -0400
From: Anuj Kalia
To: Vladimir Medvedkin
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] Could not achieve wire speed for 40GE with any DPDK version on XL710 NIC's
List-Id: patches and discussions about DPDK

Thanks for the comments.

On Wed, Jul 1, 2015 at 1:32 PM, Vladimir Medvedkin wrote:
> Hi Anuj,
>
> Thanks for fixes!
> I have 2 comments
> - from i40e_ethdev.h : #define I40E_DEFAULT_RX_WTHRESH 0
> - (26 + 32) / 4 (batched descriptor writeback) should be
>   (26 + 4 * 32) / 4 (batched descriptor writeback),
>   thus we have 135 bytes/packet
>
> This corresponds to 58.8 Mpps
>
> Regards,
> Vladimir
>
> 2015-07-01 17:22 GMT+03:00 Anuj Kalia :
>>
>> Vladimir,
>>
>> Few possible fixes to your PCIe analysis (let me know if I'm wrong):
>> - ECRC is probably disabled (check using sudo lspci -vvv | grep
>>   CGenEn-), so TLP header is 26 bytes
>> - Descriptor writeback can be batched using high value of WTHRESH,
>>   which is what DPDK uses by default
>> - Read request contains full TLP header (26 bytes)
>>
>> Assuming WTHRESH = 4, bytes transferred from NIC to host per packet =
>> 26 + 64 (packet itself) +
>> (26 + 32) / 4 (batched descriptor writeback) +
>> (26 / 4) (read request for new descriptors) =
>> 111 bytes / packet
>>
>> This corresponds to 70.9 Mpps over PCIe 3.0 x8. Assuming 5% DLLP
>> overhead, rate = 67.4 Mpps
>>
>> --Anuj
>>
>> On Wed, Jul 1, 2015 at 9:40 AM, Vladimir Medvedkin wrote:
>> > In case with syn flood you should take into account return syn-ack
>> > traffic, which generates PCIe DLLP's from NIC to host, thus pcie
>> > bandwidth exceeds faster. And don't forget about DLLP's generated by
>> > rx traffic, which saturates host-to-NIC bus.
>> >
>> > 2015-07-01 16:05 GMT+03:00 Pavel Odintsov :
>> >
>> >> Yes, Bruce, we understand this. But we are working with huge SYN
>> >> attacks processing and they are 64byte only :(
>> >>
>> >> On Wed, Jul 1, 2015 at 3:59 PM, Bruce Richardson wrote:
>> >> > On Wed, Jul 01, 2015 at 03:44:57PM +0300, Pavel Odintsov wrote:
>> >> >> Thanks for answer, Vladimir! So we need look for x16 NIC if we want
>> >> >> achieve 40GE line rate...
>> >> >>
>> >> > Note that this would only apply for your minimal i.e. 64-byte,
>> >> > packet sizes.
>> >> > Once you go up to larger e.g. 128B packets, your PCI bandwidth
>> >> > requirements are lower and you can easier achieve line rate.
>> >> >
>> >> > /Bruce
>> >> >
>> >> >> On Wed, Jul 1, 2015 at 3:06 PM, Vladimir Medvedkin
>> >> >> <medvedkinv@gmail.com> wrote:
>> >> >> > Hi Pavel,
>> >> >> >
>> >> >> > Looks like you ran into pcie bottleneck. So let's calculate
>> >> >> > xl710 rx only case.
>> >> >> > Assume we have 32byte descriptors (if we want more offload).
>> >> >> > DMA makes one pcie transaction with packet payload, one
>> >> >> > descriptor writeback and one memory request for free
>> >> >> > descriptors for every 4 packets. For Transaction Layer Packet
>> >> >> > (TLP) there is 30 bytes overhead (4 PHY + 6 DLL + 16 header +
>> >> >> > 4 ECRC). So for 1 rx packet dma sends 30 + 64 (packet itself) +
>> >> >> > 30 + 32 (writeback descriptor) + (16 / 4) (read request for new
>> >> >> > descriptors). Note that we do not take into account PCIe
>> >> >> > ACK/NACK/FC Update DLLP. So we have 160 bytes per packet. One
>> >> >> > lane PCIe 3.0 transmits 1 byte in 1 ns, so x8 transmits 8 bytes
>> >> >> > in 1 ns. 1 packet transmits in 20 ns. Thus in theory pcie 3.0
>> >> >> > x8 may transfer not more than 50mpps.
>> >> >> > Correct me if I'm wrong.
>> >> >> >
>> >> >> > Regards,
>> >> >> > Vladimir
>> >> >> >
>> >>
>> >> --
>> >> Sincerely yours, Pavel Odintsov
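
For anyone who wants to replay the arithmetic in this thread, here is a minimal Python sketch of the three per-packet estimates (160, 111 and 135 bytes). The names (`mpps`, `LINK_RAW`, `LINK_ENCODED` and so on) are made up for illustration. Two things are assumptions rather than statements from the thread: that the usable PCIe 3.0 rate is 8 GT/s per lane with 128b/130b encoding, and that a 64-byte Ethernet frame occupies 84 bytes on the wire (preamble plus inter-frame gap). The first estimate uses the simpler "1 byte per ns per lane" figure, exactly as written in the original message.

```python
# Sketch only: reproduces the per-packet PCIe byte counts discussed above.

LANES = 8
LINK_RAW = 8e9 * LANES / 8           # bytes/s, "1 byte per ns per lane"
LINK_ENCODED = LINK_RAW * 128 / 130  # bytes/s, assuming 128b/130b encoding

def mpps(bytes_per_packet, link_bytes_per_sec):
    """Packets per second (in millions) that fit in the given link rate."""
    return link_bytes_per_sec / bytes_per_packet / 1e6

# First estimate: 30-byte TLP overhead (with ECRC), one 32-byte descriptor
# writeback per packet, 16-byte read request amortized over 4 packets.
first_estimate = 30 + 64 + 30 + 32 + 16 / 4

# Second estimate: 26-byte TLP overhead (no ECRC), one 32-byte writeback
# amortized over 4 packets, full 26-byte read request per 4 packets.
batched_one_desc = 26 + 64 + (26 + 32) / 4 + 26 / 4

# Third estimate: as above, but the batched writeback carries 4 descriptors.
batched_four_desc = 26 + 64 + (26 + 4 * 32) / 4 + 26 / 4

# Assumption: 64-byte frame + 20 bytes preamble/inter-frame gap on the wire.
line_rate_40ge = 40e9 / ((64 + 20) * 8) / 1e6

print(first_estimate, mpps(first_estimate, LINK_RAW))
print(batched_one_desc, mpps(batched_one_desc, LINK_ENCODED))
print(batched_four_desc, mpps(batched_four_desc, LINK_ENCODED))
print(line_rate_40ge)
```

Under these assumptions the 111-byte case lands at roughly 71 Mpps, in line with the 70.9 Mpps figure above. The 135-byte case comes out a little below the quoted 58.8 Mpps; the thread does not say which link rate was used for that number. For comparison, 40GE line rate at 64-byte frames is about 59.5 Mpps.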