From: Bruce Richardson
To: Stephen Hemminger
Cc: dev@dpdk.org
Date: Tue, 13 Oct 2015 14:59:55 +0100
Subject: Re: [dpdk-dev] IXGBE RX packet loss with 5+ cores
Message-ID: <20151013135955.GA31844@bricha3-MOBL3>
In-Reply-To: <20151012221830.6f5f42af@xeon-e3>
Organization: Intel Shannon Ltd.

On Mon, Oct 12, 2015 at 10:18:30PM -0700, Stephen Hemminger wrote:
> On Tue, 13 Oct 2015 02:57:46 +0000
> "Sanford, Robert" wrote:
>
> > I'm hoping that someone (perhaps at Intel) can help us understand
> > an IXGBE RX packet loss issue we're able to reproduce with testpmd.
> >
> > We run testpmd with various numbers of cores. We offer line-rate
> > traffic (~14.88 Mpps) to one Ethernet port, and forward all received
> > packets via the second port.
> >
> > When we configure 1, 2, 3, or 4 cores (per port, with the same
> > number of RX queues per port), there is no RX packet loss.
> > When we configure 5 or more cores, we observe the following packet
> > loss (approximate):
> >
> >   5 cores -  3% loss
> >   6 cores -  7% loss
> >   7 cores - 11% loss
> >   8 cores - 15% loss
> >   9 cores - 18% loss
> >
> > All of the "lost" packets are accounted for in the device's Rx Missed
> > Packets Count register (RXMPC[0]). Quoting the datasheet:
> > "Packets are missed when the receive FIFO has insufficient space to
> > store the incoming packet. This might be caused due to insufficient
> > buffers allocated, or because there is insufficient bandwidth on the
> > IO bus."
> >
> > RXMPC, and our use of the API rx_descriptor_done to verify that we
> > don't run out of mbufs (discussed below), lead us to theorize that
> > packet loss occurs because the device is unable to DMA all packets
> > from its internal packet buffer (512 KB, reported by register
> > RXPBSIZE[0]) before overrun.
> >
> > Questions
> > =========
> > 1. The 82599 device supports up to 128 queues. Why do we see trouble
> >    with as few as 5 queues? What could limit the system (and one port
> >    controlled by 5+ cores) from receiving at line rate without loss?
> >
> > 2. As far as we can tell, the RX path only touches the device
> >    registers when it updates a Receive Descriptor Tail register
> >    (RDT[n]), roughly every rx_free_thresh packets. Is there a big
> >    difference between one core doing this and N cores doing it
> >    1/N as often?
> >
> > 3. Do CPU reads/writes from/to device registers have a higher
> >    priority than device reads/writes from/to memory? Could the former
> >    transactions (CPU <-> device) significantly impede the latter
> >    (device <-> RAM)?
> >
> > Thanks in advance for any help you can provide.
>
> As you add cores, there is more traffic on the PCI bus from each core
> polling. There is a fixed number of PCI bus transactions per second
> possible. Each core is increasing the number of useless (empty)
> transactions. Why do you think adding more cores will help?
The polling for packets by the core should not be using PCI bandwidth
directly, as the ixgbe driver (and other drivers) check for the DD bit
being set on the descriptor in memory/cache.

However, using an increased number of queues can use PCI bandwidth in
other ways: with more queues you reduce the amount of descriptor
coalescing that can be done by the NIC. Instead of a single transaction
writing back 4 descriptors to one queue, the NIC may instead have to do
4 transactions, each writing 1 descriptor to 4 different queues. This is
possibly why sending all traffic to a single queue works ok - the
polling on the other queues is still being done, but has little effect.

Regards,
/Bruce