Date: Wed, 14 Oct 2015 10:29:52 +0100
From: Bruce Richardson
To: Alexander Duyck
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] IXGBE RX packet loss with 5+ cores
Message-ID: <20151014092952.GB32308@bricha3-MOBL3>
In-Reply-To: <561D6876.6040709@gmail.com>

On Tue, Oct 13, 2015 at 01:24:22PM -0700, Alexander Duyck wrote:
> On 10/13/2015 07:47 AM, Sanford, Robert wrote:
> >>>> [Robert:]
> >>>> 1. The 82599 device supports up to 128 queues. Why do we see trouble
> >>>> with as few as 5 queues? What could limit the system (and one port
> >>>> controlled by 5+ cores) from receiving at line-rate without loss?
> >>>>
> >>>> 2. As far as we can tell, the RX path only touches the device
> >>>> registers when it updates a Receive Descriptor Tail register (RDT[n]),
> >>>> roughly every rx_free_thresh packets. Is there a big difference
> >>>> between one core doing this and N cores doing it 1/N as often?
> >>> [Stephen:]
> >>> As you add cores, there is more traffic on the PCI bus from each core
> >>> polling. There is a fixed number of PCI bus transactions per second
> >>> possible. Each core is increasing the number of useless (empty)
> >>> transactions.
> >> [Bruce:]
> >> The polling for packets by the core should not be using PCI bandwidth
> >> directly, as the ixgbe driver (and other drivers) check for the DD bit
> >> being set on the descriptor in memory/cache.
> > I was preparing to reply with the same point.
> >
> >>> [Stephen:] Why do you think adding more cores will help?
> > We're using run-to-completion and sometimes spend too many cycles per
> > pkt. We realize that we need to move to an io+workers model, but wanted
> > a better understanding of the dynamics involved here.
> >
> >> [Bruce:] However, using an increased number of queues can use PCI
> >> bandwidth in other ways; for instance, with more queues you reduce the
> >> amount of descriptor coalescing that can be done by the NICs, so that
> >> instead of having a single transaction of 4 descriptors to one queue,
> >> the NIC may instead have to do 4 transactions, each writing 1 descriptor
> >> to 4 different queues. This is possibly why sending all traffic to a
> >> single queue works ok - the polling on the other queues is still being
> >> done, but has little effect.
> > Brilliant! This idea did not occur to me.
>
> You can actually make the throughput regression disappear by altering the
> traffic pattern you are testing with.
> In the past I have found that sending traffic in bursts, where 4 frames
> belong to the same queue before moving on to the next one, essentially
> eliminated the packets dropped due to PCIe bandwidth limitations. The
> trick is that you need to have the Rx descriptor processing work in
> batches, so that you get multiple descriptors processed for each PCIe
> read/write.
>
Yep, that's one test we used to prove the effect of descriptor coalescing,
and it does work a treat! Unfortunately, I think controlling real-world
input traffic that way could be ... em ... challenging? :-)

/Bruce
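
To make the RX path behaviour discussed above concrete, here is a minimal C
sketch of the polling pattern: the core checks the DD bit of descriptors that
live in host memory (so an empty poll costs a cache/memory read, not a PCIe
transaction), processes what it finds as a batch, and only writes the RDT
tail register once every rx_free_thresh descriptors. The descriptor layout,
field names and constants below are simplified stand-ins for illustration,
not the actual ixgbe driver code.

/*
 * Minimal sketch of the RX polling pattern under discussion.
 * Simplified, illustrative types and names -- not the real ixgbe code.
 */
#include <stdint.h>

#define DESC_STAT_DD    0x01    /* "descriptor done" bit, set by the NIC  */
#define RX_RING_SIZE    512
#define RX_FREE_THRESH  32      /* descriptors consumed per tail update   */

struct rx_desc {                /* write-back descriptor, in host memory  */
	uint64_t pkt_addr;
	uint32_t status;        /* NIC sets DESC_STAT_DD on write-back     */
	uint32_t length;
};

struct rx_queue {
	volatile struct rx_desc *ring;  /* descriptor ring in host memory  */
	volatile uint32_t *rdt_reg;     /* memory-mapped RDT register (MMIO) */
	uint16_t next_to_clean;
	uint16_t nb_hold;               /* consumed since last tail update */
};

/*
 * Poll the ring. An empty poll only reads the descriptor from memory/cache
 * and sees DD clear; no PCIe transaction is generated. The tail register
 * is written once per RX_FREE_THRESH descriptors, so the MMIO cost is
 * amortised over the whole batch.
 */
static uint16_t
rx_burst(struct rx_queue *q, uint16_t max_pkts)
{
	uint16_t nb_rx = 0;

	while (nb_rx < max_pkts) {
		volatile struct rx_desc *d = &q->ring[q->next_to_clean];

		if (!(d->status & DESC_STAT_DD))
			break;          /* still owned by the NIC */

		/* ... hand the packet up and refill the descriptor here ... */
		d->status = 0;
		q->next_to_clean = (q->next_to_clean + 1) % RX_RING_SIZE;
		q->nb_hold++;
		nb_rx++;
	}

	if (q->nb_hold >= RX_FREE_THRESH) {
		*q->rdt_reg = (uint32_t)((q->next_to_clean == 0) ?
				RX_RING_SIZE - 1 : q->next_to_clean - 1);
		q->nb_hold = 0;
	}
	return nb_rx;
}

This also illustrates why polling many idle queues is cheap in itself: the
extra cost described above comes from the NIC's descriptor write-backs being
spread across queues, not from the polling reads.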