From: Bruce Richardson
To: Stephen Hemminger
Cc: Mohammad El-Shabani, dev@dpdk.org
Date: Wed, 30 Mar 2016 15:23:18 +0100
Subject: Re: [dpdk-dev] librte_pmd_ixgbe implementation of ixgbe_dev_rx_queue_count
Message-ID: <20160330142318.GA21156@bricha3-MOBL3>
In-Reply-To: <20160329095418.5a0edd4e@xeon-e3>
References: <20160329093119.GC17800@bricha3-MOBL3> <20160329095418.5a0edd4e@xeon-e3>
Organization: Intel Shannon Ltd.
List-Id: patches and discussions about DPDK

On Tue, Mar 29, 2016 at 09:54:18AM -0700, Stephen Hemminger wrote:
> On Tue, 29 Mar 2016 10:31:19 +0100
> Bruce Richardson wrote:
>
> > On Mon, Mar 28, 2016 at 06:45:26PM -0700, Mohammad El-Shabani wrote:
> > > Hi,
> > > Looking into why it hurts performance, I see that ixgbe_dev_rx_queue_count
> > > is implemented a scan of elements of rx descriptors, which is very
> > > expensive. I am wondering why its implemented the way it is. Could it not
> > > just read the head location from the driver?
> > >
> > > Thanks!
> > > Mohammad El-Shabani
> >
> > It's likely that reading the head location from the driver will be even slower
> > than scanning the descriptor rings in memory. Access to PCI is very much slower
> > than accessing memory - especially since on platforms with DDIO, many memory
> > accesses will actually be cache reads.
> >
> > That being said, I haven't actually written a test to prove this out, so feel
> > free to try out the head pointer read method instead and see if it improves
> > things. The results may vary depending on how far ahead needs to be scanned,
> > but certainly for the empty ring case, the descriptor scan method will be far
> > faster than a head read.
> >
> > Regards,
> > /Bruce
>
> Also the most common use case is "is there any more packets ready before
> I go to sleep on epoll", and the descriptor done API tells more than
> is needed.

Yes, it's not designed for that case. For the are-there-any-more-packets query,
the rx_burst api is the one to call. :-)
The rx_queue_count API is for the case where you are under load and need to see
beyond the max count returned by rx_burst before you process the burst of packets.

/Bruce
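
For illustration only, here is a minimal sketch of the descriptor-scan idea
discussed in this thread. It is not the ixgbe driver code: the ring layout,
RING_SIZE, DESC_STAT_DD, and every function name below are hypothetical and
only model a ring whose descriptors carry a "descriptor done" bit that the NIC
writes back to memory. The point it tries to show is that the empty-ring check
touches a single descriptor in memory (often a cache hit with DDIO), while the
full count walks forward until it finds a descriptor that is not done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE    8u    /* hypothetical ring size, must be a power of two */
#define DESC_STAT_DD 0x1u  /* hypothetical "descriptor done" status bit */

/* Hypothetical, simplified RX descriptor: only the write-back status word. */
struct rx_desc {
    volatile uint32_t status;
};

/* Hypothetical software view of one RX queue. */
struct rx_ring {
    struct rx_desc desc[RING_SIZE];
    uint16_t next_to_read;  /* next descriptor software will process */
};

/* Simulate the NIC writing back n completed descriptors, starting at the
 * position software will read next and wrapping around the ring. */
static void ring_hw_complete(struct rx_ring *r, uint16_t n)
{
    assert(r != NULL && n <= RING_SIZE);
    for (uint16_t i = 0; i < n; i++) {
        uint16_t idx = (uint16_t)((r->next_to_read + i) & (RING_SIZE - 1));
        r->desc[idx].status |= DESC_STAT_DD;
    }
}

/* Cheap "is anything ready?" check: reads exactly one descriptor. */
static int ring_has_packet(const struct rx_ring *r)
{
    assert(r != NULL);
    return (r->desc[r->next_to_read].status & DESC_STAT_DD) != 0;
}

/* Count completed descriptors by scanning memory from next_to_read until
 * the first descriptor without the done bit (or the whole ring is done). */
static uint16_t ring_count_done(const struct rx_ring *r)
{
    assert(r != NULL);
    uint16_t count = 0;
    uint16_t idx = r->next_to_read;

    while (count < RING_SIZE && (r->desc[idx].status & DESC_STAT_DD)) {
        count++;
        idx = (uint16_t)((idx + 1) & (RING_SIZE - 1));
    }
    return count;
}
```

In this model the scan cost grows with the number of completed descriptors,
which matches the remark above that results may vary depending on how far
ahead needs to be scanned. A head-pointer variant would replace the loop with
one register read across PCI, which this sketch does not attempt to model.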