From: Bruce Richardson
To: Mohammad El-Shabani
Cc: dev@dpdk.org
Date: Tue, 29 Mar 2016 10:31:19 +0100
Message-ID: <20160329093119.GC17800@bricha3-MOBL3>
Organization: Intel Shannon Ltd.
Subject: Re: [dpdk-dev] librte_pmd_ixgbe implementation of ixgbe_dev_rx_queue_count

On Mon, Mar 28, 2016 at 06:45:26PM -0700, Mohammad El-Shabani wrote:
> Hi,
> Looking into why it hurts performance, I see that ixgbe_dev_rx_queue_count
> is implemented a scan of elements of rx descriptors, which is very
> expensive. I am wondering why its implemented the way it is. Could it not
> just read the head location from the driver?
>
> Thanks!
> Mohammad El-Shabani

It's likely that reading the head location from the driver will be even
slower than scanning the descriptor rings in memory. Access to PCI is very
much slower than accessing memory - especially since on platforms with DDIO,
many memory accesses will actually be cache reads.
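
To illustrate the descriptor-scan approach discussed above, here is a minimal
sketch in C. It is not the ixgbe driver code: the structure layout, the
STAT_DD bit value, the ring size, and the names rx_desc, rx_queue, and
rx_queue_count_scan are all hypothetical. It only shows why the empty-ring
case is cheap: the loop stops at the first descriptor whose "done" bit is
clear, and it touches nothing but ordinary memory.

```c
#include <stdint.h>

#define RING_SIZE 8u   /* hypothetical ring size, chosen for illustration */
#define STAT_DD   0x1u /* hypothetical "descriptor done" status bit */

/* Simplified stand-in for a receive descriptor written back by the NIC. */
struct rx_desc {
    uint32_t status;
};

/* Simplified stand-in for a receive queue. */
struct rx_queue {
    struct rx_desc ring[RING_SIZE];
    uint16_t tail; /* next descriptor the software will examine */
};

/*
 * Count completed descriptors by walking the ring from the software tail
 * and stopping at the first descriptor that is not marked done.
 * Every access here is a plain memory read; no device register is read.
 */
static uint16_t rx_queue_count_scan(const struct rx_queue *q)
{
    uint16_t count = 0;
    uint16_t idx = q->tail;

    while (count < RING_SIZE && (q->ring[idx].status & STAT_DD)) {
        count++;
        idx = (uint16_t)((idx + 1u) % RING_SIZE); /* wrap around the ring */
    }
    return count;
}
```

With an empty ring the function returns after a single status check. A
head-pointer variant would replace the loop with one device register read
over PCI, which is the access expected above to be the slower of the two.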

That being said, I haven't actually written a test to prove this out, so feel
free to try out the head pointer read method instead and see if it improves
things. The results may vary depending on how far ahead needs to be scanned,
but certainly for the empty ring case, the descriptor scan method will be far
faster than a head read.

Regards,
/Bruce