From: Bruce Richardson
To: Moon-Sang Lee
Cc: dev@dpdk.org
Date: Wed, 13 Jan 2016 11:34:33 +0000
Subject: Re: [dpdk-dev] rte_prefetch0() is effective?
Message-ID: <20160113113432.GA7216@bricha3-MOBL3>
Organization: Intel Shannon Ltd.

On Thu, Dec 24, 2015 at 03:35:14PM +0900, Moon-Sang Lee wrote:
> I see codes as below in example directory, and I wonder it is effective.
> Coherent IO is adopted to modern architectures,
> so I think that DMA initiation by rte_eth_rx_burst() might already fulfills
> cache lines of RX buffers.
> Do I really need to call rte_prefetchX()?
>
>         nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
>                                  MAX_PKT_BURST);
>         ...
>         /* Prefetch and forward already prefetched packets */
>         for (j = 0; j < (nb_rx - PREFETCH_OFFSET); j++) {
>                 rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[
>                                 j + PREFETCH_OFFSET], void *));
>                 l3fwd_simple_forward(pkts_burst[j], portid,
>                                      qconf);
>         }
>

Good question.
When the first example apps using this style of prefetch were originally written, yes, there was a noticeable performance increase from using the prefetch. Since then, I'm not sure that anyone has checked on each new generation of platforms whether the prefetches are still necessary and how much they help, but I suspect that they still help a bit and don't hurt performance.

It would be an interesting exercise to check whether the prefetch offsets used in code like the above can be adjusted to give better performance on our latest supported platforms.

/Bruce
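
For anyone who wants to try the offset experiment suggested above, here is a minimal sketch of the same loop shape with the prefetch distance turned into a runtime parameter. It is not DPDK code and not the l3fwd implementation: `__builtin_prefetch` (GCC/Clang) stands in for `rte_prefetch0()`, `process_packet()` stands in for `l3fwd_simple_forward()`, and `process_burst()` and `time_bursts()` are hypothetical names. The priming loop and the tail loop are assumptions added so that every packet is handled for any offset, including offsets larger than the burst.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for per-packet work: read the first 64 bytes. */
static uint64_t process_packet(const uint8_t *pkt)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < 64; i++)
        sum += pkt[i];
    return sum;
}

/* Process one burst, prefetching `offset` packets ahead of the one in use. */
static uint64_t process_burst(uint8_t *const *pkts, int nb_rx, int offset)
{
    uint64_t sum = 0;
    int j;

    assert(nb_rx >= 0 && offset >= 0);

    /* Prime: prefetch the first `offset` packets (assumed, see lead-in). */
    for (j = 0; j < offset && j < nb_rx; j++)
        __builtin_prefetch(pkts[j], 0, 3);

    /* Prefetch ahead and process already prefetched packets.
     * Signed arithmetic keeps this loop empty when nb_rx < offset. */
    for (j = 0; j < nb_rx - offset; j++) {
        __builtin_prefetch(pkts[j + offset], 0, 3);
        sum += process_packet(pkts[j]);
    }

    /* Tail: process the remaining packets, which were prefetched earlier. */
    for (; j < nb_rx; j++)
        sum += process_packet(pkts[j]);

    return sum;
}

/* Time `rounds` bursts at a given offset; returns CPU seconds.
 * The checksum is written to *sink so the work is not optimized away. */
static double time_bursts(uint8_t *const *pkts, int nb_rx, int offset,
                          int rounds, uint64_t *sink)
{
    uint64_t sum = 0;
    clock_t start = clock();

    for (int r = 0; r < rounds; r++)
        sum += process_burst(pkts, nb_rx, offset);

    *sink = sum;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_bursts()` for a range of offsets (0 effectively disables the lookahead) gives a rough comparison. Buffers that already sit in cache will hide any difference, so a meaningful measurement has to be taken in the real RX path, on the target platform, with packets freshly written by the NIC.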