From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
	by dpdk.org (Postfix) with ESMTP id 906007E52
	for ; Thu, 9 Oct 2014 11:07:04 +0200 (CEST)
Received: from orsmga002.jf.intel.com ([10.7.209.21])
	by orsmga102.jf.intel.com with ESMTP; 09 Oct 2014 02:08:01 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.04,683,1406617200"; d="scan'208";a="615682340"
Received: from bricha3-mobl.ger.corp.intel.com (HELO bricha3-mobl.ir.intel.com) ([10.243.20.24])
	by orsmga002.jf.intel.com with SMTP; 09 Oct 2014 02:14:22 -0700
Received: by bricha3-mobl.ir.intel.com (sSMTP sendmail emulation); Thu, 09 Oct 2014 10:14:21 +0001
Date: Thu, 9 Oct 2014 10:14:21 +0100
From: Bruce Richardson
To: Matthew Hall
Message-ID: <20141009091421.GB14308@BRICHA3-MOBL>
References: <3AEA2BF9852C6F48A459DA490692831FE21954@IRSMSX109.ger.corp.intel.com>
	<20141008224111.GC29243@mhcomputing.net>
	<20141008225540.GA15850@hmsreliant.think-freely.org>
	<20141008230728.GA29712@mhcomputing.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20141008230728.GA29712@mhcomputing.net>
Organization: Intel Shannon Ltd.
User-Agent: Mutt/1.5.22 (2013-10-16)
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH RFC] librte_reorder: new reorder library
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Oct 2014 09:07:05 -0000

On Wed, Oct 08, 2014 at 04:07:28PM -0700, Matthew Hall wrote:
> On Wed, Oct 08, 2014 at 06:55:41PM -0400, Neil Horman wrote:
> > I think because there is a possibility that multiple workers may be used for a
> > single tx queue.
> >
> > Neil
>
> OK, so, in my application packets are RX'ed to a predictable RX queue and core
> using RSS.
>
> Then you put them into a predictable TX queue for the same core, in the same
> order they came in from the RX queue with RSS enabled.
>
> So you've got a consistent-hashed subset of packets as input, being converted
> to output in the same order.
>
> Will it work, or not work? I'm just curious if my app is doing it wrong and I
> need to fix it, or how this case should be handled in general...
>
> Matthew.

Hi Matthew,

What you are doing will indeed work, and it's the way the vast majority of the
sample apps are written. However, this will not always work for everyone else,
sadly.

First off, with RSS, there are a number of limitations. On the 1G and 10G NICs
RSS works only with IP traffic, and won't work in cases with other protocols or
where IP is encapsulated in anything other than a single VLAN. Those cases need
software load distribution. As well as this, you have very little control over
where flows get put, as the separation into queues (which go to cores) is only
done on the low seven bits of the RSS hash. For applications which work with a
small number of flows, e.g. where multiple flows are contained inside a single
tunnel, you can get a large flow imbalance, where far more traffic comes to one
queue/core than to another. Again in this instance, software load balancing is
needed.

Secondly, then, based off that, it is entirely possible when doing software
load balancing to strictly process packets for a flow in order - and indeed
this is what the existing packet distributor does. However, for certain types
of flow where processing of packets for that flow can be done in parallel,
forcing things to be done serially can slow things down. As well as this, there
can sometimes be requirements for the load balancing between cores to be done
as fairly as possible so that it is guaranteed that all cores have
approximately the same load, irrespective of the number of input flows.
In these cases, having the option to blindly distribute traffic to cores and
then reorder packets on TX is the best way to ensure even load distribution.
It's not going to be for everyone, but it's good to have the option - and there
are a number of people doing things this way already.

Lastly, there is also the assumption being made that all flows are independent,
which again may not always be the case. If you need ordering across flows and
to share load between cores then reordering on transmission is the only way to
do things.

Hope this helps,

Regards,
/Bruce
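
As an illustration of the "distribute blindly, reorder on TX" approach
described above, here is a minimal sketch in C. It is not the proposed
librte_reorder API. It assumes each packet is stamped with a sequence number
at RX before being handed to a worker, and that the TX side holds a small
window of packets until the gaps are filled. The names `struct packet`,
`struct reorder_buf`, `rb_init`, `rb_insert`, `rb_drain` and the window size
`RB_SIZE` are all hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical window size for this sketch; must be a power of two so that
 * slot indexing stays consistent when the 32-bit sequence number wraps. */
#define RB_SIZE 8u

struct packet {
    uint32_t seq;   /* sequence number assumed to be stamped at RX */
    int payload;    /* stand-in for real packet data */
};

struct reorder_buf {
    const struct packet *slot[RB_SIZE]; /* NULL means "not yet arrived" */
    uint32_t next_seq;                  /* next sequence number due on TX */
};

void rb_init(struct reorder_buf *rb, uint32_t first_seq)
{
    assert((RB_SIZE & (RB_SIZE - 1u)) == 0u);
    for (size_t i = 0; i < RB_SIZE; i++)
        rb->slot[i] = NULL;
    rb->next_seq = first_seq;
}

/* Store a packet returned by a worker. Returns 0 on success, or -1 if the
 * packet falls outside the current window (too early, or already sent). */
int rb_insert(struct reorder_buf *rb, const struct packet *p)
{
    /* Unsigned subtraction: late or duplicate packets wrap to a large value. */
    uint32_t offset = p->seq - rb->next_seq;

    if (offset >= RB_SIZE)
        return -1;
    rb->slot[p->seq % RB_SIZE] = p;
    return 0;
}

/* Copy out packets that are now in order, stopping at the first gap.
 * Returns the number of packets written to out[]. */
size_t rb_drain(struct reorder_buf *rb, const struct packet **out, size_t max)
{
    size_t n = 0;

    while (n < max) {
        uint32_t idx = rb->next_seq % RB_SIZE;

        if (rb->slot[idx] == NULL)
            break;              /* gap: an earlier packet is still in flight */
        out[n++] = rb->slot[idx];
        rb->slot[idx] = NULL;
        rb->next_seq++;
    }
    return n;
}
```

In this sketch the TX core would call `rb_insert` for every packet a worker
hands back and then `rb_drain` to collect whatever is ready to transmit. What
to do with a packet that `rb_insert` rejects (drop it, or send it out of
order) is a policy decision that the sketch leaves to the caller.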