Date: Wed, 25 Feb 2015 11:02:28 +0000
From: Bruce Richardson
To: Vlad Zolotarov
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] ixgbe: why bulk allocation is not used for a scattered Rx flow?
Message-ID: <20150225110228.GA4896@bricha3-MOBL3>
In-Reply-To: <54ED9894.3050409@cloudius-systems.com>

On Wed, Feb 25, 2015 at 11:40:36AM +0200, Vlad Zolotarov wrote:
> Hi, I have a question about the "scattered Rx" feature: why does enabling
> it disable the "bulk allocation" feature?

The "bulk allocation" feature is one where a more optimized RX code path
is used. For the sake of performance, certain assumptions were made in
that code path, one of which is that each packet fits inside a single
mbuf. Dropping that assumption makes receiving packets much more
complicated and therefore slower. [For similar reasons, the optimized TX
routines, e.g. vector TX, are only used if it is guaranteed that no
hardware offload features are going to be used.]

Now, it is possible, though challenging, to write optimized code for the
more complicated cases, such as scattered RX, or TX with offloads or
scattered packets. In general, we will always want separate routines for
the simple case and the complicated cases, as the cost of checking for
offloads or multi-mbuf packets is significant enough to hurt performance
badly when those features are not needed. In the case of the vector PMD
for ixgbe - our highest-performance path right now - we do indeed have two
receive routines, one for the simple case and one for the scattered case.
For TX, we only have an optimized path for the simple case, but that is
not to say that someone may not provide one for the offload case at some
point too.

A final note on scattered packets in particular: if packets are too big to
fit in a single mbuf, then they are not small packets, and the processing
time available per packet is, by definition, larger than for packets that
fit in a single mbuf. For 64-byte packets, the inter-arrival time is 67 ns
at 10G, or approx. 200 cycles at 3 GHz. If we assume a standard 2k mbuf,
then a packet which spans two mbufs takes at least 1654 ns on the wire,
and a 3 GHz CPU therefore has nearly 5000 cycles to process that same
packet. Since the processing budget is so much bigger, the need to
optimize is much smaller, and it's more important to focus on the
small-packet case - which is what we have done.
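To make that concrete, here is a rough sketch of how a driver can select
its RX routine once at setup time, so the fast path never has to branch on
the configuration per burst. All names here are illustrative stand-ins,
not the actual ixgbe code:

    #include <stdbool.h>
    #include <stdint.h>

    struct rte_mbuf; /* opaque for this sketch */

    typedef uint16_t (*rx_burst_t)(void *rxq, struct rte_mbuf **pkts,
                                   uint16_t nb_pkts);

    /* Candidate receive routines; bodies omitted in this sketch. */
    uint16_t rx_burst_scattered(void *rxq, struct rte_mbuf **pkts, uint16_t n);
    uint16_t rx_burst_bulk_alloc(void *rxq, struct rte_mbuf **pkts, uint16_t n);
    uint16_t rx_burst_simple(void *rxq, struct rte_mbuf **pkts, uint16_t n);

    static rx_burst_t
    select_rx_burst(bool scattered_rx, bool bulk_alloc_preconds_met)
    {
            /*
             * Scattered RX breaks the one-packet-per-mbuf assumption
             * that the bulk-alloc path relies on, so it takes priority.
             */
            if (scattered_rx)
                    return rx_burst_scattered;
            if (bulk_alloc_preconds_met)
                    return rx_burst_bulk_alloc;
            return rx_burst_simple;
    }

The chosen function pointer is then what gets invoked on each RX burst,
with no per-packet checks for features that were never enabled.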
> There is some unclear comment in ixgbe_recv_scattered_pkts():
>
>    /*
>     * Descriptor done.
>     *
>     * Allocate a new mbuf to replenish the RX ring descriptor.
>     * If the allocation fails:
>     *    - arrange for that RX descriptor to be the first one
>     *      being parsed the next time the receive function is
>     *      invoked [on the same queue].
>     *
>     *    - Stop parsing the RX ring and return immediately.
>     *
>     * This policy does not drop the packet received in the RX
>     * descriptor for which the allocation of a new mbuf failed.
>     * Thus, it allows that packet to be later retrieved if
>     * mbuf have been freed in the mean time.
>     * As a side effect, holding RX descriptors instead of
>     * systematically giving them back to the NIC may lead to
>     * RX ring exhaustion situations.
>     * However, the NIC can gracefully prevent such situations
>     * to happen by sending specific "back-pressure" flow control
>     * frames to its peer(s).
>     */
>
> Why can't the same "policy" be applied in the bulk-allocation context? -
> Don't advance the RDT until you've refilled the ring. What am I missing
> here?

A lot of the optimizations done in other code paths, such as bulk alloc,
may well be applicable here; it's just that the work has not been done
yet, as the focus has been elsewhere. For vector PMD RX, we now have
routines that work on both regular and scattered packets, and both perform
much better than the scalar equivalents. Note also that in every RX (and
TX) routine, the NIC tail-pointer update is always done just once, at the
end of the function.

> Another question is about the LRO feature - is there a reason why it's
> not implemented? I've implemented LRO support in the ixgbe PMD to begin
> with - I used the "scattered Rx" code as a template and now I'm tuning it
> (things like the stuff above).
>
> Is there any philosophical reason why it hasn't been implemented in *any*
> PMD so far? ;)

I'm not aware of any philosophical reason why it hasn't been done. Patches
are welcome, as always. :-)

/Bruce
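P.S. To illustrate the tail-update point: a minimal sketch of the "hold
the descriptor on allocation failure, write RDT once at the end" policy
might look like the below. All names are illustrative stand-ins (e.g.
alloc_mbuf() stands in for the real mempool allocation call), not the
actual ixgbe code.

    #include <stddef.h>
    #include <stdint.h>

    struct mbuf;
    struct mbuf *alloc_mbuf(void); /* stand-in for the real mbuf allocator */

    struct rxq {
            struct mbuf      **sw_ring;     /* SW shadow of the HW ring */
            uint16_t           next_to_use; /* first desc needing an mbuf */
            uint16_t           ring_size;
            volatile uint32_t *rdt_reg;     /* NIC tail register (RDT) */
    };

    static void
    refill_and_update_tail(struct rxq *q, uint16_t nb_wanted)
    {
            uint16_t i, idx = q->next_to_use;

            for (i = 0; i < nb_wanted; i++) {
                    struct mbuf *m = alloc_mbuf();
                    if (m == NULL)
                            break; /* hold the descriptor; retry next call */
                    q->sw_ring[idx] = m;
                    /* ...program m's buffer address into the HW desc... */
                    idx = (uint16_t)((idx + 1) % q->ring_size);
            }

            if (i == 0)
                    return; /* nothing refilled: RDT stays put, so the
                             * NIC never sees an unfilled descriptor */

            q->next_to_use = idx;
            /*
             * One tail write per burst: RDT points at the last
             * descriptor handed back to the NIC, the one before idx.
             */
            *q->rdt_reg = (idx == 0) ? (uint32_t)(q->ring_size - 1)
                                     : (uint32_t)(idx - 1);
    }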