From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id D92FD5589 for ; Thu, 19 Jan 2017 13:10:06 +0100 (CET) Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga101.fm.intel.com with ESMTP; 19 Jan 2017 04:10:05 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.33,253,1477983600"; d="scan'208";a="924334011" Received: from bricha3-mobl3.ger.corp.intel.com ([10.237.221.61]) by orsmga003.jf.intel.com with SMTP; 19 Jan 2017 04:10:03 -0800 Received: by (sSMTP sendmail emulation); Thu, 19 Jan 2017 12:10:03 +0000 Date: Thu, 19 Jan 2017 12:10:02 +0000 From: Bruce Richardson To: Olivier Matz Cc: dev@dpdk.org Message-ID: <20170119121002.GA1720@bricha3-MOBL3.ger.corp.intel.com> References: <1484147125-5948-1-git-send-email-bruce.richardson@intel.com> <20170113152334.5efda35a@platinum> <20170113150053.GA193796@bricha3-MOBL3.ger.corp.intel.com> <20170117143820.336b40d9@platinum> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170117143820.336b40d9@platinum> Organization: Intel Research and =?iso-8859-1?Q?De=ACvel?= =?iso-8859-1?Q?opment?= Ireland Ltd. User-Agent: Mutt/1.7.1 (2016-10-04) Subject: Re: [dpdk-dev] [RFC PATCH 00/11] generalise rte_ring to allow different datatypes X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Jan 2017 12:10:07 -0000 On Tue, Jan 17, 2017 at 02:38:20PM +0100, Olivier Matz wrote: > Hi Bruce, > > Maybe it's worth checking the impact. The size check could be done only > once per bulk, so it may not cost that much. > > It's also possible to have a particular case for pointer size, and > use a memcpy for other sizes. > > > I think having a performance test showing storing the elt size in the > ring structure has a deep impact would help to reach a consensus > faster :) > > Hi Olivier, I did a quick prototype using a switch statement for three data element sizes: 8, 16, and 32 bytes. The performance difference was neglible to none. In most cases, with ring_perf_autotest on my system, there was a small degradation, of less than 1 cycle per packet, and a few were slightly faster, probably due to the natural variation in results between runs. I did not test with any memcpy calls in the datapath, all assignments were done using uint64_t's or vectors of the appropriate sizes. Therefore it looks like some kind of solution without macros and using a stored element size is possible. However, I think there is a third alternative too. It is outlined below as option 3. 1. Use macros as in original RFC 2. Update rte_ring like I did for tests described above so that create takes the size parameter, and the switch statment in enqueue and dequeue looks that up at runtime. This means that rte_ring becomes the type used for all transfers of all sizes. Also, enqueues/dequeue functions take void * or const void * obj_table parameters rather than void ** and void * const * obj_table. Downside, this would change the ring API and ABI, and the ring maintains no type information 3. Update rte_ring as above but rename it to rte_common_ring, and have the element size parameter passed to enqueue and dequeue functions too - allowing the compiler to optimise the switch out. Then we update the existing rte_ring to use the rte_common_ring calls, passing in sizeof(void *) as parameter to each common call. An event-ring type, or any other ring types can similarly be written using common ring code, and present the appropriate type information on enqueue/dequeue to the apps using them. Downside: more code to maintain, and more specialised APIs. Personally, because I like having type-specialised code, I prefer the third option. It also gives us the ability to change the common code without affecting the API/ABI of the rings [which could be updated later after a proper deprecation period, if we want]. An example of a change I have in mind for this common code would be some rework around the watermarks support. While the watermarks support is useful, for the event_rings we actually need more information provided from enqueue. To that end, I would see the common_rings code changed so that the enqueue function returns an additional parameter of the amount of space left in the ring. This information is computed by the function anyway, and can therefore be efficiently returned by the calls. For sp_enqueue, this extra parameter would allow the app to know the minimum number of elements which can be successfully enqueued to the ring in a subsequent call. The existing rte_ring code can use the return value to calculate itself if the watermark is exceeded and return a value as it does now. Other ring types can then decide themselves if they want to provide watermark functionality, or what their API set would be - though it's probably best to keep the APIs consistent. Further thoughts? /Bruce