From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by dpdk.org (Postfix) with ESMTP id 013D512A8 for ; Tue, 31 Jan 2017 14:46:56 +0100 (CET) Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga104.jf.intel.com with ESMTP; 31 Jan 2017 05:46:55 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.33,315,1477983600"; d="scan'208";a="60175942" Received: from bricha3-mobl3.ger.corp.intel.com ([10.237.221.61]) by fmsmga006.fm.intel.com with SMTP; 31 Jan 2017 05:46:53 -0800 Received: by (sSMTP sendmail emulation); Tue, 31 Jan 2017 13:46:52 +0000 Date: Tue, 31 Jan 2017 13:46:52 +0000 From: Bruce Richardson To: Olivier Matz Cc: "Ananyev, Konstantin" , "Wiles, Keith" , "dev@dpdk.org" Message-ID: <20170131134651.GB133984@bricha3-MOBL3.ger.corp.intel.com> References: <20170125142052.7989e0ec@glumotte.dev.6wind.com> <20170125135404.GA24352@bricha3-MOBL3.ger.corp.intel.com> <20170125144809.GA26936@bricha3-MOBL3.ger.corp.intel.com> <647B0F09-667B-41D2-A8E1-964F71D4C365@intel.com> <20170125165740.GA33248@bricha3-MOBL3.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772583F10CD7E@irsmsx105.ger.corp.intel.com> <20170131115349.7efadb09@platinum> <20170131114142.GA130764@bricha3-MOBL3.ger.corp.intel.com> <20170131121050.GA130444@bricha3-MOBL3.ger.corp.intel.com> <20170131142718.25e6c876@platinum> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170131142718.25e6c876@platinum> Organization: Intel Research and =?iso-8859-1?Q?De=ACvel?= =?iso-8859-1?Q?opment?= Ireland Ltd. User-Agent: Mutt/1.7.1 (2016-10-04) Subject: Re: [dpdk-dev] rte_ring features in use (or not) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Jan 2017 13:46:57 -0000 On Tue, Jan 31, 2017 at 02:27:18PM +0100, Olivier Matz wrote: > On Tue, 31 Jan 2017 12:10:50 +0000, Bruce Richardson > wrote: > > On Tue, Jan 31, 2017 at 11:41:42AM +0000, Bruce Richardson wrote: > > > On Tue, Jan 31, 2017 at 11:53:49AM +0100, Olivier Matz wrote: > > > > On Wed, 25 Jan 2017 17:29:18 +0000, "Ananyev, Konstantin" > > > > wrote: > > > > > > > > Bonus question: > > > > > > > > * Do we know how widely used the enq_bulk/deq_bulk > > > > > > > > functions are? They are useful for unit tests, so they do > > > > > > > > have uses, but I think it would be good if we harmonized > > > > > > > > the return values between bulk and burst functions. Right > > > > > > > > now: enq_bulk - only enqueues all elements or none. > > > > > > > > Returns 0 for all, or negative error for none. > > > > > > > > enq_burst - enqueues as many elements as possible. > > > > > > > > Returns the number enqueued. > > > > > > > > > > > > > > I do use the apis in pktgen and the difference in return > > > > > > > values has got me once. Making them common would be great, > > > > > > > but the problem is > > > > > > backward compat to old versions I would need to have an ifdef > > > > > > in pktgen now. So it seems like we moved the problem to the > > > > > > application. > > > > > > > > > > > > > > > > > > > Yes, an ifdef would be needed, but how many versions of DPDK > > > > > > back do you support? Could the ifdef be removed again after > > > > > > say, 6 months? > > > > > > > I would like to see the old API kept and a new API with the > > > > > > > new behavior. I know it adds another API but one of the API > > > > > > > would be nothing > > > > > > more than wrapper function if not a macro. > > > > > > > > > > > > > > Would that be more reasonable then changing the ABI? > > > > > > > > > > > > Technically, this would be an API rather than ABI change, > > > > > > since the functions are inlined in the code. However, it's > > > > > > not the only API change I'm looking to make here - I'd like > > > > > > to have all the functions start returning details of the > > > > > > state of the ring, rather than have the watermarks facility. > > > > > > If we add all new functions for this and keep the old ones > > > > > > around, we are just increasing our maintenance burden. > > > > > > > > > > > > I'd like other opinions here. Do we see increasing the API > > > > > > surface as the best solution, or are we ok to change the APIs > > > > > > of a key library like the rings one? > > > > > > > > > > I am ok with changing API to make both _bulk and _burst return > > > > > the same thing. Konstantin > > > > > > > > I agree that the _bulk() functions returning 0 or -err can be > > > > confusing. But it has at least one advantage: it explicitly shows > > > > that if user ask for N enqueues/dequeues, it will either get N or > > > > 0, not something between. > > > > > > > > Changing the API of the existing _bulk() functions looks a bit > > > > dangerous to me. There's probably a lot of code relying on the > > > > ring API, and changing its behavior may break it. > > > > > > > > I'd prefer to deprecate the old _bulk and _burst functions, and > > > > introduce a new api, maybe something like: > > > > > > > > rte_ring_generic_dequeue(ring, objs, n, behavior, flags) > > > > -> return nb_objs or -err > > > > > > > Don't like the -err, since it's not a valid value that can be used > > > e.g. in simple loops in the case that the user doesn't care about > > > the exact reason for error. I prefer having zero returned on error, > > > with rte_errno set appropriately, since then it is trivial for apps > > > to ignore error values they don't care about. > > > It also makes the APIs in a ring library consistent in that all > > > will set rte_errno on error, rather than returning the error code. > > > It's not right for rte_ring_create and rte_ring_lookup to return an > > > error code since they return pointers, not integer values. > > My assumption was that functions returning an int should return an > error instead of rte_errno. By the way, it's actually the same debate > than http://dpdk.org/ml/archives/dev/2017-January/056546.html > > In that particular case, I'm not convinced that this code: > > ret = ring_dequeue(r, objs, n); > if (ret == 0) { > /* handle error in rte_errno */ > return; > } > do_stuff_with_elements(objs, ret); > > Is better/faster/clearer than this one: > > ret = ring_dequeue(r, objs, n); > if (ret <= 0) { > /* handle error in ret */ > return; > } > do_stuff_with_elements(objs, ret); > > > In the first case, you could argue that the "if (ret)" part could be > stripped if the app does not care about errors, but I think it's not > efficient to call the next function with 0 object. Also, this if() does > not necessarily adds a test since ring_dequeue() is inline. > > In the first case, ring_dequeue needs to write rte_errno in memory on > error (because it's a global variable), even if the caller does not > look at it. In the second case, it can stay in a register. > I agree in many cases there is not a lot to choose between the two methods. However, I prefer the errno approach for 3 reasons: 1. Firstly, and primarily, it works in all cases, including for use with functions that return pointers. That allows a library like rte_ring to use it across all functions, rather than having some functions use an errno variable, or extra return value, and other functions return the error code directly. 2. It's how unix system calls work, so everyone is familiar with the scheme. 3. It allows the return value to be always in the valid domain of return values for the type. You can have dequeue functions that always return unsigned values, you can have functions that return enums etc. This means you can track stats and chain function calls without having to do error checking if you like: for example, moving packets between rings: rte_ring_enqueue_burst(r2, objs, rte_ring_dequeue_burst(r1, objs, sizeof(objs)); This is for me the least compelling of the reasons, but is still worth considering, and I do admit to liking the functional style of programming it allows. > > > > > > > As for deprecating the functions - I'm not sure about that. I think > > > the names of the existing functions are ok, and should be kept. > > > I've a new patchset of cleanups for rte_rings in the works. Let me > > > try and finish that and send it out as an RFC and we'll see what > > > you think then. > > Sorry, I realised on re-reading this reply seemed overly negative, > > sorry. > > haha, no problem :) > > > > I can actually see the case for deprecating both sets of > > functions to allow us to "start afresh". If we do so, are we as well > > to just replace the whole library with a new one, e.g. rte_fifo, which > > would allow us the freedom to keep e.g. functions with "burst" in the > > name if we so wish? If might also allow an easier transition. > > Yes, that's also an option. > > My fear is about changing the API of such widely used functions, > without triggering any compilation error because the prototypes stays > the same. > Don't worry, I also plan to change the prototypes! I want to add in an extra parameter to each call to optionally return the amount of space in the ring. This is can lead to large cycle savings - by avoiding extra calls to rte_ring_count() or rte_ring_free_count() - in cases where it is useful. We found that this lead to significant performance improvements in the SW eventdev, as we had less ping-ponging of cachelines between cores. Since we already know the amount of free space in an enqueue call we can return that for free, while calling a separate free_space API can lead to huge stalls if the cachelines have been adjusted by another core in the meantime. [Yes, this does mean that the value returned from an enqueue call would be inaccurate, but it can still, in the case of SP rings, provide a guarantee that any number of objects up to that number can be enqueued without error, since any changes will only increase the number that can be enqueued] The new APIs would therefore be: rte_ring_enqueue_bulk(r, objs, n, &free_space); rte_ring_dequeue_bulk(r, objs, n, &avail_objs); This also would allow us to remove the watermarks feature, as the app can itself check the free_space value against as many watermark thresholds as it likes. Hopefully I'll get an RFC with this change ready in the next few days and we can base the discussion more on working code. Regards, /Bruce