Date: Mon, 29 Sep 2014 07:35:59 -0400
From: Neil Horman
To: "Wodkowski, PawelX"
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:
Message-ID: <20140929113559.GA26483@hmsreliant.think-freely.org>

On Mon, Sep 29, 2014 at 10:11:38AM +0000, Wodkowski, PawelX wrote:
> > >
> > > Imagine how you will be damned by someone who does not even notice your change
> > > and is managing some kind of resource based on the returned number of
> > > set/canceled timers. If you suddenly start returning negative values, how will
> > > those applications behave?
> > > Silently changing the returned value domain is evil in its
> > > pure form.
> >
> > As I can see the impact is very limited.
>
> It is a small impact to DPDK but can be huge to a user application:
> Ex:
> If someone uses this kind of expression in a callback (skipping the user app serialization part):
> callback () {
>     ...
>     some_simple_semaphore += rte_eal_alarm_cancel(...);

This code would be broken to begin with, as rte_eal_alarm_cancel is already
written to return negative return codes. It's not documented as such, but it's
still the case. Note that if you run an application built against a shared
library on BSD, the definition of rte_eal_alarm_cancel returns -ENOTSUP. The
above code would be broken because it doesn't account for that. You can argue
that the documentation should be updated, but the DPDK in the wild already
conforms to the model Konstantin and I are proposing.

>     ...
> }
>
> Anywhere in the code:
>     ...
>     if (some_simple_semaphore) {
>         some_simple_semaphore--;
>         if (rte_eal_alarm_set(...) != 0)
>             some_simple_semaphore++;
>     }
>     ...
>
> 1. Do you notice the change in the cancel function?

The application crashes, or otherwise misbehaves.

> 2. How many hours do you spend finding this issue in the case of a big app/system?

You don't. Such a problem as you describe would very likely result in a
semaphore deadlock, as the count would be incorrectly lowered. So you put
watches on the variable, note that sometimes the count goes down on a cancel,
which is completely counter-intuitive, read the updated documentation that
indicates error codes are possible (which you should have been prepared for
anyway), and move on with your day.

>
> > Only code that checks for (rte_eal_alarm_cancel(...) == 0/ != 0) inside an alarm
> > callback function might be affected.
> > On the other hand, there could indeed be situations when the caller needs to
> > know whether the alarm was successfully cancelled or not,
> > and if not, for what reason.
> >
>
> I can extend the API of rte alarms to add alarm state checking in the next patch, but for
> now, since this is not urgent, I think the original patch v2 should be enough.

I re-assert my original argument here: without the above change, you haven't
really fixed the race. If you can find another way to do it, that's fine with
me, but keep in mind once again that some implementations of rte_eal_alarm_set
already do what's being proposed.

Neil

>
> Pawel
>
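
For illustration only, here is a minimal sketch of the defensive accounting discussed above: the caller credits its counter only when the cancel call reports a count, and passes any negative code through untouched. Both fake_alarm_cancel() and credit_cancelled() are hypothetical names invented for this sketch. The first merely stands in for rte_eal_alarm_cancel() and is not the DPDK implementation; the only behaviour assumed is what the thread describes, namely a non-negative count on success and a negative errno-style code such as -ENOTSUP otherwise.

```c
#include <errno.h>

/* Hypothetical stand-in for rte_eal_alarm_cancel(): returns the number of
 * alarms cancelled (>= 0), or a negative errno-style code on failure. */
static int fake_alarm_cancel(int pending, int supported)
{
    if (!supported)
        return -ENOTSUP;    /* e.g. a platform without alarm support */
    return pending;         /* number of alarms removed */
}

/* Credit the counter only when the cancel call reports a count.
 * Returns 0 on success, or passes the negative error code through
 * and leaves the counter untouched. */
static int credit_cancelled(int *counter, int cancel_ret)
{
    if (cancel_ret < 0)
        return cancel_ret;
    *counter += cancel_ret;
    return 0;
}
```

With a helper like this, an error return can no longer silently lower the count, which is the failure mode the semaphore example would otherwise hit.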