From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id B970E7DEB for ; Fri, 26 Sep 2014 14:32:15 +0200 (CEST) Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga102.fm.intel.com with ESMTP; 26 Sep 2014 05:38:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.04,604,1406617200"; d="scan'208";a="597238215" Received: from irsmsx103.ger.corp.intel.com ([163.33.3.157]) by fmsmga001.fm.intel.com with ESMTP; 26 Sep 2014 05:38:35 -0700 Received: from irsmsx102.ger.corp.intel.com ([169.254.2.200]) by IRSMSX103.ger.corp.intel.com ([169.254.3.175]) with mapi id 14.03.0195.001; Fri, 26 Sep 2014 13:37:54 +0100 From: "Wodkowski, PawelX" To: Neil Horman Thread-Topic: [dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe: Thread-Index: AQHP2MAapayPUkT5UUGj9XqXyFJ3T5wR4sKAgAAPjwCAABZmAIAAZLsAgADPUACAAB2vsA== Date: Fri, 26 Sep 2014 12:37:54 +0000 Message-ID: References: <1411649768-8084-1-git-send-email-michalx.k.jastrzebski@intel.com> <20140925150807.GD32725@hmsreliant.think-freely.org> <2601191342CEEE43887BDE71AB977258213769DE@IRSMSX105.ger.corp.intel.com> <20140925172358.GG32725@hmsreliant.think-freely.org> <2601191342CEEE43887BDE71AB97725821378B50@IRSMSX104.ger.corp.intel.com> <20140926114630.GA3930@hmsreliant.think-freely.org> In-Reply-To: <20140926114630.GA3930@hmsreliant.think-freely.org> Accept-Language: pl-PL, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [163.33.239.182] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: "dev@dpdk.org" Subject: Re: [dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe: X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Sep 2014 12:32:16 -0000 > So basically cancel() just set ALARM_CANCELLED and leaves actual alarm > deletion to the callback()? > That was the thought, yes. >=20 > > I think it is doable - but I don't see any real advantage with that app= roach. > > Yes, code will become a bit simpler, as we'll have one point when we r= emove > alarm from the list. > Yes, that would be the advantage, that the code would be much simpler. >=20 > > But from other side, imagine such simple test-case: > > > > for (i =3D 0; i < 0x100000; i++) { > > rte_eal_alarm_set(ONE_MIN, cb_func, (void *)i); > > rte_eal_alarm_cancel(cb_func, (void *)i); > > } > > > > We'll endup with 1M of cancelled, but still not removed entries in the > alarm_list. > > With current implementation that means - few MBs of wasted memory, > Thats correct, and the tradeoff to choose between. Do you want simpler c= ode > that is easier to maintain, or do you want a high speed cancel and set > operation. I'm not aware of all the use cases, but I have a hard time se= eing > a use case in which the in-flight alarm list grows unboundedly large, whi= ch in > my mind mitigates the risk of deferred removal, but I'm perfectly willing= to > believe that there are use cases which I'm not aware of. >=20 > > plus very slow set() and cancel(), as they'll have to traverse all ent= ries in the > list. > > And all that - for empty from user perspective alarm_list > > So I still prefer Michal's way. > > After all, it doesn't look that complicated to me. > Except that the need for Michals fix arose from the fact that we have two= free > locations that might both get called depending on the situation. Thats w= hat I'm > trying to address here, the complexity itself, rather than the fix (which= I > agree is perfectly valid). >=20 > > BTW, any particular reason you are so negative about pthread_self()? > > > Nothing specifically against it (save for its inverted return code sense,= which > made it difficult for me to parse when reviewing). Its more the complexi= ty > itself in the alarm cancel and callback routine that I was looking at. G= iven > that the origional bug happened because an cancel in a callback might pro= duce a > double free, I wanted to fix it by simpifying the code, not adding condit= ions > which make it more complex. >=20 > You know, looking at it, something else just occured to me. I think this= could > all be fixed without the cancel flag or the pthread_self assignments. Wh= at if > we simply removed the alarm from the list before we called the callback i= n > rte_eal_alarm_callback()? That way any cancel operation called from with= in the > callback would fail, as it wouldn't appear on the list, and the callback > operation would be the only freeing entity? That would let you still hav= e a > fast set and cancel, and avoid the race. Thoughts? Untested sample patc= h > below >=20 >=20 > > > > > > It also seems like the alarm api as a whole could use some improvemen= t. > The > > > way its written right now, theres no way to refer to a specific alarm= (i.e. > > > cancelation relies on the specification of a function and data pointe= r, which > > > may refer to multiple timers). Shouldn't rte_eal_alarm_set return an= opaque > > > handle to a unique timer instance that can be store by a caller and u= sed to > > > specfically cancel that timer? Thats how both the bsd and linux time= r > > > subsystems model timers. > > > > Yeh, alarm API looks a bit unusual. > > Though, I suppose that's subject for another patch/discussion :) > > > Yes, agreed :) >=20 Please read quoted message bellow: > > > > > > His solution *does* eliminate race condition too. > > > I applied his patch. And here is the problem > 1 rte_spinlock_lock(&alarm_list_lk); > 2 while ((ap =3D LIST_FIRST(&alarm_list)) !=3DNULL && > 3 gettimeofday(&now, NULL) =3D=3D 0 && > 4 (ap->time.tv_sec < now.tv_sec || (ap->time.tv_sec =3D=3D > now.tv_sec && > 5 ap->time.tv_usec <=3D > now.tv_usec))){ > 6 ap->executing |=3D ALARM_EXECUTING; > 7 if (likely(!(ap->executing & ALARM_CANCELLED))) { > 8 rte_spinlock_unlock(&alarm_list_lk); > 9 //another thread: rte_alarm_cancel called, ma= rk this timer > canceled and exit ( THE RACE) > 10 ap->cb_fn(ap->cb_arg); // rte_alarm_set called > (THE RACE) > 11 > 12 rte_spinlock_lock(&alarm_list_lk); > 13 } > 14 > 15 rte_spinlock_lock(&alarm_list_lk); > 16 LIST_REMOVE(ap, next); > 17 rte_free(ap); > 18 } >=20 > Imagine >=20 > Thread 1: Thread2 > Execute eal_alarm_callback > Lock list at 1 rte_alarm_cancel -> bloc= k on spinlock > .... > Realease lock at line 8 rte_alarm_cancel -> resumes exec= ution -> it > mark this timer canceled > ap->cb_fn is called at line 10 rte_alarm_cancel exits reporting all= alarms are > canceled and not executing (which is not true!) > rte_alarm_set is called > to rearm itself (THE RACE) >=20 > He only mark timer to * do not execute* but does not assure that it is no= t > executing while canceling. > Race condition is made by unlocking at line 8 and our solution workaround= s this > by looping until all alarms finish execution then cancel what left after = callback > finish (*including rearmed alarms*). > Real elimination of the race would be by using recursive locking when *ot= her > thread can't* access the list *while callback* is running but callback *i= tself can > by using recursive locking*. >=20 > Maybe I don't see something obvious? :) >=20 > Pawel