From: "Wodkowski, PawelX" <pawelx.wodkowski@intel.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:
Date: Fri, 26 Sep 2014 12:37:54 +0000 [thread overview]
Message-ID: <F6F2A6264E145F47A18AB6DF8E87425D12B39EB5@IRSMSX102.ger.corp.intel.com> (raw)
In-Reply-To: <20140926114630.GA3930@hmsreliant.think-freely.org>
> So basically cancel() just set ALARM_CANCELLED and leaves actual alarm
> deletion to the callback()?
> That was the thought, yes.
>
> > I think it is doable - but I don't see any real advantage with that approach.
> > Yes, code will become a bit simpler, as we'll have one point when we remove
> alarm from the list.
> Yes, that would be the advantage, that the code would be much simpler.
>
> > But from other side, imagine such simple test-case:
> >
> > for (i = 0; i < 0x100000; i++) {
> > rte_eal_alarm_set(ONE_MIN, cb_func, (void *)i);
> > rte_eal_alarm_cancel(cb_func, (void *)i);
> > }
> >
> > We'll endup with 1M of cancelled, but still not removed entries in the
> alarm_list.
> > With current implementation that means - few MBs of wasted memory,
> Thats correct, and the tradeoff to choose between. Do you want simpler code
> that is easier to maintain, or do you want a high speed cancel and set
> operation. I'm not aware of all the use cases, but I have a hard time seeing
> a use case in which the in-flight alarm list grows unboundedly large, which in
> my mind mitigates the risk of deferred removal, but I'm perfectly willing to
> believe that there are use cases which I'm not aware of.
>
> > plus very slow set() and cancel(), as they'll have to traverse all entries in the
> list.
> > And all that - for empty from user perspective alarm_list
> > So I still prefer Michal's way.
> > After all, it doesn't look that complicated to me.
> Except that the need for Michals fix arose from the fact that we have two free
> locations that might both get called depending on the situation. Thats what I'm
> trying to address here, the complexity itself, rather than the fix (which I
> agree is perfectly valid).
>
> > BTW, any particular reason you are so negative about pthread_self()?
> >
> Nothing specifically against it (save for its inverted return code sense, which
> made it difficult for me to parse when reviewing). Its more the complexity
> itself in the alarm cancel and callback routine that I was looking at. Given
> that the origional bug happened because an cancel in a callback might produce a
> double free, I wanted to fix it by simpifying the code, not adding conditions
> which make it more complex.
>
> You know, looking at it, something else just occured to me. I think this could
> all be fixed without the cancel flag or the pthread_self assignments. What if
> we simply removed the alarm from the list before we called the callback in
> rte_eal_alarm_callback()? That way any cancel operation called from within the
> callback would fail, as it wouldn't appear on the list, and the callback
> operation would be the only freeing entity? That would let you still have a
> fast set and cancel, and avoid the race. Thoughts? Untested sample patch
> below
>
>
> > >
> > > It also seems like the alarm api as a whole could use some improvement.
> The
> > > way its written right now, theres no way to refer to a specific alarm (i.e.
> > > cancelation relies on the specification of a function and data pointer, which
> > > may refer to multiple timers). Shouldn't rte_eal_alarm_set return an opaque
> > > handle to a unique timer instance that can be store by a caller and used to
> > > specfically cancel that timer? Thats how both the bsd and linux timer
> > > subsystems model timers.
> >
> > Yeh, alarm API looks a bit unusual.
> > Though, I suppose that's subject for another patch/discussion :)
> >
> Yes, agreed :)
>
Please read quoted message bellow:
> >
> >
> > His solution *does* eliminate race condition too.
> >
> I applied his patch. And here is the problem
> 1 rte_spinlock_lock(&alarm_list_lk);
> 2 while ((ap = LIST_FIRST(&alarm_list)) !=NULL &&
> 3 gettimeofday(&now, NULL) == 0 &&
> 4 (ap->time.tv_sec < now.tv_sec || (ap->time.tv_sec ==
> now.tv_sec &&
> 5 ap->time.tv_usec <=
> now.tv_usec))){
> 6 ap->executing |= ALARM_EXECUTING;
> 7 if (likely(!(ap->executing & ALARM_CANCELLED))) {
> 8 rte_spinlock_unlock(&alarm_list_lk);
> 9 //another thread: rte_alarm_cancel called, mark this timer
> canceled and exit ( THE RACE)
> 10 ap->cb_fn(ap->cb_arg); // rte_alarm_set called
> (THE RACE)
> 11
> 12 rte_spinlock_lock(&alarm_list_lk);
> 13 }
> 14
> 15 rte_spinlock_lock(&alarm_list_lk);
> 16 LIST_REMOVE(ap, next);
> 17 rte_free(ap);
> 18 }
>
> Imagine
>
> Thread 1: Thread2
> Execute eal_alarm_callback
> Lock list at 1 rte_alarm_cancel -> block on spinlock
> ....
> Realease lock at line 8 rte_alarm_cancel -> resumes execution -> it
> mark this timer canceled
> ap->cb_fn is called at line 10 rte_alarm_cancel exits reporting all alarms are
> canceled and not executing (which is not true!)
> rte_alarm_set is called
> to rearm itself (THE RACE)
>
> He only mark timer to * do not execute* but does not assure that it is not
> executing while canceling.
> Race condition is made by unlocking at line 8 and our solution workarounds this
> by looping until all alarms finish execution then cancel what left after callback
> finish (*including rearmed alarms*).
> Real elimination of the race would be by using recursive locking when *other
> thread can't* access the list *while callback* is running but callback *itself can
> by using recursive locking*.
>
> Maybe I don't see something obvious? :)
>
> Pawel
next prev parent reply other threads:[~2014-09-26 12:32 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-25 12:56 Michal Jastrzebski
2014-09-25 13:11 ` Ananyev, Konstantin
2014-09-25 15:08 ` Neil Horman
2014-09-25 16:03 ` Ananyev, Konstantin
2014-09-25 17:23 ` Neil Horman
2014-09-25 23:24 ` Ananyev, Konstantin
2014-09-26 11:46 ` Neil Horman
2014-09-26 12:37 ` Wodkowski, PawelX [this message]
2014-09-26 13:40 ` Neil Horman
2014-09-26 14:01 ` Wodkowski, PawelX
2014-09-26 15:01 ` Neil Horman
2014-09-26 15:41 ` Ananyev, Konstantin
2014-09-26 16:21 ` Neil Horman
2014-09-26 18:07 ` Ananyev, Konstantin
2014-09-26 19:39 ` Neil Horman
2014-09-28 16:12 ` Ananyev, Konstantin
2014-09-28 20:47 ` Neil Horman
2014-09-29 6:40 ` Wodkowski, PawelX
2014-09-29 9:50 ` Ananyev, Konstantin
2014-09-29 10:11 ` Wodkowski, PawelX
2014-09-29 10:33 ` Bruce Richardson
2014-09-30 11:13 ` Wodkowski, PawelX
2014-09-30 12:05 ` Wodkowski, PawelX
2014-09-30 12:30 ` Ananyev, Konstantin
2014-09-30 12:54 ` Neil Horman
2014-09-29 11:35 ` Neil Horman
2014-09-26 14:13 ` Ananyev, Konstantin
2014-09-29 10:37 ` Bruce Richardson
2014-09-26 6:33 ` Wodkowski, PawelX
2014-09-26 9:49 ` Wodkowski, PawelX
2014-09-26 13:43 ` Neil Horman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=F6F2A6264E145F47A18AB6DF8E87425D12B39EB5@IRSMSX102.ger.corp.intel.com \
--to=pawelx.wodkowski@intel.com \
--cc=dev@dpdk.org \
--cc=nhorman@tuxdriver.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).