DPDK patches and discussions
From: Neil Horman <nhorman@tuxdriver.com>
To: "Wodkowski, PawelX" <pawelx.wodkowski@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:
Date: Mon, 29 Sep 2014 07:35:59 -0400
Message-ID: <20140929113559.GA26483@hmsreliant.think-freely.org>
In-Reply-To: <F6F2A6264E145F47A18AB6DF8E87425D12B3A80F@IRSMSX102.ger.corp.intel.com>

On Mon, Sep 29, 2014 at 10:11:38AM +0000, Wodkowski, PawelX wrote:
> > >
> > > Imagine how you will be damned by someone who does not even notice your
> > > change and is managing some kind of resource based on the returned number of
> > > set/canceled timers. If you suddenly start returning negative values, how will
> > > those applications behave? Silently changing the returned value domain is evil
> > > in its pure form.
> > 
> > As I can see the impact is very limited.
> 
> The impact on DPDK is small, but it can be huge for a user application.
> Ex:
> If someone uses this kind of expression in a callback (skipping the user app serialization part):
> callback () {
> ...
> some_simple_semaphore += rte_eal_alarm_cancel(...);

This code would be broken to begin with, as rte_eal_alarm_cancel is already
written to return negative return codes.  It's not documented as such, but it's
still the case.  Note that if you run an application built against a shared
library on BSD, the definition of rte_eal_alarm_cancel returns -ENOTSUP.  The
above code would be broken because it doesn't account for that.  You can argue
that the documentation should be updated, but the DPDK in the wild already
conforms to the model Konstantin and I are proposing.

> ...
> }
> 
> Anywhere in the code:
> ...
> if (some_simple_semaphore) {
> 	some_simple_semaphore--;
> 	if (rte_eal_alarm_set(...) != 0)
> 		some_simple_semaphore++;
> }
> ...
> 
> 1. Do you notice the change in cancel function?
The application crashes, or otherwise misbehaves.

> 2. How many hours you spend to find this issue in case of big app/system?
You don't.  Such a problem as you describe would very likely result in a
semaphore deadlock, as the count would be incorrectly lowered, so you put
watches on the variable, note that sometimes the count goes down on a cancel,
which is completely counter-intuitive, read the updated documentation that
indicates error codes are possible (which you should have been prepared for
anyway), and move on with your day.
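To make the failure mode concrete, here is a minimal sketch of the accounting
problem and one way to guard against it.  It assumes only what is stated above:
that a cancel call may return a negative errno value such as -ENOTSUP instead
of a count.  mock_alarm_cancel, naive_update and guarded_update are
hypothetical names used for illustration; none of them are part of DPDK.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a cancel call: returns either a count of
 * cancelled alarms (>= 0) or a negative errno value.  Not a DPDK function. */
static int mock_alarm_cancel(int simulated_result)
{
	return simulated_result;
}

/* Naive accounting, as in the quoted example: the raw return value is
 * added to the counter, so an error code silently lowers the count. */
static int naive_update(int count, int ret)
{
	return count + ret;
}

/* Guarded accounting: a negative return is treated as an error and leaves
 * the counter untouched; the positive errno value is reported via *err. */
static int guarded_update(int count, int ret, int *err)
{
	assert(err != NULL);

	if (ret < 0) {
		*err = -ret;
		return count;
	}
	*err = 0;
	return count + ret;
}
```

With a starting count of 2 and a simulated return of -ENOTSUP, naive_update
yields a value below 2, while guarded_update leaves the count at 2 and sets
*err to ENOTSUP.  The caller can then decide how to handle the error rather
than corrupting the counter.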

> 
> > Only code that checks for (rte_eal_alarm_cancel(...) == 0 / != 0) inside an
> > alarm callback function might be affected.
> > On the other hand, there could indeed be situations where the caller needs to
> > know whether the alarm was successfully cancelled or not, and if not, for
> > what reason.
> > 
> 
> I can extend the rte alarm API to add alarm state checking in the next patch, but
> for now, since this is not urgent, I think the original patch v2 should be enough.
I re-assert my original argument here: without the above change, you haven't
really fixed the race.  If you can find another way to do it, that's fine with
me, but keep in mind once again that some implementations of rte_eal_alarm_set
already do what's being proposed.

Neil

> 
> Pawel
> 
