DPDK patches and discussions
 help / color / mirror / Atom feed
From: Thomas Monjalon <thomas.monjalon@6wind.com>
To: "Richardson, Bruce" <bruce.richardson@intel.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
Date: Mon, 30 Mar 2015 19:11:24 +0200	[thread overview]
Message-ID: <1597671.99gAEorbHa@xps13> (raw)
In-Reply-To: <59AF69C657FD0841A61C55336867B5B0344F112F@IRSMSX103.ger.corp.intel.com>

2015-03-27 16:56, Richardson, Bruce:
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Friday, March 27, 2015 4:44 PM
> > To: Richardson, Bruce
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing
> > code
> > 
> > On Fri, Mar 27, 2015 at 02:55:27PM +0000, Bruce Richardson wrote:
> > > On Fri, Mar 27, 2015 at 10:38:41AM -0400, Neil Horman wrote:
> > > > On Fri, Mar 27, 2015 at 02:30:50PM +0000, Bruce Richardson wrote:
> > > > > On Fri, Mar 27, 2015 at 10:07:35AM -0400, Neil Horman wrote:
> > > > > > On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
> > > > > > > On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> > > > > > > > On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson
> > wrote:
> > > > > > > > > The logic used in the condition check before freeing an
> > > > > > > > > mbuf is sometimes confusing, so explain it in a proper
> > comment.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Bruce Richardson
> > > > > > > > > <bruce.richardson@intel.com>
> > > > > > > > > ---
> > > > > > > > >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> > > > > > > > >  1 file changed, 10 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > b/lib/librte_mbuf/rte_mbuf.h index 17ba791..0265172 100644
> > > > > > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct
> > > > > > > > > rte_mbuf *m)  {
> > > > > > > > >  	__rte_mbuf_sanity_check(m, 0);
> > > > > > > > >
> > > > > > > > > +	/*
> > > > > > > > > +	 * Check to see if this is the last reference to the
> > mbuf.
> > > > > > > > > +	 * Note: the double check here is deliberate. If the
> > ref_cnt is "atomic"
> > > > > > > > > +	 * the call to "refcnt_update" is a very expensive
> > operation, so we
> > > > > > > > > +	 * don't want to call it in the case where we know we
> > are the holder
> > > > > > > > > +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> > > > > > > > > +	 * If however, ref_cnt != 1, it's still possible that we
> > may still be
> > > > > > > > > +	 * the final decrementer of the count, so we need to
> > check that
> > > > > > > > > +	 * result also, to make sure the mbuf is freed properly.
> > > > > > > > > +	 */
> > > > > > > > >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> > > > > > > > >  			likely (rte_mbuf_refcnt_update(m, -1) == 0))
> > {
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.1.0
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > > NAK
> > > > > > > >  the comment is incorrect, a return code of 1 from
> > > > > > > > rte_mbuf_refcnt_read doesn't guarantee you are the last
> > > > > > > > holder of the buffer if two contexts have a pointer to it.
> > > > > > > If two threads have pointers to it, and are both going to free
> > > > > > > it, the refcnt must be 2 not one, otherwise the refcnt is
> > meaningless.
> > > > > > >
> > > > > >
> > > > > > What about the other concrete case that I illustrated, where one
> > > > > > context is attempting to increment the refcount, while the other
> > > > > > is decrementing it with the intention to free?  By making the
> > > > > > read and set operation disctinct here you've broken the
> > > > > > atomicity of the read and update logic that atomics are there for
> > and created a race condition.  I don't know how else to explain this to
> > you.
> > > > > > if(atomic_read == 1) then atomic_set(0), breaks the entire
> > > > > > notion of what atomics are meant to do (namely update and read
> > > > > > state as an atomic unit), you just can't get away with not
> > > > > > having that atomicity here.  If you could, you might as well be
> > > > > > using plain integers for the reference count, as you're not using
> > the atomic properties of the type.
> > > > > >
> > > > > > Neil
> > > > >
> > > > > I disagree.
> > > > >
> > > > > A value of one, indicates that there is only one owner of the
> > > > > mbuf, and therefore since we are in the free routine, we are that
> > > > > owner. If there are to be two owners, the refcnt must be
> > > > > incremented before handing over the pointer to the other thread -
> > > > > to get to the example you make. If that does not occur, we can
> > > > > also have the situation where the "sending" thread calls free -
> > > > > and therefore this function - before the other thread receives the
> > > > > pointer. In that case, we will have the receiving thread getting a
> > > > > pointer to an mbuf which is now invalid as it has been put back
> > > > > into the mempool
> > > > >
> > > > > Again, in short, if refcnt == 1, there is only one mbuf owner. If
> > > > > refcnt == 1 and we are currently executing in prefree_seg, we are
> > > > > the owner and no other thread is allow to muck about with the mbuf.
> > > > >
> > > > Then the question remains, why aren't you just using ints here?
> > > > What is the purpose of even bothering with atomics, if you don't
> > > > feel like you need any reliance on the atomic set and read state,
> > which it was created for??
> > > >
> > > > Neil
> > >
> > > Because for the case where refcnt != 1, you need the atomics. If you
> > > have two threads using the mbuf and refcnt is 2, both of them
> > > simultaneously can hand over their copies to two more threads. In that
> > > case, we need to guarantee refcnt to be 4, so we need to use atomics.
> > > Similarly, if both threads attempt to free at the same time, we need
> > > to ensure that only one of them actually returns the buf to the mempool
> > - hence the atomic decrement and return value check.
> > >
> > > /Bruce
> > 
> > Sigh, ok, so that makes some sense.  This thing is entirely for the
> > purposes of special casing the single use case?  That seems like alot of
> > effort and confusion to go through for this.  Perhaps macrotizing it for
> > multiple use cases would clarify it:
> > #define mbuf_orphaned(mbuf) atomic_ref_read(mbuf)==1 ||
> > atomic_ref_dec(mbuf)==0
> 
> Yes, we could, except it's not "orphaned" since it has got a single thread owner, and this is the normal use-case we are special-casing. 
> The comment should adequately cover things, I think, and for cases where it doesn't we now have this thread to refer to also. :-)
> 
> > 
> > regardless, you've convinced me that its not broken.
> > Acked-by: Neil Horman <nhorman@tuxdriver.com>
> 
> Thanks,
> /Bruce

Applied, thanks

  reply	other threads:[~2015-03-30 17:12 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-26 21:14 Bruce Richardson
2015-03-26 21:31 ` Olivier MATZ
2015-03-27 10:29 ` Neil Horman
2015-03-27 10:49   ` Ananyev, Konstantin
2015-03-27 11:32   ` Bruce Richardson
2015-03-27 12:42     ` Neil Horman
2015-03-27 14:07     ` Neil Horman
2015-03-27 14:30       ` Bruce Richardson
2015-03-27 14:38         ` Neil Horman
2015-03-27 14:55           ` Bruce Richardson
2015-03-27 16:43             ` Neil Horman
2015-03-27 16:56               ` Richardson, Bruce
2015-03-30 17:11                 ` Thomas Monjalon [this message]
2015-03-30 17:39                 ` Don Provan
2015-03-30 18:15                   ` Stephen Hemminger
2015-03-31 12:33                   ` Zoltan Kiss
2015-03-30 19:39         ` Marc Sune
2015-03-30 20:26           ` Neil Horman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1597671.99gAEorbHa@xps13 \
    --to=thomas.monjalon@6wind.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).