From: Thomas Monjalon
To: "Richardson, Bruce"
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
Date: Mon, 30 Mar 2015 19:11:24 +0200
Message-ID: <1597671.99gAEorbHa@xps13>
Organization: 6WIND
In-Reply-To: <59AF69C657FD0841A61C55336867B5B0344F112F@IRSMSX103.ger.corp.intel.com>
References: <1427404494-27256-1-git-send-email-bruce.richardson@intel.com>
 <20150327164358.GI5375@hmsreliant.think-freely.org>
 <59AF69C657FD0841A61C55336867B5B0344F112F@IRSMSX103.ger.corp.intel.com>
List-Id: patches and discussions about DPDK

2015-03-27 16:56, Richardson, Bruce:
>
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Friday, March 27, 2015 4:44 PM
> > To: Richardson, Bruce
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
> >
> > On Fri, Mar 27, 2015 at 02:55:27PM +0000, Bruce Richardson wrote:
> > > On Fri, Mar 27, 2015 at 10:38:41AM -0400, Neil Horman wrote:
> > > > On Fri, Mar 27, 2015 at 02:30:50PM +0000, Bruce Richardson wrote:
> > > > > On Fri, Mar 27, 2015 at 10:07:35AM -0400, Neil Horman wrote:
> > > > > > On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
> > > > > > > On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> > > > > > > > On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
> > > > > > > > > The logic used in the condition check
> > > > > > > > > before freeing an mbuf is sometimes confusing, so explain it in a proper comment.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Bruce Richardson
> > > > > > > > >
> > > > > > > > > ---
> > > > > > > > >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> > > > > > > > >  1 file changed, 10 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > index 17ba791..0265172 100644
> > > > > > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > > > > > >  {
> > > > > > > > >      __rte_mbuf_sanity_check(m, 0);
> > > > > > > > >
> > > > > > > > > +    /*
> > > > > > > > > +     * Check to see if this is the last reference to the mbuf.
> > > > > > > > > +     * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> > > > > > > > > +     * the call to "refcnt_update" is a very expensive operation, so we
> > > > > > > > > +     * don't want to call it in the case where we know we are the holder
> > > > > > > > > +     * of the last reference to this mbuf i.e. ref_cnt == 1.
> > > > > > > > > +     * If however, ref_cnt != 1, it's still possible that we may still be
> > > > > > > > > +     * the final decrementer of the count, so we need to check that
> > > > > > > > > +     * result also, to make sure the mbuf is freed properly.
> > > > > > > > > +     */
> > > > > > > > >      if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> > > > > > > > >              likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.1.0
> > > > > > > > >
> > > > > > > >
> > > > > > > > NAK
> > > > > > > > the comment is incorrect, a return code of 1 from
> > > > > > > > rte_mbuf_refcnt_read doesn't guarantee you are the last
> > > > > > > > holder of the buffer if two contexts have a pointer to it.
> > > > > > > If two threads have pointers to it, and are both going to free
> > > > > > > it, the refcnt must be 2 not one, otherwise the refcnt is
> > > > > > > meaningless.
> > > > > > >
> > > > > >
> > > > > > What about the other concrete case that I illustrated, where one
> > > > > > context is attempting to increment the refcount, while the other
> > > > > > is decrementing it with the intention to free? By making the
> > > > > > read and set operation distinct here you've broken the
> > > > > > atomicity of the read and update logic that atomics are there for
> > > > > > and created a race condition. I don't know how else to explain
> > > > > > this to you.
> > > > > > if(atomic_read == 1) then atomic_set(0), breaks the entire
> > > > > > notion of what atomics are meant to do (namely update and read
> > > > > > state as an atomic unit), you just can't get away with not
> > > > > > having that atomicity here. If you could, you might as well be
> > > > > > using plain integers for the reference count, as you're not using
> > > > > > the atomic properties of the type.
> > > > > >
> > > > > > Neil
> > > > >
> > > > > I disagree.
> > > > >
> > > > > A value of one indicates that there is only one owner of the
> > > > > mbuf, and therefore since we are in the free routine, we are that
> > > > > owner. If there are to be two owners, the refcnt must be
> > > > > incremented before handing over the pointer to the other thread -
> > > > > to get to the example you make. If that does not occur, we can
> > > > > also have the situation where the "sending" thread calls free -
> > > > > and therefore this function - before the other thread receives the
> > > > > pointer. In that case, we will have the receiving thread getting a
> > > > > pointer to an mbuf which is now invalid as it has been put back
> > > > > into the mempool.
> > > > >
> > > > > Again, in short, if refcnt == 1, there is only one mbuf owner.
> > > > > If refcnt == 1 and we are currently executing in prefree_seg, we are
> > > > > the owner and no other thread is allowed to muck about with the mbuf.
> > > > >
> > > > Then the question remains, why aren't you just using ints here?
> > > > What is the purpose of even bothering with atomics, if you don't
> > > > feel like you need any reliance on the atomic set and read state,
> > > > which it was created for??
> > > >
> > > > Neil
> > > >
> > > Because for the case where refcnt != 1, you need the atomics. If you
> > > have two threads using the mbuf and refcnt is 2, both of them
> > > simultaneously can hand over their copies to two more threads. In that
> > > case, we need to guarantee refcnt to be 4, so we need to use atomics.
> > > Similarly, if both threads attempt to free at the same time, we need
> > > to ensure that only one of them actually returns the buf to the mempool
> > > - hence the atomic decrement and return value check.
> > >
> > > /Bruce
> >
> > Sigh, ok, so that makes some sense. This thing is entirely for the
> > purposes of special casing the single use case? That seems like a lot of
> > effort and confusion to go through for this. Perhaps macrotizing it for
> > multiple use cases would clarify it:
> > #define mbuf_orphaned(mbuf) atomic_ref_read(mbuf)==1 || atomic_ref_dec(mbuf)==0
>
> Yes, we could, except it's not "orphaned" since it has got a single thread
> owner, and this is the normal use-case we are special-casing.
> The comment should adequately cover things, I think, and for cases where it
> doesn't we now have this thread to refer to also. :-)
>
> >
> > regardless, you've convinced me that it's not broken.
> > Acked-by: Neil Horman
>
> Thanks,
> /Bruce

Applied, thanks
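
For readers of the archive, the double check agreed on above can be sketched outside DPDK. This is only an illustration under stated assumptions: it uses C11 <stdatomic.h> rather than DPDK's own atomics, and struct buf and buf_prefree() are hypothetical stand-ins for struct rte_mbuf and __rte_pktmbuf_prefree_seg(), not the library's code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rte_mbuf; only the refcount matters here. */
struct buf {
    atomic_int refcnt;
};

/*
 * Returns true when the caller held the last reference and may return
 * the buffer to its pool. Hypothetical name, modelled on the condition
 * quoted in the patch.
 */
static bool buf_prefree(struct buf *b)
{
    /*
     * Fast path: a count of 1 means the caller is the only owner, so no
     * other thread may legitimately touch the count and a plain read is
     * enough. The expensive read-modify-write is skipped.
     */
    if (atomic_load_explicit(&b->refcnt, memory_order_relaxed) == 1)
        return true;

    /*
     * Slow path: the buffer is shared. atomic_fetch_sub returns the
     * previous value, so exactly one of several racing callers sees the
     * count reach 0 and frees the buffer.
     */
    return atomic_fetch_sub(&b->refcnt, 1) - 1 == 0;
}
```

With a count of 2, the first call decrements to 1 and returns false; the second call then takes the fast path and returns true. The sketch leaves the count at 1 on the fast path, so resetting it before the buffer is reused is left to the caller.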