DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
@ 2015-03-26 21:14 Bruce Richardson
  2015-03-26 21:31 ` Olivier MATZ
  2015-03-27 10:29 ` Neil Horman
  0 siblings, 2 replies; 18+ messages in thread
From: Bruce Richardson @ 2015-03-26 21:14 UTC (permalink / raw)
  To: dev

The logic used in the condition check before freeing an mbuf is
sometimes confusing, so explain it in a proper comment.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 17ba791..0265172 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
 {
 	__rte_mbuf_sanity_check(m, 0);
 
+	/*
+	 * Check to see if this is the last reference to the mbuf.
+	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
+	 * the call to "refcnt_update" is a very expensive operation, so we
+	 * don't want to call it in the case where we know we are the holder
+	 * of the last reference to this mbuf i.e. ref_cnt == 1.
+	 * If however, ref_cnt != 1, it's still possible that we may still be
+	 * the final decrementer of the count, so we need to check that
+	 * result also, to make sure the mbuf is freed properly.
+	 */
 	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
 			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
 
-- 
2.1.0

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-26 21:14 [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code Bruce Richardson
@ 2015-03-26 21:31 ` Olivier MATZ
  2015-03-27 10:29 ` Neil Horman
  1 sibling, 0 replies; 18+ messages in thread
From: Olivier MATZ @ 2015-03-26 21:31 UTC (permalink / raw)
  To: Bruce Richardson, dev

Hi Bruce,

On 03/26/2015 10:14 PM, Bruce Richardson wrote:
> The logic used in the condition check before freeing an mbuf is
> sometimes confusing, so explain it in a proper comment.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 17ba791..0265172 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
>  {
>  	__rte_mbuf_sanity_check(m, 0);
>  
> +	/*
> +	 * Check to see if this is the last reference to the mbuf.
> +	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> +	 * the call to "refcnt_update" is a very expensive operation, so we
> +	 * don't want to call it in the case where we know we are the holder
> +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> +	 * If however, ref_cnt != 1, it's still possible that we may still be
> +	 * the final decrementer of the count, so we need to check that
> +	 * result also, to make sure the mbuf is freed properly.
> +	 */
>  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
>  			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
>  
> 

Acked-by: Olivier Matz <olivier.matz@6wind.com>

Thanks!

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-26 21:14 [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code Bruce Richardson
  2015-03-26 21:31 ` Olivier MATZ
@ 2015-03-27 10:29 ` Neil Horman
  2015-03-27 10:49   ` Ananyev, Konstantin
  2015-03-27 11:32   ` Bruce Richardson
  1 sibling, 2 replies; 18+ messages in thread
From: Neil Horman @ 2015-03-27 10:29 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
> The logic used in the condition check before freeing an mbuf is
> sometimes confusing, so explain it in a proper comment.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 17ba791..0265172 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
>  {
>  	__rte_mbuf_sanity_check(m, 0);
>  
> +	/*
> +	 * Check to see if this is the last reference to the mbuf.
> +	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> +	 * the call to "refcnt_update" is a very expensive operation, so we
> +	 * don't want to call it in the case where we know we are the holder
> +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> +	 * If however, ref_cnt != 1, it's still possible that we may still be
> +	 * the final decrementer of the count, so we need to check that
> +	 * result also, to make sure the mbuf is freed properly.
> +	 */
>  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
>  			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
>  
> -- 
> 2.1.0
> 
> 

NAK
 the comment is incorrect, a return code of 1 from rte_mbuf_refcnt_read doesn't
guarantee you are the last holder of the buffer if two contexts have a pointer
to it.

Zoltans patch is the correct solution here, expensive or not.  I wrote up my
explination in this thread:
http://dpdk.org/ml/archives/dev/2015-March/015839.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 10:29 ` Neil Horman
@ 2015-03-27 10:49   ` Ananyev, Konstantin
  2015-03-27 11:32   ` Bruce Richardson
  1 sibling, 0 replies; 18+ messages in thread
From: Ananyev, Konstantin @ 2015-03-27 10:49 UTC (permalink / raw)
  To: Neil Horman, Richardson, Bruce; +Cc: dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> Sent: Friday, March 27, 2015 10:30 AM
> To: Richardson, Bruce
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
> 
> On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
> > The logic used in the condition check before freeing an mbuf is
> > sometimes confusing, so explain it in a proper comment.
> >
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 17ba791..0265172 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> >  {
> >  	__rte_mbuf_sanity_check(m, 0);
> >
> > +	/*
> > +	 * Check to see if this is the last reference to the mbuf.
> > +	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> > +	 * the call to "refcnt_update" is a very expensive operation, so we
> > +	 * don't want to call it in the case where we know we are the holder
> > +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> > +	 * If however, ref_cnt != 1, it's still possible that we may still be
> > +	 * the final decrementer of the count, so we need to check that
> > +	 * result also, to make sure the mbuf is freed properly.
> > +	 */
> >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> >  			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
> >
> > --
> > 2.1.0
> >
> >
> 
> NAK
>  the comment is incorrect, a return code of 1 from rte_mbuf_refcnt_read doesn't
> guarantee you are the last holder of the buffer if two contexts have a pointer
> to it.

Comment is absolutely correct.
Zoltan's 'fix' will introduce unnecessary slowdown.

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 
> Zoltans patch is the correct solution here, expensive or not.  I wrote up my
> explination in this thread:
> http://dpdk.org/ml/archives/dev/2015-March/015839.html
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 10:29 ` Neil Horman
  2015-03-27 10:49   ` Ananyev, Konstantin
@ 2015-03-27 11:32   ` Bruce Richardson
  2015-03-27 12:42     ` Neil Horman
  2015-03-27 14:07     ` Neil Horman
  1 sibling, 2 replies; 18+ messages in thread
From: Bruce Richardson @ 2015-03-27 11:32 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
> > The logic used in the condition check before freeing an mbuf is
> > sometimes confusing, so explain it in a proper comment.
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 17ba791..0265172 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> >  {
> >  	__rte_mbuf_sanity_check(m, 0);
> >  
> > +	/*
> > +	 * Check to see if this is the last reference to the mbuf.
> > +	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> > +	 * the call to "refcnt_update" is a very expensive operation, so we
> > +	 * don't want to call it in the case where we know we are the holder
> > +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> > +	 * If however, ref_cnt != 1, it's still possible that we may still be
> > +	 * the final decrementer of the count, so we need to check that
> > +	 * result also, to make sure the mbuf is freed properly.
> > +	 */
> >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> >  			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
> >  
> > -- 
> > 2.1.0
> > 
> > 
> 
> NAK
>  the comment is incorrect, a return code of 1 from rte_mbuf_refcnt_read doesn't
> guarantee you are the last holder of the buffer if two contexts have a pointer
> to it.
If two threads have pointers to it, and are both going to free it, the refcnt
must be 2 not one, otherwise the refcnt is meaningless.

/Bruce

> 
> Zoltans patch is the correct solution here, expensive or not.  I wrote up my
> explination in this thread:
> http://dpdk.org/ml/archives/dev/2015-March/015839.html
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 11:32   ` Bruce Richardson
@ 2015-03-27 12:42     ` Neil Horman
  2015-03-27 14:07     ` Neil Horman
  1 sibling, 0 replies; 18+ messages in thread
From: Neil Horman @ 2015-03-27 12:42 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
> On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> > On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
> > > The logic used in the condition check before freeing an mbuf is
> > > sometimes confusing, so explain it in a proper comment.
> > > 
> > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > ---
> > >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > > 
> > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > index 17ba791..0265172 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > >  {
> > >  	__rte_mbuf_sanity_check(m, 0);
> > >  
> > > +	/*
> > > +	 * Check to see if this is the last reference to the mbuf.
> > > +	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> > > +	 * the call to "refcnt_update" is a very expensive operation, so we
> > > +	 * don't want to call it in the case where we know we are the holder
> > > +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> > > +	 * If however, ref_cnt != 1, it's still possible that we may still be
> > > +	 * the final decrementer of the count, so we need to check that
> > > +	 * result also, to make sure the mbuf is freed properly.
> > > +	 */
> > >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> > >  			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
> > >  
> > > -- 
> > > 2.1.0
> > > 
> > > 
> > 
> > NAK
> >  the comment is incorrect, a return code of 1 from rte_mbuf_refcnt_read doesn't
> > guarantee you are the last holder of the buffer if two contexts have a pointer
> > to it.
> If two threads have pointers to it, and are both going to free it, the refcnt
> must be 2 not one, otherwise the refcnt is meaningless.
> 
Please see my note in zoltans thread.  I illustrate why its important to fix
this in several ways.  Relying of every context to have a refcnt on an object
before having  pointer reference to it is impossible to do without additional
locking.


> /Bruce
> 
> > 
> > Zoltans patch is the correct solution here, expensive or not.  I wrote up my
> > explination in this thread:
> > http://dpdk.org/ml/archives/dev/2015-March/015839.html
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 11:32   ` Bruce Richardson
  2015-03-27 12:42     ` Neil Horman
@ 2015-03-27 14:07     ` Neil Horman
  2015-03-27 14:30       ` Bruce Richardson
  1 sibling, 1 reply; 18+ messages in thread
From: Neil Horman @ 2015-03-27 14:07 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
> On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> > On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
> > > The logic used in the condition check before freeing an mbuf is
> > > sometimes confusing, so explain it in a proper comment.
> > > 
> > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > ---
> > >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > > 
> > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > index 17ba791..0265172 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > >  {
> > >  	__rte_mbuf_sanity_check(m, 0);
> > >  
> > > +	/*
> > > +	 * Check to see if this is the last reference to the mbuf.
> > > +	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> > > +	 * the call to "refcnt_update" is a very expensive operation, so we
> > > +	 * don't want to call it in the case where we know we are the holder
> > > +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> > > +	 * If however, ref_cnt != 1, it's still possible that we may still be
> > > +	 * the final decrementer of the count, so we need to check that
> > > +	 * result also, to make sure the mbuf is freed properly.
> > > +	 */
> > >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> > >  			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
> > >  
> > > -- 
> > > 2.1.0
> > > 
> > > 
> > 
> > NAK
> >  the comment is incorrect, a return code of 1 from rte_mbuf_refcnt_read doesn't
> > guarantee you are the last holder of the buffer if two contexts have a pointer
> > to it.
> If two threads have pointers to it, and are both going to free it, the refcnt
> must be 2 not one, otherwise the refcnt is meaningless.
> 

What about the other concrete case that I illustrated, where one context is
attempting to increment the refcount, while the other is decrementing it with
the intention to free?  By making the read and set operation disctinct here
you've broken the atomicity of the read and update logic that atomics are there
for and created a race condition.  I don't know how else to explain this to you.
if(atomic_read == 1) then atomic_set(0), breaks the entire notion of what
atomics are meant to do (namely update and read state as an atomic unit), you
just can't get away with not having that atomicity here.  If you could, you
might as well be using plain integers for the reference count, as you're not
using the atomic properties of the type.

Neil

> /Bruce
> 
> > 
> > Zoltans patch is the correct solution here, expensive or not.  I wrote up my
> > explination in this thread:
> > http://dpdk.org/ml/archives/dev/2015-March/015839.html
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 14:07     ` Neil Horman
@ 2015-03-27 14:30       ` Bruce Richardson
  2015-03-27 14:38         ` Neil Horman
  2015-03-30 19:39         ` Marc Sune
  0 siblings, 2 replies; 18+ messages in thread
From: Bruce Richardson @ 2015-03-27 14:30 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

On Fri, Mar 27, 2015 at 10:07:35AM -0400, Neil Horman wrote:
> On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
> > On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> > > On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
> > > > The logic used in the condition check before freeing an mbuf is
> > > > sometimes confusing, so explain it in a proper comment.
> > > > 
> > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > ---
> > > >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> > > >  1 file changed, 10 insertions(+)
> > > > 
> > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > index 17ba791..0265172 100644
> > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > >  {
> > > >  	__rte_mbuf_sanity_check(m, 0);
> > > >  
> > > > +	/*
> > > > +	 * Check to see if this is the last reference to the mbuf.
> > > > +	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> > > > +	 * the call to "refcnt_update" is a very expensive operation, so we
> > > > +	 * don't want to call it in the case where we know we are the holder
> > > > +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> > > > +	 * If however, ref_cnt != 1, it's still possible that we may still be
> > > > +	 * the final decrementer of the count, so we need to check that
> > > > +	 * result also, to make sure the mbuf is freed properly.
> > > > +	 */
> > > >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> > > >  			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
> > > >  
> > > > -- 
> > > > 2.1.0
> > > > 
> > > > 
> > > 
> > > NAK
> > >  the comment is incorrect, a return code of 1 from rte_mbuf_refcnt_read doesn't
> > > guarantee you are the last holder of the buffer if two contexts have a pointer
> > > to it.
> > If two threads have pointers to it, and are both going to free it, the refcnt
> > must be 2 not one, otherwise the refcnt is meaningless.
> > 
> 
> What about the other concrete case that I illustrated, where one context is
> attempting to increment the refcount, while the other is decrementing it with
> the intention to free?  By making the read and set operation disctinct here
> you've broken the atomicity of the read and update logic that atomics are there
> for and created a race condition.  I don't know how else to explain this to you.
> if(atomic_read == 1) then atomic_set(0), breaks the entire notion of what
> atomics are meant to do (namely update and read state as an atomic unit), you
> just can't get away with not having that atomicity here.  If you could, you
> might as well be using plain integers for the reference count, as you're not
> using the atomic properties of the type.
> 
> Neil

I disagree. 

A value of one, indicates that there is only one owner of the mbuf,
and therefore since we are in the free routine, we are that owner. If there
are to be two owners, the refcnt must be incremented before handing over the
pointer to the other thread - to get to the example you make. If that does not
occur, we can also have the situation where the "sending" thread calls
free - and therefore this function - before the other thread receives the
pointer. In that case, we will have the receiving thread getting a pointer
to an mbuf which is now invalid as it has been put back into the mempool

Again, in short, if refcnt == 1, there is only one mbuf owner. If refcnt == 1
and we are currently executing in prefree_seg, we are the owner and no other
thread is allow to muck about with the mbuf.

/Bruce

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 14:30       ` Bruce Richardson
@ 2015-03-27 14:38         ` Neil Horman
  2015-03-27 14:55           ` Bruce Richardson
  2015-03-30 19:39         ` Marc Sune
  1 sibling, 1 reply; 18+ messages in thread
From: Neil Horman @ 2015-03-27 14:38 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Fri, Mar 27, 2015 at 02:30:50PM +0000, Bruce Richardson wrote:
> On Fri, Mar 27, 2015 at 10:07:35AM -0400, Neil Horman wrote:
> > On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
> > > On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> > > > On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
> > > > > The logic used in the condition check before freeing an mbuf is
> > > > > sometimes confusing, so explain it in a proper comment.
> > > > > 
> > > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > > ---
> > > > >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> > > > >  1 file changed, 10 insertions(+)
> > > > > 
> > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > > index 17ba791..0265172 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > >  {
> > > > >  	__rte_mbuf_sanity_check(m, 0);
> > > > >  
> > > > > +	/*
> > > > > +	 * Check to see if this is the last reference to the mbuf.
> > > > > +	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> > > > > +	 * the call to "refcnt_update" is a very expensive operation, so we
> > > > > +	 * don't want to call it in the case where we know we are the holder
> > > > > +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> > > > > +	 * If however, ref_cnt != 1, it's still possible that we may still be
> > > > > +	 * the final decrementer of the count, so we need to check that
> > > > > +	 * result also, to make sure the mbuf is freed properly.
> > > > > +	 */
> > > > >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> > > > >  			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
> > > > >  
> > > > > -- 
> > > > > 2.1.0
> > > > > 
> > > > > 
> > > > 
> > > > NAK
> > > >  the comment is incorrect, a return code of 1 from rte_mbuf_refcnt_read doesn't
> > > > guarantee you are the last holder of the buffer if two contexts have a pointer
> > > > to it.
> > > If two threads have pointers to it, and are both going to free it, the refcnt
> > > must be 2 not one, otherwise the refcnt is meaningless.
> > > 
> > 
> > What about the other concrete case that I illustrated, where one context is
> > attempting to increment the refcount, while the other is decrementing it with
> > the intention to free?  By making the read and set operation disctinct here
> > you've broken the atomicity of the read and update logic that atomics are there
> > for and created a race condition.  I don't know how else to explain this to you.
> > if(atomic_read == 1) then atomic_set(0), breaks the entire notion of what
> > atomics are meant to do (namely update and read state as an atomic unit), you
> > just can't get away with not having that atomicity here.  If you could, you
> > might as well be using plain integers for the reference count, as you're not
> > using the atomic properties of the type.
> > 
> > Neil
> 
> I disagree. 
> 
> A value of one, indicates that there is only one owner of the mbuf,
> and therefore since we are in the free routine, we are that owner. If there
> are to be two owners, the refcnt must be incremented before handing over the
> pointer to the other thread - to get to the example you make. If that does not
> occur, we can also have the situation where the "sending" thread calls
> free - and therefore this function - before the other thread receives the
> pointer. In that case, we will have the receiving thread getting a pointer
> to an mbuf which is now invalid as it has been put back into the mempool
> 
> Again, in short, if refcnt == 1, there is only one mbuf owner. If refcnt == 1
> and we are currently executing in prefree_seg, we are the owner and no other
> thread is allow to muck about with the mbuf.
> 
Then the question remains, why aren't you just using ints here?  What is the
purpose of even bothering with atomics, if you don't feel like you need any
reliance on the atomic set and read state, which it was created for??

Neil

> /Bruce
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 14:38         ` Neil Horman
@ 2015-03-27 14:55           ` Bruce Richardson
  2015-03-27 16:43             ` Neil Horman
  0 siblings, 1 reply; 18+ messages in thread
From: Bruce Richardson @ 2015-03-27 14:55 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

On Fri, Mar 27, 2015 at 10:38:41AM -0400, Neil Horman wrote:
> On Fri, Mar 27, 2015 at 02:30:50PM +0000, Bruce Richardson wrote:
> > On Fri, Mar 27, 2015 at 10:07:35AM -0400, Neil Horman wrote:
> > > On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
> > > > On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> > > > > On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
> > > > > > The logic used in the condition check before freeing an mbuf is
> > > > > > sometimes confusing, so explain it in a proper comment.
> > > > > > 
> > > > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > > > ---
> > > > > >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> > > > > >  1 file changed, 10 insertions(+)
> > > > > > 
> > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > > > index 17ba791..0265172 100644
> > > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > > >  {
> > > > > >  	__rte_mbuf_sanity_check(m, 0);
> > > > > >  
> > > > > > +	/*
> > > > > > +	 * Check to see if this is the last reference to the mbuf.
> > > > > > +	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> > > > > > +	 * the call to "refcnt_update" is a very expensive operation, so we
> > > > > > +	 * don't want to call it in the case where we know we are the holder
> > > > > > +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> > > > > > +	 * If however, ref_cnt != 1, it's still possible that we may still be
> > > > > > +	 * the final decrementer of the count, so we need to check that
> > > > > > +	 * result also, to make sure the mbuf is freed properly.
> > > > > > +	 */
> > > > > >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> > > > > >  			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
> > > > > >  
> > > > > > -- 
> > > > > > 2.1.0
> > > > > > 
> > > > > > 
> > > > > 
> > > > > NAK
> > > > >  the comment is incorrect, a return code of 1 from rte_mbuf_refcnt_read doesn't
> > > > > guarantee you are the last holder of the buffer if two contexts have a pointer
> > > > > to it.
> > > > If two threads have pointers to it, and are both going to free it, the refcnt
> > > > must be 2 not one, otherwise the refcnt is meaningless.
> > > > 
> > > 
> > > What about the other concrete case that I illustrated, where one context is
> > > attempting to increment the refcount, while the other is decrementing it with
> > > the intention to free?  By making the read and set operation disctinct here
> > > you've broken the atomicity of the read and update logic that atomics are there
> > > for and created a race condition.  I don't know how else to explain this to you.
> > > if(atomic_read == 1) then atomic_set(0), breaks the entire notion of what
> > > atomics are meant to do (namely update and read state as an atomic unit), you
> > > just can't get away with not having that atomicity here.  If you could, you
> > > might as well be using plain integers for the reference count, as you're not
> > > using the atomic properties of the type.
> > > 
> > > Neil
> > 
> > I disagree. 
> > 
> > A value of one, indicates that there is only one owner of the mbuf,
> > and therefore since we are in the free routine, we are that owner. If there
> > are to be two owners, the refcnt must be incremented before handing over the
> > pointer to the other thread - to get to the example you make. If that does not
> > occur, we can also have the situation where the "sending" thread calls
> > free - and therefore this function - before the other thread receives the
> > pointer. In that case, we will have the receiving thread getting a pointer
> > to an mbuf which is now invalid as it has been put back into the mempool
> > 
> > Again, in short, if refcnt == 1, there is only one mbuf owner. If refcnt == 1
> > and we are currently executing in prefree_seg, we are the owner and no other
> > thread is allow to muck about with the mbuf.
> > 
> Then the question remains, why aren't you just using ints here?  What is the
> purpose of even bothering with atomics, if you don't feel like you need any
> reliance on the atomic set and read state, which it was created for??
> 
> Neil

Because for the case where refcnt != 1, you need the atomics. If you have two
threads using the mbuf and refcnt is 2, both of them simultaneously can hand over
their copies to two more threads. In that case, we need to guarantee refcnt to 
be 4, so we need to use atomics. Similarly, if both threads attempt to free
at the same time, we need to ensure that only one of them actually returns the
buf to the mempool - hence the atomic decrement and return value check.

/Bruce
> 
> > /Bruce
> > 
> > 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 14:55           ` Bruce Richardson
@ 2015-03-27 16:43             ` Neil Horman
  2015-03-27 16:56               ` Richardson, Bruce
  0 siblings, 1 reply; 18+ messages in thread
From: Neil Horman @ 2015-03-27 16:43 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Fri, Mar 27, 2015 at 02:55:27PM +0000, Bruce Richardson wrote:
> On Fri, Mar 27, 2015 at 10:38:41AM -0400, Neil Horman wrote:
> > On Fri, Mar 27, 2015 at 02:30:50PM +0000, Bruce Richardson wrote:
> > > On Fri, Mar 27, 2015 at 10:07:35AM -0400, Neil Horman wrote:
> > > > On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
> > > > > On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> > > > > > On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
> > > > > > > The logic used in the condition check before freeing an mbuf is
> > > > > > > sometimes confusing, so explain it in a proper comment.
> > > > > > > 
> > > > > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > > > > ---
> > > > > > >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> > > > > > >  1 file changed, 10 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > > > > index 17ba791..0265172 100644
> > > > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > > > >  {
> > > > > > >  	__rte_mbuf_sanity_check(m, 0);
> > > > > > >  
> > > > > > > +	/*
> > > > > > > +	 * Check to see if this is the last reference to the mbuf.
> > > > > > > +	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> > > > > > > +	 * the call to "refcnt_update" is a very expensive operation, so we
> > > > > > > +	 * don't want to call it in the case where we know we are the holder
> > > > > > > +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> > > > > > > +	 * If however, ref_cnt != 1, it's still possible that we may still be
> > > > > > > +	 * the final decrementer of the count, so we need to check that
> > > > > > > +	 * result also, to make sure the mbuf is freed properly.
> > > > > > > +	 */
> > > > > > >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> > > > > > >  			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
> > > > > > >  
> > > > > > > -- 
> > > > > > > 2.1.0
> > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > > NAK
> > > > > >  the comment is incorrect, a return code of 1 from rte_mbuf_refcnt_read doesn't
> > > > > > guarantee you are the last holder of the buffer if two contexts have a pointer
> > > > > > to it.
> > > > > If two threads have pointers to it, and are both going to free it, the refcnt
> > > > > must be 2 not one, otherwise the refcnt is meaningless.
> > > > > 
> > > > 
> > > > What about the other concrete case that I illustrated, where one context is
> > > > attempting to increment the refcount, while the other is decrementing it with
> > > > the intention to free?  By making the read and set operation disctinct here
> > > > you've broken the atomicity of the read and update logic that atomics are there
> > > > for and created a race condition.  I don't know how else to explain this to you.
> > > > if(atomic_read == 1) then atomic_set(0), breaks the entire notion of what
> > > > atomics are meant to do (namely update and read state as an atomic unit), you
> > > > just can't get away with not having that atomicity here.  If you could, you
> > > > might as well be using plain integers for the reference count, as you're not
> > > > using the atomic properties of the type.
> > > > 
> > > > Neil
> > > 
> > > I disagree. 
> > > 
> > > A value of one, indicates that there is only one owner of the mbuf,
> > > and therefore since we are in the free routine, we are that owner. If there
> > > are to be two owners, the refcnt must be incremented before handing over the
> > > pointer to the other thread - to get to the example you make. If that does not
> > > occur, we can also have the situation where the "sending" thread calls
> > > free - and therefore this function - before the other thread receives the
> > > pointer. In that case, we will have the receiving thread getting a pointer
> > > to an mbuf which is now invalid as it has been put back into the mempool
> > > 
> > > Again, in short, if refcnt == 1, there is only one mbuf owner. If refcnt == 1
> > > and we are currently executing in prefree_seg, we are the owner and no other
> > > thread is allow to muck about with the mbuf.
> > > 
> > Then the question remains, why aren't you just using ints here?  What is the
> > purpose of even bothering with atomics, if you don't feel like you need any
> > reliance on the atomic set and read state, which it was created for??
> > 
> > Neil
> 
> Because for the case where refcnt != 1, you need the atomics. If you have two
> threads using the mbuf and refcnt is 2, both of them simultaneously can hand over
> their copies to two more threads. In that case, we need to guarantee refcnt to 
> be 4, so we need to use atomics. Similarly, if both threads attempt to free
> at the same time, we need to ensure that only one of them actually returns the
> buf to the mempool - hence the atomic decrement and return value check.
> 
> /Bruce

Sigh, ok, so that makes some sense.  This thing is entirely for the purposes of
special casing the single use case?  That seems like alot of effort and
confusion to go through for this.  Perhaps macrotizing it for multiple use cases
would clarify it:
#define mbuf_orphaned(mbuf) atomic_ref_read(mbuf)==1 || atomic_ref_dec(mbuf)==0

regardless, you've convinced me that its not broken.
Acked-by: Neil Horman <nhorman@tuxdriver.com>
> > 
> > > /Bruce
> > > 
> > > 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 16:43             ` Neil Horman
@ 2015-03-27 16:56               ` Richardson, Bruce
  2015-03-30 17:11                 ` Thomas Monjalon
  2015-03-30 17:39                 ` Don Provan
  0 siblings, 2 replies; 18+ messages in thread
From: Richardson, Bruce @ 2015-03-27 16:56 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev



> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Friday, March 27, 2015 4:44 PM
> To: Richardson, Bruce
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing
> code
> 
> On Fri, Mar 27, 2015 at 02:55:27PM +0000, Bruce Richardson wrote:
> > On Fri, Mar 27, 2015 at 10:38:41AM -0400, Neil Horman wrote:
> > > On Fri, Mar 27, 2015 at 02:30:50PM +0000, Bruce Richardson wrote:
> > > > On Fri, Mar 27, 2015 at 10:07:35AM -0400, Neil Horman wrote:
> > > > > On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
> > > > > > On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> > > > > > > On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson
> wrote:
> > > > > > > > The logic used in the condition check before freeing an
> > > > > > > > mbuf is sometimes confusing, so explain it in a proper
> comment.
> > > > > > > >
> > > > > > > > Signed-off-by: Bruce Richardson
> > > > > > > > <bruce.richardson@intel.com>
> > > > > > > > ---
> > > > > > > >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> > > > > > > >  1 file changed, 10 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > b/lib/librte_mbuf/rte_mbuf.h index 17ba791..0265172 100644
> > > > > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct
> > > > > > > > rte_mbuf *m)  {
> > > > > > > >  	__rte_mbuf_sanity_check(m, 0);
> > > > > > > >
> > > > > > > > +	/*
> > > > > > > > +	 * Check to see if this is the last reference to the
> mbuf.
> > > > > > > > +	 * Note: the double check here is deliberate. If the
> ref_cnt is "atomic"
> > > > > > > > +	 * the call to "refcnt_update" is a very expensive
> operation, so we
> > > > > > > > +	 * don't want to call it in the case where we know we
> are the holder
> > > > > > > > +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> > > > > > > > +	 * If however, ref_cnt != 1, it's still possible that we
> may still be
> > > > > > > > +	 * the final decrementer of the count, so we need to
> check that
> > > > > > > > +	 * result also, to make sure the mbuf is freed properly.
> > > > > > > > +	 */
> > > > > > > >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> > > > > > > >  			likely (rte_mbuf_refcnt_update(m, -1) == 0))
> {
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.1.0
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > NAK
> > > > > > >  the comment is incorrect, a return code of 1 from
> > > > > > > rte_mbuf_refcnt_read doesn't guarantee you are the last
> > > > > > > holder of the buffer if two contexts have a pointer to it.
> > > > > > If two threads have pointers to it, and are both going to free
> > > > > > it, the refcnt must be 2 not one, otherwise the refcnt is
> meaningless.
> > > > > >
> > > > >
> > > > > What about the other concrete case that I illustrated, where one
> > > > > context is attempting to increment the refcount, while the other
> > > > > is decrementing it with the intention to free?  By making the
> > > > > read and set operation disctinct here you've broken the
> > > > > atomicity of the read and update logic that atomics are there for
> and created a race condition.  I don't know how else to explain this to
> you.
> > > > > if(atomic_read == 1) then atomic_set(0), breaks the entire
> > > > > notion of what atomics are meant to do (namely update and read
> > > > > state as an atomic unit), you just can't get away with not
> > > > > having that atomicity here.  If you could, you might as well be
> > > > > using plain integers for the reference count, as you're not using
> the atomic properties of the type.
> > > > >
> > > > > Neil
> > > >
> > > > I disagree.
> > > >
> > > > A value of one, indicates that there is only one owner of the
> > > > mbuf, and therefore since we are in the free routine, we are that
> > > > owner. If there are to be two owners, the refcnt must be
> > > > incremented before handing over the pointer to the other thread -
> > > > to get to the example you make. If that does not occur, we can
> > > > also have the situation where the "sending" thread calls free -
> > > > and therefore this function - before the other thread receives the
> > > > pointer. In that case, we will have the receiving thread getting a
> > > > pointer to an mbuf which is now invalid as it has been put back
> > > > into the mempool
> > > >
> > > > Again, in short, if refcnt == 1, there is only one mbuf owner. If
> > > > refcnt == 1 and we are currently executing in prefree_seg, we are
> > > > the owner and no other thread is allow to muck about with the mbuf.
> > > >
> > > Then the question remains, why aren't you just using ints here?
> > > What is the purpose of even bothering with atomics, if you don't
> > > feel like you need any reliance on the atomic set and read state,
> which it was created for??
> > >
> > > Neil
> >
> > Because for the case where refcnt != 1, you need the atomics. If you
> > have two threads using the mbuf and refcnt is 2, both of them
> > simultaneously can hand over their copies to two more threads. In that
> > case, we need to guarantee refcnt to be 4, so we need to use atomics.
> > Similarly, if both threads attempt to free at the same time, we need
> > to ensure that only one of them actually returns the buf to the mempool
> - hence the atomic decrement and return value check.
> >
> > /Bruce
> 
> Sigh, ok, so that makes some sense.  This thing is entirely for the
> purposes of special casing the single use case?  That seems like alot of
> effort and confusion to go through for this.  Perhaps macrotizing it for
> multiple use cases would clarify it:
> #define mbuf_orphaned(mbuf) atomic_ref_read(mbuf)==1 ||
> atomic_ref_dec(mbuf)==0

Yes, we could, except it's not "orphaned" since it has got a single thread owner, and this is the normal use-case we are special-casing. 
The comment should adequately cover things, I think, and for cases where it doesn't we now have this thread to refer to also. :-)

> 
> regardless, you've convinced me that its not broken.
> Acked-by: Neil Horman <nhorman@tuxdriver.com>

Thanks,
/Bruce

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 16:56               ` Richardson, Bruce
@ 2015-03-30 17:11                 ` Thomas Monjalon
  2015-03-30 17:39                 ` Don Provan
  1 sibling, 0 replies; 18+ messages in thread
From: Thomas Monjalon @ 2015-03-30 17:11 UTC (permalink / raw)
  To: Richardson, Bruce; +Cc: dev

2015-03-27 16:56, Richardson, Bruce:
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Friday, March 27, 2015 4:44 PM
> > To: Richardson, Bruce
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing
> > code
> > 
> > On Fri, Mar 27, 2015 at 02:55:27PM +0000, Bruce Richardson wrote:
> > > On Fri, Mar 27, 2015 at 10:38:41AM -0400, Neil Horman wrote:
> > > > On Fri, Mar 27, 2015 at 02:30:50PM +0000, Bruce Richardson wrote:
> > > > > On Fri, Mar 27, 2015 at 10:07:35AM -0400, Neil Horman wrote:
> > > > > > On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
> > > > > > > On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> > > > > > > > On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson
> > wrote:
> > > > > > > > > The logic used in the condition check before freeing an
> > > > > > > > > mbuf is sometimes confusing, so explain it in a proper
> > comment.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Bruce Richardson
> > > > > > > > > <bruce.richardson@intel.com>
> > > > > > > > > ---
> > > > > > > > >  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> > > > > > > > >  1 file changed, 10 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > b/lib/librte_mbuf/rte_mbuf.h index 17ba791..0265172 100644
> > > > > > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct
> > > > > > > > > rte_mbuf *m)  {
> > > > > > > > >  	__rte_mbuf_sanity_check(m, 0);
> > > > > > > > >
> > > > > > > > > +	/*
> > > > > > > > > +	 * Check to see if this is the last reference to the
> > mbuf.
> > > > > > > > > +	 * Note: the double check here is deliberate. If the
> > ref_cnt is "atomic"
> > > > > > > > > +	 * the call to "refcnt_update" is a very expensive
> > operation, so we
> > > > > > > > > +	 * don't want to call it in the case where we know we
> > are the holder
> > > > > > > > > +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> > > > > > > > > +	 * If however, ref_cnt != 1, it's still possible that we
> > may still be
> > > > > > > > > +	 * the final decrementer of the count, so we need to
> > check that
> > > > > > > > > +	 * result also, to make sure the mbuf is freed properly.
> > > > > > > > > +	 */
> > > > > > > > >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> > > > > > > > >  			likely (rte_mbuf_refcnt_update(m, -1) == 0))
> > {
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.1.0
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > > NAK
> > > > > > > >  the comment is incorrect, a return code of 1 from
> > > > > > > > rte_mbuf_refcnt_read doesn't guarantee you are the last
> > > > > > > > holder of the buffer if two contexts have a pointer to it.
> > > > > > > If two threads have pointers to it, and are both going to free
> > > > > > > it, the refcnt must be 2 not one, otherwise the refcnt is
> > meaningless.
> > > > > > >
> > > > > >
> > > > > > What about the other concrete case that I illustrated, where one
> > > > > > context is attempting to increment the refcount, while the other
> > > > > > is decrementing it with the intention to free?  By making the
> > > > > > read and set operation disctinct here you've broken the
> > > > > > atomicity of the read and update logic that atomics are there for
> > and created a race condition.  I don't know how else to explain this to
> > you.
> > > > > > if(atomic_read == 1) then atomic_set(0), breaks the entire
> > > > > > notion of what atomics are meant to do (namely update and read
> > > > > > state as an atomic unit), you just can't get away with not
> > > > > > having that atomicity here.  If you could, you might as well be
> > > > > > using plain integers for the reference count, as you're not using
> > the atomic properties of the type.
> > > > > >
> > > > > > Neil
> > > > >
> > > > > I disagree.
> > > > >
> > > > > A value of one, indicates that there is only one owner of the
> > > > > mbuf, and therefore since we are in the free routine, we are that
> > > > > owner. If there are to be two owners, the refcnt must be
> > > > > incremented before handing over the pointer to the other thread -
> > > > > to get to the example you make. If that does not occur, we can
> > > > > also have the situation where the "sending" thread calls free -
> > > > > and therefore this function - before the other thread receives the
> > > > > pointer. In that case, we will have the receiving thread getting a
> > > > > pointer to an mbuf which is now invalid as it has been put back
> > > > > into the mempool
> > > > >
> > > > > Again, in short, if refcnt == 1, there is only one mbuf owner. If
> > > > > refcnt == 1 and we are currently executing in prefree_seg, we are
> > > > > the owner and no other thread is allow to muck about with the mbuf.
> > > > >
> > > > Then the question remains, why aren't you just using ints here?
> > > > What is the purpose of even bothering with atomics, if you don't
> > > > feel like you need any reliance on the atomic set and read state,
> > which it was created for??
> > > >
> > > > Neil
> > >
> > > Because for the case where refcnt != 1, you need the atomics. If you
> > > have two threads using the mbuf and refcnt is 2, both of them
> > > simultaneously can hand over their copies to two more threads. In that
> > > case, we need to guarantee refcnt to be 4, so we need to use atomics.
> > > Similarly, if both threads attempt to free at the same time, we need
> > > to ensure that only one of them actually returns the buf to the mempool
> > - hence the atomic decrement and return value check.
> > >
> > > /Bruce
> > 
> > Sigh, ok, so that makes some sense.  This thing is entirely for the
> > purposes of special casing the single use case?  That seems like alot of
> > effort and confusion to go through for this.  Perhaps macrotizing it for
> > multiple use cases would clarify it:
> > #define mbuf_orphaned(mbuf) atomic_ref_read(mbuf)==1 ||
> > atomic_ref_dec(mbuf)==0
> 
> Yes, we could, except it's not "orphaned" since it has got a single thread owner, and this is the normal use-case we are special-casing. 
> The comment should adequately cover things, I think, and for cases where it doesn't we now have this thread to refer to also. :-)
> 
> > 
> > regardless, you've convinced me that its not broken.
> > Acked-by: Neil Horman <nhorman@tuxdriver.com>
> 
> Thanks,
> /Bruce

Applied, thanks

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 16:56               ` Richardson, Bruce
  2015-03-30 17:11                 ` Thomas Monjalon
@ 2015-03-30 17:39                 ` Don Provan
  2015-03-30 18:15                   ` Stephen Hemminger
  2015-03-31 12:33                   ` Zoltan Kiss
  1 sibling, 2 replies; 18+ messages in thread
From: Don Provan @ 2015-03-30 17:39 UTC (permalink / raw)
  To: Richardson, Bruce, Neil Horman; +Cc: dev

> > > > > > > >  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> > > > > > > >  			likely (rte_mbuf_refcnt_update(m, -1) == 0))

In all the debate about atomics, I don't think anyone got around to pointing out that in the unlikely case that the refcnt is not 1, then it's equally unlikely that decrementing it will result in 0 despite the code's claim to the contrary. That's the part that confused me. Would it make sense to fix this while adding the comment?
-don
dprovan@bivio.net

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-30 17:39                 ` Don Provan
@ 2015-03-30 18:15                   ` Stephen Hemminger
  2015-03-31 12:33                   ` Zoltan Kiss
  1 sibling, 0 replies; 18+ messages in thread
From: Stephen Hemminger @ 2015-03-30 18:15 UTC (permalink / raw)
  To: Don Provan; +Cc: dev

On Mon, 30 Mar 2015 17:39:06 +0000
Don Provan <dprovan@bivio.net> wrote:

> In all the debate about atomics, I don't think anyone got around to pointing out that in the unlikely case that the refcnt is not 1, then it's equally unlikely that decrementing it will result in 0 despite the code's claim to the contrary. That's the part that confused me. Would it make sense to fix this while adding the comment?

Really doubt the second likely() makes any difference in code speed.
Should probably just be removed.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-27 14:30       ` Bruce Richardson
  2015-03-27 14:38         ` Neil Horman
@ 2015-03-30 19:39         ` Marc Sune
  2015-03-30 20:26           ` Neil Horman
  1 sibling, 1 reply; 18+ messages in thread
From: Marc Sune @ 2015-03-30 19:39 UTC (permalink / raw)
  To: dev



On 27/03/15 15:30, Bruce Richardson wrote:
> On Fri, Mar 27, 2015 at 10:07:35AM -0400, Neil Horman wrote:
>> On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
>>> On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
>>>> On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
>>>>> The logic used in the condition check before freeing an mbuf is
>>>>> sometimes confusing, so explain it in a proper comment.
>>>>>
>>>>> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>>>>> ---
>>>>>   lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
>>>>>   1 file changed, 10 insertions(+)
>>>>>
>>>>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>>>>> index 17ba791..0265172 100644
>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>>> @@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
>>>>>   {
>>>>>   	__rte_mbuf_sanity_check(m, 0);
>>>>>   
>>>>> +	/*
>>>>> +	 * Check to see if this is the last reference to the mbuf.
>>>>> +	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
>>>>> +	 * the call to "refcnt_update" is a very expensive operation, so we
>>>>> +	 * don't want to call it in the case where we know we are the holder
>>>>> +	 * of the last reference to this mbuf i.e. ref_cnt == 1.
>>>>> +	 * If however, ref_cnt != 1, it's still possible that we may still be
>>>>> +	 * the final decrementer of the count, so we need to check that
>>>>> +	 * result also, to make sure the mbuf is freed properly.
>>>>> +	 */
>>>>>   	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
>>>>>   			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
>>>>>   
>>>>> -- 
>>>>> 2.1.0
>>>>>
>>>>>
>>>> NAK
>>>>   the comment is incorrect, a return code of 1 from rte_mbuf_refcnt_read doesn't
>>>> guarantee you are the last holder of the buffer if two contexts have a pointer
>>>> to it.
>>> If two threads have pointers to it, and are both going to free it, the refcnt
>>> must be 2 not one, otherwise the refcnt is meaningless.
>>>
>> What about the other concrete case that I illustrated, where one context is
>> attempting to increment the refcount, while the other is decrementing it with
>> the intention to free?  By making the read and set operation disctinct here
>> you've broken the atomicity of the read and update logic that atomics are there
>> for and created a race condition.  I don't know how else to explain this to you.
>> if(atomic_read == 1) then atomic_set(0), breaks the entire notion of what
>> atomics are meant to do (namely update and read state as an atomic unit), you
>> just can't get away with not having that atomicity here.  If you could, you
>> might as well be using plain integers for the reference count, as you're not
>> using the atomic properties of the type.
>>
>> Neil
> I disagree.
>
> A value of one, indicates that there is only one owner of the mbuf,
> and therefore since we are in the free routine, we are that owner.

This is true if you assume the application is bug-free and two threads 
will never hold mistakenly the same mbuf pointer with ref_cnt=1.  That's 
ok in general, but if it doesn't, debugging it can be pretty painful.

Marc

>   If there
> are to be two owners, the refcnt must be incremented before handing over the
> pointer to the other thread - to get to the example you make. If that does not
> occur, we can also have the situation where the "sending" thread calls
> free - and therefore this function - before the other thread receives the
> pointer. In that case, we will have the receiving thread getting a pointer
> to an mbuf which is now invalid as it has been put back into the mempool
>
> Again, in short, if refcnt == 1, there is only one mbuf owner. If refcnt == 1
> and we are currently executing in prefree_seg, we are the owner and no other
> thread is allow to muck about with the mbuf.
>
> /Bruce
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-30 19:39         ` Marc Sune
@ 2015-03-30 20:26           ` Neil Horman
  0 siblings, 0 replies; 18+ messages in thread
From: Neil Horman @ 2015-03-30 20:26 UTC (permalink / raw)
  To: Marc Sune; +Cc: dev

On Mon, Mar 30, 2015 at 09:39:13PM +0200, Marc Sune wrote:
> 
> 
> On 27/03/15 15:30, Bruce Richardson wrote:
> >On Fri, Mar 27, 2015 at 10:07:35AM -0400, Neil Horman wrote:
> >>On Fri, Mar 27, 2015 at 11:32:38AM +0000, Bruce Richardson wrote:
> >>>On Fri, Mar 27, 2015 at 06:29:56AM -0400, Neil Horman wrote:
> >>>>On Thu, Mar 26, 2015 at 09:14:54PM +0000, Bruce Richardson wrote:
> >>>>>The logic used in the condition check before freeing an mbuf is
> >>>>>sometimes confusing, so explain it in a proper comment.
> >>>>>
> >>>>>Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> >>>>>---
> >>>>>  lib/librte_mbuf/rte_mbuf.h | 10 ++++++++++
> >>>>>  1 file changed, 10 insertions(+)
> >>>>>
> >>>>>diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> >>>>>index 17ba791..0265172 100644
> >>>>>--- a/lib/librte_mbuf/rte_mbuf.h
> >>>>>+++ b/lib/librte_mbuf/rte_mbuf.h
> >>>>>@@ -764,6 +764,16 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> >>>>>  {
> >>>>>  	__rte_mbuf_sanity_check(m, 0);
> >>>>>+	/*
> >>>>>+	 * Check to see if this is the last reference to the mbuf.
> >>>>>+	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
> >>>>>+	 * the call to "refcnt_update" is a very expensive operation, so we
> >>>>>+	 * don't want to call it in the case where we know we are the holder
> >>>>>+	 * of the last reference to this mbuf i.e. ref_cnt == 1.
> >>>>>+	 * If however, ref_cnt != 1, it's still possible that we may still be
> >>>>>+	 * the final decrementer of the count, so we need to check that
> >>>>>+	 * result also, to make sure the mbuf is freed properly.
> >>>>>+	 */
> >>>>>  	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
> >>>>>  			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
> >>>>>-- 
> >>>>>2.1.0
> >>>>>
> >>>>>
> >>>>NAK
> >>>>  the comment is incorrect, a return code of 1 from rte_mbuf_refcnt_read doesn't
> >>>>guarantee you are the last holder of the buffer if two contexts have a pointer
> >>>>to it.
> >>>If two threads have pointers to it, and are both going to free it, the refcnt
> >>>must be 2 not one, otherwise the refcnt is meaningless.
> >>>
> >>What about the other concrete case that I illustrated, where one context is
> >>attempting to increment the refcount, while the other is decrementing it with
> >>the intention to free?  By making the read and set operation disctinct here
> >>you've broken the atomicity of the read and update logic that atomics are there
> >>for and created a race condition.  I don't know how else to explain this to you.
> >>if(atomic_read == 1) then atomic_set(0), breaks the entire notion of what
> >>atomics are meant to do (namely update and read state as an atomic unit), you
> >>just can't get away with not having that atomicity here.  If you could, you
> >>might as well be using plain integers for the reference count, as you're not
> >>using the atomic properties of the type.
> >>
> >>Neil
> >I disagree.
> >
> >A value of one, indicates that there is only one owner of the mbuf,
> >and therefore since we are in the free routine, we are that owner.
> 
> This is true if you assume the application is bug-free and two threads will
> never hold mistakenly the same mbuf pointer with ref_cnt=1.  That's ok in
> general, but if it doesn't, debugging it can be pretty painful.
> 
> Marc
> 
The use case for the above is situations in which mbufs are stored/queued and
must be looked up prior to taking a hold on the refcnt (i.e. you have to get the
pointer so that you can pass it in to rte_pktmbuf_refcnt_inc or whtever
increases the hold count.  I don't think that use case is considered here
though, in that it presumes that in a provider/receiver context relationship,
the provider always increases the refcount before giving it to the receiver.  I
don't care for it either, but it is what it is.
Neil

> >  If there
> >are to be two owners, the refcnt must be incremented before handing over the
> >pointer to the other thread - to get to the example you make. If that does not
> >occur, we can also have the situation where the "sending" thread calls
> >free - and therefore this function - before the other thread receives the
> >pointer. In that case, we will have the receiving thread getting a pointer
> >to an mbuf which is now invalid as it has been put back into the mempool
> >
> >Again, in short, if refcnt == 1, there is only one mbuf owner. If refcnt == 1
> >and we are currently executing in prefree_seg, we are the owner and no other
> >thread is allow to muck about with the mbuf.
> >
> >/Bruce
> >
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
  2015-03-30 17:39                 ` Don Provan
  2015-03-30 18:15                   ` Stephen Hemminger
@ 2015-03-31 12:33                   ` Zoltan Kiss
  1 sibling, 0 replies; 18+ messages in thread
From: Zoltan Kiss @ 2015-03-31 12:33 UTC (permalink / raw)
  To: Don Provan, Richardson, Bruce, Neil Horman; +Cc: dev

Hi,

On 30/03/15 18:39, Don Provan wrote:
>>>>>>>>>   	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
>>>>>>>>>   			likely (rte_mbuf_refcnt_update(m, -1) == 0))
>
> In all the debate about atomics, I don't think anyone got around to pointing out that in the unlikely case that the refcnt is not 1, then it's equally unlikely that decrementing it will result in 0 despite the code's claim to the contrary. That's the part that confused me. Would it make sense to fix this while adding the comment?
> -don
> dprovan@bivio.net
>

I was thinking about that too, either remove it or turn it into 
"unlikely". Currently it suggest that "if there are more than one users, 
they are likely to release at the same time". If that's not true, we 
should remove it, but as Don said, it would hardly make a difference in 
real world cases as more than one users is not really a hot usecase, AFAIK.

Regards,

Zoltan

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-03-31 12:33 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-26 21:14 [dpdk-dev] [PATCH] mbuf: add comment explaining confusing code Bruce Richardson
2015-03-26 21:31 ` Olivier MATZ
2015-03-27 10:29 ` Neil Horman
2015-03-27 10:49   ` Ananyev, Konstantin
2015-03-27 11:32   ` Bruce Richardson
2015-03-27 12:42     ` Neil Horman
2015-03-27 14:07     ` Neil Horman
2015-03-27 14:30       ` Bruce Richardson
2015-03-27 14:38         ` Neil Horman
2015-03-27 14:55           ` Bruce Richardson
2015-03-27 16:43             ` Neil Horman
2015-03-27 16:56               ` Richardson, Bruce
2015-03-30 17:11                 ` Thomas Monjalon
2015-03-30 17:39                 ` Don Provan
2015-03-30 18:15                   ` Stephen Hemminger
2015-03-31 12:33                   ` Zoltan Kiss
2015-03-30 19:39         ` Marc Sune
2015-03-30 20:26           ` Neil Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).