DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH] mbuf: optimize first reference increment in rte_pktmbuf_attach
@ 2015-06-01  9:32 Olivier Matz
  2015-06-03 10:59 ` Bruce Richardson
  2015-06-08 14:57 ` [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update Olivier Matz
  0 siblings, 2 replies; 15+ messages in thread
From: Olivier Matz @ 2015-06-01  9:32 UTC (permalink / raw)
  To: dev

As it's done in __rte_pktmbuf_prefree_seg(), we can avoid using an
atomic increment in rte_pktmbuf_attach() by checking if we are the
only owner of the mbuf first.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mbuf/rte_mbuf.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ab6de67..cea35b7 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -838,7 +838,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
 	else
 		md = rte_mbuf_from_indirect(m);
 
-	rte_mbuf_refcnt_update(md, 1);
+	/* optimize the case where we are the only owner */
+	if (likely(rte_mbuf_refcnt_read(md) == 1))
+		rte_mbuf_refcnt_set(md, 2);
+	else
+		rte_mbuf_refcnt_update(md, 1);
 	mi->priv_size = m->priv_size;
 	mi->buf_physaddr = m->buf_physaddr;
 	mi->buf_addr = m->buf_addr;
-- 
2.1.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: optimize first reference increment in rte_pktmbuf_attach
  2015-06-01  9:32 [dpdk-dev] [PATCH] mbuf: optimize first reference increment in rte_pktmbuf_attach Olivier Matz
@ 2015-06-03 10:59 ` Bruce Richardson
  2015-06-05  9:07   ` Olivier MATZ
  2015-06-08 14:57 ` [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update Olivier Matz
  1 sibling, 1 reply; 15+ messages in thread
From: Bruce Richardson @ 2015-06-03 10:59 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev

On Mon, Jun 01, 2015 at 11:32:25AM +0200, Olivier Matz wrote:
> As it's done in __rte_pktmbuf_prefree_seg(), we can avoid using an
> atomic increment in rte_pktmbuf_attach() by checking if we are the
> only owner of the mbuf first.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>  lib/librte_mbuf/rte_mbuf.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index ab6de67..cea35b7 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -838,7 +838,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
>  	else
>  		md = rte_mbuf_from_indirect(m);
>  
> -	rte_mbuf_refcnt_update(md, 1);
> +	/* optimize the case where we are the only owner */
> +	if (likely(rte_mbuf_refcnt_read(md) == 1))
> +		rte_mbuf_refcnt_set(md, 2);
> +	else
> +		rte_mbuf_refcnt_update(md, 1);
>  	mi->priv_size = m->priv_size;
>  	mi->buf_physaddr = m->buf_physaddr;
>  	mi->buf_addr = m->buf_addr;
> -- 
> 2.1.4
> 
Why not make the change inside rte_mbuf_refcnt_update itself? If it is ever
called with a current refcnt of 1, it should always be safe to do the update
without a cmpset.

/Bruce

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: optimize first reference increment in rte_pktmbuf_attach
  2015-06-03 10:59 ` Bruce Richardson
@ 2015-06-05  9:07   ` Olivier MATZ
  0 siblings, 0 replies; 15+ messages in thread
From: Olivier MATZ @ 2015-06-05  9:07 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

Hi Bruce,

On 06/03/2015 12:59 PM, Bruce Richardson wrote:
> On Mon, Jun 01, 2015 at 11:32:25AM +0200, Olivier Matz wrote:
>> As it's done in __rte_pktmbuf_prefree_seg(), we can avoid using an
>> atomic increment in rte_pktmbuf_attach() by checking if we are the
>> only owner of the mbuf first.
>>
>> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>> ---
>>  lib/librte_mbuf/rte_mbuf.h | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>> index ab6de67..cea35b7 100644
>> --- a/lib/librte_mbuf/rte_mbuf.h
>> +++ b/lib/librte_mbuf/rte_mbuf.h
>> @@ -838,7 +838,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
>>  	else
>>  		md = rte_mbuf_from_indirect(m);
>>  
>> -	rte_mbuf_refcnt_update(md, 1);
>> +	/* optimize the case where we are the only owner */
>> +	if (likely(rte_mbuf_refcnt_read(md) == 1))
>> +		rte_mbuf_refcnt_set(md, 2);
>> +	else
>> +		rte_mbuf_refcnt_update(md, 1);
>>  	mi->priv_size = m->priv_size;
>>  	mi->buf_physaddr = m->buf_physaddr;
>>  	mi->buf_addr = m->buf_addr;
>> -- 
>> 2.1.4
>>
> Why not make the change inside rte_mbuf_refcnt_update itself? If it is ever
> called with a current refcnt of 1, it should always be safe to do the update
> without a cmpset.

Good idea, I'll see if I can do that in a new version.

Thanks,
Olivier


> 
> /Bruce
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
  2015-06-01  9:32 [dpdk-dev] [PATCH] mbuf: optimize first reference increment in rte_pktmbuf_attach Olivier Matz
  2015-06-03 10:59 ` Bruce Richardson
@ 2015-06-08 14:57 ` Olivier Matz
  2015-06-09 12:57   ` Bruce Richardson
  1 sibling, 1 reply; 15+ messages in thread
From: Olivier Matz @ 2015-06-08 14:57 UTC (permalink / raw)
  To: dev

In __rte_pktmbuf_prefree_seg(), there was an optimization to avoid using
a costly atomic operation when updating the mbuf reference counter if
its value is 1. Indeed, it means that we are the only owner of the mbuf,
and therefore nobody can change it at the same time.

We can generalize this optimization directly in rte_mbuf_refcnt_update()
so the other callers of this function, like rte_pktmbuf_attach(), can
also take advantage of this optimization.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mbuf/rte_mbuf.h | 57 +++++++++++++++++++++++-----------------------
 1 file changed, 28 insertions(+), 29 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ab6de67..6c9cfd6 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -426,21 +426,6 @@ if (!(exp)) {                                                        \
 #ifdef RTE_MBUF_REFCNT_ATOMIC
 
 /**
- * Adds given value to an mbuf's refcnt and returns its new value.
- * @param m
- *   Mbuf to update
- * @param value
- *   Value to add/subtract
- * @return
- *   Updated value
- */
-static inline uint16_t
-rte_mbuf_refcnt_update(struct rte_mbuf *m, int16_t value)
-{
-	return (uint16_t)(rte_atomic16_add_return(&m->refcnt_atomic, value));
-}
-
-/**
  * Reads the value of an mbuf's refcnt.
  * @param m
  *   Mbuf to read
@@ -466,6 +451,33 @@ rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
 	rte_atomic16_set(&m->refcnt_atomic, new_value);
 }
 
+/**
+ * Adds given value to an mbuf's refcnt and returns its new value.
+ * @param m
+ *   Mbuf to update
+ * @param value
+ *   Value to add/subtract
+ * @return
+ *   Updated value
+ */
+static inline uint16_t
+rte_mbuf_refcnt_update(struct rte_mbuf *m, int16_t value)
+{
+	/*
+	 * The atomic_add is an expensive operation, so we don't want to
+	 * call it in the case where we know we are the uniq holder of
+	 * this mbuf (i.e. ref_cnt == 1). Otherwise, an atomic
+	 * operation has to be used because concurrent accesses on the
+	 * reference counter can occur.
+	 */
+	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
+		rte_mbuf_refcnt_set(m, 1 + value);
+		return 1 + value;
+	}
+
+	return (uint16_t)(rte_atomic16_add_return(&m->refcnt_atomic, value));
+}
+
 #else /* ! RTE_MBUF_REFCNT_ATOMIC */
 
 /**
@@ -895,20 +907,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
 {
 	__rte_mbuf_sanity_check(m, 0);
 
-	/*
-	 * Check to see if this is the last reference to the mbuf.
-	 * Note: the double check here is deliberate. If the ref_cnt is "atomic"
-	 * the call to "refcnt_update" is a very expensive operation, so we
-	 * don't want to call it in the case where we know we are the holder
-	 * of the last reference to this mbuf i.e. ref_cnt == 1.
-	 * If however, ref_cnt != 1, it's still possible that we may still be
-	 * the final decrementer of the count, so we need to check that
-	 * result also, to make sure the mbuf is freed properly.
-	 */
-	if (likely (rte_mbuf_refcnt_read(m) == 1) ||
-			likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
-
-		rte_mbuf_refcnt_set(m, 0);
+	if (likely(rte_mbuf_refcnt_update(m, -1) == 0)) {
 
 		/* if this is an indirect mbuf, then
 		 *  - detach mbuf
-- 
2.1.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
  2015-06-08 14:57 ` [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update Olivier Matz
@ 2015-06-09 12:57   ` Bruce Richardson
  2015-06-12 14:10     ` Thomas Monjalon
  0 siblings, 1 reply; 15+ messages in thread
From: Bruce Richardson @ 2015-06-09 12:57 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev

On Mon, Jun 08, 2015 at 04:57:22PM +0200, Olivier Matz wrote:
> In __rte_pktmbuf_prefree_seg(), there was an optimization to avoid using
> a costly atomic operation when updating the mbuf reference counter if
> its value is 1. Indeed, it means that we are the only owner of the mbuf,
> and therefore nobody can change it at the same time.
> 
> We can generalize this optimization directly in rte_mbuf_refcnt_update()
> so the other callers of this function, like rte_pktmbuf_attach(), can
> also take advantage of this optimization.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>

Acked-by: Bruce Richardson <bruce.richardson@intel.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
  2015-06-09 12:57   ` Bruce Richardson
@ 2015-06-12 14:10     ` Thomas Monjalon
  0 siblings, 0 replies; 15+ messages in thread
From: Thomas Monjalon @ 2015-06-12 14:10 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev

2015-06-09 13:57, Bruce Richardson:
> On Mon, Jun 08, 2015 at 04:57:22PM +0200, Olivier Matz wrote:
> > In __rte_pktmbuf_prefree_seg(), there was an optimization to avoid using
> > a costly atomic operation when updating the mbuf reference counter if
> > its value is 1. Indeed, it means that we are the only owner of the mbuf,
> > and therefore nobody can change it at the same time.
> > 
> > We can generalize this optimization directly in rte_mbuf_refcnt_update()
> > so the other callers of this function, like rte_pktmbuf_attach(), can
> > also take advantage of this optimization.
> > 
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> 
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Applied, thanks

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
  2016-01-13 16:28           ` Hanoch Haim (hhaim)
@ 2016-01-13 16:40             ` Bruce Richardson
  0 siblings, 0 replies; 15+ messages in thread
From: Bruce Richardson @ 2016-01-13 16:40 UTC (permalink / raw)
  To: Hanoch Haim (hhaim); +Cc: dev, Itay Marom (imarom), Ido Barnea (ibarnea)

On Wed, Jan 13, 2016 at 04:28:34PM +0000, Hanoch Haim (hhaim) wrote:
> Hi Bruce. 
> This is exactly what was possible before the patch and does *not* work after this patch. 
> I think it is more related to the semantic of the API. My understanding of the API was that I could allocate a share mbuf (ref=1) and then attach it from many concurrent threads (as each thread attach opt inc in 1 and driver reduces, atomic by 1)
> After the patch there is a need to make the ref of the share mbuf to be 2 (why?) and free it twice at the end( ?), this makes it cumbersome to use it and for sure there is a need to describe this semantic change.
> 

There is no such thing as allocating a shared mbuf. 
Mbufs are always allocated with a reference count of one. 
To share an mbuf among threads, then the ref count must be increased.

NOTE: this is a different operation to that of "attaching" another mbuf to it -
it just happens that when attaching one mbuf to another you have to increase
the reference count as the attaching mbuf acts like another thread holding a
pointer to the original mbuf. 

Once finished with the mbuf, each thread (or pointing mbuf) needs to
decrement the reference counter, and the last one holding the reference is then
responsible for freeing that buffer. The easiest way to do this is to use the
free call and let it handle the reference counts, but if it makes more logical
sense in an app to do the decrement and checking itself (so that inc's and dec's
match in the app code), then that can work too, obviously.

> However, this change is good for the common case, and I think it is better to keep it.
> The options I see are:
> 1) Document the current behavior 
> 2) Create two sets of API with two types of semantic 

Given the confusion I think documenting the way things are "meant to work" is
a good idea.

/Bruce

> 
> thanks,
> Hanoh
> 
> -----Original Message-----
> From: Bruce Richardson [mailto:bruce.richardson@intel.com] 
> Sent: Wednesday, January 13, 2016 1:48 PM
> To: Hanoch Haim (hhaim)
> Cc: Olivier MATZ; dev@dpdk.org; Ido Barnea (ibarnea); Itay Marom (imarom)
> Subject: Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
> 
> On Tue, Jan 05, 2016 at 11:11:24AM +0000, Hanoch Haim (hhaim) wrote:
> > Hi Oliver,
> > Thank you for the fast response and it would be great to open a discussion on that.
> > In general our project can leverage your optimization and I think it is great (we should have thought about it) . We can use it using the workaround I described.
> > However, for me it  seems odd that  rte_pktmbuf_attach () that does not *change* anything in m_const, except of the *atomic* ref counter does not work in parallel.
> > The example I gave is a classic use case of rte_pktmbuf_attach  (multicast ) and I don't see why it wouldn't work after your optimization. 
> > 
> > Do you have a pointer to the documentation that state that that you can't call the atomic ref counter from more than one thread?
> > 
> Hi,
> 
> actually, the issue is not that you can't work with the reference counter field from multiple threads, or that you can't use an mbuf from multiple threads, it's that if you are working with the same mbuf in multiple threads you have multiple references to the mbuf and your application must increase the reference counter appropriately. For example, if thread A is going to pass an mbuf to thread B and keep using it itself, you must increment the reference counter in thread A before enqueuing it to B.
> 
> Regards,
> /Bruce
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
  2016-01-13 11:48         ` Bruce Richardson
@ 2016-01-13 16:28           ` Hanoch Haim (hhaim)
  2016-01-13 16:40             ` Bruce Richardson
  0 siblings, 1 reply; 15+ messages in thread
From: Hanoch Haim (hhaim) @ 2016-01-13 16:28 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Itay Marom (imarom), Ido Barnea (ibarnea)

Hi Bruce. 
This is exactly what was possible before the patch and does *not* work after this patch. 
I think it is more related to the semantic of the API. My understanding of the API was that I could allocate a share mbuf (ref=1) and then attach it from many concurrent threads (as each thread attach opt inc in 1 and driver reduces, atomic by 1)
After the patch there is a need to make the ref of the share mbuf to be 2 (why?) and free it twice at the end( ?), this makes it cumbersome to use it and for sure there is a need to describe this semantic change.

However, this change is good for the common case, and I think it is better to keep it.
The options I see are:
1) Document the current behavior 
2) Create two sets of API with two types of semantic 

thanks,
Hanoh

-----Original Message-----
From: Bruce Richardson [mailto:bruce.richardson@intel.com] 
Sent: Wednesday, January 13, 2016 1:48 PM
To: Hanoch Haim (hhaim)
Cc: Olivier MATZ; dev@dpdk.org; Ido Barnea (ibarnea); Itay Marom (imarom)
Subject: Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

On Tue, Jan 05, 2016 at 11:11:24AM +0000, Hanoch Haim (hhaim) wrote:
> Hi Oliver,
> Thank you for the fast response and it would be great to open a discussion on that.
> In general our project can leverage your optimization and I think it is great (we should have thought about it) . We can use it using the workaround I described.
> However, for me it  seems odd that  rte_pktmbuf_attach () that does not *change* anything in m_const, except of the *atomic* ref counter does not work in parallel.
> The example I gave is a classic use case of rte_pktmbuf_attach  (multicast ) and I don't see why it wouldn't work after your optimization. 
> 
> Do you have a pointer to the documentation that state that that you can't call the atomic ref counter from more than one thread?
> 
Hi,

actually, the issue is not that you can't work with the reference counter field from multiple threads, or that you can't use an mbuf from multiple threads, it's that if you are working with the same mbuf in multiple threads you have multiple references to the mbuf and your application must increase the reference counter appropriately. For example, if thread A is going to pass an mbuf to thread B and keep using it itself, you must increment the reference counter in thread A before enqueuing it to B.

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
  2016-01-05 11:11       ` Hanoch Haim (hhaim)
  2016-01-05 12:12         ` Olivier MATZ
@ 2016-01-13 11:48         ` Bruce Richardson
  2016-01-13 16:28           ` Hanoch Haim (hhaim)
  1 sibling, 1 reply; 15+ messages in thread
From: Bruce Richardson @ 2016-01-13 11:48 UTC (permalink / raw)
  To: Hanoch Haim (hhaim); +Cc: dev, Itay Marom (imarom), Ido Barnea (ibarnea)

On Tue, Jan 05, 2016 at 11:11:24AM +0000, Hanoch Haim (hhaim) wrote:
> Hi Oliver, 
> Thank you for the fast response and it would be great to open a discussion on that.
> In general our project can leverage your optimization and I think it is great (we should have thought about it) . We can use it using the workaround I described.
> However, for me it  seems odd that  rte_pktmbuf_attach () that does not *change* anything in m_const, except of the *atomic* ref counter does not work in parallel.
> The example I gave is a classic use case of rte_pktmbuf_attach  (multicast ) and I don't see why it wouldn't work after your optimization. 
> 
> Do you have a pointer to the documentation that state that that you can't call the atomic ref counter from more than one thread?
> 
Hi,

actually, the issue is not that you can't work with the reference counter field
from multiple threads, or that you can't use an mbuf from multiple threads,
it's that if you are working with the same mbuf in multiple threads you have
multiple references to the mbuf and your application must increase the reference
counter appropriately. For example, if thread A is going to pass an mbuf to
thread B and keep using it itself, you must increment the reference counter
in thread A before enqueuing it to B.

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
  2016-01-05 11:11       ` Hanoch Haim (hhaim)
@ 2016-01-05 12:12         ` Olivier MATZ
  2016-01-13 11:48         ` Bruce Richardson
  1 sibling, 0 replies; 15+ messages in thread
From: Olivier MATZ @ 2016-01-05 12:12 UTC (permalink / raw)
  To: Hanoch Haim (hhaim), bruce.richardson
  Cc: dev, Ido Barnea (ibarnea), Itay Marom (imarom)

Hi Hanoch,

On 01/05/2016 12:11 PM, Hanoch Haim (hhaim) wrote:
> Hi Oliver,
> Thank you for the fast response and it would be great to open a discussion on that.
> In general our project can leverage your optimization and I think it is great (we should have thought about it) . We can use it using the workaround I described.
> However, for me it  seems odd that  rte_pktmbuf_attach () that does not *change* anything in m_const, except of the *atomic* ref counter does not work in parallel.
> The example I gave is a classic use case of rte_pktmbuf_attach  (multicast ) and I don't see why it wouldn't work after your optimization.
>
> Do you have a pointer to the documentation that state that that you can't call the atomic ref counter from more than one thread?

Unfortunately it's not documented yet, but it's something we should
better describe.

Regards,
Olivier

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
  2016-01-05 10:57     ` Olivier MATZ
@ 2016-01-05 11:11       ` Hanoch Haim (hhaim)
  2016-01-05 12:12         ` Olivier MATZ
  2016-01-13 11:48         ` Bruce Richardson
  0 siblings, 2 replies; 15+ messages in thread
From: Hanoch Haim (hhaim) @ 2016-01-05 11:11 UTC (permalink / raw)
  To: Olivier MATZ, bruce.richardson
  Cc: dev, Ido Barnea (ibarnea), Itay Marom (imarom)

Hi Oliver, 
Thank you for the fast response and it would be great to open a discussion on that.
In general our project can leverage your optimization and I think it is great (we should have thought about it) . We can use it using the workaround I described.
However, for me it  seems odd that  rte_pktmbuf_attach () that does not *change* anything in m_const, except of the *atomic* ref counter does not work in parallel.
The example I gave is a classic use case of rte_pktmbuf_attach  (multicast ) and I don't see why it wouldn't work after your optimization. 

Do you have a pointer to the documentation that state that that you can't call the atomic ref counter from more than one thread?

Thanks,
Hanoh

-----Original Message-----
From: Olivier MATZ [mailto:olivier.matz@6wind.com] 
Sent: Tuesday, January 05, 2016 12:58 PM
To: Hanoch Haim (hhaim); bruce.richardson@intel.com
Cc: dev@dpdk.org; Ido Barnea (ibarnea); Itay Marom (imarom)
Subject: Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

Hi Hanoch,

On 01/04/2016 03:43 PM, Hanoch Haim (hhaim) wrote:
> Hi Oliver,
>
> Let's take your drawing as a reference and add my question The use 
> case is sending a duplicate multicast packet by many threads.
> I can split it to x threads to do the job and with atomic-ref (my multicast not mbuf) count it until it reaches zero.
>
> In my following example the two cores (0 and 1) sending the indirect 
> m1/m2 do alloc/attach/send
>
>      core0			             |	core1
> ---------------------------------                         |---------------------------------------
> m_const=rte_pktmbuf_alloc(mp)             |
>                                                                    |
> while true:                                                 |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)             |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const)         |    rte_pktmbuf_attach(m1, m_const)
>    tx_burst(m1)                                           |    tx_burst(m2)
>
> Is this example is not valid?

For me, m_const is not expected to be used concurrently on several cores. By "used", I mean calling a function that modifies the mbuf, which is the case for rte_pktmbuf_attach().

> BTW this is our workaround
>
>
>    core0			                    |	core1
> ---------------------------------                  |---------------------------------------
> m_const=rte_pktmbuf_alloc(mp)      |
> rte_mbuf_refcnt_update(m_const,1)| <<-- workaround
>                                                             |
> while true:                                          |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)      |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const)  |    rte_pktmbuf_attach(m1, m_const)
>    tx_burst(m1)                                     |    tx_burst(m2)

This workaround indeed solves the issue. Another solution would be to protect the call to attach() with a lock, or call all the
rte_pktmbuf_attach() on the same core.

I'm open to discuss this behavior for rte_pktmbuf_attach() function (should concurrent calls be allowed or not). In any case, we may want to better document it in the doxygen API comments.


Regards,
Olivier

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
  2016-01-04 14:43   ` Hanoch Haim (hhaim)
@ 2016-01-05 10:57     ` Olivier MATZ
  2016-01-05 11:11       ` Hanoch Haim (hhaim)
  0 siblings, 1 reply; 15+ messages in thread
From: Olivier MATZ @ 2016-01-05 10:57 UTC (permalink / raw)
  To: Hanoch Haim (hhaim), bruce.richardson
  Cc: dev, Ido Barnea (ibarnea), Itay Marom (imarom)

Hi Hanoch,

On 01/04/2016 03:43 PM, Hanoch Haim (hhaim) wrote:
> Hi Oliver,
>
> Let's take your drawing as a reference and add my question
> The use case is sending a duplicate multicast packet by many threads.
> I can split it to x threads to do the job and with atomic-ref (my multicast not mbuf) count it until it reaches zero.
>
> In my following example the two cores (0 and 1) sending the indirect m1/m2 do alloc/attach/send
>
>      core0			             |	core1
> ---------------------------------                         |---------------------------------------
> m_const=rte_pktmbuf_alloc(mp)             |
>                                                                    |
> while true:                                                 |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)             |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const)         |    rte_pktmbuf_attach(m1, m_const)
>    tx_burst(m1)                                           |    tx_burst(m2)
>
> Is this example is not valid?

For me, m_const is not expected to be used concurrently on
several cores. By "used", I mean calling a function that modifies
the mbuf, which is the case for rte_pktmbuf_attach().

> BTW this is our workaround
>
>
>    core0			                    |	core1
> ---------------------------------                  |---------------------------------------
> m_const=rte_pktmbuf_alloc(mp)      |
> rte_mbuf_refcnt_update(m_const,1)| <<-- workaround
>                                                             |
> while true:                                          |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)      |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const)  |    rte_pktmbuf_attach(m1, m_const)
>    tx_burst(m1)                                     |    tx_burst(m2)

This workaround indeed solves the issue. Another solution would be to
protect the call to attach() with a lock, or call all the
rte_pktmbuf_attach() on the same core.

I'm open to discuss this behavior for rte_pktmbuf_attach() function
(should concurrent calls be allowed or not). In any case, we may
want to better document it in the doxygen API comments.


Regards,
Olivier

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
  2016-01-04 13:53 ` Olivier MATZ
@ 2016-01-04 14:43   ` Hanoch Haim (hhaim)
  2016-01-05 10:57     ` Olivier MATZ
  0 siblings, 1 reply; 15+ messages in thread
From: Hanoch Haim (hhaim) @ 2016-01-04 14:43 UTC (permalink / raw)
  To: Olivier MATZ, bruce.richardson
  Cc: dev, Ido Barnea (ibarnea), Itay Marom (imarom)

Hi Oliver, 

Let's take your drawing as a reference and add my question
The use case is sending a duplicate multicast packet by many threads.
I can split it to x threads to do the job and with atomic-ref (my multicast not mbuf) count it until it reaches zero.

In my following example the two cores (0 and 1) sending the indirect m1/m2 do alloc/attach/send 

    core0			             |	core1
---------------------------------                         |---------------------------------------
m_const=rte_pktmbuf_alloc(mp)             |
                                                                  |
while true:                                                 |  while True:
  m1 =rte_pktmbuf_alloc(mp_64)             |    m2 =rte_pktmbuf_alloc(mp_64)
  rte_pktmbuf_attach(m1, m_const)         |    rte_pktmbuf_attach(m1, m_const)
  tx_burst(m1)                                           |    tx_burst(m2)

Is this example is not valid? 


BTW this is our workaround 
                      


  core0			                    |	core1
---------------------------------                  |---------------------------------------
m_const=rte_pktmbuf_alloc(mp)      |
rte_mbuf_refcnt_update(m_const,1)| <<-- workaround 
                                                           |
while true:                                          |  while True:
  m1 =rte_pktmbuf_alloc(mp_64)      |    m2 =rte_pktmbuf_alloc(mp_64)
  rte_pktmbuf_attach(m1, m_const)  |    rte_pktmbuf_attach(m1, m_const)
  tx_burst(m1)                                     |    tx_burst(m2)

thanks,
Hanoh

-----Original Message-----
From: Olivier MATZ [mailto:olivier.matz@6wind.com] 
Sent: Monday, January 04, 2016 3:53 PM
To: Hanoch Haim (hhaim); bruce.richardson@intel.com
Cc: dev@dpdk.org; Ido Barnea (ibarnea); Itay Marom (imarom)
Subject: Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

Hi Hanoch,

Please find some comments below.

On 12/27/2015 10:39 AM, Hanoch Haim (hhaim) wrote:
> Hi Bruce,
> 
> I'm Hanoch from Cisco Systems works for  the https://github.com/cisco-system-traffic-generator/trex-core traffic generator project.
> 
> While upgrading from DPDK 1.8 to 2.2 Ido found that the following 
> commit creates a mbuf corruption and result in Tx hang
> 
> 
> 
> commit f20b50b946da9070d21e392e4dbc7d9f68bc983e
> 
> Author: Olivier Matz <olivier.matz@6wind.com>
> 
> Date:   Mon Jun 8 16:57:22 2015 +0200
> 
> 
> 
> Looking at the change it is clear why there is an issue, wanted to get your input.
> 
> 
> 
> Init
> 
> -----
> 
> alloc const mbuf  ==> mbuf-a (ref=1)
> 
> 
> 
> Simple case that works
> 
> ---------------------
> 
> 
> 
> thread 1 , tx: alloc-mbuf->attach(mbuf-a) (ref=2)  inc- non atomic
> 
> thread 1 , tx: alloc-mbuf->attach(mbuf-a) (ref32)  inc- atomic

do you mean "(ref=3)" ?
[hh] yes ref=3. 

> 
> thread 1 , drv : free()                    (ref=2) dec- atomic
> 
> thread 1 , drv : free()                    (ref=3) dec - non atomic

do you mean "(ref=1)" ?


> 
> Simple case that does not work
> 
> ---------------------
> 
> 
> 
> Both do that in parallel
> 
> 
> 
> thread 2 tx : alloc-mbuf->attach(mbuf-a)  (ref=2)  inc- non atomic
> 
> thread 1 tx : alloc-mbuf->attach(mbuf-a)  (ref=2)  inc- non atomic


It is not allowed to call a function from the mbuf API in parallel.

Example:

	core0			|	core1
--------------------------------|---------------------------------------
m = rte_pktmbuf_alloc(m);	|
enqueue(m);			|
				|m = dequeue();
do_something(m);		|do_something(m);


do_something() is not allowed because it accesses the same mbuf structure.
do_something() can be any function of mbuf API: rte_pktmbuf_prepend(), rte_pktmbuf_attach(), ...


This is allowed:

	core0			|	core1
--------------------------------|---------------------------------------
m = rte_pktmbuf_alloc(m);	|
m2 = rte_pktmbuf_attach(m);	|
enqueue(m2);			|
				|m2 = dequeue();
do_something(m);		|do_something(m2);



Regards,
Olivier

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
  2015-12-27  9:39 Hanoch Haim (hhaim)
@ 2016-01-04 13:53 ` Olivier MATZ
  2016-01-04 14:43   ` Hanoch Haim (hhaim)
  0 siblings, 1 reply; 15+ messages in thread
From: Olivier MATZ @ 2016-01-04 13:53 UTC (permalink / raw)
  To: Hanoch Haim (hhaim), bruce.richardson
  Cc: dev, Ido Barnea (ibarnea), Itay Marom (imarom)

Hi Hanoch,

Please find some comments below.

On 12/27/2015 10:39 AM, Hanoch Haim (hhaim) wrote:
> Hi Bruce,
> 
> I'm Hanoch from Cisco Systems works for  the https://github.com/cisco-system-traffic-generator/trex-core traffic generator project.
> 
> While upgrading from DPDK 1.8 to 2.2 Ido found that the following commit creates a mbuf corruption and result in Tx hang
> 
> 
> 
> commit f20b50b946da9070d21e392e4dbc7d9f68bc983e
> 
> Author: Olivier Matz <olivier.matz@6wind.com>
> 
> Date:   Mon Jun 8 16:57:22 2015 +0200
> 
> 
> 
> Looking at the change it is clear why there is an issue, wanted to get your input.
> 
> 
> 
> Init
> 
> -----
> 
> alloc const mbuf  ==> mbuf-a (ref=1)
> 
> 
> 
> Simple case that works
> 
> ---------------------
> 
> 
> 
> thread 1 , tx: alloc-mbuf->attach(mbuf-a) (ref=2)  inc- non atomic
> 
> thread 1 , tx: alloc-mbuf->attach(mbuf-a) (ref32)  inc- atomic

do you mean "(ref=3)" ?

> 
> thread 1 , drv : free()                    (ref=2) dec- atomic
> 
> thread 1 , drv : free()                    (ref=3) dec - non atomic

do you mean "(ref=1)" ?


> 
> Simple case that does not work
> 
> ---------------------
> 
> 
> 
> Both do that in parallel
> 
> 
> 
> thread 2 tx : alloc-mbuf->attach(mbuf-a)  (ref=2)  inc- non atomic
> 
> thread 1 tx : alloc-mbuf->attach(mbuf-a)  (ref=2)  inc- non atomic


It is not allowed to call a function from the mbuf API in parallel.

Example:

	core0			|	core1
--------------------------------|---------------------------------------
m = rte_pktmbuf_alloc(m);	|
enqueue(m);			|
				|m = dequeue();
do_something(m);		|do_something(m);


do_something() is not allowed because it accesses the same mbuf
structure.
do_something() can be any function of mbuf API: rte_pktmbuf_prepend(),
rte_pktmbuf_attach(), ...


This is allowed:

	core0			|	core1
--------------------------------|---------------------------------------
m = rte_pktmbuf_alloc(m);	|
m2 = rte_pktmbuf_attach(m);	|
enqueue(m2);			|
				|m2 = dequeue();
do_something(m);		|do_something(m2);



Regards,
Olivier

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
@ 2015-12-27  9:39 Hanoch Haim (hhaim)
  2016-01-04 13:53 ` Olivier MATZ
  0 siblings, 1 reply; 15+ messages in thread
From: Hanoch Haim (hhaim) @ 2015-12-27  9:39 UTC (permalink / raw)
  To: bruce.richardson
  Cc: dev, Hanoch Haim (hhaim), Ido Barnea (ibarnea), Itay Marom (imarom)

Hi Bruce,

I'm Hanoch from Cisco Systems works for  the https://github.com/cisco-system-traffic-generator/trex-core traffic generator project.

While upgrading from DPDK 1.8 to 2.2 Ido found that the following commit creates a mbuf corruption and result in Tx hang



commit f20b50b946da9070d21e392e4dbc7d9f68bc983e

Author: Olivier Matz <olivier.matz@6wind.com>

Date:   Mon Jun 8 16:57:22 2015 +0200



Looking at the change it is clear why there is an issue, wanted to get your input.



Init

-----

alloc const mbuf  ==> mbuf-a (ref=1)



Simple case that works

---------------------



thread 1 , tx: alloc-mbuf->attach(mbuf-a) (ref=2)  inc- non atomic

thread 1 , tx: alloc-mbuf->attach(mbuf-a) (ref32)  inc- atomic

thread 1 , drv : free()                    (ref=2) dec- atomic

thread 1 , drv : free()                    (ref=3) dec - non atomic





Simple case that does not work

---------------------



Both do that in parallel



thread 2 tx : alloc-mbuf->attach(mbuf-a)  (ref=2)  inc- non atomic

thread 1 tx : alloc-mbuf->attach(mbuf-a)  (ref=2)  inc- non atomic



  ==> ref in corrupted



thread 1 drv : free()                    (ref=0)   - free

thread 1 drv : free()                    (ref=-1)  - ???





Just as a side note that in our case we could benefit from this optimization by making a small change in our code, as our common case could be alloc/free of simple mbuf and in such scenario there is no atomic operation in both cases. But I think the general case is broken.



Could you explain what was your use case or this optimization? Is it a valid example the aforementioned example



thanks,

Hanoh

Cisco Systems







On Mon, Jun 08, 2015 at 04:57:22PM +0200, Olivier Matz wrote:

> In __rte_pktmbuf_prefree_seg(), there was an optimization to avoid using

> a costly atomic operation when updating the mbuf reference counter if

> its value is 1. Indeed, it means that we are the only owner of the mbuf,

> and therefore nobody can change it at the same time.

>

> We can generalize this optimization directly in rte_mbuf_refcnt_update()

> so the other callers of this function, like rte_pktmbuf_attach(), can

> also take advantage of this optimization.

>

> Signed-off-by: Olivier Matz <olivier.matz at 6wind.com<http://dpdk.org/ml/listinfo/dev>>



Acked-by: Bruce Richardson <bruce.richardson at intel.com<http://dpdk.org/ml/listinfo/dev>>


Hanoh

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-01-13 16:40 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-01  9:32 [dpdk-dev] [PATCH] mbuf: optimize first reference increment in rte_pktmbuf_attach Olivier Matz
2015-06-03 10:59 ` Bruce Richardson
2015-06-05  9:07   ` Olivier MATZ
2015-06-08 14:57 ` [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update Olivier Matz
2015-06-09 12:57   ` Bruce Richardson
2015-06-12 14:10     ` Thomas Monjalon
2015-12-27  9:39 Hanoch Haim (hhaim)
2016-01-04 13:53 ` Olivier MATZ
2016-01-04 14:43   ` Hanoch Haim (hhaim)
2016-01-05 10:57     ` Olivier MATZ
2016-01-05 11:11       ` Hanoch Haim (hhaim)
2016-01-05 12:12         ` Olivier MATZ
2016-01-13 11:48         ` Bruce Richardson
2016-01-13 16:28           ` Hanoch Haim (hhaim)
2016-01-13 16:40             ` Bruce Richardson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).