* [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue @ 2016-03-17 15:49 Lazaros Koromilas 2016-03-17 16:09 ` Mauricio Vásquez ` (2 more replies) 0 siblings, 3 replies; 19+ messages in thread From: Lazaros Koromilas @ 2016-03-17 15:49 UTC (permalink / raw) To: dev Issuing a zero objects dequeue with a single consumer has no effect. Doing so with multiple consumers, can get more than one thread to succeed the compare-and-set operation and observe starvation or even deadlock in the while loop that checks for preceding dequeues. The problematic piece of code when n = 0: cons_next = cons_head + n; success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next); The same is possible on the enqueue path. Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com> --- lib/librte_ring/rte_ring.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h index 943c97c..eb45e41 100644 --- a/lib/librte_ring/rte_ring.h +++ b/lib/librte_ring/rte_ring.h @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table, uint32_t mask = r->prod.mask; int ret; + /* Avoid the unnecessary cmpset operation below, which is also + * potentially harmful when n equals 0. */ + if (n == 0) + return 0; + /* move prod.head atomically */ do { /* Reset n to the initial burst count */ @@ -618,6 +623,11 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table, unsigned i, rep = 0; uint32_t mask = r->prod.mask; + /* Avoid the unnecessary cmpset operation below, which is also + * potentially harmful when n equals 0. */ + if (n == 0) + return 0; + /* move cons.head atomically */ do { /* Restore n as it may change every loop */ -- 1.9.1 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-17 15:49 [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue Lazaros Koromilas @ 2016-03-17 16:09 ` Mauricio Vásquez 2016-03-18 10:18 ` Bruce Richardson 2016-03-21 12:23 ` Olivier Matz 2016-03-25 11:15 ` Olivier Matz 2 siblings, 1 reply; 19+ messages in thread From: Mauricio Vásquez @ 2016-03-17 16:09 UTC (permalink / raw) To: Lazaros Koromilas; +Cc: dev Hi Lazaros, On Thu, Mar 17, 2016 at 4:49 PM, Lazaros Koromilas <l@nofutznetworks.com> wrote: > Issuing a zero objects dequeue with a single consumer has no effect. > Doing so with multiple consumers, can get more than one thread to succeed > the compare-and-set operation and observe starvation or even deadlock in > the while loop that checks for preceding dequeues. The problematic piece > of code when n = 0: > > cons_next = cons_head + n; > success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next); > > The same is possible on the enqueue path. > > Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com> > --- > lib/librte_ring/rte_ring.h | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h > index 943c97c..eb45e41 100644 > --- a/lib/librte_ring/rte_ring.h > +++ b/lib/librte_ring/rte_ring.h > @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * > const *obj_table, > uint32_t mask = r->prod.mask; > int ret; > > + /* Avoid the unnecessary cmpset operation below, which is also > + * potentially harmful when n equals 0. */ > + if (n == 0) > What about using unlikely here? > + return 0; > + > /* move prod.head atomically */ > do { > /* Reset n to the initial burst count */ > @@ -618,6 +623,11 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void > **obj_table, > unsigned i, rep = 0; > uint32_t mask = r->prod.mask; > > + /* Avoid the unnecessary cmpset operation below, which is also > + * potentially harmful when n equals 0. */ > + if (n == 0) > Also here. > + return 0; > + > /* move cons.head atomically */ > do { > /* Restore n as it may change every loop */ > -- > 1.9.1 > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-17 16:09 ` Mauricio Vásquez @ 2016-03-18 10:18 ` Bruce Richardson 2016-03-18 10:27 ` Olivier Matz 0 siblings, 1 reply; 19+ messages in thread From: Bruce Richardson @ 2016-03-18 10:18 UTC (permalink / raw) To: Mauricio Vásquez; +Cc: Lazaros Koromilas, dev On Thu, Mar 17, 2016 at 05:09:08PM +0100, Mauricio Vásquez wrote: > Hi Lazaros, > > On Thu, Mar 17, 2016 at 4:49 PM, Lazaros Koromilas <l@nofutznetworks.com> > wrote: > > > Issuing a zero objects dequeue with a single consumer has no effect. > > Doing so with multiple consumers, can get more than one thread to succeed > > the compare-and-set operation and observe starvation or even deadlock in > > the while loop that checks for preceding dequeues. The problematic piece > > of code when n = 0: > > > > cons_next = cons_head + n; > > success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next); > > > > The same is possible on the enqueue path. > > > > Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com> > > --- > > lib/librte_ring/rte_ring.h | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h > > index 943c97c..eb45e41 100644 > > --- a/lib/librte_ring/rte_ring.h > > +++ b/lib/librte_ring/rte_ring.h > > @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * > > const *obj_table, > > uint32_t mask = r->prod.mask; > > int ret; > > > > + /* Avoid the unnecessary cmpset operation below, which is also > > + * potentially harmful when n equals 0. */ > > + if (n == 0) > > > > What about using unlikely here? > Unless there is a measurable performance increase by adding in likely/unlikely I'd suggest avoiding it's use. In general, likely/unlikely should only be used for things like catestrophic errors because the penalty for taking the unlikely leg of the code can be quite severe. For normal stuff, where the code nearly always goes one way in the branch but occasionally goes the other, the hardware branch predictors generally do a good enough job. Just my 2c. /Bruce ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-18 10:18 ` Bruce Richardson @ 2016-03-18 10:27 ` Olivier Matz 2016-03-18 10:35 ` Bruce Richardson 2016-03-18 10:35 ` Thomas Monjalon 0 siblings, 2 replies; 19+ messages in thread From: Olivier Matz @ 2016-03-18 10:27 UTC (permalink / raw) To: Bruce Richardson, Mauricio Vásquez; +Cc: Lazaros Koromilas, dev Hi, On 03/18/2016 11:18 AM, Bruce Richardson wrote: >>> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h >>> index 943c97c..eb45e41 100644 >>> --- a/lib/librte_ring/rte_ring.h >>> +++ b/lib/librte_ring/rte_ring.h >>> @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * >>> const *obj_table, >>> uint32_t mask = r->prod.mask; >>> int ret; >>> >>> + /* Avoid the unnecessary cmpset operation below, which is also >>> + * potentially harmful when n equals 0. */ >>> + if (n == 0) >>> >> >> What about using unlikely here? >> > > Unless there is a measurable performance increase by adding in likely/unlikely > I'd suggest avoiding it's use. In general, likely/unlikely should only be used > for things like catestrophic errors because the penalty for taking the unlikely > leg of the code can be quite severe. For normal stuff, where the code nearly > always goes one way in the branch but occasionally goes the other, the hardware > branch predictors generally do a good enough job. Do you mean using likely/unlikely could be worst than not using it in this case? To me, using unlikely here is not a bad idea: it shows to the compiler and to the reader of the code that is case is not the usual case. Olivier ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-18 10:27 ` Olivier Matz @ 2016-03-18 10:35 ` Bruce Richardson 2016-03-18 10:35 ` Thomas Monjalon 1 sibling, 0 replies; 19+ messages in thread From: Bruce Richardson @ 2016-03-18 10:35 UTC (permalink / raw) To: Olivier Matz; +Cc: Mauricio Vásquez, Lazaros Koromilas, dev On Fri, Mar 18, 2016 at 11:27:18AM +0100, Olivier Matz wrote: > Hi, > > On 03/18/2016 11:18 AM, Bruce Richardson wrote: > >>> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h > >>> index 943c97c..eb45e41 100644 > >>> --- a/lib/librte_ring/rte_ring.h > >>> +++ b/lib/librte_ring/rte_ring.h > >>> @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * > >>> const *obj_table, > >>> uint32_t mask = r->prod.mask; > >>> int ret; > >>> > >>> + /* Avoid the unnecessary cmpset operation below, which is also > >>> + * potentially harmful when n equals 0. */ > >>> + if (n == 0) > >>> > >> > >> What about using unlikely here? > >> > > > > Unless there is a measurable performance increase by adding in likely/unlikely > > I'd suggest avoiding it's use. In general, likely/unlikely should only be used > > for things like catestrophic errors because the penalty for taking the unlikely > > leg of the code can be quite severe. For normal stuff, where the code nearly > > always goes one way in the branch but occasionally goes the other, the hardware > > branch predictors generally do a good enough job. > > Do you mean using likely/unlikely could be worst than not using it > in this case? > > To me, using unlikely here is not a bad idea: it shows to the compiler > and to the reader of the code that is case is not the usual case. > Hi Olivier, it might be worse if the user makes a lot of calls with n == 0. It almost certainly would depend upon the compiler. Overall, I'd rather see us err on the side of not putting in the calls unless there is a proven case to do so. I don't think the documentation benefit is huge here either, it's just standard parameter checking at the start of the function. /Bruce ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-18 10:27 ` Olivier Matz 2016-03-18 10:35 ` Bruce Richardson @ 2016-03-18 10:35 ` Thomas Monjalon 2016-03-18 12:47 ` Mauricio Vásquez 1 sibling, 1 reply; 19+ messages in thread From: Thomas Monjalon @ 2016-03-18 10:35 UTC (permalink / raw) To: dev, Bruce Richardson Cc: Olivier Matz, Mauricio Vásquez, Lazaros Koromilas 2016-03-18 11:27, Olivier Matz: > On 03/18/2016 11:18 AM, Bruce Richardson wrote: > >>> + /* Avoid the unnecessary cmpset operation below, which is also > >>> + * potentially harmful when n equals 0. */ > >>> + if (n == 0) > >>> > >> > >> What about using unlikely here? > >> > > > > Unless there is a measurable performance increase by adding in likely/unlikely > > I'd suggest avoiding it's use. In general, likely/unlikely should only be used > > for things like catestrophic errors because the penalty for taking the unlikely > > leg of the code can be quite severe. For normal stuff, where the code nearly > > always goes one way in the branch but occasionally goes the other, the hardware > > branch predictors generally do a good enough job. > > Do you mean using likely/unlikely could be worst than not using it > in this case? > > To me, using unlikely here is not a bad idea: it shows to the compiler > and to the reader of the code that is case is not the usual case. It would be nice to have a guideline section about likely/unlikely in doc/guides/contributing/design.rst Bruce gave a talk at Dublin about this kind of things. I'm sure he could contribute more design guidelines ;) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-18 10:35 ` Thomas Monjalon @ 2016-03-18 12:47 ` Mauricio Vásquez 2016-03-18 14:16 ` Bruce Richardson 0 siblings, 1 reply; 19+ messages in thread From: Mauricio Vásquez @ 2016-03-18 12:47 UTC (permalink / raw) To: Thomas Monjalon; +Cc: dev, Bruce Richardson, Olivier Matz, Lazaros Koromilas Hi, On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon <thomas.monjalon@6wind.com > wrote: > 2016-03-18 11:27, Olivier Matz: > > On 03/18/2016 11:18 AM, Bruce Richardson wrote: > > >>> + /* Avoid the unnecessary cmpset operation below, which is > also > > >>> + * potentially harmful when n equals 0. */ > > >>> + if (n == 0) > > >>> > > >> > > >> What about using unlikely here? > > >> > > > > > > Unless there is a measurable performance increase by adding in > likely/unlikely > > > I'd suggest avoiding it's use. In general, likely/unlikely should only > be used > > > for things like catestrophic errors because the penalty for taking the > unlikely > > > leg of the code can be quite severe. For normal stuff, where the code > nearly > > > always goes one way in the branch but occasionally goes the other, the > hardware > > > branch predictors generally do a good enough job. > > > > Do you mean using likely/unlikely could be worst than not using it > > in this case? > > > > To me, using unlikely here is not a bad idea: it shows to the compiler > > and to the reader of the code that is case is not the usual case. > > It would be nice to have a guideline section about likely/unlikely in > doc/guides/contributing/design.rst > > Bruce gave a talk at Dublin about this kind of things. > I'm sure he could contribute more design guidelines ;) > There is a small explanation in the section "Branch Prediction" of doc/guides/contributing/coding_style.rst, but I do not know if that is enough to understand when to use them. I've made a fast check and there are many PMDs that use them to check if number of packets is zero in the transmission function. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-18 12:47 ` Mauricio Vásquez @ 2016-03-18 14:16 ` Bruce Richardson 2016-03-21 17:47 ` Xie, Huawei 0 siblings, 1 reply; 19+ messages in thread From: Bruce Richardson @ 2016-03-18 14:16 UTC (permalink / raw) To: Mauricio Vásquez Cc: Thomas Monjalon, dev, Olivier Matz, Lazaros Koromilas On Fri, Mar 18, 2016 at 01:47:29PM +0100, Mauricio Vásquez wrote: > Hi, > > > On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon <thomas.monjalon@6wind.com > > wrote: > > > 2016-03-18 11:27, Olivier Matz: > > > On 03/18/2016 11:18 AM, Bruce Richardson wrote: > > > >>> + /* Avoid the unnecessary cmpset operation below, which is > > also > > > >>> + * potentially harmful when n equals 0. */ > > > >>> + if (n == 0) > > > >>> > > > >> > > > >> What about using unlikely here? > > > >> > > > > > > > > Unless there is a measurable performance increase by adding in > > likely/unlikely > > > > I'd suggest avoiding it's use. In general, likely/unlikely should only > > be used > > > > for things like catestrophic errors because the penalty for taking the > > unlikely > > > > leg of the code can be quite severe. For normal stuff, where the code > > nearly > > > > always goes one way in the branch but occasionally goes the other, the > > hardware > > > > branch predictors generally do a good enough job. > > > > > > Do you mean using likely/unlikely could be worst than not using it > > > in this case? > > > > > > To me, using unlikely here is not a bad idea: it shows to the compiler > > > and to the reader of the code that is case is not the usual case. > > > > It would be nice to have a guideline section about likely/unlikely in > > doc/guides/contributing/design.rst > > > > Bruce gave a talk at Dublin about this kind of things. > > I'm sure he could contribute more design guidelines ;) > > > > There is a small explanation in the section "Branch Prediction" of > doc/guides/contributing/coding_style.rst, but I do not know if that is > enough to understand when to use them. > > I've made a fast check and there are many PMDs that use them to check if > number of packets is zero in the transmission function. Yeah, and I wonder how many of those are actually necessary too :-) It's not a big deal either way, I just think the patch is fine as-is without the extra macros. /Bruce ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-18 14:16 ` Bruce Richardson @ 2016-03-21 17:47 ` Xie, Huawei 2016-03-22 10:13 ` Bruce Richardson 0 siblings, 1 reply; 19+ messages in thread From: Xie, Huawei @ 2016-03-21 17:47 UTC (permalink / raw) To: Richardson, Bruce, Mauricio Vásquez Cc: Thomas Monjalon, dev, Olivier Matz, Lazaros Koromilas On 3/18/2016 10:17 PM, Bruce Richardson wrote: > On Fri, Mar 18, 2016 at 01:47:29PM +0100, Mauricio Vásquez wrote: >> Hi, >> >> >> On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon <thomas.monjalon@6wind.com >>> wrote: >>> 2016-03-18 11:27, Olivier Matz: >>>> On 03/18/2016 11:18 AM, Bruce Richardson wrote: >>>>>>> + /* Avoid the unnecessary cmpset operation below, which is >>> also >>>>>>> + * potentially harmful when n equals 0. */ >>>>>>> + if (n == 0) >>>>>>> >>>>>> What about using unlikely here? >>>>>> >>>>> Unless there is a measurable performance increase by adding in >>> likely/unlikely >>>>> I'd suggest avoiding it's use. In general, likely/unlikely should only >>> be used >>>>> for things like catestrophic errors because the penalty for taking the >>> unlikely >>>>> leg of the code can be quite severe. For normal stuff, where the code >>> nearly >>>>> always goes one way in the branch but occasionally goes the other, the >>> hardware >>>>> branch predictors generally do a good enough job. >>>> Do you mean using likely/unlikely could be worst than not using it >>>> in this case? >>>> >>>> To me, using unlikely here is not a bad idea: it shows to the compiler >>>> and to the reader of the code that is case is not the usual case. >>> It would be nice to have a guideline section about likely/unlikely in >>> doc/guides/contributing/design.rst >>> >>> Bruce gave a talk at Dublin about this kind of things. >>> I'm sure he could contribute more design guidelines ;) >>> >> There is a small explanation in the section "Branch Prediction" of >> doc/guides/contributing/coding_style.rst, but I do not know if that is >> enough to understand when to use them. >> >> I've made a fast check and there are many PMDs that use them to check if >> number of packets is zero in the transmission function. > Yeah, and I wonder how many of those are actually necessary too :-) > > It's not a big deal either way, I just think the patch is fine as-is without > the extra macros. IMO we use likely/unlikely in two cases, catastrophic errors and the code nearly always goes one way, i.e, preferred/favored fast path. Likely/unlikely helps not only for branch predication but also for cache usage. The code generated for the likely path will directly follow the branch instruction. To me, it is reasonable enough to add unlikely for n == 0, which we don't expect to happen. I remember with/without likely, compiler could generate three kind of instructions. Didn't deep dive into it. > > /Bruce > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-21 17:47 ` Xie, Huawei @ 2016-03-22 10:13 ` Bruce Richardson 2016-03-22 14:38 ` Xie, Huawei 0 siblings, 1 reply; 19+ messages in thread From: Bruce Richardson @ 2016-03-22 10:13 UTC (permalink / raw) To: Xie, Huawei Cc: Mauricio Vásquez, Thomas Monjalon, dev, Olivier Matz, Lazaros Koromilas On Mon, Mar 21, 2016 at 05:47:44PM +0000, Xie, Huawei wrote: > On 3/18/2016 10:17 PM, Bruce Richardson wrote: > > On Fri, Mar 18, 2016 at 01:47:29PM +0100, Mauricio Vásquez wrote: > >> Hi, > >> > >> > >> On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon <thomas.monjalon@6wind.com > >>> wrote: > >>> 2016-03-18 11:27, Olivier Matz: > >>>> On 03/18/2016 11:18 AM, Bruce Richardson wrote: > >>>>>>> + /* Avoid the unnecessary cmpset operation below, which is > >>> also > >>>>>>> + * potentially harmful when n equals 0. */ > >>>>>>> + if (n == 0) > >>>>>>> > >>>>>> What about using unlikely here? > >>>>>> > >>>>> Unless there is a measurable performance increase by adding in > >>> likely/unlikely > >>>>> I'd suggest avoiding it's use. In general, likely/unlikely should only > >>> be used > >>>>> for things like catestrophic errors because the penalty for taking the > >>> unlikely > >>>>> leg of the code can be quite severe. For normal stuff, where the code > >>> nearly > >>>>> always goes one way in the branch but occasionally goes the other, the > >>> hardware > >>>>> branch predictors generally do a good enough job. > >>>> Do you mean using likely/unlikely could be worst than not using it > >>>> in this case? > >>>> > >>>> To me, using unlikely here is not a bad idea: it shows to the compiler > >>>> and to the reader of the code that is case is not the usual case. > >>> It would be nice to have a guideline section about likely/unlikely in > >>> doc/guides/contributing/design.rst > >>> > >>> Bruce gave a talk at Dublin about this kind of things. > >>> I'm sure he could contribute more design guidelines ;) > >>> > >> There is a small explanation in the section "Branch Prediction" of > >> doc/guides/contributing/coding_style.rst, but I do not know if that is > >> enough to understand when to use them. > >> > >> I've made a fast check and there are many PMDs that use them to check if > >> number of packets is zero in the transmission function. > > Yeah, and I wonder how many of those are actually necessary too :-) > > > > It's not a big deal either way, I just think the patch is fine as-is without > > the extra macros. > > IMO we use likely/unlikely in two cases, catastrophic errors and the > code nearly always goes one way, i.e, preferred/favored fast path. > Likely/unlikely helps not only for branch predication but also for cache For branch prediction, anything after the first time through the code path the prediction will be based on what happened before rather than any static hints in the code. > usage. The code generated for the likely path will directly follow the > branch instruction. To me, it is reasonable enough to add unlikely for n > == 0, which we don't expect to happen. > I remember with/without likely, compiler could generate three kind of > instructions. Didn't deep dive into it. > > > > > /Bruce > > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-22 10:13 ` Bruce Richardson @ 2016-03-22 14:38 ` Xie, Huawei 0 siblings, 0 replies; 19+ messages in thread From: Xie, Huawei @ 2016-03-22 14:38 UTC (permalink / raw) To: Richardson, Bruce Cc: Mauricio Vásquez, Thomas Monjalon, dev, Olivier Matz, Lazaros Koromilas On 3/22/2016 6:13 PM, Richardson, Bruce wrote: > On Mon, Mar 21, 2016 at 05:47:44PM +0000, Xie, Huawei wrote: >> On 3/18/2016 10:17 PM, Bruce Richardson wrote: >>> On Fri, Mar 18, 2016 at 01:47:29PM +0100, Mauricio Vásquez wrote: >>>> Hi, >>>> >>>> >>>> On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon <thomas.monjalon@6wind.com >>>>> wrote: >>>>> 2016-03-18 11:27, Olivier Matz: >>>>>> On 03/18/2016 11:18 AM, Bruce Richardson wrote: >>>>>>>>> + /* Avoid the unnecessary cmpset operation below, which is >>>>> also >>>>>>>>> + * potentially harmful when n equals 0. */ >>>>>>>>> + if (n == 0) >>>>>>>>> >>>>>>>> What about using unlikely here? >>>>>>>> >>>>>>> Unless there is a measurable performance increase by adding in >>>>> likely/unlikely >>>>>>> I'd suggest avoiding it's use. In general, likely/unlikely should only >>>>> be used >>>>>>> for things like catestrophic errors because the penalty for taking the >>>>> unlikely >>>>>>> leg of the code can be quite severe. For normal stuff, where the code >>>>> nearly >>>>>>> always goes one way in the branch but occasionally goes the other, the >>>>> hardware >>>>>>> branch predictors generally do a good enough job. >>>>>> Do you mean using likely/unlikely could be worst than not using it >>>>>> in this case? >>>>>> >>>>>> To me, using unlikely here is not a bad idea: it shows to the compiler >>>>>> and to the reader of the code that is case is not the usual case. >>>>> It would be nice to have a guideline section about likely/unlikely in >>>>> doc/guides/contributing/design.rst >>>>> >>>>> Bruce gave a talk at Dublin about this kind of things. >>>>> I'm sure he could contribute more design guidelines ;) >>>>> >>>> There is a small explanation in the section "Branch Prediction" of >>>> doc/guides/contributing/coding_style.rst, but I do not know if that is >>>> enough to understand when to use them. >>>> >>>> I've made a fast check and there are many PMDs that use them to check if >>>> number of packets is zero in the transmission function. >>> Yeah, and I wonder how many of those are actually necessary too :-) >>> >>> It's not a big deal either way, I just think the patch is fine as-is without >>> the extra macros. >> IMO we use likely/unlikely in two cases, catastrophic errors and the >> code nearly always goes one way, i.e, preferred/favored fast path. >> Likely/unlikely helps not only for branch predication but also for cache > For branch prediction, anything after the first time through the code path > the prediction will be based on what happened before rather than any static > hints in the code. Yes, maybe i didn't make myself clear? My main concern isn't about branch predication... >> usage. The code generated for the likely path will directly follow the >> branch instruction. To me, it is reasonable enough to add unlikely for n >> == 0, which we don't expect to happen. >> I remember with/without likely, compiler could generate three kind of >> instructions. Didn't deep dive into it. >> >>> /Bruce >>> ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-17 15:49 [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue Lazaros Koromilas 2016-03-17 16:09 ` Mauricio Vásquez @ 2016-03-21 12:23 ` Olivier Matz 2016-03-22 16:49 ` Thomas Monjalon 2016-03-25 11:15 ` Olivier Matz 2 siblings, 1 reply; 19+ messages in thread From: Olivier Matz @ 2016-03-21 12:23 UTC (permalink / raw) To: Lazaros Koromilas, dev Hi, On 03/17/2016 04:49 PM, Lazaros Koromilas wrote: > Issuing a zero objects dequeue with a single consumer has no effect. > Doing so with multiple consumers, can get more than one thread to succeed > the compare-and-set operation and observe starvation or even deadlock in > the while loop that checks for preceding dequeues. The problematic piece > of code when n = 0: > > cons_next = cons_head + n; > success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next); > > The same is possible on the enqueue path. > > Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-21 12:23 ` Olivier Matz @ 2016-03-22 16:49 ` Thomas Monjalon 0 siblings, 0 replies; 19+ messages in thread From: Thomas Monjalon @ 2016-03-22 16:49 UTC (permalink / raw) To: Lazaros Koromilas; +Cc: dev, Olivier Matz > > Issuing a zero objects dequeue with a single consumer has no effect. > > Doing so with multiple consumers, can get more than one thread to succeed > > the compare-and-set operation and observe starvation or even deadlock in > > the while loop that checks for preceding dequeues. The problematic piece > > of code when n = 0: > > > > cons_next = cons_head + n; > > success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next); > > > > The same is possible on the enqueue path. > > > > Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com> > > Acked-by: Olivier Matz <olivier.matz@6wind.com> Applied, thanks ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-17 15:49 [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue Lazaros Koromilas 2016-03-17 16:09 ` Mauricio Vásquez 2016-03-21 12:23 ` Olivier Matz @ 2016-03-25 11:15 ` Olivier Matz 2016-03-28 15:48 ` Lazaros Koromilas 2 siblings, 1 reply; 19+ messages in thread From: Olivier Matz @ 2016-03-25 11:15 UTC (permalink / raw) To: Lazaros Koromilas, dev; +Cc: Thomas Monjalon Hi Lazaros, On 03/17/2016 04:49 PM, Lazaros Koromilas wrote: > Issuing a zero objects dequeue with a single consumer has no effect. > Doing so with multiple consumers, can get more than one thread to succeed > the compare-and-set operation and observe starvation or even deadlock in > the while loop that checks for preceding dequeues. The problematic piece > of code when n = 0: > > cons_next = cons_head + n; > success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next); > > The same is possible on the enqueue path. Just a question about this patch (that has been applied). Thomas retitled the commit from your log message: ring: fix deadlock in zero object multi enqueue or dequeue http://dpdk.org/browse/dpdk/commit/?id=d0979646166e I think this patch does not fix a deadlock, or did I miss something? As explained in the following links, the ring may not perform well if several threads running on the same cpu use it: http://dpdk.org/ml/archives/dev/2013-November/000714.html http://www.dpdk.org/ml/archives/dev/2014-January/001070.html http://www.dpdk.org/ml/archives/dev/2014-January/001162.html http://dpdk.org/ml/archives/dev/2015-July/020659.html A deadlock could occur if the threads running on the same core have different priority. Regards, Olivier ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-25 11:15 ` Olivier Matz @ 2016-03-28 15:48 ` Lazaros Koromilas 2016-03-29 8:54 ` Bruce Richardson 0 siblings, 1 reply; 19+ messages in thread From: Lazaros Koromilas @ 2016-03-28 15:48 UTC (permalink / raw) To: Olivier Matz; +Cc: dev, Thomas Monjalon Hi Olivier, We could have two threads (running on different cores in the general case) that both succeed the cmpset operation. In the dequeue path, when n == 0, then cons_next == cons_head, and cmpset will always succeed. Now, if they both see an old r->cons.tail value from a previous dequeue, they can get stuck in the while (unlikely(r->cons.tail != cons_head)) loop. I tried, however, to reproduce (without the patch) and it seems that there is still a window for deadlock. I'm pasting some debug output below that shows two processes' state. It's two receivers doing interleaved mc_dequeue(32)/mc_dequeue(0), and one sender doing mp_enqueue(32) on the same ring. gdb --args ./ring-test -l0 --proc-type=primary gdb --args ./ring-test -l1 --proc-type=secondary gdb --args ./ring-test -l2 --proc-type=secondary -- tx This is what I would usually see, process 0 and 1 both stuck at the same state: 663 while (unlikely(r->cons.tail != cons_head)) { (gdb) p n $1 = 0 (gdb) p r->cons.tail $2 = 576416 (gdb) p cons_head $3 = 576448 (gdb) p cons_next $4 = 576448 But now I managed to get the two processes stuck at this state too. process 0: 663 while (unlikely(r->cons.tail != cons_head)) { (gdb) p n $1 = 32 (gdb) p r->cons.tail $2 = 254348832 (gdb) p cons_head $3 = 254348864 (gdb) p cons_next $4 = 254348896 proccess 1: 663 while (unlikely(r->cons.tail != cons_head)) { (gdb) p n $1 = 32 (gdb) p r->cons.tail $2 = 254348832 (gdb) p cons_head $3 = 254348896 (gdb) p cons_next $4 = 254348928 I haven't been able to trigger this with the patch so far, but it should be possible. Lazaros. On Fri, Mar 25, 2016 at 1:15 PM, Olivier Matz <olivier.matz@6wind.com> wrote: > Hi Lazaros, > > On 03/17/2016 04:49 PM, Lazaros Koromilas wrote: >> Issuing a zero objects dequeue with a single consumer has no effect. >> Doing so with multiple consumers, can get more than one thread to succeed >> the compare-and-set operation and observe starvation or even deadlock in >> the while loop that checks for preceding dequeues. The problematic piece >> of code when n = 0: >> >> cons_next = cons_head + n; >> success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next); >> >> The same is possible on the enqueue path. > > Just a question about this patch (that has been applied). Thomas > retitled the commit from your log message: > > ring: fix deadlock in zero object multi enqueue or dequeue > http://dpdk.org/browse/dpdk/commit/?id=d0979646166e > > I think this patch does not fix a deadlock, or did I miss something? > > As explained in the following links, the ring may not perform well > if several threads running on the same cpu use it: > > http://dpdk.org/ml/archives/dev/2013-November/000714.html > http://www.dpdk.org/ml/archives/dev/2014-January/001070.html > http://www.dpdk.org/ml/archives/dev/2014-January/001162.html > http://dpdk.org/ml/archives/dev/2015-July/020659.html > > A deadlock could occur if the threads running on the same core > have different priority. > > Regards, > Olivier ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-28 15:48 ` Lazaros Koromilas @ 2016-03-29 8:54 ` Bruce Richardson 2016-03-29 15:29 ` Olivier MATZ 0 siblings, 1 reply; 19+ messages in thread From: Bruce Richardson @ 2016-03-29 8:54 UTC (permalink / raw) To: Lazaros Koromilas; +Cc: Olivier Matz, dev, Thomas Monjalon On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote: > Hi Olivier, > > We could have two threads (running on different cores in the general > case) that both succeed the cmpset operation. In the dequeue path, > when n == 0, then cons_next == cons_head, and cmpset will always > succeed. Now, if they both see an old r->cons.tail value from a > previous dequeue, they can get stuck in the while Hi, I don't see how threads reading an "old r->cons.tail" value is even possible. The head and tail pointers on the ring are marked in the code as volatile, so all reads and writes to those values are always done from memory and not cached in registers. No deadlock should be possible on that while loop, unless a process crashes in the middle of a ring operation. Each thread which updates the head pointer from x to y, is responsible for updating the tail pointer in a similar manner. The loop ensures the tail updates are in the same order as the head updates. If you believe deadlock is possible, can you outline the sequence of operations which would lead to such a state, because I cannot see how it could occur without a crash inside one of the threads. /Bruce > (unlikely(r->cons.tail != cons_head)) loop. I tried, however, to > reproduce (without the patch) and it seems that there is still a > window for deadlock. > > I'm pasting some debug output below that shows two processes' state. > It's two receivers doing interleaved mc_dequeue(32)/mc_dequeue(0), and > one sender doing mp_enqueue(32) on the same ring. > > gdb --args ./ring-test -l0 --proc-type=primary > gdb --args ./ring-test -l1 --proc-type=secondary > gdb --args ./ring-test -l2 --proc-type=secondary -- tx > > This is what I would usually see, process 0 and 1 both stuck at the same state: > > 663 while (unlikely(r->cons.tail != cons_head)) { > (gdb) p n > $1 = 0 > (gdb) p r->cons.tail > $2 = 576416 > (gdb) p cons_head > $3 = 576448 > (gdb) p cons_next > $4 = 576448 > > But now I managed to get the two processes stuck at this state too. > > process 0: > 663 while (unlikely(r->cons.tail != cons_head)) { > (gdb) p n > $1 = 32 > (gdb) p r->cons.tail > $2 = 254348832 > (gdb) p cons_head > $3 = 254348864 > (gdb) p cons_next > $4 = 254348896 > > proccess 1: > 663 while (unlikely(r->cons.tail != cons_head)) { > (gdb) p n > $1 = 32 > (gdb) p r->cons.tail > $2 = 254348832 > (gdb) p cons_head > $3 = 254348896 > (gdb) p cons_next > $4 = 254348928 > Where is the thread which updated the head pointer from 832 to 864? That thread was the one which would update the tail pointer to 864 to allow your thread 0 to continue. /Bruce > I haven't been able to trigger this with the patch so far, but it > should be possible. > > Lazaros. > > On Fri, Mar 25, 2016 at 1:15 PM, Olivier Matz <olivier.matz@6wind.com> wrote: > > Hi Lazaros, > > > > On 03/17/2016 04:49 PM, Lazaros Koromilas wrote: > >> Issuing a zero objects dequeue with a single consumer has no effect. > >> Doing so with multiple consumers, can get more than one thread to succeed > >> the compare-and-set operation and observe starvation or even deadlock in > >> the while loop that checks for preceding dequeues. The problematic piece > >> of code when n = 0: > >> > >> cons_next = cons_head + n; > >> success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next); > >> > >> The same is possible on the enqueue path. > > > > Just a question about this patch (that has been applied). Thomas > > retitled the commit from your log message: > > > > ring: fix deadlock in zero object multi enqueue or dequeue > > http://dpdk.org/browse/dpdk/commit/?id=d0979646166e > > > > I think this patch does not fix a deadlock, or did I miss something? > > > > As explained in the following links, the ring may not perform well > > if several threads running on the same cpu use it: > > > > http://dpdk.org/ml/archives/dev/2013-November/000714.html > > http://www.dpdk.org/ml/archives/dev/2014-January/001070.html > > http://www.dpdk.org/ml/archives/dev/2014-January/001162.html > > http://dpdk.org/ml/archives/dev/2015-July/020659.html > > > > A deadlock could occur if the threads running on the same core > > have different priority. > > > > Regards, > > Olivier ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-29 8:54 ` Bruce Richardson @ 2016-03-29 15:29 ` Olivier MATZ 2016-03-29 16:04 ` Bruce Richardson 0 siblings, 1 reply; 19+ messages in thread From: Olivier MATZ @ 2016-03-29 15:29 UTC (permalink / raw) To: Bruce Richardson, Lazaros Koromilas; +Cc: dev, Thomas Monjalon Hi, On 03/29/2016 10:54 AM, Bruce Richardson wrote: > On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote: >> Hi Olivier, >> >> We could have two threads (running on different cores in the general >> case) that both succeed the cmpset operation. In the dequeue path, >> when n == 0, then cons_next == cons_head, and cmpset will always >> succeed. Now, if they both see an old r->cons.tail value from a >> previous dequeue, they can get stuck in the while > > Hi, > > I don't see how threads reading an "old r->cons.tail" value is even possible. > The head and tail pointers on the ring are marked in the code as volatile, so > all reads and writes to those values are always done from memory and not cached > in registers. No deadlock should be possible on that while loop, unless a > process crashes in the middle of a ring operation. Each thread which updates > the head pointer from x to y, is responsible for updating the tail pointer in > a similar manner. The loop ensures the tail updates are in the same order as the > head updates. > > If you believe deadlock is possible, can you outline the sequence of operations > which would lead to such a state, because I cannot see how it could occur without > a crash inside one of the threads. I think the deadlock Lazaros describes could occur in the following condition: current ring state r->prod.head = 0 r->prod.tail = 0 core 0 core 1 ==================================================================== enqueue 0 object cmpset(&r->prod.head, 0, 0) core 0 is interrupted here enqueue 1 object cmpset(&r->prod.head, 0, 1) copy the objects in box 0 while (r->prod.tail != prod_head)) r->prod.tail = prod_next copy 0 object (-> nothing to do) while (r->prod.tail != prod_head)) <loop forever> I think this issue is indeed fixed by Lazaros' patch (I missed it in previous review). However, I don't think this deadlock could happen once we avoided the (n == 0) case. Olivier ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-29 15:29 ` Olivier MATZ @ 2016-03-29 16:04 ` Bruce Richardson 2016-03-29 17:35 ` Lazaros Koromilas 0 siblings, 1 reply; 19+ messages in thread From: Bruce Richardson @ 2016-03-29 16:04 UTC (permalink / raw) To: Olivier MATZ; +Cc: Lazaros Koromilas, dev, Thomas Monjalon On Tue, Mar 29, 2016 at 05:29:12PM +0200, Olivier MATZ wrote: > Hi, > > > On 03/29/2016 10:54 AM, Bruce Richardson wrote: > >On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote: > >>Hi Olivier, > >> > >>We could have two threads (running on different cores in the general > >>case) that both succeed the cmpset operation. In the dequeue path, > >>when n == 0, then cons_next == cons_head, and cmpset will always > >>succeed. Now, if they both see an old r->cons.tail value from a > >>previous dequeue, they can get stuck in the while > > > >Hi, > > > >I don't see how threads reading an "old r->cons.tail" value is even possible. > >The head and tail pointers on the ring are marked in the code as volatile, so > >all reads and writes to those values are always done from memory and not cached > >in registers. No deadlock should be possible on that while loop, unless a > >process crashes in the middle of a ring operation. Each thread which updates > >the head pointer from x to y, is responsible for updating the tail pointer in > >a similar manner. The loop ensures the tail updates are in the same order as the > >head updates. > > > >If you believe deadlock is possible, can you outline the sequence of operations > >which would lead to such a state, because I cannot see how it could occur without > >a crash inside one of the threads. > > I think the deadlock Lazaros describes could occur in the following > condition: > > current ring state > r->prod.head = 0 > r->prod.tail = 0 > > core 0 core 1 > ==================================================================== > enqueue 0 object > cmpset(&r->prod.head, 0, 0) > core 0 is interrupted here > enqueue 1 object > cmpset(&r->prod.head, 0, 1) > copy the objects in box 0 > while (r->prod.tail != prod_head)) > r->prod.tail = prod_next > copy 0 object (-> nothing to do) > while (r->prod.tail != prod_head)) > <loop forever> > > > I think this issue is indeed fixed by Lazaros' patch (I missed it > in previous review). However, I don't think this deadlock could > happen once we avoided the (n == 0) case. > Yes, I agree with your scenario. However, I thought the issue was occuring with non-zero updates, but I may well be mistaken. If it's fixed now, all good... :-) /Bruce ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue 2016-03-29 16:04 ` Bruce Richardson @ 2016-03-29 17:35 ` Lazaros Koromilas 0 siblings, 0 replies; 19+ messages in thread From: Lazaros Koromilas @ 2016-03-29 17:35 UTC (permalink / raw) To: Bruce Richardson; +Cc: Olivier MATZ, dev, Thomas Monjalon On Tue, Mar 29, 2016 at 7:04 PM, Bruce Richardson <bruce.richardson@intel.com> wrote: > > On Tue, Mar 29, 2016 at 05:29:12PM +0200, Olivier MATZ wrote: > > Hi, > > > > > > On 03/29/2016 10:54 AM, Bruce Richardson wrote: > > >On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote: > > >>Hi Olivier, > > >> > > >>We could have two threads (running on different cores in the general > > >>case) that both succeed the cmpset operation. In the dequeue path, > > >>when n == 0, then cons_next == cons_head, and cmpset will always > > >>succeed. Now, if they both see an old r->cons.tail value from a > > >>previous dequeue, they can get stuck in the while > > > > > >Hi, > > > > > >I don't see how threads reading an "old r->cons.tail" value is even possible. > > >The head and tail pointers on the ring are marked in the code as volatile, so > > >all reads and writes to those values are always done from memory and not cached > > >in registers. No deadlock should be possible on that while loop, unless a > > >process crashes in the middle of a ring operation. Each thread which updates > > >the head pointer from x to y, is responsible for updating the tail pointer in > > >a similar manner. The loop ensures the tail updates are in the same order as the > > >head updates. > > > > > >If you believe deadlock is possible, can you outline the sequence of operations > > >which would lead to such a state, because I cannot see how it could occur without > > >a crash inside one of the threads. > > > > I think the deadlock Lazaros describes could occur in the following > > condition: > > > > current ring state > > r->prod.head = 0 > > r->prod.tail = 0 > > > > core 0 core 1 > > ==================================================================== > > enqueue 0 object > > cmpset(&r->prod.head, 0, 0) > > core 0 is interrupted here > > enqueue 1 object > > cmpset(&r->prod.head, 0, 1) > > copy the objects in box 0 > > while (r->prod.tail != prod_head)) > > r->prod.tail = prod_next > > copy 0 object (-> nothing to do) > > while (r->prod.tail != prod_head)) > > <loop forever> > > > > > > I think this issue is indeed fixed by Lazaros' patch (I missed it > > in previous review). However, I don't think this deadlock could > > happen once we avoided the (n == 0) case. > > > Yes, I agree with your scenario. However, I thought the issue was occuring with > non-zero updates, but I may well be mistaken. > If it's fixed now, all good... :-) > > /Bruce Hi all, Bruce, I thought it could be still possible because of my threads being stuck inside two dequeue(32) calls. But haven't been able to reproduce it with non-zero dequeues. And by trying to find a scenario of my own, it seems that at least one dequeue(0) is needed, similarly to Olivier's example on the enqueue path. Thanks, Lazaros. ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2016-03-29 17:35 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-03-17 15:49 [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue Lazaros Koromilas 2016-03-17 16:09 ` Mauricio Vásquez 2016-03-18 10:18 ` Bruce Richardson 2016-03-18 10:27 ` Olivier Matz 2016-03-18 10:35 ` Bruce Richardson 2016-03-18 10:35 ` Thomas Monjalon 2016-03-18 12:47 ` Mauricio Vásquez 2016-03-18 14:16 ` Bruce Richardson 2016-03-21 17:47 ` Xie, Huawei 2016-03-22 10:13 ` Bruce Richardson 2016-03-22 14:38 ` Xie, Huawei 2016-03-21 12:23 ` Olivier Matz 2016-03-22 16:49 ` Thomas Monjalon 2016-03-25 11:15 ` Olivier Matz 2016-03-28 15:48 ` Lazaros Koromilas 2016-03-29 8:54 ` Bruce Richardson 2016-03-29 15:29 ` Olivier MATZ 2016-03-29 16:04 ` Bruce Richardson 2016-03-29 17:35 ` Lazaros Koromilas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).