DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
@ 2016-04-04 14:45 Tomasz Kulasek
  2016-04-04 15:34 ` Ananyev, Konstantin
  2016-04-04 19:06 ` Ananyev, Konstantin
  0 siblings, 2 replies; 8+ messages in thread
From: Tomasz Kulasek @ 2016-04-04 14:45 UTC (permalink / raw)
  To: dev

It seems that with gcc >5.x and -O2/-O3 optimization breaks packet grouping
algorithm.

When last packet pointer "lp" and "pnum->u64" buffer points the same
memory buffer, high optimization can cause unpredictable results. It seems
that assignment of precalculated group sizes may interfere with
initialization of new group size when lp points value inside current group
and didn't should be changed.

With gcc >5.x and optimization we cannot be sure which assignment will be
done first, so the group size can be counted incorrectly.

This patch eliminates intersection of assignment of initial group size
(lp[0] = 1) and precalculated group sizes when gptbl[v].idx < 4.

Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
---
 examples/l3fwd/l3fwd_sse.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/l3fwd/l3fwd_sse.h b/examples/l3fwd/l3fwd_sse.h
index f9cf50a..1afa1f0 100644
--- a/examples/l3fwd/l3fwd_sse.h
+++ b/examples/l3fwd/l3fwd_sse.h
@@ -283,9 +283,9 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, __m128i dp1, __m128i dp2)
 
 	/* if dest port value has changed. */
 	if (v != GRPMSK) {
-		lp = pnum->u16 + gptbl[v].idx;
-		lp[0] = 1;
 		pnum->u64 = gptbl[v].pnum;
+		pnum->u16[FWDSTEP] = 1;
+		lp = pnum->u16 + gptbl[v].idx;
 	}
 
 	return lp;
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
  2016-04-04 14:45 [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x Tomasz Kulasek
@ 2016-04-04 15:34 ` Ananyev, Konstantin
  2016-04-04 15:51   ` De Lara Guarch, Pablo
  2016-04-04 16:20   ` Kulasek, TomaszX
  2016-04-04 19:06 ` Ananyev, Konstantin
  1 sibling, 2 replies; 8+ messages in thread
From: Ananyev, Konstantin @ 2016-04-04 15:34 UTC (permalink / raw)
  To: Kulasek, TomaszX; +Cc: dev

Hi Tomasz,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tomasz Kulasek
> Sent: Monday, April 04, 2016 3:45 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
> 
> It seems that with gcc >5.x and -O2/-O3 optimization breaks packet grouping
> algorithm.
> 
> When last packet pointer "lp" and "pnum->u64" buffer points the same
> memory buffer, high optimization can cause unpredictable results. It seems
> that assignment of precalculated group sizes may interfere with
> initialization of new group size when lp points value inside current group
> and didn't should be changed.
> 
> With gcc >5.x and optimization we cannot be sure which assignment will be
> done first, so the group size can be counted incorrectly.
> 
> This patch eliminates intersection of assignment of initial group size
> (lp[0] = 1) and precalculated group sizes when gptbl[v].idx < 4.
> 
> Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
> 
> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> ---
>  examples/l3fwd/l3fwd_sse.h |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/examples/l3fwd/l3fwd_sse.h b/examples/l3fwd/l3fwd_sse.h
> index f9cf50a..1afa1f0 100644
> --- a/examples/l3fwd/l3fwd_sse.h
> +++ b/examples/l3fwd/l3fwd_sse.h
> @@ -283,9 +283,9 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, __m128i dp1, __m128i dp2)
> 
>  	/* if dest port value has changed. */
>  	if (v != GRPMSK) {
> -		lp = pnum->u16 + gptbl[v].idx;
> -		lp[0] = 1;
>  		pnum->u64 = gptbl[v].pnum;
> +		pnum->u16[FWDSTEP] = 1;

Hmm, but  FWDSTEP and gptbl[v].idx are not always equal.
Actually could you explain a bit more - what exactly is reordered by gcc 5.x,
and how to reproduce it?
i.e what sequence of input packets will trigger an error? 
Konstantin

> +		lp = pnum->u16 + gptbl[v].idx;
>  	}
> 
>  	return lp;
> --
> 1.7.9.5

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
  2016-04-04 15:34 ` Ananyev, Konstantin
@ 2016-04-04 15:51   ` De Lara Guarch, Pablo
  2016-04-04 16:20   ` Kulasek, TomaszX
  1 sibling, 0 replies; 8+ messages in thread
From: De Lara Guarch, Pablo @ 2016-04-04 15:51 UTC (permalink / raw)
  To: Ananyev, Konstantin, Kulasek, TomaszX; +Cc: dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev,
> Konstantin
> Sent: Monday, April 04, 2016 4:35 PM
> To: Kulasek, TomaszX
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
> 
> Hi Tomasz,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tomasz Kulasek
> > Sent: Monday, April 04, 2016 3:45 PM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
> >
> > It seems that with gcc >5.x and -O2/-O3 optimization breaks packet
> grouping
> > algorithm.
> >
> > When last packet pointer "lp" and "pnum->u64" buffer points the same
> > memory buffer, high optimization can cause unpredictable results. It seems
> > that assignment of precalculated group sizes may interfere with
> > initialization of new group size when lp points value inside current group
> > and didn't should be changed.
> >
> > With gcc >5.x and optimization we cannot be sure which assignment will be
> > done first, so the group size can be counted incorrectly.
> >
> > This patch eliminates intersection of assignment of initial group size
> > (lp[0] = 1) and precalculated group sizes when gptbl[v].idx < 4.
> >
> > Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
> >
> > Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> > ---
> >  examples/l3fwd/l3fwd_sse.h |    4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/examples/l3fwd/l3fwd_sse.h b/examples/l3fwd/l3fwd_sse.h
> > index f9cf50a..1afa1f0 100644
> > --- a/examples/l3fwd/l3fwd_sse.h
> > +++ b/examples/l3fwd/l3fwd_sse.h
> > @@ -283,9 +283,9 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t
> *lp, __m128i dp1, __m128i dp2)
> >
> >  	/* if dest port value has changed. */
> >  	if (v != GRPMSK) {
> > -		lp = pnum->u16 + gptbl[v].idx;
> > -		lp[0] = 1;
> >  		pnum->u64 = gptbl[v].pnum;
> > +		pnum->u16[FWDSTEP] = 1;
> 
> Hmm, but  FWDSTEP and gptbl[v].idx are not always equal.
> Actually could you explain a bit more - what exactly is reordered by gcc 5.x,
> and how to reproduce it?
> i.e what sequence of input packets will trigger an error?

Hi Konstantin,

I could see the issue when having two flows in one port, one going to port 0 and the other to port 1 (using Exact Match).
There is no issue when there is just one flow per port, using an older gcc version (< 5.0) or using O0/O1 (and of course, using LPM).

Pablo


> Konstantin
> 
> > +		lp = pnum->u16 + gptbl[v].idx;
> >  	}
> >
> >  	return lp;
> > --
> > 1.7.9.5

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
  2016-04-04 15:34 ` Ananyev, Konstantin
  2016-04-04 15:51   ` De Lara Guarch, Pablo
@ 2016-04-04 16:20   ` Kulasek, TomaszX
  2016-04-04 19:05     ` Ananyev, Konstantin
  1 sibling, 1 reply; 8+ messages in thread
From: Kulasek, TomaszX @ 2016-04-04 16:20 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev

Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Monday, April 4, 2016 17:35
> To: Kulasek, TomaszX <tomaszx.kulasek@intel.com>
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
> 
> Hi Tomasz,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tomasz Kulasek
> > Sent: Monday, April 04, 2016 3:45 PM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
> >
> > It seems that with gcc >5.x and -O2/-O3 optimization breaks packet
> > grouping algorithm.
> >
> > When last packet pointer "lp" and "pnum->u64" buffer points the same
> > memory buffer, high optimization can cause unpredictable results. It
> > seems that assignment of precalculated group sizes may interfere with
> > initialization of new group size when lp points value inside current
> > group and didn't should be changed.
> >
> > With gcc >5.x and optimization we cannot be sure which assignment will
> > be done first, so the group size can be counted incorrectly.
> >
> > This patch eliminates intersection of assignment of initial group size
> > (lp[0] = 1) and precalculated group sizes when gptbl[v].idx < 4.
> >
> > Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
> >
> > Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> > ---
> >  examples/l3fwd/l3fwd_sse.h |    4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/examples/l3fwd/l3fwd_sse.h b/examples/l3fwd/l3fwd_sse.h
> > index f9cf50a..1afa1f0 100644
> > --- a/examples/l3fwd/l3fwd_sse.h
> > +++ b/examples/l3fwd/l3fwd_sse.h
> > @@ -283,9 +283,9 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t
> > *lp, __m128i dp1, __m128i dp2)
> >
> >  	/* if dest port value has changed. */
> >  	if (v != GRPMSK) {
> > -		lp = pnum->u16 + gptbl[v].idx;
> > -		lp[0] = 1;
> >  		pnum->u64 = gptbl[v].pnum;
> > +		pnum->u16[FWDSTEP] = 1;
> 
> Hmm, but  FWDSTEP and gptbl[v].idx are not always equal.
> Actually could you explain a bit more - what exactly is reordered by gcc
> 5.x, and how to reproduce it?
> i.e what sequence of input packets will trigger an error?
> Konstantin
> 
> > +		lp = pnum->u16 + gptbl[v].idx;
> >  	}
> >
> >  	return lp;
> > --
> > 1.7.9.5


Eg. For this case, when group is changed:

	{
		/* 0xb: a == b, b == c, c != d, d == e */
		.pnum = UINT64_C(0x0002000100020003),
		.idx = 3,
		.lpv = 2,
	},

We expect:

	pnum->u16 = { 3, 2, 1, 2, x }
	lp = pnum->u16 + 3;
	// should be lp[0] == 2

but for gcc 5.2

	lp = pnum->u16 + gptbl[v].idx;
	lp[0] = 1;
	pnum->u64 = gptbl[v].pnum;

gives, for some reason lp[0] == 1, even if pnum->u16[3] == 2.

It causes, that group is shorter and fails trying to send next group with messy length.

We should set lp[0] = 1 only when needed (gptbl[v].idx == 4), so this is why I set pnum->u16[4] = 1. I set it up always to prevent condition. For idx < 4 we don't need to set lp[0].

The problem is that both pointers operates on the same memory buffer and, it seems like gcc optimization will produce (it is wrong):

	lp = pnum->u16 + gptbl[v].idx;
	pnum->u64 = gptbl[v].pnum;
	lp[0] = 1;

except:

	lp = pnum->u16 + gptbl[v].idx;
	lp[0] = 1;
	pnum->u64 = gptbl[v].pnum;

This issue is with gcc 5.x and application seems to fail for the patterns where gptbl[v].idx < 4.

Tomasz

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
  2016-04-04 16:20   ` Kulasek, TomaszX
@ 2016-04-04 19:05     ` Ananyev, Konstantin
  2016-04-05 12:02       ` Kulasek, TomaszX
  0 siblings, 1 reply; 8+ messages in thread
From: Ananyev, Konstantin @ 2016-04-04 19:05 UTC (permalink / raw)
  To: Kulasek, TomaszX; +Cc: dev



> -----Original Message-----
> From: Kulasek, TomaszX
> Sent: Monday, April 04, 2016 5:20 PM
> To: Ananyev, Konstantin
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
> 
> Hi Konstantin,
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Monday, April 4, 2016 17:35
> > To: Kulasek, TomaszX <tomaszx.kulasek@intel.com>
> > Cc: dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
> >
> > Hi Tomasz,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tomasz Kulasek
> > > Sent: Monday, April 04, 2016 3:45 PM
> > > To: dev@dpdk.org
> > > Subject: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
> > >
> > > It seems that with gcc >5.x and -O2/-O3 optimization breaks packet
> > > grouping algorithm.
> > >
> > > When last packet pointer "lp" and "pnum->u64" buffer points the same
> > > memory buffer, high optimization can cause unpredictable results. It
> > > seems that assignment of precalculated group sizes may interfere with
> > > initialization of new group size when lp points value inside current
> > > group and didn't should be changed.
> > >
> > > With gcc >5.x and optimization we cannot be sure which assignment will
> > > be done first, so the group size can be counted incorrectly.
> > >
> > > This patch eliminates intersection of assignment of initial group size
> > > (lp[0] = 1) and precalculated group sizes when gptbl[v].idx < 4.
> > >
> > > Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
> > >
> > > Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> > > ---
> > >  examples/l3fwd/l3fwd_sse.h |    4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/examples/l3fwd/l3fwd_sse.h b/examples/l3fwd/l3fwd_sse.h
> > > index f9cf50a..1afa1f0 100644
> > > --- a/examples/l3fwd/l3fwd_sse.h
> > > +++ b/examples/l3fwd/l3fwd_sse.h
> > > @@ -283,9 +283,9 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t
> > > *lp, __m128i dp1, __m128i dp2)
> > >
> > >  	/* if dest port value has changed. */
> > >  	if (v != GRPMSK) {
> > > -		lp = pnum->u16 + gptbl[v].idx;
> > > -		lp[0] = 1;
> > >  		pnum->u64 = gptbl[v].pnum;
> > > +		pnum->u16[FWDSTEP] = 1;
> >
> > Hmm, but  FWDSTEP and gptbl[v].idx are not always equal.
> > Actually could you explain a bit more - what exactly is reordered by gcc
> > 5.x, and how to reproduce it?
> > i.e what sequence of input packets will trigger an error?
> > Konstantin
> >
> > > +		lp = pnum->u16 + gptbl[v].idx;
> > >  	}
> > >
> > >  	return lp;
> > > --
> > > 1.7.9.5
> 
> 
> Eg. For this case, when group is changed:
> 
> 	{
> 		/* 0xb: a == b, b == c, c != d, d == e */
> 		.pnum = UINT64_C(0x0002000100020003),
> 		.idx = 3,
> 		.lpv = 2,
> 	},
> 
> We expect:
> 
> 	pnum->u16 = { 3, 2, 1, 2, x }
> 	lp = pnum->u16 + 3;
> 	// should be lp[0] == 2
> 
> but for gcc 5.2
> 
> 	lp = pnum->u16 + gptbl[v].idx;
> 	lp[0] = 1;
> 	pnum->u64 = gptbl[v].pnum;
> 
> gives, for some reason lp[0] == 1, even if pnum->u16[3] == 2.
> 
> It causes, that group is shorter and fails trying to send next group with messy length.
> 
> We should set lp[0] = 1 only when needed (gptbl[v].idx == 4), so this is why I set pnum->u16[4] = 1. I set it up always to prevent
> condition. For idx < 4 we don't need to set lp[0].
> 
> The problem is that both pointers operates on the same memory buffer and, it seems like gcc optimization will produce (it is wrong):
> 
> 	lp = pnum->u16 + gptbl[v].idx;
> 	pnum->u64 = gptbl[v].pnum;
> 	lp[0] = 1;
> 
> except:
> 
> 	lp = pnum->u16 + gptbl[v].idx;
> 	lp[0] = 1;
> 	pnum->u64 = gptbl[v].pnum;
> 
> This issue is with gcc 5.x and application seems to fail for the patterns where gptbl[v].idx < 4.


Thanks for explanation Tomasz.
So it reordered:
lp[0] = 1;
pnum->u64 = gptbl[v].pnum;
correct?
My first thought was to insert a rte_complier_barrier() between these two lines,
but actually your approach looks cleaner. 
Konstantin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
  2016-04-04 14:45 [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x Tomasz Kulasek
  2016-04-04 15:34 ` Ananyev, Konstantin
@ 2016-04-04 19:06 ` Ananyev, Konstantin
  2016-04-06  9:27   ` Thomas Monjalon
  1 sibling, 1 reply; 8+ messages in thread
From: Ananyev, Konstantin @ 2016-04-04 19:06 UTC (permalink / raw)
  To: Kulasek, TomaszX, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tomasz Kulasek
> Sent: Monday, April 04, 2016 3:45 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
> 
> It seems that with gcc >5.x and -O2/-O3 optimization breaks packet grouping
> algorithm.
> 
> When last packet pointer "lp" and "pnum->u64" buffer points the same
> memory buffer, high optimization can cause unpredictable results. It seems
> that assignment of precalculated group sizes may interfere with
> initialization of new group size when lp points value inside current group
> and didn't should be changed.
> 
> With gcc >5.x and optimization we cannot be sure which assignment will be
> done first, so the group size can be counted incorrectly.
> 
> This patch eliminates intersection of assignment of initial group size
> (lp[0] = 1) and precalculated group sizes when gptbl[v].idx < 4.
> 
> Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
> 
> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> ---
>  examples/l3fwd/l3fwd_sse.h |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/examples/l3fwd/l3fwd_sse.h b/examples/l3fwd/l3fwd_sse.h
> index f9cf50a..1afa1f0 100644
> --- a/examples/l3fwd/l3fwd_sse.h
> +++ b/examples/l3fwd/l3fwd_sse.h
> @@ -283,9 +283,9 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, __m128i dp1, __m128i dp2)
> 
>  	/* if dest port value has changed. */
>  	if (v != GRPMSK) {
> -		lp = pnum->u16 + gptbl[v].idx;
> -		lp[0] = 1;
>  		pnum->u64 = gptbl[v].pnum;
> +		pnum->u16[FWDSTEP] = 1;
> +		lp = pnum->u16 + gptbl[v].idx;
>  	}
> 
>  	return lp;
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 1.7.9.5

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
  2016-04-04 19:05     ` Ananyev, Konstantin
@ 2016-04-05 12:02       ` Kulasek, TomaszX
  0 siblings, 0 replies; 8+ messages in thread
From: Kulasek, TomaszX @ 2016-04-05 12:02 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev



> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Monday, April 4, 2016 21:05
> To: Kulasek, TomaszX <tomaszx.kulasek@intel.com>
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
> 
> 
> 
> > -----Original Message-----
> > From: Kulasek, TomaszX
> > Sent: Monday, April 04, 2016 5:20 PM
> > To: Ananyev, Konstantin
> > Cc: dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc
> > 5.x
> >
> > Hi Konstantin,
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Monday, April 4, 2016 17:35
> > > To: Kulasek, TomaszX <tomaszx.kulasek@intel.com>
> > > Cc: dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with
> > > gcc 5.x
> > >
> > > Hi Tomasz,
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tomasz
> > > > Kulasek
> > > > Sent: Monday, April 04, 2016 3:45 PM
> > > > To: dev@dpdk.org
> > > > Subject: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc
> > > > 5.x
> > > >
> > > > It seems that with gcc >5.x and -O2/-O3 optimization breaks packet
> > > > grouping algorithm.
> > > >
> > > > When last packet pointer "lp" and "pnum->u64" buffer points the
> > > > same memory buffer, high optimization can cause unpredictable
> > > > results. It seems that assignment of precalculated group sizes may
> > > > interfere with initialization of new group size when lp points
> > > > value inside current group and didn't should be changed.
> > > >
> > > > With gcc >5.x and optimization we cannot be sure which assignment
> > > > will be done first, so the group size can be counted incorrectly.
> > > >
> > > > This patch eliminates intersection of assignment of initial group
> > > > size (lp[0] = 1) and precalculated group sizes when gptbl[v].idx <
> 4.
> > > >
> > > > Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
> > > >
> > > > Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> > > > ---
> > > >  examples/l3fwd/l3fwd_sse.h |    4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/examples/l3fwd/l3fwd_sse.h
> > > > b/examples/l3fwd/l3fwd_sse.h index f9cf50a..1afa1f0 100644
> > > > --- a/examples/l3fwd/l3fwd_sse.h
> > > > +++ b/examples/l3fwd/l3fwd_sse.h
> > > > @@ -283,9 +283,9 @@ port_groupx4(uint16_t pn[FWDSTEP + 1],
> > > > uint16_t *lp, __m128i dp1, __m128i dp2)
> > > >
> > > >  	/* if dest port value has changed. */
> > > >  	if (v != GRPMSK) {
> > > > -		lp = pnum->u16 + gptbl[v].idx;
> > > > -		lp[0] = 1;
> > > >  		pnum->u64 = gptbl[v].pnum;
> > > > +		pnum->u16[FWDSTEP] = 1;
> > >
> > > Hmm, but  FWDSTEP and gptbl[v].idx are not always equal.
> > > Actually could you explain a bit more - what exactly is reordered by
> > > gcc 5.x, and how to reproduce it?
> > > i.e what sequence of input packets will trigger an error?
> > > Konstantin
> > >
> > > > +		lp = pnum->u16 + gptbl[v].idx;
> > > >  	}
> > > >
> > > >  	return lp;
> > > > --
> > > > 1.7.9.5
> >
> >
> > Eg. For this case, when group is changed:
> >
> > 	{
> > 		/* 0xb: a == b, b == c, c != d, d == e */
> > 		.pnum = UINT64_C(0x0002000100020003),
> > 		.idx = 3,
> > 		.lpv = 2,
> > 	},
> >
> > We expect:
> >
> > 	pnum->u16 = { 3, 2, 1, 2, x }
> > 	lp = pnum->u16 + 3;
> > 	// should be lp[0] == 2
> >
> > but for gcc 5.2
> >
> > 	lp = pnum->u16 + gptbl[v].idx;
> > 	lp[0] = 1;
> > 	pnum->u64 = gptbl[v].pnum;
> >
> > gives, for some reason lp[0] == 1, even if pnum->u16[3] == 2.
> >
> > It causes, that group is shorter and fails trying to send next group
> with messy length.
> >
> > We should set lp[0] = 1 only when needed (gptbl[v].idx == 4), so this
> > is why I set pnum->u16[4] = 1. I set it up always to prevent condition.
> For idx < 4 we don't need to set lp[0].
> >
> > The problem is that both pointers operates on the same memory buffer
> and, it seems like gcc optimization will produce (it is wrong):
> >
> > 	lp = pnum->u16 + gptbl[v].idx;
> > 	pnum->u64 = gptbl[v].pnum;
> > 	lp[0] = 1;
> >
> > except:
> >
> > 	lp = pnum->u16 + gptbl[v].idx;
> > 	lp[0] = 1;
> > 	pnum->u64 = gptbl[v].pnum;
> >
> > This issue is with gcc 5.x and application seems to fail for the
> patterns where gptbl[v].idx < 4.
> 
> 
> Thanks for explanation Tomasz.
> So it reordered:
> lp[0] = 1;
> pnum->u64 = gptbl[v].pnum;
> correct?
> My first thought was to insert a rte_complier_barrier() between these two
> lines, but actually your approach looks cleaner.
> Konstantin

Yes.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x
  2016-04-04 19:06 ` Ananyev, Konstantin
@ 2016-04-06  9:27   ` Thomas Monjalon
  0 siblings, 0 replies; 8+ messages in thread
From: Thomas Monjalon @ 2016-04-06  9:27 UTC (permalink / raw)
  To: Kulasek, TomaszX; +Cc: dev, Ananyev, Konstantin

> > It seems that with gcc >5.x and -O2/-O3 optimization breaks packet grouping
> > algorithm.
> > 
> > When last packet pointer "lp" and "pnum->u64" buffer points the same
> > memory buffer, high optimization can cause unpredictable results. It seems
> > that assignment of precalculated group sizes may interfere with
> > initialization of new group size when lp points value inside current group
> > and didn't should be changed.
> > 
> > With gcc >5.x and optimization we cannot be sure which assignment will be
> > done first, so the group size can be counted incorrectly.
> > 
> > This patch eliminates intersection of assignment of initial group size
> > (lp[0] = 1) and precalculated group sizes when gptbl[v].idx < 4.
> > 
> > Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
> > 
> > Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> 
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Applied, thanks

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-04-06  9:28 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-04 14:45 [dpdk-dev] [PATCH] examples/l3fwd: fix segfault with gcc 5.x Tomasz Kulasek
2016-04-04 15:34 ` Ananyev, Konstantin
2016-04-04 15:51   ` De Lara Guarch, Pablo
2016-04-04 16:20   ` Kulasek, TomaszX
2016-04-04 19:05     ` Ananyev, Konstantin
2016-04-05 12:02       ` Kulasek, TomaszX
2016-04-04 19:06 ` Ananyev, Konstantin
2016-04-06  9:27   ` Thomas Monjalon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).