DPDK patches and discussions
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: Guduri Prathyusha <gprathyusha@caviumnetworks.com>,
	"Kantecki, Tomasz" <tomasz.kantecki@intel.com>
Cc: "Jianbo.Liu@arm.com" <Jianbo.Liu@arm.com>,
	"guduriprathyusha@gmail.com" <guduriprathyusha@gmail.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH ] examples/l3fwd: fix aliasing in port grouping
Date: Thu, 2 Nov 2017 14:46:43 +0000
Message-ID: <2601191342CEEE43887BDE71AB9772585FAB87F0@irsmsx105.ger.corp.intel.com>
In-Reply-To: <20171102143114.24380-1-gprathyusha@caviumnetworks.com>

Hi,

> -----Original Message-----
> From: Guduri Prathyusha [mailto:gprathyusha@caviumnetworks.com]
> Sent: Thursday, November 2, 2017 2:31 PM
> To: Kantecki, Tomasz <tomasz.kantecki@intel.com>
> Cc: Jianbo.Liu@arm.com; guduriprathyusha@gmail.com; Ananyev, Konstantin <konstantin.ananyev@intel.com>; dev@dpdk.org; Guduri
> Prathyusha <gprathyusha@caviumnetworks.com>
> Subject: [dpdk-dev] [PATCH ] examples/l3fwd: fix aliasing in port grouping
> 
> With -fstrict-aliasing enabled by default from -O2, gcc newer than 5.x
> exposes undefined behavior in port_groupx4. 'pn' and 'pnum' are two
> pointers of different types pointing to the same chunk of memory. Under
> -fstrict-aliasing the compiler assumes they point to different memory and
> may reorder instructions that depend on pnum and pn, which breaks the
> port grouping algorithm.
> 
> This patch eliminates the union and uses memcpy to copy gptbl[v].pnum
> to pn. When the size is a compile-time constant, memcpy is not called as
> a library function; the compiler emits the appropriate LD and ST
> instructions directly, so there is no performance overhead.
> 
> Fixes: 569b290cdb36 ("examples/l3fwd: add NEON implementation")
> Fixes: af1694d94bf1 ("examples/l3fwd: fix crash with gcc 5")
> Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com>
> ---
>  examples/l3fwd/l3fwd_neon.h | 11 +++--------
>  examples/l3fwd/l3fwd_sse.h  | 11 +++--------
>  2 files changed, 6 insertions(+), 16 deletions(-)
> 
> diff --git a/examples/l3fwd/l3fwd_neon.h b/examples/l3fwd/l3fwd_neon.h
> index 4bc161394..10a602a04 100644
> --- a/examples/l3fwd/l3fwd_neon.h
> +++ b/examples/l3fwd/l3fwd_neon.h
> @@ -100,11 +100,6 @@ static inline uint16_t *
>  port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, uint16x8_t dp1,
>  	     uint16x8_t dp2)
>  {
> -	union {
> -		uint16_t u16[FWDSTEP + 1];
> -		uint64_t u64;
> -	} *pnum = (void *)pn;
> -
>  	int32_t v;
>  	uint16x8_t mask = {1, 2, 4, 8, 0, 0, 0, 0};
> 
> @@ -117,9 +112,9 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, uint16x8_t dp1,
> 
>  	/* if dest port value has changed. */
>  	if (v != GRPMSK) {
> -		pnum->u64 = gptbl[v].pnum;
> -		pnum->u16[FWDSTEP] = 1;
> -		lp = pnum->u16 + gptbl[v].idx;
> +		rte_memcpy(pn, &gptbl[v].pnum, sizeof(gptbl[v].pnum));
> +		pn[FWDSTEP] = 1;
> +		lp = pn + gptbl[v].idx;
>  	}
> 
>  	return lp;
> diff --git a/examples/l3fwd/l3fwd_sse.h b/examples/l3fwd/l3fwd_sse.h
> index 831760f02..79a71d77e 100644
> --- a/examples/l3fwd/l3fwd_sse.h
> +++ b/examples/l3fwd/l3fwd_sse.h
> @@ -98,11 +98,6 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
>  static inline uint16_t *
>  port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, __m128i dp1, __m128i dp2)
>  {
> -	union {
> -		uint16_t u16[FWDSTEP + 1];
> -		uint64_t u64;
> -	} *pnum = (void *)pn;
> -
>  	int32_t v;
> 
>  	dp1 = _mm_cmpeq_epi16(dp1, dp2);
> @@ -114,9 +109,9 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, __m128i dp1, __m128i dp2)
> 
>  	/* if dest port value has changed. */
>  	if (v != GRPMSK) {
> -		pnum->u64 = gptbl[v].pnum;
> -		pnum->u16[FWDSTEP] = 1;
> -		lp = pnum->u16 + gptbl[v].idx;
> +		rte_memcpy(pn, &gptbl[v].pnum, sizeof(gptbl[v].pnum));
> +		pn[FWDSTEP] = 1;
> +		lp = pn + gptbl[v].idx;

Could you explain a bit more here: exactly which instructions were reordered,
and what kind of problems did that cause?
Especially on IA?
In any case, I don't think rte_memcpy is a good choice here:
it is a huge inline function, far too much for copying a single 64-bit variable.
Konstantin

>  	}
> 
>  	return lp;
> --
> 2.14.1
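
To make the trade-off discussed above concrete, here is a minimal sketch
that is not taken from the patch. It assumes plain memcpy from <string.h>
instead of rte_memcpy, and the names group_entry, pack4 and apply_group are
hypothetical stand-ins for a gptbl[] entry and the tail of port_groupx4.
Compilers such as GCC and Clang typically expand a memcpy with a small
constant size into ordinary load/store instructions at -O2, though that is
worth confirming in the generated assembly for each target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FWDSTEP 4

/* Hypothetical stand-in for one gptbl[] entry:
 * four 16-bit counters packed into 64 bits, plus an index. */
struct group_entry {
	uint64_t pnum;
	int32_t idx;
};

/* Build a packed value from four 16-bit lanes without relying on
 * byte order or on type punning through a pointer cast. */
static uint64_t
pack4(uint16_t a, uint16_t b, uint16_t c, uint16_t d)
{
	const uint16_t lanes[FWDSTEP] = {a, b, c, d};
	uint64_t v;

	memcpy(&v, lanes, sizeof(v));
	return v;
}

/* Copy the packed counters into pn[] with a fixed-size memcpy.
 * pn is only ever accessed as uint16_t, so no strict-aliasing
 * assumption of the compiler is violated. */
static uint16_t *
apply_group(uint16_t pn[FWDSTEP + 1], const struct group_entry *e)
{
	assert(e->idx >= 0 && e->idx <= FWDSTEP);

	memcpy(pn, &e->pnum, sizeof(e->pnum)); /* 8 bytes, constant size */
	pn[FWDSTEP] = 1;
	return pn + e->idx;
}
```

Calling apply_group(pn, &e) with e.pnum = pack4(1, 2, 1, 1) and e.idx = 2
fills pn[0..3] with 1, 2, 1, 1, sets pn[4] to 1, and returns pn + 2.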

