From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
To: "Yigit, Ferruh" <ferruh.yigit@intel.com>,
"Morten Brørup" <mb@smartsharesystems.com>,
"dev@dpdk.org" <dev@dpdk.org>
Cc: Olivier Matz <olivier.matz@6wind.com>,
"Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Subject: Re: [dpdk-dev] rte_ether_addr_copy() strange comment
Date: Fri, 26 Jun 2020 12:41:29 +0000 [thread overview]
Message-ID: <BYAPR11MB31434CF5A332B15D67376AA6D7930@BYAPR11MB3143.namprd11.prod.outlook.com> (raw)
In-Reply-To: <6b67ce84-92ee-550d-2fba-af8c4c1bb2aa@intel.com>
> -----Original Message-----
> From: Yigit, Ferruh <ferruh.yigit@intel.com>
> Sent: Friday, June 26, 2020 1:08 PM
> To: Morten Brørup <mb@smartsharesystems.com>; dev@dpdk.org
> Cc: Olivier Matz <olivier.matz@6wind.com>; Van Haaren, Harry
> <harry.van.haaren@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> Subject: Re: [dpdk-dev] rte_ether_addr_copy() strange comment
>
> On 6/25/2020 4:45 PM, Morten Brørup wrote:
> > The function rte_ether_addr_copy() checks for __INTEL_COMPILER and has a
> comment about "a strange gcc warning". It says:
> >
> > static inline void rte_ether_addr_copy(const struct rte_ether_addr *ea_from,
> > struct rte_ether_addr *ea_to)
> > {
> > #ifdef __INTEL_COMPILER
> > uint16_t *from_words = (uint16_t *)(ea_from->addr_bytes);
> > uint16_t *to_words = (uint16_t *)(ea_to->addr_bytes);
> >
> > to_words[0] = from_words[0];
> > to_words[1] = from_words[1];
> > to_words[2] = from_words[2];
> > #else
> > /*
> > * Use the common way, because of a strange gcc warning.
> > */
> > *ea_to = *ea_from;
> > #endif
> > }
> >
> > I can see that from_words discards the const qualifier. Is that the "strange" gcc
> warning the comment is referring to?
> >
> > This goes back to before the first public release of DPDK in 2013, ref.
> https://git.dpdk.org/dpdk/log/lib/librte_ether/rte_ether.h
> >
> >
> > It should be fixed as follows:
> >
> > - uint16_t *from_words = (uint16_t *)(ea_from->addr_bytes);
> > - uint16_t *to_words = (uint16_t *)(ea_to->addr_bytes);
> > + const uint16_t *from_words = (const uint16_t *)ea_from;
> > + uint16_t *to_words = (uint16_t *)ea_to;
> >
> > And the consequential question: Is copying the three shorts faster than
> copying the struct? In other words: Should we get rid of the #ifdef and use the
> first method only?
>
>
> I tried to investigate this in godbolt: https://godbolt.org/z/YSmvDn
There was a small hiccup in the struct mac definition, it is aligned to 2, not 16 as the above Godbolt link...
With the aligned attribute changed to 2 (as per DPDK header https://git.dpdk.org/dpdk/tree/lib/librte_net/rte_ether.h#n59 )
we get the required (but less performant) smaller stores:
WORD_COPY = 0, Aligned = 16
NOTE: Incorrect alignment provided, and invalid ASM as it stores over the 10 bytes after eth addr.
This code is from a GodBolt test only, and this bug is NOT present in upstream DPDK.
movdqa (%rsi), %xmm0
movaps %xmm0, (%rdi)
ret
WORD_COPY = 0, Aligned = 2 (correct, as per dpdk header)
movl (%rsi), %eax
movl %eax, (%rdi)
movzwl 4(%rsi), %eax
movw %ax, 4(%rdi)
ret
Word Copy = 1 (aligned = 2)
movzwl (%rsi), %eax
movw %ax, (%rdi)
movzwl 2(%rsi), %eax
movw %ax, 2(%rdi)
movzwl 4(%rsi), %eax
movw %ax, 4(%rdi)
ret
<snip previous output>
Ferruh said:
> Related to the struct copy vs word copy, struct copy seems with less instruction [1],[2],
> my vote to remove ifdef and keep struct copy.
+1 for struct copy here too, it gives the compiler space to optimize within its alignment bounds.
It does the best that it can (given a 6 byte area to store into), with 1x 4byte store, 1x 2byte store.
Regards, -Harry
PS: For extra bonus points, here's a SIMD version that only uses one store
https://godbolt.org/z/VAR2La. Unless you intend on copying billions of
L1 resident eth addrs, this may or may not be a useful optimization.
Note that it requires the 10 bytes after the ether addr to be valid to read.
It loads 16B across both SRC and DST, blends 48 bits of SRC into DST and
writes the result back to DST.
movdqu (%rsi), %xmm0
movdqu (%rdi), %xmm1
pblendw $7, %xmm1, %xmm0
movups %xmm0, (%rdi)
ret
Actually, its possible to do this using a uint64_t (8 byte scalar) load/store too,
with some masking and bitwise OR... left as an exercise to the reader? :)
next prev parent reply other threads:[~2020-06-26 12:41 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-25 15:45 Morten Brørup
2020-06-26 12:08 ` Ferruh Yigit
2020-06-26 12:34 ` Morten Brørup
2020-06-26 14:37 ` Jerin Jacob
2020-06-26 12:41 ` Van Haaren, Harry [this message]
2020-06-26 15:54 ` Ferruh Yigit
2020-06-26 17:28 ` Van Haaren, Harry
2020-06-26 18:04 ` Stephen Hemminger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=BYAPR11MB31434CF5A332B15D67376AA6D7930@BYAPR11MB3143.namprd11.prod.outlook.com \
--to=harry.van.haaren@intel.com \
--cc=dev@dpdk.org \
--cc=ferruh.yigit@intel.com \
--cc=konstantin.ananyev@intel.com \
--cc=mb@smartsharesystems.com \
--cc=olivier.matz@6wind.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).