DPDK patches and discussions
 help / color / mirror / Atom feed
From: "De Lara Guarch, Pablo" <pablo.de.lara.guarch@intel.com>
To: Thomas Monjalon <thomas.monjalon@6wind.com>,
	"Marohn, Byron" <byron.marohn@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"Edupuganti, Saikrishna" <saikrishna.edupuganti@intel.com>,
	"jianbo.liu@linaro.org" <jianbo.liu@linaro.org>,
	"chaozhu@linux.vnet.ibm.com" <chaozhu@linux.vnet.ibm.com>,
	"jerin.jacob@caviumnetworks.com" <jerin.jacob@caviumnetworks.com>
Subject: Re: [dpdk-dev] [PATCH 2/3] hash: add vectorized comparison
Date: Fri, 2 Sep 2016 17:05:10 +0000	[thread overview]
Message-ID: <E115CCD9D858EF4F90C690B0DCB4D8973C9D2CA2@IRSMSX108.ger.corp.intel.com> (raw)
In-Reply-To: <5721729.LXq7JRZ983@xps13>



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Saturday, August 27, 2016 1:58 AM
> To: De Lara Guarch, Pablo; Marohn, Byron
> Cc: dev@dpdk.org; Richardson, Bruce; Edupuganti, Saikrishna;
> jianbo.liu@linaro.org; chaozhu@linux.vnet.ibm.com;
> jerin.jacob@caviumnetworks.com
> Subject: Re: [dpdk-dev] [PATCH 2/3] hash: add vectorized comparison
> 
> 2016-08-26 22:34, Pablo de Lara:
> > From: Byron Marohn <byron.marohn@intel.com>
> >
> > In lookup bulk function, the signatures of all entries
> > are compared against the signature of the key that is being looked up.
> > Now that all the signatures are together, they can be compared
> > with vector instructions (SSE, AVX2), achieving higher lookup performance.
> >
> > Also, entries per bucket are increased to 8 when using processors
> > with AVX2, as 256 bits can be compared at once, which is the size of
> > 8x32-bit signatures.
> 
> Please, would it be possible to use the generic SIMD intrinsics?
> We could define generic types compatible with Altivec and NEON:
> 	__attribute__ ((vector_size (n)))
> as described in https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html
> 

I tried to convert these into generic code with gcc builtins,
but I couldn't find a way to translate the __mm_movemask instrinsic into a generic builtin
(which is very necessary for performance reasons).
Therefore, I think it is not possible to do this without penalizing performance.
Sure, we could try to translate the other intrinsics, but it would mean that we still need to
use #ifdefs and we would have a mix of code with x86 instrinsics and gcc builtins,
so it is better to leave it this way.

> > +/* 8 entries per bucket */
> > +#if defined(__AVX2__)
> 
> Please prefer
> 	#ifdef RTE_MACHINE_CPUFLAG_AVX2
> Ideally the vector support could be checked at runtime:
> 	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> It would allow packaging one binary using the best optimization available.
> 

Good idea. Will submit a v2 with this change. It took me a bit of time to figure out
a way to do this without paying a big performance penalty.

> > +	*prim_hash_matches |=
> _mm256_movemask_ps((__m256)_mm256_cmpeq_epi32(
> > +			_mm256_load_si256((__m256i const *)prim_bkt-
> >sig_current),
> > +			_mm256_set1_epi32(prim_hash)));
> > +	*sec_hash_matches |=
> _mm256_movemask_ps((__m256)_mm256_cmpeq_epi32(
> > +			_mm256_load_si256((__m256i const *)sec_bkt-
> >sig_current),
> > +			_mm256_set1_epi32(sec_hash)));
> > +/* 4 entries per bucket */
> > +#elif defined(__SSE2__)
> > +	*prim_hash_matches |=
> _mm_movemask_ps((__m128)_mm_cmpeq_epi16(
> > +			_mm_load_si128((__m128i const *)prim_bkt-
> >sig_current),
> > +			_mm_set1_epi32(prim_hash)));
> > +	*sec_hash_matches |=
> _mm_movemask_ps((__m128)_mm_cmpeq_epi16(
> > +			_mm_load_si128((__m128i const *)sec_bkt-
> >sig_current),
> > +			_mm_set1_epi32(sec_hash)));
> 
> In order to allow such switch based on register size, we could have an
> abstraction in EAL supporting 128/256/512 width for x86/ARM/POWER.
> I think aliasing RTE_MACHINE_CPUFLAG_ and RTE_CPUFLAG_ may be
> enough.

  reply	other threads:[~2016-09-02 17:05 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-26 21:34 [dpdk-dev] [PATCH 0/3] Cuckoo hash lookup enhancements Pablo de Lara
2016-08-26 21:34 ` [dpdk-dev] [PATCH 1/3] hash: reorganize bucket structure Pablo de Lara
2016-08-26 21:34 ` [dpdk-dev] [PATCH 2/3] hash: add vectorized comparison Pablo de Lara
2016-08-27  8:57   ` Thomas Monjalon
2016-09-02 17:05     ` De Lara Guarch, Pablo [this message]
2016-08-26 21:34 ` [dpdk-dev] [PATCH 3/3] hash: modify lookup bulk pipeline Pablo de Lara
2016-09-02 22:56 ` [dpdk-dev] [PATCH v2 0/4] Cuckoo hash lookup enhancements Pablo de Lara
2016-09-02 22:56   ` [dpdk-dev] [PATCH v2 1/4] hash: reorder hash structure Pablo de Lara
2016-09-02 22:56   ` [dpdk-dev] [PATCH v2 2/4] hash: reorganize bucket structure Pablo de Lara
2016-09-02 22:56   ` [dpdk-dev] [PATCH v2 3/4] hash: add vectorized comparison Pablo de Lara
2016-09-02 22:56   ` [dpdk-dev] [PATCH v2 4/4] hash: modify lookup bulk pipeline Pablo de Lara
2016-09-06 19:33   ` [dpdk-dev] [PATCH v3 0/4] Cuckoo hash lookup enhancements Pablo de Lara
2016-09-30  7:38     ` [dpdk-dev] [PATCH v4 0/4] Cuckoo hash enhancements Pablo de Lara
2016-09-30  7:38       ` [dpdk-dev] [PATCH v4 1/4] hash: reorder hash structure Pablo de Lara
2016-09-30  7:38       ` [dpdk-dev] [PATCH v4 2/4] hash: reorganize bucket structure Pablo de Lara
2016-09-30  7:38       ` [dpdk-dev] [PATCH v4 3/4] hash: add vectorized comparison Pablo de Lara
2016-09-30  7:38       ` [dpdk-dev] [PATCH v4 4/4] hash: modify lookup bulk pipeline Pablo de Lara
2016-09-30 19:53       ` [dpdk-dev] [PATCH v4 0/4] Cuckoo hash enhancements Gobriel, Sameh
2016-10-03  9:59       ` Bruce Richardson
2016-10-04  6:50         ` De Lara Guarch, Pablo
2016-10-04  7:17           ` De Lara Guarch, Pablo
2016-10-04  9:47             ` Bruce Richardson
2016-10-04 23:25       ` [dpdk-dev] [PATCH v5 " Pablo de Lara
2016-10-04 23:25         ` [dpdk-dev] [PATCH v5 1/4] hash: reorder hash structure Pablo de Lara
2016-10-04 23:25         ` [dpdk-dev] [PATCH v5 2/4] hash: reorganize bucket structure Pablo de Lara
2016-10-04 23:25         ` [dpdk-dev] [PATCH v5 3/4] hash: add vectorized comparison Pablo de Lara
2016-10-04 23:25         ` [dpdk-dev] [PATCH v5 4/4] hash: modify lookup bulk pipeline Pablo de Lara
2016-10-05 10:12         ` [dpdk-dev] [PATCH v5 0/4] Cuckoo hash enhancements Thomas Monjalon
2016-09-06 19:34   ` [dpdk-dev] [PATCH v3 0/4] Cuckoo hash lookup enhancements Pablo de Lara
2016-09-06 19:34     ` [dpdk-dev] [PATCH v3 1/4] hash: reorder hash structure Pablo de Lara
2016-09-28  9:02       ` Bruce Richardson
2016-09-29  1:33         ` De Lara Guarch, Pablo
2016-09-06 19:34     ` [dpdk-dev] [PATCH v3 2/4] hash: reorganize bucket structure Pablo de Lara
2016-09-28  9:05       ` Bruce Richardson
2016-09-29  1:40         ` De Lara Guarch, Pablo
2016-09-06 19:34     ` [dpdk-dev] [PATCH v3 3/4] hash: add vectorized comparison Pablo de Lara
2016-09-06 19:34     ` [dpdk-dev] [PATCH v3 4/4] hash: modify lookup bulk pipeline Pablo de Lara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E115CCD9D858EF4F90C690B0DCB4D8973C9D2CA2@IRSMSX108.ger.corp.intel.com \
    --to=pablo.de.lara.guarch@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=byron.marohn@intel.com \
    --cc=chaozhu@linux.vnet.ibm.com \
    --cc=dev@dpdk.org \
    --cc=jerin.jacob@caviumnetworks.com \
    --cc=jianbo.liu@linaro.org \
    --cc=saikrishna.edupuganti@intel.com \
    --cc=thomas.monjalon@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).