From: Bruce Richardson <bruce.richardson@intel.com>
To: "De Lara Guarch, Pablo" <pablo.de.lara.guarch@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v4 0/4] Cuckoo hash enhancements
Date: Tue, 4 Oct 2016 10:47:31 +0100
Message-ID: <20161004094730.GB100068@bricha3-MOBL3>
In-Reply-To: <E115CCD9D858EF4F90C690B0DCB4D8973CA03DCE@IRSMSX108.ger.corp.intel.com>

On Tue, Oct 04, 2016 at 08:17:28AM +0100, De Lara Guarch, Pablo wrote:
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of De Lara Guarch,
> > Pablo
> > Sent: Monday, October 03, 2016 11:51 PM
> > To: Richardson, Bruce
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v4 0/4] Cuckoo hash enhancements
> > 
> > Hi Bruce,
> > 
> > > -----Original Message-----
> > > From: Richardson, Bruce
> > > Sent: Monday, October 03, 2016 2:59 AM
> > > To: De Lara Guarch, Pablo
> > > Cc: dev@dpdk.org
> > > Subject: Re: [PATCH v4 0/4] Cuckoo hash enhancements
> > >
> > > On Fri, Sep 30, 2016 at 08:38:52AM +0100, Pablo de Lara wrote:
> > > > > This patchset improves lookup performance in the current hash library
> > > > > by replacing the existing 4-stage, 2-entry lookup bulk pipeline with an
> > > > > improved pipeline based on a loop-and-jump model.
> > > > > Also, x86 vectorized intrinsics are used to improve performance when
> > > > > comparing signatures.
> > > >
> > > > > First patch reorganizes the order of the hash structure.
> > > > > The structure takes more than one 64-byte cache line, but not all
> > > > > the fields are used in the lookup operation (the most common operation).
> > > > > Therefore, all the fields used in lookup have been moved to the first part
> > > > > of the structure, so they all fit in one cache line, slightly improving
> > > > > performance in some scenarios.
> > > >
> > > > > Second patch modifies the order of the bucket structure.
> > > > > Currently, the buckets store all the signatures together (current and
> > > > > alternative).
> > > > > In order to be able to perform a vectorized signature comparison,
> > > > > all current signatures have to be together, so the order of the bucket
> > > > > has been changed, separating all the current signatures from the
> > > > > alternative signatures.
> > > >
> > > > > Third patch introduces x86 vectorized intrinsics.
> > > > > When performing a lookup bulk operation, all current signatures in a
> > > > > bucket are compared against the signature of the key being looked up.
> > > > > Now that they are all together, a vectorized comparison can be
> > > > > performed, which takes fewer instructions.
> > > > > On a machine with AVX2, the number of entries per bucket is
> > > > > increased from 4 to 8, as AVX2 allows comparing two 256-bit values, each
> > > > > holding 8x32-bit integers, which are the 8 signatures in the bucket.
> > > >
> > > > > Fourth (and last) patch modifies the current pipeline of the lookup bulk
> > > > > function.
> > > > > The new pipeline is based on a loop-and-jump model. The two key
> > > > > improvements are:
> > > >
> > > > > - Better prefetching: in this case, the first 4 keys to be looked up are
> > > > >   prefetched, and after that, the rest of the keys are prefetched while
> > > > >   the signatures are being calculated. This gives the CPU more time to
> > > > >   prefetch the requested data before it is actually needed, which results
> > > > >   in fewer cache misses and therefore higher throughput.
> > > >
> > > > > - Lower performance penalty when using fallback: the lookup bulk
> > > > >   algorithm assumes that most times there will not be a collision in a
> > > > >   bucket, but it might happen that two or more signatures are equal,
> > > > >   which means that more than one key comparison might be necessary.
> > > > >   In that case, only the key of the first hit is prefetched, like in the
> > > > >   current implementation. The difference now is that if this comparison
> > > > >   results in a miss, the information of the other keys to be compared
> > > > >   has been stored, unlike the current implementation, which needs to
> > > > >   perform an entire simple lookup again.
> > > >
> > > > Changes in v4:
> > > > - Reordered hash structure, so alt signature is at the start
> > > >   of the next cache line, and explain in the commit message
> > > >   why it has been moved
> > > > - Reordered hash structure, so name field is on top of the structure,
> > > >   leaving all the fields used in lookup in the next cache line
> > > >   (instead of the first cache line)
> > > >
> > > > Changes in v3:
> > > > - Corrected the cover letter (wrong number of patches)
> > > >
> > > > Changes in v2:
> > > > - Increased entries per bucket from 4 to 8 for all cases,
> > > >   so it is not architecture dependent any longer.
> > > > - Replaced compile-time signature comparison function election
> > > >   with run-time election, so best optimization available
> > > >   will be used from a single binary.
> > > > - Reordered the hash structure, so all the fields used by lookup
> > > >   are in the same cache line (first).
> > > >
> > > > Byron Marohn (3):
> > > >   hash: reorganize bucket structure
> > > >   hash: add vectorized comparison
> > > >   hash: modify lookup bulk pipeline
> > > >
> > >
> > > Hi,
> > >
> > > Firstly, checkpatches is reporting some style errors in these patches.
> > >
> > > > Secondly, when I run the "hash_multiwriter_autotest" I get what I assume
> > > > to be an error after applying this patchset. Before this set is applied,
> > > > running that test shows the cycles per insert with/without lock elision.
> > > > Now, though, I'm getting an error about a key being dropped or failing to
> > > > insert in the lock elision case, e.g.
> > >
> > >   Core #2 inserting 1572864: 0 - 1,572,864
> > >   key 1497087 is lost
> > >   1 key lost
> > >
> > > I've run the test a number of times, and there is a single key lost each time.
> > > Please check on this, is it expected or is it a problem?
> > 
> > > I am seeing that error even without the patchset. I am still investigating it,
> > > but "git bisect" suggests that the problem is in commit 5fc74c2e146d
> > > ("hash: check if slot is empty with key index").
> 
> I found the problem, and I submitted a patch for it (http://dpdk.org/dev/patchwork/patch/16361/).
> Could you check if it works for you? 
> 

That patch looks like a correct bugfix, so I've acked it for you. However, I still
see the error appearing very occasionally. Since it also appeared before I applied
this set, I am OK with accepting this set anyway. Please send a new version of the
set with the checkpatch issues fixed and keep my ack.

Series Acked-by: Bruce Richardson <bruce.richardson@intel.com>
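
For reference, the vectorized signature comparison described in the quoted
cover letter (third patch) might look roughly like the sketch below. It assumes
8 x 32-bit signatures stored contiguously per bucket. The names
sig_match_mask and ENTRIES_PER_BUCKET are hypothetical and are not taken from
the patchset. The sketch uses two 128-bit SSE2 compares instead of a single
AVX2 compare, with a scalar fallback for builds without SSE2.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>	/* SSE2 intrinsics */
#endif

/* Hypothetical constant: the cover letter describes 8 signatures per bucket. */
#define ENTRIES_PER_BUCKET 8

/*
 * Return a bitmask in which bit i is set when sigs[i] == sig.
 * More than one bit can be set if several signatures in the bucket collide.
 */
static inline uint32_t
sig_match_mask(const uint32_t sigs[ENTRIES_PER_BUCKET], uint32_t sig)
{
#if defined(__SSE2__)
	/* Broadcast the looked-up signature to all four 32-bit lanes. */
	const __m128i key = _mm_set1_epi32((int)sig);
	/* Load the bucket signatures as two unaligned 128-bit halves. */
	const __m128i lo = _mm_loadu_si128((const __m128i *)&sigs[0]);
	const __m128i hi = _mm_loadu_si128((const __m128i *)&sigs[4]);
	/* Each equal lane becomes all-ones; movemask keeps one bit per lane. */
	const int m_lo = _mm_movemask_ps(_mm_castsi128_ps(_mm_cmpeq_epi32(lo, key)));
	const int m_hi = _mm_movemask_ps(_mm_castsi128_ps(_mm_cmpeq_epi32(hi, key)));

	return (uint32_t)m_lo | ((uint32_t)m_hi << 4);
#else
	/* Scalar fallback: one comparison per bucket entry. */
	uint32_t mask = 0;
	unsigned int i;

	for (i = 0; i < ENTRIES_PER_BUCKET; i++)
		mask |= (uint32_t)(sigs[i] == sig) << i;
	return mask;
#endif
}
```

Returning a mask rather than only the first matching index keeps the other
candidate slots available, which is the kind of stored information the
fallback path in the fourth patch relies on when the first key comparison
misses.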
