DPDK patches and discussions
From: Neil Horman <nhorman@tuxdriver.com>
To: Thomas Monjalon <thomas.monjalon@6wind.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2 00/17] ACL: New AVX2 classify method and several other enhancements.
Date: Mon, 19 Jan 2015 13:39:48 -0500
Message-ID: <20150119183948.GF21790@hmsreliant.think-freely.org>
In-Reply-To: <3790092.dknD3Zd4cr@xps13>

On Mon, Jan 19, 2015 at 06:16:02PM +0100, Thomas Monjalon wrote:
> 2015-01-14 13:39, Neil Horman:
> > On Mon, Jan 12, 2015 at 07:16:04PM +0000, Konstantin Ananyev wrote:
> > > v2 changes:
> > > - When built with compilers that don't support AVX2 instructions,
> > > make rte_acl_classify_avx2() do nothing and return an error.
> > > - Remove unneeded 'ifdef __AVX2__' in acl_run_avx2.*.
> > > - Reorder the patches in the set, to keep RTE_LIBRTE_ACL_STANDALONE=y
> > > always buildable.
> > > 
> > > This patch series contains several fixes and enhancements for the ACL library.
> > > See the complete list below.
> > > Two main changes are externally visible:
> > > - Introduce a new classify method: RTE_ACL_CLASSIFY_AVX2.
> > > It uses AVX2 instructions and 256-bit wide data types
> > > to perform internal trie traversal.
> > > That helps to increase classify() throughput.
> > > This method is selected as the default on CPUs that support AVX2.
> > > - Introduce a new field in the build config structure: max_size.
> > > It specifies the maximum size that the internal RT structure for a given
> > > context can reach.
> > > The purpose is to let the user decide on the space/performance trade-off
> > > (faster classify() vs. less space for RT internal structures)
> > > for each given set of rules.
> > > 
> > > Konstantin Ananyev (17):
> > >   fix fix compilation issues with RTE_LIBRTE_ACL_STANDALONE=y
> > >   app/test: few small fixes fot test_acl.c
> > >   librte_acl: make data_indexes long enough to survive idle transitions.
> > >   librte_acl: remove build phase heuristsic with negative perfomance
> > >     effect.
> > >   librte_acl: fix a bug at build phase that can cause matches beeing
> > >     overwirtten.
> > >   librte_acl: introduce DFA nodes compression (group64) for identical
> > >     entries.
> > >   librte_acl: build/gen phase - simplify the way match nodes are
> > >     allocated.
> > >   librte_acl: make scalar RT code to be more similar to vector one.
> > >   librte_acl: a bit of RT code deduplication.
> > >   EAL: introduce rte_ymm and relatives in rte_common_vect.h.
> > >   librte_acl: add AVX2 as new rte_acl_classify() method
> > >   test-acl: add ability to manually select RT method.
> > >   librte_acl: Remove search_sse_2 and relatives.
> > >   libter_acl: move lo/hi dwords shuffle out from calc_addr
> > >   libte_acl: make calc_addr a define to deduplicate the code.
> > >   libte_acl: introduce max_size into rte_acl_config.
> > >   libte_acl: remove unused macros.
> > > 
> > >  app/test-acl/main.c                             | 126 +++--
> > >  app/test/test_acl.c                             |   8 +-
> > >  examples/l3fwd-acl/main.c                       |   3 +-
> > >  examples/l3fwd/main.c                           |   2 +-
> > >  lib/librte_acl/Makefile                         |  18 +
> > >  lib/librte_acl/acl.h                            |  58 ++-
> > >  lib/librte_acl/acl_bld.c                        | 392 +++++++---------
> > >  lib/librte_acl/acl_gen.c                        | 268 +++++++----
> > >  lib/librte_acl/acl_run.h                        |   7 +-
> > >  lib/librte_acl/acl_run_avx2.c                   |  54 +++
> > >  lib/librte_acl/acl_run_avx2.h                   | 284 ++++++++++++
> > >  lib/librte_acl/acl_run_scalar.c                 |  65 ++-
> > >  lib/librte_acl/acl_run_sse.c                    | 585 +-----------------------
> > >  lib/librte_acl/acl_run_sse.h                    | 357 +++++++++++++++
> > >  lib/librte_acl/acl_vect.h                       | 132 +++---
> > >  lib/librte_acl/rte_acl.c                        |  47 +-
> > >  lib/librte_acl/rte_acl.h                        |   4 +
> > >  lib/librte_acl/rte_acl_osdep_alone.h            |  47 +-
> > >  lib/librte_eal/common/include/rte_common_vect.h |  39 +-
> > >  lib/librte_lpm/rte_lpm.h                        |   2 +-
> > >  20 files changed, 1444 insertions(+), 1054 deletions(-)
> > >  create mode 100644 lib/librte_acl/acl_run_avx2.c
> > >  create mode 100644 lib/librte_acl/acl_run_avx2.h
> > >  create mode 100644 lib/librte_acl/acl_run_sse.h
> > > 
> > Series
> > Acked-by: Neil Horman <nhorman@tuxdriver.com>
> 
> Are you sure there is nothing to change or add in the documentation?
> Maybe that explaining the space/performance trade-off would be a good idea.
> 
Well, I'm satisfied with it, but my ACK shouldn't be definitive.  If you feel
like there's more work to be done on the documentation, by all means speak up.

Neil

> > Nice work
> 
> Yes, great work!
> 
> -- 
> Thomas
> 
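
Since the thread raises the question of documenting the space/performance
trade-off behind max_size, here is a rough sketch of the idea in C. It is a
toy model under stated assumptions, not librte_acl code: the names
toy_acl_config, toy_build_result, toy_rt_size(), toy_build() and
TOY_MAX_TRIES are hypothetical, and the cost model (splitting the rule set
across more tries shrinks the runtime table, while each lookup then has to
walk more tries) is an illustrative simplification. Only the field name
max_size comes from the cover letter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on how far the toy model will split the rules. */
#define TOY_MAX_TRIES 8u

/* Hypothetical build config; only the max_size field name is from the thread. */
struct toy_acl_config {
	size_t max_size; /* byte limit for the runtime (RT) table, 0 = no limit */
};

/* Outcome of a toy build: more tries means slower lookups, less memory. */
struct toy_build_result {
	unsigned int num_tries; /* tries a single lookup has to traverse */
	size_t rt_size;         /* bytes used by the RT structures */
};

/* Made-up cost model: the RT size shrinks as rules are split across tries. */
static size_t
toy_rt_size(size_t single_trie_size, unsigned int num_tries)
{
	return single_trie_size / num_tries;
}

/*
 * Pick the smallest number of tries (the fastest lookup) whose RT table
 * still fits under cfg->max_size.
 * Returns 0 on success, -1 if even TOY_MAX_TRIES tries do not fit.
 */
static int
toy_build(const struct toy_acl_config *cfg, size_t single_trie_size,
	struct toy_build_result *res)
{
	unsigned int n;

	assert(cfg != NULL && res != NULL);

	for (n = 1; n <= TOY_MAX_TRIES; n++) {
		size_t sz = toy_rt_size(single_trie_size, n);

		if (cfg->max_size == 0 || sz <= cfg->max_size) {
			res->num_tries = n;
			res->rt_size = sz;
			return 0;
		}
	}
	return -1;
}
```

In this model, a caller who leaves max_size at 0 gets a single large trie
and the fastest lookup, while a caller who sets a tight limit accepts more
tries per lookup in exchange for a smaller table. Whether the real build
code behaves this way for a given rule set is something the patches and
their documentation would have to confirm.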
