From: "Ananyev, Konstantin"
To: Thomas Monjalon
Cc: "dev@dpdk.org"
Date: Tue, 20 Jan 2015 10:11:19 +0000
Subject: Re: [dpdk-dev] [PATCH v2 00/17] ACL: New AVX2 classify method and several other enhancements.
Message-ID: <2601191342CEEE43887BDE71AB977258213DE072@irsmsx105.ger.corp.intel.com>
In-Reply-To: <3790092.dknD3Zd4cr@xps13>
References: <1421090181-17150-1-git-send-email-konstantin.ananyev@intel.com> <20150114183928.GA28492@hmsreliant.think-freely.org> <3790092.dknD3Zd4cr@xps13>
List-Id: patches and discussions about DPDK

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, January 19, 2015 5:16 PM
> To: Ananyev, Konstantin
> Cc: dev@dpdk.org; Neil Horman
> Subject: Re: [dpdk-dev] [PATCH v2 00/17] ACL: New AVX2 classify method and several other enhancements.
>
> 2015-01-14 13:39, Neil Horman:
> > On Mon, Jan 12, 2015 at 07:16:04PM +0000, Konstantin Ananyev wrote:
> > > v2 changes:
> > > - When built with compilers that don't support AVX2 instructions,
> > >   make rte_acl_classify_avx2() do nothing and return an error.
> > > - Remove unneeded 'ifdef __AVX2__' in acl_run_avx2.*.
> > > - Reorder the patches in the set, to keep RTE_LIBRTE_ACL_STANDALONE=y
> > >   always buildable.
> > >
> > > This patch series contains several fixes and enhancements for the ACL library.
> > > See the complete list below.
> > > Two main changes that are externally visible:
> > > - Introduce a new classify method: RTE_ACL_CLASSIFY_AVX2.
> > >   It uses AVX2 instructions and 256-bit wide data types
> > >   to perform internal trie traversal.
> > >   That helps to increase classify() throughput.
> > >   This method is selected as the default one on CPUs that support AVX2.
> > > - Introduce a new field in the build config structure: max_size.
> > >   It specifies the maximum size that the internal RT structure for a given context
> > >   can reach.
> > >   The purpose of that is to allow the user to decide on the space/performance trade-off
> > >   (faster classify() vs less space for RT internal structures)
> > >   for each given set of rules.
> > >
> > > Konstantin Ananyev (17):
> > >   fix compilation issues with RTE_LIBRTE_ACL_STANDALONE=y
> > >   app/test: few small fixes for test_acl.c
> > >   librte_acl: make data_indexes long enough to survive idle transitions.
> > >   librte_acl: remove build phase heuristic with negative performance
> > >     effect.
> > >   librte_acl: fix a bug at build phase that can cause matches being
> > >     overwritten.
> > >   librte_acl: introduce DFA nodes compression (group64) for identical
> > >     entries.
> > >   librte_acl: build/gen phase - simplify the way match nodes are
> > >     allocated.
> > >   librte_acl: make scalar RT code to be more similar to vector one.
> > >   librte_acl: a bit of RT code deduplication.
> > >   EAL: introduce rte_ymm and relatives in rte_common_vect.h.
> > >   librte_acl: add AVX2 as new rte_acl_classify() method
> > >   test-acl: add ability to manually select RT method.
> > >   librte_acl: Remove search_sse_2 and relatives.
> > >   librte_acl: move lo/hi dwords shuffle out from calc_addr
> > >   librte_acl: make calc_addr a define to deduplicate the code.
> > >   librte_acl: introduce max_size into rte_acl_config.
> > >   librte_acl: remove unused macros.
> > >
> > >  app/test-acl/main.c                             | 126 +++--
> > >  app/test/test_acl.c                             |   8 +-
> > >  examples/l3fwd-acl/main.c                       |   3 +-
> > >  examples/l3fwd/main.c                           |   2 +-
> > >  lib/librte_acl/Makefile                         |  18 +
> > >  lib/librte_acl/acl.h                            |  58 ++-
> > >  lib/librte_acl/acl_bld.c                        | 392 +++++++---------
> > >  lib/librte_acl/acl_gen.c                        | 268 +++++++----
> > >  lib/librte_acl/acl_run.h                        |   7 +-
> > >  lib/librte_acl/acl_run_avx2.c                   |  54 +++
> > >  lib/librte_acl/acl_run_avx2.h                   | 284 ++++++++++++
> > >  lib/librte_acl/acl_run_scalar.c                 |  65 ++-
> > >  lib/librte_acl/acl_run_sse.c                    | 585 +----------------------
> > >  lib/librte_acl/acl_run_sse.h                    | 357 +++++++++++++++
> > >  lib/librte_acl/acl_vect.h                       | 132 +++--
> > >  lib/librte_acl/rte_acl.c                        |  47 +-
> > >  lib/librte_acl/rte_acl.h                        |   4 +
> > >  lib/librte_acl/rte_acl_osdep_alone.h            |  47 +-
> > >  lib/librte_eal/common/include/rte_common_vect.h |  39 +-
> > >  lib/librte_lpm/rte_lpm.h                        |   2 +-
> > >  20 files changed, 1444 insertions(+), 1054 deletions(-)
> > >  create mode 100644 lib/librte_acl/acl_run_avx2.c
> > >  create mode 100644 lib/librte_acl/acl_run_avx2.h
> > >  create mode 100644 lib/librte_acl/acl_run_sse.h
> > >
> >
> > Series
> > Acked-by: Neil Horman
>
> Are you sure there is nothing to change or add in the documentation?
> Maybe explaining the space/performance trade-off would be a good idea.

OK, once that patch is applied, I'll work on the docs update.
Thanks
Konstantin

>
> > Nice work
>
> Yes, great work!
>
> --
> Thomas
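
Note for readers of the archive: the first v2 change quoted above (an AVX2 entry point that does nothing and returns an error when the compiler lacks AVX2 support) may be easier to follow with a small sketch. This is only an illustration under stated assumptions, not the librte_acl code. Every name in it (classify_alg, classify_fn, classify_scalar, classify_avx2, HAVE_AVX2_BUILD, default_alg, classify) is hypothetical. The __AVX2__ test stands in for whatever build-time check the series actually uses, the "classification" is a placeholder that copies the first byte of each input, and ENOTSUP is an assumed error code (the thread only says "return an error").

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not the DPDK API. */
enum classify_alg { ALG_SCALAR, ALG_AVX2, ALG_NUM };

typedef int (*classify_fn)(const uint8_t **data, uint32_t *results,
                           uint32_t num);

/* Placeholder "classification": report the first byte of each buffer. */
static int classify_scalar(const uint8_t **data, uint32_t *results,
                           uint32_t num)
{
    for (uint32_t i = 0; i < num; i++)
        results[i] = data[i][0];
    return 0;
}

#ifdef __AVX2__
/* Compiler can emit AVX2: a real version would use 256-bit vectors here. */
#define HAVE_AVX2_BUILD 1
static int classify_avx2(const uint8_t **data, uint32_t *results,
                         uint32_t num)
{
    return classify_scalar(data, results, num);
}
#else
/* No AVX2 support at build time: keep the symbol, do nothing, fail. */
#define HAVE_AVX2_BUILD 0
static int classify_avx2(const uint8_t **data, uint32_t *results,
                         uint32_t num)
{
    (void)data; (void)results; (void)num;
    return -ENOTSUP;   /* assumed error code */
}
#endif

/* The dispatch table always has an AVX2 slot, so callers need no ifdefs. */
static const classify_fn classify_fns[ALG_NUM] = {
    [ALG_SCALAR] = classify_scalar,
    [ALG_AVX2]   = classify_avx2,
};

/* Pick AVX2 by default only if it was compiled in and the CPU has it. */
static enum classify_alg default_alg(int cpu_has_avx2)
{
    return (HAVE_AVX2_BUILD && cpu_has_avx2) ? ALG_AVX2 : ALG_SCALAR;
}

/* Run the chosen method; the stub's error is passed straight through. */
static int classify(enum classify_alg alg, const uint8_t **data,
                    uint32_t *results, uint32_t num)
{
    assert(alg < ALG_NUM);
    return classify_fns[alg](data, results, num);
}
```

With this layout a caller would run classify(default_alg(cpu_has_avx2), ...) and never needs its own ifdef. Explicitly requesting ALG_AVX2 on a build without AVX2 support yields the error instead of a link failure.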