From: Konstantin Ananyev
To: dev@dpdk.org
Cc: jerinj@marvell.com, ruifeng.wang@arm.com, vladimir.medvedkin@intel.com, Konstantin Ananyev
Date: Tue, 15 Sep 2020 17:50:13 +0100
Message-Id: <20200915165025.543-1-konstantin.ananyev@intel.com>
In-Reply-To: <20200807162829.11690-1-konstantin.ananyev@intel.com>
References: <20200807162829.11690-1-konstantin.ananyev@intel.com>
Subject: [dpdk-dev] [PATCH v2 00/12] acl: introduce AVX512 classify method

This patch series introduces an AVX512-specific classify implementation
for the ACL library. Internally it contains two code paths:
the first uses mostly 256-bit instructions/registers and can
process up to 16 flows in parallel;
the second uses 512-bit instructions/registers in most places
and can process up to 32 flows in parallel.
The code path is selected at runtime, internally, based on the input
burst size, and the selection is completely opaque to the user.

On my SKX box test-acl shows ~20-65% improvement
(depending on rule-set and input burst size)
when switching from the AVX2 to the AVX512 classify algorithm.
ICX and CLX testing showed a similar level of speedup: up to ~50-60%.

The current AVX512 classify implementation is only supported on x86_64.
Note that this series introduces a formal ABI incompatibility
with previous versions of the ACL library.

v1 -> v2:
 Deduplicated 8/16 code paths as much as possible
 Updated default algorithm selection
 Removed library constructor to make it easier to integrate with
   https://patches.dpdk.org/project/dpdk/list/?series=11831
 Updated docs

This patch series depends on:
https://patches.dpdk.org/patch/73922/mbox/
which must be applied first.
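
To illustrate the burst-size based selection described above, here is a minimal standalone sketch. It is not code from this series: the names (`acl_path`, `select_path`, `FLOWS_256`, `FLOWS_512`) are hypothetical, and the threshold is an assumption (the wider path is picked only once the burst can fill all 32 lanes). The actual cut-over point used by the patches may differ.

```c
#include <stddef.h>

/* Hypothetical identifiers, for illustration only; not part of librte_acl. */
enum acl_path {
	ACL_PATH_256,	/* mostly 256-bit registers, up to 16 flows at once */
	ACL_PATH_512,	/* mostly 512-bit registers, up to 32 flows at once */
};

#define FLOWS_256	16u
#define FLOWS_512	32u

/*
 * Pick a code path from the input burst size.
 * Assumption: use the 512-bit path only when the burst is large enough
 * to occupy all of its lanes, otherwise fall back to the 256-bit path.
 */
static enum acl_path
select_path(size_t burst)
{
	return (burst >= FLOWS_512) ? ACL_PATH_512 : ACL_PATH_256;
}

/* Number of flows the chosen path can handle in parallel. */
static size_t
path_width(enum acl_path path)
{
	return (path == ACL_PATH_512) ? FLOWS_512 : FLOWS_256;
}
```

Under these assumptions a burst of 8 or 24 packets would take the 256-bit path, while a burst of 32 or more would take the 512-bit path. The caller never sees the choice, which matches the "opaque to the user" behaviour described in the cover letter.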

Konstantin Ananyev (12):
  acl: fix x86 build when compiler doesn't support AVX2
  doc: fix mixing classify methods in ACL guide
  acl: remove of unused enum value
  acl: remove library constructor
  app/acl: few small improvements
  test/acl: expand classify test coverage
  acl: add infrastructure to support AVX512 classify
  acl: introduce AVX512 classify implementation
  acl: enhance AVX512 classify implementation
  acl: for AVX512 classify use 4B load whenever possible
  test/acl: add AVX512 classify support
  app/acl: add AVX512 classify support

 app/test-acl/main.c                           |  19 +-
 app/test/test_acl.c                           | 104 ++--
 config/x86/meson.build                        |   3 +-
 .../prog_guide/packet_classif_access_ctrl.rst |  15 +
 doc/guides/rel_notes/deprecation.rst          |   4 -
 doc/guides/rel_notes/release_20_11.rst        |   9 +
 lib/librte_acl/acl.h                          |  12 +
 lib/librte_acl/acl_bld.c                      |  34 ++
 lib/librte_acl/acl_gen.c                      |   2 +-
 lib/librte_acl/acl_run_avx512.c               | 331 +++++++++++
 lib/librte_acl/acl_run_avx512x16.h            | 526 ++++++++++++++++++
 lib/librte_acl/acl_run_avx512x8.h             | 439 +++++++++++++++
 lib/librte_acl/meson.build                    |  39 ++
 lib/librte_acl/rte_acl.c                      | 198 +++++--
 lib/librte_acl/rte_acl.h                      |   3 +-
 15 files changed, 1638 insertions(+), 100 deletions(-)
 create mode 100644 lib/librte_acl/acl_run_avx512.c
 create mode 100644 lib/librte_acl/acl_run_avx512x16.h
 create mode 100644 lib/librte_acl/acl_run_avx512x8.h

-- 
2.17.1