From: Konstantin Ananyev
To: dev@dpdk.org
Cc: jerinj@marvell.com, ruifeng.wang@arm.com, vladimir.medvedkin@intel.com, Konstantin Ananyev
Date: Fri, 7 Aug 2020 17:28:22 +0100
Message-Id: <20200807162829.11690-1-konstantin.ananyev@intel.com>
Subject: [dpdk-dev] [PATCH 20.11 0/7] acl: introduce AVX512 classify method

This patch series introduces support for an AVX512-specific classify
implementation for the ACL library.
Internally it contains two code paths:
one uses mostly 256-bit instructions/registers and can
process up to 16 flows in parallel;
the second uses 512-bit instructions/registers in most
places and can process up to 32 flows in parallel.
The code path is selected internally based on the input burst size
and is completely opaque to the user.

On my SKX box test-acl shows ~20-65% improvement
(depending on rule-set and input burst size)
when switching from the AVX2 to the AVX512 classify algorithm.

Note that this change introduces a formal ABI incompatibility
with previous versions of the ACL library.

TODO list:
- Deduplicate 8/16 code paths
- Update default algorithm selection
- Update docs

This patch series depends on:
https://patches.dpdk.org/patch/70429/
to be applied first.

Konstantin Ananyev (7):
  acl: fix x86 build when compiler doesn't support AVX2
  app/acl: few small improvements
  acl: remove of unused enum value
  acl: add infrastructure to support AVX512 classify
  app/acl: add AVX512 classify support
  acl: introduce AVX512 classify implementation
  acl: enhance AVX512 classify implementation

 app/test-acl/main.c                |  19 +-
 config/x86/meson.build             |   3 +-
 lib/librte_acl/Makefile            |  26 ++
 lib/librte_acl/acl.h               |   4 +
 lib/librte_acl/acl_run_avx512.c    | 140 +++++++
 lib/librte_acl/acl_run_avx512x16.h | 635 +++++++++++++++++++++++++++++
 lib/librte_acl/acl_run_avx512x8.h  | 614 ++++++++++++++++++++++++++++
 lib/librte_acl/meson.build         |  39 ++
 lib/librte_acl/rte_acl.c           |  19 +-
 lib/librte_acl/rte_acl.h           |   2 +-
 10 files changed, 1493 insertions(+), 8 deletions(-)
 create mode 100644 lib/librte_acl/acl_run_avx512.c
 create mode 100644 lib/librte_acl/acl_run_avx512x16.h
 create mode 100644 lib/librte_acl/acl_run_avx512x8.h

-- 
2.17.1
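
For readers who want a concrete picture of the burst-size-based
code-path selection mentioned in the cover letter, here is a minimal
sketch in C. It is not code from the patches: the enum, the function
names and the threshold are all hypothetical, and only the flow counts
(16 and 32) come from the text above. The "passes" calculation is a
deliberate simplification that treats a burst as fixed-size batches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: not taken from the patches. */
enum classify_path {
	PATH_256BIT, /* mostly 256-bit registers, up to 16 flows in parallel */
	PATH_512BIT, /* mostly 512-bit registers, up to 32 flows in parallel */
};

/* Flow counts as quoted in the cover letter. */
#define FLOWS_256BIT 16u
#define FLOWS_512BIT 32u

/*
 * Assumed policy: take the wide path only when the burst can fill it.
 * The threshold actually used by the patches may differ.
 */
static enum classify_path
select_path(uint32_t burst_size)
{
	return (burst_size >= FLOWS_512BIT) ? PATH_512BIT : PATH_256BIT;
}

/* Number of flows the given path handles in parallel. */
static uint32_t
flows_per_pass(enum classify_path path)
{
	switch (path) {
	case PATH_256BIT:
		return FLOWS_256BIT;
	case PATH_512BIT:
		return FLOWS_512BIT;
	}
	assert(0 && "unknown classify path");
	return 0;
}

/* Simplified model: batches needed to cover the burst, rounded up. */
static uint32_t
passes_needed(uint32_t burst_size)
{
	uint32_t width = flows_per_pass(select_path(burst_size));

	return burst_size / width + (burst_size % width != 0);
}
```

Under these assumptions a burst of 8 lookups takes the 256-bit path,
while a burst of 64 takes the 512-bit one; the caller never sees the
difference, which matches the "opaque to the user" statement above.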