From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 3BCF8A04BB;
	Tue,  6 Oct 2020 17:08:05 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 1A8BA1B68E;
	Tue,  6 Oct 2020 17:08:04 +0200 (CEST)
Received: from mga12.intel.com (mga12.intel.com [192.55.52.136])
 by dpdk.org (Postfix) with ESMTP id 148F21B671
 for <dev@dpdk.org>; Tue,  6 Oct 2020 17:08:02 +0200 (CEST)
IronPort-SDR: IVuAUv6RFbu3CyWWUR1Sx3hO2/vWS4oapnNQoPkyoZOVbdcztBmLBcKbGWmWFI/Jwq4qRTOJyo
 9oOoow9xzBuw==
X-IronPort-AV: E=McAfee;i="6000,8403,9765"; a="143919402"
X-IronPort-AV: E=Sophos;i="5.77,343,1596524400"; d="scan'208";a="143919402"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga005.fm.intel.com ([10.253.24.32])
 by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 06 Oct 2020 08:03:29 -0700
IronPort-SDR: Sm9ZGUKD8iCAiPpbyTVUYuHJjGR6MnqmmTERic6o8TPwVyUA4bE30DFIf7+R9kypUUOrz776Y0
 dzxBw6OB2p1w==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.77,343,1596524400"; d="scan'208";a="518315332"
Received: from sivswdev08.ir.intel.com ([10.237.217.47])
 by fmsmga005.fm.intel.com with ESMTP; 06 Oct 2020 08:03:24 -0700
From: Konstantin Ananyev <konstantin.ananyev@intel.com>
To: dev@dpdk.org
Cc: jerinj@marvell.com, ruifeng.wang@arm.com, vladimir.medvedkin@intel.com,
 Konstantin Ananyev <konstantin.ananyev@intel.com>
Date: Tue,  6 Oct 2020 16:03:02 +0100
Message-Id: <20201006150316.5776-1-konstantin.ananyev@intel.com>
X-Mailer: git-send-email 2.18.0
In-Reply-To: <20201005184526.7465-1-konstantin.ananyev@intel.com>
References: <20201005184526.7465-1-konstantin.ananyev@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: [dpdk-dev] [PATCH v4 00/14] acl: introduce AVX512 classify methods
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

This patch series introduces AVX512-specific classify
implementations for the ACL library.
It adds two new algorithms:
 - RTE_ACL_CLASSIFY_AVX512X16 - can process up to 16 flows in parallel.
   It uses 256-bit width instructions/registers only
   (to avoid frequency level change).
   On my SKX box test-acl shows ~15-30% improvement
   (depending on rule-set and input burst size)
   when switching from AVX2 to AVX512X16 classify algorithms.
 - RTE_ACL_CLASSIFY_AVX512X32 - can process up to 32 flows in parallel.
   It uses 512-bit width instructions/registers and provides higher
   performance than AVX512X16, but can cause frequency level change.
   On my SKX box test-acl shows ~50-70% improvement
   (depending on rule-set and input burst size)
   when switching from AVX2 to AVX512X32 classify algorithms.
   ICX and CLX testing showed similar level of speedup.
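
A minimal sketch of how an application might pick between the two
methods, trying the fastest candidate first and falling back when it
is rejected. The enum, the struct and set_ctx_classify() below are
hypothetical stand-ins so the sketch builds without DPDK; real code
would call rte_acl_set_ctx_classify() with the RTE_ACL_CLASSIFY_*
values, and the -ENOTSUP return for an unavailable method is an
assumption here, not something this cover letter states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rte_acl.h definitions (assumption). */
enum classify_alg {
	ALG_SCALAR,
	ALG_AVX2,
	ALG_AVX512X16,	/* 256-bit registers, up to 16 flows */
	ALG_AVX512X32,	/* 512-bit registers, up to 32 flows */
};

struct acl_ctx {
	unsigned int supported;	/* bitmask of methods the mock platform runs */
	enum classify_alg alg;	/* currently selected method */
};

/*
 * Mock of rte_acl_set_ctx_classify(): assumed to fail with -ENOTSUP
 * when the requested method is not available.
 */
static int
set_ctx_classify(struct acl_ctx *ctx, enum classify_alg alg)
{
	if ((ctx->supported & (1u << alg)) == 0)
		return -ENOTSUP;
	ctx->alg = alg;
	return 0;
}

/*
 * Try candidates from fastest to slowest and keep the first accepted.
 * allow_512bit == 0 skips AVX512X32, for applications that want to
 * avoid a possible frequency level change.
 */
static int
select_classify(struct acl_ctx *ctx, int allow_512bit)
{
	static const enum classify_alg order[] = {
		ALG_AVX512X32, ALG_AVX512X16, ALG_AVX2, ALG_SCALAR,
	};
	size_t i;

	assert(ctx != NULL);
	for (i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
		if (order[i] == ALG_AVX512X32 && !allow_512bit)
			continue;
		if (set_ctx_classify(ctx, order[i]) == 0)
			return 0;
	}
	return -ENOTSUP;
}
```

With all four methods available, select_classify(&ctx, 0) would settle
on the 256-bit method and select_classify(&ctx, 1) on the 512-bit one.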

The current AVX512 classify implementation is supported on x86_64 only.
Note that this series introduces a formal ABI incompatibility
with previous versions of the ACL library.

Depends-on: patch-79310 ("eal/x86: introduce AVX 512-bit type")

v3 -> v4:
  Fix problems with meson 0.47
  Update to conform to the latest changes in the mainline
  (removal of RTE_MACHINE_CPUFLAG_*)
  Fix checkpatch warnings

v2 -> v3:
  Fix checkpatch warnings
  Split AVX512 algorithm into two and deduplicate common code
v1 -> v2:
  Deduplicated 8/16 code paths as much as possible
  Updated default algorithm selection
    Removed library constructor to make it easier to integrate with
    https://patches.dpdk.org/project/dpdk/list/?series=11831
  Updated docs


Konstantin Ananyev (14):
  acl: fix x86 build when compiler doesn't support AVX2
  doc: fix missing classify methods in ACL guide
  acl: remove of unused enum value
  acl: remove library constructor
  app/acl: few small improvements
  test/acl: expand classify test coverage
  acl: add infrastructure to support AVX512 classify
  acl: introduce 256-bit width AVX512 classify implementation
  acl: update default classify algorithm selection
  acl: introduce 512-bit width AVX512 classify implementation
  acl: for AVX512 classify use 4B load whenever possible
  acl: deduplicate AVX512 code paths
  test/acl: add AVX512 classify support
  app/acl: add AVX512 classify support

 app/test-acl/main.c                           |  23 +-
 app/test/test_acl.c                           | 105 ++--
 config/x86/meson.build                        |   3 +-
 .../prog_guide/packet_classif_access_ctrl.rst |  20 +
 doc/guides/rel_notes/deprecation.rst          |   4 -
 doc/guides/rel_notes/release_20_11.rst        |  12 +
 lib/librte_acl/acl.h                          |  16 +
 lib/librte_acl/acl_bld.c                      |  34 ++
 lib/librte_acl/acl_gen.c                      |   2 +-
 lib/librte_acl/acl_run_avx512.c               | 164 ++++++
 lib/librte_acl/acl_run_avx512_common.h        | 477 ++++++++++++++++++
 lib/librte_acl/acl_run_avx512x16.h            | 341 +++++++++++++
 lib/librte_acl/acl_run_avx512x8.h             | 253 ++++++++++
 lib/librte_acl/meson.build                    |  48 ++
 lib/librte_acl/rte_acl.c                      | 212 ++++++--
 lib/librte_acl/rte_acl.h                      |   4 +-
 16 files changed, 1618 insertions(+), 100 deletions(-)
 create mode 100644 lib/librte_acl/acl_run_avx512.c
 create mode 100644 lib/librte_acl/acl_run_avx512_common.h
 create mode 100644 lib/librte_acl/acl_run_avx512x16.h
 create mode 100644 lib/librte_acl/acl_run_avx512x8.h

-- 
2.17.1