From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id ADC0141DB0;
	Thu,  2 Mar 2023 03:22:29 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id EF63942D59;
	Thu,  2 Mar 2023 03:21:01 +0100 (CET)
Received: from mga17.intel.com (mga17.intel.com [192.55.52.151])
 by mails.dpdk.org (Postfix) with ESMTP id 4B30842D0C
 for <dev@dpdk.org>; Thu,  2 Mar 2023 03:20:59 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
 d=intel.com; i=@intel.com; q=dns/txt; s=Intel;
 t=1677723659; x=1709259659;
 h=from:to:cc:subject:date:message-id:in-reply-to:
 references:mime-version:content-transfer-encoding;
 bh=oosuxF3J2rRsqRC5sJrgpfIO1QtMM+g+/mFuKFGJboA=;
 b=WTbmBzaeatMWj+8ecmjxYky7PXLVEMvTpgaCceVdh5cfI6SeQA5MfmnX
 EolWYK/cRiLG71jzYXRcf0mOxc4SvsKdggnXwhWRW3mM5EPqtX5KzgneO
 VBZD+i1GIRrf/RVe7ANQmCZK0L3XJ3pFB5ElBT+sRudRbb56KFeppI6FD
 LUnDZOU4HF/M9n1l2JndMfhQSQgx4VsZc+pyNQJVEjma1Ek9+74GXF0HH
 EVVCda9/jQrRpy4Z5oYfBj/XIKd+ZRqxFdetn8sX0QPD+2vzHYzhUA5cZ
 aO74gOTuA5tgqIkS3rLzF0OcRtxKQfDfyBKMzdV46kp8p0oyTGLHpV7hR w==;
X-IronPort-AV: E=McAfee;i="6500,9779,10636"; a="315013585"
X-IronPort-AV: E=Sophos;i="5.98,226,1673942400"; d="scan'208";a="315013585"
Received: from fmsmga002.fm.intel.com ([10.253.24.26])
 by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 01 Mar 2023 18:20:58 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=McAfee;i="6500,9779,10636"; a="784607584"
X-IronPort-AV: E=Sophos;i="5.98,226,1673942400"; d="scan'208";a="784607584"
Received: from dpdk-mingxial-ice.sh.intel.com ([10.67.110.191])
 by fmsmga002.fm.intel.com with ESMTP; 01 Mar 2023 18:20:57 -0800
From: Mingxia Liu <mingxia.liu@intel.com>
To: dev@dpdk.org,
	beilei.xing@intel.com,
	yuying.zhang@intel.com
Cc: Mingxia Liu <mingxia.liu@intel.com>,
	Wenjun Wu <wenjun1.wu@intel.com>
Subject: [PATCH v8 15/21] net/cpfl: add AVX512 data path for single queue model
Date: Thu,  2 Mar 2023 10:35:21 +0000
Message-Id: <20230302103527.931071-16-mingxia.liu@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230302103527.931071-1-mingxia.liu@intel.com>
References: <20230216003010.3439881-1-mingxia.liu@intel.com>
 <20230302103527.931071-1-mingxia.liu@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

Add support for the AVX512 vector data path in the single queue model.
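For illustration, the per-queue eligibility rules that gate the vector path (applied by the new cpfl_rx_vec_queue_default()/cpfl_tx_vec_queue_default() helpers in this patch) can be sketched in standalone C. The burst constant here is a placeholder standing in for IDPF_VPMD_RX_MAX_BURST/IDPF_VPMD_TX_MAX_BURST, not the driver's real value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder for IDPF_VPMD_RX/TX_MAX_BURST; illustrative only. */
#define VPMD_MAX_BURST 32

static bool is_power_of_2(uint16_t n)
{
	/* Equivalent of rte_is_power_of_2(): exactly one bit set. */
	return n != 0 && (n & (n - 1)) == 0;
}

/* Rx queue may use the vector path only if the ring size is a power of
 * two, the free threshold covers a full vector burst, and the ring size
 * is a whole multiple of that threshold. */
static bool rx_queue_vec_ok(uint16_t nb_desc, uint16_t free_thresh)
{
	if (!is_power_of_2(nb_desc))
		return false;
	if (free_thresh < VPMD_MAX_BURST)
		return false;
	if (nb_desc % free_thresh != 0)
		return false;
	return true;
}

/* Tx queue: rs_thresh must cover a burst and be a multiple of 4,
 * mirroring the patch's (rs_thresh & 3) != 0 check. */
static bool tx_queue_vec_ok(uint16_t rs_thresh)
{
	return rs_thresh >= VPMD_MAX_BURST && (rs_thresh & 3) == 0;
}
```

If any single queue fails its check, the whole device falls back to the scalar path, which is why the dev-level helpers iterate over every configured queue.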

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Signed-off-by: Mingxia Liu <mingxia.liu@intel.com>
---
 doc/guides/nics/cpfl.rst                |  24 +++++-
 drivers/net/cpfl/cpfl_ethdev.c          |   3 +-
 drivers/net/cpfl/cpfl_rxtx.c            |  93 ++++++++++++++++++++++
 drivers/net/cpfl/cpfl_rxtx_vec_common.h | 100 ++++++++++++++++++++++++
 drivers/net/cpfl/meson.build            |  25 +++++-
 5 files changed, 242 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/cpfl/cpfl_rxtx_vec_common.h
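As a summary of the burst-function selection added in cpfl_set_rx_function(), the decision cascade can be condensed into a small hypothetical helper (names and the string labels are illustrative, not driver API): the split-queue model always takes the scalar split path, and the single-queue model takes AVX512 only when all queues are vector-eligible, the EAL SIMD limit allows 512 bits, and the CPU reports both AVX512F and AVX512BW.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative condensation of the cascade in cpfl_set_rx_function(). */
static const char *
select_rx_path(bool splitq, bool queues_vec_ok, int max_simd_bitwidth,
	       bool cpu_avx512f, bool cpu_avx512bw)
{
	if (splitq)
		return "split scalar";	/* split model has no vector path yet */
	if (queues_vec_ok && max_simd_bitwidth >= 512 &&
	    cpu_avx512f && cpu_avx512bw)
		return "single avx512";
	return "single scalar";		/* fallback for the single model */
}
```

The same shape applies to cpfl_set_tx_function(); the AVX512 branch is additionally compiled out entirely when the build environment lacks CC_AVX512_SUPPORT.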

diff --git a/doc/guides/nics/cpfl.rst b/doc/guides/nics/cpfl.rst
index 253fa3afae..e2d71f8a4c 100644
--- a/doc/guides/nics/cpfl.rst
+++ b/doc/guides/nics/cpfl.rst
@@ -82,4 +82,26 @@ Runtime Config Options
 Driver compilation and testing
 ------------------------------
 
-Refer to the document :doc:`build_and_test` for details.
\ No newline at end of file
+Refer to the document :doc:`build_and_test` for details.
+
+Features
+--------
+
+Vector PMD
+~~~~~~~~~~
+
+Vector paths for Rx and Tx are selected automatically.
+The paths are chosen based on two conditions:
+
+- ``CPU``
+
+  On the x86 platform, the driver checks whether the CPU supports AVX512.
+  If it does and the EAL argument ``--force-max-simd-bitwidth``
+  is set to 512, the AVX512 paths will be chosen.
+
+- ``Offload features``
+
+  The supported HW offload features are described in the document cpfl.ini.
+  A value "P" means the offload feature is not supported by the vector path.
+  If any unsupported feature is used, the cpfl vector PMD is disabled
+  and the scalar paths are chosen.
diff --git a/drivers/net/cpfl/cpfl_ethdev.c b/drivers/net/cpfl/cpfl_ethdev.c
index b9bfc38292..8685c6e27b 100644
--- a/drivers/net/cpfl/cpfl_ethdev.c
+++ b/drivers/net/cpfl/cpfl_ethdev.c
@@ -108,7 +108,8 @@ cpfl_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 		RTE_ETH_TX_OFFLOAD_TCP_CKSUM		|
 		RTE_ETH_TX_OFFLOAD_SCTP_CKSUM		|
 		RTE_ETH_TX_OFFLOAD_TCP_TSO		|
-		RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
+		RTE_ETH_TX_OFFLOAD_MULTI_SEGS		|
+		RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE;
 
 	dev_info->default_txconf = (struct rte_eth_txconf) {
 		.tx_free_thresh = CPFL_DEFAULT_TX_FREE_THRESH,
diff --git a/drivers/net/cpfl/cpfl_rxtx.c b/drivers/net/cpfl/cpfl_rxtx.c
index 520f61e07e..a3832acd4f 100644
--- a/drivers/net/cpfl/cpfl_rxtx.c
+++ b/drivers/net/cpfl/cpfl_rxtx.c
@@ -8,6 +8,7 @@
 
 #include "cpfl_ethdev.h"
 #include "cpfl_rxtx.h"
+#include "cpfl_rxtx_vec_common.h"
 
 static uint64_t
 cpfl_rx_offload_convert(uint64_t offload)
@@ -739,24 +740,96 @@ void
 cpfl_set_rx_function(struct rte_eth_dev *dev)
 {
 	struct idpf_vport *vport = dev->data->dev_private;
+#ifdef RTE_ARCH_X86
+	struct idpf_rx_queue *rxq;
+	int i;
+
+	if (cpfl_rx_vec_dev_check_default(dev) == CPFL_VECTOR_PATH &&
+	    rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_128) {
+		vport->rx_vec_allowed = true;
+
+		if (rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512)
+#ifdef CC_AVX512_SUPPORT
+			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 &&
+			    rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512BW) == 1)
+				vport->rx_use_avx512 = true;
+#else
+		PMD_DRV_LOG(NOTICE,
+			    "AVX512 is not supported in build env");
+#endif /* CC_AVX512_SUPPORT */
+	} else {
+		vport->rx_vec_allowed = false;
+	}
+#endif /* RTE_ARCH_X86 */
 
+#ifdef RTE_ARCH_X86
 	if (vport->rxq_model == VIRTCHNL2_QUEUE_MODEL_SPLIT) {
 		PMD_DRV_LOG(NOTICE,
 			    "Using Split Scalar Rx (port %d).",
 			    dev->data->port_id);
 		dev->rx_pkt_burst = idpf_dp_splitq_recv_pkts;
 	} else {
+		if (vport->rx_vec_allowed) {
+			for (i = 0; i < dev->data->nb_rx_queues; i++) {
+				rxq = dev->data->rx_queues[i];
+				(void)idpf_qc_singleq_rx_vec_setup(rxq);
+			}
+#ifdef CC_AVX512_SUPPORT
+			if (vport->rx_use_avx512) {
+				PMD_DRV_LOG(NOTICE,
+					    "Using Single AVX512 Vector Rx (port %d).",
+					    dev->data->port_id);
+				dev->rx_pkt_burst = idpf_dp_singleq_recv_pkts_avx512;
+				return;
+			}
+#endif /* CC_AVX512_SUPPORT */
+		}
 		PMD_DRV_LOG(NOTICE,
 			    "Using Single Scalar Rx (port %d).",
 			    dev->data->port_id);
 		dev->rx_pkt_burst = idpf_dp_singleq_recv_pkts;
 	}
+#else
+	if (vport->rxq_model == VIRTCHNL2_QUEUE_MODEL_SPLIT) {
+		PMD_DRV_LOG(NOTICE,
+			    "Using Split Scalar Rx (port %d).",
+			    dev->data->port_id);
+		dev->rx_pkt_burst = idpf_dp_splitq_recv_pkts;
+	} else {
+		PMD_DRV_LOG(NOTICE,
+			    "Using Single Scalar Rx (port %d).",
+			    dev->data->port_id);
+		dev->rx_pkt_burst = idpf_dp_singleq_recv_pkts;
+	}
+#endif /* RTE_ARCH_X86 */
 }
 
 void
 cpfl_set_tx_function(struct rte_eth_dev *dev)
 {
 	struct idpf_vport *vport = dev->data->dev_private;
+#ifdef RTE_ARCH_X86
+#ifdef CC_AVX512_SUPPORT
+	struct idpf_tx_queue *txq;
+	int i;
+#endif /* CC_AVX512_SUPPORT */
+
+	if (cpfl_tx_vec_dev_check_default(dev) == CPFL_VECTOR_PATH &&
+	    rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_128) {
+		vport->tx_vec_allowed = true;
+		if (rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512)
+#ifdef CC_AVX512_SUPPORT
+			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 &&
+			    rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512BW) == 1)
+				vport->tx_use_avx512 = true;
+#else
+		PMD_DRV_LOG(NOTICE,
+			    "AVX512 is not supported in build env");
+#endif /* CC_AVX512_SUPPORT */
+	} else {
+		vport->tx_vec_allowed = false;
+	}
+#endif /* RTE_ARCH_X86 */
 
 	if (vport->txq_model == VIRTCHNL2_QUEUE_MODEL_SPLIT) {
 		PMD_DRV_LOG(NOTICE,
@@ -765,6 +838,26 @@ cpfl_set_tx_function(struct rte_eth_dev *dev)
 		dev->tx_pkt_burst = idpf_dp_splitq_xmit_pkts;
 		dev->tx_pkt_prepare = idpf_dp_prep_pkts;
 	} else {
+#ifdef RTE_ARCH_X86
+		if (vport->tx_vec_allowed) {
+#ifdef CC_AVX512_SUPPORT
+			if (vport->tx_use_avx512) {
+				for (i = 0; i < dev->data->nb_tx_queues; i++) {
+					txq = dev->data->tx_queues[i];
+					if (txq == NULL)
+						continue;
+					idpf_qc_tx_vec_avx512_setup(txq);
+				}
+				PMD_DRV_LOG(NOTICE,
+					    "Using Single AVX512 Vector Tx (port %d).",
+					    dev->data->port_id);
+				dev->tx_pkt_burst = idpf_dp_singleq_xmit_pkts_avx512;
+				dev->tx_pkt_prepare = idpf_dp_prep_pkts;
+				return;
+			}
+#endif /* CC_AVX512_SUPPORT */
+		}
+#endif /* RTE_ARCH_X86 */
 		PMD_DRV_LOG(NOTICE,
 			    "Using Single Scalar Tx (port %d).",
 			    dev->data->port_id);
diff --git a/drivers/net/cpfl/cpfl_rxtx_vec_common.h b/drivers/net/cpfl/cpfl_rxtx_vec_common.h
new file mode 100644
index 0000000000..2d4c6a0ef3
--- /dev/null
+++ b/drivers/net/cpfl/cpfl_rxtx_vec_common.h
@@ -0,0 +1,100 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Intel Corporation
+ */
+
+#ifndef _CPFL_RXTX_VEC_COMMON_H_
+#define _CPFL_RXTX_VEC_COMMON_H_
+#include <stdint.h>
+#include <ethdev_driver.h>
+#include <rte_malloc.h>
+
+#include "cpfl_ethdev.h"
+#include "cpfl_rxtx.h"
+
+#ifndef __INTEL_COMPILER
+#pragma GCC diagnostic ignored "-Wcast-qual"
+#endif
+
+#define CPFL_SCALAR_PATH		0
+#define CPFL_VECTOR_PATH		1
+#define CPFL_RX_NO_VECTOR_FLAGS (		\
+		RTE_ETH_RX_OFFLOAD_IPV4_CKSUM |	\
+		RTE_ETH_RX_OFFLOAD_UDP_CKSUM |	\
+		RTE_ETH_RX_OFFLOAD_TCP_CKSUM |	\
+		RTE_ETH_RX_OFFLOAD_OUTER_IPV4_CKSUM |	\
+		RTE_ETH_RX_OFFLOAD_TIMESTAMP)
+#define CPFL_TX_NO_VECTOR_FLAGS (		\
+		RTE_ETH_TX_OFFLOAD_TCP_TSO |	\
+		RTE_ETH_TX_OFFLOAD_MULTI_SEGS)
+
+static inline int
+cpfl_rx_vec_queue_default(struct idpf_rx_queue *rxq)
+{
+	if (rxq == NULL)
+		return CPFL_SCALAR_PATH;
+
+	if (rte_is_power_of_2(rxq->nb_rx_desc) == 0)
+		return CPFL_SCALAR_PATH;
+
+	if (rxq->rx_free_thresh < IDPF_VPMD_RX_MAX_BURST)
+		return CPFL_SCALAR_PATH;
+
+	if ((rxq->nb_rx_desc % rxq->rx_free_thresh) != 0)
+		return CPFL_SCALAR_PATH;
+
+	if ((rxq->offloads & CPFL_RX_NO_VECTOR_FLAGS) != 0)
+		return CPFL_SCALAR_PATH;
+
+	return CPFL_VECTOR_PATH;
+}
+
+static inline int
+cpfl_tx_vec_queue_default(struct idpf_tx_queue *txq)
+{
+	if (txq == NULL)
+		return CPFL_SCALAR_PATH;
+
+	if (txq->rs_thresh < IDPF_VPMD_TX_MAX_BURST ||
+	    (txq->rs_thresh & 3) != 0)
+		return CPFL_SCALAR_PATH;
+
+	if ((txq->offloads & CPFL_TX_NO_VECTOR_FLAGS) != 0)
+		return CPFL_SCALAR_PATH;
+
+	return CPFL_VECTOR_PATH;
+}
+
+static inline int
+cpfl_rx_vec_dev_check_default(struct rte_eth_dev *dev)
+{
+	struct idpf_rx_queue *rxq;
+	int i, ret = 0;
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		rxq = dev->data->rx_queues[i];
+		ret = (cpfl_rx_vec_queue_default(rxq));
+		if (ret == CPFL_SCALAR_PATH)
+			return CPFL_SCALAR_PATH;
+	}
+
+	return CPFL_VECTOR_PATH;
+}
+
+static inline int
+cpfl_tx_vec_dev_check_default(struct rte_eth_dev *dev)
+{
+	int i;
+	struct idpf_tx_queue *txq;
+	int ret = 0;
+
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+		txq = dev->data->tx_queues[i];
+		ret = cpfl_tx_vec_queue_default(txq);
+		if (ret == CPFL_SCALAR_PATH)
+			return CPFL_SCALAR_PATH;
+	}
+
+	return CPFL_VECTOR_PATH;
+}
+
+#endif /*_CPFL_RXTX_VEC_COMMON_H_*/
diff --git a/drivers/net/cpfl/meson.build b/drivers/net/cpfl/meson.build
index 1894423689..fbe6500826 100644
--- a/drivers/net/cpfl/meson.build
+++ b/drivers/net/cpfl/meson.build
@@ -7,9 +7,32 @@ if is_windows
     subdir_done()
 endif
 
+if dpdk_conf.get('RTE_IOVA_AS_PA') == 0
+    build = false
+    reason = 'driver does not support disabling IOVA as PA mode'
+    subdir_done()
+endif
+
 deps += ['common_idpf']
 
 sources = files(
         'cpfl_ethdev.c',
         'cpfl_rxtx.c',
-)
\ No newline at end of file
+)
+
+if arch_subdir == 'x86'
+    cpfl_avx512_cpu_support = (
+        cc.get_define('__AVX512F__', args: machine_args) != '' and
+        cc.get_define('__AVX512BW__', args: machine_args) != ''
+    )
+
+    cpfl_avx512_cc_support = (
+        not machine_args.contains('-mno-avx512f') and
+        cc.has_argument('-mavx512f') and
+        cc.has_argument('-mavx512bw')
+    )
+
+    if cpfl_avx512_cpu_support == true or cpfl_avx512_cc_support == true
+        cflags += ['-DCC_AVX512_SUPPORT']
+    endif
+endif
-- 
2.34.1