From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id ADC0141DB0; Thu, 2 Mar 2023 03:22:29 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id EF63942D59; Thu, 2 Mar 2023 03:21:01 +0100 (CET) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by mails.dpdk.org (Postfix) with ESMTP id 4B30842D0C for ; Thu, 2 Mar 2023 03:20:59 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1677723659; x=1709259659; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=oosuxF3J2rRsqRC5sJrgpfIO1QtMM+g+/mFuKFGJboA=; b=WTbmBzaeatMWj+8ecmjxYky7PXLVEMvTpgaCceVdh5cfI6SeQA5MfmnX EolWYK/cRiLG71jzYXRcf0mOxc4SvsKdggnXwhWRW3mM5EPqtX5KzgneO VBZD+i1GIRrf/RVe7ANQmCZK0L3XJ3pFB5ElBT+sRudRbb56KFeppI6FD LUnDZOU4HF/M9n1l2JndMfhQSQgx4VsZc+pyNQJVEjma1Ek9+74GXF0HH EVVCda9/jQrRpy4Z5oYfBj/XIKd+ZRqxFdetn8sX0QPD+2vzHYzhUA5cZ aO74gOTuA5tgqIkS3rLzF0OcRtxKQfDfyBKMzdV46kp8p0oyTGLHpV7hR w==; X-IronPort-AV: E=McAfee;i="6500,9779,10636"; a="315013585" X-IronPort-AV: E=Sophos;i="5.98,226,1673942400"; d="scan'208";a="315013585" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Mar 2023 18:20:58 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10636"; a="784607584" X-IronPort-AV: E=Sophos;i="5.98,226,1673942400"; d="scan'208";a="784607584" Received: from dpdk-mingxial-ice.sh.intel.com ([10.67.110.191]) by fmsmga002.fm.intel.com with ESMTP; 01 Mar 2023 18:20:57 -0800 From: Mingxia Liu To: dev@dpdk.org, beilei.xing@intel.com, yuying.zhang@intel.com Cc: Mingxia Liu , Wenjun Wu Subject: [PATCH v8 15/21] net/cpfl: add AVX512 data path for single queue model Date: Thu, 2 Mar 2023 10:35:21 +0000 Message-Id: <20230302103527.931071-16-mingxia.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230302103527.931071-1-mingxia.liu@intel.com> References: <20230216003010.3439881-1-mingxia.liu@intel.com> <20230302103527.931071-1-mingxia.liu@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add support of AVX512 vector data path for single queue model. Signed-off-by: Wenjun Wu Signed-off-by: Mingxia Liu --- doc/guides/nics/cpfl.rst | 24 +++++- drivers/net/cpfl/cpfl_ethdev.c | 3 +- drivers/net/cpfl/cpfl_rxtx.c | 93 ++++++++++++++++++++++ drivers/net/cpfl/cpfl_rxtx_vec_common.h | 100 ++++++++++++++++++++++++ drivers/net/cpfl/meson.build | 25 +++++- 5 files changed, 242 insertions(+), 3 deletions(-) create mode 100644 drivers/net/cpfl/cpfl_rxtx_vec_common.h diff --git a/doc/guides/nics/cpfl.rst b/doc/guides/nics/cpfl.rst index 253fa3afae..e2d71f8a4c 100644 --- a/doc/guides/nics/cpfl.rst +++ b/doc/guides/nics/cpfl.rst @@ -82,4 +82,26 @@ Runtime Config Options Driver compilation and testing ------------------------------ -Refer to the document :doc:`build_and_test` for details. \ No newline at end of file +Refer to the document :doc:`build_and_test` for details. + +Features +-------- + +Vector PMD +~~~~~~~~~~ + +Vector path for Rx and Tx path are selected automatically. +The paths are chosen based on 2 conditions: + +- ``CPU`` + + On the x86 platform, the driver checks if the CPU supports AVX512. + If the CPU supports AVX512 and EAL argument ``--force-max-simd-bitwidth`` + is set to 512, AVX512 paths will be chosen. + +- ``Offload features`` + + The supported HW offload features are described in the document cpfl.ini, + A value "P" means the offload feature is not supported by vector path. + If any not supported features are used, cpfl vector PMD is disabled + and the scalar paths are chosen. diff --git a/drivers/net/cpfl/cpfl_ethdev.c b/drivers/net/cpfl/cpfl_ethdev.c index b9bfc38292..8685c6e27b 100644 --- a/drivers/net/cpfl/cpfl_ethdev.c +++ b/drivers/net/cpfl/cpfl_ethdev.c @@ -108,7 +108,8 @@ cpfl_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) RTE_ETH_TX_OFFLOAD_TCP_CKSUM | RTE_ETH_TX_OFFLOAD_SCTP_CKSUM | RTE_ETH_TX_OFFLOAD_TCP_TSO | - RTE_ETH_TX_OFFLOAD_MULTI_SEGS; + RTE_ETH_TX_OFFLOAD_MULTI_SEGS | + RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE; dev_info->default_txconf = (struct rte_eth_txconf) { .tx_free_thresh = CPFL_DEFAULT_TX_FREE_THRESH, diff --git a/drivers/net/cpfl/cpfl_rxtx.c b/drivers/net/cpfl/cpfl_rxtx.c index 520f61e07e..a3832acd4f 100644 --- a/drivers/net/cpfl/cpfl_rxtx.c +++ b/drivers/net/cpfl/cpfl_rxtx.c @@ -8,6 +8,7 @@ #include "cpfl_ethdev.h" #include "cpfl_rxtx.h" +#include "cpfl_rxtx_vec_common.h" static uint64_t cpfl_rx_offload_convert(uint64_t offload) @@ -739,24 +740,96 @@ void cpfl_set_rx_function(struct rte_eth_dev *dev) { struct idpf_vport *vport = dev->data->dev_private; +#ifdef RTE_ARCH_X86 + struct idpf_rx_queue *rxq; + int i; + + if (cpfl_rx_vec_dev_check_default(dev) == CPFL_VECTOR_PATH && + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_128) { + vport->rx_vec_allowed = true; + + if (rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512) +#ifdef CC_AVX512_SUPPORT + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 && + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512BW) == 1) + vport->rx_use_avx512 = true; +#else + PMD_DRV_LOG(NOTICE, + "AVX512 is not supported in build env"); +#endif /* CC_AVX512_SUPPORT */ + } else { + vport->rx_vec_allowed = false; + } +#endif /* RTE_ARCH_X86 */ +#ifdef RTE_ARCH_X86 if (vport->rxq_model == VIRTCHNL2_QUEUE_MODEL_SPLIT) { PMD_DRV_LOG(NOTICE, "Using Split Scalar Rx (port %d).", dev->data->port_id); dev->rx_pkt_burst = idpf_dp_splitq_recv_pkts; } else { + if (vport->rx_vec_allowed) { + for (i = 0; i < dev->data->nb_rx_queues; i++) { + rxq = dev->data->rx_queues[i]; + (void)idpf_qc_singleq_rx_vec_setup(rxq); + } +#ifdef CC_AVX512_SUPPORT + if (vport->rx_use_avx512) { + PMD_DRV_LOG(NOTICE, + "Using Single AVX512 Vector Rx (port %d).", + dev->data->port_id); + dev->rx_pkt_burst = idpf_dp_singleq_recv_pkts_avx512; + return; + } +#endif /* CC_AVX512_SUPPORT */ + } PMD_DRV_LOG(NOTICE, "Using Single Scalar Rx (port %d).", dev->data->port_id); dev->rx_pkt_burst = idpf_dp_singleq_recv_pkts; } +#else + if (vport->rxq_model == VIRTCHNL2_QUEUE_MODEL_SPLIT) { + PMD_DRV_LOG(NOTICE, + "Using Split Scalar Rx (port %d).", + dev->data->port_id); + dev->rx_pkt_burst = idpf_dp_splitq_recv_pkts; + } else { + PMD_DRV_LOG(NOTICE, + "Using Single Scalar Rx (port %d).", + dev->data->port_id); + dev->rx_pkt_burst = idpf_dp_singleq_recv_pkts; + } +#endif /* RTE_ARCH_X86 */ } void cpfl_set_tx_function(struct rte_eth_dev *dev) { struct idpf_vport *vport = dev->data->dev_private; +#ifdef RTE_ARCH_X86 +#ifdef CC_AVX512_SUPPORT + struct idpf_tx_queue *txq; + int i; +#endif /* CC_AVX512_SUPPORT */ + + if (cpfl_tx_vec_dev_check_default(dev) == CPFL_VECTOR_PATH && + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_128) { + vport->tx_vec_allowed = true; + if (rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512) +#ifdef CC_AVX512_SUPPORT + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 && + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512BW) == 1) + vport->tx_use_avx512 = true; +#else + PMD_DRV_LOG(NOTICE, + "AVX512 is not supported in build env"); +#endif /* CC_AVX512_SUPPORT */ + } else { + vport->tx_vec_allowed = false; + } +#endif /* RTE_ARCH_X86 */ if (vport->txq_model == VIRTCHNL2_QUEUE_MODEL_SPLIT) { PMD_DRV_LOG(NOTICE, @@ -765,6 +838,26 @@ cpfl_set_tx_function(struct rte_eth_dev *dev) dev->tx_pkt_burst = idpf_dp_splitq_xmit_pkts; dev->tx_pkt_prepare = idpf_dp_prep_pkts; } else { +#ifdef RTE_ARCH_X86 + if (vport->tx_vec_allowed) { +#ifdef CC_AVX512_SUPPORT + if (vport->tx_use_avx512) { + for (i = 0; i < dev->data->nb_tx_queues; i++) { + txq = dev->data->tx_queues[i]; + if (txq == NULL) + continue; + idpf_qc_tx_vec_avx512_setup(txq); + } + PMD_DRV_LOG(NOTICE, + "Using Single AVX512 Vector Tx (port %d).", + dev->data->port_id); + dev->tx_pkt_burst = idpf_dp_singleq_xmit_pkts_avx512; + dev->tx_pkt_prepare = idpf_dp_prep_pkts; + return; + } +#endif /* CC_AVX512_SUPPORT */ + } +#endif /* RTE_ARCH_X86 */ PMD_DRV_LOG(NOTICE, "Using Single Scalar Tx (port %d).", dev->data->port_id); diff --git a/drivers/net/cpfl/cpfl_rxtx_vec_common.h b/drivers/net/cpfl/cpfl_rxtx_vec_common.h new file mode 100644 index 0000000000..2d4c6a0ef3 --- /dev/null +++ b/drivers/net/cpfl/cpfl_rxtx_vec_common.h @@ -0,0 +1,100 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Intel Corporation + */ + +#ifndef _CPFL_RXTX_VEC_COMMON_H_ +#define _CPFL_RXTX_VEC_COMMON_H_ +#include +#include +#include + +#include "cpfl_ethdev.h" +#include "cpfl_rxtx.h" + +#ifndef __INTEL_COMPILER +#pragma GCC diagnostic ignored "-Wcast-qual" +#endif + +#define CPFL_SCALAR_PATH 0 +#define CPFL_VECTOR_PATH 1 +#define CPFL_RX_NO_VECTOR_FLAGS ( \ + RTE_ETH_RX_OFFLOAD_IPV4_CKSUM | \ + RTE_ETH_RX_OFFLOAD_UDP_CKSUM | \ + RTE_ETH_RX_OFFLOAD_TCP_CKSUM | \ + RTE_ETH_RX_OFFLOAD_OUTER_IPV4_CKSUM | \ + RTE_ETH_RX_OFFLOAD_TIMESTAMP) +#define CPFL_TX_NO_VECTOR_FLAGS ( \ + RTE_ETH_TX_OFFLOAD_TCP_TSO | \ + RTE_ETH_TX_OFFLOAD_MULTI_SEGS) + +static inline int +cpfl_rx_vec_queue_default(struct idpf_rx_queue *rxq) +{ + if (rxq == NULL) + return CPFL_SCALAR_PATH; + + if (rte_is_power_of_2(rxq->nb_rx_desc) == 0) + return CPFL_SCALAR_PATH; + + if (rxq->rx_free_thresh < IDPF_VPMD_RX_MAX_BURST) + return CPFL_SCALAR_PATH; + + if ((rxq->nb_rx_desc % rxq->rx_free_thresh) != 0) + return CPFL_SCALAR_PATH; + + if ((rxq->offloads & CPFL_RX_NO_VECTOR_FLAGS) != 0) + return CPFL_SCALAR_PATH; + + return CPFL_VECTOR_PATH; +} + +static inline int +cpfl_tx_vec_queue_default(struct idpf_tx_queue *txq) +{ + if (txq == NULL) + return CPFL_SCALAR_PATH; + + if (txq->rs_thresh < IDPF_VPMD_TX_MAX_BURST || + (txq->rs_thresh & 3) != 0) + return CPFL_SCALAR_PATH; + + if ((txq->offloads & CPFL_TX_NO_VECTOR_FLAGS) != 0) + return CPFL_SCALAR_PATH; + + return CPFL_VECTOR_PATH; +} + +static inline int +cpfl_rx_vec_dev_check_default(struct rte_eth_dev *dev) +{ + struct idpf_rx_queue *rxq; + int i, ret = 0; + + for (i = 0; i < dev->data->nb_rx_queues; i++) { + rxq = dev->data->rx_queues[i]; + ret = (cpfl_rx_vec_queue_default(rxq)); + if (ret == CPFL_SCALAR_PATH) + return CPFL_SCALAR_PATH; + } + + return CPFL_VECTOR_PATH; +} + +static inline int +cpfl_tx_vec_dev_check_default(struct rte_eth_dev *dev) +{ + int i; + struct idpf_tx_queue *txq; + int ret = 0; + + for (i = 0; i < dev->data->nb_tx_queues; i++) { + txq = dev->data->tx_queues[i]; + ret = cpfl_tx_vec_queue_default(txq); + if (ret == CPFL_SCALAR_PATH) + return CPFL_SCALAR_PATH; + } + + return CPFL_VECTOR_PATH; +} + +#endif /*_CPFL_RXTX_VEC_COMMON_H_*/ diff --git a/drivers/net/cpfl/meson.build b/drivers/net/cpfl/meson.build index 1894423689..fbe6500826 100644 --- a/drivers/net/cpfl/meson.build +++ b/drivers/net/cpfl/meson.build @@ -7,9 +7,32 @@ if is_windows subdir_done() endif +if dpdk_conf.get('RTE_IOVA_AS_PA') == 0 + build = false + reason = 'driver does not support disabling IOVA as PA mode' + subdir_done() +endif + deps += ['common_idpf'] sources = files( 'cpfl_ethdev.c', 'cpfl_rxtx.c', -) \ No newline at end of file +) + +if arch_subdir == 'x86' + cpfl_avx512_cpu_support = ( + cc.get_define('__AVX512F__', args: machine_args) != '' and + cc.get_define('__AVX512BW__', args: machine_args) != '' + ) + + cpfl_avx512_cc_support = ( + not machine_args.contains('-mno-avx512f') and + cc.has_argument('-mavx512f') and + cc.has_argument('-mavx512bw') + ) + + if cpfl_avx512_cpu_support == true or cpfl_avx512_cc_support == true + cflags += ['-DCC_AVX512_SUPPORT'] + endif +endif -- 2.34.1