From: Bruce Richardson <bruce.richardson@intel.com>
To: dev@dpdk.org
Cc: Bruce Richardson, Ian Stokes, David Christensen, Konstantin Ananyev,
	Wathsala Vithanage
Subject: [RFC PATCH 19/21] net/i40e: use vector SW ring for all vector paths
Date: Fri, 22 Nov 2024 12:54:12 +0000
Message-ID: <20241122125418.2857301-20-bruce.richardson@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20241122125418.2857301-1-bruce.richardson@intel.com>
References: <20241122125418.2857301-1-bruce.richardson@intel.com>

The AVX-512 code path used a smaller SW ring structure containing only the
mbuf pointer and none of the other fields. Since those other fields are
used only in the scalar code path, update all remaining vector driver code
paths (AVX2, SSE, Neon, Altivec) to use the smaller, faster structure.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
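For context, a sketch of the two SW ring entry layouts involved. This is
not part of the patch itself: the ieth_tx_entry fields beyond the mbuf
pointer are assumptions based on the traditional i40e scalar sw_ring
entry, while ieth_vec_tx_entry and the backlog helper are shown with the
same shape and body as the tx_backlog_entry_avx512() code removed below.

/* Sketch only, field names beyond 'mbuf' are assumed. */
struct ieth_tx_entry {		/* full entry, needed only by the scalar path */
	struct rte_mbuf *mbuf;	/* mbuf associated with the TX descriptor */
	uint16_t next_id;	/* assumed: index of next entry in the ring */
	uint16_t last_id;	/* assumed: index of packet's last descriptor */
};

struct ieth_vec_tx_entry {	/* mbuf-only entry, now used by all vector paths */
	struct rte_mbuf *mbuf;
};

/* Same body as the removed tx_backlog_entry_avx512(): record the mbufs
 * just queued for TX so that ieth_tx_free_bufs_vector() can free them
 * once the descriptors complete. */
static __rte_always_inline void
ieth_tx_backlog_entry_vec(struct ieth_vec_tx_entry *txep,
		struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
{
	int i;

	for (i = 0; i < (int)nb_pkts; ++i)
		txep[i].mbuf = tx_pkts[i];
}

Because every vector TX path now uses the mbuf-only ring, the queue-start
code below can tie txq->vector_sw_ring directly to txq->vector_tx instead
of enabling it only when AVX-512 is in use.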
 drivers/net/i40e/i40e_rxtx.c             |  8 +++++---
 drivers/net/i40e/i40e_rxtx_vec_altivec.c | 12 ++++++------
 drivers/net/i40e/i40e_rxtx_vec_avx2.c    | 12 ++++++------
 drivers/net/i40e/i40e_rxtx_vec_avx512.c  | 14 ++------------
 drivers/net/i40e/i40e_rxtx_vec_common.h  |  6 ------
 drivers/net/i40e/i40e_rxtx_vec_neon.c    | 12 ++++++------
 drivers/net/i40e/i40e_rxtx_vec_sse.c     | 12 ++++++------
 7 files changed, 31 insertions(+), 45 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 4878b9b8aa..05f7f380c4 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1892,7 +1892,7 @@ i40e_dev_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id)
 			tx_queue_id);
 
 	txq->vector_tx = ad->tx_vec_allowed;
-	txq->vector_sw_ring = ad->tx_use_avx512;
+	txq->vector_sw_ring = txq->vector_tx;
 
 	/*
 	 * tx_queue_id is queue id application refers to, while
@@ -3551,9 +3551,11 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
 		}
 	}
 
+	if (rte_vect_get_max_simd_bitwidth() < RTE_VECT_SIMD_128)
+		ad->tx_vec_allowed = false;
+
 	if (ad->tx_simple_allowed) {
-		if (ad->tx_vec_allowed &&
-				rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_128) {
+		if (ad->tx_vec_allowed) {
 #ifdef RTE_ARCH_X86
 			if (ad->tx_use_avx512) {
 #ifdef CC_AVX512_SUPPORT
diff --git a/drivers/net/i40e/i40e_rxtx_vec_altivec.c b/drivers/net/i40e/i40e_rxtx_vec_altivec.c
index 2ab09eb167..7acf44d3fe 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_altivec.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_altivec.c
@@ -553,14 +553,14 @@ i40e_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
 {
 	struct ieth_tx_queue *txq = (struct ieth_tx_queue *)tx_queue;
 	volatile struct i40e_tx_desc *txdp;
-	struct ieth_tx_entry *txep;
+	struct ieth_vec_tx_entry *txep;
 	uint16_t n, nb_commit, tx_id;
 	uint64_t flags = I40E_TD_CMD;
 	uint64_t rs = I40E_TX_DESC_CMD_RS | I40E_TD_CMD;
 	int i;
 
 	if (txq->nb_tx_free < txq->tx_free_thresh)
-		i40e_tx_free_bufs(txq);
+		ieth_tx_free_bufs_vector(txq, i40e_tx_desc_done, false);
 
 	nb_pkts = (uint16_t)RTE_MIN(txq->nb_tx_free, nb_pkts);
 	nb_commit = nb_pkts;
@@ -569,13 +569,13 @@ i40e_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
 
 	tx_id = txq->tx_tail;
 	txdp = &txq->i40e_tx_ring[tx_id];
-	txep = &txq->sw_ring[tx_id];
+	txep = &txq->sw_ring_v[tx_id];
 
 	txq->nb_tx_free = (uint16_t)(txq->nb_tx_free - nb_pkts);
 
 	n = (uint16_t)(txq->nb_tx_desc - tx_id);
 	if (nb_commit >= n) {
-		ieth_tx_backlog_entry(txep, tx_pkts, n);
+		ieth_tx_backlog_entry_vec(txep, tx_pkts, n);
 
 		for (i = 0; i < n - 1; ++i, ++tx_pkts, ++txdp)
 			vtx1(txdp, *tx_pkts, flags);
@@ -589,10 +589,10 @@ i40e_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
 
 		/* avoid reach the end of ring */
 		txdp = &txq->i40e_tx_ring[tx_id];
-		txep = &txq->sw_ring[tx_id];
+		txep = &txq->sw_ring_v[tx_id];
 	}
 
-	ieth_tx_backlog_entry(txep, tx_pkts, nb_commit);
+	ieth_tx_backlog_entry_vec(txep, tx_pkts, nb_commit);
 
 	vtx(txdp, tx_pkts, nb_commit, flags);
 
diff --git a/drivers/net/i40e/i40e_rxtx_vec_avx2.c b/drivers/net/i40e/i40e_rxtx_vec_avx2.c
index e32fa160bf..8f593378d3 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_avx2.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_avx2.c
@@ -745,13 +745,13 @@ i40e_xmit_fixed_burst_vec_avx2(void *tx_queue, struct rte_mbuf **tx_pkts,
 {
 	struct ieth_tx_queue *txq = (struct ieth_tx_queue *)tx_queue;
 	volatile struct i40e_tx_desc *txdp;
-	struct ieth_tx_entry *txep;
+	struct ieth_vec_tx_entry *txep;
 	uint16_t n, nb_commit, tx_id;
 	uint64_t flags = I40E_TD_CMD;
 	uint64_t rs = I40E_TX_DESC_CMD_RS | I40E_TD_CMD;
 
 	if (txq->nb_tx_free < txq->tx_free_thresh)
-		i40e_tx_free_bufs(txq);
+		ieth_tx_free_bufs_vector(txq, i40e_tx_desc_done, false);
 
 	nb_commit = nb_pkts = (uint16_t)RTE_MIN(txq->nb_tx_free, nb_pkts);
 	if (unlikely(nb_pkts == 0))
@@ -759,13 +759,13 @@ i40e_xmit_fixed_burst_vec_avx2(void *tx_queue, struct rte_mbuf **tx_pkts,
 
 	tx_id = txq->tx_tail;
 	txdp = &txq->i40e_tx_ring[tx_id];
-	txep = &txq->sw_ring[tx_id];
+	txep = &txq->sw_ring_v[tx_id];
 
 	txq->nb_tx_free = (uint16_t)(txq->nb_tx_free - nb_pkts);
 
 	n = (uint16_t)(txq->nb_tx_desc - tx_id);
 	if (nb_commit >= n) {
-		ieth_tx_backlog_entry(txep, tx_pkts, n);
+		ieth_tx_backlog_entry_vec(txep, tx_pkts, n);
 
 		vtx(txdp, tx_pkts, n - 1, flags);
 		tx_pkts += (n - 1);
@@ -780,10 +780,10 @@ i40e_xmit_fixed_burst_vec_avx2(void *tx_queue, struct rte_mbuf **tx_pkts,
 
 		/* avoid reach the end of ring */
 		txdp = &txq->i40e_tx_ring[tx_id];
-		txep = &txq->sw_ring[tx_id];
+		txep = &txq->sw_ring_v[tx_id];
 	}
 
-	ieth_tx_backlog_entry(txep, tx_pkts, nb_commit);
+	ieth_tx_backlog_entry_vec(txep, tx_pkts, nb_commit);
 
 	vtx(txdp, tx_pkts, nb_commit, flags);
 
diff --git a/drivers/net/i40e/i40e_rxtx_vec_avx512.c b/drivers/net/i40e/i40e_rxtx_vec_avx512.c
index 0ab3a4f02c..e0f1b2bc10 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_avx512.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_avx512.c
@@ -807,16 +807,6 @@ vtx(volatile struct i40e_tx_desc *txdp,
 	}
 }
 
-static __rte_always_inline void
-tx_backlog_entry_avx512(struct ieth_vec_tx_entry *txep,
-			struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
-{
-	int i;
-
-	for (i = 0; i < (int)nb_pkts; ++i)
-		txep[i].mbuf = tx_pkts[i];
-}
-
 static inline uint16_t
 i40e_xmit_fixed_burst_vec_avx512(void *tx_queue, struct rte_mbuf **tx_pkts,
 				 uint16_t nb_pkts)
@@ -844,7 +834,7 @@ i40e_xmit_fixed_burst_vec_avx512(void *tx_queue, struct rte_mbuf **tx_pkts,
 
 	n = (uint16_t)(txq->nb_tx_desc - tx_id);
 	if (nb_commit >= n) {
-		tx_backlog_entry_avx512(txep, tx_pkts, n);
+		ieth_tx_backlog_entry_vec(txep, tx_pkts, n);
 
 		vtx(txdp, tx_pkts, n - 1, flags);
 		tx_pkts += (n - 1);
@@ -862,7 +852,7 @@ i40e_xmit_fixed_burst_vec_avx512(void *tx_queue, struct rte_mbuf **tx_pkts,
 		txep = (void *)txq->sw_ring;
 	}
 
-	tx_backlog_entry_avx512(txep, tx_pkts, nb_commit);
+	ieth_tx_backlog_entry_vec(txep, tx_pkts, nb_commit);
 
 	vtx(txdp, tx_pkts, nb_commit, flags);
 
diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h b/drivers/net/i40e/i40e_rxtx_vec_common.h
index 60f2130f4d..72b4a44faf 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_common.h
+++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
@@ -24,12 +24,6 @@ i40e_tx_desc_done(struct ieth_tx_queue *txq, uint16_t idx)
 			rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE);
 }
 
-static __rte_always_inline int
-i40e_tx_free_bufs(struct ieth_tx_queue *txq)
-{
-	return ieth_tx_free_bufs(txq, i40e_tx_desc_done);
-}
-
 static inline void
 _i40e_rx_queue_release_mbufs_vec(struct i40e_rx_queue *rxq)
 {
diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
index b30da1a78c..502dcc9407 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
@@ -681,14 +681,14 @@ i40e_xmit_fixed_burst_vec(void *__rte_restrict tx_queue,
 {
 	struct ieth_tx_queue *txq = (struct ieth_tx_queue *)tx_queue;
 	volatile struct i40e_tx_desc *txdp;
-	struct ieth_tx_entry *txep;
+	struct ieth_vec_tx_entry *txep;
 	uint16_t n, nb_commit, tx_id;
 	uint64_t flags = I40E_TD_CMD;
 	uint64_t rs = I40E_TX_DESC_CMD_RS | I40E_TD_CMD;
 	int i;
 
 	if (txq->nb_tx_free < txq->tx_free_thresh)
-		i40e_tx_free_bufs(txq);
+		ieth_tx_free_bufs_vector(txq, i40e_tx_desc_done, false);
 
 	nb_commit = nb_pkts = (uint16_t)RTE_MIN(txq->nb_tx_free, nb_pkts);
 	if (unlikely(nb_pkts == 0))
@@ -696,13 +696,13 @@ i40e_xmit_fixed_burst_vec(void *__rte_restrict tx_queue,
 
 	tx_id = txq->tx_tail;
 	txdp = &txq->i40e_tx_ring[tx_id];
-	txep = &txq->sw_ring[tx_id];
+	txep = &txq->sw_ring_v[tx_id];
 
 	txq->nb_tx_free = (uint16_t)(txq->nb_tx_free - nb_pkts);
 
 	n = (uint16_t)(txq->nb_tx_desc - tx_id);
 	if (nb_commit >= n) {
-		ieth_tx_backlog_entry(txep, tx_pkts, n);
+		ieth_tx_backlog_entry_vec(txep, tx_pkts, n);
 
 		for (i = 0; i < n - 1; ++i, ++tx_pkts, ++txdp)
 			vtx1(txdp, *tx_pkts, flags);
@@ -716,10 +716,10 @@ i40e_xmit_fixed_burst_vec(void *__rte_restrict tx_queue,
 
 		/* avoid reach the end of ring */
 		txdp = &txq->i40e_tx_ring[tx_id];
-		txep = &txq->sw_ring[tx_id];
+		txep = &txq->sw_ring_v[tx_id];
 	}
 
-	ieth_tx_backlog_entry(txep, tx_pkts, nb_commit);
+	ieth_tx_backlog_entry_vec(txep, tx_pkts, nb_commit);
 
 	vtx(txdp, tx_pkts, nb_commit, flags);
 
diff --git a/drivers/net/i40e/i40e_rxtx_vec_sse.c b/drivers/net/i40e/i40e_rxtx_vec_sse.c
index 5107cb9f01..958380815a 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_sse.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_sse.c
@@ -700,14 +700,14 @@ i40e_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
 {
 	struct ieth_tx_queue *txq = (struct ieth_tx_queue *)tx_queue;
 	volatile struct i40e_tx_desc *txdp;
-	struct ieth_tx_entry *txep;
+	struct ieth_vec_tx_entry *txep;
 	uint16_t n, nb_commit, tx_id;
 	uint64_t flags = I40E_TD_CMD;
 	uint64_t rs = I40E_TX_DESC_CMD_RS | I40E_TD_CMD;
 	int i;
 
 	if (txq->nb_tx_free < txq->tx_free_thresh)
-		i40e_tx_free_bufs(txq);
+		ieth_tx_free_bufs_vector(txq, i40e_tx_desc_done, false);
 
 	nb_commit = nb_pkts = (uint16_t)RTE_MIN(txq->nb_tx_free, nb_pkts);
 	if (unlikely(nb_pkts == 0))
@@ -715,13 +715,13 @@ i40e_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
 
 	tx_id = txq->tx_tail;
 	txdp = &txq->i40e_tx_ring[tx_id];
-	txep = &txq->sw_ring[tx_id];
+	txep = &txq->sw_ring_v[tx_id];
 
 	txq->nb_tx_free = (uint16_t)(txq->nb_tx_free - nb_pkts);
 
 	n = (uint16_t)(txq->nb_tx_desc - tx_id);
 	if (nb_commit >= n) {
-		ieth_tx_backlog_entry(txep, tx_pkts, n);
+		ieth_tx_backlog_entry_vec(txep, tx_pkts, n);
 
 		for (i = 0; i < n - 1; ++i, ++tx_pkts, ++txdp)
 			vtx1(txdp, *tx_pkts, flags);
@@ -735,10 +735,10 @@ i40e_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
 
 		/* avoid reach the end of ring */
 		txdp = &txq->i40e_tx_ring[tx_id];
-		txep = &txq->sw_ring[tx_id];
+		txep = &txq->sw_ring_v[tx_id];
 	}
 
-	ieth_tx_backlog_entry(txep, tx_pkts, nb_commit);
+	ieth_tx_backlog_entry_vec(txep, tx_pkts, nb_commit);
 
 	vtx(txdp, tx_pkts, nb_commit, flags);
 
-- 
2.43.0