DPDK patches and discussions
* [PATCH 0/2] performance optimized for hns3 PMD
@ 2021-11-11 13:38 Min Hu (Connor)
  2021-11-11 13:38 ` [PATCH 1/2] net/hns3: optimized Tx performance by mbuf fast free Min Hu (Connor)
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Min Hu (Connor) @ 2021-11-11 13:38 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, thomas

This patch set contains two ways to optimize Tx performance.

Chengwen Feng (2):
  net/hns3: optimized Tx performance by mbuf fast free
  net/hns3: optimized Tx performance

 drivers/net/hns3/hns3_rxtx.c     | 129 ++++++++++++++++---------------
 drivers/net/hns3/hns3_rxtx.h     |   2 +
 drivers/net/hns3/hns3_rxtx_vec.h |   9 +++
 3 files changed, 76 insertions(+), 64 deletions(-)

-- 
2.33.0



* [PATCH 1/2] net/hns3: optimized Tx performance by mbuf fast free
  2021-11-11 13:38 [PATCH 0/2] performance optimized for hns3 PMD Min Hu (Connor)
@ 2021-11-11 13:38 ` Min Hu (Connor)
  2021-11-15 17:30   ` Ferruh Yigit
  2021-11-11 13:38 ` [PATCH 2/2] net/hns3: optimized Tx performance Min Hu (Connor)
  2021-11-16  1:22 ` [PATCH v2 0/2] performance optimized for hns3 PMD Min Hu (Connor)
  2 siblings, 1 reply; 14+ messages in thread
From: Min Hu (Connor) @ 2021-11-11 13:38 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, thomas

From: Chengwen Feng <fengchengwen@huawei.com>

Currently the vector and simple xmit algorithms do not support multi-segment
mbufs, so when the MBUF_FAST_FREE Tx offload is enabled the driver can invoke
rte_mempool_put_bulk() to free the Tx mbufs in this situation.

In the testpmd single-core MAC forwarding scenario, performance is improved
by 8% at 64B on the Kunpeng920 platform.
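
For context, the fast-free path is only valid because of the contract of the
MBUF_FAST_FREE Tx offload: the application guarantees that all mbufs of a Tx
queue come from one mempool and have a reference count of 1. The two
completion strategies can be sketched roughly as below; this is an
illustrative sketch only, the function and variable names are invented for
this example and are not taken from the hns3 driver.

#include <stdbool.h>
#include <stdint.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

static void
example_tx_free(struct rte_mbuf **done, uint16_t n, bool fast_free)
{
        uint16_t i;

        if (n == 0)
                return;

        if (fast_free) {
                /*
                 * Fast-free contract: every mbuf of this queue comes from the
                 * same mempool and has refcnt == 1, so the whole batch can be
                 * returned with a single bulk operation.
                 */
                rte_mempool_put_bulk(done[0]->pool, (void **)done, n);
                return;
        }

        /*
         * Generic path: an mbuf may be indirect or shared, so each one has
         * to be checked and released individually.
         */
        for (i = 0; i < n; i++) {
                struct rte_mbuf *m = rte_pktmbuf_prefree_seg(done[i]);

                if (m != NULL)
                        rte_mempool_put(m->pool, m);
        }
}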

Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c     | 11 +++++++++++
 drivers/net/hns3/hns3_rxtx.h     |  2 ++
 drivers/net/hns3/hns3_rxtx_vec.h |  9 +++++++++
 3 files changed, 22 insertions(+)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index d26e262335..78227a139f 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -3059,6 +3059,8 @@ hns3_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	txq->min_tx_pkt_len = hw->min_tx_pkt_len;
 	txq->tso_mode = hw->tso_mode;
 	txq->udp_cksum_mode = hw->udp_cksum_mode;
+	txq->mbuf_fast_free_en = !!(dev->data->dev_conf.txmode.offloads &
+				    DEV_TX_OFFLOAD_MBUF_FAST_FREE);
 	memset(&txq->basic_stats, 0, sizeof(struct hns3_tx_basic_stats));
 	memset(&txq->dfx_stats, 0, sizeof(struct hns3_tx_dfx_stats));
 
@@ -3991,6 +3993,14 @@ hns3_tx_free_buffer_simple(struct hns3_tx_queue *txq)
 
 		tx_entry = &txq->sw_ring[txq->next_to_clean];
 
+		if (txq->mbuf_fast_free_en) {
+			rte_mempool_put_bulk(tx_entry->mbuf->pool,
+					(void **)tx_entry, txq->tx_rs_thresh);
+			for (i = 0; i < txq->tx_rs_thresh; i++)
+				tx_entry[i].mbuf = NULL;
+			goto update_field;
+		}
+
 		for (i = 0; i < txq->tx_rs_thresh; i++)
 			rte_prefetch0((tx_entry + i)->mbuf);
 		for (i = 0; i < txq->tx_rs_thresh; i++, tx_entry++) {
@@ -3998,6 +4008,7 @@ hns3_tx_free_buffer_simple(struct hns3_tx_queue *txq)
 			tx_entry->mbuf = NULL;
 		}
 
+update_field:
 		txq->next_to_clean = (tx_next_clean + 1) % txq->nb_tx_desc;
 		txq->tx_bd_ready += txq->tx_rs_thresh;
 	}
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index 63bafc68b6..df731856ef 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -495,6 +495,8 @@ struct hns3_tx_queue {
 	 * this point.
 	 */
 	uint16_t pvid_sw_shift_en:1;
+	/* check whether the mbuf fast free offload is enabled */
+	uint16_t mbuf_fast_free_en:1;
 
 	/*
 	 * For better performance in tx datapath, releasing mbuf in batches is
diff --git a/drivers/net/hns3/hns3_rxtx_vec.h b/drivers/net/hns3/hns3_rxtx_vec.h
index 67c75e44ef..4985a7cae8 100644
--- a/drivers/net/hns3/hns3_rxtx_vec.h
+++ b/drivers/net/hns3/hns3_rxtx_vec.h
@@ -18,6 +18,14 @@ hns3_tx_bulk_free_buffers(struct hns3_tx_queue *txq)
 	int i;
 
 	tx_entry = &txq->sw_ring[txq->next_to_clean];
+	if (txq->mbuf_fast_free_en) {
+		rte_mempool_put_bulk(tx_entry->mbuf->pool, (void **)tx_entry,
+				     txq->tx_rs_thresh);
+		for (i = 0; i < txq->tx_rs_thresh; i++)
+			tx_entry[i].mbuf = NULL;
+		goto update_field;
+	}
+
 	for (i = 0; i < txq->tx_rs_thresh; i++, tx_entry++) {
 		m = rte_pktmbuf_prefree_seg(tx_entry->mbuf);
 		tx_entry->mbuf = NULL;
@@ -36,6 +44,7 @@ hns3_tx_bulk_free_buffers(struct hns3_tx_queue *txq)
 	if (nb_free)
 		rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free);
 
+update_field:
 	/* Update numbers of available descriptor due to buffer freed */
 	txq->tx_bd_ready += txq->tx_rs_thresh;
 	txq->next_to_clean += txq->tx_rs_thresh;
-- 
2.33.0



* [PATCH 2/2] net/hns3: optimized Tx performance
  2021-11-11 13:38 [PATCH 0/2] performance optimized for hns3 PMD Min Hu (Connor)
  2021-11-11 13:38 ` [PATCH 1/2] net/hns3: optimized Tx performance by mbuf fast free Min Hu (Connor)
@ 2021-11-11 13:38 ` Min Hu (Connor)
  2021-11-15 17:32   ` Ferruh Yigit
  2021-11-16  1:22 ` [PATCH v2 0/2] performance optimized for hns3 PMD Min Hu (Connor)
  2 siblings, 1 reply; 14+ messages in thread
From: Min Hu (Connor) @ 2021-11-11 13:38 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, thomas

From: Chengwen Feng <fengchengwen@huawei.com>

The PMD should check whether a descriptor has been completed by hardware
before freeing the corresponding mbuf. Currently the common xmit algorithm
tries to free mbufs every time it is invoked; because hardware may not have
finished sending yet, this results in many wasted checks of descriptors that
are not yet done.

This patch uses tx_free_thresh to control when mbufs are freed, and frees
tx_rs_thresh mbufs at a time.

This patch also modifies the implementation of the PMD's tx_done_cleanup
because the mbuf free algorithm changed.

In the testpmd single-core MAC forwarding scenario, performance is improved
by 10% at 64B on the Kunpeng920 platform.
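
For reference, tx_free_thresh and tx_rs_thresh are the standard ethdev Tx
queue parameters, configurable per queue through rte_eth_tx_queue_setup().
The snippet below is a hypothetical application-side example (the helper
name and threshold values are made up, and the exact point at which the
driver triggers cleanup remains driver specific):

#include <rte_ethdev.h>

static int
setup_tx_queue_with_thresholds(uint16_t port_id, uint16_t queue_id,
                               uint16_t nb_desc, unsigned int socket_id)
{
        struct rte_eth_dev_info dev_info;
        struct rte_eth_txconf txconf;
        int ret;

        ret = rte_eth_dev_info_get(port_id, &dev_info);
        if (ret != 0)
                return ret;

        /* Start from the driver defaults, then override the thresholds. */
        txconf = dev_info.default_txconf;
        txconf.tx_rs_thresh = 32;   /* completed mbufs are freed 32 at a time */
        txconf.tx_free_thresh = 32; /* cleanup is attempted once free BDs run low */

        return rte_eth_tx_queue_setup(port_id, queue_id, nb_desc,
                                      socket_id, &txconf);
}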

Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 118 ++++++++++++++++-------------------
 1 file changed, 54 insertions(+), 64 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 78227a139f..114b2f397f 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -3077,40 +3077,51 @@ hns3_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	return 0;
 }
 
-static void
+static int
 hns3_tx_free_useless_buffer(struct hns3_tx_queue *txq)
 {
 	uint16_t tx_next_clean = txq->next_to_clean;
-	uint16_t tx_next_use   = txq->next_to_use;
-	uint16_t tx_bd_ready   = txq->tx_bd_ready;
-	uint16_t tx_bd_max     = txq->nb_tx_desc;
-	struct hns3_entry *tx_bak_pkt = &txq->sw_ring[tx_next_clean];
+	uint16_t tx_next_use = txq->next_to_use;
+	struct hns3_entry *tx_entry = &txq->sw_ring[tx_next_clean];
 	struct hns3_desc *desc = &txq->tx_ring[tx_next_clean];
-	struct rte_mbuf *mbuf;
+	int i;
 
-	while ((!(desc->tx.tp_fe_sc_vld_ra_ri &
-		rte_cpu_to_le_16(BIT(HNS3_TXD_VLD_B)))) &&
-		tx_next_use != tx_next_clean) {
-		mbuf = tx_bak_pkt->mbuf;
-		if (mbuf) {
-			rte_pktmbuf_free_seg(mbuf);
-			tx_bak_pkt->mbuf = NULL;
-		}
+	if (tx_next_use >= tx_next_clean &&
+	    tx_next_use < tx_next_clean + txq->tx_rs_thresh)
+		return -1;
 
-		desc++;
-		tx_bak_pkt++;
-		tx_next_clean++;
-		tx_bd_ready++;
-
-		if (tx_next_clean >= tx_bd_max) {
-			tx_next_clean = 0;
-			desc = txq->tx_ring;
-			tx_bak_pkt = txq->sw_ring;
-		}
+	/*
+	 * All mbufs can be released only when the VLD bits of all
+	 * descriptors in a batch are cleared.
+	 */
+	for (i = 0; i < txq->tx_rs_thresh; i++) {
+		if (desc[i].tx.tp_fe_sc_vld_ra_ri &
+			rte_le_to_cpu_16(BIT(HNS3_TXD_VLD_B)))
+			return -1;
 	}
 
-	txq->next_to_clean = tx_next_clean;
-	txq->tx_bd_ready   = tx_bd_ready;
+	for (i = 0; i < txq->tx_rs_thresh; i++) {
+		rte_pktmbuf_free_seg(tx_entry[i].mbuf);
+		tx_entry[i].mbuf = NULL;
+	}
+
+	/* Update numbers of available descriptor due to buffer freed */
+	txq->tx_bd_ready += txq->tx_rs_thresh;
+	txq->next_to_clean += txq->tx_rs_thresh;
+	if (txq->next_to_clean >= txq->nb_tx_desc)
+		txq->next_to_clean = 0;
+
+	return 0;
+}
+
+static inline int
+hns3_tx_free_required_buffer(struct hns3_tx_queue *txq, uint16_t required_bds)
+{
+	while (required_bds > txq->tx_bd_ready) {
+		if (hns3_tx_free_useless_buffer(txq) != 0)
+			return -1;
+	}
+	return 0;
 }
 
 int
@@ -4147,8 +4158,8 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	uint16_t nb_tx;
 	uint16_t i;
 
-	/* free useless buffer */
-	hns3_tx_free_useless_buffer(txq);
+	if (txq->tx_bd_ready < txq->tx_free_thresh)
+		(void)hns3_tx_free_useless_buffer(txq);
 
 	tx_next_use   = txq->next_to_use;
 	tx_bd_max     = txq->nb_tx_desc;
@@ -4163,11 +4174,14 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		nb_buf = tx_pkt->nb_segs;
 
 		if (nb_buf > txq->tx_bd_ready) {
-			txq->dfx_stats.queue_full_cnt++;
-			if (nb_tx == 0)
-				return 0;
-
-			goto end_of_tx;
+			/* Try to release the required MBUF, but avoid releasing
+			 * all MBUFs, otherwise, the MBUFs will be released for
+			 * a long time and may cause jitter.
+			 */
+			if (hns3_tx_free_required_buffer(txq, nb_buf) != 0) {
+				txq->dfx_stats.queue_full_cnt++;
+				goto end_of_tx;
+			}
 		}
 
 		/*
@@ -4577,46 +4591,22 @@ hns3_dev_tx_queue_stop(struct rte_eth_dev *dev, uint16_t tx_queue_id)
 static int
 hns3_tx_done_cleanup_full(struct hns3_tx_queue *txq, uint32_t free_cnt)
 {
-	uint16_t next_to_clean = txq->next_to_clean;
-	uint16_t next_to_use   = txq->next_to_use;
-	uint16_t tx_bd_ready   = txq->tx_bd_ready;
-	struct hns3_entry *tx_pkt = &txq->sw_ring[next_to_clean];
-	struct hns3_desc *desc = &txq->tx_ring[next_to_clean];
+	uint16_t round_free_cnt;
 	uint32_t idx;
 
 	if (free_cnt == 0 || free_cnt > txq->nb_tx_desc)
 		free_cnt = txq->nb_tx_desc;
 
-	for (idx = 0; idx < free_cnt; idx++) {
-		if (next_to_clean == next_to_use)
-			break;
+	if (txq->tx_rs_thresh == 0)
+		return 0;
 
-		if (desc->tx.tp_fe_sc_vld_ra_ri &
-		    rte_cpu_to_le_16(BIT(HNS3_TXD_VLD_B)))
+	round_free_cnt = roundup(free_cnt, txq->tx_rs_thresh);
+	for (idx = 0; idx < round_free_cnt; idx += txq->tx_rs_thresh) {
+		if (hns3_tx_free_useless_buffer(txq) != 0)
 			break;
-
-		if (tx_pkt->mbuf != NULL) {
-			rte_pktmbuf_free_seg(tx_pkt->mbuf);
-			tx_pkt->mbuf = NULL;
-		}
-
-		next_to_clean++;
-		tx_bd_ready++;
-		tx_pkt++;
-		desc++;
-		if (next_to_clean == txq->nb_tx_desc) {
-			tx_pkt = txq->sw_ring;
-			desc = txq->tx_ring;
-			next_to_clean = 0;
-		}
-	}
-
-	if (idx > 0) {
-		txq->next_to_clean = next_to_clean;
-		txq->tx_bd_ready = tx_bd_ready;
 	}
 
-	return (int)idx;
+	return RTE_MIN(idx, free_cnt);
 }
 
 int
-- 
2.33.0



* Re: [PATCH 1/2] net/hns3: optimized Tx performance by mbuf fast free
  2021-11-11 13:38 ` [PATCH 1/2] net/hns3: optimized Tx performance by mbuf fast free Min Hu (Connor)
@ 2021-11-15 17:30   ` Ferruh Yigit
  2021-11-16  1:24     ` Min Hu (Connor)
  0 siblings, 1 reply; 14+ messages in thread
From: Ferruh Yigit @ 2021-11-15 17:30 UTC (permalink / raw)
  To: Min Hu (Connor), dev; +Cc: thomas

On 11/11/2021 1:38 PM, Min Hu (Connor) wrote:
> From: Chengwen Feng <fengchengwen@huawei.com>
> 
> Currently the vector and simple xmit algorithm don't support multi_segs,
> so if Tx offload support MBUF_FAST_FREE, driver could invoke
> rte_mempool_put_bulk() to free Tx mbufs in this situation.
> 
> In the testpmd single core MAC forwarding scenario, the performance is
> improved by 8% at 64B on Kunpeng920 platform.
> 

'RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE' seems already announced in 'tx_offload_capa',
  was it wrong?

> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
> ---
>   drivers/net/hns3/hns3_rxtx.c     | 11 +++++++++++
>   drivers/net/hns3/hns3_rxtx.h     |  2 ++
>   drivers/net/hns3/hns3_rxtx_vec.h |  9 +++++++++
>   3 files changed, 22 insertions(+)
> 

Can you please update 'doc/guides/nics/features/hns3.ini' to announce
"Free Tx mbuf on demand" feature.

> diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
> index d26e262335..78227a139f 100644
> --- a/drivers/net/hns3/hns3_rxtx.c
> +++ b/drivers/net/hns3/hns3_rxtx.c
> @@ -3059,6 +3059,8 @@ hns3_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
>   	txq->min_tx_pkt_len = hw->min_tx_pkt_len;
>   	txq->tso_mode = hw->tso_mode;
>   	txq->udp_cksum_mode = hw->udp_cksum_mode;
> +	txq->mbuf_fast_free_en = !!(dev->data->dev_conf.txmode.offloads &
> +				    DEV_TX_OFFLOAD_MBUF_FAST_FREE);

Can you please use updated macro name, 'RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE'?
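
For reference, the application side of this offload looks roughly as below.
This is a hedged sketch only (hypothetical helper, error handling omitted),
not code from the patch: the application checks tx_offload_capa and then
requests the offload in the port configuration.

#include <rte_ethdev.h>

static void
enable_fast_free_if_supported(uint16_t port_id, uint16_t nb_rxq,
                              uint16_t nb_txq)
{
        struct rte_eth_conf port_conf = { 0 }; /* minimal config for the example */
        struct rte_eth_dev_info dev_info;

        rte_eth_dev_info_get(port_id, &dev_info);

        /* Request the offload only if the PMD announces it in tx_offload_capa. */
        if (dev_info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE)
                port_conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE;

        rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &port_conf);
}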


* Re: [PATCH 2/2] net/hns3: optimized Tx performance
  2021-11-11 13:38 ` [PATCH 2/2] net/hns3: optimized Tx performance Min Hu (Connor)
@ 2021-11-15 17:32   ` Ferruh Yigit
  0 siblings, 0 replies; 14+ messages in thread
From: Ferruh Yigit @ 2021-11-15 17:32 UTC (permalink / raw)
  To: Min Hu (Connor), dev; +Cc: thomas

On 11/11/2021 1:38 PM, Min Hu (Connor) wrote:
> From: Chengwen Feng<fengchengwen@huawei.com>
> 
> The PMD should check whether the descriptor is done by hardware before
> free the corresponding mbuf. Currently the common xmit algorithm will
> free mbuf every time when it's invoked. Because hardware may not have
> finished sending, this may lead to many invalid queries which are
> whether the descriptors are done.
> 

Hi Connor, Chengwen,

Since there will be a new version, can you please reword above paragraph?

> This patch uses tx_free_thresh to control whether invoke free mbuf, and
> free tx_rs_thresh mbufs each time.
> 
> This patch also modifies the implementation of PMD's tx_done_cleanup
> because the mbuf free algorithm changed.
> 
> In the testpmd single core MAC forwarding scenario, the performance is
> improved by 10% at 64B on Kunpeng920 platform.
> 
> Cc:stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng<fengchengwen@huawei.com>
> Signed-off-by: Min Hu (Connor)<humin29@huawei.com>



* [PATCH v2 0/2] performance optimized for hns3 PMD
  2021-11-11 13:38 [PATCH 0/2] performance optimized for hns3 PMD Min Hu (Connor)
  2021-11-11 13:38 ` [PATCH 1/2] net/hns3: optimized Tx performance by mbuf fast free Min Hu (Connor)
  2021-11-11 13:38 ` [PATCH 2/2] net/hns3: optimized Tx performance Min Hu (Connor)
@ 2021-11-16  1:22 ` Min Hu (Connor)
  2021-11-16  1:22   ` [PATCH v2 1/2] net/hns3: optimized Tx performance by mbuf fast free Min Hu (Connor)
                     ` (3 more replies)
  2 siblings, 4 replies; 14+ messages in thread
From: Min Hu (Connor) @ 2021-11-16  1:22 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, thomas

This patch set contains two ways to optimize Tx performance.

Chengwen Feng (2):
  net/hns3: optimized Tx performance by mbuf fast free
  net/hns3: optimized Tx performance

 doc/guides/nics/features/hns3.ini |   1 +
 drivers/net/hns3/hns3_rxtx.c      | 129 +++++++++++++++---------------
 drivers/net/hns3/hns3_rxtx.h      |   2 +
 drivers/net/hns3/hns3_rxtx_vec.h  |   9 +++
 4 files changed, 77 insertions(+), 64 deletions(-)
---
v2:
* document hns3.ini and fix 'RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE'

-- 
2.33.0



* [PATCH v2 1/2] net/hns3: optimized Tx performance by mbuf fast free
  2021-11-16  1:22 ` [PATCH v2 0/2] performance optimized for hns3 PMD Min Hu (Connor)
@ 2021-11-16  1:22   ` Min Hu (Connor)
  2021-11-16  1:22   ` [PATCH v2 2/2] net/hns3: optimized Tx performance Min Hu (Connor)
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Min Hu (Connor) @ 2021-11-16  1:22 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, thomas

From: Chengwen Feng <fengchengwen@huawei.com>

Currently the vector and simple xmit algorithms do not support multi-segment
mbufs, so when the MBUF_FAST_FREE Tx offload is enabled the driver can invoke
rte_mempool_put_bulk() to free the Tx mbufs in this situation.

In the testpmd single-core MAC forwarding scenario, performance is improved
by 8% at 64B on the Kunpeng920 platform.

Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 doc/guides/nics/features/hns3.ini |  1 +
 drivers/net/hns3/hns3_rxtx.c      | 11 +++++++++++
 drivers/net/hns3/hns3_rxtx.h      |  2 ++
 drivers/net/hns3/hns3_rxtx_vec.h  |  9 +++++++++
 4 files changed, 23 insertions(+)

diff --git a/doc/guides/nics/features/hns3.ini b/doc/guides/nics/features/hns3.ini
index c3464c8396..405b94f05c 100644
--- a/doc/guides/nics/features/hns3.ini
+++ b/doc/guides/nics/features/hns3.ini
@@ -12,6 +12,7 @@ Queue start/stop     = Y
 Runtime Rx queue setup = Y
 Runtime Tx queue setup = Y
 Burst mode info      = Y
+Fast mbuf free       = Y
 Free Tx mbuf on demand = Y
 MTU update           = Y
 Scattered Rx         = Y
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index d26e262335..f0a57611ec 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -3059,6 +3059,8 @@ hns3_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	txq->min_tx_pkt_len = hw->min_tx_pkt_len;
 	txq->tso_mode = hw->tso_mode;
 	txq->udp_cksum_mode = hw->udp_cksum_mode;
+	txq->mbuf_fast_free_en = !!(dev->data->dev_conf.txmode.offloads &
+				    RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE);
 	memset(&txq->basic_stats, 0, sizeof(struct hns3_tx_basic_stats));
 	memset(&txq->dfx_stats, 0, sizeof(struct hns3_tx_dfx_stats));
 
@@ -3991,6 +3993,14 @@ hns3_tx_free_buffer_simple(struct hns3_tx_queue *txq)
 
 		tx_entry = &txq->sw_ring[txq->next_to_clean];
 
+		if (txq->mbuf_fast_free_en) {
+			rte_mempool_put_bulk(tx_entry->mbuf->pool,
+					(void **)tx_entry, txq->tx_rs_thresh);
+			for (i = 0; i < txq->tx_rs_thresh; i++)
+				tx_entry[i].mbuf = NULL;
+			goto update_field;
+		}
+
 		for (i = 0; i < txq->tx_rs_thresh; i++)
 			rte_prefetch0((tx_entry + i)->mbuf);
 		for (i = 0; i < txq->tx_rs_thresh; i++, tx_entry++) {
@@ -3998,6 +4008,7 @@ hns3_tx_free_buffer_simple(struct hns3_tx_queue *txq)
 			tx_entry->mbuf = NULL;
 		}
 
+update_field:
 		txq->next_to_clean = (tx_next_clean + 1) % txq->nb_tx_desc;
 		txq->tx_bd_ready += txq->tx_rs_thresh;
 	}
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index 63bafc68b6..df731856ef 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -495,6 +495,8 @@ struct hns3_tx_queue {
 	 * this point.
 	 */
 	uint16_t pvid_sw_shift_en:1;
+	/* check whether the mbuf fast free offload is enabled */
+	uint16_t mbuf_fast_free_en:1;
 
 	/*
 	 * For better performance in tx datapath, releasing mbuf in batches is
diff --git a/drivers/net/hns3/hns3_rxtx_vec.h b/drivers/net/hns3/hns3_rxtx_vec.h
index 67c75e44ef..4985a7cae8 100644
--- a/drivers/net/hns3/hns3_rxtx_vec.h
+++ b/drivers/net/hns3/hns3_rxtx_vec.h
@@ -18,6 +18,14 @@ hns3_tx_bulk_free_buffers(struct hns3_tx_queue *txq)
 	int i;
 
 	tx_entry = &txq->sw_ring[txq->next_to_clean];
+	if (txq->mbuf_fast_free_en) {
+		rte_mempool_put_bulk(tx_entry->mbuf->pool, (void **)tx_entry,
+				     txq->tx_rs_thresh);
+		for (i = 0; i < txq->tx_rs_thresh; i++)
+			tx_entry[i].mbuf = NULL;
+		goto update_field;
+	}
+
 	for (i = 0; i < txq->tx_rs_thresh; i++, tx_entry++) {
 		m = rte_pktmbuf_prefree_seg(tx_entry->mbuf);
 		tx_entry->mbuf = NULL;
@@ -36,6 +44,7 @@ hns3_tx_bulk_free_buffers(struct hns3_tx_queue *txq)
 	if (nb_free)
 		rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free);
 
+update_field:
 	/* Update numbers of available descriptor due to buffer freed */
 	txq->tx_bd_ready += txq->tx_rs_thresh;
 	txq->next_to_clean += txq->tx_rs_thresh;
-- 
2.33.0



* [PATCH v2 2/2] net/hns3: optimized Tx performance
  2021-11-16  1:22 ` [PATCH v2 0/2] performance optimized for hns3 PMD Min Hu (Connor)
  2021-11-16  1:22   ` [PATCH v2 1/2] net/hns3: optimized Tx performance by mbuf fast free Min Hu (Connor)
@ 2021-11-16  1:22   ` Min Hu (Connor)
  2021-11-16 14:36   ` [PATCH v2 0/2] performance optimized for hns3 PMD Ferruh Yigit
  2021-11-16 15:43   ` Ferruh Yigit
  3 siblings, 0 replies; 14+ messages in thread
From: Min Hu (Connor) @ 2021-11-16  1:22 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, thomas

From: Chengwen Feng <fengchengwen@huawei.com>

This patch uses tx_free_thresh to control mbuf freeing when the common
xmit algorithm is used.

This patch also modifies the implementation of the PMD's tx_done_cleanup
because the mbuf free algorithm changed.

In the testpmd single-core MAC forwarding scenario, performance is improved
by 10% at 64B on the Kunpeng920 platform.
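
The user-visible entry point affected by the tx_done_cleanup change is
rte_eth_tx_done_cleanup(); with this rework the hns3 PMD frees completed
mbufs in tx_rs_thresh-sized batches and caps the return value at the
requested count. A hypothetical application call (helper name and count
chosen only for the example):

#include <stdio.h>
#include <rte_ethdev.h>

static void
reclaim_tx_mbufs(uint16_t port_id, uint16_t queue_id)
{
        /* Ask the PMD to free up to 64 already-transmitted mbufs on this queue. */
        int nb_freed = rte_eth_tx_done_cleanup(port_id, queue_id, 64);

        if (nb_freed < 0)
                printf("tx_done_cleanup failed: %d\n", nb_freed);
}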

Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 118 ++++++++++++++++-------------------
 1 file changed, 54 insertions(+), 64 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index f0a57611ec..40cc4e9c1a 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -3077,40 +3077,51 @@ hns3_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	return 0;
 }
 
-static void
+static int
 hns3_tx_free_useless_buffer(struct hns3_tx_queue *txq)
 {
 	uint16_t tx_next_clean = txq->next_to_clean;
-	uint16_t tx_next_use   = txq->next_to_use;
-	uint16_t tx_bd_ready   = txq->tx_bd_ready;
-	uint16_t tx_bd_max     = txq->nb_tx_desc;
-	struct hns3_entry *tx_bak_pkt = &txq->sw_ring[tx_next_clean];
+	uint16_t tx_next_use = txq->next_to_use;
+	struct hns3_entry *tx_entry = &txq->sw_ring[tx_next_clean];
 	struct hns3_desc *desc = &txq->tx_ring[tx_next_clean];
-	struct rte_mbuf *mbuf;
+	int i;
 
-	while ((!(desc->tx.tp_fe_sc_vld_ra_ri &
-		rte_cpu_to_le_16(BIT(HNS3_TXD_VLD_B)))) &&
-		tx_next_use != tx_next_clean) {
-		mbuf = tx_bak_pkt->mbuf;
-		if (mbuf) {
-			rte_pktmbuf_free_seg(mbuf);
-			tx_bak_pkt->mbuf = NULL;
-		}
+	if (tx_next_use >= tx_next_clean &&
+	    tx_next_use < tx_next_clean + txq->tx_rs_thresh)
+		return -1;
 
-		desc++;
-		tx_bak_pkt++;
-		tx_next_clean++;
-		tx_bd_ready++;
-
-		if (tx_next_clean >= tx_bd_max) {
-			tx_next_clean = 0;
-			desc = txq->tx_ring;
-			tx_bak_pkt = txq->sw_ring;
-		}
+	/*
+	 * All mbufs can be released only when the VLD bits of all
+	 * descriptors in a batch are cleared.
+	 */
+	for (i = 0; i < txq->tx_rs_thresh; i++) {
+		if (desc[i].tx.tp_fe_sc_vld_ra_ri &
+			rte_le_to_cpu_16(BIT(HNS3_TXD_VLD_B)))
+			return -1;
 	}
 
-	txq->next_to_clean = tx_next_clean;
-	txq->tx_bd_ready   = tx_bd_ready;
+	for (i = 0; i < txq->tx_rs_thresh; i++) {
+		rte_pktmbuf_free_seg(tx_entry[i].mbuf);
+		tx_entry[i].mbuf = NULL;
+	}
+
+	/* Update numbers of available descriptor due to buffer freed */
+	txq->tx_bd_ready += txq->tx_rs_thresh;
+	txq->next_to_clean += txq->tx_rs_thresh;
+	if (txq->next_to_clean >= txq->nb_tx_desc)
+		txq->next_to_clean = 0;
+
+	return 0;
+}
+
+static inline int
+hns3_tx_free_required_buffer(struct hns3_tx_queue *txq, uint16_t required_bds)
+{
+	while (required_bds > txq->tx_bd_ready) {
+		if (hns3_tx_free_useless_buffer(txq) != 0)
+			return -1;
+	}
+	return 0;
 }
 
 int
@@ -4147,8 +4158,8 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	uint16_t nb_tx;
 	uint16_t i;
 
-	/* free useless buffer */
-	hns3_tx_free_useless_buffer(txq);
+	if (txq->tx_bd_ready < txq->tx_free_thresh)
+		(void)hns3_tx_free_useless_buffer(txq);
 
 	tx_next_use   = txq->next_to_use;
 	tx_bd_max     = txq->nb_tx_desc;
@@ -4163,11 +4174,14 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		nb_buf = tx_pkt->nb_segs;
 
 		if (nb_buf > txq->tx_bd_ready) {
-			txq->dfx_stats.queue_full_cnt++;
-			if (nb_tx == 0)
-				return 0;
-
-			goto end_of_tx;
+			/* Try to release the required MBUF, but avoid releasing
+			 * all MBUFs, otherwise, the MBUFs will be released for
+			 * a long time and may cause jitter.
+			 */
+			if (hns3_tx_free_required_buffer(txq, nb_buf) != 0) {
+				txq->dfx_stats.queue_full_cnt++;
+				goto end_of_tx;
+			}
 		}
 
 		/*
@@ -4577,46 +4591,22 @@ hns3_dev_tx_queue_stop(struct rte_eth_dev *dev, uint16_t tx_queue_id)
 static int
 hns3_tx_done_cleanup_full(struct hns3_tx_queue *txq, uint32_t free_cnt)
 {
-	uint16_t next_to_clean = txq->next_to_clean;
-	uint16_t next_to_use   = txq->next_to_use;
-	uint16_t tx_bd_ready   = txq->tx_bd_ready;
-	struct hns3_entry *tx_pkt = &txq->sw_ring[next_to_clean];
-	struct hns3_desc *desc = &txq->tx_ring[next_to_clean];
+	uint16_t round_free_cnt;
 	uint32_t idx;
 
 	if (free_cnt == 0 || free_cnt > txq->nb_tx_desc)
 		free_cnt = txq->nb_tx_desc;
 
-	for (idx = 0; idx < free_cnt; idx++) {
-		if (next_to_clean == next_to_use)
-			break;
+	if (txq->tx_rs_thresh == 0)
+		return 0;
 
-		if (desc->tx.tp_fe_sc_vld_ra_ri &
-		    rte_cpu_to_le_16(BIT(HNS3_TXD_VLD_B)))
+	round_free_cnt = roundup(free_cnt, txq->tx_rs_thresh);
+	for (idx = 0; idx < round_free_cnt; idx += txq->tx_rs_thresh) {
+		if (hns3_tx_free_useless_buffer(txq) != 0)
 			break;
-
-		if (tx_pkt->mbuf != NULL) {
-			rte_pktmbuf_free_seg(tx_pkt->mbuf);
-			tx_pkt->mbuf = NULL;
-		}
-
-		next_to_clean++;
-		tx_bd_ready++;
-		tx_pkt++;
-		desc++;
-		if (next_to_clean == txq->nb_tx_desc) {
-			tx_pkt = txq->sw_ring;
-			desc = txq->tx_ring;
-			next_to_clean = 0;
-		}
-	}
-
-	if (idx > 0) {
-		txq->next_to_clean = next_to_clean;
-		txq->tx_bd_ready = tx_bd_ready;
 	}
 
-	return (int)idx;
+	return RTE_MIN(idx, free_cnt);
 }
 
 int
-- 
2.33.0



* Re: [PATCH 1/2] net/hns3: optimized Tx performance by mbuf fast free
  2021-11-15 17:30   ` Ferruh Yigit
@ 2021-11-16  1:24     ` Min Hu (Connor)
  0 siblings, 0 replies; 14+ messages in thread
From: Min Hu (Connor) @ 2021-11-16  1:24 UTC (permalink / raw)
  To: Ferruh Yigit, dev; +Cc: thomas

Hi, Ferruh,
fixed in v2, thanks.

On 2021/11/16 1:30, Ferruh Yigit wrote:
> On 11/11/2021 1:38 PM, Min Hu (Connor) wrote:
>> From: Chengwen Feng <fengchengwen@huawei.com>
>>
>> Currently the vector and simple xmit algorithm don't support multi_segs,
>> so if Tx offload support MBUF_FAST_FREE, driver could invoke
>> rte_mempool_put_bulk() to free Tx mbufs in this situation.
>>
>> In the testpmd single core MAC forwarding scenario, the performance is
>> improved by 8% at 64B on Kunpeng920 platform.
>>
> 
> 'RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE' seems already announced in 
> 'tx_offload_capa',
>   was it wrong?
> 
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
>> ---
>>   drivers/net/hns3/hns3_rxtx.c     | 11 +++++++++++
>>   drivers/net/hns3/hns3_rxtx.h     |  2 ++
>>   drivers/net/hns3/hns3_rxtx_vec.h |  9 +++++++++
>>   3 files changed, 22 insertions(+)
>>
> 
> Can you please update 'doc/guides/nics/features/hns3.ini' to announce
> "Free Tx mbuf on demand" feature.
> 
>> diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
>> index d26e262335..78227a139f 100644
>> --- a/drivers/net/hns3/hns3_rxtx.c
>> +++ b/drivers/net/hns3/hns3_rxtx.c
>> @@ -3059,6 +3059,8 @@ hns3_tx_queue_setup(struct rte_eth_dev *dev, 
>> uint16_t idx, uint16_t nb_desc,
>>       txq->min_tx_pkt_len = hw->min_tx_pkt_len;
>>       txq->tso_mode = hw->tso_mode;
>>       txq->udp_cksum_mode = hw->udp_cksum_mode;
>> +    txq->mbuf_fast_free_en = !!(dev->data->dev_conf.txmode.offloads &
>> +                    DEV_TX_OFFLOAD_MBUF_FAST_FREE);
> 
> Can you please use updated macro name, 'RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE'?
> .


* Re: [PATCH v2 0/2] performance optimized for hns3 PMD
  2021-11-16  1:22 ` [PATCH v2 0/2] performance optimized for hns3 PMD Min Hu (Connor)
  2021-11-16  1:22   ` [PATCH v2 1/2] net/hns3: optimized Tx performance by mbuf fast free Min Hu (Connor)
  2021-11-16  1:22   ` [PATCH v2 2/2] net/hns3: optimized Tx performance Min Hu (Connor)
@ 2021-11-16 14:36   ` Ferruh Yigit
  2021-11-16 15:04     ` Fengchengwen
                       ` (2 more replies)
  2021-11-16 15:43   ` Ferruh Yigit
  3 siblings, 3 replies; 14+ messages in thread
From: Ferruh Yigit @ 2021-11-16 14:36 UTC (permalink / raw)
  To: Min Hu (Connor), dev; +Cc: thomas

On 11/16/2021 1:22 AM, Min Hu (Connor) wrote:
> This patch set contains two ways to optimized Tx performance.
> 
> Chengwen Feng (2):
>    net/hns3: optimized Tx performance by mbuf fast free
>    net/hns3: optimized Tx performance
> 
>   doc/guides/nics/features/hns3.ini |   1 +
>   drivers/net/hns3/hns3_rxtx.c      | 129 +++++++++++++++---------------
>   drivers/net/hns3/hns3_rxtx.h      |   2 +
>   drivers/net/hns3/hns3_rxtx_vec.h  |   9 +++
>   4 files changed, 77 insertions(+), 64 deletions(-)
> ---
> v2:
> * document hns3.ini and fix 'TE_ETH_TX_OFFLOAD_MBUF_FAST_FREE'
> 

Is Connor's sign off in the v1 dropped intentionally, or forgotten?


* RE: [PATCH v2 0/2] performance optimized for hns3 PMD
  2021-11-16 14:36   ` [PATCH v2 0/2] performance optimized for hns3 PMD Ferruh Yigit
@ 2021-11-16 15:04     ` Fengchengwen
  2021-11-16 15:12     ` humin (Q)
  2021-11-16 15:38     ` Ferruh Yigit
  2 siblings, 0 replies; 14+ messages in thread
From: Fengchengwen @ 2021-11-16 15:04 UTC (permalink / raw)
  To: Ferruh Yigit, humin (Q), dev; +Cc: thomas


It may have been forgotten: I sent v2 to him this morning and maybe forgot to add the sign-off. But the patch set is OK because we both checked it.


From: Ferruh Yigit <ferruh.yigit@intel.com>
To: humin (Q) <humin29@huawei.com>; dev <dev@dpdk.org>
Cc: thomas <thomas@monjalon.net>
Date: 2021-11-16 22:37:23
Subject: Re: [PATCH v2 0/2] performance optimized for hns3 PMD

On 11/16/2021 1:22 AM, Min Hu (Connor) wrote:
> This patch set contains two ways to optimized Tx performance.
>
> Chengwen Feng (2):
>    net/hns3: optimized Tx performance by mbuf fast free
>    net/hns3: optimized Tx performance
>
>   doc/guides/nics/features/hns3.ini |   1 +
>   drivers/net/hns3/hns3_rxtx.c      | 129 +++++++++++++++
>   drivers/net/hns3/hns3_rxtx.h      |   2 +
>   drivers/net/hns3/hns3_rxtx_vec.h |   9 +++
>   4 files changed, 77 insertions(+), 64 deletions(-)
> ---
> v2:
> * document hns3.ini and fix 'TE_ETH_TX_OFFLOAD_MBUF_FAST_FREE'
>

Is Connor's sign off in the v1 dropped intentionally, or forgotten?



* RE: [PATCH v2 0/2] performance optimized for hns3 PMD
  2021-11-16 14:36   ` [PATCH v2 0/2] performance optimized for hns3 PMD Ferruh Yigit
  2021-11-16 15:04     ` Fengchengwen
@ 2021-11-16 15:12     ` humin (Q)
  2021-11-16 15:38     ` Ferruh Yigit
  2 siblings, 0 replies; 14+ messages in thread
From: humin (Q) @ 2021-11-16 15:12 UTC (permalink / raw)
  To: Fengchengwen, Ferruh Yigit, dev; +Cc: thomas


Hi Ferruh,
I forgot it, please add it for me when you merge them. Thank you.




Hu Min
Mobile: +86-13528728164
Email: humin29@huawei.com
From: Fengchengwen <fengchengwen@huawei.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>; humin (Q) <humin29@huawei.com>; dev <dev@dpdk.org>
Cc: thomas <thomas@monjalon.net>
Date: 2021-11-16 23:04:08
Subject: RE: [PATCH v2 0/2] performance optimized for hns3 PMD

May be forgotten, I send v2 to him this morning, and maybe forgot adding the sign off. But the patch set is OK because we both check them.


From: Ferruh Yigit <ferruh.yigit@intel.com>
To: humin (Q) <humin29@huawei.com>; dev <dev@dpdk.org>
Cc: thomas <thomas@monjalon.net>
Date: 2021-11-16 22:37:23
Subject: Re: [PATCH v2 0/2] performance optimized for hns3 PMD

On 11/16/2021 1:22 AM, Min Hu (Connor) wrote:
> This patch set contains two ways to optimized Tx performance.
>
> Chengwen Feng (2):
>    net/hns3: optimized Tx performance by mbuf fast free
>    net/hns3: optimized Tx performance
>
>   doc/guides/nics/features/hns3.ini |   1 +
>   drivers/net/hns3/hns3_rxtx.c      | 129 +++++++++++++++
>   drivers/net/hns3/hns3_rxtx.h      |   2 +
>   drivers/net/hns3/hns3_rxtx_vec.h |   9 +++
>   4 files changed, 77 insertions(+), 64 deletions(-)
> ---
> v2:
> * document hns3.ini and fix 'TE_ETH_TX_OFFLOAD_MBUF_FAST_FREE'
>

Is Connor's sign off in the v1 dropped intentionally, or forgotten?



* Re: [PATCH v2 0/2] performance optimized for hns3 PMD
  2021-11-16 14:36   ` [PATCH v2 0/2] performance optimized for hns3 PMD Ferruh Yigit
  2021-11-16 15:04     ` Fengchengwen
  2021-11-16 15:12     ` humin (Q)
@ 2021-11-16 15:38     ` Ferruh Yigit
  2 siblings, 0 replies; 14+ messages in thread
From: Ferruh Yigit @ 2021-11-16 15:38 UTC (permalink / raw)
  To: humin (Q), Fengchengwen, dev; +Cc: thomas

On 11/16/2021 3:12 PM, humin (Q) wrote:
> Hi, ferruh,
> I forgot it,please added it for me when you merged them.thank you.
> 

ack, adding it back in next-net.

> 
> 
> 
> Hu Min
> Mobile: +86-13528728164
> Email: humin29@huawei.com
> From: Fengchengwen <fengchengwen@huawei.com>
> To: Ferruh Yigit <ferruh.yigit@intel.com>; humin (Q) <humin29@huawei.com>; dev <dev@dpdk.org>
> Cc: thomas <thomas@monjalon.net>
> Date: 2021-11-16 23:04:08
> Subject: RE: [PATCH v2 0/2] performance optimized for hns3 PMD
> 
> May be forgotten, I send v2 to him this morning, and maybe forgot adding the sign off. But the patch set is OK because we both check them.
> 
> 
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> To: humin (Q) <humin29@huawei.com>; dev <dev@dpdk.org>
> Cc: thomas <thomas@monjalon.net>
> Date: 2021-11-16 22:37:23
> Subject: Re: [PATCH v2 0/2] performance optimized for hns3 PMD
> 
> On 11/16/2021 1:22 AM, Min Hu (Connor) wrote:
>> This patch set contains two ways to optimized Tx performance.
>> 
>> Chengwen Feng (2):
>>    net/hns3: optimized Tx performance by mbuf fast free
>>    net/hns3: optimized Tx performance
>> 
>>   doc/guides/nics/features/hns3.ini |   1 +
>>   drivers/net/hns3/hns3_rxtx.c      | 129 +++++++++++++++
> 
>>   drivers/net/hns3/hns3_rxtx.h      |   2 +
>>   drivers/net/hns3/hns3_rxtx_vec.h |   9 +++
>>   4 files changed, 77 insertions(+), 64 deletions(-)
>> ---
>> v2:
>> * document hns3.ini and fix 'TE_ETH_TX_OFFLOAD_MBUF_FAST_FREE'
>> 
> 
> Is Connor's sign off in the v1 dropped intentionally, or forgotten?
> 



* Re: [PATCH v2 0/2] performance optimized for hns3 PMD
  2021-11-16  1:22 ` [PATCH v2 0/2] performance optimized for hns3 PMD Min Hu (Connor)
                     ` (2 preceding siblings ...)
  2021-11-16 14:36   ` [PATCH v2 0/2] performance optimized for hns3 PMD Ferruh Yigit
@ 2021-11-16 15:43   ` Ferruh Yigit
  3 siblings, 0 replies; 14+ messages in thread
From: Ferruh Yigit @ 2021-11-16 15:43 UTC (permalink / raw)
  To: Min Hu (Connor), Chengwen Feng; +Cc: thomas, dev

On 11/16/2021 1:22 AM, Min Hu (Connor) wrote:
> This patch set contains two ways to optimized Tx performance.
> 
> Chengwen Feng (2):
>    net/hns3: optimized Tx performance by mbuf fast free
>    net/hns3: optimized Tx performance
> 
>   doc/guides/nics/features/hns3.ini |   1 +
>   drivers/net/hns3/hns3_rxtx.c      | 129 +++++++++++++++---------------
>   drivers/net/hns3/hns3_rxtx.h      |   2 +
>   drivers/net/hns3/hns3_rxtx_vec.h  |   9 +++
>   4 files changed, 77 insertions(+), 64 deletions(-)
> ---
> v2:
> * document hns3.ini and fix 'TE_ETH_TX_OFFLOAD_MBUF_FAST_FREE'
> 

Series applied to dpdk-next-net/main, thanks.
