patches for DPDK stable branches
 help / color / mirror / Atom feed
* [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6
@ 2020-11-13 13:36 Lijun Ou
  2020-11-13 13:36 ` [dpdk-stable] [PATCH 19.11.6 01/13] net/hns3: support TSO Lijun Ou
                   ` (14 more replies)
  0 siblings, 15 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:36 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

Hi, Luca Boccassi
His series are backport for 19.11.6 about hns3 PMD driver
I also noticed that you gave Hu Wei a suggestion on the 19.11.4
backport. You did not recommend adding new features. I reselected
the TSO and some performance optimization patch request backports.
The reason I do this is that I think TSO is also a performance
optimization point, including other fixes. In addition, if TSO
is not integrated, the bug fixes of the cksum may not be integrated,
which will cause the stability of the cksum function.

Chengchang Tang (5):
  net/hns3: support promiscuous and allmulticast mode for VF
  net/hns3: decrease non-nearby memory access in Rx
  net/hns3: cleanup duplicated code on processing TSO in Tx
  net/hns3: fix Tx checksum outer header prepare
  net/hns3: fix Tx checksum with fixed header length

Hongbo Zheng (1):
  net/hns3: check TSO segment size during Tx

Lijun Ou (2):
  net/hns3: support TSO
  net/hns3: report Tx descriptor segment limitations

Wei Hu (Xavier) (5):
  net/hns3: fix reassembling multiple segment packets in Tx
  net/hns3: fix inserted VLAN tag position in Tx
  net/hns3: report Rx drop packets enable configuration
  net/hns3: report Rx free threshold
  net/hns3: reduce address calculation in Rx

 doc/guides/nics/features/hns3.ini    |   1 +
 doc/guides/nics/features/hns3_vf.ini |   3 +
 doc/guides/nics/hns3.rst             |   1 +
 drivers/net/hns3/hns3_ethdev.c       |  38 +-
 drivers/net/hns3/hns3_ethdev.h       |  37 +-
 drivers/net/hns3/hns3_ethdev_vf.c    | 134 ++++++-
 drivers/net/hns3/hns3_mbx.c          |  23 ++
 drivers/net/hns3/hns3_mbx.h          |   2 +
 drivers/net/hns3/hns3_rxtx.c         | 666 ++++++++++++++++++++++++-----------
 drivers/net/hns3/hns3_rxtx.h         |  31 +-
 10 files changed, 720 insertions(+), 216 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 01/13] net/hns3: support TSO
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
@ 2020-11-13 13:36 ` Lijun Ou
  2020-11-13 13:36 ` [dpdk-stable] [PATCH 19.11.6 02/13] net/hns3: check TSO segment size during Tx Lijun Ou
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:36 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

[ upstream commit 6dca716c9e1daa8ea770a4a198bd068e72a2e03c ]

This patch adds TCP segment offload support for hns3 PMD driver.

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 doc/guides/nics/features/hns3.ini    |   1 +
 doc/guides/nics/features/hns3_vf.ini |   1 +
 doc/guides/nics/hns3.rst             |   1 +
 drivers/net/hns3/hns3_ethdev.c       |   4 +
 drivers/net/hns3/hns3_ethdev.h       |   6 +-
 drivers/net/hns3/hns3_ethdev_vf.c    |   4 +
 drivers/net/hns3/hns3_rxtx.c         | 258 +++++++++++++++++++++++++++++++++--
 7 files changed, 259 insertions(+), 16 deletions(-)

diff --git a/doc/guides/nics/features/hns3.ini b/doc/guides/nics/features/hns3.ini
index cd5c08a..c3a8544 100644
--- a/doc/guides/nics/features/hns3.ini
+++ b/doc/guides/nics/features/hns3.ini
@@ -8,6 +8,7 @@ Link status          = Y
 Rx interrupt         = Y
 MTU update           = Y
 Jumbo frame          = Y
+TSO                  = Y
 Promiscuous mode     = Y
 Allmulticast mode    = Y
 Unicast MAC filter   = Y
diff --git a/doc/guides/nics/features/hns3_vf.ini b/doc/guides/nics/features/hns3_vf.ini
index fd00ac3..e4e7738 100644
--- a/doc/guides/nics/features/hns3_vf.ini
+++ b/doc/guides/nics/features/hns3_vf.ini
@@ -8,6 +8,7 @@ Link status          = Y
 Rx interrupt         = Y
 MTU update           = Y
 Jumbo frame          = Y
+TSO                  = Y
 Unicast MAC filter   = Y
 Multicast MAC filter = Y
 RSS hash             = Y
diff --git a/doc/guides/nics/hns3.rst b/doc/guides/nics/hns3.rst
index 8d19f48..05dbe41 100644
--- a/doc/guides/nics/hns3.rst
+++ b/doc/guides/nics/hns3.rst
@@ -17,6 +17,7 @@ Features of the HNS3 PMD are:
 - Receive Side Scaling (RSS)
 - Packet type information
 - Checksum offload
+- TSO offload
 - Promiscuous mode
 - Multicast mode
 - Port hardware statistics
diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index af43a44..23c16b8 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2468,6 +2468,10 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 				 DEV_TX_OFFLOAD_VLAN_INSERT |
 				 DEV_TX_OFFLOAD_QINQ_INSERT |
 				 DEV_TX_OFFLOAD_MULTI_SEGS |
+                                 DEV_TX_OFFLOAD_TCP_TSO |
+                                 DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+                                 DEV_TX_OFFLOAD_GRE_TNL_TSO |
+                                 DEV_TX_OFFLOAD_GENEVE_TNL_TSO |
 				 DEV_TX_OFFLOAD_MBUF_FAST_FREE);
 
 	info->rx_desc_lim = (struct rte_eth_desc_lim) {
diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
index 6e9173a..f63d01d 100644
--- a/drivers/net/hns3/hns3_ethdev.h
+++ b/drivers/net/hns3/hns3_ethdev.h
@@ -31,11 +31,15 @@
 #define HNS3_MC_MACADDR_NUM		128
 
 #define HNS3_MAX_BD_SIZE		65535
-#define HNS3_MAX_TX_BD_PER_PKT		8
+#define HNS3_MAX_NON_TSO_BD_PER_PKT	8
+#define HNS3_MAX_TSO_BD_PER_PKT		63
 #define HNS3_MAX_FRAME_LEN		9728
 #define HNS3_MIN_FRAME_LEN		64
 #define HNS3_VLAN_TAG_SIZE		4
 #define HNS3_DEFAULT_RX_BUF_LEN		2048
+#define HNS3_MAX_BD_PAYLEN		(1024 * 1024 - 1)
+#define HNS3_MAX_TSO_HDR_SIZE		512
+#define HNS3_MAX_TSO_HDR_BD_NUM		3
 
 #define HNS3_ETH_OVERHEAD \
 	(RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN + HNS3_VLAN_TAG_SIZE * 2)
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index 5d1da44..736d8a7 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -850,6 +850,10 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 				 DEV_TX_OFFLOAD_VLAN_INSERT |
 				 DEV_TX_OFFLOAD_QINQ_INSERT |
 				 DEV_TX_OFFLOAD_MULTI_SEGS |
+                                 DEV_TX_OFFLOAD_TCP_TSO |
+                                 DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+                                 DEV_TX_OFFLOAD_GRE_TNL_TSO |
+                                 DEV_TX_OFFLOAD_GENEVE_TNL_TSO |
 				 DEV_TX_OFFLOAD_MBUF_FAST_FREE);
 
 	info->rx_desc_lim = (struct rte_eth_desc_lim) {
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index c1ffa13..28a9334 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1858,6 +1858,78 @@ hns3_tx_free_useless_buffer(struct hns3_tx_queue *txq)
 	txq->tx_bd_ready   = tx_bd_ready;
 }
 
+static int
+hns3_tso_proc_tunnel(struct hns3_desc *desc, uint64_t ol_flags,
+		     struct rte_mbuf *rxm, uint8_t *l2_len)
+{
+	uint64_t tun_flags;
+	uint8_t ol4_len;
+	uint32_t otmp;
+
+	tun_flags = ol_flags & PKT_TX_TUNNEL_MASK;
+	if (tun_flags == 0)
+		return 0;
+
+	otmp = rte_le_to_cpu_32(desc->tx.ol_type_vlan_len_msec);
+	switch (tun_flags) {
+	case PKT_TX_TUNNEL_GENEVE:
+	case PKT_TX_TUNNEL_VXLAN:
+		*l2_len = rxm->l2_len - RTE_ETHER_VXLAN_HLEN;
+		break;
+	case PKT_TX_TUNNEL_GRE:
+		/*
+		 * OL4 header size, defined in 4 Bytes, it contains outer
+		 * L4(GRE) length and tunneling length.
+		 */
+		ol4_len = hns3_get_field(otmp, HNS3_TXD_L4LEN_M,
+					 HNS3_TXD_L4LEN_S);
+		*l2_len = rxm->l2_len - (ol4_len << HNS3_L4_LEN_UNIT);
+		break;
+	default:
+		/* For non UDP / GRE tunneling, drop the tunnel packet */
+		return -EINVAL;
+	}
+	hns3_set_field(otmp, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
+		       rxm->outer_l2_len >> HNS3_L2_LEN_UNIT);
+	desc->tx.ol_type_vlan_len_msec = rte_cpu_to_le_32(otmp);
+
+	return 0;
+}
+
+static void
+hns3_set_tso(struct hns3_desc *desc,
+	     uint64_t ol_flags, struct rte_mbuf *rxm)
+{
+	uint32_t paylen, hdr_len;
+	uint32_t tmp;
+	uint8_t l2_len = rxm->l2_len;
+
+	if (!(ol_flags & PKT_TX_TCP_SEG))
+		return;
+
+	if (hns3_tso_proc_tunnel(desc, ol_flags, rxm, &l2_len))
+		return;
+
+	hdr_len = rxm->l2_len + rxm->l3_len + rxm->l4_len;
+	hdr_len += (ol_flags & PKT_TX_TUNNEL_MASK) ?
+		    rxm->outer_l2_len + rxm->outer_l3_len : 0;
+	paylen = rxm->pkt_len - hdr_len;
+	if (paylen <= rxm->tso_segsz)
+		return;
+
+	tmp = rte_le_to_cpu_32(desc->tx.type_cs_vlan_tso_len);
+	hns3_set_bit(tmp, HNS3_TXD_TSO_B, 1);
+	hns3_set_bit(tmp, HNS3_TXD_L3CS_B, 1);
+	hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S, HNS3_L4T_TCP);
+	hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
+	hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
+		       sizeof(struct rte_tcp_hdr) >> HNS3_L4_LEN_UNIT);
+	hns3_set_field(tmp, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
+		       l2_len >> HNS3_L2_LEN_UNIT);
+	desc->tx.type_cs_vlan_tso_len = rte_cpu_to_le_32(tmp);
+	desc->tx.mss = rte_cpu_to_le_16(rxm->tso_segsz);
+}
+
 static void
 fill_desc(struct hns3_tx_queue *txq, uint16_t tx_desc_id, struct rte_mbuf *rxm,
 	  bool first, int offset)
@@ -1865,9 +1937,9 @@ fill_desc(struct hns3_tx_queue *txq, uint16_t tx_desc_id, struct rte_mbuf *rxm,
 	struct hns3_desc *tx_ring = txq->tx_ring;
 	struct hns3_desc *desc = &tx_ring[tx_desc_id];
 	uint8_t frag_end = rxm->next == NULL ? 1 : 0;
+	uint64_t ol_flags = rxm->ol_flags;
 	uint16_t size = rxm->data_len;
 	uint16_t rrcfv = 0;
-	uint64_t ol_flags = rxm->ol_flags;
 	uint32_t hdr_len;
 	uint32_t paylen;
 	uint32_t tmp;
@@ -1882,6 +1954,7 @@ fill_desc(struct hns3_tx_queue *txq, uint16_t tx_desc_id, struct rte_mbuf *rxm,
 			   rxm->outer_l2_len + rxm->outer_l3_len : 0;
 		paylen = rxm->pkt_len - hdr_len;
 		desc->tx.paylen = rte_cpu_to_le_32(paylen);
+		hns3_set_tso(desc, ol_flags, rxm);
 	}
 
 	hns3_set_bit(rrcfv, HNS3_TXD_FE_B, frag_end);
@@ -2195,6 +2268,136 @@ hns3_txd_enable_checksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
 	desc->tx.type_cs_vlan_tso_len |= rte_cpu_to_le_32(value);
 }
 
+static bool
+hns3_pkt_need_linearized(struct rte_mbuf *tx_pkts, uint32_t bd_num)
+{
+	struct rte_mbuf *m_first = tx_pkts;
+	struct rte_mbuf *m_last = tx_pkts;
+	uint32_t tot_len = 0;
+	uint32_t hdr_len;
+	uint32_t i;
+
+	/*
+	 * Hardware requires that the sum of the data length of every 8
+	 * consecutive buffers is greater than MSS in hns3 network engine.
+	 * We simplify it by ensuring pkt_headlen + the first 8 consecutive
+	 * frags greater than gso header len + mss, and the remaining 7
+	 * consecutive frags greater than MSS except the last 7 frags.
+	 */
+	if (bd_num <= HNS3_MAX_NON_TSO_BD_PER_PKT)
+		return false;
+
+	for (i = 0; m_last && i < HNS3_MAX_NON_TSO_BD_PER_PKT - 1;
+	     i++, m_last = m_last->next)
+		tot_len += m_last->data_len;
+
+	if (!m_last)
+		return true;
+
+	/* ensure the first 8 frags is greater than mss + header */
+	hdr_len = tx_pkts->l2_len + tx_pkts->l3_len + tx_pkts->l4_len;
+	hdr_len += (tx_pkts->ol_flags & PKT_TX_TUNNEL_MASK) ?
+		   tx_pkts->outer_l2_len + tx_pkts->outer_l3_len : 0;
+	if (tot_len + m_last->data_len < tx_pkts->tso_segsz + hdr_len)
+		return true;
+
+	/*
+	 * ensure the sum of the data length of every 7 consecutive buffer
+	 * is greater than mss except the last one.
+	 */
+	for (i = 0; m_last && i < bd_num - HNS3_MAX_NON_TSO_BD_PER_PKT; i++) {
+		tot_len -= m_first->data_len;
+		tot_len += m_last->data_len;
+
+		if (tot_len < tx_pkts->tso_segsz)
+			return true;
+
+		m_first = m_first->next;
+		m_last = m_last->next;
+	}
+
+	return false;
+}
+
+static void
+hns3_outer_header_cksum_prepare(struct rte_mbuf *m)
+{
+	uint64_t ol_flags = m->ol_flags;
+	struct rte_ipv4_hdr *ipv4_hdr;
+	struct rte_udp_hdr *udp_hdr;
+	uint32_t paylen, hdr_len;
+
+	if (!(ol_flags & (PKT_TX_OUTER_IPV4 | PKT_TX_OUTER_IPV6)))
+		return;
+
+	if (ol_flags & PKT_TX_IPV4) {
+		ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct rte_ipv4_hdr *,
+						   m->outer_l2_len);
+
+		if (ol_flags & PKT_TX_IP_CKSUM)
+			ipv4_hdr->hdr_checksum = 0;
+	}
+
+	if ((ol_flags & PKT_TX_L4_MASK) == PKT_TX_UDP_CKSUM &&
+	    ol_flags & PKT_TX_TCP_SEG) {
+		hdr_len = m->l2_len + m->l3_len + m->l4_len;
+		hdr_len += (ol_flags & PKT_TX_TUNNEL_MASK) ?
+				m->outer_l2_len + m->outer_l3_len : 0;
+		paylen = m->pkt_len - hdr_len;
+		if (paylen <= m->tso_segsz)
+			return;
+		udp_hdr = rte_pktmbuf_mtod_offset(m, struct rte_udp_hdr *,
+						  m->outer_l2_len +
+						  m->outer_l3_len);
+		udp_hdr->dgram_cksum = 0;
+	}
+}
+
+static inline bool
+hns3_pkt_is_tso(struct rte_mbuf *m)
+{
+	return (m->tso_segsz != 0 && m->ol_flags & PKT_TX_TCP_SEG);
+}
+
+static int
+hns3_check_tso_pkt_valid(struct rte_mbuf *m)
+{
+	uint32_t tmp_data_len_sum = 0;
+	uint16_t nb_buf = m->nb_segs;
+	uint32_t paylen, hdr_len;
+	struct rte_mbuf *m_seg;
+	int i;
+
+	if (nb_buf > HNS3_MAX_TSO_BD_PER_PKT)
+		return -EINVAL;
+
+	hdr_len = m->l2_len + m->l3_len + m->l4_len;
+	hdr_len += (m->ol_flags & PKT_TX_TUNNEL_MASK) ?
+			m->outer_l2_len + m->outer_l3_len : 0;
+	if (hdr_len > HNS3_MAX_TSO_HDR_SIZE)
+		return -EINVAL;
+
+	paylen = m->pkt_len - hdr_len;
+	if (paylen > HNS3_MAX_BD_PAYLEN)
+		return -EINVAL;
+
+	/*
+	 * The TSO header (include outer and inner L2, L3 and L4 header)
+	 * should be provided by three descriptors in maximum in hns3 network
+	 * engine.
+	 */
+	m_seg = m;
+	for (i = 0; m_seg != NULL && i < HNS3_MAX_TSO_HDR_BD_NUM && i < nb_buf;
+	     i++, m_seg = m_seg->next) {
+		tmp_data_len_sum += m_seg->data_len;
+	}
+
+	if (hdr_len > tmp_data_len_sum)
+		return -EINVAL;
+
+	return 0;
+}
+
 uint16_t
 hns3_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
 	       uint16_t nb_pkts)
@@ -2206,6 +2409,13 @@ hns3_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
 	for (i = 0; i < nb_pkts; i++) {
 		m = tx_pkts[i];
 
+		if (hns3_pkt_is_tso(m) &&
+		    (hns3_pkt_need_linearized(m, m->nb_segs) |
+		     hns3_check_tso_pkt_valid(m))) {
+			rte_errno = EINVAL;
+			return i;
+		}
+
 #ifdef RTE_LIBRTE_ETHDEV_DEBUG
 		ret = rte_validate_tx_offload(m);
 		if (ret != 0) {
@@ -2218,6 +2428,8 @@ hns3_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
 			rte_errno = -ret;
 			return i;
 		}
+
+		hns3_outer_header_cksum_prepare(m);
 	}
 
 	return i;
@@ -2241,13 +2453,39 @@ hns3_parse_cksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
 	return 0;
 }
 
+static int
+hns3_check_non_tso_pkt(uint16_t nb_buf, struct rte_mbuf **m_seg,
+		      struct rte_mbuf *tx_pkt, struct hns3_tx_queue *txq)
+{
+	struct rte_mbuf *new_pkt;
+	int ret;
+
+	if (hns3_pkt_is_tso(*m_seg))
+		return 0;
+
+	/*
+	 * If packet length is greater than HNS3_MAX_FRAME_LEN
+	 * driver support, the packet will be ignored.
+	 */
+	if (unlikely(rte_pktmbuf_pkt_len(tx_pkt) > HNS3_MAX_FRAME_LEN))
+		return -EINVAL;
+
+	if (unlikely(nb_buf > HNS3_MAX_NON_TSO_BD_PER_PKT)) {
+		ret = hns3_reassemble_tx_pkts(txq, tx_pkt, &new_pkt);
+		if (ret)
+			return ret;
+		*m_seg = new_pkt;
+	}
+
+	return 0;
+}
+
 uint16_t
 hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
 	struct rte_net_hdr_lens hdr_lens = {0};
 	struct hns3_tx_queue *txq = tx_queue;
 	struct hns3_entry *tx_bak_pkt;
-	struct rte_mbuf *new_pkt;
 	struct rte_mbuf *tx_pkt;
 	struct rte_mbuf *m_seg;
 	uint32_t nb_hold = 0;
@@ -2280,13 +2518,6 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		}
 
 		/*
-		 * If packet length is greater than HNS3_MAX_FRAME_LEN
-		 * driver support, the packet will be ignored.
-		 */
-		if (unlikely(rte_pktmbuf_pkt_len(tx_pkt) > HNS3_MAX_FRAME_LEN))
-			break;
-
-		/*
 		 * If packet length is less than minimum packet size, driver
 		 * need to pad it.
 		 */
@@ -2304,12 +2535,9 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		}
 
 		m_seg = tx_pkt;
-		if (unlikely(nb_buf > HNS3_MAX_TX_BD_PER_PKT)) {
-			if (hns3_reassemble_tx_pkts(txq, tx_pkt, &new_pkt))
-				goto end_of_tx;
-			m_seg = new_pkt;
-			nb_buf = m_seg->nb_segs;
-		}
+
+		if (hns3_check_non_tso_pkt(nb_buf, &m_seg, tx_pkt, txq))
+			goto end_of_tx;
 
 		if (hns3_parse_cksum(txq, tx_next_use, m_seg, &hdr_lens))
 			goto end_of_tx;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 02/13] net/hns3: check TSO segment size during Tx
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
  2020-11-13 13:36 ` [dpdk-stable] [PATCH 19.11.6 01/13] net/hns3: support TSO Lijun Ou
@ 2020-11-13 13:36 ` Lijun Ou
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 03/13] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:36 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: Hongbo Zheng <zhenghongbo3@huawei.com>

[ upstream commit b68259f775b749ae860dfa4e170dd01774bf30d7 ]

Base on hns3 network engine, when the rte_eth_tx_burst API is called
by Upper Level Process, if PKT_TX_TCP_SEG flag is set and tso_segsz
is 0 in the input parameter structure rte_mbuf, hns3 PMD driver will
process this packet as an non-TSO packet, otherwise hardware will enter
an abnormal state.

Fixes: 6dca716c9e1d ("net/hns3: support TSO")
Cc: stable@dpdk.org

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 28a9334..8b851a3 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1896,6 +1896,12 @@ hns3_tso_proc_tunnel(struct hns3_desc *desc, uint64_t ol_flags,
 	return 0;
 }
 
+static inline bool
+hns3_pkt_is_tso(struct rte_mbuf *m)
+{
+	return (m->tso_segsz != 0 && m->ol_flags & PKT_TX_TCP_SEG);
+}
+
 static void
 hns3_set_tso(struct hns3_desc *desc,
 	     uint64_t ol_flags, struct rte_mbuf *rxm)
@@ -1904,7 +1910,7 @@ hns3_set_tso(struct hns3_desc *desc,
 	uint32_t tmp;
 	uint8_t l2_len = rxm->l2_len;
 
-	if (!(ol_flags & PKT_TX_TCP_SEG))
+	if (!hns3_pkt_is_tso(rxm))
 		return;
 
 	if (hns3_tso_proc_tunnel(desc, ol_flags, rxm, &l2_len))
@@ -2353,12 +2359,6 @@ hns3_outer_header_cksum_prepare(struct rte_mbuf *m)
 	}
 }
 
-static inline bool
-hns3_pkt_is_tso(struct rte_mbuf *m)
-{
-	return (m->tso_segsz != 0 && m->ol_flags & PKT_TX_TCP_SEG);
-}
-
 static int
 hns3_check_tso_pkt_valid(struct rte_mbuf *m)
 {
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 03/13] net/hns3: fix reassembling multiple segment packets in Tx
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
  2020-11-13 13:36 ` [dpdk-stable] [PATCH 19.11.6 01/13] net/hns3: support TSO Lijun Ou
  2020-11-13 13:36 ` [dpdk-stable] [PATCH 19.11.6 02/13] net/hns3: check TSO segment size during Tx Lijun Ou
@ 2020-11-13 13:37 ` Lijun Ou
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 04/13] net/hns3: support promiscuous and allmulticast mode for VF Lijun Ou
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:37 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit 6c44219f9907a064049d160d366d98bb840724a6 ]

Because of the hardware constraints, hns3 network engine doesn't support
sending packets with more than eight fragments. And hns3 pmd driver
tries to reassemble these kind of packets to meet hardware requirements.
Currently, there are two problems:
1) when the input buffer_len * 8 < pkt_len, the packets are impossible
   to be reassembled into 8 Buffer Descriptors. In this case, the
   packets will be passed to hardware, which eventually causes a
   hardware reset.
2) The meta data in origin packets which are required to fill into the
   descriptor haven't been copied into the reassembled pkts.

This patch adds a check for 1) to ensure such packets will be dropped by
driver and copies useful meta data from the origin packets to the
reassembled packets.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 8b851a3..6eb77b3 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -2019,6 +2019,20 @@ hns3_tx_alloc_mbufs(struct hns3_tx_queue *txq, struct rte_mempool *mb_pool,
 	return 0;
 }
 
+static inline void
+hns3_pktmbuf_copy_hdr(struct rte_mbuf *new_pkt, struct rte_mbuf *old_pkt)
+{
+	new_pkt->ol_flags = old_pkt->ol_flags;
+	new_pkt->pkt_len = rte_pktmbuf_pkt_len(old_pkt);
+	new_pkt->outer_l2_len = old_pkt->outer_l2_len;
+	new_pkt->outer_l3_len = old_pkt->outer_l3_len;
+	new_pkt->l2_len = old_pkt->l2_len;
+	new_pkt->l3_len = old_pkt->l3_len;
+	new_pkt->l4_len = old_pkt->l4_len;
+	new_pkt->vlan_tci_outer = old_pkt->vlan_tci_outer;
+	new_pkt->vlan_tci = old_pkt->vlan_tci;
+}
+
 static int
 hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 			struct rte_mbuf **new_pkt)
@@ -2042,9 +2056,11 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 
 	mb_pool = tx_pkt->pool;
 	buf_size = tx_pkt->buf_len - RTE_PKTMBUF_HEADROOM;
-	nb_new_buf = (tx_pkt->pkt_len - 1) / buf_size + 1;
+	nb_new_buf = (rte_pktmbuf_pkt_len(tx_pkt) - 1) / buf_size + 1;
+	if (nb_new_buf > HNS3_MAX_NON_TSO_BD_PER_PKT)
+		return -EINVAL;
 
-	last_buf_len = tx_pkt->pkt_len % buf_size;
+	last_buf_len = rte_pktmbuf_pkt_len(tx_pkt) % buf_size;
 	if (last_buf_len == 0)
 		last_buf_len = buf_size;
 
@@ -2056,7 +2072,7 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 	/* Copy the original packet content to the new mbufs */
 	temp = tx_pkt;
 	s = rte_pktmbuf_mtod(temp, char *);
-	len_s = temp->data_len;
+	len_s = rte_pktmbuf_data_len(temp);
 	temp_new = new_mbuf;
 	for (i = 0; i < nb_new_buf; i++) {
 		d = rte_pktmbuf_mtod(temp_new, char *);
@@ -2079,13 +2095,14 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 				if (temp == NULL)
 					break;
 				s = rte_pktmbuf_mtod(temp, char *);
-				len_s = temp->data_len;
+				len_s = rte_pktmbuf_data_len(temp);
 			}
 		}
 
 		temp_new->data_len = buf_len;
 		temp_new = temp_new->next;
 	}
+	hns3_pktmbuf_copy_hdr(new_mbuf, tx_pkt);
 
 	/* free original mbufs */
 	rte_pktmbuf_free(tx_pkt);
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 04/13] net/hns3: support promiscuous and allmulticast mode for VF
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (2 preceding siblings ...)
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 03/13] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
@ 2020-11-13 13:37 ` Lijun Ou
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 05/13] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:37 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: Chengchang Tang <tangchengchang@huawei.com>

[ upstream commit 40486d38496241711955fc752ff4a6db4c89da7f ]

Currently, we only support VF device is bound to vfio_pci or igb_uio and
then driven by DPDK driver when PF is driven by kernel mode hns3 ethdev
driver, VF is not supported when PF is driven by hns3 DPDK driver.

This patch adds promiscuous and allmulticast mode support for hns3 VF
PMD driver.
1) The promiscuous/allmulticast mode can be configured successfully only
   based on the trusted VF device. If based on the non trusted VF
   device, configuring promiscuous/allmulticast mode will fail. The hns3
   VF device can be configured as trusted device by hns3 PF kernel
   ethdev driver on the host by "ip link set <eth num> vf <vf id> turst
   on" command.
2) After the promiscuous mode is configured successfully, hns3 VF PMD
   driver can receive the ingress and outgoing traffic. In the words,
   all the ingress packets, all the packets sent from the PF and other
   VFs on the same physical port.
3) Note: Because of the hardware constraints, By default vlan filter is
   enabled and couldn't be turned off based on VF device, so vlan filter
   is still effective even in promiscuous mode. If upper applications
   don't call rte_eth_dev_vlan_filter API function to set vlan based on
   VF device, hns3 VF PMD driver will can't receive the packets with
   vlan tag in promiscuous mode.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 doc/guides/nics/features/hns3_vf.ini |   2 +
 drivers/net/hns3/hns3_ethdev_vf.c    | 117 +++++++++++++++++++++++++++++++++--
 drivers/net/hns3/hns3_mbx.c          |  23 +++++++
 drivers/net/hns3/hns3_mbx.h          |   2 +
 4 files changed, 139 insertions(+), 5 deletions(-)

diff --git a/doc/guides/nics/features/hns3_vf.ini b/doc/guides/nics/features/hns3_vf.ini
index e4e7738..80773ac 100644
--- a/doc/guides/nics/features/hns3_vf.ini
+++ b/doc/guides/nics/features/hns3_vf.ini
@@ -9,6 +9,8 @@ Rx interrupt         = Y
 MTU update           = Y
 Jumbo frame          = Y
 TSO                  = Y
+Promiscuous mode     = Y
+Allmulticast mode    = Y
 Unicast MAC filter   = Y
 Multicast MAC filter = Y
 RSS hash             = Y
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index 736d8a7..5663a58 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -566,7 +566,8 @@ hns3vf_configure_all_mc_mac_addr(struct hns3_adapter *hns, bool del)
 }
 
 static int
-hns3vf_set_promisc_mode(struct hns3_hw *hw, bool en_bc_pmc)
+hns3vf_set_promisc_mode(struct hns3_hw *hw, bool en_bc_pmc,
+			bool en_uc_pmc, bool en_mc_pmc)
 {
 	struct hns3_mbx_vf_to_pf_cmd *req;
 	struct hns3_cmd_desc desc;
@@ -574,18 +575,116 @@ hns3vf_set_promisc_mode(struct hns3_hw *hw, bool en_bc_pmc)
 
 	req = (struct hns3_mbx_vf_to_pf_cmd *)desc.data;
 
+	/*
+	 * The hns3 VF PMD driver depends on the hns3 PF kernel ethdev driver,
+	 * so there are some features for promiscuous/allmulticast mode in hns3
+	 * VF PMD driver as below:
+	 * 1. The promiscuous/allmulticast mode can be configured successfully
+	 *    only based on the trusted VF device. If based on the non trusted
+	 *    VF device, configuring promiscuous/allmulticast mode will fail.
+	 *    The hns3 VF device can be confiruged as trusted device by hns3 PF
+	 *    kernel ethdev driver on the host by the following command:
+	 *      "ip link set <eth num> vf <vf id> turst on"
+	 * 2. After the promiscuous mode is configured successfully, hns3 VF PMD
+	 *    driver can receive the ingress and outgoing traffic. In the words,
+	 *    all the ingress packets, all the packets sent from the PF and
+	 *    other VFs on the same physical port.
+	 * 3. Note: Because of the hardware constraints, By default vlan filter
+	 *    is enabled and couldn't be turned off based on VF device, so vlan
+	 *    filter is still effective even in promiscuous mode. If upper
+	 *    applications don't call rte_eth_dev_vlan_filter API function to
+	 *    set vlan based on VF device, hns3 VF PMD driver will can't receive
+	 *    the packets with vlan tag in promiscuoue mode.
+	 */
 	hns3_cmd_setup_basic_desc(&desc, HNS3_OPC_MBX_VF_TO_PF, false);
 	req->msg[0] = HNS3_MBX_SET_PROMISC_MODE;
 	req->msg[1] = en_bc_pmc ? 1 : 0;
+	req->msg[2] = en_uc_pmc ? 1 : 0;
+	req->msg[3] = en_mc_pmc ? 1 : 0;
 
 	ret = hns3_cmd_send(hw, &desc, 1);
 	if (ret)
-		hns3_err(hw, "Set promisc mode fail, status is %d", ret);
+		hns3_err(hw, "Set promisc mode fail, ret = %d", ret);
+
+	return ret;
+}
+
+static int
+hns3vf_dev_promiscuous_enable(struct rte_eth_dev *dev)
+{
+	struct hns3_adapter *hns = dev->data->dev_private;
+	struct hns3_hw *hw = &hns->hw;
+	int ret;
+
+	ret = hns3vf_set_promisc_mode(hw, true, true, true);
+	if (ret)
+		hns3_err(hw, "Failed to enable promiscuous mode, ret = %d",
+			ret);
+	return ret;
+}
+
+static int
+hns3vf_dev_promiscuous_disable(struct rte_eth_dev *dev)
+{
+	bool allmulti = dev->data->all_multicast ? true : false;
+	struct hns3_adapter *hns = dev->data->dev_private;
+	struct hns3_hw *hw = &hns->hw;
+	int ret;
+
+	ret = hns3vf_set_promisc_mode(hw, true, false, allmulti);
+	if (ret)
+		hns3_err(hw, "Failed to disable promiscuous mode, ret = %d",
+			ret);
+	return ret;
+}
+
+static int
+hns3vf_dev_allmulticast_enable(struct rte_eth_dev *dev)
+{
+	struct hns3_adapter *hns = dev->data->dev_private;
+	struct hns3_hw *hw = &hns->hw;
+	int ret;
+
+	if (dev->data->promiscuous)
+		return 0;
+
+	ret = hns3vf_set_promisc_mode(hw, true, false, true);
+	if (ret)
+		hns3_err(hw, "Failed to enable allmulticast mode, ret = %d",
+			ret);
+	return ret;
+}
+
+static int
+hns3vf_dev_allmulticast_disable(struct rte_eth_dev *dev)
+{
+	struct hns3_adapter *hns = dev->data->dev_private;
+	struct hns3_hw *hw = &hns->hw;
+	int ret;
+
+	if (dev->data->promiscuous)
+		return 0;
 
+	ret = hns3vf_set_promisc_mode(hw, true, false, false);
+	if (ret)
+		hns3_err(hw, "Failed to disable allmulticast mode, ret = %d",
+			ret);
 	return ret;
 }
 
 static int
+hns3vf_restore_promisc(struct hns3_adapter *hns)
+{
+	struct hns3_hw *hw = &hns->hw;
+	bool allmulti = hw->data->all_multicast ? true : false;
+
+	if (hw->data->promiscuous)
+		return hns3vf_set_promisc_mode(hw, true, true, true);
+
+	return hns3vf_set_promisc_mode(hw, true, false, allmulti);
+}
+
+static int
 hns3vf_bind_ring_with_vector(struct hns3_hw *hw, uint8_t vector_id,
 			     bool mmap, enum hns3_ring_type queue_type,
 			     uint16_t queue_id)
@@ -1390,7 +1489,7 @@ hns3vf_init_hardware(struct hns3_adapter *hns)
 	uint16_t mtu = hw->data->mtu;
 	int ret;
 
-	ret = hns3vf_set_promisc_mode(hw, true);
+	ret = hns3vf_set_promisc_mode(hw, true, false, false);
 	if (ret)
 		return ret;
 
@@ -1432,7 +1531,7 @@ hns3vf_init_hardware(struct hns3_adapter *hns)
 	return 0;
 
 err_init_hardware:
-	(void)hns3vf_set_promisc_mode(hw, false);
+	(void)hns3vf_set_promisc_mode(hw, false, false, false);
 	return ret;
 }
 
@@ -1539,7 +1638,7 @@ hns3vf_uninit_vf(struct rte_eth_dev *eth_dev)
 
 	hns3_rss_uninit(hns);
 	(void)hns3vf_set_alive(hw, false);
-	(void)hns3vf_set_promisc_mode(hw, false);
+	(void)hns3vf_set_promisc_mode(hw, false, false, false);
 	hns3vf_disable_irq0(hw);
 	rte_intr_disable(&pci_dev->intr_handle);
 	hns3_intr_unregister(&pci_dev->intr_handle, hns3vf_interrupt_handler,
@@ -2044,6 +2143,10 @@ hns3vf_restore_conf(struct hns3_adapter *hns)
 	if (ret)
 		goto err_mc_mac;
 
+	ret = hns3vf_restore_promisc(hns);
+	if (ret)
+		goto err_vlan_table;
+
 	ret = hns3vf_restore_vlan_conf(hns);
 	if (ret)
 		goto err_vlan_table;
@@ -2198,6 +2301,10 @@ static const struct eth_dev_ops hns3vf_eth_dev_ops = {
 	.dev_stop           = hns3vf_dev_stop,
 	.dev_close          = hns3vf_dev_close,
 	.mtu_set            = hns3vf_dev_mtu_set,
+	.promiscuous_enable = hns3vf_dev_promiscuous_enable,
+	.promiscuous_disable = hns3vf_dev_promiscuous_disable,
+	.allmulticast_enable = hns3vf_dev_allmulticast_enable,
+	.allmulticast_disable = hns3vf_dev_allmulticast_disable,
 	.stats_get          = hns3_stats_get,
 	.stats_reset        = hns3_stats_reset,
 	.xstats_get         = hns3_dev_xstats_get,
diff --git a/drivers/net/hns3/hns3_mbx.c b/drivers/net/hns3/hns3_mbx.c
index 64996c9..34c8c68 100644
--- a/drivers/net/hns3/hns3_mbx.c
+++ b/drivers/net/hns3/hns3_mbx.c
@@ -326,6 +326,21 @@ hns3_handle_link_change_event(struct hns3_hw *hw,
 	hns3_update_link_status(hw);
 }
 
+static void
+hns3_handle_promisc_info(struct hns3_hw *hw, uint16_t promisc_en)
+{
+	if (!promisc_en) {
+		/*
+		 * When promisc/allmulti mode is closed by the hns3 PF kernel
+		 * ethdev driver for untrusted, modify VF's related status.
+		 */
+		hns3_warn(hw, "Promisc mode will be closed by host for being "
+			      "untrusted.");
+		hw->data->promiscuous = 0;
+		hw->data->all_multicast = 0;
+	}
+}
+
 void
 hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
 {
@@ -384,6 +399,14 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
 		case HNS3_MBX_PUSH_LINK_STATUS:
 			hns3_handle_link_change_event(hw, req);
 			break;
+		case HNS3_MBX_PUSH_PROMISC_INFO:
+			/*
+			 * When the trust status of VF device changed by the
+			 * hns3 PF kernel driver, VF driver will receive this
+			 * mailbox message from PF driver.
+			 */
+			hns3_handle_promisc_info(hw, req->msg[1]);
+			break;
 		default:
 			hns3_err(hw,
 				 "VF received unsupported(%d) mbx msg from PF",
diff --git a/drivers/net/hns3/hns3_mbx.h b/drivers/net/hns3/hns3_mbx.h
index b01eaac..d6d70f6 100644
--- a/drivers/net/hns3/hns3_mbx.h
+++ b/drivers/net/hns3/hns3_mbx.h
@@ -40,6 +40,8 @@ enum HNS3_MBX_OPCODE {
 	HNS3_MBX_SET_MTU,               /* (VF -> PF) set mtu */
 	HNS3_MBX_GET_QID_IN_PF,         /* (VF -> PF) get queue id in pf */
 
+	HNS3_MBX_PUSH_PROMISC_INFO = 36, /* (PF -> VF) push vf promisc info */
+
 	HNS3_MBX_HANDLE_VF_TBL = 38,    /* (VF -> PF) store/clear hw cfg tbl */
 	HNS3_MBX_GET_RING_VECTOR_MAP,   /* (VF -> PF) get ring-to-vector map */
 	HNS3_MBX_PUSH_LINK_STATUS = 201, /* (IMP -> PF) get port link status */
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 05/13] net/hns3: decrease non-nearby memory access in Rx
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (3 preceding siblings ...)
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 04/13] net/hns3: support promiscuous and allmulticast mode for VF Lijun Ou
@ 2020-11-13 13:37 ` Lijun Ou
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 06/13] net/hns3: cleanup duplicated code on processing TSO in Tx Lijun Ou
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:37 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: Chengchang Tang <tangchengchang@huawei.com>

[ upstream commit 8c7449779c4526d27205043560a21f0e2c2f622b ]

Currently, hns3 PMD driver needs know the PVID configuration state and
do different processing in the 'rx_pkt_burst' ops implementation
function.

This patch adds a member to struct hns3_rx_queue/hns3_tx_queue of the
driver to indicate the PVID configuration status, so it isn't need
to access other data structure in the 'rx_pkt_burst' ops implementation,
to avoid performance loss because of reducing cache miss.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c | 21 ++++++++++++++++++++-
 drivers/net/hns3/hns3_rxtx.c   | 37 +++++++++++++++++++++++++++++++------
 drivers/net/hns3/hns3_rxtx.h   | 13 +++++++++++++
 3 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index 23c16b8..ed783f7 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -895,6 +895,8 @@ hns3_vlan_pvid_set(struct rte_eth_dev *dev, uint16_t pvid, int on)
 {
 	struct hns3_adapter *hns = dev->data->dev_private;
 	struct hns3_hw *hw = &hns->hw;
+	bool pvid_en_state_change;
+	uint16_t pvid_state;
 	int ret;
 
 	if (pvid > RTE_ETHER_MAX_VLAN_ID) {
@@ -903,10 +905,27 @@ hns3_vlan_pvid_set(struct rte_eth_dev *dev, uint16_t pvid, int on)
 		return -EINVAL;
 	}
 
+	/*
+	 * If PVID configuration state change, should refresh the PVID
+	 * configuration state in struct hns3_tx_queue/hns3_rx_queue.
+	 */
+	pvid_state = hw->port_base_vlan_cfg.state;
+	if ((on && pvid_state == HNS3_PORT_BASE_VLAN_ENABLE) ||
+	    (!on && pvid_state == HNS3_PORT_BASE_VLAN_DISABLE))
+		pvid_en_state_change = false;
+	else
+		pvid_en_state_change = true;
+
 	rte_spinlock_lock(&hw->lock);
 	ret = hns3_vlan_pvid_configure(hns, pvid, on);
 	rte_spinlock_unlock(&hw->lock);
-	return ret;
+	if (ret)
+		return ret;
+
+	if (pvid_en_state_change)
+		hns3_update_all_queues_pvid_state(hw);
+
+	return 0;
 }
 
 static int
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 6eb77b3..6255862 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -316,6 +316,31 @@ hns3_init_tx_queue_hw(struct hns3_tx_queue *txq)
 }
 
 void
+hns3_update_all_queues_pvid_state(struct hns3_hw *hw)
+{
+	uint16_t nb_rx_q = hw->data->nb_rx_queues;
+	uint16_t nb_tx_q = hw->data->nb_tx_queues;
+	struct hns3_rx_queue *rxq;
+	struct hns3_tx_queue *txq;
+	int pvid_state;
+	int i;
+
+	pvid_state = hw->port_base_vlan_cfg.state;
+	for (i = 0; i < hw->cfg_max_queues; i++) {
+		if (i < nb_rx_q) {
+			rxq = hw->data->rx_queues[i];
+			if (rxq != NULL)
+				rxq->pvid_state = pvid_state;
+		}
+		if (i < nb_tx_q) {
+			txq = hw->data->tx_queues[i];
+			if (txq != NULL)
+				txq->pvid_state = pvid_state;
+		}
+	}
+}
+
+void
 hns3_enable_all_queues(struct hns3_hw *hw, bool en)
 {
 	uint16_t nb_rx_q = hw->data->nb_rx_queues;
@@ -1275,6 +1300,7 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	rxq->pkt_first_seg = NULL;
 	rxq->pkt_last_seg = NULL;
 	rxq->port_id = dev->data->port_id;
+	rxq->pvid_state = hw->port_base_vlan_cfg.state;
 	rxq->configured = true;
 	rxq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
@@ -1503,7 +1529,7 @@ hns3_rx_set_cksum_flag(struct rte_mbuf *rxm, uint64_t packet_type,
 }
 
 static inline void
-hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
+hns3_rxd_to_vlan_tci(struct hns3_rx_queue *rxq, struct rte_mbuf *mb,
 		     uint32_t l234_info, const struct hns3_desc *rxd)
 {
 #define HNS3_STRP_STATUS_NUM		0x4
@@ -1511,8 +1537,6 @@ hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
 #define HNS3_NO_STRP_VLAN_VLD		0x0
 #define HNS3_INNER_STRP_VLAN_VLD	0x1
 #define HNS3_OUTER_STRP_VLAN_VLD	0x2
-	struct hns3_adapter *hns = dev->data->dev_private;
-	struct hns3_hw *hw = &hns->hw;
 	uint32_t strip_status;
 	uint32_t report_mode;
 
@@ -1538,7 +1562,7 @@ hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
 	};
 	strip_status = hns3_get_field(l234_info, HNS3_RXD_STRP_TAGP_M,
 				      HNS3_RXD_STRP_TAGP_S);
-	report_mode = report_type[hw->port_base_vlan_cfg.state][strip_status];
+	report_mode = report_type[rxq->pvid_state][strip_status];
 	switch (report_mode) {
 	case HNS3_NO_STRP_VLAN_VLD:
 		mb->vlan_tci = 0;
@@ -1583,7 +1607,6 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	nb_rx = 0;
 	nb_rx_bd = 0;
 	rxq = rx_queue;
-	dev = &rte_eth_devices[rxq->port_id];
 
 	rx_id = rxq->next_to_clean;
 	rx_ring = rxq->rx_ring;
@@ -1660,6 +1683,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 
 		nmb = rte_mbuf_raw_alloc(rxq->mb_pool);
 		if (unlikely(nmb == NULL)) {
+			dev = &rte_eth_devices[rxq->port_id];
 			dev->data->rx_mbuf_alloc_failed++;
 			break;
 		}
@@ -1729,7 +1753,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 			hns3_rx_set_cksum_flag(first_seg,
 					       first_seg->packet_type,
 					       cksum_err);
-		hns3_rxd_to_vlan_tci(dev, first_seg, l234_info, &rxd);
+		hns3_rxd_to_vlan_tci(rxq, first_seg, l234_info, &rxd);
 
 		rx_pkts[nb_rx++] = first_seg;
 		first_seg = NULL;
@@ -1807,6 +1831,7 @@ hns3_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	txq->next_to_clean = 0;
 	txq->tx_bd_ready = txq->nb_tx_desc - 1;
 	txq->port_id = dev->data->port_id;
+	txq->pvid_state = hw->port_base_vlan_cfg.state;
 	txq->configured = true;
 	txq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index 1fd1afd..a416d10 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -250,6 +250,12 @@ struct hns3_rx_queue {
 	uint16_t rx_buf_len;
 	uint16_t rx_free_thresh;
 
+	/*
+	 * port based vlan configuration state.
+	 * value range: HNS3_PORT_BASE_VLAN_DISABLE / HNS3_PORT_BASE_VLAN_ENABLE
+	 */
+	uint16_t pvid_state;
+
 	bool rx_deferred_start; /* don't start this queue in dev start */
 	bool configured;        /* indicate if rx queue has been configured */
 
@@ -276,6 +282,12 @@ struct hns3_tx_queue {
 	uint16_t next_to_use;
 	uint16_t tx_bd_ready;
 
+	/*
+	 * port based vlan configuration state.
+	 * value range: HNS3_PORT_BASE_VLAN_DISABLE / HNS3_PORT_BASE_VLAN_ENABLE
+	 */
+	uint16_t pvid_state;
+
 	bool tx_deferred_start; /* don't start this queue in dev start */
 	bool configured;        /* indicate if tx queue has been configured */
 };
@@ -336,5 +348,6 @@ void hns3_set_queue_intr_rl(struct hns3_hw *hw, uint16_t queue_id,
 			    uint16_t rl_value);
 int hns3_set_fake_rx_or_tx_queues(struct rte_eth_dev *dev, uint16_t nb_rx_q,
 				  uint16_t nb_tx_q);
+void hns3_update_all_queues_pvid_state(struct hns3_hw *hw);
 
 #endif /* _HNS3_RXTX_H_ */
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 06/13] net/hns3: cleanup duplicated code on processing TSO in Tx
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (4 preceding siblings ...)
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 05/13] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
@ 2020-11-13 13:37 ` Lijun Ou
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 07/13] net/hns3: fix inserted VLAN tag position " Lijun Ou
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:37 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: Chengchang Tang <tangchengchang@huawei.com>

[ upstream commit a001f09d11fac91b760c038cf69af7b041bc983c ]

This patch fixes up paylen calculation twice when processing TSO request
in the '.tx_pkt_burst' ops implementation function to avoid performance
loss.

Fixes: 6dca716c9e1d ("net/hns3: support TSO")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 6255862..9d14632 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1928,12 +1928,11 @@ hns3_pkt_is_tso(struct rte_mbuf *m)
 }
 
 static void
-hns3_set_tso(struct hns3_desc *desc,
-	     uint64_t ol_flags, struct rte_mbuf *rxm)
+hns3_set_tso(struct hns3_desc *desc, uint64_t ol_flags,
+		uint32_t paylen, struct rte_mbuf *rxm)
 {
-	uint32_t paylen, hdr_len;
-	uint32_t tmp;
 	uint8_t l2_len = rxm->l2_len;
+	uint32_t tmp;
 
 	if (!hns3_pkt_is_tso(rxm))
 		return;
@@ -1941,10 +1940,6 @@ hns3_set_tso(struct hns3_desc *desc,
 	if (hns3_tso_proc_tunnel(desc, ol_flags, rxm, &l2_len))
 		return;
 
-	hdr_len = rxm->l2_len + rxm->l3_len + rxm->l4_len;
-	hdr_len += (ol_flags & PKT_TX_TUNNEL_MASK) ?
-		    rxm->outer_l2_len + rxm->outer_l3_len : 0;
-	paylen = rxm->pkt_len - hdr_len;
 	if (paylen <= rxm->tso_segsz)
 		return;
 
@@ -1985,7 +1980,7 @@ fill_desc(struct hns3_tx_queue *txq, uint16_t tx_desc_id, struct rte_mbuf *rxm,
 			   rxm->outer_l2_len + rxm->outer_l3_len : 0;
 		paylen = rxm->pkt_len - hdr_len;
 		desc->tx.paylen = rte_cpu_to_le_32(paylen);
-		hns3_set_tso(desc, ol_flags, rxm);
+		hns3_set_tso(desc, ol_flags, paylen, rxm);
 	}
 
 	hns3_set_bit(rrcfv, HNS3_TXD_FE_B, frag_end);
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 07/13] net/hns3: fix inserted VLAN tag position in Tx
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (5 preceding siblings ...)
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 06/13] net/hns3: cleanup duplicated code on processing TSO in Tx Lijun Ou
@ 2020-11-13 13:37 ` Lijun Ou
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 08/13] net/hns3: report Tx descriptor segment limitations Lijun Ou
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:37 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit fc9b57ff576c4088138a8e0e53650b806927e683 ]

Based on hns3 network engine, in order to configure hardware VLAN insert
offload in Tx direction, PMD driver reads the VLAN tags from the
vlan_tci_outer and vlan_tci of the structure rte_mbuf, fills them into
the Tx Buffer Descriptor and sets the related offload flag for every
packet.

Currently, there are two VLAN related problems in the 'tx_pkt_burst' ops
implementation function:
1) When setting the related offload flag, PMD driver inserts the VLAN
   tag into the position that close to L3 header. So, when upper
   application sends a packet with a VLAN tag in the data buffer, the
   VLAN offloaded by hardware will be added to the wrong position. It is
   supposed to add the VLAN tag from the rte_mbuf to the position close
   to the MAC header in the packet when using VLAN insertion.

   And when PF PVID is enabled by calling the API function named
   rte_eth_dev_set_vlan_pvid or VF PVID is enabled by hns3 PF kernel
   ether driver, the VLAN tag from the structure rte_mbuf to enable the
   VLAN insertion should be filled into the position that close to L3
   header to avoid to be overwritten by the PVID which will always be
   inserted in the position that close to the MAC address.

2) When sending multiple segment packets, VLAN information is required
   to be filled into the first Tx Buffer descriptor. However, currently
   hns3 PMD driver incorrectly placed it in the last Tx Buffer
   Descriptor. This results in VLAN insert offload failure when sending
   multiple segment packets.

This patch fixed them by filling the VLAN information into the position
of the Tx Buffer Descriptor.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 102 +++++++++++++++++++++++++++----------------
 1 file changed, 65 insertions(+), 37 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 9d14632..e351f37 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1956,51 +1956,58 @@ hns3_set_tso(struct hns3_desc *desc, uint64_t ol_flags,
 	desc->tx.mss = rte_cpu_to_le_16(rxm->tso_segsz);
 }
 
+static inline void
+hns3_fill_per_desc(struct hns3_desc *desc, struct rte_mbuf *rxm)
+{
+	desc->addr = rte_mbuf_data_iova(rxm);
+	desc->tx.send_size = rte_cpu_to_le_16(rte_pktmbuf_data_len(rxm));
+	desc->tx.tp_fe_sc_vld_ra_ri = rte_cpu_to_le_16(BIT(HNS3_TXD_VLD_B));
+}
+
 static void
-fill_desc(struct hns3_tx_queue *txq, uint16_t tx_desc_id, struct rte_mbuf *rxm,
-	  bool first, int offset)
+hns3_fill_first_desc(struct hns3_tx_queue *txq, struct hns3_desc *desc,
+		     struct rte_mbuf *rxm)
 {
-	struct hns3_desc *tx_ring = txq->tx_ring;
-	struct hns3_desc *desc = &tx_ring[tx_desc_id];
-	uint8_t frag_end = rxm->next == NULL ? 1 : 0;
 	uint64_t ol_flags = rxm->ol_flags;
-	uint16_t size = rxm->data_len;
-	uint16_t rrcfv = 0;
 	uint32_t hdr_len;
 	uint32_t paylen;
-	uint32_t tmp;
 
-	desc->addr = rte_mbuf_data_iova(rxm) + offset;
-	desc->tx.send_size = rte_cpu_to_le_16(size);
-	hns3_set_bit(rrcfv, HNS3_TXD_VLD_B, 1);
-
-	if (first) {
-		hdr_len = rxm->l2_len + rxm->l3_len + rxm->l4_len;
-		hdr_len += (ol_flags & PKT_TX_TUNNEL_MASK) ?
+	hdr_len = rxm->l2_len + rxm->l3_len + rxm->l4_len;
+	hdr_len += (ol_flags & PKT_TX_TUNNEL_MASK) ?
 			   rxm->outer_l2_len + rxm->outer_l3_len : 0;
-		paylen = rxm->pkt_len - hdr_len;
-		desc->tx.paylen = rte_cpu_to_le_32(paylen);
-		hns3_set_tso(desc, ol_flags, paylen, rxm);
-	}
-
-	hns3_set_bit(rrcfv, HNS3_TXD_FE_B, frag_end);
-	desc->tx.tp_fe_sc_vld_ra_ri = rte_cpu_to_le_16(rrcfv);
-
-	if (frag_end) {
-		if (ol_flags & (PKT_TX_VLAN_PKT | PKT_TX_QINQ_PKT)) {
-			tmp = rte_le_to_cpu_32(desc->tx.type_cs_vlan_tso_len);
-			hns3_set_bit(tmp, HNS3_TXD_VLAN_B, 1);
-			desc->tx.type_cs_vlan_tso_len = rte_cpu_to_le_32(tmp);
-			desc->tx.vlan_tag = rte_cpu_to_le_16(rxm->vlan_tci);
-		}
+	paylen = rxm->pkt_len - hdr_len;
+	desc->tx.paylen = rte_cpu_to_le_32(paylen);
+	hns3_set_tso(desc, ol_flags, paylen, rxm);
 
-		if (ol_flags & PKT_TX_QINQ_PKT) {
-			tmp = rte_le_to_cpu_32(desc->tx.ol_type_vlan_len_msec);
-			hns3_set_bit(tmp, HNS3_TXD_OVLAN_B, 1);
-			desc->tx.ol_type_vlan_len_msec = rte_cpu_to_le_32(tmp);
+	/*
+	 * Currently, hardware doesn't support more than two layers VLAN offload
+	 * in Tx direction based on hns3 network engine. So when the number of
+	 * VLANs in the packets represented by rxm plus the number of VLAN
+	 * offload by hardware such as PVID etc, exceeds two, the packets will
+	 * be discarded or the original VLAN of the packets will be overwitted
+	 * by hardware. When the PF PVID is enabled by calling the API function
+	 * named rte_eth_dev_set_vlan_pvid or the VF PVID is enabled by the hns3
+	 * PF kernel ether driver, the outer VLAN tag will always be the PVID.
+	 * To avoid the VLAN of Tx descriptor is overwritten by PVID, it should
+	 * be added to the position close to the IP header when PVID is enabled.
+	 */
+	if (!txq->pvid_state && ol_flags & (PKT_TX_VLAN_PKT |
+				PKT_TX_QINQ_PKT)) {
+		desc->tx.ol_type_vlan_len_msec |=
+				rte_cpu_to_le_32(BIT(HNS3_TXD_OVLAN_B));
+		if (ol_flags & PKT_TX_QINQ_PKT)
 			desc->tx.outer_vlan_tag =
-				rte_cpu_to_le_16(rxm->vlan_tci_outer);
-		}
+					rte_cpu_to_le_16(rxm->vlan_tci_outer);
+		else
+			desc->tx.outer_vlan_tag =
+					rte_cpu_to_le_16(rxm->vlan_tci);
+	}
+
+	if (ol_flags & PKT_TX_QINQ_PKT ||
+	    ((ol_flags & PKT_TX_VLAN_PKT) && txq->pvid_state)) {
+		desc->tx.type_cs_vlan_tso_len |=
+					rte_cpu_to_le_32(BIT(HNS3_TXD_VLAN_B));
+		desc->tx.vlan_tag = rte_cpu_to_le_16(rxm->vlan_tci);
 	}
 }
 
@@ -2523,8 +2530,10 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	struct rte_net_hdr_lens hdr_lens = {0};
 	struct hns3_tx_queue *txq = tx_queue;
 	struct hns3_entry *tx_bak_pkt;
+	struct hns3_desc *tx_ring;
 	struct rte_mbuf *tx_pkt;
 	struct rte_mbuf *m_seg;
+	struct hns3_desc *desc;
 	uint32_t nb_hold = 0;
 	uint16_t tx_next_use;
 	uint16_t tx_pkt_num;
@@ -2539,6 +2548,7 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	tx_next_use   = txq->next_to_use;
 	tx_bd_max     = txq->nb_tx_desc;
 	tx_pkt_num = nb_pkts;
+	tx_ring = txq->tx_ring;
 
 	/* send packets */
 	tx_bak_pkt = &txq->sw_ring[tx_next_use];
@@ -2580,8 +2590,22 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 			goto end_of_tx;
 
 		i = 0;
+		desc = &tx_ring[tx_next_use];
+
+		/*
+		 * If the packet is divided into multiple Tx Buffer Descriptors,
+		 * only need to fill vlan, paylen and tso into the first Tx
+		 * Buffer Descriptor.
+		 */
+		hns3_fill_first_desc(txq, desc, m_seg);
+
 		do {
-			fill_desc(txq, tx_next_use, m_seg, (i == 0), 0);
+			desc = &tx_ring[tx_next_use];
+			/*
+			 * Fill valid bits, DMA address and data length for each
+			 * Tx Buffer Descriptor.
+			 */
+			hns3_fill_per_desc(desc, m_seg);
 			tx_bak_pkt->mbuf = m_seg;
 			m_seg = m_seg->next;
 			tx_next_use++;
@@ -2594,6 +2618,10 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 			i++;
 		} while (m_seg != NULL);
 
+		/* Add end flag for the last Tx Buffer Descriptor */
+		desc->tx.tp_fe_sc_vld_ra_ri |=
+				 rte_cpu_to_le_16(BIT(HNS3_TXD_FE_B));
+
 		nb_hold += i;
 		txq->next_to_use = tx_next_use;
 		txq->tx_bd_ready -= i;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 08/13] net/hns3: report Tx descriptor segment limitations
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (6 preceding siblings ...)
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 07/13] net/hns3: fix inserted VLAN tag position " Lijun Ou
@ 2020-11-13 13:37 ` Lijun Ou
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 09/13] net/hns3: report Rx drop packets enable configuration Lijun Ou
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:37 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

[ upstream commit eb8b3a0d829bc109dfb13de9c8cdeacda5449e69 ]

According to the user manual of Kunpeng920 SoC, the max allowed number
of segments per whole packet is 63 and the max number of segments per
packet is 8 in datapath.

This patch reports the Two segment parameters of Tx descriptor
limitations to DPDK framework.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c    | 2 ++
 drivers/net/hns3/hns3_ethdev_vf.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index ed783f7..3129f74 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2503,6 +2503,8 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_max = HNS3_MAX_RING_DESC,
 		.nb_min = HNS3_MIN_RING_DESC,
 		.nb_align = HNS3_ALIGN_RING_DESC,
+		.nb_seg_max = HNS3_MAX_TSO_BD_PER_PKT,
+		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
 	};
 
 	info->vmdq_queue_num = 0;
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index 5663a58..8b36f18 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -965,6 +965,8 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_max = HNS3_MAX_RING_DESC,
 		.nb_min = HNS3_MIN_RING_DESC,
 		.nb_align = HNS3_ALIGN_RING_DESC,
+		.nb_seg_max = HNS3_MAX_TSO_BD_PER_PKT,
+		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
 	};
 
 	info->vmdq_queue_num = 0;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 09/13] net/hns3: report Rx drop packets enable configuration
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (7 preceding siblings ...)
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 08/13] net/hns3: report Tx descriptor segment limitations Lijun Ou
@ 2020-11-13 13:37 ` Lijun Ou
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 10/13] net/hns3: report Rx free threshold Lijun Ou
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:37 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit a02f1461c75b4be31d4a37ea3fae95aee86cdff2 ]

Currently, if there are not available Rx buffer descriptors in receiving
direction based on hns3 network engine, incoming packets will always be
dropped by hardware. This patch reports the '.rx_drop_en' information to
DPDK framework in the '.dev_infos_get', '.rxq_info_get' and
'.rx_queue_setup' ops implementation function.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c    | 9 +++++++++
 drivers/net/hns3/hns3_ethdev_vf.c | 9 +++++++++
 drivers/net/hns3/hns3_rxtx.c      | 6 ++++++
 3 files changed, 24 insertions(+)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index 3129f74..d7cc9a4 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2507,6 +2507,15 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
 	};
 
+	info->default_rxconf = (struct rte_eth_rxconf) {
+		/*
+		 * If there are no available Rx buffer descriptors, incoming
+		 * packets are always dropped by hardware based on hns3 network
+		 * engine.
+		 */
+		.rx_drop_en = 1,
+	};
+
 	info->vmdq_queue_num = 0;
 
 	info->reta_size = HNS3_RSS_IND_TBL_SIZE;
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index 8b36f18..5b8c712 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -969,6 +969,15 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
 	};
 
+	info->default_rxconf = (struct rte_eth_rxconf) {
+		/*
+		 * If there are no available Rx buffer descriptors, incoming
+		 * packets are always dropped by hardware based on hns3 network
+		 * engine.
+		 */
+		.rx_drop_en = 1,
+	};
+
 	info->vmdq_queue_num = 0;
 
 	info->reta_size = HNS3_RSS_IND_TBL_SIZE;
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index e351f37..61354f0 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1251,6 +1251,12 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 		return -EINVAL;
 	}
 
+	if (conf->rx_drop_en == 0)
+		hns3_warn(hw, "if there are no available Rx descriptors,"
+			  "incoming packets are always dropped. input parameter"
+			  " conf->rx_drop_en(%u) is uneffective.",
+			  conf->rx_drop_en);
+
 	if (dev->data->rx_queues[idx]) {
 		hns3_rx_queue_release(dev->data->rx_queues[idx]);
 		dev->data->rx_queues[idx] = NULL;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 10/13] net/hns3: report Rx free threshold
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (8 preceding siblings ...)
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 09/13] net/hns3: report Rx drop packets enable configuration Lijun Ou
@ 2020-11-13 13:37 ` Lijun Ou
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 11/13] net/hns3: reduce address calculation in Rx Lijun Ou
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:37 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit ceabee45be4a1737994c5f4631bd001b8021475c ]

This patch reports .rx_free_thresh value in the .dev_infos_get ops
implementation function named hns3_dev_infos_get and
hns3vf_dev_infos_get.
In addition, the name of the member variable of struct hns3_rx_queue is
modified and comments are added to improve code readability.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c    |  2 ++
 drivers/net/hns3/hns3_ethdev_vf.c |  2 ++
 drivers/net/hns3/hns3_rxtx.c      | 30 ++++++++++++------------------
 drivers/net/hns3/hns3_rxtx.h      |  9 ++++++---
 4 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index d7cc9a4..69dc841 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2508,12 +2508,14 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 	};
 
 	info->default_rxconf = (struct rte_eth_rxconf) {
+		.rx_free_thresh = HNS3_DEFAULT_RX_FREE_THRESH,
 		/*
 		 * If there are no available Rx buffer descriptors, incoming
 		 * packets are always dropped by hardware based on hns3 network
 		 * engine.
 		 */
 		.rx_drop_en = 1,
+		.offloads = 0,
 	};
 
 	info->vmdq_queue_num = 0;
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index 5b8c712..9e022a2 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -970,12 +970,14 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 	};
 
 	info->default_rxconf = (struct rte_eth_rxconf) {
+		.rx_free_thresh = HNS3_DEFAULT_RX_FREE_THRESH,
 		/*
 		 * If there are no available Rx buffer descriptors, incoming
 		 * packets are always dropped by hardware based on hns3 network
 		 * engine.
 		 */
 		.rx_drop_en = 1,
+		.offloads = 0,
 	};
 
 	info->vmdq_queue_num = 0;
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 61354f0..49246ab 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -634,8 +634,7 @@ hns3_dev_rx_queue_start(struct hns3_adapter *hns, uint16_t idx)
 	}
 
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	hns3_init_rx_queue_hw(rxq);
 
 	return 0;
@@ -649,8 +648,7 @@ hns3_fake_rx_queue_start(struct hns3_adapter *hns, uint16_t idx)
 
 	rxq = (struct hns3_rx_queue *)hw->fkq_data.rx_queues[idx];
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	hns3_init_rx_queue_hw(rxq);
 }
 
@@ -1285,10 +1283,8 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 
 	rxq->hns = hns;
 	rxq->mb_pool = mp;
-	if (conf->rx_free_thresh <= 0)
-		rxq->rx_free_thresh = DEFAULT_RX_FREE_THRESH;
-	else
-		rxq->rx_free_thresh = conf->rx_free_thresh;
+	rxq->rx_free_thresh = (conf->rx_free_thresh > 0) ?
+		conf->rx_free_thresh : HNS3_DEFAULT_RX_FREE_THRESH;
 	rxq->rx_deferred_start = conf->rx_deferred_start;
 
 	rx_entry_len = sizeof(struct hns3_entry) * rxq->nb_rx_desc;
@@ -1301,8 +1297,7 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	}
 
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	rxq->pkt_first_seg = NULL;
 	rxq->pkt_last_seg = NULL;
 	rxq->port_id = dev->data->port_id;
@@ -1614,11 +1609,11 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	nb_rx_bd = 0;
 	rxq = rx_queue;
 
-	rx_id = rxq->next_to_clean;
+	rx_id = rxq->next_to_use;
 	rx_ring = rxq->rx_ring;
+	sw_ring = rxq->sw_ring;
 	first_seg = rxq->pkt_first_seg;
 	last_seg = rxq->pkt_last_seg;
-	sw_ring = rxq->sw_ring;
 
 	while (nb_rx < nb_pkts) {
 		rxdp = &rx_ring[rx_id];
@@ -1769,16 +1764,15 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 		first_seg = NULL;
 	}
 
-	rxq->next_to_clean = rx_id;
+	rxq->next_to_use = rx_id;
 	rxq->pkt_first_seg = first_seg;
 	rxq->pkt_last_seg = last_seg;
 
-	nb_rx_bd = nb_rx_bd + rxq->nb_rx_hold;
-	if (nb_rx_bd > rxq->rx_free_thresh) {
-		hns3_clean_rx_buffers(rxq, nb_rx_bd);
-		nb_rx_bd = 0;
+	rxq->rx_free_hold += nb_rx_bd;
+	if (rxq->rx_free_hold > rxq->rx_free_thresh) {
+		hns3_clean_rx_buffers(rxq, rxq->rx_free_hold);
+		rxq->rx_free_hold = 0;
 	}
-	rxq->nb_rx_hold = nb_rx_bd;
 
 	return nb_rx;
 }
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index a416d10..6211665 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -10,6 +10,7 @@
 #define HNS3_DEFAULT_RING_DESC  1024
 #define	HNS3_ALIGN_RING_DESC	32
 #define HNS3_RING_BASE_ALIGN	128
+#define HNS3_DEFAULT_RX_FREE_THRESH	32
 
 #define HNS3_512_BD_BUF_SIZE	512
 #define HNS3_1K_BD_BUF_SIZE	1024
@@ -243,12 +244,14 @@ struct hns3_rx_queue {
 	uint16_t queue_id;
 	uint16_t port_id;
 	uint16_t nb_rx_desc;
-	uint16_t nb_rx_hold;
-	uint16_t rx_tail;
-	uint16_t next_to_clean;
 	uint16_t next_to_use;
 	uint16_t rx_buf_len;
+	/*
+	 * threshold for the number of BDs waited to passed to hardware. If the
+	 * number exceeds the threshold, driver will pass these BDs to hardware.
+	 */
 	uint16_t rx_free_thresh;
+	uint16_t rx_free_hold;   /* num of BDs waited to passed to hardware */
 
 	/*
 	 * port based vlan configuration state.
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 11/13] net/hns3: reduce address calculation in Rx
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (9 preceding siblings ...)
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 10/13] net/hns3: report Rx free threshold Lijun Ou
@ 2020-11-13 13:37 ` Lijun Ou
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 12/13] net/hns3: fix Tx checksum outer header prepare Lijun Ou
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:37 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit 323df8941b57027752ee9d191f1ac6f359bd524e ]

This patch adds the internal function named hns3_write_reg_opt to avoid
performance loss from address calculation during register access in the
'.rx_pkt_burst' ops implementation function named hns3_recv_pkts.

In addition, because hardware always access register in little-endian
mode based on hns3 network engine, so driver should also call
rte_cpu_to_le_32 to convert data in little-endian mode before writing
register and call rte_le_to_cpu_32 to convert data after reading from
register. Here the driver encapsulates the data conversion operation in
the register read/write operation function as below:
  hns3_write_reg
  hns3_write_reg_opt
  hns3_read_reg
Therefore, when calling these functions, conversion is not required
again.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.h | 29 +++++++++++++++++++++++++++--
 drivers/net/hns3/hns3_rxtx.c   | 14 +++-----------
 drivers/net/hns3/hns3_rxtx.h   |  1 +
 3 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
index f63d01d..5935ce9 100644
--- a/drivers/net/hns3/hns3_ethdev.h
+++ b/drivers/net/hns3/hns3_ethdev.h
@@ -583,14 +583,39 @@ struct hns3_adapter {
 	type __max2 = (y);                      \
 	__max1 > __max2 ? __max1 : __max2; })
 
+/*
+ * Because hardware always access register in little-endian mode based on hns3
+ * network engine, so driver should also call rte_cpu_to_le_32 to convert data
+ * in little-endian mode before writing register and call rte_le_to_cpu_32 to
+ * convert data after reading from register.
+ *
+ * Here the driver encapsulates the data conversion operation in the register
+ * read/write operation function as below:
+ *   hns3_write_reg
+ *   hns3_write_reg_opt
+ *   hns3_read_reg
+ * Therefore, when calling these functions, conversion is not required again.
+ */
 static inline void hns3_write_reg(void *base, uint32_t reg, uint32_t value)
 {
-	rte_write32(value, (volatile void *)((char *)base + reg));
+	rte_write32(rte_cpu_to_le_32(value),
+		    (volatile void *)((char *)base + reg));
+}
+
+/*
+ * The optimized function for writing registers used in the '.rx_pkt_burst' and
+ * '.tx_pkt_burst' ops implementation function.
+ */
+static inline void hns3_write_reg_opt(volatile void *addr, uint32_t value)
+{
+	rte_io_wmb();
+	rte_write32_relaxed(rte_cpu_to_le_32(value), addr);
 }
 
 static inline uint32_t hns3_read_reg(void *base, uint32_t reg)
 {
-	return rte_read32((volatile void *)((char *)base + reg));
+	uint32_t read_val = rte_read32((volatile void *)((char *)base + reg));
+	return rte_le_to_cpu_32(read_val);
 }
 
 #define hns3_write_dev(a, reg, value) \
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 49246ab..2bb0c46 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1305,6 +1305,8 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	rxq->configured = true;
 	rxq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
+	rxq->io_head_reg = (volatile void *)((char *)rxq->io_base +
+			   HNS3_RING_RX_HEAD_REG);
 	rxq->rx_buf_len = rx_buf_size;
 	rxq->l2_errors = 0;
 	rxq->pkt_len_errors = 0;
@@ -1448,16 +1450,6 @@ hns3_dev_supported_ptypes_get(struct rte_eth_dev *dev)
 	return NULL;
 }
 
-static void
-hns3_clean_rx_buffers(struct hns3_rx_queue *rxq, int count)
-{
-	rxq->next_to_use += count;
-	if (rxq->next_to_use >= rxq->nb_rx_desc)
-		rxq->next_to_use -= rxq->nb_rx_desc;
-
-	hns3_write_dev(rxq, HNS3_RING_RX_HEAD_REG, count);
-}
-
 static int
 hns3_handle_bdinfo(struct hns3_rx_queue *rxq, struct rte_mbuf *rxm,
 		   uint32_t bd_base_info, uint32_t l234_info,
@@ -1770,7 +1762,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 
 	rxq->rx_free_hold += nb_rx_bd;
 	if (rxq->rx_free_hold > rxq->rx_free_thresh) {
-		hns3_clean_rx_buffers(rxq, rxq->rx_free_hold);
+		hns3_write_reg_opt(rxq->io_head_reg, rxq->rx_free_hold);
 		rxq->rx_free_hold = 0;
 	}
 
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index 6211665..fd4f419 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -231,6 +231,7 @@ struct hns3_entry {
 
 struct hns3_rx_queue {
 	void *io_base;
+	volatile void *io_head_reg;
 	struct hns3_adapter *hns;
 	struct rte_mempool *mb_pool;
 	struct hns3_desc *rx_ring;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 12/13] net/hns3: fix Tx checksum outer header prepare
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (10 preceding siblings ...)
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 11/13] net/hns3: reduce address calculation in Rx Lijun Ou
@ 2020-11-13 13:37 ` Lijun Ou
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 13/13] net/hns3: fix Tx checksum with fixed header length Lijun Ou
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:37 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: Chengchang Tang <tangchengchang@huawei.com>

[ upstream commit f9edada6515213d3c58180d564a0743a4ebbe153 ]

Currently, there are two mistakes in Tx checksum outer header prepare.
1) Check whether the packet outer header is IPV4 based on PKT_TX_IPV4
   which is incorrect.
2) For HIP08, the outer UDP cksum could not be offloaded. And driver
   should ensure the outer udp cksum filed set to 0. In current code,
   PKT_TX_UDP_CKSUM is used to determine whether the outer layer of
   the packet is a UDP header. Actually, for tunnel TSO, the flag will
   never be set.

For the first mistake, it is fixed by replacing PKT_TX_IPV4 with
PKT_TX_OUTER_IPV4. And the protocol number in L3 header is used to check
whether the outer L4 header is UDP.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Fixes: 6dca716c9e1d ("net/hns3: support TSO")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 2bb0c46..e8c85c6 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -2365,26 +2365,29 @@ static void
 hns3_outer_header_cksum_prepare(struct rte_mbuf *m)
 {
 	uint64_t ol_flags = m->ol_flags;
-	struct rte_ipv4_hdr *ipv4_hdr;
-	struct rte_udp_hdr *udp_hdr;
-	uint32_t paylen, hdr_len;
+	uint32_t paylen, hdr_len, l4_proto;
 
 	if (!(ol_flags & (PKT_TX_OUTER_IPV4 | PKT_TX_OUTER_IPV6)))
 		return;
 
-	if (ol_flags & PKT_TX_IPV4) {
+	if (ol_flags & PKT_TX_OUTER_IPV4) {
+		struct rte_ipv4_hdr *ipv4_hdr;
 		ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct rte_ipv4_hdr *,
 						   m->outer_l2_len);
-
-		if (ol_flags & PKT_TX_IP_CKSUM)
+		l4_proto = ipv4_hdr->next_proto_id;
+		if (ol_flags & PKT_TX_OUTER_IP_CKSUM)
 			ipv4_hdr->hdr_checksum = 0;
+	} else {
+		struct rte_ipv6_hdr *ipv6_hdr;
+		ipv6_hdr = rte_pktmbuf_mtod_offset(m, struct rte_ipv6_hdr *,
+						   m->outer_l2_len);
+		l4_proto = ipv6_hdr->proto;
 	}
-
-	if ((ol_flags & PKT_TX_L4_MASK) == PKT_TX_UDP_CKSUM &&
-	    ol_flags & PKT_TX_TCP_SEG) {
+	/* driver should ensure the outer udp cksum is 0 for TUNNEL TSO */
+	if (l4_proto == IPPROTO_UDP && (ol_flags & PKT_TX_TCP_SEG)) {
+		struct rte_udp_hdr *udp_hdr;
 		hdr_len = m->l2_len + m->l3_len + m->l4_len;
-		hdr_len += (ol_flags & PKT_TX_TUNNEL_MASK) ?
-				m->outer_l2_len + m->outer_l3_len : 0;
+		hdr_len += m->outer_l2_len + m->outer_l3_len;
 		paylen = m->pkt_len - hdr_len;
 		if (paylen <= m->tso_segsz)
 			return;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH 19.11.6 13/13] net/hns3: fix Tx checksum with fixed header length
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (11 preceding siblings ...)
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 12/13] net/hns3: fix Tx checksum outer header prepare Lijun Ou
@ 2020-11-13 13:37 ` Lijun Ou
  2020-11-23 15:52 ` [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Luca Boccassi
  2020-11-25  3:24 ` [dpdk-stable] [PATCH v2 19.11.6 0/7] " Lijun Ou
  14 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-13 13:37 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: Chengchang Tang <tangchengchang@huawei.com>

[ upstream commit fb6eb9009f41b823dddef8a09fe8e0b0698c2bf5 ]

Currently, the header length of all the layers are fixed, It would
lead to a csum error when the header length changed.

This patch fixes above problem by using the header length in mbuf
instead of the fixed header length to perform the TX cksum offload.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.h |   2 +
 drivers/net/hns3/hns3_rxtx.c   | 316 +++++++++++++++++++----------------------
 drivers/net/hns3/hns3_rxtx.h   |   8 +-
 3 files changed, 153 insertions(+), 173 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
index 5935ce9..edbea3e 100644
--- a/drivers/net/hns3/hns3_ethdev.h
+++ b/drivers/net/hns3/hns3_ethdev.h
@@ -556,6 +556,8 @@ struct hns3_adapter {
 #define hns3_get_bit(origin, shift) \
 	hns3_get_field((origin), (0x1UL << (shift)), (shift))
 
+#define hns3_gen_field_val(mask, shift, val) (((val) << (shift)) & (mask))
+
 /*
  * upper_32_bits - return bits 32-63 of a number
  * A basic shift-right of a 64- or 32-bit quantity. Use this to suppress
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index e8c85c6..91aeabf 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1875,44 +1875,6 @@ hns3_tx_free_useless_buffer(struct hns3_tx_queue *txq)
 	txq->tx_bd_ready   = tx_bd_ready;
 }
 
-static int
-hns3_tso_proc_tunnel(struct hns3_desc *desc, uint64_t ol_flags,
-		     struct rte_mbuf *rxm, uint8_t *l2_len)
-{
-	uint64_t tun_flags;
-	uint8_t ol4_len;
-	uint32_t otmp;
-
-	tun_flags = ol_flags & PKT_TX_TUNNEL_MASK;
-	if (tun_flags == 0)
-		return 0;
-
-	otmp = rte_le_to_cpu_32(desc->tx.ol_type_vlan_len_msec);
-	switch (tun_flags) {
-	case PKT_TX_TUNNEL_GENEVE:
-	case PKT_TX_TUNNEL_VXLAN:
-		*l2_len = rxm->l2_len - RTE_ETHER_VXLAN_HLEN;
-		break;
-	case PKT_TX_TUNNEL_GRE:
-		/*
-		 * OL4 header size, defined in 4 Bytes, it contains outer
-		 * L4(GRE) length and tunneling length.
-		 */
-		ol4_len = hns3_get_field(otmp, HNS3_TXD_L4LEN_M,
-					 HNS3_TXD_L4LEN_S);
-		*l2_len = rxm->l2_len - (ol4_len << HNS3_L4_LEN_UNIT);
-		break;
-	default:
-		/* For non UDP / GRE tunneling, drop the tunnel packet */
-		return -EINVAL;
-	}
-	hns3_set_field(otmp, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
-		       rxm->outer_l2_len >> HNS3_L2_LEN_UNIT);
-	desc->tx.ol_type_vlan_len_msec = rte_cpu_to_le_32(otmp);
-
-	return 0;
-}
-
 static inline bool
 hns3_pkt_is_tso(struct rte_mbuf *m)
 {
@@ -1920,31 +1882,15 @@ hns3_pkt_is_tso(struct rte_mbuf *m)
 }
 
 static void
-hns3_set_tso(struct hns3_desc *desc, uint64_t ol_flags,
-		uint32_t paylen, struct rte_mbuf *rxm)
+hns3_set_tso(struct hns3_desc *desc, uint32_t paylen, struct rte_mbuf *rxm)
 {
-	uint8_t l2_len = rxm->l2_len;
-	uint32_t tmp;
-
 	if (!hns3_pkt_is_tso(rxm))
 		return;
 
-	if (hns3_tso_proc_tunnel(desc, ol_flags, rxm, &l2_len))
-		return;
-
 	if (paylen <= rxm->tso_segsz)
 		return;
 
-	tmp = rte_le_to_cpu_32(desc->tx.type_cs_vlan_tso_len);
-	hns3_set_bit(tmp, HNS3_TXD_TSO_B, 1);
-	hns3_set_bit(tmp, HNS3_TXD_L3CS_B, 1);
-	hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S, HNS3_L4T_TCP);
-	hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-	hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-		       sizeof(struct rte_tcp_hdr) >> HNS3_L4_LEN_UNIT);
-	hns3_set_field(tmp, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
-		       l2_len >> HNS3_L2_LEN_UNIT);
-	desc->tx.type_cs_vlan_tso_len = rte_cpu_to_le_32(tmp);
+	desc->tx.type_cs_vlan_tso_len |= rte_cpu_to_le_32(BIT(HNS3_TXD_TSO_B));
 	desc->tx.mss = rte_cpu_to_le_16(rxm->tso_segsz);
 }
 
@@ -1969,7 +1915,7 @@ hns3_fill_first_desc(struct hns3_tx_queue *txq, struct hns3_desc *desc,
 			   rxm->outer_l2_len + rxm->outer_l3_len : 0;
 	paylen = rxm->pkt_len - hdr_len;
 	desc->tx.paylen = rte_cpu_to_le_32(paylen);
-	hns3_set_tso(desc, ol_flags, paylen, rxm);
+	hns3_set_tso(desc, paylen, rxm);
 
 	/*
 	 * Currently, hardware doesn't support more than two layers VLAN offload
@@ -2132,180 +2078,212 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 }
 
 static void
-hns3_parse_outer_params(uint64_t ol_flags, uint32_t *ol_type_vlan_len_msec)
+hns3_parse_outer_params(struct rte_mbuf *m, uint32_t *ol_type_vlan_len_msec)
 {
 	uint32_t tmp = *ol_type_vlan_len_msec;
+	uint64_t ol_flags = m->ol_flags;
 
 	/* (outer) IP header type */
 	if (ol_flags & PKT_TX_OUTER_IPV4) {
-		/* OL3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv4_hdr) >> HNS3_L3_LEN_UNIT);
 		if (ol_flags & PKT_TX_OUTER_IP_CKSUM)
-			hns3_set_field(tmp, HNS3_TXD_OL3T_M,
-				       HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_CSUM);
+			tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M,
+					HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_CSUM);
 		else
-			hns3_set_field(tmp, HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
-				       HNS3_OL3T_IPV4_NO_CSUM);
+			tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M,
+				HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_NO_CSUM);
 	} else if (ol_flags & PKT_TX_OUTER_IPV6) {
-		hns3_set_field(tmp, HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
-			       HNS3_OL3T_IPV6);
-		/* OL3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv6_hdr) >> HNS3_L3_LEN_UNIT);
+		tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
+					HNS3_OL3T_IPV6);
 	}
-
+	/* OL3 header size, defined in 4 bytes */
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
+				m->outer_l3_len >> HNS3_L3_LEN_UNIT);
 	*ol_type_vlan_len_msec = tmp;
 }
 
 static int
-hns3_parse_inner_params(uint64_t ol_flags, uint32_t *ol_type_vlan_len_msec,
-			struct rte_net_hdr_lens *hdr_lens)
+hns3_parse_inner_params(struct rte_mbuf *m, uint32_t *ol_type_vlan_len_msec,
+			uint32_t *type_cs_vlan_tso_len)
 {
-	uint32_t tmp = *ol_type_vlan_len_msec;
-	uint8_t l4_len;
-
-	/* OL2 header size, defined in 2 bytes */
-	hns3_set_field(tmp, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
-		       sizeof(struct rte_ether_hdr) >> HNS3_L2_LEN_UNIT);
+#define HNS3_NVGRE_HLEN 8
+	uint32_t tmp_outer = *ol_type_vlan_len_msec;
+	uint32_t tmp_inner = *type_cs_vlan_tso_len;
+	uint64_t ol_flags = m->ol_flags;
+	uint16_t inner_l2_len;
 
-	/* L4TUNT: L4 Tunneling Type */
 	switch (ol_flags & PKT_TX_TUNNEL_MASK) {
 	case PKT_TX_TUNNEL_GENEVE:
 	case PKT_TX_TUNNEL_VXLAN:
-		/* MAC in UDP tunnelling packet, include VxLAN */
-		hns3_set_field(tmp, HNS3_TXD_TUNTYPE_M, HNS3_TXD_TUNTYPE_S,
-			       HNS3_TUN_MAC_IN_UDP);
+		/* MAC in UDP tunnelling packet, include VxLAN and GENEVE */
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_TUNTYPE_M,
+				HNS3_TXD_TUNTYPE_S, HNS3_TUN_MAC_IN_UDP);
 		/*
-		 * OL4 header size, defined in 4 Bytes, it contains outer
-		 * L4(UDP) length and tunneling length.
+		 * The inner l2 length of mbuf is the sum of outer l4 length,
+		 * tunneling header length and inner l2 length for a tunnel
+		 * packect. But in hns3 tx descriptor, the tunneling header
+		 * length is contained in the field of outer L4 length.
+		 * Therefore, driver need to calculate the outer L4 length and
+		 * inner L2 length.
 		 */
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       (uint8_t)RTE_ETHER_VXLAN_HLEN >>
-			       HNS3_L4_LEN_UNIT);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_L4LEN_M,
+						HNS3_TXD_L4LEN_S,
+						(uint8_t)RTE_ETHER_VXLAN_HLEN >>
+						HNS3_L4_LEN_UNIT);
+
+		inner_l2_len = m->l2_len - RTE_ETHER_VXLAN_HLEN;
 		break;
 	case PKT_TX_TUNNEL_GRE:
-		hns3_set_field(tmp, HNS3_TXD_TUNTYPE_M, HNS3_TXD_TUNTYPE_S,
-			       HNS3_TUN_NVGRE);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_TUNTYPE_M,
+					HNS3_TXD_TUNTYPE_S, HNS3_TUN_NVGRE);
 		/*
-		 * OL4 header size, defined in 4 Bytes, it contains outer
-		 * L4(GRE) length and tunneling length.
+		 * For NVGRE tunnel packect, the outer L4 is empty. So only
+		 * fill the NVGRE header length to the outer L4 field.
 		 */
-		l4_len = hdr_lens->l4_len + hdr_lens->tunnel_len;
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       l4_len >> HNS3_L4_LEN_UNIT);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_L4LEN_M,
+				HNS3_TXD_L4LEN_S,
+				(uint8_t)HNS3_NVGRE_HLEN >> HNS3_L4_LEN_UNIT);
+
+		inner_l2_len = m->l2_len - HNS3_NVGRE_HLEN;
 		break;
 	default:
 		/* For non UDP / GRE tunneling, drop the tunnel packet */
 		return -EINVAL;
 	}
 
-	*ol_type_vlan_len_msec = tmp;
+	tmp_inner |= hns3_gen_field_val(HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
+					inner_l2_len >> HNS3_L2_LEN_UNIT);
+	/* OL2 header size, defined in 2 bytes */
+	tmp_outer |= hns3_gen_field_val(HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
+					m->outer_l2_len >> HNS3_L2_LEN_UNIT);
+
+	*type_cs_vlan_tso_len = tmp_inner;
+	*ol_type_vlan_len_msec = tmp_outer;
 
 	return 0;
 }
 
 static int
-hns3_parse_tunneling_params(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-			    uint64_t ol_flags,
-			    struct rte_net_hdr_lens *hdr_lens)
+hns3_parse_tunneling_params(struct hns3_tx_queue *txq, struct rte_mbuf *m,
+			    uint16_t tx_desc_id)
 {
 	struct hns3_desc *tx_ring = txq->tx_ring;
 	struct hns3_desc *desc = &tx_ring[tx_desc_id];
-	uint32_t value = 0;
+	uint32_t tmp_outer = 0;
+	uint32_t tmp_inner = 0;
 	int ret;
 
-	hns3_parse_outer_params(ol_flags, &value);
-	ret = hns3_parse_inner_params(ol_flags, &value, hdr_lens);
-	if (ret)
-		return -EINVAL;
+	/*
+	 * The tunnel header is contained in the inner L2 header field of the
+	 * mbuf, but for hns3 descriptor, it is contained in the outer L4. So,
+	 * there is a need that switching between them. To avoid multiple
+	 * calculations, the length of the L2 header include the outer and
+	 * inner, will be filled during the parsing of tunnel packects.
+	 */
+	if (!(m->ol_flags & PKT_TX_TUNNEL_MASK)) {
+		/*
+		 * For non tunnel type the tunnel type id is 0, so no need to
+		 * assign a value to it. Only the inner(normal) L2 header length
+		 * is assigned.
+		 */
+		tmp_inner |= hns3_gen_field_val(HNS3_TXD_L2LEN_M,
+			       HNS3_TXD_L2LEN_S, m->l2_len >> HNS3_L2_LEN_UNIT);
+	} else {
+		/*
+		 * If outer csum is not offload, the outer length may be filled
+		 * with 0. And the length of the outer header is added to the
+		 * inner l2_len. It would lead a cksum error. So driver has to
+		 * calculate the header length.
+		 */
+		if (unlikely(!(m->ol_flags & PKT_TX_OUTER_IP_CKSUM) &&
+					m->outer_l2_len == 0)) {
+			struct rte_net_hdr_lens hdr_len;
+			(void)rte_net_get_ptype(m, &hdr_len,
+					RTE_PTYPE_L2_MASK | RTE_PTYPE_L3_MASK);
+			m->outer_l3_len = hdr_len.l3_len;
+			m->outer_l2_len = hdr_len.l2_len;
+			m->l2_len = m->l2_len - hdr_len.l2_len - hdr_len.l3_len;
+		}
+		hns3_parse_outer_params(m, &tmp_outer);
+		ret = hns3_parse_inner_params(m, &tmp_outer, &tmp_inner);
+		if (ret)
+			return -EINVAL;
+	}
 
-	desc->tx.ol_type_vlan_len_msec |= rte_cpu_to_le_32(value);
+	desc->tx.ol_type_vlan_len_msec = rte_cpu_to_le_32(tmp_outer);
+	desc->tx.type_cs_vlan_tso_len = rte_cpu_to_le_32(tmp_inner);
 
 	return 0;
 }
 
 static void
-hns3_parse_l3_cksum_params(uint64_t ol_flags, uint32_t *type_cs_vlan_tso_len)
+hns3_parse_l3_cksum_params(struct rte_mbuf *m, uint32_t *type_cs_vlan_tso_len)
 {
+	uint64_t ol_flags = m->ol_flags;
+	uint32_t l3_type;
 	uint32_t tmp;
 
+	tmp = *type_cs_vlan_tso_len;
+	if (ol_flags & PKT_TX_IPV4)
+		l3_type = HNS3_L3T_IPV4;
+	else if (ol_flags & PKT_TX_IPV6)
+		l3_type = HNS3_L3T_IPV6;
+	else
+		l3_type = HNS3_L3T_NONE;
+
+	/* inner(/normal) L3 header size, defined in 4 bytes */
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
+					m->l3_len >> HNS3_L3_LEN_UNIT);
+
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3T_M, HNS3_TXD_L3T_S, l3_type);
+
 	/* Enable L3 checksum offloads */
-	if (ol_flags & PKT_TX_IPV4) {
-		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L3T_M, HNS3_TXD_L3T_S,
-			       HNS3_L3T_IPV4);
-		/* inner(/normal) L3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv4_hdr) >> HNS3_L3_LEN_UNIT);
-		if (ol_flags & PKT_TX_IP_CKSUM)
-			hns3_set_bit(tmp, HNS3_TXD_L3CS_B, 1);
-		*type_cs_vlan_tso_len = tmp;
-	} else if (ol_flags & PKT_TX_IPV6) {
-		tmp = *type_cs_vlan_tso_len;
-		/* L3T, IPv6 don't do checksum */
-		hns3_set_field(tmp, HNS3_TXD_L3T_M, HNS3_TXD_L3T_S,
-			       HNS3_L3T_IPV6);
-		/* inner(/normal) L3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv6_hdr) >> HNS3_L3_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
-	}
+	if (ol_flags & PKT_TX_IP_CKSUM)
+		tmp |= BIT(HNS3_TXD_L3CS_B);
+	*type_cs_vlan_tso_len = tmp;
 }
 
 static void
-hns3_parse_l4_cksum_params(uint64_t ol_flags, uint32_t *type_cs_vlan_tso_len)
+hns3_parse_l4_cksum_params(struct rte_mbuf *m, uint32_t *type_cs_vlan_tso_len)
 {
+	uint64_t ol_flags = m->ol_flags;
 	uint32_t tmp;
-
 	/* Enable L4 checksum offloads */
-	switch (ol_flags & PKT_TX_L4_MASK) {
+	switch (ol_flags & (PKT_TX_L4_MASK | PKT_TX_TCP_SEG)) {
 	case PKT_TX_TCP_CKSUM:
+	case PKT_TX_TCP_SEG:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_TCP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_tcp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_TCP);
 		break;
 	case PKT_TX_UDP_CKSUM:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_UDP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_udp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_UDP);
 		break;
 	case PKT_TX_SCTP_CKSUM:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_SCTP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_sctp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_SCTP);
 		break;
 	default:
-		break;
+		return;
 	}
+	tmp |= BIT(HNS3_TXD_L4CS_B);
+	tmp |= hns3_gen_field_val(HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
+					m->l4_len >> HNS3_L4_LEN_UNIT);
+	*type_cs_vlan_tso_len = tmp;
 }
 
 static void
-hns3_txd_enable_checksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-			 uint64_t ol_flags)
+hns3_txd_enable_checksum(struct hns3_tx_queue *txq, struct rte_mbuf *m,
+			 uint16_t tx_desc_id)
 {
 	struct hns3_desc *tx_ring = txq->tx_ring;
 	struct hns3_desc *desc = &tx_ring[tx_desc_id];
 	uint32_t value = 0;
 
-	/* inner(/normal) L2 header size, defined in 2 bytes */
-	hns3_set_field(value, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
-		       sizeof(struct rte_ether_hdr) >> HNS3_L2_LEN_UNIT);
-
-	hns3_parse_l3_cksum_params(ol_flags, &value);
-	hns3_parse_l4_cksum_params(ol_flags, &value);
+	hns3_parse_l3_cksum_params(m, &value);
+	hns3_parse_l4_cksum_params(m, &value);
 
 	desc->tx.type_cs_vlan_tso_len |= rte_cpu_to_le_32(value);
 }
@@ -2476,18 +2454,23 @@ hns3_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
 
 static int
 hns3_parse_cksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-		 const struct rte_mbuf *m, struct rte_net_hdr_lens *hdr_lens)
+		 struct rte_mbuf *m)
 {
-	/* Fill in tunneling parameters if necessary */
-	if (m->ol_flags & PKT_TX_TUNNEL_MASK) {
-		(void)rte_net_get_ptype(m, hdr_lens, RTE_PTYPE_ALL_MASK);
-		if (hns3_parse_tunneling_params(txq, tx_desc_id, m->ol_flags,
-						hdr_lens))
+	struct hns3_desc *tx_ring = txq->tx_ring;
+	struct hns3_desc *desc = &tx_ring[tx_desc_id];
+
+	/* Enable checksum offloading */
+	if (m->ol_flags & HNS3_TX_CKSUM_OFFLOAD_MASK) {
+		/* Fill in tunneling parameters if necessary */
+		if (hns3_parse_tunneling_params(txq, m, tx_desc_id))
 			return -EINVAL;
+
+		hns3_txd_enable_checksum(txq, m, tx_desc_id);
+	} else {
+		/* clear the control bit */
+		desc->tx.type_cs_vlan_tso_len  = 0;
+		desc->tx.ol_type_vlan_len_msec = 0;
 	}
-	/* Enable checksum offloading */
-	if (m->ol_flags & HNS3_TX_CKSUM_OFFLOAD_MASK)
-		hns3_txd_enable_checksum(txq, tx_desc_id, m->ol_flags);
 
 	return 0;
 }
@@ -2522,7 +2505,6 @@ hns3_check_non_tso_pkt(uint16_t nb_buf, struct rte_mbuf **m_seg,
 uint16_t
 hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
-	struct rte_net_hdr_lens hdr_lens = {0};
 	struct hns3_tx_queue *txq = tx_queue;
 	struct hns3_entry *tx_bak_pkt;
 	struct hns3_desc *tx_ring;
@@ -2581,7 +2563,7 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		if (hns3_check_non_tso_pkt(nb_buf, &m_seg, tx_pkt, txq))
 			goto end_of_tx;
 
-		if (hns3_parse_cksum(txq, tx_next_use, m_seg, &hdr_lens))
+		if (hns3_parse_cksum(txq, tx_next_use, m_seg))
 			goto end_of_tx;
 
 		i = 0;
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index fd4f419..82689fb 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -305,14 +305,10 @@ struct hns3_queue_info {
 };
 
 #define HNS3_TX_CKSUM_OFFLOAD_MASK ( \
-	PKT_TX_OUTER_IPV6 | \
-	PKT_TX_OUTER_IPV4 | \
 	PKT_TX_OUTER_IP_CKSUM | \
-	PKT_TX_IPV6 | \
-	PKT_TX_IPV4 | \
 	PKT_TX_IP_CKSUM | \
-	PKT_TX_L4_MASK | \
-	PKT_TX_TUNNEL_MASK)
+	PKT_TX_TCP_SEG | \
+	PKT_TX_L4_MASK)
 
 enum hns3_cksum_status {
 	HNS3_CKSUM_NONE = 0,
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (12 preceding siblings ...)
  2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 13/13] net/hns3: fix Tx checksum with fixed header length Lijun Ou
@ 2020-11-23 15:52 ` Luca Boccassi
  2020-11-24 14:27   ` oulijun
  2020-11-25  3:24 ` [dpdk-stable] [PATCH v2 19.11.6 0/7] " Lijun Ou
  14 siblings, 1 reply; 42+ messages in thread
From: Luca Boccassi @ 2020-11-23 15:52 UTC (permalink / raw)
  To: Lijun Ou, stable; +Cc: linuxarm

On Fri, 2020-11-13 at 21:36 +0800, Lijun Ou wrote:
> Hi, Luca Boccassi
> His series are backport for 19.11.6 about hns3 PMD driver
> I also noticed that you gave Hu Wei a suggestion on the 19.11.4
> backport. You did not recommend adding new features. I reselected
> the TSO and some performance optimization patch request backports.
> The reason I do this is that I think TSO is also a performance
> optimization point, including other fixes. In addition, if TSO
> is not integrated, the bug fixes of the cksum may not be integrated,
> which will cause the stability of the cksum function.
> 
> Chengchang Tang (5):
>   net/hns3: support promiscuous and allmulticast mode for VF
>   net/hns3: decrease non-nearby memory access in Rx
>   net/hns3: cleanup duplicated code on processing TSO in Tx
>   net/hns3: fix Tx checksum outer header prepare
>   net/hns3: fix Tx checksum with fixed header length
> 
> Hongbo Zheng (1):
>   net/hns3: check TSO segment size during Tx
> 
> Lijun Ou (2):
>   net/hns3: support TSO
>   net/hns3: report Tx descriptor segment limitations
> 
> Wei Hu (Xavier) (5):
>   net/hns3: fix reassembling multiple segment packets in Tx
>   net/hns3: fix inserted VLAN tag position in Tx
>   net/hns3: report Rx drop packets enable configuration
>   net/hns3: report Rx free threshold
>   net/hns3: reduce address calculation in Rx
> 
>  doc/guides/nics/features/hns3.ini    |   1 +
>  doc/guides/nics/features/hns3_vf.ini |   3 +
>  doc/guides/nics/hns3.rst             |   1 +
>  drivers/net/hns3/hns3_ethdev.c       |  38 +-
>  drivers/net/hns3/hns3_ethdev.h       |  37 +-
>  drivers/net/hns3/hns3_ethdev_vf.c    | 134 ++++++-
>  drivers/net/hns3/hns3_mbx.c          |  23 ++
>  drivers/net/hns3/hns3_mbx.h          |   2 +
>  drivers/net/hns3/hns3_rxtx.c         | 666 ++++++++++++++++++++++++-----------
>  drivers/net/hns3/hns3_rxtx.h         |  31 +-
>  10 files changed, 720 insertions(+), 216 deletions(-)

Hi,

I discussed this proposal with other maintainers, and I'm afraid the
original answer still stands - we feel backporting an entire new
offload is too much for the scope of an LTS release.

While sometimes we do allow for some featurettes to be backported, in
the form of a new #define or such small changes, there was consensus
that a 936 lines diff is too much churn for what is expected from a
stable release, which is stability. The risk of introducing new bugs,
especially in face of the fact that we never had regression tests
coverage for the hns3 PMD for LTSes before, is too high.

The 20.05/20.08 releases are ABI compatible with 19.11, so if there's
an urgent need for this feature in users of the hns3 PMD, one of those
can be used instead. Some other companies also maintain downstream
trees of the LTSes with invasive changes to their PMDs, that are deemed
too risky for the upstream tree, so that's another option available to
you.

Sorry!

-- 
Kind regards,
Luca Boccassi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6
  2020-11-23 15:52 ` [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Luca Boccassi
@ 2020-11-24 14:27   ` oulijun
  2020-11-24 15:35     ` Luca Boccassi
  0 siblings, 1 reply; 42+ messages in thread
From: oulijun @ 2020-11-24 14:27 UTC (permalink / raw)
  To: Luca Boccassi, stable; +Cc: linuxarm



在 2020/11/23 23:52, Luca Boccassi 写道:
> On Fri, 2020-11-13 at 21:36 +0800, Lijun Ou wrote:
>> Hi, Luca Boccassi
>> His series are backport for 19.11.6 about hns3 PMD driver
>> I also noticed that you gave Hu Wei a suggestion on the 19.11.4
>> backport. You did not recommend adding new features. I reselected
>> the TSO and some performance optimization patch request backports.
>> The reason I do this is that I think TSO is also a performance
>> optimization point, including other fixes. In addition, if TSO
>> is not integrated, the bug fixes of the cksum may not be integrated,
>> which will cause the stability of the cksum function.
>>
>> Chengchang Tang (5):
>>    net/hns3: support promiscuous and allmulticast mode for VF
>>    net/hns3: decrease non-nearby memory access in Rx
>>    net/hns3: cleanup duplicated code on processing TSO in Tx
>>    net/hns3: fix Tx checksum outer header prepare
>>    net/hns3: fix Tx checksum with fixed header length
>>
>> Hongbo Zheng (1):
>>    net/hns3: check TSO segment size during Tx
>>
>> Lijun Ou (2):
>>    net/hns3: support TSO
>>    net/hns3: report Tx descriptor segment limitations
>>
>> Wei Hu (Xavier) (5):
>>    net/hns3: fix reassembling multiple segment packets in Tx
>>    net/hns3: fix inserted VLAN tag position in Tx
>>    net/hns3: report Rx drop packets enable configuration
>>    net/hns3: report Rx free threshold
>>    net/hns3: reduce address calculation in Rx
>>
>>   doc/guides/nics/features/hns3.ini    |   1 +
>>   doc/guides/nics/features/hns3_vf.ini |   3 +
>>   doc/guides/nics/hns3.rst             |   1 +
>>   drivers/net/hns3/hns3_ethdev.c       |  38 +-
>>   drivers/net/hns3/hns3_ethdev.h       |  37 +-
>>   drivers/net/hns3/hns3_ethdev_vf.c    | 134 ++++++-
>>   drivers/net/hns3/hns3_mbx.c          |  23 ++
>>   drivers/net/hns3/hns3_mbx.h          |   2 +
>>   drivers/net/hns3/hns3_rxtx.c         | 666 ++++++++++++++++++++++++-----------
>>   drivers/net/hns3/hns3_rxtx.h         |  31 +-
>>   10 files changed, 720 insertions(+), 216 deletions(-)
> 
> Hi,
> 
> I discussed this proposal with other maintainers, and I'm afraid the
> original answer still stands - we feel backporting an entire new
> offload is too much for the scope of an LTS release.
> 
> While sometimes we do allow for some featurettes to be backported, in
> the form of a new #define or such small changes, there was consensus
> that a 936 lines diff is too much churn for what is expected from a
> stable release, which is stability. The risk of introducing new bugs,
> especially in face of the fact that we never had regression tests
> coverage for the hns3 PMD for LTSes before, is too high.
> 
> The 20.05/20.08 releases are ABI compatible with 19.11, so if there's
> an urgent need for this feature in users of the hns3 PMD, one of those
> can be used instead. Some other companies also maintain downstream
> trees of the LTSes with invasive changes to their PMDs, that are deemed
> too risky for the upstream tree, so that's another option available to
> you.
> 
> Sorry!
> 
I see. Can I re-split the bugfix of the TSO part based on the patch and 
return it to you?

Thanks
Lijun Ou

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6
  2020-11-24 14:27   ` oulijun
@ 2020-11-24 15:35     ` Luca Boccassi
  0 siblings, 0 replies; 42+ messages in thread
From: Luca Boccassi @ 2020-11-24 15:35 UTC (permalink / raw)
  To: oulijun, stable; +Cc: linuxarm

On Tue, 2020-11-24 at 22:27 +0800, oulijun wrote:
> 
> 在 2020/11/23 23:52, Luca Boccassi 写道:
> > On Fri, 2020-11-13 at 21:36 +0800, Lijun Ou wrote:
> > > Hi, Luca Boccassi
> > > His series are backport for 19.11.6 about hns3 PMD driver
> > > I also noticed that you gave Hu Wei a suggestion on the 19.11.4
> > > backport. You did not recommend adding new features. I reselected
> > > the TSO and some performance optimization patch request backports.
> > > The reason I do this is that I think TSO is also a performance
> > > optimization point, including other fixes. In addition, if TSO
> > > is not integrated, the bug fixes of the cksum may not be integrated,
> > > which will cause the stability of the cksum function.
> > > 
> > > Chengchang Tang (5):
> > >    net/hns3: support promiscuous and allmulticast mode for VF
> > >    net/hns3: decrease non-nearby memory access in Rx
> > >    net/hns3: cleanup duplicated code on processing TSO in Tx
> > >    net/hns3: fix Tx checksum outer header prepare
> > >    net/hns3: fix Tx checksum with fixed header length
> > > 
> > > Hongbo Zheng (1):
> > >    net/hns3: check TSO segment size during Tx
> > > 
> > > Lijun Ou (2):
> > >    net/hns3: support TSO
> > >    net/hns3: report Tx descriptor segment limitations
> > > 
> > > Wei Hu (Xavier) (5):
> > >    net/hns3: fix reassembling multiple segment packets in Tx
> > >    net/hns3: fix inserted VLAN tag position in Tx
> > >    net/hns3: report Rx drop packets enable configuration
> > >    net/hns3: report Rx free threshold
> > >    net/hns3: reduce address calculation in Rx
> > > 
> > >   doc/guides/nics/features/hns3.ini    |   1 +
> > >   doc/guides/nics/features/hns3_vf.ini |   3 +
> > >   doc/guides/nics/hns3.rst             |   1 +
> > >   drivers/net/hns3/hns3_ethdev.c       |  38 +-
> > >   drivers/net/hns3/hns3_ethdev.h       |  37 +-
> > >   drivers/net/hns3/hns3_ethdev_vf.c    | 134 ++++++-
> > >   drivers/net/hns3/hns3_mbx.c          |  23 ++
> > >   drivers/net/hns3/hns3_mbx.h          |   2 +
> > >   drivers/net/hns3/hns3_rxtx.c         | 666 ++++++++++++++++++++++++-----------
> > >   drivers/net/hns3/hns3_rxtx.h         |  31 +-
> > >   10 files changed, 720 insertions(+), 216 deletions(-)
> > 
> > Hi,
> > 
> > I discussed this proposal with other maintainers, and I'm afraid the
> > original answer still stands - we feel backporting an entire new
> > offload is too much for the scope of an LTS release.
> > 
> > While sometimes we do allow for some featurettes to be backported, in
> > the form of a new #define or such small changes, there was consensus
> > that a 936 lines diff is too much churn for what is expected from a
> > stable release, which is stability. The risk of introducing new bugs,
> > especially in face of the fact that we never had regression tests
> > coverage for the hns3 PMD for LTSes before, is too high.
> > 
> > The 20.05/20.08 releases are ABI compatible with 19.11, so if there's
> > an urgent need for this feature in users of the hns3 PMD, one of those
> > can be used instead. Some other companies also maintain downstream
> > trees of the LTSes with invasive changes to their PMDs, that are deemed
> > too risky for the upstream tree, so that's another option available to
> > you.
> > 
> > Sorry!
> > 
> I see. Can I re-split the bugfix of the TSO part based on the patch and 
> return it to you?
> 
> Thanks
> Lijun Ou

Yes, certainly, individual bug fixes are always welcome.

Thanks!

-- 
Kind regards,
Luca Boccassi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v2 19.11.6 0/7] backport for 19.11.6
  2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
                   ` (13 preceding siblings ...)
  2020-11-23 15:52 ` [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Luca Boccassi
@ 2020-11-25  3:24 ` Lijun Ou
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 1/7] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
                     ` (7 more replies)
  14 siblings, 8 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25  3:24 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

His series are backport for 19.11.6 about hns3 PMD driver

Change from V1:
1. split TSO part from fix TX checksum with fix header length
2. remove TSO patch and the relatived patch

Chengchang Tang (2):
  net/hns3: decrease non-nearby memory access in Rx
  net/hns3: fix TX checksum with fix header length

Lijun Ou (1):
  net/hns3: report Tx descriptor segment limitations

Wei Hu (Xavier) (4):
  net/hns3: fix reassembling multiple segment packets in Tx
  net/hns3: report Rx drop packets enable configuration
  net/hns3: report Rx free threshold
  net/hns3: reduce address calculation in Rx

 drivers/net/hns3/hns3_ethdev.c    |  34 +++-
 drivers/net/hns3/hns3_ethdev.h    |  31 +++-
 drivers/net/hns3/hns3_ethdev_vf.c |  13 ++
 drivers/net/hns3/hns3_rxtx.c      | 363 +++++++++++++++++++++++---------------
 drivers/net/hns3/hns3_rxtx.h      |  30 +++-
 5 files changed, 312 insertions(+), 159 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v2 19.11.6 1/7] net/hns3: fix reassembling multiple segment packets in Tx
  2020-11-25  3:24 ` [dpdk-stable] [PATCH v2 19.11.6 0/7] " Lijun Ou
@ 2020-11-25  3:24   ` Lijun Ou
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 2/7] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25  3:24 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit 6c44219f9907a064049d160d366d98bb840724a6 ]

Because of the hardware constraints, hns3 network engine doesn't support
sending packets with more than eight fragments. And hns3 pmd driver
tries to reassemble these kind of packets to meet hardware requirements.
Currently, there are two problems:
1) when the input buffer_len * 8 < pkt_len, the packets are impossible
   to be reassembled into 8 Buffer Descriptors. In this case, the
   packets will be passed to hardware, which eventually causes a
   hardware reset.
2) The meta data in origin packets which are required to fill into the
   descriptor haven't been copied into the reassembled pkts.

This patch adds a check for 1) to ensure such packets will be dropped by
driver and copies useful meta data from the origin packets to the
reassembled packets.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 5aee8cb..ffd8417 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1940,6 +1940,20 @@ hns3_tx_alloc_mbufs(struct hns3_tx_queue *txq, struct rte_mempool *mb_pool,
 	return 0;
 }
 
+static inline void
+hns3_pktmbuf_copy_hdr(struct rte_mbuf *new_pkt, struct rte_mbuf *old_pkt)
+{
+	new_pkt->ol_flags = old_pkt->ol_flags;
+	new_pkt->pkt_len = rte_pktmbuf_pkt_len(old_pkt);
+	new_pkt->outer_l2_len = old_pkt->outer_l2_len;
+	new_pkt->outer_l3_len = old_pkt->outer_l3_len;
+	new_pkt->l2_len = old_pkt->l2_len;
+	new_pkt->l3_len = old_pkt->l3_len;
+	new_pkt->l4_len = old_pkt->l4_len;
+	new_pkt->vlan_tci_outer = old_pkt->vlan_tci_outer;
+	new_pkt->vlan_tci = old_pkt->vlan_tci;
+}
+
 static int
 hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 			struct rte_mbuf **new_pkt)
@@ -1963,9 +1977,11 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 
 	mb_pool = tx_pkt->pool;
 	buf_size = tx_pkt->buf_len - RTE_PKTMBUF_HEADROOM;
-	nb_new_buf = (tx_pkt->pkt_len - 1) / buf_size + 1;
+	nb_new_buf = (rte_pktmbuf_pkt_len(tx_pkt) - 1) / buf_size + 1;
+	if (nb_new_buf > HNS3_MAX_NON_TSO_BD_PER_PKT)
+		return -EINVAL;
 
-	last_buf_len = tx_pkt->pkt_len % buf_size;
+	last_buf_len = rte_pktmbuf_pkt_len(tx_pkt) % buf_size;
 	if (last_buf_len == 0)
 		last_buf_len = buf_size;
 
@@ -1977,7 +1993,7 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 	/* Copy the original packet content to the new mbufs */
 	temp = tx_pkt;
 	s = rte_pktmbuf_mtod(temp, char *);
-	len_s = temp->data_len;
+	len_s = rte_pktmbuf_data_len(temp);
 	temp_new = new_mbuf;
 	for (i = 0; i < nb_new_buf; i++) {
 		d = rte_pktmbuf_mtod(temp_new, char *);
@@ -2000,13 +2016,14 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 				if (temp == NULL)
 					break;
 				s = rte_pktmbuf_mtod(temp, char *);
-				len_s = temp->data_len;
+				len_s = rte_pktmbuf_data_len(temp);
 			}
 		}
 
 		temp_new->data_len = buf_len;
 		temp_new = temp_new->next;
 	}
+	hns3_pktmbuf_copy_hdr(new_mbuf, tx_pkt);
 
 	/* free original mbufs */
 	rte_pktmbuf_free(tx_pkt);
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v2 19.11.6 2/7] net/hns3: decrease non-nearby memory access in Rx
  2020-11-25  3:24 ` [dpdk-stable] [PATCH v2 19.11.6 0/7] " Lijun Ou
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 1/7] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
@ 2020-11-25  3:24   ` Lijun Ou
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 3/7] net/hns3: report Tx descriptor segment limitations Lijun Ou
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25  3:24 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: Chengchang Tang <tangchengchang@huawei.com>

[ upstream commit 8c7449779c4526d27205043560a21f0e2c2f622b ]

Currently, hns3 PMD driver needs know the PVID configuration state and
do different processing in the 'rx_pkt_burst' ops implementation
function.

This patch adds a member to struct hns3_rx_queue/hns3_tx_queue of the
driver to indicate the PVID configuration status, so it isn't need
to access other data structure in the 'rx_pkt_burst' ops implementation,
to avoid performance loss because of reducing cache miss.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c | 21 ++++++++++++++++++++-
 drivers/net/hns3/hns3_rxtx.c   | 37 +++++++++++++++++++++++++++++++------
 drivers/net/hns3/hns3_rxtx.h   | 13 +++++++++++++
 3 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index f2702e0..3a92bd4 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -895,6 +895,8 @@ hns3_vlan_pvid_set(struct rte_eth_dev *dev, uint16_t pvid, int on)
 {
 	struct hns3_adapter *hns = dev->data->dev_private;
 	struct hns3_hw *hw = &hns->hw;
+	bool pvid_en_state_change;
+	uint16_t pvid_state;
 	int ret;
 
 	if (pvid > RTE_ETHER_MAX_VLAN_ID) {
@@ -903,10 +905,27 @@ hns3_vlan_pvid_set(struct rte_eth_dev *dev, uint16_t pvid, int on)
 		return -EINVAL;
 	}
 
+	/*
+	 * If PVID configuration state change, should refresh the PVID
+	 * configuration state in struct hns3_tx_queue/hns3_rx_queue.
+	 */
+	pvid_state = hw->port_base_vlan_cfg.state;
+	if ((on && pvid_state == HNS3_PORT_BASE_VLAN_ENABLE) ||
+	    (!on && pvid_state == HNS3_PORT_BASE_VLAN_DISABLE))
+		pvid_en_state_change = false;
+	else
+		pvid_en_state_change = true;
+
 	rte_spinlock_lock(&hw->lock);
 	ret = hns3_vlan_pvid_configure(hns, pvid, on);
 	rte_spinlock_unlock(&hw->lock);
-	return ret;
+	if (ret)
+		return ret;
+
+	if (pvid_en_state_change)
+		hns3_update_all_queues_pvid_state(hw);
+
+	return 0;
 }
 
 static int
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index ffd8417..53b98cd 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -316,6 +316,31 @@ hns3_init_tx_queue_hw(struct hns3_tx_queue *txq)
 }
 
 void
+hns3_update_all_queues_pvid_state(struct hns3_hw *hw)
+{
+	uint16_t nb_rx_q = hw->data->nb_rx_queues;
+	uint16_t nb_tx_q = hw->data->nb_tx_queues;
+	struct hns3_rx_queue *rxq;
+	struct hns3_tx_queue *txq;
+	int pvid_state;
+	int i;
+
+	pvid_state = hw->port_base_vlan_cfg.state;
+	for (i = 0; i < hw->cfg_max_queues; i++) {
+		if (i < nb_rx_q) {
+			rxq = hw->data->rx_queues[i];
+			if (rxq != NULL)
+				rxq->pvid_state = pvid_state;
+		}
+		if (i < nb_tx_q) {
+			txq = hw->data->tx_queues[i];
+			if (txq != NULL)
+				txq->pvid_state = pvid_state;
+		}
+	}
+}
+
+void
 hns3_enable_all_queues(struct hns3_hw *hw, bool en)
 {
 	uint16_t nb_rx_q = hw->data->nb_rx_queues;
@@ -1275,6 +1300,7 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	rxq->pkt_first_seg = NULL;
 	rxq->pkt_last_seg = NULL;
 	rxq->port_id = dev->data->port_id;
+	rxq->pvid_state = hw->port_base_vlan_cfg.state;
 	rxq->configured = true;
 	rxq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
@@ -1503,7 +1529,7 @@ hns3_rx_set_cksum_flag(struct rte_mbuf *rxm, uint64_t packet_type,
 }
 
 static inline void
-hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
+hns3_rxd_to_vlan_tci(struct hns3_rx_queue *rxq, struct rte_mbuf *mb,
 		     uint32_t l234_info, const struct hns3_desc *rxd)
 {
 #define HNS3_STRP_STATUS_NUM		0x4
@@ -1511,8 +1537,6 @@ hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
 #define HNS3_NO_STRP_VLAN_VLD		0x0
 #define HNS3_INNER_STRP_VLAN_VLD	0x1
 #define HNS3_OUTER_STRP_VLAN_VLD	0x2
-	struct hns3_adapter *hns = dev->data->dev_private;
-	struct hns3_hw *hw = &hns->hw;
 	uint32_t strip_status;
 	uint32_t report_mode;
 
@@ -1538,7 +1562,7 @@ hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
 	};
 	strip_status = hns3_get_field(l234_info, HNS3_RXD_STRP_TAGP_M,
 				      HNS3_RXD_STRP_TAGP_S);
-	report_mode = report_type[hw->port_base_vlan_cfg.state][strip_status];
+	report_mode = report_type[rxq->pvid_state][strip_status];
 	switch (report_mode) {
 	case HNS3_NO_STRP_VLAN_VLD:
 		mb->vlan_tci = 0;
@@ -1583,7 +1607,6 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	nb_rx = 0;
 	nb_rx_bd = 0;
 	rxq = rx_queue;
-	dev = &rte_eth_devices[rxq->port_id];
 
 	rx_id = rxq->next_to_clean;
 	rx_ring = rxq->rx_ring;
@@ -1660,6 +1683,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 
 		nmb = rte_mbuf_raw_alloc(rxq->mb_pool);
 		if (unlikely(nmb == NULL)) {
+			dev = &rte_eth_devices[rxq->port_id];
 			dev->data->rx_mbuf_alloc_failed++;
 			break;
 		}
@@ -1729,7 +1753,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 			hns3_rx_set_cksum_flag(first_seg,
 					       first_seg->packet_type,
 					       cksum_err);
-		hns3_rxd_to_vlan_tci(dev, first_seg, l234_info, &rxd);
+		hns3_rxd_to_vlan_tci(rxq, first_seg, l234_info, &rxd);
 
 		rx_pkts[nb_rx++] = first_seg;
 		first_seg = NULL;
@@ -1807,6 +1831,7 @@ hns3_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	txq->next_to_clean = 0;
 	txq->tx_bd_ready = txq->nb_tx_desc - 1;
 	txq->port_id = dev->data->port_id;
+	txq->pvid_state = hw->port_base_vlan_cfg.state;
 	txq->configured = true;
 	txq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index 1fd1afd..a416d10 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -250,6 +250,12 @@ struct hns3_rx_queue {
 	uint16_t rx_buf_len;
 	uint16_t rx_free_thresh;
 
+	/*
+	 * port based vlan configuration state.
+	 * value range: HNS3_PORT_BASE_VLAN_DISABLE / HNS3_PORT_BASE_VLAN_ENABLE
+	 */
+	uint16_t pvid_state;
+
 	bool rx_deferred_start; /* don't start this queue in dev start */
 	bool configured;        /* indicate if rx queue has been configured */
 
@@ -276,6 +282,12 @@ struct hns3_tx_queue {
 	uint16_t next_to_use;
 	uint16_t tx_bd_ready;
 
+	/*
+	 * port based vlan configuration state.
+	 * value range: HNS3_PORT_BASE_VLAN_DISABLE / HNS3_PORT_BASE_VLAN_ENABLE
+	 */
+	uint16_t pvid_state;
+
 	bool tx_deferred_start; /* don't start this queue in dev start */
 	bool configured;        /* indicate if tx queue has been configured */
 };
@@ -336,5 +348,6 @@ void hns3_set_queue_intr_rl(struct hns3_hw *hw, uint16_t queue_id,
 			    uint16_t rl_value);
 int hns3_set_fake_rx_or_tx_queues(struct rte_eth_dev *dev, uint16_t nb_rx_q,
 				  uint16_t nb_tx_q);
+void hns3_update_all_queues_pvid_state(struct hns3_hw *hw);
 
 #endif /* _HNS3_RXTX_H_ */
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v2 19.11.6 3/7] net/hns3: report Tx descriptor segment limitations
  2020-11-25  3:24 ` [dpdk-stable] [PATCH v2 19.11.6 0/7] " Lijun Ou
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 1/7] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 2/7] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
@ 2020-11-25  3:24   ` Lijun Ou
  2020-11-25  9:16     ` Luca Boccassi
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 4/7] net/hns3: report Rx drop packets enable configuration Lijun Ou
                     ` (4 subsequent siblings)
  7 siblings, 1 reply; 42+ messages in thread
From: Lijun Ou @ 2020-11-25  3:24 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

[ upstream commit eb8b3a0d829bc109dfb13de9c8cdeacda5449e69 ]

According to the user manual of Kunpeng920 SoC, the max allowed number
of segments per whole packet is 63 and the max number of segments per
packet is 8 in datapath.

This patch reports the Two segment parameters of Tx descriptor
limitations to DPDK framework.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c    | 2 ++
 drivers/net/hns3/hns3_ethdev_vf.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index 3a92bd4..7fd0f7f 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2499,6 +2499,8 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_max = HNS3_MAX_RING_DESC,
 		.nb_min = HNS3_MIN_RING_DESC,
 		.nb_align = HNS3_ALIGN_RING_DESC,
+		.nb_seg_max = HNS3_MAX_TSO_BD_PER_PKT,
+		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
 	};
 
 	info->vmdq_queue_num = 0;
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index 4359b2e..0b5d3d4 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -866,6 +866,8 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_max = HNS3_MAX_RING_DESC,
 		.nb_min = HNS3_MIN_RING_DESC,
 		.nb_align = HNS3_ALIGN_RING_DESC,
+		.nb_seg_max = HNS3_MAX_TSO_BD_PER_PKT,
+		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
 	};
 
 	info->vmdq_queue_num = 0;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v2 19.11.6 4/7] net/hns3: report Rx drop packets enable configuration
  2020-11-25  3:24 ` [dpdk-stable] [PATCH v2 19.11.6 0/7] " Lijun Ou
                     ` (2 preceding siblings ...)
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 3/7] net/hns3: report Tx descriptor segment limitations Lijun Ou
@ 2020-11-25  3:24   ` Lijun Ou
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 5/7] net/hns3: report Rx free threshold Lijun Ou
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25  3:24 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit a02f1461c75b4be31d4a37ea3fae95aee86cdff2 ]

Currently, if there are not available Rx buffer descriptors in receiving
direction based on hns3 network engine, incoming packets will always be
dropped by hardware. This patch reports the '.rx_drop_en' information to
DPDK framework in the '.dev_infos_get', '.rxq_info_get' and
'.rx_queue_setup' ops implementation function.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c    | 9 +++++++++
 drivers/net/hns3/hns3_ethdev_vf.c | 9 +++++++++
 drivers/net/hns3/hns3_rxtx.c      | 6 ++++++
 3 files changed, 24 insertions(+)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index 7fd0f7f..deeb045 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2503,6 +2503,15 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
 	};
 
+	info->default_rxconf = (struct rte_eth_rxconf) {
+		/*
+		 * If there are no available Rx buffer descriptors, incoming
+		 * packets are always dropped by hardware based on hns3 network
+		 * engine.
+		 */
+		.rx_drop_en = 1,
+	};
+
 	info->vmdq_queue_num = 0;
 
 	info->reta_size = HNS3_RSS_IND_TBL_SIZE;
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index 0b5d3d4..f2fd029 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -870,6 +870,15 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
 	};
 
+	info->default_rxconf = (struct rte_eth_rxconf) {
+		/*
+		 * If there are no available Rx buffer descriptors, incoming
+		 * packets are always dropped by hardware based on hns3 network
+		 * engine.
+		 */
+		.rx_drop_en = 1,
+	};
+
 	info->vmdq_queue_num = 0;
 
 	info->reta_size = HNS3_RSS_IND_TBL_SIZE;
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 53b98cd..13d5560 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1251,6 +1251,12 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 		return -EINVAL;
 	}
 
+	if (conf->rx_drop_en == 0)
+		hns3_warn(hw, "if there are no available Rx descriptors,"
+			  "incoming packets are always dropped. input parameter"
+			  " conf->rx_drop_en(%u) is uneffective.",
+			  conf->rx_drop_en);
+
 	if (dev->data->rx_queues[idx]) {
 		hns3_rx_queue_release(dev->data->rx_queues[idx]);
 		dev->data->rx_queues[idx] = NULL;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v2 19.11.6 5/7] net/hns3: report Rx free threshold
  2020-11-25  3:24 ` [dpdk-stable] [PATCH v2 19.11.6 0/7] " Lijun Ou
                     ` (3 preceding siblings ...)
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 4/7] net/hns3: report Rx drop packets enable configuration Lijun Ou
@ 2020-11-25  3:24   ` Lijun Ou
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 6/7] net/hns3: reduce address calculation in Rx Lijun Ou
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25  3:24 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit ceabee45be4a1737994c5f4631bd001b8021475c ]

This patch reports .rx_free_thresh value in the .dev_infos_get ops
implementation function named hns3_dev_infos_get and
hns3vf_dev_infos_get.
In addition, the name of the member variable of struct hns3_rx_queue is
modified and comments are added to improve code readability.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c    |  2 ++
 drivers/net/hns3/hns3_ethdev_vf.c |  2 ++
 drivers/net/hns3/hns3_rxtx.c      | 30 ++++++++++++------------------
 drivers/net/hns3/hns3_rxtx.h      |  9 ++++++---
 4 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index deeb045..95a3f1b 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2504,12 +2504,14 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 	};
 
 	info->default_rxconf = (struct rte_eth_rxconf) {
+		.rx_free_thresh = HNS3_DEFAULT_RX_FREE_THRESH,
 		/*
 		 * If there are no available Rx buffer descriptors, incoming
 		 * packets are always dropped by hardware based on hns3 network
 		 * engine.
 		 */
 		.rx_drop_en = 1,
+		.offloads = 0,
 	};
 
 	info->vmdq_queue_num = 0;
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index f2fd029..da7ca29 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -871,12 +871,14 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 	};
 
 	info->default_rxconf = (struct rte_eth_rxconf) {
+		.rx_free_thresh = HNS3_DEFAULT_RX_FREE_THRESH,
 		/*
 		 * If there are no available Rx buffer descriptors, incoming
 		 * packets are always dropped by hardware based on hns3 network
 		 * engine.
 		 */
 		.rx_drop_en = 1,
+		.offloads = 0,
 	};
 
 	info->vmdq_queue_num = 0;
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 13d5560..41d957a 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -634,8 +634,7 @@ hns3_dev_rx_queue_start(struct hns3_adapter *hns, uint16_t idx)
 	}
 
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	hns3_init_rx_queue_hw(rxq);
 
 	return 0;
@@ -649,8 +648,7 @@ hns3_fake_rx_queue_start(struct hns3_adapter *hns, uint16_t idx)
 
 	rxq = (struct hns3_rx_queue *)hw->fkq_data.rx_queues[idx];
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	hns3_init_rx_queue_hw(rxq);
 }
 
@@ -1285,10 +1283,8 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 
 	rxq->hns = hns;
 	rxq->mb_pool = mp;
-	if (conf->rx_free_thresh <= 0)
-		rxq->rx_free_thresh = DEFAULT_RX_FREE_THRESH;
-	else
-		rxq->rx_free_thresh = conf->rx_free_thresh;
+	rxq->rx_free_thresh = (conf->rx_free_thresh > 0) ?
+		conf->rx_free_thresh : HNS3_DEFAULT_RX_FREE_THRESH;
 	rxq->rx_deferred_start = conf->rx_deferred_start;
 
 	rx_entry_len = sizeof(struct hns3_entry) * rxq->nb_rx_desc;
@@ -1301,8 +1297,7 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	}
 
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	rxq->pkt_first_seg = NULL;
 	rxq->pkt_last_seg = NULL;
 	rxq->port_id = dev->data->port_id;
@@ -1614,11 +1609,11 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	nb_rx_bd = 0;
 	rxq = rx_queue;
 
-	rx_id = rxq->next_to_clean;
+	rx_id = rxq->next_to_use;
 	rx_ring = rxq->rx_ring;
+	sw_ring = rxq->sw_ring;
 	first_seg = rxq->pkt_first_seg;
 	last_seg = rxq->pkt_last_seg;
-	sw_ring = rxq->sw_ring;
 
 	while (nb_rx < nb_pkts) {
 		rxdp = &rx_ring[rx_id];
@@ -1769,16 +1764,15 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 		first_seg = NULL;
 	}
 
-	rxq->next_to_clean = rx_id;
+	rxq->next_to_use = rx_id;
 	rxq->pkt_first_seg = first_seg;
 	rxq->pkt_last_seg = last_seg;
 
-	nb_rx_bd = nb_rx_bd + rxq->nb_rx_hold;
-	if (nb_rx_bd > rxq->rx_free_thresh) {
-		hns3_clean_rx_buffers(rxq, nb_rx_bd);
-		nb_rx_bd = 0;
+	rxq->rx_free_hold += nb_rx_bd;
+	if (rxq->rx_free_hold > rxq->rx_free_thresh) {
+		hns3_clean_rx_buffers(rxq, rxq->rx_free_hold);
+		rxq->rx_free_hold = 0;
 	}
-	rxq->nb_rx_hold = nb_rx_bd;
 
 	return nb_rx;
 }
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index a416d10..6211665 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -10,6 +10,7 @@
 #define HNS3_DEFAULT_RING_DESC  1024
 #define	HNS3_ALIGN_RING_DESC	32
 #define HNS3_RING_BASE_ALIGN	128
+#define HNS3_DEFAULT_RX_FREE_THRESH	32
 
 #define HNS3_512_BD_BUF_SIZE	512
 #define HNS3_1K_BD_BUF_SIZE	1024
@@ -243,12 +244,14 @@ struct hns3_rx_queue {
 	uint16_t queue_id;
 	uint16_t port_id;
 	uint16_t nb_rx_desc;
-	uint16_t nb_rx_hold;
-	uint16_t rx_tail;
-	uint16_t next_to_clean;
 	uint16_t next_to_use;
 	uint16_t rx_buf_len;
+	/*
+	 * threshold for the number of BDs waited to passed to hardware. If the
+	 * number exceeds the threshold, driver will pass these BDs to hardware.
+	 */
 	uint16_t rx_free_thresh;
+	uint16_t rx_free_hold;   /* num of BDs waited to passed to hardware */
 
 	/*
 	 * port based vlan configuration state.
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v2 19.11.6 6/7] net/hns3: reduce address calculation in Rx
  2020-11-25  3:24 ` [dpdk-stable] [PATCH v2 19.11.6 0/7] " Lijun Ou
                     ` (4 preceding siblings ...)
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 5/7] net/hns3: report Rx free threshold Lijun Ou
@ 2020-11-25  3:24   ` Lijun Ou
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 7/7] net/hns3: fix TX checksum with fix header length Lijun Ou
  2020-11-25 10:58   ` [dpdk-stable] [PATCH v3 19.11.6 0/6] backport for 19.11.6 Lijun Ou
  7 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25  3:24 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit 323df8941b57027752ee9d191f1ac6f359bd524e ]

This patch adds the internal function named hns3_write_reg_opt to avoid
performance loss from address calculation during register access in the
'.rx_pkt_burst' ops implementation function named hns3_recv_pkts.

In addition, because hardware always access register in little-endian
mode based on hns3 network engine, so driver should also call
rte_cpu_to_le_32 to convert data in little-endian mode before writing
register and call rte_le_to_cpu_32 to convert data after reading from
register. Here the driver encapsulates the data conversion operation in
the register read/write operation function as below:
  hns3_write_reg
  hns3_write_reg_opt
  hns3_read_reg
Therefore, when calling these functions, conversion is not required
again.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.h | 29 +++++++++++++++++++++++++++--
 drivers/net/hns3/hns3_rxtx.c   | 14 +++-----------
 drivers/net/hns3/hns3_rxtx.h   |  1 +
 3 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
index e0a04f8..ba01622 100644
--- a/drivers/net/hns3/hns3_ethdev.h
+++ b/drivers/net/hns3/hns3_ethdev.h
@@ -579,14 +579,39 @@ struct hns3_adapter {
 	type __max2 = (y);                      \
 	__max1 > __max2 ? __max1 : __max2; })
 
+/*
+ * Because hardware always access register in little-endian mode based on hns3
+ * network engine, so driver should also call rte_cpu_to_le_32 to convert data
+ * in little-endian mode before writing register and call rte_le_to_cpu_32 to
+ * convert data after reading from register.
+ *
+ * Here the driver encapsulates the data conversion operation in the register
+ * read/write operation function as below:
+ *   hns3_write_reg
+ *   hns3_write_reg_opt
+ *   hns3_read_reg
+ * Therefore, when calling these functions, conversion is not required again.
+ */
 static inline void hns3_write_reg(void *base, uint32_t reg, uint32_t value)
 {
-	rte_write32(value, (volatile void *)((char *)base + reg));
+	rte_write32(rte_cpu_to_le_32(value),
+		    (volatile void *)((char *)base + reg));
+}
+
+/*
+ * The optimized function for writing registers used in the '.rx_pkt_burst' and
+ * '.tx_pkt_burst' ops implementation function.
+ */
+static inline void hns3_write_reg_opt(volatile void *addr, uint32_t value)
+{
+	rte_io_wmb();
+	rte_write32_relaxed(rte_cpu_to_le_32(value), addr);
 }
 
 static inline uint32_t hns3_read_reg(void *base, uint32_t reg)
 {
-	return rte_read32((volatile void *)((char *)base + reg));
+	uint32_t read_val = rte_read32((volatile void *)((char *)base + reg));
+	return rte_le_to_cpu_32(read_val);
 }
 
 #define hns3_write_dev(a, reg, value) \
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 41d957a..78c73ea 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1305,6 +1305,8 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	rxq->configured = true;
 	rxq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
+	rxq->io_head_reg = (volatile void *)((char *)rxq->io_base +
+			   HNS3_RING_RX_HEAD_REG);
 	rxq->rx_buf_len = rx_buf_size;
 	rxq->l2_errors = 0;
 	rxq->pkt_len_errors = 0;
@@ -1448,16 +1450,6 @@ hns3_dev_supported_ptypes_get(struct rte_eth_dev *dev)
 	return NULL;
 }
 
-static void
-hns3_clean_rx_buffers(struct hns3_rx_queue *rxq, int count)
-{
-	rxq->next_to_use += count;
-	if (rxq->next_to_use >= rxq->nb_rx_desc)
-		rxq->next_to_use -= rxq->nb_rx_desc;
-
-	hns3_write_dev(rxq, HNS3_RING_RX_HEAD_REG, count);
-}
-
 static int
 hns3_handle_bdinfo(struct hns3_rx_queue *rxq, struct rte_mbuf *rxm,
 		   uint32_t bd_base_info, uint32_t l234_info,
@@ -1770,7 +1762,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 
 	rxq->rx_free_hold += nb_rx_bd;
 	if (rxq->rx_free_hold > rxq->rx_free_thresh) {
-		hns3_clean_rx_buffers(rxq, rxq->rx_free_hold);
+		hns3_write_reg_opt(rxq->io_head_reg, rxq->rx_free_hold);
 		rxq->rx_free_hold = 0;
 	}
 
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index 6211665..fd4f419 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -231,6 +231,7 @@ struct hns3_entry {
 
 struct hns3_rx_queue {
 	void *io_base;
+	volatile void *io_head_reg;
 	struct hns3_adapter *hns;
 	struct rte_mempool *mb_pool;
 	struct hns3_desc *rx_ring;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v2 19.11.6 7/7] net/hns3: fix TX checksum with fix header length
  2020-11-25  3:24 ` [dpdk-stable] [PATCH v2 19.11.6 0/7] " Lijun Ou
                     ` (5 preceding siblings ...)
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 6/7] net/hns3: reduce address calculation in Rx Lijun Ou
@ 2020-11-25  3:24   ` Lijun Ou
  2020-11-25 10:58   ` [dpdk-stable] [PATCH v3 19.11.6 0/6] backport for 19.11.6 Lijun Ou
  7 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25  3:24 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: Chengchang Tang <tangchengchang@huawei.com>

[ upstream commit fb6eb9009f41b823dddef8a09fe8e0b0698c2bf5 ]

Currently, the header length of all the layers are fixed, It would
lead to a cksum error when the header length changed.

This patch fixes above problem by using the header length in mbuf
instead of the fixed header length to perform the TX cksum offload.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.h |   2 +
 drivers/net/hns3/hns3_rxtx.c   | 253 +++++++++++++++++++++++------------------
 drivers/net/hns3/hns3_rxtx.h   |   7 +-
 3 files changed, 147 insertions(+), 115 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
index ba01622..8ba22dc 100644
--- a/drivers/net/hns3/hns3_ethdev.h
+++ b/drivers/net/hns3/hns3_ethdev.h
@@ -552,6 +552,8 @@ struct hns3_adapter {
 #define hns3_get_bit(origin, shift) \
 	hns3_get_field((origin), (0x1UL << (shift)), (shift))
 
+#define hns3_gen_field_val(mask, shift, val) (((val) << (shift)) & (mask))
+
 /*
  * upper_32_bits - return bits 32-63 of a number
  * A basic shift-right of a 64- or 32-bit quantity. Use this to suppress
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 78c73ea..bf6e7d6 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -2051,180 +2051,211 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 }
 
 static void
-hns3_parse_outer_params(uint64_t ol_flags, uint32_t *ol_type_vlan_len_msec)
+hns3_parse_outer_params(struct rte_mbuf *m, uint32_t *ol_type_vlan_len_msec)
 {
 	uint32_t tmp = *ol_type_vlan_len_msec;
+	uint64_t ol_flags = m->ol_flags;
 
 	/* (outer) IP header type */
 	if (ol_flags & PKT_TX_OUTER_IPV4) {
-		/* OL3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv4_hdr) >> HNS3_L3_LEN_UNIT);
 		if (ol_flags & PKT_TX_OUTER_IP_CKSUM)
-			hns3_set_field(tmp, HNS3_TXD_OL3T_M,
-				       HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_CSUM);
+			tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M,
+					HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_CSUM);
 		else
-			hns3_set_field(tmp, HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
-				       HNS3_OL3T_IPV4_NO_CSUM);
+			tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M,
+				HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_NO_CSUM);
 	} else if (ol_flags & PKT_TX_OUTER_IPV6) {
-		hns3_set_field(tmp, HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
-			       HNS3_OL3T_IPV6);
-		/* OL3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv6_hdr) >> HNS3_L3_LEN_UNIT);
+		tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
+					HNS3_OL3T_IPV6);
 	}
-
+	/* OL3 header size, defined in 4 bytes */
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
+				m->outer_l3_len >> HNS3_L3_LEN_UNIT);
 	*ol_type_vlan_len_msec = tmp;
 }
 
 static int
-hns3_parse_inner_params(uint64_t ol_flags, uint32_t *ol_type_vlan_len_msec,
-			struct rte_net_hdr_lens *hdr_lens)
+hns3_parse_inner_params(struct rte_mbuf *m, uint32_t *ol_type_vlan_len_msec,
+			uint32_t *type_cs_vlan_tso_len)
 {
-	uint32_t tmp = *ol_type_vlan_len_msec;
-	uint8_t l4_len;
-
-	/* OL2 header size, defined in 2 bytes */
-	hns3_set_field(tmp, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
-		       sizeof(struct rte_ether_hdr) >> HNS3_L2_LEN_UNIT);
+#define HNS3_NVGRE_HLEN 8
+	uint32_t tmp_outer = *ol_type_vlan_len_msec;
+	uint32_t tmp_inner = *type_cs_vlan_tso_len;
+	uint64_t ol_flags = m->ol_flags;
+	uint16_t inner_l2_len;
 
-	/* L4TUNT: L4 Tunneling Type */
 	switch (ol_flags & PKT_TX_TUNNEL_MASK) {
 	case PKT_TX_TUNNEL_GENEVE:
 	case PKT_TX_TUNNEL_VXLAN:
-		/* MAC in UDP tunnelling packet, include VxLAN */
-		hns3_set_field(tmp, HNS3_TXD_TUNTYPE_M, HNS3_TXD_TUNTYPE_S,
-			       HNS3_TUN_MAC_IN_UDP);
+		/* MAC in UDP tunnelling packet, include VxLAN and GENEVE */
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_TUNTYPE_M,
+				HNS3_TXD_TUNTYPE_S, HNS3_TUN_MAC_IN_UDP);
 		/*
-		 * OL4 header size, defined in 4 Bytes, it contains outer
-		 * L4(UDP) length and tunneling length.
+		 * The inner l2 length of mbuf is the sum of outer l4 length,
+		 * tunneling header length and inner l2 length for a tunnel
+		 * packect. But in hns3 tx descriptor, the tunneling header
+		 * length is contained in the field of outer L4 length.
+		 * Therefore, driver need to calculate the outer L4 length and
+		 * inner L2 length.
 		 */
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       (uint8_t)RTE_ETHER_VXLAN_HLEN >>
-			       HNS3_L4_LEN_UNIT);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_L4LEN_M,
+						HNS3_TXD_L4LEN_S,
+						(uint8_t)RTE_ETHER_VXLAN_HLEN >>
+						HNS3_L4_LEN_UNIT);
+
+		inner_l2_len = m->l2_len - RTE_ETHER_VXLAN_HLEN;
 		break;
 	case PKT_TX_TUNNEL_GRE:
-		hns3_set_field(tmp, HNS3_TXD_TUNTYPE_M, HNS3_TXD_TUNTYPE_S,
-			       HNS3_TUN_NVGRE);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_TUNTYPE_M,
+					HNS3_TXD_TUNTYPE_S, HNS3_TUN_NVGRE);
 		/*
-		 * OL4 header size, defined in 4 Bytes, it contains outer
-		 * L4(GRE) length and tunneling length.
+		 * For NVGRE tunnel packect, the outer L4 is empty. So only
+		 * fill the NVGRE header length to the outer L4 field.
 		 */
-		l4_len = hdr_lens->l4_len + hdr_lens->tunnel_len;
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       l4_len >> HNS3_L4_LEN_UNIT);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_L4LEN_M,
+				HNS3_TXD_L4LEN_S,
+				(uint8_t)HNS3_NVGRE_HLEN >> HNS3_L4_LEN_UNIT);
+
+		inner_l2_len = m->l2_len - HNS3_NVGRE_HLEN;
 		break;
 	default:
 		/* For non UDP / GRE tunneling, drop the tunnel packet */
 		return -EINVAL;
 	}
 
-	*ol_type_vlan_len_msec = tmp;
+	tmp_inner |= hns3_gen_field_val(HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
+					inner_l2_len >> HNS3_L2_LEN_UNIT);
+	/* OL2 header size, defined in 2 bytes */
+	tmp_outer |= hns3_gen_field_val(HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
+					m->outer_l2_len >> HNS3_L2_LEN_UNIT);
+
+	*type_cs_vlan_tso_len = tmp_inner;
+	*ol_type_vlan_len_msec = tmp_outer;
 
 	return 0;
 }
 
 static int
-hns3_parse_tunneling_params(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-			    uint64_t ol_flags,
-			    struct rte_net_hdr_lens *hdr_lens)
+hns3_parse_tunneling_params(struct hns3_tx_queue *txq, struct rte_mbuf *m,
+			    uint16_t tx_desc_id)
 {
 	struct hns3_desc *tx_ring = txq->tx_ring;
 	struct hns3_desc *desc = &tx_ring[tx_desc_id];
-	uint32_t value = 0;
+	uint32_t tmp_outer = 0;
+	uint32_t tmp_inner = 0;
 	int ret;
 
-	hns3_parse_outer_params(ol_flags, &value);
-	ret = hns3_parse_inner_params(ol_flags, &value, hdr_lens);
-	if (ret)
-		return -EINVAL;
+	/*
+	 * The tunnel header is contained in the inner L2 header field of the
+	 * mbuf, but for hns3 descriptor, it is contained in the outer L4. So,
+	 * there is a need that switching between them. To avoid multiple
+	 * calculations, the length of the L2 header include the outer and
+	 * inner, will be filled during the parsing of tunnel packects.
+	 */
+	if (!(m->ol_flags & PKT_TX_TUNNEL_MASK)) {
+		/*
+		 * For non tunnel type the tunnel type id is 0, so no need to
+		 * assign a value to it. Only the inner(normal) L2 header length
+		 * is assigned.
+		 */
+		tmp_inner |= hns3_gen_field_val(HNS3_TXD_L2LEN_M,
+			       HNS3_TXD_L2LEN_S, m->l2_len >> HNS3_L2_LEN_UNIT);
+	} else {
+		/*
+		 * If outer csum is not offload, the outer length may be filled
+		 * with 0. And the length of the outer header is added to the
+		 * inner l2_len. It would lead a cksum error. So driver has to
+		 * calculate the header length.
+		 */
+		if (unlikely(!(m->ol_flags & PKT_TX_OUTER_IP_CKSUM) &&
+					m->outer_l2_len == 0)) {
+			struct rte_net_hdr_lens hdr_len;
+			(void)rte_net_get_ptype(m, &hdr_len,
+					RTE_PTYPE_L2_MASK | RTE_PTYPE_L3_MASK);
+			m->outer_l3_len = hdr_len.l3_len;
+			m->outer_l2_len = hdr_len.l2_len;
+			m->l2_len = m->l2_len - hdr_len.l2_len - hdr_len.l3_len;
+		}
+		hns3_parse_outer_params(m, &tmp_outer);
+		ret = hns3_parse_inner_params(m, &tmp_outer, &tmp_inner);
+		if (ret)
+			return -EINVAL;
+	}
 
-	desc->tx.ol_type_vlan_len_msec |= rte_cpu_to_le_32(value);
+	desc->tx.ol_type_vlan_len_msec = rte_cpu_to_le_32(tmp_outer);
+	desc->tx.type_cs_vlan_tso_len = rte_cpu_to_le_32(tmp_inner);
 
 	return 0;
 }
 
 static void
-hns3_parse_l3_cksum_params(uint64_t ol_flags, uint32_t *type_cs_vlan_tso_len)
+hns3_parse_l3_cksum_params(struct rte_mbuf *m, uint32_t *type_cs_vlan_tso_len)
 {
+	uint64_t ol_flags = m->ol_flags;
+	uint32_t l3_type;
 	uint32_t tmp;
 
+	tmp = *type_cs_vlan_tso_len;
+	if (ol_flags & PKT_TX_IPV4)
+		l3_type = HNS3_L3T_IPV4;
+	else if (ol_flags & PKT_TX_IPV6)
+		l3_type = HNS3_L3T_IPV6;
+	else
+		l3_type = HNS3_L3T_NONE;
+
+	/* inner(/normal) L3 header size, defined in 4 bytes */
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
+					m->l3_len >> HNS3_L3_LEN_UNIT);
+
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3T_M, HNS3_TXD_L3T_S, l3_type);
+
 	/* Enable L3 checksum offloads */
-	if (ol_flags & PKT_TX_IPV4) {
-		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L3T_M, HNS3_TXD_L3T_S,
-			       HNS3_L3T_IPV4);
-		/* inner(/normal) L3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv4_hdr) >> HNS3_L3_LEN_UNIT);
-		if (ol_flags & PKT_TX_IP_CKSUM)
-			hns3_set_bit(tmp, HNS3_TXD_L3CS_B, 1);
-		*type_cs_vlan_tso_len = tmp;
-	} else if (ol_flags & PKT_TX_IPV6) {
-		tmp = *type_cs_vlan_tso_len;
-		/* L3T, IPv6 don't do checksum */
-		hns3_set_field(tmp, HNS3_TXD_L3T_M, HNS3_TXD_L3T_S,
-			       HNS3_L3T_IPV6);
-		/* inner(/normal) L3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv6_hdr) >> HNS3_L3_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
-	}
+	if (ol_flags & PKT_TX_IP_CKSUM)
+		tmp |= BIT(HNS3_TXD_L3CS_B);
+	*type_cs_vlan_tso_len = tmp;
 }
 
 static void
-hns3_parse_l4_cksum_params(uint64_t ol_flags, uint32_t *type_cs_vlan_tso_len)
+hns3_parse_l4_cksum_params(struct rte_mbuf *m, uint32_t *type_cs_vlan_tso_len)
 {
+	uint64_t ol_flags = m->ol_flags;
 	uint32_t tmp;
-
 	/* Enable L4 checksum offloads */
 	switch (ol_flags & PKT_TX_L4_MASK) {
 	case PKT_TX_TCP_CKSUM:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_TCP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_tcp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_TCP);
 		break;
 	case PKT_TX_UDP_CKSUM:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_UDP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_udp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_UDP);
 		break;
 	case PKT_TX_SCTP_CKSUM:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_SCTP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_sctp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_SCTP);
 		break;
 	default:
-		break;
+		return;
 	}
+	tmp |= BIT(HNS3_TXD_L4CS_B);
+	tmp |= hns3_gen_field_val(HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
+					m->l4_len >> HNS3_L4_LEN_UNIT);
+	*type_cs_vlan_tso_len = tmp;
 }
 
 static void
-hns3_txd_enable_checksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-			 uint64_t ol_flags)
+hns3_txd_enable_checksum(struct hns3_tx_queue *txq, struct rte_mbuf *m,
+			 uint16_t tx_desc_id)
 {
 	struct hns3_desc *tx_ring = txq->tx_ring;
 	struct hns3_desc *desc = &tx_ring[tx_desc_id];
 	uint32_t value = 0;
 
-	/* inner(/normal) L2 header size, defined in 2 bytes */
-	hns3_set_field(value, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
-		       sizeof(struct rte_ether_hdr) >> HNS3_L2_LEN_UNIT);
-
-	hns3_parse_l3_cksum_params(ol_flags, &value);
-	hns3_parse_l4_cksum_params(ol_flags, &value);
+	hns3_parse_l3_cksum_params(m, &value);
+	hns3_parse_l4_cksum_params(m, &value);
 
 	desc->tx.type_cs_vlan_tso_len |= rte_cpu_to_le_32(value);
 }
@@ -2259,18 +2290,23 @@ hns3_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
 
 static int
 hns3_parse_cksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-		 const struct rte_mbuf *m, struct rte_net_hdr_lens *hdr_lens)
+		 struct rte_mbuf *m)
 {
-	/* Fill in tunneling parameters if necessary */
-	if (m->ol_flags & PKT_TX_TUNNEL_MASK) {
-		(void)rte_net_get_ptype(m, hdr_lens, RTE_PTYPE_ALL_MASK);
-		if (hns3_parse_tunneling_params(txq, tx_desc_id, m->ol_flags,
-						hdr_lens))
+	struct hns3_desc *tx_ring = txq->tx_ring;
+	struct hns3_desc *desc = &tx_ring[tx_desc_id];
+
+	/* Enable checksum offloading */
+	if (m->ol_flags & HNS3_TX_CKSUM_OFFLOAD_MASK) {
+		/* Fill in tunneling parameters if necessary */
+		if (hns3_parse_tunneling_params(txq, m, tx_desc_id))
 			return -EINVAL;
+
+		hns3_txd_enable_checksum(txq, m, tx_desc_id);
+	} else {
+		/* clear the control bit */
+		desc->tx.type_cs_vlan_tso_len  = 0;
+		desc->tx.ol_type_vlan_len_msec = 0;
 	}
-	/* Enable checksum offloading */
-	if (m->ol_flags & HNS3_TX_CKSUM_OFFLOAD_MASK)
-		hns3_txd_enable_checksum(txq, tx_desc_id, m->ol_flags);
 
 	return 0;
 }
@@ -2278,7 +2314,6 @@ hns3_parse_cksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
 uint16_t
 hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
-	struct rte_net_hdr_lens hdr_lens = {0};
 	struct hns3_tx_queue *txq = tx_queue;
 	struct hns3_entry *tx_bak_pkt;
 	struct rte_mbuf *new_pkt;
@@ -2345,7 +2380,7 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 			nb_buf = m_seg->nb_segs;
 		}
 
-		if (hns3_parse_cksum(txq, tx_next_use, m_seg, &hdr_lens))
+		if (hns3_parse_cksum(txq, tx_next_use, m_seg))
 			goto end_of_tx;
 
 		i = 0;
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index fd4f419..88a7584 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -305,14 +305,9 @@ struct hns3_queue_info {
 };
 
 #define HNS3_TX_CKSUM_OFFLOAD_MASK ( \
-	PKT_TX_OUTER_IPV6 | \
-	PKT_TX_OUTER_IPV4 | \
 	PKT_TX_OUTER_IP_CKSUM | \
-	PKT_TX_IPV6 | \
-	PKT_TX_IPV4 | \
 	PKT_TX_IP_CKSUM | \
-	PKT_TX_L4_MASK | \
-	PKT_TX_TUNNEL_MASK)
+	PKT_TX_L4_MASK)
 
 enum hns3_cksum_status {
 	HNS3_CKSUM_NONE = 0,
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [dpdk-stable] [PATCH v2 19.11.6 3/7] net/hns3: report Tx descriptor segment limitations
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 3/7] net/hns3: report Tx descriptor segment limitations Lijun Ou
@ 2020-11-25  9:16     ` Luca Boccassi
  2020-11-25 10:59       ` oulijun
  0 siblings, 1 reply; 42+ messages in thread
From: Luca Boccassi @ 2020-11-25  9:16 UTC (permalink / raw)
  To: Lijun Ou, stable; +Cc: linuxarm

On Wed, 2020-11-25 at 11:24 +0800, Lijun Ou wrote:
> [ upstream commit eb8b3a0d829bc109dfb13de9c8cdeacda5449e69 ]
> 
> According to the user manual of Kunpeng920 SoC, the max allowed number
> of segments per whole packet is 63 and the max number of segments per
> packet is 8 in datapath.
> 
> This patch reports the Two segment parameters of Tx descriptor
> limitations to DPDK framework.
> 
> Signed-off-by: Lijun Ou <oulijun@huawei.com>
> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
> ---
>  drivers/net/hns3/hns3_ethdev.c    | 2 ++
>  drivers/net/hns3/hns3_ethdev_vf.c | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
> index 3a92bd4..7fd0f7f 100644
> --- a/drivers/net/hns3/hns3_ethdev.c
> +++ b/drivers/net/hns3/hns3_ethdev.c
> @@ -2499,6 +2499,8 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
>  		.nb_max = HNS3_MAX_RING_DESC,
>  		.nb_min = HNS3_MIN_RING_DESC,
>  		.nb_align = HNS3_ALIGN_RING_DESC,
> +		.nb_seg_max = HNS3_MAX_TSO_BD_PER_PKT,
> +		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
>  	};
>  
>  	info->vmdq_queue_num = 0;
> diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
> index 4359b2e..0b5d3d4 100644
> --- a/drivers/net/hns3/hns3_ethdev_vf.c
> +++ b/drivers/net/hns3/hns3_ethdev_vf.c
> @@ -866,6 +866,8 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
>  		.nb_max = HNS3_MAX_RING_DESC,
>  		.nb_min = HNS3_MIN_RING_DESC,
>  		.nb_align = HNS3_ALIGN_RING_DESC,
> +		.nb_seg_max = HNS3_MAX_TSO_BD_PER_PKT,
> +		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
>  	};
>  
>  	info->vmdq_queue_num = 0;

Hi,

Thanks for the series, but it fails to build, because of this patch:

../drivers/net/hns3/hns3_ethdev.c: In function ‘hns3_dev_infos_get’:
../drivers/net/hns3/hns3_ethdev.c:2502:17: error: ‘HNS3_MAX_TSO_BD_PER_PKT’ undeclared (first use in this function); did you mean ‘HNS3_MAX_TX_BD_PER_PKT’?
   .nb_seg_max = HNS3_MAX_TSO_BD_PER_PKT,
                 ^~~~~~~~~~~~~~~~~~~~~~~
                 HNS3_MAX_TX_BD_PER_PKT
../drivers/net/hns3/hns3_ethdev.c:2502:17: note: each undeclared identifier is reported only once for each function it appears in
../drivers/net/hns3/hns3_ethdev.c:2503:21: error: ‘HNS3_MAX_NON_TSO_BD_PER_PKT’ undeclared (first use in this function); did you mean ‘HNS3_MAX_TX_BD_PER_PKT’?
   .nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~
                     HNS3_MAX_TX_BD_PER_PKT
[3/32] Compiling C object 'drivers/a715181@@tmp_rte_pmd_hns3@sta/net_hns3_hns3_intr.c.o'.
ninja: build stopped: subcommand failed.

-- 
Kind regards,
Luca Boccassi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v3 19.11.6 0/6] backport for 19.11.6
  2020-11-25  3:24 ` [dpdk-stable] [PATCH v2 19.11.6 0/7] " Lijun Ou
                     ` (6 preceding siblings ...)
  2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 7/7] net/hns3: fix TX checksum with fix header length Lijun Ou
@ 2020-11-25 10:58   ` Lijun Ou
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 1/6] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
                       ` (6 more replies)
  7 siblings, 7 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 10:58 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

His series are backport for 19.11.6 about hns3 PMD driver

Change from V2:
1. remove the relatived patch with TSO

Change from V1:
1. split TSO part from fix TX checksum with fix header length
2. remove TSO patch and the relatived patch

Chengchang Tang (2):
  net/hns3: decrease non-nearby memory access in Rx
  net/hns3: fix TX checksum with fix header length

Wei Hu (Xavier) (4):
  net/hns3: fix reassembling multiple segment packets in Tx
  net/hns3: report Rx drop packets enable configuration
  net/hns3: report Rx free threshold
  net/hns3: reduce address calculation in Rx

 drivers/net/hns3/hns3_ethdev.c    |  32 +++-
 drivers/net/hns3/hns3_ethdev.h    |  31 +++-
 drivers/net/hns3/hns3_ethdev_vf.c |  11 ++
 drivers/net/hns3/hns3_rxtx.c      | 363 +++++++++++++++++++++++---------------
 drivers/net/hns3/hns3_rxtx.h      |  30 +++-
 5 files changed, 308 insertions(+), 159 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v3 19.11.6 1/6] net/hns3: fix reassembling multiple segment packets in Tx
  2020-11-25 10:58   ` [dpdk-stable] [PATCH v3 19.11.6 0/6] backport for 19.11.6 Lijun Ou
@ 2020-11-25 10:58     ` Lijun Ou
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 2/6] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
                       ` (5 subsequent siblings)
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 10:58 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit 6c44219f9907a064049d160d366d98bb840724a6 ]

Because of the hardware constraints, hns3 network engine doesn't support
sending packets with more than eight fragments. And hns3 pmd driver
tries to reassemble these kind of packets to meet hardware requirements.
Currently, there are two problems:
1) when the input buffer_len * 8 < pkt_len, the packets are impossible
   to be reassembled into 8 Buffer Descriptors. In this case, the
   packets will be passed to hardware, which eventually causes a
   hardware reset.
2) The meta data in origin packets which are required to fill into the
   descriptor haven't been copied into the reassembled pkts.

This patch adds a check for 1) to ensure such packets will be dropped by
driver and copies useful meta data from the origin packets to the
reassembled packets.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 5aee8cb..ffd8417 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1940,6 +1940,20 @@ hns3_tx_alloc_mbufs(struct hns3_tx_queue *txq, struct rte_mempool *mb_pool,
 	return 0;
 }
 
+static inline void
+hns3_pktmbuf_copy_hdr(struct rte_mbuf *new_pkt, struct rte_mbuf *old_pkt)
+{
+	new_pkt->ol_flags = old_pkt->ol_flags;
+	new_pkt->pkt_len = rte_pktmbuf_pkt_len(old_pkt);
+	new_pkt->outer_l2_len = old_pkt->outer_l2_len;
+	new_pkt->outer_l3_len = old_pkt->outer_l3_len;
+	new_pkt->l2_len = old_pkt->l2_len;
+	new_pkt->l3_len = old_pkt->l3_len;
+	new_pkt->l4_len = old_pkt->l4_len;
+	new_pkt->vlan_tci_outer = old_pkt->vlan_tci_outer;
+	new_pkt->vlan_tci = old_pkt->vlan_tci;
+}
+
 static int
 hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 			struct rte_mbuf **new_pkt)
@@ -1963,9 +1977,11 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 
 	mb_pool = tx_pkt->pool;
 	buf_size = tx_pkt->buf_len - RTE_PKTMBUF_HEADROOM;
-	nb_new_buf = (tx_pkt->pkt_len - 1) / buf_size + 1;
+	nb_new_buf = (rte_pktmbuf_pkt_len(tx_pkt) - 1) / buf_size + 1;
+	if (nb_new_buf > HNS3_MAX_NON_TSO_BD_PER_PKT)
+		return -EINVAL;
 
-	last_buf_len = tx_pkt->pkt_len % buf_size;
+	last_buf_len = rte_pktmbuf_pkt_len(tx_pkt) % buf_size;
 	if (last_buf_len == 0)
 		last_buf_len = buf_size;
 
@@ -1977,7 +1993,7 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 	/* Copy the original packet content to the new mbufs */
 	temp = tx_pkt;
 	s = rte_pktmbuf_mtod(temp, char *);
-	len_s = temp->data_len;
+	len_s = rte_pktmbuf_data_len(temp);
 	temp_new = new_mbuf;
 	for (i = 0; i < nb_new_buf; i++) {
 		d = rte_pktmbuf_mtod(temp_new, char *);
@@ -2000,13 +2016,14 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 				if (temp == NULL)
 					break;
 				s = rte_pktmbuf_mtod(temp, char *);
-				len_s = temp->data_len;
+				len_s = rte_pktmbuf_data_len(temp);
 			}
 		}
 
 		temp_new->data_len = buf_len;
 		temp_new = temp_new->next;
 	}
+	hns3_pktmbuf_copy_hdr(new_mbuf, tx_pkt);
 
 	/* free original mbufs */
 	rte_pktmbuf_free(tx_pkt);
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v3 19.11.6 2/6] net/hns3: decrease non-nearby memory access in Rx
  2020-11-25 10:58   ` [dpdk-stable] [PATCH v3 19.11.6 0/6] backport for 19.11.6 Lijun Ou
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 1/6] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
@ 2020-11-25 10:58     ` Lijun Ou
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 3/6] net/hns3: report Rx drop packets enable configuration Lijun Ou
                       ` (4 subsequent siblings)
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 10:58 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: Chengchang Tang <tangchengchang@huawei.com>

[ upstream commit 8c7449779c4526d27205043560a21f0e2c2f622b ]

Currently, hns3 PMD driver needs know the PVID configuration state and
do different processing in the 'rx_pkt_burst' ops implementation
function.

This patch adds a member to struct hns3_rx_queue/hns3_tx_queue of the
driver to indicate the PVID configuration status, so it isn't need
to access other data structure in the 'rx_pkt_burst' ops implementation,
to avoid performance loss because of reducing cache miss.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c | 21 ++++++++++++++++++++-
 drivers/net/hns3/hns3_rxtx.c   | 37 +++++++++++++++++++++++++++++++------
 drivers/net/hns3/hns3_rxtx.h   | 13 +++++++++++++
 3 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index f2702e0..3a92bd4 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -895,6 +895,8 @@ hns3_vlan_pvid_set(struct rte_eth_dev *dev, uint16_t pvid, int on)
 {
 	struct hns3_adapter *hns = dev->data->dev_private;
 	struct hns3_hw *hw = &hns->hw;
+	bool pvid_en_state_change;
+	uint16_t pvid_state;
 	int ret;
 
 	if (pvid > RTE_ETHER_MAX_VLAN_ID) {
@@ -903,10 +905,27 @@ hns3_vlan_pvid_set(struct rte_eth_dev *dev, uint16_t pvid, int on)
 		return -EINVAL;
 	}
 
+	/*
+	 * If PVID configuration state change, should refresh the PVID
+	 * configuration state in struct hns3_tx_queue/hns3_rx_queue.
+	 */
+	pvid_state = hw->port_base_vlan_cfg.state;
+	if ((on && pvid_state == HNS3_PORT_BASE_VLAN_ENABLE) ||
+	    (!on && pvid_state == HNS3_PORT_BASE_VLAN_DISABLE))
+		pvid_en_state_change = false;
+	else
+		pvid_en_state_change = true;
+
 	rte_spinlock_lock(&hw->lock);
 	ret = hns3_vlan_pvid_configure(hns, pvid, on);
 	rte_spinlock_unlock(&hw->lock);
-	return ret;
+	if (ret)
+		return ret;
+
+	if (pvid_en_state_change)
+		hns3_update_all_queues_pvid_state(hw);
+
+	return 0;
 }
 
 static int
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index ffd8417..53b98cd 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -316,6 +316,31 @@ hns3_init_tx_queue_hw(struct hns3_tx_queue *txq)
 }
 
 void
+hns3_update_all_queues_pvid_state(struct hns3_hw *hw)
+{
+	uint16_t nb_rx_q = hw->data->nb_rx_queues;
+	uint16_t nb_tx_q = hw->data->nb_tx_queues;
+	struct hns3_rx_queue *rxq;
+	struct hns3_tx_queue *txq;
+	int pvid_state;
+	int i;
+
+	pvid_state = hw->port_base_vlan_cfg.state;
+	for (i = 0; i < hw->cfg_max_queues; i++) {
+		if (i < nb_rx_q) {
+			rxq = hw->data->rx_queues[i];
+			if (rxq != NULL)
+				rxq->pvid_state = pvid_state;
+		}
+		if (i < nb_tx_q) {
+			txq = hw->data->tx_queues[i];
+			if (txq != NULL)
+				txq->pvid_state = pvid_state;
+		}
+	}
+}
+
+void
 hns3_enable_all_queues(struct hns3_hw *hw, bool en)
 {
 	uint16_t nb_rx_q = hw->data->nb_rx_queues;
@@ -1275,6 +1300,7 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	rxq->pkt_first_seg = NULL;
 	rxq->pkt_last_seg = NULL;
 	rxq->port_id = dev->data->port_id;
+	rxq->pvid_state = hw->port_base_vlan_cfg.state;
 	rxq->configured = true;
 	rxq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
@@ -1503,7 +1529,7 @@ hns3_rx_set_cksum_flag(struct rte_mbuf *rxm, uint64_t packet_type,
 }
 
 static inline void
-hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
+hns3_rxd_to_vlan_tci(struct hns3_rx_queue *rxq, struct rte_mbuf *mb,
 		     uint32_t l234_info, const struct hns3_desc *rxd)
 {
 #define HNS3_STRP_STATUS_NUM		0x4
@@ -1511,8 +1537,6 @@ hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
 #define HNS3_NO_STRP_VLAN_VLD		0x0
 #define HNS3_INNER_STRP_VLAN_VLD	0x1
 #define HNS3_OUTER_STRP_VLAN_VLD	0x2
-	struct hns3_adapter *hns = dev->data->dev_private;
-	struct hns3_hw *hw = &hns->hw;
 	uint32_t strip_status;
 	uint32_t report_mode;
 
@@ -1538,7 +1562,7 @@ hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
 	};
 	strip_status = hns3_get_field(l234_info, HNS3_RXD_STRP_TAGP_M,
 				      HNS3_RXD_STRP_TAGP_S);
-	report_mode = report_type[hw->port_base_vlan_cfg.state][strip_status];
+	report_mode = report_type[rxq->pvid_state][strip_status];
 	switch (report_mode) {
 	case HNS3_NO_STRP_VLAN_VLD:
 		mb->vlan_tci = 0;
@@ -1583,7 +1607,6 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	nb_rx = 0;
 	nb_rx_bd = 0;
 	rxq = rx_queue;
-	dev = &rte_eth_devices[rxq->port_id];
 
 	rx_id = rxq->next_to_clean;
 	rx_ring = rxq->rx_ring;
@@ -1660,6 +1683,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 
 		nmb = rte_mbuf_raw_alloc(rxq->mb_pool);
 		if (unlikely(nmb == NULL)) {
+			dev = &rte_eth_devices[rxq->port_id];
 			dev->data->rx_mbuf_alloc_failed++;
 			break;
 		}
@@ -1729,7 +1753,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 			hns3_rx_set_cksum_flag(first_seg,
 					       first_seg->packet_type,
 					       cksum_err);
-		hns3_rxd_to_vlan_tci(dev, first_seg, l234_info, &rxd);
+		hns3_rxd_to_vlan_tci(rxq, first_seg, l234_info, &rxd);
 
 		rx_pkts[nb_rx++] = first_seg;
 		first_seg = NULL;
@@ -1807,6 +1831,7 @@ hns3_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	txq->next_to_clean = 0;
 	txq->tx_bd_ready = txq->nb_tx_desc - 1;
 	txq->port_id = dev->data->port_id;
+	txq->pvid_state = hw->port_base_vlan_cfg.state;
 	txq->configured = true;
 	txq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index 1fd1afd..a416d10 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -250,6 +250,12 @@ struct hns3_rx_queue {
 	uint16_t rx_buf_len;
 	uint16_t rx_free_thresh;
 
+	/*
+	 * port based vlan configuration state.
+	 * value range: HNS3_PORT_BASE_VLAN_DISABLE / HNS3_PORT_BASE_VLAN_ENABLE
+	 */
+	uint16_t pvid_state;
+
 	bool rx_deferred_start; /* don't start this queue in dev start */
 	bool configured;        /* indicate if rx queue has been configured */
 
@@ -276,6 +282,12 @@ struct hns3_tx_queue {
 	uint16_t next_to_use;
 	uint16_t tx_bd_ready;
 
+	/*
+	 * port based vlan configuration state.
+	 * value range: HNS3_PORT_BASE_VLAN_DISABLE / HNS3_PORT_BASE_VLAN_ENABLE
+	 */
+	uint16_t pvid_state;
+
 	bool tx_deferred_start; /* don't start this queue in dev start */
 	bool configured;        /* indicate if tx queue has been configured */
 };
@@ -336,5 +348,6 @@ void hns3_set_queue_intr_rl(struct hns3_hw *hw, uint16_t queue_id,
 			    uint16_t rl_value);
 int hns3_set_fake_rx_or_tx_queues(struct rte_eth_dev *dev, uint16_t nb_rx_q,
 				  uint16_t nb_tx_q);
+void hns3_update_all_queues_pvid_state(struct hns3_hw *hw);
 
 #endif /* _HNS3_RXTX_H_ */
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v3 19.11.6 3/6] net/hns3: report Rx drop packets enable configuration
  2020-11-25 10:58   ` [dpdk-stable] [PATCH v3 19.11.6 0/6] backport for 19.11.6 Lijun Ou
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 1/6] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 2/6] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
@ 2020-11-25 10:58     ` Lijun Ou
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 4/6] net/hns3: report Rx free threshold Lijun Ou
                       ` (3 subsequent siblings)
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 10:58 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

Currently, if there are not available Rx buffer descriptors in receiving
direction based on hns3 network engine, incoming packets will always be
dropped by hardware. This patch reports the '.rx_drop_en' information to
DPDK framework in the '.dev_infos_get', '.rxq_info_get' and
'.rx_queue_setup' ops implementation function.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c    | 9 +++++++++
 drivers/net/hns3/hns3_ethdev_vf.c | 9 +++++++++
 drivers/net/hns3/hns3_rxtx.c      | 6 ++++++
 3 files changed, 24 insertions(+)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index 3a92bd4..d14c96d 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2501,6 +2501,15 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_align = HNS3_ALIGN_RING_DESC,
 	};
 
+	info->default_rxconf = (struct rte_eth_rxconf) {
+		/*
+		 * If there are no available Rx buffer descriptors, incoming
+		 * packets are always dropped by hardware based on hns3 network
+		 * engine.
+		 */
+		.rx_drop_en = 1,
+	};
+
 	info->vmdq_queue_num = 0;
 
 	info->reta_size = HNS3_RSS_IND_TBL_SIZE;
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index 4359b2e..ee272da 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -868,6 +868,15 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_align = HNS3_ALIGN_RING_DESC,
 	};
 
+	info->default_rxconf = (struct rte_eth_rxconf) {
+		/*
+		 * If there are no available Rx buffer descriptors, incoming
+		 * packets are always dropped by hardware based on hns3 network
+		 * engine.
+		 */
+		.rx_drop_en = 1,
+	};
+
 	info->vmdq_queue_num = 0;
 
 	info->reta_size = HNS3_RSS_IND_TBL_SIZE;
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 53b98cd..13d5560 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1251,6 +1251,12 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 		return -EINVAL;
 	}
 
+	if (conf->rx_drop_en == 0)
+		hns3_warn(hw, "if there are no available Rx descriptors,"
+			  "incoming packets are always dropped. input parameter"
+			  " conf->rx_drop_en(%u) is uneffective.",
+			  conf->rx_drop_en);
+
 	if (dev->data->rx_queues[idx]) {
 		hns3_rx_queue_release(dev->data->rx_queues[idx]);
 		dev->data->rx_queues[idx] = NULL;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v3 19.11.6 4/6] net/hns3: report Rx free threshold
  2020-11-25 10:58   ` [dpdk-stable] [PATCH v3 19.11.6 0/6] backport for 19.11.6 Lijun Ou
                       ` (2 preceding siblings ...)
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 3/6] net/hns3: report Rx drop packets enable configuration Lijun Ou
@ 2020-11-25 10:58     ` Lijun Ou
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 5/6] net/hns3: reduce address calculation in Rx Lijun Ou
                       ` (2 subsequent siblings)
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 10:58 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit ceabee45be4a1737994c5f4631bd001b8021475c ]

This patch reports .rx_free_thresh value in the .dev_infos_get ops
implementation function named hns3_dev_infos_get and
hns3vf_dev_infos_get.
In addition, the name of the member variable of struct hns3_rx_queue is
modified and comments are added to improve code readability.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c    |  2 ++
 drivers/net/hns3/hns3_ethdev_vf.c |  2 ++
 drivers/net/hns3/hns3_rxtx.c      | 30 ++++++++++++------------------
 drivers/net/hns3/hns3_rxtx.h      |  9 ++++++---
 4 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index d14c96d..0dadaef 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2502,12 +2502,14 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 	};
 
 	info->default_rxconf = (struct rte_eth_rxconf) {
+		.rx_free_thresh = HNS3_DEFAULT_RX_FREE_THRESH,
 		/*
 		 * If there are no available Rx buffer descriptors, incoming
 		 * packets are always dropped by hardware based on hns3 network
 		 * engine.
 		 */
 		.rx_drop_en = 1,
+		.offloads = 0,
 	};
 
 	info->vmdq_queue_num = 0;
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index ee272da..ea0fed4 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -869,12 +869,14 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 	};
 
 	info->default_rxconf = (struct rte_eth_rxconf) {
+		.rx_free_thresh = HNS3_DEFAULT_RX_FREE_THRESH,
 		/*
 		 * If there are no available Rx buffer descriptors, incoming
 		 * packets are always dropped by hardware based on hns3 network
 		 * engine.
 		 */
 		.rx_drop_en = 1,
+		.offloads = 0,
 	};
 
 	info->vmdq_queue_num = 0;
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 13d5560..41d957a 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -634,8 +634,7 @@ hns3_dev_rx_queue_start(struct hns3_adapter *hns, uint16_t idx)
 	}
 
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	hns3_init_rx_queue_hw(rxq);
 
 	return 0;
@@ -649,8 +648,7 @@ hns3_fake_rx_queue_start(struct hns3_adapter *hns, uint16_t idx)
 
 	rxq = (struct hns3_rx_queue *)hw->fkq_data.rx_queues[idx];
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	hns3_init_rx_queue_hw(rxq);
 }
 
@@ -1285,10 +1283,8 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 
 	rxq->hns = hns;
 	rxq->mb_pool = mp;
-	if (conf->rx_free_thresh <= 0)
-		rxq->rx_free_thresh = DEFAULT_RX_FREE_THRESH;
-	else
-		rxq->rx_free_thresh = conf->rx_free_thresh;
+	rxq->rx_free_thresh = (conf->rx_free_thresh > 0) ?
+		conf->rx_free_thresh : HNS3_DEFAULT_RX_FREE_THRESH;
 	rxq->rx_deferred_start = conf->rx_deferred_start;
 
 	rx_entry_len = sizeof(struct hns3_entry) * rxq->nb_rx_desc;
@@ -1301,8 +1297,7 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	}
 
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	rxq->pkt_first_seg = NULL;
 	rxq->pkt_last_seg = NULL;
 	rxq->port_id = dev->data->port_id;
@@ -1614,11 +1609,11 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	nb_rx_bd = 0;
 	rxq = rx_queue;
 
-	rx_id = rxq->next_to_clean;
+	rx_id = rxq->next_to_use;
 	rx_ring = rxq->rx_ring;
+	sw_ring = rxq->sw_ring;
 	first_seg = rxq->pkt_first_seg;
 	last_seg = rxq->pkt_last_seg;
-	sw_ring = rxq->sw_ring;
 
 	while (nb_rx < nb_pkts) {
 		rxdp = &rx_ring[rx_id];
@@ -1769,16 +1764,15 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 		first_seg = NULL;
 	}
 
-	rxq->next_to_clean = rx_id;
+	rxq->next_to_use = rx_id;
 	rxq->pkt_first_seg = first_seg;
 	rxq->pkt_last_seg = last_seg;
 
-	nb_rx_bd = nb_rx_bd + rxq->nb_rx_hold;
-	if (nb_rx_bd > rxq->rx_free_thresh) {
-		hns3_clean_rx_buffers(rxq, nb_rx_bd);
-		nb_rx_bd = 0;
+	rxq->rx_free_hold += nb_rx_bd;
+	if (rxq->rx_free_hold > rxq->rx_free_thresh) {
+		hns3_clean_rx_buffers(rxq, rxq->rx_free_hold);
+		rxq->rx_free_hold = 0;
 	}
-	rxq->nb_rx_hold = nb_rx_bd;
 
 	return nb_rx;
 }
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index a416d10..6211665 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -10,6 +10,7 @@
 #define HNS3_DEFAULT_RING_DESC  1024
 #define	HNS3_ALIGN_RING_DESC	32
 #define HNS3_RING_BASE_ALIGN	128
+#define HNS3_DEFAULT_RX_FREE_THRESH	32
 
 #define HNS3_512_BD_BUF_SIZE	512
 #define HNS3_1K_BD_BUF_SIZE	1024
@@ -243,12 +244,14 @@ struct hns3_rx_queue {
 	uint16_t queue_id;
 	uint16_t port_id;
 	uint16_t nb_rx_desc;
-	uint16_t nb_rx_hold;
-	uint16_t rx_tail;
-	uint16_t next_to_clean;
 	uint16_t next_to_use;
 	uint16_t rx_buf_len;
+	/*
+	 * threshold for the number of BDs waited to passed to hardware. If the
+	 * number exceeds the threshold, driver will pass these BDs to hardware.
+	 */
 	uint16_t rx_free_thresh;
+	uint16_t rx_free_hold;   /* num of BDs waited to passed to hardware */
 
 	/*
 	 * port based vlan configuration state.
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v3 19.11.6 5/6] net/hns3: reduce address calculation in Rx
  2020-11-25 10:58   ` [dpdk-stable] [PATCH v3 19.11.6 0/6] backport for 19.11.6 Lijun Ou
                       ` (3 preceding siblings ...)
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 4/6] net/hns3: report Rx free threshold Lijun Ou
@ 2020-11-25 10:58     ` Lijun Ou
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 6/6] net/hns3: fix TX checksum with fix header length Lijun Ou
  2020-11-25 11:15     ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Lijun Ou
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 10:58 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit 323df8941b57027752ee9d191f1ac6f359bd524e ]

This patch adds the internal function named hns3_write_reg_opt to avoid
performance loss from address calculation during register access in the
'.rx_pkt_burst' ops implementation function named hns3_recv_pkts.

In addition, because hardware always access register in little-endian
mode based on hns3 network engine, so driver should also call
rte_cpu_to_le_32 to convert data in little-endian mode before writing
register and call rte_le_to_cpu_32 to convert data after reading from
register. Here the driver encapsulates the data conversion operation in
the register read/write operation function as below:
  hns3_write_reg
  hns3_write_reg_opt
  hns3_read_reg
Therefore, when calling these functions, conversion is not required
again.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.h | 29 +++++++++++++++++++++++++++--
 drivers/net/hns3/hns3_rxtx.c   | 14 +++-----------
 drivers/net/hns3/hns3_rxtx.h   |  1 +
 3 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
index e0a04f8..ba01622 100644
--- a/drivers/net/hns3/hns3_ethdev.h
+++ b/drivers/net/hns3/hns3_ethdev.h
@@ -579,14 +579,39 @@ struct hns3_adapter {
 	type __max2 = (y);                      \
 	__max1 > __max2 ? __max1 : __max2; })
 
+/*
+ * Because hardware always access register in little-endian mode based on hns3
+ * network engine, so driver should also call rte_cpu_to_le_32 to convert data
+ * in little-endian mode before writing register and call rte_le_to_cpu_32 to
+ * convert data after reading from register.
+ *
+ * Here the driver encapsulates the data conversion operation in the register
+ * read/write operation function as below:
+ *   hns3_write_reg
+ *   hns3_write_reg_opt
+ *   hns3_read_reg
+ * Therefore, when calling these functions, conversion is not required again.
+ */
 static inline void hns3_write_reg(void *base, uint32_t reg, uint32_t value)
 {
-	rte_write32(value, (volatile void *)((char *)base + reg));
+	rte_write32(rte_cpu_to_le_32(value),
+		    (volatile void *)((char *)base + reg));
+}
+
+/*
+ * The optimized function for writing registers used in the '.rx_pkt_burst' and
+ * '.tx_pkt_burst' ops implementation function.
+ */
+static inline void hns3_write_reg_opt(volatile void *addr, uint32_t value)
+{
+	rte_io_wmb();
+	rte_write32_relaxed(rte_cpu_to_le_32(value), addr);
 }
 
 static inline uint32_t hns3_read_reg(void *base, uint32_t reg)
 {
-	return rte_read32((volatile void *)((char *)base + reg));
+	uint32_t read_val = rte_read32((volatile void *)((char *)base + reg));
+	return rte_le_to_cpu_32(read_val);
 }
 
 #define hns3_write_dev(a, reg, value) \
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 41d957a..78c73ea 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1305,6 +1305,8 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	rxq->configured = true;
 	rxq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
+	rxq->io_head_reg = (volatile void *)((char *)rxq->io_base +
+			   HNS3_RING_RX_HEAD_REG);
 	rxq->rx_buf_len = rx_buf_size;
 	rxq->l2_errors = 0;
 	rxq->pkt_len_errors = 0;
@@ -1448,16 +1450,6 @@ hns3_dev_supported_ptypes_get(struct rte_eth_dev *dev)
 	return NULL;
 }
 
-static void
-hns3_clean_rx_buffers(struct hns3_rx_queue *rxq, int count)
-{
-	rxq->next_to_use += count;
-	if (rxq->next_to_use >= rxq->nb_rx_desc)
-		rxq->next_to_use -= rxq->nb_rx_desc;
-
-	hns3_write_dev(rxq, HNS3_RING_RX_HEAD_REG, count);
-}
-
 static int
 hns3_handle_bdinfo(struct hns3_rx_queue *rxq, struct rte_mbuf *rxm,
 		   uint32_t bd_base_info, uint32_t l234_info,
@@ -1770,7 +1762,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 
 	rxq->rx_free_hold += nb_rx_bd;
 	if (rxq->rx_free_hold > rxq->rx_free_thresh) {
-		hns3_clean_rx_buffers(rxq, rxq->rx_free_hold);
+		hns3_write_reg_opt(rxq->io_head_reg, rxq->rx_free_hold);
 		rxq->rx_free_hold = 0;
 	}
 
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index 6211665..fd4f419 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -231,6 +231,7 @@ struct hns3_entry {
 
 struct hns3_rx_queue {
 	void *io_base;
+	volatile void *io_head_reg;
 	struct hns3_adapter *hns;
 	struct rte_mempool *mb_pool;
 	struct hns3_desc *rx_ring;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v3 19.11.6 6/6] net/hns3: fix TX checksum with fix header length
  2020-11-25 10:58   ` [dpdk-stable] [PATCH v3 19.11.6 0/6] backport for 19.11.6 Lijun Ou
                       ` (4 preceding siblings ...)
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 5/6] net/hns3: reduce address calculation in Rx Lijun Ou
@ 2020-11-25 10:58     ` Lijun Ou
  2020-11-25 11:15     ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Lijun Ou
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 10:58 UTC (permalink / raw)
  To: luca.boccassi, stable; +Cc: linuxarm

From: Chengchang Tang <tangchengchang@huawei.com>

[ upstream commit fb6eb9009f41b823dddef8a09fe8e0b0698c2bf5 ]

Currently, the header length of all the layers are fixed, It would
lead to a cksum error when the header length changed.

This patch fixes above problem by using the header length in mbuf
instead of the fixed header length to perform the TX cksum offload.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.h |   2 +
 drivers/net/hns3/hns3_rxtx.c   | 253 +++++++++++++++++++++++------------------
 drivers/net/hns3/hns3_rxtx.h   |   7 +-
 3 files changed, 147 insertions(+), 115 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
index ba01622..8ba22dc 100644
--- a/drivers/net/hns3/hns3_ethdev.h
+++ b/drivers/net/hns3/hns3_ethdev.h
@@ -552,6 +552,8 @@ struct hns3_adapter {
 #define hns3_get_bit(origin, shift) \
 	hns3_get_field((origin), (0x1UL << (shift)), (shift))
 
+#define hns3_gen_field_val(mask, shift, val) (((val) << (shift)) & (mask))
+
 /*
  * upper_32_bits - return bits 32-63 of a number
  * A basic shift-right of a 64- or 32-bit quantity. Use this to suppress
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 78c73ea..bf6e7d6 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -2051,180 +2051,211 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 }
 
 static void
-hns3_parse_outer_params(uint64_t ol_flags, uint32_t *ol_type_vlan_len_msec)
+hns3_parse_outer_params(struct rte_mbuf *m, uint32_t *ol_type_vlan_len_msec)
 {
 	uint32_t tmp = *ol_type_vlan_len_msec;
+	uint64_t ol_flags = m->ol_flags;
 
 	/* (outer) IP header type */
 	if (ol_flags & PKT_TX_OUTER_IPV4) {
-		/* OL3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv4_hdr) >> HNS3_L3_LEN_UNIT);
 		if (ol_flags & PKT_TX_OUTER_IP_CKSUM)
-			hns3_set_field(tmp, HNS3_TXD_OL3T_M,
-				       HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_CSUM);
+			tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M,
+					HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_CSUM);
 		else
-			hns3_set_field(tmp, HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
-				       HNS3_OL3T_IPV4_NO_CSUM);
+			tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M,
+				HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_NO_CSUM);
 	} else if (ol_flags & PKT_TX_OUTER_IPV6) {
-		hns3_set_field(tmp, HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
-			       HNS3_OL3T_IPV6);
-		/* OL3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv6_hdr) >> HNS3_L3_LEN_UNIT);
+		tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
+					HNS3_OL3T_IPV6);
 	}
-
+	/* OL3 header size, defined in 4 bytes */
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
+				m->outer_l3_len >> HNS3_L3_LEN_UNIT);
 	*ol_type_vlan_len_msec = tmp;
 }
 
 static int
-hns3_parse_inner_params(uint64_t ol_flags, uint32_t *ol_type_vlan_len_msec,
-			struct rte_net_hdr_lens *hdr_lens)
+hns3_parse_inner_params(struct rte_mbuf *m, uint32_t *ol_type_vlan_len_msec,
+			uint32_t *type_cs_vlan_tso_len)
 {
-	uint32_t tmp = *ol_type_vlan_len_msec;
-	uint8_t l4_len;
-
-	/* OL2 header size, defined in 2 bytes */
-	hns3_set_field(tmp, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
-		       sizeof(struct rte_ether_hdr) >> HNS3_L2_LEN_UNIT);
+#define HNS3_NVGRE_HLEN 8
+	uint32_t tmp_outer = *ol_type_vlan_len_msec;
+	uint32_t tmp_inner = *type_cs_vlan_tso_len;
+	uint64_t ol_flags = m->ol_flags;
+	uint16_t inner_l2_len;
 
-	/* L4TUNT: L4 Tunneling Type */
 	switch (ol_flags & PKT_TX_TUNNEL_MASK) {
 	case PKT_TX_TUNNEL_GENEVE:
 	case PKT_TX_TUNNEL_VXLAN:
-		/* MAC in UDP tunnelling packet, include VxLAN */
-		hns3_set_field(tmp, HNS3_TXD_TUNTYPE_M, HNS3_TXD_TUNTYPE_S,
-			       HNS3_TUN_MAC_IN_UDP);
+		/* MAC in UDP tunnelling packet, include VxLAN and GENEVE */
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_TUNTYPE_M,
+				HNS3_TXD_TUNTYPE_S, HNS3_TUN_MAC_IN_UDP);
 		/*
-		 * OL4 header size, defined in 4 Bytes, it contains outer
-		 * L4(UDP) length and tunneling length.
+		 * The inner l2 length of mbuf is the sum of outer l4 length,
+		 * tunneling header length and inner l2 length for a tunnel
+		 * packect. But in hns3 tx descriptor, the tunneling header
+		 * length is contained in the field of outer L4 length.
+		 * Therefore, driver need to calculate the outer L4 length and
+		 * inner L2 length.
 		 */
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       (uint8_t)RTE_ETHER_VXLAN_HLEN >>
-			       HNS3_L4_LEN_UNIT);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_L4LEN_M,
+						HNS3_TXD_L4LEN_S,
+						(uint8_t)RTE_ETHER_VXLAN_HLEN >>
+						HNS3_L4_LEN_UNIT);
+
+		inner_l2_len = m->l2_len - RTE_ETHER_VXLAN_HLEN;
 		break;
 	case PKT_TX_TUNNEL_GRE:
-		hns3_set_field(tmp, HNS3_TXD_TUNTYPE_M, HNS3_TXD_TUNTYPE_S,
-			       HNS3_TUN_NVGRE);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_TUNTYPE_M,
+					HNS3_TXD_TUNTYPE_S, HNS3_TUN_NVGRE);
 		/*
-		 * OL4 header size, defined in 4 Bytes, it contains outer
-		 * L4(GRE) length and tunneling length.
+		 * For NVGRE tunnel packect, the outer L4 is empty. So only
+		 * fill the NVGRE header length to the outer L4 field.
 		 */
-		l4_len = hdr_lens->l4_len + hdr_lens->tunnel_len;
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       l4_len >> HNS3_L4_LEN_UNIT);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_L4LEN_M,
+				HNS3_TXD_L4LEN_S,
+				(uint8_t)HNS3_NVGRE_HLEN >> HNS3_L4_LEN_UNIT);
+
+		inner_l2_len = m->l2_len - HNS3_NVGRE_HLEN;
 		break;
 	default:
 		/* For non UDP / GRE tunneling, drop the tunnel packet */
 		return -EINVAL;
 	}
 
-	*ol_type_vlan_len_msec = tmp;
+	tmp_inner |= hns3_gen_field_val(HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
+					inner_l2_len >> HNS3_L2_LEN_UNIT);
+	/* OL2 header size, defined in 2 bytes */
+	tmp_outer |= hns3_gen_field_val(HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
+					m->outer_l2_len >> HNS3_L2_LEN_UNIT);
+
+	*type_cs_vlan_tso_len = tmp_inner;
+	*ol_type_vlan_len_msec = tmp_outer;
 
 	return 0;
 }
 
 static int
-hns3_parse_tunneling_params(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-			    uint64_t ol_flags,
-			    struct rte_net_hdr_lens *hdr_lens)
+hns3_parse_tunneling_params(struct hns3_tx_queue *txq, struct rte_mbuf *m,
+			    uint16_t tx_desc_id)
 {
 	struct hns3_desc *tx_ring = txq->tx_ring;
 	struct hns3_desc *desc = &tx_ring[tx_desc_id];
-	uint32_t value = 0;
+	uint32_t tmp_outer = 0;
+	uint32_t tmp_inner = 0;
 	int ret;
 
-	hns3_parse_outer_params(ol_flags, &value);
-	ret = hns3_parse_inner_params(ol_flags, &value, hdr_lens);
-	if (ret)
-		return -EINVAL;
+	/*
+	 * The tunnel header is contained in the inner L2 header field of the
+	 * mbuf, but for hns3 descriptor, it is contained in the outer L4. So,
+	 * there is a need that switching between them. To avoid multiple
+	 * calculations, the length of the L2 header include the outer and
+	 * inner, will be filled during the parsing of tunnel packects.
+	 */
+	if (!(m->ol_flags & PKT_TX_TUNNEL_MASK)) {
+		/*
+		 * For non tunnel type the tunnel type id is 0, so no need to
+		 * assign a value to it. Only the inner(normal) L2 header length
+		 * is assigned.
+		 */
+		tmp_inner |= hns3_gen_field_val(HNS3_TXD_L2LEN_M,
+			       HNS3_TXD_L2LEN_S, m->l2_len >> HNS3_L2_LEN_UNIT);
+	} else {
+		/*
+		 * If outer csum is not offload, the outer length may be filled
+		 * with 0. And the length of the outer header is added to the
+		 * inner l2_len. It would lead a cksum error. So driver has to
+		 * calculate the header length.
+		 */
+		if (unlikely(!(m->ol_flags & PKT_TX_OUTER_IP_CKSUM) &&
+					m->outer_l2_len == 0)) {
+			struct rte_net_hdr_lens hdr_len;
+			(void)rte_net_get_ptype(m, &hdr_len,
+					RTE_PTYPE_L2_MASK | RTE_PTYPE_L3_MASK);
+			m->outer_l3_len = hdr_len.l3_len;
+			m->outer_l2_len = hdr_len.l2_len;
+			m->l2_len = m->l2_len - hdr_len.l2_len - hdr_len.l3_len;
+		}
+		hns3_parse_outer_params(m, &tmp_outer);
+		ret = hns3_parse_inner_params(m, &tmp_outer, &tmp_inner);
+		if (ret)
+			return -EINVAL;
+	}
 
-	desc->tx.ol_type_vlan_len_msec |= rte_cpu_to_le_32(value);
+	desc->tx.ol_type_vlan_len_msec = rte_cpu_to_le_32(tmp_outer);
+	desc->tx.type_cs_vlan_tso_len = rte_cpu_to_le_32(tmp_inner);
 
 	return 0;
 }
 
 static void
-hns3_parse_l3_cksum_params(uint64_t ol_flags, uint32_t *type_cs_vlan_tso_len)
+hns3_parse_l3_cksum_params(struct rte_mbuf *m, uint32_t *type_cs_vlan_tso_len)
 {
+	uint64_t ol_flags = m->ol_flags;
+	uint32_t l3_type;
 	uint32_t tmp;
 
+	tmp = *type_cs_vlan_tso_len;
+	if (ol_flags & PKT_TX_IPV4)
+		l3_type = HNS3_L3T_IPV4;
+	else if (ol_flags & PKT_TX_IPV6)
+		l3_type = HNS3_L3T_IPV6;
+	else
+		l3_type = HNS3_L3T_NONE;
+
+	/* inner(/normal) L3 header size, defined in 4 bytes */
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
+					m->l3_len >> HNS3_L3_LEN_UNIT);
+
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3T_M, HNS3_TXD_L3T_S, l3_type);
+
 	/* Enable L3 checksum offloads */
-	if (ol_flags & PKT_TX_IPV4) {
-		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L3T_M, HNS3_TXD_L3T_S,
-			       HNS3_L3T_IPV4);
-		/* inner(/normal) L3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv4_hdr) >> HNS3_L3_LEN_UNIT);
-		if (ol_flags & PKT_TX_IP_CKSUM)
-			hns3_set_bit(tmp, HNS3_TXD_L3CS_B, 1);
-		*type_cs_vlan_tso_len = tmp;
-	} else if (ol_flags & PKT_TX_IPV6) {
-		tmp = *type_cs_vlan_tso_len;
-		/* L3T, IPv6 don't do checksum */
-		hns3_set_field(tmp, HNS3_TXD_L3T_M, HNS3_TXD_L3T_S,
-			       HNS3_L3T_IPV6);
-		/* inner(/normal) L3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv6_hdr) >> HNS3_L3_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
-	}
+	if (ol_flags & PKT_TX_IP_CKSUM)
+		tmp |= BIT(HNS3_TXD_L3CS_B);
+	*type_cs_vlan_tso_len = tmp;
 }
 
 static void
-hns3_parse_l4_cksum_params(uint64_t ol_flags, uint32_t *type_cs_vlan_tso_len)
+hns3_parse_l4_cksum_params(struct rte_mbuf *m, uint32_t *type_cs_vlan_tso_len)
 {
+	uint64_t ol_flags = m->ol_flags;
 	uint32_t tmp;
-
 	/* Enable L4 checksum offloads */
 	switch (ol_flags & PKT_TX_L4_MASK) {
 	case PKT_TX_TCP_CKSUM:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_TCP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_tcp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_TCP);
 		break;
 	case PKT_TX_UDP_CKSUM:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_UDP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_udp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_UDP);
 		break;
 	case PKT_TX_SCTP_CKSUM:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_SCTP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_sctp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_SCTP);
 		break;
 	default:
-		break;
+		return;
 	}
+	tmp |= BIT(HNS3_TXD_L4CS_B);
+	tmp |= hns3_gen_field_val(HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
+					m->l4_len >> HNS3_L4_LEN_UNIT);
+	*type_cs_vlan_tso_len = tmp;
 }
 
 static void
-hns3_txd_enable_checksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-			 uint64_t ol_flags)
+hns3_txd_enable_checksum(struct hns3_tx_queue *txq, struct rte_mbuf *m,
+			 uint16_t tx_desc_id)
 {
 	struct hns3_desc *tx_ring = txq->tx_ring;
 	struct hns3_desc *desc = &tx_ring[tx_desc_id];
 	uint32_t value = 0;
 
-	/* inner(/normal) L2 header size, defined in 2 bytes */
-	hns3_set_field(value, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
-		       sizeof(struct rte_ether_hdr) >> HNS3_L2_LEN_UNIT);
-
-	hns3_parse_l3_cksum_params(ol_flags, &value);
-	hns3_parse_l4_cksum_params(ol_flags, &value);
+	hns3_parse_l3_cksum_params(m, &value);
+	hns3_parse_l4_cksum_params(m, &value);
 
 	desc->tx.type_cs_vlan_tso_len |= rte_cpu_to_le_32(value);
 }
@@ -2259,18 +2290,23 @@ hns3_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
 
 static int
 hns3_parse_cksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-		 const struct rte_mbuf *m, struct rte_net_hdr_lens *hdr_lens)
+		 struct rte_mbuf *m)
 {
-	/* Fill in tunneling parameters if necessary */
-	if (m->ol_flags & PKT_TX_TUNNEL_MASK) {
-		(void)rte_net_get_ptype(m, hdr_lens, RTE_PTYPE_ALL_MASK);
-		if (hns3_parse_tunneling_params(txq, tx_desc_id, m->ol_flags,
-						hdr_lens))
+	struct hns3_desc *tx_ring = txq->tx_ring;
+	struct hns3_desc *desc = &tx_ring[tx_desc_id];
+
+	/* Enable checksum offloading */
+	if (m->ol_flags & HNS3_TX_CKSUM_OFFLOAD_MASK) {
+		/* Fill in tunneling parameters if necessary */
+		if (hns3_parse_tunneling_params(txq, m, tx_desc_id))
 			return -EINVAL;
+
+		hns3_txd_enable_checksum(txq, m, tx_desc_id);
+	} else {
+		/* clear the control bit */
+		desc->tx.type_cs_vlan_tso_len  = 0;
+		desc->tx.ol_type_vlan_len_msec = 0;
 	}
-	/* Enable checksum offloading */
-	if (m->ol_flags & HNS3_TX_CKSUM_OFFLOAD_MASK)
-		hns3_txd_enable_checksum(txq, tx_desc_id, m->ol_flags);
 
 	return 0;
 }
@@ -2278,7 +2314,6 @@ hns3_parse_cksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
 uint16_t
 hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
-	struct rte_net_hdr_lens hdr_lens = {0};
 	struct hns3_tx_queue *txq = tx_queue;
 	struct hns3_entry *tx_bak_pkt;
 	struct rte_mbuf *new_pkt;
@@ -2345,7 +2380,7 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 			nb_buf = m_seg->nb_segs;
 		}
 
-		if (hns3_parse_cksum(txq, tx_next_use, m_seg, &hdr_lens))
+		if (hns3_parse_cksum(txq, tx_next_use, m_seg))
 			goto end_of_tx;
 
 		i = 0;
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index fd4f419..88a7584 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -305,14 +305,9 @@ struct hns3_queue_info {
 };
 
 #define HNS3_TX_CKSUM_OFFLOAD_MASK ( \
-	PKT_TX_OUTER_IPV6 | \
-	PKT_TX_OUTER_IPV4 | \
 	PKT_TX_OUTER_IP_CKSUM | \
-	PKT_TX_IPV6 | \
-	PKT_TX_IPV4 | \
 	PKT_TX_IP_CKSUM | \
-	PKT_TX_L4_MASK | \
-	PKT_TX_TUNNEL_MASK)
+	PKT_TX_L4_MASK)
 
 enum hns3_cksum_status {
 	HNS3_CKSUM_NONE = 0,
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [dpdk-stable] [PATCH v2 19.11.6 3/7] net/hns3: report Tx descriptor segment limitations
  2020-11-25  9:16     ` Luca Boccassi
@ 2020-11-25 10:59       ` oulijun
  0 siblings, 0 replies; 42+ messages in thread
From: oulijun @ 2020-11-25 10:59 UTC (permalink / raw)
  To: Luca Boccassi, stable; +Cc: linuxarm



在 2020/11/25 17:16, Luca Boccassi 写道:
> On Wed, 2020-11-25 at 11:24 +0800, Lijun Ou wrote:
>> [ upstream commit eb8b3a0d829bc109dfb13de9c8cdeacda5449e69 ]
>>
>> According to the user manual of Kunpeng920 SoC, the max allowed number
>> of segments per whole packet is 63 and the max number of segments per
>> packet is 8 in datapath.
>>
>> This patch reports the Two segment parameters of Tx descriptor
>> limitations to DPDK framework.
>>
>> Signed-off-by: Lijun Ou <oulijun@huawei.com>
>> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
>> ---
>>   drivers/net/hns3/hns3_ethdev.c    | 2 ++
>>   drivers/net/hns3/hns3_ethdev_vf.c | 2 ++
>>   2 files changed, 4 insertions(+)
>>
>> diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
>> index 3a92bd4..7fd0f7f 100644
>> --- a/drivers/net/hns3/hns3_ethdev.c
>> +++ b/drivers/net/hns3/hns3_ethdev.c
>> @@ -2499,6 +2499,8 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
>>   		.nb_max = HNS3_MAX_RING_DESC,
>>   		.nb_min = HNS3_MIN_RING_DESC,
>>   		.nb_align = HNS3_ALIGN_RING_DESC,
>> +		.nb_seg_max = HNS3_MAX_TSO_BD_PER_PKT,
>> +		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
>>   	};
>>   
>>   	info->vmdq_queue_num = 0;
>> diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
>> index 4359b2e..0b5d3d4 100644
>> --- a/drivers/net/hns3/hns3_ethdev_vf.c
>> +++ b/drivers/net/hns3/hns3_ethdev_vf.c
>> @@ -866,6 +866,8 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
>>   		.nb_max = HNS3_MAX_RING_DESC,
>>   		.nb_min = HNS3_MIN_RING_DESC,
>>   		.nb_align = HNS3_ALIGN_RING_DESC,
>> +		.nb_seg_max = HNS3_MAX_TSO_BD_PER_PKT,
>> +		.nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
>>   	};
>>   
>>   	info->vmdq_queue_num = 0;
> 
> Hi,
> 
> Thanks for the series, but it fails to build, because of this patch:
> 
> ../drivers/net/hns3/hns3_ethdev.c: In function ‘hns3_dev_infos_get’:
> ../drivers/net/hns3/hns3_ethdev.c:2502:17: error: ‘HNS3_MAX_TSO_BD_PER_PKT’ undeclared (first use in this function); did you mean ‘HNS3_MAX_TX_BD_PER_PKT’?
>     .nb_seg_max = HNS3_MAX_TSO_BD_PER_PKT,
>                   ^~~~~~~~~~~~~~~~~~~~~~~
>                   HNS3_MAX_TX_BD_PER_PKT
> ../drivers/net/hns3/hns3_ethdev.c:2502:17: note: each undeclared identifier is reported only once for each function it appears in
> ../drivers/net/hns3/hns3_ethdev.c:2503:21: error: ‘HNS3_MAX_NON_TSO_BD_PER_PKT’ undeclared (first use in this function); did you mean ‘HNS3_MAX_TX_BD_PER_PKT’?
>     .nb_mtu_seg_max = HNS3_MAX_NON_TSO_BD_PER_PKT,
>                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~
>                       HNS3_MAX_TX_BD_PER_PKT
> [3/32] Compiling C object 'drivers/a715181@@tmp_rte_pmd_hns3@sta/net_hns3_hns3_intr.c.o'.
> ninja: build stopped: subcommand failed.
> 
Sorry, I will remove it and send the V3.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6
  2020-11-25 10:58   ` [dpdk-stable] [PATCH v3 19.11.6 0/6] backport for 19.11.6 Lijun Ou
                       ` (5 preceding siblings ...)
  2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 6/6] net/hns3: fix TX checksum with fix header length Lijun Ou
@ 2020-11-25 11:15     ` Lijun Ou
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 1/6] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
                         ` (6 more replies)
  6 siblings, 7 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 11:15 UTC (permalink / raw)
  To: luca.boccassi, stable

His series are backport for 19.11.6 about hns3 PMD driver

Change from V3:
1. remove the relatived macro with TSO

Change from V2:
1. remove the relatived patch with TSO

Change from V1:
1. split TSO part from fix TX checksum with fix header length
2. remove TSO patch and the relatived patch

Chengchang Tang (2):
  net/hns3: decrease non-nearby memory access in Rx
  net/hns3: fix TX checksum with fix header length

Wei Hu (Xavier) (4):
  net/hns3: fix reassembling multiple segment packets in Tx
  net/hns3: report Rx drop packets enable configuration
  net/hns3: report Rx free threshold
  net/hns3: reduce address calculation in Rx

 drivers/net/hns3/hns3_ethdev.c    |  32 +++-
 drivers/net/hns3/hns3_ethdev.h    |  31 +++-
 drivers/net/hns3/hns3_ethdev_vf.c |  11 ++
 drivers/net/hns3/hns3_rxtx.c      | 363 +++++++++++++++++++++++---------------
 drivers/net/hns3/hns3_rxtx.h      |  30 +++-
 5 files changed, 308 insertions(+), 159 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v4 19.11.6 1/6] net/hns3: fix reassembling multiple segment packets in Tx
  2020-11-25 11:15     ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Lijun Ou
@ 2020-11-25 11:15       ` Lijun Ou
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 2/6] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
                         ` (5 subsequent siblings)
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 11:15 UTC (permalink / raw)
  To: luca.boccassi, stable

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit 6c44219f9907a064049d160d366d98bb840724a6 ]

Because of the hardware constraints, hns3 network engine doesn't support
sending packets with more than eight fragments. And hns3 pmd driver
tries to reassemble these kind of packets to meet hardware requirements.
Currently, there are two problems:
1) when the input buffer_len * 8 < pkt_len, the packets are impossible
   to be reassembled into 8 Buffer Descriptors. In this case, the
   packets will be passed to hardware, which eventually causes a
   hardware reset.
2) The meta data in origin packets which are required to fill into the
   descriptor haven't been copied into the reassembled pkts.

This patch adds a check for 1) to ensure such packets will be dropped by
driver and copies useful meta data from the origin packets to the
reassembled packets.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 5aee8cb..2c480d6 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1940,6 +1940,20 @@ hns3_tx_alloc_mbufs(struct hns3_tx_queue *txq, struct rte_mempool *mb_pool,
 	return 0;
 }
 
+static inline void
+hns3_pktmbuf_copy_hdr(struct rte_mbuf *new_pkt, struct rte_mbuf *old_pkt)
+{
+	new_pkt->ol_flags = old_pkt->ol_flags;
+	new_pkt->pkt_len = rte_pktmbuf_pkt_len(old_pkt);
+	new_pkt->outer_l2_len = old_pkt->outer_l2_len;
+	new_pkt->outer_l3_len = old_pkt->outer_l3_len;
+	new_pkt->l2_len = old_pkt->l2_len;
+	new_pkt->l3_len = old_pkt->l3_len;
+	new_pkt->l4_len = old_pkt->l4_len;
+	new_pkt->vlan_tci_outer = old_pkt->vlan_tci_outer;
+	new_pkt->vlan_tci = old_pkt->vlan_tci;
+}
+
 static int
 hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 			struct rte_mbuf **new_pkt)
@@ -1963,9 +1977,11 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 
 	mb_pool = tx_pkt->pool;
 	buf_size = tx_pkt->buf_len - RTE_PKTMBUF_HEADROOM;
-	nb_new_buf = (tx_pkt->pkt_len - 1) / buf_size + 1;
+	nb_new_buf = (rte_pktmbuf_pkt_len(tx_pkt) - 1) / buf_size + 1;
+	if (nb_new_buf > HNS3_MAX_TX_BD_PER_PKT)
+		return -EINVAL;
 
-	last_buf_len = tx_pkt->pkt_len % buf_size;
+	last_buf_len = rte_pktmbuf_pkt_len(tx_pkt) % buf_size;
 	if (last_buf_len == 0)
 		last_buf_len = buf_size;
 
@@ -1977,7 +1993,7 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 	/* Copy the original packet content to the new mbufs */
 	temp = tx_pkt;
 	s = rte_pktmbuf_mtod(temp, char *);
-	len_s = temp->data_len;
+	len_s = rte_pktmbuf_data_len(temp);
 	temp_new = new_mbuf;
 	for (i = 0; i < nb_new_buf; i++) {
 		d = rte_pktmbuf_mtod(temp_new, char *);
@@ -2000,13 +2016,14 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 				if (temp == NULL)
 					break;
 				s = rte_pktmbuf_mtod(temp, char *);
-				len_s = temp->data_len;
+				len_s = rte_pktmbuf_data_len(temp);
 			}
 		}
 
 		temp_new->data_len = buf_len;
 		temp_new = temp_new->next;
 	}
+	hns3_pktmbuf_copy_hdr(new_mbuf, tx_pkt);
 
 	/* free original mbufs */
 	rte_pktmbuf_free(tx_pkt);
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v4 19.11.6 2/6] net/hns3: decrease non-nearby memory access in Rx
  2020-11-25 11:15     ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Lijun Ou
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 1/6] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
@ 2020-11-25 11:15       ` Lijun Ou
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 3/6] net/hns3: report Rx drop packets enable configuration Lijun Ou
                         ` (4 subsequent siblings)
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 11:15 UTC (permalink / raw)
  To: luca.boccassi, stable

From: Chengchang Tang <tangchengchang@huawei.com>

[ upstream commit 8c7449779c4526d27205043560a21f0e2c2f622b ]

Currently, hns3 PMD driver needs know the PVID configuration state and
do different processing in the 'rx_pkt_burst' ops implementation
function.

This patch adds a member to struct hns3_rx_queue/hns3_tx_queue of the
driver to indicate the PVID configuration status, so it isn't need
to access other data structure in the 'rx_pkt_burst' ops implementation,
to avoid performance loss because of reducing cache miss.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c | 21 ++++++++++++++++++++-
 drivers/net/hns3/hns3_rxtx.c   | 37 +++++++++++++++++++++++++++++++------
 drivers/net/hns3/hns3_rxtx.h   | 13 +++++++++++++
 3 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index f2702e0..3a92bd4 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -895,6 +895,8 @@ hns3_vlan_pvid_set(struct rte_eth_dev *dev, uint16_t pvid, int on)
 {
 	struct hns3_adapter *hns = dev->data->dev_private;
 	struct hns3_hw *hw = &hns->hw;
+	bool pvid_en_state_change;
+	uint16_t pvid_state;
 	int ret;
 
 	if (pvid > RTE_ETHER_MAX_VLAN_ID) {
@@ -903,10 +905,27 @@ hns3_vlan_pvid_set(struct rte_eth_dev *dev, uint16_t pvid, int on)
 		return -EINVAL;
 	}
 
+	/*
+	 * If PVID configuration state change, should refresh the PVID
+	 * configuration state in struct hns3_tx_queue/hns3_rx_queue.
+	 */
+	pvid_state = hw->port_base_vlan_cfg.state;
+	if ((on && pvid_state == HNS3_PORT_BASE_VLAN_ENABLE) ||
+	    (!on && pvid_state == HNS3_PORT_BASE_VLAN_DISABLE))
+		pvid_en_state_change = false;
+	else
+		pvid_en_state_change = true;
+
 	rte_spinlock_lock(&hw->lock);
 	ret = hns3_vlan_pvid_configure(hns, pvid, on);
 	rte_spinlock_unlock(&hw->lock);
-	return ret;
+	if (ret)
+		return ret;
+
+	if (pvid_en_state_change)
+		hns3_update_all_queues_pvid_state(hw);
+
+	return 0;
 }
 
 static int
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 2c480d6..9eccca0 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -316,6 +316,31 @@ hns3_init_tx_queue_hw(struct hns3_tx_queue *txq)
 }
 
 void
+hns3_update_all_queues_pvid_state(struct hns3_hw *hw)
+{
+	uint16_t nb_rx_q = hw->data->nb_rx_queues;
+	uint16_t nb_tx_q = hw->data->nb_tx_queues;
+	struct hns3_rx_queue *rxq;
+	struct hns3_tx_queue *txq;
+	int pvid_state;
+	int i;
+
+	pvid_state = hw->port_base_vlan_cfg.state;
+	for (i = 0; i < hw->cfg_max_queues; i++) {
+		if (i < nb_rx_q) {
+			rxq = hw->data->rx_queues[i];
+			if (rxq != NULL)
+				rxq->pvid_state = pvid_state;
+		}
+		if (i < nb_tx_q) {
+			txq = hw->data->tx_queues[i];
+			if (txq != NULL)
+				txq->pvid_state = pvid_state;
+		}
+	}
+}
+
+void
 hns3_enable_all_queues(struct hns3_hw *hw, bool en)
 {
 	uint16_t nb_rx_q = hw->data->nb_rx_queues;
@@ -1275,6 +1300,7 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	rxq->pkt_first_seg = NULL;
 	rxq->pkt_last_seg = NULL;
 	rxq->port_id = dev->data->port_id;
+	rxq->pvid_state = hw->port_base_vlan_cfg.state;
 	rxq->configured = true;
 	rxq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
@@ -1503,7 +1529,7 @@ hns3_rx_set_cksum_flag(struct rte_mbuf *rxm, uint64_t packet_type,
 }
 
 static inline void
-hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
+hns3_rxd_to_vlan_tci(struct hns3_rx_queue *rxq, struct rte_mbuf *mb,
 		     uint32_t l234_info, const struct hns3_desc *rxd)
 {
 #define HNS3_STRP_STATUS_NUM		0x4
@@ -1511,8 +1537,6 @@ hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
 #define HNS3_NO_STRP_VLAN_VLD		0x0
 #define HNS3_INNER_STRP_VLAN_VLD	0x1
 #define HNS3_OUTER_STRP_VLAN_VLD	0x2
-	struct hns3_adapter *hns = dev->data->dev_private;
-	struct hns3_hw *hw = &hns->hw;
 	uint32_t strip_status;
 	uint32_t report_mode;
 
@@ -1538,7 +1562,7 @@ hns3_rxd_to_vlan_tci(struct rte_eth_dev *dev, struct rte_mbuf *mb,
 	};
 	strip_status = hns3_get_field(l234_info, HNS3_RXD_STRP_TAGP_M,
 				      HNS3_RXD_STRP_TAGP_S);
-	report_mode = report_type[hw->port_base_vlan_cfg.state][strip_status];
+	report_mode = report_type[rxq->pvid_state][strip_status];
 	switch (report_mode) {
 	case HNS3_NO_STRP_VLAN_VLD:
 		mb->vlan_tci = 0;
@@ -1583,7 +1607,6 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	nb_rx = 0;
 	nb_rx_bd = 0;
 	rxq = rx_queue;
-	dev = &rte_eth_devices[rxq->port_id];
 
 	rx_id = rxq->next_to_clean;
 	rx_ring = rxq->rx_ring;
@@ -1660,6 +1683,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 
 		nmb = rte_mbuf_raw_alloc(rxq->mb_pool);
 		if (unlikely(nmb == NULL)) {
+			dev = &rte_eth_devices[rxq->port_id];
 			dev->data->rx_mbuf_alloc_failed++;
 			break;
 		}
@@ -1729,7 +1753,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 			hns3_rx_set_cksum_flag(first_seg,
 					       first_seg->packet_type,
 					       cksum_err);
-		hns3_rxd_to_vlan_tci(dev, first_seg, l234_info, &rxd);
+		hns3_rxd_to_vlan_tci(rxq, first_seg, l234_info, &rxd);
 
 		rx_pkts[nb_rx++] = first_seg;
 		first_seg = NULL;
@@ -1807,6 +1831,7 @@ hns3_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	txq->next_to_clean = 0;
 	txq->tx_bd_ready = txq->nb_tx_desc - 1;
 	txq->port_id = dev->data->port_id;
+	txq->pvid_state = hw->port_base_vlan_cfg.state;
 	txq->configured = true;
 	txq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index 1fd1afd..a416d10 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -250,6 +250,12 @@ struct hns3_rx_queue {
 	uint16_t rx_buf_len;
 	uint16_t rx_free_thresh;
 
+	/*
+	 * port based vlan configuration state.
+	 * value range: HNS3_PORT_BASE_VLAN_DISABLE / HNS3_PORT_BASE_VLAN_ENABLE
+	 */
+	uint16_t pvid_state;
+
 	bool rx_deferred_start; /* don't start this queue in dev start */
 	bool configured;        /* indicate if rx queue has been configured */
 
@@ -276,6 +282,12 @@ struct hns3_tx_queue {
 	uint16_t next_to_use;
 	uint16_t tx_bd_ready;
 
+	/*
+	 * port based vlan configuration state.
+	 * value range: HNS3_PORT_BASE_VLAN_DISABLE / HNS3_PORT_BASE_VLAN_ENABLE
+	 */
+	uint16_t pvid_state;
+
 	bool tx_deferred_start; /* don't start this queue in dev start */
 	bool configured;        /* indicate if tx queue has been configured */
 };
@@ -336,5 +348,6 @@ void hns3_set_queue_intr_rl(struct hns3_hw *hw, uint16_t queue_id,
 			    uint16_t rl_value);
 int hns3_set_fake_rx_or_tx_queues(struct rte_eth_dev *dev, uint16_t nb_rx_q,
 				  uint16_t nb_tx_q);
+void hns3_update_all_queues_pvid_state(struct hns3_hw *hw);
 
 #endif /* _HNS3_RXTX_H_ */
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v4 19.11.6 3/6] net/hns3: report Rx drop packets enable configuration
  2020-11-25 11:15     ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Lijun Ou
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 1/6] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 2/6] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
@ 2020-11-25 11:15       ` Lijun Ou
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 4/6] net/hns3: report Rx free threshold Lijun Ou
                         ` (3 subsequent siblings)
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 11:15 UTC (permalink / raw)
  To: luca.boccassi, stable

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

Currently, if there are not available Rx buffer descriptors in receiving
direction based on hns3 network engine, incoming packets will always be
dropped by hardware. This patch reports the '.rx_drop_en' information to
DPDK framework in the '.dev_infos_get', '.rxq_info_get' and
'.rx_queue_setup' ops implementation function.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c    | 9 +++++++++
 drivers/net/hns3/hns3_ethdev_vf.c | 9 +++++++++
 drivers/net/hns3/hns3_rxtx.c      | 6 ++++++
 3 files changed, 24 insertions(+)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index 3a92bd4..d14c96d 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2501,6 +2501,15 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_align = HNS3_ALIGN_RING_DESC,
 	};
 
+	info->default_rxconf = (struct rte_eth_rxconf) {
+		/*
+		 * If there are no available Rx buffer descriptors, incoming
+		 * packets are always dropped by hardware based on hns3 network
+		 * engine.
+		 */
+		.rx_drop_en = 1,
+	};
+
 	info->vmdq_queue_num = 0;
 
 	info->reta_size = HNS3_RSS_IND_TBL_SIZE;
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index 4359b2e..ee272da 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -868,6 +868,15 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 		.nb_align = HNS3_ALIGN_RING_DESC,
 	};
 
+	info->default_rxconf = (struct rte_eth_rxconf) {
+		/*
+		 * If there are no available Rx buffer descriptors, incoming
+		 * packets are always dropped by hardware based on hns3 network
+		 * engine.
+		 */
+		.rx_drop_en = 1,
+	};
+
 	info->vmdq_queue_num = 0;
 
 	info->reta_size = HNS3_RSS_IND_TBL_SIZE;
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 9eccca0..3ec589f 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1251,6 +1251,12 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 		return -EINVAL;
 	}
 
+	if (conf->rx_drop_en == 0)
+		hns3_warn(hw, "if there are no available Rx descriptors,"
+			  "incoming packets are always dropped. input parameter"
+			  " conf->rx_drop_en(%u) is uneffective.",
+			  conf->rx_drop_en);
+
 	if (dev->data->rx_queues[idx]) {
 		hns3_rx_queue_release(dev->data->rx_queues[idx]);
 		dev->data->rx_queues[idx] = NULL;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v4 19.11.6 4/6] net/hns3: report Rx free threshold
  2020-11-25 11:15     ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Lijun Ou
                         ` (2 preceding siblings ...)
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 3/6] net/hns3: report Rx drop packets enable configuration Lijun Ou
@ 2020-11-25 11:15       ` Lijun Ou
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 5/6] net/hns3: reduce address calculation in Rx Lijun Ou
                         ` (2 subsequent siblings)
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 11:15 UTC (permalink / raw)
  To: luca.boccassi, stable

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit ceabee45be4a1737994c5f4631bd001b8021475c ]

This patch reports .rx_free_thresh value in the .dev_infos_get ops
implementation function named hns3_dev_infos_get and
hns3vf_dev_infos_get.
In addition, the name of the member variable of struct hns3_rx_queue is
modified and comments are added to improve code readability.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.c    |  2 ++
 drivers/net/hns3/hns3_ethdev_vf.c |  2 ++
 drivers/net/hns3/hns3_rxtx.c      | 30 ++++++++++++------------------
 drivers/net/hns3/hns3_rxtx.h      |  9 ++++++---
 4 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index d14c96d..0dadaef 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2502,12 +2502,14 @@ hns3_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 	};
 
 	info->default_rxconf = (struct rte_eth_rxconf) {
+		.rx_free_thresh = HNS3_DEFAULT_RX_FREE_THRESH,
 		/*
 		 * If there are no available Rx buffer descriptors, incoming
 		 * packets are always dropped by hardware based on hns3 network
 		 * engine.
 		 */
 		.rx_drop_en = 1,
+		.offloads = 0,
 	};
 
 	info->vmdq_queue_num = 0;
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index ee272da..ea0fed4 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -869,12 +869,14 @@ hns3vf_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)
 	};
 
 	info->default_rxconf = (struct rte_eth_rxconf) {
+		.rx_free_thresh = HNS3_DEFAULT_RX_FREE_THRESH,
 		/*
 		 * If there are no available Rx buffer descriptors, incoming
 		 * packets are always dropped by hardware based on hns3 network
 		 * engine.
 		 */
 		.rx_drop_en = 1,
+		.offloads = 0,
 	};
 
 	info->vmdq_queue_num = 0;
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 3ec589f..243c286 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -634,8 +634,7 @@ hns3_dev_rx_queue_start(struct hns3_adapter *hns, uint16_t idx)
 	}
 
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	hns3_init_rx_queue_hw(rxq);
 
 	return 0;
@@ -649,8 +648,7 @@ hns3_fake_rx_queue_start(struct hns3_adapter *hns, uint16_t idx)
 
 	rxq = (struct hns3_rx_queue *)hw->fkq_data.rx_queues[idx];
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	hns3_init_rx_queue_hw(rxq);
 }
 
@@ -1285,10 +1283,8 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 
 	rxq->hns = hns;
 	rxq->mb_pool = mp;
-	if (conf->rx_free_thresh <= 0)
-		rxq->rx_free_thresh = DEFAULT_RX_FREE_THRESH;
-	else
-		rxq->rx_free_thresh = conf->rx_free_thresh;
+	rxq->rx_free_thresh = (conf->rx_free_thresh > 0) ?
+		conf->rx_free_thresh : HNS3_DEFAULT_RX_FREE_THRESH;
 	rxq->rx_deferred_start = conf->rx_deferred_start;
 
 	rx_entry_len = sizeof(struct hns3_entry) * rxq->nb_rx_desc;
@@ -1301,8 +1297,7 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	}
 
 	rxq->next_to_use = 0;
-	rxq->next_to_clean = 0;
-	rxq->nb_rx_hold = 0;
+	rxq->rx_free_hold = 0;
 	rxq->pkt_first_seg = NULL;
 	rxq->pkt_last_seg = NULL;
 	rxq->port_id = dev->data->port_id;
@@ -1614,11 +1609,11 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	nb_rx_bd = 0;
 	rxq = rx_queue;
 
-	rx_id = rxq->next_to_clean;
+	rx_id = rxq->next_to_use;
 	rx_ring = rxq->rx_ring;
+	sw_ring = rxq->sw_ring;
 	first_seg = rxq->pkt_first_seg;
 	last_seg = rxq->pkt_last_seg;
-	sw_ring = rxq->sw_ring;
 
 	while (nb_rx < nb_pkts) {
 		rxdp = &rx_ring[rx_id];
@@ -1769,16 +1764,15 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 		first_seg = NULL;
 	}
 
-	rxq->next_to_clean = rx_id;
+	rxq->next_to_use = rx_id;
 	rxq->pkt_first_seg = first_seg;
 	rxq->pkt_last_seg = last_seg;
 
-	nb_rx_bd = nb_rx_bd + rxq->nb_rx_hold;
-	if (nb_rx_bd > rxq->rx_free_thresh) {
-		hns3_clean_rx_buffers(rxq, nb_rx_bd);
-		nb_rx_bd = 0;
+	rxq->rx_free_hold += nb_rx_bd;
+	if (rxq->rx_free_hold > rxq->rx_free_thresh) {
+		hns3_clean_rx_buffers(rxq, rxq->rx_free_hold);
+		rxq->rx_free_hold = 0;
 	}
-	rxq->nb_rx_hold = nb_rx_bd;
 
 	return nb_rx;
 }
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index a416d10..6211665 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -10,6 +10,7 @@
 #define HNS3_DEFAULT_RING_DESC  1024
 #define	HNS3_ALIGN_RING_DESC	32
 #define HNS3_RING_BASE_ALIGN	128
+#define HNS3_DEFAULT_RX_FREE_THRESH	32
 
 #define HNS3_512_BD_BUF_SIZE	512
 #define HNS3_1K_BD_BUF_SIZE	1024
@@ -243,12 +244,14 @@ struct hns3_rx_queue {
 	uint16_t queue_id;
 	uint16_t port_id;
 	uint16_t nb_rx_desc;
-	uint16_t nb_rx_hold;
-	uint16_t rx_tail;
-	uint16_t next_to_clean;
 	uint16_t next_to_use;
 	uint16_t rx_buf_len;
+	/*
+	 * threshold for the number of BDs waited to passed to hardware. If the
+	 * number exceeds the threshold, driver will pass these BDs to hardware.
+	 */
 	uint16_t rx_free_thresh;
+	uint16_t rx_free_hold;   /* num of BDs waited to passed to hardware */
 
 	/*
 	 * port based vlan configuration state.
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v4 19.11.6 5/6] net/hns3: reduce address calculation in Rx
  2020-11-25 11:15     ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Lijun Ou
                         ` (3 preceding siblings ...)
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 4/6] net/hns3: report Rx free threshold Lijun Ou
@ 2020-11-25 11:15       ` Lijun Ou
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 6/6] net/hns3: fix TX checksum with fix header length Lijun Ou
  2020-11-26  9:21       ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Luca Boccassi
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 11:15 UTC (permalink / raw)
  To: luca.boccassi, stable

From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>

[ upstream commit 323df8941b57027752ee9d191f1ac6f359bd524e ]

This patch adds the internal function named hns3_write_reg_opt to avoid
performance loss from address calculation during register access in the
'.rx_pkt_burst' ops implementation function named hns3_recv_pkts.

In addition, because hardware always access register in little-endian
mode based on hns3 network engine, so driver should also call
rte_cpu_to_le_32 to convert data in little-endian mode before writing
register and call rte_le_to_cpu_32 to convert data after reading from
register. Here the driver encapsulates the data conversion operation in
the register read/write operation function as below:
  hns3_write_reg
  hns3_write_reg_opt
  hns3_read_reg
Therefore, when calling these functions, conversion is not required
again.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.h | 29 +++++++++++++++++++++++++++--
 drivers/net/hns3/hns3_rxtx.c   | 14 +++-----------
 drivers/net/hns3/hns3_rxtx.h   |  1 +
 3 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
index e0a04f8..ba01622 100644
--- a/drivers/net/hns3/hns3_ethdev.h
+++ b/drivers/net/hns3/hns3_ethdev.h
@@ -579,14 +579,39 @@ struct hns3_adapter {
 	type __max2 = (y);                      \
 	__max1 > __max2 ? __max1 : __max2; })
 
+/*
+ * Because hardware always access register in little-endian mode based on hns3
+ * network engine, so driver should also call rte_cpu_to_le_32 to convert data
+ * in little-endian mode before writing register and call rte_le_to_cpu_32 to
+ * convert data after reading from register.
+ *
+ * Here the driver encapsulates the data conversion operation in the register
+ * read/write operation function as below:
+ *   hns3_write_reg
+ *   hns3_write_reg_opt
+ *   hns3_read_reg
+ * Therefore, when calling these functions, conversion is not required again.
+ */
 static inline void hns3_write_reg(void *base, uint32_t reg, uint32_t value)
 {
-	rte_write32(value, (volatile void *)((char *)base + reg));
+	rte_write32(rte_cpu_to_le_32(value),
+		    (volatile void *)((char *)base + reg));
+}
+
+/*
+ * The optimized function for writing registers used in the '.rx_pkt_burst' and
+ * '.tx_pkt_burst' ops implementation function.
+ */
+static inline void hns3_write_reg_opt(volatile void *addr, uint32_t value)
+{
+	rte_io_wmb();
+	rte_write32_relaxed(rte_cpu_to_le_32(value), addr);
 }
 
 static inline uint32_t hns3_read_reg(void *base, uint32_t reg)
 {
-	return rte_read32((volatile void *)((char *)base + reg));
+	uint32_t read_val = rte_read32((volatile void *)((char *)base + reg));
+	return rte_le_to_cpu_32(read_val);
 }
 
 #define hns3_write_dev(a, reg, value) \
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 243c286..2dda8fc 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1305,6 +1305,8 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t nb_desc,
 	rxq->configured = true;
 	rxq->io_base = (void *)((char *)hw->io_base + HNS3_TQP_REG_OFFSET +
 				idx * HNS3_TQP_REG_SIZE);
+	rxq->io_head_reg = (volatile void *)((char *)rxq->io_base +
+			   HNS3_RING_RX_HEAD_REG);
 	rxq->rx_buf_len = rx_buf_size;
 	rxq->l2_errors = 0;
 	rxq->pkt_len_errors = 0;
@@ -1448,16 +1450,6 @@ hns3_dev_supported_ptypes_get(struct rte_eth_dev *dev)
 	return NULL;
 }
 
-static void
-hns3_clean_rx_buffers(struct hns3_rx_queue *rxq, int count)
-{
-	rxq->next_to_use += count;
-	if (rxq->next_to_use >= rxq->nb_rx_desc)
-		rxq->next_to_use -= rxq->nb_rx_desc;
-
-	hns3_write_dev(rxq, HNS3_RING_RX_HEAD_REG, count);
-}
-
 static int
 hns3_handle_bdinfo(struct hns3_rx_queue *rxq, struct rte_mbuf *rxm,
 		   uint32_t bd_base_info, uint32_t l234_info,
@@ -1770,7 +1762,7 @@ hns3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 
 	rxq->rx_free_hold += nb_rx_bd;
 	if (rxq->rx_free_hold > rxq->rx_free_thresh) {
-		hns3_clean_rx_buffers(rxq, rxq->rx_free_hold);
+		hns3_write_reg_opt(rxq->io_head_reg, rxq->rx_free_hold);
 		rxq->rx_free_hold = 0;
 	}
 
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index 6211665..fd4f419 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -231,6 +231,7 @@ struct hns3_entry {
 
 struct hns3_rx_queue {
 	void *io_base;
+	volatile void *io_head_reg;
 	struct hns3_adapter *hns;
 	struct rte_mempool *mb_pool;
 	struct hns3_desc *rx_ring;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [dpdk-stable] [PATCH v4 19.11.6 6/6] net/hns3: fix TX checksum with fix header length
  2020-11-25 11:15     ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Lijun Ou
                         ` (4 preceding siblings ...)
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 5/6] net/hns3: reduce address calculation in Rx Lijun Ou
@ 2020-11-25 11:15       ` Lijun Ou
  2020-11-26  9:21       ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Luca Boccassi
  6 siblings, 0 replies; 42+ messages in thread
From: Lijun Ou @ 2020-11-25 11:15 UTC (permalink / raw)
  To: luca.boccassi, stable

From: Chengchang Tang <tangchengchang@huawei.com>

[ upstream commit fb6eb9009f41b823dddef8a09fe8e0b0698c2bf5 ]

Currently, the header length of all the layers are fixed, It would
lead to a cksum error when the header length changed.

This patch fixes above problem by using the header length in mbuf
instead of the fixed header length to perform the TX cksum offload.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.h |   2 +
 drivers/net/hns3/hns3_rxtx.c   | 253 +++++++++++++++++++++++------------------
 drivers/net/hns3/hns3_rxtx.h   |   7 +-
 3 files changed, 147 insertions(+), 115 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
index ba01622..8ba22dc 100644
--- a/drivers/net/hns3/hns3_ethdev.h
+++ b/drivers/net/hns3/hns3_ethdev.h
@@ -552,6 +552,8 @@ struct hns3_adapter {
 #define hns3_get_bit(origin, shift) \
 	hns3_get_field((origin), (0x1UL << (shift)), (shift))
 
+#define hns3_gen_field_val(mask, shift, val) (((val) << (shift)) & (mask))
+
 /*
  * upper_32_bits - return bits 32-63 of a number
  * A basic shift-right of a 64- or 32-bit quantity. Use this to suppress
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 2dda8fc..5971cf6 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -2051,180 +2051,211 @@ hns3_reassemble_tx_pkts(void *tx_queue, struct rte_mbuf *tx_pkt,
 }
 
 static void
-hns3_parse_outer_params(uint64_t ol_flags, uint32_t *ol_type_vlan_len_msec)
+hns3_parse_outer_params(struct rte_mbuf *m, uint32_t *ol_type_vlan_len_msec)
 {
 	uint32_t tmp = *ol_type_vlan_len_msec;
+	uint64_t ol_flags = m->ol_flags;
 
 	/* (outer) IP header type */
 	if (ol_flags & PKT_TX_OUTER_IPV4) {
-		/* OL3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv4_hdr) >> HNS3_L3_LEN_UNIT);
 		if (ol_flags & PKT_TX_OUTER_IP_CKSUM)
-			hns3_set_field(tmp, HNS3_TXD_OL3T_M,
-				       HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_CSUM);
+			tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M,
+					HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_CSUM);
 		else
-			hns3_set_field(tmp, HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
-				       HNS3_OL3T_IPV4_NO_CSUM);
+			tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M,
+				HNS3_TXD_OL3T_S, HNS3_OL3T_IPV4_NO_CSUM);
 	} else if (ol_flags & PKT_TX_OUTER_IPV6) {
-		hns3_set_field(tmp, HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
-			       HNS3_OL3T_IPV6);
-		/* OL3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv6_hdr) >> HNS3_L3_LEN_UNIT);
+		tmp |= hns3_gen_field_val(HNS3_TXD_OL3T_M, HNS3_TXD_OL3T_S,
+					HNS3_OL3T_IPV6);
 	}
-
+	/* OL3 header size, defined in 4 bytes */
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
+				m->outer_l3_len >> HNS3_L3_LEN_UNIT);
 	*ol_type_vlan_len_msec = tmp;
 }
 
 static int
-hns3_parse_inner_params(uint64_t ol_flags, uint32_t *ol_type_vlan_len_msec,
-			struct rte_net_hdr_lens *hdr_lens)
+hns3_parse_inner_params(struct rte_mbuf *m, uint32_t *ol_type_vlan_len_msec,
+			uint32_t *type_cs_vlan_tso_len)
 {
-	uint32_t tmp = *ol_type_vlan_len_msec;
-	uint8_t l4_len;
-
-	/* OL2 header size, defined in 2 bytes */
-	hns3_set_field(tmp, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
-		       sizeof(struct rte_ether_hdr) >> HNS3_L2_LEN_UNIT);
+#define HNS3_NVGRE_HLEN 8
+	uint32_t tmp_outer = *ol_type_vlan_len_msec;
+	uint32_t tmp_inner = *type_cs_vlan_tso_len;
+	uint64_t ol_flags = m->ol_flags;
+	uint16_t inner_l2_len;
 
-	/* L4TUNT: L4 Tunneling Type */
 	switch (ol_flags & PKT_TX_TUNNEL_MASK) {
 	case PKT_TX_TUNNEL_GENEVE:
 	case PKT_TX_TUNNEL_VXLAN:
-		/* MAC in UDP tunnelling packet, include VxLAN */
-		hns3_set_field(tmp, HNS3_TXD_TUNTYPE_M, HNS3_TXD_TUNTYPE_S,
-			       HNS3_TUN_MAC_IN_UDP);
+		/* MAC in UDP tunnelling packet, include VxLAN and GENEVE */
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_TUNTYPE_M,
+				HNS3_TXD_TUNTYPE_S, HNS3_TUN_MAC_IN_UDP);
 		/*
-		 * OL4 header size, defined in 4 Bytes, it contains outer
-		 * L4(UDP) length and tunneling length.
+		 * The inner l2 length of mbuf is the sum of outer l4 length,
+		 * tunneling header length and inner l2 length for a tunnel
+		 * packect. But in hns3 tx descriptor, the tunneling header
+		 * length is contained in the field of outer L4 length.
+		 * Therefore, driver need to calculate the outer L4 length and
+		 * inner L2 length.
 		 */
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       (uint8_t)RTE_ETHER_VXLAN_HLEN >>
-			       HNS3_L4_LEN_UNIT);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_L4LEN_M,
+						HNS3_TXD_L4LEN_S,
+						(uint8_t)RTE_ETHER_VXLAN_HLEN >>
+						HNS3_L4_LEN_UNIT);
+
+		inner_l2_len = m->l2_len - RTE_ETHER_VXLAN_HLEN;
 		break;
 	case PKT_TX_TUNNEL_GRE:
-		hns3_set_field(tmp, HNS3_TXD_TUNTYPE_M, HNS3_TXD_TUNTYPE_S,
-			       HNS3_TUN_NVGRE);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_TUNTYPE_M,
+					HNS3_TXD_TUNTYPE_S, HNS3_TUN_NVGRE);
 		/*
-		 * OL4 header size, defined in 4 Bytes, it contains outer
-		 * L4(GRE) length and tunneling length.
+		 * For NVGRE tunnel packect, the outer L4 is empty. So only
+		 * fill the NVGRE header length to the outer L4 field.
 		 */
-		l4_len = hdr_lens->l4_len + hdr_lens->tunnel_len;
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       l4_len >> HNS3_L4_LEN_UNIT);
+		tmp_outer |= hns3_gen_field_val(HNS3_TXD_L4LEN_M,
+				HNS3_TXD_L4LEN_S,
+				(uint8_t)HNS3_NVGRE_HLEN >> HNS3_L4_LEN_UNIT);
+
+		inner_l2_len = m->l2_len - HNS3_NVGRE_HLEN;
 		break;
 	default:
 		/* For non UDP / GRE tunneling, drop the tunnel packet */
 		return -EINVAL;
 	}
 
-	*ol_type_vlan_len_msec = tmp;
+	tmp_inner |= hns3_gen_field_val(HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
+					inner_l2_len >> HNS3_L2_LEN_UNIT);
+	/* OL2 header size, defined in 2 bytes */
+	tmp_outer |= hns3_gen_field_val(HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
+					m->outer_l2_len >> HNS3_L2_LEN_UNIT);
+
+	*type_cs_vlan_tso_len = tmp_inner;
+	*ol_type_vlan_len_msec = tmp_outer;
 
 	return 0;
 }
 
 static int
-hns3_parse_tunneling_params(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-			    uint64_t ol_flags,
-			    struct rte_net_hdr_lens *hdr_lens)
+hns3_parse_tunneling_params(struct hns3_tx_queue *txq, struct rte_mbuf *m,
+			    uint16_t tx_desc_id)
 {
 	struct hns3_desc *tx_ring = txq->tx_ring;
 	struct hns3_desc *desc = &tx_ring[tx_desc_id];
-	uint32_t value = 0;
+	uint32_t tmp_outer = 0;
+	uint32_t tmp_inner = 0;
 	int ret;
 
-	hns3_parse_outer_params(ol_flags, &value);
-	ret = hns3_parse_inner_params(ol_flags, &value, hdr_lens);
-	if (ret)
-		return -EINVAL;
+	/*
+	 * The tunnel header is contained in the inner L2 header field of the
+	 * mbuf, but for hns3 descriptor, it is contained in the outer L4. So,
+	 * there is a need that switching between them. To avoid multiple
+	 * calculations, the length of the L2 header include the outer and
+	 * inner, will be filled during the parsing of tunnel packects.
+	 */
+	if (!(m->ol_flags & PKT_TX_TUNNEL_MASK)) {
+		/*
+		 * For non tunnel type the tunnel type id is 0, so no need to
+		 * assign a value to it. Only the inner(normal) L2 header length
+		 * is assigned.
+		 */
+		tmp_inner |= hns3_gen_field_val(HNS3_TXD_L2LEN_M,
+			       HNS3_TXD_L2LEN_S, m->l2_len >> HNS3_L2_LEN_UNIT);
+	} else {
+		/*
+		 * If outer csum is not offload, the outer length may be filled
+		 * with 0. And the length of the outer header is added to the
+		 * inner l2_len. It would lead a cksum error. So driver has to
+		 * calculate the header length.
+		 */
+		if (unlikely(!(m->ol_flags & PKT_TX_OUTER_IP_CKSUM) &&
+					m->outer_l2_len == 0)) {
+			struct rte_net_hdr_lens hdr_len;
+			(void)rte_net_get_ptype(m, &hdr_len,
+					RTE_PTYPE_L2_MASK | RTE_PTYPE_L3_MASK);
+			m->outer_l3_len = hdr_len.l3_len;
+			m->outer_l2_len = hdr_len.l2_len;
+			m->l2_len = m->l2_len - hdr_len.l2_len - hdr_len.l3_len;
+		}
+		hns3_parse_outer_params(m, &tmp_outer);
+		ret = hns3_parse_inner_params(m, &tmp_outer, &tmp_inner);
+		if (ret)
+			return -EINVAL;
+	}
 
-	desc->tx.ol_type_vlan_len_msec |= rte_cpu_to_le_32(value);
+	desc->tx.ol_type_vlan_len_msec = rte_cpu_to_le_32(tmp_outer);
+	desc->tx.type_cs_vlan_tso_len = rte_cpu_to_le_32(tmp_inner);
 
 	return 0;
 }
 
 static void
-hns3_parse_l3_cksum_params(uint64_t ol_flags, uint32_t *type_cs_vlan_tso_len)
+hns3_parse_l3_cksum_params(struct rte_mbuf *m, uint32_t *type_cs_vlan_tso_len)
 {
+	uint64_t ol_flags = m->ol_flags;
+	uint32_t l3_type;
 	uint32_t tmp;
 
+	tmp = *type_cs_vlan_tso_len;
+	if (ol_flags & PKT_TX_IPV4)
+		l3_type = HNS3_L3T_IPV4;
+	else if (ol_flags & PKT_TX_IPV6)
+		l3_type = HNS3_L3T_IPV6;
+	else
+		l3_type = HNS3_L3T_NONE;
+
+	/* inner(/normal) L3 header size, defined in 4 bytes */
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
+					m->l3_len >> HNS3_L3_LEN_UNIT);
+
+	tmp |= hns3_gen_field_val(HNS3_TXD_L3T_M, HNS3_TXD_L3T_S, l3_type);
+
 	/* Enable L3 checksum offloads */
-	if (ol_flags & PKT_TX_IPV4) {
-		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L3T_M, HNS3_TXD_L3T_S,
-			       HNS3_L3T_IPV4);
-		/* inner(/normal) L3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv4_hdr) >> HNS3_L3_LEN_UNIT);
-		if (ol_flags & PKT_TX_IP_CKSUM)
-			hns3_set_bit(tmp, HNS3_TXD_L3CS_B, 1);
-		*type_cs_vlan_tso_len = tmp;
-	} else if (ol_flags & PKT_TX_IPV6) {
-		tmp = *type_cs_vlan_tso_len;
-		/* L3T, IPv6 don't do checksum */
-		hns3_set_field(tmp, HNS3_TXD_L3T_M, HNS3_TXD_L3T_S,
-			       HNS3_L3T_IPV6);
-		/* inner(/normal) L3 header size, defined in 4 bytes */
-		hns3_set_field(tmp, HNS3_TXD_L3LEN_M, HNS3_TXD_L3LEN_S,
-			       sizeof(struct rte_ipv6_hdr) >> HNS3_L3_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
-	}
+	if (ol_flags & PKT_TX_IP_CKSUM)
+		tmp |= BIT(HNS3_TXD_L3CS_B);
+	*type_cs_vlan_tso_len = tmp;
 }
 
 static void
-hns3_parse_l4_cksum_params(uint64_t ol_flags, uint32_t *type_cs_vlan_tso_len)
+hns3_parse_l4_cksum_params(struct rte_mbuf *m, uint32_t *type_cs_vlan_tso_len)
 {
+	uint64_t ol_flags = m->ol_flags;
 	uint32_t tmp;
-
 	/* Enable L4 checksum offloads */
 	switch (ol_flags & PKT_TX_L4_MASK) {
 	case PKT_TX_TCP_CKSUM:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_TCP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_tcp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_TCP);
 		break;
 	case PKT_TX_UDP_CKSUM:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_UDP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_udp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_UDP);
 		break;
 	case PKT_TX_SCTP_CKSUM:
 		tmp = *type_cs_vlan_tso_len;
-		hns3_set_field(tmp, HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
-			       HNS3_L4T_SCTP);
-		hns3_set_bit(tmp, HNS3_TXD_L4CS_B, 1);
-		hns3_set_field(tmp, HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
-			       sizeof(struct rte_sctp_hdr) >> HNS3_L4_LEN_UNIT);
-		*type_cs_vlan_tso_len = tmp;
+		tmp |= hns3_gen_field_val(HNS3_TXD_L4T_M, HNS3_TXD_L4T_S,
+					HNS3_L4T_SCTP);
 		break;
 	default:
-		break;
+		return;
 	}
+	tmp |= BIT(HNS3_TXD_L4CS_B);
+	tmp |= hns3_gen_field_val(HNS3_TXD_L4LEN_M, HNS3_TXD_L4LEN_S,
+					m->l4_len >> HNS3_L4_LEN_UNIT);
+	*type_cs_vlan_tso_len = tmp;
 }
 
 static void
-hns3_txd_enable_checksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-			 uint64_t ol_flags)
+hns3_txd_enable_checksum(struct hns3_tx_queue *txq, struct rte_mbuf *m,
+			 uint16_t tx_desc_id)
 {
 	struct hns3_desc *tx_ring = txq->tx_ring;
 	struct hns3_desc *desc = &tx_ring[tx_desc_id];
 	uint32_t value = 0;
 
-	/* inner(/normal) L2 header size, defined in 2 bytes */
-	hns3_set_field(value, HNS3_TXD_L2LEN_M, HNS3_TXD_L2LEN_S,
-		       sizeof(struct rte_ether_hdr) >> HNS3_L2_LEN_UNIT);
-
-	hns3_parse_l3_cksum_params(ol_flags, &value);
-	hns3_parse_l4_cksum_params(ol_flags, &value);
+	hns3_parse_l3_cksum_params(m, &value);
+	hns3_parse_l4_cksum_params(m, &value);
 
 	desc->tx.type_cs_vlan_tso_len |= rte_cpu_to_le_32(value);
 }
@@ -2259,18 +2290,23 @@ hns3_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
 
 static int
 hns3_parse_cksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
-		 const struct rte_mbuf *m, struct rte_net_hdr_lens *hdr_lens)
+		 struct rte_mbuf *m)
 {
-	/* Fill in tunneling parameters if necessary */
-	if (m->ol_flags & PKT_TX_TUNNEL_MASK) {
-		(void)rte_net_get_ptype(m, hdr_lens, RTE_PTYPE_ALL_MASK);
-		if (hns3_parse_tunneling_params(txq, tx_desc_id, m->ol_flags,
-						hdr_lens))
+	struct hns3_desc *tx_ring = txq->tx_ring;
+	struct hns3_desc *desc = &tx_ring[tx_desc_id];
+
+	/* Enable checksum offloading */
+	if (m->ol_flags & HNS3_TX_CKSUM_OFFLOAD_MASK) {
+		/* Fill in tunneling parameters if necessary */
+		if (hns3_parse_tunneling_params(txq, m, tx_desc_id))
 			return -EINVAL;
+
+		hns3_txd_enable_checksum(txq, m, tx_desc_id);
+	} else {
+		/* clear the control bit */
+		desc->tx.type_cs_vlan_tso_len  = 0;
+		desc->tx.ol_type_vlan_len_msec = 0;
 	}
-	/* Enable checksum offloading */
-	if (m->ol_flags & HNS3_TX_CKSUM_OFFLOAD_MASK)
-		hns3_txd_enable_checksum(txq, tx_desc_id, m->ol_flags);
 
 	return 0;
 }
@@ -2278,7 +2314,6 @@ hns3_parse_cksum(struct hns3_tx_queue *txq, uint16_t tx_desc_id,
 uint16_t
 hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
-	struct rte_net_hdr_lens hdr_lens = {0};
 	struct hns3_tx_queue *txq = tx_queue;
 	struct hns3_entry *tx_bak_pkt;
 	struct rte_mbuf *new_pkt;
@@ -2345,7 +2380,7 @@ hns3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 			nb_buf = m_seg->nb_segs;
 		}
 
-		if (hns3_parse_cksum(txq, tx_next_use, m_seg, &hdr_lens))
+		if (hns3_parse_cksum(txq, tx_next_use, m_seg))
 			goto end_of_tx;
 
 		i = 0;
diff --git a/drivers/net/hns3/hns3_rxtx.h b/drivers/net/hns3/hns3_rxtx.h
index fd4f419..88a7584 100644
--- a/drivers/net/hns3/hns3_rxtx.h
+++ b/drivers/net/hns3/hns3_rxtx.h
@@ -305,14 +305,9 @@ struct hns3_queue_info {
 };
 
 #define HNS3_TX_CKSUM_OFFLOAD_MASK ( \
-	PKT_TX_OUTER_IPV6 | \
-	PKT_TX_OUTER_IPV4 | \
 	PKT_TX_OUTER_IP_CKSUM | \
-	PKT_TX_IPV6 | \
-	PKT_TX_IPV4 | \
 	PKT_TX_IP_CKSUM | \
-	PKT_TX_L4_MASK | \
-	PKT_TX_TUNNEL_MASK)
+	PKT_TX_L4_MASK)
 
 enum hns3_cksum_status {
 	HNS3_CKSUM_NONE = 0,
-- 
2.7.4


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6
  2020-11-25 11:15     ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Lijun Ou
                         ` (5 preceding siblings ...)
  2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 6/6] net/hns3: fix TX checksum with fix header length Lijun Ou
@ 2020-11-26  9:21       ` Luca Boccassi
  6 siblings, 0 replies; 42+ messages in thread
From: Luca Boccassi @ 2020-11-26  9:21 UTC (permalink / raw)
  To: Lijun Ou, stable

On Wed, 2020-11-25 at 19:15 +0800, Lijun Ou wrote:
> His series are backport for 19.11.6 about hns3 PMD driver
> 
> Change from V3:
> 1. remove the relatived macro with TSO
> 
> Change from V2:
> 1. remove the relatived patch with TSO
> 
> Change from V1:
> 1. split TSO part from fix TX checksum with fix header length
> 2. remove TSO patch and the relatived patch
> 
> Chengchang Tang (2):
>   net/hns3: decrease non-nearby memory access in Rx
>   net/hns3: fix TX checksum with fix header length
> 
> Wei Hu (Xavier) (4):
>   net/hns3: fix reassembling multiple segment packets in Tx
>   net/hns3: report Rx drop packets enable configuration
>   net/hns3: report Rx free threshold
>   net/hns3: reduce address calculation in Rx
> 
>  drivers/net/hns3/hns3_ethdev.c    |  32 +++-
>  drivers/net/hns3/hns3_ethdev.h    |  31 +++-
>  drivers/net/hns3/hns3_ethdev_vf.c |  11 ++
>  drivers/net/hns3/hns3_rxtx.c      | 363 +++++++++++++++++++++++---------------
>  drivers/net/hns3/hns3_rxtx.h      |  30 +++-
>  5 files changed, 308 insertions(+), 159 deletions(-)

Thanks, applied and pushed.

-- 
Kind regards,
Luca Boccassi

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2020-11-26  9:22 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-13 13:36 [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Lijun Ou
2020-11-13 13:36 ` [dpdk-stable] [PATCH 19.11.6 01/13] net/hns3: support TSO Lijun Ou
2020-11-13 13:36 ` [dpdk-stable] [PATCH 19.11.6 02/13] net/hns3: check TSO segment size during Tx Lijun Ou
2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 03/13] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 04/13] net/hns3: support promiscuous and allmulticast mode for VF Lijun Ou
2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 05/13] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 06/13] net/hns3: cleanup duplicated code on processing TSO in Tx Lijun Ou
2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 07/13] net/hns3: fix inserted VLAN tag position " Lijun Ou
2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 08/13] net/hns3: report Tx descriptor segment limitations Lijun Ou
2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 09/13] net/hns3: report Rx drop packets enable configuration Lijun Ou
2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 10/13] net/hns3: report Rx free threshold Lijun Ou
2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 11/13] net/hns3: reduce address calculation in Rx Lijun Ou
2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 12/13] net/hns3: fix Tx checksum outer header prepare Lijun Ou
2020-11-13 13:37 ` [dpdk-stable] [PATCH 19.11.6 13/13] net/hns3: fix Tx checksum with fixed header length Lijun Ou
2020-11-23 15:52 ` [dpdk-stable] [PATCH 19.11.6 00/13] backport for 19.11.6 Luca Boccassi
2020-11-24 14:27   ` oulijun
2020-11-24 15:35     ` Luca Boccassi
2020-11-25  3:24 ` [dpdk-stable] [PATCH v2 19.11.6 0/7] " Lijun Ou
2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 1/7] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 2/7] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 3/7] net/hns3: report Tx descriptor segment limitations Lijun Ou
2020-11-25  9:16     ` Luca Boccassi
2020-11-25 10:59       ` oulijun
2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 4/7] net/hns3: report Rx drop packets enable configuration Lijun Ou
2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 5/7] net/hns3: report Rx free threshold Lijun Ou
2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 6/7] net/hns3: reduce address calculation in Rx Lijun Ou
2020-11-25  3:24   ` [dpdk-stable] [PATCH v2 19.11.6 7/7] net/hns3: fix TX checksum with fix header length Lijun Ou
2020-11-25 10:58   ` [dpdk-stable] [PATCH v3 19.11.6 0/6] backport for 19.11.6 Lijun Ou
2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 1/6] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 2/6] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 3/6] net/hns3: report Rx drop packets enable configuration Lijun Ou
2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 4/6] net/hns3: report Rx free threshold Lijun Ou
2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 5/6] net/hns3: reduce address calculation in Rx Lijun Ou
2020-11-25 10:58     ` [dpdk-stable] [PATCH v3 19.11.6 6/6] net/hns3: fix TX checksum with fix header length Lijun Ou
2020-11-25 11:15     ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Lijun Ou
2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 1/6] net/hns3: fix reassembling multiple segment packets in Tx Lijun Ou
2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 2/6] net/hns3: decrease non-nearby memory access in Rx Lijun Ou
2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 3/6] net/hns3: report Rx drop packets enable configuration Lijun Ou
2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 4/6] net/hns3: report Rx free threshold Lijun Ou
2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 5/6] net/hns3: reduce address calculation in Rx Lijun Ou
2020-11-25 11:15       ` [dpdk-stable] [PATCH v4 19.11.6 6/6] net/hns3: fix TX checksum with fix header length Lijun Ou
2020-11-26  9:21       ` [dpdk-stable] [PATCH v4 19.11.6 0/6] backport for 19.11.6 Luca Boccassi

patches for DPDK stable branches

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/stable/0 stable/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 stable stable/ https://inbox.dpdk.org/stable \
		stable@dpdk.org
	public-inbox-index stable

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.stable


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git