DPDK patches and discussions
* [dpdk-dev] [PATCH 0/5] Offload flags fixes
@ 2021-04-01  9:52 David Marchand
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 1/5] mbuf: mark old offload flag as deprecated David Marchand
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: David Marchand @ 2021-04-01  9:52 UTC (permalink / raw)
  To: dev; +Cc: maxime.coquelin, olivier.matz, fbl, i.maximets

The important part is the last patch on vhost handling of offloading
requests coming from a virtio guest interface.

The rest are small fixes that I accumulated while reviewing the mbuf
offload flags.

This last patch has the potential to break existing applications using
the vhost library (OVS being impacted).
I did not mark it for backport, but I am having second thoughts.

The vhost example has not been updated yet, as I wanted to send this
series first to get feedback before looking more into the example code.


-- 
David Marchand

David Marchand (5):
  mbuf: mark old offload flag as deprecated
  net/tap: do not touch Tx offload flags
  net/virtio: do not touch Tx offload flags
  net/virtio: refactor Tx offload helper
  vhost: fix offload flags in Rx path

 drivers/net/tap/rte_eth_tap.c                |  17 ++-
 drivers/net/virtio/virtio_rxtx.c             |   7 +-
 drivers/net/virtio/virtio_rxtx_packed_avx.h  |   2 +-
 drivers/net/virtio/virtio_rxtx_packed_neon.h |   2 +-
 drivers/net/virtio/virtqueue.h               |  81 +++++-----
 examples/vhost/main.c                        |   6 +
 lib/librte_mbuf/rte_mbuf_core.h              |   3 +-
 lib/librte_vhost/virtio_net.c                | 148 ++++++++-----------
 8 files changed, 123 insertions(+), 143 deletions(-)

-- 
2.23.0



* [dpdk-dev] [PATCH 1/5] mbuf: mark old offload flag as deprecated
  2021-04-01  9:52 [dpdk-dev] [PATCH 0/5] Offload flags fixes David Marchand
@ 2021-04-01  9:52 ` David Marchand
  2021-04-07 20:14   ` Flavio Leitner
  2021-04-08  7:23   ` Olivier Matz
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags David Marchand
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 22+ messages in thread
From: David Marchand @ 2021-04-01  9:52 UTC (permalink / raw)
  To: dev; +Cc: maxime.coquelin, olivier.matz, fbl, i.maximets

PKT_RX_EIP_CKSUM_BAD has been declared deprecated quite some time ago,
but there was no warning to applications still using it.
Fix this by marking it as deprecated with the newly introduced
RTE_DEPRECATED.
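
As an illustration only (hypothetical application code, not part of this
patch), any remaining use of the old name, such as:

        /* application code still using the deprecated flag name */
        if (m->ol_flags & PKT_RX_EIP_CKSUM_BAD)
                bad_outer_ip_csum++;    /* hypothetical counter */

is now expected to trigger a deprecation warning at build time, pointing
the application at PKT_RX_OUTER_IP_CKSUM_BAD.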

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 lib/librte_mbuf/rte_mbuf_core.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index c17dc95c51..bb38d7f581 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -83,7 +83,8 @@ extern "C" {
  * Deprecated.
  * This flag has been renamed, use PKT_RX_OUTER_IP_CKSUM_BAD instead.
  */
-#define PKT_RX_EIP_CKSUM_BAD PKT_RX_OUTER_IP_CKSUM_BAD
+#define PKT_RX_EIP_CKSUM_BAD \
+	RTE_DEPRECATED(PKT_RX_EIP_CKSUM_BAD) PKT_RX_OUTER_IP_CKSUM_BAD
 
 /**
  * A vlan has been stripped by the hardware and its tci is saved in
-- 
2.23.0



* [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags
  2021-04-01  9:52 [dpdk-dev] [PATCH 0/5] Offload flags fixes David Marchand
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 1/5] mbuf: mark old offload flag as deprecated David Marchand
@ 2021-04-01  9:52 ` David Marchand
  2021-04-07 20:15   ` Flavio Leitner
  2021-04-08  7:53   ` Olivier Matz
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 3/5] net/virtio: " David Marchand
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 22+ messages in thread
From: David Marchand @ 2021-04-01  9:52 UTC (permalink / raw)
  To: dev; +Cc: maxime.coquelin, olivier.matz, fbl, i.maximets, Keith Wiles

Tx offload flags are the application's responsibility.
Leave the mbuf alone and check for TSO where needed.
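
For illustration, a minimal sketch (not part of this patch) of the
application-side setup when requesting TSO, following the documented
PKT_TX_TCP_SEG requirements:

        /* application prepares an IPv4/TCP mbuf for TSO before Tx */
        m->l2_len = sizeof(struct rte_ether_hdr);
        m->l3_len = sizeof(struct rte_ipv4_hdr);      /* no IP options */
        m->l4_len = sizeof(struct rte_tcp_hdr);       /* no TCP options */
        m->tso_segsz = 1400;                          /* example MSS */
        m->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CKSUM | PKT_TX_TCP_SEG;
        /* whether PKT_TX_TCP_CKSUM must also be set explicitly is
         * discussed later in this thread */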

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/tap/rte_eth_tap.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index c36d4bf76e..285fe395c5 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -562,6 +562,7 @@ tap_tx_l3_cksum(char *packet, uint64_t ol_flags, unsigned int l2_len,
 		uint16_t *l4_phdr_cksum, uint32_t *l4_raw_cksum)
 {
 	void *l3_hdr = packet + l2_len;
+	uint64_t csum_l4;
 
 	if (ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_IPV4)) {
 		struct rte_ipv4_hdr *iph = l3_hdr;
@@ -571,13 +572,17 @@ tap_tx_l3_cksum(char *packet, uint64_t ol_flags, unsigned int l2_len,
 		cksum = rte_raw_cksum(iph, l3_len);
 		iph->hdr_checksum = (cksum == 0xffff) ? cksum : ~cksum;
 	}
-	if (ol_flags & PKT_TX_L4_MASK) {
+
+	csum_l4 = ol_flags & PKT_TX_L4_MASK;
+	if (ol_flags & PKT_TX_TCP_SEG)
+		csum_l4 |= PKT_TX_TCP_CKSUM;
+	if (csum_l4) {
 		void *l4_hdr;
 
 		l4_hdr = packet + l2_len + l3_len;
-		if ((ol_flags & PKT_TX_L4_MASK) == PKT_TX_UDP_CKSUM)
+		if (csum_l4 == PKT_TX_UDP_CKSUM)
 			*l4_cksum = &((struct rte_udp_hdr *)l4_hdr)->dgram_cksum;
-		else if ((ol_flags & PKT_TX_L4_MASK) == PKT_TX_TCP_CKSUM)
+		else if (csum_l4 == PKT_TX_TCP_CKSUM)
 			*l4_cksum = &((struct rte_tcp_hdr *)l4_hdr)->cksum;
 		else
 			return;
@@ -648,7 +653,8 @@ tap_write_mbufs(struct tx_queue *txq, uint16_t num_mbufs,
 		if (txq->csum &&
 		    ((mbuf->ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_IPV4) ||
 		     (mbuf->ol_flags & PKT_TX_L4_MASK) == PKT_TX_UDP_CKSUM ||
-		     (mbuf->ol_flags & PKT_TX_L4_MASK) == PKT_TX_TCP_CKSUM))) {
+		     (mbuf->ol_flags & PKT_TX_L4_MASK) == PKT_TX_TCP_CKSUM) ||
+		     (mbuf->ol_flags & PKT_TX_TCP_SEG))) {
 			is_cksum = 1;
 
 			/* Support only packets with at least layer 4
@@ -742,9 +748,6 @@ pmd_tx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		if (tso) {
 			struct rte_gso_ctx *gso_ctx = &txq->gso_ctx;
 
-			/* TCP segmentation implies TCP checksum offload */
-			mbuf_in->ol_flags |= PKT_TX_TCP_CKSUM;
-
 			/* gso size is calculated without RTE_ETHER_CRC_LEN */
 			hdrs_len = mbuf_in->l2_len + mbuf_in->l3_len +
 					mbuf_in->l4_len;
-- 
2.23.0



* [dpdk-dev] [PATCH 3/5] net/virtio: do not touch Tx offload flags
  2021-04-01  9:52 [dpdk-dev] [PATCH 0/5] Offload flags fixes David Marchand
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 1/5] mbuf: mark old offload flag as deprecated David Marchand
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags David Marchand
@ 2021-04-01  9:52 ` David Marchand
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 4/5] net/virtio: refactor Tx offload helper David Marchand
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 5/5] vhost: fix offload flags in Rx path David Marchand
  4 siblings, 0 replies; 22+ messages in thread
From: David Marchand @ 2021-04-01  9:52 UTC (permalink / raw)
  To: dev; +Cc: maxime.coquelin, olivier.matz, fbl, i.maximets, Chenbo Xia

Tx offload flags are the application's responsibility.
Leave the mbuf alone and use local storage for the implicit TCP checksum
offload in the TSO case.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/virtio/virtqueue.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 71b66f3208..2e8826bc28 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -618,10 +618,12 @@ virtqueue_xmit_offload(struct virtio_net_hdr *hdr,
 			uint8_t offload)
 {
 	if (offload) {
+		uint64_t csum_l4 = cookie->ol_flags & PKT_TX_L4_MASK;
+
 		if (cookie->ol_flags & PKT_TX_TCP_SEG)
-			cookie->ol_flags |= PKT_TX_TCP_CKSUM;
+			csum_l4 |= PKT_TX_TCP_CKSUM;
 
-		switch (cookie->ol_flags & PKT_TX_L4_MASK) {
+		switch (csum_l4) {
 		case PKT_TX_UDP_CKSUM:
 			hdr->csum_start = cookie->l2_len + cookie->l3_len;
 			hdr->csum_offset = offsetof(struct rte_udp_hdr,
-- 
2.23.0



* [dpdk-dev] [PATCH 4/5] net/virtio: refactor Tx offload helper
  2021-04-01  9:52 [dpdk-dev] [PATCH 0/5] Offload flags fixes David Marchand
                   ` (2 preceding siblings ...)
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 3/5] net/virtio: " David Marchand
@ 2021-04-01  9:52 ` David Marchand
  2021-04-08 13:05   ` Flavio Leitner
  2021-04-09  2:31   ` Ruifeng Wang
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 5/5] vhost: fix offload flags in Rx path David Marchand
  4 siblings, 2 replies; 22+ messages in thread
From: David Marchand @ 2021-04-01  9:52 UTC (permalink / raw)
  To: dev
  Cc: maxime.coquelin, olivier.matz, fbl, i.maximets, Chenbo Xia,
	Bruce Richardson, Konstantin Ananyev, Jerin Jacob, Ruifeng Wang

Purely cosmetic, but it is rather odd to have an "offload" helper that
checks whether it actually must do something.
We already have the same checks in most callers, so move this branch
into them.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/virtio/virtio_rxtx.c             |  7 +-
 drivers/net/virtio/virtio_rxtx_packed_avx.h  |  2 +-
 drivers/net/virtio/virtio_rxtx_packed_neon.h |  2 +-
 drivers/net/virtio/virtqueue.h               | 83 +++++++++-----------
 4 files changed, 44 insertions(+), 50 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 40283001b0..a4e37ef379 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -448,7 +448,7 @@ virtqueue_enqueue_xmit_inorder(struct virtnet_tx *txvq,
 		if (!vq->hw->has_tx_offload)
 			virtqueue_clear_net_hdr(hdr);
 		else
-			virtqueue_xmit_offload(hdr, cookies[i], true);
+			virtqueue_xmit_offload(hdr, cookies[i]);
 
 		start_dp[idx].addr  = rte_mbuf_data_iova(cookies[i]) - head_size;
 		start_dp[idx].len   = cookies[i]->data_len + head_size;
@@ -495,7 +495,7 @@ virtqueue_enqueue_xmit_packed_fast(struct virtnet_tx *txvq,
 	if (!vq->hw->has_tx_offload)
 		virtqueue_clear_net_hdr(hdr);
 	else
-		virtqueue_xmit_offload(hdr, cookie, true);
+		virtqueue_xmit_offload(hdr, cookie);
 
 	dp->addr = rte_mbuf_data_iova(cookie) - head_size;
 	dp->len  = cookie->data_len + head_size;
@@ -581,7 +581,8 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
 		idx = start_dp[idx].next;
 	}
 
-	virtqueue_xmit_offload(hdr, cookie, vq->hw->has_tx_offload);
+	if (vq->hw->has_tx_offload)
+		virtqueue_xmit_offload(hdr, cookie);
 
 	do {
 		start_dp[idx].addr  = rte_mbuf_data_iova(cookie);
diff --git a/drivers/net/virtio/virtio_rxtx_packed_avx.h b/drivers/net/virtio/virtio_rxtx_packed_avx.h
index 49e845d02a..33cac3244f 100644
--- a/drivers/net/virtio/virtio_rxtx_packed_avx.h
+++ b/drivers/net/virtio/virtio_rxtx_packed_avx.h
@@ -115,7 +115,7 @@ virtqueue_enqueue_batch_packed_vec(struct virtnet_tx *txvq,
 		virtio_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) {
 			hdr = rte_pktmbuf_mtod_offset(tx_pkts[i],
 					struct virtio_net_hdr *, -head_size);
-			virtqueue_xmit_offload(hdr, tx_pkts[i], true);
+			virtqueue_xmit_offload(hdr, tx_pkts[i]);
 		}
 	}
 
diff --git a/drivers/net/virtio/virtio_rxtx_packed_neon.h b/drivers/net/virtio/virtio_rxtx_packed_neon.h
index 851c81f312..1a49caf8af 100644
--- a/drivers/net/virtio/virtio_rxtx_packed_neon.h
+++ b/drivers/net/virtio/virtio_rxtx_packed_neon.h
@@ -134,7 +134,7 @@ virtqueue_enqueue_batch_packed_vec(struct virtnet_tx *txvq,
 		virtio_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) {
 			hdr = rte_pktmbuf_mtod_offset(tx_pkts[i],
 					struct virtio_net_hdr *, -head_size);
-			virtqueue_xmit_offload(hdr, tx_pkts[i], true);
+			virtqueue_xmit_offload(hdr, tx_pkts[i]);
 		}
 	}
 
diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 2e8826bc28..41a9b82a5f 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -613,52 +613,44 @@ virtqueue_notify(struct virtqueue *vq)
 } while (0)
 
 static inline void
-virtqueue_xmit_offload(struct virtio_net_hdr *hdr,
-			struct rte_mbuf *cookie,
-			uint8_t offload)
+virtqueue_xmit_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *cookie)
 {
-	if (offload) {
-		uint64_t csum_l4 = cookie->ol_flags & PKT_TX_L4_MASK;
-
-		if (cookie->ol_flags & PKT_TX_TCP_SEG)
-			csum_l4 |= PKT_TX_TCP_CKSUM;
-
-		switch (csum_l4) {
-		case PKT_TX_UDP_CKSUM:
-			hdr->csum_start = cookie->l2_len + cookie->l3_len;
-			hdr->csum_offset = offsetof(struct rte_udp_hdr,
-				dgram_cksum);
-			hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
-			break;
-
-		case PKT_TX_TCP_CKSUM:
-			hdr->csum_start = cookie->l2_len + cookie->l3_len;
-			hdr->csum_offset = offsetof(struct rte_tcp_hdr, cksum);
-			hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
-			break;
-
-		default:
-			ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
-			ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
-			ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
-			break;
-		}
+	uint64_t csum_l4 = cookie->ol_flags & PKT_TX_L4_MASK;
+
+	if (cookie->ol_flags & PKT_TX_TCP_SEG)
+		csum_l4 |= PKT_TX_TCP_CKSUM;
+
+	switch (csum_l4) {
+	case PKT_TX_UDP_CKSUM:
+		hdr->csum_start = cookie->l2_len + cookie->l3_len;
+		hdr->csum_offset = offsetof(struct rte_udp_hdr, dgram_cksum);
+		hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
+		break;
+
+	case PKT_TX_TCP_CKSUM:
+		hdr->csum_start = cookie->l2_len + cookie->l3_len;
+		hdr->csum_offset = offsetof(struct rte_tcp_hdr, cksum);
+		hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
+		break;
+
+	default:
+		ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
+		ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
+		ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
+		break;
+	}
 
-		/* TCP Segmentation Offload */
-		if (cookie->ol_flags & PKT_TX_TCP_SEG) {
-			hdr->gso_type = (cookie->ol_flags & PKT_TX_IPV6) ?
-				VIRTIO_NET_HDR_GSO_TCPV6 :
-				VIRTIO_NET_HDR_GSO_TCPV4;
-			hdr->gso_size = cookie->tso_segsz;
-			hdr->hdr_len =
-				cookie->l2_len +
-				cookie->l3_len +
-				cookie->l4_len;
-		} else {
-			ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
-			ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
-			ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
-		}
+	/* TCP Segmentation Offload */
+	if (cookie->ol_flags & PKT_TX_TCP_SEG) {
+		hdr->gso_type = (cookie->ol_flags & PKT_TX_IPV6) ?
+			VIRTIO_NET_HDR_GSO_TCPV6 :
+			VIRTIO_NET_HDR_GSO_TCPV4;
+		hdr->gso_size = cookie->tso_segsz;
+		hdr->hdr_len = cookie->l2_len + cookie->l3_len + cookie->l4_len;
+	} else {
+		ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
+		ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
+		ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
 	}
 }
 
@@ -737,7 +729,8 @@ virtqueue_enqueue_xmit_packed(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
 		}
 	}
 
-	virtqueue_xmit_offload(hdr, cookie, vq->hw->has_tx_offload);
+	if (vq->hw->has_tx_offload)
+		virtqueue_xmit_offload(hdr, cookie);
 
 	do {
 		uint16_t flags;
-- 
2.23.0



* [dpdk-dev] [PATCH 5/5] vhost: fix offload flags in Rx path
  2021-04-01  9:52 [dpdk-dev] [PATCH 0/5] Offload flags fixes David Marchand
                   ` (3 preceding siblings ...)
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 4/5] net/virtio: refactor Tx offload helper David Marchand
@ 2021-04-01  9:52 ` David Marchand
  2021-04-08  8:28   ` Olivier Matz
  2021-04-08 18:38   ` Flavio Leitner
  4 siblings, 2 replies; 22+ messages in thread
From: David Marchand @ 2021-04-01  9:52 UTC (permalink / raw)
  To: dev
  Cc: maxime.coquelin, olivier.matz, fbl, i.maximets, Chenbo Xia,
	Jijiang Liu, Yuanhan Liu

The vhost library currently configures Tx offloading (PKT_TX_*) on any
packet received from a guest virtio device that asks for some offloading.

This is problematic, as Tx offloading is something that the application
must ask for: the application needs to configure its devices to support
every offload in use (IP and TCP checksums, TSO...), and the various
l2/l3/l4 lengths must be set following any processing that happened in
the application itself.

On the other hand, the received packets are not marked with the current
l3/l4 checksum status.

Copy the virtio Rx processing to fix those offload flags.

The vhost example needs reworking, as it was built on the assumption
that the mbuf TSO configuration is set up by the vhost library.
This is not done in this patch for now, so TSO activation is forcibly
refused.
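
For illustration only, a sketch of the application side after this
change (hypothetical code, the exact policy is up to the application):
the library now reports standard Rx flags and the application remains
responsible for any Tx offload setup before re-transmitting:

        /* after rte_vhost_dequeue_burst(), m carries PKT_RX_* flags */
        if ((m->ol_flags & PKT_RX_L4_CKSUM_MASK) == PKT_RX_L4_CKSUM_NONE) {
                /* the guest requested checksum offload: compute it in
                 * software or request it from the egress device
                 * (PKT_TX_* flags plus l2/l3/l4_len) */
        }
        if (m->ol_flags & PKT_RX_LRO) {
                /* coalesced packet: the application must arrange
                 * TSO/GSO itself, m->tso_segsz was filled from the
                 * virtio header */
        }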

Fixes: 859b480d5afd ("vhost: add guest offload setting")

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 examples/vhost/main.c         |   6 ++
 lib/librte_vhost/virtio_net.c | 148 ++++++++++++++--------------------
 2 files changed, 67 insertions(+), 87 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 2ca7d98c58..819cd9909f 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -607,6 +607,12 @@ us_vhost_parse_args(int argc, char **argv)
 				us_vhost_usage(prgname);
 				return -1;
 			}
+			/* FIXME: tso support is broken */
+			if (ret != 0) {
+				RTE_LOG(INFO, VHOST_CONFIG, "TSO support is broken\n");
+				us_vhost_usage(prgname);
+				return -1;
+			}
 			enable_tso = ret;
 			break;
 
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 583bf379c6..06089a4206 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -8,6 +8,7 @@
 
 #include <rte_mbuf.h>
 #include <rte_memcpy.h>
+#include <rte_net.h>
 #include <rte_ether.h>
 #include <rte_ip.h>
 #include <rte_vhost.h>
@@ -1821,105 +1822,75 @@ virtio_net_with_host_offload(struct virtio_net *dev)
 	return false;
 }
 
-static void
-parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr)
-{
-	struct rte_ipv4_hdr *ipv4_hdr;
-	struct rte_ipv6_hdr *ipv6_hdr;
-	void *l3_hdr = NULL;
-	struct rte_ether_hdr *eth_hdr;
-	uint16_t ethertype;
-
-	eth_hdr = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
-
-	m->l2_len = sizeof(struct rte_ether_hdr);
-	ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
-
-	if (ethertype == RTE_ETHER_TYPE_VLAN) {
-		struct rte_vlan_hdr *vlan_hdr =
-			(struct rte_vlan_hdr *)(eth_hdr + 1);
-
-		m->l2_len += sizeof(struct rte_vlan_hdr);
-		ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
-	}
-
-	l3_hdr = (char *)eth_hdr + m->l2_len;
-
-	switch (ethertype) {
-	case RTE_ETHER_TYPE_IPV4:
-		ipv4_hdr = l3_hdr;
-		*l4_proto = ipv4_hdr->next_proto_id;
-		m->l3_len = rte_ipv4_hdr_len(ipv4_hdr);
-		*l4_hdr = (char *)l3_hdr + m->l3_len;
-		m->ol_flags |= PKT_TX_IPV4;
-		break;
-	case RTE_ETHER_TYPE_IPV6:
-		ipv6_hdr = l3_hdr;
-		*l4_proto = ipv6_hdr->proto;
-		m->l3_len = sizeof(struct rte_ipv6_hdr);
-		*l4_hdr = (char *)l3_hdr + m->l3_len;
-		m->ol_flags |= PKT_TX_IPV6;
-		break;
-	default:
-		m->l3_len = 0;
-		*l4_proto = 0;
-		*l4_hdr = NULL;
-		break;
-	}
-}
-
-static __rte_always_inline void
+static __rte_always_inline int
 vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m)
 {
-	uint16_t l4_proto = 0;
-	void *l4_hdr = NULL;
-	struct rte_tcp_hdr *tcp_hdr = NULL;
+	struct rte_net_hdr_lens hdr_lens;
+	uint32_t hdrlen, ptype;
+	int l4_supported = 0;
 
+	/* nothing to do */
 	if (hdr->flags == 0 && hdr->gso_type == VIRTIO_NET_HDR_GSO_NONE)
-		return;
-
-	parse_ethernet(m, &l4_proto, &l4_hdr);
-	if (hdr->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-		if (hdr->csum_start == (m->l2_len + m->l3_len)) {
-			switch (hdr->csum_offset) {
-			case (offsetof(struct rte_tcp_hdr, cksum)):
-				if (l4_proto == IPPROTO_TCP)
-					m->ol_flags |= PKT_TX_TCP_CKSUM;
-				break;
-			case (offsetof(struct rte_udp_hdr, dgram_cksum)):
-				if (l4_proto == IPPROTO_UDP)
-					m->ol_flags |= PKT_TX_UDP_CKSUM;
-				break;
-			case (offsetof(struct rte_sctp_hdr, cksum)):
-				if (l4_proto == IPPROTO_SCTP)
-					m->ol_flags |= PKT_TX_SCTP_CKSUM;
-				break;
-			default:
-				break;
-			}
+		return 0;
+
+	m->ol_flags |= PKT_RX_IP_CKSUM_UNKNOWN;
+
+	ptype = rte_net_get_ptype(m, &hdr_lens, RTE_PTYPE_ALL_MASK);
+	m->packet_type = ptype;
+	if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP ||
+	    (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_UDP ||
+	    (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_SCTP)
+		l4_supported = 1;
+
+	if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
+		hdrlen = hdr_lens.l2_len + hdr_lens.l3_len + hdr_lens.l4_len;
+		if (hdr->csum_start <= hdrlen && l4_supported) {
+			m->ol_flags |= PKT_RX_L4_CKSUM_NONE;
+		} else {
+			/* Unknown proto or tunnel, do sw cksum. We can assume
+			 * the cksum field is in the first segment since the
+			 * buffers we provided to the host are large enough.
+			 * In case of SCTP, this will be wrong since it's a CRC
+			 * but there's nothing we can do.
+			 */
+			uint16_t csum = 0, off;
+
+			if (rte_raw_cksum_mbuf(m, hdr->csum_start,
+				rte_pktmbuf_pkt_len(m) - hdr->csum_start,
+				&csum) < 0)
+				return -EINVAL;
+			if (likely(csum != 0xffff))
+				csum = ~csum;
+			off = hdr->csum_offset + hdr->csum_start;
+			if (rte_pktmbuf_data_len(m) >= off + 1)
+				*rte_pktmbuf_mtod_offset(m, uint16_t *,
+					off) = csum;
 		}
+	} else if (hdr->flags & VIRTIO_NET_HDR_F_DATA_VALID && l4_supported) {
+		m->ol_flags |= PKT_RX_L4_CKSUM_GOOD;
 	}
 
-	if (l4_hdr && hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
+	/* GSO request, save required information in mbuf */
+	if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
+		/* Check unsupported modes */
+		if ((hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN) ||
+		    (hdr->gso_size == 0)) {
+			return -EINVAL;
+		}
+
+		/* Update mss lengths in mbuf */
+		m->tso_segsz = hdr->gso_size;
 		switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
 		case VIRTIO_NET_HDR_GSO_TCPV4:
 		case VIRTIO_NET_HDR_GSO_TCPV6:
-			tcp_hdr = l4_hdr;
-			m->ol_flags |= PKT_TX_TCP_SEG;
-			m->tso_segsz = hdr->gso_size;
-			m->l4_len = (tcp_hdr->data_off & 0xf0) >> 2;
-			break;
-		case VIRTIO_NET_HDR_GSO_UDP:
-			m->ol_flags |= PKT_TX_UDP_SEG;
-			m->tso_segsz = hdr->gso_size;
-			m->l4_len = sizeof(struct rte_udp_hdr);
+			m->ol_flags |= PKT_RX_LRO | PKT_RX_L4_CKSUM_NONE;
 			break;
 		default:
-			VHOST_LOG_DATA(WARNING,
-				"unsupported gso type %u.\n", hdr->gso_type);
-			break;
+			return -EINVAL;
 		}
 	}
+
+	return 0;
 }
 
 static __rte_noinline void
@@ -2078,8 +2049,11 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	prev->data_len = mbuf_offset;
 	m->pkt_len    += mbuf_offset;
 
-	if (hdr)
-		vhost_dequeue_offload(hdr, m);
+	if (hdr && vhost_dequeue_offload(hdr, m) < 0) {
+		VHOST_LOG_DATA(ERR, "Packet with invalid offloads.\n");
+		error = -1;
+		goto out;
+	}
 
 out:
 
-- 
2.23.0



* Re: [dpdk-dev] [PATCH 1/5] mbuf: mark old offload flag as deprecated
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 1/5] mbuf: mark old offload flag as deprecated David Marchand
@ 2021-04-07 20:14   ` Flavio Leitner
  2021-04-08  7:23   ` Olivier Matz
  1 sibling, 0 replies; 22+ messages in thread
From: Flavio Leitner @ 2021-04-07 20:14 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, maxime.coquelin, olivier.matz, i.maximets

On Thu, Apr 01, 2021 at 11:52:39AM +0200, David Marchand wrote:
> PKT_RX_EIP_CKSUM_BAD has been declared deprecated quite some time ago,
> but there was no warning to applications still using it.
> Fix this by marking as deprecated with the newly introduced
> RTE_DEPRECATED.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---

Reviewed-by: Flavio Leitner <fbl@sysclose.org>



* Re: [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags David Marchand
@ 2021-04-07 20:15   ` Flavio Leitner
  2021-04-08  7:41     ` Olivier Matz
  2021-04-08  7:53   ` Olivier Matz
  1 sibling, 1 reply; 22+ messages in thread
From: Flavio Leitner @ 2021-04-07 20:15 UTC (permalink / raw)
  To: David Marchand
  Cc: dev, maxime.coquelin, olivier.matz, i.maximets, Keith Wiles

On Thu, Apr 01, 2021 at 11:52:40AM +0200, David Marchand wrote:
> Tx offload flags are of the application responsibility.
> Leave the mbuf alone and check for TSO where needed.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---

The patch looks good, but maybe a better approach would be
to change the documentation to require the TCP_CKSUM flag
when TCP_SEG is used, otherwise this flag adjusting needs
to be replicated every time TCP_SEG is used.

The above could break existing applications, so perhaps doing
something like below would be better and backwards compatible?
Then we can remove those places tweaking the flags completely.

diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index c17dc95c5..6a0c2cdd9 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -298,7 +298,7 @@ extern "C" {
  *  - if it's IPv4, set the PKT_TX_IP_CKSUM flag
  *  - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
  */
-#define PKT_TX_TCP_SEG       (1ULL << 50)
+#define PKT_TX_TCP_SEG       (1ULL << 50) | PKT_TX_TCP_CKSUM
 
 /** TX IEEE1588 packet to timestamp. */
 #define PKT_TX_IEEE1588_TMST (1ULL << 51)

Thanks,
fbl


* Re: [dpdk-dev] [PATCH 1/5] mbuf: mark old offload flag as deprecated
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 1/5] mbuf: mark old offload flag as deprecated David Marchand
  2021-04-07 20:14   ` Flavio Leitner
@ 2021-04-08  7:23   ` Olivier Matz
  2021-04-08  8:41     ` David Marchand
  1 sibling, 1 reply; 22+ messages in thread
From: Olivier Matz @ 2021-04-08  7:23 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, maxime.coquelin, fbl, i.maximets

On Thu, Apr 01, 2021 at 11:52:39AM +0200, David Marchand wrote:
> PKT_RX_EIP_CKSUM_BAD has been declared deprecated quite some time ago,

It's not that old, it was done by Lance in commit e8a419d6de4b ("mbuf:
rename outer IP checksum macro") 1 month ago.

> but there was no warning to applications still using it.
> Fix this by marking as deprecated with the newly introduced
> RTE_DEPRECATED.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>


* Re: [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags
  2021-04-07 20:15   ` Flavio Leitner
@ 2021-04-08  7:41     ` Olivier Matz
  2021-04-08 11:21       ` Flavio Leitner
  0 siblings, 1 reply; 22+ messages in thread
From: Olivier Matz @ 2021-04-08  7:41 UTC (permalink / raw)
  To: Flavio Leitner
  Cc: David Marchand, dev, maxime.coquelin, i.maximets, Keith Wiles

On Wed, Apr 07, 2021 at 05:15:39PM -0300, Flavio Leitner wrote:
> On Thu, Apr 01, 2021 at 11:52:40AM +0200, David Marchand wrote:
> > Tx offload flags are of the application responsibility.
> > Leave the mbuf alone and check for TSO where needed.
> > 
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > ---
> 
> The patch looks good, but maybe a better approach would be
> to change the documentation to require the TCP_CKSUM flag
> when TCP_SEG is used, otherwise this flag adjusting needs
> to be replicated every time TCP_SEG is used.
> 
> The above could break existing applications, so perhaps doing
> something like below would be better and backwards compatible?
> Then we can remove those places tweaking the flags completely.

As a first step, I suggest to document that:
- applications must set TCP_CKSUM when setting TCP_SEG
- pmds must suppose that TCP_CKSUM is set when TCP_SEG is set

This is clearer than what we have today, and I think it does not break
anything. This will guide apps in the correct direction, facilitating
an eventual future PMD change.
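
As an illustration (sketch only), the PMD-side half of that rule is the
pattern already used in this series:

        uint64_t csum_l4 = m->ol_flags & PKT_TX_L4_MASK;

        if (m->ol_flags & PKT_TX_TCP_SEG)
                csum_l4 |= PKT_TX_TCP_CKSUM;    /* TSO implies TCP cksum */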

> diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> index c17dc95c5..6a0c2cdd9 100644
> --- a/lib/librte_mbuf/rte_mbuf_core.h
> +++ b/lib/librte_mbuf/rte_mbuf_core.h
> @@ -298,7 +298,7 @@ extern "C" {
>   *  - if it's IPv4, set the PKT_TX_IP_CKSUM flag
>   *  - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
>   */
> -#define PKT_TX_TCP_SEG       (1ULL << 50)
> +#define PKT_TX_TCP_SEG       (1ULL << 50) | PKT_TX_TCP_CKSUM
>  
>  /** TX IEEE1588 packet to timestamp. */
>  #define PKT_TX_IEEE1588_TMST (1ULL << 51)

I'm afraid some applications or drivers use extended bit manipulations
to do the conversion from/to another domain (like hardware descriptors
or application-specific flags). They may expect this constant to be a
unique flag.
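
For example (hypothetical code, not taken from an existing PMD), such a
combined definition silently changes the meaning of common patterns:

        /* this bit test would also match packets that only requested
         * TCP checksum offload, not TSO */
        if (m->ol_flags & PKT_TX_TCP_SEG)
                do_tso_setup(m);                /* hypothetical helper */

        /* and clearing the "flag" would also clear PKT_TX_TCP_CKSUM */
        m->ol_flags &= ~PKT_TX_TCP_SEG;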


* Re: [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags David Marchand
  2021-04-07 20:15   ` Flavio Leitner
@ 2021-04-08  7:53   ` Olivier Matz
  1 sibling, 0 replies; 22+ messages in thread
From: Olivier Matz @ 2021-04-08  7:53 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, maxime.coquelin, fbl, i.maximets, Keith Wiles

On Thu, Apr 01, 2021 at 11:52:40AM +0200, David Marchand wrote:
> Tx offload flags are of the application responsibility.
> Leave the mbuf alone and check for TSO where needed.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>

Maybe the problem being solved should be better described in the commit
log. Is it a problem (other than cosmetic) to touch a mbuf in the Tx
function of a driver, where we could expect that the mbuf is owned by
the driver?

The only problem I can think of is the case where we transmit a direct mbuf
whose refcnt is increased, but I wonder how much this is really
supported: for instance, several drivers add vlans using
rte_vlan_insert() in their Tx path.
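
A sketch of that scenario (purely illustrative, hypothetical application
code):

        /* the application keeps a reference to the mbuf across Tx */
        rte_mbuf_refcnt_update(m, 1);
        rte_eth_tx_burst(port_id, queue_id, &m, 1);
        /* if the PMD modified m->ol_flags (or inserted a VLAN header),
         * the mbuf the application still references has changed
         * under it */
        app_retransmit_later(m);        /* hypothetical helper */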


> ---
>  drivers/net/tap/rte_eth_tap.c | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
> index c36d4bf76e..285fe395c5 100644
> --- a/drivers/net/tap/rte_eth_tap.c
> +++ b/drivers/net/tap/rte_eth_tap.c
> @@ -562,6 +562,7 @@ tap_tx_l3_cksum(char *packet, uint64_t ol_flags, unsigned int l2_len,
>  		uint16_t *l4_phdr_cksum, uint32_t *l4_raw_cksum)
>  {
>  	void *l3_hdr = packet + l2_len;
> +	uint64_t csum_l4;
>  
>  	if (ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_IPV4)) {
>  		struct rte_ipv4_hdr *iph = l3_hdr;
> @@ -571,13 +572,17 @@ tap_tx_l3_cksum(char *packet, uint64_t ol_flags, unsigned int l2_len,
>  		cksum = rte_raw_cksum(iph, l3_len);
>  		iph->hdr_checksum = (cksum == 0xffff) ? cksum : ~cksum;
>  	}
> -	if (ol_flags & PKT_TX_L4_MASK) {
> +
> +	csum_l4 = ol_flags & PKT_TX_L4_MASK;
> +	if (ol_flags & PKT_TX_TCP_SEG)
> +		csum_l4 |= PKT_TX_TCP_CKSUM;
> +	if (csum_l4) {
>  		void *l4_hdr;
>  
>  		l4_hdr = packet + l2_len + l3_len;
> -		if ((ol_flags & PKT_TX_L4_MASK) == PKT_TX_UDP_CKSUM)
> +		if (csum_l4 == PKT_TX_UDP_CKSUM)
>  			*l4_cksum = &((struct rte_udp_hdr *)l4_hdr)->dgram_cksum;
> -		else if ((ol_flags & PKT_TX_L4_MASK) == PKT_TX_TCP_CKSUM)
> +		else if (csum_l4 == PKT_TX_TCP_CKSUM)
>  			*l4_cksum = &((struct rte_tcp_hdr *)l4_hdr)->cksum;
>  		else
>  			return;
> @@ -648,7 +653,8 @@ tap_write_mbufs(struct tx_queue *txq, uint16_t num_mbufs,
>  		if (txq->csum &&
>  		    ((mbuf->ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_IPV4) ||
>  		     (mbuf->ol_flags & PKT_TX_L4_MASK) == PKT_TX_UDP_CKSUM ||
> -		     (mbuf->ol_flags & PKT_TX_L4_MASK) == PKT_TX_TCP_CKSUM))) {
> +		     (mbuf->ol_flags & PKT_TX_L4_MASK) == PKT_TX_TCP_CKSUM) ||
> +		     (mbuf->ol_flags & PKT_TX_TCP_SEG))) {
>  			is_cksum = 1;
>  
>  			/* Support only packets with at least layer 4
> @@ -742,9 +748,6 @@ pmd_tx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>  		if (tso) {
>  			struct rte_gso_ctx *gso_ctx = &txq->gso_ctx;
>  
> -			/* TCP segmentation implies TCP checksum offload */
> -			mbuf_in->ol_flags |= PKT_TX_TCP_CKSUM;
> -
>  			/* gso size is calculated without RTE_ETHER_CRC_LEN */
>  			hdrs_len = mbuf_in->l2_len + mbuf_in->l3_len +
>  					mbuf_in->l4_len;
> -- 
> 2.23.0
> 


* Re: [dpdk-dev] [PATCH 5/5] vhost: fix offload flags in Rx path
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 5/5] vhost: fix offload flags in Rx path David Marchand
@ 2021-04-08  8:28   ` Olivier Matz
  2021-04-08 18:38   ` Flavio Leitner
  1 sibling, 0 replies; 22+ messages in thread
From: Olivier Matz @ 2021-04-08  8:28 UTC (permalink / raw)
  To: David Marchand
  Cc: dev, maxime.coquelin, fbl, i.maximets, Chenbo Xia, Jijiang Liu,
	Yuanhan Liu

Hi David,

On Thu, Apr 01, 2021 at 11:52:43AM +0200, David Marchand wrote:
> The vhost library current configures Tx offloading (PKT_TX_*) on any
> packet received from a guest virtio device which asks for some offloading.
> 
> This is problematic, as Tx offloading is something that the application
> must ask for: the application needs to configure devices
> to support every used offloads (ip, tcp checksumming, tso..), and the
> various l2/l3/l4 lengths must be set following any processing that
> happened in the application itself.
> 
> On the other hand, the received packets are not marked wrt current
> packet l3/l4 checksumming info.
> 
> Copy virtio rx processing to fix those offload flags.
> 
> The vhost example needs a reworking as it was built with the assumption
> that mbuf TSO configuration is set up by the vhost library.
> This is not done in this patch for now so TSO activation is forcibly
> refused.
> 
> Fixes: 859b480d5afd ("vhost: add guest offload setting")
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---

Reviewed-by: Olivier Matz <olivier.matz@6wind.com>

LGTM, just one little comment below.

<...>

> +	m->ol_flags |= PKT_RX_IP_CKSUM_UNKNOWN;
> +
> +	ptype = rte_net_get_ptype(m, &hdr_lens, RTE_PTYPE_ALL_MASK);
> +	m->packet_type = ptype;
> +	if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP ||
> +	    (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_UDP ||
> +	    (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_SCTP)
> +		l4_supported = 1;
> +
> +	if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> +		hdrlen = hdr_lens.l2_len + hdr_lens.l3_len + hdr_lens.l4_len;
> +		if (hdr->csum_start <= hdrlen && l4_supported) {
> +			m->ol_flags |= PKT_RX_L4_CKSUM_NONE;
> +		} else {
> +			/* Unknown proto or tunnel, do sw cksum. We can assume
> +			 * the cksum field is in the first segment since the
> +			 * buffers we provided to the host are large enough.
> +			 * In case of SCTP, this will be wrong since it's a CRC
> +			 * but there's nothing we can do.
> +			 */
> +			uint16_t csum = 0, off;
> +
> +			if (rte_raw_cksum_mbuf(m, hdr->csum_start,
> +				rte_pktmbuf_pkt_len(m) - hdr->csum_start,
> +				&csum) < 0)
> +				return -EINVAL;
> +			if (likely(csum != 0xffff))
> +				csum = ~csum;

I was trying to remember the reason for this last test (which is also
present in net/virtio).

If this is a UDP checksum (on top of an unrecognized tunnel), it's
indeed needed to do that, because we don't want to set the checksum to 0
in the packet (which means "no checksum" for UDPv4, or is fordidden for
UDPv6).

If this is something other than UDP, it shouldn't hurt to have 0xffff in the
packet instead of 0.

Maybe it deserves a comment here, like:

  /* avoid 0 checksum for UDP, shouldn't hurt for other protocols */

What do you think?


* Re: [dpdk-dev] [PATCH 1/5] mbuf: mark old offload flag as deprecated
  2021-04-08  7:23   ` Olivier Matz
@ 2021-04-08  8:41     ` David Marchand
  0 siblings, 0 replies; 22+ messages in thread
From: David Marchand @ 2021-04-08  8:41 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Maxime Coquelin, Flavio Leitner, Ilya Maximets

On Thu, Apr 8, 2021 at 9:24 AM Olivier Matz <olivier.matz@6wind.com> wrote:
>
> On Thu, Apr 01, 2021 at 11:52:39AM +0200, David Marchand wrote:
> > PKT_RX_EIP_CKSUM_BAD has been declared deprecated quite some time ago,
>
> It's not that old, it was done by Lance in commit e8a419d6de4b ("mbuf:
> rename outer IP checksum macro") 1 month ago.

Err, I was pretty sure it was older... I probably misread some date.
Ok, I'll reword this and add a Fixes: tag just for info.


-- 
David Marchand



* Re: [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags
  2021-04-08  7:41     ` Olivier Matz
@ 2021-04-08 11:21       ` Flavio Leitner
  2021-04-08 12:05         ` Olivier Matz
  2021-04-08 12:16         ` Ananyev, Konstantin
  0 siblings, 2 replies; 22+ messages in thread
From: Flavio Leitner @ 2021-04-08 11:21 UTC (permalink / raw)
  To: Olivier Matz
  Cc: David Marchand, dev, maxime.coquelin, i.maximets, Keith Wiles

On Thu, Apr 08, 2021 at 09:41:59AM +0200, Olivier Matz wrote:
> On Wed, Apr 07, 2021 at 05:15:39PM -0300, Flavio Leitner wrote:
> > On Thu, Apr 01, 2021 at 11:52:40AM +0200, David Marchand wrote:
> > > Tx offload flags are of the application responsibility.
> > > Leave the mbuf alone and check for TSO where needed.
> > > 
> > > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > > ---
> > 
> > The patch looks good, but maybe a better approach would be
> > to change the documentation to require the TCP_CKSUM flag
> > when TCP_SEG is used, otherwise this flag adjusting needs
> > to be replicated every time TCP_SEG is used.
> > 
> > The above could break existing applications, so perhaps doing
> > something like below would be better and backwards compatible?
> > Then we can remove those places tweaking the flags completely.
> 
> As a first step, I suggest to document that:
> - applications must set TCP_CKSUM when setting TCP_SEG

That's what I suggest above.

> - pmds must suppose that TCP_CKSUM is set when TCP_SEG is set

But that keeps the problem of implying the TCP_CKSUM flag in
various places.

> This is clearer that what we have today, and I think it does not break
> anything. This will guide apps in the correct direction, facilitating
> an eventual future PMD change.
> 
> > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > index c17dc95c5..6a0c2cdd9 100644
> > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > @@ -298,7 +298,7 @@ extern "C" {
> >   *  - if it's IPv4, set the PKT_TX_IP_CKSUM flag
> >   *  - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
> >   */
> > -#define PKT_TX_TCP_SEG       (1ULL << 50)
> > +#define PKT_TX_TCP_SEG       (1ULL << 50) | PKT_TX_TCP_CKSUM
> >  
> >  /** TX IEEE1588 packet to timestamp. */
> >  #define PKT_TX_IEEE1588_TMST (1ULL << 51)
> 
> I'm afraid some applications or drivers use extended bit manipulations
> to do the conversion from/to another domain (like hardware descriptors
> or application-specific flags). They may expect this constant to be a
> uniq flag.

Interesting, do you have an example? Because each flag still has a
separate meaning.

-- 
fbl


* Re: [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags
  2021-04-08 11:21       ` Flavio Leitner
@ 2021-04-08 12:05         ` Olivier Matz
  2021-04-08 12:58           ` Flavio Leitner
  2021-04-08 12:16         ` Ananyev, Konstantin
  1 sibling, 1 reply; 22+ messages in thread
From: Olivier Matz @ 2021-04-08 12:05 UTC (permalink / raw)
  To: Flavio Leitner
  Cc: David Marchand, dev, maxime.coquelin, i.maximets, Keith Wiles

On Thu, Apr 08, 2021 at 08:21:58AM -0300, Flavio Leitner wrote:
> On Thu, Apr 08, 2021 at 09:41:59AM +0200, Olivier Matz wrote:
> > On Wed, Apr 07, 2021 at 05:15:39PM -0300, Flavio Leitner wrote:
> > > On Thu, Apr 01, 2021 at 11:52:40AM +0200, David Marchand wrote:
> > > > Tx offload flags are of the application responsibility.
> > > > Leave the mbuf alone and check for TSO where needed.
> > > > 
> > > > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > > > ---
> > > 
> > > The patch looks good, but maybe a better approach would be
> > > to change the documentation to require the TCP_CKSUM flag
> > > when TCP_SEG is used, otherwise this flag adjusting needs
> > > to be replicated every time TCP_SEG is used.
> > > 
> > > The above could break existing applications, so perhaps doing
> > > something like below would be better and backwards compatible?
> > > Then we can remove those places tweaking the flags completely.
> > 
> > As a first step, I suggest to document that:
> > - applications must set TCP_CKSUM when setting TCP_SEG
> 
> That's what I suggest above.
> 
> > - pmds must suppose that TCP_CKSUM is set when TCP_SEG is set
> 
> But that keeps the problem of implying the TCP_CKSUM flag in
> various places.

Yes. What I propose is just a first step: better document what is the
current expected behavior, before doing something else.

> > This is clearer that what we have today, and I think it does not break
> > anything. This will guide apps in the correct direction, facilitating
> > an eventual future PMD change.
> > 
> > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > index c17dc95c5..6a0c2cdd9 100644
> > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > @@ -298,7 +298,7 @@ extern "C" {
> > >   *  - if it's IPv4, set the PKT_TX_IP_CKSUM flag
> > >   *  - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
> > >   */
> > > -#define PKT_TX_TCP_SEG       (1ULL << 50)
> > > +#define PKT_TX_TCP_SEG       (1ULL << 50) | PKT_TX_TCP_CKSUM
> > >  
> > >  /** TX IEEE1588 packet to timestamp. */
> > >  #define PKT_TX_IEEE1588_TMST (1ULL << 51)
> > 
> > I'm afraid some applications or drivers use extended bit manipulations
> > to do the conversion from/to another domain (like hardware descriptors
> > or application-specific flags). They may expect this constant to be a
> > uniq flag.
> 
> Interesting, do you have an example? Because each flag still has an
> separate meaning.

Honestly no, I don't have any good example, just a (maybe unfounded) doubt.

I have in mind operations that are done with tables or vector
instructions inside the drivers, but this is mainly done for Rx, not Tx.
You can look at Tx functions like mlx5_set_cksum_table() or
nix_xmit_pkts_vector(), or Rx functions like desc_to_olflags_v() or
enic_noscatter_vec_recv_pkts() to see what kind of stuff I'm talking
about.


* Re: [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags
  2021-04-08 11:21       ` Flavio Leitner
  2021-04-08 12:05         ` Olivier Matz
@ 2021-04-08 12:16         ` Ananyev, Konstantin
  1 sibling, 0 replies; 22+ messages in thread
From: Ananyev, Konstantin @ 2021-04-08 12:16 UTC (permalink / raw)
  To: Flavio Leitner, Olivier Matz
  Cc: David Marchand, dev, maxime.coquelin, i.maximets, Wiles, Keith



> 
> On Thu, Apr 08, 2021 at 09:41:59AM +0200, Olivier Matz wrote:
> > On Wed, Apr 07, 2021 at 05:15:39PM -0300, Flavio Leitner wrote:
> > > On Thu, Apr 01, 2021 at 11:52:40AM +0200, David Marchand wrote:
> > > > Tx offload flags are of the application responsibility.
> > > > Leave the mbuf alone and check for TSO where needed.
> > > >
> > > > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > > > ---
> > >
> > > The patch looks good, but maybe a better approach would be
> > > to change the documentation to require the TCP_CKSUM flag
> > > when TCP_SEG is used, otherwise this flag adjusting needs
> > > to be replicated every time TCP_SEG is used.
> > >
> > > The above could break existing applications, so perhaps doing
> > > something like below would be better and backwards compatible?
> > > Then we can remove those places tweaking the flags completely.
> >
> > As a first step, I suggest to document that:
> > - applications must set TCP_CKSUM when setting TCP_SEG
> 
> That's what I suggest above.
> 
> > - pmds must suppose that TCP_CKSUM is set when TCP_SEG is set
> 
> But that keeps the problem of implying the TCP_CKSUM flag in
> various places.
> 
> > This is clearer that what we have today, and I think it does not break
> > anything. This will guide apps in the correct direction, facilitating
> > an eventual future PMD change.
> >
> > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > index c17dc95c5..6a0c2cdd9 100644
> > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > @@ -298,7 +298,7 @@ extern "C" {
> > >   *  - if it's IPv4, set the PKT_TX_IP_CKSUM flag
> > >   *  - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
> > >   */
> > > -#define PKT_TX_TCP_SEG       (1ULL << 50)
> > > +#define PKT_TX_TCP_SEG       (1ULL << 50) | PKT_TX_TCP_CKSUM

I think that would be an ABI breakage.

> > >
> > >  /** TX IEEE1588 packet to timestamp. */
> > >  #define PKT_TX_IEEE1588_TMST (1ULL << 51)
> >
> > I'm afraid some applications or drivers use extended bit manipulations
> > to do the conversion from/to another domain (like hardware descriptors
> > or application-specific flags). They may expect this constant to be a
> > uniq flag.
> 
> Interesting, do you have an example? Because each flag still has an
> separate meaning.


 


* Re: [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags
  2021-04-08 12:05         ` Olivier Matz
@ 2021-04-08 12:58           ` Flavio Leitner
  2021-04-09 13:30             ` Olivier Matz
  0 siblings, 1 reply; 22+ messages in thread
From: Flavio Leitner @ 2021-04-08 12:58 UTC (permalink / raw)
  To: Olivier Matz
  Cc: David Marchand, dev, maxime.coquelin, i.maximets, Keith Wiles

On Thu, Apr 08, 2021 at 02:05:21PM +0200, Olivier Matz wrote:
> On Thu, Apr 08, 2021 at 08:21:58AM -0300, Flavio Leitner wrote:
> > On Thu, Apr 08, 2021 at 09:41:59AM +0200, Olivier Matz wrote:
> > > On Wed, Apr 07, 2021 at 05:15:39PM -0300, Flavio Leitner wrote:
> > > > On Thu, Apr 01, 2021 at 11:52:40AM +0200, David Marchand wrote:
> > > > > Tx offload flags are of the application responsibility.
> > > > > Leave the mbuf alone and check for TSO where needed.
> > > > > 
> > > > > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > > > > ---
> > > > 
> > > > The patch looks good, but maybe a better approach would be
> > > > to change the documentation to require the TCP_CKSUM flag
> > > > when TCP_SEG is used, otherwise this flag adjusting needs
> > > > to be replicated every time TCP_SEG is used.
> > > > 
> > > > The above could break existing applications, so perhaps doing
> > > > something like below would be better and backwards compatible?
> > > > Then we can remove those places tweaking the flags completely.
> > > 
> > > As a first step, I suggest to document that:
> > > - applications must set TCP_CKSUM when setting TCP_SEG
> > 
> > That's what I suggest above.
> > 
> > > - pmds must suppose that TCP_CKSUM is set when TCP_SEG is set
> > 
> > But that keeps the problem of implying the TCP_CKSUM flag in
> > various places.
> 
> Yes. What I propose is just a first step: better document what is the
> current expected behavior, before doing something else.
> 
> > > This is clearer that what we have today, and I think it does not break
> > > anything. This will guide apps in the correct direction, facilitating
> > > an eventual future PMD change.
> > > 
> > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > > index c17dc95c5..6a0c2cdd9 100644
> > > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > > @@ -298,7 +298,7 @@ extern "C" {
> > > >   *  - if it's IPv4, set the PKT_TX_IP_CKSUM flag
> > > >   *  - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
> > > >   */
> > > > -#define PKT_TX_TCP_SEG       (1ULL << 50)
> > > > +#define PKT_TX_TCP_SEG       (1ULL << 50) | PKT_TX_TCP_CKSUM
> > > >  
> > > >  /** TX IEEE1588 packet to timestamp. */
> > > >  #define PKT_TX_IEEE1588_TMST (1ULL << 51)
> > > 
> > > I'm afraid some applications or drivers use extended bit manipulations
> > > to do the conversion from/to another domain (like hardware descriptors
> > > or application-specific flags). They may expect this constant to be a
> > > uniq flag.
> > 
> > Interesting, do you have an example? Because each flag still has an
> > separate meaning.
> 
> Honnestly no, I don't have any good example, just a (maybe unfounded) doubt.
> 
> I have in mind operations that are done with tables or vector
> instructions inside the drivers, but this is mainly done for Rx, not Tx.
> You can look at Tx functions like mlx5_set_cksum_table() or
> nix_xmit_pkts_vector(), or Rx functions like desc_to_olflags_v() or
> enic_noscatter_vec_recv_pkts() to see what kind of stuff I'm talking
> about.

I see your point. Going back to improving the documentation as a
first step, what would be the next steps? Are we going to wait few
releases and then remove the flag tweaking code assuming that PMDs
and apps are ok?

Thanks,
-- 
fbl


* Re: [dpdk-dev] [PATCH 4/5] net/virtio: refactor Tx offload helper
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 4/5] net/virtio: refactor Tx offload helper David Marchand
@ 2021-04-08 13:05   ` Flavio Leitner
  2021-04-09  2:31   ` Ruifeng Wang
  1 sibling, 0 replies; 22+ messages in thread
From: Flavio Leitner @ 2021-04-08 13:05 UTC (permalink / raw)
  To: David Marchand
  Cc: dev, maxime.coquelin, olivier.matz, i.maximets, Chenbo Xia,
	Bruce Richardson, Konstantin Ananyev, Jerin Jacob, Ruifeng Wang

On Thu, Apr 01, 2021 at 11:52:42AM +0200, David Marchand wrote:
> Purely cosmetic but it is rather odd to have an "offload" helper that
> checks if it actually must do something.
> We already have the same checks in most callers, so move this branch
> in them.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---

Reviewed-by: Flavio Leitner <fbl@sysclose.org>



* Re: [dpdk-dev] [PATCH 5/5] vhost: fix offload flags in Rx path
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 5/5] vhost: fix offload flags in Rx path David Marchand
  2021-04-08  8:28   ` Olivier Matz
@ 2021-04-08 18:38   ` Flavio Leitner
  1 sibling, 0 replies; 22+ messages in thread
From: Flavio Leitner @ 2021-04-08 18:38 UTC (permalink / raw)
  To: David Marchand
  Cc: dev, maxime.coquelin, olivier.matz, i.maximets, Chenbo Xia,
	Jijiang Liu, Yuanhan Liu

On Thu, Apr 01, 2021 at 11:52:43AM +0200, David Marchand wrote:
> The vhost library current configures Tx offloading (PKT_TX_*) on any
> packet received from a guest virtio device which asks for some offloading.
> 
> This is problematic, as Tx offloading is something that the application
> must ask for: the application needs to configure devices
> to support every used offloads (ip, tcp checksumming, tso..), and the
> various l2/l3/l4 lengths must be set following any processing that
> happened in the application itself.
> 
> On the other hand, the received packets are not marked wrt current
> packet l3/l4 checksumming info.
> 
> Copy virtio rx processing to fix those offload flags.
> 
> The vhost example needs a reworking as it was built with the assumption
> that mbuf TSO configuration is set up by the vhost library.
> This is not done in this patch for now so TSO activation is forcibly
> refused.
> 
> Fixes: 859b480d5afd ("vhost: add guest offload setting")

There is a behavior change: before, ECN was ignored, and now it is
invalid. I think that's the right way to go, but I am not sure whether
virtio blocks the negotiation of that feature.

Reviewed-by: Flavio Leitner <fbl@sysclose.org>

fbl


* Re: [dpdk-dev] [PATCH 4/5] net/virtio: refactor Tx offload helper
  2021-04-01  9:52 ` [dpdk-dev] [PATCH 4/5] net/virtio: refactor Tx offload helper David Marchand
  2021-04-08 13:05   ` Flavio Leitner
@ 2021-04-09  2:31   ` Ruifeng Wang
  1 sibling, 0 replies; 22+ messages in thread
From: Ruifeng Wang @ 2021-04-09  2:31 UTC (permalink / raw)
  To: David Marchand, dev
  Cc: maxime.coquelin, olivier.matz, fbl, i.maximets, Chenbo Xia,
	Bruce Richardson, Konstantin Ananyev, jerinj, nd

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Thursday, April 1, 2021 5:53 PM
> To: dev@dpdk.org
> Cc: maxime.coquelin@redhat.com; olivier.matz@6wind.com;
> fbl@sysclose.org; i.maximets@ovn.org; Chenbo Xia <chenbo.xia@intel.com>;
> Bruce Richardson <bruce.richardson@intel.com>; Konstantin Ananyev
> <konstantin.ananyev@intel.com>; jerinj@marvell.com; Ruifeng Wang
> <Ruifeng.Wang@arm.com>
> Subject: [PATCH 4/5] net/virtio: refactor Tx offload helper
> 
> Purely cosmetic but it is rather odd to have an "offload" helper that checks if
> it actually must do something.
> We already have the same checks in most callers, so move this branch in
> them.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
>  drivers/net/virtio/virtio_rxtx.c             |  7 +-
>  drivers/net/virtio/virtio_rxtx_packed_avx.h  |  2 +-
> drivers/net/virtio/virtio_rxtx_packed_neon.h |  2 +-
>  drivers/net/virtio/virtqueue.h               | 83 +++++++++-----------
>  4 files changed, 44 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> index 40283001b0..a4e37ef379 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -448,7 +448,7 @@ virtqueue_enqueue_xmit_inorder(struct virtnet_tx
> *txvq,
>  		if (!vq->hw->has_tx_offload)
>  			virtqueue_clear_net_hdr(hdr);
>  		else
> -			virtqueue_xmit_offload(hdr, cookies[i], true);
> +			virtqueue_xmit_offload(hdr, cookies[i]);
> 
>  		start_dp[idx].addr  = rte_mbuf_data_iova(cookies[i]) -
> head_size;
>  		start_dp[idx].len   = cookies[i]->data_len + head_size;
> @@ -495,7 +495,7 @@ virtqueue_enqueue_xmit_packed_fast(struct
> virtnet_tx *txvq,
>  	if (!vq->hw->has_tx_offload)
>  		virtqueue_clear_net_hdr(hdr);
>  	else
> -		virtqueue_xmit_offload(hdr, cookie, true);
> +		virtqueue_xmit_offload(hdr, cookie);
> 
>  	dp->addr = rte_mbuf_data_iova(cookie) - head_size;
>  	dp->len  = cookie->data_len + head_size; @@ -581,7 +581,8 @@
> virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
>  		idx = start_dp[idx].next;
>  	}
> 
> -	virtqueue_xmit_offload(hdr, cookie, vq->hw->has_tx_offload);
> +	if (vq->hw->has_tx_offload)
> +		virtqueue_xmit_offload(hdr, cookie);
> 
>  	do {
>  		start_dp[idx].addr  = rte_mbuf_data_iova(cookie); diff --git
> a/drivers/net/virtio/virtio_rxtx_packed_avx.h
> b/drivers/net/virtio/virtio_rxtx_packed_avx.h
> index 49e845d02a..33cac3244f 100644
> --- a/drivers/net/virtio/virtio_rxtx_packed_avx.h
> +++ b/drivers/net/virtio/virtio_rxtx_packed_avx.h
> @@ -115,7 +115,7 @@ virtqueue_enqueue_batch_packed_vec(struct
> virtnet_tx *txvq,
>  		virtio_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) {
>  			hdr = rte_pktmbuf_mtod_offset(tx_pkts[i],
>  					struct virtio_net_hdr *, -head_size);
> -			virtqueue_xmit_offload(hdr, tx_pkts[i], true);
> +			virtqueue_xmit_offload(hdr, tx_pkts[i]);
>  		}
>  	}
> 
> diff --git a/drivers/net/virtio/virtio_rxtx_packed_neon.h
> b/drivers/net/virtio/virtio_rxtx_packed_neon.h
> index 851c81f312..1a49caf8af 100644
> --- a/drivers/net/virtio/virtio_rxtx_packed_neon.h
> +++ b/drivers/net/virtio/virtio_rxtx_packed_neon.h
> @@ -134,7 +134,7 @@ virtqueue_enqueue_batch_packed_vec(struct
> virtnet_tx *txvq,
>  		virtio_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) {
>  			hdr = rte_pktmbuf_mtod_offset(tx_pkts[i],
>  					struct virtio_net_hdr *, -head_size);
> -			virtqueue_xmit_offload(hdr, tx_pkts[i], true);
> +			virtqueue_xmit_offload(hdr, tx_pkts[i]);
>  		}
>  	}
> 
> diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
> index 2e8826bc28..41a9b82a5f 100644
> --- a/drivers/net/virtio/virtqueue.h
> +++ b/drivers/net/virtio/virtqueue.h
> @@ -613,52 +613,44 @@ virtqueue_notify(struct virtqueue *vq)  } while (0)
> 
>  static inline void
> -virtqueue_xmit_offload(struct virtio_net_hdr *hdr,
> -			struct rte_mbuf *cookie,
> -			uint8_t offload)
> +virtqueue_xmit_offload(struct virtio_net_hdr *hdr, struct rte_mbuf
> +*cookie)
>  {
> -	if (offload) {
> -		uint64_t csum_l4 = cookie->ol_flags & PKT_TX_L4_MASK;
> -
> -		if (cookie->ol_flags & PKT_TX_TCP_SEG)
> -			csum_l4 |= PKT_TX_TCP_CKSUM;
> -
> -		switch (csum_l4) {
> -		case PKT_TX_UDP_CKSUM:
> -			hdr->csum_start = cookie->l2_len + cookie->l3_len;
> -			hdr->csum_offset = offsetof(struct rte_udp_hdr,
> -				dgram_cksum);
> -			hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
> -			break;
> -
> -		case PKT_TX_TCP_CKSUM:
> -			hdr->csum_start = cookie->l2_len + cookie->l3_len;
> -			hdr->csum_offset = offsetof(struct rte_tcp_hdr, cksum);
> -			hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
> -			break;
> -
> -		default:
> -			ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
> -			ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
> -			ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
> -			break;
> -		}
> +	uint64_t csum_l4 = cookie->ol_flags & PKT_TX_L4_MASK;
> +
> +	if (cookie->ol_flags & PKT_TX_TCP_SEG)
> +		csum_l4 |= PKT_TX_TCP_CKSUM;
> +
> +	switch (csum_l4) {
> +	case PKT_TX_UDP_CKSUM:
> +		hdr->csum_start = cookie->l2_len + cookie->l3_len;
> +		hdr->csum_offset = offsetof(struct rte_udp_hdr, dgram_cksum);
> +		hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
> +		break;
> +
> +	case PKT_TX_TCP_CKSUM:
> +		hdr->csum_start = cookie->l2_len + cookie->l3_len;
> +		hdr->csum_offset = offsetof(struct rte_tcp_hdr, cksum);
> +		hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
> +		break;
> +
> +	default:
> +		ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
> +		ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
> +		ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
> +		break;
> +	}
> 
> -		/* TCP Segmentation Offload */
> -		if (cookie->ol_flags & PKT_TX_TCP_SEG) {
> -			hdr->gso_type = (cookie->ol_flags & PKT_TX_IPV6) ?
> -				VIRTIO_NET_HDR_GSO_TCPV6 :
> -				VIRTIO_NET_HDR_GSO_TCPV4;
> -			hdr->gso_size = cookie->tso_segsz;
> -			hdr->hdr_len =
> -				cookie->l2_len +
> -				cookie->l3_len +
> -				cookie->l4_len;
> -		} else {
> -			ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
> -			ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
> -			ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
> -		}
> +	/* TCP Segmentation Offload */
> +	if (cookie->ol_flags & PKT_TX_TCP_SEG) {
> +		hdr->gso_type = (cookie->ol_flags & PKT_TX_IPV6) ?
> +			VIRTIO_NET_HDR_GSO_TCPV6 :
> +			VIRTIO_NET_HDR_GSO_TCPV4;
> +		hdr->gso_size = cookie->tso_segsz;
> +		hdr->hdr_len = cookie->l2_len + cookie->l3_len + cookie->l4_len;
> +	} else {
> +		ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
> +		ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
> +		ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
>  	}
>  }
> 
> @@ -737,7 +729,8 @@ virtqueue_enqueue_xmit_packed(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
>  		}
>  	}
> 
> -	virtqueue_xmit_offload(hdr, cookie, vq->hw->has_tx_offload);
> +	if (vq->hw->has_tx_offload)
> +		virtqueue_xmit_offload(hdr, cookie);
> 
>  	do {
>  		uint16_t flags;
> --
> 2.23.0

Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags
  2021-04-08 12:58           ` Flavio Leitner
@ 2021-04-09 13:30             ` Olivier Matz
  2021-04-09 16:55               ` Flavio Leitner
  0 siblings, 1 reply; 22+ messages in thread
From: Olivier Matz @ 2021-04-09 13:30 UTC (permalink / raw)
  To: Flavio Leitner
  Cc: David Marchand, dev, maxime.coquelin, i.maximets, Keith Wiles

On Thu, Apr 08, 2021 at 09:58:35AM -0300, Flavio Leitner wrote:
> On Thu, Apr 08, 2021 at 02:05:21PM +0200, Olivier Matz wrote:
> > On Thu, Apr 08, 2021 at 08:21:58AM -0300, Flavio Leitner wrote:
> > > On Thu, Apr 08, 2021 at 09:41:59AM +0200, Olivier Matz wrote:
> > > > On Wed, Apr 07, 2021 at 05:15:39PM -0300, Flavio Leitner wrote:
> > > > > On Thu, Apr 01, 2021 at 11:52:40AM +0200, David Marchand wrote:
> > > > > > Tx offload flags are of the application responsibility.
> > > > > > Leave the mbuf alone and check for TSO where needed.
> > > > > > 
> > > > > > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > > > > > ---
> > > > > 
> > > > > The patch looks good, but maybe a better approach would be
> > > > > to change the documentation to require the TCP_CKSUM flag
> > > > > when TCP_SEG is used, otherwise this flag adjusting needs
> > > > > to be replicated every time TCP_SEG is used.
> > > > > 
> > > > > The above could break existing applications, so perhaps doing
> > > > > something like below would be better and backwards compatible?
> > > > > Then we can remove those places tweaking the flags completely.
> > > > 
> > > > As a first step, I suggest to document that:
> > > > - applications must set TCP_CKSUM when setting TCP_SEG
> > > 
> > > That's what I suggest above.
> > > 
> > > > - pmds must suppose that TCP_CKSUM is set when TCP_SEG is set
> > > 
> > > But that keeps the problem of implying the TCP_CKSUM flag in
> > > various places.
> > 
> > Yes. What I propose is just a first step: better document what is the
> > current expected behavior, before doing something else.
> > 
> > > > This is clearer than what we have today, and I think it does not break
> > > > anything. This will guide apps in the correct direction, facilitating
> > > > an eventual future PMD change.
> > > > 
> > > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > index c17dc95c5..6a0c2cdd9 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > @@ -298,7 +298,7 @@ extern "C" {
> > > > >   *  - if it's IPv4, set the PKT_TX_IP_CKSUM flag
> > > > >   *  - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
> > > > >   */
> > > > > -#define PKT_TX_TCP_SEG       (1ULL << 50)
> > > > > +#define PKT_TX_TCP_SEG       (1ULL << 50) | PKT_TX_TCP_CKSUM
> > > > >  
> > > > >  /** TX IEEE1588 packet to timestamp. */
> > > > >  #define PKT_TX_IEEE1588_TMST (1ULL << 51)
> > > > 
> > > > I'm afraid some applications or drivers use extended bit manipulations
> > > > to do the conversion from/to another domain (like hardware descriptors
> > > > or application-specific flags). They may expect this constant to be a
> > > > unique flag.
> > > 
> > > Interesting, do you have an example? Because each flag still has a
> > > separate meaning.
> > 
> > Honestly no, I don't have any good example, just a (maybe unfounded) doubt.
> > 
> > I have in mind operations that are done with tables or vector
> > instructions inside the drivers, but this is mainly done for Rx, not Tx.
> > You can look at Tx functions like mlx5_set_cksum_table() or
> > nix_xmit_pkts_vector(), or Rx functions like desc_to_olflags_v() or
> > enic_noscatter_vec_recv_pkts() to see what kind of stuff I'm talking
> > about.
> 
> I see your point. Going back to improving the documentation as a
> first step, what would be the next steps? Are we going to wait a few
> releases and then remove the flag tweaking code assuming that PMDs
> and apps are ok?

After this documentation step, in a few releases, we could relax the
constraint on PMDs: applications will be expected to set TCP_CKSUM when
TCP_SEG is set, so no need for the PMD to force TCP_CKSUM to 1 if
TCP_SEG is set. The documentation will be updated again.
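
To make that concrete, a rough (untested) sketch of the application
side, with flag and field names from rte_mbuf_core.h; the helper name
and the tso_segsz choice are just for illustration:

	/* Request TSO on an IPv4/TCP packet and set PKT_TX_TCP_CKSUM
	 * explicitly along with PKT_TX_TCP_SEG, as the documentation
	 * would now require.
	 * Needs rte_mbuf.h, rte_ether.h, rte_ip.h and rte_tcp.h.
	 */
	static void
	app_prepare_tso(struct rte_mbuf *m, uint16_t mtu)
	{
		m->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CKSUM |
			       PKT_TX_TCP_CKSUM | PKT_TX_TCP_SEG;
		m->l2_len = sizeof(struct rte_ether_hdr);
		m->l3_len = sizeof(struct rte_ipv4_hdr);
		m->l4_len = sizeof(struct rte_tcp_hdr); /* no TCP options assumed */
		m->tso_segsz = mtu - m->l3_len - m->l4_len; /* one common MSS choice */
	}

The PMD-side fixup that could then be dropped is the one quoted earlier
in this thread in virtqueue_xmit_offload(), i.e. forcing TCP_CKSUM into
csum_l4 when TCP_SEG is set.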

This plan can be described in the deprecation notice, and later in the
release note.

How does it sound?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags
  2021-04-09 13:30             ` Olivier Matz
@ 2021-04-09 16:55               ` Flavio Leitner
  0 siblings, 0 replies; 22+ messages in thread
From: Flavio Leitner @ 2021-04-09 16:55 UTC (permalink / raw)
  To: Olivier Matz
  Cc: David Marchand, dev, maxime.coquelin, i.maximets, Keith Wiles

On Fri, Apr 09, 2021 at 03:30:18PM +0200, Olivier Matz wrote:
> On Thu, Apr 08, 2021 at 09:58:35AM -0300, Flavio Leitner wrote:
> > On Thu, Apr 08, 2021 at 02:05:21PM +0200, Olivier Matz wrote:
> > > On Thu, Apr 08, 2021 at 08:21:58AM -0300, Flavio Leitner wrote:
> > > > On Thu, Apr 08, 2021 at 09:41:59AM +0200, Olivier Matz wrote:
> > > > > On Wed, Apr 07, 2021 at 05:15:39PM -0300, Flavio Leitner wrote:
> > > > > > On Thu, Apr 01, 2021 at 11:52:40AM +0200, David Marchand wrote:
> > > > > > > Tx offload flags are of the application responsibility.
> > > > > > > Leave the mbuf alone and check for TSO where needed.
> > > > > > > 
> > > > > > > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > > > > > > ---
> > > > > > 
> > > > > > The patch looks good, but maybe a better approach would be
> > > > > > to change the documentation to require the TCP_CKSUM flag
> > > > > > when TCP_SEG is used, otherwise this flag adjusting needs
> > > > > > to be replicated every time TCP_SEG is used.
> > > > > > 
> > > > > > The above could break existing applications, so perhaps doing
> > > > > > something like below would be better and backwards compatible?
> > > > > > Then we can remove those places tweaking the flags completely.
> > > > > 
> > > > > As a first step, I suggest to document that:
> > > > > - applications must set TCP_CKSUM when setting TCP_SEG
> > > > 
> > > > That's what I suggest above.
> > > > 
> > > > > - pmds must suppose that TCP_CKSUM is set when TCP_SEG is set
> > > > 
> > > > But that keeps the problem of implying the TCP_CKSUM flag in
> > > > various places.
> > > 
> > > Yes. What I propose is just a first step: better document what is the
> > > current expected behavior, before doing something else.
> > > 
> > > > > This is clearer than what we have today, and I think it does not break
> > > > > anything. This will guide apps in the correct direction, facilitating
> > > > > an eventual future PMD change.
[...]
> > I see your point. Going back to improving the documentation as a
> > first step, what would be the next steps? Are we going to wait a few
> > releases and then remove the flag tweaking code assuming that PMDs
> > and apps are ok?
> 
> After this documentation step, in a few releases, we could relax the
> constraint on PMDs: applications will be expected to set TCP_CKSUM when
> TCP_SEG is set, so no need for the PMD to force TCP_CKSUM to 1 if
> TCP_SEG is set. The documentation will be updated again.
> 
> This plan can be described in the deprecation notice, and later in the
> release note.
> 
> How does it sound?

Works for me.
Thanks,
-- 
fbl

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-04-09 16:55 UTC | newest]

Thread overview: 22+ messages
2021-04-01  9:52 [dpdk-dev] [PATCH 0/5] Offload flags fixes David Marchand
2021-04-01  9:52 ` [dpdk-dev] [PATCH 1/5] mbuf: mark old offload flag as deprecated David Marchand
2021-04-07 20:14   ` Flavio Leitner
2021-04-08  7:23   ` Olivier Matz
2021-04-08  8:41     ` David Marchand
2021-04-01  9:52 ` [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags David Marchand
2021-04-07 20:15   ` Flavio Leitner
2021-04-08  7:41     ` Olivier Matz
2021-04-08 11:21       ` Flavio Leitner
2021-04-08 12:05         ` Olivier Matz
2021-04-08 12:58           ` Flavio Leitner
2021-04-09 13:30             ` Olivier Matz
2021-04-09 16:55               ` Flavio Leitner
2021-04-08 12:16         ` Ananyev, Konstantin
2021-04-08  7:53   ` Olivier Matz
2021-04-01  9:52 ` [dpdk-dev] [PATCH 3/5] net/virtio: " David Marchand
2021-04-01  9:52 ` [dpdk-dev] [PATCH 4/5] net/virtio: refactor Tx offload helper David Marchand
2021-04-08 13:05   ` Flavio Leitner
2021-04-09  2:31   ` Ruifeng Wang
2021-04-01  9:52 ` [dpdk-dev] [PATCH 5/5] vhost: fix offload flags in Rx path David Marchand
2021-04-08  8:28   ` Olivier Matz
2021-04-08 18:38   ` Flavio Leitner
