* [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support
@ 2015-11-04 10:54 Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 1/8] driver/virtio:add virtual addr for virtio net header Jijiang Liu
` (8 more replies)
0 siblings, 9 replies; 29+ messages in thread
From: Jijiang Liu @ 2015-11-04 10:54 UTC (permalink / raw)
To: dev
Adds vhost TX offload support.
This patch set adds the negotiation between us-vhost and virtio-net for vhost TX offload (checksum and TSO), adds TX offload support to the libs, and changes the vhost sample and csum application to test these changes.
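For readers unfamiliar with the negotiation step, here is a minimal sketch. It assumes the usual virtio model, in which the usable feature set is the intersection of what the host offers and what the guest acknowledges. The helper names negotiate_features() and has_feature() are hypothetical and not part of DPDK; the bit positions are the ones defined by the virtio specification.

```c
#include <stdint.h>

/* Feature bit positions as defined by the virtio specification. */
#define VIRTIO_NET_F_CSUM       0
#define VIRTIO_NET_F_HOST_TSO4  11
#define VIRTIO_NET_F_HOST_TSO6  12

/* Hypothetical helper: only features offered by both sides survive. */
static inline uint64_t
negotiate_features(uint64_t host_features, uint64_t guest_features)
{
	return host_features & guest_features;
}

/* Hypothetical helper: test a single feature bit in a negotiated set. */
static inline int
has_feature(uint64_t features, unsigned int bit)
{
	return (features & (1ULL << bit)) != 0;
}
```

With this model, a guest that does not acknowledge VIRTIO_NET_F_HOST_TSO6 simply never gets TSO6, even if the host offers it.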
v3 change:
rebase onto the latest code.
v2 change:
fill virtio device information for TX offloads.
Jijiang Liu (8):
add virtual address of virtio net header
store virtual address of virtio hdr
add vhost TX offload support capability in virtio-net
fill virtio device information for TX offloads.
add vhost TX offload support capability in vhost
enqueue TX offload
dequeue TX offload
change vhost App to support TX offload
fix csumonly fwd issue
drivers/net/virtio/virtio_ethdev.c | 13 ++++++++++
drivers/net/virtio/virtio_ethdev.h | 5 +-
drivers/net/virtio/virtio_rxtx.c | 61 +++++++++++++++++
drivers/net/virtio/virtqueue.h | 1 +
examples/vhost/main.c | 128 +++++++++++++++++++++++++++++++-----
lib/librte_vhost/vhost_rxtx.c | 108 ++++++++++++++++++++++++++++++-
lib/librte_vhost/virtio-net.c | 6 ++-
8 files changed, 302 insertions(+), 20 deletions(-)
--
1.7.7.6
* [dpdk-dev] [PATCH v3 1/8] driver/virtio:add virtual addr for virtio net header
2015-11-04 10:54 [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Jijiang Liu
@ 2015-11-04 10:54 ` Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 2/8] driver/virtio: record virtual address of " Jijiang Liu
` (7 subsequent siblings)
8 siblings, 0 replies; 29+ messages in thread
From: Jijiang Liu @ 2015-11-04 10:54 UTC (permalink / raw)
To: dev
The virtual address of the virtio net header needs to be recorded.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
drivers/net/virtio/virtqueue.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 689c321..5b43eeb 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -190,6 +190,7 @@ struct virtqueue {
uint16_t vq_avail_idx;
uint64_t mbuf_initializer; /**< value to init mbufs. */
phys_addr_t virtio_net_hdr_mem; /**< hdr for each xmit packet */
+ uint64_t virtio_net_hdr_addr; /**< virtual addr for virtio net header */
struct rte_mbuf **sw_ring; /**< RX software ring. */
/* dummy mbuf, for wraparound when processing RX ring. */
--
1.7.7.6
* [dpdk-dev] [PATCH v3 2/8] driver/virtio: record virtual address of virtio net header
2015-11-04 10:54 [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 1/8] driver/virtio:add virtual addr for virtio net header Jijiang Liu
@ 2015-11-04 10:54 ` Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 3/8] driver/virtio:add vhost TX checksum support capability in virtio-net Jijiang Liu
` (6 subsequent siblings)
8 siblings, 0 replies; 29+ messages in thread
From: Jijiang Liu @ 2015-11-04 10:54 UTC (permalink / raw)
To: dev
Record virtual address of virtio net header.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
drivers/net/virtio/virtio_ethdev.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 465d3cd..cb5dfee 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -376,6 +376,9 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
}
vq->virtio_net_hdr_mem =
vq->virtio_net_hdr_mz->phys_addr;
+ vq->virtio_net_hdr_addr =
+ (uint64_t)(uintptr_t)vq->virtio_net_hdr_mz->addr;
+
memset(vq->virtio_net_hdr_mz->addr, 0,
vq_size * hw->vtnet_hdr_size);
} else if (queue_type == VTNET_CQ) {
--
1.7.7.6
* [dpdk-dev] [PATCH v3 3/8] driver/virtio:add vhost TX checksum support capability in virtio-net
2015-11-04 10:54 [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 1/8] driver/virtio:add virtual addr for virtio net header Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 2/8] driver/virtio: record virtual address of " Jijiang Liu
@ 2015-11-04 10:54 ` Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 4/8] driver/virtio:fill virtio device info for TX offload Jijiang Liu
` (5 subsequent siblings)
8 siblings, 0 replies; 29+ messages in thread
From: Jijiang Liu @ 2015-11-04 10:54 UTC (permalink / raw)
To: dev
Add vhost TX checksum and TSO capabilities in virtio-net lib.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
drivers/net/virtio/virtio_ethdev.h | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)
diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index 9026d42..6ee95c6 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -64,7 +64,10 @@
1u << VIRTIO_NET_F_CTRL_VQ | \
1u << VIRTIO_NET_F_CTRL_RX | \
1u << VIRTIO_NET_F_CTRL_VLAN | \
- 1u << VIRTIO_NET_F_MRG_RXBUF)
+ 1u << VIRTIO_NET_F_MRG_RXBUF | \
+ 1u << VIRTIO_NET_F_HOST_TSO4 | \
+ 1u << VIRTIO_NET_F_HOST_TSO6 | \
+ 1u << VIRTIO_NET_F_CSUM)
/*
* CQ function prototype
--
1.7.7.6
* [dpdk-dev] [PATCH v3 4/8] driver/virtio:fill virtio device info for TX offload
2015-11-04 10:54 [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Jijiang Liu
` (2 preceding siblings ...)
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 3/8] driver/virtio:add vhost TX checksum support capability in virtio-net Jijiang Liu
@ 2015-11-04 10:54 ` Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 5/8] driver/virtio:enqueue vhost " Jijiang Liu
` (4 subsequent siblings)
8 siblings, 0 replies; 29+ messages in thread
From: Jijiang Liu @ 2015-11-04 10:54 UTC (permalink / raw)
To: dev
Fill virtio device info for TX offload.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
drivers/net/virtio/virtio_ethdev.c | 10 ++++++++++
1 files changed, 10 insertions(+), 0 deletions(-)
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index cb5dfee..b831c02 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1559,6 +1559,16 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
dev_info->default_txconf = (struct rte_eth_txconf) {
.txq_flags = ETH_TXQ_FLAGS_NOOFFLOADS
};
+
+ if (hw->guest_features & (1u << VIRTIO_NET_F_CSUM))
+ dev_info->tx_offload_capa = DEV_TX_OFFLOAD_IPV4_CKSUM |
+ DEV_TX_OFFLOAD_UDP_CKSUM |
+ DEV_TX_OFFLOAD_TCP_CKSUM |
+ DEV_TX_OFFLOAD_SCTP_CKSUM;
+
+ if ((hw->guest_features & (1u << VIRTIO_NET_F_HOST_TSO4)) ||
+ (hw->guest_features & (1u << VIRTIO_NET_F_HOST_TSO6)))
+ dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_TCP_TSO;
}
/*
--
1.7.7.6
* [dpdk-dev] [PATCH v3 5/8] driver/virtio:enqueue vhost TX offload
2015-11-04 10:54 [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Jijiang Liu
` (3 preceding siblings ...)
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 4/8] driver/virtio:fill virtio device info for TX offload Jijiang Liu
@ 2015-11-04 10:54 ` Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 6/8] " Jijiang Liu
` (3 subsequent siblings)
8 siblings, 0 replies; 29+ messages in thread
From: Jijiang Liu @ 2015-11-04 10:54 UTC (permalink / raw)
To: dev
Add vhost TX checksum and TSO4/6 offload capabilities to the supported feature set in the vhost lib.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
lib/librte_vhost/virtio-net.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 14278de..81bd309 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -77,7 +77,11 @@ static struct virtio_net_config_ll *ll_root;
(VHOST_SUPPORTS_MQ) | \
(1ULL << VIRTIO_F_VERSION_1) | \
(1ULL << VHOST_F_LOG_ALL) | \
- (1ULL << VHOST_USER_F_PROTOCOL_FEATURES))
+ (1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \
+ (1ULL << VIRTIO_NET_F_HOST_TSO4) | \
+ (1ULL << VIRTIO_NET_F_HOST_TSO6) | \
+ (1ULL << VIRTIO_NET_F_CSUM))
+
static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
--
1.7.7.6
* [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX offload
2015-11-04 10:54 [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Jijiang Liu
` (4 preceding siblings ...)
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 5/8] driver/virtio:enqueue vhost " Jijiang Liu
@ 2015-11-04 10:54 ` Jijiang Liu
2015-11-04 11:17 ` Thomas Monjalon
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 7/8] lib/librte_vhost:dequeue " Jijiang Liu
` (2 subsequent siblings)
8 siblings, 1 reply; 29+ messages in thread
From: Jijiang Liu @ 2015-11-04 10:54 UTC (permalink / raw)
To: dev
Enqueue vhost TX checksum and TSO4/6 offload in virtio-net lib.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
drivers/net/virtio/virtio_rxtx.c | 61 ++++++++++++++++++++++++++++++++++++++
1 files changed, 61 insertions(+), 0 deletions(-)
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index c5b53bb..b99f5b5 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -50,6 +50,10 @@
#include <rte_string_fns.h>
#include <rte_errno.h>
#include <rte_byteorder.h>
+#include <rte_tcp.h>
+#include <rte_ip.h>
+#include <rte_udp.h>
+#include <rte_sctp.h>
#include "virtio_logs.h"
#include "virtio_ethdev.h"
@@ -199,6 +203,58 @@ virtqueue_enqueue_recv_refill(struct virtqueue *vq, struct rte_mbuf *cookie)
}
static int
+virtqueue_enqueue_offload(struct virtqueue *txvq, struct rte_mbuf *m,
+ uint16_t idx, uint16_t hdr_sz)
+{
+ struct virtio_net_hdr *hdr = (struct virtio_net_hdr *)(uintptr_t)
+ (txvq->virtio_net_hdr_addr + idx * hdr_sz);
+
+ hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
+
+ /* if vhost TX checksum offload is required */
+ if (m->ol_flags & PKT_TX_IP_CKSUM) {
+ hdr->csum_start = m->l2_len;
+ hdr->csum_offset = offsetof(struct ipv4_hdr, hdr_checksum);
+ } else if (m->ol_flags & PKT_TX_L4_MASK) {
+ hdr->csum_start = m->l2_len + m->l3_len;
+ switch (m->ol_flags & PKT_TX_L4_MASK) {
+ case PKT_TX_TCP_CKSUM:
+ hdr->csum_offset = offsetof(struct tcp_hdr, cksum);
+ break;
+ case PKT_TX_UDP_CKSUM:
+ hdr->csum_offset = offsetof(struct udp_hdr,
+ dgram_cksum);
+ break;
+ case PKT_TX_SCTP_CKSUM:
+ hdr->csum_offset = offsetof(struct sctp_hdr, cksum);
+ break;
+ default:
+ break;
+ }
+ } else
+ hdr->flags = 0;
+
+ /* if vhost TSO offload is required */
+ if (m->tso_segsz != 0 && m->ol_flags & PKT_TX_TCP_SEG) {
+ if (m->ol_flags & PKT_TX_IPV4) {
+ if (!vtpci_with_feature(txvq->hw,
+ VIRTIO_NET_F_HOST_TSO4))
+ return -1;
+ hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
+ } else if (m->ol_flags & PKT_TX_IPV6) {
+ if (!vtpci_with_feature(txvq->hw,
+ VIRTIO_NET_F_HOST_TSO6))
+ return -1;
+ hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
+ }
+ hdr->gso_size = m->tso_segsz;
+ hdr->hdr_len = m->l2_len + m->l3_len + m->l4_len;
+ } else
+ hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
+ return 0;
+}
+
+static int
virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)
{
struct vq_desc_extra *dxp;
@@ -221,6 +277,11 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)
dxp->cookie = (void *)cookie;
dxp->ndescs = needed;
+ if (vtpci_with_feature(txvq->hw, VIRTIO_NET_F_CSUM)) {
+ if (virtqueue_enqueue_offload(txvq, cookie, idx, head_size) < 0)
+ return -EPERM;
+ }
+
start_dp = txvq->vq_ring.desc;
start_dp[idx].addr =
txvq->virtio_net_hdr_mem + idx * head_size;
--
1.7.7.6
* [dpdk-dev] [PATCH v3 7/8] lib/librte_vhost:dequeue vhost TX offload
2015-11-04 10:54 [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Jijiang Liu
` (5 preceding siblings ...)
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 6/8] " Jijiang Liu
@ 2015-11-04 10:54 ` Jijiang Liu
2015-11-09 4:00 ` Yuanhan Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in vhost sample Jijiang Liu
2015-11-04 11:14 ` [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Tan, Jianfeng
8 siblings, 1 reply; 29+ messages in thread
From: Jijiang Liu @ 2015-11-04 10:54 UTC (permalink / raw)
To: dev
Dequeue vhost TX offload in vhost lib.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
lib/librte_vhost/vhost_rxtx.c | 108 ++++++++++++++++++++++++++++++++++++++++-
1 files changed, 107 insertions(+), 1 deletions(-)
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 9322ce6..86bfa60 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -37,7 +37,12 @@
#include <rte_mbuf.h>
#include <rte_memcpy.h>
+#include <rte_ether.h>
+#include <rte_ip.h>
#include <rte_virtio_net.h>
+#include <rte_tcp.h>
+#include <rte_udp.h>
+#include <rte_sctp.h>
#include "vhost-net.h"
@@ -568,6 +573,101 @@ rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t queue_id,
return virtio_dev_rx(dev, queue_id, pkts, count);
}
+static void
+parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr)
+{
+ struct ipv4_hdr *ipv4_hdr;
+ struct ipv6_hdr *ipv6_hdr;
+ void *l3_hdr = NULL;
+ struct ether_hdr *eth_hdr;
+ uint16_t ethertype;
+
+ eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ m->l2_len = sizeof(struct ether_hdr);
+ ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
+
+ if (ethertype == ETHER_TYPE_VLAN) {
+ struct vlan_hdr *vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1);
+
+ m->l2_len += sizeof(struct vlan_hdr);
+ ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
+ }
+
+ l3_hdr = (char *)eth_hdr + m->l2_len;
+
+ switch (ethertype) {
+ case ETHER_TYPE_IPv4:
+ ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
+ *l4_proto = ipv4_hdr->next_proto_id;
+ m->l3_len = (ipv4_hdr->version_ihl & 0x0f) * 4;
+ *l4_hdr = (char *)l3_hdr + m->l3_len;
+ m->ol_flags |= PKT_TX_IPV4;
+ break;
+ case ETHER_TYPE_IPv6:
+ ipv6_hdr = (struct ipv6_hdr *)l3_hdr;
+ *l4_proto = ipv6_hdr->proto;
+ m->ol_flags |= PKT_TX_IPV6;
+ m->l3_len = sizeof(struct ipv6_hdr);
+ *l4_hdr = (char *)l3_hdr + m->l3_len;
+ break;
+ default:
+ m->l3_len = 0;
+ *l4_proto = 0;
+ break;
+ }
+}
+
+static inline void __attribute__((always_inline))
+vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m)
+{
+ uint16_t l4_proto = 0;
+ void *l4_hdr = NULL;
+ struct tcp_hdr *tcp_hdr = NULL;
+
+ parse_ethernet(m, &l4_proto, &l4_hdr);
+ if (hdr->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) {
+ if ((hdr->csum_start == m->l2_len) &&
+ (hdr->csum_offset == offsetof(struct ipv4_hdr,
+ hdr_checksum)))
+ m->ol_flags |= PKT_TX_IP_CKSUM;
+ else if (hdr->csum_start == (m->l2_len + m->l3_len)) {
+ switch (hdr->csum_offset) {
+ case (offsetof(struct tcp_hdr, cksum)):
+ if (l4_proto == IPPROTO_TCP)
+ m->ol_flags |= PKT_TX_TCP_CKSUM;
+ break;
+ case (offsetof(struct udp_hdr, dgram_cksum)):
+ if (l4_proto == IPPROTO_UDP)
+ m->ol_flags |= PKT_TX_UDP_CKSUM;
+ break;
+ case (offsetof(struct sctp_hdr, cksum)):
+ if (l4_proto == IPPROTO_SCTP)
+ m->ol_flags |= PKT_TX_SCTP_CKSUM;
+ break;
+ default:
+ break;
+ }
+ }
+ }
+
+ if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
+ switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
+ case VIRTIO_NET_HDR_GSO_TCPV4:
+ case VIRTIO_NET_HDR_GSO_TCPV6:
+ tcp_hdr = (struct tcp_hdr *)l4_hdr;
+ m->ol_flags |= PKT_TX_TCP_SEG;
+ m->tso_segsz = hdr->gso_size;
+ m->l4_len = (tcp_hdr->data_off & 0xf0) >> 2;
+ break;
+ default:
+ RTE_LOG(WARNING, VHOST_DATA,
+ "unsupported gso type %u.\n", hdr->gso_type);
+ break;
+ }
+ }
+}
+
uint16_t
rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count)
@@ -576,11 +676,13 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
struct vhost_virtqueue *vq;
struct vring_desc *desc;
uint64_t vb_addr = 0;
+ uint64_t vb_net_hdr_addr = 0;
uint32_t head[MAX_PKT_BURST];
uint32_t used_idx;
uint32_t i;
uint16_t free_entries, entry_success = 0;
uint16_t avail_idx;
+ struct virtio_net_hdr *hdr = NULL;
if (unlikely(!is_valid_virt_queue_idx(queue_id, 1, dev->virt_qp_nb))) {
RTE_LOG(ERR, VHOST_DATA,
@@ -632,6 +734,9 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
desc = &vq->desc[head[entry_success]];
+ vb_net_hdr_addr = gpa_to_vva(dev, desc->addr);
+ hdr = (struct virtio_net_hdr *)((uintptr_t)vb_net_hdr_addr);
+
/* Discard first buffer as it is the virtio header */
if (desc->flags & VRING_DESC_F_NEXT) {
desc = &vq->desc[desc->next];
@@ -770,7 +875,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
break;
m->nb_segs = seg_num;
-
+ if ((hdr->flags != 0) || (hdr->gso_type != 0))
+ vhost_dequeue_offload(hdr, m);
pkts[entry_success] = m;
vq->last_used_idx++;
entry_success++;
--
1.7.7.6
* [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in vhost sample
2015-11-04 10:54 [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Jijiang Liu
` (6 preceding siblings ...)
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 7/8] lib/librte_vhost:dequeue " Jijiang Liu
@ 2015-11-04 10:54 ` Jijiang Liu
2015-11-09 4:17 ` Yuanhan Liu
2015-11-04 11:14 ` [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Tan, Jianfeng
8 siblings, 1 reply; 29+ messages in thread
From: Jijiang Liu @ 2015-11-04 10:54 UTC (permalink / raw)
To: dev
Change the vhost sample to support and test TX offload.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
examples/vhost/main.c | 128 ++++++++++++++++++++++++++++++++++++++++++-------
1 files changed, 111 insertions(+), 17 deletions(-)
diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 9eac2d0..06e1e8b 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -50,6 +50,10 @@
#include <rte_string_fns.h>
#include <rte_malloc.h>
#include <rte_virtio_net.h>
+#include <rte_tcp.h>
+#include <rte_ip.h>
+#include <rte_udp.h>
+#include <rte_sctp.h>
#include "main.h"
@@ -140,6 +144,8 @@
#define MBUF_EXT_MEM(mb) (rte_mbuf_from_indirect(mb) != (mb))
+#define VIRTIO_TX_CKSUM_OFFLOAD_MASK (PKT_TX_IP_CKSUM | PKT_TX_L4_MASK)
+
/* mask of enabled ports */
static uint32_t enabled_port_mask = 0;
@@ -197,6 +203,13 @@ typedef enum {
static uint32_t enable_stats = 0;
/* Enable retries on RX. */
static uint32_t enable_retry = 1;
+
+/* Enable TX checksum offload (disabled by default) */
+static uint32_t enable_tx_csum;
+
+/* Enable TSO offload (disabled by default) */
+static uint32_t enable_tso;
+
/* Specify timeout (in useconds) between retries on RX. */
static uint32_t burst_rx_delay_time = BURST_RX_WAIT_US;
/* Specify the number of retries on RX. */
@@ -292,20 +305,6 @@ struct vlan_ethhdr {
__be16 h_vlan_encapsulated_proto;
};
-/* IPv4 Header */
-struct ipv4_hdr {
- uint8_t version_ihl; /**< version and header length */
- uint8_t type_of_service; /**< type of service */
- uint16_t total_length; /**< length of packet */
- uint16_t packet_id; /**< packet ID */
- uint16_t fragment_offset; /**< fragmentation offset */
- uint8_t time_to_live; /**< time to live */
- uint8_t next_proto_id; /**< protocol ID */
- uint16_t hdr_checksum; /**< header checksum */
- uint32_t src_addr; /**< source address */
- uint32_t dst_addr; /**< destination address */
-} __attribute__((__packed__));
-
/* Header lengths. */
#define VLAN_HLEN 4
#define VLAN_ETH_HLEN 18
@@ -441,6 +440,14 @@ port_init(uint8_t port)
if (port >= rte_eth_dev_count()) return -1;
+ if (enable_tx_csum == 0)
+ rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_CSUM);
+
+ if (enable_tso == 0) {
+ rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4);
+ rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO6);
+ }
+
rx_rings = (uint16_t)dev_info.max_rx_queues;
/* Configure ethernet device. */
retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
@@ -576,7 +583,9 @@ us_vhost_usage(const char *prgname)
" --rx-desc-num [0-N]: the number of descriptors on rx, "
"used only when zero copy is enabled.\n"
" --tx-desc-num [0-N]: the number of descriptors on tx, "
- "used only when zero copy is enabled.\n",
+ "used only when zero copy is enabled.\n"
+ " --tx-csum [0|1] disable/enable TX checksum offload.\n"
+ " --tso [0|1] disable/enable TCP segmentation offload.\n",
prgname);
}
@@ -602,6 +611,8 @@ us_vhost_parse_args(int argc, char **argv)
{"zero-copy", required_argument, NULL, 0},
{"rx-desc-num", required_argument, NULL, 0},
{"tx-desc-num", required_argument, NULL, 0},
+ {"tx-csum", required_argument, NULL, 0},
+ {"tso", required_argument, NULL, 0},
{NULL, 0, 0, 0},
};
@@ -656,6 +667,28 @@ us_vhost_parse_args(int argc, char **argv)
}
}
+ /* Enable/disable TX checksum offload. */
+ if (!strncmp(long_option[option_index].name, "tx-csum", MAX_LONG_OPT_SZ)) {
+ ret = parse_num_opt(optarg, 1);
+ if (ret == -1) {
+ RTE_LOG(INFO, VHOST_CONFIG, "Invalid argument for tx-csum [0|1]\n");
+ us_vhost_usage(prgname);
+ return -1;
+ } else
+ enable_tx_csum = ret;
+ }
+
+ /* Enable/disable TSO offload. */
+ if (!strncmp(long_option[option_index].name, "tso", MAX_LONG_OPT_SZ)) {
+ ret = parse_num_opt(optarg, 1);
+ if (ret == -1) {
+ RTE_LOG(INFO, VHOST_CONFIG, "Invalid argument for tso [0|1]\n");
+ us_vhost_usage(prgname);
+ return -1;
+ } else
+ enable_tso = ret;
+ }
+
/* Specify the retries delay time (in useconds) on RX. */
if (!strncmp(long_option[option_index].name, "rx-retry-delay", MAX_LONG_OPT_SZ)) {
ret = parse_num_opt(optarg, INT32_MAX);
@@ -1114,6 +1147,63 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf *m,
return 0;
}
+static uint16_t
+get_psd_sum(void *l3_hdr, uint64_t ol_flags)
+{
+ if (ol_flags & PKT_TX_IPV4)
+ return rte_ipv4_phdr_cksum(l3_hdr, ol_flags);
+ else /* assume ethertype == ETHER_TYPE_IPv6 */
+ return rte_ipv6_phdr_cksum(l3_hdr, ol_flags);
+}
+
+static void virtio_tx_offload(struct rte_mbuf *m)
+{
+ void *l3_hdr;
+ struct ipv4_hdr *ipv4_hdr = NULL;
+ struct tcp_hdr *tcp_hdr = NULL;
+ struct udp_hdr *udp_hdr = NULL;
+ struct sctp_hdr *sctp_hdr = NULL;
+ struct ether_hdr *eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ l3_hdr = (char *)eth_hdr + m->l2_len;
+
+ if (m->ol_flags & PKT_TX_IPV4) {
+ ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
+ if (m->ol_flags & PKT_TX_IP_CKSUM)
+ ipv4_hdr->hdr_checksum = 0;
+ }
+
+ if (m->ol_flags & PKT_TX_L4_MASK) {
+ switch (m->ol_flags & PKT_TX_L4_MASK) {
+ case PKT_TX_TCP_CKSUM:
+ tcp_hdr = (struct tcp_hdr *)
+ ((char *)l3_hdr + m->l3_len);
+ tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
+ break;
+ case PKT_TX_UDP_CKSUM:
+ udp_hdr = (struct udp_hdr *)
+ ((char *)l3_hdr + m->l3_len);
+ udp_hdr->dgram_cksum = get_psd_sum(l3_hdr, m->ol_flags);
+ break;
+ case PKT_TX_SCTP_CKSUM:
+ sctp_hdr = (struct sctp_hdr *)
+ ((char *)l3_hdr + m->l3_len);
+ sctp_hdr->cksum = 0;
+ break;
+ default:
+ break;
+ }
+ }
+
+ if (m->tso_segsz != 0) {
+ ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
+ tcp_hdr = (struct tcp_hdr *)((char *)l3_hdr + m->l3_len);
+ m->ol_flags |= PKT_TX_IP_CKSUM;
+ ipv4_hdr->hdr_checksum = 0;
+ tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
+ }
+}
+
/*
* This function routes the TX packet to the correct interface. This may be a local device
* or the physical port.
@@ -1156,7 +1246,7 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
(vh->vlan_tci != vlan_tag_be))
vh->vlan_tci = vlan_tag_be;
} else {
- m->ol_flags = PKT_TX_VLAN_PKT;
+ m->ol_flags |= PKT_TX_VLAN_PKT;
/*
* Find the right seg to adjust the data len when offset is
@@ -1180,6 +1270,10 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
m->vlan_tci = vlan_tag;
}
+ if ((m->ol_flags & VIRTIO_TX_CKSUM_OFFLOAD_MASK) ||
+ (m->ol_flags & PKT_TX_TCP_SEG))
+ virtio_tx_offload(m);
+
tx_q->m_table[len] = m;
len++;
if (enable_stats) {
@@ -1841,7 +1935,7 @@ virtio_tx_route_zcp(struct virtio_net *dev, struct rte_mbuf *m,
mbuf->buf_physaddr = m->buf_physaddr;
mbuf->buf_addr = m->buf_addr;
}
- mbuf->ol_flags = PKT_TX_VLAN_PKT;
+ mbuf->ol_flags |= PKT_TX_VLAN_PKT;
mbuf->vlan_tci = vlan_tag;
mbuf->l2_len = sizeof(struct ether_hdr);
mbuf->l3_len = sizeof(struct ipv4_hdr);
--
1.7.7.6
* Re: [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support
2015-11-04 10:54 [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Jijiang Liu
` (7 preceding siblings ...)
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in vhost sample Jijiang Liu
@ 2015-11-04 11:14 ` Tan, Jianfeng
2015-11-05 14:24 ` Glynn, Michael J
8 siblings, 1 reply; 29+ messages in thread
From: Tan, Jianfeng @ 2015-11-04 11:14 UTC (permalink / raw)
To: Liu, Jijiang, dev
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jijiang Liu
> Sent: Wednesday, November 4, 2015 6:54 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support
>
> Adds vhost TX offload support.
>
> The patch set add the negotiation between us-vhost and virtio-net for vhost
> TX offload(checksum and TSO), and add the TX offload support in the libs and
> change vhost sample and csum application to test these changes.
>
> v3 change:
> rebase latest codes.
>
.....
> lib/librte_vhost/virtio-net.c | 6 ++-
> 8 files changed, 302 insertions(+), 20 deletions(-)
>
> --
> 1.7.7.6
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
* Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX offload
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 6/8] " Jijiang Liu
@ 2015-11-04 11:17 ` Thomas Monjalon
2015-11-04 12:52 ` Liu, Jijiang
` (2 more replies)
0 siblings, 3 replies; 29+ messages in thread
From: Thomas Monjalon @ 2015-11-04 11:17 UTC (permalink / raw)
To: Jijiang Liu; +Cc: dev, Michael S. Tsirkin
2015-11-04 18:54, Jijiang Liu:
> + /* if vhost TX checksum offload is required */
> + if (m->ol_flags & PKT_TX_IP_CKSUM) {
> + hdr->csum_start = m->l2_len;
> + hdr->csum_offset = offsetof(struct ipv4_hdr, hdr_checksum);
> + } else if (m->ol_flags & PKT_TX_L4_MASK) {
> + hdr->csum_start = m->l2_len + m->l3_len;
> + switch (m->ol_flags & PKT_TX_L4_MASK) {
> + case PKT_TX_TCP_CKSUM:
> + hdr->csum_offset = offsetof(struct tcp_hdr, cksum);
> + break;
> + case PKT_TX_UDP_CKSUM:
> + hdr->csum_offset = offsetof(struct udp_hdr,
> + dgram_cksum);
> + break;
> + case PKT_TX_SCTP_CKSUM:
> + hdr->csum_offset = offsetof(struct sctp_hdr, cksum);
> + break;
> + default:
> + break;
> + }
The header checksum to offload is deduced from csum_offset.
Your vhost implementation does some parsing to deduce it:
> + parse_ethernet(m, &l4_proto, &l4_hdr);
> + if (hdr->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> + if ((hdr->csum_start == m->l2_len) &&
> + (hdr->csum_offset == offsetof(struct ipv4_hdr,
> + hdr_checksum)))
> + m->ol_flags |= PKT_TX_IP_CKSUM;
> + else if (hdr->csum_start == (m->l2_len + m->l3_len)) {
> + switch (hdr->csum_offset) {
> + case (offsetof(struct tcp_hdr, cksum)):
> + if (l4_proto == IPPROTO_TCP)
> + m->ol_flags |= PKT_TX_TCP_CKSUM;
> + break;
> + case (offsetof(struct udp_hdr, dgram_cksum)):
> + if (l4_proto == IPPROTO_UDP)
> + m->ol_flags |= PKT_TX_UDP_CKSUM;
> + break;
> + case (offsetof(struct sctp_hdr, cksum)):
> + if (l4_proto == IPPROTO_SCTP)
> + m->ol_flags |= PKT_TX_SCTP_CKSUM;
> + break;
> + default:
> + break;
> + }
> + }
The kernel doesn't work this way.
Please could you check that your virtio implementation works with a
vanilla Linux with or without vhost?
Thanks
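For context on the remark above, here is a minimal sketch of the generic completion scheme that the virtio net header describes, as I understand the virtio specification: the receiver of the header does not need to know which protocol is involved. It sums the packet from csum_start to the end and stores the folded result csum_offset bytes after csum_start. The helpers csum16() and complete_csum() are hypothetical names, not kernel or DPDK functions, and the sketch assumes the checksum field has already been seeded by the sender (with the pseudo-header sum for TCP/UDP).

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones' complement checksum over len bytes, big-endian words. */
static inline uint16_t
csum16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Hypothetical helper: complete a partial checksum using only
 * csum_start and csum_offset, with no protocol parsing.
 * Returns 0 on success, -1 if the checksum field lies outside the packet.
 */
static inline int
complete_csum(uint8_t *pkt, size_t len, uint16_t csum_start,
	      uint16_t csum_offset)
{
	uint16_t csum;

	if ((size_t)csum_start + csum_offset + 2 > len)
		return -1;

	csum = csum16(pkt + csum_start, len - csum_start);
	pkt[csum_start + csum_offset] = (uint8_t)(csum >> 8);
	pkt[csum_start + csum_offset + 1] = (uint8_t)(csum & 0xff);
	return 0;
}
```

The point of the sketch is the design choice: because the two offsets fully describe the work, the same code path can serve TCP, UDP, or any other protocol with a 16-bit ones' complement checksum.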
* Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX offload
2015-11-04 11:17 ` Thomas Monjalon
@ 2015-11-04 12:52 ` Liu, Jijiang
2015-11-04 13:18 ` Thomas Monjalon
2015-11-04 13:06 ` Liu, Jijiang
2015-11-04 13:08 ` Liu, Jijiang
2 siblings, 1 reply; 29+ messages in thread
From: Liu, Jijiang @ 2015-11-04 12:52 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: dev, Michael S. Tsirkin
Hi Thomas,
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, November 4, 2015 7:18 PM
> To: Liu, Jijiang
> Cc: dev@dpdk.org; Michael S. Tsirkin
> Subject: Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost
The following code is not in patch 6; please review the latest patch set.
> > + parse_ethernet(m, &l4_proto, &l4_hdr);
> > + if (hdr->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> > + if ((hdr->csum_start == m->l2_len) &&
> > + (hdr->csum_offset == offsetof(struct ipv4_hdr,
> > + hdr_checksum)))
> > + m->ol_flags |= PKT_TX_IP_CKSUM;
> > + else if (hdr->csum_start == (m->l2_len + m->l3_len)) {
> > + switch (hdr->csum_offset) {
> > + case (offsetof(struct tcp_hdr, cksum)):
> > + if (l4_proto == IPPROTO_TCP)
> > + m->ol_flags |= PKT_TX_TCP_CKSUM;
> > + break;
> > + case (offsetof(struct udp_hdr, dgram_cksum)):
> > + if (l4_proto == IPPROTO_UDP)
> > + m->ol_flags |= PKT_TX_UDP_CKSUM;
> > + break;
> > + case (offsetof(struct sctp_hdr, cksum)):
> > + if (l4_proto == IPPROTO_SCTP)
> > + m->ol_flags |= PKT_TX_SCTP_CKSUM;
> > + break;
> > + default:
> > + break;
> > + }
> > + }
>
> The kernel doesn't work this way.
> Please could you check that your virtio implementation works with a vanilla
> Linux with or without vhost?
> Thanks
This is the vhost lib implementation, not the virtio-net side.
We have already validated it against a vanilla Linux with and without virtio-net, and it passed.
Could you please review the latest patch set, v3?
Xu Qian can send the test report out.
* Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX offload
2015-11-04 11:17 ` Thomas Monjalon
2015-11-04 12:52 ` Liu, Jijiang
@ 2015-11-04 13:06 ` Liu, Jijiang
2015-11-04 13:08 ` Liu, Jijiang
2 siblings, 0 replies; 29+ messages in thread
From: Liu, Jijiang @ 2015-11-04 13:06 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: dev, Michael S. Tsirkin
Hi Thomas,
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, November 4, 2015 7:18 PM
> To: Liu, Jijiang
> Cc: dev@dpdk.org; Michael S. Tsirkin
> Subject: Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX
> offload
>
> 2015-11-04 18:54, Jijiang Liu:
> > + /* if vhost TX checksum offload is required */
> > + if (m->ol_flags & PKT_TX_IP_CKSUM) {
> > + hdr->csum_start = m->l2_len;
> > + hdr->csum_offset = offsetof(struct ipv4_hdr, hdr_checksum);
> > + } else if (m->ol_flags & PKT_TX_L4_MASK) {
> > + hdr->csum_start = m->l2_len + m->l3_len;
> > + switch (m->ol_flags & PKT_TX_L4_MASK) {
> > + case PKT_TX_TCP_CKSUM:
> > + hdr->csum_offset = offsetof(struct tcp_hdr, cksum);
> > + break;
> > + case PKT_TX_UDP_CKSUM:
> > + hdr->csum_offset = offsetof(struct udp_hdr,
> > + dgram_cksum);
> > + break;
> > + case PKT_TX_SCTP_CKSUM:
> > + hdr->csum_offset = offsetof(struct sctp_hdr, cksum);
> > + break;
> > + default:
> > + break;
> > + }
>
> The header checksum to offload is deduced from csum_offset.
> Your vhost implementation do some parsing to deduce it:
>
The ol_flags are set by the application; we have to fill 'csum_start' and 'csum_offset' based on these offload flags.
As long as the 'csum_start' and 'csum_offset' fields are set correctly, it works well with a vanilla Linux with vhost.
But in the DPDK vhost lib, we need to parse the 'csum_start' and 'csum_offset' fields to work out which offload flags should be set, and l2_len, l3_len and l4_len should also be filled.
So I think it is necessary to do this on both sides.
We can continue to discuss this if you have further comments. Thanks
--Jijiang Liu
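To make the mapping described above concrete, here is a minimal sketch of how the two header fields follow from the mbuf lengths and the L4 protocol. The struct definitions are local stand-ins that only mirror the standard TCP/UDP/SCTP header layouts; the sk_ names are hypothetical and do not exist in DPDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the L4 headers; only the field layout matters. */
struct sk_tcp_hdr {
	uint16_t src_port, dst_port;
	uint32_t sent_seq, recv_ack;
	uint8_t data_off, tcp_flags;
	uint16_t rx_win, cksum, tcp_urp;
};
struct sk_udp_hdr {
	uint16_t src_port, dst_port, dgram_len, dgram_cksum;
};
struct sk_sctp_hdr {
	uint16_t src_port, dst_port;
	uint32_t tag, cksum;
};

enum sk_l4 { SK_L4_TCP, SK_L4_UDP, SK_L4_SCTP };

struct sk_csum_req {
	uint16_t csum_start;  /* where checksumming begins */
	uint16_t csum_offset; /* checksum field, relative to csum_start */
};

/* Hypothetical helper: derive the header fields for an L4 checksum. */
static inline struct sk_csum_req
sk_fill_csum(uint16_t l2_len, uint16_t l3_len, enum sk_l4 l4)
{
	struct sk_csum_req req = { (uint16_t)(l2_len + l3_len), 0 };

	switch (l4) {
	case SK_L4_TCP:
		req.csum_offset = offsetof(struct sk_tcp_hdr, cksum);
		break;
	case SK_L4_UDP:
		req.csum_offset = offsetof(struct sk_udp_hdr, dgram_cksum);
		break;
	case SK_L4_SCTP:
		req.csum_offset = offsetof(struct sk_sctp_hdr, cksum);
		break;
	}
	return req;
}

/* TCP header length in bytes: the upper nibble counts 32-bit words. */
static inline uint8_t
sk_tcp_hdr_len(uint8_t data_off)
{
	return (data_off & 0xf0) >> 2;
}
```

For a TCP/IPv4 frame with a 14-byte Ethernet header and a 20-byte IP header, this yields csum_start = 34 and csum_offset = 16; the receiving side can reverse the same arithmetic to recover the offload flags and l4_len.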
* Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX offload
2015-11-04 11:17 ` Thomas Monjalon
2015-11-04 12:52 ` Liu, Jijiang
2015-11-04 13:06 ` Liu, Jijiang
@ 2015-11-04 13:08 ` Liu, Jijiang
2015-11-04 13:15 ` Liu, Jijiang
2 siblings, 1 reply; 29+ messages in thread
From: Liu, Jijiang @ 2015-11-04 13:08 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: dev, Michael S. Tsirkin
> -----Original Message-----
> From: Liu, Jijiang
> Sent: Wednesday, November 4, 2015 8:52 PM
> To: 'Thomas Monjalon'
> Cc: dev@dpdk.org; Michael S. Tsirkin
> Subject: RE: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX
> offload
>
> Hi Thomas,
>
>
> > -----Original Message-----
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > Sent: Wednesday, November 4, 2015 7:18 PM
> > To: Liu, Jijiang
> > Cc: dev@dpdk.org; Michael S. Tsirkin
> > Subject: Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost
>
>
> The following code is not in the patch 6, please review the latest patch set.
Got it. You copied the code from the vhost side here for comparison. v3 is the latest.
>
> > > + parse_ethernet(m, &l4_proto, &l4_hdr);
> > > + if (hdr->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> > > + if ((hdr->csum_start == m->l2_len) &&
> > > + (hdr->csum_offset == offsetof(struct ipv4_hdr,
> > > + hdr_checksum)))
> > > + m->ol_flags |= PKT_TX_IP_CKSUM;
> > > + else if (hdr->csum_start == (m->l2_len + m->l3_len)) {
> > > + switch (hdr->csum_offset) {
> > > + case (offsetof(struct tcp_hdr, cksum)):
> > > + if (l4_proto == IPPROTO_TCP)
> > > + m->ol_flags |= PKT_TX_TCP_CKSUM;
> > > + break;
> > > + case (offsetof(struct udp_hdr, dgram_cksum)):
> > > + if (l4_proto == IPPROTO_UDP)
> > > + m->ol_flags |= PKT_TX_UDP_CKSUM;
> > > + break;
> > > + case (offsetof(struct sctp_hdr, cksum)):
> > > + if (l4_proto == IPPROTO_SCTP)
> > > + m->ol_flags |= PKT_TX_SCTP_CKSUM;
> > > + break;
> > > + default:
> > > + break;
> > > + }
> > > + }
> >
> > The kernel doesn't work this way.
> > Please could you check that your virtio implementation works with a
> vanilla
> > Linux with or without vhost?
> > Thanks
>
> This is vhost lib implementation, not virtio-net side.
> We have already validated with a vanilla Linux with or without virtio-net,
> and it passed.
> Could you please review latest patch v3?
>
> Xu Qian can send the test report out.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX offload
2015-11-04 13:08 ` Liu, Jijiang
@ 2015-11-04 13:15 ` Liu, Jijiang
0 siblings, 0 replies; 29+ messages in thread
From: Liu, Jijiang @ 2015-11-04 13:15 UTC (permalink / raw)
To: Liu, Jijiang, Thomas Monjalon; +Cc: dev, Michael S. Tsirkin
The following structure is defined in the virtio standard:
struct virtio_net_hdr {
#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
u8 flags;
#define VIRTIO_NET_HDR_GSO_NONE 0
#define VIRTIO_NET_HDR_GSO_TCPV4 1
#define VIRTIO_NET_HDR_GSO_UDP 3
#define VIRTIO_NET_HDR_GSO_TCPV6 4
#define VIRTIO_NET_HDR_GSO_ECN 0x80
u8 gso_type;
le16 hdr_len;
le16 gso_size;
le16 csum_start;
le16 csum_offset;
le16 num_buffers;
};
For checksum, the 'flags', 'csum_start' and 'csum_offset' fields need to be filled.
For TSO, the 'gso_type', 'hdr_len' and 'csum_offset' fields need to be filled.
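A minimal sketch of how the TSO-related fields might be filled for a TCP/IPv4 packet. It assumes that 'gso_size' (the MSS) is set alongside 'gso_type' and 'hdr_len', which is how I understand the virtio specification; struct sk_tso_hdr and sk_fill_tso() are hypothetical names, and only the GSO constants match the structure above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_GSO_NONE  0 /* VIRTIO_NET_HDR_GSO_NONE */
#define SK_GSO_TCPV4 1 /* VIRTIO_NET_HDR_GSO_TCPV4 */

/* Hypothetical, reduced view of the TSO part of virtio_net_hdr. */
struct sk_tso_hdr {
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
};

/* Fill the TSO fields for a TCP/IPv4 packet; mss == 0 means "no TSO". */
static void
sk_fill_tso(struct sk_tso_hdr *h, uint16_t l2_len, uint16_t l3_len,
	    uint16_t l4_len, uint16_t mss)
{
	assert(h != NULL);

	if (mss == 0) {
		h->gso_type = SK_GSO_NONE;
		h->hdr_len = 0;
		h->gso_size = 0;
		return;
	}
	h->gso_type = SK_GSO_TCPV4;
	/* All headers up to and including the TCP header. */
	h->hdr_len = (uint16_t)(l2_len + l3_len + l4_len);
	/* Maximum payload carried by each resulting segment. */
	h->gso_size = mss;
}
```

With a 14-byte Ethernet header, 20-byte IPv4 header, 20-byte TCP header and an MSS of 1000, this yields hdr_len = 54 and gso_size = 1000.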
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX offload
2015-11-04 12:52 ` Liu, Jijiang
@ 2015-11-04 13:18 ` Thomas Monjalon
2015-11-05 8:49 ` Xu, Qian Q
0 siblings, 1 reply; 29+ messages in thread
From: Thomas Monjalon @ 2015-11-04 13:18 UTC (permalink / raw)
To: Liu, Jijiang; +Cc: dev, Michael S. Tsirkin
2015-11-04 12:52, Liu, Jijiang:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > Please could you check that your virtio implementation works with a vanilla
> > Linux with or without vhost?
> > Thanks
[...]
> Xu Qian can send the test report out.
Yes please, I'd like to see a test report showing this virtio running
with Linux vhost and without vhost.
We must check that the checksum is correctly offloaded and that the sent packets are valid.
Thanks
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX offload
2015-11-04 13:18 ` Thomas Monjalon
@ 2015-11-05 8:49 ` Xu, Qian Q
2015-11-05 9:02 ` Thomas Monjalon
0 siblings, 1 reply; 29+ messages in thread
From: Xu, Qian Q @ 2015-11-05 8:49 UTC (permalink / raw)
To: Thomas Monjalon, Liu, Jijiang; +Cc: dev, Michael S. Tsirkin
Tested-by: Qian Xu <qian.q.xu@intel.com>
- Test Commit: c4d404d7c1257465176deb5bb8c84e627d2d5eee
- OS/Kernel: Fedora 21/4.1.8
- GCC: gcc (GCC) 4.9.2 20141101 (Red Hat 4.9.2-1)
- CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
- NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
- Target: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
- Total 2 cases, 2 passed, 0 failed. DPDK vhost + legacy virtio or virtio-pmd can work well with TSO.
Test Case 1: test_dpdk vhost+ virtio-pmd tso
======================================
On host:
1. Start up vhost-switch. mergeable 1 means the jumbo frame feature is enabled; vm2vm 0 means only one VM, without VM-to-VM communication::
taskset -c 1-3 <dpdk_folder>/examples/vhost/build/vhost-switch -c 0xf -n 4 --huge-dir /mnt/huge --socket-mem 1024,1024 -- -p 1 --mergeable 1 --zero-copy 0 --vm2vm 0 --tso 1 --tx-csum 1
2. Start VM with vhost cuse as backend::
taskset -c 4-6 /home/qxu10/qemu-2.2.0/x86_64-softmmu/qemu-system-x86_64 -object memory-backend-file,id=mem,size=2048M,mem-path=/mnt/huge,share=on -numa node,memdev=mem -mem-prealloc \
-enable-kvm -m 2048 -smp 4 -cpu host -name dpdk1-vm1 \
-drive file=/home/img/dpdk1-vm1.img \
-netdev tap,id=vhost3,ifname=tap_vhost3,vhost=on,script=no \
-device virtio-net-pci,netdev=vhost3,mac=52:54:00:00:00:01,id=net3 \
-netdev tap,id=vhost4,ifname=tap_vhost4,vhost=on,script=no \
-device virtio-net-pci,netdev=vhost4,mac=52:54:00:00:00:02,id=net4 \
-netdev tap,id=ipvm1,ifname=tap3,script=/etc/qemu-ifup -device rtl8139,netdev=ipvm1,id=net0,mac=00:00:00:00:00:01 \
-localtime -nographic
On guest:
3. Ensure the dpdk folder is copied to the guest with the same config file and build process as on the host. Then bind the 2 virtio devices to igb_uio and start testpmd; the steps below are for reference::
./<dpdk_folder>/tools/dpdk_nic_bind.py --bind igb_uio 00:03.0 00:04.0
./<dpdk_folder>/x86_64-native-linuxapp-gcc/app/test-pmd/testpmd -c f -n 4 -- -i --txqflags 0x0f00 --max-pkt-len 9000
$ >set fwd csum
$ >tso set 1000 0
$ >tso set 1000 1
$ >start tx_first
4. Send TCP packets to virtio1 with a packet size of 5000. The virtio side receives 1 packet and lets vhost do TSO; vhost in turn lets the NIC do TSO, so at IXIA we expect 5 packets of ~1k each. Also capture the received packets and check that the checksum is correct.
Result: All the behavior is as expected in step 4, so the case is PASS.
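The expected packet count in step 4 follows from the TSO segment size set in testpmd (tso set 1000). A small sketch of that arithmetic, assuming the 5000 refers to the TCP payload length; sk_tso_segments() is a hypothetical helper, not a DPDK function.

```c
#include <assert.h>
#include <stdint.h>

/* Number of segments TSO produces for a payload, given the segment size. */
static uint32_t
sk_tso_segments(uint32_t payload_len, uint32_t mss)
{
	assert(mss > 0);
	/* Round up: a final partial segment still needs its own packet. */
	return (payload_len + mss - 1) / mss;
}
```

For payload_len = 5000 and mss = 1000 this gives the 5 packets expected at IXIA.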
Test Case 2: test_dpdk vhost+legacy virtio iperf tso
===========================================
Hardware config: Connect one physical port(port1) to another physical port(port2). Port1 is the NIC port that will do the TSO.
1. Start the dpdk vhost sample; the command is the same as in the case above. Port1 is bound to igb_uio
2. Start the VM with 1 virtio
3. Let port2 and the 1 virtio in the VM do an iperf test, since iperf test will send out
VIRTIO: ifconfig eth0 1.1.1.2
Port2: ifconfig p2p6 1.1.1.8
Make ping work: ping 1.1.1.8
Then run iperf server at port2: iperf -s -I 1
Run iperf client at port1: iperf -c 1.1.1.8 -t 60 -I 1
Check the packet size at virtio and port2 to see if there are many 64KB packets; if there are, the case passes. The reason is that vhost and virtio first negotiate whether each side supports TSO. If both do, the TCP/IP stack composes big packets such as 64KB. Since the NIC has the TSO capability, vhost lets the NIC do the TSO work, and at port2 the small packets are composed back into big packets by the TCP/IP stack.
Result: there are many 64KB packets at both virtio and port2, so it is a pass.
Thanks
Qian
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX offload
2015-11-05 8:49 ` Xu, Qian Q
@ 2015-11-05 9:02 ` Thomas Monjalon
2015-11-05 10:44 ` Xu, Qian Q
0 siblings, 1 reply; 29+ messages in thread
From: Thomas Monjalon @ 2015-11-05 9:02 UTC (permalink / raw)
To: Xu, Qian Q; +Cc: dev, Michael S. Tsirkin
2015-11-05 08:49, Xu, Qian Q:
> Test Case 1: test_dpdk vhost+ virtio-pmd tso
[...]
> Test Case 2: test_dpdk vhost+legacy virtio iperf tso
[...]
> Yes please, I'd like to see a test report showing this virtio running with Linux vhost and without vhost.
> We must check that the checksum is well offloaded and sent packets are valids.
> Thanks
Thanks for doing some tests.
I had no doubt it works with DPDK vhost.
Please could you do some tests without vhost and with kernel vhost?
We need to check that the checksum is not missing in such cases.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX offload
2015-11-05 9:02 ` Thomas Monjalon
@ 2015-11-05 10:44 ` Xu, Qian Q
2015-11-06 8:24 ` Xu, Qian Q
0 siblings, 1 reply; 29+ messages in thread
From: Xu, Qian Q @ 2015-11-05 10:44 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: dev, Michael S. Tsirkin
OK, I will check it tomorrow.
Another comment is that "Legacy vhost + virtio-pmd" is not a common use case. Firstly, in this case virtio-pmd has no TCP/IP stack, so TSO is not very meaningful; secondly, we can't get any performance benefit from this case compared to "Legacy vhost + legacy virtio". So I'm afraid no customer would want to try this case, given the fake TSO and poor performance.
Thanks
Qian
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support
2015-11-04 11:14 ` [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Tan, Jianfeng
@ 2015-11-05 14:24 ` Glynn, Michael J
0 siblings, 0 replies; 29+ messages in thread
From: Glynn, Michael J @ 2015-11-05 14:24 UTC (permalink / raw)
To: thomas.monjalon; +Cc: dev
Hi Thomas
Is there anything else needed to get this applied?
Thanks
Mike
-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tan, Jianfeng
Sent: Wednesday, November 4, 2015 11:14 AM
To: Liu, Jijiang; dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jijiang Liu
> Sent: Wednesday, November 4, 2015 6:54 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support
>
> Adds vhost TX offload support.
>
> The patch set add the negotiation between us-vhost and virtio-net for
> vhost TX offload(checksum and TSO), and add the TX offload support in
> the libs and change vhost sample and csum application to test these changes.
>
> v3 change:
> rebase latest codes.
>
.....
> lib/librte_vhost/virtio-net.c | 6 ++-
> 8 files changed, 302 insertions(+), 20 deletions(-)
>
> --
> 1.7.7.6
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/8] driver/virtio:enqueue vhost TX offload
2015-11-05 10:44 ` Xu, Qian Q
@ 2015-11-06 8:24 ` Xu, Qian Q
0 siblings, 0 replies; 29+ messages in thread
From: Xu, Qian Q @ 2015-11-06 8:24 UTC (permalink / raw)
To: Xu, Qian Q, Thomas Monjalon; +Cc: dev, Michael S. Tsirkin
Tested-by: Qian Xu <qian.q.xu@intel.com>
- Test Commit: c4d404d7c1257465176deb5bb8c84e627d2d5eee
- OS/Kernel: Fedora 21/4.1.8
- GCC: gcc (GCC) 4.9.2 20141101 (Red Hat 4.9.2-1)
- CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
- NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
- Target: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
- Total 1 case, 1 passed, 0 failed. Legacy vhost + virtio-pmd can work well with TSO.
Test Case 1: test_legacy_vhost+ virtio-pmd tso
=======================================
On host:
1. Start VM with legacy-vhost as backend::
taskset -c 4-6 /home/qxu10/qemu-2.2.0/x86_64-softmmu/qemu-system-x86_64 -object memory-backend-file,id=mem,size=2048M,mem-path=/mnt/huge,share=on -numa node,memdev=mem -mem-prealloc \
-enable-kvm -m 2048 -smp 4 -cpu host -name dpdk1-vm1 \
-drive file=/home/img/dpdk1-vm1.img \
-netdev tap,id=vhost3,ifname=tap_vhost3,vhost=on,script=no \
-device virtio-net-pci,netdev=vhost3,mac=52:54:00:00:00:01,id=net3 \
-netdev tap,id=ipvm1,ifname=tap3,script=/etc/qemu-ifup -device rtl8139,netdev=ipvm1,id=net0,mac=00:00:00:00:00:01 \
-localtime -nographic
2. Set up the bridge on host:
brctl addbr br1
brctl addif br1 ens260f0 # The interface is 85:00.0 connected to ixia card3 port9
brctl addif br1 tap0
brctl addif br1 tap1
ifconfig ens260f0 up
ifconfig ens260f0 promisc
ifconfig tap0 up
ifconfig tap1 up
ifconfig tap0 promisc
ifconfig tap1 promisc
brctl stp br1 off
ifconfig br1 up
brctl show
3. Disable firewall and Network manager on host:
systemctl stop firewalld.service
systemctl disable firewalld.service
systemctl stop ip6tables.service
systemctl disable ip6tables.service
systemctl stop iptables.service
systemctl disable iptables.service
systemctl stop NetworkManager.service
systemctl disable NetworkManager.service
4. Let br1 learn the MAC 02:00:00:00:00:00. In the VM, the virtio device runs testpmd, so it sends packets with DEST MAC 02:00:00:00:00:00. Once br1 has learned this MAC, it knows these packets can go to the NIC, from where they go back to the traffic generator. So here we send a packet from IXIA with SRC MAC=02:00:00:00:00:00 and DEST MAC=52:54:00:00:00:01 to let br1 learn the MAC. We can verify the MACs that the bridge knows by running: brctl showmacs br1
port no  mac addr           is local?  ageing timer
3        02:00:00:00:00:00  no         6.06
1        42:fa:45:4d:aa:4d  yes        0.00
1        42:fa:45:4d:aa:4d  yes        0.00
1        52:54:00:00:00:01  no         6.06
2        8e:d7:22:bf:c9:8d  yes        0.00
2        8e:d7:22:bf:c9:8d  yes        0.00
3        90:e2:ba:4a:55:1c  yes        0.00
3        90:e2:ba:4a:55:1c  yes        0.00
On guest:
5. Ensure the dpdk folder is copied to the guest with the same config file and build process as on the host. Then bind the virtio device to igb_uio and start testpmd; the steps below are for reference::
./<dpdk_folder>/tools/dpdk_nic_bind.py --bind igb_uio 00:03.0
./<dpdk_folder>/x86_64-native-linuxapp-gcc/app/test-pmd/testpmd -c f -n 4 -- -i --txqflags 0x0f00 --max-pkt-len 9000
$ >set fwd csum
$ >tso set 1000 0
$ >tso set 1000 1
$ >start
6. Send TCP packets to virtio1 with a packet size of 5000. The virtio side receives 1 packet and lets vhost do TSO; vhost in turn lets the NIC do TSO, so at IXIA we expect 5 packets of ~1k each. Also capture the received packets and check that the checksum is correct.
Result: All the behavior is as expected and the checksum is correct, so the case is PASS.
Thanks
Qian
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 7/8] lib/librte_vhost:dequeue vhost TX offload
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 7/8] lib/librte_vhost:dequeue " Jijiang Liu
@ 2015-11-09 4:00 ` Yuanhan Liu
2015-11-09 5:27 ` Liu, Jijiang
0 siblings, 1 reply; 29+ messages in thread
From: Yuanhan Liu @ 2015-11-09 4:00 UTC (permalink / raw)
To: Jijiang Liu; +Cc: dev
On Wed, Nov 04, 2015 at 06:54:15PM +0800, Jijiang Liu wrote:
> Dequeue vhost TX offload in vhost lib.
This is not enough; you need to do the corresponding setup at the RX side,
otherwise, in the VM2VM case, the packet will be dropped at the TCP layer
in the target VM because checksum validation fails.
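To make the VM2VM concern concrete, here is a deliberately simplified model of the receiving guest's decision. It assumes the receiver skips validation when the virtio net header carries VIRTIO_NET_HDR_F_NEEDS_CSUM (partial checksum) and otherwise validates the full checksum; sk_rx_verdict() and the enum are hypothetical names, and a real stack has more states than this.

```c
#include <assert.h>
#include <stdint.h>

#define SK_HDR_F_NEEDS_CSUM 1 /* same value as VIRTIO_NET_HDR_F_NEEDS_CSUM */

enum sk_verdict { SK_ACCEPT, SK_DROP };

/* hdr_flags: 'flags' of the virtio net header seen by the receiving guest.
 * full_csum_valid: non-zero if the L4 checksum in the packet is complete
 * and correct. */
static enum sk_verdict
sk_rx_verdict(uint8_t hdr_flags, int full_csum_valid)
{
	assert((hdr_flags & ~SK_HDR_F_NEEDS_CSUM) == 0); /* only one flag modelled */

	/* Partial checksum announced: the receiver does not validate it. */
	if (hdr_flags & SK_HDR_F_NEEDS_CSUM)
		return SK_ACCEPT;
	/* Otherwise the checksum in the packet must already be correct. */
	return full_csum_valid ? SK_ACCEPT : SK_DROP;
}
```

An offloaded packet forwarded VM to VM only has the pseudo-header sum in its checksum field, so without the RX-side header setup it corresponds to sk_rx_verdict(0, 0) and is dropped.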
--yliu
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in vhost sample
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in vhost sample Jijiang Liu
@ 2015-11-09 4:17 ` Yuanhan Liu
2015-11-09 8:17 ` Liu, Jijiang
` (2 more replies)
0 siblings, 3 replies; 29+ messages in thread
From: Yuanhan Liu @ 2015-11-09 4:17 UTC (permalink / raw)
To: Jijiang Liu; +Cc: dev
On Wed, Nov 04, 2015 at 06:54:16PM +0800, Jijiang Liu wrote:
> Change the vhost sample to support and test TX offload.
>
> Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
> ---
> examples/vhost/main.c | 128 ++++++++++++++++++++++++++++++++++++++++++-------
> 1 files changed, 111 insertions(+), 17 deletions(-)
>
> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> index 9eac2d0..06e1e8b 100644
> --- a/examples/vhost/main.c
> +++ b/examples/vhost/main.c
> @@ -50,6 +50,10 @@
> #include <rte_string_fns.h>
> #include <rte_malloc.h>
> #include <rte_virtio_net.h>
> +#include <rte_tcp.h>
> +#include <rte_ip.h>
> +#include <rte_udp.h>
> +#include <rte_sctp.h>
>
> #include "main.h"
>
> @@ -140,6 +144,8 @@
>
> #define MBUF_EXT_MEM(mb) (rte_mbuf_from_indirect(mb) != (mb))
>
> +#define VIRTIO_TX_CKSUM_OFFLOAD_MASK (PKT_TX_IP_CKSUM | PKT_TX_L4_MASK)
> +
> /* mask of enabled ports */
> static uint32_t enabled_port_mask = 0;
>
> @@ -197,6 +203,13 @@ typedef enum {
> static uint32_t enable_stats = 0;
> /* Enable retries on RX. */
> static uint32_t enable_retry = 1;
> +
> +/* Disable TX checksum offload */
^^^^^^^
Did you mean "Enable"?
> +static uint32_t enable_tx_csum;
> +
> +/* Disable TSO offload */
> +static uint32_t enable_tso;
Actually, I'd like to see TSO/CSUM offloading enabled by default:
they are so common, and they are enabled by default in a lot of places,
say, kernel, qemu. There is no reason to make them "disabled" here.
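A sketch of what "enabled by default" could look like: start from a mask with the offload features set and clear bits only when an option explicitly turns them off. The bit numbers are the ones the virtio specification assigns to VIRTIO_NET_F_CSUM, VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_TSO6; sk_features() is a hypothetical helper that stands in for the rte_vhost_feature_disable() calls in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SK_F_CSUM      0  /* VIRTIO_NET_F_CSUM */
#define SK_F_HOST_TSO4 11 /* VIRTIO_NET_F_HOST_TSO4 */
#define SK_F_HOST_TSO6 12 /* VIRTIO_NET_F_HOST_TSO6 */

/* Build the offload feature mask; both arguments are 0 or 1 and would
 * default to 1 when the option is not given on the command line. */
static uint64_t
sk_features(int tx_csum_on, int tso_on)
{
	uint64_t f = (1ULL << SK_F_CSUM) |
		     (1ULL << SK_F_HOST_TSO4) |
		     (1ULL << SK_F_HOST_TSO6);

	assert(tx_csum_on == 0 || tx_csum_on == 1);
	assert(tso_on == 0 || tso_on == 1);

	if (!tx_csum_on)
		f &= ~(1ULL << SK_F_CSUM);
	if (!tso_on)
		f &= ~((1ULL << SK_F_HOST_TSO4) | (1ULL << SK_F_HOST_TSO6));
	return f;
}
```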
> +
> /* Specify timeout (in useconds) between retries on RX. */
> static uint32_t burst_rx_delay_time = BURST_RX_WAIT_US;
> /* Specify the number of retries on RX. */
> @@ -292,20 +305,6 @@ struct vlan_ethhdr {
> __be16 h_vlan_encapsulated_proto;
> };
>
> -/* IPv4 Header */
> -struct ipv4_hdr {
> - uint8_t version_ihl; /**< version and header length */
> - uint8_t type_of_service; /**< type of service */
> - uint16_t total_length; /**< length of packet */
> - uint16_t packet_id; /**< packet ID */
> - uint16_t fragment_offset; /**< fragmentation offset */
> - uint8_t time_to_live; /**< time to live */
> - uint8_t next_proto_id; /**< protocol ID */
> - uint16_t hdr_checksum; /**< header checksum */
> - uint32_t src_addr; /**< source address */
> - uint32_t dst_addr; /**< destination address */
> -} __attribute__((__packed__));
Minor nit: it's a cleanup, having nothing to do with this patch (to
demonstrate TSO/CSUM). It belongs to another patch.
> -
> /* Header lengths. */
> #define VLAN_HLEN 4
> #define VLAN_ETH_HLEN 18
> @@ -441,6 +440,14 @@ port_init(uint8_t port)
>
> if (port >= rte_eth_dev_count()) return -1;
>
> + if (enable_tx_csum == 0)
> + rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_CSUM);
> +
> + if (enable_tso == 0) {
> + rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4);
> + rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO6);
> + }
> +
> rx_rings = (uint16_t)dev_info.max_rx_queues;
> /* Configure ethernet device. */
> retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
> @@ -576,7 +583,9 @@ us_vhost_usage(const char *prgname)
> " --rx-desc-num [0-N]: the number of descriptors on rx, "
> "used only when zero copy is enabled.\n"
> " --tx-desc-num [0-N]: the number of descriptors on tx, "
> - "used only when zero copy is enabled.\n",
> + "used only when zero copy is enabled.\n"
> + " --tx-csum [0|1] disable/enable TX checksum offload.\n"
> + " --tso [0|1] disable/enable TCP segement offload.\n",
> prgname);
> }
>
> @@ -602,6 +611,8 @@ us_vhost_parse_args(int argc, char **argv)
> {"zero-copy", required_argument, NULL, 0},
> {"rx-desc-num", required_argument, NULL, 0},
> {"tx-desc-num", required_argument, NULL, 0},
> + {"tx-csum", required_argument, NULL, 0},
> + {"tso", required_argument, NULL, 0},
> {NULL, 0, 0, 0},
> };
>
> @@ -656,6 +667,28 @@ us_vhost_parse_args(int argc, char **argv)
> }
> }
>
> + /* Enable/disable TX checksum offload. */
> + if (!strncmp(long_option[option_index].name, "tx-csum", MAX_LONG_OPT_SZ)) {
> + ret = parse_num_opt(optarg, 1);
> + if (ret == -1) {
> + RTE_LOG(INFO, VHOST_CONFIG, "Invalid argument for tx-csum [0|1]\n");
> + us_vhost_usage(prgname);
> + return -1;
> + } else
> + enable_tx_csum = ret;
> + }
> +
> + /* Enable/disable TSO offload. */
> + if (!strncmp(long_option[option_index].name, "tso", MAX_LONG_OPT_SZ)) {
> + ret = parse_num_opt(optarg, 1);
> + if (ret == -1) {
> + RTE_LOG(INFO, VHOST_CONFIG, "Invalid argument for tso [0|1]\n");
> + us_vhost_usage(prgname);
> + return -1;
> + } else
> + enable_tso = ret;
> + }
> +
> /* Specify the retries delay time (in useconds) on RX. */
> if (!strncmp(long_option[option_index].name, "rx-retry-delay", MAX_LONG_OPT_SZ)) {
> ret = parse_num_opt(optarg, INT32_MAX);
> @@ -1114,6 +1147,63 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf *m,
> return 0;
> }
>
> +static uint16_t
> +get_psd_sum(void *l3_hdr, uint64_t ol_flags)
> +{
> + if (ol_flags & PKT_TX_IPV4)
> + return rte_ipv4_phdr_cksum(l3_hdr, ol_flags);
> + else /* assume ethertype == ETHER_TYPE_IPv6 */
> + return rte_ipv6_phdr_cksum(l3_hdr, ol_flags);
> +}
> +
> +static void virtio_tx_offload(struct rte_mbuf *m)
> +{
> + void *l3_hdr;
> + struct ipv4_hdr *ipv4_hdr = NULL;
> + struct tcp_hdr *tcp_hdr = NULL;
> + struct udp_hdr *udp_hdr = NULL;
> + struct sctp_hdr *sctp_hdr = NULL;
> + struct ether_hdr *eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> +
> + l3_hdr = (char *)eth_hdr + m->l2_len;
> +
> + if (m->ol_flags & PKT_TX_IPV4) {
> + ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
> + if (m->ol_flags & PKT_TX_IP_CKSUM)
> + ipv4_hdr->hdr_checksum = 0;
> + }
> +
> + if (m->ol_flags & PKT_TX_L4_MASK) {
> + switch (m->ol_flags & PKT_TX_L4_MASK) {
> + case PKT_TX_TCP_CKSUM:
> + tcp_hdr = (struct tcp_hdr *)
> + ((char *)l3_hdr + m->l3_len);
> + tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
I'm wondering whether that's necessary here (even for data going through
the NIC). AFAIK, the kernel sending the data will calculate the pseudo checksum.
(I may be wrong; a simple validation could prove that)
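For reference, this is the value in question: the one's-complement sum over the IPv4 pseudo header, which is seeded into the L4 checksum field when the rest is offloaded. The sketch works on host-order values to stay independent of endianness; struct sk_ipv4_phdr and sk_ipv4_phdr_sum() are hypothetical names, and this is not the rte_ipv4_phdr_cksum() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IPv4 pseudo header, all fields in host byte order. */
struct sk_ipv4_phdr {
	uint32_t src;    /* source address */
	uint32_t dst;    /* destination address */
	uint8_t  proto;  /* L4 protocol, e.g. 6 for TCP */
	uint16_t l4_len; /* L4 header plus payload length */
};

/* One's-complement sum of the pseudo header, folded to 16 bits and
 * NOT complemented, since it is only a partial result. */
static uint16_t
sk_ipv4_phdr_sum(const struct sk_ipv4_phdr *ph)
{
	uint32_t sum = 0;

	assert(ph != NULL);

	sum += ph->src >> 16;
	sum += ph->src & 0xffff;
	sum += ph->dst >> 16;
	sum += ph->dst & 0xffff;
	sum += ph->proto;  /* zero byte followed by the protocol byte */
	sum += ph->l4_len;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

Comparing this value against the checksum field of a captured packet from the guest would be one way to do the simple validation mentioned above.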
--yliu
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 7/8] lib/librte_vhost:dequeue vhost TX offload
2015-11-09 4:00 ` Yuanhan Liu
@ 2015-11-09 5:27 ` Liu, Jijiang
0 siblings, 0 replies; 29+ messages in thread
From: Liu, Jijiang @ 2015-11-09 5:27 UTC (permalink / raw)
To: Yuanhan Liu; +Cc: dev
> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> Sent: Monday, November 9, 2015 12:01 PM
> To: Liu, Jijiang
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 7/8] lib/librte_vhost:dequeue vhost TX
> offload
>
> On Wed, Nov 04, 2015 at 06:54:15PM +0800, Jijiang Liu wrote:
> > Dequeue vhost TX offload in vhost lib.
>
> This is not enough; you need do corresponding setups at RX side, otherwise
> the packet will be droped in the target VM at TCP layer in VM2VM case, for
> checksum validation is failed.
>
> --yliu
Yes, the VM2VM case needs to be considered; I will fix it in the next version of the patch.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in vhost sample
2015-11-09 4:17 ` Yuanhan Liu
@ 2015-11-09 8:17 ` Liu, Jijiang
2015-11-09 8:51 ` Yuanhan Liu
2015-11-09 8:18 ` Liu, Jijiang
2015-11-11 6:47 ` Liu, Jijiang
2 siblings, 1 reply; 29+ messages in thread
From: Liu, Jijiang @ 2015-11-09 8:17 UTC (permalink / raw)
To: Yuanhan Liu; +Cc: dev
> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> Sent: Monday, November 09, 2015 12:17 PM
> To: Liu, Jijiang
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in
> vhost sample
>
> On Wed, Nov 04, 2015 at 06:54:16PM +0800, Jijiang Liu wrote:
> > Change the vhost sample to support and test TX offload.
> >
> > Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
> > ---
> > examples/vhost/main.c | 128
> > ++++++++++++++++++++++++++++++++++++++++++-------
> > 1 files changed, 111 insertions(+), 17 deletions(-)
> >
> > diff --git a/examples/vhost/main.c b/examples/vhost/main.c index
> > 9eac2d0..06e1e8b 100644
> > --- a/examples/vhost/main.c
> > +++ b/examples/vhost/main.c
> > @@ -50,6 +50,10 @@
> > #include <rte_string_fns.h>
> > #include <rte_malloc.h>
> > #include <rte_virtio_net.h>
> > +#include <rte_tcp.h>
> > +#include <rte_ip.h>
> > +#include <rte_udp.h>
> > +#include <rte_sctp.h>
> >
> > #include "main.h"
> >
> > @@ -140,6 +144,8 @@
> >
> Actually, I'd like to see TSO/CSUM offloading is enabled by default:
> they are so common, and they are enabled in a lot places by default, say,
> kernel, qemu. There is no reason to make it "disable" here.
>
This configuration applies only to the vhost sample; TSO/CSUM is enabled by default in the lib layer.
If a user wants to use it in the vhost sample, they just need to change the configuration.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in vhost sample
2015-11-09 4:17 ` Yuanhan Liu
2015-11-09 8:17 ` Liu, Jijiang
@ 2015-11-09 8:18 ` Liu, Jijiang
2015-11-11 6:47 ` Liu, Jijiang
2 siblings, 0 replies; 29+ messages in thread
From: Liu, Jijiang @ 2015-11-09 8:18 UTC (permalink / raw)
To: Yuanhan Liu; +Cc: dev
> >
> > -/* IPv4 Header */
> > -struct ipv4_hdr {
> > - uint8_t version_ihl; /**< version and header length */
> > - uint8_t type_of_service; /**< type of service */
> > - uint16_t total_length; /**< length of packet */
> > - uint16_t packet_id; /**< packet ID */
> > - uint16_t fragment_offset; /**< fragmentation offset */
> > - uint8_t time_to_live; /**< time to live */
> > - uint8_t next_proto_id; /**< protocol ID */
> > - uint16_t hdr_checksum; /**< header checksum */
> > - uint32_t src_addr; /**< source address */
> > - uint32_t dst_addr; /**< destination address */
> > -} __attribute__((__packed__));
>
>
> Minor nit: it's a cleanup, having nothing to do with this patch (to
> demonstrate TSO/CSUM). It belongs to another patch.
>
Ok, it could be a separate patch.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in vhost sample
2015-11-09 8:17 ` Liu, Jijiang
@ 2015-11-09 8:51 ` Yuanhan Liu
0 siblings, 0 replies; 29+ messages in thread
From: Yuanhan Liu @ 2015-11-09 8:51 UTC (permalink / raw)
To: Liu, Jijiang; +Cc: dev
On Mon, Nov 09, 2015 at 08:17:24AM +0000, Liu, Jijiang wrote:
>
>
> > -----Original Message-----
> > From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> > Sent: Monday, November 09, 2015 12:17 PM
> > To: Liu, Jijiang
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in
> > vhost sample
> >
> > On Wed, Nov 04, 2015 at 06:54:16PM +0800, Jijiang Liu wrote:
> > > Change the vhost sample to support and test TX offload.
> > >
> > > Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
> > > ---
> > > > examples/vhost/main.c | 128 ++++++++++++++++++++++++++++++++++++++++++-------
> > > > 1 files changed, 111 insertions(+), 17 deletions(-)
> > > >
> > > > diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> > > > index 9eac2d0..06e1e8b 100644
> > > --- a/examples/vhost/main.c
> > > +++ b/examples/vhost/main.c
> > > @@ -50,6 +50,10 @@
> > > #include <rte_string_fns.h>
> > > #include <rte_malloc.h>
> > > #include <rte_virtio_net.h>
> > > +#include <rte_tcp.h>
> > > +#include <rte_ip.h>
> > > +#include <rte_udp.h>
> > > +#include <rte_sctp.h>
> > >
> > > #include "main.h"
> > >
> > > @@ -140,6 +144,8 @@
> > >
> > > Actually, I'd like to see TSO/CSUM offloading enabled by default:
> > > they are so common, and they are enabled by default in a lot of places,
> > > e.g. the kernel and QEMU. There is no reason to disable them here.
> >
> > This is only the configuration of the vhost sample; TSO/CSUM is enabled by default in the lib layer.
> >
> > If users want to use it in the vhost sample, they just need to change the configuration.
So, why do you want to disable it by default then?
--yliu
* Re: [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in vhost sample
2015-11-09 4:17 ` Yuanhan Liu
2015-11-09 8:17 ` Liu, Jijiang
2015-11-09 8:18 ` Liu, Jijiang
@ 2015-11-11 6:47 ` Liu, Jijiang
2 siblings, 0 replies; 29+ messages in thread
From: Liu, Jijiang @ 2015-11-11 6:47 UTC (permalink / raw)
To: Yuanhan Liu; +Cc: dev
> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> Sent: Monday, November 09, 2015 12:17 PM
> To: Liu, Jijiang
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in
> vhost sample
>
> On Wed, Nov 04, 2015 at 06:54:16PM +0800, Jijiang Liu wrote:
> > Change the vhost sample to support and test TX offload.
> >
> > Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
> > ---
> > examples/vhost/main.c | 128 ++++++++++++++++++++++++++++++++++++++++++-------
> > 1 files changed, 111 insertions(+), 17 deletions(-)
> >
> > diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> > index 9eac2d0..06e1e8b 100644
> > --- a/examples/vhost/main.c
> > +++ b/examples/vhost/main.c
> > @@ -50,6 +50,10 @@
> > #include <rte_string_fns.h>
> > #include <rte_malloc.h>
> > #include <rte_virtio_net.h>
> > +#include <rte_tcp.h>
> > +#include <rte_ip.h>
> > +#include <rte_udp.h>
> > +#include <rte_sctp.h>
> >
> > #include "main.h"
> >
> > @@ -140,6 +144,8 @@
> >
> > #define MBUF_EXT_MEM(mb) (rte_mbuf_from_indirect(mb) != (mb))
> >
> > +#define VIRTIO_TX_CKSUM_OFFLOAD_MASK (PKT_TX_IP_CKSUM | PKT_TX_L4_MASK)
> > +
> > /* mask of enabled ports */
> > static uint32_t enabled_port_mask = 0;
> >
> > @@ -197,6 +203,13 @@ typedef enum {
> > static uint32_t enable_stats = 0;
> > /* Enable retries on RX. */
> > static uint32_t enable_retry = 1;
> > +
> > +/* Disable TX checksum offload */
> ^^^^^^^
> Did you mean "Enable"?
>
> > +static uint32_t enable_tx_csum;
> > +
> > +/* Disable TSO offload */
> > +static uint32_t enable_tso;
>
> Actually, I'd like to see TSO/CSUM offloading enabled by default:
> they are so common, and they are enabled by default in a lot of places,
> e.g. the kernel and QEMU. There is no reason to disable them here.
>
> > +
> > /* Specify timeout (in useconds) between retries on RX. */
> > static uint32_t burst_rx_delay_time = BURST_RX_WAIT_US;
> > /* Specify the number of retries on RX. */
> > @@ -292,20 +305,6 @@ struct vlan_ethhdr {
> > __be16 h_vlan_encapsulated_proto;
> > };
> >
> > -/* IPv4 Header */
> > -struct ipv4_hdr {
> > - uint8_t version_ihl; /**< version and header length */
> > - uint8_t type_of_service; /**< type of service */
> > - uint16_t total_length; /**< length of packet */
> > - uint16_t packet_id; /**< packet ID */
> > - uint16_t fragment_offset; /**< fragmentation offset */
> > - uint8_t time_to_live; /**< time to live */
> > - uint8_t next_proto_id; /**< protocol ID */
> > - uint16_t hdr_checksum; /**< header checksum */
> > - uint32_t src_addr; /**< source address */
> > - uint32_t dst_addr; /**< destination address */
> > -} __attribute__((__packed__));
>
>
> Minor nit: it's a cleanup, having nothing to do with this patch (to
> demonstrate TSO/CSUM). It belongs to another patch.
>
> > -
> > /* Header lengths. */
> > #define VLAN_HLEN 4
> > #define VLAN_ETH_HLEN 18
> > @@ -441,6 +440,14 @@ port_init(uint8_t port)
> >
> > if (port >= rte_eth_dev_count()) return -1;
> >
> > + if (enable_tx_csum == 0)
> > + rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_CSUM);
> > +
> > + if (enable_tso == 0) {
> > + rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4);
> > + rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO6);
> > + }
> > +
> > rx_rings = (uint16_t)dev_info.max_rx_queues;
> > /* Configure ethernet device. */
> > retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
> > @@ -576,7 +583,9 @@ us_vhost_usage(const char *prgname)
> > " --rx-desc-num [0-N]: the number of descriptors on rx, "
> > "used only when zero copy is enabled.\n"
> > " --tx-desc-num [0-N]: the number of descriptors on tx, "
> > - "used only when zero copy is enabled.\n",
> > + "used only when zero copy is enabled.\n"
> > + " --tx-csum [0|1] disable/enable TX checksum offload.\n"
> > + " --tso [0|1] disable/enable TCP segement offload.\n",
> > prgname);
> > }
> >
> > @@ -602,6 +611,8 @@ us_vhost_parse_args(int argc, char **argv)
> > {"zero-copy", required_argument, NULL, 0},
> > {"rx-desc-num", required_argument, NULL, 0},
> > {"tx-desc-num", required_argument, NULL, 0},
> > + {"tx-csum", required_argument, NULL, 0},
> > + {"tso", required_argument, NULL, 0},
> > {NULL, 0, 0, 0},
> > };
> >
> > @@ -656,6 +667,28 @@ us_vhost_parse_args(int argc, char **argv)
> > }
> > }
> >
> > + /* Enable/disable TX checksum offload. */
> > + if (!strncmp(long_option[option_index].name, "tx-csum", MAX_LONG_OPT_SZ)) {
> > + ret = parse_num_opt(optarg, 1);
> > + if (ret == -1) {
> > + RTE_LOG(INFO, VHOST_CONFIG, "Invalid argument for tx-csum [0|1]\n");
> > + us_vhost_usage(prgname);
> > + return -1;
> > + } else
> > + enable_tx_csum = ret;
> > + }
> > +
> > + /* Enable/disable TSO offload. */
> > + if (!strncmp(long_option[option_index].name, "tso", MAX_LONG_OPT_SZ)) {
> > + ret = parse_num_opt(optarg, 1);
> > + if (ret == -1) {
> > + RTE_LOG(INFO, VHOST_CONFIG, "Invalid argument for tso [0|1]\n");
> > + us_vhost_usage(prgname);
> > + return -1;
> > + } else
> > + enable_tso = ret;
> > + }
> > +
> > /* Specify the retries delay time (in useconds) on RX. */
> > if (!strncmp(long_option[option_index].name, "rx-retry-delay", MAX_LONG_OPT_SZ)) {
> > ret = parse_num_opt(optarg, INT32_MAX);
> > @@ -1114,6 +1147,63 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf *m,
> > return 0;
> > }
> >
> > +static uint16_t
> > +get_psd_sum(void *l3_hdr, uint64_t ol_flags)
> > +{
> > + if (ol_flags & PKT_TX_IPV4)
> > + return rte_ipv4_phdr_cksum(l3_hdr, ol_flags);
> > + else /* assume ethertype == ETHER_TYPE_IPv6 */
> > + return rte_ipv6_phdr_cksum(l3_hdr, ol_flags);
> > +}
> > +
> > +static void virtio_tx_offload(struct rte_mbuf *m)
> > +{
> > + void *l3_hdr;
> > + struct ipv4_hdr *ipv4_hdr = NULL;
> > + struct tcp_hdr *tcp_hdr = NULL;
> > + struct udp_hdr *udp_hdr = NULL;
> > + struct sctp_hdr *sctp_hdr = NULL;
> > + struct ether_hdr *eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> > +
> > + l3_hdr = (char *)eth_hdr + m->l2_len;
> > +
> > + if (m->ol_flags & PKT_TX_IPV4) {
> > + ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
> > + if (m->ol_flags & PKT_TX_IP_CKSUM)
> > + ipv4_hdr->hdr_checksum = 0;
> > + }
> > +
> > + if (m->ol_flags & PKT_TX_L4_MASK) {
> > + switch (m->ol_flags & PKT_TX_L4_MASK) {
> > + case PKT_TX_TCP_CKSUM:
> > + tcp_hdr = (struct tcp_hdr *)
> > + ((char *)l3_hdr + m->l3_len);
> > + tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
>
> I'm wondering whether that's necessary here (even for data going through the NIC).
> AFAIK, the kernel sending the data will calculate the pseudo checksum.
>
> (I may be wrong; a simple validation could prove that)
>
After testing in combination with TSO, I found that these fields need to be set.
> --yliu
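For readers following the pseudo checksum discussion, here is a rough standalone sketch of the value that get_psd_sum() is meant to put into the TCP checksum field for IPv4. It is plain C and does not use DPDK; the names fold16 and ipv4_pseudo_sum are hypothetical. It assumes a raw IPv4 header in network byte order and returns the non-complemented one's-complement sum as a host-order number, which would still have to be stored big-endian in the packet. Passing an L4 length of 0 reflects my understanding that the length is left out of the pseudo header when TSO is requested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fold the carries of a 32-bit sum back into 16 bits. */
static uint16_t
fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Hypothetical helper: one's-complement sum over the IPv4 pseudo header
 * (source address, destination address, zero byte + protocol, L4 length).
 * 'ip' points at the first byte of an IPv4 header in network byte order.
 * The result is NOT complemented; the device finishes the checksum.
 */
static uint16_t
ipv4_pseudo_sum(const uint8_t *ip, uint16_t l4_len)
{
	uint32_t sum = 0;
	size_t i;

	assert(ip != NULL);

	/* Bytes 12..19 hold the source and destination addresses. */
	for (i = 12; i < 20; i += 2)
		sum += ((uint32_t)ip[i] << 8) | ip[i + 1];

	sum += ip[9];   /* zero byte followed by the protocol number */
	sum += l4_len;  /* pass 0 here for the TSO case */

	return fold16(sum);
}
```

For a header with source 192.168.0.1, destination 192.168.0.2 and protocol 6, the sum changes with the length argument, so a guest that computed it for the whole TSO payload and a host that expects length 0 would disagree.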
* [dpdk-dev] [PATCH v3 7/8] lib/librte_vhost:dequeue vhost TX offload
2015-11-04 8:35 Jijiang Liu
@ 2015-11-04 8:35 ` Jijiang Liu
0 siblings, 0 replies; 29+ messages in thread
From: Jijiang Liu @ 2015-11-04 8:35 UTC (permalink / raw)
To: dev
Dequeue vhost TX offload in vhost lib.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
lib/librte_vhost/vhost_rxtx.c | 108 ++++++++++++++++++++++++++++++++++++++++-
1 files changed, 107 insertions(+), 1 deletions(-)
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 7026bfa..a888ba9 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -36,7 +36,12 @@
#include <rte_mbuf.h>
#include <rte_memcpy.h>
+#include <rte_ether.h>
+#include <rte_ip.h>
#include <rte_virtio_net.h>
+#include <rte_tcp.h>
+#include <rte_udp.h>
+#include <rte_sctp.h>
#include "vhost-net.h"
@@ -548,6 +553,101 @@ rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t queue_id,
return virtio_dev_rx(dev, queue_id, pkts, count);
}
+static void
+parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr)
+{
+ struct ipv4_hdr *ipv4_hdr;
+ struct ipv6_hdr *ipv6_hdr;
+ void *l3_hdr = NULL;
+ struct ether_hdr *eth_hdr;
+ uint16_t ethertype;
+
+ eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ m->l2_len = sizeof(struct ether_hdr);
+ ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
+
+ if (ethertype == ETHER_TYPE_VLAN) {
+ struct vlan_hdr *vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1);
+
+ m->l2_len += sizeof(struct vlan_hdr);
+ ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
+ }
+
+ l3_hdr = (char *)eth_hdr + m->l2_len;
+
+ switch (ethertype) {
+ case ETHER_TYPE_IPv4:
+ ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
+ *l4_proto = ipv4_hdr->next_proto_id;
+ m->l3_len = (ipv4_hdr->version_ihl & 0x0f) * 4;
+ *l4_hdr = (char *)l3_hdr + m->l3_len;
+ m->ol_flags |= PKT_TX_IPV4;
+ break;
+ case ETHER_TYPE_IPv6:
+ ipv6_hdr = (struct ipv6_hdr *)l3_hdr;
+ *l4_proto = ipv6_hdr->proto;
+ m->ol_flags |= PKT_TX_IPV6;
+ m->l3_len = sizeof(struct ipv6_hdr);
+ *l4_hdr = (char *)l3_hdr + m->l3_len;
+ break;
+ default:
+ m->l3_len = 0;
+ *l4_proto = 0;
+ break;
+ }
+}
+
+static inline void __attribute__((always_inline))
+vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m)
+{
+ uint16_t l4_proto = 0;
+ void *l4_hdr = NULL;
+ struct tcp_hdr *tcp_hdr = NULL;
+
+ parse_ethernet(m, &l4_proto, &l4_hdr);
+ if (hdr->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) {
+ if ((hdr->csum_start == m->l2_len) &&
+ (hdr->csum_offset == offsetof(struct ipv4_hdr,
+ hdr_checksum)))
+ m->ol_flags |= PKT_TX_IP_CKSUM;
+ else if (hdr->csum_start == (m->l2_len + m->l3_len)) {
+ switch (hdr->csum_offset) {
+ case (offsetof(struct tcp_hdr, cksum)):
+ if (l4_proto == IPPROTO_TCP)
+ m->ol_flags |= PKT_TX_TCP_CKSUM;
+ break;
+ case (offsetof(struct udp_hdr, dgram_cksum)):
+ if (l4_proto == IPPROTO_UDP)
+ m->ol_flags |= PKT_TX_UDP_CKSUM;
+ break;
+ case (offsetof(struct sctp_hdr, cksum)):
+ if (l4_proto == IPPROTO_SCTP)
+ m->ol_flags |= PKT_TX_SCTP_CKSUM;
+ break;
+ default:
+ break;
+ }
+ }
+ }
+
+ if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
+ switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
+ case VIRTIO_NET_HDR_GSO_TCPV4:
+ case VIRTIO_NET_HDR_GSO_TCPV6:
+ tcp_hdr = (struct tcp_hdr *)l4_hdr;
+ m->ol_flags |= PKT_TX_TCP_SEG;
+ m->tso_segsz = hdr->gso_size;
+ m->l4_len = (tcp_hdr->data_off & 0xf0) >> 2;
+ break;
+ default:
+ RTE_LOG(WARNING, VHOST_DATA,
+ "unsupported gso type %u.\n", hdr->gso_type);
+ break;
+ }
+ }
+}
+
uint16_t
rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count)
@@ -556,11 +656,13 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
struct vhost_virtqueue *vq;
struct vring_desc *desc;
uint64_t vb_addr = 0;
+ uint64_t vb_net_hdr_addr = 0;
uint32_t head[MAX_PKT_BURST];
uint32_t used_idx;
uint32_t i;
uint16_t free_entries, entry_success = 0;
uint16_t avail_idx;
+ struct virtio_net_hdr *hdr = NULL;
if (unlikely(queue_id != VIRTIO_TXQ)) {
LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
@@ -607,6 +709,9 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
desc = &vq->desc[head[entry_success]];
+ vb_net_hdr_addr = gpa_to_vva(dev, desc->addr);
+ hdr = (struct virtio_net_hdr *)((uintptr_t)vb_net_hdr_addr);
+
/* Discard first buffer as it is the virtio header */
if (desc->flags & VRING_DESC_F_NEXT) {
desc = &vq->desc[desc->next];
@@ -745,7 +850,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
break;
m->nb_segs = seg_num;
-
+ if ((hdr->flags != 0) || (hdr->gso_type != 0))
+ vhost_dequeue_offload(hdr, m);
pkts[entry_success] = m;
vq->last_used_idx++;
entry_success++;
--
1.7.7.6
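As a reading aid for vhost_dequeue_offload() in the patch above, the following standalone sketch shows the same classification idea without any DPDK or virtio headers: the checksum start and offset from the guest's header are compared against the parsed L2/L3 lengths and the L4 protocol. Every name in it (struct offload_req, classify_l4_csum, tcp_hdr_len, the enum values) is hypothetical. One deliberate difference, labeled here as an assumption rather than the patch's behaviour: the flag is tested with a bit mask instead of an equality comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in with the same value as VIRTIO_NET_HDR_F_NEEDS_CSUM. */
#define HDR_F_NEEDS_CSUM 1u

/* Hypothetical result type; the patch sets PKT_TX_*_CKSUM flags instead. */
enum l4_csum { L4_CSUM_NONE, L4_CSUM_TCP, L4_CSUM_UDP, L4_CSUM_SCTP };

/* Hypothetical subset of the fields carried in the virtio net header. */
struct offload_req {
	uint8_t flags;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Decide which L4 checksum the guest asked for, if any. */
static enum l4_csum
classify_l4_csum(const struct offload_req *req, uint16_t l2_len,
		uint16_t l3_len, uint8_t l4_proto)
{
	assert(req != NULL);

	if (!(req->flags & HDR_F_NEEDS_CSUM))
		return L4_CSUM_NONE;
	if (req->csum_start != l2_len + l3_len)
		return L4_CSUM_NONE;

	switch (req->csum_offset) {
	case 16: /* checksum field offset inside the TCP header */
		return l4_proto == 6 ? L4_CSUM_TCP : L4_CSUM_NONE;
	case 6:  /* checksum field offset inside the UDP header */
		return l4_proto == 17 ? L4_CSUM_UDP : L4_CSUM_NONE;
	case 8:  /* checksum field offset inside the SCTP common header */
		return l4_proto == 132 ? L4_CSUM_SCTP : L4_CSUM_NONE;
	default:
		return L4_CSUM_NONE;
	}
}

/* TCP header length in bytes from the data-offset byte (upper 4 bits, in 32-bit words). */
static uint8_t
tcp_hdr_len(uint8_t data_off)
{
	return (uint8_t)((data_off & 0xf0) >> 2);
}
```

The last helper mirrors the l4_len computation in the TSO branch: a data-offset byte of 0x50 means five 32-bit words, that is a 20-byte TCP header.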
end of thread, other threads:[~2015-11-11 6:48 UTC | newest]
Thread overview: 29+ messages
2015-11-04 10:54 [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 1/8] driver/virtio:add virtual addr for virtio net header Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 2/8] driver/virtio: record virtual address of " Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 3/8] driver/virtio:add vhost TX checksum support capability in virtio-net Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 4/8] driver/virtio:fill virtio device info for TX offload Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 5/8] driver/virtio:enqueue vhost " Jijiang Liu
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 6/8] " Jijiang Liu
2015-11-04 11:17 ` Thomas Monjalon
2015-11-04 12:52 ` Liu, Jijiang
2015-11-04 13:18 ` Thomas Monjalon
2015-11-05 8:49 ` Xu, Qian Q
2015-11-05 9:02 ` Thomas Monjalon
2015-11-05 10:44 ` Xu, Qian Q
2015-11-06 8:24 ` Xu, Qian Q
2015-11-04 13:06 ` Liu, Jijiang
2015-11-04 13:08 ` Liu, Jijiang
2015-11-04 13:15 ` Liu, Jijiang
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 7/8] lib/librte_vhost:dequeue " Jijiang Liu
2015-11-09 4:00 ` Yuanhan Liu
2015-11-09 5:27 ` Liu, Jijiang
2015-11-04 10:54 ` [dpdk-dev] [PATCH v3 8/8] examples/vhost:support TX offload in vhost sample Jijiang Liu
2015-11-09 4:17 ` Yuanhan Liu
2015-11-09 8:17 ` Liu, Jijiang
2015-11-09 8:51 ` Yuanhan Liu
2015-11-09 8:18 ` Liu, Jijiang
2015-11-11 6:47 ` Liu, Jijiang
2015-11-04 11:14 ` [dpdk-dev] [PATCH v3 0/8] add vhost TX offload support Tan, Jianfeng
2015-11-05 14:24 ` Glynn, Michael J
-- strict thread matches above, loose matches on Subject: below --
2015-11-04 8:35 Jijiang Liu
2015-11-04 8:35 ` [dpdk-dev] [PATCH v3 7/8] lib/librte_vhost:dequeue vhost TX offload Jijiang Liu