From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: dev@dpdk.org
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>,
Yuanhan Liu <yuanhan.liu@linux.intel.com>
Subject: [dpdk-dev] [PATCH v3 5/7] vhost: add a flag to enable dequeue zero copy
Date: Sun, 9 Oct 2016 15:27:58 +0800 [thread overview]
Message-ID: <1475998080-4644-6-git-send-email-yuanhan.liu@linux.intel.com> (raw)
In-Reply-To: <1475998080-4644-1-git-send-email-yuanhan.liu@linux.intel.com>
Dequeue zero copy is disabled by default. Here add a new flag
``RTE_VHOST_USER_DEQUEUE_ZERO_COPY`` to explictily enable it.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
v2: - update release log
- doc dequeue zero copy in detail
---
doc/guides/prog_guide/vhost_lib.rst | 35 +++++++++++++++++++++++++++++++++-
doc/guides/rel_notes/release_16_11.rst | 13 +++++++++++++
lib/librte_vhost/rte_virtio_net.h | 1 +
lib/librte_vhost/socket.c | 5 +++++
lib/librte_vhost/vhost.c | 10 ++++++++++
lib/librte_vhost/vhost.h | 1 +
6 files changed, 64 insertions(+), 1 deletion(-)
diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 6b0c6b2..3fa9dd7 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -79,7 +79,7 @@ The following is an overview of the Vhost API functions:
``/dev/path`` character device file will be created. For vhost-user server
mode, a Unix domain socket file ``path`` will be created.
- Currently two flags are supported (these are valid for vhost-user only):
+ Currently supported flags are (these are valid for vhost-user only):
- ``RTE_VHOST_USER_CLIENT``
@@ -97,6 +97,39 @@ The following is an overview of the Vhost API functions:
This reconnect option is enabled by default. However, it can be turned off
by setting this flag.
+ - ``RTE_VHOST_USER_DEQUEUE_ZERO_COPY``
+
+ Dequeue zero copy will be enabled when this flag is set. It is disabled by
+ default.
+
+ There are some truths (including limitations) you might want to know while
+ setting this flag:
+
+ * zero copy is not good for small packets (typically for packet size below
+ 512).
+
+ * zero copy is really good for VM2VM case. For iperf between two VMs, the
+ boost could be above 70% (when TSO is enableld).
+
+ * for VM2NIC case, the ``nb_tx_desc`` has to be small enough: <= 64 if virtio
+ indirect feature is not enabled and <= 128 if it is enabled.
+
+ The is because when dequeue zero copy is enabled, guest Tx used vring will
+ be updated only when corresponding mbuf is freed. Thus, the nb_tx_desc
+ has to be small enough so that the PMD driver will run out of available
+ Tx descriptors and free mbufs timely. Otherwise, guest Tx vring would be
+ starved.
+
+ * Guest memory should be backended with huge pages to achieve better
+ performance. Using 1G page size is the best.
+
+ When dequeue zero copy is enabled, the guest phys address and host phys
+ address mapping has to be established. Using non-huge pages means far
+ more page segments. To make it simple, DPDK vhost does a linear search
+ of those segments, thus the fewer the segments, the quicker we will get
+ the mapping. NOTE: we may speed it by using radix tree searching in
+ future.
+
* ``rte_vhost_driver_session_start()``
This function starts the vhost session loop to handle vhost messages. It
diff --git a/doc/guides/rel_notes/release_16_11.rst b/doc/guides/rel_notes/release_16_11.rst
index 905186a..2180d8d 100644
--- a/doc/guides/rel_notes/release_16_11.rst
+++ b/doc/guides/rel_notes/release_16_11.rst
@@ -36,6 +36,19 @@ New Features
This section is a comment. Make sure to start the actual text at the margin.
+* **Added vhost-user dequeue zero copy support**
+
+ The copy in dequeue path is saved, which is meant to improve the performance.
+ In the VM2VM case, the boost is quite impressive. The bigger the packet size,
+ the bigger performance boost you may get. However, for VM2NIC case, there
+ are some limitations, yet the boost is not that impressive as VM2VM case.
+ It may even drop quite a bit for small packets.
+
+ For such reason, this feature is disabled by default. It can be enabled when
+ ``RTE_VHOST_USER_DEQUEUE_ZERO_COPY`` flag is given. Check the vhost section
+ at programming guide for more information.
+
+
* **Added vhost-user indirect descriptors support.**
If indirect descriptor feature is negotiated, each packet sent by the guest
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index a88aecd..c53ff64 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -53,6 +53,7 @@
#define RTE_VHOST_USER_CLIENT (1ULL << 0)
#define RTE_VHOST_USER_NO_RECONNECT (1ULL << 1)
+#define RTE_VHOST_USER_DEQUEUE_ZERO_COPY (1ULL << 2)
/* Enum for virtqueue management. */
enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index bf03f84..967cb65 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -62,6 +62,7 @@ struct vhost_user_socket {
int connfd;
bool is_server;
bool reconnect;
+ bool dequeue_zero_copy;
};
struct vhost_user_connection {
@@ -203,6 +204,9 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket)
size = strnlen(vsocket->path, PATH_MAX);
vhost_set_ifname(vid, vsocket->path, size);
+ if (vsocket->dequeue_zero_copy)
+ vhost_enable_dequeue_zero_copy(vid);
+
RTE_LOG(INFO, VHOST_CONFIG, "new device, handle is %d\n", vid);
vsocket->connfd = fd;
@@ -499,6 +503,7 @@ rte_vhost_driver_register(const char *path, uint64_t flags)
memset(vsocket, 0, sizeof(struct vhost_user_socket));
vsocket->path = strdup(path);
vsocket->connfd = -1;
+ vsocket->dequeue_zero_copy = flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
if ((flags & RTE_VHOST_USER_CLIENT) != 0) {
vsocket->reconnect = !(flags & RTE_VHOST_USER_NO_RECONNECT);
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index dbf5d1b..469117a 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -291,6 +291,16 @@ vhost_set_ifname(int vid, const char *if_name, unsigned int if_len)
dev->ifname[sizeof(dev->ifname) - 1] = '\0';
}
+void
+vhost_enable_dequeue_zero_copy(int vid)
+{
+ struct virtio_net *dev = get_device(vid);
+
+ if (dev == NULL)
+ return;
+
+ dev->dequeue_zero_copy = 1;
+}
int
rte_vhost_get_numa_node(int vid)
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index be8a398..53dbf33 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -278,6 +278,7 @@ void vhost_destroy_device(int);
int alloc_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx);
void vhost_set_ifname(int, const char *if_name, unsigned int if_len);
+void vhost_enable_dequeue_zero_copy(int vid);
/*
* Backend-specific cleanup. Defined by vhost-cuse and vhost-user.
--
1.9.0
next prev parent reply other threads:[~2016-10-09 7:27 UTC|newest]
Thread overview: 75+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-23 8:10 [dpdk-dev] [PATCH 0/6] vhost: add Tx zero copy support Yuanhan Liu
2016-08-23 8:10 ` [dpdk-dev] [PATCH 1/6] vhost: simplify memory regions handling Yuanhan Liu
2016-08-23 9:17 ` Maxime Coquelin
2016-08-24 7:26 ` Xu, Qian Q
2016-08-24 7:40 ` Yuanhan Liu
2016-08-24 7:36 ` Xu, Qian Q
2016-08-23 8:10 ` [dpdk-dev] [PATCH 2/6] vhost: get guest/host physical address mappings Yuanhan Liu
2016-08-23 9:58 ` Maxime Coquelin
2016-08-23 12:32 ` Yuanhan Liu
2016-08-23 13:25 ` Maxime Coquelin
2016-08-23 13:49 ` Yuanhan Liu
2016-08-23 14:05 ` Maxime Coquelin
2016-08-23 8:10 ` [dpdk-dev] [PATCH 3/6] vhost: introduce last avail idx for Tx Yuanhan Liu
2016-08-23 12:27 ` Maxime Coquelin
2016-08-23 8:10 ` [dpdk-dev] [PATCH 4/6] vhost: add Tx zero copy Yuanhan Liu
2016-08-23 14:04 ` Maxime Coquelin
2016-08-23 14:31 ` Yuanhan Liu
2016-08-23 15:40 ` Maxime Coquelin
2016-08-23 8:10 ` [dpdk-dev] [PATCH 5/6] vhost: add a flag to enable " Yuanhan Liu
2016-09-06 9:00 ` Xu, Qian Q
2016-09-06 9:42 ` Xu, Qian Q
2016-09-06 10:02 ` Yuanhan Liu
2016-09-07 2:43 ` Xu, Qian Q
2016-09-06 9:55 ` Yuanhan Liu
2016-09-07 16:00 ` Thomas Monjalon
2016-09-08 7:21 ` Yuanhan Liu
2016-09-08 7:57 ` Thomas Monjalon
2016-08-23 8:10 ` [dpdk-dev] [PATCH 6/6] examples/vhost: add an option " Yuanhan Liu
2016-08-23 9:31 ` Thomas Monjalon
2016-08-23 12:33 ` Yuanhan Liu
2016-08-23 14:14 ` Maxime Coquelin
2016-08-23 14:45 ` Yuanhan Liu
2016-08-23 14:18 ` [dpdk-dev] [PATCH 0/6] vhost: add Tx zero copy support Maxime Coquelin
2016-08-23 14:42 ` Yuanhan Liu
2016-08-23 14:53 ` Yuanhan Liu
2016-08-23 16:41 ` Maxime Coquelin
2016-08-29 8:32 ` Xu, Qian Q
2016-08-29 8:57 ` Xu, Qian Q
2016-09-23 4:11 ` Yuanhan Liu
2016-10-09 15:20 ` Yuanhan Liu
2016-09-23 4:13 ` [dpdk-dev] [PATCH v2 0/7] vhost: add dequeue " Yuanhan Liu
2016-09-23 4:13 ` [dpdk-dev] [PATCH v2 1/7] vhost: simplify memory regions handling Yuanhan Liu
2016-09-23 4:13 ` [dpdk-dev] [PATCH v2 2/7] vhost: get guest/host physical address mappings Yuanhan Liu
2016-09-26 20:17 ` Maxime Coquelin
2016-09-23 4:13 ` [dpdk-dev] [PATCH v2 3/7] vhost: introduce last avail idx for dequeue Yuanhan Liu
2016-09-23 4:13 ` [dpdk-dev] [PATCH v2 4/7] vhost: add dequeue zero copy Yuanhan Liu
2016-09-26 20:45 ` Maxime Coquelin
2016-10-06 14:37 ` Xu, Qian Q
2016-10-09 2:03 ` Yuanhan Liu
2016-10-10 10:12 ` Xu, Qian Q
2016-10-10 10:14 ` Maxime Coquelin
2016-10-10 10:22 ` Xu, Qian Q
2016-10-10 10:40 ` Xu, Qian Q
2016-10-10 11:48 ` Maxime Coquelin
2016-09-23 4:13 ` [dpdk-dev] [PATCH v2 5/7] vhost: add a flag to enable " Yuanhan Liu
2016-09-26 20:57 ` Maxime Coquelin
2016-09-23 4:13 ` [dpdk-dev] [PATCH v2 6/7] examples/vhost: add an option " Yuanhan Liu
2016-09-26 21:05 ` Maxime Coquelin
2016-09-23 4:13 ` [dpdk-dev] [PATCH v2 7/7] net/vhost: " Yuanhan Liu
2016-09-26 21:05 ` Maxime Coquelin
2016-10-09 7:27 ` [dpdk-dev] [PATCH v3 0/7] vhost: add dequeue zero copy support Yuanhan Liu
2016-10-09 7:27 ` [dpdk-dev] [PATCH v3 1/7] vhost: simplify memory regions handling Yuanhan Liu
2016-10-09 7:27 ` [dpdk-dev] [PATCH v3 2/7] vhost: get guest/host physical address mappings Yuanhan Liu
2016-11-29 3:10 ` linhaifeng
2016-11-29 13:14 ` linhaifeng
2016-10-09 7:27 ` [dpdk-dev] [PATCH v3 3/7] vhost: introduce last avail idx for dequeue Yuanhan Liu
2016-10-09 7:27 ` [dpdk-dev] [PATCH v3 4/7] vhost: add dequeue zero copy Yuanhan Liu
2016-10-09 7:27 ` Yuanhan Liu [this message]
2016-10-09 7:27 ` [dpdk-dev] [PATCH v3 6/7] examples/vhost: add an option to enable " Yuanhan Liu
2016-10-09 7:28 ` [dpdk-dev] [PATCH v3 7/7] net/vhost: " Yuanhan Liu
2016-10-11 13:04 ` [dpdk-dev] [PATCH v3 0/7] vhost: add dequeue zero copy support Xu, Qian Q
2016-10-12 7:48 ` Yuanhan Liu
2016-10-09 10:46 ` [dpdk-dev] [PATCH 0/6] vhost: add Tx " linhaifeng
2016-10-10 8:03 ` Yuanhan Liu
2016-10-14 7:30 ` linhaifeng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1475998080-4644-6-git-send-email-yuanhan.liu@linux.intel.com \
--to=yuanhan.liu@linux.intel.com \
--cc=dev@dpdk.org \
--cc=maxime.coquelin@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).