DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [RFC v2 0/3] host: multiqueue improvements
@ 2018-02-27 14:34 Maxime Coquelin
  2018-02-27 14:34 ` [dpdk-dev] [RFC v2 1/3] vhost: invalidate vring addresses in cleanup_vq() Maxime Coquelin
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Maxime Coquelin @ 2018-02-27 14:34 UTC (permalink / raw)
  To: jianfeng.tan, stefanha, tiwei.bie, jfreimann, dev; +Cc: Maxime Coquelin

This second revision takes Jens comments into account, main
change is fixing an off-by-one error in patch 2.

The series introduce support for a new protocol request that
notifies the backend with Virtio device status updates.

Main goal is to be able with Virtio 1.0 devices to start
the port even if the guest hasn't initialized all the
queue pairs of the device. This case happens for example
with Windows driver if more queue pairs are declared than
there are vCPUs.

The patch also handles reset and failed driver status to
stop the device and destroy the virtqueues.

Last patch implements a workaround for old and current
QEMUs, that sends SET_VRING_ADDR requests for uninitalized
queues, which can leads to guest memory corruption if
the host application requests to diasble queues
notifications.

I posted the series as RFC, as the QEMU & specification
parts for the new request haven't been accepted yet.


Changes since RFC v1:
=====================
- move virtio_status declaration in the right patch
- Fix off-by-one error when removing uninitialized queues

Maxime Coquelin (3):
  vhost: invalidate vring addresses in cleanup_vq()
  vhost: add SET_VIRTIO_STATUS support
  vhost_user: work around invalid rings addresses sent by QEMU

 lib/librte_vhost/vhost.c      |   6 ++-
 lib/librte_vhost/vhost.h      |   4 +-
 lib/librte_vhost/vhost_user.c | 113 +++++++++++++++++++++++++++++++++++++++++-
 lib/librte_vhost/vhost_user.h |   5 +-
 4 files changed, 123 insertions(+), 5 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [dpdk-dev] [RFC v2 1/3] vhost: invalidate vring addresses in cleanup_vq()
  2018-02-27 14:34 [dpdk-dev] [RFC v2 0/3] host: multiqueue improvements Maxime Coquelin
@ 2018-02-27 14:34 ` Maxime Coquelin
  2018-02-28  6:35   ` Yang, Zhiyong
  2018-02-27 14:34 ` [dpdk-dev] [RFC v2 2/3] vhost: add SET_VIRTIO_STATUS support Maxime Coquelin
  2018-02-27 14:34 ` [dpdk-dev] [RFC v2 3/3] vhost_user: work around invalid rings addresses sent by QEMU Maxime Coquelin
  2 siblings, 1 reply; 5+ messages in thread
From: Maxime Coquelin @ 2018-02-27 14:34 UTC (permalink / raw)
  To: jianfeng.tan, stefanha, tiwei.bie, jfreimann, dev; +Cc: Maxime Coquelin

When cleaning-up the virtqueue, we also need to invalidate its
addresses to be sure outdated addresses won't be used later.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
---
 lib/librte_vhost/vhost.c      | 6 ++++--
 lib/librte_vhost/vhost.h      | 3 ++-
 lib/librte_vhost/vhost_user.c | 2 +-
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index f6f12a03b..e4281cf67 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -69,12 +69,14 @@ __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
 }
 
 void
-cleanup_vq(struct vhost_virtqueue *vq, int destroy)
+cleanup_vq(struct virtio_net *dev, struct vhost_virtqueue *vq, int destroy)
 {
 	if ((vq->callfd >= 0) && (destroy != 0))
 		close(vq->callfd);
 	if (vq->kickfd >= 0)
 		close(vq->kickfd);
+
+	vring_invalidate(dev, vq);
 }
 
 /*
@@ -89,7 +91,7 @@ cleanup_device(struct virtio_net *dev, int destroy)
 	vhost_backend_cleanup(dev);
 
 	for (i = 0; i < dev->nr_vring; i++)
-		cleanup_vq(dev->virtqueue[i], destroy);
+		cleanup_vq(dev, dev->virtqueue[i], destroy);
 }
 
 void
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 58aec2e0d..481700489 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -362,7 +362,8 @@ void cleanup_device(struct virtio_net *dev, int destroy);
 void reset_device(struct virtio_net *dev);
 void vhost_destroy_device(int);
 
-void cleanup_vq(struct vhost_virtqueue *vq, int destroy);
+void cleanup_vq(struct virtio_net *dev, struct vhost_virtqueue *vq,
+				int destroy);
 void free_vq(struct vhost_virtqueue *vq);
 
 int alloc_vring_queue(struct virtio_net *dev, uint32_t vring_idx);
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 5c5361066..c256ebb06 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -219,7 +219,7 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features)
 				continue;
 
 			dev->virtqueue[dev->nr_vring] = NULL;
-			cleanup_vq(vq, 1);
+			cleanup_vq(dev, vq, 1);
 			free_vq(vq);
 		}
 	}
-- 
2.14.3

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [dpdk-dev] [RFC v2 2/3] vhost: add SET_VIRTIO_STATUS support
  2018-02-27 14:34 [dpdk-dev] [RFC v2 0/3] host: multiqueue improvements Maxime Coquelin
  2018-02-27 14:34 ` [dpdk-dev] [RFC v2 1/3] vhost: invalidate vring addresses in cleanup_vq() Maxime Coquelin
@ 2018-02-27 14:34 ` Maxime Coquelin
  2018-02-27 14:34 ` [dpdk-dev] [RFC v2 3/3] vhost_user: work around invalid rings addresses sent by QEMU Maxime Coquelin
  2 siblings, 0 replies; 5+ messages in thread
From: Maxime Coquelin @ 2018-02-27 14:34 UTC (permalink / raw)
  To: jianfeng.tan, stefanha, tiwei.bie, jfreimann, dev; +Cc: Maxime Coquelin

This patch implements support for the new SET_VIRTIO_STATUS
vhost-user request.

The main use for this new request is for the backend to know
when the driver sets the DRIVER_OK status bit. Starting Virtio
1.0, we know that once the the bit is set, no more queues will
be initialized.
When it happens, this patch removes all queues starting from
the first uninitialized one, so that the port starts even if
the guest driver does not use all the queues provided by QEMU.
This is for example the case with Windows driver, which only
initializes as much queue pairs as vCPUs.

The second use for this request is when the status changes to
reset or failed state, the vhost port is stopped and virtqueues
cleaned and freed.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
---
 lib/librte_vhost/vhost.h      |  1 +
 lib/librte_vhost/vhost_user.c | 98 +++++++++++++++++++++++++++++++++++++++++++
 lib/librte_vhost/vhost_user.h |  5 ++-
 3 files changed, 103 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 481700489..4ebf84bec 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -241,6 +241,7 @@ struct virtio_net {
 	struct guest_page       *guest_pages;
 
 	int			slave_req_fd;
+	uint8_t		virtio_status;
 } __rte_cache_aligned;
 
 
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index c256ebb06..63f501e8d 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -67,6 +67,7 @@ static const char *vhost_message_str[VHOST_USER_MAX] = {
 	[VHOST_USER_NET_SET_MTU]  = "VHOST_USER_NET_SET_MTU",
 	[VHOST_USER_SET_SLAVE_REQ_FD]  = "VHOST_USER_SET_SLAVE_REQ_FD",
 	[VHOST_USER_IOTLB_MSG]  = "VHOST_USER_IOTLB_MSG",
+	[VHOST_USER_SET_VIRTIO_STATUS]  = "VHOST_USER_SET_VIRTIO_STATUS",
 };
 
 static uint64_t
@@ -1244,6 +1245,100 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, struct VhostUserMsg *msg)
 	return 0;
 }
 
+static int
+vhost_user_set_virtio_status(struct virtio_net *dev, struct VhostUserMsg *msg)
+{
+	uint8_t old_status, new_status;
+	uint32_t i;
+
+	/* As per Virtio spec, the Virtio device status is 8 bits wide */
+	if (msg->payload.u64 != (uint8_t)msg->payload.u64) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+				"Invalid Virtio dev status value (%lx)\n",
+				msg->payload.u64);
+		return -1;
+	}
+
+	new_status = msg->payload.u64;
+	old_status = dev->virtio_status;
+
+	if (new_status == old_status)
+		return 0;
+
+	RTE_LOG(DEBUG, VHOST_CONFIG,
+			"New Virtio device status %x (was %x)\n",
+			new_status, old_status);
+
+	dev->virtio_status = new_status;
+
+	if (new_status == 0 || new_status & VIRTIO_CONFIG_S_FAILED) {
+		/*
+		 * The device moved to reset  or failed state,
+		 * stop processing the virtqueues
+		 */
+		if (dev->flags & VIRTIO_DEV_RUNNING) {
+			dev->flags &= ~VIRTIO_DEV_RUNNING;
+			dev->notify_ops->destroy_device(dev->vid);
+		}
+
+		while (dev->nr_vring > 0) {
+			struct vhost_virtqueue *vq;
+
+			vq = dev->virtqueue[--dev->nr_vring];
+			if (!vq)
+				continue;
+
+			dev->virtqueue[dev->nr_vring] = NULL;
+			cleanup_vq(dev, vq, 1);
+			free_vq(vq);
+		}
+
+		return 0;
+	}
+
+	if ((dev->features & (1ULL << VIRTIO_F_VERSION_1)) &&
+			(new_status & VIRTIO_CONFIG_S_DRIVER_OK) &&
+			!virtio_is_ready(dev)) {
+		/*
+		 * Since Virtio 1.0, we know that no more queues will be
+		 * setup after guest sets DRIVER_OK. So let's remove
+		 * uinitialized queues.
+		 */
+		RTE_LOG(INFO, VHOST_CONFIG,
+				"Driver is ready, but some queues aren't initialized\n");
+
+		/*
+		 * Find the first uninitialized queue.
+		 *
+		 * Note: Ideally the backend implementation should
+		 * support sparsed virtqueues, but as long as it is
+		 * not the case, let's remove all queues after the
+		 * first uninitialized one.
+		 */
+		for (i = 0; i < dev->nr_vring; i++) {
+			if (!vq_is_ready(dev->virtqueue[i]))
+				break;
+		}
+
+		while (dev->nr_vring > i) {
+			struct vhost_virtqueue *vq;
+
+			vq = dev->virtqueue[--dev->nr_vring];
+			if (!vq)
+				continue;
+
+			RTE_LOG(INFO, VHOST_CONFIG,
+					"Removing uninitialized queue %d\n",
+					dev->nr_vring);
+
+			dev->virtqueue[dev->nr_vring] = NULL;
+			cleanup_vq(dev, vq, 1);
+			free_vq(vq);
+		}
+	}
+	return 0;
+}
+
 /* return bytes# of read on success or negative val on failure. */
 static int
 read_vhost_message(int sockfd, struct VhostUserMsg *msg)
@@ -1550,6 +1645,9 @@ vhost_user_msg_handler(int vid, int fd)
 	case VHOST_USER_IOTLB_MSG:
 		ret = vhost_user_iotlb_msg(&dev, &msg);
 		break;
+	case VHOST_USER_SET_VIRTIO_STATUS:
+		ret = vhost_user_set_virtio_status(dev, &msg);
+		break;
 
 	default:
 		ret = -1;
diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 0fafbe6e0..b4e23e0e9 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -20,13 +20,15 @@
 #define VHOST_USER_PROTOCOL_F_REPLY_ACK	3
 #define VHOST_USER_PROTOCOL_F_NET_MTU 4
 #define VHOST_USER_PROTOCOL_F_SLAVE_REQ 5
+#define VHOST_USER_PROTOCOL_F_VIRTIO_STATUS 7
 
 #define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
 					 (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU) | \
-					 (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ))
+					 (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ) | \
+					 (1ULL << VHOST_USER_PROTOCOL_F_VIRTIO_STATUS))
 
 typedef enum VhostUserRequest {
 	VHOST_USER_NONE = 0,
@@ -52,6 +54,7 @@ typedef enum VhostUserRequest {
 	VHOST_USER_NET_SET_MTU = 20,
 	VHOST_USER_SET_SLAVE_REQ_FD = 21,
 	VHOST_USER_IOTLB_MSG = 22,
+	VHOST_USER_SET_VIRTIO_STATUS = 26,
 	VHOST_USER_MAX
 } VhostUserRequest;
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [dpdk-dev] [RFC v2 3/3] vhost_user: work around invalid rings addresses sent by QEMU
  2018-02-27 14:34 [dpdk-dev] [RFC v2 0/3] host: multiqueue improvements Maxime Coquelin
  2018-02-27 14:34 ` [dpdk-dev] [RFC v2 1/3] vhost: invalidate vring addresses in cleanup_vq() Maxime Coquelin
  2018-02-27 14:34 ` [dpdk-dev] [RFC v2 2/3] vhost: add SET_VIRTIO_STATUS support Maxime Coquelin
@ 2018-02-27 14:34 ` Maxime Coquelin
  2 siblings, 0 replies; 5+ messages in thread
From: Maxime Coquelin @ 2018-02-27 14:34 UTC (permalink / raw)
  To: jianfeng.tan, stefanha, tiwei.bie, jfreimann, dev; +Cc: Maxime Coquelin

When the guest driver driver does not initialize all the queues,
QEMU currently sends SET_VRING_ADDR request for these queues.
In this case all the desc, avail and used addresses have GPA 0,
so translating them likely succeed.

The problem is that even if the uninitialized queues remain
disabled, the host application may request to disable the
notifications using rte_vhost_enable_guest_notification().
Doing this results in writing 0 to the used ring flag field,
so resulting in writing 0 in the guest physical address 0.

This patch adds a check to ensure all the ring addresses are
different before their translation.

When VHOST_USER_F_PROTOCOL_VIRTIO_STATUS and VIRTIO_F_VERSION_1
have been negotiated, the uninitialized queues will be removed
when driver sets the DRIVER_OK status bit.
Otherwise, the port will never start to avoid any guest memory
corruption.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost_user.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 63f501e8d..afa7f7268 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -448,6 +448,19 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index)
 	if (vq->desc && vq->avail && vq->used)
 		return dev;
 
+	/*
+	 * QEMU currently sends SET_VRING_ADDR request even for queues
+	 * not initialized by the guest driver. In this case, all rings
+	 * addresses are identical (GPA 0).
+	 */
+	if (addr->desc_user_addr == addr->avail_user_addr &&
+			addr->desc_user_addr == addr->used_user_addr) {
+		RTE_LOG(INFO, VHOST_CONFIG,
+				"Invalid rings addresses for dev %d queue %d\n",
+				dev->vid, vq_index);
+		return dev;
+	}
+
 	vq->desc = (struct vring_desc *)(uintptr_t)ring_addr_to_vva(dev,
 			vq, addr->desc_user_addr, sizeof(struct vring_desc));
 	if (vq->desc == 0) {
-- 
2.14.3

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [dpdk-dev] [RFC v2 1/3] vhost: invalidate vring addresses in cleanup_vq()
  2018-02-27 14:34 ` [dpdk-dev] [RFC v2 1/3] vhost: invalidate vring addresses in cleanup_vq() Maxime Coquelin
@ 2018-02-28  6:35   ` Yang, Zhiyong
  0 siblings, 0 replies; 5+ messages in thread
From: Yang, Zhiyong @ 2018-02-28  6:35 UTC (permalink / raw)
  To: Maxime Coquelin, Tan, Jianfeng, stefanha, Bie, Tiwei, jfreimann, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Maxime Coquelin
> Sent: Tuesday, February 27, 2018 10:35 PM
> To: Tan, Jianfeng <jianfeng.tan@intel.com>; stefanha@redhat.com; Bie,
> Tiwei <tiwei.bie@intel.com>; jfreimann@redhat.com; dev@dpdk.org
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
> Subject: [dpdk-dev] [RFC v2 1/3] vhost: invalidate vring addresses in
> cleanup_vq()
> 
> When cleaning-up the virtqueue, we also need to invalidate its addresses to
> be sure outdated addresses won't be used later.
> 
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Reviewed-by: Jens Freimann <jfreimann@redhat.com>
> ---

Reviewed-by: Zhiyong Yang <zhiyong.yang@intel.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-02-28  6:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-27 14:34 [dpdk-dev] [RFC v2 0/3] host: multiqueue improvements Maxime Coquelin
2018-02-27 14:34 ` [dpdk-dev] [RFC v2 1/3] vhost: invalidate vring addresses in cleanup_vq() Maxime Coquelin
2018-02-28  6:35   ` Yang, Zhiyong
2018-02-27 14:34 ` [dpdk-dev] [RFC v2 2/3] vhost: add SET_VIRTIO_STATUS support Maxime Coquelin
2018-02-27 14:34 ` [dpdk-dev] [RFC v2 3/3] vhost_user: work around invalid rings addresses sent by QEMU Maxime Coquelin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).