* [dpdk-dev] [PATCH 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE
2017-11-24 17:48 [dpdk-dev] [PATCH 0/3] vhost: MQ live-migration fixes Maxime Coquelin
@ 2017-11-24 17:48 ` Maxime Coquelin
2017-11-24 17:48 ` [dpdk-dev] [PATCH 2/3] vhost: protect dirty logging against logging base change Maxime Coquelin
2017-11-24 17:48 ` [dpdk-dev] [PATCH 3/3] vhost: don't invalidate vrings if new addresses are identical Maxime Coquelin
2 siblings, 0 replies; 4+ messages in thread
From: Maxime Coquelin @ 2017-11-24 17:48 UTC (permalink / raw)
To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans
Cc: stable, jfreiman, Maxime Coquelin
If VHOST_USER_SET_LOG_BASE request's message size is invalid,
the fd is leaked.
Fix this by closing the fd systematically as long as it is valid.
Fixes: 53af5b1e0ace ("vhost: fix leak of file descriptor")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost_user.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index f4c7ce462..f06d9bb65 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -895,6 +895,7 @@ static int
vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
{
int fd = msg->fds[0];
+ int ret = 0;
uint64_t size, off;
void *addr;
@@ -907,7 +908,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
RTE_LOG(ERR, VHOST_CONFIG,
"invalid log base msg size: %"PRId32" != %d\n",
msg->size, (int)sizeof(VhostUserLog));
- return -1;
+ ret = -1;
+ goto out;
}
size = msg->payload.log.mmap_size;
@@ -921,10 +923,10 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
* fail when offset is not page size aligned.
*/
addr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
- close(fd);
if (addr == MAP_FAILED) {
RTE_LOG(ERR, VHOST_CONFIG, "mmap log base failed!\n");
- return -1;
+ ret = -1;
+ goto out;
}
/*
@@ -938,7 +940,10 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
dev->log_base = dev->log_addr + off;
dev->log_size = size;
- return 0;
+out:
+ close(fd);
+
+ return ret;
}
/*
--
2.14.3
^ permalink raw reply [flat|nested] 4+ messages in thread
* [dpdk-dev] [PATCH 2/3] vhost: protect dirty logging against logging base change
2017-11-24 17:48 [dpdk-dev] [PATCH 0/3] vhost: MQ live-migration fixes Maxime Coquelin
2017-11-24 17:48 ` [dpdk-dev] [PATCH 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE Maxime Coquelin
@ 2017-11-24 17:48 ` Maxime Coquelin
2017-11-24 17:48 ` [dpdk-dev] [PATCH 3/3] vhost: don't invalidate vrings if new addresses are identical Maxime Coquelin
2 siblings, 0 replies; 4+ messages in thread
From: Maxime Coquelin @ 2017-11-24 17:48 UTC (permalink / raw)
To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans
Cc: stable, jfreiman, Maxime Coquelin
When performing live-migration with multiple queue pairs,
VHOST_USER_SET_LOG_BASE request is sent multiple times.
If packets are being processed by the PMD threads, it is
possible that they are setting bits in the dirty log map while
its region is being unmapped by the vhost-user protocol thread.
It results in the following crash:
Thread 3 "lcore-slave-2" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f71ca495700 (LWP 32451)]
0x00000000004bfc8a in vhost_set_bit (addr=0x7f71cbe18432 <error: Cannot access memory at address 0x7f71cbe18432>, nr=1) at /home/max/projects/src/mainline/dpdk/lib/librte_vhost/vhost.h:267
267 __sync_fetch_and_or_8(addr, (1U << nr));
We can see the vhost-user protocol thread just did the unmap of the
dirty log region when it happens.
This patch prevents this by introducing a RW lock to protect
the log base.
Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost.c | 2 ++
lib/librte_vhost/vhost.h | 10 ++++++++--
lib/librte_vhost/vhost_user.c | 4 ++++
3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 4f8b73a09..5a7699da0 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -311,6 +311,8 @@ vhost_new_device(void)
return -1;
}
+ rte_rwlock_init(&dev->log_lock);
+
vhost_devices[i] = dev;
dev->vid = i;
dev->slave_req_fd = -1;
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 1cc81c17c..0f76d6495 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -243,6 +243,7 @@ struct virtio_net {
uint64_t log_size;
uint64_t log_base;
uint64_t log_addr;
+ rte_rwlock_t log_lock;
struct ether_addr mac;
uint16_t mtu;
@@ -278,12 +279,14 @@ vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
{
uint64_t page;
+ rte_rwlock_read_lock(&dev->log_lock);
+
if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
!dev->log_base || !len))
- return;
+ goto unlock;
if (unlikely(dev->log_size <= ((addr + len - 1) / VHOST_LOG_PAGE / 8)))
- return;
+ goto unlock;
/* To make sure guest memory updates are committed before logging */
rte_smp_wmb();
@@ -293,6 +296,9 @@ vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
page += 1;
}
+
+unlock:
+ rte_rwlock_read_unlock(&dev->log_lock);
}
static __rte_always_inline void
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index f06d9bb65..4b03dbbca 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -929,6 +929,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
goto out;
}
+ rte_rwlock_write_lock(&dev->log_lock);
+
/*
* Free previously mapped log memory on occasionally
* multiple VHOST_USER_SET_LOG_BASE.
@@ -940,6 +942,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
dev->log_base = dev->log_addr + off;
dev->log_size = size;
+ rte_rwlock_write_unlock(&dev->log_lock);
+
out:
close(fd);
--
2.14.3
^ permalink raw reply [flat|nested] 4+ messages in thread
* [dpdk-dev] [PATCH 3/3] vhost: don't invalidate vrings if new addresses are identical
2017-11-24 17:48 [dpdk-dev] [PATCH 0/3] vhost: MQ live-migration fixes Maxime Coquelin
2017-11-24 17:48 ` [dpdk-dev] [PATCH 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE Maxime Coquelin
2017-11-24 17:48 ` [dpdk-dev] [PATCH 2/3] vhost: protect dirty logging against logging base change Maxime Coquelin
@ 2017-11-24 17:48 ` Maxime Coquelin
2 siblings, 0 replies; 4+ messages in thread
From: Maxime Coquelin @ 2017-11-24 17:48 UTC (permalink / raw)
To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans
Cc: stable, jfreiman, Maxime Coquelin
In VHOST_USER_SET_VRING_ADDR handling, don't invalidate the vring
if it has already been mapped and new addresses are identical.
When initiating live-migration, VHOST_USER_SET_VRING_ADDR is sent
again by QEMU, but the queues are enabled, so invalidating them
can result in NULL pointer de-referencing. In this case, it is
not needed to perform the invalidation, as the new addresses provided
by QEMU are indentical to the initial ones.
Fixes: eefac9536a90 ("vhost: postpone device creation until rings are mapped")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost_user.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 4b03dbbca..29a431687 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -436,6 +436,14 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index)
return dev;
}
+static int
+is_same_vring_addrs(struct vhost_vring_addr *a1, struct vhost_vring_addr *a2)
+{
+ return ((a1->desc_user_addr == a2->desc_user_addr) &&
+ (a1->used_user_addr == a2->used_user_addr) &&
+ (a1->avail_user_addr == a2->avail_user_addr));
+}
+
/*
* The virtio device sends us the desc, used and avail ring addresses.
* This function then converts these to our address space.
@@ -453,6 +461,13 @@ vhost_user_set_vring_addr(struct virtio_net **pdev, VhostUserMsg *msg)
/* addr->index refers to the queue index. The txq 1, rxq is 0. */
vq = dev->virtqueue[msg->payload.addr.index];
+ /*
+ * If it is trying to set the same rings addresses, don't invalidate as
+ * PMD threads might be using them.
+ */
+ if (is_same_vring_addrs(addr, &vq->ring_addrs))
+ return 0;
+
/*
* Rings addresses should not be interpreted as long as the ring is not
* started and enabled
--
2.14.3
^ permalink raw reply [flat|nested] 4+ messages in thread