DPDK patches and discussions
* [dpdk-dev] [PATCH v2 0/3] vhost: MQ live-migration fixes
@ 2017-11-24 18:08 Maxime Coquelin
  2017-11-24 18:08 ` [dpdk-dev] [PATCH v2 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE Maxime Coquelin
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Maxime Coquelin @ 2017-11-24 18:08 UTC (permalink / raw)
  To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans
  Cc: stable, jfreiman, Maxime Coquelin

Sorry, I posted the wrong version. Only patch 2 changes in v2:
the log_lock read-lock is now taken after the VHOST_F_LOG_ALL
feature check, so that it does not degrade performance when
live-migration is not in progress.

This three-patch series fixes issues encountered when doing
live-migration with multiple queue pairs.

Patch 1 is theoretical and unlikely to be reproduced in real use-cases,
so it may be safe not to pick it in stable trees.

The issue fixed by patch 2 reproduces quite often when lots of packets
are being processed. The easiest way to reproduce it is to run DPDK in
the guest and perform IO loopback with testpmd. This patch targets both
the v16.11 & v17.11 stable trees, and will require a rework for v16.11
as some dirty logging functions moved from virtio-net.c to vhost.h. I'm
not sure of the process here, but I can provide the v16.11 backport if
needed.

Patch 3 fixes a regression introduced in v17.11. For a reason I have
yet to understand, QEMU sends VHOST_USER_SET_VRING_ADDR requests
when live-migration is initiated. The problem is that the vhost-user
protocol thread has no way to know whether the PMD threads are
accessing the rings. As the new addresses sent by QEMU are the same
as those it sent initially, this patch just ignores them.

Regards,
Maxime

Maxime Coquelin (3):
  vhost: fix fd leak in VHOST_USER_SET_LOG_BASE
  vhost: protect dirty logging against logging base change
  vhost: don't invalidate vrings if new addresses are identical

 lib/librte_vhost/vhost.c      |  2 ++
 lib/librte_vhost/vhost.h      | 14 +++++++++++---
 lib/librte_vhost/vhost_user.c | 32 ++++++++++++++++++++++++++++----
 3 files changed, 41 insertions(+), 7 deletions(-)

-- 
2.14.3


* [dpdk-dev] [PATCH v2 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE
  2017-11-24 18:08 [dpdk-dev] [PATCH v2 0/3] vhost: MQ live-migration fixes Maxime Coquelin
@ 2017-11-24 18:08 ` Maxime Coquelin
  2017-11-24 18:08 ` [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change Maxime Coquelin
  2017-11-24 18:08 ` [dpdk-dev] [PATCH v2 3/3] vhost: don't invalidate vrings if new addresses are identical Maxime Coquelin
  2 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2017-11-24 18:08 UTC (permalink / raw)
  To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans
  Cc: stable, jfreiman, Maxime Coquelin

If VHOST_USER_SET_LOG_BASE request's message size is invalid,
the fd is leaked.

Fix this by closing the fd systematically as long as it is valid.

Fixes: 53af5b1e0ace ("vhost: fix leak of file descriptor")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost_user.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index f4c7ce462..f06d9bb65 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -895,6 +895,7 @@ static int
 vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 {
 	int fd = msg->fds[0];
+	int ret = 0;
 	uint64_t size, off;
 	void *addr;
 
@@ -907,7 +908,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"invalid log base msg size: %"PRId32" != %d\n",
 			msg->size, (int)sizeof(VhostUserLog));
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	size = msg->payload.log.mmap_size;
@@ -921,10 +923,10 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 	 * fail when offset is not page size aligned.
 	 */
 	addr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
-	close(fd);
 	if (addr == MAP_FAILED) {
 		RTE_LOG(ERR, VHOST_CONFIG, "mmap log base failed!\n");
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/*
@@ -938,7 +940,10 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 	dev->log_base = dev->log_addr + off;
 	dev->log_size = size;
 
-	return 0;
+out:
+	close(fd);
+
+	return ret;
 }
 
 /*
-- 
2.14.3


* [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change
  2017-11-24 18:08 [dpdk-dev] [PATCH v2 0/3] vhost: MQ live-migration fixes Maxime Coquelin
  2017-11-24 18:08 ` [dpdk-dev] [PATCH v2 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE Maxime Coquelin
@ 2017-11-24 18:08 ` Maxime Coquelin
  2017-11-27  8:16   ` Victor Kaplansky
  2017-11-28 10:06   ` Maxime Coquelin
  2017-11-24 18:08 ` [dpdk-dev] [PATCH v2 3/3] vhost: don't invalidate vrings if new addresses are identical Maxime Coquelin
  2 siblings, 2 replies; 12+ messages in thread
From: Maxime Coquelin @ 2017-11-24 18:08 UTC (permalink / raw)
  To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans
  Cc: stable, jfreiman, Maxime Coquelin

When performing live-migration with multiple queue pairs,
VHOST_USER_SET_LOG_BASE request is sent multiple times.

If packets are being processed by the PMD threads, it is
possible that they are setting bits in the dirty log map while
its region is being unmapped by the vhost-user protocol thread.
It results in the following crash:
Thread 3 "lcore-slave-2" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f71ca495700 (LWP 32451)]
0x00000000004bfc8a in vhost_set_bit (addr=0x7f71cbe18432 <error: Cannot access memory at address 0x7f71cbe18432>, nr=1) at /home/max/projects/src/mainline/dpdk/lib/librte_vhost/vhost.h:267
267        __sync_fetch_and_or_8(addr, (1U << nr));

We can see the vhost-user protocol thread just did the unmap of the
dirty log region when it happens.

This patch prevents this by introducing a RW lock to protect
the log base.

Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost.c      |  2 ++
 lib/librte_vhost/vhost.h      | 14 +++++++++++---
 lib/librte_vhost/vhost_user.c |  4 ++++
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 4f8b73a09..5a7699da0 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -311,6 +311,8 @@ vhost_new_device(void)
 		return -1;
 	}
 
+	rte_rwlock_init(&dev->log_lock);
+
 	vhost_devices[i] = dev;
 	dev->vid = i;
 	dev->slave_req_fd = -1;
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 1cc81c17c..2f36a034e 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -243,6 +243,7 @@ struct virtio_net {
 	uint64_t		log_size;
 	uint64_t		log_base;
 	uint64_t		log_addr;
+	rte_rwlock_t	log_lock;
 	struct ether_addr	mac;
 	uint16_t		mtu;
 
@@ -278,12 +279,16 @@ vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
 {
 	uint64_t page;
 
+
 	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
-		   !dev->log_base || !len))
+		   !len))
 		return;
 
-	if (unlikely(dev->log_size <= ((addr + len - 1) / VHOST_LOG_PAGE / 8)))
-		return;
+	rte_rwlock_read_lock(&dev->log_lock);
+
+	if (unlikely((!dev->log_base) ||
+				(dev->log_size <= ((addr + len - 1) / VHOST_LOG_PAGE / 8))))
+		goto unlock;
 
 	/* To make sure guest memory updates are committed before logging */
 	rte_smp_wmb();
@@ -293,6 +298,9 @@ vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
 		vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
 		page += 1;
 	}
+
+unlock:
+	rte_rwlock_read_unlock(&dev->log_lock);
 }
 
 static __rte_always_inline void
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index f06d9bb65..4b03dbbca 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -929,6 +929,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 		goto out;
 	}
 
+	rte_rwlock_write_lock(&dev->log_lock);
+
 	/*
 	 * Free previously mapped log memory on occasionally
 	 * multiple VHOST_USER_SET_LOG_BASE.
@@ -940,6 +942,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 	dev->log_base = dev->log_addr + off;
 	dev->log_size = size;
 
+	rte_rwlock_write_unlock(&dev->log_lock);
+
 out:
 	close(fd);
 
-- 
2.14.3


* [dpdk-dev] [PATCH v2 3/3] vhost: don't invalidate vrings if new addresses are identical
  2017-11-24 18:08 [dpdk-dev] [PATCH v2 0/3] vhost: MQ live-migration fixes Maxime Coquelin
  2017-11-24 18:08 ` [dpdk-dev] [PATCH v2 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE Maxime Coquelin
  2017-11-24 18:08 ` [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change Maxime Coquelin
@ 2017-11-24 18:08 ` Maxime Coquelin
  2 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2017-11-24 18:08 UTC (permalink / raw)
  To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans
  Cc: stable, jfreiman, Maxime Coquelin

In VHOST_USER_SET_VRING_ADDR handling, don't invalidate the vring
if it has already been mapped and new addresses are identical.

When initiating live-migration, VHOST_USER_SET_VRING_ADDR is sent
again by QEMU, but the queues are enabled, so invalidating them
can result in NULL pointer dereferencing. In this case, the
invalidation is not needed, as the new addresses provided by QEMU
are identical to the initial ones.

Fixes: eefac9536a90 ("vhost: postpone device creation until rings are mapped")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost_user.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 4b03dbbca..29a431687 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -436,6 +436,14 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index)
 	return dev;
 }
 
+static int
+is_same_vring_addrs(struct vhost_vring_addr *a1, struct vhost_vring_addr *a2)
+{
+	return ((a1->desc_user_addr == a2->desc_user_addr) &&
+			(a1->used_user_addr == a2->used_user_addr) &&
+			(a1->avail_user_addr == a2->avail_user_addr));
+}
+
 /*
  * The virtio device sends us the desc, used and avail ring addresses.
  * This function then converts these to our address space.
@@ -453,6 +461,13 @@ vhost_user_set_vring_addr(struct virtio_net **pdev, VhostUserMsg *msg)
 	/* addr->index refers to the queue index. The txq 1, rxq is 0. */
 	vq = dev->virtqueue[msg->payload.addr.index];
 
+	/*
+	 * If it is trying to set the same rings addresses, don't invalidate as
+	 * PMD threads might be using them.
+	 */
+	if (is_same_vring_addrs(addr, &vq->ring_addrs))
+		return 0;
+
 	/*
 	 * Rings addresses should not be interpreted as long as the ring is not
 	 * started and enabled
-- 
2.14.3


* Re: [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change
  2017-11-24 18:08 ` [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change Maxime Coquelin
@ 2017-11-27  8:16   ` Victor Kaplansky
  2017-11-27  8:27     ` Maxime Coquelin
  2017-11-28 10:06   ` Maxime Coquelin
  1 sibling, 1 reply; 12+ messages in thread
From: Victor Kaplansky @ 2017-11-27  8:16 UTC (permalink / raw)
  To: Maxime Coquelin; +Cc: dev, yliu, tiwei bie, jianfeng tan, stable, jfreiman

Hi,

While I agree that taking a full-fledged lock with rte_rwlock_read_lock() solves the race condition,
I'm afraid that it would be too expensive when logging is off, since it introduces
lock acquisition and release into the main flow of ring updates.

It is OK for now, as it fixes the bug, but we need to perform more careful performance measurements
and see whether the performance degradation is acceptable.

As an alternative, we may consider using lighter-weight busy looping.

Also, let's use this series to change __sync_fetch_and_or_8 to __sync_fetch_and_or,
as it may improve the performance slightly.
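
To illustrate the suggested change, here is a minimal sketch of setting a
dirty-log bit with the type-generic GCC builtin. With __sync_fetch_and_or the
operand width is inferred from the pointer type, whereas the numeric suffix of
the sized builtins names the operand size in bytes. The helper names
log_set_bit() and log_page() are hypothetical and are not the functions in
vhost.h; whether this is any faster still has to be benchmarked.

```c
#include <stdint.h>

/*
 * Hypothetical helper: atomically set bit 'nr' (0..7) in one byte of the
 * dirty log. The generic builtin operates on exactly one byte here because
 * 'addr' points to a uint8_t.
 */
static inline void
log_set_bit(uint8_t *addr, unsigned int nr)
{
	__sync_fetch_and_or(addr, (uint8_t)(1U << nr));
}

/* Hypothetical helper: mark guest page number 'page' dirty in the bitmap. */
static inline void
log_page(uint8_t *log_base, uint64_t page)
{
	log_set_bit(&log_base[page / 8], (unsigned int)(page % 8));
}
```

Calling log_page(map, 9) on a zeroed map sets bit 1 of map[1] and touches no
other byte.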

-- 
Victor 

----- Original Message -----
> From: "Maxime Coquelin" <maxime.coquelin@redhat.com>
> To: dev@dpdk.org, yliu@fridaylinux.org, "tiwei bie" <tiwei.bie@intel.com>, "jianfeng tan" <jianfeng.tan@intel.com>,
> vkaplans@redhat.com
> Cc: stable@dpdk.org, jfreiman@redhat.com, "Maxime Coquelin" <maxime.coquelin@redhat.com>
> Sent: Friday, November 24, 2017 8:08:25 PM
> Subject: [PATCH v2 2/3] vhost: protect dirty logging against logging base change
> 
> When performing live-migration with multiple queue pairs,
> VHOST_USER_SET_LOG_BASE request is sent multiple times.
> 
> If packets are being processed by the PMD threads, it is
> possible that they are setting bits in the dirty log map while
> its region is being unmapped by the vhost-user protocol thread.
> It results in the following crash:
> Thread 3 "lcore-slave-2" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7f71ca495700 (LWP 32451)]
> 0x00000000004bfc8a in vhost_set_bit (addr=0x7f71cbe18432 <error: Cannot
> access memory at address 0x7f71cbe18432>, nr=1) at
> /home/max/projects/src/mainline/dpdk/lib/librte_vhost/vhost.h:267
> 267        __sync_fetch_and_or_8(addr, (1U << nr));
> 
> We can see the vhost-user protocol thread just did the unmap of the
> dirty log region when it happens.
> 
> This patch prevents this by introducing a RW lock to protect
> the log base.
> 
> Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>  lib/librte_vhost/vhost.c      |  2 ++
>  lib/librte_vhost/vhost.h      | 14 +++++++++++---
>  lib/librte_vhost/vhost_user.c |  4 ++++
>  3 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
> index 4f8b73a09..5a7699da0 100644
> --- a/lib/librte_vhost/vhost.c
> +++ b/lib/librte_vhost/vhost.c
> @@ -311,6 +311,8 @@ vhost_new_device(void)
>  		return -1;
>  	}
>  
> +	rte_rwlock_init(&dev->log_lock);
> +
>  	vhost_devices[i] = dev;
>  	dev->vid = i;
>  	dev->slave_req_fd = -1;
> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
> index 1cc81c17c..2f36a034e 100644
> --- a/lib/librte_vhost/vhost.h
> +++ b/lib/librte_vhost/vhost.h
> @@ -243,6 +243,7 @@ struct virtio_net {
>  	uint64_t		log_size;
>  	uint64_t		log_base;
>  	uint64_t		log_addr;
> +	rte_rwlock_t	log_lock;
>  	struct ether_addr	mac;
>  	uint16_t		mtu;
>  
> @@ -278,12 +279,16 @@ vhost_log_write(struct virtio_net *dev, uint64_t addr,
> uint64_t len)
>  {
>  	uint64_t page;
>  
> +
>  	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
> -		   !dev->log_base || !len))
> +		   !len))
>  		return;
>  
> -	if (unlikely(dev->log_size <= ((addr + len - 1) / VHOST_LOG_PAGE / 8)))
> -		return;
> +	rte_rwlock_read_lock(&dev->log_lock);
> +
> +	if (unlikely((!dev->log_base) ||
> +				(dev->log_size <= ((addr + len - 1) / VHOST_LOG_PAGE / 8))))
> +		goto unlock;
>  
>  	/* To make sure guest memory updates are committed before logging */
>  	rte_smp_wmb();
> @@ -293,6 +298,9 @@ vhost_log_write(struct virtio_net *dev, uint64_t addr,
> uint64_t len)
>  		vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
>  		page += 1;
>  	}
> +
> +unlock:
> +	rte_rwlock_read_unlock(&dev->log_lock);
>  }
>  
>  static __rte_always_inline void
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index f06d9bb65..4b03dbbca 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -929,6 +929,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct
> VhostUserMsg *msg)
>  		goto out;
>  	}
>  
> +	rte_rwlock_write_lock(&dev->log_lock);
> +
>  	/*
>  	 * Free previously mapped log memory on occasionally
>  	 * multiple VHOST_USER_SET_LOG_BASE.
> @@ -940,6 +942,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct
> VhostUserMsg *msg)
>  	dev->log_base = dev->log_addr + off;
>  	dev->log_size = size;
>  
> +	rte_rwlock_write_unlock(&dev->log_lock);
> +
>  out:
>  	close(fd);
>  
> --
> 2.14.3
> 
> 


* Re: [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change
  2017-11-27  8:16   ` Victor Kaplansky
@ 2017-11-27  8:27     ` Maxime Coquelin
  2017-11-27  8:42       ` Victor Kaplansky
  0 siblings, 1 reply; 12+ messages in thread
From: Maxime Coquelin @ 2017-11-27  8:27 UTC (permalink / raw)
  To: Victor Kaplansky; +Cc: dev, yliu, tiwei bie, jianfeng tan, stable, jfreiman

Hi Victor,

On 11/27/2017 09:16 AM, Victor Kaplansky wrote:
> Hi,
> 
> While I agree that taking full fledged lock by rte_rwlock_read_lock() solves the race condition,
> I'm afraid that it would be too expensive in case when logging is off, since it introduces
> acquiring and releasing lock into the main flow of ring updates.

Actually my v2 fixes the performance penalty when logging is off. The
lock is now taken after the logging feature check.

But still, I agree that the logging-on case will suffer from a
performance penalty.

> It is OK for now, as it fixes the bug, but we need to perform more careful performance measurements,
> and see whether the performance degradation is not too prohibitive.
> 
> As alternative, we may consider using more light weighted busy looping.

I think it will end up being almost the same, as both threads will need
to busy loop: the PMD thread to be sure the protocol thread isn't
unmapping the region before doing the logging, and the protocol thread
to be sure the PMD thread is not doing logging before handling the set
log base.
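
As a rough sketch of what such a two-sided busy loop could look like, assuming
one flag pair per PMD thread: the PMD thread announces that it is logging and
then checks for a pending remap, while the protocol thread announces the remap
and then waits for the PMD thread to leave. struct log_gate and all function
names are hypothetical and are not part of librte_vhost.

```c
#include <stdint.h>

/* Hypothetical handshake state for one PMD thread; not a DPDK structure. */
struct log_gate {
	int pmd_busy;	/* set while the PMD thread writes the dirty log */
	int updating;	/* set while the protocol thread remaps the log */
};

/* PMD side: returns 1 if logging may proceed, 0 if a remap is pending. */
static int
pmd_try_enter(struct log_gate *g)
{
	__atomic_store_n(&g->pmd_busy, 1, __ATOMIC_SEQ_CST);
	if (__atomic_load_n(&g->updating, __ATOMIC_SEQ_CST)) {
		/* Back off so the protocol thread is not kept waiting. */
		__atomic_store_n(&g->pmd_busy, 0, __ATOMIC_SEQ_CST);
		return 0;
	}
	return 1;
}

/* PMD side: busy loop until no remap is in progress, then enter. */
static void
pmd_enter(struct log_gate *g)
{
	while (!pmd_try_enter(g))
		while (__atomic_load_n(&g->updating, __ATOMIC_ACQUIRE))
			; /* spin until the remap has finished */
}

/* PMD side: logging done, the log region may be unmapped again. */
static void
pmd_leave(struct log_gate *g)
{
	__atomic_store_n(&g->pmd_busy, 0, __ATOMIC_RELEASE);
}

/* Protocol side: block new logging, then wait for ongoing logging to end. */
static void
proto_begin_update(struct log_gate *g)
{
	__atomic_store_n(&g->updating, 1, __ATOMIC_SEQ_CST);
	while (__atomic_load_n(&g->pmd_busy, __ATOMIC_SEQ_CST))
		; /* spin until the PMD thread has left */
}

/* Protocol side: the new log base is in place, logging may resume. */
static void
proto_end_update(struct log_gate *g)
{
	__atomic_store_n(&g->updating, 0, __ATOMIC_RELEASE);
}
```

The PMD thread still pays for a store and a load with full ordering on every
logging call, which is why the cost may end up close to that of the read lock.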

Maybe you have something else in mind?

> Also, lets fix by this series the __sync_fetch_and_or_8 -> __sync_fetch_and_or,
> as it may improve the performance slightly.

Sure, this can be done, but it would need to be benchmarked first.

Regards,
Maxime


* Re: [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change
  2017-11-27  8:27     ` Maxime Coquelin
@ 2017-11-27  8:42       ` Victor Kaplansky
  2017-11-27  9:00         ` Maxime Coquelin
  0 siblings, 1 reply; 12+ messages in thread
From: Victor Kaplansky @ 2017-11-27  8:42 UTC (permalink / raw)
  To: Maxime Coquelin; +Cc: dev, yliu, tiwei bie, jianfeng tan, stable, jfreiman



----- Original Message -----
> From: "Maxime Coquelin" <maxime.coquelin@redhat.com>
> To: "Victor Kaplansky" <vkaplans@redhat.com>
> Cc: dev@dpdk.org, yliu@fridaylinux.org, "tiwei bie" <tiwei.bie@intel.com>, "jianfeng tan" <jianfeng.tan@intel.com>,
> stable@dpdk.org, jfreiman@redhat.com
> Sent: Monday, November 27, 2017 10:27:22 AM
> Subject: Re: [PATCH v2 2/3] vhost: protect dirty logging against logging base change
> 
> Hi Victor,
> 
> On 11/27/2017 09:16 AM, Victor Kaplansky wrote:
> > Hi,
> > 
> > While I agree that taking full fledged lock by rte_rwlock_read_lock()
> > solves the race condition,
> > I'm afraid that it would be too expensive in case when logging is off,
> > since it introduces
> > acquiring and releasing lock into the main flow of ring updates.
> 
> Actually my v2 fixes the performance penalty when logging is off. The
> lock is now taken after the logging feature check.
> 
> But still, I agree logging on case will suffer from a performance
> penalty.

Yes, checking the logging feature is better than nothing, but VHOST_F_LOG_ALL
only indicates whether logging is supported by the device, not whether
logging is actually active. Thus, any guest will hit the performance
degradation even when no migration is in progress.


> 
> > It is OK for now, as it fixes the bug, but we need to perform more careful
> > performance measurements,
> > and see whether the performance degradation is not too prohibitive.
> > 
> > As alternative, we may consider using more light weighted busy looping.
> 
> I think it will end up almost being the same, as both threads will need
> to busy loop. PMD thread to be sure the protocol thread isn't being
> unmapping the region before doing the logging, and protocol thread to be
> sure the PMD thread is not doing logging before handling the set log
> base.
> 

I'm not fully aware of how rte_rwlock_read_lock() is implemented, but
theoretically busy looping should be much cheaper in cases where
one side takes the lock very rarely.

> Maybe you have something else in mind?
> 
> > Also, lets fix by this series the __sync_fetch_and_or_8 ->
> > __sync_fetch_and_or,
> > as it may improve the performance slightly.
> 
> Sure, this can be done, but it would need to be benchmarked first.

Agree.
> 
> Regards,
> Maxime
> 


* Re: [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change
  2017-11-27  8:42       ` Victor Kaplansky
@ 2017-11-27  9:00         ` Maxime Coquelin
  0 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2017-11-27  9:00 UTC (permalink / raw)
  To: Victor Kaplansky; +Cc: dev, yliu, tiwei bie, jianfeng tan, stable, jfreiman



On 11/27/2017 09:42 AM, Victor Kaplansky wrote:
> 
> 
> ----- Original Message -----
>> From: "Maxime Coquelin" <maxime.coquelin@redhat.com>
>> To: "Victor Kaplansky" <vkaplans@redhat.com>
>> Cc: dev@dpdk.org, yliu@fridaylinux.org, "tiwei bie" <tiwei.bie@intel.com>, "jianfeng tan" <jianfeng.tan@intel.com>,
>> stable@dpdk.org, jfreiman@redhat.com
>> Sent: Monday, November 27, 2017 10:27:22 AM
>> Subject: Re: [PATCH v2 2/3] vhost: protect dirty logging against logging base change
>>
>> Hi Victor,
>>
>> On 11/27/2017 09:16 AM, Victor Kaplansky wrote:
>>> Hi,
>>>
>>> While I agree that taking full fledged lock by rte_rwlock_read_lock()
>>> solves the race condition,
>>> I'm afraid that it would be too expensive in case when logging is off,
>>> since it introduces
>>> acquiring and releasing lock into the main flow of ring updates.
>>
>> Actually my v2 fixes the performance penalty when logging is off. The
>> lock is now taken after the logging feature check.
>>
>> But still, I agree logging on case will suffer from a performance
>> penalty.
> 
> Yes, checking of logging feature is better than nothing, but VHOST_F_LOG_ALL
> marks only whether logging is supported by the device and not if
> logging is in the action. Thus, any guest will hit the performance
> degradation even not during migration.

My understanding is that VHOST_USER_SET_FEATURES is sent again with
VHOST_F_LOG_ALL set on migration start and with VHOST_F_LOG_ALL cleared
on migration stop.
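
If that understanding holds, the fast-path check only evaluates to true
between migration start and migration stop. A minimal sketch, assuming a
stripped-down device structure (struct feat_dev and both function names are
hypothetical; VHOST_F_LOG_ALL is bit 26 as defined in linux/vhost.h):

```c
#include <stdint.h>

#define VHOST_F_LOG_ALL 26	/* feature bit number from linux/vhost.h */

/* Hypothetical, reduced stand-in for struct virtio_net. */
struct feat_dev {
	uint64_t features;
};

/* Models a VHOST_USER_SET_FEATURES request overwriting the feature set. */
static void
set_features(struct feat_dev *dev, uint64_t features)
{
	dev->features = features;
}

/* Same test as the first check in vhost_log_write(). */
static int
dirty_logging_active(const struct feat_dev *dev)
{
	return (dev->features & (1ULL << VHOST_F_LOG_ALL)) != 0;
}
```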

> 
> 
>>
>>> It is OK for now, as it fixes the bug, but we need to perform more careful
>>> performance measurements,
>>> and see whether the performance degradation is not too prohibitive.
>>>
>>> As alternative, we may consider using more light weighted busy looping.
>>
>> I think it will end up almost being the same, as both threads will need
>> to busy loop. PMD thread to be sure the protocol thread isn't being
>> unmapping the region before doing the logging, and protocol thread to be
>> sure the PMD thread is not doing logging before handling the set log
>> base.
>>
> 
> I'm not fully aware how rte_rwlock_read_lock() is implemented, but
> theoretically busy looping should be much cheaper in cases when
> taking lock by one side is very rare.

We could improve it by taking the lock only once per burst instead of
once per logged page, as we don't mind if the protocol thread waits a
bit longer when it wants to remap the area.
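
A minimal sketch of the per-burst idea, under stated assumptions: the
reader-count spinlock below is a simplified stand-in for rte_rwlock_t (it is
not the DPDK implementation), LOG_PAGE is an assumed page size of 4096, and
struct log_dev, struct dirty_range and all function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define LOG_PAGE 4096ULL	/* assumed dirty-log page size */

/* Stand-in lock: cnt > 0 counts readers, cnt == -1 means a writer holds it. */
struct mini_rwlock { int32_t cnt; };

static void mini_read_lock(struct mini_rwlock *l)
{
	int32_t x;

	for (;;) {
		x = __atomic_load_n(&l->cnt, __ATOMIC_RELAXED);
		if (x < 0)
			continue; /* writer active, keep spinning */
		if (__atomic_compare_exchange_n(&l->cnt, &x, x + 1, 0,
				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
			return;
	}
}

static void mini_read_unlock(struct mini_rwlock *l)
{
	__atomic_fetch_sub(&l->cnt, 1, __ATOMIC_RELEASE);
}

static void mini_write_lock(struct mini_rwlock *l)
{
	int32_t x;

	do {
		x = 0; /* only succeed when there is no reader and no writer */
	} while (!__atomic_compare_exchange_n(&l->cnt, &x, -1, 0,
			__ATOMIC_ACQUIRE, __ATOMIC_RELAXED));
}

static void mini_write_unlock(struct mini_rwlock *l)
{
	__atomic_store_n(&l->cnt, 0, __ATOMIC_RELEASE);
}

struct log_dev {			/* hypothetical, reduced device */
	struct mini_rwlock log_lock;
	uint8_t *log_base;
	uint64_t log_size;		/* bitmap size in bytes */
};

struct dirty_range { uint64_t addr, len; };

/* Protocol thread: swap the log base under the write lock. */
static void set_log_base(struct log_dev *dev, uint8_t *base, uint64_t size)
{
	mini_write_lock(&dev->log_lock);
	dev->log_base = base;
	dev->log_size = size;
	mini_write_unlock(&dev->log_lock);
}

/* PMD thread: one lock/unlock pair covers every range of the burst. */
static void log_burst(struct log_dev *dev, const struct dirty_range *r, size_t n)
{
	uint64_t page, last;
	size_t i;

	mini_read_lock(&dev->log_lock);
	for (i = 0; dev->log_base != NULL && i < n; i++) {
		if (r[i].len == 0)
			continue;
		last = (r[i].addr + r[i].len - 1) / LOG_PAGE;
		if (dev->log_size <= last / 8)
			continue; /* range falls outside the bitmap */
		for (page = r[i].addr / LOG_PAGE; page <= last; page++)
			__sync_fetch_and_or(&dev->log_base[page / 8],
					(uint8_t)(1U << (page % 8)));
	}
	mini_read_unlock(&dev->log_lock);
}
```

The trade-off is that set_log_base() may now spin for the duration of a whole
burst instead of a single logging call.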

>> Maybe you have something else in mind?
>>
>>> Also, lets fix by this series the __sync_fetch_and_or_8 ->
>>> __sync_fetch_and_or,
>>> as it may improve the performance slightly.
>>
>> Sure, this can be done, but it would need to be benchmarked first.
> 
> Agree.
>>
>> Regards,
>> Maxime
>>


* Re: [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change
  2017-11-24 18:08 ` [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change Maxime Coquelin
  2017-11-27  8:16   ` Victor Kaplansky
@ 2017-11-28 10:06   ` Maxime Coquelin
  2018-02-14  2:03     ` Tan, Jianfeng
  1 sibling, 1 reply; 12+ messages in thread
From: Maxime Coquelin @ 2017-11-28 10:06 UTC (permalink / raw)
  To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans; +Cc: stable, jfreiman



On 11/24/2017 07:08 PM, Maxime Coquelin wrote:
> When performing live-migration with multiple queue pairs,
> VHOST_USER_SET_LOG_BASE request is sent multiple times.
> 
> If packets are being processed by the PMD threads, it is
> possible that they are setting bits in the dirty log map while
> its region is being unmapped by the vhost-user protocol thread.
> It results in the following crash:
> Thread 3 "lcore-slave-2" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7f71ca495700 (LWP 32451)]
> 0x00000000004bfc8a in vhost_set_bit (addr=0x7f71cbe18432 <error: Cannot access memory at address 0x7f71cbe18432>, nr=1) at /home/max/projects/src/mainline/dpdk/lib/librte_vhost/vhost.h:267
> 267        __sync_fetch_and_or_8(addr, (1U << nr));
> 
> We can see the vhost-user protocol thread just did the unmap of the
> dirty log region when it happens.
> 
> This patch prevents this by introducing a RW lock to protect
> the log base.
> 
> Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>   lib/librte_vhost/vhost.c      |  2 ++
>   lib/librte_vhost/vhost.h      | 14 +++++++++++---
>   lib/librte_vhost/vhost_user.c |  4 ++++
>   3 files changed, 17 insertions(+), 3 deletions(-)
> 

By clarifying the vhost-user spec, we may be able to avoid this lock and
just ignore the subsequent SET_LOG_BASE requests once
VHOST_F_LOG_ALL feature bit is set.
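
A minimal sketch of that alternative, assuming the spec clarification goes
that way: a request that arrives while VHOST_F_LOG_ALL is set and a log base
is already mapped is dropped, so the region is never remapped under the PMD
threads. struct logbase_dev, set_log_base_once() and its return convention
are hypothetical and are not the vhost_user.c handler.

```c
#include <stdint.h>

#define VHOST_F_LOG_ALL 26	/* feature bit number from linux/vhost.h */

/* Hypothetical, reduced stand-in for the logging fields of struct virtio_net. */
struct logbase_dev {
	uint64_t features;
	uint64_t log_base;
	uint64_t log_size;
};

/* Returns 1 if the new log base was applied, 0 if the request was ignored. */
static int
set_log_base_once(struct logbase_dev *dev, uint64_t base, uint64_t size)
{
	int logging = (dev->features & (1ULL << VHOST_F_LOG_ALL)) != 0;

	/* Logging already runs on a mapped region: keep it untouched. */
	if (logging && dev->log_base != 0)
		return 0;

	dev->log_base = base;
	dev->log_size = size;
	return 1;
}
```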

So let's just discard this series for now.

Maxime


* Re: [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change
  2017-11-28 10:06   ` Maxime Coquelin
@ 2018-02-14  2:03     ` Tan, Jianfeng
  2018-02-14  7:52       ` Maxime Coquelin
  0 siblings, 1 reply; 12+ messages in thread
From: Tan, Jianfeng @ 2018-02-14  2:03 UTC (permalink / raw)
  To: Maxime Coquelin, dev, yliu, tiwei.bie, vkaplans; +Cc: stable, jfreiman

Hi Maxime,


On 11/28/2017 6:06 PM, Maxime Coquelin wrote:
>
>
> On 11/24/2017 07:08 PM, Maxime Coquelin wrote:
>> When performing live-migration with multiple queue pairs,
>> VHOST_USER_SET_LOG_BASE request is sent multiple times.
>>
>> If packets are being processed by the PMD threads, it is
>> possible that they are setting bits in the dirty log map while
>> its region is being unmapped by the vhost-user protocol thread.
>> It results in the following crash:
>> Thread 3 "lcore-slave-2" received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 0x7f71ca495700 (LWP 32451)]
>> 0x00000000004bfc8a in vhost_set_bit (addr=0x7f71cbe18432 <error: 
>> Cannot access memory at address 0x7f71cbe18432>, nr=1) at 
>> /home/max/projects/src/mainline/dpdk/lib/librte_vhost/vhost.h:267
>> 267        __sync_fetch_and_or_8(addr, (1U << nr));
>>
>> We can see the vhost-user protocol thread just did the unmap of the
>> dirty log region when it happens.
>>
>> This patch prevents this by introducing a RW lock to protect
>> the log base.
>>
>> Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>>   lib/librte_vhost/vhost.c      |  2 ++
>>   lib/librte_vhost/vhost.h      | 14 +++++++++++---
>>   lib/librte_vhost/vhost_user.c |  4 ++++
>>   3 files changed, 17 insertions(+), 3 deletions(-)
>>
>
> By clarifying the vhost-user spec, we may be able to avoid this lock and
> just ignore the subsequent SET_LOG_BASE requests once
> VHOST_F_LOG_ALL feature bit is set.
>
> So let's just discard this series for now.

I would assume this issue has been addressed by the per-queue lock patch 
from Victor, correct?

Besides, we really don't need multiple unmap/map for each vq. Do you
think this should be fixed in QEMU?

Thanks,
Jianfeng


* Re: [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change
  2018-02-14  2:03     ` Tan, Jianfeng
@ 2018-02-14  7:52       ` Maxime Coquelin
  2018-02-22  2:54         ` Tan, Jianfeng
  0 siblings, 1 reply; 12+ messages in thread
From: Maxime Coquelin @ 2018-02-14  7:52 UTC (permalink / raw)
  To: Tan, Jianfeng, dev, yliu, tiwei.bie, vkaplans; +Cc: stable, jfreiman

Hi Jianfeng,

On 02/14/2018 03:03 AM, Tan, Jianfeng wrote:
> Hi Maxime,
> 
> 
> On 11/28/2017 6:06 PM, Maxime Coquelin wrote:
>>
>>
>> On 11/24/2017 07:08 PM, Maxime Coquelin wrote:
>>> When performing live-migration with multiple queue pairs,
>>> VHOST_USER_SET_LOG_BASE request is sent multiple times.
>>>
>>> If packets are being processed by the PMD threads, it is
>>> possible that they are setting bits in the dirty log map while
>>> its region is being unmapped by the vhost-user protocol thread.
>>> It results in the following crash:
>>> Thread 3 "lcore-slave-2" received signal SIGSEGV, Segmentation fault.
>>> [Switching to Thread 0x7f71ca495700 (LWP 32451)]
>>> 0x00000000004bfc8a in vhost_set_bit (addr=0x7f71cbe18432 <error: 
>>> Cannot access memory at address 0x7f71cbe18432>, nr=1) at 
>>> /home/max/projects/src/mainline/dpdk/lib/librte_vhost/vhost.h:267
>>> 267        __sync_fetch_and_or_8(addr, (1U << nr));
>>>
>>> We can see the vhost-user protocol thread just did the unmap of the
>>> dirty log region when it happens.
>>>
>>> This patch prevents this by introducing a RW lock to protect
>>> the log base.
>>>
>>> Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>> ---
>>>   lib/librte_vhost/vhost.c      |  2 ++
>>>   lib/librte_vhost/vhost.h      | 14 +++++++++++---
>>>   lib/librte_vhost/vhost_user.c |  4 ++++
>>>   3 files changed, 17 insertions(+), 3 deletions(-)
>>>
>>
>> By clarifying the vhost-user spec, we may be able to avoid this lock and
>> just ignore the subsequent SET_LOG_BASE requests once
>> VHOST_F_LOG_ALL feature bit is set.
>>
>> So let's just discard this series for now.
> 
> I would assume this issue has been addressed by the per-queue lock patch 
> from Victor, correct?

Correct.

> Besides, we really don't need multiple unmap/map for each vq. Would you 
> think this shall be fixed in QEMU?

Yes, I think you are right, it should be fixed in QEMU, so that it is
sent only for the first queue pair.

But I haven't had time to work on it, TBH.


Cheers,
Maxime
> Thanks,
> Jianfeng


* Re: [dpdk-dev] [PATCH v2 2/3] vhost: protect dirty logging against logging base change
  2018-02-14  7:52       ` Maxime Coquelin
@ 2018-02-22  2:54         ` Tan, Jianfeng
  0 siblings, 0 replies; 12+ messages in thread
From: Tan, Jianfeng @ 2018-02-22  2:54 UTC (permalink / raw)
  To: Maxime Coquelin, dev, yliu, Bie, Tiwei, vkaplans; +Cc: stable, jfreiman



> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Wednesday, February 14, 2018 3:53 PM
> To: Tan, Jianfeng; dev@dpdk.org; yliu@fridaylinux.org; Bie, Tiwei;
> vkaplans@redhat.com
> Cc: stable@dpdk.org; jfreiman@redhat.com
> Subject: Re: [PATCH v2 2/3] vhost: protect dirty logging against logging base
> change
> 
> Hi Jianfeng,
> 
> On 02/14/2018 03:03 AM, Tan, Jianfeng wrote:
> > Hi Maxime,
> >
> >
> > On 11/28/2017 6:06 PM, Maxime Coquelin wrote:
> >>
> >>
> >> On 11/24/2017 07:08 PM, Maxime Coquelin wrote:
> >>> When performing live-migration with multiple queue pairs,
> >>> VHOST_USER_SET_LOG_BASE request is sent multiple times.
> >>>
> >>> If packets are being processed by the PMD threads, it is
> >>> possible that they are setting bits in the dirty log map while
> >>> its region is being unmapped by the vhost-user protocol thread.
> >>> It results in the following crash:
> >>> Thread 3 "lcore-slave-2" received signal SIGSEGV, Segmentation fault.
> >>> [Switching to Thread 0x7f71ca495700 (LWP 32451)]
> >>> 0x00000000004bfc8a in vhost_set_bit (addr=0x7f71cbe18432 <error:
> >>> Cannot access memory at address 0x7f71cbe18432>, nr=1) at
> >>> /home/max/projects/src/mainline/dpdk/lib/librte_vhost/vhost.h:267
> >>> 267        __sync_fetch_and_or_8(addr, (1U << nr));
> >>>
> >>> We can see the vhost-user protocol thread just did the unmap of the
> >>> dirty log region when it happens.
> >>>
> >>> This patch prevents this by introducing a RW lock to protect
> >>> the log base.
> >>>
> >>> Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
> >>> Cc: stable@dpdk.org
> >>>
> >>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>> ---
> >>>   lib/librte_vhost/vhost.c      |  2 ++
> >>>   lib/librte_vhost/vhost.h      | 14 +++++++++++---
> >>>   lib/librte_vhost/vhost_user.c |  4 ++++
> >>>   3 files changed, 17 insertions(+), 3 deletions(-)
> >>>
> >>
> >> By clarifying the vhost-user spec, we may be able to avoid this lock and
> >> just ignore the subsequent SET_LOG_BASE requests once
> >> VHOST_F_LOG_ALL feature bit is set.
> >>
> >> So let's just discard this series for now.
> >
> > I would assume this issue has been addressed by the per-queue lock patch
> > from Victor, correct?
> 
> Correct.
> 
> > Besides, we really don't need multiple unmap/map for each vq. Would you
> > think this shall be fixed in QEMU?
> 
> Yes, I tihnk you are right it should be fixed in QEMU, so that it is
> sent only for the first queue pair.
> 
> But I didn't had time to work on it TBH.

Thank you for the confirmation. And it's not an urgent issue to fix anyway.

