* [dpdk-dev] [PATCH v5 1/4] vhost: prevent features to be changed while device is running
2017-12-13 8:51 [dpdk-dev] [PATCH v5 0/4] Vhost: fix mq=on but VIRTIO_NET_F_MQ not negotiated Maxime Coquelin
@ 2017-12-13 8:51 ` Maxime Coquelin
2017-12-13 8:51 ` [dpdk-dev] [PATCH v5 2/4] vhost: propagate VHOST_USER_SET_FEATURES handling error Maxime Coquelin
` (4 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2017-12-13 8:51 UTC (permalink / raw)
To: dev, yliu, tiwei.bie, jianfeng.tan, lprosek, lersek; +Cc: Maxime Coquelin
Section 2.2 of the Virtio spec says the following about feature
negotiation:
"During device initialization, the driver reads this and tells
the device the subset that it accepts. The only way to
renegotiate is to reset the device."
This patch adds a check that rejects illegal feature changes
while the device is running.
The one exception is the VHOST_F_LOG_ALL feature bit, which is enabled
when live migration is initiated. This bit is not negotiated with the
Virtio driver, but directly with the Vhost master.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost_user.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
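For readers following the bit arithmetic, here is a minimal standalone
sketch of the check. The helper name illegal_feature_change() is
hypothetical and not part of the patch; the bit positions are the usual
values from the virtio and vhost headers. XOR-ing the old and new feature
words yields the set of bits that differ, and masking out VHOST_F_LOG_ALL
leaves only the changes that are not allowed while the device is running.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit positions, as defined by the virtio/vhost headers. */
#define VIRTIO_NET_F_MRG_RXBUF 15
#define VIRTIO_NET_F_MQ        22
#define VHOST_F_LOG_ALL        26

/*
 * Hypothetical helper (illustration only, not in the patch): returns
 * true when the two feature words differ in any bit other than
 * VHOST_F_LOG_ALL.
 */
static bool
illegal_feature_change(uint64_t old_features, uint64_t new_features)
{
	/* Bits set in 'changed' are the ones that differ. */
	uint64_t changed = old_features ^ new_features;

	/* Ignore VHOST_F_LOG_ALL, which may toggle for live migration. */
	return (changed & ~(1ULL << VHOST_F_LOG_ALL)) != 0;
}
```

With this helper, toggling only VHOST_F_LOG_ALL is accepted, while
toggling VIRTIO_NET_F_MRG_RXBUF on a running device is reported as
illegal.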
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index f4c7ce462..545dbcb2b 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -183,7 +183,22 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features)
return -1;
}
- if ((dev->flags & VIRTIO_DEV_RUNNING) && dev->features != features) {
+ if (dev->flags & VIRTIO_DEV_RUNNING) {
+ if (dev->features == features)
+ return 0;
+
+ /*
+ * Error out if master tries to change features while device is
+ * in running state. The exception being VHOST_F_LOG_ALL, which
+ * is enabled when the live-migration starts.
+ */
+ if ((dev->features ^ features) & ~(1ULL << VHOST_F_LOG_ALL)) {
+ RTE_LOG(ERR, VHOST_CONFIG,
+ "(%d) features changed while device is running.\n",
+ dev->vid);
+ return -1;
+ }
+
if (dev->notify_ops->features_changed)
dev->notify_ops->features_changed(dev->vid, features);
}
--
2.14.3
* [dpdk-dev] [PATCH v5 2/4] vhost: propagate VHOST_USER_SET_FEATURES handling error
From: Maxime Coquelin @ 2017-12-13 8:51 UTC (permalink / raw)
To: dev, yliu, tiwei.bie, jianfeng.tan, lprosek, lersek; +Cc: Maxime Coquelin
Not propagating a VHOST_USER_SET_FEATURES request handling
error may result in unpredictable behavior, as host and
guest features may no longer be synchronized.
This patch fixes this by reporting the error to the upper
layer, which results in the device being destroyed
and the connection with the master being closed.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost_user.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
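The control flow can be sketched in isolation as follows. All names
prefixed with mock_ are hypothetical stand-ins and not the real vhost
API; the simplified mock_set_features() only models "fail if the
features change while running". The point is that the handler now
returns the failure to its caller instead of dropping it.

```c
#include <stdint.h>

/* Hypothetical request identifiers, for illustration only. */
enum mock_request { MOCK_SET_FEATURES, MOCK_OTHER };

/* Simplified stand-in: fail when features change on a running device. */
static int
mock_set_features(int running, uint64_t current, uint64_t requested)
{
	return (running && current != requested) ? -1 : 0;
}

/* Stand-in for the message handler: a failure is passed to the caller. */
static int
mock_msg_handler(enum mock_request req, int running,
		 uint64_t current, uint64_t requested)
{
	int ret;

	switch (req) {
	case MOCK_SET_FEATURES:
		ret = mock_set_features(running, current, requested);
		if (ret)
			return -1;	/* caller is expected to tear down */
		break;
	default:
		break;
	}

	return 0;
}
```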
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 545dbcb2b..471b1612c 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -1263,7 +1263,9 @@ vhost_user_msg_handler(int vid, int fd)
send_vhost_reply(fd, &msg);
break;
case VHOST_USER_SET_FEATURES:
- vhost_user_set_features(dev, msg.payload.u64);
+ ret = vhost_user_set_features(dev, msg.payload.u64);
+ if (ret)
+ return -1;
break;
case VHOST_USER_GET_PROTOCOL_FEATURES:
--
2.14.3
* [dpdk-dev] [PATCH v5 3/4] vhost: extract virtqueue cleaning and freeing functions
From: Maxime Coquelin @ 2017-12-13 8:51 UTC (permalink / raw)
To: dev, yliu, tiwei.bie, jianfeng.tan, lprosek, lersek; +Cc: Maxime Coquelin
This patch extracts the code needed for vhost_user.c to be able
to clean up and free virtqueues individually.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost.c | 22 ++++++++++++----------
lib/librte_vhost/vhost.h | 3 +++
2 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 4f8b73a09..df528a4ea 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -103,7 +103,7 @@ get_device(int vid)
return dev;
}
-static void
+void
cleanup_vq(struct vhost_virtqueue *vq, int destroy)
{
if ((vq->callfd >= 0) && (destroy != 0))
@@ -127,6 +127,15 @@ cleanup_device(struct virtio_net *dev, int destroy)
cleanup_vq(dev->virtqueue[i], destroy);
}
+void
+free_vq(struct vhost_virtqueue *vq)
+{
+ rte_free(vq->shadow_used_ring);
+ rte_free(vq->batch_copy_elems);
+ rte_mempool_free(vq->iotlb_pool);
+ rte_free(vq);
+}
+
/*
* Release virtqueues and device memory.
*/
@@ -134,16 +143,9 @@ static void
free_device(struct virtio_net *dev)
{
uint32_t i;
- struct vhost_virtqueue *vq;
-
- for (i = 0; i < dev->nr_vring; i++) {
- vq = dev->virtqueue[i];
- rte_free(vq->shadow_used_ring);
- rte_free(vq->batch_copy_elems);
- rte_mempool_free(vq->iotlb_pool);
- rte_free(vq);
- }
+ for (i = 0; i < dev->nr_vring; i++)
+ free_vq(dev->virtqueue[i]);
rte_free(dev);
}
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 1cc81c17c..9cad1bb3c 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -364,6 +364,9 @@ void cleanup_device(struct virtio_net *dev, int destroy);
void reset_device(struct virtio_net *dev);
void vhost_destroy_device(int);
+void cleanup_vq(struct vhost_virtqueue *vq, int destroy);
+void free_vq(struct vhost_virtqueue *vq);
+
int alloc_vring_queue(struct virtio_net *dev, uint32_t vring_idx);
void vhost_set_ifname(int, const char *if_name, unsigned int if_len);
--
2.14.3
* [dpdk-dev] [PATCH v5 4/4] vhost: destroy unused virtqueues when multiqueue not negotiated
From: Maxime Coquelin @ 2017-12-13 8:51 UTC (permalink / raw)
To: dev, yliu, tiwei.bie, jianfeng.tan, lprosek, lersek; +Cc: Maxime Coquelin
QEMU sends VHOST_USER_SET_VRING_CALL requests for all queues
declared on the QEMU command line before the guest is started.
As a result, the DPDK vhost-user backend allocates vrings
for all queues declared by QEMU.
If the first driver being used does not support multiqueue,
the device never changes to VIRTIO_DEV_RUNNING state, as only
the first queue pair is initialized while the backend waits for
all allocated vrings. One driver affected by this bug is iPXE's
virtio-net driver, which does not support the VIRTIO_NET_F_MQ
feature.
It is safe to destroy unused virtqueues in the SET_FEATURES request
handler, as it is ensured that the device is not in running state
at this stage, so virtqueues aren't being processed.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost_user.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
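The trimming loop can be tried out on its own with the sketch below.
The mock_dev and mock_vq types and mock_trim_to_first_pair() are
hypothetical stand-ins for struct virtio_net, struct vhost_virtqueue and
the new code in vhost_user_set_features(); free() stands in for the
cleanup_vq() + free_vq() pair. It shows that nr_vring is decremented
first, that holes in the array are skipped, and that each freed slot is
reset to NULL.

```c
#include <stdint.h>
#include <stdlib.h>

#define MOCK_MAX_VRING 8

/* Hypothetical, stripped-down virtqueue. */
struct mock_vq {
	int callfd;
};

/* Hypothetical, stripped-down device. */
struct mock_dev {
	uint32_t nr_vring;
	struct mock_vq *virtqueue[MOCK_MAX_VRING];
};

/* Drop every vring beyond the first RX/TX pair. */
static void
mock_trim_to_first_pair(struct mock_dev *dev)
{
	while (dev->nr_vring > 2) {
		struct mock_vq *vq = dev->virtqueue[--dev->nr_vring];

		if (!vq)
			continue;	/* slot was never allocated */

		/* Clear the slot so no stale pointer is left behind. */
		dev->virtqueue[dev->nr_vring] = NULL;
		free(vq);
	}
}
```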
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 471b1612c..1848c8de9 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -216,6 +216,25 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features)
(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
+ if (!(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
+ /*
+ * Remove all but first queue pair if MQ hasn't been
+ * negotiated. This is safe because the device is not
+ * running at this stage.
+ */
+ while (dev->nr_vring > 2) {
+ struct vhost_virtqueue *vq;
+
+ vq = dev->virtqueue[--dev->nr_vring];
+ if (!vq)
+ continue;
+
+ dev->virtqueue[dev->nr_vring] = NULL;
+ cleanup_vq(vq, 1);
+ free_vq(vq);
+ }
+ }
+
return 0;
}
--
2.14.3
* Re: [dpdk-dev] [PATCH v5 0/4] Vhost: fix mq=on but VIRTIO_NET_F_MQ not negotiated
From: Paolo Bonzini @ 2017-12-13 9:15 UTC (permalink / raw)
To: Maxime Coquelin, dev, yliu, tiwei.bie, jianfeng.tan, lprosek, lersek
On 13/12/2017 09:51, Maxime Coquelin wrote:
> This series fixes this by destroying all but the first queue pair in
> the backend if VIRTIO_NET_F_MQ isn't negotiated. The first patches
> make sure that a VHOST_USER_SET_FEATURES request doesn't change
> Virtio features while the device is running, which should never
> happen as per the Virtio spec. This helps to make sure virtqueues
> aren't destroyed while being processed, but also protects from
> other illegal feature changes (e.g. VIRTIO_NET_F_MRG_RXBUF).
Hi Maxime,
I think this series is wrong from the virtio spec's point of view. If
the driver requests VIRTIO_NET_F_MQ, that does not mean "the driver
knows about multiqueue", it only means that "the driver wants to read
max_virtqueue_pairs" from configuration space.
Just like it's perfectly fine for a device to propose VIRTIO_NET_F_MQ
and set max_virtqueue_pairs=1, a driver can negotiate VIRTIO_NET_F_MQ
and then skip initialization of some virtqueues.
In fact, for virtio-net there is an explicit way to say how many
virtqueue pairs are available; the virtio spec's section 5.1.5 (Network
device, Device Initialization) mentions that
Even with VIRTIO_NET_F_MQ, only receiveq1, transmitq1 and
controlq are used by default. The driver would send the
VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command specifying the number of
the transmit and receive queues to use.
Thanks,
Paolo
* Re: [dpdk-dev] [PATCH v5 0/4] Vhost: fix mq=on but VIRTIO_NET_F_MQ not negotiated
From: Maxime Coquelin @ 2017-12-13 10:11 UTC (permalink / raw)
To: Paolo Bonzini, dev, yliu, tiwei.bie, jianfeng.tan, lprosek,
lersek, Michael S. Tsirkin
Hi Paolo,
On 12/13/2017 10:15 AM, Paolo Bonzini wrote:
> On 13/12/2017 09:51, Maxime Coquelin wrote:
>> This series fixes this by destroying all but the first queue pair in
>> the backend if VIRTIO_NET_F_MQ isn't negotiated. The first patches
>> make sure that a VHOST_USER_SET_FEATURES request doesn't change
>> Virtio features while the device is running, which should never
>> happen as per the Virtio spec. This helps to make sure virtqueues
>> aren't destroyed while being processed, but also protects from
>> other illegal feature changes (e.g. VIRTIO_NET_F_MRG_RXBUF).
>
> Hi Maxime,
>
> I think this series is wrong from the virtio spec's point of view. If
> the driver requests VIRTIO_NET_F_MQ, that does not mean "the driver
> knows about multiqueue", it only means that "the driver wants to read
> max_virtqueue_pairs" from configuration space.
Actually, my series fixes half of the problem: the case where the driver
does not know about multiqueue.
In this case, there is no point in the backend waiting for the
initialization of queues that do not exist.
So I think my series is not enough, but not wrong.
> Just like it's perfectly fine for a device to propose VIRTIO_NET_F_MQ
> and set max_virtqueue_pairs=1, a driver can negotiate VIRTIO_NET_F_MQ
> and then skip initialization of some virtqueues.
>
> In fact, for virtio-net there is an explicit way to say how many
> virtqueue pairs are available; the virtio spec's section 5.1.5 (Network
> device, Device Initialization) mentions that
>
> Even with VIRTIO_NET_F_MQ, only receiveq1, transmitq1 and
> controlq are used by default. The driver would send the
> VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command specifying the number of
> the transmit and receive queues to use.
Yes, I agree.
I was planning to send a vhost-user protocol spec update to forward this
information to the backend with a new request.
Currently, DPDK increments the queue counter each time it receives a
request for a new queue. It then waits for all queues to be initialized
(but not necessarily enabled) before starting the vhost device.
The problem is that QEMU, when receiving the
VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command, sends
VHOST_USER_SET_VRING_ENABLE requests to the backend to enable the first
queue pair and disable all others.
We cannot destroy a queue on disable, because it could happen in other
cases, where it would be re-enabled without being re-initialized.
So on the DPDK side, my understanding is that we cannot deduce the
number of queues that must be initialized before starting the
device. DPDK currently considers a queue initialized if its ring
addresses are set and it has received both call and kick fds.
I only fixed half of the problem as a first step because current Kernel
& DPDK virtio-net drivers always allocate the maximum number of queues
exposed by QEMU, even if they do not use them all. But relying on this is
not compliant with the Virtio spec, which does not imply it (and the
spec is right).
Thanks,
Maxime
> Thanks,
>
> Paolo
>
* Re: [dpdk-dev] [PATCH v5 0/4] Vhost: fix mq=on but VIRTIO_NET_F_MQ not negotiated
From: Paolo Bonzini @ 2017-12-13 10:24 UTC (permalink / raw)
To: Maxime Coquelin, dev, yliu, tiwei.bie, jianfeng.tan, lprosek,
lersek, Michael S. Tsirkin
On 13/12/2017 11:11, Maxime Coquelin wrote:
>> Hi Maxime,
>>
>> I think this series is wrong from the virtio spec's point of view. If
>> the driver requests VIRTIO_NET_F_MQ, that does not mean "the driver
>> knows about multiqueue", it only means that "the driver wants to read
>> max_virtqueue_pairs" from configuration space.
>
> Actually, my series fixes half of the problem: the case where the driver
> does not know about multiqueue.
>
> In this case, there is no point in the backend waiting for the
> initialization of queues that do not exist.
>
> So I think my series is not enough, but not wrong.
You're right; what I meant by "wrong" is that it becomes unnecessary if
you do VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET. But since this requires a
vhost-user update, doing both makes sense.
Thanks!
Paolo
* Re: [dpdk-dev] [PATCH v5 0/4] Vhost: fix mq=on but VIRTIO_NET_F_MQ not negotiated
From: Laszlo Ersek @ 2017-12-13 11:23 UTC (permalink / raw)
To: Paolo Bonzini, Maxime Coquelin, dev, yliu, tiwei.bie,
jianfeng.tan, lprosek, Michael S. Tsirkin
On 12/13/17 11:24, Paolo Bonzini wrote:
> On 13/12/2017 11:11, Maxime Coquelin wrote:
>>> Hi Maxime,
>>>
>>> I think this series is wrong from the virtio spec's point of view. If
>>> the driver requests VIRTIO_NET_F_MQ, that does not mean "the driver
>>> knows about multiqueue", it only means that "the driver wants to read
>>> max_virtqueue_pairs" from configuration space.
>>
>> Actually, my series fixes half of the problem: the case where the driver
>> does not know about multiqueue.
>>
>> In this case, there is no point in the backend waiting for the
>> initialization of queues that do not exist.
>>
>> So I think my series is not enough, but not wrong.
>
> You're right; what I meant by "wrong" is that it becomes unnecessary if
> you do VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET. But since this requires a
> vhost-user update, doing both makes sense.
Based on this, plus reviewing patch #4 for:
+ vq = dev->virtqueue[--dev->nr_vring];
+ if (!vq)
+ continue;
+
+ dev->virtqueue[dev->nr_vring] = NULL;
Acked-by: Laszlo Ersek <lersek@redhat.com>
Thanks
Laszlo
* Re: [dpdk-dev] [PATCH v5 0/4] Vhost: fix mq=on but VIRTIO_NET_F_MQ not negotiated
From: Yuanhan Liu @ 2018-01-08 14:36 UTC (permalink / raw)
To: Maxime Coquelin; +Cc: dev, tiwei.bie, jianfeng.tan, lprosek, lersek
On Wed, Dec 13, 2017 at 09:51:05AM +0100, Maxime Coquelin wrote:
> Hi,
>
> This fifth revision fixes a bug reported by Tiwei: dev->virtqueue[$idx]
> wasn't reset to NULL at vq freeing time.
Applied to dpdk-next-virtio.
Thanks.
--yliu
>
> When QEMU is started with mq=on but the guest driver does not
> negotiate the VIRTIO_NET_F_MQ feature, the vhost device never
> starts, because more queues are created in the vhost backend than
> the guest configures.
>
> Guest drivers known to not advertise the VIRTIO_NET_F_MQ feature are
> iPXE and OVMF Virtio-net drivers.
>
> Queues are created because, before starting the guest, QEMU sends
> VHOST_USER_SET_VRING_CALL requests for all queues declared on the QEMU
> command line. Also, once Virtio features are negotiated, QEMU sends
> VHOST_USER_SET_VRING_ENABLE requests to disable all but the first
> queue pair.
>
> This series fixes this by destroying all but the first queue pair in
> the backend if VIRTIO_NET_F_MQ isn't negotiated. The first patches
> make sure that a VHOST_USER_SET_FEATURES request doesn't change
> Virtio features while the device is running, which should never
> happen as per the Virtio spec. This helps to make sure virtqueues
> aren't destroyed while being processed, but also protects from
> other illegal feature changes (e.g. VIRTIO_NET_F_MRG_RXBUF).
>
> Changes since v4:
> =================
> - Fix dev->virtqueue[$idx] not reset to NULL at VQ free time (Tiwei)
> Changes since v3:
> =================
> - Fix Virtio features = 0 case (Tiwei)
> Changes since v2:
> =================
> - Patch 2: Rework & fix VQs destruction loop (Laszlo)
> Changes since v1:
> =================
> - Patch 1: shift bits in the right direction (Ladi)
>
> Maxime Coquelin (4):
> vhost: prevent features to be changed while device is running
> vhost: propagate VHOST_USER_SET_FEATURES handling error
> vhost: extract virtqueue cleaning and freeing functions
> vhost: destroy unused virtqueues when multiqueue not negotiated
>
> lib/librte_vhost/vhost.c | 22 ++++++++++++----------
> lib/librte_vhost/vhost.h | 3 +++
> lib/librte_vhost/vhost_user.c | 40 ++++++++++++++++++++++++++++++++++++++--
> 3 files changed, 53 insertions(+), 12 deletions(-)
>
> --
> 2.14.3