DPDK patches and discussions
* [PATCH] vhost: add VDUSE virtqueue ready state polling workaround
@ 2025-09-11  8:36 Maxime Coquelin
  2025-09-15  9:42 ` David Marchand
  0 siblings, 1 reply; 3+ messages in thread
From: Maxime Coquelin @ 2025-09-11  8:36 UTC (permalink / raw)
  To: dev, chenbox, david.marchand, amorenoz; +Cc: Maxime Coquelin, stable

Add a workaround that polls the virtqueue ready states before starting
the device when VIRTIO_DEVICE_STATUS_DRIVER_OK is set in
vduse_events_handler().

For each virtqueue, poll with the VDUSE_VQ_GET_INFO ioctl and check
vq_info.ready, up to a fixed retry limit
(VDUSE_VQ_READY_POLL_MAX_RETRIES). This addresses timing issues where
device start was attempted before all virtqueues were initialized and
ready.

A notification mechanism will be introduced in the next version of
the VDUSE uAPI. When it lands, we would only apply this workaround
when the kernel does not support it.

Fixes: a9120db8b98b ("vhost: add VDUSE device startup")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/vhost/vduse.c | 62 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 60 insertions(+), 2 deletions(-)

diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c
index 9de7f04a4f..5a6025d702 100644
--- a/lib/vhost/vduse.c
+++ b/lib/vhost/vduse.c
@@ -272,6 +272,56 @@ vduse_vring_cleanup(struct virtio_net *dev, unsigned int index)
 	vq->last_avail_idx = 0;
 }
 
+
+/*
+ * Tests show that it succeeds at the first retry at worst,
+ * but let's be on the safe side and allow more retries.
+ */
+#define VDUSE_VQ_READY_POLL_MAX_RETRIES 100
+
+static int
+vduse_wait_for_virtqueues_ready(struct virtio_net *dev)
+{
+	struct vduse_vq_info vq_info;
+	unsigned int i;
+	int ret;
+
+	for (i = 0; i < dev->nr_vring; i++) {
+		int retry_count = 0;
+
+		while (retry_count < VDUSE_VQ_READY_POLL_MAX_RETRIES) {
+			vq_info.index = i;
+			ret = ioctl(dev->vduse_dev_fd, VDUSE_VQ_GET_INFO, &vq_info);
+			if (ret) {
+				VHOST_CONFIG_LOG(dev->ifname, ERR,
+					"Failed to get VQ %u info while polling ready state: %s",
+					i, strerror(errno));
+				return -1;
+			}
+
+			if (vq_info.ready) {
+				VHOST_CONFIG_LOG(dev->ifname, DEBUG,
+					"VQ %u is ready after %u retries", i, retry_count);
+				break;
+			}
+
+			retry_count++;
+			/* Small delay between retries */
+			usleep(1000);
+		}
+
+		if (retry_count >= VDUSE_VQ_READY_POLL_MAX_RETRIES) {
+			VHOST_CONFIG_LOG(dev->ifname, ERR,
+				"VQ %u ready state polling timeout after %u retries",
+				i, VDUSE_VQ_READY_POLL_MAX_RETRIES);
+			return -1;
+		}
+	}
+
+	VHOST_CONFIG_LOG(dev->ifname, INFO, "All virtqueues are ready after polling");
+	return 0;
+}
+
 static void
 vduse_device_start(struct virtio_net *dev, bool reconnect)
 {
@@ -414,10 +464,18 @@ vduse_events_handler(int fd, void *arg, int *close __rte_unused)
 	}
 
 	if ((old_status ^ dev->status) & VIRTIO_DEVICE_STATUS_DRIVER_OK) {
-		if (dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK)
+		if (dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK) {
+			/* Poll virtqueues ready states before starting device */
+			ret = vduse_wait_for_virtqueues_ready(dev);
+			if (ret < 0) {
+				VHOST_CONFIG_LOG(dev->ifname, ERR,
+					"Failed to wait for virtqueues ready, aborting device start");
+				return;
+			}
 			vduse_device_start(dev, false);
-		else
+		} else {
 			vduse_device_stop(dev);
+		}
 	}
 
 	VHOST_CONFIG_LOG(dev->ifname, INFO, "Request %s (%u) handled successfully",
-- 
2.51.0



* Re: [PATCH] vhost: add VDUSE virtqueue ready state polling workaround
  2025-09-11  8:36 [PATCH] vhost: add VDUSE virtqueue ready state polling workaround Maxime Coquelin
@ 2025-09-15  9:42 ` David Marchand
  2025-09-16  8:47   ` Maxime Coquelin
  0 siblings, 1 reply; 3+ messages in thread
From: David Marchand @ 2025-09-15  9:42 UTC (permalink / raw)
  To: Maxime Coquelin; +Cc: dev, chenbox, amorenoz, stable

On Thu, 11 Sept 2025 at 10:36, Maxime Coquelin
<maxime.coquelin@redhat.com> wrote:
>
> Add workaround to poll virtqueue ready states before starting device
> when VIRTIO_DEVICE_STATUS_DRIVER_OK is set in vduse_events_handler().
>
> For each virtqueue, poll using VDUSE_VQ_GET_INFO ioctl to check
> vq_info->ready state with configurable retry limit. This addresses
> timing issues where device start was attempted before all virtqueues
> were properly initialized and ready.
>
> A notification mechanism will be introduced in the next version of
> the VDUSE uAPI. When it lands, we would only apply this workaround
> when the kernel does not support it.
>
> Fixes: a9120db8b98b ("vhost: add VDUSE device startup")
> Cc: stable@dpdk.org
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>  lib/vhost/vduse.c | 62 +++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 60 insertions(+), 2 deletions(-)
>
> diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c
> index 9de7f04a4f..5a6025d702 100644
> --- a/lib/vhost/vduse.c
> +++ b/lib/vhost/vduse.c
> @@ -272,6 +272,56 @@ vduse_vring_cleanup(struct virtio_net *dev, unsigned int index)
>         vq->last_avail_idx = 0;
>  }
>
> +

Nit: no need for double empty lines.

> +/*
> + * Tests show that it succeeds at the first retry at worst,

it?

> + * but let's be on the safe side and allow more retries.
> + */
> +#define VDUSE_VQ_READY_POLL_MAX_RETRIES 100
> +
> +static int
> +vduse_wait_for_virtqueues_ready(struct virtio_net *dev)
> +{
> +       struct vduse_vq_info vq_info;
> +       unsigned int i;
> +       int ret;
> +
> +       for (i = 0; i < dev->nr_vring; i++) {
> +               int retry_count = 0;
> +
> +               while (retry_count < VDUSE_VQ_READY_POLL_MAX_RETRIES) {
> +                       vq_info.index = i;

It is not clear from the uAPI header which parts of the vduse_vq_info
structure are r/o, r/w or w/o.
I see that vduse_vring_setup() does nothing more than setting index.
I am probably paranoid, but do we need an explicit reset of the whole
vq_info on retry?

Moving the definition of vq_info in this loop (right before setting
vq_info.index) seems better on that topic.


> +                       ret = ioctl(dev->vduse_dev_fd, VDUSE_VQ_GET_INFO, &vq_info);
> +                       if (ret) {
> +                               VHOST_CONFIG_LOG(dev->ifname, ERR,
> +                                       "Failed to get VQ %u info while polling ready state: %s",
> +                                       i, strerror(errno));
> +                               return -1;
> +                       }
> +
> +                       if (vq_info.ready) {
> +                               VHOST_CONFIG_LOG(dev->ifname, DEBUG,
> +                                       "VQ %u is ready after %u retries", i, retry_count);
> +                               break;
> +                       }
> +
> +                       retry_count++;
> +                       /* Small delay between retries */

I would remove this Lapalissade comment.


> +                       usleep(1000);
> +               }
> +
> +               if (retry_count >= VDUSE_VQ_READY_POLL_MAX_RETRIES) {
> +                       VHOST_CONFIG_LOG(dev->ifname, ERR,
> +                               "VQ %u ready state polling timeout after %u retries",
> +                               i, VDUSE_VQ_READY_POLL_MAX_RETRIES);
> +                       return -1;
> +               }
> +       }
> +
> +       VHOST_CONFIG_LOG(dev->ifname, INFO, "All virtqueues are ready after polling");
> +       return 0;
> +}
> +
>  static void
>  vduse_device_start(struct virtio_net *dev, bool reconnect)
>  {
> @@ -414,10 +464,18 @@ vduse_events_handler(int fd, void *arg, int *close __rte_unused)
>         }
>
>         if ((old_status ^ dev->status) & VIRTIO_DEVICE_STATUS_DRIVER_OK) {
> -               if (dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK)
> +               if (dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK) {
> +                       /* Poll virtqueues ready states before starting device */
> +                       ret = vduse_wait_for_virtqueues_ready(dev);
> +                       if (ret < 0) {
> +                               VHOST_CONFIG_LOG(dev->ifname, ERR,
> +                                       "Failed to wait for virtqueues ready, aborting device start");
> +                               return;
> +                       }
>                         vduse_device_start(dev, false);
> -               else
> +               } else {
>                         vduse_device_stop(dev);
> +               }
>         }
>
>         VHOST_CONFIG_LOG(dev->ifname, INFO, "Request %s (%u) handled successfully",
> --
> 2.51.0
>

Aside from those nits, it looks like an acceptable workaround for now.
Reviewed-by: David Marchand <david.marchand@redhat.com>


-- 
David Marchand



* Re: [PATCH] vhost: add VDUSE virtqueue ready state polling workaround
  2025-09-15  9:42 ` David Marchand
@ 2025-09-16  8:47   ` Maxime Coquelin
  0 siblings, 0 replies; 3+ messages in thread
From: Maxime Coquelin @ 2025-09-16  8:47 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, chenbox, amorenoz, stable



On 9/15/25 11:42 AM, David Marchand wrote:
> On Thu, 11 Sept 2025 at 10:36, Maxime Coquelin
> <maxime.coquelin@redhat.com> wrote:
>>
>> Add workaround to poll virtqueue ready states before starting device
>> when VIRTIO_DEVICE_STATUS_DRIVER_OK is set in vduse_events_handler().
>>
>> For each virtqueue, poll using VDUSE_VQ_GET_INFO ioctl to check
>> vq_info->ready state with configurable retry limit. This addresses
>> timing issues where device start was attempted before all virtqueues
>> were properly initialized and ready.
>>
>> A notification mechanism will be introduced in the next version of
>> the VDUSE uAPI. When it lands, we would only apply this workaround
>> when the kernel does not support it.
>>
>> Fixes: a9120db8b98b ("vhost: add VDUSE device startup")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>>   lib/vhost/vduse.c | 62 +++++++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 60 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c
>> index 9de7f04a4f..5a6025d702 100644
>> --- a/lib/vhost/vduse.c
>> +++ b/lib/vhost/vduse.c
>> @@ -272,6 +272,56 @@ vduse_vring_cleanup(struct virtio_net *dev, unsigned int index)
>>          vq->last_avail_idx = 0;
>>   }
>>
>> +
> 
> Nit: no need for double empty lines.
> 
>> +/*
>> + * Tests show that it succeeds at the first retry at worst,
> 
> it?

Changing to:
"Tests show that virtqueues get ready at the first retry at worst..."

> 
>> + * but let's be on the safe side and allow more retries.
>> + */
>> +#define VDUSE_VQ_READY_POLL_MAX_RETRIES 100
>> +
>> +static int
>> +vduse_wait_for_virtqueues_ready(struct virtio_net *dev)
>> +{
>> +       struct vduse_vq_info vq_info;
>> +       unsigned int i;
>> +       int ret;
>> +
>> +       for (i = 0; i < dev->nr_vring; i++) {
>> +               int retry_count = 0;
>> +
>> +               while (retry_count < VDUSE_VQ_READY_POLL_MAX_RETRIES) {
>> +                       vq_info.index = i;
> 
> It is not clear which part of the vduse_vq_info structure is r/o, r/w
> or w/o in uapi header
> I see that vduse_vring_setup() does nothing more than setting index.
> I am probably paranoid but do we need an explicit reset of the whole
> vq_info on retry?
> 
> Moving the definition of vq_info in this loop (right before setting
> vq_info.index) seems better on that topic.
> 

The kernel side only looks at the index field (for now at least), but I
agree that could change, so zeroing vq_info should be done.

I will also send a separate patch for vduse_vring_setup().
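
For illustration only, here is a minimal standalone sketch of the
zeroing discussed above. It is not the actual follow-up patch: struct
vq_info_sketch, vq_query_fn, fake_query() and wait_for_vq_ready() are
hypothetical stand-ins so the snippet builds without the kernel uAPI
header, and the callback takes the place of the
ioctl(fd, VDUSE_VQ_GET_INFO, &vq_info) call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for struct vduse_vq_info: only the fields used here. */
struct vq_info_sketch {
	uint32_t index;
	uint8_t ready;
};

/*
 * Hypothetical callback standing in for the VDUSE_VQ_GET_INFO ioctl.
 * Returns 0 on success and fills info->ready.
 */
typedef int (*vq_query_fn)(void *ctx, struct vq_info_sketch *info);

#define VQ_READY_POLL_MAX_RETRIES 100
#define VQ_READY_POLL_DELAY_US 1000

/*
 * Poll one virtqueue until it reports ready.
 * Returns the number of retries needed, or -1 on query failure or timeout.
 */
static int
wait_for_vq_ready(void *ctx, vq_query_fn query, uint32_t index)
{
	int retry;

	assert(query != NULL);

	for (retry = 0; retry < VQ_READY_POLL_MAX_RETRIES; retry++) {
		/* Scoped to the loop and zeroed, so no stale field survives a retry. */
		struct vq_info_sketch info;

		memset(&info, 0, sizeof(info));
		info.index = index;

		if (query(ctx, &info) != 0)
			return -1;
		if (info.ready)
			return retry;

		usleep(VQ_READY_POLL_DELAY_US);
	}

	return -1;
}

/* Test double: reports not-ready for the first *ctx calls, then ready. */
static int
fake_query(void *ctx, struct vq_info_sketch *info)
{
	int *calls_left = ctx;

	info->ready = (*calls_left == 0);
	if (*calls_left > 0)
		(*calls_left)--;
	return 0;
}
```

With the test double, a queue that becomes ready on the third query
makes wait_for_vq_ready() return 2, and a queue that never becomes
ready makes it return -1 after roughly 100 ms of polling.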

>> +                       ret = ioctl(dev->vduse_dev_fd, VDUSE_VQ_GET_INFO, &vq_info);
>> +                       if (ret) {
>> +                               VHOST_CONFIG_LOG(dev->ifname, ERR,
>> +                                       "Failed to get VQ %u info while polling ready state: %s",
>> +                                       i, strerror(errno));
>> +                               return -1;
>> +                       }
>> +
>> +                       if (vq_info.ready) {
>> +                               VHOST_CONFIG_LOG(dev->ifname, DEBUG,
>> +                                       "VQ %u is ready after %u retries", i, retry_count);
>> +                               break;
>> +                       }
>> +
>> +                       retry_count++;
>> +                       /* Small delay between retries */
> 
> I would remove this Lapalissade comment.
> 
> 
>> +                       usleep(1000);
>> +               }
>> +
>> +               if (retry_count >= VDUSE_VQ_READY_POLL_MAX_RETRIES) {
>> +                       VHOST_CONFIG_LOG(dev->ifname, ERR,
>> +                               "VQ %u ready state polling timeout after %u retries",
>> +                               i, VDUSE_VQ_READY_POLL_MAX_RETRIES);
>> +                       return -1;
>> +               }
>> +       }
>> +
>> +       VHOST_CONFIG_LOG(dev->ifname, INFO, "All virtqueues are ready after polling");
>> +       return 0;
>> +}
>> +
>>   static void
>>   vduse_device_start(struct virtio_net *dev, bool reconnect)
>>   {
>> @@ -414,10 +464,18 @@ vduse_events_handler(int fd, void *arg, int *close __rte_unused)
>>          }
>>
>>          if ((old_status ^ dev->status) & VIRTIO_DEVICE_STATUS_DRIVER_OK) {
>> -               if (dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK)
>> +               if (dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK) {
>> +                       /* Poll virtqueues ready states before starting device */
>> +                       ret = vduse_wait_for_virtqueues_ready(dev);
>> +                       if (ret < 0) {
>> +                               VHOST_CONFIG_LOG(dev->ifname, ERR,
>> +                                       "Failed to wait for virtqueues ready, aborting device start");
>> +                               return;
>> +                       }
>>                          vduse_device_start(dev, false);
>> -               else
>> +               } else {
>>                          vduse_device_stop(dev);
>> +               }
>>          }
>>
>>          VHOST_CONFIG_LOG(dev->ifname, INFO, "Request %s (%u) handled successfully",
>> --
>> 2.51.0
>>
> 
> Aside from those nits, it looks an acceptable workaround for now.
> Reviewed-by: David Marchand <david.marchand@redhat.com>
> 
> 

