* [PATCH] pci: pci_vfio: Retry vfio setup device reset, if device is busy
@ 2025-09-03 11:17 Thanushree Sreerama
2025-09-03 14:16 ` Burakov, Anatoly
2025-09-04 7:22 ` David Marchand
0 siblings, 2 replies; 3+ messages in thread
From: Thanushree Sreerama @ 2025-09-03 11:17 UTC (permalink / raw)
To: dev; +Cc: thanushree.sreerama, stable, Anatoly Burakov
From: "Thanushree.Sreerama" <thanushree.sreerama@nokia.com>
Add proper EAGAIN handling for the device setup by retrying the device reset
Issue:
asc-0a Disp_0[18237]: EAL: Unable to reset device! Error: 11 (Resource temporarily unavailable)
asc-0a Disp_0[18237]: EAL: 0000:f4:02.3 setup device failed
asc-0a Disp_0[18237]: EAL: Requested device 0000:f4:02.3 cannot be used
Caused due to:
92d847a35e1 ("Revert "driver core: Fix uevent_show() vs driver detach race"")
Cc: stable@dpdk.org
Change-Id: Ic3ae8701fccdbf1e8e2a575d48e707b4c58e939a
Signed-off-by: Thanushree Sreerama <thanushree.sreerama@nokia.com>
---
drivers/bus/pci/linux/pci_vfio.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
index fab3483d9f..20e212c9f1 100644
--- a/drivers/bus/pci/linux/pci_vfio.c
+++ b/drivers/bus/pci/linux/pci_vfio.c
@@ -478,6 +478,8 @@ pci_vfio_is_ioport_bar(int vfio_dev_fd, int bar_index)
static int
pci_rte_vfio_setup_device(struct rte_pci_device *dev, int vfio_dev_fd)
{
+ int i, ret = 0, max_retries = 5, retry_delay_ms = 20;
+
if (pci_vfio_setup_interrupts(dev, vfio_dev_fd) != 0) {
RTE_LOG(ERR, EAL, "Error setting up interrupts!\n");
return -1;
@@ -498,7 +500,19 @@ pci_rte_vfio_setup_device(struct rte_pci_device *dev, int vfio_dev_fd)
* Reset the device. If the device is not capable of resetting,
* then it updates errno as EINVAL.
*/
- if (ioctl(vfio_dev_fd, VFIO_DEVICE_RESET) && errno != EINVAL) {
+ for (i = 0; i < max_retries; i++) {
+ errno = 0;
+ ret = ioctl(vfio_dev_fd, VFIO_DEVICE_RESET);
+ if (!ret || errno == EINVAL)
+ break;
+
+ if (errno == EAGAIN) {
+ RTE_LOG(DEBUG, EAL, "Device busy, sleep %d ms and retry to reset %d of %d times\n",
+ retry_delay_ms, i + 1, max_retries);
+ usleep(retry_delay_ms * 1000);
+ continue;
+ }
+
RTE_LOG(ERR, EAL, "Unable to reset device! Error: %d (%s)\n",
errno, strerror(errno));
return -1;
--
2.43.0
* Re: [PATCH] pci: pci_vfio: Retry vfio setup device reset, if device is busy
From: Burakov, Anatoly @ 2025-09-03 14:16 UTC (permalink / raw)
To: Thanushree Sreerama, dev; +Cc: stable
On 9/3/2025 1:17 PM, Thanushree Sreerama wrote:
> From: "Thanushree.Sreerama" <thanushree.sreerama@nokia.com>
>
> Add proper EAGAIN handling for the device setup by retrying the device reset
>
> Issue:
> asc-0a Disp_0[18237]: EAL: Unable to reset device! Error: 11 (Resource temporarily unavailable)
> asc-0a Disp_0[18237]: EAL: 0000:f4:02.3 setup device failed
> asc-0a Disp_0[18237]: EAL: Requested device 0000:f4:02.3 cannot be used
>
> Caused due to:
> 92d847a35e1 ("Revert "driver core: Fix uevent_show() vs driver detach race"")
> Cc: stable@dpdk.org
>
> Change-Id: Ic3ae8701fccdbf1e8e2a575d48e707b4c58e939a
> Signed-off-by: Thanushree Sreerama <thanushree.sreerama@nokia.com>
> ---
> drivers/bus/pci/linux/pci_vfio.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
> index fab3483d9f..20e212c9f1 100644
> --- a/drivers/bus/pci/linux/pci_vfio.c
> +++ b/drivers/bus/pci/linux/pci_vfio.c
> @@ -478,6 +478,8 @@ pci_vfio_is_ioport_bar(int vfio_dev_fd, int bar_index)
> static int
> pci_rte_vfio_setup_device(struct rte_pci_device *dev, int vfio_dev_fd)
> {
> + int i, ret = 0, max_retries = 5, retry_delay_ms = 20;
> +
> if (pci_vfio_setup_interrupts(dev, vfio_dev_fd) != 0) {
> RTE_LOG(ERR, EAL, "Error setting up interrupts!\n");
> return -1;
> @@ -498,7 +500,19 @@ pci_rte_vfio_setup_device(struct rte_pci_device *dev, int vfio_dev_fd)
> * Reset the device. If the device is not capable of resetting,
> * then it updates errno as EINVAL.
> */
> - if (ioctl(vfio_dev_fd, VFIO_DEVICE_RESET) && errno != EINVAL) {
> + for (i = 0; i < max_retries; i++) {
> + errno = 0;
> + ret = ioctl(vfio_dev_fd, VFIO_DEVICE_RESET);
> + if (!ret || errno == EINVAL)
> + break;
> +
> + if (errno == EAGAIN) {
> + RTE_LOG(DEBUG, EAL, "Device busy, sleep %d ms and retry to reset %d of %d times\n",
> + retry_delay_ms, i + 1, max_retries);
> + usleep(retry_delay_ms * 1000);
Perhaps use one of the rte_delay_* functions for portability?
> + continue;
> + }
> +
> RTE_LOG(ERR, EAL, "Unable to reset device! Error: %d (%s)\n",
> errno, strerror(errno));
> return -1;
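A minimal sketch of the retry loop with the delay passed in as a callback, so that a DPDK build could supply `rte_delay_ms` (declared in `rte_cycles.h`) instead of `usleep`. The names `reset_with_retry`, `reset_fn`, `delay_ms_fn`, `fake_reset` and `fake_delay_ms` are hypothetical and not part of the patch; the callbacks only exist so the example builds without DPDK or VFIO headers. Unlike the quoted hunk, this version also reports failure when the device is still busy after the last attempt, which is an assumption about the intended behaviour.

```c
#include <errno.h>

/* Hypothetical callback types; not part of the patch or of DPDK. */
typedef int (*reset_fn)(int fd);          /* 0 on success, -1 with errno set */
typedef void (*delay_ms_fn)(unsigned ms); /* e.g. rte_delay_ms in a DPDK build */

static int
reset_with_retry(int fd, reset_fn do_reset, delay_ms_fn delay_ms,
		int max_retries, unsigned retry_delay_ms)
{
	int i;

	for (i = 0; i < max_retries; i++) {
		errno = 0;
		if (do_reset(fd) == 0 || errno == EINVAL)
			return 0; /* reset done, or device cannot be reset */
		if (errno != EAGAIN)
			return -1; /* hard failure: errno is left for the caller */
		if (i + 1 < max_retries)
			delay_ms(retry_delay_ms); /* busy: wait before the next try */
	}
	errno = EAGAIN; /* still busy after the last attempt */
	return -1;
}

/* Test doubles standing in for the VFIO ioctl and the delay call. */
static int busy_left;     /* remaining attempts that should report EAGAIN */
static unsigned slept_ms; /* total delay requested so far */

static int
fake_reset(int fd)
{
	(void)fd;
	if (busy_left > 0) {
		busy_left--;
		errno = EAGAIN;
		return -1;
	}
	return 0;
}

static void
fake_delay_ms(unsigned ms)
{
	slept_ms += ms;
}
```

In the real function the reset callback would wrap `ioctl(vfio_dev_fd, VFIO_DEVICE_RESET)`; keeping it behind a function pointer is only a convenience for exercising the loop without hardware.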
--
Thanks,
Anatoly
* Re: [PATCH] pci: pci_vfio: Retry vfio setup device reset, if device is busy
From: David Marchand @ 2025-09-04 7:22 UTC (permalink / raw)
To: Thanushree Sreerama; +Cc: dev, stable, Anatoly Burakov
Hello,
Please have a look at
https://doc.dpdk.org/guides/contributing/patches.html before sending a
new revision.
On Wed, 3 Sept 2025 at 16:24, Thanushree Sreerama
<thanushree.sreerama@nokia.com> wrote:
>
> From: "Thanushree.Sreerama" <thanushree.sreerama@nokia.com>
>
> Add proper EAGAIN handling for the device setup by retrying the device reset
>
> Issue:
> asc-0a Disp_0[18237]: EAL: Unable to reset device! Error: 11 (Resource temporarily unavailable)
> asc-0a Disp_0[18237]: EAL: 0000:f4:02.3 setup device failed
> asc-0a Disp_0[18237]: EAL: Requested device 0000:f4:02.3 cannot be used
We are missing the conditions under which this error is hit (in other
words, why do we get EAGAIN here).
Please describe the functional impact.
Please explain how the timeout/retry values were chosen and why they
are sufficient for the issue you faced.
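For reference, the upper bound implied by the posted values can be worked out directly. The helper below is a hypothetical illustration (the name `worst_case_wait_ms` is not in the patch) and assumes, as in the posted hunk, that the sleep runs after every attempt that reports EAGAIN, including the last one.

```c
/* Worst-case time spent sleeping when every attempt reports EAGAIN.
 * Assumes one sleep per busy attempt, as in the posted hunk. */
static unsigned
worst_case_wait_ms(unsigned max_retries, unsigned retry_delay_ms)
{
	return max_retries * retry_delay_ms;
}
```

With `max_retries = 5` and `retry_delay_ms = 20` this gives 100 ms of added setup time per device in the worst case; whether that is long enough for the busy condition to clear is what the commit message would need to justify.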
>
> Caused due to:
> 92d847a35e1 ("Revert "driver core: Fix uevent_show() vs driver detach race"")
This sha1 does not belong to the DPDK public repo.
I also see the patch could not be applied in the CI, so you'll have to
make sure this patch is rebased on main.
> Cc: stable@dpdk.org
If you want this fix to be backported, point at the commit where the
issue was introduced with a Fixes: tag.
> Change-Id: Ic3ae8701fccdbf1e8e2a575d48e707b4c58e939a
Please remove this marker; it does not make sense for upstream contributions.
> Signed-off-by: Thanushree Sreerama <thanushree.sreerama@nokia.com>
And please subscribe to the mailing list, otherwise your patch waits in
the moderation queue until a DPDK admin looks at it.
Thanks.
--
David Marchand