DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
@ 2020-03-31 16:56 Michael Haeuptle
  2020-04-01  8:50 ` Stojaczyk, Dariusz
  2020-04-02 10:10 ` Burakov, Anatoly
  0 siblings, 2 replies; 10+ messages in thread
From: Michael Haeuptle @ 2020-03-31 16:56 UTC (permalink / raw)
  To: dev

This fix treats a 0 return value from vfio_open_group_fd
in vfio_get_group_fd as the intended error condition instead
of putting an incorrect 0 file descriptor in the vfio_group table.

Sometimes, the creation of device files in sysfs is not
instantaneously causing vfio_open_groupfd to return 0.
This has been observed when hot removing/adding multiple
NVMe devices (>=4).

Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
---
 lib/librte_eal/linux/eal_vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
index 4502aefed..1979f6fdd 100644
--- a/lib/librte_eal/linux/eal_vfio.c
+++ b/lib/librte_eal/linux/eal_vfio.c
@@ -379,7 +379,7 @@ vfio_get_group_fd(struct vfio_config *vfio_cfg,
 	}
 
 	vfio_group_fd = vfio_open_group_fd(iommu_group_num);
-	if (vfio_group_fd < 0) {
+	if (vfio_group_fd <= 0) {
 		RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
 		return -1;
 	}
-- 
2.18.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
  2020-03-31 16:56 [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition Michael Haeuptle
@ 2020-04-01  8:50 ` Stojaczyk, Dariusz
  2020-04-02 10:10 ` Burakov, Anatoly
  1 sibling, 0 replies; 10+ messages in thread
From: Stojaczyk, Dariusz @ 2020-04-01  8:50 UTC (permalink / raw)
  To: Michael Haeuptle, dev

> From: dev <dev-bounces@dpdk.org> On Behalf Of Michael Haeuptle
> Sent: Tuesday, March 31, 2020 6:57 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
> 
> This fix treats a 0 return value from vfio_open_group_fd
> in vfio_get_group_fd as the intended error condition instead
> of putting an incorrect 0 file descriptor in the vfio_group table.
> 
> Sometimes, the creation of device files in sysfs is not
> instantaneously causing vfio_open_groupfd to return 0.
> This has been observed when hot removing/adding multiple
> NVMe devices (>=4).
> 
> Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
> ---

Acked-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
  2020-03-31 16:56 [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition Michael Haeuptle
  2020-04-01  8:50 ` Stojaczyk, Dariusz
@ 2020-04-02 10:10 ` Burakov, Anatoly
  2020-04-06 13:25   ` David Marchand
  1 sibling, 1 reply; 10+ messages in thread
From: Burakov, Anatoly @ 2020-04-02 10:10 UTC (permalink / raw)
  To: Michael Haeuptle, dev

On 31-Mar-20 5:56 PM, Michael Haeuptle wrote:
> This fix treats a 0 return value from vfio_open_group_fd
> in vfio_get_group_fd as the intended error condition instead
> of putting an incorrect 0 file descriptor in the vfio_group table.
> 
> Sometimes, the creation of device files in sysfs is not
> instantaneously causing vfio_open_groupfd to return 0.
> This has been observed when hot removing/adding multiple
> NVMe devices (>=4).
> 
> Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
> ---
>   lib/librte_eal/linux/eal_vfio.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
> index 4502aefed..1979f6fdd 100644
> --- a/lib/librte_eal/linux/eal_vfio.c
> +++ b/lib/librte_eal/linux/eal_vfio.c
> @@ -379,7 +379,7 @@ vfio_get_group_fd(struct vfio_config *vfio_cfg,
>   	}
>   
>   	vfio_group_fd = vfio_open_group_fd(iommu_group_num);
> -	if (vfio_group_fd < 0) {
> +	if (vfio_group_fd <= 0) {
>   		RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
>   		return -1;
>   	}
> 

If it's returning an invalid value, is that a kernel bug?

I mean, looks fine to me, so

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
  2020-04-02 10:10 ` Burakov, Anatoly
@ 2020-04-06 13:25   ` David Marchand
  2020-04-06 19:15     ` Haeuptle, Michael
  0 siblings, 1 reply; 10+ messages in thread
From: David Marchand @ 2020-04-06 13:25 UTC (permalink / raw)
  To: Burakov, Anatoly, Michael Haeuptle; +Cc: dev

On Thu, Apr 2, 2020 at 12:11 PM Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:
>
> On 31-Mar-20 5:56 PM, Michael Haeuptle wrote:
> > This fix treats a 0 return value from vfio_open_group_fd
> > in vfio_get_group_fd as the intended error condition instead
> > of putting an incorrect 0 file descriptor in the vfio_group table.
> >
> > Sometimes, the creation of device files in sysfs is not
> > instantaneously causing vfio_open_groupfd to return 0.
> > This has been observed when hot removing/adding multiple
> > NVMe devices (>=4).
> >
> > Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
> > ---
> >   lib/librte_eal/linux/eal_vfio.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
> > index 4502aefed..1979f6fdd 100644
> > --- a/lib/librte_eal/linux/eal_vfio.c
> > +++ b/lib/librte_eal/linux/eal_vfio.c
> > @@ -379,7 +379,7 @@ vfio_get_group_fd(struct vfio_config *vfio_cfg,
> >       }
> >
> >       vfio_group_fd = vfio_open_group_fd(iommu_group_num);
> > -     if (vfio_group_fd < 0) {
> > +     if (vfio_group_fd <= 0) {
> >               RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
> >               return -1;
> >       }
> >
>
> If it's returning an invalid value, is that a kernel bug?
>
> I mean, looks fine to me, so
>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

We are missing a fixes line.
Does this deserve a backport?


-- 
David Marchand


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
  2020-04-06 13:25   ` David Marchand
@ 2020-04-06 19:15     ` Haeuptle, Michael
  2020-04-06 20:08       ` David Marchand
  0 siblings, 1 reply; 10+ messages in thread
From: Haeuptle, Michael @ 2020-04-06 19:15 UTC (permalink / raw)
  To: David Marchand, Burakov, Anatoly; +Cc: dev

Hi Dave,

This is my first submission... how do I get the fixes reference?

Back porting it to 19.11 would be great. This issue shows up in SPDK 20.01, which uses 19.11.

-- Michael

-----Original Message-----
From: David Marchand [mailto:david.marchand@redhat.com] 
Sent: Monday, April 6, 2020 7:25 AM
To: Burakov, Anatoly <anatoly.burakov@intel.com>; Haeuptle, Michael <michael.haeuptle@hpe.com>
Cc: dev <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition

On Thu, Apr 2, 2020 at 12:11 PM Burakov, Anatoly <anatoly.burakov@intel.com> wrote:
>
> On 31-Mar-20 5:56 PM, Michael Haeuptle wrote:
> > This fix treats a 0 return value from vfio_open_group_fd in 
> > vfio_get_group_fd as the intended error condition instead of putting 
> > an incorrect 0 file descriptor in the vfio_group table.
> >
> > Sometimes, the creation of device files in sysfs is not 
> > instantaneously causing vfio_open_groupfd to return 0.
> > This has been observed when hot removing/adding multiple NVMe 
> > devices (>=4).
> >
> > Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
> > ---
> >   lib/librte_eal/linux/eal_vfio.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_eal/linux/eal_vfio.c 
> > b/lib/librte_eal/linux/eal_vfio.c index 4502aefed..1979f6fdd 100644
> > --- a/lib/librte_eal/linux/eal_vfio.c
> > +++ b/lib/librte_eal/linux/eal_vfio.c
> > @@ -379,7 +379,7 @@ vfio_get_group_fd(struct vfio_config *vfio_cfg,
> >       }
> >
> >       vfio_group_fd = vfio_open_group_fd(iommu_group_num);
> > -     if (vfio_group_fd < 0) {
> > +     if (vfio_group_fd <= 0) {
> >               RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
> >               return -1;
> >       }
> >
>
> If it's returning an invalid value, is that a kernel bug?
>
> I mean, looks fine to me, so
>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

We are missing a fixes line.
Does this deserve a backport?


--
David Marchand


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
  2020-04-06 19:15     ` Haeuptle, Michael
@ 2020-04-06 20:08       ` David Marchand
  0 siblings, 0 replies; 10+ messages in thread
From: David Marchand @ 2020-04-06 20:08 UTC (permalink / raw)
  To: Haeuptle, Michael; +Cc: Burakov, Anatoly, dev

On Mon, Apr 6, 2020 at 9:15 PM Haeuptle, Michael
<michael.haeuptle@hpe.com> wrote:
> This is my first submission... how do I get the fixes reference?

I recommend reading: https://doc.dpdk.org/guides/contributing/patches.html
But let me help you this time.

For the missing Fixes: tag, we want to get the sha1 of the commit that
first exhibited the issue.
For this you must either use git bisect (not always possible) or go
back in the git history and find the culprit commit.
git log -p --follow and git blame can help identify it (just be
cautious and skip purely cosmetic changes if any).

In your case, I *think* the commit is
340b7bb8d583661369a9491ade63fe2407e85267, since it introduced the
vfio_open_group_fd() function and the check on its return value.
I did not see changes between this commit and origin/master wrt to
vfio_open_group_fd() return values.
It might have been older than this, the vfio maintainer can help confirm.
Anatoly, can you double check :-) ?


For the formatting of the Fixes: tag, you have a git macro in
https://doc.dpdk.org/guides/contributing/patches.html#commit-messages-body
Which then gives:
$ git fixline 340b7bb8d58
Fixes: 340b7bb8d583 ("vfio: extend data structure for multi container")

>
> Back porting it to 19.11 would be great. This issue shows up in SPDK 20.01, which uses 19.11.

For the criterias on what should be backported:
https://doc.dpdk.org/guides/contributing/stable.html#what-changes-should-be-backported.

You can annotate a patch with a version, but it will be informational only.
What is important is to Cc: stable@dpdk.org so that stable maintainers
see this fix and consider it for backport in the currently maintained
branches (atm 18.11 and 19.11).
The Fixes: line already gives an idea of which branches are concerned.
The stable maintainers have scripts to catch fixes of interest for
them (+ those scripts also check if a change is a fix of a previous
fix that got backported itself).

Here, if the above Fixes: is correct, it means this backport is a
candidate to backport to 18.11 and 19.11.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
  2020-04-06 22:23 Michael Haeuptle
@ 2020-04-21 16:19 ` David Marchand
  0 siblings, 0 replies; 10+ messages in thread
From: David Marchand @ 2020-04-21 16:19 UTC (permalink / raw)
  To: Michael Haeuptle; +Cc: dev, Xiao Wang, dpdk stable

On Tue, Apr 7, 2020 at 12:23 AM Michael Haeuptle
<michael.haeuptle@hpe.com> wrote:
>
> This fix treats a 0 return value from vfio_open_group_fd
> in vfio_get_group_fd as the intended error condition instead
> of putting an incorrect 0 file descriptor in the vfio_group table.
>
> Sometimes, the creation of device files in sysfs is not
> instantaneously causing vfio_open_groupfd to return 0.
> This has been observed when hot removing/adding multiple
> NVMe devices (>=4).
>
> Fixes: 340b7bb8d583 ("vfio: extend data structure for multi container")
> Cc: stable@dpdk.org
>
> Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>

Please submit with a revision next time.
Added back acks from first revision of the patch.

Applied, thanks.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
@ 2020-04-06 22:23 Michael Haeuptle
  2020-04-21 16:19 ` David Marchand
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Haeuptle @ 2020-04-06 22:23 UTC (permalink / raw)
  To: dev; +Cc: xiao.w.wang, stable

This fix treats a 0 return value from vfio_open_group_fd
in vfio_get_group_fd as the intended error condition instead
of putting an incorrect 0 file descriptor in the vfio_group table.

Sometimes, the creation of device files in sysfs is not
instantaneously causing vfio_open_groupfd to return 0.
This has been observed when hot removing/adding multiple
NVMe devices (>=4).

Fixes: 340b7bb8d583 ("vfio: extend data structure for multi container")
Cc: xiao.w.wang@intel.com
Cc: stable@dpdk.org

Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
---
 lib/librte_eal/linux/eal_vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
index 4502aefed..1979f6fdd 100644
--- a/lib/librte_eal/linux/eal_vfio.c
+++ b/lib/librte_eal/linux/eal_vfio.c
@@ -379,7 +379,7 @@ vfio_get_group_fd(struct vfio_config *vfio_cfg,
 	}
 
 	vfio_group_fd = vfio_open_group_fd(iommu_group_num);
-	if (vfio_group_fd < 0) {
+	if (vfio_group_fd <= 0) {
 		RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
 		return -1;
 	}
-- 
2.18.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
@ 2020-04-06 22:16 Michael Haeuptle
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Haeuptle @ 2020-04-06 22:16 UTC (permalink / raw)
  To: dev; +Cc: stable

This fix treats a 0 return value from vfio_open_group_fd
in vfio_get_group_fd as the intended error condition instead
of putting an incorrect 0 file descriptor in the vfio_group table.

Sometimes, the creation of device files in sysfs is not
instantaneously causing vfio_open_groupfd to return 0.
This has been observed when hot removing/adding multiple
NVMe devices (>=4).

Fixes: 340b7bb8d583 ("vfio: extend data structure for multi container")
Cc: stable@dpdk.org

Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
---
 lib/librte_eal/linux/eal_vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
index 4502aefed..1979f6fdd 100644
--- a/lib/librte_eal/linux/eal_vfio.c
+++ b/lib/librte_eal/linux/eal_vfio.c
@@ -379,7 +379,7 @@ vfio_get_group_fd(struct vfio_config *vfio_cfg,
 	}
 
 	vfio_group_fd = vfio_open_group_fd(iommu_group_num);
-	if (vfio_group_fd < 0) {
+	if (vfio_group_fd <= 0) {
 		RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
 		return -1;
 	}
-- 
2.18.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
@ 2020-03-31 16:51 Michael Haeuptle
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Haeuptle @ 2020-03-31 16:51 UTC (permalink / raw)
  To: dev

This fix treats a 0 return value from vfio_open_group_fd
in vfio_get_group_fd as the intended error condition instead
of putting an incorrect 0 file descriptor in the vfio_group table.

Sometimes, the creation of device files in sysfs is not
instantaneously causing vfio_open_groupfd to return 0.
This has been observed when hot removing/adding multiple
NVMe devices (>=4).

Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
---
 lib/librte_eal/linux/eal_vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
index 4502aefed..1979f6fdd 100644
--- a/lib/librte_eal/linux/eal_vfio.c
+++ b/lib/librte_eal/linux/eal_vfio.c
@@ -379,7 +379,7 @@ vfio_get_group_fd(struct vfio_config *vfio_cfg,
 	}
 
 	vfio_group_fd = vfio_open_group_fd(iommu_group_num);
-	if (vfio_group_fd < 0) {
+	if (vfio_group_fd <= 0) {
 		RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
 		return -1;
 	}
-- 
2.18.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-04-21 16:19 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-31 16:56 [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition Michael Haeuptle
2020-04-01  8:50 ` Stojaczyk, Dariusz
2020-04-02 10:10 ` Burakov, Anatoly
2020-04-06 13:25   ` David Marchand
2020-04-06 19:15     ` Haeuptle, Michael
2020-04-06 20:08       ` David Marchand
  -- strict thread matches above, loose matches on Subject: below --
2020-04-06 22:23 Michael Haeuptle
2020-04-21 16:19 ` David Marchand
2020-04-06 22:16 Michael Haeuptle
2020-03-31 16:51 Michael Haeuptle

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).