patches for DPDK stable branches
* Early backport of Vhost regression fix in LTS branches
From: Maxime Coquelin @ 2022-09-20  9:36 UTC
  To: stable, Kevin Traynor, Luca Boccassi, Christian Ehrhardt, Xueming Li
  Cc: Michael Phelan

Hi LTS maintainers,

We have discovered a regression that causes a deadlock in applications
using the Vhost library (when vIOMMU is used and NUMA reallocation happens).

The faulty commit [0] was backported to all maintained LTS branches; the
following minor releases are impacted:
- v21.11.1+
- v20.11.5+
- v19.11.12+

The fix for this regression is already in the main branch and will be part
of the upcoming v22.11 release.

In a discussion, Kevin suggested backporting the fix early to all the
LTS branches.

Below is the fix to be backported:

======================================================================
commit 0b2a2ca35037d6a5168f0832c11d9858b8ae946a
Author: David Marchand <david.marchand@redhat.com>
Date:   Mon Jul 25 22:32:03 2022 +0200

     vhost: fix virtqueue use after free on NUMA reallocation

     translate_ring_addresses (via numa_realloc) may change a virtio device and
     virtio queue.
     The virtqueue object must be refreshed before accessing the lock.

     Fixes: 04c27cb673b9 ("vhost: fix unsafe vring addresses modifications")
     Cc: stable@dpdk.org

     Signed-off-by: David Marchand <david.marchand@redhat.com>
     Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
======================================================================
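
For readers unfamiliar with the failure mode, here is a minimal C sketch of
the pattern the commit message describes. It is not the DPDK code: toy_vq,
toy_dev, toy_numa_realloc, toy_translate and toy_set_vring_addr are
hypothetical stand-ins, and the lock is modelled as a plain flag rather than
a real spinlock. It only illustrates why a pointer taken before a possible
reallocation has to be re-read before the lock is touched again.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the vhost structures; not the DPDK definitions. */
struct toy_vq {
    int locked;    /* models the virtqueue access lock as a simple flag */
    int ring_addr; /* models the translated ring address */
};

struct toy_dev {
    struct toy_vq *vq[1];
};

/* Mimics a NUMA reallocation: copy the queue into a new allocation,
 * publish the copy in the device and free the original. */
static void toy_numa_realloc(struct toy_dev *dev, int idx)
{
    struct toy_vq *old = dev->vq[idx];
    struct toy_vq *fresh = malloc(sizeof(*fresh));

    if (fresh == NULL)
        return; /* keep using the old queue if allocation fails */
    memcpy(fresh, old, sizeof(*fresh)); /* the held lock state is copied too */
    dev->vq[idx] = fresh;
    free(old);
}

/* Stand-in for translate_ring_addresses(): it may move the queue. */
static void toy_translate(struct toy_dev *dev, int idx, int addr)
{
    toy_numa_realloc(dev, idx);
    dev->vq[idx]->ring_addr = addr;
}

static void toy_set_vring_addr(struct toy_dev *dev, int idx, int addr)
{
    struct toy_vq *vq = dev->vq[idx];

    vq->locked = 1; /* take the lock */
    toy_translate(dev, idx, addr);
    /* vq may now point to freed memory: refresh it before unlocking. */
    vq = dev->vq[idx];
    vq->locked = 0; /* release the lock on the live object */
}
```

In this toy model, skipping the re-read would clear the flag in freed memory
and leave the live copy marked as locked, which is one way such a bug could
show up as a deadlock for the next code path that takes the lock.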

The fix can be backported without conflicts to all the LTS branches
except v19.11, where the Vhost directory rename can cause issues. This
can be worked around with the command below:

git cherry-pick -Xfind-renames=5% 0b2a2ca350

Is that OK with you?

Please let me know if there are any issues.

Thanks,
Maxime

[0]: https://git.dpdk.org/dpdk/commit/?id=04c27cb673b9



* Re: Early backport of Vhost regression fix in LTS branches
From: Kevin Traynor @ 2022-09-20 11:03 UTC
  To: Maxime Coquelin, stable, Luca Boccassi, Christian Ehrhardt, Xueming Li
  Cc: Michael Phelan

On 20/09/2022 10:36, Maxime Coquelin wrote:
> Hi LTS maintainers,
> 
> We have discovered a regression causing deadlock in application using
> the Vhost library (when vIOMMU is used & NUMA reallocation happens).
> 
> The faulty commit [0] got backported in all maintained LTS branches,
> following minor releases are impacted:
> - V21.11.1+
> - V20.11.5+
> - V19.11.12+
> 
> The fix for this regression is already in main branch, and will be part
> of next v22.11 release.
> 
> Discussing with Kevin, he suggested the fix to be backported early to
> all the LTS branches.
> 

This issue is a deadlock likely to occur with an application such as
OVS that uses vIOMMU vhost ports on a multi-NUMA system.

In normal circumstances, for example with OVS, we could just recommend
that users not upgrade to the latest DPDK LTS releases until the issue is
fixed. Where this one gets tricky is that the latest LTS releases
contain CVE fixes.

At the moment, if a user wants the CVE fixes *and* the deadlock fix
below, they have to pick patches themselves. It might help some users if
the DPDK stable branches (which are still at the last release point)
backport the fix below early, so that a user can just pull the branch.

> Below is the fix to be backported:
> 
> ======================================================================
> commit 0b2a2ca35037d6a5168f0832c11d9858b8ae946a
> Author: David Marchand <david.marchand@redhat.com>
> Date:   Mon Jul 25 22:32:03 2022 +0200
> 
>       vhost: fix virtqueue use after free on NUMA reallocation
> 
>       translate_ring_addresses (via numa_realloc) may change a virtio device and
>       virtio queue.
>       The virtqueue object must be refreshed before accessing the lock.
> 
>       Fixes: 04c27cb673b9 ("vhost: fix unsafe vring addresses modifications")
>       Cc: stable@dpdk.org
> 
>       Signed-off-by: David Marchand <david.marchand@redhat.com>
>       Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ======================================================================
> 
> The fix can be backported without conflicts to all the LTS branches,
> except for v19.11, for which the Vhost directory rename can cause
> issues. It can be overcome using below command:
> 
> git cherry-pick -Xfind-renames=5% 0b2a2ca350
> 
> Is that OK for you?
> 

I can take care of it for all branches if other maintainers are busy and 
ok with that.

thanks,
Kevin.

> Please let me know if any issue.
> 
> Thanks,
> Maxime
> 
> [0]: https://git.dpdk.org/dpdk/commit/?id=04c27cb673b9
> 



* Re: Early backport of Vhost regression fix in LTS branches
From: Luca Boccassi @ 2022-09-20 11:10 UTC
  To: Kevin Traynor
  Cc: Maxime Coquelin, stable, Christian Ehrhardt, Xueming Li, Michael Phelan

On Tue, 20 Sept 2022 at 12:03, Kevin Traynor <ktraynor@redhat.com> wrote:
> I can take care of it for all branches if other maintainers are busy and
> ok with that.

Sounds good to me, feel free to go ahead for 20.11, thank you.

Kind regards,
Luca Boccassi


* Re: Early backport of Vhost regression fix in LTS branches
From: Kevin Traynor @ 2022-09-23 14:53 UTC
  To: Luca Boccassi, Christian Ehrhardt
  Cc: Maxime Coquelin, stable, Xueming Li, Michael Phelan

On 20/09/2022 12:10, Luca Boccassi wrote:
> 
> Sounds good to me, feel free to go ahead for 20.11, thank you.
> 

20.11 and 21.11 are done.

@Christian, I wasn't able to get in touch; not sure if you are on PTO, etc.
I don't see any big risk: this is just backporting a patch that would be
backported in a couple of months anyway. So let's say I will push to the
19.11 branch on Monday if there are no objections.

thanks,
Kevin.

> Kind regards,
> Luca Boccassi
> 



* Re: Early backport of Vhost regression fix in LTS branches
From: Kevin Traynor @ 2022-09-26 16:47 UTC
  To: Luca Boccassi, Christian Ehrhardt
  Cc: Maxime Coquelin, stable, Xueming Li, Michael Phelan

On 23/09/2022 15:53, Kevin Traynor wrote:
> 
> 20.11 and 21.11 are done.
> 
> @Christian, wasn't able to get in touch, not sure if you are on PTO etc.
> I don't see any big risk, this is just backporting a patch that would be
> backported in a couple of months anyway. So let's say I will push to the
> 19.11 branch on Monday if there are no objections.
> 

Pushed to 19.11 branch, thanks.

> thanks,
> Kevin.
> 
>> Kind regards,
>> Luca Boccassi
>>
> 

