patches for DPDK stable branches
* [dpdk-stable] [PATCH v2] eal/service: fix exit by resetting service lcores
       [not found] <20200310133304.39951-1-harry.van.haaren@intel.com>
@ 2020-03-11 14:39 ` Harry van Haaren
  2020-03-11 16:15   ` David Marchand
  2020-03-13 10:04   ` David Marchand
  0 siblings, 2 replies; 7+ messages in thread
From: Harry van Haaren @ 2020-03-11 14:39 UTC
  To: dev; +Cc: david.marchand, aconole, Harry van Haaren, stable

This commit releases all service cores from their role,
returning them to ROLE_RTE on rte_service_finalize().

This may fix an issue relating to the service cores causing
a race condition in eal_cleanup(), where a service core
could still be executing after the main thread has already
freed the service memory, leading to a segfault.

Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Reported-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>

---

v2:
- Added rte_eal_mp_wait_lcore() after reset (David)
- Added Signed-off and Acked from mailing list (David, Aaron)

---
 lib/librte_eal/common/rte_service.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
index 7e537b8cd..b0b78baab 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -122,6 +122,9 @@ rte_service_finalize(void)
 	if (!rte_service_library_initialized)
 		return;
 
+	rte_service_lcore_reset_all();
+	rte_eal_mp_wait_lcore();
+
 	rte_free(rte_services);
 	rte_free(lcore_states);
 
-- 
2.17.1



* Re: [dpdk-stable] [PATCH v2] eal/service: fix exit by resetting service lcores
  2020-03-11 14:39 ` [dpdk-stable] [PATCH v2] eal/service: fix exit by resetting service lcores Harry van Haaren
@ 2020-03-11 16:15   ` David Marchand
  2020-03-11 16:21     ` Van Haaren, Harry
  2020-03-11 17:08     ` Aaron Conole
  2020-03-13 10:04   ` David Marchand
  1 sibling, 2 replies; 7+ messages in thread
From: David Marchand @ 2020-03-11 16:15 UTC
  To: Harry van Haaren; +Cc: dev, Aaron Conole, dpdk stable

On Wed, Mar 11, 2020 at 3:39 PM Harry van Haaren
<harry.van.haaren@intel.com> wrote:
>
> This commit releases all service cores from their role,
> returning them to ROLE_RTE on rte_service_finalize().
>
> This may fix an issue relating to the service cores causing

You don't seem convinced.


> a race-condition on eal_cleanup(), where the service core
> could still be executing while the main thread has already
> free-d the service memory, leading to a segfault.
>
> Fixes: 21698354c832 ("service: introduce service cores concept")
> Cc: stable@dpdk.org
>
> Reported-by: David Marchand <david.marchand@redhat.com>
> Reported-by: Aaron Conole <aconole@redhat.com>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> Acked-by: Aaron Conole <aconole@redhat.com>

I am okay with merging this so that we stop getting random failures of the ut.
I will leave this patch on the ml and apply it on Friday at worst.

Please take the time to reply to my question.
Thanks.


-- 
David Marchand



* Re: [dpdk-stable] [PATCH v2] eal/service: fix exit by resetting service lcores
  2020-03-11 16:15   ` David Marchand
@ 2020-03-11 16:21     ` Van Haaren, Harry
  2020-03-12  8:59       ` David Marchand
  2020-03-11 17:08     ` Aaron Conole
  1 sibling, 1 reply; 7+ messages in thread
From: Van Haaren, Harry @ 2020-03-11 16:21 UTC
  To: David Marchand; +Cc: dev, Aaron Conole, dpdk stable

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Wednesday, March 11, 2020 4:16 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: dev <dev@dpdk.org>; Aaron Conole <aconole@redhat.com>; dpdk stable
> <stable@dpdk.org>
> Subject: Re: [PATCH v2] eal/service: fix exit by resetting service lcores
> 
> On Wed, Mar 11, 2020 at 3:39 PM Harry van Haaren
> <harry.van.haaren@intel.com> wrote:
> >
> > This commit releases all service cores from their role,
> > returning them to ROLE_RTE on rte_service_finalize().
> >
> > This may fix an issue relating to the service cores causing
> 
> You don't seem convinced.

Apologies - kept from v1 of commit message, should have removed "may" for v2.

The issue was that service cores can remain running after the main
thread has freed the service-core memory; the later, racy return of a
service lcore then causes a use-after-free.

This commit fixes it by
A) resetting all service cores to return
B) waiting for them to return
C) freeing memory

I am confident in the fix.
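
The A/B/C ordering above can be sketched outside of DPDK. The following
is only an analogy using plain pthreads and C11 atomics, not the DPDK
implementation: the names worker_ctx, worker_loop and shutdown_demo are
hypothetical, the "run" flag stands in for the service lcore role, and
pthread_join() stands in for rte_eal_mp_wait_lcore().

```c
/* Illustrative analogy only: plain pthreads + C11 atomics, not DPDK code. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical context shared between the main thread and one worker. */
struct worker_ctx {
	atomic_int run;      /* 1 while the worker is allowed to keep looping */
	atomic_int stopped;  /* set to 1 by the worker just before it returns */
	atomic_ulong *state; /* heap memory the worker touches on every loop */
};

static void *worker_loop(void *arg)
{
	struct worker_ctx *ctx = arg;

	/* Keeps touching the shared heap memory until asked to stop. */
	while (atomic_load(&ctx->run))
		atomic_fetch_add(ctx->state, 1);

	atomic_store(&ctx->stopped, 1);
	return NULL;
}

/* Returns 0 if the stop / wait / free sequence completed. */
int shutdown_demo(void)
{
	struct worker_ctx ctx;
	pthread_t tid;

	ctx.state = malloc(sizeof(*ctx.state));
	if (ctx.state == NULL)
		return -1;
	atomic_init(ctx.state, 0);
	atomic_init(&ctx.run, 1);
	atomic_init(&ctx.stopped, 0);

	if (pthread_create(&tid, NULL, worker_loop, &ctx) != 0) {
		free(ctx.state);
		return -1;
	}

	/* A) ask the worker to leave its loop */
	atomic_store(&ctx.run, 0);

	/* B) wait for it to actually return */
	if (pthread_join(tid, NULL) != 0)
		return -1;
	assert(atomic_load(&ctx.stopped) == 1);

	/* C) only now is it safe to free what the worker was using */
	free(ctx.state);
	return 0;
}
```

Calling shutdown_demo() from a main() and building with -pthread is
enough to try it. Moving the free() ahead of step B reproduces the kind
of use-after-free described above, since the worker may still be inside
its loop at that point.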


> > a race-condition on eal_cleanup(), where the service core
> > could still be executing while the main thread has already
> > free-d the service memory, leading to a segfault.
> >
> > Fixes: 21698354c832 ("service: introduce service cores concept")
> > Cc: stable@dpdk.org
> >
> > Reported-by: David Marchand <david.marchand@redhat.com>
> > Reported-by: Aaron Conole <aconole@redhat.com>
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> > Acked-by: Aaron Conole <aconole@redhat.com>
> 
> I am okay with merging this so that we stop getting random failures of the
> ut. I will let this patch on the ml and apply on Friday at worse.
> 
> Please take the time to reply to my question.
> Thanks.

Thanks, -Harry


* Re: [dpdk-stable] [PATCH v2] eal/service: fix exit by resetting service lcores
  2020-03-11 16:15   ` David Marchand
  2020-03-11 16:21     ` Van Haaren, Harry
@ 2020-03-11 17:08     ` Aaron Conole
  2020-03-12  9:03       ` David Marchand
  1 sibling, 1 reply; 7+ messages in thread
From: Aaron Conole @ 2020-03-11 17:08 UTC
  To: David Marchand; +Cc: Harry van Haaren, dev, dpdk stable

David Marchand <david.marchand@redhat.com> writes:

> On Wed, Mar 11, 2020 at 3:39 PM Harry van Haaren
> <harry.van.haaren@intel.com> wrote:
>>
>> This commit releases all service cores from their role,
>> returning them to ROLE_RTE on rte_service_finalize().
>>
>> This may fix an issue relating to the service cores causing
>
> You don't seem convinced.
>
>
>> a race-condition on eal_cleanup(), where the service core
>> could still be executing while the main thread has already
>> free-d the service memory, leading to a segfault.
>>
>> Fixes: 21698354c832 ("service: introduce service cores concept")
>> Cc: stable@dpdk.org
>>
>> Reported-by: David Marchand <david.marchand@redhat.com>
>> Reported-by: Aaron Conole <aconole@redhat.com>
>> Signed-off-by: David Marchand <david.marchand@redhat.com>
>> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
>> Acked-by: Aaron Conole <aconole@redhat.com>
>
> I am okay with merging this so that we stop getting random failures of the ut.

I think the race could also cause errors in user applications that
exit regularly and use the service core architecture.  So it's
worth getting the fix in now, anyway.

> I will let this patch on the ml and apply on Friday at worse.
>
> Please take the time to reply to my question.
> Thanks.



* Re: [dpdk-stable] [PATCH v2] eal/service: fix exit by resetting service lcores
  2020-03-11 16:21     ` Van Haaren, Harry
@ 2020-03-12  8:59       ` David Marchand
  0 siblings, 0 replies; 7+ messages in thread
From: David Marchand @ 2020-03-12  8:59 UTC
  To: Van Haaren, Harry; +Cc: dev, Aaron Conole, dpdk stable

Hello,

On Wed, Mar 11, 2020 at 5:21 PM Van Haaren, Harry
<harry.van.haaren@intel.com> wrote:
> Issue was that service cores can remain running while main thread
> has freed service-core memory, later racy return of service lcore
> then causes use-after-free.
>
> This commit fixes it by
> A) resetting all service cores to return
> B) waiting for them to return
> C) freeing memory
>
> I am confident in the fix.

Ok.

> > > a race-condition on eal_cleanup(), where the service core
> > > could still be executing while the main thread has already
> > > free-d the service memory, leading to a segfault.
> > >
> > > Fixes: 21698354c832 ("service: introduce service cores concept")

The race per se was introduced with:
da23f0aa87d8 ("service: fix memory leak with new function")


-- 
David Marchand



* Re: [dpdk-stable] [PATCH v2] eal/service: fix exit by resetting service lcores
  2020-03-11 17:08     ` Aaron Conole
@ 2020-03-12  9:03       ` David Marchand
  0 siblings, 0 replies; 7+ messages in thread
From: David Marchand @ 2020-03-12  9:03 UTC
  To: Aaron Conole; +Cc: Harry van Haaren, dev, dpdk stable

On Wed, Mar 11, 2020 at 6:08 PM Aaron Conole <aconole@redhat.com> wrote:
>
> David Marchand <david.marchand@redhat.com> writes:
>
> > On Wed, Mar 11, 2020 at 3:39 PM Harry van Haaren
> > <harry.van.haaren@intel.com> wrote:
> >>
> >> This commit releases all service cores from their role,
> >> returning them to ROLE_RTE on rte_service_finalize().
> >>
> >> This may fix an issue relating to the service cores causing
> >
> > You don't seem convinced.
> >
> >
> >> a race-condition on eal_cleanup(), where the service core
> >> could still be executing while the main thread has already
> >> free-d the service memory, leading to a segfault.
> >>
> >> Fixes: 21698354c832 ("service: introduce service cores concept")
> >> Cc: stable@dpdk.org
> >>
> >> Reported-by: David Marchand <david.marchand@redhat.com>
> >> Reported-by: Aaron Conole <aconole@redhat.com>
> >> Signed-off-by: David Marchand <david.marchand@redhat.com>
> >> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> >> Acked-by: Aaron Conole <aconole@redhat.com>
> >
> > I am okay with merging this so that we stop getting random failures of the ut.
>
> I think it could also potentially cause errors in user applications that
> regularly exit, and which use the service core architecture.  So it's
> worth getting in now, anyway.

Indeed, thanks for the clarification.

In my defense, we did not get reports of such crashes outside the CI.
The CI is the main reason why I (selfishly :-)) have been pressing on
this issue.


-- 
David Marchand



* Re: [dpdk-stable] [PATCH v2] eal/service: fix exit by resetting service lcores
  2020-03-11 14:39 ` [dpdk-stable] [PATCH v2] eal/service: fix exit by resetting service lcores Harry van Haaren
  2020-03-11 16:15   ` David Marchand
@ 2020-03-13 10:04   ` David Marchand
  1 sibling, 0 replies; 7+ messages in thread
From: David Marchand @ 2020-03-13 10:04 UTC
  To: Harry van Haaren; +Cc: dev, Aaron Conole, dpdk stable

On Wed, Mar 11, 2020 at 3:39 PM Harry van Haaren
<harry.van.haaren@intel.com> wrote:
>
> This commit releases all service cores from their role,
> returning them to ROLE_RTE on rte_service_finalize().
>
> This may fix an issue relating to the service cores causing

s/may fix/fixes/

> a race-condition on eal_cleanup(), where the service core
> could still be executing while the main thread has already
> free-d the service memory, leading to a segfault.
>
> Fixes: 21698354c832 ("service: introduce service cores concept")

Replaced with:
Fixes: da23f0aa87d8 ("service: fix memory leak with new function")

> Cc: stable@dpdk.org
>
> Reported-by: David Marchand <david.marchand@redhat.com>
> Reported-by: Aaron Conole <aconole@redhat.com>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> Acked-by: Aaron Conole <aconole@redhat.com>

Applied, thanks.


-- 
David Marchand

