* [dpdk-dev] Service lcores and Application lcores
@ 2017-06-29 14:36 Van Haaren, Harry
2017-06-29 15:16 ` Thomas Monjalon
2017-06-29 15:57 ` Bruce Richardson
0 siblings, 2 replies; 19+ messages in thread
From: Van Haaren, Harry @ 2017-06-29 14:36 UTC (permalink / raw)
To: dev; +Cc: 'Jerin Jacob', thomas, Wiles, Keith, Richardson, Bruce
Hi All,
The recently posted service cores patchset[1] introduces service lcores to run services for DPDK applications. A service is just a callback function that a core invokes, e.g. eventdev scheduling, NIC RX, statistics and monitoring. An atomic ensures that services that are not multi-thread safe are never invoked concurrently.
The topic of discussion in this thread is how we can ensure that application lcores do not interfere with service cores. I have a solution described below, opinions welcome.
Regards, -Harry
PS: This discussion extends that in the ML thread here[2], participants of that thread added to CC.
[1] Service Cores v2 patchset http://dpdk.org/dev/patchwork/bundle/hvanhaar/service_cores_v2/
[2] http://dpdk.org/ml/archives/dev/2017-June/069290.html
________________________
A proposal for Eventdev, to ensure Service lcores and Application lcores play nice;
1) Application lcores must not directly call rte_eventdev_schedule()
2A) Service cores are the proper method to run services
2B) If an application insists on running a service "manually" on an app lcore, we provide a function for that:
rte_service_run_from_app_lcore(struct service *srv);
The above function would allow a pesky app to run services on its own (non-service core) lcores, but does so through the service-core framework, allowing the service-library atomic to keep access serialized as required for non-multi-thread-safe services.
The above solution maintains the option of running the eventdev PMD as now (single-core dedicated to a single service), while providing correct serialization by using the rte_service_run_from_app_lcore() function. Given the atomic is only used when required (multiple cores mapped to the service) there should be no performance delta.
Given that the application should not invoke rte_eventdev_schedule(), we could even consider removing it from the Eventdev API. A PMD that requires cycles registers a service, and an application can use a service core or the run_from_app_lcore() function if it wishes to invoke that service on an application owned lcore.
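To make the serialization idea concrete, here is a minimal sketch of how a run-from-app-lcore function could guard a callback with a per-service atomic, and skip the atomic when only one lcore is mapped. The struct layout, the field names (mt_safe, num_mapped, running) and the function name are assumptions for illustration, not the patchset's code, and num_mapped is assumed not to change while the service runs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the service library's per-service state. */
struct service {
    int (*cb)(void *userdata);  /* the service callback */
    void *userdata;
    int mt_safe;                /* non-zero: cb may run on several lcores at once */
    unsigned int num_mapped;    /* lcores (service or app) that may run cb */
    atomic_flag running;        /* set while some lcore is inside cb */
};

/*
 * Run one iteration of the service on the calling lcore.
 * Returns 0 if the callback ran, -1 if another lcore was already running it.
 */
static int service_run_from_app_lcore(struct service *s)
{
    assert(s != NULL && s->cb != NULL);

    /* Only pay for the atomic when two lcores could actually collide. */
    const int need_guard = !s->mt_safe && s->num_mapped > 1;

    if (need_guard &&
        atomic_flag_test_and_set_explicit(&s->running, memory_order_acquire))
        return -1;              /* busy elsewhere: skip this iteration */

    s->cb(s->userdata);

    if (need_guard)
        atomic_flag_clear_explicit(&s->running, memory_order_release);
    return 0;
}

/* Example callback: counts how often it was invoked. */
static int count_calls(void *userdata)
{
    ++*(int *)userdata;
    return 0;
}
```

A caller that gets -1 simply tries again on its next loop iteration; nothing blocks, so an application lcore never stalls behind a service core.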
Opinions?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-29 14:36 [dpdk-dev] Service lcores and Application lcores Van Haaren, Harry
@ 2017-06-29 15:16 ` Thomas Monjalon
2017-06-29 16:35 ` Van Haaren, Harry
2017-06-29 15:57 ` Bruce Richardson
1 sibling, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2017-06-29 15:16 UTC (permalink / raw)
To: Van Haaren, Harry
Cc: dev, 'Jerin Jacob', Wiles, Keith, Richardson, Bruce
29/06/2017 16:36, Van Haaren, Harry:
> The topic of discussion in this thread is how we can ensure
> that application lcores do not interfere with service cores.
Please could you give more details on the issue?
I think you need to clearly explain the 3 types of cores:
- DPDK mainloop
- DPDK services
- not used by DPDK
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-29 14:36 [dpdk-dev] Service lcores and Application lcores Van Haaren, Harry
2017-06-29 15:16 ` Thomas Monjalon
@ 2017-06-29 15:57 ` Bruce Richardson
2017-06-30 4:45 ` Jerin Jacob
1 sibling, 1 reply; 19+ messages in thread
From: Bruce Richardson @ 2017-06-29 15:57 UTC (permalink / raw)
To: Van Haaren, Harry; +Cc: dev, 'Jerin Jacob', thomas, Wiles, Keith
On Thu, Jun 29, 2017 at 03:36:04PM +0100, Van Haaren, Harry wrote:
> Hi All,
>
>
> The recently posted service cores patchset[1], introduces service lcores to run services for DPDK applications. Services are just an ordinary function for eg: eventdev scheduling, NIC RX, statistics and monitoring, etc. A service is just a callback function, which a core invokes. An atomic ensures that services that are
> non-multi-thread-safe are never concurrently invoked.
>
> The topic of discussion in this thread is how we can ensure that application lcores do not interfere with service cores. I have a solution described below, opinions welcome.
>
>
> Regards, -Harry
>
>
> PS: This discussion extends that in the ML thread here[2], participants of that thread added to CC.
>
> [1] Service Cores v2 patchset http://dpdk.org/dev/patchwork/bundle/hvanhaar/service_cores_v2/
> [2] http://dpdk.org/ml/archives/dev/2017-June/069290.html
>
>
> ________________________
>
>
>
> A proposal for Eventdev, to ensure Service lcores and Application lcores play nice;
>
> 1) Application lcores must not directly call rte_eventdev_schedule()
> 2A) Service cores are the proper method to run services
> 2B) If an application insists on running a service "manually" on an app lcore, we provide a function for that:
> rte_service_run_from_app_lcore(struct service *srv);
>
> The above function would allow a pesky app to run services on its own (non-service core) lcores, but
> does so through the service-core framework, allowing the service-library atomic to keep access serialized as required for non-multi-thread-safe services.
>
> The above solution maintains the option of running the eventdev PMD as now (single-core dedicated to a single service), while providing correct serialization by using the rte_service_run_from_app_lcore() function. Given the atomic is only used when required (multiple cores mapped to the service) there should be no performance delta.
>
> Given that the application should not invoke rte_eventdev_schedule(), we could even consider removing it from the Eventdev API. A PMD that requires cycles registers a service, and an application can use a service core or the run_from_app_lcore() function if it wishes to invoke that service on an application owned lcore.
>
>
> Opinions?
I would be in favour of this proposal, except for the proposed name for
the new function. It would be useful for an app to be able to "adopt" a
service into its main loop if so desired. If we do this, I think I'd
also support the removal of a dedicated schedule call from the eventdev
API, or alternatively, if it is needed by other PMDs, leave it as a
no-op in the sw PMD in favour of the service-cores managed function.
/Bruce
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-29 15:16 ` Thomas Monjalon
@ 2017-06-29 16:35 ` Van Haaren, Harry
2017-06-29 20:18 ` Thomas Monjalon
0 siblings, 1 reply; 19+ messages in thread
From: Van Haaren, Harry @ 2017-06-29 16:35 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, 'Jerin Jacob', Wiles, Keith, Richardson, Bruce
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, June 29, 2017 4:16 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: dev@dpdk.org; 'Jerin Jacob' <jerin.jacob@caviumnetworks.com>; Wiles, Keith
> <keith.wiles@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: Service lcores and Application lcores
>
> 29/06/2017 16:36, Van Haaren, Harry:
> > The topic of discussion in this thread is how we can ensure
> > that application lcores do not interfere with service cores.
>
> Please could you give more details on the issue?
Sure, hope I didn't write too much!
> I think you need to clearly explain the 3 types of cores:
> - DPDK mainloop
> - DPDK services
> - not used by DPDK
DPDK cores continue to function as they currently do, with the exception that service cores are removed from the coremask. Details in 0) below.
DPDK service cores run services; they are not visible to the application (i.e. the application does not perform any remote_launch() on these cores, that is handled internally in EAL). Service lcores are just normal lcores, except that lcore_config[lcore_id].core_role == ROLE_SERVICE instead of ROLE_RTE.
Non DPDK cores are not changed.
I'll run through the following scenarios to detail the problem;
0) Explain where service cores come from in relation to non-DPDK cores
1) Describe the current usage of DPDK cores, and how the eventdev scheduler is used
2) Introduce the service-core-only usage of eventdev
3) Introduce the problem: service cores and application cores concurrently run a multi-thread unsafe service
4) The proposed solution
0) At application startup, the EAL coremask detects DPDK cores, and "brings them up".
Service cores are "stolen" from the previous core-mask, so the service-core mask is a subset of the EAL coremask.
Service cores are marked as ROLE_SERVICE, and the application "FOR_EACH_LCORE" will not use them.
Non-DPDK cores are not affected - they remain as they were.
1) Currently, a DPDK application directly invokes rte_eventdev_schedule() using an ordinary app lcore.
The application is responsible for multiplexing work on cores (assuming multiple PMDs are running on one core).
The app logic *must* be updated if we wish to add more SW workloads (aka, using a SW PMD instead of HW acceleration).
This change in app logic is the workaround to DPDK failing to abstract HW / SW PMD requirements.
Service cores abstract the environment difference (SW/HW PMD) away from the application.
2) In a service core "only" situation, the SW PMD registers a service. This service is run on a service core.
The application logic does not have to change, as the service-core running the service is not under app control.
Note that the application does NOT call rte_eventdev_schedule() as in 1) above, since the service core now performs this.
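A minimal sketch of scenario 2), assuming a small static registry: the PMD registers its callback, and a service lcore polls whatever is mapped to it. All names here (service_register, service_map_lcore, service_lcore_poll, sw_sched_stub) are hypothetical and only illustrate the division of labour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SERVICES 8

typedef int (*service_cb)(void *userdata);

struct service_entry {
    service_cb cb;
    void *userdata;
    uint64_t lcore_mask;        /* service lcores that run this service */
};

static struct service_entry services[MAX_SERVICES];
static unsigned int num_services;

/* Called by a PMD that needs CPU cycles. Returns a service id, or -1 if full. */
static int service_register(service_cb cb, void *userdata)
{
    assert(cb != NULL);
    if (num_services == MAX_SERVICES)
        return -1;
    services[num_services].cb = cb;
    services[num_services].userdata = userdata;
    services[num_services].lcore_mask = 0;
    return (int)num_services++;
}

/* Map service `id` onto service lcore `lcore` (done by config, not by the app). */
static void service_map_lcore(int id, unsigned int lcore)
{
    assert(id >= 0 && (unsigned int)id < num_services && lcore < 64);
    services[id].lcore_mask |= UINT64_C(1) << lcore;
}

/* One pass of the loop a service lcore spins in. Returns callbacks invoked. */
static unsigned int service_lcore_poll(unsigned int lcore)
{
    unsigned int ran = 0;
    assert(lcore < 64);
    for (unsigned int i = 0; i < num_services; i++) {
        if (services[i].lcore_mask & (UINT64_C(1) << lcore)) {
            services[i].cb(services[i].userdata);
            ran++;
        }
    }
    return ran;
}

/* Stand-in for the SW PMD's scheduling function. */
static int sw_sched_stub(void *userdata)
{
    ++*(unsigned int *)userdata;
    return 0;
}
```

The application code does not appear anywhere in this sketch, which is the point: adding or removing a SW PMD changes the registry, not the app logic.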
3) The problem;
If a service core runs the SW PMD schedule() function (option 2) *AND*
the application lcore runs schedule() func (option 1), the result is that
two threads are concurrently running a multi-thread unsafe function.
The issue is not that the application is wrong: it correctly called rte_eventdev_schedule().
Nor is the service core infrastructure wrong: it correctly ran the service.
It is the combination of both (and their unawareness of each other) that causes the issue.
4) The proposed solution;
In order to ensure that multiple threads do not concurrently run a multi-thread unsafe service function,
each thread must be aware of when the other threads are running it. The service core code handles this
using an atomic operation per service; multiple service cores operate correctly, no problem.
The root cause of the issue is that the application cores are not using the service atomic.
The rte_service_run() function makes the application aware of what the service cores are running,
because it calls into the service library and runs the service from there. With this additional rule,
all cores (service and application owned) will be aware of each other, and can run multi-thread unsafe
services safely, in a co-operative manner.
In order to allow the application core to still run the eventdev PMD "manually" if it insists,
I am proposing to allow it to invoke a specific function, which is aware of the service
atomic. This will result in all cores "playing nice", regardless of whether they are app or service owned.
The rte_service_run() function (which allows an application-lcore to run a service) allows
much easier porting of applications to the service-core infrastructure. It is easier because
the threading model of the application does not have to change: it looks up the service it
would like to run, and can repeatedly call the rte_service_run() function to have the application
behave in the same way as before the service core addition.
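The porting path described above might look roughly like this from the application's side. service_lookup(), service_run() and app_main_loop() are hypothetical stand-ins, and the atomic check that the real function would perform is deliberately left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct service {
    const char *name;
    int (*cb)(void *userdata);
    void *userdata;
};

/* Look a service up by name in a table; returns NULL if it is not registered. */
static struct service *service_lookup(struct service *table, size_t n,
                                      const char *name)
{
    assert(table != NULL && name != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/*
 * Stand-in for rte_service_run(): the proposed function would take the
 * service atomic before invoking the callback; that part is omitted here.
 */
static int service_run(struct service *s)
{
    assert(s != NULL && s->cb != NULL);
    return s->cb(s->userdata);
}

/* The app's existing main loop, with the direct schedule call swapped out. */
static unsigned int app_main_loop(struct service *sched, unsigned int iterations)
{
    unsigned int work_done = 0;
    for (unsigned int i = 0; i < iterations; i++) {
        service_run(sched);     /* was: a direct call to the schedule function */
        work_done++;            /* placeholder for the app's own per-loop work */
    }
    return work_done;
}

/* Stand-in for the SW PMD's scheduling function. */
static int sched_stub(void *userdata)
{
    ++*(unsigned int *)userdata;
    return 0;
}
```

The loop structure and the number of threads are unchanged; only the call inside the loop differs.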
Ok, this got longer than intended - but hopefully clearly articulates the motivation for the rte_service_run() concept.
Regards, -Harry
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-29 16:35 ` Van Haaren, Harry
@ 2017-06-29 20:18 ` Thomas Monjalon
2017-06-30 8:52 ` Van Haaren, Harry
0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2017-06-29 20:18 UTC (permalink / raw)
To: Van Haaren, Harry
Cc: dev, 'Jerin Jacob', Wiles, Keith, Richardson, Bruce
29/06/2017 18:35, Van Haaren, Harry:
> 3) The problem;
> If a service core runs the SW PMD schedule() function (option 2) *AND*
> the application lcore runs schedule() func (option 1), the result is that
> two threads are concurrently running a multi-thread unsafe function.
Which function is multi-thread unsafe?
Why would the same function be run by the service and by the scheduler?
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-29 15:57 ` Bruce Richardson
@ 2017-06-30 4:45 ` Jerin Jacob
2017-06-30 10:00 ` Van Haaren, Harry
0 siblings, 1 reply; 19+ messages in thread
From: Jerin Jacob @ 2017-06-30 4:45 UTC (permalink / raw)
To: Bruce Richardson; +Cc: Van Haaren, Harry, dev, thomas, Wiles, Keith
-----Original Message-----
> Date: Thu, 29 Jun 2017 16:57:08 +0100
> From: Bruce Richardson <bruce.richardson@intel.com>
> To: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> CC: "dev@dpdk.org" <dev@dpdk.org>, 'Jerin Jacob'
> <jerin.jacob@caviumnetworks.com>, "thomas@monjalon.net"
> <thomas@monjalon.net>, "Wiles, Keith" <keith.wiles@intel.com>
> Subject: Re: Service lcores and Application lcores
> User-Agent: Mutt/1.8.1 (2017-04-11)
>
> On Thu, Jun 29, 2017 at 03:36:04PM +0100, Van Haaren, Harry wrote:
> > Hi All,
> >
> >
> > The recently posted service cores patchset[1], introduces service lcores to run services for DPDK applications. Services are just an ordinary function for eg: eventdev scheduling, NIC RX, statistics and monitoring, etc. A service is just a callback function, which a core invokes. An atomic ensures that services that are
> > non-multi-thread-safe are never concurrently invoked.
> >
> > The topic of discussion in this thread is how we can ensure that application lcores do not interfere with service cores. I have a solution described below, opinions welcome.
> >
> >
> > Regards, -Harry
> >
> >
> > PS: This discussion extends that in the ML thread here[2], participants of that thread added to CC.
> >
> > [1] Service Cores v2 patchset http://dpdk.org/dev/patchwork/bundle/hvanhaar/service_cores_v2/
> > [2] http://dpdk.org/ml/archives/dev/2017-June/069290.html
> >
> >
> > ________________________
> >
> >
> >
> > A proposal for Eventdev, to ensure Service lcores and Application lcores play nice;
> >
> > 1) Application lcores must not directly call rte_eventdev_schedule()
> > 2A) Service cores are the proper method to run services
> > 2B) If an application insists on running a service "manually" on an app lcore, we provide a function for that:
> > rte_service_run_from_app_lcore(struct service *srv);
> >
> > The above function would allow a pesky app to run services on its own (non-service core) lcores, but
> > does so through the service-core framework, allowing the service-library atomic to keep access serialized as required for non-multi-thread-safe services.
> >
> > The above solution maintains the option of running the eventdev PMD as now (single-core dedicated to a single service), while providing correct serialization by using the rte_service_run_from_app_lcore() function. Given the atomic is only used when required (multiple cores mapped to the service) there should be no performance delta.
> >
> > Given that the application should not invoke rte_eventdev_schedule(), we could even consider removing it from the Eventdev API. A PMD that requires cycles registers a service, and an application can use a service core or the run_from_app_lcore() function if it wishes to invoke that service on an application owned lcore.
> >
> >
> > Opinions?
>
> I would be in favour of this proposal, except for the proposed name for
> the new function. It would be useful for an app to be able to "adopt" a
> service into it's main loop if so desired. If we do this, I think I'd
+1
Agree with Harry and Bruce here.
I think the adapter function should take "struct service *" and return
lcore_function_t so that it can be run using the existing rte_eal_remote_launch().
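One possible reading of this adapter idea, as a sketch only: since C has no closures, the adapter hands back a generic trampoline plus the argument to launch it with. lcore_fn_t is a local stand-in with the int (*)(void *) shape, and max_iterations exists only so the example loop terminates; neither is part of any proposed API.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for a launchable lcore function: int (*)(void *). */
typedef int (*lcore_fn_t)(void *);

struct service {
    int (*cb)(void *userdata);
    void *userdata;
    unsigned int max_iterations;  /* sketch only: bound the loop so it returns */
};

/* Trampoline with the launchable signature; arg is the struct service. */
static int service_lcore_main(void *arg)
{
    struct service *s = arg;
    assert(s != NULL && s->cb != NULL);
    for (unsigned int i = 0; i < s->max_iterations; i++)
        s->cb(s->userdata);
    return 0;
}

/* The "adapter": returns the function to launch and the argument to pass it. */
static lcore_fn_t service_get_lcore_fn(struct service *s, void **arg_out)
{
    assert(s != NULL && arg_out != NULL);
    *arg_out = s;
    return service_lcore_main;
}

/* Example callback: counts how often it was invoked. */
static int count_calls(void *userdata)
{
    ++*(int *)userdata;
    return 0;
}
```

The returned pair (function, argument) is what an application would hand to its existing launch call.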
> also support the removal of a dedicated schedule call from the eventdev
> API, or alternatively, if it is needed by other PMDs, leave it as a
> no-op in the sw PMD in favour of the service-cores managed function.
I would be in favor of removing eventdev schedule and
RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED capability so that it is completely
transparent to application whether scheduler runs on HW or SW or "combination
of both"
>
> /Bruce
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-29 20:18 ` Thomas Monjalon
@ 2017-06-30 8:52 ` Van Haaren, Harry
2017-06-30 9:29 ` Thomas Monjalon
0 siblings, 1 reply; 19+ messages in thread
From: Van Haaren, Harry @ 2017-06-30 8:52 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, 'Jerin Jacob', Wiles, Keith, Richardson, Bruce
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, June 29, 2017 9:19 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: dev@dpdk.org; 'Jerin Jacob' <jerin.jacob@caviumnetworks.com>; Wiles, Keith
> <keith.wiles@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: Service lcores and Application lcores
>
> 29/06/2017 18:35, Van Haaren, Harry:
> > 3) The problem;
> > If a service core runs the SW PMD schedule() function (option 2) *AND*
> > the application lcore runs schedule() func (option 1), the result is that
> > two threads are concurrently running a multi-thread unsafe function.
>
> Which function is multi-thread unsafe?
With the current design, the service-callback does not have to be multi-thread safe.
For example, the eventdev SW PMD is not multi-thread safe.
The service library handles serializing access to the service-callback if multiple cores
are mapped to that service. This keeps the atomic complexity in one place, and keeps
services as light-weight to implement as possible.
(We could consider forcing all service-callbacks to be multi-thread safe by using atomics,
but we would not be able to optimize away the atomic cmpset if it is not required. This
feels heavy handed, and would cause useless atomic ops to execute.)
> Why the same function would be run by the service and by the scheduler?
The same function can be run concurrently by the application and by a service core.
This can happen because an application can *think* its lcores are the only
threads running, when in reality one or more service cores may be running
in the background.
Service lcores and application lcores existing without knowledge of each other's
behavior is what causes the multi-thread unsafe service function to be run concurrently.
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 8:52 ` Van Haaren, Harry
@ 2017-06-30 9:29 ` Thomas Monjalon
2017-06-30 10:18 ` Van Haaren, Harry
0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2017-06-30 9:29 UTC (permalink / raw)
To: Van Haaren, Harry
Cc: dev, 'Jerin Jacob', Wiles, Keith, Richardson, Bruce
30/06/2017 10:52, Van Haaren, Harry:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 29/06/2017 18:35, Van Haaren, Harry:
> > > 3) The problem;
> > > If a service core runs the SW PMD schedule() function (option 2) *AND*
> > > the application lcore runs schedule() func (option 1), the result is that
> > > two threads are concurrently running a multi-thread unsafe function.
> >
> > Which function is multi-thread unsafe?
>
> With the current design, the service-callback does not have to be multi-thread safe.
> For example, the eventdev SW PMD is not multi-thread safe.
>
> The service library handles serializing access to the service-callback if multiple cores
> are mapped to that service. This keeps the atomic complexity in one place, and keeps
> services as light-weight to implement as possible.
>
> (We could consider forcing all service-callbacks to be multi-thread safe by using atomics,
> but we would not be able to optimize away the atomic cmpset if it is not required. This
> feels heavy handed, and would cause useless atomic ops to execute.)
OK thank you for the detailed explanation.
> > Why the same function would be run by the service and by the scheduler?
>
> The same function can be run concurrently by the application, and a service core.
> The root cause that this could happen is that an application can *think* it is the
> only one running threads, but in reality one or more service-cores may be running
> in the background.
>
> The service lcores and application lcores existence without knowledge of the others
> behavior is the cause of concurrent running of the multi-thread unsafe service function.
That's the part I still don't understand.
Why would an application run a function on its own core if it is already
run as a service? Can we just have a check that the service API exists
and that the service is running?
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 4:45 ` Jerin Jacob
@ 2017-06-30 10:00 ` Van Haaren, Harry
2017-06-30 12:51 ` Jerin Jacob
0 siblings, 1 reply; 19+ messages in thread
From: Van Haaren, Harry @ 2017-06-30 10:00 UTC (permalink / raw)
To: Jerin Jacob, Richardson, Bruce; +Cc: dev, thomas, Wiles, Keith
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Friday, June 30, 2017 5:45 AM
> To: Richardson, Bruce <bruce.richardson@intel.com>
> Cc: Van Haaren, Harry <harry.van.haaren@intel.com>; dev@dpdk.org; thomas@monjalon.net;
> Wiles, Keith <keith.wiles@intel.com>
> Subject: Re: Service lcores and Application lcores
>
> -----Original Message-----
> > Date: Thu, 29 Jun 2017 16:57:08 +0100
> > From: Bruce Richardson <bruce.richardson@intel.com>
> > To: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> > CC: "dev@dpdk.org" <dev@dpdk.org>, 'Jerin Jacob'
> > <jerin.jacob@caviumnetworks.com>, "thomas@monjalon.net"
> > <thomas@monjalon.net>, "Wiles, Keith" <keith.wiles@intel.com>
> > Subject: Re: Service lcores and Application lcores
> > User-Agent: Mutt/1.8.1 (2017-04-11)
> >
> > On Thu, Jun 29, 2017 at 03:36:04PM +0100, Van Haaren, Harry wrote:
> > > Hi All,
<snip>
> > > A proposal for Eventdev, to ensure Service lcores and Application lcores play nice;
> > >
> > > 1) Application lcores must not directly call rte_eventdev_schedule()
> > > 2A) Service cores are the proper method to run services
> > > 2B) If an application insists on running a service "manually" on an app lcore, we
> provide a function for that:
> > > rte_service_run_from_app_lcore(struct service *srv);
> > >
> > > The above function would allow a pesky app to run services on its own (non-service
> core) lcores, but
> > > does so through the service-core framework, allowing the service-library atomic to
> keep access serialized as required for non-multi-thread-safe services.
> > >
> > > The above solution maintains the option of running the eventdev PMD as now (single-
> core dedicated to a single service), while providing correct serialization by using the
> rte_service_run_from_app_lcore() function. Given the atomic is only used when required
> (multiple cores mapped to the service) there should be no performance delta.
> > >
> > > Given that the application should not invoke rte_eventdev_schedule(), we could even
> consider removing it from the Eventdev API. A PMD that requires cycles registers a
> service, and an application can use a service core or the run_from_app_lcore() function if
> it wishes to invoke that service on an application owned lcore.
> > >
> > >
> > > Opinions?
> >
> > I would be in favour of this proposal, except for the proposed name for
> > the new function. It would be useful for an app to be able to "adopt" a
> > service into it's main loop if so desired. If we do this, I think I'd
>
> +1
>
> Agree with Harry and Bruce here.
>
> I think, The adapter function should take "struct service *" and return
> lcore_function_t so that it can run using exiting rte_eal_remote_launch()
I don't think providing a remote-launch API is actually beneficial. Remote-launching a single service
is equivalent to adding that lcore as a service-core, and mapping it to just that single service.
The advantage of adding it as a service core, is future-proofing for if more services need to be added
to that core in future, and statistics of the service core infrastructure. A convenience API could be
provided to perform the core_add(), service_start(), enable_on_service() and core_start() APIs in one.
Also, the remote_launch API doesn't solve the original problem - what if an application lcore wishes
to run one iteration of a service "manually". The remote_launch style API does not solve this problem.
Here is a much simpler API to run a service, as a counter-proposal :)
/** Runs one iteration of *service* on the calling lcore */
int rte_service_iterate(struct rte_service_spec *service);
The iterate() function can check that the service is start()-ed, check the number of mapped-lcores and utilize the atomic to prevent concurrent access to multi-thread unsafe services. By exposing the function-pointer/userdata directly, we lose that.
Thinking about it, a function like rte_service_iterate() is the only functionally correct approach. (Exposing the callback directly brings us back to the "application thread without atomic check" problem.)
Thoughts?
> > also support the removal of a dedicated schedule call from the eventdev
> > API, or alternatively, if it is needed by other PMDs, leave it as a
> > no-op in the sw PMD in favour of the service-cores managed function.
>
> I would be in favor of removing eventdev schedule and
> RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED capability so that it is completely
> transparent to application whether scheduler runs on HW or SW or "combination
> of both"
Yep this bit sounds good!
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 9:29 ` Thomas Monjalon
@ 2017-06-30 10:18 ` Van Haaren, Harry
2017-06-30 10:38 ` Thomas Monjalon
0 siblings, 1 reply; 19+ messages in thread
From: Van Haaren, Harry @ 2017-06-30 10:18 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, 'Jerin Jacob', Wiles, Keith, Richardson, Bruce
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Friday, June 30, 2017 10:29 AM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: dev@dpdk.org; 'Jerin Jacob' <jerin.jacob@caviumnetworks.com>; Wiles, Keith
> <keith.wiles@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: Service lcores and Application lcores
>
> 30/06/2017 10:52, Van Haaren, Harry:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > 29/06/2017 18:35, Van Haaren, Harry:
> > > > 3) The problem;
> > > > If a service core runs the SW PMD schedule() function (option 2) *AND*
> > > > the application lcore runs schedule() func (option 1), the result is that
> > > > two threads are concurrently running a multi-thread unsafe function.
> > >
> > > Which function is multi-thread unsafe?
> >
> > With the current design, the service-callback does not have to be multi-thread safe.
> > For example, the eventdev SW PMD is not multi-thread safe.
> >
> > The service library handles serializing access to the service-callback if multiple cores
> > are mapped to that service. This keeps the atomic complexity in one place, and keeps
> > services as light-weight to implement as possible.
> >
> > (We could consider forcing all service-callbacks to be multi-thread safe by using
> atomics,
> > but we would not be able to optimize away the atomic cmpset if it is not required. This
> > feels heavy handed, and would cause useless atomic ops to execute.)
>
> OK thank you for the detailed explanation.
>
> > > Why the same function would be run by the service and by the scheduler?
> >
> > The same function can be run concurrently by the application, and a service core.
> > The root cause that this could happen is that an application can *think* it is the
> > only one running threads, but in reality one or more service-cores may be running
> > in the background.
> >
> > The service lcores and application lcores existence without knowledge of the others
> > behavior is the cause of concurrent running of the multi-thread unsafe service function.
>
> That's the part I still don't understand.
> Why an application would run a function on its own core if it is already
> run as a service? Can we just have a check that the service API exists
> and that the service is running?
The point is that really it is an application / service core mis-match.
The application should never run a PMD that it knows also has a service core running it.
However, porting applications to the service-core API has an overlap period where an
application on 17.05 is required to call e.g. rte_eventdev_schedule() itself, and
depending on the startup EAL flags for service cores, it may or may not have to call schedule() manually.
This is pretty error prone, and misconfiguration would cause either A) deadlock due to no CPU cycles, or B) a segfault due to two cores running the scheduler concurrently.
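The two failure modes can be written down as a small truth table. The struct, enum and function names below are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct sched_config {
    int service_core_runs_sched;   /* EAL flags mapped the SW PMD to a service core */
    int app_calls_schedule;        /* app still calls the schedule function itself */
};

enum sched_outcome {
    SCHED_OK,        /* exactly one party drives the scheduler */
    SCHED_STARVED,   /* nobody does: no CPU cycles, the application stalls */
    SCHED_RACE       /* both do: two lcores inside a non-thread-safe function */
};

static enum sched_outcome classify(const struct sched_config *cfg)
{
    assert(cfg != NULL);
    if (cfg->service_core_runs_sched && cfg->app_calls_schedule)
        return SCHED_RACE;
    if (!cfg->service_core_runs_sched && !cfg->app_calls_schedule)
        return SCHED_STARVED;
    return SCHED_OK;
}
```

Two of the four combinations are broken, and which one an application lands in depends on startup flags it does not control.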
As per the other thread re service-lcores[1], removing rte_eventdev_schedule() from the API would force apps to use the service-core infrastructure for eventdev instead of the possibility of mis-match.
[1] http://dpdk.org/ml/archives/dev/2017-June/069492.html
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 10:18 ` Van Haaren, Harry
@ 2017-06-30 10:38 ` Thomas Monjalon
2017-06-30 11:14 ` Van Haaren, Harry
0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2017-06-30 10:38 UTC (permalink / raw)
To: Van Haaren, Harry
Cc: dev, 'Jerin Jacob', Wiles, Keith, Richardson, Bruce
30/06/2017 12:18, Van Haaren, Harry:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 30/06/2017 10:52, Van Haaren, Harry:
> > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > 29/06/2017 18:35, Van Haaren, Harry:
> > > > > 3) The problem;
> > > > > If a service core runs the SW PMD schedule() function (option 2) *AND*
> > > > > the application lcore runs schedule() func (option 1), the result is that
> > > > > two threads are concurrently running a multi-thread unsafe function.
> > > >
> > > > Which function is multi-thread unsafe?
> > >
> > > With the current design, the service-callback does not have to be multi-thread safe.
> > > For example, the eventdev SW PMD is not multi-thread safe.
> > >
> > > The service library handles serializing access to the service-callback if multiple cores
> > > are mapped to that service. This keeps the atomic complexity in one place, and keeps
> > > services as light-weight to implement as possible.
> > >
> > > (We could consider forcing all service-callbacks to be multi-thread safe by using
> > atomics,
> > > but we would not be able to optimize away the atomic cmpset if it is not required. This
> > > feels heavy handed, and would cause useless atomic ops to execute.)
> >
> > OK thank you for the detailed explanation.
> >
> > > > Why the same function would be run by the service and by the scheduler?
> > >
> > > The same function can be run concurrently by the application, and a service core.
> > > The root cause that this could happen is that an application can *think* it is the
> > > only one running threads, but in reality one or more service-cores may be running
> > > in the background.
> > >
> > > The service lcores and application lcores existence without knowledge of the others
> > > behavior is the cause of concurrent running of the multi-thread unsafe service function.
> >
> > That's the part I still don't understand.
> > Why an application would run a function on its own core if it is already
> > run as a service? Can we just have a check that the service API exists
> > and that the service is running?
>
> The point is that really it is an application / service core mis-match.
> The application should never run a PMD that it knows also has a service core running it.
Yes
> However, porting applications to the service-core API has an over-lap time where an
> application on 17.05 will be required to call eg: rte_eventdev_schedule() itself, and
> depending on startup EAL flags for service-cores, it may-or-may-not have to call schedule() manually.
Yes, service cores may be unavailable, depending on user configuration.
That's why it must be possible to query the service core API
to know whether a service is running or not.
When porting an application to service cores, you just have to run this
check, which is known to be available from DPDK 17.08 (check rte_version.h).
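A minimal sketch of the decision described above, assuming a build-time version number plus a runtime "is the service running" answer. The SKETCH_* macros are stand-ins modelled on the version macros in rte_version.h, and struct app_env and app_must_schedule() are hypothetical names used only for illustration; nothing here is taken from the patchset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins modelled on the version macros in rte_version.h, redefined here
 * only so the sketch compiles without DPDK headers. */
#define SKETCH_VERSION_NUM(a, b, c, d) \
	(((unsigned int)(a) << 24) | ((unsigned int)(b) << 16) | \
	 ((unsigned int)(c) << 8) | (unsigned int)(d))
#define SKETCH_SERVICE_API_VERSION SKETCH_VERSION_NUM(17, 8, 0, 0)

/* Hypothetical view of what the application knows when it starts. */
struct app_env {
	unsigned int dpdk_version; /* version the application was built against */
	bool service_running;      /* answer of the runtime "is running" query */
};

/* Returns true when the application lcore has to call the scheduler itself. */
static bool
app_must_schedule(const struct app_env *env)
{
	assert(env != NULL);
	if (env->dpdk_version < SKETCH_SERVICE_API_VERSION)
		return true;          /* no service-core API: manual scheduling only */
	return !env->service_running; /* API present: defer to a running service */
}
```

With this shape, the application either schedules manually or leaves it to the service core, never both, which avoids the two-cores case quoted below.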
> This is pretty error prone, and mis-configuration would cause A) deadlock due to no CPU cycles, B) segfault due to two cores.
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 10:38 ` Thomas Monjalon
@ 2017-06-30 11:14 ` Van Haaren, Harry
2017-06-30 13:04 ` Jerin Jacob
0 siblings, 1 reply; 19+ messages in thread
From: Van Haaren, Harry @ 2017-06-30 11:14 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, 'Jerin Jacob', Wiles, Keith, Richardson, Bruce
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Friday, June 30, 2017 11:39 AM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: dev@dpdk.org; 'Jerin Jacob' <jerin.jacob@caviumnetworks.com>; Wiles, Keith
> <keith.wiles@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: Service lcores and Application lcores
>
> 30/06/2017 12:18, Van Haaren, Harry:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > 30/06/2017 10:52, Van Haaren, Harry:
> > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > 29/06/2017 18:35, Van Haaren, Harry:
> > > > > > 3) The problem;
> > > > > > If a service core runs the SW PMD schedule() function (option 2) *AND*
> > > > > > the application lcore runs schedule() func (option 1), the result is that
> > > > > > two threads are concurrently running a multi-thread unsafe function.
> > > > >
> > > > > Which function is multi-thread unsafe?
> > > >
> > > > With the current design, the service-callback does not have to be multi-thread safe.
> > > > For example, the eventdev SW PMD is not multi-thread safe.
> > > >
> > > > The service library handles serializing access to the service-callback if multiple
> cores
> > > > are mapped to that service. This keeps the atomic complexity in one place, and keeps
> > > > services as light-weight to implement as possible.
> > > >
> > > > (We could consider forcing all service-callbacks to be multi-thread safe by using
> > > atomics,
> > > > but we would not be able to optimize away the atomic cmpset if it is not required.
> This
> > > > feels heavy handed, and would cause useless atomic ops to execute.)
> > >
> > > OK thank you for the detailed explanation.
> > >
> > > > > Why the same function would be run by the service and by the scheduler?
> > > >
> > > > The same function can be run concurrently by the application, and a service core.
> > > > The root cause that this could happen is that an application can *think* it is the
> > > > only one running threads, but in reality one or more service-cores may be running
> > > > in the background.
> > > >
> > > > The service lcores and application lcores existence without knowledge of the others
> > > > behavior is the cause of concurrent running of the multi-thread unsafe service
> function.
> > >
> > > That's the part I still don't understand.
> > > Why an application would run a function on its own core if it is already
> > > run as a service? Can we just have a check that the service API exists
> > > and that the service is running?
> >
> > The point is that really it is an application / service core mis-match.
> > The application should never run a PMD that it knows also has a service core running it.
>
> Yes
>
> > However, porting applications to the service-core API has an over-lap time where an
> > application on 17.05 will be required to call eg: rte_eventdev_schedule() itself, and
> > depending on startup EAL flags for service-cores, it may-or-may-not have to call
> schedule() manually.
>
> Yes service cores may be unavailable, depending of user configuration.
> That's why it must be possible to request the service core API
> to know whether a service is run or not.
Yep - an application can check if a service is running by calling rte_service_is_running(struct service_spec*);
It returns true if a service-core is running, mapped to the service, and the service is start()-ed.
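To make the three conditions concrete, here is a simplified model of the check, assuming lcores are tracked as bits in a 64-bit mask. struct sketch_service_state and sketch_service_is_running() are hypothetical names for illustration, not the library's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one registered service. */
struct sketch_service_state {
	bool started;          /* the service has been start()-ed */
	uint64_t mapped_cores; /* bit i set: lcore i is mapped to this service */
};

/* Returns true only if the service is started and at least one lcore mapped
 * to it is also a running service core. */
static bool
sketch_service_is_running(const struct sketch_service_state *svc,
		uint64_t running_service_cores)
{
	assert(svc != NULL);
	return svc->started &&
		(svc->mapped_cores & running_service_cores) != 0;
}
```

A service mapped only to lcore 2 reports "not running" if lcore 3 is the only running service core, even though the service itself is started.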
> When porting an application to service core, you just have to run this
> check, which is known to be available for DPDK 17.08 (check rte_version.h).
Ok, so as part of porting to service-cores, applications are expected to sanity check the services vs their own lcore config.
If there's no disagreement, I will add it to the release notes of the V+1 service-cores patchset.
There is still a need for the rte_service_iterate() function as discussed in the other branch of this thread.
I'll wait for consensus on that and post the next revision then.
Thanks for the questions / input!
> > This is pretty error prone, and mis-configuration would cause A) deadlock due to no CPU
> cycles, B) segfault due to two cores.
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 10:00 ` Van Haaren, Harry
@ 2017-06-30 12:51 ` Jerin Jacob
2017-06-30 13:08 ` Van Haaren, Harry
0 siblings, 1 reply; 19+ messages in thread
From: Jerin Jacob @ 2017-06-30 12:51 UTC (permalink / raw)
To: Van Haaren, Harry; +Cc: Richardson, Bruce, dev, thomas, Wiles, Keith
-----Original Message-----
> Date: Fri, 30 Jun 2017 10:00:18 +0000
> From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "Richardson, Bruce"
> <bruce.richardson@intel.com>
> CC: "dev@dpdk.org" <dev@dpdk.org>, "thomas@monjalon.net"
> <thomas@monjalon.net>, "Wiles, Keith" <keith.wiles@intel.com>
> Subject: RE: Service lcores and Application lcores
>
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Friday, June 30, 2017 5:45 AM
> > To: Richardson, Bruce <bruce.richardson@intel.com>
> > Cc: Van Haaren, Harry <harry.van.haaren@intel.com>; dev@dpdk.org; thomas@monjalon.net;
> > Wiles, Keith <keith.wiles@intel.com>
> > Subject: Re: Service lcores and Application lcores
> >
> > -----Original Message-----
> > > Date: Thu, 29 Jun 2017 16:57:08 +0100
> > > From: Bruce Richardson <bruce.richardson@intel.com>
> > > To: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> > > CC: "dev@dpdk.org" <dev@dpdk.org>, 'Jerin Jacob'
> > > <jerin.jacob@caviumnetworks.com>, "thomas@monjalon.net"
> > > <thomas@monjalon.net>, "Wiles, Keith" <keith.wiles@intel.com>
> > > Subject: Re: Service lcores and Application lcores
> > > User-Agent: Mutt/1.8.1 (2017-04-11)
> > >
> > > On Thu, Jun 29, 2017 at 03:36:04PM +0100, Van Haaren, Harry wrote:
> > > > Hi All,
>
> <snip>
>
> > > > A proposal for Eventdev, to ensure Service lcores and Application lcores play nice;
> > > >
> > > > 1) Application lcores must not directly call rte_eventdev_schedule()
> > > > 2A) Service cores are the proper method to run services
> > > > 2B) If an application insists on running a service "manually" on an app lcore, we
> > provide a function for that:
> > > > rte_service_run_from_app_lcore(struct service *srv);
> > > >
> > > > The above function would allow a pesky app to run services on its own (non-service
> > core) lcores, but
> > > > does so through the service-core framework, allowing the service-library atomic to
> > keep access serialized as required for non-multi-thread-safe services.
> > > >
> > > > The above solution maintains the option of running the eventdev PMD as now (single-
> > core dedicated to a single service), while providing correct serialization by using the
> > rte_service_run_from_app_lcore() function. Given the atomic is only used when required
> > (multiple cores mapped to the service) there should be no performance delta.
> > > >
> > > > Given that the application should not invoke rte_eventdev_schedule(), we could even
> > consider removing it from the Eventdev API. A PMD that requires cycles registers a
> > service, and an application can use a service core or the run_from_app_lcore() function if
> > it wishes to invoke that service on an application owned lcore.
> > > >
> > > >
> > > > Opinions?
> > >
> > > I would be in favour of this proposal, except for the proposed name for
> > > the new function. It would be useful for an app to be able to "adopt" a
> > > service into it's main loop if so desired. If we do this, I think I'd
> >
> > +1
> >
> > Agree with Harry and Bruce here.
> >
> > I think, The adapter function should take "struct service *" and return
> > lcore_function_t so that it can run using exiting rte_eal_remote_launch()
>
>
> I don't think providing a remote-launch API is actually beneficial. Remote-launching a single service
> is equivalent to adding that lcore as a service-core, and mapping it to just that single service.
> The advantage of adding it as a service core, is future-proofing for if more services need to be added
> to that core in future, and statistics of the service core infrastructure. A convenience API could be
> provided to perform the core_add(), service_start(), enable_on_service() and core_start() APIs in one.
>
> Also, the remote_launch API doesn't solve the original problem - what if an application lcore wishes
> to run one iteration of a service "manually". The remote_launch style API does not solve this problem.
Agree with the problem statement. But remote_launch() operates on lcores,
which are not necessarily mapped 1:1 to physical cores.
By introducing "rte_service_iterate", we are creating a parallel infrastructure to
run services on non-service DPDK lcores, aka normal lcores.
Is this really required? Is there any real advantage for an
application in not using the built-in service lcore infrastructure, and instead
calling "rte_service_iterate" on normal lcores? If we really want to mux
a physical core to N lcores, EAL already provides that in the form of threads.
I think providing too many parallel options for the same use case may be
overkill.
Just my 2c.
>
>
> Here a much simpler API to run a service... as a counter-proposal :)
>
> /** Runs one iteration of *service* on the calling lcore */
> int rte_service_iterate(struct rte_service_spec *service);
>
>
> The iterate() function can check that the service is start()-ed, check the number of mapped-lcores and utilize the atomic to prevent concurrent access to multi-thread unsafe services. By exposing the function-pointer/userdata directly, we lose that.
>
> Thinking about it, a function like rte_service_iterate() is the only functionally correct approach. (Exposing the callback directly brings us back to the "application thread without atomic check" problem.)
>
> Thoughts?
>
>
> > > also support the removal of a dedicated schedule call from the eventdev
> > > API, or alternatively, if it is needed by other PMDs, leave it as a
> > > no-op in the sw PMD in favour of the service-cores managed function.
> >
> > I would be in favor of removing eventdev schedule and
> > RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED capability so that it is completely
> > transparent to application whether scheduler runs on HW or SW or "combination
> > of both"
>
>
> Yep this bit sounds good!
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 11:14 ` Van Haaren, Harry
@ 2017-06-30 13:04 ` Jerin Jacob
2017-06-30 13:16 ` Van Haaren, Harry
0 siblings, 1 reply; 19+ messages in thread
From: Jerin Jacob @ 2017-06-30 13:04 UTC (permalink / raw)
To: Van Haaren, Harry; +Cc: Thomas Monjalon, dev, Wiles, Keith, Richardson, Bruce
-----Original Message-----
> Date: Fri, 30 Jun 2017 11:14:39 +0000
> From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> To: Thomas Monjalon <thomas@monjalon.net>
> CC: "dev@dpdk.org" <dev@dpdk.org>, 'Jerin Jacob'
> <jerin.jacob@caviumnetworks.com>, "Wiles, Keith" <keith.wiles@intel.com>,
> "Richardson, Bruce" <bruce.richardson@intel.com>
> Subject: RE: Service lcores and Application lcores
>
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Friday, June 30, 2017 11:39 AM
> > To: Van Haaren, Harry <harry.van.haaren@intel.com>
> > Cc: dev@dpdk.org; 'Jerin Jacob' <jerin.jacob@caviumnetworks.com>; Wiles, Keith
> > <keith.wiles@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: Re: Service lcores and Application lcores
> >
> > 30/06/2017 12:18, Van Haaren, Harry:
> > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > 30/06/2017 10:52, Van Haaren, Harry:
> > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > > 29/06/2017 18:35, Van Haaren, Harry:
> > > > > > > 3) The problem;
> > > > > > > If a service core runs the SW PMD schedule() function (option 2) *AND*
> > > > > > > the application lcore runs schedule() func (option 1), the result is that
> > > > > > > two threads are concurrently running a multi-thread unsafe function.
> > > > > >
> > > > > > Which function is multi-thread unsafe?
> > > > >
> > > > > With the current design, the service-callback does not have to be multi-thread safe.
> > > > > For example, the eventdev SW PMD is not multi-thread safe.
> > > > >
> > > > > The service library handles serializing access to the service-callback if multiple
> > cores
> > > > > are mapped to that service. This keeps the atomic complexity in one place, and keeps
> > > > > services as light-weight to implement as possible.
> > > > >
> > > > > (We could consider forcing all service-callbacks to be multi-thread safe by using
> > > > atomics,
> > > > > but we would not be able to optimize away the atomic cmpset if it is not required.
> > This
> > > > > feels heavy handed, and would cause useless atomic ops to execute.)
> > > >
> > > > OK thank you for the detailed explanation.
> > > >
> > > > > > Why the same function would be run by the service and by the scheduler?
> > > > >
> > > > > The same function can be run concurrently by the application, and a service core.
> > > > > The root cause that this could happen is that an application can *think* it is the
> > > > > only one running threads, but in reality one or more service-cores may be running
> > > > > in the background.
> > > > >
> > > > > The service lcores and application lcores existence without knowledge of the others
> > > > > behavior is the cause of concurrent running of the multi-thread unsafe service
> > function.
> > > >
> > > > That's the part I still don't understand.
> > > > Why an application would run a function on its own core if it is already
> > > > run as a service? Can we just have a check that the service API exists
> > > > and that the service is running?
> > >
> > > The point is that really it is an application / service core mis-match.
> > > The application should never run a PMD that it knows also has a service core running it.
> >
> > Yes
> >
> > > However, porting applications to the service-core API has an over-lap time where an
> > > application on 17.05 will be required to call eg: rte_eventdev_schedule() itself, and
> > > depending on startup EAL flags for service-cores, it may-or-may-not have to call
> > schedule() manually.
> >
> > Yes service cores may be unavailable, depending of user configuration.
> > That's why it must be possible to request the service core API
> > to know whether a service is run or not.
>
> Yep - an application can check if a service is running by calling rte_service_is_running(struct service_spec*);
> It returns true if a service-core is running, mapped to the service, and the service is start()-ed.
If I understand it correctly, the driver should check whether the _required_
service is running or not? Not the _application_. Right?
>
> > When porting an application to service core, you just have to run this
> > check, which is known to be available for DPDK 17.08 (check rte_version.h).
>
> Ok, so as part of porting to service-cores, applications are expected to sanity check the services vs their own lcore config.
> If there's no disagreement, I will add it to the releases notes of the V+1 service-cores patchset.
>
> There is still a need for the rte_service_iterate() function as discussed in the other branch of this thread.
> I'll wait for consensus on that and post the next revision then.
>
> Thanks for the questions / input!
>
>
> > > This is pretty error prone, and mis-configuration would cause A) deadlock due to no CPU
> > cycles, B) segfault due to two cores.
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 12:51 ` Jerin Jacob
@ 2017-06-30 13:08 ` Van Haaren, Harry
2017-06-30 13:20 ` Jerin Jacob
0 siblings, 1 reply; 19+ messages in thread
From: Van Haaren, Harry @ 2017-06-30 13:08 UTC (permalink / raw)
To: Jerin Jacob; +Cc: Richardson, Bruce, dev, thomas, Wiles, Keith
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Friday, June 30, 2017 1:52 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; dev@dpdk.org; thomas@monjalon.net;
> Wiles, Keith <keith.wiles@intel.com>
> Subject: Re: Service lcores and Application lcores
>
> -----Original Message-----
> > Date: Fri, 30 Jun 2017 10:00:18 +0000
> > From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "Richardson, Bruce"
> > <bruce.richardson@intel.com>
> > CC: "dev@dpdk.org" <dev@dpdk.org>, "thomas@monjalon.net"
> > <thomas@monjalon.net>, "Wiles, Keith" <keith.wiles@intel.com>
> > Subject: RE: Service lcores and Application lcores
<snip previous non-related items>
> > I don't think providing a remote-launch API is actually beneficial. Remote-launching a
> single service
> > is equivalent to adding that lcore as a service-core, and mapping it to just that single
> service.
> > The advantage of adding it as a service core, is future-proofing for if more services
> need to be added
> > to that core in future, and statistics of the service core infrastructure. A convenience
> API could be
> > provided to perform the core_add(), service_start(), enable_on_service() and
> core_start() APIs in one.
> >
> > Also, the remote_launch API doesn't solve the original problem - what if an application
> lcore wishes
> > to run one iteration of a service "manually". The remote_launch style API does not solve
> this problem.
>
> Agree with problem statement. But, remote_launch() operates on lcores not on
> not necessary on 1:1 mapped physical cores.
>
> By introducing "rte_service_iterate", We are creating a parallel infrastructure to
> run the service on non DPDK service lcores aka normal lcores.
> Is this really required? Is there any real advantage for
> application not use builtin service lcore infrastructure, rather than iterating over
> "rte_service_iterate" and run on normal lcores. If we really want to mux
> a physical core to N lcore, EAL already provides that in the form of threads.
>
> I think, providing too many parallel options for the same use case may be
> a overkill.
>
> Just my 2c.
The use-case that the rte_service_iterate() caters for is one where the application
wishes to run a service on an "ordinary app lcore", together with an application workload.
For example, the eventdev-scheduler and one worker can be run on the same lcore. If the schedule() running thread *must* be a service lcore, we would not be able to also use that lcore as an application worker core.
That was my motivation for adding this API. I do agree with you above: it is a second "parallel" method to run a service. I think there's enough value in enabling the use-case in the example above to add it.
Do you see enough value in the use-case above to add the API?
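A rough sketch of that use-case, assuming C11 atomics in place of the DPDK atomic cmpset. All sketch_* names are hypothetical, and how an application lcore is counted in num_mapped is an assumption of this sketch, not something settled in this thread. The atomic flag is only touched when a multi-thread unsafe callback can have more than one caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*sketch_service_cb)(void *userdata);

/* Hypothetical, simplified service descriptor. */
struct sketch_service {
	sketch_service_cb cb; /* e.g. the SW PMD schedule function */
	void *userdata;
	bool started;         /* service has been start()-ed */
	bool mt_safe;         /* callback may run on several cores at once */
	uint32_t num_mapped;  /* cores that may invoke the callback */
	atomic_flag busy;     /* set while an unsafe callback executes */
};

/* Runs one iteration of the service on the calling thread.
 * Returns 0 if the callback ran, -1 if the service is not started,
 * -2 if another core is currently inside an unsafe callback. */
static int
sketch_service_iterate(struct sketch_service *svc)
{
	assert(svc != NULL && svc->cb != NULL);
	if (!svc->started)
		return -1;
	/* Skip the atomic entirely when it is not required. */
	bool need_lock = !svc->mt_safe && svc->num_mapped > 1;
	if (need_lock &&
	    atomic_flag_test_and_set_explicit(&svc->busy, memory_order_acquire))
		return -2;
	(void)svc->cb(svc->userdata);
	if (need_lock)
		atomic_flag_clear_explicit(&svc->busy, memory_order_release);
	return 0;
}

/* Example callbacks: each one just counts how often it was invoked. */
static int32_t
sketch_count_cb(void *userdata)
{
	(*(unsigned int *)userdata)++;
	return 0;
}

static void
sketch_worker(void *arg)
{
	(*(unsigned int *)arg)++;
}

/* Application lcore loop: the scheduler and one worker share the lcore. */
static void
app_lcore_loop(struct sketch_service *sched, void (*worker)(void *),
		void *arg, unsigned int iterations)
{
	for (unsigned int i = 0; i < iterations; i++) {
		(void)sketch_service_iterate(sched); /* one scheduler pass */
		worker(arg);                         /* then application work */
	}
}
```

Returning -2 instead of spinning lets the application lcore carry on with its worker load when a service core is already inside the callback.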
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 13:04 ` Jerin Jacob
@ 2017-06-30 13:16 ` Van Haaren, Harry
0 siblings, 0 replies; 19+ messages in thread
From: Van Haaren, Harry @ 2017-06-30 13:16 UTC (permalink / raw)
To: Jerin Jacob; +Cc: Thomas Monjalon, dev, Wiles, Keith, Richardson, Bruce
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Friday, June 30, 2017 2:04 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Wiles, Keith
> <keith.wiles@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: Service lcores and Application lcores
>
> -----Original Message-----
> > Date: Fri, 30 Jun 2017 11:14:39 +0000
> > From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> > To: Thomas Monjalon <thomas@monjalon.net>
> > CC: "dev@dpdk.org" <dev@dpdk.org>, 'Jerin Jacob'
> > <jerin.jacob@caviumnetworks.com>, "Wiles, Keith" <keith.wiles@intel.com>,
> > "Richardson, Bruce" <bruce.richardson@intel.com>
> > Subject: RE: Service lcores and Application lcores
> >
> > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > Sent: Friday, June 30, 2017 11:39 AM
> > > To: Van Haaren, Harry <harry.van.haaren@intel.com>
> > > Cc: dev@dpdk.org; 'Jerin Jacob' <jerin.jacob@caviumnetworks.com>; Wiles, Keith
> > > <keith.wiles@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> > > Subject: Re: Service lcores and Application lcores
> > >
> > > 30/06/2017 12:18, Van Haaren, Harry:
> > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > 30/06/2017 10:52, Van Haaren, Harry:
> > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > > > 29/06/2017 18:35, Van Haaren, Harry:
> > > > > > > > 3) The problem;
> > > > > > > > If a service core runs the SW PMD schedule() function (option 2) *AND*
> > > > > > > > the application lcore runs schedule() func (option 1), the result is that
> > > > > > > > two threads are concurrently running a multi-thread unsafe function.
> > > > > > >
> > > > > > > Which function is multi-thread unsafe?
> > > > > >
> > > > > > With the current design, the service-callback does not have to be multi-thread
> safe.
> > > > > > For example, the eventdev SW PMD is not multi-thread safe.
> > > > > >
> > > > > > The service library handles serializing access to the service-callback if
> multiple
> > > cores
> > > > > > are mapped to that service. This keeps the atomic complexity in one place, and
> keeps
> > > > > > services as light-weight to implement as possible.
> > > > > >
> > > > > > (We could consider forcing all service-callbacks to be multi-thread safe by
> using
> > > > > atomics,
> > > > > > but we would not be able to optimize away the atomic cmpset if it is not
> required.
> > > This
> > > > > > feels heavy handed, and would cause useless atomic ops to execute.)
> > > > >
> > > > > OK thank you for the detailed explanation.
> > > > >
> > > > > > > Why the same function would be run by the service and by the scheduler?
> > > > > >
> > > > > > The same function can be run concurrently by the application, and a service
> core.
> > > > > > The root cause that this could happen is that an application can *think* it is
> the
> > > > > > only one running threads, but in reality one or more service-cores may be
> running
> > > > > > in the background.
> > > > > >
> > > > > > The service lcores and application lcores existence without knowledge of the
> others
> > > > > > behavior is the cause of concurrent running of the multi-thread unsafe service
> > > function.
> > > > >
> > > > > That's the part I still don't understand.
> > > > > Why an application would run a function on its own core if it is already
> > > > > run as a service? Can we just have a check that the service API exists
> > > > > and that the service is running?
> > > >
> > > > The point is that really it is an application / service core mis-match.
> > > > The application should never run a PMD that it knows also has a service core running
> it.
> > >
> > > Yes
> > >
> > > > However, porting applications to the service-core API has an over-lap time where an
> > > > application on 17.05 will be required to call eg: rte_eventdev_schedule() itself,
> and
> > > > depending on startup EAL flags for service-cores, it may-or-may-not have to call
> > > schedule() manually.
> > >
> > > Yes service cores may be unavailable, depending of user configuration.
> > > That's why it must be possible to request the service core API
> > > to know whether a service is run or not.
> >
> > Yep - an application can check if a service is running by calling
> rte_service_is_running(struct service_spec*);
> > It returns true if a service-core is running, mapped to the service, and the service is
> start()-ed.
>
> If I understand it correctly, driver should check the the _required_
> service has been running or not ? Not the _application_. Right?
I think the PMD should check if a service core is mapped, and it can print a warning if not.
In the case of eventdev, the eventdev_start() is the function where service_is_running() is checked, and if not, we inform the user that no service-core is ready to run the service.
From the application POV, it could use e.g. rte_service_iterate()* to run that service, so the PMD should not fail to start(); it should just warn that at the time of starting there was no core available to it. The application itself must still check whether it should call rte_eventdev_schedule() itself, based on rte_version.h as Thomas mentioned.
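As an illustration of "warn, don't fail", here is a small model of a device start routine. struct sketch_eventdev and sketch_eventdev_start() are hypothetical names, and the service_running field stands in for the result of the service_is_running() check; this is a sketch of the policy, not the eventdev code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified device state. */
struct sketch_eventdev {
	bool needs_service;   /* SW PMD: needs CPU cycles from some lcore */
	bool service_running; /* result of the "is running" check at start time */
	bool started;
	bool warned;          /* recorded so callers can see the warning fired */
};

/* Starts the device. A missing service core only produces a warning, because
 * the application may still run the service on one of its own lcores. */
static int
sketch_eventdev_start(struct sketch_eventdev *dev)
{
	assert(dev != NULL);
	dev->warned = false;
	if (dev->needs_service && !dev->service_running) {
		fprintf(stderr, "warning: no service core is ready to run the "
				"scheduler service\n");
		dev->warned = true;
	}
	dev->started = true;
	return 0;
}
```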
The ideal end goal is in my opinion something like this;
Service cores are used to run services by 95+% of apps, to abstract away SW/HW core-requirement differences.
Advanced applications can utilize rte_service_iterate() to run specific services on application lcores if they wish.
* See other "branch" of this thread about rte_service_iterate()
http://dpdk.org/ml/archives/dev/2017-June/069540.html
> > > When porting an application to service core, you just have to run this
> > > check, which is known to be available for DPDK 17.08 (check rte_version.h).
> >
> > Ok, so as part of porting to service-cores, applications are expected to sanity check
> the services vs their own lcore config.
> > If there's no disagreement, I will add it to the releases notes of the V+1 service-cores
> patchset.
> >
> > There is still a need for the rte_service_iterate() function as discussed in the other
> branch of this thread.
> > I'll wait for consensus on that and post the next revision then.
> >
> > Thanks for the questions / input!
> >
> >
> > > > This is pretty error prone, and mis-configuration would cause A) deadlock due to no
> CPU
> > > cycles, B) segfault due to two cores.
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 13:08 ` Van Haaren, Harry
@ 2017-06-30 13:20 ` Jerin Jacob
2017-06-30 13:24 ` Van Haaren, Harry
0 siblings, 1 reply; 19+ messages in thread
From: Jerin Jacob @ 2017-06-30 13:20 UTC (permalink / raw)
To: Van Haaren, Harry; +Cc: Richardson, Bruce, dev, thomas, Wiles, Keith
-----Original Message-----
> Date: Fri, 30 Jun 2017 13:08:26 +0000
> From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> CC: "Richardson, Bruce" <bruce.richardson@intel.com>, "dev@dpdk.org"
> <dev@dpdk.org>, "thomas@monjalon.net" <thomas@monjalon.net>, "Wiles,
> Keith" <keith.wiles@intel.com>
> Subject: RE: Service lcores and Application lcores
>
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Friday, June 30, 2017 1:52 PM
> > To: Van Haaren, Harry <harry.van.haaren@intel.com>
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>; dev@dpdk.org; thomas@monjalon.net;
> > Wiles, Keith <keith.wiles@intel.com>
> > Subject: Re: Service lcores and Application lcores
> >
> > -----Original Message-----
> > > Date: Fri, 30 Jun 2017 10:00:18 +0000
> > > From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "Richardson, Bruce"
> > > <bruce.richardson@intel.com>
> > > CC: "dev@dpdk.org" <dev@dpdk.org>, "thomas@monjalon.net"
> > > <thomas@monjalon.net>, "Wiles, Keith" <keith.wiles@intel.com>
> > > Subject: RE: Service lcores and Application lcores
>
> <snip previous non-related items>
>
> > > I don't think providing a remote-launch API is actually beneficial. Remote-launching a
> > single service
> > > is equivalent to adding that lcore as a service-core, and mapping it to just that single
> > service.
> > > The advantage of adding it as a service core, is future-proofing for if more services
> > need to be added
> > > to that core in future, and statistics of the service core infrastructure. A convenience
> > API could be
> > > provided to perform the core_add(), service_start(), enable_on_service() and
> > core_start() APIs in one.
> > >
> > > Also, the remote_launch API doesn't solve the original problem - what if an application
> > lcore wishes
> > > to run one iteration of a service "manually". The remote_launch style API does not solve
> > this problem.
> >
> > Agree with problem statement. But, remote_launch() operates on lcores not on
> > not necessary on 1:1 mapped physical cores.
> >
> > By introducing "rte_service_iterate", We are creating a parallel infrastructure to
> > run the service on non DPDK service lcores aka normal lcores.
> > Is this really required? Is there any real advantage for
> > application not use builtin service lcore infrastructure, rather than iterating over
> > "rte_service_iterate" and run on normal lcores. If we really want to mux
> > a physical core to N lcore, EAL already provides that in the form of threads.
> >
> > I think, providing too many parallel options for the same use case may be
> > a overkill.
> >
> > Just my 2c.
>
>
> The use-case that the rte_service_iterate() caters for is one where the application
> wishes to run a service on an "ordinary app lcore", together with an application workload.
>
> For example, the eventdev-scheduler and one worker can be run on the same lcore. If the schedule() running thread *must* be a service lcore, we would not be able to also use that lcore as an application worker core.
>
> That was my motivation for adding this API, I do agree with you above; it is a second "parallel" method to run a service. I think there's enough value in enabling the use-case as per example above to add it.
>
>
> Do you see enough value in the use-case above to add the API?
The above use case can be realized with something like --lcores='(0-1)@1' (two lcores on
one physical core). I believe application writers never want to write
code based on a specific number of cores available in the system. If they
do, they will be stuck when running in another environment, with too
many combinations to address.
For me it complicates service lcore usage. But if someone thinks it will be useful, then
I don't have a strong objection.
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 13:20 ` Jerin Jacob
@ 2017-06-30 13:24 ` Van Haaren, Harry
2017-06-30 13:51 ` Thomas Monjalon
0 siblings, 1 reply; 19+ messages in thread
From: Van Haaren, Harry @ 2017-06-30 13:24 UTC (permalink / raw)
To: Jerin Jacob; +Cc: Richardson, Bruce, dev, thomas, Wiles, Keith
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Friday, June 30, 2017 2:21 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; dev@dpdk.org; thomas@monjalon.net;
> Wiles, Keith <keith.wiles@intel.com>
> Subject: Re: Service lcores and Application lcores
>
> -----Original Message-----
> > Date: Fri, 30 Jun 2017 13:08:26 +0000
> > From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > CC: "Richardson, Bruce" <bruce.richardson@intel.com>, "dev@dpdk.org"
> > <dev@dpdk.org>, "thomas@monjalon.net" <thomas@monjalon.net>, "Wiles,
> > Keith" <keith.wiles@intel.com>
> > Subject: RE: Service lcores and Application lcores
> >
> > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > Sent: Friday, June 30, 2017 1:52 PM
> > > To: Van Haaren, Harry <harry.van.haaren@intel.com>
> > > Cc: Richardson, Bruce <bruce.richardson@intel.com>; dev@dpdk.org; thomas@monjalon.net;
> > > Wiles, Keith <keith.wiles@intel.com>
> > > Subject: Re: Service lcores and Application lcores
> > >
> > > -----Original Message-----
> > > > Date: Fri, 30 Jun 2017 10:00:18 +0000
> > > > From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "Richardson, Bruce"
> > > > <bruce.richardson@intel.com>
> > > > CC: "dev@dpdk.org" <dev@dpdk.org>, "thomas@monjalon.net"
> > > > <thomas@monjalon.net>, "Wiles, Keith" <keith.wiles@intel.com>
> > > > Subject: RE: Service lcores and Application lcores
> >
> > <snip previous non-related items>
> >
> > > > I don't think providing a remote-launch API is actually beneficial.
> > > > Remote-launching a single service is equivalent to adding that lcore
> > > > as a service-core, and mapping it to just that single service.
> > > > The advantage of adding it as a service core, is future-proofing for
> > > > if more services need to be added to that core in future, and
> > > > statistics of the service core infrastructure. A convenience API
> > > > could be provided to perform the core_add(), service_start(),
> > > > enable_on_service() and core_start() APIs in one.
> > > >
> > > > Also, the remote_launch API doesn't solve the original problem - what
> > > > if an application lcore wishes to run one iteration of a service
> > > > "manually". The remote_launch style API does not solve this problem.
> > >
> > > Agree with problem statement. But, remote_launch() operates on lcores not on
> > > not necessary on 1:1 mapped physical cores.
> > >
> > > By introducing "rte_service_iterate", We are creating a parallel infrastructure to
> > > run the service on non DPDK service lcores aka normal lcores.
> > > Is this really required? Is there any real advantage for
> > > application not use builtin service lcore infrastructure, rather than iterating over
> > > "rte_service_iterate" and run on normal lcores. If we really want to mux
> > > a physical core to N lcore, EAL already provides that in the form of threads.
> > >
> > > I think, providing too many parallel options for the same use case may be
> > > a overkill.
> > >
> > > Just my 2c.
> >
> >
> > The use-case that the rte_service_iterate() caters for is one where the
> > application wishes to run a service on an "ordinary app lcore", together
> > with an application workload.
> >
> > For example, the eventdev-scheduler and one worker can be run on the same
> > lcore. If the schedule() running thread *must* be a service lcore, we
> > would not be able to also use that lcore as an application worker core.
> >
> > That was my motivation for adding this API, I do agree with you above; it
> > is a second "parallel" method to run a service. I think there's enough
> > value in enabling the use-case as per example above to add it.
> >
> >
> > Do you see enough value in the use-case above to add the API?
>
> The above use case can be realized with --lcores='(0-1)@1' (two lcores on
> one physical core). I believe application writers never want to write
> code based on the specific number of cores available in the system. If they
> do, they will be stuck when running in another environment, with too
> many combinations to address.

Good point.

> For me it complicates service lcore usage. But if someone thinks it will be
> useful, then I don't have a strong objection.

We can easily add APIs later, but removing them isn't so easy. +1 from me to leave it out for now, and we can see about adding it for 17.11 if the need arises.

Thanks for your input, I'll spin a v3 without the rte_service_iterate() function, and that should be it then!
* Re: [dpdk-dev] Service lcores and Application lcores
2017-06-30 13:24 ` Van Haaren, Harry
@ 2017-06-30 13:51 ` Thomas Monjalon
0 siblings, 0 replies; 19+ messages in thread
From: Thomas Monjalon @ 2017-06-30 13:51 UTC (permalink / raw)
To: Van Haaren, Harry; +Cc: Jerin Jacob, Richardson, Bruce, dev, Wiles, Keith
30/06/2017 15:24, Van Haaren, Harry:
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> > > <snip previous non-related items>
> > >
> > > > > I don't think providing a remote-launch API is actually beneficial.
> > > > > Remote-launching a single service is equivalent to adding that lcore
> > > > > as a service-core, and mapping it to just that single service.
> > > > > The advantage of adding it as a service core, is future-proofing for
> > > > > if more services need to be added to that core in future, and
> > > > > statistics of the service core infrastructure. A convenience API
> > > > > could be provided to perform the core_add(), service_start(),
> > > > > enable_on_service() and core_start() APIs in one.
> > > > >
> > > > > Also, the remote_launch API doesn't solve the original problem - what
> > > > > if an application lcore wishes to run one iteration of a service
> > > > > "manually". The remote_launch style API does not solve this problem.
> > > >
> > > > Agree with problem statement. But, remote_launch() operates on lcores not on
> > > > not necessary on 1:1 mapped physical cores.
> > > >
> > > > By introducing "rte_service_iterate", We are creating a parallel infrastructure to
> > > > run the service on non DPDK service lcores aka normal lcores.
> > > > Is this really required? Is there any real advantage for
> > > > application not use builtin service lcore infrastructure, rather than iterating over
> > > > "rte_service_iterate" and run on normal lcores. If we really want to mux
> > > > a physical core to N lcore, EAL already provides that in the form of threads.
> > > >
> > > > I think, providing too many parallel options for the same use case may be
> > > > a overkill.
> > > >
> > > > Just my 2c.
> > >
> > >
> > > The use-case that the rte_service_iterate() caters for is one where the
> > > application wishes to run a service on an "ordinary app lcore", together
> > > with an application workload.
> > >
> > > For example, the eventdev-scheduler and one worker can be run on the same
> > > lcore. If the schedule() running thread *must* be a service lcore, we
> > > would not be able to also use that lcore as an application worker core.
> > >
> > > That was my motivation for adding this API, I do agree with you above; it
> > > is a second "parallel" method to run a service. I think there's enough
> > > value in enabling the use-case as per example above to add it.
> > >
> > >
> > > Do you see enough value in the use-case above to add the API?
> >
> > The above use case can be realized with --lcores='(0-1)@1' (two lcores on
> > one physical core). I believe application writers never want to write
> > code based on the specific number of cores available in the system. If they
> > do, they will be stuck when running in another environment, with too
> > many combinations to address.
>
> Good point.
>
> > For me it complicates service lcore usage. But if someone thinks it will be
> > useful, then I don't have a strong objection.
>
> We can easily add APIs later, but removing them isn't so easy. +1 from me to leave it out for now, and we can see about adding it for 17.11 if the need arises.
>
> Thanks for your input, I'll spin a v3 without the rte_service_iterate() function, and that should be it then!

I agree to leave it out and keep things simple.
Thread overview: 19+ messages
2017-06-29 14:36 [dpdk-dev] Service lcores and Application lcores Van Haaren, Harry
2017-06-29 15:16 ` Thomas Monjalon
2017-06-29 16:35 ` Van Haaren, Harry
2017-06-29 20:18 ` Thomas Monjalon
2017-06-30 8:52 ` Van Haaren, Harry
2017-06-30 9:29 ` Thomas Monjalon
2017-06-30 10:18 ` Van Haaren, Harry
2017-06-30 10:38 ` Thomas Monjalon
2017-06-30 11:14 ` Van Haaren, Harry
2017-06-30 13:04 ` Jerin Jacob
2017-06-30 13:16 ` Van Haaren, Harry
2017-06-29 15:57 ` Bruce Richardson
2017-06-30 4:45 ` Jerin Jacob
2017-06-30 10:00 ` Van Haaren, Harry
2017-06-30 12:51 ` Jerin Jacob
2017-06-30 13:08 ` Van Haaren, Harry
2017-06-30 13:20 ` Jerin Jacob
2017-06-30 13:24 ` Van Haaren, Harry
2017-06-30 13:51 ` Thomas Monjalon