DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>
To: Jerin Jacob <jerinjacobk@gmail.com>
Cc: dpdk-dev <dev@dpdk.org>, Jerin Jacob <jerinj@marvell.com>,
	Gage Eads <gage.eads@intel.com>,
	Bruce Richardson <bruce.richardson@intel.com>
Subject: Re: [dpdk-dev] [RFC 1/3] eventdev: allow for event devices requiring maintenance
Date: Mon, 20 Apr 2020 09:05:26 +0000	[thread overview]
Message-ID: <1f3badcf-08cf-7d8c-d912-7f36031f3a4f@ericsson.com> (raw)
In-Reply-To: <CALBAE1PL9BG6Nir6GxbrgDKciYz3vk9s+gjCnSQRhh2=gBwcXA@mail.gmail.com>

On 2020-04-16 19:19, Jerin Jacob wrote:
> On Tue, Apr 14, 2020 at 11:25 PM Mattias Rönnblom
> <mattias.ronnblom@ericsson.com> wrote:
>> On 2020-04-14 18:15, Jerin Jacob wrote:
>>> On Tue, Apr 14, 2020 at 9:27 PM Mattias Rönnblom
>>> <mattias.ronnblom@ericsson.com> wrote:
>>>> On 2020-04-10 15:00, Jerin Jacob wrote:
>>>>> On Thu, Apr 9, 2020 at 7:32 PM Mattias Rönnblom
>>>>> <mattias.ronnblom@ericsson.com> wrote:
>>>>>> On 2020-04-09 15:32, Jerin Jacob wrote:
>>>>>>> On Thu, Apr 9, 2020 at 5:51 PM Mattias Rönnblom
>>>>>>> <mattias.ronnblom@ericsson.com> wrote:
>>>>>>>> On 2020-04-08 21:36, Jerin Jacob wrote:
>>>>>>>>> On Wed, Apr 8, 2020 at 11:27 PM Mattias Rönnblom
>>>>>>>>> <mattias.ronnblom@ericsson.com> wrote:
>>>>>>>>>> Extend Eventdev API to allow for event devices which require various
>>>>>>>>>> forms of internal processing to happen, even when events are not
>>>>>>>>>> enqueued to or dequeued from a port.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>>>>>>>>> ---
>>>>>>>>>>       lib/librte_eventdev/rte_eventdev.h     | 65 ++++++++++++++++++++++++++
>>>>>>>>>>       lib/librte_eventdev/rte_eventdev_pmd.h | 14 ++++++
>>>>>>>>>>       2 files changed, 79 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/lib/librte_eventdev/rte_eventdev.h b/lib/librte_eventdev/rte_eventdev.h
>>>>>>>>>> index 226f352ad..d69150792 100644
>>>>>>>>>> --- a/lib/librte_eventdev/rte_eventdev.h
>>>>>>>>>> +++ b/lib/librte_eventdev/rte_eventdev.h
>>>>>>>>>> @@ -289,6 +289,15 @@ struct rte_event;
>>>>>>>>>>        * single queue to each port or map a single queue to many port.
>>>>>>>>>>        */
>>>>>>>>>>
>>>>>>>>>> +#define RTE_EVENT_DEV_CAP_REQUIRES_MAINT (1ULL << 9)
>>>>>>>>>> +/**< Event device requires calls to rte_event_maintain() during
>>>>>>>>> This scheme would call for DSW specific API handling in fastpath.
>>>>>>>> Initially this would be so, but buffering events might yield performance
>>>>>>>> benefits for more event devices than DSW.
>>>>>>>>
>>>>>>>>
>>>>>>>> In an application, it's often convenient, but sub-optimal from a
>>>>>>>> performance point of view, to do single-event enqueue operations. The
>>>>>>>> alternative is to use an application-level buffer, and the flush this
>>>>>>>> buffer with rte_event_enqueue_burst(). If you allow the event device to
>>>>>>>> buffer, you get the simplicity of single-event enqueue operations, but
>>>>>>>> without taking any noticeable performance hit.
>>>>>>> IMO, It is better to aggregate the burst by the application,  as sending
>>>>>>> event by event to the driver to aggregate has performance due to cost
>>>>>>> function pointer overhead.
>>>>>> That's a very slight overhead - but for optimal performance, sure. It'll
>>>>>> come at a cost in terms of code complexity. Just look at the adapters.
>>>>>> They do this already. I think some applications are ready to take the
>>>>>> extra 5-10 clock cycles or so it'll cost them to do the function call
>>>>>> (provided the event device had buffering support).
>>>>> So Is there any advantage of moving aggregation logic to PMD? it is costly.
>>>> What do you mean by aggregation logic?
>>> aggregation == buffering.
>>
>> The reason I put it into DSW to begin with was that it yielded a
>> significant performance benefit, for situations where the application
>> would enqueue() few or even single events per call. For DSW it will
>> translate to lower DPDK event ring overhead. I would imagine it could
>> improve performance for hardware-based event devices as well.
> Yes. we are already doing this in application. It makes sense for buffering.


Should I read this as "we are assuming the application will do this"?


(I'm not saying it's a totally unreasonable assumption.)


>>
>>>>>>> Another concern is the frequency of calling rte_event_maintain() function by
>>>>>>> the application, as the timing requirements will vary differently by
>>>>>>> the driver to driver and application to application.
>>>>>>> IMO, It is not portable and I believe the application should not be
>>>>>>> aware of those details. If the driver needs specific maintenance
>>>>>>> function for any other reason then better to use DPDK SERVICE core infra.
>>>>>> The only thing the application needs to be aware of, is that it needs to
>>>>>> call rte_event_maintain() as often as it would have called dequeue() in
>>>>>> your "typical worker" example. To make sure this call is cheap-enough is
>>>>>> up to the driver, and this needs to hold true for all event devices that
>>>>>> needs maintenance.
>>>>> Why not rte_event_maintain() can't do either in dequeue() or enqueue()
>>>>> in the driver context? Either one of them has to be called
>>>>> periodically by application
>>>>> in any case?
>>>> No, producer-only ports can go idle for long times. For applications
>>>> that don't "go idle" need not worry about the maintain function.
>>> If I understand it correctly, If the worker does not call enqueue() or dequeue()
>>> for a long time then maintain() needs to be called by the application.
>>>
>>> That's where I concern with this API. What is the definition of long
>>> time frame?(ns or ticks?)
>>> Will it be changing driver to driver and arch to arch? And it is
>>> leaking the driver requirements to the application.
>>>
>> It's difficult to quantify exactly, but the rate should be on the same
>> order of magnitude as you would call dequeue() on a consumer-type worker
> The challenge is if another driver has a different requirement for maintain()
> interms of frequecey if is a public API then how we will abstract?


It's a challenge indeed, like so often is the case for poll-type APIs.


My thought, and what's implemented in DSW today, is that the application 
will call maintain() with a high frequency, as I've mentioned before. If 
the event device periodically needs to some perform heavy-weight 
operation, such an operation will not be perform on every call. Rather, 
the event device keeps a counter, and only perform these on every Nth 
call. For DSW, one such operation is when a DSW port considers if it 
needs to migrate flows, and in that case where those flows should emigrate.


Whatever maintain() is doing, it makes the event device machinery make 
progress in some way or the other. Calling it rarely will translate to 
latency. Failing to maintain a sufficient call rate shouldn't affect 
correctness.


>> port. All the RTE_EVENT_DEV_CAP_* are essentially represents such
>> leakage, where the event device driver and/or the underlying hardware
>> express various capabilities and limitations.
> I agree. But in fastpath, we do not bring any such _functional_ difference.
> If it is on a slow path then no issue at all.
>

I'm not sure I follow. What do you mean by functional here?


The capabilities certainly change API semantics, and as a result how 
applications need to behave - including fast-path behavior. For example, 
consider an application which prefers 
RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE-capable event devices, but 
also need a code path for those that don't.


>>
>> I'm not sure it needs to be much more complicated for the application to
>> handle than the change to the event adapters I included in the patch.
>> There, it boils down the service function call rate, which would be high
>> during low load (causing buffers to be flush quickly etc), and a little
>> lower during high lcore load.
>>
>>
>>>>>> If you plan to use a non-buffering hardware device driver or a soft,
>>>>>> centralized scheduler that doesn't need this, it will also not set the
>>>>>> flag, and thus the application needs not care about the
>>>>>> rte_event_maintain() function. DPDK code such as the eventdev adapters
>>>>>> do need to care, but the increase in complexity is slight, and the cost
>>>>>> of calling rte_maintain_event() on a maintenance-free devices is very
>>>>>> low (since the then-NULL function pointer is in the eventdev struct,
>>>>>> likely on a cache-line already dragged in).
>>>>>>
>>>>>>
>>>>>> Unfortunately, DPDK doesn't have a per-core delayed-work mechanism.
>>>>>> Flushing event buffers (and other DSW "background work") can't be done
>>>>>> on a service core, since they would work on non-MT-safe data structures
>>>>>> on the worker thread's event ports.
>>>>> Yes. Otherwise, DSW needs to update to support MT safe.
>>>> I haven't been looking at this in detail, but I suspect it will be both
>>>> complex and not very performant. One of problems that need to be solved
>>>> in such a solution, is the "pausing" of flows during migration. All
>>>> participating lcores needs to ACK that a flow is paused.
>>> Could you share the patch on the details on how much it costs?
>>
>> I don't have a ready-made solution for making lcore ports thread-safe.
> OK
>
>>
>>>>>>>>>> + * periods when neither rte_event_dequeue_burst() nor
>>>>>>>>> The typical worker thread will be
>>>>>>>>> while (1) {
>>>>>>>>>                      rte_event_dequeue_burst();
>>>>>>>>>                       ..proess..
>>>>>>>>>                      rte_event_enqueue_burst();
>>>>>>>>> }
>>>>>>>>> If so, Why DSW driver can't do the maintenance in driver context in
>>>>>>>>> dequeue() call.
>>>>>>>>>
>>>>>>>> DSW already does maintenance on dequeue, and works well in the above
>>>>>>>> scenario. The typical worker does not need to care about the
>>>>>>>> rte_event_maintain() functions, since it dequeues events on a regular basis.
>>>>>>>>
>>>>>>>>
>>>>>>>> What this RFC addresses is the more atypical (but still fairly common)
>>>>>>>> case of a port being neither dequeued to or enqueued from on a regular
>>>>>>>> basis. The timer and ethernet rx adapters are examples of such.
>>>>>>> If it is an Adapter specific use case problem then maybe, we have
>>>>>>> an option to fix the problem in adapter specific API usage or in that area.
>>>>>>>
>>>>>> It's not adapter specific, I think. There might be producer-only ports,
>>>>>> for example, which doesn't provide a constant stream of events, but
>>>>>> rather intermittent bursts. A traffic generator is one example of such
>>>>>> an application, and there might be other, less synthetic ones as well.
>>>>> In that case, the application knows the purpose of the eventdev port.
>>>>> Is changing eventdev spec to configure "port" or "queue" for that use
>>>>> case help? Meaning, DSW or
>>>>> Any driver can get the hint and change the function pointers
>>>>> accordingly for fastpath.
>>>>> For instance, do maintenance on enqueue() for such ports or so.
>>>> This is what DSW does already today. A dequeue() call with a zero-length
>>>> event array serves the purpose of rte_event_maintain(). It's a bit of a
>>>> hack, in my opinion.
>>> I agree that it is the hack.
>>>
>>> One more concern we have we are adding API for the new driver requirements and
>>> not updating the example application. Sharing the example application
>>> patch would
>>> enable us to understand the cost and usage.(i.e Cost wrt to other SW drivers)
>>
>> Good point.
> I suggest sharing patch based on existing app/adapter usage, based on that, we
> can analyze more on abstraction and cost analysis.
>

Will do.


>
>>
>>
>>>>>>>>>> + * rte_event_enqueue_burst() are called on a port. This will allow the
>>>>>>>>>> + * event device to perform internal processing, such as flushing
>>>>>>>>>> + * buffered events, return credits to a global pool, or process
>>>>>>>>>> + * signaling related to load balancing.
>>>>>>>>>> + */



  reply	other threads:[~2020-04-20  9:05 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-08 17:56 Mattias Rönnblom
2020-04-08 17:56 ` [dpdk-dev] [RFC 2/3] event/dsw: make use of eventdev maintenance facility Mattias Rönnblom
2020-04-08 17:56 ` [dpdk-dev] [RFC 3/3] eventdev: allow devices requiring maintenance with adapters Mattias Rönnblom
2020-04-08 19:36 ` [dpdk-dev] [RFC 1/3] eventdev: allow for event devices requiring maintenance Jerin Jacob
2020-04-09 12:21   ` Mattias Rönnblom
2020-04-09 13:32     ` Jerin Jacob
2020-04-09 14:02       ` Mattias Rönnblom
2020-04-10 13:00         ` Jerin Jacob
2020-04-14 15:57           ` Mattias Rönnblom
2020-04-14 16:15             ` Jerin Jacob
2020-04-14 17:55               ` Mattias Rönnblom
2020-04-16 17:19                 ` Jerin Jacob
2020-04-20  9:05                   ` Mattias Rönnblom [this message]
2020-05-13 18:56                 ` Mattias Rönnblom
2021-08-02 16:14                   ` [dpdk-dev] [RFC v2 " Mattias Rönnblom
2021-08-02 16:15                     ` [dpdk-dev] [RFC v2 2/3] event/dsw: make use of eventdev maintenance facility Mattias Rönnblom
2021-08-02 16:15                     ` [dpdk-dev] [RFC v2 3/3] eventdev: have adapters support device maintenance Mattias Rönnblom
2021-08-03  4:39                     ` [dpdk-dev] [RFC v2 1/3] eventdev: allow for event devices requiring maintenance Jerin Jacob
2021-08-03  8:26                       ` Mattias Rönnblom
2021-08-03 10:06                         ` [dpdk-dev] [EXT] " Jerin Jacob Kollanukkaran
2021-10-26 17:31                         ` [dpdk-dev] [PATCH " Mattias Rönnblom
2021-10-26 17:31                           ` [dpdk-dev] [PATCH 2/3] event/dsw: make use of eventdev maintenance facility Mattias Rönnblom
2021-10-26 17:31                           ` [dpdk-dev] [PATCH 3/3] eventdev: have adapters support device maintenance Mattias Rönnblom
2021-10-29 14:38                           ` [dpdk-dev] [PATCH 1/3] eventdev: allow for event devices requiring maintenance Jerin Jacob
2021-10-29 15:03                             ` Mattias Rönnblom
2021-10-29 15:17                               ` Jerin Jacob
2021-10-29 16:02                                 ` Van Haaren, Harry
2021-10-31  9:29                                   ` Mattias Rönnblom
2021-10-30 17:19                                 ` Mattias Rönnblom
2021-10-31 13:17                                   ` Jerin Jacob
2021-11-01 13:28                                     ` [dpdk-dev] [PATCH v2 " Mattias Rönnblom
2021-11-01 13:28                                       ` [dpdk-dev] [PATCH v2 2/3] event/dsw: make use of eventdev maintenance facility Mattias Rönnblom
2021-11-01 13:28                                       ` [dpdk-dev] [PATCH v2 3/3] eventdev: have adapters support device maintenance Mattias Rönnblom
2021-11-04 12:33                                     ` [dpdk-dev] [PATCH 1/3] eventdev: allow for event devices requiring maintenance Jerin Jacob
2021-11-01  9:26                                 ` Mattias Rönnblom
2021-11-01 18:40                                   ` [dpdk-dev] [PATCH v3 " Mattias Rönnblom
2021-11-01 18:40                                     ` [dpdk-dev] [PATCH v3 2/3] event/dsw: make use of eventdev maintenance facility Mattias Rönnblom
2021-11-01 18:40                                     ` [dpdk-dev] [PATCH v3 3/3] eventdev: have adapters support device maintenance Mattias Rönnblom
2020-04-09 13:33     ` [dpdk-dev] [RFC 1/3] eventdev: allow for event devices requiring maintenance Eads, Gage
2020-04-09 14:14       ` Mattias Rönnblom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1f3badcf-08cf-7d8c-d912-7f36031f3a4f@ericsson.com \
    --to=mattias.ronnblom@ericsson.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=gage.eads@intel.com \
    --cc=jerinj@marvell.com \
    --cc=jerinjacobk@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).