DPDK patches and discussions
* Eventdev dequeue-enqueue event correlation
@ 2023-10-23 16:10 Mattias Rönnblom
  2023-10-24  8:10 ` Bruce Richardson
  0 siblings, 1 reply; 6+ messages in thread
From: Mattias Rönnblom @ 2023-10-23 16:10 UTC (permalink / raw)
  To: dev
  Cc: Jerin Jacob, Peter Nilsson, svante.jarvstrat, Harry van Haaren,
	Abdullah Sevincer

Hi.

Consider an Eventdev app using atomic-type scheduling doing something like:

     struct rte_event events[3];

     rte_event_dequeue_burst(dev_id, port_id, events, 3, 0);

     /* Assume three events were dequeued, and the application decides
      * it's best off processing events 0 and 2 consecutively. */

     process(&events[0]);
     process(&events[2]);

     events[0].queue_id++;
     events[0].op = RTE_EVENT_OP_FORWARD;
     events[2].queue_id++;
     events[2].op = RTE_EVENT_OP_FORWARD;

     rte_event_enqueue_burst(dev_id, port_id, &events[0], 1);
     rte_event_enqueue_burst(dev_id, port_id, &events[2], 1);

     process(&events[1]);
     events[1].queue_id++;
     events[1].op = RTE_EVENT_OP_FORWARD;

     rte_event_enqueue_burst(dev_id, port_id, &events[1], 1);

If one were to just read the Eventdev API spec, they might expect this 
to work (especially since impl_opaque is hinted at as being potentially 
useful for the purpose of identifying events).

However, on certain event devices, it doesn't (and maybe rightly so). If 
events 0 and 2 belong to the same flow (queue id + flow id pair), and 
event 1 belongs to some other flow, then that other flow would be 
"unlocked" at the point of the second enqueue operation (and thus could 
be processed on some other core, in parallel). The first flow would 
still be needlessly "locked".

Such event devices require the order of the enqueued events to be the 
same as the dequeued events, using RTE_EVENT_OP_RELEASE type events as 
"fillers" for dropped events.

Am I missing something in the Eventdev API documentation?

Could an event device use the impl_opaque field to track the identity of 
an event (and thus relax the ordering requirements) and still be 
compliant with the API?

What happens if a RTE_EVENT_OP_NEW event is inserted into the mix of 
OP_FORWARD and OP_RELEASE type events being enqueued? Again I'm not 
clear on what the API says, if anything.

Regards,
	Mattias


* Re: Eventdev dequeue-enqueue event correlation
  2023-10-23 16:10 Eventdev dequeue-enqueue event correlation Mattias Rönnblom
@ 2023-10-24  8:10 ` Bruce Richardson
  2023-10-24  9:10   ` Bruce Richardson
  0 siblings, 1 reply; 6+ messages in thread
From: Bruce Richardson @ 2023-10-24  8:10 UTC (permalink / raw)
  To: Mattias Rönnblom
  Cc: dev, Jerin Jacob, Peter Nilsson, svante.jarvstrat,
	Harry van Haaren, Abdullah Sevincer

On Mon, Oct 23, 2023 at 06:10:54PM +0200, Mattias Rönnblom wrote:
> Hi.
> 
> Consider an Eventdev app using atomic-type scheduling doing something like:
> 
>     struct rte_event events[3];
> 
>     rte_event_dequeue_burst(dev_id, port_id, events, 3, 0);
> 
>     /* Assume three events were dequeued, and the application decides
>      * it's best off processing events 0 and 2 consecutively. */
> 
>     process(&events[0]);
>     process(&events[2]);
> 
>     events[0].queue_id++;
>     events[0].op = RTE_EVENT_OP_FORWARD;
>     events[2].queue_id++;
>     events[2].op = RTE_EVENT_OP_FORWARD;
> 
>     rte_event_enqueue_burst(dev_id, port_id, &events[0], 1);
>     rte_event_enqueue_burst(dev_id, port_id, &events[2], 1);
> 
>     process(&events[1]);
>     events[1].queue_id++;
>     events[1].op = RTE_EVENT_OP_FORWARD;
> 
>     rte_event_enqueue_burst(dev_id, port_id, &events[1], 1);
> 
> If one were to just read the Eventdev API spec, they might expect this to
> work (especially since impl_opaque is hinted at as being potentially useful
> for the purpose of identifying events).
> 
> However, on certain event devices, it doesn't (and maybe rightly so). If
> events 0 and 2 belong to the same flow (queue id + flow id pair), and event
> 1 belongs to some other flow, then that other flow would be "unlocked" at
> the point of the second enqueue operation (and thus could be processed on
> some other core, in parallel). The first flow would still be needlessly
> "locked".
> 
> Such event devices require the order of the enqueued events to be the same
> as the dequeued events, using RTE_EVENT_OP_RELEASE type events as "fillers"
> for dropped events.
> 
> Am I missing something in the Eventdev API documentation?
> 

Much more likely is that the documentation is missing something. We should
explicitly clarify this behaviour, as it's required by a number of drivers.

> Could an event device use the impl_opaque field to track the identity of an
> event (and thus relax the ordering requirements) and still be compliant with
> the API?
> 

Possibly, but the documentation also doesn't report that the impl_opaque
field must be preserved between dequeue and enqueue. When forwarding a
packet it's well possible for an app to extract an mbuf from a dequeued
event and create a new event for sending it back in to the eventdev. For
example, if the first stage post-RX is doing classify, it's entirely
possible for every single field in the event header to be different for the
event returned compared to dequeue (flow_id recomputed, event type/source
adjusted, target queue_id and priority updated, op type changed to forward
from new, etc. etc.).
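
Roughly, such a stage might look like the below (classify() and struct
classify_result are stand-ins for application code, not any real API):

     struct rte_event ev;

     if (rte_event_dequeue_burst(dev_id, port_id, &ev, 1, 0) == 1) {
             struct classify_result res = classify(ev.mbuf); /* app code */

             ev.flow_id = res.flow_id;            /* recomputed */
             ev.event_type = RTE_EVENT_TYPE_CPU;  /* source adjusted */
             ev.sub_event_type = res.stage;
             ev.queue_id = res.next_queue;        /* target queue updated */
             ev.priority = res.priority;
             ev.op = RTE_EVENT_OP_FORWARD;        /* changed from NEW */

             rte_event_enqueue_burst(dev_id, port_id, &ev, 1);
     }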

> What happens if a RTE_EVENT_OP_NEW event is inserted into the mix of
> OP_FORWARD and OP_RELEASE type events being enqueued? Again I'm not clear on
> what the API says, if anything.
>
OP_NEW should have no effect on the "history-list" of events previously
dequeued. Again, our docs should clarify that explicitly. Thanks for
calling all this out.

/Bruce 


* Re: Eventdev dequeue-enqueue event correlation
  2023-10-24  8:10 ` Bruce Richardson
@ 2023-10-24  9:10   ` Bruce Richardson
  2023-10-25  7:40     ` Mattias Rönnblom
  0 siblings, 1 reply; 6+ messages in thread
From: Bruce Richardson @ 2023-10-24  9:10 UTC (permalink / raw)
  To: Mattias Rönnblom
  Cc: dev, Jerin Jacob, Peter Nilsson, svante.jarvstrat,
	Harry van Haaren, Abdullah Sevincer

On Tue, Oct 24, 2023 at 09:10:30AM +0100, Bruce Richardson wrote:
> On Mon, Oct 23, 2023 at 06:10:54PM +0200, Mattias Rönnblom wrote:
> > Hi.
> > 
> > Consider an Eventdev app using atomic-type scheduling doing something like:
> > 
> >     struct rte_event events[3];
> > 
> >     rte_event_dequeue_burst(dev_id, port_id, events, 3, 0);
> > 
> >     /* Assume three events were dequeued, and the application decides
> >      * it's best off processing events 0 and 2 consecutively. */
> > 
> >     process(&events[0]);
> >     process(&events[2]);
> > 
> >     events[0].queue_id++;
> >     events[0].op = RTE_EVENT_OP_FORWARD;
> >     events[2].queue_id++;
> >     events[2].op = RTE_EVENT_OP_FORWARD;
> > 
> >     rte_event_enqueue_burst(dev_id, port_id, &events[0], 1);
> >     rte_event_enqueue_burst(dev_id, port_id, &events[2], 1);
> > 
> >     process(&events[1]);
> >     events[1].queue_id++;
> >     events[1].op = RTE_EVENT_OP_FORWARD;
> > 
> >     rte_event_enqueue_burst(dev_id, port_id, &events[1], 1);
> > 
> > If one were to just read the Eventdev API spec, they might expect this to
> > work (especially since impl_opaque is hinted at as being potentially useful
> > for the purpose of identifying events).
> > 
> > However, on certain event devices, it doesn't (and maybe rightly so). If
> > events 0 and 2 belong to the same flow (queue id + flow id pair), and event
> > 1 belongs to some other flow, then that other flow would be "unlocked" at
> > the point of the second enqueue operation (and thus could be processed on
> > some other core, in parallel). The first flow would still be needlessly
> > "locked".
> > 
> > Such event devices require the order of the enqueued events to be the same
> > as the dequeued events, using RTE_EVENT_OP_RELEASE type events as "fillers"
> > for dropped events.
> > 
> > Am I missing something in the Eventdev API documentation?
> > 
> 
> Much more likely is that the documentation is missing something. We should
> explicitly clarify this behaviour, as it's required by a number of drivers.
> 
> > Could an event device use the impl_opaque field to track the identity of an
> > event (and thus relax the ordering requirements) and still be compliant
> > with the API?
> > 
> 
> Possibly, but the documentation also doesn't report that the impl_opaque
> field must be preserved between dequeue and enqueue. When forwarding a
> packet it's well possible for an app to extract an mbuf from a dequeued
> event and create a new event for sending it back in to the eventdev. For
> example, if the first stage post-RX is doing classify, it's entirely
> possible for every single field in the event header to be different for the
> event returned compared to dequeue (flow_id recomputed, event type/source
> adjusted, target queue_id and priority updated, op type changed to forward
> from new, etc. etc.).
> 
> > What happens if a RTE_EVENT_OP_NEW event is inserted into the mix of
> > OP_FORWARD and OP_RELEASE type events being enqueued? Again I'm not clear on
> > what the API says, if anything.
> >
> OP_NEW should have no effect on the "history-list" of events previously
> dequeued. Again, our docs should clarify that explicitly. Thanks for
> calling all this out.
>
Looking at the docs we have, I would propose adding a new subsection "Event
Operations", as section 49.1.6 to [1]. There we could explain "New",
"Forward" and "Release" events - what they mean for the different queue
types and how to use them. That section could also cover the enqueue
ordering rules, as the use of event "history" is necessary to explain
releases and forwards.

Does this seem reasonable? If nobody else has already started on updating
the docs for this, I'm happy enough to give it a stab.

/Bruce

[1] https://doc.dpdk.org/guides-23.07/prog_guide/eventdev.html 


* Re: Eventdev dequeue-enqueue event correlation
  2023-10-24  9:10   ` Bruce Richardson
@ 2023-10-25  7:40     ` Mattias Rönnblom
  2023-10-25 12:29       ` Bruce Richardson
  2024-01-16 14:58       ` Bruce Richardson
  0 siblings, 2 replies; 6+ messages in thread
From: Mattias Rönnblom @ 2023-10-25  7:40 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: dev, Jerin Jacob, Peter Nilsson, svante.jarvstrat,
	Harry van Haaren, Abdullah Sevincer, Mattias Rönnblom

On 2023-10-24 11:10, Bruce Richardson wrote:
> On Tue, Oct 24, 2023 at 09:10:30AM +0100, Bruce Richardson wrote:
>> On Mon, Oct 23, 2023 at 06:10:54PM +0200, Mattias Rönnblom wrote:
>>> Hi.
>>>
>>> Consider an Eventdev app using atomic-type scheduling doing something like:
>>>
>>>      struct rte_event events[3];
>>>
>>>      rte_event_dequeue_burst(dev_id, port_id, events, 3, 0);
>>>
>>>      /* Assume three events were dequeued, and the application decides
>>>       * it's best off processing events 0 and 2 consecutively. */
>>>
>>>      process(&events[0]);
>>>      process(&events[2]);
>>>
>>>      events[0].queue_id++;
>>>      events[0].op = RTE_EVENT_OP_FORWARD;
>>>      events[2].queue_id++;
>>>      events[2].op = RTE_EVENT_OP_FORWARD;
>>>
>>>      rte_event_enqueue_burst(dev_id, port_id, &events[0], 1);
>>>      rte_event_enqueue_burst(dev_id, port_id, &events[2], 1);
>>>
>>>      process(&events[1]);
>>>      events[1].queue_id++;
>>>      events[1].op = RTE_EVENT_OP_FORWARD;
>>>
>>>      rte_event_enqueue_burst(dev_id, port_id, &events[1], 1);
>>>
>>> If one were to just read the Eventdev API spec, they might expect this to
>>> work (especially since impl_opaque is hinted at as being potentially useful
>>> for the purpose of identifying events).
>>>
>>> However, on certain event devices, it doesn't (and maybe rightly so). If
>>> events 0 and 2 belong to the same flow (queue id + flow id pair), and event
>>> 1 belongs to some other flow, then that other flow would be "unlocked" at
>>> the point of the second enqueue operation (and thus could be processed on
>>> some other core, in parallel). The first flow would still be needlessly
>>> "locked".
>>>
>>> Such event devices require the order of the enqueued events to be the same
>>> as the dequeued events, using RTE_EVENT_OP_RELEASE type events as "fillers"
>>> for dropped events.
>>>
>>> Am I missing something in the Eventdev API documentation?
>>>
>>
>> Much more likely is that the documentation is missing something. We should
>> explicitly clarify this behaviour, as it's required by a number of drivers.
>>
>>> Could an event device use the impl_opaque field to track the identity of an
>>> event (and thus relax the ordering requirements) and still be compliant
>>> with the API?
>>>
>>
>> Possibly, but the documentation also doesn't report that the impl_opaque
>> field must be preserved between dequeue and enqueue. When forwarding a
>> packet it's well possible for an app to extract an mbuf from a dequeued
>> event and create a new event for sending it back in to the eventdev. For

Such a behavior would be in violation of a part of the Eventdev API 
contract that actually is specified. The rte_event struct documentation 
says about impl_opaque that "An implementation may use this field to 
hold implementation specific value to share between dequeue and enqueue 
operation. The application should not modify this field. "

I see no other way to read this than that "an implementation" here is 
referring to an event device PMD. The requirement that the application 
can't modify this field only makes sense in the context of "from dequeue 
to enqueue".

>> example, if the first stage post-RX is doing classify, it's entirely
>> possible for every single field in the event header to be different for the
>> event returned compared to dequeue (flow_id recomputed, event type/source
>> adjusted, target queue_id and priority updated, op type changed to forward
>> from new, etc. etc.).
>>
>>> What happens if a RTE_EVENT_OP_NEW event is inserted into the mix of
>>> OP_FORWARD and OP_RELEASE type events being enqueued? Again I'm not clear on
>>> what the API says, if anything.
>>>
>> OP_NEW should have no effect on the "history-list" of events previously
>> dequeued. Again, our docs should clarify that explicitly. Thanks for
>> calling all this out.
>>
> Looking at the docs we have, I would propose adding a new subsection "Event
> Operations", as section 49.1.6 to [1]. There we could explain "New",
> "Forward" and "Release" events - what they mean for the different queue
> types and how to use them. That section could also cover the enqueue
> ordering rules, as the use of event "history" is necessary to explain
> releases and forwards.
> 
> Does this seem reasonable? If nobody else has already started on updating
> the docs for this, I'm happy enough to give it a stab.
> 

Batch dequeues not only provide an opportunity to amortize 
per-interaction overhead with the event device, they also allow the 
application to reshuffle the order in which it decides to process the 
events.

Such reshuffling may have a very significant impact on performance. At a 
minimum, cache locality improves, and in case the app is able to do 
"vector processing" (e.g., something akin to what fd.io VPP does), the 
gains may be further increased.

One may argue the app/core should just "do what it's told" by the event 
device. After all, an event device is a work scheduler, and reshuffling 
items of work certainly counts as (micro-)scheduling work.

However, it is too much to hope for to expect a fairly generic function, 
especially if it comes in the form of hardware with a design frozen 
years ago, to be able to arrange the work in whatever order is currently 
optimal for one particular application.

What such an app can do (or must do, if it has efficiency constraints) 
is to buffer the events on the output side, rearranging them in 
accordance with the yet-seemingly-undocumented Eventdev API contract. 
That's certainly possible, and not very difficult, but it seems to me 
that this really is the job of something in the platform (e.g., Eventdev 
itself or the event device PMD).
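
A minimal sketch of such output-side buffering, assuming a device that 
requires enqueue in dequeue order (pick_processing_order() is a 
placeholder for application logic):

     #define MAX_BURST 32 /* example burst size */

     struct rte_event burst[MAX_BURST];
     uint16_t order[MAX_BURST];
     uint16_t i, n;

     n = rte_event_dequeue_burst(dev_id, port_id, burst, MAX_BURST, 0);

     /* The app picks the processing order (e.g., events grouped per
      * flow, for cache locality and "vector processing").
      */
     pick_processing_order(burst, n, order);

     for (i = 0; i < n; i++) {
             struct rte_event *ev = &burst[order[i]];

             process(ev);
             ev->queue_id++;
             ev->op = RTE_EVENT_OP_FORWARD;
     }

     /* Results stay in their dequeue slots, so the burst can be
      * enqueued in the original (dequeue) order.
      */
     rte_event_enqueue_burst(dev_id, port_id, burst, n);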

One way out of this could be to add an "implicit release-*only*" mode of 
operation for eventdev.

In such a mode, the RTE_SCHED_TYPE_ATOMIC per-flow "lock" (and its 
ORDERED equivalent, if there is one) would be held until the next 
dequeue. In such a mode, the difference between OP_FORWARD and OP_NEW 
events would just be the back-pressure watermark (new_event_threshold).

That pre-rte_event_enqueue_burst() buffering would prevent the event 
device from releasing "locks" that could otherwise be released, but the 
typical cost of event device interaction is so high that I have my 
doubts about how useful that feature is. If you are worried about 
"locks" held for a long time, one may need to use short bursts anyway 
(since worst-case critical section length is not reduced by such 
RELEASEs).

Another option would be to have the current RTE_EVENT_DEV_CAP_BURST_MODE 
capable PMDs start using the "impl_opaque" field for the purpose of 
matching in and out events. It would require applications to actually 
start adhering to the "don't touch impl_opaque" requirement of the 
Eventdev API.
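
On the application side, adhering to that requirement simply means 
modifying the dequeued event in place (or copying it wholesale) rather 
than building a fresh rte_event, so that impl_opaque survives from 
dequeue to enqueue:

     struct rte_event ev;

     if (rte_event_dequeue_burst(dev_id, port_id, &ev, 1, 0) == 1) {
             process(&ev);

             ev.queue_id++;                 /* next pipeline stage */
             ev.op = RTE_EVENT_OP_FORWARD;
             /* ev.impl_opaque is left untouched */

             rte_event_enqueue_burst(dev_id, port_id, &ev, 1);
     }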

Those "fixes" are not mutually exclusive.

A side note: it's unfortunate there are no bits in the rte_event struct 
that can be used for "event id"/"event SN"/"event dequeue idx" type 
information, if an app would like to work around this issue with current 
PMDs.

> /Bruce
> 
> [1] https://doc.dpdk.org/guides-23.07/prog_guide/eventdev.html


* Re: Eventdev dequeue-enqueue event correlation
  2023-10-25  7:40     ` Mattias Rönnblom
@ 2023-10-25 12:29       ` Bruce Richardson
  2024-01-16 14:58       ` Bruce Richardson
  1 sibling, 0 replies; 6+ messages in thread
From: Bruce Richardson @ 2023-10-25 12:29 UTC (permalink / raw)
  To: Mattias Rönnblom
  Cc: dev, Jerin Jacob, Peter Nilsson, svante.jarvstrat,
	Harry van Haaren, Abdullah Sevincer, Mattias Rönnblom

On Wed, Oct 25, 2023 at 09:40:54AM +0200, Mattias Rönnblom wrote:
> On 2023-10-24 11:10, Bruce Richardson wrote:
> > On Tue, Oct 24, 2023 at 09:10:30AM +0100, Bruce Richardson wrote:
> > > On Mon, Oct 23, 2023 at 06:10:54PM +0200, Mattias Rönnblom wrote:
> > > > Hi.
> > > > 
> > > > Consider an Eventdev app using atomic-type scheduling doing something like:
> > > > 
> > > >      struct rte_event events[3];
> > > > 
> > > >      rte_event_dequeue_burst(dev_id, port_id, events, 3, 0);
> > > > 
> > > >      /* Assume three events were dequeued, and the application decides
> > > >       * it's best off processing events 0 and 2 consecutively. */
> > > > 
> > > >      process(&events[0]);
> > > >      process(&events[2]);
> > > > 
> > > >      events[0].queue_id++;
> > > >      events[0].op = RTE_EVENT_OP_FORWARD;
> > > >      events[2].queue_id++;
> > > >      events[2].op = RTE_EVENT_OP_FORWARD;
> > > > 
> > > >      rte_event_enqueue_burst(dev_id, port_id, &events[0], 1);
> > > >      rte_event_enqueue_burst(dev_id, port_id, &events[2], 1);
> > > > 
> > > >      process(&events[1]);
> > > >      events[1].queue_id++;
> > > >      events[1].op = RTE_EVENT_OP_FORWARD;
> > > > 
> > > >      rte_event_enqueue_burst(dev_id, port_id, &events[1], 1);
> > > > 
> > > > If one were to just read the Eventdev API spec, they might expect this to
> > > > work (especially since impl_opaque is hinted at as being potentially useful
> > > > for the purpose of identifying events).
> > > > 
> > > > However, on certain event devices, it doesn't (and maybe rightly so). If
> > > > events 0 and 2 belong to the same flow (queue id + flow id pair), and event
> > > > 1 belongs to some other flow, then that other flow would be "unlocked" at
> > > > the point of the second enqueue operation (and thus could be processed on
> > > > some other core, in parallel). The first flow would still be needlessly
> > > > "locked".
> > > > 
> > > > Such event devices require the order of the enqueued events to be the same
> > > > as the dequeued events, using RTE_EVENT_OP_RELEASE type events as "fillers"
> > > > for dropped events.
> > > > 
> > > > Am I missing something in the Eventdev API documentation?
> > > > 
> > > 
> > > Much more likely is that the documentation is missing something. We should
> > > explicitly clarify this behaviour, as it's required by a number of drivers.
> > > 
> > > > Could an event device use the impl_opaque field to track the identity of an
> > > > event (and thus relax the ordering requirements) and still be compliant
> > > > with the API?
> > > > 
> > > 
> > > Possibly, but the documentation also doesn't report that the impl_opaque
> > > field must be preserved between dequeue and enqueue. When forwarding a
> > > packet it's well possible for an app to extract an mbuf from a dequeued
> > > event and create a new event for sending it back in to the eventdev. For
> 
> Such a behavior would be in violation of a part of the Eventdev API contract
> that actually is specified. The rte_event struct documentation says about
> impl_opaque that "An implementation may use this field to hold
> implementation specific value to share between dequeue and enqueue
> operation. The application should not modify this field. "
> 
> I see no other way to read this than that "an implementation" here is
> referring to an event device PMD. The requirement that the application can't
> modify this field only makes sense in the context of "from dequeue to
> enqueue".
> 

Yep, you are completely correct. For some reason, I had this in my head the
other way round, that it was for internal use between the enqueue and
dequeue. My mistake! :-(

> > > example, if the first stage post-RX is doing classify, it's entirely
> > > possible for every single field in the event header to be different for the
> > > event returned compared to dequeue (flow_id recomputed, event type/source
> > > adjusted, target queue_id and priority updated, op type changed to forward
> > > from new, etc. etc.).
> > > 
> > > > What happens if a RTE_EVENT_OP_NEW event is inserted into the mix of
> > > > OP_FORWARD and OP_RELEASE type events being enqueued? Again I'm not clear on
> > > > what the API says, if anything.
> > > > 
> > > OP_NEW should have no effect on the "history-list" of events previously
> > > dequeued. Again, our docs should clarify that explicitly. Thanks for
> > > calling all this out.
> > > 
> > Looking at the docs we have, I would propose adding a new subsection "Event
> > Operations", as section 49.1.6 to [1]. There we could explain "New",
> > "Forward" and "Release" events - what they mean for the different queue
> > types and how to use them. That section could also cover the enqueue
> > ordering rules, as the use of event "history" is necessary to explain
> > releases and forwards.
> > 
> > Does this seem reasonable? If nobody else has already started on updating
> > the docs for this, I'm happy enough to give it a stab.
> > 
> 
> Batch dequeues not only provide an opportunity to amortize per-interaction
> overhead with the event device, they also allow the application to reshuffle
> the order in which it decides to process the events.
> 
> Such reshuffling may have a very significant impact on performance. At a
> minimum, cache locality improves, and in case the app is able to do "vector
> processing" (e.g., something akin to what fd.io VPP does), the gains may be
> further increased.
> 
> One may argue the app/core should just "do what it's told" by the event
> device. After all, an event device is a work scheduler, and reshuffling
> items of work certainly counts as (micro-)scheduling work.
> 
> However, it is too much to hope for to expect a fairly generic function,
> especially if it comes in the form of hardware with a design frozen years
> ago, to be able to arrange the work in whatever order is currently optimal
> for one particular application.
> 
> What such an app can do (or must do, if it has efficiency constraints) is to
> buffer the events on the output side, rearranging them in accordance with
> the yet-seemingly-undocumented Eventdev API contract. That's certainly
> possible, and not very difficult, but it seems to me that this really is the
> job of something in the platform (e.g., Eventdev itself or the event device
> PMD).
> 
> One way out of this could be to add an "implicit release-*only*" mode of
> operation for eventdev.
> 
> In such a mode, the RTE_SCHED_TYPE_ATOMIC per-flow "lock" (and its ORDERED
> equivalent, if there is one) would be held until the next dequeue. In such a
> mode, the difference between OP_FORWARD and OP_NEW events would just be the
> back-pressure watermark (new_event_threshold).
> 
> That pre-rte_event_enqueue_burst() buffering would prevent the event device
> from releasing "locks" that could otherwise be released, but the typical
> cost of event device interaction is so high that I have my doubts about how
> useful that feature is. If you are worried about "locks" held for a long
> time, one may need to use short bursts anyway (since worst-case critical
> section length is not reduced by such RELEASEs).
> 
> Another option would be to have the current RTE_EVENT_DEV_CAP_BURST_MODE
> capable PMDs start using the "impl_opaque" field for the purpose of matching
> in and out events. It would require applications to actually start adhering
> to the "don't touch impl_opaque" requirement of the Eventdev API.
> 
> Those "fixes" are not mutually exclusive.
> 
> A side note: it's unfortunate there are no bits in the rte_event struct that
> can be used for "event id"/"event SN"/"event dequeue idx" type information,
> if an app would like to work around this issue with current PMDs.
> 
Lots of good points here. We'll take a look and see what we can do in our
drivers, and consider any other ideas or suggestions.

/Bruce


* Re: Eventdev dequeue-enqueue event correlation
  2023-10-25  7:40     ` Mattias Rönnblom
  2023-10-25 12:29       ` Bruce Richardson
@ 2024-01-16 14:58       ` Bruce Richardson
  1 sibling, 0 replies; 6+ messages in thread
From: Bruce Richardson @ 2024-01-16 14:58 UTC (permalink / raw)
  To: Mattias Rönnblom
  Cc: dev, Jerin Jacob, Peter Nilsson, svante.jarvstrat,
	Harry van Haaren, Mattias Rönnblom, Pravin Pathak

On Wed, Oct 25, 2023 at 09:40:54AM +0200, Mattias Rönnblom wrote:
<snip for brevity> 
> Another option would be to have the current RTE_EVENT_DEV_CAP_BURST_MODE
> capable PMDs start using the "impl_opaque" field for the purpose of matching
> in and out events. It would require applications to actually start adhering
> to the "don't touch impl_opaque" requirement of the Eventdev API.
> 
> Those "fixes" are not mutually exclusive.
> 
> A side note: it's unfortunate there are no bits in the rte_event struct that
> can be used for "event id"/"event SN"/"event dequeue idx" type information,
> if an app would like to work around this issue with current PMDs.
> 

Restarting this old thread.

Having looked at the eventdev API, I think that we need to tighten up the
specification for how enqueue-to-dequeue correlation is to be managed.
Right now, the spec seems to imply that the impl_opaque field should be
used to correlate enqueue and dequeue events, but I'm not sure if any
drivers use that. We have quite a number of drivers which require
re-enqueue in the same order as dequeue, others which just don't support
burst enq/deq (which avoids the issue), and others which may use other
methods to achieve this. There is no documentation that I have found
- written from the application-writer's viewpoint - describing how
enqueued events, whether for reordering or for releasing atomic locks,
are to be correlated with the equivalent dequeued event. This looks like
a major documentation gap.

I think the best approach overall is to mandate that the impl_opaque field
should be used for this, as Mattias suggests above. (Using
implicit-release-only will work for atomic flows where only locks need to
be released, but I don't believe it works for reordered flows, where we need
to correlate the new event with a specific order slot, not just a flow
lock.) Using the impl_opaque field and allowing reordering of events does
open an issue of how we deal with things like fragmentation, where one
dequeued event leads to multiple enqueued ones. For this, I suspect we may
need a new event type, called PARTIAL or FRAGMENT, to indicate that an event
is to be treated for ordering purposes the same as another event, without
actually releasing any atomic locks etc. for that event.
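
Purely as an illustration of that idea (RTE_EVENT_OP_FRAGMENT does not
exist in the current API, and frag_mbuf is made-up application data), the
enqueue side of a fragmenting stage might then look roughly like:

     struct rte_event frag = parent;      /* "parent" was dequeued earlier */

     frag.op = RTE_EVENT_OP_FRAGMENT;     /* hypothetical op type */
     frag.mbuf = frag_mbuf;               /* app-produced fragment */
     rte_event_enqueue_burst(dev_id, port_id, &frag, 1);

     /* Only the final FORWARD gives up the parent's order slot /
      * atomic context; the FRAGMENT above shares it.
      */
     parent.op = RTE_EVENT_OP_FORWARD;
     rte_event_enqueue_burst(dev_id, port_id, &parent, 1);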

To try and move this along, and make the discussion more focused and
concrete, I'll do up a patchset to try and improve the eventdev
documentation, and as part of that, try and document exactly what behaviour
an app should expect when forwarding events between dequeue and enqueue.
Even if the enq-deq problem is still controversial, I think there is
probably more clarification we can do anyway.

Any further thoughts or comments?

Regards,
/Bruce


end of thread

Thread overview: 6+ messages
2023-10-23 16:10 Eventdev dequeue-enqueue event correlation Mattias Rönnblom
2023-10-24  8:10 ` Bruce Richardson
2023-10-24  9:10   ` Bruce Richardson
2023-10-25  7:40     ` Mattias Rönnblom
2023-10-25 12:29       ` Bruce Richardson
2024-01-16 14:58       ` Bruce Richardson
