DPDK patches and discussions
 help / color / mirror / Atom feed
From: Jerin Jacob <jerinjacobk@gmail.com>
To: "Mattias Rönnblom" <hofors@lysator.liu.se>
Cc: "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>,
	"jerinj@marvell.com" <jerinj@marvell.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Morten Brørup" <mb@smartsharesystems.com>
Subject: Re: [PATCH v3] eventdev: avoid non-burst shortcut for variable-size bursts
Date: Tue, 16 May 2023 18:38:23 +0530	[thread overview]
Message-ID: <CALBAE1Nzk_BmmVddCJH+=rp3OBcs5hv7uno+JNLzSUVm4cd=JQ@mail.gmail.com> (raw)
In-Reply-To: <f1d26ef6-d6a2-ec1c-e365-4447b5fa16e0@lysator.liu.se>

On Tue, May 16, 2023 at 2:22 AM Mattias Rönnblom <hofors@lysator.liu.se> wrote:
>
> On 2023-05-15 14:38, Jerin Jacob wrote:
> > On Fri, May 12, 2023 at 6:45 PM Mattias Rönnblom
> > <mattias.ronnblom@ericsson.com> wrote:
> >>
> >> On 2023-05-12 13:59, Jerin Jacob wrote:
> >>> On Thu, May 11, 2023 at 2:00 PM Mattias Rönnblom
> >>> <mattias.ronnblom@ericsson.com> wrote:
> >>>>
> >>>> Use non-burst event enqueue and dequeue calls from burst enqueue and
> >>>> dequeue only when the burst size is compile-time constant (and equal
> >>>> to one).
> >>>>
> >>>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> >>>>
> >>>> ---
> >>>>
> >>>> v3: Actually include the change v2 claimed to contain.
> >>>> v2: Wrap builtin call in __extension__, to avoid compiler warnings if
> >>>>       application is compiled with -pedantic. (Morten Brørup)
> >>>> ---
> >>>>    lib/eventdev/rte_eventdev.h | 4 ++--
> >>>>    1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> >>>> index a90e23ac8b..a471caeb6d 100644
> >>>> --- a/lib/eventdev/rte_eventdev.h
> >>>> +++ b/lib/eventdev/rte_eventdev.h
> >>>> @@ -1944,7 +1944,7 @@ __rte_event_enqueue_burst(uint8_t dev_id, uint8_t port_id,
> >>>>            * Allow zero cost non burst mode routine invocation if application
> >>>>            * requests nb_events as const one
> >>>>            */
> >>>> -       if (nb_events == 1)
> >>>> +       if (__extension__(__builtin_constant_p(nb_events)) && nb_events == 1)
> >>>
> >>> "Why" part is not clear from the commit message. Is this to avoid
> >>> nb_events read if it is built-in const.
> >>
> >> The __builtin_constant_p() is introduced to avoid having the compiler
> >> generate a conditional branch and two different code paths in case
> >> nb_elem is a run-time variable.
> >>
> >> In particular, this matters if nb_elems is run-time variable and varies
> >> between 1 and some larger value.
> >>
> >> I should have mention this in the commit message.
> >>
> >> A very slight performance improvement. It also makes the code better
> >> match the comment, imo. Zero cost for const one enqueues, but no impact
> >> non-compile-time-constant-length enqueues.
> >>
> >> Feel free to ignore.
> >
> >
> > I did some performance comparison of the patch.
> > A low-end ARM machines shows 0.7%  drop with single event case. No
> > regression see with high-end ARM cores with single event case.
> >
> > IMO, optimizing the check for burst mode(the new patch) may not show
> > any real improvement as the cost is divided by number of event.
> > Whereas optimizing the check for single event case(The current code)
> > shows better performance with single event case and no regression
> > with burst mode as cost is divided by number of events.
>
> I ran some tests on an AMD Zen 3 with DSW.
> In the below tests the enqueue burst size is not compile-time constant.
>
> Enqueue burst size      Performance improvement
> Run-time constant 1     ~5%
> Run-time constant 2     ~0%
> Run-time variable 1-2   ~9%
> Run-time variable 1-16  ~0%
>
> The run-time variable enqueue sizes randomly (uniformly) distributed in
> the specified range.
>
> The first result may come as a surprise. The benchmark is using
> RTE_EVENT_OP_FORWARD type events (which likely is the dominating op type
> in most apps). The single-event enqueue function only exists in a
> generic variant (i.e., no rte_event_enqueue_forward_burst() equivalent).
> I suspect that is the reason for the performance improvement.
>
> This effect is large-enough to make it somewhat beneficial (+~1%) to use
> run-time variable single-event enqueue compared to keeping the burst
> size compile-time constant.

# Interesting, Could you share your testeventdev command to test it.
# By having quick glance on DSW code, following change can be added(or
 similar cases).
Not sure such change in DSW driver is making a difference or nor?


diff --git a/drivers/event/dsw/dsw_event.c b/drivers/event/dsw/dsw_event.c
index e84b65d99f..455470997b 100644
--- a/drivers/event/dsw/dsw_event.c
+++ b/drivers/event/dsw/dsw_event.c
@@ -1251,7 +1251,7 @@ dsw_port_flush_out_buffers(struct dsw_evdev
*dsw, struct dsw_port *source_port)
 uint16_t
 dsw_event_enqueue(void *port, const struct rte_event *ev)
 {
-       return dsw_event_enqueue_burst(port, ev, unlikely(ev == NULL) ? 0 : 1);
+       return dsw_event_enqueue_burst(port, ev, 1);
 }

 static __rte_always_inline uint16_t
@@ -1340,7 +1340,7 @@ dsw_event_enqueue_burst_generic(struct dsw_port
*source_port,
        return (num_non_release + num_release);
 }

-uint16_t
+inline uint16_t
 dsw_event_enqueue_burst(void *port, const struct rte_event events[],
                        uint16_t events_len)
 {

# I am testing with command like this "/build/app/dpdk-test-eventdev
-l 0-23 -a 0002:0e:00.0 -- --test=perf_atq --plcores 1 --wlcores 8
--stlist p --nb_pkts=10000000000"

>
> The performance gain is counted toward both enqueue and dequeue costs
> (+benchmark app overhead), so an under-estimation if see this as an
> enqueue performance improvement.
>
> > If you agree, then we can skip this patch.
> >
>
> I have no strong opinion if this should be included or not.
>
> It was up to me, I would drop the single-enqueue special case handling
> altogether in the next ABI update.

That's a reasonable path. If we are willing to push a patch, we can
test it and give feedback.
Or in our spare time, We can do that as well.

>
> >
> >>
> >>> If so, check should be following. Right?
> >>>
> >>> if (__extension__((__builtin_constant_p(nb_events)) && nb_events == 1)
> >>> || nb_events  == 1)
> >>>
> >>> At least, It was my original intention in the code.
> >>>
> >>>
> >>>
> >>>>                   return (fp_ops->enqueue)(port, ev);
> >>>>           else
> >>>>                   return fn(port, ev, nb_events);
> >>>> @@ -2200,7 +2200,7 @@ rte_event_dequeue_burst(uint8_t dev_id, uint8_t port_id, struct rte_event ev[],
> >>>>            * Allow zero cost non burst mode routine invocation if application
> >>>>            * requests nb_events as const one
> >>>>            */
> >>>> -       if (nb_events == 1)
> >>>> +       if (__extension__(__builtin_constant_p(nb_events)) && nb_events == 1)
> >>>>                   return (fp_ops->dequeue)(port, ev, timeout_ticks);
> >>>>           else
> >>>>                   return (fp_ops->dequeue_burst)(port, ev, nb_events,
> >>>> --
> >>>> 2.34.1
> >>>>
> >>

  reply	other threads:[~2023-05-16 13:08 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-11  6:43 [PATCH] " Mattias Rönnblom
2023-05-11  7:09 ` Morten Brørup
2023-05-11  8:16   ` [PATCH v2] " Mattias Rönnblom
2023-05-11  8:24     ` [PATCH v3] " Mattias Rönnblom
2023-05-11  9:01       ` Morten Brørup
2023-05-12 11:59       ` Jerin Jacob
2023-05-12 13:15         ` Mattias Rönnblom
2023-05-12 13:56           ` Morten Brørup
2023-05-12 14:52             ` Mattias Rönnblom
2023-05-15 12:33             ` Jerin Jacob
2023-05-15 12:38           ` Jerin Jacob
2023-05-15 20:52             ` Mattias Rönnblom
2023-05-16 13:08               ` Jerin Jacob [this message]
2023-05-17  7:16                 ` Mattias Rönnblom
2023-05-17 12:28                   ` Jerin Jacob
2023-05-11  8:24     ` [PATCH v2] " Morten Brørup
2023-05-11  8:29   ` [PATCH] " Mattias Rönnblom
2023-05-11  9:06     ` Morten Brørup

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALBAE1Nzk_BmmVddCJH+=rp3OBcs5hv7uno+JNLzSUVm4cd=JQ@mail.gmail.com' \
    --to=jerinjacobk@gmail.com \
    --cc=dev@dpdk.org \
    --cc=hofors@lysator.liu.se \
    --cc=jerinj@marvell.com \
    --cc=mattias.ronnblom@ericsson.com \
    --cc=mb@smartsharesystems.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).