From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9A28242AE6; Fri, 12 May 2023 16:52:56 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 60E4A42FB0; Fri, 12 May 2023 16:52:56 +0200 (CEST) Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3]) by mails.dpdk.org (Postfix) with ESMTP id B164A42D3E for ; Fri, 12 May 2023 16:52:55 +0200 (CEST) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 4416716498 for ; Fri, 12 May 2023 16:52:55 +0200 (CEST) Received: by mail.lysator.liu.se (Postfix, from userid 1004) id 42B9B16425; Fri, 12 May 2023 16:52:55 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on hermod.lysator.liu.se X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=ALL_TRUSTED, AWL, NICE_REPLY_A, T_SCC_BODY_TEXT_LINE autolearn=disabled version=3.4.6 X-Spam-Score: -2.4 Received: from [192.168.1.59] (h-62-63-215-114.A163.priv.bahnhof.se [62.63.215.114]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id 647EB16424; Fri, 12 May 2023 16:52:54 +0200 (CEST) Message-ID: Date: Fri, 12 May 2023 16:52:53 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.10.0 Subject: Re: [PATCH v3] eventdev: avoid non-burst shortcut for variable-size bursts To: =?UTF-8?Q?Morten_Br=c3=b8rup?= , =?UTF-8?Q?Mattias_R=c3=b6nnblom?= , Jerin Jacob Cc: jerinj@marvell.com, dev@dpdk.org References: <20230511081641.6693-1-mattias.ronnblom@ericsson.com> <20230511082415.6720-1-mattias.ronnblom@ericsson.com> <7c0013e1-d11b-2ad0-04b9-73be426d4719@ericsson.com> <98CBD80474FA8B44BF855DF32C47DC35D8790C@smartserver.smartshare.dk> Content-Language: en-US From: =?UTF-8?Q?Mattias_R=c3=b6nnblom?= In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D8790C@smartserver.smartshare.dk> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Virus-Scanned: ClamAV using ClamSMTP X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 2023-05-12 15:56, Morten Brørup wrote: >> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com] >> Sent: Friday, 12 May 2023 15.15 >> >> On 2023-05-12 13:59, Jerin Jacob wrote: >>> On Thu, May 11, 2023 at 2:00 PM Mattias Rönnblom >>> wrote: >>>> >>>> Use non-burst event enqueue and dequeue calls from burst enqueue and >>>> dequeue only when the burst size is compile-time constant (and equal >>>> to one). >>>> >>>> Signed-off-by: Mattias Rönnblom >>>> >>>> --- >>>> >>>> v3: Actually include the change v2 claimed to contain. >>>> v2: Wrap builtin call in __extension__, to avoid compiler warnings if >>>> application is compiled with -pedantic. (Morten Brørup) >>>> --- >>>> lib/eventdev/rte_eventdev.h | 4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h >>>> index a90e23ac8b..a471caeb6d 100644 >>>> --- a/lib/eventdev/rte_eventdev.h >>>> +++ b/lib/eventdev/rte_eventdev.h >>>> @@ -1944,7 +1944,7 @@ __rte_event_enqueue_burst(uint8_t dev_id, uint8_t >> port_id, >>>> * Allow zero cost non burst mode routine invocation if >> application >>>> * requests nb_events as const one >>>> */ >>>> - if (nb_events == 1) >>>> + if (__extension__(__builtin_constant_p(nb_events)) && nb_events == >> 1) >>> >>> "Why" part is not clear from the commit message. Is this to avoid >>> nb_events read if it is built-in const. >> >> The __builtin_constant_p() is introduced to avoid having the compiler >> generate a conditional branch and two different code paths in case >> nb_elem is a run-time variable. >> >> In particular, this matters if nb_elems is run-time variable and varies >> between 1 and some larger value. >> >> I should have mention this in the commit message. >> >> A very slight performance improvement. It also makes the code better >> match the comment, imo. Zero cost for const one enqueues, but no impact >> non-compile-time-constant-length enqueues. >> >> Feel free to ignore. >> >>> If so, check should be following. Right? >>> >>> if (__extension__((__builtin_constant_p(nb_events)) && nb_events == 1) >>> || nb_events == 1) > > @Mattias: You missed the second part of this comparison, also catching nb_events == 1 with non-constant nb_events. > I didn't comment on that code snippet since it was based on a misconception of the intention of my patch. > @Jerin: Such a change has no effect, compared to the original code. > >>> >>> At least, It was my original intention in the code. > > @Jerin: Mattias implemented exactly what the comment says. > > Perhaps only the comment should be updated, not the code. > > Is nb_events likely to be non-constant 1, and are there benefits to calling either of the non-burst functions in those cases, vs. the branch cost of this comparison (which Mattias' patch gets rid of)? > I think the main worry would be the cost of branch mispredictions in case of alternating enqueue sizes (between 1 and some other size). If there is a performance upside to calling single-event enqueue in a scenario where all enqueues are *run-time variable* and 1 (which I find unlikely, but well inside the realms of the possibility), the next question would be: OK, but how about for two events? Three? Four. Etc. >>> >>> >>> >>>> return (fp_ops->enqueue)(port, ev); >>>> else >>>> return fn(port, ev, nb_events); >>>> @@ -2200,7 +2200,7 @@ rte_event_dequeue_burst(uint8_t dev_id, uint8_t >> port_id, struct rte_event ev[], >>>> * Allow zero cost non burst mode routine invocation if >> application >>>> * requests nb_events as const one >>>> */ >>>> - if (nb_events == 1) >>>> + if (__extension__(__builtin_constant_p(nb_events)) && nb_events == >> 1) >>>> return (fp_ops->dequeue)(port, ev, timeout_ticks); >>>> else >>>> return (fp_ops->dequeue_burst)(port, ev, nb_events, >>>> -- >>>> 2.34.1 >>>> >