DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	<olivier.matz@6wind.com>, <Honnappa.Nagarahalli@arm.com>,
	<nd@arm.com>
Cc: <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] ring: empty and count optimizations
Date: Thu, 14 May 2020 20:00:21 +0200	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35C60FC8@smartserver.smartshare.dk> (raw)
In-Reply-To: <BYAPR11MB330101699614B8EA0EF36D8E9ABC0@BYAPR11MB3301.namprd11.prod.outlook.com>

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev,
> Konstantin
> Sent: Thursday, May 14, 2020 6:47 PM
> >
> > > > -static inline unsigned
> > > > +static inline unsigned int
> > > >  rte_ring_count(const struct rte_ring *r)
> > > >  {
> > > >  	uint32_t prod_tail = r->prod.tail;
> > > >  	uint32_t cons_tail = r->cons.tail;
> > > >  	uint32_t count = (prod_tail - cons_tail) & r->mask;
> > > > -	return (count > r->capacity) ? r->capacity : count;
> > > > +	return likely(count <= r->capacity) ? count : r->capacity;
> > >
> > > Honestly, I don't see there is any point of that change:
> > > I think it wouldn't change anything in terms of functionality
> > > or performance.
> >
> > Chapter 3.4.1 "Branch Prediction Optimization" in the Intel 64 and
> IA-32 Architectures Optimization Reference Manual recommends this
> > kind of optimization as Assembly/Compiler Coding Rule 3, which is why
> I rearranged the trigraph. Essentially, there is a limit to the number
> > of BTB (Branch Target Buffer) entries, so they should be conserved if
> possible.
> >
> > In addition to that, I have added the likely() because I consider it
> nearly impossible that the count will exceed the capacity.
> >
> > However, it's not the first time I see this kind of response to a
> suggested branch optimization on the DPDK mailing list. Everyone seem
> to
> > think that branch prediction is infinite and always works. It may
> seem as if infinite on trivial applications, but BTB entries may be a
> scarce
> > resource on complex applications. I assume Intel's recommendations
> are not just for the fun of it.
> 
> I think it is better to leave such level of micro-optimizations to the
> compiler.
> BTW, in that particular case, compiler most likely will generate a code
> without any branches at all (at least for IA).
> Let say on my box with gcc 7.3:
> 
> $ cat trc1.c
> #include <stdint.h>
> #include <rte_config.h>
> #include <rte_ring.h>
> 
> uint32_t
> fffx1(const struct rte_ring *r)
> {
>         uint32_t prod_tail = r->prod.tail;
>         uint32_t cons_tail = r->cons.tail;
>         uint32_t count = (prod_tail - cons_tail) & r->mask;
>         return (count > r->capacity) ? r->capacity : count;
> }
> 
> uint32_t
> fffx2(const struct rte_ring *r)
> {
>         uint32_t prod_tail = r->prod.tail;
>         uint32_t cons_tail = r->cons.tail;
>         uint32_t count = (prod_tail - cons_tail) & r->mask;
>         return likely(count <= r->capacity) ? count : r->capacity;
> }
> 
> $ gcc -m64 -O3 -march=native -I${RTE_SDK}/x86_64-native-linuxapp-
> gcc/include -c trc1.c
> 
> $ objdump -d trc1.o
> 
> 0000000000000000 <fffx1>:
>    0:   8b 87 84 00 00 00       mov    0x84(%rdi),%eax
>    6:   8b 97 04 01 00 00       mov    0x104(%rdi),%edx
>    c:   29 d0                   sub    %edx,%eax
>    e:   8b 57 38                mov    0x38(%rdi),%edx
>   11:   23 47 34                and    0x34(%rdi),%eax
>   14:   39 d0                   cmp    %edx,%eax
>   16:   0f 47 c2                cmova  %edx,%eax
>   19:   c3                      retq
>   1a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
> 
> 0000000000000020 <fffx2>:
>   20:   8b 87 84 00 00 00       mov    0x84(%rdi),%eax
>   26:   8b 97 04 01 00 00       mov    0x104(%rdi),%edx
>   2c:   29 d0                   sub    %edx,%eax
>   2e:   8b 57 38                mov    0x38(%rdi),%edx
>   31:   23 47 34                and    0x34(%rdi),%eax
>   34:   39 d0                   cmp    %edx,%eax
>   36:   0f 47 c2                cmova  %edx,%eax
>   39:   c3                      retq
> 
> As you can see, there is no difference.
> 

Thank you for the detailed feedback.

Reality trumps theory, so I will leave the count function as is. :-)


> >
> > Konstantin, please note that I'm letting out my frustration about the
> general misconception about branch prediction here. You are doing a
> > great job, so I feel bad about responding like this to you.
> 
> No worries, in fact I am glad to know that DPDK contributors
> read IA optimization manual that thoughtfully 😊
> 
> Konstantin

  reply	other threads:[~2020-05-14 18:00 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-13 15:31 Morten Brørup
2020-05-13 15:35 ` Morten Brørup
2020-05-13 17:08 ` Morten Brørup
2020-05-14 12:23   ` Ananyev, Konstantin
2020-05-14 13:45     ` Morten Brørup
2020-05-14 16:46       ` Ananyev, Konstantin
2020-05-14 18:00         ` Morten Brørup [this message]
2020-05-19 15:27 ` [dpdk-dev] [PATCH 0/2] ring: empty optimization Morten Brørup
2020-05-19 15:27   ` [dpdk-dev] [PATCH 1/2] ring: coding style cleanup Morten Brørup
2020-05-22 12:34     ` Ananyev, Konstantin
2020-05-19 15:27   ` [dpdk-dev] [PATCH 2/2] ring: empty optimization Morten Brørup
2020-05-19 15:52     ` Stephen Hemminger
2020-05-19 16:02       ` Morten Brørup
2020-07-01  9:19         ` David Marchand
2020-05-22 12:32     ` Ananyev, Konstantin
2020-07-01  9:20   ` [dpdk-dev] [PATCH 0/2] " David Marchand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35C60FC8@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    --cc=nd@arm.com \
    --cc=olivier.matz@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).