From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
"honnappa.nagarahalli@arm.com" <honnappa.nagarahalli@arm.com>,
"jerinj@marvell.com" <jerinj@marvell.com>,
"drc@linux.vnet.ibm.com" <drc@linux.vnet.ibm.com>
Subject: Re: [dpdk-dev] [RFC] ring: make ring implementation non-inlined
Date: Sat, 21 Mar 2020 01:03:35 +0000
Message-ID: <SN6PR11MB25583FEF506336C09C934DBA9AF20@SN6PR11MB2558.namprd11.prod.outlook.com>
In-Reply-To: <20200320105435.47681954@hermes.lan>
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Friday, March 20, 2020 5:55 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: dev@dpdk.org; olivier.matz@6wind.com; honnappa.nagarahalli@arm.com; jerinj@marvell.com; drc@linux.vnet.ibm.com
> Subject: Re: [RFC] ring: make ring implementation non-inlined
>
> On Fri, 20 Mar 2020 16:41:38 +0000
> Konstantin Ananyev <konstantin.ananyev@intel.com> wrote:
>
> > As was discussed here:
> > http://mails.dpdk.org/archives/dev/2020-February/158586.html
> > this RFC aims to hide the ring internals in a .c file and make all
> > ring functions non-inlined. In theory that might help to
> > maintain ABI stability in the future.
> > This is just a POC to measure the impact of the proposed idea;
> > a proper implementation would definitely need some extra effort.
> > On an IA box (SKX) ring_perf_autotest shows ~20-30 cycles extra for
> > an enqueue+dequeue pair. On some more realistic code, I suspect
> > the impact might be a bit higher.
> > For MP/MC bulk transfers the degradation seems quite small,
> > though for SP/SC and/or small transfers it is more than noticeable
> > (see exact numbers below).
> > From my perspective we should probably keep it inlined for now
> > to avoid any unanticipated performance degradation.
> > Though I am interested to see perf results and opinions from
> > other interested parties.
> >
> > Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
> > ring_perf_autotest (without patch/with patch)
> >
> > ### Testing single element enq/deq ###
> > legacy APIs: SP/SC: single: 8.75/43.23
> > legacy APIs: MP/MC: single: 56.18/80.44
> >
> > ### Testing burst enq/deq ###
> > legacy APIs: SP/SC: burst (size: 8): 37.36/53.37
> > legacy APIs: SP/SC: burst (size: 32): 93.97/117.30
> > legacy APIs: MP/MC: burst (size: 8): 78.23/91.45
> > legacy APIs: MP/MC: burst (size: 32): 131.59/152.49
> >
> > ### Testing bulk enq/deq ###
> > legacy APIs: SP/SC: bulk (size: 8): 37.29/54.48
> > legacy APIs: SP/SC: bulk (size: 32): 92.68/113.01
> > legacy APIs: MP/MC: bulk (size: 8): 78.40/93.50
> > legacy APIs: MP/MC: bulk (size: 32): 131.49/154.25
> >
> > ### Testing empty bulk deq ###
> > legacy APIs: SP/SC: bulk (size: 8): 4.00/16.86
> > legacy APIs: MP/MC: bulk (size: 8): 7.01/15.55
> >
> > ### Testing using two hyperthreads ###
> > legacy APIs: SP/SC: bulk (size: 8): 10.64/17.56
> > legacy APIs: MP/MC: bulk (size: 8): 15.30/16.69
> > legacy APIs: SP/SC: bulk (size: 32): 5.84/7.09
> > legacy APIs: MP/MC: bulk (size: 32): 6.34/7.54
> >
> > ### Testing using two physical cores ###
> > legacy APIs: SP/SC: bulk (size: 8): 24.34/42.40
> > legacy APIs: MP/MC: bulk (size: 8): 70.34/71.82
> > legacy APIs: SP/SC: bulk (size: 32): 12.67/14.68
> > legacy APIs: MP/MC: bulk (size: 32): 22.41/17.93
> >
> > ### Testing single element enq/deq ###
> > elem APIs: element size 16B: SP/SC: single: 10.65/41.96
> > elem APIs: element size 16B: MP/MC: single: 44.33/81.36
> >
> > ### Testing burst enq/deq ###
> > elem APIs: element size 16B: SP/SC: burst (size: 8): 39.20/58.52
> > elem APIs: element size 16B: SP/SC: burst (size: 32): 123.19/142.79
> > elem APIs: element size 16B: MP/MC: burst (size: 8): 80.72/101.36
> > elem APIs: element size 16B: MP/MC: burst (size: 32): 169.21/185.38
> >
> > ### Testing bulk enq/deq ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 41.64/58.46
> > elem APIs: element size 16B: SP/SC: bulk (size: 32): 122.74/142.52
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 80.60/103.14
> > elem APIs: element size 16B: MP/MC: bulk (size: 32): 169.39/186.67
> >
> > ### Testing empty bulk deq ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 5.01/17.17
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 6.01/14.80
> >
> > ### Testing using two hyperthreads ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 12.02/17.18
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 16.81/21.14
> > elem APIs: element size 16B: SP/SC: bulk (size: 32): 7.87/9.01
> > elem APIs: element size 16B: MP/MC: bulk (size: 32): 8.22/10.57
> >
> > ### Testing using two physical cores ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 27.00/51.94
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 78.24/74.48
> > elem APIs: element size 16B: SP/SC: bulk (size: 32): 15.41/16.14
> > elem APIs: element size 16B: MP/MC: bulk (size: 32): 18.72/21.64
> >
> > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
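To make the per-row overhead easier to read, here is a minimal sketch that
subtracts the "without patch" figure from the "with patch" figure for a few
rows copied from the table above (legacy APIs, single core). The struct and
function names (perf_row, extra_cycles, extra_percent, print_overhead) are
made up for this illustration and are not part of ring_perf_autotest.

```c
#include <stdio.h>

/* One row of the quoted ring_perf_autotest output:
 * cycles without the patch (inlined) and with it (non-inlined). */
struct perf_row {
	const char *name;
	double inlined;
	double noninlined;
};

/* Values copied verbatim from the quoted results (legacy APIs). */
static const struct perf_row rows[] = {
	{ "SP/SC single",            8.75,  43.23 },
	{ "MP/MC single",           56.18,  80.44 },
	{ "SP/SC bulk (size: 8)",   37.29,  54.48 },
	{ "SP/SC bulk (size: 32)",  92.68, 113.01 },
	{ "MP/MC bulk (size: 8)",   78.40,  93.50 },
	{ "MP/MC bulk (size: 32)", 131.49, 154.25 },
};

#define NB_ROWS (sizeof(rows) / sizeof(rows[0]))

/* Extra cycles per enqueue+dequeue pair introduced by the patch. */
static double
extra_cycles(const struct perf_row *r)
{
	return r->noninlined - r->inlined;
}

/* The same overhead relative to the inlined baseline, in percent. */
static double
extra_percent(const struct perf_row *r)
{
	return 100.0 * extra_cycles(r) / r->inlined;
}

/* Print one summary line per row. */
static void
print_overhead(void)
{
	size_t i;

	for (i = 0; i < NB_ROWS; i++)
		printf("%-24s +%6.2f cycles (+%6.1f%%)\n", rows[i].name,
			extra_cycles(&rows[i]), extra_percent(&rows[i]));
}
```

Calling print_overhead() from a main() prints one line per row. For the
SP/SC single row the difference is 43.23 - 8.75 = 34.48 cycles, roughly four
times the inlined cost, while the MP/MC bulk (size: 32) row adds 22.76
cycles, about 17% of its baseline.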
>
> What is the impact with LTO? I suspect the compiler might have a chance to
> get the speed back with LTO.
Might be, but LTO is not enabled by default.
So I don't see much point in digging any further here.
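For readers who want to see the trade-off in miniature, below is a rough
single-file sketch, not the rte_ring code. All names (toy_ring,
toy_enqueue_inline, toy_enqueue_call, toy_dequeue_inline) are hypothetical.
The noinline attribute (GCC/Clang specific) only stands in for a function
defined in a separate .c file: without LTO the caller has to make a real
call there, while a build with -flto gives the compiler a chance to inline
across that boundary again. The sketch is single-threaded and has no memory
barriers, so it says nothing about MP/MC behaviour.

```c
#include <stdint.h>

#define TOY_RING_SIZE 8u                  /* must be a power of two */
#define TOY_RING_MASK (TOY_RING_SIZE - 1u)

/* Hypothetical single-producer/single-consumer ring, not rte_ring. */
struct toy_ring {
	uint32_t head;                    /* next slot to write */
	uint32_t tail;                    /* next slot to read */
	void *slots[TOY_RING_SIZE];
};

/* Header-style variant: the body is visible to every caller, so the
 * struct layout effectively becomes part of the ABI. */
static inline int
toy_enqueue_inline(struct toy_ring *r, void *obj)
{
	if (r->head - r->tail == TOY_RING_SIZE)
		return -1;                /* ring is full */
	r->slots[r->head & TOY_RING_MASK] = obj;
	r->head++;
	return 0;
}

/* Library-style variant: noinline emulates a definition hidden in a .c
 * file, so callers pay for a function call but never see the layout. */
__attribute__((noinline)) int
toy_enqueue_call(struct toy_ring *r, void *obj)
{
	return toy_enqueue_inline(r, obj);
}

/* Dequeue one object; returns -1 when the ring is empty. */
static inline int
toy_dequeue_inline(struct toy_ring *r, void **obj)
{
	if (r->head == r->tail)
		return -1;                /* ring is empty */
	*obj = r->slots[r->tail & TOY_RING_MASK];
	r->tail++;
	return 0;
}
```

Both enqueue variants behave identically; only the call overhead differs,
which is the kind of cost the SP/SC single-element numbers above expose.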