DPDK patches and discussions
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	"honnappa.nagarahalli@arm.com" <honnappa.nagarahalli@arm.com>,
	"jerinj@marvell.com" <jerinj@marvell.com>,
	"drc@linux.vnet.ibm.com" <drc@linux.vnet.ibm.com>
Subject: Re: [dpdk-dev] [RFC] ring: make ring implementation non-inlined
Date: Sat, 21 Mar 2020 01:03:35 +0000	[thread overview]
Message-ID: <SN6PR11MB25583FEF506336C09C934DBA9AF20@SN6PR11MB2558.namprd11.prod.outlook.com>
In-Reply-To: <20200320105435.47681954@hermes.lan>



> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Friday, March 20, 2020 5:55 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: dev@dpdk.org; olivier.matz@6wind.com; honnappa.nagarahalli@arm.com; jerinj@marvell.com; drc@linux.vnet.ibm.com
> Subject: Re: [RFC] ring: make ring implementation non-inlined
> 
> On Fri, 20 Mar 2020 16:41:38 +0000
> Konstantin Ananyev <konstantin.ananyev@intel.com> wrote:
> 
> > As was discussed here:
> > http://mails.dpdk.org/archives/dev/2020-February/158586.html
> > this RFC aims to hide the ring internals in a .c file and make
> > all ring functions non-inlined. In theory that might help to
> > maintain ABI stability in the future.
> > This is just a POC to measure the impact of the proposed idea;
> > a proper implementation would definitely need some extra effort.
> > On an IA box (SKX), ring_perf_autotest shows ~20-30 extra cycles
> > per enqueue+dequeue pair. On more realistic code, I suspect the
> > impact might be a bit higher.
> > For MP/MC bulk transfers the degradation seems quite small,
> > though for SP/SC and/or small transfers it is more than noticeable
> > (see exact numbers below).
> > From my perspective we should probably keep it inlined for now to
> > avoid any unanticipated performance degradations, though I am
> > interested to see perf results and opinions from other interested
> > parties.
> >
> > Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
> > ring_perf_autotest (without patch/with patch)
> >
> > ### Testing single element enq/deq ###
> > legacy APIs: SP/SC: single: 8.75/43.23
> > legacy APIs: MP/MC: single: 56.18/80.44
> >
> > ### Testing burst enq/deq ###
> > legacy APIs: SP/SC: burst (size: 8): 37.36/53.37
> > legacy APIs: SP/SC: burst (size: 32): 93.97/117.30
> > legacy APIs: MP/MC: burst (size: 8): 78.23/91.45
> > legacy APIs: MP/MC: burst (size: 32): 131.59/152.49
> >
> > ### Testing bulk enq/deq ###
> > legacy APIs: SP/SC: bulk (size: 8): 37.29/54.48
> > legacy APIs: SP/SC: bulk (size: 32): 92.68/113.01
> > legacy APIs: MP/MC: bulk (size: 8): 78.40/93.50
> > legacy APIs: MP/MC: bulk (size: 32): 131.49/154.25
> >
> > ### Testing empty bulk deq ###
> > legacy APIs: SP/SC: bulk (size: 8): 4.00/16.86
> > legacy APIs: MP/MC: bulk (size: 8): 7.01/15.55
> >
> > ### Testing using two hyperthreads ###
> > legacy APIs: SP/SC: bulk (size: 8): 10.64/17.56
> > legacy APIs: MP/MC: bulk (size: 8): 15.30/16.69
> > legacy APIs: SP/SC: bulk (size: 32): 5.84/7.09
> > legacy APIs: MP/MC: bulk (size: 32): 6.34/7.54
> >
> > ### Testing using two physical cores ###
> > legacy APIs: SP/SC: bulk (size: 8): 24.34/42.40
> > legacy APIs: MP/MC: bulk (size: 8): 70.34/71.82
> > legacy APIs: SP/SC: bulk (size: 32): 12.67/14.68
> > legacy APIs: MP/MC: bulk (size: 32): 22.41/17.93
> >
> > ### Testing single element enq/deq ###
> > elem APIs: element size 16B: SP/SC: single: 10.65/41.96
> > elem APIs: element size 16B: MP/MC: single: 44.33/81.36
> >
> > ### Testing burst enq/deq ###
> > elem APIs: element size 16B: SP/SC: burst (size: 8): 39.20/58.52
> > elem APIs: element size 16B: SP/SC: burst (size: 32): 123.19/142.79
> > elem APIs: element size 16B: MP/MC: burst (size: 8): 80.72/101.36
> > elem APIs: element size 16B: MP/MC: burst (size: 32): 169.21/185.38
> >
> > ### Testing bulk enq/deq ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 41.64/58.46
> > elem APIs: element size 16B: SP/SC: bulk (size: 32): 122.74/142.52
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 80.60/103.14
> > elem APIs: element size 16B: MP/MC: bulk (size: 32): 169.39/186.67
> >
> > ### Testing empty bulk deq ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 5.01/17.17
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 6.01/14.80
> >
> > ### Testing using two hyperthreads ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 12.02/17.18
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 16.81/21.14
> > elem APIs: element size 16B: SP/SC: bulk (size: 32): 7.87/9.01
> > elem APIs: element size 16B: MP/MC: bulk (size: 32): 8.22/10.57
> >
> > ### Testing using two physical cores ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 27.00/51.94
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 78.24/74.48
> > elem APIs: element size 16B: SP/SC: bulk (size: 32): 15.41/16.14
> > elem APIs: element size 16B: MP/MC: bulk (size: 32): 18.72/21.64
> >
> > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> 
> What is the impact with LTO? I suspect the compiler might have a
> chance to get the speed back with LTO.

Might be, but LTO is not enabled by default, so I don't see much
point in digging any further here.
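
To make the tradeoff concrete, below is a minimal sketch of the two
models. It uses simplified, hypothetical names (struct ring,
ring_sp_enqueue_*) rather than the actual rte_ring definitions, and
it omits the memory barriers and separate producer/consumer index
pairs the real code needs:

  /* Sketch only: illustrative names, not the real rte_ring API. */
  #include <errno.h>
  #include <stdint.h>

  struct ring {
          uint32_t size;          /* slot count, power of two */
          uint32_t head;          /* producer index, free-running */
          uint32_t tail;          /* consumer index, free-running */
          void *slots[];          /* ring storage */
  };

  /* Current model: a static inline in the public header. Every call
   * site expands the body, so there is no call overhead -- but the
   * layout of struct ring becomes part of the ABI. */
  static inline int
  ring_sp_enqueue_inline(struct ring *r, void *obj)
  {
          uint32_t head = r->head;

          if (head - r->tail >= r->size)
                  return -ENOBUFS;        /* ring full */
          r->slots[head & (r->size - 1)] = obj;
          r->head = head + 1;     /* real code publishes with a barrier */
          return 0;
  }

  /* Proposed model: the header keeps only a prototype (and could make
   * struct ring fully opaque); the body lives in a .c file, so the
   * layout can change without breaking the ABI. The cost is a
   * call/return and lost inlining per operation -- roughly the extra
   * ~20-30 cycles per enqueue+dequeue pair measured above. */
  int
  ring_sp_enqueue_outline(struct ring *r, void *obj)
  {
          return ring_sp_enqueue_inline(r, obj);
  }

For what it's worth, meson's standard b_lto option (off by default)
would be one way to check whether link-time optimization wins back
the lost inlining without any source changes.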

Thread overview: 12+ messages
2020-03-20 16:41 Konstantin Ananyev
2020-03-20 17:54 ` Stephen Hemminger
2020-03-21  1:03   ` Ananyev, Konstantin [this message]
2020-03-25 21:09 ` Jerin Jacob
2020-03-26  0:28   ` Ananyev, Konstantin
2020-03-26  8:04   ` Morten Brørup
2020-03-31 23:25     ` Thomas Monjalon
2020-06-30 23:15       ` Honnappa Nagarahalli
2020-07-01  7:27         ` Morten Brørup
2020-07-01 12:21           ` Ananyev, Konstantin
2020-07-01 14:11             ` Honnappa Nagarahalli
2020-07-01 14:31               ` David Marchand
