From: Bruce Richardson <bruce.richardson@intel.com>
To: Olivier Matz <olivier.matz@6wind.com>
Cc: Jerin Jacob <jerin.jacob@caviumnetworks.com>, dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting
Date: Wed, 1 Mar 2017 10:42:58 +0000 [thread overview]
Message-ID: <20170301104257.GB25032@bricha3-MOBL3.ger.corp.intel.com> (raw)
In-Reply-To: <20170301111753.1223a01e@platinum>
On Wed, Mar 01, 2017 at 11:17:53AM +0100, Olivier Matz wrote:
> Hi Bruce,
>
> On Wed, 1 Mar 2017 09:47:03 +0000, Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> > On Tue, Feb 28, 2017 at 11:24:25PM +0530, Jerin Jacob wrote:
> > > On Tue, Feb 28, 2017 at 01:52:26PM +0000, Bruce Richardson wrote:
> > > > On Tue, Feb 28, 2017 at 05:38:34PM +0530, Jerin Jacob wrote:
> > > > > On Tue, Feb 28, 2017 at 11:57:03AM +0000, Bruce Richardson
> > > > > wrote:
> > > > > > On Tue, Feb 28, 2017 at 05:05:13PM +0530, Jerin Jacob wrote:
> > > > > > > On Thu, Feb 23, 2017 at 05:23:54PM +0000, Bruce Richardson
> > > > > > > wrote:
> > > > > > > > Users compiling DPDK should not need to know or care
> > > > > > > > about the arrangement of cachelines in the rte_ring
> > > > > > > > structure. Therefore just remove the build option and set
> > > > > > > > the structures to be always split. For improved
> > > > > > > > performance use 128B rather than 64B alignment since it
> > > > > > > > stops the producer and consumer data being on adjacent
>
>
> You say you see an improved performance on Intel by having an extra
> blank cache-line between the producer and consumer data. Do you have an
> idea why it behaves like this? Do you think it is related to the
> hardware adjacent cache line prefetcher?
>
That is a likely candidate, but I haven't done a full analysis on the
details to know for sure what all the factors are. We see a similar
effect with the packet distributor library, which uses similar padding.
>
>
> > [...]
> > > # base code
> > > RTE>>ring_perf_autotest
> > > ### Testing single element and burst enq/deq ###
> > > SP/SC single enq/dequeue: 84
> > > MP/MC single enq/dequeue: 301
> > > SP/SC burst enq/dequeue (size: 8): 20
> > > MP/MC burst enq/dequeue (size: 8): 46
> > > SP/SC burst enq/dequeue (size: 32): 12
> > > MP/MC burst enq/dequeue (size: 32): 18
> > >
> > > ### Testing empty dequeue ###
> > > SC empty dequeue: 7.11
> > > MC empty dequeue: 12.15
> > >
> > > ### Testing using a single lcore ###
> > > SP/SC bulk enq/dequeue (size: 8): 19.08
> > > MP/MC bulk enq/dequeue (size: 8): 46.28
> > > SP/SC bulk enq/dequeue (size: 32): 11.89
> > > MP/MC bulk enq/dequeue (size: 32): 18.84
> > >
> > > ### Testing using two physical cores ###
> > > SP/SC bulk enq/dequeue (size: 8): 37.42
> > > MP/MC bulk enq/dequeue (size: 8): 73.32
> > > SP/SC bulk enq/dequeue (size: 32): 18.69
> > > MP/MC bulk enq/dequeue (size: 32): 24.59
> > > Test OK
> > >
> > > # with ring rework patch
> > > RTE>>ring_perf_autotest
> > > ### Testing single element and burst enq/deq ###
> > > SP/SC single enq/dequeue: 84
> > > MP/MC single enq/dequeue: 301
> > > SP/SC burst enq/dequeue (size: 8): 19
> > > MP/MC burst enq/dequeue (size: 8): 45
> > > SP/SC burst enq/dequeue (size: 32): 11
> > > MP/MC burst enq/dequeue (size: 32): 18
> > >
> > > ### Testing empty dequeue ###
> > > SC empty dequeue: 7.10
> > > MC empty dequeue: 12.15
> > >
> > > ### Testing using a single lcore ###
> > > SP/SC bulk enq/dequeue (size: 8): 18.59
> > > MP/MC bulk enq/dequeue (size: 8): 45.49
> > > SP/SC bulk enq/dequeue (size: 32): 11.67
> > > MP/MC bulk enq/dequeue (size: 32): 18.65
> > >
> > > ### Testing using two physical cores ###
> > > SP/SC bulk enq/dequeue (size: 8): 37.41
> > > MP/MC bulk enq/dequeue (size: 8): 72.98
> > > SP/SC bulk enq/dequeue (size: 32): 18.69
> > > MP/MC bulk enq/dequeue (size: 32): 24.59
> > > Test OK
> > > RTE>>
> > >
> > > # with ring rework patch + cache-line size change to one on 128BCL
> > > target
> > > RTE>>ring_perf_autotest
> > > ### Testing single element and burst enq/deq ###
> > > SP/SC single enq/dequeue: 90
> > > MP/MC single enq/dequeue: 317
> > > SP/SC burst enq/dequeue (size: 8): 20
> > > MP/MC burst enq/dequeue (size: 8): 48
> > > SP/SC burst enq/dequeue (size: 32): 11
> > > MP/MC burst enq/dequeue (size: 32): 18
> > >
> > > ### Testing empty dequeue ###
> > > SC empty dequeue: 8.10
> > > MC empty dequeue: 11.15
> > >
> > > ### Testing using a single lcore ###
> > > SP/SC bulk enq/dequeue (size: 8): 20.24
> > > MP/MC bulk enq/dequeue (size: 8): 48.43
> > > SP/SC bulk enq/dequeue (size: 32): 11.01
> > > MP/MC bulk enq/dequeue (size: 32): 18.43
> > >
> > > ### Testing using two physical cores ###
> > > SP/SC bulk enq/dequeue (size: 8): 25.92
> > > MP/MC bulk enq/dequeue (size: 8): 69.76
> > > SP/SC bulk enq/dequeue (size: 32): 14.27
> > > MP/MC bulk enq/dequeue (size: 32): 22.94
> > > Test OK
> > > RTE>>
> >
> > So given that there is not much difference here, is the MIN_SIZE i.e.
> > forced 64B, your preference, rather than actual cacheline-size?
> >
>
> I don't quite like this macro CACHE_LINE_MIN_SIZE. For me, it does not
> mean anything. The reasons for aligning on a cache line size are
> straightforward, but when should we need to align on the minimum
> cache line size supported by dpdk? For instance, in mbuf structure,
> aligning on 64 would make more sense to me.
>
> So, I would prefer using (RTE_CACHE_LINE_SIZE * 2) here. If we don't
> want it on some architectures, or if this optimization is only for Intel
> (or all archs that need this optim), I think we could have something
> like:
>
> /* bla bla */
> #ifdef INTEL
> #define __rte_ring_aligned __rte_aligned(RTE_CACHE_LINE_SIZE * 2)
> #else
> #define __rte_ring_aligned __rte_aligned(RTE_CACHE_LINE_SIZE)
> #endif
>
I would agree that CACHE_LINE_MIN_SIZE probably doesn't make any sense
here, but I'm happy to put in any suitable scheme that others are happy
with. The options are:
* Keep as-is:
adv: simplest option, disadv: wastes 128B * 2 on some platforms
* Change to MIN_SIZE:
adv: no ifdefs, disadv: doesn't make much sense logically here
* Use ifdefs:
adv: each platform gets what's best for it, disadv: untidy code, may
be harder to maintain
* Use hard-coded 128:
adv: short and simple, disadv: misses any logical reason why 128 is
used, i.e. magic number
I'm ok with any option above.
/Bruce
next prev parent reply other threads:[~2017-03-01 10:43 UTC|newest]
Thread overview: 116+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-02-23 17:23 [dpdk-dev] [PATCH v1 00/14] refactor and cleanup of rte_ring Bruce Richardson
2017-02-23 17:23 ` [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting Bruce Richardson
2017-02-28 11:35 ` Jerin Jacob
2017-02-28 11:57 ` Bruce Richardson
2017-02-28 12:08 ` Jerin Jacob
2017-02-28 13:52 ` Bruce Richardson
2017-02-28 17:54 ` Jerin Jacob
2017-03-01 9:47 ` Bruce Richardson
2017-03-01 10:17 ` Olivier Matz
2017-03-01 10:42 ` Bruce Richardson [this message]
2017-03-01 11:06 ` Olivier Matz
2017-03-01 11:19 ` Jerin Jacob
2017-03-01 12:12 ` Bruce Richardson
2017-02-23 17:23 ` [dpdk-dev] [PATCH v1 02/14] ring: create common structure for prod and cons metadata Bruce Richardson
2017-03-01 10:22 ` Olivier Matz
2017-03-01 10:33 ` Bruce Richardson
2017-02-23 17:23 ` [dpdk-dev] [PATCH v1 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
2017-02-23 17:23 ` [dpdk-dev] [PATCH v1 04/14] ring: remove debug setting Bruce Richardson
2017-02-23 17:23 ` [dpdk-dev] [PATCH v1 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
2017-02-23 17:23 ` [dpdk-dev] [PATCH v1 06/14] ring: remove watermark support Bruce Richardson
2017-03-01 10:34 ` Olivier Matz
2017-03-01 10:43 ` Bruce Richardson
2017-02-23 17:24 ` [dpdk-dev] [PATCH v1 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
2017-02-23 17:24 ` [dpdk-dev] [PATCH v1 08/14] ring: allow enqueue fns to return free space value Bruce Richardson
2017-02-23 17:24 ` [dpdk-dev] [PATCH v1 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
2017-02-23 17:24 ` [dpdk-dev] [PATCH v1 10/14] examples/quota_watermark: use ring space for watermarks Bruce Richardson
2017-02-23 17:24 ` [dpdk-dev] [PATCH v1 11/14] ring: reduce scope of local variables Bruce Richardson
2017-02-23 17:24 ` [dpdk-dev] [PATCH v1 12/14] ring: separate out head index manipulation for enq/deq Bruce Richardson
2017-03-08 10:49 ` Olivier MATZ
2017-03-08 12:06 ` Bruce Richardson
2017-03-14 8:56 ` Olivier Matz
2017-02-23 17:24 ` [dpdk-dev] [PATCH v1 13/14] ring: create common function for updating tail idx Bruce Richardson
2017-02-23 17:24 ` [dpdk-dev] [PATCH v1 14/14] ring: make ring struct and enq/deq macros type agnostic Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 00/14] refactor and cleanup of rte_ring Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 01/14] ring: remove split cacheline build setting Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 02/14] ring: create common structure for prod and cons metadata Bruce Richardson
2017-03-15 14:01 ` Thomas Monjalon
2017-03-22 16:38 ` Bruce Richardson
2017-03-24 14:55 ` Bruce Richardson
2017-03-24 16:41 ` Olivier Matz
2017-03-24 16:57 ` Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 04/14] ring: remove debug setting Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 06/14] ring: remove watermark support Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
2017-03-08 10:22 ` Olivier MATZ
2017-03-08 12:08 ` Bruce Richardson
2017-03-14 8:56 ` Olivier Matz
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 08/14] ring: allow enqueue fns to return free space value Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 10/14] examples/quota_watermark: use ring space for watermarks Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 11/14] ring: reduce scope of local variables Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 12/14] ring: separate out head index manipulation for enq/deq Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 13/14] ring: create common function for updating tail idx Bruce Richardson
2017-03-07 11:32 ` [dpdk-dev] [PATCH v2 14/14] ring: make ring struct and enq/deq macros type agnostic Bruce Richardson
2017-03-14 8:59 ` [dpdk-dev] [PATCH v2 00/14] refactor and cleanup of rte_ring Olivier Matz
2017-03-24 17:09 ` [dpdk-dev] [PATCH v3 " Bruce Richardson
2017-03-24 17:09 ` [dpdk-dev] [PATCH v3 01/14] ring: remove split cacheline build setting Bruce Richardson
2017-03-24 17:09 ` [dpdk-dev] [PATCH v3 02/14] ring: create common structure for prod and cons metadata Bruce Richardson
2017-03-27 7:20 ` Olivier Matz
2017-03-24 17:09 ` [dpdk-dev] [PATCH v3 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
2017-03-27 9:52 ` Thomas Monjalon
2017-03-27 10:13 ` Bruce Richardson
2017-03-27 10:15 ` Bruce Richardson
2017-03-27 13:13 ` Thomas Monjalon
2017-03-27 14:57 ` Bruce Richardson
2017-03-24 17:09 ` [dpdk-dev] [PATCH v3 04/14] ring: remove debug setting Bruce Richardson
2017-03-24 17:09 ` [dpdk-dev] [PATCH v3 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
2017-03-24 17:10 ` [dpdk-dev] [PATCH v3 06/14] ring: remove watermark support Bruce Richardson
2017-03-24 17:10 ` [dpdk-dev] [PATCH v3 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
2017-03-24 17:10 ` [dpdk-dev] [PATCH v3 08/14] ring: allow enqueue fns to return free space value Bruce Richardson
2017-03-28 7:12 ` Thomas Monjalon
2017-03-28 8:16 ` Bruce Richardson
2017-03-24 17:10 ` [dpdk-dev] [PATCH v3 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
2017-03-24 17:10 ` [dpdk-dev] [PATCH v3 10/14] examples/quota_watermark: use ring space for watermarks Bruce Richardson
2017-03-24 17:10 ` [dpdk-dev] [PATCH v3 11/14] ring: reduce scope of local variables Bruce Richardson
2017-03-24 17:10 ` [dpdk-dev] [PATCH v3 12/14] ring: separate out head index manipulation for enq/deq Bruce Richardson
2017-03-24 17:10 ` [dpdk-dev] [PATCH v3 13/14] ring: create common function for updating tail idx Bruce Richardson
2017-03-24 17:10 ` [dpdk-dev] [PATCH v3 14/14] ring: make ring struct and enq/deq macros type agnostic Bruce Richardson
2017-03-28 20:35 ` [dpdk-dev] [PATCH v4 00/14] refactor and cleanup of rte_ring Bruce Richardson
2017-03-28 20:35 ` [dpdk-dev] [PATCH v4 01/14] ring: remove split cacheline build setting Bruce Richardson
2017-03-28 20:35 ` [dpdk-dev] [PATCH v4 02/14] ring: create common structure for prod and cons metadata Bruce Richardson
2017-03-28 20:35 ` [dpdk-dev] [PATCH v4 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
2017-03-28 20:35 ` [dpdk-dev] [PATCH v4 04/14] ring: remove debug setting Bruce Richardson
2017-03-28 20:35 ` [dpdk-dev] [PATCH v4 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
2017-03-28 20:35 ` [dpdk-dev] [PATCH v4 06/14] ring: remove watermark support Bruce Richardson
2017-03-28 20:35 ` [dpdk-dev] [PATCH v4 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
2017-03-28 20:36 ` [dpdk-dev] [PATCH v4 08/14] ring: allow enqueue fns to return free space value Bruce Richardson
2017-03-28 20:36 ` [dpdk-dev] [PATCH v4 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
2017-03-28 20:36 ` [dpdk-dev] [PATCH v4 10/14] examples/quota_watermark: use ring space for watermarks Bruce Richardson
2017-03-28 20:36 ` [dpdk-dev] [PATCH v4 11/14] ring: reduce scope of local variables Bruce Richardson
2017-03-28 20:36 ` [dpdk-dev] [PATCH v4 12/14] ring: separate out head index manipulation for enq/deq Bruce Richardson
2017-03-28 20:36 ` [dpdk-dev] [PATCH v4 13/14] ring: create common function for updating tail idx Bruce Richardson
2017-03-28 20:36 ` [dpdk-dev] [PATCH v4 14/14] ring: make ring struct and enq/deq macros type agnostic Bruce Richardson
2017-03-29 2:47 ` [dpdk-dev] [PATCH v4 00/14] refactor and cleanup of rte_ring Yuanhan Liu
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 " Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 01/14] ring: remove split cacheline build setting Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 02/14] ring: create common structure for prod and cons metadata Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 04/14] ring: remove debug setting Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 06/14] ring: remove watermark support Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
2017-04-13 6:42 ` Wang, Zhihong
2017-04-13 8:33 ` Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 08/14] ring: allow enqueue fns to return free space value Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 10/14] examples/quota_watermark: use ring space for watermarks Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 11/14] ring: reduce scope of local variables Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 12/14] ring: separate out head index manipulation for enq/deq Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 13/14] ring: create common function for updating tail idx Bruce Richardson
2017-03-29 13:09 ` [dpdk-dev] [PATCH v5 14/14] ring: make ring struct and enq/deq macros type agnostic Bruce Richardson
2017-03-29 20:33 ` [dpdk-dev] [PATCH v5 00/14] refactor and cleanup of rte_ring Thomas Monjalon
2017-03-31 14:37 ` Ferruh Yigit
2017-04-03 17:55 ` Thomas Monjalon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170301104257.GB25032@bricha3-MOBL3.ger.corp.intel.com \
--to=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
--cc=jerin.jacob@caviumnetworks.com \
--cc=olivier.matz@6wind.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).