From: Olivier Matz <olivier.matz@6wind.com>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
dev@dpdk.org, Bruce Richardson <bruce.richardson@intel.com>
Subject: Re: [PATCH v6 3/4] mempool: fix cache flushing algorithm
Date: Fri, 14 Oct 2022 21:50:37 +0200 [thread overview]
Message-ID: <Y0m9jYSVig8eugTn@platinum> (raw)
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D873E3@smartserver.smartshare.dk>
On Fri, Oct 14, 2022 at 05:57:39PM +0200, Morten Brørup wrote:
> > From: Olivier Matz [mailto:olivier.matz@6wind.com]
> > Sent: Friday, 14 October 2022 16.01
> >
> > Hi Morten, Andrew,
> >
> > On Sun, Oct 09, 2022 at 05:08:39PM +0200, Morten Brørup wrote:
> > > > From: Andrew Rybchenko [mailto:andrew.rybchenko@oktetlabs.ru]
> > > > Sent: Sunday, 9 October 2022 16.52
> > > >
> > > > On 10/9/22 17:31, Morten Brørup wrote:
> > > > >> From: Andrew Rybchenko [mailto:andrew.rybchenko@oktetlabs.ru]
> > > > >> Sent: Sunday, 9 October 2022 15.38
> > > > >>
> > > > >> From: Morten Brørup <mb@smartsharesystems.com>
> > > > >>
> > >
> > > [...]
> >
> > I finally took a couple of hours to carefully review the mempool-related
> > series (including the ones that have already been pushed).
> >
> > The new behavior looks better to me in all situations I can think of.
>
> Extreme care is required when touching a core library like the mempool.
>
> Thank you, Olivier.
>
> >
> > >
> > > > >> --- a/lib/mempool/rte_mempool.h
> > > > >> +++ b/lib/mempool/rte_mempool.h
> > > > >> @@ -90,7 +90,7 @@ struct rte_mempool_cache {
> > > > >> * Cache is allocated to this size to allow it to overflow in certain
> > > > >> * cases to avoid needless emptying of cache.
> > > > >> */
> > > > >> - void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
> > > > >> + void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 2]; /**< Cache objects */
> > > > >> } __rte_cache_aligned;
> > > > >
> > > > > How much are we allowed to break the ABI here?
> > > > >
> > > > > This patch reduces the size of the structure by removing a now unused
> > > > > part at the end, which should be harmless.
> >
> > It is an ABI breakage: an existing application will use the new 22.11
> > function to create the mempool (with a smaller cache), but the old
> > inlined get/put code, which can exceed MAX_SIZE x 2, will remain in use.
> >
> > But this is a nice memory consumption improvement; in my opinion we
> > should accept it for 22.11 with an entry in the release notes.
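> >
> > For context, a simplified, from-memory sketch of the old inlined put path
> > (a hypothetical illustration, not a verbatim copy of the DPDK code),
> > showing why the cache array could be written past MAX_SIZE x 2 before a
> > flush happened:
> >
> >   #include <string.h>
> >   #include <rte_mempool.h>
> >
> >   /* Illustrative only: names and details are simplified. */
> >   static inline void
> >   old_put_sketch(struct rte_mempool *mp, void * const *obj_table,
> >                  unsigned int n, struct rte_mempool_cache *cache)
> >   {
> >           void **cache_objs = &cache->objs[cache->len];
> >
> >           /* Objects are appended before the threshold check, so len can
> >            * temporarily reach flushthresh - 1 + n, which is why the
> >            * array used to be sized RTE_MEMPOOL_CACHE_MAX_SIZE * 3. */
> >           memcpy(cache_objs, obj_table, sizeof(void *) * n);
> >           cache->len += n;
> >
> >           if (cache->len >= cache->flushthresh) {
> >                   /* Flush everything above 'size' back to the backend. */
> >                   rte_mempool_ops_enqueue_bulk(mp,
> >                                   &cache->objs[cache->size],
> >                                   cache->len - cache->size);
> >                   cache->len = cache->size;
> >           }
> >   }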
> >
> >
> > > > >
> > > > > If we may also move the position of the objs array, I would add
> > > > > __rte_cache_aligned to the objs array. It makes no difference in the
> > > > > general case, but if get/put operations are always 32 objects, it will
> > > > > reduce the number of memory (or last level cache) accesses from five to
> > > > > four 64 B cache lines for every get/put operation.
> >
> > Will it really be the case? Since cache->len has to be accessed too,
> > I don't think it would make a difference.
>
> Yes, the first cache line, containing cache->len, will always be accessed. I forgot to count that, so the improvement from aligning cache->objs will be five cache line accesses instead of six.
>
> Let me try to explain the scenario in other words:
>
> In an application where a mempool cache is only accessed in bursts of 32 objects (256 B), it matters whether those 256 B accesses start at a cache line aligned address or not. If they are aligned, they only touch 4 cache lines; if not, they touch 5. (For architectures with 128 B cache lines, it is 2 instead of 3 cache lines touched per mempool cache get/put operation in applications using only bursts of 32 objects.)
>
> If we cache line align cache->objs, those bursts of 32 objects (256 B) will be cache line aligned: any address cache->objs[N * 32] is cache line aligned if cache->objs[0] is cache line aligned.
>
> Currently, cache->objs directly follows cache->len, which makes cache->objs[0] cache line unaligned.
>
> If we decide to break the mempool cache ABI, we might as well include my suggested cache line alignment performance improvement. It doesn't degrade performance for mempool caches that are not exclusively accessed in bursts of 32 objects.
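>
> For illustration, a minimal stand-alone sketch of the layout I have in mind
> (a hypothetical struct whose field names only mirror the real one, not the
> actual DPDK definition), with compile-time checks that objs[0], and hence
> every 32-object burst, starts on a cache line boundary:
>
>   #include <assert.h>
>   #include <stdalign.h>
>   #include <stddef.h>
>
>   #define CACHE_LINE 64
>   #define MAX_SIZE 512 /* stand-in for RTE_MEMPOOL_CACHE_MAX_SIZE */
>
>   struct cache_sketch {
>           unsigned int size;
>           unsigned int flushthresh;
>           unsigned int len;
>           /* Aligning objs pushes it to the next cache line boundary. */
>           alignas(CACHE_LINE) void *objs[MAX_SIZE * 2];
>   };
>
>   /* objs[0] is cache line aligned... */
>   static_assert(offsetof(struct cache_sketch, objs) % CACHE_LINE == 0,
>                 "objs not cache line aligned");
>   /* ...and a burst of 32 pointers (256 B with 8 B pointers) ends on a
>    * line boundary, so objs[N * 32] is aligned whenever objs[0] is. */
>   static_assert((32 * sizeof(void *)) % CACHE_LINE == 0,
>                 "32-object burst does not end on a line boundary");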
I don't follow you. Currently, with 16 objects (128 B), we access 3
cache lines:
┌────────┐
│len │
cache │********│---
line0 │********│ ^
│********│ |
├────────┤ | 16 objects
│********│ | 128B
cache │********│ |
line1 │********│ |
│********│ |
├────────┤ |
│********│_v_
cache │ │
line2 │ │
│ │
└────────┘
With the alignment, it is also 3 cache lines:
┌────────┐
│len │
cache │ │
line0 │ │
│ │
├────────┤---
│********│ ^
cache │********│ |
line1 │********│ |
│********│ |
├────────┤ | 16 objects
│********│ | 128B
cache │********│ |
line2 │********│ |
│********│ v
└────────┘---
Am I missing something?
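
To put the same counting in code, here is a tiny stand-alone example
(a hypothetical helper; the offsets are illustrative, not the real struct
layout):

  #include <stdio.h>
  #include <stddef.h>

  #define LINE 64

  /* Distinct 64 B lines touched from the line holding 'len' (at len_off)
   * through the end of a burst of objects starting at objs_off. */
  static unsigned int
  lines_touched(size_t len_off, size_t objs_off, size_t burst_bytes)
  {
          size_t first = len_off / LINE;
          size_t last = (objs_off + burst_bytes - 1) / LINE;

          return (unsigned int)(last - first) + 1;
  }

  int main(void)
  {
          /* 16 objects of 8 B = 128 B, len near the start of line 0. */
          printf("objs right after len: %u lines\n", lines_touched(8, 16, 128));
          printf("objs aligned to 64 B: %u lines\n", lines_touched(8, 64, 128));
          return 0; /* both print 3, matching the two diagrams above */
  }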
>
> >
> >
> > > > >
> > > > > uint32_t len; /**< Current cache count */
> > > > > - /*
> > > > > - * Cache is allocated to this size to allow it to overflow in certain
> > > > > - * cases to avoid needless emptying of cache.
> > > > > - */
> > > > > - void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
> > > > > + /**
> > > > > + * Cache objects
> > > > > + *
> > > > > + * Cache is allocated to this size to allow it to overflow in certain
> > > > > + * cases to avoid needless emptying of cache.
> > > > > + */
> > > > > + void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 2] __rte_cache_aligned;
> > > > > } __rte_cache_aligned;
> > > >
> > > > I think aligning objs on cacheline should be a separate patch.
> > >
> > > Good point. I'll let you do it. :-)
> > >
> > > PS: Thank you for following up on this patch series, Andrew!
> >
> > Many thanks for this rework.
> >
> > Acked-by: Olivier Matz <olivier.matz@6wind.com>
>
> Perhaps Reviewed-by would be appropriate?
I was thinking that "Acked-by" was commonly used by maintainers, and
"Reviewed-by" for reviews by community members. After reading the
documentation again, it's not that clear now in my mind :)
Thanks,
Olivier