From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
"Dharmik Thakkar" <Dharmik.Thakkar@arm.com>,
"Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "Olivier Matz" <olivier.matz@6wind.com>,
"Andrew Rybchenko" <andrew.rybchenko@oktetlabs.ru>,
<dev@dpdk.org>, "nd" <nd@arm.com>,
"Ruifeng Wang" <Ruifeng.Wang@arm.com>, "nd" <nd@arm.com>,
"nd" <nd@arm.com>
Subject: Re: [dpdk-dev] [RFC] mempool: implement index-based per core cache
Date: Mon, 8 Nov 2021 17:03:55 +0100 [thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D86CB4@smartserver.smartshare.dk> (raw)
In-Reply-To: <DBAPR08MB5814A4A1EA7EF38940E78DCB98919@DBAPR08MB5814.eurprd08.prod.outlook.com>
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Honnappa
> Nagarahalli
> Sent: Monday, 8 November 2021 16.46
>
> <snip>
> > >
> > > > > > > >>>>>>>>> Current mempool per core cache implementation is
> > > > > > > >>>>>>>>> based on pointers.
> > > > > > > >>>>>>>>> For most architectures, each pointer consumes 64b.
> > > > > > > >>>>>>>>> Replace it with an index-based implementation,
> > > > > > > >>>>>>>>> wherein each buffer is addressed by
> > > > > > > >>>>>>>>> (pool address + index).
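A minimal sketch of the idea, for readers joining the thread (the names below are illustrative, not the RFC code): the per core cache would hold 32 bit indices instead of 64 bit pointers, and an object address is recovered from the pool base address plus the scaled index.

#include <stdint.h>
#include <stddef.h>

/* Illustrative per core cache holding 32 bit indices; at 512 entries this
 * is 2 KB of indices instead of 4 KB of pointers. */
struct index_cache {
        uint32_t len;
        uint32_t objs[512];
};

/* Recover the object address from the pool base and a scaled index. */
static inline void *
cache_obj_addr(void *pool_base, uint32_t idx, size_t mult)
{
        return (char *)pool_base + (size_t)idx * mult;
}

The choice of the multiplier 'mult' is what the rest of this thread is about.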
> > > > > > > >>>>
> > > > > > > >>>> I like Dharmik's suggestion very much. CPU cache is a
> > > > > > > >>>> critical and limited resource.
> > > > > > > >>>>
> > > > > > > >>>> DPDK has a tendency of using pointers where indexes could
> > > > > > > >>>> be used instead. I suppose pointers provide the additional
> > > > > > > >>>> flexibility of mixing entries from different memory pools,
> > > > > > > >>>> e.g. multiple mbuf pools.
> > > > > > > >>>>
> > > > > > > >>
> > > > > > > >> Agreed, thank you!
> > > > > > > >>
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> I don't think it is going to work:
> > > > > > > >>>>>>>> On 64-bit systems the difference between the pool
> > > > > > > >>>>>>>> address and its elem address could be bigger than 4GB.
> > > > > > > >>>>>>> Are you talking about a case where the memory pool
> > > > > > > >>>>>>> size is more than 4GB?
> > > > > > > >>>>>>
> > > > > > > >>>>>> That is one possible scenario.
> > > > > > > >>>>
> > > > > > > >>>> That could be solved by making the index an element
> > > > > > > >>>> index instead of a pointer offset: address =
> > > > > > > >>>> (pool address + index * element size).
> > > > > > > >>>
> > > > > > > >>> Or instead of scaling the index with the element size,
> > > > > > > >>> which is only known at runtime, the index could be more
> > > > > > > >>> efficiently scaled by a compile time constant such as
> > > > > > > >>> RTE_MEMPOOL_ALIGN (= RTE_CACHE_LINE_SIZE). With a cache
> > > > > > > >>> line size of 64 bytes, that would allow indexing into
> > > > > > > >>> mempools up to 256 GB in size.
> > > > > > > >>>
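To make the two options concrete, here is a rough sketch (helper names are mine, not existing DPDK API). With a 32 bit index, scaling by the element size gives 2^32 * elt_size of addressable pool memory but needs a run-time multiply; scaling by the cache line size gives 2^32 * 64 B = 256 GB with a constant shift:

#include <stdint.h>
#include <stddef.h>

/* Option A: scale by the element size, which is only known at run time. */
static inline void *
idx_to_addr_eltsize(void *base, uint32_t idx, size_t elt_size)
{
        return (char *)base + (size_t)idx * elt_size;   /* run-time multiply */
}

/* Option B: scale by the cache line size, a compile time constant (64 B). */
static inline void *
idx_to_addr_cacheline(void *base, uint32_t idx)
{
        return (char *)base + ((size_t)idx << 6);       /* constant shift */
}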
> > > > > > > >>
> > > > > > > >> Looking at this snippet [1] from
> > > > > > > >> rte_mempool_op_populate_helper(), there is an 'offset' added
> > > > > > > >> to avoid objects crossing page boundaries. If my
> > > > > > > >> understanding is correct, using the index of the element
> > > > > > > >> instead of a pointer offset will pose a challenge for some
> > > > > > > >> of the corner cases.
> > > > > > > >>
> > > > > > > >> [1]
> > > > > > > >> 	for (i = 0; i < max_objs; i++) {
> > > > > > > >> 		/* avoid objects to cross page boundaries */
> > > > > > > >> 		if (check_obj_bounds(va + off, pg_sz, total_elt_sz) < 0) {
> > > > > > > >> 			off += RTE_PTR_ALIGN_CEIL(va + off, pg_sz) - (va + off);
> > > > > > > >> 			if (flags & RTE_MEMPOOL_POPULATE_F_ALIGN_OBJ)
> > > > > > > >> 				off += total_elt_sz -
> > > > > > > >> 					(((uintptr_t)(va + off - 1) %
> > > > > > > >> 						total_elt_sz) + 1);
> > > > > > > >> 		}
> > > > > > > >>
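A tiny self-contained illustration of that corner case, with hypothetical numbers: once the populate helper skips ahead to the next page, the object's offset from the start of the memory is no longer a multiple of total_elt_sz, so a pure element index cannot address it, whereas an offset scaled by a 64 byte unit still can.

#include <stdio.h>
#include <stddef.h>

int main(void)
{
        size_t total_elt_sz = 2176;     /* e.g. an mbuf-sized element */
        size_t pg_sz = 4096;
        size_t off = total_elt_sz;      /* tentative offset of the 2nd object */

        /* The 2nd object would cross the page boundary, so move it up. */
        if (off + total_elt_sz > pg_sz)
                off = pg_sz;

        printf("offset %zu: multiple of elt size? %s, of 64 B? %s\n",
               off,
               off % total_elt_sz ? "no" : "yes",
               off % 64 ? "no" : "yes");
        return 0;
}

It prints "offset 4096: multiple of elt size? no, of 64 B? yes".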
> > > > > > > >
> > > > > > > > OK. Alternatively to scaling the index with a cache line
> > > > > > > > size, you can scale it with sizeof(uintptr_t) to be able to
> > > > > > > > address 32 or 16 GB mempools on respectively 64 bit and 32
> > > > > > > > bit architectures. Both x86 and ARM CPUs have instructions
> > > > > > > > to access memory with an added offset multiplied by 4 or 8.
> > > > > > > > So that should be high performance.
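A corresponding sketch for this variant (again an illustrative helper, not existing API); 2^32 * 8 B = 32 GB on 64 bit and 2^32 * 4 B = 16 GB on 32 bit:

#include <stdint.h>
#include <stddef.h>

static inline void *
idx_to_addr_ptrsize(void *base, uint32_t idx)
{
        /* base + idx * 8 (or * 4 on 32 bit) maps directly onto the scaled
         * addressing modes of x86 and Arm load/store instructions. */
        return (char *)base + (size_t)idx * sizeof(uintptr_t);
}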
> > > > > > >
> > > > > > > Yes, agreed this can be done.
> > > > > > > Cache line size can also be used when
> > > > > > > 'MEMPOOL_F_NO_CACHE_ALIGN' is not enabled.
> > > > > > > On a side note, I wanted to better understand the need for
> > > > > > > having the 'MEMPOOL_F_NO_CACHE_ALIGN' option.
> > > > > >
> > > > > > The description of this field is misleading, and should be
> > > > > > corrected. The correct description would be: Don't need to
> > > > > > align objs on cache lines.
> > > > > >
> > > > > > It is useful for mempools containing very small objects, to
> > > > > > conserve memory.
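A hedged example of the flag's intended use, with illustrative element counts and sizes: a pool of 16 byte objects would waste 48 of every 64 bytes if each object were padded to a cache line.

#include <rte_mempool.h>
#include <rte_memory.h>

static struct rte_mempool *
create_small_obj_pool(void)
{
        return rte_mempool_create("small_objs",
                8192,                      /* number of elements */
                16,                        /* element size in bytes */
                256,                       /* per core cache size */
                0,                         /* no private data */
                NULL, NULL, NULL, NULL,    /* no init callbacks */
                SOCKET_ID_ANY,
                MEMPOOL_F_NO_CACHE_ALIGN); /* don't pad objects to cache lines */
}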
> > > > > I think we can assume that mbuf pools are created with the
> > > > > 'MEMPOOL_F_NO_CACHE_ALIGN' flag set. With this we can use
> > > > > offset
> > > > > calculated with cache line size as the unit.
> > > >
> > > > You mean MEMPOOL_F_NO_CACHE_ALIGN flag not set. ;-)
> > > Yes 😊
> > >
> > > >
> > > > I agree. And since the flag is a hint only, it can be ignored if
> > > > the mempool library is scaling the index with the cache line size.
> > > I do not think we should ignore the flag, for the reason you mention
> > > below.
> > >
> > > >
> > > > However, a mempool may contain other objects than mbufs, and those
> > > > objects may be small, so ignoring the MEMPOOL_F_NO_CACHE_ALIGN flag
> > > > may cost a lot of memory for such mempools.
> > > We could use different methods: if MEMPOOL_F_NO_CACHE_ALIGN is set,
> > > use sizeof(uintptr_t) as the unit; if it is not set, use the cache
> > > line size as the unit.
> > >
> >
> > That would require that the indexing multiplier is a runtime parameter
> > instead of a compile time parameter. So it would have a performance
> > penalty.
> >
> > The indexing multiplier could be compile time configurable, so it is a
> > tradeoff between granularity and maximum mempool size.
> I meant compile time configurable. i.e.
>
> #ifdef MEMPOOL_F_NO_CACHE_ALIGN
> <use sizeof(uintptr_t) as the multiplier>
> #else
> <use cache line size as the multiplier> /* This should provide enough
> memory for packet buffers */
> #endif
Please note that MEMPOOL_F_NO_CACHE_ALIGN is a runtime flag passed when creating a mempool, not a compile time option.
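To make my point concrete, a rough sketch of what a compile time configurable multiplier could look like. MEMPOOL_INDEX_SHIFT is a hypothetical build option, not something that exists in DPDK today:

#include <stdint.h>
#include <rte_common.h>

#ifndef MEMPOOL_INDEX_SHIFT
#define MEMPOOL_INDEX_SHIFT 6   /* 64 byte granularity: up to 256 GB per pool */
#endif

static inline void *
mempool_idx_to_obj(void *pool_base, uint32_t idx)
{
        return RTE_PTR_ADD(pool_base, (size_t)idx << MEMPOOL_INDEX_SHIFT);
}

static inline uint32_t
mempool_obj_to_idx(void *pool_base, void *obj)
{
        return (uint32_t)(RTE_PTR_DIFF(obj, pool_base) >> MEMPOOL_INDEX_SHIFT);
}

The shift is then a single compile time constant, regardless of which flags the mempool was created with at run time.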
Thread overview: 52+ messages
2021-09-30 17:27 Dharmik Thakkar
2021-10-01 12:36 ` Jerin Jacob
2021-10-01 15:44 ` Honnappa Nagarahalli
2021-10-01 17:32 ` Jerin Jacob
2021-10-01 17:57 ` Honnappa Nagarahalli
2021-10-01 18:21 ` Jerin Jacob
2021-10-01 21:30 ` Ananyev, Konstantin
2021-10-02 0:07 ` Honnappa Nagarahalli
2021-10-02 18:51 ` Ananyev, Konstantin
2021-10-04 16:36 ` Honnappa Nagarahalli
2021-10-30 10:23 ` Morten Brørup
2021-10-31 8:14 ` Morten Brørup
2021-11-03 15:12 ` Dharmik Thakkar
2021-11-03 15:52 ` Morten Brørup
2021-11-04 4:42 ` Dharmik Thakkar
2021-11-04 8:04 ` Morten Brørup
2021-11-08 4:32 ` Honnappa Nagarahalli
2021-11-08 7:22 ` Morten Brørup
2021-11-08 15:29 ` Honnappa Nagarahalli
2021-11-08 15:39 ` Morten Brørup
2021-11-08 15:46 ` Honnappa Nagarahalli
2021-11-08 16:03 ` Morten Brørup [this message]
2021-11-08 16:47 ` Jerin Jacob
2021-12-24 22:59 ` [PATCH 0/1] " Dharmik Thakkar
2021-12-24 22:59 ` [PATCH 1/1] " Dharmik Thakkar
2022-01-11 2:26 ` Ananyev, Konstantin
2022-01-13 5:17 ` Dharmik Thakkar
2022-01-13 10:37 ` Ananyev, Konstantin
2022-01-19 15:32 ` Dharmik Thakkar
2022-01-21 11:25 ` Ananyev, Konstantin
2022-01-21 11:31 ` Ananyev, Konstantin
2022-03-24 19:51 ` Dharmik Thakkar
2021-12-25 0:16 ` [PATCH 0/1] " Morten Brørup
2022-01-07 11:15 ` Bruce Richardson
2022-01-07 11:29 ` Morten Brørup
2022-01-07 13:50 ` Bruce Richardson
2022-01-08 9:37 ` Morten Brørup
2022-01-10 6:38 ` Jerin Jacob
2022-01-13 5:31 ` Dharmik Thakkar
2023-07-06 17:43 ` Stephen Hemminger
2023-07-31 12:23 ` Thomas Monjalon
2023-07-31 12:33 ` Morten Brørup
2023-07-31 14:57 ` Dharmik Jayesh Thakkar
2022-01-13 5:36 ` [PATCH v2 " Dharmik Thakkar
2022-01-13 5:36 ` [PATCH v2 1/1] " Dharmik Thakkar
2022-01-13 10:18 ` Jerin Jacob
2022-01-20 8:21 ` Morten Brørup
2022-01-21 6:01 ` Honnappa Nagarahalli
2022-01-21 7:36 ` Morten Brørup
2022-01-24 13:05 ` Ray Kinsella
2022-01-21 9:12 ` Bruce Richardson
2022-01-23 7:13 ` Wang, Haiyue