From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Dharmik Thakkar" <Dharmik.Thakkar@arm.com>,
"Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
"Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "Olivier Matz" <olivier.matz@6wind.com>,
"Andrew Rybchenko" <andrew.rybchenko@oktetlabs.ru>,
<dev@dpdk.org>, "nd" <nd@arm.com>,
"Ruifeng Wang" <Ruifeng.Wang@arm.com>
Subject: Re: [dpdk-dev] [RFC] mempool: implement index-based per core cache
Date: Thu, 4 Nov 2021 09:04:48 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D86CA2@smartserver.smartshare.dk>
In-Reply-To: <00330F38-A5D0-4B56-872E-0EA0E24808F7@arm.com>
+ Ring library maintainers (@Honnappa and @Konstantin) for my rants about its documentation.
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Dharmik Thakkar
> Sent: Thursday, 4 November 2021 05.42
>
> > On Nov 3, 2021, at 10:52 AM, Morten Brørup <mb@smartsharesystems.com> wrote:
> >
> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Dharmik Thakkar
> >> Sent: Wednesday, 3 November 2021 16.13
> >>
> >> Hi,
> >>
> >> Thank you everyone for the comments! I am currently working on
> >> making the global pool ring’s implementation index based.
> >> Once done, I will send a patch for community review. I will also
> >> make it a compile-time option.
> >
> > Sounds good to me.
> >
> > This could probably be abstracted to other libraries too. E.g. the
> > ring library holds pointers to objects (void *); an alternative ring
> > library could hold indexes to objects (uint32_t). A ring often holds
> > objects from the same mempool, and the application knows which mempool,
> > so indexing would be useful here too.
> >
>
> Yes, the ring library within DPDK has APIs to support a configurable
> element size.
I remember seeing that feature proposed on the mailing list too, but I couldn't find it in the API documentation, so I was not sure it was ever accepted.
The containers section of the API documentation (/doc/api/doxy-api-index.md) doesn't contain any references to it. And the description of the RTE Ring library, which the "ring" link in the API documentation refers to, clearly says: The Ring Manager is a fixed-size queue, implemented as a table of *pointers*. (My emphasis.) So I thought it wasn't accepted.
However, searching for it in the source code reveals that it is indeed there! And the Ring Library chapter in the Programmer's Guide does mention that the objects can be something other than pointers.
So the documentation is not all screwed up, just a little sparse. :-)
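For my own understanding, here is a minimal sketch of how I read the custom element size API (based on rte_ring_elem.h; the ring name, count and flags are arbitrary examples of mine):

#include <stdint.h>
#include <rte_ring_elem.h>

static void
example(void)
{
	/* A ring whose slots hold 4-byte indexes instead of pointers.
	 * esize must be a multiple of 4. */
	struct rte_ring *r = rte_ring_create_elem("idx_ring", sizeof(uint32_t),
			1024, SOCKET_ID_ANY, RING_F_SP_ENQ | RING_F_SC_DEQ);

	uint32_t idx[32] = { 0 };
	/* The *_elem enqueue/dequeue variants take the element size explicitly. */
	rte_ring_enqueue_bulk_elem(r, idx, sizeof(uint32_t), 32, NULL);
	rte_ring_dequeue_bulk_elem(r, idx, sizeof(uint32_t), 32, NULL);
}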
>
> >>
> >>> On Oct 31, 2021, at 3:14 AM, Morten Brørup <mb@smartsharesystems.com> wrote:
> >>>
> >>>> From: Morten Brørup
> >>>> Sent: Saturday, 30 October 2021 12.24
> >>>>
> >>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Honnappa
> >>>>> Nagarahalli
> >>>>> Sent: Monday, 4 October 2021 18.36
> >>>>>
> >>>>> <snip>
> >>>>>>
> >>>>>>
> >>>>>>>>> Current mempool per core cache implementation is based on
> >>>>>>>>> pointer. For most architectures, each pointer consumes 64b.
> >>>>>>>>> Replace it with index-based implementation, wherein each
> >>>>>>>>> buffer is addressed by (pool address + index).
> >>>>
> >>>> I like Dharmik's suggestion very much. CPU cache is a critical and
> >>>> limited resource.
> >>>>
> >>>> DPDK has a tendency of using pointers where indexes could be used
> >>>> instead. I suppose pointers provide the additional flexibility of
> >>>> mixing entries from different memory pools, e.g. multiple mbuf
> >>>> pools.
> >>>>
> >>
> >> Agreed, thank you!
> >>
> >>>>>>>>
> >>>>>>>> I don't think it is going to work:
> >>>>>>>> On 64-bit systems the difference between the pool address and
> >>>>>>>> its elem address could be bigger than 4GB.
> >>>>>>> Are you talking about a case where the memory pool size is more
> >>>>>>> than 4GB?
> >>>>>>
> >>>>>> That is one possible scenario.
> >>>>
> >>>> That could be solved by making the index an element index instead
> >>>> of a pointer offset: address = (pool address + index * element size).
> >>>
> >>> Or instead of scaling the index with the element size, which is only
> >>> known at runtime, the index could be more efficiently scaled by a
> >>> compile-time constant such as RTE_MEMPOOL_ALIGN (=
> >>> RTE_CACHE_LINE_SIZE). With a cache line size of 64 bytes, that would
> >>> allow indexing into mempools up to 256 GB in size.
> >>>
> >>
> >> Looking at this snippet [1] from rte_mempool_op_populate_helper(),
> >> there is an ‘offset’ added to prevent objects from crossing page
> >> boundaries. If my understanding is correct, using the index of the
> >> element instead of a pointer offset will pose a challenge for some
> >> of the corner cases.
> >>
> >> [1]
> >>     for (i = 0; i < max_objs; i++) {
> >>             /* avoid objects to cross page boundaries */
> >>             if (check_obj_bounds(va + off, pg_sz, total_elt_sz) < 0) {
> >>                     off += RTE_PTR_ALIGN_CEIL(va + off, pg_sz) -
> >>                             (va + off);
> >>                     if (flags & RTE_MEMPOOL_POPULATE_F_ALIGN_OBJ)
> >>                             off += total_elt_sz -
> >>                                     (((uintptr_t)(va + off - 1) %
> >>                                     total_elt_sz) + 1);
> >>             }
> >
> > OK. As an alternative to scaling the index with the cache line size,
> > you can scale it with sizeof(uintptr_t), to be able to address 32 GB
> > or 16 GB mempools on 64-bit and 32-bit architectures, respectively.
> > Both x86 and ARM CPUs have instructions to access memory with an added
> > offset multiplied by 4 or 8, so that should be high performance.
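Just to make sure we mean the same thing, a minimal sketch of the conversion I have in mind (function names are mine and purely illustrative; pool_base would be the base address of the mempool's object memory):

#include <stdint.h>
#include <rte_common.h>

/* Illustrative scale factor: sizeof(uintptr_t) gives 2^32 * 8 = 32 GB of
 * addressable pool memory on 64-bit (16 GB on 32-bit); RTE_CACHE_LINE_SIZE
 * would give 256 GB. */
#define OBJ_SCALE sizeof(uintptr_t)

static inline void *
idx_to_obj(void *pool_base, uint32_t idx)
{
	/* pool address + index * compile-time scale */
	return RTE_PTR_ADD(pool_base, (uintptr_t)idx * OBJ_SCALE);
}

static inline uint32_t
obj_to_idx(void *pool_base, const void *obj)
{
	return (uint32_t)(RTE_PTR_DIFF(obj, pool_base) / OBJ_SCALE);
}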
>
> Yes, agreed this can be done.
> Cache line size can also be used when ‘MEMPOOL_F_NO_CACHE_ALIGN’ is not
> enabled.
> On a side note, I wanted to better understand the need for having the
> 'MEMPOOL_F_NO_CACHE_ALIGN' option.
The description of this flag is misleading and should be corrected.
A better description would be: Don't need to align objects on cache lines.
It is useful for mempools containing very small objects, to conserve memory.
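E.g. something like this is where I would expect the flag to be used (just a sketch; the sizes are arbitrary examples of mine):

#include <rte_mempool.h>

static struct rte_mempool *
create_small_obj_pool(void)
{
	/* Without MEMPOOL_F_NO_CACHE_ALIGN, each object's slot is rounded up
	 * to a whole cache line; with the flag, tiny (e.g. 8 byte) objects
	 * waste far less memory. Per-object header/trailer overhead remains. */
	return rte_mempool_create("small_objs",
			1 << 20,	/* number of objects */
			8,		/* object size in bytes */
			256,		/* per-lcore cache size */
			0,		/* private data size */
			NULL, NULL, NULL, NULL,
			SOCKET_ID_ANY,
			MEMPOOL_F_NO_CACHE_ALIGN);
}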
>
> >
> >>>>
> >>>>>> Another possibility - user populates mempool himself with some
> >>>>>> external memory by calling rte_mempool_populate_iova() directly.
> >>>>> Is the concern that IOVA might not be contiguous for all the
> >>>>> memory used by the mempool?
> >>>>>
> >>>>>> I suppose such a situation can even occur with normal
> >>>>>> rte_mempool_create(), though it should be a really rare one.
> >>>>> All in all, this feature needs to be configurable at compile
> >>>>> time.
> >>>
> >
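And regarding the compile time configurability mentioned above, I imagine something along these lines (the macro name is purely hypothetical, only to illustrate the idea - it is not an existing DPDK option):

#include <stdint.h>

#ifdef RTE_MEMPOOL_INDEX_BASED_CACHE
/* Per-lcore cache stores 32-bit indexes into the pool's object memory. */
typedef uint32_t rte_mempool_cache_obj_t;
#else
/* Default: per-lcore cache stores object pointers, as today. */
typedef void *rte_mempool_cache_obj_t;
#endif

This way the pointer-based path carries no runtime cost when the option is disabled.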