DPDK patches and discussions
From: Morten Brørup <mb@smartsharesystems.com>
To: "Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
	"Dharmik Thakkar" <Dharmik.Thakkar@arm.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "Olivier Matz" <olivier.matz@6wind.com>,
	"Andrew Rybchenko" <andrew.rybchenko@oktetlabs.ru>,
	<dev@dpdk.org>, "nd" <nd@arm.com>,
	"Ruifeng Wang" <Ruifeng.Wang@arm.com>, "nd" <nd@arm.com>
Subject: Re: [dpdk-dev] [RFC] mempool: implement index-based per core cache
Date: Mon, 8 Nov 2021 08:22:06 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D86CAE@smartserver.smartshare.dk> (raw)
In-Reply-To: <DBAPR08MB5814A9A1A405D8959EF4398798919@DBAPR08MB5814.eurprd08.prod.outlook.com>

> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Monday, 8 November 2021 05.33
> 
> <snip>
> 
> > > >>>>>>>>> Current mempool per core cache implementation is based on
> > > >>>>>>>>> pointers. For most architectures, each pointer consumes 64
> > > >>>>>>>>> bits. Replace it with an index-based implementation, where
> > > >>>>>>>>> each buffer is addressed by (pool address + index).
> > > >>>>
> > > >>>> I like Dharmik's suggestion very much. CPU cache is a critical
> > > >>>> and limited resource.
> > > >>>>
> > > >>>> DPDK has a tendency of using pointers where indexes could be
> > > >>>> used instead. I suppose pointers provide the additional
> > > >>>> flexibility of mixing entries from different memory pools,
> > > >>>> e.g. multiple mbuf pools.
> > > >>>>
> > > >>
> > > >> Agreed, thank you!
> > > >>
> > > >>>>>>>>
> > > >>>>>>>> I don't think it is going to work:
> > > >>>>>>>> On 64-bit systems, the difference between the pool address
> > > >>>>>>>> and its element address could be bigger than 4 GB.
> > > >>>>>>> Are you talking about a case where the memory pool size is
> > > >>>>>>> more than 4 GB?
> > > >>>>>>
> > > >>>>>> That is one possible scenario.
> > > >>>>
> > > >>>> That could be solved by making the index an element index
> > > >>>> instead of a pointer offset: address = (pool address + index *
> > > >>>> element size).
> > > >>>
> > > >>> Or instead of scaling the index with the element size, which is
> > > >>> only known at runtime, the index could be more efficiently
> > > >>> scaled by a compile time constant such as RTE_MEMPOOL_ALIGN
> > > >>> (= RTE_CACHE_LINE_SIZE). With a cache line size of 64 bytes,
> > > >>> that would allow indexing into mempools up to 256 GB in size.
> > > >>>
> > > >>
> > > >> Looking at this snippet [1] from
> > > >> rte_mempool_op_populate_helper(), there is an ‘offset’ added to
> > > >> avoid objects crossing page boundaries. If my understanding is
> > > >> correct, using the index of an element instead of a pointer
> > > >> offset will pose a challenge for some of the corner cases.
> > > >>
> > > >> [1]
> > > >>        for (i = 0; i < max_objs; i++) {
> > > >>                /* avoid objects to cross page boundaries */
> > > >>                if (check_obj_bounds(va + off, pg_sz, total_elt_sz) < 0) {
> > > >>                        off += RTE_PTR_ALIGN_CEIL(va + off, pg_sz) - (va + off);
> > > >>                        if (flags & RTE_MEMPOOL_POPULATE_F_ALIGN_OBJ)
> > > >>                                off += total_elt_sz -
> > > >>                                        (((uintptr_t)(va + off - 1) %
> > > >>                                                total_elt_sz) + 1);
> > > >>                }
> > > >>
> > > >
> > > > OK. Alternatively to scaling the index with the cache line size,
> > > > you can scale it with sizeof(uintptr_t) to be able to address
> > > > 32 GB or 16 GB mempools on 64-bit and 32-bit architectures,
> > > > respectively. Both x86 and ARM CPUs have instructions to access
> > > > memory with an added offset multiplied by 4 or 8. So that should
> > > > be high performance.
> > >
> > > Yes, agreed this can be done.
> > > Cache line size can also be used when ‘MEMPOOL_F_NO_CACHE_ALIGN’ is
> > > not enabled.
> > > On a side note, I wanted to better understand the need for having
> > > the 'MEMPOOL_F_NO_CACHE_ALIGN' option.
> >
> > The description of this field is misleading, and should be corrected.
> > The correct description would be: Don't need to align objs on cache
> > lines.
> >
> > It is useful for mempools containing very small objects, to conserve
> > memory.
> I think we can assume that mbuf pools are created with the
> 'MEMPOOL_F_NO_CACHE_ALIGN' flag set. With this we can use an offset
> calculated with the cache line size as the unit.

You mean MEMPOOL_F_NO_CACHE_ALIGN flag not set. ;-)

I agree. And since the flag is a hint only, it can be ignored if the mempool library is scaling the index with the cache line size.

However, a mempool may contain objects other than mbufs, and those objects may be small, so ignoring the MEMPOOL_F_NO_CACHE_ALIGN flag could cost a lot of memory for such mempools.

> 
> >
> > >
> > > >
> > > >>>>
> > > >>>>>> Another possibility - user populates the mempool himself
> > > >>>>>> with some external memory by calling
> > > >>>>>> rte_mempool_populate_iova() directly.
> > > >>>>> Is the concern that IOVA might not be contiguous for all the
> > > >>>>> memory used by the mempool?
> > > >>>>>
> > > >>>>>> I suppose such a situation can occur even with normal
> > > >>>>>> rte_mempool_create(), though it should be a really rare one.
> > > >>>>> All in all, this feature needs to be configurable at compile
> > > >>>>> time.
> > > >>>
> > > >


Thread overview: 48+ messages
2021-09-30 17:27 Dharmik Thakkar
2021-10-01 12:36 ` Jerin Jacob
2021-10-01 15:44   ` Honnappa Nagarahalli
2021-10-01 17:32     ` Jerin Jacob
2021-10-01 17:57       ` Honnappa Nagarahalli
2021-10-01 18:21       ` Jerin Jacob
2021-10-01 21:30 ` Ananyev, Konstantin
2021-10-02  0:07   ` Honnappa Nagarahalli
2021-10-02 18:51     ` Ananyev, Konstantin
2021-10-04 16:36       ` Honnappa Nagarahalli
2021-10-30 10:23         ` Morten Brørup
2021-10-31  8:14         ` Morten Brørup
2021-11-03 15:12           ` Dharmik Thakkar
2021-11-03 15:52             ` Morten Brørup
2021-11-04  4:42               ` Dharmik Thakkar
2021-11-04  8:04                 ` Morten Brørup
2021-11-08  4:32                   ` Honnappa Nagarahalli
2021-11-08  7:22                     ` Morten Brørup [this message]
2021-11-08 15:29                       ` Honnappa Nagarahalli
2021-11-08 15:39                         ` Morten Brørup
2021-11-08 15:46                           ` Honnappa Nagarahalli
2021-11-08 16:03                             ` Morten Brørup
2021-11-08 16:47                               ` Jerin Jacob
2021-12-24 22:59 ` [PATCH 0/1] " Dharmik Thakkar
2021-12-24 22:59   ` [PATCH 1/1] " Dharmik Thakkar
2022-01-11  2:26     ` Ananyev, Konstantin
2022-01-13  5:17       ` Dharmik Thakkar
2022-01-13 10:37         ` Ananyev, Konstantin
2022-01-19 15:32           ` Dharmik Thakkar
2022-01-21 11:25             ` Ananyev, Konstantin
2022-01-21 11:31               ` Ananyev, Konstantin
2022-03-24 19:51               ` Dharmik Thakkar
2021-12-25  0:16   ` [PATCH 0/1] " Morten Brørup
2022-01-07 11:15     ` Bruce Richardson
2022-01-07 11:29       ` Morten Brørup
2022-01-07 13:50         ` Bruce Richardson
2022-01-08  9:37           ` Morten Brørup
2022-01-10  6:38             ` Jerin Jacob
2022-01-13  5:31               ` Dharmik Thakkar
2022-01-13  5:36   ` [PATCH v2 " Dharmik Thakkar
2022-01-13  5:36     ` [PATCH v2 1/1] " Dharmik Thakkar
2022-01-13 10:18       ` Jerin Jacob
2022-01-20  8:21       ` Morten Brørup
2022-01-21  6:01         ` Honnappa Nagarahalli
2022-01-21  7:36           ` Morten Brørup
2022-01-24 13:05             ` Ray Kinsella
2022-01-21  9:12           ` Bruce Richardson
2022-01-23  7:13       ` Wang, Haiyue

