DPDK patches and discussions
 help / color / mirror / Atom feed
From: Morten Brørup <mb@smartsharesystems.com>
To: "Bruce Richardson" <bruce.richardson@intel.com>,
	"Dharmik Thakkar" <dharmik.thakkar@arm.com>,
	<honnappa.nagarahalli@arm.com>
Cc: <dev@dpdk.org>, <nd@arm.com>, <ruifeng.wang@arm.com>
Subject: RE: [PATCH 0/1] mempool: implement index-based per core cache
Date: Sat, 8 Jan 2022 10:37:10 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D86DEF@smartserver.smartshare.dk> (raw)
In-Reply-To: <YdhFKhWtpzKS6g7l@bricha3-MOBL.ger.corp.intel.com>

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Friday, 7 January 2022 14.51
> 
> On Fri, Jan 07, 2022 at 12:29:23PM +0100, Morten Brørup wrote:
> > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > Sent: Friday, 7 January 2022 12.16
> > >
> > > On Sat, Dec 25, 2021 at 01:16:03AM +0100, Morten Brørup wrote:
> > > > > From: Dharmik Thakkar [mailto:dharmik.thakkar@arm.com] Sent:
> > > Friday, 24
> > > > > December 2021 23.59
> > > > >
> > > > > Current mempool per core cache implementation stores pointers
> to
> > > mbufs
> > > > > On 64b architectures, each pointer consumes 8B This patch
> replaces
> > > it
> > > > > with index-based implementation, where in each buffer is
> addressed
> > > by
> > > > > (pool base address + index) It reduces the amount of
> memory/cache
> > > > > required for per core cache
> > > > >
> > > > > L3Fwd performance testing reveals minor improvements in the
> cache
> > > > > performance (L1 and L2 misses reduced by 0.60%) with no change
> in
> > > > > throughput
> > > > >
> > > > > Micro-benchmarking the patch using mempool_perf_test shows
> > > significant
> > > > > improvement with majority of the test cases
> > > > >
> > > >
> > > > I still think this is very interesting. And your performance
> numbers
> > > are
> > > > looking good.
> > > >
> > > > However, it limits the size of a mempool to 4 GB. As previously
> > > > discussed, the max mempool size can be increased by multiplying
> the
> > > index
> > > > with a constant.
> > > >
> > > > I would suggest using sizeof(uintptr_t) as the constant
> multiplier,
> > > so
> > > > the mempool can hold objects of any size divisible by
> > > sizeof(uintptr_t).
> > > > And it would be silly to use a mempool to hold objects smaller
> than
> > > > sizeof(uintptr_t).
> > > >
> > > > How does the performance look if you multiply the index by
> > > > sizeof(uintptr_t)?
> > > >
> > >
> > > Each mempool entry is cache aligned, so we can use that if we want
> a
> > > bigger
> > > multiplier.
> >
> > Thanks for chiming in, Bruce.
> >
> > Please also read this discussion about the multiplier:
> > http://inbox.dpdk.org/dev/CALBAE1PrQYyOG96f6ECeW1vPF3TOh1h7MZZULiY95z9xjbRuyA@mail.gmail.com/
> >
> 
> I actually wondered after I had sent the email whether we had indeed an
> option to disable the cache alignment or not! Thanks for pointing out
> that
> we do. This brings a couple additional thoughts:
> 
> * Using indexes for the cache should probably be a runtime flag rather
> than
>   a build-time one.
> * It would seem reasonable to me to disallow use of the indexed-cache
> flag
>   and the non-cache aligned flag simultaneously.
> * On the offchance that that restriction is unacceptable, then we can
>   make things a little more complicated by doing a runtime computation
> of
>   the "index-shiftwidth" to use.
> 
> Overall, I think defaulting to cacheline shiftwidth and disallowing
> index-based addressing when using unaligned buffers is simplest and
> easiest
> unless we can come up with a valid usecase for needing more than that.
> 
> /Bruce

This feature is a performance optimization.

With that in mind, it should not introduce function pointers or similar run-time checks or in the fast path, to determine what kind of cache to use per mempool. And if an index multiplier is implemented, it should be a compile time constant, probably something between sizeof(uintptr_t) or RTE_MEMPOOL_ALIGN (=RTE_CACHE_LINE_SIZE).

The patch comes with a tradeoff between better performance and limited mempool size, and possibly some limitations regarding very small objects that are not cache line aligned to avoid wasting memory (RTE_MEMPOOL_POPULATE_F_ALIGN_OBJ).

With no multiplier, the only tradeoff is that the mempool size is limited to 4 GB.

If the multiplier is small (i.e. 8 bytes) the only tradeoff is that the mempool size is limited to 32 GB. (And a waste of memory for objects smaller than 8 byte; but I don't think anyone would use a mempool to hold objects smaller than 8 byte.)

If the multiplier is larger (i.e. 64 bytes cache line size), the mempool size is instead limited to 256 GB, but RTE_MEMPOOL_POPULATE_F_ALIGN_OBJ has no effect.

Note: 32 bit platforms have no benefit from this patch: The pointer already only uses 4 bytes, so replacing the pointer with a 4 byte index makes no difference.


Since this feature is a performance optimization only, and doesn't provide any new features, I don't mind it being a compile time option.

If this feature is a compile time option, and the mempool library is compiled with the large multiplier, then RTE_MEMPOOL_POPULATE_F_ALIGN_OBJ could be made undefined in the public header file, so compilation of applications using the flag will fail. And rte_mempool_create() could RTE_ASSERT() that RTE_MEMPOOL_POPULATE_F_ALIGN_OBJ is not set in its flags parameter, or emit a warning about the flag being ignored. Obviously, rte_mempool_create() should also RTE_ASSERT() that the mempool is not larger than the library supports, possibly emitting a message that the mempool library should be built without this feature to support the larger mempool.

Here is another thought: If only exotic applications use mempools larger than 32 GB, this would be a generally acceptable limit, and DPDK should use index-based cache as default, making the opposite (i.e. pointer-based cache) a compile time option instead. A similar decision was recently made for limiting the RTE_MAX_LCORE default.


Although DPDK is moving away from compile time options in order to better support Linux distros, there should be a general exception for performance and memory optimizations. Otherwise, network appliance vendors will inherit the increasing amount of DPDK bloat, and we (network appliance vendors) will eventually be forced to fork DPDK to get rid of the bloat and achieve the goals originally intended by DPDK. If anyone disagrees with the principle about a general exception for performance and memory optimizations, I would like to pass on the decision to the Techboard!


  reply	other threads:[~2022-01-08  9:37 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-30 17:27 [dpdk-dev] [RFC] " Dharmik Thakkar
2021-10-01 12:36 ` Jerin Jacob
2021-10-01 15:44   ` Honnappa Nagarahalli
2021-10-01 17:32     ` Jerin Jacob
2021-10-01 17:57       ` Honnappa Nagarahalli
2021-10-01 18:21       ` Jerin Jacob
2021-10-01 21:30 ` Ananyev, Konstantin
2021-10-02  0:07   ` Honnappa Nagarahalli
2021-10-02 18:51     ` Ananyev, Konstantin
2021-10-04 16:36       ` Honnappa Nagarahalli
2021-10-30 10:23         ` Morten Brørup
2021-10-31  8:14         ` Morten Brørup
2021-11-03 15:12           ` Dharmik Thakkar
2021-11-03 15:52             ` Morten Brørup
2021-11-04  4:42               ` Dharmik Thakkar
2021-11-04  8:04                 ` Morten Brørup
2021-11-08  4:32                   ` Honnappa Nagarahalli
2021-11-08  7:22                     ` Morten Brørup
2021-11-08 15:29                       ` Honnappa Nagarahalli
2021-11-08 15:39                         ` Morten Brørup
2021-11-08 15:46                           ` Honnappa Nagarahalli
2021-11-08 16:03                             ` Morten Brørup
2021-11-08 16:47                               ` Jerin Jacob
2021-12-24 22:59 ` [PATCH 0/1] " Dharmik Thakkar
2021-12-24 22:59   ` [PATCH 1/1] " Dharmik Thakkar
2022-01-11  2:26     ` Ananyev, Konstantin
2022-01-13  5:17       ` Dharmik Thakkar
2022-01-13 10:37         ` Ananyev, Konstantin
2022-01-19 15:32           ` Dharmik Thakkar
2022-01-21 11:25             ` Ananyev, Konstantin
2022-01-21 11:31               ` Ananyev, Konstantin
2022-03-24 19:51               ` Dharmik Thakkar
2021-12-25  0:16   ` [PATCH 0/1] " Morten Brørup
2022-01-07 11:15     ` Bruce Richardson
2022-01-07 11:29       ` Morten Brørup
2022-01-07 13:50         ` Bruce Richardson
2022-01-08  9:37           ` Morten Brørup [this message]
2022-01-10  6:38             ` Jerin Jacob
2022-01-13  5:31               ` Dharmik Thakkar
2022-01-13  5:36   ` [PATCH v2 " Dharmik Thakkar
2022-01-13  5:36     ` [PATCH v2 1/1] " Dharmik Thakkar
2022-01-13 10:18       ` Jerin Jacob
2022-01-20  8:21       ` Morten Brørup
2022-01-21  6:01         ` Honnappa Nagarahalli
2022-01-21  7:36           ` Morten Brørup
2022-01-24 13:05             ` Ray Kinsella
2022-01-21  9:12           ` Bruce Richardson
2022-01-23  7:13       ` Wang, Haiyue

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35D86DEF@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=dharmik.thakkar@arm.com \
    --cc=honnappa.nagarahalli@arm.com \
    --cc=nd@arm.com \
    --cc=ruifeng.wang@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror http://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ http://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git