Re: [PATCH 1/1] mempool: implement index-based per core cache

DPDK patches and discussions
 help / color / mirror / Atom feed

From: Dharmik Thakkar <Dharmik.Thakkar@arm.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: Olivier Matz <olivier.matz@6wind.com>,
	Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	"dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	Ruifeng Wang <Ruifeng.Wang@arm.com>
Subject: Re: [PATCH 1/1] mempool: implement index-based per core cache
Date: Wed, 19 Jan 2022 15:32:57 +0000	[thread overview]
Message-ID: <278A6D2F-B5B5-4D4E-B8BE-C5A9BFE8C1C7@arm.com> (raw)
In-Reply-To: <DM6PR11MB4491342E71D9FB54BA18346C9A539@DM6PR11MB4491.namprd11.prod.outlook.com>

Hi Konstatin,

> On Jan 13, 2022, at 4:37 AM, Ananyev, Konstantin <konstantin.ananyev@intel.com> wrote:
> 
> 
> Hi Dharmik,
> 
>>> 
>>>> Current mempool per core cache implementation stores pointers to mbufs
>>>> On 64b architectures, each pointer consumes 8B
>>>> This patch replaces it with index-based implementation,
>>>> where in each buffer is addressed by (pool base address + index)
>>>> It reduces the amount of memory/cache required for per core cache
>>>> 
>>>> L3Fwd performance testing reveals minor improvements in the cache
>>>> performance (L1 and L2 misses reduced by 0.60%)
>>>> with no change in throughput
>>> 
>>> I feel really sceptical about that patch and the whole idea in general:
>>> - From what I read above there is no real performance improvement observed.
>>> (In fact on my IA boxes mempool_perf_autotest reports ~20% slowdown,
>>> see below for more details).
>> 
>> Currently, the optimizations (loop unroll and vectorization) are only implemented for ARM64.
>> Similar optimizations can be implemented for x86 platforms which should close the performance gap
>> and in my understanding should give better performance for a bulk size of 32.
> 
> Might be, but I still don't see the reason for such effort.
> As you mentioned there is no performance improvement in 'real' apps: l3fwd, etc.
> on ARM64 even with vectorized version of the code.
> 

IMO, even without performance improvement, it is advantageous because the same performance is being achieved
with less memory and cache utilization using the patch.

>>> - Space utilization difference looks neglectable too.
>> 
>> Sorry, I did not understand this point.
> 
> As I understand one of the expectations from that patch was:
> reduce memory/cache required, which should improve cache utilization
> (less misses, etc.).
> Though I think such improvements would be neglectable and wouldn't
> cause any real performance gain.

The cache utilization performance numbers are for the l3fwd app, which might not be bottlenecked at the mempool per core cache.
Theoretically, this patch enables storing twice the number of objects in the cache as compared to the original implementation.

> 
>>> - The change introduces a new build time config option with a major limitation:
>>>  All memzones in a pool have to be within the same 4GB boundary.
>>>  To address it properly, extra changes will be required in init(/populate) part of the code.
>> 
>> I agree to the above mentioned challenges and I am currently working on resolving these issues.
> 
> I still think that to justify such changes some really noticeable performance
> improvement needs to be demonstrated: double-digit speedup for l3fwd/ipsec-secgw/...  
> Otherwise it just not worth the hassle. 
> 

Like I mentioned earlier, the app might not be bottlenecked at the mempool per core cache.
That could be the reason the numbers with l3fwd don’t fully show the advantage of the patch.
I’m seeing double-digit improvement with mempool_perf_autotest which should not be ignored.

>>>  All that will complicate mempool code, will make it more error prone
>>>  and harder to maintain.
>>> But, as there is no real gain in return - no point to add such extra complexity at all.
>>> 
>>> Konstantin
>>>

next prev parent reply	other threads:[~2022-01-19 15:33 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-30 17:27 [dpdk-dev] [RFC] " Dharmik Thakkar
2021-10-01 12:36 ` Jerin Jacob
2021-10-01 15:44   ` Honnappa Nagarahalli
2021-10-01 17:32     ` Jerin Jacob
2021-10-01 17:57       ` Honnappa Nagarahalli
2021-10-01 18:21       ` Jerin Jacob
2021-10-01 21:30 ` Ananyev, Konstantin
2021-10-02  0:07   ` Honnappa Nagarahalli
2021-10-02 18:51     ` Ananyev, Konstantin
2021-10-04 16:36       ` Honnappa Nagarahalli
2021-10-30 10:23         ` Morten Brørup
2021-10-31  8:14         ` Morten Brørup
2021-11-03 15:12           ` Dharmik Thakkar
2021-11-03 15:52             ` Morten Brørup
2021-11-04  4:42               ` Dharmik Thakkar
2021-11-04  8:04                 ` Morten Brørup
2021-11-08  4:32                   ` Honnappa Nagarahalli
2021-11-08  7:22                     ` Morten Brørup
2021-11-08 15:29                       ` Honnappa Nagarahalli
2021-11-08 15:39                         ` Morten Brørup
2021-11-08 15:46                           ` Honnappa Nagarahalli
2021-11-08 16:03                             ` Morten Brørup
2021-11-08 16:47                               ` Jerin Jacob
2021-12-24 22:59 ` [PATCH 0/1] " Dharmik Thakkar
2021-12-24 22:59   ` [PATCH 1/1] " Dharmik Thakkar
2022-01-11  2:26     ` Ananyev, Konstantin
2022-01-13  5:17       ` Dharmik Thakkar
2022-01-13 10:37         ` Ananyev, Konstantin
2022-01-19 15:32           ` Dharmik Thakkar [this message]
2022-01-21 11:25             ` Ananyev, Konstantin
2022-01-21 11:31               ` Ananyev, Konstantin
2022-03-24 19:51               ` Dharmik Thakkar
2021-12-25  0:16   ` [PATCH 0/1] " Morten Brørup
2022-01-07 11:15     ` Bruce Richardson
2022-01-07 11:29       ` Morten Brørup
2022-01-07 13:50         ` Bruce Richardson
2022-01-08  9:37           ` Morten Brørup
2022-01-10  6:38             ` Jerin Jacob
2022-01-13  5:31               ` Dharmik Thakkar
2023-07-06 17:43                 ` Stephen Hemminger
2023-07-31 12:23                   ` Thomas Monjalon
2023-07-31 12:33                     ` Morten Brørup
2023-07-31 14:57                       ` Dharmik Jayesh Thakkar
2022-01-13  5:36   ` [PATCH v2 " Dharmik Thakkar
2022-01-13  5:36     ` [PATCH v2 1/1] " Dharmik Thakkar
2022-01-13 10:18       ` Jerin Jacob
2022-01-20  8:21       ` Morten Brørup
2022-01-21  6:01         ` Honnappa Nagarahalli
2022-01-21  7:36           ` Morten Brørup
2022-01-24 13:05             ` Ray Kinsella
2022-01-21  9:12           ` Bruce Richardson
2022-01-23  7:13       ` Wang, Haiyue

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=278A6D2F-B5B5-4D4E-B8BE-C5A9BFE8C1C7@arm.com \
    --to=dharmik.thakkar@arm.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=Ruifeng.Wang@arm.com \
    --cc=andrew.rybchenko@oktetlabs.ru \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    --cc=nd@arm.com \
    --cc=olivier.matz@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).