From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 53FB4A0032;
	Sat,  2 Oct 2021 02:07:39 +0200 (CEST)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id D7978410EB;
	Sat,  2 Oct 2021 02:07:38 +0200 (CEST)
Received: from EUR02-VE1-obe.outbound.protection.outlook.com
 (mail-eopbgr20084.outbound.protection.outlook.com [40.107.2.84])
 by mails.dpdk.org (Postfix) with ESMTP id 6792D4067B
 for <dev@dpdk.org>; Sat,  2 Oct 2021 02:07:38 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; 
 s=selector2-armh-onmicrosoft-com;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=kWr/XEWP0rfQQFCPHJBQ/4ZrAKXHh5zFIhjD4n4daeA=;
 b=hzxApG8Qg1k4rJPRdDWCzcvC6TWzbhpKDV8KH3hYWQ1ADeuHN4OGAcny3FBBUN1PgoCS+qGqfaZ1sBtdmq2wSWYc9RNAy0+YoSgBpeKzirJ9CDj4eYEocA8i8jAiYMks38ZBpPLOxuin1v0pAEiAuochTcbEsll3v8fsxm+STwI=
Received: from AM5PR1001CA0066.EURPRD10.PROD.OUTLOOK.COM
 (2603:10a6:206:15::43) by DBBPR08MB6235.eurprd08.prod.outlook.com
 (2603:10a6:10:201::17) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4566.15; Sat, 2 Oct
 2021 00:07:36 +0000
Received: from AM5EUR03FT023.eop-EUR03.prod.protection.outlook.com
 (2603:10a6:206:15:cafe::cb) by AM5PR1001CA0066.outlook.office365.com
 (2603:10a6:206:15::43) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4566.16 via Frontend
 Transport; Sat, 2 Oct 2021 00:07:36 +0000
X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.33.187.114)
 smtp.mailfrom=arm.com; dpdk.org; dkim=pass (signature was verified)
 header.d=armh.onmicrosoft.com;dpdk.org; dmarc=pass action=none
 header.from=arm.com;
Received-SPF: Pass (protection.outlook.com: domain of arm.com designates
 63.33.187.114 as permitted sender) receiver=protection.outlook.com;
 client-ip=63.33.187.114; helo=64aa7808-outbound-2.mta.getcheckrecipient.com;
Received: from 64aa7808-outbound-2.mta.getcheckrecipient.com (63.33.187.114)
 by AM5EUR03FT023.mail.protection.outlook.com (10.152.16.169) with Microsoft
 SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.4566.14 via Frontend Transport; Sat, 2 Oct 2021 00:07:35 +0000
Received: ("Tessian outbound 3c48586a377f:v103");
 Sat, 02 Oct 2021 00:07:33 +0000
X-CR-MTA-TID: 64aa7808
Received: from 73901a70067b.2
 by 64aa7808-outbound-1.mta.getcheckrecipient.com id
 7A12F9DD-A308-4FBE-A00E-099D7AFC6630.1; 
 Sat, 02 Oct 2021 00:07:22 +0000
Received: from EUR02-HE1-obe.outbound.protection.outlook.com
 by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 73901a70067b.2
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384);
 Sat, 02 Oct 2021 00:07:22 +0000
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
 b=XxWlWdE2t9vekgpaXxnAVXdYhjXFqhyKfDuEUInSI9Q6uAK9znDhEvHW1vXu69Csj4+kXlCia1pgWfCLux000jRBC3DEw3fLyuoeSqoXSTZOR4stAe97oZYO/Ud6gQkaLCFuPnJUXmuDx7EFX218XZo05DATqGLuCnKknnr/rtkGpGeXPGAnR+h3uTw+taUqQCY+I75358D+lajdK9u37M3yqciLXt2fRAOV50NiUzZjyqe/XM8f11s1BdtDZa8BR3Ki2dvY85lqU1laarOZ02bxldwUJLF7XsbqOhPvp1UvCRFoQQNXyd4n8JWegARsWdmD6GwdV9iY0IU3DVTAAg==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; 
 s=arcselector9901;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;
 bh=kWr/XEWP0rfQQFCPHJBQ/4ZrAKXHh5zFIhjD4n4daeA=;
 b=FQdziAlwYH45HoEg0YV8bANxTxGe8P2ofWNIk+wKhrmABoRWWX8vkeGTTc2RtvXpZSjgCjPHTM3/B8fFi8YFSF1ZCpAdN47Poz/3UCwxktZcOsxa2q1OfiJiMvCfKvwVI8EoiaTK38bLrkUfze1OmJwzsWV9wZzCq+dRAXidY14Tw1JEysnm9OGALWZfmcKDLDy9BtQ3/SfvKisHoDBwlI8vt/Xoa+wMOF2AUZAmXuiE3NpiBeQApXmYDjNK5Ejn9QA/4qRdItTJGG7NDFCEN01AbcqE+i4yKb52K0gR9sCOtiScN2N4ZnI6IK5qAWxpWXUmOeJBiQG0Kwml5JhFbQ==
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass
 smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass
 header.d=arm.com; arc=none
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; 
 s=selector2-armh-onmicrosoft-com;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=kWr/XEWP0rfQQFCPHJBQ/4ZrAKXHh5zFIhjD4n4daeA=;
 b=hzxApG8Qg1k4rJPRdDWCzcvC6TWzbhpKDV8KH3hYWQ1ADeuHN4OGAcny3FBBUN1PgoCS+qGqfaZ1sBtdmq2wSWYc9RNAy0+YoSgBpeKzirJ9CDj4eYEocA8i8jAiYMks38ZBpPLOxuin1v0pAEiAuochTcbEsll3v8fsxm+STwI=
Received: from DBAPR08MB5814.eurprd08.prod.outlook.com (2603:10a6:10:1b1::6)
 by DBBPR08MB4281.eurprd08.prod.outlook.com (2603:10a6:10:c4::17) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4544.18; Sat, 2 Oct
 2021 00:07:17 +0000
Received: from DBAPR08MB5814.eurprd08.prod.outlook.com
 ([fe80::8187:ccbc:30d:3464]) by DBAPR08MB5814.eurprd08.prod.outlook.com
 ([fe80::8187:ccbc:30d:3464%5]) with mapi id 15.20.4566.019; Sat, 2 Oct 2021
 00:07:17 +0000
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, Dharmik Thakkar
 <Dharmik.Thakkar@arm.com>, Olivier Matz <olivier.matz@6wind.com>, Andrew
 Rybchenko <andrew.rybchenko@oktetlabs.ru>
CC: "dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>, Ruifeng Wang
 <Ruifeng.Wang@arm.com>, nd <nd@arm.com>
Thread-Topic: [dpdk-dev] [RFC] mempool: implement index-based per core cache
Thread-Index: AQHXtwuB3FbH85c80U2vmaazO82iNau+1ETw
Date: Sat, 2 Oct 2021 00:07:17 +0000
Message-ID: <DBAPR08MB5814DDC7117DA8D9C560C37298AC9@DBAPR08MB5814.eurprd08.prod.outlook.com>
References: <20210930172735.2675627-1-dharmik.thakkar@arm.com>
 <DM6PR11MB449143289777B6B94042E9969AAB9@DM6PR11MB4491.namprd11.prod.outlook.com>
In-Reply-To: <DM6PR11MB449143289777B6B94042E9969AAB9@DM6PR11MB4491.namprd11.prod.outlook.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-ts-tracking-id: 5C3FCCD704E8A148BA7B168EEFCC0950.0
x-checkrecipientchecked: true
Authentication-Results-Original: intel.com; dkim=none (message not signed)
 header.d=none;intel.com; dmarc=none action=none header.from=arm.com;
x-ms-publictraffictype: Email
X-MS-Office365-Filtering-Correlation-Id: b0163186-2fb7-43c1-16eb-08d98538a358
x-ms-traffictypediagnostic: DBBPR08MB4281:|DBBPR08MB6235:
x-ms-exchange-transport-forked: True
X-Microsoft-Antispam-PRVS: <DBBPR08MB6235DA4E511BDFB46039391598AC9@DBBPR08MB6235.eurprd08.prod.outlook.com>
x-checkrecipientrouted: true
nodisclaimer: true
x-ms-oob-tlc-oobclassifiers: OLM:7219;OLM:7219;
X-MS-Exchange-SenderADCheck: 1
X-MS-Exchange-AntiSpam-Relay: 0
X-Microsoft-Antispam-Untrusted: BCL:0;
X-Microsoft-Antispam-Message-Info-Original: rJbm5Po0nfCC3y7qCuATp7dy2HBVpT1bXsEqtHzWD1tmm8872ZpLeTBLE8NZ60cI1VRVb7eMKpi76KrYWKT2reny8OFC1J/zeX0WqQerFHLSZG8lnG74k1FCyFewB1Nr/W7NJben7f07A/hZtko6a7JyBZji67xPyw8d+0crZBgWcpRArJkHRS9Txyk+ua1cAs9bWPHmwkuwAmQcc6T3GvDNjiwt+rgtF47Yma/Zf5AnAKUZDVSRTpv/CtQ6+dxGN4rkWvei6UwmTzVAwgQ9G1c5fl9tXsag7wI2t1DlxsJ83LUcWIHmW70/cdmpU4avk5mSw3GE2m7nPFTBx8ZjREI9h878rA/tu7u/su+5Fy6cfQALGubQo2BFu9xMcp7PX6WIOSlStHFem4dEOjUG68cgue5zoKbaH3gnFdC3umUus71+WlnbUy08vylqXsvnqC5AOIVfm4mp9khTZVFs0JQpJw/4MFQ3w+zB3M12Sv/uK+sYAJvU3nFg23zkRN1jMB7v3qSsykArm/vX4XmGpIO2YwcYK6cvhhZqyAqMmoeCwLi/ADvciY+5TEo6Jk0YZdbNMzXEBjKqpHzVCns9RaWVNGVoD4KgwbqK6jmCwSMd8nUghh7KAFXkAj9FdkpMGUXZAR4Rm7f7FNyuowU1eeNQFTfHf6prIsmtHcN4Yi2AYwikBx4lpmin/vRW6Xw+dmC7RwFAwHYnPSusqwZoYw==
X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en;
 SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DBAPR08MB5814.eurprd08.prod.outlook.com;
 PTR:; CAT:NONE;
 SFS:(4636009)(366004)(6506007)(33656002)(8936002)(55016002)(38070700005)(8676002)(7696005)(71200400001)(2906002)(83380400001)(66446008)(66946007)(5660300002)(54906003)(110136005)(316002)(508600001)(76116006)(66476007)(64756008)(66556008)(26005)(86362001)(186003)(52536014)(4326008)(122000001)(9686003)(38100700002);
 DIR:OUT; SFP:1101; 
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-MS-Exchange-Transport-CrossTenantHeadersStamped: DBBPR08MB4281
Original-Authentication-Results: intel.com; dkim=none (message not signed)
 header.d=none;intel.com; dmarc=none action=none header.from=arm.com;
X-EOPAttributedMessage: 0
X-MS-Exchange-Transport-CrossTenantHeadersStripped: AM5EUR03FT023.eop-EUR03.prod.protection.outlook.com
X-MS-Office365-Filtering-Correlation-Id-Prvs: 4f94009d-afff-4d04-7571-08d9853898a7
X-Microsoft-Antispam: BCL:0;
X-Microsoft-Antispam-Message-Info: 2KU/h/JYRs8wude2NqZPq/WlE2XNhx/oAAKDQUSb+nZdxI6+c3wkzwVftlZ+6PdE9qc09O2abCKHV2wdJMgvduCG1HoWbEO/pvPnMto7FtGaFXjFTNlAbBnq8k/ABPB2qHZl55KWyf1VDB3HeDhlKZV/2DMVUNAdGB3p6V9CPYxUTgUXPbd4mqvtnMeHtW2kn2hF0QD8XLFbIS54fWW5nv9x8hLsxK0E7QfA25bgF+7z1EInkHUmjdQKHnBkDo5kLacU61zm8Li5uG2c4sXm7BhpES6PfW9SdL9irQmc0PJaFVoZ6srjwNfUnlm/21OoAqsoFgZJ9Aa8KRirI1C6mDfFJT4b6IMUB5yXmstsw93JVLlqcRcEmSERWS11dX9UnO8AckNsMAIQCkb5ehsJcewMGBeBYDzGiCSw2ip9F4ckuizvwXaOBVce3wkl1+hkA8zxt3D5MPcjVtDkzjOSXZ3BdoUPyaP+njhFAsHchUycePwawHc0rZbTz7ls/5RNggmVwQhbhpwoKa+qlrrSMkcXWjdTy4Tpt6F/CO3Nv1Jqz5SSRlCJIweElbd3/irYJzeyj608NKQjC9lRmNNN5BFc8PL+p1hx7944/CmcBdsipdQts18mrYx7TYEtKe2bNLN7Fyf8cCpNva5DBU0qXMr8GjzdiCEnU24oUzO0KDVIX/vU5DAaU2diJT/AiZCpAwNbvQHsQW9EkeI93eADwQ==
X-Forefront-Antispam-Report: CIP:63.33.187.114; CTRY:IE; LANG:en; SCL:1; SRV:;
 IPV:CAL; SFV:NSPM; H:64aa7808-outbound-2.mta.getcheckrecipient.com;
 PTR:ec2-63-33-187-114.eu-west-1.compute.amazonaws.com; CAT:NONE;
 SFS:(4636009)(36840700001)(46966006)(2906002)(52536014)(8936002)(6506007)(70586007)(70206006)(7696005)(4326008)(33656002)(26005)(81166007)(356005)(54906003)(36860700001)(86362001)(9686003)(508600001)(5660300002)(186003)(110136005)(8676002)(55016002)(47076005)(336012)(82310400003)(316002)(83380400001);
 DIR:OUT; SFP:1101; 
X-OriginatorOrg: arm.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 02 Oct 2021 00:07:35.3998 (UTC)
X-MS-Exchange-CrossTenant-Network-Message-Id: b0163186-2fb7-43c1-16eb-08d98538a358
X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.33.187.114];
 Helo=[64aa7808-outbound-2.mta.getcheckrecipient.com]
X-MS-Exchange-CrossTenant-AuthSource: AM5EUR03FT023.eop-EUR03.prod.protection.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: DBBPR08MB6235
Subject: Re: [dpdk-dev] [RFC] mempool: implement index-based per core cache
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

<snip>
>=20
> > Current mempool per core cache implementation is based on pointer For
> > most architectures, each pointer consumes 64b Replace it with
> > index-based implementation, where in each buffer is addressed by (pool
> > address + index)
>=20
> I don't think it is going to work:
> On 64-bit systems difference between pool address and it's elem address
> could be bigger than 4GB.
Are you talking about a case where the memory pool size is more than 4GB?

>=20
> > It will reduce memory requirements
> >
> > L3Fwd performance testing reveals minor improvements in the cache
> > performance and no change in throughput
> >
> > Micro-benchmarking the patch using mempool_perf_test shows significant
> > improvement with majority of the test cases
> >
> > Future plan involves replacing global pool's pointer-based
> > implementation with index-based implementation
> >
> > Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > ---
> >  drivers/mempool/ring/rte_mempool_ring.c |  2 +-
> >  lib/mempool/rte_mempool.c               |  8 +++
> >  lib/mempool/rte_mempool.h               | 74 ++++++++++++++++++++++---
> >  3 files changed, 74 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/mempool/ring/rte_mempool_ring.c
> > b/drivers/mempool/ring/rte_mempool_ring.c
> > index b1f09ff28f4d..e55913e47f21 100644
> > --- a/drivers/mempool/ring/rte_mempool_ring.c
> > +++ b/drivers/mempool/ring/rte_mempool_ring.c
> > @@ -101,7 +101,7 @@ ring_alloc(struct rte_mempool *mp, uint32_t
> rg_flags)
> >  		return -rte_errno;
> >
> >  	mp->pool_data =3D r;
> > -
> > +	mp->local_cache_base_addr =3D &r[1];
> >  	return 0;
> >  }
> >
> > diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> > index 59a588425bd6..424bdb19c323 100644
> > --- a/lib/mempool/rte_mempool.c
> > +++ b/lib/mempool/rte_mempool.c
> > @@ -480,6 +480,7 @@ rte_mempool_populate_default(struct
> rte_mempool *mp)
> >  	int ret;
> >  	bool need_iova_contig_obj;
> >  	size_t max_alloc_size =3D SIZE_MAX;
> > +	unsigned lcore_id;
> >
> >  	ret =3D mempool_ops_alloc_once(mp);
> >  	if (ret !=3D 0)
> > @@ -600,6 +601,13 @@ rte_mempool_populate_default(struct
> rte_mempool *mp)
> >  		}
> >  	}
> >
> > +	/* Init all default caches. */
> > +	if (mp->cache_size !=3D 0) {
> > +		for (lcore_id =3D 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
> > +			mp->local_cache[lcore_id].local_cache_base_value =3D
> > +				*(void **)mp->local_cache_base_addr;
> > +	}
> > +
> >  	rte_mempool_trace_populate_default(mp);
> >  	return mp->size;
> >
> > diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> > index 4235d6f0bf2b..545405c0d3ce 100644
> > --- a/lib/mempool/rte_mempool.h
> > +++ b/lib/mempool/rte_mempool.h
> > @@ -51,6 +51,8 @@
> >  #include <rte_memcpy.h>
> >  #include <rte_common.h>
> >
> > +#include <arm_neon.h>
> > +
> >  #include "rte_mempool_trace_fp.h"
> >
> >  #ifdef __cplusplus
> > @@ -91,11 +93,12 @@ struct rte_mempool_cache {
> >  	uint32_t size;	      /**< Size of the cache */
> >  	uint32_t flushthresh; /**< Threshold before we flush excess elements
> */
> >  	uint32_t len;	      /**< Current cache count */
> > +	void *local_cache_base_value; /**< Base value to calculate indices
> > +*/
> >  	/*
> >  	 * Cache is allocated to this size to allow it to overflow in certain
> >  	 * cases to avoid needless emptying of cache.
> >  	 */
> > -	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache
> objects */
> > +	uint32_t objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache
> objects */
> >  } __rte_cache_aligned;
> >
> >  /**
> > @@ -172,7 +175,6 @@ struct rte_mempool_objtlr {
> >   * A list of memory where objects are stored
> >   */
> >  STAILQ_HEAD(rte_mempool_memhdr_list, rte_mempool_memhdr);
> > -
> >  /**
> >   * Callback used to free a memory chunk
> >   */
> > @@ -244,6 +246,7 @@ struct rte_mempool {
> >  	int32_t ops_index;
> >
> >  	struct rte_mempool_cache *local_cache; /**< Per-lcore local cache */
> > +	void *local_cache_base_addr; /**< Reference to the base value */
> >
> >  	uint32_t populated_size;         /**< Number of populated objects. */
> >  	struct rte_mempool_objhdr_list elt_list; /**< List of objects in
> > pool */ @@ -1269,7 +1272,15 @@ rte_mempool_cache_flush(struct
> rte_mempool_cache *cache,
> >  	if (cache =3D=3D NULL || cache->len =3D=3D 0)
> >  		return;
> >  	rte_mempool_trace_cache_flush(cache, mp);
> > -	rte_mempool_ops_enqueue_bulk(mp, cache->objs, cache->len);
> > +
> > +	unsigned int i;
> > +	unsigned int cache_len =3D cache->len;
> > +	void *obj_table[RTE_MEMPOOL_CACHE_MAX_SIZE * 3];
> > +	void *base_value =3D cache->local_cache_base_value;
> > +	uint32_t *cache_objs =3D cache->objs;
> > +	for (i =3D 0; i < cache_len; i++)
> > +		obj_table[i] =3D (void *) RTE_PTR_ADD(base_value,
> cache_objs[i]);
> > +	rte_mempool_ops_enqueue_bulk(mp, obj_table, cache->len);
> >  	cache->len =3D 0;
> >  }
> >
> > @@ -1289,7 +1300,9 @@ static __rte_always_inline void
> > __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
> >  		      unsigned int n, struct rte_mempool_cache *cache)  {
> > -	void **cache_objs;
> > +	uint32_t *cache_objs;
> > +	void *base_value;
> > +	uint32_t i;
> >
> >  	/* increment stat now, adding in mempool always success */
> >  	__MEMPOOL_STAT_ADD(mp, put_bulk, 1); @@ -1301,6 +1314,12
> @@
> > __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
> >
> >  	cache_objs =3D &cache->objs[cache->len];
> >
> > +	base_value =3D cache->local_cache_base_value;
> > +
> > +	uint64x2_t v_obj_table;
> > +	uint64x2_t v_base_value =3D vdupq_n_u64((uint64_t)base_value);
> > +	uint32x2_t v_cache_objs;
> > +
> >  	/*
> >  	 * The cache follows the following algorithm
> >  	 *   1. Add the objects to the cache
> > @@ -1309,12 +1328,26 @@ __mempool_generic_put(struct rte_mempool
> *mp, void * const *obj_table,
> >  	 */
> >
> >  	/* Add elements back into the cache */
> > -	rte_memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);
> > +
> > +#if defined __ARM_NEON
> > +	for (i =3D 0; i < (n & ~0x1); i+=3D2) {
> > +		v_obj_table =3D vld1q_u64((const uint64_t *)&obj_table[i]);
> > +		v_cache_objs =3D vqmovn_u64(vsubq_u64(v_obj_table,
> v_base_value));
> > +		vst1_u32(cache_objs + i, v_cache_objs);
> > +	}
> > +	if (n & 0x1) {
> > +		cache_objs[i] =3D (uint32_t) RTE_PTR_DIFF(obj_table[i],
> base_value);
> > +	}
> > +#else
> > +	for (i =3D 0; i < n; i++) {
> > +		cache_objs[i] =3D (uint32_t) RTE_PTR_DIFF(obj_table[i],
> base_value);
> > +	}
> > +#endif
> >
> >  	cache->len +=3D n;
> >
> >  	if (cache->len >=3D cache->flushthresh) {
> > -		rte_mempool_ops_enqueue_bulk(mp, &cache->objs[cache-
> >size],
> > +		rte_mempool_ops_enqueue_bulk(mp, obj_table + cache->len
> -
> > +cache->size,
> >  				cache->len - cache->size);
> >  		cache->len =3D cache->size;
> >  	}
> > @@ -1415,23 +1448,26 @@ __mempool_generic_get(struct rte_mempool
> *mp, void **obj_table,
> >  		      unsigned int n, struct rte_mempool_cache *cache)  {
> >  	int ret;
> > +	uint32_t i;
> >  	uint32_t index, len;
> > -	void **cache_objs;
> > +	uint32_t *cache_objs;
> >
> >  	/* No cache provided or cannot be satisfied from cache */
> >  	if (unlikely(cache =3D=3D NULL || n >=3D cache->size))
> >  		goto ring_dequeue;
> >
> > +	void *base_value =3D cache->local_cache_base_value;
> >  	cache_objs =3D cache->objs;
> >
> >  	/* Can this be satisfied from the cache? */
> >  	if (cache->len < n) {
> >  		/* No. Backfill the cache first, and then fill from it */
> >  		uint32_t req =3D n + (cache->size - cache->len);
> > +		void *temp_objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3];
> /**< Cache objects
> > +*/
> >
> >  		/* How many do we require i.e. number to fill the cache + the
> request */
> >  		ret =3D rte_mempool_ops_dequeue_bulk(mp,
> > -			&cache->objs[cache->len], req);
> > +			temp_objs, req);
> >  		if (unlikely(ret < 0)) {
> >  			/*
> >  			 * In the off chance that we are buffer constrained,
> @@ -1442,12
> > +1478,32 @@ __mempool_generic_get(struct rte_mempool *mp, void
> **obj_table,
> >  			goto ring_dequeue;
> >  		}
> >
> > +		len =3D cache->len;
> > +		for (i =3D 0; i < req; ++i, ++len) {
> > +			cache_objs[len] =3D (uint32_t)
> RTE_PTR_DIFF(temp_objs[i], base_value);
> > +		}
> > +
> >  		cache->len +=3D req;
> >  	}
> >
> > +	uint64x2_t v_obj_table;
> > +	uint64x2_t v_cache_objs;
> > +	uint64x2_t v_base_value =3D vdupq_n_u64((uint64_t)base_value);
> > +
> >  	/* Now fill in the response ... */
> > +#if defined __ARM_NEON
> > +	for (index =3D 0, len =3D cache->len - 1; index < (n & ~0x1); index+=
=3D2,
> > +						len-=3D2, obj_table+=3D2) {
> > +		v_cache_objs =3D vmovl_u32(vld1_u32(cache_objs + len - 1));
> > +		v_obj_table =3D vaddq_u64(v_cache_objs, v_base_value);
> > +		vst1q_u64((uint64_t *)obj_table, v_obj_table);
> > +	}
> > +	if (n & 0x1)
> > +		*obj_table =3D (void *) RTE_PTR_ADD(base_value,
> cache_objs[len]);
> > +#else
> >  	for (index =3D 0, len =3D cache->len - 1; index < n; ++index, len--,
> obj_table++)
> > -		*obj_table =3D cache_objs[len];
> > +		*obj_table =3D (void *) RTE_PTR_ADD(base_value,
> cache_objs[len]);
> > +#endif
> >
> >  	cache->len -=3D n;
> >
> > --
> > 2.17.1