From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Ananyev, Konstantin"
To: Honnappa Nagarahalli, Dharmik Thakkar, Olivier Matz, "Andrew Rybchenko"
CC: "dev@dpdk.org", nd, Ruifeng Wang, nd
Date: Sat, 2 Oct 2021 18:51:40 +0000
Subject: Re: [dpdk-dev] [RFC] mempool: implement index-based per core cache
References: <20210930172735.2675627-1-dharmik.thakkar@arm.com>
List-Id: DPDK patches and discussions

> > > Current mempool per core cache implementation is based on pointer For
> > > most architectures, each pointer consumes 64b Replace it with
> > > index-based implementation, where in each buffer is addressed by (pool
> > > address + index)
> >
> > I don't think it is going to work:
> > On 64-bit systems difference between pool address and it's elem address
> > could be bigger than 4GB.
> Are you talking about a case where the memory pool size is more than 4GB?

That is one possible scenario.
Another possibility: the user populates the mempool with external memory
by calling rte_mempool_populate_iova() directly.
I suppose such a situation can occur even with a normal
rte_mempool_create(), though it should be really rare.

>
> >
> > > It will reduce memory requirements
> > >
> > > L3Fwd performance testing reveals minor improvements in the cache
> > > performance and no change in throughput
> > >
> > > Micro-benchmarking the patch using mempool_perf_test shows significant
> > > improvement with majority of the test cases
> > >
> > > Future plan involves replacing global pool's pointer-based
> > > implementation with index-based implementation
> > >
> > > Signed-off-by: Dharmik Thakkar
> > > ---
> > >  drivers/mempool/ring/rte_mempool_ring.c |  2 +-
> > >  lib/mempool/rte_mempool.c               |  8 +++
> > >  lib/mempool/rte_mempool.h               | 74 ++++++++++++++++++++++---
> > >  3 files changed, 74 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/mempool/ring/rte_mempool_ring.c
> > > b/drivers/mempool/ring/rte_mempool_ring.c
> > > index b1f09ff28f4d..e55913e47f21 100644
> > > --- a/drivers/mempool/ring/rte_mempool_ring.c
> > > +++ b/drivers/mempool/ring/rte_mempool_ring.c
> > > @@ -101,7 +101,7 @@ ring_alloc(struct rte_mempool *mp, uint32_t
> > rg_flags)
> > >  		return -rte_errno;
> > >
> > >  	mp->pool_data = r;
> > > -
> > > +	mp->local_cache_base_addr = &r[1];
> > >  	return 0;
> > >  }
> > >
> > > diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> > > index 59a588425bd6..424bdb19c323 100644
> > > --- a/lib/mempool/rte_mempool.c
> > > +++ b/lib/mempool/rte_mempool.c
> > > @@ -480,6 +480,7 @@ rte_mempool_populate_default(struct
> > rte_mempool *mp)
> > >  	int ret;
> > >  	bool need_iova_contig_obj;
> > >  	size_t max_alloc_size = SIZE_MAX;
> > > +	unsigned lcore_id;
> > >
> > >  	ret = mempool_ops_alloc_once(mp);
> > >  	if (ret != 0)
> > > @@ -600,6 +601,13 @@ rte_mempool_populate_default(struct
> > rte_mempool *mp)
> > >  		}
> > >  	}
> > >
> > > +	/* Init all default caches. */
> > > +	if (mp->cache_size != 0) {
> > > +		for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
> > > +			mp->local_cache[lcore_id].local_cache_base_value =
> > > +				*(void **)mp->local_cache_base_addr;
> > > +	}
> > > +
> > >  	rte_mempool_trace_populate_default(mp);
> > >  	return mp->size;
> > >
> > > diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> > > index 4235d6f0bf2b..545405c0d3ce 100644
> > > --- a/lib/mempool/rte_mempool.h
> > > +++ b/lib/mempool/rte_mempool.h
> > > @@ -51,6 +51,8 @@
> > >  #include
> > >  #include
> > >
> > > +#include
> > > +
> > >  #include "rte_mempool_trace_fp.h"
> > >
> > >  #ifdef __cplusplus
> > > @@ -91,11 +93,12 @@ struct rte_mempool_cache {
> > >  	uint32_t size; /**< Size of the cache */
> > >  	uint32_t flushthresh; /**< Threshold before we flush excess elements
> > */
> > >  	uint32_t len; /**< Current cache count */
> > > +	void *local_cache_base_value; /**< Base value to calculate indices
> > > +*/
> > >  	/*
> > >  	 * Cache is allocated to this size to allow it to overflow in certain
> > >  	 * cases to avoid needless emptying of cache.
> > >  	 */
> > > -	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache
> > objects */
> > > +	uint32_t objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache
> > objects */
> > >  } __rte_cache_aligned;
> > >
> > >  /**
> > > @@ -172,7 +175,6 @@ struct rte_mempool_objtlr {
> > >   * A list of memory where objects are stored
> > >   */
> > >  STAILQ_HEAD(rte_mempool_memhdr_list, rte_mempool_memhdr);
> > > -
> > >  /**
> > >   * Callback used to free a memory chunk
> > >   */
> > > @@ -244,6 +246,7 @@ struct rte_mempool {
> > >  	int32_t ops_index;
> > >
> > >  	struct rte_mempool_cache *local_cache; /**< Per-lcore local cache */
> > > +	void *local_cache_base_addr; /**< Reference to the base value */
> > >
> > >  	uint32_t populated_size; /**< Number of populated objects. */
> > >  	struct rte_mempool_objhdr_list elt_list; /**< List of objects in
> > > pool */ @@ -1269,7 +1272,15 @@ rte_mempool_cache_flush(struct
> > rte_mempool_cache *cache,
> > >  	if (cache == NULL || cache->len == 0)
> > >  		return;
> > >  	rte_mempool_trace_cache_flush(cache, mp);
> > > -	rte_mempool_ops_enqueue_bulk(mp, cache->objs, cache->len);
> > > +
> > > +	unsigned int i;
> > > +	unsigned int cache_len = cache->len;
> > > +	void *obj_table[RTE_MEMPOOL_CACHE_MAX_SIZE * 3];
> > > +	void *base_value = cache->local_cache_base_value;
> > > +	uint32_t *cache_objs = cache->objs;
> > > +	for (i = 0; i < cache_len; i++)
> > > +		obj_table[i] = (void *) RTE_PTR_ADD(base_value,
> > cache_objs[i]);
> > > +	rte_mempool_ops_enqueue_bulk(mp, obj_table, cache->len);
> > >  	cache->len = 0;
> > >  }
> > >
> > > @@ -1289,7 +1300,9 @@ static __rte_always_inline void
> > > __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
> > >  		      unsigned int n, struct rte_mempool_cache *cache) {
> > > -	void **cache_objs;
> > > +	uint32_t *cache_objs;
> > > +	void *base_value;
> > > +	uint32_t i;
> > >
> > >  	/* increment stat now, adding in mempool always success */
> > >  	__MEMPOOL_STAT_ADD(mp, put_bulk, 1); @@ -1301,6 +1314,12
> > @@
> > > __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
> > >
> > >  	cache_objs = &cache->objs[cache->len];
> > >
> > > +	base_value = cache->local_cache_base_value;
> > > +
> > > +	uint64x2_t v_obj_table;
> > > +	uint64x2_t v_base_value = vdupq_n_u64((uint64_t)base_value);
> > > +	uint32x2_t v_cache_objs;
> > > +
> > >  	/*
> > >  	 * The cache follows the following algorithm
> > >  	 *   1. Add the objects to the cache
> > > @@ -1309,12 +1328,26 @@ __mempool_generic_put(struct rte_mempool
> > *mp, void * const *obj_table,
> > >  	 */
> > >
> > >  	/* Add elements back into the cache */
> > > -	rte_memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);
> > > +
> > > +#if defined __ARM_NEON
> > > +	for (i = 0; i < (n & ~0x1); i+=2) {
> > > +		v_obj_table = vld1q_u64((const uint64_t *)&obj_table[i]);
> > > +		v_cache_objs = vqmovn_u64(vsubq_u64(v_obj_table,
> > v_base_value));
> > > +		vst1_u32(cache_objs + i, v_cache_objs);
> > > +	}
> > > +	if (n & 0x1) {
> > > +		cache_objs[i] = (uint32_t) RTE_PTR_DIFF(obj_table[i],
> > base_value);
> > > +	}
> > > +#else
> > > +	for (i = 0; i < n; i++) {
> > > +		cache_objs[i] = (uint32_t) RTE_PTR_DIFF(obj_table[i],
> > base_value);
> > > +	}
> > > +#endif
> > >
> > >  	cache->len += n;
> > >
> > >  	if (cache->len >= cache->flushthresh) {
> > > -		rte_mempool_ops_enqueue_bulk(mp, &cache->objs[cache-
> > >size],
> > > +		rte_mempool_ops_enqueue_bulk(mp, obj_table + cache->len
> > -
> > > +cache->size,
> > >  				cache->len - cache->size);
> > >  		cache->len = cache->size;
> > >  	}
> > > @@ -1415,23 +1448,26 @@ __mempool_generic_get(struct rte_mempool
> > *mp, void **obj_table,
> > >  		      unsigned int n, struct rte_mempool_cache *cache) {
> > >  	int ret;
> > > +	uint32_t i;
> > >  	uint32_t index, len;
> > > -	void **cache_objs;
> > > +	uint32_t *cache_objs;
> > >
> > >  	/* No cache provided or cannot be satisfied from cache */
> > >  	if (unlikely(cache == NULL || n >= cache->size))
> > >  		goto ring_dequeue;
> > >
> > > +	void *base_value = cache->local_cache_base_value;
> > >  	cache_objs = cache->objs;
> > >
> > >  	/* Can this be satisfied from the cache? */
> > >  	if (cache->len < n) {
> > >  		/* No. Backfill the cache first, and then fill from it */
> > >  		uint32_t req = n + (cache->size - cache->len);
> > > +		void *temp_objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3];
> > /**< Cache objects
> > > +*/
> > >
> > >  		/* How many do we require i.e. number to fill the cache + the
> > request */
> > >  		ret = rte_mempool_ops_dequeue_bulk(mp,
> > > -			&cache->objs[cache->len], req);
> > > +			temp_objs, req);
> > >  		if (unlikely(ret < 0)) {
> > >  			/*
> > >  			 * In the off chance that we are buffer constrained,
> > @@ -1442,12
> > > +1478,32 @@ __mempool_generic_get(struct rte_mempool *mp, void
> > **obj_table,
> > >  			goto ring_dequeue;
> > >  		}
> > >
> > > +		len = cache->len;
> > > +		for (i = 0; i < req; ++i, ++len) {
> > > +			cache_objs[len] = (uint32_t)
> > RTE_PTR_DIFF(temp_objs[i], base_value);
> > > +		}
> > > +
> > >  		cache->len += req;
> > >  	}
> > >
> > > +	uint64x2_t v_obj_table;
> > > +	uint64x2_t v_cache_objs;
> > > +	uint64x2_t v_base_value = vdupq_n_u64((uint64_t)base_value);
> > > +
> > >  	/* Now fill in the response ... */
> > > +#if defined __ARM_NEON
> > > +	for (index = 0, len = cache->len - 1; index < (n & ~0x1); index+=2,
> > > +						len-=2, obj_table+=2) {
> > > +		v_cache_objs = vmovl_u32(vld1_u32(cache_objs + len - 1));
> > > +		v_obj_table = vaddq_u64(v_cache_objs, v_base_value);
> > > +		vst1q_u64((uint64_t *)obj_table, v_obj_table);
> > > +	}
> > > +	if (n & 0x1)
> > > +		*obj_table = (void *) RTE_PTR_ADD(base_value,
> > cache_objs[len]);
> > > +#else
> > >  	for (index = 0, len = cache->len - 1; index < n; ++index, len--,
> > obj_table++)
> > > -		*obj_table = cache_objs[len];
> > > +		*obj_table = (void *) RTE_PTR_ADD(base_value,
> > cache_objs[len]);
> > > +#endif
> > >
> > >  	cache->len -= n;
> > >
> > > --
> > > 2.17.1
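
To make the 4GB concern raised at the top of this reply concrete, here is
a minimal sketch of the pointer-to-index conversion. It is not taken from
the patch: the helper names obj_to_index(), index_to_obj(),
unchecked_index() and index_demo() are hypothetical, the addresses are
made-up numbers that are never dereferenced, and plain uint64_t values
stand in for pointers so the arithmetic is the same on any host.
unchecked_index() mirrors the (uint32_t) RTE_PTR_DIFF(obj, base_value)
cast in the quoted diff, while obj_to_index() shows one possible range
check.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: unchecked conversion, as in "(uint32_t) RTE_PTR_DIFF(...)".
 * Offsets of 4 GiB or more silently lose their upper bits. */
static uint32_t
unchecked_index(uint64_t base, uint64_t obj)
{
	return (uint32_t)(obj - base);
}

/* Hypothetical: checked conversion.
 * Returns 0 on success, -1 if obj is below base or the byte offset
 * does not fit in 32 bits. */
static int
obj_to_index(uint64_t base, uint64_t obj, uint32_t *idx)
{
	if (obj < base || obj - base > UINT32_MAX)
		return -1;
	*idx = (uint32_t)(obj - base);
	return 0;
}

/* Decode step, as in RTE_PTR_ADD(base_value, cache_objs[i]). */
static uint64_t
index_to_obj(uint64_t base, uint32_t idx)
{
	return base + idx;
}

/* Walk through one element close to the base and one 5 GiB away. */
void
index_demo(void)
{
	const uint64_t base = UINT64_C(0x100000000);
	const uint64_t near_obj = base + UINT64_C(0x1000);
	const uint64_t far_obj = base + UINT64_C(0x140000000); /* +5 GiB */
	uint32_t idx = 0;

	/* An offset below 4 GiB round-trips exactly. */
	assert(obj_to_index(base, near_obj, &idx) == 0);
	assert(index_to_obj(base, idx) == near_obj);

	/* The unchecked cast wraps, so decoding yields a different address. */
	idx = unchecked_index(base, far_obj);
	printf("far offset stored as 0x%08x, decoded to 0x%llx, wanted 0x%llx\n",
	       (unsigned int)idx,
	       (unsigned long long)index_to_obj(base, idx),
	       (unsigned long long)far_obj);
	assert(index_to_obj(base, idx) != far_obj);

	/* The checked variant rejects the element instead. */
	assert(obj_to_index(base, far_obj, &idx) == -1);
}
```

Calling index_demo() from a main() exercises both cases. A check like
this only detects the problem; what to do with an element that is out of
range (fall back to pointers, refuse to enable the cache, etc.) is a
separate design question that the sketch does not answer.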