From: Morten Brørup
To: dev@dpdk.org
Subject: [RFC]: mempool: zero-copy cache get bulk
Date: Sat, 5 Nov 2022 14:19:13 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D87488@smartserver.smartshare.dk>

Zero-copy access to the mempool cache is beneficial for PMD performance, and must be provided by the mempool library to fix [Bug 1052] without a performance regression.

[Bug 1052]: https://bugs.dpdk.org/show_bug.cgi?id=1052

This RFC offers two conceptual variants of zero-copy get:
1. A simple version.
2. A version where existing (hot) objects in the cache are moved to the top of the cache before new objects from the backend driver are pulled in.

I would like some early feedback. Also, which variant do you prefer?

Notes:
* Allowing the 'cache' parameter to be NULL, and getting it from the mempool instead, was inspired by rte_mempool_cache_flush().
* Asserting that the 'mp' parameter is not NULL is not done by other functions, so I omitted it here too.

NB: Please ignore formatting. Also, this code has not even been compile tested.

PS: No promises, but I expect to offer an RFC for zero-copy put too. :-)

1. Simple version:

/**
 * Get objects from a mempool via zero-copy access to a user-owned mempool cache.
 *
 * @param cache
 *   A pointer to the mempool cache.
 * @param mp
 *   A pointer to the mempool.
 * @param n
 *   The number of objects to get from the mempool cache.
 * @return
 *   A pointer to the objects in the mempool cache,
 *   or NULL on error, with rte_errno set appropriately.
 */
static __rte_always_inline void *
rte_mempool_cache_get_bulk(struct rte_mempool_cache *cache,
		struct rte_mempool *mp,
		unsigned int n)
{
	unsigned int len;
	int ret;

	if (cache == NULL)
		cache = rte_mempool_default_cache(mp, rte_lcore_id());
	if (cache == NULL) {
		rte_errno = EINVAL;
		goto fail;
	}

	rte_mempool_trace_cache_get_bulk(cache, mp, n);

	len = cache->len;

	if (unlikely(n > len)) {
		unsigned int size;

		if (unlikely(n > RTE_MEMPOOL_CACHE_MAX_SIZE)) {
			rte_errno = EINVAL;
			goto fail;
		}

		/* Fill the cache from the backend; fetch size + requested - len objects. */
		size = cache->size;
		ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len],
				size + n - len);
		if (unlikely(ret < 0)) {
			/*
			 * We are buffer constrained.
			 * Do not fill the cache, just satisfy the request.
			 */
			ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len],
					n - len);
			if (unlikely(ret < 0)) {
				rte_errno = -ret;
				goto fail;
			}

			len = 0;
		} else
			len = size;
	} else
		len -= n;

	cache->len = len;

	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);

	return &cache->objs[len];

fail:
	RTE_MEMPOOL_STAT_ADD(mp, get_fail_bulk, 1);
	RTE_MEMPOOL_STAT_ADD(mp, get_fail_objs, n);

	return NULL;
}
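For context, here is a minimal usage sketch (hypothetical and untested, like the rest of this RFC); it applies equally to both variants, since they share the same signature. The rx_ring_refill() wrapper and its parameters are made up for illustration:

/*
 * Hypothetical usage sketch (untested): consume n objects directly
 * from the mempool cache, e.g. when refilling a PMD RX ring, instead
 * of having them copied into a local array first.
 */
static inline int
rx_ring_refill(struct rte_mempool *mp, void **ring_slots, unsigned int n)
{
	void **objs;
	unsigned int i;

	/* Passing NULL uses the default cache of the executing lcore. */
	objs = rte_mempool_cache_get_bulk(NULL, mp, n);
	if (objs == NULL)
		return -rte_errno; /* e.g. -EINVAL, or backend failure */

	/* Read the n object pointers in place; no intermediate copy. */
	for (i = 0; i < n; i++)
		ring_slots[i] = objs[i];

	return 0;
}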
2. Advanced version:

/**
 * Get objects from a mempool via zero-copy access to a user-owned mempool cache.
 *
 * @param cache
 *   A pointer to the mempool cache.
 * @param mp
 *   A pointer to the mempool.
 * @param n
 *   The number of objects to get from the mempool cache.
 * @return
 *   A pointer to the objects in the mempool cache,
 *   or NULL on error, with rte_errno set appropriately.
 */
static __rte_always_inline void *
rte_mempool_cache_get_bulk(struct rte_mempool_cache *cache,
		struct rte_mempool *mp,
		unsigned int n)
{
	unsigned int len;
	int ret;

	if (cache == NULL)
		cache = rte_mempool_default_cache(mp, rte_lcore_id());
	if (cache == NULL) {
		rte_errno = EINVAL;
		goto fail;
	}

	rte_mempool_trace_cache_get_bulk(cache, mp, n);

	len = cache->len;

	if (unlikely(n > len)) {
		unsigned int size;

		if (unlikely(n > RTE_MEMPOOL_CACHE_MAX_SIZE)) {
			rte_errno = EINVAL;
			goto fail;
		}

		/* Fill the cache from the backend; fetch size + requested - len objects. */
		size = cache->size;
		if (likely(size + n >= 2 * len)) {
			/*
			 * No overlap when copying (dst index >= len, i.e. at or
			 * above the end of the source): size + n - len >= len.
			 * Move (i.e. copy) the existing objects in the cache to the
			 * coming top of the cache, to make room for new objects below.
			 * NB: rte_memcpy() takes a size in bytes; objs[] holds pointers.
			 */
			rte_memcpy(&cache->objs[size + n - len], &cache->objs[0],
					len * sizeof(void *));

			/* Fill the cache below the existing objects in the cache. */
			ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[0],
					size + n - len);
			if (unlikely(ret < 0)) {
				/*
				 * The copy above left the source objects intact at the
				 * bottom of the cache, so the constrained path can
				 * still use them.
				 */
				goto constrained;
			} else
				len = size;
		} else {
			/* Fill the cache on top of any objects in it. */
			ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len],
					size + n - len);
			if (unlikely(ret < 0)) {

constrained:
				/*
				 * We are buffer constrained.
				 * Do not fill the cache, just satisfy the request.
				 */
				ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len],
						n - len);
				if (unlikely(ret < 0)) {
					rte_errno = -ret;
					goto fail;
				}

				len = 0;
			} else
				len = size;
		}
	} else
		len -= n;

	cache->len = len;

	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);

	return &cache->objs[len];

fail:
	RTE_MEMPOOL_STAT_ADD(mp, get_fail_bulk, 1);
	RTE_MEMPOOL_STAT_ADD(mp, get_fail_objs, n);

	return NULL;
}

Med venlig hilsen / Kind regards,
-Morten Brørup
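PS: To illustrate the index arithmetic of variant 2, assume e.g. size = 32, len = 4 and n = 8: the 4 hot objects are copied from objs[0..3] to objs[36..39], the backend then fills objs[0..35], and the function returns &objs[32], so the hot objects end up at the top of the returned block while 32 objects remain in the cache. A throwaway check of that arithmetic (example values only, not part of the proposal):

#include <assert.h>

int main(void)
{
	/* Example values: nominal cache size, current fill level, request. */
	unsigned int size = 32, len = 4, n = 8;

	/* The branch condition tested by variant 2 before moving objects. */
	assert(size + n >= 2 * len);
	/* Hence the copy destination starts at or above the source end,
	 * so the copy cannot overlap. */
	assert(size + n - len >= len);
	/* After the refill, the cache holds size + n objects: the top n
	 * (starting at index size) satisfy the request, and cache->len
	 * = size objects remain. */
	assert((size + n - len) + len == size + n);
	return 0;
}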