From: Lazaros Koromilas
To: dev@dpdk.org
Cc: Olivier Matz
Date: Wed, 29 Jun 2016 00:47:38 +0100
Message-Id: <1467157658-28935-4-git-send-email-l@nofutznetworks.com>
In-Reply-To: <1467157658-28935-1-git-send-email-l@nofutznetworks.com>
References: <1467042637-22907-1-git-send-email-olivier.matz@6wind.com>
 <1467157658-28935-1-git-send-email-l@nofutznetworks.com>
Subject: [dpdk-dev] [PATCH v5 3/3] mempool: allow for user-owned mempool caches

The mempool cache is only available to EAL threads as a per-lcore
resource. Change this so that the user can create and provide their own
cache on mempool get and put operations. This works with non-EAL threads
too. This commit introduces the new API calls:

    rte_mempool_cache_create(size, socket_id)
    rte_mempool_cache_free(cache)
    rte_mempool_cache_flush(cache, mp)
    rte_mempool_default_cache(mp, lcore_id)

Changes the API calls:

    rte_mempool_generic_put(mp, obj_table, n, cache, flags)
    rte_mempool_generic_get(mp, obj_table, n, cache, flags)

The cache-oblivious API calls use the per-lcore default local cache.

Signed-off-by: Lazaros Koromilas Acked-by: Olivier Matz --- app/test/test_mempool.c | 73 ++++++++--- app/test/test_mempool_perf.c | 73 +++++++++-- doc/guides/prog_guide/env_abstraction_layer.rst | 4 +- doc/guides/prog_guide/mempool_lib.rst | 6 +- lib/librte_mempool/rte_mempool.c | 66 +++++++++- lib/librte_mempool/rte_mempool.h | 164 +++++++++++++++++------- lib/librte_mempool/rte_mempool_version.map | 4 + 7 files changed, 308 insertions(+), 82 deletions(-) diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c index 55c2cbc..63c61f3 100644 --- a/app/test/test_mempool.c +++ b/app/test/test_mempool.c @@ -75,10 +75,16 @@ #define MAX_KEEP 16 #define MEMPOOL_SIZE ((rte_lcore_count()*(MAX_KEEP+RTE_MEMPOOL_CACHE_MAX_SIZE))-1) +#define LOG_ERR() printf("test failed at %s():%d\n", __func__, __LINE__) #define RET_ERR() do { \ - printf("test failed at %s():%d\n", __func__, __LINE__); \ + LOG_ERR(); \ return -1; \ } while (0) +#define GOTO_ERR(var, label) do { \ + LOG_ERR(); \ + var = -1; \ + goto label; \ + } while (0) static rte_atomic32_t synchro; @@ -191,7 +197,7 @@ my_obj_init(struct rte_mempool *mp, __attribute__((unused)) void *arg, /* basic tests (done on one core) */ static int -test_mempool_basic(struct rte_mempool *mp) +test_mempool_basic(struct rte_mempool *mp, int use_external_cache) { uint32_t *objnum; void **objtable; @@ -199,47 +205,62 @@ test_mempool_basic(struct rte_mempool *mp) char *obj_data; int ret = 0; unsigned i, j; + int offset; + struct rte_mempool_cache *cache; + + if (use_external_cache) { + /* Create a user-owned mempool cache. */ + cache = rte_mempool_cache_create(RTE_MEMPOOL_CACHE_MAX_SIZE, + SOCKET_ID_ANY); + if (cache == NULL) + RET_ERR(); + } else { + /* May be NULL if cache is disabled. 
*/ + cache = rte_mempool_default_cache(mp, rte_lcore_id()); + } /* dump the mempool status */ rte_mempool_dump(stdout, mp); printf("get an object\n"); - if (rte_mempool_get(mp, &obj) < 0) - RET_ERR(); + if (rte_mempool_generic_get(mp, &obj, 1, cache, 0) < 0) + GOTO_ERR(ret, out); rte_mempool_dump(stdout, mp); /* tests that improve coverage */ printf("get object count\n"); - if (rte_mempool_count(mp) != MEMPOOL_SIZE - 1) - RET_ERR(); + /* We have to count the extra caches, one in this case. */ + offset = use_external_cache ? 1 * cache->len : 0; + if (rte_mempool_count(mp) + offset != MEMPOOL_SIZE - 1) + GOTO_ERR(ret, out); printf("get private data\n"); if (rte_mempool_get_priv(mp) != (char *)mp + MEMPOOL_HEADER_SIZE(mp, mp->cache_size)) - RET_ERR(); + GOTO_ERR(ret, out); #ifndef RTE_EXEC_ENV_BSDAPP /* rte_mem_virt2phy() not supported on bsd */ printf("get physical address of an object\n"); if (rte_mempool_virt2phy(mp, obj) != rte_mem_virt2phy(obj)) - RET_ERR(); + GOTO_ERR(ret, out); #endif printf("put the object back\n"); - rte_mempool_put(mp, obj); + rte_mempool_generic_put(mp, &obj, 1, cache, 0); rte_mempool_dump(stdout, mp); printf("get 2 objects\n"); - if (rte_mempool_get(mp, &obj) < 0) - RET_ERR(); - if (rte_mempool_get(mp, &obj2) < 0) { - rte_mempool_put(mp, obj); - RET_ERR(); + if (rte_mempool_generic_get(mp, &obj, 1, cache, 0) < 0) + GOTO_ERR(ret, out); + if (rte_mempool_generic_get(mp, &obj2, 1, cache, 0) < 0) { + rte_mempool_generic_put(mp, &obj, 1, cache, 0); + GOTO_ERR(ret, out); } rte_mempool_dump(stdout, mp); printf("put the objects back\n"); - rte_mempool_put(mp, obj); - rte_mempool_put(mp, obj2); + rte_mempool_generic_put(mp, &obj, 1, cache, 0); + rte_mempool_generic_put(mp, &obj2, 1, cache, 0); rte_mempool_dump(stdout, mp); /* @@ -248,10 +269,10 @@ test_mempool_basic(struct rte_mempool *mp) */ objtable = malloc(MEMPOOL_SIZE * sizeof(void *)); if (objtable == NULL) - RET_ERR(); + GOTO_ERR(ret, out); for (i = 0; i < MEMPOOL_SIZE; i++) { - if 
(rte_mempool_get(mp, &objtable[i]) < 0) + if (rte_mempool_generic_get(mp, &objtable[i], 1, cache, 0) < 0) break; } @@ -273,13 +294,19 @@ test_mempool_basic(struct rte_mempool *mp) ret = -1; } - rte_mempool_put(mp, objtable[i]); + rte_mempool_generic_put(mp, &objtable[i], 1, cache, 0); } free(objtable); if (ret == -1) printf("objects were modified!\n"); +out: + if (use_external_cache) { + rte_mempool_cache_flush(cache, mp); + rte_mempool_cache_free(cache); + } + return ret; } @@ -631,11 +658,15 @@ test_mempool(void) rte_mempool_list_dump(stdout); /* basic tests without cache */ - if (test_mempool_basic(mp_nocache) < 0) + if (test_mempool_basic(mp_nocache, 0) < 0) goto err; /* basic tests with cache */ - if (test_mempool_basic(mp_cache) < 0) + if (test_mempool_basic(mp_cache, 0) < 0) + goto err; + + /* basic tests with user-owned cache */ + if (test_mempool_basic(mp_nocache, 1) < 0) goto err; /* more basic tests without cache */ diff --git a/app/test/test_mempool_perf.c b/app/test/test_mempool_perf.c index c5f8455..b80a1dd 100644 --- a/app/test/test_mempool_perf.c +++ b/app/test/test_mempool_perf.c @@ -78,6 +78,9 @@ * - One core without cache * - Two cores without cache * - Max. cores without cache + * - One core with user-owned cache + * - Two cores with user-owned cache + * - Max. 
cores with user-owned cache * * - Bulk size (*n_get_bulk*, *n_put_bulk*) * @@ -96,8 +99,21 @@ #define MAX_KEEP 128 #define MEMPOOL_SIZE ((rte_lcore_count()*(MAX_KEEP+RTE_MEMPOOL_CACHE_MAX_SIZE))-1) +#define LOG_ERR() printf("test failed at %s():%d\n", __func__, __LINE__) +#define RET_ERR() do { \ + LOG_ERR(); \ + return -1; \ + } while (0) +#define GOTO_ERR(var, label) do { \ + LOG_ERR(); \ + var = -1; \ + goto label; \ + } while (0) + static struct rte_mempool *mp; static struct rte_mempool *mp_cache, *mp_nocache; +static int use_external_cache; +static unsigned external_cache_size = RTE_MEMPOOL_CACHE_MAX_SIZE; static rte_atomic32_t synchro; @@ -134,15 +150,27 @@ per_lcore_mempool_test(__attribute__((unused)) void *arg) void *obj_table[MAX_KEEP]; unsigned i, idx; unsigned lcore_id = rte_lcore_id(); - int ret; + int ret = 0; uint64_t start_cycles, end_cycles; uint64_t time_diff = 0, hz = rte_get_timer_hz(); + struct rte_mempool_cache *cache; + + if (use_external_cache) { + /* Create a user-owned mempool cache. */ + cache = rte_mempool_cache_create(external_cache_size, + SOCKET_ID_ANY); + if (cache == NULL) + RET_ERR(); + } else { + /* May be NULL if cache is disabled. */ + cache = rte_mempool_default_cache(mp, lcore_id); + } /* n_get_bulk and n_put_bulk must be divisors of n_keep */ if (((n_keep / n_get_bulk) * n_get_bulk) != n_keep) - return -1; + GOTO_ERR(ret, out); if (((n_keep / n_put_bulk) * n_put_bulk) != n_keep) - return -1; + GOTO_ERR(ret, out); stats[lcore_id].enq_count = 0; @@ -157,12 +185,14 @@ per_lcore_mempool_test(__attribute__((unused)) void *arg) /* get n_keep objects by bulk of n_bulk */ idx = 0; while (idx < n_keep) { - ret = rte_mempool_get_bulk(mp, &obj_table[idx], - n_get_bulk); + ret = rte_mempool_generic_get(mp, + &obj_table[idx], + n_get_bulk, + cache, 0); if (unlikely(ret < 0)) { rte_mempool_dump(stdout, mp); /* in this case, objects are lost... 
*/ - return -1; + GOTO_ERR(ret, out); } idx += n_get_bulk; } @@ -170,8 +200,9 @@ per_lcore_mempool_test(__attribute__((unused)) void *arg) /* put the objects back */ idx = 0; while (idx < n_keep) { - rte_mempool_put_bulk(mp, &obj_table[idx], - n_put_bulk); + rte_mempool_generic_put(mp, &obj_table[idx], + n_put_bulk, + cache, 0); idx += n_put_bulk; } } @@ -180,7 +211,13 @@ per_lcore_mempool_test(__attribute__((unused)) void *arg) stats[lcore_id].enq_count += N; } - return 0; +out: + if (use_external_cache) { + rte_mempool_cache_flush(cache, mp); + rte_mempool_cache_free(cache); + } + + return ret; } /* launch all the per-lcore test, and display the result */ @@ -199,7 +236,9 @@ launch_cores(unsigned cores) printf("mempool_autotest cache=%u cores=%u n_get_bulk=%u " "n_put_bulk=%u n_keep=%u ", - (unsigned) mp->cache_size, cores, n_get_bulk, n_put_bulk, n_keep); + use_external_cache ? + external_cache_size : (unsigned) mp->cache_size, + cores, n_get_bulk, n_put_bulk, n_keep); if (rte_mempool_count(mp) != MEMPOOL_SIZE) { printf("mempool is not full\n"); @@ -323,6 +362,20 @@ test_mempool_perf(void) if (do_one_mempool_test(rte_lcore_count()) < 0) return -1; + /* performance test with 1, 2 and max cores */ + printf("start performance test (with user-owned cache)\n"); + mp = mp_nocache; + use_external_cache = 1; + + if (do_one_mempool_test(1) < 0) + return -1; + + if (do_one_mempool_test(2) < 0) + return -1; + + if (do_one_mempool_test(rte_lcore_count()) < 0) + return -1; + rte_mempool_list_dump(stdout); return 0; diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst index 4737dc2..4b9895e 100644 --- a/doc/guides/prog_guide/env_abstraction_layer.rst +++ b/doc/guides/prog_guide/env_abstraction_layer.rst @@ -322,8 +322,8 @@ Known Issues The rte_mempool uses a per-lcore cache inside the mempool. For non-EAL pthreads, ``rte_lcore_id()`` will not return a valid number. 
- So for now, when rte_mempool is used with non-EAL pthreads, the put/get operations will bypass the mempool cache and there is a performance penalty because of this bypass. - Support for non-EAL mempool cache is currently being enabled. + So for now, when rte_mempool is used with non-EAL pthreads, the put/get operations will bypass the default mempool cache and there is a performance penalty because of this bypass. + Only user-owned external caches can be used in a non-EAL context in conjunction with ``rte_mempool_generic_put()`` and ``rte_mempool_generic_get()`` that accept an explicit cache parameter. + rte_ring diff --git a/doc/guides/prog_guide/mempool_lib.rst b/doc/guides/prog_guide/mempool_lib.rst index 1943fc4..5946675 100644 --- a/doc/guides/prog_guide/mempool_lib.rst +++ b/doc/guides/prog_guide/mempool_lib.rst @@ -115,7 +115,7 @@ While this may mean a number of buffers may sit idle on some core's cache, the speed at which a core can access its own cache for a specific memory pool without locks provides performance gains. The cache is composed of a small, per-core table of pointers and its length (used as a stack). -This cache can be enabled or disabled at creation of the pool. +This internal cache can be enabled or disabled at creation of the pool. The maximum size of the cache is static and is defined at compilation time (CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE). @@ -127,6 +127,10 @@ The maximum size of the cache is static and is defined at compilation time (CONF A mempool in Memory with its Associated Ring +Alternatively to the internal default per-lcore local cache, an application can create and manage external caches through the ``rte_mempool_cache_create()``, ``rte_mempool_cache_free()`` and ``rte_mempool_cache_flush()`` calls. +These user-owned caches can be explicitly passed to ``rte_mempool_generic_put()`` and ``rte_mempool_generic_get()``. +The ``rte_mempool_default_cache()`` call returns the default internal cache if any. 
+In contrast to the default caches, user-owned caches can be used by non-EAL threads too. Mempool Handlers ------------------------ diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index e6a83d0..4f159fc 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -674,6 +674,53 @@ rte_mempool_free(struct rte_mempool *mp) rte_memzone_free(mp->mz); } +static void +mempool_cache_init(struct rte_mempool_cache *cache, uint32_t size) +{ + cache->size = size; + cache->flushthresh = CALC_CACHE_FLUSHTHRESH(size); + cache->len = 0; +} + +/* + * Create and initialize a cache for objects that are retrieved from and + * returned to an underlying mempool. This structure is identical to the + * local_cache[lcore_id] pointed to by the mempool structure. + */ +struct rte_mempool_cache * +rte_mempool_cache_create(uint32_t size, int socket_id) +{ + struct rte_mempool_cache *cache; + + if (size == 0 || size > RTE_MEMPOOL_CACHE_MAX_SIZE) { + rte_errno = EINVAL; + return NULL; + } + + cache = rte_zmalloc_socket("MEMPOOL_CACHE", sizeof(*cache), + RTE_CACHE_LINE_SIZE, socket_id); + if (cache == NULL) { + RTE_LOG(ERR, MEMPOOL, "Cannot allocate mempool cache.\n"); + rte_errno = ENOMEM; + return NULL; + } + + mempool_cache_init(cache, size); + + return cache; +} + +/* + * Free a cache. It's the responsibility of the user to make sure that any + * remaining objects in the cache are flushed to the corresponding + * mempool. 
+ */ +void +rte_mempool_cache_free(struct rte_mempool_cache *cache) +{ + rte_free(cache); +} + /* create an empty mempool */ struct rte_mempool * rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size, @@ -688,6 +735,7 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size, size_t mempool_size; int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY; struct rte_mempool_objsz objsz; + unsigned lcore_id; int ret; /* compilation-time checks */ @@ -768,8 +816,8 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size, mp->elt_size = objsz.elt_size; mp->header_size = objsz.header_size; mp->trailer_size = objsz.trailer_size; + /* Size of default caches, zero means disabled. */ mp->cache_size = cache_size; - mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size); mp->private_data_size = private_data_size; STAILQ_INIT(&mp->elt_list); STAILQ_INIT(&mp->mem_list); @@ -781,6 +829,13 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size, mp->local_cache = (struct rte_mempool_cache *) RTE_PTR_ADD(mp, MEMPOOL_HEADER_SIZE(mp, 0)); + /* Init all default caches. 
*/ + if (cache_size != 0) { + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) + mempool_cache_init(&mp->local_cache[lcore_id], + cache_size); + } + te->data = mp; rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); @@ -936,7 +991,7 @@ rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp) unsigned count = 0; unsigned cache_count; - fprintf(f, " cache infos:\n"); + fprintf(f, " internal cache infos:\n"); fprintf(f, " cache_size=%"PRIu32"\n", mp->cache_size); if (mp->cache_size == 0) @@ -944,7 +999,8 @@ rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp) for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { cache_count = mp->local_cache[lcore_id].len; - fprintf(f, " cache_count[%u]=%u\n", lcore_id, cache_count); + fprintf(f, " cache_count[%u]=%"PRIu32"\n", + lcore_id, cache_count); count += cache_count; } fprintf(f, " total_cache_count=%u\n", count); @@ -1063,7 +1119,9 @@ mempool_audit_cache(const struct rte_mempool *mp) return; for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { - if (mp->local_cache[lcore_id].len > mp->cache_flushthresh) { + const struct rte_mempool_cache *cache; + cache = &mp->local_cache[lcore_id]; + if (cache->len > cache->flushthresh) { RTE_LOG(CRIT, MEMPOOL, "badness on cache[%u]\n", lcore_id); rte_panic("MEMPOOL: invalid cache len\n"); diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h index 971b1ba..1963253 100644 --- a/lib/librte_mempool/rte_mempool.h +++ b/lib/librte_mempool/rte_mempool.h @@ -101,7 +101,9 @@ struct rte_mempool_debug_stats { * A structure that stores a per-core object cache. */ struct rte_mempool_cache { - unsigned len; /**< Cache len */ + uint32_t size; /**< Size of the cache */ + uint32_t flushthresh; /**< Threshold before we flush excess elements */ + uint32_t len; /**< Current cache count */ /* * Cache is allocated to this size to allow it to overflow in certain * cases to avoid needless emptying of cache. 
@@ -213,9 +215,8 @@ struct rte_mempool { int flags; /**< Flags of the mempool. */ int socket_id; /**< Socket id passed at create. */ uint32_t size; /**< Max size of the mempool. */ - uint32_t cache_size; /**< Size of per-lcore local cache. */ - uint32_t cache_flushthresh; - /**< Threshold before we flush excess elements. */ + uint32_t cache_size; + /**< Size of per-lcore default local cache. */ uint32_t elt_size; /**< Size of an element. */ uint32_t header_size; /**< Size of header (before elt). */ @@ -945,6 +946,70 @@ uint32_t rte_mempool_mem_iter(struct rte_mempool *mp, void rte_mempool_dump(FILE *f, struct rte_mempool *mp); /** + * Create a user-owned mempool cache. + * + * This can be used by non-EAL threads to enable caching when they + * interact with a mempool. + * + * @param size + * The size of the mempool cache. See rte_mempool_create()'s cache_size + * parameter description for more information. The same limits and + * considerations apply here too. + * @param socket_id + * The socket identifier in the case of NUMA. The value can be + * SOCKET_ID_ANY if there is no NUMA constraint for the reserved zone. + */ +struct rte_mempool_cache * +rte_mempool_cache_create(uint32_t size, int socket_id); + +/** + * Free a user-owned mempool cache. + * + * @param cache + * A pointer to the mempool cache. + */ +void +rte_mempool_cache_free(struct rte_mempool_cache *cache); + +/** + * Flush a user-owned mempool cache to the specified mempool. + * + * @param cache + * A pointer to the mempool cache. + * @param mp + * A pointer to the mempool. + */ +static inline void __attribute__((always_inline)) +rte_mempool_cache_flush(struct rte_mempool_cache *cache, + struct rte_mempool *mp) +{ + rte_mempool_ops_enqueue_bulk(mp, cache->objs, cache->len); + cache->len = 0; +} + +/** + * Get a pointer to the per-lcore default mempool cache. + * + * @param mp + * A pointer to the mempool structure. + * @param lcore_id + * The logical core id. 
+ * @return + * A pointer to the mempool cache or NULL if disabled or non-EAL thread. + */ +static inline struct rte_mempool_cache *__attribute__((always_inline)) +rte_mempool_default_cache(struct rte_mempool *mp, unsigned lcore_id) +{ + if (mp->cache_size == 0) + return NULL; + + if (lcore_id >= RTE_MAX_LCORE) + return NULL; + + return &mp->local_cache[lcore_id]; +} + +/** * @internal Put several objects back in the mempool; used internally. * @param mp * A pointer to the mempool structure. @@ -953,34 +1018,30 @@ void rte_mempool_dump(FILE *f, struct rte_mempool *mp); * @param n * The number of objects to store back in the mempool, must be strictly * positive. + * @param cache + * A pointer to a mempool cache structure. May be NULL if not needed. * @param flags * The flags used for the mempool creation. * Single-producer (MEMPOOL_F_SP_PUT flag) or multi-producers. */ static inline void __attribute__((always_inline)) __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table, - unsigned n, int flags) + unsigned n, struct rte_mempool_cache *cache, int flags) { - struct rte_mempool_cache *cache; uint32_t index; void **cache_objs; - unsigned lcore_id = rte_lcore_id(); - uint32_t cache_size = mp->cache_size; - uint32_t flushthresh = mp->cache_flushthresh; /* increment stat now, adding in mempool always success */ __MEMPOOL_STAT_ADD(mp, put, n); - /* cache is not enabled or single producer or non-EAL thread */ - if (unlikely(cache_size == 0 || flags & MEMPOOL_F_SP_PUT || - lcore_id >= RTE_MAX_LCORE)) + /* No cache provided or single producer */ + if (unlikely(cache == NULL || flags & MEMPOOL_F_SP_PUT)) goto ring_enqueue; /* Go straight to ring if put would overflow mem allocated for cache */ if (unlikely(n > RTE_MEMPOOL_CACHE_MAX_SIZE)) goto ring_enqueue; - cache = &mp->local_cache[lcore_id]; cache_objs = &cache->objs[cache->len]; /* @@ -996,10 +1057,10 @@ __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table, cache->len += n; - if (cache->len 
>= flushthresh) { - rte_mempool_ops_enqueue_bulk(mp, &cache->objs[cache_size], - cache->len - cache_size); - cache->len = cache_size; + if (cache->len >= cache->flushthresh) { + rte_mempool_ops_enqueue_bulk(mp, &cache->objs[cache->size], + cache->len - cache->size); + cache->len = cache->size; } return; @@ -1025,16 +1086,18 @@ ring_enqueue: * A pointer to a table of void * pointers (objects). * @param n * The number of objects to add in the mempool from the obj_table. + * @param cache + * A pointer to a mempool cache structure. May be NULL if not needed. * @param flags * The flags used for the mempool creation. * Single-producer (MEMPOOL_F_SP_PUT flag) or multi-producers. */ static inline void __attribute__((always_inline)) rte_mempool_generic_put(struct rte_mempool *mp, void * const *obj_table, - unsigned n, int flags) + unsigned n, struct rte_mempool_cache *cache, int flags) { __mempool_check_cookies(mp, obj_table, n, 0); - __mempool_generic_put(mp, obj_table, n, flags); + __mempool_generic_put(mp, obj_table, n, cache, flags); } /** @@ -1052,7 +1115,9 @@ __rte_deprecated static inline void __attribute__((always_inline)) rte_mempool_mp_put_bulk(struct rte_mempool *mp, void * const *obj_table, unsigned n) { - rte_mempool_generic_put(mp, obj_table, n, 0); + struct rte_mempool_cache *cache; + cache = rte_mempool_default_cache(mp, rte_lcore_id()); + rte_mempool_generic_put(mp, obj_table, n, cache, 0); } /** @@ -1070,7 +1135,7 @@ __rte_deprecated static inline void __attribute__((always_inline)) rte_mempool_sp_put_bulk(struct rte_mempool *mp, void * const *obj_table, unsigned n) { - rte_mempool_generic_put(mp, obj_table, n, MEMPOOL_F_SP_PUT); + rte_mempool_generic_put(mp, obj_table, n, NULL, MEMPOOL_F_SP_PUT); } /** @@ -1091,7 +1156,9 @@ static inline void __attribute__((always_inline)) rte_mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table, unsigned n) { - rte_mempool_generic_put(mp, obj_table, n, mp->flags); + struct rte_mempool_cache *cache; + cache = 
rte_mempool_default_cache(mp, rte_lcore_id()); + rte_mempool_generic_put(mp, obj_table, n, cache, mp->flags); } /** @@ -1106,7 +1173,9 @@ rte_mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table, __rte_deprecated static inline void __attribute__((always_inline)) rte_mempool_mp_put(struct rte_mempool *mp, void *obj) { - rte_mempool_generic_put(mp, &obj, 1, 0); + struct rte_mempool_cache *cache; + cache = rte_mempool_default_cache(mp, rte_lcore_id()); + rte_mempool_generic_put(mp, &obj, 1, cache, 0); } /** @@ -1121,7 +1190,7 @@ rte_mempool_mp_put(struct rte_mempool *mp, void *obj) __rte_deprecated static inline void __attribute__((always_inline)) rte_mempool_sp_put(struct rte_mempool *mp, void *obj) { - rte_mempool_generic_put(mp, &obj, 1, MEMPOOL_F_SP_PUT); + rte_mempool_generic_put(mp, &obj, 1, NULL, MEMPOOL_F_SP_PUT); } /** @@ -1150,6 +1219,8 @@ rte_mempool_put(struct rte_mempool *mp, void *obj) * A pointer to a table of void * pointers (objects). * @param n * The number of objects to get, must be strictly positive. + * @param cache + * A pointer to a mempool cache structure. May be NULL if not needed. * @param flags * The flags used for the mempool creation. * Single-consumer (MEMPOOL_F_SC_GET flag) or multi-consumers. 
@@ -1159,27 +1230,23 @@ rte_mempool_put(struct rte_mempool *mp, void *obj) */ static inline int __attribute__((always_inline)) __mempool_generic_get(struct rte_mempool *mp, void **obj_table, - unsigned n, int flags) + unsigned n, struct rte_mempool_cache *cache, int flags) { int ret; - struct rte_mempool_cache *cache; uint32_t index, len; void **cache_objs; - unsigned lcore_id = rte_lcore_id(); - uint32_t cache_size = mp->cache_size; - /* cache is not enabled or single consumer */ - if (unlikely(cache_size == 0 || flags & MEMPOOL_F_SC_GET || - n >= cache_size || lcore_id >= RTE_MAX_LCORE)) + /* No cache provided or single consumer */ + if (unlikely(cache == NULL || flags & MEMPOOL_F_SC_GET || + n >= cache->size)) goto ring_dequeue; - cache = &mp->local_cache[lcore_id]; cache_objs = cache->objs; /* Can this be satisfied from the cache? */ if (cache->len < n) { /* No. Backfill the cache first, and then fill from it */ - uint32_t req = n + (cache_size - cache->len); + uint32_t req = n + (cache->size - cache->len); /* How many do we require i.e. number to fill the cache + the request */ ret = rte_mempool_ops_dequeue_bulk(mp, @@ -1234,6 +1301,8 @@ ring_dequeue: * A pointer to a table of void * pointers (objects) that will be filled. * @param n * The number of objects to get from mempool to obj_table. + * @param cache + * A pointer to a mempool cache structure. May be NULL if not needed. * @param flags * The flags used for the mempool creation. * Single-consumer (MEMPOOL_F_SC_GET flag) or multi-consumers. 
@@ -1243,10 +1312,10 @@ ring_dequeue: */ static inline int __attribute__((always_inline)) rte_mempool_generic_get(struct rte_mempool *mp, void **obj_table, unsigned n, - int flags) + struct rte_mempool_cache *cache, int flags) { int ret; - ret = __mempool_generic_get(mp, obj_table, n, flags); + ret = __mempool_generic_get(mp, obj_table, n, cache, flags); if (ret == 0) __mempool_check_cookies(mp, obj_table, n, 1); return ret; @@ -1274,7 +1343,9 @@ rte_mempool_generic_get(struct rte_mempool *mp, void **obj_table, unsigned n, __rte_deprecated static inline int __attribute__((always_inline)) rte_mempool_mc_get_bulk(struct rte_mempool *mp, void **obj_table, unsigned n) { - return rte_mempool_generic_get(mp, obj_table, n, 0); + struct rte_mempool_cache *cache; + cache = rte_mempool_default_cache(mp, rte_lcore_id()); + return rte_mempool_generic_get(mp, obj_table, n, cache, 0); } /** @@ -1300,7 +1371,8 @@ rte_mempool_mc_get_bulk(struct rte_mempool *mp, void **obj_table, unsigned n) __rte_deprecated static inline int __attribute__((always_inline)) rte_mempool_sc_get_bulk(struct rte_mempool *mp, void **obj_table, unsigned n) { - return rte_mempool_generic_get(mp, obj_table, n, MEMPOOL_F_SC_GET); + return rte_mempool_generic_get(mp, obj_table, n, NULL, + MEMPOOL_F_SC_GET); } /** @@ -1328,7 +1400,9 @@ rte_mempool_sc_get_bulk(struct rte_mempool *mp, void **obj_table, unsigned n) static inline int __attribute__((always_inline)) rte_mempool_get_bulk(struct rte_mempool *mp, void **obj_table, unsigned n) { - return rte_mempool_generic_get(mp, obj_table, n, mp->flags); + struct rte_mempool_cache *cache; + cache = rte_mempool_default_cache(mp, rte_lcore_id()); + return rte_mempool_generic_get(mp, obj_table, n, cache, mp->flags); } /** @@ -1351,7 +1425,9 @@ rte_mempool_get_bulk(struct rte_mempool *mp, void **obj_table, unsigned n) __rte_deprecated static inline int __attribute__((always_inline)) rte_mempool_mc_get(struct rte_mempool *mp, void **obj_p) { - return 
rte_mempool_generic_get(mp, obj_p, 1, 0); + struct rte_mempool_cache *cache; + cache = rte_mempool_default_cache(mp, rte_lcore_id()); + return rte_mempool_generic_get(mp, obj_p, 1, cache, 0); } /** @@ -1374,7 +1450,7 @@ rte_mempool_mc_get(struct rte_mempool *mp, void **obj_p) __rte_deprecated static inline int __attribute__((always_inline)) rte_mempool_sc_get(struct rte_mempool *mp, void **obj_p) { - return rte_mempool_generic_get(mp, obj_p, 1, MEMPOOL_F_SC_GET); + return rte_mempool_generic_get(mp, obj_p, 1, NULL, MEMPOOL_F_SC_GET); } /** @@ -1408,7 +1484,7 @@ rte_mempool_get(struct rte_mempool *mp, void **obj_p) * * When cache is enabled, this function has to browse the length of * all lcores, so it should not be used in a data path, but only for - * debug purposes. + * debug purposes. User-owned mempool caches are not accounted for. * * @param mp * A pointer to the mempool structure. @@ -1427,7 +1503,7 @@ unsigned rte_mempool_count(const struct rte_mempool *mp); * * When cache is enabled, this function has to browse the length of * all lcores, so it should not be used in a data path, but only for - * debug purposes. + * debug purposes. User-owned mempool caches are not accounted for. * * @param mp * A pointer to the mempool structure. @@ -1445,7 +1521,7 @@ rte_mempool_free_count(const struct rte_mempool *mp) * * When cache is enabled, this function has to browse the length of all * lcores, so it should not be used in a data path, but only for debug - * purposes. + * purposes. User-owned mempool caches are not accounted for. * * @param mp * A pointer to the mempool structure. @@ -1464,7 +1540,7 @@ rte_mempool_full(const struct rte_mempool *mp) * * When cache is enabled, this function has to browse the length of all * lcores, so it should not be used in a data path, but only for debug - * purposes. + * purposes. User-owned mempool caches are not accounted for. * * @param mp * A pointer to the mempool structure. 
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map index 6d4fc4a..729ea97 100644 --- a/lib/librte_mempool/rte_mempool_version.map +++ b/lib/librte_mempool/rte_mempool_version.map @@ -19,8 +19,12 @@ DPDK_2.0 { DPDK_16.07 { global: + rte_mempool_cache_create; + rte_mempool_cache_flush; + rte_mempool_cache_free; rte_mempool_check_cookies; rte_mempool_create_empty; + rte_mempool_default_cache; rte_mempool_free; rte_mempool_generic_get; rte_mempool_generic_put; -- 1.9.1
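
For readers who want to see the put/flush behaviour of a user-owned cache without a DPDK build, here is a minimal stand-alone sketch of the put path from the patch above. It is not DPDK code. Every model_* name is hypothetical, a plain array stands in for the mempool's backing ring, and the 1.5x flush threshold is an assumption about what CALC_CACHE_FLUSHTHRESH() computes (the macro body is not shown in this patch). The control flow follows __mempool_generic_put() and rte_mempool_cache_flush() as they appear in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CACHE_MAX 8   /* hypothetical stand-in for RTE_MEMPOOL_CACHE_MAX_SIZE */
#define MODEL_POOL_MAX  64  /* capacity of the toy backing store */

/* Toy backing store; in DPDK this role is played by the mempool's ring. */
struct model_pool {
	void *objs[MODEL_POOL_MAX];
	uint32_t len;
};

/* Mirrors the size/flushthresh/len fields the patch puts in the cache. */
struct model_cache {
	uint32_t size;
	uint32_t flushthresh;
	uint32_t len;
	void *objs[MODEL_CACHE_MAX * 3]; /* oversized so a put may overflow 'size' */
};

static void
model_cache_init(struct model_cache *cache, uint32_t size)
{
	assert(size > 0 && size <= MODEL_CACHE_MAX);
	cache->size = size;
	cache->flushthresh = size + size / 2; /* ASSUMPTION: 1.5x multiplier */
	cache->len = 0;
}

static void
model_pool_enqueue(struct model_pool *pool, void * const *objs, uint32_t n)
{
	uint32_t i;

	assert(pool->len + n <= MODEL_POOL_MAX);
	for (i = 0; i < n; i++)
		pool->objs[pool->len++] = objs[i];
}

/* Same shape as the patched put path: NULL cache means "go to the pool". */
static void
model_put(struct model_pool *pool, void * const *objs, uint32_t n,
	  struct model_cache *cache)
{
	uint32_t i;

	if (cache == NULL || n > MODEL_CACHE_MAX) {
		model_pool_enqueue(pool, objs, n);
		return;
	}

	for (i = 0; i < n; i++)
		cache->objs[cache->len + i] = objs[i];
	cache->len += n;

	/* Past the threshold: return everything above 'size' to the pool. */
	if (cache->len >= cache->flushthresh) {
		model_pool_enqueue(pool, &cache->objs[cache->size],
				   cache->len - cache->size);
		cache->len = cache->size;
	}
}

/* Same shape as rte_mempool_cache_flush(): drain the whole cache. */
static void
model_cache_flush(struct model_cache *cache, struct model_pool *pool)
{
	model_pool_enqueue(pool, cache->objs, cache->len);
	cache->len = 0;
}

/*
 * Put n_puts objects one at a time and return how many had reached the
 * pool before the explicit flush that a cache owner must do before free.
 */
uint32_t
model_demo(uint32_t size, uint32_t n_puts, int use_cache)
{
	static int payload[MODEL_POOL_MAX];
	struct model_pool pool = { .len = 0 };
	struct model_cache cache;
	uint32_t i, before_flush;

	assert(n_puts <= MODEL_POOL_MAX);
	model_cache_init(&cache, size);

	for (i = 0; i < n_puts; i++) {
		void *obj = &payload[i];
		model_put(&pool, &obj, 1, use_cache ? &cache : NULL);
	}
	before_flush = pool.len;

	if (use_cache)
		model_cache_flush(&cache, &pool);
	assert(pool.len == n_puts); /* nothing is lost once flushed */

	return before_flush;
}
```

With a cache of size 4 (threshold 6 under the assumed multiplier), seven single-object puts leave only two objects in the pool until the cache is flushed, which is why the test above adds cache->len before comparing against rte_mempool_count(), and why rte_mempool_cache_flush() has to be called before rte_mempool_cache_free().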