From: "Hunt, David"
To: Olivier MATZ, dev@dpdk.org
References: <1453829155-1366-1-git-send-email-david.hunt@intel.com> <1453829155-1366-2-git-send-email-david.hunt@intel.com> <56B365A0.3080206@6wind.com>
Message-ID: <56D599FE.90706@intel.com>
Date: Tue, 1 Mar 2016 13:32:46 +0000
In-Reply-To: <56B365A0.3080206@6wind.com>
Subject: Re: [dpdk-dev] [PATCH 1/5] mempool: add external mempool manager support

Olivier,

Here are my comments on your feedback. Hopefully I've covered all of it
this time, and I've summarised the outstanding questions at the bottom.

On 2/4/2016 2:52 PM, Olivier MATZ wrote:
>
>> -#ifndef RTE_LIBRTE_XEN_DOM0
>> -/* stub if DOM0 support not configured */
>> -struct rte_mempool *
>> -rte_dom0_mempool_create(const char *name __rte_unused,
>> -        unsigned n __rte_unused,
>> -        unsigned elt_size __rte_unused,
>> -        unsigned cache_size __rte_unused,
>> -        unsigned private_data_size __rte_unused,
>> -        rte_mempool_ctor_t *mp_init __rte_unused,
>> -        void *mp_init_arg __rte_unused,
>> -        rte_mempool_obj_ctor_t *obj_init __rte_unused,
>> -        void *obj_init_arg __rte_unused,
>> -        int socket_id __rte_unused,
>> -        unsigned flags __rte_unused)
>> -{
>> -        rte_errno = EINVAL;
>> -        return NULL;
>> -}
>> -#endif
>> -
>
> Could we move this is a separated commit?
> "mempool: remove unused rte_dom0_mempool_create stub"

Will do for v3.

--snip--

>         return rte_mempool_xmem_create(name, n, elt_size,
>> -                cache_size, private_data_size,
>> -                mp_init, mp_init_arg,
>> -                obj_init, obj_init_arg,
>> -                socket_id, flags,
>> -                NULL, NULL, MEMPOOL_PG_NUM_DEFAULT,
>> -                MEMPOOL_PG_SHIFT_MAX);
>> +                cache_size, private_data_size,
>> +                mp_init, mp_init_arg,
>> +                obj_init, obj_init_arg,
>> +                socket_id, flags,
>> +                NULL, NULL,
>> +                MEMPOOL_PG_NUM_DEFAULT, MEMPOOL_PG_SHIFT_MAX);
>> }
>
> As far as I can see, you are not modifying the code here, only the
> style. For better readability, it should go in another commit that
> only fixes indent or style issues.
>

I've removed the style-only changes in v2; they only make the diff more
difficult to read.

> Also, I think the proper indentation is to use only one tab for the
> subsequent lines.

I've done this in v2.

>
>> @@ -598,6 +568,22 @@ rte_mempool_xmem_create(const char *name,
>> unsigned n, unsigned elt_size,
>>         mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
>>         mp->private_data_size = private_data_size;
>>
>> +        /*
>> +         * Since we have 4 combinations of the SP/SC/MP/MC, and stack,
>> +         * examine the
>> +         * flags to set the correct index into the handler table.
>> +         */
>
> nit: comment style is not correct
>

Will fix.

>> +        if (flags & MEMPOOL_F_USE_STACK)
>> +                mp->handler_idx = rte_get_mempool_handler("stack");
>
> The stack handler does not exist yet, it is introduced in the next
> commit. I think this code should be moved in the next commit too.

Done in v2

>
>> @@ -622,6 +607,10 @@ rte_mempool_xmem_create(const char *name,
>> unsigned n, unsigned elt_size,
>>
>>         mp->elt_va_end = mp->elt_va_start;
>>
>> +        /* Parameters are setup. Call the mempool handler alloc */
>> +        if ((rte_mempool_ext_alloc(mp, name, n, socket_id, flags)) == NULL)
>> +                goto exit;
>> +
>
> I think some memory needs to be freed here. At least 'te'.

Done in v2

>> @@ -681,7 +670,9 @@ rte_mempool_dump_cache(FILE *f, const struct
>> rte_mempool *mp)
>>         fprintf(f, " cache_size=%"PRIu32"\n", mp->cache_size);
>>         for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
>>                 cache_count = mp->local_cache[lcore_id].len;
>> -                fprintf(f, " cache_count[%u]=%u\n", lcore_id, cache_count);
>> +                if (cache_count > 0)
>> +                        fprintf(f, " cache_count[%u]=%u\n",
>> +                                lcore_id, cache_count);
>>                 count += cache_count;
>>         }
>>         fprintf(f, " total_cache_count=%u\n", count);
>
> This could also be moved in a separate commit.

Removed this change, as it's not really relevant to the mempool manager.

>> @@ -825,7 +815,7 @@ rte_mempool_dump(FILE *f, const struct
>> rte_mempool *mp)
>>                 mp->size);
>>
>>         cache_count = rte_mempool_dump_cache(f, mp);
>> -        common_count = rte_ring_count(mp->ring);
>> +        common_count = /* rte_ring_count(mp->ring)*/0;
>>         if ((cache_count + common_count) > mp->size)
>>                 common_count = mp->size - cache_count;
>>         fprintf(f, " common_pool_count=%u\n", common_count);
>
> should it be rte_mempool_ext_get_count(mp) instead?
>

Done.

>
>
>> @@ -919,3 +909,111 @@ void rte_mempool_walk(void (*func)(const struct
>> rte_mempool *, void *),
>>
>>         rte_rwlock_read_unlock(RTE_EAL_MEMPOOL_RWLOCK);
>> }
>> +
>> +
>> +/* create the mempool using and external mempool manager */
>> +struct rte_mempool *
>> +rte_mempool_create_ext(const char *name, unsigned n, unsigned elt_size,
>> +                unsigned cache_size, unsigned private_data_size,
>> +                rte_mempool_ctor_t *mp_init, void *mp_init_arg,
>> +                rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
>> +                int socket_id, unsigned flags,
>> +                const char *handler_name)
>> +{
>
> I would have used one tab here for subsequent lines.

Done in v2

>
>> +        char mz_name[RTE_MEMZONE_NAMESIZE];
>> +        struct rte_mempool_list *mempool_list;
>> +        struct rte_mempool *mp = NULL;
>> +        struct rte_tailq_entry *te;
>> +        const struct rte_memzone *mz;
>> +        size_t mempool_size;
>> +        int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
>> +        int rg_flags = 0;
>> +        int16_t handler_idx;
>> +
>> +        mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head,
>> rte_mempool_list);
>> +
>> +        /* asked cache too big */
>> +        if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
>> +                CALC_CACHE_FLUSHTHRESH(cache_size) > n) {
>> +                rte_errno = EINVAL;
>> +                return NULL;
>> +        }
>> +
>> +        handler_idx = rte_get_mempool_handler(handler_name);
>> +        if (handler_idx < 0) {
>> +                RTE_LOG(ERR, MEMPOOL, "Cannot find mempool handler by
>> name!\n");
>> +                goto exit;
>> +        }
>> +
>> +        /* ring flags */
>> +        if (flags & MEMPOOL_F_SP_PUT)
>> +                rg_flags |= RING_F_SP_ENQ;
>> +        if (flags & MEMPOOL_F_SC_GET)
>> +                rg_flags |= RING_F_SC_DEQ;
>> +
>> ...
>
> I have the same comment than Jerin here. I think it should be
> factorized with rte_mempool_xmem_create() if possible. Maybe a
> at least a function rte_mempool_init() could be introduced, in
> the same model than rte_ring_init().

factorization done in v2.

>
>> diff --git a/lib/librte_mempool/rte_mempool.h
>> b/lib/librte_mempool/rte_mempool.h
>> index 6e2390a..620cfb7 100644
>> --- a/lib/librte_mempool/rte_mempool.h
>> +++ b/lib/librte_mempool/rte_mempool.h
>> @@ -88,6 +88,8 @@ extern "C" {
>> struct rte_mempool_debug_stats {
>>         uint64_t put_bulk; /**< Number of puts. */
>>         uint64_t put_objs; /**< Number of objects successfully
>> put. */
>> +        uint64_t put_pool_bulk; /**< Number of puts into pool. */
>> +        uint64_t put_pool_objs; /**< Number of objects into pool. */
>>         uint64_t get_success_bulk; /**< Successful allocation number. */
>>         uint64_t get_success_objs; /**< Objects successfully allocated. */
>>         uint64_t get_fail_bulk; /**< Failed allocation number. */
>
> I think the comment of put_pool_objs is not very clear.
> Shouldn't we have the same stats for get?
>

Not used, removed. Covered by put_bulk.

>
>> @@ -123,6 +125,7 @@ struct rte_mempool_objsz {
>> #define RTE_MEMPOOL_NAMESIZE 32 /**< Maximum length of a memory
>> pool. */
>> #define RTE_MEMPOOL_MZ_PREFIX "MP_"
>>
>> +
>> /* "MP_" */
>> #define RTE_MEMPOOL_MZ_FORMAT RTE_MEMPOOL_MZ_PREFIX "%s"
>>
>
> to be removed

Done in v2.

>
>> @@ -175,12 +178,85 @@ struct rte_mempool_objtlr {
>> #endif
>> };
>>
>> +/* Handler functions for external mempool support */
>> +typedef void *(*rte_mempool_alloc_t)(struct rte_mempool *mp,
>> +        const char *name, unsigned n, int socket_id, unsigned flags);
>> +typedef int (*rte_mempool_put_t)(void *p,
>> +        void * const *obj_table, unsigned n);
>> +typedef int (*rte_mempool_get_t)(void *p, void **obj_table,
>> +        unsigned n);
>> +typedef unsigned (*rte_mempool_get_count)(void *p);
>> +typedef int(*rte_mempool_free_t)(struct rte_mempool *mp);
>
> a space is missing after 'int'.

done in v2

>
>
>> +
>> +/**
>> + * @internal wrapper for external mempool manager alloc callback.
>> + *
>> + * @param mp
>> + * Pointer to the memory pool.
>> + * @param name
>> + * Name of the statistics field to increment in the memory pool.
>> + * @param n
>> + * Number to add to the object-oriented statistics.
>
> Are this comments correct?

Fixed in v2

>
>
>> + * @param socket_id
>> + * socket id on which to allocate.
>> + * @param flags
>> + * general flags to allocate function
>
> We could add that we are talking about MEMPOOL_F_* flags.
>
> By the way, the '@return' is missing in all declarations.
>

Will fix in v3

>
>> +/**
>> + * @internal wrapper for external mempool manager get_count callback.
>> + *
>> + * @param mp
>> + * Pointer to the memory pool.
>> + */
>> +int
>> +rte_mempool_ext_get_count(const struct rte_mempool *mp);
>
> should it be unsigned instead of int?
>

Yes. Will change.

>
>> +
>> +/**
>> + * @internal wrapper for external mempool manager free callback.
>> + *
>> + * @param mp
>> + * Pointer to the memory pool.
>> + */
>> +int
>> +rte_mempool_ext_free(struct rte_mempool *mp);
>> +
>> /**
>> * The RTE mempool structure.
>> */
>> struct rte_mempool {
>>         char name[RTE_MEMPOOL_NAMESIZE]; /**< Name of mempool. */
>> -        struct rte_ring *ring; /**< Ring to store objects. */
>>         phys_addr_t phys_addr; /**< Phys. addr. of mempool
>> struct. */
>>         int flags; /**< Flags of the mempool. */
>>         uint32_t size; /**< Size of the mempool. */
>> @@ -194,6 +270,11 @@ struct rte_mempool {
>>
>>         unsigned private_data_size; /**< Size of private data. */
>> +        /* Common pool data structure pointer */
>> +        void *rt_pool __rte_cache_aligned;
>
> What is the meaning of rt_pool?

I agree that it's probably not a very good name. Since it's basically the
pointer which is used by the handlers' callbacks, maybe we should call it
mempool_storage? That leaves it generic enough that it can point at a ring,
an array, or whatever else is needed for a particular handler.

>> +
>> +        int16_t handler_idx;
>> +
>
> I don't think I'm getting why an index is better than a pointer to
> the struct rte_mempool_handler. It would simplify the add_handler()
> function. See below for a detailed explaination.
>

As discussed in previous mails, it's to facilitate secondary processes.

>> @@ -223,6 +304,10 @@ struct rte_mempool {
>> #define MEMPOOL_F_NO_CACHE_ALIGN 0x0002 /**< Do not align objs on
>> cache lines.*/
>> #define MEMPOOL_F_SP_PUT 0x0004 /**< Default put is
>> "single-producer".*/
>> #define MEMPOOL_F_SC_GET 0x0008 /**< Default get is
>> "single-consumer".*/
>> +#define MEMPOOL_F_USE_STACK 0x0010 /**< Use a stack for the
>> common pool. */
>
> Stack is not implemented in this commit. It should be moved in next
> commit.

Done in v2

>> +#define MEMPOOL_F_USE_TM 0x0020
>> +#define MEMPOOL_F_NO_SECONDARY 0x0040
>> +
>
> What are these flags?

Not needed. Part of a temporary change. Removed.

>> @@ -728,7 +813,6 @@ rte_dom0_mempool_create(const char *name,
>> unsigned n, unsigned elt_size,
>>         rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
>>         int socket_id, unsigned flags);
>>
>> -
>> /**
>> * Dump the status of the mempool to the console.
>> *
>
> style

will fix in v3.

>
>
>> @@ -753,7 +837,7 @@ void rte_mempool_dump(FILE *f, const struct
>> rte_mempool *mp);
>> */
>> static inline void __attribute__((always_inline))
>> __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
>> -        unsigned n, int is_mp)
>> +        unsigned n, __attribute__((unused)) int is_mp)
>
> You could use __rte_unused instead of __attribute__((unused))

will change in v3

>
>> @@ -769,8 +853,7 @@ __mempool_put_bulk(struct rte_mempool *mp, void *
>> const *obj_table,
>>
>> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>>         /* cache is not enabled or single producer or non-EAL thread */
>> -        if (unlikely(cache_size == 0 || is_mp == 0 ||
>> -                lcore_id >= RTE_MAX_LCORE))
>> +        if (unlikely(cache_size == 0 || lcore_id >= RTE_MAX_LCORE))
>>                 goto ring_enqueue;
>>
>>         /* Go straight to ring if put would overflow mem allocated for
>> cache */
>
> If I understand well, we now always use the cache, even if the mempool
> is single-producer. I was wondering if it would have a performance
> impact... I suppose that using the cache is more efficient than the ring
> in single-producer mode, so it may increase performance. Do you have an
> idea of the impact here?

I've seen very little performance gain, maybe a couple of percent for some
tests, and up to a 10% drop for some single-core tests. I'll do some more
specific testing based on SP versus MP.

>
> I think we could remove the parameter as the function is marked as
> internal. The comment above should also be fixed. The same comments
> apply to the get() functions.
>

will fix comments in v3, and see if we should remove is_mp based on more
performance testing.
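To make the behaviour under discussion concrete, here's a rough sketch (not
the actual patch code) of the put path once is_mp is ignored: every put from
an EAL thread is staged in the per-lcore cache and flushed down to the common
pool through the external handler once the flush threshold is crossed. The
function name put_bulk_sketch is made up; rte_mempool_ext_put_bulk(),
local_cache and cache_flushthresh are the names used in this series, and the
cache-overflow check from the real code is left out for brevity.

/* Sketch only: put path with is_mp ignored; assumes the patch's
 * rte_mempool_ext_put_bulk() wrapper and the existing per-lcore cache. */
static inline void
put_bulk_sketch(struct rte_mempool *mp, void * const *obj_table, unsigned n)
{
        unsigned lcore_id = rte_lcore_id();
        uint32_t cache_size = mp->cache_size;
        struct rte_mempool_cache *cache;
        unsigned i;

        /* no cache configured, or not an EAL thread: go straight to the
         * common pool via the handler */
        if (unlikely(cache_size == 0 || lcore_id >= RTE_MAX_LCORE)) {
                rte_mempool_ext_put_bulk(mp, obj_table, n);
                return;
        }

        /* stage the objects in the per-lcore cache */
        cache = &mp->local_cache[lcore_id];
        for (i = 0; i < n; i++)
                cache->objs[cache->len + i] = obj_table[i];
        cache->len += n;

        /* flush the excess down to the common pool through the handler */
        if (unlikely(cache->len >= mp->cache_flushthresh)) {
                rte_mempool_ext_put_bulk(mp, &cache->objs[cache_size],
                                cache->len - cache_size);
                cache->len = cache_size;
        }
}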
>
>> @@ -793,8 +876,8 @@ __mempool_put_bulk(struct rte_mempool *mp, void *
>> const *obj_table,
>>
>>         cache->len += n;
>>
>> -        if (cache->len >= flushthresh) {
>> -                rte_ring_mp_enqueue_bulk(mp->ring, &cache->objs[cache_size],
>> +        if (unlikely(cache->len >= flushthresh)) {
>> +                rte_mempool_ext_put_bulk(mp, &cache->objs[cache_size],
>>                         cache->len - cache_size);
>
> Shouldn't we add a __MEMPOOL_STAT_ADD(mp, put_pool,
> cache->len - cache_size) here ?
>

Correct. Added in v3.

>> @@ -954,8 +1025,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void
>> **obj_table,
>>         uint32_t cache_size = mp->cache_size;
>>
>>         /* cache is not enabled or single consumer */
>> -        if (unlikely(cache_size == 0 || is_mc == 0 ||
>> -                n >= cache_size || lcore_id >= RTE_MAX_LCORE))
>> +        if (unlikely(cache_size == 0 || n >= cache_size ||
>> +                lcore_id >= RTE_MAX_LCORE))
>
> incorrect indent

will fix in v3

>
>> @@ -967,7 +1038,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void
>> **obj_table,
>>         uint32_t req = n + (cache_size - cache->len);
>>
>>         /* How many do we require i.e. number to fill the cache +
>> the request */
>> -        ret = rte_ring_mc_dequeue_bulk(mp->ring,
>> &cache->objs[cache->len], req);
>> +        ret = rte_mempool_ext_get_bulk(mp,
>> +                &cache->objs[cache->len], req);
>
> indent

will fix in v3

>> +/**
>> + * Function to get an index to an external mempool manager
>> + *
>> + * @param name
>> + * The name of the mempool handler to search for in the list of
>> handlers
>> + * @return
>> + * The index of the mempool handler in the list of registered mempool
>> + * handlers
>> + */
>> +int16_t
>> +rte_get_mempool_handler(const char *name);
>
> I would prefer a function like this:
>
> const struct rte_mempool_handler *
> rte_get_mempool_handler(const char *name);
>
> (detailed explaination below)

Already discussed previously: an index is needed rather than a pointer
because of secondary processes.
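To illustrate the multi-process point: the rte_mempool itself lives in shared
memory, but a function pointer is only valid inside the process that
registered it, so the shared structure stores an index that each process
resolves against its own copy of the handler table. A minimal sketch,
assuming the mempool_handler_list / handler_idx definitions from this patch
(ext_put_bulk_sketch is a made-up name, not the wrapper in the series):

/* Sketch: dispatch through the per-process handler table using the index
 * stored in the shared mempool. Each process fills mempool_handler_list
 * via constructor-time registration, so the same index resolves to valid
 * callbacks in both primary and secondary processes, whereas a raw
 * function pointer stored in the shared struct would not. */
static inline int
ext_put_bulk_sketch(struct rte_mempool *mp, void * const *obj_table,
                unsigned n)
{
        struct rte_mempool_handler *h =
                &mempool_handler_list.handler[mp->handler_idx];

        return h->put(mp->rt_pool, obj_table, n);
}

This does rely on every process registering the same handlers in the same
order so that the indices line up, which holds as long as primary and
secondary processes are built against the same set of handler objects.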
>> diff --git a/lib/librte_mempool/rte_mempool_default.c
>> b/lib/librte_mempool/rte_mempool_default.c
>> new file mode 100644
>> index 0000000..2493dc1
>> --- /dev/null
>> +++ b/lib/librte_mempool/rte_mempool_default.c
>> +#include "rte_mempool_internal.h"
>> +
>> +/*
>> + * Indirect jump table to support external memory pools
>> + */
>> +struct rte_mempool_handler_list mempool_handler_list = {
>> +        .sl = RTE_SPINLOCK_INITIALIZER ,
>> +        .num_handlers = 0
>> +};
>> +
>> +/* TODO Convert to older mechanism of an array of stucts */
>> +int16_t
>> +add_handler(struct rte_mempool_handler *h)
>> +{
>> +        int16_t handler_idx;
>> +
>> +        /* */
>> +        rte_spinlock_lock(&mempool_handler_list.sl);
>> +
>> +        /* Check whether jump table has space */
>> +        if (mempool_handler_list.num_handlers >=
>> RTE_MEMPOOL_MAX_HANDLER_IDX) {
>> +                rte_spinlock_unlock(&mempool_handler_list.sl);
>> +                RTE_LOG(ERR, MEMPOOL,
>> +                        "Maximum number of mempool handlers exceeded\n");
>> +                return -1;
>> +        }
>> +
>> +        if ((h->put == NULL) || (h->get == NULL) ||
>> +                (h->get_count == NULL)) {
>> +                rte_spinlock_unlock(&mempool_handler_list.sl);
>> +                RTE_LOG(ERR, MEMPOOL,
>> +                        "Missing callback while registering mempool
>> handler\n");
>> +                return -1;
>> +        }
>> +
>> +        /* add new handler index */
>> +        handler_idx = mempool_handler_list.num_handlers++;
>> +
>> +        snprintf(mempool_handler_list.handler[handler_idx].name,
>> +                RTE_MEMPOOL_NAMESIZE, "%s", h->name);
>> +        mempool_handler_list.handler[handler_idx].alloc = h->alloc;
>> +        mempool_handler_list.handler[handler_idx].put = h->put;
>> +        mempool_handler_list.handler[handler_idx].get = h->get;
>> +        mempool_handler_list.handler[handler_idx].get_count = h->get_count;
>> +
>> +        rte_spinlock_unlock(&mempool_handler_list.sl);
>> +
>> +        return handler_idx;
>> +}
>
> Why not using a similar mechanism than what we have for PMDs?
>
> void rte_eal_driver_register(struct rte_driver *driver)
> {
>         TAILQ_INSERT_TAIL(&dev_driver_list, driver, next);
> }
>
> To do that, you just need to add a TAILQ_ENTRY() in your
> rte_mempool_handler structure. This would avoid to duplicate the
> structure into a static array whose size is limited.
>
> Accessing to the callbacks would be easier:
>
>         return mp->mp_handler->put(mp->rt_pool, obj_table, n);
>
> instead of:
>
>         return (mempool_handler_list.handler[mp->handler_idx].put)
>                 (mp->rt_pool, obj_table, n);
>
> If we really want to copy the handlers somewhere, it could be in
> the mempool structure. It would avoid an extra dereference
> (note the first '.' instead of '->'):
>
>         return mp.mp_handler->put(mp->rt_pool, obj_table, n);
>
> After doing that, we could ask ourself if the wrappers are still
> useful or not. I would have say that they could be removed.
>
>
> The spinlock could be kept, although it may look a bit overkill:
> - I don't expect to have several loading at the same time
> - There is no unregister() function, so there is no risk to
>   browse the list atomically
>

Already discussed previously: an index is needed rather than a pointer
because of secondary processes.

> Last thing, I think this code should go in rte_mempool.c, not in
> rte_mempool_default.c.

I was trying to keep the default handlers together in their own file, rather
than having them in with the mempool framework. I think it's better having
them separate, and new handlers can go in their own files also. No?
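To make the "new handlers in their own files" point concrete, here is a
hypothetical skeleton of such a file, using the callback typedefs,
struct rte_mempool_handler and the REGISTER_MEMPOOL_HANDLER() constructor
macro from this series. The handler itself ("my_handler" and its callbacks)
is a made-up stub that just fails, not something in the patch; a real handler
would manage its backing store through mp->rt_pool.

#include <errno.h>
#include "rte_mempool_internal.h"

/* Hypothetical handler skeleton: callbacks are failing stubs. */
static void *
my_handler_alloc(struct rte_mempool *mp, const char *name, unsigned n,
                int socket_id, unsigned flags)
{
        (void)name; (void)n; (void)socket_id; (void)flags;
        mp->rt_pool = NULL;     /* would point at the real storage */
        return NULL;            /* NULL means the allocation failed */
}

static int
my_handler_put(void *p, void * const *obj_table, unsigned n)
{
        (void)p; (void)obj_table; (void)n;
        return -ENOBUFS;
}

static int
my_handler_get(void *p, void **obj_table, unsigned n)
{
        (void)p; (void)obj_table; (void)n;
        return -ENOENT;
}

static unsigned
my_handler_get_count(void *p)
{
        (void)p;
        return 0;
}

static struct rte_mempool_handler my_handler = {
        .name = "my_handler",
        .alloc = my_handler_alloc,
        .put = my_handler_put,
        .get = my_handler_get,
        .get_count = my_handler_get_count,
};

/* registers the handler into the table at load time via a constructor */
REGISTER_MEMPOOL_HANDLER(my_handler);

Since the constructor runs when the object is loaded, linking the file in is
enough to make the handler selectable by name through
rte_mempool_create_ext().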
>> +
>> +/* TODO Convert to older mechanism of an array of stucts */
>> +int16_t
>> +rte_get_mempool_handler(const char *name)
>> +{
>> +        int16_t i;
>> +
>> +        for (i = 0; i < mempool_handler_list.num_handlers; i++) {
>> +                if (!strcmp(name, mempool_handler_list.handler[i].name))
>> +                        return i;
>> +        }
>> +        return -1;
>> +}
>
> This would be replaced by a TAILQ_FOREACH().

Already discussed previously: an index is needed rather than a pointer
because of secondary processes.

>
>> +static void *
>> +rte_mempool_common_ring_alloc(struct rte_mempool *mp,
>> +        const char *name, unsigned n, int socket_id, unsigned flags)
>> +{
>> +        struct rte_ring *r;
>> +        char rg_name[RTE_RING_NAMESIZE];
>> +        int rg_flags = 0;
>> +
>> +        if (flags & MEMPOOL_F_SP_PUT)
>> +                rg_flags |= RING_F_SP_ENQ;
>> +        if (flags & MEMPOOL_F_SC_GET)
>> +                rg_flags |= RING_F_SC_DEQ;
>> +
>> +        /* allocate the ring that will be used to store objects */
>> +        /* Ring functions will return appropriate errors if we are
>> +         * running as a secondary process etc., so no checks made
>> +         * in this function for that condition */
>> +        snprintf(rg_name, sizeof(rg_name), "%s-ring", name);
>> +        r = rte_ring_create(rg_name, rte_align32pow2(n+1), socket_id,
>> rg_flags);
>> +        if (r == NULL)
>> +                return NULL;
>> +
>> +        mp->rt_pool = (void *)r;
>> +
>> +        return (void *) r;
>
> I don't think the explicit casts are required.

will change in v3

>
>> --- /dev/null
>> +++ b/lib/librte_mempool/rte_mempool_internal.h
>
> Is it the proper name?
> We could imagine a mempool handler provided by a plugin, and
> in this case this code should go in rte_mempool.h.

I was trying to keep the public APIs in rte_mempool.h, and all the private
stuff in rte_mempool_internal.h. Maybe a better name would be
rte_mempool_private.h?

>> +
>> +struct rte_mempool_handler {
>> +        char name[RTE_MEMPOOL_NAMESIZE]; /**< Name of mempool handler */
>
> I would use a const char * here instead.
>

Would we then have to allocate the memory for the string elsewhere? I would
have thought this is the more straightforward method.

>> +
>> +        rte_mempool_alloc_t alloc;
>> +
>> +        rte_mempool_put_t put __rte_cache_aligned;
>> +
>> +        rte_mempool_get_t get __rte_cache_aligned;
>> +
>> +        rte_mempool_get_count get_count __rte_cache_aligned;
>> +
>> +        rte_mempool_free_t free __rte_cache_aligned;
>> +};
>
> I agree with Jerin's comments. I don't think we should cache
> align each field. Maybe the whole structure.

Changed in v2.

>> +
>> +struct rte_mempool_handler_list {
>> +        rte_spinlock_t sl; /**< Spinlock for add/delete. */
>> +
>> +        int32_t num_handlers; /**< Number of handlers that are
>> valid. */
>> +
>> +        /* storage for all possible handlers */
>> +        struct rte_mempool_handler handler[RTE_MEMPOOL_MAX_HANDLER_IDX];
>> +};
>> +
>> +int16_t add_handler(struct rte_mempool_handler *h);
>
> I think it should be called rte_mempool_register_handler().

Agreed, changed in v2.

>> +
>> +#define REGISTER_MEMPOOL_HANDLER(h) \
>> +static int16_t __attribute__((used)) testfn_##h(void);\
>> +int16_t __attribute__((constructor, used)) testfn_##h(void)\
>> +{\
>> +        return add_handler(&h);\
>> +}
>> +
>> +#endif
>>
>
>
>
> Regards,
> Olivier

Apologies for not addressing all of your comments for v2. I'll await your
comments on the couple of outstanding questions above, then push up v3.
Mainly:

* change "rt_pool" to "mempool_storage"?
* change to const char * for mempool name, or leave as is.
* move all contents of rte_mempool_internal.h to rte_mempool.h, or leave as is.
* alternatively, change the name of rte_mempool_internal.h to rte_mempool_private.h
* I need to look into the performance of always using the cache for single
  producer/consumer.

Thanks,
David.