DPDK patches and discussions
 help / color / mirror / Atom feed
* [RFC]: mempool: zero-copy cache get bulk
@ 2022-11-05 13:19 Morten Brørup
  2022-11-07  9:19 ` Bruce Richardson
                   ` (9 more replies)
  0 siblings, 10 replies; 36+ messages in thread
From: Morten Brørup @ 2022-11-05 13:19 UTC (permalink / raw)
  To: dev, olivier.matz, andrew.rybchenko, honnappa.nagarahalli

Zero-copy access to the mempool cache is beneficial for PMD performance, and must be provided by the mempool library to fix [Bug 1052] without a performance regression.

[Bug 1052]: https://bugs.dpdk.org/show_bug.cgi?id=1052


This RFC offers two conceptual variants of zero-copy get:
1. A simple version.
2. A version where existing (hot) objects in the cache are moved to the top of the cache before new objects from the backend driver are pulled in.

I would like some early feedback. Also, which variant do you prefer?

Notes:
* Allowing the 'cache' parameter to be NULL, and getting it from the mempool instead, was inspired by rte_mempool_cache_flush().
* Asserting that the 'mp' parameter is not NULL is not done by other functions, so I omitted it here too.

NB: Please ignore formatting. Also, this code has not even been compile tested.


PS: No promises, but I expect to offer an RFC for zero-copy put too. :-)


1. Simple version:

/**
 * Get objects from a mempool via zero-copy access to a user-owned mempool cache.
 *
 * @param cache
 *   A pointer to the mempool cache.
 * @param mp
 *   A pointer to the mempool.
 * @param n
 *   The number of objects to prefetch into the mempool cache.
 * @return
 *   The pointer to the objects in the mempool cache.
 *   NULL on error
 *   with rte_errno set appropriately.
 */
static __rte_always_inline void *
rte_mempool_cache_get_bulk(struct rte_mempool_cache *cache,
        struct rte_mempool *mp,
        unsigned int n)
{
    unsigned int len;

    if (cache == NULL)
        cache = rte_mempool_default_cache(mp, rte_lcore_id());
    if (cache == NULL) {
        rte_errno = EINVAL;
        goto fail;
    }

    rte_mempool_trace_cache_get_bulk(cache, mp, n);

    len = cache->len;

    if (unlikely(n > len)) {
        unsigned int size;

        if (unlikely(n > RTE_MEMPOOL_CACHE_MAX_SIZE)) {
            rte_errno = EINVAL;
            goto fail;
        }

        /* Fill the cache from the backend; fetch size + requested - len objects. */
        size = cache->size;

        ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len], size + n - len);
        if (unlikely(ret < 0)) {
            /*
             * We are buffer constrained.
             * Do not fill the cache, just satisfy the request.
             */
            ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len], n - len);
            if (unlikely(ret < 0)) {
                rte_errno = -ret;
                goto fail;
            }

            len = 0;
        } else
            len = size;
    } else
        len -= n;

    cache->len = len;

    RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
    RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);

    return &cache->objs[len];

fail:

    RTE_MEMPOOL_STAT_ADD(mp, get_fail_bulk, 1);
    RTE_MEMPOOL_STAT_ADD(mp, get_fail_objs, n);

    return NULL;
}


2. Advanced version:

/**
 * Get objects from a mempool via zero-copy access to a user-owned mempool cache.
 *
 * @param cache
 *   A pointer to the mempool cache.
 * @param mp
 *   A pointer to the mempool.
 * @param n
 *   The number of objects to prefetch into the mempool cache.
 * @return
 *   The pointer to the objects in the mempool cache.
 *   NULL on error
 *   with rte_errno set appropriately.
 */
static __rte_always_inline void *
rte_mempool_cache_get_bulk(struct rte_mempool_cache *cache,
        struct rte_mempool *mp,
        unsigned int n)
{
    unsigned int len;

    if (cache == NULL)
        cache = rte_mempool_default_cache(mp, rte_lcore_id());
    if (cache == NULL) {
        rte_errno = EINVAL;
        goto fail;
    }

    rte_mempool_trace_cache_get_bulk(cache, mp, n);

    len = cache->len;

    if (unlikely(n > len)) {
        unsigned int size;

        if (unlikely(n > RTE_MEMPOOL_CACHE_MAX_SIZE)) {
            rte_errno = EINVAL;
            goto fail;
        }

        /* Fill the cache from the backend; fetch size + requested - len objects. */
        size = cache->size;

        if (likely(size + n >= 2 * len)) {
            /*
             * No overlap when copying (dst >= len): size + n - len >= len.
             * Move (i.e. copy) the existing objects in the cache to the
             * coming top of the cache, to make room for new objects below.
             */
            rte_memcpy(&cache->objs[size + n - len], &cache->objs[0], len);

            /* Fill the cache below the existing objects in the cache. */
            ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[0], size + n - len);
            if (unlikely(ret < 0)) {
                goto constrained;
            } else
                len = size;
        } else {
            /* Fill the cache on top of any objects in it. */
            ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len], size + n - len);
            if (unlikely(ret < 0)) {

constrained:
                /*
                 * We are buffer constrained.
                 * Do not fill the cache, just satisfy the request.
                 */
                ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len], n - len);
                if (unlikely(ret < 0)) {
                    rte_errno = -ret;
                    goto fail;
                }

                len = 0;
            } else
                len = size;
        }
    } else
        len -= n;

    cache->len = len;

    RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
    RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);

    return &cache->objs[len];

fail:

    RTE_MEMPOOL_STAT_ADD(mp, get_fail_bulk, 1);
    RTE_MEMPOOL_STAT_ADD(mp, get_fail_objs, n);

    return NULL;
}


Med venlig hilsen / Kind regards,
-Morten Brørup


^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2023-02-14 14:16 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-05 13:19 [RFC]: mempool: zero-copy cache get bulk Morten Brørup
2022-11-07  9:19 ` Bruce Richardson
2022-11-07 14:32   ` Morten Brørup
2022-11-15 16:18 ` [PATCH] mempool cache: add zero-copy get and put functions Morten Brørup
2022-11-16 18:04 ` [PATCH v2] " Morten Brørup
2022-11-29 20:54   ` Kamalakshitha Aligeri
2022-11-30 10:21     ` Morten Brørup
2022-12-22 15:57   ` Konstantin Ananyev
2022-12-22 17:55     ` Morten Brørup
2022-12-23 16:58       ` Konstantin Ananyev
2022-12-24 12:17         ` Morten Brørup
2022-12-24 11:49 ` [PATCH v3] " Morten Brørup
2022-12-24 11:55 ` [PATCH v4] " Morten Brørup
2022-12-27  9:24   ` Andrew Rybchenko
2022-12-27 10:31     ` Morten Brørup
2022-12-27 15:17 ` [PATCH v5] " Morten Brørup
2023-01-22 20:34   ` Konstantin Ananyev
2023-01-22 21:17     ` Morten Brørup
2023-01-23 11:53       ` Konstantin Ananyev
2023-01-23 12:23         ` Morten Brørup
2023-01-23 12:52           ` Konstantin Ananyev
2023-01-23 14:30           ` Bruce Richardson
2023-01-24  1:53             ` Kamalakshitha Aligeri
2023-02-09 14:39 ` [PATCH v6] " Morten Brørup
2023-02-09 14:52 ` [PATCH v7] " Morten Brørup
2023-02-09 14:58 ` [PATCH v8] " Morten Brørup
2023-02-10  8:35   ` fengchengwen
2023-02-12 19:56   ` Honnappa Nagarahalli
2023-02-12 23:15     ` Morten Brørup
2023-02-13  4:29       ` Honnappa Nagarahalli
2023-02-13  9:30         ` Morten Brørup
2023-02-13  9:37         ` Olivier Matz
2023-02-13 10:25           ` Morten Brørup
2023-02-14 14:16             ` Andrew Rybchenko
2023-02-13 12:24 ` [PATCH v9] " Morten Brørup
2023-02-13 14:33   ` Kamalakshitha Aligeri

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).