From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id BC3F045DD5; Fri, 6 Dec 2024 18:47:00 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id CF8C1410E8; Fri, 6 Dec 2024 18:46:40 +0100 (CET) Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) by mails.dpdk.org (Postfix) with ESMTP id F29D440E27 for ; Fri, 6 Dec 2024 18:46:39 +0100 (CET) Received: from mail.maildlp.com (unknown [172.18.186.231]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4Y4dsN4v1dz6K5mZ; Sat, 7 Dec 2024 01:43:36 +0800 (CST) Received: from frapeml500007.china.huawei.com (unknown [7.182.85.172]) by mail.maildlp.com (Postfix) with ESMTPS id 92A0C1404F5; Sat, 7 Dec 2024 01:46:39 +0800 (CST) Received: from localhost.localdomain (10.220.239.45) by frapeml500007.china.huawei.com (7.182.85.172) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Fri, 6 Dec 2024 18:46:38 +0100 From: Konstantin Ananyev To: CC: , , , , , , , , Subject: [PATCH v11 5/7] ring/soring: introduce Staged Ordered Ring Date: Fri, 6 Dec 2024 13:35:58 -0500 Message-ID: <20241206183600.34758-6-konstantin.ananyev@huawei.com> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20241206183600.34758-1-konstantin.ananyev@huawei.com> References: <20241111141910.40604-1-konstantin.ananyev@huawei.com> <20241206183600.34758-1-konstantin.ananyev@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-Originating-IP: [10.220.239.45] X-ClientProxiedBy: frapeml500004.china.huawei.com (7.182.85.22) To frapeml500007.china.huawei.com (7.182.85.172) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Staged-Ordered-Ring (SORING) provides a SW abstraction for 'ordered' queues with multiple processing 'stages'. It is based on conventional DPDK rte_ring, re-uses many of its concepts, and even substantial part of its code. It can be viewed as an 'extension' of rte_ring functionality. In particular, main SORING properties: - circular ring buffer with fixed size objects - producer, consumer plus multiple processing stages in the middle. - allows to split objects processing into multiple stages. - objects remain in the same ring while moving from one stage to the other, initial order is preserved, no extra copying needed. - preserves the ingress order of objects within the queue across multiple stages, i.e.: at the same stage multiple threads can process objects from the ring in any order, but for the next stage objects will always appear in the original order. - each stage (and producer/consumer) can be served by single and/or multiple threads. - number of stages, size and number of objects in the ring are configurable at ring initialization time. Data-path API provides four main operations: - enqueue/dequeue works in the same manner as for conventional rte_ring, all rte_ring synchronization types are supported. - acquire/release - for each stage there is an acquire (start) and release (finish) operation. after some objects are 'acquired' - given thread can safely assume that it has exclusive possession of these objects till 'release' for them is invoked. Note that right now user has to release exactly the same number of objects that was acquired before. After 'release', objects can be 'acquired' by next stage and/or dequeued by the consumer (in case of last stage). Expected use-case: applications that uses pipeline model (probably with multiple stages) for packet processing, when preserving incoming packet order is important. I.E.: IPsec processing, etc. Signed-off-by: Eimear Morrissey Signed-off-by: Konstantin Ananyev Acked-by: Morten Brørup Acked-by: Stephen Hemminger --- devtools/build-dict.sh | 1 + doc/api/doxy-api-index.md | 1 + doc/guides/prog_guide/img/soring-pic1.svg | 635 ++++++++++++++++++++++ doc/guides/prog_guide/ring_lib.rst | 202 +++++++ doc/guides/rel_notes/release_25_03.rst | 8 + lib/ring/meson.build | 4 +- lib/ring/rte_soring.c | 198 +++++++ lib/ring/rte_soring.h | 557 +++++++++++++++++++ lib/ring/soring.c | 613 +++++++++++++++++++++ lib/ring/soring.h | 138 +++++ lib/ring/version.map | 19 + 11 files changed, 2374 insertions(+), 2 deletions(-) create mode 100644 doc/guides/prog_guide/img/soring-pic1.svg create mode 100644 lib/ring/rte_soring.c create mode 100644 lib/ring/rte_soring.h create mode 100644 lib/ring/soring.c create mode 100644 lib/ring/soring.h diff --git a/devtools/build-dict.sh b/devtools/build-dict.sh index 32cf6b5735..4eb70fecc2 100755 --- a/devtools/build-dict.sh +++ b/devtools/build-dict.sh @@ -23,6 +23,7 @@ sed '/^uint->/d' | sed "/^doesn'->/d" | sed '/^wasn->/d' | sed '/^stdio->/d' | +sed '/^soring->/d' | # print to stdout cat diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index f0193502bc..b2fc24b3e4 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -174,6 +174,7 @@ The public API headers are grouped by topics: [mbuf](@ref rte_mbuf.h), [mbuf pool ops](@ref rte_mbuf_pool_ops.h), [ring](@ref rte_ring.h), + [soring](@ref rte_soring.h), [stack](@ref rte_stack.h), [tailq](@ref rte_tailq.h), [bitset](@ref rte_bitset.h), diff --git a/doc/guides/prog_guide/img/soring-pic1.svg b/doc/guides/prog_guide/img/soring-pic1.svg new file mode 100644 index 0000000000..c97e66ca43 --- /dev/null +++ b/doc/guides/prog_guide/img/soring-pic1.svg @@ -0,0 +1,635 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + image/svg+xml + + + + + + + + + + + + + + + + + obj5 + obj1 + obj2 + obj3 + + cons_head + cons_tail + prod_head + prod_tail + + lstages states + + producer and consumer states + + + stage[1].tail + stage[1].head + stage[0].tail + + + obj4 + stage[0].head + + diff --git a/doc/guides/prog_guide/ring_lib.rst b/doc/guides/prog_guide/ring_lib.rst index f7dbba0e4e..70a454972e 100644 --- a/doc/guides/prog_guide/ring_lib.rst +++ b/doc/guides/prog_guide/ring_lib.rst @@ -491,6 +491,208 @@ Following is an example of usage: Note that between ``_start_`` and ``_finish_`` no other thread can proceed with enqueue(/dequeue) operation till ``_finish_`` completes. +Staged Ordered Ring API +----------------------- + +Staged-Ordered-Ring (SORING) API provides a SW abstraction for *ordered* queues +with multiple processing *stages*. It is based on conventional DPDK +``rte_ring`` API, re-uses many of its concepts, and even substantial part of +its code. It can be viewed as an 'extension' of ``rte_ring`` functionality. +In particular, main SORING properties: + +* circular ring buffer with fixed size objects and related metadata + +* producer, consumer plus multiple processing stages in between. + +* allows to split objects processing into multiple stages. + +* objects remain in the same ring while moving from one stage to the other, + initial order is preserved, no extra copying needed. + +* preserves the ingress order of objects within the queue across multiple + stages + +* each stage (and producer/consumer) can be served by single and/or + multiple threads. + +* number of stages, size and number of objects and their metadata in the + ring are configurable at ring initialization time. + +Data-Path API +~~~~~~~~~~~~~ + +SORING data-path API provided four main operations: + +* ``enqueue``/``dequeue`` works in the same manner as for conventional + ``rte_ring``, all rte_ring synchronization types are supported. + +* ``acquire``/``release`` - for each stage there is an ``acquire`` (start) + and ``release`` (finish) operation. + After some objects are ``acquired`` - given thread can safely assume that + it has exclusive possession of these objects till ``release`` for them is + invoked. + Note that right now user has to release exactly the same number of + objects that was acquired before. + After objects are ``released``, given thread loses its possession on them, + and they can be either acquired by next stage or dequeued + by the consumer (in case of last stage). + +A simplified representation of a SORING with two stages is shown below. +On that picture ``obj5`` and ``obj4`` elements are acquired by stage 0, +``obj2`` and ``obj3`` are acquired by stage 1, while ``obj11`` was already +released by stage 1 and is ready to be consumed. + +.. _figure_soring1: + +.. figure:: img/soring-pic1.* + +Along with traditional flavor there are enhanced versions for all these +data-path operations: ``enqueux``/``dequeux``/``acquirx``/``releasx``. +All enhanced versions take as extra parameter a pointer to an array of +metadata values. +At initialization user can request within the ``soring`` supplementary and +optional array of metadata associated with each object in the ``soring``. +While ``soring`` element size is configurable and user can specify it big +enough to hold both object and its metadata together, +for performance reasons it might be plausible to access them as separate arrays. +Note that users are free to mix and match both versions of data-path API in +a way they like. +As an example, possible usage scenario when such separation helps: + +.. code-block:: c + + /* + * use pointer to mbuf as soring element, while tx_state + * as a metadata. + * In this example we use a soring with just one stage. + */ + union tx_state { + /* negative values for error */ + int32_t rc; + /* otherwise contain valid TX port and queue IDs*/ + struct { + uint16_t port_id; + uint16_t queue_id; + } tx; + }; + struct rte_soring *soring; + + +producer/consumer part: + +.. code-block:: c + + struct rte_mbuf *pkts[MAX_PKT_BURST]; + union tx_state txst[MAX_PKT_BURST]; + ... + /* enqueue - writes to soring objects array no need to update metadata */ + uint32_t num = MAX_PKT_BURST; + num = rte_soring_enqueue_burst(soring, pkts, num, NULL); + .... + /* dequeux - reads both packets and related tx_state */ + uint32_t num = MAX_PKT_BURST; + num = rte_soring_dequeux_burst(soring, pkts, txst, num, NULL); + + /* + * TX packets out, or drop in case of error. + * Note that we don't need to dereference the soring objects itself + * to make a decision. + */ + uint32_t i, j, k, n; + struct rte_mbuf *dr[MAX_PKT_BURST]; + + k = 0; + for (i = 0; i != num; i++) { + /* packet processing reports an error */ + if (txst[i].rc < 0) + dr[k++] = pkts[i]; + /* valid packet, send it out */ + else { + /* group consequitive packets with the same port and queue IDs */ + for (j = i + 1; j < num; j++) + if (txst[j].rc != txst[i].rc) + break; + + n = rte_eth_tx_burst(txst[i].tx.port_id, txst[i].tx.queue_id, + pkts + i, j - i); + if (i + n != j) { + /* decide with unsent packets if any */ + } + } + } + /* drop erroneous packets */ + if (k != 0) + rte_pktmbuf_free_bulk(dr, k); + +acquire/release part: + +.. code-block:: c + + uint32_t ftoken; + struct rte_mbuf *pkts[MAX_PKT_BURST]; + union tx_state txst[MAX_PKT_BURST]; + ... + /* acquire - grab some packets to process */ + uint32_t num = MAX_PKT_BURST; + num = rte_soring_acquire_burst(soring, pkts, 0, num, &ftoken, NULL); + + /* process packets, fill txst[] for each */ + do_process_packets(pkts, txst, num); + + /* + * release - assuming that do_process_packets() didn't change + * contents of pkts[], we need to update soring metadata array only. + */ + rte_soring_releasx(soring, NULL, txst, 0, num, ftoken); + +Use Cases +~~~~~~~~~~ + +Expected use-cases include applications that use pipeline model +(probably with multiple stages) for packet processing, when preserving +incoming packet order is important. I.E.: IPsec processing, etc. + +SORING internals +~~~~~~~~~~~~~~~~ + +* In addition to accessible by the user array of objects (and metadata), + ``soirng`` also contains an internal array of states. Each ``state[]`` + corresponds to exactly one object within the soring. That ``state[]`` + array is used by ``acquire``/``release``/``dequeue`` operations to + store internal information and should not be accessed by the user directly. + +* At ``acquire``, soring moves stage's head (in a same way as ``rte_ring`` + ``move_head`` does), plus it saves in ``state[stage.old_head]`` + information about how many elements were acquired, acquired head position, + and special flag value to indicate that given elements are acquired + (``SORING_ST_START``). + Note that ``acquire`` returns an opaque ``ftoken`` value that user has + to provide for ``release`` function. + +* ``release`` extracts old head value from provided by user ``ftoken`` and + checks that corresponding ``state[]`` entry contains expected values + (mostly for sanity purposes). Then it marks this ``state[]`` entry with + ``SORING_ST_FINISH`` flag to indicate that given subset of objects was + released. After that, it checks does stage's old ``head`` value equals to + its current ``tail`` value. If so, then it performs ``finalize`` + operation, otherwise ``release`` just returns. + +* As ``state[]`` is shared by all threads, some other thread can perform + ``finalize`` operation for given stage. That allows ``release`` to avoid + excessive waits on the ``tail`` value. + Main purpose of ``finalize`` operation is to walk through ``state[]`` + array from current stage's ``tail`` position up to its ``head``, + check ``state[]`` and move stage ``tail`` through elements that already + are released (in ``SORING_ST_FINISH`` state). + Along with that, corresponding ``state[]`` entries are reset back to zero. + Note that ``finalize`` for given stage can be called from multiple places: + from ``release`` for that stage or from ``acquire`` for next stage, or + even from consumer's ``dequeue`` - in case given stage is the last one. + So ``finalize`` has to be MT-safe and inside it we have to guarantee that + at any given moment only one thread can update stage's ``tail`` and reset + corresponding ``state[]`` entries. + + References ---------- diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst index 426dfcd982..d2ca7d0ee5 100644 --- a/doc/guides/rel_notes/release_25_03.rst +++ b/doc/guides/rel_notes/release_25_03.rst @@ -55,6 +55,14 @@ New Features Also, make sure to start the actual text at the margin. ======================================================= +* **Add Staged-Ordered-Ring (SORING) API to the rte_ring library.** + + New API to the ring library to provide a SW abstraction for + 'ordered' queues with multiple processing 'stages'. + It is based on conventional DPDK rte_ring, re-uses many of its concepts, + and even substantial part of its code. + It can be viewed as an 'extension' of rte_ring functionality. + Removed Items ------------- diff --git a/lib/ring/meson.build b/lib/ring/meson.build index 7fca958ed7..21f2c12989 100644 --- a/lib/ring/meson.build +++ b/lib/ring/meson.build @@ -1,8 +1,8 @@ # SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2017 Intel Corporation -sources = files('rte_ring.c') -headers = files('rte_ring.h') +sources = files('rte_ring.c', 'rte_soring.c', 'soring.c') +headers = files('rte_ring.h', 'rte_soring.h') # most sub-headers are not for direct inclusion indirect_headers += files ( 'rte_ring_core.h', diff --git a/lib/ring/rte_soring.c b/lib/ring/rte_soring.c new file mode 100644 index 0000000000..c0eae680e7 --- /dev/null +++ b/lib/ring/rte_soring.c @@ -0,0 +1,198 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Huawei Technologies Co., Ltd + */ + +#include + +#include "soring.h" +#include + +RTE_LOG_REGISTER_DEFAULT(soring_logtype, INFO); + +static uint32_t +soring_calc_elem_num(uint32_t count) +{ + return rte_align32pow2(count + 1); +} + +static int +soring_check_param(uint32_t esize, uint32_t msize, uint32_t count, + uint32_t stages) +{ + if (stages == 0) { + SORING_LOG(ERR, "invalid number of stages: %u", stages); + return -EINVAL; + } + + /* Check if element size is a multiple of 4B */ + if (esize == 0 || esize % 4 != 0) { + SORING_LOG(ERR, "invalid element size: %u", esize); + return -EINVAL; + } + + /* Check if size of metadata is a multiple of 4B */ + if (msize % 4 != 0) { + SORING_LOG(ERR, "invalid metadata size: %u", msize); + return -EINVAL; + } + + /* count must be a power of 2 */ + if (rte_is_power_of_2(count) == 0 || + (count > RTE_SORING_ELEM_MAX + 1)) { + SORING_LOG(ERR, "invalid number of elements: %u", count); + return -EINVAL; + } + + return 0; +} + +/* + * Calculate size offsets for SORING internal data layout. + */ +static size_t +soring_get_szofs(uint32_t esize, uint32_t msize, uint32_t count, + uint32_t stages, size_t *elst_ofs, size_t *state_ofs, + size_t *stage_ofs) +{ + size_t sz; + const struct rte_soring * const r = NULL; + + sz = sizeof(r[0]) + (size_t)count * esize; + sz = RTE_ALIGN(sz, RTE_CACHE_LINE_SIZE); + + if (elst_ofs != NULL) + *elst_ofs = sz; + + sz = sz + (size_t)count * msize; + sz = RTE_ALIGN(sz, RTE_CACHE_LINE_SIZE); + + if (state_ofs != NULL) + *state_ofs = sz; + + sz += sizeof(r->state[0]) * count; + sz = RTE_ALIGN(sz, RTE_CACHE_LINE_SIZE); + + if (stage_ofs != NULL) + *stage_ofs = sz; + + sz += sizeof(r->stage[0]) * stages; + sz = RTE_ALIGN(sz, RTE_CACHE_LINE_SIZE); + + return sz; +} + +static void +soring_dump_stage_headtail(FILE *f, const char *prefix, + struct soring_stage *st) +{ + fprintf(f, "%stail.pos=%"PRIu32"\n", prefix, st->sht.tail.pos); + fprintf(f, "%stail.sync=%"PRIu32"\n", prefix, st->sht.tail.sync); + fprintf(f, "%shead=%"PRIu32"\n", prefix, st->sht.head); +} + +void +rte_soring_dump(FILE *f, const struct rte_soring *r) +{ + uint32_t i; + char buf[32]; + + if (f == NULL || r == NULL) + return; + + fprintf(f, "soring <%s>@%p\n", r->name, r); + fprintf(f, " size=%"PRIu32"\n", r->size); + fprintf(f, " capacity=%"PRIu32"\n", r->capacity); + fprintf(f, " esize=%"PRIu32"\n", r->esize); + fprintf(f, " msize=%"PRIu32"\n", r->msize); + fprintf(f, " used=%u\n", rte_soring_count(r)); + fprintf(f, " avail=%u\n", rte_soring_free_count(r)); + + rte_ring_headtail_dump(f, " cons.", &(r->cons.ht)); + rte_ring_headtail_dump(f, " prod.", &(r->prod.ht)); + + fprintf(f, " nb_stage=%"PRIu32"\n", r->nb_stage); + for (i = 0; i < r->nb_stage; i++) { + snprintf(buf, sizeof(buf), " stage[%u].", i); + soring_dump_stage_headtail(f, buf, r->stage + i); + } +} + +ssize_t +rte_soring_get_memsize(const struct rte_soring_param *prm) +{ + int32_t rc; + uint32_t count; + + count = soring_calc_elem_num(prm->elems); + rc = soring_check_param(prm->elem_size, prm->meta_size, count, + prm->stages); + if (rc != 0) + return rc; + + return soring_get_szofs(prm->elem_size, prm->meta_size, count, + prm->stages, NULL, NULL, NULL); +} + +/* compilation-time checks */ +static void +soring_compilation_checks(void) +{ + RTE_BUILD_BUG_ON((sizeof(struct rte_soring) & + RTE_CACHE_LINE_MASK) != 0); + RTE_BUILD_BUG_ON((offsetof(struct rte_soring, cons) & + RTE_CACHE_LINE_MASK) != 0); + RTE_BUILD_BUG_ON((offsetof(struct rte_soring, prod) & + RTE_CACHE_LINE_MASK) != 0); + + RTE_BUILD_BUG_ON(offsetof(struct rte_ring_headtail, tail) != + offsetof(struct soring_stage_headtail, tail.pos)); + RTE_BUILD_BUG_ON(offsetof(struct rte_ring_headtail, sync_type) != + offsetof(struct soring_stage_headtail, unused)); +} + +int +rte_soring_init(struct rte_soring *r, const struct rte_soring_param *prm) +{ + int32_t rc; + uint32_t n; + size_t meta_ofs, stage_ofs, state_ofs; + + soring_compilation_checks(); + + if (r == NULL || prm == NULL) + return -EINVAL; + + n = soring_calc_elem_num(prm->elems); + rc = soring_check_param(prm->elem_size, prm->meta_size, n, prm->stages); + if (rc != 0) + return rc; + + soring_get_szofs(prm->elem_size, prm->meta_size, n, prm->stages, + &meta_ofs, &state_ofs, &stage_ofs); + + memset(r, 0, sizeof(*r)); + rc = strlcpy(r->name, prm->name, sizeof(r->name)); + if (rc < 0 || rc >= (int)sizeof(r->name)) + return -ENAMETOOLONG; + + r->size = n; + r->mask = r->size - 1; + r->capacity = prm->elems; + r->esize = prm->elem_size; + r->msize = prm->meta_size; + + r->prod.ht.sync_type = prm->prod_synt; + r->cons.ht.sync_type = prm->cons_synt; + + r->state = (union soring_state *)((uintptr_t)r + state_ofs); + memset(r->state, 0, sizeof(r->state[0]) * r->size); + + r->stage = (struct soring_stage *)((uintptr_t)r + stage_ofs); + r->nb_stage = prm->stages; + memset(r->stage, 0, r->nb_stage * sizeof(r->stage[0])); + + if (r->msize != 0) + r->meta = (void *)((uintptr_t)r + meta_ofs); + + return 0; +} diff --git a/lib/ring/rte_soring.h b/lib/ring/rte_soring.h new file mode 100644 index 0000000000..1c3a798e0f --- /dev/null +++ b/lib/ring/rte_soring.h @@ -0,0 +1,557 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Huawei Technologies Co., Ltd + */ + +#ifndef _RTE_SORING_H_ +#define _RTE_SORING_H_ + +/** + * @file + * This file contains definition of DPDK soring (Staged Ordered Ring) + * public API. + * Brief description: + * enqueue/dequeue works the same as for conventional rte_ring: + * any rte_ring sync types can be used, etc. + * Plus there could be multiple 'stages'. + * For each stage there is an acquire (start) and release (finish) operation. + * after some elems are 'acquired' - user can safely assume that he has + * exclusive possession of these elems till 'release' for them is done. + * Note that right now user has to release exactly the same number of elems + * he acquired before. + * After 'release', elems can be 'acquired' by next stage and/or dequeued + * (in case of last stage). + * Extra debugging might be enabled with RTE_SORING_DEBUG macro. + */ + +#include + +#ifdef __cplusplus +extern "C" { +#endif + +/** upper 2 bits are used for status */ +#define RTE_SORING_ST_BIT 30 + +/** max possible number of elements in the soring */ +#define RTE_SORING_ELEM_MAX (RTE_BIT32(RTE_SORING_ST_BIT) - 1) + +struct rte_soring_param { + /** expected name of the soring */ + const char *name; + /** number of elements in the soring */ + uint32_t elems; + /** size of elements in the soring, must be a multiple of 4 */ + uint32_t elem_size; + /** + * size of metadata for each elem, must be a multiple of 4. + * This parameter defines a size of supplementary and optional + * array of metadata associated with each object in the soring. + * While element size is configurable (see 'elem_size' parameter above), + * so user can specify it big enough to hold both object and its + * metadata together, for performance reasons it might be plausible + * to access them as separate arrays. + * Common usage scenario when such separation helps: + * enqueue() - writes to objects array + * acquire() - reads from objects array + * release() - writes to metadata array (as an example: return code) + * dequeue() - reads both objects and metadata array + */ + uint32_t meta_size; + /** number of stages in the soring */ + uint32_t stages; + /** sync type for producer */ + enum rte_ring_sync_type prod_synt; + /** sync type for consumer */ + enum rte_ring_sync_type cons_synt; +}; + +struct rte_soring; + +/** + * Calculate the memory size needed for a soring + * + * This function returns the number of bytes needed for a soring, given + * the expected parameters for it. This value is the sum of the size of + * the internal metadata and the size of the memory needed by the + * actual soring elements and their metadata. The value is aligned to a cache + * line size. + * + * @param prm + * Pointer to the structure that contains soring creation parameters. + * @return + * - The memory size needed for the soring on success. + * - -EINVAL if provided parameter values are invalid. + */ +__rte_experimental +ssize_t +rte_soring_get_memsize(const struct rte_soring_param *prm); + +/** + * Initialize a soring structure. + * + * Initialize a soring structure in memory pointed by "r". + * The size of the memory area must be large enough to store the soring + * internal structures plus the objects and metadata tables. + * It is strongly advised to use @ref rte_soring_get_memsize() to get the + * appropriate size. + * + * @param r + * Pointer to the soring structure. + * @param prm + * Pointer to the structure that contains soring creation parameters. + * @return + * - 0 on success, or a negative error code. + */ +__rte_experimental +int +rte_soring_init(struct rte_soring *r, const struct rte_soring_param *prm); + +/** + * Return the total number of filled entries in a soring. + * + * @param r + * A pointer to the soring structure. + * @return + * The number of entries in the soring. + */ +__rte_experimental +unsigned int +rte_soring_count(const struct rte_soring *r); + +/** + * Return the total number of unfilled entries in a soring. + * + * @param r + * A pointer to the soring structure. + * @return + * The number of free entries in the soring. + */ +__rte_experimental +unsigned int +rte_soring_free_count(const struct rte_soring *r); + +/** + * Dump the status of the soring + * + * @param f + * A pointer to a file for output + * @param r + * Pointer to the soring structure. + */ +__rte_experimental +void +rte_soring_dump(FILE *f, const struct rte_soring *r); + +/** + * Enqueue several objects on the soring. + * Enqueues exactly requested number of objects or none. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to enqueue. + * Size of objects to enqueue must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param n + * The number of objects to add in the soring from the 'objs'. + * @param free_space + * if non-NULL, returns the amount of space in the soring after the + * enqueue operation has finished. + * @return + * - Actual number of objects enqueued, either 0 or n. + */ +__rte_experimental +uint32_t +rte_soring_enqueue_bulk(struct rte_soring *r, const void *objs, + uint32_t n, uint32_t *free_space); + +/** + * Enqueue several objects plus metadata on the soring. + * Enqueues exactly requested number of objects or none. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to enqueue. + * Size of objects to enqueue must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param meta + * A pointer to an array of metadata values for each object to enqueue. + * Note that if user not using object metadata values, then this parameter + * can be NULL. + * Size of elements in this array must be the same value as 'meta_size' + * parameter used while creating the soring. If user created the soring with + * 'meta_size' value equals zero, then 'meta' parameter should be NULL. + * Otherwise the results are undefined. + * @param n + * The number of objects to add in the soring from the 'objs'. + * @param free_space + * if non-NULL, returns the amount of space in the soring after the + * enqueue operation has finished. + * @return + * - Actual number of objects enqueued, either 0 or n. + */ +__rte_experimental +uint32_t +rte_soring_enqueux_bulk(struct rte_soring *r, const void *objs, + const void *meta, uint32_t n, uint32_t *free_space); + +/** + * Enqueue several objects on the soring. + * Enqueues up to requested number of objects. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to enqueue. + * Size of objects to enqueue must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param n + * The number of objects to add in the soring from the 'objs'. + * @param free_space + * if non-NULL, returns the amount of space in the soring after the + * enqueue operation has finished. + * @return + * - Actual number of objects enqueued. + */ +__rte_experimental +uint32_t +rte_soring_enqueue_burst(struct rte_soring *r, const void *objs, + uint32_t n, uint32_t *free_space); + +/** + * Enqueue several objects plus metadata on the soring. + * Enqueues up to requested number of objects. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to enqueue. + * Size of objects to enqueue must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param meta + * A pointer to an array of metadata values for each object to enqueue. + * Note that if user not using object metadata values, then this parameter + * can be NULL. + * Size of elements in this array must be the same value as 'meta_size' + * parameter used while creating the soring. If user created the soring with + * 'meta_size' value equals zero, then 'meta' parameter should be NULL. + * Otherwise the results are undefined. + * @param n + * The number of objects to add in the soring from the 'objs'. + * @param free_space + * if non-NULL, returns the amount of space in the soring after the + * enqueue operation has finished. + * @return + * - Actual number of objects enqueued. + */ +__rte_experimental +uint32_t +rte_soring_enqueux_burst(struct rte_soring *r, const void *objs, + const void *meta, uint32_t n, uint32_t *free_space); + +/** + * Dequeue several objects from the soring. + * Dequeues exactly requested number of objects or none. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to dequeue. + * Size of objects to enqueue must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param num + * The number of objects to dequeue from the soring into the objs. + * @param available + * If non-NULL, returns the number of remaining soring entries after the + * dequeue has finished. + * @return + * - Actual number of objects dequeued, either 0 or 'num'. + */ +__rte_experimental +uint32_t +rte_soring_dequeue_bulk(struct rte_soring *r, void *objs, + uint32_t num, uint32_t *available); + +/** + * Dequeue several objects plus metadata from the soring. + * Dequeues exactly requested number of objects or none. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to dequeue. + * Size of objects to enqueue must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param meta + * A pointer to array of metadata values for each object to dequeue. + * Note that if user not using object metadata values, then this parameter + * can be NULL. + * Size of elements in this array must be the same value as 'meta_size' + * parameter used while creating the soring. If user created the soring with + * 'meta_size' value equals zero, then 'meta' parameter should be NULL. + * Otherwise the results are undefined. + * @param num + * The number of objects to dequeue from the soring into the objs. + * @param available + * If non-NULL, returns the number of remaining soring entries after the + * dequeue has finished. + * @return + * - Actual number of objects dequeued, either 0 or 'num'. + */ +__rte_experimental +uint32_t +rte_soring_dequeux_bulk(struct rte_soring *r, void *objs, void *meta, + uint32_t num, uint32_t *available); + +/** + * Dequeue several objects from the soring. + * Dequeues up to requested number of objects. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to dequeue. + * Size of objects to enqueue must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param num + * The number of objects to dequeue from the soring into the objs. + * @param available + * If non-NULL, returns the number of remaining soring entries after the + * dequeue has finished. + * @return + * - Actual number of objects dequeued. + */ +__rte_experimental +uint32_t +rte_soring_dequeue_burst(struct rte_soring *r, void *objs, + uint32_t num, uint32_t *available); + +/** + * Dequeue several objects plus metadata from the soring. + * Dequeues up to requested number of objects. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to dequeue. + * Size of objects to enqueue must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param meta + * A pointer to array of metadata values for each object to dequeue. + * Note that if user not using object metadata values, then this parameter + * can be NULL. + * Size of elements in this array must be the same value as 'meta_size' + * parameter used while creating the soring. If user created the soring with + * 'meta_size' value equals zero, then 'meta' parameter should be NULL. + * Otherwise the results are undefined. + * @param num + * The number of objects to dequeue from the soring into the objs. + * @param available + * If non-NULL, returns the number of remaining soring entries after the + * dequeue has finished. + * @return + * - Actual number of objects dequeued. + */ +__rte_experimental +uint32_t +rte_soring_dequeux_burst(struct rte_soring *r, void *objs, void *meta, + uint32_t num, uint32_t *available); + +/** + * Acquire several objects from the soring for given stage. + * Acquires exactly requested number of objects or none. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to acquire. + * Size of objects must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param stage + * Stage to acquire objects for. + * @param num + * The number of objects to acquire. + * @param ftoken + * Pointer to the opaque 'token' value used by release() op. + * User has to store this value somewhere, and later provide to the + * release(). + * @param available + * If non-NULL, returns the number of remaining soring entries for given stage + * after the acquire has finished. + * @return + * - Actual number of objects acquired, either 0 or 'num'. + */ +__rte_experimental +uint32_t +rte_soring_acquire_bulk(struct rte_soring *r, void *objs, + uint32_t stage, uint32_t num, uint32_t *ftoken, uint32_t *available); + +/** + * Acquire several objects plus metadata from the soring for given stage. + * Acquires exactly requested number of objects or none. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to acquire. + * Size of objects must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param meta + * A pointer to an array of metadata values for each for each acquired object. + * Note that if user not using object metadata values, then this parameter + * can be NULL. + * Size of elements in this array must be the same value as 'meta_size' + * parameter used while creating the soring. If user created the soring with + * 'meta_size' value equals zero, then 'meta' parameter should be NULL. + * Otherwise the results are undefined. + * @param stage + * Stage to acquire objects for. + * @param num + * The number of objects to acquire. + * @param ftoken + * Pointer to the opaque 'token' value used by release() op. + * User has to store this value somewhere, and later provide to the + * release(). + * @param available + * If non-NULL, returns the number of remaining soring entries for given stage + * after the acquire has finished. + * @return + * - Actual number of objects acquired, either 0 or 'num'. + */ +__rte_experimental +uint32_t +rte_soring_acquirx_bulk(struct rte_soring *r, void *objs, void *meta, + uint32_t stage, uint32_t num, uint32_t *ftoken, uint32_t *available); + +/** + * Acquire several objects from the soring for given stage. + * Acquires up to requested number of objects. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to acquire. + * Size of objects must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param stage + * Stage to acquire objects for. + * @param num + * The number of objects to acquire. + * @param ftoken + * Pointer to the opaque 'token' value used by release() op. + * User has to store this value somewhere, and later provide to the + * release(). + * @param available + * If non-NULL, returns the number of remaining soring entries for given stage + * after the acquire has finished. + * @return + * - Actual number of objects acquired. + */ +__rte_experimental +uint32_t +rte_soring_acquire_burst(struct rte_soring *r, void *objs, + uint32_t stage, uint32_t num, uint32_t *ftoken, uint32_t *available); + +/** + * Acquire several objects plus metadata from the soring for given stage. + * Acquires up to requested number of objects. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to acquire. + * Size of objects must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param meta + * A pointer to an array of metadata values for each for each acquired object. + * Note that if user not using object metadata values, then this parameter + * can be NULL. + * Size of elements in this array must be the same value as 'meta_size' + * parameter used while creating the soring. If user created the soring with + * 'meta_size' value equals zero, then 'meta' parameter should be NULL. + * Otherwise the results are undefined. + * @param stage + * Stage to acquire objects for. + * @param num + * The number of objects to acquire. + * @param ftoken + * Pointer to the opaque 'token' value used by release() op. + * User has to store this value somewhere, and later provide to the + * release(). + * @param available + * If non-NULL, returns the number of remaining soring entries for given stage + * after the acquire has finished. + * @return + * - Actual number of objects acquired. + */ +__rte_experimental +uint32_t +rte_soring_acquirx_burst(struct rte_soring *r, void *objs, void *meta, + uint32_t stage, uint32_t num, uint32_t *ftoken, uint32_t *available); + +/** + * Release several objects for given stage back to the soring. + * Note that it means these objects become available for next stage or + * dequeue. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to release. + * Note that unless user needs to overwrite soring objects this parameter + * can be NULL. + * Size of objects must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param stage + * Current stage. + * @param n + * The number of objects to release. + * Has to be the same value as returned by acquire() op. + * @param ftoken + * Opaque 'token' value obtained from acquire() op. + */ +__rte_experimental +void +rte_soring_release(struct rte_soring *r, const void *objs, + uint32_t stage, uint32_t n, uint32_t ftoken); + +/** + * Release several objects plus metadata for given stage back to the soring. + * Note that it means these objects become available for next stage or + * dequeue. + * + * @param r + * A pointer to the soring structure. + * @param objs + * A pointer to an array of objects to release. + * Note that unless user needs to overwrite soring objects this parameter + * can be NULL. + * Size of objects must be the same value as 'elem_size' parameter + * used while creating the soring. Otherwise the results are undefined. + * @param meta + * A pointer to an array of metadata values for each object to release. + * Note that if user not using object metadata values, then this parameter + * can be NULL. + * Size of elements in this array must be the same value as 'meta_size' + * parameter used while creating the soring. If user created the soring with + * 'meta_size' value equals zero, then meta parameter should be NULL. + * Otherwise the results are undefined. + * @param stage + * Current stage. + * @param n + * The number of objects to release. + * Has to be the same value as returned by acquire() op. + * @param ftoken + * Opaque 'token' value obtained from acquire() op. + */ +__rte_experimental +void +rte_soring_releasx(struct rte_soring *r, const void *objs, + const void *meta, uint32_t stage, uint32_t n, uint32_t ftoken); + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_SORING_H_ */ diff --git a/lib/ring/soring.c b/lib/ring/soring.c new file mode 100644 index 0000000000..e8fe890597 --- /dev/null +++ b/lib/ring/soring.c @@ -0,0 +1,613 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Huawei Technologies Co., Ltd + */ + +/** + * @file + * This file contains implementation of SORING 'datapath' functions. + * Brief description: + * ================== + * enqueue/dequeue works the same as for conventional rte_ring: + * any rte_ring sync types can be used, etc. + * Plus there could be multiple 'stages'. + * For each stage there is an acquire (start) and release (finish) operation. + * After some elems are 'acquired' - user can safely assume that he has + * exclusive possession of these elems till 'release' for them is done. + * Note that right now user has to release exactly the same number of elems + * he acquired before. + * After 'release', elems can be 'acquired' by next stage and/or dequeued + * (in case of last stage). + * Internal structure: + * =================== + * In addition to 'normal' ring of elems, it also has a ring of states of the + * same size. Each state[] corresponds to exactly one elem[]. + * state[] will be used by acquire/release/dequeue functions to store internal + * information and should not be accessed by the user directly. + * How it works: + * ============= + * 'acquire()' just moves stage's head (same as rte_ring move_head does), + * plus it saves in state[stage.cur_head] information about how many elems + * were acquired, current head position and special flag value to indicate + * that elems are acquired (SORING_ST_START). + * Note that 'acquire()' returns to the user a special 'ftoken' that user has + * to provide for 'release()' (in fact it is just a position for current head + * plus current stage index). + * 'release()' extracts old head value from provided ftoken and checks that + * corresponding 'state[]' contains expected values(mostly for sanity + * purposes). + * * Then it marks this state[] with 'SORING_ST_FINISH' flag to indicate + * that given subset of objects was released. + * After that, it checks does old head value equals to current tail value? + * If yes, then it performs 'finalize()' operation, otherwise 'release()' + * just returns (without spinning on stage tail value). + * As updated state[] is shared by all threads, some other thread can do + * 'finalize()' for given stage. + * That allows 'release()' to avoid excessive waits on the tail value. + * Main purpose of 'finalize()' operation is to walk through 'state[]' + * from current stage tail up to its head, check state[] and move stage tail + * through elements that already are in SORING_ST_FINISH state. + * Along with that, corresponding state[] values are reset to zero. + * Note that 'finalize()' for given stage can be done from multiple places: + * 'release()' for that stage or from 'acquire()' for next stage + * even from consumer's 'dequeue()' - in case given stage is the last one. + * So 'finalize()' has to be MT-safe and inside it we have to + * guarantee that only one thread will update state[] and stage's tail values. + */ + +#include "soring.h" + +/* + * Inline functions (fastpath) start here. + */ +static __rte_always_inline uint32_t +__rte_soring_stage_finalize(struct soring_stage_headtail *sht, uint32_t stage, + union soring_state *rstate, uint32_t rmask, uint32_t maxn) +{ + int32_t rc; + uint32_t ftkn, head, i, idx, k, n, tail; + union soring_stage_tail nt, ot; + union soring_state st; + + /* try to grab exclusive right to update tail value */ + ot.raw = rte_atomic_load_explicit(&sht->tail.raw, + rte_memory_order_acquire); + + /* other thread already finalizing it for us */ + if (ot.sync != 0) + return 0; + + nt.pos = ot.pos; + nt.sync = 1; + rc = rte_atomic_compare_exchange_strong_explicit(&sht->tail.raw, + (uint64_t *)(uintptr_t)&ot.raw, nt.raw, + rte_memory_order_release, rte_memory_order_relaxed); + + /* other thread won the race */ + if (rc == 0) + return 0; + + /* Ensure the head is read before rstate[] */ + head = rte_atomic_load_explicit(&sht->head, rte_memory_order_relaxed); + rte_atomic_thread_fence(rte_memory_order_acquire); + + /* + * start with current tail and walk through states that are + * already finished. + */ + + n = RTE_MIN(head - ot.pos, maxn); + for (i = 0, tail = ot.pos; i < n; i += k, tail += k) { + + idx = tail & rmask; + ftkn = SORING_FTKN_MAKE(tail, stage); + + st.raw = rte_atomic_load_explicit(&rstate[idx].raw, + rte_memory_order_relaxed); + if ((st.stnum & SORING_ST_MASK) != SORING_ST_FINISH || + st.ftoken != ftkn) + break; + + k = st.stnum & ~SORING_ST_MASK; + rte_atomic_store_explicit(&rstate[idx].raw, 0, + rte_memory_order_relaxed); + } + + + /* release exclusive right to update along with new tail value */ + ot.pos = tail; + rte_atomic_store_explicit(&sht->tail.raw, ot.raw, + rte_memory_order_release); + + return i; +} + +static __rte_always_inline uint32_t +__rte_soring_move_prod_head(struct rte_soring *r, uint32_t num, + enum rte_ring_queue_behavior behavior, enum rte_ring_sync_type st, + uint32_t *head, uint32_t *next, uint32_t *free) +{ + uint32_t n; + + switch (st) { + case RTE_RING_SYNC_ST: + case RTE_RING_SYNC_MT: + n = __rte_ring_headtail_move_head(&r->prod.ht, &r->cons.ht, + r->capacity, st, num, behavior, head, next, free); + break; + case RTE_RING_SYNC_MT_RTS: + n = __rte_ring_rts_move_head(&r->prod.rts, &r->cons.ht, + r->capacity, num, behavior, head, free); + *next = *head + n; + break; + case RTE_RING_SYNC_MT_HTS: + n = __rte_ring_hts_move_head(&r->prod.hts, &r->cons.ht, + r->capacity, num, behavior, head, free); + *next = *head + n; + break; + default: + /* unsupported mode, shouldn't be here */ + RTE_ASSERT(0); + *free = 0; + n = 0; + } + + return n; +} + +static __rte_always_inline uint32_t +__rte_soring_move_cons_head(struct rte_soring *r, uint32_t stage, uint32_t num, + enum rte_ring_queue_behavior behavior, enum rte_ring_sync_type st, + uint32_t *head, uint32_t *next, uint32_t *avail) +{ + uint32_t n; + + switch (st) { + case RTE_RING_SYNC_ST: + case RTE_RING_SYNC_MT: + n = __rte_ring_headtail_move_head(&r->cons.ht, + &r->stage[stage].ht, 0, st, num, behavior, + head, next, avail); + break; + case RTE_RING_SYNC_MT_RTS: + n = __rte_ring_rts_move_head(&r->cons.rts, &r->stage[stage].ht, + 0, num, behavior, head, avail); + *next = *head + n; + break; + case RTE_RING_SYNC_MT_HTS: + n = __rte_ring_hts_move_head(&r->cons.hts, &r->stage[stage].ht, + 0, num, behavior, head, avail); + *next = *head + n; + break; + default: + /* unsupported mode, shouldn't be here */ + RTE_ASSERT(0); + *avail = 0; + n = 0; + } + + return n; +} + +static __rte_always_inline void +__rte_soring_update_tail(struct __rte_ring_headtail *rht, + enum rte_ring_sync_type st, uint32_t head, uint32_t next, uint32_t enq) +{ + uint32_t n; + + switch (st) { + case RTE_RING_SYNC_ST: + case RTE_RING_SYNC_MT: + __rte_ring_update_tail(&rht->ht, head, next, st, enq); + break; + case RTE_RING_SYNC_MT_RTS: + __rte_ring_rts_update_tail(&rht->rts); + break; + case RTE_RING_SYNC_MT_HTS: + n = next - head; + __rte_ring_hts_update_tail(&rht->hts, head, n, enq); + break; + default: + /* unsupported mode, shouldn't be here */ + RTE_ASSERT(0); + } +} + +static __rte_always_inline uint32_t +__rte_soring_stage_move_head(struct soring_stage_headtail *d, + const struct rte_ring_headtail *s, uint32_t capacity, uint32_t num, + enum rte_ring_queue_behavior behavior, + uint32_t *old_head, uint32_t *new_head, uint32_t *avail) +{ + uint32_t n, tail; + + *old_head = rte_atomic_load_explicit(&d->head, + rte_memory_order_relaxed); + + do { + n = num; + + /* Ensure the head is read before tail */ + rte_atomic_thread_fence(rte_memory_order_acquire); + + tail = rte_atomic_load_explicit(&s->tail, + rte_memory_order_acquire); + *avail = capacity + tail - *old_head; + if (n > *avail) + n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *avail; + if (n == 0) + return 0; + *new_head = *old_head + n; + } while (rte_atomic_compare_exchange_strong_explicit(&d->head, + old_head, *new_head, rte_memory_order_acq_rel, + rte_memory_order_relaxed) == 0); + + return n; +} + +static __rte_always_inline uint32_t +soring_enqueue(struct rte_soring *r, const void *objs, + const void *meta, uint32_t n, enum rte_ring_queue_behavior behavior, + uint32_t *free_space) +{ + enum rte_ring_sync_type st; + uint32_t nb_free, prod_head, prod_next; + + RTE_ASSERT(r != NULL && r->nb_stage > 0); + RTE_ASSERT(meta == NULL || r->meta != NULL); + + st = r->prod.ht.sync_type; + + n = __rte_soring_move_prod_head(r, n, behavior, st, + &prod_head, &prod_next, &nb_free); + if (n != 0) { + __rte_ring_do_enqueue_elems(&r[1], objs, r->size, + prod_head & r->mask, r->esize, n); + if (meta != NULL) + __rte_ring_do_enqueue_elems(r->meta, meta, r->size, + prod_head & r->mask, r->msize, n); + __rte_soring_update_tail(&r->prod, st, prod_head, prod_next, 1); + } + + if (free_space != NULL) + *free_space = nb_free - n; + return n; +} + +static __rte_always_inline uint32_t +soring_dequeue(struct rte_soring *r, void *objs, void *meta, + uint32_t num, enum rte_ring_queue_behavior behavior, + uint32_t *available) +{ + enum rte_ring_sync_type st; + uint32_t entries, cons_head, cons_next, n, ns, reqn; + + RTE_ASSERT(r != NULL && r->nb_stage > 0); + RTE_ASSERT(meta == NULL || r->meta != NULL); + + ns = r->nb_stage - 1; + st = r->cons.ht.sync_type; + + /* try to grab exactly @num elems first */ + n = __rte_soring_move_cons_head(r, ns, num, RTE_RING_QUEUE_FIXED, st, + &cons_head, &cons_next, &entries); + if (n == 0) { + /* try to finalize some elems from previous stage */ + n = __rte_soring_stage_finalize(&r->stage[ns].sht, ns, + r->state, r->mask, 2 * num); + entries += n; + + /* repeat attempt to grab elems */ + reqn = (behavior == RTE_RING_QUEUE_FIXED) ? num : 0; + if (entries >= reqn) + n = __rte_soring_move_cons_head(r, ns, num, behavior, + st, &cons_head, &cons_next, &entries); + else + n = 0; + } + + /* we have some elems to consume */ + if (n != 0) { + __rte_ring_do_dequeue_elems(objs, &r[1], r->size, + cons_head & r->mask, r->esize, n); + if (meta != NULL) + __rte_ring_do_dequeue_elems(meta, r->meta, r->size, + cons_head & r->mask, r->msize, n); + __rte_soring_update_tail(&r->cons, st, cons_head, cons_next, 0); + } + + if (available != NULL) + *available = entries - n; + return n; +} + +/* + * Verify internal SORING state. + * WARNING: if expected value is not equal to actual one, it means that for + * whatever reason SORING data constancy is broken. That is a very serious + * problem that most likely will cause race-conditions, memory corruption, + * program crash. + * To ease debugging it user might rebuild ring library with + * RTE_SORING_DEBUG enabled. + */ +static __rte_always_inline void +soring_verify_state(const struct rte_soring *r, uint32_t stage, uint32_t idx, + const char *msg, union soring_state val, union soring_state exp) +{ + if (val.raw != exp.raw) { +#ifdef RTE_SORING_DEBUG + rte_soring_dump(stderr, r); + rte_panic("line:%d from:%s: soring=%p, stage=%#x, idx=%#x, " + "expected={.stnum=%#x, .ftoken=%#x}, " + "actual={.stnum=%#x, .ftoken=%#x};\n", + __LINE__, msg, r, stage, idx, + exp.stnum, exp.ftoken, + val.stnum, val.ftoken); +#else + SORING_LOG(EMERG, "from:%s: soring=%p, stage=%#x, idx=%#x, " + "expected={.stnum=%#x, .ftoken=%#x}, " + "actual={.stnum=%#x, .ftoken=%#x};", + msg, r, stage, idx, + exp.stnum, exp.ftoken, + val.stnum, val.ftoken); +#endif + } +} + +/* check and update state ring at acquire op*/ +static __rte_always_inline void +acquire_state_update(const struct rte_soring *r, uint32_t stage, uint32_t idx, + uint32_t ftoken, uint32_t num) +{ + union soring_state st; + const union soring_state est = {.raw = 0}; + + st.raw = rte_atomic_load_explicit(&r->state[idx].raw, + rte_memory_order_relaxed); + soring_verify_state(r, stage, idx, __func__, st, est); + + st.ftoken = ftoken; + st.stnum = (SORING_ST_START | num); + + rte_atomic_store_explicit(&r->state[idx].raw, st.raw, + rte_memory_order_relaxed); +} + +static __rte_always_inline uint32_t +soring_acquire(struct rte_soring *r, void *objs, void *meta, + uint32_t stage, uint32_t num, enum rte_ring_queue_behavior behavior, + uint32_t *ftoken, uint32_t *available) +{ + uint32_t avail, head, idx, n, next, reqn; + struct soring_stage *pstg; + struct soring_stage_headtail *cons; + + RTE_ASSERT(r != NULL && stage < r->nb_stage); + RTE_ASSERT(meta == NULL || r->meta != NULL); + + cons = &r->stage[stage].sht; + + if (stage == 0) + n = __rte_soring_stage_move_head(cons, &r->prod.ht, 0, num, + behavior, &head, &next, &avail); + else { + pstg = r->stage + stage - 1; + + /* try to grab exactly @num elems */ + n = __rte_soring_stage_move_head(cons, &pstg->ht, 0, num, + RTE_RING_QUEUE_FIXED, &head, &next, &avail); + if (n == 0) { + /* try to finalize some elems from previous stage */ + n = __rte_soring_stage_finalize(&pstg->sht, stage - 1, + r->state, r->mask, 2 * num); + avail += n; + + /* repeat attempt to grab elems */ + reqn = (behavior == RTE_RING_QUEUE_FIXED) ? num : 0; + if (avail >= reqn) + n = __rte_soring_stage_move_head(cons, + &pstg->ht, 0, num, behavior, &head, + &next, &avail); + else + n = 0; + } + } + + if (n != 0) { + + idx = head & r->mask; + *ftoken = SORING_FTKN_MAKE(head, stage); + + /* check and update state value */ + acquire_state_update(r, stage, idx, *ftoken, n); + + /* copy elems that are ready for given stage */ + __rte_ring_do_dequeue_elems(objs, &r[1], r->size, idx, + r->esize, n); + if (meta != NULL) + __rte_ring_do_dequeue_elems(meta, r->meta, + r->size, idx, r->msize, n); + } + + if (available != NULL) + *available = avail - n; + return n; +} + +static __rte_always_inline void +soring_release(struct rte_soring *r, const void *objs, + const void *meta, uint32_t stage, uint32_t n, uint32_t ftoken) +{ + uint32_t idx, pos, tail; + struct soring_stage *stg; + union soring_state st; + + const union soring_state est = { + .stnum = (SORING_ST_START | n), + .ftoken = ftoken, + }; + + RTE_ASSERT(r != NULL && stage < r->nb_stage); + RTE_ASSERT(meta == NULL || r->meta != NULL); + + stg = r->stage + stage; + + pos = SORING_FTKN_POS(ftoken, stage); + idx = pos & r->mask; + st.raw = rte_atomic_load_explicit(&r->state[idx].raw, + rte_memory_order_relaxed); + + /* check state ring contents */ + soring_verify_state(r, stage, idx, __func__, st, est); + + /* update contents of the ring, if necessary */ + if (objs != NULL) + __rte_ring_do_enqueue_elems(&r[1], objs, r->size, idx, + r->esize, n); + if (meta != NULL) + __rte_ring_do_enqueue_elems(r->meta, meta, r->size, idx, + r->msize, n); + + /* set state to FINISH, make sure it is not reordered */ + rte_atomic_thread_fence(rte_memory_order_release); + + st.stnum = SORING_ST_FINISH | n; + rte_atomic_store_explicit(&r->state[idx].raw, st.raw, + rte_memory_order_relaxed); + + /* try to do finalize(), if appropriate */ + tail = rte_atomic_load_explicit(&stg->sht.tail.pos, + rte_memory_order_relaxed); + if (tail == pos) + __rte_soring_stage_finalize(&stg->sht, stage, r->state, r->mask, + r->capacity); +} + +/* + * Public functions (data-path) start here. + */ + +void +rte_soring_release(struct rte_soring *r, const void *objs, + uint32_t stage, uint32_t n, uint32_t ftoken) +{ + soring_release(r, objs, NULL, stage, n, ftoken); +} + + +void +rte_soring_releasx(struct rte_soring *r, const void *objs, + const void *meta, uint32_t stage, uint32_t n, uint32_t ftoken) +{ + soring_release(r, objs, meta, stage, n, ftoken); +} + +uint32_t +rte_soring_enqueue_bulk(struct rte_soring *r, const void *objs, uint32_t n, + uint32_t *free_space) +{ + return soring_enqueue(r, objs, NULL, n, RTE_RING_QUEUE_FIXED, + free_space); +} + +uint32_t +rte_soring_enqueux_bulk(struct rte_soring *r, const void *objs, + const void *meta, uint32_t n, uint32_t *free_space) +{ + return soring_enqueue(r, objs, meta, n, RTE_RING_QUEUE_FIXED, + free_space); +} + +uint32_t +rte_soring_enqueue_burst(struct rte_soring *r, const void *objs, uint32_t n, + uint32_t *free_space) +{ + return soring_enqueue(r, objs, NULL, n, RTE_RING_QUEUE_VARIABLE, + free_space); +} + +uint32_t +rte_soring_enqueux_burst(struct rte_soring *r, const void *objs, + const void *meta, uint32_t n, uint32_t *free_space) +{ + return soring_enqueue(r, objs, meta, n, RTE_RING_QUEUE_VARIABLE, + free_space); +} + +uint32_t +rte_soring_dequeue_bulk(struct rte_soring *r, void *objs, uint32_t num, + uint32_t *available) +{ + return soring_dequeue(r, objs, NULL, num, RTE_RING_QUEUE_FIXED, + available); +} + +uint32_t +rte_soring_dequeux_bulk(struct rte_soring *r, void *objs, void *meta, + uint32_t num, uint32_t *available) +{ + return soring_dequeue(r, objs, meta, num, RTE_RING_QUEUE_FIXED, + available); +} + +uint32_t +rte_soring_dequeue_burst(struct rte_soring *r, void *objs, uint32_t num, + uint32_t *available) +{ + return soring_dequeue(r, objs, NULL, num, RTE_RING_QUEUE_VARIABLE, + available); +} + +uint32_t +rte_soring_dequeux_burst(struct rte_soring *r, void *objs, void *meta, + uint32_t num, uint32_t *available) +{ + return soring_dequeue(r, objs, meta, num, RTE_RING_QUEUE_VARIABLE, + available); +} + +uint32_t +rte_soring_acquire_bulk(struct rte_soring *r, void *objs, + uint32_t stage, uint32_t num, uint32_t *ftoken, uint32_t *available) +{ + return soring_acquire(r, objs, NULL, stage, num, + RTE_RING_QUEUE_FIXED, ftoken, available); +} + +uint32_t +rte_soring_acquirx_bulk(struct rte_soring *r, void *objs, void *meta, + uint32_t stage, uint32_t num, uint32_t *ftoken, uint32_t *available) +{ + return soring_acquire(r, objs, meta, stage, num, + RTE_RING_QUEUE_FIXED, ftoken, available); +} + +uint32_t +rte_soring_acquire_burst(struct rte_soring *r, void *objs, + uint32_t stage, uint32_t num, uint32_t *ftoken, uint32_t *available) +{ + return soring_acquire(r, objs, NULL, stage, num, + RTE_RING_QUEUE_VARIABLE, ftoken, available); +} + +uint32_t +rte_soring_acquirx_burst(struct rte_soring *r, void *objs, void *meta, + uint32_t stage, uint32_t num, uint32_t *ftoken, uint32_t *available) +{ + return soring_acquire(r, objs, meta, stage, num, + RTE_RING_QUEUE_VARIABLE, ftoken, available); +} + +unsigned int +rte_soring_count(const struct rte_soring *r) +{ + uint32_t prod_tail = r->prod.ht.tail; + uint32_t cons_tail = r->cons.ht.tail; + uint32_t count = (prod_tail - cons_tail) & r->mask; + return (count > r->capacity) ? r->capacity : count; +} + +unsigned int +rte_soring_free_count(const struct rte_soring *r) +{ + return r->capacity - rte_soring_count(r); +} diff --git a/lib/ring/soring.h b/lib/ring/soring.h new file mode 100644 index 0000000000..455cf677a7 --- /dev/null +++ b/lib/ring/soring.h @@ -0,0 +1,138 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Huawei Technologies Co., Ltd + */ + +#ifndef _SORING_H_ +#define _SORING_H_ + +/** + * @file + * This file contains internal structures of DPDK soring: Staged Ordered Ring. + * Sort of extension of conventional DPDK ring. + * Internal structure: + * In addition to 'normal' ring of elems, it also has a ring of states of the + * same size. Each state[] corresponds to exactly one elem[]. + * state[] will be used by acquire/release/dequeue functions to store internal + * information and should not be accessed by the user directly. + * For actual implementation details, please refer to soring.c + */ + +#include + +/* logging stuff, register our own tag for SORING */ +#include + +extern int soring_logtype; +#define RTE_LOGTYPE_SORING soring_logtype +#define SORING_LOG(level, ...) \ + RTE_LOG_LINE(level, SORING, "" __VA_ARGS__) + +/** + * SORING internal state for each element + */ +union soring_state { + alignas(sizeof(uint64_t)) RTE_ATOMIC(uint64_t) raw; + struct { + RTE_ATOMIC(uint32_t) ftoken; + RTE_ATOMIC(uint32_t) stnum; + }; +}; + +/* upper 2 bits are used for status */ +#define SORING_ST_START RTE_SHIFT_VAL32(1, RTE_SORING_ST_BIT) +#define SORING_ST_FINISH RTE_SHIFT_VAL32(2, RTE_SORING_ST_BIT) + +#define SORING_ST_MASK \ + RTE_GENMASK32(sizeof(uint32_t) * CHAR_BIT - 1, RTE_SORING_ST_BIT) + +#define SORING_FTKN_MAKE(pos, stg) ((pos) + (stg)) +#define SORING_FTKN_POS(ftk, stg) ((ftk) - (stg)) + +/** + * SORING re-uses rte_ring internal structures and implementation + * for enqueue/dequeue operations. + */ +struct __rte_ring_headtail { + + union __rte_cache_aligned { + struct rte_ring_headtail ht; + struct rte_ring_hts_headtail hts; + struct rte_ring_rts_headtail rts; + }; + + RTE_CACHE_GUARD; +}; + +union soring_stage_tail { + /** raw 8B value to read/write *sync* and *pos* as one atomic op */ + alignas(sizeof(uint64_t)) RTE_ATOMIC(uint64_t) raw; + struct { + RTE_ATOMIC(uint32_t) sync; + RTE_ATOMIC(uint32_t) pos; + }; +}; + +struct soring_stage_headtail { + volatile union soring_stage_tail tail; + enum rte_ring_sync_type unused; /**< unused */ + volatile RTE_ATOMIC(uint32_t) head; +}; + +/** + * SORING stage head_tail structure is 'compatible' with rte_ring ones. + * rte_ring internal functions for moving producer/consumer head + * can work transparently with stage head_tail and visa-versa + * (no modifications required). + */ +struct soring_stage { + + union __rte_cache_aligned { + struct rte_ring_headtail ht; + struct soring_stage_headtail sht; + }; + + RTE_CACHE_GUARD; +}; + +/** + * soring internal structure. + * As with rte_ring actual elements array supposed to be located directly + * after the rte_soring structure. + */ +struct __rte_cache_aligned rte_soring { + uint32_t size; /**< Size of ring. */ + uint32_t mask; /**< Mask (size-1) of ring. */ + uint32_t capacity; /**< Usable size of ring */ + uint32_t esize; + /**< size of elements in the ring, must be a multiple of 4*/ + uint32_t msize; + /**< size of metadata value for each elem, must be a multiple of 4 */ + + /** Ring stages */ + struct soring_stage *stage; + uint32_t nb_stage; + + /** Ring of states (one per element) */ + union soring_state *state; + + /** Pointer to the buffer where metadata values for each elements + * are stored. This is supplementary and optional information that + * user can attach to each element of the ring. + * While it is possible to incorporate this information inside + * user-defined element, in many cases it is plausible to maintain it + * as a separate array (mainly for performance reasons). + */ + void *meta; + + RTE_CACHE_GUARD; + + /** Ring producer status. */ + struct __rte_ring_headtail prod; + + /** Ring consumer status. */ + struct __rte_ring_headtail cons; + + char name[RTE_RING_NAMESIZE]; /**< Name of the ring. */ +}; + +#endif /* _SORING_H_ */ diff --git a/lib/ring/version.map b/lib/ring/version.map index dfe3c1914a..eeaa95b2e9 100644 --- a/lib/ring/version.map +++ b/lib/ring/version.map @@ -20,4 +20,23 @@ EXPERIMENTAL { # added in 25.03 rte_ring_headtail_dump; + rte_soring_acquire_bulk; + rte_soring_acquire_burst; + rte_soring_acquirx_bulk; + rte_soring_acquirx_burst; + rte_soring_count; + rte_soring_dequeue_bulk; + rte_soring_dequeue_burst; + rte_soring_dequeux_bulk; + rte_soring_dequeux_burst; + rte_soring_enqueue_bulk; + rte_soring_enqueue_burst; + rte_soring_enqueux_bulk; + rte_soring_enqueux_burst; + rte_soring_dump; + rte_soring_free_count; + rte_soring_get_memsize; + rte_soring_init; + rte_soring_release; + rte_soring_releasx; }; -- 2.35.3