From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pl0-f66.google.com (mail-pl0-f66.google.com [209.85.160.66]) by dpdk.org (Postfix) with ESMTP id AE35C1B33B for ; Wed, 8 Nov 2017 07:18:51 +0100 (CET) Received: by mail-pl0-f66.google.com with SMTP id t3so664027plj.11 for ; Tue, 07 Nov 2017 22:18:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=W+SU12ho2eI6RCt3o+0MvL9ti2Gdn25BPSX0UwEUzes=; b=L5WwykNoC4MUWoAPUYWaVp/hOfOIjjp9Ac7kvZ20CA5hGzEVV0/WXmziZspIt47RJW GPY4ljPs8Vh7zrBWhbR28H57F93EDI2XtYfOULIMQXy6BKKOOx/fa1Aud1LPqoJVnCOS uiV1SRQbj0pR1HZdM9HiL7+dKgD0v5BcR8Q63Ar3WIn0QwOHyn/jC0CtciL+opmw5LaA N8l/tX4PcFLKmaEdpgSkc4FIWNfUpYZrYIPbOzMO5KmUk95+7+46ZL+LZw8oeEfyvu9t bEwE0eIVmYq/iGiP7HwUscISXZ40rTUHzHlsYyC8EliuRJLzDIIkTW0DHvds+i9huKg/ dWGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=W+SU12ho2eI6RCt3o+0MvL9ti2Gdn25BPSX0UwEUzes=; b=obo8rMYWxtw/8qtQhqZ4Go8NQz0LJz0hRi80CoreIWxLhuGE+wtOqvFwIJHzBMU/hR Seen6/HPfO//5RxVSYCIxMDpZvwlZMPHPi94V+gDTWPkDABAjfc0CSxyfHEC7vTjwq7R 3nYL6r0nsDbpKbDGPIANlROhQPKwidIeOgvVuH+nyIVK4Y35SASkA3igII15XpbscrqH mZHqkn6ARDVEQoO4ib1OagAhsdTTE9AEb7MM6H9czrk/6m1lbM8oOp3um6Ztj243Isy1 9Na6r9iuOLVc9DhLV8E6muXCm4tIeBnyWc2ubcedITU25mVTfvwyKpB0p4RjNVtvx4Zh gykQ== X-Gm-Message-State: AJaThX6pCBUp5L6d0+OhL2jRPU3VbmZnpsVh37TWXagh+bw/PBVDh7fq TDBUVOLDAjVh2gsSJt8cyr8= X-Google-Smtp-Source: ABhQp+S0JC1QY7ncdMiVsd56Ve8QgXerNfOlRm5tt4gjiqxe/l2sYgz5SZHUMzlXc1xcGAOAN4hDmQ== X-Received: by 10.159.198.71 with SMTP id y7mr1242018plt.391.1510121930623; Tue, 07 Nov 2017 22:18:50 -0800 (PST) Received: from nfv-demo01.hxtcorp.net ([38.106.11.25]) by smtp.gmail.com with ESMTPSA id t63sm6281020pgc.19.2017.11.07.22.18.42 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 07 Nov 2017 22:18:50 -0800 (PST) From: Jia He To: jerin.jacob@caviumnetworks.com, dev@dpdk.org, olivier.matz@6wind.com Cc: konstantin.ananyev@intel.com, bruce.richardson@intel.com, jianbo.liu@arm.com, hemant.agrawal@nxp.com, Jia He , jie2.liu@hxt-semitech.com, bing.zhao@hxt-semitech.com, jia.he@hxt-semitech.com Date: Wed, 8 Nov 2017 06:17:11 +0000 Message-Id: <1510121832-16439-2-git-send-email-hejianet@gmail.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1510121832-16439-1-git-send-email-hejianet@gmail.com> References: <1510121832-16439-1-git-send-email-hejianet@gmail.com> Subject: [dpdk-dev] [PATCH 2/3] ring: guarantee load ordering of cons/prod when doing enqueue/dequeue X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Nov 2017 06:18:52 -0000 We watched a rte panic of mbuf_autotest in our qualcomm arm64 server. As for the possible race condition, please refer to [1]. Furthermore, to fix this race, there are 2 options as suggested by Jerin: 1. use rte_smp_rmb 2. use load_acquire/store_release(refer to [2]). CONFIG_RTE_RING_USE_C11_MEM_MODEL is provided, and by default it is "y" only on arm64. The reason why providing 2 options is due to the performance benchmark difference in different arm machines, please refer to [3]. Already fuctionally tested on the machines as follows: - on X86 - on arm64 with CONFIG_RTE_RING_USE_C11_MEM_MODEL=y - on arm64 with CONFIG_RTE_RING_USE_C11_MEM_MODEL=n We haven't tested the ppc64 case. If anyone verifies it, he can add CONFIG_RTE_RING_USE_C11_MEM_MODEL=y to ppc64 config files. [1] http://dpdk.org/ml/archives/dev/2017-October/078366.html [2] https://github.com/freebsd/freebsd/blob/master/sys/sys/buf_ring.h#L170 [3] http://dpdk.org/ml/archives/dev/2017-October/080861.html --- Changelog: V3: arch specific implementation for enqueue/dequeue barrier V2: let users choose whether using load_acquire/store_release V1: rte_smp_rmb() between 2 loads Signed-off-by: Jia He Signed-off-by: jie2.liu@hxt-semitech.com Signed-off-by: bing.zhao@hxt-semitech.com Signed-off-by: jia.he@hxt-semitech.com Suggested-by: jerin.jacob@caviumnetworks.com Suggested-by: konstantin.ananyev@intel.com --- lib/librte_eventdev/rte_event_ring.h | 6 +- lib/librte_ring/Makefile | 4 +- lib/librte_ring/rte_ring.h | 160 +++-------------------------- lib/librte_ring/rte_ring_c11_mem.h | 185 +++++++++++++++++++++++++++++++++ lib/librte_ring/rte_ring_generic.h | 192 +++++++++++++++++++++++++++++++++++ 5 files changed, 397 insertions(+), 150 deletions(-) create mode 100644 lib/librte_ring/rte_ring_c11_mem.h create mode 100644 lib/librte_ring/rte_ring_generic.h diff --git a/lib/librte_eventdev/rte_event_ring.h b/lib/librte_eventdev/rte_event_ring.h index ea9b688..3e49458 100644 --- a/lib/librte_eventdev/rte_event_ring.h +++ b/lib/librte_eventdev/rte_event_ring.h @@ -126,9 +126,8 @@ rte_event_ring_enqueue_burst(struct rte_event_ring *r, goto end; ENQUEUE_PTRS(&r->r, &r[1], prod_head, events, n, struct rte_event); - rte_smp_wmb(); - update_tail(&r->r.prod, prod_head, prod_next, 1); + update_tail(&r->r.prod, prod_head, prod_next, 1, 1); end: if (free_space != NULL) *free_space = free_entries - n; @@ -168,9 +167,8 @@ rte_event_ring_dequeue_burst(struct rte_event_ring *r, goto end; DEQUEUE_PTRS(&r->r, &r[1], cons_head, events, n, struct rte_event); - rte_smp_rmb(); - update_tail(&r->r.cons, cons_head, cons_next, 1); + update_tail(&r->r.cons, cons_head, cons_next, 1, 0); end: if (available != NULL) diff --git a/lib/librte_ring/Makefile b/lib/librte_ring/Makefile index e34d9d9..a2682e7 100644 --- a/lib/librte_ring/Makefile +++ b/lib/librte_ring/Makefile @@ -45,6 +45,8 @@ LIBABIVER := 1 SRCS-$(CONFIG_RTE_LIBRTE_RING) := rte_ring.c # install includes -SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h +SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h \ + rte_ring_generic.h \ + rte_ring_c11_mem.h include $(RTE_SDK)/mk/rte.lib.mk diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h index 5e9b3b7..53764d9 100644 --- a/lib/librte_ring/rte_ring.h +++ b/lib/librte_ring/rte_ring.h @@ -356,85 +356,19 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r); } \ } while (0) -static __rte_always_inline void -update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val, - uint32_t single) -{ - /* - * If there are other enqueues/dequeues in progress that preceded us, - * we need to wait for them to complete - */ - if (!single) - while (unlikely(ht->tail != old_val)) - rte_pause(); - - ht->tail = new_val; -} - -/** - * @internal This function updates the producer head for enqueue - * - * @param r - * A pointer to the ring structure - * @param is_sp - * Indicates whether multi-producer path is needed or not - * @param n - * The number of elements we will want to enqueue, i.e. how far should the - * head be moved - * @param behavior - * RTE_RING_QUEUE_FIXED: Enqueue a fixed number of items from a ring - * RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring - * @param old_head - * Returns head value as it was before the move, i.e. where enqueue starts - * @param new_head - * Returns the current/new head value i.e. where enqueue finishes - * @param free_entries - * Returns the amount of free space in the ring BEFORE head was moved - * @return - * Actual number of objects enqueued. - * If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only. - */ -static __rte_always_inline unsigned int -__rte_ring_move_prod_head(struct rte_ring *r, int is_sp, - unsigned int n, enum rte_ring_queue_behavior behavior, - uint32_t *old_head, uint32_t *new_head, - uint32_t *free_entries) -{ - const uint32_t capacity = r->capacity; - unsigned int max = n; - int success; - - do { - /* Reset n to the initial burst count */ - n = max; - - *old_head = r->prod.head; - const uint32_t cons_tail = r->cons.tail; - /* - * The subtraction is done between two unsigned 32bits value - * (the result is always modulo 32 bits even if we have - * *old_head > cons_tail). So 'free_entries' is always between 0 - * and capacity (which is < size). - */ - *free_entries = (capacity + cons_tail - *old_head); - - /* check that we have enough room in ring */ - if (unlikely(n > *free_entries)) - n = (behavior == RTE_RING_QUEUE_FIXED) ? - 0 : *free_entries; - - if (n == 0) - return 0; - - *new_head = *old_head + n; - if (is_sp) - r->prod.head = *new_head, success = 1; - else - success = rte_atomic32_cmpset(&r->prod.head, - *old_head, *new_head); - } while (unlikely(success == 0)); - return n; -} +/* Between load and load. there might be cpu reorder in weak model + * (powerpc/arm). + * There are 2 choices for the users + * 1.use rmb() memory barrier + * 2.use one-direcion load_acquire/store_release barrier,defined by + * CONFIG_RTE_RING_USE_C11_MEM_MODEL=y + * It depends on performance test results. + */ +#ifdef RTE_RING_USE_C11_MEM_MODEL +#include "rte_ring_c11_mem.h" +#else +#include "rte_ring_generic.h" +#endif /** * @internal Enqueue several objects on the ring @@ -470,9 +404,8 @@ __rte_ring_do_enqueue(struct rte_ring *r, void * const *obj_table, goto end; ENQUEUE_PTRS(r, &r[1], prod_head, obj_table, n, void *); - rte_smp_wmb(); - update_tail(&r->prod, prod_head, prod_next, is_sp); + update_tail(&r->prod, prod_head, prod_next, is_sp, 1); end: if (free_space != NULL) *free_space = free_entries - n; @@ -480,68 +413,6 @@ __rte_ring_do_enqueue(struct rte_ring *r, void * const *obj_table, } /** - * @internal This function updates the consumer head for dequeue - * - * @param r - * A pointer to the ring structure - * @param is_sc - * Indicates whether multi-consumer path is needed or not - * @param n - * The number of elements we will want to enqueue, i.e. how far should the - * head be moved - * @param behavior - * RTE_RING_QUEUE_FIXED: Dequeue a fixed number of items from a ring - * RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring - * @param old_head - * Returns head value as it was before the move, i.e. where dequeue starts - * @param new_head - * Returns the current/new head value i.e. where dequeue finishes - * @param entries - * Returns the number of entries in the ring BEFORE head was moved - * @return - * - Actual number of objects dequeued. - * If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only. - */ -static __rte_always_inline unsigned int -__rte_ring_move_cons_head(struct rte_ring *r, int is_sc, - unsigned int n, enum rte_ring_queue_behavior behavior, - uint32_t *old_head, uint32_t *new_head, - uint32_t *entries) -{ - unsigned int max = n; - int success; - - /* move cons.head atomically */ - do { - /* Restore n as it may change every loop */ - n = max; - - *old_head = r->cons.head; - const uint32_t prod_tail = r->prod.tail; - /* The subtraction is done between two unsigned 32bits value - * (the result is always modulo 32 bits even if we have - * cons_head > prod_tail). So 'entries' is always between 0 - * and size(ring)-1. */ - *entries = (prod_tail - *old_head); - - /* Set the actual entries for dequeue */ - if (n > *entries) - n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries; - - if (unlikely(n == 0)) - return 0; - - *new_head = *old_head + n; - if (is_sc) - r->cons.head = *new_head, success = 1; - else - success = rte_atomic32_cmpset(&r->cons.head, *old_head, - *new_head); - } while (unlikely(success == 0)); - return n; -} - -/** * @internal Dequeue several objects from the ring * * @param r @@ -575,9 +446,8 @@ __rte_ring_do_dequeue(struct rte_ring *r, void **obj_table, goto end; DEQUEUE_PTRS(r, &r[1], cons_head, obj_table, n, void *); - rte_smp_rmb(); - update_tail(&r->cons, cons_head, cons_next, is_sc); + update_tail(&r->cons, cons_head, cons_next, is_sc, 0); end: if (available != NULL) diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h new file mode 100644 index 0000000..f370df9 --- /dev/null +++ b/lib/librte_ring/rte_ring_c11_mem.h @@ -0,0 +1,185 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2017 hxt-semitech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of hxt-semitech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_RING_C11_MEM_H_ +#define _RTE_RING_C11_MEM_H_ + +#define UNUSED_PARAMETER(x) ((void)(x)) +static __rte_always_inline void +update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val, + uint32_t single, uint32_t enqueue) +{ + UNUSED_PARAMETER(enqueue); + /* + * If there are other enqueues/dequeues in progress that preceded us, + * we need to wait for them to complete + */ + if (!single) + while (unlikely(ht->tail != old_val)) + rte_pause(); + + __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE); +} + +/** + * @internal This function updates the producer head for enqueue + * + * @param r + * A pointer to the ring structure + * @param is_sp + * Indicates whether multi-producer path is needed or not + * @param n + * The number of elements we will want to enqueue, i.e. how far should the + * head be moved + * @param behavior + * RTE_RING_QUEUE_FIXED: Enqueue a fixed number of items from a ring + * RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring + * @param old_head + * Returns head value as it was before the move, i.e. where enqueue starts + * @param new_head + * Returns the current/new head value i.e. where enqueue finishes + * @param free_entries + * Returns the amount of free space in the ring BEFORE head was moved + * @return + * Actual number of objects enqueued. + * If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only. + */ +static __rte_always_inline unsigned int +__rte_ring_move_prod_head(struct rte_ring *r, int is_sp, + unsigned int n, enum rte_ring_queue_behavior behavior, + uint32_t *old_head, uint32_t *new_head, + uint32_t *free_entries) +{ + const uint32_t capacity = r->capacity; + unsigned int max = n; + int success; + + do { + /* Reset n to the initial burst count */ + n = max; + + *old_head = __atomic_load_n(&r->prod.head, + __ATOMIC_ACQUIRE); + const uint32_t cons_tail = r->cons.tail; + /* + * The subtraction is done between two unsigned 32bits value + * (the result is always modulo 32 bits even if we have + * *old_head > cons_tail). So 'free_entries' is always between 0 + * and capacity (which is < size). + */ + *free_entries = (capacity + cons_tail - *old_head); + + /* check that we have enough room in ring */ + if (unlikely(n > *free_entries)) + n = (behavior == RTE_RING_QUEUE_FIXED) ? + 0 : *free_entries; + + if (n == 0) + return 0; + + *new_head = *old_head + n; + if (is_sp) + r->prod.head = *new_head, success = 1; + else + success = __atomic_compare_exchange_n(&r->prod.head, + old_head, *new_head, + 0, __ATOMIC_ACQUIRE, + __ATOMIC_RELAXED); + } while (unlikely(success == 0)); + return n; +} + +/** + * @internal This function updates the consumer head for dequeue + * + * @param r + * A pointer to the ring structure + * @param is_sc + * Indicates whether multi-consumer path is needed or not + * @param n + * The number of elements we will want to enqueue, i.e. how far should the + * head be moved + * @param behavior + * RTE_RING_QUEUE_FIXED: Dequeue a fixed number of items from a ring + * RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring + * @param old_head + * Returns head value as it was before the move, i.e. where dequeue starts + * @param new_head + * Returns the current/new head value i.e. where dequeue finishes + * @param entries + * Returns the number of entries in the ring BEFORE head was moved + * @return + * - Actual number of objects dequeued. + * If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only. + */ +static __rte_always_inline unsigned int +__rte_ring_move_cons_head(struct rte_ring *r, int is_sc, + unsigned int n, enum rte_ring_queue_behavior behavior, + uint32_t *old_head, uint32_t *new_head, + uint32_t *entries) +{ + unsigned int max = n; + int success; + + /* move cons.head atomically */ + do { + /* Restore n as it may change every loop */ + n = max; + *old_head = __atomic_load_n(&r->cons.head, + __ATOMIC_ACQUIRE); + const uint32_t prod_tail = r->prod.tail; + /* The subtraction is done between two unsigned 32bits value + * (the result is always modulo 32 bits even if we have + * cons_head > prod_tail). So 'entries' is always between 0 + * and size(ring)-1. */ + *entries = (prod_tail - *old_head); + + /* Set the actual entries for dequeue */ + if (n > *entries) + n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries; + + if (unlikely(n == 0)) + return 0; + + *new_head = *old_head + n; + if (is_sc) + r->cons.head = *new_head, success = 1; + else + success = __atomic_compare_exchange_n(&r->cons.head, + old_head, *new_head, + 0, __ATOMIC_ACQUIRE, + __ATOMIC_RELAXED); + } while (unlikely(success == 0)); + return n; +} + +#endif /* _RTE_RING_C11_MEM_H_ */ diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h new file mode 100644 index 0000000..aa596eb --- /dev/null +++ b/lib/librte_ring/rte_ring_generic.h @@ -0,0 +1,192 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2017 hxt-semitech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of hxt-semitech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_RING_GENERIC_H_ +#define _RTE_RING_GENERIC_H_ + +static __rte_always_inline void +update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val, + uint32_t single, uint32_t enqueue) +{ + if (enqueue) + rte_smp_wmb(); + else + rte_smp_rmb(); + /* + * If there are other enqueues/dequeues in progress that preceded us, + * we need to wait for them to complete + */ + if (!single) + while (unlikely(ht->tail != old_val)) + rte_pause(); + + ht->tail = new_val; +} + +/** + * @internal This function updates the producer head for enqueue + * + * @param r + * A pointer to the ring structure + * @param is_sp + * Indicates whether multi-producer path is needed or not + * @param n + * The number of elements we will want to enqueue, i.e. how far should the + * head be moved + * @param behavior + * RTE_RING_QUEUE_FIXED: Enqueue a fixed number of items from a ring + * RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring + * @param old_head + * Returns head value as it was before the move, i.e. where enqueue starts + * @param new_head + * Returns the current/new head value i.e. where enqueue finishes + * @param free_entries + * Returns the amount of free space in the ring BEFORE head was moved + * @return + * Actual number of objects enqueued. + * If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only. + */ +static __rte_always_inline unsigned int +__rte_ring_move_prod_head(struct rte_ring *r, int is_sp, + unsigned int n, enum rte_ring_queue_behavior behavior, + uint32_t *old_head, uint32_t *new_head, + uint32_t *free_entries) +{ + const uint32_t capacity = r->capacity; + unsigned int max = n; + int success; + + do { + /* Reset n to the initial burst count */ + n = max; + + *old_head = r->prod.head; + + /* add rmb barrier to avoid load/load reorder in weak + * memory model. It is noop on x86 */ + rte_smp_rmb(); + + const uint32_t cons_tail = r->cons.tail; + /* + * The subtraction is done between two unsigned 32bits value + * (the result is always modulo 32 bits even if we have + * *old_head > cons_tail). So 'free_entries' is always between 0 + * and capacity (which is < size). + */ + *free_entries = (capacity + cons_tail - *old_head); + + /* check that we have enough room in ring */ + if (unlikely(n > *free_entries)) + n = (behavior == RTE_RING_QUEUE_FIXED) ? + 0 : *free_entries; + + if (n == 0) + return 0; + + *new_head = *old_head + n; + if (is_sp) + r->prod.head = *new_head, success = 1; + else + success = rte_atomic32_cmpset(&r->prod.head, + *old_head, *new_head); + } while (unlikely(success == 0)); + return n; +} + +/** + * @internal This function updates the consumer head for dequeue + * + * @param r + * A pointer to the ring structure + * @param is_sc + * Indicates whether multi-consumer path is needed or not + * @param n + * The number of elements we will want to enqueue, i.e. how far should the + * head be moved + * @param behavior + * RTE_RING_QUEUE_FIXED: Dequeue a fixed number of items from a ring + * RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring + * @param old_head + * Returns head value as it was before the move, i.e. where dequeue starts + * @param new_head + * Returns the current/new head value i.e. where dequeue finishes + * @param entries + * Returns the number of entries in the ring BEFORE head was moved + * @return + * - Actual number of objects dequeued. + * If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only. + */ +static __rte_always_inline unsigned int +__rte_ring_move_cons_head(struct rte_ring *r, int is_sc, + unsigned int n, enum rte_ring_queue_behavior behavior, + uint32_t *old_head, uint32_t *new_head, + uint32_t *entries) +{ + unsigned int max = n; + int success; + + /* move cons.head atomically */ + do { + /* Restore n as it may change every loop */ + n = max; + + *old_head = r->cons.head; + + /* add rmb barrier to avoid load/load reorder in weak + * memory model. It is noop on x86 */ + rte_smp_rmb(); + + const uint32_t prod_tail = r->prod.tail; + /* The subtraction is done between two unsigned 32bits value + * (the result is always modulo 32 bits even if we have + * cons_head > prod_tail). So 'entries' is always between 0 + * and size(ring)-1. */ + *entries = (prod_tail - *old_head); + + /* Set the actual entries for dequeue */ + if (n > *entries) + n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries; + + if (unlikely(n == 0)) + return 0; + + *new_head = *old_head + n; + if (is_sc) + r->cons.head = *new_head, success = 1; + else + success = rte_atomic32_cmpset(&r->cons.head, *old_head, + *new_head); + } while (unlikely(success == 0)); + return n; +} + +#endif /* _RTE_RING_GENERIC_H_ */ -- 2.7.4