DPDK patches and discussions
* [dpdk-dev] [RFC 0/1] lib/ring: add scatter gather and serial dequeue APIs
@ 2020-02-24 20:39 Honnappa Nagarahalli
  2020-02-24 20:39 ` [dpdk-dev] [RFC 1/1] " Honnappa Nagarahalli
                   ` (4 more replies)
  0 siblings, 5 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-02-24 20:39 UTC (permalink / raw)
  To: olivier.matz, konstantin.ananyev; +Cc: honnappa.nagarahalli, gavin.hu, dev, nd

Cover-letter:
RCU defer queue (DQ) APIs place 3 requirements on the rte_ring library.
1) Multiple entities are responsible for providing/consuming the
   data in a single element of the DQ. Existing rte_ring APIs
   require an intermediate memcpy in such cases.

   RCU DQ API, rte_rcu_qsbr_dq_enqueue, has to store the token
   (generated by the RCU DQ APIs) and the application data
   (provided by the application) in each element of the DQ.

2) Dequeue the data from the DQ only if it meets certain criteria,
   i.e. the data needs to be accessed before it can be dequeued.

   RCU DQ API, rte_rcu_qsbr_dq_reclaim, can dequeue an
   element from the DQ only if the token at the head of
   the DQ can be reclaimed. If the token cannot be reclaimed,
   the reserved elements need to be discarded.

3) While dequeuing from DQ, only one thread should be allowed
   to reserve elements.

   In order to make rte_rcu_qsbr_dq_reclaim API lock free, the
   'reserve elements in DQ, reclaim the token and revert/commit
   elements in DQ' process needs to be atomic.

The first requirement is satisfied by providing scatter-gather APIs
in rte_ring library. The enqueue/dequeue operations are split
into 3 parts:
a) Move producer's/consumer's head index, i.e. reserve elements in
   the ring and return the pointer in the ring to where the data
   needs to be copied to/from.
b) The application copies the data to/from the pointer.
c) Move producer's/consumer's tail index. i.e. indicate that
   the reserved elements are successfully consumed.
RCU DQ APIs require single element, multiple producer enqueue
operations. 'rte_ring_mp_enqueue_elem_reserve' and
'rte_ring_mp_enqueue_elem_commit' APIs are provided to address
these requirements.
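
The three steps above can be sketched as follows. This is a minimal,
single-threaded toy model and not the code from the patch: 'struct toy_ring',
'toy_enqueue_reserve', 'toy_enqueue_commit' and 'toy_demo_enqueue' are
hypothetical names, the atomics and multi-producer handling are left out,
and the ring is assumed to hold 8-byte elements. It only illustrates how a
reservation is handed back as one or two in-ring segments so that the
caller can write into the ring directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u                 /* assumed; must be a power of two */
#define TOY_RING_MASK (TOY_RING_SIZE - 1u)

/* Hypothetical single-threaded model; not the rte_ring layout. */
struct toy_ring {
	uint32_t prod_head;  /* next slot to reserve for enqueue */
	uint32_t prod_tail;  /* slots below this are visible to consumers */
	uint32_t cons_tail;  /* slots below this have been consumed */
	uint64_t slots[TOY_RING_SIZE];
};

/* Step a): move the producer head and return in-ring pointers.
 * On wrap-around the reservation is split into two segments:
 * *n1 elements at *dst1 and (n - *n1) elements at *dst2.
 * Returns n on success, 0 if there is not enough free space.
 */
static unsigned int
toy_enqueue_reserve(struct toy_ring *r, unsigned int n, uint32_t *old_head,
		uint32_t *new_head, uint64_t **dst1, unsigned int *n1,
		uint64_t **dst2)
{
	uint32_t free_entries = TOY_RING_SIZE - (r->prod_head - r->cons_tail);
	uint32_t idx = r->prod_head & TOY_RING_MASK;

	if (n == 0 || n > free_entries)
		return 0;

	*old_head = r->prod_head;
	*new_head = r->prod_head + n;
	r->prod_head = *new_head;

	*dst1 = &r->slots[idx];
	*n1 = n;
	*dst2 = NULL;
	if (idx + n > TOY_RING_SIZE) {
		*n1 = TOY_RING_SIZE - idx;   /* elements up to the end of the array */
		*dst2 = &r->slots[0];        /* remainder starts at the beginning */
	}
	return n;
}

/* Step c): publish the reserved elements by moving the producer tail. */
static void
toy_enqueue_commit(struct toy_ring *r, uint32_t old_head, uint32_t new_head)
{
	assert(r->prod_tail == old_head);
	r->prod_tail = new_head;
}

/* Step b) is done by the caller: write straight into the ring.
 * Returns the size of the first segment, 0 if the reservation failed.
 */
static unsigned int
toy_demo_enqueue(void)
{
	struct toy_ring r = { .prod_head = 6, .prod_tail = 6, .cons_tail = 6 };
	uint32_t old_head, new_head;
	uint64_t *dst1, *dst2;
	unsigned int n1, i;

	if (toy_enqueue_reserve(&r, 4, &old_head, &new_head,
			&dst1, &n1, &dst2) == 0)
		return 0;
	for (i = 0; i < n1; i++)
		dst1[i] = 100 + i;
	for (i = n1; i < 4; i++)
		dst2[i - n1] = 100 + i;
	toy_enqueue_commit(&r, old_head, new_head);
	return n1;
}
```

In the RCU DQ case only one element is reserved at a time, so the second
segment is never used; the split matters only for multi-element reservations.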

The second and third requirements are satisfied by providing
rte_ring APIs for:
a) Move consumer's head index only if there are no elements
   reserved on the ring.
b) Reset the consumer's head index to its original value.
   i.e. discard the reserved elements.

In this case, RCU DQ APIs require single element, single consumer
dequeue operations. 'rte_ring_dequeue_elem_reserve_serial',
'rte_ring_dequeue_elem_commit' and
'rte_ring_dequeue_elem_revert_serial' APIs are provided to address
these requirements.
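
The reserve/commit/revert flow for dequeue can be sketched in the same
spirit. Again this is a single-threaded toy under stated assumptions, not
the patch itself: all 'toy_' names are hypothetical, an element is assumed
to be a {token, data} pair, and a token is assumed reclaimable when it is
less than or equal to a 'completed_token' value. Where the patch spins with
rte_pause() while a reservation is outstanding, the toy returns -EBUSY so
that it cannot hang.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_DQ_SIZE 4u                   /* assumed; must be a power of two */
#define TOY_DQ_MASK (TOY_DQ_SIZE - 1u)

struct toy_elem {
	uint64_t token;  /* stands in for the token stored by the RCU DQ API */
	uint64_t data;   /* stands in for the application data */
};

struct toy_dq {
	uint32_t prod_tail;  /* elements below this are available to dequeue */
	uint32_t cons_head;  /* moved on reserve, reset on revert */
	uint32_t cons_tail;  /* moved on commit */
	struct toy_elem slots[TOY_DQ_SIZE];
};

/* Reserve one element only if nothing is reserved already (head == tail). */
static int
toy_dequeue_reserve_serial(struct toy_dq *q, uint32_t *old_head,
		uint32_t *new_head, struct toy_elem **src)
{
	if (q->cons_head != q->cons_tail)
		return -EBUSY;    /* toy only: the patch waits here instead */
	if (q->prod_tail == q->cons_head)
		return -ENOBUFS;  /* nothing to dequeue */

	*old_head = q->cons_head;
	*new_head = q->cons_head + 1;
	*src = &q->slots[q->cons_head & TOY_DQ_MASK];
	q->cons_head = *new_head;
	return 0;
}

/* Complete the dequeue: the reserved element is now consumed. */
static void
toy_dequeue_commit(struct toy_dq *q, uint32_t old_head, uint32_t new_head)
{
	assert(q->cons_tail == old_head);
	q->cons_tail = new_head;
}

/* Discard the reservation: the element stays at the head of the queue. */
static void
toy_dequeue_revert_serial(struct toy_dq *q)
{
	q->cons_head = q->cons_tail;
}

/* Inspect the head element in place and dequeue it only if reclaimable. */
static int
toy_reclaim(struct toy_dq *q, uint64_t completed_token, uint64_t *data)
{
	uint32_t old_head, new_head;
	struct toy_elem *e;
	int rc;

	rc = toy_dequeue_reserve_serial(q, &old_head, &new_head, &e);
	if (rc != 0)
		return rc;
	if (e->token > completed_token) {
		toy_dequeue_revert_serial(q);
		return -EAGAIN;
	}
	*data = e->data;
	toy_dequeue_commit(q, old_head, new_head);
	return 0;
}
```

The point of the serial reservation is that the token check happens between
reserve and commit/revert, while no other consumer can move the head.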

Honnappa Nagarahalli (1):
  lib/ring: add scatter gather and serial dequeue APIs

 lib/librte_ring/Makefile           |   1 +
 lib/librte_ring/meson.build        |   1 +
 lib/librte_ring/rte_ring_c11_mem.h |  98 +++++++
 lib/librte_ring/rte_ring_elem_sg.h | 417 +++++++++++++++++++++++++++++
 lib/librte_ring/rte_ring_generic.h |  93 +++++++
 5 files changed, 610 insertions(+)
 create mode 100644 lib/librte_ring/rte_ring_elem_sg.h

-- 
2.17.1



* [dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs
  2020-02-24 20:39 [dpdk-dev] [RFC 0/1] lib/ring: add scatter gather and serial dequeue APIs Honnappa Nagarahalli
@ 2020-02-24 20:39 ` Honnappa Nagarahalli
  2020-02-26 20:38   ` Ananyev, Konstantin
  2020-10-06 13:29 ` [dpdk-dev] [RFC v2 0/1] lib/ring: add scatter gather APIs Honnappa Nagarahalli
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-02-24 20:39 UTC (permalink / raw)
  To: olivier.matz, konstantin.ananyev; +Cc: honnappa.nagarahalli, gavin.hu, dev, nd

Add scatter gather APIs to avoid intermediate memcpy. Serial
dequeue APIs are added to support access to ring elements
before actual dequeue.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
---
 lib/librte_ring/Makefile           |   1 +
 lib/librte_ring/meson.build        |   1 +
 lib/librte_ring/rte_ring_c11_mem.h |  98 +++++++
 lib/librte_ring/rte_ring_elem_sg.h | 417 +++++++++++++++++++++++++++++
 lib/librte_ring/rte_ring_generic.h |  93 +++++++
 5 files changed, 610 insertions(+)
 create mode 100644 lib/librte_ring/rte_ring_elem_sg.h

diff --git a/lib/librte_ring/Makefile b/lib/librte_ring/Makefile
index 917c560ad..824e4a9bb 100644
--- a/lib/librte_ring/Makefile
+++ b/lib/librte_ring/Makefile
@@ -17,6 +17,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_RING) := rte_ring.c
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h \
 					rte_ring_elem.h \
+					rte_ring_elem_sg.h \
 					rte_ring_generic.h \
 					rte_ring_c11_mem.h
 
diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
index f2f3ccc88..30115ad7c 100644
--- a/lib/librte_ring/meson.build
+++ b/lib/librte_ring/meson.build
@@ -4,6 +4,7 @@
 sources = files('rte_ring.c')
 headers = files('rte_ring.h',
 		'rte_ring_elem.h',
+		'rte_ring_elem_sg.h',
 		'rte_ring_c11_mem.h',
 		'rte_ring_generic.h')
 
diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
index 0fb73a337..dcae8bcc0 100644
--- a/lib/librte_ring/rte_ring_c11_mem.h
+++ b/lib/librte_ring/rte_ring_c11_mem.h
@@ -178,4 +178,102 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
 	return n;
 }
 
+/**
+ * @internal This function updates the consumer head if there are no
+ * prior reserved elements on the ring.
+ *
+ * @param r
+ *   A pointer to the ring structure
+ * @param is_sc
+ *   Indicates whether multi-consumer path is needed or not
+ * @param n
+ *   The number of elements to dequeue, i.e. how far should the head be moved
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
+ * @param old_head
+ *   Returns head value as it was before the move, i.e. where dequeue starts
+ * @param new_head
+ *   Returns the current/new head value i.e. where dequeue finishes
+ * @param entries
+ *   Returns the number of entries in the ring BEFORE head was moved
+ * @return
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_move_cons_head_serial(struct rte_ring *r, int is_sc,
+		unsigned int n, enum rte_ring_queue_behavior behavior,
+		uint32_t *old_head, uint32_t *new_head,
+		uint32_t *entries)
+{
+	unsigned int max = n;
+	uint32_t prod_tail;
+	uint32_t cons_tail;
+	int success;
+
+	/* move cons.head atomically */
+	*old_head = __atomic_load_n(&r->cons.head, __ATOMIC_RELAXED);
+	do {
+		/* Restore n as it may change every loop */
+		n = max;
+
+		/* Load the cons.tail and ensure that it is the
+		 * same as cons.head. load-acquire synchronizes
+		 * with the store-release in update_tail.
+		 */
+		cons_tail = __atomic_load_n(&r->cons.tail, __ATOMIC_ACQUIRE);
+		if (*old_head != cons_tail) {
+			rte_pause();
+			*old_head = __atomic_load_n(&r->cons.head,
+							__ATOMIC_RELAXED);
+			success = 0;
+			continue;
+		}
+
+		/* this load-acquire synchronize with store-release of ht->tail
+		 * in update_tail.
+		 */
+		prod_tail = __atomic_load_n(&r->prod.tail,
+					__ATOMIC_ACQUIRE);
+
+		/* The subtraction is done between two unsigned 32bits value
+		 * (the result is always modulo 32 bits even if we have
+		 * cons_head > prod_tail). So 'entries' is always between 0
+		 * and size(ring)-1.
+		 */
+		*entries = (prod_tail - *old_head);
+
+		/* Set the actual entries for dequeue */
+		if (n > *entries)
+			n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
+
+		if (unlikely(n == 0))
+			return 0;
+
+		*new_head = *old_head + n;
+		if (is_sc)
+			r->cons.head = *new_head, success = 1;
+		else
+			/* on failure, *old_head will be updated */
+			success = __atomic_compare_exchange_n(&r->cons.head,
+							old_head, *new_head,
+							0, __ATOMIC_RELAXED,
+							__ATOMIC_RELAXED);
+	} while (unlikely(success == 0));
+	return n;
+}
+
+/**
+ * @internal Discard reserved ring elements
+ *
+ * @param ht
+ *   A pointer to the ring's head-tail structure
+ */
+static __rte_always_inline void
+__rte_ring_revert_head(struct rte_ring_headtail *ht)
+{
+	__atomic_store_n(&ht->head, ht->tail, __ATOMIC_RELAXED);
+}
+
 #endif /* _RTE_RING_C11_MEM_H_ */
diff --git a/lib/librte_ring/rte_ring_elem_sg.h b/lib/librte_ring/rte_ring_elem_sg.h
new file mode 100644
index 000000000..a73f4fbfe
--- /dev/null
+++ b/lib/librte_ring/rte_ring_elem_sg.h
@@ -0,0 +1,417 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ *
+ * Copyright (c) 2020 Arm Limited
+ * Copyright (c) 2010-2017 Intel Corporation
+ * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * All rights reserved.
+ * Derived from FreeBSD's bufring.h
+ * Used as BSD-3 Licensed with permission from Kip Macy.
+ */
+
+#ifndef _RTE_RING_ELEM_SG_H_
+#define _RTE_RING_ELEM_SG_H_
+
+/**
+ * @file
+ * RTE Ring with
+ * 1) user defined element size
+ * 2) scatter gather feature to copy objects to/from the ring
+ * 3) ability to reserve, consume/discard elements in the ring
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdio.h>
+#include <stdint.h>
+#include <string.h>
+#include <sys/queue.h>
+#include <errno.h>
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_memory.h>
+#include <rte_lcore.h>
+#include <rte_atomic.h>
+#include <rte_branch_prediction.h>
+#include <rte_memzone.h>
+#include <rte_pause.h>
+
+#include "rte_ring.h"
+#include "rte_ring_elem.h"
+
+/* Between load and load. there might be cpu reorder in weak model
+ * (powerpc/arm).
+ * There are 2 choices for the users
+ * 1.use rmb() memory barrier
+ * 2.use one-direction load_acquire/store_release barrier,defined by
+ * CONFIG_RTE_USE_C11_MEM_MODEL=y
+ * It depends on performance test results.
+ * By default, move common functions to rte_ring_generic.h
+ */
+#ifdef RTE_USE_C11_MEM_MODEL
+#include "rte_ring_c11_mem.h"
+#else
+#include "rte_ring_generic.h"
+#endif
+
+static __rte_always_inline void
+__rte_ring_get_elem_addr_64(struct rte_ring *r, uint32_t head,
+	uint32_t num, void **dst1, uint32_t *n1, void **dst2)
+{
+	uint32_t idx = head & r->mask;
+	uint64_t *ring = (uint64_t *)&r[1];
+
+	*dst1 = ring + idx;
+	*n1 = num;
+
+	if (idx + num > r->size) {
+		*n1 = r->size - idx;
+		*dst2 = ring;
+	}
+}
+
+static __rte_always_inline void
+__rte_ring_get_elem_addr_128(struct rte_ring *r, uint32_t head,
+	uint32_t num, void **dst1, uint32_t *n1, void **dst2)
+{
+	uint32_t idx = head & r->mask;
+	rte_int128_t *ring = (rte_int128_t *)&r[1];
+
+	*dst1 = ring + idx;
+	*n1 = num;
+
+	if (idx + num > r->size) {
+		*n1 = r->size - idx;
+		*dst2 = ring;
+	}
+}
+
+static __rte_always_inline void
+__rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
+	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void **dst2)
+{
+	if (esize == 8)
+		return __rte_ring_get_elem_addr_64(r, head,
+						num, dst1, n1, dst2);
+	else if (esize == 16)
+		return __rte_ring_get_elem_addr_128(r, head,
+						num, dst1, n1, dst2);
+	else {
+		uint32_t idx, scale, nr_idx;
+		uint32_t *ring = (uint32_t *)&r[1];
+
+		/* Normalize to uint32_t */
+		scale = esize / sizeof(uint32_t);
+		idx = head & r->mask;
+		nr_idx = idx * scale;
+
+		*dst1 = ring + nr_idx;
+		*n1 = num;
+
+		if (idx + num > r->size) {
+			*n1 = r->size - idx;
+			*dst2 = ring;
+		}
+	}
+}
+
+/**
+ * @internal Reserve ring elements to enqueue several objects on the ring
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of elements to reserve in the ring.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Reserve a fixed number of elements from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as possible from ring
+ * @param is_sp
+ *   Indicates whether to use single producer or multi-producer reserve
+ * @param old_head
+ *   Producer's head index before reservation.
+ * @param new_head
+ *   Producer's head index after reservation.
+ * @param free_space
+ *   returns the amount of space after the reserve operation has finished.
+ *   It is not updated if the number of reserved elements is zero.
+ * @param dst1
+ *   Pointer to location in the ring to copy the data.
+ * @param n1
+ *   Number of elements to copy at dst1
+ * @param dst2
+ *   In case of ring wrap around, this pointer provides the location to
+ *   copy the remaining elements. The number of elements to copy at this
+ *   location is equal to (number of elements reserved - n1)
+ * @return
+ *   Actual number of elements reserved.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r, unsigned int esize,
+		unsigned int n, enum rte_ring_queue_behavior behavior,
+		unsigned int is_sp, unsigned int *old_head,
+		unsigned int *new_head, unsigned int *free_space,
+		void **dst1, unsigned int *n1, void **dst2)
+{
+	uint32_t free_entries;
+
+	n = __rte_ring_move_prod_head(r, is_sp, n, behavior,
+			old_head, new_head, &free_entries);
+
+	if (n == 0)
+		goto end;
+
+	__rte_ring_get_elem_addr(r, *old_head, esize, n, dst1, n1, dst2);
+
+	if (free_space != NULL)
+		*free_space = free_entries - n;
+
+end:
+	return n;
+}
+
+/**
+ * @internal Consume previously reserved ring elements (for enqueue)
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param old_head
+ *   Producer's head index before reservation.
+ * @param new_head
+ *   Producer's head index after reservation.
+ * @param is_sp
+ *   Indicates whether to use single producer or multi-producer head update
+ */
+static __rte_always_inline void
+__rte_ring_do_enqueue_elem_commit(struct rte_ring *r,
+		unsigned int old_head, unsigned int new_head,
+		unsigned int is_sp)
+{
+	update_tail(&r->prod, old_head, new_head, is_sp, 1);
+}
+
+/**
+ * Reserve one element for enqueuing one object on a ring
+ * (multi-producers safe). Application must call
+ * 'rte_ring_mp_enqueue_elem_commit' to complete the enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param old_head
+ *   Producer's head index before reservation. The same should be passed to
+ *   'rte_ring_mp_enqueue_elem_commit' function.
+ * @param new_head
+ *   Producer's head index after reservation. The same should be passed to
+ *   'rte_ring_mp_enqueue_elem_commit' function.
+ * @param free_space
+ *   Returns the amount of space after the reservation operation has finished.
+ *   It is not updated if the number of reserved elements is zero.
+ * @param dst
+ *   Pointer to location in the ring to copy the data.
+ * @return
+ *   - 0: Success; one element reserved.
+ *   - -ENOBUFS: Not enough room in the ring to reserve; no element is reserved.
+ */
+static __rte_always_inline int
+rte_ring_mp_enqueue_elem_reserve(struct rte_ring *r, unsigned int esize,
+		unsigned int *old_head, unsigned int *new_head,
+		unsigned int *free_space, void **dst)
+{
+	unsigned int n;
+
+	return __rte_ring_do_enqueue_elem_reserve(r, esize, 1,
+			RTE_RING_QUEUE_FIXED, 0, old_head, new_head,
+			free_space, dst, &n, NULL) ? 0 : -ENOBUFS;
+}
+
+/**
+ * Consume previously reserved elements (for enqueue) in a ring
+ * (multi-producers safe). This API completes the enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param old_head
+ *   Producer's head index before reservation. This value was returned
+ *   when the API 'rte_ring_mp_enqueue_elem_reserve' was called.
+ * @param new_head
+ *   Producer's head index after reservation. This value was returned
+ *   when the API 'rte_ring_mp_enqueue_elem_reserve' was called.
+ */
+static __rte_always_inline void
+rte_ring_mp_enqueue_elem_commit(struct rte_ring *r, unsigned int old_head,
+		unsigned int new_head)
+{
+	__rte_ring_do_enqueue_elem_commit(r, old_head, new_head, 0);
+}
+
+/**
+ * @internal Reserve elements to dequeue several objects on the ring.
+ * This function blocks if there are elements reserved already.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to reserve in the ring
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Reserve fixed number of elements in a ring
+ *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as possible in a ring
+ * @param is_sc
+ *   Indicates whether to use single consumer or multi-consumer head update
+ * @param old_head
+ *   Consumer's head index before reservation.
+ * @param new_head
+ *   Consumer's head index after reservation.
+ * @param available
+ *   returns the number of remaining ring elements after the reservation
+ *   It is not updated if the number of reserved elements is zero.
+ * @param src1
+ *   Pointer to location in the ring to copy the data from.
+ * @param n1
+ *   Number of elements to copy from src1
+ * @param src2
+ *   In case of wrap around in the ring, this pointer provides the location
+ *   to copy the remaining elements from. The number of elements to copy from
+ *   this pointer is equal to (number of elements reserved - n1)
+ * @return
+ *   Actual number of elements reserved.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_dequeue_elem_reserve_serial(struct rte_ring *r,
+		unsigned int esize, unsigned int n,
+		enum rte_ring_queue_behavior behavior, unsigned int is_sc,
+		unsigned int *old_head, unsigned int *new_head,
+		unsigned int *available, void **src1, unsigned int *n1,
+		void **src2)
+{
+	uint32_t entries;
+
+	n = __rte_ring_move_cons_head_serial(r, is_sc, n, behavior,
+			old_head, new_head, &entries);
+
+	if (n == 0)
+		goto end;
+
+	__rte_ring_get_elem_addr(r, *old_head, esize, n, src1, n1, src2);
+
+	if (available != NULL)
+		*available = entries - n;
+
+end:
+	return n;
+}
+
+/**
+ * @internal Consume previously reserved ring elements (for dequeue)
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param old_head
+ *   Consumer's head index before reservation.
+ * @param new_head
+ *   Consumer's head index after reservation.
+ * @param is_sc
+ *   Indicates whether to use single consumer or multi-consumer head update
+ */
+static __rte_always_inline void
+__rte_ring_do_dequeue_elem_commit(struct rte_ring *r,
+		unsigned int old_head, unsigned int new_head,
+		unsigned int is_sc)
+{
+	update_tail(&r->cons, old_head, new_head, is_sc, 1);
+}
+
+/**
+ * Reserve one element on a ring for dequeue. This function blocks if there
+ * are elements reserved already. Application must call
+ * 'rte_ring_dequeue_elem_commit' or
+ * 'rte_ring_dequeue_elem_revert_serial' to complete the dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param old_head
+ *   Consumer's head index before reservation. The same should be passed to
+ *   'rte_ring_dequeue_elem_commit' function.
+ * @param new_head
+ *   Consumer's head index after reservation. The same should be passed to
+ *   'rte_ring_dequeue_elem_commit' function.
+ * @param available
+ *   returns the number of remaining ring elements after the reservation
+ *   It is not updated if the number of reserved elements is zero.
+ * @param src
+ *   Pointer to location in the ring to copy the data from.
+ * @return
+ *   - 0: Success; elements reserved
+ *   - -ENOBUFS: Not enough entries in the ring; no element is reserved.
+ */
+static __rte_always_inline int
+rte_ring_dequeue_elem_reserve_serial(struct rte_ring *r, unsigned int esize,
+		unsigned int *old_head, unsigned int *new_head,
+		unsigned int *available, void **src)
+{
+	unsigned int n;
+
+	return __rte_ring_do_dequeue_elem_reserve_serial(r, esize, 1,
+			RTE_RING_QUEUE_FIXED, r->cons.single, old_head,
+			new_head, available, src, &n, NULL) ? 0 : -ENOBUFS;
+}
+
+/**
+ * Consume previously reserved elements (for dequeue) in a ring
+ * (multi-consumer safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param old_head
+ *   Consumer's head index before reservation. This value was returned
+ *   when the API 'rte_ring_dequeue_elem_reserve_xxx' was called.
+ * @param new_head
+ *   Consumer's head index after reservation. This value was returned
+ *   when the API 'rte_ring_dequeue_elem_reserve_xxx' was called.
+ */
+static __rte_always_inline void
+rte_ring_dequeue_elem_commit(struct rte_ring *r, unsigned int old_head,
+		unsigned int new_head)
+{
+	__rte_ring_do_dequeue_elem_commit(r, old_head, new_head,
+						r->cons.single);
+}
+
+/**
+ * Discard previously reserved elements (for dequeue) in a ring.
+ *
+ * @warning
+ * This API can be called only if the ring elements were reserved
+ * using 'rte_ring_dequeue_xxx_elem_reserve_serial' APIs.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ */
+static __rte_always_inline void
+rte_ring_dequeue_elem_revert_serial(struct rte_ring *r)
+{
+	__rte_ring_revert_head(&r->cons);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RING_ELEM_SG_H_ */
diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
index 953cdbbd5..8d7a7ffcc 100644
--- a/lib/librte_ring/rte_ring_generic.h
+++ b/lib/librte_ring/rte_ring_generic.h
@@ -170,4 +170,97 @@ __rte_ring_move_cons_head(struct rte_ring *r, unsigned int is_sc,
 	return n;
 }
 
+/**
+ * @internal This function updates the consumer head if there are no
+ * prior reserved elements on the ring.
+ *
+ * @param r
+ *   A pointer to the ring structure
+ * @param is_sc
+ *   Indicates whether multi-consumer path is needed or not
+ * @param n
+ *   The number of elements we will want to dequeue, i.e. how far should the
+ *   head be moved
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
+ * @param old_head
+ *   Returns head value as it was before the move, i.e. where dequeue starts
+ * @param new_head
+ *   Returns the current/new head value i.e. where dequeue finishes
+ * @param entries
+ *   Returns the number of entries in the ring BEFORE head was moved
+ * @return
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_move_cons_head_serial(struct rte_ring *r, unsigned int is_sc,
+		unsigned int n, enum rte_ring_queue_behavior behavior,
+		uint32_t *old_head, uint32_t *new_head,
+		uint32_t *entries)
+{
+	unsigned int max = n;
+	int success;
+
+	/* move cons.head atomically */
+	do {
+		/* Restore n as it may change every loop */
+		n = max;
+
+		*old_head = r->cons.head;
+
+		/* add rmb barrier to avoid load/load reorder in weak
+		 * memory model. It is noop on x86
+		 */
+		rte_smp_rmb();
+
+		/* Ensure that cons.tail and cons.head are the same */
+		if (*old_head != r->cons.tail) {
+			rte_pause();
+
+			success = 0;
+			continue;
+		}
+
+		/* The subtraction is done between two unsigned 32bits value
+		 * (the result is always modulo 32 bits even if we have
+		 * cons_head > prod_tail). So 'entries' is always between 0
+		 * and size(ring)-1.
+		 */
+		*entries = (r->prod.tail - *old_head);
+
+		/* Set the actual entries for dequeue */
+		if (n > *entries)
+			n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
+
+		if (unlikely(n == 0))
+			return 0;
+
+		*new_head = *old_head + n;
+		if (is_sc) {
+			r->cons.head = *new_head;
+			rte_smp_rmb();
+			success = 1;
+		} else {
+			success = rte_atomic32_cmpset(&r->cons.head, *old_head,
+					*new_head);
+		}
+	} while (unlikely(success == 0));
+	return n;
+}
+
+/**
+ * @internal This function updates the head to match the tail
+ *
+ * @param ht
+ *   A pointer to the ring's head-tail structure
+ */
+static __rte_always_inline void
+__rte_ring_revert_head(struct rte_ring_headtail *ht)
+{
+	/* Discard the reserved ring elements. */
+	ht->head = ht->tail;
+}
+
 #endif /* _RTE_RING_GENERIC_H_ */
-- 
2.17.1



* Re: [dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs
  2020-02-24 20:39 ` [dpdk-dev] [RFC 1/1] " Honnappa Nagarahalli
@ 2020-02-26 20:38   ` Ananyev, Konstantin
  2020-02-26 23:21     ` Ananyev, Konstantin
  2020-02-28  0:18     ` Honnappa Nagarahalli
  0 siblings, 2 replies; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-02-26 20:38 UTC (permalink / raw)
  To: Honnappa Nagarahalli, olivier.matz; +Cc: gavin.hu, dev, nd


Hi Honnappa,

> Add scatter gather APIs to avoid intermediate memcpy. Serial
> dequeue APIs are added to support access to ring elements
> before actual dequeue.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> ---
>  lib/librte_ring/Makefile           |   1 +
>  lib/librte_ring/meson.build        |   1 +
>  lib/librte_ring/rte_ring_c11_mem.h |  98 +++++++
>  lib/librte_ring/rte_ring_elem_sg.h | 417 +++++++++++++++++++++++++++++
>  lib/librte_ring/rte_ring_generic.h |  93 +++++++
>  5 files changed, 610 insertions(+)

As you already noticed, this patch overlaps quite a bit with another one:
http://patches.dpdk.org/patch/66006/

Though it seems there are a few significant differences in
our approaches (both API and implementation).
So we probably need to come up with some common
view first, before moving forward with some unified version.
To start a discussion, I produced some comments, please see below.

I don't see changes in rte_ring.h itself, but I suppose
that's just because it is an RFC and they would be added in later versions?
Another similar question: there seems to be only _bulk_ (RTE_RING_QUEUE_FIXED) mode,
I suppose _burst_ will also be added in later versions?

> diff --git a/lib/librte_ring/rte_ring_elem_sg.h b/lib/librte_ring/rte_ring_elem_sg.h
> new file mode 100644
> index 000000000..a73f4fbfe
> --- /dev/null
> +++ b/lib/librte_ring/rte_ring_elem_sg.h
> @@ -0,0 +1,417 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + *
> + * Copyright (c) 2020 Arm Limited
> + * Copyright (c) 2010-2017 Intel Corporation
> + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> + * All rights reserved.
> + * Derived from FreeBSD's bufring.h
> + * Used as BSD-3 Licensed with permission from Kip Macy.
> + */
> +
> +#ifndef _RTE_RING_ELEM_SG_H_
> +#define _RTE_RING_ELEM_SG_H_
> +
> +/**
> + * @file
> + * RTE Ring with
> + * 1) user defined element size
> + * 2) scatter gather feature to copy objects to/from the ring
> + * 3) ability to reserve, consume/discard elements in the ring
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <stdio.h>
> +#include <stdint.h>
> +#include <string.h>
> +#include <sys/queue.h>
> +#include <errno.h>
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_memory.h>
> +#include <rte_lcore.h>
> +#include <rte_atomic.h>
> +#include <rte_branch_prediction.h>
> +#include <rte_memzone.h>
> +#include <rte_pause.h>
> +
> +#include "rte_ring.h"
> +#include "rte_ring_elem.h"
> +
> +/* Between load and load. there might be cpu reorder in weak model
> + * (powerpc/arm).
> + * There are 2 choices for the users
> + * 1.use rmb() memory barrier
> + * 2.use one-direction load_acquire/store_release barrier,defined by
> + * CONFIG_RTE_USE_C11_MEM_MODEL=y
> + * It depends on performance test results.
> + * By default, move common functions to rte_ring_generic.h
> + */
> +#ifdef RTE_USE_C11_MEM_MODEL
> +#include "rte_ring_c11_mem.h"
> +#else
> +#include "rte_ring_generic.h"
> +#endif
> +
> +static __rte_always_inline void
> +__rte_ring_get_elem_addr_64(struct rte_ring *r, uint32_t head,
> +	uint32_t num, void **dst1, uint32_t *n1, void **dst2)
> +{
> +	uint32_t idx = head & r->mask;
> +	uint64_t *ring = (uint64_t *)&r[1];
> +
> +	*dst1 = ring + idx;
> +	*n1 = num;
> +
> +	if (idx + num > r->size) {
> +		*n1 = r->size - idx;
> +		*dst2 = ring;
> +	}
> +}
> +
> +static __rte_always_inline void
> +__rte_ring_get_elem_addr_128(struct rte_ring *r, uint32_t head,
> +	uint32_t num, void **dst1, uint32_t *n1, void **dst2)
> +{
> +	uint32_t idx = head & r->mask;
> +	rte_int128_t *ring = (rte_int128_t *)&r[1];
> +
> +	*dst1 = ring + idx;
> +	*n1 = num;
> +
> +	if (idx + num > r->size) {
> +		*n1 = r->size - idx;
> +		*dst2 = ring;
> +	}
> +}
> +
> +static __rte_always_inline void
> +__rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
> +	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void **dst2)
> +{
> +	if (esize == 8)
> +		return __rte_ring_get_elem_addr_64(r, head,
> +						num, dst1, n1, dst2);
> +	else if (esize == 16)
> +		return __rte_ring_get_elem_addr_128(r, head,
> +						num, dst1, n1, dst2);
> +	else {
> +		uint32_t idx, scale, nr_idx;
> +		uint32_t *ring = (uint32_t *)&r[1];
> +
> +		/* Normalize to uint32_t */
> +		scale = esize / sizeof(uint32_t);
> +		idx = head & r->mask;
> +		nr_idx = idx * scale;
> +
> +		*dst1 = ring + nr_idx;
> +		*n1 = num;
> +
> +		if (idx + num > r->size) {
> +			*n1 = r->size - idx;
> +			*dst2 = ring;
> +		}
> +	}
> +}
> +
> +/**
> + * @internal Reserve ring elements to enqueue several objects on the ring
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param esize
> + *   The size of ring element, in bytes. It must be a multiple of 4.
> + *   This must be the same value used while creating the ring. Otherwise
> + *   the results are undefined.
> + * @param n
> + *   The number of elements to reserve in the ring.
> + * @param behavior
> + *   RTE_RING_QUEUE_FIXED:    Reserve a fixed number of elements from a ring
> + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as possible from ring
> + * @param is_sp
> + *   Indicates whether to use single producer or multi-producer reserve
> + * @param old_head
> + *   Producer's head index before reservation.
> + * @param new_head
> + *   Producer's head index after reservation.
> + * @param free_space
> + *   returns the amount of space after the reserve operation has finished.
> + *   It is not updated if the number of reserved elements is zero.
> + * @param dst1
> + *   Pointer to location in the ring to copy the data.
> + * @param n1
> + *   Number of elements to copy at dst1
> + * @param dst2
> + *   In case of ring wrap around, this pointer provides the location to
> + *   copy the remaining elements. The number of elements to copy at this
> + *   location is equal to (number of elements reserved - n1)
> + * @return
> + *   Actual number of elements reserved.
> + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r, unsigned int esize,


I do understand the purpose of reserve, then either commit/abort for serial sync mode,
but what is the purpose of the non-serial version of reserve/commit?
In serial MP/MC case, after _reserve_(n) you always have to do
_commit_(n) - you can't reduce the number of elements, or do _abort_.
Again you cannot avoid memcpy(n) here anyhow.
So what is the point of these functions for the non-serial case?

BTW, I think it would be good to have serial version of _enqueue_ too.

> +		unsigned int n, enum rte_ring_queue_behavior behavior,
> +		unsigned int is_sp, unsigned int *old_head,
> +		unsigned int *new_head, unsigned int *free_space,
> +		void **dst1, unsigned int *n1, void **dst2)

I do understand the intention to avoid memcpy(), but the proposed API
seems overcomplicated, error-prone, and not very convenient for the user.
I don't think that avoiding memcpy() will save us that many cycles here,
so it is probably better to keep the API model a bit more regular:

n = rte_ring_mp_serial_enqueue_bulk_reserve(ring, num, &free_space);
...
/* performs actual memcpy(), m<=n */ 
rte_ring_mp_serial_enqueue_bulk_commit(ring, obj,  m);

/* performs actual memcpy for num elems */ 
n = rte_ring_mp_serial_dequeue_bulk_reserve(ring, obj, num, &free_space);
....
/* m<=n */
rte_ring_mp_serial_dequeue_bulk_commit(ring, obj,  m);

Plus, we can have usual enqueue/dequeue API for serial sync mode:
rte_ring_serial_(enqueue/dequeue)_(bulk/burst)
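
As an illustration of the reserve/commit model outlined above, here is a minimal single-threaded sketch. It is not DPDK code: the toy_* names and struct toy_ring are hypothetical, there is no synchronization, and the ring size is assumed to be a power of two. Reserve only moves the producer head; commit copies m <= n objects, publishes them, and releases any unused reserved slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SIZE 8u /* number of slots, assumed to be a power of two */

/* Toy single-threaded model; names are hypothetical, not DPDK APIs. */
struct toy_ring {
	uint32_t prod_head, prod_tail;
	uint32_t cons_head, cons_tail;
	void *objs[TOY_SIZE];
};

/* Reserve up to 'num' free slots by moving prod_head only; no data copied. */
static inline uint32_t
toy_enqueue_reserve(struct toy_ring *r, uint32_t num, uint32_t *free_space)
{
	uint32_t free_entries = TOY_SIZE - (r->prod_head - r->cons_tail);
	uint32_t n = num < free_entries ? num : free_entries;

	r->prod_head += n;
	if (free_space != NULL)
		*free_space = free_entries - n;
	return n;
}

/* Copy m objects (m <= number reserved) into the ring and publish them.
 * Reserved slots that were not used are released by pulling prod_head
 * back to the new tail.
 */
static inline void
toy_enqueue_commit(struct toy_ring *r, void * const *obj, uint32_t m)
{
	uint32_t i;

	assert(m <= r->prod_head - r->prod_tail);
	for (i = 0; i < m; i++)
		r->objs[(r->prod_tail + i) & (TOY_SIZE - 1)] = obj[i];
	r->prod_tail += m;
	r->prod_head = r->prod_tail;
}
```

With this shape the caller never handles old_head/new_head, and the commit can validate m against what was actually reserved.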

> +{
> +	uint32_t free_entries;
> +
> +	n = __rte_ring_move_prod_head(r, is_sp, n, behavior,
> +			old_head, new_head, &free_entries);
> +
> +	if (n == 0)
> +		goto end;

Here and in other similar places, why not just 'return 0;'?

> +
> +	__rte_ring_get_elem_addr(r, *old_head, esize, n, dst1, n1, dst2);
> +
> +	if (free_space != NULL)
> +		*free_space = free_entries - n;
> +
> +end:
> +	return n;
> +}
> +
> +/**
> + * @internal Consume previously reserved ring elements (for enqueue)
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param old_head
> + *   Producer's head index before reservation.
> + * @param new_head
> + *   Producer's head index after reservation.
> + * @param is_sp
> + *   Indicates whether to use single producer or multi-producer head update
> + */
> +static __rte_always_inline void
> +__rte_ring_do_enqueue_elem_commit(struct rte_ring *r,
> +		unsigned int old_head, unsigned int new_head,
> +		unsigned int is_sp)
> +{
> +	update_tail(&r->prod, old_head, new_head, is_sp, 1);
> +}
> +
> +/**
> + * Reserve one element for enqueuing one object on a ring
> + * (multi-producers safe). Application must call
> + * 'rte_ring_mp_enqueue_elem_commit' to complete the enqueue operation.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param esize
> + *   The size of ring element, in bytes. It must be a multiple of 4.
> + *   This must be the same value used while creating the ring. Otherwise
> + *   the results are undefined.
> + * @param old_head
> + *   Producer's head index before reservation. The same should be passed to
> + *   'rte_ring_mp_enqueue_elem_commit' function.
> + * @param new_head
> + *   Producer's head index after reservation. The same should be passed to
> + *   'rte_ring_mp_enqueue_elem_commit' function.
> + * @param free_space
> + *   Returns the amount of space after the reservation operation has finished.
> + *   It is not updated if the number of reserved elements is zero.
> + * @param dst
> + *   Pointer to location in the ring to copy the data.
> + * @return
> + *   - 0: Success; objects enqueued.
> + *   - -ENOBUFS: Not enough room in the ring to reserve; no element is reserved.
> + */
> +static __rte_always_inline int
> +rte_ring_mp_enqueue_elem_reserve(struct rte_ring *r, unsigned int esize,
> +		unsigned int *old_head, unsigned int *new_head,
> +		unsigned int *free_space, void **dst)
> +{
> +	unsigned int n;
> +
> +	return __rte_ring_do_enqueue_elem_reserve(r, esize, 1,
> +			RTE_RING_QUEUE_FIXED, 0, old_head, new_head,
> +			free_space, dst, &n, NULL) ? 0 : -ENOBUFS;
> +}
> +
> +/**
> + * Consume previously reserved elements (for enqueue) in a ring
> + * (multi-producers safe). This API completes the enqueue operation.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param old_head
> + *   Producer's head index before reservation. This value was returned
> + *   when the API 'rte_ring_mp_enqueue_elem_reserve' was called.
> + * @param new_head
> + *   Producer's head index after reservation. This value was returned
> + *   when the API 'rte_ring_mp_enqueue_elem_reserve' was called.
> + */
> +static __rte_always_inline void
> +rte_ring_mp_enqueue_elem_commit(struct rte_ring *r, unsigned int old_head,
> +		unsigned int new_head)
> +{
> +	__rte_ring_do_enqueue_elem_commit(r, old_head, new_head, 0);
> +}
> +
> +/**
> + * @internal Reserve elements to dequeue several objects on the ring.
> + * This function blocks if there are elements reserved already.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param esize
> + *   The size of ring element, in bytes. It must be a multiple of 4.
> + *   This must be the same value used while creating the ring. Otherwise
> + *   the results are undefined.
> + * @param n
> + *   The number of objects to reserve in the ring
> + * @param behavior
> + *   RTE_RING_QUEUE_FIXED:    Reserve fixed number of elements in a ring
> + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as possible in a ring
> + * @param is_sc
> + *   Indicates whether to use single consumer or multi-consumer head update
> + * @param old_head
> + *   Consumer's head index before reservation.
> + * @param new_head
> + *   Consumer's head index after reservation.
> + * @param available
> + *   returns the number of remaining ring elements after the reservation
> + *   It is not updated if the number of reserved elements is zero.
> + * @param src1
> + *   Pointer to location in the ring to copy the data from.
> + * @param n1
> + *   Number of elements to copy from src1
> + * @param src2
> + *   In case of wrap around in the ring, this pointer provides the location
> + *   to copy the remaining elements from. The number of elements to copy from
> + *   this pointer is equal to (number of elements reserved - n1)
> + * @return
> + *   Actual number of elements reserved.
> + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_do_dequeue_elem_reserve_serial(struct rte_ring *r,
> +		unsigned int esize, unsigned int n,
> +		enum rte_ring_queue_behavior behavior, unsigned int is_sc,
> +		unsigned int *old_head, unsigned int *new_head,
> +		unsigned int *available, void **src1, unsigned int *n1,
> +		void **src2)
> +{
> +	uint32_t entries;
> +
> +	n = __rte_ring_move_cons_head_serial(r, is_sc, n, behavior,
> +			old_head, new_head, &entries);
> +
> +	if (n == 0)
> +		goto end;
> +
> +	__rte_ring_get_elem_addr(r, *old_head, esize, n, src1, n1, src2);
> +
> +	if (available != NULL)
> +		*available = entries - n;
> +
> +end:
> +	return n;
> +}
> +
> +/**
> + * @internal Consume previously reserved ring elements (for dequeue)
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param old_head
> + *   Consumer's head index before reservation.
> + * @param new_head
> + *   Consumer's head index after reservation.
> + * @param is_sc
> + *   Indicates whether to use single consumer or multi-consumer head update
> + */
> +static __rte_always_inline void
> +__rte_ring_do_dequeue_elem_commit(struct rte_ring *r,
> +		unsigned int old_head, unsigned int new_head,
> +		unsigned int is_sc)
> +{

I think it is a bit dangerous and error-prone to let the user
specify old_head/new_head manually.
Seems better to have just _commit(ring, num) - see above.
That way the user doesn't have to calculate the new head manually,
plus we can have a check that ring.tail - ring.head >= num.

> +	update_tail(&r->cons, old_head, new_head, is_sc, 1);

I think update_tail() is not enough here,
as in some cases we need to update ring.head also:
let's say the user reserved 2 elems, but then decided to commit only one.
So I think we need a special new function instead of update_tail() here.
BTW, in HTS I use atomic 64-bit read/write to get/set both head and tail in one go.
This is not really required - two 32-bit ops would work too, I think.
As usual, both ways have some pros and cons:
using 64-bit ops might be faster on a 64-bit target, plus it is less error-prone
(no need to think about head/tail read/write ordering, etc.),
though for a 32-bit target it would mean some extra overhead.
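
The idea of reading and writing head and tail together can be sketched with C11 atomics as below. This is only an illustration under stated assumptions, not the HTS implementation: struct ht64 and the ht_* helpers are hypothetical, head is assumed to live in the upper 32 bits, and the check for available entries against the producer side is left out for brevity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: head in the upper 32 bits, tail in the lower 32. */
struct ht64 {
	_Atomic uint64_t raw;
};

static inline uint64_t ht_pack(uint32_t head, uint32_t tail)
{
	return ((uint64_t)head << 32) | tail;
}

static inline uint32_t ht_head(uint64_t v) { return (uint32_t)(v >> 32); }
static inline uint32_t ht_tail(uint64_t v) { return (uint32_t)v; }

/* Reserve n elements. Succeeds only when nothing is reserved already
 * (head == tail), which serializes reservations. Returns 1 on success,
 * 0 if a reservation is outstanding or the CAS lost a race.
 */
static inline int
ht_reserve(struct ht64 *ht, uint32_t n)
{
	uint64_t old = atomic_load_explicit(&ht->raw, memory_order_acquire);

	if (ht_head(old) != ht_tail(old))
		return 0;
	return atomic_compare_exchange_strong_explicit(&ht->raw, &old,
			ht_pack(ht_head(old) + n, ht_tail(old)),
			memory_order_acquire, memory_order_relaxed);
}

/* Commit m elements (m <= number reserved). Head and tail are both set
 * to tail + m in one 64-bit store, so a partial commit also trims head.
 */
static inline void
ht_commit(struct ht64 *ht, uint32_t m)
{
	uint64_t old = atomic_load_explicit(&ht->raw, memory_order_relaxed);
	uint32_t pos = ht_tail(old) + m;

	assert(m <= ht_head(old) - ht_tail(old));
	atomic_store_explicit(&ht->raw, ht_pack(pos, pos),
			memory_order_release);
}
```

Because both indices change in one store, the "reserved 2, committed 1" case needs no separate head fix-up and no reasoning about the relative ordering of two 32-bit writes.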

> +}
> +
> +/**
> + * Reserve one element on a ring for dequeue. This function blocks if there
> + * are elements reserved already. Application must call
> + * 'rte_ring_do_dequeue_elem_commit' or
> + * `rte_ring_do_dequeue_elem_revert_serial' to complete the dequeue operation.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param esize
> + *   The size of ring element, in bytes. It must be a multiple of 4.
> + *   This must be the same value used while creating the ring. Otherwise
> + *   the results are undefined.
> + * @param old_head
> + *   Consumer's head index before reservation. The same should be passed to
> + *   'rte_ring_dequeue_elem_commit' function.
> + * @param new_head
> + *   Consumer's head index after reservation. The same should be passed to
> + *   'rte_ring_dequeue_elem_commit' function.
> + * @param available
> + *   returns the number of remaining ring elements after the reservation
> + *   It is not updated if the number of reserved elements is zero.
> + * @param src
> + *   Pointer to location in the ring to copy the data from.
> + * @return
> + *   - 0: Success; elements reserved
> + *   - -ENOBUFS: Not enough room in the ring; no element is reserved.
> + */
> +static __rte_always_inline int
> +rte_ring_dequeue_elem_reserve_serial(struct rte_ring *r, unsigned int esize,
> +		unsigned int *old_head, unsigned int *new_head,
> +		unsigned int *available, void **src)
> +{
> +	unsigned int n;
> +
> +	return __rte_ring_do_dequeue_elem_reserve_serial(r, esize, 1,
> +			RTE_RING_QUEUE_FIXED, r->cons.single, old_head,
> +			new_head, available, src, &n, NULL) ? 0 : -ENOBUFS;
> +}
> +
> +/**
> + * Consume previously reserved elements (for dequeue) in a ring
> + * (multi-consumer safe).
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param old_head
> + *   Consumer's head index before reservation. This value was returned
> + *   when the API 'rte_ring_dequeue_elem_reserve_xxx' was called.
> + * @param new_head
> + *   Consumer's head index after reservation. This value was returned
> + *   when the API 'rte_ring_dequeue_elem_reserve_xxx' was called.
> + */
> +static __rte_always_inline void
> +rte_ring_dequeue_elem_commit(struct rte_ring *r, unsigned int old_head,
> +		unsigned int new_head)
> +{
> +	__rte_ring_do_dequeue_elem_commit(r, old_head, new_head,
> +						r->cons.single);
> +}
> +
> +/**
> + * Discard previously reserved elements (for dequeue) in a ring.
> + *
> + * @warning
> + * This API can be called only if the ring elements were reserved
> + * using 'rte_ring_dequeue_xxx_elem_reserve_serial' APIs.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + */
> +static __rte_always_inline void
> +rte_ring_dequeue_elem_revert_serial(struct rte_ring *r)
> +{
> +	__rte_ring_revert_head(&r->cons);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_RING_ELEM_SG_H_ */
> diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
> index 953cdbbd5..8d7a7ffcc 100644
> --- a/lib/librte_ring/rte_ring_generic.h
> +++ b/lib/librte_ring/rte_ring_generic.h
> @@ -170,4 +170,97 @@ __rte_ring_move_cons_head(struct rte_ring *r, unsigned int is_sc,
>  	return n;
>  }
> 
> +/**
> + * @internal This function updates the consumer head if there are no
> + * prior reserved elements on the ring.
> + *
> + * @param r
> + *   A pointer to the ring structure
> + * @param is_sc
> + *   Indicates whether multi-consumer path is needed or not
> + * @param n
> + *   The number of elements we will want to dequeue, i.e. how far should the
> + *   head be moved
> + * @param behavior
> + *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
> + *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
> + * @param old_head
> + *   Returns head value as it was before the move, i.e. where dequeue starts
> + * @param new_head
> + *   Returns the current/new head value i.e. where dequeue finishes
> + * @param entries
> + *   Returns the number of entries in the ring BEFORE head was moved
> + * @return
> + *   - Actual number of objects dequeued.
> + *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_move_cons_head_serial(struct rte_ring *r, unsigned int is_sc,
> +		unsigned int n, enum rte_ring_queue_behavior behavior,
> +		uint32_t *old_head, uint32_t *new_head,
> +		uint32_t *entries)
> +{
> +	unsigned int max = n;
> +	int success;
> +
> +	/* move cons.head atomically */
> +	do {
> +		/* Restore n as it may change every loop */
> +		n = max;
> +
> +		*old_head = r->cons.head;
> +
> +		/* add rmb barrier to avoid load/load reorder in weak
> +		 * memory model. It is noop on x86
> +		 */
> +		rte_smp_rmb();
> +
> +		/* Ensure that cons.tail and cons.head are the same */
> +		if (*old_head != r->cons.tail) {
> +			rte_pause();
> +
> +			success = 0;
> +			continue;
> +		}
> +
> +		/* The subtraction is done between two unsigned 32bits value
> +		 * (the result is always modulo 32 bits even if we have
> +		 * cons_head > prod_tail). So 'entries' is always between 0
> +		 * and size(ring)-1.
> +		 */
> +		*entries = (r->prod.tail - *old_head);
> +
> +		/* Set the actual entries for dequeue */
> +		if (n > *entries)
> +			n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
> +
> +		if (unlikely(n == 0))
> +			return 0;
> +
> +		*new_head = *old_head + n;
> +		if (is_sc) {
> +			r->cons.head = *new_head;
> +			rte_smp_rmb();
> +			success = 1;

I don't think we need to worry about SC case in this function.
For sc(/sp)_serial (if we need such mode) - we probably can use normal move_(cons/prod)_head().

> +		} else {
> +			success = rte_atomic32_cmpset(&r->cons.head, *old_head,
> +					*new_head);
> +		}
> +	} while (unlikely(success == 0));
> +	return n;
> +}
> +
> +/**
> + * @internal This function updates the head to match the tail
> + *
> + * @param ht
> + *   A pointer to the ring's head-tail structure
> + */
> +static __rte_always_inline void
> +__rte_ring_revert_head(struct rte_ring_headtail *ht)
> +{
> +	/* Discard the reserved ring elements. */
> +	ht->head = ht->tail;
> +}
> +
>  #endif /* _RTE_RING_GENERIC_H_ */
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs
  2020-02-26 20:38   ` Ananyev, Konstantin
@ 2020-02-26 23:21     ` Ananyev, Konstantin
  2020-02-28  0:18     ` Honnappa Nagarahalli
  1 sibling, 0 replies; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-02-26 23:21 UTC (permalink / raw)
  To: Ananyev, Konstantin, Honnappa Nagarahalli, olivier.matz; +Cc: gavin.hu, dev, nd

> > +/**
> > + * @internal Reserve ring elements to enqueue several objects on the ring
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param esize
> > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > + *   This must be the same value used while creating the ring. Otherwise
> > + *   the results are undefined.
> > + * @param n
> > + *   The number of elements to reserve in the ring.
> > + * @param behavior
> > + *   RTE_RING_QUEUE_FIXED:    Reserve a fixed number of elements from a ring
> > + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as possible from ring
> > + * @param is_sp
> > + *   Indicates whether to use single producer or multi-producer reserve
> > + * @param old_head
> > + *   Producer's head index before reservation.
> > + * @param new_head
> > + *   Producer's head index after reservation.
> > + * @param free_space
> > + *   returns the amount of space after the reserve operation has finished.
> > + *   It is not updated if the number of reserved elements is zero.
> > + * @param dst1
> > + *   Pointer to location in the ring to copy the data.
> > + * @param n1
> > + *   Number of elements to copy at dst1
> > + * @param dst2
> > + *   In case of ring wrap around, this pointer provides the location to
> > + *   copy the remaining elements. The number of elements to copy at this
> > + *   location is equal to (number of elements reserved - n1)
> > + * @return
> > + *   Actual number of elements reserved.
> > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > + */
> > +static __rte_always_inline unsigned int
> > +__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r, unsigned int esize,
> 
> 
> I do understand the purpose of reserve, then either commit/abort for serial sync mode,
> but what is the purpose of non-serial version of reserve/commit?
> In serial  MP/MC case, after _reserve_(n) you always have to do

Typo, meant 'in non-serial MP/MC case' of course.

> _commit_(n) - you can't reduce number of elements, or do _abort_.
> Again you cannot avoid memcpy(n) here anyhow.
> So what is the point of these functions for non-serial case?
> 
> BTW, I think it would be good to have serial version of _enqueue_ too.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs
  2020-02-26 20:38   ` Ananyev, Konstantin
  2020-02-26 23:21     ` Ananyev, Konstantin
@ 2020-02-28  0:18     ` Honnappa Nagarahalli
  2020-03-02 18:20       ` Ananyev, Konstantin
  1 sibling, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-02-28  0:18 UTC (permalink / raw)
  To: Ananyev, Konstantin, olivier.matz
  Cc: Gavin Hu, dev, nd, Honnappa Nagarahalli, nd

<snip>
> 
> 
> Hi Honnappa,
Thanks Konstantin for the comments.
> 
> > Add scatter gather APIs to avoid intermediate memcpy. Serial dequeue
> > APIs are added to support access to ring elements before actual
> > dequeue.
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> > Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> > ---
> >  lib/librte_ring/Makefile           |   1 +
> >  lib/librte_ring/meson.build        |   1 +
> >  lib/librte_ring/rte_ring_c11_mem.h |  98 +++++++
> > lib/librte_ring/rte_ring_elem_sg.h | 417 +++++++++++++++++++++++++++++
> > lib/librte_ring/rte_ring_generic.h |  93 +++++++
> >  5 files changed, 610 insertions(+)
> 
> As was already noticed by you this patch overlaps quite a bit with another
> one:
> http://patches.dpdk.org/patch/66006/
I took a cursory look at this. I need to take a detailed look, plan to do so soon.

> 
> Though it seems there are few significant differences in our approaches (both
> API and implementation).
> So we probably need to come-up with some common view first, before
> moving forward with some unified version.
> To start a discussion, I produced some comments, pls see below.
> 
> I don't see changes in rte_ring.h itself, but I suppose that's just because it is an
> RFC and it would be added in later versions?
I did not plan to add them. IMO, we should not add new APIs to that list. We should encourage use of the rte_ring_xxx_elem APIs going forward. They are interoperable (I mean, the application can call a mix of APIs).

> Another similar question there seems only _bulk_ (RTE_RING_QUEUE_FIXED)
> mode, I suppose _burst_ will also be added in later versions?
Here, I was trying to avoid providing APIs without a clear need (_bulk_ is enough for RCU for now). If you see the need, I can add them.

> 
> > diff --git a/lib/librte_ring/rte_ring_elem_sg.h
> > b/lib/librte_ring/rte_ring_elem_sg.h
> > new file mode 100644
> > index 000000000..a73f4fbfe
> > --- /dev/null
> > +++ b/lib/librte_ring/rte_ring_elem_sg.h
> > @@ -0,0 +1,417 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + *
> > + * Copyright (c) 2020 Arm Limited
> > + * Copyright (c) 2010-2017 Intel Corporation
> > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > + * All rights reserved.
> > + * Derived from FreeBSD's bufring.h
> > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > + */
> > +
> > +#ifndef _RTE_RING_ELEM_SG_H_
> > +#define _RTE_RING_ELEM_SG_H_
> > +
> > +/**
> > + * @file
> > + * RTE Ring with
> > + * 1) user defined element size
> > + * 2) scatter gather feature to copy objects to/from the ring
> > + * 3) ability to reserve, consume/discard elements in the ring  */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include <stdio.h>
> > +#include <stdint.h>
> > +#include <string.h>
> > +#include <sys/queue.h>
> > +#include <errno.h>
> > +#include <rte_common.h>
> > +#include <rte_config.h>
> > +#include <rte_memory.h>
> > +#include <rte_lcore.h>
> > +#include <rte_atomic.h>
> > +#include <rte_branch_prediction.h>
> > +#include <rte_memzone.h>
> > +#include <rte_pause.h>
> > +
> > +#include "rte_ring.h"
> > +#include "rte_ring_elem.h"
> > +
> > +/* Between load and load. there might be cpu reorder in weak model
> > + * (powerpc/arm).
> > + * There are 2 choices for the users
> > + * 1.use rmb() memory barrier
> > + * 2.use one-direction load_acquire/store_release barrier,defined by
> > + * CONFIG_RTE_USE_C11_MEM_MODEL=y
> > + * It depends on performance test results.
> > + * By default, move common functions to rte_ring_generic.h
> > + */
> > +#ifdef RTE_USE_C11_MEM_MODEL
> > +#include "rte_ring_c11_mem.h"
> > +#else
> > +#include "rte_ring_generic.h"
> > +#endif
> > +
> > +static __rte_always_inline void
> > +__rte_ring_get_elem_addr_64(struct rte_ring *r, uint32_t head,
> > +	uint32_t num, void **dst1, uint32_t *n1, void **dst2) {
> > +	uint32_t idx = head & r->mask;
> > +	uint64_t *ring = (uint64_t *)&r[1];
> > +
> > +	*dst1 = ring + idx;
> > +	*n1 = num;
> > +
> > +	if (idx + num > r->size) {
> > +		*n1 = num - (r->size - idx - 1);
> > +		*dst2 = ring;
> > +	}
> > +}
> > +
> > +static __rte_always_inline void
> > +__rte_ring_get_elem_addr_128(struct rte_ring *r, uint32_t head,
> > +	uint32_t num, void **dst1, uint32_t *n1, void **dst2) {
> > +	uint32_t idx = head & r->mask;
> > +	rte_int128_t *ring = (rte_int128_t *)&r[1];
> > +
> > +	*dst1 = ring + idx;
> > +	*n1 = num;
> > +
> > +	if (idx + num > r->size) {
> > +		*n1 = num - (r->size - idx - 1);
> > +		*dst2 = ring;
> > +	}
> > +}
> > +
> > +static __rte_always_inline void
> > +__rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
> > +	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void **dst2)
> > +{
> > +	if (esize == 8)
> > +		return __rte_ring_get_elem_addr_64(r, head,
> > +						num, dst1, n1, dst2);
> > +	else if (esize == 16)
> > +		return __rte_ring_get_elem_addr_128(r, head,
> > +						num, dst1, n1, dst2);
> > +	else {
> > +		uint32_t idx, scale, nr_idx;
> > +		uint32_t *ring = (uint32_t *)&r[1];
> > +
> > +		/* Normalize to uint32_t */
> > +		scale = esize / sizeof(uint32_t);
> > +		idx = head & r->mask;
> > +		nr_idx = idx * scale;
> > +
> > +		*dst1 = ring + nr_idx;
> > +		*n1 = num;
> > +
> > +		if (idx + num > r->size) {
> > +			*n1 = num - (r->size - idx - 1);
> > +			*dst2 = ring;
> > +		}
> > +	}
> > +}
> > +
> > +/**
> > + * @internal Reserve ring elements to enqueue several objects on the ring
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param esize
> > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > + *   This must be the same value used while creating the ring. Otherwise
> > + *   the results are undefined.
> > + * @param n
> > + *   The number of elements to reserve in the ring.
> > + * @param behavior
> > + *   RTE_RING_QUEUE_FIXED:    Reserve a fixed number of elements from a ring
> > + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as possible from ring
> > + * @param is_sp
> > + *   Indicates whether to use single producer or multi-producer reserve
> > + * @param old_head
> > + *   Producer's head index before reservation.
> > + * @param new_head
> > + *   Producer's head index after reservation.
> > + * @param free_space
> > + *   returns the amount of space after the reserve operation has finished.
> > + *   It is not updated if the number of reserved elements is zero.
> > + * @param dst1
> > + *   Pointer to location in the ring to copy the data.
> > + * @param n1
> > + *   Number of elements to copy at dst1
> > + * @param dst2
> > + *   In case of ring wrap around, this pointer provides the location to
> > + *   copy the remaining elements. The number of elements to copy at this
> > + *   location is equal to (number of elements reserved - n1)
> > + * @return
> > + *   Actual number of elements reserved.
> > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > + */
> > +static __rte_always_inline unsigned int
> > +__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r, unsigned int esize,
> 
> 
> I do understand the purpose of reserve, then either commit/abort for serial
> sync mode, but what is the purpose of non-serial version of reserve/commit?
In RCU, I have the need for a scatter-gather feature, i.e. the data in the ring element comes from multiple sources (the 'token' is generated by the RCU library and the application provides additional data). If I do not provide reserve/commit, I need to introduce an intermediate memcpy to place these two pieces of data contiguously before copying them to the ring element. The sequence is 'reserve(1), memcpy1, memcpy2, commit(1)'. Hence, you do not see the abort API for the enqueue.
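
The 'reserve(1), memcpy1, memcpy2, commit(1)' sequence can be sketched on a toy single-producer ring as follows. This is an illustration only, not the patch's code: struct sg_ring and the sg_* functions are hypothetical, the element is assumed to be an 8-byte token followed by 8 bytes of application data, and there is no multi-thread synchronization.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SG_SLOTS 8u  /* number of ring slots, assumed power of two */
#define SG_ESIZE 16u /* bytes per element: 8-byte token + 8 bytes of data */

/* Toy single-producer ring; all names here are hypothetical. */
struct sg_ring {
	uint32_t head; /* next slot to reserve */
	uint32_t tail; /* slots before tail are visible to the consumer */
	uint32_t cons; /* consumer position */
	uint8_t slots[SG_SLOTS][SG_ESIZE];
};

/* Reserve one slot; returns a pointer into ring storage, NULL if full. */
static inline void *
sg_reserve(struct sg_ring *r)
{
	if (r->head - r->cons == SG_SLOTS)
		return NULL;
	return r->slots[r->head++ & (SG_SLOTS - 1)];
}

/* Publish the single reserved slot. */
static inline void
sg_commit(struct sg_ring *r)
{
	assert(r->head - r->tail == 1);
	r->tail = r->head;
}

/* Enqueue a library token plus application data without a staging buffer. */
static inline int
sg_enqueue(struct sg_ring *r, uint64_t token, const void *data)
{
	uint8_t *slot = sg_reserve(r);

	if (slot == NULL)
		return -1;
	/* memcpy1: the part owned by the library */
	memcpy(slot, &token, sizeof(token));
	/* memcpy2: the part provided by the application */
	memcpy(slot + sizeof(token), data, SG_ESIZE - sizeof(token));
	sg_commit(r);
	return 0;
}
```

Without the reserve step, the two sources would first have to be assembled into a temporary 16-byte buffer and then copied a second time into the ring.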
 
> In serial  MP/MC case, after _reserve_(n) you always have to do
> _commit_(n) - you can't reduce number of elements, or do _abort_.
Agree, the intention here is to provide the scatter/gather feature.

> Again you cannot avoid memcpy(n) here anyhow.
> So what is the point of these functions for non-serial case?
It avoids an intermediate memcpy when the data is coming from multiple sources.

> 
> BTW, I think it would be good to have serial version of _enqueue_ too.
If there is a good use case, they should be provided. I did not come across a good use case.

> 
> > +		unsigned int n, enum rte_ring_queue_behavior behavior,
> > +		unsigned int is_sp, unsigned int *old_head,
> > +		unsigned int *new_head, unsigned int *free_space,
> > +		void **dst1, unsigned int *n1, void **dst2)
> 
> I do understand the intention to avoid memcpy(), but proposed API seems
> overcomplicated, error prone, and not very convenient for the user.
The issue is the need to handle the wrap-around in the ring storage array, i.e. when space is reserved for more than 1 ring element, the wrap-around might happen.
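
To make the wrap-around case concrete, here is a small standalone sketch of splitting a reservation into at most two contiguous segments. It is not the patch's __rte_ring_get_elem_addr: struct seg and seg_split are hypothetical names, and the ring size is assumed to be a power of two with num <= size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of rte_ring: describes where a
 * reserved span lands in the ring storage array.
 */
struct seg {
	uint32_t idx1; /* start index of the first segment */
	uint32_t n1;   /* elements in the first segment */
	uint32_t n2;   /* elements in the second segment, which starts at 0 */
};

/* Split 'num' elements starting at 'head' into at most two segments. */
static inline struct seg
seg_split(uint32_t head, uint32_t num, uint32_t size)
{
	struct seg s;
	uint32_t idx = head & (size - 1);

	assert(size != 0 && (size & (size - 1)) == 0);
	assert(num <= size);

	s.idx1 = idx;
	if (idx + num > size) {
		/* Wrap: the first segment runs to the end of the array. */
		s.n1 = size - idx;
		s.n2 = num - s.n1;
	} else {
		s.n1 = num;
		s.n2 = 0;
	}
	return s;
}
```

For example, with size 8, head 6 and num 4, the first segment covers slots 6 and 7 and the second covers slots 0 and 1, which is why the caller needs two destination pointers rather than one.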

> I don't think that avoiding memcpy() will save us that many cycles here, so
This depends on the amount of data being copied.

> probably better to keep API model a bit more regular:
> 
> n = rte_ring_mp_serial_enqueue_bulk_reserve(ring, num, &free_space);
> ...
> /* performs actual memcpy(), m<=n */
> rte_ring_mp_serial_enqueue_bulk_commit(ring, obj,  m);
These do not take care of the wrap-around case, or I am not able to understand your comment.

> 
> /* performs actual memcpy for num elems */
> n = rte_ring_mp_serial_dequeue_bulk_reserve(ring, obj, num, &free_space);
> ....
> /* m<=n */
> rte_ring_mp_serial_dequeue_bulk_commit(ring, obj,  m);
> 
> Plus, we can have usual enqueue/dequeue API for serial sync mode:
> rte_ring_serial_(enqueue/dequeue)_(bulk/burst)
It would be good to understand the use cases. IMO, if we do not have use cases, we should not add them for now. We can add them as and when the use cases are understood.

> 
> > +{
> > +	uint32_t free_entries;
> > +
> > +	n = __rte_ring_move_prod_head(r, is_sp, n, behavior,
> > +			old_head, new_head, &free_entries);
> > +
> > +	if (n == 0)
> > +		goto end;
> 
> Here and in other similar places, why not just 'return 0;'?
Yes, should be possible.

> 
> > +
> > +	__rte_ring_get_elem_addr(r, *old_head, esize, n, dst1, n1, dst2);
> > +
> > +	if (free_space != NULL)
> > +		*free_space = free_entries - n;
> > +
> > +end:
> > +	return n;
> > +}
> > +
> > +/**
> > + * @internal Consume previously reserved ring elements (for enqueue)
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param old_head
> > + *   Producer's head index before reservation.
> > + * @param new_head
> > + *   Producer's head index after reservation.
> > + * @param is_sp
> > + *   Indicates whether to use single producer or multi-producer head update
> > + */
> > +static __rte_always_inline void
> > +__rte_ring_do_enqueue_elem_commit(struct rte_ring *r,
> > +		unsigned int old_head, unsigned int new_head,
> > +		unsigned int is_sp)
> > +{
> > +	update_tail(&r->prod, old_head, new_head, is_sp, 1);
> > +}
> > +
> > +/**
> > + * Reserve one element for enqueuing one object on a ring
> > + * (multi-producers safe). Application must call
> > + * 'rte_ring_mp_enqueue_elem_commit' to complete the enqueue operation.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param esize
> > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > + *   This must be the same value used while creating the ring. Otherwise
> > + *   the results are undefined.
> > + * @param old_head
> > + *   Producer's head index before reservation. The same should be passed to
> > + *   'rte_ring_mp_enqueue_elem_commit' function.
> > + * @param new_head
> > + *   Producer's head index after reservation. The same should be passed to
> > + *   'rte_ring_mp_enqueue_elem_commit' function.
> > + * @param free_space
> > + *   Returns the amount of space after the reservation operation has finished.
> > + *   It is not updated if the number of reserved elements is zero.
> > + * @param dst
> > + *   Pointer to location in the ring to copy the data.
> > + * @return
> > + *   - 0: Success; objects enqueued.
> > + *   - -ENOBUFS: Not enough room in the ring to reserve; no element is reserved.
> > + */
> > +static __rte_always_inline int
> > +rte_ring_mp_enqueue_elem_reserve(struct rte_ring *r, unsigned int esize,
> > +		unsigned int *old_head, unsigned int *new_head,
> > +		unsigned int *free_space, void **dst)
> > +{
> > +	unsigned int n;
> > +
> > +	return __rte_ring_do_enqueue_elem_reserve(r, esize, 1,
> > +			RTE_RING_QUEUE_FIXED, 0, old_head, new_head,
> > +			free_space, dst, &n, NULL) ? 0 : -ENOBUFS;
> > +}
> > +
> > +/**
> > + * Consume previously reserved elements (for enqueue) in a ring
> > + * (multi-producers safe). This API completes the enqueue operation.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param old_head
> > + *   Producer's head index before reservation. This value was returned
> > + *   when the API 'rte_ring_mp_enqueue_elem_reserve' was called.
> > + * @param new_head
> > + *   Producer's head index after reservation. This value was returned
> > + *   when the API 'rte_ring_mp_enqueue_elem_reserve' was called.
> > + */
> > +static __rte_always_inline void
> > +rte_ring_mp_enqueue_elem_commit(struct rte_ring *r, unsigned int old_head,
> > +		unsigned int new_head)
> > +{
> > +	__rte_ring_do_enqueue_elem_commit(r, old_head, new_head, 0);
> > +}
> > +
> > +/**
> > + * @internal Reserve elements to dequeue several objects on the ring.
> > + * This function blocks if there are elements reserved already.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param esize
> > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > + *   This must be the same value used while creating the ring. Otherwise
> > + *   the results are undefined.
> > + * @param n
> > + *   The number of objects to reserve in the ring
> > + * @param behavior
> > + *   RTE_RING_QUEUE_FIXED:    Reserve fixed number of elements in a ring
> > + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as possible in a ring
> > + * @param is_sc
> > + *   Indicates whether to use single consumer or multi-consumer head update
> > + * @param old_head
> > + *   Consumer's head index before reservation.
> > + * @param new_head
> > + *   Consumer's head index after reservation.
> > + * @param available
> > + *   returns the number of remaining ring elements after the reservation
> > + *   It is not updated if the number of reserved elements is zero.
> > + * @param src1
> > + *   Pointer to location in the ring to copy the data from.
> > + * @param n1
> > + *   Number of elements to copy from src1
> > + * @param src2
> > + *   In case of wrap around in the ring, this pointer provides the location
> > + *   to copy the remaining elements from. The number of elements to copy
> from
> > + *   this pointer is equal to (number of elements reserved - n1)
> > + * @return
> > + *   Actual number of elements reserved.
> > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > + */
> > +static __rte_always_inline unsigned int
> > +__rte_ring_do_dequeue_elem_reserve_serial(struct rte_ring *r,
> > +		unsigned int esize, unsigned int n,
> > +		enum rte_ring_queue_behavior behavior, unsigned int is_sc,
> > +		unsigned int *old_head, unsigned int *new_head,
> > +		unsigned int *available, void **src1, unsigned int *n1,
> > +		void **src2)
> > +{
> > +	uint32_t entries;
> > +
> > +	n = __rte_ring_move_cons_head_serial(r, is_sc, n, behavior,
> > +			old_head, new_head, &entries);
> > +
> > +	if (n == 0)
> > +		goto end;
> > +
> > +	__rte_ring_get_elem_addr(r, *old_head, esize, n, src1, n1, src2);
> > +
> > +	if (available != NULL)
> > +		*available = entries - n;
> > +
> > +end:
> > +	return n;
> > +}
> > +
> > +/**
> > + * @internal Consume previously reserved ring elements (for dequeue)
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param old_head
> > + *   Consumer's head index before reservation.
> > + * @param new_head
> > + *   Consumer's head index after reservation.
> > + * @param is_sc
> > + *   Indicates whether to use single consumer or multi-consumer head
> update
> > + */
> > +static __rte_always_inline void
> > +__rte_ring_do_dequeue_elem_commit(struct rte_ring *r,
> > +		unsigned int old_head, unsigned int new_head,
> > +		unsigned int is_sc)
> > +{
> 
> I think it is a bit dangerous and error-prone approach to let user specify
> old_head/new_head manually.
old_head and new_head are local to the thread in the non-serial MP/MC case. Hence, they need to be returned back to the caller.

> Seems better just _commit(ring, num) - see above.
This would not work for non-serial cases.

> That way the user doesn't have to calculate the new head manually,
I do not understand the 'calculate' part. The user has to provide the same values that were returned.

> plus we can have a check that ring.tail - ring.head >= num.
> 
> > +	update_tail(&r->cons, old_head, new_head, is_sc, 1);
> 
> I think update_tail() is not enough here.
> As in some cases we need to update  ring.head also:
> let say user reserved 2 elems, but then decided to commit only one.
This patch does not address that use case. Do you see use cases for this?

> So I think we need a special new function instead of update_tail() here.
> BTW, in HTS I use atomic 64-bit read/write to get/set both head and tail in
> one go.
> This is not really required - two 32bit ops would work too, I think.
> As usual, both ways have some pros and cons:
> using 64bit ops might be faster on 64-bit target, plus it is less error prone (no
> need to think about head/tail read/write ordering, etc.), though for 32-bit
> target it would mean some extra overhead.
> 
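
For reference, the 'both head and tail in one go' idea quoted above could look roughly like the sketch below. It is only an illustration using C11 atomics: packed_ht, ht_pack, ht_load and ht_commit are hypothetical names, not the HTS code, and the plain store in ht_commit assumes serial mode, where only the thread holding the reservation commits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Head in the upper 32 bits, tail in the lower 32 bits of one 64-bit word. */
struct packed_ht {
	_Atomic uint64_t raw;
};

static inline uint64_t
ht_pack(uint32_t head, uint32_t tail)
{
	return ((uint64_t)head << 32) | tail;
}

/* Read head and tail with a single atomic load, so they are consistent. */
static inline void
ht_load(struct packed_ht *ht, uint32_t *head, uint32_t *tail)
{
	uint64_t v;

	assert(head != NULL && tail != NULL);
	v = atomic_load_explicit(&ht->raw, memory_order_acquire);
	*head = (uint32_t)(v >> 32);
	*tail = (uint32_t)v;
}

/*
 * Commit num of the reserved entries: tail moves forward by num and head is
 * pulled back to the new tail, dropping whatever was reserved but not
 * committed. Returns -1 if num exceeds the open reservation.
 */
static inline int
ht_commit(struct packed_ht *ht, uint32_t num)
{
	uint32_t head, tail;

	ht_load(ht, &head, &tail);
	if (head - tail < num)
		return -1;
	tail += num;
	atomic_store_explicit(&ht->raw, ht_pack(tail, tail),
			memory_order_release);
	return 0;
}
```

With head and tail in one word there is no ordering between two separate 32-bit accesses to reason about, which is the 'less error prone' point made above.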
> > +}
> > +
> > +/**
> > + * Reserve one element on a ring for dequeue. This function blocks if
> > +there
> > + * are elements reserved already. Application must call
> > + * 'rte_ring_do_dequeue_elem_commit' or
> > + * `rte_ring_do_dequeue_elem_revert_serial' to complete the dequeue
> operation.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param esize
> > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > + *   This must be the same value used while creating the ring. Otherwise
> > + *   the results are undefined.
> > + * @param old_head
> > + *   Consumer's head index before reservation. The same should be passed
> to
> > + *   'rte_ring_dequeue_elem_commit' function.
> > + * @param new_head
> > + *   Consumer's head index after reservation. The same should be passed to
> > + *   'rte_ring_dequeue_elem_commit' function.
> > + * @param available
> > + *   returns the number of remaining ring elements after the reservation
> > + *   It is not updated if the number of reserved elements is zero.
> > + * @param src
> > + *   Pointer to location in the ring to copy the data from.
> > + * @return
> > + *   - 0: Success; elements reserved
> > + *   - -ENOBUFS: Not enough room in the ring; no element is reserved.
> > + */
> > +static __rte_always_inline int
> > +rte_ring_dequeue_elem_reserve_serial(struct rte_ring *r, unsigned int
> esize,
> > +		unsigned int *old_head, unsigned int *new_head,
> > +		unsigned int *available, void **src) {
> > +	unsigned int n;
> > +
> > +	return __rte_ring_do_dequeue_elem_reserve_serial(r, esize, 1,
> > +			RTE_RING_QUEUE_FIXED, r->cons.single, old_head,
> > +			new_head, available, src, &n, NULL) ? 0 : -ENOBUFS; }
> > +
> > +/**
> > + * Consume previously reserved elements (for dequeue) in a ring
> > + * (multi-consumer safe).
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param old_head
> > + *   Consumer's head index before reservation. This value was returned
> > + *   when the API 'rte_ring_dequeue_elem_reserve_xxx' was called.
> > + * @param new_head
> > + *   Consumer's head index after reservation. This value was returned
> > + *   when the API 'rte_ring_dequeue_elem_reserve_xxx' was called.
> > + */
> > +static __rte_always_inline void
> > +rte_ring_dequeue_elem_commit(struct rte_ring *r, unsigned int old_head,
> > +		unsigned int new_head)
> > +{
> > +	__rte_ring_do_dequeue_elem_commit(r, old_head, new_head,
> > +						r->cons.single);
> > +}
> > +
> > +/**
> > + * Discard previously reserved elements (for dequeue) in a ring.
> > + *
> > + * @warning
> > + * This API can be called only if the ring elements were reserved
> > + * using 'rte_ring_dequeue_xxx_elem_reserve_serial' APIs.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + */
> > +static __rte_always_inline void
> > +rte_ring_dequeue_elem_revert_serial(struct rte_ring *r) {
> > +	__rte_ring_revert_head(&r->cons);
> > +}
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_RING_ELEM_SG_H_ */
> > diff --git a/lib/librte_ring/rte_ring_generic.h
> > b/lib/librte_ring/rte_ring_generic.h
> > index 953cdbbd5..8d7a7ffcc 100644
> > --- a/lib/librte_ring/rte_ring_generic.h
> > +++ b/lib/librte_ring/rte_ring_generic.h
> > @@ -170,4 +170,97 @@ __rte_ring_move_cons_head(struct rte_ring *r,
> unsigned int is_sc,
> >  	return n;
> >  }
> >
> > +/**
> > + * @internal This function updates the consumer head if there are no
> > + * prior reserved elements on the ring.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure
> > + * @param is_sc
> > + *   Indicates whether multi-consumer path is needed or not
> > + * @param n
> > + *   The number of elements we will want to dequeue, i.e. how far should
> the
> > + *   head be moved
> > + * @param behavior
> > + *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a
> ring
> > + *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from
> ring
> > + * @param old_head
> > + *   Returns head value as it was before the move, i.e. where dequeue
> starts
> > + * @param new_head
> > + *   Returns the current/new head value i.e. where dequeue finishes
> > + * @param entries
> > + *   Returns the number of entries in the ring BEFORE head was moved
> > + * @return
> > + *   - Actual number of objects dequeued.
> > + *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > + */
> > +static __rte_always_inline unsigned int
> > +__rte_ring_move_cons_head_serial(struct rte_ring *r, unsigned int is_sc,
> > +		unsigned int n, enum rte_ring_queue_behavior behavior,
> > +		uint32_t *old_head, uint32_t *new_head,
> > +		uint32_t *entries)
> > +{
> > +	unsigned int max = n;
> > +	int success;
> > +
> > +	/* move cons.head atomically */
> > +	do {
> > +		/* Restore n as it may change every loop */
> > +		n = max;
> > +
> > +		*old_head = r->cons.head;
> > +
> > +		/* add rmb barrier to avoid load/load reorder in weak
> > +		 * memory model. It is noop on x86
> > +		 */
> > +		rte_smp_rmb();
> > +
> > +		/* Ensure that cons.tail and cons.head are the same */
> > +		if (*old_head != r->cons.tail) {
> > +			rte_pause();
> > +
> > +			success = 0;
> > +			continue;
> > +		}
> > +
> > +		/* The subtraction is done between two unsigned 32bits value
> > +		 * (the result is always modulo 32 bits even if we have
> > +		 * cons_head > prod_tail). So 'entries' is always between 0
> > +		 * and size(ring)-1.
> > +		 */
> > +		*entries = (r->prod.tail - *old_head);
> > +
> > +		/* Set the actual entries for dequeue */
> > +		if (n > *entries)
> > +			n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 :
> *entries;
> > +
> > +		if (unlikely(n == 0))
> > +			return 0;
> > +
> > +		*new_head = *old_head + n;
> > +		if (is_sc) {
> > +			r->cons.head = *new_head;
> > +			rte_smp_rmb();
> > +			success = 1;
> 
> I don't think we need to worry about SC case in this function.
> For sc(/sp)_serial (if we need such mode) - we probably can use normal
> move_(cons/prod)_head().
Agree

> 
> > +		} else {
> > +			success = rte_atomic32_cmpset(&r->cons.head,
> *old_head,
> > +					*new_head);
> > +		}
> > +	} while (unlikely(success == 0));
> > +	return n;
> > +}
> > +
> > +/**
> > + * @internal This function updates the head to match the tail
> > + *
> > + * @param ht
> > + *   A pointer to the ring's head-tail structure
> > + */
> > +static __rte_always_inline void
> > +__rte_ring_revert_head(struct rte_ring_headtail *ht) {
> > +	/* Discard the reserved ring elements. */
> > +	ht->head = ht->tail;
> > +}
> > +
> >  #endif /* _RTE_RING_GENERIC_H_ */
> > --
> > 2.17.1
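
As a compact restatement of the serial reserve/commit/revert flow in the quoted patch, here is a sketch using C11 atomics instead of the rte_* primitives. The names (serial_ht, serial_try_move_cons_head, serial_commit, serial_revert) are hypothetical, only the fixed-count behaviour is shown, and unlike the patch it returns 0 instead of spinning when a reservation is already open.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct serial_ht {
	_Atomic uint32_t head;
	_Atomic uint32_t tail;
};

/*
 * Try to reserve n entries for dequeue. Returns n on success, or 0 if there
 * are fewer than n entries or another reservation is open (head != tail).
 */
static unsigned int
serial_try_move_cons_head(struct serial_ht *cons, uint32_t prod_tail,
		unsigned int n, uint32_t *old_head, uint32_t *new_head)
{
	uint32_t head, tail;

	assert(old_head != NULL && new_head != NULL);
	head = atomic_load_explicit(&cons->head, memory_order_acquire);
	tail = atomic_load_explicit(&cons->tail, memory_order_acquire);

	if (head != tail)
		return 0;	/* the patch calls rte_pause() and retries here */
	if (n > prod_tail - head)
		return 0;	/* fixed behaviour: all n entries or nothing */

	*old_head = head;
	*new_head = head + n;
	/* Only one thread can win this CAS, so reservations are serialized. */
	if (!atomic_compare_exchange_strong_explicit(&cons->head, &head,
			*new_head, memory_order_acq_rel, memory_order_acquire))
		return 0;
	return n;
}

/* Commit: release the reserved entries by moving tail up to new_head. */
static void
serial_commit(struct serial_ht *cons, uint32_t new_head)
{
	atomic_store_explicit(&cons->tail, new_head, memory_order_release);
}

/* Revert: discard the reservation by moving head back to tail. */
static void
serial_revert(struct serial_ht *cons)
{
	uint32_t tail = atomic_load_explicit(&cons->tail, memory_order_relaxed);

	atomic_store_explicit(&cons->head, tail, memory_order_release);
}
```

Because a new reservation is only possible while head == tail, the revert can be a plain store: no other thread can have moved the head past the current reservation.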


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs
  2020-02-28  0:18     ` Honnappa Nagarahalli
@ 2020-03-02 18:20       ` Ananyev, Konstantin
  2020-03-04 23:21         ` Honnappa Nagarahalli
  0 siblings, 1 reply; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-03-02 18:20 UTC (permalink / raw)
  To: Honnappa Nagarahalli, olivier.matz; +Cc: Gavin Hu, dev, nd, nd


> > > +/**
> > > + * @internal Reserve ring elements to enqueue several objects on the
> > > +ring
> > > + *
> > > + * @param r
> > > + *   A pointer to the ring structure.
> > > + * @param esize
> > > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > > + *   This must be the same value used while creating the ring. Otherwise
> > > + *   the results are undefined.
> > > + * @param n
> > > + *   The number of elements to reserve in the ring.
> > > + * @param behavior
> > > + *   RTE_RING_QUEUE_FIXED:    Reserve a fixed number of elements from a
> > ring
> > > + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as possible
> > from ring
> > > + * @param is_sp
> > > + *   Indicates whether to use single producer or multi-producer reserve
> > > + * @param old_head
> > > + *   Producer's head index before reservation.
> > > + * @param new_head
> > > + *   Producer's head index after reservation.
> > > + * @param free_space
> > > + *   returns the amount of space after the reserve operation has finished.
> > > + *   It is not updated if the number of reserved elements is zero.
> > > + * @param dst1
> > > + *   Pointer to location in the ring to copy the data.
> > > + * @param n1
> > > + *   Number of elements to copy at dst1
> > > + * @param dst2
> > > + *   In case of ring wrap around, this pointer provides the location to
> > > + *   copy the remaining elements. The number of elements to copy at this
> > > + *   location is equal to (number of elements reserved - n1)
> > > + * @return
> > > + *   Actual number of elements reserved.
> > > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > > + */
> > > +static __rte_always_inline unsigned int
> > > +__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r, unsigned int
> > > +esize,
> >
> >
> > I do understand the purpose of reserve, then either commit/abort for serial
> > sync mode, but what is the purpose of non-serial version of reserve/commit?
> In RCU, I have the need for scatter-gather feature. i.e. the data in the ring element is coming from multiple sources ('token' is generated by
> the RCU library and the application provides additional data). If I do not provide the reserve/commit, I need to introduce an intermediate
> memcpy to get these two data contiguously to copy to the ring element. The sequence is 'reserve(1), memcpy1, mempcy2, commit(1)'.
> Hence, you do not see the abort API for the enqueue.
> 
> > In serial  MP/MC case, after _reserve_(n) you always have to do
> > _commit_(n) - you can't reduce number of elements, or do _abort_.
> Agree, the intention here is to provide the scatter/gather feature.
> 
> > Again you cannot avoid memcpy(n) here anyhow.
> > So what is the point of these functions for non-serial case?
> It avoids an intermediate memcpy when the data is coming from multiple sources.

Ok, I think I understand what my confusion was:
Your intention:
1) reserve/commit for both serial and non-serial mode -
    to allow the user to get/set contents of the ring manually and avoid
    intermediate load/stores.
2) abort only for serial mode.

My intention:
1) commit/reserve/abort only for serial case
    (as that's the only mode where we can commit less
     than was reserved or do abort).
2) get/set of ring contents are done as part of either
    reserve(for dequeue) or commit(for enqueue) API calls
    (no scatter-gather ability).
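
To make the first model concrete, here is a minimal single-threaded sketch of the 'reserve(1), memcpy1, memcpy2, commit(1)' sequence. The toy_* names, slot count and element layout are hypothetical and are not the rte_ring API from the patch; a real ring would need atomic head/tail updates.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_RING_SLOTS 8u	/* power of two, so index masking works */
#define TOY_ELEM_SIZE  16u	/* 8-byte token followed by 8 bytes of user data */

struct toy_ring {
	uint32_t prod_head;	/* next slot to reserve */
	uint32_t prod_tail;	/* slots below this are visible to consumers */
	uint32_t cons_tail;	/* slots below this have been consumed */
	uint8_t  slots[TOY_RING_SLOTS][TOY_ELEM_SIZE];
};

/* Reserve one slot; returns a pointer into ring storage, or NULL if full. */
static void *
toy_enqueue_reserve(struct toy_ring *r, uint32_t *old_head, uint32_t *new_head)
{
	uint32_t used = r->prod_head - r->cons_tail;

	if (used == TOY_RING_SLOTS)
		return NULL;
	*old_head = r->prod_head;
	*new_head = *old_head + 1;
	r->prod_head = *new_head;
	return r->slots[*old_head & (TOY_RING_SLOTS - 1)];
}

/* Publish the reserved slot by moving the tail up to new_head. */
static void
toy_enqueue_commit(struct toy_ring *r, uint32_t old_head, uint32_t new_head)
{
	assert(r->prod_tail == old_head);	/* caller passed back stale values? */
	r->prod_tail = new_head;
}

/* Gather two sources (token, application data) straight into the slot. */
static int
toy_dq_enqueue(struct toy_ring *r, uint64_t token, const void *data)
{
	uint32_t old_head, new_head;
	uint8_t *dst = toy_enqueue_reserve(r, &old_head, &new_head);

	if (dst == NULL)
		return -1;
	memcpy(dst, &token, sizeof(token));			/* memcpy1 */
	memcpy(dst + sizeof(token), data,
			TOY_ELEM_SIZE - sizeof(token));		/* memcpy2 */
	toy_enqueue_commit(r, old_head, new_head);
	return 0;
}
```

In the second model the caller would first assemble token and data in a local buffer and hand that buffer to the ring, which is the extra copy the first model avoids. The price is visible in the sketch too: the caller holds a raw pointer into ring storage and must pass the head values back unchanged.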

I still think that the new API you suggest exposes too
much of the ring internals and makes it less 'safe-to-use':
- it provides direct access to contents of the ring.
- user has to specify head/tail values directly.

So in case of a programming error in the related user code,
there is less chance it would be caught by the API,
and we can easily end up with silent memory corruption
and other nasty things that would be hard to catch/reproduce.

That makes me wonder how critical this scatter-gather ability is
in terms of overall RCU performance?
Is the gain provided really that significant, especially if you'll update the
ring by one element at a time? 
  
> 
> >
> > BTW, I think it would be good to have serial version of _enqueue_ too.
> If there is a good use case, they should be provided. I did not come across a good use case.
> 
> >
> > > +		unsigned int n, enum rte_ring_queue_behavior behavior,
> > > +		unsigned int is_sp, unsigned int *old_head,
> > > +		unsigned int *new_head, unsigned int *free_space,
> > > +		void **dst1, unsigned int *n1, void **dst2)
> >
> > I do understand the intention to avoid memcpy(), but proposed API seems
> > overcomplicated, error prone, and not very convenient for the user.
> The issue is the need to handle the wrap around in ring storage array. i.e. when the space is reserved for more than 1 ring element, the wrap
> around might happen.
> 
> > I don't think that avoiding memcpy() will save us that many cycles here, so
> This depends on the amount of data being copied.
> 
> > probably better to keep API model a bit more regular:
> >
> > n = rte_ring_mp_serial_enqueue_bulk_reserve(ring, num, &free_space); ...
> > /* performs actual memcpy(), m<=n */
> > rte_ring_mp_serial_enqueue_bulk_commit(ring, obj,  m);
> These do not take care of the wrap-around case or I am not able to understand your comment.

I meant that serial_enqueue_commit() will do both:
actual copy of elements to the ring and tail update (no Scatter-Gather), see above. 
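
A rough sketch of that model, where the commit call performs the copy (including the wrap-around split) and then updates the tail. This is a single-threaded toy: ser_ring, ser_enqueue_reserve and ser_enqueue_commit are hypothetical names standing in for the rte_ring_mp_serial_enqueue_bulk_* calls proposed above, and the elements are plain 32-bit values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SER_RING_SIZE 4u			/* power of two */
#define SER_RING_MASK (SER_RING_SIZE - 1)

struct ser_ring {
	uint32_t prod_head, prod_tail;	/* head != tail while a reservation is open */
	uint32_t cons_tail;
	uint32_t objs[SER_RING_SIZE];
};

/* Reserve up to num slots; returns how many were reserved (0 if full). */
static unsigned int
ser_enqueue_reserve(struct ser_ring *r, unsigned int num)
{
	uint32_t free_slots = SER_RING_SIZE - (r->prod_tail - r->cons_tail);

	assert(r->prod_head == r->prod_tail);	/* one reservation at a time */
	if (num > free_slots)
		num = free_slots;
	r->prod_head = r->prod_tail + num;
	return num;
}

/* Copy m (<= reserved) objects into the ring, then publish them. */
static void
ser_enqueue_commit(struct ser_ring *r, const uint32_t *obj, unsigned int m)
{
	uint32_t idx = r->prod_tail & SER_RING_MASK;
	uint32_t n1 = SER_RING_SIZE - idx;	/* slots left before the array end */

	assert(m <= r->prod_head - r->prod_tail);
	if (n1 > m)
		n1 = m;
	memcpy(&r->objs[idx], obj, n1 * sizeof(*obj));
	memcpy(&r->objs[0], obj + n1, (m - n1) * sizeof(*obj));	/* wrapped part */
	r->prod_tail += m;
	r->prod_head = r->prod_tail;		/* drop any uncommitted slots */
}
```

The caller never sees a pointer into the ring or a head/tail value, and the m <= reserved check is the kind of validation the API can do in this model.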


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs
  2020-03-02 18:20       ` Ananyev, Konstantin
@ 2020-03-04 23:21         ` Honnappa Nagarahalli
  2020-03-05 18:26           ` Ananyev, Konstantin
  0 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-03-04 23:21 UTC (permalink / raw)
  To: Ananyev, Konstantin, olivier.matz
  Cc: Gavin Hu, dev, nd, Honnappa Nagarahalli, nd

<snip>

> 
> > > > +/**
> > > > + * @internal Reserve ring elements to enqueue several objects on
> > > > +the ring
> > > > + *
> > > > + * @param r
> > > > + *   A pointer to the ring structure.
> > > > + * @param esize
> > > > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > > > + *   This must be the same value used while creating the ring.
> Otherwise
> > > > + *   the results are undefined.
> > > > + * @param n
> > > > + *   The number of elements to reserve in the ring.
> > > > + * @param behavior
> > > > + *   RTE_RING_QUEUE_FIXED:    Reserve a fixed number of elements
> from a
> > > ring
> > > > + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as
> possible
> > > from ring
> > > > + * @param is_sp
> > > > + *   Indicates whether to use single producer or multi-producer reserve
> > > > + * @param old_head
> > > > + *   Producer's head index before reservation.
> > > > + * @param new_head
> > > > + *   Producer's head index after reservation.
> > > > + * @param free_space
> > > > + *   returns the amount of space after the reserve operation has
> finished.
> > > > + *   It is not updated if the number of reserved elements is zero.
> > > > + * @param dst1
> > > > + *   Pointer to location in the ring to copy the data.
> > > > + * @param n1
> > > > + *   Number of elements to copy at dst1
> > > > + * @param dst2
> > > > + *   In case of ring wrap around, this pointer provides the location to
> > > > + *   copy the remaining elements. The number of elements to copy at
> this
> > > > + *   location is equal to (number of elements reserved - n1)
> > > > + * @return
> > > > + *   Actual number of elements reserved.
> > > > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > > > + */
> > > > +static __rte_always_inline unsigned int
> > > > +__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r, unsigned
> > > > +int esize,
> > >
> > >
> > > I do understand the purpose of reserve, then either commit/abort for
> > > serial sync mode, but what is the purpose of non-serial version of
> reserve/commit?
> > In RCU, I have the need for scatter-gather feature. i.e. the data in
> > the ring element is coming from multiple sources ('token' is generated
> > by the RCU library and the application provides additional data). If I do not
> provide the reserve/commit, I need to introduce an intermediate memcpy to
> get these two data contiguously to copy to the ring element. The sequence is
> 'reserve(1), memcpy1, mempcy2, commit(1)'.
> > Hence, you do not see the abort API for the enqueue.
> >
> > > In serial  MP/MC case, after _reserve_(n) you always have to do
> > > _commit_(n) - you can't reduce number of elements, or do _abort_.
> > Agree, the intention here is to provide the scatter/gather feature.
> >
> > > Again you cannot avoid memcpy(n) here anyhow.
> > > So what is the point of these functions for non-serial case?
> > It avoids an intermediate memcpy when the data is coming from multiple
> sources.
> 
> Ok, I think I understand what was my confusion:
Yes, the following understanding is correct.

> Your intention:
>  1) reserve/commit for both serial and non-serial mode -
>      to allow user get/set contents of the ring manually and avoid
>      intermediate load/stores.
> 2) abort only for serial mode.
> 
> My intention:
> 1) commit/reserve/abort only for serial case
>     (as that's the only mode where we can commit less
>      then was reserved or do abort).
I do not know if there is a requirement on committing less than reserved. I think, if the size of the commit is not known during reservation, maybe the reservation can be delayed till it is known.
If there is no requirement to commit less than reserved, then I do not see a need for serial APIs for enqueue operation.

> 2) get/set of ring contents are done as part of either
>     reserve(for dequeue) or commit(for enqueue) API calls
>     (no scatter-gather ability).
> 
> I still think that this new API you suggest creates too big exposure of ring
> internals, and makes it less 'safe-to-use':
> - it provides direct access to contents of the ring.
> - user has to specify head/tail values directly.
It is somewhat complex. But with support for user-defined element sizes, I think it becomes necessary to support the scatter-gather feature (since it is not a single pointer that will be stored).

> 
> So in case of some programmatic error in related user code, there are less
> chances it could be catch-up by API, and we can easily end-up with silent
> memory corruption and other nasty things that would be hard to
> catch/reproduce.
> 
> That makes me wonder how critical is this scatter-gather ability in terms of
> overall RCU performance?
> Is the gain provided really that significant, especially if you'll update the ring
> by one element at a time?
For RCU, it is 64b token and the size of the user data. Not sure how much difference it will make.
I can drop the scatter gather requirement for now.

> 
> >
> > >
> > > BTW, I think it would be good to have serial version of _enqueue_ too.
> > If there is a good use case, they should be provided. I did not come across a
> good use case.
> >
> > >
> > > > +		unsigned int n, enum rte_ring_queue_behavior behavior,
> > > > +		unsigned int is_sp, unsigned int *old_head,
> > > > +		unsigned int *new_head, unsigned int *free_space,
> > > > +		void **dst1, unsigned int *n1, void **dst2)
> > >
> > > I do understand the intention to avoid memcpy(), but proposed API
> > > seems overcomplicated, error prone, and not very convenient for the user.
> > The issue is the need to handle the wrap around in ring storage array.
> > i.e. when the space is reserved for more than 1 ring element, the wrap
> around might happen.
> >
> > > I don't think that avoiding memcpy() will save us that many cycles
> > > here, so
> > This depends on the amount of data being copied.
> >
> > > probably better to keep API model a bit more regular:
> > >
> > > n = rte_ring_mp_serial_enqueue_bulk_reserve(ring, num, &free_space); ...
> > > /* performs actual memcpy(), m<=n */
> > > rte_ring_mp_serial_enqueue_bulk_commit(ring, obj,  m);
> > These do not take care of the wrap-around case or I am not able to
> understand your comment.
> 
> I meant that serial_enqueue_commit() will do both:
> actual copy of elements to the ring and tail update (no Scatter-Gather), see
> above.
RCU does not require the serial enqueue APIs, do you have any use case?

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs
  2020-03-04 23:21         ` Honnappa Nagarahalli
@ 2020-03-05 18:26           ` Ananyev, Konstantin
  2020-03-25 20:43             ` Honnappa Nagarahalli
  0 siblings, 1 reply; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-03-05 18:26 UTC (permalink / raw)
  To: Honnappa Nagarahalli, olivier.matz; +Cc: Gavin Hu, dev, nd, nd


> >
> > > > > +/**
> > > > > + * @internal Reserve ring elements to enqueue several objects on
> > > > > +the ring
> > > > > + *
> > > > > + * @param r
> > > > > + *   A pointer to the ring structure.
> > > > > + * @param esize
> > > > > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > > > > + *   This must be the same value used while creating the ring.
> > Otherwise
> > > > > + *   the results are undefined.
> > > > > + * @param n
> > > > > + *   The number of elements to reserve in the ring.
> > > > > + * @param behavior
> > > > > + *   RTE_RING_QUEUE_FIXED:    Reserve a fixed number of elements
> > from a
> > > > ring
> > > > > + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as
> > possible
> > > > from ring
> > > > > + * @param is_sp
> > > > > + *   Indicates whether to use single producer or multi-producer reserve
> > > > > + * @param old_head
> > > > > + *   Producer's head index before reservation.
> > > > > + * @param new_head
> > > > > + *   Producer's head index after reservation.
> > > > > + * @param free_space
> > > > > + *   returns the amount of space after the reserve operation has
> > finished.
> > > > > + *   It is not updated if the number of reserved elements is zero.
> > > > > + * @param dst1
> > > > > + *   Pointer to location in the ring to copy the data.
> > > > > + * @param n1
> > > > > + *   Number of elements to copy at dst1
> > > > > + * @param dst2
> > > > > + *   In case of ring wrap around, this pointer provides the location to
> > > > > + *   copy the remaining elements. The number of elements to copy at
> > this
> > > > > + *   location is equal to (number of elements reserved - n1)
> > > > > + * @return
> > > > > + *   Actual number of elements reserved.
> > > > > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > > > > + */
> > > > > +static __rte_always_inline unsigned int
> > > > > +__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r, unsigned
> > > > > +int esize,
> > > >
> > > >
> > > > I do understand the purpose of reserve, then either commit/abort for
> > > > serial sync mode, but what is the purpose of non-serial version of
> > reserve/commit?
> > > In RCU, I have the need for scatter-gather feature. i.e. the data in
> > > the ring element is coming from multiple sources ('token' is generated
> > > by the RCU library and the application provides additional data). If I do not
> > provide the reserve/commit, I need to introduce an intermediate memcpy to
> > get these two data contiguously to copy to the ring element. The sequence is
> > 'reserve(1), memcpy1, mempcy2, commit(1)'.
> > > Hence, you do not see the abort API for the enqueue.
> > >
> > > > In serial  MP/MC case, after _reserve_(n) you always have to do
> > > > _commit_(n) - you can't reduce number of elements, or do _abort_.
> > > Agree, the intention here is to provide the scatter/gather feature.
> > >
> > > > Again you cannot avoid memcpy(n) here anyhow.
> > > > So what is the point of these functions for non-serial case?
> > > It avoids an intermediate memcpy when the data is coming from multiple
> > sources.
> >
> > Ok, I think I understand what was my confusion:
> Yes, the following understanding is correct.
> 
> > Your intention:
> >  1) reserve/commit for both serial and non-serial mode -
> >      to allow user get/set contents of the ring manually and avoid
> >      intermediate load/stores.
> > 2) abort only for serial mode.
> >
> > My intention:
> > 1) commit/reserve/abort only for serial case
> >     (as that's the only mode where we can commit less
> >      then was reserved or do abort).
> I do not know if there is a requirement on committing less than reserved.

From my perspective, that's a necessary part of peek functionality.
The revert/abort function you introduced below is just one special case of it.
Having just abort is enough when you are processing elements in the ring one by one,
but not sufficient if someone tries to operate in bulk.
Let's say you read (reserved) N objects from the ring, inspected them
and found that the first M (<N) are ok to be removed from the ring, while
the others should remain.
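
A minimal sketch of that 'reserve N, inspect, release the first M' pattern on a single-threaded toy ring. The peek_* names are hypothetical, the elements are plain 32-bit values standing in for tokens, and the serialization of consumers is assumed to happen elsewhere.

```c
#include <assert.h>
#include <stdint.h>

#define PEEK_RING_SIZE 8u			/* power of two */
#define PEEK_RING_MASK (PEEK_RING_SIZE - 1)

struct peek_ring {
	uint32_t prod_tail;		/* entries below this are readable */
	uint32_t cons_head, cons_tail;	/* head != tail while a peek is open */
	uint32_t objs[PEEK_RING_SIZE];
};

/* Copy out up to n objects without releasing them; returns count reserved. */
static unsigned int
peek_dequeue_start(struct peek_ring *r, uint32_t *out, unsigned int n)
{
	uint32_t avail = r->prod_tail - r->cons_tail;
	unsigned int i;

	assert(r->cons_head == r->cons_tail);	/* no peek already open */
	if (n > avail)
		n = avail;
	for (i = 0; i < n; i++)
		out[i] = r->objs[(r->cons_tail + i) & PEEK_RING_MASK];
	r->cons_head = r->cons_tail + n;
	return n;
}

/* Release the first m reserved objects; m == 0 is a full abort. */
static void
peek_dequeue_finish(struct peek_ring *r, unsigned int m)
{
	assert(m <= r->cons_head - r->cons_tail);
	r->cons_tail += m;
	r->cons_head = r->cons_tail;	/* the remaining N - M stay in the ring */
}
```

A caller would run peek_dequeue_start(r, out, N), scan out[] for the first entry that cannot be removed yet, and pass that index as m to peek_dequeue_finish(). Abort is then simply the m == 0 case.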

> I think, if the size of commit is not known during reservation,
> may be the reservation can be delayed till it is known.

In some cases, you do know how much you'd like to commit,
but you can't guarantee that you can commit that much
until you inspect the contents of the reserved elems.

> If there is no requirement to commit less than reserved, then I do not see a need for serial APIs for enqueue operation.

> 
> > 2) get/set of ring contents are done as part of either
> >     reserve(for dequeue) or commit(for enqueue) API calls
> >     (no scatter-gather ability).
> >
> > I still think that this new API you suggest creates too big exposure of ring
> > internals, and makes it less 'safe-to-use':
> > - it provides direct access to contents of the ring.
> > - user has to specify head/tail values directly.
> It is some what complex. But, with the support of user defined element size, I think it becomes necessary to support scatter gather
> feature (since it is not a single pointer that will be stored).

I suppose that to see the real benefit from scatter-gather, we need a scenario
where there are relatively big elems in the ring (32B+ or so),
plus enqueue/dequeue done in bulk.
If you really envision such a use case - I am ok to consider a scatter-gather API too,
but I think it shouldn't be the only available API for serial mode.
Maybe we can have a 'normal' enqueue/dequeue API for serial mode
(actual copy done internally in ring functions, head/tail values are not exposed directly),
plus an SG API as an add-on for some special cases.

> >
> > So in case of some programmatic error in related user code, there are less
> > chances it could be catch-up by API, and we can easily end-up with silent
> > memory corruption and other nasty things that would be hard to
> > catch/reproduce.
> >
> > That makes me wonder how critical is this scatter-gather ability in terms of
> > overall RCU performance?
> > Is the gain provided really that significant, especially if you'll update the ring
> > by one element at a time?
> For RCU, it is 64b token and the size of the user data. Not sure how much difference it will make.
> I can drop the scatter gather requirement for now.
> 
> >
> > >
> > > >
> > > > BTW, I think it would be good to have serial version of _enqueue_ too.
> > > If there is a good use case, they should be provided. I did not come across a
> > good use case.
> > >
> > > >
> > > > > +		unsigned int n, enum rte_ring_queue_behavior behavior,
> > > > > +		unsigned int is_sp, unsigned int *old_head,
> > > > > +		unsigned int *new_head, unsigned int *free_space,
> > > > > +		void **dst1, unsigned int *n1, void **dst2)
> > > >
> > > > I do understand the intention to avoid memcpy(), but proposed API
> > > > seems overcomplicated, error prone, and not very convenient for the user.
> > > The issue is the need to handle the wrap around in ring storage array.
> > > i.e. when the space is reserved for more than 1 ring element, the wrap
> > around might happen.
> > >
> > > > I don't think that avoiding memcpy() will save us that many cycles
> > > > here, so
> > > This depends on the amount of data being copied.
> > >
> > > > probably better to keep API model a bit more regular:
> > > >
> > > > n = rte_ring_mp_serial_enqueue_bulk_reserve(ring, num, &free_space); ...
> > > > /* performs actual memcpy(), m<=n */
> > > > rte_ring_mp_serial_enqueue_bulk_commit(ring, obj,  m);
> > > These do not take care of the wrap-around case or I am not able to
> > understand your comment.
> >
> > I meant that serial_enqueue_commit() will do both:
> > actual copy of elements to the ring and tail update (no Scatter-Gather), see
> > above.
> RCU does not require the serial enqueue APIs, do you have any use case?

I agree that serial dequeue seems to have more usages then enqueue.
Though I still can name at least two cases for enqueue, from top of my head:
1. serial mode (both enqueue/dequeue) helps to mitigate ring slowdown 
overcommitted scenarios, see RFC I submitted:
http://patches.dpdk.org/cover/66001/
2. any intermediate node when you have pop/push from/to some external queue,
and enqueue/dequeue to/from the ring, would like to avoid any elem
drops in between, and by some reason don't want your own intermediate bufferization.
Let say:
dequeue_from_ring -> tx_burst/cryptodev_enqueue
rx_burst/cryptodev_dequeue -> enqueue_to_ring

Plus as enqueue/dequeue are sort of mirror, I think it is good to have both identical.
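
A minimal sketch of case 2, assuming a hypothetical single-threaded toy ring
(all srl_* names, the 8-slot size and the demo source are made up for
illustration and are not part of rte_ring). Space is reserved first, the
external source is then asked for at most that many objects, and commit
performs the copy and the tail update, so nothing pulled from the source
can be dropped:

```c
#include <assert.h>
#include <stdint.h>

#define SRL_RING_SIZE 8u
#define SRL_RING_MASK (SRL_RING_SIZE - 1u)

/* Hypothetical toy ring; no atomics, not the rte_ring layout. */
struct srl_ring {
	uint32_t prod_head;   /* slots reserved up to here */
	uint32_t prod_tail;   /* slots visible to the consumer up to here */
	uint32_t cons_tail;   /* slots released by the consumer up to here */
	int slots[SRL_RING_SIZE];
};

/* Reserve up to n slots; returns how many were actually reserved. */
static uint32_t
srl_enqueue_reserve(struct srl_ring *r, uint32_t n)
{
	uint32_t free_slots = SRL_RING_SIZE - (r->prod_head - r->cons_tail);

	if (n > free_slots)
		n = free_slots;
	r->prod_head += n;
	return n;
}

/* Copy m objects (m <= reserved) into the ring and publish them.
 * Slots reserved beyond m are handed back.
 */
static void
srl_enqueue_commit(struct srl_ring *r, const int *objs, uint32_t m)
{
	uint32_t reserved = r->prod_head - r->prod_tail;
	uint32_t i;

	if (m > reserved)
		m = reserved;
	for (i = 0; i < m; i++)
		r->slots[(r->prod_tail + i) & SRL_RING_MASK] = objs[i];
	r->prod_tail += m;
	r->prod_head = r->prod_tail;
}

/* Hypothetical external source, standing in for rx_burst/cryptodev_dequeue. */
typedef uint32_t (*srl_src_burst_t)(int *objs, uint32_t max);

/* Demo source for illustration: hands out at most 3 objects per call. */
static uint32_t
srl_demo_src(int *objs, uint32_t max)
{
	uint32_t i, n = (max < 3u) ? max : 3u;

	for (i = 0; i < n; i++)
		objs[i] = 100 + (int)i;
	return n;
}

/* Pull from the source only what the ring has room for. */
static uint32_t
srl_forward(struct srl_ring *r, srl_src_burst_t src, uint32_t burst)
{
	int tmp[SRL_RING_SIZE];   /* per-call burst array, not a backlog queue */
	uint32_t n, m;

	if (burst > SRL_RING_SIZE)
		burst = SRL_RING_SIZE;
	n = srl_enqueue_reserve(r, burst);
	m = (n != 0) ? src(tmp, n) : 0;
	srl_enqueue_commit(r, tmp, m);
	return m;
}
```

Committing fewer objects than were reserved (m < n) is what makes this
usable when the source returns a short burst.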
   




* Re: [dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs
  2020-03-05 18:26           ` Ananyev, Konstantin
@ 2020-03-25 20:43             ` Honnappa Nagarahalli
  0 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-03-25 20:43 UTC (permalink / raw)
  To: Ananyev, Konstantin, olivier.matz
  Cc: Gavin Hu, dev, nd, Honnappa Nagarahalli, nd

<snip>

> 
> > >
> > > > > > +/**
> > > > > > + * @internal Reserve ring elements to enqueue several objects
> > > > > > +on the ring
> > > > > > + *
> > > > > > + * @param r
> > > > > > + *   A pointer to the ring structure.
> > > > > > + * @param esize
> > > > > > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > > > > > + *   This must be the same value used while creating the ring.
> > > Otherwise
> > > > > > + *   the results are undefined.
> > > > > > + * @param n
> > > > > > + *   The number of elements to reserve in the ring.
> > > > > > + * @param behavior
> > > > > > + *   RTE_RING_QUEUE_FIXED:    Reserve a fixed number of
> elements
> > > from a
> > > > > ring
> > > > > > + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as
> > > possible
> > > > > from ring
> > > > > > + * @param is_sp
> > > > > > + *   Indicates whether to use single producer or multi-producer
> reserve
> > > > > > + * @param old_head
> > > > > > + *   Producer's head index before reservation.
> > > > > > + * @param new_head
> > > > > > + *   Producer's head index after reservation.
> > > > > > + * @param free_space
> > > > > > + *   returns the amount of space after the reserve operation has
> > > finished.
> > > > > > + *   It is not updated if the number of reserved elements is zero.
> > > > > > + * @param dst1
> > > > > > + *   Pointer to location in the ring to copy the data.
> > > > > > + * @param n1
> > > > > > + *   Number of elements to copy at dst1
> > > > > > + * @param dst2
> > > > > > + *   In case of ring wrap around, this pointer provides the location
> to
> > > > > > + *   copy the remaining elements. The number of elements to copy
> at
> > > this
> > > > > > + *   location is equal to (number of elements reserved - n1)
> > > > > > + * @return
> > > > > > + *   Actual number of elements reserved.
> > > > > > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > > > > > + */
> > > > > > +static __rte_always_inline unsigned int
> > > > > > +__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r,
> > > > > > +unsigned int esize,
> > > > >
> > > > >
> > > > > > I do understand the purpose of reserve, then either commit/abort
> > > > > > for serial sync mode, but what is the purpose of non-serial
> > > > > > version of reserve/commit?
> > > > > In RCU, I have the need for scatter-gather feature. i.e. the data
> > > > > in the ring element is coming from multiple sources ('token' is
> > > > > generated by the RCU library and the application provides
> > > > > additional data). If I do not provide the reserve/commit, I need
> > > > > to introduce an intermediate memcpy to get these two data
> > > > > contiguously to copy to the ring element. The sequence is
> > > > > 'reserve(1), memcpy1, memcpy2, commit(1)'.
> > > > > Hence, you do not see the abort API for the enqueue.
> > > > >
> > > > > > In serial  MP/MC case, after _reserve_(n) you always have to do
> > > > > > _commit_(n) - you can't reduce number of elements, or do _abort_.
> > > > > Agree, the intention here is to provide the scatter/gather feature.
> > > > >
> > > > > > Again you cannot avoid memcpy(n) here anyhow.
> > > > > > So what is the point of these functions for non-serial case?
> > > > > It avoids an intermediate memcpy when the data is coming from
> > > > > multiple sources.
> > >
> > > Ok, I think I understand what was my confusion:
> > Yes, the following understanding is correct.
> >
> > > Your intention:
> > >  1) reserve/commit for both serial and non-serial mode -
> > >      to allow user get/set contents of the ring manually and avoid
> > >      intermediate load/stores.
> > > 2) abort only for serial mode.
> > >
> > > My intention:
> > > 1) commit/reserve/abort only for serial case
> > >     (as that's the only mode where we can commit less
> > >      then was reserved or do abort).
> > I do not know if there is a requirement on committing less than reserved.
> 
> From my perspective, that's a necessary part of peek functionality.
> The revert/abort function you introduced below is just one special case of it.
> Having just abort is enough when you are processing elements in the ring one by
> one, but not sufficient if someone tries to operate in bulk.
> Let's say you read (reserved) N objects from the ring, inspected them and
> found that the first M (<N) are ok to be removed from the ring, while the
> others should remain.
Agree, it makes sense from a dequeue perspective. Does it make sense from enqueue perspective?
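
A minimal sketch of that dequeue-side 'reserve N, inspect, commit M' pattern,
assuming a hypothetical single-threaded toy ring (the peek_* names and the
8-slot size are illustrative only and are not the rte_ring API; there are no
atomics here). Finishing with m == 0 plays the role of abort:

```c
#include <assert.h>
#include <stdint.h>

#define PEEK_RING_SIZE 8u
#define PEEK_RING_MASK (PEEK_RING_SIZE - 1u)

/* Hypothetical toy ring; indexes are free-running and masked on access. */
struct peek_ring {
	uint32_t prod_tail;   /* elements before this index are readable */
	uint32_t cons_head;   /* consumer has reserved up to here */
	uint32_t cons_tail;   /* consumer has released up to here */
	int slots[PEEK_RING_SIZE];
};

/* Plain enqueue of one element; returns -1 when the ring is full. */
static int
peek_enqueue(struct peek_ring *r, int v)
{
	if (r->prod_tail - r->cons_tail == PEEK_RING_SIZE)
		return -1;
	r->slots[r->prod_tail & PEEK_RING_MASK] = v;
	r->prod_tail++;
	return 0;
}

/* Reserve up to n elements and copy them out without releasing them. */
static uint32_t
peek_dequeue_start(struct peek_ring *r, int *out, uint32_t n)
{
	uint32_t avail = r->prod_tail - r->cons_head;
	uint32_t i;

	if (n > avail)
		n = avail;
	for (i = 0; i < n; i++)
		out[i] = r->slots[(r->cons_head + i) & PEEK_RING_MASK];
	r->cons_head += n;
	return n;
}

/* Release the first m reserved elements; the others stay in the ring. */
static void
peek_dequeue_finish(struct peek_ring *r, uint32_t m)
{
	uint32_t reserved = r->cons_head - r->cons_tail;

	if (m > reserved)
		m = reserved;
	r->cons_tail += m;
	r->cons_head = r->cons_tail;
}
```

With elements 1..5 in the ring, start(4) followed by finish(2) leaves 3, 4
and 5 available to the next start call.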

> 
> > I think, if the size of commit is not known during reservation, may be
> > the reservation can be delayed till it is known.
> 
> In some cases, you do know how much you'd like to commit, but you can't
> guarantee that you can commit that much, till you inspect contents of
> reserved elems.
The above comment was from enqueue perspective.

> 
> > If there is no requirement to commit less than reserved, then I do not see a
> > need for serial APIs for enqueue operation.
> 
> >
> > > 2) get/set of ring contents are done as part of either
> > >     reserve(for dequeue) or commit(for enqueue) API calls
> > >     (no scatter-gather ability).
> > >
> > > I still think that this new API you suggest creates too big exposure
> > > of ring internals, and makes it less 'safe-to-use':
> > > - it provides direct access to contents of the ring.
> > > - user has to specify head/tail values directly.
> > It is somewhat complex. But, with the support of user defined element
> > size, I think it becomes necessary to support scatter gather feature (since it
> > is not a single pointer that will be stored).
> 
> I suppose that to see the real benefit from scatter-gather, we need a scenario
> where there are relatively big elems in the ring (32B+ or so), plus
> enqueue/dequeue done in bulk.
> If you really envision such a use case - I am ok to consider a scatter-gather API
> too, but I think it shouldn't be the only available API for serial mode.
> Maybe we can have a 'normal' enqueue/dequeue API for serial mode (actual
> copy done internally in ring functions, head/tail values are not exposed
> directly), plus the SG API as an add-on for some special cases.
I will run some benchmarks and then decide whether SG makes a measurable impact on the RCU defer APIs.

> 
> > >
> > > So in case of some programming error in related user code, there
> > > is less chance it would be caught by the API, and we can easily
> > > end up with silent memory corruption and other nasty things that
> > > would be hard to catch/reproduce.
> > >
> > > That makes me wonder how critical is this scatter-gather ability in
> > > terms of overall RCU performance?
> > > Is the gain provided really that significant, especially if you'll
> > > update the ring by one element at a time?
> > For RCU, it is 64b token and the size of the user data. Not sure how much
> > difference it will make.
> > I can drop the scatter gather requirement for now.
> >
> > >
> > > >
> > > > >
> > > > > > BTW, I think it would be good to have serial version of _enqueue_ too.
> > > > > If there is a good use case, they should be provided. I did not
> > > > > come across a good use case.
> > > > >
> > > > > >
> > > > > > > +		unsigned int n, enum rte_ring_queue_behavior behavior,
> > > > > > > +		unsigned int is_sp, unsigned int *old_head,
> > > > > > > +		unsigned int *new_head, unsigned int *free_space,
> > > > > > > +		void **dst1, unsigned int *n1, void **dst2)
> > > > > >
> > > > > > I do understand the intention to avoid memcpy(), but proposed
> > > > > > API seems overcomplicated, error prone, and not very convenient
> > > > > > for the user.
> > > > > The issue is the need to handle the wrap around in ring storage array.
> > > > > i.e. when the space is reserved for more than 1 ring element, the
> > > > > wrap around might happen.
> > > > >
> > > > > > I don't think that avoiding memcpy() will save us that many
> > > > > > cycles here, so
> > > > > This depends on the amount of data being copied.
> > > > >
> > > > > > probably better to keep API model a bit more regular:
> > > > > >
> > > > > > n = rte_ring_mp_serial_enqueue_bulk_reserve(ring, num, &free_space); ...
> > > > > > /* performs actual memcpy(), m<=n */
> > > > > > rte_ring_mp_serial_enqueue_bulk_commit(ring, obj,  m);
> > > > > These do not take care of the wrap-around case or I am not able to
> > > > > understand your comment.
> > > >
> > > > I meant that serial_enqueue_commit() will do both:
> > > > actual copy of elements to the ring and tail update (no
> > > > Scatter-Gather), see above.
> > > RCU does not require the serial enqueue APIs, do you have any use case?
> 
> I agree that serial dequeue seems to have more uses than enqueue.
> Though I can still name at least two cases for enqueue, off the top of my head:
> 1. serial mode (both enqueue/dequeue) helps to mitigate ring slowdown in
> overcommitted scenarios, see the RFC I submitted:
> http://patches.dpdk.org/cover/66001/
> 2. any intermediate node where you pop/push from/to some external
> queue, and enqueue/dequeue to/from the ring, would like to avoid any
> elem drops in between, and for some reason doesn't want its own
> intermediate buffering.
> Let's say:
> dequeue_from_ring -> tx_burst/cryptodev_enqueue
> rx_burst/cryptodev_dequeue -> enqueue_to_ring
> 
> Plus as enqueue/dequeue are sort of mirror, I think it is good to have both
> identical.
Ok, agreed. I think we need to allow a combination of APIs to be used, i.e. MP enqueue together with serialized dequeue.

> 
> 



* [dpdk-dev] [RFC v2 0/1] lib/ring: add scatter gather APIs
  2020-02-24 20:39 [dpdk-dev] [RFC 0/1] lib/ring: add scatter gather and serial dequeue APIs Honnappa Nagarahalli
  2020-02-24 20:39 ` [dpdk-dev] [RFC 1/1] " Honnappa Nagarahalli
@ 2020-10-06 13:29 ` Honnappa Nagarahalli
  2020-10-06 13:29   ` [dpdk-dev] [RFC v2 1/1] " Honnappa Nagarahalli
  2020-10-23  4:43 ` [dpdk-dev] [PATCH v3 0/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-06 13:29 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev
  Cc: olivier.matz, david.marchand, nd

Cover-letter:
It is pretty common for DPDK applications to be deployed in a
semi-pipeline model. In these models, a small number of cores
(typically 1) are designated as I/O cores. The I/O cores receive
packets from and transmit packets to the NIC, and exchange those
packets with several packet processing cores over a ring. Typically,
such applications receive the mbufs in a temporary array and then
copy the mbufs on to the ring. Depending on the requirements, the
packets could be copied in batches of 32, 64 etc., resulting in
256B, 512B etc. memory copies (assuming 8B mbuf pointers).

The scatter gather APIs help avoid intermediate copies by exposing
the space on the ring directly to the application.
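
To illustrate the wrap-around handling that such an API has to expose, here
is a minimal self-contained sketch. It assumes a hypothetical toy ring (the
toy_* names, the 8-slot size and the descriptor layout are made up for
illustration and are not the rte_ring internals), omits free-space checks
and synchronization, and computes the split in its own way rather than
copying the code from the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy ring with a power-of-two number of slots. */
#define TOY_RING_SIZE 8u
#define TOY_RING_MASK (TOY_RING_SIZE - 1u)

struct toy_ring {
	uint32_t prod_head;            /* free-running producer index */
	void *slots[TOY_RING_SIZE];    /* storage for pointer-sized elements */
};

/* Hypothetical descriptor for the (at most two) contiguous regions. */
struct toy_sg {
	void **ptr1;    /* first region, starting at the head slot */
	uint32_t n1;    /* number of slots usable at ptr1 */
	void **ptr2;    /* second region after wrap-around, or NULL */
};

/* Describe where 'num' slots starting at 'head' live in the storage array. */
static void
toy_get_elem_addr(struct toy_ring *r, uint32_t head, uint32_t num,
		struct toy_sg *sg)
{
	uint32_t idx = head & TOY_RING_MASK;

	sg->ptr1 = &r->slots[idx];
	sg->n1 = num;
	sg->ptr2 = NULL;

	if (idx + num > TOY_RING_SIZE) {
		/* Only the slots up to the end of the array are contiguous. */
		sg->n1 = TOY_RING_SIZE - idx;
		sg->ptr2 = &r->slots[0];
	}
}

/* Copy 'num' pointers straight into the ring, with no staging array.
 * The caller is assumed to have checked that 'num' slots are free.
 */
static void
toy_fill(struct toy_ring *r, void * const *objs, uint32_t num)
{
	struct toy_sg sg;

	toy_get_elem_addr(r, r->prod_head, num, &sg);
	memcpy(sg.ptr1, objs, sg.n1 * sizeof(void *));
	if (sg.ptr2 != NULL)
		memcpy(sg.ptr2, objs + sg.n1, (num - sg.n1) * sizeof(void *));
	r->prod_head += num;
}
```

In a real application the two regions would be handed to the producer of
the data (for example an rx burst call) instead of being filled from an
existing array, which is where the saved copy comes from.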

v2: changed the patch to use the SP-SC and HTS modes

v1: Initial version

Todo:
Add test cases

Honnappa Nagarahalli (1):
  lib/ring: add scatter gather APIs

 lib/librte_ring/meson.build        |   3 +-
 lib/librte_ring/rte_ring_elem.h    |   1 +
 lib/librte_ring/rte_ring_peek_sg.h | 552 +++++++++++++++++++++++++++++
 3 files changed, 555 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_ring/rte_ring_peek_sg.h

-- 
2.17.1



* [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
  2020-10-06 13:29 ` [dpdk-dev] [RFC v2 0/1] lib/ring: add scatter gather APIs Honnappa Nagarahalli
@ 2020-10-06 13:29   ` Honnappa Nagarahalli
  2020-10-07  8:27     ` Olivier Matz
  2020-10-12 16:20     ` Ananyev, Konstantin
  0 siblings, 2 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-06 13:29 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev
  Cc: olivier.matz, david.marchand, nd

Add scatter gather APIs to avoid an intermediate memcpy. Use cases
that involve copying a large amount of data to/from the ring
can benefit from these APIs.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 lib/librte_ring/meson.build        |   3 +-
 lib/librte_ring/rte_ring_elem.h    |   1 +
 lib/librte_ring/rte_ring_peek_sg.h | 552 +++++++++++++++++++++++++++++
 3 files changed, 555 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_ring/rte_ring_peek_sg.h

diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
index 31c0b4649..377694713 100644
--- a/lib/librte_ring/meson.build
+++ b/lib/librte_ring/meson.build
@@ -12,4 +12,5 @@ headers = files('rte_ring.h',
 		'rte_ring_peek.h',
 		'rte_ring_peek_c11_mem.h',
 		'rte_ring_rts.h',
-		'rte_ring_rts_c11_mem.h')
+		'rte_ring_rts_c11_mem.h',
+		'rte_ring_peek_sg.h')
diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
index 938b398fc..7d3933f15 100644
--- a/lib/librte_ring/rte_ring_elem.h
+++ b/lib/librte_ring/rte_ring_elem.h
@@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
 
 #ifdef ALLOW_EXPERIMENTAL_API
 #include <rte_ring_peek.h>
+#include <rte_ring_peek_sg.h>
 #endif
 
 #include <rte_ring.h>
diff --git a/lib/librte_ring/rte_ring_peek_sg.h b/lib/librte_ring/rte_ring_peek_sg.h
new file mode 100644
index 000000000..97d5764a6
--- /dev/null
+++ b/lib/librte_ring/rte_ring_peek_sg.h
@@ -0,0 +1,552 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ *
+ * Copyright (c) 2020 Arm
+ * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * All rights reserved.
+ * Derived from FreeBSD's bufring.h
+ * Used as BSD-3 Licensed with permission from Kip Macy.
+ */
+
+#ifndef _RTE_RING_PEEK_SG_H_
+#define _RTE_RING_PEEK_SG_H_
+
+/**
+ * @file
+ * @b EXPERIMENTAL: this API may change without prior notice
+ * It is not recommended to include this file directly.
+ * Please include <rte_ring_elem.h> instead.
+ *
+ * Ring Peek Scatter Gather APIs
+ * Introduction of rte_ring with scatter gather serialized producer/consumer
+ * (HTS sync mode) makes it possible to split public enqueue/dequeue API
+ * into 3 phases:
+ * - enqueue/dequeue start
+ * - copy data to/from the ring
+ * - enqueue/dequeue finish
+ * Along with the advantages of the peek APIs, these APIs provide the ability
+ * to avoid copying of the data to temporary area.
+ *
+ * Note that right now this new API is available only for two sync modes:
+ * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
+ * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
+ * It is a user responsibility to create/init ring with appropriate sync
+ * modes selected.
+ *
+ * Example usage:
+ * // reserve space for 32 elems on the ring:
+ * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
+ * if (n != 0) {
+ *	//Copy objects in the ring
+ *	memcpy (sgd.ptr1, obj, sgd.n1 * sizeof(uintptr_t));
+ *	if (n != sgd.n1)
+ *		//Second memcpy because of wraparound
+ *		memcpy (sgd.ptr2, &obj[sgd.n1],
+ *			(n - sgd.n1) * sizeof(uintptr_t));
+ *	rte_ring_enqueue_sg_finish(ring, n);
+ * }
+ *
+ * Note that between _start_ and _finish_ no other thread can proceed
+ * with enqueue(/dequeue) operation till _finish_ completes.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_ring_peek_c11_mem.h>
+
+/* Rock that needs to be passed between reserve and commit APIs */
+struct rte_ring_sg_data {
+	/* Pointer to the first space in the ring */
+	void **ptr1;
+	/* Pointer to the second space in the ring if there is wrap-around */
+	void **ptr2;
+	/* Number of elements in the first pointer. If this is equal to
+	 * the number of elements requested, then ptr2 is NULL.
+	 * Otherwise, subtracting n1 from number of elements requested
+	 * will give the number of elements available at ptr2.
+	 */
+	unsigned int n1;
+};
+
+static __rte_always_inline void
+__rte_ring_get_elem_addr_64(struct rte_ring *r, uint32_t head,
+	uint32_t num, void **dst1, uint32_t *n1, void **dst2)
+{
+	uint32_t idx = head & r->mask;
+	uint64_t *ring = (uint64_t *)&r[1];
+
+	*dst1 = ring + idx;
+	*n1 = num;
+
+	if (idx + num > r->size) {
+		*n1 = num - (r->size - idx - 1);
+		*dst2 = ring;
+	}
+}
+
+static __rte_always_inline void
+__rte_ring_get_elem_addr_128(struct rte_ring *r, uint32_t head,
+	uint32_t num, void **dst1, uint32_t *n1, void **dst2)
+{
+	uint32_t idx = head & r->mask;
+	rte_int128_t *ring = (rte_int128_t *)&r[1];
+
+	*dst1 = ring + idx;
+	*n1 = num;
+
+	if (idx + num > r->size) {
+		*n1 = num - (r->size - idx - 1);
+		*dst2 = ring;
+	}
+}
+
+static __rte_always_inline void
+__rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
+	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void **dst2)
+{
+	if (esize == 8)
+		__rte_ring_get_elem_addr_64(r, head,
+						num, dst1, n1, dst2);
+	else if (esize == 16)
+		__rte_ring_get_elem_addr_128(r, head,
+						num, dst1, n1, dst2);
+	else {
+		uint32_t idx, scale, nr_idx;
+		uint32_t *ring = (uint32_t *)&r[1];
+
+		/* Normalize to uint32_t */
+		scale = esize / sizeof(uint32_t);
+		idx = head & r->mask;
+		nr_idx = idx * scale;
+
+		*dst1 = ring + nr_idx;
+		*n1 = num;
+
+		if (idx + num > r->size) {
+			*n1 = num - (r->size - idx - 1);
+			*dst2 = ring;
+		}
+	}
+}
+
+/**
+ * @internal This function moves prod head value.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_enqueue_sg_elem_start(struct rte_ring *r, unsigned int esize,
+		uint32_t n, enum rte_ring_queue_behavior behavior,
+		struct rte_ring_sg_data *sgd, unsigned int *free_space)
+{
+	uint32_t free, head, next;
+
+	switch (r->prod.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_move_prod_head(r, RTE_RING_SYNC_ST, n,
+			behavior, &head, &next, &free);
+		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&sgd->ptr1,
+			&sgd->n1, (void **)&sgd->ptr2);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_move_prod_head(r, n, behavior, &head, &free);
+		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&sgd->ptr1,
+			&sgd->n1, (void **)&sgd->ptr2);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+		n = 0;
+		free = 0;
+	}
+
+	if (free_space != NULL)
+		*free_space = free - n;
+	return n;
+}
+
+/**
+ * Start to enqueue several objects on the ring.
+ * Note that no actual objects are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy objects into the queue using the returned pointers.
+ * User should call rte_ring_enqueue_sg_bulk_elem_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param sgd
+ *   The scatter-gather data containing pointers for copying data.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_sg_bulk_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int *free_space)
+{
+	return __rte_ring_do_enqueue_sg_elem_start(r, esize, n,
+			RTE_RING_QUEUE_FIXED, sgd, free_space);
+}
+
+/**
+ * Start to enqueue several pointers to objects on the ring.
+ * Note that no actual pointers are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy pointers to objects into the queue using the
+ * returned pointers.
+ * User should call rte_ring_enqueue_sg_bulk_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param sgd
+ *   The scatter-gather data containing pointers for copying data.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_sg_bulk_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_sg_data *sgd, unsigned int *free_space)
+{
+	return rte_ring_enqueue_sg_bulk_elem_start(r, sizeof(uintptr_t), n,
+							sgd, free_space);
+}
+/**
+ * Start to enqueue several objects on the ring.
+ * Note that no actual objects are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy objects into the queue using the returned pointers.
+ * User should call rte_ring_enqueue_sg_bulk_elem_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param sgd
+ *   The scatter-gather data containing pointers for copying data.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_sg_burst_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int *free_space)
+{
+	return __rte_ring_do_enqueue_sg_elem_start(r, esize, n,
+			RTE_RING_QUEUE_VARIABLE, sgd, free_space);
+}
+
+/**
+ * Start to enqueue several pointers to objects on the ring.
+ * Note that no actual pointers are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy pointers to objects into the queue using the
+ * returned pointers.
+ * User should call rte_ring_enqueue_sg_bulk_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param sgd
+ *   The scatter-gather data containing pointers for copying data.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_sg_burst_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_sg_data *sgd, unsigned int *free_space)
+{
+	return rte_ring_enqueue_sg_burst_elem_start(r, sizeof(uintptr_t), n,
+							sgd, free_space);
+}
+
+/**
+ * Complete enqueuing several objects on the ring.
+ * Note that number of objects to enqueue should not exceed previous
+ * enqueue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add to the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_enqueue_sg_elem_finish(struct rte_ring *r, unsigned int n)
+{
+	uint32_t tail;
+
+	switch (r->prod.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_st_get_tail(&r->prod, &tail, n);
+		__rte_ring_st_set_head_tail(&r->prod, tail, n, 1);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_get_tail(&r->hts_prod, &tail, n);
+		__rte_ring_hts_set_head_tail(&r->hts_prod, tail, n, 1);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+	}
+}
+
+/**
+ * Complete enqueuing several pointers to objects on the ring.
+ * Note that number of objects to enqueue should not exceed previous
+ * enqueue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of pointers to objects to add to the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_enqueue_sg_finish(struct rte_ring *r, unsigned int n)
+{
+	rte_ring_enqueue_sg_elem_finish(r, n);
+}
+
+/**
+ * @internal This function moves cons head value and returns pointers to up
+ * to *n* objects in the ring; the objects themselves are not copied here.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_dequeue_sg_elem_start(struct rte_ring *r,
+	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
+	struct rte_ring_sg_data *sgd, unsigned int *available)
+{
+	uint32_t avail, head, next;
+
+	switch (r->cons.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_move_cons_head(r, RTE_RING_SYNC_ST, n,
+			behavior, &head, &next, &avail);
+		__rte_ring_get_elem_addr(r, head, esize, n,
+					sgd->ptr1, &sgd->n1, sgd->ptr2);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_move_cons_head(r, n, behavior,
+			&head, &avail);
+		__rte_ring_get_elem_addr(r, head, esize, n,
+					sgd->ptr1, &sgd->n1, sgd->ptr2);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+		n = 0;
+		avail = 0;
+	}
+
+	if (available != NULL)
+		*available = avail - n;
+	return n;
+}
+
+/**
+ * Start to dequeue several objects from the ring.
+ * Note that no actual objects are copied from the queue by this function.
+ * User has to copy objects from the queue using the returned pointers.
+ * User should call rte_ring_dequeue_sg_bulk_elem_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param sgd
+ *   The scatter-gather data containing pointers for copying data.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_sg_bulk_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int *available)
+{
+	return __rte_ring_do_dequeue_sg_elem_start(r, esize, n,
+			RTE_RING_QUEUE_FIXED, sgd, available);
+}
+
+/**
+ * Start to dequeue several pointers to objects from the ring.
+ * Note that no actual pointers are removed from the queue by this function.
+ * User has to copy pointers to objects from the queue using the
+ * returned pointers.
+ * User should call rte_ring_dequeue_sg_bulk_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param sgd
+ *   The scatter-gather data containing pointers for copying data.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_sg_bulk_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_sg_data *sgd, unsigned int *available)
+{
+	return rte_ring_dequeue_sg_bulk_elem_start(r, sizeof(uintptr_t),
+		n, sgd, available);
+}
+
+/**
+ * Start to dequeue several objects from the ring.
+ * Note that no actual objects are copied from the queue by this function.
+ * User has to copy objects from the queue using the returned pointers.
+ * User should call rte_ring_dequeue_sg_burst_elem_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to dequeue from the ring.
+ * @param sgd
+ *   The scatter-gather data containing pointers for copying data.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_sg_burst_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int *available)
+{
+	return __rte_ring_do_dequeue_sg_elem_start(r, esize, n,
+			RTE_RING_QUEUE_VARIABLE, sgd, available);
+}
+
+/**
+ * Start to dequeue several pointers to objects from the ring.
+ * Note that no actual pointers are removed from the queue by this function.
+ * User has to copy pointers to objects from the queue using the
+ * returned pointers.
+ * User should call rte_ring_dequeue_sg_burst_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param sgd
+ *   The scatter-gather data containing pointers for copying data.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_sg_burst_start(struct rte_ring *r, unsigned int n,
+		struct rte_ring_sg_data *sgd, unsigned int *available)
+{
+	return rte_ring_dequeue_sg_burst_elem_start(r, sizeof(uintptr_t), n,
+			sgd, available);
+}
+
+/**
+ * Complete dequeuing several objects from the ring.
+ * Note that number of objects to be dequeued should not exceed previous
+ * dequeue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_dequeue_sg_elem_finish(struct rte_ring *r, unsigned int n)
+{
+	uint32_t tail;
+
+	switch (r->cons.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_st_get_tail(&r->cons, &tail, n);
+		__rte_ring_st_set_head_tail(&r->cons, tail, n, 0);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_get_tail(&r->hts_cons, &tail, n);
+		__rte_ring_hts_set_head_tail(&r->hts_cons, tail, n, 0);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+	}
+}
+
+/**
+ * Complete dequeuing several objects from the ring.
+ * Note that number of objects to be dequeued should not exceed previous
+ * dequeue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_dequeue_sg_finish(struct rte_ring *r, unsigned int n)
+{
+	rte_ring_dequeue_elem_finish(r, n);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RING_PEEK_SG_H_ */
-- 
2.17.1



* Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
  2020-10-06 13:29   ` [dpdk-dev] [RFC v2 1/1] " Honnappa Nagarahalli
@ 2020-10-07  8:27     ` Olivier Matz
  2020-10-08 20:44       ` Honnappa Nagarahalli
  2020-10-12 16:20     ` Ananyev, Konstantin
  1 sibling, 1 reply; 69+ messages in thread
From: Olivier Matz @ 2020-10-07  8:27 UTC (permalink / raw)
  To: Honnappa Nagarahalli; +Cc: dev, konstantin.ananyev, david.marchand, nd

Hi Honnappa,

From a quick walkthrough, I have some questions/comments, please
see below.

On Tue, Oct 06, 2020 at 08:29:05AM -0500, Honnappa Nagarahalli wrote:
> Add scatter gather APIs to avoid intermediate memcpy. Use cases
> that involve copying large amount of data to/from the ring
> can benefit from these APIs.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
>  lib/librte_ring/meson.build        |   3 +-
>  lib/librte_ring/rte_ring_elem.h    |   1 +
>  lib/librte_ring/rte_ring_peek_sg.h | 552 +++++++++++++++++++++++++++++
>  3 files changed, 555 insertions(+), 1 deletion(-)
>  create mode 100644 lib/librte_ring/rte_ring_peek_sg.h
> 
> diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> index 31c0b4649..377694713 100644
> --- a/lib/librte_ring/meson.build
> +++ b/lib/librte_ring/meson.build
> @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
>  		'rte_ring_peek.h',
>  		'rte_ring_peek_c11_mem.h',
>  		'rte_ring_rts.h',
> -		'rte_ring_rts_c11_mem.h')
> +		'rte_ring_rts_c11_mem.h',
> +		'rte_ring_peek_sg.h')
> diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
> index 938b398fc..7d3933f15 100644
> --- a/lib/librte_ring/rte_ring_elem.h
> +++ b/lib/librte_ring/rte_ring_elem.h
> @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
>  
>  #ifdef ALLOW_EXPERIMENTAL_API
>  #include <rte_ring_peek.h>
> +#include <rte_ring_peek_sg.h>
>  #endif
>  
>  #include <rte_ring.h>
> diff --git a/lib/librte_ring/rte_ring_peek_sg.h b/lib/librte_ring/rte_ring_peek_sg.h
> new file mode 100644
> index 000000000..97d5764a6
> --- /dev/null
> +++ b/lib/librte_ring/rte_ring_peek_sg.h
> @@ -0,0 +1,552 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + *
> + * Copyright (c) 2020 Arm
> + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> + * All rights reserved.
> + * Derived from FreeBSD's bufring.h
> + * Used as BSD-3 Licensed with permission from Kip Macy.
> + */
> +
> +#ifndef _RTE_RING_PEEK_SG_H_
> +#define _RTE_RING_PEEK_SG_H_
> +
> +/**
> + * @file
> + * @b EXPERIMENTAL: this API may change without prior notice
> + * It is not recommended to include this file directly.
> + * Please include <rte_ring_elem.h> instead.
> + *
> + * Ring Peek Scatter Gather APIs

I am not fully convinced by the API name. To me, "scatter/gather" is
associated with iovecs, as for instance in [1]. The Wikipedia definition
[2] may be closer though.

[1] https://www.gnu.org/software/libc/manual/html_node/Scatter_002dGather.html
[2] https://en.wikipedia.org/wiki/Gather-scatter_(vector_addressing)

What about "zero-copy"?

Also, the "peek" term looks a bit confusing to me, but it existed
before your patch. I understand it for dequeue, but not for enqueue.

Or, what about replacing the existing experimental peek API by this one?
They look quite similar to me.

> + * Introduction of rte_ring with scatter gather serialized producer/consumer
> + * (HTS sync mode) makes it possible to split public enqueue/dequeue API
> + * into 3 phases:
> + * - enqueue/dequeue start
> + * - copy data to/from the ring
> + * - enqueue/dequeue finish
> + * Along with the advantages of the peek APIs, these APIs provide the ability
> + * to avoid copying of the data to temporary area.
> + *
> + * Note that right now this new API is available only for two sync modes:
> + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> + * It is a user responsibility to create/init ring with appropriate sync
> + * modes selected.
> + *
> + * Example usage:
> + * // read 1 elem from the ring:

Comment should be "prepare enqueuing 32 objects"

> + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> + * if (n != 0) {
> + *	//Copy objects in the ring
> + *	memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> + *	if (n != sgd->n1)
> + *		//Second memcpy because of wrapround
> + *		n2 = n - sgd->n1;
> + *		memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));

Missing { }

> + *	rte_ring_dequeue_sg_finish(ring, n);

Should be enqueue

> + * }
> + *
> + * Note that between _start_ and _finish_ none other thread can proceed
> + * with enqueue(/dequeue) operation till _finish_ completes.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_ring_peek_c11_mem.h>
> +
> +/* Rock that needs to be passed between reserve and commit APIs */
> +struct rte_ring_sg_data {
> +	/* Pointer to the first space in the ring */
> +	void **ptr1;
> +	/* Pointer to the second space in the ring if there is wrap-around */
> +	void **ptr2;
> +	/* Number of elements in the first pointer. If this is equal to
> +	 * the number of elements requested, then ptr2 is NULL.
> +	 * Otherwise, subtracting n1 from number of elements requested
> +	 * will give the number of elements available at ptr2.
> +	 */
> +	unsigned int n1;
> +};

Would it be possible to simply return the offset instead of this structure?
The wrap could be managed by a __rte_ring_enqueue_elems() function.

I mean something like this:

	uint32_t start;
	n = rte_ring_enqueue_sg_bulk_start(ring, 32, &start, NULL);
	if (n != 0) {
		/* Copy objects in the ring. */
		__rte_ring_enqueue_elems(ring, start, obj, sizeof(uintptr_t), n);
		rte_ring_enqueue_sg_finish(ring, n);
	}

This would require slightly changing __rte_ring_enqueue_elems() so that it
can be called with prod_head >= size, and wraps in that case.
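
As a rough sketch of what such a wrap-aware copy could look like (toy code,
not the real __rte_ring_enqueue_elems(); struct toy_ring and
toy_ring_copy_in() are made-up names used only for illustration, and the
ring size is assumed to be a power of two):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a ring; not the real struct rte_ring. */
struct toy_ring {
	uint32_t size;   /* number of slots, assumed to be a power of two */
	uint32_t mask;   /* size - 1 */
	void **slots;    /* slot storage */
};

/*
 * Copy n pointers into the ring starting at the free-running index
 * 'start'. 'start' may be >= size: it is masked here, and the copy is
 * split in two when it crosses the end of the slot array.
 */
static void
toy_ring_copy_in(struct toy_ring *r, uint32_t start, void * const *obj,
		uint32_t n)
{
	uint32_t idx = start & r->mask;
	uint32_t first = r->size - idx;

	assert(n <= r->size);
	if (first > n)
		first = n;
	/* First part: from idx up to the end of the array (or n slots). */
	memcpy(&r->slots[idx], obj, first * sizeof(void *));
	/* Second part: whatever is left, starting again at slot 0. */
	memcpy(&r->slots[0], obj + first, (n - first) * sizeof(void *));
}
```

With something like this the caller only carries a 32-bit start index
between _start_ and _finish_, and the wrap-around stays inside the helper.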


Regards,
Olivier


* Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
  2020-10-07  8:27     ` Olivier Matz
@ 2020-10-08 20:44       ` Honnappa Nagarahalli
  2020-10-08 20:47         ` Honnappa Nagarahalli
  2020-10-09  7:33         ` Olivier Matz
  0 siblings, 2 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-08 20:44 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, konstantin.ananyev, david.marchand, nd, Honnappa Nagarahalli, nd

<snip>

> 
> Hi Honnappa,
> 
> From a quick walkthrough, I have some questions/comments, please see
> below.
Hi Olivier, appreciate your input.

> 
> On Tue, Oct 06, 2020 at 08:29:05AM -0500, Honnappa Nagarahalli wrote:
> > Add scatter gather APIs to avoid intermediate memcpy. Use cases that
> > involve copying large amount of data to/from the ring can benefit from
> > these APIs.
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > ---
> >  lib/librte_ring/meson.build        |   3 +-
> >  lib/librte_ring/rte_ring_elem.h    |   1 +
> >  lib/librte_ring/rte_ring_peek_sg.h | 552
> > +++++++++++++++++++++++++++++
> >  3 files changed, 555 insertions(+), 1 deletion(-)  create mode 100644
> > lib/librte_ring/rte_ring_peek_sg.h
> >
> > diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> > index 31c0b4649..377694713 100644
> > --- a/lib/librte_ring/meson.build
> > +++ b/lib/librte_ring/meson.build
> > @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
> >  		'rte_ring_peek.h',
> >  		'rte_ring_peek_c11_mem.h',
> >  		'rte_ring_rts.h',
> > -		'rte_ring_rts_c11_mem.h')
> > +		'rte_ring_rts_c11_mem.h',
> > +		'rte_ring_peek_sg.h')
> > diff --git a/lib/librte_ring/rte_ring_elem.h
> > b/lib/librte_ring/rte_ring_elem.h index 938b398fc..7d3933f15 100644
> > --- a/lib/librte_ring/rte_ring_elem.h
> > +++ b/lib/librte_ring/rte_ring_elem.h
> > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r,
> > void *obj_table,
> >
> >  #ifdef ALLOW_EXPERIMENTAL_API
> >  #include <rte_ring_peek.h>
> > +#include <rte_ring_peek_sg.h>
> >  #endif
> >
> >  #include <rte_ring.h>
> > diff --git a/lib/librte_ring/rte_ring_peek_sg.h
> > b/lib/librte_ring/rte_ring_peek_sg.h
> > new file mode 100644
> > index 000000000..97d5764a6
> > --- /dev/null
> > +++ b/lib/librte_ring/rte_ring_peek_sg.h
> > @@ -0,0 +1,552 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + *
> > + * Copyright (c) 2020 Arm
> > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > + * All rights reserved.
> > + * Derived from FreeBSD's bufring.h
> > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > + */
> > +
> > +#ifndef _RTE_RING_PEEK_SG_H_
> > +#define _RTE_RING_PEEK_SG_H_
> > +
> > +/**
> > + * @file
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + * It is not recommended to include this file directly.
> > + * Please include <rte_ring_elem.h> instead.
> > + *
> > + * Ring Peek Scatter Gather APIs
> 
> I am not fully convinced by the API name. To me, "scatter/gather" is
> associated to iovecs, like for instance in [1]. The wikipedia definition [2] may
> be closer though.
> 
> [1] https://www.gnu.org/software/libc/manual/html_node/Scatter_002dGather.html
> [2] https://en.wikipedia.org/wiki/Gather-scatter_(vector_addressing)
The way I understand scatter-gather is, the data to be sent to something (like a device) is coming from multiple sources. It would require copying to put the data together to be contiguous. If the device supports scatter-gather, such copying is not required. The device can collect data from multiple locations and make it contiguous.

In the case I was looking at, one part of the data was coming from the user of the API and another was generated by the API itself. If these two pieces of information have to be enqueued as a single object on the ring, I had to create an intermediate copy. But by exposing the ring memory to the user, the intermediate copy is avoided. Hence I called it scatter-gather.

> 
> What about "zero-copy"?
I think no-copy (nc for short) or user-copy (uc for short) would convey the meaning better. These would indicate that the rte_ring APIs are not copying the objects and it is left to the user to do the actual copy.

> 
> Also, the "peek" term looks also a bit confusing to me, but it existed before
> your patch. I understand it for dequeue, but not for enqueue.
I kept 'peek' there because the API still offers the 'peek' API capabilities. I am also not sure what 'peek' means for the enqueue API. The enqueue 'peek' API was provided to be symmetric with the dequeue peek API.

> 
> Or, what about replacing the existing experimental peek API by this one?
> They look quite similar to me.
I agree, scatter gather APIs provide the peek capability and the no-copy benefits.
Konstantin, any opinions here?

> 
> > + * Introduction of rte_ring with scatter gather serialized
> > + producer/consumer
> > + * (HTS sync mode) makes it possible to split public enqueue/dequeue
> > + API
> > + * into 3 phases:
> > + * - enqueue/dequeue start
> > + * - copy data to/from the ring
> > + * - enqueue/dequeue finish
> > + * Along with the advantages of the peek APIs, these APIs provide the
> > + ability
> > + * to avoid copying of the data to temporary area.
> > + *
> > + * Note that right now this new API is available only for two sync modes:
> > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> > + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> > + * It is a user responsibility to create/init ring with appropriate
> > + sync
> > + * modes selected.
> > + *
> > + * Example usage:
> > + * // read 1 elem from the ring:
> 
> Comment should be "prepare enqueuing 32 objects"
> 
> > + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> > + * if (n != 0) {
> > + *	//Copy objects in the ring
> > + *	memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> > + *	if (n != sgd->n1)
> > + *		//Second memcpy because of wrapround
> > + *		n2 = n - sgd->n1;
> > + *		memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));
> 
> Missing { }
> 
> > + *	rte_ring_dequeue_sg_finish(ring, n);
> 
> Should be enqueue
> 
Thanks, will correct them.

> > + * }
> > + *
> > + * Note that between _start_ and _finish_ none other thread can
> > + proceed
> > + * with enqueue(/dequeue) operation till _finish_ completes.
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include <rte_ring_peek_c11_mem.h>
> > +
> > +/* Rock that needs to be passed between reserve and commit APIs */
> > +struct rte_ring_sg_data {
> > +	/* Pointer to the first space in the ring */
> > +	void **ptr1;
> > +	/* Pointer to the second space in the ring if there is wrap-around */
> > +	void **ptr2;
> > +	/* Number of elements in the first pointer. If this is equal to
> > +	 * the number of elements requested, then ptr2 is NULL.
> > +	 * Otherwise, subtracting n1 from number of elements requested
> > +	 * will give the number of elements available at ptr2.
> > +	 */
> > +	unsigned int n1;
> > +};
> 
> Would it be possible to simply return the offset instead of this structure?
> The wrap could be managed by a __rte_ring_enqueue_elems() function.
Trying to use __rte_ring_enqueue_elems() will force a temporary copy. See below.

> 
> I mean something like this:
> 
> 	uint32_t start;
> 	n = rte_ring_enqueue_sg_bulk_start(ring, 32, &start, NULL);
> 	if (n != 0) {
> 		/* Copy objects in the ring. */
> 		__rte_ring_enqueue_elems(ring, start, obj, sizeof(uintptr_t),
> n);
For example, 'obj' here is a temporary copy.

> 		rte_ring_enqueue_sg_finish(ring, n);
> 	}
> 
> It would require to slightly change __rte_ring_enqueue_elems() to support
> to be called with prod_head >= size, and wrap in that case.
> 
The alternate solution I can think of requires 3 things: 1) the base address of the ring, 2) the index to copy to, and 3) the mask. With these 3 things one could write code like below:
for (i = 0; i < n; i++) {
	ring_addr[(index + i) & mask] = obj[i]; // ANDing with mask will take care of wrap-around.
}

However, I think this does not allow for passing the address in the ring to another function/API to copy the data (It is possible, but the user has to calculate the actual address, worry about the wrap-around, second pointer etc).

The current approach hides some details and provides flexibility to the application to use the pointer the way it wants.
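
To make the comparison concrete, here is a minimal sketch of how the two segments could be derived once from the head index, so that the caller only ever sees plain pointers. This is toy code under stated assumptions: struct toy_sg_data mirrors the fields of struct rte_ring_sg_data from the patch, toy_sg_describe() is a hypothetical helper that is not part of the patch, and the ring size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor, modelled on struct rte_ring_sg_data. */
struct toy_sg_data {
	void **ptr1;      /* first contiguous run of slots */
	void **ptr2;      /* second run after wrap-around, or NULL */
	unsigned int n1;  /* number of slots available at ptr1 */
};

/*
 * Describe n reserved slots starting at the free-running index 'head'
 * in a ring of 'size' slots. The masking is done once here, so the
 * caller does not need the base address, the index or the mask.
 */
static void
toy_sg_describe(void **slots, uint32_t size, uint32_t head, unsigned int n,
		struct toy_sg_data *sgd)
{
	uint32_t idx = head & (size - 1);

	assert(n <= size);
	sgd->ptr1 = &slots[idx];
	if (idx + n > size) {
		/* Reservation crosses the end of the array: two segments. */
		sgd->n1 = size - idx;
		sgd->ptr2 = &slots[0];
	} else {
		/* Single contiguous segment. */
		sgd->n1 = n;
		sgd->ptr2 = NULL;
	}
}
```

The pointers in the descriptor can then be handed to any function that writes into memory, which is the flexibility I am referring to.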

> 
> Regards,
> Olivier


* Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
  2020-10-08 20:44       ` Honnappa Nagarahalli
@ 2020-10-08 20:47         ` Honnappa Nagarahalli
  2020-10-09  7:33         ` Olivier Matz
  1 sibling, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-08 20:47 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, konstantin.ananyev, david.marchand, nd, Honnappa Nagarahalli, nd

<snip>

> 
> >
> > Hi Honnappa,
> >
> > From a quick walkthrough, I have some questions/comments, please see
> > below.
> Hi Olivier, appreciate your input.
> 
> >
> > On Tue, Oct 06, 2020 at 08:29:05AM -0500, Honnappa Nagarahalli wrote:
> > > Add scatter gather APIs to avoid intermediate memcpy. Use cases that
> > > involve copying large amount of data to/from the ring can benefit
> > > from these APIs.
> > >
> > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > ---
> > >  lib/librte_ring/meson.build        |   3 +-
> > >  lib/librte_ring/rte_ring_elem.h    |   1 +
> > >  lib/librte_ring/rte_ring_peek_sg.h | 552
> > > +++++++++++++++++++++++++++++
> > >  3 files changed, 555 insertions(+), 1 deletion(-)  create mode
> > > 100644 lib/librte_ring/rte_ring_peek_sg.h
> > >
> > > diff --git a/lib/librte_ring/meson.build
> > > b/lib/librte_ring/meson.build index 31c0b4649..377694713 100644
> > > --- a/lib/librte_ring/meson.build
> > > +++ b/lib/librte_ring/meson.build
> > > @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
> > >  		'rte_ring_peek.h',
> > >  		'rte_ring_peek_c11_mem.h',
> > >  		'rte_ring_rts.h',
> > > -		'rte_ring_rts_c11_mem.h')
> > > +		'rte_ring_rts_c11_mem.h',
> > > +		'rte_ring_peek_sg.h')
> > > diff --git a/lib/librte_ring/rte_ring_elem.h
> > > b/lib/librte_ring/rte_ring_elem.h index 938b398fc..7d3933f15 100644
> > > --- a/lib/librte_ring/rte_ring_elem.h
> > > +++ b/lib/librte_ring/rte_ring_elem.h
> > > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring
> > > *r, void *obj_table,
> > >
> > >  #ifdef ALLOW_EXPERIMENTAL_API
> > >  #include <rte_ring_peek.h>
> > > +#include <rte_ring_peek_sg.h>
> > >  #endif
> > >
> > >  #include <rte_ring.h>
> > > diff --git a/lib/librte_ring/rte_ring_peek_sg.h
> > > b/lib/librte_ring/rte_ring_peek_sg.h
> > > new file mode 100644
> > > index 000000000..97d5764a6
> > > --- /dev/null
> > > +++ b/lib/librte_ring/rte_ring_peek_sg.h
> > > @@ -0,0 +1,552 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + *
> > > + * Copyright (c) 2020 Arm
> > > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > > + * All rights reserved.
> > > + * Derived from FreeBSD's bufring.h
> > > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > > + */
> > > +
> > > +#ifndef _RTE_RING_PEEK_SG_H_
> > > +#define _RTE_RING_PEEK_SG_H_
> > > +
> > > +/**
> > > + * @file
> > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > + * It is not recommended to include this file directly.
> > > + * Please include <rte_ring_elem.h> instead.
> > > + *
> > > + * Ring Peek Scatter Gather APIs
> >
> > I am not fully convinced by the API name. To me, "scatter/gather" is
> > associated to iovecs, like for instance in [1]. The wikipedia
> > definition [2] may be closer though.
> >
> > [1] https://www.gnu.org/software/libc/manual/html_node/Scatter_002dGather.html
> > [2] https://en.wikipedia.org/wiki/Gather-scatter_(vector_addressing)
> The way I understand scatter-gather is, the data to be sent to something (like
> a device) is coming from multiple sources. It would require copying to put the
> data together to be contiguous. If the device supports scatter-gather, such
> copying is not required. The device can collect data from multiple locations
> and make it contiguous.
> 
> In the case I was looking at, one part of the data was coming from the user of
> the API and another was generated by the API itself. If these two pieces of
> information have to be enqueued as a single object on the ring, I had to
> create an intermediate copy. But by exposing the ring memory to the user,
> the intermediate copy is avoided. Hence I called it scatter-gather.
> 
> >
> > What about "zero-copy"?
> I think no-copy (nc for short) or user-copy (uc for short) would convey the
> meaning better. These would indicate that the rte_ring APIs are not copying
> the objects and it is left to the user to do the actual copy.
> 
> >
> > Also, the "peek" term looks also a bit confusing to me, but it existed
> > before your patch. I understand it for dequeue, but not for enqueue.
> I kept 'peek' there because the API still offers the 'peek' API capabilities. I am
> also not sure on what 'peek' means for enqueue API. The enqueue 'peek'
> API was provided to be symmetric with dequeue peek API.
> 
> >
> > Or, what about replacing the existing experimental peek API by this one?
> > They look quite similar to me.
> I agree, scatter gather APIs provide the peek capability and the no-copy
> benefits.
> Konstantin, any opinions here?
> 
> >
> > > + * Introduction of rte_ring with scatter gather serialized
> > > + producer/consumer
> > > + * (HTS sync mode) makes it possible to split public
> > > + enqueue/dequeue API
> > > + * into 3 phases:
> > > + * - enqueue/dequeue start
> > > + * - copy data to/from the ring
> > > + * - enqueue/dequeue finish
> > > + * Along with the advantages of the peek APIs, these APIs provide
> > > + the ability
> > > + * to avoid copying of the data to temporary area.
> > > + *
> > > + * Note that right now this new API is available only for two sync modes:
> > > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> > > + * 2) Serialized Producer/Serialized Consumer
> (RTE_RING_SYNC_MT_HTS).
> > > + * It is a user responsibility to create/init ring with appropriate
> > > + sync
> > > + * modes selected.
> > > + *
> > > + * Example usage:
> > > + * // read 1 elem from the ring:
> >
> > Comment should be "prepare enqueuing 32 objects"
> >
> > > + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> > > + * if (n != 0) {
> > > + *	//Copy objects in the ring
> > > + *	memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> > > + *	if (n != sgd->n1)
> > > + *		//Second memcpy because of wrapround
> > > + *		n2 = n - sgd->n1;
> > > + *		memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));
> >
> > Missing { }
> >
> > > + *	rte_ring_dequeue_sg_finish(ring, n);
> >
> > Should be enqueue
> >
> Thanks, will correct them.
> 
> > > + * }
> > > + *
> > > + * Note that between _start_ and _finish_ none other thread can
> > > + proceed
> > > + * with enqueue(/dequeue) operation till _finish_ completes.
> > > + */
> > > +
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +#include <rte_ring_peek_c11_mem.h>
> > > +
> > > +/* Rock that needs to be passed between reserve and commit APIs */
> > > +struct rte_ring_sg_data {
> > > +	/* Pointer to the first space in the ring */
> > > +	void **ptr1;
> > > +	/* Pointer to the second space in the ring if there is wrap-around */
> > > +	void **ptr2;
> > > +	/* Number of elements in the first pointer. If this is equal to
> > > +	 * the number of elements requested, then ptr2 is NULL.
> > > +	 * Otherwise, subtracting n1 from number of elements requested
> > > +	 * will give the number of elements available at ptr2.
> > > +	 */
> > > +	unsigned int n1;
> > > +};
> >
> > Would it be possible to simply return the offset instead of this structure?
> > The wrap could be managed by a __rte_ring_enqueue_elems() function.
> Trying to use __rte_ring_enqueue_elems() will force temporary copy. See
> below.
> 
> >
> > I mean something like this:
> >
> > 	uint32_t start;
> > 	n = rte_ring_enqueue_sg_bulk_start(ring, 32, &start, NULL);
> > 	if (n != 0) {
> > 		/* Copy objects in the ring. */
> > 		__rte_ring_enqueue_elems(ring, start, obj, sizeof(uintptr_t),
> n);
> For ex: 'obj' here is temporary copy.
The example I provided in the comment at the top of the file is not good. I will replace the 'memcpy' with a call to another API that copies the data to the ring directly. That should show the benefit clearly.
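
Something along these lines, as a minimal sketch only: toy_fill_tokens() and toy_fill_reserved() are hypothetical names, not APIs from the patch, and the (ptr1, n1, ptr2) triple is assumed to come from the _start_ call as described in struct rte_ring_sg_data.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical producer-side helper: writes n consecutive token values
 * straight into ring slots, so no staging array is needed.
 */
static void
toy_fill_tokens(void **dst, unsigned int n, uintptr_t first_token)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		dst[i] = (void *)(first_token + i);
}

/*
 * Fill n reserved slots described by (ptr1, n1, ptr2), the same triple
 * the patch keeps in struct rte_ring_sg_data.
 */
static void
toy_fill_reserved(void **ptr1, unsigned int n1, void **ptr2, unsigned int n,
		uintptr_t first_token)
{
	assert(n1 <= n);
	toy_fill_tokens(ptr1, n1, first_token);
	if (n != n1) {
		/* Second segment because of wrap-around. */
		toy_fill_tokens(ptr2, n - n1, first_token + n1);
	}
}
```

Here the data is generated directly in the ring memory, so there is no 'obj' array at all between _start_ and _finish_.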

> 
> > 		rte_ring_enqueue_sg_finish(ring, n);
> > 	}
> >
> > It would require to slightly change __rte_ring_enqueue_elems() to
> > support to be called with prod_head >= size, and wrap in that case.
> >
> The alternate solution I can think of requires 3 things 1) the base address of
> the ring 2) Index to where to copy 3) the mask. With these 3 things one could
> write the code like below:
> for (i = 0; i < n; i++) {
> 	ring_addr[(index + i) & mask] = obj[i]; // ANDing with mask will take
> care of wrap-around.
> }
> 
> However, I think this does not allow for passing the address in the ring to
> another function/API to copy the data (It is possible, but the user has to
> calculate the actual address, worry about the wrap-around, second pointer
> etc).
> 
> The current approach hides some details and provides flexibility to the
> application to use the pointer the way it wants.
> 
> >
> > Regards,
> > Olivier


* Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
  2020-10-08 20:44       ` Honnappa Nagarahalli
  2020-10-08 20:47         ` Honnappa Nagarahalli
@ 2020-10-09  7:33         ` Olivier Matz
  2020-10-09  8:05           ` Ananyev, Konstantin
  1 sibling, 1 reply; 69+ messages in thread
From: Olivier Matz @ 2020-10-09  7:33 UTC (permalink / raw)
  To: Honnappa Nagarahalli; +Cc: dev, konstantin.ananyev, david.marchand, nd

On Thu, Oct 08, 2020 at 08:44:13PM +0000, Honnappa Nagarahalli wrote:
> <snip>
> 
> > 
> > Hi Honnappa,
> > 
> > From a quick walkthrough, I have some questions/comments, please see
> > below.
> Hi Olivier, appreciate your input.
> 
> > 
> > On Tue, Oct 06, 2020 at 08:29:05AM -0500, Honnappa Nagarahalli wrote:
> > > Add scatter gather APIs to avoid intermediate memcpy. Use cases that
> > > involve copying large amount of data to/from the ring can benefit from
> > > these APIs.
> > >
> > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > ---
> > >  lib/librte_ring/meson.build        |   3 +-
> > >  lib/librte_ring/rte_ring_elem.h    |   1 +
> > >  lib/librte_ring/rte_ring_peek_sg.h | 552
> > > +++++++++++++++++++++++++++++
> > >  3 files changed, 555 insertions(+), 1 deletion(-)  create mode 100644
> > > lib/librte_ring/rte_ring_peek_sg.h
> > >
> > > diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> > > index 31c0b4649..377694713 100644
> > > --- a/lib/librte_ring/meson.build
> > > +++ b/lib/librte_ring/meson.build
> > > @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
> > >  		'rte_ring_peek.h',
> > >  		'rte_ring_peek_c11_mem.h',
> > >  		'rte_ring_rts.h',
> > > -		'rte_ring_rts_c11_mem.h')
> > > +		'rte_ring_rts_c11_mem.h',
> > > +		'rte_ring_peek_sg.h')
> > > diff --git a/lib/librte_ring/rte_ring_elem.h
> > > b/lib/librte_ring/rte_ring_elem.h index 938b398fc..7d3933f15 100644
> > > --- a/lib/librte_ring/rte_ring_elem.h
> > > +++ b/lib/librte_ring/rte_ring_elem.h
> > > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r,
> > > void *obj_table,
> > >
> > >  #ifdef ALLOW_EXPERIMENTAL_API
> > >  #include <rte_ring_peek.h>
> > > +#include <rte_ring_peek_sg.h>
> > >  #endif
> > >
> > >  #include <rte_ring.h>
> > > diff --git a/lib/librte_ring/rte_ring_peek_sg.h
> > > b/lib/librte_ring/rte_ring_peek_sg.h
> > > new file mode 100644
> > > index 000000000..97d5764a6
> > > --- /dev/null
> > > +++ b/lib/librte_ring/rte_ring_peek_sg.h
> > > @@ -0,0 +1,552 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + *
> > > + * Copyright (c) 2020 Arm
> > > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > > + * All rights reserved.
> > > + * Derived from FreeBSD's bufring.h
> > > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > > + */
> > > +
> > > +#ifndef _RTE_RING_PEEK_SG_H_
> > > +#define _RTE_RING_PEEK_SG_H_
> > > +
> > > +/**
> > > + * @file
> > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > + * It is not recommended to include this file directly.
> > > + * Please include <rte_ring_elem.h> instead.
> > > + *
> > > + * Ring Peek Scatter Gather APIs
> > 
> > I am not fully convinced by the API name. To me, "scatter/gather" is
> > associated to iovecs, like for instance in [1]. The wikipedia definition [2] may
> > be closer though.
> > 
> > [1] https://www.gnu.org/software/libc/manual/html_node/Scatter_002dGather.html
> > [2] https://en.wikipedia.org/wiki/Gather-scatter_(vector_addressing)
> The way I understand scatter-gather is, the data to be sent to something (like a device) is coming from multiple sources. It would require copying to put the data together to be contiguous. If the device supports scatter-gather, such copying is not required. The device can collect data from multiple locations and make it contiguous.
> 
> In the case I was looking at, one part of the data was coming from the user of the API and another was generated by the API itself. If these two pieces of information have to be enqueued as a single object on the ring, I had to create an intermediate copy. But by exposing the ring memory to the user, the intermediate copy is avoided. Hence I called it scatter-gather.
> 
> > 
> > What about "zero-copy"?
> I think no-copy (nc for short) or user-copy (uc for short) would convey the meaning better. These would indicate that the rte_ring APIs are not copying the objects and it is left to the user to do the actual copy.
> 
> > 
> > Also, the "peek" term looks also a bit confusing to me, but it existed before
> > your patch. I understand it for dequeue, but not for enqueue.
> I kept 'peek' there because the API still offers the 'peek' API capabilities. I am also not sure on what 'peek' means for enqueue API. The enqueue 'peek' API was provided to be symmetric with dequeue peek API.
> 
> > 
> > Or, what about replacing the existing experimental peek API by this one?
> > They look quite similar to me.
> I agree, scatter gather APIs provide the peek capability and the no-copy benefits.
> Konstantin, any opinions here?
> 
> > 
> > > + * Introduction of rte_ring with scatter gather serialized
> > > + producer/consumer
> > > + * (HTS sync mode) makes it possible to split public enqueue/dequeue
> > > + API
> > > + * into 3 phases:
> > > + * - enqueue/dequeue start
> > > + * - copy data to/from the ring
> > > + * - enqueue/dequeue finish
> > > + * Along with the advantages of the peek APIs, these APIs provide the
> > > + ability
> > > + * to avoid copying of the data to temporary area.
> > > + *
> > > + * Note that right now this new API is available only for two sync modes:
> > > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> > > + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> > > + * It is a user responsibility to create/init ring with appropriate
> > > + sync
> > > + * modes selected.
> > > + *
> > > + * Example usage:
> > > + * // read 1 elem from the ring:
> > 
> > Comment should be "prepare enqueuing 32 objects"
> > 
> > > + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> > > + * if (n != 0) {
> > > + *	//Copy objects in the ring
> > > + *	memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> > > + *	if (n != sgd->n1)
> > > + *		//Second memcpy because of wrapround
> > > + *		n2 = n - sgd->n1;
> > > + *		memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));
> > 
> > Missing { }
> > 
> > > + *	rte_ring_dequeue_sg_finish(ring, n);
> > 
> > Should be enqueue
> > 
> Thanks, will correct them.
> 
> > > + * }
> > > + *
> > > + * Note that between _start_ and _finish_ none other thread can
> > > + proceed
> > > + * with enqueue(/dequeue) operation till _finish_ completes.
> > > + */
> > > +
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +#include <rte_ring_peek_c11_mem.h>
> > > +
> > > +/* Rock that needs to be passed between reserve and commit APIs */
> > > +struct rte_ring_sg_data {
> > > +	/* Pointer to the first space in the ring */
> > > +	void **ptr1;
> > > +	/* Pointer to the second space in the ring if there is wrap-around */
> > > +	void **ptr2;
> > > +	/* Number of elements in the first pointer. If this is equal to
> > > +	 * the number of elements requested, then ptr2 is NULL.
> > > +	 * Otherwise, subtracting n1 from number of elements requested
> > > +	 * will give the number of elements available at ptr2.
> > > +	 */
> > > +	unsigned int n1;
> > > +};
> > 
> > Would it be possible to simply return the offset instead of this structure?
> > The wrap could be managed by a __rte_ring_enqueue_elems() function.
> Trying to use __rte_ring_enqueue_elems() will force temporary copy. See below.
> 
> > 
> > I mean something like this:
> > 
> > 	uint32_t start;
> > 	n = rte_ring_enqueue_sg_bulk_start(ring, 32, &start, NULL);
> > 	if (n != 0) {
> > 		/* Copy objects in the ring. */
> > 		__rte_ring_enqueue_elems(ring, start, obj, sizeof(uintptr_t),
> > n);
> For ex: 'obj' here is temporary copy.
> 
> > 		rte_ring_enqueue_sg_finish(ring, n);
> > 	}
> > 
> > It would require to slightly change __rte_ring_enqueue_elems() to support
> > to be called with prod_head >= size, and wrap in that case.
> > 
> The alternate solution I can think of requires 3 things 1) the base address of the ring 2) Index to where to copy 3) the mask. With these 3 things one could write the code like below:
> for (i = 0; i < n; i++) {
> 	ring_addr[(index + i) & mask] = obj[i]; // ANDing with mask will take care of wrap-around.
> }
> 
> However, I think this does not allow for passing the address in the ring to another function/API to copy the data (It is possible, but the user has to calculate the actual address, worry about the wrap-around, second pointer etc).
> 
> The current approach hides some details and provides flexibility to the application to use the pointer the way it wants.

I agree that doing the access + masking manually looks too complex.

However, I'm not sure I understand why using __rte_ring_enqueue_elems()
would result in an additional copy. I suppose the object that you want
to enqueue is already stored somewhere?

For instance, let's say you have 10 objects to enqueue, located at
different places:

	void *obj_0_to_3 = <place where objects 0 to 3 are stored>;
	void *obj_4_to_7 = ...;
	void *obj_8_to_9 = ...;
	uint32_t start;
	n = rte_ring_enqueue_sg_bulk_start(ring, 10, &start, NULL);
	if (n != 0) {
		__rte_ring_enqueue_elems(ring, start, obj_0_to_3,
			sizeof(uintptr_t), 4);
		__rte_ring_enqueue_elems(ring, start + 4, obj_4_to_7,
			sizeof(uintptr_t), 4);
		__rte_ring_enqueue_elems(ring, start + 8, obj_8_to_9,
			sizeof(uintptr_t), 2);
		rte_ring_enqueue_sg_finish(ring, 10);
	}


Thanks,
Olivier


* Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
  2020-10-09  7:33         ` Olivier Matz
@ 2020-10-09  8:05           ` Ananyev, Konstantin
  2020-10-09 22:54             ` Honnappa Nagarahalli
  0 siblings, 1 reply; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-09  8:05 UTC (permalink / raw)
  To: Olivier Matz, Honnappa Nagarahalli; +Cc: dev, david.marchand, nd

Hi lads,

> On Thu, Oct 08, 2020 at 08:44:13PM +0000, Honnappa Nagarahalli wrote:
> > <snip>
> >
> > >
> > > Hi Honnappa,
> > >
> > > From a quick walkthrough, I have some questions/comments, please see
> > > below.
> > Hi Olivier, appreciate your input.
> >
> > >
> > > On Tue, Oct 06, 2020 at 08:29:05AM -0500, Honnappa Nagarahalli wrote:
> > > > Add scatter gather APIs to avoid intermediate memcpy. Use cases that
> > > > involve copying large amount of data to/from the ring can benefit from
> > > > these APIs.
> > > >
> > > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > > ---
> > > >  lib/librte_ring/meson.build        |   3 +-
> > > >  lib/librte_ring/rte_ring_elem.h    |   1 +
> > > >  lib/librte_ring/rte_ring_peek_sg.h | 552
> > > > +++++++++++++++++++++++++++++
> > > >  3 files changed, 555 insertions(+), 1 deletion(-)  create mode 100644
> > > > lib/librte_ring/rte_ring_peek_sg.h
> > > >
> > > > diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> > > > index 31c0b4649..377694713 100644
> > > > --- a/lib/librte_ring/meson.build
> > > > +++ b/lib/librte_ring/meson.build
> > > > @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
> > > >  		'rte_ring_peek.h',
> > > >  		'rte_ring_peek_c11_mem.h',
> > > >  		'rte_ring_rts.h',
> > > > -		'rte_ring_rts_c11_mem.h')
> > > > +		'rte_ring_rts_c11_mem.h',
> > > > +		'rte_ring_peek_sg.h')
> > > > diff --git a/lib/librte_ring/rte_ring_elem.h
> > > > b/lib/librte_ring/rte_ring_elem.h index 938b398fc..7d3933f15 100644
> > > > --- a/lib/librte_ring/rte_ring_elem.h
> > > > +++ b/lib/librte_ring/rte_ring_elem.h
> > > > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r,
> > > > void *obj_table,
> > > >
> > > >  #ifdef ALLOW_EXPERIMENTAL_API
> > > >  #include <rte_ring_peek.h>
> > > > +#include <rte_ring_peek_sg.h>
> > > >  #endif
> > > >
> > > >  #include <rte_ring.h>
> > > > diff --git a/lib/librte_ring/rte_ring_peek_sg.h
> > > > b/lib/librte_ring/rte_ring_peek_sg.h
> > > > new file mode 100644
> > > > index 000000000..97d5764a6
> > > > --- /dev/null
> > > > +++ b/lib/librte_ring/rte_ring_peek_sg.h
> > > > @@ -0,0 +1,552 @@
> > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > + *
> > > > + * Copyright (c) 2020 Arm
> > > > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > > > + * All rights reserved.
> > > > + * Derived from FreeBSD's bufring.h
> > > > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > > > + */
> > > > +
> > > > +#ifndef _RTE_RING_PEEK_SG_H_
> > > > +#define _RTE_RING_PEEK_SG_H_
> > > > +
> > > > +/**
> > > > + * @file
> > > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > > + * It is not recommended to include this file directly.
> > > > + * Please include <rte_ring_elem.h> instead.
> > > > + *
> > > > + * Ring Peek Scatter Gather APIs
> > >
> > > I am not fully convinced by the API name. To me, "scatter/gather" is
> > > associated to iovecs, like for instance in [1]. The wikipedia definition [2] may
> > > be closer though.
> > >
> > > [1]
> > > https://www.gnu.org/software/libc/manual/html_node/Scatter_002dGather.html
> > > [2] https://en.wikipedia.org/wiki/Gather-scatter_(vector_addressing)
> > The way I understand scatter-gather is, the data to be sent to something (like a device) is coming from multiple sources. It would require
> > copying to put the data together to be contiguous. If the device supports scatter-gather, such copying is not required. The device can
> > collect data from multiple locations and make it contiguous.
> >
> > In the case I was looking at, one part of the data was coming from the user of the API and another was generated by the API itself. If
> > these two pieces of information have to be enqueued as a single object on the ring, I had to create an intermediate copy. But by exposing
> > the ring memory to the user, the intermediate copy is avoided. Hence I called it scatter-gather.
> >
> > >
> > > What about "zero-copy"?
> > I think no-copy (nc for short) or user-copy (uc for short) would convey the meaning better. These would indicate that the rte_ring APIs are
> > not copying the objects and it is left to the user to do the actual copy.
> >
> > >
> > > Also, the "peek" term looks also a bit confusing to me, but it existed before
> > > your patch. I understand it for dequeue, but not for enqueue.
> > I kept 'peek' there because the API still offers the 'peek' API capabilities. I am also not sure on what 'peek' means for enqueue API. The
> > enqueue 'peek' API was provided to be symmetric with dequeue peek API.
> >
> > >
> > > Or, what about replacing the existing experimental peek API by this one?
> > > They look quite similar to me.
> > I agree, scatter gather APIs provide the peek capability and the no-copy benefits.
> > Konstantin, any opinions here?

Sorry, I haven't had time to look at this RFC properly yet.
I'll try to do it next week; as I understand it, this is for 21.02 anyway?

> > >
> > > > + * Introduction of rte_ring with scatter gather serialized producer/consumer
> > > > + * (HTS sync mode) makes it possible to split public enqueue/dequeue API
> > > > + * into 3 phases:
> > > > + * - enqueue/dequeue start
> > > > + * - copy data to/from the ring
> > > > + * - enqueue/dequeue finish
> > > > + * Along with the advantages of the peek APIs, these APIs provide the ability
> > > > + * to avoid copying of the data to temporary area.
> > > > + *
> > > > + * Note that right now this new API is available only for two sync modes:
> > > > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> > > > + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> > > > + * It is a user responsibility to create/init ring with appropriate sync
> > > > + * modes selected.
> > > > + *
> > > > + * Example usage:
> > > > + * // read 1 elem from the ring:
> > >
> > > Comment should be "prepare enqueuing 32 objects"
> > >
> > > > + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> > > > + * if (n != 0) {
> > > > + *	//Copy objects in the ring
> > > > + *	memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> > > > + *	if (n != sgd->n1)
> > > > + *		//Second memcpy because of wrapround
> > > > + *		n2 = n - sgd->n1;
> > > > + *		memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));
> > >
> > > Missing { }
> > >
> > > > + *	rte_ring_dequeue_sg_finish(ring, n);
> > >
> > > Should be enqueue
> > >
> > Thanks, will correct them.
> >
> > > > + * }
> > > > + *
> > > > + * Note that between _start_ and _finish_ none other thread can proceed
> > > > + * with enqueue(/dequeue) operation till _finish_ completes.
> > > > + */
> > > > +
> > > > +#ifdef __cplusplus
> > > > +extern "C" {
> > > > +#endif
> > > > +
> > > > +#include <rte_ring_peek_c11_mem.h>
> > > > +
> > > > +/* Rock that needs to be passed between reserve and commit APIs */
> > > > +struct rte_ring_sg_data {
> > > > +	/* Pointer to the first space in the ring */
> > > > +	void **ptr1;
> > > > +	/* Pointer to the second space in the ring if there is wrap-around */
> > > > +	void **ptr2;
> > > > +	/* Number of elements in the first pointer. If this is equal to
> > > > +	 * the number of elements requested, then ptr2 is NULL.
> > > > +	 * Otherwise, subtracting n1 from number of elements requested
> > > > +	 * will give the number of elements available at ptr2.
> > > > +	 */
> > > > +	unsigned int n1;
> > > > +};
> > >
> > > Would it be possible to simply return the offset instead of this structure?
> > > The wrap could be managed by a __rte_ring_enqueue_elems() function.
> > Trying to use __rte_ring_enqueue_elems() will force temporary copy. See below.
> >
> > >
> > > I mean something like this:
> > >
> > > 	uint32_t start;
> > > 	n = rte_ring_enqueue_sg_bulk_start(ring, 32, &start, NULL);
> > > 	if (n != 0) {
> > > 		/* Copy objects in the ring. */
> > > 		__rte_ring_enqueue_elems(ring, start, obj, sizeof(uintptr_t),
> > > n);
> > For ex: 'obj' here is temporary copy.
> >
> > > 		rte_ring_enqueue_sg_finish(ring, n);
> > > 	}
> > >
> > > It would require to slightly change __rte_ring_enqueue_elems() to support
> > > to be called with prod_head >= size, and wrap in that case.
> > >
> > The alternate solution I can think of requires 3 things 1) the base address of the ring 2) Index to where to copy 3) the mask. With these 3
> > things one could write the code like below:
> > for (i = 0; i < n; i++) {
> > 	ring_addr[(index + i) & mask] = obj[i]; // ANDing with mask will take care of wrap-around.
> > }
> >
> > However, I think this does not allow for passing the address in the ring to another function/API to copy the data (It is possible, but the user
> > has to calculate the actual address, worry about the wrap-around, second pointer etc).
> >
> > The current approach hides some details and provides flexibility to the application to use the pointer the way it wants.
> 
> I agree that doing the access + masking manually looks too complex.
> 
> However I'm not sure to get why using __rte_ring_enqueue_elems() would
> result in an additional copy. I suppose the object that you want to
> enqueue is already stored somewhere?
> 
> For instance, let's say you have 10 objects to enqueue, located at
> different places:
> 
> 	void *obj_0_to_3 = <place where objects 0 to 3 are stored>;
> 	void *obj_4_to_7 = ...;
> 	void *obj_8_to_9 = ...;
> 	uint32_t start;
> 	n = rte_ring_enqueue_sg_bulk_start(ring, 10, &start, NULL);
> 	if (n != 0) {
> 		__rte_ring_enqueue_elems(ring, start, obj_0_to_3,
> 			sizeof(uintptr_t), 4);
> 		__rte_ring_enqueue_elems(ring, start + 4, obj_4_to_7,
> 			sizeof(uintptr_t), 4);
> 		__rte_ring_enqueue_elems(ring, start + 8, obj_8_to_9,
> 			sizeof(uintptr_t), 2);
> 		rte_ring_enqueue_sg_finish(ring, 10);
> 	}
> 


As I understand it, this is not about different objects stored in different
places. It is about the following:
a) The object is relatively big (16B+ ?).
b) The object is composed from values stored in a few different places.

Let's say you have:
struct elem_obj {uint64_t a; uint32_t b, c;};

And then you'd like to copy the 'a' value from one location, 'b' from a
second one, and 'c' from a third one.
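
A minimal, self-contained sketch of that case, assuming a toy
single-threaded ring. The names toy_ring, toy_enqueue_start(),
toy_enqueue_finish() and the get_*() sources are hypothetical and are not
rte_ring APIs. The element is composed field by field directly in its ring
slot, so no temporary struct elem_obj is needed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct elem_obj {uint64_t a; uint32_t b, c;};

#define TOY_RING_SIZE 8u	/* hypothetical size, power of two */

/* Toy ring: no atomics, no memory ordering, single thread only. */
struct toy_ring {
	uint32_t prod_head;	/* next slot to reserve */
	uint32_t prod_tail;	/* slots below this are visible to the consumer */
	uint32_t cons_tail;	/* slots below this have been consumed */
	struct elem_obj slots[TOY_RING_SIZE];
};

/* Reserve one slot and return its address inside the ring, NULL if full. */
static struct elem_obj *
toy_enqueue_start(struct toy_ring *r)
{
	assert(r->prod_head == r->prod_tail); /* one reservation at a time */
	if (r->prod_head - r->cons_tail == TOY_RING_SIZE)
		return NULL;
	return &r->slots[r->prod_head++ & (TOY_RING_SIZE - 1)];
}

/* Publish the reserved slot. */
static void
toy_enqueue_finish(struct toy_ring *r)
{
	r->prod_tail = r->prod_head;
}

/* Three hypothetical, unrelated sources for the fields. */
static uint64_t get_a(void) { return 0x1122334455667788ULL; }
static uint32_t get_b(void) { return 7; }
static uint32_t get_c(void) { return 9; }

/* Compose the element in place; returns 0 on success, -1 if ring is full. */
static int
toy_compose_in_place(struct toy_ring *r)
{
	struct elem_obj *obj = toy_enqueue_start(r);

	if (obj == NULL)
		return -1;
	obj->a = get_a();
	obj->b = get_b();
	obj->c = get_c();
	toy_enqueue_finish(r);
	return 0;
}
```

With a copying enqueue, the same three assignments would target a local
struct elem_obj first, and the enqueue call would then copy 16 bytes into
the ring.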

Konstantin




* Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
  2020-10-09  8:05           ` Ananyev, Konstantin
@ 2020-10-09 22:54             ` Honnappa Nagarahalli
  2020-10-12 17:06               ` Ananyev, Konstantin
  0 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-09 22:54 UTC (permalink / raw)
  To: Ananyev, Konstantin, Olivier Matz
  Cc: dev, david.marchand, nd, Honnappa Nagarahalli, nd

<snip>

> > > > Hi Honnappa,
> > > >
> > > > From a quick walkthrough, I have some questions/comments, please
> > > > see below.
> > > Hi Olivier, appreciate your input.
> > >
> > > >
> > > > On Tue, Oct 06, 2020 at 08:29:05AM -0500, Honnappa Nagarahalli wrote:
> > > > > Add scatter gather APIs to avoid intermediate memcpy. Use cases
> > > > > that involve copying large amount of data to/from the ring can
> > > > > benefit from these APIs.
> > > > >
> > > > > Signed-off-by: Honnappa Nagarahalli
> > > > > <honnappa.nagarahalli@arm.com>
> > > > > ---
> > > > >  lib/librte_ring/meson.build        |   3 +-
> > > > >  lib/librte_ring/rte_ring_elem.h    |   1 +
> > > > >  lib/librte_ring/rte_ring_peek_sg.h | 552
> > > > > +++++++++++++++++++++++++++++
> > > > >  3 files changed, 555 insertions(+), 1 deletion(-)  create mode
> > > > > 100644 lib/librte_ring/rte_ring_peek_sg.h
> > > > >
> > > > > diff --git a/lib/librte_ring/meson.build
> > > > > b/lib/librte_ring/meson.build index 31c0b4649..377694713 100644
> > > > > --- a/lib/librte_ring/meson.build
> > > > > +++ b/lib/librte_ring/meson.build
> > > > > @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
> > > > >  		'rte_ring_peek.h',
> > > > >  		'rte_ring_peek_c11_mem.h',
> > > > >  		'rte_ring_rts.h',
> > > > > -		'rte_ring_rts_c11_mem.h')
> > > > > +		'rte_ring_rts_c11_mem.h',
> > > > > +		'rte_ring_peek_sg.h')
> > > > > diff --git a/lib/librte_ring/rte_ring_elem.h
> > > > > b/lib/librte_ring/rte_ring_elem.h index 938b398fc..7d3933f15
> > > > > 100644
> > > > > --- a/lib/librte_ring/rte_ring_elem.h
> > > > > +++ b/lib/librte_ring/rte_ring_elem.h
> > > > > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct
> > > > > rte_ring *r, void *obj_table,
> > > > >
> > > > >  #ifdef ALLOW_EXPERIMENTAL_API
> > > > >  #include <rte_ring_peek.h>
> > > > > +#include <rte_ring_peek_sg.h>
> > > > >  #endif
> > > > >
> > > > >  #include <rte_ring.h>
> > > > > diff --git a/lib/librte_ring/rte_ring_peek_sg.h
> > > > > b/lib/librte_ring/rte_ring_peek_sg.h
> > > > > new file mode 100644
> > > > > index 000000000..97d5764a6
> > > > > --- /dev/null
> > > > > +++ b/lib/librte_ring/rte_ring_peek_sg.h
> > > > > @@ -0,0 +1,552 @@
> > > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > > + *
> > > > > + * Copyright (c) 2020 Arm
> > > > > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > > > > + * All rights reserved.
> > > > > + * Derived from FreeBSD's bufring.h
> > > > > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > > > > + */
> > > > > +
> > > > > +#ifndef _RTE_RING_PEEK_SG_H_
> > > > > +#define _RTE_RING_PEEK_SG_H_
> > > > > +
> > > > > +/**
> > > > > + * @file
> > > > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > > > + * It is not recommended to include this file directly.
> > > > > + * Please include <rte_ring_elem.h> instead.
> > > > > + *
> > > > > + * Ring Peek Scatter Gather APIs
> > > >
> > > > I am not fully convinced by the API name. To me, "scatter/gather"
> > > > is associated to iovecs, like for instance in [1]. The wikipedia
> > > > definition [2] may be closer though.
> > > >
> > > > [1]
> > > > https://www.gnu.org/software/libc/manual/html_node/Scatter_002dGather.html
> > > > [2]
> > > > https://en.wikipedia.org/wiki/Gather-scatter_(vector_addressing)
> > > The way I understand scatter-gather is, the data to be sent to
> > > something (like a device) is coming from multiple sources. It would
> > > require copying to put the data together to be contiguous. If the
> > > device supports scatter-gather, such copying is not required. The
> > > device can collect data from multiple locations and make it contiguous.
> > >
> > > In the case I was looking at, one part of the data was coming from
> > > the user of the API and another was generated by the API itself. If
> > > these two pieces of information have to be enqueued as a single object
> > > on the ring, I had to create an intermediate copy. But by exposing the
> > > ring memory to the user, the intermediate copy is avoided. Hence I
> > > called it scatter-gather.
> > >
> > > >
> > > > What about "zero-copy"?
> > > I think no-copy (nc for short) or user-copy (uc for short) would
> > > convey the meaning better. These would indicate that the rte_ring
> > > APIs are not copying the objects and it is left to the user to do
> > > the actual copy.
> > >
> > > >
> > > > Also, the "peek" term looks also a bit confusing to me, but it
> > > > existed before your patch. I understand it for dequeue, but not for
> > > > enqueue.
> > > I kept 'peek' there because the API still offers the 'peek' API
> > > capabilities. I am also not sure on what 'peek' means for enqueue
> > > API. The enqueue 'peek' API was provided to be symmetric with dequeue
> > > peek API.
> > >
> > > >
> > > > Or, what about replacing the existing experimental peek API by this one?
> > > > They look quite similar to me.
> > > I agree, scatter gather APIs provide the peek capability and the no-copy
> > > benefits.
> > > Konstantin, any opinions here?
> 
> Sorry, didn't have time yet, to look at this RFC properly.
> Will try to do it next week, as I understand that's for 21.02 anyway?
This is committed for 20.11. We should be able to get into RC2.

> 
> > > >
> > > > > + * Introduction of rte_ring with scatter gather serialized producer/consumer
> > > > > + * (HTS sync mode) makes it possible to split public enqueue/dequeue API
> > > > > + * into 3 phases:
> > > > > + * - enqueue/dequeue start
> > > > > + * - copy data to/from the ring
> > > > > + * - enqueue/dequeue finish
> > > > > + * Along with the advantages of the peek APIs, these APIs provide the ability
> > > > > + * to avoid copying of the data to temporary area.
> > > > > + *
> > > > > + * Note that right now this new API is available only for two sync modes:
> > > > > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> > > > > + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> > > > > + * It is a user responsibility to create/init ring with appropriate sync
> > > > > + * modes selected.
> > > > > + *
> > > > > + * Example usage:
> > > > > + * // read 1 elem from the ring:
> > > >
> > > > Comment should be "prepare enqueuing 32 objects"
> > > >
> > > > > + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> > > > > + * if (n != 0) {
> > > > > + *	//Copy objects in the ring
> > > > > + *	memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> > > > > + *	if (n != sgd->n1)
> > > > > + *		//Second memcpy because of wrapround
> > > > > + *		n2 = n - sgd->n1;
> > > > > + *		memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));
> > > >
> > > > Missing { }
> > > >
> > > > > + *	rte_ring_dequeue_sg_finish(ring, n);
> > > >
> > > > Should be enqueue
> > > >
> > > Thanks, will correct them.
> > >
> > > > > + * }
> > > > > + *
> > > > > + * Note that between _start_ and _finish_ none other thread can proceed
> > > > > + * with enqueue(/dequeue) operation till _finish_ completes.
> > > > > + */
> > > > > +
> > > > > +#ifdef __cplusplus
> > > > > +extern "C" {
> > > > > +#endif
> > > > > +
> > > > > +#include <rte_ring_peek_c11_mem.h>
> > > > > +
> > > > > +/* Rock that needs to be passed between reserve and commit APIs */
> > > > > +struct rte_ring_sg_data {
> > > > > +	/* Pointer to the first space in the ring */
> > > > > +	void **ptr1;
> > > > > +	/* Pointer to the second space in the ring if there is wrap-around */
> > > > > +	void **ptr2;
> > > > > +	/* Number of elements in the first pointer. If this is equal to
> > > > > +	 * the number of elements requested, then ptr2 is NULL.
> > > > > +	 * Otherwise, subtracting n1 from number of elements requested
> > > > > +	 * will give the number of elements available at ptr2.
> > > > > +	 */
> > > > > +	unsigned int n1;
> > > > > +};
> > > >
> > > > Would it be possible to simply return the offset instead of this structure?
> > > > The wrap could be managed by a __rte_ring_enqueue_elems() function.
> > > Trying to use __rte_ring_enqueue_elems() will force temporary copy.
> > > See below.
> > >
> > > >
> > > > I mean something like this:
> > > >
> > > > 	uint32_t start;
> > > > 	n = rte_ring_enqueue_sg_bulk_start(ring, 32, &start, NULL);
> > > > 	if (n != 0) {
> > > > 		/* Copy objects in the ring. */
> > > > 		__rte_ring_enqueue_elems(ring, start, obj, sizeof(uintptr_t),
> > > > n);
> > > For ex: 'obj' here is temporary copy.
> > >
> > > > 		rte_ring_enqueue_sg_finish(ring, n);
> > > > 	}
> > > >
> > > > It would require to slightly change __rte_ring_enqueue_elems() to
> > > > support to be called with prod_head >= size, and wrap in that case.
> > > >
> > > The alternate solution I can think of requires 3 things 1) the base
> > > address of the ring 2) Index to where to copy 3) the mask. With
> > > these 3 things one could write the code like below:
> > > for (i = 0; i < n; i++) {
> > > 	ring_addr[(index + i) & mask] = obj[i]; // ANDing with mask will take care of wrap-around.
> > > }
> > >
> > > However, I think this does not allow for passing the address in the
> > > ring to another function/API to copy the data (It is possible, but
> > > the user has to calculate the actual address, worry about the
> > > wrap-around, second pointer etc).
> > >
> > > The current approach hides some details and provides flexibility to the
> > > application to use the pointer the way it wants.
> >
> > I agree that doing the access + masking manually looks too complex.
> >
> > However I'm not sure to get why using __rte_ring_enqueue_elems() would
> > result in an additional copy. I suppose the object that you want to
> > enqueue is already stored somewhere?
I think this is the key. The object is not stored anywhere (yet); it is
being generated. When it is generated, it should be stored directly in the
ring. I have provided some examples below.

> >
> > For instance, let's say you have 10 objects to enqueue, located at
> > different places:
> >
> > 	void *obj_0_to_3 = <place where objects 0 to 3 are stored>;
> > 	void *obj_4_to_7 = ...;
> > 	void *obj_8_to_9 = ...;
> > 	uint32_t start;
> > 	n = rte_ring_enqueue_sg_bulk_start(ring, 10, &start, NULL);
> > 	if (n != 0) {
> > 		__rte_ring_enqueue_elems(ring, start, obj_0_to_3,
> > 			sizeof(uintptr_t), 4);
> > 		__rte_ring_enqueue_elems(ring, start + 4, obj_4_to_7,
> > 			sizeof(uintptr_t), 4);
> > 		__rte_ring_enqueue_elems(ring, start + 8, obj_8_to_9,
> > 			sizeof(uintptr_t), 2);
> > 		rte_ring_enqueue_sg_finish(ring, 10);
> > 	}
> >
> 
> 
> As I understand, It is not about different objects stored in different places, it
> is about:
> a) object is relatively big (16B+ ?)
> b) You compose objects from values stored in few different places.
> 
> Let say you have:
> struct elem_obj {uint64_t a; uint32_t b, c;};
> 
> And then you'd like to copy 'a' value from one location, 'b' from second, and
> 'c' from third one.
> 
> Konstantin
> 
I think there are multiple use cases. Some I have in mind are:

1)
Code without this patch:

struct rte_mbuf *pkts_burst[32];

/* Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS */

/* Pkt I/O core polls packets from the NIC, pkts_burst is the temporary store */
nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst, 32);
/* Provide packets to the packet processing cores */
rte_ring_enqueue_burst(ring, (void **)pkts_burst, nb_rx, &free_space);

Code with the patch:

/* Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS */

struct rte_ring_sg_data sgd;

/* Reserve space on the ring */
n = rte_ring_enqueue_sg_burst_start(ring, 32, &sgd, NULL);
/* Pkt I/O core polls packets from the NIC directly into the ring.
 * Only the first contiguous region (sgd.n1 slots) is filled here;
 * sgd.n1 == n unless the reservation wraps around.
 */
nb_rx = rte_eth_rx_burst(portid, queueid, (struct rte_mbuf **)sgd.ptr1, sgd.n1);
/* Provide packets to the packet processing cores */
/* Temporary storage 'pkts_burst' is not required */
rte_ring_enqueue_sg_finish(ring, nb_rx);


2) This is same/similar to what Konstantin mentioned

Code without this patch:

struct elem_obj {uint64_t a; uint32_t b, c;};
struct elem_obj obj;

/* Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS */

obj.a = rte_get_a();
obj.b = rte_get_b();
obj.c = rte_get_c();
/* obj is the temporary storage and results in memcpy in the following call */
rte_ring_enqueue_elem(ring, &obj, sizeof(struct elem_obj));

Code with the patch:

struct elem_obj *obj;
struct rte_ring_sg_data sgd;

/* Reserve space on the ring */
n = rte_ring_enqueue_sg_bulk_elem_start(ring, sizeof(struct elem_obj), 1, &sgd, NULL);
if (n != 0) {
	/* obj points into the ring; it is not temporary storage */
	obj = (struct elem_obj *)sgd.ptr1;
	obj->a = rte_get_a();
	obj->b = rte_get_b();
	obj->c = rte_get_c();
	rte_ring_enqueue_sg_elem_finish(ring, n);
}
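
The ptr1/n1/ptr2 convention used in the examples above can be illustrated
on a plain array. This is only a sketch under assumptions: toy_sg_data,
toy_get_addr() and toy_copy_in() are hypothetical helpers, the ring size is
a fixed power of two, and there is no synchronization. The first segment
runs from the reserved index to the end of the array, and the remainder, if
any, starts again at slot 0:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SIZE 8u			/* hypothetical ring size, power of two */
#define TOY_MASK (TOY_SIZE - 1)

/* Same layout idea as rte_ring_sg_data in the RFC. */
struct toy_sg_data {
	void **ptr1;		/* first contiguous region */
	void **ptr2;		/* second region after wrap-around, else NULL */
	unsigned int n1;	/* number of slots available at ptr1 */
};

/* Describe n reserved slots starting at head as one or two regions. */
static void
toy_get_addr(void **slots, uint32_t head, unsigned int n,
	struct toy_sg_data *sgd)
{
	uint32_t idx = head & TOY_MASK;

	assert(n <= TOY_SIZE);
	sgd->ptr1 = &slots[idx];
	sgd->ptr2 = NULL;
	sgd->n1 = n;
	if (idx + n > TOY_SIZE) {
		/* Slots idx .. TOY_SIZE - 1 fit before the wrap */
		sgd->n1 = TOY_SIZE - idx;
		sgd->ptr2 = &slots[0];
	}
}

/* Copy n object pointers into the reserved region(s). */
static void
toy_copy_in(const struct toy_sg_data *sgd, void * const *obj, unsigned int n)
{
	memcpy(sgd->ptr1, obj, sgd->n1 * sizeof(void *));
	if (n != sgd->n1)
		/* The second copy continues at obj[n1], not obj[n - n1] */
		memcpy(sgd->ptr2, &obj[sgd->n1], (n - sgd->n1) * sizeof(void *));
}
```

For example, reserving 4 slots at head 6 in an 8-slot ring gives n1 = 2 with
ptr1 at slot 6, and the remaining 2 slots at ptr2 (slot 0).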


* Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
  2020-10-06 13:29   ` [dpdk-dev] [RFC v2 1/1] " Honnappa Nagarahalli
  2020-10-07  8:27     ` Olivier Matz
@ 2020-10-12 16:20     ` Ananyev, Konstantin
  2020-10-12 22:31       ` Honnappa Nagarahalli
  1 sibling, 1 reply; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-12 16:20 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev; +Cc: olivier.matz, david.marchand, nd


> Add scatter gather APIs to avoid intermediate memcpy. Use cases
> that involve copying large amount of data to/from the ring
> can benefit from these APIs.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
>  lib/librte_ring/meson.build        |   3 +-
>  lib/librte_ring/rte_ring_elem.h    |   1 +
>  lib/librte_ring/rte_ring_peek_sg.h | 552 +++++++++++++++++++++++++++++
>  3 files changed, 555 insertions(+), 1 deletion(-)
>  create mode 100644 lib/librte_ring/rte_ring_peek_sg.h

As a general comment: the ring unit tests (both functional and performance)
need to be updated to test/measure this new API.

> 
> diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> index 31c0b4649..377694713 100644
> --- a/lib/librte_ring/meson.build
> +++ b/lib/librte_ring/meson.build
> @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
>  		'rte_ring_peek.h',
>  		'rte_ring_peek_c11_mem.h',
>  		'rte_ring_rts.h',
> -		'rte_ring_rts_c11_mem.h')
> +		'rte_ring_rts_c11_mem.h',
> +		'rte_ring_peek_sg.h')
> diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
> index 938b398fc..7d3933f15 100644
> --- a/lib/librte_ring/rte_ring_elem.h
> +++ b/lib/librte_ring/rte_ring_elem.h
> @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
> 
>  #ifdef ALLOW_EXPERIMENTAL_API
>  #include <rte_ring_peek.h>
> +#include <rte_ring_peek_sg.h>
>  #endif
> 
>  #include <rte_ring.h>
> diff --git a/lib/librte_ring/rte_ring_peek_sg.h b/lib/librte_ring/rte_ring_peek_sg.h
> new file mode 100644
> index 000000000..97d5764a6
> --- /dev/null
> +++ b/lib/librte_ring/rte_ring_peek_sg.h
> @@ -0,0 +1,552 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + *
> + * Copyright (c) 2020 Arm
> + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> + * All rights reserved.
> + * Derived from FreeBSD's bufring.h
> + * Used as BSD-3 Licensed with permission from Kip Macy.
> + */
> +
> +#ifndef _RTE_RING_PEEK_SG_H_
> +#define _RTE_RING_PEEK_SG_H_
> +
> +/**
> + * @file
> + * @b EXPERIMENTAL: this API may change without prior notice
> + * It is not recommended to include this file directly.
> + * Please include <rte_ring_elem.h> instead.
> + *
> + * Ring Peek Scatter Gather APIs
> + * Introduction of rte_ring with scatter gather serialized producer/consumer
> + * (HTS sync mode) makes it possible to split public enqueue/dequeue API
> + * into 3 phases:
> + * - enqueue/dequeue start
> + * - copy data to/from the ring
> + * - enqueue/dequeue finish
> + * Along with the advantages of the peek APIs, these APIs provide the ability
> + * to avoid copying of the data to temporary area.
> + *
> + * Note that right now this new API is available only for two sync modes:
> + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> + * It is a user responsibility to create/init ring with appropriate sync
> + * modes selected.
> + *
> + * Example usage:
> + * // read 1 elem from the ring:
> + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> + * if (n != 0) {
> + *	//Copy objects in the ring
> + *	memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> + *	if (n != sgd->n1)
> + *		//Second memcpy because of wrapround
> + *		n2 = n - sgd->n1;
> + *		memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));
> + *	rte_ring_dequeue_sg_finish(ring, n);

It is not clear from the example above why the SG (ZC) API is needed.
The existing peek API would be able to handle such a situation
(the copy would just be done internally). It is probably better to use the
examples you provided in your last reply to Olivier.

> + * }
> + *
> + * Note that between _start_ and _finish_ none other thread can proceed
> + * with enqueue(/dequeue) operation till _finish_ completes.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_ring_peek_c11_mem.h>
> +
> +/* Rock that needs to be passed between reserve and commit APIs */
> +struct rte_ring_sg_data {
> +	/* Pointer to the first space in the ring */
> +	void **ptr1;
> +	/* Pointer to the second space in the ring if there is wrap-around */
> +	void **ptr2;
> +	/* Number of elements in the first pointer. If this is equal to
> +	 * the number of elements requested, then ptr2 is NULL.
> +	 * Otherwise, subtracting n1 from number of elements requested
> +	 * will give the number of elements available at ptr2.
> +	 */
> +	unsigned int n1;
> +};

I wonder what the primary goal of this API is?
The reason I am asking: from what I understand, with this patch the ZC API
will work only for ST and HTS modes (same as the peek API).
However, I think it is possible to make it work for any sync model
by changing the API a bit: instead of returning sg_data to the user,
require the user to provide a function to read/write elems from/to the ring.
Just a schematic one, to illustrate the idea:

typedef void (*write_ring_func_t)(void *elem, /* pointer to first elem to update inside the ring */
				uint32_t num, /* number of elems to update */
				uint32_t esize,
				void *udata  /* caller provided data */);

unsigned int
rte_ring_enqueue_zc_bulk_elem(struct rte_ring *r, unsigned int esize,
	unsigned int n, unsigned int *free_space, write_ring_func_t wf, void *udata)
{
	struct rte_ring_sg_data sgd;
	.....
	n = move_head_tail(r, ...);

	/* get sgd data based on n */
	get_elem_addr(r, ..., &sgd);

	/* call user defined function to fill reserved elems */
	wf(sgd.ptr1, sgd.n1, esize, udata);
	if (n != sgd.n1)
		wf(sgd.ptr2, n - sgd.n1, esize, udata);

	....
	return n;
}

If we want a ZC peek API as well, some extra work needs to be done:
introducing a return value for write_ring_func() and checking it properly.
I don't see any big problems there either.
That way the ZC API can support all sync models, plus we don't need
to expose sg_data to the user directly.
Also, in future, we can probably de-duplicate the code by making
our non-ZC API use that one internally
(passing ring_enqueue_elems()/obj_table as parameters).
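
To make the callback idea concrete, here is a self-contained sketch on a toy
ring. Everything named toy_* is hypothetical and exists only for this
illustration; the toy has no atomics, so it says nothing about how the head
and tail would be moved in the MT or RTS modes. It shows only the shape of
the interface: the ring computes the one or two contiguous regions and
calls the user function for each, and the caller never sees the addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SIZE 8u			/* number of slots, power of two */
#define TOY_MASK (TOY_SIZE - 1)
#define TOY_ESIZE 4u			/* element size in bytes */

/* User supplied writer: fill num elements of esize bytes starting at elem. */
typedef void (*toy_write_func_t)(void *elem, uint32_t num, uint32_t esize,
	void *udata);

struct toy_ring {
	uint32_t prod;		/* producer index */
	uint32_t cons;		/* consumer index */
	uint8_t mem[TOY_SIZE * TOY_ESIZE];
};

/* Bulk (all or nothing) enqueue; returns n on success, 0 otherwise. */
static unsigned int
toy_enqueue_zc_bulk(struct toy_ring *r, unsigned int n, toy_write_func_t wf,
	void *udata)
{
	uint32_t idx, n1;

	if (n == 0 || n > TOY_SIZE - (r->prod - r->cons))
		return 0;

	idx = r->prod & TOY_MASK;
	n1 = (idx + n > TOY_SIZE) ? TOY_SIZE - idx : n;

	/* Let the user fill the reserved elements in place */
	wf(&r->mem[idx * TOY_ESIZE], n1, TOY_ESIZE, udata);
	if (n != n1)
		wf(&r->mem[0], n - n1, TOY_ESIZE, udata);

	r->prod += n;
	return n;
}

/* Example writer: stores consecutive 32-bit sequence numbers. */
static void
toy_fill_seq(void *elem, uint32_t num, uint32_t esize, void *udata)
{
	uint32_t *next = udata;
	uint8_t *p = elem;
	uint32_t i;

	assert(esize >= sizeof(*next));
	for (i = 0; i < num; i++) {
		memcpy(p + (size_t)i * esize, next, sizeof(*next));
		(*next)++;
	}
}
```

Note that the writer may be called twice for one enqueue, so any state it
needs across the wrap-around (here the running sequence number) has to live
in udata.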

> +
> +static __rte_always_inline void
> +__rte_ring_get_elem_addr_64(struct rte_ring *r, uint32_t head,
> +	uint32_t num, void **dst1, uint32_t *n1, void **dst2)
> +{
> +	uint32_t idx = head & r->mask;
> +	uint64_t *ring = (uint64_t *)&r[1];
> +
> +	*dst1 = ring + idx;
> +	*n1 = num;
> +
> +	if (idx + num > r->size) {
> +		*n1 = num - (r->size - idx - 1);
> +		*dst2 = ring;
> +	}
> +}
> +
> +static __rte_always_inline void
> +__rte_ring_get_elem_addr_128(struct rte_ring *r, uint32_t head,
> +	uint32_t num, void **dst1, uint32_t *n1, void **dst2)
> +{
> +	uint32_t idx = head & r->mask;
> +	rte_int128_t *ring = (rte_int128_t *)&r[1];
> +
> +	*dst1 = ring + idx;
> +	*n1 = num;
> +
> +	if (idx + num > r->size) {
> +		*n1 = num - (r->size - idx - 1);
> +		*dst2 = ring;
> +	}
> +}
> +
> +static __rte_always_inline void
> +__rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
> +	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void **dst2)
> +{
> +	if (esize == 8)
> +		__rte_ring_get_elem_addr_64(r, head,
> +						num, dst1, n1, dst2);
> +	else if (esize == 16)
> +		__rte_ring_get_elem_addr_128(r, head,
> +						num, dst1, n1, dst2);


I don't think we need this special handling for 8/16B sizes.
In all these functions esize is an input parameter.
If the user specifies it as a constant, the compiler will be able to
convert the multiply into shift and add ops.
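
As an illustration of that point, a single generic helper could look like
the sketch below. toy_get_elem_addr() and its parameter list are
hypothetical and are not the patch's function. Two assumptions are made
here: the ring storage is addressed in bytes, and on wrap-around the
first segment holds (size - idx) elems, which is not the expression used
in the quoted RFC code. When the helper is inlined and esize is a
compile-time constant, the idx * esize multiply is something the
compiler is free to strength-reduce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical generic helper: compute where 'num' elems starting at
 * 'head' live inside a ring of 'size' elems of 'esize' bytes each.
 * dst2 is NULL unless the range wraps past the end of the ring.
 */
static inline void
toy_get_elem_addr(void *ring_base, uint32_t size, uint32_t mask,
		uint32_t head, uint32_t esize, uint32_t num,
		void **dst1, uint32_t *n1, void **dst2)
{
	uint32_t idx = head & mask;
	uint8_t *base = ring_base;

	/* one byte-offset multiply, no per-size special cases */
	*dst1 = base + (size_t)idx * esize;
	*n1 = num;
	*dst2 = NULL;

	if (idx + num > size) {
		/* first segment runs to the end of the ring storage */
		*n1 = size - idx;
		*dst2 = base;
	}
}
```

A caller that passes sizeof(uint64_t) or 16 as esize gets the same code
path as any other element size.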

> +	else {
> +		uint32_t idx, scale, nr_idx;
> +		uint32_t *ring = (uint32_t *)&r[1];
> +
> +		/* Normalize to uint32_t */
> +		scale = esize / sizeof(uint32_t);
> +		idx = head & r->mask;
> +		nr_idx = idx * scale;
> +
> +		*dst1 = ring + nr_idx;
> +		*n1 = num;
> +
> +		if (idx + num > r->size) {
> +			*n1 = num - (r->size - idx - 1);
> +			*dst2 = ring;
> +		}
> +	}
> +}
> +
> +/**
> + * @internal This function moves prod head value.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_do_enqueue_sg_elem_start(struct rte_ring *r, unsigned int esize,
> +		uint32_t n, enum rte_ring_queue_behavior behavior,
> +		struct rte_ring_sg_data *sgd, unsigned int *free_space)
> +{
> +	uint32_t free, head, next;
> +
> +	switch (r->prod.sync_type) {
> +	case RTE_RING_SYNC_ST:
> +		n = __rte_ring_move_prod_head(r, RTE_RING_SYNC_ST, n,
> +			behavior, &head, &next, &free);
> +		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&sgd->ptr1,
> +			&sgd->n1, (void **)&sgd->ptr2);
> +		break;
> +	case RTE_RING_SYNC_MT_HTS:
> +		n = __rte_ring_hts_move_prod_head(r, n, behavior, &head, &free);
> +		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&sgd->ptr1,
> +			&sgd->n1, (void **)&sgd->ptr2);
> +		break;
> +	case RTE_RING_SYNC_MT:
> +	case RTE_RING_SYNC_MT_RTS:
> +	default:
> +		/* unsupported mode, shouldn't be here */
> +		RTE_ASSERT(0);
> +		n = 0;
> +		free = 0;
> +	}
> +
> +	if (free_space != NULL)
> +		*free_space = free - n;
> +	return n;
> +}
> +
> +/**
> + * Start to enqueue several objects on the ring.
> + * Note that no actual objects are put in the queue by this function,
> + * it just reserves space for the user on the ring.
> + * User has to copy objects into the queue using the returned pointers.
> + * User should call rte_ring_enqueue_sg_bulk_elem_finish to complete the
> + * enqueue operation.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param esize
> + *   The size of ring element, in bytes. It must be a multiple of 4.
> + * @param n
> + *   The number of objects to add in the ring.
> + * @param sgd
> + *   The scatter-gather data containing pointers for copying data.
> + * @param free_space
> + *   if non-NULL, returns the amount of space in the ring after the
> + *   reservation operation has finished.
> + * @return
> + *   The number of objects that can be enqueued, either 0 or n
> + */
> +__rte_experimental
> +static __rte_always_inline unsigned int
> +rte_ring_enqueue_sg_bulk_elem_start(struct rte_ring *r, unsigned int esize,
> +	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int *free_space)
> +{
> +	return __rte_ring_do_enqueue_sg_elem_start(r, esize, n,
> +			RTE_RING_QUEUE_FIXED, sgd, free_space);
> +}
> +
> +/**
> + * Start to enqueue several pointers to objects on the ring.
> + * Note that no actual pointers are put in the queue by this function,
> + * it just reserves space for the user on the ring.
> + * User has to copy pointers to objects into the queue using the
> + * returned pointers.
> + * User should call rte_ring_enqueue_sg_bulk_finish to complete the
> + * enqueue operation.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param n
> + *   The number of objects to add in the ring.
> + * @param sgd
> + *   The scatter-gather data containing pointers for copying data.
> + * @param free_space
> + *   if non-NULL, returns the amount of space in the ring after the
> + *   reservation operation has finished.
> + * @return
> + *   The number of objects that can be enqueued, either 0 or n
> + */
> +__rte_experimental
> +static __rte_always_inline unsigned int
> +rte_ring_enqueue_sg_bulk_start(struct rte_ring *r, unsigned int n,
> +	struct rte_ring_sg_data *sgd, unsigned int *free_space)
> +{
> +	return rte_ring_enqueue_sg_bulk_elem_start(r, sizeof(uintptr_t), n,
> +							sgd, free_space);
> +}
> +/**
> + * Start to enqueue several objects on the ring.
> + * Note that no actual objects are put in the queue by this function,
> + * it just reserves space for the user on the ring.
> + * User has to copy objects into the queue using the returned pointers.
> + * User should call rte_ring_enqueue_sg_bulk_elem_finish to complete the
> + * enqueue operation.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param esize
> + *   The size of ring element, in bytes. It must be a multiple of 4.
> + * @param n
> + *   The number of objects to add in the ring.
> + * @param sgd
> + *   The scatter-gather data containing pointers for copying data.
> + * @param free_space
> + *   if non-NULL, returns the amount of space in the ring after the
> + *   reservation operation has finished.
> + * @return
> + *   The number of objects that can be enqueued, either 0 or n
> + */
> +__rte_experimental
> +static __rte_always_inline unsigned int
> +rte_ring_enqueue_sg_burst_elem_start(struct rte_ring *r, unsigned int esize,
> +	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int *free_space)
> +{
> +	return __rte_ring_do_enqueue_sg_elem_start(r, esize, n,
> +			RTE_RING_QUEUE_VARIABLE, sgd, free_space);
> +}
> +
> +/**
> + * Start to enqueue several pointers to objects on the ring.
> + * Note that no actual pointers are put in the queue by this function,
> + * it just reserves space for the user on the ring.
> + * User has to copy pointers to objects into the queue using the
> + * returned pointers.
> + * User should call rte_ring_enqueue_sg_bulk_finish to complete the
> + * enqueue operation.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param n
> + *   The number of objects to add in the ring.
> + * @param sgd
> + *   The scatter-gather data containing pointers for copying data.
> + * @param free_space
> + *   if non-NULL, returns the amount of space in the ring after the
> + *   reservation operation has finished.
> + * @return
> + *   The number of objects that can be enqueued, either 0 or n
> + */
> +__rte_experimental
> +static __rte_always_inline unsigned int
> +rte_ring_enqueue_sg_burst_start(struct rte_ring *r, unsigned int n,
> +	struct rte_ring_sg_data *sgd, unsigned int *free_space)
> +{
> +	return rte_ring_enqueue_sg_burst_elem_start(r, sizeof(uintptr_t), n,
> +							sgd, free_space);
> +}
> +
> +/**
> + * Complete enqueuing several objects on the ring.
> + * Note that number of objects to enqueue should not exceed previous
> + * enqueue_start return value.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param n
> + *   The number of objects to add to the ring.
> + */
> +__rte_experimental
> +static __rte_always_inline void
> +rte_ring_enqueue_sg_elem_finish(struct rte_ring *r, unsigned int n)
> +{
> +	uint32_t tail;
> +
> +	switch (r->prod.sync_type) {
> +	case RTE_RING_SYNC_ST:
> +		n = __rte_ring_st_get_tail(&r->prod, &tail, n);
> +		__rte_ring_st_set_head_tail(&r->prod, tail, n, 1);
> +		break;
> +	case RTE_RING_SYNC_MT_HTS:
> +		n = __rte_ring_hts_get_tail(&r->hts_prod, &tail, n);
> +		__rte_ring_hts_set_head_tail(&r->hts_prod, tail, n, 1);
> +		break;
> +	case RTE_RING_SYNC_MT:
> +	case RTE_RING_SYNC_MT_RTS:
> +	default:
> +		/* unsupported mode, shouldn't be here */
> +		RTE_ASSERT(0);
> +	}
> +}
> +
> +/**
> + * Complete enqueuing several pointers to objects on the ring.
> + * Note that number of objects to enqueue should not exceed previous
> + * enqueue_start return value.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param n
> + *   The number of pointers to objects to add to the ring.
> + */
> +__rte_experimental
> +static __rte_always_inline void
> +rte_ring_enqueue_sg_finish(struct rte_ring *r, unsigned int n)
> +{
> +	rte_ring_enqueue_sg_elem_finish(r, n);
> +}
> +
> +/**
> + * @internal This function moves cons head value and copies up to *n*
> + * objects from the ring to the user provided obj_table.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_do_dequeue_sg_elem_start(struct rte_ring *r,
> +	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
> +	struct rte_ring_sg_data *sgd, unsigned int *available)
> +{
> +	uint32_t avail, head, next;
> +
> +	switch (r->cons.sync_type) {
> +	case RTE_RING_SYNC_ST:
> +		n = __rte_ring_move_cons_head(r, RTE_RING_SYNC_ST, n,
> +			behavior, &head, &next, &avail);
> +		__rte_ring_get_elem_addr(r, head, esize, n,
> +					sgd->ptr1, &sgd->n1, sgd->ptr2);
> +		break;
> +	case RTE_RING_SYNC_MT_HTS:
> +		n = __rte_ring_hts_move_cons_head(r, n, behavior,
> +			&head, &avail);
> +		__rte_ring_get_elem_addr(r, head, esize, n,
> +					sgd->ptr1, &sgd->n1, sgd->ptr2);
> +		break;
> +	case RTE_RING_SYNC_MT:
> +	case RTE_RING_SYNC_MT_RTS:
> +	default:
> +		/* unsupported mode, shouldn't be here */
> +		RTE_ASSERT(0);
> +		n = 0;
> +		avail = 0;
> +	}
> +
> +	if (available != NULL)
> +		*available = avail - n;
> +	return n;
> +}
> +
> +/**
> + * Start to dequeue several objects from the ring.
> + * Note that no actual objects are copied from the queue by this function.
> + * User has to copy objects from the queue using the returned pointers.
> + * User should call rte_ring_dequeue_sg_bulk_elem_finish to complete the
> + * dequeue operation.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param esize
> + *   The size of ring element, in bytes. It must be a multiple of 4.
> + * @param n
> + *   The number of objects to remove from the ring.
> + * @param sgd
> + *   The scatter-gather data containing pointers for copying data.
> + * @param available
> + *   If non-NULL, returns the number of remaining ring entries after the
> + *   dequeue has finished.
> + * @return
> + *   The number of objects that can be dequeued, either 0 or n
> + */
> +__rte_experimental
> +static __rte_always_inline unsigned int
> +rte_ring_dequeue_sg_bulk_elem_start(struct rte_ring *r, unsigned int esize,
> +	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int *available)
> +{
> +	return __rte_ring_do_dequeue_sg_elem_start(r, esize, n,
> +			RTE_RING_QUEUE_FIXED, sgd, available);
> +}
> +
> +/**
> + * Start to dequeue several pointers to objects from the ring.
> + * Note that no actual pointers are removed from the queue by this function.
> + * User has to copy pointers to objects from the queue using the
> + * returned pointers.
> + * User should call rte_ring_dequeue_sg_bulk_finish to complete the
> + * dequeue operation.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param n
> + *   The number of objects to remove from the ring.
> + * @param sgd
> + *   The scatter-gather data containing pointers for copying data.
> + * @param available
> + *   If non-NULL, returns the number of remaining ring entries after the
> + *   dequeue has finished.
> + * @return
> + *   The number of objects that can be dequeued, either 0 or n
> + */
> +__rte_experimental
> +static __rte_always_inline unsigned int
> +rte_ring_dequeue_sg_bulk_start(struct rte_ring *r, unsigned int n,
> +	struct rte_ring_sg_data *sgd, unsigned int *available)
> +{
> +	return rte_ring_dequeue_sg_bulk_elem_start(r, sizeof(uintptr_t),
> +		n, sgd, available);
> +}
> +
> +/**
> + * Start to dequeue several objects from the ring.
> + * Note that no actual objects are copied from the queue by this function.
> + * User has to copy objects from the queue using the returned pointers.
> + * User should call rte_ring_dequeue_sg_burst_elem_finish to complete the
> + * dequeue operation.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param esize
> + *   The size of ring element, in bytes. It must be a multiple of 4.
> + *   This must be the same value used while creating the ring. Otherwise
> + *   the results are undefined.
> + * @param n
> + *   The number of objects to dequeue from the ring.
> + * @param sgd
> + *   The scatter-gather data containing pointers for copying data.
> + * @param available
> + *   If non-NULL, returns the number of remaining ring entries after the
> + *   dequeue has finished.
> + * @return
> + *   The number of objects that can be dequeued, either 0 or n
> + */
> +__rte_experimental
> +static __rte_always_inline unsigned int
> +rte_ring_dequeue_sg_burst_elem_start(struct rte_ring *r, unsigned int esize,
> +	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int *available)
> +{
> +	return __rte_ring_do_dequeue_sg_elem_start(r, esize, n,
> +			RTE_RING_QUEUE_VARIABLE, sgd, available);
> +}
> +
> +/**
> + * Start to dequeue several pointers to objects from the ring.
> + * Note that no actual pointers are removed from the queue by this function.
> + * User has to copy pointers to objects from the queue using the
> + * returned pointers.
> + * User should call rte_ring_dequeue_sg_burst_finish to complete the
> + * dequeue operation.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param n
> + *   The number of objects to remove from the ring.
> + * @param sgd
> + *   The scatter-gather data containing pointers for copying data.
> + * @param available
> + *   If non-NULL, returns the number of remaining ring entries after the
> + *   dequeue has finished.
> + * @return
> + *   The number of objects that can be dequeued, either 0 or n
> + */
> +__rte_experimental
> +static __rte_always_inline unsigned int
> +rte_ring_dequeue_sg_burst_start(struct rte_ring *r, unsigned int n,
> +		struct rte_ring_sg_data *sgd, unsigned int *available)
> +{
> +	return rte_ring_dequeue_sg_burst_elem_start(r, sizeof(uintptr_t), n,
> +			sgd, available);
> +}
> +
> +/**
> + * Complete dequeuing several objects from the ring.
> + * Note that number of objects to dequeued should not exceed previous
> + * dequeue_start return value.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param n
> + *   The number of objects to remove from the ring.
> + */
> +__rte_experimental
> +static __rte_always_inline void
> +rte_ring_dequeue_sg_elem_finish(struct rte_ring *r, unsigned int n)
> +{
> +	uint32_t tail;
> +
> +	switch (r->cons.sync_type) {
> +	case RTE_RING_SYNC_ST:
> +		n = __rte_ring_st_get_tail(&r->cons, &tail, n);
> +		__rte_ring_st_set_head_tail(&r->cons, tail, n, 0);
> +		break;
> +	case RTE_RING_SYNC_MT_HTS:
> +		n = __rte_ring_hts_get_tail(&r->hts_cons, &tail, n);
> +		__rte_ring_hts_set_head_tail(&r->hts_cons, tail, n, 0);
> +		break;
> +	case RTE_RING_SYNC_MT:
> +	case RTE_RING_SYNC_MT_RTS:
> +	default:
> +		/* unsupported mode, shouldn't be here */
> +		RTE_ASSERT(0);
> +	}
> +}
> +
> +/**
> + * Complete dequeuing several objects from the ring.
> + * Note that number of objects to dequeued should not exceed previous
> + * dequeue_start return value.
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param n
> + *   The number of objects to remove from the ring.
> + */
> +__rte_experimental
> +static __rte_always_inline void
> +rte_ring_dequeue_sg_finish(struct rte_ring *r, unsigned int n)
> +{
> +	rte_ring_dequeue_elem_finish(r, n);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_RING_PEEK_SG_H_ */
> --
> 2.17.1



* Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
  2020-10-09 22:54             ` Honnappa Nagarahalli
@ 2020-10-12 17:06               ` Ananyev, Konstantin
  0 siblings, 0 replies; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-12 17:06 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Olivier Matz; +Cc: dev, david.marchand, nd, nd



 
> <snip>
> 
> > > > > Hi Honnappa,
> > > > >
> > > > > From a quick walkthrough, I have some questions/comments, please
> > > > > see below.
> > > > Hi Olivier, appreciate your input.
> > > >
> > > > >
> > > > > On Tue, Oct 06, 2020 at 08:29:05AM -0500, Honnappa Nagarahalli wrote:
> > > > > > Add scatter gather APIs to avoid intermediate memcpy. Use cases
> > > > > > that involve copying large amount of data to/from the ring can
> > > > > > benefit from these APIs.
> > > > > >
> > > > > > Signed-off-by: Honnappa Nagarahalli
> > > > > > <honnappa.nagarahalli@arm.com>
> > > > > > ---
> > > > > >  lib/librte_ring/meson.build        |   3 +-
> > > > > >  lib/librte_ring/rte_ring_elem.h    |   1 +
> > > > > >  lib/librte_ring/rte_ring_peek_sg.h | 552
> > > > > > +++++++++++++++++++++++++++++
> > > > > >  3 files changed, 555 insertions(+), 1 deletion(-)  create mode
> > > > > > 100644 lib/librte_ring/rte_ring_peek_sg.h
> > > > > >
> > > > > > diff --git a/lib/librte_ring/meson.build
> > > > > > b/lib/librte_ring/meson.build index 31c0b4649..377694713 100644
> > > > > > --- a/lib/librte_ring/meson.build
> > > > > > +++ b/lib/librte_ring/meson.build
> > > > > > @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
> > > > > >  		'rte_ring_peek.h',
> > > > > >  		'rte_ring_peek_c11_mem.h',
> > > > > >  		'rte_ring_rts.h',
> > > > > > -		'rte_ring_rts_c11_mem.h')
> > > > > > +		'rte_ring_rts_c11_mem.h',
> > > > > > +		'rte_ring_peek_sg.h')
> > > > > > diff --git a/lib/librte_ring/rte_ring_elem.h
> > > > > > b/lib/librte_ring/rte_ring_elem.h index 938b398fc..7d3933f15
> > > > > > 100644
> > > > > > --- a/lib/librte_ring/rte_ring_elem.h
> > > > > > +++ b/lib/librte_ring/rte_ring_elem.h
> > > > > > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct
> > > > > > rte_ring *r, void *obj_table,
> > > > > >
> > > > > >  #ifdef ALLOW_EXPERIMENTAL_API
> > > > > >  #include <rte_ring_peek.h>
> > > > > > +#include <rte_ring_peek_sg.h>
> > > > > >  #endif
> > > > > >
> > > > > >  #include <rte_ring.h>
> > > > > > diff --git a/lib/librte_ring/rte_ring_peek_sg.h
> > > > > > b/lib/librte_ring/rte_ring_peek_sg.h
> > > > > > new file mode 100644
> > > > > > index 000000000..97d5764a6
> > > > > > --- /dev/null
> > > > > > +++ b/lib/librte_ring/rte_ring_peek_sg.h
> > > > > > @@ -0,0 +1,552 @@
> > > > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > > > + *
> > > > > > + * Copyright (c) 2020 Arm
> > > > > > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > > > > > + * All rights reserved.
> > > > > > + * Derived from FreeBSD's bufring.h
> > > > > > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > > > > > + */
> > > > > > +
> > > > > > +#ifndef _RTE_RING_PEEK_SG_H_
> > > > > > +#define _RTE_RING_PEEK_SG_H_
> > > > > > +
> > > > > > +/**
> > > > > > + * @file
> > > > > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > > > > + * It is not recommended to include this file directly.
> > > > > > + * Please include <rte_ring_elem.h> instead.
> > > > > > + *
> > > > > > + * Ring Peek Scatter Gather APIs
> > > > >
> > > > > I am not fully convinced by the API name. To me, "scatter/gather"
> > > > > is associated to iovecs, like for instance in [1]. The wikipedia
> > > > > definition [2] may be closer though.
> > > > >
> > > > > [1]
> > > > > https://www.gnu.org/software/libc/manual/html_node/Scatter_002dGather.html
> > > > > [2]
> > > > > https://en.wikipedia.org/wiki/Gather-scatter_(vector_addressing)
> > > > The way I understand scatter-gather is, the data to be sent to
> > > > something (like a device) is coming from multiple sources. It would
> > > > require copying to put the data together to be contiguous. If the
> > > > device supports scatter-gather, such copying is not required. The
> > > > device can collect data from multiple locations and make it contiguous.
> > > >
> > > > In the case I was looking at, one part of the data was coming from
> > > > the user of the API and another was generated by the API itself. If
> > > > these two pieces of information have to be enqueued as a single object
> > > > on the ring, I had to create an intermediate copy. But by exposing the
> > > > ring memory to the user, the intermediate copy is avoided. Hence I
> > > > called it scatter-gather.
> > > >
> > > > >
> > > > > What about "zero-copy"?
> > > > I think no-copy (nc for short) or user-copy (uc for short) would
> > > > convey the meaning better. These would indicate that the rte_ring
> > > > APIs are not copying the objects and it is left to the user to do
> > > > the actual copy.


+1 for _ZC_ in naming.
_NC_ is probably ok too, but sounds really strange to me.

> > > >
> > > > >
> > > > > Also, the "peek" term looks a bit confusing to me, but it
> > > > > existed before your patch. I understand it for dequeue, but not for
> > > > > enqueue.
> > > > I kept 'peek' there because the API still offers the 'peek' API
> > > > capabilities. I am also not sure what 'peek' means for the enqueue
> > > > API. The enqueue 'peek' API was provided to be symmetric with the
> > > > dequeue peek API.
> > > >
> > > > >
> > > > > Or, what about replacing the existing experimental peek API by this one?
> > > > > They look quite similar to me.
> > > > I agree, scatter gather APIs provide the peek capability and the no-copy
> > benefits.
> > > > Konstantin, any opinions here?

I am still not very comfortable with an API that allows users to access
elem locations directly. I do understand that it could be beneficial in some
special cases (you provided some good examples below), so I don't object to
having it as an add-on.
But I still think it shouldn't be the _only_ API.

> >
> > Sorry, didn't have time yet, to look at this RFC properly.
> > Will try to do it next week, as I understand that's for 21.02 anyway?
> This is committed for 20.11. We should be able to get into RC2.

Sounds really tight..., but ok, let's see how it goes.
 
> >
> > > > >
> > > > > > + * Introduction of rte_ring with scatter gather serialized
> > > > > > + producer/consumer
> > > > > > + * (HTS sync mode) makes it possible to split public
> > > > > > + enqueue/dequeue API
> > > > > > + * into 3 phases:
> > > > > > + * - enqueue/dequeue start
> > > > > > + * - copy data to/from the ring
> > > > > > + * - enqueue/dequeue finish
> > > > > > + * Along with the advantages of the peek APIs, these APIs
> > > > > > + provide the ability
> > > > > > + * to avoid copying of the data to temporary area.
> > > > > > + *
> > > > > > + * Note that right now this new API is available only for two sync
> > modes:
> > > > > > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> > > > > > + * 2) Serialized Producer/Serialized Consumer
> > (RTE_RING_SYNC_MT_HTS).
> > > > > > + * It is a user responsibility to create/init ring with
> > > > > > + appropriate sync
> > > > > > + * modes selected.
> > > > > > + *
> > > > > > + * Example usage:
> > > > > > + * // read 1 elem from the ring:
> > > > >
> > > > > Comment should be "prepare enqueuing 32 objects"
> > > > >
> > > > > > + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> > > > > > + * if (n != 0) {
> > > > > > + *	//Copy objects in the ring
> > > > > > + *	memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> > > > > > + *	if (n != sgd->n1)
> > > > > > + *		//Second memcpy because of wrapround
> > > > > > + *		n2 = n - sgd->n1;
> > > > > > + *		memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));
> > > > >
> > > > > Missing { }
> > > > >
> > > > > > + *	rte_ring_dequeue_sg_finish(ring, n);
> > > > >
> > > > > Should be enqueue
> > > > >
> > > > Thanks, will correct them.
> > > >
> > > > > > + * }
> > > > > > + *
> > > > > > + * Note that between _start_ and _finish_ none other thread can
> > > > > > + proceed
> > > > > > + * with enqueue(/dequeue) operation till _finish_ completes.
> > > > > > + */
> > > > > > +
> > > > > > +#ifdef __cplusplus
> > > > > > +extern "C" {
> > > > > > +#endif
> > > > > > +
> > > > > > +#include <rte_ring_peek_c11_mem.h>
> > > > > > +
> > > > > > +/* Rock that needs to be passed between reserve and commit APIs
> > > > > > +*/ struct rte_ring_sg_data {
> > > > > > +	/* Pointer to the first space in the ring */
> > > > > > +	void **ptr1;
> > > > > > +	/* Pointer to the second space in the ring if there is wrap-
> > around */
> > > > > > +	void **ptr2;
> > > > > > +	/* Number of elements in the first pointer. If this is equal to
> > > > > > +	 * the number of elements requested, then ptr2 is NULL.
> > > > > > +	 * Otherwise, subtracting n1 from number of elements
> > requested
> > > > > > +	 * will give the number of elements available at ptr2.
> > > > > > +	 */
> > > > > > +	unsigned int n1;
> > > > > > +};
> > > > >
> > > > > Would it be possible to simply return the offset instead of this structure?
> > > > > The wrap could be managed by a __rte_ring_enqueue_elems()
> > function.
> > > > Trying to use __rte_ring_enqueue_elems() will force temporary copy.
> > See below.
> > > >
> > > > >
> > > > > I mean something like this:
> > > > >
> > > > > 	uint32_t start;
> > > > > 	n = rte_ring_enqueue_sg_bulk_start(ring, 32, &start, NULL);
> > > > > 	if (n != 0) {
> > > > > 		/* Copy objects in the ring. */
> > > > > 		__rte_ring_enqueue_elems(ring, start, obj, sizeof(uintptr_t),
> > > > > n);
> > > > For ex: 'obj' here is temporary copy.
> > > >
> > > > > 		rte_ring_enqueue_sg_finish(ring, n);
> > > > > 	}
> > > > >
> > > > > It would require to slightly change __rte_ring_enqueue_elems() to
> > > > > support to be called with prod_head >= size, and wrap in that case.
> > > > >
> > > > The alternate solution I can think of requires 3 things: 1) the base
> > > > address of the ring, 2) the index to copy to, 3) the mask. With
> > > > these 3 things one could write the code like below:
> > > > for (i = 0; i < n; i++) {
> > > > 	/* ANDing with mask takes care of wrap-around */
> > > > 	ring_addr[(index + i) & mask] = obj[i];
> > > > }
> > > >
> > > > However, I think this does not allow for passing the address in the
> > > > ring to another function/API to copy the data (it is possible, but
> > > > the user has to calculate the actual address, worry about the
> > > > wrap-around, second pointer etc).
> > > >
> > > > The current approach hides some details and provides flexibility to
> > > > the application to use the pointer the way it wants.
> > >
> > > I agree that doing the access + masking manually looks too complex.
> > >
> > > However I'm not sure to get why using __rte_ring_enqueue_elems()
> > would
> > > result in an additional copy. I suppose the object that you want to
> > > enqueue is already stored somewhere?
> I think this is the key. The object is not stored anywhere (yet), it is
> being generated. When it is generated, it should get stored directly into
> the ring. I have provided some examples below.
> 
> > >
> > > For instance, let's say you have 10 objects to enqueue, located at
> > > different places:
> > >
> > > 	void *obj_0_to_3 = <place where objects 0 to 3 are stored>;
> > > 	void *obj_4_to_7 = ...;
> > > 	void *obj_8_to_9 = ...;
> > > 	uint32_t start;
> > > 	n = rte_ring_enqueue_sg_bulk_start(ring, 10, &start, NULL);
> > > 	if (n != 0) {
> > > 		__rte_ring_enqueue_elems(ring, start, obj_0_to_3,
> > > 			sizeof(uintptr_t), 4);
> > > 		__rte_ring_enqueue_elems(ring, start + 4, obj_4_to_7,
> > > 			sizeof(uintptr_t), 4);
> > > 		__rte_ring_enqueue_elems(ring, start + 8, obj_8_to_9,
> > > 			sizeof(uintptr_t), 2);
> > > 		rte_ring_enqueue_sg_finish(ring, 10);
> > > 	}
> > >
> >
> >
> > As I understand, It is not about different objects stored in different places, it
> > is about:
> > a) object is relatively big (16B+ ?)
> > b) You compose objects from values stored in few different places.
> >
> > Let say you have:
> > struct elem_obj {uint64_t a; uint32_t b, c;};
> >
> > And then you'd like to copy 'a' value from one location, 'b' from second, and
> > 'c' from third one.
> >
> > Konstantin
> >
> I think there are multiple use cases. Some I have in mind are:
> 
> 1)
> Code without this patch:
> 
> struct rte_mbuf *pkts_burst[32];
> 
> /* Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS */
> 
> /* Pkt I/O core polls packets from the NIC, pkts_burst is the temporary store */
> nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst, 32);
> /* Provide packets to the packet processing cores */
> rte_ring_enqueue_burst(ring, pkts_burst, 32, &free_space);
> 
> Code with the patch:
> 
> /* Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS */
> 
> /* Reserve space on the ring */
> n = rte_ring_enqueue_sg_burst_start(ring, 32, &sgd, NULL);
> /* Pkt I/O core polls packets from the NIC */
> if (n == 32)
> 	nb_rx = rte_eth_rx_burst(portid, queueid, sgd.ptr1, 32);
> else
> 	nb_rx = rte_eth_rx_burst(portid, queueid, sgd.ptr1, sgd.n1);
> /* Provide packets to the packet processing cores */
> /* Temporary storage 'pkts_burst' is not required */
> rte_ring_enqueue_sg_finish(ring, nb_rx);
> 
> 
> 2) This is same/similar to what Konstantin mentioned
> 
> Code without this patch:
> 
> struct elem_obj {uint64_t a; uint32_t b, c;};
> struct elem_obj obj;
> 
> /* Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS */
> 
> obj.a = rte_get_a();
> obj.b = rte_get_b();
> obj.c = rte_get_c();
> /* obj is the temporary storage and results in memcpy in the following call */
> rte_ring_enqueue_elem(ring, sizeof(struct elem_obj), 1, &obj, NULL);
> 
> Code with the patch:
> 
> struct elem_obj *obj;
> /* Reserve space on the ring */
> n = rte_ring_enqueue_sg_bulk_elem_start(ring, sizeof(struct elem_obj), 1, &sgd, NULL);
> 
> obj = (struct elem_obj *)sgd.ptr1;
> obj->a = rte_get_a();
> obj->b = rte_get_b();
> obj->c = rte_get_c();
> /* obj is not a temporary storage */
> rte_ring_enqueue_sg_elem_finish(ring, n);
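
The second use case quoted above (composing an element field by field
directly in ring memory) can be sketched on a toy ring to show the
start/fill/finish ordering. This is only an illustration under stated
assumptions: struct toy_ring, toy_enqueue_start(), toy_enqueue_finish()
and toy_count() are hypothetical, single-threaded stand-ins and not the
rte_ring functions from the patch; only struct elem_obj is taken from the
example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct elem_obj { uint64_t a; uint32_t b, c; };

/* Hypothetical toy ring (NOT struct rte_ring): one thread, 4 slots. */
struct toy_ring {
	uint32_t size;       /* number of slots, power of 2 */
	uint32_t mask;       /* size - 1 */
	uint32_t prod_head;  /* slots reserved by the producer */
	uint32_t prod_tail;  /* slots published to the consumer */
	uint32_t cons_tail;  /* slots released by the consumer */
	struct elem_obj slots[4];
};

/* Number of elems currently visible to the consumer. */
static uint32_t
toy_count(const struct toy_ring *r)
{
	return r->prod_tail - r->cons_tail;
}

/* Reserve one slot and hand out its address; NULL if the ring is full. */
static struct elem_obj *
toy_enqueue_start(struct toy_ring *r)
{
	if (r->prod_head - r->cons_tail == r->size)
		return NULL;
	return &r->slots[r->prod_head++ & r->mask];
}

/* Publish everything reserved so far. */
static void
toy_enqueue_finish(struct toy_ring *r)
{
	r->prod_tail = r->prod_head;
}
```

Between start and finish the slot is reserved but not yet visible to the
consumer, so the fields can be written one at a time without a temporary
struct elem_obj on the stack.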


* Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
  2020-10-12 16:20     ` Ananyev, Konstantin
@ 2020-10-12 22:31       ` Honnappa Nagarahalli
  2020-10-13 11:38         ` Ananyev, Konstantin
  0 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-12 22:31 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev
  Cc: olivier.matz, david.marchand, nd, Honnappa Nagarahalli, nd

Hi Konstantin,
	Appreciate your feedback.

<snip>

> 
> 
> > Add scatter gather APIs to avoid intermediate memcpy. Use cases that
> > involve copying large amount of data to/from the ring can benefit from
> > these APIs.
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > ---
> >  lib/librte_ring/meson.build        |   3 +-
> >  lib/librte_ring/rte_ring_elem.h    |   1 +
> >  lib/librte_ring/rte_ring_peek_sg.h | 552
> > +++++++++++++++++++++++++++++
> >  3 files changed, 555 insertions(+), 1 deletion(-)  create mode 100644
> > lib/librte_ring/rte_ring_peek_sg.h
> 
> As a generic one - need to update ring UT both func and perf to
> test/measure this new API.
Yes, will add.

> 
> >
> > diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> > index 31c0b4649..377694713 100644
> > --- a/lib/librte_ring/meson.build
> > +++ b/lib/librte_ring/meson.build
> > @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
> >  		'rte_ring_peek.h',
> >  		'rte_ring_peek_c11_mem.h',
> >  		'rte_ring_rts.h',
> > -		'rte_ring_rts_c11_mem.h')
> > +		'rte_ring_rts_c11_mem.h',
> > +		'rte_ring_peek_sg.h')
> > diff --git a/lib/librte_ring/rte_ring_elem.h
> > b/lib/librte_ring/rte_ring_elem.h index 938b398fc..7d3933f15 100644
> > --- a/lib/librte_ring/rte_ring_elem.h
> > +++ b/lib/librte_ring/rte_ring_elem.h
> > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r,
> > void *obj_table,
> >
> >  #ifdef ALLOW_EXPERIMENTAL_API
> >  #include <rte_ring_peek.h>
> > +#include <rte_ring_peek_sg.h>
> >  #endif
> >
> >  #include <rte_ring.h>
> > diff --git a/lib/librte_ring/rte_ring_peek_sg.h
> > b/lib/librte_ring/rte_ring_peek_sg.h
> > new file mode 100644
> > index 000000000..97d5764a6
> > --- /dev/null
> > +++ b/lib/librte_ring/rte_ring_peek_sg.h
> > @@ -0,0 +1,552 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + *
> > + * Copyright (c) 2020 Arm
> > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > + * All rights reserved.
> > + * Derived from FreeBSD's bufring.h
> > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > + */
> > +
> > +#ifndef _RTE_RING_PEEK_SG_H_
> > +#define _RTE_RING_PEEK_SG_H_
> > +
> > +/**
> > + * @file
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + * It is not recommended to include this file directly.
> > + * Please include <rte_ring_elem.h> instead.
> > + *
> > + * Ring Peek Scatter Gather APIs
> > + * Introduction of rte_ring with scatter gather serialized
> > +producer/consumer
> > + * (HTS sync mode) makes it possible to split public enqueue/dequeue
> > +API
> > + * into 3 phases:
> > + * - enqueue/dequeue start
> > + * - copy data to/from the ring
> > + * - enqueue/dequeue finish
> > + * Along with the advantages of the peek APIs, these APIs provide the
> > +ability
> > + * to avoid copying of the data to temporary area.
> > + *
> > + * Note that right now this new API is available only for two sync modes:
> > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> > + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> > + * It is a user responsibility to create/init ring with appropriate
> > +sync
> > + * modes selected.
> > + *
> > + * Example usage:
> > + * // read 1 elem from the ring:
> > + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> > + * if (n != 0) {
> > + *	//Copy objects in the ring
> > + *	memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> > + *	if (n != sgd->n1)
> > + *		//Second memcpy because of wrapround
> > + *		n2 = n - sgd->n1;
> > + *		memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));
> > + *	rte_ring_dequeue_sg_finish(ring, n);
> 
> It is not clear from the example above why do you need SG(ZC) API.
> Existing peek API would be able to handle such situation (just copy will be
> done internally). Probably better to use examples you provided in your last
> reply to Olivier.
Agree, not a good example, will change it.

> 
> > + * }
> > + *
> > + * Note that between _start_ and _finish_ none other thread can
> > + proceed
> > + * with enqueue(/dequeue) operation till _finish_ completes.
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include <rte_ring_peek_c11_mem.h>
> > +
> > +/* Rock that needs to be passed between reserve and commit APIs */
> > +struct rte_ring_sg_data {
> > +	/* Pointer to the first space in the ring */
> > +	void **ptr1;
> > +	/* Pointer to the second space in the ring if there is wrap-around */
> > +	void **ptr2;
> > +	/* Number of elements in the first pointer. If this is equal to
> > +	 * the number of elements requested, then ptr2 is NULL.
> > +	 * Otherwise, subtracting n1 from number of elements requested
> > +	 * will give the number of elements available at ptr2.
> > +	 */
> > +	unsigned int n1;
> > +};
> 
> I wonder what is the primary goal of that API?
> The reason I am asking: from what I understand with this patch ZC API will
> work only for ST and HTS modes (same as peek API).
> Though, I think it is possible to make it work for any sync model, by changing
Agree, the functionality can be extended to other modes as well. I added these 2 modes as I found the use cases for these.

> API a bit: instead of returning sg_data to the user, force him to provide
> function to read/write elems from/to the ring.
> Just a schematic one, to illustrate the idea:
> 
> typedef void (*write_ring_func_t)(void *elem, /*pointer to first elem to
> update inside the ring*/
> 				uint32_t num, /* number of elems to update
> */
> 				uint32_t esize,
> 				void *udata  /* caller provide data */);
> 
> rte_ring_enqueue_zc_bulk_elem(struct rte_ring *r, unsigned int esize,
> 	unsigned int n, unsigned int *free_space, write_ring_func_t wf, void
> *udata) {
> 	struct rte_ring_sg_data sgd;
> 	.....
> 	n = move_head_tail(r, ...);
> 
> 	/* get sgd data based on n */
> 	get_elem_addr(r, ..., &sgd);
> 
> 	/* call user defined function to fill reserved elems */
> 	wf(sgd.p1, sgd.n1, esize, udata);
> 	if (n != n1)
> 		wf(sgd.p2, sgd.n2, esize, udata);
> 
> 	....
> 	return n;
> }
> 
I think the callback function makes the API difficult to use. The callback would be a wrapper around another function or API that has its own arguments, and all of those arguments would have to be passed through 'udata'. For example, in the 2nd example that I provided earlier, the user has to create a wrapper around the 'rte_eth_rx_burst' API and then provide the parameters to 'rte_eth_rx_burst' through 'udata'. 'udata' would need a structure definition as well.

> If we want ZC peek API also - some extra work need to be done with
> introducing return value for write_ring_func() and checking it properly, but I
> don't see any big problems here too.
> That way ZC API can support all sync models, plus we don't need to expose
> sg_data to the user directly.
Other modes can be supported with the method used in this patch as well. If you see a need, I can add them.
IMO, the only issue with exposing sg_data is ABI compatibility in the future. I think we can align 'struct rte_ring_sg_data' to a cache line boundary, which should make it possible to extend the structure later without affecting ABI compatibility.
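
A minimal sketch of the alignment idea, assuming a 64-byte cache line and C11. The 'toy_' names and the extra fields in the second layout are hypothetical and only illustrate that padding to a cache line leaves room to append fields without changing the size, alignment or existing offsets:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_LINE 64 /* assumed value; DPDK builds use RTE_CACHE_LINE_SIZE */

/* Layout as proposed in the patch: two pointers and a count. */
struct toy_sg_data_v1 {
	_Alignas(TOY_CACHE_LINE) void **ptr1;
	void **ptr2;
	unsigned int n1;
};

/* Hypothetical later revision with fields appended into the padding. */
struct toy_sg_data_v2 {
	_Alignas(TOY_CACHE_LINE) void **ptr1;
	void **ptr2;
	unsigned int n1;
	unsigned int n2;    /* hypothetical */
	unsigned int flags; /* hypothetical */
};

/* Size and existing field offsets stay the same across revisions. */
_Static_assert(sizeof(struct toy_sg_data_v1) == TOY_CACHE_LINE,
	"v1 occupies exactly one cache line");
_Static_assert(sizeof(struct toy_sg_data_v2) == sizeof(struct toy_sg_data_v1),
	"appending fields does not change the size");
_Static_assert(offsetof(struct toy_sg_data_v2, n1) ==
	offsetof(struct toy_sg_data_v1, n1),
	"existing field offsets are unchanged");
```

This only covers layout. A caller built against the older layout would not know about the appended fields, so their meaning when left unset would still need to be defined.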

> Also, in future, we probably can de-duplicate the code by making our non-ZC
> API to use that one internally (pass ring_enqueue_elems()/ob_table as a
> parameters).
> 
> > +
> > +static __rte_always_inline void
> > +__rte_ring_get_elem_addr_64(struct rte_ring *r, uint32_t head,
> > +	uint32_t num, void **dst1, uint32_t *n1, void **dst2) {
> > +	uint32_t idx = head & r->mask;
> > +	uint64_t *ring = (uint64_t *)&r[1];
> > +
> > +	*dst1 = ring + idx;
> > +	*n1 = num;
> > +
> > +	if (idx + num > r->size) {
> > +		*n1 = num - (r->size - idx - 1);
> > +		*dst2 = ring;
> > +	}
> > +}
> > +
> > +static __rte_always_inline void
> > +__rte_ring_get_elem_addr_128(struct rte_ring *r, uint32_t head,
> > +	uint32_t num, void **dst1, uint32_t *n1, void **dst2) {
> > +	uint32_t idx = head & r->mask;
> > +	rte_int128_t *ring = (rte_int128_t *)&r[1];
> > +
> > +	*dst1 = ring + idx;
> > +	*n1 = num;
> > +
> > +	if (idx + num > r->size) {
> > +		*n1 = num - (r->size - idx - 1);
> > +		*dst2 = ring;
> > +	}
> > +}
> > +
> > +static __rte_always_inline void
> > +__rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
> > +	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void
> > +**dst2) {
> > +	if (esize == 8)
> > +		__rte_ring_get_elem_addr_64(r, head,
> > +						num, dst1, n1, dst2);
> > +	else if (esize == 16)
> > +		__rte_ring_get_elem_addr_128(r, head,
> > +						num, dst1, n1, dst2);
> 
> 
> I don't think we need that special handling for 8/16B sizes.
> In all functions esize is an input parameter.
> If user will specify is as a constant - compiler will be able to convert multiply
> to shift and add ops.
Ok, I will check this out.
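
A minimal sketch of what a single generic address calculation could look like, assuming a toy ring layout ('struct toy_ring' and 'toy_get_elem_addr' are hypothetical names, not the rte_ring internals). esize is used directly as a byte multiplier, so a constant esize lets the compiler reduce it to shifts. The sketch takes the first region as (size - idx) elements on wrap-around and sets the second pointer to NULL otherwise, which is my assumption of the intended split:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ring: 'size' slots of 'esize' bytes each. */
struct toy_ring {
	uint32_t size;  /* number of slots, power of two */
	uint32_t mask;  /* size - 1 */
	uint8_t *slots; /* size * esize bytes of storage */
};

/*
 * Return the one or two contiguous regions covering 'num' elements
 * starting at 'head'. *n1 is the element count of the first region;
 * *dst2 is NULL when the request does not wrap around.
 */
static inline void
toy_get_elem_addr(const struct toy_ring *r, uint32_t head, uint32_t esize,
	uint32_t num, void **dst1, uint32_t *n1, void **dst2)
{
	uint32_t idx = head & r->mask;

	assert(esize % 4 == 0);

	/* One multiply by esize replaces the 8B/16B special cases. */
	*dst1 = r->slots + (size_t)idx * esize;

	if (idx + num > r->size) {
		/* Wrap-around: first region runs to the end of the array. */
		*n1 = r->size - idx;
		*dst2 = r->slots;
	} else {
		*n1 = num;
		*dst2 = NULL;
	}
}
```

For example, with 8 slots, head at 6 and 4 elements requested, the first region holds 2 elements and the remaining 2 start at slot 0.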

> 
> > +	else {
> > +		uint32_t idx, scale, nr_idx;
> > +		uint32_t *ring = (uint32_t *)&r[1];
> > +
> > +		/* Normalize to uint32_t */
> > +		scale = esize / sizeof(uint32_t);
> > +		idx = head & r->mask;
> > +		nr_idx = idx * scale;
> > +
> > +		*dst1 = ring + nr_idx;
> > +		*n1 = num;
> > +
> > +		if (idx + num > r->size) {
> > +			*n1 = num - (r->size - idx - 1);
> > +			*dst2 = ring;
> > +		}
> > +	}
> > +}
> > +
> > +/**
> > + * @internal This function moves prod head value.
> > + */
> > +static __rte_always_inline unsigned int
> > +__rte_ring_do_enqueue_sg_elem_start(struct rte_ring *r, unsigned int
> esize,
> > +		uint32_t n, enum rte_ring_queue_behavior behavior,
> > +		struct rte_ring_sg_data *sgd, unsigned int *free_space) {
> > +	uint32_t free, head, next;
> > +
> > +	switch (r->prod.sync_type) {
> > +	case RTE_RING_SYNC_ST:
> > +		n = __rte_ring_move_prod_head(r, RTE_RING_SYNC_ST, n,
> > +			behavior, &head, &next, &free);
> > +		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&sgd-
> >ptr1,
> > +			&sgd->n1, (void **)&sgd->ptr2);
> > +		break;
> > +	case RTE_RING_SYNC_MT_HTS:
> > +		n = __rte_ring_hts_move_prod_head(r, n, behavior, &head,
> &free);
> > +		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&sgd-
> >ptr1,
> > +			&sgd->n1, (void **)&sgd->ptr2);
> > +		break;
> > +	case RTE_RING_SYNC_MT:
> > +	case RTE_RING_SYNC_MT_RTS:
> > +	default:
> > +		/* unsupported mode, shouldn't be here */
> > +		RTE_ASSERT(0);
> > +		n = 0;
> > +		free = 0;
> > +	}
> > +
> > +	if (free_space != NULL)
> > +		*free_space = free - n;
> > +	return n;
> > +}
> > +
> > +/**
> > + * Start to enqueue several objects on the ring.
> > + * Note that no actual objects are put in the queue by this function,
> > + * it just reserves space for the user on the ring.
> > + * User has to copy objects into the queue using the returned pointers.
> > + * User should call rte_ring_enqueue_sg_bulk_elem_finish to complete
> > +the
> > + * enqueue operation.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param esize
> > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > + * @param n
> > + *   The number of objects to add in the ring.
> > + * @param sgd
> > + *   The scatter-gather data containing pointers for copying data.
> > + * @param free_space
> > + *   if non-NULL, returns the amount of space in the ring after the
> > + *   reservation operation has finished.
> > + * @return
> > + *   The number of objects that can be enqueued, either 0 or n
> > + */
> > +__rte_experimental
> > +static __rte_always_inline unsigned int
> > +rte_ring_enqueue_sg_bulk_elem_start(struct rte_ring *r, unsigned int
> esize,
> > +	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int
> > +*free_space) {
> > +	return __rte_ring_do_enqueue_sg_elem_start(r, esize, n,
> > +			RTE_RING_QUEUE_FIXED, sgd, free_space); }
> > +
> > +/**
> > + * Start to enqueue several pointers to objects on the ring.
> > + * Note that no actual pointers are put in the queue by this
> > +function,
> > + * it just reserves space for the user on the ring.
> > + * User has to copy pointers to objects into the queue using the
> > + * returned pointers.
> > + * User should call rte_ring_enqueue_sg_bulk_finish to complete the
> > + * enqueue operation.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param n
> > + *   The number of objects to add in the ring.
> > + * @param sgd
> > + *   The scatter-gather data containing pointers for copying data.
> > + * @param free_space
> > + *   if non-NULL, returns the amount of space in the ring after the
> > + *   reservation operation has finished.
> > + * @return
> > + *   The number of objects that can be enqueued, either 0 or n
> > + */
> > +__rte_experimental
> > +static __rte_always_inline unsigned int
> > +rte_ring_enqueue_sg_bulk_start(struct rte_ring *r, unsigned int n,
> > +	struct rte_ring_sg_data *sgd, unsigned int *free_space) {
> > +	return rte_ring_enqueue_sg_bulk_elem_start(r, sizeof(uintptr_t), n,
> > +							sgd, free_space);
> > +}
> > +/**
> > + * Start to enqueue several objects on the ring.
> > + * Note that no actual objects are put in the queue by this function,
> > + * it just reserves space for the user on the ring.
> > + * User has to copy objects into the queue using the returned pointers.
> > + * User should call rte_ring_enqueue_sg_bulk_elem_finish to complete
> > +the
> > + * enqueue operation.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param esize
> > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > + * @param n
> > + *   The number of objects to add in the ring.
> > + * @param sgd
> > + *   The scatter-gather data containing pointers for copying data.
> > + * @param free_space
> > + *   if non-NULL, returns the amount of space in the ring after the
> > + *   reservation operation has finished.
> > + * @return
> > + *   The number of objects that can be enqueued, either 0 or n
> > + */
> > +__rte_experimental
> > +static __rte_always_inline unsigned int
> > +rte_ring_enqueue_sg_burst_elem_start(struct rte_ring *r, unsigned int
> esize,
> > +	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int
> > +*free_space) {
> > +	return __rte_ring_do_enqueue_sg_elem_start(r, esize, n,
> > +			RTE_RING_QUEUE_VARIABLE, sgd, free_space); }
> > +
> > +/**
> > + * Start to enqueue several pointers to objects on the ring.
> > + * Note that no actual pointers are put in the queue by this
> > +function,
> > + * it just reserves space for the user on the ring.
> > + * User has to copy pointers to objects into the queue using the
> > + * returned pointers.
> > + * User should call rte_ring_enqueue_sg_bulk_finish to complete the
> > + * enqueue operation.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param n
> > + *   The number of objects to add in the ring.
> > + * @param sgd
> > + *   The scatter-gather data containing pointers for copying data.
> > + * @param free_space
> > + *   if non-NULL, returns the amount of space in the ring after the
> > + *   reservation operation has finished.
> > + * @return
> > + *   The number of objects that can be enqueued, either 0 or n
> > + */
> > +__rte_experimental
> > +static __rte_always_inline unsigned int
> > +rte_ring_enqueue_sg_burst_start(struct rte_ring *r, unsigned int n,
> > +	struct rte_ring_sg_data *sgd, unsigned int *free_space) {
> > +	return rte_ring_enqueue_sg_burst_elem_start(r, sizeof(uintptr_t),
> n,
> > +							sgd, free_space);
> > +}
> > +
> > +/**
> > + * Complete enqueuing several objects on the ring.
> > + * Note that number of objects to enqueue should not exceed previous
> > + * enqueue_start return value.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param n
> > + *   The number of objects to add to the ring.
> > + */
> > +__rte_experimental
> > +static __rte_always_inline void
> > +rte_ring_enqueue_sg_elem_finish(struct rte_ring *r, unsigned int n) {
> > +	uint32_t tail;
> > +
> > +	switch (r->prod.sync_type) {
> > +	case RTE_RING_SYNC_ST:
> > +		n = __rte_ring_st_get_tail(&r->prod, &tail, n);
> > +		__rte_ring_st_set_head_tail(&r->prod, tail, n, 1);
> > +		break;
> > +	case RTE_RING_SYNC_MT_HTS:
> > +		n = __rte_ring_hts_get_tail(&r->hts_prod, &tail, n);
> > +		__rte_ring_hts_set_head_tail(&r->hts_prod, tail, n, 1);
> > +		break;
> > +	case RTE_RING_SYNC_MT:
> > +	case RTE_RING_SYNC_MT_RTS:
> > +	default:
> > +		/* unsupported mode, shouldn't be here */
> > +		RTE_ASSERT(0);
> > +	}
> > +}
> > +
> > +/**
> > + * Complete enqueuing several pointers to objects on the ring.
> > + * Note that number of objects to enqueue should not exceed previous
> > + * enqueue_start return value.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param n
> > + *   The number of pointers to objects to add to the ring.
> > + */
> > +__rte_experimental
> > +static __rte_always_inline void
> > +rte_ring_enqueue_sg_finish(struct rte_ring *r, unsigned int n) {
> > +	rte_ring_enqueue_sg_elem_finish(r, n); }
> > +
> > +/**
> > + * @internal This function moves cons head value and copies up to *n*
> > + * objects from the ring to the user provided obj_table.
> > + */
> > +static __rte_always_inline unsigned int
> > +__rte_ring_do_dequeue_sg_elem_start(struct rte_ring *r,
> > +	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
> > +	struct rte_ring_sg_data *sgd, unsigned int *available) {
> > +	uint32_t avail, head, next;
> > +
> > +	switch (r->cons.sync_type) {
> > +	case RTE_RING_SYNC_ST:
> > +		n = __rte_ring_move_cons_head(r, RTE_RING_SYNC_ST, n,
> > +			behavior, &head, &next, &avail);
> > +		__rte_ring_get_elem_addr(r, head, esize, n,
> > +					sgd->ptr1, &sgd->n1, sgd->ptr2);
> > +		break;
> > +	case RTE_RING_SYNC_MT_HTS:
> > +		n = __rte_ring_hts_move_cons_head(r, n, behavior,
> > +			&head, &avail);
> > +		__rte_ring_get_elem_addr(r, head, esize, n,
> > +					sgd->ptr1, &sgd->n1, sgd->ptr2);
> > +		break;
> > +	case RTE_RING_SYNC_MT:
> > +	case RTE_RING_SYNC_MT_RTS:
> > +	default:
> > +		/* unsupported mode, shouldn't be here */
> > +		RTE_ASSERT(0);
> > +		n = 0;
> > +		avail = 0;
> > +	}
> > +
> > +	if (available != NULL)
> > +		*available = avail - n;
> > +	return n;
> > +}
> > +
> > +/**
> > + * Start to dequeue several objects from the ring.
> > + * Note that no actual objects are copied from the queue by this function.
> > + * User has to copy objects from the queue using the returned pointers.
> > + * User should call rte_ring_dequeue_sg_bulk_elem_finish to complete
> > +the
> > + * dequeue operation.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param esize
> > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > + * @param n
> > + *   The number of objects to remove from the ring.
> > + * @param sgd
> > + *   The scatter-gather data containing pointers for copying data.
> > + * @param available
> > + *   If non-NULL, returns the number of remaining ring entries after the
> > + *   dequeue has finished.
> > + * @return
> > + *   The number of objects that can be dequeued, either 0 or n
> > + */
> > +__rte_experimental
> > +static __rte_always_inline unsigned int
> > +rte_ring_dequeue_sg_bulk_elem_start(struct rte_ring *r, unsigned int
> esize,
> > +	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int
> > +*available) {
> > +	return __rte_ring_do_dequeue_sg_elem_start(r, esize, n,
> > +			RTE_RING_QUEUE_FIXED, sgd, available); }
> > +
> > +/**
> > + * Start to dequeue several pointers to objects from the ring.
> > + * Note that no actual pointers are removed from the queue by this
> function.
> > + * User has to copy pointers to objects from the queue using the
> > + * returned pointers.
> > + * User should call rte_ring_dequeue_sg_bulk_finish to complete the
> > + * dequeue operation.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param n
> > + *   The number of objects to remove from the ring.
> > + * @param sgd
> > + *   The scatter-gather data containing pointers for copying data.
> > + * @param available
> > + *   If non-NULL, returns the number of remaining ring entries after the
> > + *   dequeue has finished.
> > + * @return
> > + *   The number of objects that can be dequeued, either 0 or n
> > + */
> > +__rte_experimental
> > +static __rte_always_inline unsigned int
> > +rte_ring_dequeue_sg_bulk_start(struct rte_ring *r, unsigned int n,
> > +	struct rte_ring_sg_data *sgd, unsigned int *available) {
> > +	return rte_ring_dequeue_sg_bulk_elem_start(r, sizeof(uintptr_t),
> > +		n, sgd, available);
> > +}
> > +
> > +/**
> > + * Start to dequeue several objects from the ring.
> > + * Note that no actual objects are copied from the queue by this function.
> > + * User has to copy objects from the queue using the returned pointers.
> > + * User should call rte_ring_dequeue_sg_burst_elem_finish to complete
> > +the
> > + * dequeue operation.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param esize
> > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > + *   This must be the same value used while creating the ring. Otherwise
> > + *   the results are undefined.
> > + * @param n
> > + *   The number of objects to dequeue from the ring.
> > + * @param sgd
> > + *   The scatter-gather data containing pointers for copying data.
> > + * @param available
> > + *   If non-NULL, returns the number of remaining ring entries after the
> > + *   dequeue has finished.
> > + * @return
> > + *   The number of objects that can be dequeued, either 0 or n
> > + */
> > +__rte_experimental
> > +static __rte_always_inline unsigned int
> > +rte_ring_dequeue_sg_burst_elem_start(struct rte_ring *r, unsigned int
> esize,
> > +	unsigned int n, struct rte_ring_sg_data *sgd, unsigned int
> > +*available) {
> > +	return __rte_ring_do_dequeue_sg_elem_start(r, esize, n,
> > +			RTE_RING_QUEUE_VARIABLE, sgd, available); }
> > +
> > +/**
> > + * Start to dequeue several pointers to objects from the ring.
> > + * Note that no actual pointers are removed from the queue by this
> function.
> > + * User has to copy pointers to objects from the queue using the
> > + * returned pointers.
> > + * User should call rte_ring_dequeue_sg_burst_finish to complete the
> > + * dequeue operation.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param n
> > + *   The number of objects to remove from the ring.
> > + * @param sgd
> > + *   The scatter-gather data containing pointers for copying data.
> > + * @param available
> > + *   If non-NULL, returns the number of remaining ring entries after the
> > + *   dequeue has finished.
> > + * @return
> > + *   The number of objects that can be dequeued, either 0 or n
> > + */
> > +__rte_experimental
> > +static __rte_always_inline unsigned int
> > +rte_ring_dequeue_sg_burst_start(struct rte_ring *r, unsigned int n,
> > +		struct rte_ring_sg_data *sgd, unsigned int *available) {
> > +	return rte_ring_dequeue_sg_burst_elem_start(r, sizeof(uintptr_t),
> n,
> > +			sgd, available);
> > +}
> > +
> > +/**
> > + * Complete dequeuing several objects from the ring.
> > + * Note that number of objects to dequeued should not exceed previous
> > + * dequeue_start return value.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param n
> > + *   The number of objects to remove from the ring.
> > + */
> > +__rte_experimental
> > +static __rte_always_inline void
> > +rte_ring_dequeue_sg_elem_finish(struct rte_ring *r, unsigned int n) {
> > +	uint32_t tail;
> > +
> > +	switch (r->cons.sync_type) {
> > +	case RTE_RING_SYNC_ST:
> > +		n = __rte_ring_st_get_tail(&r->cons, &tail, n);
> > +		__rte_ring_st_set_head_tail(&r->cons, tail, n, 0);
> > +		break;
> > +	case RTE_RING_SYNC_MT_HTS:
> > +		n = __rte_ring_hts_get_tail(&r->hts_cons, &tail, n);
> > +		__rte_ring_hts_set_head_tail(&r->hts_cons, tail, n, 0);
> > +		break;
> > +	case RTE_RING_SYNC_MT:
> > +	case RTE_RING_SYNC_MT_RTS:
> > +	default:
> > +		/* unsupported mode, shouldn't be here */
> > +		RTE_ASSERT(0);
> > +	}
> > +}
> > +
> > +/**
> > + * Complete dequeuing several objects from the ring.
> > + * Note that number of objects to dequeued should not exceed previous
> > + * dequeue_start return value.
> > + *
> > + * @param r
> > + *   A pointer to the ring structure.
> > + * @param n
> > + *   The number of objects to remove from the ring.
> > + */
> > +__rte_experimental
> > +static __rte_always_inline void
> > +rte_ring_dequeue_sg_finish(struct rte_ring *r, unsigned int n) {
> > +	rte_ring_dequeue_elem_finish(r, n);
> > +}
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_RING_PEEK_SG_H_ */
> > --
> > 2.17.1



* Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
  2020-10-12 22:31       ` Honnappa Nagarahalli
@ 2020-10-13 11:38         ` Ananyev, Konstantin
  0 siblings, 0 replies; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-13 11:38 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev; +Cc: olivier.matz, david.marchand, nd, nd


Hi Honnappa,

> Hi Konstantin,
> 	Appreciate your feedback.
> 
> <snip>
> 
> >
> >
> > > Add scatter gather APIs to avoid intermediate memcpy. Use cases that
> > > involve copying large amount of data to/from the ring can benefit from
> > > these APIs.
> > >
> > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > ---
> > >  lib/librte_ring/meson.build        |   3 +-
> > >  lib/librte_ring/rte_ring_elem.h    |   1 +
> > >  lib/librte_ring/rte_ring_peek_sg.h | 552
> > > +++++++++++++++++++++++++++++
> > >  3 files changed, 555 insertions(+), 1 deletion(-)  create mode 100644
> > > lib/librte_ring/rte_ring_peek_sg.h
> >
> > As a generic one - need to update ring UT both func and perf to
> > test/measure this new API.
> Yes, will add.
> 
> >
> > >
> > > diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> > > index 31c0b4649..377694713 100644
> > > --- a/lib/librte_ring/meson.build
> > > +++ b/lib/librte_ring/meson.build
> > > @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
> > >  		'rte_ring_peek.h',
> > >  		'rte_ring_peek_c11_mem.h',
> > >  		'rte_ring_rts.h',
> > > -		'rte_ring_rts_c11_mem.h')
> > > +		'rte_ring_rts_c11_mem.h',
> > > +		'rte_ring_peek_sg.h')
> > > diff --git a/lib/librte_ring/rte_ring_elem.h
> > > b/lib/librte_ring/rte_ring_elem.h index 938b398fc..7d3933f15 100644
> > > --- a/lib/librte_ring/rte_ring_elem.h
> > > +++ b/lib/librte_ring/rte_ring_elem.h
> > > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r,
> > > void *obj_table,
> > >
> > >  #ifdef ALLOW_EXPERIMENTAL_API
> > >  #include <rte_ring_peek.h>
> > > +#include <rte_ring_peek_sg.h>
> > >  #endif
> > >
> > >  #include <rte_ring.h>
> > > diff --git a/lib/librte_ring/rte_ring_peek_sg.h
> > > b/lib/librte_ring/rte_ring_peek_sg.h
> > > new file mode 100644
> > > index 000000000..97d5764a6
> > > --- /dev/null
> > > +++ b/lib/librte_ring/rte_ring_peek_sg.h
> > > @@ -0,0 +1,552 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + *
> > > + * Copyright (c) 2020 Arm
> > > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > > + * All rights reserved.
> > > + * Derived from FreeBSD's bufring.h
> > > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > > + */
> > > +
> > > +#ifndef _RTE_RING_PEEK_SG_H_
> > > +#define _RTE_RING_PEEK_SG_H_
> > > +
> > > +/**
> > > + * @file
> > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > + * It is not recommended to include this file directly.
> > > + * Please include <rte_ring_elem.h> instead.
> > > + *
> > > + * Ring Peek Scatter Gather APIs
> > > + * Introduction of rte_ring with scatter gather serialized
> > > +producer/consumer
> > > + * (HTS sync mode) makes it possible to split public enqueue/dequeue
> > > +API
> > > + * into 3 phases:
> > > + * - enqueue/dequeue start
> > > + * - copy data to/from the ring
> > > + * - enqueue/dequeue finish
> > > + * Along with the advantages of the peek APIs, these APIs provide the
> > > +ability
> > > + * to avoid copying of the data to temporary area.
> > > + *
> > > + * Note that right now this new API is available only for two sync modes:
> > > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> > > + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> > > + * It is a user responsibility to create/init ring with appropriate
> > > +sync
> > > + * modes selected.
> > > + *
> > > + * Example usage:
> > > + * // read 1 elem from the ring:
> > > + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> > > + * if (n != 0) {
> > > + *	//Copy objects in the ring
> > > + *	memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> > > + *	if (n != sgd->n1)
> > > + *		//Second memcpy because of wrapround
> > > + *		n2 = n - sgd->n1;
> > > + *		memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));
> > > + *	rte_ring_dequeue_sg_finish(ring, n);
> >
> > It is not clear from the example above why do you need SG(ZC) API.
> > Existing peek API would be able to handle such situation (just copy will be
> > done internally). Probably better to use examples you provided in your last
> > reply to Olivier.
> Agree, not a good example, will change it.
> 
> >
> > > + * }
> > > + *
> > > + * Note that between _start_ and _finish_ none other thread can
> > > + proceed
> > > + * with enqueue(/dequeue) operation till _finish_ completes.
> > > + */
> > > +
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +#include <rte_ring_peek_c11_mem.h>
> > > +
> > > +/* Rock that needs to be passed between reserve and commit APIs */
> > > +struct rte_ring_sg_data {
> > > +	/* Pointer to the first space in the ring */
> > > +	void **ptr1;
> > > +	/* Pointer to the second space in the ring if there is wrap-around */
> > > +	void **ptr2;
> > > +	/* Number of elements in the first pointer. If this is equal to
> > > +	 * the number of elements requested, then ptr2 is NULL.
> > > +	 * Otherwise, subtracting n1 from number of elements requested
> > > +	 * will give the number of elements available at ptr2.
> > > +	 */
> > > +	unsigned int n1;
> > > +};
> >
> > I wonder what is the primary goal of that API?
> > The reason I am asking: from what I understand with this patch ZC API will
> > work only for ST and HTS modes (same as peek API).
> > Though, I think it is possible to make it work for any sync model, by changing
> Agree, the functionality can be extended to other modes as well. I added these 2 modes as I found the use cases for these.
> 
> > API a bit: instead of returning sg_data to the user, force him to provide
> > function to read/write elems from/to the ring.
> > Just a schematic one, to illustrate the idea:
> >
> > typedef void (*write_ring_func_t)(void *elem, /*pointer to first elem to
> > update inside the ring*/
> > 				uint32_t num, /* number of elems to update
> > */
> > 				uint32_t esize,
> > 				void *udata  /* caller provide data */);
> >
> > rte_ring_enqueue_zc_bulk_elem(struct rte_ring *r, unsigned int esize,
> > 	unsigned int n, unsigned int *free_space, write_ring_func_t wf, void
> > *udata) {
> > 	struct rte_ring_sg_data sgd;
> > 	.....
> > 	n = move_head_tail(r, ...);
> >
> > 	/* get sgd data based on n */
> > 	get_elem_addr(r, ..., &sgd);
> >
> > 	/* call user defined function to fill reserved elems */
> > 	wf(sgd.p1, sgd.n1, esize, udata);
> > 	if (n != n1)
> > 		wf(sgd.p2, sgd.n2, esize, udata);
> >
> > 	....
> > 	return n;
> > }
> >
> I think the call back function makes it difficult to use the API. The call back function would be a wrapper around another function or API
> which will have its own arguments. Now all those parameters have to be passed using the 'udata'. For ex: in the 2nd example that I provided
> earlier, the user has to create a wrapper around 'rte_eth_rx_burst' API and then provide the parameters to 'rte_eth_rx_burst' through
> 'udata'. 'udata' would need a structure definition as well.

Yes, it would, though I don't see much of a problem with that.
Let's say for eth_rx_burst(), the user will need something like struct {uint16_t p, q;} udata = {.p = port_id, .q = queue_id,};
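
A minimal sketch of such a wrapper, assuming a toy ring of pointer-sized slots with no consumer side. 'toy_ring', 'toy_enqueue_zc', 'fake_rx_burst' and 'rx_write_cb' are hypothetical names; 'fake_rx_burst' merely stands in for an rx-burst style function and always fills every slot it is given:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Callback type following the schematic quoted earlier in this thread. */
typedef void (*write_ring_func_t)(void *elem, uint32_t num,
	uint32_t esize, void *udata);

#define TOY_SIZE 8u /* power of two */

/* Hypothetical toy ring; free-space tracking is omitted for brevity. */
struct toy_ring {
	uint32_t head; /* free-running producer index */
	uintptr_t slots[TOY_SIZE];
};

/* Stand-in for an rx-burst style call: writes 'n' fake packet handles. */
static uint16_t
fake_rx_burst(uint16_t port, uint16_t queue, uintptr_t *pkts, uint16_t n)
{
	for (uint16_t i = 0; i < n; i++)
		pkts[i] = ((uintptr_t)port << 16) | ((uintptr_t)queue << 8) | i;
	return n;
}

/* Arguments of the wrapped call, carried through 'udata'. */
struct rx_udata {
	uint16_t p, q;
};

/* Wrapper the user has to write to adapt the call to the callback type. */
static void
rx_write_cb(void *elem, uint32_t num, uint32_t esize, void *udata)
{
	struct rx_udata *u = udata;

	(void)esize;
	/* A void callback cannot report a short burst to the ring code. */
	fake_rx_burst(u->p, u->q, elem, (uint16_t)num);
}

/* Reserve 'n' slots and let the callback fill one or two regions. */
static unsigned int
toy_enqueue_zc(struct toy_ring *r, unsigned int n,
	write_ring_func_t wf, void *udata)
{
	uint32_t idx = r->head & (TOY_SIZE - 1);
	uint32_t n1;

	if (n > TOY_SIZE)
		return 0;

	n1 = (idx + n > TOY_SIZE) ? TOY_SIZE - idx : n;
	wf(&r->slots[idx], n1, sizeof(uintptr_t), udata);
	if (n != n1)
		wf(&r->slots[0], n - n1, sizeof(uintptr_t), udata);

	r->head += n;
	return n;
}
```

The user-side cost is the 'rx_udata' definition plus the 'rx_write_cb' wrapper. The ring internals stay hidden, but the callback needs a return value before a partial fill can be handled.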

> 
> > If we want ZC peek API also - some extra work need to be done with
> > introducing return value for write_ring_func() and checking it properly, but I
> > don't see any big problems here too.
> > That way ZC API can support all sync models, plus we don't need to expose
> > sg_data to the user directly.
> Other modes can be supported with the method used in this patch as well. 

You mean by exposing the tail value to the user (in sg_data or so)?
I am still a bit nervous about doing that.

> If you see a need, I can add them.

Not really, I just thought callbacks would be a good idea here...

> IMO, only issue with exposing sg_data is ABI compatibility in the future. I think, we can align the 'struct rte_ring_sg_data' to cache line
> boundary and it should provide ability to extend it in the future without affecting the ABI compatibility.

As I understand, sg_data is an experimental struct (like the rest of the API in that file).
So breaking it shouldn't be a problem for a while.

To summarize: as I understand, you think the callback approach
is not a good choice.
On the other hand, I am not really happy with the idea of exposing tail value updates
to the user.
So I suggest we just go ahead with the patch as it is:
sg_data approach, _ZC_ peek API only.

> 
> > Also, in future, we probably can de-duplicate the code by making our non-ZC
> > API to use that one internally (pass ring_enqueue_elems()/ob_table as a
> > parameters).
> >
> > > +


* [dpdk-dev] [PATCH v3 0/5] lib/ring: add zero copy APIs
  2020-02-24 20:39 [dpdk-dev] [RFC 0/1] lib/ring: add scatter gather and serial dequeue APIs Honnappa Nagarahalli
  2020-02-24 20:39 ` [dpdk-dev] [RFC 1/1] " Honnappa Nagarahalli
  2020-10-06 13:29 ` [dpdk-dev] [RFC v2 0/1] lib/ring: add scatter gather APIs Honnappa Nagarahalli
@ 2020-10-23  4:43 ` Honnappa Nagarahalli
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 1/5] test/ring: fix the memory dump size Honnappa Nagarahalli
                     ` (4 more replies)
  2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
  2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
  4 siblings, 5 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-23  4:43 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev
  Cc: olivier.matz, david.marchand, dharmik.thakkar, ruifeng.wang, nd

It is pretty common for DPDK applications to be deployed in a
semi-pipeline model. In this model, a small number of cores
(typically 1) are designated as I/O cores. The I/O cores receive
packets from and transmit packets to the NIC, and hand them to
several packet processing cores. The I/O core and the packet
processing cores exchange the packets over a ring. Typically, such
applications receive the mbufs in a temporary array and then copy
the mbuf pointers on to the ring. Depending on the requirements, the
packets could be copied in batches of 32, 64 etc., resulting in
memory copies of 256B, 512B etc.
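
For reference, the 256B and 512B figures follow from the pointer size.
A small sketch, assuming 8-byte mbuf pointers (a 64-bit target);
burst_copy_bytes is a hypothetical helper, not a DPDK function:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes copied from the temporary array to the ring for one burst */
static size_t
burst_copy_bytes(size_t burst, size_t ptr_size)
{
	return burst * ptr_size;
}
```

With ptr_size = 8, a burst of 32 gives 256 bytes and a burst of 64 gives
512 bytes.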

The zero copy APIs help avoid intermediate copies by exposing
the space on the ring directly to the application.

v3:
1) Changed the name of the APIs to 'zero-copy (zc)'
2) Made the address calculation simpler
3) Structure to return the data to the user is aligned on
   cache line boundary.
4) Added functional and stress test cases

v2: changed the patch to use the SP-SC and HTS modes

v1: Initial version
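
The address calculation mentioned in item 2 of the v3 changes can be
sketched outside DPDK as below. This is an illustration under stated
assumptions, not the library code: elem_addr is a hypothetical name, the
ring storage is modelled as a plain uint32_t array, and 'size' is assumed
to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute where 'num' elements starting at 'head' live in the storage.
 * dst1/n1 describe the first contiguous segment; dst2 is the start of
 * the storage when the request wraps around, NULL otherwise.
 */
static void
elem_addr(uint32_t *ring, uint32_t size, uint32_t head, uint32_t esize,
	uint32_t num, void **dst1, uint32_t *n1, void **dst2)
{
	uint32_t scale, idx;

	assert(esize % sizeof(uint32_t) == 0);

	/* Normalize the element size to uint32_t units */
	scale = esize / sizeof(uint32_t);
	idx = head & (size - 1);

	*dst1 = ring + idx * scale;
	*n1 = num;
	*dst2 = NULL;

	if (idx + num > size) {
		*n1 = size - idx;
		*dst2 = ring;
	}
}
```

For example, with size = 8, esize = 16 and head = 14, a request for 4
elements yields 2 elements at slot 6 and the remaining 2 at slot 0.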

Honnappa Nagarahalli (5):
  test/ring: fix the memory dump size
  lib/ring: add zero copy APIs
  test/ring: move common function to header file
  test/ring: add functional tests for zero copy APIs
  test/ring: add stress tests for zero copy APIs

 app/test/meson.build                   |   2 +
 app/test/test_ring.c                   | 207 +++++++++-
 app/test/test_ring.h                   |  53 +++
 app/test/test_ring_mt_peek_stress_zc.c |  56 +++
 app/test/test_ring_st_peek_stress_zc.c |  65 +++
 app/test/test_ring_stress.c            |   6 +
 app/test/test_ring_stress.h            |   2 +
 app/test/test_ring_stress_impl.h       |   2 +-
 lib/librte_ring/meson.build            |   1 +
 lib/librte_ring/rte_ring_elem.h        |   1 +
 lib/librte_ring/rte_ring_peek_zc.h     | 542 +++++++++++++++++++++++++
 11 files changed, 925 insertions(+), 12 deletions(-)
 create mode 100644 app/test/test_ring_mt_peek_stress_zc.c
 create mode 100644 app/test/test_ring_st_peek_stress_zc.c
 create mode 100644 lib/librte_ring/rte_ring_peek_zc.h

-- 
2.17.1


* [dpdk-dev] [PATCH v3 1/5] test/ring: fix the memory dump size
  2020-10-23  4:43 ` [dpdk-dev] [PATCH v3 0/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
@ 2020-10-23  4:43   ` Honnappa Nagarahalli
  2020-10-23 13:24     ` Ananyev, Konstantin
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 2/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-23  4:43 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev
  Cc: olivier.matz, david.marchand, dharmik.thakkar, ruifeng.wang, nd, stable

Pass the correct number of bytes to dump the memory.

Fixes: bf28df24e915 ("test/ring: add contention stress test")
Cc: konstantin.ananyev@intel.com
Cc: stable@dpdk.org

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
---
 app/test/test_ring_stress_impl.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
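
To illustrate the difference the one-character fix makes, here is a
minimal sketch. struct demo_elem is a hypothetical stand-in for the
test's element type, and dump_len_old/dump_len_new are hypothetical
helpers that only return the length each call would pass:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the element type used by the stress test */
struct demo_elem {
	uint32_t cnt[16];
};

/* Length passed before the fix: the size of the pointer itself */
static size_t
dump_len_old(struct demo_elem *elm[], uint32_t i)
{
	return sizeof(elm[i]);
}

/* Length passed after the fix: the size of the pointed-to object */
static size_t
dump_len_new(struct demo_elem *elm[], uint32_t i)
{
	return sizeof(*elm[i]);
}
```

On a typical 64-bit target the old expression covers only 8 bytes, so the
"result" dump was shorter than the "expected" dump.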

diff --git a/app/test/test_ring_stress_impl.h b/app/test/test_ring_stress_impl.h
index 3b9a480eb..f9ca63b90 100644
--- a/app/test/test_ring_stress_impl.h
+++ b/app/test/test_ring_stress_impl.h
@@ -159,7 +159,7 @@ check_updt_elem(struct ring_elem *elm[], uint32_t num,
 				"offending object: %p\n",
 				__func__, rte_lcore_id(), num, i, elm[i]);
 			rte_memdump(stdout, "expected", check, sizeof(*check));
-			rte_memdump(stdout, "result", elm[i], sizeof(elm[i]));
+			rte_memdump(stdout, "result", elm[i], sizeof(*elm[i]));
 			rte_spinlock_unlock(&dump_lock);
 			return -EINVAL;
 		}
-- 
2.17.1


* [dpdk-dev] [PATCH v3 2/5] lib/ring: add zero copy APIs
  2020-10-23  4:43 ` [dpdk-dev] [PATCH v3 0/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 1/5] test/ring: fix the memory dump size Honnappa Nagarahalli
@ 2020-10-23  4:43   ` Honnappa Nagarahalli
  2020-10-23 13:59     ` Ananyev, Konstantin
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 3/5] test/ring: move common function to header file Honnappa Nagarahalli
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-23  4:43 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev
  Cc: olivier.matz, david.marchand, dharmik.thakkar, ruifeng.wang, nd

Add zero-copy APIs. These APIs provide the capability to
copy the data to/from the ring memory directly, without
having a temporary copy (for example, an array of mbufs on
the stack). Use cases that involve copying large amounts
of data to/from the ring can benefit from these APIs.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
---
 lib/librte_ring/meson.build        |   1 +
 lib/librte_ring/rte_ring_elem.h    |   1 +
 lib/librte_ring/rte_ring_peek_zc.h | 542 +++++++++++++++++++++++++++++
 3 files changed, 544 insertions(+)
 create mode 100644 lib/librte_ring/rte_ring_peek_zc.h
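
The start / fill / finish flow described in the commit message can be
modelled with a toy single-threaded ring. This is a sketch under stated
assumptions, not the DPDK implementation: all toy_* names are
hypothetical, there is no atomics or memory ordering, and the toy ring
uses all 8 slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u /* must be a power of two */

struct toy_ring {
	void *slots[TOY_RING_SIZE];
	uint32_t prod_head; /* end of the reserved region */
	uint32_t prod_tail; /* end of the committed region */
	uint32_t cons_tail; /* first slot not yet consumed */
};

struct toy_zc_data {
	void **ptr1;     /* first contiguous segment */
	void **ptr2;     /* second segment on wrap-around, else NULL */
	unsigned int n1; /* number of slots available at ptr1 */
};

/* Reserve up to n slots (burst behaviour); nothing is copied here */
static unsigned int
toy_enqueue_zc_burst_start(struct toy_ring *r, unsigned int n,
	struct toy_zc_data *zcd)
{
	uint32_t free_slots = TOY_RING_SIZE - (r->prod_head - r->cons_tail);
	uint32_t idx = r->prod_head & (TOY_RING_SIZE - 1);

	if (n > free_slots)
		n = free_slots;

	zcd->ptr1 = &r->slots[idx];
	zcd->n1 = n;
	zcd->ptr2 = NULL;
	if (idx + n > TOY_RING_SIZE) {
		zcd->n1 = TOY_RING_SIZE - idx;
		zcd->ptr2 = &r->slots[0];
	}

	r->prod_head += n;
	return n;
}

/* Commit n slots; n may be smaller than what was reserved */
static void
toy_enqueue_zc_finish(struct toy_ring *r, unsigned int n)
{
	assert(n <= r->prod_head - r->prod_tail);
	r->prod_tail += n;
	r->prod_head = r->prod_tail; /* give back any unused reservation */
}
```

The caller writes directly into ptr1 (and ptr2 when the reservation
wraps), then passes the number of slots actually filled to the finish
call.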

diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
index 31c0b4649..36fdcb6a5 100644
--- a/lib/librte_ring/meson.build
+++ b/lib/librte_ring/meson.build
@@ -11,5 +11,6 @@ headers = files('rte_ring.h',
 		'rte_ring_hts_c11_mem.h',
 		'rte_ring_peek.h',
 		'rte_ring_peek_c11_mem.h',
+		'rte_ring_peek_zc.h',
 		'rte_ring_rts.h',
 		'rte_ring_rts_c11_mem.h')
diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
index 938b398fc..7034d29c0 100644
--- a/lib/librte_ring/rte_ring_elem.h
+++ b/lib/librte_ring/rte_ring_elem.h
@@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
 
 #ifdef ALLOW_EXPERIMENTAL_API
 #include <rte_ring_peek.h>
+#include <rte_ring_peek_zc.h>
 #endif
 
 #include <rte_ring.h>
diff --git a/lib/librte_ring/rte_ring_peek_zc.h b/lib/librte_ring/rte_ring_peek_zc.h
new file mode 100644
index 000000000..9db2d343f
--- /dev/null
+++ b/lib/librte_ring/rte_ring_peek_zc.h
@@ -0,0 +1,542 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ *
+ * Copyright (c) 2020 Arm Limited
+ * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * All rights reserved.
+ * Derived from FreeBSD's bufring.h
+ * Used as BSD-3 Licensed with permission from Kip Macy.
+ */
+
+#ifndef _RTE_RING_PEEK_ZC_H_
+#define _RTE_RING_PEEK_ZC_H_
+
+/**
+ * @file
+ * @b EXPERIMENTAL: this API may change without prior notice
+ * It is not recommended to include this file directly.
+ * Please include <rte_ring_elem.h> instead.
+ *
+ * Ring Peek Zero Copy APIs
+ * These APIs make it possible to split public enqueue/dequeue API
+ * into 3 parts:
+ * - enqueue/dequeue start
+ * - copy data to/from the ring
+ * - enqueue/dequeue finish
+ * Along with the advantages of the peek APIs, these APIs provide the ability
+ * to avoid copying of the data to temporary area (for ex: array of mbufs
+ * on the stack).
+ *
+ * Note that currently these APIs are available only for two sync modes:
+ * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
+ * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
+ * It is user's responsibility to create/init ring with appropriate sync
+ * modes selected.
+ *
+ * Following are some examples showing the API usage.
+ * 1)
+ * struct elem_obj {uint64_t a; uint32_t b, c;};
+ * struct elem_obj *obj;
+ *
+ * // Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS
+ * // Reserve space on the ring
+ * n = rte_ring_enqueue_zc_bulk_elem_start(r, sizeof(struct elem_obj), 1, &zcd, NULL);
+ *
+ * // Produce the data directly on the ring memory
+ * obj = (struct elem_obj *)zcd.ptr1;
+ * obj->a = rte_get_a();
+ * obj->b = rte_get_b();
+ * obj->c = rte_get_c();
+ * rte_ring_enqueue_zc_elem_finish(r, n);
+ *
+ * 2)
+ * // Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS
+ * // Reserve space on the ring
+ * n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
+ *
+ * // Pkt I/O core polls packets from the NIC
+ * // zcd.n1 <= n; only the first contiguous segment is filled here
+ * nb_rx = 0;
+ * if (n != 0)
+ *	nb_rx = rte_eth_rx_burst(portid, queueid, zcd.ptr1, zcd.n1);
+ *
+ * // Provide packets to the packet processing cores
+ * rte_ring_enqueue_zc_finish(r, nb_rx);
+ *
+ * Note that between _start_ and _finish_ no other thread can proceed
+ * with enqueue/dequeue operation till _finish_ completes.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_ring_peek_c11_mem.h>
+
+/**
+ * Ring zero-copy information structure.
+ *
+ * This structure contains the pointers and length of the space
+ * reserved on the ring storage.
+ */
+struct rte_ring_zc_data {
+	/* Pointer to the first space in the ring */
+	void **ptr1;
+	/* Pointer to the second space in the ring if there is wrap-around */
+	void **ptr2;
+	/* Number of elements in the first pointer. If this is equal to
+	 * the number of elements requested, then ptr2 is NULL.
+	 * Otherwise, subtracting n1 from number of elements requested
+	 * will give the number of elements available at ptr2.
+	 */
+	unsigned int n1;
+} __rte_cache_aligned;
+
+static __rte_always_inline void
+__rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
+	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void **dst2)
+{
+	uint32_t idx, scale, nr_idx;
+	uint32_t *ring = (uint32_t *)&r[1];
+
+	/* Normalize to uint32_t */
+	scale = esize / sizeof(uint32_t);
+	idx = head & r->mask;
+	nr_idx = idx * scale;
+
+	*dst1 = ring + nr_idx;
+	*n1 = num;
+	*dst2 = NULL;
+	if (idx + num > r->size) {
+		*n1 = r->size - idx;
+		*dst2 = ring;
+	}
+}
+
+/**
+ * @internal This function moves prod head value.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_enqueue_zc_elem_start(struct rte_ring *r, unsigned int esize,
+		uint32_t n, enum rte_ring_queue_behavior behavior,
+		struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	uint32_t free, head, next;
+
+	switch (r->prod.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_move_prod_head(r, RTE_RING_SYNC_ST, n,
+			behavior, &head, &next, &free);
+		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&zcd->ptr1,
+			&zcd->n1, (void **)&zcd->ptr2);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_move_prod_head(r, n, behavior, &head, &free);
+		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&zcd->ptr1,
+			&zcd->n1, (void **)&zcd->ptr2);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+		n = 0;
+		free = 0;
+	}
+
+	if (free_space != NULL)
+		*free_space = free - n;
+	return n;
+}
+
+/**
+ * Start to enqueue several objects on the ring.
+ * Note that no actual objects are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy objects into the queue using the returned pointers.
+ * User should call rte_ring_enqueue_zc_elem_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_bulk_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return __rte_ring_do_enqueue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_FIXED, zcd, free_space);
+}
+
+/**
+ * Start to enqueue several pointers to objects on the ring.
+ * Note that no actual pointers are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy pointers to objects into the queue using the
+ * returned pointers.
+ * User should call rte_ring_enqueue_zc_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_bulk_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return rte_ring_enqueue_zc_bulk_elem_start(r, sizeof(uintptr_t), n,
+							zcd, free_space);
+}
+/**
+ * Start to enqueue several objects on the ring.
+ * Note that no actual objects are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy objects into the queue using the returned pointers.
+ * User should call rte_ring_enqueue_zc_elem_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_burst_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return __rte_ring_do_enqueue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_VARIABLE, zcd, free_space);
+}
+
+/**
+ * Start to enqueue several pointers to objects on the ring.
+ * Note that no actual pointers are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy pointers to objects into the queue using the
+ * returned pointers.
+ * User should call rte_ring_enqueue_zc_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_burst_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return rte_ring_enqueue_zc_burst_elem_start(r, sizeof(uintptr_t), n,
+							zcd, free_space);
+}
+
+/**
+ * Complete enqueuing several objects on the ring.
+ * Note that number of objects to enqueue should not exceed previous
+ * enqueue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add to the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_enqueue_zc_elem_finish(struct rte_ring *r, unsigned int n)
+{
+	uint32_t tail;
+
+	switch (r->prod.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_st_get_tail(&r->prod, &tail, n);
+		__rte_ring_st_set_head_tail(&r->prod, tail, n, 1);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_get_tail(&r->hts_prod, &tail, n);
+		__rte_ring_hts_set_head_tail(&r->hts_prod, tail, n, 1);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+	}
+}
+
+/**
+ * Complete enqueuing several pointers to objects on the ring.
+ * Note that number of objects to enqueue should not exceed previous
+ * enqueue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of pointers to objects to add to the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_enqueue_zc_finish(struct rte_ring *r, unsigned int n)
+{
+	rte_ring_enqueue_zc_elem_finish(r, n);
+}
+
+/**
+ * @internal This function moves cons head value and returns, through zcd,
+ * pointers to up to *n* objects on the ring. No objects are copied.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_dequeue_zc_elem_start(struct rte_ring *r,
+	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
+	struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	uint32_t avail, head, next;
+
+	switch (r->cons.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_move_cons_head(r, RTE_RING_SYNC_ST, n,
+			behavior, &head, &next, &avail);
+		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&zcd->ptr1,
+			&zcd->n1, (void **)&zcd->ptr2);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_move_cons_head(r, n, behavior,
+			&head, &avail);
+		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&zcd->ptr1,
+			&zcd->n1, (void **)&zcd->ptr2);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+		n = 0;
+		avail = 0;
+	}
+
+	if (available != NULL)
+		*available = avail - n;
+	return n;
+}
+
+/**
+ * Start to dequeue several objects from the ring.
+ * Note that no actual objects are copied from the queue by this function.
+ * User has to copy objects from the queue using the returned pointers.
+ * User should call rte_ring_dequeue_zc_elem_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_bulk_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return __rte_ring_do_dequeue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_FIXED, zcd, available);
+}
+
+/**
+ * Start to dequeue several pointers to objects from the ring.
+ * Note that no actual pointers are removed from the queue by this function.
+ * User has to copy pointers to objects from the queue using the
+ * returned pointers.
+ * User should call rte_ring_dequeue_zc_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_bulk_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return rte_ring_dequeue_zc_bulk_elem_start(r, sizeof(uintptr_t),
+		n, zcd, available);
+}
+
+/**
+ * Start to dequeue several objects from the ring.
+ * Note that no actual objects are copied from the queue by this function.
+ * User has to copy objects from the queue using the returned pointers.
+ * User should call rte_ring_dequeue_zc_elem_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to dequeue from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_burst_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return __rte_ring_do_dequeue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_VARIABLE, zcd, available);
+}
+
+/**
+ * Start to dequeue several pointers to objects from the ring.
+ * Note that no actual pointers are removed from the queue by this function.
+ * User has to copy pointers to objects from the queue using the
+ * returned pointers.
+ * User should call rte_ring_dequeue_zc_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_burst_start(struct rte_ring *r, unsigned int n,
+		struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return rte_ring_dequeue_zc_burst_elem_start(r, sizeof(uintptr_t), n,
+			zcd, available);
+}
+
+/**
+ * Complete dequeuing several objects from the ring.
+ * Note that number of objects to dequeue should not exceed previous
+ * dequeue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_dequeue_zc_elem_finish(struct rte_ring *r, unsigned int n)
+{
+	uint32_t tail;
+
+	switch (r->cons.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_st_get_tail(&r->cons, &tail, n);
+		__rte_ring_st_set_head_tail(&r->cons, tail, n, 0);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_get_tail(&r->hts_cons, &tail, n);
+		__rte_ring_hts_set_head_tail(&r->hts_cons, tail, n, 0);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+	}
+}
+
+/**
+ * Complete dequeuing several pointers to objects from the ring.
+ * Note that number of objects to dequeue should not exceed previous
+ * dequeue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_dequeue_zc_finish(struct rte_ring *r, unsigned int n)
+{
+	rte_ring_dequeue_zc_elem_finish(r, n);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RING_PEEK_ZC_H_ */
-- 
2.17.1


* [dpdk-dev] [PATCH v3 3/5] test/ring: move common function to header file
  2020-10-23  4:43 ` [dpdk-dev] [PATCH v3 0/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 1/5] test/ring: fix the memory dump size Honnappa Nagarahalli
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 2/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
@ 2020-10-23  4:43   ` Honnappa Nagarahalli
  2020-10-23 14:22     ` Ananyev, Konstantin
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 4/5] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 5/5] test/ring: add stress " Honnappa Nagarahalli
  4 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-23  4:43 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev
  Cc: olivier.matz, david.marchand, dharmik.thakkar, ruifeng.wang, nd

Move test_ring_inc_ptr to the header file so that it can be used by
functions in other files.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
---
 app/test/test_ring.c | 11 -----------
 app/test/test_ring.h | 11 +++++++++++
 2 files changed, 11 insertions(+), 11 deletions(-)
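
For readers unfamiliar with the helper being moved, its pointer
arithmetic can be exercised standalone as below. inc_ptr is a
hypothetical copy made for illustration; as in the test code, an esize
of -1 selects the legacy pointer-sized element and any other value is an
element size in bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Advance 'obj' by n elements of 'esize' bytes (-1: pointer-sized) */
static inline void **
inc_ptr(void **obj, int esize, unsigned int n)
{
	if (esize == -1)
		return obj + n;

	/* Step in uint32_t units, since esize is a multiple of 4 */
	return (void **)((uint32_t *)obj + (n * esize / sizeof(uint32_t)));
}
```

Declaring the helper static inline in the header lets each test source
file get its own copy without multiple-definition errors at link time.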

diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index a62cb263b..329d538a9 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -243,17 +243,6 @@ test_ring_deq_impl(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			NULL);
 }
 
-static void**
-test_ring_inc_ptr(void **obj, int esize, unsigned int n)
-{
-	/* Legacy queue APIs? */
-	if ((esize) == -1)
-		return ((void **)obj) + n;
-	else
-		return (void **)(((uint32_t *)obj) +
-					(n * esize / sizeof(uint32_t)));
-}
-
 static void
 test_ring_mem_init(void *obj, unsigned int count, int esize)
 {
diff --git a/app/test/test_ring.h b/app/test/test_ring.h
index d4b15af7c..16697ee02 100644
--- a/app/test/test_ring.h
+++ b/app/test/test_ring.h
@@ -42,6 +42,17 @@ test_ring_create(const char *name, int esize, unsigned int count,
 						(socket_id), (flags));
 }
 
+static inline void**
+test_ring_inc_ptr(void **obj, int esize, unsigned int n)
+{
+	/* Legacy queue APIs? */
+	if ((esize) == -1)
+		return ((void **)obj) + n;
+	else
+		return (void **)(((uint32_t *)obj) +
+					(n * esize / sizeof(uint32_t)));
+}
+
 static __rte_always_inline unsigned int
 test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			unsigned int api_type)
-- 
2.17.1


* [dpdk-dev] [PATCH v3 4/5] test/ring: add functional tests for zero copy APIs
  2020-10-23  4:43 ` [dpdk-dev] [PATCH v3 0/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
                     ` (2 preceding siblings ...)
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 3/5] test/ring: move common function to header file Honnappa Nagarahalli
@ 2020-10-23  4:43   ` Honnappa Nagarahalli
  2020-10-23 14:20     ` Ananyev, Konstantin
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 5/5] test/ring: add stress " Honnappa Nagarahalli
  4 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-23  4:43 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev
  Cc: olivier.matz, david.marchand, dharmik.thakkar, ruifeng.wang, nd

Add functional tests for zero copy APIs. Test enqueue/dequeue
functions are created using the zero copy APIs to fit into
the existing testing method.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
---
 app/test/test_ring.c | 196 +++++++++++++++++++++++++++++++++++++++++++
 app/test/test_ring.h |  42 ++++++++++
 2 files changed, 238 insertions(+)
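
The wrappers in this patch rely on test_ring_copy_to and
test_ring_copy_from, which are added to test_ring.h. A minimal sketch of
how such a copy helper could handle the two segments returned by the
start call is shown below. It is an assumption-laden illustration, not
the code from the patch: struct zc_span and copy_to_span are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the zero-copy descriptor */
struct zc_span {
	void *ptr1;      /* first contiguous segment */
	void *ptr2;      /* second segment on wrap-around */
	unsigned int n1; /* number of elements that fit at ptr1 */
};

/* Copy num elements of esize bytes from src into the reserved space */
static void
copy_to_span(const struct zc_span *zcd, const void *src,
	unsigned int esize, unsigned int num)
{
	assert(zcd->n1 <= num);

	/* First segment: up to the end of the ring storage */
	memcpy(zcd->ptr1, src, (size_t)zcd->n1 * esize);

	/* Second segment: the remainder, from the start of the storage */
	if (zcd->n1 != num)
		memcpy(zcd->ptr2,
			(const uint8_t *)src + (size_t)zcd->n1 * esize,
			(size_t)(num - zcd->n1) * esize);
}
```

A matching copy-from helper would swap source and destination in the two
memcpy calls.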

diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index 329d538a9..99fe4b46f 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2020 Arm Limited
  */
 
 #include <string.h>
@@ -68,6 +69,149 @@
 
 static const int esize[] = {-1, 4, 8, 16, 20};
 
+/* Wrappers around the zero-copy APIs. The wrappers match
+ * the normal enqueue/dequeue API declarations.
+ */
+static unsigned int
+test_ring_enqueue_zc_bulk(struct rte_ring *r, void * const *obj_table,
+	unsigned int n, unsigned int *free_space)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_bulk_start(r, n, &zcd, free_space);
+	if (ret > 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_enqueue_zc_bulk_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_bulk_elem_start(r, esize, n,
+				&zcd, free_space);
+	if (ret > 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, esize, ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_enqueue_zc_burst(struct rte_ring *r, void * const *obj_table,
+	unsigned int n, unsigned int *free_space)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_burst_start(r, n, &zcd, free_space);
+	if (ret > 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_enqueue_zc_burst_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_burst_elem_start(r, esize, n,
+				&zcd, free_space);
+	if (ret > 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, esize, ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_bulk(struct rte_ring *r, void **obj_table,
+	unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_bulk_start(r, n, &zcd, available);
+	if (ret > 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_bulk_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_bulk_elem_start(r, esize, n,
+				&zcd, available);
+	if (ret > 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, esize, ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_burst(struct rte_ring *r, void **obj_table,
+	unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_burst_start(r, n, &zcd, available);
+	if (ret > 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_burst_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_burst_elem_start(r, esize, n,
+				&zcd, available);
+	if (ret > 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, esize, ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
 static const struct {
 	const char *desc;
 	uint32_t api_type;
@@ -219,6 +363,58 @@ static const struct {
 			.felem = rte_ring_dequeue_burst_elem,
 		},
 	},
+	{
+		.desc = "SP/SC sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_SPSC,
+		.create_flags = RING_F_SP_ENQ | RING_F_SC_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_bulk,
+			.felem = test_ring_enqueue_zc_bulk_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_bulk,
+			.felem = test_ring_dequeue_zc_bulk_elem,
+		},
+	},
+	{
+		.desc = "MP_HTS/MC_HTS sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_bulk,
+			.felem = test_ring_enqueue_zc_bulk_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_bulk,
+			.felem = test_ring_dequeue_zc_bulk_elem,
+		},
+	},
+	{
+		.desc = "SP/SC sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_SPSC,
+		.create_flags = RING_F_SP_ENQ | RING_F_SC_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_burst,
+			.felem = test_ring_enqueue_zc_burst_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_burst,
+			.felem = test_ring_dequeue_zc_burst_elem,
+		},
+	},
+	{
+		.desc = "MP_HTS/MC_HTS sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_burst,
+			.felem = test_ring_enqueue_zc_burst_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_burst,
+			.felem = test_ring_dequeue_zc_burst_elem,
+		},
+	}
 };
 
 static unsigned int
diff --git a/app/test/test_ring.h b/app/test/test_ring.h
index 16697ee02..33c8a31fe 100644
--- a/app/test/test_ring.h
+++ b/app/test/test_ring.h
@@ -53,6 +53,48 @@ test_ring_inc_ptr(void **obj, int esize, unsigned int n)
 					(n * esize / sizeof(uint32_t)));
 }
 
+static inline void
+test_ring_mem_copy(void *dst, void * const *src, int esize, unsigned int num)
+{
+	size_t temp_sz;
+
+	temp_sz = num * sizeof(void *);
+	if (esize != -1)
+		temp_sz = esize * num;
+
+	memcpy(dst, src, temp_sz);
+}
+
+/* Copy to the ring memory */
+static inline void
+test_ring_copy_to(struct rte_ring_zc_data *zcd, void * const *src, int esize,
+	unsigned int num)
+{
+	test_ring_mem_copy(zcd->ptr1, src, esize, zcd->n1);
+	if (zcd->n1 != num) {
+		if (esize == -1)
+			src = src + zcd->n1;
+		else
+			src = (void * const *)(((const uint32_t *)src) +
+					(zcd->n1 * esize / sizeof(uint32_t)));
+		test_ring_mem_copy(zcd->ptr2, src,
+					esize, num - zcd->n1);
+	}
+}
+
+/* Copy from the ring memory */
+static inline void
+test_ring_copy_from(struct rte_ring_zc_data *zcd, void *dst, int esize,
+	unsigned int num)
+{
+	test_ring_mem_copy(dst, zcd->ptr1, esize, zcd->n1);
+
+	if (zcd->n1 != num) {
+		dst = test_ring_inc_ptr(dst, esize, zcd->n1);
+		test_ring_mem_copy(dst, zcd->ptr2, esize, num - zcd->n1);
+	}
+}
+
 static __rte_always_inline unsigned int
 test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			unsigned int api_type)
-- 
2.17.1



* [dpdk-dev] [PATCH v3 5/5] test/ring: add stress tests for zero copy APIs
  2020-10-23  4:43 ` [dpdk-dev] [PATCH v3 0/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
                     ` (3 preceding siblings ...)
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 4/5] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
@ 2020-10-23  4:43   ` Honnappa Nagarahalli
  2020-10-23 14:11     ` Ananyev, Konstantin
  4 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-23  4:43 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev
  Cc: olivier.matz, david.marchand, dharmik.thakkar, ruifeng.wang, nd

Add stress tests for zero copy API.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
---
 app/test/meson.build                   |  2 +
 app/test/test_ring_mt_peek_stress_zc.c | 56 ++++++++++++++++++++++
 app/test/test_ring_st_peek_stress_zc.c | 65 ++++++++++++++++++++++++++
 app/test/test_ring_stress.c            |  6 +++
 app/test/test_ring_stress.h            |  2 +
 5 files changed, 131 insertions(+)
 create mode 100644 app/test/test_ring_mt_peek_stress_zc.c
 create mode 100644 app/test/test_ring_st_peek_stress_zc.c

diff --git a/app/test/meson.build b/app/test/meson.build
index 8bfb02890..88c831a92 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -108,9 +108,11 @@ test_sources = files('commands.c',
 	'test_ring_mpmc_stress.c',
 	'test_ring_hts_stress.c',
 	'test_ring_mt_peek_stress.c',
+	'test_ring_mt_peek_stress_zc.c',
 	'test_ring_perf.c',
 	'test_ring_rts_stress.c',
 	'test_ring_st_peek_stress.c',
+	'test_ring_st_peek_stress_zc.c',
 	'test_ring_stress.c',
 	'test_rwlock.c',
 	'test_sched.c',
diff --git a/app/test/test_ring_mt_peek_stress_zc.c b/app/test/test_ring_mt_peek_stress_zc.c
new file mode 100644
index 000000000..7e0bd511a
--- /dev/null
+++ b/app/test/test_ring_mt_peek_stress_zc.c
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Arm Limited
+ */
+
+#include "test_ring.h"
+#include "test_ring_stress_impl.h"
+#include <rte_ring_elem.h>
+
+static inline uint32_t
+_st_ring_dequeue_bulk(struct rte_ring *r, void **obj, uint32_t n,
+	uint32_t *avail)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	m = rte_ring_dequeue_zc_bulk_start(r, n, &zcd, avail);
+	n = (m == n) ? n : 0;
+	if (n != 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj, -1, n);
+		rte_ring_dequeue_zc_finish(r, n);
+	}
+
+	return n;
+}
+
+static inline uint32_t
+_st_ring_enqueue_bulk(struct rte_ring *r, void * const *obj, uint32_t n,
+	uint32_t *free)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	m = rte_ring_enqueue_zc_bulk_start(r, n, &zcd, free);
+	n = (m == n) ? n : 0;
+	if (n != 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj, -1, n);
+		rte_ring_enqueue_zc_finish(r, n);
+	}
+
+	return n;
+}
+
+static int
+_st_ring_init(struct rte_ring *r, const char *name, uint32_t num)
+{
+	return rte_ring_init(r, name, num,
+		RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ);
+}
+
+const struct test test_ring_mt_peek_stress_zc = {
+	.name = "MT_PEEK_ZC",
+	.nb_case = RTE_DIM(tests),
+	.cases = tests,
+};
diff --git a/app/test/test_ring_st_peek_stress_zc.c b/app/test/test_ring_st_peek_stress_zc.c
new file mode 100644
index 000000000..2933e30bf
--- /dev/null
+++ b/app/test/test_ring_st_peek_stress_zc.c
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Arm Limited
+ */
+
+#include "test_ring.h"
+#include "test_ring_stress_impl.h"
+#include <rte_ring_elem.h>
+
+static inline uint32_t
+_st_ring_dequeue_bulk(struct rte_ring *r, void **obj, uint32_t n,
+	uint32_t *avail)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	static rte_spinlock_t lck = RTE_SPINLOCK_INITIALIZER;
+
+	rte_spinlock_lock(&lck);
+
+	m = rte_ring_dequeue_zc_bulk_start(r, n, &zcd, avail);
+	n = (m == n) ? n : 0;
+	if (n != 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj, -1, n);
+		rte_ring_dequeue_zc_finish(r, n);
+	}
+
+	rte_spinlock_unlock(&lck);
+	return n;
+}
+
+static inline uint32_t
+_st_ring_enqueue_bulk(struct rte_ring *r, void * const *obj, uint32_t n,
+	uint32_t *free)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	static rte_spinlock_t lck = RTE_SPINLOCK_INITIALIZER;
+
+	rte_spinlock_lock(&lck);
+
+	m = rte_ring_enqueue_zc_bulk_start(r, n, &zcd, free);
+	n = (m == n) ? n : 0;
+	if (n != 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj, -1, n);
+		rte_ring_enqueue_zc_finish(r, n);
+	}
+
+	rte_spinlock_unlock(&lck);
+	return n;
+}
+
+static int
+_st_ring_init(struct rte_ring *r, const char *name, uint32_t num)
+{
+	return rte_ring_init(r, name, num, RING_F_SP_ENQ | RING_F_SC_DEQ);
+}
+
+const struct test test_ring_st_peek_stress_zc = {
+	.name = "ST_PEEK_ZC",
+	.nb_case = RTE_DIM(tests),
+	.cases = tests,
+};
diff --git a/app/test/test_ring_stress.c b/app/test/test_ring_stress.c
index c4f82ea56..1af45e0fc 100644
--- a/app/test/test_ring_stress.c
+++ b/app/test/test_ring_stress.c
@@ -49,9 +49,15 @@ test_ring_stress(void)
 	n += test_ring_mt_peek_stress.nb_case;
 	k += run_test(&test_ring_mt_peek_stress);
 
+	n += test_ring_mt_peek_stress_zc.nb_case;
+	k += run_test(&test_ring_mt_peek_stress_zc);
+
 	n += test_ring_st_peek_stress.nb_case;
 	k += run_test(&test_ring_st_peek_stress);
 
+	n += test_ring_st_peek_stress_zc.nb_case;
+	k += run_test(&test_ring_st_peek_stress_zc);
+
 	printf("Number of tests:\t%u\nSuccess:\t%u\nFailed:\t%u\n",
 		n, k, n - k);
 	return (k != n);
diff --git a/app/test/test_ring_stress.h b/app/test/test_ring_stress.h
index c85d6fa92..416d68c9a 100644
--- a/app/test/test_ring_stress.h
+++ b/app/test/test_ring_stress.h
@@ -36,4 +36,6 @@ extern const struct test test_ring_mpmc_stress;
 extern const struct test test_ring_rts_stress;
 extern const struct test test_ring_hts_stress;
 extern const struct test test_ring_mt_peek_stress;
+extern const struct test test_ring_mt_peek_stress_zc;
 extern const struct test test_ring_st_peek_stress;
+extern const struct test test_ring_st_peek_stress_zc;
-- 
2.17.1



* Re: [dpdk-dev] [PATCH v3 1/5] test/ring: fix the memory dump size
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 1/5] test/ring: fix the memory dump size Honnappa Nagarahalli
@ 2020-10-23 13:24     ` Ananyev, Konstantin
  0 siblings, 0 replies; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-23 13:24 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev
  Cc: olivier.matz, david.marchand, dharmik.thakkar, ruifeng.wang, nd, stable


> Pass the correct number of bytes to dump the memory.
> 
> Fixes: bf28df24e915 ("test/ring: add contention stress test")
> Cc: konstantin.ananyev@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> ---
>  app/test/test_ring_stress_impl.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/app/test/test_ring_stress_impl.h b/app/test/test_ring_stress_impl.h
> index 3b9a480eb..f9ca63b90 100644
> --- a/app/test/test_ring_stress_impl.h
> +++ b/app/test/test_ring_stress_impl.h
> @@ -159,7 +159,7 @@ check_updt_elem(struct ring_elem *elm[], uint32_t num,
>  				"offending object: %p\n",
>  				__func__, rte_lcore_id(), num, i, elm[i]);
>  			rte_memdump(stdout, "expected", check, sizeof(*check));
> -			rte_memdump(stdout, "result", elm[i], sizeof(elm[i]));
> +			rte_memdump(stdout, "result", elm[i], sizeof(*elm[i]));
>  			rte_spinlock_unlock(&dump_lock);
>  			return -EINVAL;
>  		}
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
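
To illustrate why the one-character change matters: sizeof(elm[i]) is the size of a pointer, while sizeof(*elm[i]) is the size of the element it points to. A minimal sketch, using a hypothetical 64-byte struct ring_elem_demo in place of the test's real struct ring_elem (the helper names are also made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the stress test's element type (64 bytes). */
struct ring_elem_demo {
	uint32_t cnt[16];
};

static_assert(sizeof(struct ring_elem_demo) == 64, "demo element is 64 bytes");

/* Length the old code passed to rte_memdump(): size of one pointer. */
static inline size_t
dump_len_before(struct ring_elem_demo *elm[], uint32_t i)
{
	return sizeof(elm[i]);
}

/* Length after the fix: size of the pointed-to element. */
static inline size_t
dump_len_after(struct ring_elem_demo *elm[], uint32_t i)
{
	return sizeof(*elm[i]);
}
```

On a typical 64-bit target the first helper yields 8 and the second 64, so the old call dumped only the first 8 bytes of the offending object.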

> 2.17.1



* Re: [dpdk-dev] [PATCH v3 2/5] lib/ring: add zero copy APIs
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 2/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
@ 2020-10-23 13:59     ` Ananyev, Konstantin
  2020-10-24 15:45       ` Honnappa Nagarahalli
  0 siblings, 1 reply; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-23 13:59 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev
  Cc: olivier.matz, david.marchand, dharmik.thakkar, ruifeng.wang, nd


> 
> Add zero-copy APIs. These APIs provide the capability to
> copy the data to/from the ring memory directly, without
> having a temporary copy (for ex: an array of mbufs on
> the stack). Use cases that involve copying large amount
> of data to/from the ring can benefit from these APIs.

LGTM in general.
A few nits, see below.

> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> ---
>  lib/librte_ring/meson.build        |   1 +
>  lib/librte_ring/rte_ring_elem.h    |   1 +
>  lib/librte_ring/rte_ring_peek_zc.h | 542 +++++++++++++++++++++++++++++
>  3 files changed, 544 insertions(+)
>  create mode 100644 lib/librte_ring/rte_ring_peek_zc.h

The documentation also needs updating: PG (programmer's guide) and RN (release notes).

> 
> diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> index 31c0b4649..36fdcb6a5 100644
> --- a/lib/librte_ring/meson.build
> +++ b/lib/librte_ring/meson.build
> @@ -11,5 +11,6 @@ headers = files('rte_ring.h',
>  		'rte_ring_hts_c11_mem.h',
>  		'rte_ring_peek.h',
>  		'rte_ring_peek_c11_mem.h',
> +		'rte_ring_peek_zc.h',
>  		'rte_ring_rts.h',
>  		'rte_ring_rts_c11_mem.h')
> diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
> index 938b398fc..7034d29c0 100644
> --- a/lib/librte_ring/rte_ring_elem.h
> +++ b/lib/librte_ring/rte_ring_elem.h
> @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
> 
>  #ifdef ALLOW_EXPERIMENTAL_API
>  #include <rte_ring_peek.h>
> +#include <rte_ring_peek_zc.h>
>  #endif
> 
>  #include <rte_ring.h>
> diff --git a/lib/librte_ring/rte_ring_peek_zc.h b/lib/librte_ring/rte_ring_peek_zc.h
> new file mode 100644
> index 000000000..9db2d343f
> --- /dev/null
> +++ b/lib/librte_ring/rte_ring_peek_zc.h
> @@ -0,0 +1,542 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + *
> + * Copyright (c) 2020 Arm Limited
> + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> + * All rights reserved.
> + * Derived from FreeBSD's bufring.h
> + * Used as BSD-3 Licensed with permission from Kip Macy.
> + */
> +
> +#ifndef _RTE_RING_PEEK_ZC_H_
> +#define _RTE_RING_PEEK_ZC_H_
> +
> +/**
> + * @file
> + * @b EXPERIMENTAL: this API may change without prior notice
> + * It is not recommended to include this file directly.
> + * Please include <rte_ring_elem.h> instead.
> + *
> + * Ring Peek Zero Copy APIs
> + * These APIs make it possible to split public enqueue/dequeue API
> + * into 3 parts:
> + * - enqueue/dequeue start
> + * - copy data to/from the ring
> + * - enqueue/dequeue finish
> + * Along with the advantages of the peek APIs, these APIs provide the ability
> + * to avoid copying of the data to temporary area (for ex: array of mbufs
> + * on the stack).
> + *
> + * Note that currently these APIs are available only for two sync modes:
> + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> + * It is user's responsibility to create/init ring with appropriate sync
> + * modes selected.
> + *
> + * Following are some examples showing the API usage.
> + * 1)
> + * struct elem_obj {uint64_t a; uint32_t b, c;};
> + * struct elem_obj *obj;
> + *
> + * // Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS
> + * // Reserve space on the ring
> + * n = rte_ring_enqueue_zc_bulk_elem_start(r, sizeof(elem_obj), 1, &zcd, NULL);
> + *
> + * // Produce the data directly on the ring memory
> + * obj = (struct elem_obj *)zcd->ptr1;
> + * obj.a = rte_get_a();

As obj is a pointer, should be obj->a = ...
Same for b and c.

> + * obj.b = rte_get_b();
> + * obj.c = rte_get_c();
> + * rte_ring_enqueue_zc_elem_finish(ring, n);
> + *
> + * 2)
> + * // Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS
> + * // Reserve space on the ring
> + * n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
> + *
> + * // Pkt I/O core polls packets from the NIC
> + * if (n == 32)
> + *	nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, 32);
> + * else
> + *	nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, zcd->n1);

Hmm, that doesn't look exactly correct to me.
It could be that n == 32, but we still need to do wrap-around.
Shouldn't it be:

if (n != 0) {
	nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, zcd->n1);
	if (nb_rx == zcd->n1 && nb_rx != n)
		nb_rx += rte_eth_rx_burst(portid, queueid, zcd->ptr2, n - nb_rx);
}
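
The same two-step fill can be sketched outside DPDK. This is only an illustration under stated assumptions: rx_fn, budget_rx and fill_reserved are hypothetical names, and the callback stands in for rte_eth_rx_burst so that the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receive callback: writes up to 'max' pointers, returns count. */
typedef unsigned int (*rx_fn)(void **pkts, unsigned int max, void *ctx);

/* Toy packet source for illustration: hands out at most '*ctx' packets. */
static unsigned int
budget_rx(void **pkts, unsigned int max, void *ctx)
{
	unsigned int *budget = ctx;
	unsigned int i, n = (*budget < max) ? *budget : max;

	for (i = 0; i < n; i++)
		pkts[i] = NULL;	/* placeholder packet pointer */
	*budget -= n;
	return n;
}

/*
 * Fill 'n' reserved slots that may be split in two regions:
 * 'n1' slots at ptr1 and, on wrap-around, 'n - n1' slots at ptr2.
 * Returns the number of slots actually filled.
 */
static inline unsigned int
fill_reserved(void **ptr1, unsigned int n1, void **ptr2, unsigned int n,
	rx_fn rx, void *ctx)
{
	unsigned int nb_rx;

	assert(n1 <= n);
	if (n == 0)
		return 0;

	nb_rx = rx(ptr1, n1, ctx);
	/* Continue into the second region only if the first one was filled. */
	if (nb_rx == n1 && nb_rx != n)
		nb_rx += rx(ptr2, n - nb_rx, ctx);

	return nb_rx;
}
```

The return value is what would be passed to the _finish_ call, so a short first burst never leaves a hole between the two regions.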

> + *
> + * // Provide packets to the packet processing cores
> + * rte_ring_enqueue_zc_finish(r, nb_rx);
> + *
> + * Note that between _start_ and _finish_ no other thread can proceed
> + * with an enqueue/dequeue operation until _finish_ completes.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_ring_peek_c11_mem.h>
> +
> +/**
> + * Ring zero-copy information structure.
> + *
> + * This structure contains the pointers and length of the space
> + * reserved on the ring storage.
> + */
> +struct rte_ring_zc_data {
> +	/* Pointer to the first space in the ring */
> +	void **ptr1;

Why not just 'void *ptr1;'?
Same for ptr2.

> +	/* Pointer to the second space in the ring if there is wrap-around */
> +	void **ptr2;
> +	/* Number of elements in the first pointer. If this is equal to
> +	 * the number of elements requested, then ptr2 is NULL.
> +	 * Otherwise, subtracting n1 from number of elements requested
> +	 * will give the number of elements available at ptr2.
> +	 */
> +	unsigned int n1;
> +} __rte_cache_aligned;
> +
> +static __rte_always_inline void
> +__rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
> +	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void **dst2)
> +{
> +	uint32_t idx, scale, nr_idx;
> +	uint32_t *ring = (uint32_t *)&r[1];
> +
> +	/* Normalize to uint32_t */
> +	scale = esize / sizeof(uint32_t);
> +	idx = head & r->mask;
> +	nr_idx = idx * scale;
> +
> +	*dst1 = ring + nr_idx;
> +	*n1 = num;
> +
> +	if (idx + num > r->size) {
> +		*n1 = r->size - idx;
> +		*dst2 = ring;
> +	}

This seems to be missing:
else {*dst2 = NULL;}
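
A minimal standalone sketch of the address computation with that branch folded in. It assumes a power-of-two ring size and an element size that is a multiple of 4 bytes; struct zc_span and ring_get_elem_addr are hypothetical names, not the DPDK definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the zero-copy descriptor; not the DPDK struct. */
struct zc_span {
	void *ptr1;       /* first contiguous region */
	void *ptr2;       /* second region after wrap-around, or NULL */
	unsigned int n1;  /* number of elements available at ptr1 */
};

/*
 * Split a reservation of 'num' elements starting at 'head' into at most
 * two contiguous regions of a ring with 'size' slots (a power of two).
 * 'esize' is the element size in bytes and must be a multiple of 4.
 */
static inline void
ring_get_elem_addr(uint32_t *ring, uint32_t size, uint32_t head,
	uint32_t esize, uint32_t num, struct zc_span *zs)
{
	uint32_t idx = head & (size - 1);
	uint32_t scale = esize / sizeof(uint32_t);

	assert(esize % sizeof(uint32_t) == 0 && num <= size);

	zs->ptr1 = ring + (size_t)idx * scale;
	zs->n1 = num;
	zs->ptr2 = NULL;	/* always initialised, even without wrap-around */

	if (idx + num > size) {
		zs->n1 = size - idx;
		zs->ptr2 = ring;
	}
}
```

Setting ptr2 unconditionally means a caller can test it for NULL instead of comparing n1 with the requested count.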

> +}
> +
> +/**
> + * @internal This function moves prod head value.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_do_enqueue_zc_elem_start(struct rte_ring *r, unsigned int esize,
> +		uint32_t n, enum rte_ring_queue_behavior behavior,
> +		struct rte_ring_zc_data *zcd, unsigned int *free_space)
> +{
> +	uint32_t free, head, next;
> +
> +	switch (r->prod.sync_type) {
> +	case RTE_RING_SYNC_ST:
> +		n = __rte_ring_move_prod_head(r, RTE_RING_SYNC_ST, n,
> +			behavior, &head, &next, &free);
> +		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&zcd->ptr1,

If you change ptr1, ptr2 to be just 'void *', then probably no extra type-cast
will be needed here. 

> +			&zcd->n1, (void **)&zcd->ptr2);
> +		break;
> +	case RTE_RING_SYNC_MT_HTS:
> +		n = __rte_ring_hts_move_prod_head(r, n, behavior, &head, &free);
> +		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&zcd->ptr1,
> +			&zcd->n1, (void **)&zcd->ptr2);
> +		break;
> +	case RTE_RING_SYNC_MT:
> +	case RTE_RING_SYNC_MT_RTS:
> +	default:
> +		/* unsupported mode, shouldn't be here */
> +		RTE_ASSERT(0);
> +		n = 0;
> +		free = 0;
> +	}

Would it make sense to move __rte_ring_get_elem_addr() here and do it
only when n != 0?
I.e.:

if (n != 0)
	__rte_ring_get_elem_addr(...);
	
The same comment applies to the _dequeue_ counterpart.

> +
> +	if (free_space != NULL)
> +		*free_space = free - n;
> +	return n;
> +}
> +


* Re: [dpdk-dev] [PATCH v3 5/5] test/ring: add stress tests for zero copy APIs
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 5/5] test/ring: add stress " Honnappa Nagarahalli
@ 2020-10-23 14:11     ` Ananyev, Konstantin
  0 siblings, 0 replies; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-23 14:11 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev
  Cc: olivier.matz, david.marchand, dharmik.thakkar, ruifeng.wang, nd



> 
> Add stress tests for zero copy API.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> ---
>  app/test/meson.build                   |  2 +
>  app/test/test_ring_mt_peek_stress_zc.c | 56 ++++++++++++++++++++++
>  app/test/test_ring_st_peek_stress_zc.c | 65 ++++++++++++++++++++++++++
>  app/test/test_ring_stress.c            |  6 +++
>  app/test/test_ring_stress.h            |  2 +
>  5 files changed, 131 insertions(+)
>  create mode 100644 app/test/test_ring_mt_peek_stress_zc.c
>  create mode 100644 app/test/test_ring_st_peek_stress_zc.c
> 
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 2.17.1



* Re: [dpdk-dev] [PATCH v3 4/5] test/ring: add functional tests for zero copy APIs
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 4/5] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
@ 2020-10-23 14:20     ` Ananyev, Konstantin
  2020-10-23 22:47       ` Honnappa Nagarahalli
  0 siblings, 1 reply; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-23 14:20 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev
  Cc: olivier.matz, david.marchand, dharmik.thakkar, ruifeng.wang, nd


> 
> Add functional tests for zero copy APIs. Test enqueue/dequeue
> functions are created using the zero copy APIs to fit into
> the existing testing method.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> ---
>  app/test/test_ring.c | 196 +++++++++++++++++++++++++++++++++++++++++++
>  app/test/test_ring.h |  42 ++++++++++
>  2 files changed, 238 insertions(+)

....

> diff --git a/app/test/test_ring.h b/app/test/test_ring.h
> index 16697ee02..33c8a31fe 100644
> --- a/app/test/test_ring.h
> +++ b/app/test/test_ring.h
> @@ -53,6 +53,48 @@ test_ring_inc_ptr(void **obj, int esize, unsigned int n)
>  					(n * esize / sizeof(uint32_t)));
>  }
> 
> +static inline void
> +test_ring_mem_copy(void *dst, void * const *src, int esize, unsigned int num)
> +{
> +	size_t temp_sz;
> +
> +	temp_sz = num * sizeof(void *);
> +	if (esize != -1)
> +		temp_sz = esize * num;
> +
> +	memcpy(dst, src, temp_sz);
> +}
> +
> +/* Copy to the ring memory */
> +static inline void
> +test_ring_copy_to(struct rte_ring_zc_data *zcd, void * const *src, int esize,
> +	unsigned int num)
> +{
> +	test_ring_mem_copy(zcd->ptr1, src, esize, zcd->n1);
> +	if (zcd->n1 != num) {
> +		if (esize == -1)
> +			src = src + zcd->n1;
> +		else
> +			src = (void * const *)(((const uint32_t *)src) +
> +					(zcd->n1 * esize / sizeof(uint32_t)));

Why not just:
src = test_ring_inc_ptr(src, esize, zcd->n1);
?

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> +		test_ring_mem_copy(zcd->ptr2, src,
> +					esize, num - zcd->n1);
> +	}
> +}
> +
> +/* Copy from the ring memory */
> +static inline void
> +test_ring_copy_from(struct rte_ring_zc_data *zcd, void *dst, int esize,
> +	unsigned int num)
> +{
> +	test_ring_mem_copy(dst, zcd->ptr1, esize, zcd->n1);
> +
> +	if (zcd->n1 != num) {
> +		dst = test_ring_inc_ptr(dst, esize, zcd->n1);
> +		test_ring_mem_copy(dst, zcd->ptr2, esize, num - zcd->n1);
> +	}
> +}
> +
>  static __rte_always_inline unsigned int
>  test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
>  			unsigned int api_type)
> --
> 2.17.1



* Re: [dpdk-dev] [PATCH v3 3/5] test/ring: move common function to header file
  2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 3/5] test/ring: move common function to header file Honnappa Nagarahalli
@ 2020-10-23 14:22     ` Ananyev, Konstantin
  2020-10-23 23:54       ` Honnappa Nagarahalli
  0 siblings, 1 reply; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-23 14:22 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev
  Cc: olivier.matz, david.marchand, dharmik.thakkar, ruifeng.wang, nd

> Move test_ring_inc_ptr to header file so that it can be used by
> functions in other files.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> ---
>  app/test/test_ring.c | 11 -----------
>  app/test/test_ring.h | 11 +++++++++++
>  2 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/app/test/test_ring.c b/app/test/test_ring.c
> index a62cb263b..329d538a9 100644
> --- a/app/test/test_ring.c
> +++ b/app/test/test_ring.c
> @@ -243,17 +243,6 @@ test_ring_deq_impl(struct rte_ring *r, void **obj, int esize, unsigned int n,
>  			NULL);
>  }
> 
> -static void**
> -test_ring_inc_ptr(void **obj, int esize, unsigned int n)
> -{
> -	/* Legacy queue APIs? */
> -	if ((esize) == -1)
> -		return ((void **)obj) + n;
> -	else
> -		return (void **)(((uint32_t *)obj) +
> -					(n * esize / sizeof(uint32_t)));
> -}
> -
>  static void
>  test_ring_mem_init(void *obj, unsigned int count, int esize)
>  {
> diff --git a/app/test/test_ring.h b/app/test/test_ring.h
> index d4b15af7c..16697ee02 100644
> --- a/app/test/test_ring.h
> +++ b/app/test/test_ring.h
> @@ -42,6 +42,17 @@ test_ring_create(const char *name, int esize, unsigned int count,
>  						(socket_id), (flags));
>  }
> 
> +static inline void**
> +test_ring_inc_ptr(void **obj, int esize, unsigned int n)
> +{
> +	/* Legacy queue APIs? */
> +	if ((esize) == -1)
> +		return ((void **)obj) + n;
> +	else
> +		return (void **)(((uint32_t *)obj) +
> +					(n * esize / sizeof(uint32_t)));
> +}

In all this pointer arithmetic, why do you need 'void **'?
Why not just 'void *', or even uintptr_t?


> +
>  static __rte_always_inline unsigned int
>  test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
>  			unsigned int api_type)
> --
> 2.17.1



* Re: [dpdk-dev] [PATCH v3 4/5] test/ring: add functional tests for zero copy APIs
  2020-10-23 14:20     ` Ananyev, Konstantin
@ 2020-10-23 22:47       ` Honnappa Nagarahalli
  0 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-23 22:47 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev
  Cc: olivier.matz, david.marchand, Dharmik Thakkar, Ruifeng Wang, nd,
	Honnappa Nagarahalli, nd

<snip>

> >
> > Add functional tests for zero copy APIs. Test enqueue/dequeue
> > functions are created using the zero copy APIs to fit into the
> > existing testing method.
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > ---
> >  app/test/test_ring.c | 196
> > +++++++++++++++++++++++++++++++++++++++++++
> >  app/test/test_ring.h |  42 ++++++++++
> >  2 files changed, 238 insertions(+)
> 
> ....
> 
> > diff --git a/app/test/test_ring.h b/app/test/test_ring.h index
> > 16697ee02..33c8a31fe 100644
> > --- a/app/test/test_ring.h
> > +++ b/app/test/test_ring.h
> > @@ -53,6 +53,48 @@ test_ring_inc_ptr(void **obj, int esize, unsigned int
> n)
> >  					(n * esize / sizeof(uint32_t)));  }
> >
> > +static inline void
> > +test_ring_mem_copy(void *dst, void * const *src, int esize, unsigned
> > +int num) {
> > +	size_t temp_sz;
> > +
> > +	temp_sz = num * sizeof(void *);
> > +	if (esize != -1)
> > +		temp_sz = esize * num;
> > +
> > +	memcpy(dst, src, temp_sz);
> > +}
> > +
> > +/* Copy to the ring memory */
> > +static inline void
> > +test_ring_copy_to(struct rte_ring_zc_data *zcd, void * const *src, int
> esize,
> > +	unsigned int num)
> > +{
> > +	test_ring_mem_copy(zcd->ptr1, src, esize, zcd->n1);
> > +	if (zcd->n1 != num) {
> > +		if (esize == -1)
> > +			src = src + zcd->n1;
> > +		else
> > +			src = (void * const *)(((const uint32_t *)src) +
> > +					(zcd->n1 * esize / sizeof(uint32_t)));
> 
> Why just not:
> src = test_ring_inc_ptr(src, esize, zcd->n1); ?
test_enqdeq_impl requires the enqueue APIs to take a 'const' pointer to the data being copied to the ring, so the 'src' parameter needs to be 'const'.
If I change test_ring_inc_ptr to take a const parameter, a lot of things in test_ring.c break, because test_ring_inc_ptr is called with a lot of non-const pointers.
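
For reference, one possible way around the const problem (not what this patch does) is a const-qualified helper plus a thin non-const wrapper, so existing callers stay unchanged. inc_ptr_const and inc_ptr_mut are hypothetical names used only in this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Advance a const pointer by 'n' elements; esize == -1 means legacy
 * (pointer-sized) elements, as in the test code.
 */
static inline const void *
inc_ptr_const(const void *obj, int esize, unsigned int n)
{
	size_t sz;

	assert(esize == -1 || esize > 0);
	sz = (esize == -1) ? sizeof(void *) : (size_t)esize;

	return (const uint8_t *)obj + (size_t)n * sz;
}

/* Non-const wrapper for callers that hold writable pointers. */
static inline void *
inc_ptr_mut(void *obj, int esize, unsigned int n)
{
	/* Dropping const is safe here: 'obj' was non-const to begin with. */
	return (void *)(uintptr_t)inc_ptr_const(obj, esize, n);
}
```

With such a pair, the open-coded arithmetic in test_ring_copy_to could call the const variant, at the cost of one more helper in the header.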

> 
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> 
> > +		test_ring_mem_copy(zcd->ptr2, src,
> > +					esize, num - zcd->n1);
> > +	}
> > +}
> > +
> > +/* Copy from the ring memory */
> > +static inline void
> > +test_ring_copy_from(struct rte_ring_zc_data *zcd, void *dst, int esize,
> > +	unsigned int num)
> > +{
> > +	test_ring_mem_copy(dst, zcd->ptr1, esize, zcd->n1);
> > +
> > +	if (zcd->n1 != num) {
> > +		dst = test_ring_inc_ptr(dst, esize, zcd->n1);
> > +		test_ring_mem_copy(dst, zcd->ptr2, esize, num - zcd->n1);
> > +	}
> > +}
> > +
> >  static __rte_always_inline unsigned int  test_ring_enqueue(struct
> > rte_ring *r, void **obj, int esize, unsigned int n,
> >  			unsigned int api_type)
> > --
> > 2.17.1



* Re: [dpdk-dev] [PATCH v3 3/5] test/ring: move common function to header file
  2020-10-23 14:22     ` Ananyev, Konstantin
@ 2020-10-23 23:54       ` Honnappa Nagarahalli
  2020-10-24  0:29         ` Stephen Hemminger
  0 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-23 23:54 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev
  Cc: olivier.matz, david.marchand, Dharmik Thakkar, Ruifeng Wang, nd,
	Honnappa Nagarahalli, nd

<snip>

> 
> > Move test_ring_inc_ptr to header file so that it can be used by
> > functions in other files.
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > ---
> >  app/test/test_ring.c | 11 -----------  app/test/test_ring.h | 11
> > +++++++++++
> >  2 files changed, 11 insertions(+), 11 deletions(-)
> >
> > diff --git a/app/test/test_ring.c b/app/test/test_ring.c index
> > a62cb263b..329d538a9 100644
> > --- a/app/test/test_ring.c
> > +++ b/app/test/test_ring.c
> > @@ -243,17 +243,6 @@ test_ring_deq_impl(struct rte_ring *r, void **obj,
> int esize, unsigned int n,
> >  			NULL);
> >  }
> >
> > -static void**
> > -test_ring_inc_ptr(void **obj, int esize, unsigned int n) -{
> > -	/* Legacy queue APIs? */
> > -	if ((esize) == -1)
> > -		return ((void **)obj) + n;
> > -	else
> > -		return (void **)(((uint32_t *)obj) +
> > -					(n * esize / sizeof(uint32_t)));
> > -}
> > -
> >  static void
> >  test_ring_mem_init(void *obj, unsigned int count, int esize)  { diff
> > --git a/app/test/test_ring.h b/app/test/test_ring.h index
> > d4b15af7c..16697ee02 100644
> > --- a/app/test/test_ring.h
> > +++ b/app/test/test_ring.h
> > @@ -42,6 +42,17 @@ test_ring_create(const char *name, int esize,
> unsigned int count,
> >  						(socket_id), (flags));
> >  }
> >
> > +static inline void**
> > +test_ring_inc_ptr(void **obj, int esize, unsigned int n) {
> > +	/* Legacy queue APIs? */
> > +	if ((esize) == -1)
> > +		return ((void **)obj) + n;
> > +	else
> > +		return (void **)(((uint32_t *)obj) +
> > +					(n * esize / sizeof(uint32_t))); }
> 
> > In all this pointer arithmetic, why do you need 'void **'?
> > Why not just 'void *', or even uintptr_t?
I will change it as follows:

static inline void*
test_ring_inc_ptr(void *obj, int esize, unsigned int n)
{
        int sz;

        sz = esize;
        /* Legacy queue APIs? */
        if ((esize) == -1)
                sz = sizeof(void *);

        return (void *)((uint32_t *)obj + (n * sz / sizeof(uint32_t)));
}
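
A standalone sketch of the revised helper, to check that both modes advance by the expected number of bytes. ring_inc_ptr is a hypothetical name for this illustration, and it assumes the element size is a multiple of 4 bytes, as the ring element APIs require:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Advance 'obj' by 'n' elements of 'esize' bytes; esize == -1 selects
 * the legacy API, where each element is one pointer.
 */
static inline void *
ring_inc_ptr(void *obj, int esize, unsigned int n)
{
	size_t sz = (esize == -1) ? sizeof(void *) : (size_t)esize;

	/* The uint32_t scaling below only works for multiples of 4 bytes. */
	assert(sz % sizeof(uint32_t) == 0);

	return (void *)((uint32_t *)obj + (n * sz / sizeof(uint32_t)));
}
```

Folding the legacy case into a size selection leaves a single return expression, so the 'void **' casts from the earlier version are no longer needed.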

> 
> 
> > +
> >  static __rte_always_inline unsigned int  test_ring_enqueue(struct
> > rte_ring *r, void **obj, int esize, unsigned int n,
> >  			unsigned int api_type)
> > --
> > 2.17.1



* Re: [dpdk-dev] [PATCH v3 3/5] test/ring: move common function to header file
  2020-10-23 23:54       ` Honnappa Nagarahalli
@ 2020-10-24  0:29         ` Stephen Hemminger
  2020-10-24  0:31           ` Honnappa Nagarahalli
  0 siblings, 1 reply; 69+ messages in thread
From: Stephen Hemminger @ 2020-10-24  0:29 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: Ananyev, Konstantin, dev, olivier.matz, david.marchand,
	Dharmik Thakkar, Ruifeng Wang, nd

On Fri, 23 Oct 2020 23:54:22 +0000
Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com> wrote:

> <snip>
> 
> >   
> > > Move test_ring_inc_ptr to header file so that it can be used by
> > > functions in other files.
> > >
> > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > > ---
> > >  app/test/test_ring.c | 11 -----------
> > >  app/test/test_ring.h | 11 +++++++++++
> > >  2 files changed, 11 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/app/test/test_ring.c b/app/test/test_ring.c
> > > index a62cb263b..329d538a9 100644
> > > --- a/app/test/test_ring.c
> > > +++ b/app/test/test_ring.c
> > > @@ -243,17 +243,6 @@ test_ring_deq_impl(struct rte_ring *r, void **obj, int esize, unsigned int n,
> > >  			NULL);
> > >  }
> > >
> > > -static void**
> > > -test_ring_inc_ptr(void **obj, int esize, unsigned int n)
> > > -{
> > > -	/* Legacy queue APIs? */
> > > -	if ((esize) == -1)
> > > -		return ((void **)obj) + n;
> > > -	else
> > > -		return (void **)(((uint32_t *)obj) +
> > > -					(n * esize / sizeof(uint32_t)));
> > > -}
> > > -
> > >  static void
> > >  test_ring_mem_init(void *obj, unsigned int count, int esize)
> > >  {
> > > diff --git a/app/test/test_ring.h b/app/test/test_ring.h
> > > index d4b15af7c..16697ee02 100644
> > > --- a/app/test/test_ring.h
> > > +++ b/app/test/test_ring.h
> > > @@ -42,6 +42,17 @@ test_ring_create(const char *name, int esize, unsigned int count,
> > >  						(socket_id), (flags));
> > >  }
> > >
> > > +static inline void**
> > > +test_ring_inc_ptr(void **obj, int esize, unsigned int n)
> > > +{
> > > +	/* Legacy queue APIs? */
> > > +	if ((esize) == -1)
> > > +		return ((void **)obj) + n;
> > > +	else
> > > +		return (void **)(((uint32_t *)obj) +
> > > +					(n * esize / sizeof(uint32_t)));
> > > +}
> > 
> > > In all this pointer arithmetic, why do you need 'void **'?
> > > Why not just 'void *', or even uintptr_t?
> I will change it as follows:
> 
> static inline void*
> test_ring_inc_ptr(void *obj, int esize, unsigned int n)
> {
>         int sz;
> 
>         sz = esize;
>         /* Legacy queue APIs? */
>         if ((esize) == -1)

Extra (paren) doesn't help readability either

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/5] test/ring: move common function to header file
  2020-10-24  0:29         ` Stephen Hemminger
@ 2020-10-24  0:31           ` Honnappa Nagarahalli
  0 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24  0:31 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Ananyev, Konstantin, dev, olivier.matz, david.marchand,
	Dharmik Thakkar, Ruifeng Wang, nd, Honnappa Nagarahalli, nd

<snip>

> > static inline void*
> > test_ring_inc_ptr(void *obj, int esize, unsigned int n)
> > {
> >         int sz;
> >
> >         sz = esize;
> >         /* Legacy queue APIs? */
> >         if ((esize) == -1)
> 
> Extra (paren) doesn't help readability either
+1 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/5] lib/ring: add zero copy APIs
  2020-10-23 13:59     ` Ananyev, Konstantin
@ 2020-10-24 15:45       ` Honnappa Nagarahalli
  0 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24 15:45 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev
  Cc: olivier.matz, david.marchand, Dharmik Thakkar, Ruifeng Wang, nd,
	Honnappa Nagarahalli, nd

Hi Konstantin,
	Thank you for the quick comments. Please see the responses inline.

<snip>

> 
> 
> >
> > Add zero-copy APIs. These APIs provide the capability to copy the data
> > to/from the ring memory directly, without having a temporary copy (for
> > ex: an array of mbufs on the stack). Use cases that involve copying
> > large amount of data to/from the ring can benefit from these APIs.
> 
> LGTM in general.
> Few nits, see below.
> 
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > ---
> >  lib/librte_ring/meson.build        |   1 +
> >  lib/librte_ring/rte_ring_elem.h    |   1 +
> >  lib/librte_ring/rte_ring_peek_zc.h | 542 +++++++++++++++++++++++++++++
> >  3 files changed, 544 insertions(+)
> >  create mode 100644 lib/librte_ring/rte_ring_peek_zc.h
> 
> Need to update documentation: PG and RN.
> 
> >
> > diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> > index 31c0b4649..36fdcb6a5 100644
> > --- a/lib/librte_ring/meson.build
> > +++ b/lib/librte_ring/meson.build
> > @@ -11,5 +11,6 @@ headers = files('rte_ring.h',
> >  		'rte_ring_hts_c11_mem.h',
> >  		'rte_ring_peek.h',
> >  		'rte_ring_peek_c11_mem.h',
> > +		'rte_ring_peek_zc.h',
> >  		'rte_ring_rts.h',
> >  		'rte_ring_rts_c11_mem.h')
> > diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
> > index 938b398fc..7034d29c0 100644
> > --- a/lib/librte_ring/rte_ring_elem.h
> > +++ b/lib/librte_ring/rte_ring_elem.h
> > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
> >
> >  #ifdef ALLOW_EXPERIMENTAL_API
> >  #include <rte_ring_peek.h>
> > +#include <rte_ring_peek_zc.h>
> >  #endif
> >
> >  #include <rte_ring.h>
> > diff --git a/lib/librte_ring/rte_ring_peek_zc.h b/lib/librte_ring/rte_ring_peek_zc.h
> > new file mode 100644
> > index 000000000..9db2d343f
> > --- /dev/null
> > +++ b/lib/librte_ring/rte_ring_peek_zc.h
> > @@ -0,0 +1,542 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + *
> > + * Copyright (c) 2020 Arm Limited
> > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > + * All rights reserved.
> > + * Derived from FreeBSD's bufring.h
> > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > + */
> > +
> > +#ifndef _RTE_RING_PEEK_ZC_H_
> > +#define _RTE_RING_PEEK_ZC_H_
> > +
> > +/**
> > + * @file
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + * It is not recommended to include this file directly.
> > + * Please include <rte_ring_elem.h> instead.
> > + *
> > + * Ring Peek Zero Copy APIs
> > + * These APIs make it possible to split public enqueue/dequeue API
> > + * into 3 parts:
> > + * - enqueue/dequeue start
> > + * - copy data to/from the ring
> > + * - enqueue/dequeue finish
> > + * Along with the advantages of the peek APIs, these APIs provide the ability
> > + * to avoid copying of the data to temporary area (for ex: array of mbufs
> > + * on the stack).
> > + *
> > + * Note that currently these APIs are available only for two sync modes:
> > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> > + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> > + * It is user's responsibility to create/init ring with appropriate sync
> > + * modes selected.
> > + *
> > + * Following are some examples showing the API usage.
> > + * 1)
> > + * struct elem_obj {uint64_t a; uint32_t b, c;};
> > + * struct elem_obj *obj;
> > + *
> > + * // Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS
> > + * // Reserve space on the ring
> > + * n = rte_ring_enqueue_zc_bulk_elem_start(r, sizeof(elem_obj), 1, &zcd, NULL);
> > + *
> > + * // Produce the data directly on the ring memory
> > + * obj = (struct elem_obj *)zcd->ptr1;
> > + * obj.a = rte_get_a();
> 
> As obj is a pointer, should be obj->a = ...
> Same for b and c.
Will fix.

> 
> > + * obj.b = rte_get_b();
> > + * obj.c = rte_get_c();
> > + * rte_ring_enqueue_zc_elem_finish(ring, n);
> > + *
> > + * 2)
> > + * // Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS
> > + * // Reserve space on the ring
> > + * n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
> > + *
> > + * // Pkt I/O core polls packets from the NIC
> > + * if (n == 32)
> > + *	nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, 32);
> > + * else
> > + *	nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, zcd->n1);
> 
> Hmm, that doesn't look exactly correct to me.
> It could be that n == 32, but we still need to do wrap-around.
> Shouldn't it be:
> 
> if (n != 0) {
> 	nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, zcd->n1);
> 	if (nb_rx == zcd->n1 && nb_rx != n)
> 		nb_rx += rte_eth_rx_burst(portid, queueid, zcd->ptr2,
> 				n - nb_rx);
> }
Agree
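
To make the wrap-around case concrete, here is a minimal standalone sketch of the two-segment fill suggested above. It does not use DPDK: zc_sketch and rx_burst_sketch are hypothetical stand-ins for struct rte_ring_zc_data and rte_eth_rx_burst, and fill_demo is only a helper for experimenting.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rte_ring_zc_data. */
struct zc_sketch {
	void **ptr1;     /* first contiguous segment */
	void **ptr2;     /* second segment, valid only on wrap-around */
	unsigned int n1; /* number of slots available at ptr1 */
};

/* Hypothetical stand-in for rte_eth_rx_burst(): writes at most
 * 'max' of the '*pending' items and returns how many were written.
 */
static unsigned int
rx_burst_sketch(unsigned int *pending, void **dst, unsigned int max)
{
	unsigned int i, cnt = (*pending < max) ? *pending : max;

	for (i = 0; i < cnt; i++)
		dst[i] = (void *)(uintptr_t)(i + 1);
	*pending -= cnt;
	return cnt;
}

/* Fill up to n reserved slots. The second segment is used only when
 * the first one was filled completely and slots are still reserved.
 */
static unsigned int
fill_reserved(struct zc_sketch *zcd, unsigned int n, unsigned int *pending)
{
	unsigned int nb_rx = 0;

	if (n != 0) {
		nb_rx = rx_burst_sketch(pending, zcd->ptr1, zcd->n1);
		if (nb_rx == zcd->n1 && nb_rx != n)
			nb_rx += rx_burst_sketch(pending, zcd->ptr2,
					n - nb_rx);
	}
	return nb_rx;
}

/* Helper: n reserved slots split as n1 + (n - n1), both parts <= 32. */
static unsigned int
fill_demo(unsigned int n, unsigned int n1, unsigned int pending)
{
	void *seg1[32], *seg2[32];
	struct zc_sketch zcd = { seg1, seg2, n1 };

	return fill_reserved(&zcd, n, &pending);
}
```

The value returned by fill_reserved is what would be passed to the _finish_ call, so a short first burst never touches the second segment.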

> 
> > + *
> > + * // Provide packets to the packet processing cores
> > + * rte_ring_enqueue_zc_finish(r, nb_rx);
> > + *
> > + * Note that between _start_ and _finish_ none other thread can proceed
> > + * with enqueue/dequeue operation till _finish_ completes.
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include <rte_ring_peek_c11_mem.h>
> > +
> > +/**
> > + * Ring zero-copy information structure.
> > + *
> > + * This structure contains the pointers and length of the space
> > + * reserved on the ring storage.
> > + */
> > +struct rte_ring_zc_data {
> > +	/* Pointer to the first space in the ring */
> > +	void **ptr1;
> 
> Why not just 'void *ptr1;'?
> Same for ptr2.
Agree

> 
> > +	/* Pointer to the second space in the ring if there is wrap-around */
> > +	void **ptr2;
> > +	/* Number of elements in the first pointer. If this is equal to
> > +	 * the number of elements requested, then ptr2 is NULL.
> > +	 * Otherwise, subtracting n1 from number of elements requested
> > +	 * will give the number of elements available at ptr2.
> > +	 */
> > +	unsigned int n1;
> > +} __rte_cache_aligned;
> > +
> > +static __rte_always_inline void
> > +__rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
> > +	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void **dst2)
> > +{
> > +	uint32_t idx, scale, nr_idx;
> > +	uint32_t *ring = (uint32_t *)&r[1];
> > +
> > +	/* Normalize to uint32_t */
> > +	scale = esize / sizeof(uint32_t);
> > +	idx = head & r->mask;
> > +	nr_idx = idx * scale;
> > +
> > +	*dst1 = ring + nr_idx;
> > +	*n1 = num;
> > +
> > +	if (idx + num > r->size) {
> > +		*n1 = r->size - idx;
> > +		*dst2 = ring;
> > +	}
> 
> Seems like missing:
> else {*dst2 = NULL;}
I did not add it since dst2 should be accessed only if there is wrap-around. Will call it out in the struct above.
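
For reference, a simplified standalone model of the split (not the DPDK code; ring_sketch, get_elem_addr_sketch and first_segment_len are hypothetical names). It assumes a power-of-two ring size and shows that the second pointer is written only in the wrap-around branch:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ring: 'size' slots backed by 'slots'. */
struct ring_sketch {
	uint32_t size;   /* number of slots, power of two */
	uint32_t mask;   /* size - 1 */
	uint32_t *slots; /* backing storage */
};

static void
get_elem_addr_sketch(const struct ring_sketch *r, uint32_t head,
	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void **dst2)
{
	uint32_t idx = head & r->mask;
	/* Normalize the element size to uint32_t units. */
	uint32_t scale = esize / sizeof(uint32_t);

	*dst1 = r->slots + (size_t)idx * scale;
	*n1 = num;

	if (idx + num > r->size) {
		/* Reservation wraps: second segment starts at slot 0. */
		*n1 = r->size - idx;
		*dst2 = r->slots;
	}
	/* No else branch: *dst2 keeps whatever the caller stored there. */
}

/* Helper: slots in the first segment for 4-byte elements, size <= 64. */
static uint32_t
first_segment_len(uint32_t size, uint32_t head, uint32_t num)
{
	static uint32_t storage[64];
	struct ring_sketch r = { size, size - 1, storage };
	void *d1, *d2 = NULL;
	uint32_t n1;

	get_elem_addr_sketch(&r, head, sizeof(uint32_t), num, &d1, &n1, &d2);
	return n1;
}
```

A caller therefore has to compare n1 with the number of elements it reserved before reading the second pointer.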

> 
> > +}
> > +
> > +/**
> > + * @internal This function moves prod head value.
> > + */
> > +static __rte_always_inline unsigned int
> > +__rte_ring_do_enqueue_zc_elem_start(struct rte_ring *r, unsigned int esize,
> > +		uint32_t n, enum rte_ring_queue_behavior behavior,
> > +		struct rte_ring_zc_data *zcd, unsigned int *free_space)
> > +{
> > +	uint32_t free, head, next;
> > +
> > +	switch (r->prod.sync_type) {
> > +	case RTE_RING_SYNC_ST:
> > +		n = __rte_ring_move_prod_head(r, RTE_RING_SYNC_ST, n,
> > +			behavior, &head, &next, &free);
> > +		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&zcd->ptr1,
> 
> If you change ptr1, ptr2 to be just 'void *', then probably no extra type-cast
> will be needed here.
Thanks for catching this, pointers are out of whack.

> 
> > +			&zcd->n1, (void **)&zcd->ptr2);
> > +		break;
> > +	case RTE_RING_SYNC_MT_HTS:
> > +		n = __rte_ring_hts_move_prod_head(r, n, behavior, &head, &free);
> > +		__rte_ring_get_elem_addr(r, head, esize, n, (void **)&zcd->ptr1,
> > +			&zcd->n1, (void **)&zcd->ptr2);
> > +		break;
> > +	case RTE_RING_SYNC_MT:
> > +	case RTE_RING_SYNC_MT_RTS:
> > +	default:
> > +		/* unsupported mode, shouldn't be here */
> > +		RTE_ASSERT(0);
> > +		n = 0;
> > +		free = 0;
> > +	}
> 
> Would it make sense to move __rte_ring_get_elem_addr() here and do it
> only when n != 0?
> I.E:
> 
> if (n != 0)
> 	__rte_ring_get_elem_addr(...);
It adds an 'if' statement. We can add a return to the default case and skip the if statement.

> 
> Same comments for _dequeue_ analog.
> 
> > +
> > +	if (free_space != NULL)
> > +		*free_space = free - n;
> > +	return n;
> > +}
> > +

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v4 0/8] lib/ring: add zero copy APIs
  2020-02-24 20:39 [dpdk-dev] [RFC 0/1] lib/ring: add scatter gather and serial dequeue APIs Honnappa Nagarahalli
                   ` (2 preceding siblings ...)
  2020-10-23  4:43 ` [dpdk-dev] [PATCH v3 0/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
@ 2020-10-24 16:11 ` Honnappa Nagarahalli
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 1/8] " Honnappa Nagarahalli
                     ` (8 more replies)
  2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
  4 siblings, 9 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24 16:11 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

It is common for DPDK applications to be deployed in a
semi-pipeline model. In this model, a small number of cores
(typically 1) are designated as I/O cores. The I/O cores receive
packets from and transmit packets to the NIC, and hand them to
several packet processing cores. The I/O cores and the packet
processing cores exchange the packets over a ring. Typically, such
applications receive the mbufs in a temporary array and then copy
the mbufs onto the ring. Depending on the requirements, the packets
could be copied in batches of 32, 64, etc., resulting in memory
copies of 256B, 512B, etc.

The zero copy APIs help avoid intermediate copies by exposing
the space on the ring directly to the application.
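
The idea can be sketched with a toy single-threaded ring of pointers. This is only an illustration under stated assumptions: sketch_ring and the sketch_* functions are hypothetical, there is no synchronization or memory ordering, and the real APIs in this series differ in signature.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8u /* power of two */

/* Hypothetical single-producer/single-consumer ring of pointers. */
struct sketch_ring {
	uint32_t prod_tail; /* next slot to publish */
	uint32_t cons_tail; /* next slot to consume */
	void *slots[SKETCH_RING_SIZE];
};

/* Reserve up to n slots and report the contiguous segments.
 * *ptr2 is written only when the reservation wraps around.
 */
static unsigned int
sketch_enqueue_start(struct sketch_ring *r, unsigned int n,
	void ***ptr1, unsigned int *n1, void ***ptr2)
{
	uint32_t free_slots = SKETCH_RING_SIZE - (r->prod_tail - r->cons_tail);
	uint32_t idx = r->prod_tail & (SKETCH_RING_SIZE - 1);

	if (n > free_slots)
		n = free_slots;
	*ptr1 = &r->slots[idx];
	*n1 = n;
	if (idx + n > SKETCH_RING_SIZE) {
		*n1 = SKETCH_RING_SIZE - idx;
		*ptr2 = &r->slots[0];
	}
	return n;
}

/* Publish the first n reserved slots to the consumer. */
static void
sketch_enqueue_finish(struct sketch_ring *r, unsigned int n)
{
	r->prod_tail += n;
}

/* Write 'count' items straight into the ring, no temporary array. */
static unsigned int
sketch_produce(struct sketch_ring *r, unsigned int count)
{
	void **p1, **p2 = NULL;
	unsigned int i, n1, n;

	n = sketch_enqueue_start(r, count, &p1, &n1, &p2);
	for (i = 0; i < n; i++) {
		void **slot = (i < n1) ? &p1[i] : &p2[i - n1];

		*slot = (void *)(uintptr_t)(r->prod_tail + i + 1);
	}
	sketch_enqueue_finish(r, n);
	return n;
}

/* Helper: start with an empty ring at position 'pos', produce 'count'. */
static unsigned int
sketch_demo(uint32_t pos, unsigned int count)
{
	struct sketch_ring r = { pos, pos, { 0 } };

	return sketch_produce(&r, count);
}
```

With the copy-based APIs the producer would first fill a local array (256B for 32 pointers on a 64-bit machine) and then copy it into the ring. Here the producer writes into the reserved slots directly.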

v4:
1) Fixed multiple pointer issues
2) Added documentation

v3:
1) Changed the name of the APIs to 'zero-copy (zc)'
2) Made the address calculation simpler
3) Structure to return the data to the user is aligned on
   cache line boundary.
4) Added functional and stress test cases

v2: changed the patch to use the SP-SC and HTS modes

v1: Initial version

Honnappa Nagarahalli (8):
  lib/ring: add zero copy APIs
  test/ring: move common function to header file
  test/ring: add functional tests for zero copy APIs
  test/ring: add stress tests for zero copy APIs
  doc/ring: add zero copy peek APIs
  test/ring: fix the memory dump size
  test/ring: remove unnecessary braces
  test/ring: use uintptr_t instead of unsigned long

 app/test/meson.build                   |   2 +
 app/test/test_ring.c                   | 209 +++++++++-
 app/test/test_ring.h                   |  67 ++-
 app/test/test_ring_mt_peek_stress_zc.c |  56 +++
 app/test/test_ring_st_peek_stress_zc.c |  65 +++
 app/test/test_ring_stress.c            |   6 +
 app/test/test_ring_stress.h            |   2 +
 app/test/test_ring_stress_impl.h       |   2 +-
 doc/guides/prog_guide/ring_lib.rst     |  41 ++
 doc/guides/rel_notes/release_20_11.rst |   9 +
 lib/librte_ring/meson.build            |   1 +
 lib/librte_ring/rte_ring_elem.h        |   1 +
 lib/librte_ring/rte_ring_peek_zc.h     | 546 +++++++++++++++++++++++++
 13 files changed, 988 insertions(+), 19 deletions(-)
 create mode 100644 app/test/test_ring_mt_peek_stress_zc.c
 create mode 100644 app/test/test_ring_st_peek_stress_zc.c
 create mode 100644 lib/librte_ring/rte_ring_peek_zc.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v4 1/8] lib/ring: add zero copy APIs
  2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
@ 2020-10-24 16:11   ` Honnappa Nagarahalli
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 2/8] test/ring: move common function to header file Honnappa Nagarahalli
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24 16:11 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Add zero-copy APIs. These APIs provide the capability to
copy the data to/from the ring memory directly, without
having a temporary copy (for ex: an array of mbufs on
the stack). Use cases that involve copying large amount
of data to/from the ring can benefit from these APIs.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
---
 lib/librte_ring/meson.build        |   1 +
 lib/librte_ring/rte_ring_elem.h    |   1 +
 lib/librte_ring/rte_ring_peek_zc.h | 546 +++++++++++++++++++++++++++++
 3 files changed, 548 insertions(+)
 create mode 100644 lib/librte_ring/rte_ring_peek_zc.h

diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
index 31c0b4649..36fdcb6a5 100644
--- a/lib/librte_ring/meson.build
+++ b/lib/librte_ring/meson.build
@@ -11,5 +11,6 @@ headers = files('rte_ring.h',
 		'rte_ring_hts_c11_mem.h',
 		'rte_ring_peek.h',
 		'rte_ring_peek_c11_mem.h',
+		'rte_ring_peek_zc.h',
 		'rte_ring_rts.h',
 		'rte_ring_rts_c11_mem.h')
diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
index 938b398fc..7034d29c0 100644
--- a/lib/librte_ring/rte_ring_elem.h
+++ b/lib/librte_ring/rte_ring_elem.h
@@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
 
 #ifdef ALLOW_EXPERIMENTAL_API
 #include <rte_ring_peek.h>
+#include <rte_ring_peek_zc.h>
 #endif
 
 #include <rte_ring.h>
diff --git a/lib/librte_ring/rte_ring_peek_zc.h b/lib/librte_ring/rte_ring_peek_zc.h
new file mode 100644
index 000000000..482c716ab
--- /dev/null
+++ b/lib/librte_ring/rte_ring_peek_zc.h
@@ -0,0 +1,546 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ *
+ * Copyright (c) 2020 Arm Limited
+ * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * All rights reserved.
+ * Derived from FreeBSD's bufring.h
+ * Used as BSD-3 Licensed with permission from Kip Macy.
+ */
+
+#ifndef _RTE_RING_PEEK_ZC_H_
+#define _RTE_RING_PEEK_ZC_H_
+
+/**
+ * @file
+ * @b EXPERIMENTAL: this API may change without prior notice
+ * It is not recommended to include this file directly.
+ * Please include <rte_ring_elem.h> instead.
+ *
+ * Ring Peek Zero Copy APIs
+ * These APIs make it possible to split public enqueue/dequeue API
+ * into 3 parts:
+ * - enqueue/dequeue start
+ * - copy data to/from the ring
+ * - enqueue/dequeue finish
+ * Along with the advantages of the peek APIs, these APIs provide the ability
+ * to avoid copying of the data to temporary area (for ex: array of mbufs
+ * on the stack).
+ *
+ * Note that currently these APIs are available only for two sync modes:
+ * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
+ * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
+ * It is user's responsibility to create/init ring with appropriate sync
+ * modes selected.
+ *
+ * Following are some examples showing the API usage.
+ * 1)
+ * struct elem_obj {uint64_t a; uint32_t b, c;};
+ * struct elem_obj *obj;
+ *
+ * // Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS
+ * // Reserve space on the ring
+ * n = rte_ring_enqueue_zc_bulk_elem_start(r, sizeof(struct elem_obj), 1, &zcd, NULL);
+ *
+ * // Produce the data directly on the ring memory
+ * obj = (struct elem_obj *)zcd.ptr1;
+ * obj->a = rte_get_a();
+ * obj->b = rte_get_b();
+ * obj->c = rte_get_c();
+ * rte_ring_enqueue_zc_elem_finish(r, n);
+ *
+ * 2)
+ * // Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS
+ * // Reserve space on the ring
+ * n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
+ *
+ * // Pkt I/O core polls packets from the NIC
+ * if (n != 0) {
+ *	nb_rx = rte_eth_rx_burst(portid, queueid, zcd.ptr1, zcd.n1);
+ *	if (nb_rx == zcd.n1 && n != zcd.n1)
+ *		nb_rx += rte_eth_rx_burst(portid, queueid,
+ *						zcd.ptr2, n - zcd.n1);
+ *
+ *	// Provide packets to the packet processing cores
+ *	rte_ring_enqueue_zc_finish(r, nb_rx);
+ * }
+ *
+ * Note that between _start_ and _finish_ no other thread can proceed
+ * with an enqueue/dequeue operation until _finish_ completes.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_ring_peek_c11_mem.h>
+
+/**
+ * Ring zero-copy information structure.
+ *
+ * This structure contains the pointers and length of the space
+ * reserved on the ring storage.
+ */
+struct rte_ring_zc_data {
+	/* Pointer to the first space in the ring */
+	void *ptr1;
+	/* Pointer to the second space in the ring if there is wrap-around.
+	 * It contains valid value only if wrap-around happens.
+	 */
+	void *ptr2;
+	/* Number of elements in the first pointer. If this is equal to
+	 * the number of elements requested, then ptr2 is not valid.
+	 * Otherwise, subtracting n1 from number of elements requested
+	 * will give the number of elements available at ptr2.
+	 */
+	unsigned int n1;
+} __rte_cache_aligned;
+
+static __rte_always_inline void
+__rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
+	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void **dst2)
+{
+	uint32_t idx, scale, nr_idx;
+	uint32_t *ring = (uint32_t *)&r[1];
+
+	/* Normalize to uint32_t */
+	scale = esize / sizeof(uint32_t);
+	idx = head & r->mask;
+	nr_idx = idx * scale;
+
+	*dst1 = ring + nr_idx;
+	*n1 = num;
+
+	if (idx + num > r->size) {
+		*n1 = r->size - idx;
+		*dst2 = ring;
+	}
+}
+
+/**
+ * @internal This function moves prod head value.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_enqueue_zc_elem_start(struct rte_ring *r, unsigned int esize,
+		uint32_t n, enum rte_ring_queue_behavior behavior,
+		struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	uint32_t free, head, next;
+
+	switch (r->prod.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_move_prod_head(r, RTE_RING_SYNC_ST, n,
+			behavior, &head, &next, &free);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_move_prod_head(r, n, behavior, &head, &free);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+		n = 0;
+		free = 0;
+		return n;
+	}
+
+	__rte_ring_get_elem_addr(r, head, esize, n, &zcd->ptr1,
+		&zcd->n1, &zcd->ptr2);
+
+	if (free_space != NULL)
+		*free_space = free - n;
+	return n;
+}
+
+/**
+ * Start to enqueue several objects on the ring.
+ * Note that no actual objects are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy objects into the queue using the returned pointers.
+ * User should call rte_ring_enqueue_zc_elem_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_bulk_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return __rte_ring_do_enqueue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_FIXED, zcd, free_space);
+}
+
+/**
+ * Start to enqueue several pointers to objects on the ring.
+ * Note that no actual pointers are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy pointers to objects into the queue using the
+ * returned pointers.
+ * User should call rte_ring_enqueue_zc_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_bulk_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return rte_ring_enqueue_zc_bulk_elem_start(r, sizeof(uintptr_t), n,
+							zcd, free_space);
+}
+/**
+ * Start to enqueue several objects on the ring.
+ * Note that no actual objects are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy objects into the queue using the returned pointers.
+ * User should call rte_ring_enqueue_zc_elem_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, between 0 and n (inclusive)
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_burst_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return __rte_ring_do_enqueue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_VARIABLE, zcd, free_space);
+}
+
+/**
+ * Start to enqueue several pointers to objects on the ring.
+ * Note that no actual pointers are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy pointers to objects into the queue using the
+ * returned pointers.
+ * User should call rte_ring_enqueue_zc_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, between 0 and n (inclusive)
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_burst_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return rte_ring_enqueue_zc_burst_elem_start(r, sizeof(uintptr_t), n,
+							zcd, free_space);
+}
+
+/**
+ * Complete enqueuing several objects on the ring.
+ * Note that number of objects to enqueue should not exceed previous
+ * enqueue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add to the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_enqueue_zc_elem_finish(struct rte_ring *r, unsigned int n)
+{
+	uint32_t tail;
+
+	switch (r->prod.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_st_get_tail(&r->prod, &tail, n);
+		__rte_ring_st_set_head_tail(&r->prod, tail, n, 1);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_get_tail(&r->hts_prod, &tail, n);
+		__rte_ring_hts_set_head_tail(&r->hts_prod, tail, n, 1);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+	}
+}
+
+/**
+ * Complete enqueuing several pointers to objects on the ring.
+ * Note that number of objects to enqueue should not exceed previous
+ * enqueue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of pointers to objects to add to the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_enqueue_zc_finish(struct rte_ring *r, unsigned int n)
+{
+	rte_ring_enqueue_zc_elem_finish(r, n);
+}
+
+/**
+ * @internal This function moves cons head value and returns, through
+ * *zcd*, the ring addresses of up to *n* objects without copying them.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_dequeue_zc_elem_start(struct rte_ring *r,
+	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
+	struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	uint32_t avail, head, next;
+
+	switch (r->cons.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_move_cons_head(r, RTE_RING_SYNC_ST, n,
+			behavior, &head, &next, &avail);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_move_cons_head(r, n, behavior,
+			&head, &avail);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+		n = 0;
+		avail = 0;
+		return n;
+	}
+
+	__rte_ring_get_elem_addr(r, head, esize, n, &zcd->ptr1,
+		&zcd->n1, &zcd->ptr2);
+
+	if (available != NULL)
+		*available = avail - n;
+	return n;
+}
+
+/**
+ * Start to dequeue several objects from the ring.
+ * Note that no actual objects are copied from the queue by this function.
+ * User has to copy objects from the queue using the returned pointers.
+ * User should call rte_ring_dequeue_zc_elem_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_bulk_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return __rte_ring_do_dequeue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_FIXED, zcd, available);
+}
+
+/**
+ * Start to dequeue several pointers to objects from the ring.
+ * Note that no actual pointers are removed from the queue by this function.
+ * User has to copy pointers to objects from the queue using the
+ * returned pointers.
+ * User should call rte_ring_dequeue_zc_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_bulk_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return rte_ring_dequeue_zc_bulk_elem_start(r, sizeof(uintptr_t),
+		n, zcd, available);
+}
+
+/**
+ * Start to dequeue several objects from the ring.
+ * Note that no actual objects are copied from the queue by this function.
+ * User has to copy objects from the queue using the returned pointers.
+ * User should call rte_ring_dequeue_zc_elem_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to dequeue from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, between 0 and n (inclusive)
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_burst_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return __rte_ring_do_dequeue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_VARIABLE, zcd, available);
+}
+
+/**
+ * Start to dequeue several pointers to objects from the ring.
+ * Note that no actual pointers are removed from the queue by this function.
+ * User has to copy pointers to objects from the queue using the
+ * returned pointers.
+ * User should call rte_ring_dequeue_zc_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_burst_start(struct rte_ring *r, unsigned int n,
+		struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return rte_ring_dequeue_zc_burst_elem_start(r, sizeof(uintptr_t), n,
+			zcd, available);
+}
+
+/**
+ * Complete dequeuing several objects from the ring.
+ * Note that the number of objects to dequeue should not exceed the previous
+ * dequeue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_dequeue_zc_elem_finish(struct rte_ring *r, unsigned int n)
+{
+	uint32_t tail;
+
+	switch (r->cons.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_st_get_tail(&r->cons, &tail, n);
+		__rte_ring_st_set_head_tail(&r->cons, tail, n, 0);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_get_tail(&r->hts_cons, &tail, n);
+		__rte_ring_hts_set_head_tail(&r->hts_cons, tail, n, 0);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+	}
+}
+
+/**
+ * Complete dequeuing several objects from the ring.
+ * Note that the number of objects to dequeue should not exceed the previous
+ * dequeue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_dequeue_zc_finish(struct rte_ring *r, unsigned int n)
+{
+	rte_ring_dequeue_elem_finish(r, n);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RING_PEEK_ZC_H_ */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v4 2/8] test/ring: move common function to header file
  2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 1/8] " Honnappa Nagarahalli
@ 2020-10-24 16:11   ` Honnappa Nagarahalli
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 3/8] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24 16:11 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Move test_ring_inc_ptr to header file so that it can be used by
functions in other files.
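
For readers skimming the thread, the pointer arithmetic done by the moved
helper can be sketched in isolation as below. This is only an illustration:
inc_ptr_sketch is a hypothetical name, and the assert on the element size is
an added assumption (ring element sizes are multiples of 4), not part of the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-alone copy of the helper, for illustration only. */
static inline void *
inc_ptr_sketch(void *obj, int esize, unsigned int n)
{
	/* Legacy (pointer-sized) elements are signalled by esize == -1. */
	size_t sz = (esize == -1) ? sizeof(void *) : (size_t)esize;

	/* Assumed precondition: element sizes are multiples of 4 bytes. */
	assert(sz % sizeof(uint32_t) == 0);

	/* Step over n elements, counted in uint32_t units. */
	return (void *)((uint32_t *)obj + (n * sz / sizeof(uint32_t)));
}
```

With a uint32_t buffer buf, inc_ptr_sketch(buf, 16, 2) points 8 words past
buf, and esize == -1 advances by 2 * sizeof(void *) bytes for n == 2.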

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
---
 app/test/test_ring.c | 11 -----------
 app/test/test_ring.h | 13 +++++++++++++
 2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index a62cb263b..329d538a9 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -243,17 +243,6 @@ test_ring_deq_impl(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			NULL);
 }
 
-static void**
-test_ring_inc_ptr(void **obj, int esize, unsigned int n)
-{
-	/* Legacy queue APIs? */
-	if ((esize) == -1)
-		return ((void **)obj) + n;
-	else
-		return (void **)(((uint32_t *)obj) +
-					(n * esize / sizeof(uint32_t)));
-}
-
 static void
 test_ring_mem_init(void *obj, unsigned int count, int esize)
 {
diff --git a/app/test/test_ring.h b/app/test/test_ring.h
index d4b15af7c..b44711398 100644
--- a/app/test/test_ring.h
+++ b/app/test/test_ring.h
@@ -42,6 +42,19 @@ test_ring_create(const char *name, int esize, unsigned int count,
 						(socket_id), (flags));
 }
 
+static inline void*
+test_ring_inc_ptr(void *obj, int esize, unsigned int n)
+{
+	size_t sz;
+
+	sz = sizeof(void *);
+	/* Legacy queue APIs? */
+	if (esize != -1)
+		sz = esize;
+
+	return (void *)((uint32_t *)obj + (n * sz / sizeof(uint32_t)));
+}
+
 static __rte_always_inline unsigned int
 test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			unsigned int api_type)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v4 3/8] test/ring: add functional tests for zero copy APIs
  2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 1/8] " Honnappa Nagarahalli
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 2/8] test/ring: move common function to header file Honnappa Nagarahalli
@ 2020-10-24 16:11   ` Honnappa Nagarahalli
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 4/8] test/ring: add stress " Honnappa Nagarahalli
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24 16:11 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Add functional tests for zero copy APIs. Test enqueue/dequeue
functions are created using the zero copy APIs to fit into
the existing testing method.
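
The copy helpers used by these wrappers have to handle a reservation that
wraps around the end of the ring storage. A minimal sketch of that
two-segment copy, on a toy array instead of a real rte_ring, might look like
the following. struct zc_span, span_reserve and span_copy_to are hypothetical
names made up for this illustration; only the ptr1/n1/ptr2 field names are
taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_ring_zc_data (illustration only). */
struct zc_span {
	void *ptr1;       /* first contiguous segment */
	void *ptr2;       /* second segment, used only on wrap-around */
	unsigned int n1;  /* number of elements that fit in the first segment */
};

/* Toy model: describe n slots starting at head in a ring of size slots. */
static void
span_reserve(uint32_t *ring, unsigned int size, unsigned int head,
	unsigned int n, struct zc_span *s)
{
	unsigned int idx = head % size;

	assert(n <= size);
	s->ptr1 = &ring[idx];
	s->n1 = (idx + n > size) ? size - idx : n;
	/* In this toy model ptr2 is NULL when the span does not wrap. */
	s->ptr2 = (s->n1 < n) ? ring : NULL;
}

/* Copy num elements into the described space, honouring the split. */
static void
span_copy_to(const struct zc_span *s, const uint32_t *src, unsigned int num)
{
	memcpy(s->ptr1, src, s->n1 * sizeof(*src));
	if (s->n1 != num)
		memcpy(s->ptr2, src + s->n1, (num - s->n1) * sizeof(*src));
}
```

Reserving 4 slots at head 6 in an 8-slot array gives n1 == 2, so two
elements land at the end of the array and the remaining two at its start.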

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/test/test_ring.c | 196 +++++++++++++++++++++++++++++++++++++++++++
 app/test/test_ring.h |  42 ++++++++++
 2 files changed, 238 insertions(+)

diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index 329d538a9..99fe4b46f 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2020 Arm Limited
  */
 
 #include <string.h>
@@ -68,6 +69,149 @@
 
 static const int esize[] = {-1, 4, 8, 16, 20};
 
+/* Wrappers around the zero-copy APIs. The wrappers match
+ * the normal enqueue/dequeue API declarations.
+ */
+static unsigned int
+test_ring_enqueue_zc_bulk(struct rte_ring *r, void * const *obj_table,
+	unsigned int n, unsigned int *free_space)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_bulk_start(r, n, &zcd, free_space);
+	if (ret > 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_enqueue_zc_bulk_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_bulk_elem_start(r, esize, n,
+				&zcd, free_space);
+	if (ret > 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, esize, ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_enqueue_zc_burst(struct rte_ring *r, void * const *obj_table,
+	unsigned int n, unsigned int *free_space)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_burst_start(r, n, &zcd, free_space);
+	if (ret > 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_enqueue_zc_burst_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_burst_elem_start(r, esize, n,
+				&zcd, free_space);
+	if (ret > 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, esize, ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_bulk(struct rte_ring *r, void **obj_table,
+	unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_bulk_start(r, n, &zcd, available);
+	if (ret > 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_bulk_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_bulk_elem_start(r, esize, n,
+				&zcd, available);
+	if (ret > 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, esize, ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_burst(struct rte_ring *r, void **obj_table,
+	unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_burst_start(r, n, &zcd, available);
+	if (ret > 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_burst_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_burst_elem_start(r, esize, n,
+				&zcd, available);
+	if (ret > 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, esize, ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
 static const struct {
 	const char *desc;
 	uint32_t api_type;
@@ -219,6 +363,58 @@ static const struct {
 			.felem = rte_ring_dequeue_burst_elem,
 		},
 	},
+	{
+		.desc = "SP/SC sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_SPSC,
+		.create_flags = RING_F_SP_ENQ | RING_F_SC_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_bulk,
+			.felem = test_ring_enqueue_zc_bulk_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_bulk,
+			.felem = test_ring_dequeue_zc_bulk_elem,
+		},
+	},
+	{
+		.desc = "MP_HTS/MC_HTS sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_bulk,
+			.felem = test_ring_enqueue_zc_bulk_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_bulk,
+			.felem = test_ring_dequeue_zc_bulk_elem,
+		},
+	},
+	{
+		.desc = "SP/SC sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_SPSC,
+		.create_flags = RING_F_SP_ENQ | RING_F_SC_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_burst,
+			.felem = test_ring_enqueue_zc_burst_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_burst,
+			.felem = test_ring_dequeue_zc_burst_elem,
+		},
+	},
+	{
+		.desc = "MP_HTS/MC_HTS sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_burst,
+			.felem = test_ring_enqueue_zc_burst_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_burst,
+			.felem = test_ring_dequeue_zc_burst_elem,
+		},
+	}
 };
 
 static unsigned int
diff --git a/app/test/test_ring.h b/app/test/test_ring.h
index b44711398..b525abb79 100644
--- a/app/test/test_ring.h
+++ b/app/test/test_ring.h
@@ -55,6 +55,48 @@ test_ring_inc_ptr(void *obj, int esize, unsigned int n)
 	return (void *)((uint32_t *)obj + (n * sz / sizeof(uint32_t)));
 }
 
+static inline void
+test_ring_mem_copy(void *dst, void * const *src, int esize, unsigned int num)
+{
+	size_t sz;
+
+	sz = num * sizeof(void *);
+	if (esize != -1)
+		sz = esize * num;
+
+	memcpy(dst, src, sz);
+}
+
+/* Copy to the ring memory */
+static inline void
+test_ring_copy_to(struct rte_ring_zc_data *zcd, void * const *src, int esize,
+	unsigned int num)
+{
+	test_ring_mem_copy(zcd->ptr1, src, esize, zcd->n1);
+	if (zcd->n1 != num) {
+		if (esize == -1)
+			src = src + zcd->n1;
+		else
+			src = (void * const *)((const uint32_t *)src +
+					(zcd->n1 * esize / sizeof(uint32_t)));
+		test_ring_mem_copy(zcd->ptr2, src,
+					esize, num - zcd->n1);
+	}
+}
+
+/* Copy from the ring memory */
+static inline void
+test_ring_copy_from(struct rte_ring_zc_data *zcd, void *dst, int esize,
+	unsigned int num)
+{
+	test_ring_mem_copy(dst, zcd->ptr1, esize, zcd->n1);
+
+	if (zcd->n1 != num) {
+		dst = test_ring_inc_ptr(dst, esize, zcd->n1);
+		test_ring_mem_copy(dst, zcd->ptr2, esize, num - zcd->n1);
+	}
+}
+
 static __rte_always_inline unsigned int
 test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			unsigned int api_type)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v4 4/8] test/ring: add stress tests for zero copy APIs
  2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
                     ` (2 preceding siblings ...)
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 3/8] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
@ 2020-10-24 16:11   ` Honnappa Nagarahalli
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24 16:11 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Add stress tests for zero copy API.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/test/meson.build                   |  2 +
 app/test/test_ring_mt_peek_stress_zc.c | 56 ++++++++++++++++++++++
 app/test/test_ring_st_peek_stress_zc.c | 65 ++++++++++++++++++++++++++
 app/test/test_ring_stress.c            |  6 +++
 app/test/test_ring_stress.h            |  2 +
 5 files changed, 131 insertions(+)
 create mode 100644 app/test/test_ring_mt_peek_stress_zc.c
 create mode 100644 app/test/test_ring_st_peek_stress_zc.c

diff --git a/app/test/meson.build b/app/test/meson.build
index 8bfb02890..88c831a92 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -108,9 +108,11 @@ test_sources = files('commands.c',
 	'test_ring_mpmc_stress.c',
 	'test_ring_hts_stress.c',
 	'test_ring_mt_peek_stress.c',
+	'test_ring_mt_peek_stress_zc.c',
 	'test_ring_perf.c',
 	'test_ring_rts_stress.c',
 	'test_ring_st_peek_stress.c',
+	'test_ring_st_peek_stress_zc.c',
 	'test_ring_stress.c',
 	'test_rwlock.c',
 	'test_sched.c',
diff --git a/app/test/test_ring_mt_peek_stress_zc.c b/app/test/test_ring_mt_peek_stress_zc.c
new file mode 100644
index 000000000..7e0bd511a
--- /dev/null
+++ b/app/test/test_ring_mt_peek_stress_zc.c
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Arm Limited
+ */
+
+#include "test_ring.h"
+#include "test_ring_stress_impl.h"
+#include <rte_ring_elem.h>
+
+static inline uint32_t
+_st_ring_dequeue_bulk(struct rte_ring *r, void **obj, uint32_t n,
+	uint32_t *avail)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	m = rte_ring_dequeue_zc_bulk_start(r, n, &zcd, avail);
+	n = (m == n) ? n : 0;
+	if (n != 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj, -1, n);
+		rte_ring_dequeue_zc_finish(r, n);
+	}
+
+	return n;
+}
+
+static inline uint32_t
+_st_ring_enqueue_bulk(struct rte_ring *r, void * const *obj, uint32_t n,
+	uint32_t *free)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	m = rte_ring_enqueue_zc_bulk_start(r, n, &zcd, free);
+	n = (m == n) ? n : 0;
+	if (n != 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj, -1, n);
+		rte_ring_enqueue_zc_finish(r, n);
+	}
+
+	return n;
+}
+
+static int
+_st_ring_init(struct rte_ring *r, const char *name, uint32_t num)
+{
+	return rte_ring_init(r, name, num,
+		RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ);
+}
+
+const struct test test_ring_mt_peek_stress_zc = {
+	.name = "MT_PEEK_ZC",
+	.nb_case = RTE_DIM(tests),
+	.cases = tests,
+};
diff --git a/app/test/test_ring_st_peek_stress_zc.c b/app/test/test_ring_st_peek_stress_zc.c
new file mode 100644
index 000000000..2933e30bf
--- /dev/null
+++ b/app/test/test_ring_st_peek_stress_zc.c
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Arm Limited
+ */
+
+#include "test_ring.h"
+#include "test_ring_stress_impl.h"
+#include <rte_ring_elem.h>
+
+static inline uint32_t
+_st_ring_dequeue_bulk(struct rte_ring *r, void **obj, uint32_t n,
+	uint32_t *avail)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	static rte_spinlock_t lck = RTE_SPINLOCK_INITIALIZER;
+
+	rte_spinlock_lock(&lck);
+
+	m = rte_ring_dequeue_zc_bulk_start(r, n, &zcd, avail);
+	n = (m == n) ? n : 0;
+	if (n != 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj, -1, n);
+		rte_ring_dequeue_zc_finish(r, n);
+	}
+
+	rte_spinlock_unlock(&lck);
+	return n;
+}
+
+static inline uint32_t
+_st_ring_enqueue_bulk(struct rte_ring *r, void * const *obj, uint32_t n,
+	uint32_t *free)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	static rte_spinlock_t lck = RTE_SPINLOCK_INITIALIZER;
+
+	rte_spinlock_lock(&lck);
+
+	m = rte_ring_enqueue_zc_bulk_start(r, n, &zcd, free);
+	n = (m == n) ? n : 0;
+	if (n != 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj, -1, n);
+		rte_ring_enqueue_zc_finish(r, n);
+	}
+
+	rte_spinlock_unlock(&lck);
+	return n;
+}
+
+static int
+_st_ring_init(struct rte_ring *r, const char *name, uint32_t num)
+{
+	return rte_ring_init(r, name, num, RING_F_SP_ENQ | RING_F_SC_DEQ);
+}
+
+const struct test test_ring_st_peek_stress_zc = {
+	.name = "ST_PEEK_ZC",
+	.nb_case = RTE_DIM(tests),
+	.cases = tests,
+};
diff --git a/app/test/test_ring_stress.c b/app/test/test_ring_stress.c
index c4f82ea56..1af45e0fc 100644
--- a/app/test/test_ring_stress.c
+++ b/app/test/test_ring_stress.c
@@ -49,9 +49,15 @@ test_ring_stress(void)
 	n += test_ring_mt_peek_stress.nb_case;
 	k += run_test(&test_ring_mt_peek_stress);
 
+	n += test_ring_mt_peek_stress_zc.nb_case;
+	k += run_test(&test_ring_mt_peek_stress_zc);
+
 	n += test_ring_st_peek_stress.nb_case;
 	k += run_test(&test_ring_st_peek_stress);
 
+	n += test_ring_st_peek_stress_zc.nb_case;
+	k += run_test(&test_ring_st_peek_stress_zc);
+
 	printf("Number of tests:\t%u\nSuccess:\t%u\nFailed:\t%u\n",
 		n, k, n - k);
 	return (k != n);
diff --git a/app/test/test_ring_stress.h b/app/test/test_ring_stress.h
index c85d6fa92..416d68c9a 100644
--- a/app/test/test_ring_stress.h
+++ b/app/test/test_ring_stress.h
@@ -36,4 +36,6 @@ extern const struct test test_ring_mpmc_stress;
 extern const struct test test_ring_rts_stress;
 extern const struct test test_ring_hts_stress;
 extern const struct test test_ring_mt_peek_stress;
+extern const struct test test_ring_mt_peek_stress_zc;
 extern const struct test test_ring_st_peek_stress;
+extern const struct test test_ring_st_peek_stress_zc;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v4 5/8] doc/ring: add zero copy peek APIs
  2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
                     ` (3 preceding siblings ...)
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 4/8] test/ring: add stress " Honnappa Nagarahalli
@ 2020-10-24 16:11   ` Honnappa Nagarahalli
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 6/8] test/ring: fix the memory dump size Honnappa Nagarahalli
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24 16:11 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Add zero copy peek API documentation.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 doc/guides/prog_guide/ring_lib.rst     | 41 ++++++++++++++++++++++++++
 doc/guides/rel_notes/release_20_11.rst |  9 ++++++
 2 files changed, 50 insertions(+)

diff --git a/doc/guides/prog_guide/ring_lib.rst b/doc/guides/prog_guide/ring_lib.rst
index 895484d95..247646d38 100644
--- a/doc/guides/prog_guide/ring_lib.rst
+++ b/doc/guides/prog_guide/ring_lib.rst
@@ -452,6 +452,47 @@ selected. As an example of usage:
 Note that between ``_start_`` and ``_finish_`` none other thread can proceed
 with enqueue(/dequeue) operation till ``_finish_`` completes.
 
+Ring Peek Zero Copy API
+-----------------------
+
+Along with the advantages of the peek APIs, zero copy APIs provide the ability
+to copy the data to the ring memory directly without the need for temporary
+storage (for example, an array of mbufs on the stack).
+
+These APIs make it possible to split the public enqueue/dequeue API into 3 phases:
+
+* enqueue/dequeue start
+
+* copy data to/from the ring
+
+* enqueue/dequeue finish
+
+Note that this API is available only for two sync modes:
+
+*   Single Producer/Single Consumer (SP/SC)
+
+*   Multi-producer/Multi-consumer with Head/Tail Sync (HTS)
+
+It is the user's responsibility to create/init the ring with the appropriate
+sync modes. The following is an example of usage:
+
+.. code-block:: c
+
+    /* Reserve space on the ring */
+    n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
+    /* Pkt I/O core polls packets from the NIC */
+    if (n != 0) {
+        nb_rx = rte_eth_rx_burst(portid, queueid, zcd.ptr1, zcd.n1);
+        if (nb_rx == zcd.n1 && n != zcd.n1)
+            nb_rx += rte_eth_rx_burst(portid, queueid, zcd.ptr2,
+                                      n - zcd.n1);
+        /* Provide packets to the packet processing cores */
+        rte_ring_enqueue_zc_finish(r, nb_rx);
+    }
+
+Note that between ``_start_`` and ``_finish_`` no other thread can proceed
+with enqueue(/dequeue) operation till ``_finish_`` completes.
+
 References
 ----------
 
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index d8ac359e5..fdc78b3da 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,15 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added zero copy APIs for rte_ring.**
+
+  For rings with producer/consumer in ``RTE_RING_SYNC_ST`` or
+  ``RTE_RING_SYNC_MT_HTS`` mode, these APIs split the enqueue/dequeue
+  operation into three phases (enqueue/dequeue start, copy data to/from
+  the ring, enqueue/dequeue finish). Along with the advantages of the
+  peek APIs, these provide the ability to copy the data to the ring
+  memory directly without the need for temporary storage.
+
 * **Added write combining store APIs.**
 
   Added ``rte_write32_wc`` and ``rte_write32_wc_relaxed`` APIs
-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v4 6/8] test/ring: fix the memory dump size
  2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
                     ` (4 preceding siblings ...)
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
@ 2020-10-24 16:11   ` Honnappa Nagarahalli
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 7/8] test/ring: remove unnecessary braces Honnappa Nagarahalli
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24 16:11 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd, stable

Pass the correct number of bytes to dump the memory.
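
The difference between the two sizeof expressions can be seen with a small
stand-alone sketch. struct elem_sketch and both function names are
hypothetical and used only for illustration; the real struct ring_elem is
not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type; stands in for the test's ring element. */
struct elem_sketch {
	uint32_t cnt[16];
};

/* What sizeof(elm[i]) yields: the size of a pointer. */
static size_t
dump_len_pointer(struct elem_sketch *elm[], unsigned int i)
{
	return sizeof(elm[i]);
}

/* What sizeof(*elm[i]) yields: the size of the pointed-to object. */
static size_t
dump_len_object(struct elem_sketch *elm[], unsigned int i)
{
	return sizeof(*elm[i]);
}
```

With the old expression the dump covers only a pointer's worth of bytes;
with the new one it covers the whole element.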

Fixes: bf28df24e915 ("test/ring: add contention stress test")
Cc: konstantin.ananyev@intel.com
Cc: stable@dpdk.org

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/test/test_ring_stress_impl.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test/test_ring_stress_impl.h b/app/test/test_ring_stress_impl.h
index 3b9a480eb..f9ca63b90 100644
--- a/app/test/test_ring_stress_impl.h
+++ b/app/test/test_ring_stress_impl.h
@@ -159,7 +159,7 @@ check_updt_elem(struct ring_elem *elm[], uint32_t num,
 				"offending object: %p\n",
 				__func__, rte_lcore_id(), num, i, elm[i]);
 			rte_memdump(stdout, "expected", check, sizeof(*check));
-			rte_memdump(stdout, "result", elm[i], sizeof(elm[i]));
+			rte_memdump(stdout, "result", elm[i], sizeof(*elm[i]));
 			rte_spinlock_unlock(&dump_lock);
 			return -EINVAL;
 		}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v4 7/8] test/ring: remove unnecessary braces
  2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
                     ` (5 preceding siblings ...)
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 6/8] test/ring: fix the memory dump size Honnappa Nagarahalli
@ 2020-10-24 16:11   ` Honnappa Nagarahalli
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 8/8] test/ring: use uintptr_t instead of unsigned long Honnappa Nagarahalli
  2020-10-24 16:18   ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add zero copy APIs Honnappa Nagarahalli
  8 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24 16:11 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Remove unnecessary braces to improve readability.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 app/test/test_ring.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/app/test/test_ring.h b/app/test/test_ring.h
index b525abb79..c8bfec839 100644
--- a/app/test/test_ring.h
+++ b/app/test/test_ring.h
@@ -35,11 +35,11 @@ test_ring_create(const char *name, int esize, unsigned int count,
 		int socket_id, unsigned int flags)
 {
 	/* Legacy queue APIs? */
-	if ((esize) == -1)
-		return rte_ring_create((name), (count), (socket_id), (flags));
+	if (esize == -1)
+		return rte_ring_create(name, count, socket_id, flags);
 	else
-		return rte_ring_create_elem((name), (esize), (count),
-						(socket_id), (flags));
+		return rte_ring_create_elem(name, esize, count,
+						socket_id, flags);
 }
 
 static inline void*
@@ -102,7 +102,7 @@ test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			unsigned int api_type)
 {
 	/* Legacy queue APIs? */
-	if ((esize) == -1)
+	if (esize == -1)
 		switch (api_type) {
 		case (TEST_RING_THREAD_DEF | TEST_RING_ELEM_SINGLE):
 			return rte_ring_enqueue(r, *obj);
@@ -163,7 +163,7 @@ test_ring_dequeue(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			unsigned int api_type)
 {
 	/* Legacy queue APIs? */
-	if ((esize) == -1)
+	if (esize == -1)
 		switch (api_type) {
 		case (TEST_RING_THREAD_DEF | TEST_RING_ELEM_SINGLE):
 			return rte_ring_dequeue(r, obj);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v4 8/8] test/ring: use uintptr_t instead of unsigned long
  2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
                     ` (6 preceding siblings ...)
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 7/8] test/ring: remove unnecessary braces Honnappa Nagarahalli
@ 2020-10-24 16:11   ` Honnappa Nagarahalli
  2020-10-24 16:18   ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add zero copy APIs Honnappa Nagarahalli
  8 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24 16:11 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Use uintptr_t instead of unsigned long while initializing the
array of pointers.
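
The commit message does not state the motivation. One likely reason is that
unsigned long is narrower than a pointer on LLP64 platforms such as 64-bit
Windows, while uintptr_t is defined to be wide enough to hold a pointer
value. A minimal sketch of the initialization pattern, with hypothetical
function names:

```c
#include <assert.h>
#include <stdint.h>

/* Fill a table of pointer-sized slots with their own index. */
static void
fill_index_ptrs(void **tbl, unsigned int count)
{
	unsigned int i;

	for (i = 0; i < count; i++)
		tbl[i] = (void *)(uintptr_t)i;
}

/* Read a slot back as an integer for comparison. */
static uintptr_t
slot_value(void * const *tbl, unsigned int i)
{
	return (uintptr_t)tbl[i];
}
```

After fill_index_ptrs(tbl, 4), slot_value(tbl, 3) is expected to read back
as 3 on common platforms.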

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 app/test/test_ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index 99fe4b46f..51c05cabb 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -447,7 +447,7 @@ test_ring_mem_init(void *obj, unsigned int count, int esize)
 	/* Legacy queue APIs? */
 	if (esize == -1)
 		for (i = 0; i < count; i++)
-			((void **)obj)[i] = (void *)(unsigned long)i;
+			((void **)obj)[i] = (void *)(uintptr_t)i;
 	else
 		for (i = 0; i < (count * esize / sizeof(uint32_t)); i++)
 			((uint32_t *)obj)[i] = i;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [dpdk-dev] [PATCH v4 0/8] lib/ring: add zero copy APIs
  2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
                     ` (7 preceding siblings ...)
  2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 8/8] test/ring: use uintptr_t instead of unsigned long Honnappa Nagarahalli
@ 2020-10-24 16:18   ` Honnappa Nagarahalli
  2020-10-25  7:16     ` David Marchand
  8 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-24 16:18 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev, konstantin.ananyev, stephen
  Cc: Dharmik Thakkar, Ruifeng Wang, olivier.matz, david.marchand, nd,
	Honnappa Nagarahalli, nd

Hi David,
	Checkpatch CI is showing "WARNING" on a lot of the patches in this series, but it does not list any real warnings.  Any idea what is happening?

Thanks,
Honnappa

> -----Original Message-----
> From: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Sent: Saturday, October 24, 2020 11:11 AM
> To: dev@dpdk.org; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; konstantin.ananyev@intel.com;
> stephen@networkplumber.org
> Cc: Dharmik Thakkar <Dharmik.Thakkar@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; olivier.matz@6wind.com;
> david.marchand@redhat.com; nd <nd@arm.com>
> Subject: [PATCH v4 0/8] lib/ring: add zero copy APIs
> 
> It is pretty common for DPDK applications to be deployed in a
> semi-pipeline model. In these models, a small number of cores
> (typically 1) are designated as I/O cores. The I/O cores receive
> packets from and transmit packets to the NIC on behalf of several
> packet processing cores. The I/O cores and the packet processing cores
> exchange the packets over a ring. Typically, such applications receive
> the mbufs in a temporary array and copy the mbufs on to the ring.
> Depending on the requirements, the packets could be copied in batches
> of 32, 64 etc., resulting in memory copies of 256B, 512B etc.
> 
> The zero copy APIs help avoid intermediate copies by exposing the space on
> the ring directly to the application.
> 
> v4:
> 1) Fixed multiple pointer issues
> 2) Added documentation
> 
> v3:
> 1) Changed the name of the APIs to 'zero-copy (zc)'
> 2) Made the address calculation simpler
> 3) Structure to return the data to the user is aligned on
>    cache line boundary.
> 4) Added functional and stress test cases
> 
> v2: changed the patch to use the SP-SC and HTS modes
> 
> v1: Initial version
> 
> Honnappa Nagarahalli (8):
>   lib/ring: add zero copy APIs
>   test/ring: move common function to header file
>   test/ring: add functional tests for zero copy APIs
>   test/ring: add stress tests for zero copy APIs
>   doc/ring: add zero copy peek APIs
>   test/ring: fix the memory dump size
>   test/ring: remove unnecessary braces
>   test/ring: user uintptr_t instead of unsigned long
> 
>  app/test/meson.build                   |   2 +
>  app/test/test_ring.c                   | 209 +++++++++-
>  app/test/test_ring.h                   |  67 ++-
>  app/test/test_ring_mt_peek_stress_zc.c |  56 +++
>  app/test/test_ring_st_peek_stress_zc.c |  65 +++
>  app/test/test_ring_stress.c            |   6 +
>  app/test/test_ring_stress.h            |   2 +
>  app/test/test_ring_stress_impl.h       |   2 +-
>  doc/guides/prog_guide/ring_lib.rst     |  41 ++
>  doc/guides/rel_notes/release_20_11.rst |   9 +
>  lib/librte_ring/meson.build            |   1 +
>  lib/librte_ring/rte_ring_elem.h        |   1 +
>  lib/librte_ring/rte_ring_peek_zc.h     | 546 +++++++++++++++++++++++++
>  13 files changed, 988 insertions(+), 19 deletions(-)
>  create mode 100644 app/test/test_ring_mt_peek_stress_zc.c
>  create mode 100644 app/test/test_ring_st_peek_stress_zc.c
>  create mode 100644 lib/librte_ring/rte_ring_peek_zc.h
> 
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v5 0/8] lib/ring: add zero copy APIs
  2020-02-24 20:39 [dpdk-dev] [RFC 0/1] lib/ring: add scatter gather and serial dequeue APIs Honnappa Nagarahalli
                   ` (3 preceding siblings ...)
  2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
@ 2020-10-25  5:45 ` Honnappa Nagarahalli
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 1/8] " Honnappa Nagarahalli
                     ` (8 more replies)
  4 siblings, 9 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-25  5:45 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

DPDK applications are commonly deployed in a semi-pipeline
model. In this model, a small number of cores (typically 1) are
designated as I/O cores. The I/O cores receive packets from and
transmit packets to the NIC, and exchange packets with several
packet processing cores over a ring. Typically, such applications
receive the mbufs in a temporary array and then copy the mbuf
pointers on to the ring. Depending on the requirements, the
packets could be copied in batches of 32, 64, etc., resulting in
memory copies of 256B, 512B, etc. (8 bytes per pointer on 64-bit
targets).

The zero copy APIs help avoid intermediate copies by exposing
the space on the ring directly to the application.
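
As an illustration only, the reserve / fill in place / publish flow
described above can be modelled on a toy single-producer ring. The
names below (toy_ring, toy_enqueue_start, toy_enqueue_finish,
toy_count) are hypothetical and are not part of this series; the
sketch also stops a reservation at the end of the array instead of
handling wrap-around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u	/* hypothetical capacity, power of two */

/* Hypothetical single-producer/single-consumer ring of pointers. */
struct toy_ring {
	uint32_t prod_head;	/* next slot to reserve */
	uint32_t prod_tail;	/* slots below this are visible to the consumer */
	uint32_t cons_tail;	/* slots below this have been consumed */
	void *slots[TOY_RING_SIZE];
};

/* Phase 1: reserve up to n contiguous slots and expose them to the
 * caller. Returns the number of slots reserved; *space points into
 * the ring storage itself, so no temporary array is needed.
 */
static unsigned int
toy_enqueue_start(struct toy_ring *r, unsigned int n, void ***space)
{
	uint32_t free_slots = TOY_RING_SIZE - (r->prod_head - r->cons_tail);
	uint32_t idx = r->prod_head & (TOY_RING_SIZE - 1);
	uint32_t contig = TOY_RING_SIZE - idx;

	assert(r->prod_head == r->prod_tail); /* one reservation at a time */
	if (n > free_slots)
		n = free_slots;
	if (n > contig)
		n = contig;
	*space = &r->slots[idx];
	r->prod_head += n;
	return n;
}

/* Phase 3: publish the first n reserved slots to the consumer and
 * drop whatever part of the reservation was not used.
 */
static void
toy_enqueue_finish(struct toy_ring *r, unsigned int n)
{
	assert(n <= r->prod_head - r->prod_tail);
	r->prod_tail += n;
	r->prod_head = r->prod_tail;
}

/* Number of entries the consumer is allowed to read. */
static unsigned int
toy_count(const struct toy_ring *r)
{
	return r->prod_tail - r->cons_tail;
}
```

Phase 2 is whatever the application does with the returned space,
for example letting a receive routine write mbuf pointers into it.
Nothing becomes visible to the consumer until toy_enqueue_finish()
is called with the number of slots actually filled.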

v5:
1) Fixed CI compilation issues

v4:
1) Fixed multiple pointer issues
2) Added documentation

v3:
1) Changed the name of the APIs to 'zero-copy (zc)'
2) Made the address calculation simpler
3) Structure to return the data to the user is aligned on
   cache line boundary.
4) Added functional and stress test cases

v2: changed the patch to use the SP-SC and HTS modes

v1: Initial version

Honnappa Nagarahalli (8):
  lib/ring: add zero copy APIs
  test/ring: move common function to header file
  test/ring: add functional tests for zero copy APIs
  test/ring: add stress tests for zero copy APIs
  doc/ring: add zero copy peek APIs
  test/ring: fix the memory dump size
  test/ring: remove unnecessary braces
  test/ring: user uintptr_t instead of unsigned long

 app/test/meson.build                   |   2 +
 app/test/test_ring.c                   | 209 +++++++++-
 app/test/test_ring.h                   |  67 ++-
 app/test/test_ring_mt_peek_stress_zc.c |  56 +++
 app/test/test_ring_st_peek_stress_zc.c |  63 +++
 app/test/test_ring_stress.c            |   6 +
 app/test/test_ring_stress.h            |   2 +
 app/test/test_ring_stress_impl.h       |   2 +-
 doc/guides/prog_guide/ring_lib.rst     |  41 ++
 doc/guides/rel_notes/release_20_11.rst |   9 +
 lib/librte_ring/meson.build            |   1 +
 lib/librte_ring/rte_ring_elem.h        |   1 +
 lib/librte_ring/rte_ring_peek_zc.h     | 548 +++++++++++++++++++++++++
 13 files changed, 988 insertions(+), 19 deletions(-)
 create mode 100644 app/test/test_ring_mt_peek_stress_zc.c
 create mode 100644 app/test/test_ring_st_peek_stress_zc.c
 create mode 100644 lib/librte_ring/rte_ring_peek_zc.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v5 1/8] lib/ring: add zero copy APIs
  2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
@ 2020-10-25  5:45   ` Honnappa Nagarahalli
  2020-10-27 14:11     ` Ananyev, Konstantin
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 2/8] test/ring: move common function to header file Honnappa Nagarahalli
                     ` (7 subsequent siblings)
  8 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-25  5:45 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Add zero-copy APIs. These APIs provide the capability to
copy the data to/from the ring memory directly, without
an intermediate temporary copy (for example, an array of
mbufs on the stack). Use cases that involve copying large
amounts of data to/from the ring can benefit from these APIs.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
---
 lib/librte_ring/meson.build        |   1 +
 lib/librte_ring/rte_ring_elem.h    |   1 +
 lib/librte_ring/rte_ring_peek_zc.h | 548 +++++++++++++++++++++++++++++
 3 files changed, 550 insertions(+)
 create mode 100644 lib/librte_ring/rte_ring_peek_zc.h
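
For readers who want to see the wrap-around handling in isolation,
the following is a minimal standalone sketch of how a reservation
can be split into two segments. It is modelled on the address
calculation in this patch but does not use the rte_ring types;
seg_info and split_reservation are hypothetical names, and the ring
size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the (ptr1, n1, ptr2) triple handed back
 * to the user.
 */
struct seg_info {
	void *ptr1;		/* first segment */
	void *ptr2;		/* second segment, NULL when there is no wrap */
	unsigned int n1;	/* number of elements available at ptr1 */
};

/* Split a reservation of num elements that starts at ring index head.
 * storage: start of the ring storage
 * size:    number of slots, assumed to be a power of two
 * esize:   element size in bytes, a multiple of 4
 */
static void
split_reservation(void *storage, uint32_t size, uint32_t head,
	uint32_t esize, uint32_t num, struct seg_info *si)
{
	uint32_t *ring = storage;
	uint32_t scale = esize / sizeof(uint32_t);
	uint32_t idx = head & (size - 1);

	assert(esize % sizeof(uint32_t) == 0);
	assert(num <= size);

	si->ptr1 = ring + (size_t)idx * scale;
	if (idx + num > size) {
		/* Reservation runs past the end: continue at slot 0 */
		si->n1 = size - idx;
		si->ptr2 = ring;
	} else {
		si->n1 = num;
		si->ptr2 = NULL;
	}
}
```

With 8 slots, a reservation of 5 elements starting at index 6 gives
n1 = 2 at slot 6 and the remaining 3 elements at slot 0; the caller
derives the second count as num - n1.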

diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
index 31c0b4649..36fdcb6a5 100644
--- a/lib/librte_ring/meson.build
+++ b/lib/librte_ring/meson.build
@@ -11,5 +11,6 @@ headers = files('rte_ring.h',
 		'rte_ring_hts_c11_mem.h',
 		'rte_ring_peek.h',
 		'rte_ring_peek_c11_mem.h',
+		'rte_ring_peek_zc.h',
 		'rte_ring_rts.h',
 		'rte_ring_rts_c11_mem.h')
diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
index 938b398fc..7034d29c0 100644
--- a/lib/librte_ring/rte_ring_elem.h
+++ b/lib/librte_ring/rte_ring_elem.h
@@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
 
 #ifdef ALLOW_EXPERIMENTAL_API
 #include <rte_ring_peek.h>
+#include <rte_ring_peek_zc.h>
 #endif
 
 #include <rte_ring.h>
diff --git a/lib/librte_ring/rte_ring_peek_zc.h b/lib/librte_ring/rte_ring_peek_zc.h
new file mode 100644
index 000000000..603d30df3
--- /dev/null
+++ b/lib/librte_ring/rte_ring_peek_zc.h
@@ -0,0 +1,548 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ *
+ * Copyright (c) 2020 Arm Limited
+ * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * All rights reserved.
+ * Derived from FreeBSD's bufring.h
+ * Used as BSD-3 Licensed with permission from Kip Macy.
+ */
+
+#ifndef _RTE_RING_PEEK_ZC_H_
+#define _RTE_RING_PEEK_ZC_H_
+
+/**
+ * @file
+ * @b EXPERIMENTAL: this API may change without prior notice
+ * It is not recommended to include this file directly.
+ * Please include <rte_ring_elem.h> instead.
+ *
+ * Ring Peek Zero Copy APIs
+ * These APIs make it possible to split public enqueue/dequeue API
+ * into 3 parts:
+ * - enqueue/dequeue start
+ * - copy data to/from the ring
+ * - enqueue/dequeue finish
+ * Along with the advantages of the peek APIs, these APIs provide the ability
+ * to avoid copying the data to a temporary area (for example, an array of
+ * mbufs on the stack).
+ *
+ * Note that currently these APIs are available only for two sync modes:
+ * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
+ * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
+ * It is the user's responsibility to create/init ring with appropriate sync
+ * modes selected.
+ *
+ * Following are some examples showing the API usage.
+ * 1)
+ * struct elem_obj {uint64_t a; uint32_t b, c;};
+ * struct elem_obj *obj;
+ *
+ * // Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS
+ * // Reserve space on the ring
+ * n = rte_ring_enqueue_zc_bulk_elem_start(r, sizeof(struct elem_obj), 1,
+ *					&zcd, NULL);
+ * // If n != 0, produce the data directly on the ring memory
+ * obj = (struct elem_obj *)zcd.ptr1;
+ * obj->a = rte_get_a();
+ * obj->b = rte_get_b();
+ * obj->c = rte_get_c();
+ * rte_ring_enqueue_zc_elem_finish(r, n);
+ *
+ * 2)
+ * // Create ring with sync type RTE_RING_SYNC_ST or RTE_RING_SYNC_MT_HTS
+ * // Reserve space on the ring
+ * n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
+ *
+ * // Pkt I/O core polls packets from the NIC
+ * if (n != 0) {
+ *	nb_rx = rte_eth_rx_burst(portid, queueid, zcd.ptr1, zcd.n1);
+ *	if (nb_rx == zcd.n1 && n != zcd.n1)
+ *		nb_rx += rte_eth_rx_burst(portid, queueid,
+ *						zcd.ptr2, n - zcd.n1);
+ *
+ *	// Provide packets to the packet processing cores
+ *	rte_ring_enqueue_zc_finish(r, nb_rx);
+ * }
+ *
+ * Note that between _start_ and _finish_ no other thread can proceed
+ * with an enqueue/dequeue operation until _finish_ completes.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_ring_peek_c11_mem.h>
+
+/**
+ * Ring zero-copy information structure.
+ *
+ * This structure contains the pointers and length of the space
+ * reserved on the ring storage.
+ */
+struct rte_ring_zc_data {
+	/* Pointer to the first space in the ring */
+	void *ptr1;
+	/* Pointer to the second space in the ring if there is wrap-around.
+	 * It contains valid value only if wrap-around happens.
+	 */
+	void *ptr2;
+	/* Number of elements in the first pointer. If this is equal to
+	 * the number of elements requested, then ptr2 is NULL.
+	 * Otherwise, subtracting n1 from number of elements requested
+	 * will give the number of elements available at ptr2.
+	 */
+	unsigned int n1;
+} __rte_cache_aligned;
+
+static __rte_always_inline void
+__rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
+	uint32_t esize, uint32_t num, void **dst1, uint32_t *n1, void **dst2)
+{
+	uint32_t idx, scale, nr_idx;
+	uint32_t *ring = (uint32_t *)&r[1];
+
+	/* Normalize to uint32_t */
+	scale = esize / sizeof(uint32_t);
+	idx = head & r->mask;
+	nr_idx = idx * scale;
+
+	*dst1 = ring + nr_idx;
+	*n1 = num;
+
+	if (idx + num > r->size) {
+		*n1 = r->size - idx;
+		*dst2 = ring;
+	} else {
+		*dst2 = NULL;
+	}
+}
+
+/**
+ * @internal This function moves prod head value.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_enqueue_zc_elem_start(struct rte_ring *r, unsigned int esize,
+		uint32_t n, enum rte_ring_queue_behavior behavior,
+		struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	uint32_t free, head, next;
+
+	switch (r->prod.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_move_prod_head(r, RTE_RING_SYNC_ST, n,
+			behavior, &head, &next, &free);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_move_prod_head(r, n, behavior, &head, &free);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+		n = 0;
+		free = 0;
+		return n;
+	}
+
+	__rte_ring_get_elem_addr(r, head, esize, n, &zcd->ptr1,
+		&zcd->n1, &zcd->ptr2);
+
+	if (free_space != NULL)
+		*free_space = free - n;
+	return n;
+}
+
+/**
+ * Start to enqueue several objects on the ring.
+ * Note that no actual objects are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy objects into the queue using the returned pointers.
+ * User should call rte_ring_enqueue_zc_elem_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_bulk_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return __rte_ring_do_enqueue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_FIXED, zcd, free_space);
+}
+
+/**
+ * Start to enqueue several pointers to objects on the ring.
+ * Note that no actual pointers are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy pointers to objects into the queue using the
+ * returned pointers.
+ * User should call rte_ring_enqueue_zc_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_bulk_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return rte_ring_enqueue_zc_bulk_elem_start(r, sizeof(uintptr_t), n,
+							zcd, free_space);
+}
+/**
+ * Start to enqueue several objects on the ring.
+ * Note that no actual objects are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy objects into the queue using the returned pointers.
+ * User should call rte_ring_enqueue_zc_elem_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_burst_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return __rte_ring_do_enqueue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_VARIABLE, zcd, free_space);
+}
+
+/**
+ * Start to enqueue several pointers to objects on the ring.
+ * Note that no actual pointers are put in the queue by this function,
+ * it just reserves space for the user on the ring.
+ * User has to copy pointers to objects into the queue using the
+ * returned pointers.
+ * User should call rte_ring_enqueue_zc_finish to complete the
+ * enqueue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add in the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   reservation operation has finished.
+ * @return
+ *   The number of objects that can be enqueued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_enqueue_zc_burst_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_zc_data *zcd, unsigned int *free_space)
+{
+	return rte_ring_enqueue_zc_burst_elem_start(r, sizeof(uintptr_t), n,
+							zcd, free_space);
+}
+
+/**
+ * Complete enqueuing several objects on the ring.
+ * Note that number of objects to enqueue should not exceed previous
+ * enqueue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to add to the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_enqueue_zc_elem_finish(struct rte_ring *r, unsigned int n)
+{
+	uint32_t tail;
+
+	switch (r->prod.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_st_get_tail(&r->prod, &tail, n);
+		__rte_ring_st_set_head_tail(&r->prod, tail, n, 1);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_get_tail(&r->hts_prod, &tail, n);
+		__rte_ring_hts_set_head_tail(&r->hts_prod, tail, n, 1);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+	}
+}
+
+/**
+ * Complete enqueuing several pointers to objects on the ring.
+ * Note that number of objects to enqueue should not exceed previous
+ * enqueue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of pointers to objects to add to the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_enqueue_zc_finish(struct rte_ring *r, unsigned int n)
+{
+	rte_ring_enqueue_zc_elem_finish(r, n);
+}
+
+/**
+ * @internal This function moves cons head value and returns pointers to
+ * the ring storage holding up to *n* objects; it does not copy any objects.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_dequeue_zc_elem_start(struct rte_ring *r,
+	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
+	struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	uint32_t avail, head, next;
+
+	switch (r->cons.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_move_cons_head(r, RTE_RING_SYNC_ST, n,
+			behavior, &head, &next, &avail);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_move_cons_head(r, n, behavior,
+			&head, &avail);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+		n = 0;
+		avail = 0;
+		return n;
+	}
+
+	__rte_ring_get_elem_addr(r, head, esize, n, &zcd->ptr1,
+		&zcd->n1, &zcd->ptr2);
+
+	if (available != NULL)
+		*available = avail - n;
+	return n;
+}
+
+/**
+ * Start to dequeue several objects from the ring.
+ * Note that no actual objects are copied from the queue by this function.
+ * User has to copy objects from the queue using the returned pointers.
+ * User should call rte_ring_dequeue_zc_elem_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_bulk_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return __rte_ring_do_dequeue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_FIXED, zcd, available);
+}
+
+/**
+ * Start to dequeue several pointers to objects from the ring.
+ * Note that no actual pointers are removed from the queue by this function.
+ * User has to copy pointers to objects from the queue using the
+ * returned pointers.
+ * User should call rte_ring_dequeue_zc_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, either 0 or n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_bulk_start(struct rte_ring *r, unsigned int n,
+	struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return rte_ring_dequeue_zc_bulk_elem_start(r, sizeof(uintptr_t),
+		n, zcd, available);
+}
+
+/**
+ * Start to dequeue several objects from the ring.
+ * Note that no actual objects are copied from the queue by this function.
+ * User has to copy objects from the queue using the returned pointers.
+ * User should call rte_ring_dequeue_zc_elem_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to dequeue from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_burst_elem_start(struct rte_ring *r, unsigned int esize,
+	unsigned int n, struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return __rte_ring_do_dequeue_zc_elem_start(r, esize, n,
+			RTE_RING_QUEUE_VARIABLE, zcd, available);
+}
+
+/**
+ * Start to dequeue several pointers to objects from the ring.
+ * Note that no actual pointers are removed from the queue by this function.
+ * User has to copy pointers to objects from the queue using the
+ * returned pointers.
+ * User should call rte_ring_dequeue_zc_finish to complete the
+ * dequeue operation.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ * @param zcd
+ *   Structure containing the pointers and length of the space
+ *   reserved on the ring storage.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects that can be dequeued, between 0 and n
+ */
+__rte_experimental
+static __rte_always_inline unsigned int
+rte_ring_dequeue_zc_burst_start(struct rte_ring *r, unsigned int n,
+		struct rte_ring_zc_data *zcd, unsigned int *available)
+{
+	return rte_ring_dequeue_zc_burst_elem_start(r, sizeof(uintptr_t), n,
+			zcd, available);
+}
+
+/**
+ * Complete dequeuing several objects from the ring.
+ * Note that number of objects to dequeue should not exceed previous
+ * dequeue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_dequeue_zc_elem_finish(struct rte_ring *r, unsigned int n)
+{
+	uint32_t tail;
+
+	switch (r->cons.sync_type) {
+	case RTE_RING_SYNC_ST:
+		n = __rte_ring_st_get_tail(&r->cons, &tail, n);
+		__rte_ring_st_set_head_tail(&r->cons, tail, n, 0);
+		break;
+	case RTE_RING_SYNC_MT_HTS:
+		n = __rte_ring_hts_get_tail(&r->hts_cons, &tail, n);
+		__rte_ring_hts_set_head_tail(&r->hts_cons, tail, n, 0);
+		break;
+	case RTE_RING_SYNC_MT:
+	case RTE_RING_SYNC_MT_RTS:
+	default:
+		/* unsupported mode, shouldn't be here */
+		RTE_ASSERT(0);
+	}
+}
+
+/**
+ * Complete dequeuing several pointers to objects from the ring.
+ * Note that number of objects to dequeue should not exceed previous
+ * dequeue_start return value.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param n
+ *   The number of objects to remove from the ring.
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_ring_dequeue_zc_finish(struct rte_ring *r, unsigned int n)
+{
+	rte_ring_dequeue_zc_elem_finish(r, n);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RING_PEEK_ZC_H_ */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v5 2/8] test/ring: move common function to header file
  2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 1/8] " Honnappa Nagarahalli
@ 2020-10-25  5:45   ` Honnappa Nagarahalli
  2020-10-27 13:51     ` Ananyev, Konstantin
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 3/8] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
                     ` (6 subsequent siblings)
  8 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-25  5:45 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Move test_ring_inc_ptr to header file so that it can be used by
functions in other files.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
---
 app/test/test_ring.c | 11 -----------
 app/test/test_ring.h | 13 +++++++++++++
 2 files changed, 13 insertions(+), 11 deletions(-)
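
To make the pointer arithmetic of the relocated helper concrete, here
is a small standalone sketch. demo_inc_ptr repeats the arithmetic of
the new test_ring_inc_ptr under a different (hypothetical) name, and
demo_offset is a hypothetical helper that reports how many bytes the
pointer advanced. As in the test code, esize == -1 is taken to mean
the legacy pointer-sized element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Advance obj by n elements of esize bytes each; esize == -1 selects
 * the legacy pointer-sized element.
 */
static inline void *
demo_inc_ptr(void *obj, int esize, unsigned int n)
{
	size_t sz = sizeof(void *);

	if (esize != -1) {
		assert(esize > 0 && esize % 4 == 0);
		sz = (size_t)esize;
	}

	/* Step in units of uint32_t, the granularity of the ring storage */
	return (void *)((uint32_t *)obj + (n * sz / sizeof(uint32_t)));
}

/* Distance in bytes between the advanced pointer and its base. */
static inline size_t
demo_offset(void *base, int esize, unsigned int n)
{
	return (size_t)((char *)demo_inc_ptr(base, esize, n) - (char *)base);
}
```

For example, advancing by 2 elements of 20 bytes moves the pointer
40 bytes, while advancing by 3 legacy elements moves it by
3 * sizeof(void *) bytes.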

diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index a62cb263b..329d538a9 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -243,17 +243,6 @@ test_ring_deq_impl(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			NULL);
 }
 
-static void**
-test_ring_inc_ptr(void **obj, int esize, unsigned int n)
-{
-	/* Legacy queue APIs? */
-	if ((esize) == -1)
-		return ((void **)obj) + n;
-	else
-		return (void **)(((uint32_t *)obj) +
-					(n * esize / sizeof(uint32_t)));
-}
-
 static void
 test_ring_mem_init(void *obj, unsigned int count, int esize)
 {
diff --git a/app/test/test_ring.h b/app/test/test_ring.h
index d4b15af7c..b44711398 100644
--- a/app/test/test_ring.h
+++ b/app/test/test_ring.h
@@ -42,6 +42,19 @@ test_ring_create(const char *name, int esize, unsigned int count,
 						(socket_id), (flags));
 }
 
+static inline void*
+test_ring_inc_ptr(void *obj, int esize, unsigned int n)
+{
+	size_t sz;
+
+	sz = sizeof(void *);
+	/* A non-legacy API passes the element size explicitly */
+	if (esize != -1)
+		sz = esize;
+
+	return (void *)((uint32_t *)obj + (n * sz / sizeof(uint32_t)));
+}
+
 static __rte_always_inline unsigned int
 test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			unsigned int api_type)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [dpdk-dev] [PATCH v5 3/8] test/ring: add functional tests for zero copy APIs
  2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 1/8] " Honnappa Nagarahalli
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 2/8] test/ring: move common function to header file Honnappa Nagarahalli
@ 2020-10-25  5:45   ` Honnappa Nagarahalli
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 4/8] test/ring: add stress " Honnappa Nagarahalli
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-25  5:45 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Add functional tests for zero copy APIs. Test enqueue/dequeue
functions are created using the zero copy APIs to fit into
the existing testing method.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/test/test_ring.c | 196 +++++++++++++++++++++++++++++++++++++++++++
 app/test/test_ring.h |  42 ++++++++++
 2 files changed, 238 insertions(+)
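
The wrappers in this patch all follow the same start / copy / finish
pattern. The copy step has to honour the two segments returned by
the start call. A minimal standalone sketch of such a copy is shown
below; demo_zc_data and demo_copy_to are hypothetical names used only
for illustration, and this is an assumption about how a two-segment
copy can be written, not the helper used by the tests.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the fields read from the zero-copy
 * descriptor.
 */
struct demo_zc_data {
	void *ptr1;		/* first reserved segment */
	void *ptr2;		/* second segment, used only on wrap-around */
	unsigned int n1;	/* number of elements available at ptr1 */
};

/* Copy num elements of esize bytes from src into the reserved space:
 * the first n1 elements go to ptr1, the remainder (if any) to ptr2.
 */
static void
demo_copy_to(const struct demo_zc_data *zcd, const void *src,
	unsigned int esize, unsigned int num)
{
	unsigned int first = num < zcd->n1 ? num : zcd->n1;

	memcpy(zcd->ptr1, src, (size_t)first * esize);
	if (num > first) {
		assert(zcd->ptr2 != NULL);
		memcpy(zcd->ptr2, (const char *)src + (size_t)first * esize,
			(size_t)(num - first) * esize);
	}
}
```

A dequeue-side copy would be the mirror image, reading from ptr1 and
ptr2 into the user's table before the finish call releases the
entries.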

diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index 329d538a9..3914cb98a 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2020 Arm Limited
  */
 
 #include <string.h>
@@ -68,6 +69,149 @@
 
 static const int esize[] = {-1, 4, 8, 16, 20};
 
+/* Wrappers around the zero-copy APIs. The wrappers match
+ * the normal enqueue/dequeue API declarations.
+ */
+static unsigned int
+test_ring_enqueue_zc_bulk(struct rte_ring *r, void * const *obj_table,
+	unsigned int n, unsigned int *free_space)
+{
+	uint32_t ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_bulk_start(r, n, &zcd, free_space);
+	if (ret != 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_enqueue_zc_bulk_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_bulk_elem_start(r, esize, n,
+				&zcd, free_space);
+	if (ret != 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, esize, ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_enqueue_zc_burst(struct rte_ring *r, void * const *obj_table,
+	unsigned int n, unsigned int *free_space)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_burst_start(r, n, &zcd, free_space);
+	if (ret != 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_enqueue_zc_burst_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_enqueue_zc_burst_elem_start(r, esize, n,
+				&zcd, free_space);
+	if (ret != 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj_table, esize, ret);
+		rte_ring_enqueue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_bulk(struct rte_ring *r, void **obj_table,
+	unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_bulk_start(r, n, &zcd, available);
+	if (ret != 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_bulk_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_bulk_elem_start(r, esize, n,
+				&zcd, available);
+	if (ret != 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, esize, ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_burst(struct rte_ring *r, void **obj_table,
+	unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_burst_start(r, n, &zcd, available);
+	if (ret != 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, sizeof(void *), ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
+static unsigned int
+test_ring_dequeue_zc_burst_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	unsigned int ret;
+	struct rte_ring_zc_data zcd;
+
+	ret = rte_ring_dequeue_zc_burst_elem_start(r, esize, n,
+				&zcd, available);
+	if (ret != 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj_table, esize, ret);
+		rte_ring_dequeue_zc_finish(r, ret);
+	}
+
+	return ret;
+}
+
 static const struct {
 	const char *desc;
 	uint32_t api_type;
@@ -219,6 +363,58 @@ static const struct {
 			.felem = rte_ring_dequeue_burst_elem,
 		},
 	},
+	{
+		.desc = "SP/SC sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_SPSC,
+		.create_flags = RING_F_SP_ENQ | RING_F_SC_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_bulk,
+			.felem = test_ring_enqueue_zc_bulk_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_bulk,
+			.felem = test_ring_dequeue_zc_bulk_elem,
+		},
+	},
+	{
+		.desc = "MP_HTS/MC_HTS sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_bulk,
+			.felem = test_ring_enqueue_zc_bulk_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_bulk,
+			.felem = test_ring_dequeue_zc_bulk_elem,
+		},
+	},
+	{
+		.desc = "SP/SC sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_SPSC,
+		.create_flags = RING_F_SP_ENQ | RING_F_SC_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_burst,
+			.felem = test_ring_enqueue_zc_burst_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_burst,
+			.felem = test_ring_dequeue_zc_burst_elem,
+		},
+	},
+	{
+		.desc = "MP_HTS/MC_HTS sync mode (ZC)",
+		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ,
+		.enq = {
+			.flegacy = test_ring_enqueue_zc_burst,
+			.felem = test_ring_enqueue_zc_burst_elem,
+		},
+		.deq = {
+			.flegacy = test_ring_dequeue_zc_burst,
+			.felem = test_ring_dequeue_zc_burst_elem,
+		},
+	}
 };
 
 static unsigned int
diff --git a/app/test/test_ring.h b/app/test/test_ring.h
index b44711398..b525abb79 100644
--- a/app/test/test_ring.h
+++ b/app/test/test_ring.h
@@ -55,6 +55,48 @@ test_ring_inc_ptr(void *obj, int esize, unsigned int n)
 	return (void *)((uint32_t *)obj + (n * sz / sizeof(uint32_t)));
 }
 
+static inline void
+test_ring_mem_copy(void *dst, void * const *src, int esize, unsigned int num)
+{
+	size_t sz;
+
+	sz = num * sizeof(void *);
+	if (esize != -1)
+		sz = esize * num;
+
+	memcpy(dst, src, sz);
+}
+
+/* Copy to the ring memory */
+static inline void
+test_ring_copy_to(struct rte_ring_zc_data *zcd, void * const *src, int esize,
+	unsigned int num)
+{
+	test_ring_mem_copy(zcd->ptr1, src, esize, zcd->n1);
+	if (zcd->n1 != num) {
+		if (esize == -1)
+			src = src + zcd->n1;
+		else
+			src = (void * const *)((const uint32_t *)src +
+					(zcd->n1 * esize / sizeof(uint32_t)));
+		test_ring_mem_copy(zcd->ptr2, src,
+					esize, num - zcd->n1);
+	}
+}
+
+/* Copy from the ring memory */
+static inline void
+test_ring_copy_from(struct rte_ring_zc_data *zcd, void *dst, int esize,
+	unsigned int num)
+{
+	test_ring_mem_copy(dst, zcd->ptr1, esize, zcd->n1);
+
+	if (zcd->n1 != num) {
+		dst = test_ring_inc_ptr(dst, esize, zcd->n1);
+		test_ring_mem_copy(dst, zcd->ptr2, esize, num - zcd->n1);
+	}
+}
+
 static __rte_always_inline unsigned int
 test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			unsigned int api_type)
-- 
2.17.1



* [dpdk-dev] [PATCH v5 4/8] test/ring: add stress tests for zero copy APIs
  2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
                     ` (2 preceding siblings ...)
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 3/8] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
@ 2020-10-25  5:45   ` Honnappa Nagarahalli
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-25  5:45 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Add stress tests for zero copy API.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/test/meson.build                   |  2 +
 app/test/test_ring_mt_peek_stress_zc.c | 56 +++++++++++++++++++++++
 app/test/test_ring_st_peek_stress_zc.c | 63 ++++++++++++++++++++++++++
 app/test/test_ring_stress.c            |  6 +++
 app/test/test_ring_stress.h            |  2 +
 5 files changed, 129 insertions(+)
 create mode 100644 app/test/test_ring_mt_peek_stress_zc.c
 create mode 100644 app/test/test_ring_st_peek_stress_zc.c

diff --git a/app/test/meson.build b/app/test/meson.build
index 8bfb02890..88c831a92 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -108,9 +108,11 @@ test_sources = files('commands.c',
 	'test_ring_mpmc_stress.c',
 	'test_ring_hts_stress.c',
 	'test_ring_mt_peek_stress.c',
+	'test_ring_mt_peek_stress_zc.c',
 	'test_ring_perf.c',
 	'test_ring_rts_stress.c',
 	'test_ring_st_peek_stress.c',
+	'test_ring_st_peek_stress_zc.c',
 	'test_ring_stress.c',
 	'test_rwlock.c',
 	'test_sched.c',
diff --git a/app/test/test_ring_mt_peek_stress_zc.c b/app/test/test_ring_mt_peek_stress_zc.c
new file mode 100644
index 000000000..7e0bd511a
--- /dev/null
+++ b/app/test/test_ring_mt_peek_stress_zc.c
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Arm Limited
+ */
+
+#include "test_ring.h"
+#include "test_ring_stress_impl.h"
+#include <rte_ring_elem.h>
+
+static inline uint32_t
+_st_ring_dequeue_bulk(struct rte_ring *r, void **obj, uint32_t n,
+	uint32_t *avail)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	m = rte_ring_dequeue_zc_bulk_start(r, n, &zcd, avail);
+	n = (m == n) ? n : 0;
+	if (n != 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj, -1, n);
+		rte_ring_dequeue_zc_finish(r, n);
+	}
+
+	return n;
+}
+
+static inline uint32_t
+_st_ring_enqueue_bulk(struct rte_ring *r, void * const *obj, uint32_t n,
+	uint32_t *free)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	m = rte_ring_enqueue_zc_bulk_start(r, n, &zcd, free);
+	n = (m == n) ? n : 0;
+	if (n != 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj, -1, n);
+		rte_ring_enqueue_zc_finish(r, n);
+	}
+
+	return n;
+}
+
+static int
+_st_ring_init(struct rte_ring *r, const char *name, uint32_t num)
+{
+	return rte_ring_init(r, name, num,
+		RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ);
+}
+
+const struct test test_ring_mt_peek_stress_zc = {
+	.name = "MT_PEEK_ZC",
+	.nb_case = RTE_DIM(tests),
+	.cases = tests,
+};
diff --git a/app/test/test_ring_st_peek_stress_zc.c b/app/test/test_ring_st_peek_stress_zc.c
new file mode 100644
index 000000000..b9dbd4a6f
--- /dev/null
+++ b/app/test/test_ring_st_peek_stress_zc.c
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Arm Limited
+ */
+
+#include "test_ring.h"
+#include "test_ring_stress_impl.h"
+#include <rte_ring_elem.h>
+
+static inline uint32_t
+_st_ring_dequeue_bulk(struct rte_ring *r, void **obj, uint32_t n,
+	uint32_t *avail)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	static rte_spinlock_t lck = RTE_SPINLOCK_INITIALIZER;
+
+	rte_spinlock_lock(&lck);
+
+	m = rte_ring_dequeue_zc_bulk_start(r, n, &zcd, avail);
+	if (m != 0) {
+		/* Copy the data from the ring */
+		test_ring_copy_from(&zcd, obj, -1, m);
+		rte_ring_dequeue_zc_finish(r, m);
+	}
+
+	rte_spinlock_unlock(&lck);
+	return m;
+}
+
+static inline uint32_t
+_st_ring_enqueue_bulk(struct rte_ring *r, void * const *obj, uint32_t n,
+	uint32_t *free)
+{
+	uint32_t m;
+	struct rte_ring_zc_data zcd;
+
+	static rte_spinlock_t lck = RTE_SPINLOCK_INITIALIZER;
+
+	rte_spinlock_lock(&lck);
+
+	m = rte_ring_enqueue_zc_bulk_start(r, n, &zcd, free);
+	if (m != 0) {
+		/* Copy the data to the ring */
+		test_ring_copy_to(&zcd, obj, -1, m);
+		rte_ring_enqueue_zc_finish(r, m);
+	}
+
+	rte_spinlock_unlock(&lck);
+	return m;
+}
+
+static int
+_st_ring_init(struct rte_ring *r, const char *name, uint32_t num)
+{
+	return rte_ring_init(r, name, num, RING_F_SP_ENQ | RING_F_SC_DEQ);
+}
+
+const struct test test_ring_st_peek_stress_zc = {
+	.name = "ST_PEEK_ZC",
+	.nb_case = RTE_DIM(tests),
+	.cases = tests,
+};
diff --git a/app/test/test_ring_stress.c b/app/test/test_ring_stress.c
index c4f82ea56..1af45e0fc 100644
--- a/app/test/test_ring_stress.c
+++ b/app/test/test_ring_stress.c
@@ -49,9 +49,15 @@ test_ring_stress(void)
 	n += test_ring_mt_peek_stress.nb_case;
 	k += run_test(&test_ring_mt_peek_stress);
 
+	n += test_ring_mt_peek_stress_zc.nb_case;
+	k += run_test(&test_ring_mt_peek_stress_zc);
+
 	n += test_ring_st_peek_stress.nb_case;
 	k += run_test(&test_ring_st_peek_stress);
 
+	n += test_ring_st_peek_stress_zc.nb_case;
+	k += run_test(&test_ring_st_peek_stress_zc);
+
 	printf("Number of tests:\t%u\nSuccess:\t%u\nFailed:\t%u\n",
 		n, k, n - k);
 	return (k != n);
diff --git a/app/test/test_ring_stress.h b/app/test/test_ring_stress.h
index c85d6fa92..416d68c9a 100644
--- a/app/test/test_ring_stress.h
+++ b/app/test/test_ring_stress.h
@@ -36,4 +36,6 @@ extern const struct test test_ring_mpmc_stress;
 extern const struct test test_ring_rts_stress;
 extern const struct test test_ring_hts_stress;
 extern const struct test test_ring_mt_peek_stress;
+extern const struct test test_ring_mt_peek_stress_zc;
 extern const struct test test_ring_st_peek_stress;
+extern const struct test test_ring_st_peek_stress_zc;
-- 
2.17.1



* [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs
  2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
                     ` (3 preceding siblings ...)
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 4/8] test/ring: add stress " Honnappa Nagarahalli
@ 2020-10-25  5:45   ` Honnappa Nagarahalli
  2020-10-27 14:11     ` Ananyev, Konstantin
  2020-10-29 10:52     ` David Marchand
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 6/8] test/ring: fix the memory dump size Honnappa Nagarahalli
                     ` (3 subsequent siblings)
  8 siblings, 2 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-25  5:45 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Add zero copy peek API documentation.
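
As a rough illustration of the three phases the documentation describes
(start, copy, finish), here is a minimal single-threaded sketch. It is
not the rte_ring implementation: `toy_ring`, `toy_zc_data` and the
`toy_*` functions are hypothetical names made up for this example, and
only the two-segment idea (`ptr1`/`n1`, then `ptr2` after wrap-around)
is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_RING_SZ 8u	/* power of two, so masking wraps the index */

/* Toy single-threaded ring of 32-bit elements; not the rte_ring layout. */
struct toy_ring {
	uint32_t slots[TOY_RING_SZ];
	uint32_t prod;	/* producer index (free-running counter) */
	uint32_t cons;	/* consumer index (free-running counter) */
};

/* Two contiguous segments covering a reservation that may wrap. */
struct toy_zc_data {
	uint32_t *ptr1;
	uint32_t *ptr2;
	unsigned int n1;
};

/* Phase 1: reserve up to n free slots and report where they live. */
static unsigned int
toy_enqueue_zc_start(struct toy_ring *r, unsigned int n,
	struct toy_zc_data *zcd)
{
	unsigned int free_cnt = TOY_RING_SZ - (r->prod - r->cons);
	unsigned int idx = r->prod & (TOY_RING_SZ - 1);

	assert(zcd != NULL);
	if (n > free_cnt)
		n = free_cnt;
	zcd->ptr1 = &r->slots[idx];
	zcd->n1 = (idx + n > TOY_RING_SZ) ? TOY_RING_SZ - idx : n;
	zcd->ptr2 = r->slots;	/* used only when n1 < n */
	return n;
}

/* Phase 2: copy n elements straight into the reserved ring memory. */
static void
toy_copy_to(const struct toy_zc_data *zcd, const uint32_t *src,
	unsigned int n)
{
	memcpy(zcd->ptr1, src, zcd->n1 * sizeof(*src));
	if (zcd->n1 != n)
		memcpy(zcd->ptr2, src + zcd->n1,
			(n - zcd->n1) * sizeof(*src));
}

/* Phase 3: publish the n elements by advancing the producer index. */
static void
toy_enqueue_zc_finish(struct toy_ring *r, unsigned int n)
{
	r->prod += n;
}
```

A caller would run the three functions in order. When the reservation
crosses the end of the array, `n1` is smaller than the reserved count
and the remaining elements land at `ptr2`.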

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 doc/guides/prog_guide/ring_lib.rst     | 41 ++++++++++++++++++++++++++
 doc/guides/rel_notes/release_20_11.rst |  9 ++++++
 2 files changed, 50 insertions(+)

diff --git a/doc/guides/prog_guide/ring_lib.rst b/doc/guides/prog_guide/ring_lib.rst
index 895484d95..247646d38 100644
--- a/doc/guides/prog_guide/ring_lib.rst
+++ b/doc/guides/prog_guide/ring_lib.rst
@@ -452,6 +452,47 @@ selected. As an example of usage:
 Note that between ``_start_`` and ``_finish_`` none other thread can proceed
 with enqueue(/dequeue) operation till ``_finish_`` completes.
 
+Ring Peek Zero Copy API
+-----------------------
+
+Along with the advantages of the peek APIs, zero copy APIs provide the ability
+to copy the data to the ring memory directly without the need for temporary
+storage (for ex: array of mbufs on the stack).
+
+These APIs make it possible to split public enqueue/dequeue API into 3 phases:
+
+* enqueue/dequeue start
+
+* copy data to/from the ring
+
+* enqueue/dequeue finish
+
+Note that this API is available only for two sync modes:
+
+*   Single Producer/Single Consumer (SP/SC)
+
+*   Multi-producer/Multi-consumer with Head/Tail Sync (HTS)
+
+It is the user's responsibility to create/init the ring with an appropriate sync mode.
+The following is an example of usage:
+
+.. code-block:: c
+
+    /* Reserve space on the ring */
+    n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
+    /* Pkt I/O core polls packets from the NIC */
+    if (n != 0) {
+        nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, zcd->n1);
+        if (nb_rx == zcd->n1 && n != zcd->n1)
+            nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr2,
+							n - zcd->n1);
+        /* Provide packets to the packet processing cores */
+        rte_ring_enqueue_zc_finish(r, nb_rx);
+    }
+
+Note that between ``_start_`` and ``_finish_`` no other thread can proceed
+with enqueue(/dequeue) operation till ``_finish_`` completes.
+
 References
 ----------
 
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index d8ac359e5..fdc78b3da 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,15 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added zero copy APIs for rte_ring.**
+
+  For rings with producer/consumer in ``RTE_RING_SYNC_ST``, ``RTE_RING_SYNC_MT_HTS``
+  modes, these APIs split enqueue/dequeue operation into three phases
+  (enqueue/dequeue start, copy data to/from ring, enqueue/dequeue finish).
+  Along with the advantages of the peek APIs, these provide the ability to
+  copy the data to the ring memory directly without the need for temporary
+  storage.
+
 * **Added write combining store APIs.**
 
   Added ``rte_write32_wc`` and ``rte_write32_wc_relaxed`` APIs
-- 
2.17.1



* [dpdk-dev] [PATCH v5 6/8] test/ring: fix the memory dump size
  2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
                     ` (4 preceding siblings ...)
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
@ 2020-10-25  5:45   ` Honnappa Nagarahalli
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 7/8] test/ring: remove unnecessary braces Honnappa Nagarahalli
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-25  5:45 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd, stable

Pass the correct number of bytes to dump the memory.
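
For illustration only, a minimal standalone sketch of the difference.
`struct elem` and the two helper functions are hypothetical stand-ins,
not the test's own `struct ring_elem` or code from this patch. For an
array of pointers, `sizeof(elm[i])` is the size of a pointer, while
`sizeof(*elm[i])` is the size of the object being dumped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type; the real test defines its own struct. */
struct elem {
	uint32_t cnt[16];
};

/* Length the old call passed: the size of one array slot, a pointer. */
static size_t
dump_len_pointer(struct elem *elm[], unsigned int i)
{
	assert(elm != NULL);
	return sizeof(elm[i]);
}

/* Length the fixed call passes: the size of the pointed-to object. */
static size_t
dump_len_object(struct elem *elm[], unsigned int i)
{
	assert(elm != NULL);
	return sizeof(*elm[i]);
}
```

With this 64-byte element, the first helper returns the pointer size
(typically 8 on 64-bit targets), so a dump sized that way would show
only the start of the object.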

Fixes: bf28df24e915 ("test/ring: add contention stress test")
Cc: konstantin.ananyev@intel.com
Cc: stable@dpdk.org

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/test/test_ring_stress_impl.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test/test_ring_stress_impl.h b/app/test/test_ring_stress_impl.h
index 3b9a480eb..f9ca63b90 100644
--- a/app/test/test_ring_stress_impl.h
+++ b/app/test/test_ring_stress_impl.h
@@ -159,7 +159,7 @@ check_updt_elem(struct ring_elem *elm[], uint32_t num,
 				"offending object: %p\n",
 				__func__, rte_lcore_id(), num, i, elm[i]);
 			rte_memdump(stdout, "expected", check, sizeof(*check));
-			rte_memdump(stdout, "result", elm[i], sizeof(elm[i]));
+			rte_memdump(stdout, "result", elm[i], sizeof(*elm[i]));
 			rte_spinlock_unlock(&dump_lock);
 			return -EINVAL;
 		}
-- 
2.17.1



* [dpdk-dev] [PATCH v5 7/8] test/ring: remove unnecessary braces
  2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
                     ` (5 preceding siblings ...)
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 6/8] test/ring: fix the memory dump size Honnappa Nagarahalli
@ 2020-10-25  5:45   ` Honnappa Nagarahalli
  2020-10-27 14:13     ` Ananyev, Konstantin
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 8/8] test/ring: use uintptr_t instead of unsigned long Honnappa Nagarahalli
  2020-10-29 13:57   ` [dpdk-dev] [PATCH v5 0/8] lib/ring: add zero copy APIs David Marchand
  8 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-25  5:45 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Remove unnecessary braces to improve readability.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 app/test/test_ring.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/app/test/test_ring.h b/app/test/test_ring.h
index b525abb79..c8bfec839 100644
--- a/app/test/test_ring.h
+++ b/app/test/test_ring.h
@@ -35,11 +35,11 @@ test_ring_create(const char *name, int esize, unsigned int count,
 		int socket_id, unsigned int flags)
 {
 	/* Legacy queue APIs? */
-	if ((esize) == -1)
-		return rte_ring_create((name), (count), (socket_id), (flags));
+	if (esize == -1)
+		return rte_ring_create(name, count, socket_id, flags);
 	else
-		return rte_ring_create_elem((name), (esize), (count),
-						(socket_id), (flags));
+		return rte_ring_create_elem(name, esize, count,
+						socket_id, flags);
 }
 
 static inline void*
@@ -102,7 +102,7 @@ test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			unsigned int api_type)
 {
 	/* Legacy queue APIs? */
-	if ((esize) == -1)
+	if (esize == -1)
 		switch (api_type) {
 		case (TEST_RING_THREAD_DEF | TEST_RING_ELEM_SINGLE):
 			return rte_ring_enqueue(r, *obj);
@@ -163,7 +163,7 @@ test_ring_dequeue(struct rte_ring *r, void **obj, int esize, unsigned int n,
 			unsigned int api_type)
 {
 	/* Legacy queue APIs? */
-	if ((esize) == -1)
+	if (esize == -1)
 		switch (api_type) {
 		case (TEST_RING_THREAD_DEF | TEST_RING_ELEM_SINGLE):
 			return rte_ring_dequeue(r, obj);
-- 
2.17.1



* [dpdk-dev] [PATCH v5 8/8] test/ring: use uintptr_t instead of unsigned long
  2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
                     ` (6 preceding siblings ...)
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 7/8] test/ring: remove unnecessary braces Honnappa Nagarahalli
@ 2020-10-25  5:45   ` Honnappa Nagarahalli
  2020-10-27 14:14     ` Ananyev, Konstantin
  2020-10-29 13:57   ` [dpdk-dev] [PATCH v5 0/8] lib/ring: add zero copy APIs David Marchand
  8 siblings, 1 reply; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-25  5:45 UTC (permalink / raw)
  To: dev, honnappa.nagarahalli, konstantin.ananyev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd

Use uintptr_t instead of unsigned long while initializing the
array of pointers.
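
A minimal sketch of the pattern, for illustration. The helper names
`fill_ptr_table` and `slot_value` are hypothetical and not part of the
test. `uintptr_t` is an integer type able to hold a converted object
pointer, whereas `unsigned long` is narrower than a pointer on LLP64
platforms such as 64-bit Windows. The integer-to-pointer conversion
itself remains implementation-defined; the sketch assumes a mainstream
platform where it round-trips.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the values 0..count-1 in an array of pointers. */
static void
fill_ptr_table(void **tbl, unsigned int count)
{
	unsigned int i;

	assert(tbl != NULL);
	for (i = 0; i < count; i++)
		tbl[i] = (void *)(uintptr_t)i;
}

/* Recover the integer stored in slot i. */
static uintptr_t
slot_value(void * const *tbl, unsigned int i)
{
	return (uintptr_t)tbl[i];
}
```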

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 app/test/test_ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index 3914cb98a..5b7fdfa45 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -447,7 +447,7 @@ test_ring_mem_init(void *obj, unsigned int count, int esize)
 	/* Legacy queue APIs? */
 	if (esize == -1)
 		for (i = 0; i < count; i++)
-			((void **)obj)[i] = (void *)(unsigned long)i;
+			((void **)obj)[i] = (void *)(uintptr_t)i;
 	else
 		for (i = 0; i < (count * esize / sizeof(uint32_t)); i++)
 			((uint32_t *)obj)[i] = i;
-- 
2.17.1



* Re: [dpdk-dev] [PATCH v4 0/8] lib/ring: add zero copy APIs
  2020-10-24 16:18   ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add zero copy APIs Honnappa Nagarahalli
@ 2020-10-25  7:16     ` David Marchand
  2020-10-25  8:14       ` Thomas Monjalon
  0 siblings, 1 reply; 69+ messages in thread
From: David Marchand @ 2020-10-25  7:16 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Thomas Monjalon
  Cc: dev, konstantin.ananyev, stephen, Dharmik Thakkar, Ruifeng Wang,
	olivier.matz, nd

On Sat, Oct 24, 2020 at 6:18 PM Honnappa Nagarahalli
<Honnappa.Nagarahalli@arm.com> wrote:
>
> Hi David,
>         Checkpatch CI is showing "WARNING" on a lot of the patches in this series, but it does not list any real warnings.  Any idea what is happening?

I reported it to Thomas last night.
Thomas has fixed a few things in the script but I don't know if he's
finished with it.


-- 
David Marchand



* Re: [dpdk-dev] [PATCH v4 0/8] lib/ring: add zero copy APIs
  2020-10-25  7:16     ` David Marchand
@ 2020-10-25  8:14       ` Thomas Monjalon
  0 siblings, 0 replies; 69+ messages in thread
From: Thomas Monjalon @ 2020-10-25  8:14 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: David Marchand, dev, konstantin.ananyev, stephen,
	Dharmik Thakkar, Ruifeng Wang, olivier.matz, nd

25/10/2020 08:16, David Marchand:
> On Sat, Oct 24, 2020 at 6:18 PM Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com> wrote:
> >
> > Hi David,
> >         Checkpatch CI is showing "WARNING" on a lot of the patches in this series, but it does not list any real warnings.  Any idea what is happening?
> 
> I reported it to Thomas last night.
> Thomas has fixed a few things in the script but I don't know if he's
> finished with it.

Yes, this is an issue in the Linux checkpatch, which I updated on the
server just before you sent this series.
I have since fixed it:

--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -977,7 +977,7 @@ sub seed_camelcase_includes {
 sub git_is_single_file {
        my ($filename) = @_;
 
-       return 0 if ((which("git") eq "") || !(-e "$gitroot"));
+       return 0 if ((which("git") eq "") || !(-e "$root/.git"));





* Re: [dpdk-dev] [PATCH v5 2/8] test/ring: move common function to header file
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 2/8] test/ring: move common function to header file Honnappa Nagarahalli
@ 2020-10-27 13:51     ` Ananyev, Konstantin
  0 siblings, 0 replies; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-27 13:51 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd



> -----Original Message-----
> From: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Sent: Sunday, October 25, 2020 5:46 AM
> To: dev@dpdk.org; honnappa.nagarahalli@arm.com; Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> stephen@networkplumber.org
> Cc: dharmik.thakkar@arm.com; ruifeng.wang@arm.com; olivier.matz@6wind.com; david.marchand@redhat.com; nd@arm.com
> Subject: [PATCH v5 2/8] test/ring: move common function to header file
> 
> Move test_ring_inc_ptr to header file so that it can be used by
> functions in other files.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> ---
>  app/test/test_ring.c | 11 -----------
>  app/test/test_ring.h | 13 +++++++++++++
>  2 files changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/app/test/test_ring.c b/app/test/test_ring.c
> index a62cb263b..329d538a9 100644
> --- a/app/test/test_ring.c
> +++ b/app/test/test_ring.c
> @@ -243,17 +243,6 @@ test_ring_deq_impl(struct rte_ring *r, void **obj, int esize, unsigned int n,
>  			NULL);
>  }
> 
> -static void**
> -test_ring_inc_ptr(void **obj, int esize, unsigned int n)
> -{
> -	/* Legacy queue APIs? */
> -	if ((esize) == -1)
> -		return ((void **)obj) + n;
> -	else
> -		return (void **)(((uint32_t *)obj) +
> -					(n * esize / sizeof(uint32_t)));
> -}
> -
>  static void
>  test_ring_mem_init(void *obj, unsigned int count, int esize)
>  {
> diff --git a/app/test/test_ring.h b/app/test/test_ring.h
> index d4b15af7c..b44711398 100644
> --- a/app/test/test_ring.h
> +++ b/app/test/test_ring.h
> @@ -42,6 +42,19 @@ test_ring_create(const char *name, int esize, unsigned int count,
>  						(socket_id), (flags));
>  }
> 
> +static inline void*
> +test_ring_inc_ptr(void *obj, int esize, unsigned int n)
> +{
> +	size_t sz;
> +
> +	sz = sizeof(void *);
> +	/* Legacy queue APIs? */
> +	if (esize != -1)
> +		sz = esize;
> +
> +	return (void *)((uint32_t *)obj + (n * sz / sizeof(uint32_t)));
> +}
> +
>  static __rte_always_inline unsigned int
>  test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
>  			unsigned int api_type)
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 2.17.1



* Re: [dpdk-dev] [PATCH v5 1/8] lib/ring: add zero copy APIs
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 1/8] " Honnappa Nagarahalli
@ 2020-10-27 14:11     ` Ananyev, Konstantin
  0 siblings, 0 replies; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-27 14:11 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd



> 
> Add zero-copy APIs. These APIs provide the capability to
> copy the data to/from the ring memory directly, without
> having a temporary copy (for ex: an array of mbufs on
> the stack). Use cases that involve copying large amount
> of data to/from the ring can benefit from these APIs.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> ---

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 2.17.1



* Re: [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
@ 2020-10-27 14:11     ` Ananyev, Konstantin
  2020-10-29 10:52     ` David Marchand
  1 sibling, 0 replies; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-27 14:11 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd


> 
> Add zero copy peek API documentation.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
>  doc/guides/prog_guide/ring_lib.rst     | 41 ++++++++++++++++++++++++++
>  doc/guides/rel_notes/release_20_11.rst |  9 ++++++
>  2 files changed, 50 insertions(+)
> 
> diff --git a/doc/guides/prog_guide/ring_lib.rst b/doc/guides/prog_guide/ring_lib.rst
> index 895484d95..247646d38 100644
> --- a/doc/guides/prog_guide/ring_lib.rst
> +++ b/doc/guides/prog_guide/ring_lib.rst
> @@ -452,6 +452,47 @@ selected. As an example of usage:
>  Note that between ``_start_`` and ``_finish_`` none other thread can proceed
>  with enqueue(/dequeue) operation till ``_finish_`` completes.
> 
> +Ring Peek Zero Copy API
> +-----------------------
> +
> +Along with the advantages of the peek APIs, zero copy APIs provide the ability
> +to copy the data to the ring memory directly without the need for temporary
> +storage (for ex: array of mbufs on the stack).
> +
> +These APIs make it possible to split public enqueue/dequeue API into 3 phases:
> +
> +* enqueue/dequeue start
> +
> +* copy data to/from the ring
> +
> +* enqueue/dequeue finish
> +
> +Note that this API is available only for two sync modes:
> +
> +*   Single Producer/Single Consumer (SP/SC)
> +
> +*   Multi-producer/Multi-consumer with Head/Tail Sync (HTS)
> +
> +It is a user responsibility to create/init ring with appropriate sync modes.
> +Following is an example of usage:
> +
> +.. code-block:: c
> +
> +    /* Reserve space on the ring */
> +    n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
> +    /* Pkt I/O core polls packets from the NIC */
> +    if (n != 0) {
> +        nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, zcd->n1);
> +        if (nb_rx == zcd->n1 && n != zcd->n1)
> +            nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr2,
> +							n - zcd->n1);
> +        /* Provide packets to the packet processing cores */
> +        rte_ring_enqueue_zc_finish(r, nb_rx);
> +    }
> +
> +Note that between ``_start_`` and ``_finish_`` no other thread can proceed
> +with enqueue(/dequeue) operation till ``_finish_`` completes.
> +
>  References
>  ----------
> 
> diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
> index d8ac359e5..fdc78b3da 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -55,6 +55,15 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
> 
> +* **Added zero copy APIs for rte_ring.**
> +
> +  For rings with producer/consumer in ``RTE_RING_SYNC_ST``, ``RTE_RING_SYNC_MT_HTS``
> +  modes, these APIs split enqueue/dequeue operation into three phases
> +  (enqueue/dequeue start, copy data to/from ring, enqueue/dequeue finish).
> +  Along with the advantages of the peek APIs, these provide the ability to
> +  copy the data to the ring memory directly without the need for temporary
> +  storage.
> +
>  * **Added write combining store APIs.**
> 
>    Added ``rte_write32_wc`` and ``rte_write32_wc_relaxed`` APIs
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 2.17.1



* Re: [dpdk-dev] [PATCH v5 7/8] test/ring: remove unnecessary braces
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 7/8] test/ring: remove unnecessary braces Honnappa Nagarahalli
@ 2020-10-27 14:13     ` Ananyev, Konstantin
  0 siblings, 0 replies; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-27 14:13 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd


> Remove unnecessary braces to improve readability.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
>  app/test/test_ring.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/app/test/test_ring.h b/app/test/test_ring.h
> index b525abb79..c8bfec839 100644
> --- a/app/test/test_ring.h
> +++ b/app/test/test_ring.h
> @@ -35,11 +35,11 @@ test_ring_create(const char *name, int esize, unsigned int count,
>  		int socket_id, unsigned int flags)
>  {
>  	/* Legacy queue APIs? */
> -	if ((esize) == -1)
> -		return rte_ring_create((name), (count), (socket_id), (flags));
> +	if (esize == -1)
> +		return rte_ring_create(name, count, socket_id, flags);
>  	else
> -		return rte_ring_create_elem((name), (esize), (count),
> -						(socket_id), (flags));
> +		return rte_ring_create_elem(name, esize, count,
> +						socket_id, flags);
>  }
> 
>  static inline void*
> @@ -102,7 +102,7 @@ test_ring_enqueue(struct rte_ring *r, void **obj, int esize, unsigned int n,
>  			unsigned int api_type)
>  {
>  	/* Legacy queue APIs? */
> -	if ((esize) == -1)
> +	if (esize == -1)
>  		switch (api_type) {
>  		case (TEST_RING_THREAD_DEF | TEST_RING_ELEM_SINGLE):
>  			return rte_ring_enqueue(r, *obj);
> @@ -163,7 +163,7 @@ test_ring_dequeue(struct rte_ring *r, void **obj, int esize, unsigned int n,
>  			unsigned int api_type)
>  {
>  	/* Legacy queue APIs? */
> -	if ((esize) == -1)
> +	if (esize == -1)
>  		switch (api_type) {
>  		case (TEST_RING_THREAD_DEF | TEST_RING_ELEM_SINGLE):
>  			return rte_ring_dequeue(r, obj);
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 2.17.1



* Re: [dpdk-dev] [PATCH v5 8/8] test/ring: use uintptr_t instead of unsigned long
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 8/8] test/ring: use uintptr_t instead of unsigned long Honnappa Nagarahalli
@ 2020-10-27 14:14     ` Ananyev, Konstantin
  0 siblings, 0 replies; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-27 14:14 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev, stephen
  Cc: dharmik.thakkar, ruifeng.wang, olivier.matz, david.marchand, nd


> Use uintptr_t instead of unsigned long while initializing the
> array of pointers.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
>  app/test/test_ring.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/app/test/test_ring.c b/app/test/test_ring.c
> index 3914cb98a..5b7fdfa45 100644
> --- a/app/test/test_ring.c
> +++ b/app/test/test_ring.c
> @@ -447,7 +447,7 @@ test_ring_mem_init(void *obj, unsigned int count, int esize)
>  	/* Legacy queue APIs? */
>  	if (esize == -1)
>  		for (i = 0; i < count; i++)
> -			((void **)obj)[i] = (void *)(unsigned long)i;
> +			((void **)obj)[i] = (void *)(uintptr_t)i;
>  	else
>  		for (i = 0; i < (count * esize / sizeof(uint32_t)); i++)
>  			((uint32_t *)obj)[i] = i;
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 2.17.1
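
The commit message does not spell out the motivation, so the following is only a sketch of the usual reasoning. `uintptr_t` (from `<stdint.h>`) is the integer type intended for conversion to and from `void *`. `unsigned long` can be narrower than a pointer on some ABIs, for example 64-bit Windows. The helper names below are hypothetical and are not part of the test suite.

```c
#include <stdint.h>

/* Hypothetical helper: store each index in a pointer-sized slot,
 * mirroring the legacy branch of test_ring_mem_init() in the patch. */
static void
init_ptr_array(void **obj, unsigned int count)
{
	unsigned int i;

	for (i = 0; i < count; i++)
		obj[i] = (void *)(uintptr_t)i;
}

/* Hypothetical helper: read the index back out of a slot. */
static unsigned int
slot_value(void *const *obj, unsigned int i)
{
	return (unsigned int)(uintptr_t)obj[i];
}
```

Going through `uintptr_t` in both directions keeps the integer and pointer widths matched, which is what lets the stored index round-trip.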



* Re: [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
  2020-10-27 14:11     ` Ananyev, Konstantin
@ 2020-10-29 10:52     ` David Marchand
  2020-10-29 11:28       ` Ananyev, Konstantin
  1 sibling, 1 reply; 69+ messages in thread
From: David Marchand @ 2020-10-29 10:52 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: dev, Ananyev, Konstantin, Stephen Hemminger, Dharmik Thakkar,
	Ruifeng Wang (Arm Technology China),
	Olivier Matz, nd

On Sun, Oct 25, 2020 at 6:46 AM Honnappa Nagarahalli
<honnappa.nagarahalli@arm.com> wrote:
> +.. code-block:: c
> +
> +    /* Reserve space on the ring */
> +    n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
> +    /* Pkt I/O core polls packets from the NIC */
> +    if (n != 0) {
> +        nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, zcd->n1);
> +        if (nb_rx == zcd->n1 && n != zcd->n1)
> +            nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr2,
> +                                                       n - zcd->n1);

Should it be nb_rx += ?

> +        /* Provide packets to the packet processing cores */
> +        rte_ring_enqueue_zc_finish(r, nb_rx);
> +    }
> +
> +Note that between ``_start_`` and ``_finish_`` no other thread can proceed
> +with enqueue(/dequeue) operation till ``_finish_`` completes.


-- 
David Marchand



* Re: [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs
  2020-10-29 10:52     ` David Marchand
@ 2020-10-29 11:28       ` Ananyev, Konstantin
  2020-10-29 12:35         ` David Marchand
  0 siblings, 1 reply; 69+ messages in thread
From: Ananyev, Konstantin @ 2020-10-29 11:28 UTC (permalink / raw)
  To: David Marchand, Honnappa Nagarahalli
  Cc: dev, Stephen Hemminger, Dharmik Thakkar,
	Ruifeng Wang (Arm Technology China),
	Olivier Matz, nd



> 
> On Sun, Oct 25, 2020 at 6:46 AM Honnappa Nagarahalli
> <honnappa.nagarahalli@arm.com> wrote:
> > +.. code-block:: c
> > +
> > +    /* Reserve space on the ring */
> > +    n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
> > +    /* Pkt I/O core polls packets from the NIC */
> > +    if (n != 0) {
> > +        nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, zcd->n1);
> > +        if (nb_rx == zcd->n1 && n != zcd->n1)
> > +            nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr2,
> > +                                                       n - zcd->n1);
> 
> Should it be nb_rx += ?

Yes, it should.
Good catch 😊
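
To illustrate why the accumulation matters, here is a minimal, DPDK-free sketch. `struct zc_segments` and `mock_rx_burst()` are hypothetical stand-ins for the zero-copy descriptor and `rte_eth_rx_burst()`; they are not the library's definitions. The reserved space may be split into two segments when it wraps around the end of the ring, and the count handed to the finish call has to cover both.

```c
#include <stdint.h>

/* Hypothetical stand-in for the zero-copy descriptor: reserved ring
 * space as two contiguous segments (the second is used on wrap-around). */
struct zc_segments {
	void **ptr1;      /* first contiguous segment */
	unsigned int n1;  /* number of slots in the first segment */
	void **ptr2;      /* second segment, after the wrap */
};

/* Hypothetical stand-in for rte_eth_rx_burst(): writes up to n fake
 * packet pointers into slots, limited by *avail; returns the count. */
static unsigned int
mock_rx_burst(void **slots, unsigned int n, unsigned int *avail)
{
	unsigned int i, got = n < *avail ? n : *avail;

	for (i = 0; i < got; i++)
		slots[i] = (void *)(uintptr_t)(i + 1);
	*avail -= got;
	return got;
}

/* Fill n reserved slots; the return value is the total that would be
 * passed to the finish call. */
static unsigned int
fill_reserved(const struct zc_segments *zcd, unsigned int n,
	      unsigned int *avail)
{
	unsigned int nb_rx = mock_rx_burst(zcd->ptr1, zcd->n1, avail);

	/* Poll into the second segment only if the first one was filled. */
	if (nb_rx == zcd->n1 && n != zcd->n1)
		nb_rx += mock_rx_burst(zcd->ptr2, n - zcd->n1, avail);
	return nb_rx;
}
```

With a plain `=` on the second burst, the packets written to the first segment would be left out of the total, so the finish call would commit fewer slots than were actually filled.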

> 
> > +        /* Provide packets to the packet processing cores */
> > +        rte_ring_enqueue_zc_finish(r, nb_rx);
> > +    }
> > +
> > +Note that between ``_start_`` and ``_finish_`` no other thread can proceed
> > +with enqueue(/dequeue) operation till ``_finish_`` completes.
> 
> 
> --
> David Marchand



* Re: [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs
  2020-10-29 11:28       ` Ananyev, Konstantin
@ 2020-10-29 12:35         ` David Marchand
  2020-10-29 17:29           ` Honnappa Nagarahalli
  0 siblings, 1 reply; 69+ messages in thread
From: David Marchand @ 2020-10-29 12:35 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: Ananyev, Konstantin, dev, Stephen Hemminger, Dharmik Thakkar,
	Ruifeng Wang (Arm Technology China),
	Olivier Matz, nd

On Thu, Oct 29, 2020 at 12:29 PM Ananyev, Konstantin
<konstantin.ananyev@intel.com> wrote:
> > On Sun, Oct 25, 2020 at 6:46 AM Honnappa Nagarahalli
> > <honnappa.nagarahalli@arm.com> wrote:
> > > +.. code-block:: c
> > > +
> > > +    /* Reserve space on the ring */
> > > +    n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
> > > +    /* Pkt I/O core polls packets from the NIC */
> > > +    if (n != 0) {
> > > +        nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, zcd->n1);
> > > +        if (nb_rx == zcd->n1 && n != zcd->n1)
> > > +            nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr2,
> > > +                                                       n - zcd->n1);
> >
> > Should it be nb_rx += ?
>
> Yes, it should.
> Good catch

No need for a respin, I can fix.


-- 
David Marchand



* Re: [dpdk-dev] [PATCH v5 0/8] lib/ring: add zero copy APIs
  2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
                     ` (7 preceding siblings ...)
  2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 8/8] test/ring: user uintptr_t instead of unsigned long Honnappa Nagarahalli
@ 2020-10-29 13:57   ` David Marchand
  2020-10-29 22:11     ` Honnappa Nagarahalli
  8 siblings, 1 reply; 69+ messages in thread
From: David Marchand @ 2020-10-29 13:57 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: dev, Ananyev, Konstantin, Stephen Hemminger, Dharmik Thakkar,
	Ruifeng Wang (Arm Technology China),
	Olivier Matz, nd

On Sun, Oct 25, 2020 at 6:46 AM Honnappa Nagarahalli
<honnappa.nagarahalli@arm.com> wrote:
>
> It is pretty common for DPDK applications to be deployed in
> a semi-pipeline model. In these models, a small number of cores
> (typically 1) are designated as I/O cores. The I/O cores work
> on receiving and transmitting packets to/from the NIC and several
> packet processing cores. The I/O core and the packet processing
> cores exchange the packets over a ring. Typically, such applications
> receive the mbufs in a temporary array and copy the mbufs on
> to the ring. Depending on the requirements, the packets
> could be copied in batches of 32, 64, etc., resulting in memory
> copies of 256B, 512B, etc.
>
> The zero copy APIs help avoid intermediate copies by exposing
> the space on the ring directly to the application.
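
The 256B and 512B figures quoted above are consistent with each ring element being an 8-byte mbuf pointer. That element size is an assumption about a 64-bit target and is not stated in the cover letter. A tiny sketch of the arithmetic, with a hypothetical helper name:

```c
#include <stddef.h>

/* Hypothetical helper: bytes copied from the temporary array to the
 * ring for one batch, given the per-element size. */
static size_t
batch_copy_bytes(unsigned int batch, size_t elem_size)
{
	return (size_t)batch * elem_size;
}
```

For example, `batch_copy_bytes(32, sizeof(void *))` gives 256 where pointers are 8 bytes wide; this is the intermediate copy the zero copy APIs aim to avoid.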

Reordered the patches to have the fixes and coding style changes first
in the series.
Fixed incorrect Fixes: line format.
Squashed documentation with introduction of the API.
Moved release note update (ring comes after EAL).
Fixed example of API usage.

Series applied, thanks Honnappa.

-- 
David Marchand



* Re: [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs
  2020-10-29 12:35         ` David Marchand
@ 2020-10-29 17:29           ` Honnappa Nagarahalli
  0 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-29 17:29 UTC (permalink / raw)
  To: David Marchand
  Cc: Ananyev, Konstantin, dev, Stephen Hemminger, Dharmik Thakkar,
	Ruifeng Wang, Olivier Matz, nd, Honnappa Nagarahalli, nd

<snip>

> 
> On Thu, Oct 29, 2020 at 12:29 PM Ananyev, Konstantin
> <konstantin.ananyev@intel.com> wrote:
> > > On Sun, Oct 25, 2020 at 6:46 AM Honnappa Nagarahalli
> > > <honnappa.nagarahalli@arm.com> wrote:
> > > > +.. code-block:: c
> > > > +
> > > > +    /* Reserve space on the ring */
> > > > +    n = rte_ring_enqueue_zc_burst_start(r, 32, &zcd, NULL);
> > > > +    /* Pkt I/O core polls packets from the NIC */
> > > > +    if (n != 0) {
> > > > +        nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr1, zcd->n1);
> > > > +        if (nb_rx == zcd->n1 && n != zcd->n1)
> > > > +            nb_rx = rte_eth_rx_burst(portid, queueid, zcd->ptr2,
> > > > +                                                       n -
> > > > + zcd->n1);
> > >
> > > Should it be nb_rx += ?
> >
> > Yes, it should.
> > Good catch
> 
> No need for a respin, I can fix.
Thank you

> 
> 
> --
> David Marchand



* Re: [dpdk-dev] [PATCH v5 0/8] lib/ring: add zero copy APIs
  2020-10-29 13:57   ` [dpdk-dev] [PATCH v5 0/8] lib/ring: add zero copy APIs David Marchand
@ 2020-10-29 22:11     ` Honnappa Nagarahalli
  0 siblings, 0 replies; 69+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-29 22:11 UTC (permalink / raw)
  To: David Marchand
  Cc: dev, Ananyev, Konstantin, Stephen Hemminger, Dharmik Thakkar,
	Ruifeng Wang, Olivier Matz, nd, nd

<snip>

> On Sun, Oct 25, 2020 at 6:46 AM Honnappa Nagarahalli
> <honnappa.nagarahalli@arm.com> wrote:
> >
> > > It is pretty common for DPDK applications to be deployed in
> > > a semi-pipeline model. In these models, a small number of cores
> > > (typically 1) are designated as I/O cores. The I/O cores work on
> > > receiving and transmitting packets to/from the NIC and several packet
> > > processing cores. The I/O core and the packet processing cores exchange
> > > the packets over a ring. Typically, such applications receive the
> > > mbufs in a temporary array and copy the mbufs on to the ring.
> > > Depending on the requirements, the packets could be copied in batches
> > > of 32, 64, etc., resulting in memory copies of 256B, 512B, etc.
> >
> > The zero copy APIs help avoid intermediate copies by exposing the
> > space on the ring directly to the application.
> 
> Reordered the patches to have the fixes and coding style changes first in the
> series.
> Fixed incorrect Fixes: line format.
> Squashed documentation with introduction of the API.
> Moved release note update (ring comes after EAL).
> Fixed example of API usage.
> 
> Series applied, thanks Honnappa.
Thanks David. Sorry, you had to fix things up.

> 
> --
> David Marchand



Thread overview: 69+ messages
2020-02-24 20:39 [dpdk-dev] [RFC 0/1] lib/ring: add scatter gather and serial dequeue APIs Honnappa Nagarahalli
2020-02-24 20:39 ` [dpdk-dev] [RFC 1/1] " Honnappa Nagarahalli
2020-02-26 20:38   ` Ananyev, Konstantin
2020-02-26 23:21     ` Ananyev, Konstantin
2020-02-28  0:18     ` Honnappa Nagarahalli
2020-03-02 18:20       ` Ananyev, Konstantin
2020-03-04 23:21         ` Honnappa Nagarahalli
2020-03-05 18:26           ` Ananyev, Konstantin
2020-03-25 20:43             ` Honnappa Nagarahalli
2020-10-06 13:29 ` [dpdk-dev] [RFC v2 0/1] lib/ring: add scatter gather APIs Honnappa Nagarahalli
2020-10-06 13:29   ` [dpdk-dev] [RFC v2 1/1] " Honnappa Nagarahalli
2020-10-07  8:27     ` Olivier Matz
2020-10-08 20:44       ` Honnappa Nagarahalli
2020-10-08 20:47         ` Honnappa Nagarahalli
2020-10-09  7:33         ` Olivier Matz
2020-10-09  8:05           ` Ananyev, Konstantin
2020-10-09 22:54             ` Honnappa Nagarahalli
2020-10-12 17:06               ` Ananyev, Konstantin
2020-10-12 16:20     ` Ananyev, Konstantin
2020-10-12 22:31       ` Honnappa Nagarahalli
2020-10-13 11:38         ` Ananyev, Konstantin
2020-10-23  4:43 ` [dpdk-dev] [PATCH v3 0/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 1/5] test/ring: fix the memory dump size Honnappa Nagarahalli
2020-10-23 13:24     ` Ananyev, Konstantin
2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 2/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
2020-10-23 13:59     ` Ananyev, Konstantin
2020-10-24 15:45       ` Honnappa Nagarahalli
2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 3/5] test/ring: move common function to header file Honnappa Nagarahalli
2020-10-23 14:22     ` Ananyev, Konstantin
2020-10-23 23:54       ` Honnappa Nagarahalli
2020-10-24  0:29         ` Stephen Hemminger
2020-10-24  0:31           ` Honnappa Nagarahalli
2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 4/5] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
2020-10-23 14:20     ` Ananyev, Konstantin
2020-10-23 22:47       ` Honnappa Nagarahalli
2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 5/5] test/ring: add stress " Honnappa Nagarahalli
2020-10-23 14:11     ` Ananyev, Konstantin
2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 1/8] " Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 2/8] test/ring: move common function to header file Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 3/8] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 4/8] test/ring: add stress " Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 6/8] test/ring: fix the memory dump size Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 7/8] test/ring: remove unnecessary braces Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 8/8] test/ring: user uintptr_t instead of unsigned long Honnappa Nagarahalli
2020-10-24 16:18   ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add zero copy APIs Honnappa Nagarahalli
2020-10-25  7:16     ` David Marchand
2020-10-25  8:14       ` Thomas Monjalon
2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 1/8] " Honnappa Nagarahalli
2020-10-27 14:11     ` Ananyev, Konstantin
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 2/8] test/ring: move common function to header file Honnappa Nagarahalli
2020-10-27 13:51     ` Ananyev, Konstantin
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 3/8] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 4/8] test/ring: add stress " Honnappa Nagarahalli
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
2020-10-27 14:11     ` Ananyev, Konstantin
2020-10-29 10:52     ` David Marchand
2020-10-29 11:28       ` Ananyev, Konstantin
2020-10-29 12:35         ` David Marchand
2020-10-29 17:29           ` Honnappa Nagarahalli
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 6/8] test/ring: fix the memory dump size Honnappa Nagarahalli
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 7/8] test/ring: remove unnecessary braces Honnappa Nagarahalli
2020-10-27 14:13     ` Ananyev, Konstantin
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 8/8] test/ring: user uintptr_t instead of unsigned long Honnappa Nagarahalli
2020-10-27 14:14     ` Ananyev, Konstantin
2020-10-29 13:57   ` [dpdk-dev] [PATCH v5 0/8] lib/ring: add zero copy APIs David Marchand
2020-10-29 22:11     ` Honnappa Nagarahalli
