* [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers
@ 2019-11-18  9:50 Shahaf Shuler
  2019-11-18 16:09 ` Stephen Hemminger
                   ` (8 more replies)
  0 siblings, 9 replies; 77+ messages in thread
From: Shahaf Shuler @ 2019-11-18  9:50 UTC (permalink / raw)
  To: olivier.matz, Thomas Monjalon, dev, arybchenko
  Cc: Asaf Penso, Olga Shern, Alex Rosenbaum, eagostini
Today's pktmbuf pool contains only mbufs with no external buffers.
This means the data buffer for the mbuf must be placed right after the
mbuf structure (+ the private data when enabled).
In some cases, the application would want to have the buffers allocated
from a different device in the platform. This is in order to do zero
copy for the packet directly to the device memory. Examples of such
devices are a GPU or a storage device. For such cases the native pktmbuf
pool does not fit, since each mbuf would need to point to an external
buffer.
To support the above, the pktmbuf pool will be populated with mbufs
pointing to the device buffers using the mbuf external buffer feature.
The PMD will populate its receive queues with those buffers, so that
every packet received will be scattered directly to the device memory.
In the other direction, embedding the buffer pointer in the transmit
queues of the NIC will make the DMA fetch device memory
using peer-to-peer communication.
Such an mbuf with an external buffer should be handled with care when
the mbuf is freed. Mainly, the external buffer should not be detached,
so that it can be reused for the next packet receive.
This patch introduces a new flag in the rte_pktmbuf_pool_private
structure to specify that this mempool is for mbufs with pinned external
buffers. Upon detach this flag is checked and the buffer is not detached.
A new mempool create wrapper is also introduced to help the application
create and populate such a mempool.
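The detach rule described above can be sketched as a small stand-alone
model. The names model_pool, model_mbuf and model_detach, and the flag
values, are hypothetical stand-ins chosen for illustration and are not
the DPDK definitions; only the control flow follows the description.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DPDK flags; values chosen for the sketch. */
#define MODEL_POOL_F_PINNED_EXT_BUF (1u << 0)
#define MODEL_EXT_ATTACHED_MBUF     (1ull << 61)

struct model_pool {
	uint32_t flags;
};

struct model_mbuf {
	struct model_pool *pool;
	void *buf_addr;
	uint64_t ol_flags;
	uint16_t data_len;
	char embedded[64]; /* stands in for the room after the mbuf header */
};

/* Returns 1 if the external buffer was kept (pinned), 0 if it was dropped. */
static int
model_detach(struct model_mbuf *m)
{
	int pinned = (m->ol_flags & MODEL_EXT_ATTACHED_MBUF) &&
		     (m->pool->flags & MODEL_POOL_F_PINNED_EXT_BUF);

	if (pinned) {
		/* Keep buf_addr; clear every flag except the attachment bit. */
		m->ol_flags = MODEL_EXT_ATTACHED_MBUF;
	} else {
		/* Ordinary path: fall back to the embedded data room. */
		m->buf_addr = m->embedded;
		m->ol_flags = 0;
	}
	m->data_len = 0;
	return pinned;
}
```

In the pinned case the buffer pointer survives the free, which is what
lets the next receive reuse the same device buffer.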
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
---
 lib/librte_mbuf/rte_mbuf.h | 75 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 69 insertions(+), 6 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 92d81972ab..e631dfff30 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -295,6 +295,13 @@ rte_mbuf_to_priv(struct rte_mbuf *m)
 }
 
 /**
+ * When set pktmbuf mempool will hold only mbufs with pinned external buffer.
+ * The external buffer will be attached on the mbuf creation and will not be
+ * detached by the mbuf free calls.
+ * mbuf should not contain any room for data after the mbuf structure.
+ */
+#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
+/**
  * Private data in case of pktmbuf pool.
  *
  * A structure that contains some pktmbuf_pool-specific data that are
@@ -303,6 +310,7 @@ rte_mbuf_to_priv(struct rte_mbuf *m)
 struct rte_pktmbuf_pool_private {
 	uint16_t mbuf_data_room_size; /**< Size of data space in each mbuf. */
 	uint16_t mbuf_priv_size;      /**< Size of private area in each mbuf. */
+	uint32_t flags;		      /**< Use RTE_PKTMBUF_POOL_F_*. */
 };
 
 #ifdef RTE_LIBRTE_MBUF_DEBUG
@@ -660,6 +668,50 @@ rte_pktmbuf_pool_create(const char *name, unsigned n,
 	int socket_id);
 
 /**
+ * Create a mbuf pool with pinned external buffers.
+ *
+ * This function creates and initializes a packet mbuf pool that contains
+ * only mbufs with external buffer. It is a wrapper to rte_mempool functions.
+ *
+ * @param name
+ *   The name of the mbuf pool.
+ * @param n
+ *   The number of elements in the mbuf pool. The optimum size (in terms
+ *   of memory usage) for a mempool is when n is a power of two minus one:
+ *   n = (2^q - 1).
+ * @param cache_size
+ *   Size of the per-core object cache. See rte_mempool_create() for
+ *   details.
+ * @param priv_size
+ *   Size of application private area between the rte_mbuf structure
+ *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
+ * @param socket_id
+ *   The socket identifier where the mempool memory should be allocated. The
+ *   value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the
+ *   reserved zone.
+ * @param buffers
+ *   Array of buffers to be attached to the mbufs in the pool.
+ *   Array size should be n.
+ * @param buffer_len
+ *   Array of buffer lengths. buffer_len[i] describes the length of the
+ *   buffer pointed to by buffers[i].
+ * @return
+ *   The pointer to the new allocated mempool, on success. NULL on error
+ *   with rte_errno set appropriately. Possible rte_errno values include:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ */
+struct rte_mempool *
+rte_pktmbuf_ext_buffer_pool_create(const char *name, unsigned n,
+				   unsigned cache_size, uint16_t priv_size,
+				   int socket_id, void **buffers,
+				   uint16_t *buffer_len);
+
+/**
  * Create a mbuf pool with a given mempool ops name
  *
  * This function creates and initializes a packet mbuf pool. It is
@@ -1137,25 +1189,36 @@ __rte_pktmbuf_free_direct(struct rte_mbuf *m)
 static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 {
 	struct rte_mempool *mp = m->pool;
+	struct rte_pktmbuf_pool_private *priv =
+		(struct rte_pktmbuf_pool_private *)rte_mempool_get_priv(mp);
+	uint8_t pinned_ext_mbuf = priv->flags &
+				  RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
 	uint32_t mbuf_size, buf_len;
 	uint16_t priv_size;
 
-	if (RTE_MBUF_HAS_EXTBUF(m))
-		__rte_pktmbuf_free_extbuf(m);
-	else
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		if (pinned_ext_mbuf) {
+			m->ol_flags = EXT_ATTACHED_MBUF;
+			goto reset_data;
+		} else {
+			__rte_pktmbuf_free_extbuf(m);
+		}
+	} else {
 		__rte_pktmbuf_free_direct(m);
+	}
 
-	priv_size = rte_pktmbuf_priv_size(mp);
+	priv_size = priv->mbuf_priv_size;
 	mbuf_size = (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
-	buf_len = rte_pktmbuf_data_room_size(mp);
+	buf_len = priv->mbuf_data_room_size;
 
 	m->priv_size = priv_size;
 	m->buf_addr = (char *)m + mbuf_size;
 	m->buf_iova = rte_mempool_virt2iova(m) + mbuf_size;
 	m->buf_len = (uint16_t)buf_len;
+	m->ol_flags = 0;
+reset_data:
 	rte_pktmbuf_reset_headroom(m);
 	m->data_len = 0;
-	m->ol_flags = 0;
 }
 
 /**
-- 
2.12.0
* Re: [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers
  2019-11-18  9:50 [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers Shahaf Shuler
@ 2019-11-18 16:09 ` Stephen Hemminger
  2020-01-10 17:56 ` [dpdk-dev] [PATCH 0/4] " Viacheslav Ovsiienko
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 77+ messages in thread
From: Stephen Hemminger @ 2019-11-18 16:09 UTC (permalink / raw)
  To: Shahaf Shuler
  Cc: olivier.matz, Thomas Monjalon, dev, arybchenko, Asaf Penso,
	Olga Shern, Alex Rosenbaum, eagostini
On Mon, 18 Nov 2019 09:50:07 +0000
Shahaf Shuler <shahafs@mellanox.com> wrote:
> +struct rte_mempool *
> +rte_pktmbuf_ext_buffer_pool_create(const char *name, unsigned n,
> +				   unsigned cache_size, uint16_t priv_size,
> +				   int socket_id, void **buffers,
> +				   uint16_t *buffer_len);
New APIs must be marked experimental.
* [dpdk-dev] [PATCH 0/4] mbuf: introduce pktmbuf pool with pinned external buffers
  2019-11-18  9:50 [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers Shahaf Shuler
  2019-11-18 16:09 ` Stephen Hemminger
@ 2020-01-10 17:56 ` Viacheslav Ovsiienko
  2020-01-10 17:56   ` [dpdk-dev] [PATCH 1/4] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
                     ` (3 more replies)
  2020-01-14  7:49 ` [dpdk-dev] [PATCH v2 0/4] mbuf: introduce pktmbuf pool with pinned external buffers Viacheslav Ovsiienko
                   ` (6 subsequent siblings)
  8 siblings, 4 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-10 17:56 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, Shahaf Shuler
Today's pktmbuf pool contains only mbufs with no external buffers.
This means the data buffer for the mbuf must be placed right after the
mbuf structure (+ the private data when enabled).
In some cases, the application would want to have the buffers allocated
from a different device in the platform. This is in order to do zero
copy for the packet directly to the device memory. Examples of such
devices are a GPU or a storage device. For such cases the native pktmbuf
pool does not fit, since each mbuf would need to point to an external
buffer.
To support the above, the pktmbuf pool will be populated with mbufs
pointing to the device buffers using the mbuf external buffer feature.
The PMD will populate its receive queues with those buffers, so that
every packet received will be scattered directly to the device memory.
In the other direction, embedding the buffer pointer in the transmit
queues of the NIC will make the DMA fetch device memory
using peer-to-peer communication.
Such an mbuf with an external buffer should be handled with care when
the mbuf is freed. Mainly, the external buffer should not be detached,
so that it can be reused for the next packet receive.
This patch introduces a new flag in the rte_pktmbuf_pool_private
structure to specify that this mempool is for mbufs with pinned external
buffers. Upon detach this flag is checked and the buffer is not detached.
A new mempool create wrapper is also introduced to help the application
create and populate such a mempool.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
RFC: http://patches.dpdk.org/patch/63077/
Viacheslav Ovsiienko (4):
  mbuf: detach mbuf with pinned external buffer
  mbuf: create packet pool with external memory buffers
  app/testpmd: add mempool with external data buffers
  net/mlx5: allow use allocated mbuf with external buffer
 app/test-pmd/config.c                    |   2 +
 app/test-pmd/flowgen.c                   |   3 +-
 app/test-pmd/parameters.c                |   2 +
 app/test-pmd/testpmd.c                   |  81 +++++++++++++++++
 app/test-pmd/testpmd.h                   |   4 +-
 app/test-pmd/txonly.c                    |   3 +-
 drivers/net/mlx5/mlx5_rxq.c              |   7 +-
 drivers/net/mlx5/mlx5_rxtx.c             |   2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |   2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         |  14 +--
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |   5 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    |  29 ++++---
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |   2 +-
 lib/librte_mbuf/rte_mbuf.c               | 145 ++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h               | 145 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_version.map     |   1 +
 16 files changed, 406 insertions(+), 41 deletions(-)
-- 
1.8.3.1
* [dpdk-dev] [PATCH 1/4] mbuf: detach mbuf with pinned external buffer
  2020-01-10 17:56 ` [dpdk-dev] [PATCH 0/4] " Viacheslav Ovsiienko
@ 2020-01-10 17:56   ` Viacheslav Ovsiienko
  2020-01-10 18:23     ` Stephen Hemminger
  2020-01-10 17:57   ` [dpdk-dev] [PATCH 2/4] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-10 17:56 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, Shahaf Shuler
Update detach routine to check the mbuf pool type.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 lib/librte_mbuf/rte_mbuf.h | 59 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 55 insertions(+), 4 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 219b110..e115ae5 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -306,6 +306,41 @@ struct rte_pktmbuf_pool_private {
 	uint32_t flags; /**< reserved for future use. */
 };
 
+/**
+ * When set pktmbuf mempool will hold only mbufs with pinned external
+ * buffer. The external buffer will be attached on the mbuf at the
+ * memory pool creation and will never be detached by the mbuf free calls.
+ * mbuf should not contain any room for data after the mbuf structure.
+ */
+#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
+
+/**
+ * Returns TRUE if given mbuf has a pinned external buffer, or FALSE
+ * otherwise. The pinned external buffer is allocated at pool creation
+ * time and should not be freed.
+ *
+ * External buffer is a user-provided anonymous buffer.
+ */
+#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) rte_mbuf_has_pinned_extbuf(mb)
+
+static inline uint64_t
+rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m)
+{
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		/*
+		 * The mbuf has the external attached buffer,
+		 * we should check the type of the memory pool where
+		 * the mbuf was allocated from.
+		 */
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(m->pool);
+
+		return priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
+	}
+	return 0;
+}
+
 #ifdef RTE_LIBRTE_MBUF_DEBUG
 
 /**  check mbuf type in debug mode */
@@ -571,7 +606,8 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 static __rte_always_inline void
 rte_mbuf_raw_free(struct rte_mbuf *m)
 {
-	RTE_ASSERT(RTE_MBUF_DIRECT(m));
+	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
+		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
 	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
 	RTE_ASSERT(m->next == NULL);
 	RTE_ASSERT(m->nb_segs == 1);
@@ -1141,11 +1177,26 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 	uint32_t mbuf_size, buf_len;
 	uint16_t priv_size;
 
-	if (RTE_MBUF_HAS_EXTBUF(m))
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		/*
+		 * The mbuf has the external attached buffer,
+		 * we should check the type of the memory pool where
+		 * the mbuf was allocated from to detect the pinned
+		 * external buffer.
+		 */
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(mp);
+
+		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
+			RTE_ASSERT(m->shinfo == NULL);
+			m->ol_flags = EXT_ATTACHED_MBUF;
+			return;
+		}
 		__rte_pktmbuf_free_extbuf(m);
-	else
+	} else {
 		__rte_pktmbuf_free_direct(m);
-
+	}
 	priv_size = rte_pktmbuf_priv_size(mp);
 	mbuf_size = (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
 	buf_len = rte_pktmbuf_data_room_size(mp);
-- 
1.8.3.1
* [dpdk-dev] [PATCH 2/4] mbuf: create packet pool with external memory buffers
  2020-01-10 17:56 ` [dpdk-dev] [PATCH 0/4] " Viacheslav Ovsiienko
  2020-01-10 17:56   ` [dpdk-dev] [PATCH 1/4] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
@ 2020-01-10 17:57   ` Viacheslav Ovsiienko
  2020-01-10 17:57   ` [dpdk-dev] [PATCH 3/4] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
  2020-01-10 17:57   ` [dpdk-dev] [PATCH 4/4] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
  3 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-10 17:57 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika
The dedicated routine rte_pktmbuf_pool_create_extbuf() is
provided to create an mbuf pool with data buffers located in
pinned external memory. The application provides the
external memory description and the routine initialises each
mbuf with the appropriate virtual and physical buffer addresses.
It is entirely the application's responsibility to register
the external memory with the rte_extmem_register() API, map this
memory, etc.
The newly introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
is set in the private pool structure, specifying the new special
pool type. The mbufs allocated from a pool of this kind will
have the EXT_ATTACHED_MBUF flag set and a NULL shared info
pointer, because the external buffers are not supposed to be
freed and sharing management is not needed. Also, these
mbufs cannot be attached to other mbufs (they are not intended
to be indirect).
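How a set of external memory descriptors is carved into per-mbuf data
buffers can be illustrated with a stand-alone model. The types seg_desc
and carve_ctx and the helpers count_elts() and next_elt() are
hypothetical names for this sketch, not part of the patch, and the IOVA
field is omitted. The sketch only hands out an element when a whole
element still fits in the segment, so the iteration agrees with the
buf_len / elt_size count.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of an external memory descriptor (IOVA omitted). */
struct seg_desc {
	char *buf_ptr;
	size_t buf_len;
	uint16_t elt_size; /* assumed non-zero */
};

struct carve_ctx {
	const struct seg_desc *segs;
	unsigned int nb_segs;
	unsigned int seg; /* current descriptor index */
	size_t off;       /* offset inside the current descriptor */
};

/* Number of whole elements the descriptors can supply. */
static unsigned int
count_elts(const struct seg_desc *segs, unsigned int nb_segs)
{
	unsigned int i, n = 0;

	for (i = 0; i < nb_segs; i++)
		n += (unsigned int)(segs[i].buf_len / segs[i].elt_size);
	return n;
}

/* Hand out the next element, or NULL once all segments are exhausted. */
static char *
next_elt(struct carve_ctx *ctx)
{
	char *p;

	/* Skip to the next segment while a whole element does not fit. */
	while (ctx->seg < ctx->nb_segs &&
	       ctx->off + ctx->segs[ctx->seg].elt_size >
	       ctx->segs[ctx->seg].buf_len) {
		ctx->off = 0;
		ctx->seg++;
	}
	if (ctx->seg >= ctx->nb_segs)
		return NULL;
	p = ctx->segs[ctx->seg].buf_ptr + ctx->off;
	ctx->off += ctx->segs[ctx->seg].elt_size;
	return p;
}
```

A pool of n mbufs can be populated this way only if count_elts() returns
at least n, which corresponds to the "not enough extmem" check in the
routine.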
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 lib/librte_mbuf/rte_mbuf.c           | 145 ++++++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h           |  86 ++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf_version.map |   1 +
 3 files changed, 229 insertions(+), 3 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8fa7f49..9659669 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -59,9 +59,9 @@
 	}
 
 	RTE_ASSERT(mp->elt_size >= sizeof(struct rte_mbuf) +
-		user_mbp_priv->mbuf_data_room_size +
+		((user_mbp_priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
+		0 : user_mbp_priv->mbuf_data_room_size) +
 		user_mbp_priv->mbuf_priv_size);
-	RTE_ASSERT(user_mbp_priv->flags == 0);
 
 	mbp_priv = rte_mempool_get_priv(mp);
 	memcpy(mbp_priv, user_mbp_priv, sizeof(*mbp_priv));
@@ -107,6 +107,63 @@
 	m->next = NULL;
 }
 
+/*
+ * pktmbuf constructor for the pool with pinned external buffer,
+ * given as a callback function to rte_mempool_obj_iter() in
+ * rte_pktmbuf_pool_create_extbuf(). Set the fields of a packet
+ * mbuf to their default values.
+ */
+void
+rte_pktmbuf_init_extmem(struct rte_mempool *mp,
+			void *opaque_arg,
+			void *_m,
+			__attribute__((unused)) unsigned int i)
+{
+	struct rte_mbuf *m = _m;
+	struct rte_pktmbuf_extmem_init_ctx *ctx = opaque_arg;
+	struct rte_pktmbuf_extmem *ext_mem;
+	uint32_t mbuf_size, buf_len, priv_size;
+
+	priv_size = rte_pktmbuf_priv_size(mp);
+	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
+	buf_len = rte_pktmbuf_data_room_size(mp);
+
+	RTE_ASSERT(RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) == priv_size);
+	RTE_ASSERT(mp->elt_size >= mbuf_size);
+	RTE_ASSERT(buf_len <= UINT16_MAX);
+
+	memset(m, 0, mbuf_size);
+	m->priv_size = priv_size;
+	m->buf_len = (uint16_t)buf_len;
+
+	/* set the data buffer pointers to external memory */
+	ext_mem = ctx->ext_mem + ctx->ext;
+
+	RTE_ASSERT(ctx->ext < ctx->ext_num);
+	RTE_ASSERT(ctx->off < ext_mem->buf_len);
+
+	m->buf_addr = RTE_PTR_ADD(ext_mem->buf_ptr, ctx->off);
+	m->buf_iova = ext_mem->buf_iova == RTE_BAD_IOVA ?
+		      RTE_BAD_IOVA : (ext_mem->buf_iova + ctx->off);
+
+	ctx->off += ext_mem->elt_size;
+	if (ctx->off + ext_mem->elt_size > ext_mem->buf_len) {
+		ctx->off = 0;
+		++ctx->ext;
+	}
+	/* keep some headroom between start of buffer and data */
+	m->data_off = RTE_MIN(RTE_PKTMBUF_HEADROOM, (uint16_t)m->buf_len);
+
+	/* init some constant fields */
+	m->pool = mp;
+	m->nb_segs = 1;
+	m->port = MBUF_INVALID_PORT;
+	m->ol_flags = EXT_ATTACHED_MBUF;
+	rte_mbuf_refcnt_set(m, 1);
+	m->next = NULL;
+}
+
+
 /* Helper to create a mbuf pool with given mempool ops name*/
 struct rte_mempool *
 rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
@@ -169,6 +226,90 @@ struct rte_mempool *
 			data_room_size, socket_id, NULL);
 }
 
+/* Helper to create a mbuf pool with pinned external data buffers. */
+__rte_experimental
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num)
+{
+	struct rte_mempool *mp;
+	struct rte_pktmbuf_pool_private mbp_priv;
+	struct rte_pktmbuf_extmem_init_ctx init_ctx;
+	const char *mp_ops_name;
+	unsigned int elt_size;
+	unsigned int i, n_elts = 0;
+	int ret;
+
+	if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
+		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
+			priv_size);
+		rte_errno = EINVAL;
+		return NULL;
+	}
+	/* Check the external memory descriptors. */
+	for (i = 0; i < ext_num; i++) {
+		struct rte_pktmbuf_extmem *extm = ext_mem + i;
+
+		if (!extm->elt_size || !extm->buf_len || !extm->buf_ptr) {
+			RTE_LOG(ERR, MBUF, "invalid extmem descriptor\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		if (data_room_size > extm->elt_size) {
+			RTE_LOG(ERR, MBUF, "ext elt_size=%u is too small\n",
+				extm->elt_size);
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		n_elts += extm->buf_len / extm->elt_size;
+	}
+	/* Check whether enough external memory provided. */
+	if (n_elts < n) {
+		RTE_LOG(ERR, MBUF, "not enough extmem\n");
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+	elt_size = sizeof(struct rte_mbuf) + (unsigned int)priv_size;
+	memset(&mbp_priv, 0, sizeof(mbp_priv));
+	mbp_priv.mbuf_data_room_size = data_room_size;
+	mbp_priv.mbuf_priv_size = priv_size;
+	mbp_priv.flags = RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
+
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		 sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
+	if (mp == NULL)
+		return NULL;
+
+	mp_ops_name = rte_mbuf_best_mempool_ops();
+	ret = rte_mempool_set_ops_byname(mp, mp_ops_name, NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+	rte_pktmbuf_pool_init(mp, &mbp_priv);
+
+	ret = rte_mempool_populate_default(mp);
+	if (ret < 0) {
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+
+	init_ctx = (struct rte_pktmbuf_extmem_init_ctx){
+		.ext_mem = ext_mem,
+		.ext_num = ext_num,
+		.ext = 0,
+		.off = 0,
+	};
+	rte_mempool_obj_iter(mp, rte_pktmbuf_init_extmem, &init_ctx);
+
+	return mp;
+}
+
 /* do some sanity checks on a mbuf: panic if it fails */
 void
 rte_mbuf_sanity_check(const struct rte_mbuf *m, int is_header)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index e115ae5..2992881 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -637,6 +637,34 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 void rte_pktmbuf_init(struct rte_mempool *mp, void *opaque_arg,
 		      void *m, unsigned i);
 
+/** The context to initialize the mbufs with pinned external buffers. */
+struct rte_pktmbuf_extmem_init_ctx {
+	struct rte_pktmbuf_extmem *ext_mem; /* pointer to descriptor array. */
+	unsigned int ext_num; /* number of descriptors in array. */
+	unsigned int ext; /* loop descriptor index. */
+	size_t off; /* loop buffer offset. */
+};
+
+/**
+ * The packet mbuf constructor for pools with pinned external memory.
+ *
+ * This function initializes some fields in the mbuf structure that are
+ * not modified by the user once created (origin pool, buffer start
+ * address, and so on). This function is given as a callback function to
+ * rte_mempool_obj_iter() called from rte_pktmbuf_pool_create_extbuf().
+ *
+ * @param mp
+ *   The mempool from which mbufs originate.
+ * @param opaque_arg
+ *   A pointer to the rte_pktmbuf_extmem_init_ctx - initialization
+ *   context structure
+ * @param m
+ *   The mbuf to initialize.
+ * @param i
+ *   The index of the mbuf in the pool table.
+ */
+void rte_pktmbuf_init_extmem(struct rte_mempool *mp, void *opaque_arg,
+			     void *m, unsigned int i);
 
 /**
  * A  packet mbuf pool constructor.
@@ -738,6 +766,62 @@ struct rte_mempool *
 	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
 	int socket_id, const char *ops_name);
 
+/** A structure that describes the pinned external buffer segment. */
+struct rte_pktmbuf_extmem {
+	void *buf_ptr;		/**< The virtual address of data buffer. */
+	rte_iova_t buf_iova;	/**< The IO address of the data buffer. */
+	size_t buf_len;		/**< External buffer length in bytes. */
+	uint16_t elt_size;	/**< mbuf element size in bytes. */
+};
+
+/**
+ * Create a mbuf pool with external pinned data buffers.
+ *
+ * This function creates and initializes a packet mbuf pool that contains
+ * only mbufs with external buffer. It is a wrapper to rte_mempool functions.
+ *
+ * @param name
+ *   The name of the mbuf pool.
+ * @param n
+ *   The number of elements in the mbuf pool. The optimum size (in terms
+ *   of memory usage) for a mempool is when n is a power of two minus one:
+ *   n = (2^q - 1).
+ * @param cache_size
+ *   Size of the per-core object cache. See rte_mempool_create() for
+ *   details.
+ * @param priv_size
+ *   Size of application private area between the rte_mbuf structure
+ *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
+ * @param data_room_size
+ *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
+ * @param socket_id
+ *   The socket identifier where the memory should be allocated. The
+ *   value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the
+ *   reserved zone.
+ * @param ext_mem
+ *   Pointer to the array of structures describing the external memory
+ *   for data buffers. It is the caller's responsibility to register this
+ *   memory with rte_extmem_register() (if needed), map this memory to the
+ *   appropriate physical device, etc.
+ * @param ext_num
+ *   Number of elements in the ext_mem array.
+ * @return
+ *   The pointer to the new allocated mempool, on success. NULL on error
+ *   with rte_errno set appropriately. Possible rte_errno values include:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ */
+__rte_experimental
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num);
+
 /**
  * Get the data room size of mbufs stored in a pktmbuf_pool
  *
@@ -813,7 +897,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
 	m->nb_segs = 1;
 	m->port = MBUF_INVALID_PORT;
 
-	m->ol_flags = 0;
+	m->ol_flags &= EXT_ATTACHED_MBUF;
 	m->packet_type = 0;
 	rte_pktmbuf_reset_headroom(m);
 
diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
index 3bbb476..ab161bc 100644
--- a/lib/librte_mbuf/rte_mbuf_version.map
+++ b/lib/librte_mbuf/rte_mbuf_version.map
@@ -44,5 +44,6 @@ EXPERIMENTAL {
 	rte_mbuf_dyn_dump;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
+	rte_pktmbuf_pool_create_extbuf;
 
 };
-- 
1.8.3.1
* [dpdk-dev] [PATCH 3/4] app/testpmd: add mempool with external data buffers
  2020-01-10 17:56 ` [dpdk-dev] [PATCH 0/4] " Viacheslav Ovsiienko
  2020-01-10 17:56   ` [dpdk-dev] [PATCH 1/4] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
  2020-01-10 17:57   ` [dpdk-dev] [PATCH 2/4] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
@ 2020-01-10 17:57   ` Viacheslav Ovsiienko
  2020-01-10 17:57   ` [dpdk-dev] [PATCH 4/4] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
  3 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-10 17:57 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika
A new mbuf pool type is added to testpmd. To engage the
mbuf pool with externally attached data buffers, the parameter
"--mp-alloc=xbuf" should be specified on the testpmd command line.
The objective of this patch is just to test whether an mbuf pool
with externally attached data buffers works OK. The memory for the
data buffers is allocated from DPDK memory, so this is not
"true" external memory from some physical device (which is
supposed to be the most common use case for this kind of mbuf pool).
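The number of 2 MB memzones testpmd needs for the data buffers follows
from simple arithmetic, sketched here as a stand-alone example. The
helper names align_ceil() and zones_needed() and the 64-byte cache line
are assumptions of the sketch; the patch itself uses RTE_ALIGN_CEIL and
RTE_CACHE_LINE_SIZE.

```c
#include <stddef.h>
#include <stdint.h>

#define ZONE_SIZE  (2u * 1024 * 1024) /* 2 MB zone, as EXTBUF_ZONE_SIZE */
#define CACHE_LINE 64u                /* assumed cache line size */

/* Round v up to a multiple of align (align must be a power of two). */
static uint32_t
align_ceil(uint32_t v, uint32_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/*
 * Number of zones needed to hold nb_mbufs buffers of mbuf_sz bytes.
 * mbuf_sz is assumed to be non-zero and not larger than one zone.
 */
static unsigned int
zones_needed(uint32_t nb_mbufs, uint16_t mbuf_sz)
{
	uint32_t elt_size = align_ceil(mbuf_sz, CACHE_LINE);
	uint32_t elt_num = ZONE_SIZE / elt_size; /* whole buffers per zone */

	return (nb_mbufs + elt_num - 1) / elt_num; /* round up */
}
```

For example, with 2176-byte buffers one zone holds 963 whole buffers, so
10000 mbufs need 11 zones.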
The user should be aware that not all drivers support mbufs
with the EXT_ATTACHED_MBUF flag set in a newly allocated mbuf (many
PMDs just overwrite the ol_flags field, so the flag value is
lost).
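The way to set offload flags without losing the attachment bit can be
sketched as follows. The bit values and the helper name
set_offload_flags() are hypothetical and chosen for illustration only;
the idea is to mask first and OR afterwards instead of assigning.

```c
#include <stdint.h>

/* Hypothetical bit values for the sketch; DPDK defines its own. */
#define MODEL_EXT_ATTACHED_MBUF (1ull << 61)
#define MODEL_TX_VLAN           (1ull << 57)

/* Replace the offload flags while keeping the attachment bit intact. */
static uint64_t
set_offload_flags(uint64_t cur, uint64_t wanted)
{
	cur &= MODEL_EXT_ATTACHED_MBUF; /* drop everything but the pinned bit */
	cur |= wanted;                  /* then apply the requested flags */
	return cur;
}
```

A plain assignment of the requested flags would clear the attachment bit
on an mbuf taken from a pinned pool, which is the failure mode described
above.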
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 app/test-pmd/config.c     |  2 ++
 app/test-pmd/flowgen.c    |  3 +-
 app/test-pmd/parameters.c |  2 ++
 app/test-pmd/testpmd.c    | 81 +++++++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/testpmd.h    |  4 ++-
 app/test-pmd/txonly.c     |  3 +-
 6 files changed, 92 insertions(+), 3 deletions(-)
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 9da1ffb..5c6fe18 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2395,6 +2395,8 @@ struct igb_ring_desc_16_bytes {
 		return "xmem";
 	case MP_ALLOC_XMEM_HUGE:
 		return "xmemhuge";
+	case MP_ALLOC_XBUF:
+		return "xbuf";
 	default:
 		return "invalid";
 	}
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index 03b72aa..ae50cdc 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -199,7 +199,8 @@
 							   sizeof(*ip_hdr));
 		pkt->nb_segs		= 1;
 		pkt->pkt_len		= pkt_size;
-		pkt->ol_flags		= ol_flags;
+		pkt->ol_flags		&= EXT_ATTACHED_MBUF;
+		pkt->ol_flags		|= ol_flags;
 		pkt->vlan_tci		= vlan_tci;
 		pkt->vlan_tci_outer	= vlan_tci_outer;
 		pkt->l2_len		= sizeof(struct rte_ether_hdr);
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 2e7a504..6340104 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -841,6 +841,8 @@
 					mp_alloc_type = MP_ALLOC_XMEM;
 				else if (!strcmp(optarg, "xmemhuge"))
 					mp_alloc_type = MP_ALLOC_XMEM_HUGE;
+				else if (!strcmp(optarg, "xbuf"))
+					mp_alloc_type = MP_ALLOC_XBUF;
 				else
 					rte_exit(EXIT_FAILURE,
 						"mp-alloc %s invalid - must be: "
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index b374682..6d3818c 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,7 @@
 #endif
 
 #define EXTMEM_HEAP_NAME "extmem"
+#define EXTBUF_ZONE_SIZE RTE_PGSIZE_2M
 
 uint16_t verbose_level = 0; /**< Silent by default. */
 int testpmd_logtype; /**< Log type for testpmd logs */
@@ -865,6 +866,66 @@ struct extmem_param {
 	}
 }
 
+static unsigned int
+setup_extbuf(uint32_t nb_mbufs, uint16_t mbuf_sz, unsigned int socket_id,
+	    char *pool_name, struct rte_pktmbuf_extmem **ext_mem)
+{
+	struct rte_pktmbuf_extmem *xmem;
+	unsigned int ext_num, zone_num, elt_num;
+	uint16_t elt_size;
+
+	elt_size = RTE_ALIGN_CEIL(mbuf_sz, RTE_CACHE_LINE_SIZE);
+	elt_num = EXTBUF_ZONE_SIZE / elt_size;
+	zone_num = (nb_mbufs + elt_num - 1) / elt_num;
+
+	xmem = malloc(sizeof(struct rte_pktmbuf_extmem) * zone_num);
+	if (xmem == NULL) {
+		TESTPMD_LOG(ERR, "Cannot allocate memory for "
+				 "external buffer descriptors\n");
+		*ext_mem = NULL;
+		return 0;
+	}
+	for (ext_num = 0; ext_num < zone_num; ext_num++) {
+		struct rte_pktmbuf_extmem *xseg = xmem + ext_num;
+		const struct rte_memzone *mz;
+		char mz_name[RTE_MEMZONE_NAMESIZE];
+		int ret;
+
+		ret = snprintf(mz_name, sizeof(mz_name),
+			RTE_MEMPOOL_MZ_FORMAT "_xb_%u", pool_name, ext_num);
+		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+			errno = ENAMETOOLONG;
+			ext_num = 0;
+			break;
+		}
+		mz = rte_memzone_reserve_aligned(mz_name, EXTBUF_ZONE_SIZE,
+						 socket_id,
+						 RTE_MEMZONE_IOVA_CONTIG |
+						 RTE_MEMZONE_1GB |
+						 RTE_MEMZONE_SIZE_HINT_ONLY,
+						 EXTBUF_ZONE_SIZE);
+		if (mz == NULL) {
+			/*
+			 * The caller exits on external buffer creation
+			 * error, so there is no need to free memzones.
+			 */
+			errno = ENOMEM;
+			ext_num = 0;
+			break;
+		}
+		xseg->buf_ptr = mz->addr;
+		xseg->buf_iova = mz->iova;
+		xseg->buf_len = EXTBUF_ZONE_SIZE;
+		xseg->elt_size = elt_size;
+	}
+	if (ext_num == 0 && xmem != NULL) {
+		free(xmem);
+		xmem = NULL;
+	}
+	*ext_mem = xmem;
+	return ext_num;
+}
+
 /*
  * Configuration initialisation done once at init time.
  */
@@ -933,6 +994,26 @@ struct extmem_param {
 					heap_socket);
 			break;
 		}
+	case MP_ALLOC_XBUF:
+		{
+			struct rte_pktmbuf_extmem *ext_mem;
+			unsigned int ext_num;
+
+			ext_num = setup_extbuf(nb_mbuf,	mbuf_seg_size,
+					       socket_id, pool_name, &ext_mem);
+			if (ext_num == 0)
+				rte_exit(EXIT_FAILURE,
+					 "Can't create pinned data buffers\n");
+
+			TESTPMD_LOG(INFO, "preferred mempool ops selected: %s\n",
+					rte_mbuf_best_mempool_ops());
+			rte_mp = rte_pktmbuf_pool_create_extbuf
+					(pool_name, nb_mbuf, mb_mempool_cache,
+					 0, mbuf_seg_size, socket_id,
+					 ext_mem, ext_num);
+			free(ext_mem);
+			break;
+		}
 	default:
 		{
 			rte_exit(EXIT_FAILURE, "Invalid mempool creation mode\n");
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 857a11f..a47f214 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -76,8 +76,10 @@ enum {
 	/**< allocate mempool natively, but populate using anonymous memory */
 	MP_ALLOC_XMEM,
 	/**< allocate and populate mempool using anonymous memory */
-	MP_ALLOC_XMEM_HUGE
+	MP_ALLOC_XMEM_HUGE,
 	/**< allocate and populate mempool using anonymous hugepage memory */
+	MP_ALLOC_XBUF
+	/**< allocate mempool natively, use rte_pktmbuf_pool_create_extbuf */
 };
 
 #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 3caf281..871cf6c 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -170,7 +170,8 @@
 
 	rte_pktmbuf_reset_headroom(pkt);
 	pkt->data_len = tx_pkt_seg_lengths[0];
-	pkt->ol_flags = ol_flags;
+	pkt->ol_flags &= EXT_ATTACHED_MBUF;
+	pkt->ol_flags |= ol_flags;
 	pkt->vlan_tci = vlan_tci;
 	pkt->vlan_tci_outer = vlan_tci_outer;
 	pkt->l2_len = sizeof(struct rte_ether_hdr);
-- 
1.8.3.1
* [dpdk-dev] [PATCH 4/4] net/mlx5: allow use allocated mbuf with external buffer
  2020-01-10 17:56 ` [dpdk-dev] [PATCH 0/4] " Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2020-01-10 17:57   ` [dpdk-dev] [PATCH 3/4] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
@ 2020-01-10 17:57   ` Viacheslav Ovsiienko
  3 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-10 17:57 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika
In the Rx datapath the flags of newly allocated mbufs are all
explicitly cleared, but the EXT_ATTACHED_MBUF flag must be
preserved. This allows the use of mbuf pools with pre-attached
external data buffers.
The vectorized rx_burst routines are updated to inherit the
EXT_ATTACHED_MBUF flag from the RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
flag in the mbuf pool private data.
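As a rough illustration of the two Rx paths described above, the following is a minimal standalone sketch, not the mlx5 code. All sketch_* names, the SKETCH_* bit positions and the 128-byte headroom are assumptions made for the example; the struct only models the 16 bytes (rearm fields followed by ol_flags) that a 128-bit initializer would cover.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the DPDK flags; the bit positions are illustrative only. */
#define SKETCH_EXT_ATTACHED  (1ULL << 61)
#define SKETCH_RX_CKSUM_GOOD (1ULL << 7)

/* Hypothetical model of the 16 bytes rearmed per mbuf by a vector Rx path. */
struct sketch_rearm {
	uint16_t data_off;
	uint16_t refcnt;
	uint16_t nb_segs;
	uint16_t port;
	uint64_t ol_flags;
};

/* Scalar path: drop stale offload flags, keep only the attachment bit. */
static inline uint64_t
sketch_rx_flags(uint64_t old_flags, uint64_t rx_flags)
{
	return (old_flags & SKETCH_EXT_ATTACHED) | rx_flags;
}

/* Vector path, step 1: build the per-queue template once at queue setup. */
static inline struct sketch_rearm
sketch_make_initializer(uint16_t port, int pool_is_pinned)
{
	struct sketch_rearm init;

	assert(sizeof(init) == 16);
	memset(&init, 0, sizeof(init));
	init.data_off = 128; /* assumed headroom, for illustration */
	init.refcnt = 1;
	init.nb_segs = 1;
	init.port = port;
	if (pool_is_pinned)
		init.ol_flags = SKETCH_EXT_ATTACHED;
	return init;
}

/* Vector path, step 2: copy the template and merge the per-packet flags. */
static inline struct sketch_rearm
sketch_rearm_mbuf(struct sketch_rearm init, uint64_t rx_flags)
{
	init.ol_flags |= rx_flags;
	return init;
}
```

Because the attachment bit lives in the template, the per-packet work in the vector path stays a single 16-byte store plus an OR, with no extra load from the mbuf being rearmed.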
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 drivers/net/mlx5/mlx5_rxq.c              |  7 ++++++-
 drivers/net/mlx5/mlx5_rxtx.c             |  2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |  2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         | 14 ++++----------
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |  5 ++---
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 29 +++++++++++++++--------------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |  2 +-
 7 files changed, 30 insertions(+), 31 deletions(-)
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index ca25e32..c87ce15 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -225,6 +225,9 @@
 	if (mlx5_rxq_check_vec_support(&rxq_ctrl->rxq) > 0) {
 		struct mlx5_rxq_data *rxq = &rxq_ctrl->rxq;
 		struct rte_mbuf *mbuf_init = &rxq->fake_mbuf;
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(rxq_ctrl->rxq.mp);
 		int j;
 
 		/* Initialize default rearm_data for vPMD. */
@@ -232,13 +235,15 @@
 		rte_mbuf_refcnt_set(mbuf_init, 1);
 		mbuf_init->nb_segs = 1;
 		mbuf_init->port = rxq->port_id;
+		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)
+			mbuf_init->ol_flags = EXT_ATTACHED_MBUF;
 		/*
 		 * prevent compiler reordering:
 		 * rearm_data covers previous fields.
 		 */
 		rte_compiler_barrier();
 		rxq->mbuf_initializer =
-			*(uint64_t *)&mbuf_init->rearm_data;
+			*(rte_xmm_t *)&mbuf_init->rearm_data;
 		/* Padding with a fake mbuf for vectorized Rx. */
 		for (j = 0; j < MLX5_VPMD_DESCS_PER_LOOP; ++j)
 			(*rxq->elts)[elts_n + j] = &rxq->fake_mbuf;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 25a2952..e5a885d 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1341,7 +1341,7 @@ enum mlx5_txcmp_code {
 			}
 			pkt = seg;
 			assert(len >= (rxq->crc_present << 2));
-			pkt->ol_flags = 0;
+			pkt->ol_flags &= EXT_ATTACHED_MBUF;
 			/* If compressed, take hash result from mini-CQE. */
 			rss_hash_res = rte_be_to_cpu_32(mcqe == NULL ?
 							cqe->rx_hash_res :
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index e927343..f35cc87 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -144,7 +144,7 @@ struct mlx5_rxq_data {
 	struct mlx5_mprq_buf *mprq_repl; /* Stashed mbuf for replenish. */
 	uint16_t idx; /* Queue index. */
 	struct mlx5_rxq_stats stats;
-	uint64_t mbuf_initializer; /* Default rearm_data for vectorized Rx. */
+	rte_xmm_t mbuf_initializer; /* Default rearm/flags for vectorized Rx. */
 	struct rte_mbuf fake_mbuf; /* elts padding for vectorized Rx. */
 	void *cq_uar; /* CQ user access region. */
 	uint32_t cqn; /* CQ number. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index 85e0bd5..d8c07f2 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -97,18 +97,12 @@
 		void *buf_addr;
 
 		/*
-		 * Load the virtual address for Rx WQE. non-x86 processors
-		 * (mostly RISC such as ARM and Power) are more vulnerable to
-		 * load stall. For x86, reducing the number of instructions
-		 * seems to matter most.
+		 * In order to support the mbufs with external attached
+		 * data buffer we should use the buf_addr pointer instead of
+		 * rte_mbuf_buf_addr(). It touches the mbuf itself and may
+		 * impact the performance.
 		 */
-#ifdef RTE_ARCH_X86_64
 		buf_addr = elts[i]->buf_addr;
-		assert(buf_addr == rte_mbuf_buf_addr(elts[i], rxq->mp));
-#else
-		buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
-		assert(buf_addr == elts[i]->buf_addr);
-#endif
 		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
 					      RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKey in WQE. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 8e79883..6d4ddb5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -344,9 +344,8 @@
 		PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 		PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED};
 	const vector unsigned char mbuf_init =
-		(vector unsigned char)(vector unsigned long){
-		*(__attribute__((__aligned__(8))) unsigned long *)
-		&rxq->mbuf_initializer, 0LL};
+		(vector unsigned char)(vector unsigned double) {
+		rxq->mbuf_initializer};
 	const vector unsigned short rearm_sel_mask =
 		(vector unsigned short){0, 0, 0, 0, 0xffff, 0xffff, 0, 0};
 	vector unsigned char rearm0, rearm1, rearm2, rearm3;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 86785c7..332e9ac 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -264,8 +264,8 @@
 	const uint32x4_t cv_mask =
 		vdupq_n_u32(PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			    PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
-	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
-	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
+	const uint64x2_t mbuf_init = vld1q_u64
+				((const uint64_t *)&rxq->mbuf_initializer);
 	uint64x2_t rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
@@ -326,18 +326,19 @@
 	/* Merge to ol_flags. */
 	ol_flags = vorrq_u32(ol_flags, cv_flags);
 	/* Merge mbuf_init and ol_flags, and store. */
-	rearm0 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_high_u64(vreinterpretq_u64_u32(
-						       ol_flags)), 32));
-	rearm1 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_high_u64(vreinterpretq_u64_u32(
-						     ol_flags)), r32_mask));
-	rearm2 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_low_u64(vreinterpretq_u64_u32(
-						      ol_flags)), 32));
-	rearm3 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_low_u64(vreinterpretq_u64_u32(
-						    ol_flags)), r32_mask));
+	rearm0 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 3),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm1 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 2),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm2 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 1),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm3 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 0),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+
 	vst1q_u64((void *)&pkts[0]->rearm_data, rearm0);
 	vst1q_u64((void *)&pkts[1]->rearm_data, rearm1);
 	vst1q_u64((void *)&pkts[2]->rearm_data, rearm2);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 35b7761..07d40d5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -259,7 +259,7 @@
 			      PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			      PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
 	const __m128i mbuf_init =
-		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
+		_mm_load_si128((__m128i *)&rxq->mbuf_initializer);
 	__m128i rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
-- 
1.8.3.1
* Re: [dpdk-dev] [PATCH 1/4] mbuf: detach mbuf with pinned external buffer
  2020-01-10 17:56   ` [dpdk-dev] [PATCH 1/4] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
@ 2020-01-10 18:23     ` Stephen Hemminger
  2020-01-13 17:07       ` Slava Ovsiienko
  2020-01-14  7:19       ` Slava Ovsiienko
  0 siblings, 2 replies; 77+ messages in thread
From: Stephen Hemminger @ 2020-01-10 18:23 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, orika, Shahaf Shuler
On Fri, 10 Jan 2020 17:56:59 +0000
Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> +
> +static inline uint64_t
> +rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m)
> +{
> +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> +		/*
> +		 * The mbuf has the external attached buffer,
> +		 * we should check the type of the memory pool where
> +		 * the mbuf was allocated from.
> +		 */
> +		struct rte_pktmbuf_pool_private *priv =
> +			(struct rte_pktmbuf_pool_private *)
> +				rte_mempool_get_priv(m->pool);
> +
> +		return priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
> +	}
> +	return 0;
> +}
New functions need to be marked experimental.
The return value should be boolean, not uint64_t.
Why does this need to be inlined (and thereby create a new ABI burden)?
Also, having it inline makes it harder to make pktmbuf_pool_private really private in the future.
* Re: [dpdk-dev] [PATCH 1/4] mbuf: detach mbuf with pinned external buffer
  2020-01-10 18:23     ` Stephen Hemminger
@ 2020-01-13 17:07       ` Slava Ovsiienko
  2020-01-14  7:19       ` Slava Ovsiienko
  1 sibling, 0 replies; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-13 17:07 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler
Hi, Stephen
Thanks a lot for the comment.
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Friday, January 10, 2020 20:24
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH 1/4] mbuf: detach mbuf with pinned external
> buffer
> 
> On Fri, 10 Jan 2020 17:56:59 +0000
> Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> 
> > +
> > +static inline uint64_t
> > +rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m) {
> > +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> > +		/*
> > +		 * The mbuf has the external attached buffer,
> > +		 * we should check the type of the memory pool where
> > +		 * the mbuf was allocated from.
> > +		 */
> > +		struct rte_pktmbuf_pool_private *priv =
> > +			(struct rte_pktmbuf_pool_private *)
> > +				rte_mempool_get_priv(m->pool);
> > +
> > +		return priv->flags &
> RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
> > +	}
> > +	return 0;
> > +}
> 
> New functions need to be marked experimental.
> The return value should be boolean not uint64_t
Will be fixed in v2.
> 
> Why does this need to be inlined (and thereby create new ABI burden)?
> Also having it inline makes making pktmbuf_pool_private really private in
> future.
For performance reasons: this routine might be used in the datapath.
With best regards, Slava
* Re: [dpdk-dev] [PATCH 1/4] mbuf: detach mbuf with pinned external buffer
  2020-01-10 18:23     ` Stephen Hemminger
  2020-01-13 17:07       ` Slava Ovsiienko
@ 2020-01-14  7:19       ` Slava Ovsiienko
  1 sibling, 0 replies; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-14  7:19 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Friday, January 10, 2020 20:24
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH 1/4] mbuf: detach mbuf with pinned external
> buffer
> 
> On Fri, 10 Jan 2020 17:56:59 +0000
> Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> 
> > +
> > +static inline uint64_t
> > +rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m) {
> > +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> > +		/*
> > +		 * The mbuf has the external attached buffer,
> > +		 * we should check the type of the memory pool where
> > +		 * the mbuf was allocated from.
> > +		 */
> > +		struct rte_pktmbuf_pool_private *priv =
> > +			(struct rte_pktmbuf_pool_private *)
> > +				rte_mempool_get_priv(m->pool);
> > +
> > +		return priv->flags &
> RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
> > +	}
> > +	return 0;
> > +}
> 
> New functions need to be marked experimental.
> 
> The return value should be boolean not uint64_t
I intentionally avoided "bool" in v1. It is not a native C type,
requires an extra header (stdbool.h at least) and has poor portability:
we would run into a conflict with AltiVec, where "bool" becomes a reserved keyword.
So, I'm going to change the return value type to uint32_t (the same type as the private flags field).
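To make the consequence for callers concrete, here is a small standalone sketch of a uint32_t-returning check. It uses hypothetical sketch_* stand-ins rather than the real rte_mbuf structures, and the flag bit positions are assumptions. The returned value is the masked flag itself, so it should be tested for non-zero rather than compared against a specific constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the DPDK flags; the bit positions are illustrative only. */
#define SKETCH_EXT_ATTACHED          (1ULL << 61)
#define SKETCH_POOL_F_PINNED_EXT_BUF (1u << 0)

/* Hypothetical, trimmed-down models of the pool private data and the mbuf. */
struct sketch_pool_priv {
	uint16_t data_room_size;
	uint16_t priv_size;
	uint32_t flags;
};

struct sketch_mbuf {
	uint64_t ol_flags;
	const struct sketch_pool_priv *pool_priv;
};

/* Returns non-zero if the mbuf has a pinned external buffer, 0 otherwise. */
static inline uint32_t
sketch_has_pinned_extbuf(const struct sketch_mbuf *m)
{
	assert(m != NULL && m->pool_priv != NULL);
	if (m->ol_flags & SKETCH_EXT_ATTACHED)
		return m->pool_priv->flags & SKETCH_POOL_F_PINNED_EXT_BUF;
	return 0;
}

/* Usage helper: build a throwaway mbuf/pool pair and query it. */
static inline uint32_t
sketch_query(uint64_t ol_flags, uint32_t pool_flags)
{
	struct sketch_pool_priv priv = { .flags = pool_flags };
	struct sketch_mbuf m = { .ol_flags = ol_flags, .pool_priv = &priv };

	return sketch_has_pinned_extbuf(&m);
}
```

Keeping the return type the same width as the flags field means the masked value is never truncated, whichever bit the flag occupies.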
With best regards, Slava
* [dpdk-dev] [PATCH v2 0/4] mbuf: introduce pktmbuf pool with pinned external buffers
  2019-11-18  9:50 [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers Shahaf Shuler
  2019-11-18 16:09 ` Stephen Hemminger
  2020-01-10 17:56 ` [dpdk-dev] [PATCH 0/4] " Viacheslav Ovsiienko
@ 2020-01-14  7:49 ` Viacheslav Ovsiienko
  2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 1/4] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
                     ` (3 more replies)
  2020-01-14  9:15 ` [dpdk-dev] [PATCH v3 0/4] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
                   ` (5 subsequent siblings)
  8 siblings, 4 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-14  7:49 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen
Today's pktmbuf pool contains only mbufs with no external buffers.
This means the data buffer for the mbuf must be placed right after the
mbuf structure (+ the private data when enabled).
In some cases, the application would want to have the buffers allocated
from a different device in the platform. This is in order to do zero
copy for the packet directly to the device memory. Examples of such
devices are a GPU or a storage device. For such cases the native pktmbuf
pool does not fit, since each mbuf would need to point to an external
buffer.
To support the above, the pktmbuf pool will be populated with mbufs
pointing to the device buffers using the mbuf external buffer feature.
The PMD will populate its receive queues with those buffers, so that
every packet received will be scattered directly to the device memory.
In the other direction, embedding the buffer pointer in the transmit
queues of the NIC will make the DMA fetch device memory
using peer-to-peer communication.
Such an mbuf with an external buffer should be handled with care when the
mbuf is freed. Mainly, the external buffer should not be detached, so that
it can be reused for the next packet receive.
This patch introduces a new flag in the rte_pktmbuf_pool_private
structure to specify that this mempool is for mbufs with pinned external
buffers. Upon detach this flag is checked and the buffer is not detached.
A new mempool create wrapper is also introduced to help applications
create and populate such a mempool.
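The detach behaviour described above can be sketched roughly as follows. This is a simplified standalone model, not the rte_pktmbuf_detach() implementation: the sketch_* names, the flag bit positions and the 64-byte buffers are assumptions, and the free callback of a regular (non-pinned) external buffer is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the DPDK flags; the bit positions are illustrative only. */
#define SKETCH_EXT_ATTACHED          (1ULL << 61)
#define SKETCH_RX_VLAN               (1ULL << 0)
#define SKETCH_POOL_F_PINNED_EXT_BUF (1u << 0)

/* Hypothetical mbuf model; pool_flags stands in for the pool private flags. */
struct sketch_mbuf {
	void *buf_addr;
	uint64_t ol_flags;
	uint32_t pool_flags;
	uint8_t embedded[64]; /* stands in for the data room after the mbuf */
};

static inline void
sketch_detach(struct sketch_mbuf *m)
{
	if ((m->ol_flags & SKETCH_EXT_ATTACHED) &&
	    (m->pool_flags & SKETCH_POOL_F_PINNED_EXT_BUF)) {
		/* Pinned: keep buf_addr, reset everything but the attach bit. */
		m->ol_flags = SKETCH_EXT_ATTACHED;
		return;
	}
	/* Regular detach: point back at the embedded buffer, clear flags. */
	m->buf_addr = m->embedded;
	m->ol_flags = 0;
}

/* Usage helper: returns 1 if the external buffer survives the detach. */
static inline int
sketch_buffer_survives_detach(uint32_t pool_flags)
{
	static uint8_t device_mem[64]; /* stands in for device memory */
	struct sketch_mbuf m = {
		.buf_addr = device_mem,
		.ol_flags = SKETCH_EXT_ATTACHED | SKETCH_RX_VLAN,
		.pool_flags = pool_flags,
	};

	sketch_detach(&m);
	if (m.buf_addr == device_mem) {
		assert(m.ol_flags == SKETCH_EXT_ATTACHED);
		return 1;
	}
	return 0;
}
```

With the pinned flag set, the mbuf goes back to the pool still pointing at the device buffer, so the next allocation can be posted to a receive queue without re-attaching anything.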
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
RFC: http://patches.dpdk.org/patch/63077
v1: http://patches.dpdk.org/cover/64424
v2: fix rte_experimental issue on comment addressing
    rte_mbuf_has_pinned_extbuf return type is uint32_t
    fix Power9 compilation issue
Viacheslav Ovsiienko (4):
  mbuf: detach mbuf with pinned external buffer
  mbuf: create packet pool with external memory buffers
  app/testpmd: add mempool with external data buffers
  net/mlx5: allow use allocated mbuf with external buffer
 app/test-pmd/config.c                    |   2 +
 app/test-pmd/flowgen.c                   |   3 +-
 app/test-pmd/parameters.c                |   2 +
 app/test-pmd/testpmd.c                   |  81 +++++++++++++++++
 app/test-pmd/testpmd.h                   |   4 +-
 app/test-pmd/txonly.c                    |   3 +-
 drivers/net/mlx5/mlx5_rxq.c              |   7 +-
 drivers/net/mlx5/mlx5_rxtx.c             |   2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |   2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         |  14 +--
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |   5 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    |  29 +++---
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |   2 +-
 lib/librte_mbuf/rte_mbuf.c               | 144 ++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h               | 151 ++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf_version.map     |   1 +
 16 files changed, 411 insertions(+), 41 deletions(-)
-- 
1.8.3.1
* [dpdk-dev] [PATCH v2 1/4] mbuf: detach mbuf with pinned external buffer
  2020-01-14  7:49 ` [dpdk-dev] [PATCH v2 0/4] mbuf: introduce pktmbuf pool with pinned external buffers Viacheslav Ovsiienko
@ 2020-01-14  7:49   ` Viacheslav Ovsiienko
  2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 2/4] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-14  7:49 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen
Update detach routine to check the mbuf pool type.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 lib/librte_mbuf/rte_mbuf.h | 65 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 61 insertions(+), 4 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 219b110..46ae76c 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -32,6 +32,7 @@
  */
 
 #include <stdint.h>
+#include <stdbool.h>
 #include <rte_compat.h>
 #include <rte_common.h>
 #include <rte_config.h>
@@ -306,6 +307,46 @@ struct rte_pktmbuf_pool_private {
 	uint32_t flags; /**< reserved for future use. */
 };
 
+/**
+ * When set pktmbuf mempool will hold only mbufs with pinned external
+ * buffer. The external buffer will be attached on the mbuf at the
+ * memory pool creation and will never be detached by the mbuf free calls.
+ * mbuf should not contain any room for data after the mbuf structure.
+ */
+#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
+
+/**
+ * Returns TRUE if given mbuf has a pinned external buffer, or FALSE
+ * otherwise. The pinned external buffer is allocated at pool creation
+ * time and should not be freed.
+ *
+ * External buffer is a user-provided anonymous buffer.
+ */
+#ifdef ALLOW_EXPERIMENTAL_API
+#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) rte_mbuf_has_pinned_extbuf(mb)
+#else
+#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) false
+#endif
+
+__rte_experimental
+static inline uint32_t
+rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m)
+{
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		/*
+		 * The mbuf has the external attached buffer,
+		 * we should check the type of the memory pool where
+		 * the mbuf was allocated from.
+		 */
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(m->pool);
+
+		return priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
+	}
+	return 0;
+}
+
 #ifdef RTE_LIBRTE_MBUF_DEBUG
 
 /**  check mbuf type in debug mode */
@@ -571,7 +612,8 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 static __rte_always_inline void
 rte_mbuf_raw_free(struct rte_mbuf *m)
 {
-	RTE_ASSERT(RTE_MBUF_DIRECT(m));
+	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
+		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
 	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
 	RTE_ASSERT(m->next == NULL);
 	RTE_ASSERT(m->nb_segs == 1);
@@ -1141,11 +1183,26 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 	uint32_t mbuf_size, buf_len;
 	uint16_t priv_size;
 
-	if (RTE_MBUF_HAS_EXTBUF(m))
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		/*
+		 * The mbuf has the external attached buffer,
+		 * we should check the type of the memory pool where
+		 * the mbuf was allocated from to detect the pinned
+		 * external buffer.
+		 */
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(mp);
+
+		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
+			RTE_ASSERT(m->shinfo == NULL);
+			m->ol_flags = EXT_ATTACHED_MBUF;
+			return;
+		}
 		__rte_pktmbuf_free_extbuf(m);
-	else
+	} else {
 		__rte_pktmbuf_free_direct(m);
-
+	}
 	priv_size = rte_pktmbuf_priv_size(mp);
 	mbuf_size = (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
 	buf_len = rte_pktmbuf_data_room_size(mp);
-- 
1.8.3.1
* [dpdk-dev] [PATCH v2 2/4] mbuf: create packet pool with external memory buffers
  2020-01-14  7:49 ` [dpdk-dev] [PATCH v2 0/4] mbuf: introduce pktmbuf pool with pinned external buffers Viacheslav Ovsiienko
  2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 1/4] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
@ 2020-01-14  7:49   ` Viacheslav Ovsiienko
  2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 3/4] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
  2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 4/4] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
  3 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-14  7:49 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen
The dedicated routine rte_pktmbuf_pool_create_extbuf() is
provided to create an mbuf pool with data buffers located in
pinned external memory. The application provides the
external memory description and the routine initialises each
mbuf with the appropriate virtual and physical buffer address.
It is entirely the application's responsibility to register
the external memory with the rte_extmem_register() API, map this
memory, etc.
The newly introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
is set in the private pool structure, specifying the new special
pool type. The mbufs allocated from a pool of this kind will
have the EXT_ATTACHED_MBUF flag set and a NULL shared info
pointer, because external buffers are not supposed to be
freed and sharing management is not needed. Also, these
mbufs cannot be attached to other mbufs (they are not intended to
be indirect).
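For illustration, the way a list of external memory segments is carved into per-mbuf data buffers might be sketched as below. This is a standalone model with hypothetical sketch_* names, not the library code; it assumes a segment hands out a buffer only when a whole element still fits, so the number of buffers handed out equals the capacity computed as the sum of buf_len / elt_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one external memory segment descriptor. */
struct sketch_extmem {
	uint8_t *buf_ptr;
	size_t buf_len;
	uint16_t elt_size;
};

/* Walk state: current segment index and byte offset inside it. */
struct sketch_ctx {
	const struct sketch_extmem *ext_mem;
	unsigned int ext_num;
	unsigned int ext;
	size_t off;
};

/* Number of whole elements that all segments can hold. */
static inline size_t
sketch_capacity(const struct sketch_extmem *ext_mem, unsigned int ext_num)
{
	size_t n = 0;
	unsigned int i;

	for (i = 0; i < ext_num; i++) {
		assert(ext_mem[i].elt_size != 0);
		n += ext_mem[i].buf_len / ext_mem[i].elt_size;
	}
	return n;
}

/* Hand out the next data buffer, or NULL when the segments are used up. */
static inline uint8_t *
sketch_next_buf(struct sketch_ctx *ctx)
{
	while (ctx->ext < ctx->ext_num) {
		const struct sketch_extmem *seg = &ctx->ext_mem[ctx->ext];

		if (ctx->off + seg->elt_size <= seg->buf_len) {
			uint8_t *buf = seg->buf_ptr + ctx->off;

			ctx->off += seg->elt_size;
			return buf;
		}
		/* No room for a whole element: move to the next segment. */
		ctx->off = 0;
		ctx->ext++;
	}
	return NULL;
}

/* Usage helper: walk two segments of the given lengths, count buffers. */
static inline size_t
sketch_walk_count(size_t len0, size_t len1, uint16_t elt)
{
	static uint8_t seg0[64], seg1[64];
	struct sketch_extmem mem[2] = {
		{ seg0, len0, elt },
		{ seg1, len1, elt },
	};
	struct sketch_ctx ctx = { mem, 2, 0, 0 };
	size_t count = 0;

	assert(len0 <= sizeof(seg0) && len1 <= sizeof(seg1) && elt != 0);
	while (sketch_next_buf(&ctx) != NULL)
		count++;
	assert(count == sketch_capacity(mem, 2));
	return count;
}
```

Checking that a whole element fits before handing it out keeps the walk consistent with the capacity check even when a segment length is not a multiple of the element size.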
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 lib/librte_mbuf/rte_mbuf.c           | 144 ++++++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h           |  86 ++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf_version.map |   1 +
 3 files changed, 228 insertions(+), 3 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8fa7f49..d151469 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -59,9 +59,9 @@
 	}
 
 	RTE_ASSERT(mp->elt_size >= sizeof(struct rte_mbuf) +
-		user_mbp_priv->mbuf_data_room_size +
+		((user_mbp_priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
+		0 : user_mbp_priv->mbuf_data_room_size) +
 		user_mbp_priv->mbuf_priv_size);
-	RTE_ASSERT(user_mbp_priv->flags == 0);
 
 	mbp_priv = rte_mempool_get_priv(mp);
 	memcpy(mbp_priv, user_mbp_priv, sizeof(*mbp_priv));
@@ -107,6 +107,63 @@
 	m->next = NULL;
 }
 
+/*
+ * pktmbuf constructor for the pool with pinned external buffer,
+ * given as a callback function to rte_mempool_obj_iter() in
+ * rte_pktmbuf_pool_create_extbuf(). Set the fields of a packet
+ * mbuf to their default values.
+ */
+void
+rte_pktmbuf_init_extmem(struct rte_mempool *mp,
+			void *opaque_arg,
+			void *_m,
+			__attribute__((unused)) unsigned int i)
+{
+	struct rte_mbuf *m = _m;
+	struct rte_pktmbuf_extmem_init_ctx *ctx = opaque_arg;
+	struct rte_pktmbuf_extmem *ext_mem;
+	uint32_t mbuf_size, buf_len, priv_size;
+
+	priv_size = rte_pktmbuf_priv_size(mp);
+	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
+	buf_len = rte_pktmbuf_data_room_size(mp);
+
+	RTE_ASSERT(RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) == priv_size);
+	RTE_ASSERT(mp->elt_size >= mbuf_size);
+	RTE_ASSERT(buf_len <= UINT16_MAX);
+
+	memset(m, 0, mbuf_size);
+	m->priv_size = priv_size;
+	m->buf_len = (uint16_t)buf_len;
+
+	/* set the data buffer pointers to external memory */
+	ext_mem = ctx->ext_mem + ctx->ext;
+
+	RTE_ASSERT(ctx->ext < ctx->ext_num);
+	RTE_ASSERT(ctx->off < ext_mem->buf_len);
+
+	m->buf_addr = RTE_PTR_ADD(ext_mem->buf_ptr, ctx->off);
+	m->buf_iova = ext_mem->buf_iova == RTE_BAD_IOVA ?
+		      RTE_BAD_IOVA : (ext_mem->buf_iova + ctx->off);
+
+	ctx->off += ext_mem->elt_size;
+	if (ctx->off + ext_mem->elt_size > ext_mem->buf_len) {
+		ctx->off = 0;
+		++ctx->ext;
+	}
+	/* keep some headroom between start of buffer and data */
+	m->data_off = RTE_MIN(RTE_PKTMBUF_HEADROOM, (uint16_t)m->buf_len);
+
+	/* init some constant fields */
+	m->pool = mp;
+	m->nb_segs = 1;
+	m->port = MBUF_INVALID_PORT;
+	m->ol_flags = EXT_ATTACHED_MBUF;
+	rte_mbuf_refcnt_set(m, 1);
+	m->next = NULL;
+}
+
+
 /* Helper to create a mbuf pool with given mempool ops name*/
 struct rte_mempool *
 rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
@@ -169,6 +226,89 @@ struct rte_mempool *
 			data_room_size, socket_id, NULL);
 }
 
+/* Helper to create a mbuf pool with pinned external data buffers. */
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num)
+{
+	struct rte_mempool *mp;
+	struct rte_pktmbuf_pool_private mbp_priv;
+	struct rte_pktmbuf_extmem_init_ctx init_ctx;
+	const char *mp_ops_name;
+	unsigned int elt_size;
+	unsigned int i, n_elts = 0;
+	int ret;
+
+	if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
+		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
+			priv_size);
+		rte_errno = EINVAL;
+		return NULL;
+	}
+	/* Check the external memory descriptors. */
+	for (i = 0; i < ext_num; i++) {
+		struct rte_pktmbuf_extmem *extm = ext_mem + i;
+
+		if (!extm->elt_size || !extm->buf_len || !extm->buf_ptr) {
+			RTE_LOG(ERR, MBUF, "invalid extmem descriptor\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		if (data_room_size > extm->elt_size) {
+			RTE_LOG(ERR, MBUF, "ext elt_size=%u is too small\n",
+				extm->elt_size);
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		n_elts += extm->buf_len / extm->elt_size;
+	}
+	/* Check whether enough external memory provided. */
+	if (n_elts < n) {
+		RTE_LOG(ERR, MBUF, "not enough extmem\n");
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+	elt_size = sizeof(struct rte_mbuf) + (unsigned int)priv_size;
+	memset(&mbp_priv, 0, sizeof(mbp_priv));
+	mbp_priv.mbuf_data_room_size = data_room_size;
+	mbp_priv.mbuf_priv_size = priv_size;
+	mbp_priv.flags = RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
+
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		 sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
+	if (mp == NULL)
+		return NULL;
+
+	mp_ops_name = rte_mbuf_best_mempool_ops();
+	ret = rte_mempool_set_ops_byname(mp, mp_ops_name, NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+	rte_pktmbuf_pool_init(mp, &mbp_priv);
+
+	ret = rte_mempool_populate_default(mp);
+	if (ret < 0) {
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+
+	init_ctx = (struct rte_pktmbuf_extmem_init_ctx){
+		.ext_mem = ext_mem,
+		.ext_num = ext_num,
+		.ext = 0,
+		.off = 0,
+	};
+	rte_mempool_obj_iter(mp, rte_pktmbuf_init_extmem, &init_ctx);
+
+	return mp;
+}
+
 /* do some sanity checks on a mbuf: panic if it fails */
 void
 rte_mbuf_sanity_check(const struct rte_mbuf *m, int is_header)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 46ae76c..c14b8a1 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -643,6 +643,34 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 void rte_pktmbuf_init(struct rte_mempool *mp, void *opaque_arg,
 		      void *m, unsigned i);
 
+/** The context to initialize the mbufs with pinned external buffers. */
+struct rte_pktmbuf_extmem_init_ctx {
+	struct rte_pktmbuf_extmem *ext_mem; /* pointer to descriptor array. */
+	unsigned int ext_num; /* number of descriptors in array. */
+	unsigned int ext; /* loop descriptor index. */
+	size_t off; /* loop buffer offset. */
+};
+
+/**
+ * The packet mbuf constructor for pools with pinned external memory.
+ *
+ * This function initializes some fields in the mbuf structure that are
+ * not modified by the user once created (origin pool, buffer start
+ * address, and so on). This function is given as a callback function to
+ * rte_mempool_obj_iter() called from rte_pktmbuf_pool_create_extbuf().
+ *
+ * @param mp
+ *   The mempool from which mbufs originate.
+ * @param opaque_arg
+ *   A pointer to the rte_pktmbuf_extmem_init_ctx - initialization
+ *   context structure
+ * @param m
+ *   The mbuf to initialize.
+ * @param i
+ *   The index of the mbuf in the pool table.
+ */
+void rte_pktmbuf_init_extmem(struct rte_mempool *mp, void *opaque_arg,
+			     void *m, unsigned int i);
 
 /**
  * A  packet mbuf pool constructor.
@@ -744,6 +772,62 @@ struct rte_mempool *
 	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
 	int socket_id, const char *ops_name);
 
+/** A structure that describes the pinned external buffer segment. */
+struct rte_pktmbuf_extmem {
+	void *buf_ptr;		/**< The virtual address of data buffer. */
+	rte_iova_t buf_iova;	/**< The IO address of the data buffer. */
+	size_t buf_len;		/**< External buffer length in bytes. */
+	uint16_t elt_size;	/**< mbuf element size in bytes. */
+};
+
+/**
+ * Create a mbuf pool with external pinned data buffers.
+ *
+ * This function creates and initializes a packet mbuf pool that contains
+ * only mbufs with external buffer. It is a wrapper to rte_mempool functions.
+ *
+ * @param name
+ *   The name of the mbuf pool.
+ * @param n
+ *   The number of elements in the mbuf pool. The optimum size (in terms
+ *   of memory usage) for a mempool is when n is a power of two minus one:
+ *   n = (2^q - 1).
+ * @param cache_size
+ *   Size of the per-core object cache. See rte_mempool_create() for
+ *   details.
+ * @param priv_size
+ *   Size of application private area between the rte_mbuf structure
+ *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
+ * @param data_room_size
+ *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
+ * @param socket_id
+ *   The socket identifier where the memory should be allocated. The
+ *   value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the
+ *   reserved zone.
+ * @param ext_mem
+ *   Pointer to the array of structures describing the external memory
+ *   for data buffers. It is the caller's responsibility to register
+ *   this memory with rte_extmem_register() (if needed), map this memory
+ *   to the appropriate physical device, etc.
+ * @param ext_num
+ *   Number of elements in the ext_mem array.
+ * @return
+ *   The pointer to the newly allocated mempool, on success. NULL on error
+ *   with rte_errno set appropriately. Possible rte_errno values include:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ */
+__rte_experimental
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num);
+
 /**
  * Get the data room size of mbufs stored in a pktmbuf_pool
  *
@@ -819,7 +903,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
 	m->nb_segs = 1;
 	m->port = MBUF_INVALID_PORT;
 
-	m->ol_flags = 0;
+	m->ol_flags &= EXT_ATTACHED_MBUF;
 	m->packet_type = 0;
 	rte_pktmbuf_reset_headroom(m);
 
diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
index 3bbb476..ab161bc 100644
--- a/lib/librte_mbuf/rte_mbuf_version.map
+++ b/lib/librte_mbuf/rte_mbuf_version.map
@@ -44,5 +44,6 @@ EXPERIMENTAL {
 	rte_mbuf_dyn_dump;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
+	rte_pktmbuf_pool_create_extbuf;
 
 };
-- 
1.8.3.1
* [dpdk-dev] [PATCH v2 3/4] app/testpmd: add mempool with external data buffers
  2020-01-14  7:49 ` [dpdk-dev] [PATCH v2 0/4] mbuf: introduce pktmbuf pool with pinned external buffers Viacheslav Ovsiienko
  2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 1/4] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
  2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 2/4] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
@ 2020-01-14  7:49   ` Viacheslav Ovsiienko
  2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 4/4] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
  3 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-14  7:49 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen
A new mbuf pool type is added to testpmd. To use an mbuf pool
with externally attached data buffers, specify the parameter
"--mp-alloc=xbuf" on the testpmd command line.
The objective of this patch is only to test whether an mbuf pool
with externally attached data buffers works correctly. The memory
for the data buffers is allocated from DPDK memory, so this is not
"true" external memory from some physical device (which is
expected to be the most common use case for this kind of mbuf pool).
Users should be aware that not all drivers support mbufs that have
the EXT_ATTACHED_MBUF flag set when newly allocated (many PMDs
simply overwrite the ol_flags field, so the flag value is lost).
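The flag handling that this patch applies in flowgen.c and txonly.c can be
sketched in isolation. The types and names below (sketch_mbuf,
SKETCH_EXT_ATTACHED) are hypothetical stand-ins and not DPDK definitions; the
bit value is an assumption made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the EXT_ATTACHED_MBUF bit; the value is an assumption. */
#define SKETCH_EXT_ATTACHED (1ULL << 61)

/* Hypothetical, reduced mbuf: only the offload flags matter here. */
struct sketch_mbuf {
	uint64_t ol_flags;
};

/* Plain assignment: the attachment bit set at pool creation is lost. */
static inline void
sketch_set_flags_overwrite(struct sketch_mbuf *m, uint64_t new_flags)
{
	assert(m != NULL);
	m->ol_flags = new_flags;
}

/*
 * Pattern used in the diff: keep only the attachment bit,
 * then OR in the flags for the packet being built.
 */
static inline void
sketch_set_flags_preserve(struct sketch_mbuf *m, uint64_t new_flags)
{
	assert(m != NULL);
	m->ol_flags &= SKETCH_EXT_ATTACHED;
	m->ol_flags |= new_flags;
}
```

With the overwrite variant, a later free of the mbuf would no longer see the
buffer as externally attached, which is the failure mode described above.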
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 app/test-pmd/config.c     |  2 ++
 app/test-pmd/flowgen.c    |  3 +-
 app/test-pmd/parameters.c |  2 ++
 app/test-pmd/testpmd.c    | 81 +++++++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/testpmd.h    |  4 ++-
 app/test-pmd/txonly.c     |  3 +-
 6 files changed, 92 insertions(+), 3 deletions(-)
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 9da1ffb..5c6fe18 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2395,6 +2395,8 @@ struct igb_ring_desc_16_bytes {
 		return "xmem";
 	case MP_ALLOC_XMEM_HUGE:
 		return "xmemhuge";
+	case MP_ALLOC_XBUF:
+		return "xbuf";
 	default:
 		return "invalid";
 	}
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index 03b72aa..ae50cdc 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -199,7 +199,8 @@
 							   sizeof(*ip_hdr));
 		pkt->nb_segs		= 1;
 		pkt->pkt_len		= pkt_size;
-		pkt->ol_flags		= ol_flags;
+		pkt->ol_flags		&= EXT_ATTACHED_MBUF;
+		pkt->ol_flags		|= ol_flags;
 		pkt->vlan_tci		= vlan_tci;
 		pkt->vlan_tci_outer	= vlan_tci_outer;
 		pkt->l2_len		= sizeof(struct rte_ether_hdr);
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 2e7a504..6340104 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -841,6 +841,8 @@
 					mp_alloc_type = MP_ALLOC_XMEM;
 				else if (!strcmp(optarg, "xmemhuge"))
 					mp_alloc_type = MP_ALLOC_XMEM_HUGE;
+				else if (!strcmp(optarg, "xbuf"))
+					mp_alloc_type = MP_ALLOC_XBUF;
 				else
 					rte_exit(EXIT_FAILURE,
 						"mp-alloc %s invalid - must be: "
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 2eec8af..5f910ba 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,7 @@
 #endif
 
 #define EXTMEM_HEAP_NAME "extmem"
+#define EXTBUF_ZONE_SIZE RTE_PGSIZE_2M
 
 uint16_t verbose_level = 0; /**< Silent by default. */
 int testpmd_logtype; /**< Log type for testpmd logs */
@@ -865,6 +866,66 @@ struct extmem_param {
 	}
 }
 
+static unsigned int
+setup_extbuf(uint32_t nb_mbufs, uint16_t mbuf_sz, unsigned int socket_id,
+	    char *pool_name, struct rte_pktmbuf_extmem **ext_mem)
+{
+	struct rte_pktmbuf_extmem *xmem;
+	unsigned int ext_num, zone_num, elt_num;
+	uint16_t elt_size;
+
+	elt_size = RTE_ALIGN_CEIL(mbuf_sz, RTE_CACHE_LINE_SIZE);
+	elt_num = EXTBUF_ZONE_SIZE / elt_size;
+	zone_num = (nb_mbufs + elt_num - 1) / elt_num;
+
+	xmem = malloc(sizeof(struct rte_pktmbuf_extmem) * zone_num);
+	if (xmem == NULL) {
+		TESTPMD_LOG(ERR, "Cannot allocate memory for "
+				 "external buffer descriptors\n");
+		*ext_mem = NULL;
+		return 0;
+	}
+	for (ext_num = 0; ext_num < zone_num; ext_num++) {
+		struct rte_pktmbuf_extmem *xseg = xmem + ext_num;
+		const struct rte_memzone *mz;
+		char mz_name[RTE_MEMZONE_NAMESIZE];
+		int ret;
+
+		ret = snprintf(mz_name, sizeof(mz_name),
+			RTE_MEMPOOL_MZ_FORMAT "_xb_%u", pool_name, ext_num);
+		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+			errno = ENAMETOOLONG;
+			ext_num = 0;
+			break;
+		}
+		mz = rte_memzone_reserve_aligned(mz_name, EXTBUF_ZONE_SIZE,
+						 socket_id,
+						 RTE_MEMZONE_IOVA_CONTIG |
+						 RTE_MEMZONE_1GB |
+						 RTE_MEMZONE_SIZE_HINT_ONLY,
+						 EXTBUF_ZONE_SIZE);
+		if (mz == NULL) {
+			/*
+			 * The caller exits on external buffer creation
+			 * error, so there is no need to free memzones.
+			 */
+			errno = ENOMEM;
+			ext_num = 0;
+			break;
+		}
+		xseg->buf_ptr = mz->addr;
+		xseg->buf_iova = mz->iova;
+		xseg->buf_len = EXTBUF_ZONE_SIZE;
+		xseg->elt_size = elt_size;
+	}
+	if (ext_num == 0 && xmem != NULL) {
+		free(xmem);
+		xmem = NULL;
+	}
+	*ext_mem = xmem;
+	return ext_num;
+}
+
 /*
  * Configuration initialisation done once at init time.
  */
@@ -933,6 +994,26 @@ struct extmem_param {
 					heap_socket);
 			break;
 		}
+	case MP_ALLOC_XBUF:
+		{
+			struct rte_pktmbuf_extmem *ext_mem;
+			unsigned int ext_num;
+
+			ext_num = setup_extbuf(nb_mbuf,	mbuf_seg_size,
+					       socket_id, pool_name, &ext_mem);
+			if (ext_num == 0)
+				rte_exit(EXIT_FAILURE,
+					 "Can't create pinned data buffers\n");
+
+			TESTPMD_LOG(INFO, "preferred mempool ops selected: %s\n",
+					rte_mbuf_best_mempool_ops());
+			rte_mp = rte_pktmbuf_pool_create_extbuf
+					(pool_name, nb_mbuf, mb_mempool_cache,
+					 0, mbuf_seg_size, socket_id,
+					 ext_mem, ext_num);
+			free(ext_mem);
+			break;
+		}
 	default:
 		{
 			rte_exit(EXIT_FAILURE, "Invalid mempool creation mode\n");
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 857a11f..a47f214 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -76,8 +76,10 @@ enum {
 	/**< allocate mempool natively, but populate using anonymous memory */
 	MP_ALLOC_XMEM,
 	/**< allocate and populate mempool using anonymous memory */
-	MP_ALLOC_XMEM_HUGE
+	MP_ALLOC_XMEM_HUGE,
 	/**< allocate and populate mempool using anonymous hugepage memory */
+	MP_ALLOC_XBUF
+	/**< allocate mempool natively, use rte_pktmbuf_pool_create_extbuf */
 };
 
 #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 3caf281..871cf6c 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -170,7 +170,8 @@
 
 	rte_pktmbuf_reset_headroom(pkt);
 	pkt->data_len = tx_pkt_seg_lengths[0];
-	pkt->ol_flags = ol_flags;
+	pkt->ol_flags &= EXT_ATTACHED_MBUF;
+	pkt->ol_flags |= ol_flags;
 	pkt->vlan_tci = vlan_tci;
 	pkt->vlan_tci_outer = vlan_tci_outer;
 	pkt->l2_len = sizeof(struct rte_ether_hdr);
-- 
1.8.3.1
* [dpdk-dev] [PATCH v2 4/4] net/mlx5: allow use allocated mbuf with external buffer
  2020-01-14  7:49 ` [dpdk-dev] [PATCH v2 0/4] mbuf: introduce pktmbuf pool with pinned external buffers Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 3/4] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
@ 2020-01-14  7:49   ` Viacheslav Ovsiienko
  3 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-14  7:49 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen
In the Rx datapath the flags in newly allocated mbufs are all
explicitly cleared, but the EXT_ATTACHED_MBUF flag must be
preserved. This allows the use of mbuf pools with pre-attached
external data buffers.
The vectorized rx_burst routines are updated to inherit the
EXT_ATTACHED_MBUF flag from the RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
flag in the mbuf pool private data.
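As a rough illustration of why mbuf_initializer grows from 8 to 16 bytes, the
sketch below assumes that the 8 bytes of rearm fields are immediately followed
by the 64-bit ol_flags, so that one 16-byte template can seed both. The struct
layout, the headroom value and all sketch_* names are assumptions for this
example and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the EXT_ATTACHED_MBUF bit; the value is an assumption. */
#define SKETCH_EXT_ATTACHED (1ULL << 61)

/* Simplified slice of an mbuf: rearm fields followed by ol_flags. */
struct sketch_rearm {
	uint16_t data_off;
	uint16_t refcnt;
	uint16_t nb_segs;
	uint16_t port;
	uint64_t ol_flags;
};

_Static_assert(sizeof(struct sketch_rearm) == 16, "template must be 16 bytes");

/* 16-byte template, playing the role of the rte_xmm_t initializer. */
struct sketch_init16 {
	uint8_t bytes[16];
};

/* Build the per-queue template once, at queue setup time. */
static struct sketch_init16
sketch_make_initializer(uint16_t port, int pinned_ext_buf)
{
	struct sketch_rearm tmpl;
	struct sketch_init16 init;

	memset(&tmpl, 0, sizeof(tmpl));
	tmpl.data_off = 128; /* assumed headroom */
	tmpl.refcnt = 1;
	tmpl.nb_segs = 1;
	tmpl.port = port;
	if (pinned_ext_buf)
		tmpl.ol_flags = SKETCH_EXT_ATTACHED;
	memcpy(init.bytes, &tmpl, sizeof(init.bytes));
	return init;
}

/* Re-arm one mbuf: copy the template, then merge 32 bits of Rx flags. */
static void
sketch_rearm_mbuf(struct sketch_rearm *m, const struct sketch_init16 *init,
		  uint32_t rx_flags)
{
	assert(m != NULL && init != NULL);
	memcpy(m, init->bytes, sizeof(init->bytes));
	/* The upper half of ol_flags, with the attachment bit, is kept. */
	m->ol_flags = (m->ol_flags & ~0xffffffffULL) | rx_flags;
}
```

The scalar merge at the end mirrors what the NEON hunk appears to do when it
writes the computed flags into 32-bit lane 2 of the 128-bit template.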
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5_rxq.c              |  7 ++++++-
 drivers/net/mlx5/mlx5_rxtx.c             |  2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |  2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         | 14 ++++----------
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |  5 ++---
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 29 +++++++++++++++--------------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |  2 +-
 7 files changed, 30 insertions(+), 31 deletions(-)
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index ca25e32..c87ce15 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -225,6 +225,9 @@
 	if (mlx5_rxq_check_vec_support(&rxq_ctrl->rxq) > 0) {
 		struct mlx5_rxq_data *rxq = &rxq_ctrl->rxq;
 		struct rte_mbuf *mbuf_init = &rxq->fake_mbuf;
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(rxq_ctrl->rxq.mp);
 		int j;
 
 		/* Initialize default rearm_data for vPMD. */
@@ -232,13 +235,15 @@
 		rte_mbuf_refcnt_set(mbuf_init, 1);
 		mbuf_init->nb_segs = 1;
 		mbuf_init->port = rxq->port_id;
+		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)
+			mbuf_init->ol_flags = EXT_ATTACHED_MBUF;
 		/*
 		 * prevent compiler reordering:
 		 * rearm_data covers previous fields.
 		 */
 		rte_compiler_barrier();
 		rxq->mbuf_initializer =
-			*(uint64_t *)&mbuf_init->rearm_data;
+			*(rte_xmm_t *)&mbuf_init->rearm_data;
 		/* Padding with a fake mbuf for vectorized Rx. */
 		for (j = 0; j < MLX5_VPMD_DESCS_PER_LOOP; ++j)
 			(*rxq->elts)[elts_n + j] = &rxq->fake_mbuf;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index b11c5eb..fdc7529 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1337,7 +1337,7 @@ enum mlx5_txcmp_code {
 			}
 			pkt = seg;
 			assert(len >= (rxq->crc_present << 2));
-			pkt->ol_flags = 0;
+			pkt->ol_flags &= EXT_ATTACHED_MBUF;
 			/* If compressed, take hash result from mini-CQE. */
 			rss_hash_res = rte_be_to_cpu_32(mcqe == NULL ?
 							cqe->rx_hash_res :
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index e362b4a..24fa038 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -144,7 +144,7 @@ struct mlx5_rxq_data {
 	struct mlx5_mprq_buf *mprq_repl; /* Stashed mbuf for replenish. */
 	uint16_t idx; /* Queue index. */
 	struct mlx5_rxq_stats stats;
-	uint64_t mbuf_initializer; /* Default rearm_data for vectorized Rx. */
+	rte_xmm_t mbuf_initializer; /* Default rearm/flags for vectorized Rx. */
 	struct rte_mbuf fake_mbuf; /* elts padding for vectorized Rx. */
 	void *cq_uar; /* CQ user access region. */
 	uint32_t cqn; /* CQ number. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index 85e0bd5..d8c07f2 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -97,18 +97,12 @@
 		void *buf_addr;
 
 		/*
-		 * Load the virtual address for Rx WQE. non-x86 processors
-		 * (mostly RISC such as ARM and Power) are more vulnerable to
-		 * load stall. For x86, reducing the number of instructions
-		 * seems to matter most.
+		 * To support mbufs with externally attached data buffers
+		 * we must read the buf_addr pointer instead of calling
+		 * rte_mbuf_buf_addr(). Reading the pointer touches the
+		 * mbuf itself and may impact performance.
 		 */
-#ifdef RTE_ARCH_X86_64
 		buf_addr = elts[i]->buf_addr;
-		assert(buf_addr == rte_mbuf_buf_addr(elts[i], rxq->mp));
-#else
-		buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
-		assert(buf_addr == elts[i]->buf_addr);
-#endif
 		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
 					      RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKey in WQE. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 8e79883..9e5c6ee 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -344,9 +344,8 @@
 		PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 		PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED};
 	const vector unsigned char mbuf_init =
-		(vector unsigned char)(vector unsigned long){
-		*(__attribute__((__aligned__(8))) unsigned long *)
-		&rxq->mbuf_initializer, 0LL};
+		(vector unsigned char)vec_vsx_ld
+			(0, (vector unsigned char *)&rxq->mbuf_initializer);
 	const vector unsigned short rearm_sel_mask =
 		(vector unsigned short){0, 0, 0, 0, 0xffff, 0xffff, 0, 0};
 	vector unsigned char rearm0, rearm1, rearm2, rearm3;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 86785c7..332e9ac 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -264,8 +264,8 @@
 	const uint32x4_t cv_mask =
 		vdupq_n_u32(PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			    PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
-	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
-	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
+	const uint64x2_t mbuf_init = vld1q_u64
+				((const uint64_t *)&rxq->mbuf_initializer);
 	uint64x2_t rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
@@ -326,18 +326,19 @@
 	/* Merge to ol_flags. */
 	ol_flags = vorrq_u32(ol_flags, cv_flags);
 	/* Merge mbuf_init and ol_flags, and store. */
-	rearm0 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_high_u64(vreinterpretq_u64_u32(
-						       ol_flags)), 32));
-	rearm1 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_high_u64(vreinterpretq_u64_u32(
-						     ol_flags)), r32_mask));
-	rearm2 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_low_u64(vreinterpretq_u64_u32(
-						      ol_flags)), 32));
-	rearm3 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_low_u64(vreinterpretq_u64_u32(
-						    ol_flags)), r32_mask));
+	rearm0 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 3),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm1 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 2),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm2 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 1),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm3 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 0),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+
 	vst1q_u64((void *)&pkts[0]->rearm_data, rearm0);
 	vst1q_u64((void *)&pkts[1]->rearm_data, rearm1);
 	vst1q_u64((void *)&pkts[2]->rearm_data, rearm2);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 35b7761..07d40d5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -259,7 +259,7 @@
 			      PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			      PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
 	const __m128i mbuf_init =
-		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
+		_mm_load_si128((__m128i *)&rxq->mbuf_initializer);
 	__m128i rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
-- 
1.8.3.1
* [dpdk-dev] [PATCH v3 0/4] mbuf: detach mbuf with pinned external buffer
  2019-11-18  9:50 [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers Shahaf Shuler
                   ` (2 preceding siblings ...)
  2020-01-14  7:49 ` [dpdk-dev] [PATCH v2 0/4] mbuf: introduce pktmbuf pool with pinned external buffers Viacheslav Ovsiienko
@ 2020-01-14  9:15 ` Viacheslav Ovsiienko
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 1/4] " Viacheslav Ovsiienko
                     ` (3 more replies)
  2020-01-16 13:04 ` [dpdk-dev] [PATCH v4 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
                   ` (4 subsequent siblings)
  8 siblings, 4 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-14  9:15 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen
Today's pktmbuf pool contains only mbufs with no external buffers.
This means the data buffer for each mbuf is placed right after the
mbuf structure (plus the private data, when enabled).
In some cases, the application wants the buffers to be allocated
from a different device in the platform, in order to do zero-copy
of packets directly to the device memory. Examples of such devices
are GPUs or storage devices. For such cases the native pktmbuf
pool does not fit, since each mbuf would need to point to an
external buffer.
To support the above, the pktmbuf pool is populated with mbufs
pointing to the device buffers, using the mbuf external buffer feature.
The PMD populates its receive queues with those buffers, so that
every received packet is scattered directly to the device memory.
In the other direction, embedding the buffer pointer in the transmit
queues of the NIC makes the DMA engine fetch device memory
using peer-to-peer communication.
An mbuf with such an external buffer must be handled with care when
it is freed. Mainly, the external buffer must not be detached, so
that it can be reused for the next received packet.
This patch introduces a new flag in the rte_pktmbuf_pool_private
structure to specify that the mempool is for mbufs with pinned external
buffers. On detach this flag is checked and the buffer is not detached.
A new mempool create wrapper is also introduced to help applications
create and populate such a mempool.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
RFC: http://patches.dpdk.org/patch/63077
v1: http://patches.dpdk.org/cover/64424
v2: - fix rte_experimental issue on comment addressing
    - rte_mbuf_has_pinned_extbuf return type is uint32_t
    - fix Power9 compilation issue
v3: - fix "#include <stdbool.h>" leftover
Viacheslav Ovsiienko (4):
  mbuf: detach mbuf with pinned external buffer
  mbuf: create packet pool with external memory buffers
  app/testpmd: add mempool with external data buffers
  net/mlx5: allow use allocated mbuf with external buffer
 app/test-pmd/config.c                    |   2 +
 app/test-pmd/flowgen.c                   |   3 +-
 app/test-pmd/parameters.c                |   2 +
 app/test-pmd/testpmd.c                   |  81 +++++++++++++++++
 app/test-pmd/testpmd.h                   |   4 +-
 app/test-pmd/txonly.c                    |   3 +-
 drivers/net/mlx5/mlx5_rxq.c              |   7 +-
 drivers/net/mlx5/mlx5_rxtx.c             |   2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |   2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         |  14 +--
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |   5 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    |  29 +++---
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |   2 +-
 lib/librte_mbuf/rte_mbuf.c               | 144 ++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h               | 150 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_version.map     |   1 +
 16 files changed, 410 insertions(+), 41 deletions(-)
-- 
1.8.3.1
* [dpdk-dev] [PATCH v3 1/4] mbuf: detach mbuf with pinned external buffer
  2020-01-14  9:15 ` [dpdk-dev] [PATCH v3 0/4] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
@ 2020-01-14  9:15   ` Viacheslav Ovsiienko
  2020-01-14 15:27     ` Olivier Matz
  2020-01-14 15:50     ` Stephen Hemminger
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 2/4] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-14  9:15 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen
Update detach routine to check the mbuf pool type.
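The decision the updated detach routine makes can be sketched as follows. This
is a simplified model under stated assumptions: the sketch_* types and flag
values are hypothetical, and the ordinary (non-pinned) path is reduced to
clearing the buffer pointer instead of restoring the embedded buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for EXT_ATTACHED_MBUF and the pinned pool flag (assumed values). */
#define SKETCH_EXT_ATTACHED  (1ULL << 61)
#define SKETCH_POOL_F_PINNED (1u << 0)

/* Hypothetical pool private data: only the flags word is modelled. */
struct sketch_pool {
	uint32_t flags;
};

/* Hypothetical, reduced mbuf. */
struct sketch_mbuf {
	struct sketch_pool *pool;
	void *buf_addr;
	uint64_t ol_flags;
};

/* Returns 1 if the buffer stayed attached (pinned), 0 if it was detached. */
static int
sketch_detach(struct sketch_mbuf *m)
{
	assert(m != NULL && m->pool != NULL);
	if ((m->ol_flags & SKETCH_EXT_ATTACHED) &&
	    (m->pool->flags & SKETCH_POOL_F_PINNED)) {
		/* Pinned: keep the buffer, reset flags to the attachment bit. */
		m->ol_flags = SKETCH_EXT_ATTACHED;
		return 1;
	}
	/* Not pinned: drop the buffer reference and clear all flags. */
	m->buf_addr = NULL;
	m->ol_flags = 0;
	return 0;
}
```

The point of the early return is that the mbuf goes back to the pool still
pointing at its external buffer, ready for the next receive.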
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 lib/librte_mbuf/rte_mbuf.h | 64 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 60 insertions(+), 4 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 219b110..8f486af 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -306,6 +306,46 @@ struct rte_pktmbuf_pool_private {
 	uint32_t flags; /**< reserved for future use. */
 };
 
+/**
+ * When set, the pktmbuf mempool will hold only mbufs with pinned external
+ * buffers. The external buffer is attached to the mbuf at memory pool
+ * creation and is never detached by the mbuf free calls. The mbuf
+ * should not contain any room for data after the mbuf structure.
+ */
+#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
+
+/**
+ * Returns TRUE if the given mbuf has a pinned external buffer, or FALSE
+ * otherwise. The pinned external buffer is allocated at pool creation
+ * time and should not be freed.
+ *
+ * External buffer is a user-provided anonymous buffer.
+ */
+#ifdef ALLOW_EXPERIMENTAL_API
+#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) rte_mbuf_has_pinned_extbuf(mb)
+#else
+#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) false
+#endif
+
+__rte_experimental
+static inline uint32_t
+rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m)
+{
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		/*
+		 * The mbuf has the external attached buffer,
+		 * we should check the type of the memory pool where
+		 * the mbuf was allocated from.
+		 */
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(m->pool);
+
+		return priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
+	}
+	return 0;
+}
+
 #ifdef RTE_LIBRTE_MBUF_DEBUG
 
 /**  check mbuf type in debug mode */
@@ -571,7 +611,8 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 static __rte_always_inline void
 rte_mbuf_raw_free(struct rte_mbuf *m)
 {
-	RTE_ASSERT(RTE_MBUF_DIRECT(m));
+	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
+		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
 	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
 	RTE_ASSERT(m->next == NULL);
 	RTE_ASSERT(m->nb_segs == 1);
@@ -1141,11 +1182,26 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 	uint32_t mbuf_size, buf_len;
 	uint16_t priv_size;
 
-	if (RTE_MBUF_HAS_EXTBUF(m))
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		/*
+		 * The mbuf has the external attached buffer,
+		 * we should check the type of the memory pool where
+		 * the mbuf was allocated from to detect the pinned
+		 * external buffer.
+		 */
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(mp);
+
+		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
+			RTE_ASSERT(m->shinfo == NULL);
+			m->ol_flags = EXT_ATTACHED_MBUF;
+			return;
+		}
 		__rte_pktmbuf_free_extbuf(m);
-	else
+	} else {
 		__rte_pktmbuf_free_direct(m);
-
+	}
 	priv_size = rte_pktmbuf_priv_size(mp);
 	mbuf_size = (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
 	buf_len = rte_pktmbuf_data_room_size(mp);
-- 
1.8.3.1
* [dpdk-dev] [PATCH v3 2/4] mbuf: create packet pool with external memory buffers
  2020-01-14  9:15 ` [dpdk-dev] [PATCH v3 0/4] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 1/4] " Viacheslav Ovsiienko
@ 2020-01-14  9:15   ` Viacheslav Ovsiienko
  2020-01-14 16:04     ` Olivier Matz
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 3/4] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 4/4] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
  3 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-14  9:15 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen
The dedicated routine rte_pktmbuf_pool_create_extbuf() is
provided to create an mbuf pool with data buffers located in
pinned external memory. The application provides the
external memory description and the routine initialises each
mbuf with the appropriate virtual and physical buffer addresses.
It is entirely the application's responsibility to register
the external memory with the rte_extmem_register() API, map this
memory, etc.
The newly introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
is set in the private pool structure, specifying the new special
pool type. Mbufs allocated from a pool of this kind will
have the EXT_ATTACHED_MBUF flag set and a NULL shared info
pointer, because the external buffers are not supposed to be
freed and sharing management is not needed. Also, these
mbufs cannot be attached to other mbufs (they are not intended
to be indirect).
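How the external memory description is turned into per-mbuf buffer addresses
can be sketched without DPDK. The structures below are reduced, hypothetical
versions of the descriptor and iteration context (the IOVA is omitted). The
sketch moves to the next segment as soon as another whole element would not
fit, a conservative wrap condition chosen here so that the walk agrees with
the capacity computed by integer division.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical external segment descriptor. */
struct sketch_extmem {
	void *buf_ptr;     /* start of the segment */
	size_t buf_len;    /* segment length in bytes */
	uint16_t elt_size; /* bytes reserved per mbuf data buffer */
};

/* Reduced, hypothetical iteration context. */
struct sketch_ctx {
	const struct sketch_extmem *ext_mem;
	unsigned int ext_num;
	unsigned int ext; /* current segment index */
	size_t off;       /* offset inside the current segment */
};

/* Number of whole elements that all segments together can hold. */
static unsigned int
sketch_capacity(const struct sketch_extmem *ext_mem, unsigned int ext_num)
{
	unsigned int i, n = 0;

	for (i = 0; i < ext_num; i++)
		n += (unsigned int)(ext_mem[i].buf_len / ext_mem[i].elt_size);
	return n;
}

/* Hand out the next data buffer address and advance the context. */
static void *
sketch_next_buf(struct sketch_ctx *ctx)
{
	const struct sketch_extmem *seg;
	void *addr;

	assert(ctx->ext < ctx->ext_num);
	seg = &ctx->ext_mem[ctx->ext];
	addr = (uint8_t *)seg->buf_ptr + ctx->off;
	ctx->off += seg->elt_size;
	/* Switch segments when one more whole element would not fit. */
	if (ctx->off + seg->elt_size > seg->buf_len) {
		ctx->off = 0;
		ctx->ext++;
	}
	return addr;
}
```

A pool of n mbufs would call sketch_next_buf() n times, once per object, after
checking that sketch_capacity() is at least n.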
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 lib/librte_mbuf/rte_mbuf.c           | 144 ++++++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h           |  86 ++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf_version.map |   1 +
 3 files changed, 228 insertions(+), 3 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8fa7f49..d151469 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -59,9 +59,9 @@
 	}
 
 	RTE_ASSERT(mp->elt_size >= sizeof(struct rte_mbuf) +
-		user_mbp_priv->mbuf_data_room_size +
+		((user_mbp_priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
+		0 : user_mbp_priv->mbuf_data_room_size) +
 		user_mbp_priv->mbuf_priv_size);
-	RTE_ASSERT(user_mbp_priv->flags == 0);
 
 	mbp_priv = rte_mempool_get_priv(mp);
 	memcpy(mbp_priv, user_mbp_priv, sizeof(*mbp_priv));
@@ -107,6 +107,63 @@
 	m->next = NULL;
 }
 
+/*
+ * pktmbuf constructor for the pool with pinned external buffer,
+ * given as a callback function to rte_mempool_obj_iter() in
+ * rte_pktmbuf_pool_create_extbuf(). Set the fields of a packet
+ * mbuf to their default values.
+ */
+void
+rte_pktmbuf_init_extmem(struct rte_mempool *mp,
+			void *opaque_arg,
+			void *_m,
+			__attribute__((unused)) unsigned int i)
+{
+	struct rte_mbuf *m = _m;
+	struct rte_pktmbuf_extmem_init_ctx *ctx = opaque_arg;
+	struct rte_pktmbuf_extmem *ext_mem;
+	uint32_t mbuf_size, buf_len, priv_size;
+
+	priv_size = rte_pktmbuf_priv_size(mp);
+	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
+	buf_len = rte_pktmbuf_data_room_size(mp);
+
+	RTE_ASSERT(RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) == priv_size);
+	RTE_ASSERT(mp->elt_size >= mbuf_size);
+	RTE_ASSERT(buf_len <= UINT16_MAX);
+
+	memset(m, 0, mbuf_size);
+	m->priv_size = priv_size;
+	m->buf_len = (uint16_t)buf_len;
+
+	/* set the data buffer pointers to external memory */
+	ext_mem = ctx->ext_mem + ctx->ext;
+
+	RTE_ASSERT(ctx->ext < ctx->ext_num);
+	RTE_ASSERT(ctx->off < ext_mem->buf_len);
+
+	m->buf_addr = RTE_PTR_ADD(ext_mem->buf_ptr, ctx->off);
+	m->buf_iova = ext_mem->buf_iova == RTE_BAD_IOVA ?
+		      RTE_BAD_IOVA : (ext_mem->buf_iova + ctx->off);
+
+	ctx->off += ext_mem->elt_size;
+	if (ctx->off + ext_mem->elt_size > ext_mem->buf_len) {
+		ctx->off = 0;
+		++ctx->ext;
+	}
+	/* keep some headroom between start of buffer and data */
+	m->data_off = RTE_MIN(RTE_PKTMBUF_HEADROOM, (uint16_t)m->buf_len);
+
+	/* init some constant fields */
+	m->pool = mp;
+	m->nb_segs = 1;
+	m->port = MBUF_INVALID_PORT;
+	m->ol_flags = EXT_ATTACHED_MBUF;
+	rte_mbuf_refcnt_set(m, 1);
+	m->next = NULL;
+}
+
+
 /* Helper to create a mbuf pool with given mempool ops name*/
 struct rte_mempool *
 rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
@@ -169,6 +226,89 @@ struct rte_mempool *
 			data_room_size, socket_id, NULL);
 }
 
+/* Helper to create a mbuf pool with pinned external data buffers. */
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num)
+{
+	struct rte_mempool *mp;
+	struct rte_pktmbuf_pool_private mbp_priv;
+	struct rte_pktmbuf_extmem_init_ctx init_ctx;
+	const char *mp_ops_name;
+	unsigned int elt_size;
+	unsigned int i, n_elts = 0;
+	int ret;
+
+	if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
+		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
+			priv_size);
+		rte_errno = EINVAL;
+		return NULL;
+	}
+	/* Check the external memory descriptors. */
+	for (i = 0; i < ext_num; i++) {
+		struct rte_pktmbuf_extmem *extm = ext_mem + i;
+
+		if (!extm->elt_size || !extm->buf_len || !extm->buf_ptr) {
+			RTE_LOG(ERR, MBUF, "invalid extmem descriptor\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		if (data_room_size > extm->elt_size) {
+			RTE_LOG(ERR, MBUF, "ext elt_size=%u is too small\n",
+				extm->elt_size);
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		n_elts += extm->buf_len / extm->elt_size;
+	}
+	/* Check whether enough external memory provided. */
+	if (n_elts < n) {
+		RTE_LOG(ERR, MBUF, "not enough extmem\n");
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+	elt_size = sizeof(struct rte_mbuf) + (unsigned int)priv_size;
+	memset(&mbp_priv, 0, sizeof(mbp_priv));
+	mbp_priv.mbuf_data_room_size = data_room_size;
+	mbp_priv.mbuf_priv_size = priv_size;
+	mbp_priv.flags = RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
+
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		 sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
+	if (mp == NULL)
+		return NULL;
+
+	mp_ops_name = rte_mbuf_best_mempool_ops();
+	ret = rte_mempool_set_ops_byname(mp, mp_ops_name, NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+	rte_pktmbuf_pool_init(mp, &mbp_priv);
+
+	ret = rte_mempool_populate_default(mp);
+	if (ret < 0) {
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+
+	init_ctx = (struct rte_pktmbuf_extmem_init_ctx){
+		.ext_mem = ext_mem,
+		.ext_num = ext_num,
+		.ext = 0,
+		.off = 0,
+	};
+	rte_mempool_obj_iter(mp, rte_pktmbuf_init_extmem, &init_ctx);
+
+	return mp;
+}
+
 /* do some sanity checks on a mbuf: panic if it fails */
 void
 rte_mbuf_sanity_check(const struct rte_mbuf *m, int is_header)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 8f486af..7bde297 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -642,6 +642,34 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 void rte_pktmbuf_init(struct rte_mempool *mp, void *opaque_arg,
 		      void *m, unsigned i);
 
+/** The context to initialize the mbufs with pinned external buffers. */
+struct rte_pktmbuf_extmem_init_ctx {
+	struct rte_pktmbuf_extmem *ext_mem; /* pointer to descriptor array. */
+	unsigned int ext_num; /* number of descriptors in array. */
+	unsigned int ext; /* loop descriptor index. */
+	size_t off; /* loop buffer offset. */
+};
+
+/**
+ * The packet mbuf constructor for pools with pinned external memory.
+ *
+ * This function initializes some fields in the mbuf structure that are
+ * not modified by the user once created (origin pool, buffer start
+ * address, and so on). This function is given as a callback function to
+ * rte_mempool_obj_iter() called from rte_pktmbuf_pool_create_extbuf().
+ *
+ * @param mp
+ *   The mempool from which mbufs originate.
+ * @param opaque_arg
+ *   A pointer to the rte_pktmbuf_extmem_init_ctx - initialization
+ *   context structure
+ * @param m
+ *   The mbuf to initialize.
+ * @param i
+ *   The index of the mbuf in the pool table.
+ */
+void rte_pktmbuf_init_extmem(struct rte_mempool *mp, void *opaque_arg,
+			     void *m, unsigned int i);
 
 /**
  * A  packet mbuf pool constructor.
@@ -743,6 +771,62 @@ struct rte_mempool *
 	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
 	int socket_id, const char *ops_name);
 
+/** A structure that describes the pinned external buffer segment. */
+struct rte_pktmbuf_extmem {
+	void *buf_ptr;		/**< The virtual address of data buffer. */
+	rte_iova_t buf_iova;	/**< The IO address of the data buffer. */
+	size_t buf_len;		/**< External buffer length in bytes. */
+	uint16_t elt_size;	/**< mbuf element size in bytes. */
+};
+
+/**
+ * Create a mbuf pool with external pinned data buffers.
+ *
+ * This function creates and initializes a packet mbuf pool that contains
+ * only mbufs with external buffer. It is a wrapper to rte_mempool functions.
+ *
+ * @param name
+ *   The name of the mbuf pool.
+ * @param n
+ *   The number of elements in the mbuf pool. The optimum size (in terms
+ *   of memory usage) for a mempool is when n is a power of two minus one:
+ *   n = (2^q - 1).
+ * @param cache_size
+ *   Size of the per-core object cache. See rte_mempool_create() for
+ *   details.
+ * @param priv_size
+ *   Size of application private area between the rte_mbuf structure
+ *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
+ * @param data_room_size
+ *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
+ * @param socket_id
+ *   The socket identifier where the memory should be allocated. The
+ *   value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the
+ *   reserved zone.
+ * @param ext_mem
+ *   Pointer to the array of structures describing the external memory
+ *   for data buffers. It is the caller's responsibility to register
+ *   this memory with rte_extmem_register() (if needed), map this memory
+ *   to the appropriate physical device, etc.
+ * @param ext_num
+ *   Number of elements in the ext_mem array.
+ * @return
+ *   The pointer to the new allocated mempool, on success. NULL on error
+ *   with rte_errno set appropriately. Possible rte_errno values include:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ */
+__rte_experimental
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num);
+
 /**
  * Get the data room size of mbufs stored in a pktmbuf_pool
  *
@@ -818,7 +902,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
 	m->nb_segs = 1;
 	m->port = MBUF_INVALID_PORT;
 
-	m->ol_flags = 0;
+	m->ol_flags &= EXT_ATTACHED_MBUF;
 	m->packet_type = 0;
 	rte_pktmbuf_reset_headroom(m);
 
diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
index 3bbb476..ab161bc 100644
--- a/lib/librte_mbuf/rte_mbuf_version.map
+++ b/lib/librte_mbuf/rte_mbuf_version.map
@@ -44,5 +44,6 @@ EXPERIMENTAL {
 	rte_mbuf_dyn_dump;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
+	rte_pktmbuf_pool_create_extbuf;
 
 };
-- 
1.8.3.1
* [dpdk-dev] [PATCH v3 3/4] app/testpmd: add mempool with external data buffers
  2020-01-14  9:15 ` [dpdk-dev] [PATCH v3 0/4] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 1/4] " Viacheslav Ovsiienko
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 2/4] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
@ 2020-01-14  9:15   ` Viacheslav Ovsiienko
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 4/4] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
  3 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-14  9:15 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen
A new mbuf pool type is added to testpmd. To use an mbuf pool
with externally attached data buffers, specify the
"--mp-alloc=xbuf" parameter on the testpmd command line.
The objective of this patch is just to test whether an mbuf pool
with externally attached data buffers works. The memory for the
data buffers is allocated from DPDK memory, so this is not
"true" external memory from some physical device (which is
expected to be the most common use case for this kind of mbuf pool).
The user should be aware that not all drivers support mbufs
with the EXT_ATTACHED_MBUF flag set in newly allocated mbufs (many
PMDs simply overwrite the ol_flags field, so the flag value is
lost).
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 app/test-pmd/config.c     |  2 ++
 app/test-pmd/flowgen.c    |  3 +-
 app/test-pmd/parameters.c |  2 ++
 app/test-pmd/testpmd.c    | 81 +++++++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/testpmd.h    |  4 ++-
 app/test-pmd/txonly.c     |  3 +-
 6 files changed, 92 insertions(+), 3 deletions(-)
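Note for reviewers (not part of the patch): the sizing arithmetic used by setup_extbuf() and the capacity check in rte_pktmbuf_pool_create_extbuf() can be sketched in isolation. This is a minimal model under stated assumptions: every sketch_* name is hypothetical, the 64-byte cache line is an assumed value, and the struct is a reduced stand-in for struct rte_pktmbuf_extmem with the address fields left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the constants used by setup_extbuf(); values are assumed. */
#define SKETCH_CACHE_LINE_SIZE 64u
#define SKETCH_ZONE_SIZE (2u * 1024u * 1024u) /* mirrors EXTBUF_ZONE_SIZE */

/* Reduced stand-in for struct rte_pktmbuf_extmem (addresses omitted). */
struct sketch_extmem {
	size_t buf_len;    /* zone length in bytes */
	uint16_t elt_size; /* size of one data buffer in bytes */
};

/* Round v up to a multiple of align (align must be a power of two). */
static unsigned int
sketch_align_ceil(unsigned int v, unsigned int align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (v + align - 1) & ~(align - 1);
}

/* Number of zones needed to hold nb_mbufs buffers of mbuf_sz bytes each. */
static unsigned int
sketch_zone_count(unsigned int nb_mbufs, uint16_t mbuf_sz)
{
	unsigned int elt_size, elt_num;

	assert(mbuf_sz != 0);
	elt_size = sketch_align_ceil(mbuf_sz, SKETCH_CACHE_LINE_SIZE);
	elt_num = SKETCH_ZONE_SIZE / elt_size; /* buffers per zone */
	assert(elt_num != 0);
	return (nb_mbufs + elt_num - 1) / elt_num; /* round up */
}

/* Total buffers the descriptors can supply, summed per descriptor. */
static unsigned int
sketch_total_elts(const struct sketch_extmem *ext, unsigned int ext_num)
{
	unsigned int i, n_elts = 0;

	for (i = 0; i < ext_num; i++) {
		assert(ext[i].elt_size != 0);
		n_elts += (unsigned int)(ext[i].buf_len / ext[i].elt_size);
	}
	return n_elts;
}
```

The pool helper rejects the request when the total computed this way is smaller than the requested number of mbufs, which is why the zone count is rounded up rather than down.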
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 9da1ffb..5c6fe18 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2395,6 +2395,8 @@ struct igb_ring_desc_16_bytes {
 		return "xmem";
 	case MP_ALLOC_XMEM_HUGE:
 		return "xmemhuge";
+	case MP_ALLOC_XBUF:
+		return "xbuf";
 	default:
 		return "invalid";
 	}
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index 03b72aa..ae50cdc 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -199,7 +199,8 @@
 							   sizeof(*ip_hdr));
 		pkt->nb_segs		= 1;
 		pkt->pkt_len		= pkt_size;
-		pkt->ol_flags		= ol_flags;
+		pkt->ol_flags		&= EXT_ATTACHED_MBUF;
+		pkt->ol_flags		|= ol_flags;
 		pkt->vlan_tci		= vlan_tci;
 		pkt->vlan_tci_outer	= vlan_tci_outer;
 		pkt->l2_len		= sizeof(struct rte_ether_hdr);
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 2e7a504..6340104 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -841,6 +841,8 @@
 					mp_alloc_type = MP_ALLOC_XMEM;
 				else if (!strcmp(optarg, "xmemhuge"))
 					mp_alloc_type = MP_ALLOC_XMEM_HUGE;
+				else if (!strcmp(optarg, "xbuf"))
+					mp_alloc_type = MP_ALLOC_XBUF;
 				else
 					rte_exit(EXIT_FAILURE,
 						"mp-alloc %s invalid - must be: "
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 2eec8af..5f910ba 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,7 @@
 #endif
 
 #define EXTMEM_HEAP_NAME "extmem"
+#define EXTBUF_ZONE_SIZE RTE_PGSIZE_2M
 
 uint16_t verbose_level = 0; /**< Silent by default. */
 int testpmd_logtype; /**< Log type for testpmd logs */
@@ -865,6 +866,66 @@ struct extmem_param {
 	}
 }
 
+static unsigned int
+setup_extbuf(uint32_t nb_mbufs, uint16_t mbuf_sz, unsigned int socket_id,
+	    char *pool_name, struct rte_pktmbuf_extmem **ext_mem)
+{
+	struct rte_pktmbuf_extmem *xmem;
+	unsigned int ext_num, zone_num, elt_num;
+	uint16_t elt_size;
+
+	elt_size = RTE_ALIGN_CEIL(mbuf_sz, RTE_CACHE_LINE_SIZE);
+	elt_num = EXTBUF_ZONE_SIZE / elt_size;
+	zone_num = (nb_mbufs + elt_num - 1) / elt_num;
+
+	xmem = malloc(sizeof(struct rte_pktmbuf_extmem) * zone_num);
+	if (xmem == NULL) {
+		TESTPMD_LOG(ERR, "Cannot allocate memory for "
+				 "external buffer descriptors\n");
+		*ext_mem = NULL;
+		return 0;
+	}
+	for (ext_num = 0; ext_num < zone_num; ext_num++) {
+		struct rte_pktmbuf_extmem *xseg = xmem + ext_num;
+		const struct rte_memzone *mz;
+		char mz_name[RTE_MEMZONE_NAMESIZE];
+		int ret;
+
+		ret = snprintf(mz_name, sizeof(mz_name),
+			RTE_MEMPOOL_MZ_FORMAT "_xb_%u", pool_name, ext_num);
+		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+			errno = ENAMETOOLONG;
+			ext_num = 0;
+			break;
+		}
+		mz = rte_memzone_reserve_aligned(mz_name, EXTBUF_ZONE_SIZE,
+						 socket_id,
+						 RTE_MEMZONE_IOVA_CONTIG |
+						 RTE_MEMZONE_1GB |
+						 RTE_MEMZONE_SIZE_HINT_ONLY,
+						 EXTBUF_ZONE_SIZE);
+		if (mz == NULL) {
+			/*
+			 * The caller exits on external buffer creation
+			 * error, so there is no need to free memzones.
+			 */
+			errno = ENOMEM;
+			ext_num = 0;
+			break;
+		}
+		xseg->buf_ptr = mz->addr;
+		xseg->buf_iova = mz->iova;
+		xseg->buf_len = EXTBUF_ZONE_SIZE;
+		xseg->elt_size = elt_size;
+	}
+	if (ext_num == 0 && xmem != NULL) {
+		free(xmem);
+		xmem = NULL;
+	}
+	*ext_mem = xmem;
+	return ext_num;
+}
+
 /*
  * Configuration initialisation done once at init time.
  */
@@ -933,6 +994,26 @@ struct extmem_param {
 					heap_socket);
 			break;
 		}
+	case MP_ALLOC_XBUF:
+		{
+			struct rte_pktmbuf_extmem *ext_mem;
+			unsigned int ext_num;
+
+			ext_num = setup_extbuf(nb_mbuf,	mbuf_seg_size,
+					       socket_id, pool_name, &ext_mem);
+			if (ext_num == 0)
+				rte_exit(EXIT_FAILURE,
+					 "Can't create pinned data buffers\n");
+
+			TESTPMD_LOG(INFO, "preferred mempool ops selected: %s\n",
+					rte_mbuf_best_mempool_ops());
+			rte_mp = rte_pktmbuf_pool_create_extbuf
+					(pool_name, nb_mbuf, mb_mempool_cache,
+					 0, mbuf_seg_size, socket_id,
+					 ext_mem, ext_num);
+			free(ext_mem);
+			break;
+		}
 	default:
 		{
 			rte_exit(EXIT_FAILURE, "Invalid mempool creation mode\n");
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 857a11f..a47f214 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -76,8 +76,10 @@ enum {
 	/**< allocate mempool natively, but populate using anonymous memory */
 	MP_ALLOC_XMEM,
 	/**< allocate and populate mempool using anonymous memory */
-	MP_ALLOC_XMEM_HUGE
+	MP_ALLOC_XMEM_HUGE,
 	/**< allocate and populate mempool using anonymous hugepage memory */
+	MP_ALLOC_XBUF
+	/**< allocate mempool natively, use rte_pktmbuf_pool_create_extbuf */
 };
 
 #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 3caf281..871cf6c 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -170,7 +170,8 @@
 
 	rte_pktmbuf_reset_headroom(pkt);
 	pkt->data_len = tx_pkt_seg_lengths[0];
-	pkt->ol_flags = ol_flags;
+	pkt->ol_flags &= EXT_ATTACHED_MBUF;
+	pkt->ol_flags |= ol_flags;
 	pkt->vlan_tci = vlan_tci;
 	pkt->vlan_tci_outer = vlan_tci_outer;
 	pkt->l2_len = sizeof(struct rte_ether_hdr);
-- 
1.8.3.1
* [dpdk-dev] [PATCH v3 4/4] net/mlx5: allow use allocated mbuf with external buffer
  2020-01-14  9:15 ` [dpdk-dev] [PATCH v3 0/4] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 3/4] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
@ 2020-01-14  9:15   ` Viacheslav Ovsiienko
  3 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-14  9:15 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen
In the Rx datapath the flags in newly allocated mbufs
are all explicitly cleared, but the EXT_ATTACHED_MBUF flag must be
preserved. This allows the use of mbuf pools with pre-attached
external data buffers.
The vectorized rx_burst routines are updated to inherit
EXT_ATTACHED_MBUF from the RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
flag in the mbuf pool private data.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 drivers/net/mlx5/mlx5_rxq.c              |  7 ++++++-
 drivers/net/mlx5/mlx5_rxtx.c             |  2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |  2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         | 14 ++++----------
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |  5 ++---
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 29 +++++++++++++++--------------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |  2 +-
 7 files changed, 30 insertions(+), 31 deletions(-)
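Note for reviewers (not part of the patch): the scalar flag handling replaces a plain assignment with a mask-then-merge sequence. A minimal sketch of that pattern follows. The sketch_* names and the bit positions are stand-ins chosen for illustration, not the real mbuf flag definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in bit values, assumed for illustration only. */
#define SKETCH_EXT_ATTACHED (1ULL << 61)
#define SKETCH_RX_VLAN      (1ULL << 0)
#define SKETCH_RX_RSS_HASH  (1ULL << 1)

/*
 * Reset the per-packet flags of a recycled mbuf while keeping the
 * attachment bit, then merge in the flags computed for the new packet.
 */
static uint64_t
sketch_reset_rx_flags(uint64_t old_flags, uint64_t new_flags)
{
	uint64_t flags = old_flags;

	/* The new packet flags must not carry the attachment bit. */
	assert((new_flags & SKETCH_EXT_ATTACHED) == 0);

	flags &= SKETCH_EXT_ATTACHED; /* drop all but the attachment bit */
	flags |= new_flags;           /* add the flags of the new packet */
	return flags;
}
```

With a plain assignment the attachment bit would be lost on the first received packet, and a later free would treat the mbuf as one with an embedded data buffer.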
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index ca25e32..c87ce15 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -225,6 +225,9 @@
 	if (mlx5_rxq_check_vec_support(&rxq_ctrl->rxq) > 0) {
 		struct mlx5_rxq_data *rxq = &rxq_ctrl->rxq;
 		struct rte_mbuf *mbuf_init = &rxq->fake_mbuf;
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(rxq_ctrl->rxq.mp);
 		int j;
 
 		/* Initialize default rearm_data for vPMD. */
@@ -232,13 +235,15 @@
 		rte_mbuf_refcnt_set(mbuf_init, 1);
 		mbuf_init->nb_segs = 1;
 		mbuf_init->port = rxq->port_id;
+		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)
+			mbuf_init->ol_flags = EXT_ATTACHED_MBUF;
 		/*
 		 * prevent compiler reordering:
 		 * rearm_data covers previous fields.
 		 */
 		rte_compiler_barrier();
 		rxq->mbuf_initializer =
-			*(uint64_t *)&mbuf_init->rearm_data;
+			*(rte_xmm_t *)&mbuf_init->rearm_data;
 		/* Padding with a fake mbuf for vectorized Rx. */
 		for (j = 0; j < MLX5_VPMD_DESCS_PER_LOOP; ++j)
 			(*rxq->elts)[elts_n + j] = &rxq->fake_mbuf;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index b11c5eb..fdc7529 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1337,7 +1337,7 @@ enum mlx5_txcmp_code {
 			}
 			pkt = seg;
 			assert(len >= (rxq->crc_present << 2));
-			pkt->ol_flags = 0;
+			pkt->ol_flags &= EXT_ATTACHED_MBUF;
 			/* If compressed, take hash result from mini-CQE. */
 			rss_hash_res = rte_be_to_cpu_32(mcqe == NULL ?
 							cqe->rx_hash_res :
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index e362b4a..24fa038 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -144,7 +144,7 @@ struct mlx5_rxq_data {
 	struct mlx5_mprq_buf *mprq_repl; /* Stashed mbuf for replenish. */
 	uint16_t idx; /* Queue index. */
 	struct mlx5_rxq_stats stats;
-	uint64_t mbuf_initializer; /* Default rearm_data for vectorized Rx. */
+	rte_xmm_t mbuf_initializer; /* Default rearm/flags for vectorized Rx. */
 	struct rte_mbuf fake_mbuf; /* elts padding for vectorized Rx. */
 	void *cq_uar; /* CQ user access region. */
 	uint32_t cqn; /* CQ number. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index 85e0bd5..d8c07f2 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -97,18 +97,12 @@
 		void *buf_addr;
 
 		/*
-		 * Load the virtual address for Rx WQE. non-x86 processors
-		 * (mostly RISC such as ARM and Power) are more vulnerable to
-		 * load stall. For x86, reducing the number of instructions
-		 * seems to matter most.
+		 * In order to support mbufs with externally attached
+		 * data buffers we should use the buf_addr pointer instead of
+		 * rte_mbuf_buf_addr(). Reading buf_addr touches the mbuf
+		 * itself and may impact performance.
 		 */
-#ifdef RTE_ARCH_X86_64
 		buf_addr = elts[i]->buf_addr;
-		assert(buf_addr == rte_mbuf_buf_addr(elts[i], rxq->mp));
-#else
-		buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
-		assert(buf_addr == elts[i]->buf_addr);
-#endif
 		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
 					      RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKey in WQE. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 8e79883..9e5c6ee 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -344,9 +344,8 @@
 		PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 		PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED};
 	const vector unsigned char mbuf_init =
-		(vector unsigned char)(vector unsigned long){
-		*(__attribute__((__aligned__(8))) unsigned long *)
-		&rxq->mbuf_initializer, 0LL};
+		(vector unsigned char)vec_vsx_ld
+			(0, (vector unsigned char *)&rxq->mbuf_initializer);
 	const vector unsigned short rearm_sel_mask =
 		(vector unsigned short){0, 0, 0, 0, 0xffff, 0xffff, 0, 0};
 	vector unsigned char rearm0, rearm1, rearm2, rearm3;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 86785c7..332e9ac 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -264,8 +264,8 @@
 	const uint32x4_t cv_mask =
 		vdupq_n_u32(PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			    PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
-	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
-	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
+	const uint64x2_t mbuf_init = vld1q_u64
+				((const uint64_t *)&rxq->mbuf_initializer);
 	uint64x2_t rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
@@ -326,18 +326,19 @@
 	/* Merge to ol_flags. */
 	ol_flags = vorrq_u32(ol_flags, cv_flags);
 	/* Merge mbuf_init and ol_flags, and store. */
-	rearm0 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_high_u64(vreinterpretq_u64_u32(
-						       ol_flags)), 32));
-	rearm1 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_high_u64(vreinterpretq_u64_u32(
-						     ol_flags)), r32_mask));
-	rearm2 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_low_u64(vreinterpretq_u64_u32(
-						      ol_flags)), 32));
-	rearm3 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_low_u64(vreinterpretq_u64_u32(
-						    ol_flags)), r32_mask));
+	rearm0 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 3),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm1 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 2),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm2 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 1),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm3 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 0),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+
 	vst1q_u64((void *)&pkts[0]->rearm_data, rearm0);
 	vst1q_u64((void *)&pkts[1]->rearm_data, rearm1);
 	vst1q_u64((void *)&pkts[2]->rearm_data, rearm2);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 35b7761..07d40d5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -259,7 +259,7 @@
 			      PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			      PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
 	const __m128i mbuf_init =
-		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
+		_mm_load_si128((__m128i *)&rxq->mbuf_initializer);
 	__m128i rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
-- 
1.8.3.1
* Re: [dpdk-dev] [PATCH v3 1/4] mbuf: detach mbuf with pinned external buffer
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 1/4] " Viacheslav Ovsiienko
@ 2020-01-14 15:27     ` Olivier Matz
  2020-01-15 12:52       ` Slava Ovsiienko
  2020-01-14 15:50     ` Stephen Hemminger
  1 sibling, 1 reply; 77+ messages in thread
From: Olivier Matz @ 2020-01-14 15:27 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, orika, shahafs, stephen
Hi Viacheslav,
Please see some comments below.
On Tue, Jan 14, 2020 at 09:15:02AM +0000, Viacheslav Ovsiienko wrote:
> Update detach routine to check the mbuf pool type.
> 
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
>  lib/librte_mbuf/rte_mbuf.h | 64 +++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 60 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 219b110..8f486af 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -306,6 +306,46 @@ struct rte_pktmbuf_pool_private {
>  	uint32_t flags; /**< reserved for future use. */
>  };
>  
> +/**
> + * When set pktmbuf mempool will hold only mbufs with pinned external
> + * buffer. The external buffer will be attached on the mbuf at the
> + * memory pool creation and will never be detached by the mbuf free calls.
> + * mbuf should not contain any room for data after the mbuf structure.
> + */
> +#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
Out of curiosity, why use a pool flag instead of flagging the mbufs?
The reason I see is that adding a new m->flag would impact places where
we copy or reset the flags (it should be excluded). Is there another
reason?
> +/**
> + * Returns TRUE if given mbuf has a pinned external buffer, or FALSE
> + * otherwise. The pinned external buffer is allocated at pool creation
> + * time and should not be freed.
> + *
> + * External buffer is a user-provided anonymous buffer.
> + */
> +#ifdef ALLOW_EXPERIMENTAL_API
> +#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) rte_mbuf_has_pinned_extbuf(mb)
> +#else
> +#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) false
> +#endif
I suppose you added these lines because the compilation was broken after
introducing the new __rte_experimental API, which is called from detach().
I find it a bit strange that this is required. I don't see what would
be broken without the ifdef: an application compiled for 19.11 cannot use
the pinned-ext-buf feature (because it did not exist), so the modification
looks safe to me.
> +
> +__rte_experimental
> +static inline uint32_t
I don't think uint32_t is really better than uint64_t. I agree with Stephen
that bool is the preferred choice, however if it breaks compilation in some
cases, I think int is better.
> +rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m)
> +{
> +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> +		/*
> +		 * The mbuf has the external attached buffer,
> +		 * we should check the type of the memory pool where
> +		 * the mbuf was allocated from.
> +		 */
> +		struct rte_pktmbuf_pool_private *priv =
> +			(struct rte_pktmbuf_pool_private *)
> +				rte_mempool_get_priv(m->pool);
> +
> +		return priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
What about introducing a rte_pktmbuf_priv_flags() function, on the
same model as rte_pktmbuf_priv_size() or rte_pktmbuf_data_room_size()?
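A rough sketch of what such an accessor could look like. It uses simplified stand-in types rather than the real rte_mempool layout, and all sketch_* names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF. */
#define SKETCH_POOL_F_PINNED_EXT_BUF (1u << 0)

/* Reduced stand-in for struct rte_pktmbuf_pool_private. */
struct sketch_pool_private {
	uint16_t mbuf_data_room_size;
	uint16_t mbuf_priv_size;
	uint32_t flags;
};

/* Reduced stand-in for a mempool; only the private area is modelled. */
struct sketch_mempool {
	struct sketch_pool_private priv;
};

/* Accessor in the style of rte_pktmbuf_priv_size(): return pool flags. */
static inline uint32_t
sketch_pktmbuf_priv_flags(const struct sketch_mempool *mp)
{
	assert(mp != NULL);
	return mp->priv.flags;
}

/* Callers then test the flag without touching the private struct. */
static inline int
sketch_pool_is_pinned(const struct sketch_mempool *mp)
{
	return (sketch_pktmbuf_priv_flags(mp) &
		SKETCH_POOL_F_PINNED_EXT_BUF) != 0;
}
```

The point of the accessor is that both has_pinned_extbuf() and detach() could share it instead of repeating the cast of the private area.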
> +	}
> +	return 0;
> +}
> +
>  #ifdef RTE_LIBRTE_MBUF_DEBUG
>  
>  /**  check mbuf type in debug mode */
> @@ -571,7 +611,8 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
>  static __rte_always_inline void
>  rte_mbuf_raw_free(struct rte_mbuf *m)
>  {
> -	RTE_ASSERT(RTE_MBUF_DIRECT(m));
> +	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
> +		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
>  	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
>  	RTE_ASSERT(m->next == NULL);
>  	RTE_ASSERT(m->nb_segs == 1);
> @@ -1141,11 +1182,26 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
>  	uint32_t mbuf_size, buf_len;
>  	uint16_t priv_size;
>  
> -	if (RTE_MBUF_HAS_EXTBUF(m))
> +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> +		/*
> +		 * The mbuf has the external attached buffer,
> +		 * we should check the type of the memory pool where
> +		 * the mbuf was allocated from to detect the pinned
> +		 * external buffer.
> +		 */
> +		struct rte_pktmbuf_pool_private *priv =
> +			(struct rte_pktmbuf_pool_private *)
> +				rte_mempool_get_priv(mp);
> +
It could be rte_pktmbuf_priv_flags() as said above.
> +		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
> +			RTE_ASSERT(m->shinfo == NULL);
> +			m->ol_flags = EXT_ATTACHED_MBUF;
> +			return;
> +		}
I think it is not possible to have m->shinfo == NULL (this comment is
also related to next patch, because shinfo init is done there). If you
try to clone a mbuf that comes from an ext-pinned pool, it will
crash. Here is the code from attach():
	static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
	{
	        RTE_ASSERT(RTE_MBUF_DIRECT(mi) &&
	            rte_mbuf_refcnt_read(mi) == 1);
	        if (RTE_MBUF_HAS_EXTBUF(m)) {
	                rte_mbuf_ext_refcnt_update(m->shinfo, 1); << HERE
	                mi->ol_flags = m->ol_flags;
	                mi->shinfo = m->shinfo;
		...
The 2 alternatives I see are:
- do not allow cloning these mbufs, but today the attach() functions have
  no return value, so there is no way to know whether it failed or not
- manage shinfo to support clones
I think just ignoring the rte_mbuf_ext_refcnt_update() if shinfo == NULL
is not an option, because it could result in recycling the extbuf while a
clone still references it.
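To illustrate the concern with a toy model (hypothetical sketch_* names, no relation to the real mbuf layout; the real attach() returns void, so the return value here is an assumption made only for the sketch): the buffer may be reused only once the last reference is dropped, and an mbuf without shared info has nothing that could track a clone.

```c
#include <assert.h>
#include <stddef.h>

/* Toy shared-info block: only the reference counter is modelled. */
struct sketch_shinfo {
	int refcnt;
};

/* Toy mbuf: shinfo == NULL models a buffer with no shared info. */
struct sketch_mbuf {
	struct sketch_shinfo *shinfo;
};

/*
 * Attach a clone to the buffer of m.
 * Returns 0 on success, -1 when the buffer cannot be shared safely.
 */
static int
sketch_attach(struct sketch_mbuf *clone, const struct sketch_mbuf *m)
{
	if (m->shinfo == NULL)
		return -1; /* nothing would track the extra reference */
	m->shinfo->refcnt++;
	clone->shinfo = m->shinfo;
	return 0;
}

/* Drop one reference; returns 1 when the buffer may be reused. */
static int
sketch_release(struct sketch_mbuf *m)
{
	assert(m->shinfo != NULL && m->shinfo->refcnt > 0);
	return --m->shinfo->refcnt == 0;
}
```

In this model, releasing the original mbuf while a clone is attached reports the buffer as still in use, which is the guarantee that would be lost if the counter update were silently skipped.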
>  		__rte_pktmbuf_free_extbuf(m);
> -	else
> +	} else {
>  		__rte_pktmbuf_free_direct(m);
> -
> +	}
>  	priv_size = rte_pktmbuf_priv_size(mp);
>  	mbuf_size = (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
>  	buf_len = rte_pktmbuf_data_room_size(mp);
> -- 
> 1.8.3.1
> 
* Re: [dpdk-dev] [PATCH v3 1/4] mbuf: detach mbuf with pinned external buffer
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 1/4] " Viacheslav Ovsiienko
  2020-01-14 15:27     ` Olivier Matz
@ 2020-01-14 15:50     ` Stephen Hemminger
  1 sibling, 0 replies; 77+ messages in thread
From: Stephen Hemminger @ 2020-01-14 15:50 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, orika, shahafs, olivier.matz
On Tue, 14 Jan 2020 09:15:02 +0000
Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> +/**
> + * Returns TRUE if given mbuf has a pinned external buffer, or FALSE
> + * otherwise. The pinned external buffer is allocated at pool creation
> + * time and should not be freed.
> + *
> + * External buffer is a user-provided anonymous buffer.
> + */
> +#ifdef ALLOW_EXPERIMENTAL_API
> +#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) rte_mbuf_has_pinned_extbuf(mb)
> +#else
> +#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) false
> +#endif
This is worse than just letting new code in.
If you have to use conditional compilation, then please base it off
a config value.
And make the resulting function an inline; this avoids introducing yet
another macro. MACROS ARE HARDER TO READ.
#ifdef RTE_CONFIG_MBUF_PINNED
static inline bool
rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m)
{
...
}
#else
static inline bool
rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m)
{
	return false;
}
#endif
> +__rte_experimental
> +static inline uint32_t
> +rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m)
> +{
> +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> +		/*
> +		 * The mbuf has the external attached buffer,
> +		 * we should check the type of the memory pool where
> +		 * the mbuf was allocated from.
> +		 */
> +		struct rte_pktmbuf_pool_private *priv =
> +			(struct rte_pktmbuf_pool_private *)
> +				rte_mempool_get_priv(m->pool);
Since rte_mempool_get_priv() returns void *, the cast is unnecessary
in standard C. Maybe you still need it for people using rte_mbuf.h in C++.
* Re: [dpdk-dev] [PATCH v3 2/4] mbuf: create packet pool with external memory buffers
  2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 2/4] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
@ 2020-01-14 16:04     ` Olivier Matz
  2020-01-15 18:13       ` Slava Ovsiienko
  0 siblings, 1 reply; 77+ messages in thread
From: Olivier Matz @ 2020-01-14 16:04 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, orika, shahafs, stephen
On Tue, Jan 14, 2020 at 09:15:03AM +0000, Viacheslav Ovsiienko wrote:
> The dedicated routine rte_pktmbuf_pool_create_extbuf() is
> provided to create mbuf pool with data buffers located in
> the pinned external memory. The application provides the
> external memory description and routine initialises each
> mbuf with appropriate virtual and physical buffer address.
> It is entirely application responsibility to register
> external memory with rte_extmem_register() API, map this
> memory, etc.
> 
> The new introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
> is set in private pool structure, specifying the new special
> pool type. The allocated mbufs from pool of this kind will
> have the EXT_ATTACHED_MBUF flag set and NULL shared info
> pointer, because external buffers are not supposed to be
> freed and sharing management is not needed. Also, these
> mbufs can not be attached to other mbufs (not intended to
> be indirect).
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
>  lib/librte_mbuf/rte_mbuf.c           | 144 ++++++++++++++++++++++++++++++++++-
>  lib/librte_mbuf/rte_mbuf.h           |  86 ++++++++++++++++++++-
>  lib/librte_mbuf/rte_mbuf_version.map |   1 +
>  3 files changed, 228 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> index 8fa7f49..d151469 100644
> --- a/lib/librte_mbuf/rte_mbuf.c
> +++ b/lib/librte_mbuf/rte_mbuf.c
> @@ -59,9 +59,9 @@
>  	}
>  
>  	RTE_ASSERT(mp->elt_size >= sizeof(struct rte_mbuf) +
> -		user_mbp_priv->mbuf_data_room_size +
> +		((user_mbp_priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
> +		0 : user_mbp_priv->mbuf_data_room_size) +
>  		user_mbp_priv->mbuf_priv_size);
Is this check really needed?
> -	RTE_ASSERT(user_mbp_priv->flags == 0);
We can keep
 RTE_ASSERT((user_mbp_priv->flags & ~RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) == 0);
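In an assertion of this shape the parentheses around the masking expression matter, because in C the == operator binds tighter than the binary & operator. A minimal sketch of the difference, with a stand-in flag value and hypothetical sketch_* names:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF. */
#define SKETCH_F_PINNED (1u << 0)

/*
 * How "flags & ~SKETCH_F_PINNED == 0" is parsed without parentheses:
 * the comparison is evaluated first, and its result is then ANDed
 * with flags. The comparison is false, so the result is always 0.
 */
static int
sketch_check_as_parsed(uint32_t flags)
{
	int mask_is_zero = (~SKETCH_F_PINNED == 0); /* always 0 */

	return (int)(flags & (uint32_t)mask_is_zero);
}

/* Intended check: no bits other than SKETCH_F_PINNED are set. */
static int
sketch_check_parenthesized(uint32_t flags)
{
	return (flags & ~SKETCH_F_PINNED) == 0;
}
```

An assertion built on the first form would fire for every pool in debug builds, including pools whose flags are valid.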
>  
>  	mbp_priv = rte_mempool_get_priv(mp);
>  	memcpy(mbp_priv, user_mbp_priv, sizeof(*mbp_priv));
> @@ -107,6 +107,63 @@
>  	m->next = NULL;
>  }
>  
> +/*
> + * pktmbuf constructor for the pool with pinned external buffer,
> + * given as a callback function to rte_mempool_obj_iter() in
> + * rte_pktmbuf_pool_create_extbuf(). Set the fields of a packet
> + * mbuf to their default values.
> + */
> +void
> +rte_pktmbuf_init_extmem(struct rte_mempool *mp,
> +			void *opaque_arg,
> +			void *_m,
> +			__attribute__((unused)) unsigned int i)
> +{
> +	struct rte_mbuf *m = _m;
> +	struct rte_pktmbuf_extmem_init_ctx *ctx = opaque_arg;
> +	struct rte_pktmbuf_extmem *ext_mem;
> +	uint32_t mbuf_size, buf_len, priv_size;
> +
> +	priv_size = rte_pktmbuf_priv_size(mp);
> +	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
> +	buf_len = rte_pktmbuf_data_room_size(mp);
> +
> +	RTE_ASSERT(RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) == priv_size);
> +	RTE_ASSERT(mp->elt_size >= mbuf_size);
> +	RTE_ASSERT(buf_len <= UINT16_MAX);
> +
> +	memset(m, 0, mbuf_size);
> +	m->priv_size = priv_size;
> +	m->buf_len = (uint16_t)buf_len;
> +
> +	/* set the data buffer pointers to external memory */
> +	ext_mem = ctx->ext_mem + ctx->ext;
> +
> +	RTE_ASSERT(ctx->ext < ctx->ext_num);
> +	RTE_ASSERT(ctx->off < ext_mem->buf_len);
> +
> +	m->buf_addr = RTE_PTR_ADD(ext_mem->buf_ptr, ctx->off);
> +	m->buf_iova = ext_mem->buf_iova == RTE_BAD_IOVA ?
> +		      RTE_BAD_IOVA : (ext_mem->buf_iova + ctx->off);
> +
> +	ctx->off += ext_mem->elt_size;
> +	if (ctx->off >= ext_mem->buf_len) {
> +		ctx->off = 0;
> +		++ctx->ext;
> +	}
> +	/* keep some headroom between start of buffer and data */
> +	m->data_off = RTE_MIN(RTE_PKTMBUF_HEADROOM, (uint16_t)m->buf_len);
> +
> +	/* init some constant fields */
> +	m->pool = mp;
> +	m->nb_segs = 1;
> +	m->port = MBUF_INVALID_PORT;
> +	m->ol_flags = EXT_ATTACHED_MBUF;
> +	rte_mbuf_refcnt_set(m, 1);
> +	m->next = NULL;
> +}
> +
> +
>  /* Helper to create a mbuf pool with given mempool ops name*/
>  struct rte_mempool *
>  rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
> @@ -169,6 +226,89 @@ struct rte_mempool *
>  			data_room_size, socket_id, NULL);
>  }
>  
> +/* Helper to create a mbuf pool with pinned external data buffers. */
> +struct rte_mempool *
> +rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
> +	unsigned int cache_size, uint16_t priv_size,
> +	uint16_t data_room_size, int socket_id,
> +	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num)
> +{
> +	struct rte_mempool *mp;
> +	struct rte_pktmbuf_pool_private mbp_priv;
> +	struct rte_pktmbuf_extmem_init_ctx init_ctx;
> +	const char *mp_ops_name;
> +	unsigned int elt_size;
> +	unsigned int i, n_elts = 0;
> +	int ret;
> +
> +	if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
> +		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
> +			priv_size);
> +		rte_errno = EINVAL;
> +		return NULL;
> +	}
> +	/* Check the external memory descriptors. */
> +	for (i = 0; i < ext_num; i++) {
> +		struct rte_pktmbuf_extmem *extm = ext_mem + i;
> +
> +		if (!extm->elt_size || !extm->buf_len || !extm->buf_ptr) {
> +			RTE_LOG(ERR, MBUF, "invalid extmem descriptor\n");
> +			rte_errno = EINVAL;
> +			return NULL;
> +		}
> +		if (data_room_size > extm->elt_size) {
> +			RTE_LOG(ERR, MBUF, "ext elt_size=%u is too small\n",
> +				priv_size);
> +			rte_errno = EINVAL;
> +			return NULL;
> +		}
> +		n_elts += extm->buf_len / extm->elt_size;
> +	}
> +	/* Check whether enough external memory provided. */
> +	if (n_elts < n) {
> +		RTE_LOG(ERR, MBUF, "not enough extmem\n");
> +		rte_errno = ENOMEM;
> +		return NULL;
> +	}
> +	elt_size = sizeof(struct rte_mbuf) + (unsigned int)priv_size;
> +	memset(&mbp_priv, 0, sizeof(mbp_priv));
> +	mbp_priv.mbuf_data_room_size = data_room_size;
> +	mbp_priv.mbuf_priv_size = priv_size;
> +	mbp_priv.flags = RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
> +
> +	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
> +		 sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
> +	if (mp == NULL)
> +		return NULL;
> +
> +	mp_ops_name = rte_mbuf_best_mempool_ops();
> +	ret = rte_mempool_set_ops_byname(mp, mp_ops_name, NULL);
> +	if (ret != 0) {
> +		RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
> +		rte_mempool_free(mp);
> +		rte_errno = -ret;
> +		return NULL;
> +	}
> +	rte_pktmbuf_pool_init(mp, &mbp_priv);
> +
> +	ret = rte_mempool_populate_default(mp);
> +	if (ret < 0) {
> +		rte_mempool_free(mp);
> +		rte_errno = -ret;
> +		return NULL;
> +	}
> +
> +	init_ctx = (struct rte_pktmbuf_extmem_init_ctx){
> +		.ext_mem = ext_mem,
> +		.ext_num = ext_num,
> +		.ext = 0,
> +		.off = 0,
> +	};
> +	rte_mempool_obj_iter(mp, rte_pktmbuf_init_extmem, &init_ctx);
> +
> +	return mp;
> +}
Instead of duplicating some code, would it be possible to do:
int
rte_pktmbuf_pool_attach_extbuf(struct rte_mempool *mp,
	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num)
{
	struct rte_pktmbuf_extmem_init_ctx init_ctx = { 0 };
	struct rte_pktmbuf_pool_private *priv;
	/* XXX assert mempool is fully populated? */
	priv = rte_mempool_get_priv(mp);
	priv->flags |= RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
	rte_mempool_obj_iter(mp, rte_pktmbuf_init_extmem, &init_ctx);
	return init_ctx.ret;
}
The application would have to call:
	rte_pktmbuf_pool_create(...);
	rte_pktmbuf_pool_attach_extbuf(...);
> +
>  /* do some sanity checks on a mbuf: panic if it fails */
>  void
>  rte_mbuf_sanity_check(const struct rte_mbuf *m, int is_header)
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 8f486af..7bde297 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -642,6 +642,34 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
>  void rte_pktmbuf_init(struct rte_mempool *mp, void *opaque_arg,
>  		      void *m, unsigned i);
>  
> +/** The context to initialize the mbufs with pinned external buffers. */
> +struct rte_pktmbuf_extmem_init_ctx {
> +	struct rte_pktmbuf_extmem *ext_mem; /* pointer to descriptor array. */
> +	unsigned int ext_num; /* number of descriptors in array. */
> +	unsigned int ext; /* loop descriptor index. */
> +	size_t off; /* loop buffer offset. */
> +};
> +
> +/**
> + * The packet mbuf constructor for pools with pinned external memory.
> + *
> + * This function initializes some fields in the mbuf structure that are
> + * not modified by the user once created (origin pool, buffer start
> + * address, and so on). This function is given as a callback function to
> + * rte_mempool_obj_iter() called from rte_mempool_create_extmem().
> + *
> + * @param mp
> + *   The mempool from which mbufs originate.
> + * @param opaque_arg
> + *   A pointer to the rte_pktmbuf_extmem_init_ctx - initialization
> + *   context structure
> + * @param m
> + *   The mbuf to initialize.
> + * @param i
> + *   The index of the mbuf in the pool table.
> + */
> +void rte_pktmbuf_init_extmem(struct rte_mempool *mp, void *opaque_arg,
> +			     void *m, unsigned int i);
>  
>  /**
>   * A  packet mbuf pool constructor.
> @@ -743,6 +771,62 @@ struct rte_mempool *
>  	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
>  	int socket_id, const char *ops_name);
>  
> +/** A structure that describes the pinned external buffer segment. */
> +struct rte_pktmbuf_extmem {
> +	void *buf_ptr;		/**< The virtual address of data buffer. */
> +	rte_iova_t buf_iova;	/**< The IO address of the data buffer. */
> +	size_t buf_len;		/**< External buffer length in bytes. */
> +	uint16_t elt_size;	/**< mbuf element size in bytes. */
> +};
> +
> +/**
> + * Create a mbuf pool with external pinned data buffers.
> + *
> + * This function creates and initializes a packet mbuf pool that contains
> + * only mbufs with external buffer. It is a wrapper to rte_mempool functions.
> + *
> + * @param name
> + *   The name of the mbuf pool.
> + * @param n
> + *   The number of elements in the mbuf pool. The optimum size (in terms
> + *   of memory usage) for a mempool is when n is a power of two minus one:
> + *   n = (2^q - 1).
> + * @param cache_size
> + *   Size of the per-core object cache. See rte_mempool_create() for
> + *   details.
> + * @param priv_size
> + *   Size of application private are between the rte_mbuf structure
> + *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
> + * @param data_room_size
> + *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
> + * @param socket_id
> + *   The socket identifier where the memory should be allocated. The
> + *   value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the
> + *   reserved zone.
> + * @param ext_mem
> + *   Pointer to the array of structures describing the external memory
> + *   for data buffers. It is caller responsibility to register this memory
> + *   with rte_extmem_register() (if needed), map this memory to appropriate
> + *   physical device, etc.
> + * @param ext_num
> + *   Number of elements in the ext_mem array.
> + * @return
> + *   The pointer to the new allocated mempool, on success. NULL on error
> + *   with rte_errno set appropriately. Possible rte_errno values include:
> + *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
> + *    - E_RTE_SECONDARY - function was called from a secondary process instance
> + *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
> + *    - ENOSPC - the maximum number of memzones has already been allocated
> + *    - EEXIST - a memzone with the same name already exists
> + *    - ENOMEM - no appropriate memory area found in which to create memzone
> + */
> +__rte_experimental
> +struct rte_mempool *
> +rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
> +	unsigned int cache_size, uint16_t priv_size,
> +	uint16_t data_room_size, int socket_id,
> +	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num);
> +
>  /**
>   * Get the data room size of mbufs stored in a pktmbuf_pool
>   *
> @@ -818,7 +902,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
>  	m->nb_segs = 1;
>  	m->port = MBUF_INVALID_PORT;
>  
> -	m->ol_flags = 0;
> +	m->ol_flags &= EXT_ATTACHED_MBUF;
>  	m->packet_type = 0;
>  	rte_pktmbuf_reset_headroom(m);
>  
I wonder if it should go in previous patch?
> diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
> index 3bbb476..ab161bc 100644
> --- a/lib/librte_mbuf/rte_mbuf_version.map
> +++ b/lib/librte_mbuf/rte_mbuf_version.map
> @@ -44,5 +44,6 @@ EXPERIMENTAL {
>  	rte_mbuf_dyn_dump;
>  	rte_pktmbuf_copy;
>  	rte_pktmbuf_free_bulk;
> +	rte_pktmbuf_pool_create_extbuf;
>  
>  };
> -- 
> 1.8.3.1
> 
* Re: [dpdk-dev] [PATCH v3 1/4] mbuf: detach mbuf with pinned external buffer
  2020-01-14 15:27     ` Olivier Matz
@ 2020-01-15 12:52       ` Slava Ovsiienko
  0 siblings, 0 replies; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-15 12:52 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler, stephen
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Tuesday, January 14, 2020 17:27
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; stephen@networkplumber.org
> Subject: Re: [PATCH v3 1/4] mbuf: detach mbuf with pinned external buffer
> 
> Hi Viacheslav,
> 
> Please see some comments below.
> 
> On Tue, Jan 14, 2020 at 09:15:02AM +0000, Viacheslav Ovsiienko wrote:
> > Update detach routine to check the mbuf pool type.
> >
> > Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 64 +++++++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 60 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 219b110..8f486af 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -306,6 +306,46 @@ struct rte_pktmbuf_pool_private {
> >  	uint32_t flags; /**< reserved for future use. */
> >  };
> >
> > +/**
> > + * When set pktmbuf mempool will hold only mbufs with pinned external
> > + * buffer. The external buffer will be attached on the mbuf at the
> > + * memory pool creation and will never be detached by the mbuf free calls.
> > + * mbuf should not contain any room for data after the mbuf structure.
> > + */
> > +#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
> 
> Out of curiosity, why using a pool flag instead of flagging the mbufs?
> The reason I see is that adding a new m->flag would impact places where we
> copy or reset the flags (it should be excluded). Is there another reason?
> 
Can we introduce a new static flag for the mbuf?
Yes, there is a problem: there are many places in DPDK where the flags
of newly allocated mbufs are disregarded (the ol_flags field is just set to zero).
So any flag set on allocation (even a static one; a dynamic one cannot be handled at all)
would get lost. We could fix this in new applications (this feature is aimed at
new ones) and in the PMDs supporting it; the question is whether we are allowed to
define a new static (that is, not dynamic) mbuf flag.
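To illustrate the point about flags being lost, here is a minimal standalone sketch. The type demo_mbuf and the DEMO_* flag values are hypothetical stand-ins for illustration only, not the DPDK definitions. It contrasts a reset that zeroes ol_flags with one that masks it, as the rte_pktmbuf_reset() change in this series does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_EXT_ATTACHED (1ULL << 61)
#define DEMO_RX_FLAG      (1ULL << 0)

/* Hypothetical, heavily reduced stand-in for struct rte_mbuf. */
struct demo_mbuf {
	uint64_t ol_flags;
};

/* Reset as done in many places today: every flag is lost. */
static inline void
demo_reset_zero(struct demo_mbuf *m)
{
	assert(m != 0);
	m->ol_flags = 0;
}

/* Reset that keeps only the "external buffer attached" bit. */
static inline void
demo_reset_keep_ext(struct demo_mbuf *m)
{
	assert(m != 0);
	m->ol_flags &= DEMO_EXT_ATTACHED;
}
```

Any code path that uses the first form on an mbuf from a pinned pool would silently drop the marker, which is why a per-mbuf flag alone is fragile here.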
> > +/**
> > + * Returns TRUE if given mbuf has an pinned external buffer, or FALSE
> > + * otherwise. The pinned external buffer is allocated at pool creation
> > + * time and should not be freed.
> > + *
> > + * External buffer is a user-provided anonymous buffer.
> > + */
> > +#ifdef ALLOW_EXPERIMENTAL_API
> > +#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) rte_mbuf_has_pinned_extbuf(mb)
> > +#else
> > +#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) false
> > +#endif
> 
> I suppose you added these lines because the compilation was broken after
> introducing the new __rte_experimental API, which is called from detach().
> 
> I find a bit strange that we require to do this. I don't see what would be
> broken without the ifdef: an application compiled for 19.11 cannot use the
> pinned-ext-buf feature (because it did not exist), so the modification looks
> safe to me.
Without the ifdef, compilation fails if experimental API usage is not enabled.
> 
> > +
> > +__rte_experimental
> > +static inline uint32_t
> 
> I don't think uint32_t is really better than uint64_t. I agree with Stephen that
> bool is the preferred choice, however if it breaks compilation in some cases, I
> think int is better.
Yes, bool causes compilation issues (just including stdbool.h causes a build failure),
and, for example, bool is a reserved gcc keyword if AltiVec support is enabled.
uint32_t is the type of priv->flags.
> 
> > +rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m)
> > +{
> > +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> > +		/*
> > +		 * The mbuf has the external attached buffer,
> > +		 * we should check the type of the memory pool where
> > +		 * the mbuf was allocated from.
> > +		 */
> > +		struct rte_pktmbuf_pool_private *priv =
> > +			(struct rte_pktmbuf_pool_private *)
> > +				rte_mempool_get_priv(m->pool);
> > +
> > +		return priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
> 
> What about introducing a rte_pktmbuf_priv_flags() function, on the same
> model than rte_pktmbuf_priv_size() or rte_pktmbuf_data_room_size()?
Nice idea, thanks. I think this routine does not need to be experimental, so we
would get rid of the ifdef and the other stuff.
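A rough sketch of what such an accessor could look like, modelled on the existing size getters. All demo_* names and the simplified pool layout are hypothetical stand-ins, so this only shows the shape of the helper, not the final API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF. */
#define DEMO_POOL_F_PINNED_EXT_BUF (1u << 0)

/* Hypothetical stand-in for struct rte_pktmbuf_pool_private. */
struct demo_pool_private {
	uint16_t mbuf_data_room_size;
	uint16_t mbuf_priv_size;
	uint32_t flags;
};

/* Hypothetical stand-in for a mempool carrying its private area. */
struct demo_mempool {
	struct demo_pool_private priv;
};

/* Return the raw pool flags, in the style of the existing size getters. */
static inline uint32_t
demo_pktmbuf_priv_flags(const struct demo_mempool *mp)
{
	assert(mp != 0);
	return mp->priv.flags;
}

/* Non-zero if the pool holds mbufs with pinned external buffers. */
static inline int
demo_pool_is_pinned(const struct demo_mempool *mp)
{
	return (demo_pktmbuf_priv_flags(mp) & DEMO_POOL_F_PINNED_EXT_BUF) != 0;
}
```

With a plain getter like this, the pinned check becomes a mask on its return value, so no experimental symbol has to be called from the inline detach path.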
> 
> 
> > +	}
> > +	return 0;
> > +}
> > +
> >  #ifdef RTE_LIBRTE_MBUF_DEBUG
> >
> >  /**  check mbuf type in debug mode */
> > @@ -571,7 +611,8 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
> >  static __rte_always_inline void
> >  rte_mbuf_raw_free(struct rte_mbuf *m)
> >  {
> > -	RTE_ASSERT(RTE_MBUF_DIRECT(m));
> > +	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
> > +		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
> >  	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
> >  	RTE_ASSERT(m->next == NULL);
> >  	RTE_ASSERT(m->nb_segs == 1);
> > @@ -1141,11 +1182,26 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
> >  	uint32_t mbuf_size, buf_len;
> >  	uint16_t priv_size;
> >
> > -	if (RTE_MBUF_HAS_EXTBUF(m))
> > +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> > +		/*
> > +		 * The mbuf has the external attached buffed,
> > +		 * we should check the type of the memory pool where
> > +		 * the mbuf was allocated from to detect the pinned
> > +		 * external buffer.
> > +		 */
> > +		struct rte_pktmbuf_pool_private *priv =
> > +			(struct rte_pktmbuf_pool_private *)
> > +				rte_mempool_get_priv(mp);
> > +
> 
> It could be rte_pktmbuf_priv_flags() as said above.
> 
> > +		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
> > +			RTE_ASSERT(m->shinfo == NULL);
> > +			m->ol_flags = EXT_ATTACHED_MBUF;
> > +			return;
> > +		}
> 
> I think it is not possible to have m->shinfo == NULL (this comment is also
> related to next patch, because shinfo init is done there). If you try to clone a
> mbuf that comes from an ext-pinned pool, it will crash. Here is the code from
> attach():
> 
> 	static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
> 	{
> 	        RTE_ASSERT(RTE_MBUF_DIRECT(mi) &&
> 	            rte_mbuf_refcnt_read(mi) == 1);
> 
> 	        if (RTE_MBUF_HAS_EXTBUF(m)) {
> 	                rte_mbuf_ext_refcnt_update(m->shinfo, 1); << HERE
> 	                mi->ol_flags = m->ol_flags;
> 	                mi->shinfo = m->shinfo;
> 		...
> 
> The 2 alternatives I see are:
> 
> - do not allow to clone these mbufs, but today there is no return value
>   to attach() functions, so there is no way to know if it failed or not
> 
> - manage shinfo to support clones
> 
> I think just ignoring the rte_mbuf_ext_refcnt_update() if shinfo == NULL is not
> an option, because it could result in recycling the extbuf while a clone still
> references it.
> 
> 
Cloning of mbufs with pinned buffers is not supported at all; this was
chosen as a development precondition. We can't touch the buf_addr/buf_iova fields in the mbuf,
because there is no way to restore these pointers (there is no longer a fixed offset between
the mbuf and its data buffer), so rte_pktmbuf_detach() would not be operational at all.
Also, we can't deduce the mbuf base address from the data buffer address.
We could add an RTE_ASSERT to prevent attaching (to) mbufs with pinned data; what do
you think?
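The fixed-offset argument can be sketched as follows, assuming a simplified layout where the private area and data room directly follow the mbuf header. demo_mbuf and the demo_* helpers are hypothetical names, not DPDK API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for struct rte_mbuf. */
struct demo_mbuf {
	void *buf_addr;     /* start of the data buffer */
	uint16_t priv_size; /* size of the private area after the header */
};

/*
 * Regular pool: the buffer sits right after the header and private area,
 * so its address can always be recomputed from the mbuf address.
 */
static inline void *
demo_embedded_buf_addr(struct demo_mbuf *m)
{
	assert(m != NULL);
	return (char *)m + sizeof(struct demo_mbuf) + m->priv_size;
}

/* Non-zero when buf_addr could be restored from the mbuf address alone. */
static inline int
demo_buf_is_recoverable(struct demo_mbuf *m)
{
	return m->buf_addr == demo_embedded_buf_addr(m);
}
```

For a pinned external buffer the second helper returns zero: once buf_addr is overwritten (for example by an attach), nothing in the mbuf allows the original address to be recomputed.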
> >  		__rte_pktmbuf_free_extbuf(m);
> > -	else
> > +	} else {
> >  		__rte_pktmbuf_free_direct(m);
> > -
> > +	}
> >  	priv_size = rte_pktmbuf_priv_size(mp);
> >  	mbuf_size = (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
> >  	buf_len = rte_pktmbuf_data_room_size(mp);
> > --
> > 1.8.3.1
> >
* Re: [dpdk-dev] [PATCH v3 2/4] mbuf: create packet pool with external memory buffers
  2020-01-14 16:04     ` Olivier Matz
@ 2020-01-15 18:13       ` Slava Ovsiienko
  0 siblings, 0 replies; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-15 18:13 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler, stephen
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Tuesday, January 14, 2020 18:05
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; stephen@networkplumber.org
> Subject: Re: [PATCH v3 2/4] mbuf: create packet pool with external memory
> buffers
> 
> On Tue, Jan 14, 2020 at 09:15:03AM +0000, Viacheslav Ovsiienko wrote:
> > The dedicated routine rte_pktmbuf_pool_create_extbuf() is provided to
> > create mbuf pool with data buffers located in the pinned external
> > memory. The application provides the external memory description and
> > routine initialises each mbuf with appropriate virtual and physical
> > buffer address.
> > It is entirely application responsibility to register external memory
> > with rte_extmem_register() API, map this memory, etc.
> >
> > The new introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF is set in
> > private pool structure, specifying the new special pool type. The
> > allocated mbufs from pool of this kind will have the EXT_ATTACHED_MBUF
> > flag set and NULL shared info pointer, because external buffers are
> > not supposed to be freed and sharing management is not needed. Also,
> > these mbufs can not be attached to other mbufs (not intended to be
> > indirect).
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > ---
> >  lib/librte_mbuf/rte_mbuf.c           | 144 ++++++++++++++++++++++++++++++++++-
> >  lib/librte_mbuf/rte_mbuf.h           |  86 ++++++++++++++++++++-
> >  lib/librte_mbuf/rte_mbuf_version.map |   1 +
> >  3 files changed, 228 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > index 8fa7f49..d151469 100644
> > --- a/lib/librte_mbuf/rte_mbuf.c
> > +++ b/lib/librte_mbuf/rte_mbuf.c
> > @@ -59,9 +59,9 @@
> >  	}
> >
> >  	RTE_ASSERT(mp->elt_size >= sizeof(struct rte_mbuf) +
> > -		user_mbp_priv->mbuf_data_room_size +
> > +		((user_mbp_priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
> > +		0 : user_mbp_priv->mbuf_data_room_size) +
> >  		user_mbp_priv->mbuf_priv_size);
> 
> Is this check really needed?
It seems so; it is in a separate routine, which might be called externally.
> 
> > -	RTE_ASSERT(user_mbp_priv->flags == 0);
> 
> We can keep
>  RTE_ASSERT((user_mbp_priv->flags & ~RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) == 0);
OK, thanks.
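One detail to watch when keeping this assert: in C, == binds tighter than &, so the mask has to be parenthesised before the comparison. A small standalone sketch of the difference (DEMO_F_PINNED and the demo_* helpers are hypothetical names for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF. */
#define DEMO_F_PINNED (1u << 0)

/*
 * How "flags & ~DEMO_F_PINNED == 0" is parsed without parentheses:
 * the comparison is evaluated first and yields 0, so the result is
 * always "flags & 0", that is, always false.
 */
static inline int
demo_check_as_parsed(uint32_t flags)
{
	return (flags & (uint32_t)(~DEMO_F_PINNED == 0)) != 0;
}

/* Intended check: no flag other than DEMO_F_PINNED may be set. */
static inline int
demo_check_intended(uint32_t flags)
{
	return (flags & ~DEMO_F_PINNED) == 0;
}
```

The unparenthesised form would make the assertion fail for every pool, including valid ones, whenever asserts are enabled.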
> 
> >
> >  	mbp_priv = rte_mempool_get_priv(mp);
> >  	memcpy(mbp_priv, user_mbp_priv, sizeof(*mbp_priv));
> > @@ -107,6 +107,63 @@
> >  	m->next = NULL;
> >  }
> >
> > +/*
> > + * pktmbuf constructor for the pool with pinned external buffer,
> > + * given as a callback function to rte_mempool_obj_iter() in
> > + * rte_pktmbuf_pool_create_extbuf(). Set the fields of a packet
> > + * mbuf to their default values.
> > + */
> > +void
> > +rte_pktmbuf_init_extmem(struct rte_mempool *mp,
> > +			void *opaque_arg,
> > +			void *_m,
> > +			__attribute__((unused)) unsigned int i)
> > +{
> > +	struct rte_mbuf *m = _m;
> > +	struct rte_pktmbuf_extmem_init_ctx *ctx = opaque_arg;
> > +	struct rte_pktmbuf_extmem *ext_mem;
> > +	uint32_t mbuf_size, buf_len, priv_size;
> > +
> > +	priv_size = rte_pktmbuf_priv_size(mp);
> > +	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
> > +	buf_len = rte_pktmbuf_data_room_size(mp);
> > +
> > +	RTE_ASSERT(RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) == priv_size);
> > +	RTE_ASSERT(mp->elt_size >= mbuf_size);
> > +	RTE_ASSERT(buf_len <= UINT16_MAX);
> > +
> > +	memset(m, 0, mbuf_size);
> > +	m->priv_size = priv_size;
> > +	m->buf_len = (uint16_t)buf_len;
> > +
> > +	/* set the data buffer pointers to external memory */
> > +	ext_mem = ctx->ext_mem + ctx->ext;
> > +
> > +	RTE_ASSERT(ctx->ext < ctx->ext_num);
> > +	RTE_ASSERT(ctx->off < ext_mem->buf_len);
> > +
> > +	m->buf_addr = RTE_PTR_ADD(ext_mem->buf_ptr, ctx->off);
> > +	m->buf_iova = ext_mem->buf_iova == RTE_BAD_IOVA ?
> > +		      RTE_BAD_IOVA : (ext_mem->buf_iova + ctx->off);
> > +
> > +	ctx->off += ext_mem->elt_size;
> > +	if (ctx->off >= ext_mem->buf_len) {
> > +		ctx->off = 0;
> > +		++ctx->ext;
> > +	}
> > +	/* keep some headroom between start of buffer and data */
> > +	m->data_off = RTE_MIN(RTE_PKTMBUF_HEADROOM, (uint16_t)m->buf_len);
> > +
> > +	/* init some constant fields */
> > +	m->pool = mp;
> > +	m->nb_segs = 1;
> > +	m->port = MBUF_INVALID_PORT;
> > +	m->ol_flags = EXT_ATTACHED_MBUF;
> > +	rte_mbuf_refcnt_set(m, 1);
> > +	m->next = NULL;
> > +}
> > +
> > +
> >  /* Helper to create a mbuf pool with given mempool ops name*/
> >  struct rte_mempool *
> >  rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
> > @@ -169,6 +226,89 @@ struct rte_mempool *
> >  			data_room_size, socket_id, NULL);
> >  }
> >
> > +/* Helper to create a mbuf pool with pinned external data buffers. */
> > +struct rte_mempool *
> > +rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
> > +	unsigned int cache_size, uint16_t priv_size,
> > +	uint16_t data_room_size, int socket_id,
> > +	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num)
> > +{
> > +	struct rte_mempool *mp;
> > +	struct rte_pktmbuf_pool_private mbp_priv;
> > +	struct rte_pktmbuf_extmem_init_ctx init_ctx;
> > +	const char *mp_ops_name;
> > +	unsigned int elt_size;
> > +	unsigned int i, n_elts = 0;
> > +	int ret;
> > +
> > +	if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
> > +		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
> > +			priv_size);
> > +		rte_errno = EINVAL;
> > +		return NULL;
> > +	}
> > +	/* Check the external memory descriptors. */
> > +	for (i = 0; i < ext_num; i++) {
> > +		struct rte_pktmbuf_extmem *extm = ext_mem + i;
> > +
> > +		if (!extm->elt_size || !extm->buf_len || !extm->buf_ptr) {
> > +			RTE_LOG(ERR, MBUF, "invalid extmem descriptor\n");
> > +			rte_errno = EINVAL;
> > +			return NULL;
> > +		}
> > +		if (data_room_size > extm->elt_size) {
> > +			RTE_LOG(ERR, MBUF, "ext elt_size=%u is too small\n",
> > +				priv_size);
> > +			rte_errno = EINVAL;
> > +			return NULL;
> > +		}
> > +		n_elts += extm->buf_len / extm->elt_size;
> > +	}
> > +	/* Check whether enough external memory provided. */
> > +	if (n_elts < n) {
> > +		RTE_LOG(ERR, MBUF, "not enough extmem\n");
> > +		rte_errno = ENOMEM;
> > +		return NULL;
> > +	}
> > +	elt_size = sizeof(struct rte_mbuf) + (unsigned int)priv_size;
> > +	memset(&mbp_priv, 0, sizeof(mbp_priv));
> > +	mbp_priv.mbuf_data_room_size = data_room_size;
> > +	mbp_priv.mbuf_priv_size = priv_size;
> > +	mbp_priv.flags = RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
> > +
> > +	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
> > +		 sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
> > +	if (mp == NULL)
> > +		return NULL;
> > +
> > +	mp_ops_name = rte_mbuf_best_mempool_ops();
> > +	ret = rte_mempool_set_ops_byname(mp, mp_ops_name, NULL);
> > +	if (ret != 0) {
> > +		RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
> > +		rte_mempool_free(mp);
> > +		rte_errno = -ret;
> > +		return NULL;
> > +	}
> > +	rte_pktmbuf_pool_init(mp, &mbp_priv);
> > +
> > +	ret = rte_mempool_populate_default(mp);
> > +	if (ret < 0) {
> > +		rte_mempool_free(mp);
> > +		rte_errno = -ret;
> > +		return NULL;
> > +	}
> > +
> > +	init_ctx = (struct rte_pktmbuf_extmem_init_ctx){
> > +		.ext_mem = ext_mem,
> > +		.ext_num = ext_num,
> > +		.ext = 0,
> > +		.off = 0,
> > +	};
> > +	rte_mempool_obj_iter(mp, rte_pktmbuf_init_extmem, &init_ctx);
> > +
> > +	return mp;
> > +}
> 
> Instead of duplicating some code, would it be possible to do:
> 
> int
> rte_pktmbuf_pool_attach_extbuf(struct rte_mempool *mp,
> 	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num)
> {
> 	struct rte_pktmbuf_extmem_init_ctx init_ctx = { 0 };
> 	struct rte_pktmbuf_pool_private *priv;
> 
> 	/* XXX assert mempool is fully populated? */
> 
> 	priv = rte_mempool_get_priv(mp);
> 	priv->flags |= RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
> 
> 	rte_mempool_obj_iter(mp, rte_pktmbuf_init_extmem, &init_ctx);
> 
> 	return init_ctx.ret;
> }
> 
> The application would have to call:
> 
> 	rte_pktmbuf_pool_create(...);
> 	rte_pktmbuf_pool_attach_extbuf(...);
> 
It seems there are some disadvantages:
- no data_room_size check (we would have to remove the asserts from rte_pktmbuf_pool_init)
- rte_mempool_obj_iter would be called twice; it might involve rte_mempool_virt2iova()
  and would take some time
The code duplication is not as large as the diff suggests: part of
rte_pktmbuf_pool_create_extbuf() deals with checking the extmem descriptors, and the main part
of the job is done in rte_pktmbuf_init_extmem().
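The two pieces mentioned above (the capacity check and the per-mbuf buffer assignment) can be sketched in isolation. This is a simplified model, not the patch code: the extmem_desc and extmem_walk types and both helper names are hypothetical stand-ins for rte_pktmbuf_extmem and the init context, IOVA handling is left out, and each buf_len is assumed to be an exact multiple of elt_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_pktmbuf_extmem (no IOVA field). */
struct extmem_desc {
	void *buf_ptr;     /* virtual address of the segment */
	size_t buf_len;    /* segment length in bytes */
	uint16_t elt_size; /* bytes reserved per mbuf data buffer */
};

/* Hypothetical stand-in for the init context walked by the constructor. */
struct extmem_walk {
	const struct extmem_desc *ext_mem; /* descriptor array */
	unsigned int ext_num;              /* number of descriptors */
	unsigned int ext;                  /* current descriptor index */
	size_t off;                        /* offset in current descriptor */
};

/* Count how many data buffers the descriptors provide in total. */
static inline unsigned int
extmem_capacity(const struct extmem_desc *ext_mem, unsigned int ext_num)
{
	unsigned int i, n_elts = 0;

	for (i = 0; i < ext_num; i++)
		n_elts += ext_mem[i].buf_len / ext_mem[i].elt_size;
	return n_elts;
}

/* Return the next buffer address and advance to the following slot. */
static inline void *
extmem_next(struct extmem_walk *ctx)
{
	const struct extmem_desc *d;
	void *addr;

	assert(ctx->ext < ctx->ext_num);
	d = ctx->ext_mem + ctx->ext;
	assert(ctx->off < d->buf_len);

	addr = (char *)d->buf_ptr + ctx->off;
	ctx->off += d->elt_size;
	if (ctx->off >= d->buf_len) {
		/* Segment exhausted: move on to the next descriptor. */
		ctx->off = 0;
		++ctx->ext;
	}
	return addr;
}
```

In this model the pool creation routine would compare extmem_capacity() against the requested number of mbufs, and the object iterator would call extmem_next() once per mbuf.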
> 
> > +
> >  /* do some sanity checks on a mbuf: panic if it fails */
> >  void
> >  rte_mbuf_sanity_check(const struct rte_mbuf *m, int is_header)
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 8f486af..7bde297 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -642,6 +642,34 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
> >  void rte_pktmbuf_init(struct rte_mempool *mp, void *opaque_arg,
> >  		      void *m, unsigned i);
> >
> > +/** The context to initialize the mbufs with pinned external buffers. */
> > +struct rte_pktmbuf_extmem_init_ctx {
> > +	struct rte_pktmbuf_extmem *ext_mem; /* pointer to descriptor array. */
> > +	unsigned int ext_num; /* number of descriptors in array. */
> > +	unsigned int ext; /* loop descriptor index. */
> > +	size_t off; /* loop buffer offset. */
> > +};
> > +
> > +/**
> > + * The packet mbuf constructor for pools with pinned external memory.
> > + *
> > + * This function initializes some fields in the mbuf structure that are
> > + * not modified by the user once created (origin pool, buffer start
> > + * address, and so on). This function is given as a callback function to
> > + * rte_mempool_obj_iter() called from rte_mempool_create_extmem().
> > + *
> > + * @param mp
> > + *   The mempool from which mbufs originate.
> > + * @param opaque_arg
> > + *   A pointer to the rte_pktmbuf_extmem_init_ctx - initialization
> > + *   context structure
> > + * @param m
> > + *   The mbuf to initialize.
> > + * @param i
> > + *   The index of the mbuf in the pool table.
> > + */
> > +void rte_pktmbuf_init_extmem(struct rte_mempool *mp, void *opaque_arg,
> > +			     void *m, unsigned int i);
> >
> >  /**
> >   * A  packet mbuf pool constructor.
> > @@ -743,6 +771,62 @@ struct rte_mempool *
> >  	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
> >  	int socket_id, const char *ops_name);
> >
> > +/** A structure that describes the pinned external buffer segment. */
> > +struct rte_pktmbuf_extmem {
> > +	void *buf_ptr;		/**< The virtual address of data buffer. */
> > +	rte_iova_t buf_iova;	/**< The IO address of the data buffer. */
> > +	size_t buf_len;		/**< External buffer length in bytes. */
> > +	uint16_t elt_size;	/**< mbuf element size in bytes. */
> > +};
> > +
> > +/**
> > + * Create a mbuf pool with external pinned data buffers.
> > + *
> > + * This function creates and initializes a packet mbuf pool that contains
> > + * only mbufs with external buffer. It is a wrapper to rte_mempool functions.
> > + *
> > + * @param name
> > + *   The name of the mbuf pool.
> > + * @param n
> > + *   The number of elements in the mbuf pool. The optimum size (in terms
> > + *   of memory usage) for a mempool is when n is a power of two minus one:
> > + *   n = (2^q - 1).
> > + * @param cache_size
> > + *   Size of the per-core object cache. See rte_mempool_create() for
> > + *   details.
> > + * @param priv_size
> > + *   Size of application private are between the rte_mbuf structure
> > + *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
> > + * @param data_room_size
> > + *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
> > + * @param socket_id
> > + *   The socket identifier where the memory should be allocated. The
> > + *   value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the
> > + *   reserved zone.
> > + * @param ext_mem
> > + *   Pointer to the array of structures describing the external memory
> > + *   for data buffers. It is caller responsibility to register this memory
> > + *   with rte_extmem_register() (if needed), map this memory to appropriate
> > + *   physical device, etc.
> > + * @param ext_num
> > + *   Number of elements in the ext_mem array.
> > + * @return
> > + *   The pointer to the new allocated mempool, on success. NULL on error
> > + *   with rte_errno set appropriately. Possible rte_errno values include:
> > + *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
> > + *    - E_RTE_SECONDARY - function was called from a secondary process instance
> > + *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
> > + *    - ENOSPC - the maximum number of memzones has already been allocated
> > + *    - EEXIST - a memzone with the same name already exists
> > + *    - ENOMEM - no appropriate memory area found in which to create memzone
> > + */
> > +__rte_experimental
> > +struct rte_mempool *
> > +rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
> > +	unsigned int cache_size, uint16_t priv_size,
> > +	uint16_t data_room_size, int socket_id,
> > +	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num);
> > +
> >  /**
> >   * Get the data room size of mbufs stored in a pktmbuf_pool
> >   *
> > @@ -818,7 +902,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf
> *m)
> >  	m->nb_segs = 1;
> >  	m->port = MBUF_INVALID_PORT;
> >
> > -	m->ol_flags = 0;
> > +	m->ol_flags &= EXT_ATTACHED_MBUF;
> >  	m->packet_type = 0;
> >  	rte_pktmbuf_reset_headroom(m);
> >
> 
> I wonder if it should go in previous patch?
Mmm... Definitely - yes 😊
Thanks, will move this line.
> 
> > diff --git a/lib/librte_mbuf/rte_mbuf_version.map
> > b/lib/librte_mbuf/rte_mbuf_version.map
> > index 3bbb476..ab161bc 100644
> > --- a/lib/librte_mbuf/rte_mbuf_version.map
> > +++ b/lib/librte_mbuf/rte_mbuf_version.map
> > @@ -44,5 +44,6 @@ EXPERIMENTAL {
> >  	rte_mbuf_dyn_dump;
> >  	rte_pktmbuf_copy;
> >  	rte_pktmbuf_free_bulk;
> > +	rte_pktmbuf_pool_create_extbuf;
> >
> >  };
> > --
> > 1.8.3.1
> >
With best regards, Slava
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v4 0/5] mbuf: detach mbuf with pinned external buffer
  2019-11-18  9:50 [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers Shahaf Shuler
                   ` (3 preceding siblings ...)
  2020-01-14  9:15 ` [dpdk-dev] [PATCH v3 0/4] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
@ 2020-01-16 13:04 ` Viacheslav Ovsiienko
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
                     ` (4 more replies)
  2020-01-20 17:23 ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
                   ` (3 subsequent siblings)
  8 siblings, 5 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-16 13:04 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
Today's pktmbuf pool contains only mbufs with no external buffers.
This means the data buffer for each mbuf is placed right after the
mbuf structure (plus the private data, when enabled).
In some cases, the application wants the buffers to be allocated
from a different device in the platform, so that packets can be
delivered to the device memory with zero copy. Examples of such
devices are a GPU or a storage device. The native pktmbuf pool does
not fit these cases, since each mbuf would need to point to an
external buffer.
To support the above, the pktmbuf pool is populated with mbufs pointing
to the device buffers using the mbuf external buffer feature.
The PMD populates its receive queues with those buffers, so that
every received packet is scattered directly to the device memory.
In the other direction, embedding the buffer pointer into the transmit
queues of the NIC makes the DMA engine fetch device memory
using peer-to-peer communication.
An mbuf with such an external buffer must be handled with care when it
is freed. Mainly, the external buffer must not be detached, so that it
can be reused for the next received packet.
This patch introduces a new flag in the rte_pktmbuf_pool_private
structure to specify that the mempool holds mbufs with pinned external
buffers. On detach, this flag is checked and the buffer is not detached.
A new mempool create wrapper is also introduced to help applications
create and populate such a mempool.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
RFC: http://patches.dpdk.org/patch/63077
v1: - http://patches.dpdk.org/cover/64424
v2: - fix rte_experimental issue on comment addressing
    - rte_mbuf_has_pinned_extbuf return type is uint32_t
    - fix Power9 compilation issue
v3: - http://patches.dpdk.org/cover/64424/
    - fix "#include <stdbool.h>" leftover
v4: - introduce rte_pktmbuf_priv_flags
    - support cloning pinned mbufs as for regular mbufs
      with external buffers
    - address the minor comments
Viacheslav Ovsiienko (5):
  mbuf: introduce routine to get private mbuf pool flags
  mbuf: detach mbuf with pinned external buffer
  mbuf: create packet pool with external memory buffers
  app/testpmd: add mempool with external data buffers
  net/mlx5: allow use allocated mbuf with external buffer
 app/test-pmd/config.c                    |   2 +
 app/test-pmd/flowgen.c                   |   3 +-
 app/test-pmd/parameters.c                |   2 +
 app/test-pmd/testpmd.c                   |  81 +++++++++++++
 app/test-pmd/testpmd.h                   |   4 +-
 app/test-pmd/txonly.c                    |   3 +-
 drivers/net/mlx5/mlx5_rxq.c              |   7 +-
 drivers/net/mlx5/mlx5_rxtx.c             |   2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |   2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         |  14 +--
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |   5 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    |  29 ++---
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |   2 +-
 lib/librte_mbuf/rte_mbuf.c               | 178 +++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h               | 196 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_version.map     |   1 +
 16 files changed, 487 insertions(+), 44 deletions(-)
-- 
1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v4 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-16 13:04 ` [dpdk-dev] [PATCH v4 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
@ 2020-01-16 13:04   ` Viacheslav Ovsiienko
  2020-01-20 12:16     ` Olivier Matz
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-16 13:04 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
The routine rte_pktmbuf_priv_flags() is introduced to fetch
the flags from the mbuf memory pool private structure
in a unified fashion.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 lib/librte_mbuf/rte_mbuf.h | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 219b110..e9f6fa9 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -306,6 +306,23 @@ struct rte_pktmbuf_pool_private {
 	uint32_t flags; /**< reserved for future use. */
 };
 
+/**
+ * Return the flags from private data in a mempool structure.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   The flags from the private data structure.
+ */
+static inline uint32_t
+rte_pktmbuf_priv_flags(struct rte_mempool *mp)
+{
+	struct rte_pktmbuf_pool_private *mbp_priv;
+
+	mbp_priv = (struct rte_pktmbuf_pool_private *)rte_mempool_get_priv(mp);
+	return mbp_priv->flags;
+}
+
 #ifdef RTE_LIBRTE_MBUF_DEBUG
 
 /**  check mbuf type in debug mode */
-- 
1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v4 2/5] mbuf: detach mbuf with pinned external buffer
  2020-01-16 13:04 ` [dpdk-dev] [PATCH v4 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
@ 2020-01-16 13:04   ` Viacheslav Ovsiienko
  2020-01-20 13:56     ` Olivier Matz
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-16 13:04 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
Update the detach routine to check the mbuf pool type.
Introduce a special internal version of the detach routine to handle
the special case of a pinned external buffer on mbuf freeing.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
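Illustration only, not part of the patch: the pinned-buffer branch of
__rte_pktmbuf_detach_on_free() can be modeled in plain C11 roughly as
follows. The names shinfo_model and pinned_detach_on_free are
hypothetical, and C11 atomics stand in for rte_atomic16_add_return();
this is a sketch of the decision logic under those assumptions, not the
DPDK implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_mbuf_ext_shared_info. */
struct shinfo_model {
	atomic_int refcnt;
};

/*
 * Model of the pinned branch of __rte_pktmbuf_detach_on_free().
 * Returns 0 when the backing mbuf may go back to its pool and
 * 1 when other references to the pinned buffer still exist.
 */
static int
pinned_detach_on_free(struct shinfo_model *shinfo)
{
	assert(shinfo != NULL);

	/* Fast path: sole owner, nothing to decrement or reinitialize. */
	if (atomic_load(&shinfo->refcnt) == 1)
		return 0;

	/* atomic_fetch_sub() returns the old value; old - 1 is the new one. */
	if (atomic_fetch_sub(&shinfo->refcnt, 1) - 1 != 0)
		return 1;

	/* Counter dropped to zero: rearm it before the mbuf is freed. */
	atomic_store(&shinfo->refcnt, 1);
	return 0;
}
```

With a counter of 1 the function returns 0 and leaves the counter
untouched; with a counter of 3 it returns 1 and leaves 2 behind, so the
backing mbuf stays out of the pool until the last holder frees it.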
 lib/librte_mbuf/rte_mbuf.h | 95 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 88 insertions(+), 7 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index e9f6fa9..52d57d1 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -323,6 +323,24 @@ struct rte_pktmbuf_pool_private {
 	return mbp_priv->flags;
 }
 
+/**
+ * When set, the pktmbuf mempool will hold only mbufs with pinned external
+ * buffers. The external buffer is attached to the mbuf at
+ * memory pool creation and is never detached by the mbuf free calls.
+ * The mbuf should not contain any room for data after the mbuf structure.
+ */
+#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
+
+/**
+ * Returns non-zero if the given mbuf has a pinned external buffer, or zero
+ * otherwise. The pinned external buffer is allocated at pool creation
+ * time and should not be freed on mbuf freeing.
+ *
+ * External buffer is a user-provided anonymous buffer.
+ */
+#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) \
+	(rte_pktmbuf_priv_flags((mb)->pool) & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)
+
 #ifdef RTE_LIBRTE_MBUF_DEBUG
 
 /**  check mbuf type in debug mode */
@@ -588,7 +606,8 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 static __rte_always_inline void
 rte_mbuf_raw_free(struct rte_mbuf *m)
 {
-	RTE_ASSERT(RTE_MBUF_DIRECT(m));
+	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
+		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
 	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
 	RTE_ASSERT(m->next == NULL);
 	RTE_ASSERT(m->nb_segs == 1);
@@ -794,7 +813,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
 	m->nb_segs = 1;
 	m->port = MBUF_INVALID_PORT;
 
-	m->ol_flags = 0;
+	m->ol_flags &= EXT_ATTACHED_MBUF;
 	m->packet_type = 0;
 	rte_pktmbuf_reset_headroom(m);
 
@@ -1158,11 +1177,26 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 	uint32_t mbuf_size, buf_len;
 	uint16_t priv_size;
 
-	if (RTE_MBUF_HAS_EXTBUF(m))
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		/*
+		 * The mbuf has the external attached buffer,
+		 * we should check the type of the memory pool where
+		 * the mbuf was allocated from to detect the pinned
+		 * external buffer.
+		 */
+		uint32_t flags = rte_pktmbuf_priv_flags(mp);
+
+		if (flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
+			/*
+			 * The pinned external buffer should not be
+			 * detached from its backing mbuf, just exit.
+			 */
+			return;
+		}
 		__rte_pktmbuf_free_extbuf(m);
-	else
+	} else {
 		__rte_pktmbuf_free_direct(m);
-
+	}
 	priv_size = rte_pktmbuf_priv_size(mp);
 	mbuf_size = (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
 	buf_len = rte_pktmbuf_data_room_size(mp);
@@ -1177,6 +1211,51 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 }
 
 /**
+ * @internal version of rte_pktmbuf_detach() to be used on mbuf freeing.
+ * For indirect and regular (not pinned) external mbufs the standard
+ * rte_pktmbuf_detach() is invoked; for pinned external buffer mbufs
+ * special handling is performed:
+ *
+ *  - return zero if the reference counter in shinfo is one. It means
+ *  there are no more references to this pinned buffer and the mbuf can
+ *  be returned to the pool
+ *
+ *  - otherwise (reference counter is not one), decrement the reference
+ *  counter and return a non-zero value to prevent freeing the backing mbuf.
+ *
+ * Returns non-zero if the mbuf should not be freed.
+ */
+static inline uint16_t __rte_pktmbuf_detach_on_free(struct rte_mbuf *m)
+{
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		uint32_t flags = rte_pktmbuf_priv_flags(m->pool);
+
+		if (flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
+			struct rte_mbuf_ext_shared_info *shinfo;
+
+			/* Clear flags, mbuf is being freed. */
+			m->ol_flags = EXT_ATTACHED_MBUF;
+			shinfo = m->shinfo;
+			/* Optimize for performance - do not dec/reinit */
+			if (likely(rte_mbuf_ext_refcnt_read(shinfo) == 1))
+				return 0;
+			/*
+			 * Direct usage of add primitive to avoid
+			 * duplication of comparing with one.
+			 */
+			if (likely(rte_atomic16_add_return
+					(&shinfo->refcnt_atomic, -1)))
+				return 1;
+			/* Reinitialize counter before mbuf freeing. */
+			rte_mbuf_ext_refcnt_set(shinfo, 1);
+			return 0;
+		}
+	}
+	rte_pktmbuf_detach(m);
+	return 0;
+}
+
+/**
  * Decrease reference counter and unlink a mbuf segment
  *
  * This function does the same than a free, except that it does not
@@ -1198,7 +1277,8 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
 
 		if (!RTE_MBUF_DIRECT(m))
-			rte_pktmbuf_detach(m);
+			if (__rte_pktmbuf_detach_on_free(m))
+				return NULL;
 
 		if (m->next != NULL) {
 			m->next = NULL;
@@ -1210,7 +1290,8 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 	} else if (__rte_mbuf_refcnt_update(m, -1) == 0) {
 
 		if (!RTE_MBUF_DIRECT(m))
-			rte_pktmbuf_detach(m);
+			if (__rte_pktmbuf_detach_on_free(m))
+				return NULL;
 
 		if (m->next != NULL) {
 			m->next = NULL;
-- 
1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v4 3/5] mbuf: create packet pool with external memory buffers
  2020-01-16 13:04 ` [dpdk-dev] [PATCH v4 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
@ 2020-01-16 13:04   ` Viacheslav Ovsiienko
  2020-01-20 13:59     ` Olivier Matz
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 4/5] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 5/5] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
  4 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-16 13:04 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
The dedicated routine rte_pktmbuf_pool_create_extbuf() is
provided to create an mbuf pool with data buffers located in
pinned external memory. The application provides the
external memory description and the routine initializes each
mbuf with the appropriate virtual and physical buffer addresses.
It is entirely the application's responsibility to register the
external memory with the rte_extmem_register() API, map this
memory, etc.
The newly introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
is set in the private pool structure, specifying the new special
pool type. The mbufs allocated from a pool of this kind will
have the EXT_ATTACHED_MBUF flag set and an initialized shared
info structure, allowing cloning with regular mbufs (without
attached external buffers of any kind).
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
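Illustration only, not part of the patch: the way
rte_pktmbuf_init_extmem() hands out one data buffer per mbuf while
walking the descriptor array can be modeled in plain C as below. The
types extmem_model and carve_ctx and the function carve_next are
hypothetical simplifications (no IOVA handling). Note one deliberate
difference: the sketch moves to the next descriptor as soon as another
element would not fit, which is slightly stricter than the
"off >= buf_len" test in the patch; the two agree when buf_len is a
multiple of elt_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct rte_pktmbuf_extmem (no IOVA field). */
struct extmem_model {
	char *buf_ptr;      /* start of the external segment */
	size_t buf_len;     /* segment length in bytes */
	uint16_t elt_size;  /* bytes reserved per mbuf data buffer */
};

/* Plays the role of struct rte_pktmbuf_extmem_init_ctx. */
struct carve_ctx {
	const struct extmem_model *ext_mem;
	unsigned int ext_num;
	unsigned int ext;   /* index of the current segment */
	size_t off;         /* offset inside the current segment */
};

/* Return the next data buffer, or NULL when all segments are used up. */
static char *
carve_next(struct carve_ctx *ctx)
{
	const struct extmem_model *seg;
	char *buf;

	if (ctx->ext >= ctx->ext_num)
		return NULL;
	seg = &ctx->ext_mem[ctx->ext];
	assert(seg->elt_size > 0 && seg->elt_size <= seg->buf_len);

	buf = seg->buf_ptr + ctx->off;
	ctx->off += seg->elt_size;
	/* Advance when one more element would overrun this segment. */
	if (ctx->off + seg->elt_size > seg->buf_len) {
		ctx->off = 0;
		ctx->ext++;
	}
	return buf;
}
```

For two segments of 10 and 8 bytes with elt_size 4, successive calls
yield offsets 0 and 4 of the first segment, then offsets 0 and 4 of the
second, then NULL, matching the buf_len / elt_size count that
rte_pktmbuf_pool_create_extbuf() sums up when checking n.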
 lib/librte_mbuf/rte_mbuf.c           | 178 ++++++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h           |  84 +++++++++++++++++
 lib/librte_mbuf/rte_mbuf_version.map |   1 +
 3 files changed, 260 insertions(+), 3 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8fa7f49..b9d89d0 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -59,9 +59,12 @@
 	}
 
 	RTE_ASSERT(mp->elt_size >= sizeof(struct rte_mbuf) +
-		user_mbp_priv->mbuf_data_room_size +
+		((user_mbp_priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
+			sizeof(struct rte_mbuf_ext_shared_info) :
+			user_mbp_priv->mbuf_data_room_size) +
 		user_mbp_priv->mbuf_priv_size);
-	RTE_ASSERT(user_mbp_priv->flags == 0);
+	RTE_ASSERT((user_mbp_priv->flags &
+		    ~RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) == 0);
 
 	mbp_priv = rte_mempool_get_priv(mp);
 	memcpy(mbp_priv, user_mbp_priv, sizeof(*mbp_priv));
@@ -107,6 +110,89 @@
 	m->next = NULL;
 }
 
+/*
+ * @internal
+ * The callback routine called when the reference counter in shinfo for mbufs
+ * with pinned external buffer reaches zero. It means there are no more
+ * references to the buffer backing the mbuf, so the mbuf should be freed.
+ */
+static void rte_pktmbuf_free_pinned_extmem(void *addr, void *opaque)
+{
+	struct rte_mbuf *m = opaque;
+
+	RTE_SET_USED(addr);
+	RTE_ASSERT(RTE_MBUF_HAS_EXTBUF(m));
+	RTE_ASSERT(RTE_MBUF_HAS_PINNED_EXTBUF(m));
+	RTE_ASSERT(m->shinfo->fcb_opaque == m);
+
+	rte_mbuf_ext_refcnt_set(m->shinfo, 1);
+	rte_pktmbuf_free_seg(m);
+}
+
+/*
+ * pktmbuf constructor for the pool with pinned external buffer,
+ * given as a callback function to rte_mempool_obj_iter() in
+ * rte_pktmbuf_pool_create_extbuf(). Set the fields of a packet
+ * mbuf to their default values.
+ */
+void
+rte_pktmbuf_init_extmem(struct rte_mempool *mp,
+			void *opaque_arg,
+			void *_m,
+			__attribute__((unused)) unsigned int i)
+{
+	struct rte_mbuf *m = _m;
+	struct rte_pktmbuf_extmem_init_ctx *ctx = opaque_arg;
+	struct rte_pktmbuf_extmem *ext_mem;
+	uint32_t mbuf_size, buf_len, priv_size;
+	struct rte_mbuf_ext_shared_info *shinfo;
+
+	priv_size = rte_pktmbuf_priv_size(mp);
+	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
+	buf_len = rte_pktmbuf_data_room_size(mp);
+
+	RTE_ASSERT(RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) == priv_size);
+	RTE_ASSERT(mp->elt_size >= mbuf_size);
+	RTE_ASSERT(buf_len <= UINT16_MAX);
+
+	memset(m, 0, mbuf_size);
+	m->priv_size = priv_size;
+	m->buf_len = (uint16_t)buf_len;
+
+	/* set the data buffer pointers to external memory */
+	ext_mem = ctx->ext_mem + ctx->ext;
+
+	RTE_ASSERT(ctx->ext < ctx->ext_num);
+	RTE_ASSERT(ctx->off < ext_mem->buf_len);
+
+	m->buf_addr = RTE_PTR_ADD(ext_mem->buf_ptr, ctx->off);
+	m->buf_iova = ext_mem->buf_iova == RTE_BAD_IOVA ?
+		      RTE_BAD_IOVA : (ext_mem->buf_iova + ctx->off);
+
+	ctx->off += ext_mem->elt_size;
+	if (ctx->off >= ext_mem->buf_len) {
+		ctx->off = 0;
+		++ctx->ext;
+	}
+	/* keep some headroom between start of buffer and data */
+	m->data_off = RTE_MIN(RTE_PKTMBUF_HEADROOM, (uint16_t)m->buf_len);
+
+	/* init some constant fields */
+	m->pool = mp;
+	m->nb_segs = 1;
+	m->port = MBUF_INVALID_PORT;
+	m->ol_flags = EXT_ATTACHED_MBUF;
+	rte_mbuf_refcnt_set(m, 1);
+	m->next = NULL;
+
+	/* init external buffer shared info items */
+	shinfo = RTE_PTR_ADD(m, mbuf_size);
+	m->shinfo = shinfo;
+	shinfo->free_cb = rte_pktmbuf_free_pinned_extmem;
+	shinfo->fcb_opaque = m;
+	rte_mbuf_ext_refcnt_set(shinfo, 1);
+}
+
 /* Helper to create a mbuf pool with given mempool ops name*/
 struct rte_mempool *
 rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
@@ -169,6 +255,92 @@ struct rte_mempool *
 			data_room_size, socket_id, NULL);
 }
 
+/* Helper to create a mbuf pool with pinned external data buffers. */
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num)
+{
+	struct rte_mempool *mp;
+	struct rte_pktmbuf_pool_private mbp_priv;
+	struct rte_pktmbuf_extmem_init_ctx init_ctx;
+	const char *mp_ops_name;
+	unsigned int elt_size;
+	unsigned int i, n_elts = 0;
+	int ret;
+
+	if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
+		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
+			priv_size);
+		rte_errno = EINVAL;
+		return NULL;
+	}
+	/* Check the external memory descriptors. */
+	for (i = 0; i < ext_num; i++) {
+		struct rte_pktmbuf_extmem *extm = ext_mem + i;
+
+		if (!extm->elt_size || !extm->buf_len || !extm->buf_ptr) {
+			RTE_LOG(ERR, MBUF, "invalid extmem descriptor\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		if (data_room_size > extm->elt_size) {
+			RTE_LOG(ERR, MBUF, "ext elt_size=%u is too small\n",
+				extm->elt_size);
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		n_elts += extm->buf_len / extm->elt_size;
+	}
+	/* Check whether enough external memory provided. */
+	if (n_elts < n) {
+		RTE_LOG(ERR, MBUF, "not enough extmem\n");
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+	elt_size = sizeof(struct rte_mbuf) +
+		   (unsigned int)priv_size +
+		   sizeof(struct rte_mbuf_ext_shared_info);
+
+	memset(&mbp_priv, 0, sizeof(mbp_priv));
+	mbp_priv.mbuf_data_room_size = data_room_size;
+	mbp_priv.mbuf_priv_size = priv_size;
+	mbp_priv.flags = RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
+
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		 sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
+	if (mp == NULL)
+		return NULL;
+
+	mp_ops_name = rte_mbuf_best_mempool_ops();
+	ret = rte_mempool_set_ops_byname(mp, mp_ops_name, NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+	rte_pktmbuf_pool_init(mp, &mbp_priv);
+
+	ret = rte_mempool_populate_default(mp);
+	if (ret < 0) {
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+
+	init_ctx = (struct rte_pktmbuf_extmem_init_ctx){
+		.ext_mem = ext_mem,
+		.ext_num = ext_num,
+		.ext = 0,
+		.off = 0,
+	};
+	rte_mempool_obj_iter(mp, rte_pktmbuf_init_extmem, &init_ctx);
+
+	return mp;
+}
+
 /* do some sanity checks on a mbuf: panic if it fails */
 void
 rte_mbuf_sanity_check(const struct rte_mbuf *m, int is_header)
@@ -247,7 +419,7 @@ int rte_mbuf_check(const struct rte_mbuf *m, int is_header,
 	return 0;
 }
 
-/**
+/*
  * @internal helper function for freeing a bulk of packet mbuf segments
  * via an array holding the packet mbuf segments from the same mempool
  * pending to be freed.
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 52d57d1..093210e 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -637,6 +637,34 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 void rte_pktmbuf_init(struct rte_mempool *mp, void *opaque_arg,
 		      void *m, unsigned i);
 
+/** The context to initialize the mbufs with pinned external buffers. */
+struct rte_pktmbuf_extmem_init_ctx {
+	struct rte_pktmbuf_extmem *ext_mem; /* pointer to descriptor array. */
+	unsigned int ext_num; /* number of descriptors in array. */
+	unsigned int ext; /* loop descriptor index. */
+	size_t off; /* loop buffer offset. */
+};
+
+/**
+ * The packet mbuf constructor for pools with pinned external memory.
+ *
+ * This function initializes some fields in the mbuf structure that are
+ * not modified by the user once created (origin pool, buffer start
+ * address, and so on). This function is given as a callback function to
+ * rte_mempool_obj_iter() called from rte_pktmbuf_pool_create_extbuf().
+ *
+ * @param mp
+ *   The mempool from which mbufs originate.
+ * @param opaque_arg
+ *   A pointer to the rte_pktmbuf_extmem_init_ctx - initialization
+ *   context structure
+ * @param m
+ *   The mbuf to initialize.
+ * @param i
+ *   The index of the mbuf in the pool table.
+ */
+void rte_pktmbuf_init_extmem(struct rte_mempool *mp, void *opaque_arg,
+			     void *m, unsigned int i);
 
 /**
  * A  packet mbuf pool constructor.
@@ -738,6 +766,62 @@ struct rte_mempool *
 	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
 	int socket_id, const char *ops_name);
 
+/** A structure that describes the pinned external buffer segment. */
+struct rte_pktmbuf_extmem {
+	void *buf_ptr;		/**< The virtual address of data buffer. */
+	rte_iova_t buf_iova;	/**< The IO address of the data buffer. */
+	size_t buf_len;		/**< External buffer length in bytes. */
+	uint16_t elt_size;	/**< mbuf element size in bytes. */
+};
+
+/**
+ * Create a mbuf pool with external pinned data buffers.
+ *
+ * This function creates and initializes a packet mbuf pool that contains
+ * only mbufs with external buffer. It is a wrapper to rte_mempool functions.
+ *
+ * @param name
+ *   The name of the mbuf pool.
+ * @param n
+ *   The number of elements in the mbuf pool. The optimum size (in terms
+ *   of memory usage) for a mempool is when n is a power of two minus one:
+ *   n = (2^q - 1).
+ * @param cache_size
+ *   Size of the per-core object cache. See rte_mempool_create() for
+ *   details.
+ * @param priv_size
+ *   Size of application private area between the rte_mbuf structure
+ *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
+ * @param data_room_size
+ *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
+ * @param socket_id
+ *   The socket identifier where the memory should be allocated. The
+ *   value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the
+ *   reserved zone.
+ * @param ext_mem
+ *   Pointer to the array of structures describing the external memory
+ *   for data buffers. It is the caller's responsibility to register this
+ *   memory with rte_extmem_register() (if needed), map this memory to the
+ *   appropriate physical device, etc.
+ * @param ext_num
+ *   Number of elements in the ext_mem array.
+ * @return
+ *   The pointer to the newly allocated mempool, on success. NULL on error
+ *   with rte_errno set appropriately. Possible rte_errno values include:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ */
+__rte_experimental
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num);
+
 /**
  * Get the data room size of mbufs stored in a pktmbuf_pool
  *
diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
index 3bbb476..ab161bc 100644
--- a/lib/librte_mbuf/rte_mbuf_version.map
+++ b/lib/librte_mbuf/rte_mbuf_version.map
@@ -44,5 +44,6 @@ EXPERIMENTAL {
 	rte_mbuf_dyn_dump;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
+	rte_pktmbuf_pool_create_extbuf;
 
 };
-- 
1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v4 4/5] app/testpmd: add mempool with external data buffers
  2020-01-16 13:04 ` [dpdk-dev] [PATCH v4 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
@ 2020-01-16 13:04   ` Viacheslav Ovsiienko
  2020-01-20 14:11     ` Olivier Matz
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 5/5] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
  4 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-16 13:04 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
A new mbuf pool type is added to testpmd. To engage the
mbuf pool with externally attached data buffers, the parameter
"--mp-alloc=xbuf" should be specified on the testpmd command line.
The objective of this patch is just to test whether an mbuf pool
with externally attached data buffers works OK. The memory for the
data buffers is allocated from DPDK memory, so this is not
"true" external memory from some physical device (which is
supposed to be the most common use case for this kind of mbuf pool).
The user should be aware that not all drivers support mbufs
with the EXT_ATTACHED_MBUF flag set in a newly allocated mbuf (many
PMDs just overwrite the ol_flags field and the flag value gets
lost).
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
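Illustration only, not part of the patch: the flowgen and txonly
changes below replace a plain assignment to ol_flags with a mask and an
OR, so that the attachment bit of a pinned mbuf survives. A minimal
plain C model of that update follows. MODEL_EXT_ATTACHED_MBUF and
set_tx_flags are hypothetical names, and the bit position is an
assumption made for the example; real code should use the
EXT_ATTACHED_MBUF definition from the mbuf headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define MODEL_EXT_ATTACHED_MBUF (1ULL << 61)

/*
 * Compute the new ol_flags value for a packet about to be sent:
 * keep only the attachment bit of the current flags, then add the
 * per-packet Tx offload flags.
 */
static uint64_t
set_tx_flags(uint64_t cur_flags, uint64_t tx_flags)
{
	/* The caller must not pass the attachment bit as a Tx flag. */
	assert((tx_flags & MODEL_EXT_ATTACHED_MBUF) == 0);

	return (cur_flags & MODEL_EXT_ATTACHED_MBUF) | tx_flags;
}
```

A driver or application that writes "ol_flags = tx_flags" instead would
clear the attachment bit, and the free path would then treat the mbuf
as a direct one.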
 app/test-pmd/config.c     |  2 ++
 app/test-pmd/flowgen.c    |  3 +-
 app/test-pmd/parameters.c |  2 ++
 app/test-pmd/testpmd.c    | 81 +++++++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/testpmd.h    |  4 ++-
 app/test-pmd/txonly.c     |  3 +-
 6 files changed, 92 insertions(+), 3 deletions(-)
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 9da1ffb..5c6fe18 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2395,6 +2395,8 @@ struct igb_ring_desc_16_bytes {
 		return "xmem";
 	case MP_ALLOC_XMEM_HUGE:
 		return "xmemhuge";
+	case MP_ALLOC_XBUF:
+		return "xbuf";
 	default:
 		return "invalid";
 	}
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index 03b72aa..ae50cdc 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -199,7 +199,8 @@
 							   sizeof(*ip_hdr));
 		pkt->nb_segs		= 1;
 		pkt->pkt_len		= pkt_size;
-		pkt->ol_flags		= ol_flags;
+		pkt->ol_flags		&= EXT_ATTACHED_MBUF;
+		pkt->ol_flags		|= ol_flags;
 		pkt->vlan_tci		= vlan_tci;
 		pkt->vlan_tci_outer	= vlan_tci_outer;
 		pkt->l2_len		= sizeof(struct rte_ether_hdr);
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 2e7a504..6340104 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -841,6 +841,8 @@
 					mp_alloc_type = MP_ALLOC_XMEM;
 				else if (!strcmp(optarg, "xmemhuge"))
 					mp_alloc_type = MP_ALLOC_XMEM_HUGE;
+				else if (!strcmp(optarg, "xbuf"))
+					mp_alloc_type = MP_ALLOC_XBUF;
 				else
 					rte_exit(EXIT_FAILURE,
 						"mp-alloc %s invalid - must be: "
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 2eec8af..5f910ba 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,7 @@
 #endif
 
 #define EXTMEM_HEAP_NAME "extmem"
+#define EXTBUF_ZONE_SIZE RTE_PGSIZE_2M
 
 uint16_t verbose_level = 0; /**< Silent by default. */
 int testpmd_logtype; /**< Log type for testpmd logs */
@@ -865,6 +866,66 @@ struct extmem_param {
 	}
 }
 
+static unsigned int
+setup_extbuf(uint32_t nb_mbufs, uint16_t mbuf_sz, unsigned int socket_id,
+	    char *pool_name, struct rte_pktmbuf_extmem **ext_mem)
+{
+	struct rte_pktmbuf_extmem *xmem;
+	unsigned int ext_num, zone_num, elt_num;
+	uint16_t elt_size;
+
+	elt_size = RTE_ALIGN_CEIL(mbuf_sz, RTE_CACHE_LINE_SIZE);
+	elt_num = EXTBUF_ZONE_SIZE / elt_size;
+	zone_num = (nb_mbufs + elt_num - 1) / elt_num;
+
+	xmem = malloc(sizeof(struct rte_pktmbuf_extmem) * zone_num);
+	if (xmem == NULL) {
+		TESTPMD_LOG(ERR, "Cannot allocate memory for "
+				 "external buffer descriptors\n");
+		*ext_mem = NULL;
+		return 0;
+	}
+	for (ext_num = 0; ext_num < zone_num; ext_num++) {
+		struct rte_pktmbuf_extmem *xseg = xmem + ext_num;
+		const struct rte_memzone *mz;
+		char mz_name[RTE_MEMZONE_NAMESIZE];
+		int ret;
+
+		ret = snprintf(mz_name, sizeof(mz_name),
+			RTE_MEMPOOL_MZ_FORMAT "_xb_%u", pool_name, ext_num);
+		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+			errno = ENAMETOOLONG;
+			ext_num = 0;
+			break;
+		}
+		mz = rte_memzone_reserve_aligned(mz_name, EXTBUF_ZONE_SIZE,
+						 socket_id,
+						 RTE_MEMZONE_IOVA_CONTIG |
+						 RTE_MEMZONE_1GB |
+						 RTE_MEMZONE_SIZE_HINT_ONLY,
+						 EXTBUF_ZONE_SIZE);
+		if (mz == NULL) {
+			/*
+			 * The caller exits on external buffer creation
+			 * error, so there is no need to free memzones.
+			 */
+			errno = ENOMEM;
+			ext_num = 0;
+			break;
+		}
+		xseg->buf_ptr = mz->addr;
+		xseg->buf_iova = mz->iova;
+		xseg->buf_len = EXTBUF_ZONE_SIZE;
+		xseg->elt_size = elt_size;
+	}
+	if (ext_num == 0 && xmem != NULL) {
+		free(xmem);
+		xmem = NULL;
+	}
+	*ext_mem = xmem;
+	return ext_num;
+}
+
 /*
  * Configuration initialisation done once at init time.
  */
@@ -933,6 +994,26 @@ struct extmem_param {
 					heap_socket);
 			break;
 		}
+	case MP_ALLOC_XBUF:
+		{
+			struct rte_pktmbuf_extmem *ext_mem;
+			unsigned int ext_num;
+
+			ext_num = setup_extbuf(nb_mbuf,	mbuf_seg_size,
+					       socket_id, pool_name, &ext_mem);
+			if (ext_num == 0)
+				rte_exit(EXIT_FAILURE,
+					 "Can't create pinned data buffers\n");
+
+			TESTPMD_LOG(INFO, "preferred mempool ops selected: %s\n",
+					rte_mbuf_best_mempool_ops());
+			rte_mp = rte_pktmbuf_pool_create_extbuf
+					(pool_name, nb_mbuf, mb_mempool_cache,
+					 0, mbuf_seg_size, socket_id,
+					 ext_mem, ext_num);
+			free(ext_mem);
+			break;
+		}
 	default:
 		{
 			rte_exit(EXIT_FAILURE, "Invalid mempool creation mode\n");
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 857a11f..a47f214 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -76,8 +76,10 @@ enum {
 	/**< allocate mempool natively, but populate using anonymous memory */
 	MP_ALLOC_XMEM,
 	/**< allocate and populate mempool using anonymous memory */
-	MP_ALLOC_XMEM_HUGE
+	MP_ALLOC_XMEM_HUGE,
 	/**< allocate and populate mempool using anonymous hugepage memory */
+	MP_ALLOC_XBUF
+	/**< allocate mempool natively, use rte_pktmbuf_pool_create_extbuf */
 };
 
 #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 3caf281..871cf6c 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -170,7 +170,8 @@
 
 	rte_pktmbuf_reset_headroom(pkt);
 	pkt->data_len = tx_pkt_seg_lengths[0];
-	pkt->ol_flags = ol_flags;
+	pkt->ol_flags &= EXT_ATTACHED_MBUF;
+	pkt->ol_flags |= ol_flags;
 	pkt->vlan_tci = vlan_tci;
 	pkt->vlan_tci_outer = vlan_tci_outer;
 	pkt->l2_len = sizeof(struct rte_ether_hdr);
-- 
1.8.3.1
* [dpdk-dev] [PATCH v4 5/5] net/mlx5: allow use allocated mbuf with external buffer
  2020-01-16 13:04 ` [dpdk-dev] [PATCH v4 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
                     ` (3 preceding siblings ...)
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 4/5] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
@ 2020-01-16 13:04   ` Viacheslav Ovsiienko
  4 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-16 13:04 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
In the Rx datapath, the flags in newly allocated mbufs
are all explicitly cleared, but EXT_ATTACHED_MBUF must be
preserved. This allows the use of mbuf pools with pre-attached
external data buffers.
The vectorized rx_burst routines are updated to inherit
EXT_ATTACHED_MBUF from the RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
flag in the mbuf pool private data.
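The flag handling described above can be sketched in isolation. This is a minimal model, not the driver code: the SKETCH_* names and the bit positions are assumptions made for illustration, and only the masking idiom (AND with the attachment bit instead of assigning zero) is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions, assumed for this sketch only. */
#define SKETCH_EXT_ATTACHED (1ULL << 61)
#define SKETCH_RX_VLAN      (1ULL << 0)

/*
 * Reset the offload flags of a freshly received mbuf.
 * Every flag is cleared except the external-buffer attachment bit,
 * which survives only if it was already set.
 */
static inline uint64_t
sketch_reset_rx_flags(uint64_t ol_flags)
{
	return ol_flags & SKETCH_EXT_ATTACHED;
}
```

An mbuf from a regular pool never has the attachment bit set, so for it the result is the same as the previous "ol_flags = 0" assignment.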
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 drivers/net/mlx5/mlx5_rxq.c              |  7 ++++++-
 drivers/net/mlx5/mlx5_rxtx.c             |  2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |  2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         | 14 ++++----------
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |  5 ++---
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 29 +++++++++++++++--------------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |  2 +-
 7 files changed, 30 insertions(+), 31 deletions(-)
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index ca25e32..c87ce15 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -225,6 +225,9 @@
 	if (mlx5_rxq_check_vec_support(&rxq_ctrl->rxq) > 0) {
 		struct mlx5_rxq_data *rxq = &rxq_ctrl->rxq;
 		struct rte_mbuf *mbuf_init = &rxq->fake_mbuf;
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(rxq_ctrl->rxq.mp);
 		int j;
 
 		/* Initialize default rearm_data for vPMD. */
@@ -232,13 +235,15 @@
 		rte_mbuf_refcnt_set(mbuf_init, 1);
 		mbuf_init->nb_segs = 1;
 		mbuf_init->port = rxq->port_id;
+		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)
+			mbuf_init->ol_flags = EXT_ATTACHED_MBUF;
 		/*
 		 * prevent compiler reordering:
 		 * rearm_data covers previous fields.
 		 */
 		rte_compiler_barrier();
 		rxq->mbuf_initializer =
-			*(uint64_t *)&mbuf_init->rearm_data;
+			*(rte_xmm_t *)&mbuf_init->rearm_data;
 		/* Padding with a fake mbuf for vectorized Rx. */
 		for (j = 0; j < MLX5_VPMD_DESCS_PER_LOOP; ++j)
 			(*rxq->elts)[elts_n + j] = &rxq->fake_mbuf;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 67cafd1..5e31f01 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1337,7 +1337,7 @@ enum mlx5_txcmp_code {
 			}
 			pkt = seg;
 			assert(len >= (rxq->crc_present << 2));
-			pkt->ol_flags = 0;
+			pkt->ol_flags &= EXT_ATTACHED_MBUF;
 			/* If compressed, take hash result from mini-CQE. */
 			rss_hash_res = rte_be_to_cpu_32(mcqe == NULL ?
 							cqe->rx_hash_res :
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index e362b4a..24fa038 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -144,7 +144,7 @@ struct mlx5_rxq_data {
 	struct mlx5_mprq_buf *mprq_repl; /* Stashed mbuf for replenish. */
 	uint16_t idx; /* Queue index. */
 	struct mlx5_rxq_stats stats;
-	uint64_t mbuf_initializer; /* Default rearm_data for vectorized Rx. */
+	rte_xmm_t mbuf_initializer; /* Default rearm/flags for vectorized Rx. */
 	struct rte_mbuf fake_mbuf; /* elts padding for vectorized Rx. */
 	void *cq_uar; /* CQ user access region. */
 	uint32_t cqn; /* CQ number. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index 85e0bd5..d8c07f2 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -97,18 +97,12 @@
 		void *buf_addr;
 
 		/*
-		 * Load the virtual address for Rx WQE. non-x86 processors
-		 * (mostly RISC such as ARM and Power) are more vulnerable to
-		 * load stall. For x86, reducing the number of instructions
-		 * seems to matter most.
+		 * In order to support the mbufs with external attached
+		 * data buffer we should use the buf_addr pointer instead of
+		 * rte_mbuf_buf_addr(). It touches the mbuf itself and may
+		 * impact the performance.
 		 */
-#ifdef RTE_ARCH_X86_64
 		buf_addr = elts[i]->buf_addr;
-		assert(buf_addr == rte_mbuf_buf_addr(elts[i], rxq->mp));
-#else
-		buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
-		assert(buf_addr == elts[i]->buf_addr);
-#endif
 		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
 					      RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKey in WQE. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 8e79883..9e5c6ee 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -344,9 +344,8 @@
 		PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 		PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED};
 	const vector unsigned char mbuf_init =
-		(vector unsigned char)(vector unsigned long){
-		*(__attribute__((__aligned__(8))) unsigned long *)
-		&rxq->mbuf_initializer, 0LL};
+		(vector unsigned char)vec_vsx_ld
+			(0, (vector unsigned char *)&rxq->mbuf_initializer);
 	const vector unsigned short rearm_sel_mask =
 		(vector unsigned short){0, 0, 0, 0, 0xffff, 0xffff, 0, 0};
 	vector unsigned char rearm0, rearm1, rearm2, rearm3;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 86785c7..332e9ac 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -264,8 +264,8 @@
 	const uint32x4_t cv_mask =
 		vdupq_n_u32(PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			    PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
-	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
-	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
+	const uint64x2_t mbuf_init = vld1q_u64
+				((const uint64_t *)&rxq->mbuf_initializer);
 	uint64x2_t rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
@@ -326,18 +326,19 @@
 	/* Merge to ol_flags. */
 	ol_flags = vorrq_u32(ol_flags, cv_flags);
 	/* Merge mbuf_init and ol_flags, and store. */
-	rearm0 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_high_u64(vreinterpretq_u64_u32(
-						       ol_flags)), 32));
-	rearm1 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_high_u64(vreinterpretq_u64_u32(
-						     ol_flags)), r32_mask));
-	rearm2 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_low_u64(vreinterpretq_u64_u32(
-						      ol_flags)), 32));
-	rearm3 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_low_u64(vreinterpretq_u64_u32(
-						    ol_flags)), r32_mask));
+	rearm0 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 3),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm1 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 2),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm2 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 1),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm3 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 0),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+
 	vst1q_u64((void *)&pkts[0]->rearm_data, rearm0);
 	vst1q_u64((void *)&pkts[1]->rearm_data, rearm1);
 	vst1q_u64((void *)&pkts[2]->rearm_data, rearm2);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 35b7761..07d40d5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -259,7 +259,7 @@
 			      PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			      PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
 	const __m128i mbuf_init =
-		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
+		_mm_load_si128((__m128i *)&rxq->mbuf_initializer);
 	__m128i rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
-- 
1.8.3.1
* Re: [dpdk-dev] [PATCH v4 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
@ 2020-01-20 12:16     ` Olivier Matz
  0 siblings, 0 replies; 77+ messages in thread
From: Olivier Matz @ 2020-01-20 12:16 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, orika, shahafs, stephen, thomas
On Thu, Jan 16, 2020 at 01:04:25PM +0000, Viacheslav Ovsiienko wrote:
> The routine rte_pktmbuf_priv_flags is introduced to fetch
> the flags from the mbuf memory pool private structure
> in unified fashion.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
* Re: [dpdk-dev] [PATCH v4 2/5] mbuf: detach mbuf with pinned external buffer
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
@ 2020-01-20 13:56     ` Olivier Matz
  2020-01-20 15:41       ` Slava Ovsiienko
  0 siblings, 1 reply; 77+ messages in thread
From: Olivier Matz @ 2020-01-20 13:56 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, orika, shahafs, stephen, thomas
Hi Slava,
Some comments inline.
On Thu, Jan 16, 2020 at 01:04:26PM +0000, Viacheslav Ovsiienko wrote:
> Update detach routine to check the mbuf pool type.
> Introduce the special internal version of detach routine to handle
> the special case of pinned external buffer on mbuf freeing.
> 
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
>  lib/librte_mbuf/rte_mbuf.h | 95 ++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 88 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index e9f6fa9..52d57d1 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -323,6 +323,24 @@ struct rte_pktmbuf_pool_private {
>  	return mbp_priv->flags;
>  }
>  
> +/**
> + * When set pktmbuf mempool will hold only mbufs with pinned external
A few minor doc enhancements:
-When set pktmbuf...
+When set, pktmbuf...
> + * buffer. The external buffer will be attached on the mbuf at the
-attached on
+attached to
> + * memory pool creation and will never be detached by the mbuf free calls.
> + * mbuf should not contain any room for data after the mbuf structure.
> + */
> +#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
> +
> +/**
> + * Returns non zero if given mbuf has an pinned external buffer, or zero
-an
+a
> + * otherwise. The pinned external buffer is allocated at pool creation
> + * time and should not be freed on mbuf freeing.
> + *
> + * External buffer is a user-provided anonymous buffer.
> + */
> +#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) \
> +	(rte_pktmbuf_priv_flags(mb->pool) & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)
> +
>  #ifdef RTE_LIBRTE_MBUF_DEBUG
>  
>  /**  check mbuf type in debug mode */
> @@ -588,7 +606,8 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
>  static __rte_always_inline void
>  rte_mbuf_raw_free(struct rte_mbuf *m)
>  {
> -	RTE_ASSERT(RTE_MBUF_DIRECT(m));
> +	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
> +		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
>  	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
>  	RTE_ASSERT(m->next == NULL);
>  	RTE_ASSERT(m->nb_segs == 1);
> @@ -794,7 +813,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
>  	m->nb_segs = 1;
>  	m->port = MBUF_INVALID_PORT;
>  
> -	m->ol_flags = 0;
> +	m->ol_flags &= EXT_ATTACHED_MBUF;
>  	m->packet_type = 0;
>  	rte_pktmbuf_reset_headroom(m);
>  
> @@ -1158,11 +1177,26 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
>  	uint32_t mbuf_size, buf_len;
>  	uint16_t priv_size;
>  
> -	if (RTE_MBUF_HAS_EXTBUF(m))
> +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> +		/*
> +		 * The mbuf has the external attached buffer,
> +		 * we should check the type of the memory pool where
> +		 * the mbuf was allocated from to detect the pinned
> +		 * external buffer.
> +		 */
> +		uint32_t flags = rte_pktmbuf_priv_flags(mp);
> +
> +		if (flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
> +			/*
> +			 * The pinned external buffer should not be
> +			 * detached from its backing mbuf, just exit.
> +			 */
> +			return;
> +		}
>  		__rte_pktmbuf_free_extbuf(m);
> -	else
> +	} else {
>  		__rte_pktmbuf_free_direct(m);
> -
> +	}
This new behavior could be documented in the API of detach(). I mean
something saying that an ext mem pinned mbuf cannot be detached, and
the function will do nothing.
>  	priv_size = rte_pktmbuf_priv_size(mp);
>  	mbuf_size = (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
>  	buf_len = rte_pktmbuf_data_room_size(mp);
> @@ -1177,6 +1211,51 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
>  }
>  
>  /**
> + * @internal version of rte_pktmbuf_detach() to be used on mbuf freeing.
-version
+Version
> + * For indirect and regular (not pinned) external mbufs the standard
> + * rte_pktmbuf is involved, for pinned external buffer mbufs the special
> + * handling is performed:
Sorry, it is not very clear to me, especially what "the standard
rte_pktmbuf is involved" means.
> + *
> + *  - return zero if reference counter in shinfo is one. It means there is
> + *  no more references to this pinned buffer and mbuf can be returned to
-references
+reference
> + *  the pool
> + *
> + *  - otherwise (if reference counter is not one), decrement reference
> + *  counter and return non-zero value to prevent freeing the backing mbuf.
> + *
> + * Returns non zero if mbuf should not be freed.
> + */
> +static inline uint16_t __rte_pktmbuf_detach_on_free(struct rte_mbuf *m)
I think int would be better than uint16_t
> +{
> +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> +		uint32_t flags = rte_pktmbuf_priv_flags(m->pool);
> +
> +		if (flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
> +			struct rte_mbuf_ext_shared_info *shinfo;
> +
> +			/* Clear flags, mbuf is being freed. */
> +			m->ol_flags = EXT_ATTACHED_MBUF;
> +			shinfo = m->shinfo;
> +			/* Optimize for performance - do not dec/reinit */
> +			if (likely(rte_mbuf_ext_refcnt_read(shinfo) == 1))
> +				return 0;
> +			/*
> +			 * Direct usage of add primitive to avoid
> +			 * duplication of comparing with one.
> +			 */
> +			if (likely(rte_atomic16_add_return
> +					(&shinfo->refcnt_atomic, -1)))
> +				return 1;
> +			/* Reinitialize counter before mbuf freeing. */
> +			rte_mbuf_ext_refcnt_set(shinfo, 1);
> +			return 0;
> +		}
> +	}
> +	rte_pktmbuf_detach(m);
> +	return 0;
> +}
I don't think the API comment really reflects what is done in this
function. In my understanding, the detach() operation does nothing
on an extmem pinned mbuf. So detach() is probably not the proper name.
What about something like this instead:
/* [...].
 *  assume m is pinned to external memory */
static inline int
__rte_pktmbuf_pinned_ext_buf_decref(struct rte_mbuf *m)
{
	struct rte_mbuf_ext_shared_info *shinfo;
	/* Clear flags, mbuf is being freed. */
	m->ol_flags = EXT_ATTACHED_MBUF;
	shinfo = m->shinfo;
	/* Optimize for performance - do not dec/reinit */
	if (likely(rte_mbuf_ext_refcnt_read(shinfo) == 1))
		return 0;
	/*
	 * Direct usage of add primitive to avoid
	 * duplication of comparing with one.
	 */
	if (likely(rte_atomic16_add_return
			(&shinfo->refcnt_atomic, -1)))
		return 1;
	/* Reinitialize counter before mbuf freeing. */
	rte_mbuf_ext_refcnt_set(shinfo, 1);
	return 0;
}
static __rte_always_inline struct rte_mbuf *
rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
{
	__rte_mbuf_sanity_check(m, 0);
	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
		if (!RTE_MBUF_DIRECT(m)) {
			if (!RTE_MBUF_HAS_PINNED_EXTBUF(m))
				rte_pktmbuf_detach(m);
			else if (__rte_pktmbuf_pinned_ext_buf_decref(m))
				return NULL;
		}
		...
	... (and same below) ...
(just quickly tested)
The other advantage is that we don't call rte_pktmbuf_detach() where it is
not needed.
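The return-value contract of the proposed decref helper can be modelled without DPDK. The sketch below is an assumption-laden stand-in: struct sketch_shinfo and sketch_pinned_decref() are hypothetical names, and C11 atomics replace rte_atomic16_add_return(). Only the three-step logic (fast path, decrement, re-arm) follows the suggestion above.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for the shared info of a pinned external buffer (not the DPDK type). */
struct sketch_shinfo {
	atomic_int refcnt;
};

/*
 * Drop one reference on a pinned external buffer.
 * Returns 0 when the caller held the last reference, so the backing mbuf
 * may go back to its pool with the counter left at 1 for reuse.
 * Returns non-zero when other references remain.
 */
static inline int
sketch_pinned_decref(struct sketch_shinfo *shinfo)
{
	/* Fast path: sole owner, no atomic read-modify-write needed. */
	if (atomic_load(&shinfo->refcnt) == 1)
		return 0;
	/* atomic_fetch_sub() returns the old value; old - 1 is the new count. */
	if (atomic_fetch_sub(&shinfo->refcnt, 1) - 1 != 0)
		return 1;
	/* Count reached zero: re-arm to 1 before the mbuf is reused. */
	atomic_store(&shinfo->refcnt, 1);
	return 0;
}
```

The zero branch is only reachable when two holders race past the fast-path check, which is why the counter has to be re-armed there rather than left at zero.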
> +
> +/**
>   * Decrease reference counter and unlink a mbuf segment
>   *
>   * This function does the same than a free, except that it does not
> @@ -1198,7 +1277,8 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
>  	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
>  
>  		if (!RTE_MBUF_DIRECT(m))
> -			rte_pktmbuf_detach(m);
> +			if (__rte_pktmbuf_detach_on_free(m))
> +				return NULL;
>  
>  		if (m->next != NULL) {
>  			m->next = NULL;
> @@ -1210,7 +1290,8 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
>  	} else if (__rte_mbuf_refcnt_update(m, -1) == 0) {
>  
>  		if (!RTE_MBUF_DIRECT(m))
> -			rte_pktmbuf_detach(m);
> +			if (__rte_pktmbuf_detach_on_free(m))
> +				return NULL;
>  
>  		if (m->next != NULL) {
>  			m->next = NULL;
> -- 
> 1.8.3.1
> 
* Re: [dpdk-dev] [PATCH v4 3/5] mbuf: create packet pool with external memory buffers
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
@ 2020-01-20 13:59     ` Olivier Matz
  2020-01-20 17:33       ` Slava Ovsiienko
  0 siblings, 1 reply; 77+ messages in thread
From: Olivier Matz @ 2020-01-20 13:59 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, orika, shahafs, stephen, thomas
On Thu, Jan 16, 2020 at 01:04:27PM +0000, Viacheslav Ovsiienko wrote:
> The dedicated routine rte_pktmbuf_pool_create_extbuf() is
> provided to create mbuf pool with data buffers located in
> the pinned external memory. The application provides the
> external memory description and routine initializes each
> mbuf with appropriate virtual and physical buffer address.
> It is entirely the application's responsibility to register
> external memory with rte_extmem_register() API, map this
> memory, etc.
> 
> The newly introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
> is set in private pool structure, specifying the new special
> pool type. The allocated mbufs from pool of this kind will
> have the EXT_ATTACHED_MBUF flag set and initialized shared
> info structure, allowing cloning with regular mbufs (without
> attached external buffers of any kind).
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
>  lib/librte_mbuf/rte_mbuf.c           | 178 ++++++++++++++++++++++++++++++++++-
>  lib/librte_mbuf/rte_mbuf.h           |  84 +++++++++++++++++
>  lib/librte_mbuf/rte_mbuf_version.map |   1 +
>  3 files changed, 260 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> index 8fa7f49..b9d89d0 100644
> --- a/lib/librte_mbuf/rte_mbuf.c
> +++ b/lib/librte_mbuf/rte_mbuf.c
> @@ -59,9 +59,12 @@
>  	}
>  
>  	RTE_ASSERT(mp->elt_size >= sizeof(struct rte_mbuf) +
> -		user_mbp_priv->mbuf_data_room_size +
> +		((user_mbp_priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
> +			sizeof(struct rte_mbuf_ext_shared_info) :
> +			user_mbp_priv->mbuf_data_room_size) +
>  		user_mbp_priv->mbuf_priv_size);
> -	RTE_ASSERT(user_mbp_priv->flags == 0);
> +	RTE_ASSERT((user_mbp_priv->flags &
> +		    ~RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) == 0);
>  
>  	mbp_priv = rte_mempool_get_priv(mp);
>  	memcpy(mbp_priv, user_mbp_priv, sizeof(*mbp_priv));
> @@ -107,6 +110,89 @@
>  	m->next = NULL;
>  }
>  
> +/*
> + * @internal
> + * The callback routine called when reference counter in shinfo for mbufs
> + * with pinned external buffer reaches zero. It means there is no more
> + * references to buffer backing mbuf and this one should be freed.
> + */
> +static void rte_pktmbuf_free_pinned_extmem(void *addr, void *opaque)
> +{
> +	struct rte_mbuf *m = opaque;
> +
> +	RTE_SET_USED(addr);
> +	RTE_ASSERT(RTE_MBUF_HAS_EXTBUF(m));
> +	RTE_ASSERT(RTE_MBUF_HAS_PINNED_EXTBUF(m));
> +	RTE_ASSERT(m->shinfo->fcb_opaque == m);
> +
> +	rte_mbuf_ext_refcnt_set(m->shinfo, 1);
> +	rte_pktmbuf_free_seg(m);
I think it should be rte_mbuf_raw_free(m) instead.
Otherwise, the following sequence can fail:
  - alloc packet from pinned_pool
  - increase ref counter (this is legal)
  - free packet on 2 cores at the same time
The refcnt will reach 0 on one of the cores, which then calls
rte_pktmbuf_free_pinned_extmem(), then rte_pktmbuf_free_seg(), then
rte_pktmbuf_prefree_seg(), which will do nothing because refcnt is not 1
-> mem leak
> +}
> +
> +/*
> + * pktmbuf constructor for the pool with pinned external buffer,
> + * given as a callback function to rte_mempool_obj_iter() in
> + * rte_pktmbuf_pool_create_extbuf(). Set the fields of a packet
> + * mbuf to their default values.
> + */
> +void
> +rte_pktmbuf_init_extmem(struct rte_mempool *mp,
> +			void *opaque_arg,
> +			void *_m,
> +			__attribute__((unused)) unsigned int i)
> +{
Can it be static?
If it is public, it should be experimental and in .map
> +	struct rte_mbuf *m = _m;
> +	struct rte_pktmbuf_extmem_init_ctx *ctx = opaque_arg;
> +	struct rte_pktmbuf_extmem *ext_mem;
> +	uint32_t mbuf_size, buf_len, priv_size;
> +	struct rte_mbuf_ext_shared_info *shinfo;
> +
> +	priv_size = rte_pktmbuf_priv_size(mp);
> +	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
> +	buf_len = rte_pktmbuf_data_room_size(mp);
> +
> +	RTE_ASSERT(RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) == priv_size);
> +	RTE_ASSERT(mp->elt_size >= mbuf_size);
> +	RTE_ASSERT(buf_len <= UINT16_MAX);
> +
> +	memset(m, 0, mbuf_size);
> +	m->priv_size = priv_size;
> +	m->buf_len = (uint16_t)buf_len;
> +
> +	/* set the data buffer pointers to external memory */
> +	ext_mem = ctx->ext_mem + ctx->ext;
> +
> +	RTE_ASSERT(ctx->ext < ctx->ext_num);
> +	RTE_ASSERT(ctx->off < ext_mem->buf_len);
> +
> +	m->buf_addr = RTE_PTR_ADD(ext_mem->buf_ptr, ctx->off);
> +	m->buf_iova = ext_mem->buf_iova == RTE_BAD_IOVA ?
> +		      RTE_BAD_IOVA : (ext_mem->buf_iova + ctx->off);
> +
> +	ctx->off += ext_mem->elt_size;
> +	if (ctx->off >= ext_mem->buf_len) {
> +		ctx->off = 0;
> +		++ctx->ext;
> +	}
> +	/* keep some headroom between start of buffer and data */
> +	m->data_off = RTE_MIN(RTE_PKTMBUF_HEADROOM, (uint16_t)m->buf_len);
> +
> +	/* init some constant fields */
> +	m->pool = mp;
> +	m->nb_segs = 1;
> +	m->port = MBUF_INVALID_PORT;
> +	m->ol_flags = EXT_ATTACHED_MBUF;
> +	rte_mbuf_refcnt_set(m, 1);
> +	m->next = NULL;
> +
> +	/* init external buffer shared info items */
> +	shinfo = RTE_PTR_ADD(m, mbuf_size);
I think it should be
shinfo = RTE_PTR_ADD(m, mbuf_size + priv_size);
> +	m->shinfo = shinfo;
> +	shinfo->free_cb = rte_pktmbuf_free_pinned_extmem;
> +	shinfo->fcb_opaque = m;
> +	rte_mbuf_ext_refcnt_set(shinfo, 1);
To me, it is not very clear that free_cb() will only be called
for mbuf from non-pinned pools that are attached to a mbuf from
a pinned pool, i.e. the free_cb is not called at all if attach()
is not used.
> +}
> +
>  /* Helper to create a mbuf pool with given mempool ops name*/
>  struct rte_mempool *
>  rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
> @@ -169,6 +255,92 @@ struct rte_mempool *
>  			data_room_size, socket_id, NULL);
>  }
>  
> +/* Helper to create a mbuf pool with pinned external data buffers. */
> +struct rte_mempool *
> +rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
> +	unsigned int cache_size, uint16_t priv_size,
> +	uint16_t data_room_size, int socket_id,
> +	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num)
struct rte_pktmbuf_extmem *ext_mem can be const
> +{
> +	struct rte_mempool *mp;
> +	struct rte_pktmbuf_pool_private mbp_priv;
> +	struct rte_pktmbuf_extmem_init_ctx init_ctx;
> +	const char *mp_ops_name;
> +	unsigned int elt_size;
> +	unsigned int i, n_elts = 0;
> +	int ret;
> +
> +	if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
> +		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
> +			priv_size);
> +		rte_errno = EINVAL;
> +		return NULL;
> +	}
> +	/* Check the external memory descriptors. */
> +	for (i = 0; i < ext_num; i++) {
> +		struct rte_pktmbuf_extmem *extm = ext_mem + i;
> +
> +		if (!extm->elt_size || !extm->buf_len || !extm->buf_ptr) {
> +			RTE_LOG(ERR, MBUF, "invalid extmem descriptor\n");
> +			rte_errno = EINVAL;
> +			return NULL;
> +		}
> +		if (data_room_size > extm->elt_size) {
> +			RTE_LOG(ERR, MBUF, "ext elt_size=%u is too small\n",
> +				priv_size);
> +			rte_errno = EINVAL;
> +			return NULL;
> +		}
> +		n_elts += extm->buf_len / extm->elt_size;
> +	}
Maybe it's just me, but I think there is a source of confusion between:
- elt_size: the size of mempool element (i.e. sizeof(mbuf) +
  priv_size + sizeof(shinfo))
- extm->elt_size: the size of data buffer in external memory
The structure fields could be renamed like this:
struct rte_pktmbuf_extmem {
	void *ptr;		/**< The virtual address of data buffer. */
	rte_iova_t iova;	/**< The IO address of the data buffer. */
	size_t len;		/**< External buffer length in bytes. */
	uint16_t mbuf_data_size; /**< mbuf data size in bytes. */
};
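To illustrate how the renamed fields read at a call site, here is a minimal sketch of the descriptor validation and capacity count done by the pool creation routine. It builds without DPDK headers, so sketch_iova_t, struct sketch_extmem and sketch_count_data_bufs() are hypothetical stand-ins; the checks mirror the ones in the patch, but returning 0 on error is a simplification of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rte_iova_t so the sketch builds without DPDK headers. */
typedef uint64_t sketch_iova_t;

/* Descriptor using the field names proposed above. */
struct sketch_extmem {
	void *ptr;               /* virtual address of the data buffer */
	sketch_iova_t iova;      /* IO address of the data buffer */
	size_t len;              /* external buffer length in bytes */
	uint16_t mbuf_data_size; /* data size reserved per mbuf, in bytes */
};

/*
 * Count how many mbuf data buffers the descriptors can back.
 * Returns 0 if any descriptor is invalid or reserves less than
 * data_room_size bytes per mbuf.
 */
static unsigned int
sketch_count_data_bufs(const struct sketch_extmem *ext, unsigned int ext_num,
		       uint16_t data_room_size)
{
	unsigned int i, total = 0;

	for (i = 0; i < ext_num; i++) {
		if (ext[i].ptr == NULL || ext[i].len == 0 ||
		    ext[i].mbuf_data_size == 0)
			return 0;
		if (data_room_size > ext[i].mbuf_data_size)
			return 0;
		/* Integer division: a partial trailing slot is not usable. */
		total += (unsigned int)(ext[i].len / ext[i].mbuf_data_size);
	}
	return total;
}
```

With these names, the per-mbuf data size in external memory can no longer be mistaken for the mempool element size, which covers the mbuf header, the private area and the shared info.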
> +	/* Check whether enough external memory provided. */
> +	if (n_elts < n) {
> +		RTE_LOG(ERR, MBUF, "not enough extmem\n");
> +		rte_errno = ENOMEM;
> +		return NULL;
> +	}
> +	elt_size = sizeof(struct rte_mbuf) +
> +		   (unsigned int)priv_size +
> +		   sizeof(struct rte_mbuf_ext_shared_info);
> +
> +	memset(&mbp_priv, 0, sizeof(mbp_priv));
> +	mbp_priv.mbuf_data_room_size = data_room_size;
> +	mbp_priv.mbuf_priv_size = priv_size;
> +	mbp_priv.flags = RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
> +
> +	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
> +		 sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
> +	if (mp == NULL)
> +		return NULL;
> +
> +	mp_ops_name = rte_mbuf_best_mempool_ops();
> +	ret = rte_mempool_set_ops_byname(mp, mp_ops_name, NULL);
> +	if (ret != 0) {
> +		RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
> +		rte_mempool_free(mp);
> +		rte_errno = -ret;
> +		return NULL;
> +	}
> +	rte_pktmbuf_pool_init(mp, &mbp_priv);
> +
> +	ret = rte_mempool_populate_default(mp);
> +	if (ret < 0) {
> +		rte_mempool_free(mp);
> +		rte_errno = -ret;
> +		return NULL;
> +	}
I still think that these ~20 lines could be simplified into a call to
rte_pktmbuf_pool_create(). I don't think that using rte_mempool_obj_iter()
twice would cause a performance issue.
> +
> +	init_ctx = (struct rte_pktmbuf_extmem_init_ctx){
> +		.ext_mem = ext_mem,
> +		.ext_num = ext_num,
> +		.ext = 0,
> +		.off = 0,
> +	};
> +	rte_mempool_obj_iter(mp, rte_pktmbuf_init_extmem, &init_ctx);
> +
> +	return mp;
> +}
> +
>  /* do some sanity checks on a mbuf: panic if it fails */
>  void
>  rte_mbuf_sanity_check(const struct rte_mbuf *m, int is_header)
> @@ -247,7 +419,7 @@ int rte_mbuf_check(const struct rte_mbuf *m, int is_header,
>  	return 0;
>  }
>  
> -/**
> +/*
>   * @internal helper function for freeing a bulk of packet mbuf segments
>   * via an array holding the packet mbuf segments from the same mempool
>   * pending to be freed.
I agree with this change, but it should not go in that patch.
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 52d57d1..093210e 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -637,6 +637,34 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
>  void rte_pktmbuf_init(struct rte_mempool *mp, void *opaque_arg,
>  		      void *m, unsigned i);
>  
> +/** The context to initialize the mbufs with pinned external buffers. */
> +struct rte_pktmbuf_extmem_init_ctx {
> +	struct rte_pktmbuf_extmem *ext_mem; /* pointer to descriptor array. */
> +	unsigned int ext_num; /* number of descriptors in array. */
> +	unsigned int ext; /* loop descriptor index. */
> +	size_t off; /* loop buffer offset. */
> +};
I think it could be private.
> +
> +/**
> + * The packet mbuf constructor for pools with pinned external memory.
> + *
> + * This function initializes some fields in the mbuf structure that are
> + * not modified by the user once created (origin pool, buffer start
> + * address, and so on). This function is given as a callback function to
> + * rte_mempool_obj_iter() called from rte_mempool_create_extmem().
> + *
> + * @param mp
> + *   The mempool from which mbufs originate.
> + * @param opaque_arg
> + *   A pointer to the rte_pktmbuf_extmem_init_ctx - initialization
> + *   context structure
> + * @param m
> + *   The mbuf to initialize.
> + * @param i
> + *   The index of the mbuf in the pool table.
> + */
> +void rte_pktmbuf_init_extmem(struct rte_mempool *mp, void *opaque_arg,
> +			     void *m, unsigned int i);
>  
Unless I am missing something, it could be private too.
>  /**
>   * A  packet mbuf pool constructor.
> @@ -738,6 +766,62 @@ struct rte_mempool *
>  	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
>  	int socket_id, const char *ops_name);
>  
> +/** A structure that describes the pinned external buffer segment. */
> +struct rte_pktmbuf_extmem {
> +	void *buf_ptr;		/**< The virtual address of data buffer. */
> +	rte_iova_t buf_iova;	/**< The IO address of the data buffer. */
> +	size_t buf_len;		/**< External buffer length in bytes. */
> +	uint16_t elt_size;	/**< mbuf element size in bytes. */
> +};
See my proposal above about renaming the fields.
> +
> +/**
> + * Create a mbuf pool with external pinned data buffers.
> + *
> + * This function creates and initializes a packet mbuf pool that contains
> + * only mbufs with external buffer. It is a wrapper to rte_mempool functions.
> + *
> + * @param name
> + *   The name of the mbuf pool.
> + * @param n
> + *   The number of elements in the mbuf pool. The optimum size (in terms
> + *   of memory usage) for a mempool is when n is a power of two minus one:
> + *   n = (2^q - 1).
> + * @param cache_size
> + *   Size of the per-core object cache. See rte_mempool_create() for
> + *   details.
> + * @param priv_size
> + *   Size of application private are between the rte_mbuf structure
> + *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
> + * @param data_room_size
> + *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
> + * @param socket_id
> + *   The socket identifier where the memory should be allocated. The
> + *   value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the
> + *   reserved zone.
> + * @param ext_mem
> + *   Pointer to the array of structures describing the external memory
> + *   for data buffers. It is the caller's responsibility to register this
> + *   memory with rte_extmem_register() (if needed), map this memory to the
> + *   appropriate physical device, etc.
> + * @param ext_num
> + *   Number of elements in the ext_mem array.
> + * @return
> + *   The pointer to the new allocated mempool, on success. NULL on error
> + *   with rte_errno set appropriately. Possible rte_errno values include:
> + *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
> + *    - E_RTE_SECONDARY - function was called from a secondary process instance
> + *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
> + *    - ENOSPC - the maximum number of memzones has already been allocated
> + *    - EEXIST - a memzone with the same name already exists
> + *    - ENOMEM - no appropriate memory area found in which to create memzone
> + */
> +__rte_experimental
> +struct rte_mempool *
> +rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
> +	unsigned int cache_size, uint16_t priv_size,
> +	uint16_t data_room_size, int socket_id,
> +	struct rte_pktmbuf_extmem *ext_mem, unsigned int ext_num);
> +
>  /**
>   * Get the data room size of mbufs stored in a pktmbuf_pool
>   *
> diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
> index 3bbb476..ab161bc 100644
> --- a/lib/librte_mbuf/rte_mbuf_version.map
> +++ b/lib/librte_mbuf/rte_mbuf_version.map
> @@ -44,5 +44,6 @@ EXPERIMENTAL {
>  	rte_mbuf_dyn_dump;
>  	rte_pktmbuf_copy;
>  	rte_pktmbuf_free_bulk;
> +	rte_pktmbuf_pool_create_extbuf;
>  
>  };
> -- 
> 1.8.3.1
> 
One more thing: it would be nice to have a functional test for this new
feature. I wrote a very minimal one to check the basic alloc/free/attach
behavior; you can start from that if you want.
static int
test_ext_pinned(int test_case)
{
	struct rte_pktmbuf_extmem ext_mem;
	struct rte_mempool *pinned_pool = NULL;
	struct rte_mempool *std_pool = NULL;
	const struct rte_memzone *mz = NULL;
	struct rte_mbuf *m = NULL, *m2 = NULL;
	printf("Test mbuf pool with mbufs data pinned to external buffer (%d)\n", test_case);
	std_pool = rte_pktmbuf_pool_create("std_pool",
			NB_MBUF, MEMPOOL_CACHE_SIZE, 0, MBUF_DATA_SIZE,
			SOCKET_ID_ANY);
	if (std_pool == NULL)
		GOTO_FAIL("std_pool alloc failed");
	mz = rte_memzone_reserve("std_pool",
				NB_MBUF * sizeof(struct rte_mbuf),
				SOCKET_ID_ANY,
				RTE_MEMZONE_2MB|RTE_MEMZONE_SIZE_HINT_ONLY);
	if (mz == NULL)
		GOTO_FAIL("memzone alloc failed");
	ext_mem.buf_ptr = mz->addr;
	ext_mem.buf_iova = mz->iova;
	ext_mem.buf_len = mz->len;
	ext_mem.elt_size = sizeof(struct rte_mbuf);
	pinned_pool = rte_pktmbuf_pool_create_extbuf("pinned_pool",
					NB_MBUF, MEMPOOL_CACHE_SIZE,
					0, 0, SOCKET_ID_ANY, &ext_mem, 1);
	if (pinned_pool == NULL)
		GOTO_FAIL("pinned_pool alloc failed");
	m = rte_pktmbuf_alloc(pinned_pool);
	if (unlikely(m == NULL))
		goto fail;
	if (test_case != 0) {
		m2 = rte_pktmbuf_alloc(std_pool);
		if (unlikely(m2 == NULL))
			goto fail;
		rte_pktmbuf_attach(m2, m);
	}
	if (test_case == 0) {
		rte_pktmbuf_free(m);
	} else if (test_case == 1) {
		rte_pktmbuf_free(m);
		rte_pktmbuf_free(m2);
	} else if (test_case == 2) {
		rte_pktmbuf_free(m2);
		rte_pktmbuf_free(m);
	}
	rte_mempool_free(pinned_pool);
	rte_memzone_free(mz);
	rte_mempool_free(std_pool);
	return 0;
fail:
	rte_pktmbuf_free(m2);
	rte_pktmbuf_free(m);
	rte_mempool_free(pinned_pool);
	rte_memzone_free(mz);
	rte_mempool_free(std_pool);
	return -1;
}
* Re: [dpdk-dev] [PATCH v4 4/5] app/testpmd: add mempool with external data buffers
  2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 4/5] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
@ 2020-01-20 14:11     ` Olivier Matz
  0 siblings, 0 replies; 77+ messages in thread
From: Olivier Matz @ 2020-01-20 14:11 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, orika, shahafs, stephen, thomas
On Thu, Jan 16, 2020 at 01:04:28PM +0000, Viacheslav Ovsiienko wrote:
> The new mbuf pool type is added to testpmd. To engage the
> mbuf pool with externally attached data buffers the parameter
> "--mp-alloc=xbuf" should be specified in testpmd command line.
> 
> The objective of this patch is just to test whether mbuf pool
> with externally attached data buffers works OK. The memory for
> data buffers is allocated from DPDK memory, so this is not
> "true" external memory from some physical device (this is
> supposed the most common use case for such kind of mbuf pool).
> 
> The user should be aware that not all drivers support the mbuf
> with EXT_ATTACHED_BUF flags set in newly allocated mbuf (many
> PMDs just overwrite ol_flags field and flag value is getting
> lost).
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
>  app/test-pmd/config.c     |  2 ++
>  app/test-pmd/flowgen.c    |  3 +-
>  app/test-pmd/parameters.c |  2 ++
>  app/test-pmd/testpmd.c    | 81 +++++++++++++++++++++++++++++++++++++++++++++++
>  app/test-pmd/testpmd.h    |  4 ++-
>  app/test-pmd/txonly.c     |  3 +-
>  6 files changed, 92 insertions(+), 3 deletions(-)
> 
> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
> index 9da1ffb..5c6fe18 100644
> --- a/app/test-pmd/config.c
> +++ b/app/test-pmd/config.c
> @@ -2395,6 +2395,8 @@ struct igb_ring_desc_16_bytes {
>  		return "xmem";
>  	case MP_ALLOC_XMEM_HUGE:
>  		return "xmemhuge";
> +	case MP_ALLOC_XBUF:
> +		return "xbuf";
>  	default:
>  		return "invalid";
>  	}
> diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
> index 03b72aa..ae50cdc 100644
> --- a/app/test-pmd/flowgen.c
> +++ b/app/test-pmd/flowgen.c
> @@ -199,7 +199,8 @@
>  							   sizeof(*ip_hdr));
>  		pkt->nb_segs		= 1;
>  		pkt->pkt_len		= pkt_size;
> -		pkt->ol_flags		= ol_flags;
> +		pkt->ol_flags		&= EXT_ATTACHED_MBUF;
> +		pkt->ol_flags		|= ol_flags;
>  		pkt->vlan_tci		= vlan_tci;
>  		pkt->vlan_tci_outer	= vlan_tci_outer;
>  		pkt->l2_len		= sizeof(struct rte_ether_hdr);
This shows that we have to be careful when using a mempool with
external memory pinned mbufs. Maybe that's something that should
be mentioned in the release notes?
That's not the first time I'm asking myself whether ol_flags shouldn't be
split into ol_flags and flags. Certainly something to think about for
the next ABI breakage release.
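To make the ol_flags concern concrete, here is a minimal standalone sketch.
It is not DPDK code: the reduced struct, the DEMO_ flag values and both
helper names are stand-ins invented for this illustration. It contrasts a
plain assignment, which drops the attachment flag set at pool creation,
with the mask-then-or form used in the flowgen hunk above.

```c
#include <stdint.h>

/* Stand-in flag values for illustration only; the real ones are in rte_mbuf.h. */
#define DEMO_EXT_ATTACHED_MBUF (1ULL << 61)
#define DEMO_TX_VLAN           (1ULL << 57)

/* Hypothetical, reduced mbuf holding only the field of interest. */
struct demo_mbuf {
	uint64_t ol_flags;
};

/* Plain assignment: every previous flag, including the attachment flag, is lost. */
static void demo_set_flags_overwrite(struct demo_mbuf *m, uint64_t ol_flags)
{
	m->ol_flags = ol_flags;
}

/* Mask-then-or: keep only the attachment flag, then add the new flags. */
static void demo_set_flags_preserve(struct demo_mbuf *m, uint64_t ol_flags)
{
	m->ol_flags &= DEMO_EXT_ATTACHED_MBUF;
	m->ol_flags |= ol_flags;
}
```

With the second form, stale offload flags from a previous use are still
cleared, while the information that the mbuf carries an external buffer
survives.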
Acked-by: Olivier Matz <olivier.matz@6wind.com>
* Re: [dpdk-dev] [PATCH v4 2/5] mbuf: detach mbuf with pinned external buffer
  2020-01-20 13:56     ` Olivier Matz
@ 2020-01-20 15:41       ` Slava Ovsiienko
  2020-01-20 16:17         ` Olivier Matz
  0 siblings, 1 reply; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-20 15:41 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler,
	stephen, thomas
Hi, Olivier
Thanks a lot for the thorough review.
I have answered some of the comments; please see below.
> >
> >  /**
> > + * @internal version of rte_pktmbuf_detach() to be used on mbuf freeing.
> 
> -version
> +Version
> 
> > + * For indirect and regular (not pinned) external mbufs the standard
> > + * rte_pktmbuf is involved, for pinned external buffer mbufs the
> > + special
> > + * handling is performed:
> 
> Sorry, it is not very clear to me, especially what "the standard rte_pktmbuf is
> involved" means.
Sorry, it is a typo; it should read "rte_pktmbuf_detach is invoked".
> 
> > + *
> > + *  - return zero if reference counter in shinfo is one. It means
> > + there is
> > + *  no more references to this pinned buffer and mbuf can be returned
> > + to
> 
> -references
> +reference
> 
> > + *  the pool
> > + *
> > + *  - otherwise (if reference counter is not one), decrement
> > +reference
> > + *  counter and return non-zero value to prevent freeing the backing mbuf.
> > + *
> > + * Returns non zero if mbuf should not be freed.
> > + */
> > +static inline uint16_t __rte_pktmbuf_detach_on_free(struct rte_mbuf
> > +*m)
> 
> I think int would be better than uint16_t
> 
> > +{
> > +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> > +		uint32_t flags = rte_pktmbuf_priv_flags(m->pool);
> > +
> > +		if (flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
> > +			struct rte_mbuf_ext_shared_info *shinfo;
> > +
> > +			/* Clear flags, mbuf is being freed. */
> > +			m->ol_flags = EXT_ATTACHED_MBUF;
> > +			shinfo = m->shinfo;
> > +			/* Optimize for performance - do not dec/reinit */
> > +			if (likely(rte_mbuf_ext_refcnt_read(shinfo) == 1))
> > +				return 0;
> > +			/*
> > +			 * Direct usage of add primitive to avoid
> > +			 * duplication of comparing with one.
> > +			 */
> > +			if (likely(rte_atomic16_add_return
> > +					(&shinfo->refcnt_atomic, -1)))
> > +				return 1;
> > +			/* Reinitialize counter before mbuf freeing. */
> > +			rte_mbuf_ext_refcnt_set(shinfo, 1);
> > +			return 0;
> > +		}
> > +	}
> > +	rte_pktmbuf_detach(m);
> > +	return 0;
> > +}
> 
> I don't think the API comment really reflects what is done in this function. In
> my understanding, the detach() operation does nothing on an extmem
> pinned mbuf. So detach() is probably not the proper name.
> 
> What about something like this instead:
> 
> /* [...].
>  *  assume m is pinned to external memory */ static inline int
> __rte_pktmbuf_pinned_ext_buf_decref(struct rte_mbuf *m) {
> 	struct rte_mbuf_ext_shared_info *shinfo;
> 
> 	/* Clear flags, mbuf is being freed. */
> 	m->ol_flags = EXT_ATTACHED_MBUF;
> 	shinfo = m->shinfo;
> 
> 	/* Optimize for performance - do not dec/reinit */
> 	if (likely(rte_mbuf_ext_refcnt_read(shinfo) == 1))
> 		return 0;
> 
> 	/*
> 	 * Direct usage of add primitive to avoid
> 	 * duplication of comparing with one.
> 	 */
> 	if (likely(rte_atomic16_add_return
> 			(&shinfo->refcnt_atomic, -1)))
> 		return 1;
> 
> 	/* Reinitialize counter before mbuf freeing. */
> 	rte_mbuf_ext_refcnt_set(shinfo, 1);
> 	return 0;
> }
> 
> static __rte_always_inline struct rte_mbuf * rte_pktmbuf_prefree_seg(struct
> rte_mbuf *m) {
> 	__rte_mbuf_sanity_check(m, 0);
> 
> 	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
> 
> 		if (!RTE_MBUF_DIRECT(m)) {
> 			if (!RTE_MBUF_HAS_PINNED_EXTBUF(m))
> 				rte_pktmbuf_detach(m);
> 			else if (__rte_pktmbuf_pinned_ext_buf_decref(m))
> 				return NULL;
> 		}
> 		...
> 	... (and same below) ...
> 
> 
> (just quickly tested)
> 
> The other advantage is that we don't call rte_pktmbuf_detach() where not
> needed.
Your proposal fetches the private flags for all indirect packets, including the ones
with the IND_ATTACHED_MBUF flag (not external). This extra fetch and check might affect
performance for indirect packets (it does not matter for packets with external
buffers). My approach updates the prefree routine only for packets with
external buffers, leaving the handling of all other mbuf types intact.
> 
> > +
> > +/**
> >   * Decrease reference counter and unlink a mbuf segment
> >   *
> >   * This function does the same than a free, except that it does not
> > @@ -1198,7 +1277,8 @@ static inline void rte_pktmbuf_detach(struct
> rte_mbuf *m)
> >  	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
> >
> >  		if (!RTE_MBUF_DIRECT(m))
> > -			rte_pktmbuf_detach(m);
> > +			if (__rte_pktmbuf_detach_on_free(m))
> > +				return NULL;
> >
> >  		if (m->next != NULL) {
> >  			m->next = NULL;
> > @@ -1210,7 +1290,8 @@ static inline void rte_pktmbuf_detach(struct
> rte_mbuf *m)
> >  	} else if (__rte_mbuf_refcnt_update(m, -1) == 0) {
> >
> >  		if (!RTE_MBUF_DIRECT(m))
> > -			rte_pktmbuf_detach(m);
> > +			if (__rte_pktmbuf_detach_on_free(m))
> > +				return NULL;
> >
> >  		if (m->next != NULL) {
> >  			m->next = NULL;
> > --
> > 1.8.3.1
> >
* Re: [dpdk-dev] [PATCH v4 2/5] mbuf: detach mbuf with pinned external buffer
  2020-01-20 15:41       ` Slava Ovsiienko
@ 2020-01-20 16:17         ` Olivier Matz
  0 siblings, 0 replies; 77+ messages in thread
From: Olivier Matz @ 2020-01-20 16:17 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler,
	stephen, thomas
Hi,
On Mon, Jan 20, 2020 at 03:41:10PM +0000, Slava Ovsiienko wrote:
> Hi, Olivier
> 
> Thanks a lot for the thorough review.
> There are some answers to comments, please, see below.
> 
> > >
> > >  /**
> > > + * @internal version of rte_pktmbuf_detach() to be used on mbuf freeing.
> > 
> > -version
> > +Version
> > 
> > > + * For indirect and regular (not pinned) external mbufs the standard
> > > + * rte_pktmbuf is involved, for pinned external buffer mbufs the
> > > + special
> > > + * handling is performed:
> > 
> > Sorry, it is not very clear to me, especially what "the standard rte_pktmbuf is
> > involved" means.
> 
> Sorry, it is mistype, should be read as "rte_pktmbuf_detach is invoked".
> > 
> > > + *
> > > + *  - return zero if reference counter in shinfo is one. It means
> > > + there is
> > > + *  no more references to this pinned buffer and mbuf can be returned
> > > + to
> > 
> > -references
> > +reference
> > 
> > > + *  the pool
> > > + *
> > > + *  - otherwise (if reference counter is not one), decrement
> > > +reference
> > > + *  counter and return non-zero value to prevent freeing the backing mbuf.
> > > + *
> > > + * Returns non zero if mbuf should not be freed.
> > > + */
> > > +static inline uint16_t __rte_pktmbuf_detach_on_free(struct rte_mbuf
> > > +*m)
> > 
> > I think int would be better than uint16_t
> > 
> > > +{
> > > +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> > > +		uint32_t flags = rte_pktmbuf_priv_flags(m->pool);
> > > +
> > > +		if (flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
> > > +			struct rte_mbuf_ext_shared_info *shinfo;
> > > +
> > > +			/* Clear flags, mbuf is being freed. */
> > > +			m->ol_flags = EXT_ATTACHED_MBUF;
> > > +			shinfo = m->shinfo;
> > > +			/* Optimize for performance - do not dec/reinit */
> > > +			if (likely(rte_mbuf_ext_refcnt_read(shinfo) == 1))
> > > +				return 0;
> > > +			/*
> > > +			 * Direct usage of add primitive to avoid
> > > +			 * duplication of comparing with one.
> > > +			 */
> > > +			if (likely(rte_atomic16_add_return
> > > +					(&shinfo->refcnt_atomic, -1)))
> > > +				return 1;
> > > +			/* Reinitialize counter before mbuf freeing. */
> > > +			rte_mbuf_ext_refcnt_set(shinfo, 1);
> > > +			return 0;
> > > +		}
> > > +	}
> > > +	rte_pktmbuf_detach(m);
> > > +	return 0;
> > > +}
> > 
> > I don't think the API comment really reflects what is done in this function. In
> > my understanding, the detach() operation does nothing on an extmem
> > pinned mbuf. So detach() is probably not the proper name.
> > 
> > What about something like this instead:
> > 
> > /* [...].
> >  *  assume m is pinned to external memory */ static inline int
> > __rte_pktmbuf_pinned_ext_buf_decref(struct rte_mbuf *m) {
> > 	struct rte_mbuf_ext_shared_info *shinfo;
> > 
> > 	/* Clear flags, mbuf is being freed. */
> > 	m->ol_flags = EXT_ATTACHED_MBUF;
> > 	shinfo = m->shinfo;
> > 
> > 	/* Optimize for performance - do not dec/reinit */
> > 	if (likely(rte_mbuf_ext_refcnt_read(shinfo) == 1))
> > 		return 0;
> > 
> > 	/*
> > 	 * Direct usage of add primitive to avoid
> > 	 * duplication of comparing with one.
> > 	 */
> > 	if (likely(rte_atomic16_add_return
> > 			(&shinfo->refcnt_atomic, -1)))
> > 		return 1;
> > 
> > 	/* Reinitialize counter before mbuf freeing. */
> > 	rte_mbuf_ext_refcnt_set(shinfo, 1);
> > 	return 0;
> > }
> > 
> > static __rte_always_inline struct rte_mbuf * rte_pktmbuf_prefree_seg(struct
> > rte_mbuf *m) {
> > 	__rte_mbuf_sanity_check(m, 0);
> > 
> > 	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
> > 
> > 		if (!RTE_MBUF_DIRECT(m)) {
> > 			if (!RTE_MBUF_HAS_PINNED_EXTBUF(m))
> > 				rte_pktmbuf_detach(m);
> > 			else if (__rte_pktmbuf_pinned_ext_buf_decref(m))
> > 				return NULL;
> > 		}
> > 		...
> > 	... (and same below) ...
> > 
> > 
> > (just quickly tested)
> > 
> > The other advantage is that we don't call rte_pktmbuf_detach() where not
> > needed.
> Your proposal fetches the private flags for all indirect packets, including the ones
> with IND_ATTACHED_MBUF flags (not external), this extra fetch and check might affect
> the performance for indirect packets (and it does not matter for packets with external
> buffers). My approach updates the prefree routine for the packets with
> external buffers only, keeping intact the handling for all other mbuf types. 
Maybe just change the test to this?
	if (!RTE_MBUF_HAS_EXTBUF(m) || !RTE_MBUF_HAS_PINNED_EXTBUF(m))
If you prefer, the test can be moved into __rte_pktmbuf_pinned_ext_buf_decref():
	if (!RTE_MBUF_HAS_EXTBUF(m) || !RTE_MBUF_HAS_PINNED_EXTBUF(m))
		return 0;
But my preference would go to the first one.
The root of my comment was more about the naming: I don't think the
function should be called something_detach(), because it would not detach
anything in the case of an ext mem pinned buffer.
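The reason the combined test addresses the performance concern is
short-circuit evaluation: for a plain indirect mbuf the first operand is
already true, so the pool private data is never read. A minimal standalone
sketch of that behavior follows. It assumes simplified stand-in types; the
demo_ names, the flag values and the priv_reads counter are invented for
this illustration and are not part of DPDK.

```c
#include <stdint.h>

#define DEMO_EXT_ATTACHED_MBUF (1ULL << 61) /* stand-in value */
#define DEMO_POOL_F_PINNED     (1u << 0)    /* stand-in value */

/* Hypothetical pool: flags plus a counter of private-data accesses. */
struct demo_pool {
	uint32_t flags;
	unsigned int priv_reads;
};

/* Hypothetical, reduced mbuf. */
struct demo_mbuf {
	uint64_t ol_flags;
	struct demo_pool *pool;
};

/* Cheap check: looks only at the mbuf itself. */
static int demo_has_extbuf(const struct demo_mbuf *m)
{
	return (m->ol_flags & DEMO_EXT_ATTACHED_MBUF) != 0;
}

/* Costlier check: has to touch the pool private data. */
static int demo_has_pinned_extbuf(struct demo_mbuf *m)
{
	m->pool->priv_reads++;
	return (m->pool->flags & DEMO_POOL_F_PINNED) != 0;
}

/* Returns 1 when the regular detach path should be taken. */
static int demo_needs_detach(struct demo_mbuf *m)
{
	/* The right-hand side runs only when the mbuf has an external buffer. */
	return !demo_has_extbuf(m) || !demo_has_pinned_extbuf(m);
}
```

In this model an indirect (non-external) mbuf takes the detach path without
incrementing priv_reads, which is the property the ordering of the two
checks is meant to guarantee.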
> 
> > 
> > > +
> > > +/**
> > >   * Decrease reference counter and unlink a mbuf segment
> > >   *
> > >   * This function does the same than a free, except that it does not
> > > @@ -1198,7 +1277,8 @@ static inline void rte_pktmbuf_detach(struct
> > rte_mbuf *m)
> > >  	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
> > >
> > >  		if (!RTE_MBUF_DIRECT(m))
> > > -			rte_pktmbuf_detach(m);
> > > +			if (__rte_pktmbuf_detach_on_free(m))
> > > +				return NULL;
> > >
> > >  		if (m->next != NULL) {
> > >  			m->next = NULL;
> > > @@ -1210,7 +1290,8 @@ static inline void rte_pktmbuf_detach(struct
> > rte_mbuf *m)
> > >  	} else if (__rte_mbuf_refcnt_update(m, -1) == 0) {
> > >
> > >  		if (!RTE_MBUF_DIRECT(m))
> > > -			rte_pktmbuf_detach(m);
> > > +			if (__rte_pktmbuf_detach_on_free(m))
> > > +				return NULL;
> > >
> > >  		if (m->next != NULL) {
> > >  			m->next = NULL;
> > > --
> > > 1.8.3.1
> > >
> 
* [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned external buffer
  2019-11-18  9:50 [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers Shahaf Shuler
                   ` (4 preceding siblings ...)
  2020-01-16 13:04 ` [dpdk-dev] [PATCH v4 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
@ 2020-01-20 17:23 ` Viacheslav Ovsiienko
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
                     ` (5 more replies)
  2020-01-20 19:16 ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
                   ` (2 subsequent siblings)
  8 siblings, 6 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 17:23 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
Today's pktmbuf pool contains only mbufs with no external buffers.
This means the data buffer for the mbuf must be placed right after the
mbuf structure (+ the private data when enabled).
In some cases, the application would want to have the buffers allocated
from a different device in the platform. This allows zero copy of the
packet directly to the device memory. Examples of such devices are a
GPU or a storage device. For such cases the native pktmbuf pool does
not fit, since each mbuf would need to point to an external buffer.
To support the above, the pktmbuf pool will be populated with mbufs
pointing to the device buffers using the mbuf external buffer feature.
The PMD will populate its receive queues with those buffers, so that
every packet received will be scattered directly to the device memory.
In the other direction, embedding the buffer pointer into the transmit
queues of the NIC will make the DMA fetch device memory using peer to
peer communication.
Such mbufs with external buffers should be handled with care when the
mbuf is freed. Mainly, the external buffer should not be detached, so
that it can be reused for the next packet receive.
This patch introduces a new flag in the rte_pktmbuf_pool_private
structure to specify that this mempool is for mbufs with pinned external
buffers. Upon detach this flag is checked and the buffer is not detached.
A new mempool create wrapper is also introduced to help applications
create and populate such a mempool.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
RFC: http://patches.dpdk.org/patch/63077
v1: - http://patches.dpdk.org/cover/64424
v2: - fix rte_experimental issue on comment addressing
    - rte_mbuf_has_pinned_extbuf return type is uint32_t
    - fix Power9 compilation issue
v3: - fix "#include <stdbool.h>" leftover
v4: - https://patches.dpdk.org/cover/64809/
    - introduce rte_pktmbuf_priv_flags
    - support cloning pinned mbufs as for regular mbufs
      with external buffers
    - address the minor comments
v5: - update rte_pktmbuf_prefree_seg
    - rename __rte_pktmbuf_extbuf_detach
    - addressing comment
    - fix typos
Viacheslav Ovsiienko (5):
  mbuf: introduce routine to get private mbuf pool flags
  mbuf: detach mbuf with pinned external buffer
  mbuf: create packet pool with external memory buffers
  app/testpmd: add mempool with external data buffers
  net/mlx5: allow use allocated mbuf with external buffer
 app/test-pmd/config.c                    |   2 +
 app/test-pmd/flowgen.c                   |   3 +-
 app/test-pmd/parameters.c                |   2 +
 app/test-pmd/testpmd.c                   |  81 +++++++++++++
 app/test-pmd/testpmd.h                   |   4 +-
 app/test-pmd/txonly.c                    |   3 +-
 drivers/net/mlx5/mlx5_rxq.c              |   7 +-
 drivers/net/mlx5/mlx5_rxtx.c             |   2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |   2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         |  14 +--
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |   5 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    |  29 ++---
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |   2 +-
 lib/librte_mbuf/rte_mbuf.c               | 198 ++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h               | 183 ++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_version.map     |   1 +
 16 files changed, 492 insertions(+), 46 deletions(-)
-- 
1.8.3.1
* [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-20 17:23 ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
@ 2020-01-20 17:23   ` Viacheslav Ovsiienko
  2020-01-20 20:43     ` Stephen Hemminger
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 17:23 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
The routine rte_pktmbuf_priv_flags is introduced to fetch
the flags from the mbuf memory pool private structure
in a unified fashion.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mbuf/rte_mbuf.h | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 2d4bda2..9b0691d 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -306,6 +306,23 @@ struct rte_pktmbuf_pool_private {
 	uint32_t flags; /**< reserved for future use. */
 };
 
+/**
+ * Return the flags from private data in a mempool structure.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   The flags from the private data structure.
+ */
+static inline uint32_t
+rte_pktmbuf_priv_flags(struct rte_mempool *mp)
+{
+	struct rte_pktmbuf_pool_private *mbp_priv;
+
+	mbp_priv = (struct rte_pktmbuf_pool_private *)rte_mempool_get_priv(mp);
+	return mbp_priv->flags;
+}
+
 #ifdef RTE_LIBRTE_MBUF_DEBUG
 
 /**  check mbuf type in debug mode */
-- 
1.8.3.1
* [dpdk-dev] [PATCH v5 2/5] mbuf: detach mbuf with pinned external buffer
  2020-01-20 17:23 ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
@ 2020-01-20 17:23   ` Viacheslav Ovsiienko
  2020-01-20 17:40     ` Olivier Matz
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 17:23 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
Update the detach routine to check the mbuf pool type.
Introduce a special internal routine to handle the case of
a pinned external buffer on mbuf freeing.
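The reference counting rule applied on free can be modelled in a few lines.
The sketch below is a standalone illustration, not the patch code: it
swaps in C11 atomics for the rte_atomic16 primitives, and the demo_ struct
and function names are invented for the example. It returns non-zero while
other references to the pinned buffer remain, and zero when the mbuf may go
back to the pool.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared info attached to a pinned buffer. */
struct demo_shinfo {
	atomic_uint_fast16_t refcnt;
};

/* Returns non-zero when the mbuf must NOT be returned to the pool yet. */
static int demo_pinned_extbuf_decref(struct demo_shinfo *shinfo)
{
	/* Fast path: sole owner, nothing to decrement or reinitialize. */
	if (atomic_load(&shinfo->refcnt) == 1)
		return 0;

	/* fetch_sub returns the previous value; previous - 1 is the new count. */
	if (atomic_fetch_sub(&shinfo->refcnt, 1) - 1 != 0)
		return 1;

	/* Count dropped to zero under a race: restore it for the next use. */
	atomic_store(&shinfo->refcnt, 1);
	return 0;
}
```

Starting from a count of two (the pinned mbuf plus one clone), the first
call reports that the buffer is still in use and the second call takes the
fast path, leaving the count at one for the next allocation.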
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 lib/librte_mbuf/rte_mbuf.h | 102 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 93 insertions(+), 9 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 9b0691d..7a41aad 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -323,6 +323,24 @@ struct rte_pktmbuf_pool_private {
 	return mbp_priv->flags;
 }
 
+/**
+ * When set, pktmbuf mempool will hold only mbufs with pinned external
+ * buffer. The external buffer will be attached to the mbuf at the
+ * memory pool creation and will never be detached by the mbuf free calls.
+ * mbuf should not contain any room for data after the mbuf structure.
+ */
+#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
+
+/**
+ * Returns non zero if given mbuf has a pinned external buffer, or zero
+ * otherwise. The pinned external buffer is allocated at pool creation
+ * time and should not be freed on mbuf freeing.
+ *
+ * External buffer is a user-provided anonymous buffer.
+ */
+#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) \
+	(rte_pktmbuf_priv_flags(mb->pool) & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)
+
 #ifdef RTE_LIBRTE_MBUF_DEBUG
 
 /**  check mbuf type in debug mode */
@@ -588,7 +606,8 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 static __rte_always_inline void
 rte_mbuf_raw_free(struct rte_mbuf *m)
 {
-	RTE_ASSERT(RTE_MBUF_DIRECT(m));
+	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
+		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
 	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
 	RTE_ASSERT(m->next == NULL);
 	RTE_ASSERT(m->nb_segs == 1);
@@ -794,7 +813,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
 	m->nb_segs = 1;
 	m->port = MBUF_INVALID_PORT;
 
-	m->ol_flags = 0;
+	m->ol_flags &= EXT_ATTACHED_MBUF;
 	m->packet_type = 0;
 	rte_pktmbuf_reset_headroom(m);
 
@@ -1153,6 +1172,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
  *
  * All other fields of the given packet mbuf will be left intact.
  *
+ * If the packet mbuf was allocated from the pool with pinned
+ * external buffers the rte_pktmbuf_detach does nothing with the
+ * mbuf of this kind, because the pinned buffers are not supposed
+ * to be detached.
+ *
  * @param m
  *   The indirect attached packet mbuf.
  */
@@ -1162,11 +1186,26 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 	uint32_t mbuf_size, buf_len;
 	uint16_t priv_size;
 
-	if (RTE_MBUF_HAS_EXTBUF(m))
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		/*
+		 * The mbuf has the external attached buffer,
+		 * we should check the type of the memory pool where
+		 * the mbuf was allocated from to detect the pinned
+		 * external buffer.
+		 */
+		uint32_t flags = rte_pktmbuf_priv_flags(mp);
+
+		if (flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
+			/*
+			 * The pinned external buffer should not be
+			 * detached from its backing mbuf, just exit.
+			 */
+			return;
+		}
 		__rte_pktmbuf_free_extbuf(m);
-	else
+	} else {
 		__rte_pktmbuf_free_direct(m);
-
+	}
 	priv_size = rte_pktmbuf_priv_size(mp);
 	mbuf_size = (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
 	buf_len = rte_pktmbuf_data_room_size(mp);
@@ -1181,6 +1220,41 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 }
 
 /**
+ * @internal Handle the packet mbufs with attached pinned external buffer
+ * on the mbuf freeing:
+ *
+ *  - return zero if reference counter in shinfo is one. It means there is
+ *  no more reference to this pinned buffer and mbuf can be returned to
+ *  the pool
+ *
+ *  - otherwise (if reference counter is not one), decrement reference
+ *  counter and return non-zero value to prevent freeing the backing mbuf.
+ *
+ * Returns non zero if mbuf should not be freed.
+ */
+static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
+{
+	struct rte_mbuf_ext_shared_info *shinfo;
+
+	/* Clear flags, mbuf is being freed. */
+	m->ol_flags = EXT_ATTACHED_MBUF;
+	shinfo = m->shinfo;
+	/* Optimize for performance - do not dec/reinit */
+	if (likely(rte_mbuf_ext_refcnt_read(shinfo) == 1))
+		return 0;
+	/*
+	 * Direct usage of add primitive to avoid
+	 * duplication of comparing with one.
+	 */
+	if (likely(rte_atomic16_add_return
+			(&shinfo->refcnt_atomic, -1)))
+		return 1;
+	/* Reinitialize counter before mbuf freeing. */
+	rte_mbuf_ext_refcnt_set(shinfo, 1);
+	return 0;
+}
+
+/**
  * Decrease reference counter and unlink a mbuf segment
  *
  * This function does the same than a free, except that it does not
@@ -1201,8 +1275,13 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 
 	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
 
-		if (!RTE_MBUF_DIRECT(m))
-			rte_pktmbuf_detach(m);
+		if (!RTE_MBUF_DIRECT(m)) {
+			if (!RTE_MBUF_HAS_EXTBUF(m) ||
+			    !RTE_MBUF_HAS_PINNED_EXTBUF(m))
+				rte_pktmbuf_detach(m);
+			else if (__rte_pktmbuf_pinned_extbuf_decref(m))
+				return NULL;
+		}
 
 		if (m->next != NULL) {
 			m->next = NULL;
@@ -1213,8 +1292,13 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 
 	} else if (__rte_mbuf_refcnt_update(m, -1) == 0) {
 
-		if (!RTE_MBUF_DIRECT(m))
-			rte_pktmbuf_detach(m);
+		if (!RTE_MBUF_DIRECT(m)) {
+			if (!RTE_MBUF_HAS_EXTBUF(m) ||
+			    !RTE_MBUF_HAS_PINNED_EXTBUF(m))
+				rte_pktmbuf_detach(m);
+			else if (__rte_pktmbuf_pinned_extbuf_decref(m))
+				return NULL;
+		}
 
 		if (m->next != NULL) {
 			m->next = NULL;
-- 
1.8.3.1
* [dpdk-dev] [PATCH v5 3/5] mbuf: create packet pool with external memory buffers
  2020-01-20 17:23 ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
@ 2020-01-20 17:23   ` Viacheslav Ovsiienko
  2020-01-20 17:46     ` Olivier Matz
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 4/5] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 17:23 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
The dedicated routine rte_pktmbuf_pool_create_extbuf() is
provided to create an mbuf pool with data buffers located in
pinned external memory. The application provides the
external memory description and the routine initializes each
mbuf with the appropriate virtual and physical buffer address.
It is entirely the application's responsibility to register
the external memory with the rte_extmem_register() API, map this
memory, etc.
The newly introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
is set in the private pool structure, specifying the new special
pool type. The mbufs allocated from a pool of this kind will
have the EXT_ATTACHED_MBUF flag set and an initialized shared
info structure, allowing cloning with regular mbufs (without
attached external buffers of any kind).
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
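Note for readers: the descriptor check and the per-mbuf buffer walk described
above can be modelled outside DPDK. The sketch below is a hedged, standalone
approximation, not the library code: struct extmem_desc, struct extmem_cursor,
extmem_capacity() and extmem_next() are hypothetical names that only mirror the
fields and arithmetic of rte_pktmbuf_extmem and rte_pktmbuf_extmem_init_ctx in
this patch. It assumes every buf_len is a multiple of elt_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_pktmbuf_extmem (addresses omitted). */
struct extmem_desc {
	size_t buf_len;    /* segment length in bytes */
	uint16_t elt_size; /* bytes reserved per mbuf data buffer */
};

/* Hypothetical cursor mirroring struct rte_pktmbuf_extmem_init_ctx. */
struct extmem_cursor {
	unsigned int ext; /* current descriptor index */
	size_t off;       /* byte offset inside the current descriptor */
};

/*
 * Count the data buffers the descriptor array can back.
 * Returns 0 when a descriptor is invalid or too small for data_room_size.
 */
static unsigned int
extmem_capacity(const struct extmem_desc *d, unsigned int num,
		uint16_t data_room_size)
{
	unsigned int i, n_elts = 0;

	for (i = 0; i < num; i++) {
		if (d[i].elt_size == 0 || d[i].buf_len == 0 ||
		    data_room_size > d[i].elt_size)
			return 0;
		n_elts += (unsigned int)(d[i].buf_len / d[i].elt_size);
	}
	return n_elts;
}

/*
 * Hand out the next buffer: store its descriptor index in *seg, return its
 * byte offset, and advance the cursor to the following slot.
 */
static size_t
extmem_next(const struct extmem_desc *d, unsigned int num,
	    struct extmem_cursor *c, unsigned int *seg)
{
	size_t off;

	assert(c->ext < num);
	assert(c->off < d[c->ext].buf_len);

	*seg = c->ext;
	off = c->off;
	c->off += d[c->ext].elt_size;
	if (c->off >= d[c->ext].buf_len) {
		/* Current segment is exhausted, move to the next one. */
		c->off = 0;
		c->ext++;
	}
	return off;
}
```

With two segments of 4096 and 2048 bytes and an elt_size of 2048, the capacity
is three buffers and the walk yields (segment 0, offset 0), (segment 0,
offset 2048), (segment 1, offset 0).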
 lib/librte_mbuf/rte_mbuf.c           | 198 ++++++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h           |  64 +++++++++++
 lib/librte_mbuf/rte_mbuf_version.map |   1 +
 3 files changed, 260 insertions(+), 3 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8fa7f49..a709246 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -59,9 +59,12 @@
 	}
 
 	RTE_ASSERT(mp->elt_size >= sizeof(struct rte_mbuf) +
-		user_mbp_priv->mbuf_data_room_size +
+		((user_mbp_priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
+			sizeof(struct rte_mbuf_ext_shared_info) :
+			user_mbp_priv->mbuf_data_room_size) +
 		user_mbp_priv->mbuf_priv_size);
-	RTE_ASSERT(user_mbp_priv->flags == 0);
+	RTE_ASSERT((user_mbp_priv->flags &
+		    ~RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) == 0);
 
 	mbp_priv = rte_mempool_get_priv(mp);
 	memcpy(mbp_priv, user_mbp_priv, sizeof(*mbp_priv));
@@ -107,6 +110,108 @@
 	m->next = NULL;
 }
 
+/*
+ * @internal The callback routine called when reference counter in shinfo
+ * for mbufs with pinned external buffer reaches zero. It means there is
+ * no more reference to buffer backing mbuf and this one should be freed.
+ * This routine is called for the regular (not with pinned external or
+ * indirect buffer) mbufs on detaching from the mbuf with pinned external
+ * buffer.
+ */
+static void rte_pktmbuf_free_pinned_extmem(void *addr, void *opaque)
+{
+	struct rte_mbuf *m = opaque;
+
+	RTE_SET_USED(addr);
+	RTE_ASSERT(RTE_MBUF_HAS_EXTBUF(m));
+	RTE_ASSERT(RTE_MBUF_HAS_PINNED_EXTBUF(m));
+	RTE_ASSERT(m->shinfo->fcb_opaque == m);
+
+	rte_mbuf_ext_refcnt_set(m->shinfo, 1);
+	m->ol_flags = EXT_ATTACHED_MBUF;
+	if (m->next != NULL) {
+		m->next = NULL;
+		m->nb_segs = 1;
+	}
+	rte_mbuf_raw_free(m);
+}
+
+/**
+ * @internal Packet mbuf constructor for pools with pinned external memory.
+ *
+ * This function initializes some fields in the mbuf structure that are
+ * not modified by the user once created (origin pool, buffer start
+ * address, and so on). This function is given as a callback function to
+ * rte_mempool_obj_iter() called from rte_pktmbuf_pool_create_extbuf().
+ *
+ * @param mp
+ *   The mempool from which mbufs originate.
+ * @param opaque_arg
+ *   A pointer to the rte_pktmbuf_extmem_init_ctx - initialization
+ *   context structure
+ * @param m
+ *   The mbuf to initialize.
+ * @param i
+ *   The index of the mbuf in the pool table.
+ */
+static void
+__rte_pktmbuf_init_extmem(struct rte_mempool *mp,
+			  void *opaque_arg,
+			  void *_m,
+			  __attribute__((unused)) unsigned int i)
+{
+	struct rte_mbuf *m = _m;
+	struct rte_pktmbuf_extmem_init_ctx *ctx = opaque_arg;
+	const struct rte_pktmbuf_extmem *ext_mem;
+	uint32_t mbuf_size, buf_len, priv_size;
+	struct rte_mbuf_ext_shared_info *shinfo;
+
+	priv_size = rte_pktmbuf_priv_size(mp);
+	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
+	buf_len = rte_pktmbuf_data_room_size(mp);
+
+	RTE_ASSERT(RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) == priv_size);
+	RTE_ASSERT(mp->elt_size >= mbuf_size);
+	RTE_ASSERT(buf_len <= UINT16_MAX);
+
+	memset(m, 0, mbuf_size);
+	m->priv_size = priv_size;
+	m->buf_len = (uint16_t)buf_len;
+
+	/* set the data buffer pointers to external memory */
+	ext_mem = ctx->ext_mem + ctx->ext;
+
+	RTE_ASSERT(ctx->ext < ctx->ext_num);
+	RTE_ASSERT(ctx->off < ext_mem->buf_len);
+
+	m->buf_addr = RTE_PTR_ADD(ext_mem->buf_ptr, ctx->off);
+	m->buf_iova = ext_mem->buf_iova == RTE_BAD_IOVA ?
+		      RTE_BAD_IOVA : (ext_mem->buf_iova + ctx->off);
+
+	ctx->off += ext_mem->elt_size;
+	if (ctx->off >= ext_mem->buf_len) {
+		ctx->off = 0;
+		++ctx->ext;
+	}
+	/* keep some headroom between start of buffer and data */
+	m->data_off = RTE_MIN(RTE_PKTMBUF_HEADROOM, (uint16_t)m->buf_len);
+
+	/* init some constant fields */
+	m->pool = mp;
+	m->nb_segs = 1;
+	m->port = MBUF_INVALID_PORT;
+	m->ol_flags = EXT_ATTACHED_MBUF;
+	rte_mbuf_refcnt_set(m, 1);
+	m->next = NULL;
+
+	/* init external buffer shared info items */
+	shinfo = RTE_PTR_ADD(m, mbuf_size);
+	m->shinfo = shinfo;
+	shinfo->free_cb = rte_pktmbuf_free_pinned_extmem;
+	shinfo->fcb_opaque = m;
+	rte_mbuf_ext_refcnt_set(shinfo, 1);
+}
+
 /* Helper to create a mbuf pool with given mempool ops name*/
 struct rte_mempool *
 rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
@@ -169,6 +274,93 @@ struct rte_mempool *
 			data_room_size, socket_id, NULL);
 }
 
+/* Helper to create a mbuf pool with pinned external data buffers. */
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	const struct rte_pktmbuf_extmem *ext_mem,
+	unsigned int ext_num)
+{
+	struct rte_mempool *mp;
+	struct rte_pktmbuf_pool_private mbp_priv;
+	struct rte_pktmbuf_extmem_init_ctx init_ctx;
+	const char *mp_ops_name;
+	unsigned int elt_size;
+	unsigned int i, n_elts = 0;
+	int ret;
+
+	if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
+		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
+			priv_size);
+		rte_errno = EINVAL;
+		return NULL;
+	}
+	/* Check the external memory descriptors. */
+	for (i = 0; i < ext_num; i++) {
+		const struct rte_pktmbuf_extmem *extm = ext_mem + i;
+
+		if (!extm->elt_size || !extm->buf_len || !extm->buf_ptr) {
+			RTE_LOG(ERR, MBUF, "invalid extmem descriptor\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		if (data_room_size > extm->elt_size) {
+			RTE_LOG(ERR, MBUF, "ext elt_size=%u is too small\n",
+				extm->elt_size);
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		n_elts += extm->buf_len / extm->elt_size;
+	}
+	/* Check whether enough external memory provided. */
+	if (n_elts < n) {
+		RTE_LOG(ERR, MBUF, "not enough extmem\n");
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+	elt_size = sizeof(struct rte_mbuf) +
+		   (unsigned int)priv_size +
+		   sizeof(struct rte_mbuf_ext_shared_info);
+
+	memset(&mbp_priv, 0, sizeof(mbp_priv));
+	mbp_priv.mbuf_data_room_size = data_room_size;
+	mbp_priv.mbuf_priv_size = priv_size;
+	mbp_priv.flags = RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
+
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		 sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
+	if (mp == NULL)
+		return NULL;
+
+	mp_ops_name = rte_mbuf_best_mempool_ops();
+	ret = rte_mempool_set_ops_byname(mp, mp_ops_name, NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+	rte_pktmbuf_pool_init(mp, &mbp_priv);
+
+	ret = rte_mempool_populate_default(mp);
+	if (ret < 0) {
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+
+	init_ctx = (struct rte_pktmbuf_extmem_init_ctx){
+		.ext_mem = ext_mem,
+		.ext_num = ext_num,
+		.ext = 0,
+		.off = 0,
+	};
+	rte_mempool_obj_iter(mp, __rte_pktmbuf_init_extmem, &init_ctx);
+
+	return mp;
+}
+
 /* do some sanity checks on a mbuf: panic if it fails */
 void
 rte_mbuf_sanity_check(const struct rte_mbuf *m, int is_header)
@@ -247,7 +439,7 @@ int rte_mbuf_check(const struct rte_mbuf *m, int is_header,
 	return 0;
 }
 
-/**
+/*
  * @internal helper function for freeing a bulk of packet mbuf segments
  * via an array holding the packet mbuf segments from the same mempool
  * pending to be freed.
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 7a41aad..eaeda04 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -637,6 +637,13 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 void rte_pktmbuf_init(struct rte_mempool *mp, void *opaque_arg,
 		      void *m, unsigned i);
 
+/** The context to initialize the mbufs with pinned external buffers. */
+struct rte_pktmbuf_extmem_init_ctx {
+	const struct rte_pktmbuf_extmem *ext_mem; /* descriptor array. */
+	unsigned int ext_num; /* number of descriptors in array. */
+	unsigned int ext; /* loop descriptor index. */
+	size_t off; /* loop buffer offset. */
+};
 
 /**
  * A  packet mbuf pool constructor.
@@ -738,6 +745,63 @@ struct rte_mempool *
 	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
 	int socket_id, const char *ops_name);
 
+/** A structure that describes the pinned external buffer segment. */
+struct rte_pktmbuf_extmem {
+	void *buf_ptr;		/**< The virtual address of data buffer. */
+	rte_iova_t buf_iova;	/**< The IO address of the data buffer. */
+	size_t buf_len;		/**< External buffer length in bytes. */
+	uint16_t elt_size;	/**< mbuf element size in bytes. */
+};
+
+/**
+ * Create a mbuf pool with external pinned data buffers.
+ *
+ * This function creates and initializes a packet mbuf pool that contains
+ * only mbufs with external buffer. It is a wrapper to rte_mempool functions.
+ *
+ * @param name
+ *   The name of the mbuf pool.
+ * @param n
+ *   The number of elements in the mbuf pool. The optimum size (in terms
+ *   of memory usage) for a mempool is when n is a power of two minus one:
+ *   n = (2^q - 1).
+ * @param cache_size
+ *   Size of the per-core object cache. See rte_mempool_create() for
+ *   details.
+ * @param priv_size
+ *   Size of application private area between the rte_mbuf structure
+ *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
+ * @param data_room_size
+ *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
+ * @param socket_id
+ *   The socket identifier where the memory should be allocated. The
+ *   value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the
+ *   reserved zone.
+ * @param ext_mem
+ *   Pointer to the array of structures describing the external memory
+ *   for data buffers. It is the caller's responsibility to register this
+ *   memory with rte_extmem_register() (if needed), map this memory to the
+ *   appropriate physical device, etc.
+ * @param ext_num
+ *   Number of elements in the ext_mem array.
+ * @return
+ *   The pointer to the new allocated mempool, on success. NULL on error
+ *   with rte_errno set appropriately. Possible rte_errno values include:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ */
+__rte_experimental
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	const struct rte_pktmbuf_extmem *ext_mem,
+	unsigned int ext_num);
+
 /**
  * Get the data room size of mbufs stored in a pktmbuf_pool
  *
diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
index 3bbb476..ab161bc 100644
--- a/lib/librte_mbuf/rte_mbuf_version.map
+++ b/lib/librte_mbuf/rte_mbuf_version.map
@@ -44,5 +44,6 @@ EXPERIMENTAL {
 	rte_mbuf_dyn_dump;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
+	rte_pktmbuf_pool_create_extbuf;
 
 };
-- 
1.8.3.1
* [dpdk-dev] [PATCH v5 4/5] app/testpmd: add mempool with external data buffers
  2020-01-20 17:23 ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
@ 2020-01-20 17:23   ` Viacheslav Ovsiienko
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 5/5] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
  2020-01-20 17:30   ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Slava Ovsiienko
  5 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 17:23 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
A new mbuf pool type is added to testpmd. To engage the
mbuf pool with externally attached data buffers, the parameter
"--mp-alloc=xbuf" should be specified on the testpmd command line.
The objective of this patch is just to test whether an mbuf pool
with externally attached data buffers works OK. The memory for
data buffers is allocated from DPDK memory, so this is not
"true" external memory from some physical device (which is
supposed to be the most common use case for this kind of mbuf pool).
The user should be aware that not all drivers support mbufs
with the EXT_ATTACHED_MBUF flag set in a newly allocated mbuf (many
PMDs just overwrite the ol_flags field and the flag value is
lost).
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
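Note for readers: the zone sizing done by setup_extbuf() in this patch is plain
integer arithmetic and can be checked in isolation. The following is a hedged
standalone sketch, not testpmd code: align_ceil() and extbuf_zone_count() are
hypothetical helpers, SKETCH_ZONE_SIZE stands in for EXTBUF_ZONE_SIZE (2 MiB),
and a 64-byte cache line is assumed in place of RTE_CACHE_LINE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ZONE_SIZE (2u * 1024u * 1024u) /* 2 MiB per memzone */
#define SKETCH_CACHE_LINE 64u                 /* assumed cache line size */

/* Round v up to the next multiple of align (align must be a power of two). */
static uint32_t
align_ceil(uint32_t v, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (v + align - 1) & ~(align - 1);
}

/* Number of 2 MiB zones needed to back nb_mbufs buffers of mbuf_sz bytes. */
static uint32_t
extbuf_zone_count(uint32_t nb_mbufs, uint16_t mbuf_sz)
{
	uint32_t elt_size = align_ceil(mbuf_sz, SKETCH_CACHE_LINE);
	uint32_t elt_num;

	assert(elt_size != 0 && elt_size <= SKETCH_ZONE_SIZE);
	/* Whole buffers that fit in one zone; the remainder is unused. */
	elt_num = SKETCH_ZONE_SIZE / elt_size;
	/* Ceiling division: the last zone may be only partially used. */
	return (nb_mbufs + elt_num - 1) / elt_num;
}
```

For a 2176-byte buffer (2048 bytes of data plus 128 bytes of headroom), one
zone holds 963 buffers, so 963 mbufs need one zone and 964 mbufs need two.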
 app/test-pmd/config.c     |  2 ++
 app/test-pmd/flowgen.c    |  3 +-
 app/test-pmd/parameters.c |  2 ++
 app/test-pmd/testpmd.c    | 81 +++++++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/testpmd.h    |  4 ++-
 app/test-pmd/txonly.c     |  3 +-
 6 files changed, 92 insertions(+), 3 deletions(-)
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 52f1d9d..9669cbd 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2422,6 +2422,8 @@ struct igb_ring_desc_16_bytes {
 		return "xmem";
 	case MP_ALLOC_XMEM_HUGE:
 		return "xmemhuge";
+	case MP_ALLOC_XBUF:
+		return "xbuf";
 	default:
 		return "invalid";
 	}
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index 03b72aa..ae50cdc 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -199,7 +199,8 @@
 							   sizeof(*ip_hdr));
 		pkt->nb_segs		= 1;
 		pkt->pkt_len		= pkt_size;
-		pkt->ol_flags		= ol_flags;
+		pkt->ol_flags		&= EXT_ATTACHED_MBUF;
+		pkt->ol_flags		|= ol_flags;
 		pkt->vlan_tci		= vlan_tci;
 		pkt->vlan_tci_outer	= vlan_tci_outer;
 		pkt->l2_len		= sizeof(struct rte_ether_hdr);
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 2e7a504..6340104 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -841,6 +841,8 @@
 					mp_alloc_type = MP_ALLOC_XMEM;
 				else if (!strcmp(optarg, "xmemhuge"))
 					mp_alloc_type = MP_ALLOC_XMEM_HUGE;
+				else if (!strcmp(optarg, "xbuf"))
+					mp_alloc_type = MP_ALLOC_XBUF;
 				else
 					rte_exit(EXIT_FAILURE,
 						"mp-alloc %s invalid - must be: "
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 38dbb12..f9f4cd1 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,7 @@
 #endif
 
 #define EXTMEM_HEAP_NAME "extmem"
+#define EXTBUF_ZONE_SIZE RTE_PGSIZE_2M
 
 uint16_t verbose_level = 0; /**< Silent by default. */
 int testpmd_logtype; /**< Log type for testpmd logs */
@@ -868,6 +869,66 @@ struct extmem_param {
 	}
 }
 
+static unsigned int
+setup_extbuf(uint32_t nb_mbufs, uint16_t mbuf_sz, unsigned int socket_id,
+	    char *pool_name, struct rte_pktmbuf_extmem **ext_mem)
+{
+	struct rte_pktmbuf_extmem *xmem;
+	unsigned int ext_num, zone_num, elt_num;
+	uint16_t elt_size;
+
+	elt_size = RTE_ALIGN_CEIL(mbuf_sz, RTE_CACHE_LINE_SIZE);
+	elt_num = EXTBUF_ZONE_SIZE / elt_size;
+	zone_num = (nb_mbufs + elt_num - 1) / elt_num;
+
+	xmem = malloc(sizeof(struct rte_pktmbuf_extmem) * zone_num);
+	if (xmem == NULL) {
+		TESTPMD_LOG(ERR, "Cannot allocate memory for "
+				 "external buffer descriptors\n");
+		*ext_mem = NULL;
+		return 0;
+	}
+	for (ext_num = 0; ext_num < zone_num; ext_num++) {
+		struct rte_pktmbuf_extmem *xseg = xmem + ext_num;
+		const struct rte_memzone *mz;
+		char mz_name[RTE_MEMZONE_NAMESIZE];
+		int ret;
+
+		ret = snprintf(mz_name, sizeof(mz_name),
+			RTE_MEMPOOL_MZ_FORMAT "_xb_%u", pool_name, ext_num);
+		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+			errno = ENAMETOOLONG;
+			ext_num = 0;
+			break;
+		}
+		mz = rte_memzone_reserve_aligned(mz_name, EXTBUF_ZONE_SIZE,
+						 socket_id,
+						 RTE_MEMZONE_IOVA_CONTIG |
+						 RTE_MEMZONE_1GB |
+						 RTE_MEMZONE_SIZE_HINT_ONLY,
+						 EXTBUF_ZONE_SIZE);
+		if (mz == NULL) {
+			/*
+			 * The caller exits on external buffer creation
+			 * error, so there is no need to free memzones.
+			 */
+			errno = ENOMEM;
+			ext_num = 0;
+			break;
+		}
+		xseg->buf_ptr = mz->addr;
+		xseg->buf_iova = mz->iova;
+		xseg->buf_len = EXTBUF_ZONE_SIZE;
+		xseg->elt_size = elt_size;
+	}
+	if (ext_num == 0 && xmem != NULL) {
+		free(xmem);
+		xmem = NULL;
+	}
+	*ext_mem = xmem;
+	return ext_num;
+}
+
 /*
  * Configuration initialisation done once at init time.
  */
@@ -936,6 +997,26 @@ struct extmem_param {
 					heap_socket);
 			break;
 		}
+	case MP_ALLOC_XBUF:
+		{
+			struct rte_pktmbuf_extmem *ext_mem;
+			unsigned int ext_num;
+
+			ext_num = setup_extbuf(nb_mbuf,	mbuf_seg_size,
+					       socket_id, pool_name, &ext_mem);
+			if (ext_num == 0)
+				rte_exit(EXIT_FAILURE,
+					 "Can't create pinned data buffers\n");
+
+			TESTPMD_LOG(INFO, "preferred mempool ops selected: %s\n",
+					rte_mbuf_best_mempool_ops());
+			rte_mp = rte_pktmbuf_pool_create_extbuf
+					(pool_name, nb_mbuf, mb_mempool_cache,
+					 0, mbuf_seg_size, socket_id,
+					 ext_mem, ext_num);
+			free(ext_mem);
+			break;
+		}
 	default:
 		{
 			rte_exit(EXIT_FAILURE, "Invalid mempool creation mode\n");
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 7cf48d0..3dd5fc7 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -76,8 +76,10 @@ enum {
 	/**< allocate mempool natively, but populate using anonymous memory */
 	MP_ALLOC_XMEM,
 	/**< allocate and populate mempool using anonymous memory */
-	MP_ALLOC_XMEM_HUGE
+	MP_ALLOC_XMEM_HUGE,
 	/**< allocate and populate mempool using anonymous hugepage memory */
+	MP_ALLOC_XBUF
+	/**< allocate mempool natively, use rte_pktmbuf_pool_create_extbuf */
 };
 
 #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 3caf281..871cf6c 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -170,7 +170,8 @@
 
 	rte_pktmbuf_reset_headroom(pkt);
 	pkt->data_len = tx_pkt_seg_lengths[0];
-	pkt->ol_flags = ol_flags;
+	pkt->ol_flags &= EXT_ATTACHED_MBUF;
+	pkt->ol_flags |= ol_flags;
 	pkt->vlan_tci = vlan_tci;
 	pkt->vlan_tci_outer = vlan_tci_outer;
 	pkt->l2_len = sizeof(struct rte_ether_hdr);
-- 
1.8.3.1
* [dpdk-dev] [PATCH v5 5/5] net/mlx5: allow use allocated mbuf with external buffer
  2020-01-20 17:23 ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
                     ` (3 preceding siblings ...)
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 4/5] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
@ 2020-01-20 17:23   ` Viacheslav Ovsiienko
  2020-01-20 17:30   ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Slava Ovsiienko
  5 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 17:23 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
In the Rx datapath the flags in newly allocated mbufs
are all explicitly cleared, but the EXT_ATTACHED_MBUF flag must be
preserved. This allows the use of mbuf pools with pre-attached
external data buffers.
The vectorized rx_burst routines are updated to
inherit EXT_ATTACHED_MBUF from the mbuf pool private
RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF flag.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
---
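Note for readers: the scalar Rx change replaces "ol_flags = 0" with a mask that
keeps only the external-buffer bit before the per-packet flags are merged in.
A hedged standalone sketch of that idiom follows. rx_reset_ol_flags() and the
SKETCH_* constants are hypothetical and chosen for illustration only; they are
not the DPDK definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag bits, local to this sketch. */
#define SKETCH_EXT_ATTACHED (1ULL << 61) /* plays the role of EXT_ATTACHED_MBUF */
#define SKETCH_RX_FLAG_A    (1ULL << 1)  /* some per-packet Rx offload flag */
#define SKETCH_RX_FLAG_B    (1ULL << 7)  /* another per-packet Rx offload flag */

/*
 * Drop every stale flag except the external-buffer marker, then merge the
 * flags computed for the newly received packet.
 */
static uint64_t
rx_reset_ol_flags(uint64_t old_flags, uint64_t rx_flags)
{
	/* Rx code never sets the attachment bit itself. */
	assert((rx_flags & SKETCH_EXT_ATTACHED) == 0);
	return (old_flags & SKETCH_EXT_ATTACHED) | rx_flags;
}
```

An mbuf from a pinned pool keeps its attachment bit across reuse, while an mbuf
from a regular pool ends up with exactly the new Rx flags, as before.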
 drivers/net/mlx5/mlx5_rxq.c              |  7 ++++++-
 drivers/net/mlx5/mlx5_rxtx.c             |  2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |  2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         | 14 ++++----------
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |  5 ++---
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 29 +++++++++++++++--------------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |  2 +-
 7 files changed, 30 insertions(+), 31 deletions(-)
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index c936a7f..4092cb7 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -225,6 +225,9 @@
 	if (mlx5_rxq_check_vec_support(&rxq_ctrl->rxq) > 0) {
 		struct mlx5_rxq_data *rxq = &rxq_ctrl->rxq;
 		struct rte_mbuf *mbuf_init = &rxq->fake_mbuf;
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(rxq_ctrl->rxq.mp);
 		int j;
 
 		/* Initialize default rearm_data for vPMD. */
@@ -232,13 +235,15 @@
 		rte_mbuf_refcnt_set(mbuf_init, 1);
 		mbuf_init->nb_segs = 1;
 		mbuf_init->port = rxq->port_id;
+		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)
+			mbuf_init->ol_flags = EXT_ATTACHED_MBUF;
 		/*
 		 * prevent compiler reordering:
 		 * rearm_data covers previous fields.
 		 */
 		rte_compiler_barrier();
 		rxq->mbuf_initializer =
-			*(uint64_t *)&mbuf_init->rearm_data;
+			*(rte_xmm_t *)&mbuf_init->rearm_data;
 		/* Padding with a fake mbuf for vectorized Rx. */
 		for (j = 0; j < MLX5_VPMD_DESCS_PER_LOOP; ++j)
 			(*rxq->elts)[elts_n + j] = &rxq->fake_mbuf;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 67cafd1..5e31f01 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1337,7 +1337,7 @@ enum mlx5_txcmp_code {
 			}
 			pkt = seg;
 			assert(len >= (rxq->crc_present << 2));
-			pkt->ol_flags = 0;
+			pkt->ol_flags &= EXT_ATTACHED_MBUF;
 			/* If compressed, take hash result from mini-CQE. */
 			rss_hash_res = rte_be_to_cpu_32(mcqe == NULL ?
 							cqe->rx_hash_res :
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index b6a33c5..3f659d2 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -144,7 +144,7 @@ struct mlx5_rxq_data {
 	struct mlx5_mprq_buf *mprq_repl; /* Stashed mbuf for replenish. */
 	uint16_t idx; /* Queue index. */
 	struct mlx5_rxq_stats stats;
-	uint64_t mbuf_initializer; /* Default rearm_data for vectorized Rx. */
+	rte_xmm_t mbuf_initializer; /* Default rearm/flags for vectorized Rx. */
 	struct rte_mbuf fake_mbuf; /* elts padding for vectorized Rx. */
 	void *cq_uar; /* CQ user access region. */
 	uint32_t cqn; /* CQ number. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index 85e0bd5..d8c07f2 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -97,18 +97,12 @@
 		void *buf_addr;
 
 		/*
-		 * Load the virtual address for Rx WQE. non-x86 processors
-		 * (mostly RISC such as ARM and Power) are more vulnerable to
-		 * load stall. For x86, reducing the number of instructions
-		 * seems to matter most.
+		 * In order to support the mbufs with external attached
+		 * data buffer we should use the buf_addr pointer instead of
+		 * rte_mbuf_buf_addr(). It touches the mbuf itself and may
+		 * impact the performance.
 		 */
-#ifdef RTE_ARCH_X86_64
 		buf_addr = elts[i]->buf_addr;
-		assert(buf_addr == rte_mbuf_buf_addr(elts[i], rxq->mp));
-#else
-		buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
-		assert(buf_addr == elts[i]->buf_addr);
-#endif
 		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
 					      RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKey in WQE. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 8e79883..9e5c6ee 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -344,9 +344,8 @@
 		PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 		PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED};
 	const vector unsigned char mbuf_init =
-		(vector unsigned char)(vector unsigned long){
-		*(__attribute__((__aligned__(8))) unsigned long *)
-		&rxq->mbuf_initializer, 0LL};
+		(vector unsigned char)vec_vsx_ld
+			(0, (vector unsigned char *)&rxq->mbuf_initializer);
 	const vector unsigned short rearm_sel_mask =
 		(vector unsigned short){0, 0, 0, 0, 0xffff, 0xffff, 0, 0};
 	vector unsigned char rearm0, rearm1, rearm2, rearm3;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 86785c7..332e9ac 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -264,8 +264,8 @@
 	const uint32x4_t cv_mask =
 		vdupq_n_u32(PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			    PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
-	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
-	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
+	const uint64x2_t mbuf_init = vld1q_u64
+				((const uint64_t *)&rxq->mbuf_initializer);
 	uint64x2_t rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
@@ -326,18 +326,19 @@
 	/* Merge to ol_flags. */
 	ol_flags = vorrq_u32(ol_flags, cv_flags);
 	/* Merge mbuf_init and ol_flags, and store. */
-	rearm0 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_high_u64(vreinterpretq_u64_u32(
-						       ol_flags)), 32));
-	rearm1 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_high_u64(vreinterpretq_u64_u32(
-						     ol_flags)), r32_mask));
-	rearm2 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_low_u64(vreinterpretq_u64_u32(
-						      ol_flags)), 32));
-	rearm3 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_low_u64(vreinterpretq_u64_u32(
-						    ol_flags)), r32_mask));
+	rearm0 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 3),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm1 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 2),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm2 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 1),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm3 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 0),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+
 	vst1q_u64((void *)&pkts[0]->rearm_data, rearm0);
 	vst1q_u64((void *)&pkts[1]->rearm_data, rearm1);
 	vst1q_u64((void *)&pkts[2]->rearm_data, rearm2);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 35b7761..07d40d5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -259,7 +259,7 @@
 			      PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			      PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
 	const __m128i mbuf_init =
-		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
+		_mm_load_si128((__m128i *)&rxq->mbuf_initializer);
 	__m128i rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
-- 
1.8.3.1
* Re: [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned external buffer
  2020-01-20 17:23 ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
                     ` (4 preceding siblings ...)
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 5/5] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
@ 2020-01-20 17:30   ` Slava Ovsiienko
  2020-01-20 17:41     ` Olivier Matz
  5 siblings, 1 reply; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-20 17:30 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler,
	olivier.matz, stephen, thomas
The unit test (as part of the test_mbuf application) will be provided as a separate patch.
With best regards, Slava
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Monday, January 20, 2020 19:23
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; olivier.matz@6wind.com;
> stephen@networkplumber.org; thomas@mellanox.net
> Subject: [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned external
> buffer
> 
> Today's pktmbuf pool contains only mbufs with no external buffers.
> This means data buffer for the mbuf should be placed right after the mbuf
> structure (+ the private data when enabled).
> 
> On some cases, the application would want to have the buffers allocated from
> a different device in the platform. This is in order to do zero copy for the
> packet directly to the device memory. Examples for such devices can be GPU
> or storage device. For such cases the native pktmbuf pool does not fit since
> each mbuf would need to point to external buffer.
> 
> To support above, the pktmbuf pool will be populated with mbuf pointing to
> the device buffers using the mbuf external buffer feature.
> The PMD will populate its receive queues with those buffer, so that every
> packet received will be scattered directly to the device memory.
> on the other direction, embedding the buffer pointer to the transmit queues
> of the NIC, will make the DMA to fetch device memory using peer to peer
> communication.
> 
> Such mbuf with external buffer should be handled with care when mbuf is
> freed. Mainly The external buffer should not be detached, so that it can be
> reused for the next packet receive.
> 
> This patch introduce a new flag on the rte_pktmbuf_pool_private structure to
> specify this mempool is for mbuf with pinned external buffer. Upon detach
> this flag is validated and buffer is not detached.
> A new mempool create wrapper is also introduced to help application to
> create and populate such mempool.
> 
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> 
> RFC: http://patches.dpdk.org/patch/63077
> v1: - http://patches.dpdk.org/cover/64424
> v2: - fix rte_experimental issue on comment addressing
>     - rte_mbuf_has_pinned_extbuf return type is uint32_t
>     - fix Power9 compilation issue
> v3: - fix "#include <stdbool.h>" leftover
> v4: - https://patches.dpdk.org/cover/64809/
>     - introduce rte_pktmbuf_priv_flags
>     - support cloning pinned mbufs as for regular mbufs
>       with external buffers
>     - address the minor comments
> v5: - update rte_pktmbuf_prefree_seg
>     - rename __rte_pktmbuf_extbuf_detach
>     - addressing comment
>     - fix typos
> 
> Viacheslav Ovsiienko (5):
>   mbuf: introduce routine to get private mbuf pool flags
>   mbuf: detach mbuf with pinned external buffer
>   mbuf: create packet pool with external memory buffers
>   app/testpmd: add mempool with external data buffers
>   net/mlx5: allow use allocated mbuf with external buffer
> 
>  app/test-pmd/config.c                    |   2 +
>  app/test-pmd/flowgen.c                   |   3 +-
>  app/test-pmd/parameters.c                |   2 +
>  app/test-pmd/testpmd.c                   |  81 +++++++++++++
>  app/test-pmd/testpmd.h                   |   4 +-
>  app/test-pmd/txonly.c                    |   3 +-
>  drivers/net/mlx5/mlx5_rxq.c              |   7 +-
>  drivers/net/mlx5/mlx5_rxtx.c             |   2 +-
>  drivers/net/mlx5/mlx5_rxtx.h             |   2 +-
>  drivers/net/mlx5/mlx5_rxtx_vec.h         |  14 +--
>  drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |   5 +-
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h    |  29 ++---
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |   2 +-
>  lib/librte_mbuf/rte_mbuf.c               | 198 ++++++++++++++++++++++++++++++-
>  lib/librte_mbuf/rte_mbuf.h               | 183 ++++++++++++++++++++++++++--
>  lib/librte_mbuf/rte_mbuf_version.map     |   1 +
>  16 files changed, 492 insertions(+), 46 deletions(-)
> 
> --
> 1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v4 3/5] mbuf: create packet pool with external memory buffers
  2020-01-20 13:59     ` Olivier Matz
@ 2020-01-20 17:33       ` Slava Ovsiienko
  0 siblings, 0 replies; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-20 17:33 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler,
	stephen, thomas
Hi, Olivier
Great! Thanks a lot for the unit test template, this is a very nice starting point.
I will think about what would be good to add for more thorough testing.
With best regards, Slava
 
> One more thing: it would be nice to have a functional test for this new
> feature. I did a very minimal one to check the basic alloc/free/attach
> feature, you can restart from that if you want.
> 
> static int
> test_ext_pinned(int test_case)
> {
> 	struct rte_pktmbuf_extmem ext_mem;
> 	struct rte_mempool *pinned_pool = NULL;
> 	struct rte_mempool *std_pool = NULL;
> 	const struct rte_memzone *mz = NULL;
> 	struct rte_mbuf *m = NULL, *m2 = NULL;
> 
> 	printf("Test mbuf pool with mbufs data pinned to external buffer (%d)\n",
> 		test_case);
> 
> 	std_pool = rte_pktmbuf_pool_create("std_pool",
> 			NB_MBUF, MEMPOOL_CACHE_SIZE, 0, MBUF_DATA_SIZE,
> 			SOCKET_ID_ANY);
> 	if (std_pool == NULL)
> 		GOTO_FAIL("std_pool alloc failed");
> 
> 	mz = rte_memzone_reserve("std_pool",
> 				NB_MBUF * sizeof(struct rte_mbuf),
> 				SOCKET_ID_ANY,
> 				RTE_MEMZONE_2MB|RTE_MEMZONE_SIZE_HINT_ONLY);
> 	if (mz == NULL)
> 		GOTO_FAIL("memzone alloc failed");
> 
> 	ext_mem.buf_ptr = mz->addr;
> 	ext_mem.buf_iova = mz->iova;
> 	ext_mem.buf_len = mz->len;
> 	ext_mem.elt_size = sizeof(struct rte_mbuf);
> 
> 	pinned_pool = rte_pktmbuf_pool_create_extbuf("pinned_pool",
> 					NB_MBUF, MEMPOOL_CACHE_SIZE,
> 					0, 0, SOCKET_ID_ANY, &ext_mem, 1);
> 	if (pinned_pool == NULL)
> 		GOTO_FAIL("pinned_pool alloc failed");
> 
> 	m = rte_pktmbuf_alloc(pinned_pool);
> 	if (unlikely(m == NULL))
> 		goto fail;
> 
> 	if (test_case != 0) {
> 		m2 = rte_pktmbuf_alloc(std_pool);
> 		if (unlikely(m2 == NULL))
> 			goto fail;
> 		rte_pktmbuf_attach(m2, m);
> 	}
> 
> 	if (test_case == 0) {
> 		rte_pktmbuf_free(m);
> 	} else if (test_case == 1) {
> 		rte_pktmbuf_free(m);
> 		rte_pktmbuf_free(m2);
> 	} else if (test_case == 2) {
> 		rte_pktmbuf_free(m2);
> 		rte_pktmbuf_free(m);
> 	}
> 
> 
> 	rte_mempool_free(pinned_pool);
> 	rte_memzone_free(mz);
> 	rte_mempool_free(std_pool);
> 	return 0;
> 
> fail:
> 	rte_pktmbuf_free(m2);
> 	rte_pktmbuf_free(m);
> 	rte_mempool_free(pinned_pool);
> 	rte_memzone_free(mz);
> 	rte_mempool_free(std_pool);
> 	return -1;
> }
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 2/5] mbuf: detach mbuf with pinned external buffer
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
@ 2020-01-20 17:40     ` Olivier Matz
  0 siblings, 0 replies; 77+ messages in thread
From: Olivier Matz @ 2020-01-20 17:40 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, orika, shahafs, stephen, thomas
On Mon, Jan 20, 2020 at 05:23:20PM +0000, Viacheslav Ovsiienko wrote:
> Update detach routine to check the mbuf pool type.
> Introduce the special internal version of detach routine to handle
> the special case of pinned external buffer on mbuf freeing.
> 
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
[...]
In case there is a new version, can you please add some newlines:
> +static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
> +{
> +	struct rte_mbuf_ext_shared_info *shinfo;
> +
> +	/* Clear flags, mbuf is being freed. */
> +	m->ol_flags = EXT_ATTACHED_MBUF;
> +	shinfo = m->shinfo;
here
> +	/* Optimize for performance - do not dec/reinit */
> +	if (likely(rte_mbuf_ext_refcnt_read(shinfo) == 1))
> +		return 0;
here
> +	/*
> +	 * Direct usage of add primitive to avoid
> +	 * duplication of comparing with one.
> +	 */
> +	if (likely(rte_atomic16_add_return
> +			(&shinfo->refcnt_atomic, -1)))
> +		return 1;
here
> +	/* Reinitialize counter before mbuf freeing. */
> +	rte_mbuf_ext_refcnt_set(shinfo, 1);
> +	return 0;
> +}
> +
> +/**
>   * Decrease reference counter and unlink a mbuf segment
>   *
>   * This function does the same than a free, except that it does not
Apart from this,
Acked-by: Olivier Matz <olivier.matz@6wind.com>
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned external buffer
  2020-01-20 17:30   ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Slava Ovsiienko
@ 2020-01-20 17:41     ` Olivier Matz
  0 siblings, 0 replies; 77+ messages in thread
From: Olivier Matz @ 2020-01-20 17:41 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler,
	stephen, thomas
On Mon, Jan 20, 2020 at 05:30:45PM +0000, Slava Ovsiienko wrote:
> The unit test (as part of the test_mbuf application) will be provided as a separate patch.
OK, thanks
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 3/5] mbuf: create packet pool with external memory buffers
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
@ 2020-01-20 17:46     ` Olivier Matz
  0 siblings, 0 replies; 77+ messages in thread
From: Olivier Matz @ 2020-01-20 17:46 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, orika, shahafs, stephen, thomas
On Mon, Jan 20, 2020 at 05:23:21PM +0000, Viacheslav Ovsiienko wrote:
> The dedicated routine rte_pktmbuf_pool_create_extbuf() is
> provided to create an mbuf pool with data buffers located in
> pinned external memory. The application provides the
> external memory description and the routine initializes each
> mbuf with the appropriate virtual and physical buffer addresses.
> It is entirely the application's responsibility to register
> the external memory with the rte_extmem_register() API, map this
> memory, etc.
> 
> The newly introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
> is set in the private pool structure, specifying the new special
> pool type. The mbufs allocated from a pool of this kind will
> have the EXT_ATTACHED_MBUF flag set and an initialized shared
> info structure, allowing cloning with regular mbufs (without
> attached external buffers of any kind).
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
[...]
> @@ -247,7 +439,7 @@ int rte_mbuf_check(const struct rte_mbuf *m, int is_header,
>  	return 0;
>  }
>  
> -/**
> +/*
>   * @internal helper function for freeing a bulk of packet mbuf segments
>   * via an array holding the packet mbuf segments from the same mempool
>   * pending to be freed.
It could be removed.
[...]
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 7a41aad..eaeda04 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -637,6 +637,13 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
>  void rte_pktmbuf_init(struct rte_mempool *mp, void *opaque_arg,
>  		      void *m, unsigned i);
>  
> +/** The context to initialize the mbufs with pinned external buffers. */
> +struct rte_pktmbuf_extmem_init_ctx {
> +	const struct rte_pktmbuf_extmem *ext_mem; /* descriptor array. */
> +	unsigned int ext_num; /* number of descriptors in array. */
> +	unsigned int ext; /* loop descriptor index. */
> +	size_t off; /* loop buffer offset. */
> +};
Can this definition be private in the .c ?
Apart from this,
Acked-by: Olivier Matz <olivier.matz@6wind.com>
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v6 0/5] mbuf: detach mbuf with pinned external buffer
  2019-11-18  9:50 [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers Shahaf Shuler
                   ` (5 preceding siblings ...)
  2020-01-20 17:23 ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
@ 2020-01-20 19:16 ` Viacheslav Ovsiienko
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
                     ` (5 more replies)
  2020-01-22  8:50 ` [dpdk-dev] [PATCH] mbuf: fix pinned memory free routine style issue Viacheslav Ovsiienko
  2020-01-24 20:25 ` [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external buffer Viacheslav Ovsiienko
  8 siblings, 6 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 19:16 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
Today's pktmbuf pool contains only mbufs with no external buffers.
This means the data buffer for the mbuf must be placed right after the
mbuf structure (+ the private data when enabled).
In some cases, the application would want to have the buffers allocated
from a different device in the platform. This is in order to do zero
copy for the packet directly to the device memory. Examples of such
devices are a GPU or a storage device. For such cases the native pktmbuf
pool does not fit, since each mbuf would need to point to an external
buffer.
To support the above, the pktmbuf pool will be populated with mbufs
pointing to the device buffers using the mbuf external buffer feature.
The PMD will populate its receive queues with those buffers, so that
every packet received will be scattered directly to the device memory.
In the other direction, embedding the buffer pointer in the transmit
queues of the NIC will make the DMA fetch device memory
using peer-to-peer communication.
Such an mbuf with an external buffer should be handled with care when
the mbuf is freed. Mainly, the external buffer should not be detached,
so that it can be reused for the next packet receive.
This patch introduces a new flag in the rte_pktmbuf_pool_private
structure to specify that this mempool is for mbufs with pinned external
buffers. Upon detach this flag is checked and the buffer is not detached.
A new mempool create wrapper is also introduced to help the application
create and populate such a mempool.
The unit test (as part of the test_mbuf application) will be provided
as a dedicated patch.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
RFC: http://patches.dpdk.org/patch/63077
v1: http://patches.dpdk.org/cover/64424
v2: - fix rte_experimental issue on comment addressing
    - rte_mbuf_has_pinned_extbuf return type is uint32_t
    - fix Power9 compilation issue
v3: - fix "#include <stdbool.h>" leftover
v4: - https://patches.dpdk.org/cover/64809/
    - introduce rte_pktmbuf_priv_flags
    - support cloning pinned mbufs as for regular mbufs
      with external buffers
    - address the minor comments
v5: - http://patches.dpdk.org/cover/64979/
    - update rte_pktmbuf_prefree_seg
    - rename __rte_pktmbuf_extbuf_detach
    - __rte_pktmbuf_init_extmem is static
    - const qualifier is specified for external memory
      description parameter of rte_pktmbuf_pool_create_extbuf
    - addressing minor comments
    - fix typos
v6: - new lines inserted
    - struct rte_pktmbuf_extmem_init_ctx is local to rte_mbuf.c
Viacheslav Ovsiienko (5):
  mbuf: introduce routine to get private mbuf pool flags
  mbuf: detach mbuf with pinned external buffer
  mbuf: create packet pool with external memory buffers
  app/testpmd: add mempool with external data buffers
  net/mlx5: allow use allocated mbuf with external buffer
 app/test-pmd/config.c                    |   2 +
 app/test-pmd/flowgen.c                   |   3 +-
 app/test-pmd/parameters.c                |   2 +
 app/test-pmd/testpmd.c                   |  81 ++++++++++++
 app/test-pmd/testpmd.h                   |   4 +-
 app/test-pmd/txonly.c                    |   3 +-
 drivers/net/mlx5/mlx5_rxq.c              |   7 +-
 drivers/net/mlx5/mlx5_rxtx.c             |   2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |   2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         |  14 +--
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |   5 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    |  29 ++---
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |   2 +-
 lib/librte_mbuf/rte_mbuf.c               | 204 ++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h               | 180 +++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_version.map     |   1 +
 16 files changed, 495 insertions(+), 46 deletions(-)
-- 
1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v6 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-20 19:16 ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
@ 2020-01-20 19:16   ` Viacheslav Ovsiienko
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 19:16 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
The routine rte_pktmbuf_priv_flags is introduced to fetch
the flags from the mbuf memory pool private structure
in a unified fashion.
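For readers without a DPDK tree at hand, the accessor pattern can be
modelled roughly as below. This is only a sketch, not the patch itself:
struct pool, struct pool_private, pool_priv_flags(),
pool_has_pinned_extbuf() and POOL_F_PINNED_EXT_BUF are hypothetical
stand-ins for rte_mempool, rte_pktmbuf_pool_private,
rte_pktmbuf_priv_flags() and the flag bit defined in patch 2/5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_pktmbuf_pool_private. */
struct pool_private {
	uint16_t mbuf_data_room_size;
	uint16_t mbuf_priv_size;
	uint32_t flags;
};

/* Hypothetical stand-in for rte_mempool: only the private area is modelled. */
struct pool {
	struct pool_private priv;
};

/* Hypothetical flag bit, mirroring the "pinned external buffer" pool type. */
#define POOL_F_PINNED_EXT_BUF (1u << 0)

/* Single place that knows where the flags live inside the pool. */
static inline uint32_t
pool_priv_flags(const struct pool *mp)
{
	assert(mp != NULL);
	return mp->priv.flags;
}

/* Callers test pool properties through the accessor only. */
static inline int
pool_has_pinned_extbuf(const struct pool *mp)
{
	return (pool_priv_flags(mp) & POOL_F_PINNED_EXT_BUF) != 0;
}
```

Keeping the lookup in one inline helper means later checks do not need
to know the layout of the private pool area.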
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mbuf/rte_mbuf.h | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 2d4bda2..9b0691d 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -306,6 +306,23 @@ struct rte_pktmbuf_pool_private {
 	uint32_t flags; /**< reserved for future use. */
 };
 
+/**
+ * Return the flags from private data in a mempool structure.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   The flags from the private data structure.
+ */
+static inline uint32_t
+rte_pktmbuf_priv_flags(struct rte_mempool *mp)
+{
+	struct rte_pktmbuf_pool_private *mbp_priv;
+
+	mbp_priv = (struct rte_pktmbuf_pool_private *)rte_mempool_get_priv(mp);
+	return mbp_priv->flags;
+}
+
 #ifdef RTE_LIBRTE_MBUF_DEBUG
 
 /**  check mbuf type in debug mode */
-- 
1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v6 2/5] mbuf: detach mbuf with pinned external buffer
  2020-01-20 19:16 ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
@ 2020-01-20 19:16   ` Viacheslav Ovsiienko
  2023-12-06 10:55     ` [dpdk-dev] [PATCH v6 2/5] mbuf: detach mbuf with pinned externalbuffer Morten Brørup
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 19:16 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
Update detach routine to check the mbuf pool type.
Introduce the special internal version of detach routine to handle
the special case of pinned external buffer on mbuf freeing.
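The reference counting done on free can be sketched outside DPDK as
follows. This is an illustration under assumptions, not the patch code:
struct shared_info and pinned_extbuf_decref() are hypothetical names,
and C11 atomics stand in for the rte_atomic16 and
rte_mbuf_ext_refcnt_* helpers.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for rte_mbuf_ext_shared_info. */
struct shared_info {
	atomic_int refcnt;
};

/*
 * Returns 0 when the caller held the last reference (the counter is left
 * at 1, ready for the next use of the buffer) and 1 when other references
 * remain, in which case the backing mbuf must not be returned to the pool.
 */
static int
pinned_extbuf_decref(struct shared_info *shinfo)
{
	/* Fast path: sole owner, no atomic read-modify-write needed. */
	if (atomic_load(&shinfo->refcnt) == 1)
		return 0;

	/* fetch_sub returns the old value, so old - 1 is the new value. */
	if (atomic_fetch_sub(&shinfo->refcnt, 1) - 1 != 0)
		return 1;

	/* Dropped to zero: re-arm the counter before the mbuf is reused. */
	atomic_store(&shinfo->refcnt, 1);
	return 0;
}
```

In single-threaded use the last branch is not reached; it covers the
case where another holder drops its reference between the read and the
decrement.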
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mbuf/rte_mbuf.h | 105 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 96 insertions(+), 9 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 9b0691d..c4f5085 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -323,6 +323,24 @@ struct rte_pktmbuf_pool_private {
 	return mbp_priv->flags;
 }
 
+/**
+ * When set, pktmbuf mempool will hold only mbufs with pinned external
+ * buffer. The external buffer will be attached to the mbuf at the
+ * memory pool creation and will never be detached by the mbuf free calls.
+ * mbuf should not contain any room for data after the mbuf structure.
+ */
+#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
+
+/**
+ * Returns non zero if given mbuf has a pinned external buffer, or zero
+ * otherwise. The pinned external buffer is allocated at pool creation
+ * time and should not be freed on mbuf freeing.
+ *
+ * External buffer is a user-provided anonymous buffer.
+ */
+#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) \
+	(rte_pktmbuf_priv_flags(mb->pool) & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)
+
 #ifdef RTE_LIBRTE_MBUF_DEBUG
 
 /**  check mbuf type in debug mode */
@@ -588,7 +606,8 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 static __rte_always_inline void
 rte_mbuf_raw_free(struct rte_mbuf *m)
 {
-	RTE_ASSERT(RTE_MBUF_DIRECT(m));
+	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
+		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
 	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
 	RTE_ASSERT(m->next == NULL);
 	RTE_ASSERT(m->nb_segs == 1);
@@ -794,7 +813,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
 	m->nb_segs = 1;
 	m->port = MBUF_INVALID_PORT;
 
-	m->ol_flags = 0;
+	m->ol_flags &= EXT_ATTACHED_MBUF;
 	m->packet_type = 0;
 	rte_pktmbuf_reset_headroom(m);
 
@@ -1153,6 +1172,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
  *
  * All other fields of the given packet mbuf will be left intact.
  *
+ * If the packet mbuf was allocated from the pool with pinned
+ * external buffers the rte_pktmbuf_detach does nothing with the
+ * mbuf of this kind, because the pinned buffers are not supposed
+ * to be detached.
+ *
  * @param m
  *   The indirect attached packet mbuf.
  */
@@ -1162,11 +1186,26 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 	uint32_t mbuf_size, buf_len;
 	uint16_t priv_size;
 
-	if (RTE_MBUF_HAS_EXTBUF(m))
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		/*
+		 * The mbuf has the external attached buffer,
+		 * we should check the type of the memory pool where
+		 * the mbuf was allocated from to detect the pinned
+		 * external buffer.
+		 */
+		uint32_t flags = rte_pktmbuf_priv_flags(mp);
+
+		if (flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
+			/*
+			 * The pinned external buffer should not be
+			 * detached from its backing mbuf, just exit.
+			 */
+			return;
+		}
 		__rte_pktmbuf_free_extbuf(m);
-	else
+	} else {
 		__rte_pktmbuf_free_direct(m);
-
+	}
 	priv_size = rte_pktmbuf_priv_size(mp);
 	mbuf_size = (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
 	buf_len = rte_pktmbuf_data_room_size(mp);
@@ -1181,6 +1220,44 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 }
 
 /**
+ * @internal Handle the packet mbufs with attached pinned external buffer
+ * on the mbuf freeing:
+ *
+ *  - return zero if reference counter in shinfo is one. It means there is
+ *  no more reference to this pinned buffer and mbuf can be returned to
+ *  the pool
+ *
+ *  - otherwise (if reference counter is not one), decrement reference
+ *  counter and return non-zero value to prevent freeing the backing mbuf.
+ *
+ * Returns non zero if mbuf should not be freed.
+ */
+static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
+{
+	struct rte_mbuf_ext_shared_info *shinfo;
+
+	/* Clear flags, mbuf is being freed. */
+	m->ol_flags = EXT_ATTACHED_MBUF;
+	shinfo = m->shinfo;
+
+	/* Optimize for performance - do not dec/reinit */
+	if (likely(rte_mbuf_ext_refcnt_read(shinfo) == 1))
+		return 0;
+
+	/*
+	 * Direct usage of add primitive to avoid
+	 * duplication of comparing with one.
+	 */
+	if (likely(rte_atomic16_add_return
+			(&shinfo->refcnt_atomic, -1)))
+		return 1;
+
+	/* Reinitialize counter before mbuf freeing. */
+	rte_mbuf_ext_refcnt_set(shinfo, 1);
+	return 0;
+}
+
+/**
  * Decrease reference counter and unlink a mbuf segment
  *
  * This function does the same than a free, except that it does not
@@ -1201,8 +1278,13 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 
 	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
 
-		if (!RTE_MBUF_DIRECT(m))
-			rte_pktmbuf_detach(m);
+		if (!RTE_MBUF_DIRECT(m)) {
+			if (!RTE_MBUF_HAS_EXTBUF(m) ||
+			    !RTE_MBUF_HAS_PINNED_EXTBUF(m))
+				rte_pktmbuf_detach(m);
+			else if (__rte_pktmbuf_pinned_extbuf_decref(m))
+				return NULL;
+		}
 
 		if (m->next != NULL) {
 			m->next = NULL;
@@ -1213,8 +1295,13 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 
 	} else if (__rte_mbuf_refcnt_update(m, -1) == 0) {
 
-		if (!RTE_MBUF_DIRECT(m))
-			rte_pktmbuf_detach(m);
+		if (!RTE_MBUF_DIRECT(m)) {
+			if (!RTE_MBUF_HAS_EXTBUF(m) ||
+			    !RTE_MBUF_HAS_PINNED_EXTBUF(m))
+				rte_pktmbuf_detach(m);
+			else if (__rte_pktmbuf_pinned_extbuf_decref(m))
+				return NULL;
+		}
 
 		if (m->next != NULL) {
 			m->next = NULL;
-- 
1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v6 3/5] mbuf: create packet pool with external memory buffers
  2020-01-20 19:16 ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
@ 2020-01-20 19:16   ` Viacheslav Ovsiienko
  2020-01-20 20:48     ` Stephen Hemminger
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 4/5] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 19:16 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
The dedicated routine rte_pktmbuf_pool_create_extbuf() is
provided to create an mbuf pool with data buffers located in
pinned external memory. The application provides the
external memory description and the routine initializes each
mbuf with the appropriate virtual and physical buffer addresses.
It is entirely the application's responsibility to register
the external memory with the rte_extmem_register() API, map this
memory, etc.
The newly introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
is set in the private pool structure, specifying the new special
pool type. The mbufs allocated from a pool of this kind will
have the EXT_ATTACHED_MBUF flag set and an initialized shared
info structure, allowing cloning with regular mbufs (without
attached external buffers of any kind).
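How the descriptor array is turned into per-mbuf buffer addresses can be
sketched as below. This is a simplified model and not the DPDK code:
struct extmem, struct walk_ctx, extmem_capacity() and extmem_next() are
hypothetical names, IOVA handling is left out, and each buf_len is
assumed to be a whole multiple of elt_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced mirror of the external memory descriptor. */
struct extmem {
	void *buf_ptr;      /* virtual address of the region */
	size_t buf_len;     /* region length in bytes */
	uint16_t elt_size;  /* bytes handed to each mbuf */
};

/* Walk state shared across successive per-mbuf init calls. */
struct walk_ctx {
	const struct extmem *ext_mem;
	unsigned int ext_num;
	unsigned int ext;   /* current descriptor index */
	size_t off;         /* offset inside the current descriptor */
};

/* Count how many whole elements all descriptors can hold together. */
static size_t
extmem_capacity(const struct extmem *ext_mem, unsigned int ext_num)
{
	size_t n = 0;

	for (unsigned int i = 0; i < ext_num; i++)
		n += ext_mem[i].buf_len / ext_mem[i].elt_size;
	return n;
}

/* Return the next buffer address, or NULL once descriptors are used up. */
static void *
extmem_next(struct walk_ctx *ctx)
{
	const struct extmem *em;
	void *addr;

	if (ctx->ext >= ctx->ext_num)
		return NULL;
	em = &ctx->ext_mem[ctx->ext];
	addr = (char *)em->buf_ptr + ctx->off;

	/* Advance; move to the next descriptor when this one is exhausted. */
	ctx->off += em->elt_size;
	if (ctx->off >= em->buf_len) {
		ctx->off = 0;
		ctx->ext++;
	}
	return addr;
}
```

With a trailing partial element the off >= buf_len test in this sketch
would hand out a buffer that runs past the end of the region, which is
why the multiple-of-elt_size assumption is stated up front.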
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mbuf/rte_mbuf.c           | 204 ++++++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf.h           |  58 +++++++++-
 lib/librte_mbuf/rte_mbuf_version.map |   1 +
 3 files changed, 260 insertions(+), 3 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8fa7f49..25eea1d 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -59,9 +59,12 @@
 	}
 
 	RTE_ASSERT(mp->elt_size >= sizeof(struct rte_mbuf) +
-		user_mbp_priv->mbuf_data_room_size +
+		((user_mbp_priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
+			sizeof(struct rte_mbuf_ext_shared_info) :
+			user_mbp_priv->mbuf_data_room_size) +
 		user_mbp_priv->mbuf_priv_size);
-	RTE_ASSERT(user_mbp_priv->flags == 0);
+	RTE_ASSERT((user_mbp_priv->flags &
+		    ~RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) == 0);
 
 	mbp_priv = rte_mempool_get_priv(mp);
 	memcpy(mbp_priv, user_mbp_priv, sizeof(*mbp_priv));
@@ -107,6 +110,116 @@
 	m->next = NULL;
 }
 
+/*
+ * @internal The callback routine called when reference counter in shinfo
+ * for mbufs with pinned external buffer reaches zero. It means there is
+ * no more reference to buffer backing mbuf and this one should be freed.
+ * This routine is called for the regular (not with pinned external or
+ * indirect buffer) mbufs on detaching from the mbuf with pinned external
+ * buffer.
+ */
+static void rte_pktmbuf_free_pinned_extmem(void *addr, void *opaque)
+{
+	struct rte_mbuf *m = opaque;
+
+	RTE_SET_USED(addr);
+	RTE_ASSERT(RTE_MBUF_HAS_EXTBUF(m));
+	RTE_ASSERT(RTE_MBUF_HAS_PINNED_EXTBUF(m));
+	RTE_ASSERT(m->shinfo->fcb_opaque == m);
+
+	rte_mbuf_ext_refcnt_set(m->shinfo, 1);
+	m->ol_flags = EXT_ATTACHED_MBUF;
+	if (m->next != NULL) {
+		m->next = NULL;
+		m->nb_segs = 1;
+	}
+	rte_mbuf_raw_free(m);
+}
+
+/** The context to initialize the mbufs with pinned external buffers. */
+struct rte_pktmbuf_extmem_init_ctx {
+	const struct rte_pktmbuf_extmem *ext_mem; /* descriptor array. */
+	unsigned int ext_num; /* number of descriptors in array. */
+	unsigned int ext; /* loop descriptor index. */
+	size_t off; /* loop buffer offset. */
+};
+
+/**
+ * @internal Packet mbuf constructor for pools with pinned external memory.
+ *
+ * This function initializes some fields in the mbuf structure that are
+ * not modified by the user once created (origin pool, buffer start
+ * address, and so on). This function is given as a callback function to
+ * rte_mempool_obj_iter() called from rte_pktmbuf_pool_create_extbuf().
+ *
+ * @param mp
+ *   The mempool from which mbufs originate.
+ * @param opaque_arg
+ *   A pointer to the rte_pktmbuf_extmem_init_ctx - initialization
+ *   context structure
+ * @param m
+ *   The mbuf to initialize.
+ * @param i
+ *   The index of the mbuf in the pool table.
+ */
+static void
+__rte_pktmbuf_init_extmem(struct rte_mempool *mp,
+			  void *opaque_arg,
+			  void *_m,
+			  __attribute__((unused)) unsigned int i)
+{
+	struct rte_mbuf *m = _m;
+	struct rte_pktmbuf_extmem_init_ctx *ctx = opaque_arg;
+	const struct rte_pktmbuf_extmem *ext_mem;
+	uint32_t mbuf_size, buf_len, priv_size;
+	struct rte_mbuf_ext_shared_info *shinfo;
+
+	priv_size = rte_pktmbuf_priv_size(mp);
+	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
+	buf_len = rte_pktmbuf_data_room_size(mp);
+
+	RTE_ASSERT(RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) == priv_size);
+	RTE_ASSERT(mp->elt_size >= mbuf_size);
+	RTE_ASSERT(buf_len <= UINT16_MAX);
+
+	memset(m, 0, mbuf_size);
+	m->priv_size = priv_size;
+	m->buf_len = (uint16_t)buf_len;
+
+	/* set the data buffer pointers to external memory */
+	ext_mem = ctx->ext_mem + ctx->ext;
+
+	RTE_ASSERT(ctx->ext < ctx->ext_num);
+	RTE_ASSERT(ctx->off < ext_mem->buf_len);
+
+	m->buf_addr = RTE_PTR_ADD(ext_mem->buf_ptr, ctx->off);
+	m->buf_iova = ext_mem->buf_iova == RTE_BAD_IOVA ?
+		      RTE_BAD_IOVA : (ext_mem->buf_iova + ctx->off);
+
+	ctx->off += ext_mem->elt_size;
+	if (ctx->off >= ext_mem->buf_len) {
+		ctx->off = 0;
+		++ctx->ext;
+	}
+	/* keep some headroom between start of buffer and data */
+	m->data_off = RTE_MIN(RTE_PKTMBUF_HEADROOM, (uint16_t)m->buf_len);
+
+	/* init some constant fields */
+	m->pool = mp;
+	m->nb_segs = 1;
+	m->port = MBUF_INVALID_PORT;
+	m->ol_flags = EXT_ATTACHED_MBUF;
+	rte_mbuf_refcnt_set(m, 1);
+	m->next = NULL;
+
+	/* init external buffer shared info items */
+	shinfo = RTE_PTR_ADD(m, mbuf_size);
+	m->shinfo = shinfo;
+	shinfo->free_cb = rte_pktmbuf_free_pinned_extmem;
+	shinfo->fcb_opaque = m;
+	rte_mbuf_ext_refcnt_set(shinfo, 1);
+}
+
 /* Helper to create a mbuf pool with given mempool ops name*/
 struct rte_mempool *
 rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
@@ -169,6 +282,93 @@ struct rte_mempool *
 			data_room_size, socket_id, NULL);
 }
 
+/* Helper to create a mbuf pool with pinned external data buffers. */
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	const struct rte_pktmbuf_extmem *ext_mem,
+	unsigned int ext_num)
+{
+	struct rte_mempool *mp;
+	struct rte_pktmbuf_pool_private mbp_priv;
+	struct rte_pktmbuf_extmem_init_ctx init_ctx;
+	const char *mp_ops_name;
+	unsigned int elt_size;
+	unsigned int i, n_elts = 0;
+	int ret;
+
+	if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
+		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
+			priv_size);
+		rte_errno = EINVAL;
+		return NULL;
+	}
+	/* Check the external memory descriptors. */
+	for (i = 0; i < ext_num; i++) {
+		const struct rte_pktmbuf_extmem *extm = ext_mem + i;
+
+		if (!extm->elt_size || !extm->buf_len || !extm->buf_ptr) {
+			RTE_LOG(ERR, MBUF, "invalid extmem descriptor\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		if (data_room_size > extm->elt_size) {
+			RTE_LOG(ERR, MBUF, "ext elt_size=%u is too small\n",
+				extm->elt_size);
+			rte_errno = EINVAL;
+			return NULL;
+		}
+		n_elts += extm->buf_len / extm->elt_size;
+	}
+	/* Check whether enough external memory provided. */
+	if (n_elts < n) {
+		RTE_LOG(ERR, MBUF, "not enough extmem\n");
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+	elt_size = sizeof(struct rte_mbuf) +
+		   (unsigned int)priv_size +
+		   sizeof(struct rte_mbuf_ext_shared_info);
+
+	memset(&mbp_priv, 0, sizeof(mbp_priv));
+	mbp_priv.mbuf_data_room_size = data_room_size;
+	mbp_priv.mbuf_priv_size = priv_size;
+	mbp_priv.flags = RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
+
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		 sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
+	if (mp == NULL)
+		return NULL;
+
+	mp_ops_name = rte_mbuf_best_mempool_ops();
+	ret = rte_mempool_set_ops_byname(mp, mp_ops_name, NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+	rte_pktmbuf_pool_init(mp, &mbp_priv);
+
+	ret = rte_mempool_populate_default(mp);
+	if (ret < 0) {
+		rte_mempool_free(mp);
+		rte_errno = -ret;
+		return NULL;
+	}
+
+	init_ctx = (struct rte_pktmbuf_extmem_init_ctx){
+		.ext_mem = ext_mem,
+		.ext_num = ext_num,
+		.ext = 0,
+		.off = 0,
+	};
+	rte_mempool_obj_iter(mp, __rte_pktmbuf_init_extmem, &init_ctx);
+
+	return mp;
+}
+
 /* do some sanity checks on a mbuf: panic if it fails */
 void
 rte_mbuf_sanity_check(const struct rte_mbuf *m, int is_header)
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index c4f5085..5902389 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -637,7 +637,6 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 void rte_pktmbuf_init(struct rte_mempool *mp, void *opaque_arg,
 		      void *m, unsigned i);
 
-
 /**
  * A  packet mbuf pool constructor.
  *
@@ -738,6 +737,63 @@ struct rte_mempool *
 	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
 	int socket_id, const char *ops_name);
 
+/** A structure that describes the pinned external buffer segment. */
+struct rte_pktmbuf_extmem {
+	void *buf_ptr;		/**< The virtual address of data buffer. */
+	rte_iova_t buf_iova;	/**< The IO address of the data buffer. */
+	size_t buf_len;		/**< External buffer length in bytes. */
+	uint16_t elt_size;	/**< mbuf element size in bytes. */
+};
+
+/**
+ * Create a mbuf pool with external pinned data buffers.
+ *
+ * This function creates and initializes a packet mbuf pool that contains
+ * only mbufs with external buffer. It is a wrapper to rte_mempool functions.
+ *
+ * @param name
+ *   The name of the mbuf pool.
+ * @param n
+ *   The number of elements in the mbuf pool. The optimum size (in terms
+ *   of memory usage) for a mempool is when n is a power of two minus one:
+ *   n = (2^q - 1).
+ * @param cache_size
+ *   Size of the per-core object cache. See rte_mempool_create() for
+ *   details.
+ * @param priv_size
+ *   Size of application private area between the rte_mbuf structure
+ *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
+ * @param data_room_size
+ *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
+ * @param socket_id
+ *   The socket identifier where the memory should be allocated. The
+ *   value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the
+ *   reserved zone.
+ * @param ext_mem
+ *   Pointer to the array of structures describing the external memory
+ *   for data buffers. It is the caller's responsibility to register this
+ *   memory with rte_extmem_register() (if needed), map this memory to the
+ *   appropriate physical device, etc.
+ * @param ext_num
+ *   Number of elements in the ext_mem array.
+ * @return
+ *   The pointer to the newly allocated mempool, on success. NULL on error
+ *   with rte_errno set appropriately. Possible rte_errno values include:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ */
+__rte_experimental
+struct rte_mempool *
+rte_pktmbuf_pool_create_extbuf(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size,
+	uint16_t data_room_size, int socket_id,
+	const struct rte_pktmbuf_extmem *ext_mem,
+	unsigned int ext_num);
+
 /**
  * Get the data room size of mbufs stored in a pktmbuf_pool
  *
diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
index 3bbb476..ab161bc 100644
--- a/lib/librte_mbuf/rte_mbuf_version.map
+++ b/lib/librte_mbuf/rte_mbuf_version.map
@@ -44,5 +44,6 @@ EXPERIMENTAL {
 	rte_mbuf_dyn_dump;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
+	rte_pktmbuf_pool_create_extbuf;
 
 };
-- 
1.8.3.1
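The pool constructor above passes rte_mempool_obj_iter() a context that holds the ext_mem array, a segment index (ext) and a byte offset (off). A minimal sketch of how such a cursor might carve fixed-size data buffers out of the segments is given here. The types and extmem_next() are hypothetical stand-ins, not DPDK API, and the body of __rte_pktmbuf_init_extmem is not shown in this part of the thread, so the exact logic is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_pktmbuf_extmem (IOVA omitted). */
struct extmem_seg {
	char *buf_ptr;     /* start of the external segment */
	size_t buf_len;    /* segment length in bytes */
	uint16_t elt_size; /* size of one data buffer in bytes */
};

/* Hypothetical cursor mirroring the ext/off fields of the init context. */
struct extmem_cursor {
	const struct extmem_seg *ext_mem;
	unsigned int ext_num;
	unsigned int ext; /* index of the current segment */
	size_t off;       /* byte offset inside the current segment */
};

/*
 * Return the next elt_size-byte buffer, or NULL once every segment is
 * exhausted. A tail shorter than elt_size is skipped.
 */
static void *
extmem_next(struct extmem_cursor *c)
{
	while (c->ext < c->ext_num) {
		const struct extmem_seg *seg = &c->ext_mem[c->ext];

		assert(seg->elt_size != 0);
		if (c->off + seg->elt_size <= seg->buf_len) {
			void *buf = seg->buf_ptr + c->off;

			c->off += seg->elt_size;
			return buf;
		}
		/* Current segment is used up: move to the next one. */
		c->ext++;
		c->off = 0;
	}
	return NULL;
}
```

In this sketch a per-object callback would call extmem_next() once per mbuf and attach the returned address as that mbuf's data buffer. A NULL return means the caller supplied too little external memory for the requested number of elements.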
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v6 4/5] app/testpmd: add mempool with external data buffers
  2020-01-20 19:16 ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
@ 2020-01-20 19:16   ` Viacheslav Ovsiienko
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 5/5] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
  2020-01-20 22:55   ` [dpdk-dev] [PATCH v6 0/5] mbuf: detach mbuf with pinned " Thomas Monjalon
  5 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 19:16 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
A new mbuf pool type is added to testpmd. To use an mbuf pool
with externally attached data buffers, specify the parameter
"--mp-alloc=xbuf" on the testpmd command line.
The objective of this patch is only to test whether an mbuf pool
with externally attached data buffers works correctly. The memory
for the data buffers is allocated from DPDK memory, so it is not
"true" external memory from some physical device (which is
expected to be the most common use case for this kind of mbuf pool).
The user should be aware that not all drivers support mbufs
with the EXT_ATTACHED_MBUF flag set in a newly allocated mbuf (many
PMDs just overwrite the ol_flags field, so the flag value is
lost).
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test-pmd/config.c     |  2 ++
 app/test-pmd/flowgen.c    |  3 +-
 app/test-pmd/parameters.c |  2 ++
 app/test-pmd/testpmd.c    | 81 +++++++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/testpmd.h    |  4 ++-
 app/test-pmd/txonly.c     |  3 +-
 6 files changed, 92 insertions(+), 3 deletions(-)
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 52f1d9d..9669cbd 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2422,6 +2422,8 @@ struct igb_ring_desc_16_bytes {
 		return "xmem";
 	case MP_ALLOC_XMEM_HUGE:
 		return "xmemhuge";
+	case MP_ALLOC_XBUF:
+		return "xbuf";
 	default:
 		return "invalid";
 	}
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index 03b72aa..ae50cdc 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -199,7 +199,8 @@
 							   sizeof(*ip_hdr));
 		pkt->nb_segs		= 1;
 		pkt->pkt_len		= pkt_size;
-		pkt->ol_flags		= ol_flags;
+		pkt->ol_flags		&= EXT_ATTACHED_MBUF;
+		pkt->ol_flags		|= ol_flags;
 		pkt->vlan_tci		= vlan_tci;
 		pkt->vlan_tci_outer	= vlan_tci_outer;
 		pkt->l2_len		= sizeof(struct rte_ether_hdr);
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 2e7a504..6340104 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -841,6 +841,8 @@
 					mp_alloc_type = MP_ALLOC_XMEM;
 				else if (!strcmp(optarg, "xmemhuge"))
 					mp_alloc_type = MP_ALLOC_XMEM_HUGE;
+				else if (!strcmp(optarg, "xbuf"))
+					mp_alloc_type = MP_ALLOC_XBUF;
 				else
 					rte_exit(EXIT_FAILURE,
 						"mp-alloc %s invalid - must be: "
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 38dbb12..f9f4cd1 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,7 @@
 #endif
 
 #define EXTMEM_HEAP_NAME "extmem"
+#define EXTBUF_ZONE_SIZE RTE_PGSIZE_2M
 
 uint16_t verbose_level = 0; /**< Silent by default. */
 int testpmd_logtype; /**< Log type for testpmd logs */
@@ -868,6 +869,66 @@ struct extmem_param {
 	}
 }
 
+static unsigned int
+setup_extbuf(uint32_t nb_mbufs, uint16_t mbuf_sz, unsigned int socket_id,
+	    char *pool_name, struct rte_pktmbuf_extmem **ext_mem)
+{
+	struct rte_pktmbuf_extmem *xmem;
+	unsigned int ext_num, zone_num, elt_num;
+	uint16_t elt_size;
+
+	elt_size = RTE_ALIGN_CEIL(mbuf_sz, RTE_CACHE_LINE_SIZE);
+	elt_num = EXTBUF_ZONE_SIZE / elt_size;
+	zone_num = (nb_mbufs + elt_num - 1) / elt_num;
+
+	xmem = malloc(sizeof(struct rte_pktmbuf_extmem) * zone_num);
+	if (xmem == NULL) {
+		TESTPMD_LOG(ERR, "Cannot allocate memory for "
+				 "external buffer descriptors\n");
+		*ext_mem = NULL;
+		return 0;
+	}
+	for (ext_num = 0; ext_num < zone_num; ext_num++) {
+		struct rte_pktmbuf_extmem *xseg = xmem + ext_num;
+		const struct rte_memzone *mz;
+		char mz_name[RTE_MEMZONE_NAMESIZE];
+		int ret;
+
+		ret = snprintf(mz_name, sizeof(mz_name),
+			RTE_MEMPOOL_MZ_FORMAT "_xb_%u", pool_name, ext_num);
+		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+			errno = ENAMETOOLONG;
+			ext_num = 0;
+			break;
+		}
+		mz = rte_memzone_reserve_aligned(mz_name, EXTBUF_ZONE_SIZE,
+						 socket_id,
+						 RTE_MEMZONE_IOVA_CONTIG |
+						 RTE_MEMZONE_1GB |
+						 RTE_MEMZONE_SIZE_HINT_ONLY,
+						 EXTBUF_ZONE_SIZE);
+		if (mz == NULL) {
+			/*
+			 * The caller exits on external buffer creation
+			 * error, so there is no need to free memzones.
+			 */
+			errno = ENOMEM;
+			ext_num = 0;
+			break;
+		}
+		xseg->buf_ptr = mz->addr;
+		xseg->buf_iova = mz->iova;
+		xseg->buf_len = EXTBUF_ZONE_SIZE;
+		xseg->elt_size = elt_size;
+	}
+	if (ext_num == 0 && xmem != NULL) {
+		free(xmem);
+		xmem = NULL;
+	}
+	*ext_mem = xmem;
+	return ext_num;
+}
+
 /*
  * Configuration initialisation done once at init time.
  */
@@ -936,6 +997,26 @@ struct extmem_param {
 					heap_socket);
 			break;
 		}
+	case MP_ALLOC_XBUF:
+		{
+			struct rte_pktmbuf_extmem *ext_mem;
+			unsigned int ext_num;
+
+			ext_num = setup_extbuf(nb_mbuf,	mbuf_seg_size,
+					       socket_id, pool_name, &ext_mem);
+			if (ext_num == 0)
+				rte_exit(EXIT_FAILURE,
+					 "Can't create pinned data buffers\n");
+
+			TESTPMD_LOG(INFO, "preferred mempool ops selected: %s\n",
+					rte_mbuf_best_mempool_ops());
+			rte_mp = rte_pktmbuf_pool_create_extbuf
+					(pool_name, nb_mbuf, mb_mempool_cache,
+					 0, mbuf_seg_size, socket_id,
+					 ext_mem, ext_num);
+			free(ext_mem);
+			break;
+		}
 	default:
 		{
 			rte_exit(EXIT_FAILURE, "Invalid mempool creation mode\n");
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 7cf48d0..3dd5fc7 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -76,8 +76,10 @@ enum {
 	/**< allocate mempool natively, but populate using anonymous memory */
 	MP_ALLOC_XMEM,
 	/**< allocate and populate mempool using anonymous memory */
-	MP_ALLOC_XMEM_HUGE
+	MP_ALLOC_XMEM_HUGE,
 	/**< allocate and populate mempool using anonymous hugepage memory */
+	MP_ALLOC_XBUF
+	/**< allocate mempool natively, use rte_pktmbuf_pool_create_extbuf */
 };
 
 #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 3caf281..871cf6c 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -170,7 +170,8 @@
 
 	rte_pktmbuf_reset_headroom(pkt);
 	pkt->data_len = tx_pkt_seg_lengths[0];
-	pkt->ol_flags = ol_flags;
+	pkt->ol_flags &= EXT_ATTACHED_MBUF;
+	pkt->ol_flags |= ol_flags;
 	pkt->vlan_tci = vlan_tci;
 	pkt->vlan_tci_outer = vlan_tci_outer;
 	pkt->l2_len = sizeof(struct rte_ether_hdr);
-- 
1.8.3.1
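The zone sizing done by setup_extbuf() in the patch above can be worked through in isolation. The sketch below repeats the same three steps (round the element size up to a cache line, count elements per 2 MiB zone, round the zone count up). The names align_ceil() and zones_needed() and the 64-byte cache line are assumptions made for illustration; the patch itself uses RTE_ALIGN_CEIL and RTE_CACHE_LINE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

#define ZONE_SIZE (2u * 1024 * 1024) /* 2 MiB, like EXTBUF_ZONE_SIZE */
#define CACHE_LINE 64u               /* assumed cache line size */

/* Round v up to a multiple of align (align must be a power of two). */
static uint32_t
align_ceil(uint32_t v, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (v + align - 1) & ~(align - 1);
}

/* Number of 2 MiB zones needed for nb_mbufs buffers of mbuf_sz bytes. */
static unsigned int
zones_needed(uint32_t nb_mbufs, uint16_t mbuf_sz)
{
	uint32_t elt_size, elt_num;

	assert(mbuf_sz != 0);
	elt_size = align_ceil(mbuf_sz, CACHE_LINE);
	elt_num = ZONE_SIZE / elt_size; /* buffers that fit in one zone */
	return (nb_mbufs + elt_num - 1) / elt_num; /* round up */
}
```

With 2176-byte elements, for example, one zone holds 963 buffers, so 963 mbufs need one zone and 964 need two. The unused tail of each zone is the cost of never letting a buffer straddle two memzones.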
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v6 5/5] net/mlx5: allow use allocated mbuf with external buffer
  2020-01-20 19:16 ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
                     ` (3 preceding siblings ...)
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 4/5] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
@ 2020-01-20 19:16   ` Viacheslav Ovsiienko
  2020-01-20 22:55   ` [dpdk-dev] [PATCH v6 0/5] mbuf: detach mbuf with pinned " Thomas Monjalon
  5 siblings, 0 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-20 19:16 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
In the Rx datapath the flags in newly allocated mbufs
are all explicitly cleared, but the EXT_ATTACHED_MBUF flag must be
preserved. This allows the use of mbuf pools with pre-attached
external data buffers.
The vectorized rx_burst routines are updated to
inherit EXT_ATTACHED_MBUF from the mbuf pool private
RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF flag.
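
The "clear everything except the attachment bit" idiom that this series applies to ol_flags can be sketched on its own. The bit value and the helper name reset_ol_flags() are illustrative only and are not taken from the DPDK headers. The precondition that new flags never carry the attachment bit is likewise an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit position, not the real DPDK definition. */
#define DEMO_EXT_ATTACHED (1ULL << 61)

/*
 * Reset a flags word for a new packet: keep only the pool-owned
 * attachment bit, then merge in the per-packet flags.
 */
static uint64_t
reset_ol_flags(uint64_t cur, uint64_t new_flags)
{
	/* Assumption: the attachment bit is never set by the datapath. */
	assert((new_flags & DEMO_EXT_ATTACHED) == 0);
	cur &= DEMO_EXT_ATTACHED; /* drop every bit except the pinned one */
	cur |= new_flags;
	return cur;
}
```

A plain assignment (cur = new_flags) would silently turn a pinned mbuf into one that looks like it owns an inline buffer, which is the failure mode the commit message warns about.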
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5_rxq.c              |  7 ++++++-
 drivers/net/mlx5/mlx5_rxtx.c             |  2 +-
 drivers/net/mlx5/mlx5_rxtx.h             |  2 +-
 drivers/net/mlx5/mlx5_rxtx_vec.h         | 14 ++++----------
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |  5 ++---
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 29 +++++++++++++++--------------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |  2 +-
 7 files changed, 30 insertions(+), 31 deletions(-)
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index c936a7f..4092cb7 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -225,6 +225,9 @@
 	if (mlx5_rxq_check_vec_support(&rxq_ctrl->rxq) > 0) {
 		struct mlx5_rxq_data *rxq = &rxq_ctrl->rxq;
 		struct rte_mbuf *mbuf_init = &rxq->fake_mbuf;
+		struct rte_pktmbuf_pool_private *priv =
+			(struct rte_pktmbuf_pool_private *)
+				rte_mempool_get_priv(rxq_ctrl->rxq.mp);
 		int j;
 
 		/* Initialize default rearm_data for vPMD. */
@@ -232,13 +235,15 @@
 		rte_mbuf_refcnt_set(mbuf_init, 1);
 		mbuf_init->nb_segs = 1;
 		mbuf_init->port = rxq->port_id;
+		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)
+			mbuf_init->ol_flags = EXT_ATTACHED_MBUF;
 		/*
 		 * prevent compiler reordering:
 		 * rearm_data covers previous fields.
 		 */
 		rte_compiler_barrier();
 		rxq->mbuf_initializer =
-			*(uint64_t *)&mbuf_init->rearm_data;
+			*(rte_xmm_t *)&mbuf_init->rearm_data;
 		/* Padding with a fake mbuf for vectorized Rx. */
 		for (j = 0; j < MLX5_VPMD_DESCS_PER_LOOP; ++j)
 			(*rxq->elts)[elts_n + j] = &rxq->fake_mbuf;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 67cafd1..5e31f01 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1337,7 +1337,7 @@ enum mlx5_txcmp_code {
 			}
 			pkt = seg;
 			assert(len >= (rxq->crc_present << 2));
-			pkt->ol_flags = 0;
+			pkt->ol_flags &= EXT_ATTACHED_MBUF;
 			/* If compressed, take hash result from mini-CQE. */
 			rss_hash_res = rte_be_to_cpu_32(mcqe == NULL ?
 							cqe->rx_hash_res :
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index b6a33c5..3f659d2 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -144,7 +144,7 @@ struct mlx5_rxq_data {
 	struct mlx5_mprq_buf *mprq_repl; /* Stashed mbuf for replenish. */
 	uint16_t idx; /* Queue index. */
 	struct mlx5_rxq_stats stats;
-	uint64_t mbuf_initializer; /* Default rearm_data for vectorized Rx. */
+	rte_xmm_t mbuf_initializer; /* Default rearm/flags for vectorized Rx. */
 	struct rte_mbuf fake_mbuf; /* elts padding for vectorized Rx. */
 	void *cq_uar; /* CQ user access region. */
 	uint32_t cqn; /* CQ number. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index 85e0bd5..d8c07f2 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -97,18 +97,12 @@
 		void *buf_addr;
 
 		/*
-		 * Load the virtual address for Rx WQE. non-x86 processors
-		 * (mostly RISC such as ARM and Power) are more vulnerable to
-		 * load stall. For x86, reducing the number of instructions
-		 * seems to matter most.
+		 * In order to support the mbufs with external attached
+		 * data buffer we should use the buf_addr pointer instead of
+		 * rte_mbuf_buf_addr(). It touches the mbuf itself and may
+		 * impact the performance.
 		 */
-#ifdef RTE_ARCH_X86_64
 		buf_addr = elts[i]->buf_addr;
-		assert(buf_addr == rte_mbuf_buf_addr(elts[i], rxq->mp));
-#else
-		buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
-		assert(buf_addr == elts[i]->buf_addr);
-#endif
 		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
 					      RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKey in WQE. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 8e79883..9e5c6ee 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -344,9 +344,8 @@
 		PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 		PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED};
 	const vector unsigned char mbuf_init =
-		(vector unsigned char)(vector unsigned long){
-		*(__attribute__((__aligned__(8))) unsigned long *)
-		&rxq->mbuf_initializer, 0LL};
+		(vector unsigned char)vec_vsx_ld
+			(0, (vector unsigned char *)&rxq->mbuf_initializer);
 	const vector unsigned short rearm_sel_mask =
 		(vector unsigned short){0, 0, 0, 0, 0xffff, 0xffff, 0, 0};
 	vector unsigned char rearm0, rearm1, rearm2, rearm3;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 86785c7..332e9ac 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -264,8 +264,8 @@
 	const uint32x4_t cv_mask =
 		vdupq_n_u32(PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			    PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
-	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
-	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
+	const uint64x2_t mbuf_init = vld1q_u64
+				((const uint64_t *)&rxq->mbuf_initializer);
 	uint64x2_t rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
@@ -326,18 +326,19 @@
 	/* Merge to ol_flags. */
 	ol_flags = vorrq_u32(ol_flags, cv_flags);
 	/* Merge mbuf_init and ol_flags, and store. */
-	rearm0 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_high_u64(vreinterpretq_u64_u32(
-						       ol_flags)), 32));
-	rearm1 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_high_u64(vreinterpretq_u64_u32(
-						     ol_flags)), r32_mask));
-	rearm2 = vcombine_u64(mbuf_init,
-			      vshr_n_u64(vget_low_u64(vreinterpretq_u64_u32(
-						      ol_flags)), 32));
-	rearm3 = vcombine_u64(mbuf_init,
-			      vand_u64(vget_low_u64(vreinterpretq_u64_u32(
-						    ol_flags)), r32_mask));
+	rearm0 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 3),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm1 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 2),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm2 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 1),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+	rearm3 = vreinterpretq_u64_u32(vsetq_lane_u32
+					(vgetq_lane_u32(ol_flags, 0),
+					 vreinterpretq_u32_u64(mbuf_init), 2));
+
 	vst1q_u64((void *)&pkts[0]->rearm_data, rearm0);
 	vst1q_u64((void *)&pkts[1]->rearm_data, rearm1);
 	vst1q_u64((void *)&pkts[2]->rearm_data, rearm2);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 35b7761..07d40d5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -259,7 +259,7 @@
 			      PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD |
 			      PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
 	const __m128i mbuf_init =
-		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
+		_mm_load_si128((__m128i *)&rxq->mbuf_initializer);
 	__m128i rearm0, rearm1, rearm2, rearm3;
 	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
-- 
1.8.3.1
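The patch above widens mbuf_initializer from uint64_t to rte_xmm_t so that a single 16-byte load and store covers both rearm_data and ol_flags. A scalar sketch of that idea follows, with memcpy standing in for the vector instructions. struct demo_mbuf is a hypothetical, simplified layout that assumes ol_flags immediately follows the 8-byte rearm area; the helper names and the flag bit are also illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_EXT_ATTACHED (1ULL << 61) /* illustrative bit, not DPDK's */

struct demo_mbuf {
	uint64_t buf_addr; /* stand-in for the buffer pointer */
	uint64_t buf_iova;
	/* The 8-byte rearm area starts here. */
	uint16_t data_off;
	uint16_t refcnt;
	uint16_t nb_segs;
	uint16_t port;
	uint64_t ol_flags; /* assumed to follow the rearm area directly */
};

#define DEMO_REARM_OFF offsetof(struct demo_mbuf, data_off)

/* 16-byte template: 8 bytes of rearm data plus 8 bytes of flags. */
struct demo_init {
	unsigned char bytes[16];
};

/* Capture the rearm area and flags of a template mbuf. */
static struct demo_init
demo_make_init(const struct demo_mbuf *tmpl)
{
	struct demo_init init;

	assert(offsetof(struct demo_mbuf, ol_flags) == DEMO_REARM_OFF + 8);
	memcpy(init.bytes, (const char *)tmpl + DEMO_REARM_OFF,
	       sizeof(init.bytes));
	return init;
}

/* Re-arm one mbuf: one 16-byte copy rewrites rearm data and flags. */
static void
demo_rearm(struct demo_mbuf *m, const struct demo_init *init)
{
	memcpy((char *)m + DEMO_REARM_OFF, init->bytes, sizeof(init->bytes));
}
```

With an 8-byte template only the rearm fields would be rewritten, and the flags word would have to be set by a separate store. Folding the attachment bit into the template lets the vector path keep its single store per packet.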
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
@ 2020-01-20 20:43     ` Stephen Hemminger
  2020-01-20 22:52       ` Thomas Monjalon
                         ` (2 more replies)
  0 siblings, 3 replies; 77+ messages in thread
From: Stephen Hemminger @ 2020-01-20 20:43 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, matan, rasland, orika, shahafs, olivier.matz, thomas
On Mon, 20 Jan 2020 17:23:19 +0000
Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> The routine rte_pktmbuf_priv_flags is introduced to fetch
> the flags from the mbuf memory pool private structure
> in unified fashion.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>  lib/librte_mbuf/rte_mbuf.h | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 2d4bda2..9b0691d 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -306,6 +306,23 @@ struct rte_pktmbuf_pool_private {
>  	uint32_t flags; /**< reserved for future use. */
>  };
>  
> +/**
> + * Return the flags from private data in an mempool structure.
> + *
> + * @param mp
> + *   A pointer to the mempool structure.
> + * @return
> + *   The flags from the private data structure.
> + */
> +static inline uint32_t
> +rte_pktmbuf_priv_flags(struct rte_mempool *mp)
> +{
> +	struct rte_pktmbuf_pool_private *mbp_priv;
> +
> +	mbp_priv = (struct rte_pktmbuf_pool_private *)rte_mempool_get_priv(mp);
> +	return mbp_priv->flags;
> +}
> +
>  #ifdef RTE_LIBRTE_MBUF_DEBUG
>  
>  /**  check mbuf type in debug mode */
Looks fine, but a couple of minor suggestions.
Since this doesn't modify the mbuf, the arguments should be const.
Since rte_mempool_get_priv returns void *, the cast is unnecessary.
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v6 3/5] mbuf: create packet pool with external memory buffers
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
@ 2020-01-20 20:48     ` Stephen Hemminger
  2020-01-21  7:04       ` Slava Ovsiienko
  0 siblings, 1 reply; 77+ messages in thread
From: Stephen Hemminger @ 2020-01-20 20:48 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, matan, rasland, orika, shahafs, olivier.matz, thomas
On Mon, 20 Jan 2020 19:16:24 +0000
Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> +		((user_mbp_priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
> +			sizeof(struct rte_mbuf_ext_shared_info) :
> +			user_mbp_priv->mbuf_data_room_size
Should rte_pktmbuf_data_room_size() be used, and have it return the
right value for external mbuf?
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-20 20:43     ` Stephen Hemminger
@ 2020-01-20 22:52       ` Thomas Monjalon
  2020-01-21  6:48       ` Slava Ovsiienko
  2020-01-21  8:00       ` Slava Ovsiienko
  2 siblings, 0 replies; 77+ messages in thread
From: Thomas Monjalon @ 2020-01-20 22:52 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, olivier.matz, Stephen Hemminger
  Cc: dev, matan, rasland, orika, shahafs
20/01/2020 21:43, Stephen Hemminger:
> On Mon, 20 Jan 2020 17:23:19 +0000
> Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> 
> > The routine rte_pktmbuf_priv_flags is introduced to fetch
> > the flags from the mbuf memory pool private structure
> > in unified fashion.
> > 
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > ---
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > +static inline uint32_t
> > +rte_pktmbuf_priv_flags(struct rte_mempool *mp)
> > +{
> > +	struct rte_pktmbuf_pool_private *mbp_priv;
> > +
> > +	mbp_priv = (struct rte_pktmbuf_pool_private *)rte_mempool_get_priv(mp);
> > +	return mbp_priv->flags;
> > +}
> 
> Looks fine, but a couple of minor suggestions.
> 
> 
> Since this doesn't modify the mbuf, the arguments should be const.
> Since rte_mempool_get_priv returns void *, the cast is unnecessary.
It makes sense.
Please let's make these minor changes in a separate patch
as I am closing -rc1 with the v6 of this series included.
Thanks
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v6 0/5] mbuf: detach mbuf with pinned external buffer
  2020-01-20 19:16 ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
                     ` (4 preceding siblings ...)
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 5/5] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
@ 2020-01-20 22:55   ` Thomas Monjalon
  5 siblings, 0 replies; 77+ messages in thread
From: Thomas Monjalon @ 2020-01-20 22:55 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, matan, rasland, orika, shahafs, olivier.matz, stephen
20/01/2020 20:16, Viacheslav Ovsiienko:
> Today's pktmbuf pool contains only mbufs with no external buffers.
> This means data buffer for the mbuf should be placed right after the
> mbuf structure (+ the private data when enabled).
> 
> On some cases, the application would want to have the buffers allocated
> from a different device in the platform. This is in order to do zero
> copy for the packet directly to the device memory. Examples for such
> devices can be GPU or storage device. For such cases the native pktmbuf
> pool does not fit since each mbuf would need to point to external
> buffer.
> 
> To support above, the pktmbuf pool will be populated with mbuf pointing
> to the device buffers using the mbuf external buffer feature.
> The PMD will populate its receive queues with those buffer, so that
> every packet received will be scattered directly to the device memory.
> on the other direction, embedding the buffer pointer to the transmit
> queues of the NIC, will make the DMA to fetch device memory
> using peer to peer communication.
> 
> Such mbuf with external buffer should be handled with care when mbuf is
> freed. Mainly The external buffer should not be detached, so that it can
> be reused for the next packet receive.
> 
> This patch introduce a new flag on the rte_pktmbuf_pool_private
> structure to specify this mempool is for mbuf with pinned external
> buffer. Upon detach this flag is validated and buffer is not detached.
> A new mempool create wrapper is also introduced to help application to
> create and populate such mempool.
> 
> The unit test (as part of test_mbuf application) will be provided
> as dedicated patch.
> 
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> 
> RFC: http://patches.dpdk.org/patch/63077
> v1: http://patches.dpdk.org/cover/64424
> v2: - fix rte_experimantal issue on comment addressing
>     - rte_mbuf_has_pinned_extbuf return type is uint32_t
>     - fix Power9 compilation issue
> v3: - fix "#include <stdbool.h> leftover
> v4: - https://patches.dpdk.org/cover/64809/
>     - introduce rte_pktmbuf_priv_flags
>     - support cloning pinned mbufs as for regular mbufs
>       with external buffers
>     - address the minor comments
> v5: - http://patches.dpdk.org/cover/64979/
>     - update rte_pktmbuf_prefree_seg
>     - rename __rte_pktmbuf_extbuf_detach
>     - __rte_pktmbuf_init_extmem is static
>     - const qualifier is specified for external memory
>       description parameter of rte_pktmbuf_pool_create_extbuf
>     - addressing minor comments
>     - fix typos
> v6: - new lines inserted
>     - struct rte_pktmbuf_extmem_init_ctx is local to rte_mbuf.c
> 
> Viacheslav Ovsiienko (5):
>   mbuf: introduce routine to get private mbuf pool flags
>   mbuf: detach mbuf with pinned external buffer
>   mbuf: create packet pool with external memory buffers
>   app/testpmd: add mempool with external data buffers
>   net/mlx5: allow use allocated mbuf with external buffer
Applied, thanks
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-20 20:43     ` Stephen Hemminger
  2020-01-20 22:52       ` Thomas Monjalon
@ 2020-01-21  6:48       ` Slava Ovsiienko
  2020-01-21  8:00       ` Slava Ovsiienko
  2 siblings, 0 replies; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-21  6:48 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler,
	olivier.matz, thomas
Hi, Stephen
Thank you for the review.
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Monday, January 20, 2020 22:44
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; olivier.matz@6wind.com; thomas@mellanox.net
> Subject: Re: [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool
> flags
> 
> On Mon, 20 Jan 2020 17:23:19 +0000
> Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> 
> > The routine rte_pktmbuf_priv_flags is introduced to fetch the flags
> > from the mbuf memory pool private structure in unified fashion.
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 2d4bda2..9b0691d 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -306,6 +306,23 @@ struct rte_pktmbuf_pool_private {
> >  	uint32_t flags; /**< reserved for future use. */  };
> >
> > +/**
> > + * Return the flags from private data in an mempool structure.
> > + *
> > + * @param mp
> > + *   A pointer to the mempool structure.
> > + * @return
> > + *   The flags from the private data structure.
> > + */
> > +static inline uint32_t
> > +rte_pktmbuf_priv_flags(struct rte_mempool *mp) {
> > +	struct rte_pktmbuf_pool_private *mbp_priv;
> > +
> > +	mbp_priv = (struct rte_pktmbuf_pool_private
> *)rte_mempool_get_priv(mp);
> > +	return mbp_priv->flags;
> > +}
> > +
> >  #ifdef RTE_LIBRTE_MBUF_DEBUG
> >
> >  /**  check mbuf type in debug mode */
> 
> Looks fine, but a couple of minor suggestions.
> 
> 
> Since this doesn't modify the mbuf, the arguments should be const.
> Since rte_mempool_get_priv returns void *, the cast is unnecessary.
Agree, will prepare the dedicated patch.
With best regards, Slava
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v6 3/5] mbuf: create packet pool with external memory buffers
  2020-01-20 20:48     ` Stephen Hemminger
@ 2020-01-21  7:04       ` Slava Ovsiienko
  0 siblings, 0 replies; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-21  7:04 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler,
	olivier.matz, thomas
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Monday, January 20, 2020 22:48
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; olivier.matz@6wind.com; thomas@mellanox.net
> Subject: Re: [PATCH v6 3/5] mbuf: create packet pool with external memory
> buffers
> 
> On Mon, 20 Jan 2020 19:16:24 +0000
> Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> 
> > +		((user_mbp_priv->flags &
> RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
> > +			sizeof(struct rte_mbuf_ext_shared_info) :
> > +			user_mbp_priv->mbuf_data_room_size
> 
> Should rte_pktmbuf_data_room_size() be used, and have it return the right
> value for external mbuf?
The pool initialization is not completed yet (the private structure of the mbuf pool is
not copied yet), so at this point rte_pktmbuf_data_room_size() is unable to fetch
a valid data room size. Besides this, rte_pktmbuf_data_room_size() returns the
valid value for mbufs with pinned external buffers after pool initialization is done.
With best regards, Slava
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-20 20:43     ` Stephen Hemminger
  2020-01-20 22:52       ` Thomas Monjalon
  2020-01-21  6:48       ` Slava Ovsiienko
@ 2020-01-21  8:00       ` Slava Ovsiienko
  2020-01-21  8:14         ` Olivier Matz
  2 siblings, 1 reply; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-21  8:00 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler,
	olivier.matz, thomas
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Monday, January 20, 2020 22:44
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; olivier.matz@6wind.com; thomas@mellanox.net
> Subject: Re: [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool
> flags
> 
> On Mon, 20 Jan 2020 17:23:19 +0000
> Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> 
> > The routine rte_pktmbuf_priv_flags is introduced to fetch the flags
> > from the mbuf memory pool private structure in unified fashion.
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 2d4bda2..9b0691d 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -306,6 +306,23 @@ struct rte_pktmbuf_pool_private {
> >  	uint32_t flags; /**< reserved for future use. */  };
> >
> > +/**
> > + * Return the flags from private data in an mempool structure.
> > + *
> > + * @param mp
> > + *   A pointer to the mempool structure.
> > + * @return
> > + *   The flags from the private data structure.
> > + */
> > +static inline uint32_t
> > +rte_pktmbuf_priv_flags(struct rte_mempool *mp) {
> > +	struct rte_pktmbuf_pool_private *mbp_priv;
> > +
> > +	mbp_priv = (struct rte_pktmbuf_pool_private
> *)rte_mempool_get_priv(mp);
> > +	return mbp_priv->flags;
> > +}
> > +
> >  #ifdef RTE_LIBRTE_MBUF_DEBUG
> >
> >  /**  check mbuf type in debug mode */
> 
> Looks fine, but a couple of minor suggestions.
> 
> 
> Since this doesn't modify the mbuf, the arguments should be const.
> Since rte_mempool_get_priv returns void *, the cast is unnecessary.
rte_mempool_get_priv() does not expect "const", so adding "const" is a bit problematic,
and we should not change the rte_mempool_get_priv() prototype.
Do you think we should take private structure pointer directly from the pool
structure instead of calling rte_mempool_get_priv() ?
With best regards, Slava
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-21  8:00       ` Slava Ovsiienko
@ 2020-01-21  8:14         ` Olivier Matz
  2020-01-21  8:23           ` Slava Ovsiienko
  0 siblings, 1 reply; 77+ messages in thread
From: Olivier Matz @ 2020-01-21  8:14 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: Stephen Hemminger, dev, Matan Azrad, Raslan Darawsheh, Ori Kam,
	Shahaf Shuler, thomas
On Tue, Jan 21, 2020 at 08:00:17AM +0000, Slava Ovsiienko wrote:
> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Monday, January 20, 2020 22:44
> > To: Slava Ovsiienko <viacheslavo@mellanox.com>
> > Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> > <shahafs@mellanox.com>; olivier.matz@6wind.com; thomas@mellanox.net
> > Subject: Re: [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool
> > flags
> > 
> > On Mon, 20 Jan 2020 17:23:19 +0000
> > Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> > 
> > > The routine rte_pktmbuf_priv_flags is introduced to fetch the flags
> > > from the mbuf memory pool private structure in unified fashion.
> > >
> > > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > > Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > > ---
> > >  lib/librte_mbuf/rte_mbuf.h | 17 +++++++++++++++++
> > >  1 file changed, 17 insertions(+)
> > >
> > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > index 2d4bda2..9b0691d 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -306,6 +306,23 @@ struct rte_pktmbuf_pool_private {
> > >  	uint32_t flags; /**< reserved for future use. */  };
> > >
> > > +/**
> > > + * Return the flags from private data in an mempool structure.
> > > + *
> > > + * @param mp
> > > + *   A pointer to the mempool structure.
> > > + * @return
> > > + *   The flags from the private data structure.
> > > + */
> > > +static inline uint32_t
> > > +rte_pktmbuf_priv_flags(struct rte_mempool *mp) {
> > > +	struct rte_pktmbuf_pool_private *mbp_priv;
> > > +
> > > +	mbp_priv = (struct rte_pktmbuf_pool_private
> > *)rte_mempool_get_priv(mp);
> > > +	return mbp_priv->flags;
> > > +}
> > > +
> > >  #ifdef RTE_LIBRTE_MBUF_DEBUG
> > >
> > >  /**  check mbuf type in debug mode */
> > 
> > Looks fine, but a couple of minor suggestions.
> > 
> > 
> > Since this doesn't modify the mbuf, the arguments should be const.
> > Since rte_mempool_get_priv returns void *, the cast is unnecessary.
> 
> rte_mempool_get_priv() does not expect "const", so adding "const" is a bit problematic,
> and we should not change the rte_mempool_get_priv() prototype.
> Do you think we should take private structure pointer directly from the pool
> structure instead of calling rte_mempool_get_priv() ?
I'm not sure it would work. The problem is that to get the priv,
we do pool_ptr + offset. So if we want to remove the const, we'll
have to do a cast to "unconst". Not sure it is worth doing it.
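A minimal sketch of the trade-off described above, assuming a simplified pool
layout. The mock_* names are hypothetical and are not the DPDK definitions; the
getter only mirrors the "pool pointer + offset" shape mentioned in this thread.
It shows that a const-qualified reader has to cast the qualifier away before it
can reuse a getter that takes a non-const pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical private area and pool layout. */
struct mock_priv {
	uint16_t data_room_size;
	uint16_t priv_size;
	uint32_t flags;
};

struct mock_pool {
	uint32_t size;         /* some pool header field */
	struct mock_priv priv; /* private area located after the header */
};

/* Non-const getter: pool pointer + offset of the private area. */
static inline void *
mock_get_priv(struct mock_pool *mp)
{
	assert(mp != NULL);
	return (char *)mp + offsetof(struct mock_pool, priv);
}

/* Non-const reader: no qualifier handling is needed. */
static inline uint32_t
mock_priv_flags(struct mock_pool *mp)
{
	struct mock_priv *p = (struct mock_priv *)mock_get_priv(mp);

	return p->flags;
}

/* const reader: must "unconst" the pointer to call the getter. */
static inline uint32_t
mock_priv_flags_const(const struct mock_pool *mp)
{
	const struct mock_priv *p = (const struct mock_priv *)
		mock_get_priv((struct mock_pool *)(uintptr_t)mp);

	return p->flags;
}
```

Both readers return the same value; the const variant only adds the extra cast,
which is the cost being weighed here.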
Thanks
Olivier
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-21  8:14         ` Olivier Matz
@ 2020-01-21  8:23           ` Slava Ovsiienko
  2020-01-21  9:13             ` Slava Ovsiienko
  0 siblings, 1 reply; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-21  8:23 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Stephen Hemminger, dev, Matan Azrad, Raslan Darawsheh, Ori Kam,
	Shahaf Shuler, thomas
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Tuesday, January 21, 2020 10:14
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: Stephen Hemminger <stephen@networkplumber.org>; dev@dpdk.org;
> Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; thomas@mellanox.net
> Subject: Re: [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool
> flags
> 
> On Tue, Jan 21, 2020 at 08:00:17AM +0000, Slava Ovsiienko wrote:
> > > -----Original Message-----
> > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > Sent: Monday, January 20, 2020 22:44
> > > To: Slava Ovsiienko <viacheslavo@mellanox.com>
> > > Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> Darawsheh
> > > <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> > > <shahafs@mellanox.com>; olivier.matz@6wind.com;
> thomas@mellanox.net
> > > Subject: Re: [PATCH v5 1/5] mbuf: introduce routine to get private
> > > mbuf pool flags
> > >
> > > On Mon, 20 Jan 2020 17:23:19 +0000
> > > Viacheslav Ovsiienko <viacheslavo@mellanox.com> wrote:
> > >
> > > > The routine rte_pktmbuf_priv_flags is introduced to fetch the
> > > > flags from the mbuf memory pool private structure in unified fashion.
> > > >
> > > > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > > > Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > > > ---
> > > >  lib/librte_mbuf/rte_mbuf.h | 17 +++++++++++++++++
> > > >  1 file changed, 17 insertions(+)
> > > >
> > > > diff --git a/lib/librte_mbuf/rte_mbuf.h
> > > > b/lib/librte_mbuf/rte_mbuf.h index 2d4bda2..9b0691d 100644
> > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > @@ -306,6 +306,23 @@ struct rte_pktmbuf_pool_private {
> > > >  	uint32_t flags; /**< reserved for future use. */  };
> > > >
> > > > +/**
> > > > + * Return the flags from private data in an mempool structure.
> > > > + *
> > > > + * @param mp
> > > > + *   A pointer to the mempool structure.
> > > > + * @return
> > > > + *   The flags from the private data structure.
> > > > + */
> > > > +static inline uint32_t
> > > > +rte_pktmbuf_priv_flags(struct rte_mempool *mp) {
> > > > +	struct rte_pktmbuf_pool_private *mbp_priv;
> > > > +
> > > > +	mbp_priv = (struct rte_pktmbuf_pool_private
> > > *)rte_mempool_get_priv(mp);
> > > > +	return mbp_priv->flags;
> > > > +}
> > > > +
> > > >  #ifdef RTE_LIBRTE_MBUF_DEBUG
> > > >
> > > >  /**  check mbuf type in debug mode */
> > >
> > > Looks fine, but a couple of minor suggestions.
> > >
> > >
> > > Since this doesn't modify the mbuf, the arguments should be const.
> > > Since rte_mempool_get_priv returns void *, the cast is unnecessary.
> >
> > rte_mempool_get_priv() does not expect "const", so adding "const" is a
> > bit problematic, and we should not change the rte_mempool_get_priv()
> prototype.
> > Do you think we should take private structure pointer directly from
> > the pool structure instead of calling rte_mempool_get_priv() ?
> 
> I'm not sure it would work. The problem is that to get the priv, we do
> pool_ptr + offset. So if we want to remove the const, we'll have to do a cast to
> "unconst". Not sure it is worth doing it.
> 
> Thanks
> Olivier
OK, I'll just remove the unnecessary (struct rte_pktmbuf_pool_private *) cast and
will not introduce the const qualifier for the parameter.
With best regards, Slava
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-21  8:23           ` Slava Ovsiienko
@ 2020-01-21  9:13             ` Slava Ovsiienko
  2020-01-21 14:01               ` Olivier Matz
  0 siblings, 1 reply; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-21  9:13 UTC (permalink / raw)
  To: Slava Ovsiienko, Olivier Matz
  Cc: Stephen Hemminger, dev, Matan Azrad, Raslan Darawsheh, Ori Kam,
	Shahaf Shuler, thomas
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Slava Ovsiienko
> Sent: Tuesday, January 21, 2020 10:24
> To: Olivier Matz <olivier.matz@6wind.com>
> Cc: Stephen Hemminger <stephen@networkplumber.org>; dev@dpdk.org;
> Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; thomas@mellanox.net
> Subject: Re: [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private
> mbuf pool flags
> 
> > -----Original Message-----
> > From: Olivier Matz <olivier.matz@6wind.com>
> > Sent: Tuesday, January 21, 2020 10:14
> > To: Slava Ovsiienko <viacheslavo@mellanox.com>
> > Cc: Stephen Hemminger <stephen@networkplumber.org>; dev@dpdk.org;
> > Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> > <shahafs@mellanox.com>; thomas@mellanox.net
> > Subject: Re: [PATCH v5 1/5] mbuf: introduce routine to get private
> > mbuf pool flags
> >
> > On Tue, Jan 21, 2020 at 08:00:17AM +0000, Slava Ovsiienko wrote:
> > > > -----Original Message-----
> > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > Sent: Monday, January 20, 2020 22:44
> > > > To: Slava Ovsiienko <viacheslavo@mellanox.com>
> > > > Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> > Darawsheh
> > > > <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf
> > > > Shuler <shahafs@mellanox.com>; olivier.matz@6wind.com;
> > thomas@mellanox.net
> > > > Subject: Re: [PATCH v5 1/5] mbuf: introduce routine to get private
> > > > mbuf pool flags
> > > >
> > > > On Mon, 20 Jan 2020 17:23:19 +0000 Viacheslav Ovsiienko
> > > > <viacheslavo@mellanox.com> wrote:
> > > >
> > > > > The routine rte_pktmbuf_priv_flags is introduced to fetch the
> > > > > flags from the mbuf memory pool private structure in unified fashion.
> > > > >
> > > > > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > > > > Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > > > > ---
> > > > >  lib/librte_mbuf/rte_mbuf.h | 17 +++++++++++++++++
> > > > >  1 file changed, 17 insertions(+)
> > > > >
> > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h
> > > > > b/lib/librte_mbuf/rte_mbuf.h index 2d4bda2..9b0691d 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > @@ -306,6 +306,23 @@ struct rte_pktmbuf_pool_private {
> > > > >  	uint32_t flags; /**< reserved for future use. */  };
> > > > >
> > > > > +/**
> > > > > + * Return the flags from private data in an mempool structure.
> > > > > + *
> > > > > + * @param mp
> > > > > + *   A pointer to the mempool structure.
> > > > > + * @return
> > > > > + *   The flags from the private data structure.
> > > > > + */
> > > > > +static inline uint32_t
> > > > > +rte_pktmbuf_priv_flags(struct rte_mempool *mp) {
> > > > > +	struct rte_pktmbuf_pool_private *mbp_priv;
> > > > > +
> > > > > +	mbp_priv = (struct rte_pktmbuf_pool_private
> > > > *)rte_mempool_get_priv(mp);
> > > > > +	return mbp_priv->flags;
> > > > > +}
> > > > > +
> > > > >  #ifdef RTE_LIBRTE_MBUF_DEBUG
> > > > >
> > > > >  /**  check mbuf type in debug mode */
> > > >
> > > > Looks fine, but a couple of minor suggestions.
> > > >
> > > >
> > > > Since this doesn't modify the mbuf, the arguments should be const.
> > > > Since rte_mempool_get_priv returns void *, the cast is unnecessary.
> > >
> > > rte_mempool_get_priv() does not expect "const", so adding "const" is
> > > a bit problematic, and we should not change the
> > > rte_mempool_get_priv()
> > prototype.
> > > Do you think we should take private structure pointer directly from
> > > the pool structure instead of calling rte_mempool_get_priv() ?
> >
> > I'm not sure it would work. The problem is that to get the priv, we do
> > pool_ptr + offset. So if we want to remove the const, we'll have to do
> > a cast to "unconst". Not sure it is worth doing it.
> >
> > Thanks
> > Olivier
> OK, I'll just remove not necessary (struct rte_pktmbuf_pool_private*) cast and
> will not introduce const qualifier for the parameter.
> 
> With best regards, Slava
I've checked the rte_mempool_get_priv() usage: in all header files its return
value is cast. "rte_mbuf.h" contains rte_pktmbuf_priv_size() and
rte_pktmbuf_data_room_size(), and both apply the cast. What is the reason?
C++ compatibility? Should we remove the cast in rte_pktmbuf_priv_flags()?
With best regards,
Slava
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-21  9:13             ` Slava Ovsiienko
@ 2020-01-21 14:01               ` Olivier Matz
  2020-01-21 16:21                 ` Stephen Hemminger
  0 siblings, 1 reply; 77+ messages in thread
From: Olivier Matz @ 2020-01-21 14:01 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: Stephen Hemminger, dev, Matan Azrad, Raslan Darawsheh, Ori Kam,
	Shahaf Shuler, thomas
On Tue, Jan 21, 2020 at 09:13:48AM +0000, Slava Ovsiienko wrote:
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Slava Ovsiienko
> > Sent: Tuesday, January 21, 2020 10:24
> > To: Olivier Matz <olivier.matz@6wind.com>
> > Cc: Stephen Hemminger <stephen@networkplumber.org>; dev@dpdk.org;
> > Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> > <shahafs@mellanox.com>; thomas@mellanox.net
> > Subject: Re: [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private
> > mbuf pool flags
> > 
> > > -----Original Message-----
> > > From: Olivier Matz <olivier.matz@6wind.com>
> > > Sent: Tuesday, January 21, 2020 10:14
> > > To: Slava Ovsiienko <viacheslavo@mellanox.com>
> > > Cc: Stephen Hemminger <stephen@networkplumber.org>; dev@dpdk.org;
> > > Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > > <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> > > <shahafs@mellanox.com>; thomas@mellanox.net
> > > Subject: Re: [PATCH v5 1/5] mbuf: introduce routine to get private
> > > mbuf pool flags
> > >
> > > On Tue, Jan 21, 2020 at 08:00:17AM +0000, Slava Ovsiienko wrote:
> > > > > -----Original Message-----
> > > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > > Sent: Monday, January 20, 2020 22:44
> > > > > To: Slava Ovsiienko <viacheslavo@mellanox.com>
> > > > > Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> > > Darawsheh
> > > > > <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf
> > > > > Shuler <shahafs@mellanox.com>; olivier.matz@6wind.com;
> > > thomas@mellanox.net
> > > > > Subject: Re: [PATCH v5 1/5] mbuf: introduce routine to get private
> > > > > mbuf pool flags
> > > > >
> > > > > On Mon, 20 Jan 2020 17:23:19 +0000 Viacheslav Ovsiienko
> > > > > <viacheslavo@mellanox.com> wrote:
> > > > >
> > > > > > The routine rte_pktmbuf_priv_flags is introduced to fetch the
> > > > > > flags from the mbuf memory pool private structure in unified fashion.
> > > > > >
> > > > > > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > > > > > Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > > > > > ---
> > > > > >  lib/librte_mbuf/rte_mbuf.h | 17 +++++++++++++++++
> > > > > >  1 file changed, 17 insertions(+)
> > > > > >
> > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h
> > > > > > b/lib/librte_mbuf/rte_mbuf.h index 2d4bda2..9b0691d 100644
> > > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > @@ -306,6 +306,23 @@ struct rte_pktmbuf_pool_private {
> > > > > >  	uint32_t flags; /**< reserved for future use. */  };
> > > > > >
> > > > > > +/**
> > > > > > + * Return the flags from private data in an mempool structure.
> > > > > > + *
> > > > > > + * @param mp
> > > > > > + *   A pointer to the mempool structure.
> > > > > > + * @return
> > > > > > + *   The flags from the private data structure.
> > > > > > + */
> > > > > > +static inline uint32_t
> > > > > > +rte_pktmbuf_priv_flags(struct rte_mempool *mp) {
> > > > > > +	struct rte_pktmbuf_pool_private *mbp_priv;
> > > > > > +
> > > > > > +	mbp_priv = (struct rte_pktmbuf_pool_private
> > > > > *)rte_mempool_get_priv(mp);
> > > > > > +	return mbp_priv->flags;
> > > > > > +}
> > > > > > +
> > > > > >  #ifdef RTE_LIBRTE_MBUF_DEBUG
> > > > > >
> > > > > >  /**  check mbuf type in debug mode */
> > > > >
> > > > > Looks fine, but a couple of minor suggestions.
> > > > >
> > > > >
> > > > > Since this doesn't modify the mbuf, the arguments should be const.
> > > > > Since rte_mempool_get_priv returns void *, the cast is unnecessary.
> > > >
> > > > rte_mempool_get_priv() does not expect "const", so adding "const" is
> > > > a bit problematic, and we should not change the
> > > > rte_mempool_get_priv()
> > > prototype.
> > > > Do you think we should take private structure pointer directly from
> > > > the pool structure instead of calling rte_mempool_get_priv() ?
> > >
> > > I'm not sure it would work. The problem is that to get the priv, we do
> > > pool_ptr + offset. So if we want to remove the const, we'll have to do
> > > a cast to "unconst". Not sure it is worth doing it.
> > >
> > > Thanks
> > > Olivier
> > OK, I'll just remove not necessary (struct rte_pktmbuf_pool_private*) cast and
> > will not introduce const qualifier for the parameter.
> > 
> > With best regards, Slava
> I've checked the rte_mempool_get_priv() usage - in all header files there
> are the type casts. The "rte_mbuf.h" contains the rte_pktmbuf_priv_size() and
> rte_pktmbuf_data_room_size(), both provide the cast.  What is the reason?
> C++ compatibility? Should we remove the cast in rte_pktmbuf_priv_flags()?
Removing the cast will certainly break C++ code using this header.
There is a similar case in commit a2ff2827dc84 ("mbuf: fix C++ build on
void pointer cast")
In my opinion it can stay as it is now.
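The C++ concern can be illustrated with a short sketch. The demo_* names are
hypothetical and are not taken from any DPDK header. The code is written so that
it compiles both as C and as C++: C converts void * to another object pointer
type implicitly, while C++ requires the explicit cast, so an inline function in
a header shared with C++ applications has to keep it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical private area and pool. */
struct demo_priv {
	uint32_t flags;
};

struct demo_pool {
	struct demo_priv priv;
};

/* Getter returning void *, like the generic getter discussed above. */
static inline void *
demo_get_priv(struct demo_pool *mp)
{
	assert(mp != NULL);
	return &mp->priv;
}

static inline uint32_t
demo_priv_flags(struct demo_pool *mp)
{
	struct demo_priv *p;

	/*
	 * "p = demo_get_priv(mp);" is valid C, but a C++ compiler rejects
	 * it because there is no implicit conversion from void *.
	 * The explicit cast keeps the header usable from both languages.
	 */
	p = (struct demo_priv *)demo_get_priv(mp);
	return p->flags;
}
```

The cast is redundant for a C-only build, so keeping it costs nothing there.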
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags
  2020-01-21 14:01               ` Olivier Matz
@ 2020-01-21 16:21                 ` Stephen Hemminger
  0 siblings, 0 replies; 77+ messages in thread
From: Stephen Hemminger @ 2020-01-21 16:21 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Slava Ovsiienko, dev, Matan Azrad, Raslan Darawsheh, Ori Kam,
	Shahaf Shuler, thomas
On Tue, 21 Jan 2020 15:01:57 +0100
Olivier Matz <olivier.matz@6wind.com> wrote:
> > I've checked the rte_mempool_get_priv() usage - in all header files there
> > are the type casts. The "rte_mbuf.h" contains the rte_pktmbuf_priv_size() and
> > rte_pktmbuf_data_room_size(), both provide the cast.  What is the reason?
> > C++ compatibility? Should we remove the cast in rte_pktmbuf_priv_flags()?  
> 
> Removing the cast will certainly break C++ code using this header.
> There is a similar case in commit a2ff2827dc84 ("mbuf: fix C++ build on
> void pointer cast")
> 
> In my opinion it can stay as it is now.
Agreed. Thanks for trying.
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH] mbuf: fix pinned memory free routine style issue
  2019-11-18  9:50 [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers Shahaf Shuler
                   ` (6 preceding siblings ...)
  2020-01-20 19:16 ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
@ 2020-01-22  8:50 ` Viacheslav Ovsiienko
  2020-02-06  9:46   ` Olivier Matz
  2020-01-24 20:25 ` [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external buffer Viacheslav Ovsiienko
  8 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-22  8:50 UTC (permalink / raw)
  To: dev; +Cc: olivier.matz, stephen, thomas
Minor style issue is fixed.
Fixes: 6c8e50c2e549 ("mbuf: create pool with external memory buffers")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 lib/librte_mbuf/rte_mbuf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 25eea1d..e5e6739 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -118,7 +118,8 @@
  * indirect buffer) mbufs on detaching from the mbuf with pinned external
  * buffer.
  */
-static void rte_pktmbuf_free_pinned_extmem(void *addr, void *opaque)
+static void
+rte_pktmbuf_free_pinned_extmem(void *addr, void *opaque)
 {
 	struct rte_mbuf *m = opaque;
 
-- 
1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external buffer
  2019-11-18  9:50 [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers Shahaf Shuler
                   ` (7 preceding siblings ...)
  2020-01-22  8:50 ` [dpdk-dev] [PATCH] mbuf: fix pinned memory free routine style issue Viacheslav Ovsiienko
@ 2020-01-24 20:25 ` Viacheslav Ovsiienko
  2020-01-26 10:53   ` Slava Ovsiienko
                     ` (2 more replies)
  8 siblings, 3 replies; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-01-24 20:25 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, orika, shahafs, olivier.matz, stephen, thomas
This patch adds unit test for the mbufs allocated from
the special pool with pinned external data buffers.
The pinned buffer mbufs are tested in the same way as
regular ones with taking into account some specifics
of cloning/attaching.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 app/test/test_mbuf.c | 165 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 150 insertions(+), 15 deletions(-)
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 61ecffc..ee2f2f0 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -310,8 +310,17 @@
 	return -1;
 }
 
+static uint16_t
+testclone_refcnt_read(struct rte_mbuf *m)
+{
+	return RTE_MBUF_HAS_PINNED_EXTBUF(m) ?
+	       rte_mbuf_ext_refcnt_read(m->shinfo) :
+	       rte_mbuf_refcnt_read(m);
+}
+
 static int
-testclone_testupdate_testdetach(struct rte_mempool *pktmbuf_pool)
+testclone_testupdate_testdetach(struct rte_mempool *pktmbuf_pool,
+				struct rte_mempool *clone_pool)
 {
 	struct rte_mbuf *m = NULL;
 	struct rte_mbuf *clone = NULL;
@@ -331,7 +340,7 @@
 	*data = MAGIC_DATA;
 
 	/* clone the allocated mbuf */
-	clone = rte_pktmbuf_clone(m, pktmbuf_pool);
+	clone = rte_pktmbuf_clone(m, clone_pool);
 	if (clone == NULL)
 		GOTO_FAIL("cannot clone data\n");
 
@@ -339,7 +348,7 @@
 	if (*data != MAGIC_DATA)
 		GOTO_FAIL("invalid data in clone\n");
 
-	if (rte_mbuf_refcnt_read(m) != 2)
+	if (testclone_refcnt_read(m) != 2)
 		GOTO_FAIL("invalid refcnt in m\n");
 
 	/* free the clone */
@@ -358,7 +367,7 @@
 	data = rte_pktmbuf_mtod(m->next, unaligned_uint32_t *);
 	*data = MAGIC_DATA;
 
-	clone = rte_pktmbuf_clone(m, pktmbuf_pool);
+	clone = rte_pktmbuf_clone(m, clone_pool);
 	if (clone == NULL)
 		GOTO_FAIL("cannot clone data\n");
 
@@ -370,15 +379,15 @@
 	if (*data != MAGIC_DATA)
 		GOTO_FAIL("invalid data in clone->next\n");
 
-	if (rte_mbuf_refcnt_read(m) != 2)
+	if (testclone_refcnt_read(m) != 2)
 		GOTO_FAIL("invalid refcnt in m\n");
 
-	if (rte_mbuf_refcnt_read(m->next) != 2)
+	if (testclone_refcnt_read(m->next) != 2)
 		GOTO_FAIL("invalid refcnt in m->next\n");
 
 	/* try to clone the clone */
 
-	clone2 = rte_pktmbuf_clone(clone, pktmbuf_pool);
+	clone2 = rte_pktmbuf_clone(clone, clone_pool);
 	if (clone2 == NULL)
 		GOTO_FAIL("cannot clone the clone\n");
 
@@ -390,10 +399,10 @@
 	if (*data != MAGIC_DATA)
 		GOTO_FAIL("invalid data in clone2->next\n");
 
-	if (rte_mbuf_refcnt_read(m) != 3)
+	if (testclone_refcnt_read(m) != 3)
 		GOTO_FAIL("invalid refcnt in m\n");
 
-	if (rte_mbuf_refcnt_read(m->next) != 3)
+	if (testclone_refcnt_read(m->next) != 3)
 		GOTO_FAIL("invalid refcnt in m->next\n");
 
 	/* free mbuf */
@@ -418,7 +427,8 @@
 }
 
 static int
-test_pktmbuf_copy(struct rte_mempool *pktmbuf_pool)
+test_pktmbuf_copy(struct rte_mempool *pktmbuf_pool,
+		  struct rte_mempool *clone_pool)
 {
 	struct rte_mbuf *m = NULL;
 	struct rte_mbuf *copy = NULL;
@@ -458,11 +468,14 @@
 	copy = NULL;
 
 	/* same test with a cloned mbuf */
-	clone = rte_pktmbuf_clone(m, pktmbuf_pool);
+	clone = rte_pktmbuf_clone(m, clone_pool);
 	if (clone == NULL)
 		GOTO_FAIL("cannot clone data\n");
 
-	if (!RTE_MBUF_CLONED(clone))
+	if ((!RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
+	     !RTE_MBUF_CLONED(clone)) ||
+	    (RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
+	     !RTE_MBUF_HAS_EXTBUF(clone)))
 		GOTO_FAIL("clone did not give a cloned mbuf\n");
 
 	copy = rte_pktmbuf_copy(clone, pktmbuf_pool, 0, UINT32_MAX);
@@ -1199,10 +1212,11 @@
 	buf = rte_pktmbuf_alloc(pktmbuf_pool);
 	if (buf == NULL)
 		return -1;
+	/*
 	printf("Checking good mbuf initially\n");
 	if (verify_mbuf_check_panics(buf) != -1)
 		return -1;
-
+	*/
 	printf("Now checking for error conditions\n");
 
 	if (verify_mbuf_check_panics(NULL)) {
@@ -2411,6 +2425,120 @@ struct test_case {
 	return -1;
 }
 
+/*
+ * Test the mbuf pool with pinned external data buffers
+ *  - Allocate memory zone for external buffer
+ *  - Create the mbuf pool with pinned external buffer
+ *  - Check the created pool with relevant mbuf pool unit tests
+ */
+static int
+test_pktmbuf_ext_pinned_buffer(struct rte_mempool *std_pool)
+{
+
+	struct rte_pktmbuf_extmem ext_mem;
+	struct rte_mempool *pinned_pool = NULL;
+	const struct rte_memzone *mz = NULL;
+
+	printf("Test mbuf pool with external pinned data buffers\n");
+
+	/* Allocate memzone for the external data buffer */
+	mz = rte_memzone_reserve("pinned_pool",
+				 NB_MBUF * MBUF_DATA_SIZE,
+				 SOCKET_ID_ANY,
+				 RTE_MEMZONE_2MB | RTE_MEMZONE_SIZE_HINT_ONLY);
+	if (mz == NULL)
+		GOTO_FAIL("%s: Memzone allocation failed\n", __func__);
+
+	/* Create the mbuf pool with pinned external data buffer */
+	ext_mem.buf_ptr = mz->addr;
+	ext_mem.buf_iova = mz->iova;
+	ext_mem.buf_len = mz->len;
+	ext_mem.elt_size = MBUF_DATA_SIZE;
+
+	pinned_pool = rte_pktmbuf_pool_create_extbuf("test_pinned_pool",
+				NB_MBUF, MEMPOOL_CACHE_SIZE, 0,
+				MBUF_DATA_SIZE,	SOCKET_ID_ANY,
+				&ext_mem, 1);
+	if (pinned_pool == NULL)
+		GOTO_FAIL("%s: Mbuf pool with pinned external"
+			  " buffer creation failed\n", __func__);
+	/* test multiple mbuf alloc */
+	if (test_pktmbuf_pool(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_mbuf_pool(pinned) failed\n",
+			  __func__);
+
+	/* do it another time to check that all mbufs were freed */
+	if (test_pktmbuf_pool(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_mbuf_pool(pinned) failed (2)\n",
+			  __func__);
+
+	/* test that the data pointer on a packet mbuf is set properly */
+	if (test_pktmbuf_pool_ptr(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_pktmbuf_pool_ptr(pinned) failed\n",
+			  __func__);
+
+	/* test data manipulation in mbuf with non-ascii data */
+	if (test_pktmbuf_with_non_ascii_data(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_pktmbuf_with_non_ascii_data(pinned)"
+			  " failed\n", __func__);
+
+	/* test free pktmbuf segment one by one */
+	if (test_pktmbuf_free_segment(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_pktmbuf_free_segment(pinned) failed\n",
+			  __func__);
+
+	if (testclone_testupdate_testdetach(pinned_pool, std_pool) < 0)
+		GOTO_FAIL("%s: testclone_and_testupdate(pinned) failed\n",
+			  __func__);
+
+	if (test_pktmbuf_copy(pinned_pool, std_pool) < 0)
+		GOTO_FAIL("%s: test_pktmbuf_copy(pinned) failed\n",
+			  __func__);
+
+	if (test_failing_mbuf_sanity_check(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_failing_mbuf_sanity_check(pinned)"
+			  " failed\n", __func__);
+
+	if (test_mbuf_linearize_check(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_mbuf_linearize_check(pinned) failed\n",
+			  __func__);
+
+	/* test for allocating a bulk of mbufs with various sizes */
+	if (test_pktmbuf_alloc_bulk(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_rte_pktmbuf_alloc_bulk(pinned) failed\n",
+			  __func__);
+
+	/* test for allocating a bulk of mbufs with various sizes */
+	if (test_neg_pktmbuf_alloc_bulk(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_neg_rte_pktmbuf_alloc_bulk(pinned)"
+			  " failed\n", __func__);
+
+	/* test to read mbuf packet */
+	if (test_pktmbuf_read(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_rte_pktmbuf_read(pinned) failed\n",
+			  __func__);
+
+	/* test to read mbuf packet from offset */
+	if (test_pktmbuf_read_from_offset(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_rte_pktmbuf_read_from_offset(pinned)"
+			  " failed\n", __func__);
+
+	/* test to read data from chain of mbufs with data segments */
+	if (test_pktmbuf_read_from_chain(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_rte_pktmbuf_read_from_chain(pinned)"
+			  " failed\n", __func__);
+
+	RTE_SET_USED(std_pool);
+	rte_mempool_free(pinned_pool);
+	rte_memzone_free(mz);
+	return 0;
+
+fail:
+	rte_mempool_free(pinned_pool);
+	rte_memzone_free(mz);
+	return -1;
+}
+
 static int
 test_mbuf_dyn(struct rte_mempool *pktmbuf_pool)
 {
@@ -2635,12 +2763,12 @@ struct test_case {
 		goto err;
 	}
 
-	if (testclone_testupdate_testdetach(pktmbuf_pool) < 0) {
+	if (testclone_testupdate_testdetach(pktmbuf_pool, pktmbuf_pool) < 0) {
 		printf("testclone_and_testupdate() failed \n");
 		goto err;
 	}
 
-	if (test_pktmbuf_copy(pktmbuf_pool) < 0) {
+	if (test_pktmbuf_copy(pktmbuf_pool, pktmbuf_pool) < 0) {
 		printf("test_pktmbuf_copy() failed\n");
 		goto err;
 	}
@@ -2731,6 +2859,13 @@ struct test_case {
 		goto err;
 	}
 
+	/* test the mbuf pool with pinned external data buffers */
+	if (test_pktmbuf_ext_pinned_buffer(pktmbuf_pool) < 0) {
+		printf("test_pktmbuf_ext_pinned_buffer() failed\n");
+		goto err;
+	}
+
+
 	ret = 0;
 err:
 	rte_mempool_free(pktmbuf_pool);
-- 
1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external buffer
  2020-01-24 20:25 ` [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external buffer Viacheslav Ovsiienko
@ 2020-01-26 10:53   ` Slava Ovsiienko
  2020-02-06  8:17   ` Olivier Matz
  2020-02-06  9:49   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
  2 siblings, 0 replies; 77+ messages in thread
From: Slava Ovsiienko @ 2020-01-26 10:53 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler,
	olivier.matz, stephen, thomas
The verify_mbuf_check_panics() check is left commented out only to test the patch
(verify_mbuf_check_panics() fails on my setup); the commenting-out will be removed in v2.
With best regards, Slava
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Friday, January 24, 2020 22:25
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; olivier.matz@6wind.com;
> stephen@networkplumber.org; thomas@mellanox.net
> Subject: [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external
> buffer
> 
> This patch adds unit test for the mbufs allocated from the special pool with
> pinned external data buffers.
> 
> The pinned buffer mbufs are tested in the same way as regular ones with
> taking into account some specifics of cloning/attaching.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
>  app/test/test_mbuf.c | 165 ++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 150 insertions(+), 15 deletions(-)
> 
> diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
> index 61ecffc..ee2f2f0 100644
> --- a/app/test/test_mbuf.c
> +++ b/app/test/test_mbuf.c
> @@ -310,8 +310,17 @@
>  	return -1;
>  }
> 
> +static uint16_t
> +testclone_refcnt_read(struct rte_mbuf *m) {
> +	return RTE_MBUF_HAS_PINNED_EXTBUF(m) ?
> +	       rte_mbuf_ext_refcnt_read(m->shinfo) :
> +	       rte_mbuf_refcnt_read(m);
> +}
> +
>  static int
> -testclone_testupdate_testdetach(struct rte_mempool *pktmbuf_pool)
> +testclone_testupdate_testdetach(struct rte_mempool *pktmbuf_pool,
> +				struct rte_mempool *clone_pool)
>  {
>  	struct rte_mbuf *m = NULL;
>  	struct rte_mbuf *clone = NULL;
> @@ -331,7 +340,7 @@
>  	*data = MAGIC_DATA;
> 
>  	/* clone the allocated mbuf */
> -	clone = rte_pktmbuf_clone(m, pktmbuf_pool);
> +	clone = rte_pktmbuf_clone(m, clone_pool);
>  	if (clone == NULL)
>  		GOTO_FAIL("cannot clone data\n");
> 
> @@ -339,7 +348,7 @@
>  	if (*data != MAGIC_DATA)
>  		GOTO_FAIL("invalid data in clone\n");
> 
> -	if (rte_mbuf_refcnt_read(m) != 2)
> +	if (testclone_refcnt_read(m) != 2)
>  		GOTO_FAIL("invalid refcnt in m\n");
> 
>  	/* free the clone */
> @@ -358,7 +367,7 @@
>  	data = rte_pktmbuf_mtod(m->next, unaligned_uint32_t *);
>  	*data = MAGIC_DATA;
> 
> -	clone = rte_pktmbuf_clone(m, pktmbuf_pool);
> +	clone = rte_pktmbuf_clone(m, clone_pool);
>  	if (clone == NULL)
>  		GOTO_FAIL("cannot clone data\n");
> 
> @@ -370,15 +379,15 @@
>  	if (*data != MAGIC_DATA)
>  		GOTO_FAIL("invalid data in clone->next\n");
> 
> -	if (rte_mbuf_refcnt_read(m) != 2)
> +	if (testclone_refcnt_read(m) != 2)
>  		GOTO_FAIL("invalid refcnt in m\n");
> 
> -	if (rte_mbuf_refcnt_read(m->next) != 2)
> +	if (testclone_refcnt_read(m->next) != 2)
>  		GOTO_FAIL("invalid refcnt in m->next\n");
> 
>  	/* try to clone the clone */
> 
> -	clone2 = rte_pktmbuf_clone(clone, pktmbuf_pool);
> +	clone2 = rte_pktmbuf_clone(clone, clone_pool);
>  	if (clone2 == NULL)
>  		GOTO_FAIL("cannot clone the clone\n");
> 
> @@ -390,10 +399,10 @@
>  	if (*data != MAGIC_DATA)
>  		GOTO_FAIL("invalid data in clone2->next\n");
> 
> -	if (rte_mbuf_refcnt_read(m) != 3)
> +	if (testclone_refcnt_read(m) != 3)
>  		GOTO_FAIL("invalid refcnt in m\n");
> 
> -	if (rte_mbuf_refcnt_read(m->next) != 3)
> +	if (testclone_refcnt_read(m->next) != 3)
>  		GOTO_FAIL("invalid refcnt in m->next\n");
> 
>  	/* free mbuf */
> @@ -418,7 +427,8 @@
>  }
> 
>  static int
> -test_pktmbuf_copy(struct rte_mempool *pktmbuf_pool)
> +test_pktmbuf_copy(struct rte_mempool *pktmbuf_pool,
> +		  struct rte_mempool *clone_pool)
>  {
>  	struct rte_mbuf *m = NULL;
>  	struct rte_mbuf *copy = NULL;
> @@ -458,11 +468,14 @@
>  	copy = NULL;
> 
>  	/* same test with a cloned mbuf */
> -	clone = rte_pktmbuf_clone(m, pktmbuf_pool);
> +	clone = rte_pktmbuf_clone(m, clone_pool);
>  	if (clone == NULL)
>  		GOTO_FAIL("cannot clone data\n");
> 
> -	if (!RTE_MBUF_CLONED(clone))
> +	if ((!RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
> +	     !RTE_MBUF_CLONED(clone)) ||
> +	    (RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
> +	     !RTE_MBUF_HAS_EXTBUF(clone)))
>  		GOTO_FAIL("clone did not give a cloned mbuf\n");
> 
>  	copy = rte_pktmbuf_copy(clone, pktmbuf_pool, 0, UINT32_MAX);
> @@ -1199,10 +1212,11 @@
>  	buf = rte_pktmbuf_alloc(pktmbuf_pool);
>  	if (buf == NULL)
>  		return -1;
> +	/*
>  	printf("Checking good mbuf initially\n");
>  	if (verify_mbuf_check_panics(buf) != -1)
>  		return -1;
> -
> +	*/
>  	printf("Now checking for error conditions\n");
> 
>  	if (verify_mbuf_check_panics(NULL)) {
> @@ -2411,6 +2425,120 @@ struct test_case {
>  	return -1;
>  }
> 
> +/*
> + * Test the mbuf pool with pinned external data buffers
> + *  - Allocate memory zone for external buffer
> + *  - Create the mbuf pool with pinned external buffer
> + *  - Check the created pool with relevant mbuf pool unit tests
> + */
> +static int
> +test_pktmbuf_ext_pinned_buffer(struct rte_mempool *std_pool)
> +{
> +
> +	struct rte_pktmbuf_extmem ext_mem;
> +	struct rte_mempool *pinned_pool = NULL;
> +	const struct rte_memzone *mz = NULL;
> +
> +	printf("Test mbuf pool with external pinned data buffers\n");
> +
> +	/* Allocate memzone for the external data buffer */
> +	mz = rte_memzone_reserve("pinned_pool",
> +				 NB_MBUF * MBUF_DATA_SIZE,
> +				 SOCKET_ID_ANY,
> +				 RTE_MEMZONE_2MB | RTE_MEMZONE_SIZE_HINT_ONLY);
> +	if (mz == NULL)
> +		GOTO_FAIL("%s: Memzone allocation failed\n", __func__);
> +
> +	/* Create the mbuf pool with pinned external data buffer */
> +	ext_mem.buf_ptr = mz->addr;
> +	ext_mem.buf_iova = mz->iova;
> +	ext_mem.buf_len = mz->len;
> +	ext_mem.elt_size = MBUF_DATA_SIZE;
> +
> +	pinned_pool = rte_pktmbuf_pool_create_extbuf("test_pinned_pool",
> +				NB_MBUF, MEMPOOL_CACHE_SIZE, 0,
> +				MBUF_DATA_SIZE,	SOCKET_ID_ANY,
> +				&ext_mem, 1);
> +	if (pinned_pool == NULL)
> +		GOTO_FAIL("%s: Mbuf pool with pinned external"
> +			  " buffer creation failed\n", __func__);
> +	/* test multiple mbuf alloc */
> +	if (test_pktmbuf_pool(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_mbuf_pool(pinned) failed\n",
> +			  __func__);
> +
> +	/* do it another time to check that all mbufs were freed */
> +	if (test_pktmbuf_pool(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_mbuf_pool(pinned) failed (2)\n",
> +			  __func__);
> +
> +	/* test that the data pointer on a packet mbuf is set properly */
> +	if (test_pktmbuf_pool_ptr(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_pktmbuf_pool_ptr(pinned) failed\n",
> +			  __func__);
> +
> +	/* test data manipulation in mbuf with non-ascii data */
> +	if (test_pktmbuf_with_non_ascii_data(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_pktmbuf_with_non_ascii_data(pinned)"
> +			  " failed\n", __func__);
> +
> +	/* test free pktmbuf segment one by one */
> +	if (test_pktmbuf_free_segment(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_pktmbuf_free_segment(pinned) failed\n",
> +			  __func__);
> +
> +	if (testclone_testupdate_testdetach(pinned_pool, std_pool) < 0)
> +		GOTO_FAIL("%s: testclone_and_testupdate(pinned) failed\n",
> +			  __func__);
> +
> +	if (test_pktmbuf_copy(pinned_pool, std_pool) < 0)
> +		GOTO_FAIL("%s: test_pktmbuf_copy(pinned) failed\n",
> +			  __func__);
> +
> +	if (test_failing_mbuf_sanity_check(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_failing_mbuf_sanity_check(pinned)"
> +			  " failed\n", __func__);
> +
> +	if (test_mbuf_linearize_check(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_mbuf_linearize_check(pinned) failed\n",
> +			  __func__);
> +
> +	/* test for allocating a bulk of mbufs with various sizes */
> +	if (test_pktmbuf_alloc_bulk(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_rte_pktmbuf_alloc_bulk(pinned) failed\n",
> +			  __func__);
> +
> +	/* test for allocating a bulk of mbufs with various sizes */
> +	if (test_neg_pktmbuf_alloc_bulk(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_neg_rte_pktmbuf_alloc_bulk(pinned)"
> +			  " failed\n", __func__);
> +
> +	/* test to read mbuf packet */
> +	if (test_pktmbuf_read(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_rte_pktmbuf_read(pinned) failed\n",
> +			  __func__);
> +
> +	/* test to read mbuf packet from offset */
> +	if (test_pktmbuf_read_from_offset(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_rte_pktmbuf_read_from_offset(pinned)"
> +			  " failed\n", __func__);
> +
> +	/* test to read data from chain of mbufs with data segments */
> +	if (test_pktmbuf_read_from_chain(pinned_pool) < 0)
> +		GOTO_FAIL("%s: test_rte_pktmbuf_read_from_chain(pinned)"
> +			  " failed\n", __func__);
> +
> +	RTE_SET_USED(std_pool);
> +	rte_mempool_free(pinned_pool);
> +	rte_memzone_free(mz);
> +	return 0;
> +
> +fail:
> +	rte_mempool_free(pinned_pool);
> +	rte_memzone_free(mz);
> +	return -1;
> +}
> +
>  static int
>  test_mbuf_dyn(struct rte_mempool *pktmbuf_pool)
>  {
> @@ -2635,12 +2763,12 @@ struct test_case {
>  		goto err;
>  	}
> 
> -	if (testclone_testupdate_testdetach(pktmbuf_pool) < 0) {
> +	if (testclone_testupdate_testdetach(pktmbuf_pool, pktmbuf_pool) < 0) {
>  		printf("testclone_and_testupdate() failed \n");
>  		goto err;
>  	}
> 
> -	if (test_pktmbuf_copy(pktmbuf_pool) < 0) {
> +	if (test_pktmbuf_copy(pktmbuf_pool, pktmbuf_pool) < 0) {
>  		printf("test_pktmbuf_copy() failed\n");
>  		goto err;
>  	}
> @@ -2731,6 +2859,13 @@ struct test_case {
>  		goto err;
>  	}
> 
> +	/* test the mbuf pool with pinned external data buffers */
> +	if (test_pktmbuf_ext_pinned_buffer(pktmbuf_pool) < 0) {
> +		printf("test_pktmbuf_ext_pinned_buffer() failed\n");
> +		goto err;
> +	}
> +
> +
>  	ret = 0;
>  err:
>  	rte_mempool_free(pktmbuf_pool);
> --
> 1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external buffer
  2020-01-24 20:25 ` [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external buffer Viacheslav Ovsiienko
  2020-01-26 10:53   ` Slava Ovsiienko
@ 2020-02-06  8:17   ` Olivier Matz
  2020-02-06  8:24     ` Slava Ovsiienko
  2020-02-06  9:49   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
  2 siblings, 1 reply; 77+ messages in thread
From: Olivier Matz @ 2020-02-06  8:17 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, orika, shahafs, stephen, thomas
Hi,
On Fri, Jan 24, 2020 at 08:25:18PM +0000, Viacheslav Ovsiienko wrote:
> This patch adds unit test for the mbufs allocated from
> the special pool with pinned external data buffers.
> 
> The pinned buffer mbufs are tested in the same way as
> regular ones with taking into account some specifics
> of cloning/attaching.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Looks good to me, you can add my ack in the v2, once we understand
the issue with verify_mbuf_check_panics().
> @@ -1199,10 +1212,11 @@
>  	buf = rte_pktmbuf_alloc(pktmbuf_pool);
>  	if (buf == NULL)
>  		return -1;
> +	/*
>  	printf("Checking good mbuf initially\n");
>  	if (verify_mbuf_check_panics(buf) != -1)
>  		return -1;
> -
> +	*/
>  	printf("Now checking for error conditions\n");
>  
>  	if (verify_mbuf_check_panics(NULL)) {
> @@ -2411,6 +2425,120 @@ struct test_case {
Note: on my platform, it still works if I remove this comment.
Regards,
Olivier
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external buffer
  2020-02-06  8:17   ` Olivier Matz
@ 2020-02-06  8:24     ` Slava Ovsiienko
  2020-02-06  9:51       ` Slava Ovsiienko
  0 siblings, 1 reply; 77+ messages in thread
From: Slava Ovsiienko @ 2020-02-06  8:24 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler,
	stephen, thomas
Olivier, thanks for the review.
I'll remove the comment and send v2.
I use 1G huge pages; I will retest with 2M pages
and keep investigating why the check fails on my host.
With best regards, Slava
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Thursday, February 6, 2020 10:17
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; stephen@networkplumber.org;
> thomas@mellanox.net
> Subject: Re: [PATCH] app/test: add test for mbuf with pinned external buffer
> 
> Hi,
> 
> On Fri, Jan 24, 2020 at 08:25:18PM +0000, Viacheslav Ovsiienko wrote:
> > This patch adds unit test for the mbufs allocated from the special
> > pool with pinned external data buffers.
> >
> > The pinned buffer mbufs are tested in the same way as regular ones
> > with taking into account some specifics of cloning/attaching.
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> 
> Looks good to me, you can add my ack in the v2, once we understand the
> issue with verify_mbuf_check_panics().
> 
> > @@ -1199,10 +1212,11 @@
> >  	buf = rte_pktmbuf_alloc(pktmbuf_pool);
> >  	if (buf == NULL)
> >  		return -1;
> > +	/*
> >  	printf("Checking good mbuf initially\n");
> >  	if (verify_mbuf_check_panics(buf) != -1)
> >  		return -1;
> > -
> > +	*/
> >  	printf("Now checking for error conditions\n");
> >
> >  	if (verify_mbuf_check_panics(NULL)) {
> > @@ -2411,6 +2425,120 @@ struct test_case {
> 
> Note: on my platform, it still works if I remove this comment.
> 
> Regards,
> Olivier
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH] mbuf: fix pinned memory free routine style issue
  2020-01-22  8:50 ` [dpdk-dev] [PATCH] mbuf: fix pinned memory free routine style issue Viacheslav Ovsiienko
@ 2020-02-06  9:46   ` Olivier Matz
  2020-02-06 14:26     ` Thomas Monjalon
  0 siblings, 1 reply; 77+ messages in thread
From: Olivier Matz @ 2020-02-06  9:46 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, stephen, thomas
On Wed, Jan 22, 2020 at 08:50:35AM +0000, Viacheslav Ovsiienko wrote:
> Minor style issue is fixed.
> 
> Fixes: 6c8e50c2e549 ("mbuf: create pool with external memory buffers")
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
^ permalink raw reply	[flat|nested] 77+ messages in thread
* [dpdk-dev] [PATCH v2] app/test: add test for mbuf with pinned external buffer
  2020-01-24 20:25 ` [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external buffer Viacheslav Ovsiienko
  2020-01-26 10:53   ` Slava Ovsiienko
  2020-02-06  8:17   ` Olivier Matz
@ 2020-02-06  9:49   ` Viacheslav Ovsiienko
  2020-02-06 14:43     ` Thomas Monjalon
  2 siblings, 1 reply; 77+ messages in thread
From: Viacheslav Ovsiienko @ 2020-02-06  9:49 UTC (permalink / raw)
  To: dev; +Cc: thomas, olivier.matz, ferruh.yigit
This patch adds a unit test for the mbufs allocated from
the special pool with pinned external data buffers.
The pinned buffer mbufs are tested in the same way as
regular ones, taking into account some specifics
of cloning/attaching.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
v2: uncomment the failed sanity check test 
 app/test/test_mbuf.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 149 insertions(+), 14 deletions(-)
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 61ecffc..8200b4f 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -310,8 +310,17 @@
 	return -1;
 }
 
+static uint16_t
+testclone_refcnt_read(struct rte_mbuf *m)
+{
+	return RTE_MBUF_HAS_PINNED_EXTBUF(m) ?
+	       rte_mbuf_ext_refcnt_read(m->shinfo) :
+	       rte_mbuf_refcnt_read(m);
+}
+
 static int
-testclone_testupdate_testdetach(struct rte_mempool *pktmbuf_pool)
+testclone_testupdate_testdetach(struct rte_mempool *pktmbuf_pool,
+				struct rte_mempool *clone_pool)
 {
 	struct rte_mbuf *m = NULL;
 	struct rte_mbuf *clone = NULL;
@@ -331,7 +340,7 @@
 	*data = MAGIC_DATA;
 
 	/* clone the allocated mbuf */
-	clone = rte_pktmbuf_clone(m, pktmbuf_pool);
+	clone = rte_pktmbuf_clone(m, clone_pool);
 	if (clone == NULL)
 		GOTO_FAIL("cannot clone data\n");
 
@@ -339,7 +348,7 @@
 	if (*data != MAGIC_DATA)
 		GOTO_FAIL("invalid data in clone\n");
 
-	if (rte_mbuf_refcnt_read(m) != 2)
+	if (testclone_refcnt_read(m) != 2)
 		GOTO_FAIL("invalid refcnt in m\n");
 
 	/* free the clone */
@@ -358,7 +367,7 @@
 	data = rte_pktmbuf_mtod(m->next, unaligned_uint32_t *);
 	*data = MAGIC_DATA;
 
-	clone = rte_pktmbuf_clone(m, pktmbuf_pool);
+	clone = rte_pktmbuf_clone(m, clone_pool);
 	if (clone == NULL)
 		GOTO_FAIL("cannot clone data\n");
 
@@ -370,15 +379,15 @@
 	if (*data != MAGIC_DATA)
 		GOTO_FAIL("invalid data in clone->next\n");
 
-	if (rte_mbuf_refcnt_read(m) != 2)
+	if (testclone_refcnt_read(m) != 2)
 		GOTO_FAIL("invalid refcnt in m\n");
 
-	if (rte_mbuf_refcnt_read(m->next) != 2)
+	if (testclone_refcnt_read(m->next) != 2)
 		GOTO_FAIL("invalid refcnt in m->next\n");
 
 	/* try to clone the clone */
 
-	clone2 = rte_pktmbuf_clone(clone, pktmbuf_pool);
+	clone2 = rte_pktmbuf_clone(clone, clone_pool);
 	if (clone2 == NULL)
 		GOTO_FAIL("cannot clone the clone\n");
 
@@ -390,10 +399,10 @@
 	if (*data != MAGIC_DATA)
 		GOTO_FAIL("invalid data in clone2->next\n");
 
-	if (rte_mbuf_refcnt_read(m) != 3)
+	if (testclone_refcnt_read(m) != 3)
 		GOTO_FAIL("invalid refcnt in m\n");
 
-	if (rte_mbuf_refcnt_read(m->next) != 3)
+	if (testclone_refcnt_read(m->next) != 3)
 		GOTO_FAIL("invalid refcnt in m->next\n");
 
 	/* free mbuf */
@@ -418,7 +427,8 @@
 }
 
 static int
-test_pktmbuf_copy(struct rte_mempool *pktmbuf_pool)
+test_pktmbuf_copy(struct rte_mempool *pktmbuf_pool,
+		  struct rte_mempool *clone_pool)
 {
 	struct rte_mbuf *m = NULL;
 	struct rte_mbuf *copy = NULL;
@@ -458,11 +468,14 @@
 	copy = NULL;
 
 	/* same test with a cloned mbuf */
-	clone = rte_pktmbuf_clone(m, pktmbuf_pool);
+	clone = rte_pktmbuf_clone(m, clone_pool);
 	if (clone == NULL)
 		GOTO_FAIL("cannot clone data\n");
 
-	if (!RTE_MBUF_CLONED(clone))
+	if ((!RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
+	     !RTE_MBUF_CLONED(clone)) ||
+	    (RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
+	     !RTE_MBUF_HAS_EXTBUF(clone)))
 		GOTO_FAIL("clone did not give a cloned mbuf\n");
 
 	copy = rte_pktmbuf_copy(clone, pktmbuf_pool, 0, UINT32_MAX);
@@ -1199,6 +1212,7 @@
 	buf = rte_pktmbuf_alloc(pktmbuf_pool);
 	if (buf == NULL)
 		return -1;
+
 	printf("Checking good mbuf initially\n");
 	if (verify_mbuf_check_panics(buf) != -1)
 		return -1;
@@ -2411,6 +2425,120 @@ struct test_case {
 	return -1;
 }
 
+/*
+ * Test the mbuf pool with pinned external data buffers
+ *  - Allocate memory zone for external buffer
+ *  - Create the mbuf pool with pinned external buffer
+ *  - Check the created pool with relevant mbuf pool unit tests
+ */
+static int
+test_pktmbuf_ext_pinned_buffer(struct rte_mempool *std_pool)
+{
+
+	struct rte_pktmbuf_extmem ext_mem;
+	struct rte_mempool *pinned_pool = NULL;
+	const struct rte_memzone *mz = NULL;
+
+	printf("Test mbuf pool with external pinned data buffers\n");
+
+	/* Allocate memzone for the external data buffer */
+	mz = rte_memzone_reserve("pinned_pool",
+				 NB_MBUF * MBUF_DATA_SIZE,
+				 SOCKET_ID_ANY,
+				 RTE_MEMZONE_2MB | RTE_MEMZONE_SIZE_HINT_ONLY);
+	if (mz == NULL)
+		GOTO_FAIL("%s: Memzone allocation failed\n", __func__);
+
+	/* Create the mbuf pool with pinned external data buffer */
+	ext_mem.buf_ptr = mz->addr;
+	ext_mem.buf_iova = mz->iova;
+	ext_mem.buf_len = mz->len;
+	ext_mem.elt_size = MBUF_DATA_SIZE;
+
+	pinned_pool = rte_pktmbuf_pool_create_extbuf("test_pinned_pool",
+				NB_MBUF, MEMPOOL_CACHE_SIZE, 0,
+				MBUF_DATA_SIZE,	SOCKET_ID_ANY,
+				&ext_mem, 1);
+	if (pinned_pool == NULL)
+		GOTO_FAIL("%s: Mbuf pool with pinned external"
+			  " buffer creation failed\n", __func__);
+	/* test multiple mbuf alloc */
+	if (test_pktmbuf_pool(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_mbuf_pool(pinned) failed\n",
+			  __func__);
+
+	/* do it another time to check that all mbufs were freed */
+	if (test_pktmbuf_pool(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_mbuf_pool(pinned) failed (2)\n",
+			  __func__);
+
+	/* test that the data pointer on a packet mbuf is set properly */
+	if (test_pktmbuf_pool_ptr(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_pktmbuf_pool_ptr(pinned) failed\n",
+			  __func__);
+
+	/* test data manipulation in mbuf with non-ascii data */
+	if (test_pktmbuf_with_non_ascii_data(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_pktmbuf_with_non_ascii_data(pinned)"
+			  " failed\n", __func__);
+
+	/* test free pktmbuf segment one by one */
+	if (test_pktmbuf_free_segment(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_pktmbuf_free_segment(pinned) failed\n",
+			  __func__);
+
+	if (testclone_testupdate_testdetach(pinned_pool, std_pool) < 0)
+		GOTO_FAIL("%s: testclone_and_testupdate(pinned) failed\n",
+			  __func__);
+
+	if (test_pktmbuf_copy(pinned_pool, std_pool) < 0)
+		GOTO_FAIL("%s: test_pktmbuf_copy(pinned) failed\n",
+			  __func__);
+
+	if (test_failing_mbuf_sanity_check(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_failing_mbuf_sanity_check(pinned)"
+			  " failed\n", __func__);
+
+	if (test_mbuf_linearize_check(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_mbuf_linearize_check(pinned) failed\n",
+			  __func__);
+
+	/* test for allocating a bulk of mbufs with various sizes */
+	if (test_pktmbuf_alloc_bulk(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_rte_pktmbuf_alloc_bulk(pinned) failed\n",
+			  __func__);
+
+	/* test for allocating a bulk of mbufs with various sizes */
+	if (test_neg_pktmbuf_alloc_bulk(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_neg_rte_pktmbuf_alloc_bulk(pinned)"
+			  " failed\n", __func__);
+
+	/* test to read mbuf packet */
+	if (test_pktmbuf_read(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_rte_pktmbuf_read(pinned) failed\n",
+			  __func__);
+
+	/* test to read mbuf packet from offset */
+	if (test_pktmbuf_read_from_offset(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_rte_pktmbuf_read_from_offset(pinned)"
+			  " failed\n", __func__);
+
+	/* test to read data from chain of mbufs with data segments */
+	if (test_pktmbuf_read_from_chain(pinned_pool) < 0)
+		GOTO_FAIL("%s: test_rte_pktmbuf_read_from_chain(pinned)"
+			  " failed\n", __func__);
+
+	RTE_SET_USED(std_pool);
+	rte_mempool_free(pinned_pool);
+	rte_memzone_free(mz);
+	return 0;
+
+fail:
+	rte_mempool_free(pinned_pool);
+	rte_memzone_free(mz);
+	return -1;
+}
+
 static int
 test_mbuf_dyn(struct rte_mempool *pktmbuf_pool)
 {
@@ -2635,12 +2763,12 @@ struct test_case {
 		goto err;
 	}
 
-	if (testclone_testupdate_testdetach(pktmbuf_pool) < 0) {
+	if (testclone_testupdate_testdetach(pktmbuf_pool, pktmbuf_pool) < 0) {
 		printf("testclone_and_testupdate() failed \n");
 		goto err;
 	}
 
-	if (test_pktmbuf_copy(pktmbuf_pool) < 0) {
+	if (test_pktmbuf_copy(pktmbuf_pool, pktmbuf_pool) < 0) {
 		printf("test_pktmbuf_copy() failed\n");
 		goto err;
 	}
@@ -2731,6 +2859,13 @@ struct test_case {
 		goto err;
 	}
 
+	/* test the mbuf pool with pinned external data buffers */
+	if (test_pktmbuf_ext_pinned_buffer(pktmbuf_pool) < 0) {
+		printf("test_pktmbuf_ext_pinned_buffer() failed\n");
+		goto err;
+	}
+
+
 	ret = 0;
 err:
 	rte_mempool_free(pktmbuf_pool);
-- 
1.8.3.1
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external buffer
  2020-02-06  8:24     ` Slava Ovsiienko
@ 2020-02-06  9:51       ` Slava Ovsiienko
  0 siblings, 0 replies; 77+ messages in thread
From: Slava Ovsiienko @ 2020-02-06  9:51 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, Matan Azrad, Raslan Darawsheh, Ori Kam, Shahaf Shuler,
	stephen, thomas
I checked with 2M huge pages: fork() works OK on my host, but it does not work with 1G pages.
The host runs RH7.2/3.10.327. I will try other kernels.
With best regards, Slava
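
For context, the check under discussion relies on running a sanity check in a
forked child and looking at how the child terminated. Below is a minimal,
DPDK-free sketch of that pattern. toy_sanity_check() and toy_check_panics()
are hypothetical names, and the return convention (1 = panicked, 0 = clean
exit, -1 = fork/wait error) is an assumption made for this illustration, not
the convention used in test_mbuf.c.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a sanity check that aborts on bad input. */
static void
toy_sanity_check(const int *obj)
{
	if (obj == NULL)
		abort();
}

/*
 * Run the check in a forked child so that a panic does not kill the caller.
 * Returns 1 if the child died or exited non-zero (the check "panicked"),
 * 0 if the child exited cleanly, -1 if fork() or waitpid() itself failed.
 */
static int
toy_check_panics(const int *obj)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		toy_sanity_check(obj);
		_exit(0); /* reached only if the check did not abort */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

Reporting a fork() failure separately from "the child panicked" is a design
choice of this sketch; it makes a host-specific fork problem easier to tell
apart from a real sanity check failure.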
> -----Original Message-----
> From: Slava Ovsiienko
> Sent: Thursday, February 6, 2020 10:25
> To: Olivier Matz <olivier.matz@6wind.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; stephen@networkplumber.org;
> thomas@mellanox.net
> Subject: RE: [PATCH] app/test: add test for mbuf with pinned external buffer
> 
> Olivier, thanks for the reviewing.
> I'll remove the comment and send the v2.
> I use 1G huge pages, will retest over 2M and continue finding why my host
> fails.
> 
> With best regards, Slava
> 
> > -----Original Message-----
> > From: Olivier Matz <olivier.matz@6wind.com>
> > Sent: Thursday, February 6, 2020 10:17
> > To: Slava Ovsiienko <viacheslavo@mellanox.com>
> > Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> > <shahafs@mellanox.com>; stephen@networkplumber.org;
> > thomas@mellanox.net
> > Subject: Re: [PATCH] app/test: add test for mbuf with pinned external
> > buffer
> >
> > Hi,
> >
> > On Fri, Jan 24, 2020 at 08:25:18PM +0000, Viacheslav Ovsiienko wrote:
> > > This patch adds unit test for the mbufs allocated from the special
> > > pool with pinned external data buffers.
> > >
> > > The pinned buffer mbufs are tested in the same way as regular ones
> > > with taking into account some specifics of cloning/attaching.
> > >
> > > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> >
> > Looks good to me, you can add my ack in the v2, once we understand the
> > issue with verify_mbuf_check_panics().
> >
> > > @@ -1199,10 +1212,11 @@
> > >  	buf = rte_pktmbuf_alloc(pktmbuf_pool);
> > >  	if (buf == NULL)
> > >  		return -1;
> > > +	/*
> > >  	printf("Checking good mbuf initially\n");
> > >  	if (verify_mbuf_check_panics(buf) != -1)
> > >  		return -1;
> > > -
> > > +	*/
> > >  	printf("Now checking for error conditions\n");
> > >
> > >  	if (verify_mbuf_check_panics(NULL)) {
> > > @@ -2411,6 +2425,120 @@ struct test_case {
> >
> > Note: on my platform, it still works if I remove this comment.
> >
> > Regards,
> > Olivier
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH] mbuf: fix pinned memory free routine style issue
  2020-02-06  9:46   ` Olivier Matz
@ 2020-02-06 14:26     ` Thomas Monjalon
  0 siblings, 0 replies; 77+ messages in thread
From: Thomas Monjalon @ 2020-02-06 14:26 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, stephen, Olivier Matz
06/02/2020 10:46, Olivier Matz:
> On Wed, Jan 22, 2020 at 08:50:35AM +0000, Viacheslav Ovsiienko wrote:
> > Minor style issue is fixed.
> > 
> > Fixes: 6c8e50c2e549 ("mbuf: create pool with external memory buffers")
> > 
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> 
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
Applied, thanks
^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: [dpdk-dev] [PATCH v2] app/test: add test for mbuf with pinned external buffer
  2020-02-06  9:49   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
@ 2020-02-06 14:43     ` Thomas Monjalon
  0 siblings, 0 replies; 77+ messages in thread
From: Thomas Monjalon @ 2020-02-06 14:43 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, olivier.matz, ferruh.yigit
06/02/2020 10:49, Viacheslav Ovsiienko:
> This patch adds unit test for the mbufs allocated from
> the special pool with pinned external data buffers.
> 
> The pinned buffer mbufs are tested in the same way as
> regular ones with taking into account some specifics
> of cloning/attaching.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> v2: uncomment the failed sanity check test 
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Applied, thanks
^ permalink raw reply	[flat|nested] 77+ messages in thread
* RE: [dpdk-dev] [PATCH v6 2/5] mbuf: detach mbuf with pinned external buffer
  2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
@ 2023-12-06 10:55     ` Morten Brørup
  0 siblings, 0 replies; 77+ messages in thread
From: Morten Brørup @ 2023-12-06 10:55 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, dev, Shahaf Shuler
  Cc: Matan Azrad, rasland, orika, Olivier Matz, Stephen Hemminger,
	Thomas Monjalon, Konstantin Ananyev, Feifei Wang, nd,
	Ruifeng Wang, bruce.richardson
Triggered by another discussion, I have identified a potential bug related to this patch.
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Viacheslav
> Ovsiienko
> Sent: Monday, 20 January 2020 20.16
> 
> Update detach routine to check the mbuf pool type.
> Introduce the special internal version of detach routine to handle
> the special case of pinned external buffer on mbuf freeing.
> 
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> ---
[...]
> @@ -1201,8 +1278,13 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
> 
>  	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
> 
> -		if (!RTE_MBUF_DIRECT(m))
> -			rte_pktmbuf_detach(m);
> +		if (!RTE_MBUF_DIRECT(m)) {
> +			if (!RTE_MBUF_HAS_EXTBUF(m) ||
> +			    !RTE_MBUF_HAS_PINNED_EXTBUF(m))
> +				rte_pktmbuf_detach(m);
> +			else if (__rte_pktmbuf_pinned_extbuf_decref(m))
> +				return NULL;
When NULL is returned here, m->refcnt is still 1.
> +		}
> 
>  		if (m->next != NULL) {
>  			m->next = NULL;
> @@ -1213,8 +1295,13 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
> 
>  	} else if (__rte_mbuf_refcnt_update(m, -1) == 0) {
> 
> -		if (!RTE_MBUF_DIRECT(m))
> -			rte_pktmbuf_detach(m);
> +		if (!RTE_MBUF_DIRECT(m)) {
> +			if (!RTE_MBUF_HAS_EXTBUF(m) ||
> +			    !RTE_MBUF_HAS_PINNED_EXTBUF(m))
> +				rte_pktmbuf_detach(m);
> +			else if (__rte_pktmbuf_pinned_extbuf_decref(m))
> +				return NULL;
When NULL is returned here, m->refcnt has been decremented to 0.
I don't know which is correct, but I suppose m->refcnt should end up with the same value in both cases?
> +		}
> 
>  		if (m->next != NULL) {
>  			m->next = NULL;
> --
> 1.8.3.1
> 
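
The asymmetry pointed out above can be reproduced with a small single-threaded
toy model. This is a sketch under stated assumptions, not the DPDK code: the
toy_* names are hypothetical, the atomics are replaced by plain integers, and
the reset of refcnt to 1 before the mbuf goes back to the pool is assumed from
the surrounding function, since the quoted hunks do not show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not the DPDK structures. */
struct toy_mbuf {
	uint16_t refcnt;     /* stands in for m->refcnt */
	uint16_t ext_refcnt; /* stands in for the pinned buffer's shared counter */
};

/* Returns 1 while the pinned buffer is still referenced elsewhere. */
static int
toy_pinned_extbuf_decref(struct toy_mbuf *m)
{
	assert(m->ext_refcnt >= 1);
	if (m->ext_refcnt == 1)
		return 0;
	m->ext_refcnt--;
	return 1;
}

/* First branch: refcnt was read as 1 and is never written. */
static bool
toy_prefree_fast(struct toy_mbuf *m)
{
	if (toy_pinned_extbuf_decref(m))
		return false; /* models "return NULL": refcnt is still 1 */
	return true;
}

/* Second branch: refcnt is decremented first and has reached 0. */
static bool
toy_prefree_slow(struct toy_mbuf *m)
{
	if (--m->refcnt != 0)
		return false; /* other owners of the mbuf remain */
	if (toy_pinned_extbuf_decref(m))
		return false; /* models "return NULL": refcnt is now 0 */
	m->refcnt = 1; /* assumed reset before returning the mbuf to the pool */
	return true;
}
```

Starting both paths from refcnt = 1 and ext_refcnt = 2, the fast path leaves
refcnt at 1 and the slow path leaves it at 0, although both report that the
mbuf must not be returned to the pool. Which of the two values is the intended
one is exactly the open question raised in this message.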
^ permalink raw reply	[flat|nested] 77+ messages in thread
end of thread, other threads:[~2023-12-06 10:55 UTC | newest]
Thread overview: 77+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-18  9:50 [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers Shahaf Shuler
2019-11-18 16:09 ` Stephen Hemminger
2020-01-10 17:56 ` [dpdk-dev] [PATCH 0/4] " Viacheslav Ovsiienko
2020-01-10 17:56   ` [dpdk-dev] [PATCH 1/4] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
2020-01-10 18:23     ` Stephen Hemminger
2020-01-13 17:07       ` Slava Ovsiienko
2020-01-14  7:19       ` Slava Ovsiienko
2020-01-10 17:57   ` [dpdk-dev] [PATCH 2/4] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
2020-01-10 17:57   ` [dpdk-dev] [PATCH 3/4] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
2020-01-10 17:57   ` [dpdk-dev] [PATCH 4/4] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
2020-01-14  7:49 ` [dpdk-dev] [PATCH v2 0/4] mbuf: introduce pktmbuf pool with pinned external buffers Viacheslav Ovsiienko
2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 1/4] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 2/4] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 3/4] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
2020-01-14  7:49   ` [dpdk-dev] [PATCH v2 4/4] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
2020-01-14  9:15 ` [dpdk-dev] [PATCH v3 0/4] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 1/4] " Viacheslav Ovsiienko
2020-01-14 15:27     ` Olivier Matz
2020-01-15 12:52       ` Slava Ovsiienko
2020-01-14 15:50     ` Stephen Hemminger
2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 2/4] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
2020-01-14 16:04     ` Olivier Matz
2020-01-15 18:13       ` Slava Ovsiienko
2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 3/4] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
2020-01-14  9:15   ` [dpdk-dev] [PATCH v3 4/4] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
2020-01-16 13:04 ` [dpdk-dev] [PATCH v4 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
2020-01-20 12:16     ` Olivier Matz
2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
2020-01-20 13:56     ` Olivier Matz
2020-01-20 15:41       ` Slava Ovsiienko
2020-01-20 16:17         ` Olivier Matz
2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
2020-01-20 13:59     ` Olivier Matz
2020-01-20 17:33       ` Slava Ovsiienko
2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 4/5] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
2020-01-20 14:11     ` Olivier Matz
2020-01-16 13:04   ` [dpdk-dev] [PATCH v4 5/5] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
2020-01-20 17:23 ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Viacheslav Ovsiienko
2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
2020-01-20 20:43     ` Stephen Hemminger
2020-01-20 22:52       ` Thomas Monjalon
2020-01-21  6:48       ` Slava Ovsiienko
2020-01-21  8:00       ` Slava Ovsiienko
2020-01-21  8:14         ` Olivier Matz
2020-01-21  8:23           ` Slava Ovsiienko
2020-01-21  9:13             ` Slava Ovsiienko
2020-01-21 14:01               ` Olivier Matz
2020-01-21 16:21                 ` Stephen Hemminger
2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
2020-01-20 17:40     ` Olivier Matz
2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
2020-01-20 17:46     ` Olivier Matz
2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 4/5] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
2020-01-20 17:23   ` [dpdk-dev] [PATCH v5 5/5] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
2020-01-20 17:30   ` [dpdk-dev] [PATCH v5 0/5] mbuf: detach mbuf with pinned " Slava Ovsiienko
2020-01-20 17:41     ` Olivier Matz
2020-01-20 19:16 ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 1/5] mbuf: introduce routine to get private mbuf pool flags Viacheslav Ovsiienko
2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 2/5] mbuf: detach mbuf with pinned external buffer Viacheslav Ovsiienko
2023-12-06 10:55     ` [dpdk-dev] [PATCH v6 2/5] mbuf: detach mbuf with pinned externalbuffer Morten Brørup
2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 3/5] mbuf: create packet pool with external memory buffers Viacheslav Ovsiienko
2020-01-20 20:48     ` Stephen Hemminger
2020-01-21  7:04       ` Slava Ovsiienko
2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 4/5] app/testpmd: add mempool with external data buffers Viacheslav Ovsiienko
2020-01-20 19:16   ` [dpdk-dev] [PATCH v6 5/5] net/mlx5: allow use allocated mbuf with external buffer Viacheslav Ovsiienko
2020-01-20 22:55   ` [dpdk-dev] [PATCH v6 0/5] mbuf: detach mbuf with pinned " Thomas Monjalon
2020-01-22  8:50 ` [dpdk-dev] [PATCH] mbuf: fix pinned memory free routine style issue Viacheslav Ovsiienko
2020-02-06  9:46   ` Olivier Matz
2020-02-06 14:26     ` Thomas Monjalon
2020-01-24 20:25 ` [dpdk-dev] [PATCH] app/test: add test for mbuf with pinned external buffer Viacheslav Ovsiienko
2020-01-26 10:53   ` Slava Ovsiienko
2020-02-06  8:17   ` Olivier Matz
2020-02-06  8:24     ` Slava Ovsiienko
2020-02-06  9:51       ` Slava Ovsiienko
2020-02-06  9:49   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
2020-02-06 14:43     ` Thomas Monjalon