DPDK patches and discussions
* [dpdk-dev] [PATCH 0/3] net/mlx5: optimize single counter allocate
@ 2020-06-18  7:24 Suanming Mou
  2020-06-18  7:24 ` [dpdk-dev] [PATCH 1/3] net/mlx5: add Three-Level table utility Suanming Mou
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Suanming Mou @ 2020-06-18  7:24 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: rasland, dev

This patch set optimizes DevX single counter allocation in two ways:

1. Add a multiple level table for a quick lookup when allocating or
searching for a single shared counter.

2. Optimize the pool lookup for a newly allocated single counter.

Suanming Mou (3):
  net/mlx5: add Three-Level table utility
  net/mlx5: manage shared counters in Three-Level table
  net/mlx5: optimize single counter pool search

 drivers/net/mlx5/mlx5.c         |  16 +++
 drivers/net/mlx5/mlx5.h         |  10 ++
 drivers/net/mlx5/mlx5_flow_dv.c | 115 +++++++++++------
 drivers/net/mlx5/mlx5_utils.c   | 276 ++++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_utils.h   | 165 ++++++++++++++++++++++++
 5 files changed, 545 insertions(+), 37 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [dpdk-dev] [PATCH 1/3] net/mlx5: add Three-Level table utility
  2020-06-18  7:24 [dpdk-dev] [PATCH 0/3] net/mlx5: optimize single counter allocate Suanming Mou
@ 2020-06-18  7:24 ` Suanming Mou
  2020-06-18  7:24 ` [dpdk-dev] [PATCH 2/3] net/mlx5: manage shared counters in Three-Level table Suanming Mou
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Suanming Mou @ 2020-06-18  7:24 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: rasland, dev

For data entries linked with a sequentially increasing index, an array
table is more efficient than a hash table when one entry must be
searched among a large number of entries. Since traditional hash
tables have a fixed table size, saving huge numbers of entries to the
hash table also causes lots of hash collisions.

But a simple array table also has a fixed size, and allocating all the
needed memory at once wastes lots of memory. When the exact number of
entries is unknown, allocating the array is impossible.

A multiple level table helps to balance these two disadvantages.
A global high level table with sub table entries is allocated first;
the global table contains the sub table entries, and a sub table is
allocated only once the corresponding index entry needs to be saved.
E.g. for an up-to-32-bit index with 10-10-12 splitting and a
sequentially increasing index, memory grows with every 4K entries.

The current implementation introduces a Three-Level table with
10-10-12 splitting of the 32-bit index, to help cases which have
millions of entries to save. The index entries can be addressed
directly by the index; no search is needed.
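To make the splitting concrete, here is a minimal standalone sketch of
the 10-10-12 index decomposition. The names below are illustrative
stand-ins mirroring the MLX5_L3T_* macros this patch introduces, not
the driver's actual definitions:

```c
#include <stdint.h>

/* Hypothetical 10-10-12 split of a 32-bit index into the
 * global / middle / entry table levels described above. */
#define GT_OFFSET 22                    /* global table: top 10 bits */
#define MT_OFFSET 12                    /* middle table: next 10 bits */
#define GT_MASK ((1u << 10) - 1)
#define MT_MASK ((1u << 10) - 1)
#define ET_MASK ((1u << 12) - 1)        /* entry table: low 12 bits (4K) */

/* Index into the global (top) level table. */
static inline uint32_t gt_idx(uint32_t idx)
{
	return (idx >> GT_OFFSET) & GT_MASK;
}

/* Index into the middle level table. */
static inline uint32_t mt_idx(uint32_t idx)
{
	return (idx >> MT_OFFSET) & MT_MASK;
}

/* Index into the leaf entry table. */
static inline uint32_t et_idx(uint32_t idx)
{
	return idx & ET_MASK;
}
```

With a sequentially increasing index, all three level indices stay
constant across runs of 4096 consecutive values, so a new 4K-entry
leaf table is needed only once per 4K allocations.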

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 drivers/net/mlx5/mlx5_utils.c | 276 ++++++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_utils.h | 165 +++++++++++++++++++++++++
 2 files changed, 441 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_utils.c b/drivers/net/mlx5/mlx5_utils.c
index d29fbcb..5c76b29 100644
--- a/drivers/net/mlx5/mlx5_utils.c
+++ b/drivers/net/mlx5/mlx5_utils.c
@@ -482,3 +482,279 @@ struct mlx5_indexed_pool *
 	       pool->trunk_empty, pool->trunk_avail, pool->trunk_free);
 #endif
 }
+
+struct mlx5_l3t_tbl *
+mlx5_l3t_create(enum mlx5_l3t_type type)
+{
+	struct mlx5_l3t_tbl *tbl;
+	struct mlx5_indexed_pool_config l3t_ip_cfg = {
+		.trunk_size = 16,
+		.grow_trunk = 6,
+		.grow_shift = 1,
+		.need_lock = 0,
+		.release_mem_en = 1,
+		.malloc = rte_malloc_socket,
+		.free = rte_free,
+	};
+
+	if (type >= MLX5_L3T_TYPE_MAX) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+	tbl = rte_zmalloc(NULL, sizeof(struct mlx5_l3t_tbl), 1);
+	if (!tbl) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+	tbl->type = type;
+	switch (type) {
+	case MLX5_L3T_TYPE_WORD:
+		l3t_ip_cfg.size = sizeof(struct mlx5_l3t_entry_word) +
+				  sizeof(uint16_t) * MLX5_L3T_ET_SIZE;
+		l3t_ip_cfg.type = "mlx5_l3t_e_tbl_w";
+		break;
+	case MLX5_L3T_TYPE_DWORD:
+		l3t_ip_cfg.size = sizeof(struct mlx5_l3t_entry_dword) +
+				  sizeof(uint32_t) * MLX5_L3T_ET_SIZE;
+		l3t_ip_cfg.type = "mlx5_l3t_e_tbl_dw";
+		break;
+	case MLX5_L3T_TYPE_QWORD:
+		l3t_ip_cfg.size = sizeof(struct mlx5_l3t_entry_qword) +
+				  sizeof(uint64_t) * MLX5_L3T_ET_SIZE;
+		l3t_ip_cfg.type = "mlx5_l3t_e_tbl_qw";
+		break;
+	default:
+		l3t_ip_cfg.size = sizeof(struct mlx5_l3t_entry_ptr) +
+				  sizeof(void *) * MLX5_L3T_ET_SIZE;
+		l3t_ip_cfg.type = "mlx5_l3t_e_tbl_tpr";
+		break;
+	}
+	tbl->eip = mlx5_ipool_create(&l3t_ip_cfg);
+	if (!tbl->eip) {
+		rte_errno = ENOMEM;
+		rte_free(tbl);
+		tbl = NULL;
+	}
+	return tbl;
+}
+
+void
+mlx5_l3t_destroy(struct mlx5_l3t_tbl *tbl)
+{
+	struct mlx5_l3t_level_tbl *g_tbl, *m_tbl;
+	uint32_t i, j;
+
+	if (!tbl)
+		return;
+	g_tbl = tbl->tbl;
+	if (g_tbl) {
+		for (i = 0; i < MLX5_L3T_GT_SIZE; i++) {
+			m_tbl = g_tbl->tbl[i];
+			if (!m_tbl)
+				continue;
+			for (j = 0; j < MLX5_L3T_MT_SIZE; j++) {
+				if (!m_tbl->tbl[j])
+					continue;
+				MLX5_ASSERT(!((struct mlx5_l3t_entry_word *)
+					    m_tbl->tbl[j])->ref_cnt);
+				mlx5_ipool_free(tbl->eip,
+						((struct mlx5_l3t_entry_word *)
+						m_tbl->tbl[j])->idx);
+				m_tbl->tbl[j] = 0;
+				if (!(--m_tbl->ref_cnt))
+					break;
+			}
+			MLX5_ASSERT(!m_tbl->ref_cnt);
+			rte_free(g_tbl->tbl[i]);
+			g_tbl->tbl[i] = 0;
+			if (!(--g_tbl->ref_cnt))
+				break;
+		}
+		MLX5_ASSERT(!g_tbl->ref_cnt);
+		rte_free(tbl->tbl);
+		tbl->tbl = 0;
+	}
+	mlx5_ipool_destroy(tbl->eip);
+	rte_free(tbl);
+}
+
+uint32_t
+mlx5_l3t_get_entry(struct mlx5_l3t_tbl *tbl, uint32_t idx,
+		   union mlx5_l3t_data *data)
+{
+	struct mlx5_l3t_level_tbl *g_tbl, *m_tbl;
+	void *e_tbl;
+	uint32_t entry_idx;
+
+	g_tbl = tbl->tbl;
+	if (!g_tbl)
+		return -1;
+	m_tbl = g_tbl->tbl[(idx >> MLX5_L3T_GT_OFFSET) & MLX5_L3T_GT_MASK];
+	if (!m_tbl)
+		return -1;
+	e_tbl = m_tbl->tbl[(idx >> MLX5_L3T_MT_OFFSET) & MLX5_L3T_MT_MASK];
+	if (!e_tbl)
+		return -1;
+	entry_idx = idx & MLX5_L3T_ET_MASK;
+	switch (tbl->type) {
+	case MLX5_L3T_TYPE_WORD:
+		data->word = ((struct mlx5_l3t_entry_word *)e_tbl)->entry
+			     [entry_idx];
+		break;
+	case MLX5_L3T_TYPE_DWORD:
+		data->dword = ((struct mlx5_l3t_entry_dword *)e_tbl)->entry
+			     [entry_idx];
+		break;
+	case MLX5_L3T_TYPE_QWORD:
+		data->qword = ((struct mlx5_l3t_entry_qword *)e_tbl)->entry
+			      [entry_idx];
+		break;
+	default:
+		data->ptr = ((struct mlx5_l3t_entry_ptr *)e_tbl)->entry
+			    [entry_idx];
+		break;
+	}
+	return 0;
+}
+
+void
+mlx5_l3t_clear_entry(struct mlx5_l3t_tbl *tbl, uint32_t idx)
+{
+	struct mlx5_l3t_level_tbl *g_tbl, *m_tbl;
+	struct mlx5_l3t_entry_word *w_e_tbl;
+	struct mlx5_l3t_entry_dword *dw_e_tbl;
+	struct mlx5_l3t_entry_qword *qw_e_tbl;
+	struct mlx5_l3t_entry_ptr *ptr_e_tbl;
+	void *e_tbl;
+	uint32_t entry_idx;
+	uint64_t ref_cnt;
+
+	g_tbl = tbl->tbl;
+	if (!g_tbl)
+		return;
+	m_tbl = g_tbl->tbl[(idx >> MLX5_L3T_GT_OFFSET) & MLX5_L3T_GT_MASK];
+	if (!m_tbl)
+		return;
+	e_tbl = m_tbl->tbl[(idx >> MLX5_L3T_MT_OFFSET) & MLX5_L3T_MT_MASK];
+	if (!e_tbl)
+		return;
+	entry_idx = idx & MLX5_L3T_ET_MASK;
+	switch (tbl->type) {
+	case MLX5_L3T_TYPE_WORD:
+		w_e_tbl = (struct mlx5_l3t_entry_word *)e_tbl;
+		w_e_tbl->entry[entry_idx] = 0;
+		ref_cnt = --w_e_tbl->ref_cnt;
+		break;
+	case MLX5_L3T_TYPE_DWORD:
+		dw_e_tbl = (struct mlx5_l3t_entry_dword *)e_tbl;
+		dw_e_tbl->entry[entry_idx] = 0;
+		ref_cnt = --dw_e_tbl->ref_cnt;
+		break;
+	case MLX5_L3T_TYPE_QWORD:
+		qw_e_tbl = (struct mlx5_l3t_entry_qword *)e_tbl;
+		qw_e_tbl->entry[entry_idx] = 0;
+		ref_cnt = --qw_e_tbl->ref_cnt;
+		break;
+	default:
+		ptr_e_tbl = (struct mlx5_l3t_entry_ptr *)e_tbl;
+		ptr_e_tbl->entry[entry_idx] = NULL;
+		ref_cnt = --ptr_e_tbl->ref_cnt;
+		break;
+	}
+	if (!ref_cnt) {
+		mlx5_ipool_free(tbl->eip,
+				((struct mlx5_l3t_entry_word *)e_tbl)->idx);
+		m_tbl->tbl[(idx >> MLX5_L3T_MT_OFFSET) & MLX5_L3T_MT_MASK] =
+									NULL;
+		if (!(--m_tbl->ref_cnt)) {
+			rte_free(m_tbl);
+			g_tbl->tbl
+			[(idx >> MLX5_L3T_GT_OFFSET) & MLX5_L3T_GT_MASK] = NULL;
+			if (!(--g_tbl->ref_cnt)) {
+				rte_free(g_tbl);
+				tbl->tbl = 0;
+			}
+		}
+	}
+}
+
+uint32_t
+mlx5_l3t_set_entry(struct mlx5_l3t_tbl *tbl, uint32_t idx,
+		   union mlx5_l3t_data *data)
+{
+	struct mlx5_l3t_level_tbl *g_tbl, *m_tbl;
+	struct mlx5_l3t_entry_word *w_e_tbl;
+	struct mlx5_l3t_entry_dword *dw_e_tbl;
+	struct mlx5_l3t_entry_qword *qw_e_tbl;
+	struct mlx5_l3t_entry_ptr *ptr_e_tbl;
+	void *e_tbl;
+	uint32_t entry_idx, tbl_idx = 0;
+
+	/* Check the global table, create it if empty. */
+	g_tbl = tbl->tbl;
+	if (!g_tbl) {
+		g_tbl = rte_zmalloc(NULL, sizeof(struct mlx5_l3t_level_tbl) +
+				    sizeof(void *) * MLX5_L3T_GT_SIZE, 1);
+		if (!g_tbl) {
+			rte_errno = ENOMEM;
+			return -1;
+		}
+		tbl->tbl = g_tbl;
+	}
+	/*
+	 * Check the middle table, create it if empty. Ref_cnt will be
+	 * increased if new sub table created.
+	 */
+	m_tbl = g_tbl->tbl[(idx >> MLX5_L3T_GT_OFFSET) & MLX5_L3T_GT_MASK];
+	if (!m_tbl) {
+		m_tbl = rte_zmalloc(NULL, sizeof(struct mlx5_l3t_level_tbl) +
+				    sizeof(void *) * MLX5_L3T_MT_SIZE, 1);
+		if (!m_tbl) {
+			rte_errno = ENOMEM;
+			return -1;
+		}
+		g_tbl->tbl[(idx >> MLX5_L3T_GT_OFFSET) & MLX5_L3T_GT_MASK] =
+									m_tbl;
+		g_tbl->ref_cnt++;
+	}
+	/*
+	 * Check the entry table, create it if empty. Ref_cnt will be
+	 * increased if new sub entry table created.
+	 */
+	e_tbl = m_tbl->tbl[(idx >> MLX5_L3T_MT_OFFSET) & MLX5_L3T_MT_MASK];
+	if (!e_tbl) {
+		e_tbl = mlx5_ipool_zmalloc(tbl->eip, &tbl_idx);
+		if (!e_tbl) {
+			rte_errno = ENOMEM;
+			return -1;
+		}
+		((struct mlx5_l3t_entry_word *)e_tbl)->idx = tbl_idx;
+		m_tbl->tbl[(idx >> MLX5_L3T_MT_OFFSET) & MLX5_L3T_MT_MASK] =
+									e_tbl;
+		m_tbl->ref_cnt++;
+	}
+	entry_idx = idx & MLX5_L3T_ET_MASK;
+	switch (tbl->type) {
+	case MLX5_L3T_TYPE_WORD:
+		w_e_tbl = (struct mlx5_l3t_entry_word *)e_tbl;
+		w_e_tbl->entry[entry_idx] = data->word;
+		w_e_tbl->ref_cnt++;
+		break;
+	case MLX5_L3T_TYPE_DWORD:
+		dw_e_tbl = (struct mlx5_l3t_entry_dword *)e_tbl;
+		dw_e_tbl->entry[entry_idx] = data->dword;
+		dw_e_tbl->ref_cnt++;
+		break;
+	case MLX5_L3T_TYPE_QWORD:
+		qw_e_tbl = (struct mlx5_l3t_entry_qword *)e_tbl;
+		qw_e_tbl->entry[entry_idx] = data->qword;
+		qw_e_tbl->ref_cnt++;
+		break;
+	default:
+		ptr_e_tbl = (struct mlx5_l3t_entry_ptr *)e_tbl;
+		ptr_e_tbl->entry[entry_idx] = data->ptr;
+		ptr_e_tbl->ref_cnt++;
+		break;
+	}
+	return 0;
+}
diff --git a/drivers/net/mlx5/mlx5_utils.h b/drivers/net/mlx5/mlx5_utils.h
index f4ec151..18cfc2c 100644
--- a/drivers/net/mlx5/mlx5_utils.h
+++ b/drivers/net/mlx5/mlx5_utils.h
@@ -55,6 +55,106 @@
 	 (((val) & (from)) * ((to) / (from))))
 
 /*
+ * For data entries linked with a sequentially increasing index, an
+ * array table is more efficient than a hash table when one entry
+ * must be searched among a large number of entries. Since hash
+ * tables have a fixed table size, saving huge numbers of entries to
+ * the hash table also causes lots of hash collisions.
+ *
+ * But a simple array table also has a fixed size, and allocating all
+ * the needed memory at once wastes lots of memory. When the exact
+ * number of entries is unknown, allocating the array is impossible.
+ *
+ * A multiple level table helps to balance these two disadvantages.
+ * A global high level table with sub table entries is allocated
+ * first; the global table contains the sub table entries, and a sub
+ * table is allocated only once the corresponding index entry needs
+ * to be saved. E.g. for an up-to-32-bit index with 10-10-12 splitting
+ * and a sequentially increasing index, memory grows per 4K entries.
+ *
+ * The current implementation introduces a Three-Level table with
+ * 10-10-12 splitting of the 32-bit index, to help cases which have
+ * millions of entries to save. The index entries can be addressed
+ * directly by the index; no search is needed.
+ */
+
+/* L3 table global table define. */
+#define MLX5_L3T_GT_OFFSET 22
+#define MLX5_L3T_GT_SIZE (1 << 10)
+#define MLX5_L3T_GT_MASK (MLX5_L3T_GT_SIZE - 1)
+
+/* L3 table middle table define. */
+#define MLX5_L3T_MT_OFFSET 12
+#define MLX5_L3T_MT_SIZE (1 << 10)
+#define MLX5_L3T_MT_MASK (MLX5_L3T_MT_SIZE - 1)
+
+/* L3 table entry table define. */
+#define MLX5_L3T_ET_OFFSET 0
+#define MLX5_L3T_ET_SIZE (1 << 12)
+#define MLX5_L3T_ET_MASK (MLX5_L3T_ET_SIZE - 1)
+
+/* L3 table type. */
+enum mlx5_l3t_type {
+	MLX5_L3T_TYPE_WORD = 0,
+	MLX5_L3T_TYPE_DWORD,
+	MLX5_L3T_TYPE_QWORD,
+	MLX5_L3T_TYPE_PTR,
+	MLX5_L3T_TYPE_MAX,
+};
+
+struct mlx5_indexed_pool;
+
+/* Generic data struct. */
+union mlx5_l3t_data {
+	uint16_t word;
+	uint32_t dword;
+	uint64_t qword;
+	void *ptr;
+};
+
+/* L3 level table data structure. */
+struct mlx5_l3t_level_tbl {
+	uint64_t ref_cnt; /* Table ref_cnt. */
+	void *tbl[]; /* Table array. */
+};
+
+/* L3 word entry table data structure. */
+struct mlx5_l3t_entry_word {
+	uint32_t idx; /* Table index. */
+	uint64_t ref_cnt; /* Table ref_cnt. */
+	uint16_t entry[]; /* Entry array. */
+};
+
+/* L3 double word entry table data structure. */
+struct mlx5_l3t_entry_dword {
+	uint32_t idx; /* Table index. */
+	uint64_t ref_cnt; /* Table ref_cnt. */
+	uint32_t entry[]; /* Entry array. */
+};
+
+/* L3 quad word entry table data structure. */
+struct mlx5_l3t_entry_qword {
+	uint32_t idx; /* Table index. */
+	uint64_t ref_cnt; /* Table ref_cnt. */
+	uint64_t entry[]; /* Entry array. */
+};
+
+/* L3 pointer entry table data structure. */
+struct mlx5_l3t_entry_ptr {
+	uint32_t idx; /* Table index. */
+	uint64_t ref_cnt; /* Table ref_cnt. */
+	void *entry[]; /* Entry array. */
+};
+
+/* L3 table data structure. */
+struct mlx5_l3t_tbl {
+	enum mlx5_l3t_type type; /* Table type. */
+	struct mlx5_indexed_pool *eip;
+	/* Table index pool handles. */
+	struct mlx5_l3t_level_tbl *tbl; /* Global table index. */
+};
+
+/*
  * The indexed memory entry index is made up of trunk index and offset of
  * the entry in the trunk. Since the entry index is 32 bits, in case user
  * prefers to have small trunks, user can change the macro below to a big
@@ -345,6 +445,71 @@ struct mlx5_indexed_pool *
  */
 void mlx5_ipool_dump(struct mlx5_indexed_pool *pool);
 
+/**
+ * This function allocates new empty Three-level table.
+ *
+ * @param type
+ *   The l3t can set as word, double word, quad word or pointer with index.
+ *
+ * @return
+ *   - Pointer to the allocated l3t.
+ *   - NULL on error. Not enough memory, or invalid arguments.
+ */
+struct mlx5_l3t_tbl *mlx5_l3t_create(enum mlx5_l3t_type type);
+
+/**
+ * This function destroys Three-level table.
+ *
+ * @param tbl
+ *   Pointer to the l3t.
+ */
+void mlx5_l3t_destroy(struct mlx5_l3t_tbl *tbl);
+
+/**
+ * This function gets the index entry from Three-level table.
+ *
+ * @param tbl
+ *   Pointer to the l3t.
+ * @param idx
+ *   Index to the entry.
+ * @param data
+ *   Pointer to the memory which saves the entry data.
+ *   When function call returns 0, data contains the entry data get from
+ *   l3t.
+ *   When function call returns -1, data is not modified.
+ *
+ * @return
+ *   0 if success, -1 on error.
+ */
+
+uint32_t mlx5_l3t_get_entry(struct mlx5_l3t_tbl *tbl, uint32_t idx,
+			    union mlx5_l3t_data *data);
+/**
+ * This function clears the index entry from Three-level table.
+ *
+ * @param tbl
+ *   Pointer to the l3t.
+ * @param idx
+ *   Index to the entry.
+ */
+void mlx5_l3t_clear_entry(struct mlx5_l3t_tbl *tbl, uint32_t idx);
+
+/**
+ * This function sets the index entry to the Three-level table.
+ *
+ * @param tbl
+ *   Pointer to the l3t.
+ * @param idx
+ *   Index to the entry.
+ * @param data
+ *   Pointer to the memory which contains the entry data to save to l3t.
+ *
+ * @return
+ *   0 if success, -1 on error.
+ */
+uint32_t mlx5_l3t_set_entry(struct mlx5_l3t_tbl *tbl, uint32_t idx,
+			    union mlx5_l3t_data *data);
+
 /*
  * Macros for linked list based on indexed memory.
  * Example data structure:
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [dpdk-dev] [PATCH 2/3] net/mlx5: manage shared counters in Three-Level table
  2020-06-18  7:24 [dpdk-dev] [PATCH 0/3] net/mlx5: optimize single counter allocate Suanming Mou
  2020-06-18  7:24 ` [dpdk-dev] [PATCH 1/3] net/mlx5: add Three-Level table utility Suanming Mou
@ 2020-06-18  7:24 ` Suanming Mou
  2020-06-18  7:24 ` [dpdk-dev] [PATCH 3/3] net/mlx5: optimize single counter pool search Suanming Mou
  2020-06-21 14:15 ` [dpdk-dev] [PATCH 0/3] net/mlx5: optimize single counter allocate Raslan Darawsheh
  3 siblings, 0 replies; 5+ messages in thread
From: Suanming Mou @ 2020-06-18  7:24 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: rasland, dev

Currently, to check whether any shared counter with the same ID
exists, the code has to loop over the counter pools to search for the
counter. Even keeping the counters in a list does not help much when
there are thousands of shared counters in the list.

Looking up the counter index saved in the relevant Three-Level table
entry is much more efficient.

This patch introduces the Three-Level table to save the counter index
relevant to an ID in the table. The next time the same ID comes,
checking the table entry for that ID returns the counter index
directly. No search is needed.
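As a rough illustration of the contract this gives the shared-counter
path, a toy flat array below stands in for the real on-demand
Three-Level table; all names here are hypothetical:

```c
#include <stdint.h>

/* Toy stand-in for the l3t dword table: shared counter ID -> counter
 * index. The real table allocates 4K-entry sub tables on demand; this
 * flat array only illustrates the O(1) set/get/clear contract. */
#define TOY_TBL_SIZE 4096

static uint32_t toy_tbl[TOY_TBL_SIZE];

/* Save the counter index for a shared ID. */
static void toy_set(uint32_t id, uint32_t cnt_idx)
{
	toy_tbl[id % TOY_TBL_SIZE] = cnt_idx;
}

/* Return the saved counter index, or 0 if the ID was never set,
 * matching the "!data.dword means not found" check in the patch. */
static uint32_t toy_get(uint32_t id)
{
	return toy_tbl[id % TOY_TBL_SIZE];
}

/* Drop the mapping when the shared counter is released. */
static void toy_clear(uint32_t id)
{
	toy_tbl[id % TOY_TBL_SIZE] = 0;
}
```

The key point is that both hit and miss cost a constant number of
table dereferences, independent of how many shared counters exist.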

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5.c         | 13 ++++++++++
 drivers/net/mlx5/mlx5.h         |  1 +
 drivers/net/mlx5/mlx5_flow_dv.c | 53 ++++++++++++++++++++++++-----------------
 3 files changed, 45 insertions(+), 22 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 5c86f6f..4c0c26e 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -716,6 +716,11 @@ struct mlx5_dev_ctx_shared *
 	mlx5_os_set_reg_mr_cb(&sh->share_cache.reg_mr_cb,
 			      &sh->share_cache.dereg_mr_cb);
 	mlx5_os_dev_shared_handler_install(sh);
+	sh->cnt_id_tbl = mlx5_l3t_create(MLX5_L3T_TYPE_DWORD);
+	if (!sh->cnt_id_tbl) {
+		err = rte_errno;
+		goto error;
+	}
 	mlx5_flow_aging_init(sh);
 	mlx5_flow_counters_mng_init(sh);
 	mlx5_flow_ipool_create(sh, config);
@@ -732,6 +737,10 @@ struct mlx5_dev_ctx_shared *
 error:
 	pthread_mutex_unlock(&mlx5_dev_ctx_list_mutex);
 	MLX5_ASSERT(sh);
+	if (sh->cnt_id_tbl) {
+		mlx5_l3t_destroy(sh->cnt_id_tbl);
+		sh->cnt_id_tbl = NULL;
+	}
 	if (sh->tis)
 		claim_zero(mlx5_devx_cmd_destroy(sh->tis));
 	if (sh->td)
@@ -793,6 +802,10 @@ struct mlx5_dev_ctx_shared *
 	mlx5_flow_counters_mng_close(sh);
 	mlx5_flow_ipool_destroy(sh);
 	mlx5_os_dev_shared_handler_uninstall(sh);
+	if (sh->cnt_id_tbl) {
+		mlx5_l3t_destroy(sh->cnt_id_tbl);
+		sh->cnt_id_tbl = NULL;
+	}
 	if (sh->pd)
 		claim_zero(mlx5_glue->dealloc_pd(sh->pd));
 	if (sh->tis)
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 5bd5acd..1ee9da7 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -565,6 +565,7 @@ struct mlx5_dev_ctx_shared {
 	struct mlx5_flow_counter_mng cmng; /* Counters management structure. */
 	struct mlx5_indexed_pool *ipool[MLX5_IPOOL_MAX];
 	/* Memory Pool for mlx5 flow resources. */
+	struct mlx5_l3t_tbl *cnt_id_tbl; /* Shared counter lookup table. */
 	/* Shared interrupt handler section. */
 	struct rte_intr_handle intr_handle; /* Interrupt handler for device. */
 	struct rte_intr_handle intr_handle_devx; /* DEVX interrupt handler. */
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 5bb252e..6e4e10c 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -4453,8 +4453,8 @@ struct field_modify_info modify_tcp[] = {
 /**
  * Search for existed shared counter.
  *
- * @param[in] cont
- *   Pointer to the relevant counter pool container.
+ * @param[in] dev
+ *   Pointer to the Ethernet device structure.
  * @param[in] id
  *   The shared counter ID to search.
  * @param[out] ppool
@@ -4464,26 +4464,22 @@ struct field_modify_info modify_tcp[] = {
  *   NULL if not existed, otherwise pointer to the shared extend counter.
  */
 static struct mlx5_flow_counter_ext *
-flow_dv_counter_shared_search(struct mlx5_pools_container *cont, uint32_t id,
+flow_dv_counter_shared_search(struct rte_eth_dev *dev, uint32_t id,
 			      struct mlx5_flow_counter_pool **ppool)
 {
-	struct mlx5_flow_counter_ext *cnt;
-	struct mlx5_flow_counter_pool *pool;
-	uint32_t i, j;
-	uint32_t n_valid = rte_atomic16_read(&cont->n_valid);
+	struct mlx5_priv *priv = dev->data->dev_private;
+	union mlx5_l3t_data data;
+	uint32_t cnt_idx;
 
-	for (i = 0; i < n_valid; i++) {
-		pool = cont->pools[i];
-		for (j = 0; j < MLX5_COUNTERS_PER_POOL; ++j) {
-			cnt = MLX5_GET_POOL_CNT_EXT(pool, j);
-			if (cnt->ref_cnt && cnt->shared && cnt->id == id) {
-				if (ppool)
-					*ppool = cont->pools[i];
-				return cnt;
-			}
-		}
-	}
-	return NULL;
+	if (mlx5_l3t_get_entry(priv->sh->cnt_id_tbl, id, &data) || !data.dword)
+		return NULL;
+	cnt_idx = data.dword;
+	/*
+	 * Shared counters don't have age info. The counter extend is after
+	 * the counter data structure.
+	 */
+	return (struct mlx5_flow_counter_ext *)
+	       ((flow_dv_counter_get_by_idx(dev, cnt_idx, ppool)) + 1);
 }
 
 /**
@@ -4529,7 +4525,7 @@ struct field_modify_info modify_tcp[] = {
 		return 0;
 	}
 	if (shared) {
-		cnt_ext = flow_dv_counter_shared_search(cont, id, &pool);
+		cnt_ext = flow_dv_counter_shared_search(dev, id, &pool);
 		if (cnt_ext) {
 			if (cnt_ext->ref_cnt + 1 == 0) {
 				rte_errno = E2BIG;
@@ -4597,6 +4593,13 @@ struct field_modify_info modify_tcp[] = {
 		cnt_ext->shared = shared;
 		cnt_ext->ref_cnt = 1;
 		cnt_ext->id = id;
+		if (shared) {
+			union mlx5_l3t_data data;
+
+			data.dword = cnt_idx;
+			if (mlx5_l3t_set_entry(priv->sh->cnt_id_tbl, id, &data))
+				return 0;
+		}
 	}
 	if (!priv->counter_fallback && !priv->sh->cmng.query_thread_on)
 		/* Start the asynchronous batch query by the host thread. */
@@ -4679,6 +4682,7 @@ struct field_modify_info modify_tcp[] = {
 static void
 flow_dv_counter_release(struct rte_eth_dev *dev, uint32_t counter)
 {
+	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_flow_counter_pool *pool = NULL;
 	struct mlx5_flow_counter *cnt;
 	struct mlx5_flow_counter_ext *cnt_ext = NULL;
@@ -4689,8 +4693,13 @@ struct field_modify_info modify_tcp[] = {
 	MLX5_ASSERT(pool);
 	if (counter < MLX5_CNT_BATCH_OFFSET) {
 		cnt_ext = MLX5_CNT_TO_CNT_EXT(pool, cnt);
-		if (cnt_ext && --cnt_ext->ref_cnt)
-			return;
+		if (cnt_ext) {
+			if (--cnt_ext->ref_cnt)
+				return;
+			if (cnt_ext->shared)
+				mlx5_l3t_clear_entry(priv->sh->cnt_id_tbl,
+						     cnt_ext->id);
+		}
 	}
 	if (IS_AGE_POOL(pool))
 		flow_dv_counter_remove_from_age(dev, counter, cnt);
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [dpdk-dev] [PATCH 3/3] net/mlx5: optimize single counter pool search
  2020-06-18  7:24 [dpdk-dev] [PATCH 0/3] net/mlx5: optimize single counter allocate Suanming Mou
  2020-06-18  7:24 ` [dpdk-dev] [PATCH 1/3] net/mlx5: add Three-Level table utility Suanming Mou
  2020-06-18  7:24 ` [dpdk-dev] [PATCH 2/3] net/mlx5: manage shared counters in Three-Level table Suanming Mou
@ 2020-06-18  7:24 ` Suanming Mou
  2020-06-21 14:15 ` [dpdk-dev] [PATCH 0/3] net/mlx5: optimize single counter allocate Raslan Darawsheh
  3 siblings, 0 replies; 5+ messages in thread
From: Suanming Mou @ 2020-06-18  7:24 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: rasland, dev

For a single counter, when a new counter is allocated, the pool it
belongs to needs to be found in order to perform the query together.

Once there are millions of counters allocated, the pool array in the
counter container becomes very large. In this case, searching the
pool array becomes extremely slow.

Save the minimum and maximum counter IDs for a quick check of the
current counter ID range. And since counter IDs increase
sequentially, starting the search from the last pool in the container
will mostly find the needed pool.
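The search strategy can be sketched as follows; the types and names
are simplified stand-ins for the driver's structures, not the actual
implementation:

```c
#include <stdbool.h>
#include <stddef.h>

#define COUNTERS_PER_POOL 512

struct toy_pool {
	int base;                /* first counter ID held by the pool */
};

struct toy_container {
	struct toy_pool *pools[64];
	int n_valid;             /* number of valid pools */
	int min_id, max_id;      /* ID range covered by all pools */
	int last_pool_idx;       /* hot-path hint, -1 when unset */
};

static bool in_pool(const struct toy_pool *p, int id)
{
	return id >= p->base && id < p->base + COUNTERS_PER_POOL;
}

static struct toy_pool *
find_pool(struct toy_container *c, int id)
{
	int i;

	/* 1. Check the last used pool first. */
	if (c->last_pool_idx >= 0 &&
	    in_pool(c->pools[c->last_pool_idx], id))
		return c->pools[c->last_pool_idx];
	/* 2. Cheap range check: an ID outside [min_id, max_id]
	 * cannot belong to any pool in the container. */
	if (id < c->min_id || id > c->max_id)
		return NULL;
	/* 3. Scan backwards: IDs mostly increase, so the newest pool
	 * at the end of the array is the likely match. */
	for (i = c->n_valid; i--; )
		if (in_pool(c->pools[i], id))
			return c->pools[i];
	return NULL;
}
```

In the common case the function returns after step 1 or step 2, and
even the fallback scan in step 3 usually terminates on its first
iteration for sequentially allocated IDs.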

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5.c         |  3 +++
 drivers/net/mlx5/mlx5.h         |  9 +++++++
 drivers/net/mlx5/mlx5_flow_dv.c | 60 +++++++++++++++++++++++++++++++----------
 3 files changed, 58 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 4c0c26e..670a59e 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -457,6 +457,9 @@ struct mlx5_flow_id_pool *
 	memset(&sh->cmng, 0, sizeof(sh->cmng));
 	TAILQ_INIT(&sh->cmng.flow_counters);
 	for (i = 0; i < MLX5_CCONT_TYPE_MAX; ++i) {
+		sh->cmng.ccont[i].min_id = MLX5_CNT_BATCH_OFFSET;
+		sh->cmng.ccont[i].max_id = -1;
+		sh->cmng.ccont[i].last_pool_idx = POOL_IDX_INVALID;
 		TAILQ_INIT(&sh->cmng.ccont[i].pool_list);
 		rte_spinlock_init(&sh->cmng.ccont[i].resize_sl);
 	}
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 1ee9da7..3ddae17 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -312,6 +312,12 @@ struct mlx5_drop {
 	MLX5_CNT_TO_CNT_EXT(pool, MLX5_POOL_GET_CNT((pool), (offset)))
 #define MLX5_CNT_TO_AGE(cnt) \
 	((struct mlx5_age_param *)((cnt) + 1))
+/*
+ * The maximum single counter index is 0x800000, as
+ * MLX5_CNT_BATCH_OFFSET defines. With a pool size of 512, the pool
+ * index can never reach UINT16_MAX, so it marks an invalid index.
+ */
+#define POOL_IDX_INVALID UINT16_MAX
 
 struct mlx5_flow_counter_pool;
 
@@ -420,6 +426,9 @@ struct mlx5_counter_stats_raw {
 struct mlx5_pools_container {
 	rte_atomic16_t n_valid; /* Number of valid pools. */
 	uint16_t n; /* Number of pools. */
+	uint16_t last_pool_idx; /* Last used pool index. */
+	int min_id; /* The minimum counter ID in the pools. */
+	int max_id; /* The maximum counter ID in the pools. */
 	rte_spinlock_t resize_sl; /* The resize lock. */
 	struct mlx5_counter_pools pool_list; /* Counter pool list. */
 	struct mlx5_flow_counter_pool **pools; /* Counter pool array. */
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 6e4e10c..9fa8568 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -4051,6 +4051,28 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Check the devx counter belongs to the pool.
+ *
+ * @param[in] pool
+ *   Pointer to the counter pool.
+ * @param[in] id
+ *   The counter devx ID.
+ *
+ * @return
+ *   True if counter belongs to the pool, false otherwise.
+ */
+static bool
+flow_dv_is_counter_in_pool(struct mlx5_flow_counter_pool *pool, int id)
+{
+	int base = (pool->min_dcs->id / MLX5_COUNTERS_PER_POOL) *
+		   MLX5_COUNTERS_PER_POOL;
+
+	if (id >= base && id < base + MLX5_COUNTERS_PER_POOL)
+		return true;
+	return false;
+}
+
+/**
  * Get a pool by devx counter ID.
  *
  * @param[in] cont
@@ -4065,24 +4087,25 @@ struct field_modify_info modify_tcp[] = {
 flow_dv_find_pool_by_id(struct mlx5_pools_container *cont, int id)
 {
 	uint32_t i;
-	uint32_t n_valid = rte_atomic16_read(&cont->n_valid);
 
-	for (i = 0; i < n_valid; i++) {
+	/* Check last used pool. */
+	if (cont->last_pool_idx != POOL_IDX_INVALID &&
+	    flow_dv_is_counter_in_pool(cont->pools[cont->last_pool_idx], id))
+		return cont->pools[cont->last_pool_idx];
+	/* ID out of range means no suitable pool in the container. */
+	if (id > cont->max_id || id < cont->min_id)
+		return NULL;
+	/*
+	 * Find the pool from the end of the container, since counter
+	 * IDs mostly increase sequentially, so the last pool should
+	 * be the needed one.
+	 */
+	i = rte_atomic16_read(&cont->n_valid);
+	while (i--) {
 		struct mlx5_flow_counter_pool *pool = cont->pools[i];
-		int base = (pool->min_dcs->id / MLX5_COUNTERS_PER_POOL) *
-			   MLX5_COUNTERS_PER_POOL;
 
-		if (id >= base && id < base + MLX5_COUNTERS_PER_POOL) {
-			/*
-			 * Move the pool to the head, as counter allocate
-			 * always gets the first pool in the container.
-			 */
-			if (pool != TAILQ_FIRST(&cont->pool_list)) {
-				TAILQ_REMOVE(&cont->pool_list, pool, next);
-				TAILQ_INSERT_HEAD(&cont->pool_list, pool, next);
-			}
+		if (flow_dv_is_counter_in_pool(pool, id))
 			return pool;
-		}
 	}
 	return NULL;
 }
@@ -4337,6 +4360,15 @@ struct field_modify_info modify_tcp[] = {
 	TAILQ_INSERT_HEAD(&cont->pool_list, pool, next);
 	pool->index = n_valid;
 	cont->pools[n_valid] = pool;
+	if (!batch) {
+		int base = RTE_ALIGN_FLOOR(dcs->id, MLX5_COUNTERS_PER_POOL);
+
+		if (base < cont->min_id)
+			cont->min_id = base;
+		if (base > cont->max_id)
+			cont->max_id = base + MLX5_COUNTERS_PER_POOL - 1;
+		cont->last_pool_idx = pool->index;
+	}
 	/* Pool initialization must be updated before host thread access. */
 	rte_cio_wmb();
 	rte_atomic16_add(&cont->n_valid, 1);
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [dpdk-dev] [PATCH 0/3] net/mlx5: optimize single counter allocate
  2020-06-18  7:24 [dpdk-dev] [PATCH 0/3] net/mlx5: optimize single counter allocate Suanming Mou
                   ` (2 preceding siblings ...)
  2020-06-18  7:24 ` [dpdk-dev] [PATCH 3/3] net/mlx5: optimize single counter pool search Suanming Mou
@ 2020-06-21 14:15 ` Raslan Darawsheh
  3 siblings, 0 replies; 5+ messages in thread
From: Raslan Darawsheh @ 2020-06-21 14:15 UTC (permalink / raw)
  To: Suanming Mou, Slava Ovsiienko, Matan Azrad; +Cc: dev

Hi,

> -----Original Message-----
> From: Suanming Mou <suanmingm@mellanox.com>
> Sent: Thursday, June 18, 2020 10:25 AM
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; Matan Azrad
> <matan@mellanox.com>
> Cc: Raslan Darawsheh <rasland@mellanox.com>; dev@dpdk.org
> Subject: [PATCH 0/3] net/mlx5: optimize single counter allocate
> 
> This patch set optimizes the DevX single counter allocate from two sides:
> 
> 1. Add the multiple level table to have a quick look up while
> allocate/search the single shared counter.
> 
> 2. Optimize the pool look up for the new allocated single counter.
> 
> Suanming Mou (3):
>   net/mlx5: add Three-Level table utility
>   net/mlx5: manage shared counters in Three-Level table
>   net/mlx5: optimize single counter pool search
> 
>  drivers/net/mlx5/mlx5.c         |  16 +++
>  drivers/net/mlx5/mlx5.h         |  10 ++
>  drivers/net/mlx5/mlx5_flow_dv.c | 115 +++++++++++------
>  drivers/net/mlx5/mlx5_utils.c   | 276
> ++++++++++++++++++++++++++++++++++++++++
>  drivers/net/mlx5/mlx5_utils.h   | 165 ++++++++++++++++++++++++
>  5 files changed, 545 insertions(+), 37 deletions(-)
> 
> --
> 1.8.3.1


Series applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-06-21 14:15 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-18  7:24 [dpdk-dev] [PATCH 0/3] net/mlx5: optimize single counter allocate Suanming Mou
2020-06-18  7:24 ` [dpdk-dev] [PATCH 1/3] net/mlx5: add Three-Level table utility Suanming Mou
2020-06-18  7:24 ` [dpdk-dev] [PATCH 2/3] net/mlx5: manage shared counters in Three-Level table Suanming Mou
2020-06-18  7:24 ` [dpdk-dev] [PATCH 3/3] net/mlx5: optimize single counter pool search Suanming Mou
2020-06-21 14:15 ` [dpdk-dev] [PATCH 0/3] net/mlx5: optimize single counter allocate Raslan Darawsheh
