DPDK patches and discussions
* [PATCH 00/11] net/mlx5: flow insertion performance improvements
@ 2024-02-28 17:00 Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 01/11] net/mlx5: allocate local DR rule action buffers Dariusz Sosnowski
                   ` (12 more replies)
  0 siblings, 13 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

The goal of this patchset is to improve the throughput of flow insertion
and deletion in mlx5 PMD when the HW Steering flow engine is used.

- Patch 1 - Use preallocated per-queue, per-actions-template buffers
  for storing translated flow actions, instead of allocating and
  filling them on demand on each flow operation.
- Patches 2-4 - Make resource index allocation optional. This allocation
  will be skipped when it is not required by the created template table.
- Patches 5-7 - Reduce memory footprint of the internal flow queue.
- Patch 8 - Remove indirection between flow job and flow itself,
  by using flow as an operation container.
- Patches 9-10 - Reduce memory footprint of the flow struct by moving
  rarely used flow fields outside of the main flow struct.
  These fields are accessed only when needed.
  Also remove unneeded `zmalloc` usage.
- Patch 11 - Remove unneeded device status check in flow create.

In general all of these changes result in the following improvements
(all numbers are averaged Kflows/sec):

|              | Insertion  |   +%   | Deletion |   +%  |
|--------------|:----------:|:------:|:--------:|:-----:|
| baseline     |   6338.7   |        |  9739.6  |       |
| improvements |   6978.8   | +10.1% |  10432.4 | +7.1% |

The basic benchmark was run on ConnectX-6 Dx (22.40.1000),
on a system with an Intel Xeon Platinum 8380 CPU.

Bing Zhao (2):
  net/mlx5: skip the unneeded resource index allocation
  net/mlx5: remove unneeded device status checking

Dariusz Sosnowski (7):
  net/mlx5: allocate local DR rule action buffers
  net/mlx5: remove action params from job
  net/mlx5: remove flow pattern from job
  net/mlx5: remove updated flow from job
  net/mlx5: use flow as operation container
  net/mlx5: move rarely used flow fields outside
  net/mlx5: reuse flow fields

Erez Shitrit (2):
  net/mlx5/hws: add check for matcher rule update support
  net/mlx5/hws: add check if matcher contains complex rules

 drivers/net/mlx5/hws/mlx5dr.h         |  16 +
 drivers/net/mlx5/hws/mlx5dr_action.c  |   6 +
 drivers/net/mlx5/hws/mlx5dr_action.h  |   2 +
 drivers/net/mlx5/hws/mlx5dr_matcher.c |  29 +
 drivers/net/mlx5/mlx5.h               |  29 +-
 drivers/net/mlx5/mlx5_flow.h          | 128 ++++-
 drivers/net/mlx5/mlx5_flow_hw.c       | 794 ++++++++++++++++----------
 7 files changed, 666 insertions(+), 338 deletions(-)

--
2.39.2



* [PATCH 01/11] net/mlx5: allocate local DR rule action buffers
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
@ 2024-02-28 17:00 ` Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 02/11] net/mlx5/hws: add check for matcher rule update support Dariusz Sosnowski
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

The goal of this patch is to remove unnecessary copying of precalculated
mlx5dr_rule_action structures used to create HWS flow rules.

Before this patch, during template table creation, an array of these
structures was calculated for each actions template used.
Each of these structures contained either a full or a partial action
definition (depending on mask configuration).
During flow creation, this array was copied to the stack and later
passed to mlx5dr_rule_create().

This patch removes this copy by implementing the following:

- Allocate an array of mlx5dr_rule_action structures for each actions
  template and queue.
- Populate them with precalculated data from relevant actions templates.
- During flow creation, construction of unmasked actions works on an
  array dedicated for the specific queue and actions template.
- Pass this buffer to mlx5dr_rule_create directly.
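
Conceptually, the per-(actions template, queue) buffer lookup added below
reduces to a flattened 2-D array index; a minimal restatement (the helper
name here is illustrative, the actual helper in the diff is
flow_hw_get_dr_action_buffer()):

    /* Sketch: rule_acts[] is a flattened 2-D array laid out as
     * [actions template][queue]; each entry holds MLX5_HW_MAX_ACTS
     * precalculated DR rule actions.
     */
    static inline struct mlx5dr_rule_action *
    dr_action_buffer(struct rte_flow_template_table *tbl,
                     uint32_t nb_queue, uint8_t at_idx, uint32_t queue)
    {
            return &tbl->rule_acts[at_idx * nb_queue + queue].acts[0];
    }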

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow.h    | 13 +++++++++
 drivers/net/mlx5/mlx5_flow_hw.c | 51 +++++++++++++++++++++++++++++----
 2 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 1530e6962f..11135645ef 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1554,6 +1554,10 @@ struct mlx5_matcher_info {
 	uint32_t refcnt;
 };
 
+struct mlx5_dr_rule_action_container {
+	struct mlx5dr_rule_action acts[MLX5_HW_MAX_ACTS];
+} __rte_cache_aligned;
+
 struct rte_flow_template_table {
 	LIST_ENTRY(rte_flow_template_table) next;
 	struct mlx5_flow_group *grp; /* The group rte_flow_template_table uses. */
@@ -1573,6 +1577,15 @@ struct rte_flow_template_table {
 	uint32_t refcnt; /* Table reference counter. */
 	struct mlx5_tbl_multi_pattern_ctx mpctx;
 	struct mlx5dr_matcher_attr matcher_attr;
+	/**
+	 * Variable length array of containers containing precalculated templates of DR actions
+	 * arrays. This array is allocated at template table creation time and contains
+	 * one container per each queue, per each actions template.
+	 * Essentially rule_acts is a 2-dimensional array indexed with (AT index, queue) pair.
+	 * Each container will provide a local "queue buffer" to work on for flow creation
+	 * operations when using a given actions template.
+	 */
+	struct mlx5_dr_rule_action_container rule_acts[];
 };
 
 static __rte_always_inline struct mlx5dr_matcher *
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index f778fd0698..442237f2b6 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -2499,6 +2499,34 @@ __flow_hw_actions_translate(struct rte_eth_dev *dev,
 				  "fail to create rte table");
 }
 
+static __rte_always_inline struct mlx5dr_rule_action *
+flow_hw_get_dr_action_buffer(struct mlx5_priv *priv,
+			     struct rte_flow_template_table *table,
+			     uint8_t action_template_index,
+			     uint32_t queue)
+{
+	uint32_t offset = action_template_index * priv->nb_queue + queue;
+
+	return &table->rule_acts[offset].acts[0];
+}
+
+static void
+flow_hw_populate_rule_acts_caches(struct rte_eth_dev *dev,
+				  struct rte_flow_template_table *table,
+				  uint8_t at_idx)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	uint32_t q;
+
+	for (q = 0; q < priv->nb_queue; ++q) {
+		struct mlx5dr_rule_action *rule_acts =
+				flow_hw_get_dr_action_buffer(priv, table, at_idx, q);
+
+		rte_memcpy(rule_acts, table->ats[at_idx].acts.rule_acts,
+			   sizeof(table->ats[at_idx].acts.rule_acts));
+	}
+}
+
 /**
  * Translate rte_flow actions to DR action.
  *
@@ -2526,6 +2554,7 @@ flow_hw_actions_translate(struct rte_eth_dev *dev,
 						tbl->ats[i].action_template,
 						&tbl->mpctx, error))
 			goto err;
+		flow_hw_populate_rule_acts_caches(dev, tbl, i);
 	}
 	ret = mlx5_tbl_multi_pattern_process(dev, tbl, &tbl->mpctx.segments[0],
 					     rte_log2_u32(tbl->cfg.attr.nb_flows),
@@ -2914,7 +2943,6 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	struct mlx5_aso_mtr *aso_mtr;
 	struct mlx5_multi_pattern_segment *mp_segment = NULL;
 
-	rte_memcpy(rule_acts, hw_acts->rule_acts, sizeof(*rule_acts) * at->dr_actions_num);
 	attr.group = table->grp->group_id;
 	ft_flag = mlx5_hw_act_flag[!!table->grp->group_id][table->type];
 	if (table->type == MLX5DR_TABLE_TYPE_FDB) {
@@ -3316,7 +3344,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		.user_data = user_data,
 		.burst = attr->postpone,
 	};
-	struct mlx5dr_rule_action rule_acts[MLX5_HW_MAX_ACTS];
+	struct mlx5dr_rule_action *rule_acts;
 	struct rte_flow_hw *flow = NULL;
 	struct mlx5_hw_q_job *job = NULL;
 	const struct rte_flow_item *rule_items;
@@ -3339,6 +3367,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	mlx5_ipool_malloc(table->resource, &res_idx);
 	if (!res_idx)
 		goto error;
+	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
 	 * when free the flow afterward.
@@ -3460,7 +3489,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 		.user_data = user_data,
 		.burst = attr->postpone,
 	};
-	struct mlx5dr_rule_action rule_acts[MLX5_HW_MAX_ACTS];
+	struct mlx5dr_rule_action *rule_acts;
 	struct rte_flow_hw *flow = NULL;
 	struct mlx5_hw_q_job *job = NULL;
 	uint32_t flow_idx = 0;
@@ -3482,6 +3511,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	mlx5_ipool_malloc(table->resource, &res_idx);
 	if (!res_idx)
 		goto error;
+	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
 	 * when free the flow afterwards.
@@ -3591,7 +3621,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		.user_data = user_data,
 		.burst = attr->postpone,
 	};
-	struct mlx5dr_rule_action rule_acts[MLX5_HW_MAX_ACTS];
+	struct mlx5dr_rule_action *rule_acts;
 	struct rte_flow_hw *of = (struct rte_flow_hw *)flow;
 	struct rte_flow_hw *nf;
 	struct rte_flow_template_table *table = of->table;
@@ -3609,6 +3639,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		goto error;
 	nf = job->upd_flow;
 	memset(nf, 0, sizeof(struct rte_flow_hw));
+	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
 	 * when free the flow afterwards.
@@ -4335,6 +4366,7 @@ mlx5_hw_build_template_table(struct rte_eth_dev *dev,
 			i++;
 			goto at_error;
 		}
+		flow_hw_populate_rule_acts_caches(dev, tbl, i);
 	}
 	tbl->nb_action_templates = nb_action_templates;
 	if (mlx5_is_multi_pattern_active(&tbl->mpctx)) {
@@ -4423,6 +4455,7 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	uint32_t i = 0, max_tpl = MLX5_HW_TBL_MAX_ITEM_TEMPLATE;
 	uint32_t nb_flows = rte_align32pow2(attr->nb_flows);
 	bool port_started = !!dev->data->dev_started;
+	size_t tbl_mem_size;
 	int err;
 
 	/* HWS layer accepts only 1 item template with root table. */
@@ -4442,8 +4475,16 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 		rte_errno = EINVAL;
 		goto error;
 	}
+	/*
+	 * Amount of memory required for rte_flow_template_table struct:
+	 * - Size of the struct itself.
+	 * - VLA of DR rule action containers at the end =
+	 *     number of actions templates * number of queues * size of DR rule actions container.
+	 */
+	tbl_mem_size = sizeof(*tbl);
+	tbl_mem_size += nb_action_templates * priv->nb_queue * sizeof(tbl->rule_acts[0]);
 	/* Allocate the table memory. */
-	tbl = mlx5_malloc(MLX5_MEM_ZERO, sizeof(*tbl), 0, rte_socket_id());
+	tbl = mlx5_malloc(MLX5_MEM_ZERO, tbl_mem_size, RTE_CACHE_LINE_SIZE, rte_socket_id());
 	if (!tbl)
 		goto error;
 	tbl->cfg = *table_cfg;
-- 
2.39.2



* [PATCH 02/11] net/mlx5/hws: add check for matcher rule update support
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 01/11] net/mlx5: allocate local DR rule action buffers Dariusz Sosnowski
@ 2024-02-28 17:00 ` Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 03/11] net/mlx5/hws: add check if matcher contains complex rules Dariusz Sosnowski
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao, Erez Shitrit

From: Erez Shitrit <erezsh@nvidia.com>

The user wants to know, before trying to update a rule, whether the
matcher that holds the original rule supports updating.
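
A hedged usage sketch (the caller below is hypothetical; only
mlx5dr_matcher_is_updatable() comes from this patch):

    /* Bail out early instead of enqueueing an update operation that
     * the matcher holding the rule cannot service.
     */
    if (!mlx5dr_matcher_is_updatable(matcher)) {
            rte_errno = ENOTSUP;
            return -rte_errno;
    }
    /* ... otherwise proceed with the rule update path ... */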

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr.h         |  8 ++++++++
 drivers/net/mlx5/hws/mlx5dr_matcher.c | 12 ++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/drivers/net/mlx5/hws/mlx5dr.h b/drivers/net/mlx5/hws/mlx5dr.h
index d612f300c6..781b11e681 100644
--- a/drivers/net/mlx5/hws/mlx5dr.h
+++ b/drivers/net/mlx5/hws/mlx5dr.h
@@ -491,6 +491,14 @@ int mlx5dr_matcher_resize_rule_move(struct mlx5dr_matcher *src_matcher,
 				    struct mlx5dr_rule *rule,
 				    struct mlx5dr_rule_attr *attr);
 
+/* Check matcher ability to update existing rules
+ *
+ * @param[in] matcher
+ *	that the rule belongs to.
+ * @return true when the matcher is updatable false otherwise.
+ */
+bool mlx5dr_matcher_is_updatable(struct mlx5dr_matcher *matcher);
+
 /* Get the size of the rule handle (mlx5dr_rule) to be used on rule creation.
  *
  * @return size in bytes of rule handle struct.
diff --git a/drivers/net/mlx5/hws/mlx5dr_matcher.c b/drivers/net/mlx5/hws/mlx5dr_matcher.c
index 8a74a1ed7d..4e4da8e8f6 100644
--- a/drivers/net/mlx5/hws/mlx5dr_matcher.c
+++ b/drivers/net/mlx5/hws/mlx5dr_matcher.c
@@ -1530,6 +1530,18 @@ int mlx5dr_match_template_destroy(struct mlx5dr_match_template *mt)
 	return 0;
 }
 
+bool mlx5dr_matcher_is_updatable(struct mlx5dr_matcher *matcher)
+{
+	if (mlx5dr_table_is_root(matcher->tbl) ||
+	    mlx5dr_matcher_req_fw_wqe(matcher) ||
+	    mlx5dr_matcher_is_resizable(matcher) ||
+	    (!matcher->attr.optimize_using_rule_idx &&
+	    !mlx5dr_matcher_is_insert_by_idx(matcher)))
+		return false;
+
+	return true;
+}
+
 static int mlx5dr_matcher_resize_precheck(struct mlx5dr_matcher *src_matcher,
 					  struct mlx5dr_matcher *dst_matcher)
 {
-- 
2.39.2



* [PATCH 03/11] net/mlx5/hws: add check if matcher contains complex rules
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 01/11] net/mlx5: allocate local DR rule action buffers Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 02/11] net/mlx5/hws: add check for matcher rule update support Dariusz Sosnowski
@ 2024-02-28 17:00 ` Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 04/11] net/mlx5: skip the unneeded resource index allocation Dariusz Sosnowski
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao, Erez Shitrit

From: Erez Shitrit <erezsh@nvidia.com>

The function returns true if the matcher can contain complicated rules,
i.e. rules whose creation requires more than one write to the HW.
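
A hedged usage sketch (the caller below is hypothetical; only
mlx5dr_matcher_is_dependent() comes from this patch):

    /* Rules on a "dependent" matcher may need more than one write to
     * the HW, so per-rule resource tracking cannot be skipped.
     */
    if (mlx5dr_matcher_is_dependent(matcher)) {
            /* keep a separate resource index for every rule */
    } else {
            /* single atomic write per rule; the flow index can be reused */
    }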

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr.h         |  8 ++++++++
 drivers/net/mlx5/hws/mlx5dr_action.c  |  6 ++++++
 drivers/net/mlx5/hws/mlx5dr_action.h  |  2 ++
 drivers/net/mlx5/hws/mlx5dr_matcher.c | 17 +++++++++++++++++
 4 files changed, 33 insertions(+)

diff --git a/drivers/net/mlx5/hws/mlx5dr.h b/drivers/net/mlx5/hws/mlx5dr.h
index 781b11e681..d903decc1b 100644
--- a/drivers/net/mlx5/hws/mlx5dr.h
+++ b/drivers/net/mlx5/hws/mlx5dr.h
@@ -499,6 +499,14 @@ int mlx5dr_matcher_resize_rule_move(struct mlx5dr_matcher *src_matcher,
  */
 bool mlx5dr_matcher_is_updatable(struct mlx5dr_matcher *matcher);
 
+/* Check matcher if might contain rules that need complex structure
+ *
+ * @param[in] matcher
+ *	that the rule belongs to.
+ * @return true when the matcher is contains such rules, false otherwise.
+ */
+bool mlx5dr_matcher_is_dependent(struct mlx5dr_matcher *matcher);
+
 /* Get the size of the rule handle (mlx5dr_rule) to be used on rule creation.
  *
  * @return size in bytes of rule handle struct.
diff --git a/drivers/net/mlx5/hws/mlx5dr_action.c b/drivers/net/mlx5/hws/mlx5dr_action.c
index 631763dee0..494642d25c 100644
--- a/drivers/net/mlx5/hws/mlx5dr_action.c
+++ b/drivers/net/mlx5/hws/mlx5dr_action.c
@@ -3265,6 +3265,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->flags |= ASF_SINGLE1 | ASF_REMOVE;
 			setter->set_single = &mlx5dr_action_setter_ipv6_route_ext_pop;
 			setter->idx_single = i;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_PUSH_IPV6_ROUTE_EXT:
@@ -3291,6 +3292,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->set_double = &mlx5dr_action_setter_ipv6_route_ext_mhdr;
 			setter->idx_double = i;
 			setter->extra_data = 2;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_MODIFY_HDR:
@@ -3299,6 +3301,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->flags |= ASF_DOUBLE | ASF_MODIFY;
 			setter->set_double = &mlx5dr_action_setter_modify_header;
 			setter->idx_double = i;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_ASO_METER:
@@ -3326,6 +3329,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->flags |= ASF_DOUBLE | ASF_INSERT;
 			setter->set_double = &mlx5dr_action_setter_insert_ptr;
 			setter->idx_double = i;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_REFORMAT_L2_TO_TNL_L3:
@@ -3336,6 +3340,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->idx_double = i;
 			setter->set_single = &mlx5dr_action_setter_common_decap;
 			setter->idx_single = i;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_REFORMAT_TNL_L3_TO_L2:
@@ -3344,6 +3349,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->flags |= ASF_DOUBLE | ASF_MODIFY | ASF_INSERT;
 			setter->set_double = &mlx5dr_action_setter_tnl_l3_to_l2;
 			setter->idx_double = i;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_TAG:
diff --git a/drivers/net/mlx5/hws/mlx5dr_action.h b/drivers/net/mlx5/hws/mlx5dr_action.h
index 0c8e4bbb5a..68139ca092 100644
--- a/drivers/net/mlx5/hws/mlx5dr_action.h
+++ b/drivers/net/mlx5/hws/mlx5dr_action.h
@@ -119,6 +119,8 @@ struct mlx5dr_action_template {
 	uint8_t num_of_action_stes;
 	uint8_t num_actions;
 	uint8_t only_term;
+	/* indicates rule might require dependent wqe */
+	bool need_dep_write;
 	uint32_t flags;
 };
 
diff --git a/drivers/net/mlx5/hws/mlx5dr_matcher.c b/drivers/net/mlx5/hws/mlx5dr_matcher.c
index 4e4da8e8f6..1c64abfa57 100644
--- a/drivers/net/mlx5/hws/mlx5dr_matcher.c
+++ b/drivers/net/mlx5/hws/mlx5dr_matcher.c
@@ -1542,6 +1542,23 @@ bool mlx5dr_matcher_is_updatable(struct mlx5dr_matcher *matcher)
 	return true;
 }
 
+bool mlx5dr_matcher_is_dependent(struct mlx5dr_matcher *matcher)
+{
+	int i;
+
+	if (matcher->action_ste.max_stes || mlx5dr_matcher_req_fw_wqe(matcher))
+		return true;
+
+	for (i = 0; i < matcher->num_of_at; i++) {
+		struct mlx5dr_action_template *at = &matcher->at[i];
+
+		if (at->need_dep_write)
+			return true;
+	}
+
+	return false;
+}
+
 static int mlx5dr_matcher_resize_precheck(struct mlx5dr_matcher *src_matcher,
 					  struct mlx5dr_matcher *dst_matcher)
 {
-- 
2.39.2



* [PATCH 04/11] net/mlx5: skip the unneeded resource index allocation
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
                   ` (2 preceding siblings ...)
  2024-02-28 17:00 ` [PATCH 03/11] net/mlx5/hws: add check if matcher contains complex rules Dariusz Sosnowski
@ 2024-02-28 17:00 ` Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 05/11] net/mlx5: remove action params from job Dariusz Sosnowski
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

From: Bing Zhao <bingz@nvidia.com>

The resource index was introduced to decouple the flow rule and its
resources used by hardware steering. This is needed only when a rule
update is supported.

In some cases, the update is not supported on a table (matcher), e.g.:
  * Table is resizable
  * FW gets involved
  * Root table
  * Not index based or optimized (not applicable)

It is also possible that only one STE entry is required per rule; in
that case an update is atomic by nature, so the extra resource index is
not needed either.

If the matcher doesn't support rule updates, or at most one STE entry
is needed per rule, there is no need to manage resource index
allocation and freeing from the pool.
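
A minimal sketch of the resulting table-creation decision (names follow
the diff below):

    /* The extra resource index pool is kept only when the matcher both
     * supports rule updates and may emit rules needing more than one
     * write to the HW; otherwise table->resource stays NULL and the
     * flow index is reused as the resource index.
     */
    rpool_needed = mlx5dr_matcher_is_updatable(tbl->matcher_info[0].matcher) &&
                   mlx5dr_matcher_is_dependent(tbl->matcher_info[0].matcher);
    if (rpool_needed)
            tbl->resource = mlx5_ipool_create(&cfg);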

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow_hw.c | 129 +++++++++++++++++++-------------
 1 file changed, 76 insertions(+), 53 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 442237f2b6..fcf493c771 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -3364,9 +3364,6 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
 	if (!flow)
 		goto error;
-	mlx5_ipool_malloc(table->resource, &res_idx);
-	if (!res_idx)
-		goto error;
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -3375,7 +3372,14 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	flow->table = table;
 	flow->mt_idx = pattern_template_index;
 	flow->idx = flow_idx;
-	flow->res_idx = res_idx;
+	if (table->resource) {
+		mlx5_ipool_malloc(table->resource, &res_idx);
+		if (!res_idx)
+			goto error;
+		flow->res_idx = res_idx;
+	} else {
+		flow->res_idx = flow_idx;
+	}
 	/*
 	 * Set the job type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
@@ -3385,11 +3389,10 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	job->user_data = user_data;
 	rule_attr.user_data = job;
 	/*
-	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices for rule
-	 * insertion hints.
+	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices
+	 * for rule insertion hints.
 	 */
-	MLX5_ASSERT(res_idx > 0);
-	flow->rule_idx = res_idx - 1;
+	flow->rule_idx = flow->res_idx - 1;
 	rule_attr.rule_idx = flow->rule_idx;
 	/*
 	 * Construct the flow actions based on the input actions.
@@ -3432,12 +3435,12 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	if (likely(!ret))
 		return (struct rte_flow *)flow;
 error:
-	if (job)
-		flow_hw_job_put(priv, job, queue);
+	if (table->resource && res_idx)
+		mlx5_ipool_free(table->resource, res_idx);
 	if (flow_idx)
 		mlx5_ipool_free(table->flow, flow_idx);
-	if (res_idx)
-		mlx5_ipool_free(table->resource, res_idx);
+	if (job)
+		flow_hw_job_put(priv, job, queue);
 	rte_flow_error_set(error, rte_errno,
 			   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 			   "fail to create rte flow");
@@ -3508,9 +3511,6 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
 	if (!flow)
 		goto error;
-	mlx5_ipool_malloc(table->resource, &res_idx);
-	if (!res_idx)
-		goto error;
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -3519,7 +3519,14 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	flow->table = table;
 	flow->mt_idx = 0;
 	flow->idx = flow_idx;
-	flow->res_idx = res_idx;
+	if (table->resource) {
+		mlx5_ipool_malloc(table->resource, &res_idx);
+		if (!res_idx)
+			goto error;
+		flow->res_idx = res_idx;
+	} else {
+		flow->res_idx = flow_idx;
+	}
 	/*
 	 * Set the job type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
@@ -3528,9 +3535,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	job->flow = flow;
 	job->user_data = user_data;
 	rule_attr.user_data = job;
-	/*
-	 * Set the rule index.
-	 */
+	/* Set the rule index. */
 	flow->rule_idx = rule_index;
 	rule_attr.rule_idx = flow->rule_idx;
 	/*
@@ -3566,12 +3571,12 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	if (likely(!ret))
 		return (struct rte_flow *)flow;
 error:
-	if (job)
-		flow_hw_job_put(priv, job, queue);
-	if (res_idx)
+	if (table->resource && res_idx)
 		mlx5_ipool_free(table->resource, res_idx);
 	if (flow_idx)
 		mlx5_ipool_free(table->flow, flow_idx);
+	if (job)
+		flow_hw_job_put(priv, job, queue);
 	rte_flow_error_set(error, rte_errno,
 			   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 			   "fail to create rte flow");
@@ -3634,9 +3639,6 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		rte_errno = ENOMEM;
 		goto error;
 	}
-	mlx5_ipool_malloc(table->resource, &res_idx);
-	if (!res_idx)
-		goto error;
 	nf = job->upd_flow;
 	memset(nf, 0, sizeof(struct rte_flow_hw));
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
@@ -3647,7 +3649,14 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	nf->table = table;
 	nf->mt_idx = of->mt_idx;
 	nf->idx = of->idx;
-	nf->res_idx = res_idx;
+	if (table->resource) {
+		mlx5_ipool_malloc(table->resource, &res_idx);
+		if (!res_idx)
+			goto error;
+		nf->res_idx = res_idx;
+	} else {
+		nf->res_idx = of->res_idx;
+	}
 	/*
 	 * Set the job type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
@@ -3657,11 +3666,11 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	job->user_data = user_data;
 	rule_attr.user_data = job;
 	/*
-	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices for rule
-	 * insertion hints.
+	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices
+	 * for rule insertion hints.
+	 * If there is only one STE, the update will be atomic by nature.
 	 */
-	MLX5_ASSERT(res_idx > 0);
-	nf->rule_idx = res_idx - 1;
+	nf->rule_idx = nf->res_idx - 1;
 	rule_attr.rule_idx = nf->rule_idx;
 	/*
 	 * Construct the flow actions based on the input actions.
@@ -3687,14 +3696,14 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	if (likely(!ret))
 		return 0;
 error:
+	if (table->resource && res_idx)
+		mlx5_ipool_free(table->resource, res_idx);
 	/* Flow created fail, return the descriptor and flow memory. */
 	if (job)
 		flow_hw_job_put(priv, job, queue);
-	if (res_idx)
-		mlx5_ipool_free(table->resource, res_idx);
 	return rte_flow_error_set(error, rte_errno,
-			RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
-			"fail to update rte flow");
+				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				  "fail to update rte flow");
 }
 
 /**
@@ -3949,13 +3958,15 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 	}
 	if (job->type != MLX5_HW_Q_JOB_TYPE_UPDATE) {
 		if (table) {
-			mlx5_ipool_free(table->resource, res_idx);
+			if (table->resource)
+				mlx5_ipool_free(table->resource, res_idx);
 			mlx5_ipool_free(table->flow, flow->idx);
 		}
 	} else {
 		rte_memcpy(flow, job->upd_flow,
 			   offsetof(struct rte_flow_hw, rule));
-		mlx5_ipool_free(table->resource, res_idx);
+		if (table->resource)
+			mlx5_ipool_free(table->resource, res_idx);
 	}
 }
 
@@ -4455,6 +4466,7 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	uint32_t i = 0, max_tpl = MLX5_HW_TBL_MAX_ITEM_TEMPLATE;
 	uint32_t nb_flows = rte_align32pow2(attr->nb_flows);
 	bool port_started = !!dev->data->dev_started;
+	bool rpool_needed;
 	size_t tbl_mem_size;
 	int err;
 
@@ -4492,13 +4504,6 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	tbl->flow = mlx5_ipool_create(&cfg);
 	if (!tbl->flow)
 		goto error;
-	/* Allocate rule indexed pool. */
-	cfg.size = 0;
-	cfg.type = "mlx5_hw_table_rule";
-	cfg.max_idx += priv->hw_q[0].size;
-	tbl->resource = mlx5_ipool_create(&cfg);
-	if (!tbl->resource)
-		goto error;
 	/* Register the flow group. */
 	ge = mlx5_hlist_register(priv->sh->groups, attr->flow_attr.group, &ctx);
 	if (!ge)
@@ -4578,12 +4583,30 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	tbl->type = attr->flow_attr.transfer ? MLX5DR_TABLE_TYPE_FDB :
 		    (attr->flow_attr.egress ? MLX5DR_TABLE_TYPE_NIC_TX :
 		    MLX5DR_TABLE_TYPE_NIC_RX);
+	/*
+	 * Only the matcher supports update and needs more than 1 WQE, an additional
+	 * index is needed. Or else the flow index can be reused.
+	 */
+	rpool_needed = mlx5dr_matcher_is_updatable(tbl->matcher_info[0].matcher) &&
+		       mlx5dr_matcher_is_dependent(tbl->matcher_info[0].matcher);
+	if (rpool_needed) {
+		/* Allocate rule indexed pool. */
+		cfg.size = 0;
+		cfg.type = "mlx5_hw_table_rule";
+		cfg.max_idx += priv->hw_q[0].size;
+		tbl->resource = mlx5_ipool_create(&cfg);
+		if (!tbl->resource)
+			goto res_error;
+	}
 	if (port_started)
 		LIST_INSERT_HEAD(&priv->flow_hw_tbl, tbl, next);
 	else
 		LIST_INSERT_HEAD(&priv->flow_hw_tbl_ongo, tbl, next);
 	rte_rwlock_init(&tbl->matcher_replace_rwlk);
 	return tbl;
+res_error:
+	if (tbl->matcher_info[0].matcher)
+		(void)mlx5dr_matcher_destroy(tbl->matcher_info[0].matcher);
 at_error:
 	for (i = 0; i < nb_action_templates; i++) {
 		__flow_hw_action_template_destroy(dev, &tbl->ats[i].acts);
@@ -4601,8 +4624,6 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 		if (tbl->grp)
 			mlx5_hlist_unregister(priv->sh->groups,
 					      &tbl->grp->entry);
-		if (tbl->resource)
-			mlx5_ipool_destroy(tbl->resource);
 		if (tbl->flow)
 			mlx5_ipool_destroy(tbl->flow);
 		mlx5_free(tbl);
@@ -4811,12 +4832,13 @@ flow_hw_table_destroy(struct rte_eth_dev *dev,
 	uint32_t ridx = 1;
 
 	/* Build ipool allocated object bitmap. */
-	mlx5_ipool_flush_cache(table->resource);
+	if (table->resource)
+		mlx5_ipool_flush_cache(table->resource);
 	mlx5_ipool_flush_cache(table->flow);
 	/* Check if ipool has allocated objects. */
 	if (table->refcnt ||
 	    mlx5_ipool_get_next(table->flow, &fidx) ||
-	    mlx5_ipool_get_next(table->resource, &ridx)) {
+	    (table->resource && mlx5_ipool_get_next(table->resource, &ridx))) {
 		DRV_LOG(WARNING, "Table %p is still in use.", (void *)table);
 		return rte_flow_error_set(error, EBUSY,
 				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
@@ -4838,7 +4860,8 @@ flow_hw_table_destroy(struct rte_eth_dev *dev,
 	if (table->matcher_info[1].matcher)
 		mlx5dr_matcher_destroy(table->matcher_info[1].matcher);
 	mlx5_hlist_unregister(priv->sh->groups, &table->grp->entry);
-	mlx5_ipool_destroy(table->resource);
+	if (table->resource)
+		mlx5_ipool_destroy(table->resource);
 	mlx5_ipool_destroy(table->flow);
 	mlx5_free(table);
 	return 0;
@@ -12340,11 +12363,11 @@ flow_hw_table_resize(struct rte_eth_dev *dev,
 		return rte_flow_error_set(error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 					  table, "cannot resize flows pool");
-	ret = mlx5_ipool_resize(table->resource, nb_flows);
-	if (ret)
-		return rte_flow_error_set(error, EINVAL,
-					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
-					  table, "cannot resize resources pool");
+	/*
+	 * A resizable matcher doesn't support rule update. In this case, the ipool
+	 * for the resource is not created and there is no need to resize it.
+	 */
+	MLX5_ASSERT(!table->resource);
 	if (mlx5_is_multi_pattern_active(&table->mpctx)) {
 		ret = flow_hw_table_resize_multi_pattern_actions(dev, table, nb_flows, error);
 		if (ret < 0)
-- 
2.39.2



* [PATCH 05/11] net/mlx5: remove action params from job
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
                   ` (3 preceding siblings ...)
  2024-02-28 17:00 ` [PATCH 04/11] net/mlx5: skip the unneeded resource index allocation Dariusz Sosnowski
@ 2024-02-28 17:00 ` Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 06/11] net/mlx5: remove flow pattern " Dariusz Sosnowski
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

mlx5_hw_q_job struct held references to buffers which contained:

- modify header commands array,
- encap/decap data buffer,
- IPv6 routing data buffer.

These buffers were passed as parameters to the HWS layer during rule
creation. They were needed only during the call to the HWS layer, when
a flow operation is enqueued (i.e. mlx5dr_rule_create()).
After the operation is enqueued, the data stored there can be safely
discarded, so there is no need to keep it for the whole lifecycle of a job.

This patch removes references to these buffers from mlx5_hw_q_job
and removes the relevant allocations to reduce job memory footprint.
Buffers stored per job are replaced with stack-allocated ones,
contained in the mlx5_flow_hw_action_params struct.
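
A minimal sketch of the resulting call pattern (struct and function
names follow the diff; the snippet itself is illustrative only):

    /* Scratch buffers for action construction now live on the stack of
     * the enqueue function instead of in the preallocated job.
     */
    struct mlx5_flow_hw_action_params ap; /* mhdr cmds, encap, IPv6 push data */

    if (flow_hw_actions_construct(dev, flow, &ap,
                                  &table->ats[action_template_index],
                                  pattern_template_index, actions,
                                  rule_acts, queue, error))
            goto error;
    /* Once the rule is enqueued to mlx5dr, "ap" may go out of scope. */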

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 drivers/net/mlx5/mlx5.h         |   3 -
 drivers/net/mlx5/mlx5_flow.h    |  10 +++
 drivers/net/mlx5/mlx5_flow_hw.c | 120 ++++++++++++++------------------
 3 files changed, 63 insertions(+), 70 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index bb1853e797..bd0846d6bf 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -401,9 +401,6 @@ struct mlx5_hw_q_job {
 		const void *action; /* Indirect action attached to the job. */
 	};
 	void *user_data; /* Job user data. */
-	uint8_t *encap_data; /* Encap data. */
-	uint8_t *push_data; /* IPv6 routing push data. */
-	struct mlx5_modification_cmd *mhdr_cmd;
 	struct rte_flow_item *items;
 	union {
 		struct {
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 11135645ef..df1c913017 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1294,6 +1294,16 @@ typedef int
 
 #define MLX5_MHDR_MAX_CMD ((MLX5_MAX_MODIFY_NUM) * 2 + 1)
 
+/** Container for flow action data constructed during flow rule creation. */
+struct mlx5_flow_hw_action_params {
+	/** Array of constructed modify header commands. */
+	struct mlx5_modification_cmd mhdr_cmd[MLX5_MHDR_MAX_CMD];
+	/** Constructed encap/decap data buffer. */
+	uint8_t encap_data[MLX5_ENCAP_MAX_LEN];
+	/** Constructed IPv6 routing data buffer. */
+	uint8_t ipv6_push_data[MLX5_PUSH_MAX_LEN];
+};
+
 /* rte flow action translate to DR action struct. */
 struct mlx5_action_construct_data {
 	LIST_ENTRY(mlx5_action_construct_data) next;
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index fcf493c771..7160477c83 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -158,7 +158,7 @@ static int flow_hw_translate_group(struct rte_eth_dev *dev,
 				   struct rte_flow_error *error);
 static __rte_always_inline int
 flow_hw_set_vlan_vid_construct(struct rte_eth_dev *dev,
-			       struct mlx5_hw_q_job *job,
+			       struct mlx5_modification_cmd *mhdr_cmd,
 			       struct mlx5_action_construct_data *act_data,
 			       const struct mlx5_hw_actions *hw_acts,
 			       const struct rte_flow_action *action);
@@ -2799,7 +2799,7 @@ flow_hw_mhdr_cmd_is_nop(const struct mlx5_modification_cmd *cmd)
  *    0 on success, negative value otherwise and rte_errno is set.
  */
 static __rte_always_inline int
-flow_hw_modify_field_construct(struct mlx5_hw_q_job *job,
+flow_hw_modify_field_construct(struct mlx5_modification_cmd *mhdr_cmd,
 			       struct mlx5_action_construct_data *act_data,
 			       const struct mlx5_hw_actions *hw_acts,
 			       const struct rte_flow_action *action)
@@ -2858,7 +2858,7 @@ flow_hw_modify_field_construct(struct mlx5_hw_q_job *job,
 
 		if (i >= act_data->modify_header.mhdr_cmds_end)
 			return -1;
-		if (flow_hw_mhdr_cmd_is_nop(&job->mhdr_cmd[i])) {
+		if (flow_hw_mhdr_cmd_is_nop(&mhdr_cmd[i])) {
 			++i;
 			continue;
 		}
@@ -2878,7 +2878,7 @@ flow_hw_modify_field_construct(struct mlx5_hw_q_job *job,
 		    mhdr_action->dst.field == RTE_FLOW_FIELD_IPV6_DSCP)
 			data <<= MLX5_IPV6_HDR_DSCP_SHIFT;
 		data = (data & mask) >> off_b;
-		job->mhdr_cmd[i++].data1 = rte_cpu_to_be_32(data);
+		mhdr_cmd[i++].data1 = rte_cpu_to_be_32(data);
 		++field;
 	} while (field->size);
 	return 0;
@@ -2892,8 +2892,10 @@ flow_hw_modify_field_construct(struct mlx5_hw_q_job *job,
  *
  * @param[in] dev
  *   Pointer to the rte_eth_dev structure.
- * @param[in] job
- *   Pointer to job descriptor.
+ * @param[in] flow
+ *   Pointer to flow structure.
+ * @param[in] ap
+ *   Pointer to container for temporarily constructed actions' parameters.
  * @param[in] hw_acts
  *   Pointer to translated actions from template.
  * @param[in] it_idx
@@ -2910,7 +2912,8 @@ flow_hw_modify_field_construct(struct mlx5_hw_q_job *job,
  */
 static __rte_always_inline int
 flow_hw_actions_construct(struct rte_eth_dev *dev,
-			  struct mlx5_hw_q_job *job,
+			  struct rte_flow_hw *flow,
+			  struct mlx5_flow_hw_action_params *ap,
 			  const struct mlx5_hw_action_template *hw_at,
 			  const uint8_t it_idx,
 			  const struct rte_flow_action actions[],
@@ -2920,7 +2923,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_aso_mtr_pool *pool = priv->hws_mpool;
-	struct rte_flow_template_table *table = job->flow->table;
+	struct rte_flow_template_table *table = flow->table;
 	struct mlx5_action_construct_data *act_data;
 	const struct rte_flow_actions_template *at = hw_at->action_template;
 	const struct mlx5_hw_actions *hw_acts = &hw_at->acts;
@@ -2931,8 +2934,6 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	const struct rte_flow_action_ethdev *port_action = NULL;
 	const struct rte_flow_action_meter *meter = NULL;
 	const struct rte_flow_action_age *age = NULL;
-	uint8_t *buf = job->encap_data;
-	uint8_t *push_buf = job->push_data;
 	struct rte_flow_attr attr = {
 			.ingress = 1,
 	};
@@ -2957,17 +2958,17 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	if (hw_acts->mhdr && hw_acts->mhdr->mhdr_cmds_num > 0 && !hw_acts->mhdr->shared) {
 		uint16_t pos = hw_acts->mhdr->pos;
 
-		mp_segment = mlx5_multi_pattern_segment_find(table, job->flow->res_idx);
+		mp_segment = mlx5_multi_pattern_segment_find(table, flow->res_idx);
 		if (!mp_segment || !mp_segment->mhdr_action)
 			return -1;
 		rule_acts[pos].action = mp_segment->mhdr_action;
 		/* offset is relative to DR action */
 		rule_acts[pos].modify_header.offset =
-					job->flow->res_idx - mp_segment->head_index;
+					flow->res_idx - mp_segment->head_index;
 		rule_acts[pos].modify_header.data =
-					(uint8_t *)job->mhdr_cmd;
-		rte_memcpy(job->mhdr_cmd, hw_acts->mhdr->mhdr_cmds,
-			   sizeof(*job->mhdr_cmd) * hw_acts->mhdr->mhdr_cmds_num);
+					(uint8_t *)ap->mhdr_cmd;
+		rte_memcpy(ap->mhdr_cmd, hw_acts->mhdr->mhdr_cmds,
+			   sizeof(*ap->mhdr_cmd) * hw_acts->mhdr->mhdr_cmds_num);
 	}
 	LIST_FOREACH(act_data, &hw_acts->act_list, next) {
 		uint32_t jump_group;
@@ -3000,7 +3001,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 		case RTE_FLOW_ACTION_TYPE_INDIRECT:
 			if (flow_hw_shared_action_construct
 					(dev, queue, action, table, it_idx,
-					 at->action_flags, job->flow,
+					 at->action_flags, flow,
 					 &rule_acts[act_data->action_dst]))
 				return -1;
 			break;
@@ -3025,8 +3026,8 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				return -1;
 			rule_acts[act_data->action_dst].action =
 			(!!attr.group) ? jump->hws_action : jump->root_action;
-			job->flow->jump = jump;
-			job->flow->fate_type = MLX5_FLOW_FATE_JUMP;
+			flow->jump = jump;
+			flow->fate_type = MLX5_FLOW_FATE_JUMP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_RSS:
 		case RTE_FLOW_ACTION_TYPE_QUEUE:
@@ -3036,8 +3037,8 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			if (!hrxq)
 				return -1;
 			rule_acts[act_data->action_dst].action = hrxq->action;
-			job->flow->hrxq = hrxq;
-			job->flow->fate_type = MLX5_FLOW_FATE_QUEUE;
+			flow->hrxq = hrxq;
+			flow->fate_type = MLX5_FLOW_FATE_QUEUE;
 			break;
 		case MLX5_RTE_FLOW_ACTION_TYPE_RSS:
 			item_flags = table->its[it_idx]->item_flags;
@@ -3049,38 +3050,37 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 		case RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP:
 			enc_item = ((const struct rte_flow_action_vxlan_encap *)
 				   action->conf)->definition;
-			if (flow_dv_convert_encap_data(enc_item, buf, &encap_len, NULL))
+			if (flow_dv_convert_encap_data(enc_item, ap->encap_data, &encap_len, NULL))
 				return -1;
 			break;
 		case RTE_FLOW_ACTION_TYPE_NVGRE_ENCAP:
 			enc_item = ((const struct rte_flow_action_nvgre_encap *)
 				   action->conf)->definition;
-			if (flow_dv_convert_encap_data(enc_item, buf, &encap_len, NULL))
+			if (flow_dv_convert_encap_data(enc_item, ap->encap_data, &encap_len, NULL))
 				return -1;
 			break;
 		case RTE_FLOW_ACTION_TYPE_RAW_ENCAP:
 			raw_encap_data =
 				(const struct rte_flow_action_raw_encap *)
 				 action->conf;
-			rte_memcpy((void *)buf, raw_encap_data->data, act_data->encap.len);
-			MLX5_ASSERT(raw_encap_data->size ==
-				    act_data->encap.len);
+			rte_memcpy(ap->encap_data, raw_encap_data->data, act_data->encap.len);
+			MLX5_ASSERT(raw_encap_data->size == act_data->encap.len);
 			break;
 		case RTE_FLOW_ACTION_TYPE_IPV6_EXT_PUSH:
 			ipv6_push =
 				(const struct rte_flow_action_ipv6_ext_push *)action->conf;
-			rte_memcpy((void *)push_buf, ipv6_push->data,
+			rte_memcpy(ap->ipv6_push_data, ipv6_push->data,
 				   act_data->ipv6_ext.len);
 			MLX5_ASSERT(ipv6_push->size == act_data->ipv6_ext.len);
 			break;
 		case RTE_FLOW_ACTION_TYPE_MODIFY_FIELD:
 			if (action->type == RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_VID)
-				ret = flow_hw_set_vlan_vid_construct(dev, job,
+				ret = flow_hw_set_vlan_vid_construct(dev, ap->mhdr_cmd,
 								     act_data,
 								     hw_acts,
 								     action);
 			else
-				ret = flow_hw_modify_field_construct(job,
+				ret = flow_hw_modify_field_construct(ap->mhdr_cmd,
 								     act_data,
 								     hw_acts,
 								     action);
@@ -3116,8 +3116,8 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			rule_acts[act_data->action_dst + 1].action =
 					(!!attr.group) ? jump->hws_action :
 							 jump->root_action;
-			job->flow->jump = jump;
-			job->flow->fate_type = MLX5_FLOW_FATE_JUMP;
+			flow->jump = jump;
+			flow->fate_type = MLX5_FLOW_FATE_JUMP;
 			if (mlx5_aso_mtr_wait(priv->sh, MLX5_HW_INV_QUEUE, aso_mtr))
 				return -1;
 			break;
@@ -3131,11 +3131,11 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			 */
 			age_idx = mlx5_hws_age_action_create(priv, queue, 0,
 							     age,
-							     job->flow->res_idx,
+							     flow->res_idx,
 							     error);
 			if (age_idx == 0)
 				return -rte_errno;
-			job->flow->age_idx = age_idx;
+			flow->age_idx = age_idx;
 			if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT)
 				/*
 				 * When AGE uses indirect counter, no need to
@@ -3158,7 +3158,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				 );
 			if (ret != 0)
 				return ret;
-			job->flow->cnt_id = cnt_id;
+			flow->cnt_id = cnt_id;
 			break;
 		case MLX5_RTE_FLOW_ACTION_TYPE_COUNT:
 			ret = mlx5_hws_cnt_pool_get_action_offset
@@ -3169,7 +3169,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				 );
 			if (ret != 0)
 				return ret;
-			job->flow->cnt_id = act_data->shared_counter.id;
+			flow->cnt_id = act_data->shared_counter.id;
 			break;
 		case RTE_FLOW_ACTION_TYPE_CONNTRACK:
 			ct_idx = MLX5_INDIRECT_ACTION_IDX_GET(action->conf);
@@ -3196,8 +3196,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			 */
 			ret = flow_hw_meter_mark_compile(dev,
 				act_data->action_dst, action,
-				rule_acts, &job->flow->mtr_id,
-				MLX5_HW_INV_QUEUE, error);
+				rule_acts, &flow->mtr_id, MLX5_HW_INV_QUEUE, error);
 			if (ret != 0)
 				return ret;
 			break;
@@ -3207,9 +3206,9 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	}
 	if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT) {
 		if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_AGE) {
-			age_idx = job->flow->age_idx & MLX5_HWS_AGE_IDX_MASK;
+			age_idx = flow->age_idx & MLX5_HWS_AGE_IDX_MASK;
 			if (mlx5_hws_cnt_age_get(priv->hws_cpool,
-						 job->flow->cnt_id) != age_idx)
+						 flow->cnt_id) != age_idx)
 				/*
 				 * This is first use of this indirect counter
 				 * for this indirect AGE, need to increase the
@@ -3221,7 +3220,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 		 * Update this indirect counter the indirect/direct AGE in which
 		 * using it.
 		 */
-		mlx5_hws_cnt_age_set(priv->hws_cpool, job->flow->cnt_id,
+		mlx5_hws_cnt_age_set(priv->hws_cpool, flow->cnt_id,
 				     age_idx);
 	}
 	if (hw_acts->encap_decap && !hw_acts->encap_decap->shared) {
@@ -3231,21 +3230,21 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 		if (ix < 0)
 			return -1;
 		if (!mp_segment)
-			mp_segment = mlx5_multi_pattern_segment_find(table, job->flow->res_idx);
+			mp_segment = mlx5_multi_pattern_segment_find(table, flow->res_idx);
 		if (!mp_segment || !mp_segment->reformat_action[ix])
 			return -1;
 		ra->action = mp_segment->reformat_action[ix];
 		/* reformat offset is relative to selected DR action */
-		ra->reformat.offset = job->flow->res_idx - mp_segment->head_index;
-		ra->reformat.data = buf;
+		ra->reformat.offset = flow->res_idx - mp_segment->head_index;
+		ra->reformat.data = ap->encap_data;
 	}
 	if (hw_acts->push_remove && !hw_acts->push_remove->shared) {
 		rule_acts[hw_acts->push_remove_pos].ipv6_ext.offset =
-				job->flow->res_idx - 1;
-		rule_acts[hw_acts->push_remove_pos].ipv6_ext.header = push_buf;
+				flow->res_idx - 1;
+		rule_acts[hw_acts->push_remove_pos].ipv6_ext.header = ap->ipv6_push_data;
 	}
 	if (mlx5_hws_cnt_id_valid(hw_acts->cnt_id))
-		job->flow->cnt_id = hw_acts->cnt_id;
+		flow->cnt_id = hw_acts->cnt_id;
 	return 0;
 }
 
@@ -3345,6 +3344,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		.burst = attr->postpone,
 	};
 	struct mlx5dr_rule_action *rule_acts;
+	struct mlx5_flow_hw_action_params ap;
 	struct rte_flow_hw *flow = NULL;
 	struct mlx5_hw_q_job *job = NULL;
 	const struct rte_flow_item *rule_items;
@@ -3401,7 +3401,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	 * No need to copy and contrust a new "actions" list based on the
 	 * user's input, in order to save the cost.
 	 */
-	if (flow_hw_actions_construct(dev, job,
+	if (flow_hw_actions_construct(dev, flow, &ap,
 				      &table->ats[action_template_index],
 				      pattern_template_index, actions,
 				      rule_acts, queue, error)) {
@@ -3493,6 +3493,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 		.burst = attr->postpone,
 	};
 	struct mlx5dr_rule_action *rule_acts;
+	struct mlx5_flow_hw_action_params ap;
 	struct rte_flow_hw *flow = NULL;
 	struct mlx5_hw_q_job *job = NULL;
 	uint32_t flow_idx = 0;
@@ -3545,7 +3546,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	 * No need to copy and contrust a new "actions" list based on the
 	 * user's input, in order to save the cost.
 	 */
-	if (flow_hw_actions_construct(dev, job,
+	if (flow_hw_actions_construct(dev, flow, &ap,
 				      &table->ats[action_template_index],
 				      0, actions, rule_acts, queue, error)) {
 		rte_errno = EINVAL;
@@ -3627,6 +3628,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		.burst = attr->postpone,
 	};
 	struct mlx5dr_rule_action *rule_acts;
+	struct mlx5_flow_hw_action_params ap;
 	struct rte_flow_hw *of = (struct rte_flow_hw *)flow;
 	struct rte_flow_hw *nf;
 	struct rte_flow_template_table *table = of->table;
@@ -3679,7 +3681,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	 * No need to copy and contrust a new "actions" list based on the
 	 * user's input, in order to save the cost.
 	 */
-	if (flow_hw_actions_construct(dev, job,
+	if (flow_hw_actions_construct(dev, nf, &ap,
 				      &table->ats[action_template_index],
 				      nf->mt_idx, actions,
 				      rule_acts, queue, error)) {
@@ -6611,7 +6613,7 @@ flow_hw_set_vlan_vid(struct rte_eth_dev *dev,
 
 static __rte_always_inline int
 flow_hw_set_vlan_vid_construct(struct rte_eth_dev *dev,
-			       struct mlx5_hw_q_job *job,
+			       struct mlx5_modification_cmd *mhdr_cmd,
 			       struct mlx5_action_construct_data *act_data,
 			       const struct mlx5_hw_actions *hw_acts,
 			       const struct rte_flow_action *action)
@@ -6639,8 +6641,7 @@ flow_hw_set_vlan_vid_construct(struct rte_eth_dev *dev,
 		.conf = &conf
 	};
 
-	return flow_hw_modify_field_construct(job, act_data, hw_acts,
-					      &modify_action);
+	return flow_hw_modify_field_construct(mhdr_cmd, act_data, hw_acts, &modify_action);
 }
 
 static int
@@ -9990,10 +9991,6 @@ flow_hw_configure(struct rte_eth_dev *dev,
 		}
 		mem_size += (sizeof(struct mlx5_hw_q_job *) +
 			    sizeof(struct mlx5_hw_q_job) +
-			    sizeof(uint8_t) * MLX5_ENCAP_MAX_LEN +
-			    sizeof(uint8_t) * MLX5_PUSH_MAX_LEN +
-			    sizeof(struct mlx5_modification_cmd) *
-			    MLX5_MHDR_MAX_CMD +
 			    sizeof(struct rte_flow_item) *
 			    MLX5_HW_MAX_ITEMS +
 				sizeof(struct rte_flow_hw)) *
@@ -10006,8 +10003,6 @@ flow_hw_configure(struct rte_eth_dev *dev,
 		goto err;
 	}
 	for (i = 0; i < nb_q_updated; i++) {
-		uint8_t *encap = NULL, *push = NULL;
-		struct mlx5_modification_cmd *mhdr_cmd = NULL;
 		struct rte_flow_item *items = NULL;
 		struct rte_flow_hw *upd_flow = NULL;
 
@@ -10021,20 +10016,11 @@ flow_hw_configure(struct rte_eth_dev *dev,
 				&job[_queue_attr[i - 1]->size - 1].upd_flow[1];
 		job = (struct mlx5_hw_q_job *)
 		      &priv->hw_q[i].job[_queue_attr[i]->size];
-		mhdr_cmd = (struct mlx5_modification_cmd *)
-			   &job[_queue_attr[i]->size];
-		encap = (uint8_t *)
-			 &mhdr_cmd[_queue_attr[i]->size * MLX5_MHDR_MAX_CMD];
-		push = (uint8_t *)
-			 &encap[_queue_attr[i]->size * MLX5_ENCAP_MAX_LEN];
 		items = (struct rte_flow_item *)
-			 &push[_queue_attr[i]->size * MLX5_PUSH_MAX_LEN];
+			 &job[_queue_attr[i]->size];
 		upd_flow = (struct rte_flow_hw *)
 			&items[_queue_attr[i]->size * MLX5_HW_MAX_ITEMS];
 		for (j = 0; j < _queue_attr[i]->size; j++) {
-			job[j].mhdr_cmd = &mhdr_cmd[j * MLX5_MHDR_MAX_CMD];
-			job[j].encap_data = &encap[j * MLX5_ENCAP_MAX_LEN];
-			job[j].push_data = &push[j * MLX5_PUSH_MAX_LEN];
 			job[j].items = &items[j * MLX5_HW_MAX_ITEMS];
 			job[j].upd_flow = &upd_flow[j];
 			priv->hw_q[i].job[j] = &job[j];
-- 
2.39.2



* [PATCH 06/11] net/mlx5: remove flow pattern from job
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
                   ` (4 preceding siblings ...)
  2024-02-28 17:00 ` [PATCH 05/11] net/mlx5: remove action params from job Dariusz Sosnowski
@ 2024-02-28 17:00 ` Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 07/11] net/mlx5: remove updated flow " Dariusz Sosnowski
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

mlx5_hw_q_job struct held a reference to a temporary flow rule pattern
and contained temporary REPRESENTED_PORT and TAG item structs.
They are used whenever the flow rule pattern provided by the application
must be prepended with one of these items.
If prepending is required, the flow rule pattern is copied over to a
temporary buffer and a new item is added internally in the PMD.
The buffer constructed this way is passed to the HWS layer when the flow
create operation is enqueued.
After the operation is enqueued, the temporary flow pattern can be safely
discarded, so there is no need to store it during the whole lifecycle
of mlx5_hw_q_job.

This patch removes all references to the flow rule pattern and items
stored inside mlx5_hw_q_job and removes the relevant allocations to
reduce job memory footprint.
The temporary pattern and items stored per job are replaced with
stack-allocated ones, contained in the mlx5_flow_hw_pattern_params struct.
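
A minimal sketch of the resulting pattern handling (names follow the
diff; the snippet itself is illustrative only):

    /* The prepended pattern lands in a stack-allocated buffer rather
     * than in the job descriptor.
     */
    struct mlx5_flow_hw_pattern_params pp;
    const struct rte_flow_item *rule_items;

    rule_items = flow_hw_get_rule_items(dev, table, items,
                                        pattern_template_index, &pp);
    if (!rule_items)
            goto error;
    /* "pp" is only needed until the rule is enqueued to mlx5dr. */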

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 drivers/net/mlx5/mlx5.h         | 17 ++++-------
 drivers/net/mlx5/mlx5_flow.h    | 10 +++++++
 drivers/net/mlx5/mlx5_flow_hw.c | 51 ++++++++++++++-------------------
 3 files changed, 37 insertions(+), 41 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index bd0846d6bf..fc3d28e6f2 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -401,17 +401,12 @@ struct mlx5_hw_q_job {
 		const void *action; /* Indirect action attached to the job. */
 	};
 	void *user_data; /* Job user data. */
-	struct rte_flow_item *items;
-	union {
-		struct {
-			/* User memory for query output */
-			void *user;
-			/* Data extracted from hardware */
-			void *hw;
-		} __rte_packed query;
-		struct rte_flow_item_ethdev port_spec;
-		struct rte_flow_item_tag tag_spec;
-	} __rte_packed;
+	struct {
+		/* User memory for query output */
+		void *user;
+		/* Data extracted from hardware */
+		void *hw;
+	} query;
 	struct rte_flow_hw *upd_flow; /* Flow with updated values. */
 };
 
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index df1c913017..96b43ce61e 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1304,6 +1304,16 @@ struct mlx5_flow_hw_action_params {
 	uint8_t ipv6_push_data[MLX5_PUSH_MAX_LEN];
 };
 
+/** Container for dynamically generated flow items used during flow rule creation. */
+struct mlx5_flow_hw_pattern_params {
+	/** Array of dynamically generated flow items. */
+	struct rte_flow_item items[MLX5_HW_MAX_ITEMS];
+	/** Temporary REPRESENTED_PORT item generated by PMD. */
+	struct rte_flow_item_ethdev port_spec;
+	/** Temporary TAG item generated by PMD. */
+	struct rte_flow_item_tag tag_spec;
+};
+
 /* rte flow action translate to DR action struct. */
 struct mlx5_action_construct_data {
 	LIST_ENTRY(mlx5_action_construct_data) next;
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 7160477c83..c3d9eef999 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -3253,44 +3253,44 @@ flow_hw_get_rule_items(struct rte_eth_dev *dev,
 		       const struct rte_flow_template_table *table,
 		       const struct rte_flow_item items[],
 		       uint8_t pattern_template_index,
-		       struct mlx5_hw_q_job *job)
+		       struct mlx5_flow_hw_pattern_params *pp)
 {
 	struct rte_flow_pattern_template *pt = table->its[pattern_template_index];
 
 	/* Only one implicit item can be added to flow rule pattern. */
 	MLX5_ASSERT(!pt->implicit_port || !pt->implicit_tag);
-	/* At least one item was allocated in job descriptor for items. */
+	/* At least one item was allocated in pattern params for items. */
 	MLX5_ASSERT(MLX5_HW_MAX_ITEMS >= 1);
 	if (pt->implicit_port) {
 		if (pt->orig_item_nb + 1 > MLX5_HW_MAX_ITEMS) {
 			rte_errno = ENOMEM;
 			return NULL;
 		}
-		/* Set up represented port item in job descriptor. */
-		job->port_spec = (struct rte_flow_item_ethdev){
+		/* Set up represented port item in pattern params. */
+		pp->port_spec = (struct rte_flow_item_ethdev){
 			.port_id = dev->data->port_id,
 		};
-		job->items[0] = (struct rte_flow_item){
+		pp->items[0] = (struct rte_flow_item){
 			.type = RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT,
-			.spec = &job->port_spec,
+			.spec = &pp->port_spec,
 		};
-		rte_memcpy(&job->items[1], items, sizeof(*items) * pt->orig_item_nb);
-		return job->items;
+		rte_memcpy(&pp->items[1], items, sizeof(*items) * pt->orig_item_nb);
+		return pp->items;
 	} else if (pt->implicit_tag) {
 		if (pt->orig_item_nb + 1 > MLX5_HW_MAX_ITEMS) {
 			rte_errno = ENOMEM;
 			return NULL;
 		}
-		/* Set up tag item in job descriptor. */
-		job->tag_spec = (struct rte_flow_item_tag){
+		/* Set up tag item in pattern params. */
+		pp->tag_spec = (struct rte_flow_item_tag){
 			.data = flow_hw_tx_tag_regc_value(dev),
 		};
-		job->items[0] = (struct rte_flow_item){
+		pp->items[0] = (struct rte_flow_item){
 			.type = (enum rte_flow_item_type)MLX5_RTE_FLOW_ITEM_TYPE_TAG,
-			.spec = &job->tag_spec,
+			.spec = &pp->tag_spec,
 		};
-		rte_memcpy(&job->items[1], items, sizeof(*items) * pt->orig_item_nb);
-		return job->items;
+		rte_memcpy(&pp->items[1], items, sizeof(*items) * pt->orig_item_nb);
+		return pp->items;
 	} else {
 		return items;
 	}
@@ -3345,6 +3345,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	};
 	struct mlx5dr_rule_action *rule_acts;
 	struct mlx5_flow_hw_action_params ap;
+	struct mlx5_flow_hw_pattern_params pp;
 	struct rte_flow_hw *flow = NULL;
 	struct mlx5_hw_q_job *job = NULL;
 	const struct rte_flow_item *rule_items;
@@ -3409,7 +3410,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		goto error;
 	}
 	rule_items = flow_hw_get_rule_items(dev, table, items,
-					    pattern_template_index, job);
+					    pattern_template_index, &pp);
 	if (!rule_items)
 		goto error;
 	if (likely(!rte_flow_template_table_resizable(dev->data->port_id, &table->cfg.attr))) {
@@ -9990,11 +9991,8 @@ flow_hw_configure(struct rte_eth_dev *dev,
 			goto err;
 		}
 		mem_size += (sizeof(struct mlx5_hw_q_job *) +
-			    sizeof(struct mlx5_hw_q_job) +
-			    sizeof(struct rte_flow_item) *
-			    MLX5_HW_MAX_ITEMS +
-				sizeof(struct rte_flow_hw)) *
-			    _queue_attr[i]->size;
+			     sizeof(struct mlx5_hw_q_job) +
+			     sizeof(struct rte_flow_hw)) * _queue_attr[i]->size;
 	}
 	priv->hw_q = mlx5_malloc(MLX5_MEM_ZERO, mem_size,
 				 64, SOCKET_ID_ANY);
@@ -10003,7 +10001,6 @@ flow_hw_configure(struct rte_eth_dev *dev,
 		goto err;
 	}
 	for (i = 0; i < nb_q_updated; i++) {
-		struct rte_flow_item *items = NULL;
 		struct rte_flow_hw *upd_flow = NULL;
 
 		priv->hw_q[i].job_idx = _queue_attr[i]->size;
@@ -10016,12 +10013,8 @@ flow_hw_configure(struct rte_eth_dev *dev,
 				&job[_queue_attr[i - 1]->size - 1].upd_flow[1];
 		job = (struct mlx5_hw_q_job *)
 		      &priv->hw_q[i].job[_queue_attr[i]->size];
-		items = (struct rte_flow_item *)
-			 &job[_queue_attr[i]->size];
-		upd_flow = (struct rte_flow_hw *)
-			&items[_queue_attr[i]->size * MLX5_HW_MAX_ITEMS];
+		upd_flow = (struct rte_flow_hw *)&job[_queue_attr[i]->size];
 		for (j = 0; j < _queue_attr[i]->size; j++) {
-			job[j].items = &items[j * MLX5_HW_MAX_ITEMS];
 			job[j].upd_flow = &upd_flow[j];
 			priv->hw_q[i].job[j] = &job[j];
 		}
@@ -12193,14 +12186,12 @@ flow_hw_calc_table_hash(struct rte_eth_dev *dev,
 			 uint32_t *hash, struct rte_flow_error *error)
 {
 	const struct rte_flow_item *items;
-	/* Temp job to allow adding missing items */
-	static struct rte_flow_item tmp_items[MLX5_HW_MAX_ITEMS];
-	static struct mlx5_hw_q_job job = {.items = tmp_items};
+	struct mlx5_flow_hw_pattern_params pp;
 	int res;
 
 	items = flow_hw_get_rule_items(dev, table, pattern,
 				       pattern_template_index,
-				       &job);
+				       &pp);
 	res = mlx5dr_rule_hash_calculate(mlx5_table_matcher(table), items,
 					 pattern_template_index,
 					 MLX5DR_RULE_HASH_CALC_MODE_RAW,
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 07/11] net/mlx5: remove updated flow from job
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
                   ` (5 preceding siblings ...)
  2024-02-28 17:00 ` [PATCH 06/11] net/mlx5: remove flow pattern " Dariusz Sosnowski
@ 2024-02-28 17:00 ` Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 08/11] net/mlx5: use flow as operation container Dariusz Sosnowski
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

The mlx5_hw_q_job struct holds a reference to a temporary flow rule
struct used during flow rule update operations. It serves as a container
for flow action data calculated during action construction.
After a flow rule update operation succeeds, data from the temporary
flow rule is copied over to the original flow rule.

Although access to this temporary flow rule struct is required
during both the operation enqueue step and the completion polling step,
there can be only one ongoing flow update operation for a given
flow rule. As a result, there is no need to store it per job.

This patch removes all references to the temporary flow rule struct
stored in mlx5_hw_q_job and removes the related allocations to reduce
the job memory footprint.
The temporary flow rule struct stored per job is replaced with one of
the following (as sketched below):

- If the table is not resizable - an array of rte_flow_hw_aux structs
  stored in the template table. This array holds one entry per flow
  rule, each entry containing the mentioned temporary struct.
- If the table is resizable - an additional rte_flow_hw_aux struct,
  allocated alongside rte_flow_hw in the resizable ipool.
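
To make the two placements concrete, here is a small, self-contained
sketch. The struct and function names below are simplified stand-ins for
rte_flow_hw, rte_flow_hw_aux, rte_flow_template_table and the
mlx5_flow_hw_aux() helper added by this patch; sizes are illustrative only.

#include <stdint.h>

struct flow_aux {                  /* stand-in for rte_flow_hw_aux */
	uint8_t upd_flow[64];      /* placeholder flow used during update */
};

struct flow {                      /* stand-in for rte_flow_hw */
	uint32_t idx;              /* 1-based index in the table's flow pool */
	uint8_t rule[72];          /* stand-in for the trailing mlx5dr rule */
	/* for resizable tables, struct flow_aux is laid out right after this */
};

struct table {                     /* stand-in for rte_flow_template_table */
	int resizable;
	struct flow_aux *flow_aux; /* per-flow aux array (non-resizable case) */
};

static struct flow_aux *
aux_of(struct table *tbl, struct flow *f)
{
	if (tbl->resizable)
		/* aux is allocated together with the flow, right behind it */
		return (struct flow_aux *)((uint8_t *)f + sizeof(*f));
	/* aux is kept in a separate per-table array, indexed by flow index */
	return &tbl->flow_aux[f->idx - 1];
}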

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 drivers/net/mlx5/mlx5.h         |   1 -
 drivers/net/mlx5/mlx5_flow.h    |   7 +++
 drivers/net/mlx5/mlx5_flow_hw.c | 100 ++++++++++++++++++++++++++------
 3 files changed, 89 insertions(+), 19 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index fc3d28e6f2..0cc32bf67b 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -407,7 +407,6 @@ struct mlx5_hw_q_job {
 		/* Data extracted from hardware */
 		void *hw;
 	} query;
-	struct rte_flow_hw *upd_flow; /* Flow with updated values. */
 };
 
 /* HW steering job descriptor LIFO pool. */
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 96b43ce61e..8fd07bdce4 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1281,6 +1281,12 @@ struct rte_flow_hw {
 	uint8_t rule[]; /* HWS layer data struct. */
 } __rte_packed;
 
+/** Auxiliary data stored per flow which is not required to be stored in main flow structure. */
+struct rte_flow_hw_aux {
+	/** Placeholder flow struct used during flow rule update operation. */
+	struct rte_flow_hw upd_flow;
+};
+
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
@@ -1589,6 +1595,7 @@ struct rte_flow_template_table {
 	/* Action templates bind to the table. */
 	struct mlx5_hw_action_template ats[MLX5_HW_TBL_MAX_ACTION_TEMPLATE];
 	struct mlx5_indexed_pool *flow; /* The table's flow ipool. */
+	struct rte_flow_hw_aux *flow_aux; /**< Auxiliary data stored per flow. */
 	struct mlx5_indexed_pool *resource; /* The table's resource ipool. */
 	struct mlx5_flow_template_table_cfg cfg;
 	uint32_t type; /* Flow table type RX/TX/FDB. */
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index c3d9eef999..acc56819eb 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -79,6 +79,66 @@ struct mlx5_indlst_legacy {
 #define MLX5_CONST_ENCAP_ITEM(encap_type, ptr) \
 (((const struct encap_type *)(ptr))->definition)
 
+/**
+ * Returns the size of a struct with a following layout:
+ *
+ * @code{.c}
+ * struct rte_flow_hw {
+ *     // rte_flow_hw fields
+ *     uint8_t rule[mlx5dr_rule_get_handle_size()];
+ * };
+ * @endcode
+ *
+ * Such struct is used as a basic container for HW Steering flow rule.
+ */
+static size_t
+mlx5_flow_hw_entry_size(void)
+{
+	return sizeof(struct rte_flow_hw) + mlx5dr_rule_get_handle_size();
+}
+
+/**
+ * Returns the size of "auxed" rte_flow_hw structure which is assumed to be laid out as follows:
+ *
+ * @code{.c}
+ * struct {
+ *     struct rte_flow_hw {
+ *         // rte_flow_hw fields
+ *         uint8_t rule[mlx5dr_rule_get_handle_size()];
+ *     } flow;
+ *     struct rte_flow_hw_aux aux;
+ * };
+ * @endcode
+ *
+ * Such struct is used whenever rte_flow_hw_aux cannot be allocated separately from the rte_flow_hw
+ * e.g., when table is resizable.
+ */
+static size_t
+mlx5_flow_hw_auxed_entry_size(void)
+{
+	size_t rule_size = mlx5dr_rule_get_handle_size();
+
+	return sizeof(struct rte_flow_hw) + rule_size + sizeof(struct rte_flow_hw_aux);
+}
+
+/**
+ * Returns a valid pointer to rte_flow_hw_aux associated with given rte_flow_hw
+ * depending on template table configuration.
+ */
+static __rte_always_inline struct rte_flow_hw_aux *
+mlx5_flow_hw_aux(uint16_t port_id, struct rte_flow_hw *flow)
+{
+	struct rte_flow_template_table *table = flow->table;
+
+	if (rte_flow_template_table_resizable(port_id, &table->cfg.attr)) {
+		size_t offset = sizeof(struct rte_flow_hw) + mlx5dr_rule_get_handle_size();
+
+		return RTE_PTR_ADD(flow, offset);
+	} else {
+		return &table->flow_aux[flow->idx - 1];
+	}
+}
+
 static int
 mlx5_tbl_multi_pattern_process(struct rte_eth_dev *dev,
 			       struct rte_flow_template_table *tbl,
@@ -3632,6 +3692,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	struct mlx5_flow_hw_action_params ap;
 	struct rte_flow_hw *of = (struct rte_flow_hw *)flow;
 	struct rte_flow_hw *nf;
+	struct rte_flow_hw_aux *aux;
 	struct rte_flow_template_table *table = of->table;
 	struct mlx5_hw_q_job *job = NULL;
 	uint32_t res_idx = 0;
@@ -3642,7 +3703,8 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		rte_errno = ENOMEM;
 		goto error;
 	}
-	nf = job->upd_flow;
+	aux = mlx5_flow_hw_aux(dev->data->port_id, of);
+	nf = &aux->upd_flow;
 	memset(nf, 0, sizeof(struct rte_flow_hw));
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
@@ -3689,11 +3751,8 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		rte_errno = EINVAL;
 		goto error;
 	}
-	/*
-	 * Switch the old flow and the new flow.
-	 */
+	/* Switch to the old flow. New flow will retrieved from the table on completion. */
 	job->flow = of;
-	job->upd_flow = nf;
 	ret = mlx5dr_rule_action_update((struct mlx5dr_rule *)of->rule,
 					action_template_index, rule_acts, &rule_attr);
 	if (likely(!ret))
@@ -3966,8 +4025,10 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 			mlx5_ipool_free(table->flow, flow->idx);
 		}
 	} else {
-		rte_memcpy(flow, job->upd_flow,
-			   offsetof(struct rte_flow_hw, rule));
+		struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
+		struct rte_flow_hw *upd_flow = &aux->upd_flow;
+
+		rte_memcpy(flow, upd_flow, offsetof(struct rte_flow_hw, rule));
 		if (table->resource)
 			mlx5_ipool_free(table->resource, res_idx);
 	}
@@ -4456,7 +4517,6 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 		.data = &flow_attr,
 	};
 	struct mlx5_indexed_pool_config cfg = {
-		.size = sizeof(struct rte_flow_hw) + mlx5dr_rule_get_handle_size(),
 		.trunk_size = 1 << 12,
 		.per_core_cache = 1 << 13,
 		.need_lock = 1,
@@ -4477,6 +4537,9 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	if (!attr->flow_attr.group)
 		max_tpl = 1;
 	cfg.max_idx = nb_flows;
+	cfg.size = !rte_flow_template_table_resizable(dev->data->port_id, attr) ?
+		   mlx5_flow_hw_entry_size() :
+		   mlx5_flow_hw_auxed_entry_size();
 	/* For table has very limited flows, disable cache. */
 	if (nb_flows < cfg.trunk_size) {
 		cfg.per_core_cache = 0;
@@ -4507,6 +4570,11 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	tbl->flow = mlx5_ipool_create(&cfg);
 	if (!tbl->flow)
 		goto error;
+	/* Allocate table of auxiliary flow rule structs. */
+	tbl->flow_aux = mlx5_malloc(MLX5_MEM_ZERO, sizeof(struct rte_flow_hw_aux) * nb_flows,
+				    RTE_CACHE_LINE_SIZE, rte_dev_numa_node(dev->device));
+	if (!tbl->flow_aux)
+		goto error;
 	/* Register the flow group. */
 	ge = mlx5_hlist_register(priv->sh->groups, attr->flow_attr.group, &ctx);
 	if (!ge)
@@ -4627,6 +4695,8 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 		if (tbl->grp)
 			mlx5_hlist_unregister(priv->sh->groups,
 					      &tbl->grp->entry);
+		if (tbl->flow_aux)
+			mlx5_free(tbl->flow_aux);
 		if (tbl->flow)
 			mlx5_ipool_destroy(tbl->flow);
 		mlx5_free(tbl);
@@ -4865,6 +4935,7 @@ flow_hw_table_destroy(struct rte_eth_dev *dev,
 	mlx5_hlist_unregister(priv->sh->groups, &table->grp->entry);
 	if (table->resource)
 		mlx5_ipool_destroy(table->resource);
+	mlx5_free(table->flow_aux);
 	mlx5_ipool_destroy(table->flow);
 	mlx5_free(table);
 	return 0;
@@ -9991,8 +10062,7 @@ flow_hw_configure(struct rte_eth_dev *dev,
 			goto err;
 		}
 		mem_size += (sizeof(struct mlx5_hw_q_job *) +
-			     sizeof(struct mlx5_hw_q_job) +
-			     sizeof(struct rte_flow_hw)) * _queue_attr[i]->size;
+			     sizeof(struct mlx5_hw_q_job)) * _queue_attr[i]->size;
 	}
 	priv->hw_q = mlx5_malloc(MLX5_MEM_ZERO, mem_size,
 				 64, SOCKET_ID_ANY);
@@ -10001,23 +10071,17 @@ flow_hw_configure(struct rte_eth_dev *dev,
 		goto err;
 	}
 	for (i = 0; i < nb_q_updated; i++) {
-		struct rte_flow_hw *upd_flow = NULL;
-
 		priv->hw_q[i].job_idx = _queue_attr[i]->size;
 		priv->hw_q[i].size = _queue_attr[i]->size;
 		if (i == 0)
 			priv->hw_q[i].job = (struct mlx5_hw_q_job **)
 					    &priv->hw_q[nb_q_updated];
 		else
-			priv->hw_q[i].job = (struct mlx5_hw_q_job **)
-				&job[_queue_attr[i - 1]->size - 1].upd_flow[1];
+			priv->hw_q[i].job = (struct mlx5_hw_q_job **)&job[_queue_attr[i - 1]->size];
 		job = (struct mlx5_hw_q_job *)
 		      &priv->hw_q[i].job[_queue_attr[i]->size];
-		upd_flow = (struct rte_flow_hw *)&job[_queue_attr[i]->size];
-		for (j = 0; j < _queue_attr[i]->size; j++) {
-			job[j].upd_flow = &upd_flow[j];
+		for (j = 0; j < _queue_attr[i]->size; j++)
 			priv->hw_q[i].job[j] = &job[j];
-		}
 		/* Notice ring name length is limited. */
 		priv->hw_q[i].indir_cq = mlx5_hwq_ring_create
 			(dev->data->port_id, i, _queue_attr[i]->size, "indir_act_cq");
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 08/11] net/mlx5: use flow as operation container
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
                   ` (6 preceding siblings ...)
  2024-02-28 17:00 ` [PATCH 07/11] net/mlx5: remove updated flow " Dariusz Sosnowski
@ 2024-02-28 17:00 ` Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 09/11] net/mlx5: move rarely used flow fields outside Dariusz Sosnowski
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

While processing async flow operations in the mlx5 PMD, the
mlx5_hw_q_job struct is used to hold the following data related to the
ongoing operation:

- operation type,
- user data,
- flow reference.

The job itself is then passed to the mlx5dr layer as its "user data".
Other types of data required during flow operation processing
are accessed through the flow itself.

Since most of the accessed fields are already in the flow struct,
the operation type and user data can be moved to the flow struct as
well. This removes an unnecessary memory indirection and reduces the
memory footprint of flow operation processing. It decreases cache
pressure and as a result can increase processing throughput.

This patch removes mlx5_hw_q_job from async flow operation
processing; from now on, the flow itself represents the ongoing
operation. Async operations on indirect actions still use jobs.
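
A rough, self-contained sketch of the resulting completion path is shown
below; the types are hypothetical stand-ins and the real logic lives in
flow_hw_pull() in the diff that follows.

#include <stdint.h>

enum op_type { OP_CREATE, OP_DESTROY, OP_UPDATE }; /* stand-in op types */

struct flow {                   /* stand-in for rte_flow_hw */
	uint8_t operation_type; /* ongoing operation, set at enqueue time */
	void *user_data;        /* application data, restored on completion */
};

struct completion {             /* stand-in for a hardware completion entry */
	void *user_data;        /* set to the flow pointer at enqueue time */
};

static void
process_completion(struct completion *c, void **app_user_data)
{
	struct flow *f = c->user_data; /* no job indirection anymore */

	*app_user_data = f->user_data; /* restore application's private data */
	switch (f->operation_type) {
	case OP_DESTROY:
	case OP_UPDATE:
		/* release or commit flow resources here */
		break;
	default:
		break;
	}
}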

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 drivers/net/mlx5/mlx5.h         |   8 +-
 drivers/net/mlx5/mlx5_flow.h    |  13 ++
 drivers/net/mlx5/mlx5_flow_hw.c | 210 +++++++++++++++-----------------
 3 files changed, 116 insertions(+), 115 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 0cc32bf67b..4362efb02f 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -396,10 +396,7 @@ enum mlx5_hw_indirect_type {
 struct mlx5_hw_q_job {
 	uint32_t type; /* Job type. */
 	uint32_t indirect_type;
-	union {
-		struct rte_flow_hw *flow; /* Flow attached to the job. */
-		const void *action; /* Indirect action attached to the job. */
-	};
+	const void *action; /* Indirect action attached to the job. */
 	void *user_data; /* Job user data. */
 	struct {
 		/* User memory for query output */
@@ -412,7 +409,8 @@ struct mlx5_hw_q_job {
 /* HW steering job descriptor LIFO pool. */
 struct mlx5_hw_q {
 	uint32_t job_idx; /* Free job index. */
-	uint32_t size; /* LIFO size. */
+	uint32_t size; /* Job LIFO queue size. */
+	uint32_t ongoing_flow_ops; /* Number of ongoing flow operations. */
 	struct mlx5_hw_q_job **job; /* LIFO header. */
 	struct rte_ring *indir_cq; /* Indirect action SW completion queue. */
 	struct rte_ring *indir_iq; /* Indirect action SW in progress queue. */
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 8fd07bdce4..2e3e7d0533 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1257,6 +1257,16 @@ typedef uint32_t cnt_id_t;
 
 #if defined(HAVE_IBV_FLOW_DV_SUPPORT) || !defined(HAVE_INFINIBAND_VERBS_H)
 
+enum {
+	MLX5_FLOW_HW_FLOW_OP_TYPE_NONE,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_DESTROY,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_DESTROY,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_MOVE,
+};
+
 #ifdef PEDANTIC
 #pragma GCC diagnostic ignored "-Wpedantic"
 #endif
@@ -1278,6 +1288,9 @@ struct rte_flow_hw {
 	cnt_id_t cnt_id;
 	uint32_t mtr_id;
 	uint32_t rule_idx;
+	uint8_t operation_type; /**< Ongoing flow operation type. */
+	void *user_data; /**< Application's private data passed to enqueued flow operation. */
+	uint8_t padding[1]; /**< Padding for proper alignment of mlx5dr rule struct. */
 	uint8_t rule[]; /* HWS layer data struct. */
 } __rte_packed;
 
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index acc56819eb..4d39e7bd45 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -312,6 +312,31 @@ static const struct rte_flow_item_eth ctrl_rx_eth_bcast_spec = {
 	.hdr.ether_type = 0,
 };
 
+static inline uint32_t
+flow_hw_q_pending(struct mlx5_priv *priv, uint32_t queue)
+{
+	struct mlx5_hw_q *q = &priv->hw_q[queue];
+
+	MLX5_ASSERT(q->size >= q->job_idx);
+	return (q->size - q->job_idx) + q->ongoing_flow_ops;
+}
+
+static inline void
+flow_hw_q_inc_flow_ops(struct mlx5_priv *priv, uint32_t queue)
+{
+	struct mlx5_hw_q *q = &priv->hw_q[queue];
+
+	q->ongoing_flow_ops++;
+}
+
+static inline void
+flow_hw_q_dec_flow_ops(struct mlx5_priv *priv, uint32_t queue)
+{
+	struct mlx5_hw_q *q = &priv->hw_q[queue];
+
+	q->ongoing_flow_ops--;
+}
+
 static __rte_always_inline struct mlx5_hw_q_job *
 flow_hw_job_get(struct mlx5_priv *priv, uint32_t queue)
 {
@@ -3407,20 +3432,15 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	struct mlx5_flow_hw_action_params ap;
 	struct mlx5_flow_hw_pattern_params pp;
 	struct rte_flow_hw *flow = NULL;
-	struct mlx5_hw_q_job *job = NULL;
 	const struct rte_flow_item *rule_items;
 	uint32_t flow_idx = 0;
 	uint32_t res_idx = 0;
 	int ret;
 
 	if (unlikely((!dev->data->dev_started))) {
-		rte_errno = EINVAL;
-		goto error;
-	}
-	job = flow_hw_job_get(priv, queue);
-	if (!job) {
-		rte_errno = ENOMEM;
-		goto error;
+		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				   "Port must be started before enqueueing flow operations");
+		return NULL;
 	}
 	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
 	if (!flow)
@@ -3442,13 +3462,12 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		flow->res_idx = flow_idx;
 	}
 	/*
-	 * Set the job type here in order to know if the flow memory
+	 * Set the flow operation type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
 	 */
-	job->type = MLX5_HW_Q_JOB_TYPE_CREATE;
-	job->flow = flow;
-	job->user_data = user_data;
-	rule_attr.user_data = job;
+	flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE;
+	flow->user_data = user_data;
+	rule_attr.user_data = flow;
 	/*
 	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices
 	 * for rule insertion hints.
@@ -3482,7 +3501,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	} else {
 		uint32_t selector;
 
-		job->type = MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_CREATE;
+		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE;
 		rte_rwlock_read_lock(&table->matcher_replace_rwlk);
 		selector = table->matcher_selector;
 		ret = mlx5dr_rule_create(table->matcher_info[selector].matcher,
@@ -3493,15 +3512,15 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
 		flow->matcher_selector = selector;
 	}
-	if (likely(!ret))
+	if (likely(!ret)) {
+		flow_hw_q_inc_flow_ops(priv, queue);
 		return (struct rte_flow *)flow;
+	}
 error:
 	if (table->resource && res_idx)
 		mlx5_ipool_free(table->resource, res_idx);
 	if (flow_idx)
 		mlx5_ipool_free(table->flow, flow_idx);
-	if (job)
-		flow_hw_job_put(priv, job, queue);
 	rte_flow_error_set(error, rte_errno,
 			   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 			   "fail to create rte flow");
@@ -3556,19 +3575,14 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	struct mlx5dr_rule_action *rule_acts;
 	struct mlx5_flow_hw_action_params ap;
 	struct rte_flow_hw *flow = NULL;
-	struct mlx5_hw_q_job *job = NULL;
 	uint32_t flow_idx = 0;
 	uint32_t res_idx = 0;
 	int ret;
 
 	if (unlikely(rule_index >= table->cfg.attr.nb_flows)) {
-		rte_errno = EINVAL;
-		goto error;
-	}
-	job = flow_hw_job_get(priv, queue);
-	if (!job) {
-		rte_errno = ENOMEM;
-		goto error;
+		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				   "Flow rule index exceeds table size");
+		return NULL;
 	}
 	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
 	if (!flow)
@@ -3590,13 +3604,12 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 		flow->res_idx = flow_idx;
 	}
 	/*
-	 * Set the job type here in order to know if the flow memory
+	 * Set the flow operation type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
 	 */
-	job->type = MLX5_HW_Q_JOB_TYPE_CREATE;
-	job->flow = flow;
-	job->user_data = user_data;
-	rule_attr.user_data = job;
+	flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE;
+	flow->user_data = user_data;
+	rule_attr.user_data = flow;
 	/* Set the rule index. */
 	flow->rule_idx = rule_index;
 	rule_attr.rule_idx = flow->rule_idx;
@@ -3621,7 +3634,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	} else {
 		uint32_t selector;
 
-		job->type = MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_CREATE;
+		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE;
 		rte_rwlock_read_lock(&table->matcher_replace_rwlk);
 		selector = table->matcher_selector;
 		ret = mlx5dr_rule_create(table->matcher_info[selector].matcher,
@@ -3630,15 +3643,15 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 					 (struct mlx5dr_rule *)flow->rule);
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
 	}
-	if (likely(!ret))
+	if (likely(!ret)) {
+		flow_hw_q_inc_flow_ops(priv, queue);
 		return (struct rte_flow *)flow;
+	}
 error:
 	if (table->resource && res_idx)
 		mlx5_ipool_free(table->resource, res_idx);
 	if (flow_idx)
 		mlx5_ipool_free(table->flow, flow_idx);
-	if (job)
-		flow_hw_job_put(priv, job, queue);
 	rte_flow_error_set(error, rte_errno,
 			   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 			   "fail to create rte flow");
@@ -3694,15 +3707,9 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	struct rte_flow_hw *nf;
 	struct rte_flow_hw_aux *aux;
 	struct rte_flow_template_table *table = of->table;
-	struct mlx5_hw_q_job *job = NULL;
 	uint32_t res_idx = 0;
 	int ret;
 
-	job = flow_hw_job_get(priv, queue);
-	if (!job) {
-		rte_errno = ENOMEM;
-		goto error;
-	}
 	aux = mlx5_flow_hw_aux(dev->data->port_id, of);
 	nf = &aux->upd_flow;
 	memset(nf, 0, sizeof(struct rte_flow_hw));
@@ -3722,14 +3729,6 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	} else {
 		nf->res_idx = of->res_idx;
 	}
-	/*
-	 * Set the job type here in order to know if the flow memory
-	 * should be freed or not when get the result from dequeue.
-	 */
-	job->type = MLX5_HW_Q_JOB_TYPE_UPDATE;
-	job->flow = nf;
-	job->user_data = user_data;
-	rule_attr.user_data = job;
 	/*
 	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices
 	 * for rule insertion hints.
@@ -3751,18 +3750,22 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		rte_errno = EINVAL;
 		goto error;
 	}
-	/* Switch to the old flow. New flow will retrieved from the table on completion. */
-	job->flow = of;
+	/*
+	 * Set the flow operation type here in order to know if the flow memory
+	 * should be freed or not when get the result from dequeue.
+	 */
+	of->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE;
+	of->user_data = user_data;
+	rule_attr.user_data = of;
 	ret = mlx5dr_rule_action_update((struct mlx5dr_rule *)of->rule,
 					action_template_index, rule_acts, &rule_attr);
-	if (likely(!ret))
+	if (likely(!ret)) {
+		flow_hw_q_inc_flow_ops(priv, queue);
 		return 0;
+	}
 error:
 	if (table->resource && res_idx)
 		mlx5_ipool_free(table->resource, res_idx);
-	/* Flow created fail, return the descriptor and flow memory. */
-	if (job)
-		flow_hw_job_put(priv, job, queue);
 	return rte_flow_error_set(error, rte_errno,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 				  "fail to update rte flow");
@@ -3806,27 +3809,23 @@ flow_hw_async_flow_destroy(struct rte_eth_dev *dev,
 		.burst = attr->postpone,
 	};
 	struct rte_flow_hw *fh = (struct rte_flow_hw *)flow;
-	struct mlx5_hw_q_job *job;
+	bool resizable = rte_flow_template_table_resizable(dev->data->port_id,
+							   &fh->table->cfg.attr);
 	int ret;
 
-	job = flow_hw_job_get(priv, queue);
-	if (!job)
-		return rte_flow_error_set(error, ENOMEM,
-					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
-					  "fail to destroy rte flow: flow queue full");
-	job->type = !rte_flow_template_table_resizable(dev->data->port_id, &fh->table->cfg.attr) ?
-		    MLX5_HW_Q_JOB_TYPE_DESTROY : MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_DESTROY;
-	job->user_data = user_data;
-	job->flow = fh;
-	rule_attr.user_data = job;
+	fh->operation_type = !resizable ?
+			     MLX5_FLOW_HW_FLOW_OP_TYPE_DESTROY :
+			     MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_DESTROY;
+	fh->user_data = user_data;
+	rule_attr.user_data = fh;
 	rule_attr.rule_idx = fh->rule_idx;
 	ret = mlx5dr_rule_destroy((struct mlx5dr_rule *)fh->rule, &rule_attr);
 	if (ret) {
-		flow_hw_job_put(priv, job, queue);
 		return rte_flow_error_set(error, rte_errno,
 					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 					  "fail to destroy rte flow");
 	}
+	flow_hw_q_inc_flow_ops(priv, queue);
 	return 0;
 }
 
@@ -3931,16 +3930,16 @@ mlx5_hw_pull_flow_transfer_comp(struct rte_eth_dev *dev,
 				uint16_t n_res)
 {
 	uint32_t size, i;
-	struct mlx5_hw_q_job *job = NULL;
+	struct rte_flow_hw *flow = NULL;
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct rte_ring *ring = priv->hw_q[queue].flow_transfer_completed;
 
 	size = RTE_MIN(rte_ring_count(ring), n_res);
 	for (i = 0; i < size; i++) {
 		res[i].status = RTE_FLOW_OP_SUCCESS;
-		rte_ring_dequeue(ring, (void **)&job);
-		res[i].user_data = job->user_data;
-		flow_hw_job_put(priv, job, queue);
+		rte_ring_dequeue(ring, (void **)&flow);
+		res[i].user_data = flow->user_data;
+		flow_hw_q_dec_flow_ops(priv, queue);
 	}
 	return (int)size;
 }
@@ -3997,12 +3996,11 @@ __flow_hw_pull_indir_action_comp(struct rte_eth_dev *dev,
 
 static __rte_always_inline void
 hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
-			       struct mlx5_hw_q_job *job,
+			       struct rte_flow_hw *flow,
 			       uint32_t queue, struct rte_flow_error *error)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_aso_mtr_pool *pool = priv->hws_mpool;
-	struct rte_flow_hw *flow = job->flow;
 	struct rte_flow_template_table *table = flow->table;
 	/* Release the original resource index in case of update. */
 	uint32_t res_idx = flow->res_idx;
@@ -4018,12 +4016,10 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 		mlx5_ipool_free(pool->idx_pool,	flow->mtr_id);
 		flow->mtr_id = 0;
 	}
-	if (job->type != MLX5_HW_Q_JOB_TYPE_UPDATE) {
-		if (table) {
-			if (table->resource)
-				mlx5_ipool_free(table->resource, res_idx);
-			mlx5_ipool_free(table->flow, flow->idx);
-		}
+	if (flow->operation_type != MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE) {
+		if (table->resource)
+			mlx5_ipool_free(table->resource, res_idx);
+		mlx5_ipool_free(table->flow, flow->idx);
 	} else {
 		struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 		struct rte_flow_hw *upd_flow = &aux->upd_flow;
@@ -4036,28 +4032,27 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 
 static __rte_always_inline void
 hw_cmpl_resizable_tbl(struct rte_eth_dev *dev,
-		      struct mlx5_hw_q_job *job,
+		      struct rte_flow_hw *flow,
 		      uint32_t queue, enum rte_flow_op_status status,
 		      struct rte_flow_error *error)
 {
-	struct rte_flow_hw *flow = job->flow;
 	struct rte_flow_template_table *table = flow->table;
 	uint32_t selector = flow->matcher_selector;
 	uint32_t other_selector = (selector + 1) & 1;
 
-	switch (job->type) {
-	case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_CREATE:
+	switch (flow->operation_type) {
+	case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE:
 		rte_atomic_fetch_add_explicit
 			(&table->matcher_info[selector].refcnt, 1,
 			 rte_memory_order_relaxed);
 		break;
-	case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_DESTROY:
+	case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_DESTROY:
 		rte_atomic_fetch_sub_explicit
 			(&table->matcher_info[selector].refcnt, 1,
 			 rte_memory_order_relaxed);
-		hw_cmpl_flow_update_or_destroy(dev, job, queue, error);
+		hw_cmpl_flow_update_or_destroy(dev, flow, queue, error);
 		break;
-	case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_MOVE:
+	case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_MOVE:
 		if (status == RTE_FLOW_OP_SUCCESS) {
 			rte_atomic_fetch_sub_explicit
 				(&table->matcher_info[selector].refcnt, 1,
@@ -4101,7 +4096,6 @@ flow_hw_pull(struct rte_eth_dev *dev,
 	     struct rte_flow_error *error)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	struct mlx5_hw_q_job *job;
 	int ret, i;
 
 	/* 1. Pull the flow completion. */
@@ -4111,23 +4105,24 @@ flow_hw_pull(struct rte_eth_dev *dev,
 				RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 				"fail to query flow queue");
 	for (i = 0; i <  ret; i++) {
-		job = (struct mlx5_hw_q_job *)res[i].user_data;
+		struct rte_flow_hw *flow = res[i].user_data;
+
 		/* Restore user data. */
-		res[i].user_data = job->user_data;
-		switch (job->type) {
-		case MLX5_HW_Q_JOB_TYPE_DESTROY:
-		case MLX5_HW_Q_JOB_TYPE_UPDATE:
-			hw_cmpl_flow_update_or_destroy(dev, job, queue, error);
+		res[i].user_data = flow->user_data;
+		switch (flow->operation_type) {
+		case MLX5_FLOW_HW_FLOW_OP_TYPE_DESTROY:
+		case MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE:
+			hw_cmpl_flow_update_or_destroy(dev, flow, queue, error);
 			break;
-		case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_CREATE:
-		case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_MOVE:
-		case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_DESTROY:
-			hw_cmpl_resizable_tbl(dev, job, queue, res[i].status, error);
+		case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE:
+		case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_DESTROY:
+		case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_MOVE:
+			hw_cmpl_resizable_tbl(dev, flow, queue, res[i].status, error);
 			break;
 		default:
 			break;
 		}
-		flow_hw_job_put(priv, job, queue);
+		flow_hw_q_dec_flow_ops(priv, queue);
 	}
 	/* 2. Pull indirect action comp. */
 	if (ret < n_res)
@@ -4171,7 +4166,7 @@ __flow_hw_push_action(struct rte_eth_dev *dev,
 			mlx5_aso_push_wqe(priv->sh,
 					  &priv->hws_mpool->sq[queue]);
 	}
-	return priv->hw_q[queue].size - priv->hw_q[queue].job_idx;
+	return flow_hw_q_pending(priv, queue);
 }
 
 static int
@@ -10073,6 +10068,7 @@ flow_hw_configure(struct rte_eth_dev *dev,
 	for (i = 0; i < nb_q_updated; i++) {
 		priv->hw_q[i].job_idx = _queue_attr[i]->size;
 		priv->hw_q[i].size = _queue_attr[i]->size;
+		priv->hw_q[i].ongoing_flow_ops = 0;
 		if (i == 0)
 			priv->hw_q[i].job = (struct mlx5_hw_q_job **)
 					    &priv->hw_q[nb_q_updated];
@@ -12499,7 +12495,6 @@ flow_hw_update_resized(struct rte_eth_dev *dev, uint32_t queue,
 {
 	int ret;
 	struct mlx5_priv *priv = dev->data->dev_private;
-	struct mlx5_hw_q_job *job;
 	struct rte_flow_hw *hw_flow = (struct rte_flow_hw *)flow;
 	struct rte_flow_template_table *table = hw_flow->table;
 	uint32_t table_selector = table->matcher_selector;
@@ -12525,31 +12520,26 @@ flow_hw_update_resized(struct rte_eth_dev *dev, uint32_t queue,
 		return rte_flow_error_set(error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 					  "no active table resize");
-	job = flow_hw_job_get(priv, queue);
-	if (!job)
-		return rte_flow_error_set(error, ENOMEM,
-					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
-					  "queue is full");
-	job->type = MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_MOVE;
-	job->user_data = user_data;
-	job->flow = hw_flow;
-	rule_attr.user_data = job;
+	hw_flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_MOVE;
+	hw_flow->user_data = user_data;
+	rule_attr.user_data = hw_flow;
 	if (rule_selector == table_selector) {
 		struct rte_ring *ring = !attr->postpone ?
 					priv->hw_q[queue].flow_transfer_completed :
 					priv->hw_q[queue].flow_transfer_pending;
-		rte_ring_enqueue(ring, job);
+		rte_ring_enqueue(ring, hw_flow);
+		flow_hw_q_inc_flow_ops(priv, queue);
 		return 0;
 	}
 	ret = mlx5dr_matcher_resize_rule_move(other_matcher,
 					      (struct mlx5dr_rule *)hw_flow->rule,
 					      &rule_attr);
 	if (ret) {
-		flow_hw_job_put(priv, job, queue);
 		return rte_flow_error_set(error, rte_errno,
 					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 					  "flow transfer failed");
 	}
+	flow_hw_q_inc_flow_ops(priv, queue);
 	return 0;
 }
 
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 09/11] net/mlx5: move rarely used flow fields outside
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
                   ` (7 preceding siblings ...)
  2024-02-28 17:00 ` [PATCH 08/11] net/mlx5: use flow as operation container Dariusz Sosnowski
@ 2024-02-28 17:00 ` Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 10/11] net/mlx5: reuse flow fields Dariusz Sosnowski
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

Some of the flow fields are not always required or are used only
rarely, e.g.:

- AGE action reference,
- direct METER/METER_MARK action reference,
- matcher selector for resizable tables.

This patch moves these fields to the rte_flow_hw_aux struct in order
to reduce the overall size of the flow struct, shrinking the working
set for the most common use cases.
This reduces the frequency of cache invalidations during async flow
operation processing.
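
The orig/upd split used for the updatable auxiliary fields can be
illustrated with the following self-contained sketch (stand-in names; the
real accessors are the mlx5_flow_hw_aux_set_*()/..._get_*() helpers in the
diff below).

#include <stdint.h>

/* Stand-ins for rte_flow_hw_aux_fields / rte_flow_hw_aux from this patch. */
struct aux_fields {
	uint32_t age_idx; /* AGE action index */
	uint32_t mtr_id;  /* direct METER/METER_MARK action index */
};

struct aux {
	struct aux_fields orig; /* fields of the currently installed flow */
	struct aux_fields upd;  /* fields built by an in-flight update */
};

/* A flow update writes to 'upd'; flow creation writes to 'orig'. */
static void
aux_set_age_idx(struct aux *a, int updating, uint32_t age_idx)
{
	if (updating)
		a->upd.age_idx = age_idx;
	else
		a->orig.age_idx = age_idx;
}

/* On update completion the driver commits 'upd' over 'orig'. */
static void
aux_commit_update(struct aux *a)
{
	a->orig = a->upd;
}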

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow.h    |  61 +++++++++++-----
 drivers/net/mlx5/mlx5_flow_hw.c | 121 ++++++++++++++++++++++++--------
 2 files changed, 138 insertions(+), 44 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 2e3e7d0533..1c67d8dd35 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1271,31 +1271,60 @@ enum {
 #pragma GCC diagnostic ignored "-Wpedantic"
 #endif
 
-/* HWS flow struct. */
+/** HWS flow struct. */
 struct rte_flow_hw {
-	uint32_t idx; /* Flow index from indexed pool. */
-	uint32_t res_idx; /* Resource index from indexed pool. */
-	uint32_t fate_type; /* Fate action type. */
+	/** The table flow allcated from. */
+	struct rte_flow_template_table *table;
+	/** Application's private data passed to enqueued flow operation. */
+	void *user_data;
+	/** Flow index from indexed pool. */
+	uint32_t idx;
+	/** Resource index from indexed pool. */
+	uint32_t res_idx;
+	/** HWS flow rule index passed to mlx5dr. */
+	uint32_t rule_idx;
+	/** Fate action type. */
+	uint32_t fate_type;
+	/** Ongoing flow operation type. */
+	uint8_t operation_type;
+	/** Index of pattern template this flow is based on. */
+	uint8_t mt_idx;
+
+	/** COUNT action index. */
+	cnt_id_t cnt_id;
 	union {
-		/* Jump action. */
+		/** Jump action. */
 		struct mlx5_hw_jump_action *jump;
-		struct mlx5_hrxq *hrxq; /* TIR action. */
+		/** TIR action. */
+		struct mlx5_hrxq *hrxq;
 	};
-	struct rte_flow_template_table *table; /* The table flow allcated from. */
-	uint8_t mt_idx;
-	uint8_t matcher_selector:1;
+
+	/**
+	 * Padding for alignment to 56 bytes.
+	 * Since mlx5dr rule is 72 bytes, whole flow is contained within 128 B (2 cache lines).
+	 * This space is reserved for future additions to flow struct.
+	 */
+	uint8_t padding[10];
+	/** HWS layer data struct. */
+	uint8_t rule[];
+} __rte_packed;
+
+/** Auxiliary data fields that are updatable. */
+struct rte_flow_hw_aux_fields {
+	/** AGE action index. */
 	uint32_t age_idx;
-	cnt_id_t cnt_id;
+	/** Direct meter (METER or METER_MARK) action index. */
 	uint32_t mtr_id;
-	uint32_t rule_idx;
-	uint8_t operation_type; /**< Ongoing flow operation type. */
-	void *user_data; /**< Application's private data passed to enqueued flow operation. */
-	uint8_t padding[1]; /**< Padding for proper alignment of mlx5dr rule struct. */
-	uint8_t rule[]; /* HWS layer data struct. */
-} __rte_packed;
+};
 
 /** Auxiliary data stored per flow which is not required to be stored in main flow structure. */
 struct rte_flow_hw_aux {
+	/** Auxiliary fields associated with the original flow. */
+	struct rte_flow_hw_aux_fields orig;
+	/** Auxiliary fields associated with the updated flow. */
+	struct rte_flow_hw_aux_fields upd;
+	/** Index of resizable matcher associated with this flow. */
+	uint8_t matcher_selector;
 	/** Placeholder flow struct used during flow rule update operation. */
 	struct rte_flow_hw upd_flow;
 };
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 4d39e7bd45..3252f76e64 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -139,6 +139,50 @@ mlx5_flow_hw_aux(uint16_t port_id, struct rte_flow_hw *flow)
 	}
 }
 
+static __rte_always_inline void
+mlx5_flow_hw_aux_set_age_idx(struct rte_flow_hw *flow,
+			     struct rte_flow_hw_aux *aux,
+			     uint32_t age_idx)
+{
+	/*
+	 * Only when creating a flow rule, the type will be set explicitly.
+	 * Or else, it should be none in the rule update case.
+	 */
+	if (unlikely(flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE))
+		aux->upd.age_idx = age_idx;
+	else
+		aux->orig.age_idx = age_idx;
+}
+
+static __rte_always_inline uint32_t
+mlx5_flow_hw_aux_get_age_idx(struct rte_flow_hw *flow, struct rte_flow_hw_aux *aux)
+{
+	if (unlikely(flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE))
+		return aux->upd.age_idx;
+	else
+		return aux->orig.age_idx;
+}
+
+static __rte_always_inline void
+mlx5_flow_hw_aux_set_mtr_id(struct rte_flow_hw *flow,
+			    struct rte_flow_hw_aux *aux,
+			    uint32_t mtr_id)
+{
+	if (unlikely(flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE))
+		aux->upd.mtr_id = mtr_id;
+	else
+		aux->orig.mtr_id = mtr_id;
+}
+
+static __rte_always_inline uint32_t __rte_unused
+mlx5_flow_hw_aux_get_mtr_id(struct rte_flow_hw *flow, struct rte_flow_hw_aux *aux)
+{
+	if (unlikely(flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE))
+		return aux->upd.mtr_id;
+	else
+		return aux->orig.mtr_id;
+}
+
 static int
 mlx5_tbl_multi_pattern_process(struct rte_eth_dev *dev,
 			       struct rte_flow_template_table *tbl,
@@ -2753,6 +2797,7 @@ flow_hw_shared_action_construct(struct rte_eth_dev *dev, uint32_t queue,
 	struct mlx5_aso_mtr *aso_mtr;
 	struct mlx5_age_info *age_info;
 	struct mlx5_hws_age_param *param;
+	struct rte_flow_hw_aux *aux;
 	uint32_t act_idx = (uint32_t)(uintptr_t)action->conf;
 	uint32_t type = act_idx >> MLX5_INDIRECT_ACTION_TYPE_OFFSET;
 	uint32_t idx = act_idx &
@@ -2790,11 +2835,12 @@ flow_hw_shared_action_construct(struct rte_eth_dev *dev, uint32_t queue,
 		flow->cnt_id = act_idx;
 		break;
 	case MLX5_INDIRECT_ACTION_TYPE_AGE:
+		aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 		/*
 		 * Save the index with the indirect type, to recognize
 		 * it in flow destroy.
 		 */
-		flow->age_idx = act_idx;
+		mlx5_flow_hw_aux_set_age_idx(flow, aux, act_idx);
 		if (action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT)
 			/*
 			 * The mutual update for idirect AGE & COUNT will be
@@ -3020,14 +3066,16 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	const struct rte_flow_action_meter *meter = NULL;
 	const struct rte_flow_action_age *age = NULL;
 	struct rte_flow_attr attr = {
-			.ingress = 1,
+		.ingress = 1,
 	};
 	uint32_t ft_flag;
-	size_t encap_len = 0;
 	int ret;
+	size_t encap_len = 0;
 	uint32_t age_idx = 0;
+	uint32_t mtr_idx = 0;
 	struct mlx5_aso_mtr *aso_mtr;
 	struct mlx5_multi_pattern_segment *mp_segment = NULL;
+	struct rte_flow_hw_aux *aux;
 
 	attr.group = table->grp->group_id;
 	ft_flag = mlx5_hw_act_flag[!!table->grp->group_id][table->type];
@@ -3207,6 +3255,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				return -1;
 			break;
 		case RTE_FLOW_ACTION_TYPE_AGE:
+			aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 			age = action->conf;
 			/*
 			 * First, create the AGE parameter, then create its
@@ -3220,7 +3269,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 							     error);
 			if (age_idx == 0)
 				return -rte_errno;
-			flow->age_idx = age_idx;
+			mlx5_flow_hw_aux_set_age_idx(flow, aux, age_idx);
 			if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT)
 				/*
 				 * When AGE uses indirect counter, no need to
@@ -3281,9 +3330,11 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			 */
 			ret = flow_hw_meter_mark_compile(dev,
 				act_data->action_dst, action,
-				rule_acts, &flow->mtr_id, MLX5_HW_INV_QUEUE, error);
+				rule_acts, &mtr_idx, MLX5_HW_INV_QUEUE, error);
 			if (ret != 0)
 				return ret;
+			aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
+			mlx5_flow_hw_aux_set_mtr_id(flow, aux, mtr_idx);
 			break;
 		default:
 			break;
@@ -3291,9 +3342,10 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	}
 	if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT) {
 		if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_AGE) {
-			age_idx = flow->age_idx & MLX5_HWS_AGE_IDX_MASK;
-			if (mlx5_hws_cnt_age_get(priv->hws_cpool,
-						 flow->cnt_id) != age_idx)
+			aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
+			age_idx = mlx5_flow_hw_aux_get_age_idx(flow, aux) &
+				  MLX5_HWS_AGE_IDX_MASK;
+			if (mlx5_hws_cnt_age_get(priv->hws_cpool, flow->cnt_id) != age_idx)
 				/*
 				 * This is first use of this indirect counter
 				 * for this indirect AGE, need to increase the
@@ -3305,8 +3357,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 		 * Update this indirect counter the indirect/direct AGE in which
 		 * using it.
 		 */
-		mlx5_hws_cnt_age_set(priv->hws_cpool, flow->cnt_id,
-				     age_idx);
+		mlx5_hws_cnt_age_set(priv->hws_cpool, flow->cnt_id, age_idx);
 	}
 	if (hw_acts->encap_decap && !hw_acts->encap_decap->shared) {
 		int ix = mlx5_multi_pattern_reformat_to_index(hw_acts->encap_decap->action_type);
@@ -3499,6 +3550,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 					 &rule_attr,
 					 (struct mlx5dr_rule *)flow->rule);
 	} else {
+		struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 		uint32_t selector;
 
 		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE;
@@ -3510,7 +3562,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 					 &rule_attr,
 					 (struct mlx5dr_rule *)flow->rule);
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
-		flow->matcher_selector = selector;
+		aux->matcher_selector = selector;
 	}
 	if (likely(!ret)) {
 		flow_hw_q_inc_flow_ops(priv, queue);
@@ -3632,6 +3684,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 					 rule_acts, &rule_attr,
 					 (struct mlx5dr_rule *)flow->rule);
 	} else {
+		struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 		uint32_t selector;
 
 		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE;
@@ -3642,6 +3695,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 					 rule_acts, &rule_attr,
 					 (struct mlx5dr_rule *)flow->rule);
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
+		aux->matcher_selector = selector;
 	}
 	if (likely(!ret)) {
 		flow_hw_q_inc_flow_ops(priv, queue);
@@ -3729,6 +3783,8 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	} else {
 		nf->res_idx = of->res_idx;
 	}
+	/* Indicate the construction function to set the proper fields. */
+	nf->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE;
 	/*
 	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices
 	 * for rule insertion hints.
@@ -3846,15 +3902,17 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
 			  struct rte_flow_hw *flow,
 			  struct rte_flow_error *error)
 {
+	struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(priv->dev_data->port_id, flow);
 	uint32_t *cnt_queue;
+	uint32_t age_idx = aux->orig.age_idx;
 
 	if (mlx5_hws_cnt_is_shared(priv->hws_cpool, flow->cnt_id)) {
-		if (flow->age_idx && !mlx5_hws_age_is_indirect(flow->age_idx)) {
+		if (age_idx && !mlx5_hws_age_is_indirect(age_idx)) {
 			/* Remove this AGE parameter from indirect counter. */
 			mlx5_hws_cnt_age_set(priv->hws_cpool, flow->cnt_id, 0);
 			/* Release the AGE parameter. */
-			mlx5_hws_age_action_destroy(priv, flow->age_idx, error);
-			flow->age_idx = 0;
+			mlx5_hws_age_action_destroy(priv, age_idx, error);
+			mlx5_flow_hw_aux_set_age_idx(flow, aux, 0);
 		}
 		return;
 	}
@@ -3863,16 +3921,16 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
 	/* Put the counter first to reduce the race risk in BG thread. */
 	mlx5_hws_cnt_pool_put(priv->hws_cpool, cnt_queue, &flow->cnt_id);
 	flow->cnt_id = 0;
-	if (flow->age_idx) {
-		if (mlx5_hws_age_is_indirect(flow->age_idx)) {
-			uint32_t idx = flow->age_idx & MLX5_HWS_AGE_IDX_MASK;
+	if (age_idx) {
+		if (mlx5_hws_age_is_indirect(age_idx)) {
+			uint32_t idx = age_idx & MLX5_HWS_AGE_IDX_MASK;
 
 			mlx5_hws_age_nb_cnt_decrease(priv, idx);
 		} else {
 			/* Release the AGE parameter. */
-			mlx5_hws_age_action_destroy(priv, flow->age_idx, error);
+			mlx5_hws_age_action_destroy(priv, age_idx, error);
 		}
-		flow->age_idx = 0;
+		mlx5_flow_hw_aux_set_age_idx(flow, aux, age_idx);
 	}
 }
 
@@ -4002,6 +4060,7 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_aso_mtr_pool *pool = priv->hws_mpool;
 	struct rte_flow_template_table *table = flow->table;
+	struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 	/* Release the original resource index in case of update. */
 	uint32_t res_idx = flow->res_idx;
 
@@ -4012,9 +4071,9 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 	if (mlx5_hws_cnt_id_valid(flow->cnt_id))
 		flow_hw_age_count_release(priv, queue,
 					  flow, error);
-	if (flow->mtr_id) {
-		mlx5_ipool_free(pool->idx_pool,	flow->mtr_id);
-		flow->mtr_id = 0;
+	if (aux->orig.mtr_id) {
+		mlx5_ipool_free(pool->idx_pool,	aux->orig.mtr_id);
+		aux->orig.mtr_id = 0;
 	}
 	if (flow->operation_type != MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE) {
 		if (table->resource)
@@ -4025,6 +4084,8 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 		struct rte_flow_hw *upd_flow = &aux->upd_flow;
 
 		rte_memcpy(flow, upd_flow, offsetof(struct rte_flow_hw, rule));
+		aux->orig = aux->upd;
+		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE;
 		if (table->resource)
 			mlx5_ipool_free(table->resource, res_idx);
 	}
@@ -4037,7 +4098,8 @@ hw_cmpl_resizable_tbl(struct rte_eth_dev *dev,
 		      struct rte_flow_error *error)
 {
 	struct rte_flow_template_table *table = flow->table;
-	uint32_t selector = flow->matcher_selector;
+	struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
+	uint32_t selector = aux->matcher_selector;
 	uint32_t other_selector = (selector + 1) & 1;
 
 	switch (flow->operation_type) {
@@ -4060,7 +4122,7 @@ hw_cmpl_resizable_tbl(struct rte_eth_dev *dev,
 			rte_atomic_fetch_add_explicit
 				(&table->matcher_info[other_selector].refcnt, 1,
 				 rte_memory_order_relaxed);
-			flow->matcher_selector = other_selector;
+			aux->matcher_selector = other_selector;
 		}
 		break;
 	default:
@@ -11206,6 +11268,7 @@ flow_hw_query(struct rte_eth_dev *dev, struct rte_flow *flow,
 {
 	int ret = -EINVAL;
 	struct rte_flow_hw *hw_flow = (struct rte_flow_hw *)flow;
+	struct rte_flow_hw_aux *aux;
 
 	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
 		switch (actions->type) {
@@ -11216,8 +11279,9 @@ flow_hw_query(struct rte_eth_dev *dev, struct rte_flow *flow,
 						    error);
 			break;
 		case RTE_FLOW_ACTION_TYPE_AGE:
-			ret = flow_hw_query_age(dev, hw_flow->age_idx, data,
-						error);
+			aux = mlx5_flow_hw_aux(dev->data->port_id, hw_flow);
+			ret = flow_hw_query_age(dev, mlx5_flow_hw_aux_get_age_idx(hw_flow, aux),
+						data, error);
 			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
@@ -12497,8 +12561,9 @@ flow_hw_update_resized(struct rte_eth_dev *dev, uint32_t queue,
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct rte_flow_hw *hw_flow = (struct rte_flow_hw *)flow;
 	struct rte_flow_template_table *table = hw_flow->table;
+	struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, hw_flow);
 	uint32_t table_selector = table->matcher_selector;
-	uint32_t rule_selector = hw_flow->matcher_selector;
+	uint32_t rule_selector = aux->matcher_selector;
 	uint32_t other_selector;
 	struct mlx5dr_matcher *other_matcher;
 	struct mlx5dr_rule_attr rule_attr = {
@@ -12511,7 +12576,7 @@ flow_hw_update_resized(struct rte_eth_dev *dev, uint32_t queue,
 	 * the one that was used BEFORE table resize.
 	 * Since the function is called AFTER table resize,
 	 * `table->matcher_selector` always points to the new matcher and
-	 * `hw_flow->matcher_selector` points to a matcher used to create the flow.
+	 * `aux->matcher_selector` points to a matcher used to create the flow.
 	 */
 	other_selector = rule_selector == table_selector ?
 			 (rule_selector + 1) & 1 : rule_selector;
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 10/11] net/mlx5: reuse flow fields
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
                   ` (8 preceding siblings ...)
  2024-02-28 17:00 ` [PATCH 09/11] net/mlx5: move rarely used flow fields outside Dariusz Sosnowski
@ 2024-02-28 17:00 ` Dariusz Sosnowski
  2024-02-28 17:00 ` [PATCH 11/11] net/mlx5: remove unneeded device status checking Dariusz Sosnowski
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

Each time a flow is allocated in the mlx5 PMD, the whole buffer,
both the rte_flow_hw and mlx5dr_rule parts, is zeroed.
This introduces some wasted work because:

- the mlx5dr layer does not require mlx5dr_rule to be pre-initialized,
- flow action translation in the mlx5 PMD does not need most of the
  fields of rte_flow_hw to be zeroed.

To reduce this wasted work, this patch introduces a flags field in
the flow definition. Each flow field which is not always initialized
during flow creation will have a corresponding flag set if its value
is valid (in other words, if it was set during flow creation).
Utilizing this mechanism allows the PMD to:

- remove zeroing from flow allocation,
- access some fields (especially those in rte_flow_hw_aux) if and only
  if the corresponding flag is set (see the sketch below).
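
A minimal, self-contained sketch of the flag mechanism (hypothetical
names; the real flags are the MLX5_FLOW_HW_FLOW_FLAG_* bits defined in the
diff below):

#include <stdint.h>

/* Stand-in flags mirroring the MLX5_FLOW_HW_FLOW_FLAG_* bits. */
#define FLOW_FLAG_CNT_ID  (UINT32_C(1) << 0)
#define FLOW_FLAG_AGE_IDX (UINT32_C(1) << 3)

struct flow {               /* stand-in for rte_flow_hw */
	uint32_t flags;     /* marks which optional fields below are valid */
	uint32_t cnt_id;    /* valid only when FLOW_FLAG_CNT_ID is set */
	uint32_t age_idx;   /* valid only when FLOW_FLAG_AGE_IDX is set */
};

static void
flow_create(struct flow *f, uint32_t cnt_id)
{
	/* No zeroing of the whole buffer - only the flags word is reset... */
	f->flags = 0;
	/* ...and each optional field is marked valid only when it is set. */
	f->cnt_id = cnt_id;
	f->flags |= FLOW_FLAG_CNT_ID;
}

static void
flow_destroy(struct flow *f)
{
	/* Optional fields are released only when their flag is set. */
	if (f->flags & FLOW_FLAG_CNT_ID)
		f->cnt_id = 0;  /* stand-in for releasing the counter */
	if (f->flags & FLOW_FLAG_AGE_IDX)
		f->age_idx = 0; /* stand-in for releasing the AGE parameter */
}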

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow.h    | 24 ++++++++-
 drivers/net/mlx5/mlx5_flow_hw.c | 93 +++++++++++++++++++++------------
 2 files changed, 83 insertions(+), 34 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 1c67d8dd35..a01e970d04 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1267,6 +1267,26 @@ enum {
 	MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_MOVE,
 };
 
+enum {
+	MLX5_FLOW_HW_FLOW_FLAG_CNT_ID = RTE_BIT32(0),
+	MLX5_FLOW_HW_FLOW_FLAG_FATE_JUMP = RTE_BIT32(1),
+	MLX5_FLOW_HW_FLOW_FLAG_FATE_HRXQ = RTE_BIT32(2),
+	MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX = RTE_BIT32(3),
+	MLX5_FLOW_HW_FLOW_FLAG_MTR_ID = RTE_BIT32(4),
+	MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR = RTE_BIT32(5),
+	MLX5_FLOW_HW_FLOW_FLAG_UPD_FLOW = RTE_BIT32(6),
+};
+
+#define MLX5_FLOW_HW_FLOW_FLAGS_ALL ( \
+		MLX5_FLOW_HW_FLOW_FLAG_CNT_ID | \
+		MLX5_FLOW_HW_FLOW_FLAG_FATE_JUMP | \
+		MLX5_FLOW_HW_FLOW_FLAG_FATE_HRXQ | \
+		MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX | \
+		MLX5_FLOW_HW_FLOW_FLAG_MTR_ID | \
+		MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR | \
+		MLX5_FLOW_HW_FLOW_FLAG_UPD_FLOW \
+	)
+
 #ifdef PEDANTIC
 #pragma GCC diagnostic ignored "-Wpedantic"
 #endif
@@ -1283,8 +1303,8 @@ struct rte_flow_hw {
 	uint32_t res_idx;
 	/** HWS flow rule index passed to mlx5dr. */
 	uint32_t rule_idx;
-	/** Fate action type. */
-	uint32_t fate_type;
+	/** Which flow fields (inline or in auxiliary struct) are used. */
+	uint32_t flags;
 	/** Ongoing flow operation type. */
 	uint8_t operation_type;
 	/** Index of pattern template this flow is based on. */
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 3252f76e64..4e4beb4428 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -2832,6 +2832,7 @@ flow_hw_shared_action_construct(struct rte_eth_dev *dev, uint32_t queue,
 				&rule_act->action,
 				&rule_act->counter.offset))
 			return -1;
+		flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_CNT_ID;
 		flow->cnt_id = act_idx;
 		break;
 	case MLX5_INDIRECT_ACTION_TYPE_AGE:
@@ -2841,6 +2842,7 @@ flow_hw_shared_action_construct(struct rte_eth_dev *dev, uint32_t queue,
 		 * it in flow destroy.
 		 */
 		mlx5_flow_hw_aux_set_age_idx(flow, aux, act_idx);
+		flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX;
 		if (action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT)
 			/*
 			 * The mutual update for idirect AGE & COUNT will be
@@ -2856,6 +2858,7 @@ flow_hw_shared_action_construct(struct rte_eth_dev *dev, uint32_t queue,
 						  &param->queue_id, &age_cnt,
 						  idx) < 0)
 				return -1;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_CNT_ID;
 			flow->cnt_id = age_cnt;
 			param->nb_cnts++;
 		} else {
@@ -3160,7 +3163,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			rule_acts[act_data->action_dst].action =
 			(!!attr.group) ? jump->hws_action : jump->root_action;
 			flow->jump = jump;
-			flow->fate_type = MLX5_FLOW_FATE_JUMP;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_FATE_JUMP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_RSS:
 		case RTE_FLOW_ACTION_TYPE_QUEUE:
@@ -3171,7 +3174,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				return -1;
 			rule_acts[act_data->action_dst].action = hrxq->action;
 			flow->hrxq = hrxq;
-			flow->fate_type = MLX5_FLOW_FATE_QUEUE;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_FATE_HRXQ;
 			break;
 		case MLX5_RTE_FLOW_ACTION_TYPE_RSS:
 			item_flags = table->its[it_idx]->item_flags;
@@ -3250,7 +3253,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 					(!!attr.group) ? jump->hws_action :
 							 jump->root_action;
 			flow->jump = jump;
-			flow->fate_type = MLX5_FLOW_FATE_JUMP;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_FATE_JUMP;
 			if (mlx5_aso_mtr_wait(priv->sh, MLX5_HW_INV_QUEUE, aso_mtr))
 				return -1;
 			break;
@@ -3270,6 +3273,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			if (age_idx == 0)
 				return -rte_errno;
 			mlx5_flow_hw_aux_set_age_idx(flow, aux, age_idx);
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX;
 			if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT)
 				/*
 				 * When AGE uses indirect counter, no need to
@@ -3292,6 +3296,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				 );
 			if (ret != 0)
 				return ret;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_CNT_ID;
 			flow->cnt_id = cnt_id;
 			break;
 		case MLX5_RTE_FLOW_ACTION_TYPE_COUNT:
@@ -3303,6 +3308,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				 );
 			if (ret != 0)
 				return ret;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_CNT_ID;
 			flow->cnt_id = act_data->shared_counter.id;
 			break;
 		case RTE_FLOW_ACTION_TYPE_CONNTRACK:
@@ -3335,13 +3341,18 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				return ret;
 			aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 			mlx5_flow_hw_aux_set_mtr_id(flow, aux, mtr_idx);
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_MTR_ID;
 			break;
 		default:
 			break;
 		}
 	}
 	if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT) {
+		/* If indirect count is used, then CNT_ID flag should be set. */
+		MLX5_ASSERT(flow->flags & MLX5_FLOW_HW_FLOW_FLAG_CNT_ID);
 		if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_AGE) {
+			/* If indirect AGE is used, then AGE_IDX flag should be set. */
+			MLX5_ASSERT(flow->flags & MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX);
 			aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 			age_idx = mlx5_flow_hw_aux_get_age_idx(flow, aux) &
 				  MLX5_HWS_AGE_IDX_MASK;
@@ -3379,8 +3390,10 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				flow->res_idx - 1;
 		rule_acts[hw_acts->push_remove_pos].ipv6_ext.header = ap->ipv6_push_data;
 	}
-	if (mlx5_hws_cnt_id_valid(hw_acts->cnt_id))
+	if (mlx5_hws_cnt_id_valid(hw_acts->cnt_id)) {
+		flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_CNT_ID;
 		flow->cnt_id = hw_acts->cnt_id;
+	}
 	return 0;
 }
 
@@ -3493,7 +3506,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 				   "Port must be started before enqueueing flow operations");
 		return NULL;
 	}
-	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
+	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
 	if (!flow)
 		goto error;
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
@@ -3512,6 +3525,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	} else {
 		flow->res_idx = flow_idx;
 	}
+	flow->flags = 0;
 	/*
 	 * Set the flow operation type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
@@ -3563,6 +3577,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 					 (struct mlx5dr_rule *)flow->rule);
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
 		aux->matcher_selector = selector;
+		flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR;
 	}
 	if (likely(!ret)) {
 		flow_hw_q_inc_flow_ops(priv, queue);
@@ -3636,7 +3651,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 				   "Flow rule index exceeds table size");
 		return NULL;
 	}
-	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
+	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
 	if (!flow)
 		goto error;
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
@@ -3655,6 +3670,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	} else {
 		flow->res_idx = flow_idx;
 	}
+	flow->flags = 0;
 	/*
 	 * Set the flow operation type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
@@ -3696,6 +3712,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 					 (struct mlx5dr_rule *)flow->rule);
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
 		aux->matcher_selector = selector;
+		flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR;
 	}
 	if (likely(!ret)) {
 		flow_hw_q_inc_flow_ops(priv, queue);
@@ -3783,6 +3800,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	} else {
 		nf->res_idx = of->res_idx;
 	}
+	nf->flags = 0;
 	/* Indicate the construction function to set the proper fields. */
 	nf->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE;
 	/*
@@ -3812,6 +3830,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	 */
 	of->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE;
 	of->user_data = user_data;
+	of->flags |= MLX5_FLOW_HW_FLOW_FLAG_UPD_FLOW;
 	rule_attr.user_data = of;
 	ret = mlx5dr_rule_action_update((struct mlx5dr_rule *)of->rule,
 					action_template_index, rule_acts, &rule_attr);
@@ -3906,13 +3925,14 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
 	uint32_t *cnt_queue;
 	uint32_t age_idx = aux->orig.age_idx;
 
+	MLX5_ASSERT(flow->flags & MLX5_FLOW_HW_FLOW_FLAG_CNT_ID);
 	if (mlx5_hws_cnt_is_shared(priv->hws_cpool, flow->cnt_id)) {
-		if (age_idx && !mlx5_hws_age_is_indirect(age_idx)) {
+		if ((flow->flags & MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX) &&
+		    !mlx5_hws_age_is_indirect(age_idx)) {
 			/* Remove this AGE parameter from indirect counter. */
 			mlx5_hws_cnt_age_set(priv->hws_cpool, flow->cnt_id, 0);
 			/* Release the AGE parameter. */
 			mlx5_hws_age_action_destroy(priv, age_idx, error);
-			mlx5_flow_hw_aux_set_age_idx(flow, aux, 0);
 		}
 		return;
 	}
@@ -3920,8 +3940,7 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
 	cnt_queue = mlx5_hws_cnt_is_pool_shared(priv) ? NULL : &queue;
 	/* Put the counter first to reduce the race risk in BG thread. */
 	mlx5_hws_cnt_pool_put(priv->hws_cpool, cnt_queue, &flow->cnt_id);
-	flow->cnt_id = 0;
-	if (age_idx) {
+	if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX) {
 		if (mlx5_hws_age_is_indirect(age_idx)) {
 			uint32_t idx = age_idx & MLX5_HWS_AGE_IDX_MASK;
 
@@ -3930,7 +3949,6 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
 			/* Release the AGE parameter. */
 			mlx5_hws_age_action_destroy(priv, age_idx, error);
 		}
-		mlx5_flow_hw_aux_set_age_idx(flow, aux, age_idx);
 	}
 }
 
@@ -4060,34 +4078,35 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_aso_mtr_pool *pool = priv->hws_mpool;
 	struct rte_flow_template_table *table = flow->table;
-	struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 	/* Release the original resource index in case of update. */
 	uint32_t res_idx = flow->res_idx;
 
-	if (flow->fate_type == MLX5_FLOW_FATE_JUMP)
-		flow_hw_jump_release(dev, flow->jump);
-	else if (flow->fate_type == MLX5_FLOW_FATE_QUEUE)
-		mlx5_hrxq_obj_release(dev, flow->hrxq);
-	if (mlx5_hws_cnt_id_valid(flow->cnt_id))
-		flow_hw_age_count_release(priv, queue,
-					  flow, error);
-	if (aux->orig.mtr_id) {
-		mlx5_ipool_free(pool->idx_pool,	aux->orig.mtr_id);
-		aux->orig.mtr_id = 0;
-	}
-	if (flow->operation_type != MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE) {
-		if (table->resource)
-			mlx5_ipool_free(table->resource, res_idx);
-		mlx5_ipool_free(table->flow, flow->idx);
-	} else {
+	if (flow->flags & MLX5_FLOW_HW_FLOW_FLAGS_ALL) {
 		struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
-		struct rte_flow_hw *upd_flow = &aux->upd_flow;
 
-		rte_memcpy(flow, upd_flow, offsetof(struct rte_flow_hw, rule));
-		aux->orig = aux->upd;
-		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE;
+		if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_FATE_JUMP)
+			flow_hw_jump_release(dev, flow->jump);
+		else if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_FATE_HRXQ)
+			mlx5_hrxq_obj_release(dev, flow->hrxq);
+		if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_CNT_ID)
+			flow_hw_age_count_release(priv, queue, flow, error);
+		if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_MTR_ID)
+			mlx5_ipool_free(pool->idx_pool, aux->orig.mtr_id);
+		if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_UPD_FLOW) {
+			struct rte_flow_hw *upd_flow = &aux->upd_flow;
+
+			rte_memcpy(flow, upd_flow, offsetof(struct rte_flow_hw, rule));
+			aux->orig = aux->upd;
+			flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE;
+			if (table->resource)
+				mlx5_ipool_free(table->resource, res_idx);
+		}
+	}
+	if (flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_DESTROY ||
+	    flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_DESTROY) {
 		if (table->resource)
 			mlx5_ipool_free(table->resource, res_idx);
+		mlx5_ipool_free(table->flow, flow->idx);
 	}
 }
 
@@ -4102,6 +4121,7 @@ hw_cmpl_resizable_tbl(struct rte_eth_dev *dev,
 	uint32_t selector = aux->matcher_selector;
 	uint32_t other_selector = (selector + 1) & 1;
 
+	MLX5_ASSERT(flow->flags & MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR);
 	switch (flow->operation_type) {
 	case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE:
 		rte_atomic_fetch_add_explicit
@@ -11275,10 +11295,18 @@ flow_hw_query(struct rte_eth_dev *dev, struct rte_flow *flow,
 		case RTE_FLOW_ACTION_TYPE_VOID:
 			break;
 		case RTE_FLOW_ACTION_TYPE_COUNT:
+			if (!(hw_flow->flags & MLX5_FLOW_HW_FLOW_FLAG_CNT_ID))
+				return rte_flow_error_set(error, EINVAL,
+							  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+							  "counter not defined in the rule");
 			ret = flow_hw_query_counter(dev, hw_flow->cnt_id, data,
 						    error);
 			break;
 		case RTE_FLOW_ACTION_TYPE_AGE:
+			if (!(hw_flow->flags & MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX))
+				return rte_flow_error_set(error, EINVAL,
+							  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+							  "age data not available");
 			aux = mlx5_flow_hw_aux(dev->data->port_id, hw_flow);
 			ret = flow_hw_query_age(dev, mlx5_flow_hw_aux_get_age_idx(hw_flow, aux),
 						data, error);
@@ -12571,6 +12599,7 @@ flow_hw_update_resized(struct rte_eth_dev *dev, uint32_t queue,
 		.burst = attr->postpone,
 	};
 
+	MLX5_ASSERT(hw_flow->flags & MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR);
 	/**
 	 * mlx5dr_matcher_resize_rule_move() accepts original table matcher -
 	 * the one that was used BEFORE table resize.
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 11/11] net/mlx5: remove unneeded device status checking
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
                   ` (9 preceding siblings ...)
  2024-02-28 17:00 ` [PATCH 10/11] net/mlx5: reuse flow fields Dariusz Sosnowski
@ 2024-02-28 17:00 ` Dariusz Sosnowski
  2024-02-29  8:52 ` [PATCH 00/11] net/mlx5: flow insertion performance improvements Ori Kam
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
  12 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-28 17:00 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao, stable

From: Bing Zhao <bingz@nvidia.com>

The flow rule can be inserted even before the device is started. The
only exception is a rule with a queue or RSS action.

For the other interfaces of the template API, the start status is not
checked. Checking it would cause cache misses or evictions, since the
flag is located on another cache line.

Fixes: f1fecffa88df ("net/mlx5: support Direct Rules action template API")
Cc: stable@dpdk.org

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow_hw.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 4e4beb4428..8481a9b7f0 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -3501,11 +3501,6 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	uint32_t res_idx = 0;
 	int ret;
 
-	if (unlikely((!dev->data->dev_started))) {
-		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
-				   "Port must be started before enqueueing flow operations");
-		return NULL;
-	}
 	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
 	if (!flow)
 		goto error;
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 00/11] net/mlx5: flow insertion performance improvements
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
                   ` (10 preceding siblings ...)
  2024-02-28 17:00 ` [PATCH 11/11] net/mlx5: remove unneeded device status checking Dariusz Sosnowski
@ 2024-02-29  8:52 ` Ori Kam
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
  12 siblings, 0 replies; 26+ messages in thread
From: Ori Kam @ 2024-02-29  8:52 UTC (permalink / raw)
  To: Dariusz Sosnowski, Slava Ovsiienko, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

Hi Dariusz,

> -----Original Message-----
> From: Dariusz Sosnowski <dsosnowski@nvidia.com>
> Sent: Wednesday, February 28, 2024 7:01 PM
> 
> Goal of this patchset is to improve the throughput of flow insertion
> and deletion in mlx5 PMD when HW Steering flow engine is used.
> 
> - Patch 1 - Use preallocated per-queue, per-actions template buffer
>   for storing translated flow actions, instead of allocating and
>   filling it on demand, on each flow operation.
> - Patches 2-4 - Make resource index allocation optional. This allocation
>   will be skipped when it is not required by the created template table.
> - Patches 5-7 - Reduce memory footprint of the internal flow queue.
> - Patch 8 - Remove indirection between flow job and flow itself,
>   by using flow as an operation container.
> - Patches 9-10 - Reduce memory footpring of flow struct by moving
>   rarely used flow fields outside of the main flow struct.
>   These fields will accesses only when needed.
>   Also remove unneeded `zmalloc` usage.
> - Patch 11 - Remove unneeded device status check in flow create.
> 
> In general all of these changes result in the following improvements
> (all numbers are averaged Kflows/sec):
> 
> |              | Insertion) |   +%   | Deletion |   +%  |
> |--------------|:----------:|:------:|:--------:|:-----:|
> | baseline     |   6338.7   |        |  9739.6  |       |
> | improvements |   6978.8   | +10.1% |  10432.4 | +7.1% |
> 
> The basic benchmark was run on ConnectX-6 Dx (22.40.1000),
> on the system with Intel Xeon Platinum 8380 CPU.
> 
> Bing Zhao (2):
>   net/mlx5: skip the unneeded resource index allocation
>   net/mlx5: remove unneeded device status checking
> 
> Dariusz Sosnowski (7):
>   net/mlx5: allocate local DR rule action buffers
>   net/mlx5: remove action params from job
>   net/mlx5: remove flow pattern from job
>   net/mlx5: remove updated flow from job
>   net/mlx5: use flow as operation container
>   net/mlx5: move rarely used flow fields outside
>   net/mlx5: reuse flow fields
> 
> Erez Shitrit (2):
>   net/mlx5/hws: add check for matcher rule update support
>   net/mlx5/hws: add check if matcher contains complex rules
> 
>  drivers/net/mlx5/hws/mlx5dr.h         |  16 +
>  drivers/net/mlx5/hws/mlx5dr_action.c  |   6 +
>  drivers/net/mlx5/hws/mlx5dr_action.h  |   2 +
>  drivers/net/mlx5/hws/mlx5dr_matcher.c |  29 +
>  drivers/net/mlx5/mlx5.h               |  29 +-
>  drivers/net/mlx5/mlx5_flow.h          | 128 ++++-
>  drivers/net/mlx5/mlx5_flow_hw.c       | 794 ++++++++++++++++----------
>  7 files changed, 666 insertions(+), 338 deletions(-)
> 
> --
> 2.39.2

Series-acked-by:  Ori Kam <orika@nvidia.com>
Best,
Ori


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 00/11] net/mlx5: flow insertion performance improvements
  2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
                   ` (11 preceding siblings ...)
  2024-02-29  8:52 ` [PATCH 00/11] net/mlx5: flow insertion performance improvements Ori Kam
@ 2024-02-29 11:51 ` Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 01/11] net/mlx5: allocate local DR rule action buffers Dariusz Sosnowski
                     ` (11 more replies)
  12 siblings, 12 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

Goal of this patchset is to improve the throughput of flow insertion
and deletion in mlx5 PMD when HW Steering flow engine is used.

- Patch 1 - Use preallocated per-queue, per-actions template buffer
  for storing translated flow actions, instead of allocating and
  filling it on demand, on each flow operation.
- Patches 2-4 - Make resource index allocation optional. This allocation
  will be skipped when it is not required by the created template table.
- Patches 5-7 - Reduce memory footprint of the internal flow queue.
- Patch 8 - Remove indirection between flow job and flow itself,
  by using flow as an operation container.
- Patches 9-10 - Reduce memory footprint of flow struct by moving
  rarely used flow fields outside of the main flow struct.
  These fields will be accessed only when needed.
  Also remove unneeded `zmalloc` usage.
- Patch 11 - Remove unneeded device status check in flow create.

In general all of these changes result in the following improvements
(all numbers are averaged Kflows/sec):

|              | Insertion  |   +%   | Deletion |   +%  |
|--------------|:----------:|:------:|:--------:|:-----:|
| baseline     |   6338.7   |        |  9739.6  |       |
| improvements |   6978.8   | +10.1% |  10432.4 | +7.1% |

The basic benchmark was run on ConnectX-6 Dx (22.40.1000),
on a system with an Intel Xeon Platinum 8380 CPU.

v2:

- Rebased.
- Applied Acked-by tags from previous version.

Bing Zhao (2):
  net/mlx5: skip the unneeded resource index allocation
  net/mlx5: remove unneeded device status checking

Dariusz Sosnowski (7):
  net/mlx5: allocate local DR rule action buffers
  net/mlx5: remove action params from job
  net/mlx5: remove flow pattern from job
  net/mlx5: remove updated flow from job
  net/mlx5: use flow as operation container
  net/mlx5: move rarely used flow fields outside
  net/mlx5: reuse flow fields

Erez Shitrit (2):
  net/mlx5/hws: add check for matcher rule update support
  net/mlx5/hws: add check if matcher contains complex rules

 drivers/net/mlx5/hws/mlx5dr.h         |  16 +
 drivers/net/mlx5/hws/mlx5dr_action.c  |   6 +
 drivers/net/mlx5/hws/mlx5dr_action.h  |   2 +
 drivers/net/mlx5/hws/mlx5dr_matcher.c |  29 +
 drivers/net/mlx5/mlx5.h               |  29 +-
 drivers/net/mlx5/mlx5_flow.h          | 128 ++++-
 drivers/net/mlx5/mlx5_flow_hw.c       | 794 ++++++++++++++++----------
 7 files changed, 666 insertions(+), 338 deletions(-)

--
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 01/11] net/mlx5: allocate local DR rule action buffers
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
@ 2024-02-29 11:51   ` Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 02/11] net/mlx5/hws: add check for matcher rule update support Dariusz Sosnowski
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

The goal of this patch is to remove unnecessary copying of precalculated
mlx5dr_rule_action structures used to create HWS flow rules.

Before this patch, during template table creation, an array of these
structures was calculated for each actions template used.
Each of these structures contained either a full or a partial action
definition (depending on the mask configuration).
During flow creation, this array was copied to the stack and then passed
to mlx5dr_rule_create().

This patch removes this copy by implementing the following:

- Allocate an array of mlx5dr_rule_action structures for each actions
  template and queue.
- Populate them with precalculated data from relevant actions templates.
- During flow creation, construction of unmasked actions works on an
  array dedicated for the specific queue and actions template.
- Pass this buffer to mlx5dr_rule_create directly.
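
For illustration only - a minimal, standalone sketch of the sizing and
indexing scheme described above. NB_QUEUE, NB_AT and the container type
are hypothetical stand-ins; the real layout is the rule_acts[] flexible
array member added to struct rte_flow_template_table below.

    #include <stdio.h>
    #include <stdint.h>

    #define NB_QUEUE 4       /* hypothetical number of flow queues */
    #define NB_AT    2       /* hypothetical number of actions templates */
    #define MAX_ACTS 16      /* stand-in for MLX5_HW_MAX_ACTS */

    /* Stand-in for the per-(actions template, queue) DR action container. */
    struct rule_action_container {
            uintptr_t acts[MAX_ACTS];
    };

    int main(void)
    {
            /* Extra table memory: one container per actions template, per queue. */
            size_t vla_bytes = (size_t)NB_AT * NB_QUEUE *
                               sizeof(struct rule_action_container);
            /* Per-operation lookup, mirroring flow_hw_get_dr_action_buffer(). */
            unsigned int at_idx = 1, queue = 3;
            unsigned int offset = at_idx * NB_QUEUE + queue;

            printf("VLA bytes: %zu, container for (at=%u, q=%u): offset %u\n",
                   vla_bytes, at_idx, queue, offset);
            return 0;
    }

Because each (actions template, queue) pair owns its own container,
flow creations enqueued on different queues never touch the same buffer,
which is what makes dropping the per-flow copy safe.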

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow.h    | 13 +++++++++
 drivers/net/mlx5/mlx5_flow_hw.c | 51 +++++++++++++++++++++++++++++----
 2 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 7aa24f7c52..02af0a08fa 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1566,6 +1566,10 @@ struct mlx5_matcher_info {
 	uint32_t refcnt;
 };
 
+struct mlx5_dr_rule_action_container {
+	struct mlx5dr_rule_action acts[MLX5_HW_MAX_ACTS];
+} __rte_cache_aligned;
+
 struct rte_flow_template_table {
 	LIST_ENTRY(rte_flow_template_table) next;
 	struct mlx5_flow_group *grp; /* The group rte_flow_template_table uses. */
@@ -1585,6 +1589,15 @@ struct rte_flow_template_table {
 	uint32_t refcnt; /* Table reference counter. */
 	struct mlx5_tbl_multi_pattern_ctx mpctx;
 	struct mlx5dr_matcher_attr matcher_attr;
+	/**
+	 * Variable length array of containers containing precalculated templates of DR actions
+	 * arrays. This array is allocated at template table creation time and contains
+	 * one container per each queue, per each actions template.
+	 * Essentially rule_acts is a 2-dimensional array indexed with (AT index, queue) pair.
+	 * Each container will provide a local "queue buffer" to work on for flow creation
+	 * operations when using a given actions template.
+	 */
+	struct mlx5_dr_rule_action_container rule_acts[];
 };
 
 static __rte_always_inline struct mlx5dr_matcher *
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 9620b7f576..ef91a23a9b 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -2512,6 +2512,34 @@ __flow_hw_actions_translate(struct rte_eth_dev *dev,
 				  "fail to create rte table");
 }
 
+static __rte_always_inline struct mlx5dr_rule_action *
+flow_hw_get_dr_action_buffer(struct mlx5_priv *priv,
+			     struct rte_flow_template_table *table,
+			     uint8_t action_template_index,
+			     uint32_t queue)
+{
+	uint32_t offset = action_template_index * priv->nb_queue + queue;
+
+	return &table->rule_acts[offset].acts[0];
+}
+
+static void
+flow_hw_populate_rule_acts_caches(struct rte_eth_dev *dev,
+				  struct rte_flow_template_table *table,
+				  uint8_t at_idx)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	uint32_t q;
+
+	for (q = 0; q < priv->nb_queue; ++q) {
+		struct mlx5dr_rule_action *rule_acts =
+				flow_hw_get_dr_action_buffer(priv, table, at_idx, q);
+
+		rte_memcpy(rule_acts, table->ats[at_idx].acts.rule_acts,
+			   sizeof(table->ats[at_idx].acts.rule_acts));
+	}
+}
+
 /**
  * Translate rte_flow actions to DR action.
  *
@@ -2539,6 +2567,7 @@ flow_hw_actions_translate(struct rte_eth_dev *dev,
 						tbl->ats[i].action_template,
 						&tbl->mpctx, error))
 			goto err;
+		flow_hw_populate_rule_acts_caches(dev, tbl, i);
 	}
 	ret = mlx5_tbl_multi_pattern_process(dev, tbl, &tbl->mpctx.segments[0],
 					     rte_log2_u32(tbl->cfg.attr.nb_flows),
@@ -2928,7 +2957,6 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	struct mlx5_aso_mtr *aso_mtr;
 	struct mlx5_multi_pattern_segment *mp_segment = NULL;
 
-	rte_memcpy(rule_acts, hw_acts->rule_acts, sizeof(*rule_acts) * at->dr_actions_num);
 	attr.group = table->grp->group_id;
 	ft_flag = mlx5_hw_act_flag[!!table->grp->group_id][table->type];
 	if (table->type == MLX5DR_TABLE_TYPE_FDB) {
@@ -3335,7 +3363,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		.user_data = user_data,
 		.burst = attr->postpone,
 	};
-	struct mlx5dr_rule_action rule_acts[MLX5_HW_MAX_ACTS];
+	struct mlx5dr_rule_action *rule_acts;
 	struct rte_flow_hw *flow = NULL;
 	struct mlx5_hw_q_job *job = NULL;
 	const struct rte_flow_item *rule_items;
@@ -3358,6 +3386,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	mlx5_ipool_malloc(table->resource, &res_idx);
 	if (!res_idx)
 		goto error;
+	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
 	 * when free the flow afterward.
@@ -3479,7 +3508,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 		.user_data = user_data,
 		.burst = attr->postpone,
 	};
-	struct mlx5dr_rule_action rule_acts[MLX5_HW_MAX_ACTS];
+	struct mlx5dr_rule_action *rule_acts;
 	struct rte_flow_hw *flow = NULL;
 	struct mlx5_hw_q_job *job = NULL;
 	uint32_t flow_idx = 0;
@@ -3501,6 +3530,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	mlx5_ipool_malloc(table->resource, &res_idx);
 	if (!res_idx)
 		goto error;
+	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
 	 * when free the flow afterwards.
@@ -3610,7 +3640,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		.user_data = user_data,
 		.burst = attr->postpone,
 	};
-	struct mlx5dr_rule_action rule_acts[MLX5_HW_MAX_ACTS];
+	struct mlx5dr_rule_action *rule_acts;
 	struct rte_flow_hw *of = (struct rte_flow_hw *)flow;
 	struct rte_flow_hw *nf;
 	struct rte_flow_template_table *table = of->table;
@@ -3628,6 +3658,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		goto error;
 	nf = job->upd_flow;
 	memset(nf, 0, sizeof(struct rte_flow_hw));
+	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
 	 * when free the flow afterwards.
@@ -4354,6 +4385,7 @@ mlx5_hw_build_template_table(struct rte_eth_dev *dev,
 			i++;
 			goto at_error;
 		}
+		flow_hw_populate_rule_acts_caches(dev, tbl, i);
 	}
 	tbl->nb_action_templates = nb_action_templates;
 	if (mlx5_is_multi_pattern_active(&tbl->mpctx)) {
@@ -4442,6 +4474,7 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	uint32_t i = 0, max_tpl = MLX5_HW_TBL_MAX_ITEM_TEMPLATE;
 	uint32_t nb_flows = rte_align32pow2(attr->nb_flows);
 	bool port_started = !!dev->data->dev_started;
+	size_t tbl_mem_size;
 	int err;
 
 	/* HWS layer accepts only 1 item template with root table. */
@@ -4461,8 +4494,16 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 		rte_errno = EINVAL;
 		goto error;
 	}
+	/*
+	 * Amount of memory required for rte_flow_template_table struct:
+	 * - Size of the struct itself.
+	 * - VLA of DR rule action containers at the end =
+	 *     number of actions templates * number of queues * size of DR rule actions container.
+	 */
+	tbl_mem_size = sizeof(*tbl);
+	tbl_mem_size += nb_action_templates * priv->nb_queue * sizeof(tbl->rule_acts[0]);
 	/* Allocate the table memory. */
-	tbl = mlx5_malloc(MLX5_MEM_ZERO, sizeof(*tbl), 0, rte_socket_id());
+	tbl = mlx5_malloc(MLX5_MEM_ZERO, tbl_mem_size, RTE_CACHE_LINE_SIZE, rte_socket_id());
 	if (!tbl)
 		goto error;
 	tbl->cfg = *table_cfg;
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 02/11] net/mlx5/hws: add check for matcher rule update support
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 01/11] net/mlx5: allocate local DR rule action buffers Dariusz Sosnowski
@ 2024-02-29 11:51   ` Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 03/11] net/mlx5/hws: add check if matcher contains complex rules Dariusz Sosnowski
                     ` (9 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao, Erez Shitrit

From: Erez Shitrit <erezsh@nvidia.com>

The user wants to know, before trying to update a rule, whether the
matcher that holds the original rule supports updating.
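
A hedged usage sketch (not part of the patch): the caller consults the
new helper before enqueueing an in-place update and falls back to
destroy-and-recreate otherwise. Signatures follow mlx5dr.h as extended
by this series; the wrapper function and its error handling are
illustrative assumptions.

    #include <errno.h>
    #include "mlx5dr.h"

    /* Update a rule's actions only if its matcher supports updating. */
    static int
    rule_update_if_possible(struct mlx5dr_matcher *matcher,
                            struct mlx5dr_rule *rule, uint8_t at_idx,
                            struct mlx5dr_rule_action *rule_acts,
                            struct mlx5dr_rule_attr *attr)
    {
            if (!mlx5dr_matcher_is_updatable(matcher))
                    return -ENOTSUP; /* e.g. root, resizable or FW-WQE matcher */
            return mlx5dr_rule_action_update(rule, at_idx, rule_acts, attr);
    }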

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr.h         |  8 ++++++++
 drivers/net/mlx5/hws/mlx5dr_matcher.c | 12 ++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/drivers/net/mlx5/hws/mlx5dr.h b/drivers/net/mlx5/hws/mlx5dr.h
index 8441ae97e9..c5824a6480 100644
--- a/drivers/net/mlx5/hws/mlx5dr.h
+++ b/drivers/net/mlx5/hws/mlx5dr.h
@@ -492,6 +492,14 @@ int mlx5dr_matcher_resize_rule_move(struct mlx5dr_matcher *src_matcher,
 				    struct mlx5dr_rule *rule,
 				    struct mlx5dr_rule_attr *attr);
 
+/* Check matcher ability to update existing rules
+ *
+ * @param[in] matcher
+ *	that the rule belongs to.
+ * @return true when the matcher is updatable false otherwise.
+ */
+bool mlx5dr_matcher_is_updatable(struct mlx5dr_matcher *matcher);
+
 /* Get the size of the rule handle (mlx5dr_rule) to be used on rule creation.
  *
  * @return size in bytes of rule handle struct.
diff --git a/drivers/net/mlx5/hws/mlx5dr_matcher.c b/drivers/net/mlx5/hws/mlx5dr_matcher.c
index 8a74a1ed7d..4e4da8e8f6 100644
--- a/drivers/net/mlx5/hws/mlx5dr_matcher.c
+++ b/drivers/net/mlx5/hws/mlx5dr_matcher.c
@@ -1530,6 +1530,18 @@ int mlx5dr_match_template_destroy(struct mlx5dr_match_template *mt)
 	return 0;
 }
 
+bool mlx5dr_matcher_is_updatable(struct mlx5dr_matcher *matcher)
+{
+	if (mlx5dr_table_is_root(matcher->tbl) ||
+	    mlx5dr_matcher_req_fw_wqe(matcher) ||
+	    mlx5dr_matcher_is_resizable(matcher) ||
+	    (!matcher->attr.optimize_using_rule_idx &&
+	    !mlx5dr_matcher_is_insert_by_idx(matcher)))
+		return false;
+
+	return true;
+}
+
 static int mlx5dr_matcher_resize_precheck(struct mlx5dr_matcher *src_matcher,
 					  struct mlx5dr_matcher *dst_matcher)
 {
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 03/11] net/mlx5/hws: add check if matcher contains complex rules
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 01/11] net/mlx5: allocate local DR rule action buffers Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 02/11] net/mlx5/hws: add check for matcher rule update support Dariusz Sosnowski
@ 2024-02-29 11:51   ` Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 04/11] net/mlx5: skip the unneeded resource index allocation Dariusz Sosnowski
                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao, Erez Shitrit

From: Erez Shitrit <erezsh@nvidia.com>

The function returns true if the matcher may contain complex rules,
that is, rules which need more than one write to the HW in order to
be created.

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr.h         |  8 ++++++++
 drivers/net/mlx5/hws/mlx5dr_action.c  |  6 ++++++
 drivers/net/mlx5/hws/mlx5dr_action.h  |  2 ++
 drivers/net/mlx5/hws/mlx5dr_matcher.c | 17 +++++++++++++++++
 4 files changed, 33 insertions(+)

diff --git a/drivers/net/mlx5/hws/mlx5dr.h b/drivers/net/mlx5/hws/mlx5dr.h
index c5824a6480..36ecccf9ac 100644
--- a/drivers/net/mlx5/hws/mlx5dr.h
+++ b/drivers/net/mlx5/hws/mlx5dr.h
@@ -500,6 +500,14 @@ int mlx5dr_matcher_resize_rule_move(struct mlx5dr_matcher *src_matcher,
  */
 bool mlx5dr_matcher_is_updatable(struct mlx5dr_matcher *matcher);
 
+/* Check whether matcher might contain rules that need complex structure
+ *
+ * @param[in] matcher
+ *	that the rule belongs to.
+ * @return true when the matcher may contain such rules, false otherwise.
+ */
+bool mlx5dr_matcher_is_dependent(struct mlx5dr_matcher *matcher);
+
 /* Get the size of the rule handle (mlx5dr_rule) to be used on rule creation.
  *
  * @return size in bytes of rule handle struct.
diff --git a/drivers/net/mlx5/hws/mlx5dr_action.c b/drivers/net/mlx5/hws/mlx5dr_action.c
index 96cad553aa..084d4d606e 100644
--- a/drivers/net/mlx5/hws/mlx5dr_action.c
+++ b/drivers/net/mlx5/hws/mlx5dr_action.c
@@ -3686,6 +3686,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->flags |= ASF_SINGLE1 | ASF_REMOVE;
 			setter->set_single = &mlx5dr_action_setter_ipv6_route_ext_pop;
 			setter->idx_single = i;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_PUSH_IPV6_ROUTE_EXT:
@@ -3712,6 +3713,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->set_double = &mlx5dr_action_setter_ipv6_route_ext_mhdr;
 			setter->idx_double = i;
 			setter->extra_data = 2;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_MODIFY_HDR:
@@ -3720,6 +3722,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->flags |= ASF_DOUBLE | ASF_MODIFY;
 			setter->set_double = &mlx5dr_action_setter_modify_header;
 			setter->idx_double = i;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_ASO_METER:
@@ -3747,6 +3750,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->flags |= ASF_DOUBLE | ASF_INSERT;
 			setter->set_double = &mlx5dr_action_setter_insert_ptr;
 			setter->idx_double = i;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_REFORMAT_L2_TO_TNL_L3:
@@ -3757,6 +3761,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->idx_double = i;
 			setter->set_single = &mlx5dr_action_setter_common_decap;
 			setter->idx_single = i;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_REFORMAT_TNL_L3_TO_L2:
@@ -3765,6 +3770,7 @@ int mlx5dr_action_template_process(struct mlx5dr_action_template *at)
 			setter->flags |= ASF_DOUBLE | ASF_MODIFY | ASF_INSERT;
 			setter->set_double = &mlx5dr_action_setter_tnl_l3_to_l2;
 			setter->idx_double = i;
+			at->need_dep_write = true;
 			break;
 
 		case MLX5DR_ACTION_TYP_TAG:
diff --git a/drivers/net/mlx5/hws/mlx5dr_action.h b/drivers/net/mlx5/hws/mlx5dr_action.h
index 064c18a90c..57e059a572 100644
--- a/drivers/net/mlx5/hws/mlx5dr_action.h
+++ b/drivers/net/mlx5/hws/mlx5dr_action.h
@@ -151,6 +151,8 @@ struct mlx5dr_action_template {
 	uint8_t num_of_action_stes;
 	uint8_t num_actions;
 	uint8_t only_term;
+	/* indicates rule might require dependent wqe */
+	bool need_dep_write;
 	uint32_t flags;
 };
 
diff --git a/drivers/net/mlx5/hws/mlx5dr_matcher.c b/drivers/net/mlx5/hws/mlx5dr_matcher.c
index 4e4da8e8f6..1c64abfa57 100644
--- a/drivers/net/mlx5/hws/mlx5dr_matcher.c
+++ b/drivers/net/mlx5/hws/mlx5dr_matcher.c
@@ -1542,6 +1542,23 @@ bool mlx5dr_matcher_is_updatable(struct mlx5dr_matcher *matcher)
 	return true;
 }
 
+bool mlx5dr_matcher_is_dependent(struct mlx5dr_matcher *matcher)
+{
+	int i;
+
+	if (matcher->action_ste.max_stes || mlx5dr_matcher_req_fw_wqe(matcher))
+		return true;
+
+	for (i = 0; i < matcher->num_of_at; i++) {
+		struct mlx5dr_action_template *at = &matcher->at[i];
+
+		if (at->need_dep_write)
+			return true;
+	}
+
+	return false;
+}
+
 static int mlx5dr_matcher_resize_precheck(struct mlx5dr_matcher *src_matcher,
 					  struct mlx5dr_matcher *dst_matcher)
 {
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 04/11] net/mlx5: skip the unneeded resource index allocation
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
                     ` (2 preceding siblings ...)
  2024-02-29 11:51   ` [PATCH v2 03/11] net/mlx5/hws: add check if matcher contains complex rules Dariusz Sosnowski
@ 2024-02-29 11:51   ` Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 05/11] net/mlx5: remove action params from job Dariusz Sosnowski
                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

From: Bing Zhao <bingz@nvidia.com>

The resource index was introduced to decouple the flow rule from the
resources used by hardware steering. It is needed only when a rule
update is supported.

In some cases, the update is not supported on a table (matcher). E.g.:
  * Table is resizable
  * FW gets involved
  * Root table
  * Not index based or optimized (not applicable)

An update is also atomic by nature when only one STE entry is required
per rule, so the extra resource index is not needed in that case either.

If the matcher doesn't support rule update, or at most one STE entry is
needed per rule on this matcher, there is no need to manage resource
index allocation and freeing from the pool.
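
Condensed from the hunks below (a simplified sketch of the patch's own
code, error handling omitted), the decision at table creation and the
fallback at flow creation are:

    /* Table creation: a separate resource ipool is needed only when the
     * matcher both supports in-place rule update and may require more
     * than one WQE per rule.
     */
    rpool_needed = mlx5dr_matcher_is_updatable(tbl->matcher_info[0].matcher) &&
                   mlx5dr_matcher_is_dependent(tbl->matcher_info[0].matcher);
    if (rpool_needed)
            tbl->resource = mlx5_ipool_create(&cfg);

    /* Flow creation: without a resource pool, the flow index itself is
     * reused as the resource index.
     */
    if (table->resource) {
            mlx5_ipool_malloc(table->resource, &res_idx);
            flow->res_idx = res_idx;
    } else {
            flow->res_idx = flow_idx;
    }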

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow_hw.c | 129 +++++++++++++++++++-------------
 1 file changed, 76 insertions(+), 53 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index ef91a23a9b..1fe8f42618 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -3383,9 +3383,6 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
 	if (!flow)
 		goto error;
-	mlx5_ipool_malloc(table->resource, &res_idx);
-	if (!res_idx)
-		goto error;
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -3394,7 +3391,14 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	flow->table = table;
 	flow->mt_idx = pattern_template_index;
 	flow->idx = flow_idx;
-	flow->res_idx = res_idx;
+	if (table->resource) {
+		mlx5_ipool_malloc(table->resource, &res_idx);
+		if (!res_idx)
+			goto error;
+		flow->res_idx = res_idx;
+	} else {
+		flow->res_idx = flow_idx;
+	}
 	/*
 	 * Set the job type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
@@ -3404,11 +3408,10 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	job->user_data = user_data;
 	rule_attr.user_data = job;
 	/*
-	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices for rule
-	 * insertion hints.
+	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices
+	 * for rule insertion hints.
 	 */
-	MLX5_ASSERT(res_idx > 0);
-	flow->rule_idx = res_idx - 1;
+	flow->rule_idx = flow->res_idx - 1;
 	rule_attr.rule_idx = flow->rule_idx;
 	/*
 	 * Construct the flow actions based on the input actions.
@@ -3451,12 +3454,12 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	if (likely(!ret))
 		return (struct rte_flow *)flow;
 error:
-	if (job)
-		flow_hw_job_put(priv, job, queue);
+	if (table->resource && res_idx)
+		mlx5_ipool_free(table->resource, res_idx);
 	if (flow_idx)
 		mlx5_ipool_free(table->flow, flow_idx);
-	if (res_idx)
-		mlx5_ipool_free(table->resource, res_idx);
+	if (job)
+		flow_hw_job_put(priv, job, queue);
 	rte_flow_error_set(error, rte_errno,
 			   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 			   "fail to create rte flow");
@@ -3527,9 +3530,6 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
 	if (!flow)
 		goto error;
-	mlx5_ipool_malloc(table->resource, &res_idx);
-	if (!res_idx)
-		goto error;
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -3538,7 +3538,14 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	flow->table = table;
 	flow->mt_idx = 0;
 	flow->idx = flow_idx;
-	flow->res_idx = res_idx;
+	if (table->resource) {
+		mlx5_ipool_malloc(table->resource, &res_idx);
+		if (!res_idx)
+			goto error;
+		flow->res_idx = res_idx;
+	} else {
+		flow->res_idx = flow_idx;
+	}
 	/*
 	 * Set the job type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
@@ -3547,9 +3554,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	job->flow = flow;
 	job->user_data = user_data;
 	rule_attr.user_data = job;
-	/*
-	 * Set the rule index.
-	 */
+	/* Set the rule index. */
 	flow->rule_idx = rule_index;
 	rule_attr.rule_idx = flow->rule_idx;
 	/*
@@ -3585,12 +3590,12 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	if (likely(!ret))
 		return (struct rte_flow *)flow;
 error:
-	if (job)
-		flow_hw_job_put(priv, job, queue);
-	if (res_idx)
+	if (table->resource && res_idx)
 		mlx5_ipool_free(table->resource, res_idx);
 	if (flow_idx)
 		mlx5_ipool_free(table->flow, flow_idx);
+	if (job)
+		flow_hw_job_put(priv, job, queue);
 	rte_flow_error_set(error, rte_errno,
 			   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 			   "fail to create rte flow");
@@ -3653,9 +3658,6 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		rte_errno = ENOMEM;
 		goto error;
 	}
-	mlx5_ipool_malloc(table->resource, &res_idx);
-	if (!res_idx)
-		goto error;
 	nf = job->upd_flow;
 	memset(nf, 0, sizeof(struct rte_flow_hw));
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
@@ -3666,7 +3668,14 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	nf->table = table;
 	nf->mt_idx = of->mt_idx;
 	nf->idx = of->idx;
-	nf->res_idx = res_idx;
+	if (table->resource) {
+		mlx5_ipool_malloc(table->resource, &res_idx);
+		if (!res_idx)
+			goto error;
+		nf->res_idx = res_idx;
+	} else {
+		nf->res_idx = of->res_idx;
+	}
 	/*
 	 * Set the job type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
@@ -3676,11 +3685,11 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	job->user_data = user_data;
 	rule_attr.user_data = job;
 	/*
-	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices for rule
-	 * insertion hints.
+	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices
+	 * for rule insertion hints.
+	 * If there is only one STE, the update will be atomic by nature.
 	 */
-	MLX5_ASSERT(res_idx > 0);
-	nf->rule_idx = res_idx - 1;
+	nf->rule_idx = nf->res_idx - 1;
 	rule_attr.rule_idx = nf->rule_idx;
 	/*
 	 * Construct the flow actions based on the input actions.
@@ -3706,14 +3715,14 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	if (likely(!ret))
 		return 0;
 error:
+	if (table->resource && res_idx)
+		mlx5_ipool_free(table->resource, res_idx);
 	/* Flow created fail, return the descriptor and flow memory. */
 	if (job)
 		flow_hw_job_put(priv, job, queue);
-	if (res_idx)
-		mlx5_ipool_free(table->resource, res_idx);
 	return rte_flow_error_set(error, rte_errno,
-			RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
-			"fail to update rte flow");
+				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				  "fail to update rte flow");
 }
 
 /**
@@ -3968,13 +3977,15 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 	}
 	if (job->type != MLX5_HW_Q_JOB_TYPE_UPDATE) {
 		if (table) {
-			mlx5_ipool_free(table->resource, res_idx);
+			if (table->resource)
+				mlx5_ipool_free(table->resource, res_idx);
 			mlx5_ipool_free(table->flow, flow->idx);
 		}
 	} else {
 		rte_memcpy(flow, job->upd_flow,
 			   offsetof(struct rte_flow_hw, rule));
-		mlx5_ipool_free(table->resource, res_idx);
+		if (table->resource)
+			mlx5_ipool_free(table->resource, res_idx);
 	}
 }
 
@@ -4474,6 +4485,7 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	uint32_t i = 0, max_tpl = MLX5_HW_TBL_MAX_ITEM_TEMPLATE;
 	uint32_t nb_flows = rte_align32pow2(attr->nb_flows);
 	bool port_started = !!dev->data->dev_started;
+	bool rpool_needed;
 	size_t tbl_mem_size;
 	int err;
 
@@ -4511,13 +4523,6 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	tbl->flow = mlx5_ipool_create(&cfg);
 	if (!tbl->flow)
 		goto error;
-	/* Allocate rule indexed pool. */
-	cfg.size = 0;
-	cfg.type = "mlx5_hw_table_rule";
-	cfg.max_idx += priv->hw_q[0].size;
-	tbl->resource = mlx5_ipool_create(&cfg);
-	if (!tbl->resource)
-		goto error;
 	/* Register the flow group. */
 	ge = mlx5_hlist_register(priv->sh->groups, attr->flow_attr.group, &ctx);
 	if (!ge)
@@ -4597,12 +4602,30 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	tbl->type = attr->flow_attr.transfer ? MLX5DR_TABLE_TYPE_FDB :
 		    (attr->flow_attr.egress ? MLX5DR_TABLE_TYPE_NIC_TX :
 		    MLX5DR_TABLE_TYPE_NIC_RX);
+	/*
+	 * Only the matcher supports update and needs more than 1 WQE, an additional
+	 * index is needed. Or else the flow index can be reused.
+	 */
+	rpool_needed = mlx5dr_matcher_is_updatable(tbl->matcher_info[0].matcher) &&
+		       mlx5dr_matcher_is_dependent(tbl->matcher_info[0].matcher);
+	if (rpool_needed) {
+		/* Allocate rule indexed pool. */
+		cfg.size = 0;
+		cfg.type = "mlx5_hw_table_rule";
+		cfg.max_idx += priv->hw_q[0].size;
+		tbl->resource = mlx5_ipool_create(&cfg);
+		if (!tbl->resource)
+			goto res_error;
+	}
 	if (port_started)
 		LIST_INSERT_HEAD(&priv->flow_hw_tbl, tbl, next);
 	else
 		LIST_INSERT_HEAD(&priv->flow_hw_tbl_ongo, tbl, next);
 	rte_rwlock_init(&tbl->matcher_replace_rwlk);
 	return tbl;
+res_error:
+	if (tbl->matcher_info[0].matcher)
+		(void)mlx5dr_matcher_destroy(tbl->matcher_info[0].matcher);
 at_error:
 	for (i = 0; i < nb_action_templates; i++) {
 		__flow_hw_action_template_destroy(dev, &tbl->ats[i].acts);
@@ -4620,8 +4643,6 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 		if (tbl->grp)
 			mlx5_hlist_unregister(priv->sh->groups,
 					      &tbl->grp->entry);
-		if (tbl->resource)
-			mlx5_ipool_destroy(tbl->resource);
 		if (tbl->flow)
 			mlx5_ipool_destroy(tbl->flow);
 		mlx5_free(tbl);
@@ -4830,12 +4851,13 @@ flow_hw_table_destroy(struct rte_eth_dev *dev,
 	uint32_t ridx = 1;
 
 	/* Build ipool allocated object bitmap. */
-	mlx5_ipool_flush_cache(table->resource);
+	if (table->resource)
+		mlx5_ipool_flush_cache(table->resource);
 	mlx5_ipool_flush_cache(table->flow);
 	/* Check if ipool has allocated objects. */
 	if (table->refcnt ||
 	    mlx5_ipool_get_next(table->flow, &fidx) ||
-	    mlx5_ipool_get_next(table->resource, &ridx)) {
+	    (table->resource && mlx5_ipool_get_next(table->resource, &ridx))) {
 		DRV_LOG(WARNING, "Table %p is still in use.", (void *)table);
 		return rte_flow_error_set(error, EBUSY,
 				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
@@ -4857,7 +4879,8 @@ flow_hw_table_destroy(struct rte_eth_dev *dev,
 	if (table->matcher_info[1].matcher)
 		mlx5dr_matcher_destroy(table->matcher_info[1].matcher);
 	mlx5_hlist_unregister(priv->sh->groups, &table->grp->entry);
-	mlx5_ipool_destroy(table->resource);
+	if (table->resource)
+		mlx5_ipool_destroy(table->resource);
 	mlx5_ipool_destroy(table->flow);
 	mlx5_free(table);
 	return 0;
@@ -12476,11 +12499,11 @@ flow_hw_table_resize(struct rte_eth_dev *dev,
 		return rte_flow_error_set(error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 					  table, "cannot resize flows pool");
-	ret = mlx5_ipool_resize(table->resource, nb_flows);
-	if (ret)
-		return rte_flow_error_set(error, EINVAL,
-					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
-					  table, "cannot resize resources pool");
+	/*
+	 * A resizable matcher doesn't support rule update. In this case, the ipool
+	 * for the resource is not created and there is no need to resize it.
+	 */
+	MLX5_ASSERT(!table->resource);
 	if (mlx5_is_multi_pattern_active(&table->mpctx)) {
 		ret = flow_hw_table_resize_multi_pattern_actions(dev, table, nb_flows, error);
 		if (ret < 0)
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 05/11] net/mlx5: remove action params from job
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
                     ` (3 preceding siblings ...)
  2024-02-29 11:51   ` [PATCH v2 04/11] net/mlx5: skip the unneeded resource index allocation Dariusz Sosnowski
@ 2024-02-29 11:51   ` Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 06/11] net/mlx5: remove flow pattern " Dariusz Sosnowski
                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

mlx5_hw_q_job struct held references to buffers which contained:

- modify header commands array,
- encap/decap data buffer,
- IPv6 routing data buffer.

These buffers were passed as parameters to the HWS layer during rule
creation. They were needed only for the duration of the call to the HWS
layer in which the flow operation is enqueued (i.e. mlx5dr_rule_create()).
After the operation is enqueued, the data stored there can be safely
discarded, so there is no need to keep it for the whole lifecycle of a job.

This patch removes the references to these buffers from mlx5_hw_q_job
and removes the related allocations to reduce the job memory footprint.
The per-job buffers are replaced with stack-allocated ones, contained
in the mlx5_flow_hw_action_params struct.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5.h         |   3 -
 drivers/net/mlx5/mlx5_flow.h    |  10 +++
 drivers/net/mlx5/mlx5_flow_hw.c | 120 ++++++++++++++------------------
 3 files changed, 63 insertions(+), 70 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index f11a0181b8..42dc312a87 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -401,9 +401,6 @@ struct mlx5_hw_q_job {
 		const void *action; /* Indirect action attached to the job. */
 	};
 	void *user_data; /* Job user data. */
-	uint8_t *encap_data; /* Encap data. */
-	uint8_t *push_data; /* IPv6 routing push data. */
-	struct mlx5_modification_cmd *mhdr_cmd;
 	struct rte_flow_item *items;
 	union {
 		struct {
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 02af0a08fa..9ed356e1c2 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1306,6 +1306,16 @@ typedef int
 
 #define MLX5_MHDR_MAX_CMD ((MLX5_MAX_MODIFY_NUM) * 2 + 1)
 
+/** Container for flow action data constructed during flow rule creation. */
+struct mlx5_flow_hw_action_params {
+	/** Array of constructed modify header commands. */
+	struct mlx5_modification_cmd mhdr_cmd[MLX5_MHDR_MAX_CMD];
+	/** Constructed encap/decap data buffer. */
+	uint8_t encap_data[MLX5_ENCAP_MAX_LEN];
+	/** Constructed IPv6 routing data buffer. */
+	uint8_t ipv6_push_data[MLX5_PUSH_MAX_LEN];
+};
+
 /* rte flow action translate to DR action struct. */
 struct mlx5_action_construct_data {
 	LIST_ENTRY(mlx5_action_construct_data) next;
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 1fe8f42618..a87fe4d07a 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -158,7 +158,7 @@ static int flow_hw_translate_group(struct rte_eth_dev *dev,
 				   struct rte_flow_error *error);
 static __rte_always_inline int
 flow_hw_set_vlan_vid_construct(struct rte_eth_dev *dev,
-			       struct mlx5_hw_q_job *job,
+			       struct mlx5_modification_cmd *mhdr_cmd,
 			       struct mlx5_action_construct_data *act_data,
 			       const struct mlx5_hw_actions *hw_acts,
 			       const struct rte_flow_action *action);
@@ -2812,7 +2812,7 @@ flow_hw_mhdr_cmd_is_nop(const struct mlx5_modification_cmd *cmd)
  *    0 on success, negative value otherwise and rte_errno is set.
  */
 static __rte_always_inline int
-flow_hw_modify_field_construct(struct mlx5_hw_q_job *job,
+flow_hw_modify_field_construct(struct mlx5_modification_cmd *mhdr_cmd,
 			       struct mlx5_action_construct_data *act_data,
 			       const struct mlx5_hw_actions *hw_acts,
 			       const struct rte_flow_action *action)
@@ -2871,7 +2871,7 @@ flow_hw_modify_field_construct(struct mlx5_hw_q_job *job,
 
 		if (i >= act_data->modify_header.mhdr_cmds_end)
 			return -1;
-		if (flow_hw_mhdr_cmd_is_nop(&job->mhdr_cmd[i])) {
+		if (flow_hw_mhdr_cmd_is_nop(&mhdr_cmd[i])) {
 			++i;
 			continue;
 		}
@@ -2891,7 +2891,7 @@ flow_hw_modify_field_construct(struct mlx5_hw_q_job *job,
 		    mhdr_action->dst.field == RTE_FLOW_FIELD_IPV6_DSCP)
 			data <<= MLX5_IPV6_HDR_DSCP_SHIFT;
 		data = (data & mask) >> off_b;
-		job->mhdr_cmd[i++].data1 = rte_cpu_to_be_32(data);
+		mhdr_cmd[i++].data1 = rte_cpu_to_be_32(data);
 		++field;
 	} while (field->size);
 	return 0;
@@ -2905,8 +2905,10 @@ flow_hw_modify_field_construct(struct mlx5_hw_q_job *job,
  *
  * @param[in] dev
  *   Pointer to the rte_eth_dev structure.
- * @param[in] job
- *   Pointer to job descriptor.
+ * @param[in] flow
+ *   Pointer to flow structure.
+ * @param[in] ap
+ *   Pointer to container for temporarily constructed actions' parameters.
  * @param[in] hw_acts
  *   Pointer to translated actions from template.
  * @param[in] it_idx
@@ -2923,7 +2925,8 @@ flow_hw_modify_field_construct(struct mlx5_hw_q_job *job,
  */
 static __rte_always_inline int
 flow_hw_actions_construct(struct rte_eth_dev *dev,
-			  struct mlx5_hw_q_job *job,
+			  struct rte_flow_hw *flow,
+			  struct mlx5_flow_hw_action_params *ap,
 			  const struct mlx5_hw_action_template *hw_at,
 			  const uint8_t it_idx,
 			  const struct rte_flow_action actions[],
@@ -2933,7 +2936,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_aso_mtr_pool *pool = priv->hws_mpool;
-	struct rte_flow_template_table *table = job->flow->table;
+	struct rte_flow_template_table *table = flow->table;
 	struct mlx5_action_construct_data *act_data;
 	const struct rte_flow_actions_template *at = hw_at->action_template;
 	const struct mlx5_hw_actions *hw_acts = &hw_at->acts;
@@ -2945,8 +2948,6 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	const struct rte_flow_action_meter *meter = NULL;
 	const struct rte_flow_action_age *age = NULL;
 	const struct rte_flow_action_nat64 *nat64_c = NULL;
-	uint8_t *buf = job->encap_data;
-	uint8_t *push_buf = job->push_data;
 	struct rte_flow_attr attr = {
 			.ingress = 1,
 	};
@@ -2971,17 +2972,17 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	if (hw_acts->mhdr && hw_acts->mhdr->mhdr_cmds_num > 0 && !hw_acts->mhdr->shared) {
 		uint16_t pos = hw_acts->mhdr->pos;
 
-		mp_segment = mlx5_multi_pattern_segment_find(table, job->flow->res_idx);
+		mp_segment = mlx5_multi_pattern_segment_find(table, flow->res_idx);
 		if (!mp_segment || !mp_segment->mhdr_action)
 			return -1;
 		rule_acts[pos].action = mp_segment->mhdr_action;
 		/* offset is relative to DR action */
 		rule_acts[pos].modify_header.offset =
-					job->flow->res_idx - mp_segment->head_index;
+					flow->res_idx - mp_segment->head_index;
 		rule_acts[pos].modify_header.data =
-					(uint8_t *)job->mhdr_cmd;
-		rte_memcpy(job->mhdr_cmd, hw_acts->mhdr->mhdr_cmds,
-			   sizeof(*job->mhdr_cmd) * hw_acts->mhdr->mhdr_cmds_num);
+					(uint8_t *)ap->mhdr_cmd;
+		rte_memcpy(ap->mhdr_cmd, hw_acts->mhdr->mhdr_cmds,
+			   sizeof(*ap->mhdr_cmd) * hw_acts->mhdr->mhdr_cmds_num);
 	}
 	LIST_FOREACH(act_data, &hw_acts->act_list, next) {
 		uint32_t jump_group;
@@ -3014,7 +3015,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 		case RTE_FLOW_ACTION_TYPE_INDIRECT:
 			if (flow_hw_shared_action_construct
 					(dev, queue, action, table, it_idx,
-					 at->action_flags, job->flow,
+					 at->action_flags, flow,
 					 &rule_acts[act_data->action_dst]))
 				return -1;
 			break;
@@ -3039,8 +3040,8 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				return -1;
 			rule_acts[act_data->action_dst].action =
 			(!!attr.group) ? jump->hws_action : jump->root_action;
-			job->flow->jump = jump;
-			job->flow->fate_type = MLX5_FLOW_FATE_JUMP;
+			flow->jump = jump;
+			flow->fate_type = MLX5_FLOW_FATE_JUMP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_RSS:
 		case RTE_FLOW_ACTION_TYPE_QUEUE:
@@ -3050,8 +3051,8 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			if (!hrxq)
 				return -1;
 			rule_acts[act_data->action_dst].action = hrxq->action;
-			job->flow->hrxq = hrxq;
-			job->flow->fate_type = MLX5_FLOW_FATE_QUEUE;
+			flow->hrxq = hrxq;
+			flow->fate_type = MLX5_FLOW_FATE_QUEUE;
 			break;
 		case MLX5_RTE_FLOW_ACTION_TYPE_RSS:
 			item_flags = table->its[it_idx]->item_flags;
@@ -3063,38 +3064,37 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 		case RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP:
 			enc_item = ((const struct rte_flow_action_vxlan_encap *)
 				   action->conf)->definition;
-			if (flow_dv_convert_encap_data(enc_item, buf, &encap_len, NULL))
+			if (flow_dv_convert_encap_data(enc_item, ap->encap_data, &encap_len, NULL))
 				return -1;
 			break;
 		case RTE_FLOW_ACTION_TYPE_NVGRE_ENCAP:
 			enc_item = ((const struct rte_flow_action_nvgre_encap *)
 				   action->conf)->definition;
-			if (flow_dv_convert_encap_data(enc_item, buf, &encap_len, NULL))
+			if (flow_dv_convert_encap_data(enc_item, ap->encap_data, &encap_len, NULL))
 				return -1;
 			break;
 		case RTE_FLOW_ACTION_TYPE_RAW_ENCAP:
 			raw_encap_data =
 				(const struct rte_flow_action_raw_encap *)
 				 action->conf;
-			rte_memcpy((void *)buf, raw_encap_data->data, act_data->encap.len);
-			MLX5_ASSERT(raw_encap_data->size ==
-				    act_data->encap.len);
+			rte_memcpy(ap->encap_data, raw_encap_data->data, act_data->encap.len);
+			MLX5_ASSERT(raw_encap_data->size == act_data->encap.len);
 			break;
 		case RTE_FLOW_ACTION_TYPE_IPV6_EXT_PUSH:
 			ipv6_push =
 				(const struct rte_flow_action_ipv6_ext_push *)action->conf;
-			rte_memcpy((void *)push_buf, ipv6_push->data,
+			rte_memcpy(ap->ipv6_push_data, ipv6_push->data,
 				   act_data->ipv6_ext.len);
 			MLX5_ASSERT(ipv6_push->size == act_data->ipv6_ext.len);
 			break;
 		case RTE_FLOW_ACTION_TYPE_MODIFY_FIELD:
 			if (action->type == RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_VID)
-				ret = flow_hw_set_vlan_vid_construct(dev, job,
+				ret = flow_hw_set_vlan_vid_construct(dev, ap->mhdr_cmd,
 								     act_data,
 								     hw_acts,
 								     action);
 			else
-				ret = flow_hw_modify_field_construct(job,
+				ret = flow_hw_modify_field_construct(ap->mhdr_cmd,
 								     act_data,
 								     hw_acts,
 								     action);
@@ -3130,8 +3130,8 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			rule_acts[act_data->action_dst + 1].action =
 					(!!attr.group) ? jump->hws_action :
 							 jump->root_action;
-			job->flow->jump = jump;
-			job->flow->fate_type = MLX5_FLOW_FATE_JUMP;
+			flow->jump = jump;
+			flow->fate_type = MLX5_FLOW_FATE_JUMP;
 			if (mlx5_aso_mtr_wait(priv->sh, MLX5_HW_INV_QUEUE, aso_mtr))
 				return -1;
 			break;
@@ -3145,11 +3145,11 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			 */
 			age_idx = mlx5_hws_age_action_create(priv, queue, 0,
 							     age,
-							     job->flow->res_idx,
+							     flow->res_idx,
 							     error);
 			if (age_idx == 0)
 				return -rte_errno;
-			job->flow->age_idx = age_idx;
+			flow->age_idx = age_idx;
 			if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT)
 				/*
 				 * When AGE uses indirect counter, no need to
@@ -3172,7 +3172,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				 );
 			if (ret != 0)
 				return ret;
-			job->flow->cnt_id = cnt_id;
+			flow->cnt_id = cnt_id;
 			break;
 		case MLX5_RTE_FLOW_ACTION_TYPE_COUNT:
 			ret = mlx5_hws_cnt_pool_get_action_offset
@@ -3183,7 +3183,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				 );
 			if (ret != 0)
 				return ret;
-			job->flow->cnt_id = act_data->shared_counter.id;
+			flow->cnt_id = act_data->shared_counter.id;
 			break;
 		case RTE_FLOW_ACTION_TYPE_CONNTRACK:
 			ct_idx = MLX5_INDIRECT_ACTION_IDX_GET(action->conf);
@@ -3210,8 +3210,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			 */
 			ret = flow_hw_meter_mark_compile(dev,
 				act_data->action_dst, action,
-				rule_acts, &job->flow->mtr_id,
-				MLX5_HW_INV_QUEUE, error);
+				rule_acts, &flow->mtr_id, MLX5_HW_INV_QUEUE, error);
 			if (ret != 0)
 				return ret;
 			break;
@@ -3226,9 +3225,9 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	}
 	if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT) {
 		if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_AGE) {
-			age_idx = job->flow->age_idx & MLX5_HWS_AGE_IDX_MASK;
+			age_idx = flow->age_idx & MLX5_HWS_AGE_IDX_MASK;
 			if (mlx5_hws_cnt_age_get(priv->hws_cpool,
-						 job->flow->cnt_id) != age_idx)
+						 flow->cnt_id) != age_idx)
 				/*
 				 * This is first use of this indirect counter
 				 * for this indirect AGE, need to increase the
@@ -3240,7 +3239,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 		 * Update this indirect counter the indirect/direct AGE in which
 		 * using it.
 		 */
-		mlx5_hws_cnt_age_set(priv->hws_cpool, job->flow->cnt_id,
+		mlx5_hws_cnt_age_set(priv->hws_cpool, flow->cnt_id,
 				     age_idx);
 	}
 	if (hw_acts->encap_decap && !hw_acts->encap_decap->shared) {
@@ -3250,21 +3249,21 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 		if (ix < 0)
 			return -1;
 		if (!mp_segment)
-			mp_segment = mlx5_multi_pattern_segment_find(table, job->flow->res_idx);
+			mp_segment = mlx5_multi_pattern_segment_find(table, flow->res_idx);
 		if (!mp_segment || !mp_segment->reformat_action[ix])
 			return -1;
 		ra->action = mp_segment->reformat_action[ix];
 		/* reformat offset is relative to selected DR action */
-		ra->reformat.offset = job->flow->res_idx - mp_segment->head_index;
-		ra->reformat.data = buf;
+		ra->reformat.offset = flow->res_idx - mp_segment->head_index;
+		ra->reformat.data = ap->encap_data;
 	}
 	if (hw_acts->push_remove && !hw_acts->push_remove->shared) {
 		rule_acts[hw_acts->push_remove_pos].ipv6_ext.offset =
-				job->flow->res_idx - 1;
-		rule_acts[hw_acts->push_remove_pos].ipv6_ext.header = push_buf;
+				flow->res_idx - 1;
+		rule_acts[hw_acts->push_remove_pos].ipv6_ext.header = ap->ipv6_push_data;
 	}
 	if (mlx5_hws_cnt_id_valid(hw_acts->cnt_id))
-		job->flow->cnt_id = hw_acts->cnt_id;
+		flow->cnt_id = hw_acts->cnt_id;
 	return 0;
 }
 
@@ -3364,6 +3363,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		.burst = attr->postpone,
 	};
 	struct mlx5dr_rule_action *rule_acts;
+	struct mlx5_flow_hw_action_params ap;
 	struct rte_flow_hw *flow = NULL;
 	struct mlx5_hw_q_job *job = NULL;
 	const struct rte_flow_item *rule_items;
@@ -3420,7 +3420,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	 * No need to copy and contrust a new "actions" list based on the
 	 * user's input, in order to save the cost.
 	 */
-	if (flow_hw_actions_construct(dev, job,
+	if (flow_hw_actions_construct(dev, flow, &ap,
 				      &table->ats[action_template_index],
 				      pattern_template_index, actions,
 				      rule_acts, queue, error)) {
@@ -3512,6 +3512,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 		.burst = attr->postpone,
 	};
 	struct mlx5dr_rule_action *rule_acts;
+	struct mlx5_flow_hw_action_params ap;
 	struct rte_flow_hw *flow = NULL;
 	struct mlx5_hw_q_job *job = NULL;
 	uint32_t flow_idx = 0;
@@ -3564,7 +3565,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	 * No need to copy and contrust a new "actions" list based on the
 	 * user's input, in order to save the cost.
 	 */
-	if (flow_hw_actions_construct(dev, job,
+	if (flow_hw_actions_construct(dev, flow, &ap,
 				      &table->ats[action_template_index],
 				      0, actions, rule_acts, queue, error)) {
 		rte_errno = EINVAL;
@@ -3646,6 +3647,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		.burst = attr->postpone,
 	};
 	struct mlx5dr_rule_action *rule_acts;
+	struct mlx5_flow_hw_action_params ap;
 	struct rte_flow_hw *of = (struct rte_flow_hw *)flow;
 	struct rte_flow_hw *nf;
 	struct rte_flow_template_table *table = of->table;
@@ -3698,7 +3700,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	 * No need to copy and contrust a new "actions" list based on the
 	 * user's input, in order to save the cost.
 	 */
-	if (flow_hw_actions_construct(dev, job,
+	if (flow_hw_actions_construct(dev, nf, &ap,
 				      &table->ats[action_template_index],
 				      nf->mt_idx, actions,
 				      rule_acts, queue, error)) {
@@ -6682,7 +6684,7 @@ flow_hw_set_vlan_vid(struct rte_eth_dev *dev,
 
 static __rte_always_inline int
 flow_hw_set_vlan_vid_construct(struct rte_eth_dev *dev,
-			       struct mlx5_hw_q_job *job,
+			       struct mlx5_modification_cmd *mhdr_cmd,
 			       struct mlx5_action_construct_data *act_data,
 			       const struct mlx5_hw_actions *hw_acts,
 			       const struct rte_flow_action *action)
@@ -6710,8 +6712,7 @@ flow_hw_set_vlan_vid_construct(struct rte_eth_dev *dev,
 		.conf = &conf
 	};
 
-	return flow_hw_modify_field_construct(job, act_data, hw_acts,
-					      &modify_action);
+	return flow_hw_modify_field_construct(mhdr_cmd, act_data, hw_acts, &modify_action);
 }
 
 static int
@@ -10121,10 +10122,6 @@ flow_hw_configure(struct rte_eth_dev *dev,
 		}
 		mem_size += (sizeof(struct mlx5_hw_q_job *) +
 			    sizeof(struct mlx5_hw_q_job) +
-			    sizeof(uint8_t) * MLX5_ENCAP_MAX_LEN +
-			    sizeof(uint8_t) * MLX5_PUSH_MAX_LEN +
-			    sizeof(struct mlx5_modification_cmd) *
-			    MLX5_MHDR_MAX_CMD +
 			    sizeof(struct rte_flow_item) *
 			    MLX5_HW_MAX_ITEMS +
 				sizeof(struct rte_flow_hw)) *
@@ -10137,8 +10134,6 @@ flow_hw_configure(struct rte_eth_dev *dev,
 		goto err;
 	}
 	for (i = 0; i < nb_q_updated; i++) {
-		uint8_t *encap = NULL, *push = NULL;
-		struct mlx5_modification_cmd *mhdr_cmd = NULL;
 		struct rte_flow_item *items = NULL;
 		struct rte_flow_hw *upd_flow = NULL;
 
@@ -10152,20 +10147,11 @@ flow_hw_configure(struct rte_eth_dev *dev,
 				&job[_queue_attr[i - 1]->size - 1].upd_flow[1];
 		job = (struct mlx5_hw_q_job *)
 		      &priv->hw_q[i].job[_queue_attr[i]->size];
-		mhdr_cmd = (struct mlx5_modification_cmd *)
-			   &job[_queue_attr[i]->size];
-		encap = (uint8_t *)
-			 &mhdr_cmd[_queue_attr[i]->size * MLX5_MHDR_MAX_CMD];
-		push = (uint8_t *)
-			 &encap[_queue_attr[i]->size * MLX5_ENCAP_MAX_LEN];
 		items = (struct rte_flow_item *)
-			 &push[_queue_attr[i]->size * MLX5_PUSH_MAX_LEN];
+			 &job[_queue_attr[i]->size];
 		upd_flow = (struct rte_flow_hw *)
 			&items[_queue_attr[i]->size * MLX5_HW_MAX_ITEMS];
 		for (j = 0; j < _queue_attr[i]->size; j++) {
-			job[j].mhdr_cmd = &mhdr_cmd[j * MLX5_MHDR_MAX_CMD];
-			job[j].encap_data = &encap[j * MLX5_ENCAP_MAX_LEN];
-			job[j].push_data = &push[j * MLX5_PUSH_MAX_LEN];
 			job[j].items = &items[j * MLX5_HW_MAX_ITEMS];
 			job[j].upd_flow = &upd_flow[j];
 			priv->hw_q[i].job[j] = &job[j];
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 06/11] net/mlx5: remove flow pattern from job
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
                     ` (4 preceding siblings ...)
  2024-02-29 11:51   ` [PATCH v2 05/11] net/mlx5: remove action params from job Dariusz Sosnowski
@ 2024-02-29 11:51   ` Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 07/11] net/mlx5: remove updated flow " Dariusz Sosnowski
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

mlx5_hw_q_job struct held a reference to temporary flow rule pattern
and contained temporary REPRESENTED_PORT and TAG items structs.
They are used whenever the flow rule pattern provided by the application
must be prepended with one of such items.
If prepending is required, the flow rule pattern is copied over to
a temporary buffer and a new item is added internally in the PMD.
The constructed buffer is passed to the HWS layer when the flow create
operation is enqueued.
After the operation is enqueued, the temporary flow pattern can be safely
discarded, so there is no need to store it during
the whole lifecycle of mlx5_hw_q_job.

This patch removes all references to the flow rule pattern and items
stored inside mlx5_hw_q_job and removes the relevant allocations to
reduce the job memory footprint.
The temporary pattern and items stored per job are replaced with
stack-allocated ones, contained in the mlx5_flow_hw_pattern_params struct.
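
For illustration only, a minimal C sketch of this approach is shown below;
the names used here (pattern_params, get_rule_items, flow_item) are
hypothetical stand-ins rather than the PMD's actual types:

#include <string.h>

#define MAX_ITEMS 8

/* Hypothetical stand-ins for the real rte_flow item types. */
struct flow_item { int type; const void *spec; };
struct port_spec { unsigned short port_id; };

/* Per-operation, stack-allocated container (analogous in spirit to
 * mlx5_flow_hw_pattern_params): nothing is kept in the job.
 */
struct pattern_params {
	struct flow_item items[MAX_ITEMS];
	struct port_spec port_spec;
};

/* Prepend an implicit item to the user-provided pattern using the
 * caller's stack storage; the result is only needed until the flow
 * create operation is enqueued.
 */
static const struct flow_item *
get_rule_items(unsigned short port_id,
	       const struct flow_item *user_items, int nb_items,
	       struct pattern_params *pp)
{
	if (nb_items + 1 > MAX_ITEMS)
		return NULL;
	pp->port_spec.port_id = port_id;
	pp->items[0].type = 1; /* e.g. an implicit REPRESENTED_PORT item */
	pp->items[0].spec = &pp->port_spec;
	memcpy(&pp->items[1], user_items, sizeof(*user_items) * nb_items);
	return pp->items;
}

The key point is that the container lives on the caller's stack for the
duration of a single enqueue, so no per-job buffers need to be preallocated.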

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5.h         | 17 ++++-------
 drivers/net/mlx5/mlx5_flow.h    | 10 +++++++
 drivers/net/mlx5/mlx5_flow_hw.c | 51 ++++++++++++++-------------------
 3 files changed, 37 insertions(+), 41 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 42dc312a87..1ca6223f95 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -401,17 +401,12 @@ struct mlx5_hw_q_job {
 		const void *action; /* Indirect action attached to the job. */
 	};
 	void *user_data; /* Job user data. */
-	struct rte_flow_item *items;
-	union {
-		struct {
-			/* User memory for query output */
-			void *user;
-			/* Data extracted from hardware */
-			void *hw;
-		} __rte_packed query;
-		struct rte_flow_item_ethdev port_spec;
-		struct rte_flow_item_tag tag_spec;
-	} __rte_packed;
+	struct {
+		/* User memory for query output */
+		void *user;
+		/* Data extracted from hardware */
+		void *hw;
+	} query;
 	struct rte_flow_hw *upd_flow; /* Flow with updated values. */
 };
 
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 9ed356e1c2..436d1391bc 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1316,6 +1316,16 @@ struct mlx5_flow_hw_action_params {
 	uint8_t ipv6_push_data[MLX5_PUSH_MAX_LEN];
 };
 
+/** Container for dynamically generated flow items used during flow rule creation. */
+struct mlx5_flow_hw_pattern_params {
+	/** Array of dynamically generated flow items. */
+	struct rte_flow_item items[MLX5_HW_MAX_ITEMS];
+	/** Temporary REPRESENTED_PORT item generated by PMD. */
+	struct rte_flow_item_ethdev port_spec;
+	/** Temporary TAG item generated by PMD. */
+	struct rte_flow_item_tag tag_spec;
+};
+
 /* rte flow action translate to DR action struct. */
 struct mlx5_action_construct_data {
 	LIST_ENTRY(mlx5_action_construct_data) next;
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index a87fe4d07a..ab67dc139e 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -3272,44 +3272,44 @@ flow_hw_get_rule_items(struct rte_eth_dev *dev,
 		       const struct rte_flow_template_table *table,
 		       const struct rte_flow_item items[],
 		       uint8_t pattern_template_index,
-		       struct mlx5_hw_q_job *job)
+		       struct mlx5_flow_hw_pattern_params *pp)
 {
 	struct rte_flow_pattern_template *pt = table->its[pattern_template_index];
 
 	/* Only one implicit item can be added to flow rule pattern. */
 	MLX5_ASSERT(!pt->implicit_port || !pt->implicit_tag);
-	/* At least one item was allocated in job descriptor for items. */
+	/* At least one item was allocated in pattern params for items. */
 	MLX5_ASSERT(MLX5_HW_MAX_ITEMS >= 1);
 	if (pt->implicit_port) {
 		if (pt->orig_item_nb + 1 > MLX5_HW_MAX_ITEMS) {
 			rte_errno = ENOMEM;
 			return NULL;
 		}
-		/* Set up represented port item in job descriptor. */
-		job->port_spec = (struct rte_flow_item_ethdev){
+		/* Set up represented port item in pattern params. */
+		pp->port_spec = (struct rte_flow_item_ethdev){
 			.port_id = dev->data->port_id,
 		};
-		job->items[0] = (struct rte_flow_item){
+		pp->items[0] = (struct rte_flow_item){
 			.type = RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT,
-			.spec = &job->port_spec,
+			.spec = &pp->port_spec,
 		};
-		rte_memcpy(&job->items[1], items, sizeof(*items) * pt->orig_item_nb);
-		return job->items;
+		rte_memcpy(&pp->items[1], items, sizeof(*items) * pt->orig_item_nb);
+		return pp->items;
 	} else if (pt->implicit_tag) {
 		if (pt->orig_item_nb + 1 > MLX5_HW_MAX_ITEMS) {
 			rte_errno = ENOMEM;
 			return NULL;
 		}
-		/* Set up tag item in job descriptor. */
-		job->tag_spec = (struct rte_flow_item_tag){
+		/* Set up tag item in pattern params. */
+		pp->tag_spec = (struct rte_flow_item_tag){
 			.data = flow_hw_tx_tag_regc_value(dev),
 		};
-		job->items[0] = (struct rte_flow_item){
+		pp->items[0] = (struct rte_flow_item){
 			.type = (enum rte_flow_item_type)MLX5_RTE_FLOW_ITEM_TYPE_TAG,
-			.spec = &job->tag_spec,
+			.spec = &pp->tag_spec,
 		};
-		rte_memcpy(&job->items[1], items, sizeof(*items) * pt->orig_item_nb);
-		return job->items;
+		rte_memcpy(&pp->items[1], items, sizeof(*items) * pt->orig_item_nb);
+		return pp->items;
 	} else {
 		return items;
 	}
@@ -3364,6 +3364,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	};
 	struct mlx5dr_rule_action *rule_acts;
 	struct mlx5_flow_hw_action_params ap;
+	struct mlx5_flow_hw_pattern_params pp;
 	struct rte_flow_hw *flow = NULL;
 	struct mlx5_hw_q_job *job = NULL;
 	const struct rte_flow_item *rule_items;
@@ -3428,7 +3429,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		goto error;
 	}
 	rule_items = flow_hw_get_rule_items(dev, table, items,
-					    pattern_template_index, job);
+					    pattern_template_index, &pp);
 	if (!rule_items)
 		goto error;
 	if (likely(!rte_flow_template_table_resizable(dev->data->port_id, &table->cfg.attr))) {
@@ -10121,11 +10122,8 @@ flow_hw_configure(struct rte_eth_dev *dev,
 			goto err;
 		}
 		mem_size += (sizeof(struct mlx5_hw_q_job *) +
-			    sizeof(struct mlx5_hw_q_job) +
-			    sizeof(struct rte_flow_item) *
-			    MLX5_HW_MAX_ITEMS +
-				sizeof(struct rte_flow_hw)) *
-			    _queue_attr[i]->size;
+			     sizeof(struct mlx5_hw_q_job) +
+			     sizeof(struct rte_flow_hw)) * _queue_attr[i]->size;
 	}
 	priv->hw_q = mlx5_malloc(MLX5_MEM_ZERO, mem_size,
 				 64, SOCKET_ID_ANY);
@@ -10134,7 +10132,6 @@ flow_hw_configure(struct rte_eth_dev *dev,
 		goto err;
 	}
 	for (i = 0; i < nb_q_updated; i++) {
-		struct rte_flow_item *items = NULL;
 		struct rte_flow_hw *upd_flow = NULL;
 
 		priv->hw_q[i].job_idx = _queue_attr[i]->size;
@@ -10147,12 +10144,8 @@ flow_hw_configure(struct rte_eth_dev *dev,
 				&job[_queue_attr[i - 1]->size - 1].upd_flow[1];
 		job = (struct mlx5_hw_q_job *)
 		      &priv->hw_q[i].job[_queue_attr[i]->size];
-		items = (struct rte_flow_item *)
-			 &job[_queue_attr[i]->size];
-		upd_flow = (struct rte_flow_hw *)
-			&items[_queue_attr[i]->size * MLX5_HW_MAX_ITEMS];
+		upd_flow = (struct rte_flow_hw *)&job[_queue_attr[i]->size];
 		for (j = 0; j < _queue_attr[i]->size; j++) {
-			job[j].items = &items[j * MLX5_HW_MAX_ITEMS];
 			job[j].upd_flow = &upd_flow[j];
 			priv->hw_q[i].job[j] = &job[j];
 		}
@@ -12329,14 +12322,12 @@ flow_hw_calc_table_hash(struct rte_eth_dev *dev,
 			 uint32_t *hash, struct rte_flow_error *error)
 {
 	const struct rte_flow_item *items;
-	/* Temp job to allow adding missing items */
-	static struct rte_flow_item tmp_items[MLX5_HW_MAX_ITEMS];
-	static struct mlx5_hw_q_job job = {.items = tmp_items};
+	struct mlx5_flow_hw_pattern_params pp;
 	int res;
 
 	items = flow_hw_get_rule_items(dev, table, pattern,
 				       pattern_template_index,
-				       &job);
+				       &pp);
 	res = mlx5dr_rule_hash_calculate(mlx5_table_matcher(table), items,
 					 pattern_template_index,
 					 MLX5DR_RULE_HASH_CALC_MODE_RAW,
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 07/11] net/mlx5: remove updated flow from job
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
                     ` (5 preceding siblings ...)
  2024-02-29 11:51   ` [PATCH v2 06/11] net/mlx5: remove flow pattern " Dariusz Sosnowski
@ 2024-02-29 11:51   ` Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 08/11] net/mlx5: use flow as operation container Dariusz Sosnowski
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

mlx5_hw_q_job struct held a reference to a temporary flow rule struct,
used during flow rule update operations. It serves as a container for
flow action data calculated during action construction.
After a flow rule update operation succeeds, data from the temporary
flow rule is copied over to the original flow rule.

Although access to this temporary flow rule struct is required
during both the operation enqueue step and the completion polling step,
there can be only one ongoing flow update operation for a given
flow rule. As a result, there is no need to store it per job.

This patch removes all references to the temporary flow rule struct
stored in mlx5_hw_q_job and removes the relevant allocations to reduce
the job memory footprint.
The temporary flow rule struct stored per job is replaced with:

- If the table is not resizable - an array of rte_flow_hw_aux structs,
  stored in the template table. This array holds one entry per
  flow rule, each entry containing the temporary struct mentioned above.
- If the table is resizable - an additional rte_flow_hw_aux struct,
  allocated alongside rte_flow_hw in the resizable ipool.
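
A simplified sketch of how such a lookup could be structured is shown
below; flow, table and flow_aux are hypothetical stand-ins for
rte_flow_hw, rte_flow_template_table and rte_flow_hw_aux:

struct table;

/* Hypothetical auxiliary data kept outside of the hot flow struct. */
struct flow_aux { unsigned int upd_placeholder; };

struct flow {
	struct table *table;
	unsigned int idx;        /* 1-based index within the table */
	unsigned char rule[64];  /* placeholder for the HWS rule data */
};

struct table {
	int resizable;
	struct flow_aux *flow_aux; /* one entry per flow, non-resizable case */
};

/* Resolve the aux struct for a flow, mirroring the two placement
 * strategies described above.
 */
static struct flow_aux *
flow_aux_get(struct flow *f)
{
	if (f->table->resizable)
		/* Aux data sits right behind the flow in the same allocation. */
		return (struct flow_aux *)((char *)f + sizeof(*f));
	return &f->table->flow_aux[f->idx - 1];
}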

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5.h         |   1 -
 drivers/net/mlx5/mlx5_flow.h    |   7 +++
 drivers/net/mlx5/mlx5_flow_hw.c | 100 ++++++++++++++++++++++++++------
 3 files changed, 89 insertions(+), 19 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 1ca6223f95..2e2504f20f 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -407,7 +407,6 @@ struct mlx5_hw_q_job {
 		/* Data extracted from hardware */
 		void *hw;
 	} query;
-	struct rte_flow_hw *upd_flow; /* Flow with updated values. */
 };
 
 /* HW steering job descriptor LIFO pool. */
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 436d1391bc..a204f94624 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1293,6 +1293,12 @@ struct rte_flow_hw {
 	uint8_t rule[]; /* HWS layer data struct. */
 } __rte_packed;
 
+/** Auxiliary data stored per flow which is not required to be stored in main flow structure. */
+struct rte_flow_hw_aux {
+	/** Placeholder flow struct used during flow rule update operation. */
+	struct rte_flow_hw upd_flow;
+};
+
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
@@ -1601,6 +1607,7 @@ struct rte_flow_template_table {
 	/* Action templates bind to the table. */
 	struct mlx5_hw_action_template ats[MLX5_HW_TBL_MAX_ACTION_TEMPLATE];
 	struct mlx5_indexed_pool *flow; /* The table's flow ipool. */
+	struct rte_flow_hw_aux *flow_aux; /**< Auxiliary data stored per flow. */
 	struct mlx5_indexed_pool *resource; /* The table's resource ipool. */
 	struct mlx5_flow_template_table_cfg cfg;
 	uint32_t type; /* Flow table type RX/TX/FDB. */
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index ab67dc139e..cbbf87b999 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -79,6 +79,66 @@ struct mlx5_indlst_legacy {
 #define MLX5_CONST_ENCAP_ITEM(encap_type, ptr) \
 (((const struct encap_type *)(ptr))->definition)
 
+/**
+ * Returns the size of a struct with a following layout:
+ *
+ * @code{.c}
+ * struct rte_flow_hw {
+ *     // rte_flow_hw fields
+ *     uint8_t rule[mlx5dr_rule_get_handle_size()];
+ * };
+ * @endcode
+ *
+ * Such struct is used as a basic container for HW Steering flow rule.
+ */
+static size_t
+mlx5_flow_hw_entry_size(void)
+{
+	return sizeof(struct rte_flow_hw) + mlx5dr_rule_get_handle_size();
+}
+
+/**
+ * Returns the size of "auxed" rte_flow_hw structure which is assumed to be laid out as follows:
+ *
+ * @code{.c}
+ * struct {
+ *     struct rte_flow_hw {
+ *         // rte_flow_hw fields
+ *         uint8_t rule[mlx5dr_rule_get_handle_size()];
+ *     } flow;
+ *     struct rte_flow_hw_aux aux;
+ * };
+ * @endcode
+ *
+ * Such struct is used whenever rte_flow_hw_aux cannot be allocated separately from the rte_flow_hw
+ * e.g., when table is resizable.
+ */
+static size_t
+mlx5_flow_hw_auxed_entry_size(void)
+{
+	size_t rule_size = mlx5dr_rule_get_handle_size();
+
+	return sizeof(struct rte_flow_hw) + rule_size + sizeof(struct rte_flow_hw_aux);
+}
+
+/**
+ * Returns a valid pointer to rte_flow_hw_aux associated with given rte_flow_hw
+ * depending on template table configuration.
+ */
+static __rte_always_inline struct rte_flow_hw_aux *
+mlx5_flow_hw_aux(uint16_t port_id, struct rte_flow_hw *flow)
+{
+	struct rte_flow_template_table *table = flow->table;
+
+	if (rte_flow_template_table_resizable(port_id, &table->cfg.attr)) {
+		size_t offset = sizeof(struct rte_flow_hw) + mlx5dr_rule_get_handle_size();
+
+		return RTE_PTR_ADD(flow, offset);
+	} else {
+		return &table->flow_aux[flow->idx - 1];
+	}
+}
+
 static int
 mlx5_tbl_multi_pattern_process(struct rte_eth_dev *dev,
 			       struct rte_flow_template_table *tbl,
@@ -3651,6 +3711,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	struct mlx5_flow_hw_action_params ap;
 	struct rte_flow_hw *of = (struct rte_flow_hw *)flow;
 	struct rte_flow_hw *nf;
+	struct rte_flow_hw_aux *aux;
 	struct rte_flow_template_table *table = of->table;
 	struct mlx5_hw_q_job *job = NULL;
 	uint32_t res_idx = 0;
@@ -3661,7 +3722,8 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		rte_errno = ENOMEM;
 		goto error;
 	}
-	nf = job->upd_flow;
+	aux = mlx5_flow_hw_aux(dev->data->port_id, of);
+	nf = &aux->upd_flow;
 	memset(nf, 0, sizeof(struct rte_flow_hw));
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
@@ -3708,11 +3770,8 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		rte_errno = EINVAL;
 		goto error;
 	}
-	/*
-	 * Switch the old flow and the new flow.
-	 */
+	/* Switch to the old flow. New flow will retrieved from the table on completion. */
 	job->flow = of;
-	job->upd_flow = nf;
 	ret = mlx5dr_rule_action_update((struct mlx5dr_rule *)of->rule,
 					action_template_index, rule_acts, &rule_attr);
 	if (likely(!ret))
@@ -3985,8 +4044,10 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 			mlx5_ipool_free(table->flow, flow->idx);
 		}
 	} else {
-		rte_memcpy(flow, job->upd_flow,
-			   offsetof(struct rte_flow_hw, rule));
+		struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
+		struct rte_flow_hw *upd_flow = &aux->upd_flow;
+
+		rte_memcpy(flow, upd_flow, offsetof(struct rte_flow_hw, rule));
 		if (table->resource)
 			mlx5_ipool_free(table->resource, res_idx);
 	}
@@ -4475,7 +4536,6 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 		.data = &flow_attr,
 	};
 	struct mlx5_indexed_pool_config cfg = {
-		.size = sizeof(struct rte_flow_hw) + mlx5dr_rule_get_handle_size(),
 		.trunk_size = 1 << 12,
 		.per_core_cache = 1 << 13,
 		.need_lock = 1,
@@ -4496,6 +4556,9 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	if (!attr->flow_attr.group)
 		max_tpl = 1;
 	cfg.max_idx = nb_flows;
+	cfg.size = !rte_flow_template_table_resizable(dev->data->port_id, attr) ?
+		   mlx5_flow_hw_entry_size() :
+		   mlx5_flow_hw_auxed_entry_size();
 	/* For table has very limited flows, disable cache. */
 	if (nb_flows < cfg.trunk_size) {
 		cfg.per_core_cache = 0;
@@ -4526,6 +4589,11 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 	tbl->flow = mlx5_ipool_create(&cfg);
 	if (!tbl->flow)
 		goto error;
+	/* Allocate table of auxiliary flow rule structs. */
+	tbl->flow_aux = mlx5_malloc(MLX5_MEM_ZERO, sizeof(struct rte_flow_hw_aux) * nb_flows,
+				    RTE_CACHE_LINE_SIZE, rte_dev_numa_node(dev->device));
+	if (!tbl->flow_aux)
+		goto error;
 	/* Register the flow group. */
 	ge = mlx5_hlist_register(priv->sh->groups, attr->flow_attr.group, &ctx);
 	if (!ge)
@@ -4646,6 +4714,8 @@ flow_hw_table_create(struct rte_eth_dev *dev,
 		if (tbl->grp)
 			mlx5_hlist_unregister(priv->sh->groups,
 					      &tbl->grp->entry);
+		if (tbl->flow_aux)
+			mlx5_free(tbl->flow_aux);
 		if (tbl->flow)
 			mlx5_ipool_destroy(tbl->flow);
 		mlx5_free(tbl);
@@ -4884,6 +4954,7 @@ flow_hw_table_destroy(struct rte_eth_dev *dev,
 	mlx5_hlist_unregister(priv->sh->groups, &table->grp->entry);
 	if (table->resource)
 		mlx5_ipool_destroy(table->resource);
+	mlx5_free(table->flow_aux);
 	mlx5_ipool_destroy(table->flow);
 	mlx5_free(table);
 	return 0;
@@ -10122,8 +10193,7 @@ flow_hw_configure(struct rte_eth_dev *dev,
 			goto err;
 		}
 		mem_size += (sizeof(struct mlx5_hw_q_job *) +
-			     sizeof(struct mlx5_hw_q_job) +
-			     sizeof(struct rte_flow_hw)) * _queue_attr[i]->size;
+			     sizeof(struct mlx5_hw_q_job)) * _queue_attr[i]->size;
 	}
 	priv->hw_q = mlx5_malloc(MLX5_MEM_ZERO, mem_size,
 				 64, SOCKET_ID_ANY);
@@ -10132,23 +10202,17 @@ flow_hw_configure(struct rte_eth_dev *dev,
 		goto err;
 	}
 	for (i = 0; i < nb_q_updated; i++) {
-		struct rte_flow_hw *upd_flow = NULL;
-
 		priv->hw_q[i].job_idx = _queue_attr[i]->size;
 		priv->hw_q[i].size = _queue_attr[i]->size;
 		if (i == 0)
 			priv->hw_q[i].job = (struct mlx5_hw_q_job **)
 					    &priv->hw_q[nb_q_updated];
 		else
-			priv->hw_q[i].job = (struct mlx5_hw_q_job **)
-				&job[_queue_attr[i - 1]->size - 1].upd_flow[1];
+			priv->hw_q[i].job = (struct mlx5_hw_q_job **)&job[_queue_attr[i - 1]->size];
 		job = (struct mlx5_hw_q_job *)
 		      &priv->hw_q[i].job[_queue_attr[i]->size];
-		upd_flow = (struct rte_flow_hw *)&job[_queue_attr[i]->size];
-		for (j = 0; j < _queue_attr[i]->size; j++) {
-			job[j].upd_flow = &upd_flow[j];
+		for (j = 0; j < _queue_attr[i]->size; j++)
 			priv->hw_q[i].job[j] = &job[j];
-		}
 		/* Notice ring name length is limited. */
 		priv->hw_q[i].indir_cq = mlx5_hwq_ring_create
 			(dev->data->port_id, i, _queue_attr[i]->size, "indir_act_cq");
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 08/11] net/mlx5: use flow as operation container
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
                     ` (6 preceding siblings ...)
  2024-02-29 11:51   ` [PATCH v2 07/11] net/mlx5: remove updated flow " Dariusz Sosnowski
@ 2024-02-29 11:51   ` Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 09/11] net/mlx5: move rarely used flow fields outside Dariusz Sosnowski
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

While processing async flow operations in the mlx5 PMD,
mlx5_hw_q_job struct is used to hold the following data
related to the ongoing operation:

- operation type,
- user data,
- flow reference.

The job itself is then passed to the mlx5dr layer as its "user data".
Other types of data required during flow operation processing
are accessed through the flow itself.

Since most of the accessed fields are in the flow struct itself,
the operation type and user data can be moved to the flow struct.
This removes an unnecessary memory indirection and reduces the memory
footprint of flow operation processing. It decreases cache pressure
and, as a result, can increase processing throughput.

This patch removes mlx5_hw_q_job from async flow operation
processing; from now on, the flow itself represents the ongoing
operation. Async operations on indirect actions still use jobs.
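
The idea can be sketched as follows, with hypothetical types rather than
the actual driver code: the flow is stamped with the operation type and
user data at enqueue time and is itself used as the completion cookie:

#include <stdint.h>

enum op_type { OP_NONE, OP_CREATE, OP_DESTROY, OP_UPDATE };

/* The flow itself carries the operation state; no separate job object. */
struct flow {
	uint8_t operation_type; /* ongoing operation, set at enqueue time */
	void *user_data;        /* application's completion cookie */
};

struct queue { uint32_t ongoing_flow_ops; };

/* Enqueue: stamp the flow and pass it to the HW layer as user data
 * (here only modeled by incrementing the in-flight counter).
 */
static void
enqueue_create(struct queue *q, struct flow *f, void *user_data)
{
	f->operation_type = OP_CREATE;
	f->user_data = user_data;
	q->ongoing_flow_ops++;
}

/* Completion: the same flow comes back as user data; the application's
 * cookie is restored directly from it.
 */
static void *
complete(struct queue *q, struct flow *f)
{
	q->ongoing_flow_ops--;
	return f->user_data;
}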

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5.h         |   8 +-
 drivers/net/mlx5/mlx5_flow.h    |  13 ++
 drivers/net/mlx5/mlx5_flow_hw.c | 210 +++++++++++++++-----------------
 3 files changed, 116 insertions(+), 115 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 2e2504f20f..8acb79e7bb 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -396,10 +396,7 @@ enum mlx5_hw_indirect_type {
 struct mlx5_hw_q_job {
 	uint32_t type; /* Job type. */
 	uint32_t indirect_type;
-	union {
-		struct rte_flow_hw *flow; /* Flow attached to the job. */
-		const void *action; /* Indirect action attached to the job. */
-	};
+	const void *action; /* Indirect action attached to the job. */
 	void *user_data; /* Job user data. */
 	struct {
 		/* User memory for query output */
@@ -412,7 +409,8 @@ struct mlx5_hw_q_job {
 /* HW steering job descriptor LIFO pool. */
 struct mlx5_hw_q {
 	uint32_t job_idx; /* Free job index. */
-	uint32_t size; /* LIFO size. */
+	uint32_t size; /* Job LIFO queue size. */
+	uint32_t ongoing_flow_ops; /* Number of ongoing flow operations. */
 	struct mlx5_hw_q_job **job; /* LIFO header. */
 	struct rte_ring *indir_cq; /* Indirect action SW completion queue. */
 	struct rte_ring *indir_iq; /* Indirect action SW in progress queue. */
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index a204f94624..46d8ce1775 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1269,6 +1269,16 @@ typedef uint32_t cnt_id_t;
 
 #if defined(HAVE_IBV_FLOW_DV_SUPPORT) || !defined(HAVE_INFINIBAND_VERBS_H)
 
+enum {
+	MLX5_FLOW_HW_FLOW_OP_TYPE_NONE,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_DESTROY,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_DESTROY,
+	MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_MOVE,
+};
+
 #ifdef PEDANTIC
 #pragma GCC diagnostic ignored "-Wpedantic"
 #endif
@@ -1290,6 +1300,9 @@ struct rte_flow_hw {
 	cnt_id_t cnt_id;
 	uint32_t mtr_id;
 	uint32_t rule_idx;
+	uint8_t operation_type; /**< Ongoing flow operation type. */
+	void *user_data; /**< Application's private data passed to enqueued flow operation. */
+	uint8_t padding[1]; /**< Padding for proper alignment of mlx5dr rule struct. */
 	uint8_t rule[]; /* HWS layer data struct. */
 } __rte_packed;
 
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index cbbf87b999..dc0b4bff3d 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -312,6 +312,31 @@ static const struct rte_flow_item_eth ctrl_rx_eth_bcast_spec = {
 	.hdr.ether_type = 0,
 };
 
+static inline uint32_t
+flow_hw_q_pending(struct mlx5_priv *priv, uint32_t queue)
+{
+	struct mlx5_hw_q *q = &priv->hw_q[queue];
+
+	MLX5_ASSERT(q->size >= q->job_idx);
+	return (q->size - q->job_idx) + q->ongoing_flow_ops;
+}
+
+static inline void
+flow_hw_q_inc_flow_ops(struct mlx5_priv *priv, uint32_t queue)
+{
+	struct mlx5_hw_q *q = &priv->hw_q[queue];
+
+	q->ongoing_flow_ops++;
+}
+
+static inline void
+flow_hw_q_dec_flow_ops(struct mlx5_priv *priv, uint32_t queue)
+{
+	struct mlx5_hw_q *q = &priv->hw_q[queue];
+
+	q->ongoing_flow_ops--;
+}
+
 static __rte_always_inline struct mlx5_hw_q_job *
 flow_hw_job_get(struct mlx5_priv *priv, uint32_t queue)
 {
@@ -3426,20 +3451,15 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	struct mlx5_flow_hw_action_params ap;
 	struct mlx5_flow_hw_pattern_params pp;
 	struct rte_flow_hw *flow = NULL;
-	struct mlx5_hw_q_job *job = NULL;
 	const struct rte_flow_item *rule_items;
 	uint32_t flow_idx = 0;
 	uint32_t res_idx = 0;
 	int ret;
 
 	if (unlikely((!dev->data->dev_started))) {
-		rte_errno = EINVAL;
-		goto error;
-	}
-	job = flow_hw_job_get(priv, queue);
-	if (!job) {
-		rte_errno = ENOMEM;
-		goto error;
+		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				   "Port must be started before enqueueing flow operations");
+		return NULL;
 	}
 	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
 	if (!flow)
@@ -3461,13 +3481,12 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		flow->res_idx = flow_idx;
 	}
 	/*
-	 * Set the job type here in order to know if the flow memory
+	 * Set the flow operation type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
 	 */
-	job->type = MLX5_HW_Q_JOB_TYPE_CREATE;
-	job->flow = flow;
-	job->user_data = user_data;
-	rule_attr.user_data = job;
+	flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE;
+	flow->user_data = user_data;
+	rule_attr.user_data = flow;
 	/*
 	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices
 	 * for rule insertion hints.
@@ -3501,7 +3520,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	} else {
 		uint32_t selector;
 
-		job->type = MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_CREATE;
+		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE;
 		rte_rwlock_read_lock(&table->matcher_replace_rwlk);
 		selector = table->matcher_selector;
 		ret = mlx5dr_rule_create(table->matcher_info[selector].matcher,
@@ -3512,15 +3531,15 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
 		flow->matcher_selector = selector;
 	}
-	if (likely(!ret))
+	if (likely(!ret)) {
+		flow_hw_q_inc_flow_ops(priv, queue);
 		return (struct rte_flow *)flow;
+	}
 error:
 	if (table->resource && res_idx)
 		mlx5_ipool_free(table->resource, res_idx);
 	if (flow_idx)
 		mlx5_ipool_free(table->flow, flow_idx);
-	if (job)
-		flow_hw_job_put(priv, job, queue);
 	rte_flow_error_set(error, rte_errno,
 			   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 			   "fail to create rte flow");
@@ -3575,19 +3594,14 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	struct mlx5dr_rule_action *rule_acts;
 	struct mlx5_flow_hw_action_params ap;
 	struct rte_flow_hw *flow = NULL;
-	struct mlx5_hw_q_job *job = NULL;
 	uint32_t flow_idx = 0;
 	uint32_t res_idx = 0;
 	int ret;
 
 	if (unlikely(rule_index >= table->cfg.attr.nb_flows)) {
-		rte_errno = EINVAL;
-		goto error;
-	}
-	job = flow_hw_job_get(priv, queue);
-	if (!job) {
-		rte_errno = ENOMEM;
-		goto error;
+		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				   "Flow rule index exceeds table size");
+		return NULL;
 	}
 	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
 	if (!flow)
@@ -3609,13 +3623,12 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 		flow->res_idx = flow_idx;
 	}
 	/*
-	 * Set the job type here in order to know if the flow memory
+	 * Set the flow operation type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
 	 */
-	job->type = MLX5_HW_Q_JOB_TYPE_CREATE;
-	job->flow = flow;
-	job->user_data = user_data;
-	rule_attr.user_data = job;
+	flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE;
+	flow->user_data = user_data;
+	rule_attr.user_data = flow;
 	/* Set the rule index. */
 	flow->rule_idx = rule_index;
 	rule_attr.rule_idx = flow->rule_idx;
@@ -3640,7 +3653,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	} else {
 		uint32_t selector;
 
-		job->type = MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_CREATE;
+		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE;
 		rte_rwlock_read_lock(&table->matcher_replace_rwlk);
 		selector = table->matcher_selector;
 		ret = mlx5dr_rule_create(table->matcher_info[selector].matcher,
@@ -3649,15 +3662,15 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 					 (struct mlx5dr_rule *)flow->rule);
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
 	}
-	if (likely(!ret))
+	if (likely(!ret)) {
+		flow_hw_q_inc_flow_ops(priv, queue);
 		return (struct rte_flow *)flow;
+	}
 error:
 	if (table->resource && res_idx)
 		mlx5_ipool_free(table->resource, res_idx);
 	if (flow_idx)
 		mlx5_ipool_free(table->flow, flow_idx);
-	if (job)
-		flow_hw_job_put(priv, job, queue);
 	rte_flow_error_set(error, rte_errno,
 			   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 			   "fail to create rte flow");
@@ -3713,15 +3726,9 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	struct rte_flow_hw *nf;
 	struct rte_flow_hw_aux *aux;
 	struct rte_flow_template_table *table = of->table;
-	struct mlx5_hw_q_job *job = NULL;
 	uint32_t res_idx = 0;
 	int ret;
 
-	job = flow_hw_job_get(priv, queue);
-	if (!job) {
-		rte_errno = ENOMEM;
-		goto error;
-	}
 	aux = mlx5_flow_hw_aux(dev->data->port_id, of);
 	nf = &aux->upd_flow;
 	memset(nf, 0, sizeof(struct rte_flow_hw));
@@ -3741,14 +3748,6 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	} else {
 		nf->res_idx = of->res_idx;
 	}
-	/*
-	 * Set the job type here in order to know if the flow memory
-	 * should be freed or not when get the result from dequeue.
-	 */
-	job->type = MLX5_HW_Q_JOB_TYPE_UPDATE;
-	job->flow = nf;
-	job->user_data = user_data;
-	rule_attr.user_data = job;
 	/*
 	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices
 	 * for rule insertion hints.
@@ -3770,18 +3769,22 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 		rte_errno = EINVAL;
 		goto error;
 	}
-	/* Switch to the old flow. New flow will retrieved from the table on completion. */
-	job->flow = of;
+	/*
+	 * Set the flow operation type here in order to know if the flow memory
+	 * should be freed or not when get the result from dequeue.
+	 */
+	of->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE;
+	of->user_data = user_data;
+	rule_attr.user_data = of;
 	ret = mlx5dr_rule_action_update((struct mlx5dr_rule *)of->rule,
 					action_template_index, rule_acts, &rule_attr);
-	if (likely(!ret))
+	if (likely(!ret)) {
+		flow_hw_q_inc_flow_ops(priv, queue);
 		return 0;
+	}
 error:
 	if (table->resource && res_idx)
 		mlx5_ipool_free(table->resource, res_idx);
-	/* Flow created fail, return the descriptor and flow memory. */
-	if (job)
-		flow_hw_job_put(priv, job, queue);
 	return rte_flow_error_set(error, rte_errno,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 				  "fail to update rte flow");
@@ -3825,27 +3828,23 @@ flow_hw_async_flow_destroy(struct rte_eth_dev *dev,
 		.burst = attr->postpone,
 	};
 	struct rte_flow_hw *fh = (struct rte_flow_hw *)flow;
-	struct mlx5_hw_q_job *job;
+	bool resizable = rte_flow_template_table_resizable(dev->data->port_id,
+							   &fh->table->cfg.attr);
 	int ret;
 
-	job = flow_hw_job_get(priv, queue);
-	if (!job)
-		return rte_flow_error_set(error, ENOMEM,
-					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
-					  "fail to destroy rte flow: flow queue full");
-	job->type = !rte_flow_template_table_resizable(dev->data->port_id, &fh->table->cfg.attr) ?
-		    MLX5_HW_Q_JOB_TYPE_DESTROY : MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_DESTROY;
-	job->user_data = user_data;
-	job->flow = fh;
-	rule_attr.user_data = job;
+	fh->operation_type = !resizable ?
+			     MLX5_FLOW_HW_FLOW_OP_TYPE_DESTROY :
+			     MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_DESTROY;
+	fh->user_data = user_data;
+	rule_attr.user_data = fh;
 	rule_attr.rule_idx = fh->rule_idx;
 	ret = mlx5dr_rule_destroy((struct mlx5dr_rule *)fh->rule, &rule_attr);
 	if (ret) {
-		flow_hw_job_put(priv, job, queue);
 		return rte_flow_error_set(error, rte_errno,
 					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 					  "fail to destroy rte flow");
 	}
+	flow_hw_q_inc_flow_ops(priv, queue);
 	return 0;
 }
 
@@ -3950,16 +3949,16 @@ mlx5_hw_pull_flow_transfer_comp(struct rte_eth_dev *dev,
 				uint16_t n_res)
 {
 	uint32_t size, i;
-	struct mlx5_hw_q_job *job = NULL;
+	struct rte_flow_hw *flow = NULL;
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct rte_ring *ring = priv->hw_q[queue].flow_transfer_completed;
 
 	size = RTE_MIN(rte_ring_count(ring), n_res);
 	for (i = 0; i < size; i++) {
 		res[i].status = RTE_FLOW_OP_SUCCESS;
-		rte_ring_dequeue(ring, (void **)&job);
-		res[i].user_data = job->user_data;
-		flow_hw_job_put(priv, job, queue);
+		rte_ring_dequeue(ring, (void **)&flow);
+		res[i].user_data = flow->user_data;
+		flow_hw_q_dec_flow_ops(priv, queue);
 	}
 	return (int)size;
 }
@@ -4016,12 +4015,11 @@ __flow_hw_pull_indir_action_comp(struct rte_eth_dev *dev,
 
 static __rte_always_inline void
 hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
-			       struct mlx5_hw_q_job *job,
+			       struct rte_flow_hw *flow,
 			       uint32_t queue, struct rte_flow_error *error)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_aso_mtr_pool *pool = priv->hws_mpool;
-	struct rte_flow_hw *flow = job->flow;
 	struct rte_flow_template_table *table = flow->table;
 	/* Release the original resource index in case of update. */
 	uint32_t res_idx = flow->res_idx;
@@ -4037,12 +4035,10 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 		mlx5_ipool_free(pool->idx_pool,	flow->mtr_id);
 		flow->mtr_id = 0;
 	}
-	if (job->type != MLX5_HW_Q_JOB_TYPE_UPDATE) {
-		if (table) {
-			if (table->resource)
-				mlx5_ipool_free(table->resource, res_idx);
-			mlx5_ipool_free(table->flow, flow->idx);
-		}
+	if (flow->operation_type != MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE) {
+		if (table->resource)
+			mlx5_ipool_free(table->resource, res_idx);
+		mlx5_ipool_free(table->flow, flow->idx);
 	} else {
 		struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 		struct rte_flow_hw *upd_flow = &aux->upd_flow;
@@ -4055,28 +4051,27 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 
 static __rte_always_inline void
 hw_cmpl_resizable_tbl(struct rte_eth_dev *dev,
-		      struct mlx5_hw_q_job *job,
+		      struct rte_flow_hw *flow,
 		      uint32_t queue, enum rte_flow_op_status status,
 		      struct rte_flow_error *error)
 {
-	struct rte_flow_hw *flow = job->flow;
 	struct rte_flow_template_table *table = flow->table;
 	uint32_t selector = flow->matcher_selector;
 	uint32_t other_selector = (selector + 1) & 1;
 
-	switch (job->type) {
-	case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_CREATE:
+	switch (flow->operation_type) {
+	case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE:
 		rte_atomic_fetch_add_explicit
 			(&table->matcher_info[selector].refcnt, 1,
 			 rte_memory_order_relaxed);
 		break;
-	case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_DESTROY:
+	case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_DESTROY:
 		rte_atomic_fetch_sub_explicit
 			(&table->matcher_info[selector].refcnt, 1,
 			 rte_memory_order_relaxed);
-		hw_cmpl_flow_update_or_destroy(dev, job, queue, error);
+		hw_cmpl_flow_update_or_destroy(dev, flow, queue, error);
 		break;
-	case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_MOVE:
+	case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_MOVE:
 		if (status == RTE_FLOW_OP_SUCCESS) {
 			rte_atomic_fetch_sub_explicit
 				(&table->matcher_info[selector].refcnt, 1,
@@ -4120,7 +4115,6 @@ flow_hw_pull(struct rte_eth_dev *dev,
 	     struct rte_flow_error *error)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	struct mlx5_hw_q_job *job;
 	int ret, i;
 
 	/* 1. Pull the flow completion. */
@@ -4130,23 +4124,24 @@ flow_hw_pull(struct rte_eth_dev *dev,
 				RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 				"fail to query flow queue");
 	for (i = 0; i <  ret; i++) {
-		job = (struct mlx5_hw_q_job *)res[i].user_data;
+		struct rte_flow_hw *flow = res[i].user_data;
+
 		/* Restore user data. */
-		res[i].user_data = job->user_data;
-		switch (job->type) {
-		case MLX5_HW_Q_JOB_TYPE_DESTROY:
-		case MLX5_HW_Q_JOB_TYPE_UPDATE:
-			hw_cmpl_flow_update_or_destroy(dev, job, queue, error);
+		res[i].user_data = flow->user_data;
+		switch (flow->operation_type) {
+		case MLX5_FLOW_HW_FLOW_OP_TYPE_DESTROY:
+		case MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE:
+			hw_cmpl_flow_update_or_destroy(dev, flow, queue, error);
 			break;
-		case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_CREATE:
-		case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_MOVE:
-		case MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_DESTROY:
-			hw_cmpl_resizable_tbl(dev, job, queue, res[i].status, error);
+		case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE:
+		case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_DESTROY:
+		case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_MOVE:
+			hw_cmpl_resizable_tbl(dev, flow, queue, res[i].status, error);
 			break;
 		default:
 			break;
 		}
-		flow_hw_job_put(priv, job, queue);
+		flow_hw_q_dec_flow_ops(priv, queue);
 	}
 	/* 2. Pull indirect action comp. */
 	if (ret < n_res)
@@ -4190,7 +4185,7 @@ __flow_hw_push_action(struct rte_eth_dev *dev,
 			mlx5_aso_push_wqe(priv->sh,
 					  &priv->hws_mpool->sq[queue]);
 	}
-	return priv->hw_q[queue].size - priv->hw_q[queue].job_idx;
+	return flow_hw_q_pending(priv, queue);
 }
 
 static int
@@ -10204,6 +10199,7 @@ flow_hw_configure(struct rte_eth_dev *dev,
 	for (i = 0; i < nb_q_updated; i++) {
 		priv->hw_q[i].job_idx = _queue_attr[i]->size;
 		priv->hw_q[i].size = _queue_attr[i]->size;
+		priv->hw_q[i].ongoing_flow_ops = 0;
 		if (i == 0)
 			priv->hw_q[i].job = (struct mlx5_hw_q_job **)
 					    &priv->hw_q[nb_q_updated];
@@ -12635,7 +12631,6 @@ flow_hw_update_resized(struct rte_eth_dev *dev, uint32_t queue,
 {
 	int ret;
 	struct mlx5_priv *priv = dev->data->dev_private;
-	struct mlx5_hw_q_job *job;
 	struct rte_flow_hw *hw_flow = (struct rte_flow_hw *)flow;
 	struct rte_flow_template_table *table = hw_flow->table;
 	uint32_t table_selector = table->matcher_selector;
@@ -12661,31 +12656,26 @@ flow_hw_update_resized(struct rte_eth_dev *dev, uint32_t queue,
 		return rte_flow_error_set(error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 					  "no active table resize");
-	job = flow_hw_job_get(priv, queue);
-	if (!job)
-		return rte_flow_error_set(error, ENOMEM,
-					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
-					  "queue is full");
-	job->type = MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_MOVE;
-	job->user_data = user_data;
-	job->flow = hw_flow;
-	rule_attr.user_data = job;
+	hw_flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_MOVE;
+	hw_flow->user_data = user_data;
+	rule_attr.user_data = hw_flow;
 	if (rule_selector == table_selector) {
 		struct rte_ring *ring = !attr->postpone ?
 					priv->hw_q[queue].flow_transfer_completed :
 					priv->hw_q[queue].flow_transfer_pending;
-		rte_ring_enqueue(ring, job);
+		rte_ring_enqueue(ring, hw_flow);
+		flow_hw_q_inc_flow_ops(priv, queue);
 		return 0;
 	}
 	ret = mlx5dr_matcher_resize_rule_move(other_matcher,
 					      (struct mlx5dr_rule *)hw_flow->rule,
 					      &rule_attr);
 	if (ret) {
-		flow_hw_job_put(priv, job, queue);
 		return rte_flow_error_set(error, rte_errno,
 					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 					  "flow transfer failed");
 	}
+	flow_hw_q_inc_flow_ops(priv, queue);
 	return 0;
 }
 
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 09/11] net/mlx5: move rarely used flow fields outside
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
                     ` (7 preceding siblings ...)
  2024-02-29 11:51   ` [PATCH v2 08/11] net/mlx5: use flow as operation container Dariusz Sosnowski
@ 2024-02-29 11:51   ` Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 10/11] net/mlx5: reuse flow fields Dariusz Sosnowski
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

Some of the flow fields are either not always required
or are used very rarely, e.g.:

- AGE action reference,
- direct METER/METER_MARK action reference,
- matcher selector for resizable tables.

This patch moves these fields to the rte_flow_hw_aux struct in order to
reduce the overall size of the flow struct, reducing the total size
of the working set for the most common use cases.
This reduces the frequency of cache invalidations during
async flow operation processing.
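
A minimal sketch of the resulting split between hot and cold fields,
using hypothetical names, could look like this:

#include <stdint.h>

#define OP_UPDATE 3

/* Hot data kept in the flow struct itself. */
struct flow {
	uint8_t operation_type; /* OP_UPDATE selects the "upd" copy */
	uint32_t cnt_id;
};

/* Rarely used fields moved out of the hot flow struct. */
struct aux_fields {
	uint32_t age_idx;
	uint32_t mtr_id;
};

struct flow_aux {
	struct aux_fields orig; /* fields of the original rule */
	struct aux_fields upd;  /* fields staged for an ongoing update */
};

/* The setter picks the right copy depending on the ongoing operation,
 * in the spirit of mlx5_flow_hw_aux_set_age_idx().
 */
static inline void
aux_set_age_idx(struct flow *f, struct flow_aux *aux, uint32_t age_idx)
{
	if (f->operation_type == OP_UPDATE)
		aux->upd.age_idx = age_idx;
	else
		aux->orig.age_idx = age_idx;
}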

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow.h    |  61 +++++++++++-----
 drivers/net/mlx5/mlx5_flow_hw.c | 121 ++++++++++++++++++++++++--------
 2 files changed, 138 insertions(+), 44 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 46d8ce1775..e8f4d2cb16 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1283,31 +1283,60 @@ enum {
 #pragma GCC diagnostic ignored "-Wpedantic"
 #endif
 
-/* HWS flow struct. */
+/** HWS flow struct. */
 struct rte_flow_hw {
-	uint32_t idx; /* Flow index from indexed pool. */
-	uint32_t res_idx; /* Resource index from indexed pool. */
-	uint32_t fate_type; /* Fate action type. */
+	/** The table flow allcated from. */
+	struct rte_flow_template_table *table;
+	/** Application's private data passed to enqueued flow operation. */
+	void *user_data;
+	/** Flow index from indexed pool. */
+	uint32_t idx;
+	/** Resource index from indexed pool. */
+	uint32_t res_idx;
+	/** HWS flow rule index passed to mlx5dr. */
+	uint32_t rule_idx;
+	/** Fate action type. */
+	uint32_t fate_type;
+	/** Ongoing flow operation type. */
+	uint8_t operation_type;
+	/** Index of pattern template this flow is based on. */
+	uint8_t mt_idx;
+
+	/** COUNT action index. */
+	cnt_id_t cnt_id;
 	union {
-		/* Jump action. */
+		/** Jump action. */
 		struct mlx5_hw_jump_action *jump;
-		struct mlx5_hrxq *hrxq; /* TIR action. */
+		/** TIR action. */
+		struct mlx5_hrxq *hrxq;
 	};
-	struct rte_flow_template_table *table; /* The table flow allcated from. */
-	uint8_t mt_idx;
-	uint8_t matcher_selector:1;
+
+	/**
+	 * Padding for alignment to 56 bytes.
+	 * Since mlx5dr rule is 72 bytes, whole flow is contained within 128 B (2 cache lines).
+	 * This space is reserved for future additions to flow struct.
+	 */
+	uint8_t padding[10];
+	/** HWS layer data struct. */
+	uint8_t rule[];
+} __rte_packed;
+
+/** Auxiliary data fields that are updatable. */
+struct rte_flow_hw_aux_fields {
+	/** AGE action index. */
 	uint32_t age_idx;
-	cnt_id_t cnt_id;
+	/** Direct meter (METER or METER_MARK) action index. */
 	uint32_t mtr_id;
-	uint32_t rule_idx;
-	uint8_t operation_type; /**< Ongoing flow operation type. */
-	void *user_data; /**< Application's private data passed to enqueued flow operation. */
-	uint8_t padding[1]; /**< Padding for proper alignment of mlx5dr rule struct. */
-	uint8_t rule[]; /* HWS layer data struct. */
-} __rte_packed;
+};
 
 /** Auxiliary data stored per flow which is not required to be stored in main flow structure. */
 struct rte_flow_hw_aux {
+	/** Auxiliary fields associated with the original flow. */
+	struct rte_flow_hw_aux_fields orig;
+	/** Auxiliary fields associated with the updated flow. */
+	struct rte_flow_hw_aux_fields upd;
+	/** Index of resizable matcher associated with this flow. */
+	uint8_t matcher_selector;
 	/** Placeholder flow struct used during flow rule update operation. */
 	struct rte_flow_hw upd_flow;
 };
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index dc0b4bff3d..025f04ddde 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -139,6 +139,50 @@ mlx5_flow_hw_aux(uint16_t port_id, struct rte_flow_hw *flow)
 	}
 }
 
+static __rte_always_inline void
+mlx5_flow_hw_aux_set_age_idx(struct rte_flow_hw *flow,
+			     struct rte_flow_hw_aux *aux,
+			     uint32_t age_idx)
+{
+	/*
+	 * Only when creating a flow rule, the type will be set explicitly.
+	 * Or else, it should be none in the rule update case.
+	 */
+	if (unlikely(flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE))
+		aux->upd.age_idx = age_idx;
+	else
+		aux->orig.age_idx = age_idx;
+}
+
+static __rte_always_inline uint32_t
+mlx5_flow_hw_aux_get_age_idx(struct rte_flow_hw *flow, struct rte_flow_hw_aux *aux)
+{
+	if (unlikely(flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE))
+		return aux->upd.age_idx;
+	else
+		return aux->orig.age_idx;
+}
+
+static __rte_always_inline void
+mlx5_flow_hw_aux_set_mtr_id(struct rte_flow_hw *flow,
+			    struct rte_flow_hw_aux *aux,
+			    uint32_t mtr_id)
+{
+	if (unlikely(flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE))
+		aux->upd.mtr_id = mtr_id;
+	else
+		aux->orig.mtr_id = mtr_id;
+}
+
+static __rte_always_inline uint32_t __rte_unused
+mlx5_flow_hw_aux_get_mtr_id(struct rte_flow_hw *flow, struct rte_flow_hw_aux *aux)
+{
+	if (unlikely(flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE))
+		return aux->upd.mtr_id;
+	else
+		return aux->orig.mtr_id;
+}
+
 static int
 mlx5_tbl_multi_pattern_process(struct rte_eth_dev *dev,
 			       struct rte_flow_template_table *tbl,
@@ -2766,6 +2810,7 @@ flow_hw_shared_action_construct(struct rte_eth_dev *dev, uint32_t queue,
 	struct mlx5_aso_mtr *aso_mtr;
 	struct mlx5_age_info *age_info;
 	struct mlx5_hws_age_param *param;
+	struct rte_flow_hw_aux *aux;
 	uint32_t act_idx = (uint32_t)(uintptr_t)action->conf;
 	uint32_t type = act_idx >> MLX5_INDIRECT_ACTION_TYPE_OFFSET;
 	uint32_t idx = act_idx &
@@ -2803,11 +2848,12 @@ flow_hw_shared_action_construct(struct rte_eth_dev *dev, uint32_t queue,
 		flow->cnt_id = act_idx;
 		break;
 	case MLX5_INDIRECT_ACTION_TYPE_AGE:
+		aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 		/*
 		 * Save the index with the indirect type, to recognize
 		 * it in flow destroy.
 		 */
-		flow->age_idx = act_idx;
+		mlx5_flow_hw_aux_set_age_idx(flow, aux, act_idx);
 		if (action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT)
 			/*
 			 * The mutual update for idirect AGE & COUNT will be
@@ -3034,14 +3080,16 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	const struct rte_flow_action_age *age = NULL;
 	const struct rte_flow_action_nat64 *nat64_c = NULL;
 	struct rte_flow_attr attr = {
-			.ingress = 1,
+		.ingress = 1,
 	};
 	uint32_t ft_flag;
-	size_t encap_len = 0;
 	int ret;
+	size_t encap_len = 0;
 	uint32_t age_idx = 0;
+	uint32_t mtr_idx = 0;
 	struct mlx5_aso_mtr *aso_mtr;
 	struct mlx5_multi_pattern_segment *mp_segment = NULL;
+	struct rte_flow_hw_aux *aux;
 
 	attr.group = table->grp->group_id;
 	ft_flag = mlx5_hw_act_flag[!!table->grp->group_id][table->type];
@@ -3221,6 +3269,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				return -1;
 			break;
 		case RTE_FLOW_ACTION_TYPE_AGE:
+			aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 			age = action->conf;
 			/*
 			 * First, create the AGE parameter, then create its
@@ -3234,7 +3283,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 							     error);
 			if (age_idx == 0)
 				return -rte_errno;
-			flow->age_idx = age_idx;
+			mlx5_flow_hw_aux_set_age_idx(flow, aux, age_idx);
 			if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT)
 				/*
 				 * When AGE uses indirect counter, no need to
@@ -3295,9 +3344,11 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			 */
 			ret = flow_hw_meter_mark_compile(dev,
 				act_data->action_dst, action,
-				rule_acts, &flow->mtr_id, MLX5_HW_INV_QUEUE, error);
+				rule_acts, &mtr_idx, MLX5_HW_INV_QUEUE, error);
 			if (ret != 0)
 				return ret;
+			aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
+			mlx5_flow_hw_aux_set_mtr_id(flow, aux, mtr_idx);
 			break;
 		case RTE_FLOW_ACTION_TYPE_NAT64:
 			nat64_c = action->conf;
@@ -3310,9 +3361,10 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 	}
 	if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT) {
 		if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_AGE) {
-			age_idx = flow->age_idx & MLX5_HWS_AGE_IDX_MASK;
-			if (mlx5_hws_cnt_age_get(priv->hws_cpool,
-						 flow->cnt_id) != age_idx)
+			aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
+			age_idx = mlx5_flow_hw_aux_get_age_idx(flow, aux) &
+				  MLX5_HWS_AGE_IDX_MASK;
+			if (mlx5_hws_cnt_age_get(priv->hws_cpool, flow->cnt_id) != age_idx)
 				/*
 				 * This is first use of this indirect counter
 				 * for this indirect AGE, need to increase the
@@ -3324,8 +3376,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 		 * Update this indirect counter the indirect/direct AGE in which
 		 * using it.
 		 */
-		mlx5_hws_cnt_age_set(priv->hws_cpool, flow->cnt_id,
-				     age_idx);
+		mlx5_hws_cnt_age_set(priv->hws_cpool, flow->cnt_id, age_idx);
 	}
 	if (hw_acts->encap_decap && !hw_acts->encap_decap->shared) {
 		int ix = mlx5_multi_pattern_reformat_to_index(hw_acts->encap_decap->action_type);
@@ -3518,6 +3569,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 					 &rule_attr,
 					 (struct mlx5dr_rule *)flow->rule);
 	} else {
+		struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 		uint32_t selector;
 
 		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE;
@@ -3529,7 +3581,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 					 &rule_attr,
 					 (struct mlx5dr_rule *)flow->rule);
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
-		flow->matcher_selector = selector;
+		aux->matcher_selector = selector;
 	}
 	if (likely(!ret)) {
 		flow_hw_q_inc_flow_ops(priv, queue);
@@ -3651,6 +3703,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 					 rule_acts, &rule_attr,
 					 (struct mlx5dr_rule *)flow->rule);
 	} else {
+		struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 		uint32_t selector;
 
 		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE;
@@ -3661,6 +3714,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 					 rule_acts, &rule_attr,
 					 (struct mlx5dr_rule *)flow->rule);
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
+		aux->matcher_selector = selector;
 	}
 	if (likely(!ret)) {
 		flow_hw_q_inc_flow_ops(priv, queue);
@@ -3748,6 +3802,8 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	} else {
 		nf->res_idx = of->res_idx;
 	}
+	/* Indicate the construction function to set the proper fields. */
+	nf->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE;
 	/*
 	 * Indexed pool returns 1-based indices, but mlx5dr expects 0-based indices
 	 * for rule insertion hints.
@@ -3865,15 +3921,17 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
 			  struct rte_flow_hw *flow,
 			  struct rte_flow_error *error)
 {
+	struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(priv->dev_data->port_id, flow);
 	uint32_t *cnt_queue;
+	uint32_t age_idx = aux->orig.age_idx;
 
 	if (mlx5_hws_cnt_is_shared(priv->hws_cpool, flow->cnt_id)) {
-		if (flow->age_idx && !mlx5_hws_age_is_indirect(flow->age_idx)) {
+		if (age_idx && !mlx5_hws_age_is_indirect(age_idx)) {
 			/* Remove this AGE parameter from indirect counter. */
 			mlx5_hws_cnt_age_set(priv->hws_cpool, flow->cnt_id, 0);
 			/* Release the AGE parameter. */
-			mlx5_hws_age_action_destroy(priv, flow->age_idx, error);
-			flow->age_idx = 0;
+			mlx5_hws_age_action_destroy(priv, age_idx, error);
+			mlx5_flow_hw_aux_set_age_idx(flow, aux, 0);
 		}
 		return;
 	}
@@ -3882,16 +3940,16 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
 	/* Put the counter first to reduce the race risk in BG thread. */
 	mlx5_hws_cnt_pool_put(priv->hws_cpool, cnt_queue, &flow->cnt_id);
 	flow->cnt_id = 0;
-	if (flow->age_idx) {
-		if (mlx5_hws_age_is_indirect(flow->age_idx)) {
-			uint32_t idx = flow->age_idx & MLX5_HWS_AGE_IDX_MASK;
+	if (age_idx) {
+		if (mlx5_hws_age_is_indirect(age_idx)) {
+			uint32_t idx = age_idx & MLX5_HWS_AGE_IDX_MASK;
 
 			mlx5_hws_age_nb_cnt_decrease(priv, idx);
 		} else {
 			/* Release the AGE parameter. */
-			mlx5_hws_age_action_destroy(priv, flow->age_idx, error);
+			mlx5_hws_age_action_destroy(priv, age_idx, error);
 		}
-		flow->age_idx = 0;
+		mlx5_flow_hw_aux_set_age_idx(flow, aux, age_idx);
 	}
 }
 
@@ -4021,6 +4079,7 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_aso_mtr_pool *pool = priv->hws_mpool;
 	struct rte_flow_template_table *table = flow->table;
+	struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 	/* Release the original resource index in case of update. */
 	uint32_t res_idx = flow->res_idx;
 
@@ -4031,9 +4090,9 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 	if (mlx5_hws_cnt_id_valid(flow->cnt_id))
 		flow_hw_age_count_release(priv, queue,
 					  flow, error);
-	if (flow->mtr_id) {
-		mlx5_ipool_free(pool->idx_pool,	flow->mtr_id);
-		flow->mtr_id = 0;
+	if (aux->orig.mtr_id) {
+		mlx5_ipool_free(pool->idx_pool,	aux->orig.mtr_id);
+		aux->orig.mtr_id = 0;
 	}
 	if (flow->operation_type != MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE) {
 		if (table->resource)
@@ -4044,6 +4103,8 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 		struct rte_flow_hw *upd_flow = &aux->upd_flow;
 
 		rte_memcpy(flow, upd_flow, offsetof(struct rte_flow_hw, rule));
+		aux->orig = aux->upd;
+		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE;
 		if (table->resource)
 			mlx5_ipool_free(table->resource, res_idx);
 	}
@@ -4056,7 +4117,8 @@ hw_cmpl_resizable_tbl(struct rte_eth_dev *dev,
 		      struct rte_flow_error *error)
 {
 	struct rte_flow_template_table *table = flow->table;
-	uint32_t selector = flow->matcher_selector;
+	struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
+	uint32_t selector = aux->matcher_selector;
 	uint32_t other_selector = (selector + 1) & 1;
 
 	switch (flow->operation_type) {
@@ -4079,7 +4141,7 @@ hw_cmpl_resizable_tbl(struct rte_eth_dev *dev,
 			rte_atomic_fetch_add_explicit
 				(&table->matcher_info[other_selector].refcnt, 1,
 				 rte_memory_order_relaxed);
-			flow->matcher_selector = other_selector;
+			aux->matcher_selector = other_selector;
 		}
 		break;
 	default:
@@ -11342,6 +11404,7 @@ flow_hw_query(struct rte_eth_dev *dev, struct rte_flow *flow,
 {
 	int ret = -EINVAL;
 	struct rte_flow_hw *hw_flow = (struct rte_flow_hw *)flow;
+	struct rte_flow_hw_aux *aux;
 
 	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
 		switch (actions->type) {
@@ -11352,8 +11415,9 @@ flow_hw_query(struct rte_eth_dev *dev, struct rte_flow *flow,
 						    error);
 			break;
 		case RTE_FLOW_ACTION_TYPE_AGE:
-			ret = flow_hw_query_age(dev, hw_flow->age_idx, data,
-						error);
+			aux = mlx5_flow_hw_aux(dev->data->port_id, hw_flow);
+			ret = flow_hw_query_age(dev, mlx5_flow_hw_aux_get_age_idx(hw_flow, aux),
+						data, error);
 			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
@@ -12633,8 +12697,9 @@ flow_hw_update_resized(struct rte_eth_dev *dev, uint32_t queue,
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct rte_flow_hw *hw_flow = (struct rte_flow_hw *)flow;
 	struct rte_flow_template_table *table = hw_flow->table;
+	struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, hw_flow);
 	uint32_t table_selector = table->matcher_selector;
-	uint32_t rule_selector = hw_flow->matcher_selector;
+	uint32_t rule_selector = aux->matcher_selector;
 	uint32_t other_selector;
 	struct mlx5dr_matcher *other_matcher;
 	struct mlx5dr_rule_attr rule_attr = {
@@ -12647,7 +12712,7 @@ flow_hw_update_resized(struct rte_eth_dev *dev, uint32_t queue,
 	 * the one that was used BEFORE table resize.
 	 * Since the function is called AFTER table resize,
 	 * `table->matcher_selector` always points to the new matcher and
-	 * `hw_flow->matcher_selector` points to a matcher used to create the flow.
+	 * `aux->matcher_selector` points to a matcher used to create the flow.
 	 */
 	other_selector = rule_selector == table_selector ?
 			 (rule_selector + 1) & 1 : rule_selector;
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 10/11] net/mlx5: reuse flow fields
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
                     ` (8 preceding siblings ...)
  2024-02-29 11:51   ` [PATCH v2 09/11] net/mlx5: move rarely used flow fields outside Dariusz Sosnowski
@ 2024-02-29 11:51   ` Dariusz Sosnowski
  2024-02-29 11:51   ` [PATCH v2 11/11] net/mlx5: remove unneeded device status checking Dariusz Sosnowski
  2024-03-03 12:16   ` [PATCH v2 00/11] net/mlx5: flow insertion performance improvements Raslan Darawsheh
  11 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao

Each time a flow is allocated in the mlx5 PMD, the whole buffer,
both the rte_flow_hw and mlx5dr_rule parts, is zeroed.
This introduces some wasted work because:

- the mlx5dr layer does not assume that mlx5dr_rule must be initialized,
- flow action translation in the mlx5 PMD does not need most of the
  fields of rte_flow_hw to be zeroed.

To reduce this wasted work, this patch introduces a flags field in the
flow definition. Each flow field which is not always initialized
during flow creation has a corresponding flag that is set when its
value is valid (in other words, when it was set during flow creation).
Utilizing this mechanism allows the PMD to:

- remove zeroing from flow allocation,
- access some fields (especially from rte_flow_hw_aux) if and only if
  the corresponding flag is set, as in the sketch below.
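
A minimal, self-contained sketch of the flag-guarded field pattern is
shown below. All names in it (struct example_flow, EXAMPLE_FLAG_AGE_IDX
and the helpers) are hypothetical and only illustrate the idea; they are
not the PMD's actual definitions.

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical flag bit marking that age_idx holds a valid value. */
#define EXAMPLE_FLAG_AGE_IDX (UINT32_C(1) << 3)

struct example_flow {
	uint32_t flags;   /* bitmask of valid optional fields */
	uint32_t age_idx; /* meaningful only when the flag above is set */
};

/* Setter stores the value and marks it valid, so no zeroing is needed. */
static inline void
example_set_age_idx(struct example_flow *flow, uint32_t age_idx)
{
	flow->age_idx = age_idx;
	flow->flags |= EXAMPLE_FLAG_AGE_IDX;
}

/* Readers test the flag instead of relying on zero-initialization. */
static inline bool
example_get_age_idx(const struct example_flow *flow, uint32_t *age_idx)
{
	if (!(flow->flags & EXAMPLE_FLAG_AGE_IDX))
		return false;
	*age_idx = flow->age_idx;
	return true;
}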

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow.h    | 24 ++++++++-
 drivers/net/mlx5/mlx5_flow_hw.c | 93 +++++++++++++++++++++------------
 2 files changed, 83 insertions(+), 34 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index e8f4d2cb16..db65825eab 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1279,6 +1279,26 @@ enum {
 	MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_MOVE,
 };
 
+enum {
+	MLX5_FLOW_HW_FLOW_FLAG_CNT_ID = RTE_BIT32(0),
+	MLX5_FLOW_HW_FLOW_FLAG_FATE_JUMP = RTE_BIT32(1),
+	MLX5_FLOW_HW_FLOW_FLAG_FATE_HRXQ = RTE_BIT32(2),
+	MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX = RTE_BIT32(3),
+	MLX5_FLOW_HW_FLOW_FLAG_MTR_ID = RTE_BIT32(4),
+	MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR = RTE_BIT32(5),
+	MLX5_FLOW_HW_FLOW_FLAG_UPD_FLOW = RTE_BIT32(6),
+};
+
+#define MLX5_FLOW_HW_FLOW_FLAGS_ALL ( \
+		MLX5_FLOW_HW_FLOW_FLAG_CNT_ID | \
+		MLX5_FLOW_HW_FLOW_FLAG_FATE_JUMP | \
+		MLX5_FLOW_HW_FLOW_FLAG_FATE_HRXQ | \
+		MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX | \
+		MLX5_FLOW_HW_FLOW_FLAG_MTR_ID | \
+		MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR | \
+		MLX5_FLOW_HW_FLOW_FLAG_UPD_FLOW \
+	)
+
 #ifdef PEDANTIC
 #pragma GCC diagnostic ignored "-Wpedantic"
 #endif
@@ -1295,8 +1315,8 @@ struct rte_flow_hw {
 	uint32_t res_idx;
 	/** HWS flow rule index passed to mlx5dr. */
 	uint32_t rule_idx;
-	/** Fate action type. */
-	uint32_t fate_type;
+	/** Which flow fields (inline or in auxiliary struct) are used. */
+	uint32_t flags;
 	/** Ongoing flow operation type. */
 	uint8_t operation_type;
 	/** Index of pattern template this flow is based on. */
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 025f04ddde..979be4764a 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -2845,6 +2845,7 @@ flow_hw_shared_action_construct(struct rte_eth_dev *dev, uint32_t queue,
 				&rule_act->action,
 				&rule_act->counter.offset))
 			return -1;
+		flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_CNT_ID;
 		flow->cnt_id = act_idx;
 		break;
 	case MLX5_INDIRECT_ACTION_TYPE_AGE:
@@ -2854,6 +2855,7 @@ flow_hw_shared_action_construct(struct rte_eth_dev *dev, uint32_t queue,
 		 * it in flow destroy.
 		 */
 		mlx5_flow_hw_aux_set_age_idx(flow, aux, act_idx);
+		flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX;
 		if (action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT)
 			/*
 			 * The mutual update for idirect AGE & COUNT will be
@@ -2869,6 +2871,7 @@ flow_hw_shared_action_construct(struct rte_eth_dev *dev, uint32_t queue,
 						  &param->queue_id, &age_cnt,
 						  idx) < 0)
 				return -1;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_CNT_ID;
 			flow->cnt_id = age_cnt;
 			param->nb_cnts++;
 		} else {
@@ -3174,7 +3177,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			rule_acts[act_data->action_dst].action =
 			(!!attr.group) ? jump->hws_action : jump->root_action;
 			flow->jump = jump;
-			flow->fate_type = MLX5_FLOW_FATE_JUMP;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_FATE_JUMP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_RSS:
 		case RTE_FLOW_ACTION_TYPE_QUEUE:
@@ -3185,7 +3188,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				return -1;
 			rule_acts[act_data->action_dst].action = hrxq->action;
 			flow->hrxq = hrxq;
-			flow->fate_type = MLX5_FLOW_FATE_QUEUE;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_FATE_HRXQ;
 			break;
 		case MLX5_RTE_FLOW_ACTION_TYPE_RSS:
 			item_flags = table->its[it_idx]->item_flags;
@@ -3264,7 +3267,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 					(!!attr.group) ? jump->hws_action :
 							 jump->root_action;
 			flow->jump = jump;
-			flow->fate_type = MLX5_FLOW_FATE_JUMP;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_FATE_JUMP;
 			if (mlx5_aso_mtr_wait(priv->sh, MLX5_HW_INV_QUEUE, aso_mtr))
 				return -1;
 			break;
@@ -3284,6 +3287,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 			if (age_idx == 0)
 				return -rte_errno;
 			mlx5_flow_hw_aux_set_age_idx(flow, aux, age_idx);
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX;
 			if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT)
 				/*
 				 * When AGE uses indirect counter, no need to
@@ -3306,6 +3310,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				 );
 			if (ret != 0)
 				return ret;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_CNT_ID;
 			flow->cnt_id = cnt_id;
 			break;
 		case MLX5_RTE_FLOW_ACTION_TYPE_COUNT:
@@ -3317,6 +3322,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				 );
 			if (ret != 0)
 				return ret;
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_CNT_ID;
 			flow->cnt_id = act_data->shared_counter.id;
 			break;
 		case RTE_FLOW_ACTION_TYPE_CONNTRACK:
@@ -3349,6 +3355,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				return ret;
 			aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 			mlx5_flow_hw_aux_set_mtr_id(flow, aux, mtr_idx);
+			flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_MTR_ID;
 			break;
 		case RTE_FLOW_ACTION_TYPE_NAT64:
 			nat64_c = action->conf;
@@ -3360,7 +3367,11 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 		}
 	}
 	if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_COUNT) {
+		/* If indirect count is used, then CNT_ID flag should be set. */
+		MLX5_ASSERT(flow->flags & MLX5_FLOW_HW_FLOW_FLAG_CNT_ID);
 		if (at->action_flags & MLX5_FLOW_ACTION_INDIRECT_AGE) {
+			/* If indirect AGE is used, then AGE_IDX flag should be set. */
+			MLX5_ASSERT(flow->flags & MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX);
 			aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 			age_idx = mlx5_flow_hw_aux_get_age_idx(flow, aux) &
 				  MLX5_HWS_AGE_IDX_MASK;
@@ -3398,8 +3409,10 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				flow->res_idx - 1;
 		rule_acts[hw_acts->push_remove_pos].ipv6_ext.header = ap->ipv6_push_data;
 	}
-	if (mlx5_hws_cnt_id_valid(hw_acts->cnt_id))
+	if (mlx5_hws_cnt_id_valid(hw_acts->cnt_id)) {
+		flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_CNT_ID;
 		flow->cnt_id = hw_acts->cnt_id;
+	}
 	return 0;
 }
 
@@ -3512,7 +3525,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 				   "Port must be started before enqueueing flow operations");
 		return NULL;
 	}
-	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
+	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
 	if (!flow)
 		goto error;
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
@@ -3531,6 +3544,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	} else {
 		flow->res_idx = flow_idx;
 	}
+	flow->flags = 0;
 	/*
 	 * Set the flow operation type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
@@ -3582,6 +3596,7 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 					 (struct mlx5dr_rule *)flow->rule);
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
 		aux->matcher_selector = selector;
+		flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR;
 	}
 	if (likely(!ret)) {
 		flow_hw_q_inc_flow_ops(priv, queue);
@@ -3655,7 +3670,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 				   "Flow rule index exceeds table size");
 		return NULL;
 	}
-	flow = mlx5_ipool_zmalloc(table->flow, &flow_idx);
+	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
 	if (!flow)
 		goto error;
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
@@ -3674,6 +3689,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	} else {
 		flow->res_idx = flow_idx;
 	}
+	flow->flags = 0;
 	/*
 	 * Set the flow operation type here in order to know if the flow memory
 	 * should be freed or not when get the result from dequeue.
@@ -3715,6 +3731,7 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 					 (struct mlx5dr_rule *)flow->rule);
 		rte_rwlock_read_unlock(&table->matcher_replace_rwlk);
 		aux->matcher_selector = selector;
+		flow->flags |= MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR;
 	}
 	if (likely(!ret)) {
 		flow_hw_q_inc_flow_ops(priv, queue);
@@ -3802,6 +3819,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	} else {
 		nf->res_idx = of->res_idx;
 	}
+	nf->flags = 0;
 	/* Indicate the construction function to set the proper fields. */
 	nf->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE;
 	/*
@@ -3831,6 +3849,7 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	 */
 	of->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE;
 	of->user_data = user_data;
+	of->flags |= MLX5_FLOW_HW_FLOW_FLAG_UPD_FLOW;
 	rule_attr.user_data = of;
 	ret = mlx5dr_rule_action_update((struct mlx5dr_rule *)of->rule,
 					action_template_index, rule_acts, &rule_attr);
@@ -3925,13 +3944,14 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
 	uint32_t *cnt_queue;
 	uint32_t age_idx = aux->orig.age_idx;
 
+	MLX5_ASSERT(flow->flags & MLX5_FLOW_HW_FLOW_FLAG_CNT_ID);
 	if (mlx5_hws_cnt_is_shared(priv->hws_cpool, flow->cnt_id)) {
-		if (age_idx && !mlx5_hws_age_is_indirect(age_idx)) {
+		if ((flow->flags & MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX) &&
+		    !mlx5_hws_age_is_indirect(age_idx)) {
 			/* Remove this AGE parameter from indirect counter. */
 			mlx5_hws_cnt_age_set(priv->hws_cpool, flow->cnt_id, 0);
 			/* Release the AGE parameter. */
 			mlx5_hws_age_action_destroy(priv, age_idx, error);
-			mlx5_flow_hw_aux_set_age_idx(flow, aux, 0);
 		}
 		return;
 	}
@@ -3939,8 +3959,7 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
 	cnt_queue = mlx5_hws_cnt_is_pool_shared(priv) ? NULL : &queue;
 	/* Put the counter first to reduce the race risk in BG thread. */
 	mlx5_hws_cnt_pool_put(priv->hws_cpool, cnt_queue, &flow->cnt_id);
-	flow->cnt_id = 0;
-	if (age_idx) {
+	if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX) {
 		if (mlx5_hws_age_is_indirect(age_idx)) {
 			uint32_t idx = age_idx & MLX5_HWS_AGE_IDX_MASK;
 
@@ -3949,7 +3968,6 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
 			/* Release the AGE parameter. */
 			mlx5_hws_age_action_destroy(priv, age_idx, error);
 		}
-		mlx5_flow_hw_aux_set_age_idx(flow, aux, age_idx);
 	}
 }
 
@@ -4079,34 +4097,35 @@ hw_cmpl_flow_update_or_destroy(struct rte_eth_dev *dev,
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_aso_mtr_pool *pool = priv->hws_mpool;
 	struct rte_flow_template_table *table = flow->table;
-	struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
 	/* Release the original resource index in case of update. */
 	uint32_t res_idx = flow->res_idx;
 
-	if (flow->fate_type == MLX5_FLOW_FATE_JUMP)
-		flow_hw_jump_release(dev, flow->jump);
-	else if (flow->fate_type == MLX5_FLOW_FATE_QUEUE)
-		mlx5_hrxq_obj_release(dev, flow->hrxq);
-	if (mlx5_hws_cnt_id_valid(flow->cnt_id))
-		flow_hw_age_count_release(priv, queue,
-					  flow, error);
-	if (aux->orig.mtr_id) {
-		mlx5_ipool_free(pool->idx_pool,	aux->orig.mtr_id);
-		aux->orig.mtr_id = 0;
-	}
-	if (flow->operation_type != MLX5_FLOW_HW_FLOW_OP_TYPE_UPDATE) {
-		if (table->resource)
-			mlx5_ipool_free(table->resource, res_idx);
-		mlx5_ipool_free(table->flow, flow->idx);
-	} else {
+	if (flow->flags & MLX5_FLOW_HW_FLOW_FLAGS_ALL) {
 		struct rte_flow_hw_aux *aux = mlx5_flow_hw_aux(dev->data->port_id, flow);
-		struct rte_flow_hw *upd_flow = &aux->upd_flow;
 
-		rte_memcpy(flow, upd_flow, offsetof(struct rte_flow_hw, rule));
-		aux->orig = aux->upd;
-		flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE;
+		if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_FATE_JUMP)
+			flow_hw_jump_release(dev, flow->jump);
+		else if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_FATE_HRXQ)
+			mlx5_hrxq_obj_release(dev, flow->hrxq);
+		if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_CNT_ID)
+			flow_hw_age_count_release(priv, queue, flow, error);
+		if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_MTR_ID)
+			mlx5_ipool_free(pool->idx_pool, aux->orig.mtr_id);
+		if (flow->flags & MLX5_FLOW_HW_FLOW_FLAG_UPD_FLOW) {
+			struct rte_flow_hw *upd_flow = &aux->upd_flow;
+
+			rte_memcpy(flow, upd_flow, offsetof(struct rte_flow_hw, rule));
+			aux->orig = aux->upd;
+			flow->operation_type = MLX5_FLOW_HW_FLOW_OP_TYPE_CREATE;
+			if (table->resource)
+				mlx5_ipool_free(table->resource, res_idx);
+		}
+	}
+	if (flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_DESTROY ||
+	    flow->operation_type == MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_DESTROY) {
 		if (table->resource)
 			mlx5_ipool_free(table->resource, res_idx);
+		mlx5_ipool_free(table->flow, flow->idx);
 	}
 }
 
@@ -4121,6 +4140,7 @@ hw_cmpl_resizable_tbl(struct rte_eth_dev *dev,
 	uint32_t selector = aux->matcher_selector;
 	uint32_t other_selector = (selector + 1) & 1;
 
+	MLX5_ASSERT(flow->flags & MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR);
 	switch (flow->operation_type) {
 	case MLX5_FLOW_HW_FLOW_OP_TYPE_RSZ_TBL_CREATE:
 		rte_atomic_fetch_add_explicit
@@ -11411,10 +11431,18 @@ flow_hw_query(struct rte_eth_dev *dev, struct rte_flow *flow,
 		case RTE_FLOW_ACTION_TYPE_VOID:
 			break;
 		case RTE_FLOW_ACTION_TYPE_COUNT:
+			if (!(hw_flow->flags & MLX5_FLOW_HW_FLOW_FLAG_CNT_ID))
+				return rte_flow_error_set(error, EINVAL,
+							  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+							  "counter not defined in the rule");
 			ret = flow_hw_query_counter(dev, hw_flow->cnt_id, data,
 						    error);
 			break;
 		case RTE_FLOW_ACTION_TYPE_AGE:
+			if (!(hw_flow->flags & MLX5_FLOW_HW_FLOW_FLAG_AGE_IDX))
+				return rte_flow_error_set(error, EINVAL,
+							  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+							  "age data not available");
 			aux = mlx5_flow_hw_aux(dev->data->port_id, hw_flow);
 			ret = flow_hw_query_age(dev, mlx5_flow_hw_aux_get_age_idx(hw_flow, aux),
 						data, error);
@@ -12707,6 +12735,7 @@ flow_hw_update_resized(struct rte_eth_dev *dev, uint32_t queue,
 		.burst = attr->postpone,
 	};
 
+	MLX5_ASSERT(hw_flow->flags & MLX5_FLOW_HW_FLOW_FLAG_MATCHER_SELECTOR);
 	/**
 	 * mlx5dr_matcher_resize_rule_move() accepts original table matcher -
 	 * the one that was used BEFORE table resize.
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 11/11] net/mlx5: remove unneeded device status checking
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
                     ` (9 preceding siblings ...)
  2024-02-29 11:51   ` [PATCH v2 10/11] net/mlx5: reuse flow fields Dariusz Sosnowski
@ 2024-02-29 11:51   ` Dariusz Sosnowski
  2024-03-03 12:16   ` [PATCH v2 00/11] net/mlx5: flow insertion performance improvements Raslan Darawsheh
  11 siblings, 0 replies; 26+ messages in thread
From: Dariusz Sosnowski @ 2024-02-29 11:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Raslan Darawsheh, Bing Zhao, stable

From: Bing Zhao <bingz@nvidia.com>

A flow rule can be inserted even before the device is started. The
only exception is a rule with a queue or RSS action.

The other template API interfaces do not check the start status.
Checking it would cause cache misses or evictions, since the flag
is located on another cache line.

Fixes: f1fecffa88df ("net/mlx5: support Direct Rules action template API")
Cc: stable@dpdk.org

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow_hw.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 979be4764a..285ec603d3 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -3520,11 +3520,6 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	uint32_t res_idx = 0;
 	int ret;
 
-	if (unlikely((!dev->data->dev_started))) {
-		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
-				   "Port must be started before enqueueing flow operations");
-		return NULL;
-	}
 	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
 	if (!flow)
 		goto error;
-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v2 00/11] net/mlx5: flow insertion performance improvements
  2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
                     ` (10 preceding siblings ...)
  2024-02-29 11:51   ` [PATCH v2 11/11] net/mlx5: remove unneeded device status checking Dariusz Sosnowski
@ 2024-03-03 12:16   ` Raslan Darawsheh
  11 siblings, 0 replies; 26+ messages in thread
From: Raslan Darawsheh @ 2024-03-03 12:16 UTC (permalink / raw)
  To: Dariusz Sosnowski, Slava Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Bing Zhao

Hi.

> -----Original Message-----
> From: Dariusz Sosnowski <dsosnowski@nvidia.com>
> Sent: Thursday, February 29, 2024 1:52 PM
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
> Suanming Mou <suanmingm@nvidia.com>; Matan Azrad
> <matan@nvidia.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>; Bing Zhao
> <bingz@nvidia.com>
> Subject: [PATCH v2 00/11] net/mlx5: flow insertion performance
> improvements
> 
> Goal of this patchset is to improve the throughput of flow insertion and
> deletion in mlx5 PMD when HW Steering flow engine is used.
> 
> - Patch 1 - Use preallocated per-queue, per-actions template buffer
>   for storing translated flow actions, instead of allocating and
>   filling it on demand, on each flow operation.
> - Patches 2-4 - Make resource index allocation optional. This allocation
>   will be skipped when it is not required by the created template table.
> - Patches 5-7 - Reduce memory footprint of the internal flow queue.
> - Patch 8 - Remove indirection between flow job and flow itself,
>   by using flow as an operation container.
> - Patches 9-10 - Reduce memory footprint of flow struct by moving
>   rarely used flow fields outside of the main flow struct.
>   These fields will be accessed only when needed.
>   Also remove unneeded `zmalloc` usage.
> - Patch 11 - Remove unneeded device status check in flow create.
> 
> In general all of these changes result in the following improvements (all
> numbers are averaged Kflows/sec):
> 
> |              | Insertion  |   +%   | Deletion |   +%  |
> |--------------|:----------:|:------:|:--------:|:-----:|
> | baseline     |   6338.7   |        |  9739.6  |       |
> | improvements |   6978.8   | +10.1% |  10432.4 | +7.1% |
> 
> The basic benchmark was run on ConnectX-6 Dx (22.40.1000), on the system
> with Intel Xeon Platinum 8380 CPU.
> 
> v2:
> 
> - Rebased.
> - Applied Acked-by tags from previous version.
> 
> Bing Zhao (2):
>   net/mlx5: skip the unneeded resource index allocation
>   net/mlx5: remove unneeded device status checking
> 
> Dariusz Sosnowski (7):
>   net/mlx5: allocate local DR rule action buffers
>   net/mlx5: remove action params from job
>   net/mlx5: remove flow pattern from job
>   net/mlx5: remove updated flow from job
>   net/mlx5: use flow as operation container
>   net/mlx5: move rarely used flow fields outside
>   net/mlx5: reuse flow fields
> 
> Erez Shitrit (2):
>   net/mlx5/hws: add check for matcher rule update support
>   net/mlx5/hws: add check if matcher contains complex rules
> 
>  drivers/net/mlx5/hws/mlx5dr.h         |  16 +
>  drivers/net/mlx5/hws/mlx5dr_action.c  |   6 +
>  drivers/net/mlx5/hws/mlx5dr_action.h  |   2 +
>  drivers/net/mlx5/hws/mlx5dr_matcher.c |  29 +
>  drivers/net/mlx5/mlx5.h               |  29 +-
>  drivers/net/mlx5/mlx5_flow.h          | 128 ++++-
>  drivers/net/mlx5/mlx5_flow_hw.c       | 794 ++++++++++++++++----------
>  7 files changed, 666 insertions(+), 338 deletions(-)
> 
> --
> 2.39.2

Series applied to next-net-mlx,
Kindest regards,
Raslan Darawsheh

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2024-03-03 12:16 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-28 17:00 [PATCH 00/11] net/mlx5: flow insertion performance improvements Dariusz Sosnowski
2024-02-28 17:00 ` [PATCH 01/11] net/mlx5: allocate local DR rule action buffers Dariusz Sosnowski
2024-02-28 17:00 ` [PATCH 02/11] net/mlx5/hws: add check for matcher rule update support Dariusz Sosnowski
2024-02-28 17:00 ` [PATCH 03/11] net/mlx5/hws: add check if matcher contains complex rules Dariusz Sosnowski
2024-02-28 17:00 ` [PATCH 04/11] net/mlx5: skip the unneeded resource index allocation Dariusz Sosnowski
2024-02-28 17:00 ` [PATCH 05/11] net/mlx5: remove action params from job Dariusz Sosnowski
2024-02-28 17:00 ` [PATCH 06/11] net/mlx5: remove flow pattern " Dariusz Sosnowski
2024-02-28 17:00 ` [PATCH 07/11] net/mlx5: remove updated flow " Dariusz Sosnowski
2024-02-28 17:00 ` [PATCH 08/11] net/mlx5: use flow as operation container Dariusz Sosnowski
2024-02-28 17:00 ` [PATCH 09/11] net/mlx5: move rarely used flow fields outside Dariusz Sosnowski
2024-02-28 17:00 ` [PATCH 10/11] net/mlx5: reuse flow fields Dariusz Sosnowski
2024-02-28 17:00 ` [PATCH 11/11] net/mlx5: remove unneeded device status checking Dariusz Sosnowski
2024-02-29  8:52 ` [PATCH 00/11] net/mlx5: flow insertion performance improvements Ori Kam
2024-02-29 11:51 ` [PATCH v2 " Dariusz Sosnowski
2024-02-29 11:51   ` [PATCH v2 01/11] net/mlx5: allocate local DR rule action buffers Dariusz Sosnowski
2024-02-29 11:51   ` [PATCH v2 02/11] net/mlx5/hws: add check for matcher rule update support Dariusz Sosnowski
2024-02-29 11:51   ` [PATCH v2 03/11] net/mlx5/hws: add check if matcher contains complex rules Dariusz Sosnowski
2024-02-29 11:51   ` [PATCH v2 04/11] net/mlx5: skip the unneeded resource index allocation Dariusz Sosnowski
2024-02-29 11:51   ` [PATCH v2 05/11] net/mlx5: remove action params from job Dariusz Sosnowski
2024-02-29 11:51   ` [PATCH v2 06/11] net/mlx5: remove flow pattern " Dariusz Sosnowski
2024-02-29 11:51   ` [PATCH v2 07/11] net/mlx5: remove updated flow " Dariusz Sosnowski
2024-02-29 11:51   ` [PATCH v2 08/11] net/mlx5: use flow as operation container Dariusz Sosnowski
2024-02-29 11:51   ` [PATCH v2 09/11] net/mlx5: move rarely used flow fields outside Dariusz Sosnowski
2024-02-29 11:51   ` [PATCH v2 10/11] net/mlx5: reuse flow fields Dariusz Sosnowski
2024-02-29 11:51   ` [PATCH v2 11/11] net/mlx5: remove unneeded device status checking Dariusz Sosnowski
2024-03-03 12:16   ` [PATCH v2 00/11] net/mlx5: flow insertion performance improvements Raslan Darawsheh
