DPDK patches and discussions
* [dpdk-dev] [PATCH] net/mlx5: support metadata as flow rule criteria
@ 2018-09-16 13:42 Dekel Peled
  2018-09-19  7:21 ` Xueming(Steven) Li
  2018-09-27 14:18 ` [dpdk-dev] [PATCH v2] " Dekel Peled
  0 siblings, 2 replies; 17+ messages in thread
From: Dekel Peled @ 2018-09-16 13:42 UTC (permalink / raw)
  To: dev, shahafs, yskoh; +Cc: orika

As described in the series starting at [1], this patch adds the option to set
a metadata value as a match pattern when creating a new flow rule.

This patch adds metadata support to the mlx5 driver, in several parts:
- Set the metadata value in the matcher when creating
a new flow rule.
- Copy the metadata value from the mbuf to the WQE, when
indicated by ol_flag, in the different burst functions.
- Allow flow rules with the egress attribute in specific cases.

This patch must be built together with the files in series [1].

[1] "ethdev: support metadata as flow rule criteria"

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c          |  60 +++++++++++++----
 drivers/net/mlx5/mlx5_flow.h          |  22 +++++--
 drivers/net/mlx5/mlx5_flow_dv.c       | 120 ++++++++++++++++++++++++++++++----
 drivers/net/mlx5/mlx5_flow_verbs.c    |  21 ++++--
 drivers/net/mlx5/mlx5_prm.h           |   2 +-
 drivers/net/mlx5/mlx5_rxtx.c          |  34 ++++++++--
 drivers/net/mlx5/mlx5_rxtx_vec.c      |  28 ++++++--
 drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |   4 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |   4 +-
 drivers/net/mlx5/mlx5_txq.c           |   6 ++
 11 files changed, 255 insertions(+), 47 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 4234be6..7932e0f 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -402,7 +402,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev,
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static int
+int
 mlx5_flow_item_acceptable(const struct rte_flow_item *item,
 			  const uint8_t *mask,
 			  const uint8_t *nic_mask,
@@ -602,7 +602,8 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev,
  *   0 on success, a negative errno value otherwise and rte_ernno is set.
  */
 int mlx5_flow_validate_action_flag(uint64_t action_flags,
-				   struct rte_flow_error *error)
+				const struct rte_flow_attr *attr,
+				struct rte_flow_error *error)
 {
 
 	if (action_flags & MLX5_ACTION_DROP)
@@ -624,6 +625,11 @@ int mlx5_flow_validate_action_flag(uint64_t action_flags,
 					  NULL,
 					  "can't have 2 flag"
 					  " actions in same flow");
+	if (attr->egress)
+		return rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
+					NULL,
+					"flag action not supported for egress");
 	return 0;
 }
 
@@ -642,8 +648,9 @@ int mlx5_flow_validate_action_flag(uint64_t action_flags,
  *   0 on success, a negative errno value otherwise and rte_ernno is set.
  */
 int mlx5_flow_validate_action_mark(uint64_t action_flags,
-				   const struct rte_flow_action *action,
-				   struct rte_flow_error *error)
+				const struct rte_flow_action *action,
+				const struct rte_flow_attr *attr,
+				struct rte_flow_error *error)
 {
 	const struct rte_flow_action_mark *mark = action->conf;
 
@@ -677,6 +684,11 @@ int mlx5_flow_validate_action_mark(uint64_t action_flags,
 					  NULL,
 					  "can't have 2 flag actions in same"
 					  " flow");
+	if (attr->egress)
+		return rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
+					NULL,
+					"mark action not supported for egress");
 	return 0;
 }
 
@@ -693,7 +705,8 @@ int mlx5_flow_validate_action_mark(uint64_t action_flags,
  *   0 on success, a negative errno value otherwise and rte_ernno is set.
  */
 int mlx5_flow_validate_action_drop(uint64_t action_flags,
-				   struct rte_flow_error *error)
+				const struct rte_flow_attr *attr,
+				struct rte_flow_error *error)
 {
 	if (action_flags & MLX5_ACTION_FLAG)
 		return rte_flow_error_set(error,
@@ -715,6 +728,11 @@ int mlx5_flow_validate_action_drop(uint64_t action_flags,
 					  NULL,
 					  "can't have 2 fate actions in"
 					  " same flow");
+	if (attr->egress)
+		return rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
+					NULL,
+					"drop action not supported for egress");
 	return 0;
 }
 
@@ -735,9 +753,10 @@ int mlx5_flow_validate_action_drop(uint64_t action_flags,
  *   0 on success, a negative errno value otherwise and rte_ernno is set.
  */
 int mlx5_flow_validate_action_queue(uint64_t action_flags,
-				    struct rte_eth_dev *dev,
-				    const struct rte_flow_action *action,
-				    struct rte_flow_error *error)
+				struct rte_eth_dev *dev,
+				const struct rte_flow_action *action,
+				const struct rte_flow_attr *attr,
+				struct rte_flow_error *error)
 {
 	struct priv *priv = dev->data->dev_private;
 	const struct rte_flow_action_queue *queue = action->conf;
@@ -760,6 +779,11 @@ int mlx5_flow_validate_action_queue(uint64_t action_flags,
 					  RTE_FLOW_ERROR_TYPE_ACTION_CONF,
 					  &queue->index,
 					  "queue is not configured");
+	if (attr->egress)
+		return rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
+					NULL,
+					"queue action not supported for egress");
 	return 0;
 }
 
@@ -780,9 +804,10 @@ int mlx5_flow_validate_action_queue(uint64_t action_flags,
  *   0 on success, a negative errno value otherwise and rte_ernno is set.
  */
 int mlx5_flow_validate_action_rss(uint64_t action_flags,
-				  struct rte_eth_dev *dev,
-				  const struct rte_flow_action *action,
-				  struct rte_flow_error *error)
+				struct rte_eth_dev *dev,
+				const struct rte_flow_action *action,
+				const struct rte_flow_attr *attr,
+				struct rte_flow_error *error)
 {
 	struct priv *priv = dev->data->dev_private;
 	const struct rte_flow_action_rss *rss = action->conf;
@@ -839,6 +864,11 @@ int mlx5_flow_validate_action_rss(uint64_t action_flags,
 						  &rss->queue[i],
 						  "queue is not configured");
 	}
+	if (attr->egress)
+		return rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
+					NULL,
+					"rss action not supported for egress");
 	return 0;
 }
 
@@ -855,7 +885,8 @@ int mlx5_flow_validate_action_rss(uint64_t action_flags,
  *   0 on success, a negative errno value otherwise and rte_ernno is set.
  */
 int mlx5_flow_validate_action_count(struct rte_eth_dev *dev,
-				    struct rte_flow_error *error)
+				const struct rte_flow_attr *attr,
+				struct rte_flow_error *error)
 {
 	struct priv *priv = dev->data->dev_private;
 
@@ -864,6 +895,11 @@ int mlx5_flow_validate_action_count(struct rte_eth_dev *dev,
 					  RTE_FLOW_ERROR_TYPE_ACTION,
 					  NULL,
 					  "flow counters are not supported.");
+	if (attr->egress)
+		return rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
+					NULL,
+					"count action not supported for egress");
 	return 0;
 }
 
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index f1a72d4..e0446c5 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -43,6 +43,9 @@
 #define MLX5_FLOW_LAYER_GRE (1u << 14)
 #define MLX5_FLOW_LAYER_MPLS (1u << 15)
 
+/* General pattern items bits. */
+#define MLX5_FLOW_ITEM_METADATA (1u << 16)
+
 /* Outer Masks. */
 #define MLX5_FLOW_LAYER_OUTER_L3 \
 	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
@@ -237,21 +240,27 @@ struct mlx5_flow_driver_ops {
 
 /* mlx5_flow.c */
 int mlx5_flow_validate_action_flag(uint64_t action_flags,
-				   struct rte_flow_error *error);
+				    const struct rte_flow_attr *attr,
+				    struct rte_flow_error *error);
 int mlx5_flow_validate_action_mark(uint64_t action_flags,
-				   const struct rte_flow_action *action,
-				   struct rte_flow_error *error);
+				    const struct rte_flow_action *action,
+				    const struct rte_flow_attr *attr,
+				    struct rte_flow_error *error);
 int mlx5_flow_validate_action_drop(uint64_t action_flags,
-				   struct rte_flow_error *error);
+				    const struct rte_flow_attr *attr,
+				    struct rte_flow_error *error);
 int mlx5_flow_validate_action_queue(uint64_t action_flags,
 				    struct rte_eth_dev *dev,
 				    const struct rte_flow_action *action,
+				    const struct rte_flow_attr *attr,
 				    struct rte_flow_error *error);
 int mlx5_flow_validate_action_rss(uint64_t action_flags,
 				  struct rte_eth_dev *dev,
 				  const struct rte_flow_action *action,
+				  const struct rte_flow_attr *attr,
 				  struct rte_flow_error *error);
 int mlx5_flow_validate_action_count(struct rte_eth_dev *dev,
+				    const struct rte_flow_attr *attr,
 				    struct rte_flow_error *error);
 int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
 				  const struct rte_flow_attr *attributes,
@@ -294,6 +303,11 @@ int mlx5_flow_validate_item_mpls(uint64_t item_flags,
 void mlx5_flow_init_driver_ops(struct rte_eth_dev *dev);
 uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 				   uint32_t subpriority);
+int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
+			      const uint8_t *mask,
+			      const uint8_t *nic_mask,
+			      unsigned int size,
+			      struct rte_flow_error *error);
 
 /* mlx5_flow_dv.c */
 void mlx5_flow_dv_get_driver_ops(struct mlx5_flow_driver_ops *flow_ops);
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 4090c5f..3b52181 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -37,6 +37,47 @@
 #ifdef HAVE_IBV_FLOW_DV_SUPPORT
 
 /**
+ * Validate META item.
+ *
+ * @param[in] item
+ *   Item specification.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_validate_item_meta(const struct rte_flow_item *item,
+			const struct rte_flow_attr *attributes,
+			struct rte_flow_error *error)
+{
+	const struct rte_flow_item_meta *mask = item->mask;
+
+	const struct rte_flow_item_meta nic_mask = {
+		.data = RTE_BE32(UINT32_MAX)
+	};
+
+	int ret;
+
+	if (!mask)
+		mask = &rte_flow_item_meta_mask;
+	ret = mlx5_flow_item_acceptable
+		(item, (const uint8_t *)mask,
+		(const uint8_t *)&nic_mask,
+		sizeof(struct rte_flow_item_meta), error);
+	if (ret < 0)
+		return ret;
+
+	if (attributes->ingress)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
+					  NULL,
+					  "pattern not supported for ingress");
+	return 0;
+}
+
+/**
  * Verify the @p attributes will be correctly understood by the NIC and store
  * them in the @p flow if everything is correct.
  *
@@ -68,21 +109,17 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
 					  RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
 					  NULL,
 					  "priority out of range");
-	if (attributes->egress)
-		return rte_flow_error_set(error, ENOTSUP,
-					  RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
-					  NULL,
-					  "egress is not supported");
 	if (attributes->transfer)
 		return rte_flow_error_set(error, ENOTSUP,
 					  RTE_FLOW_ERROR_TYPE_ATTR_TRANSFER,
 					  NULL,
 					  "transfer is not supported");
-	if (!attributes->ingress)
+	if (!(attributes->egress ^ attributes->ingress))
 		return rte_flow_error_set(error, ENOTSUP,
-					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
+					  RTE_FLOW_ERROR_TYPE_ATTR,
 					  NULL,
-					  "ingress attribute is mandatory");
+					  "must specify exactly one of "
+					  "ingress or egress");
 	return 0;
 }
 
@@ -219,6 +256,12 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
 				return ret;
 			item_flags |= MLX5_FLOW_LAYER_MPLS;
 			break;
+		case RTE_FLOW_ITEM_TYPE_META:
+			ret = mlx5_flow_validate_item_meta(items, attr, error);
+			if (ret < 0)
+				return ret;
+			item_flags |= MLX5_FLOW_ITEM_METADATA;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ITEM,
@@ -233,6 +276,7 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
 			break;
 		case RTE_FLOW_ACTION_TYPE_FLAG:
 			ret = mlx5_flow_validate_action_flag(action_flags,
+							     attr,
 							     error);
 			if (ret < 0)
 				return ret;
@@ -241,6 +285,7 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
 		case RTE_FLOW_ACTION_TYPE_MARK:
 			ret = mlx5_flow_validate_action_mark(action_flags,
 							     actions,
+							     attr,
 							     error);
 			if (ret < 0)
 				return ret;
@@ -248,27 +293,36 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
 			break;
 		case RTE_FLOW_ACTION_TYPE_DROP:
 			ret = mlx5_flow_validate_action_drop(action_flags,
+							     attr,
 							     error);
 			if (ret < 0)
 				return ret;
 			action_flags |= MLX5_ACTION_DROP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_QUEUE:
-			ret = mlx5_flow_validate_action_queue(action_flags, dev,
-							      actions, error);
+			ret = mlx5_flow_validate_action_queue(action_flags,
+							      dev,
+							      actions,
+							      attr,
+							      error);
 			if (ret < 0)
 				return ret;
 			action_flags |= MLX5_ACTION_QUEUE;
 			break;
 		case RTE_FLOW_ACTION_TYPE_RSS:
-			ret = mlx5_flow_validate_action_rss(action_flags, dev,
-							    actions, error);
+			ret = mlx5_flow_validate_action_rss(action_flags,
+							    dev,
+							    actions,
+							    attr,
+							    error);
 			if (ret < 0)
 				return ret;
 			action_flags |= MLX5_ACTION_RSS;
 			break;
 		case RTE_FLOW_ACTION_TYPE_COUNT:
-			ret = mlx5_flow_validate_action_count(dev, error);
+			ret = mlx5_flow_validate_action_count(dev,
+							      attr,
+							      error);
 			if (ret < 0)
 				return ret;
 			action_flags |= MLX5_ACTION_COUNT;
@@ -865,6 +919,43 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
 }
 
 /**
+ * Add META item to matcher
+ *
+ * @param[in, out] matcher
+ *   Flow matcher.
+ * @param[in, out] key
+ *   Flow matcher value.
+ * @param[in] item
+ *   Flow pattern to translate.
+ */
+static void
+flow_dv_translate_item_meta(void *matcher, void *key,
+				const struct rte_flow_item *item)
+{
+	const struct rte_flow_item_meta *metam;
+	const struct rte_flow_item_meta *metav;
+
+	void *misc2_m =
+		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
+	void *misc2_v =
+		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
+
+	metam = (const void *)item->mask;
+	if (!metam)
+		metam = &rte_flow_item_meta_mask;
+
+	metav = (const void *)item->spec;
+	if (metav) {
+		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
+			RTE_BE32((uint32_t)rte_be_to_cpu_64(metam->data)));
+		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
+			RTE_BE32((uint32_t)rte_be_to_cpu_64(metav->data)));
+	}
+}
+
+/**
  *
  * Translate flow item.
  *
@@ -946,6 +1037,9 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
 	case RTE_FLOW_ITEM_TYPE_VXLAN_GPE:
 		flow_dv_translate_item_vxlan(tmatcher->key, key, item, inner);
 		break;
+	case RTE_FLOW_ITEM_TYPE_META:
+		flow_dv_translate_item_meta(tmatcher->key, key, item);
+		break;
 	case RTE_FLOW_ITEM_TYPE_ICMP:
 	case RTE_FLOW_ITEM_TYPE_MPLS:
 	case RTE_FLOW_ITEM_TYPE_SCTP:
diff --git a/drivers/net/mlx5/mlx5_flow_verbs.c b/drivers/net/mlx5/mlx5_flow_verbs.c
index 48e816d..3c42016 100644
--- a/drivers/net/mlx5/mlx5_flow_verbs.c
+++ b/drivers/net/mlx5/mlx5_flow_verbs.c
@@ -1272,6 +1272,7 @@ struct ibv_spec_header {
 			break;
 		case RTE_FLOW_ACTION_TYPE_FLAG:
 			ret = mlx5_flow_validate_action_flag(action_flags,
+							     attr,
 							     error);
 			if (ret < 0)
 				return ret;
@@ -1280,6 +1281,7 @@ struct ibv_spec_header {
 		case RTE_FLOW_ACTION_TYPE_MARK:
 			ret = mlx5_flow_validate_action_mark(action_flags,
 							     actions,
+							     attr,
 							     error);
 			if (ret < 0)
 				return ret;
@@ -1287,27 +1289,36 @@ struct ibv_spec_header {
 			break;
 		case RTE_FLOW_ACTION_TYPE_DROP:
 			ret = mlx5_flow_validate_action_drop(action_flags,
+							     attr,
 							     error);
 			if (ret < 0)
 				return ret;
 			action_flags |= MLX5_ACTION_DROP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_QUEUE:
-			ret = mlx5_flow_validate_action_queue(action_flags, dev,
-							      actions, error);
+			ret = mlx5_flow_validate_action_queue(action_flags,
+							      dev,
+							      actions,
+							      attr,
+							      error);
 			if (ret < 0)
 				return ret;
 			action_flags |= MLX5_ACTION_QUEUE;
 			break;
 		case RTE_FLOW_ACTION_TYPE_RSS:
-			ret = mlx5_flow_validate_action_rss(action_flags, dev,
-							    actions, error);
+			ret = mlx5_flow_validate_action_rss(action_flags,
+							    dev,
+							    actions,
+							    attr,
+							    error);
 			if (ret < 0)
 				return ret;
 			action_flags |= MLX5_ACTION_RSS;
 			break;
 		case RTE_FLOW_ACTION_TYPE_COUNT:
-			ret = mlx5_flow_validate_action_count(dev, error);
+			ret = mlx5_flow_validate_action_count(dev,
+							      attr,
+							      error);
 			if (ret < 0)
 				return ret;
 			action_flags |= MLX5_ACTION_COUNT;
diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
index 117cec7..2f33aef 100644
--- a/drivers/net/mlx5/mlx5_prm.h
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
 	uint8_t	cs_flags;
 	uint8_t	rsvd1;
 	uint16_t mss;
-	uint32_t rsvd2;
+	uint32_t flow_table_metadata;
 	uint16_t inline_hdr_sz;
 	uint8_t inline_hdr[2];
 } __rte_aligned(MLX5_WQE_DWORD_SIZE);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 2d14f8a..080de57 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -523,6 +523,7 @@
 		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
 		uint32_t swp_offsets = 0;
 		uint8_t swp_types = 0;
+		uint32_t metadata = 0;
 		uint16_t tso_segsz = 0;
 #ifdef MLX5_PMD_SOFT_COUNTERS
 		uint32_t total_length = 0;
@@ -566,6 +567,9 @@
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
+		/* Copy metadata from mbuf if valid */
+		if (buf->ol_flags & PKT_TX_METADATA)
+			metadata = buf->hash.fdir.hi;
 		/* Replace the Ethernet type by the VLAN if necessary. */
 		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
@@ -781,7 +785,7 @@
 				swp_offsets,
 				cs_flags | (swp_types << 8) |
 				(rte_cpu_to_be_16(tso_segsz) << 16),
-				0,
+				rte_cpu_to_be_32(metadata),
 				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
 			};
 		} else {
@@ -795,7 +799,7 @@
 			wqe->eseg = (rte_v128u32_t){
 				swp_offsets,
 				cs_flags | (swp_types << 8),
-				0,
+				rte_cpu_to_be_32(metadata),
 				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
 			};
 		}
@@ -861,7 +865,7 @@
 	mpw->wqe->eseg.inline_hdr_sz = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
 					     (txq->wqe_ci << 8) |
 					     MLX5_OPCODE_TSO);
@@ -971,6 +975,8 @@
 		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
 		    ((mpw.len != length) ||
 		     (segs_n != 1) ||
+		     (mpw.wqe->eseg.flow_table_metadata !=
+				rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
 		     (mpw.wqe->eseg.cs_flags != cs_flags)))
 			mlx5_mpw_close(txq, &mpw);
 		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
@@ -984,6 +990,8 @@
 			max_wqe -= 2;
 			mlx5_mpw_new(txq, &mpw, length);
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata =
+				rte_cpu_to_be_32(buf->hash.fdir.hi);
 		}
 		/* Multi-segment packets must be alone in their MPW. */
 		assert((segs_n == 1) || (mpw.pkts_n == 0));
@@ -1082,7 +1090,7 @@
 	mpw->wqe->eseg.cs_flags = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	inl = (struct mlx5_wqe_inl_small *)
 		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
 	mpw->data.raw = (uint8_t *)&inl->raw;
@@ -1199,12 +1207,16 @@
 		if (mpw.state == MLX5_MPW_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
+			    (mpw.wqe->eseg.flow_table_metadata !=
+					rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				mlx5_mpw_close(txq, &mpw);
 		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
 			    (length > inline_room) ||
+			    (mpw.wqe->eseg.flow_table_metadata !=
+					rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
 				mlx5_mpw_inline_close(txq, &mpw);
 				inline_room =
@@ -1224,12 +1236,20 @@
 				max_wqe -= 2;
 				mlx5_mpw_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				/* Copy metadata from mbuf if valid */
+				if (buf->ol_flags & PKT_TX_METADATA)
+					mpw.wqe->eseg.flow_table_metadata =
+					rte_cpu_to_be_32(buf->hash.fdir.hi);
 			} else {
 				if (unlikely(max_wqe < wqe_inl_n))
 					break;
 				max_wqe -= wqe_inl_n;
 				mlx5_mpw_inline_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				/* Copy metadata from mbuf if valid */
+				if (buf->ol_flags & PKT_TX_METADATA)
+					mpw.wqe->eseg.flow_table_metadata =
+					rte_cpu_to_be_32(buf->hash.fdir.hi);
 			}
 		}
 		/* Multi-segment packets must be alone in their MPW. */
@@ -1482,6 +1502,8 @@
 			    (length <= txq->inline_max_packet_sz &&
 			     inl_pad + sizeof(inl_hdr) + length >
 			     mpw_room) ||
+			    (mpw.wqe->eseg.flow_table_metadata !=
+					rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				max_wqe -= mlx5_empw_close(txq, &mpw);
 		}
@@ -1505,6 +1527,10 @@
 				    sizeof(inl_hdr) + length <= mpw_room &&
 				    !txq->mpw_hdr_dseg;
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			/* Copy metadata from mbuf if valid */
+			if (buf->ol_flags & PKT_TX_METADATA)
+				mpw.wqe->eseg.flow_table_metadata =
+					rte_cpu_to_be_32(buf->hash.fdir.hi);
 		} else {
 			/* Evaluate whether the next packet can be inlined.
 			 * Inlininig is possible when:
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 0a4aed8..db36699 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -48,25 +48,39 @@
  *   Number of packets.
  * @param cs_flags
  *   Pointer of flags to be returned.
+ * @param txq_offloads
+ *   Offloads enabled on Tx queue
  *
  * @return
  *   Number of packets having same ol_flags.
+ *   If PKT_TX_METADATA is set in ol_flags, packets must have same metadata
+ *   as well.
  */
 static inline unsigned int
-txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
+txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n,
+		uint8_t *cs_flags, const uint64_t txq_offloads)
 {
 	unsigned int pos;
 	const uint64_t ol_mask =
 		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
 		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
-		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
+		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM | PKT_TX_METADATA;
 
 	if (!pkts_n)
 		return 0;
 	/* Count the number of packets having same ol_flags. */
-	for (pos = 1; pos < pkts_n; ++pos)
-		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
+	for (pos = 1; pos < pkts_n; ++pos) {
+		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
+			((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask))
 			break;
+		/* If the metadata ol_flag is set,
+		 * metadata must be same in all packets.
+		 */
+		if ((txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) &&
+			(pkts[pos]->ol_flags & PKT_TX_METADATA) &&
+			pkts[0]->hash.fdir.hi != pkts[pos]->hash.fdir.hi)
+			break;
+	}
 	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
 	return pos;
 }
@@ -137,8 +151,10 @@
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
 		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
 			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
-		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
-			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
+		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
+					DEV_TX_OFFLOAD_MATCH_METADATA))
+			n = txq_calc_offload(&pkts[nb_tx], n,
+					     &cs_flags, txq->offloads);
 		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
 		nb_tx += ret;
 		if (!ret)
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fb884f9..fda7004 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -22,6 +22,7 @@
 /* HW offload capabilities of vectorized Tx. */
 #define MLX5_VEC_TX_OFFLOAD_CAP \
 	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
+	 DEV_TX_OFFLOAD_MATCH_METADATA | \
 	 DEV_TX_OFFLOAD_MULTI_SEGS)
 
 /*
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index b37b738..20c9427 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -237,6 +237,7 @@
 	uint8x16_t *t_wqe;
 	uint8_t *dseg;
 	uint8x16_t ctrl;
+	uint32_t md; /* metadata */
 
 	/* Make sure all packets can fit into a single WQE. */
 	assert(elts_n > pkts_n);
@@ -293,10 +294,11 @@
 	ctrl = vqtbl1q_u8(ctrl, ctrl_shuf_m);
 	vst1q_u8((void *)t_wqe, ctrl);
 	/* Fill ESEG in the header. */
+	md = pkts[0]->hash.fdir.hi;
 	vst1q_u8((void *)(t_wqe + 1),
 		 ((uint8x16_t) { 0, 0, 0, 0,
 				 cs_flags, 0, 0, 0,
-				 0, 0, 0, 0,
+				 md >> 24, md >> 16, md >> 8, md,
 				 0, 0, 0, 0 }));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 54b3783..7c8535c 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -236,6 +236,7 @@
 			      0,  1,  2,  3  /* bswap32 */);
 	__m128i *t_wqe, *dseg;
 	__m128i ctrl;
+	uint32_t md; /* metadata */
 
 	/* Make sure all packets can fit into a single WQE. */
 	assert(elts_n > pkts_n);
@@ -292,9 +293,10 @@
 	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
 	_mm_store_si128(t_wqe, ctrl);
 	/* Fill ESEG in the header. */
+	md = pkts[0]->hash.fdir.hi;
 	_mm_store_si128(t_wqe + 1,
 			_mm_set_epi8(0, 0, 0, 0,
-				     0, 0, 0, 0,
+				     md, md >> 8, md >> 16, md >> 24,
 				     0, 0, 0, cs_flags,
 				     0, 0, 0, 0));
 #ifdef MLX5_PMD_SOFT_COUNTERS
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index f9bc473..7263fb1 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -128,6 +128,12 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
+
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	if (config->dv_flow_en)
+		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
+#endif
+
 	return offloads;
 }
 
-- 
1.8.3.1


* Re: [dpdk-dev] [PATCH] net/mlx5: support metadata as flow rule criteria
  2018-09-16 13:42 [dpdk-dev] [PATCH] net/mlx5: support metadata as flow rule criteria Dekel Peled
@ 2018-09-19  7:21 ` Xueming(Steven) Li
  2018-09-27 14:18 ` [dpdk-dev] [PATCH v2] " Dekel Peled
  1 sibling, 0 replies; 17+ messages in thread
From: Xueming(Steven) Li @ 2018-09-19  7:21 UTC (permalink / raw)
  To: Dekel Peled, dev, Shahaf Shuler, Yongseok Koh; +Cc: Ori Kam



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Dekel Peled
> Sent: Sunday, September 16, 2018 9:42 PM
> To: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> Cc: Ori Kam <orika@mellanox.com>
> Subject: [dpdk-dev] [PATCH] net/mlx5: support metadata as flow rule criteria
> 
> As described in series starting at [1], it adds option to set metadata value as match pattern when
> creating a new flow rule.
> 
> This patch adds metadata support in mlx5 driver, in several parts:
> - Add the setting of metadata value in matcher when creating a new flow rule.
> - Add the passing of metadata value from mbuf to wqe when indicated by ol_flag, in different burst
> functions.
> - Allow flow rule with attribute egress in specific cases.
> 
> This patch must be built together with the files in series [1].
> 
> [1] "ethdev: support metadata as flow rule criteria"
> 
> Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c          |  60 +++++++++++++----
>  drivers/net/mlx5/mlx5_flow.h          |  22 +++++--
>  drivers/net/mlx5/mlx5_flow_dv.c       | 120 ++++++++++++++++++++++++++++++----
>  drivers/net/mlx5/mlx5_flow_verbs.c    |  21 ++++--
>  drivers/net/mlx5/mlx5_prm.h           |   2 +-
>  drivers/net/mlx5/mlx5_rxtx.c          |  34 ++++++++--
>  drivers/net/mlx5/mlx5_rxtx_vec.c      |  28 ++++++--
>  drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |   4 +-
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |   4 +-
>  drivers/net/mlx5/mlx5_txq.c           |   6 ++
>  11 files changed, 255 insertions(+), 47 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index 4234be6..7932e0f
> 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -402,7 +402,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev,
>   * @return
>   *   0 on success, a negative errno value otherwise and rte_errno is set.
>   */
> -static int
> +int
>  mlx5_flow_item_acceptable(const struct rte_flow_item *item,
>  			  const uint8_t *mask,
>  			  const uint8_t *nic_mask,
> @@ -602,7 +602,8 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev,
>   *   0 on success, a negative errno value otherwise and rte_ernno is set.
>   */
>  int mlx5_flow_validate_action_flag(uint64_t action_flags,
> -				   struct rte_flow_error *error)
> +				const struct rte_flow_attr *attr,
> +				struct rte_flow_error *error)
>  {
> 
>  	if (action_flags & MLX5_ACTION_DROP)
> @@ -624,6 +625,11 @@ int mlx5_flow_validate_action_flag(uint64_t action_flags,
>  					  NULL,
>  					  "can't have 2 flag"
>  					  " actions in same flow");
> +	if (attr->egress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
> +					NULL,
> +					"flag action not supported for egress");
>  	return 0;
>  }
> 
> @@ -642,8 +648,9 @@ int mlx5_flow_validate_action_flag(uint64_t action_flags,
>   *   0 on success, a negative errno value otherwise and rte_ernno is set.
>   */
>  int mlx5_flow_validate_action_mark(uint64_t action_flags,
> -				   const struct rte_flow_action *action,
> -				   struct rte_flow_error *error)
> +				const struct rte_flow_action *action,
> +				const struct rte_flow_attr *attr,
> +				struct rte_flow_error *error)
>  {
>  	const struct rte_flow_action_mark *mark = action->conf;
> 
> @@ -677,6 +684,11 @@ int mlx5_flow_validate_action_mark(uint64_t action_flags,
>  					  NULL,
>  					  "can't have 2 flag actions in same"
>  					  " flow");
> +	if (attr->egress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
> +					NULL,
> +					"mark action not supported for egress");
>  	return 0;
>  }
> 
> @@ -693,7 +705,8 @@ int mlx5_flow_validate_action_mark(uint64_t action_flags,
>   *   0 on success, a negative errno value otherwise and rte_ernno is set.
>   */
>  int mlx5_flow_validate_action_drop(uint64_t action_flags,
> -				   struct rte_flow_error *error)
> +				const struct rte_flow_attr *attr,
> +				struct rte_flow_error *error)
>  {
>  	if (action_flags & MLX5_ACTION_FLAG)
>  		return rte_flow_error_set(error,
> @@ -715,6 +728,11 @@ int mlx5_flow_validate_action_drop(uint64_t action_flags,
>  					  NULL,
>  					  "can't have 2 fate actions in"
>  					  " same flow");
> +	if (attr->egress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
> +					NULL,
> +					"drop action not supported for egress");
>  	return 0;
>  }
> 
> @@ -735,9 +753,10 @@ int mlx5_flow_validate_action_drop(uint64_t action_flags,
>   *   0 on success, a negative errno value otherwise and rte_ernno is set.
>   */
>  int mlx5_flow_validate_action_queue(uint64_t action_flags,
> -				    struct rte_eth_dev *dev,
> -				    const struct rte_flow_action *action,
> -				    struct rte_flow_error *error)
> +				struct rte_eth_dev *dev,
> +				const struct rte_flow_action *action,
> +				const struct rte_flow_attr *attr,
> +				struct rte_flow_error *error)
>  {
>  	struct priv *priv = dev->data->dev_private;
>  	const struct rte_flow_action_queue *queue = action->conf;
> @@ -760,6 +779,11 @@ int mlx5_flow_validate_action_queue(uint64_t action_flags,
>  					  RTE_FLOW_ERROR_TYPE_ACTION_CONF,
>  					  &queue->index,
>  					  "queue is not configured");
> +	if (attr->egress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
> +					NULL,
> +					"queue action not supported for egress");
>  	return 0;
>  }
> 
> @@ -780,9 +804,10 @@ int mlx5_flow_validate_action_queue(uint64_t action_flags,
>   *   0 on success, a negative errno value otherwise and rte_ernno is set.
>   */
>  int mlx5_flow_validate_action_rss(uint64_t action_flags,
> -				  struct rte_eth_dev *dev,
> -				  const struct rte_flow_action *action,
> -				  struct rte_flow_error *error)
> +				struct rte_eth_dev *dev,
> +				const struct rte_flow_action *action,
> +				const struct rte_flow_attr *attr,
> +				struct rte_flow_error *error)
>  {
>  	struct priv *priv = dev->data->dev_private;
>  	const struct rte_flow_action_rss *rss = action->conf;
> @@ -839,6 +864,11 @@ int mlx5_flow_validate_action_rss(uint64_t action_flags,
>  						  &rss->queue[i],
>  						  "queue is not configured");
>  	}
> +	if (attr->egress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
> +					NULL,
> +					"rss action not supported for egress");
>  	return 0;
>  }
> 
> @@ -855,7 +885,8 @@ int mlx5_flow_validate_action_rss(uint64_t action_flags,
>   *   0 on success, a negative errno value otherwise and rte_ernno is set.
>   */
>  int mlx5_flow_validate_action_count(struct rte_eth_dev *dev,
> -				    struct rte_flow_error *error)
> +				const struct rte_flow_attr *attr,
> +				struct rte_flow_error *error)
>  {
>  	struct priv *priv = dev->data->dev_private;
> 
> @@ -864,6 +895,11 @@ int mlx5_flow_validate_action_count(struct rte_eth_dev *dev,
>  					  RTE_FLOW_ERROR_TYPE_ACTION,
>  					  NULL,
>  					  "flow counters are not supported.");
> +	if (attr->egress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
> +					NULL,
> +					"count action not supported for egress");
>  	return 0;
>  }
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index f1a72d4..e0446c5 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -43,6 +43,9 @@
>  #define MLX5_FLOW_LAYER_GRE (1u << 14)
>  #define MLX5_FLOW_LAYER_MPLS (1u << 15)
> 
> +/* General pattern items bits. */
> +#define MLX5_FLOW_ITEM_METADATA (1u << 16)
> +
>  /* Outer Masks. */
>  #define MLX5_FLOW_LAYER_OUTER_L3 \
>  	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
> @@ -237,21 +240,27 @@ struct mlx5_flow_driver_ops {
> 
>  /* mlx5_flow.c */
>  int mlx5_flow_validate_action_flag(uint64_t action_flags,
> -				   struct rte_flow_error *error);
> +				    const struct rte_flow_attr *attr,
> +				    struct rte_flow_error *error);
>  int mlx5_flow_validate_action_mark(uint64_t action_flags,
> -				   const struct rte_flow_action *action,
> -				   struct rte_flow_error *error);
> +				    const struct rte_flow_action *action,
> +				    const struct rte_flow_attr *attr,
> +				    struct rte_flow_error *error);
>  int mlx5_flow_validate_action_drop(uint64_t action_flags,
> -				   struct rte_flow_error *error);
> +				    const struct rte_flow_attr *attr,
> +				    struct rte_flow_error *error);
>  int mlx5_flow_validate_action_queue(uint64_t action_flags,
>  				    struct rte_eth_dev *dev,
>  				    const struct rte_flow_action *action,
> +				    const struct rte_flow_attr *attr,
>  				    struct rte_flow_error *error);
>  int mlx5_flow_validate_action_rss(uint64_t action_flags,
>  				  struct rte_eth_dev *dev,
>  				  const struct rte_flow_action *action,
> +				  const struct rte_flow_attr *attr,
>  				  struct rte_flow_error *error);
>  int mlx5_flow_validate_action_count(struct rte_eth_dev *dev,
> +				    const struct rte_flow_attr *attr,
>  				    struct rte_flow_error *error);
>  int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
>  				  const struct rte_flow_attr *attributes,
> @@ -294,6 +303,11 @@ int mlx5_flow_validate_item_mpls(uint64_t item_flags,
>  void mlx5_flow_init_driver_ops(struct rte_eth_dev *dev);
>  uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
>  				   uint32_t subpriority);
> +int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
> +			      const uint8_t *mask,
> +			      const uint8_t *nic_mask,
> +			      unsigned int size,
> +			      struct rte_flow_error *error);


The egress attribute checks added above seem only loosely related to metadata. How about splitting this patch in two:
1. egress support
2. metadata support

> 
>  /* mlx5_flow_dv.c */
>  void mlx5_flow_dv_get_driver_ops(struct mlx5_flow_driver_ops *flow_ops);
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> index 4090c5f..3b52181 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -37,6 +37,47 @@
>  #ifdef HAVE_IBV_FLOW_DV_SUPPORT
> 
>  /**
> + * Validate META item.
> + *
> + * @param[in] item
> + *   Item specification.
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mlx5_flow_validate_item_meta(const struct rte_flow_item *item,
> +			const struct rte_flow_attr *attributes,
> +			struct rte_flow_error *error)
> +{
> +	const struct rte_flow_item_meta *mask = item->mask;
> +
> +	const struct rte_flow_item_meta nic_mask = {
> +		.data = RTE_BE32(UINT32_MAX)
> +	};
> +
> +	int ret;
> +
> +	if (!mask)
> +		mask = &rte_flow_item_meta_mask;
> +	ret = mlx5_flow_item_acceptable
> +		(item, (const uint8_t *)mask,
> +		(const uint8_t *)&nic_mask,
> +		sizeof(struct rte_flow_item_meta), error);

Please follow line wrapping format.

> +	if (ret < 0)
> +		return ret;
> +
> +	if (attributes->ingress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> +					  NULL,
> +					  "pattern not supported for ingress");
> +	return 0;

What if spec is not specified?

> +}
> +
> +/**
>   * Verify the @p attributes will be correctly understood by the NIC and store
>   * them in the @p flow if everything is correct.
>   *
> @@ -68,21 +109,17 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
>  					  RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
>  					  NULL,
>  					  "priority out of range");
> -	if (attributes->egress)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
> -					  NULL,
> -					  "egress is not supported");
>  	if (attributes->transfer)
>  		return rte_flow_error_set(error, ENOTSUP,
>  					  RTE_FLOW_ERROR_TYPE_ATTR_TRANSFER,
>  					  NULL,
>  					  "transfer is not supported");
> -	if (!attributes->ingress)
> +	if (!(attributes->egress ^ attributes->ingress))
>  		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> +					  RTE_FLOW_ERROR_TYPE_ATTR,
>  					  NULL,
> -					  "ingress attribute is mandatory");
> +					  "must specify exactly one of "
> +					  "ingress or egress");
>  	return 0;
>  }
> 
> @@ -219,6 +256,12 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
>  				return ret;
>  			item_flags |= MLX5_FLOW_LAYER_MPLS;
>  			break;
> +		case RTE_FLOW_ITEM_TYPE_META:
> +			ret = mlx5_flow_validate_item_meta(items, attr, error);
> +			if (ret < 0)
> +				return ret;
> +			item_flags |= MLX5_FLOW_ITEM_METADATA;
> +			break;
>  		default:
>  			return rte_flow_error_set(error, ENOTSUP,
>  						  RTE_FLOW_ERROR_TYPE_ITEM,
> @@ -233,6 +276,7 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
>  			break;
>  		case RTE_FLOW_ACTION_TYPE_FLAG:
>  			ret = mlx5_flow_validate_action_flag(action_flags,
> +							     attr,
>  							     error);
>  			if (ret < 0)
>  				return ret;
> @@ -241,6 +285,7 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
>  		case RTE_FLOW_ACTION_TYPE_MARK:
>  			ret = mlx5_flow_validate_action_mark(action_flags,
>  							     actions,
> +							     attr,
>  							     error);
>  			if (ret < 0)
>  				return ret;
> @@ -248,27 +293,36 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
>  			break;
>  		case RTE_FLOW_ACTION_TYPE_DROP:
>  			ret = mlx5_flow_validate_action_drop(action_flags,
> +							     attr,
>  							     error);
>  			if (ret < 0)
>  				return ret;
>  			action_flags |= MLX5_ACTION_DROP;
>  			break;
>  		case RTE_FLOW_ACTION_TYPE_QUEUE:
> -			ret = mlx5_flow_validate_action_queue(action_flags, dev,
> -							      actions, error);
> +			ret = mlx5_flow_validate_action_queue(action_flags,
> +							      dev,
> +							      actions,
> +							      attr,
> +							      error);
>  			if (ret < 0)
>  				return ret;
>  			action_flags |= MLX5_ACTION_QUEUE;
>  			break;
>  		case RTE_FLOW_ACTION_TYPE_RSS:
> -			ret = mlx5_flow_validate_action_rss(action_flags, dev,
> -							    actions, error);
> +			ret = mlx5_flow_validate_action_rss(action_flags,
> +							    dev,
> +							    actions,
> +							    attr,
> +							    error);
>  			if (ret < 0)
>  				return ret;
>  			action_flags |= MLX5_ACTION_RSS;
>  			break;
>  		case RTE_FLOW_ACTION_TYPE_COUNT:
> -			ret = mlx5_flow_validate_action_count(dev, error);
> +			ret = mlx5_flow_validate_action_count(dev,
> +							      attr,
> +							      error);
>  			if (ret < 0)
>  				return ret;
>  			action_flags |= MLX5_ACTION_COUNT;
> @@ -865,6 +919,43 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
>  }
> 
>  /**
> + * Add META item to matcher
> + *
> + * @param[in, out] matcher
> + *   Flow matcher.
> + * @param[in, out] key
> + *   Flow matcher value.
> + * @param[in] item
> + *   Flow pattern to translate.
> + * @param[in] inner
> + *   Item is inner pattern.
> + */
> +static void
> +flow_dv_translate_item_meta(void *matcher, void *key,
> +				const struct rte_flow_item *item)
> +{
> +	const struct rte_flow_item_meta *metam;
> +	const struct rte_flow_item_meta *metav;
> +
> +	void *misc2_m =
> +		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
> +	void *misc2_v =
> +		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
> +
> +	metam = (const void *)item->mask;
> +	if (!metam)
> +		metam = &rte_flow_item_meta_mask;
> +
> +	metav = (const void *)item->spec;
> +	if (metav) {
> +		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
> +			RTE_BE32((uint32_t)rte_be_to_cpu_64(metam->data)));
> +		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
> +			RTE_BE32((uint32_t)rte_be_to_cpu_64(metav->data)));

The data field of rte_flow_item_meta is 32 bits in [1], so there is no need for a 64-bit byte-order conversion here.

> +	}
> +}
> +
> +/**
>   *
>   * Translate flow item.
>   *
> @@ -946,6 +1037,9 @@ static int flow_dv_validate_attributes(struct rte_eth_dev *dev,
>  	case RTE_FLOW_ITEM_TYPE_VXLAN_GPE:
>  		flow_dv_translate_item_vxlan(tmatcher->key, key, item, inner);
>  		break;
> +	case RTE_FLOW_ITEM_TYPE_META:
> +		flow_dv_translate_item_meta(tmatcher->key, key, item);
> +		break;
>  	case RTE_FLOW_ITEM_TYPE_ICMP:
>  	case RTE_FLOW_ITEM_TYPE_MPLS:
>  	case RTE_FLOW_ITEM_TYPE_SCTP:
> diff --git a/drivers/net/mlx5/mlx5_flow_verbs.c b/drivers/net/mlx5/mlx5_flow_verbs.c
> index 48e816d..3c42016 100644
> --- a/drivers/net/mlx5/mlx5_flow_verbs.c
> +++ b/drivers/net/mlx5/mlx5_flow_verbs.c
> @@ -1272,6 +1272,7 @@ struct ibv_spec_header {
>  			break;
>  		case RTE_FLOW_ACTION_TYPE_FLAG:
>  			ret = mlx5_flow_validate_action_flag(action_flags,
> +							     attr,
>  							     error);
>  			if (ret < 0)
>  				return ret;
> @@ -1280,6 +1281,7 @@ struct ibv_spec_header {
>  		case RTE_FLOW_ACTION_TYPE_MARK:
>  			ret = mlx5_flow_validate_action_mark(action_flags,
>  							     actions,
> +							     attr,
>  							     error);
>  			if (ret < 0)
>  				return ret;
> @@ -1287,27 +1289,36 @@ struct ibv_spec_header {
>  			break;
>  		case RTE_FLOW_ACTION_TYPE_DROP:
>  			ret = mlx5_flow_validate_action_drop(action_flags,
> +							     attr,
>  							     error);
>  			if (ret < 0)
>  				return ret;
>  			action_flags |= MLX5_ACTION_DROP;
>  			break;
>  		case RTE_FLOW_ACTION_TYPE_QUEUE:
> -			ret = mlx5_flow_validate_action_queue(action_flags, dev,
> -							      actions, error);
> +			ret = mlx5_flow_validate_action_queue(action_flags,
> +							      dev,
> +							      actions,
> +							      attr,
> +							      error);
>  			if (ret < 0)
>  				return ret;
>  			action_flags |= MLX5_ACTION_QUEUE;
>  			break;
>  		case RTE_FLOW_ACTION_TYPE_RSS:
> -			ret = mlx5_flow_validate_action_rss(action_flags, dev,
> -							    actions, error);
> +			ret = mlx5_flow_validate_action_rss(action_flags,
> +							    dev,
> +							    actions,
> +							    attr,
> +							    error);
>  			if (ret < 0)
>  				return ret;
>  			action_flags |= MLX5_ACTION_RSS;
>  			break;
>  		case RTE_FLOW_ACTION_TYPE_COUNT:
> -			ret = mlx5_flow_validate_action_count(dev, error);
> +			ret = mlx5_flow_validate_action_count(dev,
> +							      attr,
> +							      error);
>  			if (ret < 0)
>  				return ret;
>  			action_flags |= MLX5_ACTION_COUNT;
> diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
> index 117cec7..2f33aef 100644
> --- a/drivers/net/mlx5/mlx5_prm.h
> +++ b/drivers/net/mlx5/mlx5_prm.h
> @@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
>  	uint8_t	cs_flags;
>  	uint8_t	rsvd1;
>  	uint16_t mss;
> -	uint32_t rsvd2;
> +	uint32_t flow_table_metadata;
>  	uint16_t inline_hdr_sz;
>  	uint8_t inline_hdr[2];
>  } __rte_aligned(MLX5_WQE_DWORD_SIZE);
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 2d14f8a..080de57 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -523,6 +523,7 @@
>  		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
>  		uint32_t swp_offsets = 0;
>  		uint8_t swp_types = 0;
> +		uint32_t metadata = 0;
>  		uint16_t tso_segsz = 0;
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  		uint32_t total_length = 0;
> @@ -566,6 +567,9 @@
>  		cs_flags = txq_ol_cksum_to_cs(buf);
>  		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
>  		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
> +		/* Copy metadata from mbuf if valid */
> +		if (buf->ol_flags & PKT_TX_METADATA)
> +			metadata = buf->hash.fdir.hi;
>  		/* Replace the Ethernet type by the VLAN if necessary. */
>  		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
>  			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
> @@ -781,7 +785,7 @@
>  				swp_offsets,
>  				cs_flags | (swp_types << 8) |
>  				(rte_cpu_to_be_16(tso_segsz) << 16),
> -				0,
> +				rte_cpu_to_be_32(metadata),
>  				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
>  			};
>  		} else {
> @@ -795,7 +799,7 @@
>  			wqe->eseg = (rte_v128u32_t){
>  				swp_offsets,
>  				cs_flags | (swp_types << 8),
> -				0,
> +				rte_cpu_to_be_32(metadata),
>  				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
>  			};
>  		}
> @@ -861,7 +865,7 @@
>  	mpw->wqe->eseg.inline_hdr_sz = 0;
>  	mpw->wqe->eseg.rsvd0 = 0;
>  	mpw->wqe->eseg.rsvd1 = 0;
> -	mpw->wqe->eseg.rsvd2 = 0;
> +	mpw->wqe->eseg.flow_table_metadata = 0;
>  	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
>  					     (txq->wqe_ci << 8) |
>  					     MLX5_OPCODE_TSO);
> @@ -971,6 +975,8 @@
>  		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
>  		    ((mpw.len != length) ||
>  		     (segs_n != 1) ||
> +		     (mpw.wqe->eseg.flow_table_metadata !=
> +				rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
>  		     (mpw.wqe->eseg.cs_flags != cs_flags)))
>  			mlx5_mpw_close(txq, &mpw);
>  		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
> @@ -984,6 +990,8 @@
>  			max_wqe -= 2;
>  			mlx5_mpw_new(txq, &mpw, length);
>  			mpw.wqe->eseg.cs_flags = cs_flags;
> +			mpw.wqe->eseg.flow_table_metadata =
> +				rte_cpu_to_be_32(buf->hash.fdir.hi);
>  		}
>  		/* Multi-segment packets must be alone in their MPW. */
>  		assert((segs_n == 1) || (mpw.pkts_n == 0));
> @@ -1082,7 +1090,7 @@
>  	mpw->wqe->eseg.cs_flags = 0;
>  	mpw->wqe->eseg.rsvd0 = 0;
>  	mpw->wqe->eseg.rsvd1 = 0;
> -	mpw->wqe->eseg.rsvd2 = 0;
> +	mpw->wqe->eseg.flow_table_metadata = 0;
>  	inl = (struct mlx5_wqe_inl_small *)
>  		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
>  	mpw->data.raw = (uint8_t *)&inl->raw;
> @@ -1199,12 +1207,16 @@
>  		if (mpw.state == MLX5_MPW_STATE_OPENED) {
>  			if ((mpw.len != length) ||
>  			    (segs_n != 1) ||
> +			    (mpw.wqe->eseg.flow_table_metadata !=
> +					rte_cpu_to_be_32(buf->hash.fdir.hi)) ||

Shall we check buf->ol_flags & PKT_TX_METADATA here? Same question for the similar code below.

>  			    (mpw.wqe->eseg.cs_flags != cs_flags))
>  				mlx5_mpw_close(txq, &mpw);
>  		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
>  			if ((mpw.len != length) ||
>  			    (segs_n != 1) ||
>  			    (length > inline_room) ||
> +			    (mpw.wqe->eseg.flow_table_metadata !=
> +					rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
>  			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
>  				mlx5_mpw_inline_close(txq, &mpw);
>  				inline_room =
> @@ -1224,12 +1236,20 @@
>  				max_wqe -= 2;
>  				mlx5_mpw_new(txq, &mpw, length);
>  				mpw.wqe->eseg.cs_flags = cs_flags;
> +				/* Copy metadata from mbuf if valid */
> +				if (buf->ol_flags & PKT_TX_METADATA)
> +					mpw.wqe->eseg.flow_table_metadata =
> +					rte_cpu_to_be_32(buf->hash.fdir.hi);
>  			} else {
>  				if (unlikely(max_wqe < wqe_inl_n))
>  					break;
>  				max_wqe -= wqe_inl_n;
>  				mlx5_mpw_inline_new(txq, &mpw, length);
>  				mpw.wqe->eseg.cs_flags = cs_flags;
> +				/* Copy metadata from mbuf if valid */
> +				if (buf->ol_flags & PKT_TX_METADATA)
> +					mpw.wqe->eseg.flow_table_metadata =
> +					rte_cpu_to_be_32(buf->hash.fdir.hi);
>  			}
>  		}
>  		/* Multi-segment packets must be alone in their MPW. */
> @@ -1482,6 +1502,8 @@
>  			    (length <= txq->inline_max_packet_sz &&
>  			     inl_pad + sizeof(inl_hdr) + length >
>  			     mpw_room) ||
> +			    (mpw.wqe->eseg.flow_table_metadata !=
> +					rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
>  			    (mpw.wqe->eseg.cs_flags != cs_flags))
>  				max_wqe -= mlx5_empw_close(txq, &mpw);
>  		}
> @@ -1505,6 +1527,10 @@
>  				    sizeof(inl_hdr) + length <= mpw_room &&
>  				    !txq->mpw_hdr_dseg;
>  			mpw.wqe->eseg.cs_flags = cs_flags;
> +			/* Copy metadata from mbuf if valid */
> +			if (buf->ol_flags & PKT_TX_METADATA)
> +				mpw.wqe->eseg.flow_table_metadata =
> +					rte_cpu_to_be_32(buf->hash.fdir.hi);
>  		} else {
>  			/* Evaluate whether the next packet can be inlined.
>  			 * Inlininig is possible when:
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
> index 0a4aed8..db36699 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.c
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
> @@ -48,25 +48,39 @@
>   *   Number of packets.
>   * @param cs_flags
>   *   Pointer of flags to be returned.
> + * @param txq_offloads
> + *   Offloads enabled on Tx queue
>   *
>   * @return
>   *   Number of packets having same ol_flags.
> + *   If PKT_TX_METADATA is set in ol_flags, packets must have same metadata
> + *   as well.

Please move this part of the function description above the parameter descriptions.
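i.e., moving the sentence into the top description, roughly:

```
/**
 * Count the number of packets having same ol_flags.
 * If PKT_TX_METADATA is set in ol_flags, packets must have same
 * metadata as well.
 *
 * @param pkts
 *   Pointer to array of packets.
 * ...
 * @return
 *   Number of packets having same ol_flags.
 */
```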

>   */
>  static inline unsigned int
> -txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
> +txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n,
> +		uint8_t *cs_flags, const uint64_t txq_offloads)
>  {
>  	unsigned int pos;
>  	const uint64_t ol_mask =
>  		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
>  		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
> -		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
> +		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM | PKT_TX_METADATA;
> 
>  	if (!pkts_n)
>  		return 0;
>  	/* Count the number of packets having same ol_flags. */
> -	for (pos = 1; pos < pkts_n; ++pos)
> -		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
> +	for (pos = 1; pos < pkts_n; ++pos) {
> +		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
> +			((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask))
>  			break;
> +		/* If the metadata ol_flag is set,
> +		 * metadata must be same in all packets.
> +		 */
> +		if ((txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) &&
> +			(pkts[pos]->ol_flags & PKT_TX_METADATA) &&
> +			pkts[0]->hash.fdir.hi != pkts[pos]->hash.fdir.hi)
> +			break;
> +	}
>  	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
>  	return pos;
>  }
> @@ -137,8 +151,10 @@
>  		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
>  		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
>  			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
> -		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
> -			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
> +		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
> +					DEV_TX_OFFLOAD_MATCH_METADATA))
> +			n = txq_calc_offload(&pkts[nb_tx], n,
> +					     &cs_flags, txq->offloads);
>  		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
>  		nb_tx += ret;
>  		if (!ret)
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
> index fb884f9..fda7004 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> @@ -22,6 +22,7 @@
>  /* HW offload capabilities of vectorized Tx. */
>  #define MLX5_VEC_TX_OFFLOAD_CAP \
>  	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
> +	 DEV_TX_OFFLOAD_MATCH_METADATA | \
>  	 DEV_TX_OFFLOAD_MULTI_SEGS)
> 
>  /*
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> index b37b738..20c9427 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> @@ -237,6 +237,7 @@
>  	uint8x16_t *t_wqe;
>  	uint8_t *dseg;
>  	uint8x16_t ctrl;
> +	uint32_t md; /* metadata */
> 
>  	/* Make sure all packets can fit into a single WQE. */
>  	assert(elts_n > pkts_n);
> @@ -293,10 +294,11 @@
>  	ctrl = vqtbl1q_u8(ctrl, ctrl_shuf_m);
>  	vst1q_u8((void *)t_wqe, ctrl);
>  	/* Fill ESEG in the header. */
> +	md = pkts[0]->hash.fdir.hi;
>  	vst1q_u8((void *)(t_wqe + 1),
>  		 ((uint8x16_t) { 0, 0, 0, 0,
>  				 cs_flags, 0, 0, 0,
> -				 0, 0, 0, 0,
> +				 md >> 24, md >> 16, md >> 8, md,
>  				 0, 0, 0, 0 }));
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  	txq->stats.opackets += pkts_n;
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> index 54b3783..7c8535c 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> @@ -236,6 +236,7 @@
>  			      0,  1,  2,  3  /* bswap32 */);
>  	__m128i *t_wqe, *dseg;
>  	__m128i ctrl;
> +	uint32_t md; /* metadata */
> 
>  	/* Make sure all packets can fit into a single WQE. */
>  	assert(elts_n > pkts_n);
> @@ -292,9 +293,10 @@
>  	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
>  	_mm_store_si128(t_wqe, ctrl);
>  	/* Fill ESEG in the header. */
> +	md = pkts[0]->hash.fdir.hi;
>  	_mm_store_si128(t_wqe + 1,
>  			_mm_set_epi8(0, 0, 0, 0,
> -				     0, 0, 0, 0,
> +				     md, md >> 8, md >> 16, md >> 24,
>  				     0, 0, 0, cs_flags,
>  				     0, 0, 0, 0));
>  #ifdef MLX5_PMD_SOFT_COUNTERS
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index f9bc473..7263fb1 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -128,6 +128,12 @@
>  			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
>  				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
>  	}
> +
> +#ifdef HAVE_IBV_FLOW_DV_SUPPORT
> +	if (config->dv_flow_en)
> +		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
> +#endif
> +
>  	return offloads;
>  }
> 
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [PATCH v2] net/mlx5: support metadata as flow rule criteria
  2018-09-16 13:42 [dpdk-dev] [PATCH] net/mlx5: support metadata as flow rule criteria Dekel Peled
  2018-09-19  7:21 ` Xueming(Steven) Li
@ 2018-09-27 14:18 ` Dekel Peled
  2018-09-29  9:09   ` Yongseok Koh
  2018-10-11 11:19   ` [dpdk-dev] [PATCH v3] " Dekel Peled
  1 sibling, 2 replies; 17+ messages in thread
From: Dekel Peled @ 2018-09-27 14:18 UTC (permalink / raw)
  To: dev, shahafs, yskoh; +Cc: orika

The series starting at [1] adds the option to set a metadata value
as a match pattern when creating a new flow rule.

This patch adds metadata support to the mlx5 driver, in two parts:
- Add the setting of metadata value in matcher when creating
  a new flow rule.
- Add the passing of metadata value from mbuf to wqe when
  indicated by ol_flag, in different burst functions.

[1] "ethdev: support metadata as flow rule criteria"
    http://mails.dpdk.org/archives/dev/2018-September/113270.html
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
V2:
Split the support of egress rules to a different patch.
---
 drivers/net/mlx5/mlx5_flow.c          |  2 +-
 drivers/net/mlx5/mlx5_flow.h          |  8 +++
 drivers/net/mlx5/mlx5_flow_dv.c       | 96 +++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_prm.h           |  2 +-
 drivers/net/mlx5/mlx5_rxtx.c          | 35 +++++++++++--
 drivers/net/mlx5/mlx5_rxtx_vec.c      | 31 ++++++++---
 drivers/net/mlx5/mlx5_rxtx_vec.h      |  1 +
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  4 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  4 +-
 drivers/net/mlx5/mlx5_txq.c           |  6 +++
 10 files changed, 174 insertions(+), 15 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 8007bf1..9581691 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -417,7 +417,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static int
+int
 mlx5_flow_item_acceptable(const struct rte_flow_item *item,
 			  const uint8_t *mask,
 			  const uint8_t *nic_mask,
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 10d700a..d91ae17 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -42,6 +42,9 @@
 #define MLX5_FLOW_LAYER_GRE (1u << 14)
 #define MLX5_FLOW_LAYER_MPLS (1u << 15)
 
+/* General pattern items bits. */
+#define MLX5_FLOW_ITEM_METADATA (1u << 16)
+
 /* Outer Masks. */
 #define MLX5_FLOW_LAYER_OUTER_L3 \
 	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
@@ -299,6 +302,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
 int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
 				  const struct rte_flow_attr *attributes,
 				  struct rte_flow_error *error);
+int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
+			      const uint8_t *mask,
+			      const uint8_t *nic_mask,
+			      unsigned int size,
+			      struct rte_flow_error *error);
 int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
 				uint64_t item_flags,
 				struct rte_flow_error *error);
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index cf663cd..2439f5e 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -36,6 +36,55 @@
 #ifdef HAVE_IBV_FLOW_DV_SUPPORT
 
 /**
+ * Validate META item.
+ *
+ * @param[in] item
+ *   Item specification.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_validate_item_meta(const struct rte_flow_item *item,
+			const struct rte_flow_attr *attr,
+			struct rte_flow_error *error)
+{
+	const struct rte_flow_item_meta *spec = item->spec;
+	const struct rte_flow_item_meta *mask = item->mask;
+
+	const struct rte_flow_item_meta nic_mask = {
+		.data = RTE_BE32(UINT32_MAX)
+	};
+
+	int ret;
+
+	if (!spec)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  item->spec,
+					  "data cannot be empty");
+	if (!mask)
+		mask = &rte_flow_item_meta_mask;
+	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
+					(const uint8_t *)&nic_mask,
+					sizeof(struct rte_flow_item_meta),
+					error);
+	if (ret < 0)
+		return ret;
+
+	if (attr->ingress)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
+					  NULL,
+					  "pattern not supported for ingress");
+	return 0;
+}
+
+/**
  * Verify the @p attributes will be correctly understood by the NIC and store
  * them in the @p flow if everything is correct.
  *
@@ -216,6 +265,12 @@
 				return ret;
 			item_flags |= MLX5_FLOW_LAYER_MPLS;
 			break;
+		case RTE_FLOW_ITEM_TYPE_META:
+			ret = mlx5_flow_validate_item_meta(items, attr, error);
+			if (ret < 0)
+				return ret;
+			item_flags |= MLX5_FLOW_ITEM_METADATA;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ITEM,
@@ -853,6 +908,43 @@
 }
 
 /**
+ * Add META item to matcher
+ *
+ * @param[in, out] matcher
+ *   Flow matcher.
+ * @param[in, out] key
+ *   Flow matcher value.
+ * @param[in] item
+ *   Flow pattern to translate.
+ * @param[in] inner
+ *   Item is inner pattern.
+ */
+static void
+flow_dv_translate_item_meta(void *matcher, void *key,
+				const struct rte_flow_item *item)
+{
+	const struct rte_flow_item_meta *metam;
+	const struct rte_flow_item_meta *metav;
+
+	void *misc2_m =
+		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
+	void *misc2_v =
+		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
+
+	metam = (const void *)item->mask;
+	if (!metam)
+		metam = &rte_flow_item_meta_mask;
+
+	metav = (const void *)item->spec;
+	if (metav) {
+		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
+			RTE_BE32(metam->data));
+		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
+			RTE_BE32(metav->data));
+	}
+}
+
+/**
  * Update the matcher and the value based the selected item.
  *
  * @param[in, out] matcher
@@ -938,6 +1030,10 @@
 		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
 					     inner);
 		break;
+	case RTE_FLOW_ITEM_TYPE_META:
+		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
+		break;
+
 	default:
 		break;
 	}
diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
index 4e2f9f4..a905397 100644
--- a/drivers/net/mlx5/mlx5_prm.h
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
 	uint8_t	cs_flags;
 	uint8_t	rsvd1;
 	uint16_t mss;
-	uint32_t rsvd2;
+	uint32_t flow_table_metadata;
 	uint16_t inline_hdr_sz;
 	uint8_t inline_hdr[2];
 } __rte_aligned(MLX5_WQE_DWORD_SIZE);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 558e6b6..d596e4e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -523,6 +523,7 @@
 		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
 		uint32_t swp_offsets = 0;
 		uint8_t swp_types = 0;
+		uint32_t metadata = 0;
 		uint16_t tso_segsz = 0;
 #ifdef MLX5_PMD_SOFT_COUNTERS
 		uint32_t total_length = 0;
@@ -566,6 +567,9 @@
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
+		/* Copy metadata from mbuf if valid */
+		if (buf->ol_flags & PKT_TX_METADATA)
+			metadata = buf->hash.fdir.hi;
 		/* Replace the Ethernet type by the VLAN if necessary. */
 		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
@@ -781,7 +785,7 @@
 				swp_offsets,
 				cs_flags | (swp_types << 8) |
 				(rte_cpu_to_be_16(tso_segsz) << 16),
-				0,
+				rte_cpu_to_be_32(metadata),
 				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
 			};
 		} else {
@@ -795,7 +799,7 @@
 			wqe->eseg = (rte_v128u32_t){
 				swp_offsets,
 				cs_flags | (swp_types << 8),
-				0,
+				rte_cpu_to_be_32(metadata),
 				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
 			};
 		}
@@ -861,7 +865,7 @@
 	mpw->wqe->eseg.inline_hdr_sz = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
 					     (txq->wqe_ci << 8) |
 					     MLX5_OPCODE_TSO);
@@ -971,6 +975,8 @@
 		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
 		    ((mpw.len != length) ||
 		     (segs_n != 1) ||
+		     (mpw.wqe->eseg.flow_table_metadata !=
+			rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
 		     (mpw.wqe->eseg.cs_flags != cs_flags)))
 			mlx5_mpw_close(txq, &mpw);
 		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
@@ -984,6 +990,8 @@
 			max_wqe -= 2;
 			mlx5_mpw_new(txq, &mpw, length);
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata =
+				rte_cpu_to_be_32(buf->hash.fdir.hi);
 		}
 		/* Multi-segment packets must be alone in their MPW. */
 		assert((segs_n == 1) || (mpw.pkts_n == 0));
@@ -1082,7 +1090,7 @@
 	mpw->wqe->eseg.cs_flags = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	inl = (struct mlx5_wqe_inl_small *)
 		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
 	mpw->data.raw = (uint8_t *)&inl->raw;
@@ -1199,12 +1207,16 @@
 		if (mpw.state == MLX5_MPW_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
+			    (mpw.wqe->eseg.flow_table_metadata !=
+				rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				mlx5_mpw_close(txq, &mpw);
 		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
 			    (length > inline_room) ||
+			    (mpw.wqe->eseg.flow_table_metadata !=
+				rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
 				mlx5_mpw_inline_close(txq, &mpw);
 				inline_room =
@@ -1224,12 +1236,21 @@
 				max_wqe -= 2;
 				mlx5_mpw_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				/* Copy metadata from mbuf if valid */
+				if (buf->ol_flags & PKT_TX_METADATA)
+					mpw.wqe->eseg.flow_table_metadata =
+					rte_cpu_to_be_32(buf->hash.fdir.hi);
+
 			} else {
 				if (unlikely(max_wqe < wqe_inl_n))
 					break;
 				max_wqe -= wqe_inl_n;
 				mlx5_mpw_inline_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				/* Copy metadata from mbuf if valid */
+				if (buf->ol_flags & PKT_TX_METADATA)
+					mpw.wqe->eseg.flow_table_metadata =
+					rte_cpu_to_be_32(buf->hash.fdir.hi);
 			}
 		}
 		/* Multi-segment packets must be alone in their MPW. */
@@ -1482,6 +1503,8 @@
 			    (length <= txq->inline_max_packet_sz &&
 			     inl_pad + sizeof(inl_hdr) + length >
 			     mpw_room) ||
+			     (mpw.wqe->eseg.flow_table_metadata !=
+				rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				max_wqe -= mlx5_empw_close(txq, &mpw);
 		}
@@ -1505,6 +1528,10 @@
 				    sizeof(inl_hdr) + length <= mpw_room &&
 				    !txq->mpw_hdr_dseg;
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			/* Copy metadata from mbuf if valid */
+			if (buf->ol_flags & PKT_TX_METADATA)
+				mpw.wqe->eseg.flow_table_metadata =
+					rte_cpu_to_be_32(buf->hash.fdir.hi);
 		} else {
 			/* Evaluate whether the next packet can be inlined.
 			 * Inlining is possible when:
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 0a4aed8..132943a 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -41,6 +41,8 @@
 
 /**
  * Count the number of packets having same ol_flags and calculate cs_flags.
+ * If PKT_TX_METADATA is set in ol_flags, packets must have the same metadata
+ * as well.
  *
  * @param pkts
  *   Pointer to array of packets.
@@ -48,25 +50,38 @@
  *   Number of packets.
  * @param cs_flags
  *   Pointer of flags to be returned.
+ * @param txq_offloads
+ *   Offloads enabled on the Tx queue.
  *
  * @return
- *   Number of packets having same ol_flags.
+ *   Number of packets having same ol_flags and metadata, if relevant.
  */
 static inline unsigned int
-txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
+txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
+		const uint64_t txq_offloads)
 {
 	unsigned int pos;
 	const uint64_t ol_mask =
 		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
 		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
-		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
+		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM | PKT_TX_METADATA;
 
 	if (!pkts_n)
 		return 0;
 	/* Count the number of packets having same ol_flags. */
-	for (pos = 1; pos < pkts_n; ++pos)
-		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
+	for (pos = 1; pos < pkts_n; ++pos) {
+		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
+			((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask))
 			break;
+		/* If the metadata ol_flag is set,
+		 * metadata must be the same in all packets.
+		 */
+		if ((txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) &&
+			(pkts[pos]->ol_flags & PKT_TX_METADATA) &&
+			pkts[0]->hash.fdir.hi != pkts[pos]->hash.fdir.hi)
+			break;
+	}
+
 	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
 	return pos;
 }
@@ -137,8 +152,10 @@
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
 		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
 			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
-		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
-			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
+		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
+				DEV_TX_OFFLOAD_MATCH_METADATA))
+			n = txq_calc_offload(&pkts[nb_tx], n,
+					&cs_flags, txq->offloads);
 		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
 		nb_tx += ret;
 		if (!ret)
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fb884f9..fda7004 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -22,6 +22,7 @@
 /* HW offload capabilities of vectorized Tx. */
 #define MLX5_VEC_TX_OFFLOAD_CAP \
 	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
+	 DEV_TX_OFFLOAD_MATCH_METADATA | \
 	 DEV_TX_OFFLOAD_MULTI_SEGS)
 
 /*
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index b37b738..20c9427 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -237,6 +237,7 @@
 	uint8x16_t *t_wqe;
 	uint8_t *dseg;
 	uint8x16_t ctrl;
+	uint32_t md; /* metadata */
 
 	/* Make sure all packets can fit into a single WQE. */
 	assert(elts_n > pkts_n);
@@ -293,10 +294,11 @@
 	ctrl = vqtbl1q_u8(ctrl, ctrl_shuf_m);
 	vst1q_u8((void *)t_wqe, ctrl);
 	/* Fill ESEG in the header. */
+	md = pkts[0]->hash.fdir.hi;
 	vst1q_u8((void *)(t_wqe + 1),
 		 ((uint8x16_t) { 0, 0, 0, 0,
 				 cs_flags, 0, 0, 0,
-				 0, 0, 0, 0,
+				 md >> 24, md >> 16, md >> 8, md,
 				 0, 0, 0, 0 }));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 54b3783..7c8535c 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -236,6 +236,7 @@
 			      0,  1,  2,  3  /* bswap32 */);
 	__m128i *t_wqe, *dseg;
 	__m128i ctrl;
+	uint32_t md; /* metadata */
 
 	/* Make sure all packets can fit into a single WQE. */
 	assert(elts_n > pkts_n);
@@ -292,9 +293,10 @@
 	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
 	_mm_store_si128(t_wqe, ctrl);
 	/* Fill ESEG in the header. */
+	md = pkts[0]->hash.fdir.hi;
 	_mm_store_si128(t_wqe + 1,
 			_mm_set_epi8(0, 0, 0, 0,
-				     0, 0, 0, 0,
+				     md, md >> 8, md >> 16, md >> 24,
 				     0, 0, 0, cs_flags,
 				     0, 0, 0, 0));
 #ifdef MLX5_PMD_SOFT_COUNTERS
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index f9bc473..7263fb1 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -128,6 +128,12 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
+
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	if (config->dv_flow_en)
+		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
+#endif
+
 	return offloads;
 }
 
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [PATCH v2] net/mlx5: support metadata as flow rule criteria
  2018-09-27 14:18 ` [dpdk-dev] [PATCH v2] " Dekel Peled
@ 2018-09-29  9:09   ` Yongseok Koh
  2018-10-03  5:22     ` Dekel Peled
  2018-10-11 11:19   ` [dpdk-dev] [PATCH v3] " Dekel Peled
  1 sibling, 1 reply; 17+ messages in thread
From: Yongseok Koh @ 2018-09-29  9:09 UTC (permalink / raw)
  To: Dekel Peled; +Cc: dev, Shahaf Shuler, Ori Kam

On Thu, Sep 27, 2018 at 05:18:55PM +0300, Dekel Peled wrote:
> As described in the series starting at [1], this patch adds the option to set
> a metadata value as a match pattern when creating a new flow rule.
> 
> This patch adds metadata support in the mlx5 driver, in two parts:
> - Add the setting of metadata value in matcher when creating
>   a new flow rule.
> - Add the passing of metadata value from mbuf to wqe when
>   indicated by ol_flag, in different burst functions.
> 
> [1] "ethdev: support metadata as flow rule criteria"
>     http://mails.dpdk.org/archives/dev/2018-September/113270.html
> 	
> Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> ---
> V2:
> Split the support of egress rules to a different patch.
> ---
>  drivers/net/mlx5/mlx5_flow.c          |  2 +-
>  drivers/net/mlx5/mlx5_flow.h          |  8 +++
>  drivers/net/mlx5/mlx5_flow_dv.c       | 96 +++++++++++++++++++++++++++++++++++
>  drivers/net/mlx5/mlx5_prm.h           |  2 +-
>  drivers/net/mlx5/mlx5_rxtx.c          | 35 +++++++++++--
>  drivers/net/mlx5/mlx5_rxtx_vec.c      | 31 ++++++++---
>  drivers/net/mlx5/mlx5_rxtx_vec.h      |  1 +
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  4 +-
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  4 +-
>  drivers/net/mlx5/mlx5_txq.c           |  6 +++
>  10 files changed, 174 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 8007bf1..9581691 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -417,7 +417,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
>   * @return
>   *   0 on success, a negative errno value otherwise and rte_errno is set.
>   */
> -static int
> +int
>  mlx5_flow_item_acceptable(const struct rte_flow_item *item,
>  			  const uint8_t *mask,
>  			  const uint8_t *nic_mask,
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index 10d700a..d91ae17 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -42,6 +42,9 @@
>  #define MLX5_FLOW_LAYER_GRE (1u << 14)
>  #define MLX5_FLOW_LAYER_MPLS (1u << 15)
>  
> +/* General pattern items bits. */
> +#define MLX5_FLOW_ITEM_METADATA (1u << 16)
> +
>  /* Outer Masks. */
>  #define MLX5_FLOW_LAYER_OUTER_L3 \
>  	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
> @@ -299,6 +302,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
>  int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
>  				  const struct rte_flow_attr *attributes,
>  				  struct rte_flow_error *error);
> +int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
> +			      const uint8_t *mask,
> +			      const uint8_t *nic_mask,
> +			      unsigned int size,
> +			      struct rte_flow_error *error);
>  int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
>  				uint64_t item_flags,
>  				struct rte_flow_error *error);
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> index cf663cd..2439f5e 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -36,6 +36,55 @@
>  #ifdef HAVE_IBV_FLOW_DV_SUPPORT
>  
>  /**
> + * Validate META item.
> + *
> + * @param[in] item
> + *   Item specification.
> + * @param[in] attr
> + *   Attributes of flow that includes this item.
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mlx5_flow_validate_item_meta(const struct rte_flow_item *item,

The naming rule here is that static funcs start with flow_dv_*(). Global
funcs should start with mlx5_, so it should be mlx5_flow_dv_*().

Or, you can put it in mlx5_flow.c as a common helper function, although it is
used only by DV.

I prefer the latter.

> +			const struct rte_flow_attr *attr,
> +			struct rte_flow_error *error)
> +{
> +	const struct rte_flow_item_meta *spec = item->spec;
> +	const struct rte_flow_item_meta *mask = item->mask;
> +
> +	const struct rte_flow_item_meta nic_mask = {
> +		.data = RTE_BE32(UINT32_MAX)
> +	};
> +
> +	int ret;
> +
> +	if (!spec)
> +		return rte_flow_error_set(error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> +					  item->spec,
> +					  "data cannot be empty");
> +	if (!mask)
> +		mask = &rte_flow_item_meta_mask;
> +	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
> +					(const uint8_t *)&nic_mask,
> +					sizeof(struct rte_flow_item_meta),
> +					error);
> +	if (ret < 0)
> +		return ret;
> +

No blank line is allowed.

> +	if (attr->ingress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> +					  NULL,
> +					  "pattern not supported for ingress");

If it is only supported with egress flows, then please group this patch with the
other one - "net/mlx5: allow flow rule with attribute egress" - and have it
applied after that one by specifying [1/2] and [2/2].

> +	return 0;
> +}
> +
> +/**
>   * Verify the @p attributes will be correctly understood by the NIC and store
>   * them in the @p flow if everything is correct.
>   *
> @@ -216,6 +265,12 @@
>  				return ret;
>  			item_flags |= MLX5_FLOW_LAYER_MPLS;
>  			break;
> +		case RTE_FLOW_ITEM_TYPE_META:
> +			ret = mlx5_flow_validate_item_meta(items, attr, error);
> +			if (ret < 0)
> +				return ret;
> +			item_flags |= MLX5_FLOW_ITEM_METADATA;
> +			break;
>  		default:
>  			return rte_flow_error_set(error, ENOTSUP,
>  						  RTE_FLOW_ERROR_TYPE_ITEM,
> @@ -853,6 +908,43 @@
>  }
>  
>  /**
> + * Add META item to matcher
> + *
> + * @param[in, out] matcher
> + *   Flow matcher.
> + * @param[in, out] key
> + *   Flow matcher value.
> + * @param[in] item
> + *   Flow pattern to translate.
> + * @param[in] inner
> + *   Item is inner pattern.
> + */
> +static void
> +flow_dv_translate_item_meta(void *matcher, void *key,
> +				const struct rte_flow_item *item)
> +{
> +	const struct rte_flow_item_meta *metam;
> +	const struct rte_flow_item_meta *metav;

Ori changed the naming to meta_m and meta_v.

> +	void *misc2_m =
> +		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
> +	void *misc2_v =
> +		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
> +
> +	metam = (const void *)item->mask;
> +	if (!metam)
> +		metam = &rte_flow_item_meta_mask;
> +

No blank line is allowed except for the one after the variable declarations.
Please remove such blank lines throughout the patches.

> +	metav = (const void *)item->spec;
> +	if (metav) {
> +		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
> +			RTE_BE32(metam->data));
> +		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
> +			RTE_BE32(metav->data));
> +	}
> +}
> +
> +/**
>   * Update the matcher and the value based on the selected item.
>   *
>   * @param[in, out] matcher
> @@ -938,6 +1030,10 @@
>  		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
>  					     inner);
>  		break;
> +	case RTE_FLOW_ITEM_TYPE_META:
> +		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
> +		break;
> +
>  	default:
>  		break;
>  	}
> diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
> index 4e2f9f4..a905397 100644
> --- a/drivers/net/mlx5/mlx5_prm.h
> +++ b/drivers/net/mlx5/mlx5_prm.h
> @@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
>  	uint8_t	cs_flags;
>  	uint8_t	rsvd1;
>  	uint16_t mss;
> -	uint32_t rsvd2;
> +	uint32_t flow_table_metadata;
>  	uint16_t inline_hdr_sz;
>  	uint8_t inline_hdr[2];
>  } __rte_aligned(MLX5_WQE_DWORD_SIZE);
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 558e6b6..d596e4e 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -523,6 +523,7 @@
>  		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
>  		uint32_t swp_offsets = 0;
>  		uint8_t swp_types = 0;
> +		uint32_t metadata = 0;
>  		uint16_t tso_segsz = 0;
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  		uint32_t total_length = 0;
> @@ -566,6 +567,9 @@
>  		cs_flags = txq_ol_cksum_to_cs(buf);
>  		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
>  		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
> +		/* Copy metadata from mbuf if valid */
> +		if (buf->ol_flags & PKT_TX_METADATA)
> +			metadata = buf->hash.fdir.hi;

I saw someone suggest adding a union field to name it properly, and I agree
with the idea. It is quite confusing to get metadata from the hash field.

>  		/* Replace the Ethernet type by the VLAN if necessary. */
>  		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
>  			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
> @@ -781,7 +785,7 @@
>  				swp_offsets,
>  				cs_flags | (swp_types << 8) |
>  				(rte_cpu_to_be_16(tso_segsz) << 16),
> -				0,
> +				rte_cpu_to_be_32(metadata),
>  				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
>  			};
>  		} else {
> @@ -795,7 +799,7 @@
>  			wqe->eseg = (rte_v128u32_t){
>  				swp_offsets,
>  				cs_flags | (swp_types << 8),
> -				0,
> +				rte_cpu_to_be_32(metadata),
>  				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
>  			};
>  		}
> @@ -861,7 +865,7 @@
>  	mpw->wqe->eseg.inline_hdr_sz = 0;
>  	mpw->wqe->eseg.rsvd0 = 0;
>  	mpw->wqe->eseg.rsvd1 = 0;
> -	mpw->wqe->eseg.rsvd2 = 0;
> +	mpw->wqe->eseg.flow_table_metadata = 0;
>  	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
>  					     (txq->wqe_ci << 8) |
>  					     MLX5_OPCODE_TSO);
> @@ -971,6 +975,8 @@
>  		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
>  		    ((mpw.len != length) ||
>  		     (segs_n != 1) ||
> +		     (mpw.wqe->eseg.flow_table_metadata !=
> +			rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
>  		     (mpw.wqe->eseg.cs_flags != cs_flags)))
>  			mlx5_mpw_close(txq, &mpw);
>  		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
> @@ -984,6 +990,8 @@
>  			max_wqe -= 2;
>  			mlx5_mpw_new(txq, &mpw, length);
>  			mpw.wqe->eseg.cs_flags = cs_flags;
> +			mpw.wqe->eseg.flow_table_metadata =
> +				rte_cpu_to_be_32(buf->hash.fdir.hi);
>  		}
>  		/* Multi-segment packets must be alone in their MPW. */
>  		assert((segs_n == 1) || (mpw.pkts_n == 0));
> @@ -1082,7 +1090,7 @@
>  	mpw->wqe->eseg.cs_flags = 0;
>  	mpw->wqe->eseg.rsvd0 = 0;
>  	mpw->wqe->eseg.rsvd1 = 0;
> -	mpw->wqe->eseg.rsvd2 = 0;
> +	mpw->wqe->eseg.flow_table_metadata = 0;
>  	inl = (struct mlx5_wqe_inl_small *)
>  		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
>  	mpw->data.raw = (uint8_t *)&inl->raw;
> @@ -1199,12 +1207,16 @@
>  		if (mpw.state == MLX5_MPW_STATE_OPENED) {
>  			if ((mpw.len != length) ||
>  			    (segs_n != 1) ||
> +			    (mpw.wqe->eseg.flow_table_metadata !=
> +				rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
>  			    (mpw.wqe->eseg.cs_flags != cs_flags))
>  				mlx5_mpw_close(txq, &mpw);
>  		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
>  			if ((mpw.len != length) ||
>  			    (segs_n != 1) ||
>  			    (length > inline_room) ||
> +			    (mpw.wqe->eseg.flow_table_metadata !=
> +				rte_cpu_to_be_32(buf->hash.fdir.hi)) ||

Isn't this mbuf field valid only if PKT_TX_METADATA is set? Without the flag,
it could be garbage, right? And I think the value in the eseg might not be the
metadata of the previous packet if that packet didn't have the flag. It would be
better to define metadata and get it early in this loop, like cs_flags. E.g.

		metadata = buf->ol_flags & PKT_TX_METADATA ?
			   rte_cpu_to_be_32(buf->hash.fdir.hi) : 0;
		cs_flags = txq_ol_cksum_to_cs(buf);

Same for the rest of similar funcs.

>  			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
>  				mlx5_mpw_inline_close(txq, &mpw);
>  				inline_room =
> @@ -1224,12 +1236,21 @@
>  				max_wqe -= 2;
>  				mlx5_mpw_new(txq, &mpw, length);
>  				mpw.wqe->eseg.cs_flags = cs_flags;
> +				/* Copy metadata from mbuf if valid */
> +				if (buf->ol_flags & PKT_TX_METADATA)
> +					mpw.wqe->eseg.flow_table_metadata =
> +					rte_cpu_to_be_32(buf->hash.fdir.hi);
> +

Yes, this can be allowed but the following is preferred.
					mpw.wqe->eseg.flow_table_metadata =
						rte_cpu_to_be_32
						(buf->hash.fdir.hi);
>  			} else {
>  				if (unlikely(max_wqe < wqe_inl_n))
>  					break;
>  				max_wqe -= wqe_inl_n;
>  				mlx5_mpw_inline_new(txq, &mpw, length);
>  				mpw.wqe->eseg.cs_flags = cs_flags;
> +				/* Copy metadata from mbuf if valid */
> +				if (buf->ol_flags & PKT_TX_METADATA)
> +					mpw.wqe->eseg.flow_table_metadata =
> +					rte_cpu_to_be_32(buf->hash.fdir.hi);

Same here.

>  			}
>  		}
>  		/* Multi-segment packets must be alone in their MPW. */
> @@ -1482,6 +1503,8 @@
>  			    (length <= txq->inline_max_packet_sz &&
>  			     inl_pad + sizeof(inl_hdr) + length >
>  			     mpw_room) ||
> +			     (mpw.wqe->eseg.flow_table_metadata !=
> +				rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
>  			    (mpw.wqe->eseg.cs_flags != cs_flags))
>  				max_wqe -= mlx5_empw_close(txq, &mpw);
>  		}
> @@ -1505,6 +1528,10 @@
>  				    sizeof(inl_hdr) + length <= mpw_room &&
>  				    !txq->mpw_hdr_dseg;
>  			mpw.wqe->eseg.cs_flags = cs_flags;
> +			/* Copy metadata from mbuf if valid */
> +			if (buf->ol_flags & PKT_TX_METADATA)
> +				mpw.wqe->eseg.flow_table_metadata =
> +					rte_cpu_to_be_32(buf->hash.fdir.hi);
>  		} else {
>  			/* Evaluate whether the next packet can be inlined.
> 			 * Inlining is possible when:
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
> index 0a4aed8..132943a 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.c
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
> @@ -41,6 +41,8 @@
>  
>  /**
>   * Count the number of packets having same ol_flags and calculate cs_flags.
> + * If PKT_TX_METADATA is set in ol_flags, packets must have same metadata
> + * as well.
>   *
>   * @param pkts
>   *   Pointer to array of packets.
> @@ -48,25 +50,38 @@
>   *   Number of packets.
>   * @param cs_flags
>   *   Pointer of flags to be returned.
> + * @param txq_offloads
> + *   Offloads enabled on Tx queue
>   *
>   * @return
> - *   Number of packets having same ol_flags.
> + *   Number of packets having same ol_flags and metadata, if relevant.
>   */
>  static inline unsigned int
> -txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
> +txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
> +		const uint64_t txq_offloads)
>  {
>  	unsigned int pos;
>  	const uint64_t ol_mask =
>  		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
>  		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
> -		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
> +		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM | PKT_TX_METADATA;
>  
>  	if (!pkts_n)
>  		return 0;
>  	/* Count the number of packets having same ol_flags. */
> -	for (pos = 1; pos < pkts_n; ++pos)
> -		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
> +	for (pos = 1; pos < pkts_n; ++pos) {
> +		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
> +			((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask))
>  			break;
> +		/* If the metadata ol_flag is set,
> +		 *  metadata must be same in all packets.
> +		 */
> +		if ((txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) &&
> +			(pkts[pos]->ol_flags & PKT_TX_METADATA) &&
> +			pkts[0]->hash.fdir.hi != pkts[pos]->hash.fdir.hi)
> +			break;
> +	}
> +
>  	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
>  	return pos;
>  }
> @@ -137,8 +152,10 @@
>  		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
>  		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
>  			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
> -		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
> -			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
> +		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
> +				DEV_TX_OFFLOAD_MATCH_METADATA))

How about writing a separate func - txq_count_contig_same_metadata()? And it
would be better to rename txq_calc_offload() to txq_count_contig_same_csflag().
Then, there would be fewer redundant if-clauses in the funcs.

> +			n = txq_calc_offload(&pkts[nb_tx], n,
> +					&cs_flags, txq->offloads);
>  		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
>  		nb_tx += ret;
>  		if (!ret)
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
> index fb884f9..fda7004 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> @@ -22,6 +22,7 @@
>  /* HW offload capabilities of vectorized Tx. */
>  #define MLX5_VEC_TX_OFFLOAD_CAP \
>  	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
> +	 DEV_TX_OFFLOAD_MATCH_METADATA | \
>  	 DEV_TX_OFFLOAD_MULTI_SEGS)
>  
>  /*
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> index b37b738..20c9427 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> @@ -237,6 +237,7 @@
>  	uint8x16_t *t_wqe;
>  	uint8_t *dseg;
>  	uint8x16_t ctrl;
> +	uint32_t md; /* metadata */
>  
>  	/* Make sure all packets can fit into a single WQE. */
>  	assert(elts_n > pkts_n);
> @@ -293,10 +294,11 @@
>  	ctrl = vqtbl1q_u8(ctrl, ctrl_shuf_m);
>  	vst1q_u8((void *)t_wqe, ctrl);
>  	/* Fill ESEG in the header. */
> +	md = pkts[0]->hash.fdir.hi;
>  	vst1q_u8((void *)(t_wqe + 1),
>  		 ((uint8x16_t) { 0, 0, 0, 0,
>  				 cs_flags, 0, 0, 0,
> -				 0, 0, 0, 0,
> +				 md >> 24, md >> 16, md >> 8, md,
>  				 0, 0, 0, 0 }));

I know the compiler would optimize this well, but as it is a very
performance-critical path, let's optimize it by adding metadata as an argument
to txq_burst_v().

static inline uint16_t
txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
	    uint8_t cs_flags, rte_be32_t metadata)
{
...
	vst1q_u32((void *)(t_wqe + 1),
		 ((uint32x4_t) { 0, cs_flags, metadata, 0 }));
...
}

mlx5_tx_burst_raw_vec() should call txq_burst_v(..., 0, 0). As 0 is a built-in
constant and txq_burst_v() is inline, it would not cause any performance drop.

In mlx5_tx_burst_vec(), 
	ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags, metadata);

, where metadata is obtained from txq_count_contig_same_metadata() and should be
big-endian.

>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  	txq->stats.opackets += pkts_n;
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> index 54b3783..7c8535c 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> @@ -236,6 +236,7 @@
>  			      0,  1,  2,  3  /* bswap32 */);
>  	__m128i *t_wqe, *dseg;
>  	__m128i ctrl;
> +	uint32_t md; /* metadata */
>  
>  	/* Make sure all packets can fit into a single WQE. */
>  	assert(elts_n > pkts_n);
> @@ -292,9 +293,10 @@
>  	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
>  	_mm_store_si128(t_wqe, ctrl);
>  	/* Fill ESEG in the header. */
> +	md = pkts[0]->hash.fdir.hi;
>  	_mm_store_si128(t_wqe + 1,
>  			_mm_set_epi8(0, 0, 0, 0,
> -				     0, 0, 0, 0,
> +				     md, md >> 8, md >> 16, md >> 24,
>  				     0, 0, 0, cs_flags,
>  				     0, 0, 0, 0));

If you take my proposal above, this should be:
	_mm_store_si128(t_wqe + 1, _mm_set_epi32(0, metadata, cs_flags, 0));

>  #ifdef MLX5_PMD_SOFT_COUNTERS
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index f9bc473..7263fb1 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -128,6 +128,12 @@
>  			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
>  				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
>  	}
> +
> +#ifdef HAVE_IBV_FLOW_DV_SUPPORT
> +	if (config->dv_flow_en)
> +		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
> +#endif
> +
>  	return offloads;
>  }
>  
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [PATCH v2] net/mlx5: support metadata as flow rule criteria
  2018-09-29  9:09   ` Yongseok Koh
@ 2018-10-03  5:22     ` Dekel Peled
  2018-10-03  7:22       ` Yongseok Koh
  0 siblings, 1 reply; 17+ messages in thread
From: Dekel Peled @ 2018-10-03  5:22 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: dev, Shahaf Shuler, Ori Kam

Thanks, PSB.

> -----Original Message-----
> From: Yongseok Koh
> Sent: Saturday, September 29, 2018 12:09 PM
> To: Dekel Peled <dekelp@mellanox.com>
> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>; Ori Kam
> <orika@mellanox.com>
> Subject: Re: [PATCH v2] net/mlx5: support metadata as flow rule criteria
> 
> On Thu, Sep 27, 2018 at 05:18:55PM +0300, Dekel Peled wrote:
> > As described in the series starting at [1], this patch adds the option to
> > set a metadata value as a match pattern when creating a new flow rule.
> >
> > This patch adds metadata support in the mlx5 driver, in two parts:
> > - Add the setting of metadata value in matcher when creating
> >   a new flow rule.
> > - Add the passing of metadata value from mbuf to wqe when
> >   indicated by ol_flag, in different burst functions.
> >
> > [1] "ethdev: support metadata as flow rule criteria"
> >     http://mails.dpdk.org/archives/dev/2018-September/113270.html
> >
> > Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> > ---
> > V2:
> > Split the support of egress rules to a different patch.
> > ---
> >  drivers/net/mlx5/mlx5_flow.c          |  2 +-
> >  drivers/net/mlx5/mlx5_flow.h          |  8 +++
> >  drivers/net/mlx5/mlx5_flow_dv.c       | 96
> +++++++++++++++++++++++++++++++++++
> >  drivers/net/mlx5/mlx5_prm.h           |  2 +-
> >  drivers/net/mlx5/mlx5_rxtx.c          | 35 +++++++++++--
> >  drivers/net/mlx5/mlx5_rxtx_vec.c      | 31 ++++++++---
> >  drivers/net/mlx5/mlx5_rxtx_vec.h      |  1 +
> >  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  4 +-
> > drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  4 +-
> >  drivers/net/mlx5/mlx5_txq.c           |  6 +++
> >  10 files changed, 174 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > b/drivers/net/mlx5/mlx5_flow.c index 8007bf1..9581691 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -417,7 +417,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
> >   * @return
> >   *   0 on success, a negative errno value otherwise and rte_errno is set.
> >   */
> > -static int
> > +int
> >  mlx5_flow_item_acceptable(const struct rte_flow_item *item,
> >  			  const uint8_t *mask,
> >  			  const uint8_t *nic_mask,
> > diff --git a/drivers/net/mlx5/mlx5_flow.h
> > b/drivers/net/mlx5/mlx5_flow.h index 10d700a..d91ae17 100644
> > --- a/drivers/net/mlx5/mlx5_flow.h
> > +++ b/drivers/net/mlx5/mlx5_flow.h
> > @@ -42,6 +42,9 @@
> >  #define MLX5_FLOW_LAYER_GRE (1u << 14)  #define
> MLX5_FLOW_LAYER_MPLS
> > (1u << 15)
> >
> > +/* General pattern items bits. */
> > +#define MLX5_FLOW_ITEM_METADATA (1u << 16)
> > +
> >  /* Outer Masks. */
> >  #define MLX5_FLOW_LAYER_OUTER_L3 \
> > 	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
> > @@ -299,6 +302,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
> >  int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
> >  				  const struct rte_flow_attr *attributes,
> >  				  struct rte_flow_error *error);
> > +int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
> > +			      const uint8_t *mask,
> > +			      const uint8_t *nic_mask,
> > +			      unsigned int size,
> > +			      struct rte_flow_error *error);
> >  int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
> >  				uint64_t item_flags,
> >  				struct rte_flow_error *error);
> > diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> > index cf663cd..2439f5e 100644
> > --- a/drivers/net/mlx5/mlx5_flow_dv.c
> > +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> > @@ -36,6 +36,55 @@
> >  #ifdef HAVE_IBV_FLOW_DV_SUPPORT
> >
> >  /**
> > + * Validate META item.
> > + *
> > + * @param[in] item
> > + *   Item specification.
> > + * @param[in] attr
> > + *   Attributes of flow that includes this item.
> > + * @param[out] error
> > + *   Pointer to error structure.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +static int
> > +mlx5_flow_validate_item_meta(const struct rte_flow_item *item,
> 
> The naming rule here is to start static funcs with flow_dv_*(). Global
> funcs should start with mlx5_, so it should be mlx5_flow_dv_*().

Renamed function to flow_dv_validate_item_meta.
Left it as static in mlx5_flow_dv.c, since it is relevant for DV only.

> 
> Or, you can put it in mlx5_flow.c as a common helper function, although it
> is used only by DV.
> 
> I prefer the latter.
> 
> > +			const struct rte_flow_attr *attr,
> > +			struct rte_flow_error *error)
> > +{
> > +	const struct rte_flow_item_meta *spec = item->spec;
> > +	const struct rte_flow_item_meta *mask = item->mask;
> > +
> > +	const struct rte_flow_item_meta nic_mask = {
> > +		.data = RTE_BE32(UINT32_MAX)
> > +	};
> > +
> > +	int ret;
> > +
> > +	if (!spec)
> > +		return rte_flow_error_set(error, EINVAL,
> > +					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> > +					  item->spec,
> > +					  "data cannot be empty");
> > +	if (!mask)
> > +		mask = &rte_flow_item_meta_mask;
> > +	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
> > +					(const uint8_t *)&nic_mask,
> > +					sizeof(struct rte_flow_item_meta),
> > +					error);
> > +	if (ret < 0)
> > +		return ret;
> > +
> 
> No blank line is allowed.
Blank line removed.

> 
> > +	if (attr->ingress)
> > +		return rte_flow_error_set(error, ENOTSUP,
> > +					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> > +					  NULL,
> > +					  "pattern not supported for ingress");
> 
> If it is only supported with egress flows, then please group this patch with
> the other one - "net/mlx5: allow flow rule with attribute egress", and make it
> apply after that one by specifying [1/2] and [2/2].

I will update the commit log with the relevant data as done 
in http://mails.dpdk.org/archives/dev/2018-September/113280.html

> 
> > +	return 0;
> > +}
> > +
> > +/**
> >   * Verify the @p attributes will be correctly understood by the NIC and
> store
> >   * them in the @p flow if everything is correct.
> >   *
> > @@ -216,6 +265,12 @@
> >  				return ret;
> >  			item_flags |= MLX5_FLOW_LAYER_MPLS;
> >  			break;
> > +		case RTE_FLOW_ITEM_TYPE_META:
> > +			ret = mlx5_flow_validate_item_meta(items, attr, error);
> > +			if (ret < 0)
> > +				return ret;
> > +			item_flags |= MLX5_FLOW_ITEM_METADATA;
> > +			break;
> >  		default:
> >  			return rte_flow_error_set(error, ENOTSUP,
> >  						  RTE_FLOW_ERROR_TYPE_ITEM,
> > @@ -853,6 +908,43 @@
> >  }
> >
> >  /**
> > + * Add META item to matcher
> > + *
> > + * @param[in, out] matcher
> > + *   Flow matcher.
> > + * @param[in, out] key
> > + *   Flow matcher value.
> > + * @param[in] item
> > + *   Flow pattern to translate.
> > + * @param[in] inner
> > + *   Item is inner pattern.
> > + */
> > +static void
> > +flow_dv_translate_item_meta(void *matcher, void *key,
> > +				const struct rte_flow_item *item)
> > +{
> > +	const struct rte_flow_item_meta *metam;
> > +	const struct rte_flow_item_meta *metav;
> 
> Ori changed naming like meta_m and meta_v.

Renamed.

> 
> > +	void *misc2_m =
> > +		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
> > +	void *misc2_v =
> > +		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
> > +
> > +	metam = (const void *)item->mask;
> > +	if (!metam)
> > +		metam = &rte_flow_item_meta_mask;
> > +
> 
> No blank line is allowed except for the one after the variable declarations.
> Please remove such blank lines throughout the patches.
Done.

> 
> > +	metav = (const void *)item->spec;
> > +	if (metav) {
> > +		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
> > +			RTE_BE32(metam->data));
> > +		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
> > +			RTE_BE32(metav->data));
> > +	}
> > +}
> > +
> > +/**
> >   * Update the matcher and the value based the selected item.
> >   *
> >   * @param[in, out] matcher
> > @@ -938,6 +1030,10 @@
> >  		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
> >  					     inner);
> >  		break;
> > +	case RTE_FLOW_ITEM_TYPE_META:
> > +		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
> > +		break;
> > +
> >  	default:
> >  		break;
> >  	}
> > diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
> > index 4e2f9f4..a905397 100644
> > --- a/drivers/net/mlx5/mlx5_prm.h
> > +++ b/drivers/net/mlx5/mlx5_prm.h
> > @@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
> >  	uint8_t	cs_flags;
> >  	uint8_t	rsvd1;
> >  	uint16_t mss;
> > -	uint32_t rsvd2;
> > +	uint32_t flow_table_metadata;
> >  	uint16_t inline_hdr_sz;
> >  	uint8_t inline_hdr[2];
> >  } __rte_aligned(MLX5_WQE_DWORD_SIZE);
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> > index 558e6b6..d596e4e 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.c
> > +++ b/drivers/net/mlx5/mlx5_rxtx.c
> > @@ -523,6 +523,7 @@
> >  		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
> >  		uint32_t swp_offsets = 0;
> >  		uint8_t swp_types = 0;
> > +		uint32_t metadata = 0;
> >  		uint16_t tso_segsz = 0;
> >  #ifdef MLX5_PMD_SOFT_COUNTERS
> >  		uint32_t total_length = 0;
> > @@ -566,6 +567,9 @@
> >  		cs_flags = txq_ol_cksum_to_cs(buf);
> >  		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
> >  		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
> > +		/* Copy metadata from mbuf if valid */
> > +		if (buf->ol_flags & PKT_TX_METADATA)
> > +			metadata = buf->hash.fdir.hi;
> 
> I saw someone suggest adding a union field to name it properly, and I agree.
> It is quite confusing to get metadata from the hash field.

Done, added field tx_metadata in union mbuf.hash.

> 
> >  		/* Replace the Ethernet type by the VLAN if necessary. */
> >  		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
> >  			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
> > @@ -781,7 +785,7 @@
> >  				swp_offsets,
> >  				cs_flags | (swp_types << 8) |
> >  				(rte_cpu_to_be_16(tso_segsz) << 16),
> > -				0,
> > +				rte_cpu_to_be_32(metadata),
> >  				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
> >  			};
> >  		} else {
> > @@ -795,7 +799,7 @@
> >  			wqe->eseg = (rte_v128u32_t){
> >  				swp_offsets,
> >  				cs_flags | (swp_types << 8),
> > -				0,
> > +				rte_cpu_to_be_32(metadata),
> >  				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
> >  			};
> >  		}
> > @@ -861,7 +865,7 @@
> >  	mpw->wqe->eseg.inline_hdr_sz = 0;
> >  	mpw->wqe->eseg.rsvd0 = 0;
> >  	mpw->wqe->eseg.rsvd1 = 0;
> > -	mpw->wqe->eseg.rsvd2 = 0;
> > +	mpw->wqe->eseg.flow_table_metadata = 0;
> >  	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
> >  					     (txq->wqe_ci << 8) |
> >  					     MLX5_OPCODE_TSO);
> > @@ -971,6 +975,8 @@
> >  		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
> >  		    ((mpw.len != length) ||
> >  		     (segs_n != 1) ||
> > +		     (mpw.wqe->eseg.flow_table_metadata !=
> > +			rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
> >  		     (mpw.wqe->eseg.cs_flags != cs_flags)))
> >  			mlx5_mpw_close(txq, &mpw);
> >  		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
> > @@ -984,6 +990,8 @@
> >  			max_wqe -= 2;
> >  			mlx5_mpw_new(txq, &mpw, length);
> >  			mpw.wqe->eseg.cs_flags = cs_flags;
> > +			mpw.wqe->eseg.flow_table_metadata =
> > +				rte_cpu_to_be_32(buf->hash.fdir.hi);
> >  		}
> >  		/* Multi-segment packets must be alone in their MPW. */
> >  		assert((segs_n == 1) || (mpw.pkts_n == 0));
> > @@ -1082,7 +1090,7 @@
> >  	mpw->wqe->eseg.cs_flags = 0;
> >  	mpw->wqe->eseg.rsvd0 = 0;
> >  	mpw->wqe->eseg.rsvd1 = 0;
> > -	mpw->wqe->eseg.rsvd2 = 0;
> > +	mpw->wqe->eseg.flow_table_metadata = 0;
> >  	inl = (struct mlx5_wqe_inl_small *)
> >  		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
> >  	mpw->data.raw = (uint8_t *)&inl->raw;
> > @@ -1199,12 +1207,16 @@
> >  		if (mpw.state == MLX5_MPW_STATE_OPENED) {
> >  			if ((mpw.len != length) ||
> >  			    (segs_n != 1) ||
> > +			    (mpw.wqe->eseg.flow_table_metadata !=
> > +				rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
> >  			    (mpw.wqe->eseg.cs_flags != cs_flags))
> >  				mlx5_mpw_close(txq, &mpw);
> >  		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
> >  			if ((mpw.len != length) ||
> >  			    (segs_n != 1) ||
> >  			    (length > inline_room) ||
> > +			    (mpw.wqe->eseg.flow_table_metadata !=
> > +				rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
> 
> Isn't this mbuf field valid only if PKT_TX_METADATA is set? Without the
> flag, it could be garbage, right? And the value in the eseg might not be
> the metadata of the previous packet if that packet didn't have the flag. It
> would be better to define metadata and get it early in this loop, like cs_flags. E.g.
> 
> 		metadata = buf->ol_flags & PKT_TX_METADATA ?
> 			   rte_cpu_to_be_32(buf->hash.fdir.hi) : 0;
> 		cs_flags = txq_ol_cksum_to_cs(buf);
> 
> Same for the rest of similar funcs.

Done.

> 
> >  			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
> >  				mlx5_mpw_inline_close(txq, &mpw);
> >  				inline_room =
> > @@ -1224,12 +1236,21 @@
> >  				max_wqe -= 2;
> >  				mlx5_mpw_new(txq, &mpw, length);
> >  				mpw.wqe->eseg.cs_flags = cs_flags;
> > +				/* Copy metadata from mbuf if valid */
> > +				if (buf->ol_flags & PKT_TX_METADATA)
> > +					mpw.wqe->eseg.flow_table_metadata =
> > +					rte_cpu_to_be_32(buf->hash.fdir.hi);
> > +
> 
> Yes, this can be allowed, but the following is preferred.
OK.

> 					mpw.wqe->eseg.flow_table_metadata =
> 						rte_cpu_to_be_32
> 						(buf->hash.fdir.hi);
> >  			} else {
> >  				if (unlikely(max_wqe < wqe_inl_n))
> >  					break;
> >  				max_wqe -= wqe_inl_n;
> >  				mlx5_mpw_inline_new(txq, &mpw, length);
> >  				mpw.wqe->eseg.cs_flags = cs_flags;
> > +				/* Copy metadata from mbuf if valid */
> > +				if (buf->ol_flags & PKT_TX_METADATA)
> > +					mpw.wqe->eseg.flow_table_metadata =
> > +					rte_cpu_to_be_32(buf->hash.fdir.hi);
> 
> Same here.
> 
> >  			}
> >  		}
> >  		/* Multi-segment packets must be alone in their MPW. */
> > @@ -1482,6 +1503,8 @@
> >  			    (length <= txq->inline_max_packet_sz &&
> >  			     inl_pad + sizeof(inl_hdr) + length >
> >  			     mpw_room) ||
> > +			     (mpw.wqe->eseg.flow_table_metadata !=
> > +				rte_cpu_to_be_32(buf->hash.fdir.hi)) ||
> >  			    (mpw.wqe->eseg.cs_flags != cs_flags))
> >  				max_wqe -= mlx5_empw_close(txq, &mpw);
> >  		}
> > @@ -1505,6 +1528,10 @@
> >  				    sizeof(inl_hdr) + length <= mpw_room &&
> >  				    !txq->mpw_hdr_dseg;
> >  			mpw.wqe->eseg.cs_flags = cs_flags;
> > +			/* Copy metadata from mbuf if valid */
> > +			if (buf->ol_flags & PKT_TX_METADATA)
> > +				mpw.wqe->eseg.flow_table_metadata =
> > +					rte_cpu_to_be_32(buf->hash.fdir.hi);
> >  		} else {
> >  			/* Evaluate whether the next packet can be inlined.
> >  			 * Inlininig is possible when:
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
> > index 0a4aed8..132943a 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec.c
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
> > @@ -41,6 +41,8 @@
> >
> >  /**
> >   * Count the number of packets having same ol_flags and calculate cs_flags.
> > + * If PKT_TX_METADATA is set in ol_flags, packets must have same metadata
> > + * as well.
> >   *
> >   * @param pkts
> >   *   Pointer to array of packets.
> > @@ -48,25 +50,38 @@
> >   *   Number of packets.
> >   * @param cs_flags
> >   *   Pointer of flags to be returned.
> > + * @param txq_offloads
> > + *   Offloads enabled on Tx queue
> >   *
> >   * @return
> > - *   Number of packets having same ol_flags.
> > + *   Number of packets having same ol_flags and metadata, if relevant.
> >   */
> >  static inline unsigned int
> > -txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
> > +txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
> > +		const uint64_t txq_offloads)
> >  {
> >  	unsigned int pos;
> >  	const uint64_t ol_mask =
> >  		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
> >  		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
> > -		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
> > +		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM | PKT_TX_METADATA;
> >
> >  	if (!pkts_n)
> >  		return 0;
> >  	/* Count the number of packets having same ol_flags. */
> > -	for (pos = 1; pos < pkts_n; ++pos)
> > -		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
> > +	for (pos = 1; pos < pkts_n; ++pos) {
> > +		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
> > +			((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask))
> >  			break;
> > +		/* If the metadata ol_flag is set,
> > +		 *  metadata must be same in all packets.
> > +		 */
> > +		if ((txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) &&
> > +			(pkts[pos]->ol_flags & PKT_TX_METADATA) &&
> > +			pkts[0]->hash.fdir.hi != pkts[pos]->hash.fdir.hi)
> > +			break;
> > +	}
> > +
> >  	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
> >  	return pos;
> >  }
> > @@ -137,8 +152,10 @@
> >  		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
> >  		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
> >  			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
> > -		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
> > -			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
> > +		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
> > +				DEV_TX_OFFLOAD_MATCH_METADATA))
> 
> How about writing a separate func - txq_count_contig_same_metadata()?
> And it would be better to rename txq_calc_offload() to
> txq_count_contig_same_csflag().
> Then, there could be less redundant if-clause in the funcs.

It was considered during implementation, but it was decided to handle all
related logic in the same function.

> 
> > +			n = txq_calc_offload(&pkts[nb_tx], n,
> > +					&cs_flags, txq->offloads);
> >  		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
> >  		nb_tx += ret;
> >  		if (!ret)
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
> > index fb884f9..fda7004 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> > @@ -22,6 +22,7 @@
> >  /* HW offload capabilities of vectorized Tx. */
> >  #define MLX5_VEC_TX_OFFLOAD_CAP \
> >  	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
> > +	 DEV_TX_OFFLOAD_MATCH_METADATA | \
> >  	 DEV_TX_OFFLOAD_MULTI_SEGS)
> >
> >  /*
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > index b37b738..20c9427 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > @@ -237,6 +237,7 @@
> >  	uint8x16_t *t_wqe;
> >  	uint8_t *dseg;
> >  	uint8x16_t ctrl;
> > +	uint32_t md; /* metadata */
> >
> >  	/* Make sure all packets can fit into a single WQE. */
> >  	assert(elts_n > pkts_n);
> > @@ -293,10 +294,11 @@
> >  	ctrl = vqtbl1q_u8(ctrl, ctrl_shuf_m);
> >  	vst1q_u8((void *)t_wqe, ctrl);
> >  	/* Fill ESEG in the header. */
> > +	md = pkts[0]->hash.fdir.hi;
> >  	vst1q_u8((void *)(t_wqe + 1),
> >  		 ((uint8x16_t) { 0, 0, 0, 0,
> >  				 cs_flags, 0, 0, 0,
> > -				 0, 0, 0, 0,
> > +				 md >> 24, md >> 16, md >> 8, md,
> >  				 0, 0, 0, 0 }));
> 
> I know the compiler would do a good job optimizing this, but as it is a very
> performance-critical path, let's optimize it by adding metadata as an
> argument to txq_burst_v().
> 
> static inline uint16_t
> txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
> 	    uint8_t cs_flags, rte_be32_t metadata) { ...
> 	vst1q_u32((void *)(t_wqe + 1),
> 		 ((uint32x4_t) { 0, cs_flags, metadata, 0 })); ...
> }
> 
> mlx5_tx_burst_raw_vec() should call txq_burst_v(..., 0, 0). As 0 is a builtin
> constant and txq_burst_v() is inline, it wouldn't cause any performance drop.
> 
> In mlx5_tx_burst_vec(),
> 	ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags, metadata);
> 
> , where metadata is obtained from txq_count_contig_same_metadata() and
> should be big-endian.

Done.

> 
> >  #ifdef MLX5_PMD_SOFT_COUNTERS
> >  	txq->stats.opackets += pkts_n;
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> > index 54b3783..7c8535c 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> > @@ -236,6 +236,7 @@
> >  			      0,  1,  2,  3  /* bswap32 */);
> >  	__m128i *t_wqe, *dseg;
> >  	__m128i ctrl;
> > +	uint32_t md; /* metadata */
> >
> >  	/* Make sure all packets can fit into a single WQE. */
> >  	assert(elts_n > pkts_n);
> > @@ -292,9 +293,10 @@
> >  	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
> >  	_mm_store_si128(t_wqe, ctrl);
> >  	/* Fill ESEG in the header. */
> > +	md = pkts[0]->hash.fdir.hi;
> >  	_mm_store_si128(t_wqe + 1,
> >  			_mm_set_epi8(0, 0, 0, 0,
> > -				     0, 0, 0, 0,
> > +				     md, md >> 8, md >> 16, md >> 24,
> >  				     0, 0, 0, cs_flags,
> >  				     0, 0, 0, 0));
> 
> If you take my proposal above, this should be:
> 	_mm_store_si128(t_wqe + 1, _mm_set_epi32(0, metadata, cs_flags, 0));

Done.

> 
> >  #ifdef MLX5_PMD_SOFT_COUNTERS
> > diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> > index f9bc473..7263fb1 100644
> > --- a/drivers/net/mlx5/mlx5_txq.c
> > +++ b/drivers/net/mlx5/mlx5_txq.c
> > @@ -128,6 +128,12 @@
> >  			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
> >  				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
> >  	}
> > +
> > +#ifdef HAVE_IBV_FLOW_DV_SUPPORT
> > +	if (config->dv_flow_en)
> > +		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
> > +#endif
> > +
> >  	return offloads;
> >  }
> >
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [PATCH v2] net/mlx5: support metadata as flow rule criteria
  2018-10-03  5:22     ` Dekel Peled
@ 2018-10-03  7:22       ` Yongseok Koh
  0 siblings, 0 replies; 17+ messages in thread
From: Yongseok Koh @ 2018-10-03  7:22 UTC (permalink / raw)
  To: Dekel Peled; +Cc: dev, Shahaf Shuler, Ori Kam

> On Oct 2, 2018, at 10:22 PM, Dekel Peled <dekelp@mellanox.com> wrote:
> 
>>> @@ -137,8 +152,10 @@
>>> 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
>>> 		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
>>> 			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
>>> -		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
>>> -			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
>>> +		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
>>> +				DEV_TX_OFFLOAD_MATCH_METADATA))
>> 
>> How about writing a separate func - txq_count_contig_same_metadata()?
>> And it would be better to rename txq_calc_offload() to
>> txq_count_contig_same_csflag().
>> Then, there could be less redundant if-clause in the funcs.
> 
It was considered during implementation, but it was decided to handle all
related logic in the same function.

But it doesn't look efficient. Note that this is the performance-critical datapath.

	if (A) {
		for (n)
			do_a();
	}
	if (B) {
		for (n)
			do_b();
	}

vs.

	if (A or B) {
		for (n) {
			if (A)
				do_a();
			if (B)
				do_b();
		}
	}

In the worst case, conditions A and B will be checked n times each, while each
is checked only once in the first case.

Thanks,
Yongseok

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [PATCH v3] net/mlx5: support metadata as flow rule criteria
  2018-09-27 14:18 ` [dpdk-dev] [PATCH v2] " Dekel Peled
  2018-09-29  9:09   ` Yongseok Koh
@ 2018-10-11 11:19   ` Dekel Peled
  2018-10-17 11:53     ` [dpdk-dev] [PATCH v4] " Dekel Peled
  1 sibling, 1 reply; 17+ messages in thread
From: Dekel Peled @ 2018-10-11 11:19 UTC (permalink / raw)
  To: yskoh, shahafs; +Cc: dev, orika

As described in the series starting at [1], the ethdev layer adds the option
to set a metadata value as a match pattern when creating a new flow rule.

This patch adds metadata support in mlx5 driver, in two parts:
- Add the validation and setting of metadata value in matcher,
  when creating a new flow rule.
- Add the passing of metadata value from mbuf to wqe when
  indicated by ol_flag, in different burst functions.

[1] "ethdev: support metadata as flow rule criteria"
	http://mails.dpdk.org/archives/dev/2018-October/115469.html

---
v3:
- Update meta item validation.
v2:
Split the support of egress rules to a different patch.
---

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c          |   2 +-
 drivers/net/mlx5/mlx5_flow.h          |   8 +++
 drivers/net/mlx5/mlx5_flow_dv.c       | 109 ++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_prm.h           |   2 +-
 drivers/net/mlx5/mlx5_rxtx.c          |  33 ++++++++--
 drivers/net/mlx5/mlx5_rxtx_vec.c      |  38 +++++++++---
 drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |   9 ++-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  10 ++--
 drivers/net/mlx5/mlx5_txq.c           |   6 ++
 10 files changed, 192 insertions(+), 26 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index ed60c40..6e4a651 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -418,7 +418,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static int
+int
 mlx5_flow_item_acceptable(const struct rte_flow_item *item,
 			  const uint8_t *mask,
 			  const uint8_t *nic_mask,
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index d425311..bfc077d 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -43,6 +43,9 @@
 #define MLX5_FLOW_LAYER_GRE (1u << 14)
 #define MLX5_FLOW_LAYER_MPLS (1u << 15)
 
+/* General pattern items bits. */
+#define MLX5_FLOW_ITEM_METADATA (1u << 16)
+
 /* Outer Masks. */
 #define MLX5_FLOW_LAYER_OUTER_L3 \
 	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
@@ -296,6 +299,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
 int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
 				  const struct rte_flow_attr *attributes,
 				  struct rte_flow_error *error);
+int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
+			      const uint8_t *mask,
+			      const uint8_t *nic_mask,
+			      unsigned int size,
+			      struct rte_flow_error *error);
 int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
 				uint64_t item_flags,
 				struct rte_flow_error *error);
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index b863476..4a6907b 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -36,6 +36,69 @@
 #ifdef HAVE_IBV_FLOW_DV_SUPPORT
 
 /**
+ * Validate META item.
+ *
+ * @param[in] dev
+ *   Pointer to the rte_eth_dev structure.
+ * @param[in] item
+ *   Item specification.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_item_meta(struct rte_eth_dev *dev,
+			   const struct rte_flow_item *item,
+			   const struct rte_flow_attr *attr,
+			   struct rte_flow_error *error)
+{
+	const struct rte_flow_item_meta *spec = item->spec;
+	const struct rte_flow_item_meta *mask = item->mask;
+
+	const struct rte_flow_item_meta nic_mask = {
+		.data = RTE_BE32(UINT32_MAX)
+	};
+
+	int ret;
+	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
+
+	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
+		return rte_flow_error_set(error, EPERM,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  NULL,
+					  "match on metadata offload "
+					  "configuration is off for this port");
+	if (!spec)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  item->spec,
+					  "data cannot be empty");
+	if (!spec->data)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  NULL,
+					  "data cannot be zero");
+	if (!mask)
+		mask = &rte_flow_item_meta_mask;
+	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
+					(const uint8_t *)&nic_mask,
+					sizeof(struct rte_flow_item_meta),
+					error);
+	if (ret < 0)
+		return ret;
+	if (attr->ingress)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
+					  NULL,
+					  "pattern not supported for ingress");
+	return 0;
+}
+
+/**
  * Verify the @p attributes will be correctly understood by the NIC and store
  * them in the @p flow if everything is correct.
  *
@@ -216,6 +279,13 @@
 				return ret;
 			item_flags |= MLX5_FLOW_LAYER_MPLS;
 			break;
+		case RTE_FLOW_ITEM_TYPE_META:
+			ret = flow_dv_validate_item_meta(dev, items, attr,
+							 error);
+			if (ret < 0)
+				return ret;
+			item_flags |= MLX5_FLOW_ITEM_METADATA;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ITEM,
@@ -857,6 +927,42 @@
 }
 
 /**
+ * Add META item to matcher
+ *
+ * @param[in, out] matcher
+ *   Flow matcher.
+ * @param[in, out] key
+ *   Flow matcher value.
+ * @param[in] item
+ *   Flow pattern to translate.
+ * @param[in] inner
+ *   Item is inner pattern.
+ */
+static void
+flow_dv_translate_item_meta(void *matcher, void *key,
+				const struct rte_flow_item *item)
+{
+	const struct rte_flow_item_meta *meta_m;
+	const struct rte_flow_item_meta *meta_v;
+
+	void *misc2_m =
+		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
+	void *misc2_v =
+		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
+
+	meta_m = (const void *)item->mask;
+	if (!meta_m)
+		meta_m = &rte_flow_item_meta_mask;
+	meta_v = (const void *)item->spec;
+	if (meta_v) {
+		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
+			RTE_BE32(meta_m->data));
+		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
+			RTE_BE32(meta_v->data));
+	}
+}
+
+/**
  * Update the matcher and the value based the selected item.
  *
  * @param[in, out] matcher
@@ -942,6 +1048,9 @@
 		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
 					     inner);
 		break;
+	case RTE_FLOW_ITEM_TYPE_META:
+		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
index 4e2f9f4..a905397 100644
--- a/drivers/net/mlx5/mlx5_prm.h
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
 	uint8_t	cs_flags;
 	uint8_t	rsvd1;
 	uint16_t mss;
-	uint32_t rsvd2;
+	uint32_t flow_table_metadata;
 	uint16_t inline_hdr_sz;
 	uint8_t inline_hdr[2];
 } __rte_aligned(MLX5_WQE_DWORD_SIZE);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 558e6b6..5b4d2fd 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -523,6 +523,7 @@
 		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
 		uint32_t swp_offsets = 0;
 		uint8_t swp_types = 0;
+		uint32_t metadata;
 		uint16_t tso_segsz = 0;
 #ifdef MLX5_PMD_SOFT_COUNTERS
 		uint32_t total_length = 0;
@@ -566,6 +567,10 @@
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+						buf->tx_metadata : 0;
+
 		/* Replace the Ethernet type by the VLAN if necessary. */
 		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
@@ -781,7 +786,7 @@
 				swp_offsets,
 				cs_flags | (swp_types << 8) |
 				(rte_cpu_to_be_16(tso_segsz) << 16),
-				0,
+				rte_cpu_to_be_32(metadata),
 				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
 			};
 		} else {
@@ -795,7 +800,7 @@
 			wqe->eseg = (rte_v128u32_t){
 				swp_offsets,
 				cs_flags | (swp_types << 8),
-				0,
+				rte_cpu_to_be_32(metadata),
 				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
 			};
 		}
@@ -861,7 +866,7 @@
 	mpw->wqe->eseg.inline_hdr_sz = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
 					     (txq->wqe_ci << 8) |
 					     MLX5_OPCODE_TSO);
@@ -948,6 +953,7 @@
 		uint32_t length;
 		unsigned int segs_n = buf->nb_segs;
 		uint32_t cs_flags;
+		uint32_t metadata;
 
 		/*
 		 * Make sure there is enough room to store this packet and
@@ -964,6 +970,9 @@
 		max_elts -= segs_n;
 		--pkts_n;
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+						buf->tx_metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		assert(length);
@@ -971,6 +980,7 @@
 		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
 		    ((mpw.len != length) ||
 		     (segs_n != 1) ||
+		     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 		     (mpw.wqe->eseg.cs_flags != cs_flags)))
 			mlx5_mpw_close(txq, &mpw);
 		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
@@ -984,6 +994,7 @@
 			max_wqe -= 2;
 			mlx5_mpw_new(txq, &mpw, length);
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata = metadata;
 		}
 		/* Multi-segment packets must be alone in their MPW. */
 		assert((segs_n == 1) || (mpw.pkts_n == 0));
@@ -1082,7 +1093,7 @@
 	mpw->wqe->eseg.cs_flags = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	inl = (struct mlx5_wqe_inl_small *)
 		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
 	mpw->data.raw = (uint8_t *)&inl->raw;
@@ -1172,6 +1183,7 @@
 		uint32_t length;
 		unsigned int segs_n = buf->nb_segs;
 		uint8_t cs_flags;
+		uint32_t metadata;
 
 		/*
 		 * Make sure there is enough room to store this packet and
@@ -1193,18 +1205,23 @@
 		 */
 		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+						buf->tx_metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if packet differs. */
 		if (mpw.state == MLX5_MPW_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
+			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				mlx5_mpw_close(txq, &mpw);
 		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
 			    (length > inline_room) ||
+			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
 				mlx5_mpw_inline_close(txq, &mpw);
 				inline_room =
@@ -1224,12 +1241,14 @@
 				max_wqe -= 2;
 				mlx5_mpw_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				mpw.wqe->eseg.flow_table_metadata = metadata;
 			} else {
 				if (unlikely(max_wqe < wqe_inl_n))
 					break;
 				max_wqe -= wqe_inl_n;
 				mlx5_mpw_inline_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				mpw.wqe->eseg.flow_table_metadata = metadata;
 			}
 		}
 		/* Multi-segment packets must be alone in their MPW. */
@@ -1461,6 +1480,7 @@
 		unsigned int do_inline = 0; /* Whether inline is possible. */
 		uint32_t length;
 		uint8_t cs_flags;
+		uint32_t metadata;
 
 		/* Multi-segmented packet is handled in slow-path outside. */
 		assert(NB_SEGS(buf) == 1);
@@ -1468,6 +1488,9 @@
 		if (max_elts - j == 0)
 			break;
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+						buf->tx_metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if:
@@ -1482,6 +1505,7 @@
 			    (length <= txq->inline_max_packet_sz &&
 			     inl_pad + sizeof(inl_hdr) + length >
 			     mpw_room) ||
+			     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				max_wqe -= mlx5_empw_close(txq, &mpw);
 		}
@@ -1505,6 +1529,7 @@
 				    sizeof(inl_hdr) + length <= mpw_room &&
 				    !txq->mpw_hdr_dseg;
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata = metadata;
 		} else {
 			/* Evaluate whether the next packet can be inlined.
 		 * Inlining is possible when:
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 0a4aed8..d9ee356 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -41,6 +41,8 @@
 
 /**
  * Count the number of packets having same ol_flags and calculate cs_flags.
+ * If PKT_TX_METADATA is set in ol_flags, packets must have same metadata
+ * as well.
  *
  * @param pkts
  *   Pointer to array of packets.
@@ -48,26 +50,41 @@
  *   Number of packets.
  * @param cs_flags
  *   Pointer of flags to be returned.
+ * @param metadata
+ *   Pointer of metadata to be returned.
+ * @param txq_offloads
+ *   Offloads enabled on Tx queue
  *
  * @return
- *   Number of packets having same ol_flags.
+ *   Number of packets having same ol_flags and metadata, if relevant.
  */
 static inline unsigned int
-txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
+txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
+		 uint32_t *metadata, const uint64_t txq_offloads)
 {
 	unsigned int pos;
 	const uint64_t ol_mask =
 		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
 		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
-		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
+		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM | PKT_TX_METADATA;
 
 	if (!pkts_n)
 		return 0;
 	/* Count the number of packets having same ol_flags. */
-	for (pos = 1; pos < pkts_n; ++pos)
-		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
+	for (pos = 1; pos < pkts_n; ++pos) {
+		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
+			((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask))
 			break;
+		/* If the metadata ol_flag is set,
+		 *  metadata must be same in all packets.
+		 */
+		if ((txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) &&
+			(pkts[pos]->ol_flags & PKT_TX_METADATA) &&
+			pkts[0]->tx_metadata != pkts[pos]->tx_metadata)
+			break;
+	}
 	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
+	*metadata = rte_cpu_to_be_32(pkts[0]->tx_metadata);
 	return pos;
 }
 
@@ -96,7 +113,7 @@
 		uint16_t ret;
 
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
-		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0);
+		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0, 0);
 		nb_tx += ret;
 		if (!ret)
 			break;
@@ -127,6 +144,7 @@
 		uint8_t cs_flags = 0;
 		uint16_t n;
 		uint16_t ret;
+		uint32_t metadata;
 
 		/* Transmit multi-seg packets in the head of pkts list. */
 		if ((txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS) &&
@@ -137,9 +155,11 @@
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
 		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
 			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
-		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
-			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
-		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
+		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
+				DEV_TX_OFFLOAD_MATCH_METADATA))
+			n = txq_calc_offload(&pkts[nb_tx], n,
+					&cs_flags, &metadata, txq->offloads);
+		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags, metadata);
 		nb_tx += ret;
 		if (!ret)
 			break;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fb884f9..fda7004 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -22,6 +22,7 @@
 /* HW offload capabilities of vectorized Tx. */
 #define MLX5_VEC_TX_OFFLOAD_CAP \
 	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
+	 DEV_TX_OFFLOAD_MATCH_METADATA | \
 	 DEV_TX_OFFLOAD_MULTI_SEGS)
 
 /*
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index b37b738..a8a4d7b 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -201,13 +201,15 @@
  *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
  * @param cs_flags
  *   Checksum offload flags to be written in the descriptor.
+ * @param metadata
+ *   Metadata value to be written in the descriptor.
  *
  * @return
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static inline uint16_t
 txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
-	    uint8_t cs_flags)
+	    uint8_t cs_flags, uint32_t metadata)
 {
 	struct rte_mbuf **elts;
 	uint16_t elts_head = txq->elts_head;
@@ -294,10 +296,7 @@
 	vst1q_u8((void *)t_wqe, ctrl);
 	/* Fill ESEG in the header. */
 	vst1q_u8((void *)(t_wqe + 1),
-		 ((uint8x16_t) { 0, 0, 0, 0,
-				 cs_flags, 0, 0, 0,
-				 0, 0, 0, 0,
-				 0, 0, 0, 0 }));
+		 ((uint32x4_t) { 0, cs_flags, metadata, 0 }));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
 #endif
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 54b3783..31aae4a 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -202,13 +202,15 @@
  *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
  * @param cs_flags
  *   Checksum offload flags to be written in the descriptor.
+ * @param metadata
+ *   Metadata value to be written in the descriptor.
  *
  * @return
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static inline uint16_t
 txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
-	    uint8_t cs_flags)
+	    uint8_t cs_flags, uint32_t metadata)
 {
 	struct rte_mbuf **elts;
 	uint16_t elts_head = txq->elts_head;
@@ -292,11 +294,7 @@
 	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
 	_mm_store_si128(t_wqe, ctrl);
 	/* Fill ESEG in the header. */
-	_mm_store_si128(t_wqe + 1,
-			_mm_set_epi8(0, 0, 0, 0,
-				     0, 0, 0, 0,
-				     0, 0, 0, cs_flags,
-				     0, 0, 0, 0));
+	_mm_store_si128(t_wqe + 1, _mm_set_epi32(0, metadata, cs_flags, 0));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
 #endif
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index f9bc473..7263fb1 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -128,6 +128,12 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
+
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	if (config->dv_flow_en)
+		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
+#endif
+
 	return offloads;
 }
 
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [PATCH v4] net/mlx5: support metadata as flow rule criteria
  2018-10-11 11:19   ` [dpdk-dev] [PATCH v3] " Dekel Peled
@ 2018-10-17 11:53     ` Dekel Peled
  2018-10-18  8:00       ` Yongseok Koh
  2018-10-21 14:04       ` [dpdk-dev] [PATCH v5] " Dekel Peled
  0 siblings, 2 replies; 17+ messages in thread
From: Dekel Peled @ 2018-10-17 11:53 UTC (permalink / raw)
  To: yskoh, shahafs; +Cc: dev, orika

As described in the series starting at [1], this patch adds the option
to set a metadata value as a match pattern when creating a new flow rule.

This patch adds metadata support in the mlx5 driver, in two parts:
- Add the validation and setting of the metadata value in the matcher,
  when creating a new flow rule.
- Add the passing of the metadata value from the mbuf to the WQE when
  indicated by ol_flags, in the different burst functions.

[1] "ethdev: support metadata as flow rule criteria"
    http://mails.dpdk.org/archives/dev/2018-October/115469.html

---
v4:
- Rebase.
- Apply code review comments.
v3:
- Update meta item validation.
v2:
- Split the support of egress rules to a different patch.
---
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c          |   2 +-
 drivers/net/mlx5/mlx5_flow.h          |   8 +++
 drivers/net/mlx5/mlx5_flow_dv.c       | 109 ++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_prm.h           |   2 +-
 drivers/net/mlx5/mlx5_rxtx.c          |  33 ++++++++--
 drivers/net/mlx5/mlx5_rxtx_vec.c      |  38 +++++++++---
 drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |   9 ++-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  10 ++--
 drivers/net/mlx5/mlx5_txq.c           |   6 ++
 10 files changed, 192 insertions(+), 26 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index bd70fce..15262f6 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -418,7 +418,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static int
+int
 mlx5_flow_item_acceptable(const struct rte_flow_item *item,
 			  const uint8_t *mask,
 			  const uint8_t *nic_mask,
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 094f666..834a6ed 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -43,6 +43,9 @@
 #define MLX5_FLOW_LAYER_GRE (1u << 14)
 #define MLX5_FLOW_LAYER_MPLS (1u << 15)
 
+/* General pattern items bits. */
+#define MLX5_FLOW_ITEM_METADATA (1u << 16)
+
 /* Outer Masks. */
 #define MLX5_FLOW_LAYER_OUTER_L3 \
 	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
@@ -307,6 +310,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
 int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
 				  const struct rte_flow_attr *attributes,
 				  struct rte_flow_error *error);
+int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
+			      const uint8_t *mask,
+			      const uint8_t *nic_mask,
+			      unsigned int size,
+			      struct rte_flow_error *error);
 int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
 				uint64_t item_flags,
 				struct rte_flow_error *error);
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index a013201..bfddfab 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -36,6 +36,69 @@
 #ifdef HAVE_IBV_FLOW_DV_SUPPORT
 
 /**
+ * Validate META item.
+ *
+ * @param[in] dev
+ *   Pointer to the rte_eth_dev structure.
+ * @param[in] item
+ *   Item specification.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_item_meta(struct rte_eth_dev *dev,
+			   const struct rte_flow_item *item,
+			   const struct rte_flow_attr *attr,
+			   struct rte_flow_error *error)
+{
+	const struct rte_flow_item_meta *spec = item->spec;
+	const struct rte_flow_item_meta *mask = item->mask;
+
+	const struct rte_flow_item_meta nic_mask = {
+		.data = RTE_BE32(UINT32_MAX)
+	};
+
+	int ret;
+	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
+
+	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
+		return rte_flow_error_set(error, EPERM,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  NULL,
+					  "match on metadata offload "
+					  "configuration is off for this port");
+	if (!spec)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  item->spec,
+					  "data cannot be empty");
+	if (!spec->data)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  NULL,
+					  "data cannot be zero");
+	if (!mask)
+		mask = &rte_flow_item_meta_mask;
+	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
+					(const uint8_t *)&nic_mask,
+					sizeof(struct rte_flow_item_meta),
+					error);
+	if (ret < 0)
+		return ret;
+	if (attr->ingress)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
+					  NULL,
+					  "pattern not supported for ingress");
+	return 0;
+}
+
+/**
  * Verify the @p attributes will be correctly understood by the NIC and store
  * them in the @p flow if everything is correct.
  *
@@ -214,6 +277,13 @@
 				return ret;
 			item_flags |= MLX5_FLOW_LAYER_MPLS;
 			break;
+		case RTE_FLOW_ITEM_TYPE_META:
+			ret = flow_dv_validate_item_meta(dev, items, attr,
+							 error);
+			if (ret < 0)
+				return ret;
+			item_flags |= MLX5_FLOW_ITEM_METADATA;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ITEM,
@@ -855,6 +925,42 @@
 }
 
 /**
+ * Add META item to matcher
+ *
+ * @param[in, out] matcher
+ *   Flow matcher.
+ * @param[in, out] key
+ *   Flow matcher value.
+ * @param[in] item
+ *   Flow pattern to translate.
+ * @param[in] inner
+ *   Item is inner pattern.
+ */
+static void
+flow_dv_translate_item_meta(void *matcher, void *key,
+				const struct rte_flow_item *item)
+{
+	const struct rte_flow_item_meta *meta_m;
+	const struct rte_flow_item_meta *meta_v;
+
+	void *misc2_m =
+		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
+	void *misc2_v =
+		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
+
+	meta_m = (const void *)item->mask;
+	if (!meta_m)
+		meta_m = &rte_flow_item_meta_mask;
+	meta_v = (const void *)item->spec;
+	if (meta_v) {
+		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
+			RTE_BE32(meta_m->data));
+		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
+			RTE_BE32(meta_v->data));
+	}
+}
+
+/**
  * Update the matcher and the value based the selected item.
  *
  * @param[in, out] matcher
@@ -940,6 +1046,9 @@
 		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
 					     inner);
 		break;
+	case RTE_FLOW_ITEM_TYPE_META:
+		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
index 69296a0..29742b1 100644
--- a/drivers/net/mlx5/mlx5_prm.h
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
 	uint8_t	cs_flags;
 	uint8_t	rsvd1;
 	uint16_t mss;
-	uint32_t rsvd2;
+	uint32_t flow_table_metadata;
 	uint16_t inline_hdr_sz;
 	uint8_t inline_hdr[2];
 } __rte_aligned(MLX5_WQE_DWORD_SIZE);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 558e6b6..5b4d2fd 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -523,6 +523,7 @@
 		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
 		uint32_t swp_offsets = 0;
 		uint8_t swp_types = 0;
+		uint32_t metadata;
 		uint16_t tso_segsz = 0;
 #ifdef MLX5_PMD_SOFT_COUNTERS
 		uint32_t total_length = 0;
@@ -566,6 +567,10 @@
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+						buf->tx_metadata : 0;
+
 		/* Replace the Ethernet type by the VLAN if necessary. */
 		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
@@ -781,7 +786,7 @@
 				swp_offsets,
 				cs_flags | (swp_types << 8) |
 				(rte_cpu_to_be_16(tso_segsz) << 16),
-				0,
+				rte_cpu_to_be_32(metadata),
 				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
 			};
 		} else {
@@ -795,7 +800,7 @@
 			wqe->eseg = (rte_v128u32_t){
 				swp_offsets,
 				cs_flags | (swp_types << 8),
-				0,
+				rte_cpu_to_be_32(metadata),
 				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
 			};
 		}
@@ -861,7 +866,7 @@
 	mpw->wqe->eseg.inline_hdr_sz = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
 					     (txq->wqe_ci << 8) |
 					     MLX5_OPCODE_TSO);
@@ -948,6 +953,7 @@
 		uint32_t length;
 		unsigned int segs_n = buf->nb_segs;
 		uint32_t cs_flags;
+		uint32_t metadata;
 
 		/*
 		 * Make sure there is enough room to store this packet and
@@ -964,6 +970,9 @@
 		max_elts -= segs_n;
 		--pkts_n;
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+						buf->tx_metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		assert(length);
@@ -971,6 +980,7 @@
 		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
 		    ((mpw.len != length) ||
 		     (segs_n != 1) ||
+		     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 		     (mpw.wqe->eseg.cs_flags != cs_flags)))
 			mlx5_mpw_close(txq, &mpw);
 		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
@@ -984,6 +994,7 @@
 			max_wqe -= 2;
 			mlx5_mpw_new(txq, &mpw, length);
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata = metadata;
 		}
 		/* Multi-segment packets must be alone in their MPW. */
 		assert((segs_n == 1) || (mpw.pkts_n == 0));
@@ -1082,7 +1093,7 @@
 	mpw->wqe->eseg.cs_flags = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	inl = (struct mlx5_wqe_inl_small *)
 		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
 	mpw->data.raw = (uint8_t *)&inl->raw;
@@ -1172,6 +1183,7 @@
 		uint32_t length;
 		unsigned int segs_n = buf->nb_segs;
 		uint8_t cs_flags;
+		uint32_t metadata;
 
 		/*
 		 * Make sure there is enough room to store this packet and
@@ -1193,18 +1205,23 @@
 		 */
 		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+						buf->tx_metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if packet differs. */
 		if (mpw.state == MLX5_MPW_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
+			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				mlx5_mpw_close(txq, &mpw);
 		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
 			    (length > inline_room) ||
+			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
 				mlx5_mpw_inline_close(txq, &mpw);
 				inline_room =
@@ -1224,12 +1241,14 @@
 				max_wqe -= 2;
 				mlx5_mpw_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				mpw.wqe->eseg.flow_table_metadata = metadata;
 			} else {
 				if (unlikely(max_wqe < wqe_inl_n))
 					break;
 				max_wqe -= wqe_inl_n;
 				mlx5_mpw_inline_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				mpw.wqe->eseg.flow_table_metadata = metadata;
 			}
 		}
 		/* Multi-segment packets must be alone in their MPW. */
@@ -1461,6 +1480,7 @@
 		unsigned int do_inline = 0; /* Whether inline is possible. */
 		uint32_t length;
 		uint8_t cs_flags;
+		uint32_t metadata;
 
 		/* Multi-segmented packet is handled in slow-path outside. */
 		assert(NB_SEGS(buf) == 1);
@@ -1468,6 +1488,9 @@
 		if (max_elts - j == 0)
 			break;
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+						buf->tx_metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if:
@@ -1482,6 +1505,7 @@
 			    (length <= txq->inline_max_packet_sz &&
 			     inl_pad + sizeof(inl_hdr) + length >
 			     mpw_room) ||
+			     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				max_wqe -= mlx5_empw_close(txq, &mpw);
 		}
@@ -1505,6 +1529,7 @@
 				    sizeof(inl_hdr) + length <= mpw_room &&
 				    !txq->mpw_hdr_dseg;
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata = metadata;
 		} else {
 			/* Evaluate whether the next packet can be inlined.
 			 * Inlining is possible when:
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 0a4aed8..16a8608 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -41,6 +41,8 @@
 
 /**
  * Count the number of packets having same ol_flags and calculate cs_flags.
+ * If PKT_TX_METADATA is set in ol_flags, packets must have same metadata
+ * as well.
  *
  * @param pkts
  *   Pointer to array of packets.
@@ -48,26 +50,41 @@
  *   Number of packets.
  * @param cs_flags
  *   Pointer of flags to be returned.
+ * @param metadata
+ *   Pointer of metadata to be returned.
+ * @param txq_offloads
+ *   Offloads enabled on Tx queue
  *
  * @return
- *   Number of packets having same ol_flags.
+ *   Number of packets having same ol_flags and metadata, if relevant.
  */
 static inline unsigned int
-txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
+txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
+		 uint32_t *metadata, const uint64_t txq_offloads)
 {
 	unsigned int pos;
 	const uint64_t ol_mask =
 		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
 		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
-		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
+		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM | PKT_TX_METADATA;
 
 	if (!pkts_n)
 		return 0;
 	/* Count the number of packets having same ol_flags. */
-	for (pos = 1; pos < pkts_n; ++pos)
-		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
+	for (pos = 1; pos < pkts_n; ++pos) {
+		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
+			((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask))
 			break;
+		/* If the metadata ol_flag is set,
+		 *  metadata must be same in all packets.
+		 */
+		if ((txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) &&
+			(pkts[pos]->ol_flags & PKT_TX_METADATA) &&
+			pkts[0]->tx_metadata != pkts[pos]->tx_metadata)
+			break;
+	}
 	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
+	*metadata = rte_cpu_to_be_32(pkts[0]->tx_metadata);
 	return pos;
 }
 
@@ -96,7 +113,7 @@
 		uint16_t ret;
 
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
-		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0);
+		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0, 0);
 		nb_tx += ret;
 		if (!ret)
 			break;
@@ -127,6 +144,7 @@
 		uint8_t cs_flags = 0;
 		uint16_t n;
 		uint16_t ret;
+		uint32_t metadata = 0;
 
 		/* Transmit multi-seg packets in the head of pkts list. */
 		if ((txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS) &&
@@ -137,9 +155,11 @@
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
 		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
 			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
-		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
-			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
-		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
+		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
+				DEV_TX_OFFLOAD_MATCH_METADATA))
+			n = txq_calc_offload(&pkts[nb_tx], n,
+					&cs_flags, &metadata, txq->offloads);
+		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags, metadata);
 		nb_tx += ret;
 		if (!ret)
 			break;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fb884f9..fda7004 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -22,6 +22,7 @@
 /* HW offload capabilities of vectorized Tx. */
 #define MLX5_VEC_TX_OFFLOAD_CAP \
 	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
+	 DEV_TX_OFFLOAD_MATCH_METADATA | \
 	 DEV_TX_OFFLOAD_MULTI_SEGS)
 
 /*
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index b37b738..a8a4d7b 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -201,13 +201,15 @@
  *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
  * @param cs_flags
  *   Checksum offload flags to be written in the descriptor.
+ * @param metadata
+ *   Metadata value to be written in the descriptor.
  *
  * @return
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static inline uint16_t
 txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
-	    uint8_t cs_flags)
+	    uint8_t cs_flags, uint32_t metadata)
 {
 	struct rte_mbuf **elts;
 	uint16_t elts_head = txq->elts_head;
@@ -294,10 +296,7 @@
 	vst1q_u8((void *)t_wqe, ctrl);
 	/* Fill ESEG in the header. */
 	vst1q_u8((void *)(t_wqe + 1),
-		 ((uint8x16_t) { 0, 0, 0, 0,
-				 cs_flags, 0, 0, 0,
-				 0, 0, 0, 0,
-				 0, 0, 0, 0 }));
+		 ((uint32x4_t) { 0, cs_flags, metadata, 0 }));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
 #endif
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 54b3783..31aae4a 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -202,13 +202,15 @@
  *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
  * @param cs_flags
  *   Checksum offload flags to be written in the descriptor.
+ * @param metadata
+ *   Metadata value to be written in the descriptor.
  *
  * @return
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static inline uint16_t
 txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
-	    uint8_t cs_flags)
+	    uint8_t cs_flags, uint32_t metadata)
 {
 	struct rte_mbuf **elts;
 	uint16_t elts_head = txq->elts_head;
@@ -292,11 +294,7 @@
 	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
 	_mm_store_si128(t_wqe, ctrl);
 	/* Fill ESEG in the header. */
-	_mm_store_si128(t_wqe + 1,
-			_mm_set_epi8(0, 0, 0, 0,
-				     0, 0, 0, 0,
-				     0, 0, 0, cs_flags,
-				     0, 0, 0, 0));
+	_mm_store_si128(t_wqe + 1, _mm_set_epi32(0, metadata, cs_flags, 0));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
 #endif
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index f9bc473..7263fb1 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -128,6 +128,12 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
+
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	if (config->dv_flow_en)
+		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
+#endif
+
 	return offloads;
 }
 
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [PATCH v4] net/mlx5: support metadata as flow rule criteria
  2018-10-17 11:53     ` [dpdk-dev] [PATCH v4] " Dekel Peled
@ 2018-10-18  8:00       ` Yongseok Koh
  2018-10-21 13:44         ` Dekel Peled
  2018-10-21 14:04       ` [dpdk-dev] [PATCH v5] " Dekel Peled
  1 sibling, 1 reply; 17+ messages in thread
From: Yongseok Koh @ 2018-10-18  8:00 UTC (permalink / raw)
  To: Dekel Peled; +Cc: Shahaf Shuler, dev, Ori Kam

On Wed, Oct 17, 2018 at 02:53:37PM +0300, Dekel Peled wrote:
> As described in series starting at [1], it adds option to set
> metadata value as match pattern when creating a new flow rule.
> 
> This patch adds metadata support in mlx5 driver, in two parts:
> - Add the validation and setting of metadata value in matcher,
>   when creating a new flow rule.
> - Add the passing of metadata value from mbuf to wqe when
>   indicated by ol_flag, in different burst functions.
> 
> [1] "ethdev: support metadata as flow rule criteria"
>     http://mails.dpdk.org/archives/dev/2018-October/115469.html
> 
> ---
> v4:
> - Rebase.
> - Apply code review comments.
> v3:
> - Update meta item validation.
> v2:
> - Split the support of egress rules to a different patch.
> ---
> 	
> Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c          |   2 +-
>  drivers/net/mlx5/mlx5_flow.h          |   8 +++
>  drivers/net/mlx5/mlx5_flow_dv.c       | 109 ++++++++++++++++++++++++++++++++++
>  drivers/net/mlx5/mlx5_prm.h           |   2 +-
>  drivers/net/mlx5/mlx5_rxtx.c          |  33 ++++++++--
>  drivers/net/mlx5/mlx5_rxtx_vec.c      |  38 +++++++++---
>  drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |   9 ++-
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  10 ++--
>  drivers/net/mlx5/mlx5_txq.c           |   6 ++
>  10 files changed, 192 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index bd70fce..15262f6 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -418,7 +418,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
>   * @return
>   *   0 on success, a negative errno value otherwise and rte_errno is set.
>   */
> -static int
> +int
>  mlx5_flow_item_acceptable(const struct rte_flow_item *item,
>  			  const uint8_t *mask,
>  			  const uint8_t *nic_mask,
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index 094f666..834a6ed 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -43,6 +43,9 @@
>  #define MLX5_FLOW_LAYER_GRE (1u << 14)
>  #define MLX5_FLOW_LAYER_MPLS (1u << 15)
>  
> +/* General pattern items bits. */
> +#define MLX5_FLOW_ITEM_METADATA (1u << 16)
> +
>  /* Outer Masks. */
>  #define MLX5_FLOW_LAYER_OUTER_L3 \
>  	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
> @@ -307,6 +310,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
>  int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
>  				  const struct rte_flow_attr *attributes,
>  				  struct rte_flow_error *error);
> +int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
> +			      const uint8_t *mask,
> +			      const uint8_t *nic_mask,
> +			      unsigned int size,
> +			      struct rte_flow_error *error);
>  int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
>  				uint64_t item_flags,
>  				struct rte_flow_error *error);
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> index a013201..bfddfab 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -36,6 +36,69 @@
>  #ifdef HAVE_IBV_FLOW_DV_SUPPORT
>  
>  /**
> + * Validate META item.
> + *
> + * @param[in] dev
> + *   Pointer to the rte_eth_dev structure.
> + * @param[in] item
> + *   Item specification.
> + * @param[in] attr
> + *   Attributes of flow that includes this item.
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +flow_dv_validate_item_meta(struct rte_eth_dev *dev,
> +			   const struct rte_flow_item *item,
> +			   const struct rte_flow_attr *attr,
> +			   struct rte_flow_error *error)
> +{
> +	const struct rte_flow_item_meta *spec = item->spec;
> +	const struct rte_flow_item_meta *mask = item->mask;
> +

No blank line.

> +	const struct rte_flow_item_meta nic_mask = {
> +		.data = RTE_BE32(UINT32_MAX)
> +	};
> +

Ditto.

> +	int ret;
> +	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
> +
> +	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
> +		return rte_flow_error_set(error, EPERM,
> +					  RTE_FLOW_ERROR_TYPE_ITEM,
> +					  NULL,
> +					  "match on metadata offload "
> +					  "configuration is off for this port");
> +	if (!spec)
> +		return rte_flow_error_set(error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> +					  item->spec,
> +					  "data cannot be empty");
> +	if (!spec->data)
> +		return rte_flow_error_set(error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> +					  NULL,
> +					  "data cannot be zero");
> +	if (!mask)
> +		mask = &rte_flow_item_meta_mask;
> +	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
> +					(const uint8_t *)&nic_mask,
> +					sizeof(struct rte_flow_item_meta),
> +					error);
> +	if (ret < 0)
> +		return ret;
> +	if (attr->ingress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> +					  NULL,
> +					  "pattern not supported for ingress");
> +	return 0;
> +}
> +
> +/**
>   * Verify the @p attributes will be correctly understood by the NIC and store
>   * them in the @p flow if everything is correct.
>   *
> @@ -214,6 +277,13 @@
>  				return ret;
>  			item_flags |= MLX5_FLOW_LAYER_MPLS;
>  			break;
> +		case RTE_FLOW_ITEM_TYPE_META:
> +			ret = flow_dv_validate_item_meta(dev, items, attr,
> +							 error);
> +			if (ret < 0)
> +				return ret;
> +			item_flags |= MLX5_FLOW_ITEM_METADATA;
> +			break;
>  		default:
>  			return rte_flow_error_set(error, ENOTSUP,
>  						  RTE_FLOW_ERROR_TYPE_ITEM,
> @@ -855,6 +925,42 @@
>  }
>  
>  /**
> + * Add META item to matcher
> + *
> + * @param[in, out] matcher
> + *   Flow matcher.
> + * @param[in, out] key
> + *   Flow matcher value.
> + * @param[in] item
> + *   Flow pattern to translate.
> + * @param[in] inner
> + *   Item is inner pattern.
> + */
> +static void
> +flow_dv_translate_item_meta(void *matcher, void *key,
> +				const struct rte_flow_item *item)
> +{
> +	const struct rte_flow_item_meta *meta_m;
> +	const struct rte_flow_item_meta *meta_v;
> +
> +	void *misc2_m =
> +		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
> +	void *misc2_v =
> +		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
> +
> +	meta_m = (const void *)item->mask;
> +	if (!meta_m)
> +		meta_m = &rte_flow_item_meta_mask;
> +	meta_v = (const void *)item->spec;
> +	if (meta_v) {
> +		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
> +			RTE_BE32(meta_m->data));

Nope. RTE_BE32() is for built-in constants, not for variables.
You should use rte_cpu_to_be_32() instead.

> +		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
> +			RTE_BE32(meta_v->data));

Same here.

> +	}
> +}
> +
> +/**
>   * Update the matcher and the value based the selected item.
>   *
>   * @param[in, out] matcher
> @@ -940,6 +1046,9 @@
>  		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
>  					     inner);
>  		break;
> +	case RTE_FLOW_ITEM_TYPE_META:
> +		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
> +		break;
>  	default:
>  		break;
>  	}
> diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
> index 69296a0..29742b1 100644
> --- a/drivers/net/mlx5/mlx5_prm.h
> +++ b/drivers/net/mlx5/mlx5_prm.h
> @@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
>  	uint8_t	cs_flags;
>  	uint8_t	rsvd1;
>  	uint16_t mss;
> -	uint32_t rsvd2;
> +	uint32_t flow_table_metadata;
>  	uint16_t inline_hdr_sz;
>  	uint8_t inline_hdr[2];
>  } __rte_aligned(MLX5_WQE_DWORD_SIZE);
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 558e6b6..5b4d2fd 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -523,6 +523,7 @@
>  		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
>  		uint32_t swp_offsets = 0;
>  		uint8_t swp_types = 0;
> +		uint32_t metadata;
>  		uint16_t tso_segsz = 0;
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  		uint32_t total_length = 0;
> @@ -566,6 +567,10 @@
>  		cs_flags = txq_ol_cksum_to_cs(buf);
>  		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
>  		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
> +		/* Copy metadata from mbuf if valid */
> +		metadata = buf->ol_flags & PKT_TX_METADATA ?
> +						buf->tx_metadata : 0;

Indentation.

> +

No blank line.

>  		/* Replace the Ethernet type by the VLAN if necessary. */
>  		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
>  			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
> @@ -781,7 +786,7 @@
>  				swp_offsets,
>  				cs_flags | (swp_types << 8) |
>  				(rte_cpu_to_be_16(tso_segsz) << 16),
> -				0,
> +				rte_cpu_to_be_32(metadata),
>  				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
>  			};
>  		} else {
> @@ -795,7 +800,7 @@
>  			wqe->eseg = (rte_v128u32_t){
>  				swp_offsets,
>  				cs_flags | (swp_types << 8),
> -				0,
> +				rte_cpu_to_be_32(metadata),
>  				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
>  			};
>  		}
> @@ -861,7 +866,7 @@
>  	mpw->wqe->eseg.inline_hdr_sz = 0;
>  	mpw->wqe->eseg.rsvd0 = 0;
>  	mpw->wqe->eseg.rsvd1 = 0;
> -	mpw->wqe->eseg.rsvd2 = 0;
> +	mpw->wqe->eseg.flow_table_metadata = 0;
>  	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
>  					     (txq->wqe_ci << 8) |
>  					     MLX5_OPCODE_TSO);
> @@ -948,6 +953,7 @@
>  		uint32_t length;
>  		unsigned int segs_n = buf->nb_segs;
>  		uint32_t cs_flags;
> +		uint32_t metadata;
>  
>  		/*
>  		 * Make sure there is enough room to store this packet and
> @@ -964,6 +970,9 @@
>  		max_elts -= segs_n;
>  		--pkts_n;
>  		cs_flags = txq_ol_cksum_to_cs(buf);
> +		/* Copy metadata from mbuf if valid */
> +		metadata = buf->ol_flags & PKT_TX_METADATA ?
> +						buf->tx_metadata : 0;

Indentation.
And doesn't this need a big-endian conversion? I think it does.

>  		/* Retrieve packet information. */
>  		length = PKT_LEN(buf);
>  		assert(length);
> @@ -971,6 +980,7 @@
>  		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
>  		    ((mpw.len != length) ||
>  		     (segs_n != 1) ||
> +		     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
>  		     (mpw.wqe->eseg.cs_flags != cs_flags)))
>  			mlx5_mpw_close(txq, &mpw);
>  		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
> @@ -984,6 +994,7 @@
>  			max_wqe -= 2;
>  			mlx5_mpw_new(txq, &mpw, length);
>  			mpw.wqe->eseg.cs_flags = cs_flags;
> +			mpw.wqe->eseg.flow_table_metadata = metadata;
>  		}
>  		/* Multi-segment packets must be alone in their MPW. */
>  		assert((segs_n == 1) || (mpw.pkts_n == 0));
> @@ -1082,7 +1093,7 @@
>  	mpw->wqe->eseg.cs_flags = 0;
>  	mpw->wqe->eseg.rsvd0 = 0;
>  	mpw->wqe->eseg.rsvd1 = 0;
> -	mpw->wqe->eseg.rsvd2 = 0;
> +	mpw->wqe->eseg.flow_table_metadata = 0;
>  	inl = (struct mlx5_wqe_inl_small *)
>  		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
>  	mpw->data.raw = (uint8_t *)&inl->raw;
> @@ -1172,6 +1183,7 @@
>  		uint32_t length;
>  		unsigned int segs_n = buf->nb_segs;
>  		uint8_t cs_flags;
> +		uint32_t metadata;
>  
>  		/*
>  		 * Make sure there is enough room to store this packet and
> @@ -1193,18 +1205,23 @@
>  		 */
>  		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
>  		cs_flags = txq_ol_cksum_to_cs(buf);
> +		/* Copy metadata from mbuf if valid */
> +		metadata = buf->ol_flags & PKT_TX_METADATA ?
> +						buf->tx_metadata : 0;

Indentation.
And doesn't this need a big-endian conversion?

>  		/* Retrieve packet information. */
>  		length = PKT_LEN(buf);
>  		/* Start new session if packet differs. */
>  		if (mpw.state == MLX5_MPW_STATE_OPENED) {
>  			if ((mpw.len != length) ||
>  			    (segs_n != 1) ||
> +			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
>  			    (mpw.wqe->eseg.cs_flags != cs_flags))
>  				mlx5_mpw_close(txq, &mpw);
>  		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
>  			if ((mpw.len != length) ||
>  			    (segs_n != 1) ||
>  			    (length > inline_room) ||
> +			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
>  			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
>  				mlx5_mpw_inline_close(txq, &mpw);
>  				inline_room =
> @@ -1224,12 +1241,14 @@
>  				max_wqe -= 2;
>  				mlx5_mpw_new(txq, &mpw, length);
>  				mpw.wqe->eseg.cs_flags = cs_flags;
> +				mpw.wqe->eseg.flow_table_metadata = metadata;
>  			} else {
>  				if (unlikely(max_wqe < wqe_inl_n))
>  					break;
>  				max_wqe -= wqe_inl_n;
>  				mlx5_mpw_inline_new(txq, &mpw, length);
>  				mpw.wqe->eseg.cs_flags = cs_flags;
> +				mpw.wqe->eseg.flow_table_metadata = metadata;
>  			}
>  		}
>  		/* Multi-segment packets must be alone in their MPW. */
> @@ -1461,6 +1480,7 @@
>  		unsigned int do_inline = 0; /* Whether inline is possible. */
>  		uint32_t length;
>  		uint8_t cs_flags;
> +		uint32_t metadata;
>  
>  		/* Multi-segmented packet is handled in slow-path outside. */
>  		assert(NB_SEGS(buf) == 1);
> @@ -1468,6 +1488,9 @@
>  		if (max_elts - j == 0)
>  			break;
>  		cs_flags = txq_ol_cksum_to_cs(buf);
> +		/* Copy metadata from mbuf if valid */
> +		metadata = buf->ol_flags & PKT_TX_METADATA ?
> +						buf->tx_metadata : 0;

Indentation.
And doesn't this need a big-endian conversion?

>  		/* Retrieve packet information. */
>  		length = PKT_LEN(buf);
>  		/* Start new session if:
> @@ -1482,6 +1505,7 @@
>  			    (length <= txq->inline_max_packet_sz &&
>  			     inl_pad + sizeof(inl_hdr) + length >
>  			     mpw_room) ||
> +			     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
>  			    (mpw.wqe->eseg.cs_flags != cs_flags))
>  				max_wqe -= mlx5_empw_close(txq, &mpw);
>  		}
> @@ -1505,6 +1529,7 @@
>  				    sizeof(inl_hdr) + length <= mpw_room &&
>  				    !txq->mpw_hdr_dseg;
>  			mpw.wqe->eseg.cs_flags = cs_flags;
> +			mpw.wqe->eseg.flow_table_metadata = metadata;
>  		} else {
>  			/* Evaluate whether the next packet can be inlined.
>  			 * Inlininig is possible when:
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
> index 0a4aed8..16a8608 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.c
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
> @@ -41,6 +41,8 @@
>  
>  /**
>   * Count the number of packets having same ol_flags and calculate cs_flags.
> + * If PKT_TX_METADATA is set in ol_flags, packets must have same metadata
> + * as well.

Packets can have different metadata; we just want to count the number of
leading packets having the same metadata. Please correct the comment.

>   *
>   * @param pkts
>   *   Pointer to array of packets.
> @@ -48,26 +50,41 @@
>   *   Number of packets.
>   * @param cs_flags
>   *   Pointer of flags to be returned.
> + * @param metadata
> + *   Pointer of metadata to be returned.
> + * @param txq_offloads
> + *   Offloads enabled on Tx queue
>   *
>   * @return
> - *   Number of packets having same ol_flags.
> + *   Number of packets having same ol_flags and metadata, if relevant.
>   */
>  static inline unsigned int
> -txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
> +txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
> +		 uint32_t *metadata, const uint64_t txq_offloads)
>  {
>  	unsigned int pos;
>  	const uint64_t ol_mask =
>  		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
>  		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
> -		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
> +		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM | PKT_TX_METADATA;

PKT_TX_METADATA shouldn't be added here. As this mask is for checksums, you
might rather rename it, e.g., cksum_ol_mask.

>  
>  	if (!pkts_n)
>  		return 0;
>  	/* Count the number of packets having same ol_flags. */

This comment has to be corrected and moved.

> -	for (pos = 1; pos < pkts_n; ++pos)
> -		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
> +	for (pos = 1; pos < pkts_n; ++pos) {
> +		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
> +			((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask))

Indentation.

>  			break;
> +		/* If the metadata ol_flag is set,
> +		 *  metadata must be same in all packets.
> +		 */

Correct the comment. The first line of a multi-line comment should be empty.
And it can't be 'must'; we are not forcing it, just counting the number of
packets having the same metadata, as I mentioned above.

> +		if ((txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) &&
> +			(pkts[pos]->ol_flags & PKT_TX_METADATA) &&
> +			pkts[0]->tx_metadata != pkts[pos]->tx_metadata)

Disagree. What if pkts[0] doesn't have PKT_TX_METADATA while pkts[1] has it?
And, indentation.

> +			break;
> +	}
>  	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
> +	*metadata = rte_cpu_to_be_32(pkts[0]->tx_metadata);

Same here. You should check if pkts[0] has metadata first.

>  	return pos;

Here's my suggestion for the whole func.

	unsigned int pos;
	const uint64_t cksum_ol_mask =
		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
	uint32_t p0_metadata;

	if (!pkts_n)
		return 0;
	p0_metadata = pkts[0]->ol_flags & PKT_TX_METADATA ?
		      pkts[0]->tx_metadata : 0;
	/* Count the number of packets having same offload parameters. */
	for (pos = 1; pos < pkts_n; ++pos) {
		/* Check if packet can have same checksum flags. */
		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
		    ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & cksum_ol_mask))
			break;
		/* Check if packet has same metadata. */
		if (txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
			const uint32_t p1_metadata =
				pkts[pos]->ol_flags & PKT_TX_METADATA ?
				pkts[pos]->tx_metadata : 0;

			if (p1_metadata != p0_metadata)
				break;
		}
	}
	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
	*metadata = rte_cpu_to_be_32(p0_metadata);
	return pos;
>  }
>  
> @@ -96,7 +113,7 @@
>  		uint16_t ret;
>  
>  		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
> -		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0);
> +		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0, 0);
>  		nb_tx += ret;
>  		if (!ret)
>  			break;
> @@ -127,6 +144,7 @@
>  		uint8_t cs_flags = 0;
>  		uint16_t n;
>  		uint16_t ret;
> +		uint32_t metadata = 0;

Let's use rte_be32_t instead.

>  
>  		/* Transmit multi-seg packets in the head of pkts list. */
>  		if ((txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS) &&
> @@ -137,9 +155,11 @@
>  		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
>  		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
>  			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
> -		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
> -			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
> -		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
> +		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
> +				DEV_TX_OFFLOAD_MATCH_METADATA))

Indentation.

> +			n = txq_calc_offload(&pkts[nb_tx], n,
> +					&cs_flags, &metadata, txq->offloads);

Indentation.

> +		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags, metadata);
>  		nb_tx += ret;
>  		if (!ret)
>  			break;
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
> index fb884f9..fda7004 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> @@ -22,6 +22,7 @@
>  /* HW offload capabilities of vectorized Tx. */
>  #define MLX5_VEC_TX_OFFLOAD_CAP \
>  	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
> +	 DEV_TX_OFFLOAD_MATCH_METADATA | \
>  	 DEV_TX_OFFLOAD_MULTI_SEGS)
>  
>  /*
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> index b37b738..a8a4d7b 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> @@ -201,13 +201,15 @@
>   *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
>   * @param cs_flags
>   *   Checksum offload flags to be written in the descriptor.
> + * @param metadata
> + *   Metadata value to be written in the descriptor.
>   *
>   * @return
>   *   Number of packets successfully transmitted (<= pkts_n).
>   */
>  static inline uint16_t
>  txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
> -	    uint8_t cs_flags)
> +	    uint8_t cs_flags, uint32_t metadata)

Let's use rte_be32_t instead.

>  {
>  	struct rte_mbuf **elts;
>  	uint16_t elts_head = txq->elts_head;
> @@ -294,10 +296,7 @@
>  	vst1q_u8((void *)t_wqe, ctrl);
>  	/* Fill ESEG in the header. */
>  	vst1q_u8((void *)(t_wqe + 1),
> -		 ((uint8x16_t) { 0, 0, 0, 0,
> -				 cs_flags, 0, 0, 0,
> -				 0, 0, 0, 0,
> -				 0, 0, 0, 0 }));
> +		 ((uint32x4_t) { 0, cs_flags, metadata, 0 }));
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  	txq->stats.opackets += pkts_n;
>  #endif
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> index 54b3783..31aae4a 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> @@ -202,13 +202,15 @@
>   *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
>   * @param cs_flags
>   *   Checksum offload flags to be written in the descriptor.
> + * @param metadata
> + *   Metadata value to be written in the descriptor.
>   *
>   * @return
>   *   Number of packets successfully transmitted (<= pkts_n).
>   */
>  static inline uint16_t
>  txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
> -	    uint8_t cs_flags)
> +	    uint8_t cs_flags, uint32_t metadata)

Let's use rte_be32_t instead.

>  {
>  	struct rte_mbuf **elts;
>  	uint16_t elts_head = txq->elts_head;
> @@ -292,11 +294,7 @@
>  	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
>  	_mm_store_si128(t_wqe, ctrl);
>  	/* Fill ESEG in the header. */
> -	_mm_store_si128(t_wqe + 1,
> -			_mm_set_epi8(0, 0, 0, 0,
> -				     0, 0, 0, 0,
> -				     0, 0, 0, cs_flags,
> -				     0, 0, 0, 0));
> +	_mm_store_si128(t_wqe + 1, _mm_set_epi32(0, metadata, cs_flags, 0));
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  	txq->stats.opackets += pkts_n;
>  #endif
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index f9bc473..7263fb1 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -128,6 +128,12 @@
>  			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
>  				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
>  	}
> +

Please no blank line.

> +#ifdef HAVE_IBV_FLOW_DV_SUPPORT
> +	if (config->dv_flow_en)
> +		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
> +#endif
> +

Same here.

>  	return offloads;
>  }
>  
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [PATCH v4] net/mlx5: support metadata as flow rule criteria
  2018-10-18  8:00       ` Yongseok Koh
@ 2018-10-21 13:44         ` Dekel Peled
  0 siblings, 0 replies; 17+ messages in thread
From: Dekel Peled @ 2018-10-21 13:44 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: Shahaf Shuler, dev, Ori Kam

Thanks, PSB.

> -----Original Message-----
> From: Yongseok Koh
> Sent: Thursday, October 18, 2018 11:01 AM
> To: Dekel Peled <dekelp@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Ori Kam
> <orika@mellanox.com>
> Subject: Re: [PATCH v4] net/mlx5: support metadata as flow rule criteria
> 
> On Wed, Oct 17, 2018 at 02:53:37PM +0300, Dekel Peled wrote:
> > As described in series starting at [1], it adds option to set metadata
> > value as match pattern when creating a new flow rule.
> >
> > This patch adds metadata support in mlx5 driver, in two parts:
> > - Add the validation and setting of metadata value in matcher,
> >   when creating a new flow rule.
> > - Add the passing of metadata value from mbuf to wqe when
> >   indicated by ol_flag, in different burst functions.
> >
> > [1] "ethdev: support metadata as flow rule criteria"
> >     http://mails.dpdk.org/archives/dev/2018-October/115469.html
> >
> > ---
> > v4:
> > - Rebase.
> > - Apply code review comments.
> > v3:
> > - Update meta item validation.
> > v2:
> > - Split the support of egress rules to a different patch.
> > ---
> >
> > Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5_flow.c          |   2 +-
> >  drivers/net/mlx5/mlx5_flow.h          |   8 +++
> >  drivers/net/mlx5/mlx5_flow_dv.c       | 109
> ++++++++++++++++++++++++++++++++++
> >  drivers/net/mlx5/mlx5_prm.h           |   2 +-
> >  drivers/net/mlx5/mlx5_rxtx.c          |  33 ++++++++--
> >  drivers/net/mlx5/mlx5_rxtx_vec.c      |  38 +++++++++---
> >  drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
> >  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |   9 ++-
> >  drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  10 ++--
> >  drivers/net/mlx5/mlx5_txq.c           |   6 ++
> >  10 files changed, 192 insertions(+), 26 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> > index bd70fce..15262f6 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -418,7 +418,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
> >   * @return
> >   *   0 on success, a negative errno value otherwise and rte_errno is set.
> >   */
> > -static int
> > +int
> >  mlx5_flow_item_acceptable(const struct rte_flow_item *item,
> >  			  const uint8_t *mask,
> >  			  const uint8_t *nic_mask,
> > diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> > index 094f666..834a6ed 100644
> > --- a/drivers/net/mlx5/mlx5_flow.h
> > +++ b/drivers/net/mlx5/mlx5_flow.h
> > @@ -43,6 +43,9 @@
> >  #define MLX5_FLOW_LAYER_GRE (1u << 14)
> >  #define MLX5_FLOW_LAYER_MPLS (1u << 15)
> >
> > +/* General pattern items bits. */
> > +#define MLX5_FLOW_ITEM_METADATA (1u << 16)
> > +
> >  /* Outer Masks. */
> >  #define MLX5_FLOW_LAYER_OUTER_L3 \
> >  	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
> > @@ -307,6 +310,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
> >  int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
> >  				  const struct rte_flow_attr *attributes,
> >  				  struct rte_flow_error *error);
> > +int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
> > +			      const uint8_t *mask,
> > +			      const uint8_t *nic_mask,
> > +			      unsigned int size,
> > +			      struct rte_flow_error *error);
> >  int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
> >  				uint64_t item_flags,
> >  				struct rte_flow_error *error);
> > diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> > index a013201..bfddfab 100644
> > --- a/drivers/net/mlx5/mlx5_flow_dv.c
> > +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> > @@ -36,6 +36,69 @@
> >  #ifdef HAVE_IBV_FLOW_DV_SUPPORT
> >
> >  /**
> > + * Validate META item.
> > + *
> > + * @param[in] dev
> > + *   Pointer to the rte_eth_dev structure.
> > + * @param[in] item
> > + *   Item specification.
> > + * @param[in] attr
> > + *   Attributes of flow that includes this item.
> > + * @param[out] error
> > + *   Pointer to error structure.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +static int
> > +flow_dv_validate_item_meta(struct rte_eth_dev *dev,
> > +			   const struct rte_flow_item *item,
> > +			   const struct rte_flow_attr *attr,
> > +			   struct rte_flow_error *error)
> > +{
> > +	const struct rte_flow_item_meta *spec = item->spec;
> > +	const struct rte_flow_item_meta *mask = item->mask;
> > +
> 
> No blank line.

Removed.

> 
> > +	const struct rte_flow_item_meta nic_mask = {
> > +		.data = RTE_BE32(UINT32_MAX)
> > +	};
> > +
> 
> Ditto.

Removed.

> 
> > +	int ret;
> > +	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
> > +
> > +	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
> > +		return rte_flow_error_set(error, EPERM,
> > +					  RTE_FLOW_ERROR_TYPE_ITEM,
> > +					  NULL,
> > +					  "match on metadata offload "
> > +					  "configuration is off for this port");
> > +	if (!spec)
> > +		return rte_flow_error_set(error, EINVAL,
> > +					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> > +					  item->spec,
> > +					  "data cannot be empty");
> > +	if (!spec->data)
> > +		return rte_flow_error_set(error, EINVAL,
> > +					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> > +					  NULL,
> > +					  "data cannot be zero");
> > +	if (!mask)
> > +		mask = &rte_flow_item_meta_mask;
> > +	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
> > +					(const uint8_t *)&nic_mask,
> > +					sizeof(struct rte_flow_item_meta),
> > +					error);
> > +	if (ret < 0)
> > +		return ret;
> > +	if (attr->ingress)
> > +		return rte_flow_error_set(error, ENOTSUP,
> > +					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> > +					  NULL,
> > +					  "pattern not supported for ingress");
> > +	return 0;
> > +}
> > +
> > +/**
> >   * Verify the @p attributes will be correctly understood by the NIC and
> store
> >   * them in the @p flow if everything is correct.
> >   *
> > @@ -214,6 +277,13 @@
> >  				return ret;
> >  			item_flags |= MLX5_FLOW_LAYER_MPLS;
> >  			break;
> > +		case RTE_FLOW_ITEM_TYPE_META:
> > +			ret = flow_dv_validate_item_meta(dev, items, attr,
> > +							 error);
> > +			if (ret < 0)
> > +				return ret;
> > +			item_flags |= MLX5_FLOW_ITEM_METADATA;
> > +			break;
> >  		default:
> >  			return rte_flow_error_set(error, ENOTSUP,
> >  						  RTE_FLOW_ERROR_TYPE_ITEM,
> > @@ -855,6 +925,42 @@
> >  }
> >
> >  /**
> > + * Add META item to matcher
> > + *
> > + * @param[in, out] matcher
> > + *   Flow matcher.
> > + * @param[in, out] key
> > + *   Flow matcher value.
> > + * @param[in] item
> > + *   Flow pattern to translate.
> > + * @param[in] inner
> > + *   Item is inner pattern.
> > + */
> > +static void
> > +flow_dv_translate_item_meta(void *matcher, void *key,
> > +				const struct rte_flow_item *item)
> > +{
> > +	const struct rte_flow_item_meta *meta_m;
> > +	const struct rte_flow_item_meta *meta_v;
> > +
> > +	void *misc2_m =
> > +		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
> > +	void *misc2_v =
> > +		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
> > +
> > +	meta_m = (const void *)item->mask;
> > +	if (!meta_m)
> > +		meta_m = &rte_flow_item_meta_mask;
> > +	meta_v = (const void *)item->spec;
> > +	if (meta_v) {
> > +		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
> > +			RTE_BE32(meta_m->data));
> 
> Nope. RTE_BE32() is for builtin constant, not for a variable.
> You should use rte_cpu_to_be_32() instead.

Replaced.

> 
> > +		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
> > +			RTE_BE32(meta_v->data));
> 
> Same here.

Replaced.

> 
> > +	}
> > +}
> > +
> > +/**
> >   * Update the matcher and the value based the selected item.
> >   *
> >   * @param[in, out] matcher
> > @@ -940,6 +1046,9 @@
> >  		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
> >  					     inner);
> >  		break;
> > +	case RTE_FLOW_ITEM_TYPE_META:
> > +		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
> > +		break;
> >  	default:
> >  		break;
> >  	}
> > diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
> > index 69296a0..29742b1 100644
> > --- a/drivers/net/mlx5/mlx5_prm.h
> > +++ b/drivers/net/mlx5/mlx5_prm.h
> > @@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
> >  	uint8_t	cs_flags;
> >  	uint8_t	rsvd1;
> >  	uint16_t mss;
> > -	uint32_t rsvd2;
> > +	uint32_t flow_table_metadata;
> >  	uint16_t inline_hdr_sz;
> >  	uint8_t inline_hdr[2];
> >  } __rte_aligned(MLX5_WQE_DWORD_SIZE);
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> > index 558e6b6..5b4d2fd 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.c
> > +++ b/drivers/net/mlx5/mlx5_rxtx.c
> > @@ -523,6 +523,7 @@
> >  		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
> >  		uint32_t swp_offsets = 0;
> >  		uint8_t swp_types = 0;
> > +		uint32_t metadata;
> >  		uint16_t tso_segsz = 0;
> >  #ifdef MLX5_PMD_SOFT_COUNTERS
> >  		uint32_t total_length = 0;
> > @@ -566,6 +567,10 @@
> >  		cs_flags = txq_ol_cksum_to_cs(buf);
> >  		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
> >  		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
> > +		/* Copy metadata from mbuf if valid */
> > +		metadata = buf->ol_flags & PKT_TX_METADATA ?
> > +						buf->tx_metadata : 0;
> 
> Indentation.

Changed.

> 
> > +
> 
> No blank line.

Removed.

> 
> >  		/* Replace the Ethernet type by the VLAN if necessary. */
> >  		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
> >  			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
> > @@ -781,7 +786,7 @@
> >  				swp_offsets,
> >  				cs_flags | (swp_types << 8) |
> >  				(rte_cpu_to_be_16(tso_segsz) << 16),
> > -				0,
> > +				rte_cpu_to_be_32(metadata),
> >  				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
> >  			};
> >  		} else {
> > @@ -795,7 +800,7 @@
> >  			wqe->eseg = (rte_v128u32_t){
> >  				swp_offsets,
> >  				cs_flags | (swp_types << 8),
> > -				0,
> > +				rte_cpu_to_be_32(metadata),
> >  				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
> >  			};
> >  		}
> > @@ -861,7 +866,7 @@
> >  	mpw->wqe->eseg.inline_hdr_sz = 0;
> >  	mpw->wqe->eseg.rsvd0 = 0;
> >  	mpw->wqe->eseg.rsvd1 = 0;
> > -	mpw->wqe->eseg.rsvd2 = 0;
> > +	mpw->wqe->eseg.flow_table_metadata = 0;
> >  	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
> >  					     (txq->wqe_ci << 8) |
> >  					     MLX5_OPCODE_TSO);
> > @@ -948,6 +953,7 @@
> >  		uint32_t length;
> >  		unsigned int segs_n = buf->nb_segs;
> >  		uint32_t cs_flags;
> > +		uint32_t metadata;
> >
> >  		/*
> >  		 * Make sure there is enough room to store this packet and
> @@
> > -964,6 +970,9 @@
> >  		max_elts -= segs_n;
> >  		--pkts_n;
> >  		cs_flags = txq_ol_cksum_to_cs(buf);
> > +		/* Copy metadata from mbuf if valid */
> > +		metadata = buf->ol_flags & PKT_TX_METADATA ?
> > +						buf->tx_metadata : 0;
> 
> Indentation.

Changed.

> And no need to change to big-endian? I think it needs.

The metadata is written to the mbuf by the application already in big-endian order.

> 
> >  		/* Retrieve packet information. */
> >  		length = PKT_LEN(buf);
> >  		assert(length);
> > @@ -971,6 +980,7 @@
> >  		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
> >  		    ((mpw.len != length) ||
> >  		     (segs_n != 1) ||
> > +		     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
> >  		     (mpw.wqe->eseg.cs_flags != cs_flags)))
> >  			mlx5_mpw_close(txq, &mpw);
> >  		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
> > @@ -984,6 +994,7 @@
> >  			max_wqe -= 2;
> >  			mlx5_mpw_new(txq, &mpw, length);
> >  			mpw.wqe->eseg.cs_flags = cs_flags;
> > +			mpw.wqe->eseg.flow_table_metadata = metadata;
> >  		}
> >  		/* Multi-segment packets must be alone in their MPW. */
> >  		assert((segs_n == 1) || (mpw.pkts_n == 0));
> > @@ -1082,7 +1093,7 @@
> >  	mpw->wqe->eseg.cs_flags = 0;
> >  	mpw->wqe->eseg.rsvd0 = 0;
> >  	mpw->wqe->eseg.rsvd1 = 0;
> > -	mpw->wqe->eseg.rsvd2 = 0;
> > +	mpw->wqe->eseg.flow_table_metadata = 0;
> >  	inl = (struct mlx5_wqe_inl_small *)
> >  		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
> >  	mpw->data.raw = (uint8_t *)&inl->raw;
> > @@ -1172,6 +1183,7 @@
> >  		uint32_t length;
> >  		unsigned int segs_n = buf->nb_segs;
> >  		uint8_t cs_flags;
> > +		uint32_t metadata;
> >
> >  		/*
> >  		 * Make sure there is enough room to store this packet and
> @@
> > -1193,18 +1205,23 @@
> >  		 */
> >  		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
> >  		cs_flags = txq_ol_cksum_to_cs(buf);
> > +		/* Copy metadata from mbuf if valid */
> > +		metadata = buf->ol_flags & PKT_TX_METADATA ?
> > +						buf->tx_metadata : 0;
> 
> Indentation.

Changed.

> And no need to change to big-endian?

The metadata is written to the mbuf by the application already in big-endian order.

> 
> >  		/* Retrieve packet information. */
> >  		length = PKT_LEN(buf);
> >  		/* Start new session if packet differs. */
> >  		if (mpw.state == MLX5_MPW_STATE_OPENED) {
> >  			if ((mpw.len != length) ||
> >  			    (segs_n != 1) ||
> > +			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
> >  			    (mpw.wqe->eseg.cs_flags != cs_flags))
> >  				mlx5_mpw_close(txq, &mpw);
> >  		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
> >  			if ((mpw.len != length) ||
> >  			    (segs_n != 1) ||
> >  			    (length > inline_room) ||
> > +			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
> >  			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
> >  				mlx5_mpw_inline_close(txq, &mpw);
> >  				inline_room =
> > @@ -1224,12 +1241,14 @@
> >  				max_wqe -= 2;
> >  				mlx5_mpw_new(txq, &mpw, length);
> >  				mpw.wqe->eseg.cs_flags = cs_flags;
> > +				mpw.wqe->eseg.flow_table_metadata = metadata;
> >  			} else {
> >  				if (unlikely(max_wqe < wqe_inl_n))
> >  					break;
> >  				max_wqe -= wqe_inl_n;
> >  				mlx5_mpw_inline_new(txq, &mpw, length);
> >  				mpw.wqe->eseg.cs_flags = cs_flags;
> > +				mpw.wqe->eseg.flow_table_metadata = metadata;
> >  			}
> >  		}
> >  		/* Multi-segment packets must be alone in their MPW. */
> > @@ -1461,6 +1480,7 @@
> >  		unsigned int do_inline = 0; /* Whether inline is possible. */
> >  		uint32_t length;
> >  		uint8_t cs_flags;
> > +		uint32_t metadata;
> >
> >  		/* Multi-segmented packet is handled in slow-path outside. */
> >  		assert(NB_SEGS(buf) == 1);
> > @@ -1468,6 +1488,9 @@
> >  		if (max_elts - j == 0)
> >  			break;
> >  		cs_flags = txq_ol_cksum_to_cs(buf);
> > +		/* Copy metadata from mbuf if valid */
> > +		metadata = buf->ol_flags & PKT_TX_METADATA ?
> > +						buf->tx_metadata : 0;
> 
> Indentation.

Changed.

> And no need to change to big-endian?

The metadata is written to the mbuf by the application already in big-endian order, so no conversion is needed here.

> 
> >  		/* Retrieve packet information. */
> >  		length = PKT_LEN(buf);
> >  		/* Start new session if:
> > @@ -1482,6 +1505,7 @@
> >  			    (length <= txq->inline_max_packet_sz &&
> >  			     inl_pad + sizeof(inl_hdr) + length >
> >  			     mpw_room) ||
> > +			     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
> >  			    (mpw.wqe->eseg.cs_flags != cs_flags))
> >  				max_wqe -= mlx5_empw_close(txq, &mpw);
> >  		}
> > @@ -1505,6 +1529,7 @@
> >  				    sizeof(inl_hdr) + length <= mpw_room &&
> >  				    !txq->mpw_hdr_dseg;
> >  			mpw.wqe->eseg.cs_flags = cs_flags;
> > +			mpw.wqe->eseg.flow_table_metadata = metadata;
> >  		} else {
> >  			/* Evaluate whether the next packet can be inlined.
> >  			 * Inlininig is possible when:
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c
> > b/drivers/net/mlx5/mlx5_rxtx_vec.c
> > index 0a4aed8..16a8608 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec.c
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
> > @@ -41,6 +41,8 @@
> >
> >  /**
> >   * Count the number of packets having same ol_flags and calculate cs_flags.
> > + * If PKT_TX_METADATA is set in ol_flags, packets must have same metadata
> > + * as well.
> 
> Packets can have different metadata but we just want to count the number
> of packets having same data. Please correct the comment.

Corrected.

> 
> >   *
> >   * @param pkts
> >   *   Pointer to array of packets.
> > @@ -48,26 +50,41 @@
> >   *   Number of packets.
> >   * @param cs_flags
> >   *   Pointer of flags to be returned.
> > + * @param metadata
> > + *   Pointer of metadata to be returned.
> > + * @param txq_offloads
> > + *   Offloads enabled on Tx queue
> >   *
> >   * @return
> > - *   Number of packets having same ol_flags.
> > + *   Number of packets having same ol_flags and metadata, if relevant.
> >   */
> >  static inline unsigned int
> > -txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
> > +txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
> > +		 uint32_t *metadata, const uint64_t txq_offloads)
> >  {
> >  	unsigned int pos;
> >  	const uint64_t ol_mask =
> >  		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
> >  		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
> > -		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
> > +		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM | PKT_TX_METADATA;
> 
> Shouldn't add PKT_TX_METADATA. As it is for cksum, you might rather want
> to change the name, e.g., cksum_ol_mask.
> 
> >
> >  	if (!pkts_n)
> >  		return 0;
> >  	/* Count the number of packets having same ol_flags. */
> 
> This comment has to be corrected and moved.
> 
> > -	for (pos = 1; pos < pkts_n; ++pos)
> > -		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
> > +	for (pos = 1; pos < pkts_n; ++pos) {
> > +		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
> > +			((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask))
> 
> Indentation.
> 
> >  			break;
> > +		/* If the metadata ol_flag is set,
> > +		 *  metadata must be same in all packets.
> > +		 */
> 
> Correct comment. First line should be empty for multi-line comment.
> And it can't be 'must'. We are not forcing it but just counting the number of
> packets having same metadata like I mentioned above.
> 
> > +		if ((txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) &&
> > +			(pkts[pos]->ol_flags & PKT_TX_METADATA) &&
> > +			pkts[0]->tx_metadata != pkts[pos]->tx_metadata)
> 
> Disagree. What if pkts[0] doesn't have PKT_TXT_METADATA while pkt[1] has
> it?
> And, indentation.
> 
> > +			break;
> > +	}
> >  	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
> > +	*metadata = rte_cpu_to_be_32(pkts[0]->tx_metadata);
> 
> Same here. You should check if pkts[0] has metadata first.
> 
> >  	return pos;
> 
> Here's my suggestion for the whole func.
> 
> 	unsigned int pos;
> 	const uint64_t cksum_ol_mask =
> 		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
> 		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
> 		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
> 	uint32_t p0_metadata;
> 
> 	if (!pkts_n)
> 		return 0;
> 	p0_metadata = pkts[0]->ol_flags & PKT_TX_METADATA ?
> 		      pkts[0]->tx_metadata : 0;
> 	/* Count the number of packets having same offload parameters. */
> 	for (pos = 1; pos < pkts_n; ++pos) {
> 		/* Check if packet can have same checksum flags. */
> 		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
> 		    ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & cksum_ol_mask))
> 			break;
> 		/* Check if packet has same metadata. */
> 		if (txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
> 			const uint32_t p1_metadata =
> 				pkts[pos]->ol_flags & PKT_TX_METADATA ?
> 				pkts[pos]->tx_metadata : 0;
> 
> 			if (p1_metadata != p0_metadata)
> 				break;
> 		}
> 	}
> 	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
> 	*metadata = rte_cpu_to_be_32(p0_metadata);
> 	return pos;

Modified per your suggestion.

> >  }
> >
> > @@ -96,7 +113,7 @@
> >  		uint16_t ret;
> >
> >  		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
> > -		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0);
> > +		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0, 0);
> >  		nb_tx += ret;
> >  		if (!ret)
> >  			break;
> > @@ -127,6 +144,7 @@
> >  		uint8_t cs_flags = 0;
> >  		uint16_t n;
> >  		uint16_t ret;
> > +		uint32_t metadata = 0;
> 
> Let's use rte_be32_t instead.

Agree.

> 
> >
> >  		/* Transmit multi-seg packets in the head of pkts list. */
> >  		if ((txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS) &&
> @@ -137,9
> > +155,11 @@
> >  		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
> >  		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
> >  			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
> > -		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
> > -			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
> > -		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
> > +		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
> > +				DEV_TX_OFFLOAD_MATCH_METADATA))
> 
> Indentation.

Changed.

> 
> > +			n = txq_calc_offload(&pkts[nb_tx], n,
> > +					&cs_flags, &metadata, txq->offloads);
> 
> Indentation.

Changed.

> 
> > +		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags, metadata);
> >  		nb_tx += ret;
> >  		if (!ret)
> >  			break;
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h
> > b/drivers/net/mlx5/mlx5_rxtx_vec.h
> > index fb884f9..fda7004 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> > @@ -22,6 +22,7 @@
> >  /* HW offload capabilities of vectorized Tx. */
> > #define MLX5_VEC_TX_OFFLOAD_CAP \
> >  	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
> > +	 DEV_TX_OFFLOAD_MATCH_METADATA | \
> >  	 DEV_TX_OFFLOAD_MULTI_SEGS)
> >
> >  /*
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > index b37b738..a8a4d7b 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > @@ -201,13 +201,15 @@
> >   *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
> >   * @param cs_flags
> >   *   Checksum offload flags to be written in the descriptor.
> > + * @param metadata
> > + *   Metadata value to be written in the descriptor.
> >   *
> >   * @return
> >   *   Number of packets successfully transmitted (<= pkts_n).
> >   */
> >  static inline uint16_t
> >  txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
> > -	    uint8_t cs_flags)
> > +	    uint8_t cs_flags, uint32_t metadata)
> 
> Let's use rte_be32_t instead.

Agree.

> 
> >  {
> >  	struct rte_mbuf **elts;
> >  	uint16_t elts_head = txq->elts_head;
> > @@ -294,10 +296,7 @@
> >  	vst1q_u8((void *)t_wqe, ctrl);
> >  	/* Fill ESEG in the header. */
> >  	vst1q_u8((void *)(t_wqe + 1),
> > -		 ((uint8x16_t) { 0, 0, 0, 0,
> > -				 cs_flags, 0, 0, 0,
> > -				 0, 0, 0, 0,
> > -				 0, 0, 0, 0 }));
> > +		 ((uint32x4_t) { 0, cs_flags, metadata, 0 }));
> >  #ifdef MLX5_PMD_SOFT_COUNTERS
> >  	txq->stats.opackets += pkts_n;
> >  #endif
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> > b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> > index 54b3783..31aae4a 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> > @@ -202,13 +202,15 @@
> >   *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
> >   * @param cs_flags
> >   *   Checksum offload flags to be written in the descriptor.
> > + * @param metadata
> > + *   Metadata value to be written in the descriptor.
> >   *
> >   * @return
> >   *   Number of packets successfully transmitted (<= pkts_n).
> >   */
> >  static inline uint16_t
> >  txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
> > -	    uint8_t cs_flags)
> > +	    uint8_t cs_flags, uint32_t metadata)
> 
> Let's use rte_be32_t instead.

Agree.

> 
> >  {
> >  	struct rte_mbuf **elts;
> >  	uint16_t elts_head = txq->elts_head;
> > @@ -292,11 +294,7 @@
> >  	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
> >  	_mm_store_si128(t_wqe, ctrl);
> >  	/* Fill ESEG in the header. */
> > -	_mm_store_si128(t_wqe + 1,
> > -			_mm_set_epi8(0, 0, 0, 0,
> > -				     0, 0, 0, 0,
> > -				     0, 0, 0, cs_flags,
> > -				     0, 0, 0, 0));
> > +	_mm_store_si128(t_wqe + 1, _mm_set_epi32(0, metadata, cs_flags, 0));
> >  #ifdef MLX5_PMD_SOFT_COUNTERS
> >  	txq->stats.opackets += pkts_n;
> >  #endif
> > diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> > index f9bc473..7263fb1 100644
> > --- a/drivers/net/mlx5/mlx5_txq.c
> > +++ b/drivers/net/mlx5/mlx5_txq.c
> > @@ -128,6 +128,12 @@
> >  			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
> >  				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
> >  	}
> > +
> 
> Please no blank line.

Removed.

> 
> > +#ifdef HAVE_IBV_FLOW_DV_SUPPORT
> > +	if (config->dv_flow_en)
> > +		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
> > +#endif
> > +
> 
> Same here.

Removed.

> 
> >  	return offloads;
> >  }
> >
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [PATCH v5] net/mlx5: support metadata as flow rule criteria
  2018-10-17 11:53     ` [dpdk-dev] [PATCH v4] " Dekel Peled
  2018-10-18  8:00       ` Yongseok Koh
@ 2018-10-21 14:04       ` Dekel Peled
  2018-10-22 18:47         ` Yongseok Koh
  2018-10-23 10:48         ` [dpdk-dev] [PATCH v6] " Dekel Peled
  1 sibling, 2 replies; 17+ messages in thread
From: Dekel Peled @ 2018-10-21 14:04 UTC (permalink / raw)
  To: yskoh, shahafs; +Cc: dev, orika

As described in the series starting at [1], an option is added to set
a metadata value as a match pattern when creating a new flow rule.

This patch adds metadata support in the mlx5 driver, in two parts:
- Add validation and setting of the metadata value in the matcher,
  when creating a new flow rule.
- Add passing of the metadata value from mbuf to WQE, when indicated
  by ol_flags, in the different burst functions.

[1] "ethdev: support metadata as flow rule criteria"
    http://mails.dpdk.org/archives/dev/2018-September/113269.html

---
v5:
Apply code review comments:
- Coding style (indentation, redundant blank lines, clear comments).
- txq_calc_offload() logic updated.
- rte_be32_t type used instead of uint32_t.
v4:
- Rebase.
- Apply code review comments.
v3:
- Update meta item validation.
v2:
- Split the support of egress rules to a different patch.
---

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c          |   2 +-
 drivers/net/mlx5/mlx5_flow.h          |   8 +++
 drivers/net/mlx5/mlx5_flow_dv.c       | 107 ++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_prm.h           |   2 +-
 drivers/net/mlx5/mlx5_rxtx.c          |  32 ++++++++--
 drivers/net/mlx5/mlx5_rxtx_vec.c      |  46 +++++++++++----
 drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |   9 ++-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  10 ++--
 drivers/net/mlx5/mlx5_txq.c           |   5 +-
 10 files changed, 193 insertions(+), 29 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index bd70fce..15262f6 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -418,7 +418,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static int
+int
 mlx5_flow_item_acceptable(const struct rte_flow_item *item,
 			  const uint8_t *mask,
 			  const uint8_t *nic_mask,
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 094f666..834a6ed 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -43,6 +43,9 @@
 #define MLX5_FLOW_LAYER_GRE (1u << 14)
 #define MLX5_FLOW_LAYER_MPLS (1u << 15)
 
+/* General pattern items bits. */
+#define MLX5_FLOW_ITEM_METADATA (1u << 16)
+
 /* Outer Masks. */
 #define MLX5_FLOW_LAYER_OUTER_L3 \
 	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
@@ -307,6 +310,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
 int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
 				  const struct rte_flow_attr *attributes,
 				  struct rte_flow_error *error);
+int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
+			      const uint8_t *mask,
+			      const uint8_t *nic_mask,
+			      unsigned int size,
+			      struct rte_flow_error *error);
 int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
 				uint64_t item_flags,
 				struct rte_flow_error *error);
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index a013201..a7976d7 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -36,6 +36,67 @@
 #ifdef HAVE_IBV_FLOW_DV_SUPPORT
 
 /**
+ * Validate META item.
+ *
+ * @param[in] dev
+ *   Pointer to the rte_eth_dev structure.
+ * @param[in] item
+ *   Item specification.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_item_meta(struct rte_eth_dev *dev,
+			   const struct rte_flow_item *item,
+			   const struct rte_flow_attr *attr,
+			   struct rte_flow_error *error)
+{
+	const struct rte_flow_item_meta *spec = item->spec;
+	const struct rte_flow_item_meta *mask = item->mask;
+	const struct rte_flow_item_meta nic_mask = {
+		.data = RTE_BE32(UINT32_MAX)
+	};
+	int ret;
+	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
+
+	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
+		return rte_flow_error_set(error, EPERM,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  NULL,
+					  "match on metadata offload "
+					  "configuration is off for this port");
+	if (!spec)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  item->spec,
+					  "data cannot be empty");
+	if (!spec->data)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  NULL,
+					  "data cannot be zero");
+	if (!mask)
+		mask = &rte_flow_item_meta_mask;
+	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
+					(const uint8_t *)&nic_mask,
+					sizeof(struct rte_flow_item_meta),
+					error);
+	if (ret < 0)
+		return ret;
+	if (attr->ingress)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
+					  NULL,
+					  "pattern not supported for ingress");
+	return 0;
+}
+
+/**
  * Verify the @p attributes will be correctly understood by the NIC and store
  * them in the @p flow if everything is correct.
  *
@@ -214,6 +275,13 @@
 				return ret;
 			item_flags |= MLX5_FLOW_LAYER_MPLS;
 			break;
+		case RTE_FLOW_ITEM_TYPE_META:
+			ret = flow_dv_validate_item_meta(dev, items, attr,
+							 error);
+			if (ret < 0)
+				return ret;
+			item_flags |= MLX5_FLOW_ITEM_METADATA;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ITEM,
@@ -855,6 +923,42 @@
 }
 
 /**
+ * Add META item to matcher
+ *
+ * @param[in, out] matcher
+ *   Flow matcher.
+ * @param[in, out] key
+ *   Flow matcher value.
+ * @param[in] item
+ *   Flow pattern to translate.
+ * @param[in] inner
+ *   Item is inner pattern.
+ */
+static void
+flow_dv_translate_item_meta(void *matcher, void *key,
+				const struct rte_flow_item *item)
+{
+	const struct rte_flow_item_meta *meta_m;
+	const struct rte_flow_item_meta *meta_v;
+
+	void *misc2_m =
+		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
+	void *misc2_v =
+		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
+
+	meta_m = (const void *)item->mask;
+	if (!meta_m)
+		meta_m = &rte_flow_item_meta_mask;
+	meta_v = (const void *)item->spec;
+	if (meta_v) {
+		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
+			 rte_be_to_cpu_32(meta_m->data));
+		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
+			 rte_be_to_cpu_32(meta_v->data));
+	}
+}
+
+/**
  * Update the matcher and the value based the selected item.
  *
  * @param[in, out] matcher
@@ -940,6 +1044,9 @@
 		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
 					     inner);
 		break;
+	case RTE_FLOW_ITEM_TYPE_META:
+		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
index 69296a0..29742b1 100644
--- a/drivers/net/mlx5/mlx5_prm.h
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
 	uint8_t	cs_flags;
 	uint8_t	rsvd1;
 	uint16_t mss;
-	uint32_t rsvd2;
+	uint32_t flow_table_metadata;
 	uint16_t inline_hdr_sz;
 	uint8_t inline_hdr[2];
 } __rte_aligned(MLX5_WQE_DWORD_SIZE);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 558e6b6..2bd220b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -523,6 +523,7 @@
 		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
 		uint32_t swp_offsets = 0;
 		uint8_t swp_types = 0;
+		rte_be32_t metadata;
 		uint16_t tso_segsz = 0;
 #ifdef MLX5_PMD_SOFT_COUNTERS
 		uint32_t total_length = 0;
@@ -566,6 +567,9 @@
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+				buf->tx_metadata : 0;
 		/* Replace the Ethernet type by the VLAN if necessary. */
 		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
@@ -781,7 +785,7 @@
 				swp_offsets,
 				cs_flags | (swp_types << 8) |
 				(rte_cpu_to_be_16(tso_segsz) << 16),
-				0,
+				metadata,
 				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
 			};
 		} else {
@@ -795,7 +799,7 @@
 			wqe->eseg = (rte_v128u32_t){
 				swp_offsets,
 				cs_flags | (swp_types << 8),
-				0,
+				metadata,
 				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
 			};
 		}
@@ -861,7 +865,7 @@
 	mpw->wqe->eseg.inline_hdr_sz = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
 					     (txq->wqe_ci << 8) |
 					     MLX5_OPCODE_TSO);
@@ -948,6 +952,7 @@
 		uint32_t length;
 		unsigned int segs_n = buf->nb_segs;
 		uint32_t cs_flags;
+		rte_be32_t metadata;
 
 		/*
 		 * Make sure there is enough room to store this packet and
@@ -964,6 +969,9 @@
 		max_elts -= segs_n;
 		--pkts_n;
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+				buf->tx_metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		assert(length);
@@ -971,6 +979,7 @@
 		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
 		    ((mpw.len != length) ||
 		     (segs_n != 1) ||
+		     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 		     (mpw.wqe->eseg.cs_flags != cs_flags)))
 			mlx5_mpw_close(txq, &mpw);
 		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
@@ -984,6 +993,7 @@
 			max_wqe -= 2;
 			mlx5_mpw_new(txq, &mpw, length);
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata = metadata;
 		}
 		/* Multi-segment packets must be alone in their MPW. */
 		assert((segs_n == 1) || (mpw.pkts_n == 0));
@@ -1082,7 +1092,7 @@
 	mpw->wqe->eseg.cs_flags = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	inl = (struct mlx5_wqe_inl_small *)
 		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
 	mpw->data.raw = (uint8_t *)&inl->raw;
@@ -1172,6 +1182,7 @@
 		uint32_t length;
 		unsigned int segs_n = buf->nb_segs;
 		uint8_t cs_flags;
+		rte_be32_t metadata;
 
 		/*
 		 * Make sure there is enough room to store this packet and
@@ -1193,18 +1204,23 @@
 		 */
 		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+				buf->tx_metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if packet differs. */
 		if (mpw.state == MLX5_MPW_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
+			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				mlx5_mpw_close(txq, &mpw);
 		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
 			    (length > inline_room) ||
+			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
 				mlx5_mpw_inline_close(txq, &mpw);
 				inline_room =
@@ -1224,12 +1240,14 @@
 				max_wqe -= 2;
 				mlx5_mpw_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				mpw.wqe->eseg.flow_table_metadata = metadata;
 			} else {
 				if (unlikely(max_wqe < wqe_inl_n))
 					break;
 				max_wqe -= wqe_inl_n;
 				mlx5_mpw_inline_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				mpw.wqe->eseg.flow_table_metadata = metadata;
 			}
 		}
 		/* Multi-segment packets must be alone in their MPW. */
@@ -1461,6 +1479,7 @@
 		unsigned int do_inline = 0; /* Whether inline is possible. */
 		uint32_t length;
 		uint8_t cs_flags;
+		rte_be32_t metadata;
 
 		/* Multi-segmented packet is handled in slow-path outside. */
 		assert(NB_SEGS(buf) == 1);
@@ -1468,6 +1487,9 @@
 		if (max_elts - j == 0)
 			break;
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ?
+				buf->tx_metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if:
@@ -1482,6 +1504,7 @@
 			    (length <= txq->inline_max_packet_sz &&
 			     inl_pad + sizeof(inl_hdr) + length >
 			     mpw_room) ||
+			     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				max_wqe -= mlx5_empw_close(txq, &mpw);
 		}
@@ -1505,6 +1528,7 @@
 				    sizeof(inl_hdr) + length <= mpw_room &&
 				    !txq->mpw_hdr_dseg;
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata = metadata;
 		} else {
 			/* Evaluate whether the next packet can be inlined.
 			 * Inlininig is possible when:
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 0a4aed8..1453f4f 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -40,7 +40,8 @@
 #endif
 
 /**
- * Count the number of packets having same ol_flags and calculate cs_flags.
+ * Count the number of packets having same ol_flags and same metadata (if
+ * PKT_TX_METADATA is set in ol_flags), and calculate cs_flags.
  *
  * @param pkts
  *   Pointer to array of packets.
@@ -48,26 +49,45 @@
  *   Number of packets.
  * @param cs_flags
  *   Pointer of flags to be returned.
+ * @param metadata
+ *   Pointer of metadata to be returned.
+ * @param txq_offloads
+ *   Offloads enabled on Tx queue
  *
  * @return
- *   Number of packets having same ol_flags.
+ *   Number of packets having same ol_flags and metadata, if relevant.
  */
 static inline unsigned int
-txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
+txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
+		 rte_be32_t *metadata, const uint64_t txq_offloads)
 {
 	unsigned int pos;
-	const uint64_t ol_mask =
+	const uint64_t cksum_ol_mask =
 		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
 		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
 		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
+	rte_be32_t p0_metadata, pn_metadata;
 
 	if (!pkts_n)
 		return 0;
-	/* Count the number of packets having same ol_flags. */
-	for (pos = 1; pos < pkts_n; ++pos)
-		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
+	p0_metadata = pkts[0]->ol_flags & PKT_TX_METADATA ?
+			pkts[0]->tx_metadata : 0;
+	/* Count the number of packets having same offload parameters. */
+	for (pos = 1; pos < pkts_n; ++pos) {
+		/* Check if packet has same checksum flags. */
+		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
+		    ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & cksum_ol_mask))
 			break;
+		/* Check if packet has same metadata. */
+		if (txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
+			pn_metadata = pkts[pos]->ol_flags & PKT_TX_METADATA ?
+					pkts[pos]->tx_metadata : 0;
+			if (pn_metadata != p0_metadata)
+				break;
+		}
+	}
 	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
+	*metadata = p0_metadata;
 	return pos;
 }
 
@@ -96,7 +116,7 @@
 		uint16_t ret;
 
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
-		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0);
+		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0, 0);
 		nb_tx += ret;
 		if (!ret)
 			break;
@@ -127,6 +147,7 @@
 		uint8_t cs_flags = 0;
 		uint16_t n;
 		uint16_t ret;
+		rte_be32_t metadata = 0;
 
 		/* Transmit multi-seg packets in the head of pkts list. */
 		if ((txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS) &&
@@ -137,9 +158,12 @@
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
 		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
 			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
-		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
-			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
-		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
+		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
+				     DEV_TX_OFFLOAD_MATCH_METADATA))
+			n = txq_calc_offload(&pkts[nb_tx], n,
+					     &cs_flags, &metadata,
+					     txq->offloads);
+		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags, metadata);
 		nb_tx += ret;
 		if (!ret)
 			break;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fb884f9..fda7004 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -22,6 +22,7 @@
 /* HW offload capabilities of vectorized Tx. */
 #define MLX5_VEC_TX_OFFLOAD_CAP \
 	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
+	 DEV_TX_OFFLOAD_MATCH_METADATA | \
 	 DEV_TX_OFFLOAD_MULTI_SEGS)
 
 /*
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index b37b738..b5e329f 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -201,13 +201,15 @@
  *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
  * @param cs_flags
  *   Checksum offload flags to be written in the descriptor.
+ * @param metadata
+ *   Metadata value to be written in the descriptor.
  *
  * @return
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static inline uint16_t
 txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
-	    uint8_t cs_flags)
+	    uint8_t cs_flags, rte_be32_t metadata)
 {
 	struct rte_mbuf **elts;
 	uint16_t elts_head = txq->elts_head;
@@ -294,10 +296,7 @@
 	vst1q_u8((void *)t_wqe, ctrl);
 	/* Fill ESEG in the header. */
 	vst1q_u8((void *)(t_wqe + 1),
-		 ((uint8x16_t) { 0, 0, 0, 0,
-				 cs_flags, 0, 0, 0,
-				 0, 0, 0, 0,
-				 0, 0, 0, 0 }));
+		 ((uint32x4_t) { 0, cs_flags, metadata, 0 }));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
 #endif
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 54b3783..e0f95f9 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -202,13 +202,15 @@
  *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
  * @param cs_flags
  *   Checksum offload flags to be written in the descriptor.
+ * @param metadata
+ *   Metadata value to be written in the descriptor.
  *
  * @return
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static inline uint16_t
 txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
-	    uint8_t cs_flags)
+	    uint8_t cs_flags, rte_be32_t metadata)
 {
 	struct rte_mbuf **elts;
 	uint16_t elts_head = txq->elts_head;
@@ -292,11 +294,7 @@
 	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
 	_mm_store_si128(t_wqe, ctrl);
 	/* Fill ESEG in the header. */
-	_mm_store_si128(t_wqe + 1,
-			_mm_set_epi8(0, 0, 0, 0,
-				     0, 0, 0, 0,
-				     0, 0, 0, cs_flags,
-				     0, 0, 0, 0));
+	_mm_store_si128(t_wqe + 1, _mm_set_epi32(0, metadata, cs_flags, 0));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
 #endif
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index f9bc473..b01bd67 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -120,7 +120,6 @@
 			offloads |= (DEV_TX_OFFLOAD_IP_TNL_TSO |
 				     DEV_TX_OFFLOAD_UDP_TNL_TSO);
 	}
-
 	if (config->tunnel_en) {
 		if (config->hw_csum)
 			offloads |= DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM;
@@ -128,6 +127,10 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	if (config->dv_flow_en)
+		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
+#endif
 	return offloads;
 }
 
-- 
1.8.3.1


* Re: [dpdk-dev] [PATCH v5] net/mlx5: support metadata as flow rule criteria
  2018-10-21 14:04       ` [dpdk-dev] [PATCH v5] " Dekel Peled
@ 2018-10-22 18:47         ` Yongseok Koh
  2018-10-23 10:48         ` [dpdk-dev] [PATCH v6] " Dekel Peled
  1 sibling, 0 replies; 17+ messages in thread
From: Yongseok Koh @ 2018-10-22 18:47 UTC (permalink / raw)
  To: Dekel Peled; +Cc: Shahaf Shuler, dev, Ori Kam


> On Oct 21, 2018, at 7:04 AM, Dekel Peled <dekelp@mellanox.com> wrote:
> 
> As described in series starting at [1], it adds option to set
> metadata value as match pattern when creating a new flow rule.
> 
> This patch adds metadata support in mlx5 driver, in two parts:
> - Add the validation and setting of metadata value in matcher,
>  when creating a new flow rule.
> - Add the passing of metadata value from mbuf to wqe when
>  indicated by ol_flag, in different burst functions.
> 
> [1] "ethdev: support metadata as flow rule criteria"
>    http://mails.dpdk.org/archives/dev/2018-September/113269.html
> 
> ---
> v5:
> Apply code review comments:
> Coding style (indentation, redundant blank lines, clear comments).
> txq_calc_offload() logic updated.
> rte_be32_t type used instead of uint32_t.
> v4:
> - Rebase.
> - Apply code review comments.
> v3:
> - Update meta item validation.
> v2:
> - Split the support of egress rules to a different patch.
> ---
> 
> Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> ---
> drivers/net/mlx5/mlx5_flow.c          |   2 +-
> drivers/net/mlx5/mlx5_flow.h          |   8 +++
> drivers/net/mlx5/mlx5_flow_dv.c       | 107 ++++++++++++++++++++++++++++++++++
> drivers/net/mlx5/mlx5_prm.h           |   2 +-
> drivers/net/mlx5/mlx5_rxtx.c          |  32 ++++++++--
> drivers/net/mlx5/mlx5_rxtx_vec.c      |  46 +++++++++++----
> drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
> drivers/net/mlx5/mlx5_rxtx_vec_neon.h |   9 ++-
> drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  10 ++--
> drivers/net/mlx5/mlx5_txq.c           |   5 +-
> 10 files changed, 193 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index bd70fce..15262f6 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -418,7 +418,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
>  * @return
>  *   0 on success, a negative errno value otherwise and rte_errno is set.
>  */
> -static int
> +int
> mlx5_flow_item_acceptable(const struct rte_flow_item *item,
> 			  const uint8_t *mask,
> 			  const uint8_t *nic_mask,
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index 094f666..834a6ed 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -43,6 +43,9 @@
> #define MLX5_FLOW_LAYER_GRE (1u << 14)
> #define MLX5_FLOW_LAYER_MPLS (1u << 15)
> 
> +/* General pattern items bits. */
> +#define MLX5_FLOW_ITEM_METADATA (1u << 16)
> +
> /* Outer Masks. */
> #define MLX5_FLOW_LAYER_OUTER_L3 \
> 	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
> @@ -307,6 +310,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
> int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
> 				  const struct rte_flow_attr *attributes,
> 				  struct rte_flow_error *error);
> +int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
> +			      const uint8_t *mask,
> +			      const uint8_t *nic_mask,
> +			      unsigned int size,
> +			      struct rte_flow_error *error);
> int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
> 				uint64_t item_flags,
> 				struct rte_flow_error *error);
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> index a013201..a7976d7 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -36,6 +36,67 @@
> #ifdef HAVE_IBV_FLOW_DV_SUPPORT
> 
> /**
> + * Validate META item.
> + *
> + * @param[in] dev
> + *   Pointer to the rte_eth_dev structure.
> + * @param[in] item
> + *   Item specification.
> + * @param[in] attr
> + *   Attributes of flow that includes this item.
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +flow_dv_validate_item_meta(struct rte_eth_dev *dev,
> +			   const struct rte_flow_item *item,
> +			   const struct rte_flow_attr *attr,
> +			   struct rte_flow_error *error)
> +{
> +	const struct rte_flow_item_meta *spec = item->spec;
> +	const struct rte_flow_item_meta *mask = item->mask;
> +	const struct rte_flow_item_meta nic_mask = {
> +		.data = RTE_BE32(UINT32_MAX)
> +	};
> +	int ret;
> +	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
> +
> +	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
> +		return rte_flow_error_set(error, EPERM,
> +					  RTE_FLOW_ERROR_TYPE_ITEM,
> +					  NULL,
> +					  "match on metadata offload "
> +					  "configuration is off for this port");
> +	if (!spec)
> +		return rte_flow_error_set(error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> +					  item->spec,
> +					  "data cannot be empty");
> +	if (!spec->data)
> +		return rte_flow_error_set(error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> +					  NULL,
> +					  "data cannot be zero");
> +	if (!mask)
> +		mask = &rte_flow_item_meta_mask;
> +	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
> +					(const uint8_t *)&nic_mask,
> +					sizeof(struct rte_flow_item_meta),
> +					error);
> +	if (ret < 0)
> +		return ret;
> +	if (attr->ingress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> +					  NULL,
> +					  "pattern not supported for ingress");
> +	return 0;
> +}
> +
> +/**
>  * Verify the @p attributes will be correctly understood by the NIC and store
>  * them in the @p flow if everything is correct.
>  *
> @@ -214,6 +275,13 @@
> 				return ret;
> 			item_flags |= MLX5_FLOW_LAYER_MPLS;
> 			break;
> +		case RTE_FLOW_ITEM_TYPE_META:
> +			ret = flow_dv_validate_item_meta(dev, items, attr,
> +							 error);
> +			if (ret < 0)
> +				return ret;
> +			item_flags |= MLX5_FLOW_ITEM_METADATA;
> +			break;
> 		default:
> 			return rte_flow_error_set(error, ENOTSUP,
> 						  RTE_FLOW_ERROR_TYPE_ITEM,
> @@ -855,6 +923,42 @@
> }
> 
> /**
> + * Add META item to matcher
> + *
> + * @param[in, out] matcher
> + *   Flow matcher.
> + * @param[in, out] key
> + *   Flow matcher value.
> + * @param[in] item
> + *   Flow pattern to translate.
> + * @param[in] inner
> + *   Item is inner pattern.
> + */
> +static void
> +flow_dv_translate_item_meta(void *matcher, void *key,
> +				const struct rte_flow_item *item)
> +{
> +	const struct rte_flow_item_meta *meta_m;
> +	const struct rte_flow_item_meta *meta_v;
> +
> +	void *misc2_m =
> +		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
> +	void *misc2_v =
> +		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
> +
> +	meta_m = (const void *)item->mask;
> +	if (!meta_m)
> +		meta_m = &rte_flow_item_meta_mask;
> +	meta_v = (const void *)item->spec;
> +	if (meta_v) {
> +		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
> +			 rte_be_to_cpu_32(meta_m->data));
> +		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
> +			 rte_be_to_cpu_32(meta_v->data));
> +	}
> +}
> +
> +/**
>  * Update the matcher and the value based the selected item.
>  *
>  * @param[in, out] matcher
> @@ -940,6 +1044,9 @@
> 		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
> 					     inner);
> 		break;
> +	case RTE_FLOW_ITEM_TYPE_META:
> +		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
> +		break;
> 	default:
> 		break;
> 	}
> diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
> index 69296a0..29742b1 100644
> --- a/drivers/net/mlx5/mlx5_prm.h
> +++ b/drivers/net/mlx5/mlx5_prm.h
> @@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
> 	uint8_t	cs_flags;
> 	uint8_t	rsvd1;
> 	uint16_t mss;
> -	uint32_t rsvd2;
> +	uint32_t flow_table_metadata;
> 	uint16_t inline_hdr_sz;
> 	uint8_t inline_hdr[2];
> } __rte_aligned(MLX5_WQE_DWORD_SIZE);
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 558e6b6..2bd220b 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -523,6 +523,7 @@
> 		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
> 		uint32_t swp_offsets = 0;
> 		uint8_t swp_types = 0;
> +		rte_be32_t metadata;
> 		uint16_t tso_segsz = 0;
> #ifdef MLX5_PMD_SOFT_COUNTERS
> 		uint32_t total_length = 0;
> @@ -566,6 +567,9 @@
> 		cs_flags = txq_ol_cksum_to_cs(buf);
> 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
> 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
> +		/* Copy metadata from mbuf if valid */
> +		metadata = buf->ol_flags & PKT_TX_METADATA ?
> +				buf->tx_metadata : 0;

Nitpicking. :-)
Indentation. There're a few more in the rest of the patch.
Otherwise,

Acked-by: Yongseok Koh <yskoh@mellanox.com>
 
Thanks


* [dpdk-dev] [PATCH v6] net/mlx5: support metadata as flow rule criteria
  2018-10-21 14:04       ` [dpdk-dev] [PATCH v5] " Dekel Peled
  2018-10-22 18:47         ` Yongseok Koh
@ 2018-10-23 10:48         ` Dekel Peled
  2018-10-23 12:27           ` Shahaf Shuler
  2018-10-23 19:34           ` [dpdk-dev] [PATCH v7] " Dekel Peled
  1 sibling, 2 replies; 17+ messages in thread
From: Dekel Peled @ 2018-10-23 10:48 UTC (permalink / raw)
  To: yskoh, shahafs; +Cc: dev, orika

As described in the series starting at [1], an option is added to set
a metadata value as a match pattern when creating a new flow rule.

This patch adds metadata support in the mlx5 driver, in two parts:
- Add the validation and setting of the metadata value in the matcher,
  when creating a new flow rule.
- Add the passing of the metadata value from the mbuf to the WQE, when
  indicated by ol_flags, in the different burst functions.

[1] "ethdev: support metadata as flow rule criteria"
    http://mails.dpdk.org/archives/dev/2018-September/113269.html

---
v6:
- Correct indentation.
- Fix setting data in matcher to include mask.
v5:
- Apply code review comments:
  - Coding style (indentation, redundant blank lines, clear comments).
  - txq_calc_offload() logic updated.
  - rte_be32_t type used instead of uint32_t.
v4:
- Rebase.
- Apply code review comments.
v3:
- Update meta item validation.
v2:
- Split the support of egress rules to a different patch.
---

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c          |   2 +-
 drivers/net/mlx5/mlx5_flow.h          |   8 +++
 drivers/net/mlx5/mlx5_flow_dv.c       | 106 ++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_prm.h           |   2 +-
 drivers/net/mlx5/mlx5_rxtx.c          |  32 ++++++++--
 drivers/net/mlx5/mlx5_rxtx_vec.c      |  46 +++++++++++----
 drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |   9 ++-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  10 ++--
 drivers/net/mlx5/mlx5_txq.c           |   5 +-
 10 files changed, 192 insertions(+), 29 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index fcabab0..df5c34e 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -418,7 +418,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static int
+int
 mlx5_flow_item_acceptable(const struct rte_flow_item *item,
 			  const uint8_t *mask,
 			  const uint8_t *nic_mask,
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index af0a125..38635c9 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -43,6 +43,9 @@
 #define MLX5_FLOW_LAYER_GRE (1u << 14)
 #define MLX5_FLOW_LAYER_MPLS (1u << 15)
 
+/* General pattern items bits. */
+#define MLX5_FLOW_ITEM_METADATA (1u << 16)
+
 /* Outer Masks. */
 #define MLX5_FLOW_LAYER_OUTER_L3 \
 	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
@@ -316,6 +319,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
 int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
 				  const struct rte_flow_attr *attributes,
 				  struct rte_flow_error *error);
+int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
+			      const uint8_t *mask,
+			      const uint8_t *nic_mask,
+			      unsigned int size,
+			      struct rte_flow_error *error);
 int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
 				uint64_t item_flags,
 				struct rte_flow_error *error);
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 58e3c33..e8f409f 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -36,6 +36,67 @@
 #ifdef HAVE_IBV_FLOW_DV_SUPPORT
 
 /**
+ * Validate META item.
+ *
+ * @param[in] dev
+ *   Pointer to the rte_eth_dev structure.
+ * @param[in] item
+ *   Item specification.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_item_meta(struct rte_eth_dev *dev,
+			   const struct rte_flow_item *item,
+			   const struct rte_flow_attr *attr,
+			   struct rte_flow_error *error)
+{
+	const struct rte_flow_item_meta *spec = item->spec;
+	const struct rte_flow_item_meta *mask = item->mask;
+	const struct rte_flow_item_meta nic_mask = {
+		.data = RTE_BE32(UINT32_MAX)
+	};
+	int ret;
+	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
+
+	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
+		return rte_flow_error_set(error, EPERM,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  NULL,
+					  "match on metadata offload "
+					  "configuration is off for this port");
+	if (!spec)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  item->spec,
+					  "data cannot be empty");
+	if (!spec->data)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  NULL,
+					  "data cannot be zero");
+	if (!mask)
+		mask = &rte_flow_item_meta_mask;
+	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
+					(const uint8_t *)&nic_mask,
+					sizeof(struct rte_flow_item_meta),
+					error);
+	if (ret < 0)
+		return ret;
+	if (attr->ingress)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
+					  NULL,
+					  "pattern not supported for ingress");
+	return 0;
+}
+
+/**
  * Verify the @p attributes will be correctly understood by the NIC and store
  * them in the @p flow if everything is correct.
  *
@@ -214,6 +275,13 @@
 				return ret;
 			item_flags |= MLX5_FLOW_LAYER_MPLS;
 			break;
+		case RTE_FLOW_ITEM_TYPE_META:
+			ret = flow_dv_validate_item_meta(dev, items, attr,
+							 error);
+			if (ret < 0)
+				return ret;
+			item_flags |= MLX5_FLOW_ITEM_METADATA;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ITEM,
@@ -857,6 +925,41 @@
 }
 
 /**
+ * Add META item to matcher
+ *
+ * @param[in, out] matcher
+ *   Flow matcher.
+ * @param[in, out] key
+ *   Flow matcher value.
+ * @param[in] item
+ *   Flow pattern to translate.
+ * @param[in] inner
+ *   Item is inner pattern.
+ */
+static void
+flow_dv_translate_item_meta(void *matcher, void *key,
+			    const struct rte_flow_item *item)
+{
+	const struct rte_flow_item_meta *meta_m;
+	const struct rte_flow_item_meta *meta_v;
+	void *misc2_m =
+		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
+	void *misc2_v =
+		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
+
+	meta_m = (const void *)item->mask;
+	if (!meta_m)
+		meta_m = &rte_flow_item_meta_mask;
+	meta_v = (const void *)item->spec;
+	if (meta_v) {
+		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
+			 rte_be_to_cpu_32(meta_m->data));
+		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
+			 rte_be_to_cpu_32(meta_v->data & meta_m->data));
+	}
+}
+
+/**
  * Update the matcher and the value based the selected item.
  *
  * @param[in, out] matcher
@@ -942,6 +1045,9 @@
 		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
 					     inner);
 		break;
+	case RTE_FLOW_ITEM_TYPE_META:
+		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
index 69296a0..29742b1 100644
--- a/drivers/net/mlx5/mlx5_prm.h
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
 	uint8_t	cs_flags;
 	uint8_t	rsvd1;
 	uint16_t mss;
-	uint32_t rsvd2;
+	uint32_t flow_table_metadata;
 	uint16_t inline_hdr_sz;
 	uint8_t inline_hdr[2];
 } __rte_aligned(MLX5_WQE_DWORD_SIZE);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 558e6b6..90a2bf8 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -523,6 +523,7 @@
 		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
 		uint32_t swp_offsets = 0;
 		uint8_t swp_types = 0;
+		rte_be32_t metadata;
 		uint16_t tso_segsz = 0;
 #ifdef MLX5_PMD_SOFT_COUNTERS
 		uint32_t total_length = 0;
@@ -566,6 +567,9 @@
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
+							     0;
 		/* Replace the Ethernet type by the VLAN if necessary. */
 		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
@@ -781,7 +785,7 @@
 				swp_offsets,
 				cs_flags | (swp_types << 8) |
 				(rte_cpu_to_be_16(tso_segsz) << 16),
-				0,
+				metadata,
 				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
 			};
 		} else {
@@ -795,7 +799,7 @@
 			wqe->eseg = (rte_v128u32_t){
 				swp_offsets,
 				cs_flags | (swp_types << 8),
-				0,
+				metadata,
 				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
 			};
 		}
@@ -861,7 +865,7 @@
 	mpw->wqe->eseg.inline_hdr_sz = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
 					     (txq->wqe_ci << 8) |
 					     MLX5_OPCODE_TSO);
@@ -948,6 +952,7 @@
 		uint32_t length;
 		unsigned int segs_n = buf->nb_segs;
 		uint32_t cs_flags;
+		rte_be32_t metadata;
 
 		/*
 		 * Make sure there is enough room to store this packet and
@@ -964,6 +969,9 @@
 		max_elts -= segs_n;
 		--pkts_n;
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
+							     0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		assert(length);
@@ -971,6 +979,7 @@
 		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
 		    ((mpw.len != length) ||
 		     (segs_n != 1) ||
+		     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 		     (mpw.wqe->eseg.cs_flags != cs_flags)))
 			mlx5_mpw_close(txq, &mpw);
 		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
@@ -984,6 +993,7 @@
 			max_wqe -= 2;
 			mlx5_mpw_new(txq, &mpw, length);
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata = metadata;
 		}
 		/* Multi-segment packets must be alone in their MPW. */
 		assert((segs_n == 1) || (mpw.pkts_n == 0));
@@ -1082,7 +1092,7 @@
 	mpw->wqe->eseg.cs_flags = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	inl = (struct mlx5_wqe_inl_small *)
 		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
 	mpw->data.raw = (uint8_t *)&inl->raw;
@@ -1172,6 +1182,7 @@
 		uint32_t length;
 		unsigned int segs_n = buf->nb_segs;
 		uint8_t cs_flags;
+		rte_be32_t metadata;
 
 		/*
 		 * Make sure there is enough room to store this packet and
@@ -1193,18 +1204,23 @@
 		 */
 		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
+							     0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if packet differs. */
 		if (mpw.state == MLX5_MPW_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
+			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				mlx5_mpw_close(txq, &mpw);
 		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
 			    (length > inline_room) ||
+			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
 				mlx5_mpw_inline_close(txq, &mpw);
 				inline_room =
@@ -1224,12 +1240,14 @@
 				max_wqe -= 2;
 				mlx5_mpw_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				mpw.wqe->eseg.flow_table_metadata = metadata;
 			} else {
 				if (unlikely(max_wqe < wqe_inl_n))
 					break;
 				max_wqe -= wqe_inl_n;
 				mlx5_mpw_inline_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				mpw.wqe->eseg.flow_table_metadata = metadata;
 			}
 		}
 		/* Multi-segment packets must be alone in their MPW. */
@@ -1461,6 +1479,7 @@
 		unsigned int do_inline = 0; /* Whether inline is possible. */
 		uint32_t length;
 		uint8_t cs_flags;
+		rte_be32_t metadata;
 
 		/* Multi-segmented packet is handled in slow-path outside. */
 		assert(NB_SEGS(buf) == 1);
@@ -1468,6 +1487,9 @@
 		if (max_elts - j == 0)
 			break;
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
+							     0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if:
@@ -1482,6 +1504,7 @@
 			    (length <= txq->inline_max_packet_sz &&
 			     inl_pad + sizeof(inl_hdr) + length >
 			     mpw_room) ||
+			     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				max_wqe -= mlx5_empw_close(txq, &mpw);
 		}
@@ -1505,6 +1528,7 @@
 				    sizeof(inl_hdr) + length <= mpw_room &&
 				    !txq->mpw_hdr_dseg;
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata = metadata;
 		} else {
 			/* Evaluate whether the next packet can be inlined.
 			 * Inlininig is possible when:
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 0a4aed8..1453f4f 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -40,7 +40,8 @@
 #endif
 
 /**
- * Count the number of packets having same ol_flags and calculate cs_flags.
+ * Count the number of packets having same ol_flags and same metadata (if
+ * PKT_TX_METADATA is set in ol_flags), and calculate cs_flags.
  *
  * @param pkts
  *   Pointer to array of packets.
@@ -48,26 +49,45 @@
  *   Number of packets.
  * @param cs_flags
  *   Pointer of flags to be returned.
+ * @param metadata
+ *   Pointer of metadata to be returned.
+ * @param txq_offloads
+ *   Offloads enabled on Tx queue
  *
  * @return
- *   Number of packets having same ol_flags.
+ *   Number of packets having same ol_flags and metadata, if relevant.
  */
 static inline unsigned int
-txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
+txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
+		 rte_be32_t *metadata, const uint64_t txq_offloads)
 {
 	unsigned int pos;
-	const uint64_t ol_mask =
+	const uint64_t cksum_ol_mask =
 		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
 		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
 		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
+	rte_be32_t p0_metadata, pn_metadata;
 
 	if (!pkts_n)
 		return 0;
-	/* Count the number of packets having same ol_flags. */
-	for (pos = 1; pos < pkts_n; ++pos)
-		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
+	p0_metadata = pkts[0]->ol_flags & PKT_TX_METADATA ?
+			pkts[0]->tx_metadata : 0;
+	/* Count the number of packets having same offload parameters. */
+	for (pos = 1; pos < pkts_n; ++pos) {
+		/* Check if packet has same checksum flags. */
+		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
+		    ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & cksum_ol_mask))
 			break;
+		/* Check if packet has same metadata. */
+		if (txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
+			pn_metadata = pkts[pos]->ol_flags & PKT_TX_METADATA ?
+					pkts[pos]->tx_metadata : 0;
+			if (pn_metadata != p0_metadata)
+				break;
+		}
+	}
 	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
+	*metadata = p0_metadata;
 	return pos;
 }
 
@@ -96,7 +116,7 @@
 		uint16_t ret;
 
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
-		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0);
+		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0, 0);
 		nb_tx += ret;
 		if (!ret)
 			break;
@@ -127,6 +147,7 @@
 		uint8_t cs_flags = 0;
 		uint16_t n;
 		uint16_t ret;
+		rte_be32_t metadata = 0;
 
 		/* Transmit multi-seg packets in the head of pkts list. */
 		if ((txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS) &&
@@ -137,9 +158,12 @@
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
 		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
 			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
-		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
-			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
-		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
+		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
+				     DEV_TX_OFFLOAD_MATCH_METADATA))
+			n = txq_calc_offload(&pkts[nb_tx], n,
+					     &cs_flags, &metadata,
+					     txq->offloads);
+		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags, metadata);
 		nb_tx += ret;
 		if (!ret)
 			break;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fb884f9..fda7004 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -22,6 +22,7 @@
 /* HW offload capabilities of vectorized Tx. */
 #define MLX5_VEC_TX_OFFLOAD_CAP \
 	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
+	 DEV_TX_OFFLOAD_MATCH_METADATA | \
 	 DEV_TX_OFFLOAD_MULTI_SEGS)
 
 /*
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index b37b738..b5e329f 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -201,13 +201,15 @@
  *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
  * @param cs_flags
  *   Checksum offload flags to be written in the descriptor.
+ * @param metadata
+ *   Metadata value to be written in the descriptor.
  *
  * @return
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static inline uint16_t
 txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
-	    uint8_t cs_flags)
+	    uint8_t cs_flags, rte_be32_t metadata)
 {
 	struct rte_mbuf **elts;
 	uint16_t elts_head = txq->elts_head;
@@ -294,10 +296,7 @@
 	vst1q_u8((void *)t_wqe, ctrl);
 	/* Fill ESEG in the header. */
 	vst1q_u8((void *)(t_wqe + 1),
-		 ((uint8x16_t) { 0, 0, 0, 0,
-				 cs_flags, 0, 0, 0,
-				 0, 0, 0, 0,
-				 0, 0, 0, 0 }));
+		 ((uint32x4_t) { 0, cs_flags, metadata, 0 }));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
 #endif
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 54b3783..e0f95f9 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -202,13 +202,15 @@
  *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
  * @param cs_flags
  *   Checksum offload flags to be written in the descriptor.
+ * @param metadata
+ *   Metadata value to be written in the descriptor.
  *
  * @return
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static inline uint16_t
 txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
-	    uint8_t cs_flags)
+	    uint8_t cs_flags, rte_be32_t metadata)
 {
 	struct rte_mbuf **elts;
 	uint16_t elts_head = txq->elts_head;
@@ -292,11 +294,7 @@
 	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
 	_mm_store_si128(t_wqe, ctrl);
 	/* Fill ESEG in the header. */
-	_mm_store_si128(t_wqe + 1,
-			_mm_set_epi8(0, 0, 0, 0,
-				     0, 0, 0, 0,
-				     0, 0, 0, cs_flags,
-				     0, 0, 0, 0));
+	_mm_store_si128(t_wqe + 1, _mm_set_epi32(0, metadata, cs_flags, 0));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
 #endif
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index f9bc473..b01bd67 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -120,7 +120,6 @@
 			offloads |= (DEV_TX_OFFLOAD_IP_TNL_TSO |
 				     DEV_TX_OFFLOAD_UDP_TNL_TSO);
 	}
-
 	if (config->tunnel_en) {
 		if (config->hw_csum)
 			offloads |= DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM;
@@ -128,6 +127,10 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	if (config->dv_flow_en)
+		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
+#endif
 	return offloads;
 }
 
-- 
1.8.3.1


* Re: [dpdk-dev] [PATCH v6] net/mlx5: support metadata as flow rule criteria
  2018-10-23 10:48         ` [dpdk-dev] [PATCH v6] " Dekel Peled
@ 2018-10-23 12:27           ` Shahaf Shuler
  2018-10-23 19:34           ` [dpdk-dev] [PATCH v7] " Dekel Peled
  1 sibling, 0 replies; 17+ messages in thread
From: Shahaf Shuler @ 2018-10-23 12:27 UTC (permalink / raw)
  To: Dekel Peled, Yongseok Koh; +Cc: dev, Ori Kam

Tuesday, October 23, 2018 1:48 PM, Dekel Peled:
> Subject: [PATCH v6] net/mlx5: support metadata as flow rule criteria
> 
> As described in series starting at [1], it adds option to set metadata value as
> match pattern when creating a new flow rule.
> 
> This patch adds metadata support in mlx5 driver, in two parts:
> - Add the validation and setting of metadata value in matcher,
>   when creating a new flow rule.
> - Add the passing of metadata value from mbuf to wqe when
>   indicated by ol_flag, in different burst functions.
> 
> [1] "ethdev: support metadata as flow rule criteria"
>     http://mails.dpdk.org/archives/dev/2018-September/113269.html
> 
> ---
> v6:
> - Correct indentation.
> - Fix setting data in matcher to include mask.
> v5:
> Apply code review comments:
>  Coding style (indentation, redundant blank lines, clear comments).
>  txq_calc_offload() logic updated.
>  rte_be32_t type used instead of uint32_t.
> v4:
> - Rebase.
> - Apply code review comments.
> v3:
> - Update meta item validation.
> v2:
> - Split the support of egress rules to a different patch.
> ---
> 
> Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> ---

Applied to next-net-mlx, thanks.


* [dpdk-dev] [PATCH v7] net/mlx5: support metadata as flow rule criteria
  2018-10-23 10:48         ` [dpdk-dev] [PATCH v6] " Dekel Peled
  2018-10-23 12:27           ` Shahaf Shuler
@ 2018-10-23 19:34           ` Dekel Peled
  2018-10-24  6:12             ` Shahaf Shuler
  1 sibling, 1 reply; 17+ messages in thread
From: Dekel Peled @ 2018-10-23 19:34 UTC (permalink / raw)
  To: yskoh, shahafs; +Cc: dev, orika

As described in the series starting at [1], an option is added to set
a metadata value as a match pattern when creating a new flow rule.

This patch adds metadata support in the mlx5 driver, in two parts:
- Add the validation and setting of the metadata value in the matcher,
  when creating a new flow rule.
- Add the passing of the metadata value from the mbuf to the WQE, when
  indicated by ol_flags, in the different burst functions.

[1] "ethdev: support metadata as flow rule criteria"
    http://mails.dpdk.org/archives/dev/2018-September/113269.html

---
v7:
- Fix use of wrong type.
v6:
- Correct indentation.
- Fix setting data in matcher to include mask.
v5:
- Apply code review comments:
  - Coding style (indentation, redundant blank lines, clear comments).
  - txq_calc_offload() logic updated.
  - rte_be32_t type used instead of uint32_t.
v4:
- Rebase.
- Apply code review comments.
v3:
- Update meta item validation.
v2:
- Split the support of egress rules to a different patch.
---

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c          |   2 +-
 drivers/net/mlx5/mlx5_flow.h          |   8 +++
 drivers/net/mlx5/mlx5_flow_dv.c       | 106 ++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_prm.h           |   2 +-
 drivers/net/mlx5/mlx5_rxtx.c          |  32 ++++++++--
 drivers/net/mlx5/mlx5_rxtx_vec.c      |  46 +++++++++++----
 drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  11 ++--
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  10 ++--
 drivers/net/mlx5/mlx5_txq.c           |   5 +-
 10 files changed, 193 insertions(+), 30 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index fcabab0..df5c34e 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -418,7 +418,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static int
+int
 mlx5_flow_item_acceptable(const struct rte_flow_item *item,
 			  const uint8_t *mask,
 			  const uint8_t *nic_mask,
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index af0a125..38635c9 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -43,6 +43,9 @@
 #define MLX5_FLOW_LAYER_GRE (1u << 14)
 #define MLX5_FLOW_LAYER_MPLS (1u << 15)
 
+/* General pattern items bits. */
+#define MLX5_FLOW_ITEM_METADATA (1u << 16)
+
 /* Outer Masks. */
 #define MLX5_FLOW_LAYER_OUTER_L3 \
 	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
@@ -316,6 +319,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
 int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
 				  const struct rte_flow_attr *attributes,
 				  struct rte_flow_error *error);
+int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
+			      const uint8_t *mask,
+			      const uint8_t *nic_mask,
+			      unsigned int size,
+			      struct rte_flow_error *error);
 int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
 				uint64_t item_flags,
 				struct rte_flow_error *error);
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 58e3c33..e8f409f 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -36,6 +36,67 @@
 #ifdef HAVE_IBV_FLOW_DV_SUPPORT
 
 /**
+ * Validate META item.
+ *
+ * @param[in] dev
+ *   Pointer to the rte_eth_dev structure.
+ * @param[in] item
+ *   Item specification.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_item_meta(struct rte_eth_dev *dev,
+			   const struct rte_flow_item *item,
+			   const struct rte_flow_attr *attr,
+			   struct rte_flow_error *error)
+{
+	const struct rte_flow_item_meta *spec = item->spec;
+	const struct rte_flow_item_meta *mask = item->mask;
+	const struct rte_flow_item_meta nic_mask = {
+		.data = RTE_BE32(UINT32_MAX)
+	};
+	int ret;
+	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
+
+	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
+		return rte_flow_error_set(error, EPERM,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  NULL,
+					  "match on metadata offload "
+					  "configuration is off for this port");
+	if (!spec)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  item->spec,
+					  "data cannot be empty");
+	if (!spec->data)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  NULL,
+					  "data cannot be zero");
+	if (!mask)
+		mask = &rte_flow_item_meta_mask;
+	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
+					(const uint8_t *)&nic_mask,
+					sizeof(struct rte_flow_item_meta),
+					error);
+	if (ret < 0)
+		return ret;
+	if (attr->ingress)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
+					  NULL,
+					  "pattern not supported for ingress");
+	return 0;
+}
+
+/**
  * Verify the @p attributes will be correctly understood by the NIC and store
  * them in the @p flow if everything is correct.
  *
@@ -214,6 +275,13 @@
 				return ret;
 			item_flags |= MLX5_FLOW_LAYER_MPLS;
 			break;
+		case RTE_FLOW_ITEM_TYPE_META:
+			ret = flow_dv_validate_item_meta(dev, items, attr,
+							 error);
+			if (ret < 0)
+				return ret;
+			item_flags |= MLX5_FLOW_ITEM_METADATA;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ITEM,
@@ -857,6 +925,41 @@
 }
 
 /**
+ * Add META item to matcher
+ *
+ * @param[in, out] matcher
+ *   Flow matcher.
+ * @param[in, out] key
+ *   Flow matcher value.
+ * @param[in] item
+ *   Flow pattern to translate.
+ * @param[in] inner
+ *   Item is inner pattern.
+ */
+static void
+flow_dv_translate_item_meta(void *matcher, void *key,
+			    const struct rte_flow_item *item)
+{
+	const struct rte_flow_item_meta *meta_m;
+	const struct rte_flow_item_meta *meta_v;
+	void *misc2_m =
+		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
+	void *misc2_v =
+		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
+
+	meta_m = (const void *)item->mask;
+	if (!meta_m)
+		meta_m = &rte_flow_item_meta_mask;
+	meta_v = (const void *)item->spec;
+	if (meta_v) {
+		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
+			 rte_be_to_cpu_32(meta_m->data));
+		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
+			 rte_be_to_cpu_32(meta_v->data & meta_m->data));
+	}
+}
+
+/**
  * Update the matcher and the value based the selected item.
  *
  * @param[in, out] matcher
@@ -942,6 +1045,9 @@
 		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
 					     inner);
 		break;
+	case RTE_FLOW_ITEM_TYPE_META:
+		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
index 69296a0..29742b1 100644
--- a/drivers/net/mlx5/mlx5_prm.h
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
 	uint8_t	cs_flags;
 	uint8_t	rsvd1;
 	uint16_t mss;
-	uint32_t rsvd2;
+	uint32_t flow_table_metadata;
 	uint16_t inline_hdr_sz;
 	uint8_t inline_hdr[2];
 } __rte_aligned(MLX5_WQE_DWORD_SIZE);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 558e6b6..90a2bf8 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -523,6 +523,7 @@
 		uint8_t tso = txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
 		uint32_t swp_offsets = 0;
 		uint8_t swp_types = 0;
+		rte_be32_t metadata;
 		uint16_t tso_segsz = 0;
 #ifdef MLX5_PMD_SOFT_COUNTERS
 		uint32_t total_length = 0;
@@ -566,6 +567,9 @@
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
+							     0;
 		/* Replace the Ethernet type by the VLAN if necessary. */
 		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
@@ -781,7 +785,7 @@
 				swp_offsets,
 				cs_flags | (swp_types << 8) |
 				(rte_cpu_to_be_16(tso_segsz) << 16),
-				0,
+				metadata,
 				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
 			};
 		} else {
@@ -795,7 +799,7 @@
 			wqe->eseg = (rte_v128u32_t){
 				swp_offsets,
 				cs_flags | (swp_types << 8),
-				0,
+				metadata,
 				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
 			};
 		}
@@ -861,7 +865,7 @@
 	mpw->wqe->eseg.inline_hdr_sz = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
 					     (txq->wqe_ci << 8) |
 					     MLX5_OPCODE_TSO);
@@ -948,6 +952,7 @@
 		uint32_t length;
 		unsigned int segs_n = buf->nb_segs;
 		uint32_t cs_flags;
+		rte_be32_t metadata;
 
 		/*
 		 * Make sure there is enough room to store this packet and
@@ -964,6 +969,9 @@
 		max_elts -= segs_n;
 		--pkts_n;
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
+							     0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		assert(length);
@@ -971,6 +979,7 @@
 		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
 		    ((mpw.len != length) ||
 		     (segs_n != 1) ||
+		     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 		     (mpw.wqe->eseg.cs_flags != cs_flags)))
 			mlx5_mpw_close(txq, &mpw);
 		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
@@ -984,6 +993,7 @@
 			max_wqe -= 2;
 			mlx5_mpw_new(txq, &mpw, length);
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata = metadata;
 		}
 		/* Multi-segment packets must be alone in their MPW. */
 		assert((segs_n == 1) || (mpw.pkts_n == 0));
@@ -1082,7 +1092,7 @@
 	mpw->wqe->eseg.cs_flags = 0;
 	mpw->wqe->eseg.rsvd0 = 0;
 	mpw->wqe->eseg.rsvd1 = 0;
-	mpw->wqe->eseg.rsvd2 = 0;
+	mpw->wqe->eseg.flow_table_metadata = 0;
 	inl = (struct mlx5_wqe_inl_small *)
 		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
 	mpw->data.raw = (uint8_t *)&inl->raw;
@@ -1172,6 +1182,7 @@
 		uint32_t length;
 		unsigned int segs_n = buf->nb_segs;
 		uint8_t cs_flags;
+		rte_be32_t metadata;
 
 		/*
 		 * Make sure there is enough room to store this packet and
@@ -1193,18 +1204,23 @@
 		 */
 		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
+							     0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if packet differs. */
 		if (mpw.state == MLX5_MPW_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
+			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				mlx5_mpw_close(txq, &mpw);
 		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
 			if ((mpw.len != length) ||
 			    (segs_n != 1) ||
 			    (length > inline_room) ||
+			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
 				mlx5_mpw_inline_close(txq, &mpw);
 				inline_room =
@@ -1224,12 +1240,14 @@
 				max_wqe -= 2;
 				mlx5_mpw_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				mpw.wqe->eseg.flow_table_metadata = metadata;
 			} else {
 				if (unlikely(max_wqe < wqe_inl_n))
 					break;
 				max_wqe -= wqe_inl_n;
 				mlx5_mpw_inline_new(txq, &mpw, length);
 				mpw.wqe->eseg.cs_flags = cs_flags;
+				mpw.wqe->eseg.flow_table_metadata = metadata;
 			}
 		}
 		/* Multi-segment packets must be alone in their MPW. */
@@ -1461,6 +1479,7 @@
 		unsigned int do_inline = 0; /* Whether inline is possible. */
 		uint32_t length;
 		uint8_t cs_flags;
+		rte_be32_t metadata;
 
 		/* Multi-segmented packet is handled in slow-path outside. */
 		assert(NB_SEGS(buf) == 1);
@@ -1468,6 +1487,9 @@
 		if (max_elts - j == 0)
 			break;
 		cs_flags = txq_ol_cksum_to_cs(buf);
+		/* Copy metadata from mbuf if valid */
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
+							     0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if:
@@ -1482,6 +1504,7 @@
 			    (length <= txq->inline_max_packet_sz &&
 			     inl_pad + sizeof(inl_hdr) + length >
 			     mpw_room) ||
+			     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
 			    (mpw.wqe->eseg.cs_flags != cs_flags))
 				max_wqe -= mlx5_empw_close(txq, &mpw);
 		}
@@ -1505,6 +1528,7 @@
 				    sizeof(inl_hdr) + length <= mpw_room &&
 				    !txq->mpw_hdr_dseg;
 			mpw.wqe->eseg.cs_flags = cs_flags;
+			mpw.wqe->eseg.flow_table_metadata = metadata;
 		} else {
 			/* Evaluate whether the next packet can be inlined.
 			 * Inlininig is possible when:
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 0a4aed8..1453f4f 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -40,7 +40,8 @@
 #endif
 
 /**
- * Count the number of packets having same ol_flags and calculate cs_flags.
+ * Count the number of packets having same ol_flags and same metadata (if
+ * PKT_TX_METADATA is set in ol_flags), and calculate cs_flags.
  *
  * @param pkts
  *   Pointer to array of packets.
@@ -48,26 +49,45 @@
  *   Number of packets.
  * @param cs_flags
  *   Pointer of flags to be returned.
+ * @param metadata
+ *   Pointer of metadata to be returned.
+ * @param txq_offloads
+ *   Offloads enabled on Tx queue
  *
  * @return
- *   Number of packets having same ol_flags.
+ *   Number of packets having same ol_flags and metadata, if relevant.
  */
 static inline unsigned int
-txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
+txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
+		 rte_be32_t *metadata, const uint64_t txq_offloads)
 {
 	unsigned int pos;
-	const uint64_t ol_mask =
+	const uint64_t cksum_ol_mask =
 		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
 		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
 		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
+	rte_be32_t p0_metadata, pn_metadata;
 
 	if (!pkts_n)
 		return 0;
-	/* Count the number of packets having same ol_flags. */
-	for (pos = 1; pos < pkts_n; ++pos)
-		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
+	p0_metadata = pkts[0]->ol_flags & PKT_TX_METADATA ?
+			pkts[0]->tx_metadata : 0;
+	/* Count the number of packets having same offload parameters. */
+	for (pos = 1; pos < pkts_n; ++pos) {
+		/* Check if packet has same checksum flags. */
+		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
+		    ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & cksum_ol_mask))
 			break;
+		/* Check if packet has same metadata. */
+		if (txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
+			pn_metadata = pkts[pos]->ol_flags & PKT_TX_METADATA ?
+					pkts[pos]->tx_metadata : 0;
+			if (pn_metadata != p0_metadata)
+				break;
+		}
+	}
 	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
+	*metadata = p0_metadata;
 	return pos;
 }
 
@@ -96,7 +116,7 @@
 		uint16_t ret;
 
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
-		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0);
+		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0, 0);
 		nb_tx += ret;
 		if (!ret)
 			break;
@@ -127,6 +147,7 @@
 		uint8_t cs_flags = 0;
 		uint16_t n;
 		uint16_t ret;
+		rte_be32_t metadata = 0;
 
 		/* Transmit multi-seg packets in the head of pkts list. */
 		if ((txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS) &&
@@ -137,9 +158,12 @@
 		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
 		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
 			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
-		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
-			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
-		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
+		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
+				     DEV_TX_OFFLOAD_MATCH_METADATA))
+			n = txq_calc_offload(&pkts[nb_tx], n,
+					     &cs_flags, &metadata,
+					     txq->offloads);
+		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags, metadata);
 		nb_tx += ret;
 		if (!ret)
 			break;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fb884f9..fda7004 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -22,6 +22,7 @@
 /* HW offload capabilities of vectorized Tx. */
 #define MLX5_VEC_TX_OFFLOAD_CAP \
 	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
+	 DEV_TX_OFFLOAD_MATCH_METADATA | \
 	 DEV_TX_OFFLOAD_MULTI_SEGS)
 
 /*
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index b37b738..0b729f1 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -201,13 +201,15 @@
  *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
  * @param cs_flags
  *   Checksum offload flags to be written in the descriptor.
+ * @param metadata
+ *   Metadata value to be written in the descriptor.
  *
  * @return
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static inline uint16_t
 txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
-	    uint8_t cs_flags)
+	    uint8_t cs_flags, rte_be32_t metadata)
 {
 	struct rte_mbuf **elts;
 	uint16_t elts_head = txq->elts_head;
@@ -293,11 +295,8 @@
 	ctrl = vqtbl1q_u8(ctrl, ctrl_shuf_m);
 	vst1q_u8((void *)t_wqe, ctrl);
 	/* Fill ESEG in the header. */
-	vst1q_u8((void *)(t_wqe + 1),
-		 ((uint8x16_t) { 0, 0, 0, 0,
-				 cs_flags, 0, 0, 0,
-				 0, 0, 0, 0,
-				 0, 0, 0, 0 }));
+	vst1q_u32((void *)(t_wqe + 1),
+		 ((uint32x4_t) { 0, cs_flags, metadata, 0 }));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
 #endif
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 54b3783..e0f95f9 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -202,13 +202,15 @@
  *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
  * @param cs_flags
  *   Checksum offload flags to be written in the descriptor.
+ * @param metadata
+ *   Metadata value to be written in the descriptor.
  *
  * @return
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static inline uint16_t
 txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
-	    uint8_t cs_flags)
+	    uint8_t cs_flags, rte_be32_t metadata)
 {
 	struct rte_mbuf **elts;
 	uint16_t elts_head = txq->elts_head;
@@ -292,11 +294,7 @@
 	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
 	_mm_store_si128(t_wqe, ctrl);
 	/* Fill ESEG in the header. */
-	_mm_store_si128(t_wqe + 1,
-			_mm_set_epi8(0, 0, 0, 0,
-				     0, 0, 0, 0,
-				     0, 0, 0, cs_flags,
-				     0, 0, 0, 0));
+	_mm_store_si128(t_wqe + 1, _mm_set_epi32(0, metadata, cs_flags, 0));
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	txq->stats.opackets += pkts_n;
 #endif
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index f9bc473..b01bd67 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -120,7 +120,6 @@
 			offloads |= (DEV_TX_OFFLOAD_IP_TNL_TSO |
 				     DEV_TX_OFFLOAD_UDP_TNL_TSO);
 	}
-
 	if (config->tunnel_en) {
 		if (config->hw_csum)
 			offloads |= DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM;
@@ -128,6 +127,10 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	if (config->dv_flow_en)
+		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
+#endif
 	return offloads;
 }
 
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [PATCH v7] net/mlx5: support metadata as flow rule criteria
  2018-10-23 19:34           ` [dpdk-dev] [PATCH v7] " Dekel Peled
@ 2018-10-24  6:12             ` Shahaf Shuler
  2018-10-24  8:49               ` Ferruh Yigit
  0 siblings, 1 reply; 17+ messages in thread
From: Shahaf Shuler @ 2018-10-24  6:12 UTC (permalink / raw)
  To: Dekel Peled, Yongseok Koh, Ferruh Yigit; +Cc: dev, Ori Kam

Hi Ferruh,

This patch contains a fix for compilation on Arm.
I hoped to replace the existing "support metadata as flow rule criteria" patch with this one before you took it, but I was too late.
Can you please replace the old patch with this one?
Otherwise we will provide a separate fix patch for this issue.


Tuesday, October 23, 2018 10:34 PM, Dekel Peled:
> Subject: [dpdk-dev] [PATCH v7] net/mlx5: support metadata as flow rule
> criteria
> 
> As described in series starting at [1], it adds option to set metadata value as
> match pattern when creating a new flow rule.
> 
> This patch adds metadata support in mlx5 driver, in two parts:
> - Add the validation and setting of metadata value in matcher,
>   when creating a new flow rule.
> - Add the passing of metadata value from mbuf to wqe when
>   indicated by ol_flag, in different burst functions.
> 
> [1] "ethdev: support metadata as flow rule criteria"
> 
> http://mails.dpdk.org/archives/dev/2018-September/113269.html
> 

Acked-by: Shahaf Shuler <shahafs@mellanox.com>

> ---
> v7:
> - Fix use of wrong type.
> v6:
> - Correct indentation.
> - Fix setting data in matcher to include mask.
> v5:
> Apply code review comments:
>  Coding style (indentation, redundant blank lines, clear comments).
>  txq_calc_offload() logic updated.
>  rte_be32_t type used instead of uint32_t.
> v4:
> - Rebase.
> - Apply code review comments.
> v3:
> - Update meta item validation.
> v2:
> - Split the support of egress rules to a different patch.
> ---
> 
> Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c          |   2 +-
>  drivers/net/mlx5/mlx5_flow.h          |   8 +++
>  drivers/net/mlx5/mlx5_flow_dv.c       | 106
> ++++++++++++++++++++++++++++++++++
>  drivers/net/mlx5/mlx5_prm.h           |   2 +-
>  drivers/net/mlx5/mlx5_rxtx.c          |  32 ++++++++--
>  drivers/net/mlx5/mlx5_rxtx_vec.c      |  46 +++++++++++----
>  drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  11 ++--
> drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  10 ++--
>  drivers/net/mlx5/mlx5_txq.c           |   5 +-
>  10 files changed, 193 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index fcabab0..df5c34e 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -418,7 +418,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>   * @return
>   *   0 on success, a negative errno value otherwise and rte_errno is set.
>   */
> -static int
> +int
>  mlx5_flow_item_acceptable(const struct rte_flow_item *item,
>  			  const uint8_t *mask,
>  			  const uint8_t *nic_mask,
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index af0a125..38635c9 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -43,6 +43,9 @@
>  #define MLX5_FLOW_LAYER_GRE (1u << 14)
>  #define MLX5_FLOW_LAYER_MPLS (1u << 15)
> 
> +/* General pattern items bits. */
> +#define MLX5_FLOW_ITEM_METADATA (1u << 16)
> +
>  /* Outer Masks. */
>  #define MLX5_FLOW_LAYER_OUTER_L3 \
>  	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 |
> MLX5_FLOW_LAYER_OUTER_L3_IPV6) @@ -316,6 +319,11 @@ int
> mlx5_flow_validate_action_rss(const struct rte_flow_action *action,  int
> mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
>  				  const struct rte_flow_attr *attributes,
>  				  struct rte_flow_error *error);
> +int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
> +			      const uint8_t *mask,
> +			      const uint8_t *nic_mask,
> +			      unsigned int size,
> +			      struct rte_flow_error *error);
>  int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
>  				uint64_t item_flags,
>  				struct rte_flow_error *error);
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c
> b/drivers/net/mlx5/mlx5_flow_dv.c index 58e3c33..e8f409f 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -36,6 +36,67 @@
>  #ifdef HAVE_IBV_FLOW_DV_SUPPORT
> 
>  /**
> + * Validate META item.
> + *
> + * @param[in] dev
> + *   Pointer to the rte_eth_dev structure.
> + * @param[in] item
> + *   Item specification.
> + * @param[in] attr
> + *   Attributes of flow that includes this item.
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +flow_dv_validate_item_meta(struct rte_eth_dev *dev,
> +			   const struct rte_flow_item *item,
> +			   const struct rte_flow_attr *attr,
> +			   struct rte_flow_error *error)
> +{
> +	const struct rte_flow_item_meta *spec = item->spec;
> +	const struct rte_flow_item_meta *mask = item->mask;
> +	const struct rte_flow_item_meta nic_mask = {
> +		.data = RTE_BE32(UINT32_MAX)
> +	};
> +	int ret;
> +	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
> +
> +	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
> +		return rte_flow_error_set(error, EPERM,
> +					  RTE_FLOW_ERROR_TYPE_ITEM,
> +					  NULL,
> +					  "match on metadata offload "
> +					  "configuration is off for this port");
> +	if (!spec)
> +		return rte_flow_error_set(error, EINVAL,
> +
> RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> +					  item->spec,
> +					  "data cannot be empty");
> +	if (!spec->data)
> +		return rte_flow_error_set(error, EINVAL,
> +
> RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> +					  NULL,
> +					  "data cannot be zero");
> +	if (!mask)
> +		mask = &rte_flow_item_meta_mask;
> +	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
> +					(const uint8_t *)&nic_mask,
> +					sizeof(struct rte_flow_item_meta),
> +					error);
> +	if (ret < 0)
> +		return ret;
> +	if (attr->ingress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +
> RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> +					  NULL,
> +					  "pattern not supported for
> ingress");
> +	return 0;
> +}
> +
> +/**
>   * Verify the @p attributes will be correctly understood by the NIC and store
>   * them in the @p flow if everything is correct.
>   *
> @@ -214,6 +275,13 @@
>  				return ret;
>  			item_flags |= MLX5_FLOW_LAYER_MPLS;
>  			break;
> +		case RTE_FLOW_ITEM_TYPE_META:
> +			ret = flow_dv_validate_item_meta(dev, items, attr,
> +							 error);
> +			if (ret < 0)
> +				return ret;
> +			item_flags |= MLX5_FLOW_ITEM_METADATA;
> +			break;
>  		default:
>  			return rte_flow_error_set(error, ENOTSUP,
> 
> RTE_FLOW_ERROR_TYPE_ITEM,
> @@ -857,6 +925,41 @@
>  }
> 
>  /**
> + * Add META item to matcher
> + *
> + * @param[in, out] matcher
> + *   Flow matcher.
> + * @param[in, out] key
> + *   Flow matcher value.
> + * @param[in] item
> + *   Flow pattern to translate.
> + * @param[in] inner
> + *   Item is inner pattern.
> + */
> +static void
> +flow_dv_translate_item_meta(void *matcher, void *key,
> +			    const struct rte_flow_item *item) {
> +	const struct rte_flow_item_meta *meta_m;
> +	const struct rte_flow_item_meta *meta_v;
> +	void *misc2_m =
> +		MLX5_ADDR_OF(fte_match_param, matcher,
> misc_parameters_2);
> +	void *misc2_v =
> +		MLX5_ADDR_OF(fte_match_param, key,
> misc_parameters_2);
> +
> +	meta_m = (const void *)item->mask;
> +	if (!meta_m)
> +		meta_m = &rte_flow_item_meta_mask;
> +	meta_v = (const void *)item->spec;
> +	if (meta_v) {
> +		MLX5_SET(fte_match_set_misc2, misc2_m,
> metadata_reg_a,
> +			 rte_be_to_cpu_32(meta_m->data));
> +		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
> +			 rte_be_to_cpu_32(meta_v->data & meta_m-
> >data));
> +	}
> +}
> +
> +/**
>   * Update the matcher and the value based the selected item.
>   *
>   * @param[in, out] matcher
> @@ -942,6 +1045,9 @@
>  		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key,
> item,
>  					     inner);
>  		break;
> +	case RTE_FLOW_ITEM_TYPE_META:
> +		flow_dv_translate_item_meta(tmatcher->mask.buf, key,
> item);
> +		break;
>  	default:
>  		break;
>  	}
> diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
> index 69296a0..29742b1 100644
> --- a/drivers/net/mlx5/mlx5_prm.h
> +++ b/drivers/net/mlx5/mlx5_prm.h
> @@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
>  	uint8_t	cs_flags;
>  	uint8_t	rsvd1;
>  	uint16_t mss;
> -	uint32_t rsvd2;
> +	uint32_t flow_table_metadata;
>  	uint16_t inline_hdr_sz;
>  	uint8_t inline_hdr[2];
>  } __rte_aligned(MLX5_WQE_DWORD_SIZE);
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 558e6b6..90a2bf8 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -523,6 +523,7 @@
>  		uint8_t tso = txq->tso_en && (buf->ol_flags &
> PKT_TX_TCP_SEG);
>  		uint32_t swp_offsets = 0;
>  		uint8_t swp_types = 0;
> +		rte_be32_t metadata;
>  		uint16_t tso_segsz = 0;
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  		uint32_t total_length = 0;
> @@ -566,6 +567,9 @@
>  		cs_flags = txq_ol_cksum_to_cs(buf);
>  		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets,
> &swp_types);
>  		raw = ((uint8_t *)(uintptr_t)wqe) + 2 *
> MLX5_WQE_DWORD_SIZE;
> +		/* Copy metadata from mbuf if valid */
> +		metadata = buf->ol_flags & PKT_TX_METADATA ? buf-
> >tx_metadata :
> +							     0;
>  		/* Replace the Ethernet type by the VLAN if necessary. */
>  		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
>  			uint32_t vlan = rte_cpu_to_be_32(0x81000000 | @@
> -781,7 +785,7 @@
>  				swp_offsets,
>  				cs_flags | (swp_types << 8) |
>  				(rte_cpu_to_be_16(tso_segsz) << 16),
> -				0,
> +				metadata,
>  				(ehdr << 16) |
> rte_cpu_to_be_16(tso_header_sz),
>  			};
>  		} else {
> @@ -795,7 +799,7 @@
>  			wqe->eseg = (rte_v128u32_t){
>  				swp_offsets,
>  				cs_flags | (swp_types << 8),
> -				0,
> +				metadata,
>  				(ehdr << 16) |
> rte_cpu_to_be_16(pkt_inline_sz),
>  			};
>  		}
> @@ -861,7 +865,7 @@
>  	mpw->wqe->eseg.inline_hdr_sz = 0;
>  	mpw->wqe->eseg.rsvd0 = 0;
>  	mpw->wqe->eseg.rsvd1 = 0;
> -	mpw->wqe->eseg.rsvd2 = 0;
> +	mpw->wqe->eseg.flow_table_metadata = 0;
>  	mpw->wqe->ctrl[0] = rte_cpu_to_be_32((MLX5_OPC_MOD_MPW
> << 24) |
>  					     (txq->wqe_ci << 8) |
>  					     MLX5_OPCODE_TSO);
> @@ -948,6 +952,7 @@
>  		uint32_t length;
>  		unsigned int segs_n = buf->nb_segs;
>  		uint32_t cs_flags;
> +		rte_be32_t metadata;
> 
>  		/*
>  		 * Make sure there is enough room to store this packet and
> @@ -964,6 +969,9 @@
>  		max_elts -= segs_n;
>  		--pkts_n;
>  		cs_flags = txq_ol_cksum_to_cs(buf);
> +		/* Copy metadata from mbuf if valid */
> +		metadata = buf->ol_flags & PKT_TX_METADATA ? buf-
> >tx_metadata :
> +							     0;
>  		/* Retrieve packet information. */
>  		length = PKT_LEN(buf);
>  		assert(length);
> @@ -971,6 +979,7 @@
>  		if ((mpw.state == MLX5_MPW_STATE_OPENED) &&
>  		    ((mpw.len != length) ||
>  		     (segs_n != 1) ||
> +		     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
>  		     (mpw.wqe->eseg.cs_flags != cs_flags)))
>  			mlx5_mpw_close(txq, &mpw);
>  		if (mpw.state == MLX5_MPW_STATE_CLOSED) { @@ -984,6
> +993,7 @@
>  			max_wqe -= 2;
>  			mlx5_mpw_new(txq, &mpw, length);
>  			mpw.wqe->eseg.cs_flags = cs_flags;
> +			mpw.wqe->eseg.flow_table_metadata = metadata;
>  		}
>  		/* Multi-segment packets must be alone in their MPW. */
>  		assert((segs_n == 1) || (mpw.pkts_n == 0)); @@ -1082,7
> +1092,7 @@
>  	mpw->wqe->eseg.cs_flags = 0;
>  	mpw->wqe->eseg.rsvd0 = 0;
>  	mpw->wqe->eseg.rsvd1 = 0;
> -	mpw->wqe->eseg.rsvd2 = 0;
> +	mpw->wqe->eseg.flow_table_metadata = 0;
>  	inl = (struct mlx5_wqe_inl_small *)
>  		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
>  	mpw->data.raw = (uint8_t *)&inl->raw;
> @@ -1172,6 +1182,7 @@
>  		uint32_t length;
>  		unsigned int segs_n = buf->nb_segs;
>  		uint8_t cs_flags;
> +		rte_be32_t metadata;
> 
>  		/*
>  		 * Make sure there is enough room to store this packet and
> @@ -1193,18 +1204,23 @@
>  		 */
>  		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq-
> >wqe_pi);
>  		cs_flags = txq_ol_cksum_to_cs(buf);
> +		/* Copy metadata from mbuf if valid */
> +		metadata = buf->ol_flags & PKT_TX_METADATA ? buf-
> >tx_metadata :
> +							     0;
>  		/* Retrieve packet information. */
>  		length = PKT_LEN(buf);
>  		/* Start new session if packet differs. */
>  		if (mpw.state == MLX5_MPW_STATE_OPENED) {
>  			if ((mpw.len != length) ||
>  			    (segs_n != 1) ||
> +			    (mpw.wqe->eseg.flow_table_metadata !=
> metadata) ||
>  			    (mpw.wqe->eseg.cs_flags != cs_flags))
>  				mlx5_mpw_close(txq, &mpw);
>  		} else if (mpw.state == MLX5_MPW_INL_STATE_OPENED) {
>  			if ((mpw.len != length) ||
>  			    (segs_n != 1) ||
>  			    (length > inline_room) ||
> +			    (mpw.wqe->eseg.flow_table_metadata != metadata) ||
>  			    (mpw.wqe->eseg.cs_flags != cs_flags)) {
>  				mlx5_mpw_inline_close(txq, &mpw);
>  				inline_room =
> @@ -1224,12 +1240,14 @@
>  				max_wqe -= 2;
>  				mlx5_mpw_new(txq, &mpw, length);
>  				mpw.wqe->eseg.cs_flags = cs_flags;
> +				mpw.wqe->eseg.flow_table_metadata = metadata;
>  			} else {
>  				if (unlikely(max_wqe < wqe_inl_n))
>  					break;
>  				max_wqe -= wqe_inl_n;
>  				mlx5_mpw_inline_new(txq, &mpw, length);
>  				mpw.wqe->eseg.cs_flags = cs_flags;
> +				mpw.wqe->eseg.flow_table_metadata = metadata;
>  			}
>  		}
>  		/* Multi-segment packets must be alone in their MPW. */
> @@ -1461,6 +1479,7 @@
>  		unsigned int do_inline = 0; /* Whether inline is possible. */
>  		uint32_t length;
>  		uint8_t cs_flags;
> +		rte_be32_t metadata;
> 
>  		/* Multi-segmented packet is handled in slow-path outside. */
>  		assert(NB_SEGS(buf) == 1);
> @@ -1468,6 +1487,9 @@
>  		if (max_elts - j == 0)
>  			break;
>  		cs_flags = txq_ol_cksum_to_cs(buf);
> +		/* Copy metadata from mbuf if valid */
> +		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
> +							     0;
>  		/* Retrieve packet information. */
>  		length = PKT_LEN(buf);
>  		/* Start new session if:
> @@ -1482,6 +1504,7 @@
>  			    (length <= txq->inline_max_packet_sz &&
>  			     inl_pad + sizeof(inl_hdr) + length >
>  			     mpw_room) ||
> +			     (mpw.wqe->eseg.flow_table_metadata != metadata) ||
>  			    (mpw.wqe->eseg.cs_flags != cs_flags))
>  				max_wqe -= mlx5_empw_close(txq, &mpw);
>  		}
> @@ -1505,6 +1528,7 @@
>  				    sizeof(inl_hdr) + length <= mpw_room &&
>  				    !txq->mpw_hdr_dseg;
>  			mpw.wqe->eseg.cs_flags = cs_flags;
> +			mpw.wqe->eseg.flow_table_metadata = metadata;
>  		} else {
>  			/* Evaluate whether the next packet can be inlined.
>  			 * Inlining is possible when:
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c
> b/drivers/net/mlx5/mlx5_rxtx_vec.c
> index 0a4aed8..1453f4f 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.c
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
> @@ -40,7 +40,8 @@
>  #endif
> 
>  /**
> - * Count the number of packets having same ol_flags and calculate cs_flags.
> + * Count the number of packets having same ol_flags and same metadata
> + (if
> + * PKT_TX_METADATA is set in ol_flags), and calculate cs_flags.
>   *
>   * @param pkts
>   *   Pointer to array of packets.
> @@ -48,26 +49,45 @@
>   *   Number of packets.
>   * @param cs_flags
>   *   Pointer of flags to be returned.
> + * @param metadata
> + *   Pointer of metadata to be returned.
> + * @param txq_offloads
> + *   Offloads enabled on Tx queue
>   *
>   * @return
> - *   Number of packets having same ol_flags.
> + *   Number of packets having same ol_flags and metadata, if relevant.
>   */
>  static inline unsigned int
> -txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
> +txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
> +		 rte_be32_t *metadata, const uint64_t txq_offloads)
>  {
>  	unsigned int pos;
> -	const uint64_t ol_mask =
> +	const uint64_t cksum_ol_mask =
>  		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
>  		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
>  		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
> +	rte_be32_t p0_metadata, pn_metadata;
> 
>  	if (!pkts_n)
>  		return 0;
> -	/* Count the number of packets having same ol_flags. */
> -	for (pos = 1; pos < pkts_n; ++pos)
> -		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
> +	p0_metadata = pkts[0]->ol_flags & PKT_TX_METADATA ?
> +			pkts[0]->tx_metadata : 0;
> +	/* Count the number of packets having same offload parameters. */
> +	for (pos = 1; pos < pkts_n; ++pos) {
> +		/* Check if packet has same checksum flags. */
> +		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
> +		    ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & cksum_ol_mask))
>  			break;
> +		/* Check if packet has same metadata. */
> +		if (txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
> +			pn_metadata = pkts[pos]->ol_flags & PKT_TX_METADATA ?
> +					pkts[pos]->tx_metadata : 0;
> +			if (pn_metadata != p0_metadata)
> +				break;
> +		}
> +	}
>  	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
> +	*metadata = p0_metadata;
>  	return pos;
>  }
> 
> @@ -96,7 +116,7 @@
>  		uint16_t ret;
> 
>  		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
> -		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0);
> +		ret = txq_burst_v(txq, &pkts[nb_tx], n, 0, 0);
>  		nb_tx += ret;
>  		if (!ret)
>  			break;
> @@ -127,6 +147,7 @@
>  		uint8_t cs_flags = 0;
>  		uint16_t n;
>  		uint16_t ret;
> +		rte_be32_t metadata = 0;
> 
>  		/* Transmit multi-seg packets in the head of pkts list. */
>  		if ((txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS) &&
> @@ -137,9 +158,12 @@
>  		n = RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
>  		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
>  			n = txq_count_contig_single_seg(&pkts[nb_tx], n);
> -		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
> -			n = txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
> -		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
> +		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
> +				     DEV_TX_OFFLOAD_MATCH_METADATA))
> +			n = txq_calc_offload(&pkts[nb_tx], n,
> +					     &cs_flags, &metadata,
> +					     txq->offloads);
> +		ret = txq_burst_v(txq, &pkts[nb_tx], n, cs_flags, metadata);
>  		nb_tx += ret;
>  		if (!ret)
>  			break;
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h
> b/drivers/net/mlx5/mlx5_rxtx_vec.h
> index fb884f9..fda7004 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> @@ -22,6 +22,7 @@
>  /* HW offload capabilities of vectorized Tx. */
>  #define MLX5_VEC_TX_OFFLOAD_CAP \
>  	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
> +	 DEV_TX_OFFLOAD_MATCH_METADATA | \
>  	 DEV_TX_OFFLOAD_MULTI_SEGS)
> 
>  /*
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> index b37b738..0b729f1 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> @@ -201,13 +201,15 @@
>   *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
>   * @param cs_flags
>   *   Checksum offload flags to be written in the descriptor.
> + * @param metadata
> + *   Metadata value to be written in the descriptor.
>   *
>   * @return
>   *   Number of packets successfully transmitted (<= pkts_n).
>   */
>  static inline uint16_t
>  txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
> -	    uint8_t cs_flags)
> +	    uint8_t cs_flags, rte_be32_t metadata)
>  {
>  	struct rte_mbuf **elts;
>  	uint16_t elts_head = txq->elts_head;
> @@ -293,11 +295,8 @@
>  	ctrl = vqtbl1q_u8(ctrl, ctrl_shuf_m);
>  	vst1q_u8((void *)t_wqe, ctrl);
>  	/* Fill ESEG in the header. */
> -	vst1q_u8((void *)(t_wqe + 1),
> -		 ((uint8x16_t) { 0, 0, 0, 0,
> -				 cs_flags, 0, 0, 0,
> -				 0, 0, 0, 0,
> -				 0, 0, 0, 0 }));
> +	vst1q_u32((void *)(t_wqe + 1),
> +		 ((uint32x4_t) { 0, cs_flags, metadata, 0 }));
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  	txq->stats.opackets += pkts_n;
>  #endif
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> index 54b3783..e0f95f9 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> @@ -202,13 +202,15 @@
>   *   Number of packets to be sent (<= MLX5_VPMD_TX_MAX_BURST).
>   * @param cs_flags
>   *   Checksum offload flags to be written in the descriptor.
> + * @param metadata
> + *   Metadata value to be written in the descriptor.
>   *
>   * @return
>   *   Number of packets successfully transmitted (<= pkts_n).
>   */
>  static inline uint16_t
>  txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
> -	    uint8_t cs_flags)
> +	    uint8_t cs_flags, rte_be32_t metadata)
>  {
>  	struct rte_mbuf **elts;
>  	uint16_t elts_head = txq->elts_head;
> @@ -292,11 +294,7 @@
>  	ctrl = _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
>  	_mm_store_si128(t_wqe, ctrl);
>  	/* Fill ESEG in the header. */
> -	_mm_store_si128(t_wqe + 1,
> -			_mm_set_epi8(0, 0, 0, 0,
> -				     0, 0, 0, 0,
> -				     0, 0, 0, cs_flags,
> -				     0, 0, 0, 0));
> +	_mm_store_si128(t_wqe + 1, _mm_set_epi32(0, metadata, cs_flags, 0));
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  	txq->stats.opackets += pkts_n;
>  #endif
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index f9bc473..b01bd67 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -120,7 +120,6 @@
>  			offloads |= (DEV_TX_OFFLOAD_IP_TNL_TSO |
>  				     DEV_TX_OFFLOAD_UDP_TNL_TSO);
>  	}
> -
>  	if (config->tunnel_en) {
>  		if (config->hw_csum)
>  			offloads |= DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM;
> @@ -128,6 +127,10 @@
>  			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
>  				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
>  	}
> +#ifdef HAVE_IBV_FLOW_DV_SUPPORT
> +	if (config->dv_flow_en)
> +		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
> +#endif
>  	return offloads;
>  }
> 
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [PATCH v7] net/mlx5: support metadata as flow rule criteria
  2018-10-24  6:12             ` Shahaf Shuler
@ 2018-10-24  8:49               ` Ferruh Yigit
  0 siblings, 0 replies; 17+ messages in thread
From: Ferruh Yigit @ 2018-10-24  8:49 UTC (permalink / raw)
  To: Shahaf Shuler, Dekel Peled, Yongseok Koh; +Cc: dev, Ori Kam

On 10/24/2018 7:12 AM, Shahaf Shuler wrote:
> Hi Ferruh, 
> 
> This patch contains a fix for compilation on ARM.
> I had hoped to replace the existing "support metadata as flow rule criteria" patch with this one before you took it, but I was too late.
> Can you please replace the old patch with this one?

Replaced in repo.

> Otherwise we will provide a separate fix patch for this issue. 
> 
> 
> Tuesday, October 23, 2018 10:34 PM, Dekel Peled:
>> Subject: [dpdk-dev] [PATCH v7] net/mlx5: support metadata as flow rule
>> criteria
>>
>> As described in the series starting at [1], this patch adds the option to set
>> a metadata value as a match pattern when creating a new flow rule.
>>
>> This patch adds metadata support in mlx5 driver, in two parts:
>> - Add the validation and setting of metadata value in matcher,
>>   when creating a new flow rule.
>> - Add the passing of metadata value from mbuf to wqe when
>>   indicated by ol_flag, in different burst functions.
>>
>> [1] "ethdev: support metadata as flow rule criteria"
>>
>> http://mails.dpdk.org/archives/dev/2018-September/113269.html
>>
> 
> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
> 
>> ---
>> v7:
>> - Fix use of wrong type.
>> v6:
>> - Correct indentation.
>> - Fix setting data in matcher to include mask.
>> v5:
>> Apply code review comments:
>>  Coding style (indentation, redundant blank lines, clear comments).
>>  txq_calc_offload() logic updated.
>>  rte_be32_t type used instead of uint32_t.
>> v4:
>> - Rebase.
>> - Apply code review comments.
>> v3:
>> - Update meta item validation.
>> v2:
>> - Split the support of egress rules to a different patch.
>> ---
>>
>> Signed-off-by: Dekel Peled <dekelp@mellanox.com>

<...>

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-10-24  8:49 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-16 13:42 [dpdk-dev] [PATCH] net/mlx5: support metadata as flow rule criteria Dekel Peled
2018-09-19  7:21 ` Xueming(Steven) Li
2018-09-27 14:18 ` [dpdk-dev] [PATCH v2] " Dekel Peled
2018-09-29  9:09   ` Yongseok Koh
2018-10-03  5:22     ` Dekel Peled
2018-10-03  7:22       ` Yongseok Koh
2018-10-11 11:19   ` [dpdk-dev] [PATCH v3] " Dekel Peled
2018-10-17 11:53     ` [dpdk-dev] [PATCH v4] " Dekel Peled
2018-10-18  8:00       ` Yongseok Koh
2018-10-21 13:44         ` Dekel Peled
2018-10-21 14:04       ` [dpdk-dev] [PATCH v5] " Dekel Peled
2018-10-22 18:47         ` Yongseok Koh
2018-10-23 10:48         ` [dpdk-dev] [PATCH v6] " Dekel Peled
2018-10-23 12:27           ` Shahaf Shuler
2018-10-23 19:34           ` [dpdk-dev] [PATCH v7] " Dekel Peled
2018-10-24  6:12             ` Shahaf Shuler
2018-10-24  8:49               ` Ferruh Yigit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).