DPDK patches and discussions
* [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
@ 2019-06-03 21:32 Yongseok Koh
  2019-06-03 21:32 ` [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action Yongseok Koh
                   ` (3 more replies)
  0 siblings, 4 replies; 98+ messages in thread
From: Yongseok Koh @ 2019-06-03 21:32 UTC (permalink / raw)
  To: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil,
	olivier.matz
  Cc: dev

Currently, metadata can be set on the egress path via the mbuf tx_metadata
field with the PKT_TX_METADATA flag, and RTE_FLOW_ITEM_TYPE_META matches it.

This patch extends metadata usage in two ways.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There is also a need to support metadata on packet Rx. Metadata can be set by
the SET_META action and matched by the META item, as on Tx. The final value
set by the action is delivered to the application via the mbuf metadata field
with the PKT_RX_METADATA ol_flag.

For this purpose, mbuf->tx_metadata is moved to a separate new field and
renamed to 'metadata' to support both Rx and Tx metadata.
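
As a rough illustration of the Rx side described above, the sketch below
shows how an application might read the delivered value. struct mbuf_sketch
and rx_metadata_get() are hypothetical stand-ins, not DPDK definitions; only
the PKT_RX_METADATA bit position and the 'metadata' field name are taken
from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit position as proposed in this RFC. */
#define PKT_RX_METADATA (1ULL << 23)

/* Hypothetical stand-in for the two rte_mbuf fields involved. */
struct mbuf_sketch {
	uint64_t ol_flags; /* offload flags */
	uint32_t metadata; /* valid only if PKT_RX_METADATA is set */
};

/*
 * Copy the Rx metadata into *value and return 1 if the device reported it;
 * return 0 and leave *value untouched otherwise.
 */
static int
rx_metadata_get(const struct mbuf_sketch *m, uint32_t *value)
{
	assert(m != NULL && value != NULL);
	if (!(m->ol_flags & PKT_RX_METADATA))
		return 0;
	*value = m->metadata;
	return 1;
}
```

The point of the flag check is that the field content is meaningless unless
the device actually reported metadata for that packet.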

For loopback/hairpin packets, metadata set on Rx/Tx may or may not be
propagated to the other path, depending on HW capability.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 app/test-pmd/util.c                   |  2 +-
 doc/guides/prog_guide/rte_flow.rst    | 73 +++++++++++++++++++--------
 drivers/net/mlx5/mlx5_rxtx.c          | 12 ++---
 drivers/net/mlx5/mlx5_rxtx_vec.c      |  4 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  2 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  2 +-
 lib/librte_ethdev/rte_flow.h          | 43 ++++++++++++++--
 lib/librte_mbuf/rte_mbuf.h            | 21 ++++----
 8 files changed, 111 insertions(+), 48 deletions(-)

diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index a1164b7053..6ecc97351f 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -182,7 +182,7 @@ tx_pkt_set_md(uint16_t port_id, __rte_unused uint16_t queue,
 	 * and set ol_flags accordingly.
 	 */
 	for (i = 0; i < nb_pkts; i++) {
-		pkts[i]->tx_metadata = ports[port_id].tx_metadata;
+		pkts[i]->metadata = ports[port_id].tx_metadata;
 		pkts[i]->ol_flags |= PKT_TX_METADATA;
 	}
 	return nb_pkts;
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index a34d012e55..016cd90e52 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META`` action
+sets metadata for a packet and the metadata will be reported via mbuf metadata
+field with PKT_RX_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -1189,27 +1215,6 @@ Normally preceded by any of:
 - `Item: ICMP6_ND_NS`_
 - `Item: ICMP6_ND_OPT`_
 
-Item: ``META``
-^^^^^^^^^^^^^^
-
-Matches an application specific 32 bit metadata item.
-
-- Default ``mask`` matches the specified metadata value.
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+--------------------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Actions
 ~~~~~~~
 
@@ -2345,6 +2350,32 @@ Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
    | ``mac_addr`` | MAC address   |
    +--------------+---------------+
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by mbuf
+metadata field with PKT_RX_METADATA flag if set.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 7174ffc91c..19b4a2567b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -626,8 +626,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Replace the Ethernet type by the VLAN if necessary. */
 		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
@@ -1029,8 +1028,7 @@ mlx5_tx_burst_mpw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		--pkts_n;
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		assert(length);
@@ -1264,8 +1262,7 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if packet differs. */
@@ -1547,8 +1544,7 @@ txq_burst_empw(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 			break;
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if:
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 9a3a5ae437..9f99c8cb03 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -71,7 +71,7 @@ txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
 	if (!pkts_n)
 		return 0;
 	p0_metadata = pkts[0]->ol_flags & PKT_TX_METADATA ?
-			pkts[0]->tx_metadata : 0;
+		      pkts[0]->metadata : 0;
 	/* Count the number of packets having same offload parameters. */
 	for (pos = 1; pos < pkts_n; ++pos) {
 		/* Check if packet has same checksum flags. */
@@ -81,7 +81,7 @@ txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
 		/* Check if packet has same metadata. */
 		if (txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
 			pn_metadata = pkts[pos]->ol_flags & PKT_TX_METADATA ?
-					pkts[pos]->tx_metadata : 0;
+				      pkts[pos]->metadata : 0;
 			if (pn_metadata != p0_metadata)
 				break;
 		}
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index b2cc710887..c54914e97a 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -131,7 +131,7 @@ txq_scatter_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 		uint8x16_t ctrl;
 		rte_be32_t metadata =
 			metadata_ol && (buf->ol_flags & PKT_TX_METADATA) ?
-			buf->tx_metadata : 0;
+			buf->metadata : 0;
 
 		assert(segs_n);
 		max_elts = elts_n - (elts_head - txq->elts_tail);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index dce3ee4b40..3de640a2fd 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -129,7 +129,7 @@ txq_scatter_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 		__m128i ctrl;
 		rte_be32_t metadata =
 			metadata_ol && (buf->ol_flags & PKT_TX_METADATA) ?
-			buf->tx_metadata : 0;
+			buf->metadata : 0;
 
 		assert(segs_n);
 		max_elts = elts_n - (elts_head - txq->elts_tail);
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f3a8fb103f..cda8628183 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -417,7 +417,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1164,9 +1165,16 @@ rte_flow_item_icmp6_nd_opt_tla_eth_mask = {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * Matches a specified metadata value.
+ * RTE_FLOW_ITEM_TYPE_META
+ *
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * field with PKT_RX_METADATA flag.
  */
 struct rte_flow_item_meta {
 	rte_be32_t data;
@@ -1650,6 +1658,13 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_mac.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_MAC_DST,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2131,6 +2146,28 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf metadata field with PKT_TX_METADATA flag
+ * on egress will be overridden by this action. On ingress, the metadata will be
+ * carried by mbuf metadata field with PKT_RX_METADATA flag if set.
+ *
+ * Altering partial bits is supported with mask. For bits which have never been
+ * set, unpredictable value will be seen depending on driver implementation. For
+ * loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated
+ * to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	rte_be32_t data;
+	rte_be32_t mask;
+};
+
 /*
  * Definition of a single action.
  *
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index e4c2da6ee6..60f2b553e6 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -200,6 +200,11 @@ extern "C" {
 
 /* add new RX flags here */
 
+/**
+ * Indicate that mbuf has metadata from device.
+ */
+#define PKT_RX_METADATA	(1ULL << 23)
+
 /* add new TX flags here */
 
 /**
@@ -648,17 +653,6 @@ struct rte_mbuf {
 			/**< User defined tags. See rte_distributor_process() */
 			uint32_t usr;
 		} hash;                   /**< hash information */
-		struct {
-			/**
-			 * Application specific metadata value
-			 * for egress flow rule match.
-			 * Valid if PKT_TX_METADATA is set.
-			 * Located here to allow conjunct use
-			 * with hash.sched.hi.
-			 */
-			uint32_t tx_metadata;
-			uint32_t reserved;
-		};
 	};
 
 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
@@ -725,6 +719,11 @@ struct rte_mbuf {
 	 */
 	struct rte_mbuf_ext_shared_info *shinfo;
 
+	/** Application specific metadata value for flow rule match.
+	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
+	 */
+	uint32_t metadata;
+
 } __rte_cache_aligned;
 
 /**
-- 
2.21.0



* [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action
  2019-06-03 21:32 [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Yongseok Koh
@ 2019-06-03 21:32 ` Yongseok Koh
  2019-06-06 10:35   ` Jerin Jacob Kollanukkaran
  2019-06-03 21:32 ` [dpdk-dev] [RFC 3/3] ethdev: add flow tag Yongseok Koh
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 98+ messages in thread
From: Yongseok Koh @ 2019-06-03 21:32 UTC (permalink / raw)
  To: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil,
	olivier.matz
  Cc: dev

The mark ID can be modified when supporting multiple tables. Partial bit
alteration is supported to preserve bit-fields set by a previous match.
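
One plausible reading of the partial alteration, sketched in plain C: bits
selected by the mask take the new value and all other bits keep whatever was
there before. modify_mark() and modify_mark_example() are hypothetical
helpers for illustration; the RFC does not spell out the formula, and the
actual behavior is up to the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of MODIFY_MARK: replace only the bits of the current
 * mark ID selected by mask, keeping the remaining bits as they were.
 */
static uint32_t
modify_mark(uint32_t cur, uint32_t id, uint32_t mask)
{
	return (cur & ~mask) | (id & mask);
}

/* Only bits 8..15 change; the other 24 bits of the old mark survive. */
static void
modify_mark_example(void)
{
	assert(modify_mark(0x11223344, 0x0000ab00, 0x0000ff00) == 0x1122ab44);
}
```

Under this model, a mask of 0xffffffff makes the result independent of the
previous mark value, which may matter when no earlier MARK has been applied.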

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 doc/guides/prog_guide/rte_flow.rst | 21 +++++++++++++++++++++
 lib/librte_ethdev/rte_flow.h       | 24 ++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 016cd90e52..2907edfff4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1463,6 +1463,27 @@ depends on the underlying implementation. It is returned in the
    | ``id`` | integer value to return with packets |
    +--------+--------------------------------------+
 
+Action: ``MODIFY_MARK``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Alter partial bits of mark ID set by ``MARK`` action.
+
+``mask`` indicates which bits are modified. For bits which have never been set
+by ``MARK`` or ``MODIFY_MARK``, unpredictable value will be seen depending on
+driver implementation.
+
+.. _table_rte_flow_action_modify_mark:
+
+.. table:: MODIFY_MARK
+
+   +----------+--------------------------------------+
+   | Field    | Value                                |
+   +==========+======================================+
+   | ``id``   | integer value to return with packets |
+   +----------+--------------------------------------+
+   | ``mask`` | bit-mask applies to "id"             |
+   +----------+--------------------------------------+
+
 Action: ``FLAG``
 ^^^^^^^^^^^^^^^^
 
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index cda8628183..d811f8a06e 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -1316,6 +1316,13 @@ enum rte_flow_action_type {
 	 */
 	RTE_FLOW_ACTION_TYPE_MARK,
 
+	/**
+	 * Alter partial bits of mark ID set by RTE_FLOW_ACTION_TYPE_MARK.
+	 *
+	 * See struct rte_flow_action_modify_mark.
+	 */
+	RTE_FLOW_ACTION_TYPE_MODIFY_MARK,
+
 	/**
 	 * Flags packets. Similar to MARK without a specific value; only
 	 * sets the PKT_RX_FDIR mbuf flag.
@@ -1681,6 +1688,23 @@ struct rte_flow_action_mark {
 	uint32_t id; /**< Integer value to return with packets. */
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_MODIFY_MARK
+ *
+ * Alter partial bits of mark ID set by RTE_FLOW_ACTION_TYPE_MARK.
+ *
+ * Provided mask indicates which bits are modified. For bits which have never
+ * been set by mark action or modify_mark action, unpredictable value will be
+ * seen depending on driver implementation.
+ */
+struct rte_flow_action_modify_mark {
+	uint32_t id; /**< Integer value to return with packets. */
+	uint32_t mask; /**< Mask of bits to modify. */
+};
+
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
-- 
2.21.0



* [dpdk-dev] [RFC 3/3] ethdev: add flow tag
  2019-06-03 21:32 [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Yongseok Koh
  2019-06-03 21:32 ` [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action Yongseok Koh
@ 2019-06-03 21:32 ` Yongseok Koh
  2019-07-04 23:23   ` [dpdk-dev] [PATCH] " Yongseok Koh
  2019-06-09 14:23 ` [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Andrew Rybchenko
  2019-07-04 23:21 ` [dpdk-dev] [PATCH] " Yongseok Koh
  3 siblings, 1 reply; 98+ messages in thread
From: Yongseok Koh @ 2019-06-03 21:32 UTC (permalink / raw)
  To: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil,
	olivier.matz
  Cc: dev

A tag is transient data which can be used during flow matching. It can be
used to store the match result from a previous table so that the same pattern
need not be matched again in the next table. Even if the outer header is
decapsulated by the previous match, the match result can be kept.

Some devices expose internal registers of their flow processing pipeline, and
those registers are quite useful for stateful connection tracking as they
keep the status of flow matching. Multiple tags are supported by specifying
an index.

Example testpmd commands are:

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xcc00 mask 0xff00 /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
            eth ... / end
    actions ... jump group 2 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
            tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end

  flow create 0 ingress group 2
    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end
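
The intended interplay of the set_tag action and the tag item in the commands
above can be modeled roughly as follows. This is only a sketch under
assumptions: struct tag_state, set_tag(), match_tag() and NB_TAGS are
hypothetical names, and the masked update and masked compare are the
conventional reading of "value" and "mask", not something the RFC defines
formally.

```c
#include <assert.h>
#include <stdint.h>

#define NB_TAGS 4 /* assumed number of tag registers, device specific */

/* Hypothetical per-packet tag state carried between tables. */
struct tag_state {
	uint32_t tag[NB_TAGS];
};

/* set_tag action: overwrite only the bits of tag[index] selected by mask. */
static void
set_tag(struct tag_state *s, uint8_t index, uint32_t value, uint32_t mask)
{
	assert(index < NB_TAGS);
	s->tag[index] = (s->tag[index] & ~mask) | (value & mask);
}

/* tag item: compare tag[index] with spec under mask. */
static int
match_tag(const struct tag_state *s, uint8_t index, uint32_t spec,
	  uint32_t mask)
{
	assert(index < NB_TAGS);
	return (s->tag[index] & mask) == (spec & mask);
}
```

In this model, a packet hitting the first group 0 rule gets tag 2 and tag 3
set, then matches the first group 1 rule (tag 2 is 0xaa00bb under mask
0xffff00ff) but not the second one (0xcc00 under mask 0xff00), regardless of
what the unmasked bits contained beforehand.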

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 doc/guides/prog_guide/rte_flow.rst | 50 +++++++++++++++++++++++++++
 lib/librte_ethdev/rte_flow.h       | 54 ++++++++++++++++++++++++++++++
 2 files changed, 104 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 2907edfff4..f6ef4305b4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -684,6 +684,34 @@ field with PKT_RX_METADATA flag.
    | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
    +----------+----------+---------------------------------------+
 
+Item: ``TAG``
+^^^^^^^^^^^^^
+
+Matches tag item set by other flows. Multiple tags are supported by specifying
+``index``.
+
+- Default ``mask`` matches the specified tag value and index.
+
+.. _table_rte_flow_item_tag:
+
+.. table:: TAG
+
+   +----------+-----------+---------------------------------------+
+   | Field    | Subfield  | Value                                 |
+   +==========+===========+=======================================+
+   | ``spec`` | ``data``  | 32 bit flow tag value                 |
+   |          +-----------+---------------------------------------+
+   |          | ``index`` | index of flow tag                     |
+   +----------+-----------+---------------------------------------+
+   | ``last`` | ``data``  | upper range value                     |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+   | ``mask`` | ``data``  | bit-mask applies to "spec" and "last" |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2397,6 +2425,28 @@ the other path depending on HW capability.
    | ``mask`` | bit-mask applies to "data" |
    +----------+----------------------------+
 
+Action: ``SET_TAG``
+^^^^^^^^^^^^^^^^^^^
+
+Set Tag.
+
+Tag is a transient data used during flow matching. This is not delivered to
+application. Multiple tags are supported by specifying index.
+
+.. _table_rte_flow_action_set_tag:
+
+.. table:: SET_TAG
+
+   +-----------+----------------------------+
+   | Field     | Value                      |
+   +===========+============================+
+   | ``data``  | 32 bit tag value           |
+   +-----------+----------------------------+
+   | ``mask``  | bit-mask applies to "data" |
+   +-----------+----------------------------+
+   | ``index`` | index of tag to set        |
+   +-----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index d811f8a06e..5ee2bc95c6 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -422,6 +422,15 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
+
+	/**
+	 * [META]
+	 *
+	 * Matches a tag value.
+	 *
+	 * See struct rte_flow_item_tag.
+	 */
+	RTE_FLOW_ITEM_TYPE_TAG,
 };
 
 /**
@@ -1187,6 +1196,27 @@ static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_TAG
+ *
+ * Matches a specified tag value at the specified index.
+ */
+struct rte_flow_item_tag {
+	uint32_t data;
+	uint8_t index;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_TAG. */
+#ifndef __cplusplus
+static const struct rte_flow_item_tag rte_flow_item_tag_mask = {
+	.data = 0xffffffff,
+	.index = 0xff,
+};
+#endif
+
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
@@ -1672,6 +1702,15 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_meta.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_META,
+
+	/**
+	 * Set Tag.
+	 *
+	 * Tag is not delivered to application.
+	 *
+	 * See struct rte_flow_action_set_tag.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_TAG,
 };
 
 /**
@@ -2192,6 +2231,21 @@ struct rte_flow_action_set_meta {
 	rte_be32_t mask;
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_TAG
+ *
+ * Set a tag which is a transient data used during flow matching. This is not
+ * delivered to application. Multiple tags are supported by specifying index.
+ */
+struct rte_flow_action_set_tag {
+	uint32_t data;
+	uint32_t mask;
+	uint8_t index;
+};
+
 /*
  * Definition of a single action.
  *
-- 
2.21.0



* Re: [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action
  2019-06-03 21:32 ` [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action Yongseok Koh
@ 2019-06-06 10:35   ` Jerin Jacob Kollanukkaran
  2019-06-06 18:33     ` Yongseok Koh
  0 siblings, 1 reply; 98+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-06-06 10:35 UTC (permalink / raw)
  To: Yongseok Koh, shahafs, thomas, ferruh.yigit, arybchenko,
	adrien.mazarguil, olivier.matz
  Cc: dev

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Yongseok Koh
> Sent: Tuesday, June 4, 2019 3:03 AM
> To: shahafs@mellanox.com; thomas@monjalon.net; ferruh.yigit@intel.com;
> arybchenko@solarflare.com; adrien.mazarguil@6wind.com;
> olivier.matz@6wind.com
> Cc: dev@dpdk.org
> Subject: [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action
> 
> Mark ID can be modified when supporting multiple tables. Partial bit
> alteration is supported to preserve some bit-fields set by previous match.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> ---
>  doc/guides/prog_guide/rte_flow.rst | 21 +++++++++++++++++++++
>  lib/librte_ethdev/rte_flow.h       | 24 ++++++++++++++++++++++++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/doc/guides/prog_guide/rte_flow.rst
> b/doc/guides/prog_guide/rte_flow.rst
> index 016cd90e52..2907edfff4 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -1463,6 +1463,27 @@ depends on the underlying implementation. It is
> returned in the
>     | ``id`` | integer value to return with packets |
>     +--------+--------------------------------------+
> 
> +Action: ``MODIFY_MARK``
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Alter partial bits of mark ID set by ``MARK`` action.
> +
> +``mask`` indicates which bits are modified. For bits which have never
> +been set by ``MARK`` or ``MODIFY_MARK``, unpredictable value will be
> +seen depending on driver implementation.
> +
> +.. _table_rte_flow_action_modify_mark:
> +
> +.. table:: MODIFY_MARK
> +
> +   +----------+--------------------------------------+
> +   | Field    | Value                                |
> +   +==========+======================================+
> +   | ``id``   | integer value to return with packets |
> +   +----------+--------------------------------------+
> +   | ``mask`` | bit-mask applies to "id"             |
> +   +----------+--------------------------------------+
> +
>  Action: ``FLAG``
>  ^^^^^^^^^^^^^^^^
> 
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h index
> cda8628183..d811f8a06e 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -1316,6 +1316,13 @@ enum rte_flow_action_type {
>  	 */
>  	RTE_FLOW_ACTION_TYPE_MARK,
> 
> +	/**
> +	 * Alter partial bits of mark ID set by
> RTE_FLOW_ACTION_TYPE_MARK.
> +	 *
> +	 * See struct rte_flow_action_modify_mark.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_MODIFY_MARK,
> +

I think we need to define the case where the application calls MODIFY_MARK on a given pattern before MARK.
I think we can either:
# Introduce an error number for that?
# Treat the first MODIFY_MARK as MARK

Just to understand: in the absence of this new action, an application needs
to destroy the given pattern with its associated existing MARK action and
add the same pattern with the updated value as the MARK action? Right?




* Re: [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action
  2019-06-06 10:35   ` Jerin Jacob Kollanukkaran
@ 2019-06-06 18:33     ` Yongseok Koh
  0 siblings, 0 replies; 98+ messages in thread
From: Yongseok Koh @ 2019-06-06 18:33 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran
  Cc: Shahaf Shuler, Thomas Monjalon, ferruh.yigit, arybchenko,
	Adrien Mazarguil, olivier.matz, dev


> On Jun 6, 2019, at 3:35 AM, Jerin Jacob Kollanukkaran <jerinj@marvell.com> wrote:
> 
>> -----Original Message-----
>> From: dev <dev-bounces@dpdk.org> On Behalf Of Yongseok Koh
>> Sent: Tuesday, June 4, 2019 3:03 AM
>> To: shahafs@mellanox.com; thomas@monjalon.net; ferruh.yigit@intel.com;
>> arybchenko@solarflare.com; adrien.mazarguil@6wind.com;
>> olivier.matz@6wind.com
>> Cc: dev@dpdk.org
>> Subject: [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action
>> 
>> Mark ID can be modified when supporting multiple tables. Partial bit
>> alteration is supported to preserve some bit-fields set by previous match.
>> 
>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>> ---
>> doc/guides/prog_guide/rte_flow.rst | 21 +++++++++++++++++++++
>> lib/librte_ethdev/rte_flow.h       | 24 ++++++++++++++++++++++++
>> 2 files changed, 45 insertions(+)
>> 
>> diff --git a/doc/guides/prog_guide/rte_flow.rst
>> b/doc/guides/prog_guide/rte_flow.rst
>> index 016cd90e52..2907edfff4 100644
>> --- a/doc/guides/prog_guide/rte_flow.rst
>> +++ b/doc/guides/prog_guide/rte_flow.rst
>> @@ -1463,6 +1463,27 @@ depends on the underlying implementation. It is
>> returned in the
>>    | ``id`` | integer value to return with packets |
>>    +--------+--------------------------------------+
>> 
>> +Action: ``MODIFY_MARK``
>> +^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> +Alter partial bits of mark ID set by ``MARK`` action.
>> +
>> +``mask`` indicates which bits are modified. For bits which have never
>> +been set by ``MARK`` or ``MODIFY_MARK``, unpredictable value will be
>> +seen depending on driver implementation.
>> +
>> +.. _table_rte_flow_action_modify_mark:
>> +
>> +.. table:: MODIFY_MARK
>> +
>> +   +----------+--------------------------------------+
>> +   | Field    | Value                                |
>> +   +==========+======================================+
>> +   | ``id``   | integer value to return with packets |
>> +   +----------+--------------------------------------+
>> +   | ``mask`` | bit-mask applies to "id"             |
>> +   +----------+--------------------------------------+
>> +
>> Action: ``FLAG``
>> ^^^^^^^^^^^^^^^^
>> 
>> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h index
>> cda8628183..d811f8a06e 100644
>> --- a/lib/librte_ethdev/rte_flow.h
>> +++ b/lib/librte_ethdev/rte_flow.h
>> @@ -1316,6 +1316,13 @@ enum rte_flow_action_type {
>> 	 */
>> 	RTE_FLOW_ACTION_TYPE_MARK,
>> 
>> +	/**
>> +	 * Alter partial bits of mark ID set by
>> RTE_FLOW_ACTION_TYPE_MARK.
>> +	 *
>> +	 * See struct rte_flow_action_modify_mark.
>> +	 */
>> +	RTE_FLOW_ACTION_TYPE_MODIFY_MARK,
>> +
> 
> I think, we need to define the case where application calls MODIFY_MARK first on given pattern before MARK

Good input. 

> I think, either we can 
> # Introduce an error number for that?

Practically, it would be impossible to keep track of the MARK action to check whether it was set prior to MODIFY_MARK.
When creating flows with multiple tables, we can't say whether a flow having a MODIFY_MARK action will have a prior MARK action or not.

> # Treat first MODIFY_MARK as MARK

So, I took a similar approach.
As the documentation above says, unset bits will have arbitrary values depending on the driver/device implementation.
Users can't assume the mark ID is initially zeroed; they need to check this with their vendors.

> Just to understand, in this absence of this new action, an application needs
> to destroy the given pattern with associated  existing MARK action and
> add the same pattern with updated value as MARK action? Right?

The application would have to override it with a second MARK action.
But the user has to validate that anyway, to check whether the device allows the override.

Thanks,
Yongseok




* Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
  2019-06-03 21:32 [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Yongseok Koh
  2019-06-03 21:32 ` [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action Yongseok Koh
  2019-06-03 21:32 ` [dpdk-dev] [RFC 3/3] ethdev: add flow tag Yongseok Koh
@ 2019-06-09 14:23 ` Andrew Rybchenko
  2019-06-10  3:19   ` Wang, Haiyue
  2019-07-04 23:21 ` [dpdk-dev] [PATCH] " Yongseok Koh
  3 siblings, 1 reply; 98+ messages in thread
From: Andrew Rybchenko @ 2019-06-09 14:23 UTC (permalink / raw)
  To: Yongseok Koh, shahafs, thomas, ferruh.yigit, adrien.mazarguil,
	olivier.matz
  Cc: dev

On 6/4/19 12:32 AM, Yongseok Koh wrote:
> Currently, metadata can be set on egress path via mbuf tx_meatadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
>
> This patch extends the usability.
>
> 1) RTE_FLOW_ACTION_TYPE_SET_META
>
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
>
> 2) Metadata on ingress
>
> There's also need to support metadata on packet Rx. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via mbuf metadata field with
> PKT_RX_METADATA ol_flag.
>
> For this purpose, mbuf->tx_metadata is moved as a separate new field and
> renamed to 'metadata' to support both Rx and Tx metadata.
>
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on HW capability.
>
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

There is a mark on Rx which is delivered to the application in hash.fdir.hi.
Why do we need one more 32-bit value set by the NIC and delivered to the
application?
What is the difference between MARK and META on Rx?
When should an application use MARK and when META?
Are there cases where both could be necessary?

Moreover, the third patch adds 32-bit tags which are not delivered to the
application. Maybe META/MARK should simply be a kind of TAG (e.g. with
index 0 or marked using an additional attribute) which is delivered to the
application?

(It is either API breakage (if tx_metadata is removed) or ABI breakage
if metadata and tx_metadata share a new location after shinfo).

Andrew.



* Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
  2019-06-09 14:23 ` [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Andrew Rybchenko
@ 2019-06-10  3:19   ` Wang, Haiyue
  2019-06-10  7:20     ` Andrew Rybchenko
  0 siblings, 1 reply; 98+ messages in thread
From: Wang, Haiyue @ 2019-06-10  3:19 UTC (permalink / raw)
  To: Andrew Rybchenko, Yongseok Koh, shahafs, thomas, Yigit, Ferruh,
	adrien.mazarguil, olivier.matz
  Cc: dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrew Rybchenko
> Sent: Sunday, June 9, 2019 22:24
> To: Yongseok Koh <yskoh@mellanox.com>; shahafs@mellanox.com; thomas@monjalon.net; Yigit, Ferruh
> <ferruh.yigit@intel.com>; adrien.mazarguil@6wind.com; olivier.matz@6wind.com
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
> 
> On 6/4/19 12:32 AM, Yongseok Koh wrote:
> > Currently, metadata can be set on egress path via mbuf tx_meatadata field
> > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> >
> > This patch extends the usability.
> >
> > 1) RTE_FLOW_ACTION_TYPE_SET_META
> >
> > When supporting multiple tables, Tx metadata can also be set by a rule and
> > matched by another rule. This new action allows metadata to be set as a
> > result of flow match.
> >
> > 2) Metadata on ingress
> >
> > There's also need to support metadata on packet Rx. Metadata can be set by
> > SET_META action and matched by META item like Tx. The final value set by
> > the action will be delivered to application via mbuf metadata field with
> > PKT_RX_METADATA ol_flag.
> >
> > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > renamed to 'metadata' to support both Rx and Tx metadata.
> >
> > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > propagated to the other path depending on HW capability.
> >
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> 
> There is a mark on Rx which is delivered to application in hash.fdir.hi.
> Why do we need one more 32-bit value set by NIC and delivered to
> application?
> What is the difference between MARK and META on Rx?
> When application should use MARK and when META?
> Is there cases when both could be necessary?
> 
In my understanding, MARK is an FDIR-related thing, while META seems to be NIC
specific. We also need this kind of specific data field to export the
NIC's data to the application.

> Moreover, the third patch adds 32-bit tags which are not delivered to
> application. May be META/MARK should be simply a kind of TAG (e.g. with
> index 0 or marked using additional attribute) which is delivered to
> application?
> 
> (It is either API breakage (if tx_metadata is removed) or ABI breakage
> if metadata and tx_metadata will share new location after shinfo).
> 
Could we make use of udata64 to export NIC metadata to the application?
	RTE_STD_C11
	union {
		void *userdata;   /**< Can be used for external metadata */
		uint64_t udata64; /**< Allow 8-byte userdata on 32-bit */
		uint64_t rx_metadata;
	};
> Andrew.



* Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
  2019-06-10  3:19   ` Wang, Haiyue
@ 2019-06-10  7:20     ` Andrew Rybchenko
  2019-06-11  0:06       ` Yongseok Koh
  0 siblings, 1 reply; 98+ messages in thread
From: Andrew Rybchenko @ 2019-06-10  7:20 UTC (permalink / raw)
  To: Wang, Haiyue, Yongseok Koh, shahafs, thomas, Yigit, Ferruh,
	adrien.mazarguil, olivier.matz
  Cc: dev, Ananyev, Konstantin

On 6/10/19 6:19 AM, Wang, Haiyue wrote:
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrew Rybchenko
>> Sent: Sunday, June 9, 2019 22:24
>> To: Yongseok Koh <yskoh@mellanox.com>; shahafs@mellanox.com; thomas@monjalon.net; Yigit, Ferruh
>> <ferruh.yigit@intel.com>; adrien.mazarguil@6wind.com; olivier.matz@6wind.com
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
>>
>> On 6/4/19 12:32 AM, Yongseok Koh wrote:
>>> Currently, metadata can be set on egress path via mbuf tx_meatadata field
>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
>>>
>>> This patch extends the usability.
>>>
>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>
>>> When supporting multiple tables, Tx metadata can also be set by a rule and
>>> matched by another rule. This new action allows metadata to be set as a
>>> result of flow match.
>>>
>>> 2) Metadata on ingress
>>>
>>> There's also need to support metadata on packet Rx. Metadata can be set by
>>> SET_META action and matched by META item like Tx. The final value set by
>>> the action will be delivered to application via mbuf metadata field with
>>> PKT_RX_METADATA ol_flag.
>>>
>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
>>> renamed to 'metadata' to support both Rx and Tx metadata.
>>>
>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>> propagated to the other path depending on HW capability.
>>>
>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>> There is a mark on Rx which is delivered to application in hash.fdir.hi.
>> Why do we need one more 32-bit value set by NIC and delivered to
>> application?
>> What is the difference between MARK and META on Rx?
>> When application should use MARK and when META?
>> Is there cases when both could be necessary?
>>
> In my understanding, MARK is FDIR related thing, META seems to be NIC
> specific. And we also need this kind of specific data field to export
> NIC's data to application.

I think it is better to avoid NIC vendor-specifics in the motivation. I
understand that it exists for you, but I think it is better to look at it
from the RTE flow API definition point of view: both are 32-bit (except for
endianness, and I'm not sure that I understand why meta is defined as
big-endian since it is not a value coming from or going to the network in a
packet; I'm sorry that I missed it on review at that time), both may be set
using an action on Rx, and both may be matched using a pattern item.
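
For readers unfamiliar with what the big-endian definition means in practice,
here is a minimal sketch of the conversion an application has to do before
handing a host-order value to the META item. The helper name
sketch_cpu_to_be32() is invented for this illustration; DPDK has its own
byte-order helpers, which are not reproduced here.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return a 32-bit word whose in-memory byte layout is
 * the big-endian encoding of 'host', regardless of the CPU's endianness.
 */
static inline uint32_t
sketch_cpu_to_be32(uint32_t host)
{
	const uint8_t bytes[4] = {
		(uint8_t)(host >> 24), (uint8_t)(host >> 16),
		(uint8_t)(host >> 8), (uint8_t)host,
	};
	uint32_t be;

	/* Copy the bytes in network order into the result word. */
	memcpy(&be, bytes, sizeof(be));
	return be;
}
```

MARK needs no such step since its ID is defined in host byte order, which is
one of the few visible differences between the two at the API level.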

>> Moreover, the third patch adds 32-bit tags which are not delivered to
>> application. May be META/MARK should be simply a kind of TAG (e.g. with
>> index 0 or marked using additional attribute) which is delivered to
>> application?
>>
>> (It is either API breakage (if tx_metadata is removed) or ABI breakage
>> if metadata and tx_metadata will share new location after shinfo).
>>
> Make use of udata64 to export NIC metadata to application ?
> 	RTE_STD_C11
> 	union {
> 		void *userdata;   /**< Can be used for external metadata */
> 		uint64_t udata64; /**< Allow 8-byte userdata on 32-bit */
> 		uint64_t rx_metadata;
> 	};

As I understand it, this does not work for Tx, and I'm not sure that it is
a good idea to have different locations for Tx and Rx.

The RFC adds it at the end of the mbuf, but that was rejected before since
it eats space in the mbuf structure (CC Konstantin).

There was a long discussion on the topic before: [1], [2], [3] and [4].

Andrew.

[1] http://mails.dpdk.org/archives/dev/2018-August/109660.html
[2] http://mails.dpdk.org/archives/dev/2018-September/111771.html
[3] http://mails.dpdk.org/archives/dev/2018-October/114559.html
[4] http://mails.dpdk.org/archives/dev/2018-October/115469.html


* Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
  2019-06-10  7:20     ` Andrew Rybchenko
@ 2019-06-11  0:06       ` Yongseok Koh
  2019-06-19  9:05         ` Andrew Rybchenko
  0 siblings, 1 reply; 98+ messages in thread
From: Yongseok Koh @ 2019-06-11  0:06 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: Wang, Haiyue, Shahaf Shuler, Thomas Monjalon, Yigit, Ferruh,
	Adrien Mazarguil, olivier.matz, dev, Ananyev, Konstantin

On Mon, Jun 10, 2019 at 10:20:28AM +0300, Andrew Rybchenko wrote:
> On 6/10/19 6:19 AM, Wang, Haiyue wrote:
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrew Rybchenko
> > > Sent: Sunday, June 9, 2019 22:24
> > > To: Yongseok Koh <yskoh@mellanox.com>; shahafs@mellanox.com; thomas@monjalon.net; Yigit, Ferruh
> > > <ferruh.yigit@intel.com>; adrien.mazarguil@6wind.com; olivier.matz@6wind.com
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
> > > 
> > > On 6/4/19 12:32 AM, Yongseok Koh wrote:
> > > > Currently, metadata can be set on egress path via mbuf tx_meatadata field
> > > > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > > > 
> > > > This patch extends the usability.
> > > > 
> > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > > 
> > > > When supporting multiple tables, Tx metadata can also be set by a rule and
> > > > matched by another rule. This new action allows metadata to be set as a
> > > > result of flow match.
> > > > 
> > > > 2) Metadata on ingress
> > > > 
> > > > There's also need to support metadata on packet Rx. Metadata can be set by
> > > > SET_META action and matched by META item like Tx. The final value set by
> > > > the action will be delivered to application via mbuf metadata field with
> > > > PKT_RX_METADATA ol_flag.
> > > > 
> > > > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > > > renamed to 'metadata' to support both Rx and Tx metadata.
> > > > 
> > > > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > > > propagated to the other path depending on HW capability.
> > > > 
> > > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > There is a mark on Rx which is delivered to application in hash.fdir.hi.
> > > Why do we need one more 32-bit value set by NIC and delivered to
> > > application?
> > > What is the difference between MARK and META on Rx?
> > > When application should use MARK and when META?
> > > Is there cases when both could be necessary?
> > > 
> > In my understanding, MARK is FDIR related thing, META seems to be NIC
> > specific. And we also need this kind of specific data field to export
> > NIC's data to application.
> 
> I think it is better to avoid NIC vendor-specifics in motivation. I
> understand
> that it exists for you, but I think it is better to look at it from RTE flow
> API
> definition point of view: both are 32-bit (except endianess and I'm not sure
> that I understand why meta is defined as big-endian since it is not a value
> coming from or going to network in a packet, I'm sorry that I've missed it
> on review that time), both may be set using action on Rx, both may be
> matched using pattern item.

Yes, MARK and META have the same characteristics on the Rx path. Let me clarify
why I picked this way.

What if a device has more bits to deliver to the host? Currently, only 32-bit
data can be delivered to the user via MARK ID. Now we have more requests from
users (OVS connection tracking) that want to see more information generated
during flow match from the device. Let's say it is 64 bits and it may contain
intermediate match results to keep track of multi-table match, the address of a
callback function to call, and so on. I thought about extending the current MARK
to 64 bits but I knew that we couldn't make more room in the first cacheline of
mbuf where every vendor has their critical interest. And FDIR has been there for
a long time and has lots of use cases in DPDK (not easy to break). This is why
I'm suggesting obtaining another 32 bits in the second cacheline of the
structure.

Also, I thought about another scenario. Even though the MARK item was introduced
lately, it isn't used by any PMD at all for now, meaning it might not be
match-able on a certain device. What if there are two types of registers on Rx
and one is match-able and the other isn't? The PMD can use META for the
match-able register while MARK is used for the non-match-able register without
supporting item match. If MARK simply becomes 64-bit just because it has the
same characteristics in terms of rte_flow, only one of such registers can be
used, as we can't say only part of the bits are match-able on the item. Instead
of extending MARK to 64 bits, I thought it would be better to give more
flexibility by bundling it with Tx metadata, which can be set by mbuf.
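
As a rough illustration of the Rx side described above, here is a minimal
sketch of how an application might pick up both values once MARK and META are
delivered separately. The struct, the flag bits and the helper name are
stand-ins invented for this sketch; only hash.fdir.hi, the 'metadata' field and
PKT_RX_METADATA come from the discussion, and their real definitions belong to
rte_mbuf.h. Byte order of the metadata is ignored here.

```c
#include <stdint.h>

/* Placeholder flag bits for this sketch only; not the real ol_flags values. */
#define SKETCH_RX_FDIR_ID  (1ULL << 0)
#define SKETCH_RX_METADATA (1ULL << 1)

/* Stand-in for the rte_mbuf fields discussed in this thread. */
struct sketch_mbuf {
	uint64_t ol_flags;
	uint32_t fdir_hi;  /* stands in for hash.fdir.hi (MARK ID) */
	uint32_t metadata; /* stands in for the proposed 'metadata' field */
};

/*
 * Combine MARK (low 32 bits) and META (high 32 bits) into one 64-bit value.
 * A half whose flag is not set is reported as zero.
 */
static inline uint64_t
sketch_rx_info(const struct sketch_mbuf *m)
{
	uint64_t info = 0;

	if (m->ol_flags & SKETCH_RX_FDIR_ID)
		info |= m->fdir_hi;
	if (m->ol_flags & SKETCH_RX_METADATA)
		info |= (uint64_t)m->metadata << 32;
	return info;
}
```

Keeping the two halves behind separate flags is what lets a PMD report only
one of them when the device has just one usable register.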

The actual issue may be how we can make it scalable. What if there's a need to
carry more data from the device? Well, IIRC, Olivier once suggested putting a
pointer (like mbuf->userdata) to extend the mbuf struct beyond two cachelines.
But we still have some space left at the end.

> > > Moreover, the third patch adds 32-bit tags which are not delivered to
> > > application. May be META/MARK should be simply a kind of TAG (e.g. with
> > > index 0 or marked using additional attribute) which is delivered to
> > > application?

Yes, TAG is a kind of transient device-internal data which isn't delivered to
the host. It is a design choice. I could define all these kinds as an array of
MARK IDs having different attributes - some exportable/match-able and others
not - which sounds quite complex. As rte_flow doesn't have a direct way to
check device capability (the user has to call a series of validate functions
instead), I thought defining TAG would be better.

> > > (It is either API breakage (if tx_metadata is removed) or ABI breakage
> > > if metadata and tx_metadata will share new location after shinfo).

Fortunately, mlx5 is the only entity which uses tx_metadata so far.

> > Make use of udata64 to export NIC metadata to application ?
> > 	RTE_STD_C11
> > 	union {
> > 		void *userdata;   /**< Can be used for external metadata */
> > 		uint64_t udata64; /**< Allow 8-byte userdata on 32-bit */
> > 		uint64_t rx_metadata;
> > 	};
> 
> As I understand it does not work for Tx and I'm not sure that it is
> a good idea to have different locations for Tx and Rx.
> 
> RFC adds it at the end of mbuf, but it was rejected before since
> it eats space in mbuf structure (CC Konstantin).

Yep, I was in the discussion. IIRC, the reason wasn't because it ate space but
because it could recycle unused space on Tx path. We still have 16B after shinfo
and I'm not sure how many bytes we should reserve. I think reserving space for
one pointer would be fine.

Thanks,
Yongseok

> There is a long discussion on the topic before [1], [2], [3] and [4].
> 
> Andrew.
> 
> [1] http://mails.dpdk.org/archives/dev/2018-August/109660.html
> [2] http://mails.dpdk.org/archives/dev/2018-September/111771.html
> [3] http://mails.dpdk.org/archives/dev/2018-October/114559.html
> [4] http://mails.dpdk.org/archives/dev/2018-October/115469.html


* Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
  2019-06-11  0:06       ` Yongseok Koh
@ 2019-06-19  9:05         ` Andrew Rybchenko
  0 siblings, 0 replies; 98+ messages in thread
From: Andrew Rybchenko @ 2019-06-19  9:05 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: Wang, Haiyue, Shahaf Shuler, Thomas Monjalon, Yigit, Ferruh,
	Adrien Mazarguil, olivier.matz, dev, Ananyev, Konstantin

On 11.06.2019 3:06, Yongseok Koh wrote:
> On Mon, Jun 10, 2019 at 10:20:28AM +0300, Andrew Rybchenko wrote:
>> On 6/10/19 6:19 AM, Wang, Haiyue wrote:
>>>> -----Original Message-----
>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrew Rybchenko
>>>> Sent: Sunday, June 9, 2019 22:24
>>>> To: Yongseok Koh <yskoh@mellanox.com>; shahafs@mellanox.com; thomas@monjalon.net; Yigit, Ferruh
>>>> <ferruh.yigit@intel.com>; adrien.mazarguil@6wind.com; olivier.matz@6wind.com
>>>> Cc: dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
>>>>
>>>> On 6/4/19 12:32 AM, Yongseok Koh wrote:
>>>>> Currently, metadata can be set on egress path via mbuf tx_meatadata field
>>>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
>>>>>
>>>>> This patch extends the usability.
>>>>>
>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>>>
>>>>> When supporting multiple tables, Tx metadata can also be set by a rule and
>>>>> matched by another rule. This new action allows metadata to be set as a
>>>>> result of flow match.
>>>>>
>>>>> 2) Metadata on ingress
>>>>>
>>>>> There's also need to support metadata on packet Rx. Metadata can be set by
>>>>> SET_META action and matched by META item like Tx. The final value set by
>>>>> the action will be delivered to application via mbuf metadata field with
>>>>> PKT_RX_METADATA ol_flag.
>>>>>
>>>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
>>>>> renamed to 'metadata' to support both Rx and Tx metadata.
>>>>>
>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>>>> propagated to the other path depending on HW capability.
>>>>>
>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>> There is a mark on Rx which is delivered to application in hash.fdir.hi.
>>>> Why do we need one more 32-bit value set by NIC and delivered to
>>>> application?
>>>> What is the difference between MARK and META on Rx?
>>>> When application should use MARK and when META?
>>>> Is there cases when both could be necessary?
>>>>
>>> In my understanding, MARK is FDIR related thing, META seems to be NIC
>>> specific. And we also need this kind of specific data field to export
>>> NIC's data to application.
>> I think it is better to avoid NIC vendor-specifics in motivation. I
>> understand
>> that it exists for you, but I think it is better to look at it from RTE flow
>> API
>> definition point of view: both are 32-bit (except endianess and I'm not sure
>> that I understand why meta is defined as big-endian since it is not a value
>> coming from or going to network in a packet, I'm sorry that I've missed it
>> on review that time), both may be set using action on Rx, both may be
>> matched using pattern item.
> Yes, MARK and META has the same characteristic on Rx path. Let me clarify why I
> picked this way.
>
> What if device has more bits to deliver to host? Currently, only 32-bit data can
> be delivered to user via MARK ID. Now we have more requests from users (OVS
> connection tracking) that want to see more information generated during flow
> match from the device. Let's say it is 64 bits and it may contain intermediate
> match results to keep track of multi-table match, to keep address of callback
> function to call, or so. I thought about extending the current MARK to 64-bit
> but I knew that we couldn't make more room in the first cacheline of mbuf where
> every vendor has their critical interest. And the FDIR has been there for a long
> time and has lots of use-cases in DPDK (not easy to break). This is why I'm
> suggesting to obtain another 32 bits in the second cacheline of the structure.
>
> Also, I thought about other scenario as well. Even though we have MARK item
> introduced lately, it isn't used by any PMD at all for now, meaning it might not
> be match-able on a certain device. What if there are two types registers on Rx
> and one is match-able and the other isn't? PMD can use META for match-able
> register while MARK is used for non-match-able register without supporting
> item match. If MARK simply becomes 64-bit just because it has the same
> characteristic in terms of rte_flow, only one of such registers can be used as
> we can't say only part of bits are match-able on the item. Instead of extending
> the MARK to 64 bits, I thought it would be better to give more flexibility by
> bundling it with Tx metadata, which can set by mbuf.

Thanks a lot for the explanations. If this approach is finally approved, the
priority between META and MARK should be defined. I.e. if only one is supported
or only one may be matched, it must be MARK. Otherwise, it will be too
complicated for applications to find out which one to use.
Are there any limitations on the usage of MARK or META in transfer rules?
A lot of documentation work is needed in this area to make it usable.


> The actual issue we have may be how we can make it scalable? What if there's
> more need to carry more data from device? Well, IIRC, Olivier once suggested to
> put a pointer (like mbuf->userdata) to extend mbuf struct beyond two cachelines.
> But we still have some space left at the end.
>
>>>> Moreover, the third patch adds 32-bit tags which are not delivered to
>>>> application. May be META/MARK should be simply a kind of TAG (e.g. with
>>>> index 0 or marked using additional attribute) which is delivered to
>>>> application?
> Yes, TAG is a kind of transient device-internal data which isn't delivered to
> host. It would be a design choice. I could define all these kinds as an array of
> MARK IDs having different attributes - some are exportable/match-able and others
> are not, which sounds quite complex. As rte_flow doesn't have a direct way to
> check device capability (user has to call a series of validate functions
> instead), I thought defining TAG would be better.
>
>>>> (It is either API breakage (if tx_metadata is removed) or ABI breakage
>>>> if metadata and tx_metadata will share new location after shinfo).
> Fortunately, mlx5 is the only entity which uses tx_metadata so far.

As I understand it, it is still a breakage.

>>> Make use of udata64 to export NIC metadata to application ?
>>> 	RTE_STD_C11
>>> 	union {
>>> 		void *userdata;   /**< Can be used for external metadata */
>>> 		uint64_t udata64; /**< Allow 8-byte userdata on 32-bit */
>>> 		uint64_t rx_metadata;
>>> 	};
>> As I understand it does not work for Tx and I'm not sure that it is
>> a good idea to have different locations for Tx and Rx.
>>
>> RFC adds it at the end of mbuf, but it was rejected before since
>> it eats space in mbuf structure (CC Konstantin).
> Yep, I was in the discussion. IIRC, the reason wasn't because it ate space but
> because it could recycle unused space on Tx path. We still have 16B after shinfo
> and I'm not sure how many bytes we should reserve. I think reserving space for
> one pointer would be fine.

I have no strong opinion.

Thanks,
Andrew.

> Thanks,
> Yongseok
>
>> There is a long discussion on the topic before [1], [2], [3] and [4].
>>
>> Andrew.
>>
>> [1] http://mails.dpdk.org/archives/dev/2018-August/109660.html
>> [2] http://mails.dpdk.org/archives/dev/2018-September/111771.html
>> [3] http://mails.dpdk.org/archives/dev/2018-October/114559.html
>> [4] http://mails.dpdk.org/archives/dev/2018-October/115469.html


* [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-06-03 21:32 [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Yongseok Koh
                   ` (2 preceding siblings ...)
  2019-06-09 14:23 ` [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Andrew Rybchenko
@ 2019-07-04 23:21 ` Yongseok Koh
  2019-07-10  9:31   ` Olivier Matz
  2019-10-10 16:02   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
  3 siblings, 2 replies; 98+ messages in thread
From: Yongseok Koh @ 2019-07-04 23:21 UTC (permalink / raw)
  To: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil,
	olivier.matz
  Cc: dev, viacheslavo

Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.

This patch extends the usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There's also need to support metadata on packet Rx. Metadata can be set by
SET_META action and matched by META item like Tx. The final value set by
the action will be delivered to application via mbuf metadata field with
PKT_RX_METADATA ol_flag.

For this purpose, mbuf->tx_metadata is moved as a separate new field and
renamed to 'metadata' to support both Rx and Tx metadata.

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on HW capability.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 app/test-pmd/cmdline_flow.c            | 35 ++++++++++++
 app/test-pmd/util.c                    |  2 +-
 doc/guides/prog_guide/rte_flow.rst     | 73 ++++++++++++++++++--------
 doc/guides/rel_notes/release_19_08.rst | 10 ++++
 drivers/net/mlx5/mlx5_rxtx.c           | 12 ++---
 drivers/net/mlx5/mlx5_rxtx_vec.c       |  4 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h  |  2 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h   |  2 +-
 lib/librte_ethdev/rte_ethdev.h         |  5 ++
 lib/librte_ethdev/rte_flow.h           | 43 +++++++++++++--
 lib/librte_mbuf/rte_mbuf.h             | 21 ++++----
 11 files changed, 161 insertions(+), 48 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 201bd9de56..eda5c5491f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -272,6 +272,9 @@ enum index {
 	ACTION_SET_MAC_SRC_MAC_SRC,
 	ACTION_SET_MAC_DST,
 	ACTION_SET_MAC_DST_MAC_DST,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -885,6 +888,7 @@ static const enum index next_action[] = {
 	ACTION_SET_TTL,
 	ACTION_SET_MAC_SRC,
 	ACTION_SET_MAC_DST,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1047,6 +1051,13 @@ static const enum index action_set_mac_dst[] = {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -2854,6 +2865,30 @@ static const struct token token_list[] = {
 			     (struct rte_flow_action_set_mac, mac_addr)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index a1164b7053..6ecc97351f 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -182,7 +182,7 @@ tx_pkt_set_md(uint16_t port_id, __rte_unused uint16_t queue,
 	 * and set ol_flags accordingly.
 	 */
 	for (i = 0; i < nb_pkts; i++) {
-		pkts[i]->tx_metadata = ports[port_id].tx_metadata;
+		pkts[i]->metadata = ports[port_id].tx_metadata;
 		pkts[i]->ol_flags |= PKT_TX_METADATA;
 	}
 	return nb_pkts;
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index a34d012e55..5092f0074e 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` field of ``rte_mbuf`` with PKT_RX_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -1189,27 +1215,6 @@ Normally preceded by any of:
 - `Item: ICMP6_ND_NS`_
 - `Item: ICMP6_ND_OPT`_
 
-Item: ``META``
-^^^^^^^^^^^^^^
-
-Matches an application specific 32 bit metadata item.
-
-- Default ``mask`` matches the specified metadata value.
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+--------------------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Actions
 ~~~~~~~
 
@@ -2345,6 +2350,32 @@ Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
    | ``mac_addr`` | MAC address   |
    +--------------+---------------+
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by mbuf
+metadata field with PKT_RX_METADATA flag if set.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index 223479c6d4..e087266da0 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -68,6 +68,16 @@ New Features
   rte_rand_max() which supplies unbiased, bounded pseudo-random
   numbers.
 
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * ``tx_metadata`` field of ``rte_mbuf`` has been moved to an independent
+    field and renamed as ``metadata``.
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via ``metadata`` field of ``rte_mbuf``
+    with PKT_RX_METADATA.
+
 * **Updated the bnxt PMD.**
 
   Updated the bnxt PMD. The major enhancements include:
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index c1dc8c4e17..4b23a0176d 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -784,8 +784,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Replace the Ethernet type by the VLAN if necessary. */
 		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
@@ -1193,8 +1192,7 @@ mlx5_tx_burst_mpw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		--pkts_n;
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		assert(length);
@@ -1430,8 +1428,7 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if packet differs. */
@@ -1715,8 +1712,7 @@ txq_burst_empw(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 			break;
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if:
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 073044f6d1..b8e042c5d2 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -71,7 +71,7 @@ txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
 	if (!pkts_n)
 		return 0;
 	p0_metadata = pkts[0]->ol_flags & PKT_TX_METADATA ?
-			pkts[0]->tx_metadata : 0;
+		      pkts[0]->metadata : 0;
 	/* Count the number of packets having same offload parameters. */
 	for (pos = 1; pos < pkts_n; ++pos) {
 		/* Check if packet has same checksum flags. */
@@ -81,7 +81,7 @@ txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
 		/* Check if packet has same metadata. */
 		if (txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
 			pn_metadata = pkts[pos]->ol_flags & PKT_TX_METADATA ?
-					pkts[pos]->tx_metadata : 0;
+				      pkts[pos]->metadata : 0;
 			if (pn_metadata != p0_metadata)
 				break;
 		}
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 1c7e3b444a..900cd9db43 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -131,7 +131,7 @@ txq_scatter_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 		uint8x16_t ctrl;
 		rte_be32_t metadata =
 			metadata_ol && (buf->ol_flags & PKT_TX_METADATA) ?
-			buf->tx_metadata : 0;
+			buf->metadata : 0;
 
 		assert(segs_n);
 		max_elts = elts_n - (elts_head - txq->elts_tail);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 503ca0f6ad..df7e22b9b9 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -129,7 +129,7 @@ txq_scatter_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 		__m128i ctrl;
 		rte_be32_t metadata =
 			metadata_ol && (buf->ol_flags & PKT_TX_METADATA) ?
-			buf->tx_metadata : 0;
+			buf->metadata : 0;
 
 		assert(segs_n);
 		max_elts = elts_n - (elts_head - txq->elts_tail);
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index c85212649c..ee0707e2d8 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1011,6 +1011,11 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
 #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
 #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
+/**
+ * Device supports match on metadata Rx offload.
+ * Driver sets PKT_RX_METADATA and mbuf metadata field.
+ */
+#define DEV_RX_OFFLOAD_MATCH_METADATA   0x00080000
 
 #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
 				 DEV_RX_OFFLOAD_UDP_CKSUM | \
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f3a8fb103f..cda8628183 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -417,7 +417,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1164,9 +1165,16 @@ rte_flow_item_icmp6_nd_opt_tla_eth_mask = {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * Matches a specified metadata value.
+ * RTE_FLOW_ITEM_TYPE_META
+ *
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * field with PKT_RX_METADATA flag.
  */
 struct rte_flow_item_meta {
 	rte_be32_t data;
@@ -1650,6 +1658,13 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_mac.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_MAC_DST,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2131,6 +2146,28 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf metadata field with PKT_TX_METADATA flag
+ * on egress will be overridden by this action. On ingress, the metadata will be
+ * carried by mbuf metadata field with PKT_RX_METADATA flag if set.
+ *
+ * Altering partial bits is supported with mask. For bits which have never been
+ * set, unpredictable value will be seen depending on driver implementation. For
+ * loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated
+ * to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	rte_be32_t data;
+	rte_be32_t mask;
+};
+
 /*
  * Definition of a single action.
  *
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 9542488554..ba2da874f5 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -200,6 +200,11 @@ extern "C" {
 
 /* add new RX flags here */
 
+/**
+ * Indicate that mbuf has metadata from device.
+ */
+#define PKT_RX_METADATA	(1ULL << 23)
+
 /* add new TX flags here */
 
 /**
@@ -648,17 +653,6 @@ struct rte_mbuf {
 			/**< User defined tags. See rte_distributor_process() */
 			uint32_t usr;
 		} hash;                   /**< hash information */
-		struct {
-			/**
-			 * Application specific metadata value
-			 * for egress flow rule match.
-			 * Valid if PKT_TX_METADATA is set.
-			 * Located here to allow conjunct use
-			 * with hash.sched.hi.
-			 */
-			uint32_t tx_metadata;
-			uint32_t reserved;
-		};
 	};
 
 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
@@ -727,6 +721,11 @@ struct rte_mbuf {
 	 */
 	struct rte_mbuf_ext_shared_info *shinfo;
 
+	/** Application specific metadata value for flow rule match.
+	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
+	 */
+	uint32_t metadata;
+
 } __rte_cache_aligned;
 
 /**
-- 
2.21.0
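
For readers following the Rx side of this patch: the new flag is meant to
guard the new mbuf field. A minimal sketch of that check follows, assuming the
flag value from the rte_mbuf.h hunk above. The struct and the function name
are hypothetical stand-ins so the snippet builds without DPDK headers, and
byte order handling is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from this patch; the real definition lives in rte_mbuf.h. */
#define PKT_RX_METADATA (1ULL << 23)

/* Hypothetical stand-in for the two rte_mbuf fields used here. */
struct mock_mbuf {
	uint64_t ol_flags;
	uint32_t metadata;
};

/*
 * Copy the Rx metadata into *value and return true if the flag is set.
 * Otherwise leave *value untouched and return false.
 */
static bool
mock_rx_metadata_get(const struct mock_mbuf *m, uint32_t *value)
{
	if (!(m->ol_flags & PKT_RX_METADATA))
		return false;
	*value = m->metadata;
	return true;
}
```

An application would run an equivalent check on each received mbuf before
trusting the metadata field.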


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-06-03 21:32 ` [dpdk-dev] [RFC 3/3] ethdev: add flow tag Yongseok Koh
@ 2019-07-04 23:23   ` Yongseok Koh
  2019-07-05 13:54     ` Adrien Mazarguil
  2019-10-10 16:09     ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
  0 siblings, 2 replies; 98+ messages in thread
From: Yongseok Koh @ 2019-07-04 23:23 UTC (permalink / raw)
  To: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil,
	olivier.matz
  Cc: dev, viacheslavo

A tag is transient data which can be used during flow matching. It can be
used to store the match result from a previous table so that the same
pattern need not be matched again on the next table. Even if the outer
header is decapsulated on the previous match, the match result can be kept.

Some devices expose internal registers of their flow processing pipeline,
and those registers are quite useful for stateful connection tracking as
they keep the status of flow matching. Multiple tags are supported by
specifying an index.

Example testpmd commands are:

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xcc00 mask 0xff00 /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
            eth ... / end
    actions ... jump group 2 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
            tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end

  flow create 0 ingress group 2
    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end
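
The group 1 and group 2 rules above match on tags written by the group 0
rules. Below is a minimal sketch of that match semantics in C, assuming a tag
item matches when the stored register and the spec agree on every bit selected
by the mask. The names mock_tag_match, mock_item_tag and MOCK_TAG_COUNT are
hypothetical and not part of rte_flow; the register count of 4 is an
assumption as well.

```c
#include <stdbool.h>
#include <stdint.h>

#define MOCK_TAG_COUNT 4 /* assumed number of tag registers */

/* Mirrors the fields of struct rte_flow_item_tag from this patch. */
struct mock_item_tag {
	uint32_t data;
	uint8_t index;
};

/*
 * Return true if the register selected by spec->index equals spec->data
 * on all bits set in data_mask. Out-of-range indices never match.
 */
static bool
mock_tag_match(const uint32_t tags[MOCK_TAG_COUNT],
	       const struct mock_item_tag *spec, uint32_t data_mask)
{
	if (spec->index >= MOCK_TAG_COUNT)
		return false;
	return ((tags[spec->index] ^ spec->data) & data_mask) == 0;
}
```

With tag 2 holding 0xaa00bb, a spec of 0xaa00bb under mask 0xffff00ff matches,
while a spec of 0xcc00 under mask 0xff00 does not.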

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 app/test-pmd/cmdline_flow.c            | 75 ++++++++++++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst     | 50 +++++++++++++++++
 doc/guides/rel_notes/release_19_08.rst |  4 ++
 lib/librte_ethdev/rte_flow.h           | 54 +++++++++++++++++++
 4 files changed, 183 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index eda5c5491f..37a692db54 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -181,6 +181,9 @@ enum index {
 	ITEM_ICMP6_ND_OPT_TLA_ETH_TLA,
 	ITEM_META,
 	ITEM_META_DATA,
+	ITEM_TAG,
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -275,6 +278,10 @@ enum index {
 	ACTION_SET_META,
 	ACTION_SET_META_DATA,
 	ACTION_SET_META_MASK,
+	ACTION_SET_TAG,
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -613,6 +620,7 @@ static const enum index next_item[] = {
 	ITEM_ICMP6_ND_OPT_SLA_ETH,
 	ITEM_ICMP6_ND_OPT_TLA_ETH,
 	ITEM_META,
+	ITEM_TAG,
 	ZERO,
 };
 
@@ -839,6 +847,13 @@ static const enum index item_meta[] = {
 	ZERO,
 };
 
+static const enum index item_tag[] = {
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -889,6 +904,7 @@ static const enum index next_action[] = {
 	ACTION_SET_MAC_SRC,
 	ACTION_SET_MAC_DST,
 	ACTION_SET_META,
+	ACTION_SET_TAG,
 	ZERO,
 };
 
@@ -1058,6 +1074,14 @@ static const enum index action_set_meta[] = {
 	ZERO,
 };
 
+static const enum index action_set_tag[] = {
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -2161,6 +2185,26 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_meta,
 						  data, "\xff\xff\xff\xff")),
 	},
+	[ITEM_TAG] = {
+		.name = "tag",
+		.help = "match tag value",
+		.priv = PRIV_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
+		.next = NEXT(item_tag),
+		.call = parse_vc,
+	},
+	[ITEM_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tag, data)),
+	},
+	[ITEM_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_tag, index)),
+	},
 
 	/* Validate/create actions. */
 	[ACTIONS] = {
@@ -2889,6 +2933,37 @@ static const struct token token_list[] = {
 			     (struct rte_flow_action_set_meta, mask)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SET_TAG] = {
+		.name = "set_tag",
+		.help = "set tag",
+		.priv = PRIV_ACTION(SET_TAG,
+			sizeof(struct rte_flow_action_set_tag)),
+		.next = NEXT(action_set_tag),
+		.call = parse_vc,
+	},
+	[ACTION_SET_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_set_tag, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_MASK] = {
+		.name = "mask",
+		.help = "mask for tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 5092f0074e..7880425e9e 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -684,6 +684,34 @@ action sets metadata for a packet and the metadata will be reported via
    | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
    +----------+----------+---------------------------------------+
 
+Item: ``TAG``
+^^^^^^^^^^^^^
+
+Matches tag item set by other flows. Multiple tags are supported by specifying
+``index``.
+
+- Default ``mask`` matches the specified tag value and index.
+
+.. _table_rte_flow_item_tag:
+
+.. table:: TAG
+
+   +----------+-----------+---------------------------------------+
+   | Field    | Subfield  | Value                                 |
+   +==========+===========+=======================================+
+   | ``spec`` | ``data``  | 32 bit flow tag value                 |
+   |          +-----------+---------------------------------------+
+   |          | ``index`` | index of flow tag                     |
+   +----------+-----------+---------------------------------------+
+   | ``last`` | ``data``  | upper range value                     |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+   | ``mask`` | ``data``  | bit-mask applies to "spec" and "last" |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2376,6 +2404,28 @@ the other path depending on HW capability.
    | ``mask`` | bit-mask applies to "data" |
    +----------+----------------------------+
 
+Action: ``SET_TAG``
+^^^^^^^^^^^^^^^^^^^
+
+Set Tag.
+
+A tag is transient data used during flow matching. It is not delivered to the
+application. Multiple tags are supported by specifying ``index``.
+
+.. _table_rte_flow_action_set_tag:
+
+.. table:: SET_TAG
+
+   +-----------+----------------------------+
+   | Field     | Value                      |
+   +===========+============================+
+   | ``data``  | 32 bit tag value           |
+   +-----------+----------------------------+
+   | ``mask``  | bit-mask applies to "data" |
+   +-----------+----------------------------+
+   | ``index`` | index of tag to set        |
+   +-----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index e087266da0..7c9da3fdae 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -78,6 +78,10 @@ New Features
   * Rx metadata is delivered to host via ``metadata`` field of ``rte_mbuf``
     with PKT_RX_METADATA.
 
+* **Added flow tag in rte_flow.**
+  SET_TAG action and TAG item have been added to support transient flow
+  tag.
+
 * **Updated the bnxt PMD.**
 
   Updated the bnxt PMD. The major enhancements include:
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index cda8628183..ffb8a13098 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -422,6 +422,15 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
+
+	/**
+	 * [META]
+	 *
+	 * Matches a tag value.
+	 *
+	 * See struct rte_flow_item_tag.
+	 */
+	RTE_FLOW_ITEM_TYPE_TAG,
 };
 
 /**
@@ -1187,6 +1196,27 @@ static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_TAG
+ *
+ * Matches a specified tag value at the specified index.
+ */
+struct rte_flow_item_tag {
+	uint32_t data;
+	uint8_t index;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_TAG. */
+#ifndef __cplusplus
+static const struct rte_flow_item_tag rte_flow_item_tag_mask = {
+	.data = 0xffffffff,
+	.index = 0xff,
+};
+#endif
+
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
@@ -1665,6 +1695,15 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_meta.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_META,
+
+	/**
+	 * Set Tag.
+	 *
+	 * Tag is not delivered to application.
+	 *
+	 * See struct rte_flow_action_set_tag.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_TAG,
 };
 
 /**
@@ -2168,6 +2207,21 @@ struct rte_flow_action_set_meta {
 	rte_be32_t mask;
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_TAG
+ *
+ * Set a tag which is a transient data used during flow matching. This is not
+ * delivered to application. Multiple tags are supported by specifying index.
+ */
+struct rte_flow_action_set_tag {
+	uint32_t data;
+	uint32_t mask;
+	uint8_t index;
+};
+
 /*
  * Definition of a single action.
  *
-- 
2.21.0


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-04 23:23   ` [dpdk-dev] [PATCH] " Yongseok Koh
@ 2019-07-05 13:54     ` Adrien Mazarguil
  2019-07-05 18:05       ` Yongseok Koh
  2019-10-10 16:09     ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
  1 sibling, 1 reply; 98+ messages in thread
From: Adrien Mazarguil @ 2019-07-05 13:54 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: shahafs, thomas, ferruh.yigit, arybchenko, olivier.matz, dev,
	viacheslavo

On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> A tag is a transient data which can be used during flow match. This can be
> used to store match result from a previous table so that the same pattern
> need not be matched again on the next table. Even if outer header is
> decapsulated on the previous match, the match result can be kept.
> 
> Some device expose internal registers of its flow processing pipeline and
> those registers are quite useful for stateful connection tracking as it
> keeps status of flow matching. Multiple tags are supported by specifying
> index.
> 
> Example testpmd commands are:
> 
>   flow create 0 ingress pattern ... / end
>     actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>             set_tag index 3 value 0x123456 mask 0xffffff /
>             vxlan_decap / jump group 1 / end
> 
>   flow create 0 ingress pattern ... / end
>     actions set_tag index 2 value 0xcc00 mask 0xff00 /
>             set_tag index 3 value 0x123456 mask 0xffffff /
>             vxlan_decap / jump group 1 / end
> 
>   flow create 0 ingress group 1
>     pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>             eth ... / end
>     actions ... jump group 2 / end
> 
>   flow create 0 ingress group 1
>     pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>             tag index is 3 value spec 0x123456 value mask 0xffffff /
>             eth ... / end
>     actions ... / end
> 
>   flow create 0 ingress group 2
>     pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>             eth ... / end
>     actions ... / end
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

Hi Yongseok,

Only high-level questions for now. While it unquestionably looks useful,
from a user standpoint exposing the separate index seems redundant and not
necessarily convenient. Using the following example to illustrate:

 actions set_tag index 3 value 0x123456 mask 0xffffff

 pattern tag index is 3 value spec 0x123456 value mask 0xffffff

I might be missing something, but why isn't this enough:

 pattern tag index is 3 # match whatever is stored at index 3

Assuming it can work, then why bother with providing value spec/mask on
set_tag? A flow rule pattern matches something, sets some arbitrary tag to
be matched by a subsequent flow rule and that's it. It even seems like
relying on the index only on both occasions is enough for identification.

Same question for the opposite approach; relying on the value, never
mentioning the index.

I'm under the impression that the index is a hardware-specific constraint
that shouldn't be exposed (especially since it's an 8-bit field). If so, a
PMD could keep track of used indices without having them exposed through the
public API.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-05 13:54     ` Adrien Mazarguil
@ 2019-07-05 18:05       ` Yongseok Koh
  2019-07-08 23:32         ` Yongseok Koh
  2019-07-09  8:38         ` Adrien Mazarguil
  0 siblings, 2 replies; 98+ messages in thread
From: Yongseok Koh @ 2019-07-05 18:05 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev, Slava Ovsiienko



> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
> 
> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
>> A tag is a transient data which can be used during flow match. This can be
>> used to store match result from a previous table so that the same pattern
>> need not be matched again on the next table. Even if outer header is
>> decapsulated on the previous match, the match result can be kept.
>> 
>> Some device expose internal registers of its flow processing pipeline and
>> those registers are quite useful for stateful connection tracking as it
>> keeps status of flow matching. Multiple tags are supported by specifying
>> index.
>> 
>> Example testpmd commands are:
>> 
>>  flow create 0 ingress pattern ... / end
>>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>            vxlan_decap / jump group 1 / end
>> 
>>  flow create 0 ingress pattern ... / end
>>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>            vxlan_decap / jump group 1 / end
>> 
>>  flow create 0 ingress group 1
>>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>>            eth ... / end
>>    actions ... jump group 2 / end
>> 
>>  flow create 0 ingress group 1
>>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>>            tag index is 3 value spec 0x123456 value mask 0xffffff /
>>            eth ... / end
>>    actions ... / end
>> 
>>  flow create 0 ingress group 2
>>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>>            eth ... / end
>>    actions ... / end
>> 
>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> 
> Hi Yongseok,
> 
> Only high level questions for now, while it unquestionably looks useful,
> from a user standpoint exposing the separate index seems redundant and not
> necessarily convenient. Using the following example to illustrate:
> 
> actions set_tag index 3 value 0x123456 mask 0xffffff
> 
> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> 
> I might be missing something, but why isn't this enough:
> 
> pattern tag index is 3 # match whatever is stored at index 3
> 
> Assuming it can work, then why bother with providing value spec/mask on
> set_tag? A flow rule pattern matches something, sets some arbitrary tag to
> be matched by a subsequent flow rule and that's it. It even seems like
> relying on the index only on both occasions is enough for identification.
> 
> Same question for the opposite approach; relying on the value, never
> mentioning the index.
> 
> I'm under the impression that the index is a hardware-specific constraint
> that shouldn't be exposed (especially since it's an 8-bit field). If so, a
> PMD could keep track of used indices without having them exposed through the
> public API.


Thank you for the review, Adrien.
Hope you are doing well. It's been a long time since we talked to each other. :-)

Your approach would work too in general, but we have a request from customers
who want to partition this limited tag storage. Assume that HW exposes 32-bit
tags (those are 'registers' in the HW pipeline in mlx5 HW). Customers then want
to store multiple pieces of data even in one 32-bit storage, for example a
16-bit VLAN tag, an 8-bit table id and an 8-bit flow id. As they want to split
one 32-bit storage, I thought it better to provide a mask when setting/matching
the value. Some customers even want to store multiple flags bit by bit, like
ol_flags. They do want to alter only some of the bits.

And the index is there to reference an entry of the tags array, as HW can
provide more register space than 32 bits. For example, mlx5 HW would provide
four 32-bit registers which users can use for their own purposes.
	tag[0], tag[1], tag[2], tag[3]
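
The partitioning described above comes down to a read-modify-write under a
mask. Here is a minimal sketch, assuming SET_TAG only touches the bits
selected by mask. The field sizes (16-bit VLAN tag, 8-bit table id, 8-bit flow
id) follow the example in this mail; the bit positions, macro names and
function name are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical layout of one 32-bit tag register. */
#define MOCK_VLAN_MASK  0xffff0000u /* bits 31..16: VLAN tag */
#define MOCK_TABLE_MASK 0x0000ff00u /* bits 15..8: table id */
#define MOCK_FLOW_MASK  0x000000ffu /* bits 7..0: flow id */

/* Replace only the bits of old selected by mask with the bits of data. */
static uint32_t
mock_set_tag(uint32_t old, uint32_t data, uint32_t mask)
{
	return (old & ~mask) | (data & mask);
}
```

Each rule can then update its own sub-field, for instance
mock_set_tag(tag, 7u << 8, MOCK_TABLE_MASK), without disturbing the VLAN tag
or flow id written by earlier rules.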


Thanks,
Yongseok


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-05 18:05       ` Yongseok Koh
@ 2019-07-08 23:32         ` Yongseok Koh
  2019-07-09  8:38         ` Adrien Mazarguil
  1 sibling, 0 replies; 98+ messages in thread
From: Yongseok Koh @ 2019-07-08 23:32 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev, Slava Ovsiienko


> On Jul 5, 2019, at 11:05 AM, Yongseok Koh <yskoh@mellanox.com> wrote:
> 
> 
> 
>> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
>> 
>> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
>>> A tag is a transient data which can be used during flow match. This can be
>>> used to store match result from a previous table so that the same pattern
>>> need not be matched again on the next table. Even if outer header is
>>> decapsulated on the previous match, the match result can be kept.
>>> 
>>> Some device expose internal registers of its flow processing pipeline and
>>> those registers are quite useful for stateful connection tracking as it
>>> keeps status of flow matching. Multiple tags are supported by specifying
>>> index.
>>> 
>>> Example testpmd commands are:
>>> 
>>> flow create 0 ingress pattern ... / end
>>>   actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>>>           set_tag index 3 value 0x123456 mask 0xffffff /
>>>           vxlan_decap / jump group 1 / end
>>> 
>>> flow create 0 ingress pattern ... / end
>>>   actions set_tag index 2 value 0xcc00 mask 0xff00 /
>>>           set_tag index 3 value 0x123456 mask 0xffffff /
>>>           vxlan_decap / jump group 1 / end
>>> 
>>> flow create 0 ingress group 1
>>>   pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>>>           eth ... / end
>>>   actions ... jump group 2 / end
>>> 
>>> flow create 0 ingress group 1
>>>   pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>>>           tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>           eth ... / end
>>>   actions ... / end
>>> 
>>> flow create 0 ingress group 2
>>>   pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>           eth ... / end
>>>   actions ... / end
>>> 
>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>> 
>> Hi Yongseok,
>> 
>> Only high level questions for now, while it unquestionably looks useful,
>> from a user standpoint exposing the separate index seems redundant and not
>> necessarily convenient. Using the following example to illustrate:
>> 
>> actions set_tag index 3 value 0x123456 mask 0xffffff
>> 
>> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
>> 
>> I might be missing something, but why isn't this enough:
>> 
>> pattern tag index is 3 # match whatever is stored at index 3
>> 
>> Assuming it can work, then why bother with providing value spec/mask on
>> set_tag? A flow rule pattern matches something, sets some arbitrary tag to
>> be matched by a subsequent flow rule and that's it. It even seems like
>> relying on the index only on both occasions is enough for identification.
>> 
>> Same question for the opposite approach; relying on the value, never
>> mentioning the index.
>> 
>> I'm under the impression that the index is a hardware-specific constraint
>> that shouldn't be exposed (especially since it's an 8-bit field). If so, a
>> PMD could keep track of used indices without having them exposed through the
>> public API.
> 
> 
> Thank you for review, Adrien.
> Hope you are doing well. It's been long since we talked each other. :-)
> 
> Your approach will work too in general but we have a request from customer that
> they want to partition this limited tag storage. Assuming that HW exposes 32bit
> tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
> store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
> table id and 8bit flow id. As they want to split one 32bit storage, I thought it
> is better to provide mask when setting/matching the value. Even some customer
> wants to store multiple flags bit by bit like ol_flags. They do want to alter
> only partial bits.
> 
> And for the index, it is to reference an entry of tags array as HW can provide
> larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
> storage which users can use for their own sake.
> 	tag[0], tag[1], tag[2], tag[3]

Adrien,

Are you okay with this?

Thanks,
Yongseok



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-05 18:05       ` Yongseok Koh
  2019-07-08 23:32         ` Yongseok Koh
@ 2019-07-09  8:38         ` Adrien Mazarguil
  2019-07-11  1:59           ` Yongseok Koh
  1 sibling, 1 reply; 98+ messages in thread
From: Adrien Mazarguil @ 2019-07-09  8:38 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev, Slava Ovsiienko

On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
> > On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
> > 
> > On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> >> A tag is a transient data which can be used during flow match. This can be
> >> used to store match result from a previous table so that the same pattern
> >> need not be matched again on the next table. Even if outer header is
> >> decapsulated on the previous match, the match result can be kept.
> >> 
> >> Some device expose internal registers of its flow processing pipeline and
> >> those registers are quite useful for stateful connection tracking as it
> >> keeps status of flow matching. Multiple tags are supported by specifying
> >> index.
> >> 
> >> Example testpmd commands are:
> >> 
> >>  flow create 0 ingress pattern ... / end
> >>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
> >>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>            vxlan_decap / jump group 1 / end
> >> 
> >>  flow create 0 ingress pattern ... / end
> >>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
> >>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>            vxlan_decap / jump group 1 / end
> >> 
> >>  flow create 0 ingress group 1
> >>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
> >>            eth ... / end
> >>    actions ... jump group 2 / end
> >> 
> >>  flow create 0 ingress group 1
> >>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
> >>            tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>            eth ... / end
> >>    actions ... / end
> >> 
> >>  flow create 0 ingress group 2
> >>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>            eth ... / end
> >>    actions ... / end
> >> 
> >> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > 
> > Hi Yongseok,
> > 
> > Only high level questions for now, while it unquestionably looks useful,
> > from a user standpoint exposing the separate index seems redundant and not
> > necessarily convenient. Using the following example to illustrate:
> > 
> > actions set_tag index 3 value 0x123456 mask 0xffffff
> > 
> > pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> > 
> > I might be missing something, but why isn't this enough:
> > 
> > pattern tag index is 3 # match whatever is stored at index 3
> > 
> > Assuming it can work, then why bother with providing value spec/mask on
> > set_tag? A flow rule pattern matches something, sets some arbitrary tag to
> > be matched by a subsequent flow rule and that's it. It even seems like
> > relying on the index only on both occasions is enough for identification.
> > 
> > Same question for the opposite approach; relying on the value, never
> > mentioning the index.
> > 
> > I'm under the impression that the index is a hardware-specific constraint
> > that shouldn't be exposed (especially since it's an 8-bit field). If so, a
> > PMD could keep track of used indices without having them exposed through the
> > public API.
> 
> 
> Thank you for review, Adrien.
> Hope you are doing well. It's been long since we talked each other. :-)

Yeah clearly! Hope you're doing well too. I'm somewhat busy hence slow to
answer these days...

 <dev@dpdk.org> hey!
 <dev@dpdk.org> no private talks!

Back to the topic:

> Your approach will work too in general but we have a request from customer that
> they want to partition this limited tag storage. Assuming that HW exposes 32bit
> tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
> store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
> table id and 8bit flow id. As they want to split one 32bit storage, I thought it
> is better to provide mask when setting/matching the value. Even some customer
> wants to store multiple flags bit by bit like ol_flags. They do want to alter
> only partial bits.
> 
> And for the index, it is to reference an entry of tags array as HW can provide
> larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
> storage which users can use for their own sake.
> 	tag[0], tag[1], tag[2], tag[3]

OK, looks like I missed the point then. I initially took it for a funky
alternative to RTE_FLOW_ITEM_TYPE_META & RTE_FLOW_ACTION_TYPE_SET_META
(ingress extended [1]) but while it could be used like that, it's more of a
way to temporarily store and retrieve a small amount of data, correct?

Out of curiosity, are these registers independent from META and other
items/actions in mlx5, otherwise what happens if they are combined?

Are there other uses for these registers? Say, referencing their contents
from other places in a flow rule so they don't have to be hard-coded?

Right now I'm still uncomfortable with such a feature in the public API
because compared to META [1], this approach looks very hardware-specific and
seemingly difficult to map on different HW architectures.

However, the main problem is that as described, its end purpose seems
redundant with META, which I think can cover the use cases you gave. So what
can an application do with this that couldn't be done in a more generic
fashion through META?

I may still be missing something and I'm open to ideas, but assuming it
doesn't make it into the public rte_flow API, it remains an interesting
feature on its own merit which could be added to DPDK as PMD-specific
pattern items/actions [2]. mlx5 doesn't have any yet, but it's pretty common
for PMDs to expose a public header that dedicated applications can include
to use this kind of feature (look for rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
No problem with that.

[1] "[PATCH] ethdev: extend flow metadata"
    https://mails.dpdk.org/archives/dev/2019-July/137305.html

[2] "Negative types"
    https://doc.dpdk.org/guides/prog_guide/rte_flow.html#negative-types

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-04 23:21 ` [dpdk-dev] [PATCH] " Yongseok Koh
@ 2019-07-10  9:31   ` Olivier Matz
  2019-07-10  9:55     ` Bruce Richardson
  2019-10-10 16:02   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
  1 sibling, 1 reply; 98+ messages in thread
From: Olivier Matz @ 2019-07-10  9:31 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil, dev,
	viacheslavo

Hi Yongseok,

On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> 
> This patch extends the usability.
> 
> 1) RTE_FLOW_ACTION_TYPE_SET_META
> 
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
> 
> 2) Metadata on ingress
> 
> There's also need to support metadata on packet Rx. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via mbuf metadata field with
> PKT_RX_METADATA ol_flag.
> 
> For this purpose, mbuf->tx_metadata is moved as a separate new field and
> renamed to 'metadata' to support both Rx and Tx metadata.
> 
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on HW capability.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

(...)

> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -200,6 +200,11 @@ extern "C" {
>  
>  /* add new RX flags here */
>  
> +/**
> + * Indicate that mbuf has metadata from device.
> + */
> +#define PKT_RX_METADATA	(1ULL << 23)
> +
>  /* add new TX flags here */
>  
>  /**
> @@ -648,17 +653,6 @@ struct rte_mbuf {
>  			/**< User defined tags. See rte_distributor_process() */
>  			uint32_t usr;
>  		} hash;                   /**< hash information */
> -		struct {
> -			/**
> -			 * Application specific metadata value
> -			 * for egress flow rule match.
> -			 * Valid if PKT_TX_METADATA is set.
> -			 * Located here to allow conjunct use
> -			 * with hash.sched.hi.
> -			 */
> -			uint32_t tx_metadata;
> -			uint32_t reserved;
> -		};
>  	};
>  
>  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> @@ -727,6 +721,11 @@ struct rte_mbuf {
>  	 */
>  	struct rte_mbuf_ext_shared_info *shinfo;
>  
> +	/** Application specific metadata value for flow rule match.
> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> +	 */
> +	uint32_t metadata;
> +
>  } __rte_cache_aligned;

This will break the ABI, so we cannot put it in 19.08, and we need a
deprecation notice.

That said, it shows again that we need something to make the addition of
fields in mbufs more flexible. Knowing that Thomas will present
something about it at the next DPDK Userspace [1], I drafted something in an
RFC [2]. It is simpler than I expected, and I think your commit could be
a good candidate for a first user. Do you mind giving it a try? Feedback
and comments are of course welcome.

If it fits your needs, we can target its introduction for 19.11. Having
a user for this new feature would be a plus for its introduction :)

Thanks,
Olivier

[1] https://dpdkbordeaux2019.sched.com/
[2] http://mails.dpdk.org/archives/dev/2019-July/137772.html

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10  9:31   ` Olivier Matz
@ 2019-07-10  9:55     ` Bruce Richardson
  2019-07-10 10:07       ` Olivier Matz
  0 siblings, 1 reply; 98+ messages in thread
From: Bruce Richardson @ 2019-07-10  9:55 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Yongseok Koh, shahafs, thomas, ferruh.yigit, arybchenko,
	adrien.mazarguil, dev, viacheslavo

On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> Hi Yongseok,
> 
> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> > Currently, metadata can be set on egress path via mbuf tx_metadata field
> > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > 
> > This patch extends the usability.
> > 
> > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > 
> > When supporting multiple tables, Tx metadata can also be set by a rule and
> > matched by another rule. This new action allows metadata to be set as a
> > result of flow match.
> > 
> > 2) Metadata on ingress
> > 
> > There's also need to support metadata on packet Rx. Metadata can be set by
> > SET_META action and matched by META item like Tx. The final value set by
> > the action will be delivered to application via mbuf metadata field with
> > PKT_RX_METADATA ol_flag.
> > 
> > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > renamed to 'metadata' to support both Rx and Tx metadata.
> > 
> > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > propagated to the other path depending on HW capability.
> > 
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> 
> (...)
> 
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -200,6 +200,11 @@ extern "C" {
> >  
> >  /* add new RX flags here */
> >  
> > +/**
> > + * Indicate that mbuf has metadata from device.
> > + */
> > +#define PKT_RX_METADATA	(1ULL << 23)
> > +
> >  /* add new TX flags here */
> >  
> >  /**
> > @@ -648,17 +653,6 @@ struct rte_mbuf {
> >  			/**< User defined tags. See rte_distributor_process() */
> >  			uint32_t usr;
> >  		} hash;                   /**< hash information */
> > -		struct {
> > -			/**
> > -			 * Application specific metadata value
> > -			 * for egress flow rule match.
> > -			 * Valid if PKT_TX_METADATA is set.
> > -			 * Located here to allow conjunct use
> > -			 * with hash.sched.hi.
> > -			 */
> > -			uint32_t tx_metadata;
> > -			uint32_t reserved;
> > -		};
> >  	};
> >  
> >  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> > @@ -727,6 +721,11 @@ struct rte_mbuf {
> >  	 */
> >  	struct rte_mbuf_ext_shared_info *shinfo;
> >  
> > +	/** Application specific metadata value for flow rule match.
> > +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> > +	 */
> > +	uint32_t metadata;
> > +
> >  } __rte_cache_aligned;
> 
> This will break the ABI, so we cannot put it in 19.08, and we need a
> deprecation notice.
> 
Does it actually break the ABI? Adding a new field to the mbuf should only
break the ABI if it either causes existing fields to move or changes the
structure size. Since this is at the end, it's not going to move any older
fields, and since everything is cache-aligned I don't think the structure
size changes either.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10  9:55     ` Bruce Richardson
@ 2019-07-10 10:07       ` Olivier Matz
  2019-07-10 12:01         ` Bruce Richardson
  0 siblings, 1 reply; 98+ messages in thread
From: Olivier Matz @ 2019-07-10 10:07 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Yongseok Koh, shahafs, thomas, ferruh.yigit, arybchenko,
	adrien.mazarguil, dev, viacheslavo

On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> > Hi Yongseok,
> > 
> > On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> > > Currently, metadata can be set on egress path via mbuf tx_metadata field
> > > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > > 
> > > This patch extends the usability.
> > > 
> > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > 
> > > When supporting multiple tables, Tx metadata can also be set by a rule and
> > > matched by another rule. This new action allows metadata to be set as a
> > > result of flow match.
> > > 
> > > 2) Metadata on ingress
> > > 
> > > There's also need to support metadata on packet Rx. Metadata can be set by
> > > SET_META action and matched by META item like Tx. The final value set by
> > > the action will be delivered to application via mbuf metadata field with
> > > PKT_RX_METADATA ol_flag.
> > > 
> > > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > > renamed to 'metadata' to support both Rx and Tx metadata.
> > > 
> > > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > > propagated to the other path depending on HW capability.
> > > 
> > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > 
> > (...)
> > 
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -200,6 +200,11 @@ extern "C" {
> > >  
> > >  /* add new RX flags here */
> > >  
> > > +/**
> > > + * Indicate that mbuf has metadata from device.
> > > + */
> > > +#define PKT_RX_METADATA	(1ULL << 23)
> > > +
> > >  /* add new TX flags here */
> > >  
> > >  /**
> > > @@ -648,17 +653,6 @@ struct rte_mbuf {
> > >  			/**< User defined tags. See rte_distributor_process() */
> > >  			uint32_t usr;
> > >  		} hash;                   /**< hash information */
> > > -		struct {
> > > -			/**
> > > -			 * Application specific metadata value
> > > -			 * for egress flow rule match.
> > > -			 * Valid if PKT_TX_METADATA is set.
> > > -			 * Located here to allow conjunct use
> > > -			 * with hash.sched.hi.
> > > -			 */
> > > -			uint32_t tx_metadata;
> > > -			uint32_t reserved;
> > > -		};
> > >  	};
> > >  
> > >  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> > > @@ -727,6 +721,11 @@ struct rte_mbuf {
> > >  	 */
> > >  	struct rte_mbuf_ext_shared_info *shinfo;
> > >  
> > > +	/** Application specific metadata value for flow rule match.
> > > +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> > > +	 */
> > > +	uint32_t metadata;
> > > +
> > >  } __rte_cache_aligned;
> > 
> > This will break the ABI, so we cannot put it in 19.08, and we need a
> > deprecation notice.
> > 
> Does it actually break the ABI? Adding a new field to the mbuf should only
> break the ABI if it either causes new fields to move or changes the
> structure size. Since this is at the end, it's not going to move any older
> fields, and since everything is cache-aligned I don't think the structure
> size changes either.

I think it does break the ABI: in the previous version, when the PKT_TX_METADATA
flag is set, the associated value is put in m->tx_metadata (offset 44 on
x86-64), and in the next version, it will be in m->metadata (offset 112). So,
these 2 versions are not binary compatible.

Anyway, at least it breaks the API.
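
To make the point concrete, here is a minimal sketch in plain C. The two
structs are hypothetical stand-ins, not the real rte_mbuf: they model only the
two offsets mentioned above (44 and 112) and fill the rest with padding. The
helper names app_set_metadata() and pmd_get_metadata() are made up for the
illustration as well.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the old layout (assumption: everything else is padding). */
struct mbuf_old {
	uint8_t pad0[44];
	uint32_t tx_metadata;	/* offset 44 */
	uint8_t pad1[80];
};

/* Stand-in for the new layout; same total size, field moved to the end. */
struct mbuf_new {
	uint8_t pad0[112];
	uint32_t metadata;	/* offset 112 */
	uint8_t pad1[12];
};

/* An application built against the old header stores the value here... */
static inline void
app_set_metadata(uint8_t *m, uint32_t v)
{
	memcpy(m + offsetof(struct mbuf_old, tx_metadata), &v, sizeof(v));
}

/* ...while a PMD built against the new header loads it from there. */
static inline uint32_t
pmd_get_metadata(const uint8_t *m)
{
	uint32_t v;

	memcpy(&v, m + offsetof(struct mbuf_new, metadata), sizeof(v));
	return v;
}
```

Both stand-ins have the same size and no unrelated field moves, yet a value
written through the old offset is never seen through the new one, so an old
binary and a new library disagree whenever PKT_TX_METADATA is set.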

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10 10:07       ` Olivier Matz
@ 2019-07-10 12:01         ` Bruce Richardson
  2019-07-10 12:26           ` Thomas Monjalon
  0 siblings, 1 reply; 98+ messages in thread
From: Bruce Richardson @ 2019-07-10 12:01 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Yongseok Koh, shahafs, thomas, ferruh.yigit, arybchenko,
	adrien.mazarguil, dev, viacheslavo

On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
> > On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> > > Hi Yongseok,
> > > 
> > > On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> > > > Currently, metadata can be set on egress path via mbuf tx_metadata field
> > > > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > > > 
> > > > This patch extends the usability.
> > > > 
> > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > > 
> > > > When supporting multiple tables, Tx metadata can also be set by a rule and
> > > > matched by another rule. This new action allows metadata to be set as a
> > > > result of flow match.
> > > > 
> > > > 2) Metadata on ingress
> > > > 
> > > > There's also need to support metadata on packet Rx. Metadata can be set by
> > > > SET_META action and matched by META item like Tx. The final value set by
> > > > the action will be delivered to application via mbuf metadata field with
> > > > PKT_RX_METADATA ol_flag.
> > > > 
> > > > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > > > renamed to 'metadata' to support both Rx and Tx metadata.
> > > > 
> > > > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > > > propagated to the other path depending on HW capability.
> > > > 
> > > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > 
> > > (...)
> > > 
> > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > @@ -200,6 +200,11 @@ extern "C" {
> > > >  
> > > >  /* add new RX flags here */
> > > >  
> > > > +/**
> > > > + * Indicate that mbuf has metadata from device.
> > > > + */
> > > > +#define PKT_RX_METADATA	(1ULL << 23)
> > > > +
> > > >  /* add new TX flags here */
> > > >  
> > > >  /**
> > > > @@ -648,17 +653,6 @@ struct rte_mbuf {
> > > >  			/**< User defined tags. See rte_distributor_process() */
> > > >  			uint32_t usr;
> > > >  		} hash;                   /**< hash information */
> > > > -		struct {
> > > > -			/**
> > > > -			 * Application specific metadata value
> > > > -			 * for egress flow rule match.
> > > > -			 * Valid if PKT_TX_METADATA is set.
> > > > -			 * Located here to allow conjunct use
> > > > -			 * with hash.sched.hi.
> > > > -			 */
> > > > -			uint32_t tx_metadata;
> > > > -			uint32_t reserved;
> > > > -		};
> > > >  	};
> > > >  
> > > >  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> > > > @@ -727,6 +721,11 @@ struct rte_mbuf {
> > > >  	 */
> > > >  	struct rte_mbuf_ext_shared_info *shinfo;
> > > >  
> > > > +	/** Application specific metadata value for flow rule match.
> > > > +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> > > > +	 */
> > > > +	uint32_t metadata;
> > > > +
> > > >  } __rte_cache_aligned;
> > > 
> > > This will break the ABI, so we cannot put it in 19.08, and we need a
> > > deprecation notice.
> > > 
> > Does it actually break the ABI? Adding a new field to the mbuf should only
> > break the ABI if it either causes new fields to move or changes the
> > structure size. Since this is at the end, it's not going to move any older
> > fields, and since everything is cache-aligned I don't think the structure
> > size changes either.
> 
> I think it does break the ABI: in previous version, when the PKT_TX_METADATA
> flag is set, the associated value is put in m->tx_metadata (offset 44 on
> x86-64), and in the next version, it will be in m->metadata (offset 112). So,
> these 2 versions are not binary compatible.
> 
> Anyway, at least it breaks the API.

Ok, I misunderstood. I thought it was the structure change itself you were
saying broke the ABI. Yes, putting the data in a different place is indeed
an ABI break.

/Bruce

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10 12:01         ` Bruce Richardson
@ 2019-07-10 12:26           ` Thomas Monjalon
  2019-07-10 16:37             ` Yongseok Koh
  0 siblings, 1 reply; 98+ messages in thread
From: Thomas Monjalon @ 2019-07-10 12:26 UTC (permalink / raw)
  To: Bruce Richardson, Olivier Matz, arybchenko, adrien.mazarguil
  Cc: Yongseok Koh, shahafs, ferruh.yigit, dev, viacheslavo

10/07/2019 14:01, Bruce Richardson:
> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
> > On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
> > > On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> > > > On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> > > > > Currently, metadata can be set on egress path via mbuf tx_metadata field
> > > > > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > > > > 
> > > > > This patch extends the usability.
> > > > > 
> > > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > > > 
> > > > > When supporting multiple tables, Tx metadata can also be set by a rule and
> > > > > matched by another rule. This new action allows metadata to be set as a
> > > > > result of flow match.
> > > > > 
> > > > > 2) Metadata on ingress
> > > > > 
> > > > > There's also need to support metadata on packet Rx. Metadata can be set by
> > > > > SET_META action and matched by META item like Tx. The final value set by
> > > > > the action will be delivered to application via mbuf metadata field with
> > > > > PKT_RX_METADATA ol_flag.
> > > > > 
> > > > > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > > > > renamed to 'metadata' to support both Rx and Tx metadata.
> > > > > 
> > > > > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > > > > propagated to the other path depending on HW capability.
> > > > > 
> > > > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > > 
> > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > @@ -648,17 +653,6 @@ struct rte_mbuf {
> > > > >  			/**< User defined tags. See rte_distributor_process() */
> > > > >  			uint32_t usr;
> > > > >  		} hash;                   /**< hash information */
> > > > > -		struct {
> > > > > -			/**
> > > > > -			 * Application specific metadata value
> > > > > -			 * for egress flow rule match.
> > > > > -			 * Valid if PKT_TX_METADATA is set.
> > > > > -			 * Located here to allow conjunct use
> > > > > -			 * with hash.sched.hi.
> > > > > -			 */
> > > > > -			uint32_t tx_metadata;
> > > > > -			uint32_t reserved;
> > > > > -		};
> > > > >  	};
> > > > >  
> > > > >  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> > > > > @@ -727,6 +721,11 @@ struct rte_mbuf {
> > > > >  	 */
> > > > >  	struct rte_mbuf_ext_shared_info *shinfo;
> > > > >  
> > > > > +	/** Application specific metadata value for flow rule match.
> > > > > +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> > > > > +	 */
> > > > > +	uint32_t metadata;
> > > > > +
> > > > >  } __rte_cache_aligned;
> > > > 
> > > > This will break the ABI, so we cannot put it in 19.08, and we need a
> > > > deprecation notice.
> > > > 
> > > Does it actually break the ABI? Adding a new field to the mbuf should only
> > > break the ABI if it either causes new fields to move or changes the
> > > structure size. Since this is at the end, it's not going to move any older
> > > fields, and since everything is cache-aligned I don't think the structure
> > > size changes either.
> > 
> > I think it does break the ABI: in previous version, when the PKT_TX_METADATA
> > flag is set, the associated value is put in m->tx_metadata (offset 44 on
> > x86-64), and in the next version, it will be in m->metadata (offset 112). So,
> > these 2 versions are not binary compatible.
> > 
> > Anyway, at least it breaks the API.
> 
> Ok, I misunderstood. I thought it was the structure change itself you were
> saying broke the ABI. Yes, putting the data in a different place is indeed
> an ABI break.

We could add the new field and keep the old one unused,
so it does not break the ABI.
However, I suppose everybody will prefer a version using dynamic fields.
Is anyone against using a dynamic field for this usage?



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10 12:26           ` Thomas Monjalon
@ 2019-07-10 16:37             ` Yongseok Koh
  2019-07-11  7:44               ` Adrien Mazarguil
  0 siblings, 1 reply; 98+ messages in thread
From: Yongseok Koh @ 2019-07-10 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, Olivier Matz
  Cc: Bruce Richardson, Andrew Rybchenko, Adrien Mazarguil,
	Shahaf Shuler, Ferruh Yigit, dev, Slava Ovsiienko


> On Jul 10, 2019, at 5:26 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> 10/07/2019 14:01, Bruce Richardson:
>> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
>>> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
>>>> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
>>>>> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
>>>>>> Currently, metadata can be set on egress path via mbuf tx_metadata field
>>>>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
>>>>>> 
>>>>>> This patch extends the usability.
>>>>>> 
>>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>>>> 
>>>>>> When supporting multiple tables, Tx metadata can also be set by a rule and
>>>>>> matched by another rule. This new action allows metadata to be set as a
>>>>>> result of flow match.
>>>>>> 
>>>>>> 2) Metadata on ingress
>>>>>> 
>>>>>> There's also need to support metadata on packet Rx. Metadata can be set by
>>>>>> SET_META action and matched by META item like Tx. The final value set by
>>>>>> the action will be delivered to application via mbuf metadata field with
>>>>>> PKT_RX_METADATA ol_flag.
>>>>>> 
>>>>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
>>>>>> renamed to 'metadata' to support both Rx and Tx metadata.
>>>>>> 
>>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>>>>> propagated to the other path depending on HW capability.
>>>>>> 
>>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>> 
>>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>>>> @@ -648,17 +653,6 @@ struct rte_mbuf {
>>>>>> 			/**< User defined tags. See rte_distributor_process() */
>>>>>> 			uint32_t usr;
>>>>>> 		} hash;                   /**< hash information */
>>>>>> -		struct {
>>>>>> -			/**
>>>>>> -			 * Application specific metadata value
>>>>>> -			 * for egress flow rule match.
>>>>>> -			 * Valid if PKT_TX_METADATA is set.
>>>>>> -			 * Located here to allow conjunct use
>>>>>> -			 * with hash.sched.hi.
>>>>>> -			 */
>>>>>> -			uint32_t tx_metadata;
>>>>>> -			uint32_t reserved;
>>>>>> -		};
>>>>>> 	};
>>>>>> 
>>>>>> 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
>>>>>> @@ -727,6 +721,11 @@ struct rte_mbuf {
>>>>>> 	 */
>>>>>> 	struct rte_mbuf_ext_shared_info *shinfo;
>>>>>> 
>>>>>> +	/** Application specific metadata value for flow rule match.
>>>>>> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
>>>>>> +	 */
>>>>>> +	uint32_t metadata;
>>>>>> +
>>>>>> } __rte_cache_aligned;
>>>>> 
>>>>> This will break the ABI, so we cannot put it in 19.08, and we need a
>>>>> deprecation notice.
>>>>> 
>>>> Does it actually break the ABI? Adding a new field to the mbuf should only
>>>> break the ABI if it either causes new fields to move or changes the
>>>> structure size. Since this is at the end, it's not going to move any older
>>>> fields, and since everything is cache-aligned I don't think the structure
>>>> size changes either.
>>> 
>>> I think it does break the ABI: in previous version, when the PKT_TX_METADATA
>>> flag is set, the associated value is put in m->tx_metadata (offset 44 on
>>> x86-64), and in the next version, it will be in m->metadata (offset 112). So,
>>> these 2 versions are not binary compatible.
>>> 
>>> Anyway, at least it breaks the API.
>> 
>> Ok, I misunderstood. I thought it was the structure change itself you were
>> saying broke the ABI. Yes, putting the data in a different place is indeed
>> an ABI break.
> 
> We could add the new field and keep the old one unused,
> so it does not break the ABI.

That still breaks the ABI if PKT_TX_METADATA is set. :-) To avoid the break, I
can keep the current union'd field (tx_metadata) as is for PKT_TX_METADATA, add
the new field at the end and use it only with the new PKT_RX_METADATA.
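
To make the offset problem concrete, here is a minimal C sketch. The two
structs are hypothetical, heavily simplified stand-ins and not the real
struct rte_mbuf, and the helper names are made up for illustration. It models
an application built against the old layout writing the Tx value and a PMD
built against the new layout reading it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "old" layout: Tx metadata lives in the middle of the struct. */
struct mbuf_old {
	uint64_t ol_flags;
	uint32_t tx_metadata;
	uint32_t reserved;
	void *shinfo;
};

/* Hypothetical "new" layout: the old slot is unused, metadata is appended. */
struct mbuf_new {
	uint64_t ol_flags;
	uint32_t unused[2];
	void *shinfo;
	uint32_t metadata;
};

/* Stand-in for an application compiled against the old header. */
static void old_app_set_tx_meta(unsigned char *raw, uint32_t v)
{
	memcpy(raw + offsetof(struct mbuf_old, tx_metadata), &v, sizeof(v));
}

/* Stand-in for a PMD compiled against the new header. */
static uint32_t new_pmd_get_tx_meta(const unsigned char *raw)
{
	uint32_t v;

	memcpy(&v, raw + offsetof(struct mbuf_new, metadata), sizeof(v));
	return v;
}
```

With a zeroed buffer of sizeof(struct mbuf_new) bytes, a value written through
old_app_set_tx_meta() is not visible through new_pmd_get_tx_meta(), which is
the binary incompatibility Olivier described.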

> However I suppose everybody will prefer a version using dynamic fields.
> Is someone against using dynamic field for such usage?

However, given that the amazing dynamic fields are coming soon (thanks for your
effort, Olivier and Thomas!), I'd be honored to be their first user.

Olivier, I'll take a look at your RFC.


Thanks,
Yongseok


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-09  8:38         ` Adrien Mazarguil
@ 2019-07-11  1:59           ` Yongseok Koh
  2019-10-08 12:57             ` Yigit, Ferruh
  0 siblings, 1 reply; 98+ messages in thread
From: Yongseok Koh @ 2019-07-11  1:59 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev, Slava Ovsiienko

On Tue, Jul 09, 2019 at 10:38:06AM +0200, Adrien Mazarguil wrote:
> On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
> > > On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
> > > 
> > > On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> > >> A tag is transient data which can be used during flow matching. It can be
> > >> used to store the match result from a previous table so that the same pattern
> > >> need not be matched again in the next table. Even if the outer header is
> > >> decapsulated on the previous match, the match result can be kept.
> > >> 
> > >> Some devices expose internal registers of their flow processing pipeline, and
> > >> those registers are quite useful for stateful connection tracking as they
> > >> keep the status of flow matching. Multiple tags are supported by specifying
> > >> an index.
> > >> 
> > >> Example testpmd commands are:
> > >> 
> > >>  flow create 0 ingress pattern ... / end
> > >>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
> > >>            set_tag index 3 value 0x123456 mask 0xffffff /
> > >>            vxlan_decap / jump group 1 / end
> > >> 
> > >>  flow create 0 ingress pattern ... / end
> > >>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
> > >>            set_tag index 3 value 0x123456 mask 0xffffff /
> > >>            vxlan_decap / jump group 1 / end
> > >> 
> > >>  flow create 0 ingress group 1
> > >>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
> > >>            eth ... / end
> > >>    actions ... jump group 2 / end
> > >> 
> > >>  flow create 0 ingress group 1
> > >>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
> > >>            tag index is 3 value spec 0x123456 value mask 0xffffff /
> > >>            eth ... / end
> > >>    actions ... / end
> > >> 
> > >>  flow create 0 ingress group 2
> > >>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
> > >>            eth ... / end
> > >>    actions ... / end
> > >> 
> > >> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > 
> > > Hi Yongseok,
> > > 
> > > Only high level questions for now, while it unquestionably looks useful,
> > > from a user standpoint exposing the separate index seems redundant and not
> > > necessarily convenient. Using the following example to illustrate:
> > > 
> > > actions set_tag index 3 value 0x123456 mask 0xffffff
> > > 
> > > pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> > > 
> > > I might be missing something, but why isn't this enough:
> > > 
> > > pattern tag index is 3 # match whatever is stored at index 3
> > > 
> > > Assuming it can work, then why bother with providing value spec/mask on
> > > set_tag? A flow rule pattern matches something, sets some arbitrary tag to
> > > be matched by a subsequent flow rule and that's it. It even seems like
> > > relying on the index only on both occasions is enough for identification.
> > > 
> > > Same question for the opposite approach; relying on the value, never
> > > mentioning the index.
> > > 
> > > I'm under the impression that the index is a hardware-specific constraint
> > > that shouldn't be exposed (especially since it's an 8-bit field). If so, a
> > > PMD could keep track of used indices without having them exposed through the
> > > public API.
> > 
> > 
> > Thank you for review, Adrien.
> > Hope you are doing well. It's been a long time since we talked to each other. :-)
> 
> Yeah clearly! Hope you're doing well too. I'm somewhat busy hence slow to
> answer these days...
> 
>  <dev@dpdk.org> hey!
>  <dev@dpdk.org> no private talks!
> 
> Back to the topic:
> 
> > Your approach will work too in general, but we have a request from a customer
> > who wants to partition this limited tag storage. Assume that HW exposes 32-bit
> > tags (those are 'registers' in the HW pipeline of mlx5 HW). Customers then want
> > to store multiple values even in one 32-bit storage. For example, a 16-bit vlan
> > tag, an 8-bit table id and an 8-bit flow id. As they want to split one 32-bit
> > storage, I thought it is better to provide a mask when setting/matching the
> > value. Some customers even want to store multiple flags bit by bit, like
> > ol_flags. They do want to alter only some of the bits.
> > 
> > And the index is there to reference an entry of the tags array, as HW can
> > provide more than 32 bits of register storage. For example, mlx5 HW would
> > provide 4 32b storages which users can use for their own purposes.
> > 	tag[0], tag[1], tag[2], tag[3]
> 
> OK, looks like I missed the point then. I initially took it for a funky
> alternative to RTE_FLOW_ITEM_TYPE_META & RTE_FLOW_ACTION_TYPE_SET_META
> (ingress extended [1]) but while it could be used like that, it's more of a
> way to temporarily store and retrieve a small amount of data, correct?

Correct.
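
To illustrate the partitioning described above, here is a small C sketch of
what masked set and masked match on one 32-bit tag would mean. It is only a
model under assumptions: the helper names and the bit positions chosen for the
vlan/table/flow split are hypothetical, and it is not how any PMD or HW
implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical split of one 32-bit tag: 16-bit vlan, 8-bit table, 8-bit flow. */
#define TAG_VLAN_MASK  0xffff0000u
#define TAG_TABLE_MASK 0x0000ff00u
#define TAG_FLOW_MASK  0x000000ffu

/* Masked write: only the bits selected by mask are altered. */
static uint32_t tag_set(uint32_t cur, uint32_t value, uint32_t mask)
{
	return (cur & ~mask) | (value & mask);
}

/* Masked match: only the bits selected by mask are compared. */
static bool tag_match(uint32_t cur, uint32_t spec, uint32_t mask)
{
	return (cur & mask) == (spec & mask);
}
```

Setting the table id this way leaves a previously stored vlan tag untouched,
and a later rule can match on the table id alone without knowing the other
bits.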

> Out of curiosity, are these registers independent from META and other
> items/actions in mlx5, otherwise what happens if they are combined?

I thought about combining them but chose to keep them separate, because TAG is
transient. META can be set by the packet descriptor on Tx and can be delivered
to the host via the mbuf on Rx, but the TAG item can't. If I combined them, users
would have to query this capability for each 32b storage. There would also have
to be a way to request data from such storages (i.e. a new action, e.g.
copy_meta). Let's say there are 4x32b storages - meta[4]. If a user wants to get
one 32b value (meta[i]) out of them into mbuf->metadata, it would be something
like,
	ingress / pattern .. /
	actions ... set_meta index i data x / copy_meta_to_rx index i
And if user wants to set meta[i] via mbuf on Tx,
	egress / pattern meta index is i data is x ... /
	actions ... copy_meta_to_tx index i

For sure, the user would also be responsible for querying these capabilities
for each meta[] storage.

As copy_meta_to_tx/rx isn't a real action, this example would confuse users.
	egress / pattern meta index is i data is x ... /
	actions ... copy_meta_to_tx index i

Users might misunderstand the order of the two things - the meta item and the
copy_meta action. I also thought about having capability bits for each meta[]
storage but that also looked complex.

I do think rte_flow items/actions are better kept simple, atomic and intuitive.
That's why I made this choice.

> Are there other uses for these registers? Say, referencing their contents
> from other places in a flow rule so they don't have to be hard-coded?

Possible.
Actually, this feature is needed by connection tracking of OVS-DPDK.

> Right now I'm still uncomfortable with such a feature in the public API
> because compared to META [1], this approach looks very hardware-specific and
> seemingly difficult to map on different HW architectures.

I wouldn't say it is HW-specific. As I explained above, I just defined this new
item/action to make things easy to use and intuitive.

> However, the main problem is that as described, its end purpose seems
> redundant with META, which I think can cover the use cases you gave. So what
> can an application do with this that couldn't be done in a more generic
> fashion through META?
> 
> I may still be missing something and I'm open to ideas, but assuming it
> doesn't make it into the public rte_flow API, it remains an interesting
> feature on its own merit which could be added to DPDK as PMD-specific
> pattern items/actions [2]. mlx5 doesn't have any yet, but it's pretty common
> for PMDs to expose a public header that dedicated applications can include
> to use this kind of features (look for rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
> No problem with that.

That's good info. Thanks. But still, considering connection-tracking-like
use-cases, this transient storage on a multi-table flow pipeline is quite useful.


thanks,
Yongseok

> [1] "[PATCH] ethdev: extend flow metadata"
>     https://mails.dpdk.org/archives/dev/2019-July/137305.html
> 
> [2] "Negative types"
>     https://doc.dpdk.org/guides/prog_guide/rte_flow.html#negative-types

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10 16:37             ` Yongseok Koh
@ 2019-07-11  7:44               ` Adrien Mazarguil
  2019-07-14 11:46                 ` Andrew Rybchenko
  0 siblings, 1 reply; 98+ messages in thread
From: Adrien Mazarguil @ 2019-07-11  7:44 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: Thomas Monjalon, Olivier Matz, Bruce Richardson,
	Andrew Rybchenko, Shahaf Shuler, Ferruh Yigit, dev,
	Slava Ovsiienko

On Wed, Jul 10, 2019 at 04:37:46PM +0000, Yongseok Koh wrote:
> 
> > On Jul 10, 2019, at 5:26 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
> > 
> > 10/07/2019 14:01, Bruce Richardson:
> >> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
> >>> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
> >>>> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> >>>>> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> >>>>>> Currently, metadata can be set on egress path via mbuf tx_metadata field
> >>>>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> >>>>>> 
> >>>>>> This patch extends the usability.
> >>>>>> 
> >>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
> >>>>>> 
> >>>>>> When supporting multiple tables, Tx metadata can also be set by a rule and
> >>>>>> matched by another rule. This new action allows metadata to be set as a
> >>>>>> result of flow match.
> >>>>>> 
> >>>>>> 2) Metadata on ingress
> >>>>>> 
> >>>>>> There's also need to support metadata on packet Rx. Metadata can be set by
> >>>>>> SET_META action and matched by META item like Tx. The final value set by
> >>>>>> the action will be delivered to application via mbuf metadata field with
> >>>>>> PKT_RX_METADATA ol_flag.
> >>>>>> 
> >>>>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
> >>>>>> renamed to 'metadata' to support both Rx and Tx metadata.
> >>>>>> 
> >>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> >>>>>> propagated to the other path depending on HW capability.
> >>>>>> 
> >>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>>>> 
> >>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
> >>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
> >>>>>> @@ -648,17 +653,6 @@ struct rte_mbuf {
> >>>>>> 			/**< User defined tags. See rte_distributor_process() */
> >>>>>> 			uint32_t usr;
> >>>>>> 		} hash;                   /**< hash information */
> >>>>>> -		struct {
> >>>>>> -			/**
> >>>>>> -			 * Application specific metadata value
> >>>>>> -			 * for egress flow rule match.
> >>>>>> -			 * Valid if PKT_TX_METADATA is set.
> >>>>>> -			 * Located here to allow conjunct use
> >>>>>> -			 * with hash.sched.hi.
> >>>>>> -			 */
> >>>>>> -			uint32_t tx_metadata;
> >>>>>> -			uint32_t reserved;
> >>>>>> -		};
> >>>>>> 	};
> >>>>>> 
> >>>>>> 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> >>>>>> @@ -727,6 +721,11 @@ struct rte_mbuf {
> >>>>>> 	 */
> >>>>>> 	struct rte_mbuf_ext_shared_info *shinfo;
> >>>>>> 
> >>>>>> +	/** Application specific metadata value for flow rule match.
> >>>>>> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> >>>>>> +	 */
> >>>>>> +	uint32_t metadata;
> >>>>>> +
> >>>>>> } __rte_cache_aligned;
> >>>>> 
> >>>>> This will break the ABI, so we cannot put it in 19.08, and we need a
> >>>>> deprecation notice.
> >>>>> 
> >>>> Does it actually break the ABI? Adding a new field to the mbuf should only
> >>>> break the ABI if it either causes new fields to move or changes the
> >>>> structure size. Since this is at the end, it's not going to move any older
> >>>> fields, and since everything is cache-aligned I don't think the structure
> >>>> size changes either.
> >>> 
> >>> I think it does break the ABI: in previous version, when the PKT_TX_METADATA
> >>> flag is set, the associated value is put in m->tx_metadata (offset 44 on
> >>> x86-64), and in the next version, it will be in m->metadata (offset 112). So,
> >>> these 2 versions are not binary compatible.
> >>> 
> >>> Anyway, at least it breaks the API.
> >> 
> >> Ok, I misunderstood. I thought it was the structure change itself you were
> >> saying broke the ABI. Yes, putting the data in a different place is indeed
> >> an ABI break.
> > 
> > We could add the new field and keep the old one unused,
> > so it does not break the ABI.
> 
> Still breaks ABI if PKT_TX_METADATA is set. :-) In order not to break it, I can
> keep the current union'd field (tx_metadata) as is with PKT_TX_METADATA, add
> the new one at the end and make it used with the new PKT_RX_METADATA.
> 
> > However I suppose everybody will prefer a version using dynamic fields.
> > Is someone against using dynamic field for such usage?
> 
> However, given that the amazing dynamic fields is coming soon (thanks for your
> effort, Olivier and Thomas!), I'd be honored to be the first user of it.
> 
> Olivier, I'll take a look at your RFC.

Just got a crazy idea while reading this thread... How about repurposing
that "reserved" field as "rx_metadata" in the meantime?

I know reserved fields are cursed and no one's ever supposed to touch them
but this risk is mitigated by having the end user explicitly request its
use, so the patch author (and his relatives) should be safe from the
resulting bad juju.

Joke aside, while I like the idea of Tx/Rx META, I think the similarities
with MARK (and TAG eventually) are a problem. I wasn't available and couldn't
comment when META was originally added to the Tx path, but there's a lot of
overlap between these items/actions, without anything explaining to the end
user how and why they should pick one over the other, if they can be
combined at all and what happens in that case.

All this must be documented, then we should think about unifying their
respective features and deprecating the less capable items/actions. In my
opinion, users need exactly one method to mark/match some mark while
processing Rx/Tx traffic and *optionally* have that mark read from/written
to the mbuf, which may or may not be possible depending on HW features.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-11  7:44               ` Adrien Mazarguil
@ 2019-07-14 11:46                 ` Andrew Rybchenko
  2019-07-29 15:06                   ` Adrien Mazarguil
  0 siblings, 1 reply; 98+ messages in thread
From: Andrew Rybchenko @ 2019-07-14 11:46 UTC (permalink / raw)
  To: Adrien Mazarguil, Yongseok Koh
  Cc: Thomas Monjalon, Olivier Matz, Bruce Richardson, Shahaf Shuler,
	Ferruh Yigit, dev, Slava Ovsiienko

On 11.07.2019 10:44, Adrien Mazarguil wrote:
> On Wed, Jul 10, 2019 at 04:37:46PM +0000, Yongseok Koh wrote:
>>> On Jul 10, 2019, at 5:26 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
>>>
>>> 10/07/2019 14:01, Bruce Richardson:
>>>> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
>>>>> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
>>>>>> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
>>>>>>> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
>>>>>>>> Currently, metadata can be set on egress path via mbuf tx_metadata field
>>>>>>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
>>>>>>>>
>>>>>>>> This patch extends the usability.
>>>>>>>>
>>>>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>>>>>>
>>>>>>>> When supporting multiple tables, Tx metadata can also be set by a rule and
>>>>>>>> matched by another rule. This new action allows metadata to be set as a
>>>>>>>> result of flow match.
>>>>>>>>
>>>>>>>> 2) Metadata on ingress
>>>>>>>>
>>>>>>>> There's also need to support metadata on packet Rx. Metadata can be set by
>>>>>>>> SET_META action and matched by META item like Tx. The final value set by
>>>>>>>> the action will be delivered to application via mbuf metadata field with
>>>>>>>> PKT_RX_METADATA ol_flag.
>>>>>>>>
>>>>>>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
>>>>>>>> renamed to 'metadata' to support both Rx and Tx metadata.
>>>>>>>>
>>>>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>>>>>>> propagated to the other path depending on HW capability.
>>>>>>>>
>>>>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>>>>>> @@ -648,17 +653,6 @@ struct rte_mbuf {
>>>>>>>> 			/**< User defined tags. See rte_distributor_process() */
>>>>>>>> 			uint32_t usr;
>>>>>>>> 		} hash;                   /**< hash information */
>>>>>>>> -		struct {
>>>>>>>> -			/**
>>>>>>>> -			 * Application specific metadata value
>>>>>>>> -			 * for egress flow rule match.
>>>>>>>> -			 * Valid if PKT_TX_METADATA is set.
>>>>>>>> -			 * Located here to allow conjunct use
>>>>>>>> -			 * with hash.sched.hi.
>>>>>>>> -			 */
>>>>>>>> -			uint32_t tx_metadata;
>>>>>>>> -			uint32_t reserved;
>>>>>>>> -		};
>>>>>>>> 	};
>>>>>>>>
>>>>>>>> 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
>>>>>>>> @@ -727,6 +721,11 @@ struct rte_mbuf {
>>>>>>>> 	 */
>>>>>>>> 	struct rte_mbuf_ext_shared_info *shinfo;
>>>>>>>>
>>>>>>>> +	/** Application specific metadata value for flow rule match.
>>>>>>>> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
>>>>>>>> +	 */
>>>>>>>> +	uint32_t metadata;
>>>>>>>> +
>>>>>>>> } __rte_cache_aligned;
>>>>>>> This will break the ABI, so we cannot put it in 19.08, and we need a
>>>>>>> deprecation notice.
>>>>>>>
>>>>>> Does it actually break the ABI? Adding a new field to the mbuf should only
>>>>>> break the ABI if it either causes new fields to move or changes the
>>>>>> structure size. Since this is at the end, it's not going to move any older
>>>>>> fields, and since everything is cache-aligned I don't think the structure
>>>>>> size changes either.
>>>>> I think it does break the ABI: in previous version, when the PKT_TX_METADATA
>>>>> flag is set, the associated value is put in m->tx_metadata (offset 44 on
>>>>> x86-64), and in the next version, it will be in m->metadata (offset 112). So,
>>>>> these 2 versions are not binary compatible.
>>>>>
>>>>> Anyway, at least it breaks the API.
>>>> Ok, I misunderstood. I thought it was the structure change itself you were
>>>> saying broke the ABI. Yes, putting the data in a different place is indeed
>>>> an ABI break.
>>> We could add the new field and keep the old one unused,
>>> so it does not break the ABI.
>> Still breaks ABI if PKT_TX_METADATA is set. :-) In order not to break it, I can
>> keep the current union'd field (tx_metadata) as is with PKT_TX_METADATA, add
>> the new one at the end and make it used with the new PKT_RX_METADATA.
>>
>>> However I suppose everybody will prefer a version using dynamic fields.
>>> Is someone against using dynamic field for such usage?
>> However, given that the amazing dynamic fields is coming soon (thanks for your
>> effort, Olivier and Thomas!), I'd be honored to be the first user of it.
>>
>> Olivier, I'll take a look at your RFC.
> Just got a crazy idea while reading this thread... How about repurposing
> that "reserved" field as "rx_metadata" in the meantime?

It overlaps with hash.fdir.hi which has RSS hash.

> I know reserved fields are cursed and no one's ever supposed to touch them
> but this risk is mitigated by having the end user explicitly request its
> use, so the patch author (and his relatives) should be safe from the
> resulting bad juju.
>
> Joke aside, while I like the idea of Tx/Rx META, I think the similarities
> with MARK (and TAG eventually) is a problem. I wasn't available and couldn't
> comment when META was originally added to the Tx path, but there's a lot of
> overlap between these items/actions, without anything explaining to the end
> user how and why they should pick one over the other, if they can be
> combined at all and what happens in that case.
>
> All this must be documented, then we should think about unifying their
> respective features and deprecate the less capable items/actions. In my
> opinion, users need exactly one method to mark/match some mark while
> processing Rx/Tx traffic and *optionally* have that mark read from/written
> to the mbuf, which may or may not be possible depending on HW features.
>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-14 11:46                 ` Andrew Rybchenko
@ 2019-07-29 15:06                   ` Adrien Mazarguil
  2019-10-08 12:51                     ` Yigit, Ferruh
  0 siblings, 1 reply; 98+ messages in thread
From: Adrien Mazarguil @ 2019-07-29 15:06 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: Yongseok Koh, Thomas Monjalon, Olivier Matz, Bruce Richardson,
	Shahaf Shuler, Ferruh Yigit, dev, Slava Ovsiienko

On Sun, Jul 14, 2019 at 02:46:58PM +0300, Andrew Rybchenko wrote:
> On 11.07.2019 10:44, Adrien Mazarguil wrote:
> > On Wed, Jul 10, 2019 at 04:37:46PM +0000, Yongseok Koh wrote:
> > > > On Jul 10, 2019, at 5:26 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 
> > > > 10/07/2019 14:01, Bruce Richardson:
> > > > > On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
> > > > > > On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
> > > > > > > On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> > > > > > > > On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> > > > > > > > > > Currently, metadata can be set on egress path via mbuf tx_metadata field
> > > > > > > > > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > > > > > > > > 
> > > > > > > > > This patch extends the usability.
> > > > > > > > > 
> > > > > > > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > > > > > > > 
> > > > > > > > > When supporting multiple tables, Tx metadata can also be set by a rule and
> > > > > > > > > matched by another rule. This new action allows metadata to be set as a
> > > > > > > > > result of flow match.
> > > > > > > > > 
> > > > > > > > > 2) Metadata on ingress
> > > > > > > > > 
> > > > > > > > > There's also need to support metadata on packet Rx. Metadata can be set by
> > > > > > > > > SET_META action and matched by META item like Tx. The final value set by
> > > > > > > > > the action will be delivered to application via mbuf metadata field with
> > > > > > > > > PKT_RX_METADATA ol_flag.
> > > > > > > > > 
> > > > > > > > > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > > > > > > > > renamed to 'metadata' to support both Rx and Tx metadata.
> > > > > > > > > 
> > > > > > > > > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > > > > > > > > propagated to the other path depending on HW capability.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > > > > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > @@ -648,17 +653,6 @@ struct rte_mbuf {
> > > > > > > > > 			/**< User defined tags. See rte_distributor_process() */
> > > > > > > > > 			uint32_t usr;
> > > > > > > > > 		} hash;                   /**< hash information */
> > > > > > > > > -		struct {
> > > > > > > > > -			/**
> > > > > > > > > -			 * Application specific metadata value
> > > > > > > > > -			 * for egress flow rule match.
> > > > > > > > > -			 * Valid if PKT_TX_METADATA is set.
> > > > > > > > > -			 * Located here to allow conjunct use
> > > > > > > > > -			 * with hash.sched.hi.
> > > > > > > > > -			 */
> > > > > > > > > -			uint32_t tx_metadata;
> > > > > > > > > -			uint32_t reserved;
> > > > > > > > > -		};
> > > > > > > > > 	};
> > > > > > > > > 
> > > > > > > > > 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> > > > > > > > > @@ -727,6 +721,11 @@ struct rte_mbuf {
> > > > > > > > > 	 */
> > > > > > > > > 	struct rte_mbuf_ext_shared_info *shinfo;
> > > > > > > > > 
> > > > > > > > > +	/** Application specific metadata value for flow rule match.
> > > > > > > > > +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> > > > > > > > > +	 */
> > > > > > > > > +	uint32_t metadata;
> > > > > > > > > +
> > > > > > > > > } __rte_cache_aligned;
> > > > > > > > This will break the ABI, so we cannot put it in 19.08, and we need a
> > > > > > > > deprecation notice.
> > > > > > > > 
> > > > > > > Does it actually break the ABI? Adding a new field to the mbuf should only
> > > > > > > break the ABI if it either causes new fields to move or changes the
> > > > > > > structure size. Since this is at the end, it's not going to move any older
> > > > > > > fields, and since everything is cache-aligned I don't think the structure
> > > > > > > size changes either.
> > > > > > I think it does break the ABI: in previous version, when the PKT_TX_METADATA
> > > > > > flag is set, the associated value is put in m->tx_metadata (offset 44 on
> > > > > > x86-64), and in the next version, it will be in m->metadata (offset 112). So,
> > > > > > these 2 versions are not binary compatible.
> > > > > > 
> > > > > > Anyway, at least it breaks the API.
> > > > > Ok, I misunderstood. I thought it was the structure change itself you were
> > > > > saying broke the ABI. Yes, putting the data in a different place is indeed
> > > > > an ABI break.
> > > > We could add the new field and keep the old one unused,
> > > > so it does not break the ABI.
> > > Still breaks ABI if PKT_TX_METADATA is set. :-) In order not to break it, I can
> > > keep the current union'd field (tx_metadata) as is with PKT_TX_METADATA, add
> > > the new one at the end and make it used with the new PKT_RX_METADATA.
> > > 
> > > > However I suppose everybody will prefer a version using dynamic fields.
> > > > Is someone against using dynamic field for such usage?
> > > However, given that the amazing dynamic fields is coming soon (thanks for your
> > > effort, Olivier and Thomas!), I'd be honored to be the first user of it.
> > > 
> > > Olivier, I'll take a look at your RFC.
> > Just got a crazy idea while reading this thread... How about repurposing
> > that "reserved" field as "rx_metadata" in the meantime?
> 
> It overlaps with hash.fdir.hi which has RSS hash.

While it does overlap with hash.fdir.hi, isn't the RSS hash stored in the
"rss" field overlapping with hash.fdir.lo? (see struct rte_flow_action_rss)

hash.fdir.hi was originally used by FDIR and later repurposed by rte_flow
for its MARK action, which neatly qualifies as Rx metadata so renaming
"reserved" as "rx_metadata" could already make sense.

That is, assuming users do not need two different kinds of Rx metadata
returned simultaneously with their packets. I think it's safe.
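
For reference, the overlap in question can be checked with a simplified C
sketch of the union. This is an assumption-laden model based on the hunk quoted
above, not the real struct rte_mbuf: the type name is made up and the "meta"
member is named only for illustration (the real struct is anonymous).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of the mbuf hash union. */
union hash_sketch {
	uint32_t rss;                 /* RSS hash */
	struct {
		uint32_t lo;
		uint32_t hi;          /* used by FDIR, later by MARK */
	} fdir;
	struct {
		uint32_t tx_metadata;
		uint32_t reserved;    /* candidate for "rx_metadata" */
	} meta;
};

/* In this model, rss shares storage with fdir.lo ... */
_Static_assert(offsetof(union hash_sketch, rss) ==
	       offsetof(union hash_sketch, fdir.lo), "rss overlaps fdir.lo");
/* ... and reserved shares storage with fdir.hi. */
_Static_assert(offsetof(union hash_sketch, meta.reserved) ==
	       offsetof(union hash_sketch, fdir.hi), "reserved overlaps fdir.hi");
```

Under these assumptions, a value stored in "reserved" would not clobber the
RSS hash, but it would share storage with the MARK value in fdir.hi.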

> > I know reserved fields are cursed and no one's ever supposed to touch them
> > but this risk is mitigated by having the end user explicitly request its
> > use, so the patch author (and his relatives) should be safe from the
> > resulting bad juju.
> > 
> > Joke aside, while I like the idea of Tx/Rx META, I think the similarities
> > with MARK (and TAG eventually) is a problem. I wasn't available and couldn't
> > comment when META was originally added to the Tx path, but there's a lot of
> > overlap between these items/actions, without anything explaining to the end
> > user how and why they should pick one over the other, if they can be
> > combined at all and what happens in that case.
> > 
> > All this must be documented, then we should think about unifying their
> > respective features and deprecate the less capable items/actions. In my
> > opinion, users need exactly one method to mark/match some mark while
> > processing Rx/Tx traffic and *optionally* have that mark read from/written
> > to the mbuf, which may or may not be possible depending on HW features.

Thoughts regarding this suggestion? From a user perspective I think all
these actions should be unified but maybe there are good reasons to keep
them separate?

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-29 15:06                   ` Adrien Mazarguil
@ 2019-10-08 12:51                     ` Yigit, Ferruh
  2019-10-08 13:17                       ` Slava Ovsiienko
  0 siblings, 1 reply; 98+ messages in thread
From: Yigit, Ferruh @ 2019-10-08 12:51 UTC (permalink / raw)
  To: Adrien Mazarguil, Andrew Rybchenko
  Cc: Yongseok Koh, Thomas Monjalon, Olivier Matz, Bruce Richardson,
	Shahaf Shuler, Ferruh Yigit, dev, Slava Ovsiienko

On 7/29/2019 4:06 PM, Adrien Mazarguil wrote:
> On Sun, Jul 14, 2019 at 02:46:58PM +0300, Andrew Rybchenko wrote:
>> On 11.07.2019 10:44, Adrien Mazarguil wrote:
>>> On Wed, Jul 10, 2019 at 04:37:46PM +0000, Yongseok Koh wrote:
>>>>> On Jul 10, 2019, at 5:26 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
>>>>>
>>>>> 10/07/2019 14:01, Bruce Richardson:
>>>>>> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
>>>>>>> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
>>>>>>>> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
>>>>>>>>> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
>>>>>>>>>> Currently, metadata can be set on egress path via mbuf tx_metadata field
>>>>>>>>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
>>>>>>>>>>
>>>>>>>>>> This patch extends the usability.
>>>>>>>>>>
>>>>>>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>>>>>>>>
>>>>>>>>>> When supporting multiple tables, Tx metadata can also be set by a rule and
>>>>>>>>>> matched by another rule. This new action allows metadata to be set as a
>>>>>>>>>> result of flow match.
>>>>>>>>>>
>>>>>>>>>> 2) Metadata on ingress
>>>>>>>>>>
>>>>>>>>>> There's also need to support metadata on packet Rx. Metadata can be set by
>>>>>>>>>> SET_META action and matched by META item like Tx. The final value set by
>>>>>>>>>> the action will be delivered to application via mbuf metadata field with
>>>>>>>>>> PKT_RX_METADATA ol_flag.
>>>>>>>>>>
>>>>>>>>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
>>>>>>>>>> renamed to 'metadata' to support both Rx and Tx metadata.
>>>>>>>>>>
>>>>>>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>>>>>>>>> propagated to the other path depending on HW capability.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>>>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>>>>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>>>>>>>> @@ -648,17 +653,6 @@ struct rte_mbuf {
>>>>>>>>>> 			/**< User defined tags. See rte_distributor_process() */
>>>>>>>>>> 			uint32_t usr;
>>>>>>>>>> 		} hash;                   /**< hash information */
>>>>>>>>>> -		struct {
>>>>>>>>>> -			/**
>>>>>>>>>> -			 * Application specific metadata value
>>>>>>>>>> -			 * for egress flow rule match.
>>>>>>>>>> -			 * Valid if PKT_TX_METADATA is set.
>>>>>>>>>> -			 * Located here to allow conjunct use
>>>>>>>>>> -			 * with hash.sched.hi.
>>>>>>>>>> -			 */
>>>>>>>>>> -			uint32_t tx_metadata;
>>>>>>>>>> -			uint32_t reserved;
>>>>>>>>>> -		};
>>>>>>>>>> 	};
>>>>>>>>>>
>>>>>>>>>> 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
>>>>>>>>>> @@ -727,6 +721,11 @@ struct rte_mbuf {
>>>>>>>>>> 	 */
>>>>>>>>>> 	struct rte_mbuf_ext_shared_info *shinfo;
>>>>>>>>>>
>>>>>>>>>> +	/** Application specific metadata value for flow rule match.
>>>>>>>>>> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
>>>>>>>>>> +	 */
>>>>>>>>>> +	uint32_t metadata;
>>>>>>>>>> +
>>>>>>>>>> } __rte_cache_aligned;
>>>>>>>>> This will break the ABI, so we cannot put it in 19.08, and we need a
>>>>>>>>> deprecation notice.
>>>>>>>>>
>>>>>>>> Does it actually break the ABI? Adding a new field to the mbuf should only
>>>>>>>> break the ABI if it either causes new fields to move or changes the
>>>>>>>> structure size. Since this is at the end, it's not going to move any older
>>>>>>>> fields, and since everything is cache-aligned I don't think the structure
>>>>>>>> size changes either.
>>>>>>> I think it does break the ABI: in previous version, when the PKT_TX_METADATA
>>>>>>> flag is set, the associated value is put in m->tx_metadata (offset 44 on
>>>>>>> x86-64), and in the next version, it will be in m->metadata (offset 112). So,
>>>>>>> these 2 versions are not binary compatible.
>>>>>>>
>>>>>>> Anyway, at least it breaks the API.
>>>>>> Ok, I misunderstood. I thought it was the structure change itself you were
>>>>>> saying broke the ABI. Yes, putting the data in a different place is indeed
>>>>>> an ABI break.
>>>>> We could add the new field and keep the old one unused,
>>>>> so it does not break the ABI.
>>>> Still breaks ABI if PKT_TX_METADATA is set. :-) In order not to break it, I can
>>>> keep the current union'd field (tx_metadata) as is with PKT_TX_METADATA, add
>>>> the new one at the end and make it used with the new PKT_RX_METADATA.
>>>>
>>>>> However I suppose everybody will prefer a version using dynamic fields.
>>>>> Is someone against using dynamic field for such usage?
>>>> However, given that the amazing dynamic fields is coming soon (thanks for your
>>>> effort, Olivier and Thomas!), I'd be honored to be the first user of it.
>>>>
>>>> Olivier, I'll take a look at your RFC.
>>> Just got a crazy idea while reading this thread... How about repurposing
>>> that "reserved" field as "rx_metadata" in the meantime?
>>
>> It overlaps with hash.fdir.hi which has RSS hash.
> 
> While it does overlap with hash.fdir.hi, isn't the RSS hash stored in the
> "rss" field overlapping with hash.fdir.lo? (see struct rte_flow_action_rss)
> 
> hash.fdir.hi was originally used by FDIR and later repurposed by rte_flow
> for its MARK action, which neatly qualifies as Rx metadata so renaming
> "reserved" as "rx_metadata" could already make sense.
> 
> That is, assuming users do not need two different kinds of Rx metadata
> returned simultaneously with their packets. I think it's safe.
> 
>>> I know reserved fields are cursed and no one's ever supposed to touch them
>>> but this risk is mitigated by having the end user explicitly request its
>>> use, so the patch author (and his relatives) should be safe from the
>>> resulting bad juju.
>>>
>>> Joke aside, while I like the idea of Tx/Rx META, I think the similarities
>>> with MARK (and TAG eventually) is a problem. I wasn't available and couldn't
>>> comment when META was originally added to the Tx path, but there's a lot of
>>> overlap between these items/actions, without anything explaining to the end
>>> user how and why they should pick one over the other, if they can be
>>> combined at all and what happens in that case.
>>>
>>> All this must be documented, then we should think about unifying their
>>> respective features and deprecate the less capable items/actions. In my
>>> opinion, users need exactly one method to mark/match some mark while
>>> processing Rx/Tx traffic and *optionally* have that mark read from/written
>>> to the mbuf, which may or may not be possible depending on HW features.
> 
> Thoughts regarding this suggestion? From a user perspective I think all
> these actions should be unified but maybe there are good reasons to keep
> them separate?
> 

I think the more recent plan is to introduce dynamic fields in the remaining
16 bytes of the second cacheline.

I will mark the patch as rejected. Are there any objections?
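
For anyone reading the archive later, the ABI point quoted above can be
sketched with a toy structure. This is an illustration only: struct pkt_old
and struct pkt_new are hypothetical stand-ins, not the real struct rte_mbuf,
so their offsets differ from the 44 and 112 quoted for x86-64. The sketch
assumes a GCC/Clang-style aligned attribute and C11. It shows that the
structure size can stay the same while the value moves to another offset,
which is what makes the two versions binary incompatible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: metadata lives inside the hash union. */
struct pkt_old {
	uint64_t ol_flags;
	union {
		uint32_t rss;
		struct {
			uint32_t tx_metadata; /* valid if a Tx metadata flag is set */
			uint32_t reserved;
		};
	};
	void *shinfo;
} __attribute__((aligned(64)));

/* Hypothetical "after" layout: metadata is a new field at the end. */
struct pkt_new {
	uint64_t ol_flags;
	union {
		uint32_t rss;
		struct {
			uint32_t lo;
			uint32_t hi;
		} fdir;
	};
	void *shinfo;
	uint32_t metadata; /* valid for both Rx and Tx */
} __attribute__((aligned(64)));

/* Cache alignment absorbs the new field: the size does not change. */
static_assert(sizeof(struct pkt_old) == sizeof(struct pkt_new),
	      "size is unchanged by the appended field");

/* Byte distance between where old and new code expect the value. */
static inline ptrdiff_t
metadata_offset_delta(void)
{
	return (ptrdiff_t)offsetof(struct pkt_new, metadata) -
	       (ptrdiff_t)offsetof(struct pkt_old, tx_metadata);
}
```

An application built against the old layout and a library built against the
new one would read and write different bytes for the same flag, even though
sizeof() matches.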

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-11  1:59           ` Yongseok Koh
@ 2019-10-08 12:57             ` Yigit, Ferruh
  2019-10-08 13:18               ` Slava Ovsiienko
  0 siblings, 1 reply; 98+ messages in thread
From: Yigit, Ferruh @ 2019-10-08 12:57 UTC (permalink / raw)
  To: Yongseok Koh, Adrien Mazarguil
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev, Slava Ovsiienko

On 7/11/2019 2:59 AM, Yongseok Koh wrote:
> On Tue, Jul 09, 2019 at 10:38:06AM +0200, Adrien Mazarguil wrote:
>> On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
>>>> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
>>>>
>>>> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
>>>>> A tag is a transient data which can be used during flow match. This can be
>>>>> used to store match result from a previous table so that the same pattern
>>>>> need not be matched again on the next table. Even if outer header is
>>>>> decapsulated on the previous match, the match result can be kept.
>>>>>
>>>>> Some device expose internal registers of its flow processing pipeline and
>>>>> those registers are quite useful for stateful connection tracking as it
>>>>> keeps status of flow matching. Multiple tags are supported by specifying
>>>>> index.
>>>>>
>>>>> Example testpmd commands are:
>>>>>
>>>>>  flow create 0 ingress pattern ... / end
>>>>>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>>>>            vxlan_decap / jump group 1 / end
>>>>>
>>>>>  flow create 0 ingress pattern ... / end
>>>>>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
>>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>>>>            vxlan_decap / jump group 1 / end
>>>>>
>>>>>  flow create 0 ingress group 1
>>>>>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>>>>>            eth ... / end
>>>>>    actions ... jump group 2 / end
>>>>>
>>>>>  flow create 0 ingress group 1
>>>>>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>>>>>            tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>>>            eth ... / end
>>>>>    actions ... / end
>>>>>
>>>>>  flow create 0 ingress group 2
>>>>>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>>>            eth ... / end
>>>>>    actions ... / end
>>>>>
>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>
>>>> Hi Yongseok,
>>>>
>>>> Only high level questions for now, while it unquestionably looks useful,
>>>> from a user standpoint exposing the separate index seems redundant and not
>>>> necessarily convenient. Using the following example to illustrate:
>>>>
>>>> actions set_tag index 3 value 0x123456 mask 0xffffff
>>>>
>>>> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
>>>>
>>>> I might be missing something, but why isn't this enough:
>>>>
>>>> pattern tag index is 3 # match whatever is stored at index 3
>>>>
>>>> Assuming it can work, then why bother with providing value spec/mask on
>>>> set_tag? A flow rule pattern matches something, sets some arbitrary tag to
>>>> be matched by a subsequent flow rule and that's it. It even seems like
>>>> relying on the index only on both occasions is enough for identification.
>>>>
>>>> Same question for the opposite approach; relying on the value, never
>>>> mentioning the index.
>>>>
>>>> I'm under the impression that the index is a hardware-specific constraint
>>>> that shouldn't be exposed (especially since it's an 8-bit field). If so, a
>>>> PMD could keep track of used indices without having them exposed through the
>>>> public API.
>>>
>>>
>>> Thank you for review, Adrien.
>>> Hope you are doing well. It's been long since we talked each other. :-)
>>
>> Yeah clearly! Hope you're doing well too. I'm somewhat busy hence slow to
>> answer these days...
>>
>>  <dev@dpdk.org> hey!
>>  <dev@dpdk.org> no private talks!
>>
>> Back to the topic:
>>
>>> Your approach will work too in general but we have a request from customer that
>>> they want to partition this limited tag storage. Assuming that HW exposes 32bit
>>> tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
>>> store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
>>> table id and 8bit flow id. As they want to split one 32bit storage, I thought it
>>> is better to provide mask when setting/matching the value. Even some customer
>>> wants to store multiple flags bit by bit like ol_flags. They do want to alter
>>> only partial bits.
>>>
>>> And for the index, it is to reference an entry of tags array as HW can provide
>>> larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
>>> storage which users can use for their own sake.
>>> 	tag[0], tag[1], tag[2], tag[3]
>>
>> OK, looks like I missed the point then. I initially took it for a funky
>> alternative to RTE_FLOW_ITEM_TYPE_META & RTE_FLOW_ACTION_TYPE_SET_META
>> (ingress extended [1]) but while it could be used like that, it's more of a
>> way to temporarily store and retrieve a small amount of data, correct?
> 
> Correct.
> 
>> Out of curiosity, are these registers independent from META and other
>> items/actions in mlx5, otherwise what happens if they are combined?
> 
> I thought about combining it but I chose this way. Because it is transient. META
> can be set by packet descriptor on Tx and can be delivered to host via mbuf on
> Rx, but this TAG item can't. If I combine it, users have to query this
> capability for each 32b storage. And also, there should be a way to request data
> from such storages (i.e. new action , e.g. copy_meta). Let's say there are 4x32b
> storages - meta[4]. If user wants to get one 32b data (meta[i]) out of them to
> mbuf->metadata, it should be something like,
> 	ingress / pattern .. /
> 	actions ... set_meta index i data x / copy_meta_to_rx index i
> And if user wants to set meta[i] via mbuf on Tx,
> 	egress / pattern meta index is i data is x ... /
> 	actions ... copy_meta_to_tx index i
> 
> For sure, user is also responsible for querying these capabilities per each
> meta[] storage.
> 
> As copy_meta_to_tx/rx isn't a real action, this example would confuse user.
> 	egress / pattern meta index is i data is x ... /
> 	actions ... copy_meta_to_tx index i
> 
> User might misunderstand the order of two things - item meta and copy_meta
> action. I also thought about having capability bits per each meta[] storage but
> it also looked complex.
> 
> I do think rte_flow item/action is better to be simple, atomic and intuitive.
> That's why I made this choice.
> 
>> Are there other uses for these registers? Say, referencing their contents
>> from other places in a flow rule so they don't have to be hard-coded?
> 
> Possible.
> Actually, this feature is needed by connection tracking of OVS-DPDK.
> 
>> Right now I'm still uncomfortable with such a feature in the public API
>> because compared to META [1], this approach looks very hardware-specific and
>> seemingly difficult to map on different HW architectures.
> 
> I wouldn't say it is HW-specific. Like I explained above, I just define this new
> item/action to make things easy-to-use and intuitive.
> 
>> However, the main problem is that as described, its end purpose seems
>> redundant with META, which I think can cover the use cases you gave. So what
>> can an application do with this that couldn't be done in a more generic
>> fashion through META?
>>
>> I may still be missing something and I'm open to ideas, but assuming it
>> doesn't make it into the public rte_flow API, it remains an interesting
>> feature on its own merit which could be added to DPDK as PMD-specific
>> pattern items/actions [2]. mlx5 doesn't have any yet, but it's pretty common
>> for PMDs to expose a public header that dedicated applications can include
>> to use this kind of features (look for rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
>> No problem with that.
> 
> That's good info. Thanks. But still considering connection-tracking-like
> use-cases, this transient storage on multi-table flow pipeline is quite useful.
> 
> 
> thanks,
> Yongseok
> 
>> [1] "[PATCH] ethdev: extend flow metadata"
>>     https://mails.dpdk.org/archives/dev/2019-July/137305.html
>>
>> [2] "Negative types"
>>     https://doc.dpdk.org/guides/prog_guide/rte_flow.html#negative-types

Is this RFC still valid? Will there be any follow-up?
If not, I will mark it as rejected in the next few days.
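
As a reading aid for the value/mask discussion quoted above, here is a
minimal sketch in plain C of how a masked tag write and a masked tag match
could behave. It is not DPDK API: struct tag_regs, tag_set() and tag_match()
are hypothetical names, TAG_COUNT of 4 is taken from the mlx5 example in the
thread, and the "only masked bits are altered" behaviour is an assumption
based on the stated customer request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TAG_COUNT 4 /* assumption: four 32-bit registers, tag[0]..tag[3] */

struct tag_regs {
	uint32_t tag[TAG_COUNT];
};

/* Write only the bits selected by mask; all other bits keep their value. */
static inline void
tag_set(struct tag_regs *r, uint8_t index, uint32_t value, uint32_t mask)
{
	assert(index < TAG_COUNT);
	r->tag[index] = (r->tag[index] & ~mask) | (value & mask);
}

/* Compare only the bits selected by mask against spec. */
static inline bool
tag_match(const struct tag_regs *r, uint8_t index, uint32_t spec,
	  uint32_t mask)
{
	assert(index < TAG_COUNT);
	return (r->tag[index] & mask) == (spec & mask);
}
```

With this model, two set_tag actions on index 2 using masks 0xffff00ff and
0xff00 touch disjoint bits, so both stored values can later be matched
independently, which is the partitioning use case described in the thread.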

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-10-08 12:51                     ` Yigit, Ferruh
@ 2019-10-08 13:17                       ` Slava Ovsiienko
  0 siblings, 0 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-08 13:17 UTC (permalink / raw)
  To: Yigit, Ferruh, Adrien Mazarguil, Andrew Rybchenko
  Cc: Yongseok Koh, Thomas Monjalon, Olivier Matz, Bruce Richardson,
	Shahaf Shuler, Ferruh Yigit, dev

> -----Original Message-----
> From: Yigit, Ferruh <ferruh.yigit@linux.intel.com>
> Sent: Tuesday, October 8, 2019 15:51
> To: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>
> Cc: Yongseok Koh <yskoh@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Olivier Matz <olivier.matz@6wind.com>; Bruce
> Richardson <bruce.richardson@intel.com>; Shahaf Shuler
> <shahafs@mellanox.com>; Ferruh Yigit <ferruh.yigit@intel.com>; dev
> <dev@dpdk.org>; Slava Ovsiienko <viacheslavo@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
> 
> On 7/29/2019 4:06 PM, Adrien Mazarguil wrote:
> > On Sun, Jul 14, 2019 at 02:46:58PM +0300, Andrew Rybchenko wrote:
> >> On 11.07.2019 10:44, Adrien Mazarguil wrote:
> >>> On Wed, Jul 10, 2019 at 04:37:46PM +0000, Yongseok Koh wrote:
> >>>>> On Jul 10, 2019, at 5:26 AM, Thomas Monjalon
> <thomas@monjalon.net> wrote:
> >>>>>
> >>>>> 10/07/2019 14:01, Bruce Richardson:
> >>>>>> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
> >>>>>>> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson
> wrote:
> >>>>>>>> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> >>>>>>>>> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> >>>>>>>>>> Currently, metadata can be set on egress path via mbuf
> >>>>>>>>>> tx_metadata field with PKT_TX_METADATA flag and
> RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> >>>>>>>>>>
> >>>>>>>>>> This patch extends the usability.
> >>>>>>>>>>
> >>>>>>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
> >>>>>>>>>>
> >>>>>>>>>> When supporting multiple tables, Tx metadata can also be set
> >>>>>>>>>> by a rule and matched by another rule. This new action allows
> >>>>>>>>>> metadata to be set as a result of flow match.
> >>>>>>>>>>
> >>>>>>>>>> 2) Metadata on ingress
> >>>>>>>>>>
> >>>>>>>>>> There's also need to support metadata on packet Rx. Metadata
> >>>>>>>>>> can be set by SET_META action and matched by META item like
> >>>>>>>>>> Tx. The final value set by the action will be delivered to
> >>>>>>>>>> application via mbuf metadata field with PKT_RX_METADATA
> ol_flag.
> >>>>>>>>>>
> >>>>>>>>>> For this purpose, mbuf->tx_metadata is moved as a separate
> >>>>>>>>>> new field and renamed to 'metadata' to support both Rx and Tx
> metadata.
> >>>>>>>>>>
> >>>>>>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or
> may
> >>>>>>>>>> not be propagated to the other path depending on HW
> capability.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>>>>>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
> >>>>>>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
> >>>>>>>>>> @@ -648,17 +653,6 @@ struct rte_mbuf {
> >>>>>>>>>> 			/**< User defined tags. See rte_distributor_process() */
> >>>>>>>>>> 			uint32_t usr;
> >>>>>>>>>> 		} hash;                   /**< hash information */
> >>>>>>>>>> -		struct {
> >>>>>>>>>> -			/**
> >>>>>>>>>> -			 * Application specific metadata value
> >>>>>>>>>> -			 * for egress flow rule match.
> >>>>>>>>>> -			 * Valid if PKT_TX_METADATA is set.
> >>>>>>>>>> -			 * Located here to allow conjunct use
> >>>>>>>>>> -			 * with hash.sched.hi.
> >>>>>>>>>> -			 */
> >>>>>>>>>> -			uint32_t tx_metadata;
> >>>>>>>>>> -			uint32_t reserved;
> >>>>>>>>>> -		};
> >>>>>>>>>> 	};
> >>>>>>>>>>
> >>>>>>>>>> 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> >>>>>>>>>> @@ -727,6 +721,11 @@ struct rte_mbuf {
> >>>>>>>>>> 	 */
> >>>>>>>>>> 	struct rte_mbuf_ext_shared_info *shinfo;
> >>>>>>>>>>
> >>>>>>>>>> +	/** Application specific metadata value for flow rule match.
> >>>>>>>>>> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> >>>>>>>>>> +	 */
> >>>>>>>>>> +	uint32_t metadata;
> >>>>>>>>>> +
> >>>>>>>>>> } __rte_cache_aligned;
> >>>>>>>>> This will break the ABI, so we cannot put it in 19.08, and we
> >>>>>>>>> need a deprecation notice.
> >>>>>>>>>
> >>>>>>>> Does it actually break the ABI? Adding a new field to the mbuf
> >>>>>>>> should only break the ABI if it either causes new fields to
> >>>>>>>> move or changes the structure size. Since this is at the end,
> >>>>>>>> it's not going to move any older fields, and since everything
> >>>>>>>> is cache-aligned I don't think the structure size changes either.
> >>>>>>> I think it does break the ABI: in previous version, when the
> >>>>>>> PKT_TX_METADATA flag is set, the associated value is put in
> >>>>>>> m->tx_metadata (offset 44 on x86-64), and in the next version,
> >>>>>>> it will be in m->metadata (offset 112). So, these 2 versions are not
> binary compatible.
> >>>>>>>
> >>>>>>> Anyway, at least it breaks the API.
> >>>>>> Ok, I misunderstood. I thought it was the structure change itself
> >>>>>> you were saying broke the ABI. Yes, putting the data in a
> >>>>>> different place is indeed an ABI break.
> >>>>> We could add the new field and keep the old one unused, so it does
> >>>>> not break the ABI.
> >>>> Still breaks ABI if PKT_TX_METADATA is set. :-) In order not to
> >>>> break it, I can keep the current union'd field (tx_metadata) as is
> >>>> with PKT_TX_METADATA, add the new one at the end and make it used
> with the new PKT_RX_METADATA.
> >>>>
> >>>>> However I suppose everybody will prefer a version using dynamic
> fields.
> >>>>> Is someone against using dynamic field for such usage?
> >>>> However, given that the amazing dynamic fields is coming soon
> >>>> (thanks for your effort, Olivier and Thomas!), I'd be honored to be the
> first user of it.
> >>>>
> >>>> Olivier, I'll take a look at your RFC.
> >>> Just got a crazy idea while reading this thread... How about
> >>> repurposing that "reserved" field as "rx_metadata" in the meantime?
> >>
> >> It overlaps with hash.fdir.hi which has RSS hash.
> >
> > While it does overlap with hash.fdir.hi, isn't the RSS hash stored in
> > the "rss" field overlapping with hash.fdir.lo? (see struct
> > rte_flow_action_rss)
> >
> > hash.fdir.hi was originally used by FDIR and later repurposed by
> > rte_flow for its MARK action, which neatly qualifies as Rx metadata so
> > renaming "reserved" as "rx_metadata" could already make sense.
> >
> > That is, assuming users do not need two different kinds of Rx metadata
> > returned simultaneously with their packets. I think it's safe.
> >
> >>> I know reserved fields are cursed and no one's ever supposed to
> >>> touch them but this risk is mitigated by having the end user
> >>> explicitly request its use, so the patch author (and his relatives)
> >>> should be safe from the resulting bad juju.
> >>>
> >>> Joke aside, while I like the idea of Tx/Rx META, I think the
> >>> similarities with MARK (and TAG eventually) is a problem. I wasn't
> >>> available and couldn't comment when META was originally added to the
> >>> Tx path, but there's a lot of overlap between these items/actions,
> >>> without anything explaining to the end user how and why they should
> >>> pick one over the other, if they can be combined at all and what happens
> in that case.
> >>>
> >>> All this must be documented, then we should think about unifying
> >>> their respective features and deprecate the less capable
> >>> items/actions. In my opinion, users need exactly one method to
> >>> mark/match some mark while processing Rx/Tx traffic and *optionally*
> >>> have that mark read from/written to the mbuf, which may or may not be
> possible depending on HW features.
> >
> > Thoughts regarding this suggestion? From a user perspective I think
> > all these actions should be unified but maybe there are good reasons
> > to keep them separate?
> >
> 
> I think more recent plan is introducing dynamic fields for the remaining 16
> bytes in the second cacheline.
> 
> I will update the patch as rejected, is there any objection?

v2 is coming and will be based on dynamic mbuf fields.
I think "Superseded" or "Changes Requested" is the more appropriate status.

WBR, Slava

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-10-08 12:57             ` Yigit, Ferruh
@ 2019-10-08 13:18               ` Slava Ovsiienko
  0 siblings, 0 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-08 13:18 UTC (permalink / raw)
  To: Yigit, Ferruh, Yongseok Koh, Adrien Mazarguil
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev

> -----Original Message-----
> From: Yigit, Ferruh <ferruh.yigit@linux.intel.com>
> Sent: Tuesday, October 8, 2019 15:57
> To: Yongseok Koh <yskoh@mellanox.com>; Adrien Mazarguil
> <adrien.mazarguil@6wind.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Andrew
> Rybchenko <arybchenko@solarflare.com>; Olivier Matz
> <olivier.matz@6wind.com>; dev <dev@dpdk.org>; Slava Ovsiienko
> <viacheslavo@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH] ethdev: add flow tag
> 
> On 7/11/2019 2:59 AM, Yongseok Koh wrote:
> > On Tue, Jul 09, 2019 at 10:38:06AM +0200, Adrien Mazarguil wrote:
> >> On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
> >>>> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil
> <adrien.mazarguil@6wind.com> wrote:
> >>>>
> >>>> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> >>>>> A tag is a transient data which can be used during flow match.
> >>>>> This can be used to store match result from a previous table so
> >>>>> that the same pattern need not be matched again on the next table.
> >>>>> Even if outer header is decapsulated on the previous match, the match
> result can be kept.
> >>>>>
> >>>>> Some device expose internal registers of its flow processing
> >>>>> pipeline and those registers are quite useful for stateful
> >>>>> connection tracking as it keeps status of flow matching. Multiple
> >>>>> tags are supported by specifying index.
> >>>>>
> >>>>> Example testpmd commands are:
> >>>>>
> >>>>>  flow create 0 ingress pattern ... / end
> >>>>>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
> >>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>>>>            vxlan_decap / jump group 1 / end
> >>>>>
> >>>>>  flow create 0 ingress pattern ... / end
> >>>>>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
> >>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>>>>            vxlan_decap / jump group 1 / end
> >>>>>
> >>>>>  flow create 0 ingress group 1
> >>>>>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
> >>>>>            eth ... / end
> >>>>>    actions ... jump group 2 / end
> >>>>>
> >>>>>  flow create 0 ingress group 1
> >>>>>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
> >>>>>            tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>>>>            eth ... / end
> >>>>>    actions ... / end
> >>>>>
> >>>>>  flow create 0 ingress group 2
> >>>>>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>>>>            eth ... / end
> >>>>>    actions ... / end
> >>>>>
> >>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>>>
> >>>> Hi Yongseok,
> >>>>
> >>>> Only high level questions for now, while it unquestionably looks
> >>>> useful, from a user standpoint exposing the separate index seems
> >>>> redundant and not necessarily convenient. Using the following example
> to illustrate:
> >>>>
> >>>> actions set_tag index 3 value 0x123456 mask 0xffffff
> >>>>
> >>>> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> >>>>
> >>>> I might be missing something, but why isn't this enough:
> >>>>
> >>>> pattern tag index is 3 # match whatever is stored at index 3
> >>>>
> >>>> Assuming it can work, then why bother with providing value
> >>>> spec/mask on set_tag? A flow rule pattern matches something, sets
> >>>> some arbitrary tag to be matched by a subsequent flow rule and
> >>>> that's it. It even seems like relying on the index only on both occasions
> is enough for identification.
> >>>>
> >>>> Same question for the opposite approach; relying on the value,
> >>>> never mentioning the index.
> >>>>
> >>>> I'm under the impression that the index is a hardware-specific
> >>>> constraint that shouldn't be exposed (especially since it's an
> >>>> 8-bit field). If so, a PMD could keep track of used indices without
> >>>> having them exposed through the public API.
> >>>
> >>>
> >>> Thank you for review, Adrien.
> >>> Hope you are doing well. It's been long since we talked each other.
> >>> :-)
> >>
> >> Yeah clearly! Hope you're doing well too. I'm somewhat busy hence
> >> slow to answer these days...
> >>
> >>  <dev@dpdk.org> hey!
> >>  <dev@dpdk.org> no private talks!
> >>
> >> Back to the topic:
> >>
> >>> Your approach will work too in general but we have a request from
> >>> customer that they want to partition this limited tag storage.
> >>> Assuming that HW exposes 32bit tags (those are 'registers' in HW
> >>> pipeline in mlx5 HW). Then, customers want to store multiple data
> >>> even in a 32-bit storage. For example, 16bit vlan tag, 8bit table id
> >>> and 8bit flow id. As they want to split one 32bit storage, I thought
> >>> it is better to provide mask when setting/matching the value. Even
> >>> some customer wants to store multiple flags bit by bit like ol_flags. They
> do want to alter only partial bits.
> >>>
> >>> And for the index, it is to reference an entry of tags array as HW
> >>> can provide larger registers than 32-bit. For example, mlx5 HW would
> >>> provide 4 of 32b storage which users can use for their own sake.
> >>> 	tag[0], tag[1], tag[2], tag[3]
> >>
> >> OK, looks like I missed the point then. I initially took it for a
> >> funky alternative to RTE_FLOW_ITEM_TYPE_META &
> >> RTE_FLOW_ACTION_TYPE_SET_META (ingress extended [1]) but while it
> >> could be used like that, it's more of a way to temporarily store and
> retrieve a small amount of data, correct?
> >
> > Correct.
> >
> >> Out of curiosity, are these registers independent from META and other
> >> items/actions in mlx5, otherwise what happens if they are combined?
> >
> > I thought about combining it but I chose this way. Because it is
> > transient. META can be set by packet descriptor on Tx and can be
> > delivered to host via mbuf on Rx, but this TAG item can't. If I
> > combine it, users have to query this capability for each 32b storage.
> > And also, there should be a way to request data from such storages
> > (i.e. new action , e.g. copy_meta). Let's say there are 4x32b storages
> > - meta[4]. If user wants to get one 32b data (meta[i]) out of them to
> > mbuf->metadata, it should be something like,
> > 	ingress / pattern .. /
> > 	actions ... set_meta index i data x / copy_meta_to_rx index i
> > And if user wants to set meta[i] via mbuf on Tx,
> > 	egress / pattern meta index is i data is x ... /
> > 	actions ... copy_meta_to_tx index i
> >
> > For sure, user is also responsible for querying these capabilities per
> > each meta[] storage.
> >
> > As copy_meta_to_tx/rx isn't a real action, this example would confuse
> user.
> > 	egress / pattern meta index is i data is x ... /
> > 	actions ... copy_meta_to_tx index i
> >
> > User might misunderstand the order of two things - item meta and
> > copy_meta action. I also thought about having capability bits per each
> > meta[] storage but it also looked complex.
> >
> > I do think rte_flow item/action is better to be simple, atomic and intuitive.
> > That's why I made this choice.
> >
> >> Are there other uses for these registers? Say, referencing their
> >> contents from other places in a flow rule so they don't have to be hard-
> coded?
> >
> > Possible.
> > Actually, this feature is needed by connection tracking of OVS-DPDK.
> >
> >> Right now I'm still uncomfortable with such a feature in the public
> >> API because compared to META [1], this approach looks very
> >> hardware-specific and seemingly difficult to map on different HW
> architectures.
> >
> > I wouldn't say it is HW-specific. Like I explained above, I just
> > define this new item/action to make things easy-to-use and intuitive.
> >
> >> However, the main problem is that as described, its end purpose seems
> >> redundant with META, which I think can cover the use cases you gave.
> >> So what can an application do with this that couldn't be done in a
> >> more generic fashion through META?
> >>
> >> I may still be missing something and I'm open to ideas, but assuming
> >> it doesn't make it into the public rte_flow API, it remains an
> >> interesting feature on its own merit which could be added to DPDK as
> >> PMD-specific pattern items/actions [2]. mlx5 doesn't have any yet,
> >> but it's pretty common for PMDs to expose a public header that
> >> dedicated applications can include to use this kind of features (look
> >> for rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
> >> No problem with that.
> >
> > That's good info. Thanks. But still considering
> > connection-tracking-like use-cases, this transient storage on multi-table
> > flow pipeline is quite useful.
> >
> >
> > thanks,
> > Yongseok
> >
> >> [1] "[PATCH] ethdev: extend flow metadata"
> >>     https://mails.dpdk.org/archives/dev/2019-July/137305.html
> >>
> >> [2] "Negative types"
> >>     https://doc.dpdk.org/guides/prog_guide/rte_flow.html#negative-types
> 
> Is this RFC still valid, will there be any follow up?
> If not, I am marking it as rejected in the next few days.

Yes, the RFC is valid; v2 and mlx5 support are coming.

WBR, Slava

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v2] ethdev: extend flow metadata
  2019-07-04 23:21 ` [dpdk-dev] [PATCH] " Yongseok Koh
  2019-07-10  9:31   ` Olivier Matz
@ 2019-10-10 16:02   ` Viacheslav Ovsiienko
  2019-10-18  9:22     ` Olivier Matz
  2019-10-24 13:08     ` [dpdk-dev] [PATCH v3] " Viacheslav Ovsiienko
  1 sibling, 2 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-10 16:02 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, Yongseok Koh

Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There is also a need to support metadata on ingress. Metadata can be set by
SET_META action and matched by META item like Tx. The final value set by
the action will be delivered to application via metadata dynamic field of
mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
PKT_RX_DYNF_METADATA flag will be set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to using the SET_META action.

The availability of dynamic mbuf metadata field can be checked
with rte_flow_dynf_metadata_avail() routine.

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.
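
As a rough illustration of the application-side read path described above,
here is a self-contained sketch. It does not use DPDK: struct fake_mbuf,
meta_offs, meta_mask, fake_register(), fake_avail() and read_rx_meta() are
hypothetical stand-ins for rte_mbuf, rte_flow_dynf_metadata_offs, the
PKT_RX_DYNF_METADATA mask, rte_flow_dynf_metadata_register(),
rte_flow_dynf_metadata_avail() and an application helper. It assumes the PMD
sets the flag in ol_flags whenever it writes the field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf; only what the sketch needs. */
struct fake_mbuf {
	uint64_t ol_flags;
	uint32_t dyn_area[4]; /* stand-in for the dynamic field area */
};

/* Stand-ins for the offset and flag mask globals added by the patch. */
static int meta_offs = -1;
static uint64_t meta_mask;

/* Same pointer arithmetic as RTE_MBUF_DYNFIELD() in the patch. */
#define FAKE_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))

/* Models a successful registration: records the offset and the flag bit. */
static void
fake_register(int offset, int flag_bit)
{
	assert(offset >= 0 && flag_bit >= 0 && flag_bit < 64);
	meta_offs = offset;
	meta_mask = 1ULL << flag_bit;
}

/* Mirrors the availability check: nonzero once the flag is registered. */
static int
fake_avail(void)
{
	return !!meta_mask;
}

/* Returns 1 and stores the metadata if the packet carries it, 0 otherwise. */
static int
read_rx_meta(const struct fake_mbuf *m, uint32_t *out)
{
	if (!fake_avail() || !(m->ol_flags & meta_mask))
		return 0;
	*out = *FAKE_DYNFIELD(m, meta_offs, const uint32_t *);
	return 1;
}
```

The point of the sketch is the ordering: the flag is tested before the field is
dereferenced, and nothing is read at all until registration has succeeded.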

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---

  This patch uses dynamic mbuf field and must be applied after:
  http://patches.dpdk.org/patch/59343/
  "mbuf: support dynamic fields and flags"

  v2: - rebased
      - relies on dynamic mbuf field feature

  v1: http://patches.dpdk.org/patch/56103/

  rfc: http://patches.dpdk.org/patch/54270/

 app/test-pmd/cmdline_flow.c              | 57 ++++++++++++++++++++-
 app/test-pmd/util.c                      |  5 ++
 doc/guides/prog_guide/rte_flow.rst       | 57 +++++++++++++++++++++
 doc/guides/rel_notes/release_19_11.rst   |  8 +++
 lib/librte_ethdev/rte_ethdev.h           |  1 -
 lib/librte_ethdev/rte_ethdev_version.map |  6 +++
 lib/librte_ethdev/rte_flow.c             | 41 +++++++++++++++
 lib/librte_ethdev/rte_flow.h             | 87 ++++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_dyn.h           |  8 +++
 9 files changed, 265 insertions(+), 5 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index b26b8bf..078f256 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -305,6 +305,9 @@ enum index {
 	ACTION_DEC_TCP_ACK_VALUE,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -994,6 +997,7 @@ struct parse_action_priv {
 	ACTION_DEC_TCP_ACK,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1180,6 +1184,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -1238,6 +1249,10 @@ static int parse_vc_action_raw_encap(struct context *,
 static int parse_vc_action_raw_decap(struct context *,
 				     const struct token *, const char *,
 				     unsigned int, void *, unsigned int);
+static int parse_vc_action_set_meta(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
+				    unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -3222,7 +3237,31 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 		.help = "set raw decap data",
 		.next = NEXT(next_item),
 		.call = parse_set_raw_encap_decap,
-	}
+	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc_action_set_meta,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -4592,6 +4631,22 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 	return ret;
 }
 
+static int
+parse_vc_action_set_meta(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	ret = rte_flow_dynf_metadata_register();
+	if (ret < 0)
+		return -1;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 1570270..39ff07b 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -81,6 +81,11 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
+		if (ol_flags & PKT_TX_METADATA)
+			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_RX_DYNF_METADATA)
+			printf(" - Rx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (mb->packet_type) {
 			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
 			printf(" - hw ptype: %s", buf);
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index ff6fb11..45fc041 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2415,6 +2441,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
 
 Using this action on non-matching traffic will result in undefined behavior.
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by
+``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
+``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
+with the data.
+
+The mbuf dynamic field must be registered by calling
+``rte_flow_dynf_metadata_register()`` prior to using the ``SET_META`` action.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 8921cfd..904746e 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -95,6 +95,14 @@ New Features
   for specific offload features, where adding a static field or flag
   in the mbuf is not justified.
 
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
+    PKT_RX_DYNF_METADATA.
+
 Removed Items
 -------------
 
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index d937fb4..9a6432c 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1013,7 +1013,6 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
 #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
 #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
-
 #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
 				 DEV_RX_OFFLOAD_UDP_CKSUM | \
 				 DEV_RX_OFFLOAD_TCP_CKSUM)
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index 6df42a4..3d9cafc 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -283,4 +283,10 @@ EXPERIMENTAL {
 
 	# added in 19.08
 	rte_eth_read_clock;
+
+	# added in 19.11
+	rte_flow_dynf_metadata_offs;
+	rte_flow_dynf_metadata_mask;
+	rte_flow_dynf_metadata_avail;
+	rte_flow_dynf_metadata_register;
 };
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index cc03b15..9cbda75 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -12,10 +12,18 @@
 #include <rte_errno.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include "rte_ethdev.h"
 #include "rte_flow_driver.h"
 #include "rte_flow.h"
 
+/* Mbuf dynamic field offset for metadata. */
+int rte_flow_dynf_metadata_offs = -1;
+
+/* Mbuf dynamic field flag mask for metadata. */
+uint64_t rte_flow_dynf_metadata_mask;
+
 /**
  * Flow elements description tables.
  */
@@ -153,8 +161,41 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
+	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
 };
 
+int
+rte_flow_dynf_metadata_register(void)
+{
+	int offset;
+	int flag;
+
+	static const struct rte_mbuf_dynfield desc_offs = {
+		.name = MBUF_DYNF_METADATA_NAME,
+		.size = MBUF_DYNF_METADATA_SIZE,
+		.align = MBUF_DYNF_METADATA_ALIGN,
+		.flags = MBUF_DYNF_METADATA_FLAGS,
+	};
+	static const struct rte_mbuf_dynflag desc_flag = {
+		.name = MBUF_DYNF_METADATA_NAME,
+	};
+
+	offset = rte_mbuf_dynfield_register(&desc_offs);
+	if (offset < 0)
+		goto error;
+	flag = rte_mbuf_dynflag_register(&desc_flag);
+	if (flag < 0)
+		goto error;
+	rte_flow_dynf_metadata_offs = offset;
+	rte_flow_dynf_metadata_mask = (1ULL << flag);
+	return 0;
+
+error:
+	rte_flow_dynf_metadata_offs = -1;
+	rte_flow_dynf_metadata_mask = 0ULL;
+	return -rte_errno;
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 391a44a..a27e619 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -27,6 +27,8 @@
 #include <rte_udp.h>
 #include <rte_byteorder.h>
 #include <rte_esp.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -417,7 +419,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1213,9 +1216,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * Matches a specified metadata value.
+ * RTE_FLOW_ITEM_TYPE_META
+ *
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
 	rte_be32_t data;
@@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
 	 * undefined behavior.
 	 */
 	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * ingress, the metadata will be carried by mbuf metadata dynamic field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which have never
+ * been set, unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
+ * or may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	rte_be32_t data;
+	rte_be32_t mask;
+};
+
+/* Mbuf dynamic field offset for metadata. */
+extern int rte_flow_dynf_metadata_offs;
+
+/* Mbuf dynamic field flag mask for metadata. */
+extern uint64_t rte_flow_dynf_metadata_mask;
+
+/* Mbuf dynamic field pointer for metadata. */
+#define RTE_FLOW_DYNF_METADATA(m) \
+	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
+
+/* Mbuf dynamic flag for metadata. */
+#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+
 /*
  * Definition of a single action.
  *
@@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {
 };
 
 /**
+ * Check if mbuf dynamic field for metadata is registered.
+ *
+ * @return
+ *   True if registered, false otherwise.
+ */
+__rte_experimental
+static inline int
+rte_flow_dynf_metadata_avail(void) {
+	return !!rte_flow_dynf_metadata_mask;
+}
+
+/**
+ * Register mbuf dynamic field and flag for metadata.
+ *
+ * This function must be called prior to using the SET_META action in order to
+ * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
+ * application.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_dynf_metadata_register(void);
+
+/**
  * Check whether a flow rule can be created on a given port.
  *
  * The flow rule is validated for correctness and whether it could be accepted
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 6e2c816..4ff33ac 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
  */
 #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))
 
+/**
+ * Flow metadata dynamic field definitions.
+ */
+#define MBUF_DYNF_METADATA_NAME "flow-metadata"
+#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t)
+#define MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t)
+#define MBUF_DYNF_METADATA_FLAGS 0
+
 #endif
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v2] ethdev: add flow tag
  2019-07-04 23:23   ` [dpdk-dev] [PATCH] " Yongseok Koh
  2019-07-05 13:54     ` Adrien Mazarguil
@ 2019-10-10 16:09     ` Viacheslav Ovsiienko
  2019-10-24 13:12       ` [dpdk-dev] [PATCH v3] " Viacheslav Ovsiienko
  1 sibling, 1 reply; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-10 16:09 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, Yongseok Koh

A tag is transient data that can be used during flow matching. It can be
used to store the match result from a previous table so that the same pattern
need not be matched again on the next table. Even if the outer header is
decapsulated on the previous match, the match result can be kept.

Some devices expose internal registers of their flow processing pipeline, and
those registers are quite useful for stateful connection tracking as they
keep the status of flow matching. Multiple tags are supported by specifying
an index.

Example testpmd commands are:

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 data 0xaa00bb mask 0xffff00ff /
            set_tag index 3 data 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 data 0xcc00 mask 0xff00 /
            set_tag index 3 data 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress group 1
    pattern tag index is 2 data spec 0xaa00bb data mask 0xffff00ff /
            eth ... / end
    actions ... jump group 2 / end

  flow create 0 ingress group 1
    pattern tag index is 2 data spec 0xcc00 data mask 0xff00 /
            tag index is 3 data spec 0x123456 data mask 0xffffff /
            eth ... / end
    actions ... / end

  flow create 0 ingress group 2
    pattern tag index is 3 data spec 0x123456 data mask 0xffffff /
            eth ... / end
    actions ... / end
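
To make the intended behaviour of the rules above a little more concrete, here
is a minimal software model. It is only a sketch under assumptions: it takes
SET_TAG to rewrite just the bits selected by the mask, and the TAG item to
compare the stored tag with the spec under the mask. N_TAGS, struct pkt_tags,
set_tag() and match_tag() are hypothetical names, not part of the patch, and
the real number of tags depends on the device.

```c
#include <assert.h>
#include <stdint.h>

#define N_TAGS 8 /* hypothetical; the real count is device-specific */

/* Hypothetical per-packet tag storage carried between flow tables. */
struct pkt_tags {
	uint32_t tag[N_TAGS];
};

/* Assumed SET_TAG semantics: only the bits selected by mask are rewritten. */
static void
set_tag(struct pkt_tags *p, uint8_t index, uint32_t data, uint32_t mask)
{
	assert(index < N_TAGS);
	p->tag[index] = (p->tag[index] & ~mask) | (data & mask);
}

/* Assumed TAG item semantics: compare the stored tag and spec under mask. */
static int
match_tag(const struct pkt_tags *p, uint8_t index, uint32_t spec,
	  uint32_t mask)
{
	assert(index < N_TAGS);
	return (p->tag[index] & mask) == (spec & mask);
}
```

With this model, a packet hitting the first rule gets tag 2 set to 0xaa00bb
and tag 3 set to 0x123456; in group 1 it then satisfies the first pattern
(tag 2 under mask 0xffff00ff) but not the second (tag 2 under mask 0xff00
against 0xcc00).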

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---

v2: rebased
v1: http://patches.dpdk.org/patch/56104/
rfc: http://patches.dpdk.org/patch/54271/


 app/test-pmd/cmdline_flow.c            | 75 ++++++++++++++++++++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst     | 50 +++++++++++++++++++++++
 doc/guides/rel_notes/release_19_11.rst |  5 +++
 lib/librte_ethdev/rte_flow.c           |  2 +
 lib/librte_ethdev/rte_flow.h           | 54 ++++++++++++++++++++++++
 5 files changed, 186 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 078f256..667cb80 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -203,6 +203,9 @@ enum index {
 	ITEM_PPPOED,
 	ITEM_PPPOE_SEID,
 	ITEM_PPPOE_PROTO_ID,
+	ITEM_TAG,
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -308,6 +311,10 @@ enum index {
 	ACTION_SET_META,
 	ACTION_SET_META_DATA,
 	ACTION_SET_META_MASK,
+	ACTION_SET_TAG,
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -678,6 +685,7 @@ struct parse_action_priv {
 	ITEM_PPPOES,
 	ITEM_PPPOED,
 	ITEM_PPPOE_PROTO_ID,
+	ITEM_TAG,
 	END_SET,
 	ZERO,
 };
@@ -942,6 +950,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index item_tag[] = {
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -998,6 +1013,7 @@ struct parse_action_priv {
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
 	ACTION_SET_META,
+	ACTION_SET_TAG,
 	ZERO,
 };
 
@@ -1191,6 +1207,14 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_tag[] = {
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -2434,6 +2458,26 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 		.next = NEXT(item_pppoe_proto_id),
 		.call = parse_vc,
 	},
+	[ITEM_TAG] = {
+		.name = "tag",
+		.help = "match tag value",
+		.priv = PRIV_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
+		.next = NEXT(item_tag),
+		.call = parse_vc,
+	},
+	[ITEM_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tag, data)),
+	},
+	[ITEM_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_tag, index)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -3262,6 +3306,37 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 			     (struct rte_flow_action_set_meta, mask)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SET_TAG] = {
+		.name = "set_tag",
+		.help = "set tag",
+		.priv = PRIV_ACTION(SET_TAG,
+			sizeof(struct rte_flow_action_set_tag)),
+		.next = NEXT(action_set_tag),
+		.call = parse_vc,
+	},
+	[ACTION_SET_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_set_tag, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_MASK] = {
+		.name = "mask",
+		.help = "mask for tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 45fc041..290646f 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -684,6 +684,34 @@ action sets metadata for a packet and the metadata will be reported via
    | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
    +----------+----------+---------------------------------------+
 
+Item: ``TAG``
+^^^^^^^^^^^^^
+
+Matches tag item set by other flows. Multiple tags are supported by specifying
+``index``.
+
+- Default ``mask`` matches the specified tag value and index.
+
+.. _table_rte_flow_item_tag:
+
+.. table:: TAG
+
+   +----------+-----------+---------------------------------------+
+   | Field    | Subfield  | Value                                 |
+   +==========+===========+=======================================+
+   | ``spec`` | ``data``  | 32 bit flow tag value                 |
+   |          +-----------+---------------------------------------+
+   |          | ``index`` | index of flow tag                     |
+   +----------+-----------+---------------------------------------+
+   | ``last`` | ``data``  | upper range value                     |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+   | ``mask`` | ``data``  | bit-mask applies to "spec" and "last" |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2472,6 +2500,28 @@ the other path depending on HW capability.
    | ``mask`` | bit-mask applies to "data" |
    +----------+----------------------------+
 
+Action: ``SET_TAG``
+^^^^^^^^^^^^^^^^^^^
+
+Set Tag.
+
+Tag is a transient data used during flow matching. This is not delivered to
+application. Multiple tags are supported by specifying index.
+
+.. _table_rte_flow_action_set_tag:
+
+.. table:: SET_TAG
+
+   +-----------+----------------------------+
+   | Field     | Value                      |
+   +===========+============================+
+   | ``data``  | 32 bit tag value           |
+   +-----------+----------------------------+
+   | ``mask``  | bit-mask applies to "data" |
+   +-----------+----------------------------+
+   | ``index`` | index of tag to set        |
+   +-----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 904746e..9077f2f 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -103,6 +103,11 @@ New Features
   * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
     PKT_RX_DYNF_METADATA.
 
+* **Added flow tag in rte_flow.**
+
+  SET_TAG action and TAG item have been added to support transient flow
+  tag.
+
 Removed Items
 -------------
 
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 9cbda75..dcbae99 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -82,6 +82,7 @@ struct rte_flow_desc_data {
 		     sizeof(struct rte_flow_item_icmp6_nd_opt_tla_eth)),
 	MK_FLOW_ITEM(MARK, sizeof(struct rte_flow_item_mark)),
 	MK_FLOW_ITEM(META, sizeof(struct rte_flow_item_meta)),
+	MK_FLOW_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
 	MK_FLOW_ITEM(GRE_KEY, sizeof(rte_be32_t)),
 	MK_FLOW_ITEM(GTP_PSC, sizeof(struct rte_flow_item_gtp_psc)),
 	MK_FLOW_ITEM(PPPOES, sizeof(struct rte_flow_item_pppoe)),
@@ -162,6 +163,7 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
+	MK_FLOW_ACTION(SET_TAG, sizeof(struct rte_flow_action_set_tag)),
 };
 
 int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index a27e619..f3a5166 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -473,6 +473,15 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_pppoe_proto_id.
 	 */
 	RTE_FLOW_ITEM_TYPE_PPPOE_PROTO_ID,
+
+	/**
+	 * [META]
+	 *
+	 * Matches a tag value.
+	 *
+	 * See struct rte_flow_item_tag.
+	 */
+	RTE_FLOW_ITEM_TYPE_TAG,
 };
 
 /**
@@ -1300,6 +1309,27 @@ struct rte_flow_item_pppoe_proto_id {
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
  *
+ * RTE_FLOW_ITEM_TYPE_TAG
+ *
+ * Matches a specified tag value at the specified index.
+ */
+struct rte_flow_item_tag {
+	uint32_t data;
+	uint8_t index;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_TAG. */
+#ifndef __cplusplus
+static const struct rte_flow_item_tag rte_flow_item_tag_mask = {
+	.data = 0xffffffff,
+	.index = 0xff,
+};
+#endif
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
  * RTE_FLOW_ITEM_TYPE_MARK
  *
  * Matches an arbitrary integer value which was set using the ``MARK`` action
@@ -1831,6 +1861,15 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_meta.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_META,
+
+	/**
+	 * Set Tag.
+	 *
+	 * Tag is not delivered to application.
+	 *
+	 * See struct rte_flow_action_set_tag.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_TAG,
 };
 
 /**
@@ -2355,6 +2394,21 @@ struct rte_flow_action_set_meta {
 /* Mbuf dynamic flag for metadata. */
 #define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_TAG
+ *
+ * Set a tag which is a transient data used during flow matching. This is not
+ * delivered to application. Multiple tags are supported by specifying index.
+ */
+struct rte_flow_action_set_tag {
+	uint32_t data;
+	uint32_t mask;
+	uint8_t index;
+};
+
 /*
  * Definition of a single action.
  *
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v2] ethdev: extend flow metadata
  2019-10-10 16:02   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
@ 2019-10-18  9:22     ` Olivier Matz
  2019-10-19 19:47       ` Slava Ovsiienko
  2019-10-24 13:08     ` [dpdk-dev] [PATCH v3] " Viacheslav Ovsiienko
  1 sibling, 1 reply; 98+ messages in thread
From: Olivier Matz @ 2019-10-18  9:22 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, thomas, Yongseok Koh

Hi Viacheslav,

Few comments on the dynamic mbuf part below.

On Thu, Oct 10, 2019 at 04:02:39PM +0000, Viacheslav Ovsiienko wrote:
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
> 
> This patch extends the metadata feature usability.
> 
> 1) RTE_FLOW_ACTION_TYPE_SET_META
> 
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
> 
> 2) Metadata on ingress
> 
> There's also need to support metadata on ingress. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via metadata dynamic field of
> mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
> PKT_RX_DYNF_METADATA flag will be set along with the data.
> 
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to use SET_META action.
> 
> The availability of dynamic mbuf metadata field can be checked
> with rte_flow_dynf_metadata_avail() routine.
> 
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on hardware capability.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> 
>   This patch uses dynamic mbuf field and must be applied after:
>   http://patches.dpdk.org/patch/59343/
>   "mbuf: support dynamic fields and flags"
> 
>   v2: - rebased
>       - relies on dynamic mbuf field feature
> 
>   v1: http://patches.dpdk.org/patch/56103/
> 
>   rfc: http://patches.dpdk.org/patch/54270/
> 
>  app/test-pmd/cmdline_flow.c              | 57 ++++++++++++++++++++-
>  app/test-pmd/util.c                      |  5 ++
>  doc/guides/prog_guide/rte_flow.rst       | 57 +++++++++++++++++++++
>  doc/guides/rel_notes/release_19_11.rst   |  8 +++
>  lib/librte_ethdev/rte_ethdev.h           |  1 -
>  lib/librte_ethdev/rte_ethdev_version.map |  6 +++
>  lib/librte_ethdev/rte_flow.c             | 41 +++++++++++++++
>  lib/librte_ethdev/rte_flow.h             | 87 ++++++++++++++++++++++++++++++--
>  lib/librte_mbuf/rte_mbuf_dyn.h           |  8 +++
>  9 files changed, 265 insertions(+), 5 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index b26b8bf..078f256 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -305,6 +305,9 @@ enum index {
>  	ACTION_DEC_TCP_ACK_VALUE,
>  	ACTION_RAW_ENCAP,
>  	ACTION_RAW_DECAP,
> +	ACTION_SET_META,
> +	ACTION_SET_META_DATA,
> +	ACTION_SET_META_MASK,
>  };
>  
>  /** Maximum size for pattern in struct rte_flow_item_raw. */
> @@ -994,6 +997,7 @@ struct parse_action_priv {
>  	ACTION_DEC_TCP_ACK,
>  	ACTION_RAW_ENCAP,
>  	ACTION_RAW_DECAP,
> +	ACTION_SET_META,
>  	ZERO,
>  };
>  
> @@ -1180,6 +1184,13 @@ struct parse_action_priv {
>  	ZERO,
>  };
>  
> +static const enum index action_set_meta[] = {
> +	ACTION_SET_META_DATA,
> +	ACTION_SET_META_MASK,
> +	ACTION_NEXT,
> +	ZERO,
> +};
> +
>  static int parse_set_raw_encap_decap(struct context *, const struct token *,
>  				     const char *, unsigned int,
>  				     void *, unsigned int);
> @@ -1238,6 +1249,10 @@ static int parse_vc_action_raw_encap(struct context *,
>  static int parse_vc_action_raw_decap(struct context *,
>  				     const struct token *, const char *,
>  				     unsigned int, void *, unsigned int);
> +static int parse_vc_action_set_meta(struct context *ctx,
> +				    const struct token *token, const char *str,
> +				    unsigned int len, void *buf,
> +				    unsigned int size);
>  static int parse_destroy(struct context *, const struct token *,
>  			 const char *, unsigned int,
>  			 void *, unsigned int);
> @@ -3222,7 +3237,31 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
>  		.help = "set raw decap data",
>  		.next = NEXT(next_item),
>  		.call = parse_set_raw_encap_decap,
> -	}
> +	},
> +	[ACTION_SET_META] = {
> +		.name = "set_meta",
> +		.help = "set metadata",
> +		.priv = PRIV_ACTION(SET_META,
> +			sizeof(struct rte_flow_action_set_meta)),
> +		.next = NEXT(action_set_meta),
> +		.call = parse_vc_action_set_meta,
> +	},
> +	[ACTION_SET_META_DATA] = {
> +		.name = "data",
> +		.help = "metadata value",
> +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_meta, data)),
> +		.call = parse_vc_conf,
> +	},
> +	[ACTION_SET_META_MASK] = {
> +		.name = "mask",
> +		.help = "mask for metadata value",
> +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_meta, mask)),
> +		.call = parse_vc_conf,
> +	},
>  };
>  
>  /** Remove and return last entry from argument stack. */
> @@ -4592,6 +4631,22 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
>  	return ret;
>  }
>  
> +static int
> +parse_vc_action_set_meta(struct context *ctx, const struct token *token,
> +			 const char *str, unsigned int len, void *buf,
> +			 unsigned int size)
> +{
> +	int ret;
> +
> +	ret = parse_vc(ctx, token, str, len, buf, size);
> +	if (ret < 0)
> +		return ret;
> +	ret = rte_flow_dynf_metadata_register();
> +	if (ret < 0)
> +		return -1;
> +	return len;
> +}
> +
>  /** Parse tokens for destroy command. */
>  static int
>  parse_destroy(struct context *ctx, const struct token *token,
> diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
> index 1570270..39ff07b 100644
> --- a/app/test-pmd/util.c
> +++ b/app/test-pmd/util.c
> @@ -81,6 +81,11 @@
>  			       mb->vlan_tci, mb->vlan_tci_outer);
>  		else if (ol_flags & PKT_RX_VLAN)
>  			printf(" - VLAN tci=0x%x", mb->vlan_tci);
> +		if (ol_flags & PKT_TX_METADATA)
> +			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
> +		if (ol_flags & PKT_RX_DYNF_METADATA)
> +			printf(" - Rx metadata: 0x%x",
> +			       *RTE_FLOW_DYNF_METADATA(mb));
>  		if (mb->packet_type) {
>  			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
>  			printf(" - hw ptype: %s", buf);
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index ff6fb11..45fc041 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
>     | ``mask`` | ``id``   | zeroed to match any value |
>     +----------+----------+---------------------------+
>  
> +Item: ``META``
> +^^^^^^^^^^^^^^^^^
> +
> +Matches 32 bit metadata item set.
> +
> +On egress, metadata can be set either by mbuf metadata field with
> +PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
> +action sets metadata for a packet and the metadata will be reported via
> +``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
> +
> +- Default ``mask`` matches the specified Rx metadata value.
> +
> +.. _table_rte_flow_item_meta:
> +
> +.. table:: META
> +
> +   +----------+----------+---------------------------------------+
> +   | Field    | Subfield | Value                                 |
> +   +==========+==========+=======================================+
> +   | ``spec`` | ``data`` | 32 bit metadata value                 |
> +   +----------+----------+---------------------------------------+
> +   | ``last`` | ``data`` | upper range value                     |
> +   +----------+----------+---------------------------------------+
> +   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> +   +----------+----------+---------------------------------------+
> +
>  Data matching item types
>  ~~~~~~~~~~~~~~~~~~~~~~~~
>  
> @@ -2415,6 +2441,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
>  
>  Using this action on non-matching traffic will result in undefined behavior.
>  
> +Action: ``SET_META``
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Set metadata. Item ``META`` matches metadata.
> +
> +Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
> +overridden by this action. On ingress, the metadata will be carried by
> +``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
> +``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
> +with the data.
> +
> +The mbuf dynamic field must be registered by calling
> +``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
> +
> +Altering partial bits is supported with ``mask``. For bits which have never been
> +set, unpredictable value will be seen depending on driver implementation. For
> +loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
> +the other path depending on HW capability.
> +
> +.. _table_rte_flow_action_set_meta:
> +
> +.. table:: SET_META
> +
> +   +----------+----------------------------+
> +   | Field    | Value                      |
> +   +==========+============================+
> +   | ``data`` | 32 bit metadata value      |
> +   +----------+----------------------------+
> +   | ``mask`` | bit-mask applies to "data" |
> +   +----------+----------------------------+
> +
>  Negative types
>  ~~~~~~~~~~~~~~
>  
> diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
> index 8921cfd..904746e 100644
> --- a/doc/guides/rel_notes/release_19_11.rst
> +++ b/doc/guides/rel_notes/release_19_11.rst
> @@ -95,6 +95,14 @@ New Features
>    for specific offload features, where adding a static field or flag
>    in the mbuf is not justified.
>  
> +* **Extended metadata support in rte_flow.**
> +
> +  Flow metadata is extended to both Rx and Tx.
> +
> +  * Tx metadata can also be set by SET_META action of rte_flow.
> +  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
> +    PKT_RX_DYNF_METADATA.
> +
>  Removed Items
>  -------------
>  
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index d937fb4..9a6432c 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1013,7 +1013,6 @@ struct rte_eth_conf {
>  #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
>  #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
>  #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
> -
>  #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
>  				 DEV_RX_OFFLOAD_UDP_CKSUM | \
>  				 DEV_RX_OFFLOAD_TCP_CKSUM)
> diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
> index 6df42a4..3d9cafc 100644
> --- a/lib/librte_ethdev/rte_ethdev_version.map
> +++ b/lib/librte_ethdev/rte_ethdev_version.map
> @@ -283,4 +283,10 @@ EXPERIMENTAL {
>  
>  	# added in 19.08
>  	rte_eth_read_clock;
> +
> +	# added in 19.11
> +	rte_flow_dynf_metadata_offs;
> +	rte_flow_dynf_metadata_flag;
> +	rte_flow_dynf_metadata_avail;
> +	rte_flow_dynf_metadata_register;
>  };
> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index cc03b15..9cbda75 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -12,10 +12,18 @@
>  #include <rte_errno.h>
>  #include <rte_branch_prediction.h>
>  #include <rte_string_fns.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>  #include "rte_ethdev.h"
>  #include "rte_flow_driver.h"
>  #include "rte_flow.h"
>  
> +/* Mbuf dynamic field name for metadata. */
> +int rte_flow_dynf_metadata_offs = -1;
> +
> +/* Mbuf dynamic field flag bit number for metadata. */
> +uint64_t rte_flow_dynf_metadata_mask;
> +
>  /**
>   * Flow elements description tables.
>   */
> @@ -153,8 +161,41 @@ struct rte_flow_desc_data {
>  	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> +	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
>  };
>  
> +int
> +rte_flow_dynf_metadata_register(void)
> +{
> +	int offset;
> +	int flag;
> +
> +	static const struct rte_mbuf_dynfield desc_offs = {
> +		.name = MBUF_DYNF_METADATA_NAME,
> +		.size = MBUF_DYNF_METADATA_SIZE,
> +		.align = MBUF_DYNF_METADATA_ALIGN,
> +		.flags = MBUF_DYNF_METADATA_FLAGS,
> +	};
> +	static const struct rte_mbuf_dynflag desc_flag = {
> +		.name = MBUF_DYNF_METADATA_NAME,
> +	};

I don't think we need #defines.
You can directly use the name, sizeof() and __alignof__() here.
If the information is used externally, the structure shall
be made global non-static.
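
A minimal sketch of that suggestion, filling the descriptor directly instead
of going through MBUF_DYNF_METADATA_* macros. The struct below is a local
stand-in that mirrors the fields initialized in the patch (name, size, align,
flags); it is not the real struct rte_mbuf_dynfield from rte_mbuf_dyn.h, and
flow_metadata_desc_get() is a hypothetical accessor added only for
illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for struct rte_mbuf_dynfield (assumption: same field
 * names as used in the patch; the real type lives in rte_mbuf_dyn.h). */
struct dynfield_desc {
	char name[64];
	size_t size;
	size_t align;
	unsigned int flags;
};

/* The alignment constraint is expected to be a power of two. */
static_assert((__alignof__(uint32_t) & (__alignof__(uint32_t) - 1)) == 0,
	      "alignment must be a power of two");

/* Descriptor filled in directly, without intermediate #defines. */
static const struct dynfield_desc flow_metadata_desc = {
	.name = "rte_flow_metadata",
	.size = sizeof(uint32_t),
	.align = __alignof__(uint32_t),
	.flags = 0,
};

/* Hypothetical accessor so other code can inspect the descriptor. */
static inline const struct dynfield_desc *
flow_metadata_desc_get(void)
{
	return &flow_metadata_desc;
}
```

If the descriptor needs to be visible outside the file, dropping "static" and
adding an extern declaration in the header would be the equivalent of the
"global non-static" option.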


> +
> +	offset = rte_mbuf_dynfield_register(&desc_offs);
> +	if (offset < 0)
> +		goto error;
> +	flag = rte_mbuf_dynflag_register(&desc_flag);
> +	if (flag < 0)
> +		goto error;
> +	rte_flow_dynf_metadata_offs = offset;
> +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> +	return 0;
> +
> +error:
> +	rte_flow_dynf_metadata_offs = -1;
> +	rte_flow_dynf_metadata_mask = 0ULL;
> +	return -rte_errno;
> +}
> +
>  static int
>  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
>  {
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index 391a44a..a27e619 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -27,6 +27,8 @@
>  #include <rte_udp.h>
>  #include <rte_byteorder.h>
>  #include <rte_esp.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>  
>  #ifdef __cplusplus
>  extern "C" {
> @@ -417,7 +419,8 @@ enum rte_flow_item_type {
>  	/**
>  	 * [META]
>  	 *
> -	 * Matches a metadata value specified in mbuf metadata field.
> +	 * Matches a metadata value.
> +	 *
>  	 * See struct rte_flow_item_meta.
>  	 */
>  	RTE_FLOW_ITEM_TYPE_META,
> @@ -1213,9 +1216,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
>  #endif
>  
>  /**
> - * RTE_FLOW_ITEM_TYPE_META.
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
>   *
> - * Matches a specified metadata value.
> + * RTE_FLOW_ITEM_TYPE_META
> + *
> + * Matches a specified metadata value. On egress, metadata can be set either by
> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> + * metadata for a packet and the metadata will be reported via mbuf metadata
> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
>   */
>  struct rte_flow_item_meta {
>  	rte_be32_t data;
> @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
>  	 * undefined behavior.
>  	 */
>  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> +
> +	/**
> +	 * Set metadata on ingress or egress path.
> +	 *
> +	 * See struct rte_flow_action_set_meta.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_SET_META,
>  };
>  
>  /**
> @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
>  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
>  };
>  
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SET_META
> + *
> + * Set metadata. Metadata set by mbuf tx_metadata field with
> + * PKT_TX_METADATA flag on egress will be overridden by this action. On
> + * ingress, the metadata will be carried by mbuf metadata dynamic field
> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
> + *
> + * Altering partial bits is supported with mask. For bits which have never
> + * been set, unpredictable value will be seen depending on driver
> + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
> + * or may not be propagated to the other path depending on HW capability.
> + *
> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> + */
> +struct rte_flow_action_set_meta {
> +	rte_be32_t data;
> +	rte_be32_t mask;
> +};
> +
> +/* Mbuf dynamic field offset for metadata. */
> +extern int rte_flow_dynf_metadata_offs;
> +
> +/* Mbuf dynamic field flag mask for metadata. */
> +extern uint64_t rte_flow_dynf_metadata_mask;
> +
> +/* Mbuf dynamic field pointer for metadata. */
> +#define RTE_FLOW_DYNF_METADATA(m) \
> +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> +
> +/* Mbuf dynamic flag for metadata. */
> +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> +

I wonder if helpers like this wouldn't be better, because
they combine the flag and the field:

/**
 * Set metadata dynamic field and flag in mbuf.
 *
 * rte_flow_dynf_metadata_register() must have been called first.
 */
__rte_experimental
static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
                                       uint32_t metadata)
{
       *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
                       uint32_t *) = metadata;
       m->ol_flags |= rte_flow_dynf_metadata_mask;
}

/**
 * Get metadata dynamic field value in mbuf.
 *
 * rte_flow_dynf_metadata_register() must have been called first.
 */
__rte_experimental
static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
                                       uint32_t *metadata)
{
       if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
               return -1;
       *metadata = *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
                               uint32_t *);
       return 0;
}

/**
 * Delete the metadata dynamic flag in mbuf.
 *
 * rte_flow_dynf_metadata_register() must have been called first.
 */
__rte_experimental
static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m)
{
       m->ol_flags &= ~rte_flow_dynf_metadata_mask;
}


>  /*
>   * Definition of a single action.
>   *
> @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {
>  };
>  
>  /**
> + * Check if mbuf dynamic field for metadata is registered.
> + *
> + * @return
> + *   True if registered, false otherwise.
> + */
> +__rte_experimental
> +static inline int
> +rte_flow_dynf_metadata_avail(void) {
> +	return !!rte_flow_dynf_metadata_mask;
> +}

_registered() instead of _avail() ?

> +
> +/**
> + * Register mbuf dynamic field and flag for metadata.
> + *
> + * This function must be called prior to use SET_META action in order to
> + * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
> + * application.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_dynf_metadata_register(void);
> +
> +/**
>   * Check whether a flow rule can be created on a given port.
>   *
>   * The flow rule is validated for correctness and whether it could be accepted
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 6e2c816..4ff33ac 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
>   */
>  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))
>  
> +/**
> + * Flow metadata dynamic field definitions.
> + */
> +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t)
> +#define MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t)
> +#define MBUF_DYNF_METADATA_FLAGS 0

If this flag is only to be used in rte_flow, it can stay in rte_flow.
The name should follow the function name conventions, I suggest
"rte_flow_metadata".

If the flag is going to be used in several places in dpdk (rte_flow,
pmd, app, ...), I wonder if it shouldn't be defined in
rte_mbuf_dyn.c. I mean:

====
/* rte_mbuf_dyn.c */
const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
   ...
};
int rte_mbuf_dynfield_flow_metadata_offset = -1;
const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
   ...
};
int rte_mbuf_dynflag_flow_metadata_bitnum = -1;

int rte_mbuf_dyn_flow_metadata_register(void)
{
...
}

/* rte_mbuf_dyn.h */
extern const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata;
extern int rte_mbuf_dynfield_flow_metadata_offset;
extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
extern int rte_mbuf_dynflag_flow_metadata_bitnum;

...helpers to set/get metadata...
===

Centralizing the definitions of non-private dynamic fields/flags in
rte_mbuf_dyn may help other people to reuse a field that is well
described if it matches their use-case.

In your case, what is carried by metadata? Could it be reused by
others? I think some more description is needed.


Regards,
Olivier

* Re: [dpdk-dev] [PATCH v2] ethdev: extend flow metadata
  2019-10-18  9:22     ` Olivier Matz
@ 2019-10-19 19:47       ` Slava Ovsiienko
  2019-10-21 16:37         ` Olivier Matz
  0 siblings, 1 reply; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-19 19:47 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, Matan Azrad, Raslan Darawsheh, Thomas Monjalon, Yongseok Koh

Hi, Olivier

Thank you for your comments (and for the dynamic mbuf patch, btw). Please see below.

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Friday, October 18, 2019 12:22
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> Darawsheh <rasland@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Yongseok Koh <yskoh@mellanox.com>
> Subject: Re: [PATCH v2] ethdev: extend flow metadata
> 
> Hi Viacheslav,
> 
> Few comments on the dynamic mbuf part below.
> 
[snip]

> > @@ -12,10 +12,18 @@
> >  #include <rte_errno.h>
> >  #include <rte_branch_prediction.h>
> >  #include <rte_string_fns.h>
> > +#include <rte_mbuf.h>
> > +#include <rte_mbuf_dyn.h>
> >  #include "rte_ethdev.h"
> >  #include "rte_flow_driver.h"
> >  #include "rte_flow.h"
> >
> > +/* Mbuf dynamic field name for metadata. */
> > +int rte_flow_dynf_metadata_offs = -1;
> > +
> > +/* Mbuf dynamic field flag bit number for metadata. */
> > +uint64_t rte_flow_dynf_metadata_mask;
> > +
> >  /**
> >   * Flow elements description tables.
> >   */
> > @@ -153,8 +161,41 @@ struct rte_flow_desc_data {
> >  	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
> >  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
> >  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> > +	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
> >  };
> >
> > +int
> > +rte_flow_dynf_metadata_register(void)
> > +{
> > +	int offset;
> > +	int flag;
> > +
> > +	static const struct rte_mbuf_dynfield desc_offs = {
> > +		.name = MBUF_DYNF_METADATA_NAME,
> > +		.size = MBUF_DYNF_METADATA_SIZE,
> > +		.align = MBUF_DYNF_METADATA_ALIGN,
> > +		.flags = MBUF_DYNF_METADATA_FLAGS,
> > +	};
> > +	static const struct rte_mbuf_dynflag desc_flag = {
> > +		.name = MBUF_DYNF_METADATA_NAME,
> > +	};
> 
> I don't think we need #defines.
> You can directly use the name, sizeof() and __alignof__() here.
> If the information is used externally, the structure shall be made global
> non-static.

The intention was to gather all dynamic field definitions in one place
(in rte_mbuf_dyn.h). That makes it easy to see all the fields at a glance (some
might be shared, some might be mutually exclusive) and to estimate the mbuf
space required by the various features. So we can't just fill the structure
fields with plain sizeof() and alignof() instead of the definitions (the field
parameters must be defined once).

I do not see a reason to make the table global; I would prefer the definitions.
The definitions are processed at compile time (table fields are read at
runtime), which allows code optimization and better performance.

> > +
> > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > +	if (offset < 0)
> > +		goto error;
> > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > +	if (flag < 0)
> > +		goto error;
> > +	rte_flow_dynf_metadata_offs = offset;
> > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > +	return 0;
> > +
> > +error:
> > +	rte_flow_dynf_metadata_offs = -1;
> > +	rte_flow_dynf_metadata_mask = 0ULL;
> > +	return -rte_errno;
> > +}
> > +
> >  static int
> >  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
> >  {
> > diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > index 391a44a..a27e619 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -27,6 +27,8 @@
> >  #include <rte_udp.h>
> >  #include <rte_byteorder.h>
> >  #include <rte_esp.h>
> > +#include <rte_mbuf.h>
> > +#include <rte_mbuf_dyn.h>
> >
> >  #ifdef __cplusplus
> >  extern "C" {
> > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> >  	/**
> >  	 * [META]
> >  	 *
> > -	 * Matches a metadata value specified in mbuf metadata field.
> > +	 * Matches a metadata value.
> > +	 *
> >  	 * See struct rte_flow_item_meta.
> >  	 */
> >  	RTE_FLOW_ITEM_TYPE_META,
> > @@ -1213,9 +1216,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
> >  #endif
> >
> >  /**
> > - * RTE_FLOW_ITEM_TYPE_META.
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> >   *
> > - * Matches a specified metadata value.
> > + * RTE_FLOW_ITEM_TYPE_META
> > + *
> > + * Matches a specified metadata value. On egress, metadata can be set either by
> > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> > + * metadata for a packet and the metadata will be reported via mbuf metadata
> > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> > + * registered in advance by rte_flow_dynf_metadata_register().
> >   */
> >  struct rte_flow_item_meta {
> >  	rte_be32_t data;
> > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> >  	 * undefined behavior.
> >  	 */
> >  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > +
> > +	/**
> > +	 * Set metadata on ingress or egress path.
> > +	 *
> > +	 * See struct rte_flow_action_set_meta.
> > +	 */
> > +	RTE_FLOW_ACTION_TYPE_SET_META,
> >  };
> >
> >  /**
> > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> >  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
> >  };
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_SET_META
> > + *
> > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > + * PKT_TX_METADATA flag on egress will be overridden by this action. On
> > + * ingress, the metadata will be carried by mbuf metadata dynamic field
> > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
> > + * registered in advance by rte_flow_dynf_metadata_register().
> > + *
> > + * Altering partial bits is supported with mask. For bits which have never
> > + * been set, unpredictable value will be seen depending on driver
> > + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
> > + * or may not be propagated to the other path depending on HW capability.
> > + *
> > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > + */
> > +struct rte_flow_action_set_meta {
> > +	rte_be32_t data;
> > +	rte_be32_t mask;
> > +};
> > +
> > +/* Mbuf dynamic field offset for metadata. */
> > +extern int rte_flow_dynf_metadata_offs;
> > +
> > +/* Mbuf dynamic field flag mask for metadata. */
> > +extern uint64_t rte_flow_dynf_metadata_mask;
> > +
> > +/* Mbuf dynamic field pointer for metadata. */
> > +#define RTE_FLOW_DYNF_METADATA(m) \
> > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> > +
> > +/* Mbuf dynamic flag for metadata. */
> > +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> > +
> 
> I wonder if helpers like this wouldn't be better, because they combine the
> flag and the field:
> 
> /**
>  * Set metadata dynamic field and flag in mbuf.
>  *
>  * rte_flow_dynf_metadata_register() must have been called first.
>  */
> __rte_experimental
> static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
>                                        uint32_t metadata)
> {
>        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
>                        uint32_t *) = metadata;
>        m->ol_flags |= rte_flow_dynf_metadata_mask;
> }
Setting the flag looks redundant.
What if the driver just replaces the metadata and the flag is already set?
Another case: the flags (for a set of fields) might be set in combination.
The mbuf field is supposed to be used in the datapath, where performance is
critical, so adding one more abstraction layer does not seem appropriate.
Also, metadata is not a feature of mbuf. It should have the rte_flow prefix.

> /**
>  * Get metadata dynamic field value in mbuf.
>  *
>  * rte_flow_dynf_metadata_register() must have been called first.
>  */
> __rte_experimental
> static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
>                                        uint32_t *metadata)
> {
>        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
>                return -1;
What if metadata is 0xFFFFFFFF ?
The availability check might cover a larger code block,
so this might not be the best place to check availability.

>        *metadata = *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
>                                uint32_t *);
>        return 0;
> }
> 
> /**
>  * Delete the metadata dynamic flag in mbuf.
>  *
>  * rte_flow_dynf_metadata_register() must have been called first.
>  */
> __rte_experimental
> static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m)
> {
>        m->ol_flags &= ~rte_flow_dynf_metadata_mask;
> }
> 
Sorry, I do not see a practical use case for these helpers. In my opinion they just obscure the code.
They replace very simple code and introduce some risk of performance impact.

> 
> >  /*
> >   * Definition of a single action.
> >   *
> > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {
> >  };
> >
> >  /**
> > + * Check if mbuf dynamic field for metadata is registered.
> > + *
> > + * @return
> > + *   True if registered, false otherwise.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_flow_dynf_metadata_avail(void) {
> > +	return !!rte_flow_dynf_metadata_mask;
> > +}
> 
> _registered() instead of _avail() ?
Accepted, sounds better.

> 
> > +
> > +/**
> > + * Register mbuf dynamic field and flag for metadata.
> > + *
> > + * This function must be called prior to use SET_META action in order to
> > + * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
> > + * application.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +__rte_experimental
> > +int
> > +rte_flow_dynf_metadata_register(void);
> > +
> > +/**
> >   * Check whether a flow rule can be created on a given port.
> >   *
> >   * The flow rule is validated for correctness and whether it could be accepted
> > diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> > index 6e2c816..4ff33ac 100644
> > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
> >   */
> >  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))
> >
> > +/**
> > + * Flow metadata dynamic field definitions.
> > + */
> > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t)
> > +#define MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t)
> > +#define MBUF_DYNF_METADATA_FLAGS 0
> 
> If this flag is only to be used in rte_flow, it can stay in rte_flow.
> The name should follow the function name conventions, I suggest
> "rte_flow_metadata".

The definitions:
MBUF_DYNF_METADATA_NAME,
MBUF_DYNF_METADATA_SIZE,
MBUF_DYNF_METADATA_ALIGN
are global. rte_flow proposes only a minimal set to check and access
the metadata. By knowing the field names, applications would have
more flexibility in processing the fields; for example, it allows optimizing
the handling of multiple dynamic fields. The definition of the metadata size
allows generating optimized code:
#if MBUF_DYNF_METADATA_SIZE == sizeof(uint32)
	*RTE_MBUF_DYNFIELD(m) = get_metadata_32bit()
#else
	*RTE_MBUF_DYNFIELD(m) = get_metadata_64bit()
#endif
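
One caveat on the snippet above: the C preprocessor does not evaluate sizeof()
in #if expressions, so the size would have to be published as a plain integer
for this kind of selection to compile. A sketch under that assumption, where
METADATA_SIZE_BYTES and metadata_store() are hypothetical names used only for
illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the size is published as a plain integer so that #if can
 * test it; sizeof() is not evaluated by the preprocessor. */
#define METADATA_SIZE_BYTES 4

/* Keep the integer in sync with the type it describes. */
static_assert(METADATA_SIZE_BYTES == sizeof(uint32_t),
	      "metadata field is expected to be 32 bits wide");

/* Hypothetical helper: store a metadata value into a field buffer,
 * truncating to the configured width. */
static inline void
metadata_store(void *field, uint64_t value)
{
#if METADATA_SIZE_BYTES == 4
	uint32_t v = (uint32_t)value;
#else
	uint64_t v = value;
#endif
	memcpy(field, &v, sizeof(v));
}
```

An ordinary "if (sizeof(field) == 4)" would also be folded at compile time by
an optimizing compiler, without needing the numeric macro.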

The MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow;
it relates exclusively to dynamic mbuf ("Reserved for future use, must be 0").
Would you like to drop this definition?

> 
> If the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
> ...), I wonder if it shouldn't be defined it in rte_mbuf_dyn.c. I mean:
> 
> ====
> /* rte_mbuf_dyn.c */
> const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
>    ...
> };
In this case we would make this descriptor global.
That is not needed, because no usage is expected other than by
rte_flow_dynf_metadata_register().

> int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
>    ...
> };
> int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> 
> int rte_mbuf_dyn_flow_metadata_register(void)
> {
> ...
> }
> 
> /* rte_mbuf_dyn.h */
> extern const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata;
> extern int rte_mbuf_dynfield_flow_metadata_offset;
> extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
> extern int rte_mbuf_dynflag_flow_metadata_bitnum;
> 
> ...helpers to set/get metadata...
> ===
> 
> Centralizing the definitions of non-private dynamic fields/flags in
> rte_mbuf_dyn may help other people to reuse a field that is well described if
> it match their use-case.

Yes, centralizing is important; that's why MBUF_DYNF_METADATA_xxx are placed
in rte_mbuf_dyn.h. Do you think we should share the descriptors as well?
I have no idea why anyone (other than rte_flow_dynf_metadata_register()) might
register the metadata field directly.

> 
> In your case, what is carried by metadata? Could it be reused by others? I
> think some more description is needed.
In my case, metadata is just an opaque rte_flow-related 32-bit unsigned value provided by
mlx5 hardware in the rx datapath. I cannot guess whether anyone would wish to reuse it.

Brief summary of your comments (just to make sure I understood your proposal correctly):
1. drop all definitions MBUF_DYNF_METADATA_xxx, leave MBUF_DYNF_METADATA_NAME only
2. move the descriptor const struct rte_mbuf_dynfield desc_offs = {} to rte_mbuf_dyn.c and make it global
3. provide helpers to access metadata

[1] and [2] look OK in general, although I think they make the code less flexible and restrict the potential compile-time options.
For now it is a rather theoretical question; if you insist on your approach, please let me know, and I'll address [1] and [2]
and update my patch.

As for [3] - IMHO, the extra abstraction layer is not useful, and might even be harmful.
I tend not to complicate the code, at least, for now.

With best regards,
Slava
 
> Regards,
> Olivier

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v2] ethdev: extend flow metadata
  2019-10-19 19:47       ` Slava Ovsiienko
@ 2019-10-21 16:37         ` Olivier Matz
  2019-10-24  6:49           ` Slava Ovsiienko
  0 siblings, 1 reply; 98+ messages in thread
From: Olivier Matz @ 2019-10-21 16:37 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: dev, Matan Azrad, Raslan Darawsheh, Thomas Monjalon, Yongseok Koh

Hi Slava,

On Sat, Oct 19, 2019 at 07:47:59PM +0000, Slava Ovsiienko wrote:
> Hi, Olivier
> 
> Thank you for your comment (and for the dynamic mbuf patch, btw). Please, see below.
> 
> > -----Original Message-----
> > From: Olivier Matz <olivier.matz@6wind.com>
> > Sent: Friday, October 18, 2019 12:22
> > To: Slava Ovsiienko <viacheslavo@mellanox.com>
> > Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> > Darawsheh <rasland@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Yongseok Koh <yskoh@mellanox.com>
> > Subject: Re: [PATCH v2] ethdev: extend flow metadata
> > 
> > Hi Viacheslav,
> > 
> > Few comments on the dynamic mbuf part below.
> > 
> [snip]
> 
> > > @@ -12,10 +12,18 @@
> > >  #include <rte_errno.h>
> > >  #include <rte_branch_prediction.h>
> > >  #include <rte_string_fns.h>
> > > +#include <rte_mbuf.h>
> > > +#include <rte_mbuf_dyn.h>
> > >  #include "rte_ethdev.h"
> > >  #include "rte_flow_driver.h"
> > >  #include "rte_flow.h"
> > >
> > > +/* Mbuf dynamic field name for metadata. */ int
> > > +rte_flow_dynf_metadata_offs = -1;
> > > +
> > > +/* Mbuf dynamic field flag bit number for metadata. */ uint64_t
> > > +rte_flow_dynf_metadata_mask;
> > > +
> > >  /**
> > >   * Flow elements description tables.
> > >   */
> > > @@ -153,8 +161,41 @@ struct rte_flow_desc_data {
> > >  	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
> > >  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
> > >  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> > > +	MK_FLOW_ACTION(SET_META, sizeof(struct
> > rte_flow_action_set_meta)),
> > >  };
> > >
> > > +int
> > > +rte_flow_dynf_metadata_register(void)
> > > +{
> > > +	int offset;
> > > +	int flag;
> > > +
> > > +	static const struct rte_mbuf_dynfield desc_offs = {
> > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > +		.size = MBUF_DYNF_METADATA_SIZE,
> > > +		.align = MBUF_DYNF_METADATA_ALIGN,
> > > +		.flags = MBUF_DYNF_METADATA_FLAGS,
> > > +	};
> > > +	static const struct rte_mbuf_dynflag desc_flag = {
> > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > +	};
> > 
> > I don't see think we need #defines.
> > You can directly use the name, sizeof() and __alignof__() here.
> > If the information is used externally, the structure shall be made global non-
> > static.
> 
> The intention was to gather all dynamic fields definitions in one place 
> (in rte_mbuf_dyn.h).

If the dynamic field is only going to be used inside rte_flow, I think
there is no need to expose it in rte_mbuf_dyn.h.
The other reason is that I think the #defines are just "passthrough" and do
not really bring added value, just an indirection.

> It would be easy to see all fields in one sight (some
> might be shared, some might be mutual exclusive, estimate mbuf space,
> required by various features, etc.). So, we can't just fill structure fields
> with simple sizeof() and alignof() instead of definitions (the field parameters
> must be defined once).
> 
> I do not see the reasons to make table global. I would prefer the definitions.
> - the definitions are compile time processing (table fields are runtime),
> it provides code optimization and better performance.

There is indeed no need to make the table global if the field is private
to rte_flow. About better performance, my understanding is that it would
only impact registration, am I missing something?
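
To make the registration-versus-datapath point concrete, here is a minimal
sketch. It assumes a toy allocator in place of the mbuf library; every name
prefixed with sketch_ is hypothetical and is not DPDK API. The size and
alignment are read once on the control path, and only the resulting offset
is kept for later use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in descriptor with the same members the quoted patch fills in. */
struct sketch_dynfield {
	const char *name;
	size_t size;
	size_t align;
	unsigned int flags;
};

/* Toy allocator standing in for the mbuf library (assumption):
 * it hands out aligned offsets starting at a hypothetical base of 64.
 */
static size_t sketch_next_free = 64;

static int
sketch_dynfield_register(const struct sketch_dynfield *d)
{
	size_t offs;

	assert(d != NULL && d->align != 0);
	/* Round the next free byte up to the requested alignment. */
	offs = (sketch_next_free + d->align - 1) / d->align * d->align;
	sketch_next_free = offs + d->size;
	return (int)offs;
}

/* The only value the datapath needs afterwards; -1 means unregistered. */
static int sketch_metadata_offs = -1;

/* Control path: size and alignment are consumed here, once. */
static int
sketch_metadata_register(void)
{
	static const struct sketch_dynfield desc = {
		.name = "flow-metadata",
		.size = sizeof(uint32_t),
		.align = _Alignof(uint32_t),
		.flags = 0,
	};

	sketch_metadata_offs = sketch_dynfield_register(&desc);
	return sketch_metadata_offs < 0 ? -1 : 0;
}
```

In this sketch, whether the descriptor is filled from #defines or from
sizeof() directly makes no difference once the offset has been stored.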

> 
> > > +
> > > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > > +	if (offset < 0)
> > > +		goto error;
> > > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > > +	if (flag < 0)
> > > +		goto error;
> > > +	rte_flow_dynf_metadata_offs = offset;
> > > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > > +	return 0;
> > > +
> > > +error:
> > > +	rte_flow_dynf_metadata_offs = -1;
> > > +	rte_flow_dynf_metadata_mask = 0ULL;
> > > +	return -rte_errno;
> > > +}
> > > +
> > >  static int
> > >  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)  {
> > > diff --git a/lib/librte_ethdev/rte_flow.h
> > > b/lib/librte_ethdev/rte_flow.h index 391a44a..a27e619 100644
> > > --- a/lib/librte_ethdev/rte_flow.h
> > > +++ b/lib/librte_ethdev/rte_flow.h
> > > @@ -27,6 +27,8 @@
> > >  #include <rte_udp.h>
> > >  #include <rte_byteorder.h>
> > >  #include <rte_esp.h>
> > > +#include <rte_mbuf.h>
> > > +#include <rte_mbuf_dyn.h>
> > >
> > >  #ifdef __cplusplus
> > >  extern "C" {
> > > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> > >  	/**
> > >  	 * [META]
> > >  	 *
> > > -	 * Matches a metadata value specified in mbuf metadata field.
> > > +	 * Matches a metadata value.
> > > +	 *
> > >  	 * See struct rte_flow_item_meta.
> > >  	 */
> > >  	RTE_FLOW_ITEM_TYPE_META,
> > > @@ -1213,9 +1216,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
> > > #endif
> > >
> > >  /**
> > > - * RTE_FLOW_ITEM_TYPE_META.
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice
> > >   *
> > > - * Matches a specified metadata value.
> > > + * RTE_FLOW_ITEM_TYPE_META
> > > + *
> > > + * Matches a specified metadata value. On egress, metadata can be set
> > > + either by
> > > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > > + * metadata for a packet and the metadata will be reported via mbuf
> > > + metadata
> > > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
> > > + field must be
> > > + * registered in advance by rte_flow_dynf_metadata_register().
> > >   */
> > >  struct rte_flow_item_meta {
> > >  	rte_be32_t data;
> > > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> > >  	 * undefined behavior.
> > >  	 */
> > >  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > > +
> > > +	/**
> > > +	 * Set metadata on ingress or egress path.
> > > +	 *
> > > +	 * See struct rte_flow_action_set_meta.
> > > +	 */
> > > +	RTE_FLOW_ACTION_TYPE_SET_META,
> > >  };
> > >
> > >  /**
> > > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> > >  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];  };
> > >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice
> > > + *
> > > + * RTE_FLOW_ACTION_TYPE_SET_META
> > > + *
> > > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > > + * PKT_TX_METADATA flag on egress will be overridden by this action.
> > > +On
> > > + * ingress, the metadata will be carried by mbuf metadata dynamic
> > > +field
> > > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> > > +must be
> > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > + *
> > > + * Altering partial bits is supported with mask. For bits which have
> > > +never
> > > + * been set, unpredictable value will be seen depending on driver
> > > + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx
> > > +may
> > > + * or may not be propagated to the other path depending on HW
> > capability.
> > > + *
> > > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > > + */
> > > +struct rte_flow_action_set_meta {
> > > +	rte_be32_t data;
> > > +	rte_be32_t mask;
> > > +};
> > > +
> > > +/* Mbuf dynamic field offset for metadata. */ extern int
> > > +rte_flow_dynf_metadata_offs;
> > > +
> > > +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > > +rte_flow_dynf_metadata_mask;
> > > +
> > > +/* Mbuf dynamic field pointer for metadata. */ #define
> > > +RTE_FLOW_DYNF_METADATA(m) \
> > > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t
> > *)
> > > +
> > > +/* Mbuf dynamic flag for metadata. */ #define PKT_RX_DYNF_METADATA
> > > +(rte_flow_dynf_metadata_mask)
> > > +
> > 
> > I wonder if helpers like this wouldn't be better, because they combine the
> > flag and the field:
> > 
> > /**
> >  * Set metadata dynamic field and flag in mbuf.
> >  *
> >  * rte_flow_dynf_metadata_register() must have been called first.
> >  */
> > __rte_experimental
> > static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
> >                                        uint32_t metadata) {
> >        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> >                        uint32_t *) = metadata;
> >        m->ol_flags |= rte_flow_dynf_metadata_mask; }
> Setting flag looks redundantly.
> What if driver just replaces the metadata and flag is already set?
> The other option - the flags (for set of fields) might be set in combinations.
> mbuf field is supposed to be engaged in datapath, performance is
> very critical, adding one more abstraction layer seems not to be relevant.

Ok, that was just a suggestion. Let's use your accessors if you fear a
performance impact.

Nevertheless, I suggest using static inline functions in place of
macros if possible. For RTE_MBUF_DYNFIELD(), I used a macro because
it's the only way to provide a type to cast the result. But in your
case, you know it's a uint32_t *.
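
A minimal sketch of what such a static inline accessor could look like,
assuming a stand-in mbuf structure so that it builds without DPDK headers.
The sketch_ names are hypothetical. Only the macro body mirrors the
RTE_MBUF_DYNFIELD() definition quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_mbuf: only what the sketch needs (assumption). */
struct sketch_mbuf {
	uint64_t ol_flags;
	uint32_t dynfield[8]; /* room for dynamic fields */
};

/* Same shape as the quoted RTE_MBUF_DYNFIELD(): the caller supplies the type. */
#define SKETCH_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))

/* Set once at registration time; -1 means "not registered". */
static int sketch_metadata_offs = -1;

/* Typed accessor: the return type is fixed, so no cast argument is needed.
 * It returns a pointer, so callers can still read, write, or pass the address.
 */
static inline uint32_t *
sketch_flow_dynf_metadata(struct sketch_mbuf *m)
{
	assert(sketch_metadata_offs >= 0);
	return SKETCH_MBUF_DYNFIELD(m, sketch_metadata_offs, uint32_t *);
}
```

Usage would be "*sketch_flow_dynf_metadata(m) = value;", the same as with
the macro, while the compiler checks the argument type.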

> Also, metadata is not feature of mbuf. It should have rte_flow prefix.

Yes, sure. The example derives from a test I've done, and I forgot to
change it.


> > /**
> >  * Get metadata dynamic field value in mbuf.
> >  *
> >  * rte_flow_dynf_metadata_register() must have been called first.
> >  */
> > __rte_experimental
> > static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
> >                                        uint32_t *metadata) {
> >        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
> >                return -1;
> What if metadata is 0xFFFFFFFF ?
> The checking of availability might embrace larger code block, 
> so this might be not the best place to check availability.
> 
> >        *metadata = *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> >                                uint32_t *);
> >        return 0;
> > }
> > 
> > /**
> >  * Delete the metadata dynamic flag in mbuf.
> >  *
> >  * rte_flow_dynf_metadata_register() must have been called first.
> >  */
> > __rte_experimental
> > static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m) {
> >        m->ol_flags &= ~rte_flow_dynf_metadata_mask; }
> > 
> Sorry, I do not see the practical usecase for these helpers. In my opinion it is just some kind of obscuration.
> They do replace the very simple code and introduce some risk of performance impact.
> 
> > 
> > >  /*
> > >   * Definition of a single action.
> > >   *
> > > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {  };
> > >
> > >  /**
> > > + * Check if mbuf dynamic field for metadata is registered.
> > > + *
> > > + * @return
> > > + *   True if registered, false otherwise.
> > > + */
> > > +__rte_experimental
> > > +static inline int
> > > +rte_flow_dynf_metadata_avail(void) {
> > > +	return !!rte_flow_dynf_metadata_mask; }
> > 
> > _registered() instead of _avail() ?
> Accepted, sounds better.
> 
> > 
> > > +
> > > +/**
> > > + * Register mbuf dynamic field and flag for metadata.
> > > + *
> > > + * This function must be called prior to use SET_META action in order
> > > +to
> > > + * register the dynamic mbuf field. Otherwise, the data cannot be
> > > +delivered to
> > > + * application.
> > > + *
> > > + * @return
> > > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > > + */
> > > +__rte_experimental
> > > +int
> > > +rte_flow_dynf_metadata_register(void);
> > > +
> > > +/**
> > >   * Check whether a flow rule can be created on a given port.
> > >   *
> > >   * The flow rule is validated for correctness and whether it could be accepted
> > > diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > index 6e2c816..4ff33ac 100644
> > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
> > >   */
> > >  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) +
> > > (offset)))
> > >
> > > +/**
> > > + * Flow metadata dynamic field definitions.
> > > + */
> > > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t)
> > > +#define MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t)
> > > +#define MBUF_DYNF_METADATA_FLAGS 0
> > 
> > If this flag is only to be used in rte_flow, it can stay in rte_flow.
> > The name should follow the function name conventions, I suggest
> > "rte_flow_metadata".
> 
> The definitions:
> MBUF_DYNF_METADATA_NAME, 
> MBUF_DYNF_METADATA_SIZE,
> MBUF_DYNF_METADATA_ALIGN
> are global. rte_flow proposes only minimal set tyo check and access
> the metadata. By knowing the field names applications would have the
> more flexibility in processing the fields, for example it allows to  optimize
> the handling of multiple dynamic fields . The definition of metadata size allows
> to generate optimized code:
> #if MBUF_DYNF_METADATA_SIZE == sizeof(uint32)
> 	*RTE_MBUF_DYNFIELD(m) = get_metadata_32bit()
> #else
> 	*RTE_MBUF_DYNFIELD(m) = get_metadata_64bit()
> #endif

I don't see any reason why the same dynamic field could have different
sizes, I even think it could be dangerous. Your accessors suppose that
the metadata is a uint32_t. Having a compile-time option for that does
not look desirable.
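
As a side remark on the quoted "#if ... == sizeof(uint32)" snippet: the C
preprocessor cannot evaluate sizeof, so that condition would not compile as
written. If the field is meant to stay 32 bits wide, one option is a
compile-time assertion. The sketch below assumes C11 and uses hypothetical
sketch_ names. It is an illustration, not something proposed in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type name for the metadata dynamic field. */
typedef uint32_t sketch_flow_metadata_t;

/* C11 compile-time check: the build fails if the field type changes size. */
_Static_assert(sizeof(sketch_flow_metadata_t) == sizeof(uint32_t),
	       "flow metadata dynamic field must stay 32 bits wide");

/* Accessors can then assume 32 bits without any #if on the size. */
static inline uint32_t
sketch_metadata_load(const sketch_flow_metadata_t *field)
{
	assert(field != NULL);
	return *field;
}
```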

Just a side note: we have to take care when adding a new *public*
dynamic field that it won't change in the future: the accessors are
macros or static inline functions, so they are embedded in the binaries
and may not be updated when the dpdk is updated (as a shared lib).
This is probably something we should discuss.


> MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow,
> this flag is related exclusively to dynamic mbuf  " Reserved for future use, must be 0".
> Would you like to drop this definition?
> 
> > 
> > If the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
> > ...), I wonder if it shouldn't be defined it in rte_mbuf_dyn.c. I mean:
> > 
> > ====
> > /* rte_mbuf_dyn.c */
> > const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
> >    ...
> > };
> In this case we would make this descriptor global.
> It is no needed, because there Is no supposed any usage, but by
> rte_flow_dynf_metadata_register() only. The 

Yes, in my example I wasn't sure it was going to be private to rte_flow (see "If
the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
...)").

So yes, I agree the struct should remain private.


> > int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> > rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
> >    ...
> > };
> > int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> > 
> > int rte_mbuf_dyn_flow_metadata_register(void)
> > {
> > ...
> > }
> > 
> > /* rte_mbuf_dyn.h */
> > extern const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata;
> > extern int rte_mbuf_dynfield_flow_metadata_offset;
> > extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
> > extern int rte_mbuf_dynflag_flow_metadata_bitnum;
> > 
> > ...helpers to set/get metadata...
> > ===
> > 
> > Centralizing the definitions of non-private dynamic fields/flags in
> > rte_mbuf_dyn may help other people to reuse a field that is well described if
> > it match their use-case.
> 
> Yes, centralizing is important, that's why MBUF_DYNF_METADATA_xxx placed
> in rte_mbuf_dyn.h. Do you think we should share the descriptors either?
> I have no idea why someone (but rte_flow_dynf_metadata_register()) might
> register metadata field directly.

If the field is private to rte_flow, yes, there is no need to share the
"struct rte_mbuf_dynfield". Even the rte_flow_dynf_metadata_register()
could be marked as internal, right?

One more question: I see the registration is done by
parse_vc_action_set_meta(). My understanding is that this function is
not in datapath, and is called when configuring rte_flow. Do you
confirm?

> > In your case, what is carried by metadata? Could it be reused by others? I
> > think some more description is needed.
> In my case, metadata is just opaquie rte_flow related 32-bit unsigned value provided by
> mlx5 hardrware in rx datapath. I have no guess whether someone wishes to reuse.

What is the user supposed to do with this value? If it is hw-specific
data, I think the name of the mbuf field should include "MLX", and it
should be described.

Are these rte_flow actions somehow specific to Mellanox drivers?

> Brief summary of you comment (just to make sure I understood your proposal in correct way):
> 1. drop all definitions MBUF_DYNF_METADATA_xxx, leave MBUF_DYNF_METADATA_NAME only
> 2. move the descriptor const struct rte_mbuf_dynfield desc_offs = {} to rte_mbuf_dyn.c and make it global
> 3. provide helpers to access metadata
> 
> [1] and [2] look OK in general. Although I think these ones make code less flexible, restrict the potential compile time options.
> For now it is rather theoretical question, if you insist on your approach - please, let me know, I'll address [1] and [2]
> and update.my patch.

[1] I think the #define only adds an indirection, and I didn't see any
    perf constraint here.
[2] My previous comment was surely not clear, sorry. The code can stay
    in rte_flow.

> As for [3] - IMHO, the extra abstraction layer is not useful, and might be even harmful.
> I tend not to complicate the code, at least, for now.

[3] ok for me


Thanks,
Olivier

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v2] ethdev: extend flow metadata
  2019-10-21 16:37         ` Olivier Matz
@ 2019-10-24  6:49           ` Slava Ovsiienko
  2019-10-24  9:22             ` Olivier Matz
  0 siblings, 1 reply; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-24  6:49 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Matan Azrad, Raslan Darawsheh, Thomas Monjalon

Hi, Olivier

> > [snip]
> >
> > > > +int
> > > > +rte_flow_dynf_metadata_register(void)
> > > > +{
> > > > +	int offset;
> > > > +	int flag;
> > > > +
> > > > +	static const struct rte_mbuf_dynfield desc_offs = {
> > > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > > +		.size = MBUF_DYNF_METADATA_SIZE,
> > > > +		.align = MBUF_DYNF_METADATA_ALIGN,
> > > > +		.flags = MBUF_DYNF_METADATA_FLAGS,
> > > > +	};
> > > > +	static const struct rte_mbuf_dynflag desc_flag = {
> > > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > > +	};
> > >
> > > I don't see think we need #defines.
> > > You can directly use the name, sizeof() and __alignof__() here.
> > > If the information is used externally, the structure shall be made
> > > global non- static.
> >
> > The intention was to gather all dynamic fields definitions in one
> > place (in rte_mbuf_dyn.h).
> 
> If the dynamic field is only going to be used inside rte_flow, I think there is no
> need to expose it in rte_mbuf_dyn.h.
> The other reason is I think the #define are just "passthrough", and do not
> really bring added value, just an indirection.
> 
> > It would be easy to see all fields in one sight (some might be shared,
> > some might be mutual exclusive, estimate mbuf space, required by
> > various features, etc.). So, we can't just fill structure fields with
> > simple sizeof() and alignof() instead of definitions (the field
> > parameters must be defined once).
> >
> > I do not see the reasons to make table global. I would prefer the
> definitions.
> > - the definitions are compile time processing (table fields are
> > runtime), it provides code optimization and better performance.
> 
> There is indeed no need to make the table global if the field is private to
> rte_flow. About better performance, my understanding is that it would only
> impact registration, am I missing something?

OK, I thought about some opportunity to allow the application to register
the field directly, bypassing rte_flow_dynf_metadata_register(). So either
the definitions or the field description table was supposed to be global.
I agree, let's not complicate the matter; I'll make only the
metadata field name definition global, in rte_mbuf_dyn.h, just
to have a central point.

> >
> > > > +
> > > > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > > > +	if (offset < 0)
> > > > +		goto error;
> > > > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > > > +	if (flag < 0)
> > > > +		goto error;
> > > > +	rte_flow_dynf_metadata_offs = offset;
> > > > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > > > +	return 0;
> > > > +
> > > > +error:
> > > > +	rte_flow_dynf_metadata_offs = -1;
> > > > +	rte_flow_dynf_metadata_mask = 0ULL;
> > > > +	return -rte_errno;
> > > > +}
> > > > +
> > > >  static int
> > > >  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
> > > > { diff --git a/lib/librte_ethdev/rte_flow.h
> > > > b/lib/librte_ethdev/rte_flow.h index 391a44a..a27e619 100644
> > > > --- a/lib/librte_ethdev/rte_flow.h
> > > > +++ b/lib/librte_ethdev/rte_flow.h
> > > > @@ -27,6 +27,8 @@
> > > >  #include <rte_udp.h>
> > > >  #include <rte_byteorder.h>
> > > >  #include <rte_esp.h>
> > > > +#include <rte_mbuf.h>
> > > > +#include <rte_mbuf_dyn.h>
> > > >
> > > >  #ifdef __cplusplus
> > > >  extern "C" {
> > > > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> > > >  	/**
> > > >  	 * [META]
> > > >  	 *
> > > > -	 * Matches a metadata value specified in mbuf metadata field.
> > > > +	 * Matches a metadata value.
> > > > +	 *
> > > >  	 * See struct rte_flow_item_meta.
> > > >  	 */
> > > >  	RTE_FLOW_ITEM_TYPE_META,
> > > > @@ -1213,9 +1216,17 @@ struct
> rte_flow_item_icmp6_nd_opt_tla_eth {
> > > > #endif
> > > >
> > > >  /**
> > > > - * RTE_FLOW_ITEM_TYPE_META.
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > + notice
> > > >   *
> > > > - * Matches a specified metadata value.
> > > > + * RTE_FLOW_ITEM_TYPE_META
> > > > + *
> > > > + * Matches a specified metadata value. On egress, metadata can be
> > > > + set either by
> > > > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > > > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > > > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > > > + * metadata for a packet and the metadata will be reported via
> > > > + mbuf metadata
> > > > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic
> mbuf
> > > > + field must be
> > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > >   */
> > > >  struct rte_flow_item_meta {
> > > >  	rte_be32_t data;
> > > > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> > > >  	 * undefined behavior.
> > > >  	 */
> > > >  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > > > +
> > > > +	/**
> > > > +	 * Set metadata on ingress or egress path.
> > > > +	 *
> > > > +	 * See struct rte_flow_action_set_meta.
> > > > +	 */
> > > > +	RTE_FLOW_ACTION_TYPE_SET_META,
> > > >  };
> > > >
> > > >  /**
> > > > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> > > >  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];  };
> > > >
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > +notice
> > > > + *
> > > > + * RTE_FLOW_ACTION_TYPE_SET_META
> > > > + *
> > > > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > > > + * PKT_TX_METADATA flag on egress will be overridden by this action.
> > > > +On
> > > > + * ingress, the metadata will be carried by mbuf metadata dynamic
> > > > +field
> > > > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> > > > +must be
> > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > + *
> > > > + * Altering partial bits is supported with mask. For bits which
> > > > +have never
> > > > + * been set, unpredictable value will be seen depending on driver
> > > > + * implementation. For loopback/hairpin packet, metadata set on
> > > > +Rx/Tx may
> > > > + * or may not be propagated to the other path depending on HW
> > > capability.
> > > > + *
> > > > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > > > + */
> > > > +struct rte_flow_action_set_meta {
> > > > +	rte_be32_t data;
> > > > +	rte_be32_t mask;
> > > > +};
> > > > +
> > > > +/* Mbuf dynamic field offset for metadata. */ extern int
> > > > +rte_flow_dynf_metadata_offs;
> > > > +
> > > > +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > > > +rte_flow_dynf_metadata_mask;
> > > > +
> > > > +/* Mbuf dynamic field pointer for metadata. */ #define
> > > > +RTE_FLOW_DYNF_METADATA(m) \
> > > > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t
> > > *)
> > > > +
> > > > +/* Mbuf dynamic flag for metadata. */ #define
> > > > +PKT_RX_DYNF_METADATA
> > > > +(rte_flow_dynf_metadata_mask)
> > > > +
> > >
> > > I wonder if helpers like this wouldn't be better, because they
> > > combine the flag and the field:
> > >
> > > /**
> > >  * Set metadata dynamic field and flag in mbuf.
> > >  *
> > >  * rte_flow_dynf_metadata_register() must have been called first.
> > >  */
> > > __rte_experimental
> > > static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
> > >                                        uint32_t metadata) {
> > >        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> > >                        uint32_t *) = metadata;
> > >        m->ol_flags |= rte_flow_dynf_metadata_mask; }
> > Setting flag looks redundantly.
> > What if driver just replaces the metadata and flag is already set?
> > The other option - the flags (for set of fields) might be set in combinations.
> > mbuf field is supposed to be engaged in datapath, performance is very
> > critical, adding one more abstraction layer seems not to be relevant.
> 
> Ok, that was just a suggestion. Let's use your accessors if you fear a
> performance impact.
A simple example: the mlx5 PMD has its rx_burst routine implemented
with vector instructions, and it processes four packets at once. There is
no need to check field availability four times, and storing the metadata
is a subject for further optimization with vector instructions.
It is a bit difficult to provide common helpers to handle the metadata
field due to the extremely high optimization requirements.
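
A scalar sketch of the "check once per burst" idea described above, assuming
a stand-in mbuf structure and a burst of four. All sketch_ names are
hypothetical. This is not the mlx5 code, which uses vector instructions,
and it tags every packet of the burst for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BURST 4

/* Stand-in for struct rte_mbuf (assumption). */
struct sketch_mbuf {
	uint64_t ol_flags;
	uint32_t dynfield[8];
};

static int sketch_metadata_offs = -1;
static uint64_t sketch_metadata_mask; /* 0 until registered */

/* Store metadata for one burst; returns the number of mbufs tagged. */
static inline unsigned int
sketch_rx_burst_set_meta(struct sketch_mbuf *pkts[SKETCH_BURST],
			 const uint32_t meta[SKETCH_BURST])
{
	unsigned int i;

	/* One availability check for the whole burst, not one per packet. */
	if (sketch_metadata_mask == 0)
		return 0;
	for (i = 0; i < SKETCH_BURST; i++) {
		uint32_t *field = (uint32_t *)((uintptr_t)pkts[i] +
					       sketch_metadata_offs);

		*field = meta[i];
		pkts[i]->ol_flags |= sketch_metadata_mask;
	}
	return SKETCH_BURST;
}
```

A per-packet get/set helper would repeat the mask test inside the loop,
which is the cost this layout avoids.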

> 
> Nevertheless I suggest to use static inline functions in place of macros if
> possible. For RTE_MBUF_DYNFIELD(), I used a macro because it's the only
> way to provide a type to cast the result. But in your case, you know it's a
> uint32_t *.
What if one needs the address of the field? A macro allows that;
inline functions do not. Packets may be processed in bizarre ways,
for example in a batch, with vector instructions. OK, I'll provide
the set/get routines, but I'm not sure whether I will use them in the mlx5 code.
In my opinion they just obscure the nature of the field. A field is just a field;
AFAIU, the main idea of your patch is that handling a dynamic field should be
close to handling the usual static fields. The macro pointer follows this
approach; routines do not.

> > Also, metadata is not feature of mbuf. It should have rte_flow prefix.
> 
> Yes, sure. The example derives from a test I've done, and I forgot to change
> it.
> 
> 
> > > /**
> > >  * Get metadata dynamic field value in mbuf.
> > >  *
> > >  * rte_flow_dynf_metadata_register() must have been called first.
> > >  */
> > > __rte_experimental
> > > static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
> > >                                        uint32_t *metadata) {
> > >        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
> > >                return -1;
> > What if metadata is 0xFFFFFFFF ?
> > The checking of availability might embrace larger code block, so this
> > might be not the best place to check availability.
> >
> > >        *metadata = *RTE_MBUF_DYNFIELD(m,
> rte_flow_dynf_metadata_offs,
> > >                                uint32_t *);
> > >        return 0;
> > > }
> > >
> > > /**
> > >  * Delete the metadata dynamic flag in mbuf.
> > >  *
> > >  * rte_flow_dynf_metadata_register() must have been called first.
> > >  */
> > > __rte_experimental
> > > static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m) {
> > >        m->ol_flags &= ~rte_flow_dynf_metadata_mask; }
> > >
> > Sorry, I do not see the practical usecase for these helpers. In my opinion it
> is just some kind of obscuration.
> > They do replace the very simple code and introduce some risk of
> performance impact.
> >
> > >
> > > >  /*
> > > >   * Definition of a single action.
> > > >   *
> > > > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {  };
> > > >
> > > >  /**
> > > > + * Check if mbuf dynamic field for metadata is registered.
> > > > + *
> > > > + * @return
> > > > + *   True if registered, false otherwise.
> > > > + */
> > > > +__rte_experimental
> > > > +static inline int
> > > > +rte_flow_dynf_metadata_avail(void) {
> > > > +	return !!rte_flow_dynf_metadata_mask; }
> > >
> > > _registered() instead of _avail() ?
> > Accepted, sounds better.

Hmm, I changed my opinion - we already have
rte_flow_dynf_metadata_register(void). Is it OK to have
rte_flow_dynf_metadata_registerED(void) ?
It would be easy to mistype. 

> >
> > >
> > > > +
> > > > +/**
> > > > + * Register mbuf dynamic field and flag for metadata.
> > > > + *
> > > > + * This function must be called prior to use SET_META action in
> > > > +order to
> > > > + * register the dynamic mbuf field. Otherwise, the data cannot be
> > > > +delivered to
> > > > + * application.
> > > > + *
> > > > + * @return
> > > > + *   0 on success, a negative errno value otherwise and rte_errno is
> set.
> > > > + */
> > > > +__rte_experimental
> > > > +int
> > > > +rte_flow_dynf_metadata_register(void);
> > > > +
> > > > +/**
> > > >   * Check whether a flow rule can be created on a given port.
> > > >   *
> > > >   * The flow rule is validated for correctness and whether it
> > > > could be accepted diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > b/lib/librte_mbuf/rte_mbuf_dyn.h index 6e2c816..4ff33ac 100644
> > > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char
> *name,
> > > >   */
> > > >  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m)
> > > > +
> > > > (offset)))
> > > >
> > > > +/**
> > > > + * Flow metadata dynamic field definitions.
> > > > + */
> > > > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > > > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t)
> > > > +#define MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t)
> > > > +#define MBUF_DYNF_METADATA_FLAGS 0
> > >
> > > If this flag is only to be used in rte_flow, it can stay in rte_flow.
> > > The name should follow the function name conventions, I suggest
> > > "rte_flow_metadata".
> >
> > The definitions:
> > MBUF_DYNF_METADATA_NAME,
> > MBUF_DYNF_METADATA_SIZE,
> > MBUF_DYNF_METADATA_ALIGN
> > are global. rte_flow proposes only minimal set to check and access
> > the metadata. By knowing the field names applications would have the
> > more flexibility in processing the fields, for example it allows to
> > optimize the handling of multiple dynamic fields . The definition of
> > metadata size allows to generate optimized code:
> > #if MBUF_DYNF_METADATA_SIZE == sizeof(uint32)
> > 	*RTE_MBUF_DYNFIELD(m) = get_metadata_32bit()
> > #else
> > 	*RTE_MBUF_DYNFIELD(m) = get_metadata_64bit()
> > #endif
> 
> I don't see any reason why the same dynamic field could have different sizes,
> I even think it could be dangerous. Your accessors suppose that the
> metadata is a uint32_t. Having a compile-time option for that does not look
> desirable.

I tried to provide maximal flexibility, and it was just an example of what
we could do with global definitions. If you think we do not need it - OK,
let's keep things simpler.

> 
> Just a side note: we have to take care when adding a new *public* dynamic
> field that it won't change in the future: the accessors are macros or static
> inline functions, so they are embedded in the binaries.
> This is probably something we should discuss and may not be when updating
> the dpdk (as shared lib).

Yes, agreed: the defines would not work correctly in that case and could even break the ABI.
As we decided, the global metadata defines MBUF_DYNF_METADATA_xxxx
should be removed.

> 
> > MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow, this flag is
> > related exclusively to dynamic mbuf  " Reserved for future use, must be 0".
> > Would you like to drop this definition?
> >
> > >
> > > If the flag is going to be used in several places in dpdk (rte_flow,
> > > pmd, app, ...), I wonder if it shouldn't be defined it in rte_mbuf_dyn.c. I
> mean:
> > >
> > > ====
> > > /* rte_mbuf_dyn.c */
> > > const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
> > >    ...
> > > };
> > In this case we would make this descriptor global.
> > It is no needed, because there Is no supposed any usage, but by
> > rte_flow_dynf_metadata_register() only. The
> 
> Yes, in my example I wasn't sure it was going to be private to rte_flow (see
> "If the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
> ...)").
> 
> So yes, I agree the struct should remain private.
OK.

> 
> 
> > > int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> > > rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
> > >    ...
> > > };
> > > int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> > >
> > > int rte_mbuf_dyn_flow_metadata_register(void)
> > > {
> > > ...
> > > }
> > >
> > > /* rte_mbuf_dyn.h */
> > > extern const struct rte_mbuf_dynfield
> > > rte_mbuf_dynfield_flow_metadata; extern int
> > > rte_mbuf_dynfield_flow_metadata_offset;
> > > extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
> > > extern int rte_mbuf_dynflag_flow_metadata_bitnum;
> > >
> > > ...helpers to set/get metadata...
> > > ===
> > >
> > > Centralizing the definitions of non-private dynamic fields/flags in
> > > rte_mbuf_dyn may help other people to reuse a field that is well
> > > described if it match their use-case.
> >
> > Yes, centralizing is important, that's why MBUF_DYNF_METADATA_xxx
> > placed in rte_mbuf_dyn.h. Do you think we should share the descriptors
> either?
> > I have no idea why someone (but rte_flow_dynf_metadata_register())
> > might register metadata field directly.
> 
> If the field is private to rte_flow, yes, there is no need to share the "struct
> rte_mbuf_dynfield". Even the rte_flow_dynf_metadata_register() could be
> marked as internal, right?
rte_flow_dynf_metadata_register() is intended to be called by the application.
Some applications might wish to use the metadata feature, others might not.

> 
> One more question: I see the registration is done by
> parse_vc_action_set_meta(). My understanding is that this function is not in
> datapath, and is called when configuring rte_flow. Do you confirm?
Rather, it is called to configure the application in general. If the user sets metadata
(by issuing the appropriate command), it is assumed they would also like
the metadata on the Rx side. This is just for test purposes and is not a great
example of a rte_flow_dynf_metadata_register() use case.


> 
> > > In your case, what is carried by metadata? Could it be reused by
> > > others? I think some more description is needed.
> > In my case, metadata is just opaque rte_flow related 32-bit unsigned
> > value provided by
> > mlx5 hardware in rx datapath. I have no guess whether someone wishes
> to reuse.
> 
> What is the user supposed to do with this value? If it is hw-specific data, I
> think the name of the mbuf field should include "MLX", and it should be
> described.

Metadata are not HW specific at all - they neither control HW nor are produced
by HW (setting aside the fact that the flow engine is implemented in HW).
Metadata are opaque data, a kind of link between the data
path and the flow space. With metadata, the application may provide some per-packet
information to the flow engine and get some information back from the flow engine.
It is a generic concept, meant to be neither HW-related nor vendor specific.

> 
> Are these rte_flow actions somehow specific to mellanox drivers ?

AFAIK, currently it is going to be supported by the mlx5 PMD only,
but the concept is common and not vendor specific.

> 
> > Brief summary of you comment (just to make sure I understood your
> proposal in correct way):
> > 1. drop all definitions MBUF_DYNF_METADATA_xxx, leave
> > MBUF_DYNF_METADATA_NAME only 2. move the descriptor const struct
> > rte_mbuf_dynfield desc_offs = {} to rte_mbuf_dyn.c and make it global
> > 3. provide helpers to access metadata
> >
> > [1] and [2] look OK in general. Although I think these ones make code less
> flexible, restrict the potential compile time options.
> > For now it is rather theoretical question, if you insist on your
> > approach - please, let me know, I'll address [1] and [2] and update.my
> patch.
> 
> [1] I think the #define only adds an indirection, and I didn't see any
>     perf constraint here.
> [2] My previous comment was surely not clear, sorry. The code can stay
>     in rte_flow.
> 
> > As for [3] - IMHO, the extra abstraction layer is not useful, and might be
> even harmful.
> > I tend not to complicate the code, at least, for now.
> 
> [3] ok for me
> 
> 
> Thanks,
> Olivier

With best regards, Slava

* Re: [dpdk-dev] [PATCH v2] ethdev: extend flow metadata
  2019-10-24  6:49           ` Slava Ovsiienko
@ 2019-10-24  9:22             ` Olivier Matz
  2019-10-24 12:30               ` Slava Ovsiienko
  0 siblings, 1 reply; 98+ messages in thread
From: Olivier Matz @ 2019-10-24  9:22 UTC (permalink / raw)
  To: Slava Ovsiienko; +Cc: dev, Matan Azrad, Raslan Darawsheh, Thomas Monjalon

Hi Slava,

On Thu, Oct 24, 2019 at 06:49:41AM +0000, Slava Ovsiienko wrote:
> Hi, Olivier
> 
> > > [snip]
> > >
> > > > > +int
> > > > > +rte_flow_dynf_metadata_register(void)
> > > > > +{
> > > > > +	int offset;
> > > > > +	int flag;
> > > > > +
> > > > > +	static const struct rte_mbuf_dynfield desc_offs = {
> > > > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > > > +		.size = MBUF_DYNF_METADATA_SIZE,
> > > > > +		.align = MBUF_DYNF_METADATA_ALIGN,
> > > > > +		.flags = MBUF_DYNF_METADATA_FLAGS,
> > > > > +	};
> > > > > +	static const struct rte_mbuf_dynflag desc_flag = {
> > > > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > > > +	};
> > > >
> > > > I don't see think we need #defines.
> > > > You can directly use the name, sizeof() and __alignof__() here.
> > > > If the information is used externally, the structure shall be made
> > > > global non- static.
> > >
> > > The intention was to gather all dynamic fields definitions in one
> > > place (in rte_mbuf_dyn.h).
> > 
> > If the dynamic field is only going to be used inside rte_flow, I think there is no
> > need to expose it in rte_mbuf_dyn.h.
> > The other reason is I think the #define are just "passthrough", and do not
> > really bring added value, just an indirection.
> > 
> > > It would be easy to see all fields in one sight (some might be shared,
> > > some might be mutual exclusive, estimate mbuf space, required by
> > > various features, etc.). So, we can't just fill structure fields with
> > > simple sizeof() and alignof() instead of definitions (the field
> > > parameters must be defined once).
> > >
> > > I do not see the reasons to make table global. I would prefer the
> > definitions.
> > > - the definitions are compile time processing (table fields are
> > > runtime), it provides code optimization and better performance.
> > 
> > There is indeed no need to make the table global if the field is private to
> > rte_flow. About better performance, my understanding is that it would only
> > impact registration, am I missing something?
> 
> OK, I thought about some opportunity to allow application to register
> field directly, bypassing rte_flow_dynf_metadata_register(). So either
> definitions or field description table was supposed to be global. 
> I agree, let's do not complicate the matter, I'll will make global the
> metadata field name definition only - in the rte_mbuf_dyn.h in order
> just to have some centralizing point.

By reading your mail, things are also clearer to me about which
parts need access to this field.

To summarize what I understand:
- dyn field registration is done in rte_flow lib when configuring
  a flow using META
- the dynamic field will never be read or written in an mbuf by a PMD or
  rte_flow before a flow using META is added

One question then: why would you need the dyn field name to be exported?
Does the PMD need to know if the field is registered with a lookup or
something like that? If yes, can you detail why?
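
To make the lookup question concrete, here is a minimal sketch of what
name-keyed registration buys: any component that knows the name can find
the offset without a shared global symbol. Everything named mock_* is a
hypothetical stand-in written for this mail, not the rte_mbuf_dyn
implementation, and alignment handling is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mock of a dynamic-field registry; not the DPDK code. */
#define MOCK_MAX_FIELDS 8
#define MOCK_DYN_AREA_SIZE 32	/* bytes available for dynamic fields */

struct mock_dynfield {
	char name[32];
	size_t size;
	int offset;
};

static struct mock_dynfield mock_fields[MOCK_MAX_FIELDS];
static int mock_nb_fields;
static size_t mock_next_offset;

/* Return the offset of a registered field, or -1 if the name is unknown. */
static int
mock_dynfield_lookup(const char *name)
{
	int i;

	assert(name != NULL);
	for (i = 0; i < mock_nb_fields; i++)
		if (strcmp(mock_fields[i].name, name) == 0)
			return mock_fields[i].offset;
	return -1;
}

/*
 * Register a field by name. Registering the same name with the same size
 * again returns the existing offset; a size mismatch or lack of room
 * returns -1. Alignment is ignored in this sketch.
 */
static int
mock_dynfield_register(const char *name, size_t size)
{
	struct mock_dynfield *f;
	int i;

	assert(name != NULL);
	for (i = 0; i < mock_nb_fields; i++) {
		if (strcmp(mock_fields[i].name, name) != 0)
			continue;
		return mock_fields[i].size == size ?
			mock_fields[i].offset : -1;
	}
	if (mock_nb_fields == MOCK_MAX_FIELDS ||
	    strlen(name) >= sizeof(mock_fields[0].name) ||
	    mock_next_offset + size > MOCK_DYN_AREA_SIZE)
		return -1;
	f = &mock_fields[mock_nb_fields++];
	strcpy(f->name, name);
	f->size = size;
	f->offset = (int)mock_next_offset;
	mock_next_offset += size;
	return f->offset;
}
```

With such a scheme the name string is the only thing that has to be
shared between the registering side and the side doing the lookup.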


> 
> > >
> > > > > +
> > > > > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > > > > +	if (offset < 0)
> > > > > +		goto error;
> > > > > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > > > > +	if (flag < 0)
> > > > > +		goto error;
> > > > > +	rte_flow_dynf_metadata_offs = offset;
> > > > > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > > > > +	return 0;
> > > > > +
> > > > > +error:
> > > > > +	rte_flow_dynf_metadata_offs = -1;
> > > > > +	rte_flow_dynf_metadata_mask = 0ULL;
> > > > > +	return -rte_errno;
> > > > > +}
> > > > > +
> > > > >  static int
> > > > >  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
> > > > > { diff --git a/lib/librte_ethdev/rte_flow.h
> > > > > b/lib/librte_ethdev/rte_flow.h index 391a44a..a27e619 100644
> > > > > --- a/lib/librte_ethdev/rte_flow.h
> > > > > +++ b/lib/librte_ethdev/rte_flow.h
> > > > > @@ -27,6 +27,8 @@
> > > > >  #include <rte_udp.h>
> > > > >  #include <rte_byteorder.h>
> > > > >  #include <rte_esp.h>
> > > > > +#include <rte_mbuf.h>
> > > > > +#include <rte_mbuf_dyn.h>
> > > > >
> > > > >  #ifdef __cplusplus
> > > > >  extern "C" {
> > > > > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> > > > >  	/**
> > > > >  	 * [META]
> > > > >  	 *
> > > > > -	 * Matches a metadata value specified in mbuf metadata field.
> > > > > +	 * Matches a metadata value.
> > > > > +	 *
> > > > >  	 * See struct rte_flow_item_meta.
> > > > >  	 */
> > > > >  	RTE_FLOW_ITEM_TYPE_META,
> > > > > @@ -1213,9 +1216,17 @@ struct
> > rte_flow_item_icmp6_nd_opt_tla_eth {
> > > > > #endif
> > > > >
> > > > >  /**
> > > > > - * RTE_FLOW_ITEM_TYPE_META.
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > > + notice
> > > > >   *
> > > > > - * Matches a specified metadata value.
> > > > > + * RTE_FLOW_ITEM_TYPE_META
> > > > > + *
> > > > > + * Matches a specified metadata value. On egress, metadata can be
> > > > > + set either by
> > > > > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > > > > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > > > > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > > > > + * metadata for a packet and the metadata will be reported via
> > > > > + mbuf metadata
> > > > > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic
> > mbuf
> > > > > + field must be
> > > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > >   */
> > > > >  struct rte_flow_item_meta {
> > > > >  	rte_be32_t data;
> > > > > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> > > > >  	 * undefined behavior.
> > > > >  	 */
> > > > >  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > > > > +
> > > > > +	/**
> > > > > +	 * Set metadata on ingress or egress path.
> > > > > +	 *
> > > > > +	 * See struct rte_flow_action_set_meta.
> > > > > +	 */
> > > > > +	RTE_FLOW_ACTION_TYPE_SET_META,
> > > > >  };
> > > > >
> > > > >  /**
> > > > > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> > > > >  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];  };
> > > > >
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > > +notice
> > > > > + *
> > > > > + * RTE_FLOW_ACTION_TYPE_SET_META
> > > > > + *
> > > > > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > > > > + * PKT_TX_METADATA flag on egress will be overridden by this action.
> > > > > +On
> > > > > + * ingress, the metadata will be carried by mbuf metadata dynamic
> > > > > +field
> > > > > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> > > > > +must be
> > > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > > + *
> > > > > + * Altering partial bits is supported with mask. For bits which
> > > > > +have never
> > > > > + * been set, unpredictable value will be seen depending on driver
> > > > > + * implementation. For loopback/hairpin packet, metadata set on
> > > > > +Rx/Tx may
> > > > > + * or may not be propagated to the other path depending on HW
> > > > capability.
> > > > > + *
> > > > > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > > > > + */
> > > > > +struct rte_flow_action_set_meta {
> > > > > +	rte_be32_t data;
> > > > > +	rte_be32_t mask;
> > > > > +};
> > > > > +
> > > > > > +/* Mbuf dynamic field offset for metadata. */
> > > > > > +extern int rte_flow_dynf_metadata_offs;
> > > > > > +
> > > > > > +/* Mbuf dynamic field flag mask for metadata. */
> > > > > > +extern uint64_t rte_flow_dynf_metadata_mask;
> > > > > > +
> > > > > > +/* Mbuf dynamic field pointer for metadata. */
> > > > > > +#define RTE_FLOW_DYNF_METADATA(m) \
> > > > > > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> > > > > > +
> > > > > > +/* Mbuf dynamic flag for metadata. */
> > > > > > +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> > > > > +
> > > >
> > > > I wonder if helpers like this wouldn't be better, because they
> > > > combine the flag and the field:
> > > >
> > > > /**
> > > >  * Set metadata dynamic field and flag in mbuf.
> > > >  *
> > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > >  */
> > > > __rte_experimental
> > > > static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
> > > >                                        uint32_t metadata) {
> > > >        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> > > >                        uint32_t *) = metadata;
> > > >        m->ol_flags |= rte_flow_dynf_metadata_mask; }
> > > Setting flag looks redundantly.
> > > What if driver just replaces the metadata and flag is already set?
> > > The other option - the flags (for set of fields) might be set in combinations.
> > > mbuf field is supposed to be engaged in datapath, performance is very
> > > critical, adding one more abstraction layer seems not to be relevant.
> > 
> > Ok, that was just a suggestion. Let's use your accessors if you fear a
> > performance impact.
> The simple example - mlx5 PMD has the rx_burst routine implemented
> with vector instructions, and it processes four packets at once. No need
> to check field availability four times, and the storing the metadata
> is the subject for further optimization with vector instructions.
> It is a bit difficult to provide common helpers to handle the metadata
> field due to extremely high optimization requirements.

ok, got it
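
Just to check my understanding, a rough sketch of the "check once per
burst" idea, assuming a simplified stand-in for the mbuf. The mock_*
names are hypothetical and only mimic rte_flow_dynf_metadata_offs and
rte_flow_dynf_metadata_mask; the scalar loop stands in for what a PMD
would do with vector instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_mbuf; only what the sketch needs. */
struct mock_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[16];
};

/* Stand-ins for the offset and flag mask; initial state is unregistered. */
static int mock_meta_offs = -1;
static uint64_t mock_meta_mask;

/*
 * Store per-packet metadata for a burst. The "is the field registered"
 * test is done once per burst, not once per packet.
 * Returns the number of packets that received metadata.
 */
static unsigned int
mock_rx_burst_store_meta(struct mock_mbuf **pkts, const uint32_t *meta,
			 unsigned int n)
{
	unsigned int i;

	assert(n == 0 || (pkts != NULL && meta != NULL));
	if (mock_meta_mask == 0)
		return 0;
	for (i = 0; i < n; i++) {
		/* Offset is relative to dyn_area in this mock. */
		memcpy(pkts[i]->dyn_area + mock_meta_offs, &meta[i],
		       sizeof(meta[i]));
		pkts[i]->ol_flags |= mock_meta_mask;
	}
	return n;
}
```

Note that validity is carried by the flag, so a value such as 0xFFFFFFFF
is stored like any other.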

> > Nevertheless I suggest to use static inline functions in place of macros if
> > possible. For RTE_MBUF_DYNFIELD(), I used a macro because it's the only
> > way to provide a type to cast the result. But in your case, you know it's a
> > uint32_t *.
> What If one needs to specify the address of field? Macro allows to do that,
> inline functions - do not. Packets may be processed in bizarre ways,
> for example in a batch, with vector instructions. OK, I'll provide
> the set/get routines, but I'm not sure whether will use ones in mlx5 code.
> In my opinion it just obscures the field nature. Field is just a field, AFAIU, 
> it is main idea of your patch, the way to handle dynamic field should be close
> to handling usual static fields, I think. Macro pointer follows this approach,
> routines - does not.

Well, I just think that:
  rte_mbuf_set_timestamp(m, 1234);
is more readable than:
  *RTE_MBUF_TIMESTAMP(m) = 1234;

Anyway, in your case, if you need to use vector instructions in the PMD,
I guess you will directly use the offset.
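
For what it is worth, both styles can coexist. Below is a small sketch
with a mock mbuf: the macro yields the field address (usable for batch
or vector code), and the inline helpers are thin wrappers over it. The
mock_* and MOCK_* names are hypothetical; only the shape of
MOCK_DYNFIELD follows the RTE_MBUF_DYNFIELD definition quoted in this
thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf. */
struct mock_mbuf {
	uint64_t ol_flags;
	uint32_t dyn_area[4];	/* uint32_t keeps the field aligned */
};

/* Offset of the metadata field from the start of the mbuf (assumed). */
static int mock_meta_offs =
	(int)(offsetof(struct mock_mbuf, dyn_area) + sizeof(uint32_t));

/* Same shape as RTE_MBUF_DYNFIELD(): pointer at a byte offset. */
#define MOCK_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))

/* Macro style: gives the address, so the caller can read or write. */
#define MOCK_FLOW_METADATA(m) \
	MOCK_DYNFIELD((m), mock_meta_offs, uint32_t *)

/* Function style: thin wrappers over the same offset. */
static inline void
mock_flow_metadata_set(struct mock_mbuf *m, uint32_t v)
{
	assert(m != NULL);
	*MOCK_FLOW_METADATA(m) = v;
}

static inline uint32_t
mock_flow_metadata_get(const struct mock_mbuf *m)
{
	assert(m != NULL);
	return *MOCK_DYNFIELD(m, mock_meta_offs, const uint32_t *);
}
```

Both forms touch the same storage; the choice is about readability at
the call site versus direct access to the address.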

> > > Also, metadata is not feature of mbuf. It should have rte_flow prefix.
> > 
> > Yes, sure. The example derives from a test I've done, and I forgot to change
> > it.
> > 
> > 
> > > > /**
> > > >  * Get metadata dynamic field value in mbuf.
> > > >  *
> > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > >  */
> > > > __rte_experimental
> > > > static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
> > > >                                        uint32_t *metadata) {
> > > >        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
> > > >                return -1;
> > > What if metadata is 0xFFFFFFFF ?
> > > The checking of availability might embrace larger code block, so this
> > > might be not the best place to check availability.
> > >
> > > >        *metadata = *RTE_MBUF_DYNFIELD(m,
> > rte_flow_dynf_metadata_offs,
> > > >                                uint32_t *);
> > > >        return 0;
> > > > }
> > > >
> > > > /**
> > > >  * Delete the metadata dynamic flag in mbuf.
> > > >  *
> > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > >  */
> > > > __rte_experimental
> > > > static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m) {
> > > >        m->ol_flags &= ~rte_flow_dynf_metadata_mask; }
> > > >
> > > Sorry, I do not see the practical usecase for these helpers. In my opinion it
> > is just some kind of obscuration.
> > > They do replace the very simple code and introduce some risk of
> > performance impact.
> > >
> > > >
> > > > >  /*
> > > > >   * Definition of a single action.
> > > > >   *
> > > > > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {  };
> > > > >
> > > > >  /**
> > > > > + * Check if mbuf dynamic field for metadata is registered.
> > > > > + *
> > > > > + * @return
> > > > > + *   True if registered, false otherwise.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +static inline int
> > > > > +rte_flow_dynf_metadata_avail(void) {
> > > > > +	return !!rte_flow_dynf_metadata_mask; }
> > > >
> > > > _registered() instead of _avail() ?
> > > Accepted, sounds better.
> 
> Hmm, I changed my opinion - we already have
> rte_flow_dynf_metadata_register(void). Is it OK to have
> rte_flow_dynf_metadata_registerED(void) ?
> It would be easy to mistype. 

What about xxx_is_registered()?
If you feel it's too long, OK, let's keep avail().

> 
> > >
> > > >
> > > > > +
> > > > > +/**
> > > > > + * Register mbuf dynamic field and flag for metadata.
> > > > > + *
> > > > > + * This function must be called prior to use SET_META action in
> > > > > +order to
> > > > > + * register the dynamic mbuf field. Otherwise, the data cannot be
> > > > > +delivered to
> > > > > + * application.
> > > > > + *
> > > > > + * @return
> > > > > + *   0 on success, a negative errno value otherwise and rte_errno is
> > set.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +int
> > > > > +rte_flow_dynf_metadata_register(void);
> > > > > +
> > > > > +/**
> > > > >   * Check whether a flow rule can be created on a given port.
> > > > >   *
> > > > >   * The flow rule is validated for correctness and whether it
> > > > > could be accepted diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > b/lib/librte_mbuf/rte_mbuf_dyn.h index 6e2c816..4ff33ac 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char
> > *name,
> > > > >   */
> > > > >  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m)
> > > > > +
> > > > > (offset)))
> > > > >
> > > > > +/**
> > > > > + * Flow metadata dynamic field definitions.
> > > > > + */
> > > > > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > > > > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t)
> > > > > +#define MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t)
> > > > > +#define MBUF_DYNF_METADATA_FLAGS 0
> > > >
> > > > If this flag is only to be used in rte_flow, it can stay in rte_flow.
> > > > The name should follow the function name conventions, I suggest
> > > > "rte_flow_metadata".
> > >
> > > The definitions:
> > > MBUF_DYNF_METADATA_NAME,
> > > MBUF_DYNF_METADATA_SIZE,
> > > MBUF_DYNF_METADATA_ALIGN
> > > > are global. rte_flow proposes only minimal set to check and access
> > > the metadata. By knowing the field names applications would have the
> > > more flexibility in processing the fields, for example it allows to
> > > optimize the handling of multiple dynamic fields . The definition of
> > > metadata size allows to generate optimized code:
> > > > #if MBUF_DYNF_METADATA_SIZE == sizeof(uint32)
> > > > 	*RTE_MBUF_DYNFIELD(m) = get_metadata_32bit()
> > > > #else
> > > > 	*RTE_MBUF_DYNFIELD(m) = get_metadata_64bit()
> > > > #endif
> > 
> > I don't see any reason why the same dynamic field could have different sizes,
> > I even think it could be dangerous. Your accessors suppose that the
> > metadata is a uint32_t. Having a compile-time option for that does not look
> > desirable.
> 
> I tried to provide maximal flexibility and It was just an example of the thing
> we could do with global definitions. If you think we do not need it - OK,
> let's do things simpler.
> 
> > 
> > Just a side note: we have to take care when adding a new *public* dynamic
> > field that it won't change in the future: the accessors are macros or static
> > inline functions, so they are embedded in the binaries.
> > This is probably something we should discuss and may not be when updating
> > the dpdk (as shared lib).
> 
> Yes, agree, defines just will not work correct in correct way and even break an ABI.
> As we decided - global metadata defines MBUF_DYNF_METADATA_xxxx
> should be removed.
> 
> > 
> > > MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow, this flag is
> > > related exclusively to dynamic mbuf  " Reserved for future use, must be 0".
> > > Would you like to drop this definition?
> > >
> > > >
> > > > If the flag is going to be used in several places in dpdk (rte_flow,
> > > > pmd, app, ...), I wonder if it shouldn't be defined it in rte_mbuf_dyn.c. I
> > mean:
> > > >
> > > > ====
> > > > /* rte_mbuf_dyn.c */
> > > > const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
> > > >    ...
> > > > };
> > > In this case we would make this descriptor global.
> > > It is no needed, because there Is no supposed any usage, but by
> > > rte_flow_dynf_metadata_register() only. The
> > 
> > Yes, in my example I wasn't sure it was going to be private to rte_flow (see
> > "If the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
> > ...)").
> > 
> > So yes, I agree the struct should remain private.
> OK.
> 
> > 
> > 
> > > > int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> > > > rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
> > > >    ...
> > > > };
> > > > int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> > > >
> > > > int rte_mbuf_dyn_flow_metadata_register(void)
> > > > {
> > > > ...
> > > > }
> > > >
> > > > /* rte_mbuf_dyn.h */
> > > > extern const struct rte_mbuf_dynfield
> > > > rte_mbuf_dynfield_flow_metadata; extern int
> > > > rte_mbuf_dynfield_flow_metadata_offset;
> > > > extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
> > > > extern int rte_mbuf_dynflag_flow_metadata_bitnum;
> > > >
> > > > ...helpers to set/get metadata...
> > > > ===
> > > >
> > > > Centralizing the definitions of non-private dynamic fields/flags in
> > > > rte_mbuf_dyn may help other people to reuse a field that is well
> > > > described if it match their use-case.
> > >
> > > Yes, centralizing is important, that's why MBUF_DYNF_METADATA_xxx
> > > placed in rte_mbuf_dyn.h. Do you think we should share the descriptors
> > either?
> > > I have no idea why someone (but rte_flow_dynf_metadata_register())
> > > might register metadata field directly.
> > 
> > If the field is private to rte_flow, yes, there is no need to share the "struct
> > rte_mbuf_dynfield". Even the rte_flow_dynf_metadata_register() could be
> > marked as internal, right?
> rte_flow_dynf_metadata_register() is intended to be called by application.
> Some applications might wish to engage metadata feature, some ones - not.
> 
> > 
> > One more question: I see the registration is done by
> > parse_vc_action_set_meta(). My understanding is that this function is not in
> > datapath, and is called when configuring rte_flow. Do you confirm?
> Rather it is called to configure application in general. If user sets metadata 
> (by issuing the appropriate command) it is assumed he/she would like
> the metadata on Rx side either. This is just for test purposes and it is not brilliant
> example of rte_flow_dynf_metadata_register() use case.
> 
> 
> > 
> > > > In your case, what is carried by metadata? Could it be reused by
> > > > others? I think some more description is needed.
> > > > In my case, metadata is just opaque rte_flow related 32-bit unsigned
> > > > value provided by
> > > > mlx5 hardware in rx datapath. I have no guess whether someone wishes
> > to reuse.
> > 
> > What is the user supposed to do with this value? If it is hw-specific data, I
> > think the name of the mbuf field should include "MLX", and it should be
> > described.
> 
> Metadata are not HW specific at all - they neither control nor are produced
> by HW (abstracting from the flow engine is implemented in HW).
> Metadata are some opaque data, it is some kind of link between data
> path and flow space.  With metadata application may provide some per packet
> information to flow engine and get back some information from flow engine.
> it is generic concept, supposed to be neither HW-related nor vendor specific.

ok, understood, it's like a mark or tag.

> > Are these rte_flow actions somehow specific to mellanox drivers ?
> 
> AFAIK, currently it is going to be supported by mlx5 PMD only,
> but concept is common and is not vendor specific.
> 
> > 
> > > Brief summary of you comment (just to make sure I understood your
> > proposal in correct way):
> > > 1. drop all definitions MBUF_DYNF_METADATA_xxx, leave
> > > MBUF_DYNF_METADATA_NAME only 2. move the descriptor const struct
> > > rte_mbuf_dynfield desc_offs = {} to rte_mbuf_dyn.c and make it global
> > > 3. provide helpers to access metadata
> > >
> > > [1] and [2] look OK in general. Although I think these ones make code less
> > flexible, restrict the potential compile time options.
> > > For now it is rather theoretical question, if you insist on your
> > > approach - please, let me know, I'll address [1] and [2] and update.my
> > patch.
> > 
> > [1] I think the #define only adds an indirection, and I didn't see any
> >     perf constraint here.
> > [2] My previous comment was surely not clear, sorry. The code can stay
> >     in rte_flow.
> > 
> > > As for [3] - IMHO, the extra abstraction layer is not useful, and might be
> > even harmful.
> > > I tend not to complicate the code, at least, for now.
> > 
> > [3] ok for me
> > 
> > 
> > Thanks,
> > Olivier
> 
> With best regards, Slava

Thanks

* Re: [dpdk-dev] [PATCH v2] ethdev: extend flow metadata
  2019-10-24  9:22             ` Olivier Matz
@ 2019-10-24 12:30               ` Slava Ovsiienko
  0 siblings, 0 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-24 12:30 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Matan Azrad, Raslan Darawsheh, Thomas Monjalon

Hi Olivier,
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Thursday, October 24, 2019 12:23
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> Darawsheh <rasland@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [PATCH v2] ethdev: extend flow metadata
>
> Hi Slava,
>
> On Thu, Oct 24, 2019 at 06:49:41AM +0000, Slava Ovsiienko wrote:
> > Hi, Olivier
> >
> > > > [snip]
> > > >
> > > > > > +int
> > > > > > +rte_flow_dynf_metadata_register(void)
> > > > > > +{
> > > > > > +   int offset;
> > > > > > +   int flag;
> > > > > > +
> > > > > > +   static const struct rte_mbuf_dynfield desc_offs = {
> > > > > > +           .name = MBUF_DYNF_METADATA_NAME,
> > > > > > +           .size = MBUF_DYNF_METADATA_SIZE,
> > > > > > +           .align = MBUF_DYNF_METADATA_ALIGN,
> > > > > > +           .flags = MBUF_DYNF_METADATA_FLAGS,
> > > > > > +   };
> > > > > > +   static const struct rte_mbuf_dynflag desc_flag = {
> > > > > > +           .name = MBUF_DYNF_METADATA_NAME,
> > > > > > +   };
> > > > >
> > > > > I don't see think we need #defines.
> > > > > You can directly use the name, sizeof() and __alignof__() here.
> > > > > If the information is used externally, the structure shall be
> > > > > made global non- static.
> > > >
> > > > The intention was to gather all dynamic fields definitions in one
> > > > place (in rte_mbuf_dyn.h).
> > >
> > > If the dynamic field is only going to be used inside rte_flow, I
> > > think there is no need to expose it in rte_mbuf_dyn.h.
> > > The other reason is I think the #define are just "passthrough", and
> > > do not really bring added value, just an indirection.
> > >
> > > > It would be easy to see all fields in one sight (some might be
> > > > shared, some might be mutual exclusive, estimate mbuf space,
> > > > required by various features, etc.). So, we can't just fill
> > > > structure fields with simple sizeof() and alignof() instead of
> > > > definitions (the field parameters must be defined once).
> > > >
> > > > I do not see the reasons to make table global. I would prefer the
> > > definitions.
> > > > - the definitions are compile time processing (table fields are
> > > > runtime), it provides code optimization and better performance.
> > >
> > > There is indeed no need to make the table global if the field is
> > > private to rte_flow. About better performance, my understanding is
> > > that it would only impact registration, am I missing something?
> >
> > OK, I thought about some opportunity to allow application to register
> > field directly, bypassing rte_flow_dynf_metadata_register(). So either
> > definitions or field description table was supposed to be global.
> > I agree, let's do not complicate the matter, I'll will make global the
> > metadata field name definition only - in the rte_mbuf_dyn.h in order
> > just to have some centralizing point.
>
> By reading your mail, things are also clearer to me about which parts need
> access to this field.
>
> To summarize what I understand:
> - dyn field registration is done in rte_flow lib when configuring
>   a flow using META
> - the dynamic field will never be get/set in a mbuf by a PMD or rte_flow
>   before a flow using META is added

In testpmd with the current patch, yes, but this is just a sample. Common practice for
enabling metadata may differ: if an application sees a PMD that supports the RX/TX_METADATA
offload and wants to receive metadata, it registers the dynamic field itself.
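
To illustrate that application-side sequence (register once at init, then
check availability and the per-packet flag before reading), here is a
minimal stand-alone sketch. It is a model, not DPDK code: struct model_mbuf
and every model_* name are made up for the example, and the fixed offset and
flag bit are assumptions standing in for what the real registration would
return.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct rte_mbuf: only the parts this sketch needs. */
struct model_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[16]; /* room handed out to dynamic fields */
};

/* These play the role of rte_flow_dynf_metadata_offs / _mask. */
static int model_meta_offs = -1;
static uint64_t model_meta_mask;

/* Hypothetical registration: picks a fixed offset and flag bit. */
static int
model_meta_register(void)
{
	model_meta_offs = (int)offsetof(struct model_mbuf, dyn_area);
	model_meta_mask = 1ULL << 23; /* arbitrary bit for the model */
	return 0;
}

/* Non-zero once the field and flag have been registered. */
static int
model_meta_avail(void)
{
	return model_meta_mask != 0;
}

/* Returns 1 and fills *out only if the packet carries metadata. */
static int
model_meta_read(const struct model_mbuf *m, uint32_t *out)
{
	if (!model_meta_avail() || !(m->ol_flags & model_meta_mask))
		return 0;
	memcpy(out, (const uint8_t *)m + model_meta_offs, sizeof(*out));
	return 1;
}
```

The point of the model is only the ordering: nothing is read from a packet
until registration has happened and the per-packet flag says the value is
valid.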

> One question then: why would you need the dyn field name to be exported?
> Does the PMD need to know if the field is registered with a lookup or
> something like that? If yes, can you detail why?

I think the PMD might need that.
Right now I have an issue with the mlx5 PMD compiled as a shared library:
the global variables from rte_flow.c are not seen in the PMD (just because I forgot
to add them to the .map file). The way dynamic data is linked is system
dependent and may need to be optimized. I mean, in some
cases the PMD might need to do the lookup explicitly and use local copies
of the offset and mask. So I'd prefer the field descriptor to be
globally visible. Yes, the PMD can take the offset/flag directly
from the rte_flow variables and cache them internally,
so a global descriptor is just some kind of insurance.
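
A rough sketch of the "local copies of offset and mask" idea, again as a
stand-alone model rather than mlx5 or DPDK code (struct model_mbuf,
model_rx_burst_store() and the two globals are hypothetical names): the
burst routine reads the shared variables once and then works from locals
for every packet in the burst.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct rte_mbuf. */
struct model_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[16];
};

/* Globals standing in for the rte_flow offset/mask variables. */
static int model_meta_offs = (int)offsetof(struct model_mbuf, dyn_area);
static uint64_t model_meta_mask = 1ULL << 23;

/* Store metadata for a whole burst, reading the globals only once. */
static unsigned int
model_rx_burst_store(struct model_mbuf **pkts, const uint32_t *meta,
		     unsigned int n)
{
	const int offs = model_meta_offs;       /* local copy */
	const uint64_t mask = model_meta_mask;  /* local copy */
	unsigned int i;

	if (mask == 0) /* field not registered: nothing to deliver */
		return 0;
	for (i = 0; i < n; i++) {
		memcpy((uint8_t *)pkts[i] + offs, &meta[i], sizeof(meta[i]));
		pkts[i]->ol_flags |= mask;
	}
	return n;
}
```

The availability check is done once per burst instead of once per packet,
which is the property a vectorized Rx routine would want to keep.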

As for the name, it is less critical; it may
just be useful for various log/debug messages and so on. The other
reason to have a name definition is to put it at a "centralizing point"
in rte_mbuf_dyn.h, to gather all names together and
eliminate name conflicts (yes, the documented naming convention
reduces the risk, but it is convenient to see all field names
at a glance: it is easy to determine which are supported, etc).
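
To show why a shared name matters for lookups and conflicts, here is a toy
name-keyed registry. It is only a model of the idea (model_register() and
model_lookup() are invented for this sketch, and alignment is ignored); the
assumption it encodes is that registering the same name with the same size
returns the existing offset, while the same name with a different size is
rejected.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_FIELDS 8

struct model_field {
	const char *name;
	size_t size;
	int offset;
};

static struct model_field model_fields[MODEL_MAX_FIELDS];
static int model_nb_fields;
static int model_next_offset;

/* Return the offset of a named field, or -1 if it is unknown. */
static int
model_lookup(const char *name)
{
	int i;

	for (i = 0; i < model_nb_fields; i++)
		if (strcmp(model_fields[i].name, name) == 0)
			return model_fields[i].offset;
	return -1;
}

/* Register a field by name; a conflicting redefinition returns -1. */
static int
model_register(const char *name, size_t size)
{
	int i;

	for (i = 0; i < model_nb_fields; i++) {
		if (strcmp(model_fields[i].name, name) != 0)
			continue;
		/* Same name: accept only an identical definition. */
		return model_fields[i].size == size ?
		       model_fields[i].offset : -1;
	}
	if (model_nb_fields == MODEL_MAX_FIELDS)
		return -1;
	model_fields[model_nb_fields].name = name;
	model_fields[model_nb_fields].size = size;
	model_fields[model_nb_fields].offset = model_next_offset;
	model_next_offset += (int)size;
	return model_fields[model_nb_fields++].offset;
}
```

With one agreed name, a second component can find the field by lookup
instead of relying on a variable exported by another library.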

> >
> > > >
> > > > > > +
> > > > > > +   offset = rte_mbuf_dynfield_register(&desc_offs);
> > > > > > +   if (offset < 0)
> > > > > > +           goto error;
> > > > > > +   flag = rte_mbuf_dynflag_register(&desc_flag);
> > > > > > +   if (flag < 0)
> > > > > > +           goto error;
> > > > > > +   rte_flow_dynf_metadata_offs = offset;
> > > > > > +   rte_flow_dynf_metadata_mask = (1ULL << flag);
> > > > > > +   return 0;
> > > > > > +
> > > > > > +error:
> > > > > > +   rte_flow_dynf_metadata_offs = -1;
> > > > > > +   rte_flow_dynf_metadata_mask = 0ULL;
> > > > > > +   return -rte_errno;
> > > > > > +}
> > > > > > +
> > > > > >  static int
> > > > > >  flow_err(uint16_t port_id, int ret, struct rte_flow_error
> > > > > > *error) { diff --git a/lib/librte_ethdev/rte_flow.h
> > > > > > b/lib/librte_ethdev/rte_flow.h index 391a44a..a27e619 100644
> > > > > > --- a/lib/librte_ethdev/rte_flow.h
> > > > > > +++ b/lib/librte_ethdev/rte_flow.h
> > > > > > @@ -27,6 +27,8 @@
> > > > > >  #include <rte_udp.h>
> > > > > >  #include <rte_byteorder.h>
> > > > > >  #include <rte_esp.h>
> > > > > > +#include <rte_mbuf.h>
> > > > > > +#include <rte_mbuf_dyn.h>
> > > > > >
> > > > > >  #ifdef __cplusplus
> > > > > >  extern "C" {
> > > > > > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> > > > > >     /**
> > > > > >      * [META]
> > > > > >      *
> > > > > > -    * Matches a metadata value specified in mbuf metadata
> field.
> > > > > > +    * Matches a metadata value.
> > > > > > +    *
> > > > > >      * See struct rte_flow_item_meta.
> > > > > >      */
> > > > > >     RTE_FLOW_ITEM_TYPE_META,
> > > > > > @@ -1213,9 +1216,17 @@ struct
> > > rte_flow_item_icmp6_nd_opt_tla_eth {
> > > > > > #endif
> > > > > >
> > > > > >  /**
> > > > > > - * RTE_FLOW_ITEM_TYPE_META.
> > > > > > + * @warning
> > > > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > > > + notice
> > > > > >   *
> > > > > > - * Matches a specified metadata value.
> > > > > > + * RTE_FLOW_ITEM_TYPE_META
> > > > > > + *
> > > > > > + * Matches a specified metadata value. On egress, metadata
> > > > > > + can be set either by
> > > > > > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > > > > > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > > > > > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > > > > > + * metadata for a packet and the metadata will be reported
> > > > > > + via mbuf metadata
> > > > > > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic
> > > mbuf
> > > > > > + field must be
> > > > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > > >   */
> > > > > >  struct rte_flow_item_meta {
> > > > > >     rte_be32_t data;
> > > > > > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> > > > > >      * undefined behavior.
> > > > > >      */
> > > > > >     RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > > > > > +
> > > > > > +   /**
> > > > > > +    * Set metadata on ingress or egress path.
> > > > > > +    *
> > > > > > +    * See struct rte_flow_action_set_meta.
> > > > > > +    */
> > > > > > +   RTE_FLOW_ACTION_TYPE_SET_META,
> > > > > >  };
> > > > > >
> > > > > >  /**
> > > > > > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> > > > > >     uint8_t mac_addr[RTE_ETHER_ADDR_LEN];  };
> > > > > >
> > > > > > +/**
> > > > > > + * @warning
> > > > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > > > +notice
> > > > > > + *
> > > > > > + * RTE_FLOW_ACTION_TYPE_SET_META
> > > > > > + *
> > > > > > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > > > > > + * PKT_TX_METADATA flag on egress will be overridden by this
> action.
> > > > > > +On
> > > > > > + * ingress, the metadata will be carried by mbuf metadata
> > > > > > +dynamic field
> > > > > > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf
> > > > > > +field must be
> > > > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > > > + *
> > > > > > + * Altering partial bits is supported with mask. For bits
> > > > > > +which have never
> > > > > > + * been set, unpredictable value will be seen depending on
> > > > > > +driver
> > > > > > + * implementation. For loopback/hairpin packet, metadata set
> > > > > > +on Rx/Tx may
> > > > > > + * or may not be propagated to the other path depending on HW
> > > > > capability.
> > > > > > + *
> > > > > > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > > > > > + */
> > > > > > +struct rte_flow_action_set_meta {
> > > > > > +   rte_be32_t data;
> > > > > > +   rte_be32_t mask;
> > > > > > +};
> > > > > > +
> > > > > > > +/* Mbuf dynamic field offset for metadata. */
> > > > > > > +extern int rte_flow_dynf_metadata_offs;
> > > > > > > +
> > > > > > > +/* Mbuf dynamic field flag mask for metadata. */
> > > > > > > +extern uint64_t rte_flow_dynf_metadata_mask;
> > > > > > > +
> > > > > > > +/* Mbuf dynamic field pointer for metadata. */
> > > > > > > +#define RTE_FLOW_DYNF_METADATA(m) \
> > > > > > > +   RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> > > > > > > +
> > > > > > > +/* Mbuf dynamic flag for metadata. */
> > > > > > > +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> > > > > > +
> > > > >
> > > > > I wonder if helpers like this wouldn't be better, because they
> > > > > combine the flag and the field:
> > > > >
> > > > > /**
> > > > >  * Set metadata dynamic field and flag in mbuf.
> > > > >  *
> > > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > > >  */
> > > > > __rte_experimental
> > > > > > static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
> > > > > >                                        uint32_t metadata)
> > > > > > {
> > > > > >        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> > > > > >                        uint32_t *) = metadata;
> > > > > >        m->ol_flags |= rte_flow_dynf_metadata_mask;
> > > > > > }
> > > > Setting flag looks redundantly.
> > > > What if driver just replaces the metadata and flag is already set?
> > > > The other option - the flags (for set of fields) might be set in
> combinations.
> > > > mbuf field is supposed to be engaged in datapath, performance is
> > > > very critical, adding one more abstraction layer seems not to be
> relevant.
> > >
> > > Ok, that was just a suggestion. Let's use your accessors if you fear
> > > a performance impact.
> > The simple example - mlx5 PMD has the rx_burst routine implemented
> > with vector instructions, and it processes four packets at once. No
> > need to check field availability four times, and the storing the
> > metadata is the subject for further optimization with vector instructions.
> > It is a bit difficult to provide common helpers to handle the metadata
> > field due to extremely high optimization requirements.
>
> ok, got it
>
> > > Nevertheless I suggest to use static inline functions in place of
> > > macros if possible. For RTE_MBUF_DYNFIELD(), I used a macro because
> > > it's the only way to provide a type to cast the result. But in your
> > > case, you know it's a uint32_t *.
> > What If one needs to specify the address of field? Macro allows to do
> > that, inline functions - do not. Packets may be processed in bizarre
> > ways, for example in a batch, with vector instructions. OK, I'll
> > provide the set/get routines, but I'm not sure whether will use ones in mlx5
> code.
> > In my opinion it just obscures the field nature. Field is just a
> > field, AFAIU, it is main idea of your patch, the way to handle dynamic
> > field should be close to handling usual static fields, I think. Macro
> > pointer follows this approach, routines - does not.
>
> Well, I just think that:
>   rte_mbuf_set_time_stamp(m, 1234);
> is more readable than:
>   *RTE_MBUF_TIMESTAMP(m) = 1234;

I implemented these metadata set/get helpers in v3, as you proposed.
But the mlx5 PMD does not use them (possibly I'll refactor some occurrences).
BTW, I did not find any rte_mbuf_set_xxxx() implemented. Did I miss something?
Should we start with the metadata field specifically? 😊
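
For comparison, the two access styles discussed here (pointer-style macro
versus set/get routines) can be put side by side in a small stand-alone
model. MODEL_DYNF_METADATA(), model_meta_set() and model_meta_get() are
illustrative names only and are not the signatures from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_mbuf with a uint32_t-aligned dynamic area. */
struct model_mbuf {
	uint64_t ol_flags;
	uint32_t dyn_area[4];
};

/* Plays the role of the registered field offset. */
static int model_meta_offs = (int)offsetof(struct model_mbuf, dyn_area);

/* Pointer-style accessor in the spirit of RTE_FLOW_DYNF_METADATA(). */
#define MODEL_DYNF_METADATA(m) \
	((uint32_t *)((uintptr_t)(m) + (uintptr_t)model_meta_offs))

/* Routine-style accessors built on top of the same pointer. */
static inline void
model_meta_set(struct model_mbuf *m, uint32_t v)
{
	*MODEL_DYNF_METADATA(m) = v;
}

static inline uint32_t
model_meta_get(struct model_mbuf *m)
{
	return *MODEL_DYNF_METADATA(m);
}
```

Both styles touch the same storage, so a PMD can keep using the raw pointer
(or the offset directly) while applications use the routines.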

>
> Anyway, in your case, if you need to use vector instructions in the PMD, I
> guess you will directly use the offset.

Right.

>
> > > > Also, metadata is not feature of mbuf. It should have rte_flow prefix.
> > >
> > > Yes, sure. The example derives from a test I've done, and I forgot
> > > to change it.
> > >
> > >
> > > > > /**
> > > > >  * Get metadata dynamic field value in mbuf.
> > > > >  *
> > > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > > >  */
> > > > > __rte_experimental
> > > > > > static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
> > > > > >                                        uint32_t *metadata)
> > > > > > {
> > > > >        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
> > > > >                return -1;
> > > > What if metadata is 0xFFFFFFFF ?
> > > > The checking of availability might embrace larger code block, so
> > > > this might be not the best place to check availability.
> > > >
> > > > > >        *metadata = *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> > > > > >                                uint32_t *);
> > > > >        return 0;
> > > > > }
> > > > >
> > > > > /**
> > > > >  * Delete the metadata dynamic flag in mbuf.
> > > > >  *
> > > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > > >  */
> > > > > __rte_experimental
> > > > > > static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m)
> > > > > > {
> > > > > >        m->ol_flags &= ~rte_flow_dynf_metadata_mask;
> > > > > > }
> > > > >
> > > > Sorry, I do not see the practical usecase for these helpers. In my
> > > > opinion it
> > > is just some kind of obscuration.
> > > > They do replace the very simple code and introduce some risk of
> > > performance impact.
> > > >
> > > > >
> > > > > >  /*
> > > > > >   * Definition of a single action.
> > > > > >   *
> > > > > > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {  };
> > > > > >
> > > > > >  /**
> > > > > > + * Check if mbuf dynamic field for metadata is registered.
> > > > > > + *
> > > > > > + * @return
> > > > > > + *   True if registered, false otherwise.
> > > > > > + */
> > > > > > +__rte_experimental
> > > > > > +static inline int
> > > > > > > +rte_flow_dynf_metadata_avail(void)
> > > > > > > +{
> > > > > > > +   return !!rte_flow_dynf_metadata_mask;
> > > > > > > +}
> > > > >
> > > > > _registered() instead of _avail() ?
> > > > Accepted, sounds better.
> >
> > Hmm, I changed my opinion - we already have
> > rte_flow_dynf_metadata_register(void). Is it OK to have
> > rte_flow_dynf_metadata_registerED(void) ?
> > It would be easy to mistype.
>
> what about xxx_is_registered() ?
It does not seem much better, sorry ☹
> if you feel it's too long, ok, let's keep avail()
Actually, I lean toward "_available", but it is really long.

> >
> > > >
> > > > >
> > > > > > +
> > > > > > +/**
> > > > > > + * Register mbuf dynamic field and flag for metadata.
> > > > > > + *
> > > > > > + * This function must be called prior to use SET_META action
> > > > > > +in order to
> > > > > > + * register the dynamic mbuf field. Otherwise, the data
> > > > > > +cannot be delivered to
> > > > > > + * application.
> > > > > > + *
> > > > > > + * @return
> > > > > > + *   0 on success, a negative errno value otherwise and rte_errno is
> > > set.
> > > > > > + */
> > > > > > +__rte_experimental
> > > > > > +int
> > > > > > +rte_flow_dynf_metadata_register(void);
> > > > > > +
> > > > > > +/**
> > > > > >   * Check whether a flow rule can be created on a given port.
> > > > > >   *
> > > > > >   * The flow rule is validated for correctness and whether it
> > > > > > could be accepted diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > > b/lib/librte_mbuf/rte_mbuf_dyn.h index 6e2c816..4ff33ac 100644
> > > > > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char
> > > *name,
> > > > > >   */
> > > > > >  #define RTE_MBUF_DYNFIELD(m, offset, type)
> > > > > > ((type)((uintptr_t)(m)
> > > > > > +
> > > > > > (offset)))
> > > > > >
> > > > > > +/**
> > > > > > + * Flow metadata dynamic field definitions.
> > > > > > + */
> > > > > > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > > > > > > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t)
> > > > > > > +#define MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t)
> > > > > > > +#define MBUF_DYNF_METADATA_FLAGS 0
> > > > >
> > > > > If this flag is only to be used in rte_flow, it can stay in rte_flow.
> > > > > The name should follow the function name conventions, I suggest
> > > > > "rte_flow_metadata".
> > > >
> > > > The definitions:
> > > > MBUF_DYNF_METADATA_NAME,
> > > > MBUF_DYNF_METADATA_SIZE,
> > > > MBUF_DYNF_METADATA_ALIGN
> > > > are global. rte_flow proposes only minimal set tyo check and
> > > > access the metadata. By knowing the field names applications would
> > > > have the more flexibility in processing the fields, for example it
> > > > allows to optimize the handling of multiple dynamic fields . The
> > > > definition of metadata size allows to generate optimized code:
> > > > #if MBUF_DYNF_METADATA_SIZE == sizeof(uint32)
> > > > >         *RTE_MBUF_DYNFIELD(m) = get_metadata_32bit()
> > > > > #else
> > > > >         *RTE_MBUF_DYNFIELD(m) = get_metadata_64bit()
> > > > > #endif
> > >
> > > I don't see any reason why the same dynamic field could have
> > > different sizes, I even think it could be dangerous. Your accessors
> > > suppose that the metadata is a uint32_t. Having a compile-time
> > > option for that does not look desirable.
> >
> > I tried to provide maximal flexibility and It was just an example of
> > the thing we could do with global definitions. If you think we do not
> > need it - OK, let's do things simpler.
> >
> > >
> > > Just a side note: we have to take care when adding a new *public*
> > > dynamic field that it won't change in the future: the accessors are
> > > macros or static inline functions, so they are embedded in the binaries.
> > > This is probably something we should discuss and may not be when
> > > updating the dpdk (as shared lib).
> >
> > Yes, agree, defines just will not work correct in correct way and even break
> an ABI.
> > As we decided - global metadata defines MBUF_DYNF_METADATA_xxxx
> should
> > be removed.
> >
> > >
> > > > MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow, this flag
> > > > is related exclusively to dynamic mbuf  " Reserved for future use, must
> be 0".
> > > > Would you like to drop this definition?
> > > >
> > > > >
> > > > > If the flag is going to be used in several places in dpdk
> > > > > (rte_flow, pmd, app, ...), I wonder if it shouldn't be defined
> > > > > it in rte_mbuf_dyn.c. I
> > > mean:
> > > > >
> > > > > ====
> > > > > /* rte_mbuf_dyn.c */
> > > > > const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
> > > > >    ...
> > > > > };
> > > > In this case we would make this descriptor global.
> > > > It is no needed, because there Is no supposed any usage, but by
> > > > rte_flow_dynf_metadata_register() only. The
> > >
> > > Yes, in my example I wasn't sure it was going to be private to
> > > rte_flow (see "If the flag is going to be used in several places in
> > > dpdk (rte_flow, pmd, app, ...)").
> > >
> > > So yes, I agree the struct should remain private.
> > OK.
> >
> > >
> > >
> > > > > int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> > > > > rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
> > > > >    ...
> > > > > };
> > > > > int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> > > > >
> > > > > int rte_mbuf_dyn_flow_metadata_register(void)
> > > > > {
> > > > > ...
> > > > > }
> > > > >
> > > > > /* rte_mbuf_dyn.h */
> > > > > extern const struct rte_mbuf_dynfield
> > > > > rte_mbuf_dynfield_flow_metadata; extern int
> > > > > rte_mbuf_dynfield_flow_metadata_offset;
> > > > > extern const struct rte_mbuf_dynflag
> > > > > rte_mbuf_dynflag_flow_metadata; extern int
> > > > > rte_mbuf_dynflag_flow_metadata_bitnum;
> > > > >
> > > > > ...helpers to set/get metadata...
> > > > > ===
> > > > >
> > > > > Centralizing the definitions of non-private dynamic fields/flags
> > > > > in rte_mbuf_dyn may help other people to reuse a field that is
> > > > > well described if it match their use-case.
> > > >
> > > > Yes, centralizing is important, that's why MBUF_DYNF_METADATA_xxx
> > > > placed in rte_mbuf_dyn.h. Do you think we should share the
> > > > descriptors
> > > either?
> > > > I have no idea why someone (but rte_flow_dynf_metadata_register())
> > > > might register metadata field directly.
> > >
> > > If the field is private to rte_flow, yes, there is no need to share
> > > the "struct rte_mbuf_dynfield". Even the
> > > rte_flow_dynf_metadata_register() could be marked as internal, right?
> > rte_flow_dynf_metadata_register() is intended to be called by application.
> > Some applications might wish to engage metadata feature, some ones -
> not.
> >
> > >
> > > One more question: I see the registration is done by
> > > parse_vc_action_set_meta(). My understanding is that this function
> > > is not in datapath, and is called when configuring rte_flow. Do you
> confirm?
> > Rather it is called to configure application in general. If user sets
> > metadata (by issuing the appropriate command) it is assumed he/she
> > would like the metadata on Rx side either. This is just for test
> > purposes and it is not brilliant example of
> rte_flow_dynf_metadata_register() use case.
> >
> >
> > >
> > > > > In your case, what is carried by metadata? Could it be reused by
> > > > > others? I think some more description is needed.
> > > > In my case, metadata is just opaquie rte_flow related 32-bit
> > > > unsigned value provided by
> > > > mlx5 hardrware in rx datapath. I have no guess whether someone
> > > > wishes
> > > to reuse.
> > >
> > > What is the user supposed to do with this value? If it is
> > > hw-specific data, I think the name of the mbuf field should include
> > > "MLX", and it should be described.
> >
> > Metadata are not HW specific at all - they neither control nor are
> > produced by HW (abstracting from the flow engine is implemented in HW).
> > Metadata are some opaque data, it is some kind of link between data
> > path and flow space.  With metadata application may provide some per
> > packet information to flow engine and get back some information from
> flow engine.
> > it is generic concept, supposed to be neither HW-related nor vendor
> specific.
>
> ok, understood, it's like a mark or tag.
>
> > > Are these rte_flow actions somehow specific to mellanox drivers ?
> >
> > AFAIK, currently it is going to be supported by mlx5 PMD only, but
> > concept is common and is not vendor specific.
> >
> > >
> > > > Brief summary of you comment (just to make sure I understood your
> > > proposal in correct way):
> > > > > 1. drop all definitions MBUF_DYNF_METADATA_xxx, leave
> > > > >    MBUF_DYNF_METADATA_NAME only
> > > > > 2. move the descriptor const struct rte_mbuf_dynfield desc_offs = {}
> > > > >    to rte_mbuf_dyn.c and make it global
> > > > > 3. provide helpers to access metadata
> > > >
> > > > [1] and [2] look OK in general. Although I think these ones make
> > > > code less
> > > flexible, restrict the potential compile time options.
> > > > For now it is rather theoretical question, if you insist on your
> > > > approach - please, let me know, I'll address [1] and [2] and
> > > > update.my
> > > patch.
> > >
> > > [1] I think the #define only adds an indirection, and I didn't see any
> > >     perf constraint here.
> > > [2] My previous comment was surely not clear, sorry. The code can stay
> > >     in rte_flow.
> > >
> > > > As for [3] - IMHO, the extra abstraction layer is not useful, and
> > > > might be
> > > even harmful.
> > > > I tend not to complicate the code, at least, for now.
> > >
> > > [3] ok for me
> > >
> > >
> > > Thanks,
> > > Olivier
> >
 With best regards, Slava




* [dpdk-dev] [PATCH v3] ethdev: extend flow metadata
  2019-10-10 16:02   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
  2019-10-18  9:22     ` Olivier Matz
@ 2019-10-24 13:08     ` Viacheslav Ovsiienko
  2019-10-27 16:56       ` Ori Kam
  2019-10-27 18:40       ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
  1 sibling, 2 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-24 13:08 UTC (permalink / raw)
  To: dev; +Cc: thomas, olivier.matz, matan, rasland, Yongseok Koh

Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the usability of the metadata feature.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There is also a need to support metadata on ingress. Metadata can be set by
the SET_META action and matched by the META item, as on Tx. The final value
set by the action will be delivered to the application via the metadata
dynamic field of mbuf, which can be accessed by RTE_FLOW_DYNF_METADATA().
The PKT_RX_DYNF_METADATA flag will be set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() before using the SET_META action.

The availability of the dynamic mbuf metadata field can be checked
with the rte_flow_dynf_metadata_avail() routine.

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.
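
The set-in-one-rule, match-in-another flow from item 1 can be modelled in a
few lines. This is a sketch under stated assumptions, not driver code: the
model_* structures are hypothetical, values are kept in host byte order
(the real fields are rte_be32_t), and the update rule "only bits selected
by the mask are altered" is this sketch's reading of how the SET_META mask
works.

```c
#include <stdint.h>

/* Hypothetical per-packet state inside the flow engine. */
struct model_pkt {
	uint32_t meta;
};

/* Stand-in for struct rte_flow_action_set_meta. */
struct model_set_meta {
	uint32_t data;
	uint32_t mask;
};

/* Stand-in for a META item given as spec data plus mask. */
struct model_item_meta {
	uint32_t data;
	uint32_t mask;
};

/* First table: apply SET_META, touching only the bits under mask. */
static void
model_apply_set_meta(struct model_pkt *p, const struct model_set_meta *a)
{
	p->meta = (p->meta & ~a->mask) | (a->data & a->mask);
}

/* Second table: META item matches when the masked bits are equal. */
static int
model_match_meta(const struct model_pkt *p, const struct model_item_meta *it)
{
	return (p->meta & it->mask) == (it->data & it->mask);
}
```

Bits outside the action mask keep whatever value they had before, which is
why never-set bits are described as unpredictable.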

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 v3: - removed MBUF_DYNF_METADATA_xxx definitions, only
       MBUF_DYNF_METADATA_NAME remains in rte_mbuf_dyn.h's
       centralizing point
     - added rte_flow_dynf_metadata_set/get helpers (Olivier)
     - updated rte_ethdev_version.map
     - name follows dynamic field name conventions
     - rebased
 
 v2: - http://patches.dpdk.org/patch/60908/
     - rebased
     - relies on dynamic mbuf field feature

 v1: http://patches.dpdk.org/patch/56103/

 rfc: http://patches.dpdk.org/patch/54270/

 app/test-pmd/cmdline_flow.c              | 57 +++++++++++++++++-
 app/test-pmd/util.c                      |  5 ++
 doc/guides/prog_guide/rte_flow.rst       | 72 ++++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   | 15 +++++
 lib/librte_ethdev/rte_ethdev.h           |  1 -
 lib/librte_ethdev/rte_ethdev_version.map |  3 +
 lib/librte_ethdev/rte_flow.c             | 41 +++++++++++++
 lib/librte_ethdev/rte_flow.h             | 99 +++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf_dyn.h           |  8 ++-
 9 files changed, 279 insertions(+), 22 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index f48f4eb..bc89bf9 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -308,6 +308,9 @@ enum index {
 	ACTION_DEC_TCP_ACK_VALUE,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -1005,6 +1008,7 @@ struct parse_action_priv {
 	ACTION_DEC_TCP_ACK,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1191,6 +1195,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -1249,6 +1260,10 @@ static int parse_vc_action_raw_encap(struct context *,
 static int parse_vc_action_raw_decap(struct context *,
 				     const struct token *, const char *,
 				     unsigned int, void *, unsigned int);
+static int parse_vc_action_set_meta(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
+				    unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -3255,7 +3270,31 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 		.help = "set raw decap data",
 		.next = NEXT(next_item),
 		.call = parse_set_raw_encap_decap,
-	}
+	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc_action_set_meta,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -4625,6 +4664,22 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 	return ret;
 }
 
+static int
+parse_vc_action_set_meta(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	ret = rte_flow_dynf_metadata_register();
+	if (ret < 0)
+		return -1;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 1570270..39ff07b 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -81,6 +81,11 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
+		if (ol_flags & PKT_TX_METADATA)
+			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_RX_DYNF_METADATA)
+			printf(" - Rx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (mb->packet_type) {
 			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
 			printf(" - hw ptype: %s", buf);
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 6e6d44d..2b49baa 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches a 32-bit metadata value.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -1232,21 +1258,6 @@ Matches a PPPoE session protocol identifier.
 - ``proto_id``: PPP protocol identifier.
 - Default ``mask`` matches proto_id only.
 
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+--------------------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Item: ``NSH``
 ^^^^^^^^^^^^^^^^^^^
 
@@ -2466,6 +2477,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
 
 Using this action on non-matching traffic will result in undefined behavior.
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by
+``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
+``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
+with the data.
+
+The mbuf dynamic field must be registered by calling
+``rte_flow_dynf_metadata_register()`` prior to using the ``SET_META`` action.
+
+Altering partial bits is supported with ``mask``. Bits which have never been
+set hold an unpredictable value that depends on the driver implementation. For
+loopback/hairpin packets, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 206d287..2c51426 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -193,6 +193,21 @@ New Features
     gives ability to print port supported ptypes in different protocol layers.
 
 
+* **Added support for dynamic fields and flags in mbuf.**
+
+  This new feature adds the ability to dynamically register some room
+  for a field or a flag in the mbuf structure. This is typically used
+  for specific offload features, where adding a static field or flag
+  in the mbuf is not justified.
+
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
+    PKT_RX_DYNF_METADATA.
+
 Removed Items
 -------------
 
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 33c528b..6ad5e1b 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1048,7 +1048,6 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
 #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
 #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
-
 #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
 				 DEV_RX_OFFLOAD_UDP_CKSUM | \
 				 DEV_RX_OFFLOAD_TCP_CKSUM)
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index e59d516..a5bf643 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -288,4 +288,7 @@ EXPERIMENTAL {
 	rte_eth_rx_burst_mode_get;
 	rte_eth_tx_burst_mode_get;
 	rte_eth_burst_mode_option_name;
+	rte_flow_dynf_metadata_offs;
+	rte_flow_dynf_metadata_mask;
+	rte_flow_dynf_metadata_register;
 };
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index ca0f680..6090177 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -12,10 +12,18 @@
 #include <rte_errno.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include "rte_ethdev.h"
 #include "rte_flow_driver.h"
 #include "rte_flow.h"
 
+/* Mbuf dynamic field offset for metadata. */
+int rte_flow_dynf_metadata_offs = -1;
+
+/* Mbuf dynamic field flag mask for metadata. */
+uint64_t rte_flow_dynf_metadata_mask;
+
 /**
  * Flow elements description tables.
  */
@@ -157,8 +165,41 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
+	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
 };
 
+int
+rte_flow_dynf_metadata_register(void)
+{
+	int offset;
+	int flag;
+
+	static const struct rte_mbuf_dynfield desc_offs = {
+		.name = MBUF_DYNF_METADATA_NAME,
+		.size = sizeof(uint32_t),
+		.align = __alignof__(uint32_t),
+		.flags = 0,
+	};
+	static const struct rte_mbuf_dynflag desc_flag = {
+		.name = MBUF_DYNF_METADATA_NAME,
+	};
+
+	offset = rte_mbuf_dynfield_register(&desc_offs);
+	if (offset < 0)
+		goto error;
+	flag = rte_mbuf_dynflag_register(&desc_flag);
+	if (flag < 0)
+		goto error;
+	rte_flow_dynf_metadata_offs = offset;
+	rte_flow_dynf_metadata_mask = (1ULL << flag);
+	return 0;
+
+error:
+	rte_flow_dynf_metadata_offs = -1;
+	rte_flow_dynf_metadata_mask = 0ULL;
+	return -rte_errno;
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 4fee105..b821557 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -28,6 +28,8 @@
 #include <rte_byteorder.h>
 #include <rte_esp.h>
 #include <rte_higig.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -418,7 +420,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1263,9 +1266,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * Matches a specified metadata value.
+ * RTE_FLOW_ITEM_TYPE_META
+ *
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
 	rte_be32_t data;
@@ -1942,6 +1953,13 @@ enum rte_flow_action_type {
 	 * undefined behavior.
 	 */
 	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * ingress, the metadata will be carried by mbuf metadata dynamic field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which have never
+ * been set, unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
+ * or may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	rte_be32_t data;
+	rte_be32_t mask;
+};
+
+/* Mbuf dynamic field offset for metadata. */
+extern int rte_flow_dynf_metadata_offs;
+
+/* Mbuf dynamic field flag mask for metadata. */
+extern uint64_t rte_flow_dynf_metadata_mask;
+
+/* Mbuf dynamic field pointer for metadata. */
+#define RTE_FLOW_DYNF_METADATA(m) \
+	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
+
+/* Mbuf dynamic flag for metadata. */
+#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+
+__rte_experimental
+static inline uint32_t
+rte_flow_dynf_metadata_get(struct rte_mbuf *m) {
+	return *RTE_FLOW_DYNF_METADATA(m);
+}
+
+__rte_experimental
+static inline void
+rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v) {
+	*RTE_FLOW_DYNF_METADATA(m) = v;
+}
+
 /*
  * Definition of a single action.
  *
@@ -2662,6 +2729,32 @@ enum rte_flow_conv_op {
 };
 
 /**
+ * Check if mbuf dynamic field for metadata is registered.
+ *
+ * @return
+ *   True if registered, false otherwise.
+ */
+__rte_experimental
+static inline int
+rte_flow_dynf_metadata_avail(void) {
+	return !!rte_flow_dynf_metadata_mask;
+}
+
+/**
+ * Register mbuf dynamic field and flag for metadata.
+ *
+ * This function must be called prior to using the SET_META action in order to
+ * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
+ * the application.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_dynf_metadata_register(void);
+
+/**
  * Check whether a flow rule can be created on a given port.
  *
  * The flow rule is validated for correctness and whether it could be accepted
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 2e9d418..a4a0cf5 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -234,6 +234,10 @@ int rte_mbuf_dynflag_lookup(const char *name,
 __rte_experimental
 void rte_mbuf_dyn_dump(FILE *out);
 
-/* Placeholder for dynamic fields and flags declarations. */
-
+/*
+ * Placeholder for dynamic fields and flags declarations.
+ * This is centralizing point to gather all field names
+ * and parameters together.
+ */
+#define MBUF_DYNF_METADATA_NAME "rte_flow_dynfield_metadata"
 #endif
-- 
1.8.3.1
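
For readers following the Rx side of the patch above, here is a minimal
sketch of the "register first, then check the flag before reading" pattern
it implies. It does not use DPDK itself: struct mock_mbuf and every mock_*
name are hypothetical stand-ins for rte_mbuf, rte_flow_dynf_metadata_offs,
rte_flow_dynf_metadata_mask, rte_flow_dynf_metadata_register() and
rte_flow_dynf_metadata_avail(), and the offset and flag bit chosen below are
arbitrary assumptions, not values the library would return.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_mbuf: flags plus dynamic-field room. */
struct mock_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[16];
};

/* Mirrors of the two globals from the patch; -1 and 0 mean "not registered". */
static int mock_dynf_metadata_offs = -1;
static uint64_t mock_dynf_metadata_mask;

/* Stand-in for the register call; offset 4 and bit 23 are made-up values. */
static int
mock_dynf_metadata_register(void)
{
	mock_dynf_metadata_offs = 4;
	mock_dynf_metadata_mask = 1ULL << 23;
	return 0;
}

/* Same test as rte_flow_dynf_metadata_avail(): non-zero mask means registered. */
static int
mock_dynf_metadata_avail(void)
{
	return !!mock_dynf_metadata_mask;
}

/* What a driver would do on Rx: store the value and raise the flag. */
static void
mock_set_metadata(struct mock_mbuf *m, uint32_t v)
{
	memcpy(m->dyn_area + mock_dynf_metadata_offs, &v, sizeof(v));
	m->ol_flags |= mock_dynf_metadata_mask;
}

/* Application side: returns 1 and fills *meta only if the data is valid. */
static int
mock_rx_metadata(const struct mock_mbuf *m, uint32_t *meta)
{
	if (!mock_dynf_metadata_avail())
		return 0; /* field never registered */
	if (!(m->ol_flags & mock_dynf_metadata_mask))
		return 0; /* no metadata attached to this packet */
	memcpy(meta, m->dyn_area + mock_dynf_metadata_offs, sizeof(*meta));
	return 1;
}
```

In real code the two checks would presumably map to
rte_flow_dynf_metadata_avail() and a test of PKT_RX_DYNF_METADATA in
ol_flags, followed by rte_flow_dynf_metadata_get().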


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v3] ethdev: add flow tag
  2019-10-10 16:09     ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
@ 2019-10-24 13:12       ` Viacheslav Ovsiienko
  2019-10-27 16:38         ` Ori Kam
  2019-10-27 18:42         ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
  0 siblings, 2 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-24 13:12 UTC (permalink / raw)
  To: dev; +Cc: thomas, arybchenko, matan, rasland, Yongseok Koh

A tag is transient data which can be used during flow matching. It can be
used to store the match result from a previous table so that the same pattern
need not be matched again in the next table. Even if the outer header is
decapsulated on the previous match, the match result can be kept.

Some devices expose internal registers of their flow processing pipeline, and
those registers are quite useful for stateful connection tracking as they
keep the status of flow matching. Multiple tags are supported by specifying
an index.

Example testpmd commands are:

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xcc00 mask 0xff00 /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
            eth ... / end
    actions ... jump group 2 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
            tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end

  flow create 0 ingress group 2
    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end
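
The set/match behaviour in the commands above can be modelled in plain C.
This is only a sketch under assumptions: the sketch_* structs mirror
rte_flow_action_set_tag and rte_flow_item_tag from this patch,
SKETCH_NUM_TAGS is a made-up register count, and the masked-write rule
(only bits selected by mask are altered) is assumed by analogy with
SET_META; the patch itself only says the mask applies to data.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_TAGS 4 /* hypothetical; real devices differ */

/* Local mirrors of the structures added by the patch. */
struct sketch_action_set_tag {
	uint32_t data;
	uint32_t mask;
	uint8_t index;
};

struct sketch_item_tag {
	uint32_t data;
	uint8_t index;
};

/* SET_TAG: overwrite only the bits selected by mask in tags[index]. */
static int
sketch_set_tag(uint32_t tags[SKETCH_NUM_TAGS],
	       const struct sketch_action_set_tag *a)
{
	if (a->index >= SKETCH_NUM_TAGS)
		return -1;
	tags[a->index] = (tags[a->index] & ~a->mask) | (a->data & a->mask);
	return 0;
}

/* TAG item: compare tags[spec->index] with spec->data under mask->data. */
static int
sketch_match_tag(const uint32_t tags[SKETCH_NUM_TAGS],
		 const struct sketch_item_tag *spec,
		 const struct sketch_item_tag *mask)
{
	if (spec->index >= SKETCH_NUM_TAGS)
		return 0;
	return ((tags[spec->index] ^ spec->data) & mask->data) == 0;
}
```

With all tags initially zero, setting index 2 to 0xaa00bb under mask
0xffff00ff and then matching index 2 against 0xaa00bb under the same mask
succeeds in this model, while matching against 0xcc00 under mask 0xff00
does not.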

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
v3: rebased, neat updates
v2: http://patches.dpdk.org/patch/60909/
v1: http://patches.dpdk.org/patch/56104/
rfc: http://patches.dpdk.org/patch/54271/

 app/test-pmd/cmdline_flow.c            | 75 ++++++++++++++++++++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst     | 50 +++++++++++++++++++++++
 doc/guides/rel_notes/release_19_11.rst |  5 +++
 lib/librte_ethdev/rte_flow.c           |  2 +
 lib/librte_ethdev/rte_flow.h           | 61 +++++++++++++++++++++++++++
 5 files changed, 193 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index bc89bf9..35852bd 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -206,6 +206,9 @@ enum index {
 	ITEM_HIGIG2,
 	ITEM_HIGIG2_CLASSIFICATION,
 	ITEM_HIGIG2_VID,
+	ITEM_TAG,
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -311,6 +314,10 @@ enum index {
 	ACTION_SET_META,
 	ACTION_SET_META_DATA,
 	ACTION_SET_META_MASK,
+	ACTION_SET_TAG,
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -682,6 +689,7 @@ struct parse_action_priv {
 	ITEM_PPPOED,
 	ITEM_PPPOE_PROTO_ID,
 	ITEM_HIGIG2,
+	ITEM_TAG,
 	END_SET,
 	ZERO,
 };
@@ -953,6 +961,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index item_tag[] = {
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -1009,6 +1024,7 @@ struct parse_action_priv {
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
 	ACTION_SET_META,
+	ACTION_SET_TAG,
 	ZERO,
 };
 
@@ -1202,6 +1218,14 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_tag[] = {
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -2467,6 +2491,26 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_higig2_hdr,
 					hdr.ppt1.vid)),
 	},
+	[ITEM_TAG] = {
+		.name = "tag",
+		.help = "match tag value",
+		.priv = PRIV_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
+		.next = NEXT(item_tag),
+		.call = parse_vc,
+	},
+	[ITEM_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tag, data)),
+	},
+	[ITEM_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_tag, index)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -3295,6 +3339,37 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 			     (struct rte_flow_action_set_meta, mask)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SET_TAG] = {
+		.name = "set_tag",
+		.help = "set tag",
+		.priv = PRIV_ACTION(SET_TAG,
+			sizeof(struct rte_flow_action_set_tag)),
+		.next = NEXT(action_set_tag),
+		.call = parse_vc,
+	},
+	[ACTION_SET_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_set_tag, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_MASK] = {
+		.name = "mask",
+		.help = "mask for tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 2b49baa..89a29b9 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -684,6 +684,34 @@ action sets metadata for a packet and the metadata will be reported via
    | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
    +----------+----------+---------------------------------------+
 
+Item: ``TAG``
+^^^^^^^^^^^^^
+
+Matches tag item set by other flows. Multiple tags are supported by specifying
+``index``.
+
+- Default ``mask`` matches the specified tag value and index.
+
+.. _table_rte_flow_item_tag:
+
+.. table:: TAG
+
+   +----------+-----------+---------------------------------------+
+   | Field    | Subfield  | Value                                 |
+   +==========+===========+=======================================+
+   | ``spec`` | ``data``  | 32 bit flow tag value                 |
+   |          +-----------+---------------------------------------+
+   |          | ``index`` | index of flow tag                     |
+   +----------+-----------+---------------------------------------+
+   | ``last`` | ``data``  | upper range value                     |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+   | ``mask`` | ``data``  | bit-mask applies to "spec" and "last" |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2508,6 +2536,28 @@ the other path depending on HW capability.
    | ``mask`` | bit-mask applies to "data" |
    +----------+----------------------------+
 
+Action: ``SET_TAG``
+^^^^^^^^^^^^^^^^^^^
+
+Set Tag.
+
+A tag is transient data used during flow matching. It is not delivered to the
+application. Multiple tags are supported by specifying ``index``.
+
+.. _table_rte_flow_action_set_tag:
+
+.. table:: SET_TAG
+
+   +-----------+----------------------------+
+   | Field     | Value                      |
+   +===========+============================+
+   | ``data``  | 32 bit tag value           |
+   +-----------+----------------------------+
+   | ``mask``  | bit-mask applies to "data" |
+   +-----------+----------------------------+
+   | ``index`` | index of tag to set        |
+   +-----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 2c51426..610191b 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -208,6 +208,11 @@ New Features
   * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
     PKT_RX_DYNF_METADATA.
 
+* **Added flow tag in rte_flow.**
+
+  SET_TAG action and TAG item have been added to support transient flow
+  tag.
+
 Removed Items
 -------------
 
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 6090177..ec1d11d 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -82,6 +82,7 @@ struct rte_flow_desc_data {
 		     sizeof(struct rte_flow_item_icmp6_nd_opt_tla_eth)),
 	MK_FLOW_ITEM(MARK, sizeof(struct rte_flow_item_mark)),
 	MK_FLOW_ITEM(META, sizeof(struct rte_flow_item_meta)),
+	MK_FLOW_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
 	MK_FLOW_ITEM(GRE_KEY, sizeof(rte_be32_t)),
 	MK_FLOW_ITEM(GTP_PSC, sizeof(struct rte_flow_item_gtp_psc)),
 	MK_FLOW_ITEM(PPPOES, sizeof(struct rte_flow_item_pppoe)),
@@ -166,6 +167,7 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
+	MK_FLOW_ACTION(SET_TAG, sizeof(struct rte_flow_action_set_tag)),
 };
 
 int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index b821557..4d56954 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -501,6 +501,15 @@ enum rte_flow_item_type {
 	 * see struct rte_flow_item_higig2_hdr.
 	 */
 	RTE_FLOW_ITEM_TYPE_HIGIG2,
+
+	/**
+	 * [META]
+	 *
+	 * Matches a tag value.
+	 *
+	 * See struct rte_flow_item_tag.
+	 */
+	RTE_FLOW_ITEM_TYPE_TAG,
 };
 
 /**
@@ -1350,6 +1359,27 @@ struct rte_flow_item_pppoe_proto_id {
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
  *
+ * RTE_FLOW_ITEM_TYPE_TAG
+ *
+ * Matches a specified tag value at the specified index.
+ */
+struct rte_flow_item_tag {
+	uint32_t data;
+	uint8_t index;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_TAG. */
+#ifndef __cplusplus
+static const struct rte_flow_item_tag rte_flow_item_tag_mask = {
+	.data = 0xffffffff,
+	.index = 0xff,
+};
+#endif
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
  * RTE_FLOW_ITEM_TYPE_MARK
  *
  * Matches an arbitrary integer value which was set using the ``MARK`` action
@@ -1368,6 +1398,13 @@ struct rte_flow_item_mark {
 	uint32_t id; /**< Integer value to match against. */
 };
 
+/** Default mask for RTE_FLOW_ITEM_TYPE_MARK. */
+#ifndef __cplusplus
+static const struct rte_flow_item_mark rte_flow_item_mark_mask = {
+	.id = 0xffffffff,
+};
+#endif
+
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
@@ -1960,6 +1997,15 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_meta.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_META,
+
+	/**
+	 * Set Tag.
+	 *
+	 * Tag is not delivered to application.
+	 *
+	 * See struct rte_flow_action_set_tag.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_TAG,
 };
 
 /**
@@ -2496,6 +2542,21 @@ struct rte_flow_action_set_meta {
 	*RTE_FLOW_DYNF_METADATA(m) = v;
 }
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_TAG
+ *
+ * Set a tag which is a transient data used during flow matching. This is not
+ * delivered to application. Multiple tags are supported by specifying index.
+ */
+struct rte_flow_action_set_tag {
+	uint32_t data;
+	uint32_t mask;
+	uint8_t index;
+};
+
 /*
  * Definition of a single action.
  *
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v3] ethdev: add flow tag
  2019-10-24 13:12       ` [dpdk-dev] [PATCH v3] " Viacheslav Ovsiienko
@ 2019-10-27 16:38         ` Ori Kam
  2019-10-27 18:42         ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
  1 sibling, 0 replies; 98+ messages in thread
From: Ori Kam @ 2019-10-27 16:38 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Thomas Monjalon, arybchenko, Matan Azrad, Raslan Darawsheh, Yongseok Koh



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Thursday, October 24, 2019 4:12 PM
> To: dev@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; arybchenko@solarflare.com;
> Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> Subject: [dpdk-dev] [PATCH v3] ethdev: add flow tag
> 
> A tag is a transient data which can be used during flow match. This can be
> used to store match result from a previous table so that the same pattern
> need not be matched again on the next table. Even if outer header is
> decapsulated on the previous match, the match result can be kept.
> 
> Some device expose internal registers of its flow processing pipeline and
> those registers are quite useful for stateful connection tracking as it
> keeps status of flow matching. Multiple tags are supported by specifying
> index.
> 
> Example testpmd commands are:
> 
>   flow create 0 ingress pattern ... / end
>     actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>             set_tag index 3 value 0x123456 mask 0xffffff /
>             vxlan_decap / jump group 1 / end
> 
>   flow create 0 ingress pattern ... / end
>     actions set_tag index 2 value 0xcc00 mask 0xff00 /
>             set_tag index 3 value 0x123456 mask 0xffffff /
>             vxlan_decap / jump group 1 / end
> 
>   flow create 0 ingress group 1
>     pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>             eth ... / end
>     actions ... jump group 2 / end
> 
>   flow create 0 ingress group 1
>     pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>             tag index is 3 value spec 0x123456 value mask 0xffffff /
>             eth ... / end
>     actions ... / end
> 
>   flow create 0 ingress group 2
>     pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>             eth ... / end
>     actions ... / end
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> v3: rebased, neat updates
> v2: http://patches.dpdk.org/patch/60909/
> v1: http://patches.dpdk.org/patch/56104/
> rfc: http://patches.dpdk.org/patch/54271/
> 
>  app/test-pmd/cmdline_flow.c            | 75 ++++++++++++++++++++++++++++++++++
>  doc/guides/prog_guide/rte_flow.rst     | 50 +++++++++++++++++++++++
>  doc/guides/rel_notes/release_19_11.rst |  5 +++
>  lib/librte_ethdev/rte_flow.c           |  2 +
>  lib/librte_ethdev/rte_flow.h           | 61 +++++++++++++++++++++++++++
>  5 files changed, 193 insertions(+)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index bc89bf9..35852bd 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -206,6 +206,9 @@ enum index {
>  	ITEM_HIGIG2,
>  	ITEM_HIGIG2_CLASSIFICATION,
>  	ITEM_HIGIG2_VID,
> +	ITEM_TAG,
> +	ITEM_TAG_DATA,
> +	ITEM_TAG_INDEX,
> 
>  	/* Validate/create actions. */
>  	ACTIONS,
> @@ -311,6 +314,10 @@ enum index {
>  	ACTION_SET_META,
>  	ACTION_SET_META_DATA,
>  	ACTION_SET_META_MASK,
> +	ACTION_SET_TAG,
> +	ACTION_SET_TAG_INDEX,
> +	ACTION_SET_TAG_DATA,
> +	ACTION_SET_TAG_MASK,
>  };
> 
Could you please add them in alphabetical order (data, index, mask)? What do you think?


>  /** Maximum size for pattern in struct rte_flow_item_raw. */
> @@ -682,6 +689,7 @@ struct parse_action_priv {
>  	ITEM_PPPOED,
>  	ITEM_PPPOE_PROTO_ID,
>  	ITEM_HIGIG2,
> +	ITEM_TAG,
>  	END_SET,
>  	ZERO,
>  };
> @@ -953,6 +961,13 @@ struct parse_action_priv {
>  	ZERO,
>  };
> 
> +static const enum index item_tag[] = {
> +	ITEM_TAG_DATA,
> +	ITEM_TAG_INDEX,
> +	ITEM_NEXT,
> +	ZERO,
> +};
> +
>  static const enum index next_action[] = {
>  	ACTION_END,
>  	ACTION_VOID,
> @@ -1009,6 +1024,7 @@ struct parse_action_priv {
>  	ACTION_RAW_ENCAP,
>  	ACTION_RAW_DECAP,
>  	ACTION_SET_META,
> +	ACTION_SET_TAG,
>  	ZERO,
>  };
> 
> @@ -1202,6 +1218,14 @@ struct parse_action_priv {
>  	ZERO,
>  };
> 
> +static const enum index action_set_tag[] = {
> +	ACTION_SET_TAG_INDEX,
> +	ACTION_SET_TAG_DATA,
> +	ACTION_SET_TAG_MASK,
> +	ACTION_NEXT,
> +	ZERO,
> +};
> +

Again, maybe order these to match the defines.

>  static int parse_set_raw_encap_decap(struct context *, const struct token *,
>  				     const char *, unsigned int,
>  				     void *, unsigned int);
> @@ -2467,6 +2491,26 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
>  		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_higig2_hdr,
>  					hdr.ppt1.vid)),
>  	},
> +	[ITEM_TAG] = {
> +		.name = "tag",
> +		.help = "match tag value",
> +		.priv = PRIV_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
> +		.next = NEXT(item_tag),
> +		.call = parse_vc,
> +	},
> +	[ITEM_TAG_DATA] = {
> +		.name = "data",
> +		.help = "tag value to match",
> +		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED), item_param),
> +		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tag, data)),
> +	},
> +	[ITEM_TAG_INDEX] = {
> +		.name = "index",
> +		.help = "index of tag array to match",
> +		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED),
> +			     NEXT_ENTRY(ITEM_PARAM_IS)),
> +		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_tag, index)),
> +	},

I think you are missing mask.

>  	/* Validate/create actions. */
>  	[ACTIONS] = {
>  		.name = "actions",
> @@ -3295,6 +3339,37 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
>  			     (struct rte_flow_action_set_meta, mask)),
>  		.call = parse_vc_conf,
>  	},
> +	[ACTION_SET_TAG] = {
> +		.name = "set_tag",
> +		.help = "set tag",
> +		.priv = PRIV_ACTION(SET_TAG,
> +			sizeof(struct rte_flow_action_set_tag)),
> +		.next = NEXT(action_set_tag),
> +		.call = parse_vc,
> +	},
> +	[ACTION_SET_TAG_INDEX] = {
> +		.name = "index",
> +		.help = "index of tag array",
> +		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_set_tag, index)),
> +		.call = parse_vc_conf,
> +	},
> +	[ACTION_SET_TAG_DATA] = {
> +		.name = "data",
> +		.help = "tag value",
> +		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_tag, data)),
> +		.call = parse_vc_conf,
> +	},
> +	[ACTION_SET_TAG_MASK] = {
> +		.name = "mask",
> +		.help = "mask for tag value",
> +		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_tag, mask)),
> +		.call = parse_vc_conf,
> +	},
>  };
> 
>  /** Remove and return last entry from argument stack. */
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 2b49baa..89a29b9 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -684,6 +684,34 @@ action sets metadata for a packet and the metadata will be reported via
>     | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
>     +----------+----------+---------------------------------------+
> 
> +Item: ``TAG``
> +^^^^^^^^^^^^^
> +
> +Matches tag item set by other flows. Multiple tags are supported by specifying
> +``index``.
> +
> +- Default ``mask`` matches the specified tag value and index.
> +
> +.. _table_rte_flow_item_tag:
> +
> +.. table:: TAG
> +
> +   +----------+----------+----------------------------------------+
> +   | Field    | Subfield  | Value                                 |
> +   +==========+===========+=======================================+
> +   | ``spec`` | ``data``  | 32 bit flow tag value                 |
> +   |          +-----------+---------------------------------------+
> +   |          | ``index`` | index of flow tag                     |
> +   +----------+-----------+---------------------------------------+
> +   | ``last`` | ``data``  | upper range value                     |
> +   |          +-----------+                                       |
> +   |          | ``index`` |                                       |
> +   +----------+-----------+---------------------------------------+

I don't think ``last`` is relevant for this. Maybe it should be documented as ignored.

> +   | ``mask`` | ``data``  | bit-mask applies to "spec" and "last" |
> +   |          +-----------+                                       |
> +   |          | ``index`` |                                       |

The index should be marked as ignored here.

> +   +----------+-----------+---------------------------------------+
> +
>  Data matching item types
>  ~~~~~~~~~~~~~~~~~~~~~~~~
> 
> @@ -2508,6 +2536,28 @@ the other path depending on HW capability.
>     | ``mask`` | bit-mask applies to "data" |
>     +----------+----------------------------+
> 
> +Action: ``SET_TAG``
> +^^^^^^^^^^^^^^^^^^^
> +
> +Set Tag.
> +
> +Tag is a transient data used during flow matching. This is not delivered to
> +application. Multiple tags are supported by specifying index.
> +
> +.. _table_rte_flow_action_set_tag:
> +
> +.. table:: SET_TAG
> +
> +   +-----------+----------------------------+
> +   | Field     | Value                      |
> +   +===========+============================+
> +   | ``data``  | 32 bit tag value           |
> +   +-----------+----------------------------+
> +   | ``mask``  | bit-mask applies to "data" |
> +   +-----------+----------------------------+
> +   | ``index`` | index of tag to set        |
> +   +-----------+----------------------------+
> +
>  Negative types
>  ~~~~~~~~~~~~~~
> 
> diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
> index 2c51426..610191b 100644
> --- a/doc/guides/rel_notes/release_19_11.rst
> +++ b/doc/guides/rel_notes/release_19_11.rst
> @@ -208,6 +208,11 @@ New Features
>    * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
>      PKT_RX_DYNF_METADATA.
> 
> +* **Added flow tag in rte_flow.**
> +  SET_TAG action and TAG item have been added to support transient flow
> +  tag.
> +
> +
>  Removed Items
>  -------------
> 
> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index 6090177..ec1d11d 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -82,6 +82,7 @@ struct rte_flow_desc_data {
>  		     sizeof(struct rte_flow_item_icmp6_nd_opt_tla_eth)),
>  	MK_FLOW_ITEM(MARK, sizeof(struct rte_flow_item_mark)),
>  	MK_FLOW_ITEM(META, sizeof(struct rte_flow_item_meta)),
> +	MK_FLOW_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
>  	MK_FLOW_ITEM(GRE_KEY, sizeof(rte_be32_t)),
>  	MK_FLOW_ITEM(GTP_PSC, sizeof(struct rte_flow_item_gtp_psc)),
>  	MK_FLOW_ITEM(PPPOES, sizeof(struct rte_flow_item_pppoe)),
> @@ -166,6 +167,7 @@ struct rte_flow_desc_data {
>  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
> +	MK_FLOW_ACTION(SET_TAG, sizeof(struct rte_flow_action_set_tag)),
>  };
> 
>  int
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index b821557..4d56954 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -501,6 +501,15 @@ enum rte_flow_item_type {
>  	 * see struct rte_flow_item_higig2_hdr.
>  	 */
>  	RTE_FLOW_ITEM_TYPE_HIGIG2,
> +
> +	/*
> +	 * [META]
> +	 *


Please remove the [META]

> +	 * Matches a tag value.
> +	 *
> +	 * See struct rte_flow_item_tag.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_TAG,
>  };
> 
>  /**
> @@ -1350,6 +1359,27 @@ struct rte_flow_item_pppoe_proto_id {
>   * @warning
>   * @b EXPERIMENTAL: this structure may change without prior notice
>   *
> + * RTE_FLOW_ITEM_TYPE_TAG
> + *
> + * Matches a specified tag value at the specified index.
> + */
> +struct rte_flow_item_tag {
> +	uint32_t data;
> +	uint8_t index;
> +};
> +
> +/** Default mask for RTE_FLOW_ITEM_TYPE_TAG. */
> +#ifndef __cplusplus
> +static const struct rte_flow_item_tag rte_flow_item_tag_mask = {
> +	.data = 0xffffffff,
> +	.index = 0xff,
> +};
> +#endif
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
>   * RTE_FLOW_ITEM_TYPE_MARK
>   *
>   * Matches an arbitrary integer value which was set using the ``MARK`` action
> @@ -1368,6 +1398,13 @@ struct rte_flow_item_mark {
>  	uint32_t id; /**< Integer value to match against. */
>  };
> 
> +/** Default mask for RTE_FLOW_ITEM_TYPE_MARK. */
> +#ifndef __cplusplus
> +static const struct rte_flow_item_mark rte_flow_item_mark_mask = {
> +	.id = 0xffffffff,
> +};
> +#endif
> +
>  /**
>   * @warning
>   * @b EXPERIMENTAL: this structure may change without prior notice
> @@ -1960,6 +1997,15 @@ enum rte_flow_action_type {
>  	 * See struct rte_flow_action_set_meta.
>  	 */
>  	RTE_FLOW_ACTION_TYPE_SET_META,
> +
> +	/**
> +	 * Set Tag.
> +	 *
> +	 * Tag is not delivered to application.
> +	 *

I think we should phrase this positively. Something like: "Tag is for internal flow use only and is not delivered to the application."
What do you think?

> +	 * See struct rte_flow_action_set_tag.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_SET_TAG,
>  };
> 
>  /**
> @@ -2496,6 +2542,21 @@ struct rte_flow_action_set_meta {
>  	*RTE_FLOW_DYNF_METADATA(m) = v;
>  }
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SET_TAG
> + *
> + * Set a tag which is a transient data used during flow matching. This is not
> + * delivered to application. Multiple tags are supported by specifying index.
> + */
> +struct rte_flow_action_set_tag {
> +	uint32_t data;
> +	uint32_t mask;
> +	uint8_t index;
> +};
> +
>  /*
>   * Definition of a single action.
>   *
> --
> 1.8.3.1
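
To make the index discussion above concrete, here is a rough standalone
model of the TAG/SET_TAG behaviour as I read it from the patch. It is only a
sketch: the names model_tags, model_set_tag() and model_tag_match() and the
register count are hypothetical, and the masked-update formula is an
assumption derived from "bit-mask applies to data", not taken from any PMD.
The index selects a tag register directly and is never masked, which is
what documenting the mask index as ignored would imply.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_TAG_COUNT 8 /* assumed number of tag registers */

/* Hypothetical per-packet tag storage, one 32-bit value per index. */
struct model_tags {
	uint32_t reg[MODEL_TAG_COUNT];
};

/* Model of SET_TAG: update only the bits of tag 'index' selected by 'mask'. */
static bool
model_set_tag(struct model_tags *t, uint32_t data, uint32_t mask,
	      uint8_t index)
{
	if (index >= MODEL_TAG_COUNT)
		return false;
	t->reg[index] = (t->reg[index] & ~mask) | (data & mask);
	return true;
}

/* Model of the TAG item: compare tag 'index' with 'spec' under 'mask'. */
static bool
model_tag_match(const struct model_tags *t, uint32_t spec, uint32_t mask,
		uint8_t index)
{
	if (index >= MODEL_TAG_COUNT)
		return false;
	return ((t->reg[index] ^ spec) & mask) == 0;
}
```

In this model a rule matching index 3 never sees a value written to
index 2, whatever the data mask is.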


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v3] ethdev: extend flow metadata
  2019-10-24 13:08     ` [dpdk-dev] [PATCH v3] " Viacheslav Ovsiienko
@ 2019-10-27 16:56       ` Ori Kam
  2019-10-27 18:40       ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
  1 sibling, 0 replies; 98+ messages in thread
From: Ori Kam @ 2019-10-27 16:56 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Thomas Monjalon, olivier.matz, Matan Azrad, Raslan Darawsheh,
	Yongseok Koh

Hi Slava,

Some small comments inline.
Please also add a deprecation notice for the META cap on TX.

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Subject: [dpdk-dev] [PATCH v3] ethdev: extend flow metadata
> 
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches
> metadata.
> 
> This patch extends the metadata feature usability.
> 
> 1) RTE_FLOW_ACTION_TYPE_SET_META
> 
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
> 
> 2) Metadata on ingress
> 
> There's also need to support metadata on ingress. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via metadata dynamic field of
> mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
> PKT_RX_DYNF_METADATA flag will be set along with the data.
> 
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to use SET_META action.
> 
> The availability of dynamic mbuf metadata field can be checked
> with rte_flow_dynf_metadata_avail() routine.
> 
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on hardware capability.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
>  v3: - removed MBUF_DYNF_METADATA_xxx definitions, only
>        MBUF_DYNF_METADATA_NAME remains in rte_mbuf_dyn.h's
>        centralizing point
>      - added rte_flow_dynf_metadata_set/get helpers (Olivier)
>      - updated rte_ethdev_version.map
>      - name follows dynamic field name conventions
>      - rebased
> 
>  v2: - http://patches.dpdk.org/patch/60908/
>      - rebased
>      - relies on dynamic mbuf field feature
> 
>  v1: http://patches.dpdk.org/patch/56103/
> 
>  rfc: http://patches.dpdk.org/patch/54270/
> 
>  app/test-pmd/cmdline_flow.c              | 57 +++++++++++++++++-
>  app/test-pmd/util.c                      |  5 ++
>  doc/guides/prog_guide/rte_flow.rst       | 72 ++++++++++++++++++-----
>  doc/guides/rel_notes/release_19_11.rst   | 15 +++++
>  lib/librte_ethdev/rte_ethdev.h           |  1 -
>  lib/librte_ethdev/rte_ethdev_version.map |  3 +
>  lib/librte_ethdev/rte_flow.c             | 41 +++++++++++++
>  lib/librte_ethdev/rte_flow.h             | 99 +++++++++++++++++++++++++++++++-
>  lib/librte_mbuf/rte_mbuf_dyn.h           |  8 ++-
>  9 files changed, 279 insertions(+), 22 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index f48f4eb..bc89bf9 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -308,6 +308,9 @@ enum index {
>  	ACTION_DEC_TCP_ACK_VALUE,
>  	ACTION_RAW_ENCAP,
>  	ACTION_RAW_DECAP,
> +	ACTION_SET_META,
> +	ACTION_SET_META_DATA,
> +	ACTION_SET_META_MASK,
>  };
> 
>  /** Maximum size for pattern in struct rte_flow_item_raw. */
> @@ -1005,6 +1008,7 @@ struct parse_action_priv {
>  	ACTION_DEC_TCP_ACK,
>  	ACTION_RAW_ENCAP,
>  	ACTION_RAW_DECAP,
> +	ACTION_SET_META,
>  	ZERO,
>  };
> 
> @@ -1191,6 +1195,13 @@ struct parse_action_priv {
>  	ZERO,
>  };
> 
> +static const enum index action_set_meta[] = {
> +	ACTION_SET_META_DATA,
> +	ACTION_SET_META_MASK,
> +	ACTION_NEXT,
> +	ZERO,
> +};
> +
>  static int parse_set_raw_encap_decap(struct context *, const struct token *,
>  				     const char *, unsigned int,
>  				     void *, unsigned int);
> @@ -1249,6 +1260,10 @@ static int parse_vc_action_raw_encap(struct context *,
>  static int parse_vc_action_raw_decap(struct context *,
>  				     const struct token *, const char *,
>  				     unsigned int, void *, unsigned int);
> +static int parse_vc_action_set_meta(struct context *ctx,
> +				    const struct token *token, const char *str,
> +				    unsigned int len, void *buf,
> +				    unsigned int size);
>  static int parse_destroy(struct context *, const struct token *,
>  			 const char *, unsigned int,
>  			 void *, unsigned int);
> @@ -3255,7 +3270,31 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
>  		.help = "set raw decap data",
>  		.next = NEXT(next_item),
>  		.call = parse_set_raw_encap_decap,
> -	}
> +	},
> +	[ACTION_SET_META] = {
> +		.name = "set_meta",
> +		.help = "set metadata",
> +		.priv = PRIV_ACTION(SET_META,
> +			sizeof(struct rte_flow_action_set_meta)),
> +		.next = NEXT(action_set_meta),
> +		.call = parse_vc_action_set_meta,
> +	},
> +	[ACTION_SET_META_DATA] = {
> +		.name = "data",
> +		.help = "metadata value",
> +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_meta, data)),
> +		.call = parse_vc_conf,
> +	},
> +	[ACTION_SET_META_MASK] = {
> +		.name = "mask",
> +		.help = "mask for metadata value",
> +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_meta, mask)),
> +		.call = parse_vc_conf,
> +	},
>  };
> 
>  /** Remove and return last entry from argument stack. */
> @@ -4625,6 +4664,22 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
>  	return ret;
>  }
> 
> +static int
> +parse_vc_action_set_meta(struct context *ctx, const struct token *token,
> +			 const char *str, unsigned int len, void *buf,
> +			 unsigned int size)
> +{
> +	int ret;
> +
> +	ret = parse_vc(ctx, token, str, len, buf, size);
> +	if (ret < 0)
> +		return ret;
> +	ret = rte_flow_dynf_metadata_register();
> +	if (ret < 0)
> +		return -1;
> +	return len;
> +}
> +
>  /** Parse tokens for destroy command. */
>  static int
>  parse_destroy(struct context *ctx, const struct token *token,
> diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
> index 1570270..39ff07b 100644
> --- a/app/test-pmd/util.c
> +++ b/app/test-pmd/util.c
> @@ -81,6 +81,11 @@
>  			       mb->vlan_tci, mb->vlan_tci_outer);
>  		else if (ol_flags & PKT_RX_VLAN)
>  			printf(" - VLAN tci=0x%x", mb->vlan_tci);
> +		if (ol_flags & PKT_TX_METADATA)
> +			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
> +		if (ol_flags & PKT_RX_DYNF_METADATA)
> +			printf(" - Rx metadata: 0x%x",
> +			       *RTE_FLOW_DYNF_METADATA(mb));
>  		if (mb->packet_type) {
>  			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
>  			printf(" - hw ptype: %s", buf);
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 6e6d44d..2b49baa 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
>     | ``mask`` | ``id``   | zeroed to match any value |
>     +----------+----------+---------------------------+
> 
> +Item: ``META``
> +^^^^^^^^^^^^^^^^^
> +
> +Matches 32 bit metadata item set.
> +
> +On egress, metadata can be set either by mbuf metadata field with
> +PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
> +action sets metadata for a packet and the metadata will be reported via
> +``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
> +
> +- Default ``mask`` matches the specified Rx metadata value.
> +
> +.. _table_rte_flow_item_meta:
> +
> +.. table:: META
> +
> +   +----------+----------+---------------------------------------+
> +   | Field    | Subfield | Value                                 |
> +   +==========+==========+=======================================+
> +   | ``spec`` | ``data`` | 32 bit metadata value                 |
> +   +----------+----------+---------------------------------------+
> +   | ``last`` | ``data`` | upper range value                     |

I don't think this field should be used.

> +   +----------+----------+---------------------------------------+
> +   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> +   +----------+----------+---------------------------------------+
> +
>  Data matching item types
>  ~~~~~~~~~~~~~~~~~~~~~~~~
> 
> @@ -1232,21 +1258,6 @@ Matches a PPPoE session protocol identifier.
>  - ``proto_id``: PPP protocol identifier.
>  - Default ``mask`` matches proto_id only.
> 
Why are these lines changed?

> -
> -.. _table_rte_flow_item_meta:
> -
> -.. table:: META
> -
> -   +----------+----------+---------------------------------------+
> -   | Field    | Subfield | Value                                 |
> -   +==========+==========+=======================================+
> -   | ``spec`` | ``data`` | 32 bit metadata value                 |
> -   +----------+--------------------------------------------------+
> -   | ``last`` | ``data`` | upper range value                     |
> -   +----------+----------+---------------------------------------+
> -   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> -   +----------+----------+---------------------------------------+
> -
>  Item: ``NSH``
>  ^^^^^^^^^^^^^^^^^^^
> 
> @@ -2466,6 +2477,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
> 
>  Using this action on non-matching traffic will result in undefined behavior.
> 
> +Action: ``SET_META``
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Set metadata. Item ``META`` matches metadata.
> +
> +Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
> +overridden by this action. On ingress, the metadata will be carried by
> +``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
> +``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
> +with the data.
> +
> +The mbuf dynamic field must be registered by calling
> +``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
> +
> +Altering partial bits is supported with ``mask``. For bits which have never been
> +set, unpredictable value will be seen depending on driver implementation. For
> +loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
> +the other path depending on HW capability.
> +
> +.. _table_rte_flow_action_set_meta:
> +
> +.. table:: SET_META
> +
> +   +----------+----------------------------+
> +   | Field    | Value                      |
> +   +==========+============================+
> +   | ``data`` | 32 bit metadata value      |
> +   +----------+----------------------------+
> +   | ``mask`` | bit-mask applies to "data" |
> +   +----------+----------------------------+
> +
>  Negative types
>  ~~~~~~~~~~~~~~
> 
> diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
> index 206d287..2c51426 100644
> --- a/doc/guides/rel_notes/release_19_11.rst
> +++ b/doc/guides/rel_notes/release_19_11.rst
> @@ -193,6 +193,21 @@ New Features
>      gives ability to print port supported ptypes in different protocol layers.
> 
> 
> +* **Add support of support dynamic fields and flags in mbuf.**
> +
> +  This new feature adds the ability to dynamically register some room
> +  for a field or a flag in the mbuf structure. This is typically used
> +  for specific offload features, where adding a static field or flag
> +  in the mbuf is not justified.
> +
> +* **Extended metadata support in rte_flow.**
> +
> +  Flow metadata is extended to both Rx and Tx.
> +
> +  * Tx metadata can also be set by SET_META action of rte_flow.
> +  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
> +    PKT_RX_DYNF_METADATA.
> +
>  Removed Items
>  -------------
> 
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index 33c528b..6ad5e1b 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1048,7 +1048,6 @@ struct rte_eth_conf {
>  #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
>  #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
>  #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
> -
>  #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
>  				 DEV_RX_OFFLOAD_UDP_CKSUM | \
>  				 DEV_RX_OFFLOAD_TCP_CKSUM)
> diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
> index e59d516..a5bf643 100644
> --- a/lib/librte_ethdev/rte_ethdev_version.map
> +++ b/lib/librte_ethdev/rte_ethdev_version.map
> @@ -288,4 +288,7 @@ EXPERIMENTAL {
>  	rte_eth_rx_burst_mode_get;
>  	rte_eth_tx_burst_mode_get;
>  	rte_eth_burst_mode_option_name;
> +	rte_flow_dynf_metadata_offs;
> +	rte_flow_dynf_metadata_mask;
> +	rte_flow_dynf_metadata_register;
>  };
> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index ca0f680..6090177 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -12,10 +12,18 @@
>  #include <rte_errno.h>
>  #include <rte_branch_prediction.h>
>  #include <rte_string_fns.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>  #include "rte_ethdev.h"
>  #include "rte_flow_driver.h"
>  #include "rte_flow.h"
> 
> +/* Mbuf dynamic field name for metadata. */
> +int rte_flow_dynf_metadata_offs = -1;
> +
> +/* Mbuf dynamic field flag bit number for metadata. */
> +uint64_t rte_flow_dynf_metadata_mask;
> +
>  /**
>   * Flow elements description tables.
>   */
> @@ -157,8 +165,41 @@ struct rte_flow_desc_data {
>  	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> +	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
>  };
> 
> +int
> +rte_flow_dynf_metadata_register(void)
> +{
> +	int offset;
> +	int flag;
> +
> +	static const struct rte_mbuf_dynfield desc_offs = {
> +		.name = MBUF_DYNF_METADATA_NAME,
> +		.size = sizeof(uint32_t),
> +		.align = __alignof__(uint32_t),
> +		.flags = 0,
> +	};
> +	static const struct rte_mbuf_dynflag desc_flag = {
> +		.name = MBUF_DYNF_METADATA_NAME,
> +	};
> +
> +	offset = rte_mbuf_dynfield_register(&desc_offs);
> +	if (offset < 0)
> +		goto error;
> +	flag = rte_mbuf_dynflag_register(&desc_flag);
> +	if (flag < 0)
> +		goto error;
> +	rte_flow_dynf_metadata_offs = offset;
> +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> +	return 0;
> +
> +error:
> +	rte_flow_dynf_metadata_offs = -1;
> +	rte_flow_dynf_metadata_mask = 0ULL;
> +	return -rte_errno;
> +}
> +
>  static int
>  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
>  {
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index 4fee105..b821557 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -28,6 +28,8 @@
>  #include <rte_byteorder.h>
>  #include <rte_esp.h>
>  #include <rte_higig.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
> 
>  #ifdef __cplusplus
>  extern "C" {
> @@ -418,7 +420,8 @@ enum rte_flow_item_type {
>  	/**
>  	 * [META]
>  	 *
> -	 * Matches a metadata value specified in mbuf metadata field.
> +	 * Matches a metadata value.
> +	 *
>  	 * See struct rte_flow_item_meta.
>  	 */
>  	RTE_FLOW_ITEM_TYPE_META,
> @@ -1263,9 +1266,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
>  #endif
> 
>  /**
> - * RTE_FLOW_ITEM_TYPE_META.
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
>   *
> - * Matches a specified metadata value.
> + * RTE_FLOW_ITEM_TYPE_META
> + *
> + * Matches a specified metadata value. On egress, metadata can be set either by
> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> + * metadata for a packet and the metadata will be reported via mbuf metadata
> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
>   */
>  struct rte_flow_item_meta {
>  	rte_be32_t data;
> @@ -1942,6 +1953,13 @@ enum rte_flow_action_type {
>  	 * undefined behavior.
>  	 */
>  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> +
> +	/**
> +	 * Set metadata on ingress or egress path.
> +	 *
> +	 * See struct rte_flow_action_set_meta.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_SET_META,
>  };
> 
>  /**
> @@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
>  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
>  };
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SET_META
> + *
> + * Set metadata. Metadata set by mbuf tx_metadata field with
> + * PKT_TX_METADATA flag on egress will be overridden by this action. On
> + * ingress, the metadata will be carried by mbuf metadata dynamic field
> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
> + *
> + * Altering partial bits is supported with mask. For bits which have never
> + * been set, unpredictable value will be seen depending on driver
> + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
> + * or may not be propagated to the other path depending on HW capability.
> + *
> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> + */
> +struct rte_flow_action_set_meta {
> +	rte_be32_t data;
> +	rte_be32_t mask;
> +};
> +
> +/* Mbuf dynamic field offset for metadata. */
> +extern int rte_flow_dynf_metadata_offs;
> +
> +/* Mbuf dynamic field flag mask for metadata. */
> +extern uint64_t rte_flow_dynf_metadata_mask;
> +
> +/* Mbuf dynamic field pointer for metadata. */
> +#define RTE_FLOW_DYNF_METADATA(m) \
> +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> +
> +/* Mbuf dynamic flag for metadata. */
> +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> +
> +__rte_experimental
> +static inline uint32_t
> +rte_flow_dynf_metadata_get(struct rte_mbuf *m) {
> +	return *RTE_FLOW_DYNF_METADATA(m);
> +}
> +
> +__rte_experimental
> +static inline void
> +rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v) {
> +	*RTE_FLOW_DYNF_METADATA(m) = v;
> +}
> +
>  /*
>   * Definition of a single action.
>   *
> @@ -2662,6 +2729,32 @@ enum rte_flow_conv_op {
>  };
> 
>  /**
> + * Check if mbuf dynamic field for metadata is registered.
> + *
> + * @return
> + *   True if registered, false otherwise.
> + */
> +__rte_experimental
> +static inline int
> +rte_flow_dynf_metadata_avail(void) {
> +	return !!rte_flow_dynf_metadata_mask;
> +}
> +
> +/**
> + * Register mbuf dynamic field and flag for metadata.
> + *
> + * This function must be called prior to use SET_META action in order to
> + * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
> + * application.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_dynf_metadata_register(void);
> +
> +/**
>   * Check whether a flow rule can be created on a given port.
>   *
>   * The flow rule is validated for correctness and whether it could be accepted
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 2e9d418..a4a0cf5 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -234,6 +234,10 @@ int rte_mbuf_dynflag_lookup(const char *name,
>  __rte_experimental
>  void rte_mbuf_dyn_dump(FILE *out);
> 
> -/* Placeholder for dynamic fields and flags declarations. */
> -
> +/*
> + * Placeholder for dynamic fields and flags declarations.
> + * This is centralizing point to gather all field names
> + * and parameters together.
> + */
> +#define MBUF_DYNF_METADATA_NAME "rte_flow_dynfield_metadata"
>  #endif
> --
> 1.8.3.1


Thanks,
Ori

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-24 13:08     ` [dpdk-dev] [PATCH v3] " Viacheslav Ovsiienko
  2019-10-27 16:56       ` Ori Kam
@ 2019-10-27 18:40       ` Viacheslav Ovsiienko
  2019-10-27 19:10         ` Ori Kam
                           ` (3 more replies)
  1 sibling, 4 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-27 18:40 UTC (permalink / raw)
  To: dev; +Cc: thomas, matan, olivier.matz, orika, Yongseok Koh

Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There's also a need to support metadata on ingress. Metadata can be set by
SET_META action and matched by META item like Tx. The final value set by
the action will be delivered to application via metadata dynamic field of
mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
PKT_RX_DYNF_METADATA flag will be set along with the data.
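
As an illustration of the two points above, the set-then-match behaviour can
be modelled in plain C without any DPDK dependency. This is a sketch under
assumptions, not driver code: model_set_meta() and model_meta_match() are
hypothetical names, the update formula is my reading of "bit-mask applies
to data", and the byte-order conversion of the rte_be32_t fields is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of SET_META: bits selected by 'mask' are replaced by
 * the corresponding bits of 'data', all other bits keep their old value.
 */
static inline uint32_t
model_set_meta(uint32_t current, uint32_t data, uint32_t mask)
{
	return (current & ~mask) | (data & mask);
}

/*
 * Hypothetical model of the META item: the metadata matches when it is
 * equal to 'spec' on every bit selected by 'mask'.
 */
static inline bool
model_meta_match(uint32_t metadata, uint32_t spec, uint32_t mask)
{
	return ((metadata ^ spec) & mask) == 0;
}
```

A rule in one table would apply model_set_meta() and a rule in a later
table would evaluate model_meta_match() on the result; a zero mask matches
any value.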

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to using the SET_META action.

The availability of dynamic mbuf metadata field can be checked
with rte_flow_dynf_metadata_avail() routine.
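
The register-then-check pattern can be sketched standalone as below. This
is a simplified model, not the library code: every model_* name and the
model_mbuf layout are hypothetical, and the offset and flag bit are passed
in by the caller here, whereas in the patch they come from
rte_mbuf_dynfield_register() and rte_mbuf_dynflag_register().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_DYNFIELD_SIZE 16 /* assumed size of the dynamic area */

/* Stand-ins for rte_flow_dynf_metadata_offs and rte_flow_dynf_metadata_mask. */
static int model_metadata_offs = -1;
static uint64_t model_metadata_mask;

/* Hypothetical mbuf: an offload flags word plus a dynamic-field area. */
struct model_mbuf {
	uint64_t ol_flags;
	uint8_t dynfield[MODEL_DYNFIELD_SIZE];
};

/* Record offset and flag on success, reset both on failure. */
static int
model_metadata_register(int offset, int flag_bit)
{
	if (offset < 0 || flag_bit < 0 || flag_bit > 63 ||
	    offset > (int)(MODEL_DYNFIELD_SIZE - sizeof(uint32_t))) {
		model_metadata_offs = -1;
		model_metadata_mask = 0;
		return -1;
	}
	model_metadata_offs = offset;
	model_metadata_mask = UINT64_C(1) << flag_bit;
	return 0;
}

/* Mirrors rte_flow_dynf_metadata_avail(): non-zero once registered. */
static int
model_metadata_avail(void)
{
	return !!model_metadata_mask;
}

/* Store the value and raise the flag, as described for the Rx path. */
static void
model_metadata_set(struct model_mbuf *m, uint32_t v)
{
	assert(model_metadata_avail());
	memcpy(&m->dynfield[model_metadata_offs], &v, sizeof(v));
	m->ol_flags |= model_metadata_mask;
}

/* Read the value back; only meaningful when the flag is set in ol_flags. */
static uint32_t
model_metadata_get(const struct model_mbuf *m)
{
	uint32_t v;

	assert(model_metadata_avail());
	memcpy(&v, &m->dynfield[model_metadata_offs], sizeof(v));
	return v;
}
```

The point of the model is the ordering: before a successful registration
the mask is zero, so the availability check fails and no packet can carry
the flag.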

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

---
v4: documentation comments addressed
    deprecation notice for Tx metadata offload flag
    rebased

v3: http://patches.dpdk.org/patch/61902/
    rebased, neat updates

v2: http://patches.dpdk.org/patch/60909/
v1: http://patches.dpdk.org/patch/56104/
rfc: http://patches.dpdk.org/patch/54271/

 app/test-pmd/cmdline_flow.c              | 57 +++++++++++++++++-
 app/test-pmd/util.c                      |  5 ++
 doc/guides/prog_guide/rte_flow.rst       | 72 ++++++++++++++++++-----
 doc/guides/rel_notes/deprecation.rst     |  4 ++
 doc/guides/rel_notes/release_19_11.rst   | 15 +++++
 lib/librte_ethdev/rte_ethdev.h           |  1 -
 lib/librte_ethdev/rte_ethdev_version.map |  3 +
 lib/librte_ethdev/rte_flow.c             | 41 +++++++++++++
 lib/librte_ethdev/rte_flow.h             | 99 +++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf_dyn.h           |  8 ++-
 10 files changed, 283 insertions(+), 22 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 0d0bc0a..e4ef066 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -316,6 +316,9 @@ enum index {
 	ACTION_RAW_ENCAP_INDEX_VALUE,
 	ACTION_RAW_DECAP_INDEX,
 	ACTION_RAW_DECAP_INDEX_VALUE,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -1067,6 +1070,7 @@ struct parse_action_priv {
 	ACTION_DEC_TCP_ACK,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1265,6 +1269,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -1329,6 +1340,10 @@ static int parse_vc_action_raw_encap_index(struct context *,
 static int parse_vc_action_raw_decap_index(struct context *,
 					   const struct token *, const char *,
 					   unsigned int, void *, unsigned int);
+static int parse_vc_action_set_meta(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
+				    unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -3378,7 +3393,31 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.help = "index of raw_encap/raw_decap data",
 		.next = NEXT(next_item),
 		.call = parse_port,
-	}
+	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc_action_set_meta,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -4818,6 +4857,22 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return ret;
 }
 
+static int
+parse_vc_action_set_meta(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	ret = rte_flow_dynf_metadata_register();
+	if (ret < 0)
+		return -1;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index f20531d..56075b3 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,6 +82,11 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
+		if (ol_flags & PKT_TX_METADATA)
+			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_RX_DYNF_METADATA)
+			printf(" - Rx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (mb->packet_type) {
 			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
 			printf(" - hw ptype: %s", buf);
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 159ce19..c943aca 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -1232,21 +1258,6 @@ Matches a PPPoE session protocol identifier.
 - ``proto_id``: PPP protocol identifier.
 - Default ``mask`` matches proto_id only.
 
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+--------------------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Item: ``NSH``
 ^^^^^^^^^^^^^
 
@@ -2466,6 +2477,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
 
 Using this action on non-matching traffic will result in undefined behavior.
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by
+``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
+``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
+with the data.
+
+The mbuf dynamic field must be registered by calling
+``rte_flow_dynf_metadata_register()`` before using the ``SET_META`` action.
+
+Altering partial bits is supported with ``mask``. Bits which have never been
+set hold an unpredictable value that depends on the driver implementation. For
+loopback/hairpin packets, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 3aa1634..9d54d8e 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -106,6 +106,10 @@ Deprecation Notices
   struct ``rte_eth_dev_info`` for the port capability and in struct
   ``rte_eth_rxmode`` for the port configuration.
 
+* ethdev: DEV_TX_OFFLOAD_MATCH_METADATA will be removed and the static metadata
+  mbuf field will be removed in 20.02. The metadata feature will use a dynamic
+  mbuf field and flag instead.
+
 * cryptodev: support for using IV with all sizes is added, J0 still can
   be used but only when IV length in following structs ``rte_crypto_auth_xform``,
   ``rte_crypto_aead_xform`` is set to zero. When IV length is greater or equal
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 0e5bb5d..6d331f6 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -232,6 +232,21 @@ New Features
     gives ability to print port supported ptypes in different protocol layers.
 
 
+* **Added support for dynamic fields and flags in mbuf.**
+
+  This new feature adds the ability to dynamically register some room
+  for a field or a flag in the mbuf structure. This is typically used
+  for specific offload features, where adding a static field or flag
+  in the mbuf is not justified.
+
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
+    PKT_RX_DYNF_METADATA.
+
 Removed Items
 -------------
 
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index c36c1b6..b19c86b 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1048,7 +1048,6 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
 #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
 #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
-
 #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
 				 DEV_RX_OFFLOAD_UDP_CKSUM | \
 				 DEV_RX_OFFLOAD_TCP_CKSUM)
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index e59d516..a5bf643 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -288,4 +288,7 @@ EXPERIMENTAL {
 	rte_eth_rx_burst_mode_get;
 	rte_eth_tx_burst_mode_get;
 	rte_eth_burst_mode_option_name;
+	rte_flow_dynf_metadata_offs;
+	rte_flow_dynf_metadata_mask;
+	rte_flow_dynf_metadata_register;
 };
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index ca0f680..6090177 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -12,10 +12,18 @@
 #include <rte_errno.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include "rte_ethdev.h"
 #include "rte_flow_driver.h"
 #include "rte_flow.h"
 
+/* Mbuf dynamic field offset for metadata. */
+int rte_flow_dynf_metadata_offs = -1;
+
+/* Mbuf dynamic field flag mask for metadata. */
+uint64_t rte_flow_dynf_metadata_mask;
+
 /**
  * Flow elements description tables.
  */
@@ -157,8 +165,41 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
+	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
 };
 
+int
+rte_flow_dynf_metadata_register(void)
+{
+	int offset;
+	int flag;
+
+	static const struct rte_mbuf_dynfield desc_offs = {
+		.name = MBUF_DYNF_METADATA_NAME,
+		.size = sizeof(uint32_t),
+		.align = __alignof__(uint32_t),
+		.flags = 0,
+	};
+	static const struct rte_mbuf_dynflag desc_flag = {
+		.name = MBUF_DYNF_METADATA_NAME,
+	};
+
+	offset = rte_mbuf_dynfield_register(&desc_offs);
+	if (offset < 0)
+		goto error;
+	flag = rte_mbuf_dynflag_register(&desc_flag);
+	if (flag < 0)
+		goto error;
+	rte_flow_dynf_metadata_offs = offset;
+	rte_flow_dynf_metadata_mask = (1ULL << flag);
+	return 0;
+
+error:
+	rte_flow_dynf_metadata_offs = -1;
+	rte_flow_dynf_metadata_mask = 0ULL;
+	return -rte_errno;
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 4fee105..b821557 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -28,6 +28,8 @@
 #include <rte_byteorder.h>
 #include <rte_esp.h>
 #include <rte_higig.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -418,7 +420,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1263,9 +1266,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * Matches a specified metadata value.
+ * RTE_FLOW_ITEM_TYPE_META
+ *
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
 	rte_be32_t data;
@@ -1942,6 +1953,13 @@ enum rte_flow_action_type {
 	 * undefined behavior.
 	 */
 	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * ingress, the metadata will be carried by mbuf metadata dynamic field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which have never
+ * been set, unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
+ * or may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	rte_be32_t data;
+	rte_be32_t mask;
+};
+
+/* Mbuf dynamic field offset for metadata. */
+extern int rte_flow_dynf_metadata_offs;
+
+/* Mbuf dynamic field flag mask for metadata. */
+extern uint64_t rte_flow_dynf_metadata_mask;
+
+/* Mbuf dynamic field pointer for metadata. */
+#define RTE_FLOW_DYNF_METADATA(m) \
+	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
+
+/* Mbuf dynamic flag for metadata. */
+#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+
+__rte_experimental
+static inline uint32_t
+rte_flow_dynf_metadata_get(struct rte_mbuf *m) {
+	return *RTE_FLOW_DYNF_METADATA(m);
+}
+
+__rte_experimental
+static inline void
+rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v) {
+	*RTE_FLOW_DYNF_METADATA(m) = v;
+}
+
 /*
  * Definition of a single action.
  *
@@ -2662,6 +2729,32 @@ enum rte_flow_conv_op {
 };
 
 /**
+ * Check if mbuf dynamic field for metadata is registered.
+ *
+ * @return
+ *   True if registered, false otherwise.
+ */
+__rte_experimental
+static inline int
+rte_flow_dynf_metadata_avail(void) {
+	return !!rte_flow_dynf_metadata_mask;
+}
+
+/**
+ * Register mbuf dynamic field and flag for metadata.
+ *
+ * This function must be called before using the SET_META action in order to
+ * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
+ * the application.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_dynf_metadata_register(void);
+
+/**
  * Check whether a flow rule can be created on a given port.
  *
  * The flow rule is validated for correctness and whether it could be accepted
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 2e9d418..a4a0cf5 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -234,6 +234,10 @@ int rte_mbuf_dynflag_lookup(const char *name,
 __rte_experimental
 void rte_mbuf_dyn_dump(FILE *out);
 
-/* Placeholder for dynamic fields and flags declarations. */
-
+/*
+ * Placeholder for dynamic fields and flags declarations.
+ * This is the central point where all field names
+ * and parameters are gathered together.
+ */
+#define MBUF_DYNF_METADATA_NAME "rte_flow_dynfield_metadata"
 #endif
-- 
1.8.3.1
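
For readers following the Rx path described in this patch, here is a rough,
self-contained sketch of how an application might read the metadata back.
It does not use the DPDK headers: struct model_mbuf, model_meta_register()
and model_meta_read() are hypothetical stand-ins for struct rte_mbuf,
rte_flow_dynf_metadata_register() and the flag check followed by
RTE_FLOW_DYNF_METADATA(), and the offset and flag bit are chosen by the
caller instead of being allocated by the mbuf library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct rte_mbuf; only what the sketch needs. */
struct model_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[16]; /* hypothetical room for dynamic fields */
};

/* Stand-ins for rte_flow_dynf_metadata_offs and rte_flow_dynf_metadata_mask. */
static int model_meta_offs = -1;
static uint64_t model_meta_mask;

/* Models a successful registration: record the field offset and flag mask. */
static int
model_meta_register(int offset, int flag_bit)
{
	assert(offset >= 0 && flag_bit >= 0 && flag_bit < 64);
	model_meta_offs = offset;
	model_meta_mask = 1ULL << flag_bit;
	return 0;
}

/* Mirrors rte_flow_dynf_metadata_avail(): registered iff the mask is set. */
static int
model_meta_avail(void)
{
	return !!model_meta_mask;
}

/* Returns 1 and stores the metadata in *out when the packet carries it. */
static int
model_meta_read(const struct model_mbuf *m, uint32_t *out)
{
	if (!model_meta_avail() || !(m->ol_flags & model_meta_mask))
		return 0;
	memcpy(out, (const uint8_t *)m + model_meta_offs, sizeof(*out));
	return 1;
}
```

The point of the sketch is the ordering: the field is only meaningful after
registration, and only for packets on which the metadata flag is set.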


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v4] ethdev: add flow tag
  2019-10-24 13:12       ` [dpdk-dev] [PATCH v3] " Viacheslav Ovsiienko
  2019-10-27 16:38         ` Ori Kam
@ 2019-10-27 18:42         ` Viacheslav Ovsiienko
  2019-10-27 19:11           ` Ori Kam
  1 sibling, 1 reply; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-27 18:42 UTC (permalink / raw)
  To: dev; +Cc: thomas, matan, orika, Yongseok Koh

A tag is transient data which can be used during flow matching. It can be
used to store the match result from a previous table so that the same pattern
need not be matched again in the next table. Even if the outer header is
decapsulated on the previous match, the match result can be kept.

Some devices expose internal registers of their flow processing pipeline, and
those registers are quite useful for stateful connection tracking as they
keep the status of flow matching. Multiple tags are supported by specifying
an index.

Example testpmd commands are:

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 data 0xaa00bb mask 0xffff00ff /
            set_tag index 3 data 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 data 0xcc00 mask 0xff00 /
            set_tag index 3 data 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress group 1
    pattern tag index is 2 data spec 0xaa00bb data mask 0xffff00ff /
            eth ... / end
    actions ... jump group 2 / end

  flow create 0 ingress group 1
    pattern tag index is 2 data spec 0xcc00 data mask 0xff00 /
            tag index is 3 data spec 0x123456 data mask 0xffffff /
            eth ... / end
    actions ... / end

  flow create 0 ingress group 2
    pattern tag index is 3 data spec 0x123456 data mask 0xffffff /
            eth ... / end
    actions ... / end

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

---
v4: rebased, doc comments are addressed 
v3: http://patches.dpdk.org/patch/61902/
v2: http://patches.dpdk.org/patch/60909/
v1: http://patches.dpdk.org/patch/56104/
rfc: http://patches.dpdk.org/patch/54271/
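
To illustrate how the masked tag values in the example rules interact, here
is a small stand-alone model. It is only a sketch under assumptions: struct
tag_regs, model_set_tag() and model_match_tag() are hypothetical names that
are not part of this patch, and the update rule (only bits selected by the
mask are altered) and the match rule (compare under mask) are the presumed
semantics rather than a statement about any particular driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a tag register file; not a DPDK structure. */
#define NB_TAGS 4

struct tag_regs {
	uint32_t val[NB_TAGS];
};

/* Assumed set_tag semantics: only bits selected by mask are altered. */
static void
model_set_tag(struct tag_regs *r, uint8_t index, uint32_t data, uint32_t mask)
{
	assert(index < NB_TAGS);
	r->val[index] = (r->val[index] & ~mask) | (data & mask);
}

/* Assumed tag item semantics: compare register and spec under mask. */
static int
model_match_tag(const struct tag_regs *r, uint8_t index, uint32_t spec,
		uint32_t mask)
{
	assert(index < NB_TAGS);
	return (r->val[index] & mask) == (spec & mask);
}
```

With the first rule above applied to a zeroed register file, tag 2 would hold
0xaa00bb and tag 3 would hold 0x123456, so in group 1 the 0xaa00bb pattern
would match while the 0xcc00 pattern would not.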

 app/test-pmd/cmdline_flow.c            | 75 ++++++++++++++++++++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst     | 50 +++++++++++++++++++++++
 doc/guides/rel_notes/release_19_11.rst |  5 +++
 lib/librte_ethdev/rte_flow.c           |  2 +
 lib/librte_ethdev/rte_flow.h           | 62 ++++++++++++++++++++++++++++
 5 files changed, 194 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index e4ef066..951f20a 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -210,6 +210,9 @@ enum index {
 	ITEM_HIGIG2,
 	ITEM_HIGIG2_CLASSIFICATION,
 	ITEM_HIGIG2_VID,
+	ITEM_TAG,
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -319,6 +322,10 @@ enum index {
 	ACTION_SET_META,
 	ACTION_SET_META_DATA,
 	ACTION_SET_META_MASK,
+	ACTION_SET_TAG,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -738,6 +745,7 @@ struct parse_action_priv {
 	ITEM_PPPOED,
 	ITEM_PPPOE_PROTO_ID,
 	ITEM_HIGIG2,
+	ITEM_TAG,
 	END_SET,
 	ZERO,
 };
@@ -1015,6 +1023,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index item_tag[] = {
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -1071,6 +1086,7 @@ struct parse_action_priv {
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
 	ACTION_SET_META,
+	ACTION_SET_TAG,
 	ZERO,
 };
 
@@ -1276,6 +1292,14 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_tag[] = {
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -2549,6 +2573,26 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_higig2_hdr,
 					hdr.ppt1.vid)),
 	},
+	[ITEM_TAG] = {
+		.name = "tag",
+		.help = "match tag value",
+		.priv = PRIV_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
+		.next = NEXT(item_tag),
+		.call = parse_vc,
+	},
+	[ITEM_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tag, data)),
+	},
+	[ITEM_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_tag, index)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -3418,6 +3462,37 @@ static int comp_set_raw_index(struct context *, const struct token *,
 			     (struct rte_flow_action_set_meta, mask)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SET_TAG] = {
+		.name = "set_tag",
+		.help = "set tag",
+		.priv = PRIV_ACTION(SET_TAG,
+			sizeof(struct rte_flow_action_set_tag)),
+		.next = NEXT(action_set_tag),
+		.call = parse_vc,
+	},
+	[ACTION_SET_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_set_tag, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_MASK] = {
+		.name = "mask",
+		.help = "mask for tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index c943aca..d5746d2 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -684,6 +684,34 @@ action sets metadata for a packet and the metadata will be reported via
    | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
    +----------+----------+---------------------------------------+
 
+Item: ``TAG``
+^^^^^^^^^^^^^
+
+Matches tag item set by other flows. Multiple tags are supported by specifying
+``index``.
+
+- Default ``mask`` matches the specified tag value and index.
+
+.. _table_rte_flow_item_tag:
+
+.. table:: TAG
+
+   +----------+-----------+---------------------------------------+
+   | Field    | Subfield  | Value                                 |
+   +==========+===========+=======================================+
+   | ``spec`` | ``data``  | 32 bit flow tag value                 |
+   |          +-----------+---------------------------------------+
+   |          | ``index`` | index of flow tag                     |
+   +----------+-----------+---------------------------------------+
+   | ``last`` | ``data``  | upper range value                     |
+   |          +-----------+---------------------------------------+
+   |          | ``index`` | field is ignored                      |
+   +----------+-----------+---------------------------------------+
+   | ``mask`` | ``data``  | bit-mask applies to "spec" and "last" |
+   |          +-----------+---------------------------------------+
+   |          | ``index`` | field is ignored                      |
+   +----------+-----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2508,6 +2536,28 @@ the other path depending on HW capability.
    | ``mask`` | bit-mask applies to "data" |
    +----------+----------------------------+
 
+Action: ``SET_TAG``
+^^^^^^^^^^^^^^^^^^^
+
+Set Tag.
+
+A tag is transient data used during flow matching. It is not delivered to
+the application. Multiple tags are supported by specifying ``index``.
+
+.. _table_rte_flow_action_set_tag:
+
+.. table:: SET_TAG
+
+   +-----------+----------------------------+
+   | Field     | Value                      |
+   +===========+============================+
+   | ``data``  | 32 bit tag value           |
+   +-----------+----------------------------+
+   | ``mask``  | bit-mask applies to "data" |
+   +-----------+----------------------------+
+   | ``index`` | index of tag to set        |
+   +-----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 6d331f6..69d3e3f 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -247,6 +247,11 @@ New Features
   * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
     PKT_RX_DYNF_METADATA.
 
+* **Added flow tag in rte_flow.**
+
+  SET_TAG action and TAG item have been added to support transient flow
+  tag.
+
 Removed Items
 -------------
 
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 6090177..ec1d11d 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -82,6 +82,7 @@ struct rte_flow_desc_data {
 		     sizeof(struct rte_flow_item_icmp6_nd_opt_tla_eth)),
 	MK_FLOW_ITEM(MARK, sizeof(struct rte_flow_item_mark)),
 	MK_FLOW_ITEM(META, sizeof(struct rte_flow_item_meta)),
+	MK_FLOW_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
 	MK_FLOW_ITEM(GRE_KEY, sizeof(rte_be32_t)),
 	MK_FLOW_ITEM(GTP_PSC, sizeof(struct rte_flow_item_gtp_psc)),
 	MK_FLOW_ITEM(PPPOES, sizeof(struct rte_flow_item_pppoe)),
@@ -166,6 +167,7 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
+	MK_FLOW_ACTION(SET_TAG, sizeof(struct rte_flow_action_set_tag)),
 };
 
 int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index b821557..d1ab982 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -501,6 +501,15 @@ enum rte_flow_item_type {
 	 * see struct rte_flow_item_higig2_hdr.
 	 */
 	RTE_FLOW_ITEM_TYPE_HIGIG2,
+
+	/**
+	 * [TAG]
+	 *
+	 * Matches a tag value.
+	 *
+	 * See struct rte_flow_item_tag.
+	 */
+	RTE_FLOW_ITEM_TYPE_TAG,
 };
 
 /**
@@ -1350,6 +1359,27 @@ struct rte_flow_item_pppoe_proto_id {
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
  *
+ * RTE_FLOW_ITEM_TYPE_TAG
+ *
+ * Matches a specified tag value at the specified index.
+ */
+struct rte_flow_item_tag {
+	uint32_t data;
+	uint8_t index;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_TAG. */
+#ifndef __cplusplus
+static const struct rte_flow_item_tag rte_flow_item_tag_mask = {
+	.data = 0xffffffff,
+	.index = 0xff,
+};
+#endif
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
  * RTE_FLOW_ITEM_TYPE_MARK
  *
  * Matches an arbitrary integer value which was set using the ``MARK`` action
@@ -1368,6 +1398,13 @@ struct rte_flow_item_mark {
 	uint32_t id; /**< Integer value to match against. */
 };
 
+/** Default mask for RTE_FLOW_ITEM_TYPE_MARK. */
+#ifndef __cplusplus
+static const struct rte_flow_item_mark rte_flow_item_mark_mask = {
+	.id = 0xffffffff,
+};
+#endif
+
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
@@ -1960,6 +1997,16 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_meta.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_META,
+
+	/**
+	 * Set Tag.
+	 *
+	 * Tag is for internal flow usage only and
+	 * is not delivered to the application.
+	 *
+	 * See struct rte_flow_action_set_tag.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_TAG,
 };
 
 /**
@@ -2496,6 +2543,21 @@ struct rte_flow_action_set_meta {
 	*RTE_FLOW_DYNF_METADATA(m) = v;
 }
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_TAG
+ *
+ * Set a tag, which is transient data used during flow matching. It is not
+ * delivered to the application. Multiple tags are supported via index.
+ */
+struct rte_flow_action_set_tag {
+	uint32_t data;
+	uint32_t mask;
+	uint8_t index;
+};
+
 /*
  * Definition of a single action.
  *
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-27 18:40       ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
@ 2019-10-27 19:10         ` Ori Kam
  2019-10-29 16:22         ` Andrew Rybchenko
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 98+ messages in thread
From: Ori Kam @ 2019-10-27 19:10 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Thomas Monjalon, Matan Azrad, olivier.matz, Yongseok Koh



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Subject: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> 
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches
> metadata.
> 
> This patch extends the metadata feature usability.
> 
> 1) RTE_FLOW_ACTION_TYPE_SET_META
> 
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
> 
> 2) Metadata on ingress
> 
> There's also need to support metadata on ingress. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via metadata dynamic field of
> mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
> PKT_RX_DYNF_METADATA flag will be set along with the data.
> 
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to use SET_META action.
> 
> The availability of dynamic mbuf metadata field can be checked
> with rte_flow_dynf_metadata_avail() routine.
> 
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on hardware capability.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> 
> ---
> v4: documentation comments addressed
>     deprecation notice for Tx metadata offload flag
>     rebased
> 
> v3: http://patches.dpdk.org/patch/61902/
>     rebased, neat updates
> 
> v2: http://patches.dpdk.org/patch/60909/
> v1: http://patches.dpdk.org/patch/56104/
> rfc: http://patches.dpdk.org/patch/54271/
> 
>  app/test-pmd/cmdline_flow.c              | 57 +++++++++++++++++-
>  app/test-pmd/util.c                      |  5 ++
>  doc/guides/prog_guide/rte_flow.rst       | 72 ++++++++++++++++++-----
>  doc/guides/rel_notes/deprecation.rst     |  4 ++
>  doc/guides/rel_notes/release_19_11.rst   | 15 +++++
>  lib/librte_ethdev/rte_ethdev.h           |  1 -
>  lib/librte_ethdev/rte_ethdev_version.map |  3 +
>  lib/librte_ethdev/rte_flow.c             | 41 +++++++++++++
>  lib/librte_ethdev/rte_flow.h             | 99 +++++++++++++++++++++++++++++++-

Acked-by: Ori Kam <orika@mellanox.com>

Thanks,
Ori

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: add flow tag
  2019-10-27 18:42         ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
@ 2019-10-27 19:11           ` Ori Kam
  2019-10-31 18:57             ` Ferruh Yigit
  0 siblings, 1 reply; 98+ messages in thread
From: Ori Kam @ 2019-10-27 19:11 UTC (permalink / raw)
  To: Slava Ovsiienko, dev; +Cc: Thomas Monjalon, Matan Azrad, Yongseok Koh



> -----Original Message-----
> From: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> Sent: Sunday, October 27, 2019 8:42 PM
> To: dev@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; Matan Azrad
> <matan@mellanox.com>; Ori Kam <orika@mellanox.com>; Yongseok Koh
> <yskoh@mellanox.com>
> Subject: [PATCH v4] ethdev: add flow tag
> 
> A tag is a transient data which can be used during flow match. This can be
> used to store match result from a previous table so that the same pattern
> need not be matched again on the next table. Even if outer header is
> decapsulated on the previous match, the match result can be kept.
> 
> Some device expose internal registers of its flow processing pipeline and
> those registers are quite useful for stateful connection tracking as it
> keeps status of flow matching. Multiple tags are supported by specifying
> index.
> 
> Example testpmd commands are:
> 
>   flow create 0 ingress pattern ... / end
>     actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>             set_tag index 3 value 0x123456 mask 0xffffff /
>             vxlan_decap / jump group 1 / end
> 
>   flow create 0 ingress pattern ... / end
>     actions set_tag index 2 value 0xcc00 mask 0xff00 /
>             set_tag index 3 value 0x123456 mask 0xffffff /
>             vxlan_decap / jump group 1 / end
> 
>   flow create 0 ingress group 1
>     pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>             eth ... / end
>     actions ... jump group 2 / end
> 
>   flow create 0 ingress group 1
>     pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>             tag index is 3 value spec 0x123456 value mask 0xffffff /
>             eth ... / end
>     actions ... / end
> 
>   flow create 0 ingress group 2
>     pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>             eth ... / end
>     actions ... / end
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> 
> ---
> v4: rebased, doc comments are addressed
> v3: http://patches.dpdk.org/patch/61902/
> v2: http://patches.dpdk.org/patch/60909/
> v1: http://patches.dpdk.org/patch/56104/
> rfc: http://patches.dpdk.org/patch/54271/


Acked-by: Ori Kam <orika@mellanox.com>

Thanks,
Ori


* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-27 18:40       ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
  2019-10-27 19:10         ` Ori Kam
@ 2019-10-29 16:22         ` Andrew Rybchenko
  2019-10-29 17:19           ` Slava Ovsiienko
  2019-10-29 16:25         ` Olivier Matz
  2019-10-29 19:31         ` [dpdk-dev] [PATCH v5] " Viacheslav Ovsiienko
  3 siblings, 1 reply; 98+ messages in thread
From: Andrew Rybchenko @ 2019-10-29 16:22 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, dev
  Cc: thomas, matan, olivier.matz, orika, Yongseok Koh

On 10/27/19 9:40 PM, Viacheslav Ovsiienko wrote:
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
>
> This patch extends the metadata feature usability.
>
> 1) RTE_FLOW_ACTION_TYPE_SET_META
>
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
>
> 2) Metadata on ingress
>
> There's also need to support metadata on ingress. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via metadata dynamic field of
> mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
> PKT_RX_DYNF_METADATA flag will be set along with the data.
>
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to use SET_META action.
>
> The availability of dynamic mbuf metadata field can be checked
> with rte_flow_dynf_metadata_avail() routine.
>
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on hardware capability.
>
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

The explanation above lacks information about "meta" vs "mark", which
may be set on Rx as well and is delivered in another mbuf field.
It should explain why one more field is required and define the rules
for using each. Otherwise we can end up with half of the PMDs supporting
mark only, half of the PMDs supporting meta only, and applications
in an interesting situation of having to choose which one to use.

[snip]

> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 159ce19..c943aca 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
>      | ``mask`` | ``id``   | zeroed to match any value |
>      +----------+----------+---------------------------+
>   
> +Item: ``META``
> +^^^^^^^^^^^^^^^^^
> +
> +Matches 32 bit metadata item set.
> +
> +On egress, metadata can be set either by mbuf metadata field with
> +PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
> +action sets metadata for a packet and the metadata will be reported via
> +``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
> +
> +- Default ``mask`` matches the specified Rx metadata value.
> +
> +.. _table_rte_flow_item_meta:
> +
> +.. table:: META
> +
> +   +----------+----------+---------------------------------------+
> +   | Field    | Subfield | Value                                 |
> +   +==========+==========+=======================================+
> +   | ``spec`` | ``data`` | 32 bit metadata value                 |
> +   +----------+----------+---------------------------------------+
> +   | ``last`` | ``data`` | upper range value                     |
> +   +----------+----------+---------------------------------------+
> +   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> +   +----------+----------+---------------------------------------+
> +
>   Data matching item types
>   ~~~~~~~~~~~~~~~~~~~~~~~~
>   
> @@ -1232,21 +1258,6 @@ Matches a PPPoE session protocol identifier.
>   - ``proto_id``: PPP protocol identifier.
>   - Default ``mask`` matches proto_id only.
>   
> -
> -.. _table_rte_flow_item_meta:
> -
> -.. table:: META
> -
> -   +----------+----------+---------------------------------------+
> -   | Field    | Subfield | Value                                 |
> -   +==========+==========+=======================================+
> -   | ``spec`` | ``data`` | 32 bit metadata value                 |
> -   +----------+--------------------------------------------------+
> -   | ``last`` | ``data`` | upper range value                     |
> -   +----------+----------+---------------------------------------+
> -   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> -   +----------+----------+---------------------------------------+
> -
>   Item: ``NSH``
>   ^^^^^^^^^^^^^
>   
> @@ -2466,6 +2477,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
>   
>   Using this action on non-matching traffic will result in undefined behavior.
>   
> +Action: ``SET_META``
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Set metadata. Item ``META`` matches metadata.
> +
> +Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
> +overridden by this action. On ingress, the metadata will be carried by
> +``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
> +``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
> +with the data.
> +
> +The mbuf dynamic field must be registered by calling
> +``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
> +
> +Altering partial bits is supported with ``mask``. For bits which have never been
> +set, unpredictable value will be seen depending on driver implementation. For
> +loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
> +the other path depending on HW capability.
> +
> +.. _table_rte_flow_action_set_meta:
> +
> +.. table:: SET_META
> +
> +   +----------+----------------------------+
> +   | Field    | Value                      |
> +   +==========+============================+
> +   | ``data`` | 32 bit metadata value      |
> +   +----------+----------------------------+
> +   | ``mask`` | bit-mask applies to "data" |
> +   +----------+----------------------------+
> +
>   Negative types
>   ~~~~~~~~~~~~~~
>   
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 3aa1634..9d54d8e 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -106,6 +106,10 @@ Deprecation Notices
>     struct ``rte_eth_dev_info`` for the port capability and in struct
>     ``rte_eth_rxmode`` for the port configuration.
>   
> +* ethdev: DEV_TX_OFFLOAD_MATCH_METADATA will be removed, static metadata
> +  mbuf field will be removed in 20.02, metadata feature will use dynamic mbuf
> +  field and flag instead.
> +

Isn't it breaking stable API/ABI? Should target release be 20.11?

I think that DEV_TX_OFFLOAD_MATCH_METADATA should be marked
as deprecated now, as well as the tx_metadata field in mbuf.

>   * cryptodev: support for using IV with all sizes is added, J0 still can
>     be used but only when IV length in following structs ``rte_crypto_auth_xform``,
>     ``rte_crypto_aead_xform`` is set to zero. When IV length is greater or equal
> diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
> index 0e5bb5d..6d331f6 100644
> --- a/doc/guides/rel_notes/release_19_11.rst
> +++ b/doc/guides/rel_notes/release_19_11.rst
> @@ -232,6 +232,21 @@ New Features
>       gives ability to print port supported ptypes in different protocol layers.
>   
>   
> +* **Add support of support dynamic fields and flags in mbuf.**
> +
> +  This new feature adds the ability to dynamically register some room
> +  for a field or a flag in the mbuf structure. This is typically used
> +  for specific offload features, where adding a static field or flag
> +  in the mbuf is not justified.
> +

I guess the above is just an incorrect merge.

> +* **Extended metadata support in rte_flow.**
> +
> +  Flow metadata is extended to both Rx and Tx.
> +
> +  * Tx metadata can also be set by SET_META action of rte_flow.
> +  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
> +    PKT_RX_DYNF_METADATA.
> +

Two empty lines are required before the next section.

>   Removed Items
>   -------------
>   
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index c36c1b6..b19c86b 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1048,7 +1048,6 @@ struct rte_eth_conf {
>   #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
>   #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
>   #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
> -

Unrelated change.

>   #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
>   				 DEV_RX_OFFLOAD_UDP_CKSUM | \
>   				 DEV_RX_OFFLOAD_TCP_CKSUM)

[snip]

> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index ca0f680..6090177 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -12,10 +12,18 @@
>   #include <rte_errno.h>
>   #include <rte_branch_prediction.h>
>   #include <rte_string_fns.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>   #include "rte_ethdev.h"
>   #include "rte_flow_driver.h"
>   #include "rte_flow.h"
>   
> +/* Mbuf dynamic field name for metadata. */
> +int rte_flow_dynf_metadata_offs = -1;
> +
> +/* Mbuf dynamic field flag bit number for metadata. */
> +uint64_t rte_flow_dynf_metadata_mask;
> +
>   /**
>    * Flow elements description tables.
>    */
> @@ -157,8 +165,41 @@ struct rte_flow_desc_data {
>   	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
>   	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
>   	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> +	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
>   };
>   
> +int
> +rte_flow_dynf_metadata_register(void)
> +{
> +	int offset;
> +	int flag;
> +
> +	static const struct rte_mbuf_dynfield desc_offs = {
> +		.name = MBUF_DYNF_METADATA_NAME,
> +		.size = sizeof(uint32_t),
> +		.align = __alignof__(uint32_t),
> +		.flags = 0,

I think flags initialization to 0 is redundant.

> +	};
> +	static const struct rte_mbuf_dynflag desc_flag = {
> +		.name = MBUF_DYNF_METADATA_NAME,
> +	};
> +
> +	offset = rte_mbuf_dynfield_register(&desc_offs);
> +	if (offset < 0)
> +		goto error;
> +	flag = rte_mbuf_dynflag_register(&desc_flag);
> +	if (flag < 0)
> +		goto error;
> +	rte_flow_dynf_metadata_offs = offset;
> +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> +	return 0;
> +
> +error:

Just an observation...
The impossibility of unregistering bites here. The field may be
registered but will not be used.
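
The partial-failure case can be sketched with a mock registry. Everything
prefixed mock_ below is hypothetical and is not the rte_mbuf_dyn API; the
table sizes are arbitrary. The sketch only mirrors the control flow of the
patch (field first, then flag) to show that a failed flag registration
leaves the field slot occupied while the globals are reset.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the dynamic field and flag registries. */
#define MOCK_MAX_FIELDS 4
#define MOCK_MAX_FLAGS 2

static const char *mock_fields[MOCK_MAX_FIELDS];
static const char *mock_flags[MOCK_MAX_FLAGS];

/* Return the slot of name, registering it if needed; -1 when the table is full. */
static int
mock_register(const char **table, int size, const char *name)
{
	int i;

	for (i = 0; i < size; i++) {
		if (table[i] != NULL && strcmp(table[i], name) == 0)
			return i;
	}
	for (i = 0; i < size; i++) {
		if (table[i] == NULL) {
			table[i] = name;
			return i;
		}
	}
	return -1;
}

/* Count occupied slots; there is deliberately no way to free one. */
static int
mock_used(const char **table, int size)
{
	int i, n = 0;

	for (i = 0; i < size; i++)
		n += (table[i] != NULL);
	return n;
}

static int metadata_offs = -1;
static uint64_t metadata_mask;

/* Same order as in the patch: register the field, then the flag. */
static int
metadata_register(void)
{
	int offset;
	int flag;

	offset = mock_register(mock_fields, MOCK_MAX_FIELDS, "metadata");
	if (offset < 0)
		goto error;
	flag = mock_register(mock_flags, MOCK_MAX_FLAGS, "metadata");
	if (flag < 0)
		goto error;
	metadata_offs = offset;
	metadata_mask = 1ULL << flag;
	return 0;

error:
	/* The field slot taken above stays occupied on this path. */
	metadata_offs = -1;
	metadata_mask = 0;
	return -1;
}
```

If the flag table is already full when metadata_register() runs, the call
fails, yet mock_used(mock_fields, MOCK_MAX_FIELDS) reports one slot in use.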

> +	rte_flow_dynf_metadata_offs = -1;
> +	rte_flow_dynf_metadata_mask = 0ULL;
> +	return -rte_errno;
> +}
> +
>   static int
>   flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
>   {
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index 4fee105..b821557 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -28,6 +28,8 @@
>   #include <rte_byteorder.h>
>   #include <rte_esp.h>
>   #include <rte_higig.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>   
>   #ifdef __cplusplus
>   extern "C" {
> @@ -418,7 +420,8 @@ enum rte_flow_item_type {
>   	/**
>   	 * [META]
>   	 *
> -	 * Matches a metadata value specified in mbuf metadata field.
> +	 * Matches a metadata value.
> +	 *
>   	 * See struct rte_flow_item_meta.
>   	 */
>   	RTE_FLOW_ITEM_TYPE_META,
> @@ -1263,9 +1266,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
>   #endif
>   
>   /**
> - * RTE_FLOW_ITEM_TYPE_META.
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice

Is it allowed to make it experimental again?

>    *
> - * Matches a specified metadata value.
> + * RTE_FLOW_ITEM_TYPE_META
> + *
> + * Matches a specified metadata value. On egress, metadata can be set either by
> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> + * metadata for a packet and the metadata will be reported via mbuf metadata
> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
>    */
>   struct rte_flow_item_meta {
>   	rte_be32_t data;

[snip]

> @@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
>   	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
>   };
>   
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SET_META
> + *
> + * Set metadata. Metadata set by mbuf tx_metadata field with
> + * PKT_TX_METADATA flag on egress will be overridden by this action. On
> + * ingress, the metadata will be carried by mbuf metadata dynamic field
> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
> + *
> + * Altering partial bits is supported with mask. For bits which have never
> + * been set, unpredictable value will be seen depending on driver
> + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
> + * or may not be propagated to the other path depending on HW capability.
> + *
> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> + */
> +struct rte_flow_action_set_meta {
> +	rte_be32_t data;
> +	rte_be32_t mask;

As I understand it, tx_metadata is host-endian. Just double-checking:
is the new dynamic field host-endian or big-endian?
I would definitely like to see the motivation in comments for why
data/mask are big-endian here.
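
To make the concern concrete, here is a minimal portable sketch of the
conversion an application would need if the rule side is big-endian while
the mbuf side is host-endian. The sketch_ helper names are hypothetical;
which side the patch actually converts on is exactly the open question.

```c
#include <stdint.h>
#include <string.h>

/* Store a host-order value so that its bytes in memory are big-endian. */
static uint32_t
sketch_cpu_to_be32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)(v >> 24), (uint8_t)(v >> 16),
		(uint8_t)(v >> 8), (uint8_t)v,
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Read a value whose bytes in memory are big-endian back into host order. */
static uint32_t
sketch_be32_to_cpu(uint32_t v)
{
	uint8_t b[4];

	memcpy(b, &v, sizeof(b));
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```

On a little-endian host the two representations differ, so a value written
to the rule without conversion would not compare equal to the same value
read from the mbuf field.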

> +};
> +
> +/* Mbuf dynamic field offset for metadata. */
> +extern int rte_flow_dynf_metadata_offs;
> +
> +/* Mbuf dynamic field flag mask for metadata. */
> +extern uint64_t rte_flow_dynf_metadata_mask;

These two global variables look frightening to me.
It does not look good to me.

> +
> +/* Mbuf dynamic field pointer for metadata. */
> +#define RTE_FLOW_DYNF_METADATA(m) \
> +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> +
> +/* Mbuf dynamic flag for metadata. */
> +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> +
> +__rte_experimental
> +static inline uint32_t
> +rte_flow_dynf_metadata_get(struct rte_mbuf *m) {

Above curly bracket should be on its own line in the case of function
definition.

Shouldn't m be 'const struct rte_mbuf *'?
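
Something along these lines is what I have in mind. This is only a sketch
against a hypothetical minimal mbuf: struct sketch_mbuf, SKETCH_DYNFIELD and
the sketch_ functions are stand-ins, not the real definitions. The accessor
macro is assumed to be a plain pointer-plus-offset cast.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal mbuf with room for one 32-bit dynamic field. */
struct sketch_mbuf {
	uint64_t ol_flags;
	uint32_t dynfield[4];
};

/* Pointer-plus-offset accessor; the caller chooses the pointer type. */
#define SKETCH_DYNFIELD(m, offset, type) \
	((type)((uintptr_t)(m) + (offset)))

/* Stand-in for the offset returned by the registration call. */
static int sketch_metadata_offs = offsetof(struct sketch_mbuf, dynfield);

/* The getter only reads, so it can take a const mbuf. */
static inline uint32_t
sketch_metadata_get(const struct sketch_mbuf *m)
{
	return *SKETCH_DYNFIELD(m, sketch_metadata_offs, const uint32_t *);
}

/* The setter writes, so it keeps the non-const parameter. */
static inline void
sketch_metadata_set(struct sketch_mbuf *m, uint32_t v)
{
	*SKETCH_DYNFIELD(m, sketch_metadata_offs, uint32_t *) = v;
}
```

The braces are placed on their own lines here, as requested for function
definitions.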

> +	return *RTE_FLOW_DYNF_METADATA(m);
> +}
> +
> +__rte_experimental
> +static inline void
> +rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v) {

Above curly bracket should be on its own line in the case of function
definition.

> +	*RTE_FLOW_DYNF_METADATA(m) = v;
> +}
> +
>   /*
>    * Definition of a single action.
>    *
> @@ -2662,6 +2729,32 @@ enum rte_flow_conv_op {
>   };
>   
>   /**
> + * Check if mbuf dynamic field for metadata is registered.
> + *
> + * @return
> + *   True if registered, false otherwise.
> + */
> +__rte_experimental
> +static inline int
> +rte_flow_dynf_metadata_avail(void) {

Above curly bracket should be on its own line in the case of function
definition.

> +	return !!rte_flow_dynf_metadata_mask;
> +}
> +
> +/**
> + * Register mbuf dynamic field and flag for metadata.
> + *
> + * This function must be called prior to use SET_META action in order to
> + * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
> + * application.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_dynf_metadata_register(void);
> +
> +/**
>    * Check whether a flow rule can be created on a given port.
>    *
>    * The flow rule is validated for correctness and whether it could be accepted
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 2e9d418..a4a0cf5 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -234,6 +234,10 @@ int rte_mbuf_dynflag_lookup(const char *name,
>   __rte_experimental
>   void rte_mbuf_dyn_dump(FILE *out);
>   
> -/* Placeholder for dynamic fields and flags declarations. */
> -
> +/*
> + * Placeholder for dynamic fields and flags declarations.
> + * This is centralizing point to gather all field names
> + * and parameters together.
> + */

It is not a comment for the define below, so I think an empty line is
required to separate the comment from the define.
I'm not sure that the clarification is required, but it is up to Olivier.

> +#define MBUF_DYNF_METADATA_NAME "rte_flow_dynfield_metadata"

Empty line is missing here

>   #endif



* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-27 18:40       ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
  2019-10-27 19:10         ` Ori Kam
  2019-10-29 16:22         ` Andrew Rybchenko
@ 2019-10-29 16:25         ` Olivier Matz
  2019-10-29 16:33           ` Olivier Matz
  2019-10-29 17:43           ` Slava Ovsiienko
  2019-10-29 19:31         ` [dpdk-dev] [PATCH v5] " Viacheslav Ovsiienko
  3 siblings, 2 replies; 98+ messages in thread
From: Olivier Matz @ 2019-10-29 16:25 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, thomas, matan, orika, Yongseok Koh

Hi Slava,

Looks good to me overall. A few minor comments below.

On Sun, Oct 27, 2019 at 06:40:36PM +0000, Viacheslav Ovsiienko wrote:
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
> 
> This patch extends the metadata feature usability.
> 
> 1) RTE_FLOW_ACTION_TYPE_SET_META
> 
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
> 
> 2) Metadata on ingress
> 
> There's also need to support metadata on ingress. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via metadata dynamic field of
> mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
> PKT_RX_DYNF_METADATA flag will be set along with the data.
> 
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to use SET_META action.
> 
> The availability of dynamic mbuf metadata field can be checked
> with rte_flow_dynf_metadata_avail() routine.
> 
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on hardware capability.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

(...)

> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index c36c1b6..b19c86b 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1048,7 +1048,6 @@ struct rte_eth_conf {
>  #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
>  #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
>  #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
> -
>  #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
>  				 DEV_RX_OFFLOAD_UDP_CKSUM | \
>  				 DEV_RX_OFFLOAD_TCP_CKSUM)

Undue removed line here.

(...)

> +/* Mbuf dynamic field offset for metadata. */
> +extern int rte_flow_dynf_metadata_offs;
> +
> +/* Mbuf dynamic field flag mask for metadata. */
> +extern uint64_t rte_flow_dynf_metadata_mask;
> +
> +/* Mbuf dynamic field pointer for metadata. */
> +#define RTE_FLOW_DYNF_METADATA(m) \
> +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> +
> +/* Mbuf dynamic flag for metadata. */
> +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> +
> +__rte_experimental
> +static inline uint32_t
> +rte_flow_dynf_metadata_get(struct rte_mbuf *m) {
> +	return *RTE_FLOW_DYNF_METADATA(m);
> +}
> +
> +__rte_experimental
> +static inline void
> +rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v) {
> +	*RTE_FLOW_DYNF_METADATA(m) = v;
> +}
> +

(...)

> +__rte_experimental
> +static inline int
> +rte_flow_dynf_metadata_avail(void) {
> +       return !!rte_flow_dynf_metadata_mask;
> +}

I think, in DPDK:

	static inline void
	rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
	{
		...

is preferred over:

	static inline void
	rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v) {
		...

> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -234,6 +234,10 @@ int rte_mbuf_dynflag_lookup(const char *name,
>  __rte_experimental
>  void rte_mbuf_dyn_dump(FILE *out);
>  
> -/* Placeholder for dynamic fields and flags declarations. */
> -
> +/*
> + * Placeholder for dynamic fields and flags declarations.
> + * This is centralizing point to gather all field names
> + * and parameters together.
> + */
> +#define MBUF_DYNF_METADATA_NAME "rte_flow_dynfield_metadata"
>  #endif

The RTE_ prefix is missing. Also, this name is called dynfield but it is
used for both field and flag. I suggest RTE_MBUF_DYNFIELD_METADATA_NAME
and RTE_MBUF_DYNFLAG_METADATA_NAME, to be consistent with the other
naming conventions in rte_mbuf_dyn.[ch].

One more comment: as previously discussed, changing the size or
alignment of a dynamic field should not be allowed, because it can
break the users of the field.

Depending on how it is implemented (is the registration function inline?
is the rte_mbuf_dynfield structure private, shared, or static const in a
.h? are we using #defines for name, size, align?), I think the impact on
users will be different. This is something we need to think about for
next versions: how to detect these changes before pushing the commit,
and/or at runtime?
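
For the runtime side, one possible shape of such a check is sketched below
with a hypothetical in-memory registry (struct sketch_dynfield and
sketch_dynfield_register are made-up names, and the table size is
arbitrary). Registering the same name with the same size and alignment
returns the existing slot, while a mismatch is rejected.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: name plus the layout parameters to protect. */
struct sketch_dynfield {
	const char *name;
	size_t size;
	size_t align;
};

#define SKETCH_MAX 8

static struct sketch_dynfield sketch_tbl[SKETCH_MAX];
static int sketch_cnt;

/* Return the slot index, or -1 on a layout conflict or a full table. */
static int
sketch_dynfield_register(const struct sketch_dynfield *d)
{
	int i;

	for (i = 0; i < sketch_cnt; i++) {
		if (strcmp(sketch_tbl[i].name, d->name) != 0)
			continue;
		/* Same name: accept only an identical layout. */
		if (sketch_tbl[i].size != d->size ||
		    sketch_tbl[i].align != d->align)
			return -1;
		return i;
	}
	if (sketch_cnt == SKETCH_MAX)
		return -1;
	sketch_tbl[sketch_cnt] = *d;
	return sketch_cnt++;
}
```

This only catches a mismatch between two callers in the same process. It
does not help with detecting the change before the commit is pushed.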

Regards,
Olivier


* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-29 16:25         ` Olivier Matz
@ 2019-10-29 16:33           ` Olivier Matz
  2019-10-29 17:53             ` Slava Ovsiienko
  2019-10-29 17:43           ` Slava Ovsiienko
  1 sibling, 1 reply; 98+ messages in thread
From: Olivier Matz @ 2019-10-29 16:33 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, thomas, matan, orika, Yongseok Koh

On Tue, Oct 29, 2019 at 05:25:22PM +0100, Olivier Matz wrote:
> Hi Slava,
> 
> Looks good to me overall. A few minor comments below.
> 
> On Sun, Oct 27, 2019 at 06:40:36PM +0000, Viacheslav Ovsiienko wrote:
> > Currently, metadata can be set on egress path via mbuf tx_metadata field
> > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
> > 
> > This patch extends the metadata feature usability.
> > 
> > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > 
> > When supporting multiple tables, Tx metadata can also be set by a rule and
> > matched by another rule. This new action allows metadata to be set as a
> > result of flow match.
> > 
> > 2) Metadata on ingress
> > 
> > There's also need to support metadata on ingress. Metadata can be set by
> > SET_META action and matched by META item like Tx. The final value set by
> > the action will be delivered to application via metadata dynamic field of
> > mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
> > PKT_RX_DYNF_METADATA flag will be set along with the data.
> > 
> > The mbuf dynamic field must be registered by calling
> > rte_flow_dynf_metadata_register() prior to use SET_META action.
> > 
> > The availability of dynamic mbuf metadata field can be checked
> > with rte_flow_dynf_metadata_avail() routine.
> > 
> > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > propagated to the other path depending on hardware capability.
> > 
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> 
> (...)
> 
> > diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> > index c36c1b6..b19c86b 100644
> > --- a/lib/librte_ethdev/rte_ethdev.h
> > +++ b/lib/librte_ethdev/rte_ethdev.h
> > @@ -1048,7 +1048,6 @@ struct rte_eth_conf {
> >  #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
> >  #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
> >  #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
> > -
> >  #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
> >  				 DEV_RX_OFFLOAD_UDP_CKSUM | \
> >  				 DEV_RX_OFFLOAD_TCP_CKSUM)
> 
> Undue removed line here.
> 
> (...)
> 
> > +/* Mbuf dynamic field offset for metadata. */
> > +extern int rte_flow_dynf_metadata_offs;
> > +
> > +/* Mbuf dynamic field flag mask for metadata. */
> > +extern uint64_t rte_flow_dynf_metadata_mask;
> > +
> > +/* Mbuf dynamic field pointer for metadata. */
> > +#define RTE_FLOW_DYNF_METADATA(m) \
> > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> > +
> > +/* Mbuf dynamic flag for metadata. */
> > +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> > +
> > +__rte_experimental
> > +static inline uint32_t
> > +rte_flow_dynf_metadata_get(struct rte_mbuf *m) {
> > +	return *RTE_FLOW_DYNF_METADATA(m);
> > +}
> > +
> > +__rte_experimental
> > +static inline void
> > +rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v) {
> > +	*RTE_FLOW_DYNF_METADATA(m) = v;
> > +}
> > +
> 
> (...)
> 
> > +__rte_experimental
> > +static inline int
> > +rte_flow_dynf_metadata_avail(void) {
> > +       return !!rte_flow_dynf_metadata_mask;
> > +}
> 
> I think, in DPDK:
> 
> 	static inline void
> 	rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
> 	{
> 		...
> 
> is preferred over:
> 
> 	static inline void
> 	rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v) {
> 		...
> 
> > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -234,6 +234,10 @@ int rte_mbuf_dynflag_lookup(const char *name,
> >  __rte_experimental
> >  void rte_mbuf_dyn_dump(FILE *out);
> >  
> > -/* Placeholder for dynamic fields and flags declarations. */
> > -
> > +/*
> > + * Placeholder for dynamic fields and flags declarations.
> > + * This is centralizing point to gather all field names
> > + * and parameters together.
> > + */
> > +#define MBUF_DYNF_METADATA_NAME "rte_flow_dynfield_metadata"
> >  #endif
> 
> The RTE_ prefix is missing. Also, this name is called dynfield but it is
> used for both field and flag. I suggest RTE_MBUF_DYNFIELD_METADATA_NAME
> and RTE_MBUF_DYNFLAG_METADATA_NAME, to be consistent with the other
> naming conventions in rte_mbuf_dyn.[ch].

I forgot: can you please document the goal/usage of this field and flag
here?  Not necessarily a detailed explanation, but a high-level view:
what is transported, when it is registered, ...


> One more comment: as previously discussed, changing the size or
> alignment of a dynamic field should not be allowed, because it can
> break the users of the field.
> 
> Depending on how it is implemented (is the registration function inline?
> is the rte_mbuf_dynfield structure private, shared, or static const in a
> .h? are we using #defines for name, size, align?), I think the impact on
> users will be different. This is something we need to think about for
> next versions: how to detect these changes before pushing the commit,
> and/or at runtime?
> 
> Regards,
> Olivier


* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-29 16:22         ` Andrew Rybchenko
@ 2019-10-29 17:19           ` Slava Ovsiienko
  2019-10-29 18:30             ` Thomas Monjalon
  2019-10-30  7:35             ` Andrew Rybchenko
  0 siblings, 2 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-29 17:19 UTC (permalink / raw)
  To: Andrew Rybchenko, dev
  Cc: Thomas Monjalon, Matan Azrad, olivier.matz, Ori Kam, Yongseok Koh

Hi, Andrew

Thank you for the review.

> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Tuesday, October 29, 2019 18:22
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; Matan Azrad
> <matan@mellanox.com>; olivier.matz@6wind.com; Ori Kam
> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> 
> On 10/27/19 9:40 PM, Viacheslav Ovsiienko wrote:
> > Currently, metadata can be set on egress path via mbuf tx_metadata
> > field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
> matches metadata.
> >
> > This patch extends the metadata feature usability.
> >
> > 1) RTE_FLOW_ACTION_TYPE_SET_META
> >
> > When supporting multiple tables, Tx metadata can also be set by a rule
> > and matched by another rule. This new action allows metadata to be set
> > as a result of flow match.
> >
> > 2) Metadata on ingress
> >
> > There's also need to support metadata on ingress. Metadata can be set
> > by SET_META action and matched by META item like Tx. The final value
> > set by the action will be delivered to application via metadata
> > dynamic field of mbuf which can be accessed by
> RTE_FLOW_DYNF_METADATA().
> > PKT_RX_DYNF_METADATA flag will be set along with the data.
> >
> > The mbuf dynamic field must be registered by calling
> > rte_flow_dynf_metadata_register() prior to use SET_META action.
> >
> > The availability of dynamic mbuf metadata field can be checked with
> > rte_flow_dynf_metadata_avail() routine.
> >
> > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > propagated to the other path depending on hardware capability.
> >
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> 
> Above explanations lack information about "meta" vs "mark" which may be
> set on Rx as well and delivered in other mbuf field.
> It should be explained why one more field is required and rules defined.

There is some history behind the metadata features.
Initially, two metadata-related actions were proposed:

- RTE_FLOW_ACTION_TYPE_FLAG
- RTE_FLOW_ACTION_TYPE_MARK

These actions set a special flag in the packet metadata, and the MARK action additionally
stores a specified value in the metadata storage. On packet reception the PMD puts the flag
and the value into the mbuf, so the application can see that the packet was treated inside
the flow engine according to the appropriate RTE flow(s). MARK and FLAG act as a kind of
gateway that transfers per-packet information from the flow engine to the application
via the receiving datapath.

From the datapath point of view, MARK and FLAG are related to the receiving side only.
It would be useful to have the same gateway on the transmitting side, so the item of type
RTE_FLOW_ITEM_TYPE_META was proposed. The application fills a field in the mbuf, and
this value is transferred to some field in the packet metadata inside the flow engine.
It did not matter whether these metadata fields were shared, because the MARK and META items
belonged to different domains (receiving and transmitting) and could be vendor-specific.

So far, so good: DPDK offers entities to control metadata inside the flow engine
and gateways to exchange these values on a per-packet basis via the datapaths.

As we can see, the MARK and META mechanisms are not symmetric: there is no action that
would allow us to set the META value on the transmitting path. So, the action of type:
- RTE_FLOW_ACTION_TYPE_SET_META is proposed.

Next, applications raise new requirements for packet metadata. Flow engines are
getting more complex, internal switches are introduced, and multiple ports might be supported
within the same flow engine namespace. From the DPDK point of view, this means packets might
be sent on one eth_dev port and received on another, while the packet path inside the flow
engine belongs entirely to the same hardware device. The simplest example is SR-IOV with a PF,
VFs and the representors. This is a great opportunity to provide an out-of-band channel to
transfer extra data from one port to another, besides the packet data itself.


> Above explanations lack information about "meta" vs "mark" which may be
> set on Rx as well and delivered in other mbuf field.
> It should be explained why one more field is required and rules defined.
> Otherwise we can end up in half PMDs supporting mark only, half PMDs
> supporting meta only and applications in an interesting situation to make a
> choice which one to use.

There is no "mark" vs "meta". The MARK and META mechanisms are kept for compatibility,
and the legacy part works exactly as before. A trial with rte_flow_validate() is supposed
to check whether the PMD supports the MARK or META feature in the appropriate domain. It
depends on the PMD implementation, configuration and underlying HW/FW/kernel capabilities,
and should be resolved at runtime.

> 
> [snip]
> 
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > b/doc/guides/prog_guide/rte_flow.rst
> > index 159ce19..c943aca 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -658,6 +658,32 @@ the physical device, with virtual groups in the
> PMD or not at all.
> >      | ``mask`` | ``id``   | zeroed to match any value |
> >      +----------+----------+---------------------------+
> >
> > +Item: ``META``
> > +^^^^^^^^^^^^^^^^^
> > +
> > +Matches 32 bit metadata item set.
> > +
> > +On egress, metadata can be set either by mbuf metadata field with
> > +PKT_TX_METADATA flag or ``SET_META`` action. On ingress,
> ``SET_META``
> > +action sets metadata for a packet and the metadata will be reported
> > +via ``metadata`` dynamic field of ``rte_mbuf`` with
> PKT_RX_DYNF_METADATA flag.
> > +
> > +- Default ``mask`` matches the specified Rx metadata value.
> > +
> > +.. _table_rte_flow_item_meta:
> > +
> > +.. table:: META
> > +
> > +   +----------+----------+---------------------------------------+
> > +   | Field    | Subfield | Value                                 |
> > +   +==========+==========+=======================================+
> > +   | ``spec`` | ``data`` | 32 bit metadata value                 |
> > +   +----------+----------+---------------------------------------+
> > +   | ``last`` | ``data`` | upper range value                     |
> > +   +----------+----------+---------------------------------------+
> > +   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> > +   +----------+----------+---------------------------------------+
> > +
> >   Data matching item types
> >   ~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > @@ -1232,21 +1258,6 @@ Matches a PPPoE session protocol identifier.
> >   - ``proto_id``: PPP protocol identifier.
> >   - Default ``mask`` matches proto_id only.
> >
> > -
> > -.. _table_rte_flow_item_meta:
> > -
> > -.. table:: META
> > -
> > -   +----------+----------+---------------------------------------+
> > -   | Field    | Subfield | Value                                 |
> > -   +==========+==========+=======================================+
> > -   | ``spec`` | ``data`` | 32 bit metadata value                 |
> > -   +----------+--------------------------------------------------+
> > -   | ``last`` | ``data`` | upper range value                     |
> > -   +----------+----------+---------------------------------------+
> > -   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> > -   +----------+----------+---------------------------------------+
> > -
> >   Item: ``NSH``
> >   ^^^^^^^^^^^^^
> >
> > @@ -2466,6 +2477,37 @@ Value to decrease TCP acknowledgment
> number by is a big-endian 32 bit integer.
> >
> >   Using this action on non-matching traffic will result in undefined behavior.
> >
> > +Action: ``SET_META``
> > +^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > +Set metadata. Item ``META`` matches metadata.
> > +
> > +Metadata set by mbuf metadata field with PKT_TX_METADATA flag on
> > +egress will be overridden by this action. On ingress, the metadata
> > +will be carried by ``metadata`` dynamic field of ``rte_mbuf`` which
> > +can be accessed by ``RTE_FLOW_DYNF_METADATA()``.
> PKT_RX_DYNF_METADATA
> > +flag will be set along with the data.
> > +
> > +The mbuf dynamic field must be registered by calling
> > +``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
> > +
> > +Altering partial bits is supported with ``mask``. For bits which have
> > +never been set, unpredictable value will be seen depending on driver
> > +implementation. For loopback/hairpin packet, metadata set on Rx/Tx
> > +may or may not be propagated to the other path depending on HW
> capability.
> > +
> > +.. _table_rte_flow_action_set_meta:
> > +
> > +.. table:: SET_META
> > +
> > +   +----------+----------------------------+
> > +   | Field    | Value                      |
> > +   +==========+============================+
> > +   | ``data`` | 32 bit metadata value      |
> > +   +----------+----------------------------+
> > +   | ``mask`` | bit-mask applies to "data" |
> > +   +----------+----------------------------+
> > +
> >   Negative types
> >   ~~~~~~~~~~~~~~
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index 3aa1634..9d54d8e 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -106,6 +106,10 @@ Deprecation Notices
> >     struct ``rte_eth_dev_info`` for the port capability and in struct
> >     ``rte_eth_rxmode`` for the port configuration.
> >
> > +* ethdev: DEV_TX_OFFLOAD_MATCH_METADATA will be removed, static
> > +metadata
> > +  mbuf field will be removed in 20.02, metadata feature will use
> > +dynamic mbuf
> > +  field and flag instead.
> > +
> 
> Isn't it breaking stable API/ABI? Should target release be 20.11?
tx_metadata is in a union, so it should not be an ABI break.
And we propose to remove tx_metadata entirely in 19.11
and share the dynamic metadata field between Rx and Tx metadata.

> I think that DEV_TX_OFFLOAD_MATCH_METADATA should marked as
> deprecated now as well as tx_metadata field in mbuf.
> 
> >   * cryptodev: support for using IV with all sizes is added, J0 still can
> >     be used but only when IV length in following structs
> ``rte_crypto_auth_xform``,
> >     ``rte_crypto_aead_xform`` is set to zero. When IV length is
> > greater or equal diff --git a/doc/guides/rel_notes/release_19_11.rst
> > b/doc/guides/rel_notes/release_19_11.rst
> > index 0e5bb5d..6d331f6 100644
> > --- a/doc/guides/rel_notes/release_19_11.rst
> > +++ b/doc/guides/rel_notes/release_19_11.rst
> > @@ -232,6 +232,21 @@ New Features
> >       gives ability to print port supported ptypes in different protocol layers.
> >
> >
> > +* **Add support of support dynamic fields and flags in mbuf.**
> > +
> > +  This new feature adds the ability to dynamically register some room
> > + for a field or a flag in the mbuf structure. This is typically used
> > + for specific offload features, where adding a static field or flag
> > + in the mbuf is not justified.
> > +
> 
> I guess above is just incorrect merge.
Oops, thanks for spotting,

> 
> > +* **Extended metadata support in rte_flow.**
> > +
> > +  Flow metadata is extended to both Rx and Tx.
> > +
> > +  * Tx metadata can also be set by SET_META action of rte_flow.
> > +  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf``
> with
> > +    PKT_RX_DYNF_METADATA.
> > +
> 
> Two empty lines are required before the next section.
Accepted.

> 
> >   Removed Items
> >   -------------
> >
> > diff --git a/lib/librte_ethdev/rte_ethdev.h
> > b/lib/librte_ethdev/rte_ethdev.h index c36c1b6..b19c86b 100644
> > --- a/lib/librte_ethdev/rte_ethdev.h
> > +++ b/lib/librte_ethdev/rte_ethdev.h
> > @@ -1048,7 +1048,6 @@ struct rte_eth_conf {
> >   #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
> >   #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
> >   #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
> > -
OK, accepted.

> 
> Unrelated change.
> 
> >   #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM
> | \
> >   				 DEV_RX_OFFLOAD_UDP_CKSUM | \
> >   				 DEV_RX_OFFLOAD_TCP_CKSUM)
> 
> [snip]
> 
> > diff --git a/lib/librte_ethdev/rte_flow.c
> > b/lib/librte_ethdev/rte_flow.c index ca0f680..6090177 100644
> > --- a/lib/librte_ethdev/rte_flow.c
> > +++ b/lib/librte_ethdev/rte_flow.c
> > @@ -12,10 +12,18 @@
> >   #include <rte_errno.h>
> >   #include <rte_branch_prediction.h>
> >   #include <rte_string_fns.h>
> > +#include <rte_mbuf.h>
> > +#include <rte_mbuf_dyn.h>
> >   #include "rte_ethdev.h"
> >   #include "rte_flow_driver.h"
> >   #include "rte_flow.h"
> >
> > +/* Mbuf dynamic field name for metadata. */ int
> > +rte_flow_dynf_metadata_offs = -1;
> > +
> > +/* Mbuf dynamic field flag bit number for metadata. */ uint64_t
> > +rte_flow_dynf_metadata_mask;
> > +
> >   /**
> >    * Flow elements description tables.
> >    */
> > @@ -157,8 +165,41 @@ struct rte_flow_desc_data {
> >   	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
> >   	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
> >   	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> > +	MK_FLOW_ACTION(SET_META, sizeof(struct
> rte_flow_action_set_meta)),
> >   };
> >
> > +int
> > +rte_flow_dynf_metadata_register(void)
> > +{
> > +	int offset;
> > +	int flag;
> > +
> > +	static const struct rte_mbuf_dynfield desc_offs = {
> > +		.name = MBUF_DYNF_METADATA_NAME,
> > +		.size = sizeof(uint32_t),
> > +		.align = __alignof__(uint32_t),
> > +		.flags = 0,
> 
> I think flags initialization to 0 is redundant.
It was left just as a reminder that the field exists. Do you think we do not need the reminder? OK.

> 
> > +	};
> > +	static const struct rte_mbuf_dynflag desc_flag = {
> > +		.name = MBUF_DYNF_METADATA_NAME,
> > +	};
> > +
> > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > +	if (offset < 0)
> > +		goto error;
> > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > +	if (flag < 0)
> > +		goto error;
> > +	rte_flow_dynf_metadata_offs = offset;
> > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > +	return 0;
> > +
> > +error:
> 
> Just an observation...
> Impossibility to unregister hits here. Field may be registered, but will not be used.

A metadata field is useless without the flag. Yes, there is no way to unregister,
so we just forget about the "field with no flag" and that's it.
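
A minimal sketch of that failure handling, for illustration only. It is not the
patch code: the stub_* functions and the meta_* names are hypothetical stand-ins
for the mbuf dynamic field/flag registry and for the two exported globals. It
shows that when the field registers but the flag does not, the exported state
is reset, so the feature reports itself as unavailable.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the mbuf dynamic field/flag registry. */
int stub_field_result; /* offset to return, negative on failure */
int stub_flag_result;  /* bit number to return, negative on failure */

int stub_dynfield_register(void) { return stub_field_result; }
int stub_dynflag_register(void) { return stub_flag_result; }

/* Exported state, modelled on the two globals discussed in the review. */
int meta_offs = -1;
uint64_t meta_mask;

int
meta_register(void)
{
	int offset;
	int flag;

	offset = stub_dynfield_register();
	if (offset < 0)
		goto error;
	flag = stub_dynflag_register();
	if (flag < 0)
		goto error;
	meta_offs = offset;
	meta_mask = 1ULL << flag;
	return 0;

error:
	/* A field registered without its flag stays allocated but unused. */
	meta_offs = -1;
	meta_mask = 0;
	return -1;
}

/* Availability is derived from the flag mask only. */
int
meta_avail(void)
{
	return !!meta_mask;
}
```

Because availability is derived from the mask alone, a caller never sees a
half-registered state.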

> 
> > +	rte_flow_dynf_metadata_offs = -1;
> > +	rte_flow_dynf_metadata_mask = 0ULL;
> > +	return -rte_errno;
> > +}
> > +
> >   static int
> >   flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
> >   {
> > diff --git a/lib/librte_ethdev/rte_flow.h
> > b/lib/librte_ethdev/rte_flow.h index 4fee105..b821557 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -28,6 +28,8 @@
> >   #include <rte_byteorder.h>
> >   #include <rte_esp.h>
> >   #include <rte_higig.h>
> > +#include <rte_mbuf.h>
> > +#include <rte_mbuf_dyn.h>
> >
> >   #ifdef __cplusplus
> >   extern "C" {
> > @@ -418,7 +420,8 @@ enum rte_flow_item_type {
> >   	/**
> >   	 * [META]
> >   	 *
> > -	 * Matches a metadata value specified in mbuf metadata field.
> > +	 * Matches a metadata value.
> > +	 *
> >   	 * See struct rte_flow_item_meta.
> >   	 */
> >   	RTE_FLOW_ITEM_TYPE_META,
> > @@ -1263,9 +1266,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
> >   #endif
> >
> >   /**
> > - * RTE_FLOW_ITEM_TYPE_META.
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> 
> Is it allowed to make experimental back?
I think we should remove EXPERIMENTAL here. We are not introducing a new
feature, just extending the area where it applies.

> 
> >    *
> > - * Matches a specified metadata value.
> > + * RTE_FLOW_ITEM_TYPE_META
> > + *
> > + * Matches a specified metadata value. On egress, metadata can be set
> > + either by
> > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > + * metadata for a packet and the metadata will be reported via mbuf
> > + metadata
> > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
> > + field must be
> > + * registered in advance by rte_flow_dynf_metadata_register().
> >    */
> >   struct rte_flow_item_meta {
> >   	rte_be32_t data;
> 
> [snip]
> 
> > @@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
> >   	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
> >   };
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_SET_META
> > + *
> > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > + * PKT_TX_METADATA flag on egress will be overridden by this action.
> > +On
> > + * ingress, the metadata will be carried by mbuf metadata dynamic
> > +field
> > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> > +must be
> > + * registered in advance by rte_flow_dynf_metadata_register().
> > + *
> > + * Altering partial bits is supported with mask. For bits which have
> > +never
> > + * been set, unpredictable value will be seen depending on driver
> > + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx
> > +may
> > + * or may not be propagated to the other path depending on HW
> capability.
> > + *
> > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > + */
> > +struct rte_flow_action_set_meta {
> > +	rte_be32_t data;
> > +	rte_be32_t mask;
> 
> As I understand tx_metadata is host endian. Just double-checking.
> Is a new dynamic field host endian or big endian?
> I definitely would like to see motivation in comments why data/mask are big-
> endian here.

Metadata is an opaque value, so endianness does not matter, and there was no
special motivation for choosing the endianness. The rte_flow_item_meta structure
provides data with the rte_be32_t type, so the meta-related action does the same.
I assume the big-endian type was originally chosen because of the endianness
of the metadata field in the Tx descriptor of ConnectX NICs.
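
To illustrate why the byte order is irrelevant for an opaque value, here is a
small self-contained sketch. It is an assumption-laden model, not DPDK code:
to_be32() and meta_item_matches() are hypothetical helpers. As long as the
value set by the action and the spec/mask of the item go through the same
conversion, a masked comparison gives the same result as it would on host-order
values.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical portable CPU-to-big-endian conversion. */
uint32_t
to_be32(uint32_t host)
{
	uint8_t b[4] = {
		(uint8_t)(host >> 24), (uint8_t)(host >> 16),
		(uint8_t)(host >> 8), (uint8_t)host,
	};
	uint32_t be;

	memcpy(&be, b, sizeof(be));
	return be;
}

/*
 * Hypothetical match step: compare the stored metadata with the item spec
 * under the item mask. All three arguments are in big-endian representation.
 */
int
meta_item_matches(uint32_t stored_be, uint32_t spec_be, uint32_t mask_be)
{
	return (stored_be & mask_be) == (spec_be & mask_be);
}
```

Bitwise AND and equality work per bit, so permuting the bytes of all operands
in the same way does not change the outcome.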

> 
> > +};
> > +
> > +/* Mbuf dynamic field offset for metadata. */ extern int
> > +rte_flow_dynf_metadata_offs;
> > +
> > +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > +rte_flow_dynf_metadata_mask;
> 
> These two global variables look frightening to me.
> It does not look good to me.
To me too. But we need the performance: these variables are
intended for use in the datapath, where any overhead is painful.
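
A rough sketch of what the global offset buys in the datapath, under stated
assumptions: struct mock_mbuf is a hypothetical miniature buffer (the real
rte_mbuf layout is different), and the MOCK_/mock_ names are invented for
illustration. The accessor has the same base-plus-offset shape as the
RTE_MBUF_DYNFIELD() usage quoted above, so each per-packet access is one load
of the offset plus one dereference, with no lookup by name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature mbuf with some room for dynamic fields. */
struct mock_mbuf {
	uint64_t ol_flags;
	uint32_t dynfield[8];
};

/* Base pointer plus byte offset, cast to the requested pointer type. */
#define MOCK_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))

/* Written once at registration time, read for every packet afterwards. */
int mock_meta_offs = -1;

static inline uint32_t
mock_meta_get(const struct mock_mbuf *m)
{
	return *MOCK_DYNFIELD(m, mock_meta_offs, const uint32_t *);
}

static inline void
mock_meta_set(struct mock_mbuf *m, uint32_t v)
{
	*MOCK_DYNFIELD(m, mock_meta_offs, uint32_t *) = v;
}
```

The offset must be set by a successful registration before either helper is
called; with the initial value of -1 the access would be out of bounds.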

> 
> > +
> > +/* Mbuf dynamic field pointer for metadata. */ #define
> > +RTE_FLOW_DYNF_METADATA(m) \
> > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t
> *)
> > +
> > +/* Mbuf dynamic flag for metadata. */ #define PKT_RX_DYNF_METADATA
> > +(rte_flow_dynf_metadata_mask)
> > +
> > +__rte_experimental
> > +static inline uint32_t
> > +rte_flow_dynf_metadata_get(struct rte_mbuf *m) {
> 
> Above curly bracket should be on its own line in the case of function
> definition.
> 
> Shouldn't m be 'const struct rte_mbuf *'?
You are right, it would be better, will update.
> 
> > +	return *RTE_FLOW_DYNF_METADATA(m);
> > +}
> > +
> > +__rte_experimental
> > +static inline void
> > +rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v) {
> 
> Above curly bracket should be on its own line in the case of function
> definition.
> 
> > +	*RTE_FLOW_DYNF_METADATA(m) = v;
> > +}
> > +
> >   /*
> >    * Definition of a single action.
> >    *
> > @@ -2662,6 +2729,32 @@ enum rte_flow_conv_op {
> >   };
> >
> >   /**
> > + * Check if mbuf dynamic field for metadata is registered.
> > + *
> > + * @return
> > + *   True if registered, false otherwise.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_flow_dynf_metadata_avail(void) {
> 
> Above curly bracket should be on its own line in the case of function
> definition.
> 
> > +	return !!rte_flow_dynf_metadata_mask; }
> > +
> > +/**
> > + * Register mbuf dynamic field and flag for metadata.
> > + *
> > + * This function must be called prior to use SET_META action in order
> > +to
> > + * register the dynamic mbuf field. Otherwise, the data cannot be
> > +delivered to
> > + * application.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +__rte_experimental
> > +int
> > +rte_flow_dynf_metadata_register(void);
> > +
> > +/**
> >    * Check whether a flow rule can be created on a given port.
> >    *
> >    * The flow rule is validated for correctness and whether it could
> > be accepted diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h
> > b/lib/librte_mbuf/rte_mbuf_dyn.h index 2e9d418..a4a0cf5 100644
> > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -234,6 +234,10 @@ int rte_mbuf_dynflag_lookup(const char *name,
> >   __rte_experimental
> >   void rte_mbuf_dyn_dump(FILE *out);
> >
> > -/* Placeholder for dynamic fields and flags declarations. */
> > -
> > +/*
> > + * Placeholder for dynamic fields and flags declarations.
> > + * This is centralizing point to gather all field names
> > + * and parameters together.
> > + */
> 
> It is not a comment for below define. So, I think empty line is required to
> separate the comment from below define.
> I'm not sure that the clarification is required, but it is up to Olivier.
> 
> > +#define MBUF_DYNF_METADATA_NAME "rte_flow_dynfield_metadata"
> 
> Empty line is missing here
Thanks, will add one.

> 
> >   #endif

With best regards, Slava

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-29 16:25         ` Olivier Matz
  2019-10-29 16:33           ` Olivier Matz
@ 2019-10-29 17:43           ` Slava Ovsiienko
  1 sibling, 0 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-29 17:43 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Thomas Monjalon, Matan Azrad, Ori Kam, Yongseok Koh

Hi, Olivier

Thanks a lot for the review.

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Tuesday, October 29, 2019 18:25
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Matan
> Azrad <matan@mellanox.com>; Ori Kam <orika@mellanox.com>; Yongseok
> Koh <yskoh@mellanox.com>
> Subject: Re: [PATCH v4] ethdev: extend flow metadata
> 
> Hi Slava,
> 
> Looks good to me overall. Few minor comments below.
> 
> On Sun, Oct 27, 2019 at 06:40:36PM +0000, Viacheslav Ovsiienko wrote:
> > Currently, metadata can be set on egress path via mbuf tx_metadata
> > field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
> matches metadata.
> >
> > This patch extends the metadata feature usability.
> >
> > 1) RTE_FLOW_ACTION_TYPE_SET_META
> >
> > When supporting multiple tables, Tx metadata can also be set by a rule
> > and matched by another rule. This new action allows metadata to be set
> > as a result of flow match.
> >
> > 2) Metadata on ingress
> >
> > There's also need to support metadata on ingress. Metadata can be set
> > by SET_META action and matched by META item like Tx. The final value
> > set by the action will be delivered to application via metadata
> > dynamic field of mbuf which can be accessed by
> RTE_FLOW_DYNF_METADATA().
> > PKT_RX_DYNF_METADATA flag will be set along with the data.
> >
> > The mbuf dynamic field must be registered by calling
> > rte_flow_dynf_metadata_register() prior to use SET_META action.
> >
> > The availability of dynamic mbuf metadata field can be checked with
> > rte_flow_dynf_metadata_avail() routine.
> >
> > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > propagated to the other path depending on hardware capability.
> >
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> 
> (...)
> 
> > diff --git a/lib/librte_ethdev/rte_ethdev.h
> > b/lib/librte_ethdev/rte_ethdev.h index c36c1b6..b19c86b 100644
> > --- a/lib/librte_ethdev/rte_ethdev.h
> > +++ b/lib/librte_ethdev/rte_ethdev.h
> > @@ -1048,7 +1048,6 @@ struct rte_eth_conf {
> >  #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
> >  #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
> >  #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
> > -
> >  #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM
> | \
> >  				 DEV_RX_OFFLOAD_UDP_CKSUM | \
> >  				 DEV_RX_OFFLOAD_TCP_CKSUM)
> 
> Undue removed line here.

Right, will fix.

> 
> (...)
> 
> > +/* Mbuf dynamic field offset for metadata. */ extern int
> > +rte_flow_dynf_metadata_offs;
> > +
> > +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > +rte_flow_dynf_metadata_mask;
> > +
> > +/* Mbuf dynamic field pointer for metadata. */ #define
> > +RTE_FLOW_DYNF_METADATA(m) \
> > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t
> *)
> > +
> > +/* Mbuf dynamic flag for metadata. */ #define PKT_RX_DYNF_METADATA
> > +(rte_flow_dynf_metadata_mask)
> > +
> > +__rte_experimental
> > +static inline uint32_t
> > +rte_flow_dynf_metadata_get(struct rte_mbuf *m) {
> > +	return *RTE_FLOW_DYNF_METADATA(m);
> > +}
> > +
> > +__rte_experimental
> > +static inline void
> > +rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v) {
> > +	*RTE_FLOW_DYNF_METADATA(m) = v;
> > +}
> > +
> 
> (...)
> 
> > +__rte_experimental
> > +static inline int
> > +rte_flow_dynf_metadata_avail(void) {
> > +       return !!rte_flow_dynf_metadata_mask; }
> 
> I think, in DPDK:
> 
> 	static inline void
> 	rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
> 	{
> 		...
> 
> is preferred over:
> 
> 	static inline void
> 	rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v) {

Right. It is strange it passed the validator. Will fix.
> 		...
> 
> > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -234,6 +234,10 @@ int rte_mbuf_dynflag_lookup(const char *name,
> > __rte_experimental  void rte_mbuf_dyn_dump(FILE *out);
> >
> > -/* Placeholder for dynamic fields and flags declarations. */
> > -
> > +/*
> > + * Placeholder for dynamic fields and flags declarations.
> > + * This is centralizing point to gather all field names
> > + * and parameters together.
> > + */
> > +#define MBUF_DYNF_METADATA_NAME "rte_flow_dynfield_metadata"
> >  #endif
> 
> The RTE_ prefix is missing. Also, this name is called dynfield but it is used for
> both field and flag. I suggest RTE_MBUF_DYNFIELD_METADATA_NAME and
> RTE_MBUF_DYNFLAG_METADATA_NAME, to be consistent with the other
> naming conventions in rte_mbuf_dyn.[ch].

Well, it makes sense for complex features, say, with multiple dynamic flags.
But it is OK with me, will update.

> 
> One more comment: as previously discussed, changing the size or alignment
> of a dynamic field should not be allowed, because it can break the users of
> the field.
> 
> Depending on how it is implemented (is the registration function inline?
> is the rte_mbuf_dynfield structure private, shared, or static const in a .h? are
> we using #defines for name, size, align?), I think the impact on users will be
> different. This is something we need to think about for next versions: how to
> detect these changes before pushing the commit, and/or at runtime?
> 
I'm not sure we will have a lot of implementation invariants for dynamic fields.
They are intended to be used in the datapath, where performance is king. I think if
someone invents a more efficient way to handle dynamic fields, we'll update the
metadata code as well.


> Regards,
> Olivier

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-29 16:33           ` Olivier Matz
@ 2019-10-29 17:53             ` Slava Ovsiienko
  0 siblings, 0 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-29 17:53 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Thomas Monjalon, Matan Azrad, Ori Kam, Yongseok Koh

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Tuesday, October 29, 2019 18:34
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Matan
> Azrad <matan@mellanox.com>; Ori Kam <orika@mellanox.com>; Yongseok
> Koh <yskoh@mellanox.com>
> Subject: Re: [PATCH v4] ethdev: extend flow metadata
> 
> On Tue, Oct 29, 2019 at 05:25:22PM +0100, Olivier Matz wrote:
> > Hi Slava,
> >
> > Looks good to me overall. Few minor comments below.

[snip]

> 
> I forgot: can you please document the goal/usage of these field and flag
> here?  Not necessarily a detailed explanation, but a high level view:
> what is transported, when it is registered, ...
> 
Even a brief explanation is wordy, because it involves some of the
metadata feature's history. I will think about how to elaborate the
description.

With best regards, Slava


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-29 17:19           ` Slava Ovsiienko
@ 2019-10-29 18:30             ` Thomas Monjalon
  2019-10-29 18:35               ` Slava Ovsiienko
  2019-10-30  6:28               ` Andrew Rybchenko
  2019-10-30  7:35             ` Andrew Rybchenko
  1 sibling, 2 replies; 98+ messages in thread
From: Thomas Monjalon @ 2019-10-29 18:30 UTC (permalink / raw)
  To: Slava Ovsiienko; +Cc: Andrew Rybchenko, dev, Matan Azrad, olivier.matz, Ori Kam

29/10/2019 18:19, Slava Ovsiienko:
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> > > --- a/doc/guides/rel_notes/deprecation.rst
> > > +++ b/doc/guides/rel_notes/deprecation.rst
> > > +* ethdev: DEV_TX_OFFLOAD_MATCH_METADATA will be removed, static
> > > +metadata
> > > +  mbuf field will be removed in 20.02, metadata feature will use
> > > +dynamic mbuf
> > > +  field and flag instead.
> > > +
> > 
> > Isn't it breaking stable API/ABI? Should target release be 20.11?
> 
> tx_metadata is in union, it should not be ABI break.
> And we propose to remove tx_metadata at all in 19.11 
> and share the dynamic metadata field between Rx and Tx METAdata.

Yes please, let's remove tx_metadata from mbuf now,
while adding metadata dynamic field.
I am sure nobody will complain that we sanitized this feature
before releasing DPDK 19.11 LTS.



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-29 18:30             ` Thomas Monjalon
@ 2019-10-29 18:35               ` Slava Ovsiienko
  2019-10-30  6:28               ` Andrew Rybchenko
  1 sibling, 0 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-29 18:35 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Andrew Rybchenko, dev, Matan Azrad, olivier.matz, Ori Kam

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Tuesday, October 29, 2019 20:30
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: Andrew Rybchenko <arybchenko@solarflare.com>; dev@dpdk.org;
> Matan Azrad <matan@mellanox.com>; olivier.matz@6wind.com; Ori Kam
> <orika@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> 
> 29/10/2019 18:19, Slava Ovsiienko:
> > From: Andrew Rybchenko <arybchenko@solarflare.com>
> > > > --- a/doc/guides/rel_notes/deprecation.rst
> > > > +++ b/doc/guides/rel_notes/deprecation.rst
> > > > +* ethdev: DEV_TX_OFFLOAD_MATCH_METADATA will be removed,
> static
> > > > +metadata
> > > > +  mbuf field will be removed in 20.02, metadata feature will use
> > > > +dynamic mbuf
> > > > +  field and flag instead.
> > > > +
> > >
> > > Isn't it breaking stable API/ABI? Should target release be 20.11?
> >
> > tx_metadata is in union, it should not be ABI break.
> > And we propose to remove tx_metadata at all in 19.11 and share the
> > dynamic metadata field between Rx and Tx METAdata.
> 
> Yes please, let's remove tx_metadata from mbuf now, while adding
> metadata dynamic field.
> I am sure nobody will complain that we sanitized this feature before
> releasing DPDK 19.11 LTS.

OK, I'll extend this patch into a small series and update tx_metadata in a
dedicated commit (which also simplifies reviewing).

With best regards, Slava
 


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v5] ethdev: extend flow metadata
  2019-10-27 18:40       ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
                           ` (2 preceding siblings ...)
  2019-10-29 16:25         ` Olivier Matz
@ 2019-10-29 19:31         ` Viacheslav Ovsiienko
  2019-10-30  8:02           ` Andrew Rybchenko
                             ` (2 more replies)
  3 siblings, 3 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-29 19:31 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, Yongseok Koh

Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There is also a need to support metadata on ingress. Metadata can be set by
the SET_META action and matched by the META item, as on Tx. The final value set
by the action will be delivered to the application via the metadata dynamic
field of the mbuf, which can be accessed with the RTE_FLOW_DYNF_METADATA() macro
or with the rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
routines. The PKT_RX_DYNF_METADATA flag will be set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to using the SET_META action.

The availability of the dynamic mbuf metadata field can be checked
with the rte_flow_dynf_metadata_avail() routine.

If an application is going to use the metadata feature, it registers
the metadata dynamic field and flag; the PMD then checks the metadata
field availability and handles the appropriate fields in the datapath.
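
The register-then-check handshake described above can be sketched in
plain C. This is only a model under stated assumptions, not the DPDK
implementation: struct demo_mbuf and every demo_* name are hypothetical
stand-ins for rte_mbuf, rte_flow_dynf_metadata_register(),
rte_flow_dynf_metadata_avail() and the two global variables, and the
offset and flag bit are made-up values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_mbuf, reduced to what is needed. */
struct demo_mbuf {
	uint64_t ol_flags;     /* offload flags, including dynamic flag bits */
	uint8_t dyn_area[16];  /* space from which dynamic fields are carved */
};

/* Model rte_flow_dynf_metadata_offs/_mask: unset until registration. */
static int demo_meta_offs = -1;
static uint64_t demo_meta_mask;

/* Models rte_flow_dynf_metadata_register(); offset and bit are made up. */
static int
demo_meta_register(void)
{
	demo_meta_offs = 4;
	demo_meta_mask = 1ULL << 40;
	return 0;
}

/* Models rte_flow_dynf_metadata_avail(). */
static int
demo_meta_avail(void)
{
	return !!demo_meta_mask;
}

/* PMD Rx side: store the value only if the application registered. */
static void
demo_pmd_rx_store(struct demo_mbuf *m, uint32_t hw_meta)
{
	if (!demo_meta_avail())
		return;
	assert(demo_meta_offs >= 0);
	memcpy(&m->dyn_area[demo_meta_offs], &hw_meta, sizeof(hw_meta));
	m->ol_flags |= demo_meta_mask;
}

/* Application side: returns 1 and fills *out if metadata is present. */
static int
demo_app_rx_read(const struct demo_mbuf *m, uint32_t *out)
{
	if (!demo_meta_avail() || !(m->ol_flags & demo_meta_mask))
		return 0;
	memcpy(out, &m->dyn_area[demo_meta_offs], sizeof(*out));
	return 1;
}
```

In this model the PMD silently skips the store until the application has
registered, which is why the availability check sits on the datapath and
must stay cheap (a single test of a global mask).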

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.

MARK and METADATA look similar and might operate in a similar way,
but they do not interact.

Initially, two metadata-related actions were proposed:

- RTE_FLOW_ACTION_TYPE_FLAG
- RTE_FLOW_ACTION_TYPE_MARK

These actions set a special flag in the packet metadata; the MARK action
stores a specified value in the metadata storage. On packet reception,
the PMD puts the flag and value into the mbuf, and applications can see
that the packet was treated inside the flow engine according to the
appropriate RTE flow(s). MARK and FLAG act as a kind of gateway to transfer
per-packet information from the flow engine to the application via the
receiving datapath. There is also an item of type RTE_FLOW_ITEM_TYPE_MARK.
It allows us to extend the flow match pattern with the capability to match
the metadata values set by MARK/FLAG actions on other flows.

From the datapath point of view, MARK and FLAG are related to the
receiving side only. It would be useful to have the same gateway on the
transmitting side, and for this the item of type RTE_FLOW_ITEM_TYPE_META
was proposed. The application can fill a field in the mbuf, and this value
will be transferred to some field in the packet metadata inside the flow
engine. It did not matter whether these metadata fields were shared,
because the MARK and META items belonged to different domains (receiving
and transmitting) and could be vendor-specific.

So far, so good: DPDK offers some entities to control metadata inside
the flow engine and gateways to exchange these values on a per-packet basis
via datapaths.

As we can see, the MARK and META means are not symmetric: there is no
action which would allow us to set the META value on the transmitting path.
So, the action of type:

- RTE_FLOW_ACTION_TYPE_SET_META was proposed.

Next, applications raise new requirements for packet metadata.
The flow engines are getting more complex, internal switches are introduced,
and multiple ports might be supported within the same flow engine namespace.
From the DPDK point of view, this means packets might be sent on one
eth_dev port and received on another, with the packet path inside
the flow engine belonging entirely to the same hardware device. The simplest
example is SR-IOV with PF, VFs and the representors. This is a
good opportunity to provide an out-of-band channel to transfer
some extra data from one port to another, besides the packet data
itself, and applications would like to use this opportunity.

The application is expected to use trials (with rte_flow_validate)
to detect which metadata features (FLAG, MARK, META) are actually supported
by the PMD and underlying hardware. This might depend on PMD configuration,
system software, hardware settings, etc., and should be detected
at run time.
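
The trial approach can be sketched as a probe loop. This is a minimal
model, not application code from this patch: the demo_* names are
hypothetical, and the callback stands in for a wrapper that would build
a candidate rule per feature and pass it to rte_flow_validate(),
treating a zero return as "supported".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical identifiers for the features being probed. */
enum demo_feature { DEMO_FLAG, DEMO_MARK, DEMO_META, DEMO_FEATURE_MAX };

/* Returns 0 if a trial rule using the feature validates, negative if not. */
typedef int (*demo_validate_fn)(enum demo_feature f, void *ctx);

/* Probe every feature once and return a bitmask of the accepted ones. */
static unsigned int
demo_probe_features(demo_validate_fn validate, void *ctx)
{
	unsigned int supported = 0;
	int f;

	assert(validate != NULL);
	for (f = 0; f < DEMO_FEATURE_MAX; f++)
		if (validate((enum demo_feature)f, ctx) == 0)
			supported |= 1u << f;
	return supported;
}

/* Stand-in for a PMD that accepts MARK and META but rejects FLAG. */
static int
demo_pmd_validate(enum demo_feature f, void *ctx)
{
	(void)ctx;
	return (f == DEMO_MARK || f == DEMO_META) ? 0 : -1;
}
```

Probing once at start-up and caching the resulting mask keeps the trials
out of the datapath; the result may still differ per port and per
configuration, so it should not be hard-coded.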

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
v5: - addressed code style issues from comments
    - Tx metadata deprecation notice removed
      (dedicated tx_metadata patch is coming)
    - MBUF_DYNF_METADATA_NAME is split into FIELD and FLAG
      dedicated ones, RTE suffix is added
    - metadata historic retrospective is added to log message
    - rebased

v4: - http://patches.dpdk.org/patch/62065/
    - documentation comments addressed
    - deprecation notice for Tx metadata offload flag
    - rebased

v3: - http://patches.dpdk.org/patch/61902/
    - rebased, neat updates

v2: - http://patches.dpdk.org/patch/60909/

v1: - http://patches.dpdk.org/patch/56104/
    - rfc: http://patches.dpdk.org/patch/54271/


 app/test-pmd/cmdline_flow.c              | 57 ++++++++++++++++++-
 app/test-pmd/util.c                      |  5 ++
 doc/guides/prog_guide/rte_flow.rst       | 72 ++++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   |  8 +++
 lib/librte_ethdev/rte_ethdev_version.map |  3 +
 lib/librte_ethdev/rte_flow.c             | 40 +++++++++++++
 lib/librte_ethdev/rte_flow.h             | 98 +++++++++++++++++++++++++++++++-
 lib/librte_mbuf/rte_mbuf_dyn.h           |  8 ++-
 8 files changed, 271 insertions(+), 20 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 0d0bc0a..e4ef066 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -316,6 +316,9 @@ enum index {
 	ACTION_RAW_ENCAP_INDEX_VALUE,
 	ACTION_RAW_DECAP_INDEX,
 	ACTION_RAW_DECAP_INDEX_VALUE,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -1067,6 +1070,7 @@ struct parse_action_priv {
 	ACTION_DEC_TCP_ACK,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1265,6 +1269,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -1329,6 +1340,10 @@ static int parse_vc_action_raw_encap_index(struct context *,
 static int parse_vc_action_raw_decap_index(struct context *,
 					   const struct token *, const char *,
 					   unsigned int, void *, unsigned int);
+static int parse_vc_action_set_meta(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
+				    unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -3378,7 +3393,31 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.help = "index of raw_encap/raw_decap data",
 		.next = NEXT(next_item),
 		.call = parse_port,
-	}
+	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc_action_set_meta,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -4818,6 +4857,22 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return ret;
 }
 
+static int
+parse_vc_action_set_meta(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	ret = rte_flow_dynf_metadata_register();
+	if (ret < 0)
+		return -1;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index f20531d..56075b3 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,6 +82,11 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
+		if (ol_flags & PKT_TX_METADATA)
+			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_RX_DYNF_METADATA)
+			printf(" - Rx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (mb->packet_type) {
 			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
 			printf(" - hw ptype: %s", buf);
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 159ce19..c943aca 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -1232,21 +1258,6 @@ Matches a PPPoE session protocol identifier.
 - ``proto_id``: PPP protocol identifier.
 - Default ``mask`` matches proto_id only.
 
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+--------------------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Item: ``NSH``
 ^^^^^^^^^^^^^
 
@@ -2466,6 +2477,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
 
 Using this action on non-matching traffic will result in undefined behavior.
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by
+``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
+``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
+with the data.
+
+The mbuf dynamic field must be registered by calling
+``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index f6e90cb..8311021 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -237,6 +237,14 @@ New Features
   On supported NICs, we can now setup haipin queue which will offload packets
   from the wire, backto the wire.
 
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
+    PKT_RX_DYNF_METADATA.
+
 
 Removed Items
 -------------
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index 48b5389..e593f34 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -291,4 +291,7 @@ EXPERIMENTAL {
 	rte_eth_rx_hairpin_queue_setup;
 	rte_eth_tx_hairpin_queue_setup;
 	rte_eth_dev_hairpin_capability_get;
+	rte_flow_dynf_metadata_offs;
+	rte_flow_dynf_metadata_mask;
+	rte_flow_dynf_metadata_register;
 };
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index ca0f680..b0490cd 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -12,10 +12,18 @@
 #include <rte_errno.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include "rte_ethdev.h"
 #include "rte_flow_driver.h"
 #include "rte_flow.h"
 
+/* Mbuf dynamic field offset for metadata. */
+int rte_flow_dynf_metadata_offs = -1;
+
+/* Mbuf dynamic field flag mask for metadata. */
+uint64_t rte_flow_dynf_metadata_mask;
+
 /**
  * Flow elements description tables.
  */
@@ -157,8 +165,40 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
+	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
 };
 
+int
+rte_flow_dynf_metadata_register(void)
+{
+	int offset;
+	int flag;
+
+	static const struct rte_mbuf_dynfield desc_offs = {
+		.name = RTE_MBUF_DYNFIELD_METADATA_NAME,
+		.size = sizeof(uint32_t),
+		.align = __alignof__(uint32_t),
+	};
+	static const struct rte_mbuf_dynflag desc_flag = {
+		.name = RTE_MBUF_DYNFLAG_METADATA_NAME,
+	};
+
+	offset = rte_mbuf_dynfield_register(&desc_offs);
+	if (offset < 0)
+		goto error;
+	flag = rte_mbuf_dynflag_register(&desc_flag);
+	if (flag < 0)
+		goto error;
+	rte_flow_dynf_metadata_offs = offset;
+	rte_flow_dynf_metadata_mask = (1ULL << flag);
+	return 0;
+
+error:
+	rte_flow_dynf_metadata_offs = -1;
+	rte_flow_dynf_metadata_mask = 0ULL;
+	return -rte_errno;
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 4fee105..47b0220 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -28,6 +28,8 @@
 #include <rte_byteorder.h>
 #include <rte_esp.h>
 #include <rte_higig.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -418,7 +420,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1263,9 +1266,14 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * RTE_FLOW_ITEM_TYPE_META
  *
- * Matches a specified metadata value.
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
 	rte_be32_t data;
@@ -1942,6 +1950,13 @@ enum rte_flow_action_type {
 	 * undefined behavior.
 	 */
 	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2429,6 +2444,57 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * ingress, the metadata will be carried by mbuf metadata dynamic field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which have never
+ * been set, unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
+ * or may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	rte_be32_t data;
+	rte_be32_t mask;
+};
+
+/* Mbuf dynamic field offset for metadata. */
+extern int rte_flow_dynf_metadata_offs;
+
+/* Mbuf dynamic field flag mask for metadata. */
+extern uint64_t rte_flow_dynf_metadata_mask;
+
+/* Mbuf dynamic field pointer for metadata. */
+#define RTE_FLOW_DYNF_METADATA(m) \
+	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
+
+/* Mbuf dynamic flag for metadata. */
+#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+
+__rte_experimental
+static inline uint32_t
+rte_flow_dynf_metadata_get(struct rte_mbuf *m)
+{
+	return *RTE_FLOW_DYNF_METADATA(m);
+}
+
+__rte_experimental
+static inline void
+rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
+{
+	*RTE_FLOW_DYNF_METADATA(m) = v;
+}
+
 /*
  * Definition of a single action.
  *
@@ -2662,6 +2728,32 @@ enum rte_flow_conv_op {
 };
 
 /**
+ * Check if mbuf dynamic field for metadata is registered.
+ *
+ * @return
+ *   True if registered, false otherwise.
+ */
+__rte_experimental
+static inline int
+rte_flow_dynf_metadata_avail(void) {
+	return !!rte_flow_dynf_metadata_mask;
+}
+
+/**
+ * Register mbuf dynamic field and flag for metadata.
+ *
+ * This function must be called prior to use SET_META action in order to
+ * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
+ * application.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_dynf_metadata_register(void);
+
+/**
  * Check whether a flow rule can be created on a given port.
  *
  * The flow rule is validated for correctness and whether it could be accepted
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 2e9d418..de651c1 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -234,6 +234,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
 __rte_experimental
 void rte_mbuf_dyn_dump(FILE *out);
 
-/* Placeholder for dynamic fields and flags declarations. */
+/*
+ * Placeholder for dynamic fields and flags declarations.
+ * This is centralizing point to gather all field names
+ * and parameters together.
+ */
+#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
+#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
 
 #endif
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-29 18:30             ` Thomas Monjalon
  2019-10-29 18:35               ` Slava Ovsiienko
@ 2019-10-30  6:28               ` Andrew Rybchenko
  1 sibling, 0 replies; 98+ messages in thread
From: Andrew Rybchenko @ 2019-10-30  6:28 UTC (permalink / raw)
  To: Thomas Monjalon, Slava Ovsiienko; +Cc: dev, Matan Azrad, olivier.matz, Ori Kam

On 10/29/19 9:30 PM, Thomas Monjalon wrote:
> 29/10/2019 18:19, Slava Ovsiienko:
>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>>>> --- a/doc/guides/rel_notes/deprecation.rst
>>>> +++ b/doc/guides/rel_notes/deprecation.rst
>>>> +* ethdev: DEV_TX_OFFLOAD_MATCH_METADATA will be removed, static
>>>> +metadata
>>>> +  mbuf field will be removed in 20.02, metadata feature will use
>>>> +dynamic mbuf
>>>> +  field and flag instead.
>>>> +
>>> Isn't it breaking stable API/ABI? Should target release be 20.11?
>> tx_metadata is in union, it should not be ABI break.
>> And we propose to remove tx_metadata at all in 19.11
>> and share the dynamic metadata field between Rx and Tx METAdata.
> Yes please, let's remove tx_metadata from mbuf now,
> while adding metadata dynamic field.
> I am sure nobody will complain that we sanitized this feature
> before releasing DPDK 19.11 LTS.

OK for me.


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-29 17:19           ` Slava Ovsiienko
  2019-10-29 18:30             ` Thomas Monjalon
@ 2019-10-30  7:35             ` Andrew Rybchenko
  2019-10-30  8:59               ` Slava Ovsiienko
  2019-10-30 15:49               ` Olivier Matz
  1 sibling, 2 replies; 98+ messages in thread
From: Andrew Rybchenko @ 2019-10-30  7:35 UTC (permalink / raw)
  To: Slava Ovsiienko, Thomas Monjalon, olivier.matz
  Cc: dev, Matan Azrad, Ori Kam, Yongseok Koh

@Olivier, please, take a look at the end of the mail.

On 10/29/19 8:19 PM, Slava Ovsiienko wrote:
> Hi, Andrew
>
> Thank you for the review.
>
>> -----Original Message-----
>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>> Sent: Tuesday, October 29, 2019 18:22
>> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
>> Cc: Thomas Monjalon <thomas@monjalon.net>; Matan Azrad
>> <matan@mellanox.com>; olivier.matz@6wind.com; Ori Kam
>> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
>> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
>>
>> On 10/27/19 9:40 PM, Viacheslav Ovsiienko wrote:
>>> Currently, metadata can be set on egress path via mbuf tx_metadata
>>> field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
>> matches metadata.
>>> This patch extends the metadata feature usability.
>>>
>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>
>>> When supporting multiple tables, Tx metadata can also be set by a rule
>>> and matched by another rule. This new action allows metadata to be set
>>> as a result of flow match.
>>>
>>> 2) Metadata on ingress
>>>
>>> There's also need to support metadata on ingress. Metadata can be set
>>> by SET_META action and matched by META item like Tx. The final value
>>> set by the action will be delivered to application via metadata
>>> dynamic field of mbuf which can be accessed by
>> RTE_FLOW_DYNF_METADATA().
>>> PKT_RX_DYNF_METADATA flag will be set along with the data.
>>>
>>> The mbuf dynamic field must be registered by calling
>>> rte_flow_dynf_metadata_register() prior to use SET_META action.
>>>
>>> The availability of dynamic mbuf metadata field can be checked with
>>> rte_flow_dynf_metadata_avail() routine.
>>>
>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>> propagated to the other path depending on hardware capability.
>>>
>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
>> Above explanations lack information about "meta" vs "mark" which may be
>> set on Rx as well and delivered in other mbuf field.
>> It should be explained by one more field is required and rules defined.
> There is some story about metadata features.
> Initially, there were proposed two metadata related actions:
>
> - RTE_FLOW_ACTION_TYPE_FLAG
> - RTE_FLOW_ACTION_TYPE_MARK
>
> These actions set the special flag in the packet metadata, MARK action stores some
> specified value in the metadata storage, and, on the packet receiving PMD puts the flag
> and value to the mbuf and applications can see the packet was threated inside flow engine
> according to the appropriate RTE flow(s). MARK and FLAG are like some kind of gateway
> to transfer some per-packet information from the flow engine to the application
> via receiving datapath.
>
>  From the datapath point of view, the MARK and FLAG are related to the receiving side only.
> It would useful to have the same gateway on the transmitting side and there was the feature
> of type RTE_FLOW_ITEM_TYPE_META was proposed. The application can fill the field in mbuf
> and this value will be transferred to some field in the packet metadata inside the flow engine.
> It did not matter whether these metadata fields are shared because of MARK and META items
> belonged to different domains (receiving and transmitting) and could be vendor-specific.
>
> So far, so good, DPDK proposes some entities to control metadata inside the flow engine
> and gateways to exchange these values on a per-packet basis via datapaths.
>
> As we can see, the MARK and META means are not symmetric, there is absent action which
> would allow us to set META value on the transmitting path. So, the action of type:
> - RTE_FLOW_ACTION_TYPE_SET_META is proposed.
>
> The next, applications raise the new requirements for packet metadata. The flow engines are
> getting more complex, internal switches are introduced, multiple ports might be supported within
> the same flow engine namespace. From the DPDK points of view, it means the packets might be sent
> on one eth_dev port and received on the other one, and the packet path inside the flow engine entirely
> belongs to the same hardware device. The simplest example is SR-IOV with PF, VFs and the representors.
> And there is a brilliant opportunity to provide some out-of-band channel to transfer some extra data
>   from one port to another one, besides the packet data itself.
>
>
>> Above explanations lack information about "meta" vs "mark" which may be
>> set on Rx as well and delivered in other mbuf field.
>> It should be explained by one more field is required and rules defined.
>> Otherwise we can endup in half PMDs supporting mark only, half PMDs
>> supporting meta only and applications in an interesting situation to make a
>> choice which one to use.
> There is no "mark" vs "meta". MARK and META means are kept for compatibility issues
> and legacy part works exactly as before. The trials (with flow_validate)  is supposed
> to check whether PMD supports MARK or META feature on appropriate domain. It depends
> on PMD implementation, configuration and underlaying HW/FW/kernel capabilities and
> should be resolved in runtime.

The trials are a way, but a very tricky way. My imagination draws me
pictures of how application code could look in an attempt to use
either mark or meta for Rx only, and these pictures are not nice.
Maybe it will look acceptable when mark becomes a dynamic field,
since using either one or another dynamic field is definitely
easier than using either a fixed or a dynamic field.

Maybe a dynamic field for mark at a fixed offset should be
introduced in this release or in the near future? It would allow us
to preserve the ABI up to 20.11 and provide a future-proof API.
The trick is to register the dynamic meta field at a fixed offset
at the start of the day to be sure that it is guaranteed to succeed.
It sounds like a transition mechanism from fixed to
dynamic fields.

[snip]

>>> diff --git a/lib/librte_ethdev/rte_flow.h
>>> b/lib/librte_ethdev/rte_flow.h index 4fee105..b821557 100644
>>> --- a/lib/librte_ethdev/rte_flow.h
>>> +++ b/lib/librte_ethdev/rte_flow.h
>>> @@ -28,6 +28,8 @@
>>>    #include <rte_byteorder.h>
>>>    #include <rte_esp.h>
>>>    #include <rte_higig.h>
>>> +#include <rte_mbuf.h>
>>> +#include <rte_mbuf_dyn.h>
>>>
>>>    #ifdef __cplusplus
>>>    extern "C" {
>>> @@ -418,7 +420,8 @@ enum rte_flow_item_type {
>>>    	/**
>>>    	 * [META]
>>>    	 *
>>> -	 * Matches a metadata value specified in mbuf metadata field.
>>> +	 * Matches a metadata value.
>>> +	 *
>>>    	 * See struct rte_flow_item_meta.
>>>    	 */
>>>    	RTE_FLOW_ITEM_TYPE_META,
>>> @@ -1263,9 +1266,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
>>>    #endif
>>>
>>>    /**
>>> - * RTE_FLOW_ITEM_TYPE_META.
>>> + * @warning
>>> + * @b EXPERIMENTAL: this structure may change without prior notice
>> Is it allowed to make experimental back?
> I think we should remove EXPERIMENTAL here. We do not introduce new
> feature, but just extend the apply area.

Agreed.

>>>     *
>>> - * Matches a specified metadata value.
>>> + * RTE_FLOW_ITEM_TYPE_META
>>> + *
>>> + * Matches a specified metadata value. On egress, metadata can be set
>>> + either by
>>> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
>>> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
>>> + RTE_FLOW_ACTION_TYPE_SET_META sets
>>> + * metadata for a packet and the metadata will be reported via mbuf
>>> + metadata
>>> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
>>> + field must be
>>> + * registered in advance by rte_flow_dynf_metadata_register().
>>>     */
>>>    struct rte_flow_item_meta {
>>>    	rte_be32_t data;
>> [snip]
>>
>>> @@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
>>>    	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
>>>    };
>>>
>>> +/**
>>> + * @warning
>>> + * @b EXPERIMENTAL: this structure may change without prior notice
>>> + *
>>> + * RTE_FLOW_ACTION_TYPE_SET_META
>>> + *
>>> + * Set metadata. Metadata set by mbuf tx_metadata field with
>>> + * PKT_TX_METADATA flag on egress will be overridden by this action.
>>> +On
>>> + * ingress, the metadata will be carried by mbuf metadata dynamic
>>> +field
>>> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
>>> +must be
>>> + * registered in advance by rte_flow_dynf_metadata_register().
>>> + *
>>> + * Altering partial bits is supported with mask. For bits which have
>>> +never
>>> + * been set, unpredictable value will be seen depending on driver
>>> + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx
>>> +may
>>> + * or may not be propagated to the other path depending on HW
>> capability.
>>> + *
>>> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
>>> + */
>>> +struct rte_flow_action_set_meta {
>>> +	rte_be32_t data;
>>> +	rte_be32_t mask;
>> As I understand tx_metadata is host endian. Just double-checking.
>> Is a new dynamic field host endian or big endian?
>> I definitely would like to see motivation in comments why data/mask are big-
>> endian here.
> metadata is opaque value, endianness does not matter, there are no some
> special motivations for choosing endiannes. rte_flow_item_meta() structure
> provides data with rte_be32_t type, so meta related action does the same.

Endianness of meta in mbuf and flow API should match and it must be
documented. Endianness is important if a HW supports fewer bits, since
it gives a hint to the application to use LSB first if the bit space is
sufficient.
mark is defined as host-endian (uint32_t) and I think meta should be the
same. Otherwise it complicates even more either mark or meta usage
as discussed above.

Yes, I think that rte_flow_item_meta should be fixed since both
mark and tx_metadata are host-endian.

(it says nothing about HW interface which is vendor specific and
vendor PMDs should care about it)
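
To illustrate the LSB point, here is a small sketch in plain C. It
assumes a hypothetical device that carries only the low 16 bits of the
value as seen by the host; the demo_* helpers are illustrative and are
not part of DPDK (demo_cpu_to_be32 mimics what a cpu-to-big-endian
conversion does).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int
demo_host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/* Portable equivalent of a cpu-to-be32 conversion (hypothetical helper). */
static uint32_t
demo_cpu_to_be32(uint32_t v)
{
	const uint8_t b[4] = {
		(uint8_t)(v >> 24), (uint8_t)(v >> 16),
		(uint8_t)(v >> 8), (uint8_t)v,
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Does the value, read as a host integer, fit in the low 'bits' bits? */
static int
demo_fits_low_bits(uint32_t host_view, unsigned int bits)
{
	assert(bits > 0 && bits < 32);
	return (host_view >> bits) == 0;
}
```

A host-endian value of 1 always fits in the low 16 bits. The same value
stored as big-endian and then read as a host integer fits only on a
big-endian host; on a little-endian host the set bit lands in the top
byte, so "use LSB first" stops being a usable rule for the application.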

> I could assume the origin of selecting bigendian type was the endianness
> of metadata field in Tx descriptor of ConnectX NICs.
>
>>> +};
>>> +
>>> +/* Mbuf dynamic field offset for metadata. */ extern int
>>> +rte_flow_dynf_metadata_offs;
>>> +
>>> +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
>>> +rte_flow_dynf_metadata_mask;
>> These two global variables look frightening to me.
>> It does not look good to me.
> For me too. But we need the performance, these ones are
> intended for usage in datapath, any overhead is painful.

@Olivier, could you share your thoughts, please.

Andrew.


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v5] ethdev: extend flow metadata
  2019-10-29 19:31         ` [dpdk-dev] [PATCH v5] " Viacheslav Ovsiienko
@ 2019-10-30  8:02           ` Andrew Rybchenko
  2019-10-30 14:40             ` Slava Ovsiienko
  2019-10-30  8:35           ` Ori Kam
  2019-10-30 17:12           ` [dpdk-dev] [PATCH v6 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2 siblings, 1 reply; 98+ messages in thread
From: Andrew Rybchenko @ 2019-10-30  8:02 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, dev
  Cc: matan, rasland, thomas, olivier.matz, Yongseok Koh

On 10/29/19 10:31 PM, Viacheslav Ovsiienko wrote:
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
>
> This patch extends the metadata feature usability.
>
> 1) RTE_FLOW_ACTION_TYPE_SET_META
>
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
>
> 2) Metadata on ingress
>
> There's also need to support metadata on ingress. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via metadata dynamic field of
> mbuf which can be accessed by RTE_FLOW_DYNF_METADATA() macro or with
> rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> routines. PKT_RX_DYNF_METADATA flag will be set along with the data.
>
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to use SET_META action.
>
> The availability of dynamic mbuf metadata field can be checked
> with rte_flow_dynf_metadata_avail() routine.
>
> If application is going to engage the metadata feature it registers
> the metadata  dynamic fields, then PMD checks the metadata field
> availability and handles the appropriate fields in datapath.
>
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on hardware capability.
>
> MARK and METADATA look similar and might operate in similar way,
> but not interacting.
>
> Initially, there were proposed two metadata related actions:
>
> - RTE_FLOW_ACTION_TYPE_FLAG
> - RTE_FLOW_ACTION_TYPE_MARK
>
> These actions set the special flag in the packet metadata, MARK action
> stores some specified value in the metadata storage, and, on the packet
> receiving PMD puts the flag and value to the mbuf and applications can
> see the packet was treated inside flow engine according to the appropriate
> RTE flow(s). MARK and FLAG are like some kind of gateway to transfer some
> per-packet information from the flow engine to the application via
> receiving datapath. Also, there is the item of type RTE_FLOW_ITEM_TYPE_MARK
> provided. It allows us to extend the flow match pattern with the capability
> to match the metadata values set by MARK/FLAG actions on other flows.
>
> From the datapath point of view, the MARK and FLAG are related to the
> receiving side only. It would be useful to have the same gateway on the
> transmitting side, so the feature of type RTE_FLOW_ITEM_TYPE_META
> was proposed. The application can fill the field in mbuf and this value
> will be transferred to some field in the packet metadata inside the flow
> engine. It did not matter whether these metadata fields were shared, because
> MARK and META items belonged to different domains (receiving and
> transmitting) and could be vendor-specific.
>
> So far, so good, DPDK proposes some entities to control metadata inside
> the flow engine and gateways to exchange these values on a per-packet basis
> via datapaths.
>
> As we can see, the MARK and META means are not symmetric: there is no
> action which would allow us to set the META value on the transmitting path.
> So, the action of type:
>
> - RTE_FLOW_ACTION_TYPE_SET_META was proposed.
>
> Next, applications raise new requirements for packet metadata.
> The flow engines are getting more complex, internal switches are introduced,
> multiple ports might be supported within the same flow engine namespace.
> From the DPDK point of view, it means the packets might be sent on one
> eth_dev port and received on the other one, and the packet path inside
> the flow engine entirely belongs to the same hardware device. The simplest
> example is SR-IOV with PF, VFs and the representors. And there is a
> brilliant opportunity to provide some out-of-band channel to transfer
> some extra data from one port to another one, besides the packet data
> itself. And applications would like to use this opportunity.
>
> The application is expected to use trials (with rte_flow_validate)
> to detect which metadata features (FLAG, MARK, META) are actually supported
> by PMD and underlying hardware. It might depend on PMD configuration,
> system software, hardware settings, etc., and should be detected
> in run time.
>
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

[snip]

> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index 4fee105..47b0220 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h

[snip]

> @@ -2662,6 +2728,32 @@ enum rte_flow_conv_op {
>   };
>   
>   /**
> + * Check if mbuf dynamic field for metadata is registered.
> + *
> + * @return
> + *   True if registered, false otherwise.
> + */
> +__rte_experimental
> +static inline int
> +rte_flow_dynf_metadata_avail(void) {

Wrong placement of function open curly bracket.

> +	return !!rte_flow_dynf_metadata_mask;
> +}
> +
> +/**
> + * Register mbuf dynamic field and flag for metadata.
> + *
> + * This function must be called prior to use SET_META action in order to
> + * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
> + * application.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_dynf_metadata_register(void);
> +
> +/**
>    * Check whether a flow rule can be created on a given port.
>    *
>    * The flow rule is validated for correctness and whether it could be accepted
>


* Re: [dpdk-dev] [PATCH v5] ethdev: extend flow metadata
  2019-10-29 19:31         ` [dpdk-dev] [PATCH v5] " Viacheslav Ovsiienko
  2019-10-30  8:02           ` Andrew Rybchenko
@ 2019-10-30  8:35           ` Ori Kam
  2019-10-30 17:12           ` [dpdk-dev] [PATCH v6 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2 siblings, 0 replies; 98+ messages in thread
From: Ori Kam @ 2019-10-30  8:35 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Matan Azrad, Raslan Darawsheh, Thomas Monjalon, olivier.matz,
	arybchenko, Yongseok Koh



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Tuesday, October 29, 2019 9:32 PM
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> olivier.matz@6wind.com; arybchenko@solarflare.com; Yongseok Koh
> <yskoh@mellanox.com>
> Subject: [dpdk-dev] [PATCH v5] ethdev: extend flow metadata
> 
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches
> metadata.
> 
> This patch extends the metadata feature usability.
> 
> 1) RTE_FLOW_ACTION_TYPE_SET_META
> 
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
> 
> 2) Metadata on ingress
> 
> There's also need to support metadata on ingress. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via metadata dynamic field of
> mbuf which can be accessed by RTE_FLOW_DYNF_METADATA() macro or with
> rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> routines. PKT_RX_DYNF_METADATA flag will be set along with the data.
> 
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to use SET_META action.
> 
> The availability of dynamic mbuf metadata field can be checked
> with rte_flow_dynf_metadata_avail() routine.
> 
> If application is going to engage the metadata feature it registers
> the metadata  dynamic fields, then PMD checks the metadata field
> availability and handles the appropriate fields in datapath.
> 
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on hardware capability.
> 
> MARK and METADATA look similar and might operate in similar way,
> but not interacting.
> 
> Initially, there were proposed two metadata related actions:
> 
> - RTE_FLOW_ACTION_TYPE_FLAG
> - RTE_FLOW_ACTION_TYPE_MARK
> 
> These actions set the special flag in the packet metadata, MARK action
> stores some specified value in the metadata storage, and, on the packet
> receiving PMD puts the flag and value to the mbuf and applications can
> see the packet was treated inside flow engine according to the appropriate
> RTE flow(s). MARK and FLAG are like some kind of gateway to transfer some
> per-packet information from the flow engine to the application via
> receiving datapath. Also, there is the item of type
> RTE_FLOW_ITEM_TYPE_MARK
> provided. It allows us to extend the flow match pattern with the capability
> to match the metadata values set by MARK/FLAG actions on other flows.
> 
> From the datapath point of view, the MARK and FLAG are related to the
> receiving side only. It would be useful to have the same gateway on the
> transmitting side, so the feature of type RTE_FLOW_ITEM_TYPE_META
> was proposed. The application can fill the field in mbuf and this value
> will be transferred to some field in the packet metadata inside the flow
> engine. It did not matter whether these metadata fields were shared, because
> MARK and META items belonged to different domains (receiving and
> transmitting) and could be vendor-specific.
> 
> So far, so good, DPDK proposes some entities to control metadata inside
> the flow engine and gateways to exchange these values on a per-packet basis
> via datapaths.
> 
> As we can see, the MARK and META means are not symmetric: there is no
> action which would allow us to set the META value on the transmitting path.
> So, the action of type:
> 
> - RTE_FLOW_ACTION_TYPE_SET_META was proposed.
> 
> Next, applications raise new requirements for packet metadata.
> The flow engines are getting more complex, internal switches are introduced,
> multiple ports might be supported within the same flow engine namespace.
> From the DPDK point of view, it means the packets might be sent on one
> eth_dev port and received on the other one, and the packet path inside
> the flow engine entirely belongs to the same hardware device. The simplest
> example is SR-IOV with PF, VFs and the representors. And there is a
> brilliant opportunity to provide some out-of-band channel to transfer
> some extra data from one port to another one, besides the packet data
> itself. And applications would like to use this opportunity.
> 
> The application is expected to use trials (with rte_flow_validate)
> to detect which metadata features (FLAG, MARK, META) are actually supported
> by PMD and underlying hardware. It might depend on PMD configuration,
> system software, hardware settings, etc., and should be detected
> in run time.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> v5: - addressed code style issues from comments
>     - Tx metadata deprecation notice removed
>       (dedicated tx_metadata patch is coming)
>     - MBUF_DYNF_METADATA_NAME is splitted into FIELD and FLAG
>       dedicated ones, RTE suffix is added
>     - metadata historic retrospective is added to log message
>     - rebased
> 
> v4: - http://patches.dpdk.org/patch/62065/
>     - documentation comments addressed
>     - deprecation notice for Tx metadata offload flag
>     - rebased
> 
> v3: - http://patches.dpdk.org/patch/61902/
>     - rebased, neat updates
> 
> v2: - http://patches.dpdk.org/patch/60909/
> 
> v1: - http://patches.dpdk.org/patch/56104/
>     - rfc: http://patches.dpdk.org/patch/54271/
> 
> 

Acked-by: Ori Kam <orika@mellanox.com>
Thanks,
Ori



* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-30  7:35             ` Andrew Rybchenko
@ 2019-10-30  8:59               ` Slava Ovsiienko
  2019-10-30  9:20                 ` Andrew Rybchenko
  2019-10-30 10:03                 ` Slava Ovsiienko
  2019-10-30 15:49               ` Olivier Matz
  1 sibling, 2 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-30  8:59 UTC (permalink / raw)
  To: Andrew Rybchenko, Thomas Monjalon, olivier.matz
  Cc: dev, Matan Azrad, Ori Kam, Yongseok Koh

> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Wednesday, October 30, 2019 9:35
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; olivier.matz@6wind.com
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Ori Kam
> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> 
> @Olivier, please, take a look at the end of the mail.
> 
> On 10/29/19 8:19 PM, Slava Ovsiienko wrote:
> > Hi, Andrew
> >
> > Thank you for the review.
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <arybchenko@solarflare.com>
> >> Sent: Tuesday, October 29, 2019 18:22
> >> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> >> Cc: Thomas Monjalon <thomas@monjalon.net>; Matan Azrad
> >> <matan@mellanox.com>; olivier.matz@6wind.com; Ori Kam
> >> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> >> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> >>
> >> On 10/27/19 9:40 PM, Viacheslav Ovsiienko wrote:
> >>> Currently, metadata can be set on egress path via mbuf tx_metadata
> >>> field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
> >> matches metadata.
> >>> This patch extends the metadata feature usability.
> >>>
> >>> 1) RTE_FLOW_ACTION_TYPE_SET_META
> >>>
> >>> When supporting multiple tables, Tx metadata can also be set by a
> >>> rule and matched by another rule. This new action allows metadata to
> >>> be set as a result of flow match.
> >>>
> >>> 2) Metadata on ingress
> >>>
> >>> There's also need to support metadata on ingress. Metadata can be
> >>> set by SET_META action and matched by META item like Tx. The final
> >>> value set by the action will be delivered to application via
> >>> metadata dynamic field of mbuf which can be accessed by
> >> RTE_FLOW_DYNF_METADATA().
> >>> PKT_RX_DYNF_METADATA flag will be set along with the data.
> >>>
> >>> The mbuf dynamic field must be registered by calling
> >>> rte_flow_dynf_metadata_register() prior to use SET_META action.
> >>>
> >>> The availability of dynamic mbuf metadata field can be checked with
> >>> rte_flow_dynf_metadata_avail() routine.
> >>>
> >>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> >>> propagated to the other path depending on hardware capability.
> >>>
> >>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> >> Above explanations lack information about "meta" vs "mark" which may
> >> be set on Rx as well and delivered in other mbuf field.
> >> It should be explained by one more field is required and rules defined.
> > There is some story about metadata features.
> > Initially, there were proposed two metadata related actions:
> >
> > - RTE_FLOW_ACTION_TYPE_FLAG
> > - RTE_FLOW_ACTION_TYPE_MARK
> >
> > These actions set the special flag in the packet metadata, MARK action
> > stores some specified value in the metadata storage, and, on the
> > packet receiving PMD puts the flag and value to the mbuf and
> > applications can see the packet was treated inside flow engine
> > according to the appropriate RTE flow(s). MARK and FLAG are like some
> > kind of gateway to transfer some per-packet information from the flow
> engine to the application via receiving datapath.
> >
> >  From the datapath point of view, the MARK and FLAG are related to the
> receiving side only.
> > It would useful to have the same gateway on the transmitting side and
> > there was the feature of type RTE_FLOW_ITEM_TYPE_META was
> proposed.
> > The application can fill the field in mbuf and this value will be transferred to
> some field in the packet metadata inside the flow engine.
> > It did not matter whether these metadata fields are shared because of
> > MARK and META items belonged to different domains (receiving and
> transmitting) and could be vendor-specific.
> >
> > So far, so good, DPDK proposes some entities to control metadata
> > inside the flow engine and gateways to exchange these values on a per-
> packet basis via datapaths.
> >
> > As we can see, the MARK and META means are not symmetric, there is
> > absent action which would allow us to set META value on the transmitting
> path. So, the action of type:
> > - RTE_FLOW_ACTION_TYPE_SET_META is proposed.
> >
> > The next, applications raise the new requirements for packet metadata.
> > The flow engines are getting more complex, internal switches are
> > introduced, multiple ports might be supported within the same flow
> > engine namespace. From the DPDK points of view, it means the packets
> > might be sent on one eth_dev port and received on the other one, and the
> packet path inside the flow engine entirely belongs to the same hardware
> device. The simplest example is SR-IOV with PF, VFs and the representors.
> > And there is a brilliant opportunity to provide some out-of-band channel to
> transfer some extra data
> >   from one port to another one, besides the packet data itself.
> >
> >
> >> Above explanations lack information about "meta" vs "mark" which may
> >> be set on Rx as well and delivered in other mbuf field.
> >> It should be explained by one more field is required and rules defined.
> >> Otherwise we can endup in half PMDs supporting mark only, half PMDs
> >> supporting meta only and applications in an interesting situation to
> >> make a choice which one to use.
> > There is no "mark" vs "meta". MARK and META means are kept for
> > compatibility issues and legacy part works exactly as before. The
> > trials (with flow_validate)  is supposed to check whether PMD supports
> > MARK or META feature on appropriate domain. It depends on PMD
> > implementation, configuration and underlaying HW/FW/kernel capabilities
> and should be resolved in runtime.
> 
> The trials are a way, but a very tricky way. My imagination draws me pictures how
> an application code could look like in attempt to use either mark or meta for
> Rx only and these pictures are not nice.
Agree, trials are not the best way.
For now there is no application trying to choose between mark and meta, because
they belonged to different domains, and the extension is newly introduced.
So, the applications using mark will continue to use mark; the same goes for meta.
A new application will definitely ask for both of them;
we have the requirement from customers: "give us as many through bits as you can".
Such an application may just refuse to work if the metadata features are not detected,
because it relies on them strongly.
BTW, trials are not so complex: rte_flow_validate(mark), rte_flow_validate_metadata()
and that's it.
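A minimal sketch of such trials, to show the shape of the application code.
Everything here is hypothetical: the enum, the probe callback type and the
two stubs exist only for this example. In a real application the callback
would build a trial rule and pass it to rte_flow_validate().

```c
#include <stdint.h>

/* Feature bits used only by this sketch (not DPDK definitions). */
enum meta_feature {
	FEAT_MARK = 1 << 0,
	FEAT_META = 1 << 1,
};

/* Hypothetical probe: returns 0 if a trial rule using 'feat' would be
 * accepted on 'port_id', a negative value otherwise. */
typedef int (*probe_fn)(uint16_t port_id, enum meta_feature feat);

/* Run one trial per feature and collect the accepted ones as a bit set. */
static unsigned int
probe_meta_features(uint16_t port_id, probe_fn probe)
{
	unsigned int found = 0;

	if (probe(port_id, FEAT_MARK) == 0)
		found |= FEAT_MARK;
	if (probe(port_id, FEAT_META) == 0)
		found |= FEAT_META;
	return found;
}

/* Stub standing in for a PMD that accepts MARK only. */
static int
probe_mark_only(uint16_t port_id, enum meta_feature feat)
{
	(void)port_id;
	return feat == FEAT_MARK ? 0 : -1;
}

/* Stub standing in for a PMD that accepts both features. */
static int
probe_both(uint16_t port_id, enum meta_feature feat)
{
	(void)port_id;
	(void)feat;
	return 0;
}
```

An application that needs both can then compare the result with
(FEAT_MARK | FEAT_META) at startup and refuse to run otherwise.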

> May be it will look acceptable when mark becomes a dynamic since usage of
> either one or another dynamic field is definitely easier than usage of either
> fixed or dynamic field.
At least in PMD datapath it does not look very ugly.
> 
> May be dynamic field for mark at fixed offset should be introduced in the
> release or the nearest future? It will allow to preserve ABI up to 20.11 and
> provide future proof API.
> The trick is to register dynamic meta field at fixed offset at start of a day to
> be sure that it is guaranteed to succeed.
> It sounds like it is a transition mechanism from fixed to dynamic fields.
> 
> [snip]
> 
> >>> diff --git a/lib/librte_ethdev/rte_flow.h
> >>> b/lib/librte_ethdev/rte_flow.h index 4fee105..b821557 100644
> >>> --- a/lib/librte_ethdev/rte_flow.h
> >>> +++ b/lib/librte_ethdev/rte_flow.h
> >>> @@ -28,6 +28,8 @@
> >>>    #include <rte_byteorder.h>
> >>>    #include <rte_esp.h>
> >>>    #include <rte_higig.h>
> >>> +#include <rte_mbuf.h>
> >>> +#include <rte_mbuf_dyn.h>
> >>>
> >>>    #ifdef __cplusplus
> >>>    extern "C" {
> >>> @@ -418,7 +420,8 @@ enum rte_flow_item_type {
> >>>    	/**
> >>>    	 * [META]
> >>>    	 *
> >>> -	 * Matches a metadata value specified in mbuf metadata field.
> >>> +	 * Matches a metadata value.
> >>> +	 *
> >>>    	 * See struct rte_flow_item_meta.
> >>>    	 */
> >>>    	RTE_FLOW_ITEM_TYPE_META,
> >>> @@ -1263,9 +1266,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
> >>>    #endif
> >>>
> >>>    /**
> >>> - * RTE_FLOW_ITEM_TYPE_META.
> >>> + * @warning
> >>> + * @b EXPERIMENTAL: this structure may change without prior notice
> >> Is it allowed to make experimental back?
> > I think we should remove EXPERIMENTAL here. We do not introduce new
> > feature, but just extend the apply area.
> 
> Agreed.
> 
> >>>     *
> >>> - * Matches a specified metadata value.
> >>> + * RTE_FLOW_ITEM_TYPE_META
> >>> + *
> >>> + * Matches a specified metadata value. On egress, metadata can be set either by
> >>> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> >>> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> >>> + * metadata for a packet and the metadata will be reported via mbuf metadata
> >>> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> >>> + * registered in advance by rte_flow_dynf_metadata_register().
> >>>     */
> >>>    struct rte_flow_item_meta {
> >>>    	rte_be32_t data;
> >> [snip]
> >>
> >>> @@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
> >>>    	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
> >>>    };
> >>>
> >>> +/**
> >>> + * @warning
> >>> + * @b EXPERIMENTAL: this structure may change without prior notice
> >>> + *
> >>> + * RTE_FLOW_ACTION_TYPE_SET_META
> >>> + *
> >>> + * Set metadata. Metadata set by mbuf tx_metadata field with
> >>> + * PKT_TX_METADATA flag on egress will be overridden by this action. On
> >>> + * ingress, the metadata will be carried by mbuf metadata dynamic field
> >>> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
> >>> + * registered in advance by rte_flow_dynf_metadata_register().
> >>> + *
> >>> + * Altering partial bits is supported with mask. For bits which have never
> >>> + * been set, unpredictable value will be seen depending on driver
> >>> + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
> >>> + * or may not be propagated to the other path depending on HW capability.
> >>> + *
> >>> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> >>> + */
> >>> +struct rte_flow_action_set_meta {
> >>> +	rte_be32_t data;
> >>> +	rte_be32_t mask;
> >> As I understand tx_metadata is host endian. Just double-checking.
> >> Is a new dynamic field host endian or big endian?
> >> I definitely would like to see motivation in comments why data/mask
> >> are big- endian here.
> > metadata is opaque value, endianness does not matter, there are no
> > some special motivations for choosing endiannes. rte_flow_item_meta()
> > structure provides data with rte_be32_t type, so meta related action does
> the same.
> 
> Endianness of meta in mbuf and flow API should match and it must be
Yes, and they match (for meta), both are rte_be32_t. OK, will emphasize it in docs.

> documented. Endianness is important if a HW supports less bits since it
> makes a hit for application to use LSB first if the bit space is sufficient.
> mark is defined as host-endian (uint32_t) and I think meta should be the
> same. Otherwise it complicates even more either mark or meta usage as
> discussed above .
mark is mark, meta is meta; they are not interrelated (although they
are becoming similar). And there is no choice between mark and meta.
It is not obvious that we should have the same endianness for both.
There are some applications using these legacy features, so there might be
compatibility issues as well.

> 
> Yes, I think that rte_flow_item_meta should be fixed since both mark and
> tx_metadata are host-endian.
> 
> (it says nothing about HW interface which is vendor specific and vendor
> PMDs should care about it)
Handling this in PMD might introduce the extra endianness conversion in datapath,
impacting performance. Not nice, IMO.
> 
> > I could assume the origin of selecting bigendian type was the
> > endianness of metadata field in Tx descriptor of ConnectX NICs.
> >
> >>> +};
> >>> +
> >>> +/* Mbuf dynamic field offset for metadata. */
> >>> +extern int rte_flow_dynf_metadata_offs;
> >>> +
> >>> +/* Mbuf dynamic field flag mask for metadata. */
> >>> +extern uint64_t rte_flow_dynf_metadata_mask;
> >> These two global variables look frightening to me.
> >> It does not look good to me.
> > For me too. But we need the performance, these ones are intended for
> > usage in datapath, any overhead is painful.
> 
> @Olivier, could you share your thoughts, please.
> 
> Andrew.

With best regards, Slava



* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-30  8:59               ` Slava Ovsiienko
@ 2019-10-30  9:20                 ` Andrew Rybchenko
  2019-10-30 10:05                   ` Slava Ovsiienko
  2019-10-30 10:03                 ` Slava Ovsiienko
  1 sibling, 1 reply; 98+ messages in thread
From: Andrew Rybchenko @ 2019-10-30  9:20 UTC (permalink / raw)
  To: Slava Ovsiienko, Thomas Monjalon, olivier.matz
  Cc: dev, Matan Azrad, Ori Kam, Yongseok Koh

On 10/30/19 11:59 AM, Slava Ovsiienko wrote:
>> -----Original Message-----
>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>> Sent: Wednesday, October 30, 2019 9:35
>> To: Slava Ovsiienko <viacheslavo@mellanox.com>; Thomas Monjalon
>> <thomas@monjalon.net>; olivier.matz@6wind.com
>> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Ori Kam
>> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
>> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
>>
>> @Olivier, please, take a look at the end of the mail.
>>
>> On 10/29/19 8:19 PM, Slava Ovsiienko wrote:
>>> Hi, Andrew
>>>
>>> Thank you for the review.
>>>
>>>> -----Original Message-----
>>>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>>>> Sent: Tuesday, October 29, 2019 18:22
>>>> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
>>>> Cc: Thomas Monjalon <thomas@monjalon.net>; Matan Azrad
>>>> <matan@mellanox.com>; olivier.matz@6wind.com; Ori Kam
>>>> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
>>>> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
>>>>
>>>> On 10/27/19 9:40 PM, Viacheslav Ovsiienko wrote:
>>>>> Currently, metadata can be set on egress path via mbuf tx_metadata
>>>>> field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
>>>> matches metadata.
>>>>> This patch extends the metadata feature usability.
>>>>>
>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>>>
>>>>> When supporting multiple tables, Tx metadata can also be set by a
>>>>> rule and matched by another rule. This new action allows metadata to
>>>>> be set as a result of flow match.
>>>>>
>>>>> 2) Metadata on ingress
>>>>>
>>>>> There's also need to support metadata on ingress. Metadata can be
>>>>> set by SET_META action and matched by META item like Tx. The final
>>>>> value set by the action will be delivered to application via
>>>>> metadata dynamic field of mbuf which can be accessed by
>>>> RTE_FLOW_DYNF_METADATA().
>>>>> PKT_RX_DYNF_METADATA flag will be set along with the data.
>>>>>
>>>>> The mbuf dynamic field must be registered by calling
>>>>> rte_flow_dynf_metadata_register() prior to use SET_META action.
>>>>>
>>>>> The availability of dynamic mbuf metadata field can be checked with
>>>>> rte_flow_dynf_metadata_avail() routine.
>>>>>
>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>>>> propagated to the other path depending on hardware capability.
>>>>>
>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
>>>> Above explanations lack information about "meta" vs "mark" which may
>>>> be set on Rx as well and delivered in other mbuf field.
>>>> It should be explained by one more field is required and rules defined.
>>> There is some story about metadata features.
>>> Initially, there were proposed two metadata related actions:
>>>
>>> - RTE_FLOW_ACTION_TYPE_FLAG
>>> - RTE_FLOW_ACTION_TYPE_MARK
>>>
>>> These actions set the special flag in the packet metadata, MARK action
>>> stores some specified value in the metadata storage, and, on the
>>> packet receiving PMD puts the flag and value to the mbuf and
>>> applications can see the packet was treated inside flow engine
>>> according to the appropriate RTE flow(s). MARK and FLAG are like some
>>> kind of gateway to transfer some per-packet information from the flow
>> engine to the application via receiving datapath.
>>>   From the datapath point of view, the MARK and FLAG are related to the
>> receiving side only.
>>> It would useful to have the same gateway on the transmitting side and
>>> there was the feature of type RTE_FLOW_ITEM_TYPE_META was
>> proposed.
>>> The application can fill the field in mbuf and this value will be transferred to
>> some field in the packet metadata inside the flow engine.
>>> It did not matter whether these metadata fields are shared because of
>>> MARK and META items belonged to different domains (receiving and
>> transmitting) and could be vendor-specific.
>>> So far, so good, DPDK proposes some entities to control metadata
>>> inside the flow engine and gateways to exchange these values on a per-
>> packet basis via datapaths.
>>> As we can see, the MARK and META means are not symmetric, there is
>>> absent action which would allow us to set META value on the transmitting
>> path. So, the action of type:
>>> - RTE_FLOW_ACTION_TYPE_SET_META is proposed.
>>>
>>> The next, applications raise the new requirements for packet metadata.
>>> The flow engines are getting more complex, internal switches are
>>> introduced, multiple ports might be supported within the same flow
>>> engine namespace. From the DPDK points of view, it means the packets
>>> might be sent on one eth_dev port and received on the other one, and the
>> packet path inside the flow engine entirely belongs to the same hardware
>> device. The simplest example is SR-IOV with PF, VFs and the representors.
>>> And there is a brilliant opportunity to provide some out-of-band channel to
>> transfer some extra data
>>>    from one port to another one, besides the packet data itself.
>>>
>>>
>>>> Above explanations lack information about "meta" vs "mark" which may
>>>> be set on Rx as well and delivered in other mbuf field.
>>>> It should be explained why one more field is required and the rules defined.
>>>> Otherwise we can end up with half of the PMDs supporting mark only, half
>>>> supporting meta only, and applications in an interesting situation of
>>>> having to choose which one to use.
>>> There is no "mark" vs "meta". The MARK and META means are kept for
>>> compatibility, and the legacy part works exactly as before. The
>>> trials (with flow_validate) are supposed to check whether the PMD
>>> supports the MARK or META feature in the appropriate domain. This
>>> depends on the PMD implementation, configuration and underlying
>>> HW/FW/kernel capabilities, and should be resolved at runtime.
>>
>> Trials are a way, but a very tricky one. I can imagine what application
>> code would look like when trying to use either mark or meta for Rx only,
>> and the picture is not nice.
> Agreed, trials are not the best way.
> For now there is no application trying to choose between mark and meta,
> because they belonged to different domains and the extension is newly
> introduced. So applications using mark will continue to use mark, and the
> same goes for meta.
> A new application will definitely ask for both of them; we have
> requirements from customers: "give us as many through bits as you can".
> Such an application may simply refuse to work if the metadata features
> are not detected, because it relies on them strongly.
> BTW, trials are not so complex: rte_flow_validate(mark),
> rte_flow_validate_metadata() and that's it.

That assumes the pattern is supported and that the fate action used
together with mark is supported as well. Not that easy, but OK.

>> Maybe it will look acceptable when mark becomes a dynamic field, since
>> using either one or another dynamic field is definitely easier than
>> using either a fixed or a dynamic field.
> At least in PMD datapath it does not look very ugly.
>> Maybe a dynamic field for mark at a fixed offset should be introduced in
>> this release or in the nearest future? It would allow us to preserve the
>> ABI up to 20.11 and provide a future-proof API.
>> The trick is to register the dynamic meta field at a fixed offset at
>> start of day, so that registration is guaranteed to succeed.
>> It sounds like a transition mechanism from fixed to dynamic fields.
>>
>> [snip]
>>
>>>>> diff --git a/lib/librte_ethdev/rte_flow.h
>>>>> b/lib/librte_ethdev/rte_flow.h index 4fee105..b821557 100644
>>>>> --- a/lib/librte_ethdev/rte_flow.h
>>>>> +++ b/lib/librte_ethdev/rte_flow.h
>>>>> @@ -28,6 +28,8 @@
>>>>>     #include <rte_byteorder.h>
>>>>>     #include <rte_esp.h>
>>>>>     #include <rte_higig.h>
>>>>> +#include <rte_mbuf.h>
>>>>> +#include <rte_mbuf_dyn.h>
>>>>>
>>>>>     #ifdef __cplusplus
>>>>>     extern "C" {
>>>>> @@ -418,7 +420,8 @@ enum rte_flow_item_type {
>>>>>     	/**
>>>>>     	 * [META]
>>>>>     	 *
>>>>> -	 * Matches a metadata value specified in mbuf metadata field.
>>>>> +	 * Matches a metadata value.
>>>>> +	 *
>>>>>     	 * See struct rte_flow_item_meta.
>>>>>     	 */
>>>>>     	RTE_FLOW_ITEM_TYPE_META,
>>>>> @@ -1263,9 +1266,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
>>>>>     #endif
>>>>>
>>>>>     /**
>>>>> - * RTE_FLOW_ITEM_TYPE_META.
>>>>> + * @warning
>>>>> + * @b EXPERIMENTAL: this structure may change without prior notice
>>>> Is it allowed to make experimental back?
>>> I think we should remove EXPERIMENTAL here. We do not introduce a new
>>> feature, but just extend its area of application.
>> Agreed.
>>
>>>>>      *
>>>>> - * Matches a specified metadata value.
>>>>> + * RTE_FLOW_ITEM_TYPE_META
>>>>> + *
>>>>> + * Matches a specified metadata value. On egress, metadata can be set either by
>>>>> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
>>>>> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
>>>>> + * metadata for a packet and the metadata will be reported via mbuf metadata
>>>>> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
>>>>> + * registered in advance by rte_flow_dynf_metadata_register().
>>>>>      */
>>>>>     struct rte_flow_item_meta {
>>>>>     	rte_be32_t data;
>>>> [snip]
>>>>
>>>>> @@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
>>>>>     	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
>>>>>     };
>>>>>
>>>>> +/**
>>>>> + * @warning
>>>>> + * @b EXPERIMENTAL: this structure may change without prior notice
>>>>> + *
>>>>> + * RTE_FLOW_ACTION_TYPE_SET_META
>>>>> + *
>>>>> + * Set metadata. Metadata set by mbuf tx_metadata field with
>>>>> + * PKT_TX_METADATA flag on egress will be overridden by this action. On
>>>>> + * ingress, the metadata will be carried by mbuf metadata dynamic field
>>>>> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
>>>>> + * registered in advance by rte_flow_dynf_metadata_register().
>>>>> + *
>>>>> + * Altering partial bits is supported with mask. For bits which have never
>>>>> + * been set, unpredictable value will be seen depending on driver
>>>>> + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
>>>>> + * or may not be propagated to the other path depending on HW capability.
>>>>> + *
>>>>> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
>>>>> + */
>>>>> +struct rte_flow_action_set_meta {
>>>>> +	rte_be32_t data;
>>>>> +	rte_be32_t mask;
>>>> As I understand tx_metadata is host endian. Just double-checking.
>>>> Is a new dynamic field host endian or big endian?
>>>> I definitely would like to see motivation in comments why data/mask
>>>> are big-endian here.
>>> Metadata is an opaque value, so endianness does not matter; there was no
>>> special motivation for choosing the endianness. The rte_flow_item_meta
>>> structure provides data with rte_be32_t type, so the meta related action
>>> does the same.
>>
>> Endianness of meta in mbuf and flow API should match and it must be
> Yes, and they match (for meta), both are rte_be32_t. OK, will emphasize it in docs.
>
>> documented. Endianness is important if HW supports fewer bits, since it
>> gives the application a hint to use the LSBs first if the bit space is
>> sufficient. mark is defined as host-endian (uint32_t) and I think meta
>> should be the same. Otherwise it complicates mark or meta usage even
>> more, as discussed above.
> mark is mark, meta is meta; they are not interrelated (although they
> are becoming similar). And there is no choice between mark and meta.
> It is not obvious that we should have the same endianness for both.
> There are some applications using these legacy features, so there might
> be compatibility issues as well.

There are a few reasons above to make meta host-endian:
- match mark endianness (explained above, I still think that my reasons
   are valid)
- make it easier for an application to use it without endianness conversion
   if a sequence number is used to allocate metas (similar to mark in OVS)
   and the bit space is less than 32 bits

I see no single reason to keep it big-endian except a reason to keep it.

Since tx_metadata is going away and metadata was used for Tx only
before, it is a good moment to fix it.

>> Yes, I think that rte_flow_item_meta should be fixed since both mark and
>> tx_metadata are host-endian.
>>
>> (it says nothing about HW interface which is vendor specific and vendor
>> PMDs should care about it)
> Handling this in PMD might introduce the extra endianness conversion in datapath,
> impacting performance. Not nice, IMO.

These concerns should not affect the external interface, since the HW
interface could be different for different HW vendors.


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-30  8:59               ` Slava Ovsiienko
  2019-10-30  9:20                 ` Andrew Rybchenko
@ 2019-10-30 10:03                 ` Slava Ovsiienko
  1 sibling, 0 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-30 10:03 UTC (permalink / raw)
  To: Andrew Rybchenko, Thomas Monjalon, olivier.matz
  Cc: dev, Matan Azrad, Ori Kam, Yongseok Koh

> -----Original Message-----
> From: Slava Ovsiienko
> Sent: Wednesday, October 30, 2019 11:00
> To: Andrew Rybchenko <arybchenko@solarflare.com>; Thomas Monjalon
> <thomas@monjalon.net>; olivier.matz@6wind.com
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Ori Kam
> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> Subject: RE: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> 
> > -----Original Message-----
> > From: Andrew Rybchenko <arybchenko@solarflare.com>
> > Sent: Wednesday, October 30, 2019 9:35
> > To: Slava Ovsiienko <viacheslavo@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; olivier.matz@6wind.com
> > Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Ori Kam
> > <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> > Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> >
> > @Olivier, please, take a look at the end of the mail.
> >
> > On 10/29/19 8:19 PM, Slava Ovsiienko wrote:
> > > Hi, Andrew
> > >
> > > Thank you for the review.
> > >
> > >> -----Original Message-----
> > >> From: Andrew Rybchenko <arybchenko@solarflare.com>
> > >> Sent: Tuesday, October 29, 2019 18:22
> > >> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> > >> Cc: Thomas Monjalon <thomas@monjalon.net>; Matan Azrad
> > >> <matan@mellanox.com>; olivier.matz@6wind.com; Ori Kam
> > >> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> > >> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> > >>
> > >> On 10/27/19 9:40 PM, Viacheslav Ovsiienko wrote:
> > >>> Currently, metadata can be set on egress path via mbuf tx_metadata
> > >>> field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
> > >> matches metadata.
> > >>> This patch extends the metadata feature usability.
> > >>>
> > >>> 1) RTE_FLOW_ACTION_TYPE_SET_META
> > >>>
> > >>> When supporting multiple tables, Tx metadata can also be set by a
> > >>> rule and matched by another rule. This new action allows metadata
> > >>> to be set as a result of flow match.
> > >>>
> > >>> 2) Metadata on ingress
> > >>>
> > >>> There's also need to support metadata on ingress. Metadata can be
> > >>> set by SET_META action and matched by META item like Tx. The final
> > >>> value set by the action will be delivered to application via
> > >>> metadata dynamic field of mbuf which can be accessed by
> > >> RTE_FLOW_DYNF_METADATA().
> > >>> PKT_RX_DYNF_METADATA flag will be set along with the data.
> > >>>
> > >>> The mbuf dynamic field must be registered by calling
> > >>> rte_flow_dynf_metadata_register() prior to use SET_META action.
> > >>>
> > >>> The availability of dynamic mbuf metadata field can be checked
> > >>> with
> > >>> rte_flow_dynf_metadata_avail() routine.
> > >>>
> > >>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not
> > >>> be propagated to the other path depending on hardware capability.
> > >>>
> > >>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > >>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > >> Above explanations lack information about "meta" vs "mark" which
> > >> may be set on Rx as well and delivered in other mbuf field.
> > >> It should be explained by one more field is required and rules defined.
> > > There is some story about metadata features.
> > > Initially, there were proposed two metadata related actions:
> > >
> > > - RTE_FLOW_ACTION_TYPE_FLAG
> > > - RTE_FLOW_ACTION_TYPE_MARK
> > >
> > > These actions set a special flag in the packet metadata; the MARK
> > > action also stores a specified value in the metadata storage. On the
> > > receiving side, the PMD puts the flag and value into the mbuf, so
> > > applications can see that the packet was treated inside the flow
> > > engine according to the appropriate RTE flow(s). MARK and FLAG act as
> > > a kind of gateway that transfers per-packet information from the flow
> > > engine to the application via the receiving datapath.
> > >
> > > From the datapath point of view, MARK and FLAG are related to the
> > > receiving side only.
> > >
> > > It would be useful to have the same gateway on the transmitting side,
> > > so the feature of type RTE_FLOW_ITEM_TYPE_META was proposed. The
> > > application can fill a field in the mbuf, and this value is
> > > transferred to some field in the packet metadata inside the flow
> > > engine.
> > >
> > > It did not matter whether these metadata fields were shared, because
> > > the MARK and META items belonged to different domains (receiving and
> > > transmitting) and could be vendor-specific.
> > >
> > > So far, so good: DPDK provides some entities to control metadata
> > > inside the flow engine and gateways to exchange these values on a
> > > per-packet basis via the datapaths.
> > >
> > > As we can see, the MARK and META means are not symmetric: there is no
> > > action that would allow us to set the META value on the transmitting
> > > path. So, the action of type:
> > > - RTE_FLOW_ACTION_TYPE_SET_META is proposed.
> > >
> > > Next, applications raise new requirements for packet metadata. The
> > > flow engines are getting more complex, internal switches are
> > > introduced, and multiple ports might be supported within the same
> > > flow engine namespace. From the DPDK point of view, this means
> > > packets might be sent on one eth_dev port and received on another
> > > one, while the packet path inside the flow engine belongs entirely to
> > > the same hardware device. The simplest example is SR-IOV with PF, VFs
> > > and the representors. This is a good opportunity to provide an
> > > out-of-band channel to transfer some extra data from one port to
> > > another, besides the packet data itself.
> > >
> > >
> > >> Above explanations lack information about "meta" vs "mark" which
> > >> may be set on Rx as well and delivered in other mbuf field.
> > >> It should be explained by one more field is required and rules defined.
> > >> Otherwise we can endup in half PMDs supporting mark only, half PMDs
> > >> supporting meta only and applications in an interesting situation
> > >> to make a choice which one to use.
> > > There is no "mark" vs "meta". The MARK and META means are kept for
> > > compatibility, and the legacy part works exactly as before. The
> > > trials (with flow_validate) are supposed to check whether the PMD
> > > supports the MARK or META feature in the appropriate domain. This
> > > depends on the PMD implementation, configuration and underlying
> > > HW/FW/kernel capabilities, and should be resolved at runtime.
> >
> > Trials are a way, but a very tricky one. I can imagine what
> > application code would look like when trying to use either mark or
> > meta for Rx only, and the picture is not nice.
> Agreed, trials are not the best way.
> For now there is no application trying to choose between mark and meta,
> because they belonged to different domains and the extension is newly
> introduced. So applications using mark will continue to use mark, and
> the same goes for meta.
> A new application will definitely ask for both of them; we have
> requirements from customers: "give us as many through bits as you can".
> Such an application may simply refuse to work if the metadata features
> are not detected, because it relies on them strongly.
> BTW, trials are not so complex: rte_flow_validate(mark),
> rte_flow_validate_metadata() and that's it.
> 
> > May be it will look acceptable when mark becomes a dynamic since usage
> > of either one or another dynamic field is definitely easier than usage
> > of either fixed or dynamic field.
> At least in PMD datapath it does not look very ugly.
> >
> > May be dynamic field for mark at fixed offset should be introduced in
> > the release or the nearest future? It will allow to preserve ABI up to
> > 20.11 and provide future proof API.
> > The trick is to register dynamic meta field at fixed offset at start
> > of a day to be sure that it is guaranteed to succeed.
> > It sounds like it is a transition mechanism from fixed to dynamic fields.
> >
> > [snip]
> >
> > >>> diff --git a/lib/librte_ethdev/rte_flow.h
> > >>> b/lib/librte_ethdev/rte_flow.h index 4fee105..b821557 100644
> > >>> --- a/lib/librte_ethdev/rte_flow.h
> > >>> +++ b/lib/librte_ethdev/rte_flow.h
> > >>> @@ -28,6 +28,8 @@
> > >>>    #include <rte_byteorder.h>
> > >>>    #include <rte_esp.h>
> > >>>    #include <rte_higig.h>
> > >>> +#include <rte_mbuf.h>
> > >>> +#include <rte_mbuf_dyn.h>
> > >>>
> > >>>    #ifdef __cplusplus
> > >>>    extern "C" {
> > >>> @@ -418,7 +420,8 @@ enum rte_flow_item_type {
> > >>>    	/**
> > >>>    	 * [META]
> > >>>    	 *
> > >>> -	 * Matches a metadata value specified in mbuf metadata field.
> > >>> +	 * Matches a metadata value.
> > >>> +	 *
> > >>>    	 * See struct rte_flow_item_meta.
> > >>>    	 */
> > >>>    	RTE_FLOW_ITEM_TYPE_META,
> > >>> @@ -1263,9 +1266,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
> > >>>    #endif
> > >>>
> > >>>    /**
> > >>> - * RTE_FLOW_ITEM_TYPE_META.
> > >>> + * @warning
> > >>> + * @b EXPERIMENTAL: this structure may change without prior notice
> > >> Is it allowed to make experimental back?
> > > I think we should remove EXPERIMENTAL here. We do not introduce new
> > > feature, but just extend the apply area.
> >
> > Agreed.
> >
> > >>>     *
> > >>> - * Matches a specified metadata value.
> > >>> + * RTE_FLOW_ITEM_TYPE_META
> > >>> + *
> > >>> + * Matches a specified metadata value. On egress, metadata can be set either by
> > >>> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > >>> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> > >>> + * metadata for a packet and the metadata will be reported via mbuf metadata
> > >>> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> > >>> + * registered in advance by rte_flow_dynf_metadata_register().
> > >>>     */
> > >>>    struct rte_flow_item_meta {
> > >>>    	rte_be32_t data;
> > >> [snip]
> > >>
> > >>> @@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
> > >>>    	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
> > >>>    };
> > >>>
> > >>> +/**
> > >>> + * @warning
> > >>> + * @b EXPERIMENTAL: this structure may change without prior notice
> > >>> + *
> > >>> + * RTE_FLOW_ACTION_TYPE_SET_META
> > >>> + *
> > >>> + * Set metadata. Metadata set by mbuf tx_metadata field with
> > >>> + * PKT_TX_METADATA flag on egress will be overridden by this action. On
> > >>> + * ingress, the metadata will be carried by mbuf metadata dynamic field
> > >>> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
> > >>> + * registered in advance by rte_flow_dynf_metadata_register().
> > >>> + *
> > >>> + * Altering partial bits is supported with mask. For bits which have never
> > >>> + * been set, unpredictable value will be seen depending on driver
> > >>> + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
> > >>> + * or may not be propagated to the other path depending on HW capability.
> > >>> + *
> > >>> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > >>> + */
> > >>> +struct rte_flow_action_set_meta {
> > >>> +	rte_be32_t data;
> > >>> +	rte_be32_t mask;
> > >> As I understand tx_metadata is host endian. Just double-checking.
> > >> Is a new dynamic field host endian or big endian?
> > >> I definitely would like to see motivation in comments why data/mask
> > >> are big-endian here.
> > > Metadata is an opaque value, so endianness does not matter; there was
> > > no special motivation for choosing the endianness. The
> > > rte_flow_item_meta structure provides data with rte_be32_t type, so
> > > the meta related action does the same.
> >
> > Endianness of meta in mbuf and flow API should match and it must be
> Yes, and they match (for meta), both are rte_be32_t. OK, will emphasize it in
> docs.
> 
> > documented. Endianness is important if HW supports fewer bits, since
> > it gives the application a hint to use the LSBs first if the bit space
> > is sufficient. mark is defined as host-endian (uint32_t) and I think
> > meta should be the same. Otherwise it complicates mark or meta usage
> > even more, as discussed above.
> mark is mark, meta is meta; they are not interrelated (although they
> are becoming similar). And there is no choice between mark and meta.
> It is not obvious that we should have the same endianness for both.
> There are some applications using these legacy features, so there might
> be compatibility issues as well.
> 
> >
> > Yes, I think that rte_flow_item_meta should be fixed since both mark
> > and tx_metadata are host-endian.
> >
> > (it says nothing about HW interface which is vendor specific and
> > vendor PMDs should care about it)
> Handling this in PMD might introduce the extra endianness conversion in
> datapath, impacting performance. Not nice, IMO.

Update: I've reviewed the other [META] items/actions; all of them use
host endianness. OK, since we are moving tx_metadata to a dynamic field,
this seems to be a good point to alter the metadata endianness. And I see
a way to avoid conversions in the data path.

With best regards, Slava

> >
> > > I could assume the origin of selecting bigendian type was the
> > > endianness of metadata field in Tx descriptor of ConnectX NICs.
> > >
> > >>> +};
> > >>> +
> > >>> +/* Mbuf dynamic field offset for metadata. */
> > >>> +extern int rte_flow_dynf_metadata_offs;
> > >>> +
> > >>> +/* Mbuf dynamic field flag mask for metadata. */
> > >>> +extern uint64_t rte_flow_dynf_metadata_mask;
> > >> These two global variables look frightening to me.
> > >> It does not look good to me.
> > > For me too. But we need the performance, these ones are intended for
> > > usage in datapath, any overhead is painful.
> >
> > @Olivier, could you share your thoughts, please.
> >
> > Andrew.
> 
> With best regards, Slava


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-30  9:20                 ` Andrew Rybchenko
@ 2019-10-30 10:05                   ` Slava Ovsiienko
  0 siblings, 0 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-30 10:05 UTC (permalink / raw)
  To: Andrew Rybchenko, Thomas Monjalon, olivier.matz
  Cc: dev, Matan Azrad, Ori Kam, Yongseok Koh

> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Wednesday, October 30, 2019 11:20
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; olivier.matz@6wind.com
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Ori Kam
> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> 
> On 10/30/19 11:59 AM, Slava Ovsiienko wrote:
> >> -----Original Message-----
> >> From: Andrew Rybchenko <arybchenko@solarflare.com>
> >> Sent: Wednesday, October 30, 2019 9:35
> >> To: Slava Ovsiienko <viacheslavo@mellanox.com>; Thomas Monjalon
> >> <thomas@monjalon.net>; olivier.matz@6wind.com
> >> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Ori Kam
> >> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> >> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> >>
> >> @Olivier, please, take a look at the end of the mail.
> >>
> >> On 10/29/19 8:19 PM, Slava Ovsiienko wrote:
> >>> Hi, Andrew
> >>>
> >>> Thank you for the review.
> >>>
> >>>> -----Original Message-----
> >>>> From: Andrew Rybchenko <arybchenko@solarflare.com>
> >>>> Sent: Tuesday, October 29, 2019 18:22
> >>>> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> >>>> Cc: Thomas Monjalon <thomas@monjalon.net>; Matan Azrad
> >>>> <matan@mellanox.com>; olivier.matz@6wind.com; Ori Kam
> >>>> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> >>>> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> >>>>
> >>>> On 10/27/19 9:40 PM, Viacheslav Ovsiienko wrote:
> >>>>> Currently, metadata can be set on egress path via mbuf tx_metadata
> >>>>> field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
> >>>> matches metadata.
> >>>>> This patch extends the metadata feature usability.
> >>>>>
> >>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
> >>>>>
> >>>>> When supporting multiple tables, Tx metadata can also be set by a
> >>>>> rule and matched by another rule. This new action allows metadata
> >>>>> to be set as a result of flow match.
> >>>>>
> >>>>> 2) Metadata on ingress
> >>>>>
> >>>>> There's also need to support metadata on ingress. Metadata can be
> >>>>> set by SET_META action and matched by META item like Tx. The final
> >>>>> value set by the action will be delivered to application via
> >>>>> metadata dynamic field of mbuf which can be accessed by
> >>>> RTE_FLOW_DYNF_METADATA().
> >>>>> PKT_RX_DYNF_METADATA flag will be set along with the data.
> >>>>>
> >>>>> The mbuf dynamic field must be registered by calling
> >>>>> rte_flow_dynf_metadata_register() prior to use SET_META action.
> >>>>>
> >>>>> The availability of dynamic mbuf metadata field can be checked
> >>>>> with
> >>>>> rte_flow_dynf_metadata_avail() routine.
> >>>>>
> >>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not
> >>>>> be propagated to the other path depending on hardware capability.
> >>>>>
> >>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>>>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> >>>> Above explanations lack information about "meta" vs "mark" which
> >>>> may be set on Rx as well and delivered in other mbuf field.
> >>>> It should be explained by one more field is required and rules defined.
> >>> There is some story about metadata features.
> >>> Initially, there were proposed two metadata related actions:
> >>>
> >>> - RTE_FLOW_ACTION_TYPE_FLAG
> >>> - RTE_FLOW_ACTION_TYPE_MARK
> >>>
> >>> These actions set a special flag in the packet metadata; the MARK
> >>> action also stores a specified value in the metadata storage. On the
> >>> receiving side, the PMD puts the flag and value into the mbuf, so
> >>> applications can see that the packet was treated inside the flow
> >>> engine according to the appropriate RTE flow(s). MARK and FLAG act as
> >>> a kind of gateway that transfers per-packet information from the flow
> >>> engine to the application via the receiving datapath.
> >>>
> >>> From the datapath point of view, MARK and FLAG are related to the
> >>> receiving side only.
> >>>
> >>> It would be useful to have the same gateway on the transmitting side,
> >>> so the feature of type RTE_FLOW_ITEM_TYPE_META was proposed. The
> >>> application can fill a field in the mbuf, and this value is
> >>> transferred to some field in the packet metadata inside the flow
> >>> engine.
> >>>
> >>> It did not matter whether these metadata fields were shared, because
> >>> the MARK and META items belonged to different domains (receiving and
> >>> transmitting) and could be vendor-specific.
> >>>
> >>> So far, so good: DPDK provides some entities to control metadata
> >>> inside the flow engine and gateways to exchange these values on a
> >>> per-packet basis via the datapaths.
> >>>
> >>> As we can see, the MARK and META means are not symmetric: there is no
> >>> action that would allow us to set the META value on the transmitting
> >>> path. So, the action of type:
> >>> - RTE_FLOW_ACTION_TYPE_SET_META is proposed.
> >>>
> >>> Next, applications raise new requirements for packet metadata. The
> >>> flow engines are getting more complex, internal switches are
> >>> introduced, and multiple ports might be supported within the same
> >>> flow engine namespace. From the DPDK point of view, this means
> >>> packets might be sent on one eth_dev port and received on another
> >>> one, while the packet path inside the flow engine belongs entirely to
> >>> the same hardware device. The simplest example is SR-IOV with PF, VFs
> >>> and the representors. This is a good opportunity to provide an
> >>> out-of-band channel to transfer some extra data from one port to
> >>> another, besides the packet data itself.
> >>>
> >>>
> >>>> Above explanations lack information about "meta" vs "mark" which
> >>>> may be set on Rx as well and delivered in other mbuf field.
> >>>> It should be explained by one more field is required and rules defined.
> >>>> Otherwise we can endup in half PMDs supporting mark only, half PMDs
> >>>> supporting meta only and applications in an interesting situation
> >>>> to make a choice which one to use.
> >>> There is no "mark" vs "meta". The MARK and META means are kept for
> >>> compatibility, and the legacy part works exactly as before. The
> >>> trials (with flow_validate) are supposed to check whether the PMD
> >>> supports the MARK or META feature in the appropriate domain. This
> >>> depends on the PMD implementation, configuration and underlying
> >>> HW/FW/kernel capabilities, and should be resolved at runtime.
> >>
> >> Trials are a way, but a very tricky one. I can imagine what
> >> application code would look like when trying to use either mark or
> >> meta for Rx only, and the picture is not nice.
> > Agreed, trials are not the best way.
> > For now there is no application trying to choose between mark and
> > meta, because they belonged to different domains and the extension is
> > newly introduced. So applications using mark will continue to use
> > mark, and the same goes for meta.
> > A new application will definitely ask for both of them; we have
> > requirements from customers: "give us as many through bits as you can".
> > Such an application may simply refuse to work if the metadata features
> > are not detected, because it relies on them strongly.
> > BTW, trials are not so complex: rte_flow_validate(mark),
> > rte_flow_validate_metadata() and that's it.
> 
> That assumes the pattern is supported and that the fate action used
> together with mark is supported as well. Not that easy, but OK.
> 
> >> May be it will look acceptable when mark becomes a dynamic since
> >> usage of either one or another dynamic field is definitely easier
> >> than usage of either fixed or dynamic field.
> > At least in PMD datapath it does not look very ugly.
> >> May be dynamic field for mark at fixed offset should be introduced in
> >> the release or the nearest future? It will allow to preserve ABI up
> >> to 20.11 and provide future proof API.
> >> The trick is to register dynamic meta field at fixed offset at start
> >> of a day to be sure that it is guaranteed to succeed.
> >> It sounds like it is a transition mechanism from fixed to dynamic fields.
> >>
> >> [snip]
> >>
> >>>>> diff --git a/lib/librte_ethdev/rte_flow.h
> >>>>> b/lib/librte_ethdev/rte_flow.h index 4fee105..b821557 100644
> >>>>> --- a/lib/librte_ethdev/rte_flow.h
> >>>>> +++ b/lib/librte_ethdev/rte_flow.h
> >>>>> @@ -28,6 +28,8 @@
> >>>>>     #include <rte_byteorder.h>
> >>>>>     #include <rte_esp.h>
> >>>>>     #include <rte_higig.h>
> >>>>> +#include <rte_mbuf.h>
> >>>>> +#include <rte_mbuf_dyn.h>
> >>>>>
> >>>>>     #ifdef __cplusplus
> >>>>>     extern "C" {
> >>>>> @@ -418,7 +420,8 @@ enum rte_flow_item_type {
> >>>>>     	/**
> >>>>>     	 * [META]
> >>>>>     	 *
> >>>>> -	 * Matches a metadata value specified in mbuf metadata field.
> >>>>> +	 * Matches a metadata value.
> >>>>> +	 *
> >>>>>     	 * See struct rte_flow_item_meta.
> >>>>>     	 */
> >>>>>     	RTE_FLOW_ITEM_TYPE_META,
> >>>>> @@ -1263,9 +1266,17 @@ struct
> rte_flow_item_icmp6_nd_opt_tla_eth
> >> {
> >>>>>     #endif
> >>>>>
> >>>>>     /**
> >>>>> - * RTE_FLOW_ITEM_TYPE_META.
> >>>>> + * @warning
> >>>>> + * @b EXPERIMENTAL: this structure may change without prior
> >>>>> + notice
> >>>> Is it allowed to make experimental back?
> >>> I think we should remove EXPERIMENTAL here. We do not introduce
> new
> >>> feature, but just extend the apply area.
> >> Agreed.
> >>
> >>>>>      *
> >>>>> - * Matches a specified metadata value.
> >>>>> + * RTE_FLOW_ITEM_TYPE_META
> >>>>> + *
> >>>>> + * Matches a specified metadata value. On egress, metadata can be
> >>>>> + set either by
> >>>>> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> >>>>> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> >>>>> + RTE_FLOW_ACTION_TYPE_SET_META sets
> >>>>> + * metadata for a packet and the metadata will be reported via
> >>>>> + mbuf metadata
> >>>>> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic
> >> mbuf
> >>>>> + field must be
> >>>>> + * registered in advance by rte_flow_dynf_metadata_register().
> >>>>>      */
> >>>>>     struct rte_flow_item_meta {
> >>>>>     	rte_be32_t data;
> >>>> [snip]
> >>>>
> >>>>> @@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
> >>>>>     	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
> >>>>>     };
> >>>>>
> >>>>> +/**
> >>>>> + * @warning
> >>>>> + * @b EXPERIMENTAL: this structure may change without prior
> >>>>> +notice
> >>>>> + *
> >>>>> + * RTE_FLOW_ACTION_TYPE_SET_META
> >>>>> + *
> >>>>> + * Set metadata. Metadata set by mbuf tx_metadata field with
> >>>>> + * PKT_TX_METADATA flag on egress will be overridden by this
> action.
> >>>>> +On
> >>>>> + * ingress, the metadata will be carried by mbuf metadata dynamic
> >>>>> +field
> >>>>> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> >>>>> +must be
> >>>>> + * registered in advance by rte_flow_dynf_metadata_register().
> >>>>> + *
> >>>>> + * Altering partial bits is supported with mask. For bits which
> >>>>> +have never
> >>>>> + * been set, unpredictable value will be seen depending on driver
> >>>>> + * implementation. For loopback/hairpin packet, metadata set on
> >>>>> +Rx/Tx may
> >>>>> + * or may not be propagated to the other path depending on HW
> >>>> capability.
> >>>>> + *
> >>>>> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> >>>>> + */
> >>>>> +struct rte_flow_action_set_meta {
> >>>>> +	rte_be32_t data;
> >>>>> +	rte_be32_t mask;
> >>>> As I understand tx_metadata is host endian. Just double-checking.
> >>>> Is a new dynamic field host endian or big endian?
> >>>> I definitely would like to see motivation in comments why data/mask
> >>>> are big- endian here.
> >>> metadata is opaque value, endianness does not matter, there are no
> >>> some special motivations for choosing endiannes.
> >>> rte_flow_item_meta() structure provides data with rte_be32_t type,
> >>> so meta related action does
> >> the same.
> >>
> >> Endianness of meta in mbuf and flow API should match and it must be
> > Yes, and they match (for meta), both are rte_be32_t. OK, will emphasize it
> in docs.
> >
> >> documented. Endianness is important if a HW supports less bits since
> >> it makes a hit for application to use LSB first if the bit space is sufficient.
> >> mark is defined as host-endian (uint32_t) and I think meta should be
> >> the same. Otherwise it complicates even more either mark or meta
> >> usage as discussed above .
> > mark is mark, meta is meta, these ones are not interrelated (despite
> > they are becoming similar). And there is no choice between mark and meta.
> > It is not obvious we should have the same endianness for both.
> > There are some applications using this legacy features, so there might
> > be compatibility issues either.
> 
> There are few reasons above to make meta host endian:
> - match mark endianness (explained above, I still think that my reasons
> are valid)
> - make it easier for application to use it without endianness conversion
>    if a sequence number is used to allocate metas (similar to mark in OVS)
>    and bit space is less than 32-bits
> 
> I see no single reason to keep it big-endian except a reason to keep it.
> 
> Since tx_metadata is going away and metadata was used for Tx only before
> it is a good moment to fix it.
> 
> >> Yes, I think that rte_flow_item_meta should be fixed since both mark
> >> and tx_metadata are host-endian.
> >>
> >> (it says nothing about HW interface which is vendor specific and
> >> vendor PMDs should care about it)
> > Handling this in PMD might introduce the extra endianness conversion
> > in datapath, impacting performance. Not nice, IMO.
> 
> These concerns should not affect external interface since it could be
> different for different HW vendors.

OK, I will change the metadata endianness to host order. There is an unbeatable
argument: the other [META] items/actions are all host-endian (HE).
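
To illustrate the point about sequentially allocated IDs and a limited bit
space, here is a minimal sketch in plain C. The helper names are hypothetical
and are not part of any DPDK API; the sketch only assumes that an application
wants to check whether an ID fits into the number of bits the HW supports.
With a host-endian value the check is a plain shift, while a big-endian value
has to be converted first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h> /* htonl(), ntohl() */

/* Hypothetical helper: true if a host-endian ID fits in the low `bits` bits. */
bool id_fits_host(uint32_t id, unsigned int bits)
{
	assert(bits >= 1 && bits <= 32);
	/* Shifting a 32-bit value by 32 is undefined, so handle it apart. */
	return bits == 32 || (id >> bits) == 0;
}

/* Same check for an ID stored big-endian: it must be converted first. */
bool id_fits_be(uint32_t id_be, unsigned int bits)
{
	return id_fits_host(ntohl(id_be), bits);
}
```

The extra ntohl() in the second helper is the conversion the application
would have to carry around if meta stayed big-endian.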

With best regards, Slava

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v5] ethdev: extend flow metadata
  2019-10-30  8:02           ` Andrew Rybchenko
@ 2019-10-30 14:40             ` Slava Ovsiienko
  2019-10-30 14:46               ` Slava Ovsiienko
  0 siblings, 1 reply; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-30 14:40 UTC (permalink / raw)
  To: Andrew Rybchenko, dev
  Cc: Matan Azrad, Raslan Darawsheh, Thomas Monjalon, olivier.matz

> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Wednesday, October 30, 2019 10:02
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> olivier.matz@6wind.com; Yongseok Koh <yskoh@mellanox.com>
> Subject: Re: [PATCH v5] ethdev: extend flow metadata
> 
> On 10/29/19 10:31 PM, Viacheslav Ovsiienko wrote:
> > Currently, metadata can be set on egress path via mbuf tx_metadata
> > field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
> matches metadata.
> >
> > This patch extends the metadata feature usability.
> >
> > 1) RTE_FLOW_ACTION_TYPE_SET_META
> >
> > When supporting multiple tables, Tx metadata can also be set by a rule
> > and matched by another rule. This new action allows metadata to be set
> > as a result of flow match.
> >
> > 2) Metadata on ingress
> >
> > There's also need to support metadata on ingress. Metadata can be set
> > by SET_META action and matched by META item like Tx. The final value
> > set by the action will be delivered to application via metadata
> > dynamic field of mbuf which can be accessed by
> > RTE_FLOW_DYNF_METADATA() macro or with
> > rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> > routines. PKT_RX_DYNF_METADATA flag will be set along with the data.

We have a problem with PKT_RX_DYNF_METADATA/PKT_TX_DYNF_METADATA.
Both reference the global variable "rte_flow_dynf_metadata_mask",
and there are routines in rte_mbuf.c (such as rte_get_rx_ol_flag_list) that
print the names of flags. It would not be good if rte_mbuf.c had to include
rte_flow.h and link against rte_flow.c (not all apps use rte_flow or even ethdev).

What do you think? Should we rename the rte_flow_dynf_xxxxx variables
to rte_mbuf_dynf_flow_metadata_xxxx and move them into rte_mbuf_dyn.c?
The same goes for the PKT_RX_DYNF_METADATA definition: is it a candidate to move
to rte_mbuf_dyn.h? That would avoid linking or including rte_flow.c/h in
rte_mbuf.c.
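
A minimal sketch of the dependency, as a self-contained mock. All MOCK_/mock_
names are hypothetical stand-ins and not the real definitions; the sketch
assumes only that the dynamic flag expands to a read of a global variable,
while a static flag is a compile-time constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a variable owned by the flow library; 0 until registered. */
uint64_t mock_flow_dynf_metadata_mask;

/* The dynamic flag is not a constant: it is a read of that variable. */
#define MOCK_PKT_RX_DYNF_METADATA (mock_flow_dynf_metadata_mask)

/* A static flag, by contrast, is a compile-time constant. */
#define MOCK_PKT_RX_VLAN (1ULL << 0)

/* Hypothetical helper that records which bit the dynamic flag received. */
void mock_set_metadata_flag_bit(unsigned int bitnum)
{
	assert(bitnum < 64);
	mock_flow_dynf_metadata_mask = 1ULL << bitnum;
}

/* Stand-in for a flag-name routine that would live in the mbuf library. */
const char *mock_get_rx_ol_flag_name(uint64_t mask)
{
	if (mask == MOCK_PKT_RX_VLAN)
		return "PKT_RX_VLAN";
	/* This comparison needs the flow library's symbol at link time. */
	if (mask != 0 && mask == MOCK_PKT_RX_DYNF_METADATA)
		return "PKT_RX_DYNF_METADATA";
	return NULL;
}
```

If the variable and the name routine lived in different libraries, the second
comparison is what would force one to link against the other.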

With best regards, Slava


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v5] ethdev: extend flow metadata
  2019-10-30 14:40             ` Slava Ovsiienko
@ 2019-10-30 14:46               ` Slava Ovsiienko
  2019-10-30 15:20                 ` Olivier Matz
  0 siblings, 1 reply; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-30 14:46 UTC (permalink / raw)
  To: Andrew Rybchenko, dev
  Cc: Matan Azrad, Raslan Darawsheh, Thomas Monjalon, olivier.matz

> -----Original Message-----
> From: Slava Ovsiienko
> Sent: Wednesday, October 30, 2019 16:41
> To: Andrew Rybchenko <arybchenko@solarflare.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> olivier.matz@6wind.com
> Subject: RE: [PATCH v5] ethdev: extend flow metadata
> 
> > -----Original Message-----
> > From: Andrew Rybchenko <arybchenko@solarflare.com>
> > Sent: Wednesday, October 30, 2019 10:02
> > To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> > Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> > olivier.matz@6wind.com; Yongseok Koh <yskoh@mellanox.com>
> > Subject: Re: [PATCH v5] ethdev: extend flow metadata
> >
> > On 10/29/19 10:31 PM, Viacheslav Ovsiienko wrote:
> > > Currently, metadata can be set on egress path via mbuf tx_metadata
> > > field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
> > matches metadata.
> > >
> > > This patch extends the metadata feature usability.
> > >
> > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > >
> > > When supporting multiple tables, Tx metadata can also be set by a
> > > rule and matched by another rule. This new action allows metadata to
> > > be set as a result of flow match.
> > >
> > > 2) Metadata on ingress
> > >
> > > There's also need to support metadata on ingress. Metadata can be
> > > set by SET_META action and matched by META item like Tx. The final
> > > value set by the action will be delivered to application via
> > > metadata dynamic field of mbuf which can be accessed by
> > > RTE_FLOW_DYNF_METADATA() macro or with
> > > rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> > > routines. PKT_RX_DYNF_METADATA flag will be set along with the data.
> 
> We have a problem with PKT_RX_DYNF_METADATA/
> PKT_TX_DYNF_METADATA.
> These ones are referencing to global variable
> "rte_flow_dynf_metadata_mask".
> And there are routines in rte_mbuf.c  (rte_get_rx_ol_flag_list) which show
> the names of flags. It is not good if rte_mbuf.c would require including of
> rte_flow.h and  linking rte_flow.c (not all apps use rte_flow or even ethdev).
> 
> What do you think? Should we rename rte_flow_dynf_xxxxx variables to
> rte_mbuf_dynf_flow_metadata_xxxx and put ones into the  rte_mbuf_dyn.c?
> The same about PKT_RX_DYNF_METADATA definition, is it candidate to
> move to rte_mbuf_dyn.h ? It would allow not to link or include rte_flow.c/h
> into rte_mbuf.c
> 

It is interesting to note that although the metadata field looks related to rte_flow,
there is no reference to this field or its flags inside the rte_flow API implementation.
Only the datapath references this field. Metadata is a gateway between the flow HW space
and the datapath, and it belongs mostly on the datapath side, not on the rte_flow side.

With best regards, Slava

 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v5] ethdev: extend flow metadata
  2019-10-30 14:46               ` Slava Ovsiienko
@ 2019-10-30 15:20                 ` Olivier Matz
  2019-10-30 15:57                   ` Thomas Monjalon
  2019-10-30 15:58                   ` Slava Ovsiienko
  0 siblings, 2 replies; 98+ messages in thread
From: Olivier Matz @ 2019-10-30 15:20 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: Andrew Rybchenko, dev, Matan Azrad, Raslan Darawsheh, Thomas Monjalon

Hi,

On Wed, Oct 30, 2019 at 02:46:06PM +0000, Slava Ovsiienko wrote:
> > -----Original Message-----
> > From: Slava Ovsiienko
> > Sent: Wednesday, October 30, 2019 16:41
> > To: Andrew Rybchenko <arybchenko@solarflare.com>; dev@dpdk.org
> > Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> > olivier.matz@6wind.com
> > Subject: RE: [PATCH v5] ethdev: extend flow metadata
> > 
> > > -----Original Message-----
> > > From: Andrew Rybchenko <arybchenko@solarflare.com>
> > > Sent: Wednesday, October 30, 2019 10:02
> > > To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> > > Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > > <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> > > olivier.matz@6wind.com; Yongseok Koh <yskoh@mellanox.com>
> > > Subject: Re: [PATCH v5] ethdev: extend flow metadata
> > >
> > > On 10/29/19 10:31 PM, Viacheslav Ovsiienko wrote:
> > > > Currently, metadata can be set on egress path via mbuf tx_metadata
> > > > field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
> > > matches metadata.
> > > >
> > > > This patch extends the metadata feature usability.
> > > >
> > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > >
> > > > When supporting multiple tables, Tx metadata can also be set by a
> > > > rule and matched by another rule. This new action allows metadata to
> > > > be set as a result of flow match.
> > > >
> > > > 2) Metadata on ingress
> > > >
> > > > There's also need to support metadata on ingress. Metadata can be
> > > > set by SET_META action and matched by META item like Tx. The final
> > > > value set by the action will be delivered to application via
> > > > metadata dynamic field of mbuf which can be accessed by
> > > > RTE_FLOW_DYNF_METADATA() macro or with
> > > > rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> > > > routines. PKT_RX_DYNF_METADATA flag will be set along with the data.
> > 
> > We have a problem with PKT_RX_DYNF_METADATA/
> > PKT_TX_DYNF_METADATA.
> > These ones are referencing to global variable
> > "rte_flow_dynf_metadata_mask".
> > And there are routines in rte_mbuf.c  (rte_get_rx_ol_flag_list) which show
> > the names of flags. It is not good if rte_mbuf.c would require including of
> > rte_flow.h and  linking rte_flow.c (not all apps use rte_flow or even ethdev).
> > 
> > What do you think? Should we rename rte_flow_dynf_xxxxx variables to
> > rte_mbuf_dynf_flow_metadata_xxxx and put ones into the  rte_mbuf_dyn.c?
> > The same about PKT_RX_DYNF_METADATA definition, is it candidate to
> > move to rte_mbuf_dyn.h ? It would allow not to link or include rte_flow.c/h
> > into rte_mbuf.c
> > 

In rte_mbuf_dyn.c, we maintain a list of registered flags. I think it
wouldn't be too difficult to introduce the equivalent of
rte_get_*_ol_flag_list() and *rte_get_*_ol_flag_name() for dynamic
flags. There is already a dump function (which does both fields and
flags), and a lookup by name function.

Maybe we could split the dump into fields and flags, and add a lookup by
offset/bitnum. Would it work for your use-case?
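
To make the idea concrete, here is a rough sketch of a flag registry with a
lookup by name and a lookup by bit number. It is a standalone mock: the
mock_* names and the 64-entry table are assumptions made for illustration and
do not reflect how rte_mbuf_dyn.c is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_MAX_DYNFLAGS 64

/* Index is the bit number, value is the flag name (NULL means free). */
static const char *mock_dynflag_names[MOCK_MAX_DYNFLAGS];

/* Lookup by name: return the bit number, or -1 if not registered. */
int mock_dynflag_lookup(const char *name)
{
	for (int bit = 0; bit < MOCK_MAX_DYNFLAGS; bit++)
		if (mock_dynflag_names[bit] != NULL &&
		    strcmp(mock_dynflag_names[bit], name) == 0)
			return bit;
	return -1;
}

/*
 * Register a flag: return the existing bit if the name is already known,
 * else the first free bit, or -1 if the table is full.
 * The caller must keep `name` alive (a string literal is fine).
 */
int mock_dynflag_register(const char *name)
{
	assert(name != NULL);
	int bit = mock_dynflag_lookup(name);
	if (bit >= 0)
		return bit;
	for (bit = 0; bit < MOCK_MAX_DYNFLAGS; bit++) {
		if (mock_dynflag_names[bit] == NULL) {
			mock_dynflag_names[bit] = name;
			return bit;
		}
	}
	return -1;
}

/* Lookup by bit number: return the name, or NULL if the bit is free. */
const char *mock_dynflag_name_by_bit(unsigned int bit)
{
	return bit < MOCK_MAX_DYNFLAGS ? mock_dynflag_names[bit] : NULL;
}
```

A flag-list routine could then walk the set bits of ol_flags and ask the
registry for each name, without knowing who registered the flag.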

> It is interesting to note that despite metadata field looks to be related to rte_flow,
> there is no any reference to this field or flags inside rte_flow API implementation.
> Only datapath references this field. Metadata is gateway between flow HW space and datapath,
> it tends to be mostly on datapath side not on rte_flow.

Yes, only the registration of the field is related to rte_flow. But I
don't get where you are going with this. Is it a problem?


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-30  7:35             ` Andrew Rybchenko
  2019-10-30  8:59               ` Slava Ovsiienko
@ 2019-10-30 15:49               ` Olivier Matz
  2019-10-31  9:25                 ` Andrew Rybchenko
  1 sibling, 1 reply; 98+ messages in thread
From: Olivier Matz @ 2019-10-30 15:49 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: Slava Ovsiienko, Thomas Monjalon, dev, Matan Azrad, Ori Kam,
	Yongseok Koh

Hi,

On Wed, Oct 30, 2019 at 10:35:16AM +0300, Andrew Rybchenko wrote:
> @Olivier, please, take a look at the end of the mail.
> 

(...)

> On 10/29/19 8:19 PM, Slava Ovsiienko wrote:
> > > > +};
> > > > +
> > > > +/* Mbuf dynamic field offset for metadata. */ extern int
> > > > +rte_flow_dynf_metadata_offs;
> > > > +
> > > > +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > > > +rte_flow_dynf_metadata_mask;
> > > These two global variables look frightening to me.
> > > It does not look good to me.
> > For me too. But we need the performance, these ones are
> > intended for usage in datapath, any overhead is painful.
> 
> @Olivier, could you share your thoughts, please.

Having a global variable looks unavoidable to me, if we want
performance.

An alternative would be to use static global variables in every file that
uses this dynamic field, and call the register method from it. But I
don't think it would scale if a dynamic field is widely used.

Why does it look frightening to you?

The constraint is that before using this variable, the register function
has to be called. I don't think there are race conditions, because the
field/flag registration is lock protected and always returns the same
value.
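
As a rough illustration of that pattern, here is a standalone mock in C11.
None of the mock_* names are real DPDK symbols, and the spinlock built on
atomic_flag is only an assumption standing in for whatever lock the real
registration uses. The point is that registration is idempotent, and the
datapath accessors only load a global offset.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Mock packet buffer with spare room for dynamic fields. */
struct mock_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[16];
};

/* Global offset read by the datapath; -1 means "not registered yet". */
int mock_dynf_metadata_offs = -1;

static atomic_flag mock_reg_lock = ATOMIC_FLAG_INIT;
static int mock_next_free; /* next free byte in dyn_area */

/* First call reserves 4 bytes; later calls return the same offset. */
int mock_dynf_metadata_register(void)
{
	while (atomic_flag_test_and_set(&mock_reg_lock))
		; /* spin until the lock is free */
	if (mock_dynf_metadata_offs < 0) {
		mock_dynf_metadata_offs = mock_next_free;
		mock_next_free += (int)sizeof(uint32_t);
	}
	int offs = mock_dynf_metadata_offs;
	atomic_flag_clear(&mock_reg_lock);
	return offs;
}

/* Datapath setter: no lookup, only a load of the global offset. */
void mock_dynf_metadata_set(struct mock_mbuf *m, uint32_t v)
{
	assert(mock_dynf_metadata_offs >= 0); /* register first */
	memcpy(m->dyn_area + mock_dynf_metadata_offs, &v, sizeof(v));
}

/* Datapath getter, same constraint as the setter. */
uint32_t mock_dynf_metadata_get(const struct mock_mbuf *m)
{
	uint32_t v;

	assert(mock_dynf_metadata_offs >= 0); /* register first */
	memcpy(&v, m->dyn_area + mock_dynf_metadata_offs, sizeof(v));
	return v;
}
```

The sketch assumes registration completes before any datapath thread starts,
which matches the constraint stated above.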


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v5] ethdev: extend flow metadata
  2019-10-30 15:20                 ` Olivier Matz
@ 2019-10-30 15:57                   ` Thomas Monjalon
  2019-10-30 15:58                   ` Slava Ovsiienko
  1 sibling, 0 replies; 98+ messages in thread
From: Thomas Monjalon @ 2019-10-30 15:57 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: Olivier Matz, Andrew Rybchenko, dev, Matan Azrad, Raslan Darawsheh

30/10/2019 16:20, Olivier Matz:
> > From: Slava Ovsiienko
> > > > On 10/29/19 10:31 PM, Viacheslav Ovsiienko wrote:
> > > > > Currently, metadata can be set on egress path via mbuf tx_metadata
> > > > > field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
> > > > matches metadata.
> > > > >
> > > > > This patch extends the metadata feature usability.
> > > > >
> > > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > > >
> > > > > When supporting multiple tables, Tx metadata can also be set by a
> > > > > rule and matched by another rule. This new action allows metadata to
> > > > > be set as a result of flow match.
> > > > >
> > > > > 2) Metadata on ingress
> > > > >
> > > > > There's also need to support metadata on ingress. Metadata can be
> > > > > set by SET_META action and matched by META item like Tx. The final
> > > > > value set by the action will be delivered to application via
> > > > > metadata dynamic field of mbuf which can be accessed by
> > > > > RTE_FLOW_DYNF_METADATA() macro or with
> > > > > rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> > > > > routines. PKT_RX_DYNF_METADATA flag will be set along with the data.
> > > 
> > > We have a problem with PKT_RX_DYNF_METADATA/
> > > PKT_TX_DYNF_METADATA.
> > > These ones are referencing to global variable
> > > "rte_flow_dynf_metadata_mask".
> > > And there are routines in rte_mbuf.c  (rte_get_rx_ol_flag_list) which show
> > > the names of flags. It is not good if rte_mbuf.c would require including of
> > > rte_flow.h and  linking rte_flow.c (not all apps use rte_flow or even ethdev).
> > > 
> > > What do you think? Should we rename rte_flow_dynf_xxxxx variables to
> > > rte_mbuf_dynf_flow_metadata_xxxx and put ones into the  rte_mbuf_dyn.c?
> > > The same about PKT_RX_DYNF_METADATA definition, is it candidate to
> > > move to rte_mbuf_dyn.h ? It would allow not to link or include rte_flow.c/h
> > > into rte_mbuf.c
> > > 
> 
> In rte_mbuf_dyn.c, we maintain a list of registered flags. I think it
> wouldn't be too difficult to introduce the equivalent of
> rte_get_*_ol_flag_list() and *rte_get_*_ol_flag_name() for dynamic
> flags. There is already a dump function (which does both fields and
> flags), and a lookup by name function.
> 
> Maybe we could split the dump into fields and flags, and add a lookup by
> offset/bitnum. Would it work for your use-case?

I think it would not be so useful.
I propose keeping the function rte_get_rx_ol_flag_list() only
for static mbuf fields, and not listing the dynamic flags.
In short, do nothing :)



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v5] ethdev: extend flow metadata
  2019-10-30 15:20                 ` Olivier Matz
  2019-10-30 15:57                   ` Thomas Monjalon
@ 2019-10-30 15:58                   ` Slava Ovsiienko
  2019-10-30 16:13                     ` Olivier Matz
  1 sibling, 1 reply; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-30 15:58 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Andrew Rybchenko, dev, Matan Azrad, Raslan Darawsheh, Thomas Monjalon

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Wednesday, October 30, 2019 17:20
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: Andrew Rybchenko <arybchenko@solarflare.com>; dev@dpdk.org;
> Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>
> Subject: Re: [PATCH v5] ethdev: extend flow metadata
> 
> Hi,
> 
> On Wed, Oct 30, 2019 at 02:46:06PM +0000, Slava Ovsiienko wrote:
> > > -----Original Message-----
> > > From: Slava Ovsiienko
> > > Sent: Wednesday, October 30, 2019 16:41
> > > To: Andrew Rybchenko <arybchenko@solarflare.com>; dev@dpdk.org
> > > Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > > <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> > > olivier.matz@6wind.com
> > > Subject: RE: [PATCH v5] ethdev: extend flow metadata
> > >
> > > > -----Original Message-----
> > > > From: Andrew Rybchenko <arybchenko@solarflare.com>
> > > > Sent: Wednesday, October 30, 2019 10:02
> > > > To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> > > > Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > > > <rasland@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>;
> > > > olivier.matz@6wind.com; Yongseok Koh <yskoh@mellanox.com>
> > > > Subject: Re: [PATCH v5] ethdev: extend flow metadata
> > > >
> > > > On 10/29/19 10:31 PM, Viacheslav Ovsiienko wrote:
> > > > > Currently, metadata can be set on egress path via mbuf
> > > > > tx_metadata field with PKT_TX_METADATA flag and
> > > > > RTE_FLOW_ITEM_TYPE_META
> > > > matches metadata.
> > > > >
> > > > > This patch extends the metadata feature usability.
> > > > >
> > > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > > >
> > > > > When supporting multiple tables, Tx metadata can also be set by
> > > > > a rule and matched by another rule. This new action allows
> > > > > metadata to be set as a result of flow match.
> > > > >
> > > > > 2) Metadata on ingress
> > > > >
> > > > > There's also need to support metadata on ingress. Metadata can
> > > > > be set by SET_META action and matched by META item like Tx. The
> > > > > final value set by the action will be delivered to application
> > > > > via metadata dynamic field of mbuf which can be accessed by
> > > > > RTE_FLOW_DYNF_METADATA() macro or with
> > > > > rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get()
> > > > > helper routines. PKT_RX_DYNF_METADATA flag will be set along with
> the data.
> > >
> > > We have a problem with PKT_RX_DYNF_METADATA/
> PKT_TX_DYNF_METADATA.
> > > These ones are referencing to global variable
> > > "rte_flow_dynf_metadata_mask".
> > > And there are routines in rte_mbuf.c  (rte_get_rx_ol_flag_list)
> > > which show the names of flags. It is not good if rte_mbuf.c would
> > > require including of rte_flow.h and  linking rte_flow.c (not all apps use
> rte_flow or even ethdev).
> > >
> > > What do you think? Should we rename rte_flow_dynf_xxxxx variables to
> > > rte_mbuf_dynf_flow_metadata_xxxx and put ones into the
> rte_mbuf_dyn.c?
> > > The same about PKT_RX_DYNF_METADATA definition, is it candidate to
> > > move to rte_mbuf_dyn.h ? It would allow not to link or include
> > > rte_flow.c/h into rte_mbuf.c
> > >
> 
> In rte_mbuf_dyn.c, we maintain a list of registered flags. I think it wouldn't
> be too difficult to introduce the equivalent of
> rte_get_*_ol_flag_list() and *rte_get_*_ol_flag_name() for dynamic flags.
> There is already a dump function (which does both fields and flags), and a
> lookup by name function.

Nice idea, thanks. I think the existing lookup by name is OK; I'll update
the flag list routines with dynamic flag support.

> 
> Maybe we could split the dump into fields and flags, and add a lookup by
> offset/bitnum. Would it work for your use-case?

Not needed right now, I think.

> 
> > It is interesting to note that despite metadata field looks to be
> > related to rte_flow, there is no any reference to this field or flags inside
> rte_flow API implementation.
> > Only datapath references this field. Metadata is gateway between flow
> > HW space and datapath, it tends to be mostly on datapath side not on
> rte_flow.
> 
> Yes, only the registration of the field is related to rte_flow. But I don't get
> where are you going with this. Is it a problem?

It is not a problem, rather a concern.
IMO, the datapath-related entities (flag mask and field offset) are not in an appropriate place.
Now we have to add a lookup to the flag list routines, which we could have avoided. The next
candidates, like timestamp or timesync, might introduce some new issues.

With best regards, Slava


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v5] ethdev: extend flow metadata
  2019-10-30 15:58                   ` Slava Ovsiienko
@ 2019-10-30 16:13                     ` Olivier Matz
  0 siblings, 0 replies; 98+ messages in thread
From: Olivier Matz @ 2019-10-30 16:13 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: Andrew Rybchenko, dev, Matan Azrad, Raslan Darawsheh, Thomas Monjalon

On Wed, Oct 30, 2019 at 03:58:04PM +0000, Slava Ovsiienko wrote:
> > -----Original Message-----
> > From: Olivier Matz <olivier.matz@6wind.com>
> > Sent: Wednesday, October 30, 2019 17:20
> > To: Slava Ovsiienko <viacheslavo@mellanox.com>
> > Cc: Andrew Rybchenko <arybchenko@solarflare.com>; dev@dpdk.org;
> > Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>
> > Subject: Re: [PATCH v5] ethdev: extend flow metadata
> > 
> > Hi,
> > 
> > On Wed, Oct 30, 2019 at 02:46:06PM +0000, Slava Ovsiienko wrote:
> > > > -----Original Message-----
> > > > From: Slava Ovsiienko
> > > > Sent: Wednesday, October 30, 2019 16:41
> > > > To: Andrew Rybchenko <arybchenko@solarflare.com>; dev@dpdk.org
> > > > Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > > > <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> > > > olivier.matz@6wind.com
> > > > Subject: RE: [PATCH v5] ethdev: extend flow metadata
> > > >
> > > > > -----Original Message-----
> > > > > From: Andrew Rybchenko <arybchenko@solarflare.com>
> > > > > Sent: Wednesday, October 30, 2019 10:02
> > > > > To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> > > > > Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > > > > <rasland@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>;
> > > > > olivier.matz@6wind.com; Yongseok Koh <yskoh@mellanox.com>
> > > > > Subject: Re: [PATCH v5] ethdev: extend flow metadata
> > > > >
> > > > > On 10/29/19 10:31 PM, Viacheslav Ovsiienko wrote:
> > > > > > Currently, metadata can be set on egress path via mbuf
> > > > > > tx_metadata field with PKT_TX_METADATA flag and
> > > > > > RTE_FLOW_ITEM_TYPE_META
> > > > > matches metadata.
> > > > > >
> > > > > > This patch extends the metadata feature usability.
> > > > > >
> > > > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > > > >
> > > > > > When supporting multiple tables, Tx metadata can also be set by
> > > > > > a rule and matched by another rule. This new action allows
> > > > > > metadata to be set as a result of flow match.
> > > > > >
> > > > > > 2) Metadata on ingress
> > > > > >
> > > > > > There's also need to support metadata on ingress. Metadata can
> > > > > > be set by SET_META action and matched by META item like Tx. The
> > > > > > final value set by the action will be delivered to application
> > > > > > via metadata dynamic field of mbuf which can be accessed by
> > > > > > RTE_FLOW_DYNF_METADATA() macro or with
> > > > > > rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get()
> > > > > > helper routines. PKT_RX_DYNF_METADATA flag will be set along with
> > the data.
> > > >
> > > > We have a problem with PKT_RX_DYNF_METADATA/
> > PKT_TX_DYNF_METADATA.
> > > > These ones are referencing to global variable
> > > > "rte_flow_dynf_metadata_mask".
> > > > And there are routines in rte_mbuf.c  (rte_get_rx_ol_flag_list)
> > > > which show the names of flags. It is not good if rte_mbuf.c would
> > > > require including of rte_flow.h and  linking rte_flow.c (not all apps use
> > rte_flow or even ethdev).
> > > >
> > > > What do you think? Should we rename rte_flow_dynf_xxxxx variables to
> > > > rte_mbuf_dynf_flow_metadata_xxxx and put ones into the
> > rte_mbuf_dyn.c?
> > > > The same about PKT_RX_DYNF_METADATA definition, is it candidate to
> > > > move to rte_mbuf_dyn.h ? It would allow not to link or include
> > > > rte_flow.c/h into rte_mbuf.c
> > > >
> > 
> > In rte_mbuf_dyn.c, we maintain a list of registered flags. I think it wouldn't
> > be too difficult to introduce the equivalent of
> > rte_get_*_ol_flag_list() and *rte_get_*_ol_flag_name() for dynamic flags.
> > There is already a dump function (which does both fields and flags), and a
> > lookup by name function.
> 
> Nice idea, thanks. I think existing lookup by name is OK, I'll update
> flag list routines with dynamic flag support.
> 
> > 
> > Maybe we could split the dump into fields and flags, and add a lookup by
> > offset/bitnum. Would it work for your use-case?
> 
> Not needed right now, I think.
> 
> > 
> > > It is interesting to note that despite metadata field looks to be
> > > related to rte_flow, there is no any reference to this field or flags inside
> > rte_flow API implementation.
> > > Only datapath references this field. Metadata is gateway between flow
> > > HW space and datapath, it tends to be mostly on datapath side not on
> > rte_flow.
> > 
> > Yes, only the registration of the field is related to rte_flow. But I don't get
> > where are you going with this. Is it a problem?
> 
> It is not a problem, rather some concern.
> IMO, the datapath related entities (flag mask and field offset) are put in not appropriate place.
> Now we have to add lookup to flag list routines, which we could avoid. The next candidates,
> like timestamp or timesync might introduce some new issues.

If the underlying question is whether or not we should centralize the
definitions of dynamic fields/flags helpers, I tend to say that we
should not centralize. The reason is that it will not always be
possible: an application or an external library is allowed to register a
private dynamic field.

Nevertheless, as you say, introducing the next dynamic fields like
timestamp may reveal some new issues, and we should be ready to rework
what has been done when we have more experience, with more usages of
dynamic fields.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v6 0/2] extend flow metadata feature
  2019-10-29 19:31         ` [dpdk-dev] [PATCH v5] " Viacheslav Ovsiienko
  2019-10-30  8:02           ` Andrew Rybchenko
  2019-10-30  8:35           ` Ori Kam
@ 2019-10-30 17:12           ` Viacheslav Ovsiienko
  2019-10-30 17:12             ` [dpdk-dev] [PATCH v6 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
  2019-10-30 17:12             ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
  2 siblings, 2 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-30 17:12 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika

This patchset combines two metadata-related patches so that
they are applied in the right order. The first patch introduces
ingress metadata using an mbuf dynamic field; the second
moves egress metadata to the dynamic field introduced by the
first patch.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

Viacheslav Ovsiienko (2):
  ethdev: extend flow metadata
  ethdev: move egress metadata to dynamic field

 app/test-pmd/cmdline.c                   |   3 +-
 app/test-pmd/cmdline_flow.c              |  57 ++++++++++++++++-
 app/test-pmd/testpmd.c                   |   4 --
 app/test-pmd/testpmd.h                   |   2 +-
 app/test-pmd/util.c                      |  16 +++--
 app/test/test_mbuf.c                     |   1 -
 doc/guides/prog_guide/rte_flow.rst       |  72 ++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   |  13 ++++
 drivers/net/mlx5/mlx5_flow_dv.c          |  19 ++----
 drivers/net/mlx5/mlx5_rxtx.c             |  22 +++----
 drivers/net/mlx5/mlx5_rxtx_vec.h         |   6 --
 drivers/net/mlx5/mlx5_txq.c              |   4 --
 lib/librte_ethdev/rte_ethdev.c           |   1 -
 lib/librte_ethdev/rte_ethdev.h           |   5 --
 lib/librte_ethdev/rte_ethdev_version.map |   3 +
 lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
 lib/librte_ethdev/rte_flow.h             | 104 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf.c               |   2 -
 lib/librte_mbuf/rte_mbuf_core.h          |  19 +-----
 lib/librte_mbuf/rte_mbuf_dyn.h           |   8 ++-
 20 files changed, 308 insertions(+), 93 deletions(-)

-- 
1.8.3.1



* [dpdk-dev] [PATCH v6 1/2] ethdev: extend flow metadata
  2019-10-30 17:12           ` [dpdk-dev] [PATCH v6 0/2] extend flow metadata feature Viacheslav Ovsiienko
@ 2019-10-30 17:12             ` Viacheslav Ovsiienko
  2019-10-31  9:19               ` Andrew Rybchenko
  2019-10-31 13:05               ` [dpdk-dev] [PATCH v7 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2019-10-30 17:12             ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
  1 sibling, 2 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-30 17:12 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika, Yongseok Koh

Currently, metadata can be set on the egress path via the mbuf tx_metadata
field with the PKT_TX_METADATA flag, and RTE_FLOW_ITEM_TYPE_META matches
that metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There is also a need to support metadata on ingress. Metadata can be set by
the SET_META action and matched by the META item, as on Tx. The final value
set by the action will be delivered to the application via the metadata
dynamic field of the mbuf, which can be accessed with the
RTE_FLOW_DYNF_METADATA() macro or with the rte_flow_dynf_metadata_set() and
rte_flow_dynf_metadata_get() helper routines. The PKT_RX_DYNF_METADATA flag
will be set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to using the SET_META action.

The availability of dynamic mbuf metadata field can be checked
with rte_flow_dynf_metadata_avail() routine.

If the application is going to engage the metadata feature, it registers
the metadata dynamic field; the PMD then checks the metadata field
availability and handles the appropriate fields in the datapath.
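
The register / check / deliver sequence described above can be sketched
with a small stand-alone model. Everything prefixed with `model_` is
hypothetical and only imitates the idea of an offset plus a flag mask; the
fixed offset and bit number are arbitrary values chosen for the sketch,
whereas the real ones are returned by the mbuf dynamic field library.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_DYN_AREA 16 /* bytes reserved for dynamic fields (model value) */

/* Hypothetical, heavily reduced stand-in for struct rte_mbuf. */
struct model_mbuf {
	uint64_t ol_flags;
	uint8_t dyn[MODEL_DYN_AREA];
};

static int model_meta_offs = -1;   /* -1 until the field is registered */
static uint64_t model_meta_mask;   /* 0 until the flag is registered */

/* Application side: register the field and flag (arbitrary model values). */
static int
model_metadata_register(void)
{
	model_meta_offs = 4;
	model_meta_mask = 1ULL << 23;
	return 0;
}

/* Same idea as the availability check: a non-zero mask means registered. */
static int
model_metadata_avail(void)
{
	return !!model_meta_mask;
}

/* PMD side on Rx: store the value and set the flag only if registered. */
static void
model_rx_deliver(struct model_mbuf *m, uint32_t hw_meta)
{
	if (!model_metadata_avail())
		return;
	memcpy(&m->dyn[model_meta_offs], &hw_meta, sizeof(hw_meta));
	m->ol_flags |= model_meta_mask;
}

/* Application side on Rx: returns 1 and fills *out if metadata is present. */
static int
model_rx_read(const struct model_mbuf *m, uint32_t *out)
{
	if (!(m->ol_flags & model_meta_mask))
		return 0;
	memcpy(out, &m->dyn[model_meta_offs], sizeof(*out));
	return 1;
}
```

In this model a packet received before registration carries no metadata
flag, which is why the application is expected to register first.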

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.

MARK and METADATA look similar and might operate in a similar way,
but they do not interact.

Initially, two metadata-related actions were proposed:

- RTE_FLOW_ACTION_TYPE_FLAG
- RTE_FLOW_ACTION_TYPE_MARK

These actions set a special flag in the packet metadata; the MARK action
additionally stores a specified value in the metadata storage. On packet
reception, the PMD puts the flag and value into the mbuf, and applications
can see that the packet was treated inside the flow engine according to the
appropriate RTE flow(s). MARK and FLAG act as a kind of gateway to transfer
some per-packet information from the flow engine to the application via the
receiving datapath. There is also an item of type RTE_FLOW_ITEM_TYPE_MARK.
It allows us to extend the flow match pattern with the capability
to match the metadata values set by MARK/FLAG actions on other flows.

From the datapath point of view, MARK and FLAG are related to the
receiving side only. It would be useful to have the same gateway on the
transmitting side, so the item of type RTE_FLOW_ITEM_TYPE_META was
proposed. The application can fill the field in the mbuf and this value
will be transferred to some field in the packet metadata inside the flow
engine. It did not matter whether these metadata fields were shared,
because the MARK and META items belonged to different domains (receiving
and transmitting) and could be vendor-specific.

So far, so good, DPDK proposes some entities to control metadata inside
the flow engine and gateways to exchange these values on a per-packet basis
via datapaths.

As we can see, MARK and META are not symmetric: there is no action
that would allow us to set the META value on the transmitting path.
So, the action of type:

- RTE_FLOW_ACTION_TYPE_SET_META was proposed.

Next, applications raise new requirements for packet metadata.
Flow engines are getting more complex, internal switches are introduced,
and multiple ports might be supported within the same flow engine namespace.
From the DPDK point of view, this means packets might be sent on one
eth_dev port and received on another, while the packet path inside
the flow engine belongs entirely to the same hardware device. The simplest
example is SR-IOV with PF, VFs and the representors. This is an
opportunity to provide an out-of-band channel to transfer
some extra data from one port to another, besides the packet data
itself, and applications would like to use it.

The application is expected to use trials (with rte_flow_validate) to
detect which metadata features (FLAG, MARK, META) are actually supported
by the PMD and the underlying hardware. This might depend on PMD
configuration, system software, hardware settings, etc., and should be
detected at run time.
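
A trial-based probe of this kind might look like the following stand-alone
sketch. It does not call rte_flow_validate; instead it takes a hypothetical
callback (`model_validate_fn`) that stands in for "build a trial rule using
this feature and validate it", and `model_stub_validate()` is an invented
stub that pretends the PMD accepts MARK and META only.

```c
#include <stdint.h>

/* Features to probe, named after the ones listed in the text. */
enum model_feature {
	MODEL_FLAG,
	MODEL_MARK,
	MODEL_META,
	MODEL_FEATURE_COUNT
};

/*
 * Hypothetical callback: returns 0 if a trial rule using the feature
 * would be accepted on the port, a negative value otherwise.
 */
typedef int (*model_validate_fn)(uint16_t port_id, enum model_feature f);

/* Try every feature once and return a bitmask of the accepted ones. */
static uint32_t
model_probe_features(uint16_t port_id, model_validate_fn validate)
{
	uint32_t supported = 0;
	int f;

	for (f = 0; f < MODEL_FEATURE_COUNT; f++)
		if (validate(port_id, (enum model_feature)f) == 0)
			supported |= 1u << f;
	return supported;
}

/* Invented stub standing in for a PMD that accepts MARK and META only. */
static int
model_stub_validate(uint16_t port_id, enum model_feature f)
{
	(void)port_id;
	return (f == MODEL_MARK || f == MODEL_META) ? 0 : -1;
}
```

Probing once at start-up and caching the bitmask keeps the trial rules off
the datapath; the result still has to be refreshed if the port is
reconfigured.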

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
v6: - minor code style issues
    - combined in a series with the following egress metadata patch

v5: - http://patches.dpdk.org/patch/62179/
    - addressed code style issues from comments
    - Tx metadata deprecation notice removed
      (dedicated tx_metadata patch is coming)
    - MBUF_DYNF_METADATA_NAME is split into FIELD and FLAG
      dedicated ones, RTE suffix is added
    - metadata historic retrospective is added to log message
    - rebased

v4: - http://patches.dpdk.org/patch/62065/
    - documentation comments addressed
    - deprecation notice for Tx metadata offload flag
    - rebased

v3: - http://patches.dpdk.org/patch/61902/
    - rebased, neat updates

v2: - http://patches.dpdk.org/patch/60909/

v1: - http://patches.dpdk.org/patch/56104/
    - rfc: http://patches.dpdk.org/patch/54271/

 app/test-pmd/cmdline_flow.c              |  57 ++++++++++++++++-
 app/test-pmd/util.c                      |   5 ++
 doc/guides/prog_guide/rte_flow.rst       |  72 ++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   |  13 ++++
 lib/librte_ethdev/rte_ethdev_version.map |   3 +
 lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
 lib/librte_ethdev/rte_flow.h             | 103 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_dyn.h           |   8 ++-
 8 files changed, 279 insertions(+), 22 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 0d0bc0a..e4ef066 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -316,6 +316,9 @@ enum index {
 	ACTION_RAW_ENCAP_INDEX_VALUE,
 	ACTION_RAW_DECAP_INDEX,
 	ACTION_RAW_DECAP_INDEX_VALUE,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -1067,6 +1070,7 @@ struct parse_action_priv {
 	ACTION_DEC_TCP_ACK,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1265,6 +1269,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -1329,6 +1340,10 @@ static int parse_vc_action_raw_encap_index(struct context *,
 static int parse_vc_action_raw_decap_index(struct context *,
 					   const struct token *, const char *,
 					   unsigned int, void *, unsigned int);
+static int parse_vc_action_set_meta(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
+				    unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -3378,7 +3393,31 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.help = "index of raw_encap/raw_decap data",
 		.next = NEXT(next_item),
 		.call = parse_port,
-	}
+	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc_action_set_meta,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -4818,6 +4857,22 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return ret;
 }
 
+static int
+parse_vc_action_set_meta(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	ret = rte_flow_dynf_metadata_register();
+	if (ret < 0)
+		return -1;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index f20531d..56075b3 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,6 +82,11 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
+		if (ol_flags & PKT_TX_METADATA)
+			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_RX_DYNF_METADATA)
+			printf(" - Rx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (mb->packet_type) {
 			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
 			printf(" - hw ptype: %s", buf);
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 159ce19..c943aca 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -1232,21 +1258,6 @@ Matches a PPPoE session protocol identifier.
 - ``proto_id``: PPP protocol identifier.
 - Default ``mask`` matches proto_id only.
 
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+--------------------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Item: ``NSH``
 ^^^^^^^^^^^^^
 
@@ -2466,6 +2477,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
 
 Using this action on non-matching traffic will result in undefined behavior.
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by
+``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
+``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
+with the data.
+
+The mbuf dynamic field must be registered by calling
+``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index f6e90cb..963c4f8 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -237,6 +237,14 @@ New Features
   On supported NICs, we can now setup haipin queue which will offload packets
   from the wire, backto the wire.
 
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
+    PKT_RX_DYNF_METADATA.
+
 
 Removed Items
 -------------
@@ -344,6 +352,11 @@ API Changes
   has been introduced in this release is used when used when all the packets
   enqueued in the tx adapter are destined for the same Ethernet port & Tx queue.
 
+* metadata: RTE_FLOW_ITEM_TYPE_META data endianness altered to host one.
+  Because the new dynamic metadata field in mbuf is also host-endian, there
+  is a minor compatibility issue for applications when 32-bit values are
+  supported.
+
 * sched: The pipe nodes configuration parameters such as number of pipes,
   pipe queue sizes, pipe profiles, etc., are moved from port level structure
   to subport level. This allows different subports of the same port to
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index 48b5389..e593f34 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -291,4 +291,7 @@ EXPERIMENTAL {
 	rte_eth_rx_hairpin_queue_setup;
 	rte_eth_tx_hairpin_queue_setup;
 	rte_eth_dev_hairpin_capability_get;
+	rte_flow_dynf_metadata_offs;
+	rte_flow_dynf_metadata_mask;
+	rte_flow_dynf_metadata_register;
 };
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index ca0f680..b0490cd 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -12,10 +12,18 @@
 #include <rte_errno.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include "rte_ethdev.h"
 #include "rte_flow_driver.h"
 #include "rte_flow.h"
 
+/* Mbuf dynamic field offset for metadata. */
+int rte_flow_dynf_metadata_offs = -1;
+
+/* Mbuf dynamic field flag mask for metadata. */
+uint64_t rte_flow_dynf_metadata_mask;
+
 /**
  * Flow elements description tables.
  */
@@ -157,8 +165,40 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
+	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
 };
 
+int
+rte_flow_dynf_metadata_register(void)
+{
+	int offset;
+	int flag;
+
+	static const struct rte_mbuf_dynfield desc_offs = {
+		.name = RTE_MBUF_DYNFIELD_METADATA_NAME,
+		.size = sizeof(uint32_t),
+		.align = __alignof__(uint32_t),
+	};
+	static const struct rte_mbuf_dynflag desc_flag = {
+		.name = RTE_MBUF_DYNFLAG_METADATA_NAME,
+	};
+
+	offset = rte_mbuf_dynfield_register(&desc_offs);
+	if (offset < 0)
+		goto error;
+	flag = rte_mbuf_dynflag_register(&desc_flag);
+	if (flag < 0)
+		goto error;
+	rte_flow_dynf_metadata_offs = offset;
+	rte_flow_dynf_metadata_mask = (1ULL << flag);
+	return 0;
+
+error:
+	rte_flow_dynf_metadata_offs = -1;
+	rte_flow_dynf_metadata_mask = 0ULL;
+	return -rte_errno;
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 4fee105..f6e050c 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -28,6 +28,8 @@
 #include <rte_byteorder.h>
 #include <rte_esp.h>
 #include <rte_higig.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -418,7 +420,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1263,18 +1266,23 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * RTE_FLOW_ITEM_TYPE_META
  *
- * Matches a specified metadata value.
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
-	rte_be32_t data;
+	uint32_t data;
 };
 
 /** Default mask for RTE_FLOW_ITEM_TYPE_META. */
 #ifndef __cplusplus
 static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
-	.data = RTE_BE32(UINT32_MAX),
+	.data = UINT32_MAX,
 };
 #endif
 
@@ -1942,6 +1950,13 @@ enum rte_flow_action_type {
 	 * undefined behavior.
 	 */
 	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2429,6 +2444,57 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * ingress, the metadata will be carried by mbuf metadata dynamic field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which have never
+ * been set, unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
+ * or may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	uint32_t data;
+	uint32_t mask;
+};
+
+/* Mbuf dynamic field offset for metadata. */
+extern int rte_flow_dynf_metadata_offs;
+
+/* Mbuf dynamic field flag mask for metadata. */
+extern uint64_t rte_flow_dynf_metadata_mask;
+
+/* Mbuf dynamic field pointer for metadata. */
+#define RTE_FLOW_DYNF_METADATA(m) \
+	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
+
+/* Mbuf dynamic flag for metadata. */
+#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+
+__rte_experimental
+static inline uint32_t
+rte_flow_dynf_metadata_get(struct rte_mbuf *m)
+{
+	return *RTE_FLOW_DYNF_METADATA(m);
+}
+
+__rte_experimental
+static inline void
+rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
+{
+	*RTE_FLOW_DYNF_METADATA(m) = v;
+}
+
 /*
  * Definition of a single action.
  *
@@ -2662,6 +2728,33 @@ enum rte_flow_conv_op {
 };
 
 /**
+ * Check if mbuf dynamic field for metadata is registered.
+ *
+ * @return
+ *   True if registered, false otherwise.
+ */
+__rte_experimental
+static inline int
+rte_flow_dynf_metadata_avail(void)
+{
+	return !!rte_flow_dynf_metadata_mask;
+}
+
+/**
+ * Register mbuf dynamic field and flag for metadata.
+ *
+ * This function must be called prior to use SET_META action in order to
+ * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
+ * application.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_dynf_metadata_register(void);
+
+/**
  * Check whether a flow rule can be created on a given port.
  *
  * The flow rule is validated for correctness and whether it could be accepted
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 2e9d418..de651c1 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -234,6 +234,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
 __rte_experimental
 void rte_mbuf_dyn_dump(FILE *out);
 
-/* Placeholder for dynamic fields and flags declarations. */
+/*
+ * Placeholder for dynamic fields and flags declarations.
+ * This is centralizing point to gather all field names
+ * and parameters together.
+ */
+#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
+#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
 
 #endif
-- 
1.8.3.1



* [dpdk-dev] [PATCH v6 2/2] ethdev: move egress metadata to dynamic field
  2019-10-30 17:12           ` [dpdk-dev] [PATCH v6 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2019-10-30 17:12             ` [dpdk-dev] [PATCH v6 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
@ 2019-10-30 17:12             ` Viacheslav Ovsiienko
  2019-10-31  9:01               ` Andrew Rybchenko
  1 sibling, 1 reply; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-30 17:12 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika

The dynamic mbuf fields were introduced by [1]. The egress metadata is a
good candidate to be moved from the statically allocated field tx_metadata
to a dynamic one. Because mbufs are used in half-duplex fashion only, it is
safe to share this dynamic field with ingress metadata.

The shared dynamic field contains either egress (if the application is
going to transmit the mbuf with tx_burst) or ingress (if the mbuf is
received with rx_burst) metadata and can be accessed with the
RTE_FLOW_DYNF_METADATA() macro or with the rte_flow_dynf_metadata_set() and
rte_flow_dynf_metadata_get() helper routines. The
PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set along with the
data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to accessing the data.

The availability of dynamic mbuf metadata field can be checked with
rte_flow_dynf_metadata_avail() routine.
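
On the transmit side, the same pattern can be sketched as a Tx callback
that tags every packet of a burst only when the field is available. This
is a stand-alone model: `struct model_pkt`, `model_register()` and
`model_tx_set_md()` are hypothetical names, and the offset and flag bit are
arbitrary values chosen for the sketch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for a packet buffer. */
struct model_pkt {
	uint64_t ol_flags;
	uint8_t dyn[8];
};

static int model_offs = -1;  /* offset of the shared field, -1 if absent */
static uint64_t model_flag;  /* flag mask, 0 if not registered */

/* Model registration with arbitrary values for the sketch. */
static void
model_register(void)
{
	model_offs = 0;
	model_flag = 1ULL << 40;
}

/*
 * Tx callback model: write the per-port metadata into every packet and
 * set the flag, but only when the dynamic field has been registered.
 * The burst is always passed through unchanged in size.
 */
static uint16_t
model_tx_set_md(struct model_pkt *pkts[], uint16_t nb_pkts, uint32_t md)
{
	uint16_t i;

	if (!model_flag)
		return nb_pkts;
	for (i = 0; i < nb_pkts; i++) {
		memcpy(&pkts[i]->dyn[model_offs], &md, sizeof(md));
		pkts[i]->ol_flags |= model_flag;
	}
	return nb_pkts;
}
```

Because a buffer is either on its way out or on its way in, one slot is
enough in this model; the flag tells the consumer whether the slot holds a
valid value.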

[1] http://patches.dpdk.org/patch/62040/

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 app/test-pmd/cmdline.c             |  3 ++-
 app/test-pmd/testpmd.c             |  4 ----
 app/test-pmd/testpmd.h             |  2 +-
 app/test-pmd/util.c                | 15 +++++++++------
 app/test/test_mbuf.c               |  1 -
 doc/guides/prog_guide/rte_flow.rst |  6 +++---
 drivers/net/mlx5/mlx5_flow_dv.c    | 19 ++++++-------------
 drivers/net/mlx5/mlx5_rxtx.c       | 22 +++++++++++-----------
 drivers/net/mlx5/mlx5_rxtx_vec.h   |  6 ------
 drivers/net/mlx5/mlx5_txq.c        |  4 ----
 lib/librte_ethdev/rte_ethdev.c     |  1 -
 lib/librte_ethdev/rte_ethdev.h     |  5 -----
 lib/librte_ethdev/rte_flow.h       | 19 ++++++++++---------
 lib/librte_mbuf/rte_mbuf.c         |  2 --
 lib/librte_mbuf/rte_mbuf_core.h    | 19 +------------------
 15 files changed, 43 insertions(+), 85 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4478069..49c45a3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -18718,12 +18718,13 @@ struct cmd_config_tx_metadata_specific_result {
 
 	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	ports[res->port_id].tx_metadata = rte_cpu_to_be_32(res->value);
+	ports[res->port_id].tx_metadata = res->value;
 	/* Add/remove callback to insert valid metadata in every Tx packet. */
 	if (ports[res->port_id].tx_metadata)
 		add_tx_md_callback(res->port_id);
 	else
 		remove_tx_md_callback(res->port_id);
+	rte_flow_dynf_metadata_register();
 }
 
 cmdline_parse_token_string_t cmd_config_tx_metadata_specific_port =
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 0fc5b45..206c12b 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1167,10 +1167,6 @@ struct extmem_param {
 		      DEV_TX_OFFLOAD_MBUF_FAST_FREE))
 			port->dev_conf.txmode.offloads &=
 				~DEV_TX_OFFLOAD_MBUF_FAST_FREE;
-		if (!(port->dev_info.tx_offload_capa &
-			DEV_TX_OFFLOAD_MATCH_METADATA))
-			port->dev_conf.txmode.offloads &=
-				~DEV_TX_OFFLOAD_MATCH_METADATA;
 		if (numa_support) {
 			if (port_numa[pid] != NUMA_NO_CONFIG)
 				port_per_socket[port_numa[pid]]++;
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 8da1e8e..caabf32 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -193,7 +193,7 @@ struct rte_port {
 	struct softnic_port     softport;  /**< softnic params */
 #endif
 	/**< metadata value to insert in Tx packets. */
-	rte_be32_t		tx_metadata;
+	uint32_t		tx_metadata;
 	const struct rte_eth_rxtx_callback *tx_set_md_cb[MAX_QUEUE_ID+1];
 };
 
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 56075b3..cf41864 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,8 +82,9 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
-		if (ol_flags & PKT_TX_METADATA)
-			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_TX_DYNF_METADATA)
+			printf(" - Tx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (ol_flags & PKT_RX_DYNF_METADATA)
 			printf(" - Rx metadata: 0x%x",
 			       *RTE_FLOW_DYNF_METADATA(mb));
@@ -188,10 +189,12 @@
 	 * Add metadata value to every Tx packet,
 	 * and set ol_flags accordingly.
 	 */
-	for (i = 0; i < nb_pkts; i++) {
-		pkts[i]->tx_metadata = ports[port_id].tx_metadata;
-		pkts[i]->ol_flags |= PKT_TX_METADATA;
-	}
+	if (rte_flow_dynf_metadata_avail())
+		for (i = 0; i < nb_pkts; i++) {
+			*RTE_FLOW_DYNF_METADATA(pkts[i]) =
+						ports[port_id].tx_metadata;
+			pkts[i]->ol_flags |= PKT_TX_DYNF_METADATA;
+		}
 	return nb_pkts;
 }
 
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 854bc26..61ecffc 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1669,7 +1669,6 @@ struct flag_name {
 		VAL_NAME(PKT_TX_SEC_OFFLOAD),
 		VAL_NAME(PKT_TX_UDP_SEG),
 		VAL_NAME(PKT_TX_OUTER_UDP_CKSUM),
-		VAL_NAME(PKT_TX_METADATA),
 	};
 
 	/* Test case to check with valid flag */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index c943aca..630e4c0 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -664,7 +664,7 @@ Item: ``META``
 Matches 32 bit metadata item set.
 
 On egress, metadata can be set either by mbuf metadata field with
-PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+PKT_TX_DYNF_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
 action sets metadata for a packet and the metadata will be reported via
 ``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
 
@@ -2482,8 +2482,8 @@ Action: ``SET_META``
 
 Set metadata. Item ``META`` matches metadata.
 
-Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
-overridden by this action. On ingress, the metadata will be carried by
+Metadata set by mbuf metadata field with PKT_TX_DYNF_METADATA flag on egress
+will be overridden by this action. On ingress, the metadata will be carried by
 ``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
 ``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
 with the data.
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index d9a7fd4..f961bff 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -793,7 +793,7 @@ struct field_modify_info modify_tcp[] = {
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-flow_dv_validate_item_meta(struct rte_eth_dev *dev,
+flow_dv_validate_item_meta(struct rte_eth_dev *dev __rte_unused,
 			   const struct rte_flow_item *item,
 			   const struct rte_flow_attr *attr,
 			   struct rte_flow_error *error)
@@ -801,17 +801,10 @@ struct field_modify_info modify_tcp[] = {
 	const struct rte_flow_item_meta *spec = item->spec;
 	const struct rte_flow_item_meta *mask = item->mask;
 	const struct rte_flow_item_meta nic_mask = {
-		.data = RTE_BE32(UINT32_MAX)
+		.data = UINT32_MAX
 	};
 	int ret;
-	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
 
-	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
-		return rte_flow_error_set(error, EPERM,
-					  RTE_FLOW_ERROR_TYPE_ITEM,
-					  NULL,
-					  "match on metadata offload "
-					  "configuration is off for this port");
 	if (!spec)
 		return rte_flow_error_set(error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
@@ -4750,10 +4743,10 @@ struct field_modify_info modify_tcp[] = {
 		meta_m = &rte_flow_item_meta_mask;
 	meta_v = (const void *)item->spec;
 	if (meta_v) {
-		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
-			 rte_be_to_cpu_32(meta_m->data));
-		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
-			 rte_be_to_cpu_32(meta_v->data & meta_m->data));
+		MLX5_SET(fte_match_set_misc2, misc2_m,
+			 metadata_reg_a, meta_m->data);
+		MLX5_SET(fte_match_set_misc2, misc2_v,
+			 metadata_reg_a, meta_v->data & meta_m->data);
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index f597c89..88a4378 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -2281,8 +2281,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	/* Engage VLAN tag insertion feature if requested. */
 	if (MLX5_TXOFF_CONFIG(VLAN) &&
 	    loc->mbuf->ol_flags & PKT_TX_VLAN_PKT) {
@@ -2341,8 +2341,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -2434,8 +2434,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -2628,8 +2628,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -3700,8 +3700,8 @@ enum mlx5_txcmp_code {
 		return false;
 	/* Fill metadata field if needed. */
 	if (MLX5_TXOFF_CONFIG(METADATA) &&
-		es->metadata != (loc->mbuf->ol_flags & PKT_TX_METADATA ?
-				 loc->mbuf->tx_metadata : 0))
+		es->metadata != (loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+				 *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0))
 		return false;
 	/* There must be no VLAN packets in eMPW loop. */
 	if (MLX5_TXOFF_CONFIG(VLAN))
@@ -5149,7 +5149,7 @@ enum mlx5_txcmp_code {
 		 */
 		olx |= MLX5_TXOFF_CONFIG_EMPW;
 	}
-	if (tx_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
+	if (rte_flow_dynf_metadata_avail()) {
 		/* We should support Flow metadata. */
 		olx |= MLX5_TXOFF_CONFIG_METADATA;
 	}
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index b54ff72..85e0bd5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -19,12 +19,6 @@
 	 DEV_TX_OFFLOAD_TCP_CKSUM | \
 	 DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM)
 
-/* HW offload capabilities of vectorized Tx. */
-#define MLX5_VEC_TX_OFFLOAD_CAP \
-	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
-	 DEV_TX_OFFLOAD_MATCH_METADATA | \
-	 DEV_TX_OFFLOAD_MULTI_SEGS)
-
 /*
  * Compile time sanity check for vectorized functions.
  */
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index dfc379c..97991f0 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -128,10 +128,6 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
-#ifdef HAVE_IBV_FLOW_DV_SUPPORT
-	if (config->dv_flow_en)
-		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
-#endif
 	return offloads;
 }
 
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 68aca1f..23b751f 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -161,7 +161,6 @@ struct rte_eth_xstats_name_off {
 	RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
-	RTE_TX_OFFLOAD_BIT2STR(MATCH_METADATA),
 };
 
 #undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 9b69255..28e29c7 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1145,11 +1145,6 @@ struct rte_eth_conf {
 #define DEV_TX_OFFLOAD_IP_TNL_TSO       0x00080000
 /** Device supports outer UDP checksum */
 #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
-/**
- * Device supports match on metadata Tx offload..
- * Application must set PKT_TX_METADATA and mbuf metadata field.
- */
-#define DEV_TX_OFFLOAD_MATCH_METADATA   0x00200000
 
 #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
 /**< Device supports Rx queue setup after device started*/
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f6e050c..51d8292 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -1268,12 +1268,12 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 /**
  * RTE_FLOW_ITEM_TYPE_META
  *
- * Matches a specified metadata value. On egress, metadata can be set either by
- * mbuf tx_metadata field with PKT_TX_METADATA flag or
- * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
- * metadata for a packet and the metadata will be reported via mbuf metadata
- * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
- * registered in advance by rte_flow_dynf_metadata_register().
+ * Matches a specified metadata value. On egress, metadata can be set
+ * either by mbuf dynamic metadata field with PKT_TX_DYNF_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META
+ * sets metadata for a packet and the metadata will be reported via mbuf
+ * metadata dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
+ * field must be registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
 	uint32_t data;
@@ -2450,8 +2450,8 @@ struct rte_flow_action_set_mac {
  *
  * RTE_FLOW_ACTION_TYPE_SET_META
  *
- * Set metadata. Metadata set by mbuf tx_metadata field with
- * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * Set metadata. Metadata set by mbuf metadata dynamic field with
+ * PKT_TX_DYNF_METADATA flag on egress will be overridden by this action. On
  * ingress, the metadata will be carried by mbuf metadata dynamic field
  * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
  * registered in advance by rte_flow_dynf_metadata_register().
@@ -2478,8 +2478,9 @@ struct rte_flow_action_set_meta {
 #define RTE_FLOW_DYNF_METADATA(m) \
 	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
 
-/* Mbuf dynamic flag for metadata. */
+/* Mbuf dynamic flags for metadata. */
 #define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+#define PKT_TX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
 
 __rte_experimental
 static inline uint32_t
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8c51dc1..35df1c4 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -670,7 +670,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
 	case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
 	case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
 	case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
-	case PKT_TX_METADATA: return "PKT_TX_METADATA";
 	default: return NULL;
 	}
 }
@@ -707,7 +706,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
 		{ PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
 		{ PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
 		{ PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
-		{ PKT_TX_METADATA, PKT_TX_METADATA, NULL },
 	};
 	const char *name;
 	unsigned int i;
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 3022701..edfc7e9 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -192,11 +192,6 @@
 /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
 
 /**
- * Indicate that the metadata field in the mbuf is in use.
- */
-#define PKT_TX_METADATA	(1ULL << 40)
-
-/**
  * Outer UDP checksum offload flag. This flag is used for enabling
  * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
  * 1) Enable the following in mbuf,
@@ -389,8 +384,7 @@
 		PKT_TX_MACSEC |		 \
 		PKT_TX_SEC_OFFLOAD |	 \
 		PKT_TX_UDP_SEG |	 \
-		PKT_TX_OUTER_UDP_CKSUM | \
-		PKT_TX_METADATA)
+		PKT_TX_OUTER_UDP_CKSUM)
 
 /**
  * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
@@ -601,17 +595,6 @@ struct rte_mbuf {
 			/**< User defined tags. See rte_distributor_process() */
 			uint32_t usr;
 		} hash;                   /**< hash information */
-		struct {
-			/**
-			 * Application specific metadata value
-			 * for egress flow rule match.
-			 * Valid if PKT_TX_METADATA is set.
-			 * Located here to allow conjunct use
-			 * with hash.sched.hi.
-			 */
-			uint32_t tx_metadata;
-			uint32_t reserved;
-		};
 	};
 
 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
-- 
1.8.3.1
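
For readers following the Tx datapath hunks in the patch above, the
per-packet choice of the metadata value and the masked comparison used by
the META item can be modelled in plain C. This is only a simplified sketch
and not mlx5 code: tx_meta_value() and meta_item_matches() are hypothetical
names, and the caller supplies whatever flag mask was registered.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the Tx-side selection: the value handed to the
 * flow engine is the dynamic field only if the field is registered and
 * the packet carries the metadata flag; otherwise it is zero.
 */
static inline uint32_t
tx_meta_value(bool field_registered, uint64_t ol_flags,
	      uint64_t meta_flag, uint32_t field)
{
	if (!field_registered)
		return 0;
	return (ol_flags & meta_flag) ? field : 0;
}

/*
 * Hypothetical model of a META item match: only the bits selected by
 * the mask take part in the comparison.
 */
static inline bool
meta_item_matches(uint32_t pkt_meta, uint32_t spec, uint32_t mask)
{
	return (pkt_meta & mask) == (spec & mask);
}
```

With a mask of 0x0000ffff, for example, a packet carrying 0x12345678
matches a spec of 0x00005678, because the upper 16 bits are ignored.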


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v6 2/2] ethdev: move egress metadata to dynamic field
  2019-10-30 17:12             ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
@ 2019-10-31  9:01               ` Andrew Rybchenko
  2019-10-31 10:54                 ` Slava Ovsiienko
  0 siblings, 1 reply; 98+ messages in thread
From: Andrew Rybchenko @ 2019-10-31  9:01 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, dev; +Cc: matan, rasland, thomas, olivier.matz, orika

On 10/30/19 8:12 PM, Viacheslav Ovsiienko wrote:
> The dynamic mbuf fields were introduced by [1]. The egress metadata is
> a good candidate to be moved from the statically allocated tx_metadata
> field to a dynamic one. Because mbufs are used in half-duplex fashion
> only, it is safe to share this dynamic field with ingress metadata.
>
> The shared dynamic field contains either egress metadata (if the
> application is going to transmit the mbuf with tx_burst) or ingress
> metadata (if the mbuf is received with rx_burst) and can be accessed
> with the RTE_FLOW_DYNF_METADATA() macro or with the
> rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> routines. The PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag is set
> along with the data.
>
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to accessing the data.
>
> The availability of the dynamic mbuf metadata field can be checked with
> the rte_flow_dynf_metadata_avail() routine.
>
> [1] http://patches.dpdk.org/patch/62040/
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

LGTM

I think release notes should be updated.

What I don't understand now is how an application can
find out whether Tx metadata is supported or not.
The corresponding offload flag is removed. I guess the answer is
rte_flow_validate() with a rule on egress which tries to
match meta and do something (?).
It should be highlighted in the documentation in any case,
but I'd consider keeping the offload.


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v6 1/2] ethdev: extend flow metadata
  2019-10-30 17:12             ` [dpdk-dev] [PATCH v6 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
@ 2019-10-31  9:19               ` Andrew Rybchenko
  2019-10-31 13:05               ` [dpdk-dev] [PATCH v7 0/2] extend flow metadata feature Viacheslav Ovsiienko
  1 sibling, 0 replies; 98+ messages in thread
From: Andrew Rybchenko @ 2019-10-31  9:19 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, dev
  Cc: matan, rasland, thomas, olivier.matz, orika, Yongseok Koh

On 10/30/19 8:12 PM, Viacheslav Ovsiienko wrote:
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
>
> This patch extends the metadata feature usability.
>
> 1) RTE_FLOW_ACTION_TYPE_SET_META
>
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
>
> 2) Metadata on ingress
>
> There is also a need to support metadata on ingress. Metadata can be set
> by the SET_META action and matched by the META item, as on Tx. The final
> value set by the action is delivered to the application via the metadata
> dynamic field of the mbuf, which can be accessed with the
> RTE_FLOW_DYNF_METADATA() macro or with the rte_flow_dynf_metadata_set()
> and rte_flow_dynf_metadata_get() helper routines. The
> PKT_RX_DYNF_METADATA flag is set along with the data.
>
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to using the SET_META action.
>
> The availability of the dynamic mbuf metadata field can be checked
> with the rte_flow_dynf_metadata_avail() routine.
>
> If an application is going to engage the metadata feature, it registers
> the metadata dynamic field and flag; the PMD then checks the metadata
> field availability and handles the appropriate fields in the datapath.
>
> For loopback/hairpin packets, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on hardware capability.
>
> MARK and METADATA look similar and might operate in a similar way,
> but they do not interact.
>
> Initially, two metadata-related actions were proposed:
>
> - RTE_FLOW_ACTION_TYPE_FLAG
> - RTE_FLOW_ACTION_TYPE_MARK
>
> These actions set a special flag in the packet metadata; the MARK action
> additionally stores a specified value in the metadata storage. On packet
> reception the PMD puts the flag and the value into the mbuf, so the
> application can see that the packet was treated inside the flow engine
> according to the appropriate RTE flow(s). MARK and FLAG act as a gateway
> that transfers per-packet information from the flow engine to the
> application via the receiving datapath. The item of type
> RTE_FLOW_ITEM_TYPE_MARK is also provided. It extends the flow match
> pattern with the capability to match the metadata values set by
> MARK/FLAG actions in other flows.
>
> From the datapath point of view, MARK and FLAG are related to the
> receiving side only. It would be useful to have the same gateway on the
> transmitting side, so the item of type RTE_FLOW_ITEM_TYPE_META was
> proposed. The application can fill the field in the mbuf and this value
> is transferred to some field in the packet metadata inside the flow
> engine. It did not matter whether these metadata fields were shared,
> because the MARK and META items belonged to different domains (receiving
> and transmitting) and could be vendor-specific.
>
> So far, so good: DPDK provides entities to control metadata inside
> the flow engine and gateways to exchange these values on a per-packet
> basis via the datapaths.
>
> As we can see, the MARK and META means are not symmetric: there is no
> action that would allow us to set the META value on the transmitting path.
> So, the action of type:
>
> - RTE_FLOW_ACTION_TYPE_SET_META was proposed.
>
> Next, applications raise new requirements for packet metadata.
> Flow engines are getting more complex, internal switches are introduced,
> and multiple ports might be supported within the same flow engine
> namespace. From the DPDK point of view, this means that packets might be
> sent on one eth_dev port and received on another one, while the packet
> path inside the flow engine belongs entirely to the same hardware device.
> The simplest example is SR-IOV with PF, VFs and the representors. This is
> an opportunity to provide an out-of-band channel to transfer some extra
> data from one port to another, besides the packet data itself, and
> applications would like to use it.
>
> The application is supposed to use trials (with rte_flow_validate())
> to detect which metadata features (FLAG, MARK, META) are actually
> supported by the PMD and the underlying hardware. This might depend on
> PMD configuration, system software, hardware settings, etc., and should
> be detected at run time.
>
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

It is good enough as an experimental feature to try how it goes, so

Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
  2019-10-30 15:49               ` Olivier Matz
@ 2019-10-31  9:25                 ` Andrew Rybchenko
  0 siblings, 0 replies; 98+ messages in thread
From: Andrew Rybchenko @ 2019-10-31  9:25 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Slava Ovsiienko, Thomas Monjalon, dev, Matan Azrad, Ori Kam,
	Yongseok Koh

On 10/30/19 6:49 PM, Olivier Matz wrote:
> Hi,
>
> On Wed, Oct 30, 2019 at 10:35:16AM +0300, Andrew Rybchenko wrote:
>> @Olivier, please, take a look at the end of the mail.
>>
> (...)
>
>> On 10/29/19 8:19 PM, Slava Ovsiienko wrote:
>>>>> +};
>>>>> +
>>>>> +/* Mbuf dynamic field offset for metadata. */ extern int
>>>>> +rte_flow_dynf_metadata_offs;
>>>>> +
>>>>> +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
>>>>> +rte_flow_dynf_metadata_mask;
>>>> These two global variables look frightening to me.
>>>> It does not look good to me.
>>> For me too. But we need the performance, these ones are
>>> intended for usage in datapath, any overhead is painful.
>> @Olivier, could you share your thoughts, please.
> Having a global variable looks unavoidable to me, if we want
> performance.
>
> An alternative can be to use static global variables in every file that
> use this dynamic field, and call the register method from it. But I
> don't think it would scale if a dynamic field is widely used.

Yes, I see.

> Why does it look frightening to you?

It is just a good/bad design feeling. No specific technical reasons
right now. Let's try it and see how it goes.

> The constraint is that before using this variable, the register function
> has to be called. I don't think there are race conditions, because the
> field/flag registration is lock protected and always returns the same
> value.
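
The register-once, read-from-a-global pattern discussed above can be
modelled without DPDK. The sketch below is a simplified, single-threaded
stand-in, not the library code: struct model_mbuf, the model_* names, the
offset 0 and the flag bit 40 are all assumptions made for illustration,
and the real registration is lock protected as noted above.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_DYN_AREA_SIZE 16

/* Hypothetical stand-in for an mbuf: flags plus a dynamic area. */
struct model_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[MODEL_DYN_AREA_SIZE];
};

/* Globals cached at registration time and read in the datapath. */
static int model_meta_offs = -1;
static uint64_t model_meta_mask;

/* Idempotent: a repeated call returns the already registered offset. */
static inline int
model_meta_register(void)
{
	if (model_meta_offs >= 0)
		return model_meta_offs;
	model_meta_offs = 0;          /* arbitrary offset in the model */
	model_meta_mask = 1ULL << 40; /* arbitrary flag bit in the model */
	return model_meta_offs;
}

/* Non-zero once the field and flag have been registered. */
static inline int
model_meta_avail(void)
{
	return model_meta_mask != 0;
}

/* Read the 32-bit metadata value; valid only after registration. */
static inline uint32_t
model_meta_get(const struct model_mbuf *m)
{
	uint32_t v;

	memcpy(&v, m->dyn_area + model_meta_offs, sizeof(v));
	return v;
}

/* Write the value only; the caller sets the flag in ol_flags. */
static inline void
model_meta_set(struct model_mbuf *m, uint32_t v)
{
	memcpy(m->dyn_area + model_meta_offs, &v, sizeof(v));
}
```

The datapath cost is one global load for the offset and one for the mask,
which is the performance argument made for the global variables.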


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v6 2/2] ethdev: move egress metadata to dynamic field
  2019-10-31  9:01               ` Andrew Rybchenko
@ 2019-10-31 10:54                 ` Slava Ovsiienko
  0 siblings, 0 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-31 10:54 UTC (permalink / raw)
  To: Andrew Rybchenko, dev
  Cc: Matan Azrad, Raslan Darawsheh, Thomas Monjalon, olivier.matz, Ori Kam

> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Thursday, October 31, 2019 11:02
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> olivier.matz@6wind.com; Ori Kam <orika@mellanox.com>
> Subject: Re: [PATCH v6 2/2] ethdev: move egress metadata to dynamic field
> 
> On 10/30/19 8:12 PM, Viacheslav Ovsiienko wrote:
> > The dynamic mbuf fields were introduced by [1]. The egress metadata is
> > a good candidate to be moved from the statically allocated tx_metadata
> > field to a dynamic one. Because mbufs are used in half-duplex fashion
> > only, it is safe to share this dynamic field with ingress metadata.
> >
> > The shared dynamic field contains either egress metadata (if the
> > application is going to transmit the mbuf with tx_burst) or ingress
> > metadata (if the mbuf is received with rx_burst) and can be accessed
> > with the RTE_FLOW_DYNF_METADATA() macro or with the
> > rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> > routines. The PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag is set
> > along with the data.
> >
> > The mbuf dynamic field must be registered by calling
> > rte_flow_dynf_metadata_register() prior to accessing the data.
> >
> > The availability of the dynamic mbuf metadata field can be checked with
> > the rte_flow_dynf_metadata_avail() routine.
> >
> > [1] http://patches.dpdk.org/patch/62040/
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> 
> LGTM
> 
> I think release notes should be updated.
They are, in the first patch of the series. The tx_metadata endianness
change was already described there, so I added the move to the dynamic
field there as well. Would you like me to split it, or to add some more
details?

> 
> What I don't understand now is the way for application to understand if Tx
> metadata is supported or not.
> Corresponding offload flag is removed.
Yes, and I was crying bloody tears while doing that.
The metadata feature is getting complex. We have a set of actions and items
that might be supported by PMDs in multiple combinations; the supported
values and masks would also have to be queried, and we have no way to
describe them. So, a trial looks to be the only way to detect the supported
aspects of the metadata feature at run time.

> I guess the answer is
> rte_flow_validate() with a rule on egress which tries to match meta and do
> something (?).
Yes, that is the intended way; I call it a "trial".

> It should be highlighted in the documentation in any case, but I'd consider to
> keep the offload.
The metadata feature is considered an application requirement rather than
a PMD configuration option. Let's look at it from the other side.

If the application neither needs nor supports metadata, it just does not
register the field; PMDs are not bothered at all and work without any
metadata handling.
If the application relies strongly on metadata and requires it, it does
trials and fails if the required metadata aspects are not supported.
If the application can operate with or without metadata, it needs trials
anyway.
If the trials are OK, the application registers the metadata field and the
PMDs supporting this feature become engaged.
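
The decision flow described above can be sketched in plain C. This is an
illustration only: the enum names, meta_setup() and the probe/register
callbacks are hypothetical stand-ins (in a real application the probe
would wrap a trial with rte_flow_validate() and the register hook would
call rte_flow_dynf_metadata_register()).

```c
#include <stddef.h>

/* How much the application depends on metadata (hypothetical). */
enum meta_need { META_UNUSED, META_OPTIONAL, META_REQUIRED };
/* Outcome of the setup step (hypothetical). */
enum meta_result { META_OFF, META_ON, META_FAIL };

/* Stand-in for a trial rule: returns 0 when the PMD accepts it. */
typedef int (*meta_probe_fn)(void *ctx);
/* Stand-in for field registration: negative value on failure. */
typedef int (*meta_register_fn)(void);

static inline enum meta_result
meta_setup(enum meta_need need, meta_probe_fn probe, void *ctx,
	   meta_register_fn reg)
{
	/* No need for metadata: never register, PMDs stay untouched. */
	if (need == META_UNUSED)
		return META_OFF;
	/* Trial first; register the field only if the trial passed. */
	if (probe(ctx) != 0 || reg() < 0)
		return need == META_REQUIRED ? META_FAIL : META_OFF;
	return META_ON;
}

/* Example stubs used to exercise the flow. */
static inline int probe_accept(void *ctx) { (void)ctx; return 0; }
static inline int probe_reject(void *ctx) { (void)ctx; return -1; }
static inline int register_ok(void) { return 0; }
```

An application with optional metadata support simply continues without it
when meta_setup() returns META_OFF, while one that requires metadata
treats META_FAIL as fatal.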

With best regards, Slava




^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v7 0/2] extend flow metadata feature
  2019-10-30 17:12             ` [dpdk-dev] [PATCH v6 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
  2019-10-31  9:19               ` Andrew Rybchenko
@ 2019-10-31 13:05               ` Viacheslav Ovsiienko
  2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
  2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 " Viacheslav Ovsiienko
  1 sibling, 2 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-31 13:05 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika

This patchset combines two metadata-related patches
to ensure the right order of application. The first patch introduces
the ingress metadata using an mbuf dynamic field; the
second one moves the egress metadata to the dynamic field
introduced by the first patch.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

Viacheslav Ovsiienko (2):
  ethdev: extend flow metadata
  ethdev: move egress metadata to dynamic field

 app/test-pmd/cmdline.c                   |   3 +-
 app/test-pmd/cmdline_flow.c              |  57 ++++++++++++++++-
 app/test-pmd/testpmd.c                   |   4 --
 app/test-pmd/testpmd.h                   |   2 +-
 app/test-pmd/util.c                      |  16 +++--
 app/test/test_mbuf.c                     |   1 -
 doc/guides/prog_guide/rte_flow.rst       |  72 ++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   |  18 ++++++
 drivers/net/mlx5/mlx5_flow_dv.c          |  19 ++----
 drivers/net/mlx5/mlx5_rxtx.c             |  22 +++----
 drivers/net/mlx5/mlx5_rxtx_vec.h         |   6 --
 drivers/net/mlx5/mlx5_txq.c              |   4 --
 lib/librte_ethdev/rte_ethdev.c           |   1 -
 lib/librte_ethdev/rte_ethdev.h           |   5 --
 lib/librte_ethdev/rte_ethdev_version.map |   3 +
 lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
 lib/librte_ethdev/rte_flow.h             | 104 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf.c               |   2 -
 lib/librte_mbuf/rte_mbuf_core.h          |  19 +-----
 lib/librte_mbuf/rte_mbuf_dyn.h           |   8 ++-
 20 files changed, 313 insertions(+), 93 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v7 1/2] ethdev: extend flow metadata
  2019-10-31 13:05               ` [dpdk-dev] [PATCH v7 0/2] extend flow metadata feature Viacheslav Ovsiienko
@ 2019-10-31 13:05                 ` Viacheslav Ovsiienko
  2019-10-31 15:47                   ` Olivier Matz
  2019-10-31 16:48                   ` [dpdk-dev] [PATCH v8 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 " Viacheslav Ovsiienko
  1 sibling, 2 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-31 13:05 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika, Yongseok Koh

Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There is also a need to support metadata on ingress. Metadata can be set
by the SET_META action and matched by the META item, as on Tx. The final
value set by the action is delivered to the application via the metadata
dynamic field of the mbuf, which can be accessed with the
RTE_FLOW_DYNF_METADATA() macro or with the rte_flow_dynf_metadata_set()
and rte_flow_dynf_metadata_get() helper routines. The
PKT_RX_DYNF_METADATA flag is set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to using the SET_META action.

The availability of the dynamic mbuf metadata field can be checked
with the rte_flow_dynf_metadata_avail() routine.

If an application is going to engage the metadata feature, it registers
the metadata dynamic field and flag; the PMD then checks the metadata
field availability and handles the appropriate fields in the datapath.

For loopback/hairpin packets, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.

MARK and METADATA look similar and might operate in a similar way,
but they do not interact.

Initially, two metadata-related actions were proposed:

- RTE_FLOW_ACTION_TYPE_FLAG
- RTE_FLOW_ACTION_TYPE_MARK

These actions set a special flag in the packet metadata; the MARK action
additionally stores a specified value in the metadata storage. On packet
reception the PMD puts the flag and the value into the mbuf, so the
application can see that the packet was treated inside the flow engine
according to the appropriate RTE flow(s). MARK and FLAG act as a gateway
that transfers per-packet information from the flow engine to the
application via the receiving datapath. The item of type
RTE_FLOW_ITEM_TYPE_MARK is also provided. It extends the flow match
pattern with the capability to match the metadata values set by
MARK/FLAG actions in other flows.

From the datapath point of view, MARK and FLAG are related to the
receiving side only. It would be useful to have the same gateway on the
transmitting side, so the item of type RTE_FLOW_ITEM_TYPE_META was
proposed. The application can fill the field in the mbuf and this value
is transferred to some field in the packet metadata inside the flow
engine. It did not matter whether these metadata fields were shared,
because the MARK and META items belonged to different domains (receiving
and transmitting) and could be vendor-specific.

So far, so good: DPDK provides entities to control metadata inside
the flow engine and gateways to exchange these values on a per-packet
basis via the datapaths.

As we can see, the MARK and META means are not symmetric: there is no
action that would allow us to set the META value on the transmitting path.
So, the action of type:

- RTE_FLOW_ACTION_TYPE_SET_META was proposed.

Next, applications raise new requirements for packet metadata.
Flow engines are getting more complex, internal switches are introduced,
and multiple ports might be supported within the same flow engine
namespace. From the DPDK point of view, this means that packets might be
sent on one eth_dev port and received on another one, while the packet
path inside the flow engine belongs entirely to the same hardware device.
The simplest example is SR-IOV with PF, VFs and the representors. This is
an opportunity to provide an out-of-band channel to transfer some extra
data from one port to another, besides the packet data itself, and
applications would like to use it.

The application is supposed to use trials (with rte_flow_validate())
to detect which metadata features (FLAG, MARK, META) are actually
supported by the PMD and the underlying hardware. This might depend on
PMD configuration, system software, hardware settings, etc., and should
be detected at run time.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
v7: - updated release notes in collateral patch

v6: - http://patches.dpdk.org/patch/62245/
    - minor code style issues
    - combined in a series with the following egress metadata patch

v5: - http://patches.dpdk.org/patch/62179/
    - addressed code style issues from comments
    - Tx metadata deprecation notice removed
      (dedicated tx_metadata patch is coming)
    - MBUF_DYNF_METADATA_NAME is split into dedicated FIELD and FLAG
      names, RTE prefix is added
    - metadata historical retrospective is added to the log message
    - rebased

v4: - http://patches.dpdk.org/patch/62065/
    - documentation comments addressed
    - deprecation notice for Tx metadata offload flag
    - rebased

v3: - http://patches.dpdk.org/patch/61902/
    - rebased, neat updates

v2: - http://patches.dpdk.org/patch/60909/

v1: - http://patches.dpdk.org/patch/56104/
    - rfc: http://patches.dpdk.org/patch/54271/

 app/test-pmd/cmdline_flow.c              |  57 ++++++++++++++++-
 app/test-pmd/util.c                      |   5 ++
 doc/guides/prog_guide/rte_flow.rst       |  72 ++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   |  13 ++++
 lib/librte_ethdev/rte_ethdev_version.map |   3 +
 lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
 lib/librte_ethdev/rte_flow.h             | 103 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_dyn.h           |   8 ++-
 8 files changed, 279 insertions(+), 22 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 0d0bc0a..e4ef066 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -316,6 +316,9 @@ enum index {
 	ACTION_RAW_ENCAP_INDEX_VALUE,
 	ACTION_RAW_DECAP_INDEX,
 	ACTION_RAW_DECAP_INDEX_VALUE,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -1067,6 +1070,7 @@ struct parse_action_priv {
 	ACTION_DEC_TCP_ACK,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1265,6 +1269,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -1329,6 +1340,10 @@ static int parse_vc_action_raw_encap_index(struct context *,
 static int parse_vc_action_raw_decap_index(struct context *,
 					   const struct token *, const char *,
 					   unsigned int, void *, unsigned int);
+static int parse_vc_action_set_meta(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
+				    unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -3378,7 +3393,31 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.help = "index of raw_encap/raw_decap data",
 		.next = NEXT(next_item),
 		.call = parse_port,
-	}
+	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc_action_set_meta,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -4818,6 +4857,22 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return ret;
 }
 
+static int
+parse_vc_action_set_meta(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	ret = rte_flow_dynf_metadata_register();
+	if (ret < 0)
+		return -1;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index f20531d..56075b3 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,6 +82,11 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
+		if (ol_flags & PKT_TX_METADATA)
+			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_RX_DYNF_METADATA)
+			printf(" - Rx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (mb->packet_type) {
 			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
 			printf(" - hw ptype: %s", buf);
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 159ce19..c943aca 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -1232,21 +1258,6 @@ Matches a PPPoE session protocol identifier.
 - ``proto_id``: PPP protocol identifier.
 - Default ``mask`` matches proto_id only.
 
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+--------------------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Item: ``NSH``
 ^^^^^^^^^^^^^
 
@@ -2466,6 +2477,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
 
 Using this action on non-matching traffic will result in undefined behavior.
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by
+``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
+``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
+with the data.
+
+The mbuf dynamic field must be registered by calling
+``rte_flow_dynf_metadata_register()`` prior to using the ``SET_META`` action.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index f6e90cb..963c4f8 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -237,6 +237,14 @@ New Features
   On supported NICs, we can now setup haipin queue which will offload packets
   from the wire, backto the wire.
 
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
+    PKT_RX_DYNF_METADATA.
+
 
 Removed Items
 -------------
@@ -344,6 +352,11 @@ API Changes
   has been introduced in this release is used when used when all the packets
   enqueued in the tx adapter are destined for the same Ethernet port & Tx queue.
 
+* metadata: RTE_FLOW_ITEM_TYPE_META data endianness changed to host order.
+  Because the new dynamic metadata field in mbuf is also host-endian, there
+  is the minor compatibility issue for applications in case of 32-bit values
+  supported.
+
 * sched: The pipe nodes configuration parameters such as number of pipes,
   pipe queue sizes, pipe profiles, etc., are moved from port level structure
   to subport level. This allows different subports of the same port to
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index 48b5389..e593f34 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -291,4 +291,7 @@ EXPERIMENTAL {
 	rte_eth_rx_hairpin_queue_setup;
 	rte_eth_tx_hairpin_queue_setup;
 	rte_eth_dev_hairpin_capability_get;
+	rte_flow_dynf_metadata_offs;
+	rte_flow_dynf_metadata_mask;
+	rte_flow_dynf_metadata_register;
 };
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index ca0f680..b0490cd 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -12,10 +12,18 @@
 #include <rte_errno.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include "rte_ethdev.h"
 #include "rte_flow_driver.h"
 #include "rte_flow.h"
 
+/* Mbuf dynamic field offset for metadata. */
+int rte_flow_dynf_metadata_offs = -1;
+
+/* Mbuf dynamic flag mask for metadata. */
+uint64_t rte_flow_dynf_metadata_mask;
+
 /**
  * Flow elements description tables.
  */
@@ -157,8 +165,40 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
+	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
 };
 
+int
+rte_flow_dynf_metadata_register(void)
+{
+	int offset;
+	int flag;
+
+	static const struct rte_mbuf_dynfield desc_offs = {
+		.name = RTE_MBUF_DYNFIELD_METADATA_NAME,
+		.size = sizeof(uint32_t),
+		.align = __alignof__(uint32_t),
+	};
+	static const struct rte_mbuf_dynflag desc_flag = {
+		.name = RTE_MBUF_DYNFLAG_METADATA_NAME,
+	};
+
+	offset = rte_mbuf_dynfield_register(&desc_offs);
+	if (offset < 0)
+		goto error;
+	flag = rte_mbuf_dynflag_register(&desc_flag);
+	if (flag < 0)
+		goto error;
+	rte_flow_dynf_metadata_offs = offset;
+	rte_flow_dynf_metadata_mask = (1ULL << flag);
+	return 0;
+
+error:
+	rte_flow_dynf_metadata_offs = -1;
+	rte_flow_dynf_metadata_mask = 0ULL;
+	return -rte_errno;
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 4fee105..f6e050c 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -28,6 +28,8 @@
 #include <rte_byteorder.h>
 #include <rte_esp.h>
 #include <rte_higig.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -418,7 +420,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1263,18 +1266,23 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * RTE_FLOW_ITEM_TYPE_META
  *
- * Matches a specified metadata value.
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
-	rte_be32_t data;
+	uint32_t data;
 };
 
 /** Default mask for RTE_FLOW_ITEM_TYPE_META. */
 #ifndef __cplusplus
 static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
-	.data = RTE_BE32(UINT32_MAX),
+	.data = UINT32_MAX,
 };
 #endif
 
@@ -1942,6 +1950,13 @@ enum rte_flow_action_type {
 	 * undefined behavior.
 	 */
 	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2429,6 +2444,57 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * ingress, the metadata will be carried by mbuf metadata dynamic field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which have never
+ * been set, unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
+ * or may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	uint32_t data;
+	uint32_t mask;
+};
+
+/* Mbuf dynamic field offset for metadata. */
+extern int rte_flow_dynf_metadata_offs;
+
+/* Mbuf dynamic field flag mask for metadata. */
+extern uint64_t rte_flow_dynf_metadata_mask;
+
+/* Mbuf dynamic field pointer for metadata. */
+#define RTE_FLOW_DYNF_METADATA(m) \
+	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
+
+/* Mbuf dynamic flag for metadata. */
+#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+
+__rte_experimental
+static inline uint32_t
+rte_flow_dynf_metadata_get(struct rte_mbuf *m)
+{
+	return *RTE_FLOW_DYNF_METADATA(m);
+}
+
+__rte_experimental
+static inline void
+rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
+{
+	*RTE_FLOW_DYNF_METADATA(m) = v;
+}
+
 /*
  * Definition of a single action.
  *
@@ -2662,6 +2728,33 @@ enum rte_flow_conv_op {
 };
 
 /**
+ * Check if mbuf dynamic field for metadata is registered.
+ *
+ * @return
+ *   True if registered, false otherwise.
+ */
+__rte_experimental
+static inline int
+rte_flow_dynf_metadata_avail(void)
+{
+	return !!rte_flow_dynf_metadata_mask;
+}
+
+/**
+ * Register mbuf dynamic field and flag for metadata.
+ *
+ * This function must be called prior to using the SET_META action in order to
+ * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
+ * the application.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_dynf_metadata_register(void);
+
+/**
  * Check whether a flow rule can be created on a given port.
  *
  * The flow rule is validated for correctness and whether it could be accepted
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 2e9d418..de651c1 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -234,6 +234,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
 __rte_experimental
 void rte_mbuf_dyn_dump(FILE *out);
 
-/* Placeholder for dynamic fields and flags declarations. */
+/*
+ * Placeholder for dynamic fields and flags declarations.
+ * This is centralizing point to gather all field names
+ * and parameters together.
+ */
+#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
+#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
 
 #endif
-- 
1.8.3.1
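
The SET_META documentation added by the patch above states that partial
bits can be altered with ``mask``. One plausible reading, which is an
assumption here and not something the patch spells out, is a
read-modify-write where only the bits selected by the mask are taken
from ``data``. demo_apply_set_meta is a hypothetical helper written for
this note; actual behaviour depends on the driver and hardware.

```c
#include <stdint.h>

/*
 * Assumed SET_META semantics: bits selected by 'mask' come from 'data',
 * all other bits keep their current value.
 */
static uint32_t
demo_apply_set_meta(uint32_t current, uint32_t data, uint32_t mask)
{
	return (current & ~mask) | (data & mask);
}
```

Under this reading, applying data 0xAABBCCDD with mask 0x0000FFFF to a
current value of 0x11223344 gives 0x1122CCDD. Bits that were never set
by any rule are documented as unpredictable, so an application should
not rely on the unmasked part unless it has set it itself.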



* [dpdk-dev] [PATCH v7 2/2] ethdev: move egress metadata to dynamic field
  2019-10-31 13:05               ` [dpdk-dev] [PATCH v7 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
@ 2019-10-31 13:05                 ` Viacheslav Ovsiienko
  2019-10-31 13:33                   ` Ori Kam
  2019-10-31 15:51                   ` Olivier Matz
  1 sibling, 2 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-31 13:05 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika

The dynamic mbuf fields were introduced by [1]. The egress metadata is a
good candidate to be moved from the statically allocated field tx_metadata
to a dynamic one. Because mbufs are used in half-duplex fashion only, it is
safe to share this dynamic field with ingress metadata.

The shared dynamic field contains either egress metadata (if the application
is going to transmit the mbuf with tx_burst) or ingress metadata (if the mbuf
was received with rx_burst). It can be accessed with the
RTE_FLOW_DYNF_METADATA() macro or with the rte_flow_dynf_metadata_set() and
rte_flow_dynf_metadata_get() helper routines. The
PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to accessing the data.

The availability of dynamic mbuf metadata field can be checked with
rte_flow_dynf_metadata_avail() routine.
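
The register / check / access sequence described above can be sketched
with a self-contained model. This is not the DPDK implementation: struct
demo_mbuf, the demo_* functions, the fixed offset and the flag bit 40 are
all stand-ins invented for illustration. In DPDK the offset and flag bit
are handed out by the mbuf library when
rte_flow_dynf_metadata_register() is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct rte_mbuf. */
struct demo_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[16];	/* room handed out to dynamic fields */
};

static int demo_metadata_offs = -1;	/* like rte_flow_dynf_metadata_offs */
static uint64_t demo_metadata_mask;	/* like rte_flow_dynf_metadata_mask */

/* Hypothetical registration: picks a fixed offset and flag bit. */
static int
demo_metadata_register(void)
{
	demo_metadata_offs = (int)offsetof(struct demo_mbuf, dyn_area);
	demo_metadata_mask = 1ULL << 40;
	return 0;
}

/* Non-zero once the field and flag are registered. */
static int
demo_metadata_avail(void)
{
	return !!demo_metadata_mask;
}

static void
demo_metadata_set(struct demo_mbuf *m, uint32_t v)
{
	assert(demo_metadata_offs >= 0);
	memcpy((uint8_t *)m + demo_metadata_offs, &v, sizeof(v));
}

static uint32_t
demo_metadata_get(const struct demo_mbuf *m)
{
	uint32_t v;

	assert(demo_metadata_offs >= 0);
	memcpy(&v, (const uint8_t *)m + demo_metadata_offs, sizeof(v));
	return v;
}

/* Tx side: attach metadata only if the dynamic field is registered. */
static int
demo_tx_attach_metadata(struct demo_mbuf *m, uint32_t v)
{
	if (!demo_metadata_avail())
		return -1;
	demo_metadata_set(m, v);
	m->ol_flags |= demo_metadata_mask;	/* shared Rx/Tx flag bit */
	return 0;
}
```

Because Rx and Tx share one field and one flag bit in this scheme, the
same accessor works in both directions; what the value means depends on
whether the mbuf is about to be transmitted or was just received.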

DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed.
The metadata support in PMDs is engaged on dynamic field registration.

The metadata feature is getting complex. There may be a set of actions
and items that PMDs support in multiple combinations, so the supported
values and masks are to be queried by performing trials (with
rte_flow_validate).

[1] http://patches.dpdk.org/patch/62040/

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---

v7: - updates release notes
v6: - http://patches.dpdk.org/patch/62244/

 app/test-pmd/cmdline.c                 |  3 ++-
 app/test-pmd/testpmd.c                 |  4 ----
 app/test-pmd/testpmd.h                 |  2 +-
 app/test-pmd/util.c                    | 15 +++++++++------
 app/test/test_mbuf.c                   |  1 -
 doc/guides/prog_guide/rte_flow.rst     |  6 +++---
 doc/guides/rel_notes/release_19_11.rst |  5 +++++
 drivers/net/mlx5/mlx5_flow_dv.c        | 19 ++++++-------------
 drivers/net/mlx5/mlx5_rxtx.c           | 22 +++++++++++-----------
 drivers/net/mlx5/mlx5_rxtx_vec.h       |  6 ------
 drivers/net/mlx5/mlx5_txq.c            |  4 ----
 lib/librte_ethdev/rte_ethdev.c         |  1 -
 lib/librte_ethdev/rte_ethdev.h         |  5 -----
 lib/librte_ethdev/rte_flow.h           | 19 ++++++++++---------
 lib/librte_mbuf/rte_mbuf.c             |  2 --
 lib/librte_mbuf/rte_mbuf_core.h        | 19 +------------------
 16 files changed, 48 insertions(+), 85 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4478069..49c45a3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -18718,12 +18718,13 @@ struct cmd_config_tx_metadata_specific_result {
 
 	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	ports[res->port_id].tx_metadata = rte_cpu_to_be_32(res->value);
+	ports[res->port_id].tx_metadata = res->value;
 	/* Add/remove callback to insert valid metadata in every Tx packet. */
 	if (ports[res->port_id].tx_metadata)
 		add_tx_md_callback(res->port_id);
 	else
 		remove_tx_md_callback(res->port_id);
+	rte_flow_dynf_metadata_register();
 }
 
 cmdline_parse_token_string_t cmd_config_tx_metadata_specific_port =
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 0fc5b45..206c12b 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1167,10 +1167,6 @@ struct extmem_param {
 		      DEV_TX_OFFLOAD_MBUF_FAST_FREE))
 			port->dev_conf.txmode.offloads &=
 				~DEV_TX_OFFLOAD_MBUF_FAST_FREE;
-		if (!(port->dev_info.tx_offload_capa &
-			DEV_TX_OFFLOAD_MATCH_METADATA))
-			port->dev_conf.txmode.offloads &=
-				~DEV_TX_OFFLOAD_MATCH_METADATA;
 		if (numa_support) {
 			if (port_numa[pid] != NUMA_NO_CONFIG)
 				port_per_socket[port_numa[pid]]++;
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 8da1e8e..caabf32 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -193,7 +193,7 @@ struct rte_port {
 	struct softnic_port     softport;  /**< softnic params */
 #endif
 	/**< metadata value to insert in Tx packets. */
-	rte_be32_t		tx_metadata;
+	uint32_t		tx_metadata;
 	const struct rte_eth_rxtx_callback *tx_set_md_cb[MAX_QUEUE_ID+1];
 };
 
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 56075b3..cf41864 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,8 +82,9 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
-		if (ol_flags & PKT_TX_METADATA)
-			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_TX_DYNF_METADATA)
+			printf(" - Tx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (ol_flags & PKT_RX_DYNF_METADATA)
 			printf(" - Rx metadata: 0x%x",
 			       *RTE_FLOW_DYNF_METADATA(mb));
@@ -188,10 +189,12 @@
 	 * Add metadata value to every Tx packet,
 	 * and set ol_flags accordingly.
 	 */
-	for (i = 0; i < nb_pkts; i++) {
-		pkts[i]->tx_metadata = ports[port_id].tx_metadata;
-		pkts[i]->ol_flags |= PKT_TX_METADATA;
-	}
+	if (rte_flow_dynf_metadata_avail())
+		for (i = 0; i < nb_pkts; i++) {
+			*RTE_FLOW_DYNF_METADATA(pkts[i]) =
+						ports[port_id].tx_metadata;
+			pkts[i]->ol_flags |= PKT_TX_DYNF_METADATA;
+		}
 	return nb_pkts;
 }
 
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 854bc26..61ecffc 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1669,7 +1669,6 @@ struct flag_name {
 		VAL_NAME(PKT_TX_SEC_OFFLOAD),
 		VAL_NAME(PKT_TX_UDP_SEG),
 		VAL_NAME(PKT_TX_OUTER_UDP_CKSUM),
-		VAL_NAME(PKT_TX_METADATA),
 	};
 
 	/* Test case to check with valid flag */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index c943aca..630e4c0 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -664,7 +664,7 @@ Item: ``META``
 Matches 32 bit metadata item set.
 
 On egress, metadata can be set either by mbuf metadata field with
-PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+PKT_TX_DYNF_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
 action sets metadata for a packet and the metadata will be reported via
 ``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
 
@@ -2482,8 +2482,8 @@ Action: ``SET_META``
 
 Set metadata. Item ``META`` matches metadata.
 
-Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
-overridden by this action. On ingress, the metadata will be carried by
+Metadata set by mbuf metadata field with PKT_TX_DYNF_METADATA flag on egress
+will be overridden by this action. On ingress, the metadata will be carried by
 ``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
 ``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
 with the data.
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 963c4f8..2e9a596 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -357,6 +357,11 @@ API Changes
   is the minor compatibility issue for applications in case of 32-bit values
   supported.
 
+* metadata: the tx_metadata mbuf field is moved to a dynamic one.
+  PKT_TX_METADATA flag is replaced with PKT_TX_DYNF_METADATA.
+  DEV_TX_OFFLOAD_MATCH_METADATA offload flag is removed, now metadata
+  support in PMD is engaged on dynamic field registration.
+
 * sched: The pipe nodes configuration parameters such as number of pipes,
   pipe queue sizes, pipe profiles, etc., are moved from port level structure
   to subport level. This allows different subports of the same port to
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index d9a7fd4..f961bff 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -793,7 +793,7 @@ struct field_modify_info modify_tcp[] = {
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-flow_dv_validate_item_meta(struct rte_eth_dev *dev,
+flow_dv_validate_item_meta(struct rte_eth_dev *dev __rte_unused,
 			   const struct rte_flow_item *item,
 			   const struct rte_flow_attr *attr,
 			   struct rte_flow_error *error)
@@ -801,17 +801,10 @@ struct field_modify_info modify_tcp[] = {
 	const struct rte_flow_item_meta *spec = item->spec;
 	const struct rte_flow_item_meta *mask = item->mask;
 	const struct rte_flow_item_meta nic_mask = {
-		.data = RTE_BE32(UINT32_MAX)
+		.data = UINT32_MAX
 	};
 	int ret;
-	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
 
-	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
-		return rte_flow_error_set(error, EPERM,
-					  RTE_FLOW_ERROR_TYPE_ITEM,
-					  NULL,
-					  "match on metadata offload "
-					  "configuration is off for this port");
 	if (!spec)
 		return rte_flow_error_set(error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
@@ -4750,10 +4743,10 @@ struct field_modify_info modify_tcp[] = {
 		meta_m = &rte_flow_item_meta_mask;
 	meta_v = (const void *)item->spec;
 	if (meta_v) {
-		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
-			 rte_be_to_cpu_32(meta_m->data));
-		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
-			 rte_be_to_cpu_32(meta_v->data & meta_m->data));
+		MLX5_SET(fte_match_set_misc2, misc2_m,
+			 metadata_reg_a, meta_m->data);
+		MLX5_SET(fte_match_set_misc2, misc2_v,
+			 metadata_reg_a, meta_v->data & meta_m->data);
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index f597c89..88a4378 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -2281,8 +2281,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	/* Engage VLAN tag insertion feature if requested. */
 	if (MLX5_TXOFF_CONFIG(VLAN) &&
 	    loc->mbuf->ol_flags & PKT_TX_VLAN_PKT) {
@@ -2341,8 +2341,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -2434,8 +2434,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -2628,8 +2628,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -3700,8 +3700,8 @@ enum mlx5_txcmp_code {
 		return false;
 	/* Fill metadata field if needed. */
 	if (MLX5_TXOFF_CONFIG(METADATA) &&
-		es->metadata != (loc->mbuf->ol_flags & PKT_TX_METADATA ?
-				 loc->mbuf->tx_metadata : 0))
+		es->metadata != (loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+				 *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0))
 		return false;
 	/* There must be no VLAN packets in eMPW loop. */
 	if (MLX5_TXOFF_CONFIG(VLAN))
@@ -5149,7 +5149,7 @@ enum mlx5_txcmp_code {
 		 */
 		olx |= MLX5_TXOFF_CONFIG_EMPW;
 	}
-	if (tx_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
+	if (rte_flow_dynf_metadata_avail()) {
 		/* We should support Flow metadata. */
 		olx |= MLX5_TXOFF_CONFIG_METADATA;
 	}
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index b54ff72..85e0bd5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -19,12 +19,6 @@
 	 DEV_TX_OFFLOAD_TCP_CKSUM | \
 	 DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM)
 
-/* HW offload capabilities of vectorized Tx. */
-#define MLX5_VEC_TX_OFFLOAD_CAP \
-	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
-	 DEV_TX_OFFLOAD_MATCH_METADATA | \
-	 DEV_TX_OFFLOAD_MULTI_SEGS)
-
 /*
  * Compile time sanity check for vectorized functions.
  */
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index dfc379c..97991f0 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -128,10 +128,6 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
-#ifdef HAVE_IBV_FLOW_DV_SUPPORT
-	if (config->dv_flow_en)
-		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
-#endif
 	return offloads;
 }
 
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 68aca1f..23b751f 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -161,7 +161,6 @@ struct rte_eth_xstats_name_off {
 	RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
-	RTE_TX_OFFLOAD_BIT2STR(MATCH_METADATA),
 };
 
 #undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 9b69255..28e29c7 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1145,11 +1145,6 @@ struct rte_eth_conf {
 #define DEV_TX_OFFLOAD_IP_TNL_TSO       0x00080000
 /** Device supports outer UDP checksum */
 #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
-/**
- * Device supports match on metadata Tx offload..
- * Application must set PKT_TX_METADATA and mbuf metadata field.
- */
-#define DEV_TX_OFFLOAD_MATCH_METADATA   0x00200000
 
 #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
 /**< Device supports Rx queue setup after device started*/
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f6e050c..51d8292 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -1268,12 +1268,12 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 /**
  * RTE_FLOW_ITEM_TYPE_META
  *
- * Matches a specified metadata value. On egress, metadata can be set either by
- * mbuf tx_metadata field with PKT_TX_METADATA flag or
- * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
- * metadata for a packet and the metadata will be reported via mbuf metadata
- * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
- * registered in advance by rte_flow_dynf_metadata_register().
+ * Matches a specified metadata value. On egress, metadata can be set
+ * either by mbuf dynamic metadata field with PKT_TX_DYNF_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META
+ * sets metadata for a packet and the metadata will be reported via mbuf
+ * metadata dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
+ * field must be registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
 	uint32_t data;
@@ -2450,8 +2450,8 @@ struct rte_flow_action_set_mac {
  *
  * RTE_FLOW_ACTION_TYPE_SET_META
  *
- * Set metadata. Metadata set by mbuf tx_metadata field with
- * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * Set metadata. Metadata set by mbuf metadata dynamic field with
+ * PKT_TX_DYNF_METADATA flag on egress will be overridden by this action. On
  * ingress, the metadata will be carried by mbuf metadata dynamic field
  * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
  * registered in advance by rte_flow_dynf_metadata_register().
@@ -2478,8 +2478,9 @@ struct rte_flow_action_set_meta {
 #define RTE_FLOW_DYNF_METADATA(m) \
 	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
 
-/* Mbuf dynamic flag for metadata. */
+/* Mbuf dynamic flags for metadata. */
 #define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+#define PKT_TX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
 
 __rte_experimental
 static inline uint32_t
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8c51dc1..35df1c4 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -670,7 +670,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
 	case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
 	case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
 	case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
-	case PKT_TX_METADATA: return "PKT_TX_METADATA";
 	default: return NULL;
 	}
 }
@@ -707,7 +706,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
 		{ PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
 		{ PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
 		{ PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
-		{ PKT_TX_METADATA, PKT_TX_METADATA, NULL },
 	};
 	const char *name;
 	unsigned int i;
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 3022701..edfc7e9 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -192,11 +192,6 @@
 /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
 
 /**
- * Indicate that the metadata field in the mbuf is in use.
- */
-#define PKT_TX_METADATA	(1ULL << 40)
-
-/**
  * Outer UDP checksum offload flag. This flag is used for enabling
  * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
  * 1) Enable the following in mbuf,
@@ -389,8 +384,7 @@
 		PKT_TX_MACSEC |		 \
 		PKT_TX_SEC_OFFLOAD |	 \
 		PKT_TX_UDP_SEG |	 \
-		PKT_TX_OUTER_UDP_CKSUM | \
-		PKT_TX_METADATA)
+		PKT_TX_OUTER_UDP_CKSUM)
 
 /**
  * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
@@ -601,17 +595,6 @@ struct rte_mbuf {
 			/**< User defined tags. See rte_distributor_process() */
 			uint32_t usr;
 		} hash;                   /**< hash information */
-		struct {
-			/**
-			 * Application specific metadata value
-			 * for egress flow rule match.
-			 * Valid if PKT_TX_METADATA is set.
-			 * Located here to allow conjunct use
-			 * with hash.sched.hi.
-			 */
-			uint32_t tx_metadata;
-			uint32_t reserved;
-		};
 	};
 
 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
-- 
1.8.3.1



* Re: [dpdk-dev] [PATCH v7 2/2] ethdev: move egress metadata to dynamic field
  2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 " Viacheslav Ovsiienko
@ 2019-10-31 13:33                   ` Ori Kam
  2019-10-31 15:51                   ` Olivier Matz
  1 sibling, 0 replies; 98+ messages in thread
From: Ori Kam @ 2019-10-31 13:33 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Matan Azrad, Raslan Darawsheh, Thomas Monjalon, olivier.matz, arybchenko



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Thursday, October 31, 2019 3:05 PM
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> olivier.matz@6wind.com; arybchenko@solarflare.com; Ori Kam
> <orika@mellanox.com>
> Subject: [dpdk-dev] [PATCH v7 2/2] ethdev: move egress metadata to dynamic
> field
> 
> The dynamic mbuf fields were introduced by [1]. The egress metadata is
> a good candidate to be moved from statically allocated field tx_metadata
> to a dynamic one. Because mbufs are used in half-duplex fashion only, it
> is safe to share this dynamic field with ingress metadata.
> 
> The shared dynamic field contains either egress (if the application is
> going to transmit the mbuf with tx_burst) or ingress (if the mbuf is
> received with rx_burst) metadata and can be accessed by the
> RTE_FLOW_DYNF_METADATA() macro or with the rte_flow_dynf_metadata_set()
> and rte_flow_dynf_metadata_get() helper routines. The
> PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set along with the
> data.
> 
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to accessing the data.
> 
> The availability of dynamic mbuf metadata field can be checked with
> rte_flow_dynf_metadata_avail() routine.
> 
> The DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is
> removed. The metadata support in PMDs is engaged on dynamic field
> registration.
> 
> The metadata feature is getting complex. Some sets of actions and items
> might be supported by PMDs in multiple combinations, so the supported
> values and masks are to be queried by performing trials (with
> rte_flow_validate).
> 
> [1] http://patches.dpdk.org/patch/62040/
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> 
> v7: - updates release notes
> v6: - http://patches.dpdk.org/patch/62244/
> 
>  app/test-pmd/cmdline.c                 |  3 ++-
>  app/test-pmd/testpmd.c                 |  4 ----
>  app/test-pmd/testpmd.h                 |  2 +-
>  app/test-pmd/util.c                    | 15 +++++++++------
>  app/test/test_mbuf.c                   |  1 -
>  doc/guides/prog_guide/rte_flow.rst     |  6 +++---
>  doc/guides/rel_notes/release_19_11.rst |  5 +++++
>  drivers/net/mlx5/mlx5_flow_dv.c        | 19 ++++++-------------
>  drivers/net/mlx5/mlx5_rxtx.c           | 22 +++++++++++-----------
>  drivers/net/mlx5/mlx5_rxtx_vec.h       |  6 ------
>  drivers/net/mlx5/mlx5_txq.c            |  4 ----
>  lib/librte_ethdev/rte_ethdev.c         |  1 -
>  lib/librte_ethdev/rte_ethdev.h         |  5 -----
>  lib/librte_ethdev/rte_flow.h           | 19 ++++++++++---------
>  lib/librte_mbuf/rte_mbuf.c             |  2 --
>  lib/librte_mbuf/rte_mbuf_core.h        | 19 +------------------
>  16 files changed, 48 insertions(+), 85 deletions(-)
> 

Acked-by: Ori Kam <orika@mellanox.com>

Thanks,
Ori


* Re: [dpdk-dev] [PATCH v7 1/2] ethdev: extend flow metadata
  2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
@ 2019-10-31 15:47                   ` Olivier Matz
  2019-10-31 16:13                     ` Slava Ovsiienko
  2019-10-31 16:48                   ` [dpdk-dev] [PATCH v8 0/2] extend flow metadata feature Viacheslav Ovsiienko
  1 sibling, 1 reply; 98+ messages in thread
From: Olivier Matz @ 2019-10-31 15:47 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, matan, rasland, thomas, arybchenko, orika, Yongseok Koh

Hi Slava,

One comment at the end.

On Thu, Oct 31, 2019 at 01:05:20PM +0000, Viacheslav Ovsiienko wrote:
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
> 
> This patch extends the metadata feature usability.
> 
> 1) RTE_FLOW_ACTION_TYPE_SET_META
> 
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
> 
> 2) Metadata on ingress
> 
> There's also a need to support metadata on ingress. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via metadata dynamic field of
> mbuf which can be accessed by RTE_FLOW_DYNF_METADATA() macro or with
> rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> routines. PKT_RX_DYNF_METADATA flag will be set along with the data.
> 
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to using the SET_META action.
> 
> The availability of dynamic mbuf metadata field can be checked
> with rte_flow_dynf_metadata_avail() routine.
> 
> If the application is going to engage the metadata feature, it registers
> the metadata dynamic fields, then the PMD checks the metadata field
> availability and handles the appropriate fields in datapath.
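
The register-then-check flow described in the quoted paragraph can be
modelled without DPDK. This is only a minimal sketch under stated
assumptions: toy_mbuf, toy_dynf_register(), toy_dynf_avail(),
toy_dynf_set() and toy_dynf_get() are hypothetical stand-ins for
rte_mbuf, rte_flow_dynf_metadata_register(),
rte_flow_dynf_metadata_avail() and the set/get helpers, and the fixed
offset and flag bit are arbitrary (the real library allocates them at
registration time).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_mbuf: flags plus a dynamic area. */
struct toy_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[16];
};

/* Mirror rte_flow_dynf_metadata_offs/_mask: unset until registration. */
static int toy_dynf_offs = -1;
static uint64_t toy_dynf_mask;

/* Register field and flag; offset 4 and bit 23 are arbitrary assumptions. */
static int
toy_dynf_register(void)
{
	toy_dynf_offs = 4;
	toy_dynf_mask = 1ULL << 23;
	return 0;
}

/* Non-zero once the flag mask has been assigned by registration. */
static int
toy_dynf_avail(void)
{
	return !!toy_dynf_mask;
}

/* Store a 32-bit metadata value at the registered offset. */
static void
toy_dynf_set(struct toy_mbuf *m, uint32_t v)
{
	memcpy(&m->dyn_area[toy_dynf_offs], &v, sizeof(v));
}

/* Load the 32-bit metadata value from the registered offset. */
static uint32_t
toy_dynf_get(const struct toy_mbuf *m)
{
	uint32_t v;

	memcpy(&v, &m->dyn_area[toy_dynf_offs], sizeof(v));
	return v;
}
```

A datapath modelled this way would call toy_dynf_avail() once and skip
all metadata handling when it returns zero.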
> 
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on hardware capability.
> 
> MARK and METADATA look similar and might operate in a similar way,
> but they do not interact.
> 
> Initially, two metadata-related actions were proposed:
> 
> - RTE_FLOW_ACTION_TYPE_FLAG
> - RTE_FLOW_ACTION_TYPE_MARK
> 
> These actions set a special flag in the packet metadata; the MARK action
> additionally stores a specified value in the metadata storage. On packet
> reception, the PMD puts the flag and value into the mbuf, and applications
> can see that the packet was treated inside the flow engine according to the
> appropriate RTE flow(s). MARK and FLAG are a kind of gateway to transfer
> per-packet information from the flow engine to the application via the
> receiving datapath. There is also an item of type RTE_FLOW_ITEM_TYPE_MARK.
> It allows us to extend the flow match pattern with the capability to match
> the metadata values set by MARK/FLAG actions on other flows.
> 
> From the datapath point of view, MARK and FLAG are related to the
> receiving side only. It would be useful to have the same gateway on the
> transmitting side, so the item of type RTE_FLOW_ITEM_TYPE_META was
> proposed. The application can fill a field in the mbuf, and this value
> will be transferred to some field in the packet metadata inside the flow
> engine. It did not matter whether these metadata fields were shared,
> because the MARK and META items belonged to different domains (receiving
> and transmitting) and could be vendor-specific.
> 
> So far, so good, DPDK proposes some entities to control metadata inside
> the flow engine and gateways to exchange these values on a per-packet basis
> via datapaths.
> 
> As we can see, the MARK and META means are not symmetric: there is no
> action that would allow us to set the META value on the transmitting path.
> So, the action of type:
> 
> - RTE_FLOW_ACTION_TYPE_SET_META was proposed.
> 
> Next, applications raise new requirements for packet metadata.
> The flow engines are getting more complex, internal switches are introduced,
> and multiple ports might be supported within the same flow engine namespace.
> From the DPDK point of view, it means the packets might be sent on one
> eth_dev port and received on another one, while the packet path inside
> the flow engine entirely belongs to the same hardware device. The simplest
> example is SR-IOV with PF, VFs and the representors. And there is a
> brilliant opportunity to provide an out-of-band channel to transfer
> some extra data from one port to another, besides the packet data
> itself. And applications would like to use this opportunity.
> 
> Applications are expected to use trials (with rte_flow_validate) to detect
> which metadata features (FLAG, MARK, META) are actually supported
> by PMD and underlying hardware. It might depend on PMD configuration,
> system software, hardware settings, etc., and should be detected
> in run time.
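
The trial-based detection described in the quoted paragraph could be
structured roughly as below. This is a sketch only, with no DPDK calls:
the toy_feature enum, the toy_validate_t callback and
toy_probe_features() are hypothetical names. In a real application the
callback would build a minimal rule using the feature and pass it to
rte_flow_validate() on the port being probed.

```c
#include <stddef.h>

/* Hypothetical identifiers for the metadata features being probed. */
enum toy_feature { TOY_FLAG, TOY_MARK, TOY_META, TOY_FEATURE_MAX };

/*
 * Hypothetical trial callback: returns 0 if a rule using the feature
 * would be accepted, a negative value otherwise.
 */
typedef int (*toy_validate_t)(enum toy_feature f, void *ctx);

/* Try every feature once and return a bitmap of the accepted ones. */
static unsigned int
toy_probe_features(toy_validate_t validate, void *ctx)
{
	unsigned int supported = 0;
	int f;

	for (f = 0; f < TOY_FEATURE_MAX; f++)
		if (validate((enum toy_feature)f, ctx) == 0)
			supported |= 1u << f;
	return supported;
}

/* Example device model that accepts MARK and META but rejects FLAG. */
static int
toy_validate_no_flag(enum toy_feature f, void *ctx)
{
	(void)ctx;
	return f == TOY_FLAG ? -1 : 0;
}
```

Running the probe once at startup and caching the bitmap keeps the
trials out of the datapath.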
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
> Acked-by: Ori Kam <orika@mellanox.com>
> ---
> v7: - updated release notes in collateral patch
> 
> v6: - http://patches.dpdk.org/patch/62245/
>     - minor code style issues
>     - is combined in series with the following egress metadata patch
> 
> v5: - http://patches.dpdk.org/patch/62179/
>     - addressed code style issues from comments
>     - Tx metadata deprecation notice removed
>       (dedicated tx_metadata patch is coming)
>     - MBUF_DYNF_METADATA_NAME is split into FIELD and FLAG
>       dedicated ones, RTE suffix is added
>     - metadata historic retrospective is added to log message
>     - rebased
> 
> v4: - http://patches.dpdk.org/patch/62065/
>     - documentation comments addressed
>     - deprecation notice for Tx metadata offload flag
>     - rebased
> 
> v3: - http://patches.dpdk.org/patch/61902/
>     - rebased, neat updates
> 
> v2: - http://patches.dpdk.org/patch/60909/
> 
> v1: - http://patches.dpdk.org/patch/56104/
>     - rfc: http://patches.dpdk.org/patch/54271/
> 
>  app/test-pmd/cmdline_flow.c              |  57 ++++++++++++++++-
>  app/test-pmd/util.c                      |   5 ++
>  doc/guides/prog_guide/rte_flow.rst       |  72 ++++++++++++++++-----
>  doc/guides/rel_notes/release_19_11.rst   |  13 ++++
>  lib/librte_ethdev/rte_ethdev_version.map |   3 +
>  lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
>  lib/librte_ethdev/rte_flow.h             | 103 +++++++++++++++++++++++++++++--
>  lib/librte_mbuf/rte_mbuf_dyn.h           |   8 ++-
>  8 files changed, 279 insertions(+), 22 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index 0d0bc0a..e4ef066 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -316,6 +316,9 @@ enum index {
>  	ACTION_RAW_ENCAP_INDEX_VALUE,
>  	ACTION_RAW_DECAP_INDEX,
>  	ACTION_RAW_DECAP_INDEX_VALUE,
> +	ACTION_SET_META,
> +	ACTION_SET_META_DATA,
> +	ACTION_SET_META_MASK,
>  };
>  
>  /** Maximum size for pattern in struct rte_flow_item_raw. */
> @@ -1067,6 +1070,7 @@ struct parse_action_priv {
>  	ACTION_DEC_TCP_ACK,
>  	ACTION_RAW_ENCAP,
>  	ACTION_RAW_DECAP,
> +	ACTION_SET_META,
>  	ZERO,
>  };
>  
> @@ -1265,6 +1269,13 @@ struct parse_action_priv {
>  	ZERO,
>  };
>  
> +static const enum index action_set_meta[] = {
> +	ACTION_SET_META_DATA,
> +	ACTION_SET_META_MASK,
> +	ACTION_NEXT,
> +	ZERO,
> +};
> +
>  static int parse_set_raw_encap_decap(struct context *, const struct token *,
>  				     const char *, unsigned int,
>  				     void *, unsigned int);
> @@ -1329,6 +1340,10 @@ static int parse_vc_action_raw_encap_index(struct context *,
>  static int parse_vc_action_raw_decap_index(struct context *,
>  					   const struct token *, const char *,
>  					   unsigned int, void *, unsigned int);
> +static int parse_vc_action_set_meta(struct context *ctx,
> +				    const struct token *token, const char *str,
> +				    unsigned int len, void *buf,
> +				    unsigned int size);
>  static int parse_destroy(struct context *, const struct token *,
>  			 const char *, unsigned int,
>  			 void *, unsigned int);
> @@ -3378,7 +3393,31 @@ static int comp_set_raw_index(struct context *, const struct token *,
>  		.help = "index of raw_encap/raw_decap data",
>  		.next = NEXT(next_item),
>  		.call = parse_port,
> -	}
> +	},
> +	[ACTION_SET_META] = {
> +		.name = "set_meta",
> +		.help = "set metadata",
> +		.priv = PRIV_ACTION(SET_META,
> +			sizeof(struct rte_flow_action_set_meta)),
> +		.next = NEXT(action_set_meta),
> +		.call = parse_vc_action_set_meta,
> +	},
> +	[ACTION_SET_META_DATA] = {
> +		.name = "data",
> +		.help = "metadata value",
> +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_meta, data)),
> +		.call = parse_vc_conf,
> +	},
> +	[ACTION_SET_META_MASK] = {
> +		.name = "mask",
> +		.help = "mask for metadata value",
> +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_meta, mask)),
> +		.call = parse_vc_conf,
> +	},
>  };
>  
>  /** Remove and return last entry from argument stack. */
> @@ -4818,6 +4857,22 @@ static int comp_set_raw_index(struct context *, const struct token *,
>  	return ret;
>  }
>  
> +static int
> +parse_vc_action_set_meta(struct context *ctx, const struct token *token,
> +			 const char *str, unsigned int len, void *buf,
> +			 unsigned int size)
> +{
> +	int ret;
> +
> +	ret = parse_vc(ctx, token, str, len, buf, size);
> +	if (ret < 0)
> +		return ret;
> +	ret = rte_flow_dynf_metadata_register();
> +	if (ret < 0)
> +		return -1;
> +	return len;
> +}
> +
>  /** Parse tokens for destroy command. */
>  static int
>  parse_destroy(struct context *ctx, const struct token *token,
> diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
> index f20531d..56075b3 100644
> --- a/app/test-pmd/util.c
> +++ b/app/test-pmd/util.c
> @@ -82,6 +82,11 @@
>  			       mb->vlan_tci, mb->vlan_tci_outer);
>  		else if (ol_flags & PKT_RX_VLAN)
>  			printf(" - VLAN tci=0x%x", mb->vlan_tci);
> +		if (ol_flags & PKT_TX_METADATA)
> +			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
> +		if (ol_flags & PKT_RX_DYNF_METADATA)
> +			printf(" - Rx metadata: 0x%x",
> +			       *RTE_FLOW_DYNF_METADATA(mb));
>  		if (mb->packet_type) {
>  			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
>  			printf(" - hw ptype: %s", buf);
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 159ce19..c943aca 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
>     | ``mask`` | ``id``   | zeroed to match any value |
>     +----------+----------+---------------------------+
>  
> +Item: ``META``
> +^^^^^^^^^^^^^^^^^
> +
> +Matches 32 bit metadata item set.
> +
> +On egress, metadata can be set either by mbuf metadata field with
> +PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
> +action sets metadata for a packet and the metadata will be reported via
> +``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
> +
> +- Default ``mask`` matches the specified Rx metadata value.
> +
> +.. _table_rte_flow_item_meta:
> +
> +.. table:: META
> +
> +   +----------+----------+---------------------------------------+
> +   | Field    | Subfield | Value                                 |
> +   +==========+==========+=======================================+
> +   | ``spec`` | ``data`` | 32 bit metadata value                 |
> +   +----------+----------+---------------------------------------+
> +   | ``last`` | ``data`` | upper range value                     |
> +   +----------+----------+---------------------------------------+
> +   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> +   +----------+----------+---------------------------------------+
> +
>  Data matching item types
>  ~~~~~~~~~~~~~~~~~~~~~~~~
>  
> @@ -1232,21 +1258,6 @@ Matches a PPPoE session protocol identifier.
>  - ``proto_id``: PPP protocol identifier.
>  - Default ``mask`` matches proto_id only.
>  
> -
> -.. _table_rte_flow_item_meta:
> -
> -.. table:: META
> -
> -   +----------+----------+---------------------------------------+
> -   | Field    | Subfield | Value                                 |
> -   +==========+==========+=======================================+
> -   | ``spec`` | ``data`` | 32 bit metadata value                 |
> -   +----------+--------------------------------------------------+
> -   | ``last`` | ``data`` | upper range value                     |
> -   +----------+----------+---------------------------------------+
> -   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> -   +----------+----------+---------------------------------------+
> -
>  Item: ``NSH``
>  ^^^^^^^^^^^^^
>  
> @@ -2466,6 +2477,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
>  
>  Using this action on non-matching traffic will result in undefined behavior.
>  
> +Action: ``SET_META``
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Set metadata. Item ``META`` matches metadata.
> +
> +Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
> +overridden by this action. On ingress, the metadata will be carried by
> +``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
> +``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
> +with the data.
> +
> +The mbuf dynamic field must be registered by calling
> +``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
> +
> +Altering partial bits is supported with ``mask``. For bits which have never been
> +set, unpredictable value will be seen depending on driver implementation. For
> +loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
> +the other path depending on HW capability.
> +
> +.. _table_rte_flow_action_set_meta:
> +
> +.. table:: SET_META
> +
> +   +----------+----------------------------+
> +   | Field    | Value                      |
> +   +==========+============================+
> +   | ``data`` | 32 bit metadata value      |
> +   +----------+----------------------------+
> +   | ``mask`` | bit-mask applies to "data" |
> +   +----------+----------------------------+
> +
>  Negative types
>  ~~~~~~~~~~~~~~
>  
> diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
> index f6e90cb..963c4f8 100644
> --- a/doc/guides/rel_notes/release_19_11.rst
> +++ b/doc/guides/rel_notes/release_19_11.rst
> @@ -237,6 +237,14 @@ New Features
>    On supported NICs, we can now setup haipin queue which will offload packets
>    from the wire, backto the wire.
>  
> +* **Extended metadata support in rte_flow.**
> +
> +  Flow metadata is extended to both Rx and Tx.
> +
> +  * Tx metadata can also be set by SET_META action of rte_flow.
> +  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
> +    PKT_RX_DYNF_METADATA.
> +
>  
>  Removed Items
>  -------------
> @@ -344,6 +352,11 @@ API Changes
>    has been introduced in this release is used when used when all the packets
>    enqueued in the tx adapter are destined for the same Ethernet port & Tx queue.
>  
> +* metadata: RTE_FLOW_ITEM_TYPE_META data endianness changed to host order.
> +  Because the new dynamic metadata field in mbuf is also host-endian, there
> +  is a minor compatibility issue for applications when 32-bit values are
> +  supported.
> +
>  * sched: The pipe nodes configuration parameters such as number of pipes,
>    pipe queue sizes, pipe profiles, etc., are moved from port level structure
>    to subport level. This allows different subports of the same port to
> diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
> index 48b5389..e593f34 100644
> --- a/lib/librte_ethdev/rte_ethdev_version.map
> +++ b/lib/librte_ethdev/rte_ethdev_version.map
> @@ -291,4 +291,7 @@ EXPERIMENTAL {
>  	rte_eth_rx_hairpin_queue_setup;
>  	rte_eth_tx_hairpin_queue_setup;
>  	rte_eth_dev_hairpin_capability_get;
> +	rte_flow_dynf_metadata_offs;
> +	rte_flow_dynf_metadata_mask;
> +	rte_flow_dynf_metadata_register;
>  };
> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index ca0f680..b0490cd 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -12,10 +12,18 @@
>  #include <rte_errno.h>
>  #include <rte_branch_prediction.h>
>  #include <rte_string_fns.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>  #include "rte_ethdev.h"
>  #include "rte_flow_driver.h"
>  #include "rte_flow.h"
>  
> +/* Mbuf dynamic field offset for metadata. */
> +int rte_flow_dynf_metadata_offs = -1;
> +
> +/* Mbuf dynamic field flag mask for metadata. */
> +uint64_t rte_flow_dynf_metadata_mask;
> +
>  /**
>   * Flow elements description tables.
>   */
> @@ -157,8 +165,40 @@ struct rte_flow_desc_data {
>  	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> +	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
>  };
>  
> +int
> +rte_flow_dynf_metadata_register(void)
> +{
> +	int offset;
> +	int flag;
> +
> +	static const struct rte_mbuf_dynfield desc_offs = {
> +		.name = RTE_MBUF_DYNFIELD_METADATA_NAME,
> +		.size = sizeof(uint32_t),
> +		.align = __alignof__(uint32_t),
> +	};
> +	static const struct rte_mbuf_dynflag desc_flag = {
> +		.name = RTE_MBUF_DYNFLAG_METADATA_NAME,
> +	};
> +
> +	offset = rte_mbuf_dynfield_register(&desc_offs);
> +	if (offset < 0)
> +		goto error;
> +	flag = rte_mbuf_dynflag_register(&desc_flag);
> +	if (flag < 0)
> +		goto error;
> +	rte_flow_dynf_metadata_offs = offset;
> +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> +	return 0;
> +
> +error:
> +	rte_flow_dynf_metadata_offs = -1;
> +	rte_flow_dynf_metadata_mask = 0ULL;
> +	return -rte_errno;
> +}
> +
>  static int
>  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
>  {
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index 4fee105..f6e050c 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -28,6 +28,8 @@
>  #include <rte_byteorder.h>
>  #include <rte_esp.h>
>  #include <rte_higig.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>  
>  #ifdef __cplusplus
>  extern "C" {
> @@ -418,7 +420,8 @@ enum rte_flow_item_type {
>  	/**
>  	 * [META]
>  	 *
> -	 * Matches a metadata value specified in mbuf metadata field.
> +	 * Matches a metadata value.
> +	 *
>  	 * See struct rte_flow_item_meta.
>  	 */
>  	RTE_FLOW_ITEM_TYPE_META,
> @@ -1263,18 +1266,23 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
>  #endif
>  
>  /**
> - * RTE_FLOW_ITEM_TYPE_META.
> + * RTE_FLOW_ITEM_TYPE_META
>   *
> - * Matches a specified metadata value.
> + * Matches a specified metadata value. On egress, metadata can be set either by
> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> + * metadata for a packet and the metadata will be reported via mbuf metadata
> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
>   */
>  struct rte_flow_item_meta {
> -	rte_be32_t data;
> +	uint32_t data;
>  };
>  
>  /** Default mask for RTE_FLOW_ITEM_TYPE_META. */
>  #ifndef __cplusplus
>  static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
> -	.data = RTE_BE32(UINT32_MAX),
> +	.data = UINT32_MAX,
>  };
>  #endif
>  
> @@ -1942,6 +1950,13 @@ enum rte_flow_action_type {
>  	 * undefined behavior.
>  	 */
>  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> +
> +	/**
> +	 * Set metadata on ingress or egress path.
> +	 *
> +	 * See struct rte_flow_action_set_meta.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_SET_META,
>  };
>  
>  /**
> @@ -2429,6 +2444,57 @@ struct rte_flow_action_set_mac {
>  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
>  };
>  
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SET_META
> + *
> + * Set metadata. Metadata set by mbuf tx_metadata field with
> + * PKT_TX_METADATA flag on egress will be overridden by this action. On
> + * ingress, the metadata will be carried by mbuf metadata dynamic field
> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
> + *
> + * Altering partial bits is supported with mask. For bits which have never
> + * been set, unpredictable value will be seen depending on driver
> + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
> + * or may not be propagated to the other path depending on HW capability.
> + *
> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> + */
> +struct rte_flow_action_set_meta {
> +	uint32_t data;
> +	uint32_t mask;
> +};
> +
> +/* Mbuf dynamic field offset for metadata. */
> +extern int rte_flow_dynf_metadata_offs;
> +
> +/* Mbuf dynamic field flag mask for metadata. */
> +extern uint64_t rte_flow_dynf_metadata_mask;
> +
> +/* Mbuf dynamic field pointer for metadata. */
> +#define RTE_FLOW_DYNF_METADATA(m) \
> +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> +
> +/* Mbuf dynamic flag for metadata. */
> +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> +
> +__rte_experimental
> +static inline uint32_t
> +rte_flow_dynf_metadata_get(struct rte_mbuf *m)
> +{
> +	return *RTE_FLOW_DYNF_METADATA(m);
> +}
> +
> +__rte_experimental
> +static inline void
> +rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
> +{
> +	*RTE_FLOW_DYNF_METADATA(m) = v;
> +}
> +
>  /*
>   * Definition of a single action.
>   *
> @@ -2662,6 +2728,33 @@ enum rte_flow_conv_op {
>  };
>  
>  /**
> + * Check if mbuf dynamic field for metadata is registered.
> + *
> + * @return
> + *   True if registered, false otherwise.
> + */
> +__rte_experimental
> +static inline int
> +rte_flow_dynf_metadata_avail(void)
> +{
> +	return !!rte_flow_dynf_metadata_mask;
> +}
> +
> +/**
> + * Register mbuf dynamic field and flag for metadata.
> + *
> + * This function must be called prior to use SET_META action in order to
> + * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
> + * application.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_dynf_metadata_register(void);
> +
> +/**
>   * Check whether a flow rule can be created on a given port.
>   *
>   * The flow rule is validated for correctness and whether it could be accepted
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 2e9d418..de651c1 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -234,6 +234,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
>  __rte_experimental
>  void rte_mbuf_dyn_dump(FILE *out);
>  
> -/* Placeholder for dynamic fields and flags declarations. */
> +/*
> + * Placeholder for dynamic fields and flags declarations.
> + * This is centralizing point to gather all field names
> + * and parameters together.
> + */
> +#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
> +#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
>  

Sorry if it was not clear in my first comment, but I think we should
have some words in this file about what these field and flag are.

After that:
Acked-by: Olivier Matz <olivier.matz@6wind.com>


* Re: [dpdk-dev] [PATCH v7 2/2] ethdev: move egress metadata to dynamic field
  2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 " Viacheslav Ovsiienko
  2019-10-31 13:33                   ` Ori Kam
@ 2019-10-31 15:51                   ` Olivier Matz
  2019-10-31 16:07                     ` Slava Ovsiienko
  1 sibling, 1 reply; 98+ messages in thread
From: Olivier Matz @ 2019-10-31 15:51 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, thomas, arybchenko, orika

Hi,

On Thu, Oct 31, 2019 at 01:05:21PM +0000, Viacheslav Ovsiienko wrote:
> The dynamic mbuf fields were introduced by [1]. The egress metadata is
> good candidate to be moved from statically allocated field tx_metadata to
> dynamic one. Because mbufs are used in half-duplex fashion only, it is
> safe to share this dynamic field with ingress metadata.
> 
> The shared dynamic field contains either egress (if application going to
> transmit mbuf with tx_burst) or ingress (if mbuf is received with rx_burst)
> metadata and can be accessed by RTE_FLOW_DYNF_METADATA() macro or with
> rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> routines. PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set
> along with the data.
> 
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior accessing the data.
> 
> The availability of dynamic mbuf metadata field can be checked with
> rte_flow_dynf_metadata_avail() routine.
> 
> DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed.
> The metadata support in PMDs is engaged on dynamic field registration.
> 
> Metadata feature is getting complex. We might have some set of actions
> and items that might be supported by PMDs in multiple combinations,
> the supported values and masks are subject to query by performing
> trials (with rte_flow_validate).
> 
> [1] http://patches.dpdk.org/patch/62040/
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> 
> v7: - updates release notes
> v6: - http://patches.dpdk.org/patch/62244/
> 
>  app/test-pmd/cmdline.c                 |  3 ++-
>  app/test-pmd/testpmd.c                 |  4 ----
>  app/test-pmd/testpmd.h                 |  2 +-
>  app/test-pmd/util.c                    | 15 +++++++++------
>  app/test/test_mbuf.c                   |  1 -
>  doc/guides/prog_guide/rte_flow.rst     |  6 +++---
>  doc/guides/rel_notes/release_19_11.rst |  5 +++++
>  drivers/net/mlx5/mlx5_flow_dv.c        | 19 ++++++-------------
>  drivers/net/mlx5/mlx5_rxtx.c           | 22 +++++++++++-----------
>  drivers/net/mlx5/mlx5_rxtx_vec.h       |  6 ------
>  drivers/net/mlx5/mlx5_txq.c            |  4 ----
>  lib/librte_ethdev/rte_ethdev.c         |  1 -
>  lib/librte_ethdev/rte_ethdev.h         |  5 -----
>  lib/librte_ethdev/rte_flow.h           | 19 ++++++++++---------
>  lib/librte_mbuf/rte_mbuf.c             |  2 --
>  lib/librte_mbuf/rte_mbuf_core.h        | 19 +------------------
>  16 files changed, 48 insertions(+), 85 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 4478069..49c45a3 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -18718,12 +18718,13 @@ struct cmd_config_tx_metadata_specific_result {
>  
>  	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
>  		return;
> -	ports[res->port_id].tx_metadata = rte_cpu_to_be_32(res->value);
> +	ports[res->port_id].tx_metadata = res->value;
>  	/* Add/remove callback to insert valid metadata in every Tx packet. */
>  	if (ports[res->port_id].tx_metadata)
>  		add_tx_md_callback(res->port_id);
>  	else
>  		remove_tx_md_callback(res->port_id);
> +	rte_flow_dynf_metadata_register();
>  }
>  
>  cmdline_parse_token_string_t cmd_config_tx_metadata_specific_port =
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 0fc5b45..206c12b 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -1167,10 +1167,6 @@ struct extmem_param {
>  		      DEV_TX_OFFLOAD_MBUF_FAST_FREE))
>  			port->dev_conf.txmode.offloads &=
>  				~DEV_TX_OFFLOAD_MBUF_FAST_FREE;
> -		if (!(port->dev_info.tx_offload_capa &
> -			DEV_TX_OFFLOAD_MATCH_METADATA))
> -			port->dev_conf.txmode.offloads &=
> -				~DEV_TX_OFFLOAD_MATCH_METADATA;
>  		if (numa_support) {
>  			if (port_numa[pid] != NUMA_NO_CONFIG)
>  				port_per_socket[port_numa[pid]]++;
> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> index 8da1e8e..caabf32 100644
> --- a/app/test-pmd/testpmd.h
> +++ b/app/test-pmd/testpmd.h
> @@ -193,7 +193,7 @@ struct rte_port {
>  	struct softnic_port     softport;  /**< softnic params */
>  #endif
>  	/**< metadata value to insert in Tx packets. */
> -	rte_be32_t		tx_metadata;
> +	uint32_t		tx_metadata;
>  	const struct rte_eth_rxtx_callback *tx_set_md_cb[MAX_QUEUE_ID+1];
>  };
>  
> diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
> index 56075b3..cf41864 100644
> --- a/app/test-pmd/util.c
> +++ b/app/test-pmd/util.c
> @@ -82,8 +82,9 @@
>  			       mb->vlan_tci, mb->vlan_tci_outer);
>  		else if (ol_flags & PKT_RX_VLAN)
>  			printf(" - VLAN tci=0x%x", mb->vlan_tci);
> -		if (ol_flags & PKT_TX_METADATA)
> -			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
> +		if (ol_flags & PKT_TX_DYNF_METADATA)
> +			printf(" - Tx metadata: 0x%x",
> +			       *RTE_FLOW_DYNF_METADATA(mb));
>  		if (ol_flags & PKT_RX_DYNF_METADATA)
>  			printf(" - Rx metadata: 0x%x",
>  			       *RTE_FLOW_DYNF_METADATA(mb));
> @@ -188,10 +189,12 @@
>  	 * Add metadata value to every Tx packet,
>  	 * and set ol_flags accordingly.
>  	 */
> -	for (i = 0; i < nb_pkts; i++) {
> -		pkts[i]->tx_metadata = ports[port_id].tx_metadata;
> -		pkts[i]->ol_flags |= PKT_TX_METADATA;
> -	}
> +	if (rte_flow_dynf_metadata_avail())
> +		for (i = 0; i < nb_pkts; i++) {
> +			*RTE_FLOW_DYNF_METADATA(pkts[i]) =
> +						ports[port_id].tx_metadata;
> +			pkts[i]->ol_flags |= PKT_TX_DYNF_METADATA;
> +		}
>  	return nb_pkts;
>  }
>  
> diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
> index 854bc26..61ecffc 100644
> --- a/app/test/test_mbuf.c
> +++ b/app/test/test_mbuf.c
> @@ -1669,7 +1669,6 @@ struct flag_name {
>  		VAL_NAME(PKT_TX_SEC_OFFLOAD),
>  		VAL_NAME(PKT_TX_UDP_SEG),
>  		VAL_NAME(PKT_TX_OUTER_UDP_CKSUM),
> -		VAL_NAME(PKT_TX_METADATA),
>  	};
>  
>  	/* Test case to check with valid flag */
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index c943aca..630e4c0 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -664,7 +664,7 @@ Item: ``META``
>  Matches 32 bit metadata item set.
>  
>  On egress, metadata can be set either by mbuf metadata field with
> -PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
> +PKT_TX_DYNF_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
>  action sets metadata for a packet and the metadata will be reported via
>  ``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
>  
> @@ -2482,8 +2482,8 @@ Action: ``SET_META``
>  
>  Set metadata. Item ``META`` matches metadata.
>  
> -Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
> -overridden by this action. On ingress, the metadata will be carried by
> +Metadata set by mbuf metadata field with PKT_TX_DYNF_METADATA flag on egress
> +will be overridden by this action. On ingress, the metadata will be carried by
>  ``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
>  ``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
>  with the data.
> diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
> index 963c4f8..2e9a596 100644
> --- a/doc/guides/rel_notes/release_19_11.rst
> +++ b/doc/guides/rel_notes/release_19_11.rst
> @@ -357,6 +357,11 @@ API Changes
>    is the minor compatibility issue for applications in case of 32-bit values
>    supported.
>  
> +* metadata: the tx_metadata mbuf field is moved to a dynamic one.
> +  PKT_TX_METADATA flag is replaced with PKT_TX_DYNF_METADATA.
> +  DEV_TX_OFFLOAD_MATCH_METADATA offload flag is removed, now metadata
> +  support in PMD is engaged on dynamic field registration.
> +
>  * sched: The pipe nodes configuration parameters such as number of pipes,
>    pipe queue sizes, pipe profiles, etc., are moved from port level structure
>    to subport level. This allows different subports of the same port to
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> index d9a7fd4..f961bff 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -793,7 +793,7 @@ struct field_modify_info modify_tcp[] = {
>   *   0 on success, a negative errno value otherwise and rte_errno is set.
>   */
>  static int
> -flow_dv_validate_item_meta(struct rte_eth_dev *dev,
> +flow_dv_validate_item_meta(struct rte_eth_dev *dev __rte_unused,
>  			   const struct rte_flow_item *item,
>  			   const struct rte_flow_attr *attr,
>  			   struct rte_flow_error *error)
> @@ -801,17 +801,10 @@ struct field_modify_info modify_tcp[] = {
>  	const struct rte_flow_item_meta *spec = item->spec;
>  	const struct rte_flow_item_meta *mask = item->mask;
>  	const struct rte_flow_item_meta nic_mask = {
> -		.data = RTE_BE32(UINT32_MAX)
> +		.data = UINT32_MAX
>  	};
>  	int ret;
> -	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
>  
> -	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
> -		return rte_flow_error_set(error, EPERM,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  NULL,
> -					  "match on metadata offload "
> -					  "configuration is off for this port");
>  	if (!spec)
>  		return rte_flow_error_set(error, EINVAL,
>  					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> @@ -4750,10 +4743,10 @@ struct field_modify_info modify_tcp[] = {
>  		meta_m = &rte_flow_item_meta_mask;
>  	meta_v = (const void *)item->spec;
>  	if (meta_v) {
> -		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
> -			 rte_be_to_cpu_32(meta_m->data));
> -		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
> -			 rte_be_to_cpu_32(meta_v->data & meta_m->data));
> +		MLX5_SET(fte_match_set_misc2, misc2_m,
> +			 metadata_reg_a, meta_m->data);
> +		MLX5_SET(fte_match_set_misc2, misc2_v,
> +			 metadata_reg_a, meta_v->data & meta_m->data);
>  	}
>  }
>  
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index f597c89..88a4378 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -2281,8 +2281,8 @@ enum mlx5_txcmp_code {
>  	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
>  	/* Fill metadata field if needed. */
>  	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
> -		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
> -		       loc->mbuf->tx_metadata : 0 : 0;
> +		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
> +		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
>  	/* Engage VLAN tag insertion feature if requested. */
>  	if (MLX5_TXOFF_CONFIG(VLAN) &&
>  	    loc->mbuf->ol_flags & PKT_TX_VLAN_PKT) {
> @@ -2341,8 +2341,8 @@ enum mlx5_txcmp_code {
>  	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
>  	/* Fill metadata field if needed. */
>  	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
> -		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
> -		       loc->mbuf->tx_metadata : 0 : 0;
> +		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
> +		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
>  	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
>  				(sizeof(uint16_t) +
>  				 sizeof(rte_v128u32_t)),
> @@ -2434,8 +2434,8 @@ enum mlx5_txcmp_code {
>  	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
>  	/* Fill metadata field if needed. */
>  	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
> -		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
> -		       loc->mbuf->tx_metadata : 0 : 0;
> +		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
> +		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
>  	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
>  				(sizeof(uint16_t) +
>  				 sizeof(rte_v128u32_t)),
> @@ -2628,8 +2628,8 @@ enum mlx5_txcmp_code {
>  	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
>  	/* Fill metadata field if needed. */
>  	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
> -		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
> -		       loc->mbuf->tx_metadata : 0 : 0;
> +		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
> +		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
>  	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
>  				(sizeof(uint16_t) +
>  				 sizeof(rte_v128u32_t)),
> @@ -3700,8 +3700,8 @@ enum mlx5_txcmp_code {
>  		return false;
>  	/* Fill metadata field if needed. */
>  	if (MLX5_TXOFF_CONFIG(METADATA) &&
> -		es->metadata != (loc->mbuf->ol_flags & PKT_TX_METADATA ?
> -				 loc->mbuf->tx_metadata : 0))
> +		es->metadata != (loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
> +				 *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0))
>  		return false;
>  	/* There must be no VLAN packets in eMPW loop. */
>  	if (MLX5_TXOFF_CONFIG(VLAN))
> @@ -5149,7 +5149,7 @@ enum mlx5_txcmp_code {
>  		 */
>  		olx |= MLX5_TXOFF_CONFIG_EMPW;
>  	}
> -	if (tx_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
> +	if (rte_flow_dynf_metadata_avail()) {
>  		/* We should support Flow metadata. */
>  		olx |= MLX5_TXOFF_CONFIG_METADATA;
>  	}
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
> index b54ff72..85e0bd5 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> @@ -19,12 +19,6 @@
>  	 DEV_TX_OFFLOAD_TCP_CKSUM | \
>  	 DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM)
>  
> -/* HW offload capabilities of vectorized Tx. */
> -#define MLX5_VEC_TX_OFFLOAD_CAP \
> -	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
> -	 DEV_TX_OFFLOAD_MATCH_METADATA | \
> -	 DEV_TX_OFFLOAD_MULTI_SEGS)
> -
>  /*
>   * Compile time sanity check for vectorized functions.
>   */
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index dfc379c..97991f0 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -128,10 +128,6 @@
>  			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
>  				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
>  	}
> -#ifdef HAVE_IBV_FLOW_DV_SUPPORT
> -	if (config->dv_flow_en)
> -		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
> -#endif
>  	return offloads;
>  }
>  
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 68aca1f..23b751f 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -161,7 +161,6 @@ struct rte_eth_xstats_name_off {
>  	RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
>  	RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
>  	RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> -	RTE_TX_OFFLOAD_BIT2STR(MATCH_METADATA),
>  };
>  
>  #undef RTE_TX_OFFLOAD_BIT2STR
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index 9b69255..28e29c7 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1145,11 +1145,6 @@ struct rte_eth_conf {
>  #define DEV_TX_OFFLOAD_IP_TNL_TSO       0x00080000
>  /** Device supports outer UDP checksum */
>  #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
> -/**
> - * Device supports match on metadata Tx offload..
> - * Application must set PKT_TX_METADATA and mbuf metadata field.
> - */
> -#define DEV_TX_OFFLOAD_MATCH_METADATA   0x00200000
>  
>  #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
>  /**< Device supports Rx queue setup after device started*/
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index f6e050c..51d8292 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -1268,12 +1268,12 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
>  /**
>   * RTE_FLOW_ITEM_TYPE_META
>   *
> - * Matches a specified metadata value. On egress, metadata can be set either by
> - * mbuf tx_metadata field with PKT_TX_METADATA flag or
> - * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> - * metadata for a packet and the metadata will be reported via mbuf metadata
> - * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> - * registered in advance by rte_flow_dynf_metadata_register().
> + * Matches a specified metadata value. On egress, metadata can be set
> + * either by mbuf dynamic metadata field with PKT_TX_DYNF_METADATA flag or
> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META
> + * sets metadata for a packet and the metadata will be reported via mbuf
> + * metadata dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
> + * field must be registered in advance by rte_flow_dynf_metadata_register().
>   */
>  struct rte_flow_item_meta {
>  	uint32_t data;
> @@ -2450,8 +2450,8 @@ struct rte_flow_action_set_mac {
>   *
>   * RTE_FLOW_ACTION_TYPE_SET_META
>   *
> - * Set metadata. Metadata set by mbuf tx_metadata field with
> - * PKT_TX_METADATA flag on egress will be overridden by this action. On
> + * Set metadata. Metadata set by mbuf metadata dynamic field with
> + * PKT_TX_DYNF_METADATA flag on egress will be overridden by this action. On
>   * ingress, the metadata will be carried by mbuf metadata dynamic field
>   * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
>   * registered in advance by rte_flow_dynf_metadata_register().
> @@ -2478,8 +2478,9 @@ struct rte_flow_action_set_meta {
>  #define RTE_FLOW_DYNF_METADATA(m) \
>  	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
>  
> -/* Mbuf dynamic flag for metadata. */
> +/* Mbuf dynamic flags for metadata. */
>  #define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> +#define PKT_TX_DYNF_METADATA (rte_flow_dynf_metadata_mask)

Should we have 2 defines pointing to the same mask? Shall we use
PKT_DYNF_METADATA for both?


>  
>  __rte_experimental
>  static inline uint32_t
> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> index 8c51dc1..35df1c4 100644
> --- a/lib/librte_mbuf/rte_mbuf.c
> +++ b/lib/librte_mbuf/rte_mbuf.c
> @@ -670,7 +670,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
>  	case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
>  	case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
>  	case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
> -	case PKT_TX_METADATA: return "PKT_TX_METADATA";
>  	default: return NULL;
>  	}
>  }
> @@ -707,7 +706,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
>  		{ PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
>  		{ PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
>  		{ PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
> -		{ PKT_TX_METADATA, PKT_TX_METADATA, NULL },
>  	};
>  	const char *name;
>  	unsigned int i;
> diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> index 3022701..edfc7e9 100644
> --- a/lib/librte_mbuf/rte_mbuf_core.h
> +++ b/lib/librte_mbuf/rte_mbuf_core.h
> @@ -192,11 +192,6 @@
>  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
>  
>  /**
> - * Indicate that the metadata field in the mbuf is in use.
> - */
> -#define PKT_TX_METADATA	(1ULL << 40)
> -
> -/**

You should also update PKT_LAST_FREE just above.


>   * Outer UDP checksum offload flag. This flag is used for enabling
>   * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
>   * 1) Enable the following in mbuf,
> @@ -389,8 +384,7 @@
>  		PKT_TX_MACSEC |		 \
>  		PKT_TX_SEC_OFFLOAD |	 \
>  		PKT_TX_UDP_SEG |	 \
> -		PKT_TX_OUTER_UDP_CKSUM | \
> -		PKT_TX_METADATA)
> +		PKT_TX_OUTER_UDP_CKSUM)
>  
>  /**
>   * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
> @@ -601,17 +595,6 @@ struct rte_mbuf {
>  			/**< User defined tags. See rte_distributor_process() */
>  			uint32_t usr;
>  		} hash;                   /**< hash information */
> -		struct {
> -			/**
> -			 * Application specific metadata value
> -			 * for egress flow rule match.
> -			 * Valid if PKT_TX_METADATA is set.
> -			 * Located here to allow conjunct use
> -			 * with hash.sched.hi.
> -			 */
> -			uint32_t tx_metadata;
> -			uint32_t reserved;
> -		};
>  	};
>  
>  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v7 2/2] ethdev: move egress metadata to dynamic field
  2019-10-31 15:51                   ` Olivier Matz
@ 2019-10-31 16:07                     ` Slava Ovsiienko
  0 siblings, 0 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-31 16:07 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, Matan Azrad, Raslan Darawsheh, Thomas Monjalon, arybchenko, Ori Kam

Hi, 

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Thursday, October 31, 2019 17:51
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> Darawsheh <rasland@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; arybchenko@solarflare.com; Ori Kam
> <orika@mellanox.com>
> Subject: Re: [PATCH v7 2/2] ethdev: move egress metadata to dynamic field
> 
> Hi,
> 
> On Thu, Oct 31, 2019 at 01:05:21PM +0000, Viacheslav Ovsiienko wrote:
> > The dynamic mbuf fields were introduced by [1]. The egress metadata is
> > good candidate to be moved from statically allocated field tx_metadata
> > to dynamic one. Because mbufs are used in half-duplex fashion only, it
> > is safe to share this dynamic field with ingress metadata.
> >
> > The shared dynamic field contains either egress (if application going
> > to transmit mbuf with tx_burst) or ingress (if mbuf is received with
> > rx_burst) metadata and can be accessed by RTE_FLOW_DYNF_METADATA()
> > macro or with
> > rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> > routines. PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set
> > along with the data.
> >
> > The mbuf dynamic field must be registered by calling
> > rte_flow_dynf_metadata_register() prior accessing the data.
> >
> > The availability of dynamic mbuf metadata field can be checked with
> > rte_flow_dynf_metadata_avail() routine.
> >
> > DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed.
> > The metadata support in PMDs is engaged on dynamic field registration.
> >
> > Metadata feature is getting complex. We might have some set of actions
> > and items that might be supported by PMDs in multiple combinations,
> > the supported values and masks are subject to query by performing
> > trials (with rte_flow_validate).
> >

[snip]

> > +/* Mbuf dynamic flags for metadata. */
> >  #define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> > +#define PKT_TX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> 
> Should we have 2 defines pointing to the same mask? Shall we use
> PKT_DYNF_METADATA for both?
It is just a style issue: we have two sets of flags, PKT_RX_xxx and PKT_TX_xxx,
and it looks nicer to use the matching flag in the Rx and Tx parts of the datapath.
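
To make the point concrete, here is a rough self-contained sketch of two
direction-specific names sharing one runtime mask, loosely following the
testpmd Tx callback in the quoted util.c hunk. All demo_*/DEMO_* names and the
plain dyn_metadata member are hypothetical simplifications, not the DPDK API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mbuf model; the dynamic field is a plain member here. */
struct demo_mbuf {
	uint64_t ol_flags;
	uint32_t dyn_metadata;
};

/* One runtime mask, zero until the flag has been registered. */
static uint64_t demo_dynf_metadata_mask;

/* Two names for the same bit, one per datapath direction. */
#define DEMO_PKT_RX_DYNF_METADATA (demo_dynf_metadata_mask)
#define DEMO_PKT_TX_DYNF_METADATA (demo_dynf_metadata_mask)

/* Tx side: stamp every packet, but only when the flag is registered. */
static uint16_t demo_tx_set_md(struct demo_mbuf **pkts, uint16_t nb_pkts,
			       uint32_t value)
{
	uint16_t i;

	assert(pkts != NULL || nb_pkts == 0);
	if (!demo_dynf_metadata_mask)
		return nb_pkts;
	for (i = 0; i < nb_pkts; i++) {
		pkts[i]->dyn_metadata = value;
		pkts[i]->ol_flags |= DEMO_PKT_TX_DYNF_METADATA;
	}
	return nb_pkts;
}

/* Rx side: returns 1 and stores the value when the flag is set. */
static int demo_rx_get_md(const struct demo_mbuf *m, uint32_t *value)
{
	if (!(m->ol_flags & DEMO_PKT_RX_DYNF_METADATA))
		return 0;
	*value = m->dyn_metadata;
	return 1;
}
```

Because both macros expand to the same variable, the choice between them only
documents intent at the call site; it does not change behaviour.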

> >
> >  __rte_experimental
> >  static inline uint32_t
> > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > index 8c51dc1..35df1c4 100644
> > --- a/lib/librte_mbuf/rte_mbuf.c
> > +++ b/lib/librte_mbuf/rte_mbuf.c
> > @@ -670,7 +670,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> >  	case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
> >  	case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
> >  	case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
> > -	case PKT_TX_METADATA: return "PKT_TX_METADATA";
> >  	default: return NULL;
> >  	}
> >  }
> > @@ -707,7 +706,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> >  		{ PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
> >  		{ PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
> >  		{ PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
> > -		{ PKT_TX_METADATA, PKT_TX_METADATA, NULL },
> >  	};
> >  	const char *name;
> >  	unsigned int i;
> > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > index 3022701..edfc7e9 100644
> > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > @@ -192,11 +192,6 @@
> >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> >
> >  /**
> > - * Indicate that the metadata field in the mbuf is in use.
> > - */
> > -#define PKT_TX_METADATA	(1ULL << 40)
> > -
> > -/**
> 
> You should also update PKT_LAST_FREE just above.
> 
Yes, my bad, will fix.
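
A small sketch of why the bound matters: the free range tells the dynamic flag
allocator which ol_flags bits it may hand out, so a retired static flag only
becomes reusable once the bound is moved. The bit 40 comes from the quoted
PKT_TX_METADATA definition; the first free bit (23), the previous bound (39)
and all demo_*/DEMO_* names are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Mask of all bits in [first_bit, last_bit], both ends inclusive. */
static uint64_t demo_free_flags(unsigned int first_bit, unsigned int last_bit)
{
	uint64_t upto_last;
	uint64_t below_first;

	assert(first_bit <= last_bit && last_bit < 64);
	upto_last = (last_bit == 63) ? UINT64_MAX :
		    ((1ULL << (last_bit + 1)) - 1);
	below_first = (1ULL << first_bit) - 1;
	return upto_last & ~below_first;
}

/* Assumed lower bound of the free range (hypothetical value). */
#define DEMO_FIRST_FREE_BIT 23u
/* Bit formerly used by PKT_TX_METADATA, per the quoted hunk. */
#define DEMO_RETIRED_TX_FLAG_BIT 40u

/* Is 'bit' available to dynamic flags for a given upper bound? */
static int demo_bit_is_free(unsigned int bit, unsigned int last_free_bit)
{
	assert(bit < 64);
	return !!(demo_free_flags(DEMO_FIRST_FREE_BIT, last_free_bit) &
		  (1ULL << bit));
}
```

With the assumed old bound, demo_bit_is_free(DEMO_RETIRED_TX_FLAG_BIT, 39)
is 0; after moving the bound to 40 it becomes 1, which is the effect the
requested PKT_LAST_FREE update would have.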

[snip] 

With best regards, Slava

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v7 1/2] ethdev: extend flow metadata
  2019-10-31 15:47                   ` Olivier Matz
@ 2019-10-31 16:13                     ` Slava Ovsiienko
  0 siblings, 0 replies; 98+ messages in thread
From: Slava Ovsiienko @ 2019-10-31 16:13 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, Matan Azrad, Raslan Darawsheh, Thomas Monjalon, arybchenko,
	Ori Kam, Yongseok Koh

Hi Olivier,

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Thursday, October 31, 2019 17:47
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> Darawsheh <rasland@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; arybchenko@solarflare.com; Ori Kam
> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> Subject: Re: [PATCH v7 1/2] ethdev: extend flow metadata
> 
> Hi Slava,
> 
> One comment at the end.
> 
> On Thu, Oct 31, 2019 at 01:05:20PM +0000, Viacheslav Ovsiienko wrote:
> > Currently, metadata can be set on egress path via mbuf tx_metadata
> > field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
> > matches metadata.
> >
> > This patch extends the metadata feature usability.
> >
> > 1) RTE_FLOW_ACTION_TYPE_SET_META
> >
> > When supporting multiple tables, Tx metadata can also be set by a rule
> > and matched by another rule. This new action allows metadata to be set
> > as a result of flow match.
> >
> > 2) Metadata on ingress
> >
> > There's also need to support metadata on ingress. Metadata can be set
> > by SET_META action and matched by META item like Tx. The final value
> > set by the action will be delivered to application via metadata
> > dynamic field of mbuf which can be accessed by
> > RTE_FLOW_DYNF_METADATA() macro or with
> > rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> > routines. PKT_RX_DYNF_METADATA flag will be set along with the data.
> >
> > The mbuf dynamic field must be registered by calling
> > rte_flow_dynf_metadata_register() prior to use SET_META action.
> >
> > The availability of dynamic mbuf metadata field can be checked with
> > rte_flow_dynf_metadata_avail() routine.
> >
> > If application is going to engage the metadata feature it registers
> > the metadata  dynamic fields, then PMD checks the metadata field
> > availability and handles the appropriate fields in datapath.
> >
> > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > propagated to the other path depending on hardware capability.
> >
> > MARK and METADATA look similar and might operate in similar way, but
> > not interacting.
> >
> > Initially, there were proposed two metadata related actions:
> >
> > - RTE_FLOW_ACTION_TYPE_FLAG
> > - RTE_FLOW_ACTION_TYPE_MARK
> >
> > These actions set the special flag in the packet metadata, MARK action
> > stores some specified value in the metadata storage, and, on the
> > packet receiving PMD puts the flag and value to the mbuf and
> > applications can see the packet was treated inside flow engine
> > according to the appropriate RTE flow(s). MARK and FLAG are like some
> > kind of gateway to transfer some per-packet information from the flow
> > engine to the application via receiving datapath. Also, there is the
> > item of type RTE_FLOW_ITEM_TYPE_MARK provided. It allows us to extend
> > the flow match pattern with the capability to match the metadata values
> > set by MARK/FLAG actions on other flows.
> >
> > From the datapath point of view, the MARK and FLAG are related to the
> > receiving side only. It would be useful to have the same gateway on the
> > transmitting side, so the feature of type
> > RTE_FLOW_ITEM_TYPE_META was proposed. The application can fill the
> > field in mbuf and this value will be transferred to some field in the
> > packet metadata inside the flow engine. It did not matter whether
> > these metadata fields are shared because of MARK and META items
> > belonged to different domains (receiving and
> > transmitting) and could be vendor-specific.
> >
> > So far, so good, DPDK proposes some entities to control metadata
> > inside the flow engine and gateways to exchange these values on a
> > per-packet basis via datapaths.
> >
> > As we can see, the MARK and META means are not symmetric, there is
> > absent action which would allow us to set META value on the transmitting
> > path.
> > So, the action of type:
> >
> > - RTE_FLOW_ACTION_TYPE_SET_META was proposed.
> >
> > The next, applications raise the new requirements for packet metadata.
> > The flow engines are getting more complex, internal switches are
> > introduced, multiple ports might be supported within the same flow engine
> > namespace.
> > From the DPDK points of view, it means the packets might be sent on
> > one eth_dev port and received on the other one, and the packet path
> > inside the flow engine entirely belongs to the same hardware device.
> > The simplest example is SR-IOV with PF, VFs and the representors. And
> > there is a brilliant opportunity to provide some out-of-band channel
> > to transfer some extra data from one port to another one, besides the
> > packet data itself. And applications would like to use this opportunity.
> >
> > It is supposed for application to use trials (with rte_flow_validate)
> > to detect which metadata features (FLAG, MARK, META) actually
> > supported by PMD and underlying hardware. It might depend on PMD
> > configuration, system software, hardware settings, etc., and should be
> > detected in run time.
> >
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
> > Acked-by: Ori Kam <orika@mellanox.com>
> > ---
> > v7: - updated release notes in collateral patch
> >
> > v6: - http://patches.dpdk.org/patch/62245/
> >     - minor code style issues
> >     - is combined in series with followed egress metadata patch
> >
> > v5: - http://patches.dpdk.org/patch/62179/
> >     - addressed code style issues from comments
> >     - Tx metadata deprecation notice removed
> >       (dedicated tx_metadata patch is coming)
> >     - MBUF_DYNF_METADATA_NAME is splitted into FIELD and FLAG
> >       dedicated ones, RTE suffix is added
> >     - metadata historic retrospective is added to log message
> >     - rebased
> >
> > v4: - http://patches.dpdk.org/patch/62065/
> >     - documentation comments addressed
> >     - deprecation notice for Tx metadata offload flag
> >     - rebased
> >
> > v3: - http://patches.dpdk.org/patch/61902/
> >     - rebased, neat updates
> >
> > v2: - http://patches.dpdk.org/patch/60909/
> >
> > v1: - http://patches.dpdk.org/patch/56104/
> >     - rfc: http://patches.dpdk.org/patch/54271/
> >
> >  app/test-pmd/cmdline_flow.c              |  57 ++++++++++++++++-
> >  app/test-pmd/util.c                      |   5 ++
> >  doc/guides/prog_guide/rte_flow.rst       |  72 ++++++++++++++++-----
> >  doc/guides/rel_notes/release_19_11.rst   |  13 ++++
> >  lib/librte_ethdev/rte_ethdev_version.map |   3 +
> >  lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
> >  lib/librte_ethdev/rte_flow.h             | 103 +++++++++++++++++++++++++++++--
> >  lib/librte_mbuf/rte_mbuf_dyn.h           |   8 ++-
> >  8 files changed, 279 insertions(+), 22 deletions(-)
> >
> > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> > index 0d0bc0a..e4ef066 100644
> > --- a/app/test-pmd/cmdline_flow.c
> > +++ b/app/test-pmd/cmdline_flow.c
> > @@ -316,6 +316,9 @@ enum index {
> >  	ACTION_RAW_ENCAP_INDEX_VALUE,
> >  	ACTION_RAW_DECAP_INDEX,
> >  	ACTION_RAW_DECAP_INDEX_VALUE,
> > +	ACTION_SET_META,
> > +	ACTION_SET_META_DATA,
> > +	ACTION_SET_META_MASK,
> >  };
> >
> >  /** Maximum size for pattern in struct rte_flow_item_raw. */
> > @@ -1067,6 +1070,7 @@ struct parse_action_priv {
> >  	ACTION_DEC_TCP_ACK,
> >  	ACTION_RAW_ENCAP,
> >  	ACTION_RAW_DECAP,
> > +	ACTION_SET_META,
> >  	ZERO,
> >  };
> >
> > @@ -1265,6 +1269,13 @@ struct parse_action_priv {
> >  	ZERO,
> >  };
> >
> > +static const enum index action_set_meta[] = {
> > +	ACTION_SET_META_DATA,
> > +	ACTION_SET_META_MASK,
> > +	ACTION_NEXT,
> > +	ZERO,
> > +};
> > +
> >  static int parse_set_raw_encap_decap(struct context *, const struct token *,
> >  				     const char *, unsigned int,
> >  				     void *, unsigned int);
> > @@ -1329,6 +1340,10 @@ static int parse_vc_action_raw_encap_index(struct context *,
> >  static int parse_vc_action_raw_decap_index(struct context *,
> >  					   const struct token *, const char *,
> >  					   unsigned int, void *, unsigned int);
> > +static int parse_vc_action_set_meta(struct context *ctx,
> > +				    const struct token *token, const char *str,
> > +				    unsigned int len, void *buf,
> > +				    unsigned int size);
> >  static int parse_destroy(struct context *, const struct token *,
> >  			 const char *, unsigned int,
> >  			 void *, unsigned int);
> > @@ -3378,7 +3393,31 @@ static int comp_set_raw_index(struct context *, const struct token *,
> >  		.help = "index of raw_encap/raw_decap data",
> >  		.next = NEXT(next_item),
> >  		.call = parse_port,
> > -	}
> > +	},
> > +	[ACTION_SET_META] = {
> > +		.name = "set_meta",
> > +		.help = "set metadata",
> > +		.priv = PRIV_ACTION(SET_META,
> > +			sizeof(struct rte_flow_action_set_meta)),
> > +		.next = NEXT(action_set_meta),
> > +		.call = parse_vc_action_set_meta,
> > +	},
> > +	[ACTION_SET_META_DATA] = {
> > +		.name = "data",
> > +		.help = "metadata value",
> > +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> > +		.args = ARGS(ARGS_ENTRY_HTON
> > +			     (struct rte_flow_action_set_meta, data)),
> > +		.call = parse_vc_conf,
> > +	},
> > +	[ACTION_SET_META_MASK] = {
> > +		.name = "mask",
> > +		.help = "mask for metadata value",
> > +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> > +		.args = ARGS(ARGS_ENTRY_HTON
> > +			     (struct rte_flow_action_set_meta, mask)),
> > +		.call = parse_vc_conf,
> > +	},
> >  };
> >
> >  /** Remove and return last entry from argument stack. */
> > @@ -4818,6 +4857,22 @@ static int comp_set_raw_index(struct context *, const struct token *,
> >  	return ret;
> >  }
> >
> > +static int
> > +parse_vc_action_set_meta(struct context *ctx, const struct token *token,
> > +			 const char *str, unsigned int len, void *buf,
> > +			 unsigned int size)
> > +{
> > +	int ret;
> > +
> > +	ret = parse_vc(ctx, token, str, len, buf, size);
> > +	if (ret < 0)
> > +		return ret;
> > +	ret = rte_flow_dynf_metadata_register();
> > +	if (ret < 0)
> > +		return -1;
> > +	return len;
> > +}
> > +
> >  /** Parse tokens for destroy command. */
> >  static int
> >  parse_destroy(struct context *ctx, const struct token *token,
> > diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
> > index f20531d..56075b3 100644
> > --- a/app/test-pmd/util.c
> > +++ b/app/test-pmd/util.c
> > @@ -82,6 +82,11 @@
> >  			       mb->vlan_tci, mb->vlan_tci_outer);
> >  		else if (ol_flags & PKT_RX_VLAN)
> >  			printf(" - VLAN tci=0x%x", mb->vlan_tci);
> > +		if (ol_flags & PKT_TX_METADATA)
> > +			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
> > +		if (ol_flags & PKT_RX_DYNF_METADATA)
> > +			printf(" - Rx metadata: 0x%x",
> > +			       *RTE_FLOW_DYNF_METADATA(mb));
> >  		if (mb->packet_type) {
> >  			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
> >  			printf(" - hw ptype: %s", buf);
> > diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> > index 159ce19..c943aca 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
> >     | ``mask`` | ``id``   | zeroed to match any value |
> >     +----------+----------+---------------------------+
> >
> > +Item: ``META``
> > +^^^^^^^^^^^^^^^^^
> > +
> > +Matches 32 bit metadata item set.
> > +
> > +On egress, metadata can be set either by mbuf metadata field with
> > +PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
> > +action sets metadata for a packet and the metadata will be reported via
> > +``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
> > +
> > +- Default ``mask`` matches the specified Rx metadata value.
> > +
> > +.. _table_rte_flow_item_meta:
> > +
> > +.. table:: META
> > +
> > +   +----------+----------+---------------------------------------+
> > +   | Field    | Subfield | Value                                 |
> > +   +==========+==========+=======================================+
> > +   | ``spec`` | ``data`` | 32 bit metadata value                 |
> > +   +----------+----------+---------------------------------------+
> > +   | ``last`` | ``data`` | upper range value                     |
> > +   +----------+----------+---------------------------------------+
> > +   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> > +   +----------+----------+---------------------------------------+
> > +
> >  Data matching item types
> >  ~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > @@ -1232,21 +1258,6 @@ Matches a PPPoE session protocol identifier.
> >  - ``proto_id``: PPP protocol identifier.
> >  - Default ``mask`` matches proto_id only.
> >
> > -
> > -.. _table_rte_flow_item_meta:
> > -
> > -.. table:: META
> > -
> > -   +----------+----------+---------------------------------------+
> > -   | Field    | Subfield | Value                                 |
> > -   +==========+==========+=======================================+
> > -   | ``spec`` | ``data`` | 32 bit metadata value                 |
> > -   +----------+--------------------------------------------------+
> > -   | ``last`` | ``data`` | upper range value                     |
> > -   +----------+----------+---------------------------------------+
> > -   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> > -   +----------+----------+---------------------------------------+
> > -
> >  Item: ``NSH``
> >  ^^^^^^^^^^^^^
> >
> > @@ -2466,6 +2477,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
> >
> >  Using this action on non-matching traffic will result in undefined behavior.
> >
> > +Action: ``SET_META``
> > +^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > +Set metadata. Item ``META`` matches metadata.
> > +
> > +Metadata set by mbuf metadata field with PKT_TX_METADATA flag on
> > +egress will be overridden by this action. On ingress, the metadata
> > +will be carried by ``metadata`` dynamic field of ``rte_mbuf`` which
> > +can be accessed by ``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA
> > +flag will be set along with the data.
> > +
> > +The mbuf dynamic field must be registered by calling
> > +``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
> > +
> > +Altering partial bits is supported with ``mask``. For bits which have
> > +never been set, unpredictable value will be seen depending on driver
> > +implementation. For loopback/hairpin packet, metadata set on Rx/Tx
> > +may or may not be propagated to the other path depending on HW
> > +capability.
> > +
> > +.. _table_rte_flow_action_set_meta:
> > +
> > +.. table:: SET_META
> > +
> > +   +----------+----------------------------+
> > +   | Field    | Value                      |
> > +   +==========+============================+
> > +   | ``data`` | 32 bit metadata value      |
> > +   +----------+----------------------------+
> > +   | ``mask`` | bit-mask applies to "data" |
> > +   +----------+----------------------------+
> > +
> >  Negative types
> >  ~~~~~~~~~~~~~~
> >
> > diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
> > index f6e90cb..963c4f8 100644
> > --- a/doc/guides/rel_notes/release_19_11.rst
> > +++ b/doc/guides/rel_notes/release_19_11.rst
> > @@ -237,6 +237,14 @@ New Features
> >    On supported NICs, we can now setup haipin queue which will offload packets
> >    from the wire, backto the wire.
> >
> > +* **Extended metadata support in rte_flow.**
> > +
> > +  Flow metadata is extended to both Rx and Tx.
> > +
> > +  * Tx metadata can also be set by SET_META action of rte_flow.
> > +  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
> > +    PKT_RX_DYNF_METADATA.
> > +
> >
> >  Removed Items
> >  -------------
> > @@ -344,6 +352,11 @@ API Changes
> >    has been introduced in this release is used when used when all the packets
> >    enqueued in the tx adapter are destined for the same Ethernet port & Tx queue.
> >
> > +* metadata: RTE_FLOW_ITEM_TYPE_META data endianness altered to host one.
> > +  Due to the new dynamic metadata field in mbuf is host-endian either, there
> > +  is the minor compatibility issue for applications in case of 32-bit values
> > +  supported.
> > +
> >  * sched: The pipe nodes configuration parameters such as number of pipes,
> >    pipe queue sizes, pipe profiles, etc., are moved from port level structure
> >    to subport level. This allows different subports of the same port to
> > diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
> > index 48b5389..e593f34 100644
> > --- a/lib/librte_ethdev/rte_ethdev_version.map
> > +++ b/lib/librte_ethdev/rte_ethdev_version.map
> > @@ -291,4 +291,7 @@ EXPERIMENTAL {
> >  	rte_eth_rx_hairpin_queue_setup;
> >  	rte_eth_tx_hairpin_queue_setup;
> >  	rte_eth_dev_hairpin_capability_get;
> > +	rte_flow_dynf_metadata_offs;
> > +	rte_flow_dynf_metadata_mask;
> > +	rte_flow_dynf_metadata_register;
> >  };
> > diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> > index ca0f680..b0490cd 100644
> > --- a/lib/librte_ethdev/rte_flow.c
> > +++ b/lib/librte_ethdev/rte_flow.c
> > @@ -12,10 +12,18 @@
> >  #include <rte_errno.h>
> >  #include <rte_branch_prediction.h>
> >  #include <rte_string_fns.h>
> > +#include <rte_mbuf.h>
> > +#include <rte_mbuf_dyn.h>
> >  #include "rte_ethdev.h"
> >  #include "rte_flow_driver.h"
> >  #include "rte_flow.h"
> >
> > +/* Mbuf dynamic field name for metadata. */
> > +int rte_flow_dynf_metadata_offs = -1;
> > +
> > +/* Mbuf dynamic field flag bit number for metadata. */
> > +uint64_t rte_flow_dynf_metadata_mask;
> > +
> >  /**
> >   * Flow elements description tables.
> >   */
> > @@ -157,8 +165,40 @@ struct rte_flow_desc_data {
> >  	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
> >  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
> >  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> > +	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
> >  };
> >
> > +int
> > +rte_flow_dynf_metadata_register(void)
> > +{
> > +	int offset;
> > +	int flag;
> > +
> > +	static const struct rte_mbuf_dynfield desc_offs = {
> > +		.name = RTE_MBUF_DYNFIELD_METADATA_NAME,
> > +		.size = sizeof(uint32_t),
> > +		.align = __alignof__(uint32_t),
> > +	};
> > +	static const struct rte_mbuf_dynflag desc_flag = {
> > +		.name = RTE_MBUF_DYNFLAG_METADATA_NAME,
> > +	};
> > +
> > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > +	if (offset < 0)
> > +		goto error;
> > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > +	if (flag < 0)
> > +		goto error;
> > +	rte_flow_dynf_metadata_offs = offset;
> > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > +	return 0;
> > +
> > +error:
> > +	rte_flow_dynf_metadata_offs = -1;
> > +	rte_flow_dynf_metadata_mask = 0ULL;
> > +	return -rte_errno;
> > +}
> > +
> >  static int
> >  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
> >  {
> > diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > index 4fee105..f6e050c 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -28,6 +28,8 @@
> >  #include <rte_byteorder.h>
> >  #include <rte_esp.h>
> >  #include <rte_higig.h>
> > +#include <rte_mbuf.h>
> > +#include <rte_mbuf_dyn.h>
> >
> >  #ifdef __cplusplus
> >  extern "C" {
> > @@ -418,7 +420,8 @@ enum rte_flow_item_type {
> >  	/**
> >  	 * [META]
> >  	 *
> > -	 * Matches a metadata value specified in mbuf metadata field.
> > +	 * Matches a metadata value.
> > +	 *
> >  	 * See struct rte_flow_item_meta.
> >  	 */
> >  	RTE_FLOW_ITEM_TYPE_META,
> > @@ -1263,18 +1266,23 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
> >  #endif
> >
> >  /**
> > - * RTE_FLOW_ITEM_TYPE_META.
> > + * RTE_FLOW_ITEM_TYPE_META
> >   *
> > - * Matches a specified metadata value.
> > + * Matches a specified metadata value. On egress, metadata can be set either by
> > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> > + * metadata for a packet and the metadata will be reported via mbuf metadata
> > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> > + * registered in advance by rte_flow_dynf_metadata_register().
> >   */
> >  struct rte_flow_item_meta {
> > -	rte_be32_t data;
> > +	uint32_t data;
> >  };
> >
> >  /** Default mask for RTE_FLOW_ITEM_TYPE_META. */
> >  #ifndef __cplusplus
> >  static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
> > -	.data = RTE_BE32(UINT32_MAX),
> > +	.data = UINT32_MAX,
> >  };
> >  #endif
> >
> > @@ -1942,6 +1950,13 @@ enum rte_flow_action_type {
> >  	 * undefined behavior.
> >  	 */
> >  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > +
> > +	/**
> > +	 * Set metadata on ingress or egress path.
> > +	 *
> > +	 * See struct rte_flow_action_set_meta.
> > +	 */
> > +	RTE_FLOW_ACTION_TYPE_SET_META,
> >  };
> >
> >  /**
> > @@ -2429,6 +2444,57 @@ struct rte_flow_action_set_mac {
> >  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
> >  };
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_SET_META
> > + *
> > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > + * PKT_TX_METADATA flag on egress will be overridden by this action. On
> > + * ingress, the metadata will be carried by mbuf metadata dynamic field
> > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
> > + * registered in advance by rte_flow_dynf_metadata_register().
> > + *
> > + * Altering partial bits is supported with mask. For bits which have never
> > + * been set, unpredictable value will be seen depending on driver
> > + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
> > + * or may not be propagated to the other path depending on HW capability.
> > + *
> > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > + */
> > +struct rte_flow_action_set_meta {
> > +	uint32_t data;
> > +	uint32_t mask;
> > +};
> > +
> > +/* Mbuf dynamic field offset for metadata. */
> > +extern int rte_flow_dynf_metadata_offs;
> > +
> > +/* Mbuf dynamic field flag mask for metadata. */
> > +extern uint64_t rte_flow_dynf_metadata_mask;
> > +
> > +/* Mbuf dynamic field pointer for metadata. */
> > +#define RTE_FLOW_DYNF_METADATA(m) \
> > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> > +
> > +/* Mbuf dynamic flag for metadata. */
> > +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> > +
> > +__rte_experimental
> > +static inline uint32_t
> > +rte_flow_dynf_metadata_get(struct rte_mbuf *m)
> > +{
> > +	return *RTE_FLOW_DYNF_METADATA(m);
> > +}
> > +
> > +__rte_experimental
> > +static inline void
> > +rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
> > +{
> > +	*RTE_FLOW_DYNF_METADATA(m) = v;
> > +}
> > +
> >  /*
> >   * Definition of a single action.
> >   *
> > @@ -2662,6 +2728,33 @@ enum rte_flow_conv_op {
> >  };
> >
> >  /**
> > + * Check if mbuf dynamic field for metadata is registered.
> > + *
> > + * @return
> > + *   True if registered, false otherwise.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_flow_dynf_metadata_avail(void)
> > +{
> > +	return !!rte_flow_dynf_metadata_mask;
> > +}
> > +
> > +/**
> > + * Register mbuf dynamic field and flag for metadata.
> > + *
> > + * This function must be called prior to use SET_META action in order to
> > + * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
> > + * application.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +__rte_experimental
> > +int
> > +rte_flow_dynf_metadata_register(void);
> > +
> > +/**
> >   * Check whether a flow rule can be created on a given port.
> >   *
> >   * The flow rule is validated for correctness and whether it could be accepted
> > diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> > index 2e9d418..de651c1 100644
> > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -234,6 +234,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
> >  __rte_experimental
> >  void rte_mbuf_dyn_dump(FILE *out);
> >
> > -/* Placeholder for dynamic fields and flags declarations. */
> > +/*
> > + * Placeholder for dynamic fields and flags declarations.
> > + * This is centralizing point to gather all field names
> > + * and parameters together.
> > + */
> > +#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
> > +#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
> >
> 
> Sorry if it was not clear in my first comment, but I think we should have some
> words in this file about what are these field/flag.
Sure, I'll add some not-too-wordy comment here.

> 
> After that:
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
Thanks a lot for the review, we polished the patch and made it better.

With best regards, Slava

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v8 0/2] extend flow metadata feature
  2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
  2019-10-31 15:47                   ` Olivier Matz
@ 2019-10-31 16:48                   ` Viacheslav Ovsiienko
  2019-10-31 16:48                     ` [dpdk-dev] [PATCH v8 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
  2019-10-31 16:48                     ` [dpdk-dev] [PATCH v8 " Viacheslav Ovsiienko
  1 sibling, 2 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-31 16:48 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika

This patchset combines two metadata related patches to ensure
the correct applying order. The first patch introduces ingress
metadata using an mbuf dynamic field; the second one moves
egress metadata to the dynamic field introduced by the first
patch.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

Viacheslav Ovsiienko (2):
  ethdev: extend flow metadata
  ethdev: move egress metadata to dynamic field

 app/test-pmd/cmdline.c                   |   3 +-
 app/test-pmd/cmdline_flow.c              |  57 ++++++++++++++++-
 app/test-pmd/testpmd.c                   |   4 --
 app/test-pmd/testpmd.h                   |   2 +-
 app/test-pmd/util.c                      |  16 +++--
 app/test/test_mbuf.c                     |   1 -
 doc/guides/prog_guide/rte_flow.rst       |  72 ++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   |  18 ++++++
 drivers/net/mlx5/mlx5_flow_dv.c          |  19 ++----
 drivers/net/mlx5/mlx5_rxtx.c             |  22 +++----
 drivers/net/mlx5/mlx5_rxtx_vec.h         |   6 --
 drivers/net/mlx5/mlx5_txq.c              |   4 --
 lib/librte_ethdev/rte_ethdev.c           |   1 -
 lib/librte_ethdev/rte_ethdev.h           |   5 --
 lib/librte_ethdev/rte_ethdev_version.map |   3 +
 lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
 lib/librte_ethdev/rte_flow.h             | 104 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf.c               |   2 -
 lib/librte_mbuf/rte_mbuf_core.h          |  21 +------
 lib/librte_mbuf/rte_mbuf_dyn.h           |  16 ++++-
 20 files changed, 322 insertions(+), 94 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v8 1/2] ethdev: extend flow metadata
  2019-10-31 16:48                   ` [dpdk-dev] [PATCH v8 0/2] extend flow metadata feature Viacheslav Ovsiienko
@ 2019-10-31 16:48                     ` Viacheslav Ovsiienko
  2019-11-04  6:13                       ` [dpdk-dev] [PATCH v9 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2019-10-31 16:48                     ` [dpdk-dev] [PATCH v8 " Viacheslav Ovsiienko
  1 sibling, 1 reply; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-31 16:48 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika, Yongseok Koh

Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There is also a need to support metadata on ingress. Metadata can be set
by the SET_META action and matched by the META item, as on Tx. The final
value set by the action is delivered to the application via the metadata
dynamic field of mbuf, which can be accessed with the
RTE_FLOW_DYNF_METADATA() macro or with the rte_flow_dynf_metadata_set()
and rte_flow_dynf_metadata_get() helper routines. The
PKT_RX_DYNF_METADATA flag is set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to using the SET_META action.

The availability of dynamic mbuf metadata field can be checked
with rte_flow_dynf_metadata_avail() routine.

If an application intends to use the metadata feature, it registers
the metadata dynamic field and flag; the PMD then checks the metadata
field availability and handles the appropriate fields in the datapath.
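
The register/check/access sequence described above can be sketched in
plain C. This is only an illustration under stated assumptions, not the
DPDK implementation: struct sketch_mbuf and every sketch_* name are
hypothetical stand-ins for rte_mbuf, rte_flow_dynf_metadata_offs,
rte_flow_dynf_metadata_mask and the helpers added by this patch, and
the caller passes the offset and flag bit that the real mbuf dynamic
field registry would hand out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_mbuf: only what the sketch needs. */
struct sketch_mbuf {
	uint64_t ol_flags;    /* offload flags, dynamic flag bits included */
	uint8_t dyn_area[16]; /* room where dynamic fields are placed */
};

/* Stand-ins for rte_flow_dynf_metadata_offs and rte_flow_dynf_metadata_mask. */
static int sketch_meta_offs = -1;
static uint64_t sketch_meta_mask;

/*
 * Stand-in for rte_flow_dynf_metadata_register(). The real routine asks
 * the mbuf dynamic field/flag registry for an offset and a flag bit;
 * here (assumption) the caller supplies both.
 */
static int
sketch_meta_register(int offset, int flag_bit)
{
	const size_t area = sizeof(((struct sketch_mbuf *)0)->dyn_area);

	if (offset < 0 || (size_t)offset + sizeof(uint32_t) > area ||
	    flag_bit < 0 || flag_bit > 63) {
		/* Same reset as the error path in the patch. */
		sketch_meta_offs = -1;
		sketch_meta_mask = 0;
		return -1;
	}
	sketch_meta_offs = offset;
	sketch_meta_mask = 1ULL << flag_bit;
	return 0;
}

/* Same test as rte_flow_dynf_metadata_avail(): a non-zero flag mask. */
static int
sketch_meta_avail(void)
{
	return !!sketch_meta_mask;
}

/* PMD side: store the value and raise the flag, only if registered. */
static void
sketch_meta_write(struct sketch_mbuf *m, uint32_t value)
{
	assert(m != NULL);
	if (!sketch_meta_avail())
		return;
	memcpy(m->dyn_area + sketch_meta_offs, &value, sizeof(value));
	m->ol_flags |= sketch_meta_mask;
}

/* Application side: returns 1 and fills *value if the flag is set. */
static int
sketch_meta_read(const struct sketch_mbuf *m, uint32_t *value)
{
	assert(m != NULL && value != NULL);
	if (!sketch_meta_avail() || !(m->ol_flags & sketch_meta_mask))
		return 0;
	memcpy(value, m->dyn_area + sketch_meta_offs, sizeof(*value));
	return 1;
}
```

In this sketch nothing is written or read until registration succeeds,
which mirrors the rule that the dynamic field must be registered before
SET_META is used; a reader always tests the flag before touching the
field.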

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.

MARK and METADATA look similar and might operate in a similar way,
but they do not interact.

Initially, two metadata related actions were proposed:

- RTE_FLOW_ACTION_TYPE_FLAG
- RTE_FLOW_ACTION_TYPE_MARK

These actions set the special flag in the packet metadata; the MARK
action additionally stores a specified value in the metadata storage.
On packet reception the PMD puts the flag and value into the mbuf, so
applications can see that the packet was treated inside the flow engine
according to the appropriate RTE flow(s). MARK and FLAG act as a kind of
gateway to transfer per-packet information from the flow engine to the
application via the receiving datapath. There is also the item of type
RTE_FLOW_ITEM_TYPE_MARK. It allows us to extend the flow match pattern
with the capability to match the metadata values set by MARK/FLAG
actions on other flows.

From the datapath point of view, MARK and FLAG relate to the receiving
side only. It would be useful to have the same gateway on the
transmitting side, so the item of type RTE_FLOW_ITEM_TYPE_META was
proposed. The application can fill a field in the mbuf, and this value
is transferred to some field in the packet metadata inside the flow
engine. It did not matter whether these metadata fields were shared,
because the MARK and META items belonged to different domains (receiving
and transmitting) and could be vendor-specific.

So far, so good, DPDK proposes some entities to control metadata inside
the flow engine and gateways to exchange these values on a per-packet basis
via datapaths.

As we can see, the MARK and META means are not symmetric: there is no
action that would allow us to set the META value on the transmitting
path. So the action of type:

- RTE_FLOW_ACTION_TYPE_SET_META

was proposed.

Next, applications raise new requirements for packet metadata. Flow
engines are getting more complex, internal switches are introduced, and
multiple ports might be supported within the same flow engine namespace.
From the DPDK point of view, this means packets might be sent on one
eth_dev port and received on another one, while the packet path inside
the flow engine belongs entirely to the same hardware device. The
simplest example is SR-IOV with PF, VFs and the representors. This is an
opportunity to provide an out-of-band channel to transfer extra data
from one port to another, besides the packet data itself, and
applications would like to use it.

Applications are expected to probe (with rte_flow_validate) which
metadata features (FLAG, MARK, META) are actually supported by the PMD
and the underlying hardware. Support might depend on PMD configuration,
system software, hardware settings, etc., and should be detected
at run time.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
v8: - added flow metadata comment to rte_mbuf_dyn.h

v7: - http://patches.dpdk.org/patch/62278/
    - updated release notes in collateral patch

v6: - http://patches.dpdk.org/patch/62245/
    - minor code style issues
    - combined in a series with the following egress metadata patch

v5: - http://patches.dpdk.org/patch/62179/
    - addressed code style issues from comments
    - Tx metadata deprecation notice removed
      (dedicated tx_metadata patch is coming)
    - MBUF_DYNF_METADATA_NAME is split into dedicated FIELD and FLAG
      names, RTE prefix is added
    - metadata historic retrospective is added to log message
    - rebased

v4: - http://patches.dpdk.org/patch/62065/
    - documentation comments addressed
    - deprecation notice for Tx metadata offload flag
    - rebased

v3: - http://patches.dpdk.org/patch/61902/
    - rebased, neat updates

v2: - http://patches.dpdk.org/patch/60909/

v1: - http://patches.dpdk.org/patch/56104/
    - rfc: http://patches.dpdk.org/patch/54271/

 app/test-pmd/cmdline_flow.c              |  57 ++++++++++++++++-
 app/test-pmd/util.c                      |   5 ++
 doc/guides/prog_guide/rte_flow.rst       |  72 ++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   |  13 ++++
 lib/librte_ethdev/rte_ethdev_version.map |   3 +
 lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
 lib/librte_ethdev/rte_flow.h             | 103 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_dyn.h           |  16 ++++-
 8 files changed, 287 insertions(+), 22 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 0d0bc0a..e4ef066 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -316,6 +316,9 @@ enum index {
 	ACTION_RAW_ENCAP_INDEX_VALUE,
 	ACTION_RAW_DECAP_INDEX,
 	ACTION_RAW_DECAP_INDEX_VALUE,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -1067,6 +1070,7 @@ struct parse_action_priv {
 	ACTION_DEC_TCP_ACK,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1265,6 +1269,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -1329,6 +1340,10 @@ static int parse_vc_action_raw_encap_index(struct context *,
 static int parse_vc_action_raw_decap_index(struct context *,
 					   const struct token *, const char *,
 					   unsigned int, void *, unsigned int);
+static int parse_vc_action_set_meta(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
+				    unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -3378,7 +3393,31 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.help = "index of raw_encap/raw_decap data",
 		.next = NEXT(next_item),
 		.call = parse_port,
-	}
+	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc_action_set_meta,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -4818,6 +4857,22 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return ret;
 }
 
+static int
+parse_vc_action_set_meta(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	ret = rte_flow_dynf_metadata_register();
+	if (ret < 0)
+		return -1;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index f20531d..56075b3 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,6 +82,11 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
+		if (ol_flags & PKT_TX_METADATA)
+			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_RX_DYNF_METADATA)
+			printf(" - Rx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (mb->packet_type) {
 			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
 			printf(" - hw ptype: %s", buf);
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 159ce19..c943aca 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -1232,21 +1258,6 @@ Matches a PPPoE session protocol identifier.
 - ``proto_id``: PPP protocol identifier.
 - Default ``mask`` matches proto_id only.
 
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+--------------------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Item: ``NSH``
 ^^^^^^^^^^^^^
 
@@ -2466,6 +2477,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
 
 Using this action on non-matching traffic will result in undefined behavior.
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by
+``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
+``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
+with the data.
+
+The mbuf dynamic field must be registered by calling
+``rte_flow_dynf_metadata_register()`` prior to using the ``SET_META`` action.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, an unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packets, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index f6e90cb..963c4f8 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -237,6 +237,14 @@ New Features
   On supported NICs, we can now setup haipin queue which will offload packets
   from the wire, backto the wire.
 
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
+    PKT_RX_DYNF_METADATA.
+
 
 Removed Items
 -------------
@@ -344,6 +352,11 @@ API Changes
   has been introduced in this release is used when used when all the packets
   enqueued in the tx adapter are destined for the same Ethernet port & Tx queue.
 
+* metadata: RTE_FLOW_ITEM_TYPE_META data endianness changed to host order.
+  Because the new dynamic metadata field in mbuf is also host-endian, there
+  is the minor compatibility issue for applications in case of 32-bit values
+  supported.
+
 * sched: The pipe nodes configuration parameters such as number of pipes,
   pipe queue sizes, pipe profiles, etc., are moved from port level structure
   to subport level. This allows different subports of the same port to
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index 48b5389..e593f34 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -291,4 +291,7 @@ EXPERIMENTAL {
 	rte_eth_rx_hairpin_queue_setup;
 	rte_eth_tx_hairpin_queue_setup;
 	rte_eth_dev_hairpin_capability_get;
+	rte_flow_dynf_metadata_offs;
+	rte_flow_dynf_metadata_mask;
+	rte_flow_dynf_metadata_register;
 };
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index ca0f680..b0490cd 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -12,10 +12,18 @@
 #include <rte_errno.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include "rte_ethdev.h"
 #include "rte_flow_driver.h"
 #include "rte_flow.h"
 
+/* Mbuf dynamic field offset for metadata. */
+int rte_flow_dynf_metadata_offs = -1;
+
+/* Mbuf dynamic field flag mask for metadata. */
+uint64_t rte_flow_dynf_metadata_mask;
+
 /**
  * Flow elements description tables.
  */
@@ -157,8 +165,40 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
+	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
 };
 
+int
+rte_flow_dynf_metadata_register(void)
+{
+	int offset;
+	int flag;
+
+	static const struct rte_mbuf_dynfield desc_offs = {
+		.name = RTE_MBUF_DYNFIELD_METADATA_NAME,
+		.size = sizeof(uint32_t),
+		.align = __alignof__(uint32_t),
+	};
+	static const struct rte_mbuf_dynflag desc_flag = {
+		.name = RTE_MBUF_DYNFLAG_METADATA_NAME,
+	};
+
+	offset = rte_mbuf_dynfield_register(&desc_offs);
+	if (offset < 0)
+		goto error;
+	flag = rte_mbuf_dynflag_register(&desc_flag);
+	if (flag < 0)
+		goto error;
+	rte_flow_dynf_metadata_offs = offset;
+	rte_flow_dynf_metadata_mask = (1ULL << flag);
+	return 0;
+
+error:
+	rte_flow_dynf_metadata_offs = -1;
+	rte_flow_dynf_metadata_mask = 0ULL;
+	return -rte_errno;
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 4fee105..f6e050c 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -28,6 +28,8 @@
 #include <rte_byteorder.h>
 #include <rte_esp.h>
 #include <rte_higig.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -418,7 +420,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1263,18 +1266,23 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * RTE_FLOW_ITEM_TYPE_META
  *
- * Matches a specified metadata value.
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
-	rte_be32_t data;
+	uint32_t data;
 };
 
 /** Default mask for RTE_FLOW_ITEM_TYPE_META. */
 #ifndef __cplusplus
 static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
-	.data = RTE_BE32(UINT32_MAX),
+	.data = UINT32_MAX,
 };
 #endif
 
@@ -1942,6 +1950,13 @@ enum rte_flow_action_type {
 	 * undefined behavior.
 	 */
 	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2429,6 +2444,57 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * ingress, the metadata will be carried by mbuf metadata dynamic field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which have never
+ * been set, an unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packets, metadata set on Rx/Tx may
+ * or may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	uint32_t data;
+	uint32_t mask;
+};
+
+/* Mbuf dynamic field offset for metadata. */
+extern int rte_flow_dynf_metadata_offs;
+
+/* Mbuf dynamic field flag mask for metadata. */
+extern uint64_t rte_flow_dynf_metadata_mask;
+
+/* Mbuf dynamic field pointer for metadata. */
+#define RTE_FLOW_DYNF_METADATA(m) \
+	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
+
+/* Mbuf dynamic flag for metadata. */
+#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+
+__rte_experimental
+static inline uint32_t
+rte_flow_dynf_metadata_get(struct rte_mbuf *m)
+{
+	return *RTE_FLOW_DYNF_METADATA(m);
+}
+
+__rte_experimental
+static inline void
+rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
+{
+	*RTE_FLOW_DYNF_METADATA(m) = v;
+}
+
 /*
  * Definition of a single action.
  *
@@ -2662,6 +2728,33 @@ enum rte_flow_conv_op {
 };
 
 /**
+ * Check if mbuf dynamic field for metadata is registered.
+ *
+ * @return
+ *   True if registered, false otherwise.
+ */
+__rte_experimental
+static inline int
+rte_flow_dynf_metadata_avail(void)
+{
+	return !!rte_flow_dynf_metadata_mask;
+}
+
+/**
+ * Register mbuf dynamic field and flag for metadata.
+ *
+ * This function must be called prior to using the SET_META action in order to
+ * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
+ * application.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_dynf_metadata_register(void);
+
+/**
  * Check whether a flow rule can be created on a given port.
  *
  * The flow rule is validated for correctness and whether it could be accepted
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 2e9d418..96c3631 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -234,6 +234,20 @@ int rte_mbuf_dynflag_lookup(const char *name,
 __rte_experimental
 void rte_mbuf_dyn_dump(FILE *out);
 
-/* Placeholder for dynamic fields and flags declarations. */
+/*
+ * Placeholder for dynamic fields and flags declarations.
+ * This is the central point to gather all field names
+ * and parameters together.
+ */
+
+/*
+ * The metadata dynamic field provides some extra packet information
+ * to interact with RTE Flow engine. The metadata in sent mbufs can be
+ * used to match on some Flows. The metadata in received mbufs can
+ * provide some feedback from the Flows. The metadata flag tells
+ * whether the field contains actual value to send, or received one.
+ */
+#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
+#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
 
 #endif
-- 
1.8.3.1



* [dpdk-dev] [PATCH v8 2/2] ethdev: move egress metadata to dynamic field
  2019-10-31 16:48                   ` [dpdk-dev] [PATCH v8 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2019-10-31 16:48                     ` [dpdk-dev] [PATCH v8 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
@ 2019-10-31 16:48                     ` Viacheslav Ovsiienko
  2019-10-31 17:21                       ` Olivier Matz
  2019-11-01 12:34                       ` Andrew Rybchenko
  1 sibling, 2 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-31 16:48 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika

The dynamic mbuf fields were introduced by [1]. The egress metadata is a
good candidate to be moved from the statically allocated field tx_metadata
to a dynamic one. Because mbufs are used in half-duplex fashion only, it is
safe to share this dynamic field with ingress metadata.

The shared dynamic field contains either egress (if the application is going
to transmit the mbuf with tx_burst) or ingress (if the mbuf is received with
rx_burst) metadata and can be accessed by the RTE_FLOW_DYNF_METADATA() macro
or with the rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get()
helper routines. The PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be
set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to accessing the data.

The availability of dynamic mbuf metadata field can be checked with
rte_flow_dynf_metadata_avail() routine.
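
As a rough illustration of the register / check / access sequence described
above, here is a stand-alone sketch. It does not use DPDK at all: every
model_* name, the 16-byte dynamic area, the offset 8 and the flag bit 23 are
assumptions made up for this example. They only mimic the roles of
rte_flow_dynf_metadata_register(), rte_flow_dynf_metadata_avail(),
rte_flow_dynf_metadata_set()/get() and the PKT_TX_DYNF_METADATA flag.

```c
/*
 * Stand-alone model of the shared metadata dynamic field.
 * All model_* names are hypothetical; they mirror, but are not, DPDK APIs.
 */
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct model_mbuf {
	uint64_t ol_flags;    /* offload flags; one bit marks valid metadata */
	uint8_t dyn_area[16]; /* space reserved for dynamic fields */
};

static int model_meta_offs = -1; /* plays rte_flow_dynf_metadata_offs */
static uint64_t model_meta_mask; /* plays rte_flow_dynf_metadata_mask */

/* Reserve the field and the flag once; offset and bit are arbitrary here. */
static int
model_meta_register(void)
{
	model_meta_offs = 8;
	model_meta_mask = 1ULL << 23;
	return 0;
}

/* Non-zero once the field has been registered. */
static int
model_meta_avail(void)
{
	return !!model_meta_mask;
}

/* Store a host-endian 32-bit value at the registered offset. */
static void
model_meta_set(struct model_mbuf *m, uint32_t v)
{
	assert(model_meta_avail());
	memcpy(&m->dyn_area[model_meta_offs], &v, sizeof(v));
}

/* Read the value back from the registered offset. */
static uint32_t
model_meta_get(const struct model_mbuf *m)
{
	uint32_t v;

	assert(model_meta_avail());
	memcpy(&v, &m->dyn_area[model_meta_offs], sizeof(v));
	return v;
}

/*
 * Tag a packet before transmit: skip silently when the field is not
 * registered, otherwise write the value and raise the flag.
 * Returns 1 if the packet was tagged, 0 otherwise.
 */
static int
model_tx_prepare(struct model_mbuf *m, uint32_t v)
{
	if (!model_meta_avail())
		return 0;
	model_meta_set(m, v);
	m->ol_flags |= model_meta_mask;
	return 1;
}
```

In this model the same offset and flag bit serve both directions, which is
the sharing this patch relies on. The value is written separately from the
flag because the helper routines only touch the field, and the sender is
expected to set the flag in ol_flags itself.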

DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed.
The metadata support in PMDs is engaged on dynamic field registration.

The metadata feature is getting complex. There may be a set of actions
and items that PMDs support in multiple combinations; the supported
values and masks are subject to query by performing trials (with
rte_flow_validate).

[1] http://patches.dpdk.org/patch/62040/

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>

---
v8: - updates PKT_LAST_FREE

v7: - http://patches.dpdk.org/patch/62280/ 
    - updates release notes

v6: - http://patches.dpdk.org/patch/62244/

 app/test-pmd/cmdline.c                 |  3 ++-
 app/test-pmd/testpmd.c                 |  4 ----
 app/test-pmd/testpmd.h                 |  2 +-
 app/test-pmd/util.c                    | 15 +++++++++------
 app/test/test_mbuf.c                   |  1 -
 doc/guides/prog_guide/rte_flow.rst     |  6 +++---
 doc/guides/rel_notes/release_19_11.rst |  5 +++++
 drivers/net/mlx5/mlx5_flow_dv.c        | 19 ++++++-------------
 drivers/net/mlx5/mlx5_rxtx.c           | 22 +++++++++++-----------
 drivers/net/mlx5/mlx5_rxtx_vec.h       |  6 ------
 drivers/net/mlx5/mlx5_txq.c            |  4 ----
 lib/librte_ethdev/rte_ethdev.c         |  1 -
 lib/librte_ethdev/rte_ethdev.h         |  5 -----
 lib/librte_ethdev/rte_flow.h           | 19 ++++++++++---------
 lib/librte_mbuf/rte_mbuf.c             |  2 --
 lib/librte_mbuf/rte_mbuf_core.h        | 21 ++-------------------
 16 files changed, 49 insertions(+), 86 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4478069..49c45a3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -18718,12 +18718,13 @@ struct cmd_config_tx_metadata_specific_result {
 
 	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	ports[res->port_id].tx_metadata = rte_cpu_to_be_32(res->value);
+	ports[res->port_id].tx_metadata = res->value;
 	/* Add/remove callback to insert valid metadata in every Tx packet. */
 	if (ports[res->port_id].tx_metadata)
 		add_tx_md_callback(res->port_id);
 	else
 		remove_tx_md_callback(res->port_id);
+	rte_flow_dynf_metadata_register();
 }
 
 cmdline_parse_token_string_t cmd_config_tx_metadata_specific_port =
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 0fc5b45..206c12b 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1167,10 +1167,6 @@ struct extmem_param {
 		      DEV_TX_OFFLOAD_MBUF_FAST_FREE))
 			port->dev_conf.txmode.offloads &=
 				~DEV_TX_OFFLOAD_MBUF_FAST_FREE;
-		if (!(port->dev_info.tx_offload_capa &
-			DEV_TX_OFFLOAD_MATCH_METADATA))
-			port->dev_conf.txmode.offloads &=
-				~DEV_TX_OFFLOAD_MATCH_METADATA;
 		if (numa_support) {
 			if (port_numa[pid] != NUMA_NO_CONFIG)
 				port_per_socket[port_numa[pid]]++;
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 8da1e8e..caabf32 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -193,7 +193,7 @@ struct rte_port {
 	struct softnic_port     softport;  /**< softnic params */
 #endif
 	/**< metadata value to insert in Tx packets. */
-	rte_be32_t		tx_metadata;
+	uint32_t		tx_metadata;
 	const struct rte_eth_rxtx_callback *tx_set_md_cb[MAX_QUEUE_ID+1];
 };
 
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 56075b3..cf41864 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,8 +82,9 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
-		if (ol_flags & PKT_TX_METADATA)
-			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_TX_DYNF_METADATA)
+			printf(" - Tx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (ol_flags & PKT_RX_DYNF_METADATA)
 			printf(" - Rx metadata: 0x%x",
 			       *RTE_FLOW_DYNF_METADATA(mb));
@@ -188,10 +189,12 @@
 	 * Add metadata value to every Tx packet,
 	 * and set ol_flags accordingly.
 	 */
-	for (i = 0; i < nb_pkts; i++) {
-		pkts[i]->tx_metadata = ports[port_id].tx_metadata;
-		pkts[i]->ol_flags |= PKT_TX_METADATA;
-	}
+	if (rte_flow_dynf_metadata_avail())
+		for (i = 0; i < nb_pkts; i++) {
+			*RTE_FLOW_DYNF_METADATA(pkts[i]) =
+						ports[port_id].tx_metadata;
+			pkts[i]->ol_flags |= PKT_TX_DYNF_METADATA;
+		}
 	return nb_pkts;
 }
 
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 854bc26..61ecffc 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1669,7 +1669,6 @@ struct flag_name {
 		VAL_NAME(PKT_TX_SEC_OFFLOAD),
 		VAL_NAME(PKT_TX_UDP_SEG),
 		VAL_NAME(PKT_TX_OUTER_UDP_CKSUM),
-		VAL_NAME(PKT_TX_METADATA),
 	};
 
 	/* Test case to check with valid flag */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index c943aca..630e4c0 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -664,7 +664,7 @@ Item: ``META``
 Matches 32 bit metadata item set.
 
 On egress, metadata can be set either by mbuf metadata field with
-PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+PKT_TX_DYNF_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
 action sets metadata for a packet and the metadata will be reported via
 ``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
 
@@ -2482,8 +2482,8 @@ Action: ``SET_META``
 
 Set metadata. Item ``META`` matches metadata.
 
-Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
-overridden by this action. On ingress, the metadata will be carried by
+Metadata set by mbuf metadata field with PKT_TX_DYNF_METADATA flag on egress
+will be overridden by this action. On ingress, the metadata will be carried by
 ``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
 ``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
 with the data.
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 963c4f8..2e9a596 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -357,6 +357,11 @@ API Changes
   is the minor compatibility issue for applications in case of 32-bit values
   supported.
 
+* metadata: the tx_metadata mbuf field is moved to a dynamic one.
+  PKT_TX_METADATA flag is replaced with PKT_TX_DYNF_METADATA.
+  DEV_TX_OFFLOAD_MATCH_METADATA offload flag is removed, now metadata
+  support in PMD is engaged on dynamic field registration.
+
 * sched: The pipe nodes configuration parameters such as number of pipes,
   pipe queue sizes, pipe profiles, etc., are moved from port level structure
   to subport level. This allows different subports of the same port to
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index d9a7fd4..f961bff 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -793,7 +793,7 @@ struct field_modify_info modify_tcp[] = {
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-flow_dv_validate_item_meta(struct rte_eth_dev *dev,
+flow_dv_validate_item_meta(struct rte_eth_dev *dev __rte_unused,
 			   const struct rte_flow_item *item,
 			   const struct rte_flow_attr *attr,
 			   struct rte_flow_error *error)
@@ -801,17 +801,10 @@ struct field_modify_info modify_tcp[] = {
 	const struct rte_flow_item_meta *spec = item->spec;
 	const struct rte_flow_item_meta *mask = item->mask;
 	const struct rte_flow_item_meta nic_mask = {
-		.data = RTE_BE32(UINT32_MAX)
+		.data = UINT32_MAX
 	};
 	int ret;
-	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
 
-	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
-		return rte_flow_error_set(error, EPERM,
-					  RTE_FLOW_ERROR_TYPE_ITEM,
-					  NULL,
-					  "match on metadata offload "
-					  "configuration is off for this port");
 	if (!spec)
 		return rte_flow_error_set(error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
@@ -4750,10 +4743,10 @@ struct field_modify_info modify_tcp[] = {
 		meta_m = &rte_flow_item_meta_mask;
 	meta_v = (const void *)item->spec;
 	if (meta_v) {
-		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
-			 rte_be_to_cpu_32(meta_m->data));
-		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
-			 rte_be_to_cpu_32(meta_v->data & meta_m->data));
+		MLX5_SET(fte_match_set_misc2, misc2_m,
+			 metadata_reg_a, meta_m->data);
+		MLX5_SET(fte_match_set_misc2, misc2_v,
+			 metadata_reg_a, meta_v->data & meta_m->data);
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index f597c89..88a4378 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -2281,8 +2281,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	/* Engage VLAN tag insertion feature if requested. */
 	if (MLX5_TXOFF_CONFIG(VLAN) &&
 	    loc->mbuf->ol_flags & PKT_TX_VLAN_PKT) {
@@ -2341,8 +2341,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -2434,8 +2434,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -2628,8 +2628,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -3700,8 +3700,8 @@ enum mlx5_txcmp_code {
 		return false;
 	/* Fill metadata field if needed. */
 	if (MLX5_TXOFF_CONFIG(METADATA) &&
-		es->metadata != (loc->mbuf->ol_flags & PKT_TX_METADATA ?
-				 loc->mbuf->tx_metadata : 0))
+		es->metadata != (loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+				 *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0))
 		return false;
 	/* There must be no VLAN packets in eMPW loop. */
 	if (MLX5_TXOFF_CONFIG(VLAN))
@@ -5149,7 +5149,7 @@ enum mlx5_txcmp_code {
 		 */
 		olx |= MLX5_TXOFF_CONFIG_EMPW;
 	}
-	if (tx_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
+	if (rte_flow_dynf_metadata_avail()) {
 		/* We should support Flow metadata. */
 		olx |= MLX5_TXOFF_CONFIG_METADATA;
 	}
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index b54ff72..85e0bd5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -19,12 +19,6 @@
 	 DEV_TX_OFFLOAD_TCP_CKSUM | \
 	 DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM)
 
-/* HW offload capabilities of vectorized Tx. */
-#define MLX5_VEC_TX_OFFLOAD_CAP \
-	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
-	 DEV_TX_OFFLOAD_MATCH_METADATA | \
-	 DEV_TX_OFFLOAD_MULTI_SEGS)
-
 /*
  * Compile time sanity check for vectorized functions.
  */
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index dfc379c..97991f0 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -128,10 +128,6 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
-#ifdef HAVE_IBV_FLOW_DV_SUPPORT
-	if (config->dv_flow_en)
-		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
-#endif
 	return offloads;
 }
 
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 68aca1f..23b751f 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -161,7 +161,6 @@ struct rte_eth_xstats_name_off {
 	RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
-	RTE_TX_OFFLOAD_BIT2STR(MATCH_METADATA),
 };
 
 #undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 9b69255..28e29c7 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1145,11 +1145,6 @@ struct rte_eth_conf {
 #define DEV_TX_OFFLOAD_IP_TNL_TSO       0x00080000
 /** Device supports outer UDP checksum */
 #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
-/**
- * Device supports match on metadata Tx offload..
- * Application must set PKT_TX_METADATA and mbuf metadata field.
- */
-#define DEV_TX_OFFLOAD_MATCH_METADATA   0x00200000
 
 #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
 /**< Device supports Rx queue setup after device started*/
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f6e050c..51d8292 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -1268,12 +1268,12 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 /**
  * RTE_FLOW_ITEM_TYPE_META
  *
- * Matches a specified metadata value. On egress, metadata can be set either by
- * mbuf tx_metadata field with PKT_TX_METADATA flag or
- * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
- * metadata for a packet and the metadata will be reported via mbuf metadata
- * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
- * registered in advance by rte_flow_dynf_metadata_register().
+ * Matches a specified metadata value. On egress, metadata can be set
+ * either by mbuf dynamic metadata field with PKT_TX_DYNF_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META
+ * sets metadata for a packet and the metadata will be reported via mbuf
+ * metadata dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
+ * field must be registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
 	uint32_t data;
@@ -2450,8 +2450,8 @@ struct rte_flow_action_set_mac {
  *
  * RTE_FLOW_ACTION_TYPE_SET_META
  *
- * Set metadata. Metadata set by mbuf tx_metadata field with
- * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * Set metadata. Metadata set by mbuf metadata dynamic field with
+ * PKT_TX_DYNF_METADATA flag on egress will be overridden by this action. On
  * ingress, the metadata will be carried by mbuf metadata dynamic field
  * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
  * registered in advance by rte_flow_dynf_metadata_register().
@@ -2478,8 +2478,9 @@ struct rte_flow_action_set_meta {
 #define RTE_FLOW_DYNF_METADATA(m) \
 	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
 
-/* Mbuf dynamic flag for metadata. */
+/* Mbuf dynamic flags for metadata. */
 #define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+#define PKT_TX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
 
 __rte_experimental
 static inline uint32_t
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8c51dc1..35df1c4 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -670,7 +670,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
 	case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
 	case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
 	case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
-	case PKT_TX_METADATA: return "PKT_TX_METADATA";
 	default: return NULL;
 	}
 }
@@ -707,7 +706,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
 		{ PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
 		{ PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
 		{ PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
-		{ PKT_TX_METADATA, PKT_TX_METADATA, NULL },
 	};
 	const char *name;
 	unsigned int i;
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 3022701..9a8557d 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -187,16 +187,11 @@
 /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
 
 #define PKT_FIRST_FREE (1ULL << 23)
-#define PKT_LAST_FREE (1ULL << 39)
+#define PKT_LAST_FREE (1ULL << 40)
 
 /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
 
 /**
- * Indicate that the metadata field in the mbuf is in use.
- */
-#define PKT_TX_METADATA	(1ULL << 40)
-
-/**
  * Outer UDP checksum offload flag. This flag is used for enabling
  * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
  * 1) Enable the following in mbuf,
@@ -389,8 +384,7 @@
 		PKT_TX_MACSEC |		 \
 		PKT_TX_SEC_OFFLOAD |	 \
 		PKT_TX_UDP_SEG |	 \
-		PKT_TX_OUTER_UDP_CKSUM | \
-		PKT_TX_METADATA)
+		PKT_TX_OUTER_UDP_CKSUM)
 
 /**
  * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
@@ -601,17 +595,6 @@ struct rte_mbuf {
 			/**< User defined tags. See rte_distributor_process() */
 			uint32_t usr;
 		} hash;                   /**< hash information */
-		struct {
-			/**
-			 * Application specific metadata value
-			 * for egress flow rule match.
-			 * Valid if PKT_TX_METADATA is set.
-			 * Located here to allow conjunct use
-			 * with hash.sched.hi.
-			 */
-			uint32_t tx_metadata;
-			uint32_t reserved;
-		};
 	};
 
 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v8 2/2] ethdev: move egress metadata to dynamic field
  2019-10-31 16:48                     ` [dpdk-dev] [PATCH v8 " Viacheslav Ovsiienko
@ 2019-10-31 17:21                       ` Olivier Matz
  2019-11-01 12:34                       ` Andrew Rybchenko
  1 sibling, 0 replies; 98+ messages in thread
From: Olivier Matz @ 2019-10-31 17:21 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, thomas, arybchenko, orika

On Thu, Oct 31, 2019 at 04:48:26PM +0000, Viacheslav Ovsiienko wrote:
> The dynamic mbuf fields were introduced by [1]. The egress metadata is
> good candidate to be moved from statically allocated field tx_metadata to
> dynamic one. Because mbufs are used in half-duplex fashion only, it is
> safe to share this dynamic field with ingress metadata.
> 
> The shared dynamic field contains either egress (if application going to
> transmit mbuf with tx_burst) or ingress (if mbuf is received with rx_burst)
> metadata and can be accessed by RTE_FLOW_DYNF_METADATA() macro or with
> rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> routines. PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set
> along with the data.
> 
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to accessing the data.
> 
> The availability of dynamic mbuf metadata field can be checked with
> rte_flow_dynf_metadata_avail() routine.
> 
> DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed.
> The metadata support in PMDs is engaged on dynamic field registration.
> 
> Metadata feature is getting complex. We might have some set of actions
> and items that might be supported by PMDs in multiple combinations,
> the supported values and masks are the subjects to query by performing
> trials (with rte_flow_validate).
> 
> [1] http://patches.dpdk.org/patch/62040/
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> Acked-by: Ori Kam <orika@mellanox.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v4] ethdev: add flow tag
  2019-10-27 19:11           ` Ori Kam
@ 2019-10-31 18:57             ` Ferruh Yigit
  0 siblings, 0 replies; 98+ messages in thread
From: Ferruh Yigit @ 2019-10-31 18:57 UTC (permalink / raw)
  To: Ori Kam, Slava Ovsiienko, dev; +Cc: Thomas Monjalon, Matan Azrad, Yongseok Koh

On 10/27/2019 7:11 PM, Ori Kam wrote:
> 
> 
>> -----Original Message-----
>> From: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
>> Sent: Sunday, October 27, 2019 8:42 PM
>> To: dev@dpdk.org
>> Cc: Thomas Monjalon <thomas@monjalon.net>; Matan Azrad
>> <matan@mellanox.com>; Ori Kam <orika@mellanox.com>; Yongseok Koh
>> <yskoh@mellanox.com>
>> Subject: [PATCH v4] ethdev: add flow tag
>>
>> A tag is a transient data which can be used during flow match. This can be
>> used to store match result from a previous table so that the same pattern
>> need not be matched again on the next table. Even if outer header is
>> decapsulated on the previous match, the match result can be kept.
>>
>> Some device expose internal registers of its flow processing pipeline and
>> those registers are quite useful for stateful connection tracking as it
>> keeps status of flow matching. Multiple tags are supported by specifying
>> index.
>>
>> Example testpmd commands are:
>>
>>   flow create 0 ingress pattern ... / end
>>     actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>>             set_tag index 3 value 0x123456 mask 0xffffff /
>>             vxlan_decap / jump group 1 / end
>>
>>   flow create 0 ingress pattern ... / end
>>     actions set_tag index 2 value 0xcc00 mask 0xff00 /
>>             set_tag index 3 value 0x123456 mask 0xffffff /
>>             vxlan_decap / jump group 1 / end
>>
>>   flow create 0 ingress group 1
>>     pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>>             eth ... / end
>>     actions ... jump group 2 / end
>>
>>   flow create 0 ingress group 1
>>     pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>>             tag index is 3 value spec 0x123456 value mask 0xffffff /
>>             eth ... / end
>>     actions ... / end
>>
>>   flow create 0 ingress group 2
>>     pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>>             eth ... / end
>>     actions ... / end
>>
>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
>>
>> ---
>> v4: rebased, doc comments are addressed
>> v3: http://patches.dpdk.org/patch/61902/
>> v2: http://patches.dpdk.org/patch/60909/
>> v1: http://patches.dpdk.org/patch/56104/
>> rfc: http://patches.dpdk.org/patch/54271/
> 
> 
> Acked-by: Ori Kam <orika@mellanox.com>
> 

Applied to dpdk-next-net/master, thanks.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [dpdk-dev] [PATCH v8 2/2] ethdev: move egress metadata to dynamic field
  2019-10-31 16:48                     ` [dpdk-dev] [PATCH v8 " Viacheslav Ovsiienko
  2019-10-31 17:21                       ` Olivier Matz
@ 2019-11-01 12:34                       ` Andrew Rybchenko
  1 sibling, 0 replies; 98+ messages in thread
From: Andrew Rybchenko @ 2019-11-01 12:34 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, dev; +Cc: matan, rasland, thomas, olivier.matz, orika

On 10/31/19 7:48 PM, Viacheslav Ovsiienko wrote:
> The dynamic mbuf fields were introduced by [1]. The egress metadata is
> good candidate to be moved from statically allocated field tx_metadata to
> dynamic one. Because mbufs are used in half-duplex fashion only, it is
> safe to share this dynamic field with ingress metadata.
>
> The shared dynamic field contains either egress (if application going to
> transmit mbuf with tx_burst) or ingress (if mbuf is received with rx_burst)
> metadata and can be accessed by RTE_FLOW_DYNF_METADATA() macro or with
> rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
> routines. PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set
> along with the data.
>
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to accessing the data.
>
> The availability of dynamic mbuf metadata field can be checked with
> rte_flow_dynf_metadata_avail() routine.
>
> DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed.
> The metadata support in PMDs is engaged on dynamic field registration.
>
> Metadata feature is getting complex. We might have some set of actions
> and items that might be supported by PMDs in multiple combinations,
> the supported values and masks are the subjects to query by performing
> trials (with rte_flow_validate).
>
> [1] http://patches.dpdk.org/patch/62040/
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> Acked-by: Ori Kam <orika@mellanox.com>

I'm not sure that removal of DEV_TX_OFFLOAD_MATCH_METADATA
is a step in right direction, anyway:

Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v9 0/2] extend flow metadata feature
  2019-10-31 16:48                     ` [dpdk-dev] [PATCH v8 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
@ 2019-11-04  6:13                       ` Viacheslav Ovsiienko
  2019-11-04  6:13                         ` [dpdk-dev] [PATCH v9 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
  2019-11-04  6:13                         ` [dpdk-dev] [PATCH v9 2/2] ethdev: move egress metadata to dynamic field Viacheslav Ovsiienko
  0 siblings, 2 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-11-04  6:13 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika

This patchset combines two metadata-related patches to ensure
they are applied in the right order. The first patch introduces
ingress metadata using an mbuf dynamic field; the second one
moves egress metadata to the dynamic field introduced by the
first patch.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

Viacheslav Ovsiienko (2):
  ethdev: extend flow metadata
  ethdev: move egress metadata to dynamic field

 app/test-pmd/cmdline.c                   |   3 +-
 app/test-pmd/cmdline_flow.c              |  55 ++++++++++++++++
 app/test-pmd/testpmd.c                   |   4 --
 app/test-pmd/testpmd.h                   |   2 +-
 app/test-pmd/util.c                      |  16 +++--
 app/test/test_mbuf.c                     |   1 -
 doc/guides/prog_guide/rte_flow.rst       |  76 +++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   |  17 +++++
 drivers/net/mlx5/mlx5_flow_dv.c          |  19 ++----
 drivers/net/mlx5/mlx5_rxtx.c             |  22 +++----
 drivers/net/mlx5/mlx5_rxtx_vec.h         |   6 --
 drivers/net/mlx5/mlx5_txq.c              |   4 --
 lib/librte_ethdev/rte_ethdev.c           |   1 -
 lib/librte_ethdev/rte_ethdev.h           |   5 --
 lib/librte_ethdev/rte_ethdev_version.map |   3 +
 lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
 lib/librte_ethdev/rte_flow.h             | 104 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf.c               |   2 -
 lib/librte_mbuf/rte_mbuf_core.h          |  21 +------
 lib/librte_mbuf/rte_mbuf_dyn.h           |  16 ++++-
 20 files changed, 322 insertions(+), 95 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v9 1/2] ethdev: extend flow metadata
  2019-11-04  6:13                       ` [dpdk-dev] [PATCH v9 0/2] extend flow metadata feature Viacheslav Ovsiienko
@ 2019-11-04  6:13                         ` Viacheslav Ovsiienko
  2019-11-05 14:19                           ` [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2019-11-04  6:13                         ` [dpdk-dev] [PATCH v9 2/2] ethdev: move egress metadata to dynamic field Viacheslav Ovsiienko
  1 sibling, 1 reply; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-11-04  6:13 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika, Yongseok Koh

Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There is also a need to support metadata on ingress. Metadata can be set by
the SET_META action and matched by the META item, as on Tx. The final value
set by the action will be delivered to the application via the metadata
dynamic field of the mbuf, which can be accessed by the
RTE_FLOW_DYNF_METADATA() macro or with the rte_flow_dynf_metadata_set() and
rte_flow_dynf_metadata_get() helper routines. The PKT_RX_DYNF_METADATA flag
will be set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to using the SET_META action.

The availability of dynamic mbuf metadata field can be checked
with rte_flow_dynf_metadata_avail() routine.

If an application is going to use the metadata feature, it registers
the metadata dynamic field and flag; the PMD then checks the metadata
field availability and handles the appropriate fields in the datapath.
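
The register/check/access sequence described above can be illustrated with
a small standalone model. This is only a sketch, not DPDK code: struct
model_mbuf, the model_* names, and the fixed offset and flag bit are all
assumptions made for illustration. In DPDK the offset and flag bit are
assigned by the mbuf library when rte_flow_dynf_metadata_register() is
called.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_mbuf. */
struct model_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[16]; /* plays the role of the dynamic field space */
};

/* Mirror rte_flow_dynf_metadata_offs/_mask: -1 and 0 mean "not registered". */
static int model_metadata_offs = -1;
static uint64_t model_metadata_mask;

/* In DPDK the mbuf library picks the offset and flag bit; fixed here. */
static int
model_metadata_register(void)
{
	model_metadata_offs = 4;
	model_metadata_mask = 1ULL << 23;
	return 0;
}

/* Same test as rte_flow_dynf_metadata_avail(): a non-zero mask. */
static int
model_metadata_avail(void)
{
	return !!model_metadata_mask;
}

/* What a PMD would do on Rx: store the value and raise the flag. */
static void
model_pmd_rx_deliver(struct model_mbuf *m, uint32_t value)
{
	if (!model_metadata_avail())
		return;
	memcpy(&m->dyn_area[model_metadata_offs], &value, sizeof(value));
	m->ol_flags |= model_metadata_mask;
}

/* What an application would do: trust the field only if the flag is set. */
static int
model_app_read(const struct model_mbuf *m, uint32_t *value)
{
	if (!(m->ol_flags & model_metadata_mask))
		return -1;
	memcpy(value, &m->dyn_area[model_metadata_offs], sizeof(*value));
	return 0;
}
```

In this model, a value delivered before registration is dropped and
model_app_read() returns -1, because the flag mask is still zero.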

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.
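
The SET_META action in this patch also takes a mask (see struct
rte_flow_action_set_meta), and its documentation says that altering partial
bits is supported. The sketch below models one plausible reading of that:
bits selected by the mask come from data, the others keep their previous
value. The formula and the model_* names are assumptions for illustration;
the actual update is done by the device and may differ per driver.

```c
#include <stdint.h>

/* Same fields as struct rte_flow_action_set_meta in this patch. */
struct model_action_set_meta {
	uint32_t data;
	uint32_t mask;
};

/*
 * Hypothetical helper: bits selected by mask are taken from data,
 * all other bits keep their previous value.
 */
static uint32_t
model_set_meta(uint32_t prev, const struct model_action_set_meta *conf)
{
	return (prev & ~conf->mask) | (conf->data & conf->mask);
}
```

With a full mask (UINT32_MAX) the previous value is replaced entirely; with
a zero mask it is left untouched.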

MARK and METADATA look similar and might operate in a similar way,
but they do not interact.

Initially, two metadata-related actions were proposed:

- RTE_FLOW_ACTION_TYPE_FLAG
- RTE_FLOW_ACTION_TYPE_MARK

These actions set a special flag in the packet metadata; the MARK action
also stores a specified value in the metadata storage. On packet reception,
the PMD puts the flag and value into the mbuf, so applications can see that
the packet was treated inside the flow engine according to the appropriate
RTE flow(s). MARK and FLAG act as a kind of gateway to transfer some
per-packet information from the flow engine to the application via the
receiving datapath. There is also an item of type RTE_FLOW_ITEM_TYPE_MARK.
It allows us to extend the flow match pattern with the capability
to match the metadata values set by MARK/FLAG actions on other flows.

From the datapath point of view, MARK and FLAG are related to the
receiving side only. It would be useful to have the same gateway on the
transmitting side, and so the item of type RTE_FLOW_ITEM_TYPE_META
was proposed. The application can fill the field in the mbuf and this value
will be transferred to some field in the packet metadata inside the flow
engine. It did not matter whether these metadata fields are shared, because
the MARK and META items belonged to different domains (receiving and
transmitting) and could be vendor-specific.

So far, so good: DPDK provides some entities to control metadata inside
the flow engine and gateways to exchange these values on a per-packet basis
via datapaths.

As we can see, the MARK and META means are not symmetric: there is no
action which would allow us to set the META value on the transmitting path.
So, the action of type:

- RTE_FLOW_ACTION_TYPE_SET_META was proposed.

Next, applications raise new requirements for packet metadata.
Flow engines are getting more complex, internal switches are introduced,
and multiple ports might be supported within the same flow engine namespace.
From the DPDK point of view, this means packets might be sent on one
eth_dev port and received on another one, while the packet path inside
the flow engine entirely belongs to the same hardware device. The simplest
example is SR-IOV with PF, VFs and the representors. This gives an
opportunity to provide an out-of-band channel to transfer some extra data
from one port to another one, besides the packet data itself, and
applications would like to use this opportunity.

Applications are expected to use trials (with rte_flow_validate)
to detect which metadata features (FLAG, MARK, META) are actually supported
by the PMD and the underlying hardware. This might depend on PMD
configuration, system software, hardware settings, etc., and should be
detected at run time.
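
The trial-based detection described above can be sketched as a loop that
tries each feature once and records which trials succeed. This is a
standalone model under stated assumptions: the model_* names are
hypothetical, and the callback merely stands in for building a rule that
uses the feature and passing it to rte_flow_validate(), which returns 0
when the rule is accepted and a negative value otherwise.

```c
/* Metadata features named in the commit message. */
enum model_feature {
	MODEL_FEATURE_FLAG,
	MODEL_FEATURE_MARK,
	MODEL_FEATURE_META,
	MODEL_FEATURE_COUNT,
};

/*
 * Hypothetical callback standing in for one rte_flow_validate() trial:
 * returns 0 if a rule using the feature is accepted, negative otherwise.
 */
typedef int (*model_validate_fn)(enum model_feature feature);

/* Run one trial per feature and collect the accepted ones in a bitmask. */
static unsigned int
model_probe_features(model_validate_fn validate)
{
	unsigned int supported = 0;
	int f;

	for (f = 0; f < MODEL_FEATURE_COUNT; f++)
		if (validate((enum model_feature)f) == 0)
			supported |= 1u << f;
	return supported;
}

/* Made-up device that accepts MARK and META but rejects FLAG. */
static int
model_fake_validate(enum model_feature feature)
{
	return feature == MODEL_FEATURE_FLAG ? -1 : 0;
}
```

A real application would likely repeat such trials per port and per
direction, since support may differ between them.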

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
v9: - rebased

v8: - http://patches.dpdk.org/patch/62288/
    - add flow metadata comment to rte_mbuf_dyn.h

v7: - http://patches.dpdk.org/patch/62278/
    - updated release notes in collateral patch

v6: - http://patches.dpdk.org/patch/62245/
    - minor code style issues
    - combined in series with the following egress metadata patch

v5: - http://patches.dpdk.org/patch/62179/
    - addressed code style issues from comments
    - Tx metadata deprecation notice removed
      (dedicated tx_metadata patch is coming)
    - MBUF_DYNF_METADATA_NAME is split into FIELD and FLAG
      dedicated ones, RTE suffix is added
    - metadata historic retrospective is added to log message
    - rebased

v4: - http://patches.dpdk.org/patch/62065/
    - documentation comments addressed
    - deprecation notice for Tx metadata offload flag
    - rebased

v3: - http://patches.dpdk.org/patch/61902/
    - rebased, neat updates

v2: - http://patches.dpdk.org/patch/60909/

v1: - http://patches.dpdk.org/patch/56104/
    - rfc: http://patches.dpdk.org/patch/54271/

 app/test-pmd/cmdline_flow.c              |  55 +++++++++++++++++
 app/test-pmd/util.c                      |   5 ++
 doc/guides/prog_guide/rte_flow.rst       |  76 ++++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   |  12 ++++
 lib/librte_ethdev/rte_ethdev_version.map |   3 +
 lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
 lib/librte_ethdev/rte_flow.h             | 103 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_dyn.h           |  16 ++++-
 8 files changed, 287 insertions(+), 23 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 55be271..2c24187 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -323,6 +323,9 @@ enum index {
 	ACTION_SET_TAG_DATA,
 	ACTION_SET_TAG_INDEX,
 	ACTION_SET_TAG_MASK,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -1083,6 +1086,7 @@ struct parse_action_priv {
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
 	ACTION_SET_TAG,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1289,6 +1293,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -1353,6 +1364,10 @@ static int parse_vc_action_raw_encap_index(struct context *,
 static int parse_vc_action_raw_decap_index(struct context *,
 					   const struct token *, const char *,
 					   unsigned int, void *, unsigned int);
+static int parse_vc_action_set_meta(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
+				    unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -3454,6 +3469,30 @@ static int comp_set_raw_index(struct context *, const struct token *,
 			     (struct rte_flow_action_set_tag, mask)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc_action_set_meta,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -4893,6 +4932,22 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return ret;
 }
 
+static int
+parse_vc_action_set_meta(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	ret = rte_flow_dynf_metadata_register();
+	if (ret < 0)
+		return -1;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index f20531d..56075b3 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,6 +82,11 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
+		if (ol_flags & PKT_TX_METADATA)
+			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_RX_DYNF_METADATA)
+			printf(" - Rx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (mb->packet_type) {
 			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
 			printf(" - hw ptype: %s", buf);
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index b2b34d8..1f72cc7 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -686,8 +686,34 @@ Matches tag item set by other flows. Multiple tags are supported by specifying
    |          | ``index`` | field is ignored                      |
    +----------+-----------+---------------------------------------+
 
-ata matching item types
-~~~~~~~~~~~~~~~~~~~~~~~
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
+Data matching item types
+~~~~~~~~~~~~~~~~~~~~~~~~
 
 Most of these are basically protocol header definitions with associated
 bit-masks. They must be specified (stacked) from lowest to highest protocol
@@ -1260,21 +1286,6 @@ Matches a PPPoE session protocol identifier.
 - ``proto_id``: PPP protocol identifier.
 - Default ``mask`` matches proto_id only.
 
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+--------------------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Item: ``NSH``
 ^^^^^^^^^^^^^
 
@@ -2516,6 +2527,37 @@ application. Multiple tags are supported by specifying index.
    | ``index`` | index of tag to set        |
    +-----------+----------------------------+
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by
+``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
+``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
+with the data.
+
+The mbuf dynamic field must be registered by calling
+``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index f96ac38..f7f2ddb 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -241,6 +241,13 @@ New Features
   * Added a console command to testpmd app, ``show port (port_id) ptypes`` which
     gives ability to print port supported ptypes in different protocol layers.
 
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
+    PKT_RX_DYNF_METADATA.
 
 Removed Items
 -------------
@@ -353,6 +360,11 @@ API Changes
   has been introduced in this release is used when used when all the packets
   enqueued in the tx adapter are destined for the same Ethernet port & Tx queue.
 
+* metadata: RTE_FLOW_ITEM_TYPE_META data endianness altered to host one.
+  Because the new dynamic metadata field in mbuf is host-endian as well,
+  there is a minor compatibility issue for applications in case 32-bit
+  values are supported.
+
 * sched: The pipe nodes configuration parameters such as number of pipes,
   pipe queue sizes, pipe profiles, etc., are moved from port level structure
   to subport level. This allows different subports of the same port to
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index c9104fd..56ba848 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -290,4 +290,7 @@ EXPERIMENTAL {
 	rte_eth_rx_hairpin_queue_setup;
 	rte_eth_tx_burst_mode_get;
 	rte_eth_tx_hairpin_queue_setup;
+	rte_flow_dynf_metadata_offs;
+	rte_flow_dynf_metadata_mask;
+	rte_flow_dynf_metadata_register;
 };
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 2f86d1a..33e3011 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -12,10 +12,18 @@
 #include <rte_errno.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include "rte_ethdev.h"
 #include "rte_flow_driver.h"
 #include "rte_flow.h"
 
+/* Mbuf dynamic field offset for metadata. */
+int rte_flow_dynf_metadata_offs = -1;
+
+/* Mbuf dynamic field flag mask for metadata. */
+uint64_t rte_flow_dynf_metadata_mask;
+
 /**
  * Flow elements description tables.
  */
@@ -159,8 +167,40 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(SET_TAG, sizeof(struct rte_flow_action_set_tag)),
+	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
 };
 
+int
+rte_flow_dynf_metadata_register(void)
+{
+	int offset;
+	int flag;
+
+	static const struct rte_mbuf_dynfield desc_offs = {
+		.name = RTE_MBUF_DYNFIELD_METADATA_NAME,
+		.size = sizeof(uint32_t),
+		.align = __alignof__(uint32_t),
+	};
+	static const struct rte_mbuf_dynflag desc_flag = {
+		.name = RTE_MBUF_DYNFLAG_METADATA_NAME,
+	};
+
+	offset = rte_mbuf_dynfield_register(&desc_offs);
+	if (offset < 0)
+		goto error;
+	flag = rte_mbuf_dynflag_register(&desc_flag);
+	if (flag < 0)
+		goto error;
+	rte_flow_dynf_metadata_offs = offset;
+	rte_flow_dynf_metadata_mask = (1ULL << flag);
+	return 0;
+
+error:
+	rte_flow_dynf_metadata_offs = -1;
+	rte_flow_dynf_metadata_mask = 0ULL;
+	return -rte_errno;
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index ed398ac..934c3e1 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -28,6 +28,8 @@
 #include <rte_byteorder.h>
 #include <rte_esp.h>
 #include <rte_higig.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -418,7 +420,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1272,18 +1275,23 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * RTE_FLOW_ITEM_TYPE_META
  *
- * Matches a specified metadata value.
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
-	rte_be32_t data;
+	uint32_t data;
 };
 
 /** Default mask for RTE_FLOW_ITEM_TYPE_META. */
 #ifndef __cplusplus
 static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
-	.data = RTE_BE32(UINT32_MAX),
+	.data = UINT32_MAX,
 };
 #endif
 
@@ -1989,6 +1997,13 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_tag.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_TAG,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2491,6 +2506,57 @@ struct rte_flow_action_set_tag {
 	uint8_t index;
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * ingress, the metadata will be carried by mbuf metadata dynamic field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which have never
+ * been set, unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
+ * or may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	uint32_t data;
+	uint32_t mask;
+};
+
+/* Mbuf dynamic field offset for metadata. */
+extern int rte_flow_dynf_metadata_offs;
+
+/* Mbuf dynamic field flag mask for metadata. */
+extern uint64_t rte_flow_dynf_metadata_mask;
+
+/* Mbuf dynamic field pointer for metadata. */
+#define RTE_FLOW_DYNF_METADATA(m) \
+	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
+
+/* Mbuf dynamic flag for metadata. */
+#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+
+__rte_experimental
+static inline uint32_t
+rte_flow_dynf_metadata_get(struct rte_mbuf *m)
+{
+	return *RTE_FLOW_DYNF_METADATA(m);
+}
+
+__rte_experimental
+static inline void
+rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
+{
+	*RTE_FLOW_DYNF_METADATA(m) = v;
+}
+
 /*
  * Definition of a single action.
  *
@@ -2724,6 +2790,33 @@ enum rte_flow_conv_op {
 };
 
 /**
+ * Check if mbuf dynamic field for metadata is registered.
+ *
+ * @return
+ *   True if registered, false otherwise.
+ */
+__rte_experimental
+static inline int
+rte_flow_dynf_metadata_avail(void)
+{
+	return !!rte_flow_dynf_metadata_mask;
+}
+
+/**
+ * Register mbuf dynamic field and flag for metadata.
+ *
+ * This function must be called prior to use SET_META action in order to
+ * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
+ * application.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_dynf_metadata_register(void);
+
+/**
  * Check whether a flow rule can be created on a given port.
  *
  * The flow rule is validated for correctness and whether it could be accepted
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 2e9d418..96c3631 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -234,6 +234,20 @@ int rte_mbuf_dynflag_lookup(const char *name,
 __rte_experimental
 void rte_mbuf_dyn_dump(FILE *out);
 
-/* Placeholder for dynamic fields and flags declarations. */
+/*
+ * Placeholder for dynamic fields and flags declarations.
+ * This is centralizing point to gather all field names
+ * and parameters together.
+ */
+
+/*
+ * The metadata dynamic field provides some extra packet information
+ * to interact with RTE Flow engine. The metadata in sent mbufs can be
+ * used to match on some Flows. The metadata in received mbufs can
+ * provide some feedback from the Flows. The metadata flag tells
+ * whether the field contains actual value to send, or received one.
+ */
+#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
+#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
 
 #endif
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v9 2/2] ethdev: move egress metadata to dynamic field
  2019-11-04  6:13                       ` [dpdk-dev] [PATCH v9 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2019-11-04  6:13                         ` [dpdk-dev] [PATCH v9 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
@ 2019-11-04  6:13                         ` Viacheslav Ovsiienko
  1 sibling, 0 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-11-04  6:13 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika

The dynamic mbuf fields were introduced by [1]. The egress metadata is a
good candidate to be moved from the statically allocated tx_metadata field
to a dynamic one. Because mbufs are used in half-duplex fashion only, it
is safe to share this dynamic field with ingress metadata.

The shared dynamic field contains either egress metadata (if the
application is going to transmit the mbuf with tx_burst) or ingress
metadata (if the mbuf is received with rx_burst). It can be accessed by
the RTE_FLOW_DYNF_METADATA() macro or with the
rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
routines. The PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set
along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to accessing the data.

The availability of the dynamic mbuf metadata field can be checked with
the rte_flow_dynf_metadata_avail() routine.
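
The register/check/access sequence described above can be modelled in a
few lines of standalone C. This is only a sketch under stated assumptions,
not DPDK code: struct sketch_mbuf, the sketch_* names, the fixed offset and
the flag bit are all hypothetical stand-ins for rte_mbuf, the
rte_flow_dynf_metadata_* helpers and whatever offset and bit the real
registration returns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for rte_mbuf: flags plus a dynamic-field area. */
#define SKETCH_DYN_AREA_SIZE 16

struct sketch_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[SKETCH_DYN_AREA_SIZE];
};

/* Play the role of rte_flow_dynf_metadata_offs and _mask. */
static int sketch_metadata_offs = -1;
static uint64_t sketch_metadata_mask;

/* Assumption: registration yields a fixed offset and flag bit here. */
static int
sketch_metadata_register(void)
{
	sketch_metadata_offs = 4;
	sketch_metadata_mask = 1ULL << 23;
	return 0;
}

/* Non-zero once the field and flag have been registered. */
static int
sketch_metadata_avail(void)
{
	return !!sketch_metadata_mask;
}

/* Read the 32-bit metadata value from the shared dynamic field. */
static uint32_t
sketch_metadata_get(const struct sketch_mbuf *m)
{
	uint32_t v;

	assert(sketch_metadata_offs >= 0);
	memcpy(&v, &m->dyn_area[sketch_metadata_offs], sizeof(v));
	return v;
}

/*
 * Tx side: store the value and raise the flag, but only when the
 * field is registered. Returns 1 if metadata was attached, else 0.
 */
static int
sketch_tx_attach_metadata(struct sketch_mbuf *m, uint32_t v)
{
	if (!sketch_metadata_avail())
		return 0;
	memcpy(&m->dyn_area[sketch_metadata_offs], &v, sizeof(v));
	m->ol_flags |= sketch_metadata_mask;
	return 1;
}
```

The same field and the same flag bit serve both directions in this model,
which is the half-duplex sharing argument made above: an mbuf is either
about to be sent or was just received, never both at once.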

DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed.
The metadata support in PMDs is engaged on dynamic field registration.

The metadata feature is getting complex. PMDs might support some set of
actions and items in multiple combinations, so the supported values and
masks have to be discovered by performing trials (with
rte_flow_validate).

[1] http://patches.dpdk.org/patch/62040/

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
v9: - rebased

v8: - http://patches.dpdk.org/patch/62287/
    - updates PKT_LAST_FREE

v7: - http://patches.dpdk.org/patch/62280/ 
    - updates release notes

v6: - http://patches.dpdk.org/patch/62244

 app/test-pmd/cmdline.c                 |  3 ++-
 app/test-pmd/testpmd.c                 |  4 ----
 app/test-pmd/testpmd.h                 |  2 +-
 app/test-pmd/util.c                    | 15 +++++++++------
 app/test/test_mbuf.c                   |  1 -
 doc/guides/prog_guide/rte_flow.rst     |  6 +++---
 doc/guides/rel_notes/release_19_11.rst |  5 +++++
 drivers/net/mlx5/mlx5_flow_dv.c        | 19 ++++++-------------
 drivers/net/mlx5/mlx5_rxtx.c           | 22 +++++++++++-----------
 drivers/net/mlx5/mlx5_rxtx_vec.h       |  6 ------
 drivers/net/mlx5/mlx5_txq.c            |  4 ----
 lib/librte_ethdev/rte_ethdev.c         |  1 -
 lib/librte_ethdev/rte_ethdev.h         |  5 -----
 lib/librte_ethdev/rte_flow.h           | 19 ++++++++++---------
 lib/librte_mbuf/rte_mbuf.c             |  2 --
 lib/librte_mbuf/rte_mbuf_core.h        | 21 ++-------------------
 16 files changed, 49 insertions(+), 86 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4478069..49c45a3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -18718,12 +18718,13 @@ struct cmd_config_tx_metadata_specific_result {
 
 	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	ports[res->port_id].tx_metadata = rte_cpu_to_be_32(res->value);
+	ports[res->port_id].tx_metadata = res->value;
 	/* Add/remove callback to insert valid metadata in every Tx packet. */
 	if (ports[res->port_id].tx_metadata)
 		add_tx_md_callback(res->port_id);
 	else
 		remove_tx_md_callback(res->port_id);
+	rte_flow_dynf_metadata_register();
 }
 
 cmdline_parse_token_string_t cmd_config_tx_metadata_specific_port =
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 38acbc5..5ba9741 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1119,10 +1119,6 @@ struct extmem_param {
 		      DEV_TX_OFFLOAD_MBUF_FAST_FREE))
 			port->dev_conf.txmode.offloads &=
 				~DEV_TX_OFFLOAD_MBUF_FAST_FREE;
-		if (!(port->dev_info.tx_offload_capa &
-			DEV_TX_OFFLOAD_MATCH_METADATA))
-			port->dev_conf.txmode.offloads &=
-				~DEV_TX_OFFLOAD_MATCH_METADATA;
 		if (numa_support) {
 			if (port_numa[pid] != NUMA_NO_CONFIG)
 				port_per_socket[port_numa[pid]]++;
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index ec10a1a..419997f 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -193,7 +193,7 @@ struct rte_port {
 	struct softnic_port     softport;  /**< softnic params */
 #endif
 	/**< metadata value to insert in Tx packets. */
-	rte_be32_t		tx_metadata;
+	uint32_t		tx_metadata;
 	const struct rte_eth_rxtx_callback *tx_set_md_cb[MAX_QUEUE_ID+1];
 };
 
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 56075b3..cf41864 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,8 +82,9 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
-		if (ol_flags & PKT_TX_METADATA)
-			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_TX_DYNF_METADATA)
+			printf(" - Tx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (ol_flags & PKT_RX_DYNF_METADATA)
 			printf(" - Rx metadata: 0x%x",
 			       *RTE_FLOW_DYNF_METADATA(mb));
@@ -188,10 +189,12 @@
 	 * Add metadata value to every Tx packet,
 	 * and set ol_flags accordingly.
 	 */
-	for (i = 0; i < nb_pkts; i++) {
-		pkts[i]->tx_metadata = ports[port_id].tx_metadata;
-		pkts[i]->ol_flags |= PKT_TX_METADATA;
-	}
+	if (rte_flow_dynf_metadata_avail())
+		for (i = 0; i < nb_pkts; i++) {
+			*RTE_FLOW_DYNF_METADATA(pkts[i]) =
+						ports[port_id].tx_metadata;
+			pkts[i]->ol_flags |= PKT_TX_DYNF_METADATA;
+		}
 	return nb_pkts;
 }
 
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 854bc26..61ecffc 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1669,7 +1669,6 @@ struct flag_name {
 		VAL_NAME(PKT_TX_SEC_OFFLOAD),
 		VAL_NAME(PKT_TX_UDP_SEG),
 		VAL_NAME(PKT_TX_OUTER_UDP_CKSUM),
-		VAL_NAME(PKT_TX_METADATA),
 	};
 
 	/* Test case to check with valid flag */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 1f72cc7..ac0020e 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -692,7 +692,7 @@ Item: ``META``
 Matches 32 bit metadata item set.
 
 On egress, metadata can be set either by mbuf metadata field with
-PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+PKT_TX_DYNF_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
 action sets metadata for a packet and the metadata will be reported via
 ``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
 
@@ -2532,8 +2532,8 @@ Action: ``SET_META``
 
 Set metadata. Item ``META`` matches metadata.
 
-Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
-overridden by this action. On ingress, the metadata will be carried by
+Metadata set by mbuf metadata field with PKT_TX_DYNF_METADATA flag on egress
+will be overridden by this action. On ingress, the metadata will be carried by
 ``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
 ``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
 with the data.
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index f7f2ddb..dd50710 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -365,6 +365,11 @@ API Changes
   is the minor compatibility issue for applications in case of 32-bit values
   supported.
 
+* metadata: the tx_metadata mbuf field is moved to a dynamic one.
+  PKT_TX_METADATA flag is replaced with PKT_TX_DYNF_METADATA.
+  DEV_TX_OFFLOAD_MATCH_METADATA offload flag is removed, now metadata
+  support in PMD is engaged on dynamic field registration.
+
 * sched: The pipe nodes configuration parameters such as number of pipes,
   pipe queue sizes, pipe profiles, etc., are moved from port level structure
   to subport level. This allows different subports of the same port to
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index d9a7fd4..f961bff 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -793,7 +793,7 @@ struct field_modify_info modify_tcp[] = {
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-flow_dv_validate_item_meta(struct rte_eth_dev *dev,
+flow_dv_validate_item_meta(struct rte_eth_dev *dev __rte_unused,
 			   const struct rte_flow_item *item,
 			   const struct rte_flow_attr *attr,
 			   struct rte_flow_error *error)
@@ -801,17 +801,10 @@ struct field_modify_info modify_tcp[] = {
 	const struct rte_flow_item_meta *spec = item->spec;
 	const struct rte_flow_item_meta *mask = item->mask;
 	const struct rte_flow_item_meta nic_mask = {
-		.data = RTE_BE32(UINT32_MAX)
+		.data = UINT32_MAX
 	};
 	int ret;
-	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
 
-	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
-		return rte_flow_error_set(error, EPERM,
-					  RTE_FLOW_ERROR_TYPE_ITEM,
-					  NULL,
-					  "match on metadata offload "
-					  "configuration is off for this port");
 	if (!spec)
 		return rte_flow_error_set(error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
@@ -4750,10 +4743,10 @@ struct field_modify_info modify_tcp[] = {
 		meta_m = &rte_flow_item_meta_mask;
 	meta_v = (const void *)item->spec;
 	if (meta_v) {
-		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
-			 rte_be_to_cpu_32(meta_m->data));
-		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
-			 rte_be_to_cpu_32(meta_v->data & meta_m->data));
+		MLX5_SET(fte_match_set_misc2, misc2_m,
+			 metadata_reg_a, meta_m->data);
+		MLX5_SET(fte_match_set_misc2, misc2_v,
+			 metadata_reg_a, meta_v->data & meta_m->data);
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index f597c89..88a4378 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -2281,8 +2281,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	/* Engage VLAN tag insertion feature if requested. */
 	if (MLX5_TXOFF_CONFIG(VLAN) &&
 	    loc->mbuf->ol_flags & PKT_TX_VLAN_PKT) {
@@ -2341,8 +2341,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -2434,8 +2434,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -2628,8 +2628,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -3700,8 +3700,8 @@ enum mlx5_txcmp_code {
 		return false;
 	/* Fill metadata field if needed. */
 	if (MLX5_TXOFF_CONFIG(METADATA) &&
-		es->metadata != (loc->mbuf->ol_flags & PKT_TX_METADATA ?
-				 loc->mbuf->tx_metadata : 0))
+		es->metadata != (loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+				 *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0))
 		return false;
 	/* There must be no VLAN packets in eMPW loop. */
 	if (MLX5_TXOFF_CONFIG(VLAN))
@@ -5149,7 +5149,7 @@ enum mlx5_txcmp_code {
 		 */
 		olx |= MLX5_TXOFF_CONFIG_EMPW;
 	}
-	if (tx_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
+	if (rte_flow_dynf_metadata_avail()) {
 		/* We should support Flow metadata. */
 		olx |= MLX5_TXOFF_CONFIG_METADATA;
 	}
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index b54ff72..85e0bd5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -19,12 +19,6 @@
 	 DEV_TX_OFFLOAD_TCP_CKSUM | \
 	 DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM)
 
-/* HW offload capabilities of vectorized Tx. */
-#define MLX5_VEC_TX_OFFLOAD_CAP \
-	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
-	 DEV_TX_OFFLOAD_MATCH_METADATA | \
-	 DEV_TX_OFFLOAD_MULTI_SEGS)
-
 /*
  * Compile time sanity check for vectorized functions.
  */
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index dfc379c..97991f0 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -128,10 +128,6 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
-#ifdef HAVE_IBV_FLOW_DV_SUPPORT
-	if (config->dv_flow_en)
-		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
-#endif
 	return offloads;
 }
 
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 85ab5f0..ebc62d0 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -161,7 +161,6 @@ struct rte_eth_xstats_name_off {
 	RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
-	RTE_TX_OFFLOAD_BIT2STR(MATCH_METADATA),
 };
 
 #undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index e6ef4b4..81d8908 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1145,11 +1145,6 @@ struct rte_eth_conf {
 #define DEV_TX_OFFLOAD_IP_TNL_TSO       0x00080000
 /** Device supports outer UDP checksum */
 #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
-/**
- * Device supports match on metadata Tx offload..
- * Application must set PKT_TX_METADATA and mbuf metadata field.
- */
-#define DEV_TX_OFFLOAD_MATCH_METADATA   0x00200000
 
 #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
 /**< Device supports Rx queue setup after device started*/
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 934c3e1..452d359 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -1277,12 +1277,12 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 /**
  * RTE_FLOW_ITEM_TYPE_META
  *
- * Matches a specified metadata value. On egress, metadata can be set either by
- * mbuf tx_metadata field with PKT_TX_METADATA flag or
- * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
- * metadata for a packet and the metadata will be reported via mbuf metadata
- * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
- * registered in advance by rte_flow_dynf_metadata_register().
+ * Matches a specified metadata value. On egress, metadata can be set
+ * either by mbuf dynamic metadata field with PKT_TX_DYNF_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META
+ * sets metadata for a packet and the metadata will be reported via mbuf
+ * metadata dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
+ * field must be registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
 	uint32_t data;
@@ -2512,8 +2512,8 @@ struct rte_flow_action_set_tag {
  *
  * RTE_FLOW_ACTION_TYPE_SET_META
  *
- * Set metadata. Metadata set by mbuf tx_metadata field with
- * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * Set metadata. Metadata set by mbuf metadata dynamic field with
+ * PKT_TX_DYNF_METADATA flag on egress will be overridden by this action. On
  * ingress, the metadata will be carried by mbuf metadata dynamic field
  * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
  * registered in advance by rte_flow_dynf_metadata_register().
@@ -2540,8 +2540,9 @@ struct rte_flow_action_set_meta {
 #define RTE_FLOW_DYNF_METADATA(m) \
 	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
 
-/* Mbuf dynamic flag for metadata. */
+/* Mbuf dynamic flags for metadata. */
 #define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+#define PKT_TX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
 
 __rte_experimental
 static inline uint32_t
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8c51dc1..35df1c4 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -670,7 +670,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
 	case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
 	case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
 	case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
-	case PKT_TX_METADATA: return "PKT_TX_METADATA";
 	default: return NULL;
 	}
 }
@@ -707,7 +706,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
 		{ PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
 		{ PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
 		{ PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
-		{ PKT_TX_METADATA, PKT_TX_METADATA, NULL },
 	};
 	const char *name;
 	unsigned int i;
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 3022701..9a8557d 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -187,16 +187,11 @@
 /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
 
 #define PKT_FIRST_FREE (1ULL << 23)
-#define PKT_LAST_FREE (1ULL << 39)
+#define PKT_LAST_FREE (1ULL << 40)
 
 /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
 
 /**
- * Indicate that the metadata field in the mbuf is in use.
- */
-#define PKT_TX_METADATA	(1ULL << 40)
-
-/**
  * Outer UDP checksum offload flag. This flag is used for enabling
  * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
  * 1) Enable the following in mbuf,
@@ -389,8 +384,7 @@
 		PKT_TX_MACSEC |		 \
 		PKT_TX_SEC_OFFLOAD |	 \
 		PKT_TX_UDP_SEG |	 \
-		PKT_TX_OUTER_UDP_CKSUM | \
-		PKT_TX_METADATA)
+		PKT_TX_OUTER_UDP_CKSUM)
 
 /**
  * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
@@ -601,17 +595,6 @@ struct rte_mbuf {
 			/**< User defined tags. See rte_distributor_process() */
 			uint32_t usr;
 		} hash;                   /**< hash information */
-		struct {
-			/**
-			 * Application specific metadata value
-			 * for egress flow rule match.
-			 * Valid if PKT_TX_METADATA is set.
-			 * Located here to allow conjunct use
-			 * with hash.sched.hi.
-			 */
-			uint32_t tx_metadata;
-			uint32_t reserved;
-		};
 	};
 
 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature
  2019-11-04  6:13                         ` [dpdk-dev] [PATCH v9 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
@ 2019-11-05 14:19                           ` Viacheslav Ovsiienko
  2019-11-05 14:19                             ` [dpdk-dev] [PATCH v10 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
                                               ` (2 more replies)
  0 siblings, 3 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-11-05 14:19 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika

This patchset combines two metadata-related patches so that they
are applied in the right order. The first patch introduces ingress
metadata based on an mbuf dynamic field; the second one moves
egress metadata to the dynamic field introduced by the first patch.

Viacheslav Ovsiienko (2):
  ethdev: extend flow metadata
  ethdev: move egress metadata to dynamic field

 app/test-pmd/cmdline.c                   |   3 +-
 app/test-pmd/cmdline_flow.c              |  65 +++++++++++++++++--
 app/test-pmd/testpmd.c                   |   4 --
 app/test-pmd/testpmd.h                   |   2 +-
 app/test-pmd/util.c                      |  16 +++--
 app/test/test_mbuf.c                     |   1 -
 doc/guides/prog_guide/rte_flow.rst       |  76 +++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   |  17 +++++
 drivers/net/mlx5/mlx5_flow_dv.c          |  19 ++----
 drivers/net/mlx5/mlx5_rxtx.c             |  22 +++----
 drivers/net/mlx5/mlx5_rxtx_vec.h         |   6 --
 drivers/net/mlx5/mlx5_txq.c              |   4 --
 lib/librte_ethdev/rte_ethdev.c           |   1 -
 lib/librte_ethdev/rte_ethdev.h           |   5 --
 lib/librte_ethdev/rte_ethdev_version.map |   3 +
 lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
 lib/librte_ethdev/rte_flow.h             | 104 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf.c               |   2 -
 lib/librte_mbuf/rte_mbuf_core.h          |  21 +------
 lib/librte_mbuf/rte_mbuf_dyn.h           |  16 ++++-
 20 files changed, 327 insertions(+), 100 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [dpdk-dev] [PATCH v10 1/2] ethdev: extend flow metadata
  2019-11-05 14:19                           ` [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature Viacheslav Ovsiienko
@ 2019-11-05 14:19                             ` Viacheslav Ovsiienko
  2019-11-05 14:19                             ` [dpdk-dev] [PATCH v10 2/2] ethdev: move egress metadata to dynamic field Viacheslav Ovsiienko
  2019-11-06 15:49                             ` [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature Ferruh Yigit
  2 siblings, 0 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-11-05 14:19 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika, Yongseok Koh

Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There is also a need to support metadata on ingress. Metadata can be set
by the SET_META action and matched by the META item, as on Tx. The final
value set by the action will be delivered to the application via the
metadata dynamic field of the mbuf, which can be accessed by the
RTE_FLOW_DYNF_METADATA() macro or with the rte_flow_dynf_metadata_set()
and rte_flow_dynf_metadata_get() helper routines. The PKT_RX_DYNF_METADATA
flag will be set along with the data.
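
The SET_META configuration added by this patch (struct
rte_flow_action_set_meta) carries a data and a mask member, and its
comment says that altering partial bits is supported with the mask. A
minimal sketch of one plausible reading follows, assuming a plain
read-modify-write where only the masked bits are replaced. The sketch_*
names are hypothetical and the real behaviour is defined by the PMD and
the hardware, in particular for bits that were never set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same two members as struct rte_flow_action_set_meta in the patch. */
struct sketch_set_meta {
	uint32_t data;
	uint32_t mask;
};

/*
 * Assumed semantics: bits selected by the mask are taken from
 * conf->data, all other bits keep their current value.
 */
static uint32_t
sketch_apply_set_meta(uint32_t current, const struct sketch_set_meta *conf)
{
	assert(conf != NULL);
	return (current & ~conf->mask) | (conf->data & conf->mask);
}
```

With a mask of UINT32_MAX the action simply overwrites the whole value;
with a narrower mask, two rules in different tables could each own a
separate part of the 32-bit metadata under this reading.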

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to using the SET_META action.

The availability of the dynamic mbuf metadata field can be checked
with the rte_flow_dynf_metadata_avail() routine.

If the application is going to engage the metadata feature, it registers
the metadata dynamic fields; the PMD then checks the metadata field
availability and handles the appropriate fields in the datapath.

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.

MARK and METADATA look similar and might operate in a similar way,
but they do not interact.

Initially, two metadata related actions were proposed:

- RTE_FLOW_ACTION_TYPE_FLAG
- RTE_FLOW_ACTION_TYPE_MARK

These actions set a special flag in the packet metadata, and the MARK
action also stores a specified value in the metadata storage. On packet
reception, the PMD puts the flag and value into the mbuf, so applications
can see that the packet was treated inside the flow engine according to
the appropriate RTE flow(s). MARK and FLAG act as a kind of gateway to
transfer per-packet information from the flow engine to the application
via the receiving datapath. There is also an item of type
RTE_FLOW_ITEM_TYPE_MARK. It allows us to extend the flow match pattern
with the capability to match the metadata values set by MARK/FLAG actions
on other flows.

From the datapath point of view, MARK and FLAG are related to the
receiving side only. It would be useful to have the same gateway on the
transmitting side, so the item of type RTE_FLOW_ITEM_TYPE_META was
proposed. The application can fill the field in the mbuf and this value
will be transferred to some field in the packet metadata inside the flow
engine. It did not matter whether these metadata fields were shared,
because the MARK and META items belonged to different domains (receiving
and transmitting) and could be vendor-specific.

So far, so good: DPDK offers entities to control metadata inside the
flow engine and gateways to exchange these values on a per-packet basis
via the datapaths.

As we can see, the MARK and META means are not symmetric: there is no
action that would allow us to set the META value on the transmitting path.
So, the action of type:

- RTE_FLOW_ACTION_TYPE_SET_META was proposed.

Next, applications raise new requirements for packet metadata.
The flow engines are getting more complex, internal switches are
introduced, and multiple ports might be supported within the same flow
engine namespace. From the DPDK point of view, this means that packets
might be sent on one eth_dev port and received on another one, while the
packet path inside the flow engine entirely belongs to the same hardware
device. The simplest example is SR-IOV with PF, VFs and the representors.
This is a good opportunity to provide an out-of-band channel to transfer
some extra data from one port to another, besides the packet data itself,
and applications would like to use this opportunity.

The application is expected to use trials (with rte_flow_validate) to
detect which metadata features (FLAG, MARK, META) are actually supported
by the PMD and the underlying hardware. This might depend on PMD
configuration, system software, hardware settings, etc., and should be
detected at run time.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
v10: - fix testpmd META and TAG items, SET_TAG, SET_META actions
       endianness set from interactive command interface. All metadata
       related items and actions now are presented with host endianness

v9:  - http://patches.dpdk.org/patch/62354/
     - rebased

v8:  - http://patches.dpdk.org/patch/62288/
     - add flow metadata comment to rte_mbuf_dyn.h

v7:  - http://patches.dpdk.org/patch/62278/
     - updated release notes in collateral patch

v6:  - http://patches.dpdk.org/patch/62245/
     - minor code style issues
     - is combined in series with the following egress metadata patch

v5: - http://patches.dpdk.org/patch/62179/
    - addressed code style issues from comments
    - Tx metadata deprecation notice removed
      (dedicated tx_metadata patch is coming)
    - MBUF_DYNF_METADATA_NAME is split into FIELD and FLAG
      dedicated ones, RTE suffix is added
    - metadata historic retrospective is added to log message
    - rebased

v4: - http://patches.dpdk.org/patch/62065/
    - documentation comments addressed
    - deprecation notice for Tx metadata offload flag
    - rebased

v3: - http://patches.dpdk.org/patch/61902/
    - rebased, neat updates

v2: - http://patches.dpdk.org/patch/60909/

v1: - http://patches.dpdk.org/patch/56104/
    - rfc: http://patches.dpdk.org/patch/54271/

   
 app/test-pmd/cmdline_flow.c              |  65 +++++++++++++++++--
 app/test-pmd/util.c                      |   5 ++
 doc/guides/prog_guide/rte_flow.rst       |  76 ++++++++++++++++++-----
 doc/guides/rel_notes/release_19_11.rst   |  12 ++++
 lib/librte_ethdev/rte_ethdev_version.map |   3 +
 lib/librte_ethdev/rte_flow.c             |  40 ++++++++++++
 lib/librte_ethdev/rte_flow.h             | 103 +++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_dyn.h           |  16 ++++-
 8 files changed, 292 insertions(+), 28 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 55be271..085182b 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -323,6 +323,9 @@ enum index {
 	ACTION_SET_TAG_DATA,
 	ACTION_SET_TAG_INDEX,
 	ACTION_SET_TAG_MASK,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -1083,6 +1086,7 @@ struct parse_action_priv {
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
 	ACTION_SET_TAG,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1289,6 +1293,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -1353,6 +1364,10 @@ static int parse_vc_action_raw_encap_index(struct context *,
 static int parse_vc_action_raw_decap_index(struct context *,
 					   const struct token *, const char *,
 					   unsigned int, void *, unsigned int);
+static int parse_vc_action_set_meta(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
+				    unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -2469,8 +2484,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.name = "data",
 		.help = "metadata value",
 		.next = NEXT(item_meta, NEXT_ENTRY(UNSIGNED), item_param),
-		.args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_meta,
-						  data, "\xff\xff\xff\xff")),
+		.args = ARGS(ARGS_ENTRY_MASK(struct rte_flow_item_meta,
+					     data, "\xff\xff\xff\xff")),
 	},
 	[ITEM_GRE_KEY] = {
 		.name = "gre_key",
@@ -2569,7 +2584,7 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.name = "data",
 		.help = "tag value to match",
 		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED), item_param),
-		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tag, data)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_tag, data)),
 	},
 	[ITEM_TAG_INDEX] = {
 		.name = "index",
@@ -3442,7 +3457,7 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.name = "data",
 		.help = "tag value",
 		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
-		.args = ARGS(ARGS_ENTRY_HTON
+		.args = ARGS(ARGS_ENTRY
 			     (struct rte_flow_action_set_tag, data)),
 		.call = parse_vc_conf,
 	},
@@ -3450,10 +3465,34 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.name = "mask",
 		.help = "mask for tag value",
 		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
-		.args = ARGS(ARGS_ENTRY_HTON
+		.args = ARGS(ARGS_ENTRY
 			     (struct rte_flow_action_set_tag, mask)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc_action_set_meta,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -4893,6 +4932,22 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return ret;
 }
 
+static int
+parse_vc_action_set_meta(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	ret = rte_flow_dynf_metadata_register();
+	if (ret < 0)
+		return -1;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index f20531d..56075b3 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,6 +82,11 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
+		if (ol_flags & PKT_TX_METADATA)
+			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_RX_DYNF_METADATA)
+			printf(" - Rx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (mb->packet_type) {
 			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
 			printf(" - hw ptype: %s", buf);
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index b2b34d8..1f72cc7 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -686,8 +686,34 @@ Matches tag item set by other flows. Multiple tags are supported by specifying
    |          | ``index`` | field is ignored                      |
    +----------+-----------+---------------------------------------+
 
-ata matching item types
-~~~~~~~~~~~~~~~~~~~~~~~
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
+Data matching item types
+~~~~~~~~~~~~~~~~~~~~~~~~
 
 Most of these are basically protocol header definitions with associated
 bit-masks. They must be specified (stacked) from lowest to highest protocol
@@ -1260,21 +1286,6 @@ Matches a PPPoE session protocol identifier.
 - ``proto_id``: PPP protocol identifier.
 - Default ``mask`` matches proto_id only.
 
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+--------------------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Item: ``NSH``
 ^^^^^^^^^^^^^
 
@@ -2516,6 +2527,37 @@ application. Multiple tags are supported by specifying index.
    | ``index`` | index of tag to set        |
    +-----------+----------------------------+
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by
+``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
+``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
+with the data.
+
+The mbuf dynamic field must be registered by calling
+``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index f96ac38..f7f2ddb 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -241,6 +241,13 @@ New Features
   * Added a console command to testpmd app, ``show port (port_id) ptypes`` which
     gives ability to print port supported ptypes in different protocol layers.
 
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
+    PKT_RX_DYNF_METADATA.
 
 Removed Items
 -------------
@@ -353,6 +360,11 @@ API Changes
   has been introduced in this release is used when used when all the packets
   enqueued in the tx adapter are destined for the same Ethernet port & Tx queue.
 
+* metadata: RTE_FLOW_ITEM_TYPE_META data endianness altered to host one.
+  Due to the new dynamic metadata field in mbuf is host-endian either, there
+  is the minor compatibility issue for applications in case of 32-bit values
+  supported.
+
 * sched: The pipe nodes configuration parameters such as number of pipes,
   pipe queue sizes, pipe profiles, etc., are moved from port level structure
   to subport level. This allows different subports of the same port to
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index c9104fd..56ba848 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -290,4 +290,7 @@ EXPERIMENTAL {
 	rte_eth_rx_hairpin_queue_setup;
 	rte_eth_tx_burst_mode_get;
 	rte_eth_tx_hairpin_queue_setup;
+	rte_flow_dynf_metadata_offs;
+	rte_flow_dynf_metadata_mask;
+	rte_flow_dynf_metadata_register;
 };
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 2f86d1a..33e3011 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -12,10 +12,18 @@
 #include <rte_errno.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include "rte_ethdev.h"
 #include "rte_flow_driver.h"
 #include "rte_flow.h"
 
+/* Mbuf dynamic field offset for metadata. */
+int rte_flow_dynf_metadata_offs = -1;
+
+/* Mbuf dynamic field flag mask for metadata. */
+uint64_t rte_flow_dynf_metadata_mask;
+
 /**
  * Flow elements description tables.
  */
@@ -159,8 +167,40 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(SET_TAG, sizeof(struct rte_flow_action_set_tag)),
+	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
 };
 
+int
+rte_flow_dynf_metadata_register(void)
+{
+	int offset;
+	int flag;
+
+	static const struct rte_mbuf_dynfield desc_offs = {
+		.name = RTE_MBUF_DYNFIELD_METADATA_NAME,
+		.size = sizeof(uint32_t),
+		.align = __alignof__(uint32_t),
+	};
+	static const struct rte_mbuf_dynflag desc_flag = {
+		.name = RTE_MBUF_DYNFLAG_METADATA_NAME,
+	};
+
+	offset = rte_mbuf_dynfield_register(&desc_offs);
+	if (offset < 0)
+		goto error;
+	flag = rte_mbuf_dynflag_register(&desc_flag);
+	if (flag < 0)
+		goto error;
+	rte_flow_dynf_metadata_offs = offset;
+	rte_flow_dynf_metadata_mask = (1ULL << flag);
+	return 0;
+
+error:
+	rte_flow_dynf_metadata_offs = -1;
+	rte_flow_dynf_metadata_mask = 0ULL;
+	return -rte_errno;
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index ed398ac..934c3e1 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -28,6 +28,8 @@
 #include <rte_byteorder.h>
 #include <rte_esp.h>
 #include <rte_higig.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -418,7 +420,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1272,18 +1275,23 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * RTE_FLOW_ITEM_TYPE_META
  *
- * Matches a specified metadata value.
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
-	rte_be32_t data;
+	uint32_t data;
 };
 
 /** Default mask for RTE_FLOW_ITEM_TYPE_META. */
 #ifndef __cplusplus
 static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
-	.data = RTE_BE32(UINT32_MAX),
+	.data = UINT32_MAX,
 };
 #endif
 
@@ -1989,6 +1997,13 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_tag.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_TAG,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2491,6 +2506,57 @@ struct rte_flow_action_set_tag {
 	uint8_t index;
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * ingress, the metadata will be carried by mbuf metadata dynamic field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which have never
+ * been set, unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
+ * or may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	uint32_t data;
+	uint32_t mask;
+};
+
+/* Mbuf dynamic field offset for metadata. */
+extern int rte_flow_dynf_metadata_offs;
+
+/* Mbuf dynamic field flag mask for metadata. */
+extern uint64_t rte_flow_dynf_metadata_mask;
+
+/* Mbuf dynamic field pointer for metadata. */
+#define RTE_FLOW_DYNF_METADATA(m) \
+	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
+
+/* Mbuf dynamic flag for metadata. */
+#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+
+__rte_experimental
+static inline uint32_t
+rte_flow_dynf_metadata_get(struct rte_mbuf *m)
+{
+	return *RTE_FLOW_DYNF_METADATA(m);
+}
+
+__rte_experimental
+static inline void
+rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
+{
+	*RTE_FLOW_DYNF_METADATA(m) = v;
+}
+
 /*
  * Definition of a single action.
  *
@@ -2724,6 +2790,33 @@ enum rte_flow_conv_op {
 };
 
 /**
+ * Check if mbuf dynamic field for metadata is registered.
+ *
+ * @return
+ *   True if registered, false otherwise.
+ */
+__rte_experimental
+static inline int
+rte_flow_dynf_metadata_avail(void)
+{
+	return !!rte_flow_dynf_metadata_mask;
+}
+
+/**
+ * Register mbuf dynamic field and flag for metadata.
+ *
+ * This function must be called prior to use SET_META action in order to
+ * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
+ * application.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_dynf_metadata_register(void);
+
+/**
  * Check whether a flow rule can be created on a given port.
  *
  * The flow rule is validated for correctness and whether it could be accepted
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 2e9d418..96c3631 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -234,6 +234,20 @@ int rte_mbuf_dynflag_lookup(const char *name,
 __rte_experimental
 void rte_mbuf_dyn_dump(FILE *out);
 
-/* Placeholder for dynamic fields and flags declarations. */
+/*
+ * Placeholder for dynamic fields and flags declarations.
+ * This is centralizing point to gather all field names
+ * and parameters together.
+ */
+
+/*
+ * The metadata dynamic field provides some extra packet information
+ * to interact with RTE Flow engine. The metadata in sent mbufs can be
+ * used to match on some Flows. The metadata in received mbufs can
+ * provide some feedback from the Flows. The metadata flag tells
+ * whether the field contains actual value to send, or received one.
+ */
+#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
+#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
 
 #endif
-- 
1.8.3.1



* [dpdk-dev] [PATCH v10 2/2] ethdev: move egress metadata to dynamic field
  2019-11-05 14:19                           ` [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2019-11-05 14:19                             ` [dpdk-dev] [PATCH v10 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
@ 2019-11-05 14:19                             ` Viacheslav Ovsiienko
  2019-11-06 15:49                             ` [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature Ferruh Yigit
  2 siblings, 0 replies; 98+ messages in thread
From: Viacheslav Ovsiienko @ 2019-11-05 14:19 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika

The dynamic mbuf fields were introduced by [1]. The egress metadata is a
good candidate to be moved from the statically allocated field tx_metadata
to a dynamic one. Because mbufs are used in half-duplex fashion only, it is
safe to share this dynamic field with ingress metadata.

The shared dynamic field contains either egress (if the application is going
to transmit the mbuf with tx_burst) or ingress (if the mbuf is received with
rx_burst) metadata and can be accessed by the RTE_FLOW_DYNF_METADATA() macro
or with the rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get()
helper routines. The PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be
set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to accessing the data.

The availability of dynamic mbuf metadata field can be checked with
rte_flow_dynf_metadata_avail() routine.
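
The register, check, set and get sequence described above can be sketched
with a small self-contained model. This is not DPDK code: struct
model_mbuf, the model_md_*() helpers, model_tx_prepare(), the fixed offset
and the flag bit are all assumptions made for illustration, and a single
flag bit stands in for both the Tx and Rx flags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct rte_mbuf: only the parts needed here. */
struct model_mbuf {
	uint64_t ol_flags;
	uint8_t dynfield[16]; /* area from which dynamic fields are carved */
};

static int model_md_offs = -1; /* plays the role of the field offset */
static uint64_t model_md_mask; /* plays the role of the flag mask */

/* Model registration: a fixed offset and flag bit are assumed. */
static inline int
model_md_register(void)
{
	model_md_offs = 4;
	model_md_mask = UINT64_C(1) << 40;
	return 0;
}

/* Non-zero once the field and flag have been registered. */
static inline int
model_md_avail(void)
{
	return !!model_md_mask;
}

/* Store the value only; the caller sets the flag separately. */
static inline void
model_md_set(struct model_mbuf *m, uint32_t v)
{
	assert(model_md_avail());
	memcpy(&m->dynfield[model_md_offs], &v, sizeof(v));
}

/* Read the value back; fails if unregistered or the flag is not set. */
static inline int
model_md_get(const struct model_mbuf *m, uint32_t *v)
{
	if (!model_md_avail() || !(m->ol_flags & model_md_mask))
		return -1;
	memcpy(v, &m->dynfield[model_md_offs], sizeof(*v));
	return 0;
}

/* Attach metadata before transmit: value first, then the flag. */
static inline int
model_tx_prepare(struct model_mbuf *m, uint32_t value)
{
	if (!model_md_avail())
		return -1;
	model_md_set(m, value);
	m->ol_flags |= model_md_mask;
	return 0;
}
```

The point of the model is the ordering: nothing touches the field until
registration has succeeded, and a reader trusts the value only when the
flag is set in ol_flags.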

DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed.
The metadata support in PMDs is engaged on dynamic field registration.

The metadata feature is getting complex. There may be a set of actions
and items that PMDs support in multiple combinations; the supported
values and masks should be queried by performing trials (with
rte_flow_validate).

[1] http://patches.dpdk.org/patch/62040/

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 app/test-pmd/cmdline.c                 |  3 ++-
 app/test-pmd/testpmd.c                 |  4 ----
 app/test-pmd/testpmd.h                 |  2 +-
 app/test-pmd/util.c                    | 15 +++++++++------
 app/test/test_mbuf.c                   |  1 -
 doc/guides/prog_guide/rte_flow.rst     |  6 +++---
 doc/guides/rel_notes/release_19_11.rst |  5 +++++
 drivers/net/mlx5/mlx5_flow_dv.c        | 19 ++++++-------------
 drivers/net/mlx5/mlx5_rxtx.c           | 22 +++++++++++-----------
 drivers/net/mlx5/mlx5_rxtx_vec.h       |  6 ------
 drivers/net/mlx5/mlx5_txq.c            |  4 ----
 lib/librte_ethdev/rte_ethdev.c         |  1 -
 lib/librte_ethdev/rte_ethdev.h         |  5 -----
 lib/librte_ethdev/rte_flow.h           | 19 ++++++++++---------
 lib/librte_mbuf/rte_mbuf.c             |  2 --
 lib/librte_mbuf/rte_mbuf_core.h        | 21 ++-------------------
 16 files changed, 49 insertions(+), 86 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4478069..49c45a3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -18718,12 +18718,13 @@ struct cmd_config_tx_metadata_specific_result {
 
 	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	ports[res->port_id].tx_metadata = rte_cpu_to_be_32(res->value);
+	ports[res->port_id].tx_metadata = res->value;
 	/* Add/remove callback to insert valid metadata in every Tx packet. */
 	if (ports[res->port_id].tx_metadata)
 		add_tx_md_callback(res->port_id);
 	else
 		remove_tx_md_callback(res->port_id);
+	rte_flow_dynf_metadata_register();
 }
 
 cmdline_parse_token_string_t cmd_config_tx_metadata_specific_port =
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 38acbc5..5ba9741 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1119,10 +1119,6 @@ struct extmem_param {
 		      DEV_TX_OFFLOAD_MBUF_FAST_FREE))
 			port->dev_conf.txmode.offloads &=
 				~DEV_TX_OFFLOAD_MBUF_FAST_FREE;
-		if (!(port->dev_info.tx_offload_capa &
-			DEV_TX_OFFLOAD_MATCH_METADATA))
-			port->dev_conf.txmode.offloads &=
-				~DEV_TX_OFFLOAD_MATCH_METADATA;
 		if (numa_support) {
 			if (port_numa[pid] != NUMA_NO_CONFIG)
 				port_per_socket[port_numa[pid]]++;
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index ec10a1a..419997f 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -193,7 +193,7 @@ struct rte_port {
 	struct softnic_port     softport;  /**< softnic params */
 #endif
 	/**< metadata value to insert in Tx packets. */
-	rte_be32_t		tx_metadata;
+	uint32_t		tx_metadata;
 	const struct rte_eth_rxtx_callback *tx_set_md_cb[MAX_QUEUE_ID+1];
 };
 
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 56075b3..cf41864 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -82,8 +82,9 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
-		if (ol_flags & PKT_TX_METADATA)
-			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_TX_DYNF_METADATA)
+			printf(" - Tx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (ol_flags & PKT_RX_DYNF_METADATA)
 			printf(" - Rx metadata: 0x%x",
 			       *RTE_FLOW_DYNF_METADATA(mb));
@@ -188,10 +189,12 @@
 	 * Add metadata value to every Tx packet,
 	 * and set ol_flags accordingly.
 	 */
-	for (i = 0; i < nb_pkts; i++) {
-		pkts[i]->tx_metadata = ports[port_id].tx_metadata;
-		pkts[i]->ol_flags |= PKT_TX_METADATA;
-	}
+	if (rte_flow_dynf_metadata_avail())
+		for (i = 0; i < nb_pkts; i++) {
+			*RTE_FLOW_DYNF_METADATA(pkts[i]) =
+						ports[port_id].tx_metadata;
+			pkts[i]->ol_flags |= PKT_TX_DYNF_METADATA;
+		}
 	return nb_pkts;
 }
 
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 854bc26..61ecffc 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1669,7 +1669,6 @@ struct flag_name {
 		VAL_NAME(PKT_TX_SEC_OFFLOAD),
 		VAL_NAME(PKT_TX_UDP_SEG),
 		VAL_NAME(PKT_TX_OUTER_UDP_CKSUM),
-		VAL_NAME(PKT_TX_METADATA),
 	};
 
 	/* Test case to check with valid flag */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 1f72cc7..ac0020e 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -692,7 +692,7 @@ Item: ``META``
 Matches 32 bit metadata item set.
 
 On egress, metadata can be set either by mbuf metadata field with
-PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+PKT_TX_DYNF_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
 action sets metadata for a packet and the metadata will be reported via
 ``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
 
@@ -2532,8 +2532,8 @@ Action: ``SET_META``
 
 Set metadata. Item ``META`` matches metadata.
 
-Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
-overridden by this action. On ingress, the metadata will be carried by
+Metadata set by mbuf metadata field with PKT_TX_DYNF_METADATA flag on egress
+will be overridden by this action. On ingress, the metadata will be carried by
 ``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
 ``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
 with the data.
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index f7f2ddb..dd50710 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -365,6 +365,11 @@ API Changes
   is the minor compatibility issue for applications in case of 32-bit values
   supported.
 
+* metadata: the tx_metadata mbuf field is moved to a dynamic one.
+  PKT_TX_METADATA flag is replaced with PKT_TX_DYNF_METADATA.
+  DEV_TX_OFFLOAD_MATCH_METADATA offload flag is removed, now metadata
+  support in PMD is engaged on dynamic field registration.
+
 * sched: The pipe nodes configuration parameters such as number of pipes,
   pipe queue sizes, pipe profiles, etc., are moved from port level structure
   to subport level. This allows different subports of the same port to
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index b49175e..42c265f 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -820,7 +820,7 @@ struct field_modify_info modify_tcp[] = {
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-flow_dv_validate_item_meta(struct rte_eth_dev *dev,
+flow_dv_validate_item_meta(struct rte_eth_dev *dev __rte_unused,
 			   const struct rte_flow_item *item,
 			   const struct rte_flow_attr *attr,
 			   struct rte_flow_error *error)
@@ -828,17 +828,10 @@ struct field_modify_info modify_tcp[] = {
 	const struct rte_flow_item_meta *spec = item->spec;
 	const struct rte_flow_item_meta *mask = item->mask;
 	const struct rte_flow_item_meta nic_mask = {
-		.data = RTE_BE32(UINT32_MAX)
+		.data = UINT32_MAX
 	};
 	int ret;
-	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
 
-	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
-		return rte_flow_error_set(error, EPERM,
-					  RTE_FLOW_ERROR_TYPE_ITEM,
-					  NULL,
-					  "match on metadata offload "
-					  "configuration is off for this port");
 	if (!spec)
 		return rte_flow_error_set(error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
@@ -4816,10 +4809,10 @@ struct field_modify_info modify_tcp[] = {
 		meta_m = &rte_flow_item_meta_mask;
 	meta_v = (const void *)item->spec;
 	if (meta_v) {
-		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
-			 rte_be_to_cpu_32(meta_m->data));
-		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
-			 rte_be_to_cpu_32(meta_v->data & meta_m->data));
+		MLX5_SET(fte_match_set_misc2, misc2_m,
+			 metadata_reg_a, meta_m->data);
+		MLX5_SET(fte_match_set_misc2, misc2_v,
+			 metadata_reg_a, meta_v->data & meta_m->data);
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 7f99f22..887e283 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -2279,8 +2279,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	/* Engage VLAN tag insertion feature if requested. */
 	if (MLX5_TXOFF_CONFIG(VLAN) &&
 	    loc->mbuf->ol_flags & PKT_TX_VLAN_PKT) {
@@ -2339,8 +2339,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -2432,8 +2432,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -2626,8 +2626,8 @@ enum mlx5_txcmp_code {
 	es->swp_offs = txq_mbuf_to_swp(loc, &es->swp_flags, olx);
 	/* Fill metadata field if needed. */
 	es->metadata = MLX5_TXOFF_CONFIG(METADATA) ?
-		       loc->mbuf->ol_flags & PKT_TX_METADATA ?
-		       loc->mbuf->tx_metadata : 0 : 0;
+		       loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+		       *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0 : 0;
 	static_assert(MLX5_ESEG_MIN_INLINE_SIZE ==
 				(sizeof(uint16_t) +
 				 sizeof(rte_v128u32_t)),
@@ -3698,8 +3698,8 @@ enum mlx5_txcmp_code {
 		return false;
 	/* Fill metadata field if needed. */
 	if (MLX5_TXOFF_CONFIG(METADATA) &&
-		es->metadata != (loc->mbuf->ol_flags & PKT_TX_METADATA ?
-				 loc->mbuf->tx_metadata : 0))
+		es->metadata != (loc->mbuf->ol_flags & PKT_TX_DYNF_METADATA ?
+				 *RTE_FLOW_DYNF_METADATA(loc->mbuf) : 0))
 		return false;
 	/* There must be no VLAN packets in eMPW loop. */
 	if (MLX5_TXOFF_CONFIG(VLAN))
@@ -5147,7 +5147,7 @@ enum mlx5_txcmp_code {
 		 */
 		olx |= MLX5_TXOFF_CONFIG_EMPW;
 	}
-	if (tx_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
+	if (rte_flow_dynf_metadata_avail()) {
 		/* We should support Flow metadata. */
 		olx |= MLX5_TXOFF_CONFIG_METADATA;
 	}
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index b54ff72..85e0bd5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -19,12 +19,6 @@
 	 DEV_TX_OFFLOAD_TCP_CKSUM | \
 	 DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM)
 
-/* HW offload capabilities of vectorized Tx. */
-#define MLX5_VEC_TX_OFFLOAD_CAP \
-	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
-	 DEV_TX_OFFLOAD_MATCH_METADATA | \
-	 DEV_TX_OFFLOAD_MULTI_SEGS)
-
 /*
  * Compile time sanity check for vectorized functions.
  */
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index dfc379c..97991f0 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -128,10 +128,6 @@
 			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
 				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
 	}
-#ifdef HAVE_IBV_FLOW_DV_SUPPORT
-	if (config->dv_flow_en)
-		offloads |= DEV_TX_OFFLOAD_MATCH_METADATA;
-#endif
 	return offloads;
 }
 
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 85ab5f0..ebc62d0 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -161,7 +161,6 @@ struct rte_eth_xstats_name_off {
 	RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
-	RTE_TX_OFFLOAD_BIT2STR(MATCH_METADATA),
 };
 
 #undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index e6ef4b4..81d8908 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1145,11 +1145,6 @@ struct rte_eth_conf {
 #define DEV_TX_OFFLOAD_IP_TNL_TSO       0x00080000
 /** Device supports outer UDP checksum */
 #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
-/**
- * Device supports match on metadata Tx offload..
- * Application must set PKT_TX_METADATA and mbuf metadata field.
- */
-#define DEV_TX_OFFLOAD_MATCH_METADATA   0x00200000
 
 #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
 /**< Device supports Rx queue setup after device started*/
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 934c3e1..452d359 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -1277,12 +1277,12 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 /**
  * RTE_FLOW_ITEM_TYPE_META
  *
- * Matches a specified metadata value. On egress, metadata can be set either by
- * mbuf tx_metadata field with PKT_TX_METADATA flag or
- * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
- * metadata for a packet and the metadata will be reported via mbuf metadata
- * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
- * registered in advance by rte_flow_dynf_metadata_register().
+ * Matches a specified metadata value. On egress, metadata can be set
+ * either by mbuf dynamic metadata field with PKT_TX_DYNF_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META
+ * sets metadata for a packet and the metadata will be reported via mbuf
+ * metadata dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
+ * field must be registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
 	uint32_t data;
@@ -2512,8 +2512,8 @@ struct rte_flow_action_set_tag {
  *
  * RTE_FLOW_ACTION_TYPE_SET_META
  *
- * Set metadata. Metadata set by mbuf tx_metadata field with
- * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * Set metadata. Metadata set by mbuf metadata dynamic field with
+ * PKT_TX_DYNF_METADATA flag on egress will be overridden by this action. On
  * ingress, the metadata will be carried by mbuf metadata dynamic field
  * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
  * registered in advance by rte_flow_dynf_metadata_register().
@@ -2540,8 +2540,9 @@ struct rte_flow_action_set_meta {
 #define RTE_FLOW_DYNF_METADATA(m) \
 	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
 
-/* Mbuf dynamic flag for metadata. */
+/* Mbuf dynamic flags for metadata. */
 #define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+#define PKT_TX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
 
 __rte_experimental
 static inline uint32_t
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8c51dc1..35df1c4 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -670,7 +670,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
 	case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
 	case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
 	case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
-	case PKT_TX_METADATA: return "PKT_TX_METADATA";
 	default: return NULL;
 	}
 }
@@ -707,7 +706,6 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
 		{ PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
 		{ PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
 		{ PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
-		{ PKT_TX_METADATA, PKT_TX_METADATA, NULL },
 	};
 	const char *name;
 	unsigned int i;
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 3022701..9a8557d 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -187,16 +187,11 @@
 /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
 
 #define PKT_FIRST_FREE (1ULL << 23)
-#define PKT_LAST_FREE (1ULL << 39)
+#define PKT_LAST_FREE (1ULL << 40)
 
 /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
 
 /**
- * Indicate that the metadata field in the mbuf is in use.
- */
-#define PKT_TX_METADATA	(1ULL << 40)
-
-/**
  * Outer UDP checksum offload flag. This flag is used for enabling
  * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
  * 1) Enable the following in mbuf,
@@ -389,8 +384,7 @@
 		PKT_TX_MACSEC |		 \
 		PKT_TX_SEC_OFFLOAD |	 \
 		PKT_TX_UDP_SEG |	 \
-		PKT_TX_OUTER_UDP_CKSUM | \
-		PKT_TX_METADATA)
+		PKT_TX_OUTER_UDP_CKSUM)
 
 /**
  * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
@@ -601,17 +595,6 @@ struct rte_mbuf {
 			/**< User defined tags. See rte_distributor_process() */
 			uint32_t usr;
 		} hash;                   /**< hash information */
-		struct {
-			/**
-			 * Application specific metadata value
-			 * for egress flow rule match.
-			 * Valid if PKT_TX_METADATA is set.
-			 * Located here to allow conjunct use
-			 * with hash.sched.hi.
-			 */
-			uint32_t tx_metadata;
-			uint32_t reserved;
-		};
 	};
 
 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
-- 
1.8.3.1



* Re: [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature
  2019-11-05 14:19                           ` [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature Viacheslav Ovsiienko
  2019-11-05 14:19                             ` [dpdk-dev] [PATCH v10 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
  2019-11-05 14:19                             ` [dpdk-dev] [PATCH v10 2/2] ethdev: move egress metadata to dynamic field Viacheslav Ovsiienko
@ 2019-11-06 15:49                             ` Ferruh Yigit
  2 siblings, 0 replies; 98+ messages in thread
From: Ferruh Yigit @ 2019-11-06 15:49 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, dev
  Cc: matan, rasland, thomas, olivier.matz, arybchenko, orika

On 11/5/2019 2:19 PM, Viacheslav Ovsiienko wrote:
> This patchset combines two metadata-related patches
> so that they are applied in the right order. The first patch
> introduces ingress metadata using an mbuf dynamic field; the
> second moves egress metadata to the dynamic field
> introduced by the first patch.
> 
> Viacheslav Ovsiienko (2):
>   ethdev: extend flow metadata
>   ethdev: move egress metadata to dynamic field

Series applied to dpdk-next-net/master, thanks.


Thread overview: 98+ messages
2019-06-03 21:32 [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Yongseok Koh
2019-06-03 21:32 ` [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action Yongseok Koh
2019-06-06 10:35   ` Jerin Jacob Kollanukkaran
2019-06-06 18:33     ` Yongseok Koh
2019-06-03 21:32 ` [dpdk-dev] [RFC 3/3] ethdev: add flow tag Yongseok Koh
2019-07-04 23:23   ` [dpdk-dev] [PATCH] " Yongseok Koh
2019-07-05 13:54     ` Adrien Mazarguil
2019-07-05 18:05       ` Yongseok Koh
2019-07-08 23:32         ` Yongseok Koh
2019-07-09  8:38         ` Adrien Mazarguil
2019-07-11  1:59           ` Yongseok Koh
2019-10-08 12:57             ` Yigit, Ferruh
2019-10-08 13:18               ` Slava Ovsiienko
2019-10-10 16:09     ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
2019-10-24 13:12       ` [dpdk-dev] [PATCH v3] " Viacheslav Ovsiienko
2019-10-27 16:38         ` Ori Kam
2019-10-27 18:42         ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
2019-10-27 19:11           ` Ori Kam
2019-10-31 18:57             ` Ferruh Yigit
2019-06-09 14:23 ` [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Andrew Rybchenko
2019-06-10  3:19   ` Wang, Haiyue
2019-06-10  7:20     ` Andrew Rybchenko
2019-06-11  0:06       ` Yongseok Koh
2019-06-19  9:05         ` Andrew Rybchenko
2019-07-04 23:21 ` [dpdk-dev] [PATCH] " Yongseok Koh
2019-07-10  9:31   ` Olivier Matz
2019-07-10  9:55     ` Bruce Richardson
2019-07-10 10:07       ` Olivier Matz
2019-07-10 12:01         ` Bruce Richardson
2019-07-10 12:26           ` Thomas Monjalon
2019-07-10 16:37             ` Yongseok Koh
2019-07-11  7:44               ` Adrien Mazarguil
2019-07-14 11:46                 ` Andrew Rybchenko
2019-07-29 15:06                   ` Adrien Mazarguil
2019-10-08 12:51                     ` Yigit, Ferruh
2019-10-08 13:17                       ` Slava Ovsiienko
2019-10-10 16:02   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
2019-10-18  9:22     ` Olivier Matz
2019-10-19 19:47       ` Slava Ovsiienko
2019-10-21 16:37         ` Olivier Matz
2019-10-24  6:49           ` Slava Ovsiienko
2019-10-24  9:22             ` Olivier Matz
2019-10-24 12:30               ` Slava Ovsiienko
2019-10-24 13:08     ` [dpdk-dev] [PATCH v3] " Viacheslav Ovsiienko
2019-10-27 16:56       ` Ori Kam
2019-10-27 18:40       ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
2019-10-27 19:10         ` Ori Kam
2019-10-29 16:22         ` Andrew Rybchenko
2019-10-29 17:19           ` Slava Ovsiienko
2019-10-29 18:30             ` Thomas Monjalon
2019-10-29 18:35               ` Slava Ovsiienko
2019-10-30  6:28               ` Andrew Rybchenko
2019-10-30  7:35             ` Andrew Rybchenko
2019-10-30  8:59               ` Slava Ovsiienko
2019-10-30  9:20                 ` Andrew Rybchenko
2019-10-30 10:05                   ` Slava Ovsiienko
2019-10-30 10:03                 ` Slava Ovsiienko
2019-10-30 15:49               ` Olivier Matz
2019-10-31  9:25                 ` Andrew Rybchenko
2019-10-29 16:25         ` Olivier Matz
2019-10-29 16:33           ` Olivier Matz
2019-10-29 17:53             ` Slava Ovsiienko
2019-10-29 17:43           ` Slava Ovsiienko
2019-10-29 19:31         ` [dpdk-dev] [PATCH v5] " Viacheslav Ovsiienko
2019-10-30  8:02           ` Andrew Rybchenko
2019-10-30 14:40             ` Slava Ovsiienko
2019-10-30 14:46               ` Slava Ovsiienko
2019-10-30 15:20                 ` Olivier Matz
2019-10-30 15:57                   ` Thomas Monjalon
2019-10-30 15:58                   ` Slava Ovsiienko
2019-10-30 16:13                     ` Olivier Matz
2019-10-30  8:35           ` Ori Kam
2019-10-30 17:12           ` [dpdk-dev] [PATCH v6 0/2] extend flow metadata feature Viacheslav Ovsiienko
2019-10-30 17:12             ` [dpdk-dev] [PATCH v6 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
2019-10-31  9:19               ` Andrew Rybchenko
2019-10-31 13:05               ` [dpdk-dev] [PATCH v7 0/2] extend flow metadata feature Viacheslav Ovsiienko
2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
2019-10-31 15:47                   ` Olivier Matz
2019-10-31 16:13                     ` Slava Ovsiienko
2019-10-31 16:48                   ` [dpdk-dev] [PATCH v8 0/2] extend flow metadata feature Viacheslav Ovsiienko
2019-10-31 16:48                     ` [dpdk-dev] [PATCH v8 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
2019-11-04  6:13                       ` [dpdk-dev] [PATCH v9 0/2] extend flow metadata feature Viacheslav Ovsiienko
2019-11-04  6:13                         ` [dpdk-dev] [PATCH v9 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
2019-11-05 14:19                           ` [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature Viacheslav Ovsiienko
2019-11-05 14:19                             ` [dpdk-dev] [PATCH v10 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
2019-11-05 14:19                             ` [dpdk-dev] [PATCH v10 2/2] ethdev: move egress metadata to dynamic field Viacheslav Ovsiienko
2019-11-06 15:49                             ` [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature Ferruh Yigit
2019-11-04  6:13                         ` [dpdk-dev] [PATCH v9 2/2] ethdev: move egress metadata to dynamic field Viacheslav Ovsiienko
2019-10-31 16:48                     ` [dpdk-dev] [PATCH v8 " Viacheslav Ovsiienko
2019-10-31 17:21                       ` Olivier Matz
2019-11-01 12:34                       ` Andrew Rybchenko
2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 " Viacheslav Ovsiienko
2019-10-31 13:33                   ` Ori Kam
2019-10-31 15:51                   ` Olivier Matz
2019-10-31 16:07                     ` Slava Ovsiienko
2019-10-30 17:12             ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
2019-10-31  9:01               ` Andrew Rybchenko
2019-10-31 10:54                 ` Slava Ovsiienko
