DPDK patches and discussions
* [dpdk-dev] [PATCH 0/5] A means to negotiate support for Rx meta information
@ 2021-09-02 14:23 Ivan Malov
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 1/5] ethdev: add API " Ivan Malov
                   ` (6 more replies)
  0 siblings, 7 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-02 14:23 UTC (permalink / raw)
  To: dev

Back in 2019, commit c5b2e78d1172 ("doc: announce ethdev API changes in offload flags")
announced changes in the DEV_RX_OFFLOAD namespace intending to add new flags, RSS_HASH
and FLOW_MARK. Since then, only the former has been added. Currently, there's no way for
the application to configure the ethdev's ability to read out user FLAG, user MARK or
any other meta information that might be required in the case of tunnel offload.
The application assumes that no extra effort is needed to make such data available.

The team behind the sfc poll-mode driver would like to take over these efforts since the
lack of said controls has started impacting us in a number of ways. Riverhead, a
cutting-edge Xilinx smart NIC family, allows switching between several Rx prefix
formats, with the short one being the most suited for small packet performance.
Features like RSS hash and user mark, in turn, are provided when the long prefix is used,
but the driver does not enable it by default. Some mechanism has to be implemented to
let the application express its interest in relying on various sorts of Rx meta data.

Our research indicates that, while RSS_HASH is a legitimate offload flag (it requests
the very computation of RSS hash and not just its delivery via mbufs), adding similar
flags for user FLAG, user MARK and tunnel ID information has a better alternative.

The first patch in the series provides a dedicated API to control precisely the
ability to deliver Rx meta data from the HW to the ethdev and, later, to the callers.
While adding a new dedicated API might at first seem a bit awkward, it does have its
benefits, with the most notable one being its clear and concise contract for users.
The documentation provided in the patch explains the concrete workflow to be implemented.

The most important use case for this might be Open vSwitch. The application has to be
patched separately to make use of the new API. Right now OvS tries to use tunnel
offload and, if it fails to insert the corresponding flow rules, falls back to the
MARK + RSS scheme, which can also fail when the port doesn't support MARK.
With this API, OvS will be able to negotiate supported types of Rx meta information
in advance, thus avoiding many unnecessary flow insertion attempts later on.
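
To illustrate, the application-side decision could look roughly like the sketch
below. It is not part of the series: choose_rx_scheme(), enum rx_scheme and the
mock_pmd_*() helpers are hypothetical names, and negotiate_fn merely stands in for
rte_eth_negotiate_rx_meta() so that the snippet builds without DPDK. The bit values
are the ones defined in patch 1/5.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as defined by patch 1/5 in lib/ethdev/rte_ethdev.h. */
#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical: same signature as rte_eth_negotiate_rx_meta(). */
typedef int (*negotiate_fn)(uint16_t port_id, uint64_t *features);

/* Hypothetical names for the schemes an OvS-like application may pick. */
enum rx_scheme {
	RX_SCHEME_TRY_ALL,	/* method unsupported: keep today's behaviour */
	RX_SCHEME_TUNNEL,	/* tunnel offload */
	RX_SCHEME_MARK_RSS,	/* MARK + RSS fallback */
	RX_SCHEME_NONE,		/* no Rx meta delivery: skip offload attempts */
};

static enum rx_scheme
choose_rx_scheme(negotiate_fn negotiate, uint16_t port_id)
{
	/* Ask for everything the application could make use of. */
	uint64_t features = RTE_ETH_RX_META_TUNNEL_ID |
			    RTE_ETH_RX_META_USER_MARK;
	int rc;

	assert(negotiate != NULL);
	rc = negotiate(port_id, &features);

	if (rc == -ENOTSUP)
		return RX_SCHEME_TRY_ALL;
	if (rc != 0)
		return RX_SCHEME_NONE;

	/* On success, features holds the subset the PMD agreed to deliver. */
	if ((features & RTE_ETH_RX_META_TUNNEL_ID) != 0)
		return RX_SCHEME_TUNNEL;
	if ((features & RTE_ETH_RX_META_USER_MARK) != 0)
		return RX_SCHEME_MARK_RSS;
	return RX_SCHEME_NONE;
}

/* Hypothetical PMD stand-in: can deliver only the user mark. */
static int
mock_pmd_mark_only(uint16_t port_id, uint64_t *features)
{
	(void)port_id;
	*features &= RTE_ETH_RX_META_USER_MARK;
	return 0;
}

/* Hypothetical PMD stand-in: does not implement the method at all. */
static int
mock_pmd_no_method(uint16_t port_id, uint64_t *features)
{
	(void)port_id;
	(void)features;
	return -ENOTSUP;
}
```

A real application would make this call once per port, before the first
rte_eth_dev_configure() invocation, and would skip flow insertion attempts for
the schemes that were not negotiated.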

Ivan Malov (5):
  ethdev: add API to negotiate support for Rx meta information
  net/sfc: provide API to negotiate supported Rx meta features
  net/sfc: allow to use EF100 native datapath Rx mark in flows
  common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  net/sfc: allow to discern user flag on EF100 native datapath

 drivers/common/sfc_efx/base/ef10_rx.c  | 54 +++++++++++++--------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 drivers/net/sfc/sfc.h                  |  2 +
 drivers/net/sfc/sfc_ef100_rx.c         | 19 ++++++++
 drivers/net/sfc/sfc_ethdev.c           | 34 +++++++++++++
 drivers/net/sfc/sfc_flow.c             | 10 ++--
 drivers/net/sfc/sfc_mae.c              | 22 ++++++++-
 drivers/net/sfc/sfc_rx.c               |  6 +++
 lib/ethdev/ethdev_driver.h             | 19 ++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++
 lib/ethdev/rte_ethdev.h                | 66 ++++++++++++++++++++++++++
 lib/ethdev/version.map                 |  3 ++
 13 files changed, 239 insertions(+), 28 deletions(-)

-- 
2.20.1



* [dpdk-dev] [PATCH 1/5] ethdev: add API to negotiate support for Rx meta information
  2021-09-02 14:23 [dpdk-dev] [PATCH 0/5] A means to negotiate support for Rx meta information Ivan Malov
@ 2021-09-02 14:23 ` Ivan Malov
  2021-09-02 14:47   ` Jerin Jacob
                     ` (2 more replies)
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 2/5] net/sfc: provide API to negotiate supported Rx meta features Ivan Malov
                   ` (5 subsequent siblings)
  6 siblings, 3 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-02 14:23 UTC (permalink / raw)
  To: dev; +Cc: Andrew Rybchenko, Thomas Monjalon, Ferruh Yigit, Ray Kinsella

Per-packet meta information (flag, mark and the like) might
be expensive to deliver in terms of small packet performance.
If the features are not enabled by default, enabling them at
short notice (for example, when a flow rule with action MARK
gets created) without traffic disruption may not be possible.

Letting applications request delivery of Rx meta information
during initialisation can solve the said problem efficiently.

Technically, that could be accomplished by defining new bits
in DEV_RX_OFFLOAD namespace, but the ability to extract meta
data cannot be considered an offload on its own. For example,
Rx checksumming is an offload, while mark delivery is not as
it needs an external party, a flow rule with action MARK, to
hold the value and trigger mark insertion in the first place.

With this in mind, add a means to let applications negotiate
adapter support for the very delivery of Rx meta information.
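
For illustration only, the call ordering and the error contract can
be modelled without DPDK as sketched below. struct model_port and
the model_*() functions are hypothetical stand-ins, not ethdev code;
the order of the checks follows rte_eth_negotiate_rx_meta() as added
by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)

/* Hypothetical, heavily reduced model of a port; not rte_eth_dev. */
struct model_port {
	int configured;				/* non-zero once configured */
	int (*negotiate)(uint64_t *features);	/* NULL if PMD lacks it */
};

static int
model_negotiate_rx_meta(struct model_port *port, uint64_t *features)
{
	assert(port != NULL);

	/* Negotiation is only allowed before the port is configured. */
	if (port->configured != 0)
		return -EBUSY;

	if (features == NULL)
		return -EINVAL;

	/* The PMD does not implement the method. */
	if (port->negotiate == NULL)
		return -ENOTSUP;

	return port->negotiate(features);
}

/* Hypothetical PMD callback that can deliver only the user mark. */
static int
model_pmd_mark_only(uint64_t *features)
{
	*features &= RTE_ETH_RX_META_USER_MARK;
	return 0;
}
```

An application that gets -ENOTSUP may carry on with configuration
and simply try the features it prefers, whereas -EBUSY means the
call came too late.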

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 lib/ethdev/ethdev_driver.h | 19 +++++++++++
 lib/ethdev/rte_ethdev.c    | 25 +++++++++++++++
 lib/ethdev/rte_ethdev.h    | 66 ++++++++++++++++++++++++++++++++++++++
 lib/ethdev/version.map     |  3 ++
 4 files changed, 113 insertions(+)

diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 40e474aa7e..3e29555fc7 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
 typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
 	struct rte_eth_representor_info *info);
 
+/**
+ * @internal
+ * Negotiate support for specific fractions of Rx meta information.
+ *
+ * @param[in] dev
+ *   Port (ethdev) handle
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   Negative errno value on error, zero otherwise
+ */
+typedef int (*eth_negotiate_rx_meta_t)(struct rte_eth_dev *dev,
+				       uint64_t *features);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -949,6 +965,9 @@ struct eth_dev_ops {
 
 	eth_representor_info_get_t representor_info_get;
 	/**< Get representor info. */
+
+	eth_negotiate_rx_meta_t negotiate_rx_meta;
+	/**< Negotiate support for specific fractions of Rx meta information. */
 };
 
 /**
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 9d95cd11e1..821450cbf9 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6311,6 +6311,31 @@ rte_eth_representor_info_get(uint16_t port_id,
 	return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
 }
 
+int
+rte_eth_negotiate_rx_meta(uint16_t port_id, uint64_t *features)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (dev->data->dev_configured != 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"The port (id=%"PRIu16") is already configured\n",
+			port_id);
+		return -EBUSY;
+	}
+
+	if (features == NULL) {
+		RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->negotiate_rx_meta, -ENOTSUP);
+	return eth_err(port_id,
+		       (*dev->dev_ops->negotiate_rx_meta)(dev, features));
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index d2b27c351f..ac4d164aa8 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -4888,6 +4888,72 @@ __rte_experimental
 int rte_eth_representor_info_get(uint16_t port_id,
 				 struct rte_eth_representor_info *info);
 
+/**
+ * The ethdev will be able to detect flagged packets provided that
+ * there are active flow rules comprising the corresponding action.
+ */
+#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
+
+/**
+ * The ethdev will manage to see mark IDs in packets provided that
+ * there are active flow rules comprising the corresponding action.
+ */
+#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
+
+/**
+ * The ethdev will be able to identify partially offloaded packets
+ * and process rte_flow_get_restore_info() invocations accordingly
+ * provided that there're so-called "tunnel_set" flow rules in use.
+ */
+#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Negotiate support for specific fractions of Rx meta information.
+ *
+ * This function has to be invoked as early as possible, precisely,
+ * before the first rte_eth_dev_configure() invocation, to let the
+ * PMD make preparations which might be hard to do on later stages.
+ *
+ * The negotiation process is assumed to be carried out as follows:
+ *
+ * - the application composes a mask of preferred Rx meta features
+ *   intending to enable at least some of them and invokes the API;
+ *
+ * - the ethdev driver reports back the optimal (from its point of
+ *   view) subset of the initial feature set thus agreeing to make
+ *   those comprising the subset simultaneously available to users;
+ *
+ * - should the application find the result unsatisfactory, it may
+ *   come up with another pick of preferred features and try again;
+ *
+ * - the application can pass zero to clear the negotiation result;
+ *
+ * - the last negotiated result takes effect upon the ethdev start.
+ *
+ * If the method itself is unsupported by the PMD, the application
+ * may just ignore that and proceed with the rest of configuration
+ * procedure intending to simply try using the features it prefers.
+ *
+ * @param[in] port_id
+ *   Port (ethdev) identifier
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   - (-EBUSY) if the port can't handle this in its current state;
+ *   - (-ENOTSUP) if the method itself is not supported by the PMD;
+ *   - (-ENODEV) if *port_id* is invalid;
+ *   - (-EINVAL) if *features* is NULL;
+ *   - (-EIO) if the device is removed;
+ *   - (0) on success
+ */
+__rte_experimental
+int rte_eth_negotiate_rx_meta(uint16_t port_id, uint64_t *features);
+
 #include <rte_ethdev_core.h>
 
 /**
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 3eece75b72..e390e5718c 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -249,6 +249,9 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_eth_negotiate_rx_meta;
 };
 
 INTERNAL {
-- 
2.20.1



* [dpdk-dev] [PATCH 2/5] net/sfc: provide API to negotiate supported Rx meta features
  2021-09-02 14:23 [dpdk-dev] [PATCH 0/5] A means to negotiate support for Rx meta information Ivan Malov
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 1/5] ethdev: add API " Ivan Malov
@ 2021-09-02 14:23 ` Ivan Malov
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 3/5] net/sfc: allow to use EF100 native datapath Rx mark in flows Ivan Malov
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-02 14:23 UTC (permalink / raw)
  To: dev; +Cc: Andrew Rybchenko

This is a preparation step. Later patches will make features
FLAG and MARK on EF100 native Rx datapath available to users.
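
As a rough sketch of the semantics implemented by the driver callback
(a simplified model, not the driver code): the stored result is first
refreshed from the datapath capabilities, then intersected with the
request, and the outcome is reported back. model_sfc_negotiate() is a
hypothetical name, and datapath_caps is assumed to be expressed in
RTE_ETH_RX_META_* bits already, whereas the driver translates them
from SFC_DP_RX_FEAT_* flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as defined by patch 1/5 in lib/ethdev/rte_ethdev.h. */
#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/*
 * Hypothetical model of the driver-side negotiation:
 * negotiated    - state kept per adapter between calls;
 * datapath_caps - what the current Rx datapath is able to deliver;
 * features      - in: requested mask, out: agreed subset.
 */
static void
model_sfc_negotiate(uint64_t *negotiated, uint64_t datapath_caps,
		    uint64_t *features)
{
	assert(negotiated != NULL && features != NULL);

	/* Start from everything the datapath can deliver... */
	*negotiated |= datapath_caps;
	/* ...and keep only what the application has asked for. */
	*negotiated &= *features;
	/* Report the agreed subset back to the caller. */
	*features = *negotiated;
}
```

With this model, passing zero clears the stored result, and a later
call with a non-zero mask can enable the supported features again.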

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 drivers/net/sfc/sfc.h        |  2 ++
 drivers/net/sfc/sfc_ethdev.c | 34 ++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_flow.c   | 10 +++++-----
 drivers/net/sfc/sfc_mae.c    | 22 ++++++++++++++++++++--
 4 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 331e06bac6..2812d76cbb 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -312,6 +312,8 @@ struct sfc_adapter {
 	boolean_t			tso;
 	boolean_t			tso_encap;
 
+	uint64_t			negotiated_rx_meta;
+
 	uint32_t			rxd_wait_timeout_ns;
 };
 
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 2db0d000c3..57bcccdb1b 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1859,6 +1859,27 @@ sfc_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t ethdev_qid)
 	return sap->dp_rx->intr_disable(rxq_info->dp);
 }
 
+static int
+sfc_negotiate_rx_meta(struct rte_eth_dev *dev, uint64_t *features)
+{
+	struct sfc_adapter *sa = sfc_adapter_by_eth_dev(dev);
+
+	sfc_adapter_lock(sa);
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_FLAG) != 0)
+		sa->negotiated_rx_meta |= RTE_ETH_RX_META_USER_FLAG;
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_MARK) != 0)
+		sa->negotiated_rx_meta |= RTE_ETH_RX_META_USER_MARK;
+
+	sa->negotiated_rx_meta &= *features;
+	*features = sa->negotiated_rx_meta;
+
+	sfc_adapter_unlock(sa);
+
+	return 0;
+}
+
 static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.dev_configure			= sfc_dev_configure,
 	.dev_start			= sfc_dev_start,
@@ -1906,6 +1927,7 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.xstats_get_by_id		= sfc_xstats_get_by_id,
 	.xstats_get_names_by_id		= sfc_xstats_get_names_by_id,
 	.pool_ops_supported		= sfc_pool_ops_supported,
+	.negotiate_rx_meta		= sfc_negotiate_rx_meta,
 };
 
 /**
@@ -1998,6 +2020,18 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 		goto fail_dp_rx_name;
 	}
 
+	if (strcmp(dp_rx->dp.name, SFC_KVARG_DATAPATH_EF10_ESSB) == 0) {
+		/*
+		 * Datapath EF10 ESSB is available only on EF10 NICs running
+		 * Rx FW variant DPDK, which always provides fields FLAG and
+		 * MARK in Rx prefix, so point this fact out below. This way,
+		 * legacy applications from EF10 era, which are not aware of
+		 * rte_eth_negotiate_rx_meta(), can keep the workflow intact.
+		 */
+		sa->negotiated_rx_meta |= RTE_ETH_RX_META_USER_FLAG;
+		sa->negotiated_rx_meta |= RTE_ETH_RX_META_USER_MARK;
+	}
+
 	sfc_notice(sa, "use %s Rx datapath", sas->dp_rx_name);
 
 	rc = sfc_kvargs_process(sa, SFC_KVARG_TX_DATAPATH,
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 4f5993a68d..a2034b5f5e 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -1759,7 +1759,7 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 	int rc;
 	struct sfc_flow_spec *spec = &flow->spec;
 	struct sfc_flow_spec_filter *spec_filter = &spec->filter;
-	const unsigned int dp_rx_features = sa->priv.dp_rx->features;
+	const uint64_t rx_meta = sa->negotiated_rx_meta;
 	uint32_t actions_set = 0;
 	const uint32_t fate_actions_mask = (1UL << RTE_FLOW_ACTION_TYPE_QUEUE) |
 					   (1UL << RTE_FLOW_ACTION_TYPE_RSS) |
@@ -1827,10 +1827,10 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 			if ((actions_set & mark_actions_mask) != 0)
 				goto fail_actions_overlap;
 
-			if ((dp_rx_features & SFC_DP_RX_FEAT_FLOW_FLAG) == 0) {
+			if ((rx_meta & RTE_ETH_RX_META_USER_FLAG) == 0) {
 				rte_flow_error_set(error, ENOTSUP,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
-					"FLAG action is not supported on the current Rx datapath");
+					"Action FLAG is unsupported on the current Rx datapath or has not been negotiated");
 				return -rte_errno;
 			}
 
@@ -1844,10 +1844,10 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 			if ((actions_set & mark_actions_mask) != 0)
 				goto fail_actions_overlap;
 
-			if ((dp_rx_features & SFC_DP_RX_FEAT_FLOW_MARK) == 0) {
+			if ((rx_meta & RTE_ETH_RX_META_USER_MARK) == 0) {
 				rte_flow_error_set(error, ENOTSUP,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
-					"MARK action is not supported on the current Rx datapath");
+					"Action MARK is unsupported on the current Rx datapath or has not been negotiated");
 				return -rte_errno;
 			}
 
diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index 4b520bc619..89c161ef88 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -2963,6 +2963,7 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 			  efx_mae_actions_t *spec,
 			  struct rte_flow_error *error)
 {
+	const uint64_t rx_meta = sa->negotiated_rx_meta;
 	bool custom_error = B_FALSE;
 	int rc = 0;
 
@@ -3012,12 +3013,29 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 	case RTE_FLOW_ACTION_TYPE_FLAG:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_FLAG,
 				       bundle->actions_mask);
-		rc = efx_mae_action_set_populate_flag(spec);
+		if ((rx_meta & RTE_ETH_RX_META_USER_FLAG) != 0) {
+			rc = efx_mae_action_set_populate_flag(spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"Action FLAG has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_MARK:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_MARK,
 				       bundle->actions_mask);
-		rc = sfc_mae_rule_parse_action_mark(sa, action->conf, spec);
+		if ((rx_meta & RTE_ETH_RX_META_USER_MARK) != 0) {
+			rc = sfc_mae_rule_parse_action_mark(sa, action->conf,
+							    spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"Action MARK has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_PHY_PORT:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_PHY_PORT,
-- 
2.20.1



* [dpdk-dev] [PATCH 3/5] net/sfc: allow to use EF100 native datapath Rx mark in flows
  2021-09-02 14:23 [dpdk-dev] [PATCH 0/5] A means to negotiate support for Rx meta information Ivan Malov
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 1/5] ethdev: add API " Ivan Malov
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 2/5] net/sfc: provide API to negotiate supported Rx meta features Ivan Malov
@ 2021-09-02 14:23 ` Ivan Malov
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-02 14:23 UTC (permalink / raw)
  To: dev; +Cc: Andrew Rybchenko

As of now, reading out mark on EF100 native datapath is used
only by MAE counter support for delivery of generation count
values. Make the feature available to flow action MARK users.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 drivers/net/sfc/sfc_ef100_rx.c | 1 +
 drivers/net/sfc/sfc_rx.c       | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index d4cb96881c..e0cafbc579 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -914,6 +914,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR,
 	.dev_offload_capa	= 0,
 	.queue_offload_capa	= DEV_RX_OFFLOAD_CHECKSUM |
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 280e8a61f9..c1acd2ed99 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_meta & RTE_ETH_RX_META_USER_MARK) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
+
 	rc = sfc_ev_qinit(sa, SFC_EVQ_TYPE_RX, sw_index,
 			  evq_entries, socket_id, &evq);
 	if (rc != 0)
-- 
2.20.1



* [dpdk-dev] [PATCH 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  2021-09-02 14:23 [dpdk-dev] [PATCH 0/5] A means to negotiate support for Rx meta information Ivan Malov
                   ` (2 preceding siblings ...)
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 3/5] net/sfc: allow to use EF100 native datapath Rx mark in flows Ivan Malov
@ 2021-09-02 14:23 ` Ivan Malov
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 5/5] net/sfc: allow to discern user flag on EF100 native datapath Ivan Malov
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-02 14:23 UTC (permalink / raw)
  To: dev; +Cc: Andrew Rybchenko

Add an RxQ flag to request support for user flag field of Rx
prefix. The feature is supported only on EF100 and EF10 ESSB.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 3 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/common/sfc_efx/base/ef10_rx.c b/drivers/common/sfc_efx/base/ef10_rx.c
index 0c3f9413cf..a658e0dba2 100644
--- a/drivers/common/sfc_efx/base/ef10_rx.c
+++ b/drivers/common/sfc_efx/base/ef10_rx.c
@@ -930,6 +930,10 @@ ef10_rx_qcreate(
 			rc = ENOTSUP;
 			goto fail2;
 		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail3;
+		}
 		/*
 		 * Ignore EFX_RXQ_FLAG_RSS_HASH since if RSS hash is calculated
 		 * it is always delivered from HW in the pseudo-header.
@@ -940,7 +944,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_packed_stream_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail3;
+			goto fail4;
 		}
 		switch (type_data->ertd_packed_stream.eps_buf_size) {
 		case EFX_RXQ_PACKED_STREAM_BUF_SIZE_1M:
@@ -960,17 +964,21 @@ ef10_rx_qcreate(
 			break;
 		default:
 			rc = ENOTSUP;
-			goto fail4;
+			goto fail5;
 		}
 		erp->er_buf_size = type_data->ertd_packed_stream.eps_buf_size;
 		/* Packed stream pseudo header does not have RSS hash value */
 		if (flags & EFX_RXQ_FLAG_RSS_HASH) {
 			rc = ENOTSUP;
-			goto fail5;
+			goto fail6;
 		}
 		if (flags & EFX_RXQ_FLAG_USER_MARK) {
 			rc = ENOTSUP;
-			goto fail6;
+			goto fail7;
+		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail8;
 		}
 		break;
 #endif /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -979,7 +987,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_essb_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail7;
+			goto fail9;
 		}
 		params.es_bufs_per_desc =
 		    type_data->ertd_es_super_buffer.eessb_bufs_per_desc;
@@ -997,7 +1005,7 @@ ef10_rx_qcreate(
 #endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
 	default:
 		rc = ENOTSUP;
-		goto fail8;
+		goto fail10;
 	}
 
 #if EFSYS_OPT_RX_PACKED_STREAM
@@ -1005,13 +1013,13 @@ ef10_rx_qcreate(
 		/* Check if datapath firmware supports packed stream mode */
 		if (encp->enc_rx_packed_stream_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail9;
+			goto fail11;
 		}
 		/* Check if packed stream allows configurable buffer sizes */
 		if ((params.ps_buf_size != MC_CMD_INIT_RXQ_EXT_IN_PS_BUFF_1M) &&
 		    (encp->enc_rx_var_packed_stream_supported == B_FALSE)) {
 			rc = ENOTSUP;
-			goto fail10;
+			goto fail12;
 		}
 	}
 #else /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -1022,17 +1030,17 @@ ef10_rx_qcreate(
 	if (params.es_bufs_per_desc > 0) {
 		if (encp->enc_rx_es_super_buffer_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail11;
+			goto fail13;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_max_dma_len,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail12;
+			goto fail14;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_buf_stride,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail13;
+			goto fail15;
 		}
 	}
 #else /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
@@ -1041,7 +1049,7 @@ ef10_rx_qcreate(
 
 	if (flags & EFX_RXQ_FLAG_INGRESS_MPORT) {
 		rc = ENOTSUP;
-		goto fail14;
+		goto fail16;
 	}
 
 	/* Scatter can only be disabled if the firmware supports doing so */
@@ -1057,7 +1065,7 @@ ef10_rx_qcreate(
 
 	if ((rc = efx_mcdi_init_rxq(enp, ndescs, eep, label, index,
 		    esmp, &params)) != 0)
-		goto fail15;
+		goto fail17;
 
 	erp->er_eep = eep;
 	erp->er_label = label;
@@ -1070,40 +1078,44 @@ ef10_rx_qcreate(
 
 	return (0);
 
+fail17:
+	EFSYS_PROBE(fail17);
+fail16:
+	EFSYS_PROBE(fail16);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail15:
 	EFSYS_PROBE(fail15);
 fail14:
 	EFSYS_PROBE(fail14);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail13:
 	EFSYS_PROBE(fail13);
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail12:
 	EFSYS_PROBE(fail12);
 fail11:
 	EFSYS_PROBE(fail11);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail10:
 	EFSYS_PROBE(fail10);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail9:
 	EFSYS_PROBE(fail9);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail8:
 	EFSYS_PROBE(fail8);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail7:
 	EFSYS_PROBE(fail7);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
 fail6:
 	EFSYS_PROBE(fail6);
 fail5:
 	EFSYS_PROBE(fail5);
 fail4:
 	EFSYS_PROBE(fail4);
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail3:
 	EFSYS_PROBE(fail3);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail2:
 	EFSYS_PROBE(fail2);
 fail1:
diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h
index 24e1314cc3..bed1029f59 100644
--- a/drivers/common/sfc_efx/base/efx.h
+++ b/drivers/common/sfc_efx/base/efx.h
@@ -3007,6 +3007,10 @@ typedef enum efx_rxq_type_e {
  * Request user mark field in the Rx prefix of a queue.
  */
 #define	EFX_RXQ_FLAG_USER_MARK		0x10
+/*
+ * Request user flag field in the Rx prefix of a queue.
+ */
+#define	EFX_RXQ_FLAG_USER_FLAG		0x20
 
 LIBEFX_API
 extern	__checkReturn	efx_rc_t
diff --git a/drivers/common/sfc_efx/base/rhead_rx.c b/drivers/common/sfc_efx/base/rhead_rx.c
index 76b8ce302a..9d3258b503 100644
--- a/drivers/common/sfc_efx/base/rhead_rx.c
+++ b/drivers/common/sfc_efx/base/rhead_rx.c
@@ -635,6 +635,9 @@ rhead_rx_qcreate(
 	if (flags & EFX_RXQ_FLAG_USER_MARK)
 		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_MARK;
 
+	if (flags & EFX_RXQ_FLAG_USER_FLAG)
+		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_FLAG;
+
 	/*
 	 * LENGTH is required in EF100 host interface, as receive events
 	 * do not include the packet length.
-- 
2.20.1



* [dpdk-dev] [PATCH 5/5] net/sfc: allow to discern user flag on EF100 native datapath
  2021-09-02 14:23 [dpdk-dev] [PATCH 0/5] A means to negotiate support for Rx meta information Ivan Malov
                   ` (3 preceding siblings ...)
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
@ 2021-09-02 14:23 ` Ivan Malov
  2021-09-03  0:15 ` [dpdk-dev] [PATCH v2 0/5] A means to negotiate support for Rx meta information Ivan Malov
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
  6 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-02 14:23 UTC (permalink / raw)
  To: dev; +Cc: Andrew Rybchenko

Read out user flag from Rx prefix and indicate it to callers.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 drivers/net/sfc/sfc_ef100_rx.c | 18 ++++++++++++++++++
 drivers/net/sfc/sfc_rx.c       |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index e0cafbc579..ba43e12739 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -62,6 +62,7 @@ struct sfc_ef100_rxq {
 #define SFC_EF100_RXQ_RSS_HASH		0x10
 #define SFC_EF100_RXQ_USER_MARK		0x20
 #define SFC_EF100_RXQ_FLAG_INTR_EN	0x40
+#define SFC_EF100_RXQ_USER_FLAG		0x80
 	unsigned int			ptr_mask;
 	unsigned int			evq_phase_bit_shift;
 	unsigned int			ready_pkts;
@@ -371,6 +372,7 @@ static const efx_rx_prefix_layout_t sfc_ef100_rx_prefix_layout = {
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH_VALID, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(CLASS, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH, B_FALSE),
+		SFC_EF100_RX_PREFIX_FIELD(USER_FLAG, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(USER_MARK, B_FALSE),
 
 #undef	SFC_EF100_RX_PREFIX_FIELD
@@ -407,6 +409,15 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
 					      ESF_GZ_RX_PREFIX_RSS_HASH);
 	}
 
+	if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
+		uint32_t user_flag;
+
+		user_flag = EFX_OWORD_FIELD(rx_prefix[0],
+					    ESF_GZ_RX_PREFIX_USER_FLAG);
+		if (user_flag != 0)
+			ol_flags |= PKT_RX_FDIR;
+	}
+
 	if (rxq->flags & SFC_EF100_RXQ_USER_MARK) {
 		uint32_t user_mark;
 
@@ -800,6 +811,12 @@ sfc_ef100_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr,
 	else
 		rxq->flags &= ~SFC_EF100_RXQ_RSS_HASH;
 
+	if ((unsup_rx_prefix_fields &
+	     (1U << EFX_RX_PREFIX_FIELD_USER_FLAG)) == 0)
+		rxq->flags |= SFC_EF100_RXQ_USER_FLAG;
+	else
+		rxq->flags &= ~SFC_EF100_RXQ_USER_FLAG;
+
 	if ((unsup_rx_prefix_fields &
 	     (1U << EFX_RX_PREFIX_FIELD_USER_MARK)) == 0)
 		rxq->flags |= SFC_EF100_RXQ_USER_MARK;
@@ -914,6 +931,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_FLAG |
 				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR,
 	.dev_offload_capa	= 0,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index c1acd2ed99..a3331c5089 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_meta & RTE_ETH_RX_META_USER_FLAG) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_FLAG;
+
 	if ((sa->negotiated_rx_meta & RTE_ETH_RX_META_USER_MARK) != 0)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
 
-- 
2.20.1



* Re: [dpdk-dev] [PATCH 1/5] ethdev: add API to negotiate support for Rx meta information
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 1/5] ethdev: add API " Ivan Malov
@ 2021-09-02 14:47   ` Jerin Jacob
  2021-09-02 16:14   ` Kinsella, Ray
  2021-09-03  9:34   ` Jerin Jacob
  2 siblings, 0 replies; 97+ messages in thread
From: Jerin Jacob @ 2021-09-02 14:47 UTC (permalink / raw)
  To: Ivan Malov
  Cc: dpdk-dev, Andrew Rybchenko, Thomas Monjalon, Ferruh Yigit, Ray Kinsella

On Thu, Sep 2, 2021 at 7:54 PM Ivan Malov <ivan.malov@oktetlabs.ru> wrote:
>
> Per-packet meta information (flag, mark and the like) might
> be expensive to deliver in terms of small packet performance.
> If the features are not enabled by default, enabling them at
> short notice (for example, when a flow rule with action MARK
> gets created) without traffic disruption may not be possible.
>
> Letting applications request delivery of Rx meta information
> during initialisation can solve the said problem efficiently.
>
> Technically, that could be accomplished by defining new bits
> in DEV_RX_OFFLOAD namespace, but the ability to extract meta
> data cannot be considered an offload on its own. For example,
> Rx checksumming is an offload, while mark delivery is not as
> it needs an external party, a flow rule with action MARK, to
> hold the value and trigger mark insertion in the first place.
>
> With this in mind, add a means to let applications negotiate
> adapter support for the very delivery of Rx meta information.

Good stuff.


>
> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> ---
>  lib/ethdev/ethdev_driver.h | 19 +++++++++++
>  lib/ethdev/rte_ethdev.c    | 25 +++++++++++++++
>  lib/ethdev/rte_ethdev.h    | 66 ++++++++++++++++++++++++++++++++++++++
>  lib/ethdev/version.map     |  3 ++
>  4 files changed, 113 insertions(+)
>
> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index 40e474aa7e..3e29555fc7 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
>  typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
>         struct rte_eth_representor_info *info);
>
> +/**
> + * @internal
> + * Negotiate support for specific fractions of Rx meta information.
> + *
> + * @param[in] dev
> + *   Port (ethdev) handle
> + *
> + * @param[inout] features

All the params are [in] by default, so I think only [out] needs to be marked.

> + *   Feature selection buffer
> + *
> + * @return
> + *   Negative errno value on error, zero otherwise
> + */
> +typedef int (*eth_negotiate_rx_meta_t)(struct rte_eth_dev *dev,
> +                                      uint64_t *features);
> +
>  /**
>   * @internal A structure containing the functions exported by an Ethernet driver.
>   */
> @@ -949,6 +965,9 @@ struct eth_dev_ops {
>
>         eth_representor_info_get_t representor_info_get;
>         /**< Get representor info. */
> +
> +       eth_negotiate_rx_meta_t negotiate_rx_meta;
> +       /**< Negotiate support for specific fractions of Rx meta information. */
>  };
>
>  /**
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index 9d95cd11e1..821450cbf9 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -6311,6 +6311,31 @@ rte_eth_representor_info_get(uint16_t port_id,
>         return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
>  }
>
> +int
> +rte_eth_negotiate_rx_meta(uint16_t port_id, uint64_t *features)
> +{
> +       struct rte_eth_dev *dev;
> +
> +       RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +       dev = &rte_eth_devices[port_id];
> +
> +       if (dev->data->dev_configured != 0) {
> +               RTE_ETHDEV_LOG(ERR,
> +                       "The port (id=%"PRIu16") is already configured\n",
> +                       port_id);
> +               return -EBUSY;
> +       }
> +
> +       if (features == NULL) {
> +               RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
> +               return -EINVAL;
> +       }
> +
> +       RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->negotiate_rx_meta, -ENOTSUP);
> +       return eth_err(port_id,
> +                      (*dev->dev_ops->negotiate_rx_meta)(dev, features));
> +}
> +
>  RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
>
>  RTE_INIT(ethdev_init_telemetry)
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index d2b27c351f..ac4d164aa8 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -4888,6 +4888,72 @@ __rte_experimental
>  int rte_eth_representor_info_get(uint16_t port_id,
>                                  struct rte_eth_representor_info *info);
>
> +/**
> + * The ethdev will be able to detect flagged packets provided that
> + * there are active flow rules comprising the corresponding action.
> + */
> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)

I think we need to add a @see note in the rte_flow API where this is requested.


> +
> +/**
> + * The ethdev will manage to see mark IDs in packets provided that
> + * there are active flow rules comprising the corresponding action.
> + */
> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)

Same as above

> +
> +/**
> + * The ethdev will be able to identify partially offloaded packets
> + * and process rte_flow_get_restore_info() invocations accordingly
> + * provided that there're so-called "tunnel_set" flow rules in use.
> + */
> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

Same as above


> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Negotiate support for specific fractions of Rx meta information.
> + *
> + * This function has to be invoked as early as possible, precisely,
> + * before the first rte_eth_dev_configure() invocation, to let the
> + * PMD make preparations which might be hard to do on later stages.
> + *
> + * The negotiation process is assumed to be carried out as follows:
> + *
> + * - the application composes a mask of preferred Rx meta features
> + *   intending to enable at least some of them and invokes the API;
> + *
> + * - the ethdev driver reports back the optimal (from its point of
> + *   view) subset of the initial feature set thus agreeing to make
> + *   those comprising the subset simultaneously available to users;
> + *
> + * - should the application find the result unsatisfactory, it may
> + *   come up with another pick of preferred features and try again;
> + *
> + * - the application can pass zero to clear the negotiation result;
> + *
> + * - the last negotiated result takes effect upon the ethdev start.
> + *
> + * If the method itself is unsupported by the PMD, the application
> + * may just ignore that and proceed with the rest of configuration
> + * procedure intending to simply try using the features it prefers.
> + *
> + * @param[in] port_id
> + *   Port (ethdev) identifier

See above comment.

> + *
> + * @param[inout] features
> + *   Feature selection buffer
> + *
> + * @return
> + *   - (-EBUSY) if the port can't handle this in its current state;
> + *   - (-ENOTSUP) if the method itself is not supported by the PMD;
> + *   - (-ENODEV) if *port_id* is invalid;
> + *   - (-EINVAL) if *features* is NULL;
> + *   - (-EIO) if the device is removed;
> + *   - (0) on success
> + */
> +__rte_experimental
> +int rte_eth_negotiate_rx_meta(uint16_t port_id, uint64_t *features);

# I would prefer to call it rte_eth_rx_meta_negotiate, as the
action/verb should come last in the API name
to have a proper namespace.

# IMO, we need to update the release notes for this behavior change.

# Do any of the in-tree examples or test applications need to be updated
to adapt to this behavior change?

> +
>  #include <rte_ethdev_core.h>
>
>  /**
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index 3eece75b72..e390e5718c 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -249,6 +249,9 @@ EXPERIMENTAL {
>         rte_mtr_meter_policy_delete;
>         rte_mtr_meter_policy_update;
>         rte_mtr_meter_policy_validate;
> +
> +       # added in 21.11
> +       rte_eth_negotiate_rx_meta;
>  };
>
>  INTERNAL {
> --
> 2.20.1
>

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH 1/5] ethdev: add API to negotiate support for Rx meta information
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 1/5] ethdev: add API " Ivan Malov
  2021-09-02 14:47   ` Jerin Jacob
@ 2021-09-02 16:14   ` Kinsella, Ray
  2021-09-03  9:34   ` Jerin Jacob
  2 siblings, 0 replies; 97+ messages in thread
From: Kinsella, Ray @ 2021-09-02 16:14 UTC (permalink / raw)
  To: Ivan Malov, dev; +Cc: Andrew Rybchenko, Thomas Monjalon, Ferruh Yigit



On 02/09/2021 15:23, Ivan Malov wrote:
> Per-packet meta information (flag, mark and the likes) might
> be expensive to deliver in terms of small packet performance.
> If the features are not enabled by default, enabling them at
> short notice (for example, when a flow rule with action MARK
> gets created) without traffic disruption may not be possible.
> 
> Letting applications request delivery of Rx meta information
> during initialisation can solve the said problem efficiently.
> 
> Technically, that could be accomplished by defining new bits
> in DEV_RX_OFFLOAD namespace, but the ability to extract meta
> data cannot be considered an offload on its own. For example,
> Rx checksumming is an offload, while mark delivery is not as
> it needs an external party, a flow rule with action MARK, to
> hold the value and trigger mark insertion in the first place.
> 
> With this in mind, add a means to let applications negotiate
> adapter support for the very delivery of Rx meta information.
> 
> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> ---
>  lib/ethdev/ethdev_driver.h | 19 +++++++++++
>  lib/ethdev/rte_ethdev.c    | 25 +++++++++++++++
>  lib/ethdev/rte_ethdev.h    | 66 ++++++++++++++++++++++++++++++++++++++
>  lib/ethdev/version.map     |  3 ++
>  4 files changed, 113 insertions(+)
> 
Acked-by: Ray Kinsella <mdr@ashroe.eu>

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v2 0/5] A means to negotiate support for Rx meta information
  2021-09-02 14:23 [dpdk-dev] [PATCH 0/5] A means to negotiate support for Rx meta information Ivan Malov
                   ` (4 preceding siblings ...)
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 5/5] net/sfc: allow to discern user flag on EF100 native datapath Ivan Malov
@ 2021-09-03  0:15 ` Ivan Malov
  2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 1/5] ethdev: add API " Ivan Malov
                     ` (4 more replies)
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
  6 siblings, 5 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-03  0:15 UTC (permalink / raw)
  To: dev; +Cc: Jerin Jacob, Ray Kinsella

Back in 2019, commit c5b2e78d1172 ("doc: announce ethdev API changes in offload flags")
announced changes in the DEV_RX_OFFLOAD namespace intending to add new flags, RSS_HASH
and FLOW_MARK. Since then, only the former has been added. Currently, there's no way for
the application to configure the ethdev's ability to read out user FLAG, user MARK or
any other meta information that might be required in the case of tunnel offload.
The application assumes that no extra effort is needed to make such data available.

The team behind the sfc poll-mode driver would like to take over these efforts since the
lack of said controls has started impacting us in a number of ways. Riverhead, a
cutting edge Xilinx smart NIC family, allows switching between several Rx prefix
formats, with the short one being best suited for small packet performance.
Features like RSS hash and user mark, in turn, are provided when the long prefix is used,
but the driver does not enable it by default. Some mechanism has to be implemented to
let the application express its interest in relying on various sorts of Rx meta data.

Our research indicates that, while RSS_HASH is a legitimate offload flag (it requests
the very computation of RSS hash and not just its delivery via mbufs), adding similar
flags for user FLAG, user MARK and tunnel ID information has a better alternative.

The first patch in the series provides a dedicated API to control precisely the
ability to deliver Rx meta data from the HW to the ethdev and, later, to the callers.
While adding a new dedicated API might at first seem a bit awkward, it does have its
benefits, with the most notable one being its clear and concise contract for users.
The documentation provided in the patch explains the concrete workflow to be implemented.

The most important use case for this might be Open vSwitch. The application has to be
patched separately to make use of the new API. Right now OvS tries to use tunnel
offload and, if it fails to insert the corresponding flow rules, falls back to the
MARK + RSS scheme, which can also fail when the port doesn't support MARK.
With this API, OvS will be able to negotiate supported types of Rx meta information
in advance, thus avoiding many unnecessary flow insertion attempts later on.
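
The fallback order described above could be decided up front from the
negotiation result, roughly as in the sketch below. This is only an
illustration: the enum rx_strategy and the helper choose_rx_strategy() are
hypothetical names, not part of the series or of OvS, and the bit values are
copied from patch 1/5 so that the snippet builds without DPDK headers. The
inputs are assumed to be the return code of rte_eth_rx_meta_negotiate() and the
mask written back by the PMD.

```c
#include <errno.h>
#include <stdint.h>

/* Bit values copied from patch 1/5 so the sketch builds without DPDK. */
#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical: schemes an OvS-like application could choose from. */
enum rx_strategy {
	RX_STRATEGY_TUNNEL_OFFLOAD,
	RX_STRATEGY_MARK_RSS,
	RX_STRATEGY_SOFTWARE_ONLY,
	RX_STRATEGY_ERROR,
};

/*
 * Hypothetical helper: map the negotiation outcome (return code plus the
 * mask reported back by the PMD) to the scheme worth trying first.
 */
static enum rx_strategy
choose_rx_strategy(int ret, uint64_t negotiated)
{
	/*
	 * The PMD does not implement the method, so nothing is known in
	 * advance: keep the old behaviour and try the preferred scheme.
	 */
	if (ret == -ENOTSUP)
		return RX_STRATEGY_TUNNEL_OFFLOAD;

	if (ret != 0)
		return RX_STRATEGY_ERROR;

	if ((negotiated & RTE_ETH_RX_META_TUNNEL_ID) != 0)
		return RX_STRATEGY_TUNNEL_OFFLOAD;

	if ((negotiated & RTE_ETH_RX_META_USER_MARK) != 0)
		return RX_STRATEGY_MARK_RSS;

	/* Neither tunnel ID nor mark will be delivered with Rx mbufs. */
	return RX_STRATEGY_SOFTWARE_ONLY;
}
```

With such a helper, a port that negotiates only RTE_ETH_RX_META_USER_MARK goes
straight to the MARK + RSS scheme without first attempting to insert tunnel
offload rules.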

Changes in v2:
* [1/5] has review notes from Jerin Jacob applied and the ack from Ray Kinsella added
* [2/5] has minor adjustments incorporated to follow changes in [1/5]

Ivan Malov (5):
  ethdev: add API to negotiate support for Rx meta information
  net/sfc: provide API to negotiate supported Rx meta features
  net/sfc: allow to use EF100 native datapath Rx mark in flows
  common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  net/sfc: allow to discern user flag on EF100 native datapath

 app/test-flow-perf/main.c              | 21 ++++++++
 app/test-pmd/testpmd.c                 | 26 ++++++++++
 doc/guides/rel_notes/release_21_11.rst | 10 ++++
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 +++++++++++++--------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 drivers/net/sfc/sfc.h                  |  2 +
 drivers/net/sfc/sfc_ef100_rx.c         | 19 ++++++++
 drivers/net/sfc/sfc_ethdev.c           | 34 +++++++++++++
 drivers/net/sfc/sfc_flow.c             | 10 ++--
 drivers/net/sfc/sfc_mae.c              | 22 ++++++++-
 drivers/net/sfc/sfc_rx.c               |  6 +++
 lib/ethdev/ethdev_driver.h             | 19 ++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++
 lib/ethdev/rte_ethdev.h                | 66 ++++++++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 15 ++++++
 lib/ethdev/version.map                 |  3 ++
 17 files changed, 311 insertions(+), 28 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v2 1/5] ethdev: add API to negotiate support for Rx meta information
  2021-09-03  0:15 ` [dpdk-dev] [PATCH v2 0/5] A means to negotiate support for Rx meta information Ivan Malov
@ 2021-09-03  0:15   ` Ivan Malov
  2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 2/5] net/sfc: provide API to negotiate supported Rx meta features Ivan Malov
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-03  0:15 UTC (permalink / raw)
  To: dev
  Cc: Jerin Jacob, Ray Kinsella, Andrew Rybchenko, Wisam Jaddo,
	Xiaoyun Li, Thomas Monjalon, Ferruh Yigit, Ori Kam

Per-packet meta information (flag, mark and the like) might
be expensive to deliver in terms of small packet performance.
If the features are not enabled by default, enabling them at
short notice (for example, when a flow rule with action MARK
gets created) without traffic disruption may not be possible.

Letting applications request delivery of Rx meta information
during initialisation can solve the said problem efficiently.

Technically, that could be accomplished by defining new bits
in DEV_RX_OFFLOAD namespace, but the ability to extract meta
data cannot be considered an offload on its own. For example,
Rx checksumming is an offload, while mark delivery is not as
it needs an external party, a flow rule with action MARK, to
hold the value and trigger mark insertion in the first place.

With this in mind, add a means to let applications negotiate
adapter support for the very delivery of Rx meta information.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
---
 app/test-flow-perf/main.c              | 21 ++++++++
 app/test-pmd/testpmd.c                 | 26 ++++++++++
 doc/guides/rel_notes/release_21_11.rst | 10 ++++
 lib/ethdev/ethdev_driver.h             | 19 ++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++
 lib/ethdev/rte_ethdev.h                | 66 ++++++++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 15 ++++++
 lib/ethdev/version.map                 |  3 ++
 8 files changed, 185 insertions(+)

diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
index 9be8edc31d..48eafffb1d 100644
--- a/app/test-flow-perf/main.c
+++ b/app/test-flow-perf/main.c
@@ -1760,6 +1760,27 @@ init_port(void)
 		rte_exit(EXIT_FAILURE, "Error: can't init mbuf pool\n");
 
 	for (port_id = 0; port_id < nr_ports; port_id++) {
+		uint64_t rx_meta_features = 0;
+
+		rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
+		rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
+
+		ret = rte_eth_rx_meta_negotiate(port_id, &rx_meta_features);
+		if (ret == 0) {
+			if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
+				printf(":: flow action FLAG will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+
+			if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
+				printf(":: flow action MARK will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+		} else if (ret != -ENOTSUP) {
+			rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port=%u: %s\n",
+				 port_id, rte_strerror(-ret));
+		}
+
 		ret = rte_eth_dev_info_get(port_id, &dev_info);
 		if (ret != 0)
 			rte_exit(EXIT_FAILURE,
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 9061cbf637..72addc59db 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1421,10 +1421,36 @@ static void
 init_config_port_offloads(portid_t pid, uint32_t socket_id)
 {
 	struct rte_port *port = &ports[pid];
+	uint64_t rx_meta_features = 0;
 	uint16_t data_size;
 	int ret;
 	int i;
 
+	rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
+	rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
+	rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
+
+	ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
+	if (ret == 0) {
+		if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
+			TESTPMD_LOG(INFO, "Flow action FLAG will not affect Rx mbufs on port %u\n",
+				    pid);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
+			TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx mbufs on port %u\n",
+				    pid);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
+			TESTPMD_LOG(INFO, "Flow tunnel offload support might be limited or unavailable on port %u\n",
+				    pid);
+		}
+	} else if (ret != -ENOTSUP) {
+		rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port %u: %s\n",
+			 pid, rte_strerror(-ret));
+	}
+
 	port->dev_conf.txmode = tx_mode;
 	port->dev_conf.rxmode = rx_mode;
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index d707a554ef..017b5a8239 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -55,6 +55,16 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added an API to negotiate support for Rx meta information**
+
+  A new API, ``rte_eth_rx_meta_negotiate()``, was added to let applications
+  negotiate support for the very delivery of various Rx meta data fractions,
+  with the following definitions being available starting from this release:
+
+  * ``RTE_ETH_RX_META_USER_FLAG``
+  * ``RTE_ETH_RX_META_USER_MARK``
+  * ``RTE_ETH_RX_META_TUNNEL_ID``
+
 
 Removed Items
 -------------
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 40e474aa7e..508d2e6bfb 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
 typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
 	struct rte_eth_representor_info *info);
 
+/**
+ * @internal
+ * Negotiate support for specific fractions of Rx meta information.
+ *
+ * @param dev
+ *   Port (ethdev) handle
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   Negative errno value on error, zero otherwise
+ */
+typedef int (*eth_rx_meta_negotiate_t)(struct rte_eth_dev *dev,
+				       uint64_t *features);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -949,6 +965,9 @@ struct eth_dev_ops {
 
 	eth_representor_info_get_t representor_info_get;
 	/**< Get representor info. */
+
+	eth_rx_meta_negotiate_t rx_meta_negotiate;
+	/**< Negotiate support for specific fractions of Rx meta information. */
 };
 
 /**
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 9d95cd11e1..8010aa7a43 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6311,6 +6311,31 @@ rte_eth_representor_info_get(uint16_t port_id,
 	return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
 }
 
+int
+rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (dev->data->dev_configured != 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"The port (id=%"PRIu16") is already configured\n",
+			port_id);
+		return -EBUSY;
+	}
+
+	if (features == NULL) {
+		RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_meta_negotiate, -ENOTSUP);
+	return eth_err(port_id,
+		       (*dev->dev_ops->rx_meta_negotiate)(dev, features));
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index d2b27c351f..16af376866 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -4888,6 +4888,72 @@ __rte_experimental
 int rte_eth_representor_info_get(uint16_t port_id,
 				 struct rte_eth_representor_info *info);
 
+/**
+ * The ethdev will be able to detect flagged packets provided that
+ * there are active flow rules comprising the corresponding action.
+ */
+#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
+
+/**
+ * The ethdev will manage to see mark IDs in packets provided that
+ * there are active flow rules comprising the corresponding action.
+ */
+#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
+
+/**
+ * The ethdev will be able to identify partially offloaded packets
+ * and process rte_flow_get_restore_info() invocations accordingly
+ * provided that there're so-called "tunnel_set" flow rules in use.
+ */
+#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Negotiate support for specific fractions of Rx meta information.
+ *
+ * This function has to be invoked as early as possible, precisely,
+ * before the first rte_eth_dev_configure() invocation, to let the
+ * PMD make preparations which might be hard to do on later stages.
+ *
+ * The negotiation process is assumed to be carried out as follows:
+ *
+ * - the application composes a mask of preferred Rx meta features
+ *   intending to enable at least some of them and invokes the API;
+ *
+ * - the ethdev driver reports back the optimal (from its point of
+ *   view) subset of the initial feature set thus agreeing to make
+ *   those comprising the subset simultaneously available to users;
+ *
+ * - should the application find the result unsatisfactory, it may
+ *   come up with another pick of preferred features and try again;
+ *
+ * - the application can pass zero to clear the negotiation result;
+ *
+ * - the last negotiated result takes effect upon the ethdev start.
+ *
+ * If the method itself is unsupported by the PMD, the application
+ * may just ignore that and proceed with the rest of configuration
+ * procedure intending to simply try using the features it prefers.
+ *
+ * @param port_id
+ *   Port (ethdev) identifier
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   - (-EBUSY) if the port can't handle this in its current state;
+ *   - (-ENOTSUP) if the method itself is not supported by the PMD;
+ *   - (-ENODEV) if *port_id* is invalid;
+ *   - (-EINVAL) if *features* is NULL;
+ *   - (-EIO) if the device is removed;
+ *   - (0) on success
+ */
+__rte_experimental
+int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features);
+
 #include <rte_ethdev_core.h>
 
 /**
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 70f455d47d..f8f2bc2e8c 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -1904,6 +1904,11 @@ enum rte_flow_action_type {
 	 * PKT_RX_FDIR_ID mbuf flags.
 	 *
 	 * See struct rte_flow_action_mark.
+	 *
+	 * The ethdev's ability to serve mark IDs with received packets
+	 * has to be negotiated separately during initialisation period.
+	 * @see rte_eth_rx_meta_negotiate()
+	 * @see RTE_ETH_RX_META_USER_MARK
 	 */
 	RTE_FLOW_ACTION_TYPE_MARK,
 
@@ -1912,6 +1917,11 @@ enum rte_flow_action_type {
 	 * sets the PKT_RX_FDIR mbuf flag.
 	 *
 	 * No associated configuration structure.
+	 *
+	 * The ethdev's ability to serve the flag with received packets
+	 * has to be negotiated separately during initialisation period.
+	 * @see rte_eth_rx_meta_negotiate()
+	 * @see RTE_ETH_RX_META_USER_FLAG
 	 */
 	RTE_FLOW_ACTION_TYPE_FLAG,
 
@@ -4223,6 +4233,11 @@ rte_flow_tunnel_match(uint16_t port_id,
 /**
  * Populate the current packet processing state, if exists, for the given mbuf.
  *
+ * The ethdev's ability to uncover such packet processing state
+ * has to be negotiated separately during initialisation period.
+ * @see rte_eth_rx_meta_negotiate()
+ * @see RTE_ETH_RX_META_TUNNEL_ID
+ *
  * @param port_id
  *   Port identifier of Ethernet device.
  * @param[in] m
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 3eece75b72..2cf773f63a 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -249,6 +249,9 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_eth_rx_meta_negotiate;
 };
 
 INTERNAL {
-- 
2.20.1
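
For readers following the error codes documented in rte_ethdev.h above, the
order of checks in the generic wrapper can be modelled without DPDK as in the
sketch below. Everything prefixed with mock_ is a hypothetical stand-in
introduced for illustration only: struct mock_port replaces the few fields of
struct rte_eth_dev that the wrapper inspects, and the eth_err() step (which
yields -EIO for a removed device) is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values copied from the patch so the sketch builds without DPDK. */
#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)

/* Hypothetical stand-in for the port state inspected by the wrapper. */
struct mock_port {
	int valid;      /* models the RTE_ETH_VALID_PORTID_OR_ERR_RET check */
	int configured; /* models dev->data->dev_configured */
	int (*negotiate)(uint64_t *features); /* models the driver callback */
};

/* Hypothetical PMD callback that can only deliver the user mark. */
static int
mock_pmd_mark_only(uint64_t *features)
{
	*features &= RTE_ETH_RX_META_USER_MARK;
	return 0;
}

/* Same order of checks as rte_eth_rx_meta_negotiate() in the patch. */
static int
mock_rx_meta_negotiate(const struct mock_port *port, uint64_t *features)
{
	if (!port->valid)
		return -ENODEV;

	/* Negotiation is only allowed before the port gets configured. */
	if (port->configured != 0)
		return -EBUSY;

	if (features == NULL)
		return -EINVAL;

	if (port->negotiate == NULL)
		return -ENOTSUP;

	return port->negotiate(features);
}
```

Note that, with this ordering, an already configured port reports -EBUSY even
when the features pointer is NULL, since the state check comes first.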


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v2 2/5] net/sfc: provide API to negotiate supported Rx meta features
  2021-09-03  0:15 ` [dpdk-dev] [PATCH v2 0/5] A means to negotiate support for Rx meta information Ivan Malov
  2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 1/5] ethdev: add API " Ivan Malov
@ 2021-09-03  0:15   ` Ivan Malov
  2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 3/5] net/sfc: allow to use EF100 native datapath Rx mark in flows Ivan Malov
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-03  0:15 UTC (permalink / raw)
  To: dev; +Cc: Jerin Jacob, Ray Kinsella, Andrew Rybchenko

This is a preparation step. Later patches will make features
FLAG and MARK on EF100 native Rx datapath available to users.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 drivers/net/sfc/sfc.h        |  2 ++
 drivers/net/sfc/sfc_ethdev.c | 34 ++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_flow.c   | 10 +++++-----
 drivers/net/sfc/sfc_mae.c    | 22 ++++++++++++++++++++--
 4 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 331e06bac6..2812d76cbb 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -312,6 +312,8 @@ struct sfc_adapter {
 	boolean_t			tso;
 	boolean_t			tso_encap;
 
+	uint64_t			negotiated_rx_meta;
+
 	uint32_t			rxd_wait_timeout_ns;
 };
 
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 2db0d000c3..94203616c3 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1859,6 +1859,27 @@ sfc_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t ethdev_qid)
 	return sap->dp_rx->intr_disable(rxq_info->dp);
 }
 
+static int
+sfc_rx_meta_negotiate(struct rte_eth_dev *dev, uint64_t *features)
+{
+	struct sfc_adapter *sa = sfc_adapter_by_eth_dev(dev);
+
+	sfc_adapter_lock(sa);
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_FLAG) != 0)
+		sa->negotiated_rx_meta |= RTE_ETH_RX_META_USER_FLAG;
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_MARK) != 0)
+		sa->negotiated_rx_meta |= RTE_ETH_RX_META_USER_MARK;
+
+	sa->negotiated_rx_meta &= *features;
+	*features = sa->negotiated_rx_meta;
+
+	sfc_adapter_unlock(sa);
+
+	return 0;
+}
+
 static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.dev_configure			= sfc_dev_configure,
 	.dev_start			= sfc_dev_start,
@@ -1906,6 +1927,7 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.xstats_get_by_id		= sfc_xstats_get_by_id,
 	.xstats_get_names_by_id		= sfc_xstats_get_names_by_id,
 	.pool_ops_supported		= sfc_pool_ops_supported,
+	.rx_meta_negotiate		= sfc_rx_meta_negotiate,
 };
 
 /**
@@ -1998,6 +2020,18 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 		goto fail_dp_rx_name;
 	}
 
+	if (strcmp(dp_rx->dp.name, SFC_KVARG_DATAPATH_EF10_ESSB) == 0) {
+		/*
+		 * Datapath EF10 ESSB is available only on EF10 NICs running
+		 * Rx FW variant DPDK, which always provides fields FLAG and
+		 * MARK in Rx prefix, so point this fact out below. This way,
+		 * legacy applications from EF10 era, which are not aware of
+		 * rte_eth_rx_meta_negotiate(), can keep the workflow intact.
+		 */
+		sa->negotiated_rx_meta |= RTE_ETH_RX_META_USER_FLAG;
+		sa->negotiated_rx_meta |= RTE_ETH_RX_META_USER_MARK;
+	}
+
 	sfc_notice(sa, "use %s Rx datapath", sas->dp_rx_name);
 
 	rc = sfc_kvargs_process(sa, SFC_KVARG_TX_DATAPATH,
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 4f5993a68d..a2034b5f5e 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -1759,7 +1759,7 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 	int rc;
 	struct sfc_flow_spec *spec = &flow->spec;
 	struct sfc_flow_spec_filter *spec_filter = &spec->filter;
-	const unsigned int dp_rx_features = sa->priv.dp_rx->features;
+	const uint64_t rx_meta = sa->negotiated_rx_meta;
 	uint32_t actions_set = 0;
 	const uint32_t fate_actions_mask = (1UL << RTE_FLOW_ACTION_TYPE_QUEUE) |
 					   (1UL << RTE_FLOW_ACTION_TYPE_RSS) |
@@ -1827,10 +1827,10 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 			if ((actions_set & mark_actions_mask) != 0)
 				goto fail_actions_overlap;
 
-			if ((dp_rx_features & SFC_DP_RX_FEAT_FLOW_FLAG) == 0) {
+			if ((rx_meta & RTE_ETH_RX_META_USER_FLAG) == 0) {
 				rte_flow_error_set(error, ENOTSUP,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
-					"FLAG action is not supported on the current Rx datapath");
+					"Action FLAG is unsupported on the current Rx datapath or has not been negotiated");
 				return -rte_errno;
 			}
 
@@ -1844,10 +1844,10 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 			if ((actions_set & mark_actions_mask) != 0)
 				goto fail_actions_overlap;
 
-			if ((dp_rx_features & SFC_DP_RX_FEAT_FLOW_MARK) == 0) {
+			if ((rx_meta & RTE_ETH_RX_META_USER_MARK) == 0) {
 				rte_flow_error_set(error, ENOTSUP,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
-					"MARK action is not supported on the current Rx datapath");
+					"Action MARK is unsupported on the current Rx datapath or has not been negotiated");
 				return -rte_errno;
 			}
 
diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index 4b520bc619..89c161ef88 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -2963,6 +2963,7 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 			  efx_mae_actions_t *spec,
 			  struct rte_flow_error *error)
 {
+	const uint64_t rx_meta = sa->negotiated_rx_meta;
 	bool custom_error = B_FALSE;
 	int rc = 0;
 
@@ -3012,12 +3013,29 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 	case RTE_FLOW_ACTION_TYPE_FLAG:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_FLAG,
 				       bundle->actions_mask);
-		rc = efx_mae_action_set_populate_flag(spec);
+		if ((rx_meta & RTE_ETH_RX_META_USER_FLAG) != 0) {
+			rc = efx_mae_action_set_populate_flag(spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"Action FLAG has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_MARK:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_MARK,
 				       bundle->actions_mask);
-		rc = sfc_mae_rule_parse_action_mark(sa, action->conf, spec);
+		if ((rx_meta & RTE_ETH_RX_META_USER_MARK) != 0) {
+			rc = sfc_mae_rule_parse_action_mark(sa, action->conf,
+							    spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"Action MARK has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_PHY_PORT:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_PHY_PORT,
-- 
2.20.1



* [dpdk-dev] [PATCH v2 3/5] net/sfc: allow to use EF100 native datapath Rx mark in flows
  2021-09-03  0:15 ` [dpdk-dev] [PATCH v2 0/5] A means to negotiate support for Rx meta information Ivan Malov
  2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 1/5] ethdev: add API " Ivan Malov
  2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 2/5] net/sfc: provide API to negotiate supported Rx meta features Ivan Malov
@ 2021-09-03  0:15   ` Ivan Malov
  2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
  2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 5/5] net/sfc: allow to discern user flag on EF100 native datapath Ivan Malov
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-03  0:15 UTC (permalink / raw)
  To: dev; +Cc: Jerin Jacob, Ray Kinsella, Andrew Rybchenko

As of now, reading out the mark on the EF100 native datapath is used
only by MAE counter support for delivery of generation count
values. Make the feature available to flow action MARK users.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 drivers/net/sfc/sfc_ef100_rx.c | 1 +
 drivers/net/sfc/sfc_rx.c       | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index d4cb96881c..e0cafbc579 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -914,6 +914,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR,
 	.dev_offload_capa	= 0,
 	.queue_offload_capa	= DEV_RX_OFFLOAD_CHECKSUM |
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 280e8a61f9..c1acd2ed99 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_meta & RTE_ETH_RX_META_USER_MARK) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
+
 	rc = sfc_ev_qinit(sa, SFC_EVQ_TYPE_RX, sw_index,
 			  evq_entries, socket_id, &evq);
 	if (rc != 0)
-- 
2.20.1



* [dpdk-dev] [PATCH v2 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  2021-09-03  0:15 ` [dpdk-dev] [PATCH v2 0/5] A means to negotiate support for Rx meta information Ivan Malov
                     ` (2 preceding siblings ...)
  2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 3/5] net/sfc: allow to use EF100 native datapath Rx mark in flows Ivan Malov
@ 2021-09-03  0:15   ` Ivan Malov
  2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 5/5] net/sfc: allow to discern user flag on EF100 native datapath Ivan Malov
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-03  0:15 UTC (permalink / raw)
  To: dev; +Cc: Jerin Jacob, Ray Kinsella, Andrew Rybchenko

Add an RxQ flag to request the user flag field of the Rx prefix.
The feature is supported only on EF100 and EF10 ESSB.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 3 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/common/sfc_efx/base/ef10_rx.c b/drivers/common/sfc_efx/base/ef10_rx.c
index 0c3f9413cf..a658e0dba2 100644
--- a/drivers/common/sfc_efx/base/ef10_rx.c
+++ b/drivers/common/sfc_efx/base/ef10_rx.c
@@ -930,6 +930,10 @@ ef10_rx_qcreate(
 			rc = ENOTSUP;
 			goto fail2;
 		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail3;
+		}
 		/*
 		 * Ignore EFX_RXQ_FLAG_RSS_HASH since if RSS hash is calculated
 		 * it is always delivered from HW in the pseudo-header.
@@ -940,7 +944,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_packed_stream_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail3;
+			goto fail4;
 		}
 		switch (type_data->ertd_packed_stream.eps_buf_size) {
 		case EFX_RXQ_PACKED_STREAM_BUF_SIZE_1M:
@@ -960,17 +964,21 @@ ef10_rx_qcreate(
 			break;
 		default:
 			rc = ENOTSUP;
-			goto fail4;
+			goto fail5;
 		}
 		erp->er_buf_size = type_data->ertd_packed_stream.eps_buf_size;
 		/* Packed stream pseudo header does not have RSS hash value */
 		if (flags & EFX_RXQ_FLAG_RSS_HASH) {
 			rc = ENOTSUP;
-			goto fail5;
+			goto fail6;
 		}
 		if (flags & EFX_RXQ_FLAG_USER_MARK) {
 			rc = ENOTSUP;
-			goto fail6;
+			goto fail7;
+		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail8;
 		}
 		break;
 #endif /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -979,7 +987,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_essb_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail7;
+			goto fail9;
 		}
 		params.es_bufs_per_desc =
 		    type_data->ertd_es_super_buffer.eessb_bufs_per_desc;
@@ -997,7 +1005,7 @@ ef10_rx_qcreate(
 #endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
 	default:
 		rc = ENOTSUP;
-		goto fail8;
+		goto fail10;
 	}
 
 #if EFSYS_OPT_RX_PACKED_STREAM
@@ -1005,13 +1013,13 @@ ef10_rx_qcreate(
 		/* Check if datapath firmware supports packed stream mode */
 		if (encp->enc_rx_packed_stream_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail9;
+			goto fail11;
 		}
 		/* Check if packed stream allows configurable buffer sizes */
 		if ((params.ps_buf_size != MC_CMD_INIT_RXQ_EXT_IN_PS_BUFF_1M) &&
 		    (encp->enc_rx_var_packed_stream_supported == B_FALSE)) {
 			rc = ENOTSUP;
-			goto fail10;
+			goto fail12;
 		}
 	}
 #else /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -1022,17 +1030,17 @@ ef10_rx_qcreate(
 	if (params.es_bufs_per_desc > 0) {
 		if (encp->enc_rx_es_super_buffer_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail11;
+			goto fail13;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_max_dma_len,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail12;
+			goto fail14;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_buf_stride,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail13;
+			goto fail15;
 		}
 	}
 #else /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
@@ -1041,7 +1049,7 @@ ef10_rx_qcreate(
 
 	if (flags & EFX_RXQ_FLAG_INGRESS_MPORT) {
 		rc = ENOTSUP;
-		goto fail14;
+		goto fail16;
 	}
 
 	/* Scatter can only be disabled if the firmware supports doing so */
@@ -1057,7 +1065,7 @@ ef10_rx_qcreate(
 
 	if ((rc = efx_mcdi_init_rxq(enp, ndescs, eep, label, index,
 		    esmp, &params)) != 0)
-		goto fail15;
+		goto fail17;
 
 	erp->er_eep = eep;
 	erp->er_label = label;
@@ -1070,40 +1078,44 @@ ef10_rx_qcreate(
 
 	return (0);
 
+fail17:
+	EFSYS_PROBE(fail17);
+fail16:
+	EFSYS_PROBE(fail16);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail15:
 	EFSYS_PROBE(fail15);
 fail14:
 	EFSYS_PROBE(fail14);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail13:
 	EFSYS_PROBE(fail13);
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail12:
 	EFSYS_PROBE(fail12);
 fail11:
 	EFSYS_PROBE(fail11);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail10:
 	EFSYS_PROBE(fail10);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail9:
 	EFSYS_PROBE(fail9);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail8:
 	EFSYS_PROBE(fail8);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail7:
 	EFSYS_PROBE(fail7);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
 fail6:
 	EFSYS_PROBE(fail6);
 fail5:
 	EFSYS_PROBE(fail5);
 fail4:
 	EFSYS_PROBE(fail4);
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail3:
 	EFSYS_PROBE(fail3);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail2:
 	EFSYS_PROBE(fail2);
 fail1:
diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h
index 24e1314cc3..bed1029f59 100644
--- a/drivers/common/sfc_efx/base/efx.h
+++ b/drivers/common/sfc_efx/base/efx.h
@@ -3007,6 +3007,10 @@ typedef enum efx_rxq_type_e {
  * Request user mark field in the Rx prefix of a queue.
  */
 #define	EFX_RXQ_FLAG_USER_MARK		0x10
+/*
+ * Request user flag field in the Rx prefix of a queue.
+ */
+#define	EFX_RXQ_FLAG_USER_FLAG		0x20
 
 LIBEFX_API
 extern	__checkReturn	efx_rc_t
diff --git a/drivers/common/sfc_efx/base/rhead_rx.c b/drivers/common/sfc_efx/base/rhead_rx.c
index 76b8ce302a..9d3258b503 100644
--- a/drivers/common/sfc_efx/base/rhead_rx.c
+++ b/drivers/common/sfc_efx/base/rhead_rx.c
@@ -635,6 +635,9 @@ rhead_rx_qcreate(
 	if (flags & EFX_RXQ_FLAG_USER_MARK)
 		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_MARK;
 
+	if (flags & EFX_RXQ_FLAG_USER_FLAG)
+		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_FLAG;
+
 	/*
 	 * LENGTH is required in EF100 host interface, as receive events
 	 * do not include the packet length.
-- 
2.20.1



* [dpdk-dev] [PATCH v2 5/5] net/sfc: allow to discern user flag on EF100 native datapath
  2021-09-03  0:15 ` [dpdk-dev] [PATCH v2 0/5] A means to negotiate support for Rx meta information Ivan Malov
                     ` (3 preceding siblings ...)
  2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
@ 2021-09-03  0:15   ` Ivan Malov
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-03  0:15 UTC (permalink / raw)
  To: dev; +Cc: Jerin Jacob, Ray Kinsella, Andrew Rybchenko

Read out the user flag from the Rx prefix and report it to callers.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 drivers/net/sfc/sfc_ef100_rx.c | 18 ++++++++++++++++++
 drivers/net/sfc/sfc_rx.c       |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index e0cafbc579..ba43e12739 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -62,6 +62,7 @@ struct sfc_ef100_rxq {
 #define SFC_EF100_RXQ_RSS_HASH		0x10
 #define SFC_EF100_RXQ_USER_MARK		0x20
 #define SFC_EF100_RXQ_FLAG_INTR_EN	0x40
+#define SFC_EF100_RXQ_USER_FLAG		0x80
 	unsigned int			ptr_mask;
 	unsigned int			evq_phase_bit_shift;
 	unsigned int			ready_pkts;
@@ -371,6 +372,7 @@ static const efx_rx_prefix_layout_t sfc_ef100_rx_prefix_layout = {
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH_VALID, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(CLASS, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH, B_FALSE),
+		SFC_EF100_RX_PREFIX_FIELD(USER_FLAG, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(USER_MARK, B_FALSE),
 
 #undef	SFC_EF100_RX_PREFIX_FIELD
@@ -407,6 +409,15 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
 					      ESF_GZ_RX_PREFIX_RSS_HASH);
 	}
 
+	if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
+		uint32_t user_flag;
+
+		user_flag = EFX_OWORD_FIELD(rx_prefix[0],
+					    ESF_GZ_RX_PREFIX_USER_FLAG);
+		if (user_flag != 0)
+			ol_flags |= PKT_RX_FDIR;
+	}
+
 	if (rxq->flags & SFC_EF100_RXQ_USER_MARK) {
 		uint32_t user_mark;
 
@@ -800,6 +811,12 @@ sfc_ef100_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr,
 	else
 		rxq->flags &= ~SFC_EF100_RXQ_RSS_HASH;
 
+	if ((unsup_rx_prefix_fields &
+	     (1U << EFX_RX_PREFIX_FIELD_USER_FLAG)) == 0)
+		rxq->flags |= SFC_EF100_RXQ_USER_FLAG;
+	else
+		rxq->flags &= ~SFC_EF100_RXQ_USER_FLAG;
+
 	if ((unsup_rx_prefix_fields &
 	     (1U << EFX_RX_PREFIX_FIELD_USER_MARK)) == 0)
 		rxq->flags |= SFC_EF100_RXQ_USER_MARK;
@@ -914,6 +931,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_FLAG |
 				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR,
 	.dev_offload_capa	= 0,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index c1acd2ed99..a3331c5089 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_meta & RTE_ETH_RX_META_USER_FLAG) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_FLAG;
+
 	if ((sa->negotiated_rx_meta & RTE_ETH_RX_META_USER_MARK) != 0)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
 
-- 
2.20.1



* Re: [dpdk-dev] [PATCH 1/5] ethdev: add API to negotiate support for Rx meta information
  2021-09-02 14:23 ` [dpdk-dev] [PATCH 1/5] ethdev: add API " Ivan Malov
  2021-09-02 14:47   ` Jerin Jacob
  2021-09-02 16:14   ` Kinsella, Ray
@ 2021-09-03  9:34   ` Jerin Jacob
  2 siblings, 0 replies; 97+ messages in thread
From: Jerin Jacob @ 2021-09-03  9:34 UTC (permalink / raw)
  To: Ivan Malov
  Cc: dpdk-dev, Andrew Rybchenko, Thomas Monjalon, Ferruh Yigit, Ray Kinsella

On Thu, Sep 2, 2021 at 7:54 PM Ivan Malov <ivan.malov@oktetlabs.ru> wrote:
>
> Per-packet meta information (flag, mark and the likes) might
> be expensive to deliver in terms of small packet performance.
> If the features are not enabled by default, enabling them at
> short notice (for example, when a flow rule with action MARK
> gets created) without traffic disruption may not be possible.
>
> Letting applications request delivery of Rx meta information
> during initialisation can solve the said problem efficiently.
>
> Technically, that could be accomplished by defining new bits
> in DEV_RX_OFFLOAD namespace, but the ability to extract meta
> data cannot be considered an offload on its own. For example,
> Rx checksumming is an offload, while mark delivery is not as
> it needs an external party, a flow rule with action MARK, to
> hold the value and trigger mark insertion in the first place.
>
> With this in mind, add a means to let applications negotiate
> adapter support for the very delivery of Rx meta information.
>
> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

Acked-by: Jerin Jacob <jerinj@marvell.com>


> ---
>  lib/ethdev/ethdev_driver.h | 19 +++++++++++
>  lib/ethdev/rte_ethdev.c    | 25 +++++++++++++++
>  lib/ethdev/rte_ethdev.h    | 66 ++++++++++++++++++++++++++++++++++++++
>  lib/ethdev/version.map     |  3 ++
>  4 files changed, 113 insertions(+)
>
> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index 40e474aa7e..3e29555fc7 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
>  typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
>         struct rte_eth_representor_info *info);
>
> +/**
> + * @internal
> + * Negotiate support for specific fractions of Rx meta information.
> + *
> + * @param[in] dev
> + *   Port (ethdev) handle
> + *
> + * @param[inout] features
> + *   Feature selection buffer
> + *
> + * @return
> + *   Negative errno value on error, zero otherwise
> + */
> +typedef int (*eth_negotiate_rx_meta_t)(struct rte_eth_dev *dev,
> +                                      uint64_t *features);
> +
>  /**
>   * @internal A structure containing the functions exported by an Ethernet driver.
>   */
> @@ -949,6 +965,9 @@ struct eth_dev_ops {
>
>         eth_representor_info_get_t representor_info_get;
>         /**< Get representor info. */
> +
> +       eth_negotiate_rx_meta_t negotiate_rx_meta;
> +       /**< Negotiate support for specific fractions of Rx meta information. */
>  };
>
>  /**
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index 9d95cd11e1..821450cbf9 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -6311,6 +6311,31 @@ rte_eth_representor_info_get(uint16_t port_id,
>         return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
>  }
>
> +int
> +rte_eth_negotiate_rx_meta(uint16_t port_id, uint64_t *features)
> +{
> +       struct rte_eth_dev *dev;
> +
> +       RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +       dev = &rte_eth_devices[port_id];
> +
> +       if (dev->data->dev_configured != 0) {
> +               RTE_ETHDEV_LOG(ERR,
> +                       "The port (id=%"PRIu16") is already configured\n",
> +                       port_id);
> +               return -EBUSY;
> +       }
> +
> +       if (features == NULL) {
> +               RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
> +               return -EINVAL;
> +       }
> +
> +       RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->negotiate_rx_meta, -ENOTSUP);
> +       return eth_err(port_id,
> +                      (*dev->dev_ops->negotiate_rx_meta)(dev, features));
> +}
> +
>  RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
>
>  RTE_INIT(ethdev_init_telemetry)
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index d2b27c351f..ac4d164aa8 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -4888,6 +4888,72 @@ __rte_experimental
>  int rte_eth_representor_info_get(uint16_t port_id,
>                                  struct rte_eth_representor_info *info);
>
> +/**
> + * The ethdev will be able to detect flagged packets provided that
> + * there are active flow rules comprising the corresponding action.
> + */
> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
> +
> +/**
> + * The ethdev will manage to see mark IDs in packets provided that
> + * there are active flow rules comprising the corresponding action.
> + */
> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
> +
> +/**
> + * The ethdev will be able to identify partially offloaded packets
> + * and process rte_flow_get_restore_info() invocations accordingly
> + * provided that there're so-called "tunnel_set" flow rules in use.
> + */
> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Negotiate support for specific fractions of Rx meta information.
> + *
> + * This function has to be invoked as early as possible, precisely,
> + * before the first rte_eth_dev_configure() invocation, to let the
> + * PMD make preparations which might be hard to do on later stages.
> + *
> + * The negotiation process is assumed to be carried out as follows:
> + *
> + * - the application composes a mask of preferred Rx meta features
> + *   intending to enable at least some of them and invokes the API;
> + *
> + * - the ethdev driver reports back the optimal (from its point of
> + *   view) subset of the initial feature set thus agreeing to make
> + *   those comprising the subset simultaneously available to users;
> + *
> + * - should the application find the result unsatisfactory, it may
> + *   come up with another pick of preferred features and try again;
> + *
> + * - the application can pass zero to clear the negotiation result;
> + *
> + * - the last negotiated result takes effect upon the ethdev start.
> + *
> + * If the method itself is unsupported by the PMD, the application
> + * may just ignore that and proceed with the rest of configuration
> + * procedure intending to simply try using the features it prefers.
> + *
> + * @param[in] port_id
> + *   Port (ethdev) identifier
> + *
> + * @param[inout] features
> + *   Feature selection buffer
> + *
> + * @return
> + *   - (-EBUSY) if the port can't handle this in its current state;
> + *   - (-ENOTSUP) if the method itself is not supported by the PMD;
> + *   - (-ENODEV) if *port_id* is invalid;
> + *   - (-EINVAL) if *features* is NULL;
> + *   - (-EIO) if the device is removed;
> + *   - (0) on success
> + */
> +__rte_experimental
> +int rte_eth_negotiate_rx_meta(uint16_t port_id, uint64_t *features);
> +
>  #include <rte_ethdev_core.h>
>
>  /**
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index 3eece75b72..e390e5718c 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -249,6 +249,9 @@ EXPERIMENTAL {
>         rte_mtr_meter_policy_delete;
>         rte_mtr_meter_policy_update;
>         rte_mtr_meter_policy_validate;
> +
> +       # added in 21.11
> +       rte_eth_negotiate_rx_meta;
>  };
>
>  INTERNAL {
> --
> 2.20.1
>


* [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-09-02 14:23 [dpdk-dev] [PATCH 0/5] A means to negotiate support for Rx meta information Ivan Malov
                   ` (5 preceding siblings ...)
  2021-09-03  0:15 ` [dpdk-dev] [PATCH v2 0/5] A means to negotiate support for Rx meta information Ivan Malov
@ 2021-09-23 11:20 ` Ivan Malov
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 1/5] ethdev: add API " Ivan Malov
                     ` (9 more replies)
  6 siblings, 10 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-23 11:20 UTC (permalink / raw)
  To: dev; +Cc: Andy Moreton

In 2019, commit [1] announced changes in the DEV_RX_OFFLOAD namespace
intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
only the former has been added, so the problem remains unsolved.
Applications still assume that no effort is needed to enable
delivery of flow mark and similar meta data.

The team behind the net/sfc driver has to take over the effort since
the problem has started impacting us. Riverhead, a cutting-edge
Xilinx smart NIC family, has two Rx prefix types. Rx meta data
is available only from the long Rx prefix. Switching between the
prefix formats can't happen in the started state. Hence, we run
into the same problem which [1] was aiming to solve.

Rx meta data (mark, flag, tunnel ID) delivery is not an offload
on its own since the corresponding flows must be active to set
the data in the first place. Hence, adding offload flags
similar to RSS_HASH is not a good idea.

Patch [1/5] of this series adds a generic API to let applications
negotiate delivery of Rx meta data during the initialisation period.
This way, an application knows right from the start which parts
of Rx meta data won't be delivered. Hence, there is no need to try
inserting flows that request such data and then handle the failures.
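
For illustration only, the application-side negotiation could be
sketched as below. This is not code from the series. The function
mock_rx_meta_negotiate() is a hypothetical stand-in for
rte_eth_rx_meta_negotiate() so that the sketch builds without DPDK,
and app_negotiate_rx_meta() is an assumed helper name. The bit
values are copied from patch [1/5].

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit values copied from patch [1/5] so the sketch builds without DPDK. */
#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/*
 * Hypothetical stand-in for rte_eth_rx_meta_negotiate().
 * Assumption: this pretend PMD can deliver only FLAG and MARK,
 * and port 1 belongs to a driver without the callback.
 */
static int
mock_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
{
	const uint64_t supported = RTE_ETH_RX_META_USER_FLAG |
				   RTE_ETH_RX_META_USER_MARK;

	if (features == NULL)
		return -EINVAL;

	if (port_id == 1)
		return -ENOTSUP;

	/* Report back the subset of the request that can be delivered. */
	*features &= supported;
	return 0;
}

/*
 * Hypothetical application helper: request the wanted features and
 * store what the driver agreed to deliver. Returns 0 or negative errno.
 */
static int
app_negotiate_rx_meta(uint16_t port_id, uint64_t wanted, uint64_t *granted)
{
	uint64_t features = wanted;
	int ret;

	ret = mock_rx_meta_negotiate(port_id, &features);
	if (ret == -ENOTSUP) {
		/*
		 * Nothing was negotiated, so nothing is guaranteed;
		 * the application just tries the features it prefers.
		 */
		*granted = wanted;
		return 0;
	}
	if (ret != 0)
		return ret;

	if ((wanted & ~features & RTE_ETH_RX_META_USER_MARK) != 0)
		printf("Flow action MARK will not affect Rx mbufs on port %u\n",
		       (unsigned int)port_id);

	*granted = features;
	return 0;
}
```

A real application would call rte_eth_rx_meta_negotiate() before the
first rte_eth_dev_configure() invocation and treat -ENOTSUP as
non-fatal, as the testpmd change in [1/5] does.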

Major clients of the flow library like Open vSwitch will have
to be patched separately to benefit from the new approach.

[1] c5b2e78d1172 ("doc: announce ethdev API changes in offload flags")

Changes in v2:
* [1/5] has review notes from Jerin Jacob applied and the ack from Ray Kinsella added
* [2/5] has minor adjustments incorporated to follow changes in [1/5]

Changes in v3:
* [1/5] through [5/5] have review notes from Andy Moreton applied (mostly rewording)
* [1/5] has the ack from Jerin Jacob added

Ivan Malov (5):
  ethdev: add API to negotiate delivery of Rx meta data
  net/sfc: support API to negotiate delivery of Rx meta data
  net/sfc: support flow mark delivery on EF100 native datapath
  common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  net/sfc: report user flag on EF100 native datapath

 app/test-flow-perf/main.c              | 21 ++++++++++
 app/test-pmd/testpmd.c                 | 26 +++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  9 +++++
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 drivers/net/sfc/sfc.h                  |  2 +
 drivers/net/sfc/sfc_ef100_rx.c         | 19 +++++++++
 drivers/net/sfc/sfc_ethdev.c           | 29 ++++++++++++++
 drivers/net/sfc/sfc_flow.c             | 11 ++++++
 drivers/net/sfc/sfc_mae.c              | 22 ++++++++++-
 drivers/net/sfc/sfc_rx.c               |  6 +++
 lib/ethdev/ethdev_driver.h             | 19 +++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++++
 lib/ethdev/rte_ethdev.h                | 45 +++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 12 ++++++
 lib/ethdev/version.map                 |  3 ++
 17 files changed, 287 insertions(+), 23 deletions(-)

-- 
2.20.1



* [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
@ 2021-09-23 11:20   ` Ivan Malov
  2021-09-30 14:59     ` Ori Kam
  2021-09-30 21:48     ` Ajit Khaparde
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 2/5] net/sfc: support " Ivan Malov
                     ` (8 subsequent siblings)
  9 siblings, 2 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-23 11:20 UTC (permalink / raw)
  To: dev
  Cc: Andy Moreton, Andrew Rybchenko, Ray Kinsella, Jerin Jacob,
	Wisam Jaddo, Xiaoyun Li, Thomas Monjalon, Ferruh Yigit, Ori Kam

Delivery of mark, flag and the like might affect small packet performance.
If these features are disabled by default, enabling them in started state
without causing traffic disruption may not always be possible.

Let applications negotiate delivery of Rx meta data beforehand.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
---
 app/test-flow-perf/main.c              | 21 ++++++++++++
 app/test-pmd/testpmd.c                 | 26 +++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  9 ++++++
 lib/ethdev/ethdev_driver.h             | 19 +++++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++++++
 lib/ethdev/rte_ethdev.h                | 45 ++++++++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 12 +++++++
 lib/ethdev/version.map                 |  3 ++
 8 files changed, 160 insertions(+)

diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
index 9be8edc31d..48eafffb1d 100644
--- a/app/test-flow-perf/main.c
+++ b/app/test-flow-perf/main.c
@@ -1760,6 +1760,27 @@ init_port(void)
 		rte_exit(EXIT_FAILURE, "Error: can't init mbuf pool\n");
 
 	for (port_id = 0; port_id < nr_ports; port_id++) {
+		uint64_t rx_meta_features = 0;
+
+		rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
+		rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
+
+		ret = rte_eth_rx_meta_negotiate(port_id, &rx_meta_features);
+		if (ret == 0) {
+			if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
+				printf(":: flow action FLAG will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+
+			if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
+				printf(":: flow action MARK will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+		} else if (ret != -ENOTSUP) {
+			rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port=%u: %s\n",
+				 port_id, rte_strerror(-ret));
+		}
+
 		ret = rte_eth_dev_info_get(port_id, &dev_info);
 		if (ret != 0)
 			rte_exit(EXIT_FAILURE,
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 97ae52e17e..7a8da3d7ab 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1485,10 +1485,36 @@ static void
 init_config_port_offloads(portid_t pid, uint32_t socket_id)
 {
 	struct rte_port *port = &ports[pid];
+	uint64_t rx_meta_features = 0;
 	uint16_t data_size;
 	int ret;
 	int i;
 
+	rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
+	rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
+	rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
+
+	ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
+	if (ret == 0) {
+		if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
+			TESTPMD_LOG(INFO, "Flow action FLAG will not affect Rx mbufs on port %u\n",
+				    pid);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
+			TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx mbufs on port %u\n",
+				    pid);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
+			TESTPMD_LOG(INFO, "Flow tunnel offload support might be limited or unavailable on port %u\n",
+				    pid);
+		}
+	} else if (ret != -ENOTSUP) {
+		rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port %u: %s\n",
+			 pid, rte_strerror(-ret));
+	}
+
 	port->dev_conf.txmode = tx_mode;
 	port->dev_conf.rxmode = rx_mode;
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 19356ac53c..6674d4474c 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -106,6 +106,15 @@ New Features
   Added command-line options to specify total number of processes and
   current process ID. Each process owns subset of Rx and Tx queues.
 
+* **Added an API to negotiate delivery of specific parts of Rx meta data**
+
+  A new API, ``rte_eth_rx_meta_negotiate()``, was added.
+  The following parts of Rx meta data were defined:
+
+  * ``RTE_ETH_RX_META_USER_FLAG``
+  * ``RTE_ETH_RX_META_USER_MARK``
+  * ``RTE_ETH_RX_META_TUNNEL_ID``
+
 
 Removed Items
 -------------
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 40e474aa7e..96e0c60cae 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
 typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
 	struct rte_eth_representor_info *info);
 
+/**
+ * @internal
+ * Negotiate delivery of specific parts of Rx meta data.
+ *
+ * @param dev
+ *   Port (ethdev) handle
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   Negative errno value on error, zero otherwise
+ */
+typedef int (*eth_rx_meta_negotiate_t)(struct rte_eth_dev *dev,
+				       uint64_t *features);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -949,6 +965,9 @@ struct eth_dev_ops {
 
 	eth_representor_info_get_t representor_info_get;
 	/**< Get representor info. */
+
+	eth_rx_meta_negotiate_t rx_meta_negotiate;
+	/**< Negotiate delivery of specific parts of Rx meta data. */
 };
 
 /**
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index daf5ca9242..49cb84d64c 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t port_id,
 	return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
 }
 
+int
+rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (dev->data->dev_configured != 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"The port (id=%"PRIu16") is already configured\n",
+			port_id);
+		return -EBUSY;
+	}
+
+	if (features == NULL) {
+		RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_meta_negotiate, -ENOTSUP);
+	return eth_err(port_id,
+		       (*dev->dev_ops->rx_meta_negotiate)(dev, features));
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 1da37896d8..8467a7a362 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -4888,6 +4888,51 @@ __rte_experimental
 int rte_eth_representor_info_get(uint16_t port_id,
 				 struct rte_eth_representor_info *info);
 
+/** The ethdev sees flagged packets if there are flows with action FLAG. */
+#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
+
+/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
+#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
+
+/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
+#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Negotiate delivery of specific parts of Rx meta data.
+ *
+ * Invoke this API before the first rte_eth_dev_configure() invocation
+ * to let the PMD make preparations that are inconvenient to do later.
+ *
+ * The negotiation process is as follows:
+ *
+ * - the application requests features intending to use at least some of them;
+ * - the PMD responds with the guaranteed subset of the requested feature set;
+ * - the application can retry negotiation with another set of features;
+ * - the application can pass zero to clear the negotiation result;
+ * - the last negotiated result takes effect upon the ethdev start.
+ *
+ * If this API is unsupported, the application should gracefully ignore that.
+ *
+ * @param port_id
+ *   Port (ethdev) identifier
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   - (-EBUSY) if the port can't handle this in its current state;
+ *   - (-ENOTSUP) if the method itself is not supported by the PMD;
+ *   - (-ENODEV) if *port_id* is invalid;
+ *   - (-EINVAL) if *features* is NULL;
+ *   - (-EIO) if the device is removed;
+ *   - (0) on success
+ */
+__rte_experimental
+int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features);
+
 #include <rte_ethdev_core.h>
 
 /**
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 70f455d47d..6eb7ec0574 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -1904,6 +1904,10 @@ enum rte_flow_action_type {
 	 * PKT_RX_FDIR_ID mbuf flags.
 	 *
 	 * See struct rte_flow_action_mark.
+	 *
+	 * One should negotiate delivery of mark IDs beforehand.
+	 * @see rte_eth_rx_meta_negotiate()
+	 * @see RTE_ETH_RX_META_USER_MARK
 	 */
 	RTE_FLOW_ACTION_TYPE_MARK,
 
@@ -1912,6 +1916,10 @@ enum rte_flow_action_type {
 	 * sets the PKT_RX_FDIR mbuf flag.
 	 *
 	 * No associated configuration structure.
+	 *
+	 * One should negotiate flag delivery beforehand.
+	 * @see rte_eth_rx_meta_negotiate()
+	 * @see RTE_ETH_RX_META_USER_FLAG
 	 */
 	RTE_FLOW_ACTION_TYPE_FLAG,
 
@@ -4223,6 +4231,10 @@ rte_flow_tunnel_match(uint16_t port_id,
 /**
  * Populate the current packet processing state, if exists, for the given mbuf.
  *
+ * One should negotiate the processing state information delivery beforehand.
+ * @see rte_eth_rx_meta_negotiate()
+ * @see RTE_ETH_RX_META_TUNNEL_ID
+ *
  * @param port_id
  *   Port identifier of Ethernet device.
  * @param[in] m
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 904bce6ea1..a8928266a9 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -247,6 +247,9 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_eth_rx_meta_negotiate;
 };
 
 INTERNAL {
-- 
2.20.1
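
The negotiation rules documented in the rte_ethdev.h hunk above (the PMD
answers with a subset of the request, zero clears the result, the last
result wins, and nothing can be negotiated once the port is configured)
can be sketched without DPDK as follows. This is only an illustration
under stated assumptions: struct mock_port and mock_negotiate() are
hypothetical names, not part of ethdev; the flag values are copied from
the patch, and the order of the checks mirrors rte_eth_rx_meta_negotiate()
as posted here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values copied from the rte_ethdev.h hunk of this patch. */
#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical port model: only the state the negotiation rules touch. */
struct mock_port {
	uint64_t supported;	/* what the Rx datapath is able to deliver */
	uint64_t negotiated;	/* result of the last successful negotiation */
	int configured;		/* non-zero once the port has been configured */
};

/*
 * Hypothetical stand-in for the ethdev wrapper plus a PMD callback.
 * On success, *features is narrowed to the subset the port guarantees.
 */
static int
mock_negotiate(struct mock_port *port, uint64_t *features)
{
	/* Negotiation is only allowed before the first configure call. */
	if (port->configured != 0)
		return -EBUSY;

	if (features == NULL)
		return -EINVAL;

	/* The last call wins; requesting zero clears the stored result. */
	port->negotiated = port->supported & *features;
	*features = port->negotiated;

	return 0;
}
```

A caller would request FLAG and MARK, inspect the returned mask, and
optionally retry with a different set before configuring the port.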


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v3 2/5] net/sfc: support API to negotiate delivery of Rx meta data
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 1/5] ethdev: add API " Ivan Malov
@ 2021-09-23 11:20   ` Ivan Malov
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-23 11:20 UTC (permalink / raw)
  To: dev; +Cc: Andy Moreton, Andrew Rybchenko

Initial support for the method. Later patches will extend it to
make FLAG and MARK delivery available on EF100 native datapath.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc.h        |  2 ++
 drivers/net/sfc/sfc_ethdev.c | 29 +++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_flow.c   | 11 +++++++++++
 drivers/net/sfc/sfc_mae.c    | 22 ++++++++++++++++++++--
 4 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 331e06bac6..2812d76cbb 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -312,6 +312,8 @@ struct sfc_adapter {
 	boolean_t			tso;
 	boolean_t			tso_encap;
 
+	uint64_t			negotiated_rx_meta;
+
 	uint32_t			rxd_wait_timeout_ns;
 };
 
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 2db0d000c3..75c3da2e52 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1859,6 +1859,28 @@ sfc_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t ethdev_qid)
 	return sap->dp_rx->intr_disable(rxq_info->dp);
 }
 
+static int
+sfc_rx_meta_negotiate(struct rte_eth_dev *dev, uint64_t *features)
+{
+	struct sfc_adapter *sa = sfc_adapter_by_eth_dev(dev);
+	uint64_t supported = 0;
+
+	sfc_adapter_lock(sa);
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_FLAG) != 0)
+		supported |= RTE_ETH_RX_META_USER_FLAG;
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_MARK) != 0)
+		supported |= RTE_ETH_RX_META_USER_MARK;
+
+	sa->negotiated_rx_meta = supported & *features;
+	*features = sa->negotiated_rx_meta;
+
+	sfc_adapter_unlock(sa);
+
+	return 0;
+}
+
 static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.dev_configure			= sfc_dev_configure,
 	.dev_start			= sfc_dev_start,
@@ -1906,6 +1928,7 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.xstats_get_by_id		= sfc_xstats_get_by_id,
 	.xstats_get_names_by_id		= sfc_xstats_get_names_by_id,
 	.pool_ops_supported		= sfc_pool_ops_supported,
+	.rx_meta_negotiate		= sfc_rx_meta_negotiate,
 };
 
 /**
@@ -1998,6 +2021,12 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 		goto fail_dp_rx_name;
 	}
 
+	if (strcmp(dp_rx->dp.name, SFC_KVARG_DATAPATH_EF10_ESSB) == 0) {
+		/* FLAG and MARK are always available from Rx prefix. */
+		sa->negotiated_rx_meta |= RTE_ETH_RX_META_USER_FLAG;
+		sa->negotiated_rx_meta |= RTE_ETH_RX_META_USER_MARK;
+	}
+
 	sfc_notice(sa, "use %s Rx datapath", sas->dp_rx_name);
 
 	rc = sfc_kvargs_process(sa, SFC_KVARG_TX_DATAPATH,
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 4f5993a68d..57cf1ad02b 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -1760,6 +1760,7 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 	struct sfc_flow_spec *spec = &flow->spec;
 	struct sfc_flow_spec_filter *spec_filter = &spec->filter;
 	const unsigned int dp_rx_features = sa->priv.dp_rx->features;
+	const uint64_t rx_meta = sa->negotiated_rx_meta;
 	uint32_t actions_set = 0;
 	const uint32_t fate_actions_mask = (1UL << RTE_FLOW_ACTION_TYPE_QUEUE) |
 					   (1UL << RTE_FLOW_ACTION_TYPE_RSS) |
@@ -1832,6 +1833,11 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
 					"FLAG action is not supported on the current Rx datapath");
 				return -rte_errno;
+			} else if ((rx_meta & RTE_ETH_RX_META_USER_FLAG) == 0) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					"flag delivery has not been negotiated");
+				return -rte_errno;
 			}
 
 			spec_filter->template.efs_flags |=
@@ -1849,6 +1855,11 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
 					"MARK action is not supported on the current Rx datapath");
 				return -rte_errno;
+			} else if ((rx_meta & RTE_ETH_RX_META_USER_MARK) == 0) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					"mark delivery has not been negotiated");
+				return -rte_errno;
 			}
 
 			rc = sfc_flow_parse_mark(sa, actions->conf, flow);
diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index 4b520bc619..5ecad7347a 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -2963,6 +2963,7 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 			  efx_mae_actions_t *spec,
 			  struct rte_flow_error *error)
 {
+	const uint64_t rx_meta = sa->negotiated_rx_meta;
 	bool custom_error = B_FALSE;
 	int rc = 0;
 
@@ -3012,12 +3013,29 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 	case RTE_FLOW_ACTION_TYPE_FLAG:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_FLAG,
 				       bundle->actions_mask);
-		rc = efx_mae_action_set_populate_flag(spec);
+		if ((rx_meta & RTE_ETH_RX_META_USER_FLAG) != 0) {
+			rc = efx_mae_action_set_populate_flag(spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"flag delivery has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_MARK:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_MARK,
 				       bundle->actions_mask);
-		rc = sfc_mae_rule_parse_action_mark(sa, action->conf, spec);
+		if ((rx_meta & RTE_ETH_RX_META_USER_MARK) != 0) {
+			rc = sfc_mae_rule_parse_action_mark(sa, action->conf,
+							    spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"mark delivery has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_PHY_PORT:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_PHY_PORT,
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v3 3/5] net/sfc: support flow mark delivery on EF100 native datapath
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 1/5] ethdev: add API " Ivan Malov
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 2/5] net/sfc: support " Ivan Malov
@ 2021-09-23 11:20   ` Ivan Malov
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-23 11:20 UTC (permalink / raw)
  To: dev; +Cc: Andy Moreton, Andrew Rybchenko

MAE counter engine gets generation counts by virtue of the mark,
so the code to extract the field is already in place, but flow
action MARK doesn't benefit from it. Support this use case, too.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc_ef100_rx.c | 1 +
 drivers/net/sfc/sfc_rx.c       | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index 1bf04f565a..b634c8f23a 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -914,6 +914,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR,
 	.dev_offload_capa	= 0,
 	.queue_offload_capa	= DEV_RX_OFFLOAD_CHECKSUM |
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 280e8a61f9..c1acd2ed99 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_meta & RTE_ETH_RX_META_USER_MARK) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
+
 	rc = sfc_ev_qinit(sa, SFC_EVQ_TYPE_RX, sw_index,
 			  evq_entries, socket_id, &evq);
 	if (rc != 0)
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v3 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
                     ` (2 preceding siblings ...)
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
@ 2021-09-23 11:20   ` Ivan Malov
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-23 11:20 UTC (permalink / raw)
  To: dev; +Cc: Andy Moreton, Andrew Rybchenko

Add an RxQ flag to request support for user flag field of Rx
prefix. The feature is supported only on EF100 and EF10 ESSB.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 3 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/common/sfc_efx/base/ef10_rx.c b/drivers/common/sfc_efx/base/ef10_rx.c
index 0c3f9413cf..a658e0dba2 100644
--- a/drivers/common/sfc_efx/base/ef10_rx.c
+++ b/drivers/common/sfc_efx/base/ef10_rx.c
@@ -930,6 +930,10 @@ ef10_rx_qcreate(
 			rc = ENOTSUP;
 			goto fail2;
 		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail3;
+		}
 		/*
 		 * Ignore EFX_RXQ_FLAG_RSS_HASH since if RSS hash is calculated
 		 * it is always delivered from HW in the pseudo-header.
@@ -940,7 +944,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_packed_stream_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail3;
+			goto fail4;
 		}
 		switch (type_data->ertd_packed_stream.eps_buf_size) {
 		case EFX_RXQ_PACKED_STREAM_BUF_SIZE_1M:
@@ -960,17 +964,21 @@ ef10_rx_qcreate(
 			break;
 		default:
 			rc = ENOTSUP;
-			goto fail4;
+			goto fail5;
 		}
 		erp->er_buf_size = type_data->ertd_packed_stream.eps_buf_size;
 		/* Packed stream pseudo header does not have RSS hash value */
 		if (flags & EFX_RXQ_FLAG_RSS_HASH) {
 			rc = ENOTSUP;
-			goto fail5;
+			goto fail6;
 		}
 		if (flags & EFX_RXQ_FLAG_USER_MARK) {
 			rc = ENOTSUP;
-			goto fail6;
+			goto fail7;
+		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail8;
 		}
 		break;
 #endif /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -979,7 +987,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_essb_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail7;
+			goto fail9;
 		}
 		params.es_bufs_per_desc =
 		    type_data->ertd_es_super_buffer.eessb_bufs_per_desc;
@@ -997,7 +1005,7 @@ ef10_rx_qcreate(
 #endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
 	default:
 		rc = ENOTSUP;
-		goto fail8;
+		goto fail10;
 	}
 
 #if EFSYS_OPT_RX_PACKED_STREAM
@@ -1005,13 +1013,13 @@ ef10_rx_qcreate(
 		/* Check if datapath firmware supports packed stream mode */
 		if (encp->enc_rx_packed_stream_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail9;
+			goto fail11;
 		}
 		/* Check if packed stream allows configurable buffer sizes */
 		if ((params.ps_buf_size != MC_CMD_INIT_RXQ_EXT_IN_PS_BUFF_1M) &&
 		    (encp->enc_rx_var_packed_stream_supported == B_FALSE)) {
 			rc = ENOTSUP;
-			goto fail10;
+			goto fail12;
 		}
 	}
 #else /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -1022,17 +1030,17 @@ ef10_rx_qcreate(
 	if (params.es_bufs_per_desc > 0) {
 		if (encp->enc_rx_es_super_buffer_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail11;
+			goto fail13;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_max_dma_len,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail12;
+			goto fail14;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_buf_stride,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail13;
+			goto fail15;
 		}
 	}
 #else /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
@@ -1041,7 +1049,7 @@ ef10_rx_qcreate(
 
 	if (flags & EFX_RXQ_FLAG_INGRESS_MPORT) {
 		rc = ENOTSUP;
-		goto fail14;
+		goto fail16;
 	}
 
 	/* Scatter can only be disabled if the firmware supports doing so */
@@ -1057,7 +1065,7 @@ ef10_rx_qcreate(
 
 	if ((rc = efx_mcdi_init_rxq(enp, ndescs, eep, label, index,
 		    esmp, &params)) != 0)
-		goto fail15;
+		goto fail17;
 
 	erp->er_eep = eep;
 	erp->er_label = label;
@@ -1070,40 +1078,44 @@ ef10_rx_qcreate(
 
 	return (0);
 
+fail17:
+	EFSYS_PROBE(fail17);
+fail16:
+	EFSYS_PROBE(fail16);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail15:
 	EFSYS_PROBE(fail15);
 fail14:
 	EFSYS_PROBE(fail14);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail13:
 	EFSYS_PROBE(fail13);
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail12:
 	EFSYS_PROBE(fail12);
 fail11:
 	EFSYS_PROBE(fail11);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail10:
 	EFSYS_PROBE(fail10);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail9:
 	EFSYS_PROBE(fail9);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail8:
 	EFSYS_PROBE(fail8);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail7:
 	EFSYS_PROBE(fail7);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
 fail6:
 	EFSYS_PROBE(fail6);
 fail5:
 	EFSYS_PROBE(fail5);
 fail4:
 	EFSYS_PROBE(fail4);
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail3:
 	EFSYS_PROBE(fail3);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail2:
 	EFSYS_PROBE(fail2);
 fail1:
diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h
index 24e1314cc3..bed1029f59 100644
--- a/drivers/common/sfc_efx/base/efx.h
+++ b/drivers/common/sfc_efx/base/efx.h
@@ -3007,6 +3007,10 @@ typedef enum efx_rxq_type_e {
  * Request user mark field in the Rx prefix of a queue.
  */
 #define	EFX_RXQ_FLAG_USER_MARK		0x10
+/*
+ * Request user flag field in the Rx prefix of a queue.
+ */
+#define	EFX_RXQ_FLAG_USER_FLAG		0x20
 
 LIBEFX_API
 extern	__checkReturn	efx_rc_t
diff --git a/drivers/common/sfc_efx/base/rhead_rx.c b/drivers/common/sfc_efx/base/rhead_rx.c
index 76b8ce302a..9d3258b503 100644
--- a/drivers/common/sfc_efx/base/rhead_rx.c
+++ b/drivers/common/sfc_efx/base/rhead_rx.c
@@ -635,6 +635,9 @@ rhead_rx_qcreate(
 	if (flags & EFX_RXQ_FLAG_USER_MARK)
 		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_MARK;
 
+	if (flags & EFX_RXQ_FLAG_USER_FLAG)
+		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_FLAG;
+
 	/*
 	 * LENGTH is required in EF100 host interface, as receive events
 	 * do not include the packet length.
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v3 5/5] net/sfc: report user flag on EF100 native datapath
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
                     ` (3 preceding siblings ...)
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
@ 2021-09-23 11:20   ` Ivan Malov
  2021-09-30 16:18   ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Thomas Monjalon
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-09-23 11:20 UTC (permalink / raw)
  To: dev; +Cc: Andy Moreton, Andrew Rybchenko

Detect the flag in Rx prefix and pass it to users.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc_ef100_rx.c | 18 ++++++++++++++++++
 drivers/net/sfc/sfc_rx.c       |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index b634c8f23a..7d0d6b3d00 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -62,6 +62,7 @@ struct sfc_ef100_rxq {
 #define SFC_EF100_RXQ_RSS_HASH		0x10
 #define SFC_EF100_RXQ_USER_MARK		0x20
 #define SFC_EF100_RXQ_FLAG_INTR_EN	0x40
+#define SFC_EF100_RXQ_USER_FLAG		0x80
 	unsigned int			ptr_mask;
 	unsigned int			evq_phase_bit_shift;
 	unsigned int			ready_pkts;
@@ -371,6 +372,7 @@ static const efx_rx_prefix_layout_t sfc_ef100_rx_prefix_layout = {
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH_VALID, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(CLASS, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH, B_FALSE),
+		SFC_EF100_RX_PREFIX_FIELD(USER_FLAG, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(USER_MARK, B_FALSE),
 
 #undef	SFC_EF100_RX_PREFIX_FIELD
@@ -407,6 +409,15 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
 					      ESF_GZ_RX_PREFIX_RSS_HASH);
 	}
 
+	if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
+		uint32_t user_flag;
+
+		user_flag = EFX_OWORD_FIELD(rx_prefix[0],
+					    ESF_GZ_RX_PREFIX_USER_FLAG);
+		if (user_flag != 0)
+			ol_flags |= PKT_RX_FDIR;
+	}
+
 	if (rxq->flags & SFC_EF100_RXQ_USER_MARK) {
 		uint32_t user_mark;
 
@@ -800,6 +811,12 @@ sfc_ef100_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr,
 	else
 		rxq->flags &= ~SFC_EF100_RXQ_RSS_HASH;
 
+	if ((unsup_rx_prefix_fields &
+	     (1U << EFX_RX_PREFIX_FIELD_USER_FLAG)) == 0)
+		rxq->flags |= SFC_EF100_RXQ_USER_FLAG;
+	else
+		rxq->flags &= ~SFC_EF100_RXQ_USER_FLAG;
+
 	if ((unsup_rx_prefix_fields &
 	     (1U << EFX_RX_PREFIX_FIELD_USER_MARK)) == 0)
 		rxq->flags |= SFC_EF100_RXQ_USER_MARK;
@@ -914,6 +931,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_FLAG |
 				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR,
 	.dev_offload_capa	= 0,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index c1acd2ed99..a3331c5089 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_meta & RTE_ETH_RX_META_USER_FLAG) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_FLAG;
+
 	if ((sa->negotiated_rx_meta & RTE_ETH_RX_META_USER_MARK) != 0)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
 
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 1/5] ethdev: add API " Ivan Malov
@ 2021-09-30 14:59     ` Ori Kam
  2021-09-30 15:07       ` Andrew Rybchenko
  2021-09-30 19:07       ` Ivan Malov
  2021-09-30 21:48     ` Ajit Khaparde
  1 sibling, 2 replies; 97+ messages in thread
From: Ori Kam @ 2021-09-30 14:59 UTC (permalink / raw)
  To: Ivan Malov, dev
  Cc: Andy Moreton, Andrew Rybchenko, Ray Kinsella, Jerin Jacob,
	Wisam Monther, Xiaoyun Li, NBU-Contact-Thomas Monjalon,
	Ferruh Yigit

Hi Ivan,
Sorry for jumping in late.

I have a concern that this patch breaks other PMDs.
The documentation now says "One should negotiate flag delivery beforehand",
but since you only added this function for your PMD, all other PMDs will fail.
I see that you added an exception in the examples, but it doesn't make sense
that applications will also need to add this exception, which is not documented.
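
For reference, the exception in question could look roughly like the
sketch below on the application side. It is only an illustration, not
code from the patch: mock_rx_meta_negotiate() is a hypothetical stand-in
for rte_eth_rx_meta_negotiate() so that the snippet builds without DPDK,
app_negotiate_rx_meta() is an invented helper name, and the flag values
are the ones defined in the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Flag values copied from the quoted rte_ethdev.h hunk. */
#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)

/*
 * Hypothetical stand-in for rte_eth_rx_meta_negotiate(). Odd port IDs
 * model a PMD that does not implement the callback; even port IDs model
 * a PMD that can only deliver MARK.
 */
static int
mock_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
{
	if (features == NULL)
		return -EINVAL;
	if (port_id % 2 != 0)
		return -ENOTSUP;

	*features &= RTE_ETH_RX_META_USER_MARK;
	return 0;
}

/*
 * Invented application helper. Returns 0 if the port may proceed to
 * configuration and a negative errno on a genuine failure. *negotiated
 * tells the caller whether *features holds a PMD-confirmed subset.
 */
static int
app_negotiate_rx_meta(uint16_t port_id, uint64_t *features, int *negotiated)
{
	int ret = mock_rx_meta_negotiate(port_id, features);

	*negotiated = (ret == 0);

	/* A PMD without the callback is not an error for the application. */
	if (ret == -ENOTSUP)
		return 0;

	if (ret == 0 && (*features & RTE_ETH_RX_META_USER_FLAG) == 0)
		printf("FLAG will not affect Rx mbufs on port %u\n",
		       (unsigned int)port_id);

	return ret;
}
```

Every application that wants FLAG or MARK to keep working across PMDs
would need this -ENOTSUP branch.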

Please see more comments inline.

Thanks,
Ori

> -----Original Message-----
> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> Sent: Thursday, September 23, 2021 2:20 PM
> Subject: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
> data
> 
> Delivery of mark, flag and the likes might affect small packet performance.
> If these features are disabled by default, enabling them in started state
> without causing traffic disruption may not always be possible.
> 
> Let applications negotiate delivery of Rx meta data beforehand.
> 
> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> Acked-by: Jerin Jacob <jerinj@marvell.com>
> ---
>  app/test-flow-perf/main.c              | 21 ++++++++++++
>  app/test-pmd/testpmd.c                 | 26 +++++++++++++++
>  doc/guides/rel_notes/release_21_11.rst |  9 ++++++
>  lib/ethdev/ethdev_driver.h             | 19 +++++++++++
>  lib/ethdev/rte_ethdev.c                | 25 ++++++++++++++
>  lib/ethdev/rte_ethdev.h                | 45 ++++++++++++++++++++++++++
>  lib/ethdev/rte_flow.h                  | 12 +++++++
>  lib/ethdev/version.map                 |  3 ++
>  8 files changed, 160 insertions(+)
> 
> diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
> index 9be8edc31d..48eafffb1d 100644
> --- a/app/test-flow-perf/main.c
> +++ b/app/test-flow-perf/main.c
> @@ -1760,6 +1760,27 @@ init_port(void)
>  		rte_exit(EXIT_FAILURE, "Error: can't init mbuf pool\n");
> 
>  	for (port_id = 0; port_id < nr_ports; port_id++) {
> +		uint64_t rx_meta_features = 0;
> +
> +		rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
> +		rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
> +
> +		ret = rte_eth_rx_meta_negotiate(port_id, &rx_meta_features);
> +		if (ret == 0) {
> +			if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
> +				printf(":: flow action FLAG will not affect Rx mbufs on port=%u\n",
> +				       port_id);
> +			}
> +
> +			if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
> +				printf(":: flow action MARK will not affect Rx mbufs on port=%u\n",
> +				       port_id);
> +			}
> +		} else if (ret != -ENOTSUP) {
> +			rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port=%u: %s\n",
> +				 port_id, rte_strerror(-ret));
> +		}
> +
>  		ret = rte_eth_dev_info_get(port_id, &dev_info);
>  		if (ret != 0)
>  			rte_exit(EXIT_FAILURE,
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 97ae52e17e..7a8da3d7ab 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -1485,10 +1485,36 @@ static void
>  init_config_port_offloads(portid_t pid, uint32_t socket_id)
>  {
>  	struct rte_port *port = &ports[pid];
> +	uint64_t rx_meta_features = 0;
>  	uint16_t data_size;
>  	int ret;
>  	int i;
> 
> +	rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
> +	rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
> +	rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
> +
> +	ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
> +	if (ret == 0) {
> +		if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
> +			TESTPMD_LOG(INFO, "Flow action FLAG will not affect Rx mbufs on port %u\n",
> +				    pid);
> +		}
> +
> +		if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
> +			TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx mbufs on port %u\n",
> +				    pid);
> +		}
> +
> +		if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
> +			TESTPMD_LOG(INFO, "Flow tunnel offload support might be limited or unavailable on port %u\n",
> +				    pid);
> +		}
> +	} else if (ret != -ENOTSUP) {
> +		rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port %u: %s\n",
> +			 pid, rte_strerror(-ret));
> +	}
> +
>  	port->dev_conf.txmode = tx_mode;
>  	port->dev_conf.rxmode = rx_mode;
> 
> diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
> index 19356ac53c..6674d4474c 100644
> --- a/doc/guides/rel_notes/release_21_11.rst
> +++ b/doc/guides/rel_notes/release_21_11.rst
> @@ -106,6 +106,15 @@ New Features
>    Added command-line options to specify total number of processes and
>    current process ID. Each process owns subset of Rx and Tx queues.
> 
> +* **Added an API to negotiate delivery of specific parts of Rx meta data**
> +
> +  A new API, ``rte_eth_rx_meta_negotiate()``, was added.
> +  The following parts of Rx meta data were defined:
> +
> +  * ``RTE_ETH_RX_META_USER_FLAG``
> +  * ``RTE_ETH_RX_META_USER_MARK``
> +  * ``RTE_ETH_RX_META_TUNNEL_ID``
> +
> 
>  Removed Items
>  -------------
> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index 40e474aa7e..96e0c60cae 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
>  typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
>  	struct rte_eth_representor_info *info);
> 
> +/**
> + * @internal
> + * Negotiate delivery of specific parts of Rx meta data.
> + *
> + * @param dev
> + *   Port (ethdev) handle
> + *
> + * @param[inout] features
> + *   Feature selection buffer
> + *
> + * @return
> + *   Negative errno value on error, zero otherwise
> + */
> +typedef int (*eth_rx_meta_negotiate_t)(struct rte_eth_dev *dev,
> +				       uint64_t *features);
> +
>  /**
>   * @internal A structure containing the functions exported by an Ethernet driver.
>   */
> @@ -949,6 +965,9 @@ struct eth_dev_ops {
> 
>  	eth_representor_info_get_t representor_info_get;
>  	/**< Get representor info. */
> +
> +	eth_rx_meta_negotiate_t rx_meta_negotiate;
> +	/**< Negotiate delivery of specific parts of Rx meta data. */
>  };
> 
>  /**
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index daf5ca9242..49cb84d64c 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t port_id,
>  	return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
>  }
> 
> +int
> +rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
> +{
> +	struct rte_eth_dev *dev;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	dev = &rte_eth_devices[port_id];
> +
> +	if (dev->data->dev_configured != 0) {
> +		RTE_ETHDEV_LOG(ERR,
> +			"The port (id=%"PRIu16") is already configured\n",
> +			port_id);
> +		return -EBUSY;
> +	}
> +
> +	if (features == NULL) {
> +		RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
> +		return -EINVAL;
> +	}
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_meta_negotiate, -ENOTSUP);
> +	return eth_err(port_id,
> +		       (*dev->dev_ops->rx_meta_negotiate)(dev, features));
> +}
> +
>  RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
> 
>  RTE_INIT(ethdev_init_telemetry)
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 1da37896d8..8467a7a362 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -4888,6 +4888,51 @@ __rte_experimental
>  int rte_eth_representor_info_get(uint16_t port_id,
>  				 struct rte_eth_representor_info *info);
> 
> +/** The ethdev sees flagged packets if there are flows with action FLAG. */
> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
> +
> +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
> +
> +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Negotiate delivery of specific parts of Rx meta data.
> + *
> + * Invoke this API before the first rte_eth_dev_configure() invocation
> + * to let the PMD make preparations that are inconvenient to do later.
> + *
> + * The negotiation process is as follows:
> + *
> + * - the application requests features intending to use at least some of them;
> + * - the PMD responds with the guaranteed subset of the requested feature set;
> + * - the application can retry negotiation with another set of features;
> + * - the application can pass zero to clear the negotiation result;
> + * - the last negotiated result takes effect upon the ethdev start.
> + *
> + * If this API is unsupported, the application should gracefully ignore that.
> + *
> + * @param port_id
> + *   Port (ethdev) identifier
> + *
> + * @param[inout] features
> + *   Feature selection buffer
> + *
> + * @return
> + *   - (-EBUSY) if the port can't handle this in its current state;
> + *   - (-ENOTSUP) if the method itself is not supported by the PMD;
> + *   - (-ENODEV) if *port_id* is invalid;
> + *   - (-EINVAL) if *features* is NULL;
> + *   - (-EIO) if the device is removed;
> + *   - (0) on success
> + */
> +__rte_experimental
> +int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features);

I don't think meta is the best name since we also have meta item and the word meta can be used
in other cases.

> +
>  #include <rte_ethdev_core.h>
> 
>  /**
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
> 70f455d47d..6eb7ec0574 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -1904,6 +1904,10 @@ enum rte_flow_action_type {
>  	 * PKT_RX_FDIR_ID mbuf flags.
>  	 *
>  	 * See struct rte_flow_action_mark.
> +	 *
> +	 * One should negotiate delivery of mark IDs beforehand.
> +	 * @see rte_eth_rx_meta_negotiate()
> +	 * @see RTE_ETH_RX_META_USER_MARK
>  	 */
>  	RTE_FLOW_ACTION_TYPE_MARK,
> 
> @@ -1912,6 +1916,10 @@ enum rte_flow_action_type {
>  	 * sets the PKT_RX_FDIR mbuf flag.
>  	 *
>  	 * No associated configuration structure.
> +	 *
> +	 * One should negotiate flag delivery beforehand.
> +	 * @see rte_eth_rx_meta_negotiate()
> +	 * @see RTE_ETH_RX_META_USER_FLAG
>  	 */
>  	RTE_FLOW_ACTION_TYPE_FLAG,
> 
> @@ -4223,6 +4231,10 @@ rte_flow_tunnel_match(uint16_t port_id,
>  /**
>   * Populate the current packet processing state, if exists, for the given mbuf.
>   *
> + * One should negotiate the processing state information delivery beforehand.
> + * @see rte_eth_rx_meta_negotiate()
> + * @see RTE_ETH_RX_META_TUNNEL_ID
> + *
>   * @param port_id
>   *   Port identifier of Ethernet device.
>   * @param[in] m
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index
> 904bce6ea1..a8928266a9 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -247,6 +247,9 @@ EXPERIMENTAL {
>  	rte_mtr_meter_policy_delete;
>  	rte_mtr_meter_policy_update;
>  	rte_mtr_meter_policy_validate;
> +
> +	# added in 21.11
> +	rte_eth_rx_meta_negotiate;
>  };
> 
>  INTERNAL {
> --
> 2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-09-30 14:59     ` Ori Kam
@ 2021-09-30 15:07       ` Andrew Rybchenko
  2021-09-30 19:07       ` Ivan Malov
  1 sibling, 0 replies; 97+ messages in thread
From: Andrew Rybchenko @ 2021-09-30 15:07 UTC (permalink / raw)
  To: Ori Kam, Ivan Malov, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi Ori,

On 9/30/21 5:59 PM, Ori Kam wrote:
> Hi Ivan,
> Sorry for jumping in late.
> 
> I have a concern that this patch breaks other PMDs.
> From the rst file "One should negotiate flag delivery beforehand"
> since you only added this function for your PMD all other PMD will fail.
> I see that you added exception in the  examples, but it doesn't make sense
> that applications will also need to add this exception which is not documented.

It is a new API and the function description lists possible
return codes. An application can handle these return codes
gracefully. I'm not sure that it makes sense to highlight
it as a special case.

> 
> Please see more comments inline.

See below

> Thanks,
> Ori
> 
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Sent: Thursday, September 23, 2021 2:20 PM
>> Subject: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
>> data
>>
>> Delivery of mark, flag and the likes might affect small packet performance.
>> If these features are disabled by default, enabling them in started state
>> without causing traffic disruption may not always be possible.
>>
>> Let applications negotiate delivery of Rx meta data beforehand.
>>
>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>> Acked-by: Jerin Jacob <jerinj@marvell.com>

[snip]

>> +__rte_experimental
>> +int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features);
> 
> I don't think meta is the best name since we also have meta item and the word meta can be used
> in other cases.

Do you have any idea what could be used instead of it?

Thanks,
Andrew.


* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
                     ` (4 preceding siblings ...)
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
@ 2021-09-30 16:18   ` Thomas Monjalon
  2021-09-30 19:30     ` Ivan Malov
  2021-10-04 23:50   ` [dpdk-dev] [PATCH v4 0/5] Negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
                     ` (3 subsequent siblings)
  9 siblings, 1 reply; 97+ messages in thread
From: Thomas Monjalon @ 2021-09-30 16:18 UTC (permalink / raw)
  To: Ivan Malov; +Cc: dev, Andy Moreton, orika, andrew.rybchenko, ferruh.yigit

23/09/2021 13:20, Ivan Malov:
> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
> only the former has been added. The problem hasn't been solved.
> Applications still assume that no efforts are needed to enable
> flow mark and similar meta data delivery.
> 
> The team behind net/sfc driver has to take over the efforts since
> the problem has started impacting us. Riverhead, a cutting edge
> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
> is available only from long Rx prefix. Switching between the
> prefix formats can't happen in started state. Hence, we run
> into the same problem which [1] was aiming to solve.

Sorry, I don't understand: what is an Rx prefix?

> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
> on its own since the corresponding flows must be active to set
> the data in the first place. Hence, adding offload flags
> similar to RSS_HASH is not a good idea.

What does "active" mean here?

> Patch [1/5] of this series adds a generic API to let applications
> negotiate delivery of Rx meta data during initialisation period.
> This way, an application knows right from the start which parts
> of Rx meta data won't be delivered. Hence, no necessity to try
> inserting flows requesting such data and handle the failures.

Sorry I don't understand the problem you want to solve.
And sorry for not noticing earlier.




* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-09-30 14:59     ` Ori Kam
  2021-09-30 15:07       ` Andrew Rybchenko
@ 2021-09-30 19:07       ` Ivan Malov
  2021-10-01  6:50         ` Andrew Rybchenko
  1 sibling, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-09-30 19:07 UTC (permalink / raw)
  To: Ori Kam, dev
  Cc: Andy Moreton, Andrew Rybchenko, Ray Kinsella, Jerin Jacob,
	Wisam Monther, Xiaoyun Li, NBU-Contact-Thomas Monjalon,
	Ferruh Yigit

Hi Ori,

On 30/09/2021 17:59, Ori Kam wrote:
> Hi Ivan,
> Sorry for jumping in late.

No worries. That's OK.

> I have a concern that this patch breaks other PMDs.

It does no such thing.

> From the rst file "One should negotiate flag delivery beforehand"
> since you only added this function for your PMD all other PMD will fail.
> I see that you added exception in the  examples, but it doesn't make sense
> that applications will also need to add this exception which is not documented.

Say you have an application, and you use it with some specific PMD.
Say that PMD doesn't run into the problem that ours does. In other words,
the user can insert a flow with action MARK at any point and get mark
delivery working from that moment on without any problem. Say this is
exactly how it works for you at the moment.

Now this new API kicks in. We update the application to invoke it as
early as possible. But the PMD in question still doesn't support this
API. The comment in the patch says that if the method returns -ENOTSUP,
the application should ignore that without batting an eyelid. It should
just keep on working as it did before the introduction of this API.

A more specific example:
Say the application doesn't mind using either "RSS + MARK" or tunnel
offload. What it does right now is attempt to insert tunnel flows first
and, if this fails, fall back to "RSS + MARK". With this API, the
application will invoke it with "USER_MARK | TUNNEL_ID" while the
adapter is in the initialised state. If the PMD says that it can only
enable the tunnel offload, then the application knows that it doesn't
make sense to even try inserting "RSS + MARK" flows. It can just skip
useless actions. But if the PMD doesn't support the method, the
application will see -ENOTSUP and handle this gracefully: it will make no
assumptions about what's guaranteed to be supported and what's not, and
will just keep to its old behaviour: try to insert a flow, fail, fall
back to another type of flow.

So no breakages with this API.
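
To illustrate the decision described above, here is a minimal,
self-contained sketch. It does not call into DPDK: mock_rx_meta_negotiate()
is only a stand-in that follows the contract proposed in this patch, and
struct mock_pmd, enum flow_plan and choose_flow_plan() are hypothetical
names made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as proposed in the patch; names shortened for the sketch. */
#define RX_META_USER_MARK (UINT64_C(1) << 1)
#define RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical stand-in for a PMD. */
struct mock_pmd {
	int has_negotiate;  /* 0: the PMD does not implement the method */
	uint64_t supported; /* what the PMD can guarantee to deliver */
};

/* Mock following the contract of the proposed rte_eth_rx_meta_negotiate(). */
static int
mock_rx_meta_negotiate(const struct mock_pmd *pmd, uint64_t *features)
{
	if (features == NULL)
		return -EINVAL;
	if (!pmd->has_negotiate)
		return -ENOTSUP;
	*features &= pmd->supported; /* respond with the guaranteed subset */
	return 0;
}

/* What the application decides to do about its flows (hypothetical). */
enum flow_plan {
	PLAN_TRY_AND_FALL_BACK, /* old behaviour: insert, fail, fall back */
	PLAN_TUNNEL,            /* tunnel offload is guaranteed to work */
	PLAN_RSS_MARK,          /* only mark delivery is guaranteed */
	PLAN_NO_META,           /* neither will be delivered */
	PLAN_ERROR,             /* a genuine failure, not just ENOTSUP */
};

static enum flow_plan
choose_flow_plan(const struct mock_pmd *pmd)
{
	uint64_t features = RX_META_USER_MARK | RX_META_TUNNEL_ID;
	int ret;

	assert(pmd != NULL);
	ret = mock_rx_meta_negotiate(pmd, &features);
	if (ret == -ENOTSUP)
		return PLAN_TRY_AND_FALL_BACK; /* make no assumptions */
	if (ret != 0)
		return PLAN_ERROR;
	if (features & RX_META_TUNNEL_ID)
		return PLAN_TUNNEL;
	if (features & RX_META_USER_MARK)
		return PLAN_RSS_MARK;
	return PLAN_NO_META;
}
```

A PMD without the method lands in PLAN_TRY_AND_FALL_BACK, which is
exactly what the application does today.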

> 
> Please see more comments inline.
> 
> Thanks,
> Ori
> 
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Sent: Thursday, September 23, 2021 2:20 PM
>> Subject: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
>> data
>>
>> Delivery of mark, flag and the likes might affect small packet performance.
>> If these features are disabled by default, enabling them in started state
>> without causing traffic disruption may not always be possible.
>>
>> Let applications negotiate delivery of Rx meta data beforehand.
>>
>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>> ---
>>   app/test-flow-perf/main.c              | 21 ++++++++++++
>>   app/test-pmd/testpmd.c                 | 26 +++++++++++++++
>>   doc/guides/rel_notes/release_21_11.rst |  9 ++++++
>>   lib/ethdev/ethdev_driver.h             | 19 +++++++++++
>>   lib/ethdev/rte_ethdev.c                | 25 ++++++++++++++
>>   lib/ethdev/rte_ethdev.h                | 45 ++++++++++++++++++++++++++
>>   lib/ethdev/rte_flow.h                  | 12 +++++++
>>   lib/ethdev/version.map                 |  3 ++
>>   8 files changed, 160 insertions(+)
>>
>> diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c index
>> 9be8edc31d..48eafffb1d 100644
>> --- a/app/test-flow-perf/main.c
>> +++ b/app/test-flow-perf/main.c
>> @@ -1760,6 +1760,27 @@ init_port(void)
>>   		rte_exit(EXIT_FAILURE, "Error: can't init mbuf pool\n");
>>
>>   	for (port_id = 0; port_id < nr_ports; port_id++) {
>> +		uint64_t rx_meta_features = 0;
>> +
>> +		rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>> +		rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>> +
>> +		ret = rte_eth_rx_meta_negotiate(port_id, &rx_meta_features);
>> +		if (ret == 0) {
>> +			if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
>> +				printf(":: flow action FLAG will not affect Rx mbufs on port=%u\n",
>> +				       port_id);
>> +			}
>> +
>> +			if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
>> +				printf(":: flow action MARK will not affect Rx mbufs on port=%u\n",
>> +				       port_id);
>> +			}
>> +		} else if (ret != -ENOTSUP) {
>> +			rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port=%u: %s\n",
>> +				 port_id, rte_strerror(-ret));
>> +		}
>> +
>>   		ret = rte_eth_dev_info_get(port_id, &dev_info);
>>   		if (ret != 0)
>>   			rte_exit(EXIT_FAILURE,
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
>> 97ae52e17e..7a8da3d7ab 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -1485,10 +1485,36 @@ static void
>>   init_config_port_offloads(portid_t pid, uint32_t socket_id)
>>   {
>>   	struct rte_port *port = &ports[pid];
>> +	uint64_t rx_meta_features = 0;
>>   	uint16_t data_size;
>>   	int ret;
>>   	int i;
>>
>> +	rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>> +	rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>> +	rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
>> +
>> +	ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
>> +	if (ret == 0) {
>> +		if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
>> +			TESTPMD_LOG(INFO, "Flow action FLAG will not affect Rx mbufs on port %u\n",
>> +				    pid);
>> +		}
>> +
>> +		if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
>> +			TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx mbufs on port %u\n",
>> +				    pid);
>> +		}
>> +
>> +		if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
>> +			TESTPMD_LOG(INFO, "Flow tunnel offload support might be limited or unavailable on port %u\n",
>> +				    pid);
>> +		}
>> +	} else if (ret != -ENOTSUP) {
>> +		rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port %u: %s\n",
>> +			 pid, rte_strerror(-ret));
>> +	}
>> +
>>   	port->dev_conf.txmode = tx_mode;
>>   	port->dev_conf.rxmode = rx_mode;
>>
>> diff --git a/doc/guides/rel_notes/release_21_11.rst
>> b/doc/guides/rel_notes/release_21_11.rst
>> index 19356ac53c..6674d4474c 100644
>> --- a/doc/guides/rel_notes/release_21_11.rst
>> +++ b/doc/guides/rel_notes/release_21_11.rst
>> @@ -106,6 +106,15 @@ New Features
>>     Added command-line options to specify total number of processes and
>>     current process ID. Each process owns subset of Rx and Tx queues.
>>
>> +* **Added an API to negotiate delivery of specific parts of Rx meta data**
>> +
>> +  A new API, ``rte_eth_rx_meta_negotiate()``, was added.
>> +  The following parts of Rx meta data were defined:
>> +
>> +  * ``RTE_ETH_RX_META_USER_FLAG``
>> +  * ``RTE_ETH_RX_META_USER_MARK``
>> +  * ``RTE_ETH_RX_META_TUNNEL_ID``
>> +
>>
>>   Removed Items
>>   -------------
>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h index
>> 40e474aa7e..96e0c60cae 100644
>> --- a/lib/ethdev/ethdev_driver.h
>> +++ b/lib/ethdev/ethdev_driver.h
>> @@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
>> typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
>>   	struct rte_eth_representor_info *info);
>>
>> +/**
>> + * @internal
>> + * Negotiate delivery of specific parts of Rx meta data.
>> + *
>> + * @param dev
>> + *   Port (ethdev) handle
>> + *
>> + * @param[inout] features
>> + *   Feature selection buffer
>> + *
>> + * @return
>> + *   Negative errno value on error, zero otherwise
>> + */
>> +typedef int (*eth_rx_meta_negotiate_t)(struct rte_eth_dev *dev,
>> +				       uint64_t *features);
>> +
>>   /**
>>    * @internal A structure containing the functions exported by an Ethernet driver.
>>    */
>> @@ -949,6 +965,9 @@ struct eth_dev_ops {
>>
>>   	eth_representor_info_get_t representor_info_get;
>>   	/**< Get representor info. */
>> +
>> +	eth_rx_meta_negotiate_t rx_meta_negotiate;
>> +	/**< Negotiate delivery of specific parts of Rx meta data. */
>>   };
>>
>>   /**
>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
>> daf5ca9242..49cb84d64c 100644
>> --- a/lib/ethdev/rte_ethdev.c
>> +++ b/lib/ethdev/rte_ethdev.c
>> @@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t port_id,
>>   	return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
>>   }
>>
>> +int
>> +rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
>> +{
>> +	struct rte_eth_dev *dev;
>> +
>> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>> +	dev = &rte_eth_devices[port_id];
>> +
>> +	if (dev->data->dev_configured != 0) {
>> +		RTE_ETHDEV_LOG(ERR,
>> +			"The port (id=%"PRIu16") is already configured\n",
>> +			port_id);
>> +		return -EBUSY;
>> +	}
>> +
>> +	if (features == NULL) {
>> +		RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_meta_negotiate, -ENOTSUP);
>> +	return eth_err(port_id,
>> +		       (*dev->dev_ops->rx_meta_negotiate)(dev, features));
>> +}
>> +
>>   RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
>>
>>   RTE_INIT(ethdev_init_telemetry)
>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>> index 1da37896d8..8467a7a362 100644
>> --- a/lib/ethdev/rte_ethdev.h
>> +++ b/lib/ethdev/rte_ethdev.h
>> @@ -4888,6 +4888,51 @@ __rte_experimental
>>   int rte_eth_representor_info_get(uint16_t port_id,
>>   				 struct rte_eth_representor_info *info);
>>
>> +/** The ethdev sees flagged packets if there are flows with action FLAG. */
>> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
>> +
>> +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
>> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
>> +
>> +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
>> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice
>> + *
>> + * Negotiate delivery of specific parts of Rx meta data.
>> + *
>> + * Invoke this API before the first rte_eth_dev_configure() invocation
>> + * to let the PMD make preparations that are inconvenient to do later.
>> + *
>> + * The negotiation process is as follows:
>> + *
>> + * - the application requests features intending to use at least some of them;
>> + * - the PMD responds with the guaranteed subset of the requested feature set;
>> + * - the application can retry negotiation with another set of features;
>> + * - the application can pass zero to clear the negotiation result;
>> + * - the last negotiated result takes effect upon the ethdev start.
>> + *
>> + * If this API is unsupported, the application should gracefully ignore that.
>> + *
>> + * @param port_id
>> + *   Port (ethdev) identifier
>> + *
>> + * @param[inout] features
>> + *   Feature selection buffer
>> + *
>> + * @return
>> + *   - (-EBUSY) if the port can't handle this in its current state;
>> + *   - (-ENOTSUP) if the method itself is not supported by the PMD;
>> + *   - (-ENODEV) if *port_id* is invalid;
>> + *   - (-EINVAL) if *features* is NULL;
>> + *   - (-EIO) if the device is removed;
>> + *   - (0) on success
>> + */
>> +__rte_experimental
>> +int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features);
> 
> I don't think meta is the best name since we also have meta item and the word meta can be used
> in other cases.

I'm no expert in naming. What could be a better term for this? 
Personally, I'd rather not perceive "meta" the way you describe. It's 
not just "meta". It's "rx_meta", and the flags supplied with this API 
provide enough context to explain what it's all about.

> 
>> +
>>   #include <rte_ethdev_core.h>
>>
>>   /**
>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
>> 70f455d47d..6eb7ec0574 100644
>> --- a/lib/ethdev/rte_flow.h
>> +++ b/lib/ethdev/rte_flow.h
>> @@ -1904,6 +1904,10 @@ enum rte_flow_action_type {
>>   	 * PKT_RX_FDIR_ID mbuf flags.
>>   	 *
>>   	 * See struct rte_flow_action_mark.
>> +	 *
>> +	 * One should negotiate delivery of mark IDs beforehand.
>> +	 * @see rte_eth_rx_meta_negotiate()
>> +	 * @see RTE_ETH_RX_META_USER_MARK
>>   	 */
>>   	RTE_FLOW_ACTION_TYPE_MARK,
>>
>> @@ -1912,6 +1916,10 @@ enum rte_flow_action_type {
>>   	 * sets the PKT_RX_FDIR mbuf flag.
>>   	 *
>>   	 * No associated configuration structure.
>> +	 *
>> +	 * One should negotiate flag delivery beforehand.
>> +	 * @see rte_eth_rx_meta_negotiate()
>> +	 * @see RTE_ETH_RX_META_USER_FLAG
>>   	 */
>>   	RTE_FLOW_ACTION_TYPE_FLAG,
>>
>> @@ -4223,6 +4231,10 @@ rte_flow_tunnel_match(uint16_t port_id,
>>   /**
>>    * Populate the current packet processing state, if exists, for the given mbuf.
>>    *
>> + * One should negotiate the processing state information delivery beforehand.
>> + * @see rte_eth_rx_meta_negotiate()
>> + * @see RTE_ETH_RX_META_TUNNEL_ID
>> + *
>>    * @param port_id
>>    *   Port identifier of Ethernet device.
>>    * @param[in] m
>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index
>> 904bce6ea1..a8928266a9 100644
>> --- a/lib/ethdev/version.map
>> +++ b/lib/ethdev/version.map
>> @@ -247,6 +247,9 @@ EXPERIMENTAL {
>>   	rte_mtr_meter_policy_delete;
>>   	rte_mtr_meter_policy_update;
>>   	rte_mtr_meter_policy_validate;
>> +
>> +	# added in 21.11
>> +	rte_eth_rx_meta_negotiate;
>>   };
>>
>>   INTERNAL {
>> --
>> 2.20.1

-- 
Ivan M


* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-09-30 16:18   ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Thomas Monjalon
@ 2021-09-30 19:30     ` Ivan Malov
  2021-10-01  6:47       ` Andrew Rybchenko
  0 siblings, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-09-30 19:30 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Andy Moreton, orika, andrew.rybchenko, ferruh.yigit

Hi Thomas,

On 30/09/2021 19:18, Thomas Monjalon wrote:
> 23/09/2021 13:20, Ivan Malov:
>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
>> only the former has been added. The problem hasn't been solved.
>> Applications still assume that no efforts are needed to enable
>> flow mark and similar meta data delivery.
>>
>> The team behind net/sfc driver has to take over the efforts since
>> the problem has started impacting us. Riverhead, a cutting edge
>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
>> is available only from long Rx prefix. Switching between the
>> prefix formats can't happen in started state. Hence, we run
>> into the same problem which [1] was aiming to solve.
> 
> Sorry, I don't understand: what is an Rx prefix?

A small chunk of per-packet metadata in the Rx packet buffer, preceding
the actual packet data. In terms of mbuf, this could be something lying
before m->data_off.
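
A toy sketch of the idea, for illustration only. The layout below is
made up (8 bytes of prefix, mark at offset 4) and is NOT the real
Riverhead prefix format; struct toy_mbuf is a simplified stand-in for
struct rte_mbuf, and all TOY_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, not a real NIC format: 8 bytes, mark at offset 4. */
#define TOY_PREFIX_LEN      8
#define TOY_PREFIX_MARK_OFS 4

/* Toy stand-in for an mbuf: one buffer plus the offset of the packet data. */
struct toy_mbuf {
	uint8_t  buf[128];
	uint16_t data_off;
};

/* Read the mark from the prefix that sits right before the packet data. */
static uint32_t
toy_prefix_mark(const struct toy_mbuf *m)
{
	const uint8_t *prefix;
	uint32_t mark;

	assert(m->data_off >= TOY_PREFIX_LEN);
	prefix = m->buf + m->data_off - TOY_PREFIX_LEN;
	/* memcpy avoids unaligned access; host byte order is assumed here. */
	memcpy(&mark, prefix + TOY_PREFIX_MARK_OFS, sizeof(mark));
	return mark;
}
```

With a short prefix there would simply be no such field to read, which
is why the prefix format has to be chosen before traffic starts flowing.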

>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
>> on its own since the corresponding flows must be active to set
>> the data in the first place. Hence, adding offload flags
>> similar to RSS_HASH is not a good idea.
> 
> What does "active" mean here?

Active = inserted and functional. What this paragraph is trying to say
is that when you enable, say, RSS_HASH, that implies both computation of
the hash and the driver's ability to extract it from packets
("delivery"). But when it comes to MARK, it's just "delivery". No
"offload" here: the NIC won't set any mark in packets unless you create
a flow rule to make it do so. That's the gist of it.

>> Patch [1/5] of this series adds a generic API to let applications
>> negotiate delivery of Rx meta data during initialisation period.
>> This way, an application knows right from the start which parts
>> of Rx meta data won't be delivered. Hence, no necessity to try
>> inserting flows requesting such data and handle the failures.
> 
> Sorry I don't understand the problem you want to solve.
> And sorry for not noticing earlier.

No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
packets by default (for performance reasons). If the application tries
to insert a flow with action MARK, the PMD may not be able to enable
delivery of Rx mark without restarting the Rx sub-system. And that's
fraught with traffic disruption and similar bad consequences. In order
to address it, we need to let the application express its interest in
receiving mark with packets as early as possible. This way, the PMD can
enable Rx mark delivery in advance. And, as an additional benefit, the
application can learn *from the very beginning* whether it will be
possible to use the feature or not. If this API tells the application
that no mark delivery will be enabled, then the application can just
skip many unnecessary attempts to insert flows that are known to be
unsupported during runtime.
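
The ordering constraint can be sketched as follows. This is a mock, not
DPDK code: struct mock_port, mock_negotiate() and mock_configure() are
hypothetical names, and only the check order (configured state first,
then the NULL pointer) mirrors the rte_ethdev.c hunk in patch 1/5.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Bit value as proposed in the patch; name shortened for the sketch. */
#define RX_META_USER_MARK (UINT64_C(1) << 1)

/* Hypothetical port state, reduced to what the sketch needs. */
struct mock_port {
	int configured;      /* set once the port has been configured */
	uint64_t supported;  /* what the PMD can guarantee to deliver */
	uint64_t negotiated; /* last negotiated result */
};

/* Negotiation is only accepted before the port is configured. */
static int
mock_negotiate(struct mock_port *p, uint64_t *features)
{
	assert(p != NULL);
	if (p->configured)
		return -EBUSY; /* too late: the Rx path is already set up */
	if (features == NULL)
		return -EINVAL;
	*features &= p->supported;
	p->negotiated = *features;
	return 0;
}

/* Stand-in for the configure step of the port. */
static void
mock_configure(struct mock_port *p)
{
	p->configured = 1;
}
```

Negotiating first and configuring afterwards succeeds; swapping the two
steps yields -EBUSY, so the PMD never has to change the Rx setup under
live traffic.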

-- 
Ivan M


* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 1/5] ethdev: add API " Ivan Malov
  2021-09-30 14:59     ` Ori Kam
@ 2021-09-30 21:48     ` Ajit Khaparde
  2021-09-30 22:00       ` Ivan Malov
  1 sibling, 1 reply; 97+ messages in thread
From: Ajit Khaparde @ 2021-09-30 21:48 UTC (permalink / raw)
  To: Ivan Malov
  Cc: dpdk-dev, Andy Moreton, Andrew Rybchenko, Ray Kinsella,
	Jerin Jacob, Wisam Jaddo, Xiaoyun Li, Thomas Monjalon,
	Ferruh Yigit, Ori Kam


::::
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 97ae52e17e..7a8da3d7ab 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -1485,10 +1485,36 @@ static void
>  init_config_port_offloads(portid_t pid, uint32_t socket_id)
>  {
>         struct rte_port *port = &ports[pid];
> +       uint64_t rx_meta_features = 0;
>         uint16_t data_size;
>         int ret;
>         int i;
>
> +       rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
> +       rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
> +       rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
> +
> +       ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
> +       if (ret == 0) {
> +               if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
> +                       TESTPMD_LOG(INFO, "Flow action FLAG will not affect Rx mbufs on port %u\n",
Log level info might be a little too noisy?

> +                                   pid);
> +               }
> +
> +               if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
> +                       TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx mbufs on port %u\n",
> +                                   pid);
> +               }
> +
> +               if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
> +                       TESTPMD_LOG(INFO, "Flow tunnel offload support might be limited or unavailable on port %u\n",
> +                                   pid);
> +               }
:::
>


* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-09-30 21:48     ` Ajit Khaparde
@ 2021-09-30 22:00       ` Ivan Malov
  2021-09-30 22:12         ` Ajit Khaparde
  0 siblings, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-09-30 22:00 UTC (permalink / raw)
  To: Ajit Khaparde
  Cc: dpdk-dev, Andy Moreton, Andrew Rybchenko, Ray Kinsella,
	Jerin Jacob, Wisam Jaddo, Xiaoyun Li, Thomas Monjalon,
	Ferruh Yigit, Ori Kam

Hi Ajit,

On 01/10/2021 00:48, Ajit Khaparde wrote:
> ::::
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index 97ae52e17e..7a8da3d7ab 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -1485,10 +1485,36 @@ static void
>>   init_config_port_offloads(portid_t pid, uint32_t socket_id)
>>   {
>>          struct rte_port *port = &ports[pid];
>> +       uint64_t rx_meta_features = 0;
>>          uint16_t data_size;
>>          int ret;
>>          int i;
>>
>> +       rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>> +       rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>> +       rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
>> +
>> +       ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
>> +       if (ret == 0) {
>> +               if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
>> +                       TESTPMD_LOG(INFO, "Flow action FLAG will not affect Rx mbufs on port %u\n",
> Log level info might be a little too noisy?

Do you really think so? But main() sets default log level to DEBUG, quote:
     rte_log_set_level(testpmd_logtype, RTE_LOG_DEBUG);

If I go for DEBUG instead of INFO here, it won't get any quieter, will it?
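
A small sketch of the point, assuming the usual rule that a message is
emitted when its level is numerically at or below the threshold set for
the log type. toy_can_log() is a hypothetical helper, not a DPDK call;
the two numeric values mirror RTE_LOG_INFO and RTE_LOG_DEBUG from
rte_log.h, and the global log level is ignored for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Numeric values mirror RTE_LOG_INFO and RTE_LOG_DEBUG in rte_log.h. */
#define TOY_LOG_INFO  7U
#define TOY_LOG_DEBUG 8U

/* Hypothetical filter: emit when the message level is within the threshold. */
static int
toy_can_log(uint32_t msg_level, uint32_t threshold)
{
	assert(msg_level != 0 && threshold != 0);
	return msg_level <= threshold;
}
```

With the threshold at TOY_LOG_DEBUG, both an INFO and a DEBUG message
pass the filter, so switching the level of these messages alone changes
nothing unless the threshold is lowered too.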

> 
>> +                                   pid);
>> +               }
>> +
>> +               if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
>> +                       TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx mbufs on port %u\n",
>> +                                   pid);
>> +               }
>> +
>> +               if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
>> +                       TESTPMD_LOG(INFO, "Flow tunnel offload support might be limited or unavailable on port %u\n",
>> +                                   pid);
>> +               }
> :::
>>

-- 
Ivan M


* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-09-30 22:00       ` Ivan Malov
@ 2021-09-30 22:12         ` Ajit Khaparde
  2021-09-30 22:22           ` Ivan Malov
  0 siblings, 1 reply; 97+ messages in thread
From: Ajit Khaparde @ 2021-09-30 22:12 UTC (permalink / raw)
  To: Ivan Malov
  Cc: dpdk-dev, Andy Moreton, Andrew Rybchenko, Ray Kinsella,
	Jerin Jacob, Wisam Jaddo, Xiaoyun Li, Thomas Monjalon,
	Ferruh Yigit, Ori Kam


On Thu, Sep 30, 2021 at 3:01 PM Ivan Malov <Ivan.Malov@oktetlabs.ru> wrote:
>
> Hi Ajit,
>
> On 01/10/2021 00:48, Ajit Khaparde wrote:
> > ::::
> >> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> >> index 97ae52e17e..7a8da3d7ab 100644
> >> --- a/app/test-pmd/testpmd.c
> >> +++ b/app/test-pmd/testpmd.c
> >> @@ -1485,10 +1485,36 @@ static void
> >>   init_config_port_offloads(portid_t pid, uint32_t socket_id)
> >>   {
> >>          struct rte_port *port = &ports[pid];
> >> +       uint64_t rx_meta_features = 0;
> >>          uint16_t data_size;
> >>          int ret;
> >>          int i;
> >>
> >> +       rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
> >> +       rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
> >> +       rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
> >> +
> >> +       ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
> >> +       if (ret == 0) {
> >> +               if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
> >> +                       TESTPMD_LOG(INFO, "Flow action FLAG will not affect Rx mbufs on port %u\n",
> > Log level info might be a little too noisy?
>
> Do you really think so? But main() sets default log level to DEBUG, quote:
>      rte_log_set_level(testpmd_logtype, RTE_LOG_DEBUG);
>
> If I go for DEBUG instead of INFO here, it won't get any quieter, will it?
You are right. It won't.
But then three extra messages per port will stand out. But that's my opinion.
Maybe you could log the message when a flow is created with any of the
meta features?

>
> >
> >> +                                   pid);
> >> +               }
> >> +
> >> +               if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
> >> +                       TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx mbufs on port %u\n",
> >> +                                   pid);
> >> +               }
> >> +
> >> +               if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
> >> +                       TESTPMD_LOG(INFO, "Flow tunnel offload support might be limited or unavailable on port %u\n",
> >> +                                   pid);
> >> +               }
> > :::
> >>
>
> --
> Ivan M


* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-09-30 22:12         ` Ajit Khaparde
@ 2021-09-30 22:22           ` Ivan Malov
  2021-10-03  7:05             ` Ori Kam
  0 siblings, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-09-30 22:22 UTC (permalink / raw)
  To: Ajit Khaparde
  Cc: dpdk-dev, Andy Moreton, Andrew Rybchenko, Ray Kinsella,
	Jerin Jacob, Wisam Jaddo, Xiaoyun Li, Thomas Monjalon,
	Ferruh Yigit, Ori Kam



On 01/10/2021 01:12, Ajit Khaparde wrote:
> On Thu, Sep 30, 2021 at 3:01 PM Ivan Malov <Ivan.Malov@oktetlabs.ru> wrote:
>>
>> Hi Ajit,
>>
>> On 01/10/2021 00:48, Ajit Khaparde wrote:
>>> ::::
>>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>>>> index 97ae52e17e..7a8da3d7ab 100644
>>>> --- a/app/test-pmd/testpmd.c
>>>> +++ b/app/test-pmd/testpmd.c
>>>> @@ -1485,10 +1485,36 @@ static void
>>>>    init_config_port_offloads(portid_t pid, uint32_t socket_id)
>>>>    {
>>>>           struct rte_port *port = &ports[pid];
>>>> +       uint64_t rx_meta_features = 0;
>>>>           uint16_t data_size;
>>>>           int ret;
>>>>           int i;
>>>>
>>>> +       rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>>>> +       rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>>>> +       rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
>>>> +
>>>> +       ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
>>>> +       if (ret == 0) {
>>>> +               if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
>>>> +                       TESTPMD_LOG(INFO, "Flow action FLAG will not affect Rx mbufs on port %u\n",
>>> Log level info might be a little too noisy?
>>
>> Do you really think so? But main() sets default log level to DEBUG, quote:
>>       rte_log_set_level(testpmd_logtype, RTE_LOG_DEBUG);
>>
>> If I go for DEBUG instead of INFO here, it won't get any quieter, will it?
> You are right. It won't.
> But then three extra messages per port will stand out. But that's my opinion.
> Maybe you could log the message when a flow is created with any of the
> meta features?

The idea is to warn the user from the very beginning that certain flow 
primitives won't actually work. This way, the user can refrain from 
trying to use them in flow rules. This might save them time.

But I don't mind going for DEBUG here. More opinions are welcome.

> 
>>
>>>
>>>> +                                   pid);
>>>> +               }
>>>> +
>>>> +               if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
>>>> +                       TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx mbufs on port %u\n",
>>>> +                                   pid);
>>>> +               }
>>>> +
>>>> +               if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
>>>> +                       TESTPMD_LOG(INFO, "Flow tunnel offload support might be limited or unavailable on port %u\n",
>>>> +                                   pid);
>>>> +               }
>>> :::
>>>>
>>
>> --
>> Ivan M

-- 
Ivan M


* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-09-30 19:30     ` Ivan Malov
@ 2021-10-01  6:47       ` Andrew Rybchenko
  2021-10-01  8:11         ` Thomas Monjalon
  0 siblings, 1 reply; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-01  6:47 UTC (permalink / raw)
  To: Ivan Malov, Thomas Monjalon; +Cc: dev, Andy Moreton, orika, ferruh.yigit

On 9/30/21 10:30 PM, Ivan Malov wrote:
> Hi Thomas,
> 
> On 30/09/2021 19:18, Thomas Monjalon wrote:
>> 23/09/2021 13:20, Ivan Malov:
>>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
>>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
>>> only the former has been added. The problem hasn't been solved.
>>> Applications still assume that no efforts are needed to enable
>>> flow mark and similar meta data delivery.
>>>
>>> The team behind net/sfc driver has to take over the efforts since
>>> the problem has started impacting us. Riverhead, a cutting edge
>>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
>>> is available only from long Rx prefix. Switching between the
>>> prefix formats can't happen in started state. Hence, we run
>>> into the same problem which [1] was aiming to solve.
>>
>> Sorry I don't understand what is Rx prefix?
> 
> A small chunk of per-packet metadata in Rx packet buffer preceding the
> actual packet data. In terms of mbuf, this could be something lying
> before m->data_off.
> 
>>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
>>> on its own since the corresponding flows must be active to set
>>> the data in the first place. Hence, adding offload flags
>>> similar to RSS_HASH is not a good idea.
>>
>> What means "active" here?
> 
> Active = inserted and functional. What this paragraph is trying to say
> is that when you enable, say, RSS_HASH, that implies both computation of
> the hash and the driver's ability to extract it from packets
> ("delivery"). But when it comes to MARK, it's just "delivery". No
> "offload" here: the NIC won't set any mark in packets unless you create
> a flow rule to make it do so. That's the gist of it.
> 
>>> Patch [1/5] of this series adds a generic API to let applications
>>> negotiate delivery of Rx meta data during initialisation period.
>>> This way, an application knows right from the start which parts
>>> of Rx meta data won't be delivered. Hence, no necessity to try
>>> inserting flows requesting such data and handle the failures.
>>
>> Sorry I don't understand the problem you want to solve.
>> And sorry for not noticing earlier.
> 
> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
> packets by default (for performance reasons). If the application tries
> to insert a flow with action MARK, the PMD may not be able to enable
> delivery of Rx mark without the need to re-start Rx sub-system. And
> that's fraught with traffic disruption and similar bad consequences. In
> order to address it, we need to let the application express its interest
> in receiving mark with packets as early as possible. This way, the PMD
> can enable Rx mark delivery in advance. And, as an additional benefit,
> the application can learn *from the very beginning* whether it will be
> possible to use the feature or not. If this API tells the application
> that no mark delivery will be enabled, then the application can just
> skip many unnecessary attempts to insert wittingly unsupported flows
> during runtime.
> 

Thomas, if I'm not mistaken, the net/mlx5 dv_xmeta_en driver option
is a vendor-specific way to address the same problem.


* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-09-30 19:07       ` Ivan Malov
@ 2021-10-01  6:50         ` Andrew Rybchenko
  2021-10-03  7:42           ` Ori Kam
  0 siblings, 1 reply; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-01  6:50 UTC (permalink / raw)
  To: Ivan Malov, Ori Kam, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

On 9/30/21 10:07 PM, Ivan Malov wrote:
> Hi Ori,
> 
> On 30/09/2021 17:59, Ori Kam wrote:
>> Hi Ivan,
>> Sorry for jumping in late.
> 
> No worries. That's OK.
> 
>> I have a concern that this patch breaks other PMDs.
> 
> It does no such thing.
> 
>> From the rst file: "One should negotiate flag delivery beforehand".
>> Since you only added this function for your PMD, all other PMDs will fail.
>> I see that you added an exception in the examples, but it doesn't make
>> sense that applications will also need to add this exception, which is
>> not documented.
> 
> Say, you have an application, and you use it with some specific PMD.
> Say, that PMD doesn't run into the problem as ours does. In other words,
> the user can insert a flow with action MARK at any point and get mark
> delivery working starting from that moment without any problem. Say,
> this is exactly the way how it works for you at the moment.
> 
> Now. This new API kicks in. We update the application to invoke it as
> early as possible. But your PMD in question still doesn't support this
> API. The comment in the patch says that if the method returns ENOTSUP,
> the application should ignore that without batting an eyelid. It should
> just keep on working as it did before the introduction of this API.
> 
> More specific example:
> Say, the application doesn't mind using either "RSS + MARK" or tunnel
> offload. What it does right now is attempt to insert tunnel flows first
> and, if this fails, fall back to "RSS + MARK". With this API, the
> application will try to invoke this API with "USER_MARK | TUNNEL_ID" in
> adapter initialised state. If the PMD says that it can only enable the
> tunnel offload, then the application will get the knowledge that it
> doesn't make sense to even try inserting "RSS + MARK" flows. It just can
> skip useless actions. But if the PMD doesn't support the method, the
> application will see ENOTSUP and handle this gracefully: it will make no
> assumptions about what's guaranteed to be supported and what's not and
> will just keep on its old behaviour: try to insert a flow, fail, fall
> back to another type of flow.
> 
> So no breakages with this API.
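
The application-side logic described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the series: the flag values are copied from the patch, while `negotiate_fn` (a stand-in with the same signature as rte_eth_rx_meta_negotiate(), so that the sketch builds without DPDK), `enum flow_plan`, `choose_flow_plan()` and the two mock PMD callbacks are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values copied from the patch under review. */
#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical stand-in with the signature of rte_eth_rx_meta_negotiate(). */
typedef int (*negotiate_fn)(uint16_t port_id, uint64_t *features);

/* Hypothetical: which kinds of flows the application will attempt later. */
enum flow_plan {
	PLAN_TUNNEL_THEN_RSS_MARK, /* try tunnel flows, fall back to RSS + MARK */
	PLAN_TUNNEL_ONLY,          /* skip RSS + MARK attempts altogether */
	PLAN_RSS_MARK_ONLY,        /* skip tunnel offload attempts altogether */
	PLAN_NEITHER,              /* neither kind of flow is worth trying */
	PLAN_FATAL,                /* unexpected negotiation error */
};

static enum flow_plan
choose_flow_plan(negotiate_fn negotiate, uint16_t port_id)
{
	uint64_t features = RTE_ETH_RX_META_USER_MARK |
			    RTE_ETH_RX_META_TUNNEL_ID;
	int ret;

	assert(negotiate != NULL);

	ret = negotiate(port_id, &features);
	if (ret == -ENOTSUP) {
		/* The PMD has no such method: assume nothing, keep old behaviour. */
		return PLAN_TUNNEL_THEN_RSS_MARK;
	}
	if (ret != 0)
		return PLAN_FATAL;

	/* On success, "features" holds the subset guaranteed by the PMD. */
	if ((features & RTE_ETH_RX_META_TUNNEL_ID) != 0) {
		if ((features & RTE_ETH_RX_META_USER_MARK) != 0)
			return PLAN_TUNNEL_THEN_RSS_MARK;
		return PLAN_TUNNEL_ONLY;
	}
	if ((features & RTE_ETH_RX_META_USER_MARK) != 0)
		return PLAN_RSS_MARK_ONLY;
	return PLAN_NEITHER;
}

/* Mock PMD that can only guarantee tunnel ID delivery. */
static int
mock_pmd_tunnel_only(uint16_t port_id, uint64_t *features)
{
	(void)port_id;
	*features &= RTE_ETH_RX_META_TUNNEL_ID;
	return 0;
}

/* Mock PMD that does not implement the negotiation method at all. */
static int
mock_pmd_no_negotiate(uint16_t port_id, uint64_t *features)
{
	(void)port_id;
	(void)features;
	return -ENOTSUP;
}
```

With the first mock, the application learns up front that "RSS + MARK" flows are pointless; with the second, it behaves exactly as it did before the API existed.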
> 
>>
>> Please see more comments inline.
>>
>> Thanks,
>> Ori
>>
>>> -----Original Message-----
>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>> Sent: Thursday, September 23, 2021 2:20 PM
>>> Subject: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
>>> data
>>>
>>> Delivery of mark, flag and the likes might affect small packet
>>> performance.
>>> If these features are disabled by default, enabling them in started
>>> state
>>> without causing traffic disruption may not always be possible.
>>>
>>> Let applications negotiate delivery of Rx meta data beforehand.
>>>
>>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>>> ---
>>>   app/test-flow-perf/main.c              | 21 ++++++++++++
>>>   app/test-pmd/testpmd.c                 | 26 +++++++++++++++
>>>   doc/guides/rel_notes/release_21_11.rst |  9 ++++++
>>>   lib/ethdev/ethdev_driver.h             | 19 +++++++++++
>>>   lib/ethdev/rte_ethdev.c                | 25 ++++++++++++++
>>>   lib/ethdev/rte_ethdev.h                | 45 ++++++++++++++++++++++++++
>>>   lib/ethdev/rte_flow.h                  | 12 +++++++
>>>   lib/ethdev/version.map                 |  3 ++
>>>   8 files changed, 160 insertions(+)
>>>
>>> diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
>>> index 9be8edc31d..48eafffb1d 100644
>>> --- a/app/test-flow-perf/main.c
>>> +++ b/app/test-flow-perf/main.c
>>> @@ -1760,6 +1760,27 @@ init_port(void)
>>>           rte_exit(EXIT_FAILURE, "Error: can't init mbuf pool\n");
>>>
>>>       for (port_id = 0; port_id < nr_ports; port_id++) {
>>> +        uint64_t rx_meta_features = 0;
>>> +
>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>>> +
>>> +        ret = rte_eth_rx_meta_negotiate(port_id, &rx_meta_features);
>>> +        if (ret == 0) {
>>> +            if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
>>> +                printf(":: flow action FLAG will not affect Rx mbufs on port=%u\n",
>>> +                       port_id);
>>> +            }
>>> +
>>> +            if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
>>> +                printf(":: flow action MARK will not affect Rx mbufs on port=%u\n",
>>> +                       port_id);
>>> +            }
>>> +        } else if (ret != -ENOTSUP) {
>>> +            rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port=%u: %s\n",
>>> +                 port_id, rte_strerror(-ret));
>>> +        }
>>> +
>>>           ret = rte_eth_dev_info_get(port_id, &dev_info);
>>>           if (ret != 0)
>>>               rte_exit(EXIT_FAILURE,
>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>>> index 97ae52e17e..7a8da3d7ab 100644
>>> --- a/app/test-pmd/testpmd.c
>>> +++ b/app/test-pmd/testpmd.c
>>> @@ -1485,10 +1485,36 @@ static void
>>>   init_config_port_offloads(portid_t pid, uint32_t socket_id)
>>>   {
>>>       struct rte_port *port = &ports[pid];
>>> +    uint64_t rx_meta_features = 0;
>>>       uint16_t data_size;
>>>       int ret;
>>>       int i;
>>>
>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>>> +    rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
>>> +
>>> +    ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
>>> +    if (ret == 0) {
>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
>>> +            TESTPMD_LOG(INFO, "Flow action FLAG will not affect Rx mbufs on port %u\n",
>>> +                    pid);
>>> +        }
>>> +
>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
>>> +            TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx mbufs on port %u\n",
>>> +                    pid);
>>> +        }
>>> +
>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
>>> +            TESTPMD_LOG(INFO, "Flow tunnel offload support might be limited or unavailable on port %u\n",
>>> +                    pid);
>>> +        }
>>> +    } else if (ret != -ENOTSUP) {
>>> +        rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port %u: %s\n",
>>> +             pid, rte_strerror(-ret));
>>> +    }
>>> +
>>>       port->dev_conf.txmode = tx_mode;
>>>       port->dev_conf.rxmode = rx_mode;
>>>
>>> diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
>>> index 19356ac53c..6674d4474c 100644
>>> --- a/doc/guides/rel_notes/release_21_11.rst
>>> +++ b/doc/guides/rel_notes/release_21_11.rst
>>> @@ -106,6 +106,15 @@ New Features
>>>     Added command-line options to specify total number of processes and
>>>     current process ID. Each process owns subset of Rx and Tx queues.
>>>
>>> +* **Added an API to negotiate delivery of specific parts of Rx meta data**
>>> +
>>> +  A new API, ``rte_eth_rx_meta_negotiate()``, was added.
>>> +  The following parts of Rx meta data were defined:
>>> +
>>> +  * ``RTE_ETH_RX_META_USER_FLAG``
>>> +  * ``RTE_ETH_RX_META_USER_MARK``
>>> +  * ``RTE_ETH_RX_META_TUNNEL_ID``
>>> +
>>>
>>>   Removed Items
>>>   -------------
>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>> index 40e474aa7e..96e0c60cae 100644
>>> --- a/lib/ethdev/ethdev_driver.h
>>> +++ b/lib/ethdev/ethdev_driver.h
>>> @@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
>>>   typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
>>>       struct rte_eth_representor_info *info);
>>>
>>> +/**
>>> + * @internal
>>> + * Negotiate delivery of specific parts of Rx meta data.
>>> + *
>>> + * @param dev
>>> + *   Port (ethdev) handle
>>> + *
>>> + * @param[inout] features
>>> + *   Feature selection buffer
>>> + *
>>> + * @return
>>> + *   Negative errno value on error, zero otherwise
>>> + */
>>> +typedef int (*eth_rx_meta_negotiate_t)(struct rte_eth_dev *dev,
>>> +                       uint64_t *features);
>>> +
>>>   /**
>>>    * @internal A structure containing the functions exported by an Ethernet driver.
>>>    */
>>> @@ -949,6 +965,9 @@ struct eth_dev_ops {
>>>
>>>       eth_representor_info_get_t representor_info_get;
>>>       /**< Get representor info. */
>>> +
>>> +    eth_rx_meta_negotiate_t rx_meta_negotiate;
>>> +    /**< Negotiate delivery of specific parts of Rx meta data. */
>>>   };
>>>
>>>   /**
>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>>> index daf5ca9242..49cb84d64c 100644
>>> --- a/lib/ethdev/rte_ethdev.c
>>> +++ b/lib/ethdev/rte_ethdev.c
>>> @@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t port_id,
>>>       return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
>>>   }
>>>
>>> +int
>>> +rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
>>> +{
>>> +    struct rte_eth_dev *dev;
>>> +
>>> +    RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>>> +    dev = &rte_eth_devices[port_id];
>>> +
>>> +    if (dev->data->dev_configured != 0) {
>>> +        RTE_ETHDEV_LOG(ERR,
>>> +            "The port (id=%"PRIu16") is already configured\n",
>>> +            port_id);
>>> +        return -EBUSY;
>>> +    }
>>> +
>>> +    if (features == NULL) {
>>> +        RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_meta_negotiate, -ENOTSUP);
>>> +    return eth_err(port_id,
>>> +               (*dev->dev_ops->rx_meta_negotiate)(dev, features));
>>> +}
>>> +
>>>   RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
>>>
>>>   RTE_INIT(ethdev_init_telemetry)
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>> index 1da37896d8..8467a7a362 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -4888,6 +4888,51 @@ __rte_experimental
>>>   int rte_eth_representor_info_get(uint16_t port_id,
>>>                    struct rte_eth_representor_info *info);
>>>
>>> +/** The ethdev sees flagged packets if there are flows with action FLAG. */
>>> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
>>> +
>>> +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
>>> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
>>> +
>>> +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
>>> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
>>> +
>>> +/**
>>> + * @warning
>>> + * @b EXPERIMENTAL: this API may change without prior notice
>>> + *
>>> + * Negotiate delivery of specific parts of Rx meta data.
>>> + *
>>> + * Invoke this API before the first rte_eth_dev_configure() invocation
>>> + * to let the PMD make preparations that are inconvenient to do later.
>>> + *
>>> + * The negotiation process is as follows:
>>> + *
>>> + * - the application requests features intending to use at least some of them;
>>> + * - the PMD responds with the guaranteed subset of the requested feature set;
>>> + * - the application can retry negotiation with another set of features;
>>> + * - the application can pass zero to clear the negotiation result;
>>> + * - the last negotiated result takes effect upon the ethdev start.
>>> + *
>>> + * If this API is unsupported, the application should gracefully ignore that.
>>> + *
>>> + * @param port_id
>>> + *   Port (ethdev) identifier
>>> + *
>>> + * @param[inout] features
>>> + *   Feature selection buffer
>>> + *
>>> + * @return
>>> + *   - (-EBUSY) if the port can't handle this in its current state;
>>> + *   - (-ENOTSUP) if the method itself is not supported by the PMD;
>>> + *   - (-ENODEV) if *port_id* is invalid;
>>> + *   - (-EINVAL) if *features* is NULL;
>>> + *   - (-EIO) if the device is removed;
>>> + *   - (0) on success
>>> + */
>>> +__rte_experimental
>>> +int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features);
>>
>> I don't think meta is the best name since we also have meta item and
>> the word meta can be used
>> in other cases.
> 
> I'm no expert in naming. What could be a better term for this?
> Personally, I'd rather not perceive "meta" the way you describe. It's
> not just "meta". It's "rx_meta", and the flags supplied with this API
> provide enough context to explain what it's all about.

Having thought about it overnight, I'd suggest the full word "metadata".
Yes, the name will be a bit longer, but it is less confusing than
the term META already used in the flow API.

Andrew.


* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-10-01  6:47       ` Andrew Rybchenko
@ 2021-10-01  8:11         ` Thomas Monjalon
  2021-10-01  8:54           ` Andrew Rybchenko
  2021-10-01  8:55           ` Ivan Malov
  0 siblings, 2 replies; 97+ messages in thread
From: Thomas Monjalon @ 2021-10-01  8:11 UTC (permalink / raw)
  To: Ivan Malov, Andrew Rybchenko
  Cc: dev, Andy Moreton, orika, ferruh.yigit, olivier.matz

01/10/2021 08:47, Andrew Rybchenko:
> On 9/30/21 10:30 PM, Ivan Malov wrote:
> > Hi Thomas,
> > 
> > On 30/09/2021 19:18, Thomas Monjalon wrote:
> >> 23/09/2021 13:20, Ivan Malov:
> >>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
> >>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
> >>> only the former has been added. The problem hasn't been solved.
> >>> Applications still assume that no efforts are needed to enable
> >>> flow mark and similar meta data delivery.
> >>>
> >>> The team behind net/sfc driver has to take over the efforts since
> >>> the problem has started impacting us. Riverhead, a cutting edge
> >>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
> >>> is available only from long Rx prefix. Switching between the
> >>> prefix formats can't happen in started state. Hence, we run
> >>> into the same problem which [1] was aiming to solve.
> >>
> >> Sorry I don't understand what is Rx prefix?
> > 
> > A small chunk of per-packet metadata in Rx packet buffer preceding the
> > actual packet data. In terms of mbuf, this could be something lying
> > before m->data_off.

I've never seen the word "Rx prefix".
In general we talk about mbuf headroom and mbuf metadata,
the rest being the mbuf payload and mbuf tailroom.
I guess you mean mbuf metadata in the space of the struct rte_mbuf?

> >>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
> >>> on its own since the corresponding flows must be active to set
> >>> the data in the first place. Hence, adding offload flags
> >>> similar to RSS_HASH is not a good idea.
> >>
> >> What means "active" here?
> > 
> > Active = inserted and functional. What this paragraph is trying to say
> > is that when you enable, say, RSS_HASH, that implies both computation of
> > the hash and the driver's ability to extract it from packets
> > ("delivery"). But when it comes to MARK, it's just "delivery". No
> > "offload" here: the NIC won't set any mark in packets unless you create
> > a flow rule to make it do so. That's the gist of it.

OK
Yes I agree RTE_FLOW_ACTION_TYPE_MARK doesn't need any offload flag.
Same for RTE_FLOW_ACTION_TYPE_SET_META.

> >>> Patch [1/5] of this series adds a generic API to let applications
> >>> negotiate delivery of Rx meta data during initialisation period.

What is metadata here?
Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK?
The word "metadata" could cover any field in the mbuf struct, so it is vague.

> >>> This way, an application knows right from the start which parts
> >>> of Rx meta data won't be delivered. Hence, no necessity to try
> >>> inserting flows requesting such data and handle the failures.
> >>
> >> Sorry I don't understand the problem you want to solve.
> >> And sorry for not noticing earlier.
> > 
> > No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
> > packets by default (for performance reasons). If the application tries
> > to insert a flow with action MARK, the PMD may not be able to enable
> > delivery of Rx mark without the need to re-start Rx sub-system. And
> > that's fraught with traffic disruption and similar bad consequences. In
> > order to address it, we need to let the application express its interest
> > in receiving mark with packets as early as possible. This way, the PMD
> > can enable Rx mark delivery in advance. And, as an additional benefit,
> > the application can learn *from the very beginning* whether it will be
> > possible to use the feature or not. If this API tells the application
> > that no mark delivery will be enabled, then the application can just
> > skip many unnecessary attempts to insert wittingly unsupported flows
> > during runtime.

I'm puzzled, because we could have the same reasoning for any offload.
I don't understand why we are focusing on mark only.
I would prefer we find a generic solution using the rte_flow API.
Can we make rte_flow_validate() work before port start?
If validating a fake rule doesn't make sense,
why not have a new function accepting a single action as a parameter?

> Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option
> is vendor-specific way to address the same problem.

Not exactly, it is configuring the capabilities:
  +------+-----------+-----------+-------------+-------------+
  | Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
  +======+===========+===========+=============+=============+
  | 0    | 24 bits   | 32 bits   | 32 bits     | no          |
  +------+-----------+-----------+-------------+-------------+
  | 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
  +------+-----------+-----------+-------------+-------------+
  | 2    | vary 0-24 | 32 bits   | 32 bits     | yes         |
  +------+-----------+-----------+-------------+-------------+





* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-10-01  8:11         ` Thomas Monjalon
@ 2021-10-01  8:54           ` Andrew Rybchenko
  2021-10-01  9:32             ` Thomas Monjalon
  2021-10-01  8:55           ` Ivan Malov
  1 sibling, 1 reply; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-01  8:54 UTC (permalink / raw)
  To: Thomas Monjalon, Ivan Malov
  Cc: dev, Andy Moreton, orika, ferruh.yigit, olivier.matz

On 10/1/21 11:11 AM, Thomas Monjalon wrote:
> 01/10/2021 08:47, Andrew Rybchenko:
>> On 9/30/21 10:30 PM, Ivan Malov wrote:
>>> Hi Thomas,
>>>
>>> On 30/09/2021 19:18, Thomas Monjalon wrote:
>>>> 23/09/2021 13:20, Ivan Malov:
>>>>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
>>>>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
>>>>> only the former has been added. The problem hasn't been solved.
>>>>> Applications still assume that no efforts are needed to enable
>>>>> flow mark and similar meta data delivery.
>>>>>
>>>>> The team behind net/sfc driver has to take over the efforts since
>>>>> the problem has started impacting us. Riverhead, a cutting edge
>>>>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
>>>>> is available only from long Rx prefix. Switching between the
>>>>> prefix formats can't happen in started state. Hence, we run
>>>>> into the same problem which [1] was aiming to solve.
>>>>
>>>> Sorry I don't understand what is Rx prefix?
>>>
>>> A small chunk of per-packet metadata in Rx packet buffer preceding the
>>> actual packet data. In terms of mbuf, this could be something lying
>>> before m->data_off.
> 
> I've never seen the word "Rx prefix".

Yes, I agree. The term is vendor-specific.

> In general we talk about mbuf headroom and mbuf metadata,
> the rest being the mbuf payload and mbuf tailroom.
> I guess you mean mbuf metadata in the space of the struct rte_mbuf?

Not exactly. It is rather lower level, but ultimately yes, it ends up
as extra data represented by one field or another in the mbuf
structure. Broadly, Rx metadata is all the per-packet extra
information that is available in HW and could be delivered to SW:
 - Rx checksum offloads information
 - Rx packet classification
 - RSS hash
 - flow mark/flag
 - flow meta
 - tunnel offload information
 - source e-Switch port

Delivering everything is expensive. That's why we have offload
flags, the possibility to reduce the required Rx packet classification,
etc. Some metadata is not covered yet, and the series suggests
an approach to cover it.

> 
>>>>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
>>>>> on its own since the corresponding flows must be active to set
>>>>> the data in the first place. Hence, adding offload flags
>>>>> similar to RSS_HASH is not a good idea.
>>>>
>>>> What means "active" here?
>>>
>>> Active = inserted and functional. What this paragraph is trying to say
>>> is that when you enable, say, RSS_HASH, that implies both computation of
>>> the hash and the driver's ability to extract it from packets
>>> ("delivery"). But when it comes to MARK, it's just "delivery". No
>>> "offload" here: the NIC won't set any mark in packets unless you create
>>> a flow rule to make it do so. That's the gist of it.
> 
> OK
> Yes I agree RTE_FLOW_ACTION_TYPE_MARK doesn't need any offload flag.
> Same for RTE_FLOW_ACTION_TYPE_SET_META.
> 
>>>>> Patch [1/5] of this series adds a generic API to let applications
>>>>> negotiate delivery of Rx meta data during initialisation period.
> 
> What is a metadata?

See above.

> Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK?
> Metadata word could cover any field in the mbuf struct so it is vague.

We failed to find a better term. Yes, it overlaps with other Rx
features. We can document the exceptions and add references to
the existing ways to control them.

If you have an idea for a better name, it is welcome.

> 
>>>>> This way, an application knows right from the start which parts
>>>>> of Rx meta data won't be delivered. Hence, no necessity to try
>>>>> inserting flows requesting such data and handle the failures.
>>>>
>>>> Sorry I don't understand the problem you want to solve.
>>>> And sorry for not noticing earlier.
>>>
>>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
>>> packets by default (for performance reasons). If the application tries
>>> to insert a flow with action MARK, the PMD may not be able to enable
>>> delivery of Rx mark without the need to re-start Rx sub-system. And
>>> that's fraught with traffic disruption and similar bad consequences. In
>>> order to address it, we need to let the application express its interest
>>> in receiving mark with packets as early as possible. This way, the PMD
>>> can enable Rx mark delivery in advance. And, as an additional benefit,
>>> the application can learn *from the very beginning* whether it will be
>>> possible to use the feature or not. If this API tells the application
>>> that no mark delivery will be enabled, then the application can just
>>> skip many unnecessary attempts to insert wittingly unsupported flows
>>> during runtime.
> 
> I'm puzzled, because we could have the same reasoning for any offload.
> I don't understand why we are focusing on mark only.
> I would prefer we find a generic solution using the rte_flow API.
> Can we make rte_flow_validate() working before port start?
> If validating a fake rule doesn't make sense,
> why not having a new function accepting a single action as parameter?

IMHO, it would be a misuse of rte_flow_validate(). It would be
complex from both the application point of view and the driver
implementation point of view, since it would most likely be
implemented in an absolutely different code branch.
Also, what should be checked for tunnel offload?

> 
>> Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option
>> is vendor-specific way to address the same problem.
> 
> Not exactly, it is configuring the capabilities:
>   +------+-----------+-----------+-------------+-------------+
>   | Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
>   +======+===========+===========+=============+=============+
>   | 0    | 24 bits   | 32 bits   | 32 bits     | no          |
>   +------+-----------+-----------+-------------+-------------+
>   | 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
>   +------+-----------+-----------+-------------+-------------+
>   | 2    | vary 0-24 | 32 bits   | 32 bits     | yes         |
>   +------+-----------+-----------+-------------+-------------+

Sorry, but I don't understand the difference. Negotiation is
exactly about the capabilities which we want to use.
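
To illustrate the point, here is a rough sketch of what negotiation amounts to on the driver side. It is an assumption-laden model rather than real PMD code: `struct mock_port` and `mock_rx_meta_negotiate()` are hypothetical, the error checks only mirror the order used in the ethdev wrapper from patch 1/5, and the "long prefix" switch stands for whatever HW setting a given driver has to choose before the port is configured.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values copied from the patch under review. */
#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical per-port state of an imaginary driver. */
struct mock_port {
	bool configured;      /* set once the port has been configured */
	uint64_t supported;   /* Rx meta features the HW is able to deliver */
	uint64_t negotiated;  /* last negotiated feature set */
	bool use_long_prefix; /* costly HW mode needed to deliver any meta data */
};

/*
 * Hypothetical model of the negotiation: the result is the intersection
 * of what the application asked for and what the device can deliver.
 */
static int
mock_rx_meta_negotiate(struct mock_port *port, uint64_t *features)
{
	assert(port != NULL);

	if (port->configured)
		return -EBUSY; /* too late: the port is already configured */
	if (features == NULL)
		return -EINVAL;

	*features &= port->supported;
	port->negotiated = *features;

	/* Pay for the expensive mode only if something was actually negotiated. */
	port->use_long_prefix = (port->negotiated != 0);
	return 0;
}
```

Passing zero clears the previous result, so a driver modelled this way falls back to its cheapest mode unless the application explicitly asks for more.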



* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-10-01  8:11         ` Thomas Monjalon
  2021-10-01  8:54           ` Andrew Rybchenko
@ 2021-10-01  8:55           ` Ivan Malov
  2021-10-01  9:48             ` Thomas Monjalon
  1 sibling, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-10-01  8:55 UTC (permalink / raw)
  To: Thomas Monjalon, Andrew Rybchenko
  Cc: dev, Andy Moreton, orika, ferruh.yigit, olivier.matz

Hi Thomas,

On 01/10/2021 11:11, Thomas Monjalon wrote:
> 01/10/2021 08:47, Andrew Rybchenko:
>> On 9/30/21 10:30 PM, Ivan Malov wrote:
>>> Hi Thomas,
>>>
>>> On 30/09/2021 19:18, Thomas Monjalon wrote:
>>>> 23/09/2021 13:20, Ivan Malov:
>>>>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
>>>>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
>>>>> only the former has been added. The problem hasn't been solved.
>>>>> Applications still assume that no efforts are needed to enable
>>>>> flow mark and similar meta data delivery.
>>>>>
>>>>> The team behind net/sfc driver has to take over the efforts since
>>>>> the problem has started impacting us. Riverhead, a cutting edge
>>>>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
>>>>> is available only from long Rx prefix. Switching between the
>>>>> prefix formats can't happen in started state. Hence, we run
>>>>> into the same problem which [1] was aiming to solve.
>>>>
>>>> Sorry I don't understand what is Rx prefix?
>>>
>>> A small chunk of per-packet metadata in Rx packet buffer preceding the
>>> actual packet data. In terms of mbuf, this could be something lying
>>> before m->data_off.
> 
> I've never seen the word "Rx prefix".
> In general we talk about mbuf headroom and mbuf metadata,
> the rest being the mbuf payload and mbuf tailroom.
> I guess you mean mbuf metadata in the space of the struct rte_mbuf?

In this paragraph I describe the two ways in which the NIC itself can
provide metadata buffers of different sizes. Hence the term "Rx prefix".
As you understand, the NIC HW is unaware of DPDK, mbufs and any other SW
concepts. To the NIC, this is "Rx prefix", that is, a chunk of per-packet
metadata *preceding* the actual packet data. It's the responsibility of
the PMD to treat this the right way and to take care of headroom, payload
and tailroom. I describe the two Rx prefix formats in NIC terminology
just to provide the gist of the problem.

> 
>>>>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
>>>>> on its own since the corresponding flows must be active to set
>>>>> the data in the first place. Hence, adding offload flags
>>>>> similar to RSS_HASH is not a good idea.
>>>>
>>>> What means "active" here?
>>>
>>> Active = inserted and functional. What this paragraph is trying to say
>>> is that when you enable, say, RSS_HASH, that implies both computation of
>>> the hash and the driver's ability to extract in from packets
>>> ("delivery"). But when it comes to MARK, it's just "delivery". No
>>> "offload" here: the NIC won't set any mark in packets unless you create
>>> a flow rule to make it do so. That's the gist of it.
> 
> OK
> Yes I agree RTE_FLOW_ACTION_TYPE_MARK doesn't need any offload flag.
> Same for RTE_FLOW_ACTION_TYPE_SET_META.
> 
>>>>> Patch [1/5] of this series adds a generic API to let applications
>>>>> negotiate delivery of Rx meta data during initialisation period.
> 
> What is a metadata?
> Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK?
> Metadata word could cover any field in the mbuf struct so it is vague.

Metadata here is *any* additional information provided by the NIC for
each received packet. For example, Rx flag, Rx mark, RSS hash, packet
classification info, you name it. I'd like to stress that the
suggested API comes with flags, each of which is crystal clear on what
concrete kind of metadata it covers, e.g. Rx mark.

> 
>>>>> This way, an application knows right from the start which parts
>>>>> of Rx meta data won't be delivered. Hence, no necessity to try
>>>>> inserting flows requesting such data and handle the failures.
>>>>
>>>> Sorry I don't understand the problem you want to solve.
>>>> And sorry for not noticing earlier.
>>>
>>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
>>> packets by default (for performance reasons). If the application tries
>>> to insert a flow with action MARK, the PMD may not be able to enable
>>> delivery of Rx mark without the need to re-start Rx sub-system. And
>>> that's fraught with traffic disruption and similar bad consequences. In
>>> order to address it, we need to let the application express its interest
>>> in receiving mark with packets as early as possible. This way, the PMD
>>> can enable Rx mark delivery in advance. And, as an additional benefit,
>>> the application can learn *from the very beginning* whether it will be
>>> possible to use the feature or not. If this API tells the application
>>> that no mark delivery will be enabled, then the application can just
>>> skip many unnecessary attempts to insert wittingly unsupported flows
>>> during runtime.
> 
> I'm puzzled, because we could have the same reasoning for any offload.

We're not discussing *offloads*. An offload is when the NIC *computes
something* and *delivers* it. We are discussing precisely *delivery*.

> I don't understand why we are focusing on mark only

We are not focusing on mark on purpose. It's just how our discussion
goes. I chose mark (could've chosen flag or anything else) just to show
you an example.

> I would prefer we find a generic solution using the rte_flow API.
> Can we make rte_flow_validate() working before port start?
> If validating a fake rule doesn't make sense,
> why not having a new function accepting a single action as parameter?

A noble idea, but if we feed the entire flow rule to the driver for
validation, then the driver cannot look only for actions FLAG or MARK
in it (to enable or disable metadata delivery). It is obliged to also
validate match criteria, attributes, etc. And, if something is
unsupported (say, some specific item), the driver will have to reject
the rule as a whole, thus leaving the application to connect the dots
itself.

Say, you ask the driver to validate the following rule:
pattern blah-blah-1 / blah-blah-2 / end action flag / end
intending to check support for FLAG delivery. Suppose the driver
doesn't support pattern item "blah-blah-1". It will throw an error right
after seeing this unsupported item and won't even go further to see the
action FLAG. How can the application know whether its request for FLAG
was heard or not?
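
To make the point above concrete, here is a minimal self-contained C
sketch. It is not DPDK code: every sketch_* name and both enums are
hypothetical stand-ins invented for illustration. It assumes a validator
that walks the pattern first and stops at the first unsupported item, so
the return code carries no information about action FLAG.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for flow pattern items and actions. */
enum sketch_item { SKETCH_ITEM_END = 0, SKETCH_ITEM_ETH, SKETCH_ITEM_EXOTIC };
enum sketch_action { SKETCH_ACTION_END = 0, SKETCH_ACTION_FLAG, SKETCH_ACTION_MARK };

/* Assumption: this imaginary driver can match on ETH only. */
static int sketch_item_supported(enum sketch_item item)
{
	return item == SKETCH_ITEM_ETH;
}

/* Assumption: this imaginary driver can deliver both FLAG and MARK. */
static int sketch_action_supported(enum sketch_action action)
{
	return action == SKETCH_ACTION_FLAG || action == SKETCH_ACTION_MARK;
}

/*
 * Validate a whole rule. Returns 0 on success and -ENOTSUP on the first
 * unsupported element. *actions_checked reports whether the action list
 * was inspected at all.
 */
static int sketch_validate(const enum sketch_item *pattern,
			   const enum sketch_action *actions,
			   int *actions_checked)
{
	size_t i;

	*actions_checked = 0;

	for (i = 0; pattern[i] != SKETCH_ITEM_END; i++) {
		if (!sketch_item_supported(pattern[i]))
			return -ENOTSUP; /* the actions are never looked at */
	}

	*actions_checked = 1;

	for (i = 0; actions[i] != SKETCH_ACTION_END; i++) {
		if (!sketch_action_supported(actions[i]))
			return -ENOTSUP;
	}

	return 0;
}
```

With a pattern that starts with SKETCH_ITEM_EXOTIC, the call fails with
-ENOTSUP and actions_checked stays 0, even though this imaginary driver
would have been able to deliver FLAG.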

And I'd not bind delivery of metadata to the flow API. Consider the
following example. We have a DPDK application sitting at the *host* and
we have a *guest* with its *own* DPDK instance. The guest DPDK has asked
the NIC (by virtue of the flow API) to mark all outgoing packets. These
packets reach the *host* DPDK. Say, the host application just wants to
see the marked packets from the guest. Its own (the host's) use of the
flow API is irrelevant here. The host doesn't want to mark packets
itself, it wants to see packets marked by the guest.
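
The host side of this example can be sketched as follows. This is an
illustration only: struct sketch_mbuf, SKETCH_RX_HAS_MARK and
sketch_read_mark() are hypothetical stand-ins, not the real mbuf layout
or flags. The point is that consuming a mark on the Rx path involves no
flow API call by the host; it only requires that the PMD delivers the
mark with the packet.

```c
#include <stdint.h>

/* Hypothetical per-packet flag: "a mark value is present". */
#define SKETCH_RX_HAS_MARK (UINT64_C(1) << 0)

/* Hypothetical, heavily reduced stand-in for a packet buffer. */
struct sketch_mbuf {
	uint64_t ol_flags; /* per-packet flags filled in by the PMD */
	uint32_t mark;     /* valid only if SKETCH_RX_HAS_MARK is set */
};

/*
 * Host-side consumer: returns 1 and stores the mark if the PMD delivered
 * one with this packet, returns 0 otherwise. No flow rule is created by
 * the caller.
 */
static int sketch_read_mark(const struct sketch_mbuf *m, uint32_t *mark)
{
	if ((m->ol_flags & SKETCH_RX_HAS_MARK) == 0)
		return 0;

	*mark = m->mark;
	return 1;
}
```

If delivery of the mark was not enabled before start, the flag is never
set and the host sees nothing, whatever the guest has configured.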

> 
>> Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option
>> is vendor-specific way to address the same problem.
> 
> Not exactly, it is configuring the capabilities:
>    +------+-----------+-----------+-------------+-------------+
>    | Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
>    +======+===========+===========+=============+=============+
>    | 0    | 24 bits   | 32 bits   | 32 bits     | no          |
>    +------+-----------+-----------+-------------+-------------+
>    | 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
>    +------+-----------+-----------+-------------+-------------+
>    | 2    | vary 0-24 | 32 bits   | 32 bits     | yes         |
>    +------+-----------+-----------+-------------+-------------+
> 
> 

-- 
Ivan M

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-10-01  8:54           ` Andrew Rybchenko
@ 2021-10-01  9:32             ` Thomas Monjalon
  2021-10-01  9:41               ` Andrew Rybchenko
  0 siblings, 1 reply; 97+ messages in thread
From: Thomas Monjalon @ 2021-10-01  9:32 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: Ivan Malov, dev, Andy Moreton, orika, ferruh.yigit, olivier.matz

01/10/2021 10:54, Andrew Rybchenko:
> >> Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option
> >> is vendor-specific way to address the same problem.
> > 
> > Not exactly, it is configuring the capabilities:
> >   +------+-----------+-----------+-------------+-------------+
> >   | Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
> >   +======+===========+===========+=============+=============+
> >   | 0    | 24 bits   | 32 bits   | 32 bits     | no          |
> >   +------+-----------+-----------+-------------+-------------+
> >   | 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
> >   +------+-----------+-----------+-------------+-------------+
> >   | 2    | vary 0-24 | 32 bits   | 32 bits     | yes         |
> >   +------+-----------+-----------+-------------+-------------+
> 
> Sorry, but I don't understand the difference. Negotiate is
> exactly about capabilities which we want to use.

The difference is that dv_xmeta_en is not enabling/disabling
the delivery of mark/meta.
It is just extending the scope (cross-domain) and the size.
I think this is not covered in your proposal.



^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-10-01  9:32             ` Thomas Monjalon
@ 2021-10-01  9:41               ` Andrew Rybchenko
  0 siblings, 0 replies; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-01  9:41 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ivan Malov, dev, Andy Moreton, orika, ferruh.yigit, olivier.matz

On 10/1/21 12:32 PM, Thomas Monjalon wrote:
> 01/10/2021 10:54, Andrew Rybchenko:
>>>> Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option
>>>> is vendor-specific way to address the same problem.
>>>
>>> Not exactly, it is configuring the capabilities:
>>>   +------+-----------+-----------+-------------+-------------+
>>>   | Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
>>>   +======+===========+===========+=============+=============+
>>>   | 0    | 24 bits   | 32 bits   | 32 bits     | no          |
>>>   +------+-----------+-----------+-------------+-------------+
>>>   | 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
>>>   +------+-----------+-----------+-------------+-------------+
>>>   | 2    | vary 0-24 | 32 bits   | 32 bits     | yes         |
>>>   +------+-----------+-----------+-------------+-------------+
>>
>> Sorry, but I don't understand the difference. Negotiate is
>> exactly about capabilities which we want to use.
> 
> The difference is that dv_xmeta_en is not enabling/disabling
> the delivery of mark/meta.
> It is just extending the scope (cross-domain) and the size.

Enabling/disabling delivery of some bits...

> I think this is not covered in your proposal.

Yes, that's true, since it is too specific this way,
but our proposal can help to make the best automatic
choice from the above options depending on the negotiation
request.

Of course, you can always enforce a mode via the driver option
and reply to negotiation requests in accordance with
the enforced configuration.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-10-01  8:55           ` Ivan Malov
@ 2021-10-01  9:48             ` Thomas Monjalon
  2021-10-01 10:15               ` Andrew Rybchenko
  0 siblings, 1 reply; 97+ messages in thread
From: Thomas Monjalon @ 2021-10-01  9:48 UTC (permalink / raw)
  To: Andrew Rybchenko, Ivan Malov
  Cc: dev, Andy Moreton, orika, ferruh.yigit, olivier.matz

01/10/2021 10:55, Ivan Malov:
> On 01/10/2021 11:11, Thomas Monjalon wrote:
> > 01/10/2021 08:47, Andrew Rybchenko:
> >> On 9/30/21 10:30 PM, Ivan Malov wrote:
> >>> On 30/09/2021 19:18, Thomas Monjalon wrote:
> >>>> 23/09/2021 13:20, Ivan Malov:
> >>>>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
> >>>>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
> >>>>> only the former has been added. The problem hasn't been solved.
> >>>>> Applications still assume that no efforts are needed to enable
> >>>>> flow mark and similar meta data delivery.
> >>>>>
> >>>>> The team behind net/sfc driver has to take over the efforts since
> >>>>> the problem has started impacting us. Riverhead, a cutting edge
> >>>>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
> >>>>> is available only from long Rx prefix. Switching between the
> >>>>> prefix formats can't happen in started state. Hence, we run
> >>>>> into the same problem which [1] was aiming to solve.
> >>>>
> >>>> Sorry I don't understand what is Rx prefix?
> >>>
> >>> A small chunk of per-packet metadata in Rx packet buffer preceding the
> >>> actual packet data. In terms of mbuf, this could be something lying
> >>> before m->data_off.
> > 
> > I've never seen the word "Rx prefix".
> > In general we talk about mbuf headroom and mbuf metadata,
> > the rest being the mbuf payload and mbuf tailroom.
> > I guess you mean mbuf metadata in the space of the struct rte_mbuf?
> 
> In this paragraph I describe the two ways how the NIC itself can provide 
> metadata buffers of different sizes. Hence the term "Rx prefix". As you 
> understand, the NIC HW is unaware of DPDK, mbufs and whatever else SW 
> concepts. To NIC, this is "Rx prefix", that is, a chunk of per-packet 
> metadata *preceding* the actual packet data. It's responsibility of the 
> PMD to treat this the right way, care about headroom, payload and 
> tailroom. I describe the two Rx prefix formats in NIC terminology just 
> to provide the gist of the problem.

OK but it is confusing as it is vendor-specific.
Please stick with DPDK terms if possible.

> >>>>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
> >>>>> on its own since the corresponding flows must be active to set
> >>>>> the data in the first place. Hence, adding offload flags
> >>>>> similar to RSS_HASH is not a good idea.
> >>>>
> >>>> What means "active" here?
> >>>
> >>> Active = inserted and functional. What this paragraph is trying to say
> >>> is that when you enable, say, RSS_HASH, that implies both computation of
> >>> the hash and the driver's ability to extract in from packets
> >>> ("delivery"). But when it comes to MARK, it's just "delivery". No
> >>> "offload" here: the NIC won't set any mark in packets unless you create
> >>> a flow rule to make it do so. That's the gist of it.
> > 
> > OK
> > Yes I agree RTE_FLOW_ACTION_TYPE_MARK doesn't need any offload flag.
> > Same for RTE_FLOW_ACTION_TYPE_SET_META.
> > 
> >>>>> Patch [1/5] of this series adds a generic API to let applications
> >>>>> negotiate delivery of Rx meta data during initialisation period.
> > 
> > What is a metadata?
> > Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK?
> > Metadata word could cover any field in the mbuf struct so it is vague.
> 
> Metadata here is *any* additional information provided by the NIC for 
> each received packet. For example, Rx flag, Rx mark, RSS hash, packet 
> classification info, you name it. I'd like to stress out that the 
> suggested API comes with flags each of which is crystal clear on what 
> concrete kind of metadata it covers, eg. Rx mark.

I missed the flags.
You mean these 3 flags?

+/** The ethdev sees flagged packets if there are flows with action FLAG. */
+#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
+
+/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
+#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
+
+/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
+#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

It is not crystal clear because it does not reference the API,
like RTE_FLOW_ACTION_TYPE_MARK.
And it covers a limited set of metadata.
Do you intend to extend to all mbuf metadata?

> >>>>> This way, an application knows right from the start which parts
> >>>>> of Rx meta data won't be delivered. Hence, no necessity to try
> >>>>> inserting flows requesting such data and handle the failures.
> >>>>
> >>>> Sorry I don't understand the problem you want to solve.
> >>>> And sorry for not noticing earlier.
> >>>
> >>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
> >>> packets by default (for performance reasons). If the application tries
> >>> to insert a flow with action MARK, the PMD may not be able to enable
> >>> delivery of Rx mark without the need to re-start Rx sub-system. And
> >>> that's fraught with traffic disruption and similar bad consequences. In
> >>> order to address it, we need to let the application express its interest
> >>> in receiving mark with packets as early as possible. This way, the PMD
> >>> can enable Rx mark delivery in advance. And, as an additional benefit,
> >>> the application can learn *from the very beginning* whether it will be
> >>> possible to use the feature or not. If this API tells the application
> >>> that no mark delivery will be enabled, then the application can just
> >>> skip many unnecessary attempts to insert wittingly unsupported flows
> >>> during runtime.
> > 
> > I'm puzzled, because we could have the same reasoning for any offload.
> 
> We're not discussing *offloads*. An offload is when NIC *computes 
> something* and *delivers* it. We are discussing precisely *delivery*.

OK but still, there are a lot more mbuf metadata delivered.

> > I don't understand why we are focusing on mark only
> 
> We are not focusing on mark on purpose. It's just how our discussion 
> goes. I chose mark (could've chosen flag or anything else) just to show 
> you an example.
> 
> > I would prefer we find a generic solution using the rte_flow API.
> > Can we make rte_flow_validate() working before port start?
> > If validating a fake rule doesn't make sense,
> > why not having a new function accepting a single action as parameter?
> 
> A noble idea, but if we feed the entire flow rule to the driver for 
> validation, then the driver must not look specifically for actions FLAG 
> or MARK in it (to enable or disable metadata delivery). This way, the 
> driver is obliged to also validate match criteria, attributes, etc. And, 
> if something is unsupported (say, some specific item), the driver will 
> have to reject the rule as a whole thus leaving the application to join 
> the dots itself.
>
> Say, you ask the driver to validate the following rule:
> pattern blah-blah-1 / blah-blah-2 / end action flag / end
> intending to check support for FLAG delivery. Suppose, the driver 
> doesn't support pattern item "blah-blah-1". It will throw an error right 
> after seeing this unsupported item and won't even go further to see the 
> action FLAG. How can application know whether its request for FLAG was 
> heard or not?

No, I'm proposing a new function to validate the action alone,
without any match etc.
Example:
	rte_flow_action_request(RTE_FLOW_ACTION_TYPE_MARK)
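
One possible shape for such a per-action request is sketched below. It
is a self-contained illustration, not an existing DPDK function: the
signature, the enum, the error codes and the per-port state are all
assumptions made up for this sketch, and only the idea of passing a
single action type comes from the example above.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the rte_flow action types of interest. */
enum sketch_action_type {
	SKETCH_ACTION_TYPE_FLAG,
	SKETCH_ACTION_TYPE_MARK,
	SKETCH_ACTION_TYPE_SET_META,
};

/* Imaginary per-port driver state. */
struct sketch_port {
	int started;        /* non-zero once the port is started */
	uint32_t supported; /* bit N set: data of action type N can be delivered */
	uint32_t requested; /* bit N set: the application asked for type N */
};

/*
 * Ask the driver, before port start, to prepare delivery for one action
 * type. Returns 0 on success, -EBUSY if the port is already started and
 * -ENOTSUP if the driver cannot deliver this action's data.
 */
static int sketch_action_request(struct sketch_port *port,
				 enum sketch_action_type type)
{
	uint32_t bit = UINT32_C(1) << type;

	if (port->started)
		return -EBUSY;
	if ((port->supported & bit) == 0)
		return -ENOTSUP;

	port->requested |= bit;
	return 0;
}
```

In this form the application makes one call per action type and checks
each return code separately.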


> And I'd not bind delivery of metadata to flow API. Consider the 
> following example. We have a DPDK application sitting at the *host* and 
> we have a *guest* with its *own* DPDK instance. The guest DPDK has asked 
> the NIC (by virtue of flow API) to mark all outgoing packets. This 
> packets reach the *host* DPDK. Say, the host application just wants to 
> see the marked packets from the guest. Its own, (the host's) use of flow 
> API is a don't care here. The host doesn't want to mark packets itself, 
> it wants to see packets marked by the guest.

It does not make sense to me. We are talking about a DPDK API.
My concern is to avoid redefining new flags
while we already have rte_flow actions.




^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-10-01  9:48             ` Thomas Monjalon
@ 2021-10-01 10:15               ` Andrew Rybchenko
  2021-10-01 12:10                 ` Thomas Monjalon
  0 siblings, 1 reply; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-01 10:15 UTC (permalink / raw)
  To: Thomas Monjalon, Ivan Malov
  Cc: dev, Andy Moreton, orika, ferruh.yigit, olivier.matz

On 10/1/21 12:48 PM, Thomas Monjalon wrote:
> 01/10/2021 10:55, Ivan Malov:
>> On 01/10/2021 11:11, Thomas Monjalon wrote:
>>> 01/10/2021 08:47, Andrew Rybchenko:
>>>> On 9/30/21 10:30 PM, Ivan Malov wrote:
>>>>> On 30/09/2021 19:18, Thomas Monjalon wrote:
>>>>>> 23/09/2021 13:20, Ivan Malov:
>>>>>>> Patch [1/5] of this series adds a generic API to let applications
>>>>>>> negotiate delivery of Rx meta data during initialisation period.
>>>
>>> What is a metadata?
>>> Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK?
>>> Metadata word could cover any field in the mbuf struct so it is vague.
>>
>> Metadata here is *any* additional information provided by the NIC for 
>> each received packet. For example, Rx flag, Rx mark, RSS hash, packet 
>> classification info, you name it. I'd like to stress out that the 
>> suggested API comes with flags each of which is crystal clear on what 
>> concrete kind of metadata it covers, eg. Rx mark.
> 
> I missed the flags.
> You mean these 3 flags?

Yes

> +/** The ethdev sees flagged packets if there are flows with action FLAG. */
> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
> +
> +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
> +
> +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
> 
> It is not crystal clear because it does not reference the API,
> like RTE_FLOW_ACTION_TYPE_MARK.

Thanks, it is easy to fix. Please note that there is no action
for the tunnel ID case.

> And it covers a limited set of metadata.

Yes, those which are not covered by offloads, packet classification,
etc. Anything else?

> Do you intend to extend to all mbuf metadata?

No. It should be discussed case-by-case separately.

> 
>>>>>>> This way, an application knows right from the start which parts
>>>>>>> of Rx meta data won't be delivered. Hence, no necessity to try
>>>>>>> inserting flows requesting such data and handle the failures.
>>>>>>
>>>>>> Sorry I don't understand the problem you want to solve.
>>>>>> And sorry for not noticing earlier.
>>>>>
>>>>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
>>>>> packets by default (for performance reasons). If the application tries
>>>>> to insert a flow with action MARK, the PMD may not be able to enable
>>>>> delivery of Rx mark without the need to re-start Rx sub-system. And
>>>>> that's fraught with traffic disruption and similar bad consequences. In
>>>>> order to address it, we need to let the application express its interest
>>>>> in receiving mark with packets as early as possible. This way, the PMD
>>>>> can enable Rx mark delivery in advance. And, as an additional benefit,
>>>>> the application can learn *from the very beginning* whether it will be
>>>>> possible to use the feature or not. If this API tells the application
>>>>> that no mark delivery will be enabled, then the application can just
>>>>> skip many unnecessary attempts to insert wittingly unsupported flows
>>>>> during runtime.
>>>
>>> I'm puzzled, because we could have the same reasoning for any offload.
>>
>> We're not discussing *offloads*. An offload is when NIC *computes 
>> something* and *delivers* it. We are discussing precisely *delivery*.
> 
> OK but still, there are a lot more mbuf metadata delivered.

Yes, and some are not yet controlled early enough, which is
what we address here.

> 
>>> I don't understand why we are focusing on mark only
>>
>> We are not focusing on mark on purpose. It's just how our discussion 
>> goes. I chose mark (could've chosen flag or anything else) just to show 
>> you an example.
>>
>>> I would prefer we find a generic solution using the rte_flow API.
>>> Can we make rte_flow_validate() working before port start?
>>> If validating a fake rule doesn't make sense,
>>> why not having a new function accepting a single action as parameter?
>>
>> A noble idea, but if we feed the entire flow rule to the driver for 
>> validation, then the driver must not look specifically for actions FLAG 
>> or MARK in it (to enable or disable metadata delivery). This way, the 
>> driver is obliged to also validate match criteria, attributes, etc. And, 
>> if something is unsupported (say, some specific item), the driver will 
>> have to reject the rule as a whole thus leaving the application to join 
>> the dots itself.
>>
>> Say, you ask the driver to validate the following rule:
>> pattern blah-blah-1 / blah-blah-2 / end action flag / end
>> intending to check support for FLAG delivery. Suppose, the driver 
>> doesn't support pattern item "blah-blah-1". It will throw an error right 
>> after seeing this unsupported item and won't even go further to see the 
>> action FLAG. How can application know whether its request for FLAG was 
>> heard or not?
> 
> No, I'm proposing a new function to validate the action alone,
> without any match etc.
> Example:
> 	rte_flow_action_request(RTE_FLOW_ACTION_TYPE_MARK)

What about tunnel ID?

Also, negotiation in terms of a bitmask natively allows
everything required to be requested at once, and it simplifies
implementation in the driver. No dependency on the order of
checks etc. It also allows renegotiation without any
extra API functions.
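
A minimal sketch of what bitmask negotiation could look like follows.
The three flag definitions are copied from the patch as quoted above;
everything else (struct sketch_port, sketch_rx_meta_negotiate() and its
in/out parameter convention) is a hypothetical stand-in and not the API
from the series.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as quoted from patch [1/5] in this thread. */
#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Imaginary per-port driver state. */
struct sketch_port {
	int started;                /* non-zero once the port is started */
	uint64_t rx_meta_supported; /* what the HW/driver is able to deliver */
	uint64_t rx_meta_enabled;   /* result of the latest negotiation */
};

/*
 * In: the set of Rx meta features the application wants.
 * Out: the subset the driver is going to deliver.
 * Each call replaces the previous result, so renegotiation before start
 * needs no extra function.
 */
static int sketch_rx_meta_negotiate(struct sketch_port *port,
				    uint64_t *features)
{
	if (features == NULL)
		return -EINVAL;
	if (port->started)
		return -EBUSY;

	*features &= port->rx_meta_supported;
	port->rx_meta_enabled = *features;
	return 0;
}
```

The application passes all bits of interest in one call and compares
the returned mask with the requested one. Tunnel ID is covered by the
same call although it has no corresponding flow action.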

> 
> 
>> And I'd not bind delivery of metadata to flow API. Consider the 
>> following example. We have a DPDK application sitting at the *host* and 
>> we have a *guest* with its *own* DPDK instance. The guest DPDK has asked 
>> the NIC (by virtue of flow API) to mark all outgoing packets. This 
>> packets reach the *host* DPDK. Say, the host application just wants to 
>> see the marked packets from the guest. Its own, (the host's) use of flow 
>> API is a don't care here. The host doesn't want to mark packets itself, 
>> it wants to see packets marked by the guest.
> 
> It does not make sense to me. We are talking about a DPDK API.
> My concern is to avoid redefining new flags
> while we already have rte_flow actions.

See above.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-10-01 10:15               ` Andrew Rybchenko
@ 2021-10-01 12:10                 ` Thomas Monjalon
  2021-10-04  9:17                   ` Andrew Rybchenko
  0 siblings, 1 reply; 97+ messages in thread
From: Thomas Monjalon @ 2021-10-01 12:10 UTC (permalink / raw)
  To: Ivan Malov, Andrew Rybchenko
  Cc: dev, Andy Moreton, orika, ferruh.yigit, olivier.matz

01/10/2021 12:15, Andrew Rybchenko:
> On 10/1/21 12:48 PM, Thomas Monjalon wrote:
> > 01/10/2021 10:55, Ivan Malov:
> >> On 01/10/2021 11:11, Thomas Monjalon wrote:
> >>> 01/10/2021 08:47, Andrew Rybchenko:
> >>>> On 9/30/21 10:30 PM, Ivan Malov wrote:
> >>>>> On 30/09/2021 19:18, Thomas Monjalon wrote:
> >>>>>> 23/09/2021 13:20, Ivan Malov:
> >>>>>>> Patch [1/5] of this series adds a generic API to let applications
> >>>>>>> negotiate delivery of Rx meta data during initialisation period.
> >>>
> >>> What is a metadata?
> >>> Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK?
> >>> Metadata word could cover any field in the mbuf struct so it is vague.
> >>
> >> Metadata here is *any* additional information provided by the NIC for 
> >> each received packet. For example, Rx flag, Rx mark, RSS hash, packet 
> >> classification info, you name it. I'd like to stress out that the 
> >> suggested API comes with flags each of which is crystal clear on what 
> >> concrete kind of metadata it covers, eg. Rx mark.
> > 
> > I missed the flags.
> > You mean these 3 flags?
> 
> Yes
> 
> > +/** The ethdev sees flagged packets if there are flows with action FLAG. */
> > +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
> > +
> > +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
> > +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
> > +
> > +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
> > +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
> > 
> > It is not crystal clear because it does not reference the API,
> > like RTE_FLOW_ACTION_TYPE_MARK.
> 
> Thanks, it is easy to fix. Please, note that there is no action
> for tunnel ID case.

I don't understand the tunnel ID meta.
Is it an existing offload? API?

> > And it covers a limited set of metadata.
> 
> Yes which are not covered by offloads, packet classification
> etc. Anything else?
> 
> > Do you intend to extend to all mbuf metadata?
> 
> No. It should be discussed case-by-case separately.

Ah, it makes the intent clearer.
Why not plan to do something truly generic?

> >>>>>>> This way, an application knows right from the start which parts
> >>>>>>> of Rx meta data won't be delivered. Hence, no necessity to try
> >>>>>>> inserting flows requesting such data and handle the failures.
> >>>>>>
> >>>>>> Sorry I don't understand the problem you want to solve.
> >>>>>> And sorry for not noticing earlier.
> >>>>>
> >>>>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
> >>>>> packets by default (for performance reasons). If the application tries
> >>>>> to insert a flow with action MARK, the PMD may not be able to enable
> >>>>> delivery of Rx mark without the need to re-start Rx sub-system. And
> >>>>> that's fraught with traffic disruption and similar bad consequences. In
> >>>>> order to address it, we need to let the application express its interest
> >>>>> in receiving mark with packets as early as possible. This way, the PMD
> >>>>> can enable Rx mark delivery in advance. And, as an additional benefit,
> >>>>> the application can learn *from the very beginning* whether it will be
> >>>>> possible to use the feature or not. If this API tells the application
> >>>>> that no mark delivery will be enabled, then the application can just
> >>>>> skip many unnecessary attempts to insert wittingly unsupported flows
> >>>>> during runtime.
> >>>
> >>> I'm puzzled, because we could have the same reasoning for any offload.
> >>
> >> We're not discussing *offloads*. An offload is when NIC *computes 
> >> something* and *delivers* it. We are discussing precisely *delivery*.
> > 
> > OK but still, there are a lot more mbuf metadata delivered.
> 
> Yes, and some are not controlled yet early enough, and
> we do here.
> 
> > 
> >>> I don't understand why we are focusing on mark only
> >>
> >> We are not focusing on mark on purpose. It's just how our discussion 
> >> goes. I chose mark (could've chosen flag or anything else) just to show 
> >> you an example.
> >>
> >>> I would prefer we find a generic solution using the rte_flow API.
> >>> Can we make rte_flow_validate() working before port start?
> >>> If validating a fake rule doesn't make sense,
> >>> why not having a new function accepting a single action as parameter?
> >>
> >> A noble idea, but if we feed the entire flow rule to the driver for 
> >> validation, then the driver must not look specifically for actions FLAG 
> >> or MARK in it (to enable or disable metadata delivery). This way, the 
> >> driver is obliged to also validate match criteria, attributes, etc. And, 
> >> if something is unsupported (say, some specific item), the driver will 
> >> have to reject the rule as a whole thus leaving the application to join 
> >> the dots itself.
> >>
> >> Say, you ask the driver to validate the following rule:
> >> pattern blah-blah-1 / blah-blah-2 / end action flag / end
> >> intending to check support for FLAG delivery. Suppose, the driver 
> >> doesn't support pattern item "blah-blah-1". It will throw an error right 
> >> after seeing this unsupported item and won't even go further to see the 
> >> action FLAG. How can application know whether its request for FLAG was 
> >> heard or not?
> > 
> > No, I'm proposing a new function to validate the action alone,
> > without any match etc.
> > Example:
> > 	rte_flow_action_request(RTE_FLOW_ACTION_TYPE_MARK)
> 
> What about tunnel ID?
> 
> Also negotiation in terms of bitmask natively allows to
> provide everything required at once and it simplifies
> implementation in the driver. No dependency on order of
> checks etc. Also it allows to renegotiate without any
> extra API functions.

You mean there is a single function call with all bits set?

> >> And I'd not bind delivery of metadata to flow API. Consider the 
> >> following example. We have a DPDK application sitting at the *host* and 
> >> we have a *guest* with its *own* DPDK instance. The guest DPDK has asked 
> >> the NIC (by virtue of flow API) to mark all outgoing packets. These 
> >> packets reach the *host* DPDK. Say, the host application just wants to 
> >> see the marked packets from the guest. Its own, (the host's) use of flow 
> >> API is a don't care here. The host doesn't want to mark packets itself, 
> >> it wants to see packets marked by the guest.
> > 
> > It does not make sense to me. We are talking about a DPDK API.
> > My concern is to avoid redefining new flags
> > while we already have rte_flow actions.
> 
> See above.



* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-09-30 22:22           ` Ivan Malov
@ 2021-10-03  7:05             ` Ori Kam
  0 siblings, 0 replies; 97+ messages in thread
From: Ori Kam @ 2021-10-03  7:05 UTC (permalink / raw)
  To: Ivan Malov, Ajit Khaparde
  Cc: dpdk-dev, Andy Moreton, Andrew Rybchenko, Ray Kinsella,
	Jerin Jacob, Wisam Monther, Xiaoyun Li,
	NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi

> -----Original Message-----
> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
> Subject: Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery
> of Rx meta data
> 
> 
> 
> On 01/10/2021 01:12, Ajit Khaparde wrote:
> > On Thu, Sep 30, 2021 at 3:01 PM Ivan Malov <Ivan.Malov@oktetlabs.ru>
> wrote:
> >>
> >> Hi Ajit,
> >>
> >> On 01/10/2021 00:48, Ajit Khaparde wrote:
> >>> ::::
> >>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> >>>> 97ae52e17e..7a8da3d7ab 100644
> >>>> --- a/app/test-pmd/testpmd.c
> >>>> +++ b/app/test-pmd/testpmd.c
> >>>> @@ -1485,10 +1485,36 @@ static void
> >>>>    init_config_port_offloads(portid_t pid, uint32_t socket_id)
> >>>>    {
> >>>>           struct rte_port *port = &ports[pid];
> >>>> +       uint64_t rx_meta_features = 0;
> >>>>           uint16_t data_size;
> >>>>           int ret;
> >>>>           int i;
> >>>>
> >>>> +       rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
> >>>> +       rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
> >>>> +       rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
> >>>> +
> >>>> +       ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
> >>>> +       if (ret == 0) {
> >>>> +               if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
> >>>> +                       TESTPMD_LOG(INFO, "Flow action FLAG will
> >>>> + not affect Rx mbufs on port %u\n",
> >>> Log level info might be a little too noisy?
> >>
> >> Do you really think so? But main() sets default log level to DEBUG, quote:
> >>       rte_log_set_level(testpmd_logtype, RTE_LOG_DEBUG);
> >>
> >> If I go for DEBUG instead of INFO here, it won't get any quieter, will it?
> > You are right. It won't.
> > But then three extra messages per port will stand out. But that's my
> opinion.
> > Maybe you could log the message when a flow is created with any of the
> > meta features?
> 
> The idea is to warn the user from the very beginning that certain flow
> primitives won't actually work. This way, the user can refrain from trying to
> use them in flow rules. This might save their time.
> 
> But I don't mind going for DEBUG here. More opinions are welcome.
> 

+1 for doing it only for configuration, and not per flow.

> >
> >>
> >>>
> >>>> +                                   pid);
> >>>> +               }
> >>>> +
> >>>> +               if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
> >>>> +                       TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx
> mbufs on port %u\n",
> >>>> +                                   pid);
> >>>> +               }
> >>>> +
> >>>> +               if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
> >>>> +                       TESTPMD_LOG(INFO, "Flow tunnel offload support might
> be limited or unavailable on port %u\n",
> >>>> +                                   pid);
> >>>> +               }
> >>> :::
> >>>>
> >>
> >> --
> >> Ivan M
> 
> --
> Ivan M
Best,
Ori

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-01  6:50         ` Andrew Rybchenko
@ 2021-10-03  7:42           ` Ori Kam
  2021-10-03  9:30             ` Ivan Malov
  0 siblings, 1 reply; 97+ messages in thread
From: Ori Kam @ 2021-10-03  7:42 UTC (permalink / raw)
  To: Andrew Rybchenko, Ivan Malov, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi Andrew and Ivan,


> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Friday, October 1, 2021 9:50 AM
> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
> data
> 
> On 9/30/21 10:07 PM, Ivan Malov wrote:
> > Hi Ori,
> >
> > On 30/09/2021 17:59, Ori Kam wrote:
> >> Hi Ivan,
> >> Sorry for jumping in late.
> >
> > No worries. That's OK.
> >
> >> I have a concern that this patch breaks other PMDs.
> >
> > It does no such thing.
> >
> >>> From the rst file " One should negotiate flag delivery beforehand"
> >> since you only added this function for your PMD all other PMD will fail.
> >> I see that you added exception in the  examples, but it doesn't make
> >> sense that applications will also need to add this exception which is
> >> not documented.
> >
> > Say, you have an application, and you use it with some specific PMD.
> > Say, that PMD doesn't run into the problem as ours does. In other
> > words, the user can insert a flow with action MARK at any point and
> > get mark delivery working starting from that moment without any
> > problem. Say, this is exactly the way how it works for you at the moment.
> >
> > Now. This new API kicks in. We update the application to invoke it as
> > early as possible. But your PMD in question still doesn't support this
> > API. The comment in the patch says that if the method returns ENOTSUP,
> > the application should ignore that without batting an eyelid. It
> > should just keep on working as it did before the introduction of this API.
> >

I understand that it is nice to write in the patch comment that the application
should disregard this function in case of ENOTSUP, but in a few months someone
will read the official doc, where it is stated that this function call is a
must, and then what do you think the application will do?
I think that the correct way is to add this function to all PMDs.
Another option is to add to the doc that if the function returns ENOTSUP,
the application should assume that all is supported.

So from this point of view there is an API break.

> > More specific example:
> > Say, the application doesn't mind using either "RSS + MARK" or tunnel
> > offload. What it does right now is attempt to insert tunnel flows
> > first and, if this fails, fall back to "RSS + MARK". With this API,
> > the application will try to invoke this API with "USER_MARK |
> > TUNNEL_ID" in adapter initialised state. If the PMD says that it can
> > only enable the tunnel offload, then the application will get the
> > knowledge that it doesn't make sense to even try inserting "RSS +
> > MARK" flows. It just can skip useless actions. But if the PMD doesn't
> > support the method, the application will see ENOTSUP and handle this
> > gracefully: it will make no assumptions about what's guaranteed to be
> > supported and what's not and will just keep on its old behavior: try
> > to insert a flow, fail, fall back to another type of flow.
> >

I fully agree with your example, and think that this is the way
to go: the application should supply as much info as possible during startup.
My question/comment is: does the negotiated result mean that all of the actions
are supported on the same rule?
For example, if the application wants to add mark and tag on the same rule
(I know it doesn't make much sense) and the PMD can support both of them
but not on the same rule, what should it return?
Or, for example, if the mark can only be supported when no decap action is set
on this rule, what should be the result?
From my understanding, this function is only to let the PMD know that on some
rules the application will use those actions; checking whether the action
combination is valid only happens in the validate function, right?

In any case, I think this is a good idea, and I will see how we can add a more
generic approach of this API to the new API that I'm going to present.


> > So no breakages with this API.
> >
> >>
> >> Please see more comments inline.
> >>
> >> Thanks,
> >> Ori
> >>
> >>> -----Original Message-----
> >>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> >>> Sent: Thursday, September 23, 2021 2:20 PM
> >>> Subject: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx
> >>> meta data
> >>>
> >>> Delivery of mark, flag and the likes might affect small packet
> >>> performance.
> >>> If these features are disabled by default, enabling them in started
> >>> state without causing traffic disruption may not always be possible.
> >>>
> >>> Let applications negotiate delivery of Rx meta data beforehand.
> >>>
> >>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> >>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> >>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> >>> Acked-by: Jerin Jacob <jerinj@marvell.com>
> >>> ---
> >>>   app/test-flow-perf/main.c              | 21 ++++++++++++
> >>>   app/test-pmd/testpmd.c                 | 26 +++++++++++++++
> >>>   doc/guides/rel_notes/release_21_11.rst |  9 ++++++
> >>>   lib/ethdev/ethdev_driver.h             | 19 +++++++++++
> >>>   lib/ethdev/rte_ethdev.c                | 25 ++++++++++++++
> >>>   lib/ethdev/rte_ethdev.h                | 45
> >>> ++++++++++++++++++++++++++
> >>>   lib/ethdev/rte_flow.h                  | 12 +++++++
> >>>   lib/ethdev/version.map                 |  3 ++
> >>>   8 files changed, 160 insertions(+)
> >>>
> >>> diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
> >>> index 9be8edc31d..48eafffb1d 100644
> >>> --- a/app/test-flow-perf/main.c
> >>> +++ b/app/test-flow-perf/main.c
> >>> @@ -1760,6 +1760,27 @@ init_port(void)
> >>>           rte_exit(EXIT_FAILURE, "Error: can't init mbuf pool\n");
> >>>
> >>>       for (port_id = 0; port_id < nr_ports; port_id++) {
> >>> +        uint64_t rx_meta_features = 0;
> >>> +
> >>> +        rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
> >>> +        rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
> >>> +
> >>> +        ret = rte_eth_rx_meta_negotiate(port_id,
> >>> &rx_meta_features);
> >>> +        if (ret == 0) {
> >>> +            if (!(rx_meta_features &
> >>> RTE_ETH_RX_META_USER_FLAG)) {
> >>> +                printf(":: flow action FLAG will not affect Rx
> >>> mbufs on port=%u\n",
> >>> +                       port_id);
> >>> +            }
> >>> +
> >>> +            if (!(rx_meta_features &
> >>> RTE_ETH_RX_META_USER_MARK)) {
> >>> +                printf(":: flow action MARK will not affect Rx
> >>> mbufs on port=%u\n",
> >>> +                       port_id);
> >>> +            }
> >>> +        } else if (ret != -ENOTSUP) {
> >>> +            rte_exit(EXIT_FAILURE, "Error when negotiating Rx
> >>> meta features on port=%u: %s\n",
> >>> +                 port_id, rte_strerror(-ret));
> >>> +        }
> >>> +
> >>>           ret = rte_eth_dev_info_get(port_id, &dev_info);
> >>>           if (ret != 0)
> >>>               rte_exit(EXIT_FAILURE,
> >>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> >>> index 97ae52e17e..7a8da3d7ab 100644
> >>> --- a/app/test-pmd/testpmd.c
> >>> +++ b/app/test-pmd/testpmd.c
> >>> @@ -1485,10 +1485,36 @@ static void
> >>>   init_config_port_offloads(portid_t pid, uint32_t socket_id)  {
> >>>       struct rte_port *port = &ports[pid];
> >>> +    uint64_t rx_meta_features = 0;
> >>>       uint16_t data_size;
> >>>       int ret;
> >>>       int i;
> >>>
> >>> +    rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
> >>> +    rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
> >>> +    rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
> >>> +
> >>> +    ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
> >>> +    if (ret == 0) {
> >>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
> >>> +            TESTPMD_LOG(INFO, "Flow action FLAG will not
> >>> affect Rx mbufs on port %u\n",
> >>> +                    pid);
> >>> +        }
> >>> +
> >>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK))
> >>> {
> >>> +            TESTPMD_LOG(INFO, "Flow action MARK will not
> >>> affect Rx mbufs on port %u\n",
> >>> +                    pid);
> >>> +        }
> >>> +
> >>> +        if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
> >>> +            TESTPMD_LOG(INFO, "Flow tunnel offload support
> >>> might be limited or unavailable on port %u\n",
> >>> +                    pid);
> >>> +        }
> >>> +    } else if (ret != -ENOTSUP) {
> >>> +        rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta
> >>> features on port %u: %s\n",
> >>> +             pid, rte_strerror(-ret));
> >>> +    }
> >>> +
> >>>       port->dev_conf.txmode = tx_mode;
> >>>       port->dev_conf.rxmode = rx_mode;
> >>>
> >>> diff --git a/doc/guides/rel_notes/release_21_11.rst
> >>> b/doc/guides/rel_notes/release_21_11.rst
> >>> index 19356ac53c..6674d4474c 100644
> >>> --- a/doc/guides/rel_notes/release_21_11.rst
> >>> +++ b/doc/guides/rel_notes/release_21_11.rst
> >>> @@ -106,6 +106,15 @@ New Features
> >>>     Added command-line options to specify total number of processes
> >>> and
> >>>     current process ID. Each process owns subset of Rx and Tx queues.
> >>>
> >>> +* **Added an API to negotiate delivery of specific parts of Rx meta
> >>> +data**
> >>> +
> >>> +  A new API, ``rte_eth_rx_meta_negotiate()``, was added.
> >>> +  The following parts of Rx meta data were defined:
> >>> +
> >>> +  * ``RTE_ETH_RX_META_USER_FLAG``
> >>> +  * ``RTE_ETH_RX_META_USER_MARK``
> >>> +  * ``RTE_ETH_RX_META_TUNNEL_ID``
> >>> +
> >>>
> >>>   Removed Items
> >>>   -------------
> >>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> >>> index 40e474aa7e..96e0c60cae 100644
> >>> --- a/lib/ethdev/ethdev_driver.h
> >>> +++ b/lib/ethdev/ethdev_driver.h
> >>> @@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void
> >>> *rxq, typedef int (*eth_representor_info_get_t)(struct rte_eth_dev
> >>> *dev,
> >>>       struct rte_eth_representor_info *info);
> >>>
> >>> +/**
> >>> + * @internal
> >>> + * Negotiate delivery of specific parts of Rx meta data.
> >>> + *
> >>> + * @param dev
> >>> + *   Port (ethdev) handle
> >>> + *
> >>> + * @param[inout] features
> >>> + *   Feature selection buffer
> >>> + *
> >>> + * @return
> >>> + *   Negative errno value on error, zero otherwise
> >>> + */
> >>> +typedef int (*eth_rx_meta_negotiate_t)(struct rte_eth_dev *dev,
> >>> +                       uint64_t *features);
> >>> +
> >>>   /**
> >>>    * @internal A structure containing the functions exported by an
> >>> Ethernet driver.
> >>>    */
> >>> @@ -949,6 +965,9 @@ struct eth_dev_ops {
> >>>
> >>>       eth_representor_info_get_t representor_info_get;
> >>>       /**< Get representor info. */
> >>> +
> >>> +    eth_rx_meta_negotiate_t rx_meta_negotiate;
> >>> +    /**< Negotiate delivery of specific parts of Rx meta data. */
> >>>   };
> >>>
> >>>   /**
> >>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
> >>> daf5ca9242..49cb84d64c 100644
> >>> --- a/lib/ethdev/rte_ethdev.c
> >>> +++ b/lib/ethdev/rte_ethdev.c
> >>> @@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t
> >>> port_id,
> >>>       return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
> >>>   }
> >>>
> >>> +int
> >>> +rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
> >>> +{
> >>> +    struct rte_eth_dev *dev;
> >>> +
> >>> +    RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> >>> +    dev = &rte_eth_devices[port_id];
> >>> +
> >>> +    if (dev->data->dev_configured != 0) {
> >>> +        RTE_ETHDEV_LOG(ERR,
> >>> +            "The port (id=%"PRIu16") is already configured\n",
> >>> +            port_id);
> >>> +        return -EBUSY;
> >>> +    }
> >>> +
> >>> +    if (features == NULL) {
> >>> +        RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
> >>> +        return -EINVAL;
> >>> +    }
> >>> +
> >>> +    RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_meta_negotiate,
> >>> -ENOTSUP);
> >>> +    return eth_err(port_id,
> >>> +               (*dev->dev_ops->rx_meta_negotiate)(dev, features));
> >>> +}
> >>> +
> >>>   RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
> >>>
> >>>   RTE_INIT(ethdev_init_telemetry)
> >>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> >>> 1da37896d8..8467a7a362 100644
> >>> --- a/lib/ethdev/rte_ethdev.h
> >>> +++ b/lib/ethdev/rte_ethdev.h
> >>> @@ -4888,6 +4888,51 @@ __rte_experimental  int
> >>> rte_eth_representor_info_get(uint16_t port_id,
> >>>                    struct rte_eth_representor_info *info);
> >>>
> >>> +/** The ethdev sees flagged packets if there are flows with action FLAG. */
> >>> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
> >>> +
> >>> +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
> >>> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
> >>> +
> >>> +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
> >>> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
> >>> +
> >>> +/**
> >>> + * @warning
> >>> + * @b EXPERIMENTAL: this API may change without prior notice
> >>> + *
> >>> + * Negotiate delivery of specific parts of Rx meta data.
> >>> + *
> >>> + * Invoke this API before the first rte_eth_dev_configure()
> >>> +invocation
> >>> + * to let the PMD make preparations that are inconvenient to do later.
> >>> + *
> >>> + * The negotiation process is as follows:
> >>> + *
> >>> + * - the application requests features intending to use at least
> >>> +some of them;
> >>> + * - the PMD responds with the guaranteed subset of the requested
> >>> +feature set;
> >>> + * - the application can retry negotiation with another set of
> >>> +features;
> >>> + * - the application can pass zero to clear the negotiation result;
> >>> + * - the last negotiated result takes effect upon the ethdev start.
> >>> + *
> >>> + * If this API is unsupported, the application should gracefully
> >>> ignore that.
> >>> + *
> >>> + * @param port_id
> >>> + *   Port (ethdev) identifier
> >>> + *
> >>> + * @param[inout] features
> >>> + *   Feature selection buffer
> >>> + *
> >>> + * @return
> >>> + *   - (-EBUSY) if the port can't handle this in its current state;
> >>> + *   - (-ENOTSUP) if the method itself is not supported by the PMD;
> >>> + *   - (-ENODEV) if *port_id* is invalid;
> >>> + *   - (-EINVAL) if *features* is NULL;
> >>> + *   - (-EIO) if the device is removed;
> >>> + *   - (0) on success
> >>> + */
> >>> +__rte_experimental
> >>> +int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t
> >>> +*features);
> >>
> >> I don't think meta is the best name since we also have meta item and
> >> the word meta can be used in other cases.
> >
> > I'm no expert in naming. What could be a better term for this?
> > Personally, I'd rather not perceive "meta" the way you describe. It's
> > not just "meta". It's "rx_meta", and the flags supplied with this API
> > provide enough context to explain what it's all about.
> 
> Having thought about it overnight, I'd suggest the full "metadata".
> Yes, it will make the name a bit longer, but it is less confusing than the
> term META, which is already used in the flow API.
> 
Following my comments above, I think it should be part of the new API,
but in any case, what about rx_flow_action_negotiate?

> Andrew.
Best,
Ori

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-03  7:42           ` Ori Kam
@ 2021-10-03  9:30             ` Ivan Malov
  2021-10-03 11:01               ` Ori Kam
  0 siblings, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-10-03  9:30 UTC (permalink / raw)
  To: Ori Kam, Andrew Rybchenko, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi Ori,

Thanks for reviewing this.

On 03/10/2021 10:42, Ori Kam wrote:
> Hi Andrew and Ivan,
> 
> 
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Sent: Friday, October 1, 2021 9:50 AM
>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
>> data
>>
>> On 9/30/21 10:07 PM, Ivan Malov wrote:
>>> Hi Ori,
>>>
>>> On 30/09/2021 17:59, Ori Kam wrote:
>>>> Hi Ivan,
>>>> Sorry for jumping in late.
>>>
>>> No worries. That's OK.
>>>
>>>> I have a concern that this patch breaks other PMDs.
>>>
>>> It does no such thing.
>>>
>>>>>  From the rst file " One should negotiate flag delivery beforehand"
>>>> since you only added this function for your PMD all other PMD will fail.
>>>> I see that you added exception in the  examples, but it doesn't make
>>>> sense that applications will also need to add this exception which is
>>>> not documented.
>>>
>>> Say, you have an application, and you use it with some specific PMD.
>>> Say, that PMD doesn't run into the problem as ours does. In other
>>> words, the user can insert a flow with action MARK at any point and
>>> get mark delivery working starting from that moment without any
>>> problem. Say, this is exactly the way how it works for you at the moment.
>>>
>>> Now. This new API kicks in. We update the application to invoke it as
>>> early as possible. But your PMD in question still doesn't support this
>>> API. The comment in the patch says that if the method returns ENOTSUP,
>>> the application should ignore that without batting an eyelid. It
>>> should just keep on working as it did before the introduction of this API.
>>>
> 
> I understand that it is nice to write in the patch comment that the application
> should disregard this function in case of ENOTSUP, but in a few months someone
> will read the official doc, where it is stated that this function call is a
> must, and then what do you think the application will do?
> I think that the correct way is to add this function to all PMDs.
> Another option is to add to the doc that if the function returns ENOTSUP,
> the application should assume that all is supported.
>
> So from this point of view there is an API break.

So, you mean an API breakage in some formal sense? If the doc is fixed 
in accordance with the second option you suggest, will it suffice to 
avoid this formal API breakage?

> 
>>> More specific example:
>>> Say, the application doesn't mind using either "RSS + MARK" or tunnel
>>> offload. What it does right now is attempt to insert tunnel flows
>>> first and, if this fails, fall back to "RSS + MARK". With this API,
>>> the application will try to invoke this API with "USER_MARK |
>>> TUNNEL_ID" in adapter initialised state. If the PMD says that it can
>>> only enable the tunnel offload, then the application will get the
>>> knowledge that it doesn't make sense to even try inserting "RSS +
>>> MARK" flows. It just can skip useless actions. But if the PMD doesn't
>>> support the method, the application will see ENOTSUP and handle this
>>> gracefully: it will make no assumptions about what's guaranteed to be
>>> supported and what's not and will just keep on its old behavior: try
>>> to insert a flow, fail, fall back to another type of flow.
>>>
> 
> I fully agree with your example, and think that this is the way
> to go, application should supply as much info as possible during startup.

Right.
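
To make the fallback logic from the quoted example concrete, here is a
minimal, self-contained sketch. It does not call DPDK at all: the RX_META_*
constants only mirror the bit layout from the patch, and enum flow_plan and
plan_from_negotiation() are hypothetical names used purely for illustration.
It assumes the negotiation call returns zero or a negative errno value, as
the patch documents.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local stand-ins mirroring the bit layout proposed in the patch. */
#define RX_META_USER_MARK (UINT64_C(1) << 1)
#define RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical application-side decision; not part of any DPDK API. */
enum flow_plan {
	PLAN_ERROR,           /* negotiation failed for a real reason */
	PLAN_LEGACY_FALLBACK, /* nothing known: try tunnel, fail, fall back */
	PLAN_BOTH,            /* delivery of both mark and tunnel ID enabled */
	PLAN_TUNNEL_ONLY,     /* skip "RSS + MARK" flows altogether */
	PLAN_MARK_ONLY,       /* skip tunnel offload flows altogether */
	PLAN_NONE             /* neither kind of delivery is available */
};

static enum flow_plan
plan_from_negotiation(int ret, uint64_t negotiated)
{
	int mark, tunnel;

	/* The negotiation result is zero or a negative errno value. */
	assert(ret <= 0);

	/* Method not implemented by the PMD: make no assumptions. */
	if (ret == -ENOTSUP)
		return PLAN_LEGACY_FALLBACK;
	if (ret != 0)
		return PLAN_ERROR;

	mark = (negotiated & RX_META_USER_MARK) != 0;
	tunnel = (negotiated & RX_META_TUNNEL_ID) != 0;

	if (mark && tunnel)
		return PLAN_BOTH;
	if (tunnel)
		return PLAN_TUNNEL_ONLY;
	if (mark)
		return PLAN_MARK_ONLY;
	return PLAN_NONE;
}
```

The point of the sketch is that -ENOTSUP maps to the old try-and-fall-back
behaviour, not to an error and not to "nothing is supported".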

> My question/comment is: does the negotiated result mean that all of the actions
> are supported on the same rule?
> For example, if the application wants to add mark and tag on the same rule
> (I know it doesn't make much sense) and the PMD can support both of them
> but not on the same rule, what should it return?
> Or, for example, if the mark can only be supported when no decap action is set
> on this rule, what should be the result?
> From my understanding, this function is only to let the PMD know that on some
> rules the application will use those actions; checking whether the action
> combination is valid only happens in the validate function, right?

This API does not bind itself to flow API. It's *not* about enabling 
support for metadata *actions* (they are conducted entirely *inside* the 
NIC). It's about enabling *delivery* of metadata from the NIC to host.

Say, you insert a flow rule to mark some packets. The NIC, internally 
(in the e-switch) adds the mark to matching packets. Yes, in the 
boundaries of the NIC HW, the packets bear the mark on them. It has been 
set, yes. But when time comes to *deliver* the packets to the host, the 
NIC (at least, in net/sfc case) has two options: either provide only a 
small chunk of the metadata for each packet *to the host*, which doesn't 
include mark ID, flag and RSS hash, OR, alternatively, provide the full 
set of metadata. In the former option, the mark is simply not delivered. 
Once again: it *has been set*, but simply will not be *delivered to the 
host*.

So, this API is about negotiating *delivery* of metadata. In pure 
technical sense. And the set of flags that this API returns indicates 
which kinds of metadata the NIC will be able to deliver simultaneously.
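
The difference between a mark being *set* and being *delivered* can be shown
with a toy model. This is only a sketch under stated assumptions: it is not
the net/sfc code, the RX_META_* constants merely mirror the bit layout from
the patch, and enum rx_prefix, choose_rx_prefix() and delivered_mark() are
hypothetical names. It assumes two Rx prefix formats, where only the long one
carries the mark to the host.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins mirroring the bit layout proposed in the patch. */
#define RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RX_META_USER_MARK (UINT64_C(1) << 1)
#define RX_META_TUNNEL_ID (UINT64_C(1) << 2)
#define RX_META_ALL \
	(RX_META_USER_FLAG | RX_META_USER_MARK | RX_META_TUNNEL_ID)

/* Hypothetical Rx prefix formats: the short one omits mark and flag. */
enum rx_prefix { RX_PREFIX_SHORT, RX_PREFIX_LONG };

/* Pick the prefix once, before start, from the negotiated delivery set. */
static enum rx_prefix
choose_rx_prefix(uint64_t negotiated)
{
	/* Only bits known to this model may be passed in. */
	assert((negotiated & ~RX_META_ALL) == 0);

	return negotiated != 0 ? RX_PREFIX_LONG : RX_PREFIX_SHORT;
}

/*
 * What the host sees in the mbuf. The mark has been set inside the NIC
 * in both cases, but it reaches the host only with the long prefix.
 */
static uint32_t
delivered_mark(enum rx_prefix prefix, uint32_t mark_set_in_hw)
{
	return prefix == RX_PREFIX_LONG ? mark_set_in_hw : 0;
}
```

With nothing negotiated, the model keeps the short prefix and the host reads
no mark, even though the flow rule was accepted and applied in hardware.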

For example, as I understand it, in the case of tunnel offload, MLX5 claims 
Rx mark entirely for tunnel ID metadata, so if an application requests 
"MARK | TUNNEL_ID" with this API, this PMD would probably want to 
respond with just "TUNNEL_ID". The application will see the response and 
realise that, even if it adds its *own* (user) action MARK to a flow and 
the flow is not rejected by the PMD, it won't be able to see the mark 
in the received mbufs (or the mark will be incorrect).

But some other PMDs (net/sfc, for instance) claim only a small fraction 
of bits in Rx mark to deliver tunnel ID information. Remaining bits are 
still available for delivery of *user* mark ID. Please see an example at 
https://patches.dpdk.org/project/dpdk/patch/20210929205730.775-2-ivan.malov@oktetlabs.ru/ 
. In this case, the PMD may want to return both flags in the response: 
"MARK | TUNNEL_ID". This way, the application knows that both features 
are enabled and available for use.

Now. I anticipate more questions asking why wouldn't we prefer flow API 
terminology or why wouldn't we add an API for negotiating support for 
metadata *actions* and not just metadata *delivery*. There's an answer. 
Always has been.

The thing is, the use of *actions* is very complicated. For example, the 
PMD may support action MARK for "transfer" flows but not for 
non-"transfer" ones. Also, simultaneous use of multiple different 
metadata actions may not be possible. And, last but not least, if we 
force the application to check support for *actions* on an 
action-by-action basis, the order of checks will be very confusing to 
applications.

Previously, in this thread, Thomas suggested going for exactly this type 
of API, to check support for actions one by one, without any context 
("transfer" / non-"transfer"). I'm afraid this won't be OK.

> 
> In any case I think this is good idea and I will see how we can add a more generic approach of
> this API to the new API that I'm going to present.
> 
> 
>>> So no breakages with this API.
>>>
>>>>
>>>> Please see more comments inline.
>>>>
>>>> Thanks,
>>>> Ori
>>>>
>>>>> -----Original Message-----
>>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>> Sent: Thursday, September 23, 2021 2:20 PM
>>>>> Subject: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx
>>>>> meta data
>>>>>
>>>>> Delivery of mark, flag and the likes might affect small packet
>>>>> performance.
>>>>> If these features are disabled by default, enabling them in started
>>>>> state without causing traffic disruption may not always be possible.
>>>>>
>>>>> Let applications negotiate delivery of Rx meta data beforehand.
>>>>>
>>>>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
>>>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>>>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>>>>> ---
>>>>>    app/test-flow-perf/main.c              | 21 ++++++++++++
>>>>>    app/test-pmd/testpmd.c                 | 26 +++++++++++++++
>>>>>    doc/guides/rel_notes/release_21_11.rst |  9 ++++++
>>>>>    lib/ethdev/ethdev_driver.h             | 19 +++++++++++
>>>>>    lib/ethdev/rte_ethdev.c                | 25 ++++++++++++++
>>>>>    lib/ethdev/rte_ethdev.h                | 45
>>>>> ++++++++++++++++++++++++++
>>>>>    lib/ethdev/rte_flow.h                  | 12 +++++++
>>>>>    lib/ethdev/version.map                 |  3 ++
>>>>>    8 files changed, 160 insertions(+)
>>>>>
>>>>> diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
>>>>> index 9be8edc31d..48eafffb1d 100644
>>>>> --- a/app/test-flow-perf/main.c
>>>>> +++ b/app/test-flow-perf/main.c
>>>>> @@ -1760,6 +1760,27 @@ init_port(void)
>>>>>            rte_exit(EXIT_FAILURE, "Error: can't init mbuf pool\n");
>>>>>
>>>>>        for (port_id = 0; port_id < nr_ports; port_id++) {
>>>>> +        uint64_t rx_meta_features = 0;
>>>>> +
>>>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>>>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>>>>> +
>>>>> +        ret = rte_eth_rx_meta_negotiate(port_id,
>>>>> &rx_meta_features);
>>>>> +        if (ret == 0) {
>>>>> +            if (!(rx_meta_features &
>>>>> RTE_ETH_RX_META_USER_FLAG)) {
>>>>> +                printf(":: flow action FLAG will not affect Rx
>>>>> mbufs on port=%u\n",
>>>>> +                       port_id);
>>>>> +            }
>>>>> +
>>>>> +            if (!(rx_meta_features &
>>>>> RTE_ETH_RX_META_USER_MARK)) {
>>>>> +                printf(":: flow action MARK will not affect Rx
>>>>> mbufs on port=%u\n",
>>>>> +                       port_id);
>>>>> +            }
>>>>> +        } else if (ret != -ENOTSUP) {
>>>>> +            rte_exit(EXIT_FAILURE, "Error when negotiating Rx
>>>>> meta features on port=%u: %s\n",
>>>>> +                 port_id, rte_strerror(-ret));
>>>>> +        }
>>>>> +
>>>>>            ret = rte_eth_dev_info_get(port_id, &dev_info);
>>>>>            if (ret != 0)
>>>>>                rte_exit(EXIT_FAILURE,
>>>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>>>>> index 97ae52e17e..7a8da3d7ab 100644
>>>>> --- a/app/test-pmd/testpmd.c
>>>>> +++ b/app/test-pmd/testpmd.c
>>>>> @@ -1485,10 +1485,36 @@ static void
>>>>>    init_config_port_offloads(portid_t pid, uint32_t socket_id)  {
>>>>>        struct rte_port *port = &ports[pid];
>>>>> +    uint64_t rx_meta_features = 0;
>>>>>        uint16_t data_size;
>>>>>        int ret;
>>>>>        int i;
>>>>>
>>>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>>>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>>>>> +    rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
>>>>> +
>>>>> +    ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
>>>>> +    if (ret == 0) {
>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
>>>>> +            TESTPMD_LOG(INFO, "Flow action FLAG will not
>>>>> affect Rx mbufs on port %u\n",
>>>>> +                    pid);
>>>>> +        }
>>>>> +
>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK))
>>>>> {
>>>>> +            TESTPMD_LOG(INFO, "Flow action MARK will not
>>>>> affect Rx mbufs on port %u\n",
>>>>> +                    pid);
>>>>> +        }
>>>>> +
>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
>>>>> +            TESTPMD_LOG(INFO, "Flow tunnel offload support
>>>>> might be limited or unavailable on port %u\n",
>>>>> +                    pid);
>>>>> +        }
>>>>> +    } else if (ret != -ENOTSUP) {
>>>>> +        rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta
>>>>> features on port %u: %s\n",
>>>>> +             pid, rte_strerror(-ret));
>>>>> +    }
>>>>> +
>>>>>        port->dev_conf.txmode = tx_mode;
>>>>>        port->dev_conf.rxmode = rx_mode;
>>>>>
>>>>> diff --git a/doc/guides/rel_notes/release_21_11.rst
>>>>> b/doc/guides/rel_notes/release_21_11.rst
>>>>> index 19356ac53c..6674d4474c 100644
>>>>> --- a/doc/guides/rel_notes/release_21_11.rst
>>>>> +++ b/doc/guides/rel_notes/release_21_11.rst
>>>>> @@ -106,6 +106,15 @@ New Features
>>>>>      Added command-line options to specify total number of processes
>>>>> and
>>>>>      current process ID. Each process owns subset of Rx and Tx queues.
>>>>>
>>>>> +* **Added an API to negotiate delivery of specific parts of Rx meta
>>>>> +data**
>>>>> +
>>>>> +  A new API, ``rte_eth_rx_meta_negotiate()``, was added.
>>>>> +  The following parts of Rx meta data were defined:
>>>>> +
>>>>> +  * ``RTE_ETH_RX_META_USER_FLAG``
>>>>> +  * ``RTE_ETH_RX_META_USER_MARK``
>>>>> +  * ``RTE_ETH_RX_META_TUNNEL_ID``
>>>>> +
>>>>>
>>>>>    Removed Items
>>>>>    -------------
>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>>> index 40e474aa7e..96e0c60cae 100644
>>>>> --- a/lib/ethdev/ethdev_driver.h
>>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>>> @@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void
>>>>> *rxq, typedef int (*eth_representor_info_get_t)(struct rte_eth_dev
>>>>> *dev,
>>>>>        struct rte_eth_representor_info *info);
>>>>>
>>>>> +/**
>>>>> + * @internal
>>>>> + * Negotiate delivery of specific parts of Rx meta data.
>>>>> + *
>>>>> + * @param dev
>>>>> + *   Port (ethdev) handle
>>>>> + *
>>>>> + * @param[inout] features
>>>>> + *   Feature selection buffer
>>>>> + *
>>>>> + * @return
>>>>> + *   Negative errno value on error, zero otherwise
>>>>> + */
>>>>> +typedef int (*eth_rx_meta_negotiate_t)(struct rte_eth_dev *dev,
>>>>> +                       uint64_t *features);
>>>>> +
>>>>>    /**
>>>>>     * @internal A structure containing the functions exported by an
>>>>> Ethernet driver.
>>>>>     */
>>>>> @@ -949,6 +965,9 @@ struct eth_dev_ops {
>>>>>
>>>>>        eth_representor_info_get_t representor_info_get;
>>>>>        /**< Get representor info. */
>>>>> +
>>>>> +    eth_rx_meta_negotiate_t rx_meta_negotiate;
>>>>> +    /**< Negotiate delivery of specific parts of Rx meta data. */
>>>>>    };
>>>>>
>>>>>    /**
>>>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
>>>>> daf5ca9242..49cb84d64c 100644
>>>>> --- a/lib/ethdev/rte_ethdev.c
>>>>> +++ b/lib/ethdev/rte_ethdev.c
>>>>> @@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t
>>>>> port_id,
>>>>>        return eth_err(port_id, (*dev->dev_ops-
>>>>>> representor_info_get)(dev, info));  }
>>>>>
>>>>> +int
>>>>> +rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
>>>>> +{
>>>>> +    struct rte_eth_dev *dev;
>>>>> +
>>>>> +    RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>>>>> +    dev = &rte_eth_devices[port_id];
>>>>> +
>>>>> +    if (dev->data->dev_configured != 0) {
>>>>> +        RTE_ETHDEV_LOG(ERR,
>>>>> +            "The port (id=%"PRIu16") is already configured\n",
>>>>> +            port_id);
>>>>> +        return -EBUSY;
>>>>> +    }
>>>>> +
>>>>> +    if (features == NULL) {
>>>>> +        RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
>>>>> +        return -EINVAL;
>>>>> +    }
>>>>> +
>>>>> +    RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_meta_negotiate, -ENOTSUP);
>>>>> +    return eth_err(port_id,
>>>>> +               (*dev->dev_ops->rx_meta_negotiate)(dev, features));
>>>>> +}
>>>>> +
>>>>>    RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
>>>>>
>>>>>    RTE_INIT(ethdev_init_telemetry)
>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
>>>>> 1da37896d8..8467a7a362 100644
>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>> @@ -4888,6 +4888,51 @@ __rte_experimental  int
>>>>> rte_eth_representor_info_get(uint16_t port_id,
>>>>>                     struct rte_eth_representor_info *info);
>>>>>
>>>>> +/** The ethdev sees flagged packets if there are flows with action FLAG. */
>>>>> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
>>>>> +
>>>>> +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
>>>>> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
>>>>> +
>>>>> +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
>>>>> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
>>>>> +
>>>>> +/**
>>>>> + * @warning
>>>>> + * @b EXPERIMENTAL: this API may change without prior notice
>>>>> + *
>>>>> + * Negotiate delivery of specific parts of Rx meta data.
>>>>> + *
>>>>> + * Invoke this API before the first rte_eth_dev_configure()
>>>>> +invocation
>>>>> + * to let the PMD make preparations that are inconvenient to do later.
>>>>> + *
>>>>> + * The negotiation process is as follows:
>>>>> + *
>>>>> + * - the application requests features intending to use at least
>>>>> +some of them;
>>>>> + * - the PMD responds with the guaranteed subset of the requested
>>>>> +feature set;
>>>>> + * - the application can retry negotiation with another set of
>>>>> +features;
>>>>> + * - the application can pass zero to clear the negotiation result;
>>>>> + * - the last negotiated result takes effect upon the ethdev start.
>>>>> + *
>>>>> + * If this API is unsupported, the application should gracefully
>>>>> ignore that.
>>>>> + *
>>>>> + * @param port_id
>>>>> + *   Port (ethdev) identifier
>>>>> + *
>>>>> + * @param[inout] features
>>>>> + *   Feature selection buffer
>>>>> + *
>>>>> + * @return
>>>>> + *   - (-EBUSY) if the port can't handle this in its current state;
>>>>> + *   - (-ENOTSUP) if the method itself is not supported by the PMD;
>>>>> + *   - (-ENODEV) if *port_id* is invalid;
>>>>> + *   - (-EINVAL) if *features* is NULL;
>>>>> + *   - (-EIO) if the device is removed;
>>>>> + *   - (0) on success
>>>>> + */
>>>>> +__rte_experimental
>>>>> +int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t
>>>>> +*features);
>>>>
>>>> I don't think meta is the best name since we also have meta item and
>>>> the word meta can be used in other cases.
>>>
>>> I'm no expert in naming. What could be a better term for this?
>>> Personally, I'd rather not perceive "meta" the way you describe. It's
>>> not just "meta". It's "rx_meta", and the flags supplied with this API
>>> provide enough context to explain what it's all about.
>>
>> Thinking overnight about it I'd suggest full "metadata".
>> Yes, it will make the name a bit longer, but less confusing versus the term
>> META already used in flow API.
>>
> Following my above comments, I think it should be part of the new API
> but in any case what about rx_flow_action_negotiate?

See my thoughts above. It makes no sense to negotiate *support for
actions*. Existing "rte_flow_validate()" already does that job. The new
"negotiate Rx metadata" API is all about *delivery* of metadata which is
supposed to be *already* set for the packets *inside* the NIC. So, we
negotiate *delivery from the NIC to the host*. Nothing more.

> 
>> Andrew.
> Best,
> Ori
> 

-- 
Ivan M

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-03  9:30             ` Ivan Malov
@ 2021-10-03 11:01               ` Ori Kam
  2021-10-03 17:30                 ` Ivan Malov
  0 siblings, 1 reply; 97+ messages in thread
From: Ori Kam @ 2021-10-03 11:01 UTC (permalink / raw)
  To: Ivan Malov, Andrew Rybchenko, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi Ivan,

> -----Original Message-----
> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
> Sent: Sunday, October 3, 2021 12:30 PM
> 
> Hi Ori,
> 
> Thanks for reviewing this.
> 

No problem.

> On 03/10/2021 10:42, Ori Kam wrote:
> > Hi Andrew and Ivan,
> >
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >> Sent: Friday, October 1, 2021 9:50 AM
> >> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
> >> Rx meta data
> >>
> >> On 9/30/21 10:07 PM, Ivan Malov wrote:
> >>> Hi Ori,
> >>>
> >>> On 30/09/2021 17:59, Ori Kam wrote:
> >>>> Hi Ivan,
> >>>> Sorry for jumping in late.
> >>>
> >>> No worries. That's OK.
> >>>
> >>>> I have a concern that this patch breaks other PMDs.
> >>>
> >>> It does no such thing.
> >>>
> >>>>>  From the rst file " One should negotiate flag delivery beforehand"
> >>>> since you only added this function for your PMD all other PMD will fail.
> >>>> I see that you added exception in the  examples, but it doesn't
> >>>> make sense that applications will also need to add this exception
> >>>> which is not documented.
> >>>
> >>> Say, you have an application, and you use it with some specific PMD.
> >>> Say, that PMD doesn't run into the problem as ours does. In other
> >>> words, the user can insert a flow with action MARK at any point and
> >>> get mark delivery working starting from that moment without any
> >>> problem. Say, this is exactly the way how it works for you at the
> moment.
> >>>
> >>> Now. This new API kicks in. We update the application to invoke it
> >>> as early as possible. But your PMD in question still doesn't support
> >>> this API. The comment in the patch says that if the method returns
> >>> ENOTSUP, the application should ignore that without batting an
> >>> eyelid. It should just keep on working as it did before the introduction of
> this API.
> >>>
> >
> > I understand that it is nice to write in the patch comment that
> > application should disregard this function in case of ENOTSUP but in a
> > few month someone will read the official doc, where it is stated that
> > this function call is a must and then what do you think the
> > application will do?
> > I think that the correct way is to add this function to all PMDs.
> > Another option is to add to the doc that if the function is returning
> > ENOTSUP the application should assume that all is supported.
> >
> > So from this point of view there is API break.
> 
> So, you mean an API breakage in some formal sense? If the doc is fixed in
> accordance with the second option you suggest, will it suffice to avoid this
> formal API breakage?
> 

Yes, but I think it will be better to add the missing function.

> >
> >>> More specific example:
> >>> Say, the application doesn't mind using either "RSS + MARK" or
> >>> tunnel offload. What it does right now is attempt to insert tunnel
> >>> flows first and, if this fails, fall back to "RSS + MARK". With this
> >>> API, the application will try to invoke this API with "USER_MARK |
> >>> TUNNEL_ID" in adapter initialised state. If the PMD says that it can
> >>> only enable the tunnel offload, then the application will get the
> >>> knowledge that it doesn't make sense to even try inserting "RSS +
> >>> MARK" flows. It just can skip useless actions. But if the PMD
> >>> doesn't support the method, the application will see ENOTSUP and
> >>> handle this
> >>> gracefully: it will make no assumptions about what's guaranteed to
> >>> be supported and what's not and will just keep on its old behavior:
> >>> try to insert a flow, fail, fall back to another type of flow.
> >>>
> >
> > I fully agree with your example, and think that this is the way to go,
> > application should supply as much info as possible during startup.
> 
> Right.
> 
> > My question/comment is the negotiated result means that all of the
> > actions are supported on the same rule?
> > for example if application wants to add mark and tag on the same rule.
> > (I know it doesn't make much sense) and the PMD can support both of
> > them but not on the same rule, what should it return?
> > Or for example if using the mark can only be supported if no decap
> > action is set on this rule what should be the result?
> >  From my understanding this function is only to let the PMD know that on
> > some rules the application will use those actions, the checking if the
> > action combination is valid only happens on validate function right?
> 
> This API does not bind itself to flow API. It's *not* about enabling support for
> metadata *actions* (they are conducted entirely *inside* the NIC). It's
> about enabling *delivery* of metadata from the NIC to host.
>

Good point. So why not use the same logic as for the metadata and register it?
In any case, this is something in the mbuf, so maybe this should be the answer?
 
> Say, you insert a flow rule to mark some packets. The NIC, internally (in the
> e-switch) adds the mark to matching packets. Yes, in the boundaries of the
> NIC HW, the packets bear the mark on them. It has been set, yes. But when
> time comes to *deliver* the packets to the host, the NIC (at least, in net/sfc
> case) has two options: either provide only a small chunk of the metadata for
> each packet *to the host*, which doesn't include mark ID, flag and RSS hash,
> OR, alternatively, provide the full set of metadata. In the former option, the
> mark is simply not delivered.
> Once again: it *has been set*, but simply will not be *delivered to the host*.
> 
> So, this API is about negotiating *delivery* of metadata. In pure technical
> sense. And the set of flags that this API returns indicates which kinds of
> metadata the NIC will be able to deliver simultaneously.
> 
> For example, as I understand, in the case of tunnel offload, MLX5 claims Rx
> mark entirely for tunnel ID metadata, so, if an application requests "MARK |
> TUNNEL_ID" with this API, this PMD should probably want to respond with
> just "TUNNEL_ID". The application will see the response and realise that,
> even if it adds its *own* (user) action MARK to a flow and if the flow is not
> rejected by the PMD, it won't be able to see the mark in the received mbufs
> (or the mark will be incorrect).
>
So what should the application do if it wants MARK on some flows and FLAG on others?
From the DPDK viewpoint, the two can't be used in the same rule
(they use the same space in the mbuf), so the application will never
ask for both of them in the same rule. But it can ask for MARK on some rules
and for FLAG on others; even in your code you request both of them.

So what should the PMD return if it can support both of them, just not on the same
rule?

One option is to document that the supported values are not per rule but for the entire
port. For instance, in the example you gave, MLX5 will support mark + flag but will not
support mark + tunnel.

Also, considering your example, the negotiation may give a subpar result.
In your example the PMD returned TUNNEL_ID, but maybe the application would prefer
to have the mark and not the TUNNEL_ID. I understand that the application can check
and try again with just the MARK.
You are inserting logic into the PMD. Maybe the function should just fail, perhaps
returning the conflicting items?


 
> But some other PMDs (net/sfc, for instance) claim only a small fraction of bits
> in Rx mark to deliver tunnel ID information. Remaining bits are still available
> for delivery of *user* mark ID. Please see an example at
> https://patches.dpdk.org/project/dpdk/patch/20210929205730.775-2-
> ivan.malov@oktetlabs.ru/
> . In this case, the PMD may want to return both flags in the response:
> "MARK | TUNNEL_ID". This way, the application knows that both features
> are enabled and available for use.
> 
> Now. I anticipate more questions asking why wouldn't we prefer flow API
> terminology or why wouldn't we add an API for negotiating support for
> metadata *actions* and not just metadata *delivery*. There's an answer.
> Always has been.
> 
> The thing is, the use of *actions* is very complicated. For example, the PMD
> may support action MARK for "transfer" flows but not for non-"transfer"
> ones. Also, simultaneous use of multiple different metadata actions may not
> be possible. And, last but not least, if we force the application to check
> support for *actions* on action-after-action basis, the order of checks will be
> very confusing to applications.
> 
> Previously, in this thread, Thomas suggested to go for exactly this type of
> API, to check support for actions one-by-one, without any context
> ("transfer" / non-"transfer"). I'm afraid, this won't be OK.
> 
+1 to keeping it as a separate API. (I agree that action limitations form a very complex matrix.)

> >
> > In any case I think this is good idea and I will see how we can add a
> > more generic approach of this API to the new API that I'm going to present.
> >
> >
> >>> So no breakages with this API.
> >>>
> >>>>
> >>>> Please see more comments inline.
> >>>>
> >>>> Thanks,
> >>>> Ori
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> >>>>> Sent: Thursday, September 23, 2021 2:20 PM
> >>>>> Subject: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
> >>>>> Rx meta data
> >>>>>
> >>>>> Delivery of mark, flag and the likes might affect small packet
> >>>>> performance.
> >>>>> If these features are disabled by default, enabling them in
> >>>>> started state without causing traffic disruption may not always be
> possible.
> >>>>>
> >>>>> Let applications negotiate delivery of Rx meta data beforehand.
> >>>>>
> >>>>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> >>>>> Reviewed-by: Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>
> >>>>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> >>>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> >>>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
> >>>>> ---
> >>>>>    app/test-flow-perf/main.c              | 21 ++++++++++++
> >>>>>    app/test-pmd/testpmd.c                 | 26 +++++++++++++++
> >>>>>    doc/guides/rel_notes/release_21_11.rst |  9 ++++++
> >>>>>    lib/ethdev/ethdev_driver.h             | 19 +++++++++++
> >>>>>    lib/ethdev/rte_ethdev.c                | 25 ++++++++++++++
> >>>>>    lib/ethdev/rte_ethdev.h                | 45
> >>>>> ++++++++++++++++++++++++++
> >>>>>    lib/ethdev/rte_flow.h                  | 12 +++++++
> >>>>>    lib/ethdev/version.map                 |  3 ++
> >>>>>    8 files changed, 160 insertions(+)
> >>>>>
> >>>>> diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
> >>>>> index 9be8edc31d..48eafffb1d 100644
> >>>>> --- a/app/test-flow-perf/main.c
> >>>>> +++ b/app/test-flow-perf/main.c
> >>>>> @@ -1760,6 +1760,27 @@ init_port(void)
> >>>>>            rte_exit(EXIT_FAILURE, "Error: can't init mbuf
> >>>>> pool\n");
> >>>>>
> >>>>>        for (port_id = 0; port_id < nr_ports; port_id++) {
> >>>>> +        uint64_t rx_meta_features = 0;
> >>>>> +
> >>>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
> >>>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
> >>>>> +
> >>>>> +        ret = rte_eth_rx_meta_negotiate(port_id,
> >>>>> &rx_meta_features);
> >>>>> +        if (ret == 0) {
> >>>>> +            if (!(rx_meta_features &
> >>>>> RTE_ETH_RX_META_USER_FLAG)) {
> >>>>> +                printf(":: flow action FLAG will not affect Rx
> >>>>> mbufs on port=%u\n",
> >>>>> +                       port_id);
> >>>>> +            }
> >>>>> +
> >>>>> +            if (!(rx_meta_features &
> >>>>> RTE_ETH_RX_META_USER_MARK)) {
> >>>>> +                printf(":: flow action MARK will not affect Rx
> >>>>> mbufs on port=%u\n",
> >>>>> +                       port_id);
> >>>>> +            }
> >>>>> +        } else if (ret != -ENOTSUP) {
> >>>>> +            rte_exit(EXIT_FAILURE, "Error when negotiating Rx
> >>>>> meta features on port=%u: %s\n",
> >>>>> +                 port_id, rte_strerror(-ret));
> >>>>> +        }
> >>>>> +
> >>>>>            ret = rte_eth_dev_info_get(port_id, &dev_info);
> >>>>>            if (ret != 0)
> >>>>>                rte_exit(EXIT_FAILURE, diff --git
> >>>>> a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> >>>>> 97ae52e17e..7a8da3d7ab 100644
> >>>>> --- a/app/test-pmd/testpmd.c
> >>>>> +++ b/app/test-pmd/testpmd.c
> >>>>> @@ -1485,10 +1485,36 @@ static void
> >>>>>    init_config_port_offloads(portid_t pid, uint32_t socket_id)  {
> >>>>>        struct rte_port *port = &ports[pid];
> >>>>> +    uint64_t rx_meta_features = 0;
> >>>>>        uint16_t data_size;
> >>>>>        int ret;
> >>>>>        int i;
> >>>>>
> >>>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
> >>>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
> >>>>> +    rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
> >>>>> +
> >>>>> +    ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
> >>>>> +    if (ret == 0) {
> >>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
> >>>>> +            TESTPMD_LOG(INFO, "Flow action FLAG will not
> >>>>> affect Rx mbufs on port %u\n",
> >>>>> +                    pid);
> >>>>> +        }
> >>>>> +
> >>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK))
> >>>>> {
> >>>>> +            TESTPMD_LOG(INFO, "Flow action MARK will not
> >>>>> affect Rx mbufs on port %u\n",
> >>>>> +                    pid);
> >>>>> +        }
> >>>>> +
> >>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
> >>>>> +            TESTPMD_LOG(INFO, "Flow tunnel offload support
> >>>>> might be limited or unavailable on port %u\n",
> >>>>> +                    pid);
> >>>>> +        }
> >>>>> +    } else if (ret != -ENOTSUP) {
> >>>>> +        rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta
> >>>>> features on port %u: %s\n",
> >>>>> +             pid, rte_strerror(-ret));
> >>>>> +    }
> >>>>> +
> >>>>>        port->dev_conf.txmode = tx_mode;
> >>>>>        port->dev_conf.rxmode = rx_mode;
> >>>>>
> >>>>> diff --git a/doc/guides/rel_notes/release_21_11.rst
> >>>>> b/doc/guides/rel_notes/release_21_11.rst
> >>>>> index 19356ac53c..6674d4474c 100644
> >>>>> --- a/doc/guides/rel_notes/release_21_11.rst
> >>>>> +++ b/doc/guides/rel_notes/release_21_11.rst
> >>>>> @@ -106,6 +106,15 @@ New Features
> >>>>>      Added command-line options to specify total number of
> >>>>> processes and
> >>>>>      current process ID. Each process owns subset of Rx and Tx queues.
> >>>>>
> >>>>> +* **Added an API to negotiate delivery of specific parts of Rx
> >>>>> +meta
> >>>>> +data**
> >>>>> +
> >>>>> +  A new API, ``rte_eth_rx_meta_negotiate()``, was added.
> >>>>> +  The following parts of Rx meta data were defined:
> >>>>> +
> >>>>> +  * ``RTE_ETH_RX_META_USER_FLAG``
> >>>>> +  * ``RTE_ETH_RX_META_USER_MARK``
> >>>>> +  * ``RTE_ETH_RX_META_TUNNEL_ID``
> >>>>> +
> >>>>>
> >>>>>    Removed Items
> >>>>>    -------------
> >>>>> diff --git a/lib/ethdev/ethdev_driver.h
> >>>>> b/lib/ethdev/ethdev_driver.h index 40e474aa7e..96e0c60cae 100644
> >>>>> --- a/lib/ethdev/ethdev_driver.h
> >>>>> +++ b/lib/ethdev/ethdev_driver.h
> >>>>> @@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void
> >>>>> *rxq, typedef int (*eth_representor_info_get_t)(struct rte_eth_dev
> >>>>> *dev,
> >>>>>        struct rte_eth_representor_info *info);
> >>>>>
> >>>>> +/**
> >>>>> + * @internal
> >>>>> + * Negotiate delivery of specific parts of Rx meta data.
> >>>>> + *
> >>>>> + * @param dev
> >>>>> + *   Port (ethdev) handle
> >>>>> + *
> >>>>> + * @param[inout] features
> >>>>> + *   Feature selection buffer
> >>>>> + *
> >>>>> + * @return
> >>>>> + *   Negative errno value on error, zero otherwise
> >>>>> + */
> >>>>> +typedef int (*eth_rx_meta_negotiate_t)(struct rte_eth_dev *dev,
> >>>>> +                       uint64_t *features);
> >>>>> +
> >>>>>    /**
> >>>>>     * @internal A structure containing the functions exported by
> >>>>> an Ethernet driver.
> >>>>>     */
> >>>>> @@ -949,6 +965,9 @@ struct eth_dev_ops {
> >>>>>
> >>>>>        eth_representor_info_get_t representor_info_get;
> >>>>>        /**< Get representor info. */
> >>>>> +
> >>>>> +    eth_rx_meta_negotiate_t rx_meta_negotiate;
> >>>>> +    /**< Negotiate delivery of specific parts of Rx meta data. */
> >>>>>    };
> >>>>>
> >>>>>    /**
> >>>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> >>>>> index daf5ca9242..49cb84d64c 100644
> >>>>> --- a/lib/ethdev/rte_ethdev.c
> >>>>> +++ b/lib/ethdev/rte_ethdev.c
> >>>>> @@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t
> >>>>> port_id,
> >>>>>        return eth_err(port_id, (*dev->dev_ops-
> >>>>>> representor_info_get)(dev, info));  }
> >>>>>
> >>>>> +int
> >>>>> +rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
> >>>>> +{
> >>>>> +    struct rte_eth_dev *dev;
> >>>>> +
> >>>>> +    RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> >>>>> +    dev = &rte_eth_devices[port_id];
> >>>>> +
> >>>>> +    if (dev->data->dev_configured != 0) {
> >>>>> +        RTE_ETHDEV_LOG(ERR,
> >>>>> +            "The port (id=%"PRIu16") is already configured\n",
> >>>>> +            port_id);
> >>>>> +        return -EBUSY;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (features == NULL) {
> >>>>> +        RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
> >>>>> +        return -EINVAL;
> >>>>> +    }
> >>>>> +
> >>>>> +    RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_meta_negotiate, -ENOTSUP);
> >>>>> +    return eth_err(port_id,
> >>>>> +               (*dev->dev_ops->rx_meta_negotiate)(dev, features));
> >>>>> +}
> >>>>> +
> >>>>>    RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
> >>>>>
> >>>>>    RTE_INIT(ethdev_init_telemetry) diff --git
> >>>>> a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> >>>>> 1da37896d8..8467a7a362 100644
> >>>>> --- a/lib/ethdev/rte_ethdev.h
> >>>>> +++ b/lib/ethdev/rte_ethdev.h
> >>>>> @@ -4888,6 +4888,51 @@ __rte_experimental  int
> >>>>> rte_eth_representor_info_get(uint16_t port_id,
> >>>>>                     struct rte_eth_representor_info *info);
> >>>>>
> >>>>> +/** The ethdev sees flagged packets if there are flows with action FLAG. */
> >>>>> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
> >>>>> +
> >>>>> +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
> >>>>> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
> >>>>> +
> >>>>> +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
> >>>>> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
> >>>>> +
> >>>>> +/**
> >>>>> + * @warning
> >>>>> + * @b EXPERIMENTAL: this API may change without prior notice
> >>>>> + *
> >>>>> + * Negotiate delivery of specific parts of Rx meta data.
> >>>>> + *
> >>>>> + * Invoke this API before the first rte_eth_dev_configure()
> >>>>> +invocation
> >>>>> + * to let the PMD make preparations that are inconvenient to do
> later.
> >>>>> + *
> >>>>> + * The negotiation process is as follows:
> >>>>> + *
> >>>>> + * - the application requests features intending to use at least
> >>>>> +some of them;
> >>>>> + * - the PMD responds with the guaranteed subset of the requested
> >>>>> +feature set;
> >>>>> + * - the application can retry negotiation with another set of
> >>>>> +features;
> >>>>> + * - the application can pass zero to clear the negotiation
> >>>>> +result;
> >>>>> + * - the last negotiated result takes effect upon the ethdev start.
> >>>>> + *
> >>>>> + * If this API is unsupported, the application should gracefully
> >>>>> ignore that.
> >>>>> + *
> >>>>> + * @param port_id
> >>>>> + *   Port (ethdev) identifier
> >>>>> + *
> >>>>> + * @param[inout] features
> >>>>> + *   Feature selection buffer
> >>>>> + *
> >>>>> + * @return
> >>>>> + *   - (-EBUSY) if the port can't handle this in its current
> >>>>> +state;
> >>>>> + *   - (-ENOTSUP) if the method itself is not supported by the
> >>>>> +PMD;
> >>>>> + *   - (-ENODEV) if *port_id* is invalid;
> >>>>> + *   - (-EINVAL) if *features* is NULL;
> >>>>> + *   - (-EIO) if the device is removed;
> >>>>> + *   - (0) on success
> >>>>> + */
> >>>>> +__rte_experimental
> >>>>> +int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t
> >>>>> +*features);
> >>>>
> >>>> I don't think meta is the best name since we also have meta item
> >>>> and the word meta can be used in other cases.
> >>>
> >>> I'm no expert in naming. What could be a better term for this?
> >>> Personally, I'd rather not perceive "meta" the way you describe.
> >>> It's not just "meta". It's "rx_meta", and the flags supplied with
> >>> this API provide enough context to explain what it's all about.
> >>
> >> Thinking overnight about it I'd suggest full "metadata".
> >> Yes, it will make the name a bit longer, but less confusing versus the
> >> term META already used in flow API.
> >>
> > Following my above comments, I think it should be part of the new API
> > but in any case what about rx_flow_action_negotiate?
> 
> See my thoughts above. It makes no sense to negotiate *support for
> actions*. Existing "rte_flow_validate()" already does that job. The new
> "negotiate Rx metadata" API is all about *delivery* of metadata which is
> supposed to be *already* set for the packets *inside* the NIC. So, we
> negotiate *delivery from the NIC to the host*. Nothing more.
> 
I agree with your comment, but then maybe we should go with the register
approach, just like metadata?

Best,
Ori
> >
> >> Andrew.
> > Best,
> > Ori
> >
> 
> --
> Ivan M

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-03 11:01               ` Ori Kam
@ 2021-10-03 17:30                 ` Ivan Malov
  2021-10-03 21:04                   ` Ori Kam
  0 siblings, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-10-03 17:30 UTC (permalink / raw)
  To: Ori Kam, Andrew Rybchenko, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi Ori,

On 03/10/2021 14:01, Ori Kam wrote:
> Hi Ivan,
> 
>> -----Original Message-----
>> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
>> Sent: Sunday, October 3, 2021 12:30 PM
>>
>> Hi Ori,
>>
>> Thanks for reviewing this.
>>
> 
> No problem.
> 
>> On 03/10/2021 10:42, Ori Kam wrote:
>>> Hi Andrew and Ivan,
>>>
>>>
>>>> -----Original Message-----
>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>> Sent: Friday, October 1, 2021 9:50 AM
>>>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
>>>> Rx meta data
>>>>
>>>> On 9/30/21 10:07 PM, Ivan Malov wrote:
>>>>> Hi Ori,
>>>>>
>>>>> On 30/09/2021 17:59, Ori Kam wrote:
>>>>>> Hi Ivan,
>>>>>> Sorry for jumping in late.
>>>>>
>>>>> No worries. That's OK.
>>>>>
>>>>>> I have a concern that this patch breaks other PMDs.
>>>>>
>>>>> It does no such thing.
>>>>>
>>>>>>>   From the rst file " One should negotiate flag delivery beforehand"
>>>>>> since you only added this function for your PMD all other PMD will fail.
>>>>>> I see that you added exception in the  examples, but it doesn't
>>>>>> make sense that applications will also need to add this exception
>>>>>> which is not documented.
>>>>>
>>>>> Say, you have an application, and you use it with some specific PMD.
>>>>> Say, that PMD doesn't run into the problem as ours does. In other
>>>>> words, the user can insert a flow with action MARK at any point and
>>>>> get mark delivery working starting from that moment without any
>>>>> problem. Say, this is exactly the way how it works for you at the
>> moment.
>>>>>
>>>>> Now. This new API kicks in. We update the application to invoke it
>>>>> as early as possible. But your PMD in question still doesn't support
>>>>> this API. The comment in the patch says that if the method returns
>>>>> ENOTSUP, the application should ignore that without batting an
>>>>> eyelid. It should just keep on working as it did before the introduction of
>> this API.
>>>>>
>>>
>>> I understand that it is nice to write in the patch comment that
>>> application should disregard this function in case of ENOTSUP but in a
>>> few month someone will read the official doc, where it is stated that
>>> this function call is a must and then what do you think the
>>> application will do?
>>> I think that the correct way is to add this function to all PMDs.
>>> Another option is to add to the doc that if the function is returning
>>> ENOTSUP the application should assume that all is supported.
>>>
>>> So from this point of view there is API break.
>>
>> So, you mean an API breakage in some formal sense? If the doc is fixed in
>> accordance with the second option you suggest, will it suffice to avoid this
>> formal API breakage?
>>
> 
> Yes, but I think it will be better to add the missing function.
> 
>>>
>>>>> More specific example:
>>>>> Say, the application doesn't mind using either "RSS + MARK" or
>>>>> tunnel offload. What it does right now is attempt to insert tunnel
>>>>> flows first and, if this fails, fall back to "RSS + MARK". With this
>>>>> API, the application will try to invoke this API with "USER_MARK |
>>>>> TUNNEL_ID" in adapter initialised state. If the PMD says that it can
>>>>> only enable the tunnel offload, then the application will get the
>>>>> knowledge that it doesn't make sense to even try inserting "RSS +
>>>>> MARK" flows. It just can skip useless actions. But if the PMD
>>>>> doesn't support the method, the application will see ENOTSUP and
>>>>> handle this
>>>>> gracefully: it will make no assumptions about what's guaranteed to
>>>>> be supported and what's not and will just keep on its old behavior:
>>>>> try to insert a flow, fail, fall back to another type of flow.
>>>>>
>>>
>>> I fully agree with your example, and think that this is the way to go,
>>> application should supply as much info as possible during startup.
>>
>> Right.
>>
>>> My question/comment is the negotiated result means that all of the
>>> actions are supported on the same rule?
>>> for example if application wants to add mark and tag on the same rule.
>>> (I know it doesn't make much sense) and the PMD can support both of
>>> them but not on the same rule, what should it return?
>>> Or for example if using the mark can only be supported if no decap
>>> action is set on this rule what should be the result?
>>>   From my undstanding this function is only to let the PMD know that on
>>> some rules the application will use those actions, the checking if the
>>> action combination is valid only happens on validate function right?
>>
>> This API does not bind itself to flow API. It's *not* about enabling support for
>> metadata *actions* (they are conducted entirely *inside* the NIC). It's
>> about enabling *delivery* of metadata from the NIC to host.
>>
> 
> Good point so why not use the same logic as the metadata and register it?
> Since in any case, this is something in the mbuf so maybe this should be the answer?

I didn't catch your thought. Could you please elaborate on it?

>   
>> Say, you insert a flow rule to mark some packets. The NIC, internally (in the
>> e-switch) adds the mark to matching packets. Yes, in the boundaries of the
>> NIC HW, the packets bear the mark on them. It has been set, yes. But when
>> time comes to *deliver* the packets to the host, the NIC (at least, in net/sfc
>> case) has two options: either provide only a small chunk of the metadata for
>> each packet *to the host*, which doesn't include mark ID, flag and RSS hash,
>> OR, alternatively, provide the full set of metadata. In the former option, the
>> mark is simply not delivered.
>> Once again: it *has been set*, but simply will not be *delivered to the host*.
>>
>> So, this API is about negotiating *delivery* of metadata. In pure technical
>> sense. And the set of flags that this API returns indicates which kinds of
>> metadata the NIC will be able to deliver simultaneously.
>>
>> For example, as I understand, in the case of tunnel offload, MLX5 claims Rx
>> mark entirely for tunnel ID metadata, so, if an application requests "MARK |
>> TUNNEL_ID" with this API, this PMD should probably want to respond with
>> just "TUNNEL_ID". The application will see the response and realise that,
>> even if it adds its *own* (user) action MARK to a flow and if the flow is not
>> rejected by the PMD, it won't be able to see the mark in the received mbufs
>> (or the mark will be incorrect).
>>
> So what should the application do if on some flows it wants MARK and on other FLAG?

You mentioned flows, so I'd like to stress this one more time: all this
API cares about is the ability to deliver metadata from the NIC to the
host. Here, the host == the PMD (*not* the application).

>  From DPDK viewpoint both of them can't be shared on the same rule
> (they are using the same space in mbuf) so the application will never
> ask for both of them in the same rule but he can on some rules ask for mark
> while on other request for FLAG, even in your code you added both of them.
> 
> So what should the PMD return if it can support both of them just not at the same
> rule?

Please see above. This is not about rules. Nor is it about how flag and
mark are presented *by* the PMD *to* the application in mbufs.
Simultaneous use of actions FLAG and MARK in flows must be ruled out by
rte_flow_validate() / rte_flow_create(). How flag and mark are
*represented* in mbufs is the mbuf library's responsibility.

Consider the following operational sequence:

1) The NIC has a packet, which has metadata associated with it;
2) The NIC transfers this packet to the host;
3) The PMD sees the packet and its metadata;
4) The PMD represents whatever metadata is available in mbuf format.

Features negotiated by virtue of this API (for instance, FLAG and MARK) 
enable delivery of these kinds of metadata between points (2) and (3).

And the problem of flag / mark co-existence in mbufs sits at point (4).

-> Completely different problems, in fact.
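
To make the split concrete, here is a minimal sketch of the four points above. Every name in it (the sketch_* structs and functions, the SKETCH_* flags) is made up for illustration; none of it is real ethdev, mbuf or net/sfc code, and the toy mbuf deliberately sidesteps how flag and mark really share space.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the proposed flags; not the real definitions. */
#define SKETCH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define SKETCH_RX_META_USER_MARK (UINT64_C(1) << 1)

/* Point (1): metadata the NIC associates with a packet internally. */
struct sketch_nic_meta {
	int flagged;
	int has_mark;
	uint32_t mark;
};

/* Points (2)-(3): what the PMD actually sees after the transfer. */
struct sketch_delivered_meta {
	uint64_t present;
	uint32_t mark;
};

/* Point (4): a toy mbuf; the real representation is a separate problem. */
struct sketch_mbuf {
	int has_flag;
	int has_mark;
	uint32_t mark;
};

/* Only negotiated kinds of metadata cross the NIC -> host boundary. */
static struct sketch_delivered_meta
sketch_nic_deliver(const struct sketch_nic_meta *m, uint64_t negotiated)
{
	struct sketch_delivered_meta d = { 0, 0 };

	if (m->flagged && (negotiated & SKETCH_RX_META_USER_FLAG))
		d.present |= SKETCH_RX_META_USER_FLAG;
	if (m->has_mark && (negotiated & SKETCH_RX_META_USER_MARK)) {
		d.present |= SKETCH_RX_META_USER_MARK;
		d.mark = m->mark;
	}
	return d;
}

/* The PMD can only represent what was delivered to it. */
static void
sketch_pmd_fill_mbuf(const struct sketch_delivered_meta *d,
		     struct sketch_mbuf *mb)
{
	mb->has_flag = (d->present & SKETCH_RX_META_USER_FLAG) != 0;
	mb->has_mark = (d->present & SKETCH_RX_META_USER_MARK) != 0;
	mb->mark = mb->has_mark ? d->mark : 0;
}

/* The mark is set inside the NIC, yet absent from the mbuf unless negotiated. */
void
sketch_demo_delivery(void)
{
	struct sketch_nic_meta m = { 0, 1, 0x1234 };
	struct sketch_delivered_meta d;
	struct sketch_mbuf mb;

	d = sketch_nic_deliver(&m, SKETCH_RX_META_USER_FLAG);
	sketch_pmd_fill_mbuf(&d, &mb);
	assert(!mb.has_mark);

	d = sketch_nic_deliver(&m, SKETCH_RX_META_USER_MARK);
	sketch_pmd_fill_mbuf(&d, &mb);
	assert(mb.has_mark && mb.mark == 0x1234);
}
```

In this toy model, the negotiated mask only influences sketch_nic_deliver(); sketch_pmd_fill_mbuf() never looks at it.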

> 
> One option is to document that the supported values are not per rule but for the entire
> port. For example in the example you gave MLX5 will support mark + flag but will not
> support mark + tunnel.

Yes, for the port. This API does not care about flow rules.

> 
> Also considering your example, the negotiation may result in subpar result.
> taking your example the PMD returned  TUNNEL_ID maybe application would prefer
> to have the mark and not the TUNNEL_ID. I understand that application can check
> and try again with just the MARK.

Exactly. The application can repeat the negotiation with just MARK. Is
there any problem with that?
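
For illustration, such a retry could look roughly like the sketch below. The mock PMD is an assumption: it drops MARK whenever TUNNEL_ID is requested too, which merely mimics the "tunnel ID claims the whole mark" case discussed here. All sketch_* and SKETCH_* names are invented; this is not the proposed rte_eth_rx_meta_negotiate() itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the proposed flags; not the real definitions. */
#define SKETCH_RX_META_USER_MARK (UINT64_C(1) << 1)
#define SKETCH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Mock PMD (assumption): it cannot deliver MARK together with TUNNEL_ID. */
static int
sketch_mock_negotiate(uint64_t *features)
{
	if (features == NULL)
		return -EINVAL;

	if ((*features & SKETCH_RX_META_TUNNEL_ID) &&
	    (*features & SKETCH_RX_META_USER_MARK))
		*features &= ~SKETCH_RX_META_USER_MARK;

	return 0;
}

/*
 * An application that would rather have MARK than TUNNEL_ID:
 * ask for both first and, if MARK was dropped, ask for MARK alone.
 */
static int
sketch_negotiate_prefer_mark(uint64_t *out)
{
	uint64_t features = SKETCH_RX_META_USER_MARK | SKETCH_RX_META_TUNNEL_ID;
	int ret;

	ret = sketch_mock_negotiate(&features);
	if (ret == 0 && !(features & SKETCH_RX_META_USER_MARK)) {
		features = SKETCH_RX_META_USER_MARK;
		ret = sketch_mock_negotiate(&features);
	}

	if (ret == 0)
		*out = features;

	return ret;
}

/* With this mock, the second round ends up with MARK only. */
void
sketch_demo_retry(void)
{
	uint64_t result = 0;

	assert(sketch_negotiate_prefer_mark(&result) == 0);
	assert(result == SKETCH_RX_META_USER_MARK);
}
```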

> You are inserting logic to the PMD, maybe the function should just fail maybe
> returning the conflicting items?

Why return conflicting items? The optimal subset (from the PMD's 
perspective) should be returned. It's a silver lining. In the end, the 
application can learn which features can be enabled and in what 
combinations. And it can rely on the outcome of the negotiation process.

> 
> 
>   
>> But some other PMDs (net/sfc, for instance) claim only a small fraction of bits
>> in Rx mark to deliver tunnel ID information. Remaining bits are still available
>> for delivery of *user* mark ID. Please see an example at
>> https://patches.dpdk.org/project/dpdk/patch/20210929205730.775-2-
>> ivan.malov@oktetlabs.ru/
>> . In this case, the PMD may want to return both flags in the response:
>> "MARK | TUNNEL_ID". This way, the application knows that both features
>> are enabled and available for use.
>>
>> Now. I anticipate more questions asking why wouldn't we prefer flow API
>> terminology or why wouldn't we add an API for negotiating support for
>> metadata *actions* and not just metadata *delivery*. There's an answer.
>> Always has been.
>>
>> The thing is, the use of *actions* is very complicated. For example, the PMD
>> may support action MARK for "transfer" flows but not for non-"transfer"
>> ones. Also, simultaneous use of multiple different metadata actions may not
>> be possible. And, last but not least, if we force the application to check
>> support for *actions* on action-after-action basis, the order of checks will be
>> very confusing to applications.
>>
>> Previously, in this thread, Thomas suggested to go for exactly this type of
>> API, to check support for actions one-by-one, without any context
>> ("transfer" / non-"transfer"). I'm afraid, this won't be OK.
>>
> +1 to keeping it as a separated API. (I agree actions limitation are very complex metrix)
> 
>>>
>>> In any case I think this is good idea and I will see how we can add a
>>> more generic approach of this API to the new API that I'm going to present.
>>>
>>>
>>>>> So no breakages with this API.
>>>>>
>>>>>>
>>>>>> Please see more comments inline.
>>>>>>
>>>>>> Thanks,
>>>>>> Ori
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>>>> Sent: Thursday, September 23, 2021 2:20 PM
>>>>>>> Subject: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
>>>>>>> Rx meta data
>>>>>>>
>>>>>>> Delivery of mark, flag and the likes might affect small packet
>>>>>>> performance.
>>>>>>> If these features are disabled by default, enabling them in
>>>>>>> started state without causing traffic disruption may not always be
>> possible.
>>>>>>>
>>>>>>> Let applications negotiate delivery of Rx meta data beforehand.
>>>>>>>
>>>>>>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>>>> Reviewed-by: Andrew Rybchenko
>> <andrew.rybchenko@oktetlabs.ru>
>>>>>>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
>>>>>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>>>>>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>>>>>>> ---
>>>>>>>     app/test-flow-perf/main.c              | 21 ++++++++++++
>>>>>>>     app/test-pmd/testpmd.c                 | 26 +++++++++++++++
>>>>>>>     doc/guides/rel_notes/release_21_11.rst |  9 ++++++
>>>>>>>     lib/ethdev/ethdev_driver.h             | 19 +++++++++++
>>>>>>>     lib/ethdev/rte_ethdev.c                | 25 ++++++++++++++
>>>>>>>     lib/ethdev/rte_ethdev.h                | 45
>>>>>>> ++++++++++++++++++++++++++
>>>>>>>     lib/ethdev/rte_flow.h                  | 12 +++++++
>>>>>>>     lib/ethdev/version.map                 |  3 ++
>>>>>>>     8 files changed, 160 insertions(+)
>>>>>>>
>>>>>>> diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
>>>>>>> index 9be8edc31d..48eafffb1d 100644
>>>>>>> --- a/app/test-flow-perf/main.c
>>>>>>> +++ b/app/test-flow-perf/main.c
>>>>>>> @@ -1760,6 +1760,27 @@ init_port(void)
>>>>>>>             rte_exit(EXIT_FAILURE, "Error: can't init mbuf
>>>>>>> pool\n");
>>>>>>>
>>>>>>>         for (port_id = 0; port_id < nr_ports; port_id++) {
>>>>>>> +        uint64_t rx_meta_features = 0;
>>>>>>> +
>>>>>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>>>>>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>>>>>>> +
>>>>>>> +        ret = rte_eth_rx_meta_negotiate(port_id,
>>>>>>> &rx_meta_features);
>>>>>>> +        if (ret == 0) {
>>>>>>> +            if (!(rx_meta_features &
>>>>>>> RTE_ETH_RX_META_USER_FLAG)) {
>>>>>>> +                printf(":: flow action FLAG will not affect Rx
>>>>>>> mbufs on port=%u\n",
>>>>>>> +                       port_id);
>>>>>>> +            }
>>>>>>> +
>>>>>>> +            if (!(rx_meta_features &
>>>>>>> RTE_ETH_RX_META_USER_MARK)) {
>>>>>>> +                printf(":: flow action MARK will not affect Rx
>>>>>>> mbufs on port=%u\n",
>>>>>>> +                       port_id);
>>>>>>> +            }
>>>>>>> +        } else if (ret != -ENOTSUP) {
>>>>>>> +            rte_exit(EXIT_FAILURE, "Error when negotiating Rx
>>>>>>> meta features on port=%u: %s\n",
>>>>>>> +                 port_id, rte_strerror(-ret));
>>>>>>> +        }
>>>>>>> +
>>>>>>>             ret = rte_eth_dev_info_get(port_id, &dev_info);
>>>>>>>             if (ret != 0)
>>>>>>>                 rte_exit(EXIT_FAILURE, diff --git
>>>>>>> a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
>>>>>>> 97ae52e17e..7a8da3d7ab 100644
>>>>>>> --- a/app/test-pmd/testpmd.c
>>>>>>> +++ b/app/test-pmd/testpmd.c
>>>>>>> @@ -1485,10 +1485,36 @@ static void
>>>>>>>     init_config_port_offloads(portid_t pid, uint32_t socket_id)  {
>>>>>>>         struct rte_port *port = &ports[pid];
>>>>>>> +    uint64_t rx_meta_features = 0;
>>>>>>>         uint16_t data_size;
>>>>>>>         int ret;
>>>>>>>         int i;
>>>>>>>
>>>>>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>>>>>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>>>>>>> +    rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
>>>>>>> +
>>>>>>> +    ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
>>>>>>> +    if (ret == 0) {
>>>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
>>>>>>> +            TESTPMD_LOG(INFO, "Flow action FLAG will not
>>>>>>> affect Rx mbufs on port %u\n",
>>>>>>> +                    pid);
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK))
>>>>>>> {
>>>>>>> +            TESTPMD_LOG(INFO, "Flow action MARK will not
>>>>>>> affect Rx mbufs on port %u\n",
>>>>>>> +                    pid);
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
>>>>>>> +            TESTPMD_LOG(INFO, "Flow tunnel offload support
>>>>>>> might be limited or unavailable on port %u\n",
>>>>>>> +                    pid);
>>>>>>> +        }
>>>>>>> +    } else if (ret != -ENOTSUP) {
>>>>>>> +        rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta
>>>>>>> features on port %u: %s\n",
>>>>>>> +             pid, rte_strerror(-ret));
>>>>>>> +    }
>>>>>>> +
>>>>>>>         port->dev_conf.txmode = tx_mode;
>>>>>>>         port->dev_conf.rxmode = rx_mode;
>>>>>>>
>>>>>>> diff --git a/doc/guides/rel_notes/release_21_11.rst
>>>>>>> b/doc/guides/rel_notes/release_21_11.rst
>>>>>>> index 19356ac53c..6674d4474c 100644
>>>>>>> --- a/doc/guides/rel_notes/release_21_11.rst
>>>>>>> +++ b/doc/guides/rel_notes/release_21_11.rst
>>>>>>> @@ -106,6 +106,15 @@ New Features
>>>>>>>       Added command-line options to specify total number of
>>>>>>> processes and
>>>>>>>       current process ID. Each process owns subset of Rx and Tx queues.
>>>>>>>
>>>>>>> +* **Added an API to negotiate delivery of specific parts of Rx
>>>>>>> +meta
>>>>>>> +data**
>>>>>>> +
>>>>>>> +  A new API, ``rte_eth_rx_meta_negotiate()``, was added.
>>>>>>> +  The following parts of Rx meta data were defined:
>>>>>>> +
>>>>>>> +  * ``RTE_ETH_RX_META_USER_FLAG``
>>>>>>> +  * ``RTE_ETH_RX_META_USER_MARK``
>>>>>>> +  * ``RTE_ETH_RX_META_TUNNEL_ID``
>>>>>>> +
>>>>>>>
>>>>>>>     Removed Items
>>>>>>>     -------------
>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h
>>>>>>> b/lib/ethdev/ethdev_driver.h index 40e474aa7e..96e0c60cae 100644
>>>>>>> --- a/lib/ethdev/ethdev_driver.h
>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>>>>> @@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void
>>>>>>> *rxq, typedef int (*eth_representor_info_get_t)(struct rte_eth_dev
>>>>>>> *dev,
>>>>>>>         struct rte_eth_representor_info *info);
>>>>>>>
>>>>>>> +/**
>>>>>>> + * @internal
>>>>>>> + * Negotiate delivery of specific parts of Rx meta data.
>>>>>>> + *
>>>>>>> + * @param dev
>>>>>>> + *   Port (ethdev) handle
>>>>>>> + *
>>>>>>> + * @param[inout] features
>>>>>>> + *   Feature selection buffer
>>>>>>> + *
>>>>>>> + * @return
>>>>>>> + *   Negative errno value on error, zero otherwise  */ typedef
>>>>>>> +int (*eth_rx_meta_negotiate_t)(struct rte_eth_dev *dev,
>>>>>>> +                       uint64_t *features);
>>>>>>> +
>>>>>>>     /**
>>>>>>>      * @internal A structure containing the functions exported by
>>>>>>> an Ethernet driver.
>>>>>>>      */
>>>>>>> @@ -949,6 +965,9 @@ struct eth_dev_ops {
>>>>>>>
>>>>>>>         eth_representor_info_get_t representor_info_get;
>>>>>>>         /**< Get representor info. */
>>>>>>> +
>>>>>>> +    eth_rx_meta_negotiate_t rx_meta_negotiate;
>>>>>>> +    /**< Negotiate delivery of specific parts of Rx meta data. */
>>>>>>>     };
>>>>>>>
>>>>>>>     /**
>>>>>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>>>>>>> index daf5ca9242..49cb84d64c 100644
>>>>>>> --- a/lib/ethdev/rte_ethdev.c
>>>>>>> +++ b/lib/ethdev/rte_ethdev.c
>>>>>>> @@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t
>>>>>>> port_id,
>>>>>>>         return eth_err(port_id, (*dev->dev_ops-
>>>>>>>> representor_info_get)(dev, info));  }
>>>>>>>
>>>>>>> +int
>>>>>>> +rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features) {
>>>>>>> +    struct rte_eth_dev *dev;
>>>>>>> +
>>>>>>> +    RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>>>>>>> +    dev = &rte_eth_devices[port_id];
>>>>>>> +
>>>>>>> +    if (dev->data->dev_configured != 0) {
>>>>>>> +        RTE_ETHDEV_LOG(ERR,
>>>>>>> +            "The port (id=%"PRIu16") is already configured\n",
>>>>>>> +            port_id);
>>>>>>> +        return -EBUSY;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (features == NULL) {
>>>>>>> +        RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
>>>>>>> +        return -EINVAL;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops-
>>> rx_meta_negotiate,
>>>>>>> -ENOTSUP);
>>>>>>> +    return eth_err(port_id,
>>>>>>> +               (*dev->dev_ops->rx_meta_negotiate)(dev,
>>>>>>> +features)); }
>>>>>>> +
>>>>>>>     RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
>>>>>>>
>>>>>>>     RTE_INIT(ethdev_init_telemetry) diff --git
>>>>>>> a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
>>>>>>> 1da37896d8..8467a7a362 100644
>>>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>>>> @@ -4888,6 +4888,51 @@ __rte_experimental  int
>>>>>>> rte_eth_representor_info_get(uint16_t port_id,
>>>>>>>                      struct rte_eth_representor_info *info);
>>>>>>>
>>>>>>> +/** The ethdev sees flagged packets if there are flows with
>>>>>>> +action FLAG. */ #define RTE_ETH_RX_META_USER_FLAG
>> (UINT64_C(1) <<
>>>>>>> +0)
>>>>>>> +
>>>>>>> +/** The ethdev sees mark IDs in packets if there are flows with
>>>>>>> +action MARK. */ #define RTE_ETH_RX_META_USER_MARK
>>>> (UINT64_C(1) <<
>>>>>>> +1)
>>>>>>> +
>>>>>>> +/** The ethdev detects missed packets if there are "tunnel_set"
>>>>>>> +flows in use. */ #define RTE_ETH_RX_META_TUNNEL_ID
>> (UINT64_C(1)
>>>> <<
>>>>>>> +2)
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * @warning
>>>>>>> + * @b EXPERIMENTAL: this API may change without prior notice
>>>>>>> + *
>>>>>>> + * Negotiate delivery of specific parts of Rx meta data.
>>>>>>> + *
>>>>>>> + * Invoke this API before the first rte_eth_dev_configure()
>>>>>>> +invocation
>>>>>>> + * to let the PMD make preparations that are inconvenient to do
>> later.
>>>>>>> + *
>>>>>>> + * The negotiation process is as follows:
>>>>>>> + *
>>>>>>> + * - the application requests features intending to use at least
>>>>>>> +some of them;
>>>>>>> + * - the PMD responds with the guaranteed subset of the requested
>>>>>>> +feature set;
>>>>>>> + * - the application can retry negotiation with another set of
>>>>>>> +features;
>>>>>>> + * - the application can pass zero to clear the negotiation
>>>>>>> +result;
>>>>>>> + * - the last negotiated result takes effect upon the ethdev start.
>>>>>>> + *
>>>>>>> + * If this API is unsupported, the application should gracefully
>>>>>>> ignore that.
>>>>>>> + *
>>>>>>> + * @param port_id
>>>>>>> + *   Port (ethdev) identifier
>>>>>>> + *
>>>>>>> + * @param[inout] features
>>>>>>> + *   Feature selection buffer
>>>>>>> + *
>>>>>>> + * @return
>>>>>>> + *   - (-EBUSY) if the port can't handle this in its current
>>>>>>> +state;
>>>>>>> + *   - (-ENOTSUP) if the method itself is not supported by the
>>>>>>> +PMD;
>>>>>>> + *   - (-ENODEV) if *port_id* is invalid;
>>>>>>> + *   - (-EINVAL) if *features* is NULL;
>>>>>>> + *   - (-EIO) if the device is removed;
>>>>>>> + *   - (0) on success
>>>>>>> + */
>>>>>>> +__rte_experimental
>>>>>>> +int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t
>>>>>>> +*features);
>>>>>>
>>>>>> I don't think meta is the best name since we also have meta item
>>>>>> and the word meta can be used in other cases.
>>>>>
>>>>> I'm no expert in naming. What could be a better term for this?
>>>>> Personally, I'd rather not perceive "meta" the way you describe.
>>>>> It's not just "meta". It's "rx_meta", and the flags supplied with
>>>>> this API provide enough context to explain what it's all about.
>>>>
>>>> Thinking overnight about it I'd suggest full "metadata".
>>>> Yes, it will name a bit longer, but less confusing versus term META
>>>> already used in flow API.
>>>>
>>> Following my above comments, I think it should be part of the new API
>>> but in any case what about rx_flow_action_negotiate?
>>
>> See my thoughts above. It makes no sense to negotiate *support for
>> actions*. Existing "rte_flow_validate()" already does that job. The new
>> "negotiate Rx metadata* API is all about *delivery* of metadata which is
>> supposed to be *already* set for the packets *inside* the NIC. So, we
>> negotiate *delivery from the NIC to the host*. Nothing more.
>>
> Agree with your comment but then maybe we should go to the register
> approach just like metadata?

Do you mean "registering an mbuf dynamic field / flag", by any chance?
Even if it's technically possible, this may complicate the API contract 
because the key idea here is to demand that the application negotiate 
metadata delivery at the earliest step possible (before configuring the 
port), whilst a dynamic field / flag can be (theoretically) registered 
at any time. But, of course, feel free to elaborate on your idea.
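
As a rough sketch of that ordering requirement, under the assumption of a toy port object (struct sketch_port and the sketch_* helpers are invented, not ethdev code): negotiation is accepted only until the port has been configured, in the same spirit as the -EBUSY check in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy port state (assumption); the real ethdev state is far richer. */
struct sketch_port {
	int configured;
	uint64_t rx_meta;
};

/* Negotiation is only accepted before the port has been configured. */
static int
sketch_negotiate(struct sketch_port *port, uint64_t *features)
{
	if (port->configured)
		return -EBUSY;

	if (features == NULL)
		return -EINVAL;

	/* This toy port grants whatever is requested. */
	port->rx_meta = *features;
	return 0;
}

static void
sketch_configure(struct sketch_port *port)
{
	port->configured = 1;
}

/* Negotiate first, configure second; the reverse order is refused. */
void
sketch_demo_order(void)
{
	struct sketch_port port = { 0, 0 };
	uint64_t features = UINT64_C(1) << 1;

	assert(sketch_negotiate(&port, &features) == 0);
	sketch_configure(&port);
	assert(sketch_negotiate(&port, &features) == -EBUSY);
}
```

A dynamic field / flag has no comparable "too late" point in this sketch, which is the difference I am getting at.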

We should make sure that we all reach an agreement.

> 
> Best,
> Ori
>>>
>>>> Andrew.
>>> Best,
>>> Ori
>>>
>>
>> --
>> Ivan M

-- 
Ivan M


* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-03 17:30                 ` Ivan Malov
@ 2021-10-03 21:04                   ` Ori Kam
  2021-10-03 23:50                     ` Ivan Malov
  0 siblings, 1 reply; 97+ messages in thread
From: Ori Kam @ 2021-10-03 21:04 UTC (permalink / raw)
  To: Ivan Malov, Andrew Rybchenko, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi Ivan,

Sorry for the long review.

> -----Original Message-----
> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
> Sent: Sunday, October 3, 2021 8:30 PM
> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
> data
> 
> Hi Ori,
> 
> On 03/10/2021 14:01, Ori Kam wrote:
> > Hi Ivan,
> >
> >> -----Original Message-----
> >> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
> >> Sent: Sunday, October 3, 2021 12:30 PM data
> >>
> >> Hi Ori,
> >>
> >> Thanks for reviewing this.
> >>
> >
> > No problem.
> >
> >> On 03/10/2021 10:42, Ori Kam wrote:
> >>> Hi Andrew and Ivan,
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>>> Sent: Friday, October 1, 2021 9:50 AM
> >>>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery
> >>>> of Rx meta data
> >>>>
> >>>> On 9/30/21 10:07 PM, Ivan Malov wrote:
> >>>>> Hi Ori,
> >>>>>
> >>>>> On 30/09/2021 17:59, Ori Kam wrote:
> >>>>>> Hi Ivan,
> >>>>>> Sorry for jumping in late.
> >>>>>
> >>>>> No worries. That's OK.
> >>>>>
> >>>>>> I have a concern that this patch breaks other PMDs.
> >>>>>
> >>>>> It does no such thing.
> >>>>>
> >>>>>>>   From the rst file " One should negotiate flag delivery beforehand"
> >>>>>> since you only added this function for your PMD all other PMD will
> fail.
> >>>>>> I see that you added exception in the  examples, but it doesn't
> >>>>>> make sense that applications will also need to add this exception
> >>>>>> which is not documented.
> >>>>>
> >>>>> Say, you have an application, and you use it with some specific PMD.
> >>>>> Say, that PMD doesn't run into the problem as ours does. In other
> >>>>> words, the user can insert a flow with action MARK at any point
> >>>>> and get mark delivery working starting from that moment without
> >>>>> any problem. Say, this is exactly the way how it works for you at
> >>>>> the
> >> moment.
> >>>>>
> >>>>> Now. This new API kicks in. We update the application to invoke it
> >>>>> as early as possible. But your PMD in question still doesn't
> >>>>> support this API. The comment in the patch says that if the method
> >>>>> returns ENOTSUP, the application should ignore that without
> >>>>> batting an eyelid. It should just keep on working as it did before
> >>>>> the introduction of
> >> this API.
> >>>>>
> >>>
> >>> I understand that it is nice to write in the patch comment that
> >>> application should disregard this function in case of ENOTSUP but in
> >>> a few month someone will read the official doc, where it is stated
> >>> that this function call is a must and then what do you think the
> >>> application will do?
> >>> I think that the correct way is to add this function to all PMDs.
> >>> Another option is to add to the doc that if the function is
> >>> returning ENOTSUP the application should assume that all is supported.
> >>>
> >>> So from this point of view there is API break.
> >>
> >> So, you mean an API breakage in some formal sense? If the doc is
> >> fixed in accordance with the second option you suggest, will it
> >> suffice to avoid this formal API breakage?
> >>
> >
> > Yes, but I think it will be better to add the missing function.
> >
> >>>
> >>>>> More specific example:
> >>>>> Say, the application doesn't mind using either "RSS + MARK" or
> >>>>> tunnel offload. What it does right now is attempt to insert tunnel
> >>>>> flows first and, if this fails, fall back to "RSS + MARK". With
> >>>>> this API, the application will try to invoke this API with
> >>>>> "USER_MARK | TUNNEL_ID" in adapter initialised state. If the PMD
> >>>>> says that it can only enable the tunnel offload, then the
> >>>>> application will get the knowledge that it doesn't make sense to
> >>>>> even try inserting "RSS + MARK" flows. It just can skip useless
> >>>>> actions. But if the PMD doesn't support the method, the
> >>>>> application will see ENOTSUP and handle this
> >>>>> gracefully: it will make no assumptions about what's guaranteed to
> >>>>> be supported and what's not and will just keep on its old behavior:
> >>>>> try to insert a flow, fail, fall back to another type of flow.
> >>>>>
> >>>
> >>> I fully agree with your example, and think that this is the way to
> >>> go, application should supply as much info as possible during startup.
> >>
> >> Right.
> >>
> >>> My question/comment is the negotiated result means that all of the
> >>> actions are supported on the same rule?
> >>> for example if application wants to add mark and tag on the same rule.
> >>> (I know it doesn't make much sense) and the PMD can support both of
> >>> them but not on the same rule, what should it return?
> >>> Or for example if using the mark can only be supported if no decap
> >>> action is set on this rule what should be the result?
> >>>   From my undstanding this function is only to let the PMD know that
> >>> on some rules the application will use those actions, the checking
> >>> if the action combination is valid only happens on validate function right?
> >>
> >> This API does not bind itself to flow API. It's *not* about enabling
> >> support for metadata *actions* (they are conducted entirely *inside*
> >> the NIC). It's about enabling *delivery* of metadata from the NIC to host.
> >>
> >
> > Good point so why not use the same logic as the metadata and register it?
> > Since in any case, this is something in the mbuf so maybe this should be the
> answer?
> 
> I didn't catch your thought. Could you please elaborate on it?

The metadata action, just like mark or flag, is used to give the application
data that was set by a flow rule.
To enable the metadata, the application must register the metadata field.
Since this happens when the mbuf is created, it must be done before the
device starts.

I understand that mark and flag don't need to be registered in the mbuf
since they have reserved space, but from the application's point of view there
is no difference between the metadata and the mark, so why doesn't the
negotiate function handle the metadata as well?

I hope this is clearer.

> 
> >
> >> Say, you insert a flow rule to mark some packets. The NIC, internally
> >> (in the
> >> e-switch) adds the mark to matching packets. Yes, in the boundaries
> >> of the NIC HW, the packets bear the mark on them. It has been set,
> >> yes. But when time comes to *deliver* the packets to the host, the
> >> NIC (at least, in net/sfc
> >> case) has two options: either provide only a small chunk of the
> >> metadata for each packet *to the host*, which doesn't include mark
> >> ID, flag and RSS hash, OR, alternatively, provide the full set of
> >> metadata. In the former option, the mark is simply not delivered.
> >> Once again: it *has been set*, but simply will not be *delivered to the
> host*.
> >>
> >> So, this API is about negotiating *delivery* of metadata. In pure
> >> technical sense. And the set of flags that this API returns indicates
> >> which kinds of metadata the NIC will be able to deliver simultaneously.
> >>
> >> For example, as I understand, in the case of tunnel offload, MLX5
> >> claims Rx mark entirely for tunnel ID metadata, so, if an application
> >> requests "MARK | TUNNEL_ID" with this API, this PMD should probably
> >> want to respond with just "TUNNEL_ID". The application will see the
> >> response and realise that, even if it adds its *own* (user) action
> >> MARK to a flow and if the flow is not rejected by the PMD, it won't
> >> be able to see the mark in the received mbufs (or the mark will be
> incorrect).
> >>
> > So what should the application do if on some flows it wants MARK and on
> other FLAG?
> 
> You mentioned flows, so I'd like to stress this out one more time: what this
> API cares about is solely the possibility to deliver metadata between the NIC
> and the host. The host == the PMD (*not* application).
> 

I understand that you are only talking about enabling the action,
meaning to let the PMD know that at some point there will be a rule
that will use the mark action for example. 
Is my understanding correct?
I don't understand your last comment about host == PMD, since in the end
this value should be given to the application.

> >  From DPDK viewpoint both of them can't be shared on the same rule
> > (they are using the same space in mbuf) so the application will never
> > ask for both of them in the same rule but he can on some rules ask for
> > mark while on other request for FLAG, even in your code you added both
> of them.
> >
> > So what should the PMD return if it can support both of them just not
> > at the same rule?
> 
> Please see above. This is not about rules. This is not about the way how flag
> and mark are presented *by* the PMD *to* the application in mbufs.
> Simultaneous use of actions FLAG and MARK in flows must be ruled out by
> rte_flow_validate() / rte_flow_create(). The way how flag and mark are
> *represented* in mbufs belongs in mbuf library responsibility domain.
> 
> Consider the following operational sequence:
> 
> 1) The NIC has a packet, which has metadata associated with it;
> 2) The NIC transfers this packet to the host;
> 3) The PMD sees the packet and its metadata;
> 4) The PMD represents whatever available metadata in mbuf format.
> 
> Features negotiated by virtue of this API (for instance, FLAG and MARK)
> enable delivery of these kinds of metadata between points (2) and (3).
> 
> And the problem of flag / mark co-existence in mbufs sits at point (4).
> 
> -> Completely different problems, in fact.
> 

Agree.

> >
> > One option is to document that the supported values are not per rule
> > but for the entire port. For example in the example you gave MLX5 will
> > support mark + flag but will not support mark + tunnel.
> 
> Yes, for the port. Flow rules are a don't care to this API.
> 
> >
> > Also considering your example, the negotiation may result in subpar result.
> > taking your example the PMD returned  TUNNEL_ID maybe application
> > would prefer to have the mark and not the TUNNEL_ID. I understand that
> > application can check and try again with just the MARK.
> 
> Exactly. The Application can repeat negotiation with just MARK. Is there any
> problem with that?
> 

I understand that the application can negotiate again and again.
I just don't like that the PMD has its own logic and selects what it thinks will be best.

I wanted to suggest that the PMD just report the conflicts, so that the application
can negotiate again based on its own logic.

> > You are inserting logic to the PMD, maybe the function should just
> > fail maybe returning the conflicting items?
> 
> Why return conflicting items? The optimal subset (from the PMD's
> perspective) should be returned. It's a silver lining. In the end, the application
> can learn which features can be enabled and in what combinations. And it
> can rely on the outcome of the negotiation process.
> 
That is my point: this is the PMD's perspective, not the application's.
How can a PMD define an optimal subset? How can it know what is more
important to the application?
Also, the PMD logic is internal. Suppose the PMD happens to select what is
best for the application; the application then learns that this is a good
value for it. A release later, the internal PMD logic changes (for example,
a new feature is added, or another customer makes a request).
Since the logic is internal to the PMD, the original application is not aware
of the change and may fail.

We both agree that the application should check the result and renegotiate if needed.
I only suggest that the PMD return an error and not assume it knows best.
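
To make the two behaviours concrete, here is a minimal standalone sketch. It is
not taken from the patch: the function names sketch_negotiate_subset() and
sketch_negotiate_strict() are hypothetical, the flag values are copied from the
patch under review, and the "MARK conflicts with TUNNEL_ID" device model is
only the simplified MLX5 example mentioned earlier in this thread.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values copied from the patch under review. */
#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Assumed device model: MARK and TUNNEL_ID cannot be delivered together. */
#define SKETCH_CONFLICT \
	(RTE_ETH_RX_META_USER_MARK | RTE_ETH_RX_META_TUNNEL_ID)

/* Behaviour described in the patch: the PMD trims the request on its own. */
int
sketch_negotiate_subset(uint64_t *features)
{
	if ((*features & SKETCH_CONFLICT) == SKETCH_CONFLICT)
		*features &= ~RTE_ETH_RX_META_USER_MARK; /* PMD's own preference */
	return 0;
}

/* Alternative: reject the request and report only the conflicting bits. */
int
sketch_negotiate_strict(uint64_t *features)
{
	if ((*features & SKETCH_CONFLICT) == SKETCH_CONFLICT) {
		*features = SKETCH_CONFLICT;
		return -EINVAL;
	}
	return 0;
}
```

With the strict variant, an application asking for "MARK | TUNNEL_ID" gets an
error plus the conflicting pair back and picks one of them itself; with the
subset variant it silently ends up with whatever the PMD preferred.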


> >
> >
> >
> >> But some other PMDs (net/sfc, for instance) claim only a small fraction of
> bits
> >> in Rx mark to deliver tunnel ID information. Remaining bits are still
> available
> >> for delivery of *user* mark ID. Please see an example at
> >> https://patches.dpdk.org/project/dpdk/patch/20210929205730.775-2-
> >> ivan.malov@oktetlabs.ru/
> >> . In this case, the PMD may want to return both flags in the response:
> >> "MARK | TUNNEL_ID". This way, the application knows that both features
> >> are enabled and available for use.
> >>
> >> Now. I anticipate more questions asking why wouldn't we prefer flow API
> >> terminology or why wouldn't we add an API for negotiating support for
> >> metadata *actions* and not just metadata *delivery*. There's an answer.
> >> Always has been.
> >>
> >> The thing is, the use of *actions* is very complicated. For example, the
> PMD
> >> may support action MARK for "transfer" flows but not for non-"transfer"
> >> ones. Also, simultaneous use of multiple different metadata actions may
> not
> >> be possible. And, last but not least, if we force the application to check
> >> support for *actions* on action-after-action basis, the order of checks will
> be
> >> very confusing to applications.
> >>
> >> Previously, in this thread, Thomas suggested to go for exactly this type of
> >> API, to check support for actions one-by-one, without any context
> >> ("transfer" / non-"transfer"). I'm afraid, this won't be OK.
> >>
> > +1 to keeping it as a separate API. (I agree action limitations are a very
> complex matrix)
> >
> >>>
> >>> In any case I think this is good idea and I will see how we can add a
> >>> more generic approach of this API to the new API that I'm going to
> present.
> >>>
> >>>
> >>>>> So no breakages with this API.
> >>>>>
> >>>>>>
> >>>>>> Please see more comments inline.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Ori
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> >>>>>>> Sent: Thursday, September 23, 2021 2:20 PM
> >>>>>>> Subject: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
> >>>>>>> Rx meta data
> >>>>>>>
> >>>>>>> Delivery of mark, flag and the likes might affect small packet
> >>>>>>> performance.
> >>>>>>> If these features are disabled by default, enabling them in
> >>>>>>> started state without causing traffic disruption may not always be
> >> possible.
> >>>>>>>
> >>>>>>> Let applications negotiate delivery of Rx meta data beforehand.
> >>>>>>>
> >>>>>>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> >>>>>>> Reviewed-by: Andrew Rybchenko
> >> <andrew.rybchenko@oktetlabs.ru>
> >>>>>>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> >>>>>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> >>>>>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
> >>>>>>> ---
> >>>>>>>     app/test-flow-perf/main.c              | 21 ++++++++++++
> >>>>>>>     app/test-pmd/testpmd.c                 | 26 +++++++++++++++
> >>>>>>>     doc/guides/rel_notes/release_21_11.rst |  9 ++++++
> >>>>>>>     lib/ethdev/ethdev_driver.h             | 19 +++++++++++
> >>>>>>>     lib/ethdev/rte_ethdev.c                | 25 ++++++++++++++
> >>>>>>>     lib/ethdev/rte_ethdev.h                | 45
> >>>>>>> ++++++++++++++++++++++++++
> >>>>>>>     lib/ethdev/rte_flow.h                  | 12 +++++++
> >>>>>>>     lib/ethdev/version.map                 |  3 ++
> >>>>>>>     8 files changed, 160 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
> >>>>>>> index 9be8edc31d..48eafffb1d 100644
> >>>>>>> --- a/app/test-flow-perf/main.c
> >>>>>>> +++ b/app/test-flow-perf/main.c
> >>>>>>> @@ -1760,6 +1760,27 @@ init_port(void)
> >>>>>>>             rte_exit(EXIT_FAILURE, "Error: can't init mbuf
> >>>>>>> pool\n");
> >>>>>>>
> >>>>>>>         for (port_id = 0; port_id < nr_ports; port_id++) {
> >>>>>>> +        uint64_t rx_meta_features = 0;
> >>>>>>> +
> >>>>>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
> >>>>>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
> >>>>>>> +
> >>>>>>> +        ret = rte_eth_rx_meta_negotiate(port_id, &rx_meta_features);
> >>>>>>> +        if (ret == 0) {
> >>>>>>> +            if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
> >>>>>>> +                printf(":: flow action FLAG will not affect Rx mbufs on port=%u\n",
> >>>>>>> +                       port_id);
> >>>>>>> +            }
> >>>>>>> +
> >>>>>>> +            if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
> >>>>>>> +                printf(":: flow action MARK will not affect Rx mbufs on port=%u\n",
> >>>>>>> +                       port_id);
> >>>>>>> +            }
> >>>>>>> +        } else if (ret != -ENOTSUP) {
> >>>>>>> +            rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port=%u: %s\n",
> >>>>>>> +                 port_id, rte_strerror(-ret));
> >>>>>>> +        }
> >>>>>>> +
> >>>>>>>             ret = rte_eth_dev_info_get(port_id, &dev_info);
> >>>>>>>             if (ret != 0)
> >>>>>>>                 rte_exit(EXIT_FAILURE,
> >>>>>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> >>>>>>> index 97ae52e17e..7a8da3d7ab 100644
> >>>>>>> --- a/app/test-pmd/testpmd.c
> >>>>>>> +++ b/app/test-pmd/testpmd.c
> >>>>>>> @@ -1485,10 +1485,36 @@ static void
> >>>>>>>     init_config_port_offloads(portid_t pid, uint32_t socket_id)
> >>>>>>>     {
> >>>>>>>         struct rte_port *port = &ports[pid];
> >>>>>>> +    uint64_t rx_meta_features = 0;
> >>>>>>>         uint16_t data_size;
> >>>>>>>         int ret;
> >>>>>>>         int i;
> >>>>>>>
> >>>>>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
> >>>>>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
> >>>>>>> +    rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
> >>>>>>> +
> >>>>>>> +    ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
> >>>>>>> +    if (ret == 0) {
> >>>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
> >>>>>>> +            TESTPMD_LOG(INFO, "Flow action FLAG will not affect Rx mbufs on port %u\n",
> >>>>>>> +                    pid);
> >>>>>>> +        }
> >>>>>>> +
> >>>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
> >>>>>>> +            TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx mbufs on port %u\n",
> >>>>>>> +                    pid);
> >>>>>>> +        }
> >>>>>>> +
> >>>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
> >>>>>>> +            TESTPMD_LOG(INFO, "Flow tunnel offload support might be limited or unavailable on port %u\n",
> >>>>>>> +                    pid);
> >>>>>>> +        }
> >>>>>>> +    } else if (ret != -ENOTSUP) {
> >>>>>>> +        rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port %u: %s\n",
> >>>>>>> +             pid, rte_strerror(-ret));
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>>         port->dev_conf.txmode = tx_mode;
> >>>>>>>         port->dev_conf.rxmode = rx_mode;
> >>>>>>>
> >>>>>>> diff --git a/doc/guides/rel_notes/release_21_11.rst
> >>>>>>> b/doc/guides/rel_notes/release_21_11.rst
> >>>>>>> index 19356ac53c..6674d4474c 100644
> >>>>>>> --- a/doc/guides/rel_notes/release_21_11.rst
> >>>>>>> +++ b/doc/guides/rel_notes/release_21_11.rst
> >>>>>>> @@ -106,6 +106,15 @@ New Features
> >>>>>>>       Added command-line options to specify total number of processes and
> >>>>>>>       current process ID. Each process owns subset of Rx and Tx queues.
> >>>>>>>
> >>>>>>> +* **Added an API to negotiate delivery of specific parts of Rx meta data**
> >>>>>>> +
> >>>>>>> +  A new API, ``rte_eth_rx_meta_negotiate()``, was added.
> >>>>>>> +  The following parts of Rx meta data were defined:
> >>>>>>> +
> >>>>>>> +  * ``RTE_ETH_RX_META_USER_FLAG``
> >>>>>>> +  * ``RTE_ETH_RX_META_USER_MARK``
> >>>>>>> +  * ``RTE_ETH_RX_META_TUNNEL_ID``
> >>>>>>> +
> >>>>>>>
> >>>>>>>     Removed Items
> >>>>>>>     -------------
> >>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> >>>>>>> index 40e474aa7e..96e0c60cae 100644
> >>>>>>> --- a/lib/ethdev/ethdev_driver.h
> >>>>>>> +++ b/lib/ethdev/ethdev_driver.h
> >>>>>>> @@ -789,6 +789,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
> >>>>>>>     typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
> >>>>>>>         struct rte_eth_representor_info *info);
> >>>>>>>
> >>>>>>> +/**
> >>>>>>> + * @internal
> >>>>>>> + * Negotiate delivery of specific parts of Rx meta data.
> >>>>>>> + *
> >>>>>>> + * @param dev
> >>>>>>> + *   Port (ethdev) handle
> >>>>>>> + *
> >>>>>>> + * @param[inout] features
> >>>>>>> + *   Feature selection buffer
> >>>>>>> + *
> >>>>>>> + * @return
> >>>>>>> + *   Negative errno value on error, zero otherwise
> >>>>>>> + */
> >>>>>>> +typedef int (*eth_rx_meta_negotiate_t)(struct rte_eth_dev *dev,
> >>>>>>> +                       uint64_t *features);
> >>>>>>> +
> >>>>>>>     /**
> >>>>>>>      * @internal A structure containing the functions exported by
> >>>>>>> an Ethernet driver.
> >>>>>>>      */
> >>>>>>> @@ -949,6 +965,9 @@ struct eth_dev_ops {
> >>>>>>>
> >>>>>>>         eth_representor_info_get_t representor_info_get;
> >>>>>>>         /**< Get representor info. */
> >>>>>>> +
> >>>>>>> +    eth_rx_meta_negotiate_t rx_meta_negotiate;
> >>>>>>> +    /**< Negotiate delivery of specific parts of Rx meta data. */
> >>>>>>>     };
> >>>>>>>
> >>>>>>>     /**
> >>>>>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> >>>>>>> index daf5ca9242..49cb84d64c 100644
> >>>>>>> --- a/lib/ethdev/rte_ethdev.c
> >>>>>>> +++ b/lib/ethdev/rte_ethdev.c
> >>>>>>> @@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t port_id,
> >>>>>>>         return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
> >>>>>>>     }
> >>>>>>>
> >>>>>>> +int
> >>>>>>> +rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
> >>>>>>> +{
> >>>>>>> +    struct rte_eth_dev *dev;
> >>>>>>> +
> >>>>>>> +    RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> >>>>>>> +    dev = &rte_eth_devices[port_id];
> >>>>>>> +
> >>>>>>> +    if (dev->data->dev_configured != 0) {
> >>>>>>> +        RTE_ETHDEV_LOG(ERR,
> >>>>>>> +            "The port (id=%"PRIu16") is already configured\n",
> >>>>>>> +            port_id);
> >>>>>>> +        return -EBUSY;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    if (features == NULL) {
> >>>>>>> +        RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
> >>>>>>> +        return -EINVAL;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_meta_negotiate, -ENOTSUP);
> >>>>>>> +    return eth_err(port_id,
> >>>>>>> +               (*dev->dev_ops->rx_meta_negotiate)(dev, features));
> >>>>>>> +}
> >>>>>>> +
> >>>>>>>     RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
> >>>>>>>
> >>>>>>>     RTE_INIT(ethdev_init_telemetry)
> >>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> >>>>>>> index 1da37896d8..8467a7a362 100644
> >>>>>>> --- a/lib/ethdev/rte_ethdev.h
> >>>>>>> +++ b/lib/ethdev/rte_ethdev.h
> >>>>>>> @@ -4888,6 +4888,51 @@ __rte_experimental
> >>>>>>>     int rte_eth_representor_info_get(uint16_t port_id,
> >>>>>>>                      struct rte_eth_representor_info *info);
> >>>>>>>
> >>>>>>> +/** The ethdev sees flagged packets if there are flows with action FLAG. */
> >>>>>>> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
> >>>>>>> +
> >>>>>>> +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
> >>>>>>> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
> >>>>>>> +
> >>>>>>> +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
> >>>>>>> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
> >>>>>>> +
> >>>>>>> +/**
> >>>>>>> + * @warning
> >>>>>>> + * @b EXPERIMENTAL: this API may change without prior notice
> >>>>>>> + *
> >>>>>>> + * Negotiate delivery of specific parts of Rx meta data.
> >>>>>>> + *
> >>>>>>> + * Invoke this API before the first rte_eth_dev_configure() invocation
> >>>>>>> + * to let the PMD make preparations that are inconvenient to do later.
> >>>>>>> + *
> >>>>>>> + * The negotiation process is as follows:
> >>>>>>> + *
> >>>>>>> + * - the application requests features intending to use at least some of them;
> >>>>>>> + * - the PMD responds with the guaranteed subset of the requested feature set;
> >>>>>>> + * - the application can retry negotiation with another set of features;
> >>>>>>> + * - the application can pass zero to clear the negotiation result;
> >>>>>>> + * - the last negotiated result takes effect upon the ethdev start.
> >>>>>>> + *
> >>>>>>> + * If this API is unsupported, the application should gracefully ignore that.
> >>>>>>> + *
> >>>>>>> + * @param port_id
> >>>>>>> + *   Port (ethdev) identifier
> >>>>>>> + *
> >>>>>>> + * @param[inout] features
> >>>>>>> + *   Feature selection buffer
> >>>>>>> + *
> >>>>>>> + * @return
> >>>>>>> + *   - (-EBUSY) if the port can't handle this in its current state;
> >>>>>>> + *   - (-ENOTSUP) if the method itself is not supported by the PMD;
> >>>>>>> + *   - (-ENODEV) if *port_id* is invalid;
> >>>>>>> + *   - (-EINVAL) if *features* is NULL;
> >>>>>>> + *   - (-EIO) if the device is removed;
> >>>>>>> + *   - (0) on success
> >>>>>>> + */
> >>>>>>> +__rte_experimental
> >>>>>>> +int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features);
> >>>>>>
> >>>>>> I don't think meta is the best name since we also have meta item
> >>>>>> and the word meta can be used in other cases.
> >>>>>
> >>>>> I'm no expert in naming. What could be a better term for this?
> >>>>> Personally, I'd rather not perceive "meta" the way you describe.
> >>>>> It's not just "meta". It's "rx_meta", and the flags supplied with
> >>>>> this API provide enough context to explain what it's all about.
> >>>>
> >>>> Thinking overnight about it I'd suggest full "metadata".
> >>>> Yes, it will name a bit longer, but less confusing versus term META
> >>>> already used in flow API.
> >>>>
> >>> Following my above comments, I think it should be part of the new API
> >>> but in any case what about rx_flow_action_negotiate?
> >>
> >> See my thoughts above. It makes no sense to negotiate *support for
> >> actions*. Existing "rte_flow_validate()" already does that job. The new
> >> "negotiate Rx metadata* API is all about *delivery* of metadata which is
> >> supposed to be *already* set for the packets *inside* the NIC. So, we
> >> negotiate *delivery from the NIC to the host*. Nothing more.
> >>
> > Agree with your comment but then maybe we should go to the register
> > approach just like metadata?
> 
> Don't you mean "registering mbuf dynamic field / flag" by any chance?
> Even if it's technically possible, this may complicate the API contract
> because the key idea here is to demand that the application negotiate
> metadata delivery at the earliest step possible (before configuring the
> port), whilst a dynamic field / flag can be (theoretically) registered
> at any time. But, of course, feel free to elaborate on your idea.
> 

Yes, like I said above I don't see a difference between metadata
and mark, at least not from the application usage.
I assume you need this info at device start and by definition
the registration should happen before. (mbuf should be configured
before start)

> We should make sure that we all reach an agreement.
> 

+1 I think we can agree that there is a need for letting the PMD
know before the start that some action will be used.

And I'm sorry if I sound picky and hard, this is not my intention.
I'm also doing my best to review this as fast as I can.
My open issues and priorities:
1. The API breakage. Possible solutions: add support to the rest of the PMDs, or update the doc
to say that if the function is not supported, the application should assume that
it can still use the mark / flag. -- blocker; this must be resolved.
2. The function name. My main issue is that metadata should be treated just like mark.
Maybe the solution can be adding a metadata flag to this function.
The drawback is that the application needs to call two functions to configure
metadata. -- high priority, but if you can give me good reasoning (not just
"we don't need to register the mark") I guess I will be O.K.
3. Whether or not the PMD has internal logic in case of conflict.
Please think about it. -- low priority; I will agree to what you decide.
But if you decide that the PMD will have internal logic, then this must be documented
so the application will know not to rely on the results.

Best,
Ori

> >
> > Best,
> > Ori
> >>>
> >>>> Andrew.
> >>> Best,
> >>> Ori
> >>>
> >>
> >> --
> >> Ivan M
> 
> --
> Ivan M


* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-03 21:04                   ` Ori Kam
@ 2021-10-03 23:50                     ` Ivan Malov
  2021-10-04  6:56                       ` Ori Kam
  0 siblings, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-10-03 23:50 UTC (permalink / raw)
  To: Ori Kam, Andrew Rybchenko, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi Ori,

On 04/10/2021 00:04, Ori Kam wrote:
> Hi Ivan,
> 
> Sorry for the long review.
> 
>> -----Original Message-----
>> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
>> Sent: Sunday, October 3, 2021 8:30 PM
>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
>> data
>>
>> Hi Ori,
>>
>> On 03/10/2021 14:01, Ori Kam wrote:
>>> Hi Ivan,
>>>
>>>> -----Original Message-----
>>>> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
>>>> Sent: Sunday, October 3, 2021 12:30 PM
>>>>
>>>> Hi Ori,
>>>>
>>>> Thanks for reviewing this.
>>>>
>>>
>>> No problem.
>>>
>>>> On 03/10/2021 10:42, Ori Kam wrote:
>>>>> Hi Andrew and Ivan,
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>>>> Sent: Friday, October 1, 2021 9:50 AM
>>>>>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery
>>>>>> of Rx meta data
>>>>>>
>>>>>> On 9/30/21 10:07 PM, Ivan Malov wrote:
>>>>>>> Hi Ori,
>>>>>>>
>>>>>>> On 30/09/2021 17:59, Ori Kam wrote:
>>>>>>>> Hi Ivan,
>>>>>>>> Sorry for jumping in late.
>>>>>>>
>>>>>>> No worries. That's OK.
>>>>>>>
>>>>>>>> I have a concern that this patch breaks other PMDs.
>>>>>>>
>>>>>>> It does no such thing.
>>>>>>>
>>>>>>>>>    From the rst file " One should negotiate flag delivery beforehand"
>>>>>>>> since you only added this function for your PMD all other PMD will
>> fail.
>>>>>>>> I see that you added exception in the  examples, but it doesn't
>>>>>>>> make sense that applications will also need to add this exception
>>>>>>>> which is not documented.
>>>>>>>
>>>>>>> Say, you have an application, and you use it with some specific PMD.
>>>>>>> Say, that PMD doesn't run into the problem as ours does. In other
>>>>>>> words, the user can insert a flow with action MARK at any point
>>>>>>> and get mark delivery working starting from that moment without
>>>>>>> any problem. Say, this is exactly the way how it works for you at
>>>>>>> the
>>>> moment.
>>>>>>>
>>>>>>> Now. This new API kicks in. We update the application to invoke it
>>>>>>> as early as possible. But your PMD in question still doesn't
>>>>>>> support this API. The comment in the patch says that if the method
>>>>>>> returns ENOTSUP, the application should ignore that without
>>>>>>> batting an eyelid. It should just keep on working as it did before
>>>>>>> the introduction of
>>>> this API.
>>>>>>>
>>>>>
>>>>> I understand that it is nice to write in the patch comment that
>>>>> application should disregard this function in case of ENOTSUP but in
>>>>> a few months someone will read the official doc, where it is stated
>>>>> that this function call is a must and then what do you think the
>>>>> application will do?
>>>>> I think that the correct way is to add this function to all PMDs.
>>>>> Another option is to add to the doc that if the function is
>>>>> returning ENOTSUP the application should assume that all is supported.
>>>>>
>>>>> So from this point of view there is API break.
>>>>
>>>> So, you mean an API breakage in some formal sense? If the doc is
>>>> fixed in accordance with the second option you suggest, will it
>>>> suffice to avoid this formal API breakage?
>>>>
>>>
>>> Yes, but I think it will be better to add the missing function.
>>>
>>>>>
>>>>>>> More specific example:
>>>>>>> Say, the application doesn't mind using either "RSS + MARK" or
>>>>>>> tunnel offload. What it does right now is attempt to insert tunnel
>>>>>>> flows first and, if this fails, fall back to "RSS + MARK". With
>>>>>>> this API, the application will try to invoke this API with
>>>>>>> "USER_MARK | TUNNEL_ID" in adapter initialised state. If the PMD
>>>>>>> says that it can only enable the tunnel offload, then the
>>>>>>> application will get the knowledge that it doesn't make sense to
>>>>>>> even try inserting "RSS + MARK" flows. It just can skip useless
>>>>>>> actions. But if the PMD doesn't support the method, the
>>>>>>> application will see ENOTSUP and handle this
>>>>>>> gracefully: it will make no assumptions about what's guaranteed to
>>>>>>> be supported and what's not and will just keep on its old behavior:
>>>>>>> try to insert a flow, fail, fall back to another type of flow.
>>>>>>>
>>>>>
>>>>> I fully agree with your example, and think that this is the way to
>>>>> go, application should supply as much info as possible during startup.
>>>>
>>>> Right.
>>>>
>>>>> My question/comment is the negotiated result means that all of the
>>>>> actions are supported on the same rule?
>>>>> for example if application wants to add mark and tag on the same rule.
>>>>> (I know it doesn't make much sense) and the PMD can support both of
>>>>> them but not on the same rule, what should it return?
>>>>> Or for example if using the mark can only be supported if no decap
>>>>> action is set on this rule what should be the result?
>>>>>    From my understanding this function is only to let the PMD know that
>>>>> on some rules the application will use those actions, the checking
>>>>> if the action combination is valid only happens on validate function right?
>>>>
>>>> This API does not bind itself to flow API. It's *not* about enabling
>>>> support for metadata *actions* (they are conducted entirely *inside*
>>>> the NIC). It's about enabling *delivery* of metadata from the NIC to host.
>>>>
>>>
>>> Good point so why not use the same logic as the metadata and register it?
>>> Since in any case, this is something in the mbuf so maybe this should be the
>> answer?
>>
>> I didn't catch your thought. Could you please elaborate on it?
> 
> The metadata action just like the mark or flag is used to give application
> data that was set by a flow rule.
> To enable the metadata the application must register the metadata field.
> Since this happens during the creation of the mbuf it means that it must be
> created before the device start.
> 
> I understand that the mark and flag don't need to be registered in the mbuf
> since they have saved space but from application point of view there is no
> difference between the metadata and mark, so why doesn't the negotiate
> function handle the metadata?
> 
> I hope this is clearer.

Thank you. That's a lot clearer.

I inspected struct rte_flow_action_set_meta as well as 
rte_flow_dynf_metadata_register(). The latter API doesn't require that 
applications invoke it precisely before adapter start. It says "must be 
called prior to use SET_META action", and the comment before the 
structure says just "in advance". So, at a bare minimum, the API 
contract could've been made stricter in this respect. However, far 
more important points are as follows:

1) This API enables delivery of this "custom" metadata between the PMD 
and the application, whilst the API under review, as I noted before, 
negotiates delivery of various kinds of metadata between the NIC and the 
PMD. These are two completely different (albeit adjacent) stages of 
packet delivery process.

2) This API doesn't negotiate anything with the PMD. It doesn't interact 
with the PMD at all. It just reserves extra room in mbufs for the 
metadata field and exits.

3) As a consequence of (2), the PMD can't immediately learn about this 
field being enabled. It's forced to face this fact at some deferred 
point. If the PMD, for instance, learns about that during adapter start 
and if it for some reason decides to deny the use of this field, it 
won't be able to convey its decision to the application. As a result, 
the application will live in the wrong assumption that it has 
successfully enabled the feature.

4) Even if we add similar APIs to "register" more kinds of metadata 
(flag, mark, tunnel ID, etc) and re-define the meaning of all these APIs 
to say that not only they enable delivery of the metadata between the 
PMD and the application but also enable the HW transport to get the 
metadata delivered from the NIC to the PMD itself, we won't be able to 
use this set of APIs to actually *negotiate* something. The order of 
invocations will be confusing to the application. If the PMD can't 
combine some of these features, it won't be able to communicate this 
clearly to the application. It will have to silently disregard some of 
the "registered" features. And this is something that we probably want 
to avoid. Right?

But I tend to agree that the API under review could have one more (4th) 
flag to negotiate delivery of this "custom" metadata from the NIC to the 
PMD. At the same time, enabling delivery of this data from the PMD to 
the application will remain in the responsibility domain of 
rte_flow_dynf_metadata_register().

> 
>>
>>>
>>>> Say, you insert a flow rule to mark some packets. The NIC, internally
>>>> (in the
>>>> e-switch) adds the mark to matching packets. Yes, in the boundaries
>>>> of the NIC HW, the packets bear the mark on them. It has been set,
>>>> yes. But when time comes to *deliver* the packets to the host, the
>>>> NIC (at least, in net/sfc
>>>> case) has two options: either provide only a small chunk of the
>>>> metadata for each packet *to the host*, which doesn't include mark
>>>> ID, flag and RSS hash, OR, alternatively, provide the full set of
>>>> metadata. In the former option, the mark is simply not delivered.
>>>> Once again: it *has been set*, but simply will not be *delivered to the
>> host*.
>>>>
>>>> So, this API is about negotiating *delivery* of metadata. In pure
>>>> technical sense. And the set of flags that this API returns indicates
>>>> which kinds of metadata the NIC will be able to deliver simultaneously.
>>>>
>>>> For example, as I understand, in the case of tunnel offload, MLX5
>>>> claims Rx mark entirely for tunnel ID metadata, so, if an application
>>>> requests "MARK | TUNNEL_ID" with this API, this PMD should probably
>>>> want to respond with just "TUNNEL_ID". The application will see the
>>>> response and realise that, even if it adds its *own* (user) action
>>>> MARK to a flow and if the flow is not rejected by the PMD, it won't
>>>> be able to see the mark in the received mbufs (or the mark will be
>> incorrect).
>>>>
>>> So what should the application do if on some flows it wants MARK and on
>> other FLAG?
>>
>> You mentioned flows, so I'd like to stress this out one more time: what this
>> API cares about is solely the possibility to deliver metadata between the NIC
>> and the host. The host == the PMD (*not* application).
>>
> 
> I understand that you are only talking about enabling the action,
> meaning to let the PMD know that at some point there will be a rule
> that will use the mark action for example.
> Is my understanding correct?

Not really. The causal relationships are as follows. The application 
comes to realise that it will need to use, say, action MARK in flows. 
This, in turn, means that, in order to be able to actually see the mark 
in received packets, the application needs to ensure that a) the NIC 
will be able to deliver the mark to the PMD and b) that the PMD will be 
able to deliver the mark to the application. In particular, in the case 
of Rx mark, (b) doesn't need to be negotiated = field "mark" is anyway 
provisioned in the mbuf structure, so no need to enable it. But (a) 
needs to be negotiated. Hence this API.

> I don't understand your last comment about host == PMD since at the end
> this value should be given to the application.

Two different steps, Ori, two different steps. The first one is to 
deliver the mark from the NIC to the PMD. And the second one is to 
deliver the mark from the PMD to the application. As you might 
understand, mbufs get filled out on the second step. That's it.

> 
>>>   From DPDK viewpoint both of them can't be shared on the same rule
>>> (they are using the same space in mbuf) so the application will never
>>> ask for both of them in the same rule but he can on some rules ask for
>>> mark while on other request for FLAG, even in your code you added both
>> of them.
>>>
>>> So what should the PMD return if it can support both of them just not
>>> at the same rule?
>>
>> Please see above. This is not about rules. This is not about the way how flag
>> and mark are presented *by* the PMD *to* the application in mbufs.
>> Simultaneous use of actions FLAG and MARK in flows must be ruled out by
>> rte_flow_validate() / rte_flow_create(). The way how flag and mark are
>> *represented* in mbufs belongs in mbuf library responsibility domain.
>>
>> Consider the following operational sequence:
>>
>> 1) The NIC has a packet, which has metadata associated with it;
>> 2) The NIC transfers this packet to the host;
>> 3) The PMD sees the packet and its metadata;
>> 4) The PMD represents whatever available metadata in mbuf format.
>>
>> Features negotiated by virtue of this API (for instance, FLAG and MARK)
>> enable delivery of these kinds of metadata between points (2) and (3).
>>
>> And the problem of flag / mark co-existence in mbufs sits at point (4).
>>
>> -> Completely different problems, in fact.
>>
> 
> Agree.
> 
>>>
>>> One option is to document that the supported values are not per rule
>>> but for the entire port. For example in the example you gave MLX5 will
>>> support mark + flag but will not support mark + tunnel.
>>
>> Yes, for the port. Flow rules are a don't care to this API.
>>
>>>
>>> Also considering your example, the negotiation may result in subpar result.
>>> taking your example the PMD returned  TUNNEL_ID maybe application
>>> would prefer to have the mark and not the TUNNEL_ID. I understand that
>>> application can check and try again with just the MARK.
>>
>> Exactly. The Application can repeat negotiation with just MARK. Is there any
>> problem with that?
>>
> 
> I understand that the application can negotiate again and again.
> I just don't like that the PMD has logic and selects what he thinks will be best.
> 
> I wanted to suggest that the PMD will just tell what are the conflicts and the application
> will negotiate again based on its logic.

Well, I'm not saying that letting the PMD decide on the optimal feature 
subset is the only reasonable MO. But it simplifies the negotiation 
procedure a lot. Conveying conflicts and feature inter-dependencies to 
the application might be rather complex and prone to errors.

At this point I believe it's important to clarify: the original intent 
is to assume that the PMD will first consider enabling all requested 
features. Only in the case when it fails to do so should it come up with 
the optimal subset.
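
A rough sketch of that intent on the driver side, written as a stand-alone toy rather than a real PMD callback. The constraint (user mark and tunnel ID cannot be delivered together) and the preference for tunnel ID are assumptions made up for the example; toy_pmd_rx_meta_negotiate() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values copied from the patch under review. */
#define RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RX_META_USER_MARK (UINT64_C(1) << 1)
#define RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/*
 * Toy driver-side negotiation. Assumed hardware constraint: user mark
 * and tunnel ID cannot be delivered at the same time.
 */
static int
toy_pmd_rx_meta_negotiate(uint64_t *features)
{
	const uint64_t supported = RX_META_USER_FLAG | RX_META_USER_MARK |
				   RX_META_TUNNEL_ID;
	const uint64_t conflict = RX_META_USER_MARK | RX_META_TUNNEL_ID;
	uint64_t req;

	assert(features != NULL);

	/* Bits the driver does not know about can never be granted. */
	req = *features & supported;

	/*
	 * First choice: grant everything that was requested. Only when the
	 * request contains the conflicting pair does the driver fall back
	 * to its own preference (here: keep tunnel ID, drop user mark).
	 */
	if ((req & conflict) == conflict)
		req &= ~RX_META_USER_MARK;

	*features = req;
	return 0;
}
```

With this toy, a request for "MARK | TUNNEL_ID" comes back as "TUNNEL_ID", whereas a request for "FLAG | MARK" is granted unchanged.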

> 
>>> You are inserting logic to the PMD, maybe the function should just
>>> fail maybe returning the conflicting items?
>>
>> Why return conflicting items? The optimal subset (from the PMD's
>> perspective) should be returned. It's a silver lining. In the end, the application
>> can learn which features can be enabled and in what combinations. And it
>> can rely on the outcome of the negotiation process.
>>
> That is my point this is PMD perspective, not the application.
> how can a PMD define an optimal subset? How can it know what is more
> important to the application?

How does "ls" command know the optimal sort mode? Why does it prefer 
sorting by name over sorting by date? Thanks to its options, it allows 
the user to express their own preference. So does the API in question. 
If the application knows that tunnel offload is more important to it 
(compared to MARK, for instance), it can request just TUNNEL_ID. Why 
complicate this?
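
For illustration, the application can encode its own priorities as an ordered wish list and walk it until a set is granted in full. The sketch below is self-contained and hypothetical: negotiate_fn stands in for the negotiation call on one port, and toy_negotiate() is a made-up driver that cannot combine mark with tunnel ID and prefers tunnel ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values copied from the patch under review. */
#define RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RX_META_USER_MARK (UINT64_C(1) << 1)
#define RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical signature standing in for the negotiation call on one port. */
typedef int (*negotiate_fn)(uint64_t *features);

/*
 * Walk the wish list from most to least preferred and stop at the first
 * set that is granted in full. If no set is granted in full, pass zero
 * to clear the negotiation result and report that nothing was enabled.
 */
static uint64_t
app_pick_features(negotiate_fn negotiate, const uint64_t *wishes, size_t n)
{
	uint64_t none = 0;
	size_t i;

	assert(negotiate != NULL && wishes != NULL);

	for (i = 0; i < n; i++) {
		uint64_t granted = wishes[i];

		if (negotiate(&granted) != 0)
			break;
		if (granted == wishes[i])
			return granted;
	}

	(void)negotiate(&none);
	return 0;
}

/* Toy driver: mark and tunnel ID are exclusive, tunnel ID is preferred. */
static int
toy_negotiate(uint64_t *features)
{
	const uint64_t both = RX_META_USER_MARK | RX_META_TUNNEL_ID;

	if ((*features & both) == both)
		*features &= ~RX_META_USER_MARK;
	return 0;
}
```

An application that values MARK over tunnel offload lists "MARK | TUNNEL_ID", then "MARK", then "TUNNEL_ID", and ends up with MARK even though the toy driver itself would have picked TUNNEL_ID.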

> Also, the PMD logic is internal, so if for some reason
> the PMD happens to select what is best for the application, the application learns
> that this is a good value for it. A release later, the internal PMD logic changes,
> for example because a new feature was added or another customer made a request.
> Since this is internal to the PMD, the original app is not aware of the change and may fail.

The same argument applies equally to the default RSS table, for 
example. What if an application gets accustomed to the round-robin table 
being default in some specific PMD (and the PMD maintainers change the 
default RSS table all of a sudden)? Oops!

The truth is that the application shouldn't bind itself to some specific 
vendor / PMD. In any case. Hence the negotiation process. It's just a 
building block for some automation in the application.

> 
> We both agree that the application should check the result and renegotiate if needed
> I only suggested that the PMD will only return error and not assume he knows best.

I believe we should give this more thought. Maybe Andrew can join this 
conversation.

> 
> 
>>>
>>>
>>>
>>>> But some other PMDs (net/sfc, for instance) claim only a small fraction of
>> bits
>>>> in Rx mark to deliver tunnel ID information. Remaining bits are still
>> available
>>>> for delivery of *user* mark ID. Please see an example at
>>>> https://patches.dpdk.org/project/dpdk/patch/20210929205730.775-2-
>>>> ivan.malov@oktetlabs.ru/
>>>> . In this case, the PMD may want to return both flags in the response:
>>>> "MARK | TUNNEL_ID". This way, the application knows that both features
>>>> are enabled and available for use.
>>>>
>>>> Now. I anticipate more questions asking why wouldn't we prefer flow API
>>>> terminology or why wouldn't we add an API for negotiating support for
>>>> metadata *actions* and not just metadata *delivery*. There's an answer.
>>>> Always has been.
>>>>
>>>> The thing is, the use of *actions* is very complicated. For example, the
>> PMD
>>>> may support action MARK for "transfer" flows but not for non-"transfer"
>>>> ones. Also, simultaneous use of multiple different metadata actions may
>> not
>>>> be possible. And, last but not least, if we force the application to check
>>>> support for *actions* on action-after-action basis, the order of checks will
>> be
>>>> very confusing to applications.
>>>>
>>>> Previously, in this thread, Thomas suggested to go for exactly this type of
>>>> API, to check support for actions one-by-one, without any context
>>>> ("transfer" / non-"transfer"). I'm afraid, this won't be OK.
>>>>
>>> +1 to keeping it as a separate API. (I agree that action limitations are a
>>> very complex matrix)
>>>
>>>>>
>>>>> In any case I think this is good idea and I will see how we can add a
>>>>> more generic approach of this API to the new API that I'm going to
>> present.
>>>>>
>>>>>
>>>>>>> So no breakages with this API.
>>>>>>>
>>>>>>>>
>>>>>>>> Please see more comments inline.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ori
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>>>>>> Sent: Thursday, September 23, 2021 2:20 PM
>>>>>>>>> Subject: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
>>>>>>>>> Rx meta data
>>>>>>>>>
>>>>>>>>> Delivery of mark, flag and the likes might affect small packet
>>>>>>>>> performance.
>>>>>>>>> If these features are disabled by default, enabling them in
>>>>>>>>> started state without causing traffic disruption may not always be
>>>> possible.
>>>>>>>>>
>>>>>>>>> Let applications negotiate delivery of Rx meta data beforehand.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>>>>>> Reviewed-by: Andrew Rybchenko
>>>> <andrew.rybchenko@oktetlabs.ru>
>>>>>>>>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
>>>>>>>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>>>>>>>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>>>>>>>>> ---
>>>>>>>>>      app/test-flow-perf/main.c              | 21 ++++++++++++
>>>>>>>>>      app/test-pmd/testpmd.c                 | 26 +++++++++++++++
>>>>>>>>>      doc/guides/rel_notes/release_21_11.rst |  9 ++++++
>>>>>>>>>      lib/ethdev/ethdev_driver.h             | 19 +++++++++++
>>>>>>>>>      lib/ethdev/rte_ethdev.c                | 25 ++++++++++++++
>>>>>>>>>      lib/ethdev/rte_ethdev.h                | 45
>>>>>>>>> ++++++++++++++++++++++++++
>>>>>>>>>      lib/ethdev/rte_flow.h                  | 12 +++++++
>>>>>>>>>      lib/ethdev/version.map                 |  3 ++
>>>>>>>>>      8 files changed, 160 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
>>>>>>>>> index 9be8edc31d..48eafffb1d 100644
>>>>>>>>> --- a/app/test-flow-perf/main.c
>>>>>>>>> +++ b/app/test-flow-perf/main.c
>>>>>>>>> @@ -1760,6 +1760,27 @@ init_port(void)
>>>>>>>>>              rte_exit(EXIT_FAILURE, "Error: can't init mbuf
>>>>>>>>> pool\n");
>>>>>>>>>
>>>>>>>>>          for (port_id = 0; port_id < nr_ports; port_id++) {
>>>>>>>>> +        uint64_t rx_meta_features = 0;
>>>>>>>>> +
>>>>>>>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>>>>>>>>> +        rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>>>>>>>>> +
>>>>>>>>> +        ret = rte_eth_rx_meta_negotiate(port_id, &rx_meta_features);
>>>>>>>>> +        if (ret == 0) {
>>>>>>>>> +            if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
>>>>>>>>> +                printf(":: flow action FLAG will not affect Rx mbufs on port=%u\n",
>>>>>>>>> +                       port_id);
>>>>>>>>> +            }
>>>>>>>>> +
>>>>>>>>> +            if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
>>>>>>>>> +                printf(":: flow action MARK will not affect Rx mbufs on port=%u\n",
>>>>>>>>> +                       port_id);
>>>>>>>>> +            }
>>>>>>>>> +        } else if (ret != -ENOTSUP) {
>>>>>>>>> +            rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port=%u: %s\n",
>>>>>>>>> +                 port_id, rte_strerror(-ret));
>>>>>>>>> +        }
>>>>>>>>> +
>>>>>>>>>              ret = rte_eth_dev_info_get(port_id, &dev_info);
>>>>>>>>>              if (ret != 0)
>>>>>>>>>                  rte_exit(EXIT_FAILURE, diff --git
>>>>>>>>> a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
>>>>>>>>> 97ae52e17e..7a8da3d7ab 100644
>>>>>>>>> --- a/app/test-pmd/testpmd.c
>>>>>>>>> +++ b/app/test-pmd/testpmd.c
>>>>>>>>> @@ -1485,10 +1485,36 @@ static void
>>>>>>>>>      init_config_port_offloads(portid_t pid, uint32_t socket_id)  {
>>>>>>>>>          struct rte_port *port = &ports[pid];
>>>>>>>>> +    uint64_t rx_meta_features = 0;
>>>>>>>>>          uint16_t data_size;
>>>>>>>>>          int ret;
>>>>>>>>>          int i;
>>>>>>>>>
>>>>>>>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_FLAG;
>>>>>>>>> +    rx_meta_features |= RTE_ETH_RX_META_USER_MARK;
>>>>>>>>> +    rx_meta_features |= RTE_ETH_RX_META_TUNNEL_ID;
>>>>>>>>> +
>>>>>>>>> +    ret = rte_eth_rx_meta_negotiate(pid, &rx_meta_features);
>>>>>>>>> +    if (ret == 0) {
>>>>>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_FLAG)) {
>>>>>>>>> +            TESTPMD_LOG(INFO, "Flow action FLAG will not affect Rx mbufs on port %u\n",
>>>>>>>>> +                    pid);
>>>>>>>>> +        }
>>>>>>>>> +
>>>>>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_USER_MARK)) {
>>>>>>>>> +            TESTPMD_LOG(INFO, "Flow action MARK will not affect Rx mbufs on port %u\n",
>>>>>>>>> +                    pid);
>>>>>>>>> +        }
>>>>>>>>> +
>>>>>>>>> +        if (!(rx_meta_features & RTE_ETH_RX_META_TUNNEL_ID)) {
>>>>>>>>> +            TESTPMD_LOG(INFO, "Flow tunnel offload support might be limited or unavailable on port %u\n",
>>>>>>>>> +                    pid);
>>>>>>>>> +        }
>>>>>>>>> +    } else if (ret != -ENOTSUP) {
>>>>>>>>> +        rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port %u: %s\n",
>>>>>>>>> +             pid, rte_strerror(-ret));
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>>          port->dev_conf.txmode = tx_mode;
>>>>>>>>>          port->dev_conf.rxmode = rx_mode;
>>>>>>>>>
>>>>>>>>> diff --git a/doc/guides/rel_notes/release_21_11.rst
>>>>>>>>> b/doc/guides/rel_notes/release_21_11.rst
>>>>>>>>> index 19356ac53c..6674d4474c 100644
>>>>>>>>> --- a/doc/guides/rel_notes/release_21_11.rst
>>>>>>>>> +++ b/doc/guides/rel_notes/release_21_11.rst
>>>>>>>>> @@ -106,6 +106,15 @@ New Features
>>>>>>>>>        Added command-line options to specify total number of
>>>>>>>>> processes and
>>>>>>>>>        current process ID. Each process owns subset of Rx and Tx
>> queues.
>>>>>>>>>
>>>>>>>>> +* **Added an API to negotiate delivery of specific parts of Rx meta data**
>>>>>>>>> +
>>>>>>>>> +  A new API, ``rte_eth_rx_meta_negotiate()``, was added.
>>>>>>>>> +  The following parts of Rx meta data were defined:
>>>>>>>>> +
>>>>>>>>> +  * ``RTE_ETH_RX_META_USER_FLAG``
>>>>>>>>> +  * ``RTE_ETH_RX_META_USER_MARK``
>>>>>>>>> +  * ``RTE_ETH_RX_META_TUNNEL_ID``
>>>>>>>>> +
>>>>>>>>>
>>>>>>>>>      Removed Items
>>>>>>>>>      -------------
>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h
>>>>>>>>> b/lib/ethdev/ethdev_driver.h index 40e474aa7e..96e0c60cae
>> 100644
>>>>>>>>> --- a/lib/ethdev/ethdev_driver.h
>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>>>>>>> @@ -789,6 +789,22 @@ typedef int
>> (*eth_get_monitor_addr_t)(void
>>>>>>>>> *rxq, typedef int (*eth_representor_info_get_t)(struct
>> rte_eth_dev
>>>>>>>>> *dev,
>>>>>>>>>          struct rte_eth_representor_info *info);
>>>>>>>>>
>>>>>>>>> +/**
>>>>>>>>> + * @internal
>>>>>>>>> + * Negotiate delivery of specific parts of Rx meta data.
>>>>>>>>> + *
>>>>>>>>> + * @param dev
>>>>>>>>> + *   Port (ethdev) handle
>>>>>>>>> + *
>>>>>>>>> + * @param[inout] features
>>>>>>>>> + *   Feature selection buffer
>>>>>>>>> + *
>>>>>>>>> + * @return
>>>>>>>>> + *   Negative errno value on error, zero otherwise
>>>>>>>>> + */
>>>>>>>>> +typedef int (*eth_rx_meta_negotiate_t)(struct rte_eth_dev *dev,
>>>>>>>>> +                       uint64_t *features);
>>>>>>>>> +
>>>>>>>>>      /**
>>>>>>>>>       * @internal A structure containing the functions exported by
>>>>>>>>> an Ethernet driver.
>>>>>>>>>       */
>>>>>>>>> @@ -949,6 +965,9 @@ struct eth_dev_ops {
>>>>>>>>>
>>>>>>>>>          eth_representor_info_get_t representor_info_get;
>>>>>>>>>          /**< Get representor info. */
>>>>>>>>> +
>>>>>>>>> +    eth_rx_meta_negotiate_t rx_meta_negotiate;
>>>>>>>>> +    /**< Negotiate delivery of specific parts of Rx meta data. */
>>>>>>>>>      };
>>>>>>>>>
>>>>>>>>>      /**
>>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>>>>>>>>> index daf5ca9242..49cb84d64c 100644
>>>>>>>>> --- a/lib/ethdev/rte_ethdev.c
>>>>>>>>> +++ b/lib/ethdev/rte_ethdev.c
>>>>>>>>> @@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t port_id,
>>>>>>>>>          return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> +int
>>>>>>>>> +rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features)
>>>>>>>>> +{
>>>>>>>>> +    struct rte_eth_dev *dev;
>>>>>>>>> +
>>>>>>>>> +    RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>>>>>>>>> +    dev = &rte_eth_devices[port_id];
>>>>>>>>> +
>>>>>>>>> +    if (dev->data->dev_configured != 0) {
>>>>>>>>> +        RTE_ETHDEV_LOG(ERR,
>>>>>>>>> +            "The port (id=%"PRIu16") is already configured\n",
>>>>>>>>> +            port_id);
>>>>>>>>> +        return -EBUSY;
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    if (features == NULL) {
>>>>>>>>> +        RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
>>>>>>>>> +        return -EINVAL;
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_meta_negotiate, -ENOTSUP);
>>>>>>>>> +    return eth_err(port_id,
>>>>>>>>> +               (*dev->dev_ops->rx_meta_negotiate)(dev, features));
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>      RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
>>>>>>>>>
>>>>>>>>>      RTE_INIT(ethdev_init_telemetry) diff --git
>>>>>>>>> a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
>>>>>>>>> 1da37896d8..8467a7a362 100644
>>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>>>>>> @@ -4888,6 +4888,51 @@ __rte_experimental  int
>>>>>>>>> rte_eth_representor_info_get(uint16_t port_id,
>>>>>>>>>                       struct rte_eth_representor_info *info);
>>>>>>>>>
>>>>>>>>> +/** The ethdev sees flagged packets if there are flows with action FLAG. */
>>>>>>>>> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
>>>>>>>>> +
>>>>>>>>> +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
>>>>>>>>> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
>>>>>>>>> +
>>>>>>>>> +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
>>>>>>>>> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * @warning
>>>>>>>>> + * @b EXPERIMENTAL: this API may change without prior notice
>>>>>>>>> + *
>>>>>>>>> + * Negotiate delivery of specific parts of Rx meta data.
>>>>>>>>> + *
>>>>>>>>> + * Invoke this API before the first rte_eth_dev_configure() invocation
>>>>>>>>> + * to let the PMD make preparations that are inconvenient to do later.
>>>>>>>>> + *
>>>>>>>>> + * The negotiation process is as follows:
>>>>>>>>> + *
>>>>>>>>> + * - the application requests features intending to use at least some of them;
>>>>>>>>> + * - the PMD responds with the guaranteed subset of the requested feature set;
>>>>>>>>> + * - the application can retry negotiation with another set of features;
>>>>>>>>> + * - the application can pass zero to clear the negotiation result;
>>>>>>>>> + * - the last negotiated result takes effect upon the ethdev start.
>>>>>>>>> + *
>>>>>>>>> + * If this API is unsupported, the application should gracefully ignore that.
>>>>>>>>> + *
>>>>>>>>> + * @param port_id
>>>>>>>>> + *   Port (ethdev) identifier
>>>>>>>>> + *
>>>>>>>>> + * @param[inout] features
>>>>>>>>> + *   Feature selection buffer
>>>>>>>>> + *
>>>>>>>>> + * @return
>>>>>>>>> + *   - (-EBUSY) if the port can't handle this in its current state;
>>>>>>>>> + *   - (-ENOTSUP) if the method itself is not supported by the PMD;
>>>>>>>>> + *   - (-ENODEV) if *port_id* is invalid;
>>>>>>>>> + *   - (-EINVAL) if *features* is NULL;
>>>>>>>>> + *   - (-EIO) if the device is removed;
>>>>>>>>> + *   - (0) on success
>>>>>>>>> + */
>>>>>>>>> +__rte_experimental
>>>>>>>>> +int rte_eth_rx_meta_negotiate(uint16_t port_id, uint64_t *features);
>>>>>>>>
>>>>>>>> I don't think meta is the best name since we also have meta item
>>>>>>>> and the word meta can be used in other cases.
>>>>>>>
>>>>>>> I'm no expert in naming. What could be a better term for this?
>>>>>>> Personally, I'd rather not perceive "meta" the way you describe.
>>>>>>> It's not just "meta". It's "rx_meta", and the flags supplied with
>>>>>>> this API provide enough context to explain what it's all about.
>>>>>>
>>>>>> Thinking overnight about it I'd suggest full "metadata".
>>>>>> Yes, it will name a bit longer, but less confusing versus term META
>>>>>> already used in flow API.
>>>>>>
>>>>> Following my above comments, I think it should be part of the new API
>>>>> but in any case what about rx_flow_action_negotiate?
>>>>
>>>> See my thoughts above. It makes no sense to negotiate *support for
>>>> actions*. Existing "rte_flow_validate()" already does that job. The new
>>>> "negotiate Rx metadata" API is all about *delivery* of metadata which is
>>>> supposed to be *already* set for the packets *inside* the NIC. So, we
>>>> negotiate *delivery from the NIC to the host*. Nothing more.
>>>>
>>> Agree with your comment but then maybe we should go to the register
>>> approach just like metadata?
>>
>> Don't you mean "registering mbuf dynamic field / flag" by any chance?
>> Even if it's technically possible, this may complicate the API contract
>> because the key idea here is to demand that the application negotiate
>> metadata delivery at the earliest step possible (before configuring the
>> port), whilst a dynamic field / flag can be (theoretically) registered
>> at any time. But, of course, feel free to elaborate on your idea.
>>
> 
> Yes, like I said above I don't see a difference between metadata
> and mark, at least not from the application usage.
> I assume you need this info at device start and by definition
> the registration should happen before. (mbuf should be configured
> before start)

Please see my thoughts about dynamic fields above.

> 
>> We should make sure that we all reach an agreement.
>>
> 
> +1 I think we can agree that there is a need for letting the PMD
> know before the start that some action will be used.
> 
> And I'm sorry if I sound picky and hard, this is not my intention.
> I'm also doing my best to review this as fast as I can.
> My open issues and priorities:
> 1. The API breakage. A possible solution is adding support to the rest of the PMDs, or updating the doc
> to say that if the function is not supported, the application should assume that
> it can still use the mark / flag. -- blocker, this must be resolved.

I see.

> 2. Function name. My main issue is that metadata should be treated just like mark;
> maybe the solution can be adding a metadata flag to this function.
> The drawback is that the application needs to call two functions to configure
> metadata. -- high priority, but if you can give me good reasoning, not just
> "we don't need to register the mark", I guess I will be O.K.

Please see my thoughts above. This API negotiates metadata delivery on 
the path between the NIC and the PMD. The metadata mbuf register API 
does this on the path between the PMD and the application. So no 
contradiction here.

> 3. If PMD has internal logic in case of conflict or not.
> Please think about it. -- low prio I will agree to what you decide.
> but if you decide that PMD will have internal logic then this must be documented
> so the application will know not to rely on the results.

Please see my reply above. The application can rely on the outcome of 
the negotiation (the last negotiated subset of features), but it should 
know that if it disagrees with the suggested feature subset, it can 
re-negotiate. All fair and square.
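
To make the contract concrete, here is a small self-contained model of the ethdev-level wrapper from the patch. The order of checks and the error codes follow the quoted rte_eth_rx_meta_negotiate() code; struct toy_dev, the "negotiated" field that remembers the last result, and toy_grant_flag_only() are assumptions introduced for this sketch and are not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values copied from the patch under review. */
#define RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RX_META_USER_MARK (UINT64_C(1) << 1)
#define RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical stand-in for an ethdev port. */
struct toy_dev {
	bool configured;      /* true once the port has been configured */
	/* Driver callback; NULL models a PMD without support for the API. */
	int (*rx_meta_negotiate)(struct toy_dev *dev, uint64_t *features);
	uint64_t negotiated;  /* assumption: last negotiated subset */
};

/* Same order of checks as in the quoted patch. */
static int
toy_rx_meta_negotiate(struct toy_dev *dev, uint64_t *features)
{
	int ret;

	if (dev == NULL)
		return -ENODEV;
	if (dev->configured)
		return -EBUSY;
	if (features == NULL)
		return -EINVAL;
	if (dev->rx_meta_negotiate == NULL)
		return -ENOTSUP;

	ret = dev->rx_meta_negotiate(dev, features);
	if (ret == 0)
		dev->negotiated = *features; /* the last result wins */
	return ret;
}

/* Toy driver callback that can only deliver the user flag. */
static int
toy_grant_flag_only(struct toy_dev *dev, uint64_t *features)
{
	(void)dev;
	*features &= RX_META_USER_FLAG;
	return 0;
}
```

An application following the doc comment in the patch would treat -ENOTSUP as "carry on as before" and any other negative value as a real failure, exactly as the test-pmd and flow-perf hunks do.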

> 
> Best,
> Ori
> 
>>>
>>> Best,
>>> Ori
>>>>>
>>>>>> Andrew.
>>>>> Best,
>>>>> Ori
>>>>>
>>>>
>>>> --
>>>> Ivan M
>>
>> --
>> Ivan M

-- 
Ivan M

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-03 23:50                     ` Ivan Malov
@ 2021-10-04  6:56                       ` Ori Kam
  2021-10-04 11:39                         ` Ivan Malov
  0 siblings, 1 reply; 97+ messages in thread
From: Ori Kam @ 2021-10-04  6:56 UTC (permalink / raw)
  To: Ivan Malov, Andrew Rybchenko, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi Ivan,

> -----Original Message-----
> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
> Sent: Monday, October 4, 2021 2:50 AM
> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
> data
> 
> Hi Ori,
> 
> On 04/10/2021 00:04, Ori Kam wrote:
> > Hi Ivan,
> >
> > Sorry for the long review.
> >
> >> -----Original Message-----
> >> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
> >> Sent: Sunday, October 3, 2021 8:30 PM
> >> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
> >> Rx meta data
> >>
> >> Hi Ori,
> >>
> >> On 03/10/2021 14:01, Ori Kam wrote:
> >>> Hi Ivan,
> >>>
> >>>> -----Original Message-----
> >>>> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
> >>>> Sent: Sunday, October 3, 2021 12:30 PM
> >>>>
> >>>> Hi Ori,
> >>>>
> >>>> Thanks for reviewing this.
> >>>>
> >>>
> >>> No problem.
> >>>
> >>>> On 03/10/2021 10:42, Ori Kam wrote:
> >>>>> Hi Andrew and Ivan,
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>>>>> Sent: Friday, October 1, 2021 9:50 AM
> >>>>>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery
> >>>>>> of Rx meta data
> >>>>>>
> >>>>>> On 9/30/21 10:07 PM, Ivan Malov wrote:
> >>>>>>> Hi Ori,
> >>>>>>>
> >>>>>>> On 30/09/2021 17:59, Ori Kam wrote:
> >>>>>>>> Hi Ivan,

[Snip]

> >>> Good point so why not use the same logic as the metadata and register
> it?
> >>> Since in any case, this is something in the mbuf so maybe this
> >>> should be the
> >> answer?
> >>
> >> I didn't catch your thought. Could you please elaborate on it?
> >
> > The metadata action just like the mark or flag is used to give
> > application data that was set by a flow rule.
> > To enable the metadata the application must register the metadata field.
> > Since this happens during the creation of the mbuf it means that it
> > must be created before the device start.
> >
> > I understand that the mark and flag don't need to be registered in the
> > mbuf since they have saved space but from application point of view
> > there is no difference between the metadata and mark, so why does
> > negotiate function doesn't handle the metadata?
> >
> > I hope this is clearer.
> 
> Thank you. That's a lot clearer.
> 
> I inspected struct rte_flow_action_set_meta as well as
> rte_flow_dynf_metadata_register(). The latter API doesn't require that
> applications invoke it precisely before adapter start. It says "must be called
> prior to use SET_META action", and the comment before the structure says
> just "in advance". So, at a bare minimum, the API contract could've been
> made more strict with this respect. However, far more important points are
> as follows:
> 

Agreed, that doc should be updated, but by definition this must be set before mbuf
creation, which means before device start.

> 1) This API enables delivery of this "custom" metadata between the PMD
> and the application, whilst the API under review, as I noted before,
> negotiates delivery of various kinds of metadata between the NIC and the
> PMD. These are two completely different (albeit adjacent) stages of packet
> delivery process.
>
They are exactly alike. In the metadata case, too, the registration does two things:
it saves a place for the info in the mbuf and tells the PMD that it should configure the NIC
to supply this information upon request.
Even in your PMD, assuming that it can support the metadata, you will need to configure
it; otherwise, when the application requests this data using a rule, you will be in the
same spot you are now with the mark.

> 2) This API doesn't negotiate anything with the PMD. It doesn't interact with
> the PMD at all. It just reserves extra room in mbufs for the metadata field
> and exits.
> 
> 3) As a consequence of (2), the PMD can't immediately learn about this field
> being enabled. It's forced to face this fact at some deferred point. If the
> PMD, for instance, learns about that during adapter start and if it for some
> reason decides to deny the use of this field, it won't be able to convey its
> decision to the application. As a result, the application will live in the wrong
> assumption that it has successfully enabled the feature.
>
> 4) Even if we add similar APIs to "register" more kinds of metadata (flag,
> mark, tunnel ID, etc) and re-define the meaning of all these APIs to say that
> not only they enable delivery of the metadata between the PMD and the
> application but also enable the HW transport to get the metadata delivered
> from the NIC to the PMD itself, we won't be able to use this set of APIs to
> actually *negotiate* something. The order of invocations will be confusing to
> the application. If the PMD can't combine some of these features, it won't be
> able to communicate this clearly to the application. It will have to silently
> disregard some of the "registered" features. And this is something that we
> probably want to avoid. Right?
> 
> But I tend to agree that the API under review could have one more (4th) flag
> to negotiate delivery of this "custom" metadata from the NIC to the PMD. At
> the same time, enabling delivery of this data from the PMD to the application
> will remain in the responsibility domain of
> rte_flow_dynf_metadata_register().
> 

I agree and think this is the best solution.
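
As a rough illustration of that split of responsibilities, the toy below treats the custom metadata as visible to the application only when both halves are in place. RX_META_USER_META is a HYPOTHETICAL name and bit for the proposed fourth flag (it is not in the patch), and the dynamic field registration is modelled by a plain offset that is negative until the field is registered, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values copied from the patch under review. */
#define RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RX_META_USER_MARK (UINT64_C(1) << 1)
#define RX_META_TUNNEL_ID (UINT64_C(1) << 2)
/* HYPOTHETICAL fourth flag: neither the name nor the bit is from the patch. */
#define RX_META_USER_META (UINT64_C(1) << 3)

/*
 * The custom metadata reaches the application only if:
 *  - NIC -> PMD delivery was negotiated (the proposed fourth flag), and
 *  - PMD -> application delivery was enabled by registering the mbuf
 *    dynamic field (modelled as a non-negative offset).
 */
static bool
custom_meta_visible(uint64_t negotiated, int32_t dynf_offset)
{
	bool nic_to_pmd = (negotiated & RX_META_USER_META) != 0;
	bool pmd_to_app = dynf_offset >= 0;

	return nic_to_pmd && pmd_to_app;
}
```

Either half alone is not enough in this model: a negotiated flag without a registered field leaves the PMD with nowhere to put the value, and a registered field without negotiation leaves it with nothing to put there.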

> >
> >>
> >>>
> >>>> Say, you insert a flow rule to mark some packets. The NIC,
> >>>> internally (in the
> >>>> e-switch) adds the mark to matching packets. Yes, in the boundaries
> >>>> of the NIC HW, the packets bear the mark on them. It has been set,
> >>>> yes. But when time comes to *deliver* the packets to the host, the
> >>>> NIC (at least, in net/sfc
> >>>> case) has two options: either provide only a small chunk of the
> >>>> metadata for each packet *to the host*, which doesn't include mark
> >>>> ID, flag and RSS hash, OR, alternatively, provide the full set of
> >>>> metadata. In the former option, the mark is simply not delivered.
> >>>> Once again: it *has been set*, but simply will not be *delivered to
> >>>> the
> >> host*.
> >>>>
> >>>> So, this API is about negotiating *delivery* of metadata. In pure
> >>>> technical sense. And the set of flags that this API returns
> >>>> indicates which kinds of metadata the NIC will be able to deliver
> simultaneously.
> >>>>
> >>>> For example, as I understand, in the case of tunnel offload, MLX5
> >>>> claims Rx mark entirely for tunnel ID metadata, so, if an
> >>>> application requests "MARK | TUNNEL_ID" with this API, this PMD
> >>>> should probably want to respond with just "TUNNEL_ID". The
> >>>> application will see the response and realise that, even if it adds
> >>>> its *own* (user) action MARK to a flow and if the flow is not
> >>>> rejected by the PMD, it won't be able to see the mark in the
> >>>> received mbufs (or the mark will be
> >> incorrect).
> >>>>
> >>> So what should the application do if on some flows it wants MARK and
> >>> on
> >> other FLAG?
> >>
> >> You mentioned flows, so I'd like to stress this out one more time:
> >> what this API cares about is solely the possibility to deliver
> >> metadata between the NIC and the host. The host == the PMD (*not*
> application).
> >>
> >
> > I understand that you are only talking about enabling the action,
> > meaning to let the PMD know that at some point there will be a rule
> > that will use the mark action for example.
> > Is my understanding correct?
> 
> Not really. The causal relationships are as follows. The application comes to
> realise that it will need to use, say, action MARK in flows.
> This, in turn, means that, in order to be able to actually see the mark in
> received packets, the application needs to ensure that a) the NIC will be able
> to deliver the mark to the PMD and b) that the PMD will be able to deliver
> the mark to the application. In particular, in the case of Rx mark, (b) doesn't
> need to be negotiated = field "mark" is anyway provisioned in the mbuf
> structure, so no need to enable it. But (a) needs to be negotiated. Hence this
> API.
> 
Please see my comment above; I think we both agree.

> > I don't understand your last comment about host == PMD since at the
> > end this value should be given to the application.
> 
> Two different steps, Ori, two different steps. The first one is to deliver the
> mark from the NIC to the PMD. And the second one is to deliver the mark
> from the PMD to the application. As you might understand, mbufs get filled
> out on the second step. That's it.
> 
> >
> >>>   From DPDK viewpoint both of them can't be shared on the same rule
> >>> (they are using the same space in mbuf) so the application will never
> >>> ask for both of them in the same rule but he can on some rules ask for
> >>> mark while on other request for FLAG, even in your code you added
> both
> >> of them.
> >>>
> >>> So what should the PMD return if it can support both of them just not
> >>> at the same rule?
> >>
> >> Please see above. This is not about rules. This is not about the way how
> flag
> >> and mark are presented *by* the PMD *to* the application in mbufs.
> >> Simultaneous use of actions FLAG and MARK in flows must be ruled out
> by
> >> rte_flow_validate() / rte_flow_create(). The way how flag and mark are
> >> *represented* in mbufs belongs in mbuf library responsibility domain.
> >>
> >> Consider the following operational sequence:
> >>
> >> 1) The NIC has a packet, which has metadata associated with it;
> >> 2) The NIC transfers this packet to the host;
> >> 3) The PMD sees the packet and its metadata;
> >> 4) The PMD represents whatever available metadata in mbuf format.
> >>
> >> Features negotiated by virtue of this API (for instance, FLAG and MARK)
> >> enable delivery of these kinds of metadata between points (2) and (3).
> >>
> >> And the problem of flag / mark co-existence in mbufs sits at point (4).
> >>
> >> -> Completely different problems, in fact.
> >>
> >
> > Agree.
> >
> >>>
> >>> One option is to document that the supported values are not per rule
> >>> but for the entire port. For example in the example you gave MLX5 will
> >>> support mark + flag but will not support mark + tunnel.
> >>
> >> Yes, for the port. Flow rules are a don't care to this API.
> >>
> >>>
> >>> Also considering your example, the negotiation may result in subpar
> result.
> >>> taking your example the PMD returned  TUNNEL_ID maybe application
> >>> would prefer to have the mark and not the TUNNEL_ID. I understand
> that
> >>> application can check and try again with just the MARK.
> >>
> >> Exactly. The Application can repeat negotiation with just MARK. Is there
> any
> >> problem with that?
> >>
> >
> > I understand that the application can negotiate again and again.
> > I just don't like that the PMD has logic and selects what he thinks will be
> best.
> >
> > I wanted to suggest that the PMD will just tell what are the conflicts and
> the application
> > will negotiate again based on its logic.
> 
> Well, I'm not saying that letting the PMD decide on the optimal feature
> subset is the only reasonable MO. But it simplifies the negotiation
> procedure a lot. Conveying conflicts and feature inter-dependencies to
> the application might be rather complex and prone to errors.
> 
> At this point I believe it's important to clarify: the original intent
> is to assume that the PMD will first consider enabling all requested
> features. Only in the case when it fails to do so should it come up with
> the optimal subset.
> 

I understand. My issue is with the latter case: how can the PMD know what
the optimal subset is?

> >
> >>> You are inserting logic to the PMD, maybe the function should just
> >>> fail maybe returning the conflicting items?
> >>
> >> Why return conflicting items? The optimal subset (from the PMD's
> >> perspective) should be returned. It's a silver lining. In the end, the
> application
> >> can learn which features can be enabled and in what combinations. And it
> >> can rely on the outcome of the negotiation process.
> >>
> > That is my point this is PMD perspective, not the application.
> > how can a PMD define an optimal subset? How can it know what is more
> > important to the application?
> 
> How does "ls" command know the optimal sort mode? Why does it prefer
> sorting by name over sorting by date? Thanks to its options, it allows
> the user to express their own preference. So does the API in question.
> If the application knows that tunnel offload is more important to it
> (compared to MARK, for instance), it can request just TUNNEL_ID. Why
> complicate this?
> 
I don't agree with your example: "ls" is clearly defined, and each
time you run it you get the same order. It doesn't change between versions,
while in this case there will be changes between versions.
Think about it this way: let's assume that the PMD doesn't support TUNNEL_ID,
and the application requests both TUNNEL_ID and MARK at startup.
The PMD returns only MARK; the application checks and sees that the PMD
didn't return TUNNEL_ID, so it negotiates again only to learn that nothing
is supported; then the application tries only MARK, and to this the PMD agrees.

Again this is not critical to me. But keep it in mind.

> > Also, the PMD logic is internal so if for some reason
> > the PMD selected the best for the application by chance, so the application
> learns
> > that this is a good value for him. A release later the internal PMD logic
> changes
> > for example, a new feature was added, other customer requests.
> > since this is PMD the original app is not aware of this change and may fail.
> 
> The same argumentation can equally apply to default RSS table, for
> example. What if an application gets accustomed to the round-robin table
> being default in some specific PMD (and the PMD maintainers change
> default RSS table out of a sudden)? Oops!
> 
Yes, but this is why the user has the option to select the mode.
In the case of RSS, if the requested mode isn't supported, the PMD fails; it
doesn't just select a different algorithm, right?

> The truth is that the application shouldn't bind itself to some specific
> vendor / PMD. In any case. Hence the negotiation process. It's just a
> building block for some automation in the application.
> 
> >
> > We both agree that the application should check the result and renegotiate
> if needed
> > I only suggested that the PMD will only return error and not assume he
> knows best.
> 
> I believe we should give this more thought. Maybe Andrew can join this
> conversation.
> 
I fully agree, let's sleep on it.
This will not be a blocker.

> >
> >
> >>>
> >>>
> >>>
> >>>> But some other PMDs (net/sfc, for instance) claim only a small fraction
> of
> >> bits
> >>>> in Rx mark to deliver tunnel ID information. Remaining bits are still
> >> available
> >>>> for delivery of *user* mark ID. Please see an example at
> >>>> https://patches.dpdk.org/project/dpdk/patch/20210929205730.775-2-
> >>>> ivan.malov@oktetlabs.ru/
> >>>> . In this case, the PMD may want to return both flags in the response:
> >>>> "MARK | TUNNEL_ID". This way, the application knows that both
> features
> >>>> are enabled and available for use.
> >>>>
> >>>> Now. I anticipate more questions asking why wouldn't we prefer flow
> API
> >>>> terminology or why wouldn't we add an API for negotiating support for
> >>>> metadata *actions* and not just metadata *delivery*. There's an
> answer.
> >>>> Always has been.
> >>>>
> >>>> The thing is, the use of *actions* is very complicated. For example, the
> >> PMD
> >>>> may support action MARK for "transfer" flows but not for non-
> "transfer"
> >>>> ones. Also, simultaneous use of multiple different metadata actions
> may
> >> not
> >>>> be possible. And, last but not least, if we force the application to check
> >>>> support for *actions* on action-after-action basis, the order of checks
> will
> >> be
> >>>> very confusing to applications.
> >>>>
> >>>> Previously, in this thread, Thomas suggested to go for exactly this type
> of
> >>>> API, to check support for actions one-by-one, without any context
> >>>> ("transfer" / non-"transfer"). I'm afraid, this won't be OK.
> >>>>
> >>> +1 to keeping it as a separated API. (I agree actions limitation are very
> >> complex metrix)
> >>>
> >>>>>
> >>>>> In any case I think this is good idea and I will see how we can add a
> >>>>> more generic approach of this API to the new API that I'm going to
> >> present.
> >>>>>
> >>>>>
> >>>>>>> So no breakages with this API.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Please see more comments inline.


[Snip]

> > Yes, like I said above I don't see a difference between metadata
> > and mark, at least not from the application usage.
> > I assume you need this info at device start and by definition
> > the registration should happen before. (mbuf should be configured
> > before start)
> 
> Please see my thoughts about dynamic fields above.
> 
> >
> >> We should make sure that we all reach an agreement.
> >>
> >
> > +1 I think we can agree that there is a need for letting the PMD
> > know before the start that some action will be used.
> >
> > And I'm sorry if I sound picky and hard, this is not my intention.
> > I'm also doing my best to review this as fast as I can.
> > My open issues and priorities:
> > 1. the API breakage the possible solution adding support for the rest of the
> PMDs / update doc
> > to say that if the function is not supported the application should assume
> that
> > it can still use the mark / flag. -- blocker this must be resolved.
> 
> I see.
> 
> > 2. function name. my main issue is that metadata should be just like mark
> > maybe the solution can be adding metadata flag to this function.
> > the drawback is that the application needs to calls two functions to
> configure
> > metadata. -- high priority but if you can give me good reasoning not just
> > we don't need to register the mark I guess I will be O.K.
> 
> Please see my thoughts above. This API negotiates metadata delivery on
> the path between the NIC and the PMD. The metadata mbuf register API
> does this on the path between the PMD and the application. So no
> contradiction here.
> 

See my comments above; I think we have an agreed solution.

> > 3. If PMD has internal logic in case of conflict or not.
> > Please think about it. -- low prio I will agree to what you decide.
> > but if you decide that PMD will have internal logic then this must be
> documented
> > so the application will know not to rely on the results.
> 
> Please see my reply above. The application can rely on the outcome of
> the negotiation (the last negotiated subset of features), but it should
> know that if it disagrees with the suggested feature subset, it can
> re-negotiate. All fair and square.
> 

Like I said above, think about it some more; I will think about it too.
In any case, this will not be a blocker.

Best,
Ori
> >
> > Best,
> > Ori
> >
> >>>
> >>> Best,
> >>> Ori
> >>>>>
> >>>>>> Andrew.
> >>>>> Best,
> >>>>> Ori
> >>>>>
> >>>>
> >>>> --
> >>>> Ivan M
> >>
> >> --
> >> Ivan M
> 
> --
> Ivan M


* Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
  2021-10-01 12:10                 ` Thomas Monjalon
@ 2021-10-04  9:17                   ` Andrew Rybchenko
  0 siblings, 0 replies; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-04  9:17 UTC (permalink / raw)
  To: Thomas Monjalon, Ivan Malov
  Cc: dev, Andy Moreton, orika, ferruh.yigit, olivier.matz

On 10/1/21 3:10 PM, Thomas Monjalon wrote:
> 01/10/2021 12:15, Andrew Rybchenko:
>> On 10/1/21 12:48 PM, Thomas Monjalon wrote:
>>> 01/10/2021 10:55, Ivan Malov:
>>>> On 01/10/2021 11:11, Thomas Monjalon wrote:
>>>>> 01/10/2021 08:47, Andrew Rybchenko:
>>>>>> On 9/30/21 10:30 PM, Ivan Malov wrote:
>>>>>>> On 30/09/2021 19:18, Thomas Monjalon wrote:
>>>>>>>> 23/09/2021 13:20, Ivan Malov:
>>>>>>>>> Patch [1/5] of this series adds a generic API to let applications
>>>>>>>>> negotiate delivery of Rx meta data during initialisation period.
>>>>>
>>>>> What is a metadata?
>>>>> Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK?
>>>>> Metadata word could cover any field in the mbuf struct so it is vague.
>>>>
>>>> Metadata here is *any* additional information provided by the NIC for 
>>>> each received packet. For example, Rx flag, Rx mark, RSS hash, packet 
>>>> classification info, you name it. I'd like to stress out that the 
>>>> suggested API comes with flags each of which is crystal clear on what 
>>>> concrete kind of metadata it covers, eg. Rx mark.
>>>
>>> I missed the flags.
>>> You mean these 3 flags?
>>
>> Yes
>>
>>> +/** The ethdev sees flagged packets if there are flows with action FLAG. */
>>> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0)
>>> +
>>> +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */
>>> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1)
>>> +
>>> +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */
>>> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2)
>>>
>>> It is not crystal clear because it does not reference the API,
>>> like RTE_FLOW_ACTION_TYPE_MARK.
>>
>> Thanks, it is easy to fix. Please, note that there is no action
>> for tunnel ID case.
> 
> I don't understand the tunnel ID meta.
> Is it an existing offload? API?

rte_flow_tunnel_*() API and "Tunneled traffic offload" in flow
API documentation.

> 
>>> And it covers a limited set of metadata.
>>
>> Yes which are not covered by offloads, packet classification
>> etc. Anything else?
>>
>>> Do you intend to extend to all mbuf metadata?
>>
>> No. It should be discussed case-by-case separately.
> 
> Ah, it makes the intent clearer.
> Why not planning to do something truly generic?

IMHO, it is generic enough for the purpose.

> 
>>>>>>>>> This way, an application knows right from the start which parts
>>>>>>>>> of Rx meta data won't be delivered. Hence, no necessity to try
>>>>>>>>> inserting flows requesting such data and handle the failures.
>>>>>>>>
>>>>>>>> Sorry I don't understand the problem you want to solve.
>>>>>>>> And sorry for not noticing earlier.
>>>>>>>
>>>>>>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
>>>>>>> packets by default (for performance reasons). If the application tries
>>>>>>> to insert a flow with action MARK, the PMD may not be able to enable
>>>>>>> delivery of Rx mark without the need to re-start Rx sub-system. And
>>>>>>> that's fraught with traffic disruption and similar bad consequences. In
>>>>>>> order to address it, we need to let the application express its interest
>>>>>>> in receiving mark with packets as early as possible. This way, the PMD
>>>>>>> can enable Rx mark delivery in advance. And, as an additional benefit,
>>>>>>> the application can learn *from the very beginning* whether it will be
>>>>>>> possible to use the feature or not. If this API tells the application
>>>>>>> that no mark delivery will be enabled, then the application can just
>>>>>>> skip many unnecessary attempts to insert wittingly unsupported flows
>>>>>>> during runtime.
>>>>>
>>>>> I'm puzzled, because we could have the same reasoning for any offload.
>>>>
>>>> We're not discussing *offloads*. An offload is when NIC *computes 
>>>> something* and *delivers* it. We are discussing precisely *delivery*.
>>>
>>> OK but still, there are a lot more mbuf metadata delivered.
>>
>> Yes, and some are not controlled yet early enough, and
>> we do here.
>>
>>>
>>>>> I don't understand why we are focusing on mark only
>>>>
>>>> We are not focusing on mark on purpose. It's just how our discussion 
>>>> goes. I chose mark (could've chosen flag or anything else) just to show 
>>>> you an example.
>>>>
>>>>> I would prefer we find a generic solution using the rte_flow API.
>>>>> Can we make rte_flow_validate() working before port start?
>>>>> If validating a fake rule doesn't make sense,
>>>>> why not having a new function accepting a single action as parameter?
>>>>
>>>> A noble idea, but if we feed the entire flow rule to the driver for 
>>>> validation, then the driver must not look specifically for actions FLAG 
>>>> or MARK in it (to enable or disable metadata delivery). This way, the 
>>>> driver is obliged to also validate match criteria, attributes, etc. And, 
>>>> if something is unsupported (say, some specific item), the driver will 
>>>> have to reject the rule as a whole thus leaving the application to join 
>>>> the dots itself.
>>>>
>>>> Say, you ask the driver to validate the following rule:
>>>> pattern blah-blah-1 / blah-blah-2 / end action flag / end
>>>> intending to check support for FLAG delivery. Suppose, the driver 
>>>> doesn't support pattern item "blah-blah-1". It will throw an error right 
>>>> after seeing this unsupported item and won't even go further to see the 
>>>> action FLAG. How can application know whether its request for FLAG was 
>>>> heard or not?
>>>
>>> No, I'm proposing a new function to validate the action alone,
>>> without any match etc.
>>> Example:
>>> 	rte_flow_action_request(RTE_FLOW_ACTION_TYPE_MARK)

Also, please note that sometimes it makes sense to
use action MARK at the transfer level and match on it in
non-transfer flow rules, without requiring delivery
of the mark to the host.

>>
>> What about tunnel ID?
>>
>> Also negotiation in terms of bitmask natively allows to
>> provide everything required at once and it simplifies
>> implementation in the driver. No dependency on order of
>> checks etc. Also it allows to renegotiate without any
>> extra API functions.
> 
> You mean there is a single function call with all bits set?

Yes, though not with all bits set, only the required ones.

> 
>>>> And I'd not bind delivery of metadata to flow API. Consider the 
>>>> following example. We have a DPDK application sitting at the *host* and 
>>>> we have a *guest* with its *own* DPDK instance. The guest DPDK has asked 
>>>> the NIC (by virtue of flow API) to mark all outgoing packets. This 
>>>> packets reach the *host* DPDK. Say, the host application just wants to 
>>>> see the marked packets from the guest. Its own, (the host's) use of flow 
>>>> API is a don't care here. The host doesn't want to mark packets itself, 
>>>> it wants to see packets marked by the guest.
>>>
>>> It does not make sense to me. We are talking about a DPDK API.
>>> My concern is to avoid redefining new flags
>>> while we already have rte_flow actions.
>>
>> See above.
> 



* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-04  6:56                       ` Ori Kam
@ 2021-10-04 11:39                         ` Ivan Malov
  2021-10-04 13:53                           ` Andrew Rybchenko
  0 siblings, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-10-04 11:39 UTC (permalink / raw)
  To: Ori Kam, Andrew Rybchenko, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi Ori,

On 04/10/2021 09:56, Ori Kam wrote:
> Hi Ivan,
> 
>> -----Original Message-----
>> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
>> Sent: Monday, October 4, 2021 2:50 AM
>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
>> data
>>
>> Hi Ori,
>>
>> On 04/10/2021 00:04, Ori Kam wrote:
>>> Hi Ivan,
>>>
>>> Sorry for the long review.
>>>
>>>> -----Original Message-----
>>>> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
>>>> Sent: Sunday, October 3, 2021 8:30 PM
>>>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
>>>> Rx meta data
>>>>
>>>> Hi Ori,
>>>>
>>>> On 03/10/2021 14:01, Ori Kam wrote:
>>>>> Hi Ivan,
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ivan Malov <Ivan.Malov@oktetlabs.ru>
>>>>>> Sent: Sunday, October 3, 2021 12:30 PM
>>>>>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery
>>>>>> of Rx meta data
>>>>>>
>>>>>> Hi Ori,
>>>>>>
>>>>>> Thanks for reviewing this.
>>>>>>
>>>>>
>>>>> No problem.
>>>>>
>>>>>> On 03/10/2021 10:42, Ori Kam wrote:
>>>>>>> Hi Andrew and Ivan,
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>>>>>> Sent: Friday, October 1, 2021 9:50 AM
>>>>>>>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery
>>>>>>>> of Rx meta data
>>>>>>>>
>>>>>>>> On 9/30/21 10:07 PM, Ivan Malov wrote:
>>>>>>>>> Hi Ori,
>>>>>>>>>
>>>>>>>>> On 30/09/2021 17:59, Ori Kam wrote:
>>>>>>>>>> Hi Ivan,
> 
> [Snip]
> 
>>>>> Good point so why not use the same logic as the metadata and register
>> it?
>>>>> Since in any case, this is something in the mbuf so maybe this
>>>>> should be the
>>>> answer?
>>>>
>>>> I didn't catch your thought. Could you please elaborate on it?
>>>
>>> The metadata action just like the mark or flag is used to give
>>> application data that was set by a flow rule.
>>> To enable the metadata the application must register the metadata field.
>>> Since this happens during the creation of the mbuf it means that it
>>> must be created before the device start.
>>>
>>> I understand that the mark and flag don't need to be registered in the
>>> mbuf since they have saved space but from application point of view
>>> there is no difference between the metadata and mark, so why does
>>> negotiate function doesn't handle the metadata?
>>>
>>> I hope this is clearer.
>>
>> Thank you. That's a lot clearer.
>>
>> I inspected struct rte_flow_action_set_meta as well as
>> rte_flow_dynf_metadata_register(). The latter API doesn't require that
>> applications invoke it precisely before adapter start. It says "must be called
>> prior to use SET_META action", and the comment before the structure says
>> just "in advance". So, at a bare minimum, the API contract could've been
>> made more strict with this respect. However, far more important points are
>> as follows:
>>
> 
> Agree, that doc should be updated but by definition this must be set before mbuf
> creation this means before device start.
> 
>> 1) This API enables delivery of this "custom" metadata between the PMD
>> and the application, whilst the API under review, as I noted before,
>> negotiates delivery of various kinds of metadata between the NIC and the
>> PMD. These are two completely different (albeit adjacent) stages of packet
>> delivery process.
>>
> They are exactly alike also in the metadata case the registertion does two things:
> Saves a place for the info in the mbuf and tells the PMD that it should configure the NIC
> to supply this information upon request.

Looking at rte_flow_dynf_metadata_register() implementation, it doesn't 
seem to notify the PMD of the new field directly. Yes, the PMD will 
finally know, but at that point it won't be able to reject the field. 
It's one-sided communication in fact.

> Even in your PMD assuming that it can support the metadata, you will need to configure
> it otherwise when the application will request this data using a rule you will be at the
> same spot you are now with the mark.

Right, but as I said, the primary concern is to configure delivery of 
metadata from the NIC HW to the PMD. It's not about mbuf dynfields.

> 
>> 2) This API doesn't negotiate anything with the PMD. It doesn't interact with
>> the PMD at all. It just reserves extra room in mbufs for the metadata field
>> and exits.
>>
>> 3) As a consequence of (2), the PMD can't immediately learn about this field
>> being enabled. It's forced to face this fact at some deferred point. If the
>> PMD, for instance, learns about that during adapter start and if it for some
>> reason decides to deny the use of this field, it won't be able to convey its
>> decision to the application. As a result, the application will live in the wrong
>> assumption that it has successfully enabled the feature.
>>
>> 4) Even if we add similar APIs to "register" more kinds of metadata (flag,
>> mark, tunnel ID, etc) and re-define the meaning of all these APIs to say that
>> not only they enable delivery of the metadata between the PMD and the
>> application but also enable the HW transport to get the metadata delivered
>> from the NIC to the PMD itself, we won't be able to use this set of APIs to
>> actually *negotiate* something. The order of invocations will be confusing to
>> the application. If the PMD can't combine some of these features, it won't be
>> able to communicate this clearly to the application. It will have to silently
>> disregard some of the "registered" features. And this is something that we
>> probably want to avoid. Right?
>>
>> But I tend to agree that the API under review could have one more (4th) flag
>> to negotiate delivery of this "custom" metadata from the NIC to the PMD. At
>> the same time, enabling delivery of this data from the PMD to the application
>> will remain in the responsibility domain of
>> rte_flow_dynf_metadata_register().
>>
> 
> I agree and think this is the best solution.

Thank you.

> 
>>>
>>>>
>>>>>
>>>>>> Say, you insert a flow rule to mark some packets. The NIC,
>>>>>> internally (in the
>>>>>> e-switch) adds the mark to matching packets. Yes, in the boundaries
>>>>>> of the NIC HW, the packets bear the mark on them. It has been set,
>>>>>> yes. But when time comes to *deliver* the packets to the host, the
>>>>>> NIC (at least, in net/sfc
>>>>>> case) has two options: either provide only a small chunk of the
>>>>>> metadata for each packet *to the host*, which doesn't include mark
>>>>>> ID, flag and RSS hash, OR, alternatively, provide the full set of
>>>>>> metadata. In the former option, the mark is simply not delivered.
>>>>>> Once again: it *has been set*, but simply will not be *delivered to
>>>>>> the
>>>> host*.
>>>>>>
>>>>>> So, this API is about negotiating *delivery* of metadata. In pure
>>>>>> technical sense. And the set of flags that this API returns
>>>>>> indicates which kinds of metadata the NIC will be able to deliver
>> simultaneously.
>>>>>>
>>>>>> For example, as I understand, in the case of tunnel offload, MLX5
>>>>>> claims Rx mark entirely for tunnel ID metadata, so, if an
>>>>>> application requests "MARK | TUNNEL_ID" with this API, this PMD
>>>>>> should probably want to respond with just "TUNNEL_ID". The
>>>>>> application will see the response and realise that, even if it adds
>>>>>> its *own* (user) action MARK to a flow and if the flow is not
>>>>>> rejected by the PMD, it won't be able to see the mark in the
>>>>>> received mbufs (or the mark will be
>>>> incorrect).
>>>>>>
>>>>> So what should the application do if on some flows it wants MARK and
>>>>> on
>>>> other FLAG?
>>>>
>>>> You mentioned flows, so I'd like to stress this out one more time:
>>>> what this API cares about is solely the possibility to deliver
>>>> metadata between the NIC and the host. The host == the PMD (*not*
>> application).
>>>>
>>>
>>> I understand that you are only talking about enabling the action,
>>> meaning to let the PMD know that at some point there will be a rule
>>> that will use the mark action for example.
>>> Is my understanding correct?
>>
>> Not really. The causal relationships are as follows. The application comes to
>> realise that it will need to use, say, action MARK in flows.
>> This, in turn, means that, in order to be able to actually see the mark in
>> received packets, the application needs to ensure that a) the NIC will be able
>> to deliver the mark to the PMD and b) that the PMD will be able to deliver
>> the mark to the application. In particular, in the case of Rx mark, (b) doesn't
>> need to be negotiated = field "mark" is anyway provisioned in the mbuf
>> structure, so no need to enable it. But (a) needs to be negotiated. Hence this
>> API.
>>
> Please see my comment above; I think we both agree.

Agree to have the 4th flag in the new API to cover this "custom / raw 
metadata" delivery? Personally, I tend to agree, but maybe Andrew can 
express his opinion, too.

> 
>>> I don't understand your last comment about host == PMD since at the
>>> end this value should be given to the application.
>>
>> Two different steps, Ori, two different steps. The first one is to deliver the
>> mark from the NIC to the PMD. And the second one is to deliver the mark
>> from the PMD to the application. As you might understand, mbufs get filled
>> out on the second step. That's it.
>>
>>>
>>>>>    From DPDK viewpoint both of them can't be shared on the same rule
>>>>> (they are using the same space in mbuf) so the application will never
>>>>> ask for both of them in the same rule but he can on some rules ask for
>>>>> mark while on other request for FLAG, even in your code you added
>> both
>>>> of them.
>>>>>
>>>>> So what should the PMD return if it can support both of them just not
>>>>> at the same rule?
>>>>
>>>> Please see above. This is not about rules. This is not about the way how
>> flag
>>>> and mark are presented *by* the PMD *to* the application in mbufs.
>>>> Simultaneous use of actions FLAG and MARK in flows must be ruled out
>> by
>>>> rte_flow_validate() / rte_flow_create(). The way how flag and mark are
>>>> *represented* in mbufs belongs in mbuf library responsibility domain.
>>>>
>>>> Consider the following operational sequence:
>>>>
>>>> 1) The NIC has a packet, which has metadata associated with it;
>>>> 2) The NIC transfers this packet to the host;
>>>> 3) The PMD sees the packet and its metadata;
>>>> 4) The PMD represents whatever available metadata in mbuf format.
>>>>
>>>> Features negotiated by virtue of this API (for instance, FLAG and MARK)
>>>> enable delivery of these kinds of metadata between points (2) and (3).
>>>>
>>>> And the problem of flag / mark co-existence in mbufs sits at point (4).
>>>>
>>>> -> Completely different problems, in fact.
>>>>
>>>
>>> Agree.
>>>
>>>>>
>>>>> One option is to document that the supported values are not per rule
>>>>> but for the entire port. For example in the example you gave MLX5 will
>>>>> support mark + flag but will not support mark + tunnel.
>>>>
>>>> Yes, for the port. Flow rules are a don't care to this API.
>>>>
>>>>>
>>>>> Also considering your example, the negotiation may result in subpar
>> result.
>>>>> taking your example the PMD returned  TUNNEL_ID maybe application
>>>>> would prefer to have the mark and not the TUNNEL_ID. I understand
>> that
>>>>> application can check and try again with just the MARK.
>>>>
>>>> Exactly. The Application can repeat negotiation with just MARK. Is there
>> any
>>>> problem with that?
>>>>
>>>
>>> I understand that the application can negotiate again and again.
>>> I just don't like that the PMD has logic and selects what he thinks will be
>> best.
>>>
>>> I wanted to suggest that the PMD will just tell what are the conflicts and
>> the application
>>> will negotiate again based on its logic.
>>
>> Well, I'm not saying that letting the PMD decide on the optimal feature
>> subset is the only reasonable MO. But it simplifies the negotiation
>> procedure a lot. Conveying conflicts and feature inter-dependencies to
>> the application might be rather complex and prone to errors.
>>
>> At this point I believe it's important to clarify: the original intent
>> is to assume that the PMD will first consider enabling all requested
>> features. Only in the case when it fails to do so should it come up with
>> the optimal subset.
>>
> 
> I understand. My issue is with the latter case: how can the PMD know what
> the optimal subset is?
> 
>>>
>>>>> You are inserting logic to the PMD, maybe the function should just
>>>>> fail maybe returning the conflicting items?
>>>>
>>>> Why return conflicting items? The optimal subset (from the PMD's
>>>> perspective) should be returned. It's a silver lining. In the end, the
>> application
>>>> can learn which features can be enabled and in what combinations. And it
>>>> can rely on the outcome of the negotiation process.
>>>>
>>> That is my point this is PMD perspective, not the application.
>>> how can a PMD define an optimal subset? How can it know what is more
>>> important to the application?
>>
>> How does "ls" command know the optimal sort mode? Why does it prefer
>> sorting by name over sorting by date? Thanks to its options, it allows
>> the user to express their own preference. So does the API in question.
>> If the application knows that tunnel offload is more important to it
>> (compared to MARK, for instance), it can request just TUNNEL_ID. Why
>> complicate this?
>>
> I don't agree with your example: "ls" is clearly defined, and each
> time you run it you get the same order. It doesn't change between versions,
> while in this case there will be changes between versions.

Maybe not that good an example, indeed. But the fact that it's clearly 
defined is true in this particular case. There are tons of programs 
which don't document their defaults clearly and never cease to surprise 
their users when new versions get released... It's so customary.

> Think about it this way: let's assume that the PMD doesn't support TUNNEL_ID,
> and the application requests both TUNNEL_ID and MARK at startup.
> The PMD returns only MARK; the application checks and sees that the PMD
> didn't return TUNNEL_ID, so it negotiates again only to learn that nothing
> is supported; then the application tries only MARK, and to this the PMD agrees.

So what's the problem? The key phrase here is that "application checks". 
Yes, it does check the output. And has the right to disagree, to 
re-negotiate.

> 
> Again this is not critical to me. But keep it in mind.

We never lost sight of this.

Frankly, we had internal discussions and of course we did realise that 
letting the PMD choose the optimal subset would raise concerns. But we
also should keep in mind the fact that communicating conflicts might be 
difficult. I'll refrain from ranting about possible algorithms, though.

It's a trade-off between preventing PMDs from pushing their vision of the
optimal feature set and keeping the API simple, concise and thus user-friendly.

> 
>>> Also, the PMD logic is internal so if for some reason
>>> the PMD selected the best for the application by chance, so the application
>>> learns that this is a good value for him. A release later the internal PMD
>>> logic changes for example, a new feature was added, other customer requests.
>>> since this is PMD the original app is not aware of this change and may fail.
>>
>> The same argumentation can equally apply to default RSS table, for
>> example. What if an application gets accustomed to the round-robin table
>> being default in some specific PMD (and the PMD maintainers change
>> default RSS table out of a sudden)? Oops!
>>
> Yes, but this is why the user has the option to select the mode;
> in the case of RSS, if the requested mode isn't supported, the PMD fails and
> doesn't just select a different algorithm, right?

I don't refer to the MQ mode or hash algorithm. I refer to default 
fill-out of RETA. The application author may test its product once with 
some PMD and watch the RETA work in round-robin manner by default. They 
may then mistakenly assume that it's guaranteed behaviour while it's not.
Hence the existence of an API to let the application explicitly set RETA 
entries. And the applications are encouraged to use this API.
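
A minimal sketch of what filling the table explicitly could look like, instead of relying on whatever default a PMD happens to use. The structure and the names `reta_group`, `RETA_GROUP_SIZE` and `reta_fill_round_robin()` are hypothetical stand-ins chosen for illustration; a real application would fill the ethdev RETA configuration entries and pass them to the RETA update call (rte_eth_dev_rss_reta_update()).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical group size; stands in for the ethdev RETA group size. */
#define RETA_GROUP_SIZE 64

/* Hypothetical mirror of a per-group RETA update entry. */
struct reta_group {
	uint64_t mask;                  /* bit i set => entry i is updated */
	uint16_t reta[RETA_GROUP_SIZE]; /* Rx queue index for each entry */
};

/*
 * Fill reta_size entries round-robin over nb_queues Rx queues.
 * The caller provides zero-initialised groups and nb_queues > 0.
 */
static void
reta_fill_round_robin(struct reta_group *groups, uint16_t reta_size,
		      uint16_t nb_queues)
{
	uint16_t i;

	for (i = 0; i < reta_size; i++) {
		struct reta_group *g = &groups[i / RETA_GROUP_SIZE];
		uint16_t pos = i % RETA_GROUP_SIZE;

		g->mask |= UINT64_C(1) << pos;
		g->reta[pos] = i % nb_queues;
	}
}
```

With the table set explicitly, the application no longer depends on an undocumented default that may change between releases.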

The same might apply to the API in question. Yes, it allows the PMD to 
suggest the optimal feature subset *if* it can't enable the full / 
originally requested set of features simultaneously. But nobody prevents 
the application from re-negotiating this. The application can narrow 
down the requested set of features or check them one by one.

And *this* effectively enables the application to have its own logic and 
fully control it. It can do multiple invocations of the API and join the 
dots itself. Conflicts between some features can be very clear to the 
application this way.

> 
>> The truth is that the application shouldn't bind itself to some specific
>> vendor / PMD. In any case. Hence the negotiation process. It's just a
>> building block for some automation in the application.
>>
>>>
>>> We both agree that the application should check the result and renegotiate
>>> if needed. I only suggested that the PMD will only return error and not
>>> assume he knows best.
>>
>> I believe we should give this more thought. Maybe Andrew can join this
>> conversation.
>>
> I fully agree lets sleep on it,
> This will not be a blocker.
> 
>>>
>>>
>>>>>
>>>>>
>>>>>
>>>>>> But some other PMDs (net/sfc, for instance) claim only a small fraction
>>>>>> of bits in Rx mark to deliver tunnel ID information. Remaining bits are
>>>>>> still available for delivery of *user* mark ID. Please see an example at
>>>>>> https://patches.dpdk.org/project/dpdk/patch/20210929205730.775-2-ivan.malov@oktetlabs.ru/
>>>>>> . In this case, the PMD may want to return both flags in the response:
>>>>>> "MARK | TUNNEL_ID". This way, the application knows that both features
>>>>>> are enabled and available for use.
>>>>>>
>>>>>> Now. I anticipate more questions asking why wouldn't we prefer flow API
>>>>>> terminology or why wouldn't we add an API for negotiating support for
>>>>>> metadata *actions* and not just metadata *delivery*. There's an answer.
>>>>>> Always has been.
>>>>>>
>>>>>> The thing is, the use of *actions* is very complicated. For example, the
>>>>>> PMD may support action MARK for "transfer" flows but not for
>>>>>> non-"transfer" ones. Also, simultaneous use of multiple different metadata
>>>>>> actions may not be possible. And, last but not least, if we force the
>>>>>> application to check support for *actions* on action-after-action basis,
>>>>>> the order of checks will be very confusing to applications.
>>>>>>
>>>>>> Previously, in this thread, Thomas suggested to go for exactly this type
>>>>>> of API, to check support for actions one-by-one, without any context
>>>>>> ("transfer" / non-"transfer"). I'm afraid, this won't be OK.
>>>>>>
>>>>> +1 to keeping it as a separated API. (I agree actions limitation are very
>>>>> complex matrix)
>>>>>
>>>>>>>
>>>>>>> In any case I think this is good idea and I will see how we can add a
>>>>>>> more generic approach of this API to the new API that I'm going to
>>>>>>> present.
>>>>>>>
>>>>>>>
>>>>>>>>> So no breakages with this API.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please see more comments inline.
> 
> 
> [Snip]
> 
>>> Yes, like I said above I don't see a difference between metadata
>>> and mark, at least not from the application usage.
>>> I assume you need this info at device start and by definition
>>> the registration should happen before. (mbuf should be configured
>>> before start)
>>
>> Please see my thoughts about dynamic fields above.
>>
>>>
>>>> We should make sure that we all reach an agreement.
>>>>
>>>
>>> +1 I think we can agree that there is a need for letting the PMD
>>> know before the start that some action will be used.
>>>
>>> And I'm sorry if I sound picky and hard, this is not my intention.
>>> I'm also doing my best to review this as fast as I can.
>>> My open issues and priorities:
>>> 1. the API breakage the possible solution adding support for the rest of the
>>> PMDs / update doc to say that if the function is not supported the
>>> application should assume that it can still use the mark / flag.
>>> -- blocker this must be resolved.
>>
>> I see.
>>
>>> 2. function name. my main issue is that metadata should be just like mark
>>> maybe the solution can be adding metadata flag to this function.
>>> the drawback is that the application needs to calls two functions to
>>> configure metadata. -- high priority but if you can give me good reasoning
>>> not just we don't need to register the mark I guess I will be O.K.
>>
>> Please see my thoughts above. This API negotiates metadata delivery on
>> the path between the NIC and the PMD. The metadata mbuf register API
>> does this on the path between the PMD and the application. So no
>> contradiction here.
>>
> 
> See my comments above I think we have an agreed solution.
> 
>>> 3. If PMD has internal logic in case of conflict or not.
>>> Please think about it. -- low prio I will agree to what you decide.
>>> but if you decide that PMD will have internal logic then this must be
>>> documented so the application will know not to rely on the results.
>>
>> Please see my reply above. The application can rely on the outcome of
>> the negotiation (the last negotiated subset of features), but it should
>> know that if it disagrees with the suggested feature subset, it can
>> re-negotiate. All fair and square.
>>
> 
> Like I said above think about it some more, I will also think in any
> case this will not be a blocker.
> 
> Best,
> Ori
>>>
>>> Best,
>>> Ori
>>>
>>>>>
>>>>> Best,
>>>>> Ori
>>>>>>>
>>>>>>>> Andrew.
>>>>>>> Best,
>>>>>>> Ori
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ivan M
>>>>
>>>> --
>>>> Ivan M
>>
>> --
>> Ivan M

-- 
Ivan M


* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-04 11:39                         ` Ivan Malov
@ 2021-10-04 13:53                           ` Andrew Rybchenko
  2021-10-05  6:30                             ` Ori Kam
  0 siblings, 1 reply; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-04 13:53 UTC (permalink / raw)
  To: Ivan Malov, Ori Kam, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

On 10/4/21 2:39 PM, Ivan Malov wrote:
> On 04/10/2021 09:56, Ori Kam wrote:
>>> On 04/10/2021 00:04, Ori Kam wrote:
>>>> I understand that you are only talking about enabling the action,
>>>> meaning to let the PMD know that at some point there will be a rule
>>>> that will use the mark action for example.
>>>> Is my understanding correct?
>>>
>>> Not really. The causal relationships are as follows. The application
>>> comes to realise that it will need to use, say, action MARK in flows.
>>> This, in turn, means that, in order to be able to actually see the mark
>>> in received packets, the application needs to ensure that a) the NIC will
>>> be able to deliver the mark to the PMD and b) that the PMD will be able to
>>> deliver the mark to the application. In particular, in the case of Rx
>>> mark, (b) doesn't need to be negotiated = field "mark" is anyway
>>> provisioned in the mbuf structure, so no need to enable it. But (a) needs
>>> to be negotiated. Hence this API.
>>>
>> Please see my above comment I think we both agree.
> 
> Agree to have the 4th flag in the new API to cover this "custom / raw
> metadata" delivery? Personally, I tend to agree, but maybe Andrew can
> express his opinion, too.

Of course, it could be added, but we're not going to support it
in net/sfc. So, I think the flag should be added when a PMD
is going to support it (e.g. net/mlx5).


* [dpdk-dev] [PATCH v4 0/5] Negotiate the NIC's ability to deliver Rx metadata to the PMD
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
                     ` (5 preceding siblings ...)
  2021-09-30 16:18   ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Thomas Monjalon
@ 2021-10-04 23:50   ` Ivan Malov
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
                       ` (4 more replies)
  2021-10-05 15:56   ` [dpdk-dev] [PATCH v5 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
                     ` (2 subsequent siblings)
  9 siblings, 5 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-04 23:50 UTC (permalink / raw)
  To: dev; +Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam, Ajit Khaparde

In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
only the former has been added. The issue has not been solved.
Applications still assume that metadata features always work
and do not need to be configured in advance.

The team behind net/sfc driver has given this problem more thought.
Conclusions that have been reached are as follows.

1. Not all kinds of metadata can be represented by device offload flags.
   For instance, having flag RSS_HASH is legitimate because the NIC is
   supposed to actually compute something when this feature is active.
   However, if a similar flag existed for Rx mark, requesting it would
   not make the NIC actually compute anything. The HW needs external
   stimuli (flow rules) in order to set the mark in the first place.

2. As a consequence of (1), it is apparent that the user's ability to
   use Rx metadata features is complex and consists of multiple parts:
   a) the NIC's ability to conduct the flow actions (set metadata);
   b) the NIC's ability to deliver metadata (if set) to the PMD;
   c) the PMD's ability to provide metadata received from the
      NIC to the user by virtue of filling out mbuf fields.

3. Aspects (2-a) and (2-c) are already addressed by flow validate API
   and the procedure of dynamic mbuf field registration respectively,
   hence, the only problem which really needs a solution is (2-b).
  
Patch [1/5] of this series adds a generic API to let the application
negotiate the NIC's ability to deliver specific kinds of metadata to
the PMD. This API is supposed to be invoked during initialisation
period in order to let the PMD configure HW resources which might
be hard to (re-)configure in the adapter's started state without
causing traffic disruption and other unwanted consequences.

[1] c5b2e78d1172 ("doc: announce ethdev API changes in offload flags")

Changes in v2:
* [1/5] has review notes from Jerin Jacob applied and the ack from Ray Kinsella added
* [2/5] has minor adjustments incorporated to follow changes in [1/5]

Changes in v3:
* [1/5] through [5/5] have review notes from Andy Moreton applied (mostly rewording)
* [1/5] has the ack from Jerin Jacob added

Changes in v4:
* [1/5] has the API contract clarified to address concerns raised by Ori Kam
* [1/5] has the API name fixed to use term "metadata" instead of "meta"
* [1/5] has testpmd loglevel changed as per the note by Ajit Khaparde
* [1/5] has testpmd code revisited to take multi-process into account
* [2/5] through [5/5] have the corresponding adjustments incorporated

Ivan Malov (5):
  ethdev: negotiate delivery of packet metadata from HW to PMD
  net/sfc: support API to negotiate delivery of Rx metadata
  net/sfc: support flow mark delivery on EF100 native datapath
  common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  net/sfc: report user flag on EF100 native datapath

 app/test-flow-perf/main.c              | 21 ++++++++++
 app/test-pmd/testpmd.c                 | 37 ++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  9 +++++
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 drivers/net/sfc/sfc.h                  |  2 +
 drivers/net/sfc/sfc_ef100_rx.c         | 19 +++++++++
 drivers/net/sfc/sfc_ethdev.c           | 29 ++++++++++++++
 drivers/net/sfc/sfc_flow.c             | 13 +++++++
 drivers/net/sfc/sfc_mae.c              | 22 ++++++++++-
 drivers/net/sfc/sfc_rx.c               |  6 +++
 lib/ethdev/ethdev_driver.h             | 22 +++++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++++
 lib/ethdev/rte_ethdev.h                | 53 +++++++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 12 ++++++
 lib/ethdev/version.map                 |  3 ++
 17 files changed, 311 insertions(+), 23 deletions(-)

-- 
2.20.1



* [dpdk-dev] [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD
  2021-10-04 23:50   ` [dpdk-dev] [PATCH v4 0/5] Negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
@ 2021-10-04 23:50     ` Ivan Malov
  2021-10-05 12:03       ` Ori Kam
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-10-04 23:50 UTC (permalink / raw)
  To: dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton, Wisam Jaddo,
	Xiaoyun Li, Ferruh Yigit

Provide an API to let the application control the NIC's ability
to deliver specific kinds of per-packet metadata to the PMD.

Checks for the NIC's ability to set these kinds of metadata
in the first place (support for the flow actions) belong in
flow API responsibility domain (flow validate mechanism).
This topic is out of scope of the new API in question.

The PMD's ability to deliver received metadata to the user
by virtue of mbuf fields should be covered by mbuf library.
It is also out of scope of the new API in question.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
---
 app/test-flow-perf/main.c              | 21 ++++++++++
 app/test-pmd/testpmd.c                 | 37 ++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  9 +++++
 lib/ethdev/ethdev_driver.h             | 22 +++++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++++
 lib/ethdev/rte_ethdev.h                | 53 ++++++++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 12 ++++++
 lib/ethdev/version.map                 |  3 ++
 8 files changed, 182 insertions(+)

diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
index 9be8edc31d..4d01791f6f 100644
--- a/app/test-flow-perf/main.c
+++ b/app/test-flow-perf/main.c
@@ -1760,6 +1760,27 @@ init_port(void)
 		rte_exit(EXIT_FAILURE, "Error: can't init mbuf pool\n");
 
 	for (port_id = 0; port_id < nr_ports; port_id++) {
+		uint64_t rx_metadata = 0;
+
+		rx_metadata |= RTE_ETH_RX_METADATA_USER_FLAG;
+		rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+
+		ret = rte_eth_rx_metadata_negotiate(port_id, &rx_metadata);
+		if (ret == 0) {
+			if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG)) {
+				printf(":: flow action FLAG will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+
+			if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+				printf(":: flow action MARK will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+		} else if (ret != -ENOTSUP) {
+			rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port=%u: %s\n",
+				 port_id, rte_strerror(-ret));
+		}
+
 		ret = rte_eth_dev_info_get(port_id, &dev_info);
 		if (ret != 0)
 			rte_exit(EXIT_FAILURE,
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 97ae52e17e..bf80de4e80 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -533,6 +533,41 @@ int proc_id;
  */
 unsigned int num_procs = 1;
 
+static void
+eth_rx_metadata_negotiate_mp(uint16_t port_id)
+{
+	uint64_t rx_meta_features = 0;
+	int ret;
+
+	if (!is_proc_primary())
+		return;
+
+	rx_meta_features |= RTE_ETH_RX_METADATA_USER_FLAG;
+	rx_meta_features |= RTE_ETH_RX_METADATA_USER_MARK;
+	rx_meta_features |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+	ret = rte_eth_rx_metadata_negotiate(port_id, &rx_meta_features);
+	if (ret == 0) {
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_USER_FLAG)) {
+			TESTPMD_LOG(DEBUG, "Flow action FLAG will not affect Rx mbufs on port %u\n",
+				    port_id);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_USER_MARK)) {
+			TESTPMD_LOG(DEBUG, "Flow action MARK will not affect Rx mbufs on port %u\n",
+				    port_id);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+			TESTPMD_LOG(DEBUG, "Flow tunnel offload support might be limited or unavailable on port %u\n",
+				    port_id);
+		}
+	} else if (ret != -ENOTSUP) {
+		rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port %u: %s\n",
+			 port_id, rte_strerror(-ret));
+	}
+}
+
 static int
 eth_dev_configure_mp(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		      const struct rte_eth_conf *dev_conf)
@@ -1489,6 +1524,8 @@ init_config_port_offloads(portid_t pid, uint32_t socket_id)
 	int ret;
 	int i;
 
+	eth_rx_metadata_negotiate_mp(pid);
+
 	port->dev_conf.txmode = tx_mode;
 	port->dev_conf.rxmode = rx_mode;
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index f099b1cca2..48fd045db7 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -129,6 +129,15 @@ New Features
   * Added tests to validate packets hard expiry.
   * Added tests to verify tunnel header verification in IPsec inbound.
 
+* **Added an API to control delivery of Rx metadata from the HW to the PMD**
+
+  A new API, ``rte_eth_rx_metadata_negotiate()``, was added.
+  The following parts of Rx metadata were defined:
+
+  * ``RTE_ETH_RX_METADATA_USER_FLAG``
+  * ``RTE_ETH_RX_METADATA_USER_MARK``
+  * ``RTE_ETH_RX_METADATA_TUNNEL_ID``
+
 
 Removed Items
 -------------
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index cc2c75261c..d073d63ba8 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -785,6 +785,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
 typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
 	struct rte_eth_representor_info *info);
 
+/**
+ * @internal
+ * Negotiate the NIC's ability to deliver specific kinds of metadata to the PMD.
+ *
+ * @param dev
+ *   Port (ethdev) handle
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   Negative errno value on error, zero otherwise
+ */
+typedef int (*eth_rx_metadata_negotiate_t)(struct rte_eth_dev *dev,
+				       uint64_t *features);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -945,6 +961,12 @@ struct eth_dev_ops {
 
 	eth_representor_info_get_t representor_info_get;
 	/**< Get representor info. */
+
+	/**
+	 * Negotiate the NIC's ability to deliver specific
+	 * kinds of metadata to the PMD.
+	 */
+	eth_rx_metadata_negotiate_t rx_metadata_negotiate;
 };
 
 /**
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index daf5ca9242..a41fb8a398 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t port_id,
 	return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
 }
 
+int
+rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (dev->data->dev_configured != 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"The port (id=%"PRIu16") is already configured\n",
+			port_id);
+		return -EBUSY;
+	}
+
+	if (features == NULL) {
+		RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_metadata_negotiate, -ENOTSUP);
+	return eth_err(port_id,
+		       (*dev->dev_ops->rx_metadata_negotiate)(dev, features));
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index afdc53b674..6b2da6de0a 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -4902,6 +4902,59 @@ __rte_experimental
 int rte_eth_representor_info_get(uint16_t port_id,
 				 struct rte_eth_representor_info *info);
 
+/** The NIC is able to deliver flag (if set) with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
+
+/** The NIC is able to deliver mark ID with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_USER_MARK (UINT64_C(1) << 1)
+
+/** The NIC is able to deliver tunnel ID with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_TUNNEL_ID (UINT64_C(1) << 2)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Negotiate the NIC's ability to deliver specific kinds of metadata to the PMD.
+ *
+ * Invoke this API before the first rte_eth_dev_configure() invocation
+ * to let the PMD make preparations that are inconvenient to do later.
+ *
+ * The negotiation process is as follows:
+ *
+ * - the application requests features intending to use at least some of them;
+ * - the PMD responds with the guaranteed subset of the requested feature set;
+ * - the application can retry negotiation with another set of features;
+ * - the application can pass zero to clear the negotiation result;
+ * - the last negotiated result takes effect upon the ethdev start.
+ *
+ * @note
+ *   The PMD is supposed to first consider enabling the requested feature set
+ *   in its entirety. Only if it fails to do so, does it have the right to
+ *   respond with a smaller set of the originally requested features.
+ *
+ * @note
+ *   Return code (-ENOTSUP) does not necessarily mean that the requested
+ *   features are unsupported. In this case, the application should just
+ *   assume that these features can be used without prior negotiations.
+ *
+ * @param port_id
+ *   Port (ethdev) identifier
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   - (-EBUSY) if the port can't handle this in its current state;
+ *   - (-ENOTSUP) if the method itself is not supported by the PMD;
+ *   - (-ENODEV) if *port_id* is invalid;
+ *   - (-EINVAL) if *features* is NULL;
+ *   - (-EIO) if the device is removed;
+ *   - (0) on success
+ */
+__rte_experimental
+int rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features);
+
 #include <rte_ethdev_core.h>
 
 /**
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 7b1ed7f110..75656ff9f8 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -1904,6 +1904,10 @@ enum rte_flow_action_type {
 	 * PKT_RX_FDIR_ID mbuf flags.
 	 *
 	 * See struct rte_flow_action_mark.
+	 *
+	 * One should negotiate mark delivery from the NIC to the PMD.
+	 * @see rte_eth_rx_metadata_negotiate()
+	 * @see RTE_ETH_RX_METADATA_USER_MARK
 	 */
 	RTE_FLOW_ACTION_TYPE_MARK,
 
@@ -1912,6 +1916,10 @@ enum rte_flow_action_type {
 	 * sets the PKT_RX_FDIR mbuf flag.
 	 *
 	 * No associated configuration structure.
+	 *
+	 * One should negotiate flag delivery from the NIC to the PMD.
+	 * @see rte_eth_rx_metadata_negotiate()
+	 * @see RTE_ETH_RX_METADATA_USER_FLAG
 	 */
 	RTE_FLOW_ACTION_TYPE_FLAG,
 
@@ -4223,6 +4231,10 @@ rte_flow_tunnel_match(uint16_t port_id,
 /**
  * Populate the current packet processing state, if exists, for the given mbuf.
  *
+ * One should negotiate tunnel metadata delivery from the NIC to the PMD.
+ * @see rte_eth_rx_metadata_negotiate()
+ * @see RTE_ETH_RX_METADATA_TUNNEL_ID
+ *
  * @param port_id
  *   Port identifier of Ethernet device.
  * @param[in] m
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 904bce6ea1..2e638c680e 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -247,6 +247,9 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_eth_rx_metadata_negotiate;
 };
 
 INTERNAL {
-- 
2.20.1



* [dpdk-dev] [PATCH v4 2/5] net/sfc: support API to negotiate delivery of Rx metadata
  2021-10-04 23:50   ` [dpdk-dev] [PATCH v4 0/5] Negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
@ 2021-10-04 23:50     ` Ivan Malov
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-04 23:50 UTC (permalink / raw)
  To: dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton

Initial support for the method. Later patches will extend it to
make FLAG and MARK delivery available on EF100 native datapath.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc.h        |  2 ++
 drivers/net/sfc/sfc_ethdev.c | 29 +++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_flow.c   | 13 +++++++++++++
 drivers/net/sfc/sfc_mae.c    | 22 ++++++++++++++++++++--
 4 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 331e06bac6..079216c1fb 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -312,6 +312,8 @@ struct sfc_adapter {
 	boolean_t			tso;
 	boolean_t			tso_encap;
 
+	uint64_t			negotiated_rx_metadata;
+
 	uint32_t			rxd_wait_timeout_ns;
 };
 
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 2db0d000c3..00b2c84b46 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1859,6 +1859,28 @@ sfc_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t ethdev_qid)
 	return sap->dp_rx->intr_disable(rxq_info->dp);
 }
 
+static int
+sfc_rx_metadata_negotiate(struct rte_eth_dev *dev, uint64_t *features)
+{
+	struct sfc_adapter *sa = sfc_adapter_by_eth_dev(dev);
+	uint64_t supported = 0;
+
+	sfc_adapter_lock(sa);
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_FLAG) != 0)
+		supported |= RTE_ETH_RX_METADATA_USER_FLAG;
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_MARK) != 0)
+		supported |= RTE_ETH_RX_METADATA_USER_MARK;
+
+	sa->negotiated_rx_metadata = supported & *features;
+	*features = sa->negotiated_rx_metadata;
+
+	sfc_adapter_unlock(sa);
+
+	return 0;
+}
+
 static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.dev_configure			= sfc_dev_configure,
 	.dev_start			= sfc_dev_start,
@@ -1906,6 +1928,7 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.xstats_get_by_id		= sfc_xstats_get_by_id,
 	.xstats_get_names_by_id		= sfc_xstats_get_names_by_id,
 	.pool_ops_supported		= sfc_pool_ops_supported,
+	.rx_metadata_negotiate		= sfc_rx_metadata_negotiate,
 };
 
 /**
@@ -1998,6 +2021,12 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 		goto fail_dp_rx_name;
 	}
 
+	if (strcmp(dp_rx->dp.name, SFC_KVARG_DATAPATH_EF10_ESSB) == 0) {
+		/* FLAG and MARK are always available from Rx prefix. */
+		sa->negotiated_rx_metadata |= RTE_ETH_RX_METADATA_USER_FLAG;
+		sa->negotiated_rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+	}
+
 	sfc_notice(sa, "use %s Rx datapath", sas->dp_rx_name);
 
 	rc = sfc_kvargs_process(sa, SFC_KVARG_TX_DATAPATH,
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 4f5993a68d..1f54bea3d9 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -1760,6 +1760,7 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 	struct sfc_flow_spec *spec = &flow->spec;
 	struct sfc_flow_spec_filter *spec_filter = &spec->filter;
 	const unsigned int dp_rx_features = sa->priv.dp_rx->features;
+	const uint64_t rx_metadata = sa->negotiated_rx_metadata;
 	uint32_t actions_set = 0;
 	const uint32_t fate_actions_mask = (1UL << RTE_FLOW_ACTION_TYPE_QUEUE) |
 					   (1UL << RTE_FLOW_ACTION_TYPE_RSS) |
@@ -1832,6 +1833,12 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
 					"FLAG action is not supported on the current Rx datapath");
 				return -rte_errno;
+			} else if ((rx_metadata &
+				    RTE_ETH_RX_METADATA_USER_FLAG) == 0) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					"flag delivery has not been negotiated");
+				return -rte_errno;
 			}
 
 			spec_filter->template.efs_flags |=
@@ -1849,6 +1856,12 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
 					"MARK action is not supported on the current Rx datapath");
 				return -rte_errno;
+			} else if ((rx_metadata &
+				    RTE_ETH_RX_METADATA_USER_MARK) == 0) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					"mark delivery has not been negotiated");
+				return -rte_errno;
 			}
 
 			rc = sfc_flow_parse_mark(sa, actions->conf, flow);
diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index 4b520bc619..63b917a323 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -2963,6 +2963,7 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 			  efx_mae_actions_t *spec,
 			  struct rte_flow_error *error)
 {
+	const uint64_t rx_metadata = sa->negotiated_rx_metadata;
 	bool custom_error = B_FALSE;
 	int rc = 0;
 
@@ -3012,12 +3013,29 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 	case RTE_FLOW_ACTION_TYPE_FLAG:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_FLAG,
 				       bundle->actions_mask);
-		rc = efx_mae_action_set_populate_flag(spec);
+		if ((rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG) != 0) {
+			rc = efx_mae_action_set_populate_flag(spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"flag delivery has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_MARK:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_MARK,
 				       bundle->actions_mask);
-		rc = sfc_mae_rule_parse_action_mark(sa, action->conf, spec);
+		if ((rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0) {
+			rc = sfc_mae_rule_parse_action_mark(sa, action->conf,
+							    spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"mark delivery has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_PHY_PORT:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_PHY_PORT,
-- 
2.20.1



* [dpdk-dev] [PATCH v4 3/5] net/sfc: support flow mark delivery on EF100 native datapath
  2021-10-04 23:50   ` [dpdk-dev] [PATCH v4 0/5] Negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
@ 2021-10-04 23:50     ` Ivan Malov
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-04 23:50 UTC (permalink / raw)
  To: dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton

MAE counter engine gets generation counts by virtue of the mark,
so the code to extract the field is already in place, but flow
action MARK doesn't benefit from it. Support this use case, too.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
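Note for readers (not part of the commit): taken together with the checks
that patch 2/5 adds to sfc_flow_parse_actions(), action MARK is accepted only
when the Rx datapath advertises the feature and delivery of the mark has been
negotiated. A minimal standalone sketch of that gating follows. The constants
and the helper name are stand-ins chosen for illustration, not the driver's
own definitions.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Stand-in values (assumptions). The driver itself tests
 * SFC_DP_RX_FEAT_FLOW_MARK and RTE_ETH_RX_METADATA_USER_MARK,
 * whose real values are defined elsewhere.
 */
#define DP_RX_FEAT_FLOW_MARK	0x1u
#define RX_METADATA_USER_MARK	(UINT64_C(1) << 1)

/* Hypothetical helper: 0 if action MARK may be used, -ENOTSUP otherwise. */
int
flow_mark_check(unsigned int dp_rx_features, uint64_t negotiated_rx_metadata)
{
	if ((dp_rx_features & DP_RX_FEAT_FLOW_MARK) == 0)
		return -ENOTSUP;	/* datapath cannot report the mark */

	if ((negotiated_rx_metadata & RX_METADATA_USER_MARK) == 0)
		return -ENOTSUP;	/* mark delivery has not been negotiated */

	return 0;
}
```

Both conditions have to hold, which is why this patch only needs to set the
datapath feature bit and request the mark field at RxQ init time.
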
 drivers/net/sfc/sfc_ef100_rx.c | 1 +
 drivers/net/sfc/sfc_rx.c       | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index 1bf04f565a..b634c8f23a 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -914,6 +914,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR,
 	.dev_offload_capa	= 0,
 	.queue_offload_capa	= DEV_RX_OFFLOAD_CHECKSUM |
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 280e8a61f9..5b924010bd 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
+
 	rc = sfc_ev_qinit(sa, SFC_EVQ_TYPE_RX, sw_index,
 			  evq_entries, socket_id, &evq);
 	if (rc != 0)
-- 
2.20.1



* [dpdk-dev] [PATCH v4 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  2021-10-04 23:50   ` [dpdk-dev] [PATCH v4 0/5] Negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
                       ` (2 preceding siblings ...)
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
@ 2021-10-04 23:50     ` Ivan Malov
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-04 23:50 UTC (permalink / raw)
  To: dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton

Add an RxQ flag to request support for user flag field of Rx
prefix. The feature is supported only on EF100 and EF10 ESSB.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
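Note for readers (not part of the commit): as far as the ef10_rx.c hunks in
this patch show, a request for the user flag is rejected with ENOTSUP for what
appears to be the default queue type and for packed stream, while equal stride
super-buffer (ESSB) queues get no such check. A rough standalone sketch of
that decision follows. The enum and the helper name are hypothetical; only the
0x20 flag value is taken from the efx.h hunk.

```c
#include <errno.h>

/* Hypothetical queue type names standing in for efx_rxq_type_t values. */
enum rxq_type {
	RXQ_TYPE_DEFAULT,
	RXQ_TYPE_PACKED_STREAM,
	RXQ_TYPE_ES_SUPER_BUFFER,
};

/* Same value as EFX_RXQ_FLAG_USER_FLAG in this patch. */
#define RXQ_FLAG_USER_FLAG	0x20u

/*
 * Returns 0 if the queue type can carry the user flag (or the flag was
 * not requested), ENOTSUP otherwise. libefx uses positive error codes.
 */
int
ef10_user_flag_check(enum rxq_type type, unsigned int flags)
{
	if ((flags & RXQ_FLAG_USER_FLAG) == 0)
		return 0;

	switch (type) {
	case RXQ_TYPE_ES_SUPER_BUFFER:
		return 0;	/* ESSB prefix has room for the user flag */
	case RXQ_TYPE_DEFAULT:
	case RXQ_TYPE_PACKED_STREAM:
	default:
		return ENOTSUP;
	}
}
```
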
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 3 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/common/sfc_efx/base/ef10_rx.c b/drivers/common/sfc_efx/base/ef10_rx.c
index 0c3f9413cf..a658e0dba2 100644
--- a/drivers/common/sfc_efx/base/ef10_rx.c
+++ b/drivers/common/sfc_efx/base/ef10_rx.c
@@ -930,6 +930,10 @@ ef10_rx_qcreate(
 			rc = ENOTSUP;
 			goto fail2;
 		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail3;
+		}
 		/*
 		 * Ignore EFX_RXQ_FLAG_RSS_HASH since if RSS hash is calculated
 		 * it is always delivered from HW in the pseudo-header.
@@ -940,7 +944,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_packed_stream_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail3;
+			goto fail4;
 		}
 		switch (type_data->ertd_packed_stream.eps_buf_size) {
 		case EFX_RXQ_PACKED_STREAM_BUF_SIZE_1M:
@@ -960,17 +964,21 @@ ef10_rx_qcreate(
 			break;
 		default:
 			rc = ENOTSUP;
-			goto fail4;
+			goto fail5;
 		}
 		erp->er_buf_size = type_data->ertd_packed_stream.eps_buf_size;
 		/* Packed stream pseudo header does not have RSS hash value */
 		if (flags & EFX_RXQ_FLAG_RSS_HASH) {
 			rc = ENOTSUP;
-			goto fail5;
+			goto fail6;
 		}
 		if (flags & EFX_RXQ_FLAG_USER_MARK) {
 			rc = ENOTSUP;
-			goto fail6;
+			goto fail7;
+		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail8;
 		}
 		break;
 #endif /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -979,7 +987,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_essb_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail7;
+			goto fail9;
 		}
 		params.es_bufs_per_desc =
 		    type_data->ertd_es_super_buffer.eessb_bufs_per_desc;
@@ -997,7 +1005,7 @@ ef10_rx_qcreate(
 #endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
 	default:
 		rc = ENOTSUP;
-		goto fail8;
+		goto fail10;
 	}
 
 #if EFSYS_OPT_RX_PACKED_STREAM
@@ -1005,13 +1013,13 @@ ef10_rx_qcreate(
 		/* Check if datapath firmware supports packed stream mode */
 		if (encp->enc_rx_packed_stream_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail9;
+			goto fail11;
 		}
 		/* Check if packed stream allows configurable buffer sizes */
 		if ((params.ps_buf_size != MC_CMD_INIT_RXQ_EXT_IN_PS_BUFF_1M) &&
 		    (encp->enc_rx_var_packed_stream_supported == B_FALSE)) {
 			rc = ENOTSUP;
-			goto fail10;
+			goto fail12;
 		}
 	}
 #else /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -1022,17 +1030,17 @@ ef10_rx_qcreate(
 	if (params.es_bufs_per_desc > 0) {
 		if (encp->enc_rx_es_super_buffer_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail11;
+			goto fail13;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_max_dma_len,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail12;
+			goto fail14;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_buf_stride,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail13;
+			goto fail15;
 		}
 	}
 #else /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
@@ -1041,7 +1049,7 @@ ef10_rx_qcreate(
 
 	if (flags & EFX_RXQ_FLAG_INGRESS_MPORT) {
 		rc = ENOTSUP;
-		goto fail14;
+		goto fail16;
 	}
 
 	/* Scatter can only be disabled if the firmware supports doing so */
@@ -1057,7 +1065,7 @@ ef10_rx_qcreate(
 
 	if ((rc = efx_mcdi_init_rxq(enp, ndescs, eep, label, index,
 		    esmp, &params)) != 0)
-		goto fail15;
+		goto fail17;
 
 	erp->er_eep = eep;
 	erp->er_label = label;
@@ -1070,40 +1078,44 @@ ef10_rx_qcreate(
 
 	return (0);
 
+fail17:
+	EFSYS_PROBE(fail17);
+fail16:
+	EFSYS_PROBE(fail16);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail15:
 	EFSYS_PROBE(fail15);
 fail14:
 	EFSYS_PROBE(fail14);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail13:
 	EFSYS_PROBE(fail13);
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail12:
 	EFSYS_PROBE(fail12);
 fail11:
 	EFSYS_PROBE(fail11);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail10:
 	EFSYS_PROBE(fail10);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail9:
 	EFSYS_PROBE(fail9);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail8:
 	EFSYS_PROBE(fail8);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail7:
 	EFSYS_PROBE(fail7);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
 fail6:
 	EFSYS_PROBE(fail6);
 fail5:
 	EFSYS_PROBE(fail5);
 fail4:
 	EFSYS_PROBE(fail4);
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail3:
 	EFSYS_PROBE(fail3);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail2:
 	EFSYS_PROBE(fail2);
 fail1:
diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h
index 24e1314cc3..bed1029f59 100644
--- a/drivers/common/sfc_efx/base/efx.h
+++ b/drivers/common/sfc_efx/base/efx.h
@@ -3007,6 +3007,10 @@ typedef enum efx_rxq_type_e {
  * Request user mark field in the Rx prefix of a queue.
  */
 #define	EFX_RXQ_FLAG_USER_MARK		0x10
+/*
+ * Request user flag field in the Rx prefix of a queue.
+ */
+#define	EFX_RXQ_FLAG_USER_FLAG		0x20
 
 LIBEFX_API
 extern	__checkReturn	efx_rc_t
diff --git a/drivers/common/sfc_efx/base/rhead_rx.c b/drivers/common/sfc_efx/base/rhead_rx.c
index 76b8ce302a..9d3258b503 100644
--- a/drivers/common/sfc_efx/base/rhead_rx.c
+++ b/drivers/common/sfc_efx/base/rhead_rx.c
@@ -635,6 +635,9 @@ rhead_rx_qcreate(
 	if (flags & EFX_RXQ_FLAG_USER_MARK)
 		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_MARK;
 
+	if (flags & EFX_RXQ_FLAG_USER_FLAG)
+		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_FLAG;
+
 	/*
 	 * LENGTH is required in EF100 host interface, as receive events
 	 * do not include the packet length.
-- 
2.20.1



* [dpdk-dev] [PATCH v4 5/5] net/sfc: report user flag on EF100 native datapath
  2021-10-04 23:50   ` [dpdk-dev] [PATCH v4 0/5] Negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
                       ` (3 preceding siblings ...)
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
@ 2021-10-04 23:50     ` Ivan Malov
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-04 23:50 UTC (permalink / raw)
  To: dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton

Detect the flag in Rx prefix and pass it to users.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
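Note for readers (not part of the commit): the Rx path change amounts to
"if the queue was started with the user flag field available and the field is
non-zero, report PKT_RX_FDIR". A minimal standalone sketch under that reading
follows. The helper name is hypothetical and the ol_flags bit is a stand-in
for PKT_RX_FDIR; the 0x80 queue flag mirrors SFC_EF100_RXQ_USER_FLAG from the
diff.

```c
#include <stdint.h>

#define RXQ_USER_FLAG	0x80u			/* mirrors SFC_EF100_RXQ_USER_FLAG */
#define OL_RX_FDIR	(UINT64_C(1) << 2)	/* stand-in for PKT_RX_FDIR */

/*
 * Hypothetical helper: translate the USER_FLAG field read from the Rx
 * prefix into mbuf offload flags. The field is ignored when the queue
 * does not have the field enabled.
 */
uint64_t
user_flag_to_ol_flags(unsigned int rxq_flags, uint32_t user_flag_field)
{
	uint64_t ol_flags = 0;

	if ((rxq_flags & RXQ_USER_FLAG) != 0 && user_flag_field != 0)
		ol_flags |= OL_RX_FDIR;

	return ol_flags;
}
```

An application would then test the corresponding bit in the mbuf's ol_flags
to learn whether a rule with action FLAG matched the packet.
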
 drivers/net/sfc/sfc_ef100_rx.c | 18 ++++++++++++++++++
 drivers/net/sfc/sfc_rx.c       |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index b634c8f23a..7d0d6b3d00 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -62,6 +62,7 @@ struct sfc_ef100_rxq {
 #define SFC_EF100_RXQ_RSS_HASH		0x10
 #define SFC_EF100_RXQ_USER_MARK		0x20
 #define SFC_EF100_RXQ_FLAG_INTR_EN	0x40
+#define SFC_EF100_RXQ_USER_FLAG		0x80
 	unsigned int			ptr_mask;
 	unsigned int			evq_phase_bit_shift;
 	unsigned int			ready_pkts;
@@ -371,6 +372,7 @@ static const efx_rx_prefix_layout_t sfc_ef100_rx_prefix_layout = {
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH_VALID, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(CLASS, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH, B_FALSE),
+		SFC_EF100_RX_PREFIX_FIELD(USER_FLAG, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(USER_MARK, B_FALSE),
 
 #undef	SFC_EF100_RX_PREFIX_FIELD
@@ -407,6 +409,15 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
 					      ESF_GZ_RX_PREFIX_RSS_HASH);
 	}
 
+	if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
+		uint32_t user_flag;
+
+		user_flag = EFX_OWORD_FIELD(rx_prefix[0],
+					    ESF_GZ_RX_PREFIX_USER_FLAG);
+		if (user_flag != 0)
+			ol_flags |= PKT_RX_FDIR;
+	}
+
 	if (rxq->flags & SFC_EF100_RXQ_USER_MARK) {
 		uint32_t user_mark;
 
@@ -800,6 +811,12 @@ sfc_ef100_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr,
 	else
 		rxq->flags &= ~SFC_EF100_RXQ_RSS_HASH;
 
+	if ((unsup_rx_prefix_fields &
+	     (1U << EFX_RX_PREFIX_FIELD_USER_FLAG)) == 0)
+		rxq->flags |= SFC_EF100_RXQ_USER_FLAG;
+	else
+		rxq->flags &= ~SFC_EF100_RXQ_USER_FLAG;
+
 	if ((unsup_rx_prefix_fields &
 	     (1U << EFX_RX_PREFIX_FIELD_USER_MARK)) == 0)
 		rxq->flags |= SFC_EF100_RXQ_USER_MARK;
@@ -914,6 +931,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_FLAG |
 				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR,
 	.dev_offload_capa	= 0,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 5b924010bd..5e120f5851 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_FLAG;
+
 	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
 
-- 
2.20.1



* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-04 13:53                           ` Andrew Rybchenko
@ 2021-10-05  6:30                             ` Ori Kam
  2021-10-05  7:27                               ` Andrew Rybchenko
  0 siblings, 1 reply; 97+ messages in thread
From: Ori Kam @ 2021-10-05  6:30 UTC (permalink / raw)
  To: Andrew Rybchenko, Ivan Malov, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi Andrew,

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Monday, October 4, 2021 4:53 PM
> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
> data
>
> On 10/4/21 2:39 PM, Ivan Malov wrote:
> > On 04/10/2021 09:56, Ori Kam wrote:
> >>> On 04/10/2021 00:04, Ori Kam wrote:
> >>>> I understand that you are only talking about enabling the action,
> >>>> meaning to let the PMD know that at some point there will be a rule
> >>>> that will use the mark action for example.
> >>>> Is my understanding correct?
> >>>
> >>> Not really. The causal relationships are as follows. The application
> >>> comes to realise that it will need to use, say, action MARK in
> >>> flows.
> >>> This, in turn, means that, in order to be able to actually see the
> >>> mark in received packets, the application needs to ensure that a)
> >>> the NIC will be able to deliver the mark to the PMD and b) that the
> >>> PMD will be able to deliver the mark to the application. In
> >>> particular, in the case of Rx mark,
> >>> (b) doesn't
> >>> need to be negotiated = field "mark" is anyway provisioned in the
> >>> mbuf structure, so no need to enable it. But (a) needs to be negotiated.
> >>> Hence this
> >>> API.
> >>>
> >> Please see my above comment I think we both agree.
> >
> > Agree to have the 4-th flag in the new API to cover this "custom / raw
> > metadata" delivery? Personally, I tend to agree, but maybe Andrew can
> > express his opinion, too.
>
> Of course, it could be added, but we're not going to support it in net/sfc. So, I
> think the flag should be added when a PMD is going to support it (e.g.
> net/mlx5).

I think it should be added now; moreover, I think that this patch should add the missing function
to all PMDs 😊

Best,
Ori


* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-05  6:30                             ` Ori Kam
@ 2021-10-05  7:27                               ` Andrew Rybchenko
  2021-10-05  8:17                                 ` Ori Kam
  0 siblings, 1 reply; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-05  7:27 UTC (permalink / raw)
  To: Ori Kam, Ivan Malov, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

On 10/5/21 9:30 AM, Ori Kam wrote:
> Hi Andrew,
>
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Sent: Monday, October 4, 2021 4:53 PM
>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
>> data
>>
>> On 10/4/21 2:39 PM, Ivan Malov wrote:
>>> On 04/10/2021 09:56, Ori Kam wrote:
>>>>> On 04/10/2021 00:04, Ori Kam wrote:
>>>>>> I understand that you are only talking about enabling the action,
>>>>>> meaning to let the PMD know that at some point there will be a rule
>>>>>> that will use the mark action for example.
>>>>>> Is my understanding correct?
>>>>> Not really. The causal relationships are as follows. The application
>>>>> comes to realise that it will need to use, say, action MARK in
>>>>> flows.
>>>>> This, in turn, means that, in order to be able to actually see the
>>>>> mark in received packets, the application needs to ensure that a)
>>>>> the NIC will be able to deliver the mark to the PMD and b) that the
>>>>> PMD will be able to deliver the mark to the application. In
>>>>> particular, in the case of Rx mark,
>>>>> (b) doesn't
>>>>> need to be negotiated = field "mark" is anyway provisioned in the
>>>>> mbuf structure, so no need to enable it. But (a) needs to be negotiated.
>>>>> Hence this
>>>>> API.
>>>>>
>>>> Please see my above comment I think we both agree.
>>> Agree to have the 4-th flag in the new API to cover this "custom / raw
>>> metadata" delivery? Personally, I tend to agree, but maybe Andrew can
>>> express his opinion, too.
>> Of course, it could be added, but we're not going to support it in net/sfc. So, I
>> think the flag should be added when a PMD is going to support it (e.g.
>> net/mlx5).
> I think it should be added now, and more I think that this patch should add the missing function
> to all PMDs 😊

Sorry, but I disagree. Could you point to the DPDK documentation
where this is written? Should every new API be supported in all PMDs
by the API contributor?

Andrew.



* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-05  7:27                               ` Andrew Rybchenko
@ 2021-10-05  8:17                                 ` Ori Kam
  2021-10-05  8:38                                   ` Andrew Rybchenko
  0 siblings, 1 reply; 97+ messages in thread
From: Ori Kam @ 2021-10-05  8:17 UTC (permalink / raw)
  To: Andrew Rybchenko, Ivan Malov, dev
  Cc: Andy Moreton, Ray Kinsella, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, NBU-Contact-Thomas Monjalon, Ferruh Yigit

Hi Andrew,

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Tuesday, October 5, 2021 10:27 AM
> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
> data
> 
> On 10/5/21 9:30 AM, Ori Kam wrote:
> > Hi Andrew,
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >> Sent: Monday, October 4, 2021 4:53 PM
> >> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
> >> Rx meta data
> >>
> >> On 10/4/21 2:39 PM, Ivan Malov wrote:
> >>> On 04/10/2021 09:56, Ori Kam wrote:
> >>>>> On 04/10/2021 00:04, Ori Kam wrote:
> >>>>>> I understand that you are only talking about enabling the action,
> >>>>>> meaning to let the PMD know that at some point there will be a
> >>>>>> rule that will use the mark action for example.
> >>>>>> Is my understanding correct?
> >>>>> Not really. The causal relationships are as follows. The
> >>>>> application comes to realise that it will need to use, say, action
> >>>>> MARK in flows.
> >>>>> This, in turn, means that, in order to be able to actually see the
> >>>>> mark in received packets, the application needs to ensure that a)
> >>>>> the NIC will be able to deliver the mark to the PMD and b) that
> >>>>> the PMD will be able to deliver the mark to the application. In
> >>>>> particular, in the case of Rx mark,
> >>>>> (b) doesn't
> >>>>> need to be negotiated = field "mark" is anyway provisioned in the
> >>>>> mbuf structure, so no need to enable it. But (a) needs to be negotiated.
> >>>>> Hence this
> >>>>> API.
> >>>>>
> >>>> Please see my above comment I think we both agree.
> >>> Agree to have the 4-th flag in the new API to cover this "custom /
> >>> raw metadata" delivery? Personally, I tend to agree, but maybe Andrew
> >>> can express his opinion, too.
> >> Of course, it could be added, but we're not going to support it in
> >> net/sfc. So, I think the flag should be added when a PMD is going to
> support it (e.g.
> >> net/mlx5).
> > I think it should be added now, and more I think that this patch
> > should add the missing function to all PMDs 😊
> 
> Sorry, but I disagree. Could you point out to DPDK documentation where it is
> written? Should all new API be supported in all PMDs by the API contributor?
> 
This changes existing PMD behavior: until now there was no need to register the MARK, and
now you require it. It is just like the shared counter change, where you needed to fix different drivers.
As I said in the other thread, this is not critical to me as long as it is clear that if a PMD doesn't support
the new function, it doesn't mean that the PMD has an issue with the request.

One more thing: I think this flag should be added now, since you need it.
I think you should report that you don't support it,
since, as we discussed, there is no real difference between metadata and MARK.
What do you think?

Best,
Ori
> Andrew.



* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-05  8:17                                 ` Ori Kam
@ 2021-10-05  8:38                                   ` Andrew Rybchenko
  2021-10-05  9:41                                     ` Ori Kam
  0 siblings, 1 reply; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-05  8:38 UTC (permalink / raw)
  To: Ori Kam, Ivan Malov, NBU-Contact-Thomas Monjalon
  Cc: Andy Moreton, Ray Kinsella, dev, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, Ferruh Yigit

Hi Ori,

On 10/5/21 11:17 AM, Ori Kam wrote:
> Hi Andrew,
> 
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Sent: Tuesday, October 5, 2021 10:27 AM
>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
>> data
>>
>> On 10/5/21 9:30 AM, Ori Kam wrote:
>>> Hi Andrew,
>>>
>>>> -----Original Message-----
>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>> Sent: Monday, October 4, 2021 4:53 PM
>>>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
>>>> Rx meta data
>>>>
>>>> On 10/4/21 2:39 PM, Ivan Malov wrote:
>>>>> On 04/10/2021 09:56, Ori Kam wrote:
>>>>>>> On 04/10/2021 00:04, Ori Kam wrote:
>>>>>>>> I understand that you are only talking about enabling the action,
>>>>>>>> meaning to let the PMD know that at some point there will be a
>>>>>>>> rule that will use the mark action for example.
>>>>>>>> Is my understanding correct?
>>>>>>> Not really. The causal relationships are as follows. The
>>>>>>> application comes to realise that it will need to use, say, action
>>>>>>> MARK in flows.
>>>>>>> This, in turn, means that, in order to be able to actually see the
>>>>>>> mark in received packets, the application needs to ensure that a)
>>>>>>> the NIC will be able to deliver the mark to the PMD and b) that
>>>>>>> the PMD will be able to deliver the mark to the application. In
>>>>>>> particular, in the case of Rx mark,
>>>>>>> (b) doesn't
>>>>>>> need to be negotiated = field "mark" is anyway provisioned in the
>>>>>>> mbuf structure, so no need to enable it. But (a) needs to be negotiated.
>>>>>>> Hence this
>>>>>>> API.
>>>>>>>
>>>>>> Please see my above comment I think we both agree.
>>>>> Agree to have the 4-th flag in the new API to cover this "custom /
>>>>> raw metadata" delivery? Personally, I tend to agree, but maybe Andrew
>>>>> can express his opinion, too.
>>>> Of course, it could be added, but we're not going to support it in
>>>> net/sfc. So, I think the flag should be added when a PMD is going to
>> support it (e.g.
>>>> net/mlx5).
>>> I think it should be added now, and more I think that this patch
>>> should add the missing function to all PMDs 😊
>>
>> Sorry, but I disagree. Could you point out to DPDK documentation where it is
>> written? Should all new API be supported in all PMDs by the API contributor?
>>
> This changes existing PMD behavior, until now there was no need to register the MARK
> now you require it, it is just like change the shared counter you needed to fix different drivers.
> This is not critical to me like I said in other thread as long as it is clear that if PMD doesn't support
> the new function it doesn't mean the PMD has issue with the request.

I see your point. Hopefully the function description in v4 makes
it clear that this is not the case. If the callback is not supported
by a driver, the application should try to use all required
metadata. So, under the defined API contract, nothing breaks.
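
The contract described above can be sketched from the application's side
roughly as follows. This is only an illustration under stated assumptions:
negotiate_fn stands in for the ethdev negotiation call added by patch 1/5,
the two mock callbacks imitate a driver without the callback and a driver
that can deliver only the mark, the bit values are placeholders, and treating
any other error as "rely on nothing" is a choice made for the sketch.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder bit values for RTE_ETH_RX_METADATA_USER_FLAG / _USER_MARK. */
#define RX_METADATA_USER_FLAG	(UINT64_C(1) << 0)
#define RX_METADATA_USER_MARK	(UINT64_C(1) << 1)

/* Hypothetical stand-in for the negotiation entry point. */
typedef int (*negotiate_fn)(uint64_t *features);

/* Mock: driver does not implement the callback. */
int
mock_negotiate_unsupported(uint64_t *features)
{
	(void)features;
	return -ENOTSUP;
}

/* Mock: driver can deliver the mark but not the flag. */
int
mock_negotiate_mark_only(uint64_t *features)
{
	*features &= RX_METADATA_USER_MARK;
	return 0;
}

/* Returns the metadata types the application may go on to use. */
uint64_t
usable_rx_metadata(negotiate_fn negotiate, uint64_t wanted)
{
	uint64_t features = wanted;
	int rc = negotiate(&features);

	if (rc == -ENOTSUP)
		return wanted;	/* no callback: just try to use everything */
	if (rc != 0)
		return 0;	/* assumption of this sketch: rely on nothing */

	return features & wanted;
}
```

With the first mock the application keeps both FLAG and MARK in its plans;
with the second it learns up front that only MARK will be delivered.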

Many thanks for your review notes. The review really
makes the API clearer and better documented.

> One more thing, I think this flag should be added now since you need it,
> I think you should report that you don't support it.
> since just like we talked there is no real difference between metadata and MARK.
> What do you think?

It sounds like a trick :) Negative support is, in fact, *not*
support. DPDK policy requires that a feature be supported in a PMD
and in an in-tree application. Of course, it is not a problem to
add meta. It is really easy to do. I just don't want to add
it in v5 only to delete it in v6 because of the concerns above.

@Thomas, what do you think?

Andrew.


* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-05  8:38                                   ` Andrew Rybchenko
@ 2021-10-05  9:41                                     ` Ori Kam
  2021-10-05 10:01                                       ` Andrew Rybchenko
  0 siblings, 1 reply; 97+ messages in thread
From: Ori Kam @ 2021-10-05  9:41 UTC (permalink / raw)
  To: Andrew Rybchenko, Ivan Malov, NBU-Contact-Thomas Monjalon
  Cc: Andy Moreton, Ray Kinsella, dev, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, Ferruh Yigit

Hi Andrew,

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Tuesday, October 5, 2021 11:39 AM
> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
> data
> 
> Hi Ori,
> 
> On 10/5/21 11:17 AM, Ori Kam wrote:
> > Hi Andrew,
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >> Sent: Tuesday, October 5, 2021 10:27 AM
> >> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
> >> Rx meta data
> >>
> >> On 10/5/21 9:30 AM, Ori Kam wrote:
> >>> Hi Andrew,
> >>>
> >>>> -----Original Message-----
> >>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>>> Sent: Monday, October 4, 2021 4:53 PM
> >>>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery
> >>>> of Rx meta data
> >>>>
> >>>> On 10/4/21 2:39 PM, Ivan Malov wrote:
> >>>>> On 04/10/2021 09:56, Ori Kam wrote:
> >>>>>>> On 04/10/2021 00:04, Ori Kam wrote:
> >>>>>>>> I understand that you are only talking about enabling the
> >>>>>>>> action, meaning to let the PMD know that at some point there
> >>>>>>>> will be a rule that will use the mark action for example.
> >>>>>>>> Is my understanding correct?
> >>>>>>> Not really. The causal relationships are as follows. The
> >>>>>>> application comes to realise that it will need to use, say,
> >>>>>>> action MARK in flows.
> >>>>>>> This, in turn, means that, in order to be able to actually see
> >>>>>>> the mark in received packets, the application needs to ensure
> >>>>>>> that a) the NIC will be able to deliver the mark to the PMD and
> >>>>>>> b) that the PMD will be able to deliver the mark to the
> >>>>>>> application. In particular, in the case of Rx mark,
> >>>>>>> (b) doesn't
> >>>>>>> need to be negotiated = field "mark" is anyway provisioned in
> >>>>>>> the mbuf structure, so no need to enable it. But (a) needs to be
> negotiated.
> >>>>>>> Hence this
> >>>>>>> API.
> >>>>>>>
> >>>>>> Please see my above comment I think we both agree.
> >>>>> Agree to have the 4-th flag in the new API to cover this "custom /
> >>>>> raw metadata" delivery? Personally, I tend to agree, but maybe
> >>>>> Andrew can express his opinion, too.
> >>>> Of course, it could be added, but we're not going to support it in
> >>>> net/sfc. So, I think the flag should be added when a PMD is going
> >>>> to
> >> support it (e.g.
> >>>> net/mlx5).
> >>> I think it should be added now, and more I think that this patch
> >>> should add the missing function to all PMDs 😊
> >>
> >> Sorry, but I disagree. Could you point out to DPDK documentation
> >> where it is written? Should all new API be supported in all PMDs by the API
> contributor?
> >>
> > This changes existing PMD behavior, until now there was no need to
> > register the MARK now you require it, it is just like change the shared counter
> you needed to fix different drivers.
> > This is not critical to me like I said in other thread as long as it
> > is clear that if PMD doesn't support the new function it doesn't mean the
> PMD has issue with the request.
> 
> I see your point. Hopefully the function description in v4 is clear that it is not
> the case. If callback is not supported by a driver, application should try to use
> all required metadata.
> So, there is no breakage in accordance with defined API contract.
> 

Agreed, less pretty but it works.

> Many thanks for your review notes. The review really makes the API clearer
> and better documented.
> 

Trying to do my best.

> > One more thing, I think this flag should be added now since you need
> > it, I think you should report that you don't support it.
> > since just like we talked there is no real difference between metadata and
> MARK.
> > What do you think?
> 
> It sounds like a trick :) Negative support is *not* a support in fact. DPDK policy
> requires support of a feature in a PMD and in-tree application. Of course, it is
> not a problem to add meta. It is really easy to do. I just don't want to add it in
> v5 to be deleted in v6 because of my above concerns.
> 
This was not a trick. I understand what you are saying.
If we say that metadata is the same as mark (I think we all agree on that) and that
the application needs to notify the PMD about such operations, I assume it will look for a way to
request the metadata.

I'm O.K. with adding it later, and in any case I promise you that if you add it,
it will stay.

> @Thomas, what do you think?
> 
> Andrew.

Ori

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-05  9:41                                     ` Ori Kam
@ 2021-10-05 10:01                                       ` Andrew Rybchenko
  2021-10-05 10:10                                         ` Ori Kam
  0 siblings, 1 reply; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-05 10:01 UTC (permalink / raw)
  To: Ori Kam, Ivan Malov, NBU-Contact-Thomas Monjalon
  Cc: Andy Moreton, Ray Kinsella, dev, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, Ferruh Yigit

Hi Ori,

On 10/5/21 12:41 PM, Ori Kam wrote:
> Hi Andrew,
> 
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Sent: Tuesday, October 5, 2021 11:39 AM
>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
>> data
>>
>> Hi Ori,
>>
>> On 10/5/21 11:17 AM, Ori Kam wrote:
>>
>>> One more thing, I think this flag should be added now since you need
>>> it, I think you should report that you don't support it.
>>> since just like we talked there is no real difference between metadata and MARK.
>>> What do you think?
>>
>> It sounds like a trick :) Negative support is *not* a support in fact. DPDK policy
>> requires support of a feature in a PMD and in-tree application. Of course, it is
>> not a problem to add meta. It is really easy to do. I just don't want to add it in
>> v5 to be deleted in v6 because of my above concerns.
>>
> This was not a trick. I understand what you are saying.
> if we say that metadata is the same as mark, (I think we all agree on it) and that
> application need to notify pmd about such operations, I assume it will try to see how to
> request the metadata.

Frankly speaking I feel sick when I think about META and MARK
together. Do we really need both in DPDK?

> I'm O.K. with adding it later and in any case I promise you that if you add it
> it will stay.

Many thanks, I see.

>> @Thomas, what do you think?
>>
>> Andrew.
> 
> Ori
> 

Andrew.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-05 10:01                                       ` Andrew Rybchenko
@ 2021-10-05 10:10                                         ` Ori Kam
  2021-10-05 11:11                                           ` Andrew Rybchenko
  0 siblings, 1 reply; 97+ messages in thread
From: Ori Kam @ 2021-10-05 10:10 UTC (permalink / raw)
  To: Andrew Rybchenko, Ivan Malov, NBU-Contact-Thomas Monjalon
  Cc: Andy Moreton, Ray Kinsella, dev, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, Ferruh Yigit

Hi Andrew,

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Tuesday, October 5, 2021 1:02 PM
> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
> data
> 
> Hi Ori,
> 
> On 10/5/21 12:41 PM, Ori Kam wrote:
> > Hi Andrew,
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >> Sent: Tuesday, October 5, 2021 11:39 AM
> >> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
> >> Rx meta data
> >>
> >> Hi Ori,
> >>
> >> On 10/5/21 11:17 AM, Ori Kam wrote:
> >>
> >>> One more thing, I think this flag should be added now since you need
> >>> it, I think you should report that you don't support it.
> >>> since just like we talked there is no real difference between metadata and
> >>> MARK.
> >>> What do you think?
> >>
> >> It sounds like a trick :) Negative support is *not* a support in
> >> fact. DPDK policy requires support of a feature in a PMD and in-tree
> >> application. Of course, it is not a problem to add meta. It is really
> >> easy to do. I just don't want to add it in
> >> v5 to be deleted in v6 because of my above concerns.
> >>
> > This was not a trick. I understand what you are saying.
> > if we say that metadata is the same as mark, (I think we all agree on
> > it) and that application need to notify pmd about such operations, I
> > assume it will try to see how to request the metadata.
> 
> Frankly speaking I feel sick when I think about META and MARK together. Do
> we really need both in DPDK?
> 
I really don't want you to be sick.
The reason we need both of them is that 32 bits of mark (in Nvidia's case it is only 24 bits)
is not enough, so there is a need for more bits.
I think that in the end we will move to something much more generic, where the application
just says how many bits it wants to get and that is what it will get.
For example, the application may say it needs 128 bits, and it will register this size with the mbuf
or give a pointer in the mbuf to where those values should be set.
In any case, as you can see, we already have too many changes in rte_flow in this release and the
next one, but I'm planning to push this feature in the future.
What do you think of such a feature?

Ori
> > I'm O.K. with adding it later and in any case I promise you that if
> > you add it it will stay.
> 
> Many thanks, I see.
> 
> >> @Thomas, what do you think?
> >>
> >> Andrew.
> >
> > Ori
> >
> 
> Andrew.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-05 10:10                                         ` Ori Kam
@ 2021-10-05 11:11                                           ` Andrew Rybchenko
  2021-10-06  8:30                                             ` Thomas Monjalon
  0 siblings, 1 reply; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-05 11:11 UTC (permalink / raw)
  To: Ori Kam, Ivan Malov, NBU-Contact-Thomas Monjalon
  Cc: Andy Moreton, Ray Kinsella, dev, Jerin Jacob, Wisam Monther,
	Xiaoyun Li, Ferruh Yigit

Hi Ori,

On 10/5/21 1:10 PM, Ori Kam wrote:
> Hi Andrew,
> 
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Sent: Tuesday, October 5, 2021 1:02 PM
>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta
>> data
>>
>> Hi Ori,
>>
>> On 10/5/21 12:41 PM, Ori Kam wrote:
>>> Hi Andrew,
>>>
>>>> -----Original Message-----
>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>> Sent: Tuesday, October 5, 2021 11:39 AM
>>>> Subject: Re: [PATCH v3 1/5] ethdev: add API to negotiate delivery of
>>>> Rx meta data
>>>>
>>>> Hi Ori,
>>>>
>>>> On 10/5/21 11:17 AM, Ori Kam wrote:
>>>>
>>>>> One more thing, I think this flag should be added now since you need
>>>>> it, I think you should report that you don't support it.
>>>>> since just like we talked there is no real difference between metadata and
>>>>> MARK.
>>>>> What do you think?
>>>>
>>>> It sounds like a trick :) Negative support is *not* a support in
>>>> fact. DPDK policy requires support of a feature in a PMD and in-tree
>>>> application. Of course, it is not a problem to add meta. It is really
>>>> easy to do. I just don't want to add it in
>>>> v5 to be deleted in v6 because of my above concerns.
>>>>
>>> This was not a trick. I understand what you are saying.
>>> if we say that metadata is the same as mark, (I think we all agree on
>>> it) and that application need to notify pmd about such operations, I
>>> assume it will try to see how to request the metadata.
>>
>> Frankly speaking I feel sick when I think about META and MARK together. Do
>> we really need both in DPDK?
>>
> I really don't want you to be sick.
> The reason we need both of them is that 32 bits of mark (in Nvidia's case it is only 24 bits)
> is not enough, so there is a need for more bits.
> I think that in the end we will move to something much more generic, where the application
> just says how many bits it wants to get and that is what it will get.
> For example, the application may say it needs 128 bits, and it will register this size with the mbuf
> or give a pointer in the mbuf to where those values should be set.
> In any case, as you can see, we already have too many changes in rte_flow in this release and the
> next one, but I'm planning to push this feature in the future.
> What do you think of such a feature?

I agree that there are really many flow API changes
under review in this release cycle.
I hope the above idea will allow merging MARK and META.

Could you take a look at v4, sent by Ivan yesterday, and
summarize the current status of the review?
Which points are still unclear and must be improved?
What is desirable to improve from your point of view?

Andrew.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD
  2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
@ 2021-10-05 12:03       ` Ori Kam
  2021-10-05 12:50         ` Ivan Malov
  0 siblings, 1 reply; 97+ messages in thread
From: Ori Kam @ 2021-10-05 12:03 UTC (permalink / raw)
  To: Ivan Malov, dev
  Cc: Ray Kinsella, Jerin Jacob, NBU-Contact-Thomas Monjalon,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton, Wisam Monther,
	Xiaoyun Li, Ferruh Yigit

Hi Ivan,

Just a nit below.

> -----Original Message-----
> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> Sent: Tuesday, October 5, 2021 2:50 AM
> Subject: [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from
> HW to PMD
> 
> Provide an API to let the application control the NIC's ability to deliver specific
> kinds of per-packet metadata to the PMD.
> 
> Checks for the NIC's ability to set these kinds of metadata in the first place
> (support for the flow actions) belong in flow API responsibility domain (flow
> validate mechanism).
> This topic is out of scope of the new API in question.
> 
> The PMD's ability to deliver received metadata to the user by virtue of mbuf
> fields should be covered by mbuf library.
> It is also out of scope of the new API in question.
> 

+1 very clear.

> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> Acked-by: Jerin Jacob <jerinj@marvell.com>
> ---

[Snip]

> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -4902,6 +4902,59 @@ __rte_experimental
>  int rte_eth_representor_info_get(uint16_t port_id,
>  				 struct rte_eth_representor_info *info);
> 
> +/** The NIC is able to deliver flag (if set) with packets to the PMD. */
> +#define RTE_ETH_RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
> +
> +/** The NIC is able to deliver mark ID with packets to the PMD. */
> +#define RTE_ETH_RX_METADATA_USER_MARK (UINT64_C(1) << 1)
> +
> +/** The NIC is able to deliver tunnel ID with packets to the PMD. */
> +#define RTE_ETH_RX_METADATA_TUNNEL_ID (UINT64_C(1) << 2)
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Negotiate the NIC's ability to deliver specific kinds of metadata to the PMD.
> + *
> + * Invoke this API before the first rte_eth_dev_configure() invocation
> + * to let the PMD make preparations that are inconvenient to do later.
> + *
> + * The negotiation process is as follows:
> + *
> + * - the application requests features intending to use at least some of them;
> + * - the PMD responds with the guaranteed subset of the requested feature set;
> + * - the application can retry negotiation with another set of features;
> + * - the application can pass zero to clear the negotiation result;
> + * - the last negotiated result takes effect upon the ethdev start.

Not upon ethdev configure?

> + *
> + * @note
> + *   The PMD is supposed to first consider enabling the requested feature set
> + *   in its entirety. Only if it fails to do so, does it have the right to
> + *   respond with a smaller set of the originally requested features.
> + *
> + * @note
> + *   Return code (-ENOTSUP) does not necessarily mean that the requested
> + *   features are unsupported. In this case, the application should just
> + *   assume that these features can be used without prior negotiations.
> + *
> + * @param port_id
> + *   Port (ethdev) identifier
> + *
> + * @param[inout] features
> + *   Feature selection buffer
> + *
> + * @return
> + *   - (-EBUSY) if the port can't handle this in its current state;
> + *   - (-ENOTSUP) if the method itself is not supported by the PMD;
> + *   - (-ENODEV) if *port_id* is invalid;
> + *   - (-EINVAL) if *features* is NULL;
> + *   - (-EIO) if the device is removed;
> + *   - (0) on success
> + */
> +__rte_experimental
> +int rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features);
> +
>  #include <rte_ethdev_core.h>
> 
>  /**
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index 7b1ed7f110..75656ff9f8 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -1904,6 +1904,10 @@ enum rte_flow_action_type {
>  	 * PKT_RX_FDIR_ID mbuf flags.
>  	 *
>  	 * See struct rte_flow_action_mark.
> +	 *
> +	 * One should negotiate mark delivery from the NIC to the PMD.
> +	 * @see rte_eth_rx_metadata_negotiate()
> +	 * @see RTE_ETH_RX_METADATA_USER_MARK
>  	 */
>  	RTE_FLOW_ACTION_TYPE_MARK,
> 
> @@ -1912,6 +1916,10 @@ enum rte_flow_action_type {
>  	 * sets the PKT_RX_FDIR mbuf flag.
>  	 *
>  	 * No associated configuration structure.
> +	 *
> +	 * One should negotiate flag delivery from the NIC to the PMD.
> +	 * @see rte_eth_rx_metadata_negotiate()
> +	 * @see RTE_ETH_RX_METADATA_USER_FLAG
>  	 */
>  	RTE_FLOW_ACTION_TYPE_FLAG,
> 
> @@ -4223,6 +4231,10 @@ rte_flow_tunnel_match(uint16_t port_id,
>  /**
>   * Populate the current packet processing state, if exists, for the given mbuf.
>   *
> + * One should negotiate tunnel metadata delivery from the NIC to the HW.
> + * @see rte_eth_rx_metadata_negotiate()
> + * @see RTE_ETH_RX_METADATA_TUNNEL_ID
> + *
>   * @param port_id
>   *   Port identifier of Ethernet device.
>   * @param[in] m
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index 904bce6ea1..2e638c680e 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -247,6 +247,9 @@ EXPERIMENTAL {
>  	rte_mtr_meter_policy_delete;
>  	rte_mtr_meter_policy_update;
>  	rte_mtr_meter_policy_validate;
> +
> +	# added in 21.11
> +	rte_eth_rx_metadata_negotiate;
>  };
> 
>  INTERNAL {
> --
> 2.20.1
Best,
Ori


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD
  2021-10-05 12:03       ` Ori Kam
@ 2021-10-05 12:50         ` Ivan Malov
  2021-10-05 13:17           ` Andrew Rybchenko
  0 siblings, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-10-05 12:50 UTC (permalink / raw)
  To: Ori Kam, dev
  Cc: Ray Kinsella, Jerin Jacob, NBU-Contact-Thomas Monjalon,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton, Wisam Monther,
	Xiaoyun Li, Ferruh Yigit

Hi Ori,

On 05/10/2021 15:03, Ori Kam wrote:
> Hi Ivan,
> 
> Just a nit below.
> 
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Sent: Tuesday, October 5, 2021 2:50 AM
>> Subject: [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from
>> HW to PMD
>>
>> Provide an API to let the application control the NIC's ability to deliver specific
>> kinds of per-packet metadata to the PMD.
>>
>> Checks for the NIC's ability to set these kinds of metadata in the first place
>> (support for the flow actions) belong in flow API responsibility domain (flow
>> validate mechanism).
>> This topic is out of scope of the new API in question.
>>
>> The PMD's ability to deliver received metadata to the user by virtue of mbuf
>> fields should be covered by mbuf library.
>> It is also out of scope of the new API in question.
>>
> 
> +1 very clear.
> 
>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>> ---
> 
> [Snip]
> 
>> --- a/lib/ethdev/rte_ethdev.h
>> +++ b/lib/ethdev/rte_ethdev.h
>> @@ -4902,6 +4902,59 @@ __rte_experimental
>>   int rte_eth_representor_info_get(uint16_t port_id,
>>   				 struct rte_eth_representor_info *info);
>>
>> +/** The NIC is able to deliver flag (if set) with packets to the PMD. */
>> +#define RTE_ETH_RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
>> +
>> +/** The NIC is able to deliver mark ID with packets to the PMD. */
>> +#define RTE_ETH_RX_METADATA_USER_MARK (UINT64_C(1) << 1)
>> +
>> +/** The NIC is able to deliver tunnel ID with packets to the PMD. */
>> +#define RTE_ETH_RX_METADATA_TUNNEL_ID (UINT64_C(1) << 2)
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice
>> + *
>> + * Negotiate the NIC's ability to deliver specific kinds of metadata to the PMD.
>> + *
>> + * Invoke this API before the first rte_eth_dev_configure() invocation
>> + * to let the PMD make preparations that are inconvenient to do later.
>> + *
>> + * The negotiation process is as follows:
>> + *
>> + * - the application requests features intending to use at least some of them;
>> + * - the PMD responds with the guaranteed subset of the requested feature set;
>> + * - the application can retry negotiation with another set of features;
>> + * - the application can pass zero to clear the negotiation result;
>> + * - the last negotiated result takes effect upon the ethdev start.
> 
> Not upon ethdev configure?

Well, technically, doing "configure()" just closes the negotiation 
window. I guess, "to take effect" is "to be activated", and activation 
of Rx features typically happens on Rx subsystem start.

I know it might seem a bit inconsistent, but in any case the API 
contract says clearly that invocations of "metadata_negotiate()" should 
be done before "configure()".

Andrew?

> 
>> + *
>> + * @note
>> + *   The PMD is supposed to first consider enabling the requested feature set
>> + *   in its entirety. Only if it fails to do so, does it have the right to
>> + *   respond with a smaller set of the originally requested features.
>> + *
>> + * @note
>> + *   Return code (-ENOTSUP) does not necessarily mean that the requested
>> + *   features are unsupported. In this case, the application should just
>> + *   assume that these features can be used without prior negotiations.
>> + *
>> + * @param port_id
>> + *   Port (ethdev) identifier
>> + *
>> + * @param[inout] features
>> + *   Feature selection buffer
>> + *
>> + * @return
>> + *   - (-EBUSY) if the port can't handle this in its current state;
>> + *   - (-ENOTSUP) if the method itself is not supported by the PMD;
>> + *   - (-ENODEV) if *port_id* is invalid;
>> + *   - (-EINVAL) if *features* is NULL;
>> + *   - (-EIO) if the device is removed;
>> + *   - (0) on success
>> + */
>> +__rte_experimental
>> +int rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features);
>> +
>>   #include <rte_ethdev_core.h>
>>
>>   /**
>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>> index 7b1ed7f110..75656ff9f8 100644
>> --- a/lib/ethdev/rte_flow.h
>> +++ b/lib/ethdev/rte_flow.h
>> @@ -1904,6 +1904,10 @@ enum rte_flow_action_type {
>>   	 * PKT_RX_FDIR_ID mbuf flags.
>>   	 *
>>   	 * See struct rte_flow_action_mark.
>> +	 *
>> +	 * One should negotiate mark delivery from the NIC to the PMD.
>> +	 * @see rte_eth_rx_metadata_negotiate()
>> +	 * @see RTE_ETH_RX_METADATA_USER_MARK
>>   	 */
>>   	RTE_FLOW_ACTION_TYPE_MARK,
>>
>> @@ -1912,6 +1916,10 @@ enum rte_flow_action_type {
>>   	 * sets the PKT_RX_FDIR mbuf flag.
>>   	 *
>>   	 * No associated configuration structure.
>> +	 *
>> +	 * One should negotiate flag delivery from the NIC to the PMD.
>> +	 * @see rte_eth_rx_metadata_negotiate()
>> +	 * @see RTE_ETH_RX_METADATA_USER_FLAG
>>   	 */
>>   	RTE_FLOW_ACTION_TYPE_FLAG,
>>
>> @@ -4223,6 +4231,10 @@ rte_flow_tunnel_match(uint16_t port_id,
>>   /**
>>    * Populate the current packet processing state, if exists, for the given mbuf.
>>    *
>> + * One should negotiate tunnel metadata delivery from the NIC to the HW.
>> + * @see rte_eth_rx_metadata_negotiate()
>> + * @see RTE_ETH_RX_METADATA_TUNNEL_ID
>> + *
>>    * @param port_id
>>    *   Port identifier of Ethernet device.
>>    * @param[in] m
>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>> index 904bce6ea1..2e638c680e 100644
>> --- a/lib/ethdev/version.map
>> +++ b/lib/ethdev/version.map
>> @@ -247,6 +247,9 @@ EXPERIMENTAL {
>>   	rte_mtr_meter_policy_delete;
>>   	rte_mtr_meter_policy_update;
>>   	rte_mtr_meter_policy_validate;
>> +
>> +	# added in 21.11
>> +	rte_eth_rx_metadata_negotiate;
>>   };
>>
>>   INTERNAL {
>> --
>> 2.20.1
> Best,
> Ori
> 

-- 
Ivan M

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD
  2021-10-05 12:50         ` Ivan Malov
@ 2021-10-05 13:17           ` Andrew Rybchenko
  0 siblings, 0 replies; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-05 13:17 UTC (permalink / raw)
  To: Ivan Malov, Ori Kam, dev
  Cc: Ray Kinsella, Jerin Jacob, NBU-Contact-Thomas Monjalon,
	Ajit Khaparde, Andy Moreton, Wisam Monther, Xiaoyun Li,
	Ferruh Yigit

On 10/5/21 3:50 PM, Ivan Malov wrote:
> Hi Ori,
> 
> On 05/10/2021 15:03, Ori Kam wrote:
>> Hi Ivan,
>>
>> Just a nit below.
>>
>>> -----Original Message-----
>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>> Sent: Tuesday, October 5, 2021 2:50 AM
>>> Subject: [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from
>>> HW to PMD
>>>
>>> Provide an API to let the application control the NIC's ability to deliver specific
>>> kinds of per-packet metadata to the PMD.
>>>
>>> Checks for the NIC's ability to set these kinds of metadata in the first place
>>> (support for the flow actions) belong in flow API responsibility domain (flow
>>> validate mechanism).
>>> This topic is out of scope of the new API in question.
>>>
>>> The PMD's ability to deliver received metadata to the user by virtue of mbuf
>>> fields should be covered by mbuf library.
>>> It is also out of scope of the new API in question.
>>>
>>
>> +1 very clear.
>>
>>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>>> ---
>>
>> [Snip]
>>
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -4902,6 +4902,59 @@ __rte_experimental
>>>   int rte_eth_representor_info_get(uint16_t port_id,
>>>                    struct rte_eth_representor_info *info);
>>>
>>> +/** The NIC is able to deliver flag (if set) with packets to the PMD. */
>>> +#define RTE_ETH_RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
>>> +
>>> +/** The NIC is able to deliver mark ID with packets to the PMD. */
>>> +#define RTE_ETH_RX_METADATA_USER_MARK (UINT64_C(1) << 1)
>>> +
>>> +/** The NIC is able to deliver tunnel ID with packets to the PMD. */
>>> +#define RTE_ETH_RX_METADATA_TUNNEL_ID (UINT64_C(1) << 2)
>>> +
>>> +/**
>>> + * @warning
>>> + * @b EXPERIMENTAL: this API may change without prior notice
>>> + *
>>> + * Negotiate the NIC's ability to deliver specific kinds of metadata to the PMD.
>>> + *
>>> + * Invoke this API before the first rte_eth_dev_configure() invocation
>>> + * to let the PMD make preparations that are inconvenient to do later.
>>> + *
>>> + * The negotiation process is as follows:
>>> + *
>>> + * - the application requests features intending to use at least some of them;
>>> + * - the PMD responds with the guaranteed subset of the requested feature set;
>>> + * - the application can retry negotiation with another set of features;
>>> + * - the application can pass zero to clear the negotiation result;
>>> + * - the last negotiated result takes effect upon the ethdev start.
>>
>> Not upon ethdev configure?
> 
> Well, technically, doing "configure()" just closes the negotiation
> window. I guess, "to take effect" is "to be activated", and activation
> of Rx features typically happens on Rx subsystem start.

Yes, i.e. ethdev port start from the application's point of view.

> I know it might seem a bit inconsistent, but in any case the API
> contract says clearly that invocations of "metadata_negotiate()" should
> be done before "configure()".
> 
> Andrew?

Yes, the reason to define the order is to simplify implementation.
When configure is invoked, the PMD knows that Rx metadata has been
negotiated and it should treat all other bits of the
configuration with respect to the Rx metadata configuration,
if applicable.

So, I think the question is a fair one and the correct description
should say: ... upon the ethdev configure and start.

Andrew.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v5 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
                     ` (6 preceding siblings ...)
  2021-10-04 23:50   ` [dpdk-dev] [PATCH v4 0/5] Negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
@ 2021-10-05 15:56   ` Ivan Malov
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
                       ` (4 more replies)
  2021-10-12 19:38   ` [dpdk-dev] [PATCH v6 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
  2021-10-12 19:46   ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
  9 siblings, 5 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-05 15:56 UTC (permalink / raw)
  To: dev; +Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam, Ajit Khaparde

In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
only the former has been added. The issue has not been solved.
Applications still assume that metadata features always work
and do not need to be configured in advance.

The team behind net/sfc driver has given this problem more thought.
Conclusions that have been reached are as follows.

1. Not all kinds of metadata can be represented by device offload flags.
   For instance, having flag RSS_HASH is legitimate because the NIC is
   supposed to actually compute something when this feature is active.
   However, if similar flag existed for Rx mark, requesting it would
   not make the NIC actually compute anything. The HW needs external
   stimuli (flow rules) in order to set the mark in the first place.

2. As a consequence of (1), it is apparent that the user's ability to
   use Rx metadata features is complex and consists of multiple parts:
   a) the NIC's ability to conduct the flow actions (set metadata);
   b) the NIC's ability to deliver metadata (if set) to the PMD;
   c) the PMD's ability to provide metadata received from the
      NIC to the user by virtue of filling out mbuf fields.

3. Aspects (2-a) and (2-c) are already addressed by flow validate API
   and the procedure of dynamic mbuf field registration respectively,
   hence, the only problem which really needs a solution is (2-b).
  
Patch [1/5] of this series adds a generic API to let the application
negotiate the NIC's ability to deliver specific kinds of metadata to
the PMD. This API is supposed to be invoked during initialisation
period in order to let the PMD configure HW resources which might
be hard to (re-)configure in the adapter's started state without
causing traffic disruption and other unwanted consequences.
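
As a rough illustration of how an application might interpret the
outcome of such a negotiation, consider the standalone sketch below.
It is an assumption-laden example, not part of the series: the helper
rx_metadata_usable() is hypothetical, and the flags are redefined
locally with the same bit values as in patch [1/5] so that the code
builds without DPDK headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Same bit values as in patch [1/5], redefined for a standalone build. */
#define RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
#define RX_METADATA_USER_MARK (UINT64_C(1) << 1)
#define RX_METADATA_TUNNEL_ID (UINT64_C(1) << 2)

/*
 * Hypothetical helper: decide whether the application may rely on the
 * metadata kinds in "wanted", given the return code of the negotiation
 * call and the feature mask written back by the PMD.
 *
 * Returns 1 if usable, 0 if not delivered, negative errno on hard error.
 */
static int
rx_metadata_usable(int ret, uint64_t negotiated, uint64_t wanted)
{
	/* No negotiation method in the PMD: assume no negotiation is needed. */
	if (ret == -ENOTSUP)
		return 1;

	/* Any other failure (-EBUSY, -ENODEV, -EINVAL, -EIO) is a real error. */
	if (ret != 0)
		return ret;

	/* On success, the PMD guarantees only the bits it left set. */
	return (negotiated & wanted) == wanted;
}

/* A few representative outcomes. */
static void
usable_demo(void)
{
	assert(rx_metadata_usable(-ENOTSUP, 0, RX_METADATA_USER_MARK) == 1);
	assert(rx_metadata_usable(0, RX_METADATA_USER_FLAG,
				  RX_METADATA_USER_MARK) == 0);
	assert(rx_metadata_usable(-EBUSY, 0, RX_METADATA_USER_MARK) == -EBUSY);
}
```

The -ENOTSUP branch mirrors the API note that a missing callback does
not mean the features themselves are unsupported.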

[1] c5b2e78d1172 ("doc: announce ethdev API changes in offload flags")

Changes in v2:
* [1/5] has review notes from Jerin Jacob applied and the ack from Ray Kinsella added
* [2/5] has minor adjustments incorporated to follow changes in [1/5]

Changes in v3:
* [1/5] through [5/5] have review notes from Andy Moreton applied (mostly rewording)
* [1/5] has the ack from Jerin Jacob added

Changes in v4:
* [1/5] has the API contract clarified to address concerns raised by Ori Kam
* [1/5] has the API name fixed to use term "metadata" instead of "meta"
* [1/5] has testpmd loglevel changed as per the note by Ajit Khaparde
* [1/5] has testpmd code revisited to take multi-process into account
* [2/5] through [5/5] have the corresponding adjustments incorporated

Changes in v5:
* [1/5] has the API comment improved as per the note by Ori Kam

Ivan Malov (5):
  ethdev: negotiate delivery of packet metadata from HW to PMD
  net/sfc: support API to negotiate delivery of Rx metadata
  net/sfc: support flow mark delivery on EF100 native datapath
  common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  net/sfc: report user flag on EF100 native datapath

 app/test-flow-perf/main.c              | 21 ++++++++++
 app/test-pmd/testpmd.c                 | 37 ++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  9 +++++
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 drivers/net/sfc/sfc.h                  |  2 +
 drivers/net/sfc/sfc_ef100_rx.c         | 19 +++++++++
 drivers/net/sfc/sfc_ethdev.c           | 29 ++++++++++++++
 drivers/net/sfc/sfc_flow.c             | 13 +++++++
 drivers/net/sfc/sfc_mae.c              | 22 ++++++++++-
 drivers/net/sfc/sfc_rx.c               |  6 +++
 lib/ethdev/ethdev_driver.h             | 22 +++++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++++
 lib/ethdev/rte_ethdev.h                | 54 ++++++++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 12 ++++++
 lib/ethdev/version.map                 |  3 ++
 17 files changed, 312 insertions(+), 23 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v5 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD
  2021-10-05 15:56   ` [dpdk-dev] [PATCH v5 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
@ 2021-10-05 15:56     ` Ivan Malov
  2021-10-05 21:40       ` Ajit Khaparde
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-10-05 15:56 UTC (permalink / raw)
  To: dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton, Wisam Jaddo,
	Xiaoyun Li, Ferruh Yigit

Provide an API to let the application control the NIC's ability
to deliver specific kinds of per-packet metadata to the PMD.

Checks for the NIC's ability to set these kinds of metadata
in the first place (support for the flow actions) belong in
flow API responsibility domain (flow validate mechanism).
This topic is out of scope of the new API in question.

The PMD's ability to deliver received metadata to the user
by virtue of mbuf fields should be covered by mbuf library.
It is also out of scope of the new API in question.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
---
 app/test-flow-perf/main.c              | 21 ++++++++++
 app/test-pmd/testpmd.c                 | 37 ++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  9 +++++
 lib/ethdev/ethdev_driver.h             | 22 +++++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++++
 lib/ethdev/rte_ethdev.h                | 54 ++++++++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 12 ++++++
 lib/ethdev/version.map                 |  3 ++
 8 files changed, 183 insertions(+)

diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
index 9be8edc31d..4d01791f6f 100644
--- a/app/test-flow-perf/main.c
+++ b/app/test-flow-perf/main.c
@@ -1760,6 +1760,27 @@ init_port(void)
 		rte_exit(EXIT_FAILURE, "Error: can't init mbuf pool\n");
 
 	for (port_id = 0; port_id < nr_ports; port_id++) {
+		uint64_t rx_metadata = 0;
+
+		rx_metadata |= RTE_ETH_RX_METADATA_USER_FLAG;
+		rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+
+		ret = rte_eth_rx_metadata_negotiate(port_id, &rx_metadata);
+		if (ret == 0) {
+			if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG)) {
+				printf(":: flow action FLAG will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+
+			if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+				printf(":: flow action MARK will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+		} else if (ret != -ENOTSUP) {
+			rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port=%u: %s\n",
+				 port_id, rte_strerror(-ret));
+		}
+
 		ret = rte_eth_dev_info_get(port_id, &dev_info);
 		if (ret != 0)
 			rte_exit(EXIT_FAILURE,
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 97ae52e17e..bf80de4e80 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -533,6 +533,41 @@ int proc_id;
  */
 unsigned int num_procs = 1;
 
+static void
+eth_rx_metadata_negotiate_mp(uint16_t port_id)
+{
+	uint64_t rx_meta_features = 0;
+	int ret;
+
+	if (!is_proc_primary())
+		return;
+
+	rx_meta_features |= RTE_ETH_RX_METADATA_USER_FLAG;
+	rx_meta_features |= RTE_ETH_RX_METADATA_USER_MARK;
+	rx_meta_features |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+	ret = rte_eth_rx_metadata_negotiate(port_id, &rx_meta_features);
+	if (ret == 0) {
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_USER_FLAG)) {
+			TESTPMD_LOG(DEBUG, "Flow action FLAG will not affect Rx mbufs on port %u\n",
+				    port_id);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_USER_MARK)) {
+			TESTPMD_LOG(DEBUG, "Flow action MARK will not affect Rx mbufs on port %u\n",
+				    port_id);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+			TESTPMD_LOG(DEBUG, "Flow tunnel offload support might be limited or unavailable on port %u\n",
+				    port_id);
+		}
+	} else if (ret != -ENOTSUP) {
+		rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port %u: %s\n",
+			 port_id, rte_strerror(-ret));
+	}
+}
+
 static int
 eth_dev_configure_mp(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		      const struct rte_eth_conf *dev_conf)
@@ -1489,6 +1524,8 @@ init_config_port_offloads(portid_t pid, uint32_t socket_id)
 	int ret;
 	int i;
 
+	eth_rx_metadata_negotiate_mp(pid);
+
 	port->dev_conf.txmode = tx_mode;
 	port->dev_conf.rxmode = rx_mode;
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index dfc2cbdeed..1aa8a6525a 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -130,6 +130,15 @@ New Features
   * Added tests to validate packets hard expiry.
   * Added tests to verify tunnel header verification in IPsec inbound.
 
+* **Added an API to control delivery of Rx metadata from the HW to the PMD.**
+
+  A new API, ``rte_eth_rx_metadata_negotiate()``, was added.
+  The following parts of Rx metadata were defined:
+
+  * ``RTE_ETH_RX_METADATA_USER_FLAG``
+  * ``RTE_ETH_RX_METADATA_USER_MARK``
+  * ``RTE_ETH_RX_METADATA_TUNNEL_ID``
+
 
 Removed Items
 -------------
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index cc2c75261c..d073d63ba8 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -785,6 +785,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
 typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
 	struct rte_eth_representor_info *info);
 
+/**
+ * @internal
+ * Negotiate the NIC's ability to deliver specific kinds of metadata to the PMD.
+ *
+ * @param dev
+ *   Port (ethdev) handle
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   Negative errno value on error, zero otherwise
+ */
+typedef int (*eth_rx_metadata_negotiate_t)(struct rte_eth_dev *dev,
+				       uint64_t *features);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -945,6 +961,12 @@ struct eth_dev_ops {
 
 	eth_representor_info_get_t representor_info_get;
 	/**< Get representor info. */
+
+	/**
+	 * Negotiate the NIC's ability to deliver specific
+	 * kinds of metadata to the PMD.
+	 */
+	eth_rx_metadata_negotiate_t rx_metadata_negotiate;
 };
 
 /**
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index daf5ca9242..a41fb8a398 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6310,6 +6310,31 @@ rte_eth_representor_info_get(uint16_t port_id,
 	return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
 }
 
+int
+rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (dev->data->dev_configured != 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"The port (id=%"PRIu16") is already configured\n",
+			port_id);
+		return -EBUSY;
+	}
+
+	if (features == NULL) {
+		RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_metadata_negotiate, -ENOTSUP);
+	return eth_err(port_id,
+		       (*dev->dev_ops->rx_metadata_negotiate)(dev, features));
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index afdc53b674..00c0af9a15 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -4902,6 +4902,60 @@ __rte_experimental
 int rte_eth_representor_info_get(uint16_t port_id,
 				 struct rte_eth_representor_info *info);
 
+/** The NIC is able to deliver flag (if set) with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
+
+/** The NIC is able to deliver mark ID with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_USER_MARK (UINT64_C(1) << 1)
+
+/** The NIC is able to deliver tunnel ID with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_TUNNEL_ID (UINT64_C(1) << 2)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Negotiate the NIC's ability to deliver specific kinds of metadata to the PMD.
+ *
+ * Invoke this API before the first rte_eth_dev_configure() invocation
+ * to let the PMD make preparations that are inconvenient to do later.
+ *
+ * The negotiation process is as follows:
+ *
+ * - the application requests features intending to use at least some of them;
+ * - the PMD responds with the guaranteed subset of the requested feature set;
+ * - the application can retry negotiation with another set of features;
+ * - the application can pass zero to clear the negotiation result;
+ * - the last negotiated result takes effect upon
+ *   the ethdev configure and start.
+ *
+ * @note
+ *   The PMD is supposed to first consider enabling the requested feature set
+ *   in its entirety. Only if it fails to do so does it have the right to
+ *   respond with a smaller set of the originally requested features.
+ *
+ * @note
+ *   Return code (-ENOTSUP) does not necessarily mean that the requested
+ *   features are unsupported. In this case, the application should just
+ *   assume that these features can be used without prior negotiations.
+ *
+ * @param port_id
+ *   Port (ethdev) identifier
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   - (-EBUSY) if the port can't handle this in its current state;
+ *   - (-ENOTSUP) if the method itself is not supported by the PMD;
+ *   - (-ENODEV) if *port_id* is invalid;
+ *   - (-EINVAL) if *features* is NULL;
+ *   - (-EIO) if the device is removed;
+ *   - (0) on success
+ */
+__rte_experimental
+int rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features);
+
 #include <rte_ethdev_core.h>
 
 /**
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 7b1ed7f110..75656ff9f8 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -1904,6 +1904,10 @@ enum rte_flow_action_type {
 	 * PKT_RX_FDIR_ID mbuf flags.
 	 *
 	 * See struct rte_flow_action_mark.
+	 *
+	 * One should negotiate mark delivery from the NIC to the PMD.
+	 * @see rte_eth_rx_metadata_negotiate()
+	 * @see RTE_ETH_RX_METADATA_USER_MARK
 	 */
 	RTE_FLOW_ACTION_TYPE_MARK,
 
@@ -1912,6 +1916,10 @@ enum rte_flow_action_type {
 	 * sets the PKT_RX_FDIR mbuf flag.
 	 *
 	 * No associated configuration structure.
+	 *
+	 * One should negotiate flag delivery from the NIC to the PMD.
+	 * @see rte_eth_rx_metadata_negotiate()
+	 * @see RTE_ETH_RX_METADATA_USER_FLAG
 	 */
 	RTE_FLOW_ACTION_TYPE_FLAG,
 
@@ -4223,6 +4231,10 @@ rte_flow_tunnel_match(uint16_t port_id,
 /**
  * Populate the current packet processing state, if exists, for the given mbuf.
  *
+ * One should negotiate tunnel metadata delivery from the NIC to the PMD.
+ * @see rte_eth_rx_metadata_negotiate()
+ * @see RTE_ETH_RX_METADATA_TUNNEL_ID
+ *
  * @param port_id
  *   Port identifier of Ethernet device.
  * @param[in] m
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 904bce6ea1..2e638c680e 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -247,6 +247,9 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_eth_rx_metadata_negotiate;
 };
 
 INTERNAL {
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v5 2/5] net/sfc: support API to negotiate delivery of Rx metadata
  2021-10-05 15:56   ` [dpdk-dev] [PATCH v5 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
@ 2021-10-05 15:56     ` Ivan Malov
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-05 15:56 UTC (permalink / raw)
  To: dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton

Initial support for the method. Later patches will extend it to
make FLAG and MARK delivery available on EF100 native datapath.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
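A simplified sketch of the driver-side logic in this patch, for readers who
do not have the sfc sources at hand. The demo_* names and the DEMO_DP_RX_FEAT_*
values are hypothetical; the real driver uses struct sfc_adapter, the
SFC_DP_RX_FEAT_* bits and the adapter lock, none of which are modelled here.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as defined in patch 1/5. */
#define RTE_ETH_RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_METADATA_USER_MARK (UINT64_C(1) << 1)

/* Hypothetical datapath feature bits (not the real SFC_DP_RX_FEAT_* values). */
#define DEMO_DP_RX_FEAT_FLOW_FLAG 0x1u
#define DEMO_DP_RX_FEAT_FLOW_MARK 0x2u

/* Hypothetical, cut-down adapter state. */
struct demo_adapter {
	unsigned int dp_rx_features;
	uint64_t negotiated_rx_metadata;
};

/* Intersect the request with what the Rx datapath can deliver. */
static int
demo_rx_metadata_negotiate(struct demo_adapter *sa, uint64_t *features)
{
	uint64_t supported = 0;

	if ((sa->dp_rx_features & DEMO_DP_RX_FEAT_FLOW_FLAG) != 0)
		supported |= RTE_ETH_RX_METADATA_USER_FLAG;

	if ((sa->dp_rx_features & DEMO_DP_RX_FEAT_FLOW_MARK) != 0)
		supported |= RTE_ETH_RX_METADATA_USER_MARK;

	sa->negotiated_rx_metadata = supported & *features;
	*features = sa->negotiated_rx_metadata;

	return 0;
}

/* Flow parsing rejects an action whose delivery was not negotiated. */
static int
demo_flow_check_action(const struct demo_adapter *sa, uint64_t metadata_bit)
{
	if ((sa->negotiated_rx_metadata & metadata_bit) == 0)
		return -ENOTSUP;

	return 0;
}
```

The second helper mirrors the new "has not been negotiated" branches added to
sfc_flow.c and sfc_mae.c, with rte_flow_error_set() left out for brevity.
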
 drivers/net/sfc/sfc.h        |  2 ++
 drivers/net/sfc/sfc_ethdev.c | 29 +++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_flow.c   | 13 +++++++++++++
 drivers/net/sfc/sfc_mae.c    | 22 ++++++++++++++++++++--
 4 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 331e06bac6..079216c1fb 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -312,6 +312,8 @@ struct sfc_adapter {
 	boolean_t			tso;
 	boolean_t			tso_encap;
 
+	uint64_t			negotiated_rx_metadata;
+
 	uint32_t			rxd_wait_timeout_ns;
 };
 
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 2db0d000c3..00b2c84b46 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1859,6 +1859,28 @@ sfc_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t ethdev_qid)
 	return sap->dp_rx->intr_disable(rxq_info->dp);
 }
 
+static int
+sfc_rx_metadata_negotiate(struct rte_eth_dev *dev, uint64_t *features)
+{
+	struct sfc_adapter *sa = sfc_adapter_by_eth_dev(dev);
+	uint64_t supported = 0;
+
+	sfc_adapter_lock(sa);
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_FLAG) != 0)
+		supported |= RTE_ETH_RX_METADATA_USER_FLAG;
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_MARK) != 0)
+		supported |= RTE_ETH_RX_METADATA_USER_MARK;
+
+	sa->negotiated_rx_metadata = supported & *features;
+	*features = sa->negotiated_rx_metadata;
+
+	sfc_adapter_unlock(sa);
+
+	return 0;
+}
+
 static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.dev_configure			= sfc_dev_configure,
 	.dev_start			= sfc_dev_start,
@@ -1906,6 +1928,7 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.xstats_get_by_id		= sfc_xstats_get_by_id,
 	.xstats_get_names_by_id		= sfc_xstats_get_names_by_id,
 	.pool_ops_supported		= sfc_pool_ops_supported,
+	.rx_metadata_negotiate		= sfc_rx_metadata_negotiate,
 };
 
 /**
@@ -1998,6 +2021,12 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 		goto fail_dp_rx_name;
 	}
 
+	if (strcmp(dp_rx->dp.name, SFC_KVARG_DATAPATH_EF10_ESSB) == 0) {
+		/* FLAG and MARK are always available from Rx prefix. */
+		sa->negotiated_rx_metadata |= RTE_ETH_RX_METADATA_USER_FLAG;
+		sa->negotiated_rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+	}
+
 	sfc_notice(sa, "use %s Rx datapath", sas->dp_rx_name);
 
 	rc = sfc_kvargs_process(sa, SFC_KVARG_TX_DATAPATH,
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 4f5993a68d..1f54bea3d9 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -1760,6 +1760,7 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 	struct sfc_flow_spec *spec = &flow->spec;
 	struct sfc_flow_spec_filter *spec_filter = &spec->filter;
 	const unsigned int dp_rx_features = sa->priv.dp_rx->features;
+	const uint64_t rx_metadata = sa->negotiated_rx_metadata;
 	uint32_t actions_set = 0;
 	const uint32_t fate_actions_mask = (1UL << RTE_FLOW_ACTION_TYPE_QUEUE) |
 					   (1UL << RTE_FLOW_ACTION_TYPE_RSS) |
@@ -1832,6 +1833,12 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
 					"FLAG action is not supported on the current Rx datapath");
 				return -rte_errno;
+			} else if ((rx_metadata &
+				    RTE_ETH_RX_METADATA_USER_FLAG) == 0) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					"flag delivery has not been negotiated");
+				return -rte_errno;
 			}
 
 			spec_filter->template.efs_flags |=
@@ -1849,6 +1856,12 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
 					"MARK action is not supported on the current Rx datapath");
 				return -rte_errno;
+			} else if ((rx_metadata &
+				    RTE_ETH_RX_METADATA_USER_MARK) == 0) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					"mark delivery has not been negotiated");
+				return -rte_errno;
 			}
 
 			rc = sfc_flow_parse_mark(sa, actions->conf, flow);
diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index 4b520bc619..63b917a323 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -2963,6 +2963,7 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 			  efx_mae_actions_t *spec,
 			  struct rte_flow_error *error)
 {
+	const uint64_t rx_metadata = sa->negotiated_rx_metadata;
 	bool custom_error = B_FALSE;
 	int rc = 0;
 
@@ -3012,12 +3013,29 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 	case RTE_FLOW_ACTION_TYPE_FLAG:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_FLAG,
 				       bundle->actions_mask);
-		rc = efx_mae_action_set_populate_flag(spec);
+		if ((rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG) != 0) {
+			rc = efx_mae_action_set_populate_flag(spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"flag delivery has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_MARK:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_MARK,
 				       bundle->actions_mask);
-		rc = sfc_mae_rule_parse_action_mark(sa, action->conf, spec);
+		if ((rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0) {
+			rc = sfc_mae_rule_parse_action_mark(sa, action->conf,
+							    spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"mark delivery has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_PHY_PORT:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_PHY_PORT,
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v5 3/5] net/sfc: support flow mark delivery on EF100 native datapath
  2021-10-05 15:56   ` [dpdk-dev] [PATCH v5 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
@ 2021-10-05 15:56     ` Ivan Malov
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-05 15:56 UTC (permalink / raw)
  To: dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton

MAE counter engine gets generation counts by virtue of the mark,
so the code to extract the field is already in place, but flow
action MARK doesn't benefit from it. Support this use case, too.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc_ef100_rx.c | 1 +
 drivers/net/sfc/sfc_rx.c       | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index 1bf04f565a..b634c8f23a 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -914,6 +914,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR,
 	.dev_offload_capa	= 0,
 	.queue_offload_capa	= DEV_RX_OFFLOAD_CHECKSUM |
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 280e8a61f9..5b924010bd 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
+
 	rc = sfc_ev_qinit(sa, SFC_EVQ_TYPE_RX, sw_index,
 			  evq_entries, socket_id, &evq);
 	if (rc != 0)
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v5 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  2021-10-05 15:56   ` [dpdk-dev] [PATCH v5 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
                       ` (2 preceding siblings ...)
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
@ 2021-10-05 15:56     ` Ivan Malov
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-05 15:56 UTC (permalink / raw)
  To: dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton

Add an RxQ flag to request support for user flag field of Rx
prefix. The feature is supported only on EF100 and EF10 ESSB.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 3 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/common/sfc_efx/base/ef10_rx.c b/drivers/common/sfc_efx/base/ef10_rx.c
index 0c3f9413cf..a658e0dba2 100644
--- a/drivers/common/sfc_efx/base/ef10_rx.c
+++ b/drivers/common/sfc_efx/base/ef10_rx.c
@@ -930,6 +930,10 @@ ef10_rx_qcreate(
 			rc = ENOTSUP;
 			goto fail2;
 		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail3;
+		}
 		/*
 		 * Ignore EFX_RXQ_FLAG_RSS_HASH since if RSS hash is calculated
 		 * it is always delivered from HW in the pseudo-header.
@@ -940,7 +944,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_packed_stream_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail3;
+			goto fail4;
 		}
 		switch (type_data->ertd_packed_stream.eps_buf_size) {
 		case EFX_RXQ_PACKED_STREAM_BUF_SIZE_1M:
@@ -960,17 +964,21 @@ ef10_rx_qcreate(
 			break;
 		default:
 			rc = ENOTSUP;
-			goto fail4;
+			goto fail5;
 		}
 		erp->er_buf_size = type_data->ertd_packed_stream.eps_buf_size;
 		/* Packed stream pseudo header does not have RSS hash value */
 		if (flags & EFX_RXQ_FLAG_RSS_HASH) {
 			rc = ENOTSUP;
-			goto fail5;
+			goto fail6;
 		}
 		if (flags & EFX_RXQ_FLAG_USER_MARK) {
 			rc = ENOTSUP;
-			goto fail6;
+			goto fail7;
+		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail8;
 		}
 		break;
 #endif /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -979,7 +987,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_essb_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail7;
+			goto fail9;
 		}
 		params.es_bufs_per_desc =
 		    type_data->ertd_es_super_buffer.eessb_bufs_per_desc;
@@ -997,7 +1005,7 @@ ef10_rx_qcreate(
 #endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
 	default:
 		rc = ENOTSUP;
-		goto fail8;
+		goto fail10;
 	}
 
 #if EFSYS_OPT_RX_PACKED_STREAM
@@ -1005,13 +1013,13 @@ ef10_rx_qcreate(
 		/* Check if datapath firmware supports packed stream mode */
 		if (encp->enc_rx_packed_stream_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail9;
+			goto fail11;
 		}
 		/* Check if packed stream allows configurable buffer sizes */
 		if ((params.ps_buf_size != MC_CMD_INIT_RXQ_EXT_IN_PS_BUFF_1M) &&
 		    (encp->enc_rx_var_packed_stream_supported == B_FALSE)) {
 			rc = ENOTSUP;
-			goto fail10;
+			goto fail12;
 		}
 	}
 #else /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -1022,17 +1030,17 @@ ef10_rx_qcreate(
 	if (params.es_bufs_per_desc > 0) {
 		if (encp->enc_rx_es_super_buffer_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail11;
+			goto fail13;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_max_dma_len,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail12;
+			goto fail14;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_buf_stride,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail13;
+			goto fail15;
 		}
 	}
 #else /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
@@ -1041,7 +1049,7 @@ ef10_rx_qcreate(
 
 	if (flags & EFX_RXQ_FLAG_INGRESS_MPORT) {
 		rc = ENOTSUP;
-		goto fail14;
+		goto fail16;
 	}
 
 	/* Scatter can only be disabled if the firmware supports doing so */
@@ -1057,7 +1065,7 @@ ef10_rx_qcreate(
 
 	if ((rc = efx_mcdi_init_rxq(enp, ndescs, eep, label, index,
 		    esmp, &params)) != 0)
-		goto fail15;
+		goto fail17;
 
 	erp->er_eep = eep;
 	erp->er_label = label;
@@ -1070,40 +1078,44 @@ ef10_rx_qcreate(
 
 	return (0);
 
+fail17:
+	EFSYS_PROBE(fail17);
+fail16:
+	EFSYS_PROBE(fail16);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail15:
 	EFSYS_PROBE(fail15);
 fail14:
 	EFSYS_PROBE(fail14);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail13:
 	EFSYS_PROBE(fail13);
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail12:
 	EFSYS_PROBE(fail12);
 fail11:
 	EFSYS_PROBE(fail11);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail10:
 	EFSYS_PROBE(fail10);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail9:
 	EFSYS_PROBE(fail9);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail8:
 	EFSYS_PROBE(fail8);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail7:
 	EFSYS_PROBE(fail7);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
 fail6:
 	EFSYS_PROBE(fail6);
 fail5:
 	EFSYS_PROBE(fail5);
 fail4:
 	EFSYS_PROBE(fail4);
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail3:
 	EFSYS_PROBE(fail3);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail2:
 	EFSYS_PROBE(fail2);
 fail1:
diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h
index 24e1314cc3..bed1029f59 100644
--- a/drivers/common/sfc_efx/base/efx.h
+++ b/drivers/common/sfc_efx/base/efx.h
@@ -3007,6 +3007,10 @@ typedef enum efx_rxq_type_e {
  * Request user mark field in the Rx prefix of a queue.
  */
 #define	EFX_RXQ_FLAG_USER_MARK		0x10
+/*
+ * Request user flag field in the Rx prefix of a queue.
+ */
+#define	EFX_RXQ_FLAG_USER_FLAG		0x20
 
 LIBEFX_API
 extern	__checkReturn	efx_rc_t
diff --git a/drivers/common/sfc_efx/base/rhead_rx.c b/drivers/common/sfc_efx/base/rhead_rx.c
index 76b8ce302a..9d3258b503 100644
--- a/drivers/common/sfc_efx/base/rhead_rx.c
+++ b/drivers/common/sfc_efx/base/rhead_rx.c
@@ -635,6 +635,9 @@ rhead_rx_qcreate(
 	if (flags & EFX_RXQ_FLAG_USER_MARK)
 		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_MARK;
 
+	if (flags & EFX_RXQ_FLAG_USER_FLAG)
+		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_FLAG;
+
 	/*
 	 * LENGTH is required in EF100 host interface, as receive events
 	 * do not include the packet length.
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v5 5/5] net/sfc: report user flag on EF100 native datapath
  2021-10-05 15:56   ` [dpdk-dev] [PATCH v5 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
                       ` (3 preceding siblings ...)
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
@ 2021-10-05 15:56     ` Ivan Malov
  2021-10-12 18:08       ` Ferruh Yigit
  4 siblings, 1 reply; 97+ messages in thread
From: Ivan Malov @ 2021-10-05 15:56 UTC (permalink / raw)
  To: dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton

Detect the flag in Rx prefix and pass it to users.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
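A rough sketch of the two mappings this patch relies on: negotiated metadata
to RxQ type flags, and the Rx prefix user flag to an mbuf offload flag. The
EFX_RXQ_FLAG_* values come from patches in this series; DEMO_PKT_RX_FDIR and
the demo_* helpers are hypothetical and stand in for PKT_RX_FDIR and the
driver code, which reads the field from the Rx prefix via EFX_OWORD_FIELD().

```c
#include <stdint.h>

/* Flag values as defined in patch 1/5. */
#define RTE_ETH_RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_METADATA_USER_MARK (UINT64_C(1) << 1)

/* RxQ flag values as they appear in efx.h in this series. */
#define EFX_RXQ_FLAG_USER_MARK 0x10
#define EFX_RXQ_FLAG_USER_FLAG 0x20

/* Hypothetical placeholder for the PKT_RX_FDIR mbuf flag. */
#define DEMO_PKT_RX_FDIR (UINT64_C(1) << 2)

/* Translate the negotiated metadata set into RxQ type flags. */
static unsigned int
demo_rxq_type_flags(uint64_t negotiated_rx_metadata)
{
	unsigned int type_flags = 0;

	if ((negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG) != 0)
		type_flags |= EFX_RXQ_FLAG_USER_FLAG;

	if ((negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0)
		type_flags |= EFX_RXQ_FLAG_USER_MARK;

	return type_flags;
}

/*
 * Report the flag to the user only if the queue was started with the
 * user flag field enabled and the field is non-zero for this packet.
 */
static uint64_t
demo_prefix_to_ol_flags(int rxq_user_flag_enabled, uint32_t user_flag_field)
{
	uint64_t ol_flags = 0;

	if (rxq_user_flag_enabled && user_flag_field != 0)
		ol_flags |= DEMO_PKT_RX_FDIR;

	return ol_flags;
}
```
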
 drivers/net/sfc/sfc_ef100_rx.c | 18 ++++++++++++++++++
 drivers/net/sfc/sfc_rx.c       |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index b634c8f23a..7d0d6b3d00 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -62,6 +62,7 @@ struct sfc_ef100_rxq {
 #define SFC_EF100_RXQ_RSS_HASH		0x10
 #define SFC_EF100_RXQ_USER_MARK		0x20
 #define SFC_EF100_RXQ_FLAG_INTR_EN	0x40
+#define SFC_EF100_RXQ_USER_FLAG		0x80
 	unsigned int			ptr_mask;
 	unsigned int			evq_phase_bit_shift;
 	unsigned int			ready_pkts;
@@ -371,6 +372,7 @@ static const efx_rx_prefix_layout_t sfc_ef100_rx_prefix_layout = {
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH_VALID, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(CLASS, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH, B_FALSE),
+		SFC_EF100_RX_PREFIX_FIELD(USER_FLAG, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(USER_MARK, B_FALSE),
 
 #undef	SFC_EF100_RX_PREFIX_FIELD
@@ -407,6 +409,15 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
 					      ESF_GZ_RX_PREFIX_RSS_HASH);
 	}
 
+	if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
+		uint32_t user_flag;
+
+		user_flag = EFX_OWORD_FIELD(rx_prefix[0],
+					    ESF_GZ_RX_PREFIX_USER_FLAG);
+		if (user_flag != 0)
+			ol_flags |= PKT_RX_FDIR;
+	}
+
 	if (rxq->flags & SFC_EF100_RXQ_USER_MARK) {
 		uint32_t user_mark;
 
@@ -800,6 +811,12 @@ sfc_ef100_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr,
 	else
 		rxq->flags &= ~SFC_EF100_RXQ_RSS_HASH;
 
+	if ((unsup_rx_prefix_fields &
+	     (1U << EFX_RX_PREFIX_FIELD_USER_FLAG)) == 0)
+		rxq->flags |= SFC_EF100_RXQ_USER_FLAG;
+	else
+		rxq->flags &= ~SFC_EF100_RXQ_USER_FLAG;
+
 	if ((unsup_rx_prefix_fields &
 	     (1U << EFX_RX_PREFIX_FIELD_USER_MARK)) == 0)
 		rxq->flags |= SFC_EF100_RXQ_USER_MARK;
@@ -914,6 +931,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_FLAG |
 				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR,
 	.dev_offload_capa	= 0,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 5b924010bd..5e120f5851 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_FLAG;
+
 	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
 
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v5 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
@ 2021-10-05 21:40       ` Ajit Khaparde
  2021-10-06  6:04         ` Somnath Kotur
  0 siblings, 1 reply; 97+ messages in thread
From: Ajit Khaparde @ 2021-10-05 21:40 UTC (permalink / raw)
  To: Ivan Malov
  Cc: dpdk-dev, Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Andrew Rybchenko, Andy Moreton, Wisam Jaddo, Xiaoyun Li,
	Ferruh Yigit

On Tue, Oct 5, 2021 at 8:56 AM Ivan Malov <ivan.malov@oktetlabs.ru> wrote:
>
> Provide an API to let the application control the NIC's ability
> to deliver specific kinds of per-packet metadata to the PMD.
>
> Checks for the NIC's ability to set these kinds of metadata
> in the first place (support for the flow actions) belong in
> flow API responsibility domain (flow validate mechanism).
> This topic is out of scope of the new API in question.
>
> The PMD's ability to deliver received metadata to the user
> by virtue of mbuf fields should be covered by mbuf library.
> It is also out of scope of the new API in question.
>
> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> Acked-by: Jerin Jacob <jerinj@marvell.com>

Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v5 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD
  2021-10-05 21:40       ` Ajit Khaparde
@ 2021-10-06  6:04         ` Somnath Kotur
  2021-10-06  6:10           ` Ori Kam
  0 siblings, 1 reply; 97+ messages in thread
From: Somnath Kotur @ 2021-10-06  6:04 UTC (permalink / raw)
  To: Ajit Khaparde
  Cc: Ivan Malov, dpdk-dev, Ray Kinsella, Jerin Jacob, Thomas Monjalon,
	Ori Kam, Andrew Rybchenko, Andy Moreton, Wisam Jaddo, Xiaoyun Li,
	Ferruh Yigit

On Wed, Oct 6, 2021 at 3:10 AM Ajit Khaparde <ajit.khaparde@broadcom.com> wrote:
>
> On Tue, Oct 5, 2021 at 8:56 AM Ivan Malov <ivan.malov@oktetlabs.ru> wrote:
> >
> > Provide an API to let the application control the NIC's ability
> > to deliver specific kinds of per-packet metadata to the PMD.
> >
> > Checks for the NIC's ability to set these kinds of metadata
> > in the first place (support for the flow actions) belong in
> > flow API responsibility domain (flow validate mechanism).
> > This topic is out of scope of the new API in question.
> >
> > The PMD's ability to deliver received metadata to the user
> > by virtue of mbuf fields should be covered by mbuf library.
> > It is also out of scope of the new API in question.
> >
> > Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> > Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> > Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> > Acked-by: Ray Kinsella <mdr@ashroe.eu>
> > Acked-by: Jerin Jacob <jerinj@marvell.com>
>
> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v5 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD
  2021-10-06  6:04         ` Somnath Kotur
@ 2021-10-06  6:10           ` Ori Kam
  2021-10-06  7:22             ` Wisam Monther
  0 siblings, 1 reply; 97+ messages in thread
From: Ori Kam @ 2021-10-06  6:10 UTC (permalink / raw)
  To: Somnath Kotur, Ajit Khaparde
  Cc: Ivan Malov, dpdk-dev, Ray Kinsella, Jerin Jacob,
	NBU-Contact-Thomas Monjalon, Andrew Rybchenko, Andy Moreton,
	Wisam Monther, Xiaoyun Li, Ferruh Yigit

Hi Ivan,

> -----Original Message-----
> From: Somnath Kotur <somnath.kotur@broadcom.com>
> Sent: Wednesday, October 6, 2021 9:04 AM
> 
> On Wed, Oct 6, 2021 at 3:10 AM Ajit Khaparde
> <ajit.khaparde@broadcom.com> wrote:
> >
> > On Tue, Oct 5, 2021 at 8:56 AM Ivan Malov <ivan.malov@oktetlabs.ru>
> wrote:
> > >
> > > Provide an API to let the application control the NIC's ability
> > > to deliver specific kinds of per-packet metadata to the PMD.
> > >
> > > Checks for the NIC's ability to set these kinds of metadata
> > > in the first place (support for the flow actions) belong in
> > > flow API responsibility domain (flow validate mechanism).
> > > This topic is out of scope of the new API in question.
> > >
> > > The PMD's ability to deliver received metadata to the user
> > > by virtue of mbuf fields should be covered by mbuf library.
> > > It is also out of scope of the new API in question.
> > >
> > > Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> > > Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> > > Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> > > Acked-by: Ray Kinsella <mdr@ashroe.eu>
> > > Acked-by: Jerin Jacob <jerinj@marvell.com>
> >
> > Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>

Thanks,
Ori

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v5 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD
  2021-10-06  6:10           ` Ori Kam
@ 2021-10-06  7:22             ` Wisam Monther
  0 siblings, 0 replies; 97+ messages in thread
From: Wisam Monther @ 2021-10-06  7:22 UTC (permalink / raw)
  To: Ori Kam, Somnath Kotur, Ajit Khaparde
  Cc: Ivan Malov, dpdk-dev, Ray Kinsella, Jerin Jacob,
	NBU-Contact-Thomas Monjalon, Andrew Rybchenko, Andy Moreton,
	Xiaoyun Li, Ferruh Yigit

Hi,

> -----Original Message-----
> From: Ori Kam <orika@nvidia.com>
> Sent: Wednesday, October 6, 2021 9:10 AM
> To: Somnath Kotur <somnath.kotur@broadcom.com>; Ajit Khaparde
> <ajit.khaparde@broadcom.com>
> Cc: Ivan Malov <ivan.malov@oktetlabs.ru>; dpdk-dev <dev@dpdk.org>; Ray
> Kinsella <mdr@ashroe.eu>; Jerin Jacob <jerinj@marvell.com>; NBU-Contact-
> Thomas Monjalon <thomas@monjalon.net>; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>; Andy Moreton
> <amoreton@xilinx.com>; Wisam Monther <wisamm@nvidia.com>; Xiaoyun
> Li <xiaoyun.li@intel.com>; Ferruh Yigit <ferruh.yigit@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v5 1/5] ethdev: negotiate delivery of packet
> metadata from HW to PMD
> 
> Hi Ivan,
> 
> > -----Original Message-----
> > From: Somnath Kotur <somnath.kotur@broadcom.com>
> > Sent: Wednesday, October 6, 2021 9:04 AM
> >
> > On Wed, Oct 6, 2021 at 3:10 AM Ajit Khaparde
> > <ajit.khaparde@broadcom.com> wrote:
> > >
> > > On Tue, Oct 5, 2021 at 8:56 AM Ivan Malov <ivan.malov@oktetlabs.ru>
> > wrote:
> > > >
> > > > Provide an API to let the application control the NIC's ability to
> > > > deliver specific kinds of per-packet metadata to the PMD.
> > > >
> > > > Checks for the NIC's ability to set these kinds of metadata in the
> > > > first place (support for the flow actions) belong in flow API
> > > > responsibility domain (flow validate mechanism).
> > > > This topic is out of scope of the new API in question.
> > > >
> > > > The PMD's ability to deliver received metadata to the user by
> > > > virtue of mbuf fields should be covered by mbuf library.
> > > > It is also out of scope of the new API in question.
> > > >
> > > > Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> > > > Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> > > > Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> > > > Acked-by: Ray Kinsella <mdr@ashroe.eu>
> > > > Acked-by: Jerin Jacob <jerinj@marvell.com>
> > >
> > > Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> > Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Acked-by: Ori Kam <orika@nvidia.com>

Acked-by: Wisam Jaddo <wisamm@nvidia.com>


BRs,
Wisam Jaddo

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-05 11:11                                           ` Andrew Rybchenko
@ 2021-10-06  8:30                                             ` Thomas Monjalon
  2021-10-06  8:38                                               ` Andrew Rybchenko
  0 siblings, 1 reply; 97+ messages in thread
From: Thomas Monjalon @ 2021-10-06  8:30 UTC (permalink / raw)
  To: Ori Kam, Andrew Rybchenko
  Cc: Ivan Malov, Andy Moreton, Ray Kinsella, dev, Jerin Jacob,
	Wisam Monther, Xiaoyun Li, Ferruh Yigit

05/10/2021 13:11, Andrew Rybchenko:
> On 10/5/21 1:10 PM, Ori Kam wrote:
> > From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >> On 10/5/21 12:41 PM, Ori Kam wrote:
> >>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>>> On 10/5/21 11:17 AM, Ori Kam wrote:
> >>>>> One more thing, I think this flag should be added now since you need
> >>>>> it, I think you should report that you don't support it.
> >>>>> since just like we talked there is no real difference between metadata and
> >> MARK.
> >>>>> What do you think?
> >>>>
> >>>> It sounds like a trick :) Negative support is *not* a support in
> >>>> fact. DPDK policy requires support of a feature in a PMD and in-tree
> >>>> application. Of course, it is not a problem to add meta. It is really
> >>>> easy to do. I just don't want to add it in
> >>>> v5 to be deleted in v6 because of my above concerns.
> >>>>
> >>> This was not a trick. I understand what you are saying.
> >>> if we say that metadata is the same as mark, (I think we all agree on
> >>> it) and that application need to notify pmd about such operations, I
> >>> assume it will try to see how to request the metadata.
> >>
> >> Frankly speaking I feel sick when I think about META and MARK together. Do
> >> we really need both in DPDK?
> >>
> > I really don't want you to be sick.
> > The reason we need both of them is that the 32 bits of mark (on Nvidia it is only 24 bits)
> > are not enough, so there is a need for more bits.
> > I think that in the end we will move to something much more generic, where the application
> > just says how many bits it wants to get, and that is what it will get.
> > For example, the application may say it needs 128 bits, and it will register this size with the mbuf
> > or give a pointer in the mbuf to where those values should be set.
> > In any case, as you can see, we already have too many changes in rte_flow in this release and the
> > next one, but I'm planning to push this feature in the future.
> > What do you think of such a feature?
> 
> I agree that there are really many changes in flow API which
> are on review in the release cycle.
> I hope the above idea will allow to merge MARK and META.

I agree we should merge mark and meta in a common dynamic mbuf field.
What do we need in mark which is not in meta?
I think dynamic mbuf field of meta is the way to go but I prefer the name "mark" :)



^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-06  8:30                                             ` Thomas Monjalon
@ 2021-10-06  8:38                                               ` Andrew Rybchenko
  2021-10-06  9:14                                                 ` Ori Kam
  0 siblings, 1 reply; 97+ messages in thread
From: Andrew Rybchenko @ 2021-10-06  8:38 UTC (permalink / raw)
  To: Thomas Monjalon, Ori Kam
  Cc: Ivan Malov, Andy Moreton, Ray Kinsella, dev, Jerin Jacob,
	Wisam Monther, Xiaoyun Li, Ferruh Yigit

On 10/6/21 11:30 AM, Thomas Monjalon wrote:
> 05/10/2021 13:11, Andrew Rybchenko:
>> On 10/5/21 1:10 PM, Ori Kam wrote:
>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>> On 10/5/21 12:41 PM, Ori Kam wrote:
>>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>>>> On 10/5/21 11:17 AM, Ori Kam wrote:
>>>>>>> One more thing, I think this flag should be added now since you need
>>>>>>> it, I think you should report that you don't support it.
>>>>>>> since just like we talked there is no real difference between metadata and
>>>> MARK.
>>>>>>> What do you think?
>>>>>>
>>>>>> It sounds like a trick :) Negative support is *not* a support in
>>>>>> fact. DPDK policy requires support of a feature in a PMD and in-tree
>>>>>> application. Of course, it is not a problem to add meta. It is really
>>>>>> easy to do. I just don't want to add it in
>>>>>> v5 to be deleted in v6 because of my above concerns.
>>>>>>
>>>>> This was not a trick. I understand what you are saying.
>>>>> if we say that metadata is the same as mark, (I think we all agree on
>>>>> it) and that application need to notify pmd about such operations, I
>>>>> assume it will try to see how to request the metadata.
>>>>
>>>> Frankly speaking I feel sick when I think about META and MARK together. Do
>>>> we really need both in DPDK?
>>>>
>>> I really don't want you to be sick.
>>> The reason we need both of them is that the 32 bits of mark (on Nvidia it is only 24 bits)
>>> are not enough, so there is a need for more bits.
>>> I think that in the end we will move to something much more generic, where the application
>>> just says how many bits it wants to get, and that is what it will get.
>>> For example, the application may say it needs 128 bits, and it will register this size with the mbuf
>>> or give a pointer in the mbuf to where those values should be set.
>>> In any case, as you can see, we already have too many changes in rte_flow in this release and the
>>> next one, but I'm planning to push this feature in the future.
>>> What do you think of such a feature?
>>
>> I agree that there are really many changes in flow API which
>> are on review in the release cycle.
>> I hope the above idea will allow to merge MARK and META.
> 
> I agree we should merge mark and meta in a common dynamic mbuf field.
> What do we need in mark which is not in meta?
> I think dynamic mbuf field of meta is the way to go but I prefer the name "mark" :)
> 

+1, but I don't have an answer to the question.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add API to negotiate delivery of Rx meta data
  2021-10-06  8:38                                               ` Andrew Rybchenko
@ 2021-10-06  9:14                                                 ` Ori Kam
  0 siblings, 0 replies; 97+ messages in thread
From: Ori Kam @ 2021-10-06  9:14 UTC (permalink / raw)
  To: Andrew Rybchenko, NBU-Contact-Thomas Monjalon
  Cc: Ivan Malov, Andy Moreton, Ray Kinsella, dev, Jerin Jacob,
	Wisam Monther, Xiaoyun Li, Ferruh Yigit

Hi

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Wednesday, October 6, 2021 11:38 AM
> 
> On 10/6/21 11:30 AM, Thomas Monjalon wrote:
> > 05/10/2021 13:11, Andrew Rybchenko:
> >> On 10/5/21 1:10 PM, Ori Kam wrote:
> >>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>>> On 10/5/21 12:41 PM, Ori Kam wrote:
> >>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>>>>> On 10/5/21 11:17 AM, Ori Kam wrote:
> >>>>>>> One more thing, I think this flag should be added now since you
> >>>>>>> need it, I think you should report that you don't support it.
> >>>>>>> since just like we talked there is no real difference between
> >>>>>>> metadata and
> >>>> MARK.
> >>>>>>> What do you think?
> >>>>>>
> >>>>>> It sounds like a trick :) Negative support is *not* a support in
> >>>>>> fact. DPDK policy requires support of a feature in a PMD and
> >>>>>> in-tree application. Of course, it is not a problem to add meta.
> >>>>>> It is really easy to do. I just don't want to add it in
> >>>>>> v5 to be deleted in v6 because of my above concerns.
> >>>>>>
> >>>>> This was not a trick. I understand what you are saying.
> >>>>> if we say that metadata is the same as mark, (I think we all agree
> >>>>> on
> >>>>> it) and that application need to notify pmd about such operations,
> >>>>> I assume it will try to see how to request the metadata.
> >>>>
> >>>> Frankly speaking I feel sick when I think about META and MARK
> >>>> together. Do we really need both in DPDK?
> >>>>
> >>> I really don't want you to be sick.
> >>> The reason we need both of them is that the 32 bits of mark (on
> >>> Nvidia it is only 24 bits) are not enough, so there is a need for
> >>> more bits.
> >>> I think that in the end we will move to something much more generic,
> >>> where the application just says how many bits it wants to get, and
> >>> that is what it will get.
> >>> For example, the application may say it needs 128 bits, and it will
> >>> register this size with the mbuf or give a pointer in the mbuf to
> >>> where those values should be set.
> >>> In any case, as you can see, we already have too many changes in
> >>> rte_flow in this release and the next one, but I'm planning to push
> >>> this feature in the future. What do you think of such a feature?
> >>
> >> I agree that there are really many changes in flow API which are on
> >> review in the release cycle.
> >> I hope the above idea will allow to merge MARK and META.
> >
> > I agree we should merge mark and meta in a common dynamic mbuf field.
> > What do we need in mark which is not in meta?
> > I think dynamic mbuf field of meta is the way to go but I prefer the
> > name "mark" :)
> >
> 
> +1, but I don't have an answer to the question.

We have MARK, FLAG, and META.
MARK and FLAG are the same; one of them just gives a predefined value.
We should merge those two for sure.
META allows the application to get more bits from the HW.
Like I said above, I think we should merge everything,
but this is a talk for a different thread and a different time.

Ori
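
To illustrate the point that FLAG behaves like MARK with a predefined
value, here is a minimal sketch in C. It is not DPDK code: struct
mock_mbuf, the MOCK_RX_* bits and the mock_* helpers are hypothetical
stand-ins, and the bit positions are illustrative only. It assumes the
behaviour documented in rte_flow.h, where FLAG sets PKT_RX_FDIR and
MARK additionally sets PKT_RX_FDIR_ID and attaches an integer value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the PKT_RX_FDIR and PKT_RX_FDIR_ID mbuf
 * flags; the bit positions here are illustrative, not the real ones.
 */
#define MOCK_RX_FDIR    (UINT64_C(1) << 0)
#define MOCK_RX_FDIR_ID (UINT64_C(1) << 1)

/* Hypothetical, heavily simplified model of an Rx mbuf. */
struct mock_mbuf {
	uint64_t ol_flags;
	uint32_t mark;
};

/* FLAG: the packet matched a rule; no value is attached. */
static void
mock_apply_flag(struct mock_mbuf *m)
{
	m->ol_flags |= MOCK_RX_FDIR;
}

/* MARK: the packet matched a rule and carries an application-chosen ID. */
static void
mock_apply_mark(struct mock_mbuf *m, uint32_t id)
{
	m->ol_flags |= MOCK_RX_FDIR | MOCK_RX_FDIR_ID;
	m->mark = id;
}

/* Returns true and stores the ID only if a mark value is present. */
static bool
mock_get_mark(const struct mock_mbuf *m, uint32_t *id)
{
	assert(m != NULL && id != NULL);
	if (!(m->ol_flags & MOCK_RX_FDIR_ID))
		return false;
	*id = m->mark;
	return true;
}
```

In this model a flagged packet and a marked packet share the "matched"
bit, and the only difference is whether a value travels with it, which
is why merging the two looks natural.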

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v5 5/5] net/sfc: report user flag on EF100 native datapath
  2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
@ 2021-10-12 18:08       ` Ferruh Yigit
  2021-10-12 19:39         ` Ivan Malov
  2021-10-12 19:48         ` Ivan Malov
  0 siblings, 2 replies; 97+ messages in thread
From: Ferruh Yigit @ 2021-10-12 18:08 UTC (permalink / raw)
  To: Ivan Malov, dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton

On 10/5/2021 4:56 PM, Ivan Malov wrote:
> Detect the flag in Rx prefix and pass it to users.
> 
> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Reviewed-by: Andy Moreton <amoreton@xilinx.com>

<...>

> @@ -407,6 +409,15 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
>   					      ESF_GZ_RX_PREFIX_RSS_HASH);
>   	}
>   
> +	if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
> +		uint32_t user_flag;
> +
> +		user_flag = EFX_OWORD_FIELD(rx_prefix[0],
> +					    ESF_GZ_RX_PREFIX_USER_FLAG);
> +		if (user_flag != 0)
> +			ol_flags |= PKT_RX_FDIR;
> +	}
> +

Hi Ivan,

This causes a build error after another sfc patch was merged into next-net [1].
The following change [2] seems to fix the issue, but to be sure nothing is missed,
can you please send a new version rebased on top of the latest next-net?


[1]
Commit d86c6ced8732 ("net/sfc: use xword type for EF100 Rx prefix")

[2]
diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index 704c62c0ac90..8237b772f151 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -415,7 +415,7 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
         if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
                 uint32_t user_flag;
  
-               user_flag = EFX_OWORD_FIELD(rx_prefix[0],
+               user_flag = EFX_XWORD_FIELD(rx_prefix[0],
                                             ESF_GZ_RX_PREFIX_USER_FLAG);
                 if (user_flag != 0)
                         ol_flags |= PKT_RX_FDIR;

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v6 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
                     ` (7 preceding siblings ...)
  2021-10-05 15:56   ` [dpdk-dev] [PATCH v5 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
@ 2021-10-12 19:38   ` Ivan Malov
  2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
                       ` (4 more replies)
  2021-10-12 19:46   ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
  9 siblings, 5 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:38 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

In 2019, commit [1] announced changes in the DEV_RX_OFFLOAD namespace,
intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
only the former has been added. The issue has not been solved.
Applications still assume that metadata features always work
and do not need to be configured in advance.

The team behind the net/sfc driver has given this problem more thought.
The conclusions reached are as follows.

1. Not all kinds of metadata can be represented by device offload flags.
   For instance, having the RSS_HASH flag is legitimate because the NIC is
   supposed to actually compute something when this feature is active.
   However, if a similar flag existed for Rx mark, requesting it would
   not make the NIC actually compute anything. The HW needs external
   stimuli (flow rules) in order to set the mark in the first place.

2. As a consequence of (1), it is apparent that the user's ability to
   use Rx metadata features is complex and consists of multiple parts:
   a) the NIC's ability to conduct the flow actions (set metadata);
   b) the NIC's ability to deliver metadata (if set) to the PMD;
   c) the PMD's ability to provide metadata received from the
      NIC to the user by virtue of filling out mbuf fields.

3. Aspects (2-a) and (2-c) are already addressed by the flow validate API
   and by dynamic mbuf field registration, respectively; hence,
   the only problem that really needs a solution is (2-b).
  
Patch [1/5] of this series adds a generic API to let the application
negotiate the NIC's ability to deliver specific kinds of metadata to
the PMD. This API is supposed to be invoked during the initialisation
period in order to let the PMD configure HW resources which might
be hard to (re-)configure in the adapter's started state without
causing traffic disruption and other unwanted consequences.
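
The negotiation flow described above can be sketched as follows. This
is a self-contained mock, not the real ethdev code: the feature bits
are copied from the rte_ethdev.h hunk of patch [1/5], while
mock_pmd_caps, mock_rx_metadata_negotiate() and app_negotiate_all()
are hypothetical names that only model the contract (request a set,
get back the guaranteed subset, -EBUSY once the port is configured).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Bit values copied from the rte_ethdev.h hunk of patch [1/5]. */
#define RTE_ETH_RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
#define RTE_ETH_RX_METADATA_USER_MARK (UINT64_C(1) << 1)
#define RTE_ETH_RX_METADATA_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical stand-in for a PMD: it can deliver FLAG and MARK only. */
static const uint64_t mock_pmd_caps =
	RTE_ETH_RX_METADATA_USER_FLAG | RTE_ETH_RX_METADATA_USER_MARK;

/*
 * Hypothetical mock of rte_eth_rx_metadata_negotiate(): trims the
 * requested set down to what the mock PMD guarantees.
 */
static int
mock_rx_metadata_negotiate(int dev_configured, uint64_t *features)
{
	if (dev_configured)
		return -EBUSY;
	if (features == NULL)
		return -EINVAL;
	*features &= mock_pmd_caps;
	return 0;
}

/* Application side: request everything, then inspect the response. */
static uint64_t
app_negotiate_all(void)
{
	uint64_t features = RTE_ETH_RX_METADATA_USER_FLAG |
			    RTE_ETH_RX_METADATA_USER_MARK |
			    RTE_ETH_RX_METADATA_TUNNEL_ID;
	int ret = mock_rx_metadata_negotiate(0, &features);

	/* Negotiation on an unconfigured port succeeds in this mock. */
	assert(ret == 0);
	if (!(features & RTE_ETH_RX_METADATA_TUNNEL_ID))
		printf("tunnel ID will not be delivered\n");
	return features;
}
```

With the real API, a return value of -ENOTSUP would additionally have
to be treated as "no negotiation needed", as the API comment in
patch [1/5] explains.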

[1] c5b2e78d1172 ("doc: announce ethdev API changes in offload flags")

Changes in v2:
* [1/5] has review notes from Jerin Jacob applied and the ack from Ray Kinsella added
* [2/5] has minor adjustments incorporated to follow changes in [1/5]

Changes in v3:
* [1/5] through [5/5] have review notes from Andy Moreton applied (mostly rewording)
* [1/5] has the ack from Jerin Jacob added

Changes in v4:
* [1/5] has the API contract clarified to address concerns raised by Ori Kam
* [1/5] has the API name fixed to use term "metadata" instead of "meta"
* [1/5] has testpmd loglevel changed as per the note by Ajit Khaparde
* [1/5] has testpmd code revisited to take multi-process into account
* [2/5] through [5/5] have the corresponding adjustments incorporated

Changes in v5:
* [1/5] has the API comment improved as per the note by Ori Kam

Changes in v6:
* Rebase as per request by Ferruh Yigit

Ivan Malov (5):
  ethdev: negotiate delivery of packet metadata from HW to PMD
  net/sfc: support API to negotiate delivery of Rx metadata
  net/sfc: support flow mark delivery on EF100 native datapath
  common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  net/sfc: report user flag on EF100 native datapath

 app/test-flow-perf/main.c              | 21 ++++++++++
 app/test-pmd/testpmd.c                 | 37 ++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  9 +++++
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 drivers/net/sfc/sfc.h                  |  2 +
 drivers/net/sfc/sfc_ef100_rx.c         | 19 +++++++++
 drivers/net/sfc/sfc_ethdev.c           | 29 ++++++++++++++
 drivers/net/sfc/sfc_flow.c             | 13 +++++++
 drivers/net/sfc/sfc_mae.c              | 22 ++++++++++-
 drivers/net/sfc/sfc_rx.c               |  6 +++
 lib/ethdev/ethdev_driver.h             | 22 +++++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++++
 lib/ethdev/rte_ethdev.h                | 54 ++++++++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 12 ++++++
 lib/ethdev/version.map                 |  3 ++
 17 files changed, 312 insertions(+), 23 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v6 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD
  2021-10-12 19:38   ` [dpdk-dev] [PATCH v6 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
@ 2021-10-12 19:38     ` Ivan Malov
  2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:38 UTC (permalink / raw)
  To: dev
  Cc: Ferruh Yigit, Andrew Rybchenko, Andy Moreton, Ray Kinsella,
	Jerin Jacob, Ajit Khaparde, Somnath Kotur, Ori Kam, Wisam Jaddo,
	Xiaoyun Li, Thomas Monjalon

Provide an API to let the application control the NIC's ability
to deliver specific kinds of per-packet metadata to the PMD.

Checks for the NIC's ability to set these kinds of metadata
in the first place (support for the flow actions) belong in
the flow API's responsibility domain (flow validate mechanism).
This topic is out of scope of the new API in question.

The PMD's ability to deliver received metadata to the user
by virtue of mbuf fields should be covered by the mbuf library.
It is also out of scope of the new API in question.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
---
 app/test-flow-perf/main.c              | 21 ++++++++++
 app/test-pmd/testpmd.c                 | 37 ++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  9 +++++
 lib/ethdev/ethdev_driver.h             | 22 +++++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++++
 lib/ethdev/rte_ethdev.h                | 54 ++++++++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 12 ++++++
 lib/ethdev/version.map                 |  3 ++
 8 files changed, 183 insertions(+)

diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
index 9be8edc31d..4d01791f6f 100644
--- a/app/test-flow-perf/main.c
+++ b/app/test-flow-perf/main.c
@@ -1760,6 +1760,27 @@ init_port(void)
 		rte_exit(EXIT_FAILURE, "Error: can't init mbuf pool\n");
 
 	for (port_id = 0; port_id < nr_ports; port_id++) {
+		uint64_t rx_metadata = 0;
+
+		rx_metadata |= RTE_ETH_RX_METADATA_USER_FLAG;
+		rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+
+		ret = rte_eth_rx_metadata_negotiate(port_id, &rx_metadata);
+		if (ret == 0) {
+			if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG)) {
+				printf(":: flow action FLAG will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+
+			if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+				printf(":: flow action MARK will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+		} else if (ret != -ENOTSUP) {
+			rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port=%u: %s\n",
+				 port_id, rte_strerror(-ret));
+		}
+
 		ret = rte_eth_dev_info_get(port_id, &dev_info);
 		if (ret != 0)
 			rte_exit(EXIT_FAILURE,
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 12a0db8796..a7841c557f 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -533,6 +533,41 @@ int proc_id;
  */
 unsigned int num_procs = 1;
 
+static void
+eth_rx_metadata_negotiate_mp(uint16_t port_id)
+{
+	uint64_t rx_meta_features = 0;
+	int ret;
+
+	if (!is_proc_primary())
+		return;
+
+	rx_meta_features |= RTE_ETH_RX_METADATA_USER_FLAG;
+	rx_meta_features |= RTE_ETH_RX_METADATA_USER_MARK;
+	rx_meta_features |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+	ret = rte_eth_rx_metadata_negotiate(port_id, &rx_meta_features);
+	if (ret == 0) {
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_USER_FLAG)) {
+			TESTPMD_LOG(DEBUG, "Flow action FLAG will not affect Rx mbufs on port %u\n",
+				    port_id);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_USER_MARK)) {
+			TESTPMD_LOG(DEBUG, "Flow action MARK will not affect Rx mbufs on port %u\n",
+				    port_id);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+			TESTPMD_LOG(DEBUG, "Flow tunnel offload support might be limited or unavailable on port %u\n",
+				    port_id);
+		}
+	} else if (ret != -ENOTSUP) {
+		rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port %u: %s\n",
+			 port_id, rte_strerror(-ret));
+	}
+}
+
 static int
 eth_dev_configure_mp(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		      const struct rte_eth_conf *dev_conf)
@@ -1489,6 +1524,8 @@ init_config_port_offloads(portid_t pid, uint32_t socket_id)
 	int ret;
 	int i;
 
+	eth_rx_metadata_negotiate_mp(pid);
+
 	port->dev_conf.txmode = tx_mode;
 	port->dev_conf.rxmode = rx_mode;
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 3a8e50c324..fd09a838a2 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -164,6 +164,15 @@ New Features
   * Added tests to verify tunnel header verification in IPsec inbound.
   * Added tests to verify inner checksum.
 
+* **Added an API to control delivery of Rx metadata from the HW to the PMD**
+
+  A new API, ``rte_eth_rx_metadata_negotiate()``, was added.
+  The following parts of Rx metadata were defined:
+
+  * ``RTE_ETH_RX_METADATA_USER_FLAG``
+  * ``RTE_ETH_RX_METADATA_USER_MARK``
+  * ``RTE_ETH_RX_METADATA_TUNNEL_ID``
+
 
 Removed Items
 -------------
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index c4ea735732..56db53df1a 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -810,6 +810,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
 typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
 	struct rte_eth_representor_info *info);
 
+/**
+ * @internal
+ * Negotiate the NIC's ability to deliver specific kinds of metadata to the PMD.
+ *
+ * @param dev
+ *   Port (ethdev) handle
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   Negative errno value on error, zero otherwise
+ */
+typedef int (*eth_rx_metadata_negotiate_t)(struct rte_eth_dev *dev,
+				       uint64_t *features);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -967,6 +983,12 @@ struct eth_dev_ops {
 
 	eth_representor_info_get_t representor_info_get;
 	/**< Get representor info. */
+
+	/**
+	 * Negotiate the NIC's ability to deliver specific
+	 * kinds of metadata to the PMD.
+	 */
+	eth_rx_metadata_negotiate_t rx_metadata_negotiate;
 };
 
 /**
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index c909a9fac1..5fae7357c8 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6229,6 +6229,31 @@ rte_eth_representor_info_get(uint16_t port_id,
 	return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
 }
 
+int
+rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (dev->data->dev_configured != 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"The port (id=%"PRIu16") is already configured\n",
+			port_id);
+		return -EBUSY;
+	}
+
+	if (features == NULL) {
+		RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_metadata_negotiate, -ENOTSUP);
+	return eth_err(port_id,
+		       (*dev->dev_ops->rx_metadata_negotiate)(dev, features));
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 91fa28ba8e..cb847a2c38 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -4828,6 +4828,60 @@ __rte_experimental
 int rte_eth_representor_info_get(uint16_t port_id,
 				 struct rte_eth_representor_info *info);
 
+/** The NIC is able to deliver flag (if set) with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
+
+/** The NIC is able to deliver mark ID with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_USER_MARK (UINT64_C(1) << 1)
+
+/** The NIC is able to deliver tunnel ID with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_TUNNEL_ID (UINT64_C(1) << 2)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Negotiate the NIC's ability to deliver specific kinds of metadata to the PMD.
+ *
+ * Invoke this API before the first rte_eth_dev_configure() invocation
+ * to let the PMD make preparations that are inconvenient to do later.
+ *
+ * The negotiation process is as follows:
+ *
+ * - the application requests features intending to use at least some of them;
+ * - the PMD responds with the guaranteed subset of the requested feature set;
+ * - the application can retry negotiation with another set of features;
+ * - the application can pass zero to clear the negotiation result;
+ * - the last negotiated result takes effect upon
+ *   the ethdev configure and start.
+ *
+ * @note
+ *   The PMD is supposed to first consider enabling the requested feature set
+ *   in its entirety. Only if it fails to do so, does it have the right to
+ *   respond with a smaller set of the originally requested features.
+ *
+ * @note
+ *   Return code (-ENOTSUP) does not necessarily mean that the requested
+ *   features are unsupported. In this case, the application should just
+ *   assume that these features can be used without prior negotiations.
+ *
+ * @param port_id
+ *   Port (ethdev) identifier
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   - (-EBUSY) if the port can't handle this in its current state;
+ *   - (-ENOTSUP) if the method itself is not supported by the PMD;
+ *   - (-ENODEV) if *port_id* is invalid;
+ *   - (-EINVAL) if *features* is NULL;
+ *   - (-EIO) if the device is removed;
+ *   - (0) on success
+ */
+__rte_experimental
+int rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features);
+
 #include <rte_ethdev_core.h>
 
 /**
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index e073ec17a9..5f87851f8c 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -1904,6 +1904,10 @@ enum rte_flow_action_type {
 	 * PKT_RX_FDIR_ID mbuf flags.
 	 *
 	 * See struct rte_flow_action_mark.
+	 *
+	 * One should negotiate mark delivery from the NIC to the PMD.
+	 * @see rte_eth_rx_metadata_negotiate()
+	 * @see RTE_ETH_RX_METADATA_USER_MARK
 	 */
 	RTE_FLOW_ACTION_TYPE_MARK,
 
@@ -1912,6 +1916,10 @@ enum rte_flow_action_type {
 	 * sets the PKT_RX_FDIR mbuf flag.
 	 *
 	 * No associated configuration structure.
+	 *
+	 * One should negotiate flag delivery from the NIC to the PMD.
+	 * @see rte_eth_rx_metadata_negotiate()
+	 * @see RTE_ETH_RX_METADATA_USER_FLAG
 	 */
 	RTE_FLOW_ACTION_TYPE_FLAG,
 
@@ -4209,6 +4217,10 @@ rte_flow_tunnel_match(uint16_t port_id,
 /**
  * Populate the current packet processing state, if exists, for the given mbuf.
  *
+ * One should negotiate tunnel metadata delivery from the NIC to the PMD.
+ * @see rte_eth_rx_metadata_negotiate()
+ * @see RTE_ETH_RX_METADATA_TUNNEL_ID
+ *
  * @param port_id
  *   Port identifier of Ethernet device.
  * @param[in] m
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index efd729c0f2..29fb71f1af 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -245,6 +245,9 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_eth_rx_metadata_negotiate;
 };
 
 INTERNAL {
-- 
2.20.1
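
For readers following the return codes documented in the patch above, the
generic-layer checks can be modeled outside DPDK as shown below. This is a
minimal sketch and not the ethdev implementation: mock_dev, mock_devs,
mock_negotiate, mock_cb_mark_only and MOCK_MAX_PORTS are hypothetical
stand-ins, and only the -ENODEV, -EINVAL and -ENOTSUP paths are reproduced
(the -EBUSY and -EIO paths are left out).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_PORTS 2 /* hypothetical size of the port table */

/* Hypothetical stand-in for the driver callback type. */
typedef int (*mock_negotiate_cb)(uint64_t *features);

struct mock_dev {
	mock_negotiate_cb rx_metadata_negotiate; /* NULL if not implemented */
};

static struct mock_dev mock_devs[MOCK_MAX_PORTS];

/* Models the checks done before the driver callback is invoked. */
static int
mock_negotiate(uint16_t port_id, uint64_t *features)
{
	if (port_id >= MOCK_MAX_PORTS)
		return -ENODEV;  /* invalid port */
	if (features == NULL)
		return -EINVAL;  /* no feature selection buffer */
	if (mock_devs[port_id].rx_metadata_negotiate == NULL)
		return -ENOTSUP; /* the driver does not implement the method */

	return mock_devs[port_id].rx_metadata_negotiate(features);
}

/* Example callback for a driver that can deliver only bit 1 (user mark). */
static int
mock_cb_mark_only(uint64_t *features)
{
	assert(features != NULL); /* guaranteed by mock_negotiate() */
	*features &= UINT64_C(1) << 1;
	return 0;
}
```

Note that -ENOTSUP here describes the method, not the features: per the
API comment, the caller should then assume no negotiation is needed.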



* [dpdk-dev] [PATCH v6 2/5] net/sfc: support API to negotiate delivery of Rx metadata
  2021-10-12 19:38   ` [dpdk-dev] [PATCH v6 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
  2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
@ 2021-10-12 19:38     ` Ivan Malov
  2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:38 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit, Andrew Rybchenko, Andy Moreton

Initial support for the method. Later patches will extend it to
make FLAG and MARK delivery available on EF100 native datapath.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc.h        |  2 ++
 drivers/net/sfc/sfc_ethdev.c | 29 +++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_flow.c   | 13 +++++++++++++
 drivers/net/sfc/sfc_mae.c    | 22 ++++++++++++++++++++--
 4 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 2b459a72db..bba9adc424 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -289,6 +289,8 @@ struct sfc_adapter {
 	boolean_t			tso;
 	boolean_t			tso_encap;
 
+	uint64_t			negotiated_rx_metadata;
+
 	uint32_t			rxd_wait_timeout_ns;
 
 	bool				switchdev;
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 200961cfc8..c0d9810fbb 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -2318,6 +2318,28 @@ sfc_representor_info_get(struct rte_eth_dev *dev,
 	return nb_repr;
 }
 
+static int
+sfc_rx_metadata_negotiate(struct rte_eth_dev *dev, uint64_t *features)
+{
+	struct sfc_adapter *sa = sfc_adapter_by_eth_dev(dev);
+	uint64_t supported = 0;
+
+	sfc_adapter_lock(sa);
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_FLAG) != 0)
+		supported |= RTE_ETH_RX_METADATA_USER_FLAG;
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_MARK) != 0)
+		supported |= RTE_ETH_RX_METADATA_USER_MARK;
+
+	sa->negotiated_rx_metadata = supported & *features;
+	*features = sa->negotiated_rx_metadata;
+
+	sfc_adapter_unlock(sa);
+
+	return 0;
+}
+
 static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.dev_configure			= sfc_dev_configure,
 	.dev_start			= sfc_dev_start,
@@ -2366,6 +2388,7 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.xstats_get_names_by_id		= sfc_xstats_get_names_by_id,
 	.pool_ops_supported		= sfc_pool_ops_supported,
 	.representor_info_get		= sfc_representor_info_get,
+	.rx_metadata_negotiate		= sfc_rx_metadata_negotiate,
 };
 
 struct sfc_ethdev_init_data {
@@ -2462,6 +2485,12 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 		goto fail_dp_rx_name;
 	}
 
+	if (strcmp(dp_rx->dp.name, SFC_KVARG_DATAPATH_EF10_ESSB) == 0) {
+		/* FLAG and MARK are always available from Rx prefix. */
+		sa->negotiated_rx_metadata |= RTE_ETH_RX_METADATA_USER_FLAG;
+		sa->negotiated_rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+	}
+
 	sfc_notice(sa, "use %s Rx datapath", sas->dp_rx_name);
 
 	rc = sfc_kvargs_process(sa, SFC_KVARG_TX_DATAPATH,
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 36ee79f331..9e6d8109c7 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -1760,6 +1760,7 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 	struct sfc_flow_spec *spec = &flow->spec;
 	struct sfc_flow_spec_filter *spec_filter = &spec->filter;
 	const unsigned int dp_rx_features = sa->priv.dp_rx->features;
+	const uint64_t rx_metadata = sa->negotiated_rx_metadata;
 	uint32_t actions_set = 0;
 	const uint32_t fate_actions_mask = (1UL << RTE_FLOW_ACTION_TYPE_QUEUE) |
 					   (1UL << RTE_FLOW_ACTION_TYPE_RSS) |
@@ -1832,6 +1833,12 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
 					"FLAG action is not supported on the current Rx datapath");
 				return -rte_errno;
+			} else if ((rx_metadata &
+				    RTE_ETH_RX_METADATA_USER_FLAG) == 0) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					"flag delivery has not been negotiated");
+				return -rte_errno;
 			}
 
 			spec_filter->template.efs_flags |=
@@ -1849,6 +1856,12 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
 					"MARK action is not supported on the current Rx datapath");
 				return -rte_errno;
+			} else if ((rx_metadata &
+				    RTE_ETH_RX_METADATA_USER_MARK) == 0) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					"mark delivery has not been negotiated");
+				return -rte_errno;
 			}
 
 			rc = sfc_flow_parse_mark(sa, actions->conf, flow);
diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index 053a729a77..571673a723 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -3088,6 +3088,7 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 			  efx_mae_actions_t *spec,
 			  struct rte_flow_error *error)
 {
+	const uint64_t rx_metadata = sa->negotiated_rx_metadata;
 	bool custom_error = B_FALSE;
 	int rc = 0;
 
@@ -3137,12 +3138,29 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 	case RTE_FLOW_ACTION_TYPE_FLAG:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_FLAG,
 				       bundle->actions_mask);
-		rc = efx_mae_action_set_populate_flag(spec);
+		if ((rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG) != 0) {
+			rc = efx_mae_action_set_populate_flag(spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"flag delivery has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_MARK:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_MARK,
 				       bundle->actions_mask);
-		rc = sfc_mae_rule_parse_action_mark(sa, action->conf, spec);
+		if ((rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0) {
+			rc = sfc_mae_rule_parse_action_mark(sa, action->conf,
+							    spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"mark delivery has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_PHY_PORT:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_PHY_PORT,
-- 
2.20.1
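
The driver-side logic in the patch above boils down to two steps: grant the
intersection of what was requested and what the Rx datapath can deliver, then
reject FLAG and MARK flow actions whose metadata kind was not granted. A
minimal standalone sketch, assuming a reduced adapter state; mock_adapter,
mock_pmd_negotiate and mock_action_allowed are hypothetical names, and the
MD_* bits merely mirror the RTE_ETH_RX_METADATA_* values from patch 1/5:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values mirror the RTE_ETH_RX_METADATA_* definitions in patch 1/5. */
#define MD_USER_FLAG (UINT64_C(1) << 0)
#define MD_USER_MARK (UINT64_C(1) << 1)
#define MD_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical reduced adapter state. */
struct mock_adapter {
	uint64_t supported;  /* what the active Rx datapath can deliver */
	uint64_t negotiated; /* result of the last negotiation */
};

/* Grant the intersection of the requested and the supported features. */
static void
mock_pmd_negotiate(struct mock_adapter *sa, uint64_t *features)
{
	assert(sa != NULL && features != NULL);
	sa->negotiated = sa->supported & *features;
	*features = sa->negotiated;
}

/* A flow action is accepted only if its metadata kind was negotiated. */
static bool
mock_action_allowed(const struct mock_adapter *sa, uint64_t kind)
{
	return (sa->negotiated & kind) != 0;
}
```

Passing zero clears the stored result, so a later FLAG or MARK action would
be rejected again until the application renegotiates.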



* [dpdk-dev] [PATCH v6 3/5] net/sfc: support flow mark delivery on EF100 native datapath
  2021-10-12 19:38   ` [dpdk-dev] [PATCH v6 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
  2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
  2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
@ 2021-10-12 19:38     ` Ivan Malov
  2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
  2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:38 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit, Andrew Rybchenko, Andy Moreton

MAE counter engine gets generation counts by virtue of the mark,
so the code to extract the field is already in place, but flow
action MARK doesn't benefit from it. Support this use case, too.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc_ef100_rx.c | 1 +
 drivers/net/sfc/sfc_rx.c       | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index de35c19089..37957eae11 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -935,6 +935,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR |
 				  SFC_DP_RX_FEAT_STATS,
 	.dev_offload_capa	= 0,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 280e8a61f9..5b924010bd 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
+
 	rc = sfc_ev_qinit(sa, SFC_EVQ_TYPE_RX, sw_index,
 			  evq_entries, socket_id, &evq);
 	if (rc != 0)
-- 
2.20.1



* [dpdk-dev] [PATCH v6 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  2021-10-12 19:38   ` [dpdk-dev] [PATCH v6 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
                       ` (2 preceding siblings ...)
  2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
@ 2021-10-12 19:38     ` Ivan Malov
  2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:38 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit, Andrew Rybchenko, Andy Moreton

Add an RxQ flag to request the user flag field of the Rx prefix.
The feature is supported only on EF100 and EF10 ESSB.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 3 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/common/sfc_efx/base/ef10_rx.c b/drivers/common/sfc_efx/base/ef10_rx.c
index 0c3f9413cf..a658e0dba2 100644
--- a/drivers/common/sfc_efx/base/ef10_rx.c
+++ b/drivers/common/sfc_efx/base/ef10_rx.c
@@ -930,6 +930,10 @@ ef10_rx_qcreate(
 			rc = ENOTSUP;
 			goto fail2;
 		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail3;
+		}
 		/*
 		 * Ignore EFX_RXQ_FLAG_RSS_HASH since if RSS hash is calculated
 		 * it is always delivered from HW in the pseudo-header.
@@ -940,7 +944,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_packed_stream_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail3;
+			goto fail4;
 		}
 		switch (type_data->ertd_packed_stream.eps_buf_size) {
 		case EFX_RXQ_PACKED_STREAM_BUF_SIZE_1M:
@@ -960,17 +964,21 @@ ef10_rx_qcreate(
 			break;
 		default:
 			rc = ENOTSUP;
-			goto fail4;
+			goto fail5;
 		}
 		erp->er_buf_size = type_data->ertd_packed_stream.eps_buf_size;
 		/* Packed stream pseudo header does not have RSS hash value */
 		if (flags & EFX_RXQ_FLAG_RSS_HASH) {
 			rc = ENOTSUP;
-			goto fail5;
+			goto fail6;
 		}
 		if (flags & EFX_RXQ_FLAG_USER_MARK) {
 			rc = ENOTSUP;
-			goto fail6;
+			goto fail7;
+		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail8;
 		}
 		break;
 #endif /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -979,7 +987,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_essb_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail7;
+			goto fail9;
 		}
 		params.es_bufs_per_desc =
 		    type_data->ertd_es_super_buffer.eessb_bufs_per_desc;
@@ -997,7 +1005,7 @@ ef10_rx_qcreate(
 #endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
 	default:
 		rc = ENOTSUP;
-		goto fail8;
+		goto fail10;
 	}
 
 #if EFSYS_OPT_RX_PACKED_STREAM
@@ -1005,13 +1013,13 @@ ef10_rx_qcreate(
 		/* Check if datapath firmware supports packed stream mode */
 		if (encp->enc_rx_packed_stream_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail9;
+			goto fail11;
 		}
 		/* Check if packed stream allows configurable buffer sizes */
 		if ((params.ps_buf_size != MC_CMD_INIT_RXQ_EXT_IN_PS_BUFF_1M) &&
 		    (encp->enc_rx_var_packed_stream_supported == B_FALSE)) {
 			rc = ENOTSUP;
-			goto fail10;
+			goto fail12;
 		}
 	}
 #else /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -1022,17 +1030,17 @@ ef10_rx_qcreate(
 	if (params.es_bufs_per_desc > 0) {
 		if (encp->enc_rx_es_super_buffer_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail11;
+			goto fail13;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_max_dma_len,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail12;
+			goto fail14;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_buf_stride,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail13;
+			goto fail15;
 		}
 	}
 #else /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
@@ -1041,7 +1049,7 @@ ef10_rx_qcreate(
 
 	if (flags & EFX_RXQ_FLAG_INGRESS_MPORT) {
 		rc = ENOTSUP;
-		goto fail14;
+		goto fail16;
 	}
 
 	/* Scatter can only be disabled if the firmware supports doing so */
@@ -1057,7 +1065,7 @@ ef10_rx_qcreate(
 
 	if ((rc = efx_mcdi_init_rxq(enp, ndescs, eep, label, index,
 		    esmp, &params)) != 0)
-		goto fail15;
+		goto fail17;
 
 	erp->er_eep = eep;
 	erp->er_label = label;
@@ -1070,40 +1078,44 @@ ef10_rx_qcreate(
 
 	return (0);
 
+fail17:
+	EFSYS_PROBE(fail17);
+fail16:
+	EFSYS_PROBE(fail16);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail15:
 	EFSYS_PROBE(fail15);
 fail14:
 	EFSYS_PROBE(fail14);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail13:
 	EFSYS_PROBE(fail13);
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail12:
 	EFSYS_PROBE(fail12);
 fail11:
 	EFSYS_PROBE(fail11);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail10:
 	EFSYS_PROBE(fail10);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail9:
 	EFSYS_PROBE(fail9);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail8:
 	EFSYS_PROBE(fail8);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail7:
 	EFSYS_PROBE(fail7);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
 fail6:
 	EFSYS_PROBE(fail6);
 fail5:
 	EFSYS_PROBE(fail5);
 fail4:
 	EFSYS_PROBE(fail4);
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail3:
 	EFSYS_PROBE(fail3);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail2:
 	EFSYS_PROBE(fail2);
 fail1:
diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h
index b61984a8e3..e05261218b 100644
--- a/drivers/common/sfc_efx/base/efx.h
+++ b/drivers/common/sfc_efx/base/efx.h
@@ -3030,6 +3030,10 @@ typedef enum efx_rxq_type_e {
  * Request user mark field in the Rx prefix of a queue.
  */
 #define	EFX_RXQ_FLAG_USER_MARK		0x10
+/*
+ * Request user flag field in the Rx prefix of a queue.
+ */
+#define	EFX_RXQ_FLAG_USER_FLAG		0x20
 
 LIBEFX_API
 extern	__checkReturn	efx_rc_t
diff --git a/drivers/common/sfc_efx/base/rhead_rx.c b/drivers/common/sfc_efx/base/rhead_rx.c
index 692c3e1d49..7b9a4af9da 100644
--- a/drivers/common/sfc_efx/base/rhead_rx.c
+++ b/drivers/common/sfc_efx/base/rhead_rx.c
@@ -635,6 +635,9 @@ rhead_rx_qcreate(
 	if (flags & EFX_RXQ_FLAG_USER_MARK)
 		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_MARK;
 
+	if (flags & EFX_RXQ_FLAG_USER_FLAG)
+		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_FLAG;
+
 	/*
 	 * LENGTH is required in EF100 host interface, as receive events
 	 * do not include the packet length.
-- 
2.20.1
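
The EF100 part of the patch above maps RxQ flags to a mask of requested Rx
prefix fields. A minimal sketch of that mapping follows. The two flag values
are the ones defined in efx.h by this series; the field indices in
mock_prefix_field and the helper mock_fields_mask are hypothetical, since the
real enum lives in libefx and is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as defined in efx.h by this series. */
#define RXQ_FLAG_USER_MARK 0x10u
#define RXQ_FLAG_USER_FLAG 0x20u

/* Hypothetical field indices; the real enum is part of libefx. */
enum mock_prefix_field {
	MOCK_FIELD_LENGTH = 0,
	MOCK_FIELD_USER_FLAG,
	MOCK_FIELD_USER_MARK,
	MOCK_FIELD_COUNT
};

/* Build the mask of prefix fields to request for a queue. */
static uint32_t
mock_fields_mask(unsigned int flags)
{
	/* LENGTH is always requested: Rx events do not carry the length. */
	uint32_t mask = 1u << MOCK_FIELD_LENGTH;

	if (flags & RXQ_FLAG_USER_MARK)
		mask |= 1u << MOCK_FIELD_USER_MARK;
	if (flags & RXQ_FLAG_USER_FLAG)
		mask |= 1u << MOCK_FIELD_USER_FLAG;

	/* Every requested bit must name a known field. */
	assert(mask < (1u << MOCK_FIELD_COUNT));
	return mask;
}
```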



* [dpdk-dev] [PATCH v6 5/5] net/sfc: report user flag on EF100 native datapath
  2021-10-12 19:38   ` [dpdk-dev] [PATCH v6 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
                       ` (3 preceding siblings ...)
  2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
@ 2021-10-12 19:38     ` Ivan Malov
  4 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:38 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit, Andrew Rybchenko, Andy Moreton

Detect the flag in Rx prefix and pass it to users.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc_ef100_rx.c | 18 ++++++++++++++++++
 drivers/net/sfc/sfc_rx.c       |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index 37957eae11..8bab09597f 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -63,6 +63,7 @@ struct sfc_ef100_rxq {
 #define SFC_EF100_RXQ_USER_MARK		0x20
 #define SFC_EF100_RXQ_FLAG_INTR_EN	0x40
 #define SFC_EF100_RXQ_INGRESS_MPORT	0x80
+#define SFC_EF100_RXQ_USER_FLAG		0x100
 	unsigned int			ptr_mask;
 	unsigned int			evq_phase_bit_shift;
 	unsigned int			ready_pkts;
@@ -374,6 +375,7 @@ static const efx_rx_prefix_layout_t sfc_ef100_rx_prefix_layout = {
 		EFX_RX_PREFIX_FIELD(INGRESS_MPORT,
 				    ESF_GZ_RX_PREFIX_INGRESS_MPORT, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH, B_FALSE),
+		SFC_EF100_RX_PREFIX_FIELD(USER_FLAG, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(USER_MARK, B_FALSE),
 
 #undef	SFC_EF100_RX_PREFIX_FIELD
@@ -410,6 +412,15 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
 					      ESF_GZ_RX_PREFIX_RSS_HASH);
 	}
 
+	if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
+		uint32_t user_flag;
+
+		user_flag = EFX_XWORD_FIELD(rx_prefix[0],
+					    ESF_GZ_RX_PREFIX_USER_FLAG);
+		if (user_flag != 0)
+			ol_flags |= PKT_RX_FDIR;
+	}
+
 	if (rxq->flags & SFC_EF100_RXQ_USER_MARK) {
 		uint32_t user_mark;
 
@@ -815,6 +826,12 @@ sfc_ef100_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr,
 	else
 		rxq->flags &= ~SFC_EF100_RXQ_RSS_HASH;
 
+	if ((unsup_rx_prefix_fields &
+	     (1U << EFX_RX_PREFIX_FIELD_USER_FLAG)) == 0)
+		rxq->flags |= SFC_EF100_RXQ_USER_FLAG;
+	else
+		rxq->flags &= ~SFC_EF100_RXQ_USER_FLAG;
+
 	if ((unsup_rx_prefix_fields &
 	     (1U << EFX_RX_PREFIX_FIELD_USER_MARK)) == 0)
 		rxq->flags |= SFC_EF100_RXQ_USER_MARK;
@@ -935,6 +952,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_FLAG |
 				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR |
 				  SFC_DP_RX_FEAT_STATS,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 5b924010bd..5e120f5851 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_FLAG;
+
 	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
 
-- 
2.20.1



* Re: [dpdk-dev] [PATCH v5 5/5] net/sfc: report user flag on EF100 native datapath
  2021-10-12 18:08       ` Ferruh Yigit
@ 2021-10-12 19:39         ` Ivan Malov
  2021-10-12 19:48         ` Ivan Malov
  1 sibling, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:39 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton

Hi Ferruh,

On 12/10/2021 21:08, Ferruh Yigit wrote:
> On 10/5/2021 4:56 PM, Ivan Malov wrote:
>> Detect the flag in Rx prefix and pass it to users.
>>
>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> 
> <...>
> 
>> @@ -407,6 +409,15 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
>>                             ESF_GZ_RX_PREFIX_RSS_HASH);
>>       }
>> +    if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
>> +        uint32_t user_flag;
>> +
>> +        user_flag = EFX_OWORD_FIELD(rx_prefix[0],
>> +                        ESF_GZ_RX_PREFIX_USER_FLAG);
>> +        if (user_flag != 0)
>> +            ol_flags |= PKT_RX_FDIR;
>> +    }
>> +
> 
> Hi Ivan,
> 
> This causes a build error after another sfc patch was merged into next-net [1].
> The following change [2] seems to fix the issue, but to be sure nothing is
> missed, can you please send a new version rebased on top of the latest next-net?

Done. Thank you.

> 
> 
> [1]
> Commit d86c6ced8732 ("net/sfc: use xword type for EF100 Rx prefix")
> 
> [2]
> diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
> index 704c62c0ac90..8237b772f151 100644
> --- a/drivers/net/sfc/sfc_ef100_rx.c
> +++ b/drivers/net/sfc/sfc_ef100_rx.c
> @@ -415,7 +415,7 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
>          if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
>                  uint32_t user_flag;
> 
> -               user_flag = EFX_OWORD_FIELD(rx_prefix[0],
> +               user_flag = EFX_XWORD_FIELD(rx_prefix[0],
>                                              ESF_GZ_RX_PREFIX_USER_FLAG);
>                  if (user_flag != 0)
>                          ol_flags |= PKT_RX_FDIR;

-- 
Ivan M


* [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD
  2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
                     ` (8 preceding siblings ...)
  2021-10-12 19:38   ` [dpdk-dev] [PATCH v6 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
@ 2021-10-12 19:46   ` Ivan Malov
  2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
                       ` (5 more replies)
  9 siblings, 6 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:46 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
only the former has been added. The issue has not been solved.
Applications still assume that metadata features always work
and do not need to be configured in advance.

The team behind the net/sfc driver has given this problem more thought
and has reached the following conclusions.

1. Not all kinds of metadata can be represented by device offload flags.
   For instance, having the RSS_HASH flag is legitimate because the NIC is
   supposed to actually compute something when this feature is active.
   However, if a similar flag existed for Rx mark, requesting it would
   not make the NIC actually compute anything. The HW needs external
   stimuli (flow rules) in order to set the mark in the first place.

2. As a consequence of (1), it is apparent that the user's ability to
   use Rx metadata features is complex and consists of multiple parts:
   a) the NIC's ability to conduct the flow actions (set metadata);
   b) the NIC's ability to deliver metadata (if set) to the PMD;
   c) the PMD's ability to provide metadata received from the
      NIC to the user by virtue of filling out mbuf fields.

3. Aspects (2-a) and (2-c) are already addressed by flow validate API
   and the procedure of dynamic mbuf field registration respectively,
   hence, the only problem which really needs a solution is (2-b).
  
Patch [1/5] of this series adds a generic API to let the application
negotiate the NIC's ability to deliver specific kinds of metadata to
the PMD. This API is supposed to be invoked during the initialisation
period in order to let the PMD configure HW resources which might
be hard to (re-)configure in the adapter's started state without
causing traffic disruption and other unwanted consequences.
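
From the application's point of view, the contract described above can be
sketched as follows. This is an illustration under stated assumptions, not
code from the series: negotiate_fn stands in for the ethdev call, the three
pmd_* functions are hypothetical drivers, and app_usable_metadata is a
hypothetical helper. The MD_* bits mirror the RTE_ETH_RX_METADATA_* values
from patch [1/5].

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MD_USER_FLAG (UINT64_C(1) << 0)
#define MD_USER_MARK (UINT64_C(1) << 1)

/* Hypothetical hook standing in for the negotiation call. */
typedef int (*negotiate_fn)(uint64_t *features);

/*
 * Return the metadata kinds the application may rely on:
 *  - on success, whatever subset the PMD granted;
 *  - on -ENOTSUP, the full request (no negotiation is needed);
 *  - on any other error, nothing, with the error stored in *err.
 */
static uint64_t
app_usable_metadata(negotiate_fn negotiate, uint64_t wanted, int *err)
{
	uint64_t features = wanted;
	int ret;

	assert(negotiate != NULL && err != NULL);

	ret = negotiate(&features);
	*err = 0;
	if (ret == 0)
		return features;
	if (ret == -ENOTSUP)
		return wanted;

	*err = ret;
	return 0;
}

/* Hypothetical drivers used to exercise the three outcomes. */
static int pmd_mark_only(uint64_t *f) { *f &= MD_USER_MARK; return 0; }
static int pmd_no_method(uint64_t *f) { (void)f; return -ENOTSUP; }
static int pmd_busy(uint64_t *f) { (void)f; return -EBUSY; }
```

The -ENOTSUP branch is the notable design choice: drivers that do not
implement the method keep working exactly as before.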

[1] c5b2e78d1172 ("doc: announce ethdev API changes in offload flags")

Changes in v2:
* [1/5] has review notes from Jerin Jacob applied and the ack from Ray Kinsella added
* [2/5] has minor adjustments incorporated to follow changes in [1/5]

Changes in v3:
* [1/5] through [5/5] have review notes from Andy Moreton applied (mostly rewording)
* [1/5] has the ack from Jerin Jacob added

Changes in v4:
* [1/5] has the API contract clarified to address concerns raised by Ori Kam
* [1/5] has the API name fixed to use term "metadata" instead of "meta"
* [1/5] has testpmd loglevel changed as per the note by Ajit Khaparde
* [1/5] has testpmd code revisited to take multi-process into account
* [2/5] through [5/5] have the corresponding adjustments incorporated

Changes in v5:
* [1/5] has the API comment improved as per the note by Ori Kam

Changes in v6:
* Rebase as per request by Ferruh Yigit

Changes in v7:
* [5/5] has rebase defect fixed

Ivan Malov (5):
  ethdev: negotiate delivery of packet metadata from HW to PMD
  net/sfc: support API to negotiate delivery of Rx metadata
  net/sfc: support flow mark delivery on EF100 native datapath
  common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  net/sfc: report user flag on EF100 native datapath

 app/test-flow-perf/main.c              | 21 ++++++++++
 app/test-pmd/testpmd.c                 | 37 ++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  9 +++++
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 drivers/net/sfc/sfc.h                  |  2 +
 drivers/net/sfc/sfc_ef100_rx.c         | 19 +++++++++
 drivers/net/sfc/sfc_ethdev.c           | 29 ++++++++++++++
 drivers/net/sfc/sfc_flow.c             | 13 +++++++
 drivers/net/sfc/sfc_mae.c              | 22 ++++++++++-
 drivers/net/sfc/sfc_rx.c               |  6 +++
 lib/ethdev/ethdev_driver.h             | 22 +++++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++++
 lib/ethdev/rte_ethdev.h                | 54 ++++++++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 12 ++++++
 lib/ethdev/version.map                 |  3 ++
 17 files changed, 312 insertions(+), 23 deletions(-)

-- 
2.20.1



* [dpdk-dev] [PATCH v7 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD
  2021-10-12 19:46   ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
@ 2021-10-12 19:46     ` Ivan Malov
  2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:46 UTC (permalink / raw)
  To: dev
  Cc: Ferruh Yigit, Andrew Rybchenko, Andy Moreton, Ray Kinsella,
	Jerin Jacob, Ajit Khaparde, Somnath Kotur, Ori Kam, Wisam Jaddo,
	Xiaoyun Li, Thomas Monjalon

Provide an API to let the application control the NIC's ability
to deliver specific kinds of per-packet metadata to the PMD.

Checks for the NIC's ability to set these kinds of metadata
in the first place (support for the flow actions) belong in
the flow API's responsibility domain (flow validate mechanism).
This topic is out of scope of the new API in question.

The PMD's ability to deliver received metadata to the user
by virtue of mbuf fields should be covered by the mbuf library.
It is also out of scope of the new API in question.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
---
 app/test-flow-perf/main.c              | 21 ++++++++++
 app/test-pmd/testpmd.c                 | 37 ++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  9 +++++
 lib/ethdev/ethdev_driver.h             | 22 +++++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++++
 lib/ethdev/rte_ethdev.h                | 54 ++++++++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 12 ++++++
 lib/ethdev/version.map                 |  3 ++
 8 files changed, 183 insertions(+)

diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
index 9be8edc31d..4d01791f6f 100644
--- a/app/test-flow-perf/main.c
+++ b/app/test-flow-perf/main.c
@@ -1760,6 +1760,27 @@ init_port(void)
 		rte_exit(EXIT_FAILURE, "Error: can't init mbuf pool\n");
 
 	for (port_id = 0; port_id < nr_ports; port_id++) {
+		uint64_t rx_metadata = 0;
+
+		rx_metadata |= RTE_ETH_RX_METADATA_USER_FLAG;
+		rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+
+		ret = rte_eth_rx_metadata_negotiate(port_id, &rx_metadata);
+		if (ret == 0) {
+			if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG)) {
+				printf(":: flow action FLAG will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+
+			if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+				printf(":: flow action MARK will not affect Rx mbufs on port=%u\n",
+				       port_id);
+			}
+		} else if (ret != -ENOTSUP) {
+			rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port=%u: %s\n",
+				 port_id, rte_strerror(-ret));
+		}
+
 		ret = rte_eth_dev_info_get(port_id, &dev_info);
 		if (ret != 0)
 			rte_exit(EXIT_FAILURE,
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 12a0db8796..a7841c557f 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -533,6 +533,41 @@ int proc_id;
  */
 unsigned int num_procs = 1;
 
+static void
+eth_rx_metadata_negotiate_mp(uint16_t port_id)
+{
+	uint64_t rx_meta_features = 0;
+	int ret;
+
+	if (!is_proc_primary())
+		return;
+
+	rx_meta_features |= RTE_ETH_RX_METADATA_USER_FLAG;
+	rx_meta_features |= RTE_ETH_RX_METADATA_USER_MARK;
+	rx_meta_features |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+	ret = rte_eth_rx_metadata_negotiate(port_id, &rx_meta_features);
+	if (ret == 0) {
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_USER_FLAG)) {
+			TESTPMD_LOG(DEBUG, "Flow action FLAG will not affect Rx mbufs on port %u\n",
+				    port_id);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_USER_MARK)) {
+			TESTPMD_LOG(DEBUG, "Flow action MARK will not affect Rx mbufs on port %u\n",
+				    port_id);
+		}
+
+		if (!(rx_meta_features & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+			TESTPMD_LOG(DEBUG, "Flow tunnel offload support might be limited or unavailable on port %u\n",
+				    port_id);
+		}
+	} else if (ret != -ENOTSUP) {
+		rte_exit(EXIT_FAILURE, "Error when negotiating Rx meta features on port %u: %s\n",
+			 port_id, rte_strerror(-ret));
+	}
+}
+
 static int
 eth_dev_configure_mp(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		      const struct rte_eth_conf *dev_conf)
@@ -1489,6 +1524,8 @@ init_config_port_offloads(portid_t pid, uint32_t socket_id)
 	int ret;
 	int i;
 
+	eth_rx_metadata_negotiate_mp(pid);
+
 	port->dev_conf.txmode = tx_mode;
 	port->dev_conf.rxmode = rx_mode;
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 3a8e50c324..fd09a838a2 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -164,6 +164,15 @@ New Features
   * Added tests to verify tunnel header verification in IPsec inbound.
   * Added tests to verify inner checksum.
 
+* **Added an API to control delivery of Rx metadata from the HW to the PMD**
+
+  A new API, ``rte_eth_rx_metadata_negotiate()``, was added.
+  The following parts of Rx metadata were defined:
+
+  * ``RTE_ETH_RX_METADATA_USER_FLAG``
+  * ``RTE_ETH_RX_METADATA_USER_MARK``
+  * ``RTE_ETH_RX_METADATA_TUNNEL_ID``
+
 
 Removed Items
 -------------
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index c4ea735732..56db53df1a 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -810,6 +810,22 @@ typedef int (*eth_get_monitor_addr_t)(void *rxq,
 typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
 	struct rte_eth_representor_info *info);
 
+/**
+ * @internal
+ * Negotiate the NIC's ability to deliver specific kinds of metadata to the PMD.
+ *
+ * @param dev
+ *   Port (ethdev) handle
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   Negative errno value on error, zero otherwise
+ */
+typedef int (*eth_rx_metadata_negotiate_t)(struct rte_eth_dev *dev,
+				       uint64_t *features);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -967,6 +983,12 @@ struct eth_dev_ops {
 
 	eth_representor_info_get_t representor_info_get;
 	/**< Get representor info. */
+
+	/**
+	 * Negotiate the NIC's ability to deliver specific
+	 * kinds of metadata to the PMD.
+	 */
+	eth_rx_metadata_negotiate_t rx_metadata_negotiate;
 };
 
 /**
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index c909a9fac1..5fae7357c8 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6229,6 +6229,31 @@ rte_eth_representor_info_get(uint16_t port_id,
 	return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info));
 }
 
+int
+rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (dev->data->dev_configured != 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"The port (id=%"PRIu16") is already configured\n",
+			port_id);
+		return -EBUSY;
+	}
+
+	if (features == NULL) {
+		RTE_ETHDEV_LOG(ERR, "Invalid features (NULL)\n");
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_metadata_negotiate, -ENOTSUP);
+	return eth_err(port_id,
+		       (*dev->dev_ops->rx_metadata_negotiate)(dev, features));
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 91fa28ba8e..cb847a2c38 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -4828,6 +4828,60 @@ __rte_experimental
 int rte_eth_representor_info_get(uint16_t port_id,
 				 struct rte_eth_representor_info *info);
 
+/** The NIC is able to deliver flag (if set) with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
+
+/** The NIC is able to deliver mark ID with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_USER_MARK (UINT64_C(1) << 1)
+
+/** The NIC is able to deliver tunnel ID with packets to the PMD. */
+#define RTE_ETH_RX_METADATA_TUNNEL_ID (UINT64_C(1) << 2)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Negotiate the NIC's ability to deliver specific kinds of metadata to the PMD.
+ *
+ * Invoke this API before the first rte_eth_dev_configure() invocation
+ * to let the PMD make preparations that are inconvenient to do later.
+ *
+ * The negotiation process is as follows:
+ *
+ * - the application requests features intending to use at least some of them;
+ * - the PMD responds with the guaranteed subset of the requested feature set;
+ * - the application can retry negotiation with another set of features;
+ * - the application can pass zero to clear the negotiation result;
+ * - the last negotiated result takes effect upon
+ *   the ethdev configure and start.
+ *
+ * @note
+ *   The PMD is supposed to first consider enabling the requested feature set
+ *   in its entirety. Only if it fails to do so, does it have the right to
+ *   respond with a smaller set of the originally requested features.
+ *
+ * @note
+ *   Return code (-ENOTSUP) does not necessarily mean that the requested
+ *   features are unsupported. In this case, the application should just
+ *   assume that these features can be used without prior negotiations.
+ *
+ * @param port_id
+ *   Port (ethdev) identifier
+ *
+ * @param[inout] features
+ *   Feature selection buffer
+ *
+ * @return
+ *   - (-EBUSY) if the port can't handle this in its current state;
+ *   - (-ENOTSUP) if the method itself is not supported by the PMD;
+ *   - (-ENODEV) if *port_id* is invalid;
+ *   - (-EINVAL) if *features* is NULL;
+ *   - (-EIO) if the device is removed;
+ *   - (0) on success
+ */
+__rte_experimental
+int rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features);
+
 #include <rte_ethdev_core.h>
 
 /**
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index e073ec17a9..5f87851f8c 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -1904,6 +1904,10 @@ enum rte_flow_action_type {
 	 * PKT_RX_FDIR_ID mbuf flags.
 	 *
 	 * See struct rte_flow_action_mark.
+	 *
+	 * One should negotiate mark delivery from the NIC to the PMD.
+	 * @see rte_eth_rx_metadata_negotiate()
+	 * @see RTE_ETH_RX_METADATA_USER_MARK
 	 */
 	RTE_FLOW_ACTION_TYPE_MARK,
 
@@ -1912,6 +1916,10 @@ enum rte_flow_action_type {
 	 * sets the PKT_RX_FDIR mbuf flag.
 	 *
 	 * No associated configuration structure.
+	 *
+	 * One should negotiate flag delivery from the NIC to the PMD.
+	 * @see rte_eth_rx_metadata_negotiate()
+	 * @see RTE_ETH_RX_METADATA_USER_FLAG
 	 */
 	RTE_FLOW_ACTION_TYPE_FLAG,
 
@@ -4209,6 +4217,10 @@ rte_flow_tunnel_match(uint16_t port_id,
 /**
  * Populate the current packet processing state, if exists, for the given mbuf.
  *
+ * One should negotiate tunnel metadata delivery from the NIC to the PMD.
+ * @see rte_eth_rx_metadata_negotiate()
+ * @see RTE_ETH_RX_METADATA_TUNNEL_ID
+ *
  * @param port_id
  *   Port identifier of Ethernet device.
  * @param[in] m
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index efd729c0f2..29fb71f1af 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -245,6 +245,9 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_eth_rx_metadata_negotiate;
 };
 
 INTERNAL {
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v7 2/5] net/sfc: support API to negotiate delivery of Rx metadata
  2021-10-12 19:46   ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
  2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
@ 2021-10-12 19:46     ` Ivan Malov
  2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:46 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit, Andrew Rybchenko, Andy Moreton

Initial support for the method. Later patches will extend it to
make FLAG and MARK delivery available on EF100 native datapath.
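
As a rough sketch of the approach (not the driver code itself), the
callback intersects what the Rx datapath supports with what the
application requested, stores the result, and the flow parser later
consults it. The toy_* names and DP_RX_FEAT_* values below are
hypothetical stand-ins for the sfc structures; locking is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values mirror RTE_ETH_RX_METADATA_* from patch [1/5]. */
#define RX_METADATA_USER_FLAG (UINT64_C(1) << 0)
#define RX_METADATA_USER_MARK (UINT64_C(1) << 1)
#define RX_METADATA_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical datapath capability bits. */
#define DP_RX_FEAT_FLOW_FLAG 0x1u
#define DP_RX_FEAT_FLOW_MARK 0x2u

/* Hypothetical, heavily reduced adapter state. */
struct toy_adapter {
	unsigned int dp_rx_features;
	uint64_t negotiated_rx_metadata;
};

/* Intersect datapath capabilities with the requested feature set. */
static int
toy_rx_metadata_negotiate(struct toy_adapter *sa, uint64_t *features)
{
	uint64_t supported = 0;

	assert(sa != NULL && features != NULL);

	if ((sa->dp_rx_features & DP_RX_FEAT_FLOW_FLAG) != 0)
		supported |= RX_METADATA_USER_FLAG;

	if ((sa->dp_rx_features & DP_RX_FEAT_FLOW_MARK) != 0)
		supported |= RX_METADATA_USER_MARK;

	/* Remember the result and report it back to the caller. */
	sa->negotiated_rx_metadata = supported & *features;
	*features = sa->negotiated_rx_metadata;

	return 0;
}

/* Flow parsing accepts FLAG / MARK only if delivery was negotiated. */
static int
toy_flow_action_allowed(const struct toy_adapter *sa, uint64_t feature)
{
	return (sa->negotiated_rx_metadata & feature) != 0;
}
```

Passing zero clears the stored result, so a later flow rule with
action FLAG or MARK would be rejected until negotiation is repeated.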

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc.h        |  2 ++
 drivers/net/sfc/sfc_ethdev.c | 29 +++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_flow.c   | 13 +++++++++++++
 drivers/net/sfc/sfc_mae.c    | 22 ++++++++++++++++++++--
 4 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 2b459a72db..bba9adc424 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -289,6 +289,8 @@ struct sfc_adapter {
 	boolean_t			tso;
 	boolean_t			tso_encap;
 
+	uint64_t			negotiated_rx_metadata;
+
 	uint32_t			rxd_wait_timeout_ns;
 
 	bool				switchdev;
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 200961cfc8..c0d9810fbb 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -2318,6 +2318,28 @@ sfc_representor_info_get(struct rte_eth_dev *dev,
 	return nb_repr;
 }
 
+static int
+sfc_rx_metadata_negotiate(struct rte_eth_dev *dev, uint64_t *features)
+{
+	struct sfc_adapter *sa = sfc_adapter_by_eth_dev(dev);
+	uint64_t supported = 0;
+
+	sfc_adapter_lock(sa);
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_FLAG) != 0)
+		supported |= RTE_ETH_RX_METADATA_USER_FLAG;
+
+	if ((sa->priv.dp_rx->features & SFC_DP_RX_FEAT_FLOW_MARK) != 0)
+		supported |= RTE_ETH_RX_METADATA_USER_MARK;
+
+	sa->negotiated_rx_metadata = supported & *features;
+	*features = sa->negotiated_rx_metadata;
+
+	sfc_adapter_unlock(sa);
+
+	return 0;
+}
+
 static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.dev_configure			= sfc_dev_configure,
 	.dev_start			= sfc_dev_start,
@@ -2366,6 +2388,7 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.xstats_get_names_by_id		= sfc_xstats_get_names_by_id,
 	.pool_ops_supported		= sfc_pool_ops_supported,
 	.representor_info_get		= sfc_representor_info_get,
+	.rx_metadata_negotiate		= sfc_rx_metadata_negotiate,
 };
 
 struct sfc_ethdev_init_data {
@@ -2462,6 +2485,12 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 		goto fail_dp_rx_name;
 	}
 
+	if (strcmp(dp_rx->dp.name, SFC_KVARG_DATAPATH_EF10_ESSB) == 0) {
+		/* FLAG and MARK are always available from Rx prefix. */
+		sa->negotiated_rx_metadata |= RTE_ETH_RX_METADATA_USER_FLAG;
+		sa->negotiated_rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+	}
+
 	sfc_notice(sa, "use %s Rx datapath", sas->dp_rx_name);
 
 	rc = sfc_kvargs_process(sa, SFC_KVARG_TX_DATAPATH,
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 36ee79f331..9e6d8109c7 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -1760,6 +1760,7 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 	struct sfc_flow_spec *spec = &flow->spec;
 	struct sfc_flow_spec_filter *spec_filter = &spec->filter;
 	const unsigned int dp_rx_features = sa->priv.dp_rx->features;
+	const uint64_t rx_metadata = sa->negotiated_rx_metadata;
 	uint32_t actions_set = 0;
 	const uint32_t fate_actions_mask = (1UL << RTE_FLOW_ACTION_TYPE_QUEUE) |
 					   (1UL << RTE_FLOW_ACTION_TYPE_RSS) |
@@ -1832,6 +1833,12 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
 					"FLAG action is not supported on the current Rx datapath");
 				return -rte_errno;
+			} else if ((rx_metadata &
+				    RTE_ETH_RX_METADATA_USER_FLAG) == 0) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					"flag delivery has not been negotiated");
+				return -rte_errno;
 			}
 
 			spec_filter->template.efs_flags |=
@@ -1849,6 +1856,12 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
 					"MARK action is not supported on the current Rx datapath");
 				return -rte_errno;
+			} else if ((rx_metadata &
+				    RTE_ETH_RX_METADATA_USER_MARK) == 0) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					"mark delivery has not been negotiated");
+				return -rte_errno;
 			}
 
 			rc = sfc_flow_parse_mark(sa, actions->conf, flow);
diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index 053a729a77..571673a723 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -3088,6 +3088,7 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 			  efx_mae_actions_t *spec,
 			  struct rte_flow_error *error)
 {
+	const uint64_t rx_metadata = sa->negotiated_rx_metadata;
 	bool custom_error = B_FALSE;
 	int rc = 0;
 
@@ -3137,12 +3138,29 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
 	case RTE_FLOW_ACTION_TYPE_FLAG:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_FLAG,
 				       bundle->actions_mask);
-		rc = efx_mae_action_set_populate_flag(spec);
+		if ((rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG) != 0) {
+			rc = efx_mae_action_set_populate_flag(spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"flag delivery has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_MARK:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_MARK,
 				       bundle->actions_mask);
-		rc = sfc_mae_rule_parse_action_mark(sa, action->conf, spec);
+		if ((rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0) {
+			rc = sfc_mae_rule_parse_action_mark(sa, action->conf,
+							    spec);
+		} else {
+			rc = rte_flow_error_set(error, ENOTSUP,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						action,
+						"mark delivery has not been negotiated");
+			custom_error = B_TRUE;
+		}
 		break;
 	case RTE_FLOW_ACTION_TYPE_PHY_PORT:
 		SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_PHY_PORT,
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v7 3/5] net/sfc: support flow mark delivery on EF100 native datapath
  2021-10-12 19:46   ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
  2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
  2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
@ 2021-10-12 19:46     ` Ivan Malov
  2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:46 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit, Andrew Rybchenko, Andy Moreton

MAE counter engine gets generation counts by virtue of the mark,
so the code to extract the field is already in place, but flow
action MARK doesn't benefit from it. Support this use case, too.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc_ef100_rx.c | 1 +
 drivers/net/sfc/sfc_rx.c       | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index de35c19089..37957eae11 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -935,6 +935,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR |
 				  SFC_DP_RX_FEAT_STATS,
 	.dev_offload_capa	= 0,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 280e8a61f9..5b924010bd 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
+
 	rc = sfc_ev_qinit(sa, SFC_EVQ_TYPE_RX, sw_index,
 			  evq_entries, socket_id, &evq);
 	if (rc != 0)
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v7 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  2021-10-12 19:46   ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
                       ` (2 preceding siblings ...)
  2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
@ 2021-10-12 19:46     ` Ivan Malov
  2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
  2021-10-12 23:25     ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ferruh Yigit
  5 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:46 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit, Andrew Rybchenko, Andy Moreton

Add an RxQ flag to request support for user flag field of Rx
prefix. The feature is supported only on EF100 and EF10 ESSB.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 3 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/common/sfc_efx/base/ef10_rx.c b/drivers/common/sfc_efx/base/ef10_rx.c
index 0c3f9413cf..a658e0dba2 100644
--- a/drivers/common/sfc_efx/base/ef10_rx.c
+++ b/drivers/common/sfc_efx/base/ef10_rx.c
@@ -930,6 +930,10 @@ ef10_rx_qcreate(
 			rc = ENOTSUP;
 			goto fail2;
 		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail3;
+		}
 		/*
 		 * Ignore EFX_RXQ_FLAG_RSS_HASH since if RSS hash is calculated
 		 * it is always delivered from HW in the pseudo-header.
@@ -940,7 +944,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_packed_stream_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail3;
+			goto fail4;
 		}
 		switch (type_data->ertd_packed_stream.eps_buf_size) {
 		case EFX_RXQ_PACKED_STREAM_BUF_SIZE_1M:
@@ -960,17 +964,21 @@ ef10_rx_qcreate(
 			break;
 		default:
 			rc = ENOTSUP;
-			goto fail4;
+			goto fail5;
 		}
 		erp->er_buf_size = type_data->ertd_packed_stream.eps_buf_size;
 		/* Packed stream pseudo header does not have RSS hash value */
 		if (flags & EFX_RXQ_FLAG_RSS_HASH) {
 			rc = ENOTSUP;
-			goto fail5;
+			goto fail6;
 		}
 		if (flags & EFX_RXQ_FLAG_USER_MARK) {
 			rc = ENOTSUP;
-			goto fail6;
+			goto fail7;
+		}
+		if (flags & EFX_RXQ_FLAG_USER_FLAG) {
+			rc = ENOTSUP;
+			goto fail8;
 		}
 		break;
 #endif /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -979,7 +987,7 @@ ef10_rx_qcreate(
 		erpl = &ef10_essb_rx_prefix_layout;
 		if (type_data == NULL) {
 			rc = EINVAL;
-			goto fail7;
+			goto fail9;
 		}
 		params.es_bufs_per_desc =
 		    type_data->ertd_es_super_buffer.eessb_bufs_per_desc;
@@ -997,7 +1005,7 @@ ef10_rx_qcreate(
 #endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
 	default:
 		rc = ENOTSUP;
-		goto fail8;
+		goto fail10;
 	}
 
 #if EFSYS_OPT_RX_PACKED_STREAM
@@ -1005,13 +1013,13 @@ ef10_rx_qcreate(
 		/* Check if datapath firmware supports packed stream mode */
 		if (encp->enc_rx_packed_stream_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail9;
+			goto fail11;
 		}
 		/* Check if packed stream allows configurable buffer sizes */
 		if ((params.ps_buf_size != MC_CMD_INIT_RXQ_EXT_IN_PS_BUFF_1M) &&
 		    (encp->enc_rx_var_packed_stream_supported == B_FALSE)) {
 			rc = ENOTSUP;
-			goto fail10;
+			goto fail12;
 		}
 	}
 #else /* EFSYS_OPT_RX_PACKED_STREAM */
@@ -1022,17 +1030,17 @@ ef10_rx_qcreate(
 	if (params.es_bufs_per_desc > 0) {
 		if (encp->enc_rx_es_super_buffer_supported == B_FALSE) {
 			rc = ENOTSUP;
-			goto fail11;
+			goto fail13;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_max_dma_len,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail12;
+			goto fail14;
 		}
 		if (!EFX_IS_P2ALIGNED(uint32_t, params.es_buf_stride,
 			    EFX_RX_ES_SUPER_BUFFER_BUF_ALIGNMENT)) {
 			rc = EINVAL;
-			goto fail13;
+			goto fail15;
 		}
 	}
 #else /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
@@ -1041,7 +1049,7 @@ ef10_rx_qcreate(
 
 	if (flags & EFX_RXQ_FLAG_INGRESS_MPORT) {
 		rc = ENOTSUP;
-		goto fail14;
+		goto fail16;
 	}
 
 	/* Scatter can only be disabled if the firmware supports doing so */
@@ -1057,7 +1065,7 @@ ef10_rx_qcreate(
 
 	if ((rc = efx_mcdi_init_rxq(enp, ndescs, eep, label, index,
 		    esmp, &params)) != 0)
-		goto fail15;
+		goto fail17;
 
 	erp->er_eep = eep;
 	erp->er_label = label;
@@ -1070,40 +1078,44 @@ ef10_rx_qcreate(
 
 	return (0);
 
+fail17:
+	EFSYS_PROBE(fail17);
+fail16:
+	EFSYS_PROBE(fail16);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail15:
 	EFSYS_PROBE(fail15);
 fail14:
 	EFSYS_PROBE(fail14);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail13:
 	EFSYS_PROBE(fail13);
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail12:
 	EFSYS_PROBE(fail12);
 fail11:
 	EFSYS_PROBE(fail11);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail10:
 	EFSYS_PROBE(fail10);
+#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail9:
 	EFSYS_PROBE(fail9);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
+#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
+#if EFSYS_OPT_RX_PACKED_STREAM
 fail8:
 	EFSYS_PROBE(fail8);
-#if EFSYS_OPT_RX_ES_SUPER_BUFFER
 fail7:
 	EFSYS_PROBE(fail7);
-#endif /* EFSYS_OPT_RX_ES_SUPER_BUFFER */
-#if EFSYS_OPT_RX_PACKED_STREAM
 fail6:
 	EFSYS_PROBE(fail6);
 fail5:
 	EFSYS_PROBE(fail5);
 fail4:
 	EFSYS_PROBE(fail4);
+#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail3:
 	EFSYS_PROBE(fail3);
-#endif /* EFSYS_OPT_RX_PACKED_STREAM */
 fail2:
 	EFSYS_PROBE(fail2);
 fail1:
diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h
index b61984a8e3..e05261218b 100644
--- a/drivers/common/sfc_efx/base/efx.h
+++ b/drivers/common/sfc_efx/base/efx.h
@@ -3030,6 +3030,10 @@ typedef enum efx_rxq_type_e {
  * Request user mark field in the Rx prefix of a queue.
  */
 #define	EFX_RXQ_FLAG_USER_MARK		0x10
+/*
+ * Request user flag field in the Rx prefix of a queue.
+ */
+#define	EFX_RXQ_FLAG_USER_FLAG		0x20
 
 LIBEFX_API
 extern	__checkReturn	efx_rc_t
diff --git a/drivers/common/sfc_efx/base/rhead_rx.c b/drivers/common/sfc_efx/base/rhead_rx.c
index 692c3e1d49..7b9a4af9da 100644
--- a/drivers/common/sfc_efx/base/rhead_rx.c
+++ b/drivers/common/sfc_efx/base/rhead_rx.c
@@ -635,6 +635,9 @@ rhead_rx_qcreate(
 	if (flags & EFX_RXQ_FLAG_USER_MARK)
 		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_MARK;
 
+	if (flags & EFX_RXQ_FLAG_USER_FLAG)
+		fields_mask |= 1U << EFX_RX_PREFIX_FIELD_USER_FLAG;
+
 	/*
 	 * LENGTH is required in EF100 host interface, as receive events
 	 * do not include the packet length.
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v7 5/5] net/sfc: report user flag on EF100 native datapath
  2021-10-12 19:46   ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
                       ` (3 preceding siblings ...)
  2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
@ 2021-10-12 19:46     ` Ivan Malov
  2021-10-12 23:25     ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ferruh Yigit
  5 siblings, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:46 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit, Andrew Rybchenko, Andy Moreton

Detect the flag in Rx prefix and pass it to users.
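
A simplified sketch of the idea follows. It assumes a single 64-bit
prefix word and a made-up bit position for the user flag; the real
driver uses the EFX_XWORD_FIELD() accessor and the
ESF_GZ_RX_PREFIX_USER_FLAG layout instead. All TOY_* names and values
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-queue flag: user flag field is present in the prefix. */
#define TOY_RXQ_USER_FLAG        0x100u
/* Hypothetical bit position of the user flag in the prefix word. */
#define TOY_PREFIX_USER_FLAG_LBN 5
/* Hypothetical stand-in for the PKT_RX_FDIR mbuf offload flag. */
#define TOY_PKT_RX_FDIR          (UINT64_C(1) << 2)

/* Translate the prefix user flag into an mbuf offload flag. */
static uint64_t
toy_prefix_to_ol_flags(unsigned int rxq_flags, uint64_t prefix_word)
{
	uint64_t ol_flags = 0;

	assert(TOY_PREFIX_USER_FLAG_LBN < 64);

	/* Look at the field only if the queue was set up to carry it. */
	if (rxq_flags & TOY_RXQ_USER_FLAG) {
		uint64_t user_flag;

		user_flag = (prefix_word >> TOY_PREFIX_USER_FLAG_LBN) & 1;
		if (user_flag != 0)
			ol_flags |= TOY_PKT_RX_FDIR;
	}

	return ol_flags;
}
```

If the queue was not configured with the user flag field, the prefix
bit is ignored and no offload flag is reported.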

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc_ef100_rx.c | 18 ++++++++++++++++++
 drivers/net/sfc/sfc_rx.c       |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index 37957eae11..d6308dc70d 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -63,6 +63,7 @@ struct sfc_ef100_rxq {
 #define SFC_EF100_RXQ_USER_MARK		0x20
 #define SFC_EF100_RXQ_FLAG_INTR_EN	0x40
 #define SFC_EF100_RXQ_INGRESS_MPORT	0x80
+#define SFC_EF100_RXQ_USER_FLAG		0x100
 	unsigned int			ptr_mask;
 	unsigned int			evq_phase_bit_shift;
 	unsigned int			ready_pkts;
@@ -374,6 +375,7 @@ static const efx_rx_prefix_layout_t sfc_ef100_rx_prefix_layout = {
 		EFX_RX_PREFIX_FIELD(INGRESS_MPORT,
 				    ESF_GZ_RX_PREFIX_INGRESS_MPORT, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(RSS_HASH, B_FALSE),
+		SFC_EF100_RX_PREFIX_FIELD(USER_FLAG, B_FALSE),
 		SFC_EF100_RX_PREFIX_FIELD(USER_MARK, B_FALSE),
 
 #undef	SFC_EF100_RX_PREFIX_FIELD
@@ -410,6 +412,15 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
 					      ESF_GZ_RX_PREFIX_RSS_HASH);
 	}
 
+	if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
+		uint32_t user_flag;
+
+		user_flag = EFX_XWORD_FIELD(rx_prefix[0],
+					    ESF_GZ_RX_PREFIX_USER_FLAG);
+		if (user_flag != 0)
+			ol_flags |= PKT_RX_FDIR;
+	}
+
 	if (rxq->flags & SFC_EF100_RXQ_USER_MARK) {
 		uint32_t user_mark;
 
@@ -815,6 +826,12 @@ sfc_ef100_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr,
 	else
 		rxq->flags &= ~SFC_EF100_RXQ_RSS_HASH;
 
+	if ((unsup_rx_prefix_fields &
+	     (1U << EFX_RX_PREFIX_FIELD_USER_FLAG)) == 0)
+		rxq->flags |= SFC_EF100_RXQ_USER_FLAG;
+	else
+		rxq->flags &= ~SFC_EF100_RXQ_USER_FLAG;
+
 	if ((unsup_rx_prefix_fields &
 	     (1U << EFX_RX_PREFIX_FIELD_USER_MARK)) == 0)
 		rxq->flags |= SFC_EF100_RXQ_USER_MARK;
@@ -935,6 +952,7 @@ struct sfc_dp_rx sfc_ef100_rx = {
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF100,
 	},
 	.features		= SFC_DP_RX_FEAT_MULTI_PROCESS |
+				  SFC_DP_RX_FEAT_FLOW_FLAG |
 				  SFC_DP_RX_FEAT_FLOW_MARK |
 				  SFC_DP_RX_FEAT_INTR |
 				  SFC_DP_RX_FEAT_STATS,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 5b924010bd..5e120f5851 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1178,6 +1178,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (offloads & DEV_RX_OFFLOAD_RSS_HASH)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_RSS_HASH;
 
+	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_FLAG) != 0)
+		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_FLAG;
+
 	if ((sa->negotiated_rx_metadata & RTE_ETH_RX_METADATA_USER_MARK) != 0)
 		rxq_info->type_flags |= EFX_RXQ_FLAG_USER_MARK;
 
-- 
2.20.1


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v5 5/5] net/sfc: report user flag on EF100 native datapath
  2021-10-12 18:08       ` Ferruh Yigit
  2021-10-12 19:39         ` Ivan Malov
@ 2021-10-12 19:48         ` Ivan Malov
  1 sibling, 0 replies; 97+ messages in thread
From: Ivan Malov @ 2021-10-12 19:48 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: Ray Kinsella, Jerin Jacob, Thomas Monjalon, Ori Kam,
	Ajit Khaparde, Andrew Rybchenko, Andy Moreton

Hi Ferruh,

I apologise: there was a defect in v6. I re-submitted the series (v7):
https://patches.dpdk.org/project/dpdk/list/?series=19571

Thank you.

On 12/10/2021 21:08, Ferruh Yigit wrote:
> On 10/5/2021 4:56 PM, Ivan Malov wrote:
>> Detect the flag in Rx prefix and pass it to users.
>>
>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> 
> <...>
> 
>> @@ -407,6 +409,15 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
>>                             ESF_GZ_RX_PREFIX_RSS_HASH);
>>       }
>> +    if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
>> +        uint32_t user_flag;
>> +
>> +        user_flag = EFX_OWORD_FIELD(rx_prefix[0],
>> +                        ESF_GZ_RX_PREFIX_USER_FLAG);
>> +        if (user_flag != 0)
>> +            ol_flags |= PKT_RX_FDIR;
>> +    }
>> +
> 
> Hi Ivan,
> 
> This causes a build error after another sfc patch was merged into next-net [1].
> The following change [2] seems to fix the issue, but to be sure nothing is
> missed, can you please send a new version rebased on top of the latest next-net?
> 
> 
> [1]
> Commit d86c6ced8732 ("net/sfc: use xword type for EF100 Rx prefix")
> 
> [2]
> diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
> index 704c62c0ac90..8237b772f151 100644
> --- a/drivers/net/sfc/sfc_ef100_rx.c
> +++ b/drivers/net/sfc/sfc_ef100_rx.c
> @@ -415,7 +415,7 @@ sfc_ef100_rx_prefix_to_offloads(const struct sfc_ef100_rxq *rxq,
>          if (rxq->flags & SFC_EF100_RXQ_USER_FLAG) {
>                  uint32_t user_flag;
> 
> -               user_flag = EFX_OWORD_FIELD(rx_prefix[0],
> +               user_flag = EFX_XWORD_FIELD(rx_prefix[0],
>                                              ESF_GZ_RX_PREFIX_USER_FLAG);
>                  if (user_flag != 0)
>                          ol_flags |= PKT_RX_FDIR;

-- 
Ivan M

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD
  2021-10-12 19:46   ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
                       ` (4 preceding siblings ...)
  2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
@ 2021-10-12 23:25     ` Ferruh Yigit
  5 siblings, 0 replies; 97+ messages in thread
From: Ferruh Yigit @ 2021-10-12 23:25 UTC (permalink / raw)
  To: Ivan Malov, dev

On 10/12/2021 8:46 PM, Ivan Malov wrote:
> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
> only the former has been added. The issue has not been solved.
> Applications still assume that metadata features always work
> and do not need to be configured in advance.
> 
> The team behind net/sfc driver has given this problem more thought.
> Conclusions that have been reached are as follows.
> 
> 1. Not all kinds of metadata can be represented by device offload flags.
>     For instance, having the RSS_HASH flag is legitimate because the NIC
>     is supposed to actually compute something when this feature is active.
>     However, if a similar flag existed for Rx mark, requesting it would
>     not make the NIC actually compute anything. The HW needs external
>     stimuli (flow rules) in order to set the mark in the first place.
> 
> 2. As a consequence of (1), it is apparent that the user's ability to
>     use Rx metadata features is complex and consists of multiple parts:
>     a) the NIC's ability to conduct the flow actions (set metadata);
>     b) the NIC's ability to deliver metadata (if set) to the PMD;
>     c) the PMD's ability to provide metadata received from the
>        NIC to the user by virtue of filling out mbuf fields.
> 
> 3. Aspects (2-a) and (2-c) are already addressed by the flow validate API
>     and the procedure of dynamic mbuf field registration respectively;
>     hence, the only problem which really needs a solution is (2-b).
>
> Patch [1/5] of this series adds a generic API to let the application
> negotiate the NIC's ability to deliver specific kinds of metadata to
> the PMD. This API is supposed to be invoked during the initialisation
> period in order to let the PMD configure HW resources which might
> be hard to (re-)configure in the adapter's started state without
> causing traffic disruption and other unwanted consequences.
> 
> [1] c5b2e78d1172 ("doc: announce ethdev API changes in offload flags")
> 
> Changes in v2:
> * [1/5] has review notes from Jerin Jacob applied and the ack from Ray Kinsella added
> * [2/5] has minor adjustments incorporated to follow changes in [1/5]
> 
> Changes in v3:
> * [1/5] through [5/5] have review notes from Andy Moreton applied (mostly rewording)
> * [1/5] has the ack from Jerin Jacob added
> 
> Changes in v4:
> * [1/5] has the API contract clarified to address concerns raised by Ori Kam
> * [1/5] has the API name fixed to use term "metadata" instead of "meta"
> * [1/5] has testpmd loglevel changed as per the note by Ajit Khaparde
> * [1/5] has testpmd code revisited to take multi-process into account
> * [2/5] through [5/5] have the corresponding adjustments incorporated
> 
> Changes in v5:
> * [1/5] has the API comment improved as per the note by Ori Kam
> 
> Changes in v6:
> * Rebase as per request by Ferruh Yigit
> 
> Changes in v7:
> * [5/5] has rebase defect fixed
> 
> Ivan Malov (5):
>    ethdev: negotiate delivery of packet metadata from HW to PMD
>    net/sfc: support API to negotiate delivery of Rx metadata
>    net/sfc: support flow mark delivery on EF100 native datapath
>    common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
>    net/sfc: report user flag on EF100 native datapath
> 

Series applied to dpdk-next-net/main, thanks.
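
For readers skimming the archive, the negotiation idea from the cover
letter (aspect 2-b) can be sketched as follows. This is only a minimal
model under stated assumptions, not the code from patch [1/5]: the flag
names, struct mock_port and mock_rx_metadata_negotiate() are hypothetical
stand-ins invented for illustration, and the refusal once the port is
started is an assumption derived from the "initialisation period" wording.
The application passes in the set of metadata kinds it is interested in,
and the driver narrows that set to what the NIC can deliver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; illustrative names, not the ethdev API. */
#define RX_META_USER_FLAG (UINT64_C(1) << 0)
#define RX_META_USER_MARK (UINT64_C(1) << 1)
#define RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical stand-in for per-port driver state. */
struct mock_port {
	uint64_t supported;  /* metadata kinds the NIC can deliver to the PMD */
	uint64_t negotiated; /* kinds agreed upon during initialisation */
	int started;         /* non-zero once the port has been started */
};

/*
 * In/out contract: on entry, *features holds what the application wants;
 * on successful return, it holds the subset the driver agreed to deliver.
 */
static int
mock_rx_metadata_negotiate(struct mock_port *port, uint64_t *features)
{
	if (port == NULL || features == NULL)
		return -EINVAL;

	/* Assumption: negotiation is only allowed before the port starts. */
	if (port->started)
		return -EBUSY;

	*features &= port->supported;
	port->negotiated = *features;
	return 0;
}

/* Ask for mark and tunnel ID on a NIC that supports only flag and mark. */
static uint64_t
demo_negotiation(void)
{
	struct mock_port port = {
		.supported = RX_META_USER_FLAG | RX_META_USER_MARK,
	};
	uint64_t features = RX_META_USER_MARK | RX_META_TUNNEL_ID;
	int rc = mock_rx_metadata_negotiate(&port, &features);

	assert(rc == 0);
	(void)rc;
	return features; /* only the bits both sides agree on remain set */
}
```

In this model the caller inspects the returned mask rather than relying
on an error code, so a partially supported request still succeeds and the
application can decide whether the remaining features are sufficient.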



end of thread, other threads:[~2021-10-12 23:26 UTC | newest]

Thread overview: 97+ messages
2021-09-02 14:23 [dpdk-dev] [PATCH 0/5] A means to negotiate support for Rx meta information Ivan Malov
2021-09-02 14:23 ` [dpdk-dev] [PATCH 1/5] ethdev: add API " Ivan Malov
2021-09-02 14:47   ` Jerin Jacob
2021-09-02 16:14   ` Kinsella, Ray
2021-09-03  9:34   ` Jerin Jacob
2021-09-02 14:23 ` [dpdk-dev] [PATCH 2/5] net/sfc: provide API to negotiate supported Rx meta features Ivan Malov
2021-09-02 14:23 ` [dpdk-dev] [PATCH 3/5] net/sfc: allow to use EF100 native datapath Rx mark in flows Ivan Malov
2021-09-02 14:23 ` [dpdk-dev] [PATCH 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
2021-09-02 14:23 ` [dpdk-dev] [PATCH 5/5] net/sfc: allow to discern user flag on EF100 native datapath Ivan Malov
2021-09-03  0:15 ` [dpdk-dev] [PATCH v2 0/5] A means to negotiate support for Rx meta information Ivan Malov
2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 1/5] ethdev: add API " Ivan Malov
2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 2/5] net/sfc: provide API to negotiate supported Rx meta features Ivan Malov
2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 3/5] net/sfc: allow to use EF100 native datapath Rx mark in flows Ivan Malov
2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
2021-09-03  0:15   ` [dpdk-dev] [PATCH v2 5/5] net/sfc: allow to discern user flag on EF100 native datapath Ivan Malov
2021-09-23 11:20 ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Ivan Malov
2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 1/5] ethdev: add API " Ivan Malov
2021-09-30 14:59     ` Ori Kam
2021-09-30 15:07       ` Andrew Rybchenko
2021-09-30 19:07       ` Ivan Malov
2021-10-01  6:50         ` Andrew Rybchenko
2021-10-03  7:42           ` Ori Kam
2021-10-03  9:30             ` Ivan Malov
2021-10-03 11:01               ` Ori Kam
2021-10-03 17:30                 ` Ivan Malov
2021-10-03 21:04                   ` Ori Kam
2021-10-03 23:50                     ` Ivan Malov
2021-10-04  6:56                       ` Ori Kam
2021-10-04 11:39                         ` Ivan Malov
2021-10-04 13:53                           ` Andrew Rybchenko
2021-10-05  6:30                             ` Ori Kam
2021-10-05  7:27                               ` Andrew Rybchenko
2021-10-05  8:17                                 ` Ori Kam
2021-10-05  8:38                                   ` Andrew Rybchenko
2021-10-05  9:41                                     ` Ori Kam
2021-10-05 10:01                                       ` Andrew Rybchenko
2021-10-05 10:10                                         ` Ori Kam
2021-10-05 11:11                                           ` Andrew Rybchenko
2021-10-06  8:30                                             ` Thomas Monjalon
2021-10-06  8:38                                               ` Andrew Rybchenko
2021-10-06  9:14                                                 ` Ori Kam
2021-09-30 21:48     ` Ajit Khaparde
2021-09-30 22:00       ` Ivan Malov
2021-09-30 22:12         ` Ajit Khaparde
2021-09-30 22:22           ` Ivan Malov
2021-10-03  7:05             ` Ori Kam
2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 2/5] net/sfc: support " Ivan Malov
2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
2021-09-23 11:20   ` [dpdk-dev] [PATCH v3 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
2021-09-30 16:18   ` [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data Thomas Monjalon
2021-09-30 19:30     ` Ivan Malov
2021-10-01  6:47       ` Andrew Rybchenko
2021-10-01  8:11         ` Thomas Monjalon
2021-10-01  8:54           ` Andrew Rybchenko
2021-10-01  9:32             ` Thomas Monjalon
2021-10-01  9:41               ` Andrew Rybchenko
2021-10-01  8:55           ` Ivan Malov
2021-10-01  9:48             ` Thomas Monjalon
2021-10-01 10:15               ` Andrew Rybchenko
2021-10-01 12:10                 ` Thomas Monjalon
2021-10-04  9:17                   ` Andrew Rybchenko
2021-10-04 23:50   ` [dpdk-dev] [PATCH v4 0/5] Negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
2021-10-05 12:03       ` Ori Kam
2021-10-05 12:50         ` Ivan Malov
2021-10-05 13:17           ` Andrew Rybchenko
2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
2021-10-04 23:50     ` [dpdk-dev] [PATCH v4 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
2021-10-05 15:56   ` [dpdk-dev] [PATCH v5 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
2021-10-05 21:40       ` Ajit Khaparde
2021-10-06  6:04         ` Somnath Kotur
2021-10-06  6:10           ` Ori Kam
2021-10-06  7:22             ` Wisam Monther
2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
2021-10-05 15:56     ` [dpdk-dev] [PATCH v5 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
2021-10-12 18:08       ` Ferruh Yigit
2021-10-12 19:39         ` Ivan Malov
2021-10-12 19:48         ` Ivan Malov
2021-10-12 19:38   ` [dpdk-dev] [PATCH v6 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
2021-10-12 19:38     ` [dpdk-dev] [PATCH v6 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
2021-10-12 19:46   ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ivan Malov
2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 1/5] ethdev: negotiate delivery of packet metadata from HW to PMD Ivan Malov
2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 2/5] net/sfc: support API to negotiate delivery of Rx metadata Ivan Malov
2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 3/5] net/sfc: support flow mark delivery on EF100 native datapath Ivan Malov
2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 4/5] common/sfc_efx/base: add RxQ flag to use Rx prefix user flag Ivan Malov
2021-10-12 19:46     ` [dpdk-dev] [PATCH v7 5/5] net/sfc: report user flag on EF100 native datapath Ivan Malov
2021-10-12 23:25     ` [dpdk-dev] [PATCH v7 0/5] ethdev: negotiate the NIC's ability to deliver Rx metadata to the PMD Ferruh Yigit
