DPDK patches and discussions
* [dpdk-dev] [PATCH 0/8] support the flow-based traffic sampling
@ 2020-06-25 16:26 Jiawei Wang
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow Jiawei Wang
                   ` (9 more replies)
  0 siblings, 10 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-06-25 16:26 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

This patch set implements flow sampling for the mlx5 driver.

The solution introduces a new rte_flow action that samples the incoming
traffic and sends a duplicate of it to the application at a predefined
ratio, while the original packet continues to its target destination.

If the sample ratio is set to 1, the packets are completely mirrored.
The sampled packet can be assigned an additional set of actions,
separate from those of the original packet.

The MLX5 PMD is responsible for validating and translating the sample
action when a flow is created.

Jiawei Wang (8):
  ethdev: introduce sample action for rte flow
  common/mlx5: glue for default miss and sample action
  common/mlx5: query sampler object capability via DevX
  net/mlx5: add the validate sample action
  net/mlx5: split sample flow into two sub flows
  net/mlx5: update translate function for sample action
  net/mlx5: update the metadata register c0 support
  app/testpmd: add testpmd command for sample action

 app/test-pmd/cmdline_flow.c           | 285 ++++++++++++++-
 drivers/common/mlx5/Makefile          |  10 +
 drivers/common/mlx5/linux/meson.build |   4 +
 drivers/common/mlx5/linux/mlx5_glue.c |  28 ++
 drivers/common/mlx5/linux/mlx5_glue.h |  13 +
 drivers/common/mlx5/mlx5_devx_cmds.c  |  27 ++
 drivers/common/mlx5/mlx5_devx_cmds.h  |   1 +
 drivers/common/mlx5/mlx5_prm.h        |  51 +++
 drivers/net/mlx5/linux/mlx5_os.c      |  14 +
 drivers/net/mlx5/mlx5.c               |  11 +
 drivers/net/mlx5/mlx5.h               |   4 +
 drivers/net/mlx5/mlx5_flow.c          | 270 +++++++++++++-
 drivers/net/mlx5/mlx5_flow.h          |  52 ++-
 drivers/net/mlx5/mlx5_flow_dv.c       | 668 +++++++++++++++++++++++++++++++++-
 lib/librte_ethdev/rte_flow.c          |   1 +
 lib/librte_ethdev/rte_flow.h          |  29 ++
 16 files changed, 1422 insertions(+), 46 deletions(-)

-- 
1.8.3.1



* [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-25 16:26 [dpdk-dev] [PATCH 0/8] support the flow-based traffic sampling Jiawei Wang
@ 2020-06-25 16:26 ` Jiawei Wang
  2020-06-25 17:55   ` Jerin Jacob
                     ` (2 more replies)
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 2/8] common/mlx5: glue for default miss and sample action Jiawei Wang
                   ` (8 subsequent siblings)
  9 siblings, 3 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-06-25 16:26 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

When using full offload, all traffic is handled by the HW and directed
to the requested VF or wire, so the control application loses
visibility of the traffic.
There is therefore a need for an action that gives the control
application some visibility.

The solution introduces a new action that samples the incoming traffic
and sends a duplicate of it to the application at a predefined ratio,
while the original packet continues to its target destination.

The fraction of packets sampled is '1/ratio'. If the ratio is set to 1,
the packets are completely mirrored. The sampled packet can be assigned
a different set of actions from the original packet.

To support packet sampling in rte_flow, a new rte_flow action
definition, RTE_FLOW_ACTION_TYPE_SAMPLE, and a new structure,
rte_flow_action_sample, are introduced.
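
As an illustration only (not part of the patch), the ratio semantics
described above can be sketched with stand-in types. The demo_* names
are hypothetical simplifications, not the real rte_flow.h definitions;
only the two fields (ratio and an END-terminated sub-action list)
mirror the structure proposed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types: simplified local copies, NOT the real rte_flow.h ones. */
enum demo_action_type {
	DEMO_ACTION_END,
	DEMO_ACTION_QUEUE,
	DEMO_ACTION_SAMPLE,
};

struct demo_action {
	enum demo_action_type type;
	const void *conf;
};

/* Mirrors the two fields of the proposed struct rte_flow_action_sample. */
struct demo_action_sample {
	uint32_t ratio;                    /* one packet in 'ratio' is sampled */
	const struct demo_action *actions; /* END-terminated sub-action list */
};

/* Expected share of matched packets that get duplicated, in percent. */
static double
demo_sample_percent(const struct demo_action_sample *s)
{
	assert(s != NULL);
	/* A ratio of 0 is invalid; report it as "nothing sampled". */
	return s->ratio ? 100.0 / s->ratio : 0.0;
}

/* Count the sub-actions, excluding the END terminator. */
static size_t
demo_sub_action_count(const struct demo_action_sample *s)
{
	size_t n = 0;

	assert(s != NULL && s->actions != NULL);
	while (s->actions[n].type != DEMO_ACTION_END)
		n++;
	return n;
}
```

With ratio = 1 the helper reports 100%, which is the mirroring case;
with ratio = 4 it reports 25%.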

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
 lib/librte_ethdev/rte_flow.c |  1 +
 lib/librte_ethdev/rte_flow.h | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 1685be5..733871d 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -173,6 +173,7 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
+	MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
 };
 
 int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index b0e4199..71dd82c 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -2099,6 +2099,13 @@ enum rte_flow_action_type {
 	 * see enum RTE_ETH_EVENT_FLOW_AGED
 	 */
 	RTE_FLOW_ACTION_TYPE_AGE,
+
+	/**
+	 * Redirects a specific ratio of packets to a vport or queue.
+	 *
+	 * See struct rte_flow_action_sample.
+	 */
+	RTE_FLOW_ACTION_TYPE_SAMPLE,
 };
 
 /**
@@ -2709,6 +2716,28 @@ struct rte_flow_action {
 struct rte_flow;
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SAMPLE
+ *
+ * Adds a sample action to a matched flow.
+ *
+ * The matching packets will be duplicated to a special queue or vport
+ * with the predefined probability. All the packets continue processing
+ * on the default flow path.
+ *
+ * When the sample ratio is set to 1, the packets will be 100% mirrored.
+ * An additional action list can be applied to the sampled or mirrored packets.
+ */
+struct rte_flow_action_sample {
+	/* Fraction of packets sampled is '1/ratio'. */
+	const uint32_t ratio;
+	/* Sub-action list applied only to the sampled packets. */
+	const struct rte_flow_action *actions;
+};
+
+/**
  * Verbose error types.
  *
  * Most of them provide the type of the object referenced by struct
-- 
1.8.3.1



* [dpdk-dev] [PATCH 2/8] common/mlx5: glue for default miss and sample action
  2020-06-25 16:26 [dpdk-dev] [PATCH 0/8] support the flow-based traffic sampling Jiawei Wang
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow Jiawei Wang
@ 2020-06-25 16:26 ` Jiawei Wang
  2020-06-30 15:25   ` Ori Kam
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 3/8] common/mlx5: query sampler object capability via DevX Jiawei Wang
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 129+ messages in thread
From: Jiawei Wang @ 2020-06-25 16:26 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

rdma-core introduces two new DR actions: default miss and sample.

Add the rdma-core commands to the glue layer to create these two
actions.

The default miss action is used for the sampled packet in the FDB
domain; it steers the packet to the eswitch manager vport.

The sample action is used to create the sample object that implements
the sampling/mirroring function.
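
For readers unfamiliar with the glue layer, here is a rough sketch of
the compile-time fallback pattern and of how a caller might react to
it. All demo_* names and the DEMO_HAVE_SAMPLER switch are hypothetical
stand-ins for the HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE guard; no
rdma-core call is made.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical build-time switch; left undefined to show the fallback. */
/* #define DEMO_HAVE_SAMPLER 1 */

/* Stand-in for the sampler attributes passed through the glue layer. */
struct demo_sampler_attr {
	unsigned int sample_ratio;
};

/* Glue wrapper: returns an opaque action, or NULL with errno set. */
static void *
demo_glue_create_sampler(struct demo_sampler_attr *attr)
{
#ifdef DEMO_HAVE_SAMPLER
	static int dummy_action; /* placeholder for a real action object */

	(void)attr;
	return &dummy_action;
#else
	(void)attr;
	errno = ENOTSUP;
	return NULL;
#endif
}

/* Caller side: turn a NULL result into a negative errno value. */
static int
demo_try_create_sampler(unsigned int ratio, void **action)
{
	struct demo_sampler_attr attr = { .sample_ratio = ratio };

	assert(action != NULL);
	*action = demo_glue_create_sampler(&attr);
	return *action ? 0 : -errno;
}
```

When the library lacks the symbol, the caller sees -ENOTSUP and can
report the sample action as unsupported instead of failing at link
time.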

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
 drivers/common/mlx5/Makefile          | 10 ++++++++++
 drivers/common/mlx5/linux/meson.build |  4 ++++
 drivers/common/mlx5/linux/mlx5_glue.c | 28 ++++++++++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_glue.h | 13 +++++++++++++
 4 files changed, 55 insertions(+)

diff --git a/drivers/common/mlx5/Makefile b/drivers/common/mlx5/Makefile
index 622bde4..8db0604 100644
--- a/drivers/common/mlx5/Makefile
+++ b/drivers/common/mlx5/Makefile
@@ -187,6 +187,16 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
 		func mlx5dv_dump_dr_domain \
 		$(AUTOCONF_OUTPUT)
 	$Q sh -- '$<' '$@' \
+		HAVE_MLX5_DR_CREATE_ACTION_DEFAULT_MISS \
+		infiniband/mlx5dv.h \
+		func mlx5dv_dr_action_create_default_miss \
+		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE \
+		infiniband/mlx5dv.h \
+		func mlx5dv_dr_action_create_flow_sampler \
+		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
 		HAVE_MLX5DV_MMAP_GET_NC_PAGES_CMD \
 		infiniband/mlx5dv.h \
 		enum MLX5_MMAP_GET_NC_PAGES_CMD \
diff --git a/drivers/common/mlx5/linux/meson.build b/drivers/common/mlx5/linux/meson.build
index 638bb2b..95f3204 100644
--- a/drivers/common/mlx5/linux/meson.build
+++ b/drivers/common/mlx5/linux/meson.build
@@ -160,6 +160,10 @@ has_sym_args = [
 	'RDMA_NLDEV_ATTR_NDEV_INDEX' ],
 	[ 'HAVE_MLX5_DR_FLOW_DUMP', 'infiniband/mlx5dv.h',
 	'mlx5dv_dump_dr_domain'],
+	[ 'HAVE_MLX5_DR_CREATE_ACTION_DEFAULT_MISS', 'infiniband/mlx5dv.h',
+	'mlx5dv_dr_action_create_default_miss'],
+	[ 'HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE', 'infiniband/mlx5dv.h',
+	'mlx5dv_dr_action_create_flow_sampler'],
 	[ 'HAVE_MLX5DV_DR_MEM_RECLAIM', 'infiniband/mlx5dv.h',
 	'mlx5dv_dr_domain_set_reclaim_device_memory'],
 	[ 'HAVE_DEVLINK', 'linux/devlink.h', 'DEVLINK_GENL_NAME' ],
diff --git a/drivers/common/mlx5/linux/mlx5_glue.c b/drivers/common/mlx5/linux/mlx5_glue.c
index c91ee33..ea366e2 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.c
+++ b/drivers/common/mlx5/linux/mlx5_glue.c
@@ -1047,6 +1047,30 @@
 #endif
 }
 
+static void *
+mlx5_glue_dr_create_flow_action_default_miss(void)
+{
+#ifdef HAVE_MLX5_DR_CREATE_ACTION_DEFAULT_MISS
+	return mlx5dv_dr_action_create_default_miss();
+#else
+	errno = ENOTSUP;
+	return NULL;
+#endif
+}
+
+static void *
+mlx5_glue_dr_create_flow_action_sampler(
+			struct mlx5dv_dr_flow_sampler_attr *attr)
+{
+#ifdef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+	return mlx5dv_dr_action_create_flow_sampler(attr);
+#else
+	(void)attr;
+	errno = ENOTSUP;
+	return NULL;
+#endif
+}
+
 static int
 mlx5_glue_devx_query_eqn(struct ibv_context *ctx, uint32_t cpus,
 			 uint32_t *eqn)
@@ -1294,6 +1318,10 @@
 	.devx_port_query = mlx5_glue_devx_port_query,
 	.dr_dump_domain = mlx5_glue_dr_dump_domain,
 	.dr_reclaim_domain_memory = mlx5_glue_dr_reclaim_domain_memory,
+	.dr_create_flow_action_default_miss =
+		mlx5_glue_dr_create_flow_action_default_miss,
+	.dr_create_flow_action_sampler =
+		mlx5_glue_dr_create_flow_action_sampler,
 	.devx_query_eqn = mlx5_glue_devx_query_eqn,
 	.devx_create_event_channel = mlx5_glue_devx_create_event_channel,
 	.devx_destroy_event_channel = mlx5_glue_devx_destroy_event_channel,
diff --git a/drivers/common/mlx5/linux/mlx5_glue.h b/drivers/common/mlx5/linux/mlx5_glue.h
index 5d238a4..9b1487d 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.h
+++ b/drivers/common/mlx5/linux/mlx5_glue.h
@@ -77,6 +77,7 @@
 #ifndef HAVE_MLX5DV_DR
 enum  mlx5dv_dr_domain_type { unused, };
 struct mlx5dv_dr_domain;
+struct mlx5dv_dr_action;
 #endif
 
 #ifndef HAVE_MLX5DV_DR_DEVX_PORT
@@ -87,6 +88,15 @@
 struct mlx5dv_dr_flow_meter_attr;
 #endif
 
+#ifndef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+struct mlx5dv_dr_flow_sampler_attr {
+	uint32_t sample_ratio;
+	void *default_next_table;
+	size_t num_sample_actions;
+	struct mlx5dv_dr_action **sample_actions;
+};
+#endif
+
 #ifndef HAVE_IBV_DEVX_EVENT
 struct mlx5dv_devx_event_channel { int fd; };
 struct mlx5dv_devx_async_event_hdr;
@@ -303,6 +313,9 @@ struct mlx5_glue {
 			 struct mlx5dv_devx_async_event_hdr *event_data,
 			 size_t event_resp_len);
 	void (*dr_reclaim_domain_memory)(void *domain, uint32_t enable);
+	void *(*dr_create_flow_action_default_miss)(void);
+	void *(*dr_create_flow_action_sampler)
+			(struct mlx5dv_dr_flow_sampler_attr *attr);
 };
 
 extern const struct mlx5_glue *mlx5_glue;
-- 
1.8.3.1



* [dpdk-dev] [PATCH 3/8] common/mlx5: query sampler object capability via DevX
  2020-06-25 16:26 [dpdk-dev] [PATCH 0/8] support the flow-based traffic sampling Jiawei Wang
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow Jiawei Wang
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 2/8] common/mlx5: glue for default miss and sample action Jiawei Wang
@ 2020-06-25 16:26 ` Jiawei Wang
  2020-06-30 17:38   ` Ori Kam
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 4/8] net/mlx5: add the validate sample action Jiawei Wang
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 129+ messages in thread
From: Jiawei Wang @ 2020-06-25 16:26 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Update mlx5_devx_cmd_query_hca_attr() to query the NIC Flow Table
attributes, then read log_max_ft_sampler_num from the flow table
properties.

Add the related struct definitions to mlx5_prm.h.
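
A small sketch of how such a capability field might be interpreted.
It assumes the usual log2 encoding of log_max_* fields, which this
patch does not state explicitly. The enable check follows the one the
next patch in the series adds to mlx5_os.c; the demo_* helpers
themselves are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a log2-encoded capability into an object count.
 * Assumption: the field holds log2 of the maximum, as its name suggests.
 */
static uint64_t
demo_cap_from_log2(uint8_t log_max)
{
	/* Precondition: a 64-bit shift is only defined for values below 64. */
	assert(log_max < 64);
	return UINT64_C(1) << log_max;
}

/*
 * Mirrors the check in the following patch: sampling is enabled only
 * when the capability is non-zero and DV flow is enabled.
 */
static int
demo_sampler_enabled(uint8_t log_max_ft_sampler_num, int dv_flow_en)
{
	return log_max_ft_sampler_num > 0 && dv_flow_en;
}
```

Note that the driver treats a value of 0 as "no sampler available",
even though a plain log2 reading of 0 would mean a single object.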

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c | 27 +++++++++++++++++++
 drivers/common/mlx5/mlx5_devx_cmds.h |  1 +
 drivers/common/mlx5/mlx5_prm.h       | 51 ++++++++++++++++++++++++++++++++++++
 3 files changed, 79 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index ec92eb6..6b551f1 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -496,6 +496,33 @@ struct mlx5_devx_obj *
 	if (!attr->eth_net_offloads)
 		return 0;
 
+	/* Query Flow Sampler Capability From Flow Table Properties Layout. */
+	memset(in, 0, sizeof(in));
+	memset(out, 0, sizeof(out));
+	MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
+	MLX5_SET(query_hca_cap_in, in, op_mod,
+		 MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE |
+		 MLX5_HCA_CAP_OPMOD_GET_CUR);
+
+	rc = mlx5_glue->devx_general_cmd(ctx,
+					 in, sizeof(in),
+					 out, sizeof(out));
+	if (rc)
+		goto error;
+	status = MLX5_GET(query_hca_cap_out, out, status);
+	syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
+	if (status) {
+		DRV_LOG(DEBUG, "Failed to query devx HCA capabilities, "
+			"status %x, syndrome = %x",
+			status, syndrome);
+		attr->log_max_ft_sampler_num = 0;
+		return -1;
+	}
+	hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
+	attr->log_max_ft_sampler_num =
+			MLX5_GET(flow_table_nic_cap,
+			hcattr, flow_table_properties.log_max_ft_sampler_num);
+
 	/* Query HCA offloads for Ethernet protocol. */
 	memset(in, 0, sizeof(in));
 	memset(out, 0, sizeof(out));
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 25704ef..a9cfe6d 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -90,6 +90,7 @@ struct mlx5_hca_attr {
 	uint32_t vhca_id:16;
 	uint32_t relaxed_ordering_write:1;
 	uint32_t relaxed_ordering_read:1;
+	uint32_t log_max_ft_sampler_num:8;
 	struct mlx5_hca_qos_attr qos;
 	struct mlx5_hca_vdpa_attr vdpa;
 };
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index c63795f..e7d0a65 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -944,6 +944,7 @@ enum {
 	MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE = 0x0 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_ETHERNET_OFFLOAD_CAPS = 0x1 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP = 0xc << 1,
+	MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE = 0x7 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_VDPA_EMULATION = 0x13 << 1,
 };
 
@@ -1365,12 +1366,62 @@ struct mlx5_ifc_virtio_emulation_cap_bits {
 	u8 reserved_at_1c0[0x620];
 };
 
+struct mlx5_ifc_flow_table_prop_layout_bits {
+	u8 ft_support[0x1];
+	u8 flow_tag[0x1];
+	u8 flow_counter[0x1];
+	u8 flow_modify_en[0x1];
+	u8 modify_root[0x1];
+	u8 identified_miss_table[0x1];
+	u8 flow_table_modify[0x1];
+	u8 reformat[0x1];
+	u8 decap[0x1];
+	u8 reset_root_to_default[0x1];
+	u8 pop_vlan[0x1];
+	u8 push_vlan[0x1];
+	u8 fpga_vendor_acceleration[0x1];
+	u8 pop_vlan_2[0x1];
+	u8 push_vlan_2[0x1];
+	u8 reformat_and_vlan_action[0x1];
+	u8 modify_and_vlan_action[0x1];
+	u8 sw_owner[0x1];
+	u8 reformat_l3_tunnel_to_l2[0x1];
+	u8 reformat_l2_to_l3_tunnel[0x1];
+	u8 reformat_and_modify_action[0x1];
+	u8 reserved_at_15[0x9];
+	u8 sw_owner_v2[0x1];
+	u8 reserved_at_1f[0x1];
+	u8 reserved_at_20[0x2];
+	u8 log_max_ft_size[0x6];
+	u8 log_max_modify_header_context[0x8];
+	u8 max_modify_header_actions[0x8];
+	u8 max_ft_level[0x8];
+	u8 reserved_at_40[0x8];
+	u8 log_max_ft_sampler_num[8];
+	u8 metadata_reg_b_width[0x8];
+	u8 metadata_reg_a_width[0x8];
+	u8 reserved_at_60[0x18];
+	u8 log_max_ft_num[0x8];
+	u8 reserved_at_80[0x10];
+	u8 log_max_flow_counter[0x8];
+	u8 log_max_destination[0x8];
+	u8 reserved_at_a0[0x18];
+	u8 log_max_flow[0x8];
+	u8 reserved_at_c0[0x140];
+};
+
+struct mlx5_ifc_flow_table_nic_cap_bits {
+	u8	   reserved_at_0[0x200];
+	struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties;
+};
+
 union mlx5_ifc_hca_cap_union_bits {
 	struct mlx5_ifc_cmd_hca_cap_bits cmd_hca_cap;
 	struct mlx5_ifc_per_protocol_networking_offload_caps_bits
 	       per_protocol_networking_offload_caps;
 	struct mlx5_ifc_qos_cap_bits qos_cap;
 	struct mlx5_ifc_virtio_emulation_cap_bits vdpa_caps;
+	struct mlx5_ifc_flow_table_nic_cap_bits flow_table_nic_cap;
 	u8 reserved_at_0[0x8000];
 };
 
-- 
1.8.3.1



* [dpdk-dev] [PATCH 4/8] net/mlx5: add the validate sample action
  2020-06-25 16:26 [dpdk-dev] [PATCH 0/8] support the flow-based traffic sampling Jiawei Wang
                   ` (2 preceding siblings ...)
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 3/8] common/mlx5: query sampler object capability via DevX Jiawei Wang
@ 2020-06-25 16:26 ` Jiawei Wang
  2020-06-30 17:59   ` Ori Kam
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 5/8] net/mlx5: split sample flow into two sub flows Jiawei Wang
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 129+ messages in thread
From: Jiawei Wang @ 2020-06-25 16:26 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add the sample action validation function.

The sample flow is supported in the NIC_RX and FDB domains. It must
include a destination TIR action in NIC_RX or a DEFAULT_MISS action
in FDB.

Only NIC_RX supports additional optional actions.
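
The domain rules above can be summarised as a small decision function.
This is a simplified sketch of part of the checks in
flow_dv_validate_action_sample() from this patch, with hypothetical
demo_* names; the capability, duplicate-action and meter-ordering
checks are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of a flow that carries a sample action. */
struct demo_sample_req {
	uint32_t group;     /* flow group; 0 is the root table */
	uint32_t ratio;     /* sample ratio, must start from 1 */
	bool ingress;
	bool egress;
	bool transfer;      /* set for E-Switch (FDB) flows */
	bool has_queue;     /* sub-action list contains QUEUE */
	unsigned int n_sub; /* number of sub-actions, END excluded */
};

/* Return 0 if the request is acceptable, a negative errno otherwise. */
static int
demo_validate_sample(const struct demo_sample_req *r)
{
	assert(r != NULL);
	if (r->group == 0)
		return -ENOTSUP; /* root table is not supported */
	if (r->ratio == 0)
		return -EINVAL;  /* ratio starts from 1 */
	if (r->ingress && !r->transfer)
		return r->has_queue ? 0 : -EINVAL; /* NIC_RX needs QUEUE */
	if (r->egress && !r->transfer)
		return -ENOTSUP; /* only ingress or E-Switch */
	return r->n_sub == 0 ? 0 : -ENOTSUP; /* FDB: no sub-actions */
}
```

An ingress flow in a non-root group with a QUEUE sub-action passes,
while an E-Switch flow passes only with an empty sub-action list.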

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
 drivers/net/mlx5/linux/mlx5_os.c |  14 +++++
 drivers/net/mlx5/mlx5.h          |   1 +
 drivers/net/mlx5/mlx5_flow.h     |   1 +
 drivers/net/mlx5/mlx5_flow_dv.c  | 130 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 146 insertions(+)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index f0147e6..5c057d3 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -878,6 +878,20 @@
 			}
 		}
 #endif
+#if defined(HAVE_MLX5DV_DR) && defined(HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE)
+		if (config.hca_attr.log_max_ft_sampler_num > 0  &&
+		    config.dv_flow_en) {
+			priv->sampler_en = 1;
+			DRV_LOG(DEBUG, "The Sampler enabled!\n");
+		} else {
+			priv->sampler_en = 0;
+			if (!config.hca_attr.log_max_ft_sampler_num)
+				DRV_LOG(WARNING, "No available register for"
+						" Sampler.");
+			else
+				DRV_LOG(DEBUG, "DV flow is not supported!\n");
+		}
+#endif
 	}
 	if (config.mprq.enabled && mprq) {
 		if (config.mprq.stride_num_n &&
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 8a09ebc..c2a875c 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -607,6 +607,7 @@ struct mlx5_priv {
 	unsigned int counter_fallback:1; /* Use counter fallback management. */
 	unsigned int mtr_en:1; /* Whether support meter. */
 	unsigned int mtr_reg_share:1; /* Whether support meter REG_C share. */
+	unsigned int sampler_en:1; /* Whether support sampler. */
 	uint16_t domain_id; /* Switch domain identifier. */
 	uint16_t vport_id; /* Associated VF vport index (if any). */
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 2c96677..902380b 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -200,6 +200,7 @@ enum mlx5_feature_name {
 #define MLX5_FLOW_ACTION_SET_IPV4_DSCP (1ull << 32)
 #define MLX5_FLOW_ACTION_SET_IPV6_DSCP (1ull << 33)
 #define MLX5_FLOW_ACTION_AGE (1ull << 34)
+#define MLX5_FLOW_ACTION_SAMPLE (1ull << 35)
 
 #define MLX5_FLOW_FATE_ACTIONS \
 	(MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE | \
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index f174009..710c0f3 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -3925,6 +3925,127 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Validate the sample action.
+ *
+ * @param[in] action_flags
+ *   Holds the actions detected until now.
+ * @param[in] action
+ *   Pointer to the sample action.
+ * @param[in] dev
+ *   Pointer to the Ethernet device structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this action.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_action_sample(uint64_t action_flags,
+			      const struct rte_flow_action *action,
+			      struct rte_eth_dev *dev,
+			      const struct rte_flow_attr *attr,
+			      struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_config *dev_conf = &priv->config;
+	const struct rte_flow_action_sample *sample = action->conf;
+	const struct rte_flow_action *act = sample->actions;
+	uint64_t sub_action_flags = 0;
+	int actions_n = 0;
+	int ret;
+
+	if (!attr->group)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
+					  NULL, "root table is not supported");
+	if (!priv->config.devx || !priv->sampler_en)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "sample action not supported");
+	if (!(action->conf))
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "configuration cannot be null");
+	if (sample->ratio == 0)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "ratio value starts from 1");
+	if (action_flags & MLX5_FLOW_ACTION_SAMPLE)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "Duplicate sample actions set");
+	if (action_flags & MLX5_FLOW_ACTION_METER)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "wrong action order, meter should "
+					  "be after sample action");
+	for (; act->type != RTE_FLOW_ACTION_TYPE_END; act++) {
+		if (actions_n == MLX5_DV_MAX_NUMBER_OF_ACTIONS)
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  act, "too many actions");
+		switch (act->type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			ret = mlx5_flow_validate_action_queue(act,
+							      sub_action_flags,
+							      dev,
+							      attr, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			break;
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			ret = flow_dv_validate_action_mark(dev, act,
+							   sub_action_flags,
+							   attr, error);
+			if (ret < 0)
+				return ret;
+			if (dev_conf->dv_xmeta_en != MLX5_XMETA_MODE_LEGACY)
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK |
+						MLX5_FLOW_ACTION_MARK_EXT;
+			else
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK;
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			ret = flow_dv_validate_action_count(dev, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_COUNT;
+			break;
+		default:
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "Doesn't support optional "
+						  "action");
+		}
+	}
+	if (attr->ingress && !attr->transfer) {
+		if (!(sub_action_flags & MLX5_FLOW_ACTION_QUEUE))
+			return rte_flow_error_set(error, EINVAL,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "Ingress must have a dest "
+						  "QUEUE for Sample");
+	} else if (attr->egress && !attr->transfer) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL,
+					  "Sample only supports Ingress "
+					  "or E-Switch");
+	} else if (sample->actions->type != RTE_FLOW_ACTION_TYPE_END) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "E-Switch doesn't support any "
+					  "optional action for sampling");
+	}
+	return 0;
+}
+
+/**
  * Find existing modify-header resource or create and register a new one.
  *
  * @param dev[in, out]
@@ -5539,6 +5660,15 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			rw_act_num += MLX5_ACT_NUM_SET_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			ret = flow_dv_validate_action_sample(action_flags,
+							     actions, dev,
+							     attr, error);
+			if (ret < 0)
+				return ret;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			++actions_n;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ACTION,
-- 
1.8.3.1



* [dpdk-dev] [PATCH 5/8] net/mlx5: split sample flow into two sub flows
  2020-06-25 16:26 [dpdk-dev] [PATCH 0/8] support the flow-based traffic sampling Jiawei Wang
                   ` (3 preceding siblings ...)
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 4/8] net/mlx5: add the validate sample action Jiawei Wang
@ 2020-06-25 16:26 ` Jiawei Wang
  2020-06-30 18:18   ` Ori Kam
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 6/8] net/mlx5: update translate function for sample action Jiawei Wang
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 129+ messages in thread
From: Jiawei Wang @ 2020-06-25 16:26 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add the sampler action resource struct definitions.

A flow with a sample action is split into two sub flows: the prefix
flow, which carries the sample action, and the suffix flow, which
carries the remaining actions.

The prefix flow gets an extra tag action that writes a unique ID to a
metadata register, and the suffix flow gets an extra tag item that
matches that unique ID.
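
To make the split concrete, here is a simplified sketch of how the
action list is divided, following the loop in flow_sample_split_prep()
from this patch. The demo_* names are hypothetical; only action types
are tracked, and the tag ID allocation and the suffix match item are
left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in action types; DEMO_TAG models the internal tag action. */
enum demo_act {
	DEMO_END,
	DEMO_COUNT,
	DEMO_SAMPLE,
	DEMO_METER,
	DEMO_JUMP,
	DEMO_QUEUE,
	DEMO_TAG,
};

/*
 * Split an END-terminated action list around SAMPLE.
 * pre[] and sfx[] must each have room for every input action plus two
 * entries (the extra tag action and the END terminator).
 */
static void
demo_split(const enum demo_act *in, enum demo_act *pre, enum demo_act *sfx)
{
	int seen_sample = 0;

	assert(in != NULL && pre != NULL && sfx != NULL);
	for (; *in != DEMO_END; in++) {
		/* JUMP and METER always go to the suffix flow. */
		int to_sfx = seen_sample ||
			     *in == DEMO_JUMP || *in == DEMO_METER;

		if (*in == DEMO_SAMPLE) {
			*pre++ = DEMO_TAG; /* tag is placed before sample */
			seen_sample = 1;
		}
		if (to_sfx)
			*sfx++ = *in;
		else
			*pre++ = *in;
	}
	*pre = DEMO_END;
	*sfx = DEMO_END;
}
```

For the input COUNT, SAMPLE, QUEUE the prefix becomes COUNT, TAG,
SAMPLE and the suffix becomes QUEUE; the suffix flow then matches on
the tag value written by the prefix flow.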

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
 drivers/net/mlx5/mlx5.c      |  11 ++
 drivers/net/mlx5/mlx5.h      |   3 +
 drivers/net/mlx5/mlx5_flow.c | 254 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_flow.h |  37 +++++++
 4 files changed, 301 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index ddbe29d..4a52462 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -238,6 +238,17 @@ static LIST_HEAD(, mlx5_dev_ctx_shared) mlx5_dev_ctx_list =
 		.free = rte_free,
 		.type = "mlx5_jump_ipool",
 	},
+	{
+		.size = sizeof(struct mlx5_flow_dv_sample_resource),
+		.trunk_size = 64,
+		.grow_trunk = 3,
+		.grow_shift = 2,
+		.need_lock = 0,
+		.release_mem_en = 1,
+		.malloc = rte_malloc_socket,
+		.free = rte_free,
+		.type = "mlx5_sample_ipool",
+	},
 #endif
 	{
 		.size = sizeof(struct mlx5_flow_meter),
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index c2a875c..7394753 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -51,6 +51,7 @@ enum mlx5_ipool_index {
 	MLX5_IPOOL_TAG, /* Pool for tag resource. */
 	MLX5_IPOOL_PORT_ID, /* Pool for port id resource. */
 	MLX5_IPOOL_JUMP, /* Pool for jump resource. */
+	MLX5_IPOOL_SAMPLE, /* Pool for sample resource. */
 #endif
 	MLX5_IPOOL_MTR, /* Pool for meter resource. */
 	MLX5_IPOOL_MCP, /* Pool for metadata resource. */
@@ -510,6 +511,7 @@ struct mlx5_flow_tbl_resource {
 /* Tables for metering splits should be added here. */
 #define MLX5_MAX_TABLES_EXTERNAL (MLX5_MAX_TABLES - 3)
 #define MLX5_MAX_TABLES_FDB UINT16_MAX
+#define MLX5_FLOW_TABLE_FACTOR 10
 
 /* ID generation structure. */
 struct mlx5_flow_id_pool {
@@ -558,6 +560,7 @@ struct mlx5_dev_ctx_shared {
 	struct mlx5_hlist *tag_table;
 	uint32_t port_id_action_list; /* List of port ID actions. */
 	uint32_t push_vlan_action_list; /* List of push VLAN actions. */
+	uint32_t sample_action_list; /* List of sample actions. */
 	struct mlx5_flow_counter_mng cmng; /* Counters management structure. */
 	struct mlx5_indexed_pool *ipool[MLX5_IPOOL_MAX];
 	/* Memory Pool for mlx5 flow resources. */
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 3a48b89..7c65a9a 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -360,6 +360,8 @@ struct mlx5_flow_tunnel_info {
 		return REG_B;
 	case MLX5_HAIRPIN_TX:
 		return REG_A;
+	case MLX5_SAMPLE_FDB:
+		return REG_C_0;
 	case MLX5_METADATA_RX:
 		switch (config->dv_xmeta_en) {
 		case MLX5_XMETA_MODE_LEGACY:
@@ -3878,6 +3880,137 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	return 0;
 }
 
+
+/**
+ * Check whether the given action exists in the action list.
+ *
+ * @param[in] actions
+ *   Pointer to the list of actions.
+ * @param[in] action
+ *   The action type to look for.
+ *
+ * @return
+ *   > 0 the total number of actions, including END, if the action is found.
+ *   0 if the action is not found in the action list.
+ */
+static int
+flow_check_match_action(const struct rte_flow_action actions[],
+					enum rte_flow_action_type action)
+{
+	int actions_n = 0;
+	int flag = 0;
+
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		if (actions->type == action)
+			flag = 1;
+		actions_n++;
+	}
+	/* Count RTE_FLOW_ACTION_TYPE_END. */
+	return flag ? actions_n + 1 : 0;
+}
+
+/**
+ * Split the sample flow.
+ *
+ * The sample flow is split into two sub flows: the prefix flow keeps
+ * the sample action, and the remaining actions move to a new suffix flow.
+ *
+ * A tag action with a unique tag ID is also added to the prefix flow,
+ * and the same tag ID is used as a match item in the suffix flow.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[out] sfx_items
+ *   Suffix flow match items (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] actions_sfx
+ *   Suffix flow actions.
+ * @param[out] actions_pre
+ *   Prefix flow actions.
+ *
+ * @return
+ *   The unique tag ID shared by the prefix and suffix flows.
+ */
+static int
+flow_sample_split_prep(struct rte_eth_dev *dev,
+		 const struct rte_flow_attr *attr,
+		 struct rte_flow_item sfx_items[],
+		 const struct rte_flow_action actions[],
+		 struct rte_flow_action actions_sfx[],
+		 struct rte_flow_action actions_pre[])
+{
+	struct mlx5_rte_flow_action_set_tag *set_tag;
+	struct mlx5_rte_flow_item_tag *tag_spec;
+	struct mlx5_rte_flow_item_tag *tag_mask;
+	struct rte_flow_item *tag_item;
+	struct rte_flow_action *tag_action = NULL;
+	bool pre_sample = true;
+	struct rte_flow_error error;
+	uint32_t tag_id;
+
+	/* Prepare the actions for prefix and suffix flow. */
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		struct rte_flow_action **action_cur = NULL;
+
+		switch (actions->type) {
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			/* Add the extra tag action first. */
+			tag_action = actions_pre;
+			tag_action->type = (enum rte_flow_action_type)
+					MLX5_RTE_FLOW_ACTION_TYPE_TAG;
+			actions_pre++;
+			break;
+		case RTE_FLOW_ACTION_TYPE_JUMP:
+		case RTE_FLOW_ACTION_TYPE_METER:
+			action_cur = &actions_sfx;
+			break;
+		default:
+			break;
+		}
+		if (pre_sample && !action_cur)
+			action_cur = &actions_pre;
+		else
+			action_cur = &actions_sfx;
+		memcpy(*action_cur, actions, sizeof(struct rte_flow_action));
+		(*action_cur)++;
+		if (actions->type == RTE_FLOW_ACTION_TYPE_SAMPLE)
+			pre_sample = false;
+	}
+	/* Add end action to the actions. */
+	actions_sfx->type = RTE_FLOW_ACTION_TYPE_END;
+	actions_pre->type = RTE_FLOW_ACTION_TYPE_END;
+	actions_pre++;
+	/* Set the tag. */
+	set_tag = (void *)actions_pre;
+	set_tag->id = mlx5_flow_get_reg_id(dev, attr->transfer ?
+			MLX5_SAMPLE_FDB : MLX5_APP_TAG, 0, &error);
+	tag_id = flow_qrss_get_id(dev);
+	set_tag->data = tag_id;
+	assert(tag_action);
+	tag_action->conf = set_tag;
+	/* Prepare the suffix subflow items. */
+	if (sfx_items) {
+		tag_item = sfx_items++;
+		sfx_items->type = RTE_FLOW_ITEM_TYPE_END;
+		sfx_items++;
+		tag_spec = (struct mlx5_rte_flow_item_tag *)sfx_items;
+		tag_spec->data = tag_id;
+		tag_spec->id = set_tag->id;
+		tag_mask = tag_spec + 1;
+		tag_mask->data = UINT32_MAX;
+		tag_mask->id = UINT16_MAX;
+		tag_item->type = (enum rte_flow_item_type)
+				MLX5_RTE_FLOW_ITEM_TYPE_TAG;
+		tag_item->spec = tag_spec;
+		tag_item->last = NULL;
+		tag_item->mask = tag_mask;
+	}
+	return tag_id;
+}
+
 /**
  * The splitting for metadata feature.
  *
@@ -4137,6 +4270,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 static int
 flow_create_split_meter(struct rte_eth_dev *dev,
 			   struct rte_flow *flow,
+			   uint64_t prefix_layers,
 			   const struct rte_flow_attr *attr,
 			   const struct rte_flow_item items[],
 			   const struct rte_flow_action actions[],
@@ -4183,8 +4317,9 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 			goto exit;
 		}
 		/* Add the prefix subflow. */
-		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
-					      items, pre_actions, external,
+		ret = flow_create_split_inner(dev, flow, &dev_flow,
+					      prefix_layers, attr, items,
+					      pre_actions, external,
 					      flow_idx, error);
 		if (ret) {
 			ret = -rte_errno;
@@ -4199,7 +4334,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	/* Add the prefix subflow. */
 	ret = flow_create_split_metadata(dev, flow, dev_flow ?
 					 flow_get_prefix_layer_flags(dev_flow) :
-					 0, &sfx_attr,
+					 prefix_layers, &sfx_attr,
 					 sfx_items ? sfx_items : items,
 					 sfx_actions ? sfx_actions : actions,
 					 external, flow_idx, error);
@@ -4210,6 +4345,117 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 }
 
 /**
+ * The splitting for sample feature.
+ *
+ * The sample flow is split into two subflows: a prefix flow and
+ * a suffix flow.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] flow
+ *   Parent flow structure pointer.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] items
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[in] external
+ *   This flow rule is created by request external to PMD.
+ * @param[in] flow_idx
+ *   Memory pool index of the flow.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ * @return
+ *   0 on success, negative value otherwise
+ */
+static int
+flow_create_split_sample(struct rte_eth_dev *dev,
+			   struct rte_flow *flow,
+			   const struct rte_flow_attr *attr,
+			   const struct rte_flow_item items[],
+			   const struct rte_flow_action actions[],
+			   bool external, uint32_t flow_idx,
+			   struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct rte_flow_action *sfx_actions = NULL;
+	struct rte_flow_action *pre_actions = NULL;
+	struct rte_flow_item *sfx_items = NULL;
+	struct mlx5_flow *dev_flow = NULL;
+	struct rte_flow_attr sfx_attr = *attr;
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	struct mlx5_flow_tbl_data_entry *sfx_tbl_data;
+	struct mlx5_flow_tbl_resource *sfx_tbl;
+	union mlx5_flow_tbl_key sfx_table_key;
+	size_t act_size;
+	size_t item_size;
+	uint32_t tag_id = 0;
+	int actions_n = 0;
+	int ret = 0;
+
+	if (priv->sampler_en)
+		actions_n = flow_check_match_action(actions,
+					RTE_FLOW_ACTION_TYPE_SAMPLE);
+	if (actions_n) {
+		/* The prefix actions must include sample, tag, end. */
+		act_size = sizeof(struct rte_flow_action) * (actions_n * 2) +
+			   sizeof(struct mlx5_rte_flow_action_set_tag);
+		/* tag, end. */
+#define SAMPLE_SUFFIX_ITEM 2
+		item_size = sizeof(struct rte_flow_item) * SAMPLE_SUFFIX_ITEM +
+			    sizeof(struct mlx5_rte_flow_item_tag) * 2;
+		sfx_actions = rte_zmalloc(__func__, (act_size + item_size), 0);
+		if (!sfx_actions)
+			return rte_flow_error_set(error, ENOMEM,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL, "no memory to split "
+						  "sample flow");
+		if (!attr->transfer)
+			sfx_items = (struct rte_flow_item *)((char *)sfx_actions
+					+ act_size);
+		pre_actions = sfx_actions + actions_n;
+		tag_id = flow_sample_split_prep(dev, attr, sfx_items,
+						   actions, sfx_actions,
+						   pre_actions);
+		if (!tag_id) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		/* Add the prefix subflow. */
+		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
+					      items, pre_actions, external,
+					      flow_idx, error);
+		if (ret) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		dev_flow->handle->split_flow_id = tag_id;
+		/* Set the sfx group attr. */
+		sample_res = (struct mlx5_flow_dv_sample_resource *)
+					dev_flow->dv.sample_res;
+		sfx_tbl = (struct mlx5_flow_tbl_resource *)
+					sample_res->normal_path_tbl;
+		sfx_tbl_data = container_of(sfx_tbl,
+					struct mlx5_flow_tbl_data_entry, tbl);
+		sfx_table_key.v64 = sfx_tbl_data->entry.key;
+		sfx_attr.group = sfx_attr.transfer ?
+					(sfx_table_key.table_id - 1) :
+					sfx_table_key.table_id;
+	}
+	/* Add the suffix subflow. */
+	ret = flow_create_split_meter(dev, flow, dev_flow ?
+				 flow_get_prefix_layer_flags(dev_flow) : 0,
+				 &sfx_attr, sfx_items ? sfx_items : items,
+				 sfx_actions ? sfx_actions : actions,
+				 external, flow_idx, error);
+exit:
+	if (sfx_actions)
+		rte_free(sfx_actions);
+	return ret;
+}
+
+/**
  * Split the flow to subflow set. The splitters might be linked
  * in the chain, like this:
  * flow_create_split_outer() calls:
@@ -4257,7 +4503,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 {
 	int ret;
 
-	ret = flow_create_split_meter(dev, flow, attr, items,
+	ret = flow_create_split_sample(dev, flow, attr, items,
 					 actions, external, flow_idx, error);
 	MLX5_ASSERT(ret <= 0);
 	return ret;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 902380b..941de5f 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -79,6 +79,7 @@ enum mlx5_feature_name {
 	MLX5_COPY_MARK,
 	MLX5_MTR_COLOR,
 	MLX5_MTR_SFX,
+	MLX5_SAMPLE_FDB,
 };
 
 /* Pattern outer Layer bits. */
@@ -498,6 +499,38 @@ struct mlx5_flow_tbl_data_entry {
 	uint32_t idx; /**< index for the indexed mempool. */
 };
 
+/* Sub rdma-core actions list. */
+struct mlx5_flow_sub_actions_list {
+	uint32_t actions_num; /**< Number of sample actions. */
+	uint64_t action_flags;
+	void *dr_queue_action;
+	void *dr_tag_action;
+	void *dr_cnt_action;
+};
+
+/* Sample sub-actions resource list. */
+struct mlx5_flow_sub_actions_idx {
+	uint32_t rix_hrxq; /**< Hash Rx queue object index. */
+	uint32_t rix_tag; /**< Index to the tag action. */
+	uint32_t cnt;
+};
+
+/* Sample action resource structure. */
+struct mlx5_flow_dv_sample_resource {
+	ILIST_ENTRY(uint32_t)next; /**< Pointer to next element. */
+	rte_atomic32_t refcnt; /**< Reference counter. */
+	void *verbs_action; /**< Verbs sample action object. */
+	uint8_t ft_type; /**< Flow table type. */
+	uint32_t ft_id; /**< Flow table level. */
+	void *normal_path_tbl; /**< Flow table pointer. */
+	void *default_miss; /**< default_miss dr_action. */
+	uint32_t ratio; /**< Sample ratio. */
+	struct mlx5_flow_sub_actions_idx sample_idx;
+	/**< Action index resources. */
+	struct mlx5_flow_sub_actions_list sample_act;
+	/**< Action resources. */
+};
+
 /* Verbs specification header. */
 struct ibv_spec_header {
 	enum ibv_flow_spec_type type;
@@ -526,6 +559,8 @@ struct mlx5_flow_handle_dv {
 	/**< Index to push VLAN action resource in cache. */
 	uint32_t rix_tag;
 	/**< Index to the tag action. */
+	uint32_t rix_sample;
+	/**< Index to sample action resource in cache. */
 } __rte_packed;
 
 /** Device flow handle structure: used both for creating & destroying. */
@@ -589,6 +624,8 @@ struct mlx5_flow_dv_workspace {
 	/**< Pointer to the jump action resource. */
 	struct mlx5_flow_dv_match_params value;
 	/**< Holds the value that the packet is compared to. */
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	/**< Pointer to the sample action resource. */
 };
 
 /*
-- 
1.8.3.1



* [dpdk-dev] [PATCH 6/8] net/mlx5: update translate function for sample action
  2020-06-25 16:26 [dpdk-dev] [PATCH 0/8] support the flow-based traffic sampling Jiawei Wang
                   ` (4 preceding siblings ...)
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 5/8] net/mlx5: split sample flow into two sub flows Jiawei Wang
@ 2020-06-25 16:26 ` Jiawei Wang
  2020-06-30 19:54   ` Ori Kam
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 7/8] net/mlx5: update the metadata register c0 support Jiawei Wang
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 129+ messages in thread
From: Jiawei Wang @ 2020-06-25 16:26 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Translate the attributes of the sample action, which include the sample
ratio and the sub-actions list, then create the sample DR action.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c    |  16 +-
 drivers/net/mlx5/mlx5_flow.h    |  14 +-
 drivers/net/mlx5/mlx5_flow_dv.c | 502 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 511 insertions(+), 21 deletions(-)
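
As an illustration of the translation step described in the commit
message, here is a minimal sketch, not the driver code. It walks an
END-terminated sub-action list, copies the sample ratio, and collects
the supported sub-actions (queue, mark, count) into a fixed-size array.
All type and function names below are stand-ins invented for this
sketch, and the bounds check is an addition that the patch itself does
not contain.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in types for illustration only; these are NOT the DPDK definitions. */
enum act_type { ACT_END, ACT_QUEUE, ACT_MARK, ACT_COUNT, ACT_RSS };

struct action {
	enum act_type type;
	const void *conf;
};

struct sample_conf {
	uint32_t ratio;               /* 1 means every packet is mirrored. */
	const struct action *actions; /* Sub-action list, END terminated. */
};

#define MAX_SUB_ACTIONS 8 /* Mirrors the limit of 8 used in the patch. */

struct sample_res {
	uint32_t ratio;
	uint32_t actions_num;
	enum act_type acts[MAX_SUB_ACTIONS];
};

/* Return 0 on success, -1 on an unsupported sub-action or on overflow. */
static int
translate_sample(const struct sample_conf *conf, struct sample_res *res)
{
	const struct action *sub = conf->actions;

	res->ratio = conf->ratio;
	res->actions_num = 0;
	for (; sub->type != ACT_END; sub++) {
		switch (sub->type) {
		case ACT_QUEUE:
		case ACT_MARK:
		case ACT_COUNT:
			/* Hypothetical guard: refuse to overflow the array. */
			if (res->actions_num == MAX_SUB_ACTIONS)
				return -1;
			res->acts[res->actions_num++] = sub->type;
			break;
		default:
			/* Anything else is rejected, as in the patch. */
			return -1;
		}
	}
	return 0;
}
```

In the patch, the collected entries are DR action pointers rather than
type codes, and they are later handed to the sampler creation call
together with the ratio.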

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 7c65a9a..73ef290 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -4569,10 +4569,14 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	int hairpin_flow;
 	uint32_t hairpin_id = 0;
 	struct rte_flow_attr attr_tx = { .priority = 0 };
+	struct rte_flow_attr attr_factor = {0};
 	int ret;
 
-	hairpin_flow = flow_check_hairpin_split(dev, attr, actions);
-	ret = flow_drv_validate(dev, attr, items, p_actions_rx,
+	memcpy((void *)&attr_factor, (const void *)attr, sizeof(*attr));
+	if (external)
+		attr_factor.group *= MLX5_FLOW_TABLE_FACTOR;
+	hairpin_flow = flow_check_hairpin_split(dev, &attr_factor, actions);
+	ret = flow_drv_validate(dev, &attr_factor, items, p_actions_rx,
 				external, hairpin_flow, error);
 	if (ret < 0)
 		return 0;
@@ -4591,7 +4595,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 		rte_errno = ENOMEM;
 		goto error_before_flow;
 	}
-	flow->drv_type = flow_get_drv_type(dev, attr);
+	flow->drv_type = flow_get_drv_type(dev, &attr_factor);
 	if (hairpin_id != 0)
 		flow->hairpin_flow_id = hairpin_id;
 	MLX5_ASSERT(flow->drv_type > MLX5_FLOW_TYPE_MIN &&
@@ -4637,7 +4641,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 		 * depending on configuration. In the simplest
 		 * case it just creates unmodified original flow.
 		 */
-		ret = flow_create_split_outer(dev, flow, attr,
+		ret = flow_create_split_outer(dev, flow, &attr_factor,
 					      buf->entry[i].pattern,
 					      p_actions_rx, external, idx,
 					      error);
@@ -4674,8 +4678,8 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	 * the egress Flows belong to the different device and
 	 * copy table should be updated in peer NIC Rx domain.
 	 */
-	if (attr->ingress &&
-	    (external || attr->group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
+	if (attr_factor.ingress &&
+	    (external || attr_factor.group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
 		ret = flow_mreg_update_copy_table(dev, flow, actions, error);
 		if (ret)
 			goto error;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 941de5f..4163183 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -369,6 +369,13 @@ enum mlx5_flow_fate_type {
 	MLX5_FLOW_FATE_MAX,
 };
 
+/*
+ * Max number of actions per DV flow.
+ * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
+ * in rdma-core file providers/mlx5/verbs.c.
+ */
+#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
+
 /* Matcher PRM representation */
 struct mlx5_flow_dv_match_params {
 	size_t size;
@@ -599,13 +606,6 @@ struct mlx5_flow_handle {
 #define MLX5_FLOW_HANDLE_VERBS_SIZE (sizeof(struct mlx5_flow_handle))
 #endif
 
-/*
- * Max number of actions per DV flow.
- * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
- * in rdma-core file providers/mlx5/verbs.c.
- */
-#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
-
 /** Device flow structure only for DV flow creation. */
 struct mlx5_flow_dv_workspace {
 	uint32_t group; /**< The group index. */
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 710c0f3..62a4a3b 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -79,6 +79,10 @@
 flow_dv_tbl_resource_release(struct rte_eth_dev *dev,
 			     struct mlx5_flow_tbl_resource *tbl);
 
+static int
+flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
+				     uint32_t encap_decap_idx);
+
 /**
  * Initialize flow attributes structure according to flow items' types.
  *
@@ -7897,6 +7901,385 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Create an Rx Hash queue.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] rss_desc
+ *   Pointer to the mlx5_flow_rss_desc.
+ * @param[in, out] hrxq_idx
+ *   Hash Rx queue index.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   The Verbs/DevX object initialised, NULL otherwise and rte_errno is set.
+ */
+static struct mlx5_hrxq *
+flow_dv_handle_rx_queue(struct rte_eth_dev *dev,
+			  struct mlx5_flow *dev_flow,
+			  struct mlx5_flow_rss_desc *rss_desc,
+			  uint32_t *hrxq_idx,
+			  struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_flow_handle *dh = dev_flow->handle;
+	struct mlx5_hrxq *hrxq;
+
+	MLX5_ASSERT(rss_desc->queue_num);
+	*hrxq_idx = mlx5_hrxq_get(dev, rss_desc->key,
+				 MLX5_RSS_HASH_KEY_LEN,
+				 dev_flow->hash_fields,
+				 rss_desc->queue,
+				 rss_desc->queue_num);
+	if (!*hrxq_idx) {
+		*hrxq_idx = mlx5_hrxq_new
+				(dev, rss_desc->key,
+				MLX5_RSS_HASH_KEY_LEN,
+				dev_flow->hash_fields,
+				rss_desc->queue,
+				rss_desc->queue_num,
+				!!(dh->layers &
+				MLX5_FLOW_LAYER_TUNNEL));
+	}
+	hrxq = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_HRXQ],
+			      *hrxq_idx);
+	if (!hrxq) {
+		rte_flow_error_set
+			(error, rte_errno,
+			 RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+			 "cannot get hash queue");
+		goto error;
+	}
+	dh->rix_hrxq = *hrxq_idx;
+	return hrxq;
+error:
+	/* hrxq is union, don't clear it if the flag is not set. */
+	if (dh->rix_hrxq) {
+		mlx5_hrxq_release(dev, dh->rix_hrxq);
+		dh->rix_hrxq = 0;
+	}
+	return NULL;
+}
+
+/**
+ * Find existing sample resource or create and register a new one.
+ *
+ * @param[in, out] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[in] resource
+ *   Pointer to sample resource.
+ * @param[in, out] dev_flow
+ *   Pointer to the dev_flow.
+ * @param[in, out] sample_dv_actions
+ *   Pointer to sample actions list.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_sample_resource_register(struct rte_eth_dev *dev,
+			 const struct rte_flow_attr *attr,
+			 struct mlx5_flow_dv_sample_resource *resource,
+			 struct mlx5_flow *dev_flow,
+			 void **sample_dv_actions,
+			 struct rte_flow_error *error)
+{
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+	struct mlx5dv_dr_flow_sampler_attr sampler_attr;
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_ctx_shared *sh = priv->sh;
+	struct mlx5_flow_tbl_resource *tbl;
+	uint32_t idx = 0;
+	const uint32_t next_ft_step = 1;
+	uint32_t next_ft_id = resource->ft_id + next_ft_step;
+
+	/* Lookup a matching resource from cache. */
+	ILIST_FOREACH(sh->ipool[MLX5_IPOOL_SAMPLE], sh->sample_action_list,
+		      idx, cache_resource, next) {
+		if (resource->ratio == cache_resource->ratio &&
+		    resource->ft_type == cache_resource->ft_type &&
+		    resource->ft_id == cache_resource->ft_id &&
+		    !memcmp((void *)&resource->sample_act,
+			    (void *)&cache_resource->sample_act,
+			    sizeof(struct mlx5_flow_sub_actions_list))) {
+			DRV_LOG(DEBUG, "sample resource %p: refcnt %d++",
+				(void *)cache_resource,
+				rte_atomic32_read(&cache_resource->refcnt));
+			rte_atomic32_inc(&cache_resource->refcnt);
+			dev_flow->handle->dvh.rix_sample = idx;
+			dev_flow->dv.sample_res = cache_resource;
+			return 0;
+		}
+	}
+	/* Register new sample resource. */
+	cache_resource = mlx5_ipool_zmalloc(sh->ipool[MLX5_IPOOL_SAMPLE],
+				       &dev_flow->handle->dvh.rix_sample);
+	if (!cache_resource)
+		return rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "cannot allocate resource memory");
+	*cache_resource = *resource;
+	/* Create normal path table level */
+	tbl = flow_dv_tbl_resource_get(dev, next_ft_id,
+					attr->egress, attr->transfer, error);
+	if (!tbl) {
+		rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "fail to create normal path table "
+					  "for sample");
+		goto error;
+	}
+	cache_resource->normal_path_tbl = tbl;
+	if (resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		cache_resource->default_miss =
+				mlx5_glue->dr_create_flow_action_default_miss();
+		if (!cache_resource->default_miss) {
+			rte_flow_error_set(error, ENOMEM,
+						RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+						NULL,
+						"cannot create default miss "
+						"action");
+			goto error;
+		}
+		sample_dv_actions[resource->sample_act.actions_num++] =
+						cache_resource->default_miss;
+	}
+	/* Create a DR sample action */
+	sampler_attr.sample_ratio = cache_resource->ratio;
+	sampler_attr.default_next_table = tbl->obj;
+	sampler_attr.num_sample_actions = resource->sample_act.actions_num;
+	sampler_attr.sample_actions = (struct mlx5dv_dr_action **)
+							&sample_dv_actions[0];
+	cache_resource->verbs_action =
+		mlx5_glue->dr_create_flow_action_sampler(&sampler_attr);
+	if (!cache_resource->verbs_action) {
+		rte_flow_error_set(error, ENOMEM,
+					RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					NULL, "cannot create sample action");
+		goto error;
+	}
+	rte_atomic32_init(&cache_resource->refcnt);
+	rte_atomic32_inc(&cache_resource->refcnt);
+	ILIST_INSERT(sh->ipool[MLX5_IPOOL_SAMPLE], &sh->sample_action_list,
+		     dev_flow->handle->dvh.rix_sample, cache_resource,
+		     next);
+	dev_flow->dv.sample_res = cache_resource;
+	DRV_LOG(DEBUG, "new sample resource %p: refcnt %d++",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	return 0;
+error:
+	if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		if (cache_resource->default_miss)
+			claim_zero(mlx5_glue->destroy_flow_action
+				(cache_resource->default_miss));
+	} else {
+		if (cache_resource->sample_idx.rix_hrxq &&
+		    !mlx5_hrxq_release(dev,
+				cache_resource->sample_idx.rix_hrxq))
+			cache_resource->sample_idx.rix_hrxq = 0;
+		if (cache_resource->sample_idx.rix_tag &&
+		    !flow_dv_tag_release(dev,
+				cache_resource->sample_idx.rix_tag))
+			cache_resource->sample_idx.rix_tag = 0;
+		if (cache_resource->sample_idx.cnt) {
+			flow_dv_counter_release(dev,
+				cache_resource->sample_idx.cnt);
+			cache_resource->sample_idx.cnt = 0;
+		}
+	}
+	if (cache_resource->normal_path_tbl)
+		flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	mlx5_ipool_free(sh->ipool[MLX5_IPOOL_SAMPLE],
+				dev_flow->handle->dvh.rix_sample);
+	dev_flow->handle->dvh.rix_sample = 0;
+	return -rte_errno;
+}
+
+/**
+ * Convert Sample action to DV specification.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] action
+ *   Pointer to action structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] sample_actions
+ *   Pointer to sample actions list.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_translate_action_sample(struct rte_eth_dev *dev,
+				const struct rte_flow_action *action,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				void **sample_actions,
+				struct mlx5_flow_dv_sample_resource *res,
+				struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	const struct rte_flow_action_sample *sample_action;
+	const struct rte_flow_action *sub_actions;
+	const struct rte_flow_action_queue *queue;
+	struct mlx5_flow_sub_actions_list *sample_act;
+	struct mlx5_flow_sub_actions_idx *sample_idx;
+	struct mlx5_flow_rss_desc *rss_desc = &((struct mlx5_flow_rss_desc *)
+					      priv->rss_desc)
+					      [!!priv->flow_nested_idx];
+	uint64_t action_flags = 0;
+
+	sample_act = &res->sample_act;
+	sample_idx = &res->sample_idx;
+	sample_action = (const struct rte_flow_action_sample *)action->conf;
+	res->ratio = sample_action->ratio;
+	sub_actions = sample_action->actions;
+	for (; sub_actions->type != RTE_FLOW_ACTION_TYPE_END; sub_actions++) {
+		int type = sub_actions->type;
+		uint32_t pre_rix = 0;
+		void *pre_r;
+		switch (type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+		{
+			struct mlx5_hrxq *hrxq;
+			uint32_t hrxq_idx;
+
+			queue = sub_actions->conf;
+			rss_desc->queue_num = 1;
+			rss_desc->queue[0] = queue->index;
+			hrxq = flow_dv_handle_rx_queue(dev, dev_flow,
+					rss_desc, &hrxq_idx,
+					error);
+			if (!hrxq)
+				return rte_flow_error_set
+					(error, rte_errno,
+					 RTE_FLOW_ERROR_TYPE_ACTION,
+					 NULL,
+					 "cannot create fate queue");
+			sample_act->dr_queue_action = hrxq->action;
+			sample_idx->rix_hrxq = hrxq_idx;
+			sample_actions[sample_act->actions_num++] =
+						hrxq->action;
+			action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_MARK:
+		{
+			uint32_t tag_be = mlx5_flow_mark_set
+				(((const struct rte_flow_action_mark *)
+				(sub_actions->conf))->id);
+			pre_rix = dev_flow->handle->dvh.rix_tag;
+			/* Save the mark resource before sample */
+			pre_r = dev_flow->dv.tag_resource;
+			if (flow_dv_tag_resource_register(dev, tag_be,
+						  dev_flow, error))
+				return -rte_errno;
+			MLX5_ASSERT(dev_flow->dv.tag_resource);
+			sample_act->dr_tag_action =
+				dev_flow->dv.tag_resource->action;
+			sample_idx->rix_tag =
+				dev_flow->handle->dvh.rix_tag;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_tag_action;
+			/* Recover the mark resource after sample */
+			dev_flow->dv.tag_resource = pre_r;
+			dev_flow->handle->dvh.rix_tag = pre_rix;
+			action_flags |= MLX5_FLOW_ACTION_MARK;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+		{
+			uint32_t counter;
+
+			counter = flow_dv_translate_create_counter(dev,
+					dev_flow, sub_actions->conf, 0);
+			if (!counter)
+				return rte_flow_error_set
+						(error, rte_errno,
+						 RTE_FLOW_ERROR_TYPE_ACTION,
+						 NULL,
+						 "cannot create counter"
+						 " object.");
+			sample_idx->cnt = counter;
+			sample_act->dr_cnt_action =
+				  (flow_dv_counter_get_by_idx(dev,
+				  counter, NULL))->action;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_cnt_action;
+			action_flags |= MLX5_FLOW_ACTION_COUNT;
+			break;
+		}
+		default:
+			return rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				NULL,
+				"Not supported for sampler action");
+		}
+	}
+	sample_act->action_flags = action_flags;
+	res->ft_id = dev_flow->dv.group;
+	if (attr->transfer)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_FDB;
+	else if (attr->ingress)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
+
+	return 0;
+}
+
+/**
+ * Create the sample action by registering the sample resource.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[in] sample_actions
+ *   Pointer to sample path actions list.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_create_action_sample(struct rte_eth_dev *dev,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				struct mlx5_flow_dv_sample_resource *res,
+				void **sample_actions,
+				struct rte_flow_error *error)
+{
+	if (flow_dv_sample_resource_register(dev, attr, res, dev_flow,
+						sample_actions, error))
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL, "can't create sample action");
+	return 0;
+}
+
+/**
  * Fill the flow with DV spec, lock free
  * (mutex should be acquired by caller).
  *
@@ -7959,9 +8342,13 @@ struct field_modify_info modify_tcp[] = {
 	void *match_value = dev_flow->dv.value.buf;
 	uint8_t next_protocol = 0xff;
 	struct rte_vlan_hdr vlan = { 0 };
+	struct mlx5_flow_dv_sample_resource sample_res;
+	void *sample_actions[MLX5_DV_MAX_NUMBER_OF_ACTIONS] = {0};
+	uint32_t sample_act_pos = UINT32_MAX;
 	uint32_t table;
 	int ret = 0;
 
+	memset(&sample_res, 0, sizeof(struct mlx5_flow_dv_sample_resource));
 	mhdr_res->ft_type = attr->egress ? MLX5DV_FLOW_TABLE_TYPE_NIC_TX :
 					   MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
 	ret = mlx5_flow_group_to_table(attr, dev_flow->external, attr->group,
@@ -7980,7 +8367,6 @@ struct field_modify_info modify_tcp[] = {
 		const struct rte_flow_action_rss *rss;
 		const struct rte_flow_action *action = actions;
 		const uint8_t *rss_key;
-		const struct rte_flow_action_jump *jump_data;
 		const struct rte_flow_action_meter *mtr;
 		struct mlx5_flow_tbl_resource *tbl;
 		uint32_t port_id = 0;
@@ -7988,6 +8374,7 @@ struct field_modify_info modify_tcp[] = {
 		int action_type = actions->type;
 		const struct rte_flow_action *found_action = NULL;
 		struct mlx5_flow_meter *fm = NULL;
+		uint32_t jump_group = 0;
 
 		switch (action_type) {
 		case RTE_FLOW_ACTION_TYPE_VOID:
@@ -8221,9 +8608,12 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_DECAP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_JUMP:
-			jump_data = action->conf;
+			jump_group = ((const struct rte_flow_action_jump *)
+							action->conf)->group;
+			if (dev_flow->external)
+				jump_group *= MLX5_FLOW_TABLE_FACTOR;
 			ret = mlx5_flow_group_to_table(attr, dev_flow->external,
-						       jump_data->group,
+						       jump_group,
 						       !!priv->fdb_def_rule,
 						       &table, error);
 			if (ret)
@@ -8384,6 +8774,19 @@ struct field_modify_info modify_tcp[] = {
 				return -rte_errno;
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			sample_act_pos = actions_n;
+			ret = flow_dv_translate_action_sample(dev,
+							      actions,
+							      dev_flow, attr,
+							      sample_actions,
+							      &sample_res,
+							      error);
+			if (ret < 0)
+				return ret;
+			actions_n++;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			break;
 		case RTE_FLOW_ACTION_TYPE_END:
 			actions_end = true;
 			if (mhdr_res->actions_num) {
@@ -8410,6 +8813,21 @@ struct field_modify_info modify_tcp[] = {
 					  (flow_dv_counter_get_by_idx(dev,
 					  flow->counter, NULL))->action;
 			}
+			if (action_flags & MLX5_FLOW_ACTION_SAMPLE) {
+				ret = flow_dv_create_action_sample(dev,
+							  dev_flow, attr,
+							  &sample_res,
+							  sample_actions,
+							  error);
+				if (ret < 0)
+					return rte_flow_error_set
+						(error, rte_errno,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						NULL,
+						"cannot create sample action");
+				dev_flow->dv.actions[sample_act_pos] =
+					dev_flow->dv.sample_res->verbs_action;
+			}
 			break;
 		default:
 			break;
@@ -8819,18 +9237,18 @@ struct field_modify_info modify_tcp[] = {
  *
  * @param dev
  *   Pointer to Ethernet device.
- * @param handle
- *   Pointer to mlx5_flow_handle.
+ * @param encap_decap_idx
+ *   Index of encap decap resource.
  *
  * @return
  *   1 while a reference on it exists, 0 when freed.
  */
 static int
 flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
-				     struct mlx5_flow_handle *handle)
+				     uint32_t encap_decap_idx)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	uint32_t idx = handle->dvh.rix_encap_decap;
+	uint32_t idx = encap_decap_idx;
 	struct mlx5_flow_dv_encap_decap_resource *cache_resource;
 
 	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_DECAP_ENCAP],
@@ -9036,6 +9454,71 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Release a sample resource.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param handle
+ *   Pointer to mlx5_flow_handle.
+ *
+ * @return
+ *   1 while a reference on it exists, 0 when freed.
+ */
+static int
+flow_dv_sample_resource_release(struct rte_eth_dev *dev,
+				     struct mlx5_flow_handle *handle)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	uint32_t idx = handle->dvh.rix_sample;
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+
+	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			 idx);
+	if (!cache_resource)
+		return 0;
+	MLX5_ASSERT(cache_resource->verbs_action);
+	DRV_LOG(DEBUG, "sample resource %p: refcnt %d--",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	if (rte_atomic32_dec_and_test(&cache_resource->refcnt)) {
+		if (cache_resource->verbs_action)
+			claim_zero(mlx5_glue->destroy_flow_action
+					(cache_resource->verbs_action));
+		if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+			if (cache_resource->default_miss)
+				claim_zero(mlx5_glue->destroy_flow_action
+				  (cache_resource->default_miss));
+		}
+		if (cache_resource->normal_path_tbl)
+			flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	}
+	if (cache_resource->sample_idx.rix_hrxq &&
+		!mlx5_hrxq_release(dev,
+			cache_resource->sample_idx.rix_hrxq))
+		cache_resource->sample_idx.rix_hrxq = 0;
+	if (cache_resource->sample_idx.rix_tag &&
+		!flow_dv_tag_release(dev,
+			cache_resource->sample_idx.rix_tag))
+		cache_resource->sample_idx.rix_tag = 0;
+	if (cache_resource->sample_idx.cnt) {
+		flow_dv_counter_release(dev,
+			cache_resource->sample_idx.cnt);
+		cache_resource->sample_idx.cnt = 0;
+	}
+	if (!rte_atomic32_read(&cache_resource->refcnt)) {
+		ILIST_REMOVE(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			     &priv->sh->sample_action_list, idx,
+			     cache_resource, next);
+		mlx5_ipool_free(priv->sh->ipool[MLX5_IPOOL_SAMPLE], idx);
+		DRV_LOG(DEBUG, "sample resource %p: removed",
+			(void *)cache_resource);
+		return 0;
+	}
+	return 1;
+}
+
+/**
  * Remove the flow from the NIC but keeps it in memory.
  * Lock free, (mutex should be acquired by caller).
  *
@@ -9113,8 +9596,11 @@ struct field_modify_info modify_tcp[] = {
 		flow->dev_handles = dev_handle->next.next;
 		if (dev_handle->dvh.matcher)
 			flow_dv_matcher_release(dev, dev_handle);
+		if (dev_handle->dvh.rix_sample)
+			flow_dv_sample_resource_release(dev, dev_handle);
 		if (dev_handle->dvh.rix_encap_decap)
-			flow_dv_encap_decap_resource_release(dev, dev_handle);
+			flow_dv_encap_decap_resource_release(dev,
+				dev_handle->dvh.rix_encap_decap);
 		if (dev_handle->dvh.modify_hdr)
 			flow_dv_modify_hdr_resource_release(dev_handle);
 		if (dev_handle->dvh.rix_push_vlan)
-- 
1.8.3.1



* [dpdk-dev] [PATCH 7/8] net/mlx5: update the metadata register c0 support
  2020-06-25 16:26 [dpdk-dev] [PATCH 0/8] support the flow-based traffic sampling Jiawei Wang
                   ` (5 preceding siblings ...)
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 6/8] net/mlx5: update translate function for sample action Jiawei Wang
@ 2020-06-25 16:26 ` Jiawei Wang
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 8/8] app/testpmd: add testpmd command for sample action Jiawei Wang
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-06-25 16:26 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

For a sample flow, the flow is split into two sub-flows, and a metadata
register is used as the matcher between the two.

The metadata register C0 field might also be used for the source vport
index if the kernel uses this field. This change adds a check, when
performing a tag action with reg_c0, to decide whether to use the upper
or lower 16 bits of the metadata register C0 field.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow_dv.c | 36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 62a4a3b..ed9d2d2e 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -980,13 +980,24 @@ struct field_modify_info modify_tcp[] = {
  */
 static int
 flow_dv_convert_action_set_reg
-			(struct mlx5_flow_dv_modify_hdr_resource *resource,
+			(struct rte_eth_dev *dev,
+			 struct mlx5_flow_dv_modify_hdr_resource *resource,
 			 const struct rte_flow_action *action,
 			 struct rte_flow_error *error)
 {
 	const struct mlx5_rte_flow_action_set_tag *conf = action->conf;
-	struct mlx5_modification_cmd *actions = resource->actions;
 	uint32_t i = resource->actions_num;
+	struct mlx5_priv *priv = dev->data->dev_private;
+	rte_be32_t mask = UINT32_MAX;
+	rte_be32_t data = rte_cpu_to_be_32(conf->data) & mask;
+	struct rte_flow_item item = {
+		.spec = &data,
+		.mask = &mask,
+	};
+	struct field_modify_info reg_c_x[] = {
+		[1] = {0, 0, 0},
+	};
+	int reg = conf->id;
 
 	if (i >= MLX5_MAX_MODIFY_NUM)
 		return rte_flow_error_set(error, EINVAL,
@@ -994,15 +1005,16 @@ struct field_modify_info modify_tcp[] = {
 					  "too many items to modify");
 	MLX5_ASSERT(conf->id != REG_NONE);
 	MLX5_ASSERT(conf->id < RTE_DIM(reg_to_field));
-	actions[i] = (struct mlx5_modification_cmd) {
-		.action_type = MLX5_MODIFICATION_TYPE_SET,
-		.field = reg_to_field[conf->id],
-	};
-	actions[i].data0 = rte_cpu_to_be_32(actions[i].data0);
-	actions[i].data1 = rte_cpu_to_be_32(conf->data);
-	++i;
-	resource->actions_num = i;
-	return 0;
+	if (reg == REG_C_0) {
+		uint32_t msk_c0 = priv->sh->dv_regc0_mask;
+		uint32_t shl_c0 = rte_bsf32(msk_c0);
+		data = rte_cpu_to_be_32(rte_cpu_to_be_32(data) << shl_c0);
+		mask = rte_cpu_to_be_32(mask) & msk_c0;
+		mask = rte_cpu_to_be_32(mask << shl_c0);
+	}
+	reg_c_x[0] = (struct field_modify_info){4, 0, reg_to_field[reg]};
+	return flow_dv_convert_modify_action(&item, reg_c_x, NULL, resource,
+			MLX5_MODIFICATION_TYPE_SET, error);
 }
 
 /**
@@ -8722,7 +8734,7 @@ struct field_modify_info modify_tcp[] = {
 			break;
 		case MLX5_RTE_FLOW_ACTION_TYPE_TAG:
 			if (flow_dv_convert_action_set_reg
-					(mhdr_res, actions, error))
+					(dev, mhdr_res, actions, error))
 				return -rte_errno;
 			action_flags |= MLX5_FLOW_ACTION_SET_TAG;
 			break;
-- 
1.8.3.1



* [dpdk-dev] [PATCH 8/8] app/testpmd: add testpmd command for sample action
  2020-06-25 16:26 [dpdk-dev] [PATCH 0/8] support the flow-based traffic sampling Jiawei Wang
                   ` (6 preceding siblings ...)
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 7/8] net/mlx5: update the metadata register c0 support Jiawei Wang
@ 2020-06-25 16:26 ` Jiawei Wang
  2020-06-30 15:23   ` Ori Kam
  2020-07-02 17:43 ` [dpdk-dev] [PATCH 0/7] support the flow-based traffic sampling Jiawei Wang
  2020-07-02 18:43 ` [dpdk-dev] [PATCH v2 0/7] [v2] support the flow-based traffic sampling Jiawei Wang
  9 siblings, 1 reply; 129+ messages in thread
From: Jiawei Wang @ 2020-06-25 16:26 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add a new testpmd command 'set sample_actions' that supports configuring
multiple sample action lists, selected by index:
set sample_actions <index> <actions list>

Examples of the sample flow use case and the results are below:

1. set sample_actions 0 mark id 0x8 / queue index 2 / end
.. pattern eth / end actions sample ratio 2 index 0 / jump group 2 ...

With this flow, all matched ingress packets jump to the next flow
table, and every second packet is marked and sent to queue 2 of the
control application.

2. ...pattern eth / end actions sample ratio 2 / port_id id 2 ...

With this flow, all matched ingress packets are sent to port 2, and
every second packet is also sent to the e-switch manager vport.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 285 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 276 insertions(+), 9 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 4e2006c..6b1e515 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,8 @@ enum index {
 	SET_RAW_ENCAP,
 	SET_RAW_DECAP,
 	SET_RAW_INDEX,
+	SET_SAMPLE_ACTIONS,
+	SET_SAMPLE_INDEX,
 
 	/* Top-level command. */
 	FLOW,
@@ -349,6 +351,10 @@ enum index {
 	ACTION_SET_IPV6_DSCP_VALUE,
 	ACTION_AGE,
 	ACTION_AGE_TIMEOUT,
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_SAMPLE_INDEX_VALUE,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -484,6 +490,22 @@ struct action_nvgre_encap_data {
 
 struct mplsoudp_decap_conf mplsoudp_decap_conf;
 
+#define ACTION_SAMPLE_ACTIONS_NUM 10
+#define RAW_SAMPLE_CONFS_MAX_NUM 8
+/** Storage for struct rte_flow_action_sample including external data. */
+struct action_sample_data {
+	struct rte_flow_action_sample conf;
+	uint32_t idx;
+};
+/** Storage for struct rte_flow_action_sample. */
+struct raw_sample_conf {
+	struct rte_flow_action data[ACTION_SAMPLE_ACTIONS_NUM];
+};
+struct raw_sample_conf raw_sample_confs[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_mark sample_mark[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_queue sample_queue[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_count sample_count[RAW_SAMPLE_CONFS_MAX_NUM];
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -1161,6 +1183,7 @@ struct parse_action_priv {
 	ACTION_SET_IPV4_DSCP,
 	ACTION_SET_IPV6_DSCP,
 	ACTION_AGE,
+	ACTION_SAMPLE,
 	ZERO,
 };
 
@@ -1393,9 +1416,28 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_sample[] = {
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index next_action_sample[] = {
+	ACTION_QUEUE,
+	ACTION_MARK,
+	ACTION_COUNT,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
+static int parse_set_sample_action(struct context *, const struct token *,
+				   const char *, unsigned int,
+				   void *, unsigned int);
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
@@ -1460,7 +1502,15 @@ static int parse_vc_action_raw_decap_index(struct context *,
 static int parse_vc_action_set_meta(struct context *ctx,
 				    const struct token *token, const char *str,
 				    unsigned int len, void *buf,
+					unsigned int size);
+static int parse_vc_action_sample(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
 				    unsigned int size);
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -1531,6 +1581,8 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 				    unsigned int, char *, unsigned int);
 static int comp_set_raw_index(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
+static int comp_set_sample_index(struct context *, const struct token *,
+			      unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -3612,11 +3664,13 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	/* Top level command. */
 	[SET] = {
 		.name = "set",
-		.help = "set raw encap/decap data",
-		.type = "set raw_encap|raw_decap <index> <pattern>",
+		.help = "set raw encap/decap/sample data",
+		.type = "set raw_encap|raw_decap <index> <pattern>"
+				" or set sample_actions <index> <action>",
 		.next = NEXT(NEXT_ENTRY
 			     (SET_RAW_ENCAP,
-			      SET_RAW_DECAP)),
+			      SET_RAW_DECAP,
+			      SET_SAMPLE_ACTIONS)),
 		.call = parse_set_init,
 	},
 	/* Sub-level commands. */
@@ -3647,6 +3701,23 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(next_item),
 		.call = parse_port,
 	},
+	[SET_SAMPLE_INDEX] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "index of sample actions",
+		.next = NEXT(next_action_sample),
+		.call = parse_port,
+	},
+	[SET_SAMPLE_ACTIONS] = {
+		.name = "sample_actions",
+		.help = "set sample actions list",
+		.next = NEXT(NEXT_ENTRY(SET_SAMPLE_INDEX)),
+		.args = ARGS(ARGS_ENTRY_ARB_BOUNDED
+				(offsetof(struct buffer, port),
+				 sizeof(((struct buffer *)0)->port),
+				 0, RAW_SAMPLE_CONFS_MAX_NUM - 1)),
+		.call = parse_set_sample_action,
+	},
 	[ACTION_SET_TAG] = {
 		.name = "set_tag",
 		.help = "set tag",
@@ -3750,6 +3821,37 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(action_age, NEXT_ENTRY(UNSIGNED)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SAMPLE] = {
+		.name = "sample",
+		.help = "set a sample action",
+		.next = NEXT(action_sample),
+		.priv = PRIV_ACTION(SAMPLE,
+			sizeof(struct action_sample_data)),
+		.call = parse_vc_action_sample,
+	},
+	[ACTION_SAMPLE_RATIO] = {
+		.name = "ratio",
+		.help = "flow sample ratio value",
+		.next = NEXT(action_sample, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_ARB
+			     (offsetof(struct action_sample_data, conf) +
+			      offsetof(struct rte_flow_action_sample, ratio),
+			      sizeof(((struct rte_flow_action_sample *)0)->
+				     ratio))),
+	},
+	[ACTION_SAMPLE_INDEX] = {
+		.name = "index",
+		.help = "the index of sample actions list",
+		.next = NEXT(NEXT_ENTRY(ACTION_SAMPLE_INDEX_VALUE)),
+	},
+	[ACTION_SAMPLE_INDEX_VALUE] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "unsigned integer value",
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc_action_sample_index,
+		.comp = comp_set_sample_index,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -5207,6 +5309,76 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return len;
 }
 
+static int
+parse_vc_action_sample(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_action *action;
+	struct action_sample_data *action_sample_data = NULL;
+	static struct rte_flow_action end_action = {
+		RTE_FLOW_ACTION_TYPE_END, 0
+	};
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return ret;
+	if (!out->args.vc.actions_n)
+		return -1;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data;
+	ctx->objmask = NULL;
+	/* Copy the headers to the buffer. */
+	action_sample_data = ctx->object;
+	action_sample_data->conf.actions = &end_action;
+	action->conf = &action_sample_data->conf;
+	return ret;
+}
+
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size)
+{
+	struct action_sample_data *action_sample_data;
+	struct rte_flow_action *action;
+	const struct arg *arg;
+	struct buffer *out = buf;
+	int ret;
+	uint16_t idx;
+
+	RTE_SET_USED(token);
+	RTE_SET_USED(buf);
+	RTE_SET_USED(size);
+	if (ctx->curr != ACTION_SAMPLE_INDEX_VALUE)
+		return -1;
+	arg = ARGS_ENTRY_ARB_BOUNDED
+		(offsetof(struct action_sample_data, idx),
+		 sizeof(((struct action_sample_data *)0)->idx),
+		 0, RAW_SAMPLE_CONFS_MAX_NUM - 1);
+	if (push_args(ctx, arg))
+		return -1;
+	ret = parse_int(ctx, token, str, len, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		return -1;
+	}
+	if (!ctx->object)
+		return len;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	action_sample_data = ctx->object;
+	idx = action_sample_data->idx;
+	action_sample_data->conf.actions = raw_sample_confs[idx].data;
+	action->conf = &action_sample_data->conf;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -5971,6 +6143,38 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	if (!out->command)
 		return -1;
 	out->command = ctx->curr;
+	/* For encap/decap, all we need is the pattern. */
+	out->args.vc.pattern = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
+	return len;
+}
+
+/** Parse set command, initialize output buffer for subsequent tokens. */
+static int
+parse_set_sample_action(struct context *ctx, const struct token *token,
+			  const char *str, unsigned int len,
+			  void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	/* Make sure buffer is large enough. */
+	if (size < sizeof(*out))
+		return -1;
+	ctx->objdata = 0;
+	ctx->objmask = NULL;
+	ctx->object = out;
+	if (!out->command)
+		return -1;
+	out->command = ctx->curr;
+	/* For the sampler, all we need is the actions. */
+	out->args.vc.actions = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	return len;
 }
 
@@ -6007,11 +6211,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 			return -1;
 		out->command = ctx->curr;
 		out->args.vc.data = (uint8_t *)out + size;
-		/* All we need is pattern */
-		out->args.vc.pattern =
-			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
-					       sizeof(double));
-		ctx->object = out->args.vc.pattern;
+		ctx->object  = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	}
 	return len;
 }
@@ -6162,6 +6363,24 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return nb;
 }
 
+/** Complete index number for sample actions. */
+static int
+comp_set_sample_index(struct context *ctx, const struct token *token,
+		   unsigned int ent, char *buf, unsigned int size)
+{
+	uint16_t idx = 0;
+	uint16_t nb = 0;
+
+	RTE_SET_USED(ctx);
+	RTE_SET_USED(token);
+	for (idx = 0; idx < RAW_SAMPLE_CONFS_MAX_NUM; ++idx) {
+		if (buf && idx == ent)
+			return snprintf(buf, size, "%u", idx);
+		++nb;
+	}
+	return nb;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -6607,7 +6826,53 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return mask;
 }
 
-
+/** Dispatch parsed buffer to function calls. */
+static void
+cmd_set_raw_parsed_sample(const struct buffer *in)
+{
+	uint32_t n = in->args.vc.actions_n;
+	uint32_t i = 0;
+	struct rte_flow_action *action = NULL;
+	struct rte_flow_action *data = NULL;
+	size_t size = 0;
+	uint16_t idx = in->port; /* We borrow port field as index */
+	uint32_t max_size = sizeof(struct rte_flow_action) *
+						ACTION_SAMPLE_ACTIONS_NUM;
+
+	RTE_ASSERT(in->command == SET_SAMPLE_ACTIONS);
+	data = (struct rte_flow_action *)&raw_sample_confs[idx].data;
+	memset(data, 0x00, max_size);
+	for (; i < n; i++) {
+		action = in->args.vc.actions + i;
+		if (action->type == RTE_FLOW_ACTION_TYPE_END)
+			break;
+		switch (action->type) {
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			size = sizeof(struct rte_flow_action_mark);
+			rte_memcpy(&sample_mark[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_mark[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			size = sizeof(struct rte_flow_action_count);
+			rte_memcpy(&sample_count[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_count[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			size = sizeof(struct rte_flow_action_queue);
+			rte_memcpy(&sample_queue[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_queue[idx];
+			break;
+		default:
+			printf("Error - Not supported action\n");
+			return;
+		}
+		rte_memcpy(data, action, sizeof(struct rte_flow_action));
+		data++;
+	}
+}
 
 /** Dispatch parsed buffer to function calls. */
 static void
@@ -6624,6 +6889,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	uint16_t proto = 0;
 	uint16_t idx = in->port; /* We borrow port field as index */
 
+	if (in->command == SET_SAMPLE_ACTIONS)
+		return cmd_set_raw_parsed_sample(in);
 	RTE_ASSERT(in->command == SET_RAW_ENCAP ||
 		   in->command == SET_RAW_DECAP);
 	if (in->command == SET_RAW_ENCAP) {
-- 
1.8.3.1



* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow Jiawei Wang
@ 2020-06-25 17:55   ` Jerin Jacob
  2020-06-25 19:29     ` Thomas Monjalon
  2020-06-28  8:27   ` Andrew Rybchenko
  2020-07-01  9:37   ` Ori Kam
  2 siblings, 1 reply; 129+ messages in thread
From: Jerin Jacob @ 2020-06-25 17:55 UTC (permalink / raw)
  To: Jiawei Wang
  Cc: Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev, Thomas Monjalon,
	Raslan Darawsheh, ian.stokes, fbl

On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang <jiaweiw@mellanox.com> wrote:
>
> When using full offload, all traffic will be handled by the HW, and
> directed to the requested vf or wire, the control application loses
> visibility on the traffic.
> So there's a need for an action that will enable the control application
> some visibility.
>
> The solution is introduced a new action that will sample the incoming
> traffic and send a duplicated traffic in some predefined ratio to the
> application, while the original packet will continue to the target
> destination.
>
> The packets sampled equals is '1/ratio', if the ratio value be set to 1
> , means that the packets would be completely mirrored. The sample packet
> can be assigned with different set of actions from the original packet.
>
> In order to support the sample packet in rte_flow, new rte_flow action
> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample

Isn't this mirroring the packet? How about RTE_FLOW_ACTION_TYPE_MIRROR?
I do not understand why it is called sample.


> will be introduced.
>
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> ---
>  lib/librte_ethdev/rte_flow.c |  1 +
>  lib/librte_ethdev/rte_flow.h | 29 +++++++++++++++++++++++++++++
>  2 files changed, 30 insertions(+)
>

> + * Adds a sample action to a matched flow.
> + *
> + * The matching packets will be duplicated to a special queue or vport
> + * in the predefined probabiilty, All the packets continues processing
> + * on the default flow path.
> + *
> + * When the sample ratio is set to 1 then the packets will be 100% mirrored.
> + * Additional action list be supported to add for sampled or mirrored packets.
> + */
> +struct rte_flow_action_sample {
> +       /* packets sampled equals to '1/ratio' */
> +       const uint32_t ratio;
> +       /* sub-action list specific for the sampling hit cases */

Why not use the RTE_FLOW_ACTION_TYPE_PASSTHRU action to append further
actions to this action?

> +       const struct rte_flow_action *actions;
> +};
> +
> +/**
>   * Verbose error types.
>   *
>   * Most of them provide the type of the object referenced by struct
> --
> 1.8.3.1
>


* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-25 17:55   ` Jerin Jacob
@ 2020-06-25 19:29     ` Thomas Monjalon
  2020-06-26 10:35       ` Jerin Jacob
  0 siblings, 1 reply; 129+ messages in thread
From: Thomas Monjalon @ 2020-06-25 19:29 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Jiawei Wang, Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev,
	Raslan Darawsheh, ian.stokes, fbl

25/06/2020 19:55, Jerin Jacob:
> On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang <jiaweiw@mellanox.com> wrote:
> >
> > When using full offload, all traffic will be handled by the HW, and
> > directed to the requested vf or wire, the control application loses
> > visibility on the traffic.
> > So there's a need for an action that will enable the control application
> > some visibility.
> >
> > The solution is introduced a new action that will sample the incoming
> > traffic and send a duplicated traffic in some predefined ratio to the
> > application, while the original packet will continue to the target
> > destination.
> >
> > The packets sampled equals is '1/ratio', if the ratio value be set to 1
> > , means that the packets would be completely mirrored. The sample packet
> > can be assigned with different set of actions from the original packet.
> >
> > In order to support the sample packet in rte_flow, new rte_flow action
> > definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
> 
> Isn't mirroring the packet? How about, RTE_FLOW_ACTION_TYPE_MIRROR
> I am not able to understand, Why it is called sample.

Sampling is partial mirroring.
Full mirroring is sampling 100% of packets (ratio = 1).
That's why only one action is enough.
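
The '1/ratio' relationship can be shown with a small model. This is only an illustration: the function sampled_packets() is hypothetical, it assumes every ratio-th packet is duplicated deterministically, and real hardware may instead sample with probability 1/ratio. Treating ratio 0 as "no sampling" is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative model: number of duplicated packets out of 'matched'
 * packets when one packet in every 'ratio' is copied.
 */
static uint64_t sampled_packets(uint64_t matched, uint32_t ratio)
{
	if (ratio == 0)
		return 0; /* assumption: 0 means no sampling */
	return matched / ratio;
}
```

In this model ratio 1 duplicates every matched packet (full mirroring), and ratio 2 duplicates half of them.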




* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-25 19:29     ` Thomas Monjalon
@ 2020-06-26 10:35       ` Jerin Jacob
  2020-06-26 10:45         ` Thomas Monjalon
  0 siblings, 1 reply; 129+ messages in thread
From: Jerin Jacob @ 2020-06-26 10:35 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Jiawei Wang, Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev,
	Raslan Darawsheh, ian.stokes, fbl

On Fri, Jun 26, 2020 at 12:59 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 25/06/2020 19:55, Jerin Jacob:
> > On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang <jiaweiw@mellanox.com> wrote:
> > >
> > > When using full offload, all traffic will be handled by the HW, and
> > > directed to the requested vf or wire, the control application loses
> > > visibility on the traffic.
> > > So there's a need for an action that will enable the control application
> > > some visibility.
> > >
> > > The solution is introduced a new action that will sample the incoming
> > > traffic and send a duplicated traffic in some predefined ratio to the
> > > application, while the original packet will continue to the target
> > > destination.
> > >
> > > The packets sampled equals is '1/ratio', if the ratio value be set to 1
> > > , means that the packets would be completely mirrored. The sample packet
> > > can be assigned with different set of actions from the original packet.
> > >
> > > In order to support the sample packet in rte_flow, new rte_flow action
> > > definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
> >
> > Isn't mirroring the packet? How about, RTE_FLOW_ACTION_TYPE_MIRROR
> > I am not able to understand, Why it is called sample.
>
> Sampling is a partial mirroring.

I think that, by definition, _sampling_ is the _selection_ of items from a
specific group.
_Sampling_ does not dictate what the real action is for the
"selected" items.
One could be confused about whether the selected ones are to be
forwarded, dropped, or given any other action.

So IMO, using an explicit mirror keyword makes it clear.

Some more related questions:
1) What is the real use case for ratio? I am not against adding a
ratio attribute if the MLX hardware supports it. It would be good to
know the use case from the application perspective. On what basis
would an application set ratio != 1?
2) If it is for "rate-limiting" or "policing", why not use the rte_mtr
object (rte_mtr.h) via an rte_flow action?
3) One of the issues for driver developers and application writers is
overlapping APIs. This would overlap with the rte_eth_mirror_rule_set()
API.

Can we deprecate the rte_eth_mirror_rule_set() API? It will be a pain for
all to have overlapping APIs. We have not fixed the VLAN filter API
overlap with rte_flow in ethdev. It has been a TODO for multiple releases
now.


> Full mirroring is sampling 100% packets (ratio = 1).
> That's why only one action is enough.
>
>


* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-26 10:35       ` Jerin Jacob
@ 2020-06-26 10:45         ` Thomas Monjalon
  2020-06-26 11:10           ` Jerin Jacob
  0 siblings, 1 reply; 129+ messages in thread
From: Thomas Monjalon @ 2020-06-26 10:45 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Jiawei Wang, Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev,
	Raslan Darawsheh, ian.stokes, fbl, ferruh.yigit, arybchenko

26/06/2020 12:35, Jerin Jacob:
> On Fri, Jun 26, 2020 at 12:59 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 25/06/2020 19:55, Jerin Jacob:
> > > On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang <jiaweiw@mellanox.com> wrote:
> > > >
> > > > When using full offload, all traffic will be handled by the HW, and
> > > > directed to the requested vf or wire, the control application loses
> > > > visibility on the traffic.
> > > > So there's a need for an action that will enable the control application
> > > > some visibility.
> > > >
> > > > The solution is introduced a new action that will sample the incoming
> > > > traffic and send a duplicated traffic in some predefined ratio to the
> > > > application, while the original packet will continue to the target
> > > > destination.
> > > >
> > > > The packets sampled equals is '1/ratio', if the ratio value be set to 1
> > > > , means that the packets would be completely mirrored. The sample packet
> > > > can be assigned with different set of actions from the original packet.
> > > >
> > > > In order to support the sample packet in rte_flow, new rte_flow action
> > > > definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
> > >
> > > Isn't mirroring the packet? How about, RTE_FLOW_ACTION_TYPE_MIRROR
> > > I am not able to understand, Why it is called sample.
> >
> > Sampling is a partial mirroring.
> 
> I think, By definition, _sampling_ is the _selection_ of items from a
> specific group.
> I think, _sampling_ is not dictating, what is the real action for the
> "selected"  items.
> One can get confused with the selected ones can be for forward, drop
> any other action.

I see. Good design question (I will let others reply).

> So IMO, explicit mirror keyword usage makes it is clear.
> 
> Some more related questions:
> 1) What is the real use case for ratio? I am not against adding a
> ratio attribute if the MLX hardware supports it. It will be good to
> know the use case from the application perspective? And what basics
> application set ratio != 1?

If I understand correctly, some applications want to check,
by picking random packets, that the processing is not failing.

> 2) If it is for "rate-limiting" or "policing", why not use rte_mtr
> object (rte_mtr.h) via rte_flow action.
> 3) One of the issue for driver developers and application writers are
> overlapping APIs. This would overlap with rte_eth_mirror_rule_set()
> API.
> 
> Can we deprecate rte_eth_mirror_rule_set() API? It will be a pain for
> all to have overlapping APIs. We have not fixed the VLAN filter API
> overlap with rte_flow in ethdev. Its being TODO for multiple releases
> now.

Ooooooooh yes!
I think the flow-based API is more powerful, and should deprecate
the old port-based API.
I want to help deprecate such APIs in 20.11 if possible.

> > Full mirroring is sampling 100% packets (ratio = 1).
> > That's why only one action is enough.





* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-26 10:45         ` Thomas Monjalon
@ 2020-06-26 11:10           ` Jerin Jacob
  2020-06-28  8:14             ` Andrew Rybchenko
  2020-06-28 13:16             ` Jiawei(Jonny) Wang
  0 siblings, 2 replies; 129+ messages in thread
From: Jerin Jacob @ 2020-06-26 11:10 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Jiawei Wang, Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev,
	Raslan Darawsheh, ian.stokes, fbl, Ferruh Yigit,
	Andrew Rybchenko

On Fri, Jun 26, 2020 at 4:16 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 26/06/2020 12:35, Jerin Jacob:
> > On Fri, Jun 26, 2020 at 12:59 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 25/06/2020 19:55, Jerin Jacob:
> > > > On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang <jiaweiw@mellanox.com> wrote:
> > > > >
> > > > > When using full offload, all traffic will be handled by the HW, and
> > > > > directed to the requested vf or wire, the control application loses
> > > > > visibility on the traffic.
> > > > > So there's a need for an action that will enable the control application
> > > > > some visibility.
> > > > >
> > > > > The solution is introduced a new action that will sample the incoming
> > > > > traffic and send a duplicated traffic in some predefined ratio to the
> > > > > application, while the original packet will continue to the target
> > > > > destination.
> > > > >
> > > > > The packets sampled equals is '1/ratio', if the ratio value be set to 1
> > > > > , means that the packets would be completely mirrored. The sample packet
> > > > > can be assigned with different set of actions from the original packet.
> > > > >
> > > > > In order to support the sample packet in rte_flow, new rte_flow action
> > > > > definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
> > > >
> > > > Isn't mirroring the packet? How about, RTE_FLOW_ACTION_TYPE_MIRROR
> > > > I am not able to understand, Why it is called sample.
> > >
> > > Sampling is a partial mirroring.
> >
> > I think, By definition, _sampling_ is the _selection_ of items from a
> > specific group.
> > I think, _sampling_ is not dictating, what is the real action for the
> > "selected"  items.
> > One can get confused with the selected ones can be for forward, drop
> > any other action.
>
> I see. Good design question (I will let others reply).
>
> > So IMO, explicit mirror keyword usage makes it is clear.
> >
> > Some more related questions:
> > 1) What is the real use case for ratio? I am not against adding a
> > ratio attribute if the MLX hardware supports it. It will be good to
> > know the use case from the application perspective? And what basics
> > application set ratio != 1?
>
> If I understand well, some applications want to check,
> by picking random packets, that the processing is not failing.

That is not clear to me. I will wait for another explanation, if any.
On what basis would an application set .1 vs .8?

>
> > 2) If it is for "rate-limiting" or "policing", why not use rte_mtr
> > object (rte_mtr.h) via rte_flow action.
> > 3) One of the issue for driver developers and application writers are
> > overlapping APIs. This would overlap with rte_eth_mirror_rule_set()
> > API.
> >
> > Can we deprecate rte_eth_mirror_rule_set() API? It will be a pain for
> > all to have overlapping APIs. We have not fixed the VLAN filter API
> > overlap with rte_flow in ethdev. Its being TODO for multiple releases
> > now.
>
> Ooooooooh yes!
> I think flow-based API is more powerful, and should deprecate
> old port-based API.

+1 from me.

It takes too much effort and time to support duplicate APIs.

> I want to help deprecating such API in 20.11 if possible.

Please start that discussion. In this case, there is a clear API overlap
with rte_eth_mirror_rule_set().
We should not have two separate paths for the same function in the
same ethdev library.



>
> > > Full mirroring is sampling 100% packets (ratio = 1).
> > > That's why only one action is enough.
>
>
>


* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-26 11:10           ` Jerin Jacob
@ 2020-06-28  8:14             ` Andrew Rybchenko
  2020-06-28 13:16             ` Jiawei(Jonny) Wang
  1 sibling, 0 replies; 129+ messages in thread
From: Andrew Rybchenko @ 2020-06-28  8:14 UTC (permalink / raw)
  To: Jerin Jacob, Thomas Monjalon
  Cc: Jiawei Wang, Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev,
	Raslan Darawsheh, ian.stokes, fbl, Ferruh Yigit

On 6/26/20 2:10 PM, Jerin Jacob wrote:
> On Fri, Jun 26, 2020 at 4:16 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>>
>> 26/06/2020 12:35, Jerin Jacob:
>>> On Fri, Jun 26, 2020 at 12:59 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>>>>
>>>> 25/06/2020 19:55, Jerin Jacob:
>>>>> On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang <jiaweiw@mellanox.com> wrote:
>>>>>>
>>>>>> When using full offload, all traffic will be handled by the HW, and
>>>>>> directed to the requested vf or wire, the control application loses
>>>>>> visibility on the traffic.
>>>>>> So there's a need for an action that will enable the control application
>>>>>> some visibility.
>>>>>>
>>>>>> The solution is introduced a new action that will sample the incoming
>>>>>> traffic and send a duplicated traffic in some predefined ratio to the
>>>>>> application, while the original packet will continue to the target
>>>>>> destination.
>>>>>>
>>>>>> The packets sampled equals is '1/ratio', if the ratio value be set to 1
>>>>>> , means that the packets would be completely mirrored. The sample packet
>>>>>> can be assigned with different set of actions from the original packet.
>>>>>>
>>>>>> In order to support the sample packet in rte_flow, new rte_flow action
>>>>>> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
>>>>>
>>>>> Isn't mirroring the packet? How about, RTE_FLOW_ACTION_TYPE_MIRROR
>>>>> I am not able to understand, Why it is called sample.
>>>>
>>>> Sampling is a partial mirroring.
>>>
>>> I think, By definition, _sampling_ is the _selection_ of items from a
>>> specific group.
>>> I think, _sampling_ is not dictating, what is the real action for the
>>> "selected"  items.
>>> One can get confused with the selected ones can be for forward, drop
>>> any other action.
>>
>> I see. Good design question (I will let others reply).
>>
>>> So IMO, explicit mirror keyword usage makes it is clear.
>>>
>>> Some more related questions:
>>> 1) What is the real use case for ratio? I am not against adding a
>>> ratio attribute if the MLX hardware supports it. It will be good to
>>> know the use case from the application perspective? And what basics
>>> application set ratio != 1?
>>
>> If I understand well, some applications want to check,
>> by picking random packets, that the processing is not failing.
> 
> Not clear to me. I will wait for another explanation if any.
> In what basics application set .1 vs .8?
> 
>>
>>> 2) If it is for "rate-limiting" or "policing", why not use rte_mtr
>>> object (rte_mtr.h) via rte_flow action.
>>> 3) One of the issue for driver developers and application writers are
>>> overlapping APIs. This would overlap with rte_eth_mirror_rule_set()
>>> API.
>>>
>>> Can we deprecate rte_eth_mirror_rule_set() API? It will be a pain for
>>> all to have overlapping APIs. We have not fixed the VLAN filter API
>>> overlap with rte_flow in ethdev. Its being TODO for multiple releases
>>> now.
>>
>> Ooooooooh yes!
>> I think flow-based API is more powerful, and should deprecate
>> old port-based API.
> 
> +1 from me.

+1

> it is taking too much effort and time to make support duplicate APIs.
> 
>> I want to help deprecating such API in 20.11 if possible.
> 
> Please start that discussion. In this case, it is clear API overlap
> with rte_eth_mirror_rule_set().
> We should not have two separate paths for the same function in the
> same ethdev library.
> 
> 
> 
>>
>>>> Full mirroring is sampling 100% packets (ratio = 1).
>>>> That's why only one action is enough.
>>
>>
>>



* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow Jiawei Wang
  2020-06-25 17:55   ` Jerin Jacob
@ 2020-06-28  8:27   ` Andrew Rybchenko
  2020-06-28 16:16     ` Jiawei(Jonny) Wang
  2020-07-01  9:37   ` Ori Kam
  2 siblings, 1 reply; 129+ messages in thread
From: Andrew Rybchenko @ 2020-06-28  8:27 UTC (permalink / raw)
  To: Jiawei Wang, orika, viacheslavo, matan
  Cc: dev, thomas, rasland, ian.stokes, fbl

On 6/25/20 7:26 PM, Jiawei Wang wrote:
> When using full offload, all traffic will be handled by the HW, and
> directed to the requested vf or wire, the control application loses
> visibility on the traffic.
> So there's a need for an action that will enable the control application
> some visibility.
> 
> The solution is introduced a new action that will sample the incoming
> traffic and send a duplicated traffic in some predefined ratio to the
> application, while the original packet will continue to the target
> destination.
> 
> The packets sampled equals is '1/ratio', if the ratio value be set to 1
> , means that the packets would be completely mirrored. The sample packet
> can be assigned with different set of actions from the original packet.
> 
> In order to support the sample packet in rte_flow, new rte_flow action
> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
> will be introduced.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>

[snip]

> @@ -2709,6 +2716,28 @@ struct rte_flow_action {
>  struct rte_flow;
>  
>  /**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SAMPLE
> + *
> + * Adds a sample action to a matched flow.
> + *
> + * The matching packets will be duplicated to a special queue or vport
> + * in the predefined probabiilty, All the packets continues processing
> + * on the default flow path.
> + *
> + * When the sample ratio is set to 1 then the packets will be 100% mirrored.
> + * Additional action list be supported to add for sampled or mirrored packets.
> + */
> +struct rte_flow_action_sample {
> +	/* packets sampled equals to '1/ratio' */
> +	const uint32_t ratio;
> +	/* sub-action list specific for the sampling hit cases */
> +	const struct rte_flow_action *actions;

This design idea has not looked good to me from the very
beginning. IMHO it does not fit the overall flow API design.
I mean the sub-action list.

As I understand it, Linux iptables solves this at the match level
(i.e. in the pattern), e.g. with the "limit" extension, which is
basically sampling. Sampling using a meta pattern item in combination
with the PASSTHRU action (to make the sampling actions non-terminating
if required) is a better solution from a design point of view.


* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-26 11:10           ` Jerin Jacob
  2020-06-28  8:14             ` Andrew Rybchenko
@ 2020-06-28 13:16             ` Jiawei(Jonny) Wang
  2020-06-28 13:37               ` Jerin Jacob
  1 sibling, 1 reply; 129+ messages in thread
From: Jiawei(Jonny) Wang @ 2020-06-28 13:16 UTC (permalink / raw)
  To: Jerin Jacob, Thomas Monjalon
  Cc: Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev,
	Raslan Darawsheh, ian.stokes, fbl, Ferruh Yigit,
	Andrew Rybchenko


On Friday, June 26, 2020 7:10 PM Jerin Jacob <jerinjacobk@gmail.com> Wrote:
>
> On Fri, Jun 26, 2020 at 4:16 PM Thomas Monjalon <thomas@monjalon.net>
> wrote:
> >
> > 26/06/2020 12:35, Jerin Jacob:
> > > On Fri, Jun 26, 2020 at 12:59 AM Thomas Monjalon
> <thomas@monjalon.net> wrote:
> > > >
> > > > 25/06/2020 19:55, Jerin Jacob:
> > > > > On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang
> <jiaweiw@mellanox.com> wrote:
> > > > > >
> > > > > > When using full offload, all traffic will be handled by the
> > > > > > HW, and directed to the requested vf or wire, the control
> > > > > > application loses visibility on the traffic.
> > > > > > So there's a need for an action that will enable the control
> > > > > > application some visibility.
> > > > > >
> > > > > > The solution is introduced a new action that will sample the
> > > > > > incoming traffic and send a duplicated traffic in some
> > > > > > predefined ratio to the application, while the original packet
> > > > > > will continue to the target destination.
> > > > > >
> > > > > > The packets sampled equals is '1/ratio', if the ratio value be
> > > > > > set to 1 , means that the packets would be completely
> > > > > > mirrored. The sample packet can be assigned with different set of
> actions from the original packet.
> > > > > >
> > > > > > In order to support the sample packet in rte_flow, new
> > > > > > rte_flow action definition RTE_FLOW_ACTION_TYPE_SAMPLE and
> > > > > > structure rte_flow_action_sample
> > > > >
> > > > > Isn't mirroring the packet? How about,
> > > > > RTE_FLOW_ACTION_TYPE_MIRROR I am not able to understand, Why
> it is called sample.
> > > >
> > > > Sampling is a partial mirroring.
> > >
> > > I think, By definition, _sampling_ is the _selection_ of items from
> > > a specific group.
> > > I think, _sampling_ is not dictating, what is the real action for
> > > the "selected"  items.
> > > One can get confused with the selected ones can be for forward, drop
> > > any other action.
> >
> > I see. Good design question (I will let others reply).
> >
> > > So IMO, explicit mirror keyword usage makes it is clear.

A sampled packet is duplicated from the incoming traffic at a specific ratio
and goes through a separate set of sample actions;
ratio=1 means 100% duplication, i.e. mirroring.
All packets continue to the default flow actions.
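
As a rough illustration of the '1/ratio' semantics, here is a minimal
software sketch in C. It is not the mlx5 implementation: the counter-based
selection and the names sample_hit() and count_sampled() are assumptions
made only for this example, and real hardware may select packets
probabilistically instead.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: select every 'ratio'-th packet, so that 1/ratio of
 * the traffic is duplicated. ratio == 1 selects every packet (full mirror).
 */
static int sample_hit(uint32_t ratio, uint64_t pkt_index)
{
	assert(ratio != 0); /* ratio 0 has no meaning in this model */
	return (pkt_index % ratio) == 0;
}

/* Count how many of 'total' packets would be duplicated. */
static uint64_t count_sampled(uint32_t ratio, uint64_t total)
{
	uint64_t i, hits = 0;

	for (i = 0; i < total; i++)
		hits += (uint64_t)sample_hit(ratio, i);
	return hits;
}
```

In this model, count_sampled(1, 1000) duplicates all 1000 packets, while
count_sampled(10, 1000) duplicates 100 of them; the original packets are
untouched in both cases.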

> > >
> > > Some more related questions:
> > > 1) What is the real use case for ratio? I am not against adding a
> > > ratio attribute if the MLX hardware supports it. It will be good to
> > > know the use case from the application perspective? And what basics
> > > application set ratio != 1?
> >
> > If I understand well, some applications want to check, by picking
> > random packets, that the processing is not failing.
> 
> Not clear to me. I will wait for another explanation if any.
> In what basics application set .1 vs .8?

The real use case is monitoring traffic under full offload.
When packets hit the sample flow, the matching packets are sampled and sent to a specific queue.
The ratio aligns with the OVS sFlow probability, so the user application can set it to different values.

> 
> >
> > > 2) If it is for "rate-limiting" or "policing", why not use rte_mtr
> > > object (rte_mtr.h) via rte_flow action.

The sample ratio isn't the same as a "meter": the sampling ratio is calculated
against the incoming packets (one packet is sampled out of every so many).
The sampled packets are then duplicated and go through the other sample actions.
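
To make the difference concrete, here is a small sketch in C that puts a
packet-count sampler next to a simple token-bucket meter. Both are
simplified models for illustration: struct sampler, struct meter and their
helper functions are hypothetical names, not the rte_mtr API and not what
the hardware does internally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 1-in-N sampler: the only state is a packet counter. */
struct sampler {
	uint32_t ratio; /* duplicate one packet out of every 'ratio' */
	uint32_t count; /* packets seen since the last duplicate */
};

/* Returns 1 when this packet should be duplicated. */
static int sampler_hit(struct sampler *s)
{
	assert(s->ratio != 0);
	if (++s->count < s->ratio)
		return 0;
	s->count = 0;
	return 1;
}

/* Hypothetical single-rate token bucket: state depends on time and bytes. */
struct meter {
	uint64_t rate_bytes_per_tick; /* tokens added per time tick */
	uint64_t burst_bytes;         /* bucket capacity */
	uint64_t tokens;              /* tokens currently available */
};

/* Add tokens for 'ticks' elapsed time units, capped at the burst size. */
static void meter_refill(struct meter *m, uint64_t ticks)
{
	m->tokens += ticks * m->rate_bytes_per_tick;
	if (m->tokens > m->burst_bytes)
		m->tokens = m->burst_bytes;
}

/* Returns 1 when the packet conforms to the configured rate. */
static int meter_conform(struct meter *m, uint32_t pkt_len)
{
	if (m->tokens < pkt_len)
		return 0;
	m->tokens -= pkt_len;
	return 1;
}
```

The sampler's decision depends only on how many packets were seen, so the
number of copies scales with the traffic; the meter's decision depends on
elapsed time and packet sizes, so it caps a rate instead.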


> > > 3) One of the issue for driver developers and application writers
> > > are overlapping APIs. This would overlap with
> > > rte_eth_mirror_rule_set() API.
> > >
> > > Can we deprecate rte_eth_mirror_rule_set() API? It will be a pain
> > > for all to have overlapping APIs. We have not fixed the VLAN filter
> > > API overlap with rte_flow in ethdev. Its being TODO for multiple
> > > releases now.
> >
> > Ooooooooh yes!
> > I think flow-based API is more powerful, and should deprecate old
> > port-based API.
> 
> +1 from me.
> 
> it is taking too much effort and time to make support duplicate APIs.
> 
> > I want to help deprecating such API in 20.11 if possible.
> 
> Please start that discussion. In this case, it is clear API overlap with
> rte_eth_mirror_rule_set().
> We should not have two separate paths for the same function in the same
> ethdev library.
> 
> 
> 
> >
> > > > Full mirroring is sampling 100% packets (ratio = 1).
> > > > That's why only one action is enough.
> >
> >
> >


* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-28 13:16             ` Jiawei(Jonny) Wang
@ 2020-06-28 13:37               ` Jerin Jacob
  2020-06-28 15:52                 ` Jiawei(Jonny) Wang
  0 siblings, 1 reply; 129+ messages in thread
From: Jerin Jacob @ 2020-06-28 13:37 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang
  Cc: Thomas Monjalon, Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev,
	Raslan Darawsheh, ian.stokes, fbl, Ferruh Yigit,
	Andrew Rybchenko

On Sun, Jun 28, 2020 at 6:46 PM Jiawei(Jonny) Wang <jiaweiw@mellanox.com> wrote:
>
>
> On Friday, June 26, 2020 7:10 PM Jerin Jacob <jerinjacobk@gmail.com> Wrote:
> >
> > On Fri, Jun 26, 2020 at 4:16 PM Thomas Monjalon <thomas@monjalon.net>
> > wrote:
> > >
> > > 26/06/2020 12:35, Jerin Jacob:
> > > > On Fri, Jun 26, 2020 at 12:59 AM Thomas Monjalon
> > <thomas@monjalon.net> wrote:
> > > > >
> > > > > 25/06/2020 19:55, Jerin Jacob:
> > > > > > On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang
> > <jiaweiw@mellanox.com> wrote:
> > > > > > >
> > > > > > > When using full offload, all traffic will be handled by the
> > > > > > > HW, and directed to the requested vf or wire, the control
> > > > > > > application loses visibility on the traffic.
> > > > > > > So there's a need for an action that will enable the control
> > > > > > > application some visibility.
> > > > > > >
> > > > > > > The solution is introduced a new action that will sample the
> > > > > > > incoming traffic and send a duplicated traffic in some
> > > > > > > predefined ratio to the application, while the original packet
> > > > > > > will continue to the target destination.
> > > > > > >
> > > > > > > The packets sampled equals is '1/ratio', if the ratio value be
> > > > > > > set to 1 , means that the packets would be completely
> > > > > > > mirrored. The sample packet can be assigned with different set of
> > actions from the original packet.
> > > > > > >
> > > > > > > In order to support the sample packet in rte_flow, new
> > > > > > > rte_flow action definition RTE_FLOW_ACTION_TYPE_SAMPLE and
> > > > > > > structure rte_flow_action_sample
> > > > > >
> > > > > > Isn't mirroring the packet? How about,
> > > > > > RTE_FLOW_ACTION_TYPE_MIRROR I am not able to understand, Why
> > it is called sample.
> > > > >
> > > > > Sampling is a partial mirroring.
> > > >
> > > > I think, By definition, _sampling_ is the _selection_ of items from
> > > > a specific group.
> > > > I think, _sampling_ is not dictating, what is the real action for
> > > > the "selected"  items.
> > > > One can get confused with the selected ones can be for forward, drop
> > > > any other action.
> > >
> > > I see. Good design question (I will let others reply).
> > >
> > > > So IMO, explicit mirror keyword usage makes it is clear.
>
> Sampled packet is duplicated from incoming traffic at specific ratio and will go to different sample actions;
> ratio=1 is 100% duplication or mirroring.
> All packets will continue to go to default flow actions.

The functionality is clear from the git commit log (not from the action name).
The only question is what would be the appropriate name
for this action: RTE_FLOW_ACTION_TYPE_SAMPLE vs RTE_FLOW_ACTION_TYPE_MIRROR.

>
> > > >
> > > > Some more related questions:
> > > > 1) What is the real use case for ratio? I am not against adding a
> > > > ratio attribute if the MLX hardware supports it. It will be good to
> > > > know the use case from the application perspective? And what basics
> > > > application set ratio != 1?
> > >
> > > If I understand well, some applications want to check, by picking
> > > random packets, that the processing is not failing.
> >
> > Not clear to me. I will wait for another explanation if any.
> > In what basics application set .1 vs .8?
>
> The real case is like monitor the traffic with full-offload.
> While packet hit the sample flow, the matching packets will be sampled and sent to specific Queue,
> align with OVS sflow probability, user application can set it different value.

I understand the use case for mirroring, and it is supported in a lot of HW.
What I would like to understand is the use case for "ratio".
Is the "ratio" part of the OpenFlow spec? Or is it an MLX hardware feature?



>
> >
> > >
> > > > 2) If it is for "rate-limiting" or "policing", why not use rte_mtr
> > > > object (rte_mtr.h) via rte_flow action.
>
> The sample ratio isn’t the same as “meter’, the ratio of sampling will be calculated with incoming packets mask (every some packets sampled 1). Then the packets will be duplicated and go to do the other sample actions.

What I meant here is: if the ratio is used for rate-limiting, then
a cascaded rule such as RTE_FLOW_ACTION_TYPE_MIRROR followed by
RTE_FLOW_ACTION_TYPE_MTR will do the job.

>
>
> > > > 3) One of the issue for driver developers and application writers
> > > > are overlapping APIs. This would overlap with
> > > > rte_eth_mirror_rule_set() API.
> > > >
> > > > Can we deprecate rte_eth_mirror_rule_set() API? It will be a pain
> > > > for all to have overlapping APIs. We have not fixed the VLAN filter
> > > > API overlap with rte_flow in ethdev. Its being TODO for multiple
> > > > releases now.
> > >
> > > Ooooooooh yes!
> > > I think flow-based API is more powerful, and should deprecate old
> > > port-based API.
> >
> > +1 from me.
> >
> > it is taking too much effort and time to make support duplicate APIs.
> >
> > > I want to help deprecating such API in 20.11 if possible.
> >
> > Please start that discussion. In this case, it is clear API overlap with
> > rte_eth_mirror_rule_set().
> > We should not have two separate paths for the same function in the same
> > ethdev library.
> >
> >
> >
> > >
> > > > > Full mirroring is sampling 100% packets (ratio = 1).
> > > > > That's why only one action is enough.
> > >
> > >
> > >


* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-28 13:37               ` Jerin Jacob
@ 2020-06-28 15:52                 ` Jiawei(Jonny) Wang
  2020-07-02  0:18                   ` Stephen Hemminger
  0 siblings, 1 reply; 129+ messages in thread
From: Jiawei(Jonny) Wang @ 2020-06-28 15:52 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Thomas Monjalon, Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev,
	Raslan Darawsheh, ian.stokes, fbl, Ferruh Yigit,
	Andrew Rybchenko



> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Sunday, June 28, 2020 9:38 PM
> To: Jiawei(Jonny) Wang <jiaweiw@mellanox.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Ori Kam
> <orika@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>;
> Matan Azrad <matan@mellanox.com>; dpdk-dev <dev@dpdk.org>; Raslan
> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com;
> fbl@redhat.com; Ferruh Yigit <ferruh.yigit@intel.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>
> Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte
> flow
> 
> On Sun, Jun 28, 2020 at 6:46 PM Jiawei(Jonny) Wang
> <jiaweiw@mellanox.com> wrote:
> >
> >
> > On Friday, June 26, 2020 7:10 PM Jerin Jacob <jerinjacobk@gmail.com>
> Wrote:
> > >
> > > On Fri, Jun 26, 2020 at 4:16 PM Thomas Monjalon
> > > <thomas@monjalon.net>
> > > wrote:
> > > >
> > > > 26/06/2020 12:35, Jerin Jacob:
> > > > > On Fri, Jun 26, 2020 at 12:59 AM Thomas Monjalon
> > > <thomas@monjalon.net> wrote:
> > > > > >
> > > > > > 25/06/2020 19:55, Jerin Jacob:
> > > > > > > On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang
> > > <jiaweiw@mellanox.com> wrote:
> > > > > > > >
> > > > > > > > When using full offload, all traffic will be handled by
> > > > > > > > the HW, and directed to the requested vf or wire, the
> > > > > > > > control application loses visibility on the traffic.
> > > > > > > > So there's a need for an action that will enable the
> > > > > > > > control application some visibility.
> > > > > > > >
> > > > > > > > The solution is introduced a new action that will sample
> > > > > > > > the incoming traffic and send a duplicated traffic in some
> > > > > > > > predefined ratio to the application, while the original
> > > > > > > > packet will continue to the target destination.
> > > > > > > >
> > > > > > > > The packets sampled equals is '1/ratio', if the ratio
> > > > > > > > value be set to 1 , means that the packets would be
> > > > > > > > completely mirrored. The sample packet can be assigned
> > > > > > > > with different set of
> > > actions from the original packet.
> > > > > > > >
> > > > > > > > In order to support the sample packet in rte_flow, new
> > > > > > > > rte_flow action definition RTE_FLOW_ACTION_TYPE_SAMPLE
> and
> > > > > > > > structure rte_flow_action_sample
> > > > > > >
> > > > > > > Isn't mirroring the packet? How about,
> > > > > > > RTE_FLOW_ACTION_TYPE_MIRROR I am not able to understand,
> Why
> > > it is called sample.
> > > > > >
> > > > > > Sampling is a partial mirroring.
> > > > >
> > > > > I think, By definition, _sampling_ is the _selection_ of items
> > > > > from a specific group.
> > > > > I think, _sampling_ is not dictating, what is the real action
> > > > > for the "selected"  items.
> > > > > One can get confused with the selected ones can be for forward,
> > > > > drop any other action.
> > > >
> > > > I see. Good design question (I will let others reply).
> > > >
> > > > > So IMO, explicit mirror keyword usage makes it is clear.
> >
> > Sampled packet is duplicated from incoming traffic at specific ratio
> > and will go to different sample actions;
> > ratio=1 is 100% duplication or mirroring.
> > All packets will continue to go to default flow actions.
> 
> Functionality is clear from the git commit log(Not from action name).
> The only question is what would be the appropriate name for this action.
> RTE_FLOW_ACTION_TYPE_SAMPLE vs RTE_FLOW_ACTION_TYPE_MIRROR
> 
> >
> > > > >
> > > > > Some more related questions:
> > > > > 1) What is the real use case for ratio? I am not against adding
> > > > > a ratio attribute if the MLX hardware supports it. It will be
> > > > > good to know the use case from the application perspective? And
> > > > > what basics application set ratio != 1?
> > > >
> > > > If I understand well, some applications want to check, by picking
> > > > random packets, that the processing is not failing.
> > >
> > > Not clear to me. I will wait for another explanation if any.
> > > In what basics application set .1 vs .8?
> >
> > The real case is like monitor the traffic with full-offload.
> > While packet hit the sample flow, the matching packets will be sampled
> > and sent to specific Queue, align with OVS sflow probability, user
> application can set it different value.
> 
> I understand the use case for mirror and supported in a lot of HW.
> What I would like to understand is the use case for "ratio"?
> Is the "ratio" part of OpenFlow spec? Or Is it an MLX hardware feature?
> 
It has the same usage as the 'probability' variable of the OVS sample action,
and MLX HW implements it.
> 
> 
> >
> > >
> > > >
> > > > > 2) If it is for "rate-limiting" or "policing", why not use
> > > > > rte_mtr object (rte_mtr.h) via rte_flow action.
> >
> > The sample ratio isn’t the same as “meter’, the ratio of sampling will be
> calculated with incoming packets mask (every some packets sampled 1).
> Then the packets will be duplicated and go to do the other sample actions.
> 
> What I meant here is , If the ratio is used for rate-limiting then having a
> cascade rule like RTE_FLOW_ACTION_TYPE_MIRROR,
> RTE_FLOW_ACTION_TYPE_MTR will do the job.
> 
The ratio is the probability of packet replication, so we don't need to add a METER action here.
> >
> >
> > > > > 3) One of the issue for driver developers and application
> > > > > writers are overlapping APIs. This would overlap with
> > > > > rte_eth_mirror_rule_set() API.
> > > > >
> > > > > Can we deprecate rte_eth_mirror_rule_set() API? It will be a
> > > > > pain for all to have overlapping APIs. We have not fixed the
> > > > > VLAN filter API overlap with rte_flow in ethdev. Its being TODO
> > > > > for multiple releases now.
> > > >
> > > > Ooooooooh yes!
> > > > I think flow-based API is more powerful, and should deprecate old
> > > > port-based API.
> > >
> > > +1 from me.
> > >
> > > it is taking too much effort and time to make support duplicate APIs.
> > >
> > > > I want to help deprecating such API in 20.11 if possible.
> > >
> > > Please start that discussion. In this case, it is clear API overlap
> > > with rte_eth_mirror_rule_set().
> > > We should not have two separate paths for the same function in the
> > > same ethdev library.
> > >
> > >
> > >
> > > >
> > > > > > Full mirroring is sampling 100% packets (ratio = 1).
> > > > > > That's why only one action is enough.
> > > >
> > > >
> > > >


* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-28  8:27   ` Andrew Rybchenko
@ 2020-06-28 16:16     ` Jiawei(Jonny) Wang
  2020-06-28 16:18       ` Andrew Rybchenko
  0 siblings, 1 reply; 129+ messages in thread
From: Jiawei(Jonny) Wang @ 2020-06-28 16:16 UTC (permalink / raw)
  To: Andrew Rybchenko, Ori Kam, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl


On Sunday, June 28, 2020 4:27 PM, Andrew Rybchenko <arybchenko@solarflare.com> wrote:
> 
> On 6/25/20 7:26 PM, Jiawei Wang wrote:
> > When using full offload, all traffic will be handled by the HW, and
> > directed to the requested vf or wire, the control application loses
> > visibility on the traffic.
> > So there's a need for an action that will enable the control
> > application some visibility.
> >
> > The solution is introduced a new action that will sample the incoming
> > traffic and send a duplicated traffic in some predefined ratio to the
> > application, while the original packet will continue to the target
> > destination.
> >
> > The packets sampled equals is '1/ratio', if the ratio value be set to
> > 1 , means that the packets would be completely mirrored. The sample
> > packet can be assigned with different set of actions from the original
> packet.
> >
> > In order to support the sample packet in rte_flow, new rte_flow action
> > definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> > rte_flow_action_sample will be introduced.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> 
> [snip]
> 
> > @@ -2709,6 +2716,28 @@ struct rte_flow_action {  struct rte_flow;
> >
> >  /**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_SAMPLE
> > + *
> > + * Adds a sample action to a matched flow.
> > + *
> > + * The matching packets will be duplicated to a special queue or
> > +vport
> > + * in the predefined probabiilty, All the packets continues
> > +processing
> > + * on the default flow path.
> > + *
> > + * When the sample ratio is set to 1 then the packets will be 100%
> mirrored.
> > + * Additional action list be supported to add for sampled or mirrored
> packets.
> > + */
> > +struct rte_flow_action_sample {
> > +	/* packets sampled equals to '1/ratio' */
> > +	const uint32_t ratio;
> > +	/* sub-action list specific for the sampling hit cases */
> > +	const struct rte_flow_action *actions;
> 
> This design idea does not look good to me from the very beginning. IMHO it
> does not fit flow API overall design.
> I mean sub-action list.
> 
> As I understand Linux iptables solves it on match level (i.e. in pattern). E.g.
> "limit" extension which is basically sampling. Sampling using meta pattern
> item in combination with PASSTHRU action (to make sampling actions non-
> terminating if required) is a better solution from design point of view.

In our design, there are a sample flow path and a normal flow path, and each path can have different actions.
The defined sub-action list applies only to sampled packets in the sample flow path;
on the normal path, all packets continue with the original actions.
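
The two paths can be modelled roughly as follows. This is a sketch under
stated assumptions: the enum, struct action, struct action_sample and
apply_actions() are local stand-ins written for this example (they only
mirror the two fields of the proposed rte_flow_action_sample) and are not
the DPDK definitions or the PMD logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for illustration; these are NOT the DPDK definitions. */
enum action_type {
	ACTION_END,    /* terminates an action list */
	ACTION_QUEUE,  /* deliver to an Rx queue */
	ACTION_PORT,   /* forward to a port (vf or wire) */
	ACTION_SAMPLE, /* duplicate 1/ratio of packets to a sub-action list */
};

struct action {
	enum action_type type;
	const void *conf;
};

/* Mirrors the two fields of the proposed rte_flow_action_sample. */
struct action_sample {
	uint32_t ratio;
	const struct action *actions;
};

/* Counters standing in for "what happened to the traffic". */
struct stats {
	uint64_t to_queue;
	uint64_t to_port;
};

/* Apply one action list to a packet; pkt_index drives the 1/ratio choice. */
static void apply_actions(const struct action *acts, uint64_t pkt_index,
			  struct stats *st)
{
	for (; acts->type != ACTION_END; acts++) {
		switch (acts->type) {
		case ACTION_QUEUE:
			st->to_queue++;
			break;
		case ACTION_PORT:
			st->to_port++;
			break;
		case ACTION_SAMPLE: {
			const struct action_sample *s = acts->conf;

			assert(s != NULL && s->ratio != 0);
			/* Sample path: only the duplicate sees the sub-actions. */
			if (pkt_index % s->ratio == 0)
				apply_actions(s->actions, pkt_index, st);
			break;
		}
		default:
			break;
		}
	}
}
```

With a rule of { SAMPLE(ratio=4, sub-actions={QUEUE}), PORT }, every packet
is still forwarded to the port on the normal path, and one in four
additionally produces a copy that is delivered to the queue.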


* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-28 16:16     ` Jiawei(Jonny) Wang
@ 2020-06-28 16:18       ` Andrew Rybchenko
  2020-06-29 11:40         ` Ori Kam
  0 siblings, 1 reply; 129+ messages in thread
From: Andrew Rybchenko @ 2020-06-28 16:18 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Ori Kam, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl

On 6/28/20 7:16 PM, Jiawei(Jonny) Wang wrote:
> 
> On Sunday, June 28, 2020 4:27 PM, Andrew Rybchenko <arybchenko@solarflare.com> wrote:
>>
>> On 6/25/20 7:26 PM, Jiawei Wang wrote:
>>> When using full offload, all traffic will be handled by the HW, and
>>> directed to the requested vf or wire, the control application loses
>>> visibility on the traffic.
>>> So there's a need for an action that will enable the control
>>> application some visibility.
>>>
>>> The solution is introduced a new action that will sample the incoming
>>> traffic and send a duplicated traffic in some predefined ratio to the
>>> application, while the original packet will continue to the target
>>> destination.
>>>
>>> The packets sampled equals is '1/ratio', if the ratio value be set to
>>> 1 , means that the packets would be completely mirrored. The sample
>>> packet can be assigned with different set of actions from the original
>> packet.
>>>
>>> In order to support the sample packet in rte_flow, new rte_flow action
>>> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
>>> rte_flow_action_sample will be introduced.
>>>
>>> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
>>
>> [snip]
>>
>>> @@ -2709,6 +2716,28 @@ struct rte_flow_action {  struct rte_flow;
>>>
>>>  /**
>>> + * @warning
>>> + * @b EXPERIMENTAL: this structure may change without prior notice
>>> + *
>>> + * RTE_FLOW_ACTION_TYPE_SAMPLE
>>> + *
>>> + * Adds a sample action to a matched flow.
>>> + *
>>> + * The matching packets will be duplicated to a special queue or
>>> +vport
>>> + * in the predefined probabiilty, All the packets continues
>>> +processing
>>> + * on the default flow path.
>>> + *
>>> + * When the sample ratio is set to 1 then the packets will be 100%
>> mirrored.
>>> + * Additional action list be supported to add for sampled or mirrored
>> packets.
>>> + */
>>> +struct rte_flow_action_sample {
>>> +	/* packets sampled equals to '1/ratio' */
>>> +	const uint32_t ratio;
>>> +	/* sub-action list specific for the sampling hit cases */
>>> +	const struct rte_flow_action *actions;
>>
>> This design idea does not look good to me from the very beginning. IMHO it
>> does not fit flow API overall design.
>> I mean sub-action list.
>>
>> As I understand Linux iptables solves it on match level (i.e. in pattern). E.g.
>> "limit" extension which is basically sampling. Sampling using meta pattern
>> item in combination with PASSTHRU action (to make sampling actions non-
>> terminating if required) is a better solution from design point of view.
> 
> On our design, there're sample flow path and normal flow path, each path can have different actions.
> The defined sub-actions list only applied for sampled packets in the sample flow path;
> For normal path, all packets will continue to go with the original actions.
> 

In mine too.


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-28 16:18       ` Andrew Rybchenko
@ 2020-06-29 11:40         ` Ori Kam
  2020-06-29 13:11           ` Andrew Rybchenko
  0 siblings, 1 reply; 129+ messages in thread
From: Ori Kam @ 2020-06-29 11:40 UTC (permalink / raw)
  To: Andrew Rybchenko, Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl

Hi all,

> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Sunday, June 28, 2020 7:19 PM
> To: Jiawei(Jonny) Wang <jiaweiw@mellanox.com>; Ori Kam
> <orika@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>; Matan
> Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com; fbl@redhat.com
> Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte
> flow
> 
> On 6/28/20 7:16 PM, Jiawei(Jonny) Wang wrote:
> >
> > On Sunday, June 28, 2020 4:27 PM, Andrew Rybchenko
> <arybchenko@solarflare.com> wrote:
> >>
> >> On 6/25/20 7:26 PM, Jiawei Wang wrote:
> >>> When using full offload, all traffic will be handled by the HW, and
> >>> directed to the requested vf or wire, the control application loses
> >>> visibility on the traffic.
> >>> So there's a need for an action that will enable the control
> >>> application some visibility.
> >>>
> >>> The solution is introduced a new action that will sample the incoming
> >>> traffic and send a duplicated traffic in some predefined ratio to the
> >>> application, while the original packet will continue to the target
> >>> destination.
> >>>
> >>> The packets sampled equals is '1/ratio', if the ratio value be set to
> >>> 1 , means that the packets would be completely mirrored. The sample
> >>> packet can be assigned with different set of actions from the original
> >> packet.
> >>>
> >>> In order to support the sample packet in rte_flow, new rte_flow action
> >>> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> >>> rte_flow_action_sample will be introduced.
> >>>
> >>> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> >>
> >> [snip]
> >>
> >>> @@ -2709,6 +2716,28 @@ struct rte_flow_action {  struct rte_flow;
> >>>
> >>>  /**
> >>> + * @warning
> >>> + * @b EXPERIMENTAL: this structure may change without prior notice
> >>> + *
> >>> + * RTE_FLOW_ACTION_TYPE_SAMPLE
> >>> + *
> >>> + * Adds a sample action to a matched flow.
> >>> + *
> >>> + * The matching packets will be duplicated to a special queue or
> >>> +vport
> >>> + * in the predefined probabiilty, All the packets continues
> >>> +processing
> >>> + * on the default flow path.
> >>> + *
> >>> + * When the sample ratio is set to 1 then the packets will be 100%
> >> mirrored.
> >>> + * Additional action list be supported to add for sampled or mirrored
> >> packets.
> >>> + */
> >>> +struct rte_flow_action_sample {
> >>> +	/* packets sampled equals to '1/ratio' */
> >>> +	const uint32_t ratio;
> >>> +	/* sub-action list specific for the sampling hit cases */
> >>> +	const struct rte_flow_action *actions;
> >>
> >> This design idea does not look good to me from the very beginning. IMHO it
> >> does not fit flow API overall design.
> >> I mean sub-action list.
> >>
> >> As I understand Linux iptables solves it on match level (i.e. in pattern). E.g.
> >> "limit" extension which is basically sampling. Sampling using meta pattern
> >> item in combination with PASSTHRU action (to make sampling actions non-
> >> terminating if required) is a better solution from design point of view.
> >
> > On our design, there're sample flow path and normal flow path, each path
> can have different actions.
> > The defined sub-actions list only applied for sampled packets in the sample
> flow path;
> > For normal path, all packets will continue to go with the original actions.
> >
> 
> In my too.

First, as far as I know TC works close to the suggested approach (that by itself doesn’t mean anything).
The concept of PASSTHRU is a good one, but it has some issues to consider:
1. Using PASSTHRU means that the matching part needs to be checked more than once, which has a
performance penalty. Also, a number of HW devices can offload only a limited number of flows, so
this approach will waste resources.
2. Using PASSTHRU will force the order of flows (sure, it can be done using priorities, but that is more
complex for the application to implement).
3. PASSTHRU means that there will be 2 terminal actions for each flow (for example queue index 2 / passthru);
this is also not native to RTE flow.
4. We want to select only part of the packets, we want some of the actions to be done on both
packets (the sampled one and the standard one), and then we want to do some specific actions on the
sampled packet while doing different actions on the standard packet.
Let's check the following use case:
The application is using full offload of traffic from the wire to a VM, and the traffic should be decapped.
So the basic flow is:
Flow create 0  transfer ingress pattern eth / outer.ip =x / end  actions decap / port id 3
After the offload the application loses visibility of the traffic, but it still wants to sample some of the traffic
in order to verify that the traffic is valid. So the application requests to receive some of the original traffic and
to mark it with an id.

If we use the original approach (the one in the patch) we will need something like this:
Flow 1: flow create 0 transfer ingress pattern eth / outer.ip=x / end actions sample(ratio 2, actions mark id 3 / port pf) / decap / port 3
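
As a minimal sketch of how the flow above could be laid out in C: the types below are local stand-ins (all demo_* names are hypothetical), shaped after the proposed struct rte_flow_action_sample from patch 1/8. The real definitions live in rte_flow.h and are not used here, and the per-action configuration (mark id, port id) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in action types; the real enum is rte_flow_action_type. */
enum demo_action_type {
	DEMO_ACTION_END = 0,
	DEMO_ACTION_MARK,
	DEMO_ACTION_PORT_ID,
	DEMO_ACTION_DECAP,
	DEMO_ACTION_SAMPLE,
};

/* Stand-in for struct rte_flow_action: a type plus an opaque conf. */
struct demo_action {
	enum demo_action_type type;
	const void *conf;
};

/* Shaped after the proposed struct rte_flow_action_sample. */
struct demo_action_sample {
	uint32_t ratio;                    /* 1/ratio of the packets are sampled */
	const struct demo_action *actions; /* applied to the sampled copy only */
};

/* Count the entries that precede the END terminator. */
static size_t
demo_action_count(const struct demo_action *list)
{
	size_t n = 0;

	assert(list != NULL);
	while (list[n].type != DEMO_ACTION_END)
		n++;
	return n;
}

/* Sub-action list for the sampled copy: mark, then send to the PF. */
static const struct demo_action demo_sample_sub_actions[] = {
	{ DEMO_ACTION_MARK, NULL },    /* "mark id 3", conf omitted */
	{ DEMO_ACTION_PORT_ID, NULL }, /* "port pf", conf omitted */
	{ DEMO_ACTION_END, NULL },
};

static const struct demo_action_sample demo_sample_conf = {
	.ratio = 2,
	.actions = demo_sample_sub_actions,
};

/* Main list: sample comes first, so the copy is taken before decap. */
static const struct demo_action demo_flow_actions[] = {
	{ DEMO_ACTION_SAMPLE, &demo_sample_conf },
	{ DEMO_ACTION_DECAP, NULL },
	{ DEMO_ACTION_PORT_ID, NULL }, /* "port 3", conf omitted */
	{ DEMO_ACTION_END, NULL },
};
```

The point of the layout is that one rule carries two action lists: the outer list applies to every matched packet, and the nested list hangs off the sample action and applies to the copies only.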

In the PASSTHRU concept (I'm not sure I can even create such flows):
Flow 1: flow create 0 transfer ingress pattern eth / outer.ip =x / end  actions decap / port 2  / passthru // original request
Flow 2: flow create 0 transfer ingress pattern eth / outer.ip=x / should sample (new item that selects whether the packet is sampled, based on the ratio) / end actions mark / port pf

The main issue with this case is that the decap is before the sample, so the sample will get a decapped packet.
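
The ordering concern can be illustrated with a small model. This is only a sketch under assumptions: the step names and the packet state are hypothetical, and the only thing it models is that a sampled copy reflects whatever has already been done to the packet at the point the copy is taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical processing steps applied to a matched packet, in order. */
enum demo_step {
	DEMO_STEP_SAMPLE, /* take a copy of the packet as it is right now */
	DEMO_STEP_DECAP,  /* strip the tunnel header from the packet */
};

struct demo_pkt_state {
	bool encapsulated;
};

/*
 * Walk the steps in order and report whether the sampled copy still
 * carries the tunnel header. The list is expected to contain one
 * DEMO_STEP_SAMPLE entry.
 */
static bool
demo_sampled_copy_is_encapsulated(const enum demo_step *steps, size_t n)
{
	struct demo_pkt_state pkt = { .encapsulated = true };
	struct demo_pkt_state copy = pkt;
	size_t i;

	assert(steps != NULL);
	for (i = 0; i < n; i++) {
		if (steps[i] == DEMO_STEP_DECAP)
			pkt.encapsulated = false;
		else
			copy = pkt; /* DEMO_STEP_SAMPLE */
	}
	return copy.encapsulated;
}
```

With the steps ordered sample then decap the copy keeps the original headers; with decap then sample it does not, which is the case described above.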

So when looking at everything, I think the original API is the best approach.
For the record, I think that the passthru action is very important and should be supported, but it is not the best one for this feature.

Thanks,
Ori


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-29 11:40         ` Ori Kam
@ 2020-06-29 13:11           ` Andrew Rybchenko
  2020-06-29 14:29             ` Ori Kam
  0 siblings, 1 reply; 129+ messages in thread
From: Andrew Rybchenko @ 2020-06-29 13:11 UTC (permalink / raw)
  To: Ori Kam, Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Adrien Mazarguil

Hi all,

CC Adrien

(I apologize for pulling you into the rte_flow API discussions
once again, but maybe you can find spare time and share your
thoughts. Your opinion as an author and architect of the
rte_flow API would be very useful and highly appreciated.)

On 6/29/20 2:40 PM, Ori Kam wrote:
> Hi all,
> 
>> -----Original Message-----
>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>> Sent: Sunday, June 28, 2020 7:19 PM
>> To: Jiawei(Jonny) Wang <jiaweiw@mellanox.com>; Ori Kam
>> <orika@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>; Matan
>> Azrad <matan@mellanox.com>
>> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
>> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com; fbl@redhat.com
>> Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte
>> flow
>>
>> On 6/28/20 7:16 PM, Jiawei(Jonny) Wang wrote:
>>>
>>> On Sunday, June 28, 2020 4:27 PM, Andrew Rybchenko
>> <arybchenko@solarflare.com> wrote:
>>>>
>>>> On 6/25/20 7:26 PM, Jiawei Wang wrote:
>>>>> When using full offload, all traffic will be handled by the HW, and
>>>>> directed to the requested vf or wire, the control application loses
>>>>> visibility on the traffic.
>>>>> So there's a need for an action that will enable the control
>>>>> application some visibility.
>>>>>
>>>>> The solution is introduced a new action that will sample the incoming
>>>>> traffic and send a duplicated traffic in some predefined ratio to the
>>>>> application, while the original packet will continue to the target
>>>>> destination.
>>>>>
>>>>> The packets sampled equals is '1/ratio', if the ratio value be set to
>>>>> 1 , means that the packets would be completely mirrored. The sample
>>>>> packet can be assigned with different set of actions from the original
>>>> packet.
>>>>>
>>>>> In order to support the sample packet in rte_flow, new rte_flow action
>>>>> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
>>>>> rte_flow_action_sample will be introduced.
>>>>>
>>>>> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
>>>>
>>>> [snip]
>>>>
>>>>> @@ -2709,6 +2716,28 @@ struct rte_flow_action {  struct rte_flow;
>>>>>
>>>>>  /**
>>>>> + * @warning
>>>>> + * @b EXPERIMENTAL: this structure may change without prior notice
>>>>> + *
>>>>> + * RTE_FLOW_ACTION_TYPE_SAMPLE
>>>>> + *
>>>>> + * Adds a sample action to a matched flow.
>>>>> + *
>>>>> + * The matching packets will be duplicated to a special queue or
>>>>> +vport
>>>>> + * in the predefined probabiilty, All the packets continues
>>>>> +processing
>>>>> + * on the default flow path.
>>>>> + *
>>>>> + * When the sample ratio is set to 1 then the packets will be 100%
>>>> mirrored.
>>>>> + * Additional action list be supported to add for sampled or mirrored
>>>> packets.
>>>>> + */
>>>>> +struct rte_flow_action_sample {
>>>>> +	/* packets sampled equals to '1/ratio' */
>>>>> +	const uint32_t ratio;
>>>>> +	/* sub-action list specific for the sampling hit cases */
>>>>> +	const struct rte_flow_action *actions;
>>>>
>>>> This design idea does not look good to me from the very beginning. IMHO it
>>>> does not fit flow API overall design.
>>>> I mean sub-action list.
>>>>
>>>> As I understand Linux iptables solves it on match level (i.e. in pattern). E.g.
>>>> "limit" extension which is basically sampling. Sampling using meta pattern
>>>> item in combination with PASSTHRU action (to make sampling actions non-
>>>> terminating if required) is a better solution from design point of view.
>>>
>>> On our design, there're sample flow path and normal flow path, each path
>> can have different actions.
>>> The defined sub-actions list only applied for sampled packets in the sample
>> flow path;
>>> For normal path, all packets will continue to go with the original actions.
>>>
>>
>> In my too.
> 
> First as far as I know TC works close to the suggest approach (that by itself doesn’t mean anything)
> The concept of a PASSTHRU is a good one but it has some issue to consider:
> 1. When using PASSTHRU it will mean that the matching part will be needed to be checked 
> more times this will have performance penalty , also number of HW have limited number of flow that can be offload
> this will approach will waste resources.

Marking or tagging could be used to address it. E.g. target
traffic could be tagged first, then matching by tag should be
used to sample and to do HW offloads.
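
A rough sketch of that two-stage idea, under stated assumptions: the demo_* names, the tag values and the single-field "full pattern" are all hypothetical, and real hardware would do this in flow tables rather than in C. It only shows that the full pattern is evaluated once and that later rules look at the tag alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_TAG_NONE   0u
#define DEMO_TAG_TARGET 1u

struct demo_pkt {
	uint32_t outer_ip;         /* outer IP address, host byte order */
	uint32_t tag;              /* internal tag, not visible outside */
	unsigned int full_matches; /* times the full pattern was evaluated */
};

/* Stage 1: evaluate the full pattern once and record the result as a tag. */
static void
demo_stage_tag(struct demo_pkt *pkt, uint32_t target_ip)
{
	assert(pkt != NULL);
	pkt->full_matches++;
	pkt->tag = (pkt->outer_ip == target_ip) ?
		   DEMO_TAG_TARGET : DEMO_TAG_NONE;
}

/* Stage 2: the sampling rule and the offload rule both match on the tag only. */
static bool
demo_rule_hits(const struct demo_pkt *pkt)
{
	assert(pkt != NULL);
	return pkt->tag == DEMO_TAG_TARGET;
}
```

Whether the extra tagging step costs more than repeating the match is device specific, so the sketch says nothing about performance.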

Moreover, mapping of rte_flow API rules into HW rules is not
required to be 1-to-1. Yes, 1-to-1 is simple, but it could be
more complicated: 1-to-N (when one rte_flow API rule is
represented by many HW rules) or N-to-1 (when a few flow API
rules are represented as one HW rule) or even N-to-M.
For example, tagging which is not visible outside could be
purely SW and used to build such constructions in SW.
It is an implementation detail that is out of scope of the generic
API definition.

Yes, it sounds like over-complicating, but I really dislike
the above sub-action list from a design point of view and that's
why I'm trying to think in different directions.

> 2. Using PASSTHRU will force the order of flows (sure it can be done using priorities but it is more complex to 
> the application to implement) 

See above.

> 3. PASSTHRU will mean that there will be 2 terminal action for each flow (for example queue index 2 / passthru)
> this also is not native to RTE flow. 

Sorry, but there are two branches for terminating actions in
the sampling action design anyway (yes, internal/hidden).
You need two copies of the packet, so whatever you do it
will be two terminating actions.

> 4. since we want to select only part of the packets, and we want to have some of the actions done on both 
> packets (the sampled and the standard one) and then we want   on the sampled packet do some specific actions
> while on the standard packet do different actions.

Yes, it is not a problem with PASSTHRU.

> Lest check the following use case:
> Application is using full offload traffic from the wire to a VM, which should decaped 
> So the basic flow is:
> Flow create 0  transfer ingress pattern eth / outer.ip =x / end  actions decap / port id 3 
> Since after the offload the application loses visibility of the traffic. it still wants to sample some of the traffic
> in order to verify that the traffic is valid. So the application request to receive some of the original traffic and
> mark it with id.
> 
> If we use the original approach (the one in the patch) we will need something like this:
> Flow 1: flow create 0 transfer ingress pattern eth / outer.ip=x / end actions sample(ratio 2,  actions mark id 3 / port pf)) / decap / port 3
> 
> In the PASSTHRU concept (I'm not sure I can even create such flows)
> Flow 1: flow create 0 transfer ingress pattern eth / outer.ip =x / end  actions decap / port 2  /passtthru // original request
> Flow 2: flow create 0 transfer ingress pattern eth/ outer.ip=x / should sample (new item that selects if the packet is selected based on the ratio)end act / mark / port pf
> 
> The main issue with this case the decap is before the sample so the sample will get decap packet.

The order should simply be different: first sampling with
pass-through, second decap and deliver to VM.

Yes, I realize that two rules with basically the same match
(modulo the sampling match) are not ideal for mapping to HW (even
if it is collapsed into a trivial tag match with a pre-rule that
does the tagging). I'm not 100% happy with it, but I'm even less
happy with the sub-action list design and am just trying to find
a better alternative solution.

> So when looking at everything I think the original API is the best approach.
> For the record I think that passthru action is very important and should be supported but not the best one for this feature.
> 
> Thanks,
> Ori
> 


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-29 13:11           ` Andrew Rybchenko
@ 2020-06-29 14:29             ` Ori Kam
  2020-06-30 16:42               ` Ori Kam
  0 siblings, 1 reply; 129+ messages in thread
From: Ori Kam @ 2020-06-29 14:29 UTC (permalink / raw)
  To: Andrew Rybchenko, Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Adrien Mazarguil

Hi All,

> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Monday, June 29, 2020 4:12 PM
> To: Ori Kam <orika@mellanox.com>; Jiawei(Jonny) Wang
> <jiaweiw@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>;
> Matan Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com; fbl@redhat.com;
> Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte
> flow
> 
> Hi all,
> 
> CC Adrien
> 
> (I apologize for pulling you to the rte_flow API discussions
> once again, but may be you can find spare time and share your
> thoughts. Your opinion as an author and architect of the
> rte_flow API would be very useful and highly appreciated.)
> 
> On 6/29/20 2:40 PM, Ori Kam wrote:
> > Hi all,
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <arybchenko@solarflare.com>
> >> Sent: Sunday, June 28, 2020 7:19 PM
> >> To: Jiawei(Jonny) Wang <jiaweiw@mellanox.com>; Ori Kam
> >> <orika@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>;
> Matan
> >> Azrad <matan@mellanox.com>
> >> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> >> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com;
> fbl@redhat.com
> >> Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte
> >> flow
> >>
> >> On 6/28/20 7:16 PM, Jiawei(Jonny) Wang wrote:
> >>>
> >>> On Sunday, June 28, 2020 4:27 PM, Andrew Rybchenko
> >> <arybchenko@solarflare.com> wrote:
> >>>>
> >>>> On 6/25/20 7:26 PM, Jiawei Wang wrote:
> >>>>> When using full offload, all traffic will be handled by the HW, and
> >>>>> directed to the requested vf or wire, the control application loses
> >>>>> visibility on the traffic.
> >>>>> So there's a need for an action that will enable the control
> >>>>> application some visibility.
> >>>>>
> >>>>> The solution is introduced a new action that will sample the incoming
> >>>>> traffic and send a duplicated traffic in some predefined ratio to the
> >>>>> application, while the original packet will continue to the target
> >>>>> destination.
> >>>>>
> >>>>> The packets sampled equals is '1/ratio', if the ratio value be set to
> >>>>> 1 , means that the packets would be completely mirrored. The sample
> >>>>> packet can be assigned with different set of actions from the original
> >>>> packet.
> >>>>>
> >>>>> In order to support the sample packet in rte_flow, new rte_flow action
> >>>>> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> >>>>> rte_flow_action_sample will be introduced.
> >>>>>
> >>>>> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> >>>>
> >>>> [snip]
> >>>>
> >>>>> @@ -2709,6 +2716,28 @@ struct rte_flow_action {  struct rte_flow;
> >>>>>
> >>>>>  /**
> >>>>> + * @warning
> >>>>> + * @b EXPERIMENTAL: this structure may change without prior notice
> >>>>> + *
> >>>>> + * RTE_FLOW_ACTION_TYPE_SAMPLE
> >>>>> + *
> >>>>> + * Adds a sample action to a matched flow.
> >>>>> + *
> >>>>> + * The matching packets will be duplicated to a special queue or
> >>>>> +vport
> >>>>> + * in the predefined probabiilty, All the packets continues
> >>>>> +processing
> >>>>> + * on the default flow path.
> >>>>> + *
> >>>>> + * When the sample ratio is set to 1 then the packets will be 100%
> >>>> mirrored.
> >>>>> + * Additional action list be supported to add for sampled or mirrored
> >>>> packets.
> >>>>> + */
> >>>>> +struct rte_flow_action_sample {
> >>>>> +	/* packets sampled equals to '1/ratio' */
> >>>>> +	const uint32_t ratio;
> >>>>> +	/* sub-action list specific for the sampling hit cases */
> >>>>> +	const struct rte_flow_action *actions;
> >>>>
> >>>> This design idea does not look good to me from the very beginning. IMHO
> it
> >>>> does not fit flow API overall design.
> >>>> I mean sub-action list.
> >>>>
> >>>> As I understand Linux iptables solves it on match level (i.e. in pattern). E.g.
> >>>> "limit" extension which is basically sampling. Sampling using meta pattern
> >>>> item in combination with PASSTHRU action (to make sampling actions
> non-
> >>>> terminating if required) is a better solution from design point of view.
> >>>
> >>> On our design, there're sample flow path and normal flow path, each path
> >> can have different actions.
> >>> The defined sub-actions list only applied for sampled packets in the sample
> >> flow path;
> >>> For normal path, all packets will continue to go with the original actions.
> >>>
> >>
> >> In my too.
> >
> > First as far as I know TC works close to the suggest approach (that by itself
> doesn’t mean anything)
> > The concept of a PASSTHRU is a good one but it has some issue to consider:
> > 1. When using PASSTHRU it will mean that the matching part will be needed
> to be checked
> > more times this will have performance penalty , also number of HW have
> limited number of flow that can be offload
> > this will approach will waste resources.
> 
> Marking or tagging could be used to address it. E.g. target
> traffic could be tagged first, then matching by tag should be
> used to sample and to do HW offloads.
> 
In this case you are forcing at least two steps, which will hurt performance.
Matching on mark is the same as matching on other items, while adding the extra
mark action may carry an extra penalty. (This is a general HW issue; I assume a
number of manufacturers have the same limitation.)

> Moreover, mapping of rte_flow API rules into HW rule is not
> required to be 1-to-1. Yes, 1-to-1 is simple, but it could be
> more complicated 1-to-N (when one rte_flow API rule is
> represented by many HW rules) or N-to-1 (when few flow API
> rules are represented as one HW rule) or even N-to-M.
> For example, tagging which is not visible outside, could be
> purely SW and used to build such constructions in SW.
> It is an implementation detail is out of scope of the generic
> API definition.
> 
I don’t understand this part. I never said that 1-to-1 is needed,
but if you try to combine flows in SW it means that you must keep all flows
in the PMD in order to combine them. I know some HW must do that, but not
all of them, and if they don’t it is just a huge memory waste.

> Yes, it sounds like over-complicating, but I really dislike
> above sub-action list from design point of view and that's
> why I"m trying to think in different directions.
> 
Thinking in different directions is always good.
I just think that between the two approaches I like the original better.
Maybe you can explain the reasons for your opinion and we can find the best
solution together?



> > 2. Using PASSTHRU will force the order of flows (sure it can be done using
> priorities but it is more complex to
> > the application to implement)
> 
> See above.
> 

See above 😊

> > 3. PASSTHRU will mean that there will be 2 terminal action for each flow (for
> example queue index 2 / passthru)
> > this also is not native to RTE flow.
> 
> Sorry, but there are two branches for terminating actions in
> the sampling action design anyway (yes, internal/hidden).
> You need two copies of the packet, so whatever you do it
> will be two terminating actions.
> 
Yes, but you can look at it as 1 flow with 2 sets of actions and not two flows.

> > 4. since we want to select only part of the packets, and we want to have some
> of the actions done on both
> > packets (the sampled and the standard one) and then we want   on the
> sampled packet do some specific actions
> > while on the standard packet do different actions.
> 
> Yes, it is not a problem with PASSTHRU.
> 
> > Lest check the following use case:
> > Application is using full offload traffic from the wire to a VM, which should
> decaped
> > So the basic flow is:
> > Flow create 0  transfer ingress pattern eth / outer.ip =x / end  actions decap /
> port id 3
> > Since after the offload the application loses visibility of the traffic. it still
> wants to sample some of the traffic
> > in order to verify that the traffic is valid. So the application request to receive
> some of the original traffic and
> > mark it with id.
> >
> > If we use the original approach (the one in the patch) we will need something
> like this:
> > Flow 1: flow create 0 transfer ingress pattern eth / outer.ip=x / end actions
> sample(ratio 2,  actions mark id 3 / port pf)) / decap / port 3
> >
> > In the PASSTHRU concept (I'm not sure I can even create such flows)
> > Flow 1: flow create 0 transfer ingress pattern eth / outer.ip =x / end  actions
> decap / port 2  /passtthru // original request
> > Flow 2: flow create 0 transfer ingress pattern eth/ outer.ip=x / should sample
> (new item that selects if the packet is selected based on the ratio)end act /
> mark / port pf
> >
> > The main issue with this case the decap is before the sample so the sample
> will get decap packet.
> 
> Order should be simply different: first sampling with pass-
> through, second decap and deliver to VM.
>
Then in your case the packet will be marked, which is not what the
application requested.
 
> Yes, I realize that two actions with basically the same match
> (modulo sampling match) is not ideal for mapping to HW (even
> if it collapsed into trivial tag match which pre-rule to make
> tagging). I'm not 100% happy with it, but I'm even less happy
> with sub-action list design and just trying to find better
> alternative solution.
> 
Please see above, maybe we can find the best solution together. 

> > So when looking at everything I think the original API is the best approach.
> > For the record I think that passthru action is very important and should be
> supported but not the best one for this feature.
> >
> > Thanks,
> > Ori
> >
Best,
Ori

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH 8/8] app/testpmd: add testpmd command for sample action
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 8/8] app/testpmd: add testpmd command for sample action Jiawei Wang
@ 2020-06-30 15:23   ` Ori Kam
  0 siblings, 0 replies; 129+ messages in thread
From: Ori Kam @ 2020-06-30 15:23 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Jiawei(Jonny) Wang



> -----Original Message-----
> From: Jiawei Wang <jiaweiw@mellanox.com>
> Sent: Thursday, June 25, 2020 7:26 PM
> To: Ori Kam <orika@mellanox.com>; Slava Ovsiienko
> <viacheslavo@mellanox.com>; Matan Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com; fbl@redhat.com;
> Jiawei(Jonny) Wang <jiaweiw@mellanox.com>
> Subject: [PATCH 8/8] app/testpmd: add testpmd command for sample action
> 
> Add a new testpmd command 'set sample_actions' that supports the multiple
> sample actions list configuration by using the index:
> set sample_actions <index> <actions list>
> 
> The examples for the sample flow use case and result as below:
> 
> 1. set sample_actions 0 mark id 0x8 / queue index 2 / end
> .. pattern eth / end actions sample ratio 2 index 0 / jump group 2 ...
> 
> This flow will result in all the matched ingress packets will be
> jumped to next flow table, and the each second packet will be
> marked and sent to queue 2 of the control application.
> 
> 2. ...pattern eth / end actions sample ratio 2 / port_id id 2 ...
> 
> The flow will result in all the matched ingress packets will be sent to
> port 2, and the each second packet will also be sent to e-switch
> manager vport.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> ---
>  app/test-pmd/cmdline_flow.c | 285
> ++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 276 insertions(+), 9 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index 4e2006c..6b1e515 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -56,6 +56,8 @@ enum index {
>  	SET_RAW_ENCAP,
>  	SET_RAW_DECAP,
>  	SET_RAW_INDEX,
> +	SET_SAMPLE_ACTIONS,
> +	SET_SAMPLE_INDEX,
> 
>  	/* Top-level command. */
>  	FLOW,
> @@ -349,6 +351,10 @@ enum index {
>  	ACTION_SET_IPV6_DSCP_VALUE,
>  	ACTION_AGE,
>  	ACTION_AGE_TIMEOUT,
> +	ACTION_SAMPLE,
> +	ACTION_SAMPLE_RATIO,
> +	ACTION_SAMPLE_INDEX,
> +	ACTION_SAMPLE_INDEX_VALUE,
>  };
> 
>  /** Maximum size for pattern in struct rte_flow_item_raw. */
> @@ -484,6 +490,22 @@ struct action_nvgre_encap_data {
> 
>  struct mplsoudp_decap_conf mplsoudp_decap_conf;
> 
> +#define ACTION_SAMPLE_ACTIONS_NUM 10
> +#define RAW_SAMPLE_CONFS_MAX_NUM 8
> +/** Storage for struct rte_flow_action_sample including external data. */
> +struct action_sample_data {
> +	struct rte_flow_action_sample conf;
> +	uint32_t idx;
> +};
> +/** Storage for struct rte_flow_action_sample. */
> +struct raw_sample_conf {
> +	struct rte_flow_action data[ACTION_SAMPLE_ACTIONS_NUM];
> +};
> +struct raw_sample_conf
> raw_sample_confs[RAW_SAMPLE_CONFS_MAX_NUM];
> +struct rte_flow_action_mark
> sample_mark[RAW_SAMPLE_CONFS_MAX_NUM];
> +struct rte_flow_action_queue
> sample_queue[RAW_SAMPLE_CONFS_MAX_NUM];
> +struct rte_flow_action_count
> sample_count[RAW_SAMPLE_CONFS_MAX_NUM];
> +
>  /** Maximum number of subsequent tokens and arguments on the stack. */
>  #define CTX_STACK_SIZE 16
> 
> @@ -1161,6 +1183,7 @@ struct parse_action_priv {
>  	ACTION_SET_IPV4_DSCP,
>  	ACTION_SET_IPV6_DSCP,
>  	ACTION_AGE,
> +	ACTION_SAMPLE,
>  	ZERO,
>  };
> 
> @@ -1393,9 +1416,28 @@ struct parse_action_priv {
>  	ZERO,
>  };
> 
> +static const enum index action_sample[] = {
> +	ACTION_SAMPLE,
> +	ACTION_SAMPLE_RATIO,
> +	ACTION_SAMPLE_INDEX,
> +	ACTION_NEXT,
> +	ZERO,
> +};
> +
> +static const enum index next_action_sample[] = {
> +	ACTION_QUEUE,
> +	ACTION_MARK,
> +	ACTION_COUNT,
> +	ACTION_NEXT,
> +	ZERO,
> +};
> +
>  static int parse_set_raw_encap_decap(struct context *, const struct token *,
>  				     const char *, unsigned int,
>  				     void *, unsigned int);
> +static int parse_set_sample_action(struct context *, const struct token *,
> +				   const char *, unsigned int,
> +				   void *, unsigned int);
>  static int parse_set_init(struct context *, const struct token *,
>  			  const char *, unsigned int,
>  			  void *, unsigned int);
> @@ -1460,7 +1502,15 @@ static int parse_vc_action_raw_decap_index(struct
> context *,
>  static int parse_vc_action_set_meta(struct context *ctx,
>  				    const struct token *token, const char *str,
>  				    unsigned int len, void *buf,
> +					unsigned int size);
> +static int parse_vc_action_sample(struct context *ctx,
> +				    const struct token *token, const char *str,
> +				    unsigned int len, void *buf,
>  				    unsigned int size);
> +static int
> +parse_vc_action_sample_index(struct context *ctx, const struct token *token,
> +				const char *str, unsigned int len, void *buf,
> +				unsigned int size);
>  static int parse_destroy(struct context *, const struct token *,
>  			 const char *, unsigned int,
>  			 void *, unsigned int);
> @@ -1531,6 +1581,8 @@ static int comp_vc_action_rss_queue(struct context
> *, const struct token *,
>  				    unsigned int, char *, unsigned int);
>  static int comp_set_raw_index(struct context *, const struct token *,
>  			      unsigned int, char *, unsigned int);
> +static int comp_set_sample_index(struct context *, const struct token *,
> +			      unsigned int, char *, unsigned int);
> 
>  /** Token definitions. */
>  static const struct token token_list[] = {
> @@ -3612,11 +3664,13 @@ static int comp_set_raw_index(struct context *,
> const struct token *,
>  	/* Top level command. */
>  	[SET] = {
>  		.name = "set",
> -		.help = "set raw encap/decap data",
> -		.type = "set raw_encap|raw_decap <index> <pattern>",
> +		.help = "set raw encap/decap/sample data",
> +		.type = "set raw_encap|raw_decap <index> <pattern>"
> +				" or set sample_actions <index> <action>",
>  		.next = NEXT(NEXT_ENTRY
>  			     (SET_RAW_ENCAP,
> -			      SET_RAW_DECAP)),
> +			      SET_RAW_DECAP,
> +			      SET_SAMPLE_ACTIONS)),
>  		.call = parse_set_init,
>  	},
>  	/* Sub-level commands. */
> @@ -3647,6 +3701,23 @@ static int comp_set_raw_index(struct context *,
> const struct token *,
>  		.next = NEXT(next_item),
>  		.call = parse_port,
>  	},
> +	[SET_SAMPLE_INDEX] = {
> +		.name = "{index}",
> +		.type = "UNSIGNED",
> +		.help = "index of sample actions",
> +		.next = NEXT(next_action_sample),
> +		.call = parse_port,
> +	},
> +	[SET_SAMPLE_ACTIONS] = {
> +		.name = "sample_actions",
> +		.help = "set sample actions list",
> +		.next = NEXT(NEXT_ENTRY(SET_SAMPLE_INDEX)),
> +		.args = ARGS(ARGS_ENTRY_ARB_BOUNDED
> +				(offsetof(struct buffer, port),
> +				 sizeof(((struct buffer *)0)->port),
> +				 0, RAW_SAMPLE_CONFS_MAX_NUM - 1)),
> +		.call = parse_set_sample_action,
> +	},
>  	[ACTION_SET_TAG] = {
>  		.name = "set_tag",
>  		.help = "set tag",
> @@ -3750,6 +3821,37 @@ static int comp_set_raw_index(struct context *, const struct token *,
>  		.next = NEXT(action_age, NEXT_ENTRY(UNSIGNED)),
>  		.call = parse_vc_conf,
>  	},
> +	[ACTION_SAMPLE] = {
> +		.name = "sample",
> +		.help = "set a sample action",
> +		.next = NEXT(action_sample),
> +		.priv = PRIV_ACTION(SAMPLE,
> +			sizeof(struct action_sample_data)),
> +		.call = parse_vc_action_sample,
> +	},
> +	[ACTION_SAMPLE_RATIO] = {
> +		.name = "ratio",
> +		.help = "flow sample ratio value",
> +		.next = NEXT(action_sample, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_ARB
> +			     (offsetof(struct action_sample_data, conf) +
> +			      offsetof(struct rte_flow_action_sample, ratio),
> +			      sizeof(((struct rte_flow_action_sample *)0)->
> +				     ratio))),
> +	},
> +	[ACTION_SAMPLE_INDEX] = {
> +		.name = "index",
> +		.help = "the index of sample actions list",
> +		.next = NEXT(NEXT_ENTRY(ACTION_SAMPLE_INDEX_VALUE)),
> +	},
> +	[ACTION_SAMPLE_INDEX_VALUE] = {
> +		.name = "{index}",
> +		.type = "UNSIGNED",
> +		.help = "unsigned integer value",
> +		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
> +		.call = parse_vc_action_sample_index,
> +		.comp = comp_set_sample_index,
> +	},
>  };
> 
>  /** Remove and return last entry from argument stack. */
> @@ -5207,6 +5309,76 @@ static int comp_set_raw_index(struct context *, const struct token *,
>  	return len;
>  }
> 
> +static int
> +parse_vc_action_sample(struct context *ctx, const struct token *token,
> +			 const char *str, unsigned int len, void *buf,
> +			 unsigned int size)
> +{
> +	struct buffer *out = buf;
> +	struct rte_flow_action *action;
> +	struct action_sample_data *action_sample_data = NULL;
> +	static struct rte_flow_action end_action = {
> +		RTE_FLOW_ACTION_TYPE_END, 0
> +	};
> +	int ret;
> +
> +	ret = parse_vc(ctx, token, str, len, buf, size);
> +	if (ret < 0)
> +		return ret;
> +	/* Nothing else to do if there is no buffer. */
> +	if (!out)
> +		return ret;
> +	if (!out->args.vc.actions_n)
> +		return -1;
> +	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
> +	/* Point to selected object. */
> +	ctx->object = out->args.vc.data;
> +	ctx->objmask = NULL;
> +	/* Copy the headers to the buffer. */
> +	action_sample_data = ctx->object;
> +	action_sample_data->conf.actions = &end_action;
> +	action->conf = &action_sample_data->conf;
> +	return ret;
> +}
> +
> +static int
> +parse_vc_action_sample_index(struct context *ctx, const struct token *token,
> +				const char *str, unsigned int len, void *buf,
> +				unsigned int size)
> +{
> +	struct action_sample_data *action_sample_data;
> +	struct rte_flow_action *action;
> +	const struct arg *arg;
> +	struct buffer *out = buf;
> +	int ret;
> +	uint16_t idx;
> +
> +	RTE_SET_USED(token);
> +	RTE_SET_USED(buf);
> +	RTE_SET_USED(size);
> +	if (ctx->curr != ACTION_SAMPLE_INDEX_VALUE)
> +		return -1;
> +	arg = ARGS_ENTRY_ARB_BOUNDED
> +		(offsetof(struct action_sample_data, idx),
> +		 sizeof(((struct action_sample_data *)0)->idx),
> +		 0, RAW_SAMPLE_CONFS_MAX_NUM - 1);
> +	if (push_args(ctx, arg))
> +		return -1;
> +	ret = parse_int(ctx, token, str, len, NULL, 0);
> +	if (ret < 0) {
> +		pop_args(ctx);
> +		return -1;
> +	}
> +	if (!ctx->object)
> +		return len;
> +	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
> +	action_sample_data = ctx->object;
> +	idx = action_sample_data->idx;
> +	action_sample_data->conf.actions = raw_sample_confs[idx].data;
> +	action->conf = &action_sample_data->conf;
> +	return len;
> +}
> +
>  /** Parse tokens for destroy command. */
>  static int
>  parse_destroy(struct context *ctx, const struct token *token,
> @@ -5971,6 +6143,38 @@ static int comp_set_raw_index(struct context *, const struct token *,
>  	if (!out->command)
>  		return -1;
>  	out->command = ctx->curr;
> +	/* For encap/decap all we need is the pattern. */
> +	out->args.vc.pattern = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
> +						       sizeof(double));
> +	return len;
> +}
> +
> +/** Parse set command, initialize output buffer for subsequent tokens. */
> +static int
> +parse_set_sample_action(struct context *ctx, const struct token *token,
> +			  const char *str, unsigned int len,
> +			  void *buf, unsigned int size)
> +{
> +	struct buffer *out = buf;
> +
> +	/* Token name must match. */
> +	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
> +		return -1;
> +	/* Nothing else to do if there is no buffer. */
> +	if (!out)
> +		return len;
> +	/* Make sure buffer is large enough. */
> +	if (size < sizeof(*out))
> +		return -1;
> +	ctx->objdata = 0;
> +	ctx->objmask = NULL;
> +	ctx->object = out;
> +	if (!out->command)
> +		return -1;
> +	out->command = ctx->curr;
> +	/* For the sampler all we need is the actions. */
> +	out->args.vc.actions = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
> +						       sizeof(double));
>  	return len;
>  }
> 
> @@ -6007,11 +6211,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
>  			return -1;
>  		out->command = ctx->curr;
>  		out->args.vc.data = (uint8_t *)out + size;
> -		/* All we need is pattern */
> -		out->args.vc.pattern =
> -			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
> -					       sizeof(double));
> -		ctx->object = out->args.vc.pattern;
> +		ctx->object  = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
> +						       sizeof(double));
>  	}
>  	return len;
>  }
> @@ -6162,6 +6363,24 @@ static int comp_set_raw_index(struct context *, const struct token *,
>  	return nb;
>  }
> 
> +/** Complete index number for set sample_actions commands. */
> +static int
> +comp_set_sample_index(struct context *ctx, const struct token *token,
> +		   unsigned int ent, char *buf, unsigned int size)
> +{
> +	uint16_t idx = 0;
> +	uint16_t nb = 0;
> +
> +	RTE_SET_USED(ctx);
> +	RTE_SET_USED(token);
> +	for (idx = 0; idx < RAW_SAMPLE_CONFS_MAX_NUM; ++idx) {
> +		if (buf && idx == ent)
> +			return snprintf(buf, size, "%u", idx);
> +		++nb;
> +	}
> +	return nb;
> +}
> +
>  /** Internal context. */
>  static struct context cmd_flow_context;
> 
> @@ -6607,7 +6826,53 @@ static int comp_set_raw_index(struct context *, const struct token *,
>  	return mask;
>  }
> 
> -
> +/** Dispatch parsed buffer to function calls. */
> +static void
> +cmd_set_raw_parsed_sample(const struct buffer *in)
> +{
> +	uint32_t n = in->args.vc.actions_n;
> +	uint32_t i = 0;
> +	struct rte_flow_action *action = NULL;
> +	struct rte_flow_action *data = NULL;
> +	size_t size = 0;
> +	uint16_t idx = in->port; /* We borrow port field as index */
> +	uint32_t max_size = sizeof(struct rte_flow_action) *
> +						ACTION_SAMPLE_ACTIONS_NUM;
> +
> +	RTE_ASSERT(in->command == SET_SAMPLE_ACTIONS);
> +	data = (struct rte_flow_action *)&raw_sample_confs[idx].data;
> +	memset(data, 0x00, max_size);
> +	for (; i < n; i++) {
> +		action = in->args.vc.actions + i;
> +		if (action->type == RTE_FLOW_ACTION_TYPE_END)
> +			break;
> +		switch (action->type) {
> +		case RTE_FLOW_ACTION_TYPE_MARK:
> +			size = sizeof(struct rte_flow_action_mark);
> +			rte_memcpy(&sample_mark[idx],
> +				(const void *)action->conf, size);
> +			action->conf = &sample_mark[idx];
> +			break;
> +		case RTE_FLOW_ACTION_TYPE_COUNT:
> +			size = sizeof(struct rte_flow_action_count);
> +			rte_memcpy(&sample_count[idx],
> +				(const void *)action->conf, size);
> +			action->conf = &sample_count[idx];
> +			break;
> +		case RTE_FLOW_ACTION_TYPE_QUEUE:
> +			size = sizeof(struct rte_flow_action_queue);
> +			rte_memcpy(&sample_queue[idx],
> +				(const void *)action->conf, size);
> +			action->conf = &sample_queue[idx];
> +			break;
> +		default:
> +			printf("Error - Not supported action\n");
> +			return;
> +		}
> +		rte_memcpy(data, action, sizeof(struct rte_flow_action));
> +		data++;
> +	}
> +}
> 
>  /** Dispatch parsed buffer to function calls. */
>  static void
> @@ -6624,6 +6889,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
>  	uint16_t proto = 0;
>  	uint16_t idx = in->port; /* We borrow port field as index */
> 
> +	if (in->command == SET_SAMPLE_ACTIONS)
> +		return cmd_set_raw_parsed_sample(in);
>  	RTE_ASSERT(in->command == SET_RAW_ENCAP ||
>  		   in->command == SET_RAW_DECAP);
>  	if (in->command == SET_RAW_ENCAP) {
> --
> 1.8.3.1

Acked-by: Ori Kam <orika@mellanox.com>
Thanks,
Ori


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH 2/8] common/mlx5: glue for default miss and sample action
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 2/8] common/mlx5: glue for default miss and sample action Jiawei Wang
@ 2020-06-30 15:25   ` Ori Kam
  0 siblings, 0 replies; 129+ messages in thread
From: Ori Kam @ 2020-06-30 15:25 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Jiawei(Jonny) Wang



> -----Original Message-----
> From: Jiawei Wang <jiaweiw@mellanox.com>
> Sent: Thursday, June 25, 2020 7:26 PM
> To: Ori Kam <orika@mellanox.com>; Slava Ovsiienko
> <viacheslavo@mellanox.com>; Matan Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com; fbl@redhat.com;
> Jiawei(Jonny) Wang <jiaweiw@mellanox.com>
> Subject: [PATCH 2/8] common/mlx5: glue for default miss and sample action
> 
> rdma-core introduces two new DR actions: default miss and sample.
> 
> Add the rdma-core commands in glue to create these two actions.
> 
> The default miss action is used for the sampled packet on the FDB domain;
> it steers the packet to the e-switch manager vport.
> 
> Sample action is used for creating the sample object to implement
> the sampling/mirroring function.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> ---
>  drivers/common/mlx5/Makefile          | 10 ++++++++++
>  drivers/common/mlx5/linux/meson.build |  4 ++++
>  drivers/common/mlx5/linux/mlx5_glue.c | 28 ++++++++++++++++++++++++++++
>  drivers/common/mlx5/linux/mlx5_glue.h | 13 +++++++++++++
>  4 files changed, 55 insertions(+)
> 
> diff --git a/drivers/common/mlx5/Makefile b/drivers/common/mlx5/Makefile
> index 622bde4..8db0604 100644
> --- a/drivers/common/mlx5/Makefile
> +++ b/drivers/common/mlx5/Makefile
> @@ -187,6 +187,16 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
>  		func mlx5dv_dump_dr_domain \
>  		$(AUTOCONF_OUTPUT)
>  	$Q sh -- '$<' '$@' \
> +		HAVE_MLX5_DR_CREATE_ACTION_DEFAULT_MISS \
> +		infiniband/mlx5dv.h \
> +		func mlx5dv_dr_action_create_default_miss \
> +		$(AUTOCONF_OUTPUT)
> +	$Q sh -- '$<' '$@' \
> +		HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE \
> +		infiniband/mlx5dv.h \
> +		func mlx5dv_dr_action_create_flow_sampler \
> +		$(AUTOCONF_OUTPUT)
> +	$Q sh -- '$<' '$@' \
>  		HAVE_MLX5DV_MMAP_GET_NC_PAGES_CMD \
>  		infiniband/mlx5dv.h \
>  		enum MLX5_MMAP_GET_NC_PAGES_CMD \
> diff --git a/drivers/common/mlx5/linux/meson.build b/drivers/common/mlx5/linux/meson.build
> index 638bb2b..95f3204 100644
> --- a/drivers/common/mlx5/linux/meson.build
> +++ b/drivers/common/mlx5/linux/meson.build
> @@ -160,6 +160,10 @@ has_sym_args = [
>  	'RDMA_NLDEV_ATTR_NDEV_INDEX' ],
>  	[ 'HAVE_MLX5_DR_FLOW_DUMP', 'infiniband/mlx5dv.h',
>  	'mlx5dv_dump_dr_domain'],
> +	[ 'HAVE_MLX5_DR_CREATE_ACTION_DEFAULT_MISS', 'infiniband/mlx5dv.h',
> +	'mlx5dv_dr_action_create_default_miss'],
> +	[ 'HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE', 'infiniband/mlx5dv.h',
> +	'mlx5dv_dr_action_create_flow_sampler'],
>  	[ 'HAVE_MLX5DV_DR_MEM_RECLAIM', 'infiniband/mlx5dv.h',
>  	'mlx5dv_dr_domain_set_reclaim_device_memory'],
>  	[ 'HAVE_DEVLINK', 'linux/devlink.h', 'DEVLINK_GENL_NAME' ],
> diff --git a/drivers/common/mlx5/linux/mlx5_glue.c b/drivers/common/mlx5/linux/mlx5_glue.c
> index c91ee33..ea366e2 100644
> --- a/drivers/common/mlx5/linux/mlx5_glue.c
> +++ b/drivers/common/mlx5/linux/mlx5_glue.c
> @@ -1047,6 +1047,30 @@
>  #endif
>  }
> 
> +static void *
> +mlx5_glue_dr_create_flow_action_default_miss(void)
> +{
> +#ifdef HAVE_MLX5_DR_CREATE_ACTION_DEFAULT_MISS
> +	return mlx5dv_dr_action_create_default_miss();
> +#else
> +	errno = ENOTSUP;
> +	return NULL;
> +#endif
> +}
> +
> +static void *
> +mlx5_glue_dr_create_flow_action_sampler(
> +			struct mlx5dv_dr_flow_sampler_attr *attr)
> +{
> +#ifdef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
> +	return mlx5dv_dr_action_create_flow_sampler(attr);
> +#else
> +	(void)attr;
> +	errno = ENOTSUP;
> +	return NULL;
> +#endif
> +}
> +
>  static int
>  mlx5_glue_devx_query_eqn(struct ibv_context *ctx, uint32_t cpus,
>  			 uint32_t *eqn)
> @@ -1294,6 +1318,10 @@
>  	.devx_port_query = mlx5_glue_devx_port_query,
>  	.dr_dump_domain = mlx5_glue_dr_dump_domain,
>  	.dr_reclaim_domain_memory = mlx5_glue_dr_reclaim_domain_memory,
> +	.dr_create_flow_action_default_miss =
> +		mlx5_glue_dr_create_flow_action_default_miss,
> +	.dr_create_flow_action_sampler =
> +		mlx5_glue_dr_create_flow_action_sampler,
>  	.devx_query_eqn = mlx5_glue_devx_query_eqn,
>  	.devx_create_event_channel = mlx5_glue_devx_create_event_channel,
>  	.devx_destroy_event_channel = mlx5_glue_devx_destroy_event_channel,
> diff --git a/drivers/common/mlx5/linux/mlx5_glue.h b/drivers/common/mlx5/linux/mlx5_glue.h
> index 5d238a4..9b1487d 100644
> --- a/drivers/common/mlx5/linux/mlx5_glue.h
> +++ b/drivers/common/mlx5/linux/mlx5_glue.h
> @@ -77,6 +77,7 @@
>  #ifndef HAVE_MLX5DV_DR
>  enum  mlx5dv_dr_domain_type { unused, };
>  struct mlx5dv_dr_domain;
> +struct mlx5dv_dr_action;
>  #endif
> 
>  #ifndef HAVE_MLX5DV_DR_DEVX_PORT
> @@ -87,6 +88,15 @@
>  struct mlx5dv_dr_flow_meter_attr;
>  #endif
> 
> +#ifndef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
> +struct mlx5dv_dr_flow_sampler_attr {
> +	uint32_t sample_ratio;
> +	void *default_next_table;
> +	size_t num_sample_actions;
> +	struct mlx5dv_dr_action **sample_actions;
> +};
> +#endif
> +
>  #ifndef HAVE_IBV_DEVX_EVENT
>  struct mlx5dv_devx_event_channel { int fd; };
>  struct mlx5dv_devx_async_event_hdr;
> @@ -303,6 +313,9 @@ struct mlx5_glue {
>  			 struct mlx5dv_devx_async_event_hdr *event_data,
>  			 size_t event_resp_len);
>  	void (*dr_reclaim_domain_memory)(void *domain, uint32_t enable);
> +	void *(*dr_create_flow_action_default_miss)(void);
> +	void *(*dr_create_flow_action_sampler)
> +			(struct mlx5dv_dr_flow_sampler_attr *attr);
>  };
> 
>  extern const struct mlx5_glue *mlx5_glue;
> --
> 1.8.3.1

Acked-by: Ori Kam <orika@mellanox.com>
Thanks,
Ori


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-29 14:29             ` Ori Kam
@ 2020-06-30 16:42               ` Ori Kam
  0 siblings, 0 replies; 129+ messages in thread
From: Ori Kam @ 2020-06-30 16:42 UTC (permalink / raw)
  To: Ori Kam, Andrew Rybchenko, Jiawei(Jonny) Wang, Slava Ovsiienko,
	Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Adrien Mazarguil

Hi All,

After considering both approaches, I think the original is the better approach.

Acked-by: Ori Kam <orika@mellanox.com>
Thanks,
Ori

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Ori Kam
> Sent: Monday, June 29, 2020 5:30 PM
> To: Andrew Rybchenko <arybchenko@solarflare.com>; Jiawei(Jonny) Wang
> <jiaweiw@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>;
> Matan Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com; fbl@redhat.com;
> Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte
> flow
> 
> Hi All,
> 
> > -----Original Message-----
> > From: Andrew Rybchenko <arybchenko@solarflare.com>
> > Sent: Monday, June 29, 2020 4:12 PM
> > To: Ori Kam <orika@mellanox.com>; Jiawei(Jonny) Wang
> > <jiaweiw@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>;
> > Matan Azrad <matan@mellanox.com>
> > Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> > Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com;
> fbl@redhat.com;
> > Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte
> > flow
> >
> > Hi all,
> >
> > CC Adrien
> >
> > (I apologize for pulling you to the rte_flow API discussions
> > once again, but may be you can find spare time and share your
> > thoughts. Your opinion as an author and architect of the
> > rte_flow API would be very useful and highly appreciated.)
> >
> > On 6/29/20 2:40 PM, Ori Kam wrote:
> > > Hi all,
> > >
> > >> -----Original Message-----
> > >> From: Andrew Rybchenko <arybchenko@solarflare.com>
> > >> Sent: Sunday, June 28, 2020 7:19 PM
> > >> To: Jiawei(Jonny) Wang <jiaweiw@mellanox.com>; Ori Kam
> > >> <orika@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>;
> > Matan
> > >> Azrad <matan@mellanox.com>
> > >> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> > >> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com;
> > fbl@redhat.com
> > >> Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for
> rte
> > >> flow
> > >>
> > >> On 6/28/20 7:16 PM, Jiawei(Jonny) Wang wrote:
> > >>>
> > >>> On Sunday, June 28, 2020 4:27 PM, Andrew Rybchenko
> > >> <arybchenko@solarflare.com> wrote:
> > >>>>
> > >>>> On 6/25/20 7:26 PM, Jiawei Wang wrote:
> > >>>>> When using full offload, all traffic will be handled by the HW, and
> > >>>>> directed to the requested vf or wire, the control application loses
> > >>>>> visibility on the traffic.
> > >>>>> So there's a need for an action that will enable the control
> > >>>>> application some visibility.
> > >>>>>
> > >>>>> The solution introduces a new action that will sample the incoming
> > >>>>> traffic and send duplicated traffic in some predefined ratio to the
> > >>>>> application, while the original packet will continue to the target
> > >>>>> destination.
> > >>>>>
> > >>>>> The fraction of packets sampled is '1/ratio'; if the ratio value is set
> > >>>>> to 1, the packets are completely mirrored. The sampled
> > >>>>> packet can be assigned a different set of actions from the original
> > >>>>> packet.
> > >>>>>
> > >>>>> In order to support the sample packet in rte_flow, new rte_flow action
> > >>>>> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> > >>>>> rte_flow_action_sample will be introduced.
> > >>>>>
> > >>>>> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> > >>>>
> > >>>> [snip]
> > >>>>
> > >>>>> @@ -2709,6 +2716,28 @@ struct rte_flow_action {  struct rte_flow;
> > >>>>>
> > >>>>>  /**
> > >>>>> + * @warning
> > >>>>> + * @b EXPERIMENTAL: this structure may change without prior notice
> > >>>>> + *
> > >>>>> + * RTE_FLOW_ACTION_TYPE_SAMPLE
> > >>>>> + *
> > >>>>> + * Adds a sample action to a matched flow.
> > >>>>> + *
> > >>>>> + * The matching packets will be duplicated to a special queue or vport
> > >>>>> + * with the predefined probability. All the packets continue processing
> > >>>>> + * on the default flow path.
> > >>>>> + *
> > >>>>> + * When the sample ratio is set to 1, the packets will be 100% mirrored.
> > >>>>> + * An additional action list can be supplied for sampled or mirrored packets.
> > >>>>> + */
> > >>>>> +struct rte_flow_action_sample {
> > >>>>> +	/* packets sampled equals to '1/ratio' */
> > >>>>> +	const uint32_t ratio;
> > >>>>> +	/* sub-action list specific for the sampling hit cases */
> > >>>>> +	const struct rte_flow_action *actions;
> > >>>>
> > >>>> This design idea does not look good to me from the very beginning. IMHO
> > >>>> it does not fit flow API overall design.
> > >>>> I mean sub-action list.
> > >>>>
> > >>>> As I understand, Linux iptables solves it on match level (i.e. in pattern),
> > >>>> e.g. the "limit" extension, which is basically sampling. Sampling using a meta
> > >>>> pattern item in combination with the PASSTHRU action (to make sampling actions
> > >>>> non-terminating if required) is a better solution from a design point of view.
> > >>>
> > >>> In our design, there are a sample flow path and a normal flow path; each
> > >>> path can have different actions.
> > >>> The defined sub-action list is only applied to sampled packets in the
> > >>> sample flow path;
> > >>> for the normal path, all packets will continue with the original actions.
> > >>>
> > >>
> > >> In my too.
> > >
> > > First, as far as I know TC works close to the suggested approach (that by
> > > itself doesn’t mean anything).
> > > The concept of a PASSTHRU is a good one, but it has some issues to consider:
> > > 1. When using PASSTHRU, the matching part will need to be checked more
> > > times, which carries a performance penalty. Also, a number of HW have a
> > > limited number of flows that can be offloaded, so this approach will
> > > waste resources.
> >
> > Marking or tagging could be used to address it. E.g. target
> > traffic could be tagged first, then matching by tag should be
> > used to sample and to do HW offloads.
> >
> In this case you are forcing at least two steps; this will hurt performance.
> Matching on mark is the same as matching on other items, while adding the
> extra mark action may carry an extra penalty. (This is a general HW issue; I
> assume a number of manufacturers have the same limitation.)
> 
> > Moreover, mapping of rte_flow API rules into HW rule is not
> > required to be 1-to-1. Yes, 1-to-1 is simple, but it could be
> > more complicated 1-to-N (when one rte_flow API rule is
> > represented by many HW rules) or N-to-1 (when few flow API
> > rules are represented as one HW rule) or even N-to-M.
> > For example, tagging which is not visible outside, could be
> > purely SW and used to build such constructions in SW.
> > It is an implementation detail that is out of scope of the generic
> > API definition.
> >
> I don’t understand this part. I never said that 1-to-1 is needed,
> but if you try to combine flows in SW it means that you must keep all flows
> in the PMD in order to combine them. I know some HW must do that, but not
> all of them, and if they don’t it is just a huge memory waste.
> 
> > Yes, it sounds like over-complicating, but I really dislike
> > above sub-action list from design point of view and that's
> > why I'm trying to think in different directions.
> >
> Thinking in different directions is always good.
> I just think that between the two approaches I like the original better.
> Maybe you can explain the reasons for your opinion and we can find the best
> solution together?
> 
> 
> 
> > > 2. Using PASSTHRU will force the order of flows (sure, it can be done using
> > > priorities, but it is more complex for the application to implement).
> >
> > See above.
> >
> 
> See above 😊
> 
> > > 3. PASSTHRU will mean that there will be 2 terminal actions for each flow
> > > (for example queue index 2 / passthru);
> > > this also is not native to RTE flow.
> >
> > Sorry, but there are two branches for terminating actions in
> > the sampling action design anyway (yes, internal/hidden).
> > You need two copies of the packet, so whatever you do it
> > will be two terminating actions.
> >
> Yes but you can look at it as 1 flow with 2 sets of actions and not two flows.
> 
> > > 4. We want to select only part of the packets, apply some of the actions
> > > to both packets (the sampled and the standard one), and then apply some
> > > specific actions to the sampled packet and different actions to the
> > > standard packet.
> >
> > Yes, it is not a problem with PASSTHRU.
> >
> > > Let's check the following use case:
> > > The application is using full offload of traffic from the wire to a VM, which
> > > should be decapped.
> > > So the basic flow is:
> > > Flow create 0  transfer ingress pattern eth / outer.ip =x / end  actions decap / port id 3
> > > After the offload the application loses visibility of the traffic, but it
> > > still wants to sample some of the traffic in order to verify that the
> > > traffic is valid. So the application requests to receive some of the
> > > original traffic and mark it with an id.
> > >
> > > If we use the original approach (the one in the patch) we will need
> > > something like this:
> > > Flow 1: flow create 0 transfer ingress pattern eth / outer.ip=x / end actions sample(ratio 2,  actions mark id 3 / port pf) / decap / port 3
> > >
> > > In the PASSTHRU concept (I'm not sure I can even create such flows):
> > > Flow 1: flow create 0 transfer ingress pattern eth / outer.ip =x / end  actions decap / port 2  / passthru // original request
> > > Flow 2: flow create 0 transfer ingress pattern eth / outer.ip=x / should sample (new item that selects the packet based on the ratio) / end actions mark / port pf
> > >
> > > The main issue with this case is that the decap is before the sample, so the
> > > sample will get the decapped packet.
> >
> > Order should be simply different: first sampling with pass-
> > through, second decap and deliver to VM.
> >
> Then in your case the packet will be marked. Which is not what the
> application requested.
> 
> > Yes, I realize that two actions with basically the same match
> > (modulo sampling match) is not ideal for mapping to HW (even
> > if it collapsed into trivial tag match which pre-rule to make
> > tagging). I'm not 100% happy with it, but I'm even less happy
> > with sub-action list design and just trying to find better
> > alternative solution.
> >
> Please see above, maybe we can find the best solution together.
> 
> > > So when looking at everything I think the original API is the best approach.
> > > For the record, I think that the passthru action is very important and should
> > > be supported, but it is not the best fit for this feature.
> > > Thanks,
> > > Ori
> > >
> Best,
> Ori

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH 3/8] common/mlx5: query sampler object capability via DevX
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 3/8] common/mlx5: query sampler object capability via DevX Jiawei Wang
@ 2020-06-30 17:38   ` Ori Kam
  0 siblings, 0 replies; 129+ messages in thread
From: Ori Kam @ 2020-06-30 17:38 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Jiawei(Jonny) Wang



> -----Original Message-----
> From: Jiawei Wang <jiaweiw@mellanox.com>
> Sent: Thursday, June 25, 2020 7:26 PM
> Jiawei(Jonny) Wang <jiaweiw@mellanox.com>
> Subject: [PATCH 3/8] common/mlx5: query sampler object capability via DevX
> 
> Update function mlx5_devx_cmd_query_hca_attr() to add the NIC Flow
> Table attributes query, then get the log_max_flow_sampler_num from
> flow table properties.
> 
> Add the related structs definition in mlx5_prm.h.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> ---

Acked-by: Ori Kam <orika@mellanox.com>
Thanks,
Ori

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH 4/8] net/mlx5: add the validate sample action
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 4/8] net/mlx5: add the validate sample action Jiawei Wang
@ 2020-06-30 17:59   ` Ori Kam
  2020-07-01 13:55     ` Jiawei(Jonny) Wang
  0 siblings, 1 reply; 129+ messages in thread
From: Ori Kam @ 2020-06-30 17:59 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Jiawei(Jonny) Wang

Hi Jiawei,

PSB.

Best,
Ori

> -----Original Message-----
> From: Jiawei Wang <jiaweiw@mellanox.com>
> Sent: Thursday, June 25, 2020 7:26 PM
> Subject: [PATCH 4/8] net/mlx5: add the validate sample action
> 
> Add sample action validate function.
> 
> Sample flows are supported in the NIC-RX and FDB domains; they must include
> a dest TIR action in NIC_RX or a DEFAULT_MISS action in FDB.

What is the DEFAULT_MISS action?
I think from reading the code that you mean that no action is allowed and the
packet always goes to the e-switch manager / PF, am I correct?

> 
> Only NIC_RX supports additional optional actions.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> ---
>  drivers/net/mlx5/linux/mlx5_os.c |  14 +++++
>  drivers/net/mlx5/mlx5.h          |   1 +
>  drivers/net/mlx5/mlx5_flow.h     |   1 +
>  drivers/net/mlx5/mlx5_flow_dv.c  | 130 +++++++++++++++++++++++++++++++++++++++
>  4 files changed, 146 insertions(+)
> 
> diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
> index f0147e6..5c057d3 100644
> --- a/drivers/net/mlx5/linux/mlx5_os.c
> +++ b/drivers/net/mlx5/linux/mlx5_os.c
> @@ -878,6 +878,20 @@
>  			}
>  		}
>  #endif
> +#if defined(HAVE_MLX5DV_DR) && defined(HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE)
> +		if (config.hca_attr.log_max_ft_sampler_num > 0  &&
> +		    config.dv_flow_en) {
> +			priv->sampler_en = 1;
> +			DRV_LOG(DEBUG, "The Sampler enabled!\n");
> +		} else {
> +			priv->sampler_en = 0;
> +			if (!config.hca_attr.log_max_ft_sampler_num)
> +				DRV_LOG(WARNING, "No available register for"
> +						" Sampler.");
> +			else
> +				DRV_LOG(DEBUG, "DV flow is not supported!\n");
> +		}
> +#endif
>  	}
>  	if (config.mprq.enabled && mprq) {
>  		if (config.mprq.stride_num_n &&
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index 8a09ebc..c2a875c 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -607,6 +607,7 @@ struct mlx5_priv {
>  	unsigned int counter_fallback:1; /* Use counter fallback management. */
>  	unsigned int mtr_en:1; /* Whether support meter. */
>  	unsigned int mtr_reg_share:1; /* Whether support meter REG_C share. */
> +	unsigned int sampler_en:1; /* Whether support sampler. */
>  	uint16_t domain_id; /* Switch domain identifier. */
>  	uint16_t vport_id; /* Associated VF vport index (if any). */
>  	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index 2c96677..902380b 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -200,6 +200,7 @@ enum mlx5_feature_name {
>  #define MLX5_FLOW_ACTION_SET_IPV4_DSCP (1ull << 32)
>  #define MLX5_FLOW_ACTION_SET_IPV6_DSCP (1ull << 33)
>  #define MLX5_FLOW_ACTION_AGE (1ull << 34)
> +#define MLX5_FLOW_ACTION_SAMPLE (1ull << 35)
> 
>  #define MLX5_FLOW_FATE_ACTIONS \
>  	(MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE | \
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> index f174009..710c0f3 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -3925,6 +3925,127 @@ struct field_modify_info modify_tcp[] = {
>  }
> 
>  /**
> + * Validate the sample action.
> + *
> + * @param[in] action_flags
> + *   Holds the actions detected until now.
> + * @param[in] action
> + *   Pointer to the sample action.
> + * @param[in] dev
> + *   Pointer to the Ethernet device structure.
> + * @param[in] attr
> + *   Attributes of flow that includes this action.
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +flow_dv_validate_action_sample(uint64_t action_flags,
> +			      const struct rte_flow_action *action,
> +			      struct rte_eth_dev *dev,
> +			      const struct rte_flow_attr *attr,
> +			      struct rte_flow_error *error)
> +{
> +	struct mlx5_priv *priv = dev->data->dev_private;
> +	struct mlx5_dev_config *dev_conf = &priv->config;
> +	const struct rte_flow_action_sample *sample = action->conf;
> +	const struct rte_flow_action *act = sample->actions;
> +	uint64_t sub_action_flags = 0;
> +	int actions_n = 0;
> +	int ret;
> +
> +	if (!attr->group)
> +		return rte_flow_error_set(error, ENOTSUP,
> +
> RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
> +					  NULL, "root table is not supported");
> +	if (!priv->config.devx || !priv->sampler_en)
> +		return rte_flow_error_set(error, ENOTSUP,
> +
> RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +					  NULL,
> +					  "sample action not supported");
> +	if (!(action->conf))
> +		return rte_flow_error_set(error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ACTION,
> action,
> +					  "configuration cannot be null");
> +	if (sample->ratio == 0)
> +		return rte_flow_error_set(error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ACTION,
> action,
> +					  "ratio value start from 1");
> +	if (action_flags & MLX5_FLOW_ACTION_SAMPLE)
> +		return rte_flow_error_set(error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> +					  "Duplicate sample actions set");
> +	if (action_flags & MLX5_FLOW_ACTION_METER)
> +		return rte_flow_error_set(error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ACTION,
> action,
> +					  "wrong action order, meter should "
> +					  "be after sample action");
> +	for (; act->type != RTE_FLOW_ACTION_TYPE_END; act++) {
> +		if (actions_n == MLX5_DV_MAX_NUMBER_OF_ACTIONS)
> +			return rte_flow_error_set(error, ENOTSUP,
> +
> RTE_FLOW_ERROR_TYPE_ACTION,
> +						  act, "too many actions");
> +		switch (act->type) {
> +		case RTE_FLOW_ACTION_TYPE_QUEUE:
> +			ret = mlx5_flow_validate_action_queue(act,
> +							      sub_action_flags,
> +							      dev,
> +							      attr, error);
> +			if (ret < 0)
> +				return ret;
> +			sub_action_flags |= MLX5_FLOW_ACTION_QUEUE;
> +			break;
> +		case RTE_FLOW_ACTION_TYPE_MARK:
> +			ret = flow_dv_validate_action_mark(dev, act,
> +							   sub_action_flags,
> +							   attr, error);
> +			if (ret < 0)
> +				return ret;
> +			if (dev_conf->dv_xmeta_en !=
> MLX5_XMETA_MODE_LEGACY)
> +				sub_action_flags |=
> MLX5_FLOW_ACTION_MARK |
> +
> 	MLX5_FLOW_ACTION_MARK_EXT;
> +			else
> +				sub_action_flags |=
> MLX5_FLOW_ACTION_MARK;
> +			break;
> +		case RTE_FLOW_ACTION_TYPE_COUNT:
> +			ret = flow_dv_validate_action_count(dev, error);
> +			if (ret < 0)
> +				return ret;
> +			sub_action_flags |= MLX5_FLOW_ACTION_COUNT;
> +			break;
> +		default:
> +			return rte_flow_error_set(error, ENOTSUP,
> +
> RTE_FLOW_ERROR_TYPE_ACTION,
> +						  NULL,
> +						  "Doesn't support optional "
> +						  "action");
> +		}
> +	}
> +	if (attr->ingress && !attr->transfer) {
> +		if (!(sub_action_flags & MLX5_FLOW_ACTION_QUEUE))
> +			return rte_flow_error_set(error, EINVAL,
> +
> RTE_FLOW_ERROR_TYPE_ACTION,
> +						  NULL,
> +						  "Ingress must has a dest "
> +						  "QUEUE for Sample");
> +	} else if (attr->egress && !attr->transfer) {
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ACTION,
> +					  NULL,
> +					  "Sample Only support Ingress "
> +					  "or E-Switch");
> +	} else if (sample->actions->type != RTE_FLOW_ACTION_TYPE_END) {
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> +					  "E-Switch doesn't support any "
> +					  "optinal action for sampling");
> +	}
> +	return 0;
> +}
> +
> +/**
>   * Find existing modify-header resource or create and register a new one.
>   *
>   * @param dev[in, out]
> @@ -5539,6 +5660,15 @@ struct field_modify_info modify_tcp[] = {
>  			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
>  			rw_act_num += MLX5_ACT_NUM_SET_DSCP;
>  			break;
> +		case RTE_FLOW_ACTION_TYPE_SAMPLE:
> +			ret = flow_dv_validate_action_sample(action_flags,
> +							     actions, dev,
> +							     attr, error);
> +			if (ret < 0)
> +				return ret;
> +			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
> +			++actions_n;
> +			break;
>  		default:
>  			return rte_flow_error_set(error, ENOTSUP,
> 
> RTE_FLOW_ERROR_TYPE_ACTION,
> --
> 1.8.3.1



* Re: [dpdk-dev] [PATCH 5/8] net/mlx5: split sample flow into two sub flows
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 5/8] net/mlx5: split sample flow into two sub flows Jiawei Wang
@ 2020-06-30 18:18   ` Ori Kam
  0 siblings, 0 replies; 129+ messages in thread
From: Ori Kam @ 2020-06-30 18:18 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Jiawei(Jonny) Wang

Hi Jiawei,

Please address the small comment below and resend with my ack:
Acked-by: Ori Kam <orika@mellanox.com>

Best,
Ori

> -----Original Message-----
> From: Jiawei Wang <jiaweiw@mellanox.com>
> Sent: Thursday, June 25, 2020 7:26 PM
> To: Ori Kam <orika@mellanox.com>; Slava Ovsiienko
> <viacheslavo@mellanox.com>; Matan Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com; fbl@redhat.com;
> Jiawei(Jonny) Wang <jiaweiw@mellanox.com>
> Subject: [PATCH 5/8] net/mlx5: split sample flow into two sub flows
> 
> Add the sampler action resource structs definition.
> 
> The flow with sample action will be splited into two sub flows,
> the prefix flow with sample action, the suffix flow with the left
> actions.
> 
> For the prefix flow, add the extra the tag action with unique id
> to metadata register, and suffix flow will add the extra tag item
> to match that unique id.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5.c      |  11 ++
>  drivers/net/mlx5/mlx5.h      |   3 +
>  drivers/net/mlx5/mlx5_flow.c | 254
> ++++++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx5/mlx5_flow.h |  37 +++++++
>  4 files changed, 301 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index ddbe29d..4a52462 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -238,6 +238,17 @@ static LIST_HEAD(, mlx5_dev_ctx_shared)
> mlx5_dev_ctx_list =
>  		.free = rte_free,
>  		.type = "mlx5_jump_ipool",
>  	},
> +	{
> +		.size = sizeof(struct mlx5_flow_dv_sample_resource),
> +		.trunk_size = 64,
> +		.grow_trunk = 3,
> +		.grow_shift = 2,
> +		.need_lock = 0,
> +		.release_mem_en = 1,
> +		.malloc = rte_malloc_socket,
> +		.free = rte_free,
> +		.type = "mlx5_sample_ipool",
> +	},
>  #endif
>  	{
>  		.size = sizeof(struct mlx5_flow_meter),
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index c2a875c..7394753 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -51,6 +51,7 @@ enum mlx5_ipool_index {
>  	MLX5_IPOOL_TAG, /* Pool for tag resource. */
>  	MLX5_IPOOL_PORT_ID, /* Pool for port id resource. */
>  	MLX5_IPOOL_JUMP, /* Pool for jump resource. */
> +	MLX5_IPOOL_SAMPLE, /* Pool for sample resource. */
>  #endif
>  	MLX5_IPOOL_MTR, /* Pool for meter resource. */
>  	MLX5_IPOOL_MCP, /* Pool for metadata resource. */
> @@ -510,6 +511,7 @@ struct mlx5_flow_tbl_resource {
>  /* Tables for metering splits should be added here. */
>  #define MLX5_MAX_TABLES_EXTERNAL (MLX5_MAX_TABLES - 3)
>  #define MLX5_MAX_TABLES_FDB UINT16_MAX
> +#define MLX5_FLOW_TABLE_FACTOR 10
> 
>  /* ID generation structure. */
>  struct mlx5_flow_id_pool {
> @@ -558,6 +560,7 @@ struct mlx5_dev_ctx_shared {
>  	struct mlx5_hlist *tag_table;
>  	uint32_t port_id_action_list; /* List of port ID actions. */
>  	uint32_t push_vlan_action_list; /* List of push VLAN actions. */
> +	uint32_t sample_action_list; /* List of sample actions. */
>  	struct mlx5_flow_counter_mng cmng; /* Counters management
> structure. */
>  	struct mlx5_indexed_pool *ipool[MLX5_IPOOL_MAX];
>  	/* Memory Pool for mlx5 flow resources. */
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 3a48b89..7c65a9a 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -360,6 +360,8 @@ struct mlx5_flow_tunnel_info {
>  		return REG_B;
>  	case MLX5_HAIRPIN_TX:
>  		return REG_A;
> +	case MLX5_SAMPLE_FDB:
> +		return REG_C_0;
>  	case MLX5_METADATA_RX:
>  		switch (config->dv_xmeta_en) {
>  		case MLX5_XMETA_MODE_LEGACY:
> @@ -3878,6 +3880,137 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  	return 0;
>  }
> 
> +
> +/**
> + * Check the match action from the action list.
> + *
> + * @param[in] actions
> + *   Pointer to the list of actions.
> + * @param[in] action
> + *   The action to be check if exist.
> + *
> + * @return
> + *   > 0 the total number of actions.
> + *   0 if not found match action in action list.
> + */
> +static int
> +flow_check_match_action(const struct rte_flow_action actions[],
> +					enum rte_flow_action_type action)
> +{
> +	int actions_n = 0;
> +	int flag = 0;
> +
> +	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
> +		if (actions->type == action)
> +			flag = 1;
> +		actions_n++;
> +	}
> +	/* Count RTE_FLOW_ACTION_TYPE_END. */
> +	return flag ? actions_n + 1 : 0;
> +}
> +
> +/**
> + * Split the sample flow.
> + *
> + * As sample flow will split to two sub flow, sample flow with
> + * sample action, the other actions will move to new suffix flow.
> + *
> + * Also add unique tag id with tag action in the sample flow,
> + * the same tag id will be as match in the suffix flow.
> + *
> + * @param dev
> + *   Pointer to Ethernet device.
> + * @param[in] attr
> + *   Flow rule attributes.
> + * @param[out] sfx_items
> + *   Suffix flow match items (list terminated by the END pattern item).
> + * @param[in] actions
> + *   Associated actions (list terminated by the END action).
> + * @param[out] actions_sfx
> + *   Suffix flow actions.
> + * @param[out] actions_pre
> + *   Prefix flow actions.
> + *
> + * @return
> + *   0 on success.


It looks like the function also returns the tag id.
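
As a minimal sketch of one way to align the comment with the code (all
demo_* names are hypothetical stand-ins, not driver functions): the caller
in this patch treats a zero tag_id as failure, so the @return text and the
return type could say exactly that.

```c
#include <stdint.h>

/* Hypothetical ID allocator standing in for flow_qrss_get_id().
 * Assumption: it hands out non-zero IDs and returns 0 when exhausted. */
static uint32_t demo_next_id = 1;

uint32_t
demo_alloc_tag_id(void)
{
	if (demo_next_id == 0)
		return 0; /* Counter wrapped: no IDs left. */
	return demo_next_id++;
}

/**
 * Prepare the split and report the tag ID that links prefix and suffix.
 *
 * @param[out] set_tag_data
 *   Receives the tag value written by the prefix flow.
 *
 * @return
 *   The unique tag ID (non-zero) on success, 0 on failure.
 */
uint32_t
demo_sample_split_prep(uint32_t *set_tag_data)
{
	uint32_t tag_id = demo_alloc_tag_id();

	if (!tag_id)
		return 0;
	*set_tag_data = tag_id;
	return tag_id;
}
```

Returning uint32_t rather than int also avoids a sign conversion when the
caller stores the result in its own uint32_t tag_id.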

> + */
> +static int
> +flow_sample_split_prep(struct rte_eth_dev *dev,
> +		 const struct rte_flow_attr *attr,
> +		 struct rte_flow_item sfx_items[],
> +		 const struct rte_flow_action actions[],
> +		 struct rte_flow_action actions_sfx[],
> +		 struct rte_flow_action actions_pre[])
> +{
> +	struct mlx5_rte_flow_action_set_tag *set_tag;
> +	struct mlx5_rte_flow_item_tag *tag_spec;
> +	struct mlx5_rte_flow_item_tag *tag_mask;
> +	struct rte_flow_item *tag_item;
> +	struct rte_flow_action *tag_action = NULL;
> +	bool pre_sample = true;
> +	struct rte_flow_error error;
> +	uint32_t tag_id;
> +
> +	/* Prepare the actions for prefix and suffix flow. */
> +	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
> +		struct rte_flow_action **action_cur = NULL;
> +
> +		switch (actions->type) {
> +		case RTE_FLOW_ACTION_TYPE_SAMPLE:
> +			/* Add the extra tag action first. */
> +			tag_action = actions_pre;
> +			tag_action->type = (enum rte_flow_action_type)
> +
> 	MLX5_RTE_FLOW_ACTION_TYPE_TAG;
> +			actions_pre++;
> +			break;
> +		case RTE_FLOW_ACTION_TYPE_JUMP:
> +		case RTE_FLOW_ACTION_TYPE_METER:
> +			action_cur = &actions_sfx;
> +			break;
> +		default:
> +			break;
> +		}
> +		if (pre_sample && !action_cur)
> +			action_cur = &actions_pre;
> +		else
> +			action_cur = &actions_sfx;
> +		memcpy(*action_cur, actions, sizeof(struct rte_flow_action));
> +		(*action_cur)++;
> +		if (actions->type == RTE_FLOW_ACTION_TYPE_SAMPLE)
> +			pre_sample = false;
> +	}
> +	/* Add end action to the actions. */
> +	actions_sfx->type = RTE_FLOW_ACTION_TYPE_END;
> +	actions_pre->type = RTE_FLOW_ACTION_TYPE_END;
> +	actions_pre++;
> +	/* Set the tag. */
> +	set_tag = (void *)actions_pre;
> +	set_tag->id = mlx5_flow_get_reg_id(dev, attr->transfer ?
> +			MLX5_SAMPLE_FDB : MLX5_APP_TAG, 0, &error);
> +	tag_id = flow_qrss_get_id(dev);
> +	set_tag->data = tag_id;
> +	assert(tag_action);
> +	tag_action->conf = set_tag;
> +	/* Prepare the suffix subflow items. */
> +	if (sfx_items) {
> +		tag_item = sfx_items++;
> +		sfx_items->type = RTE_FLOW_ITEM_TYPE_END;
> +		sfx_items++;
> +		tag_spec = (struct mlx5_rte_flow_item_tag *)sfx_items;
> +		tag_spec->data = tag_id;
> +		tag_spec->id = set_tag->id;
> +		tag_mask = tag_spec + 1;
> +		tag_mask->data = UINT32_MAX;
> +		tag_mask->id = UINT16_MAX;
> +		tag_item->type = (enum rte_flow_item_type)
> +				MLX5_RTE_FLOW_ITEM_TYPE_TAG;
> +		tag_item->spec = tag_spec;
> +		tag_item->last = NULL;
> +		tag_item->mask = tag_mask;
> +	}
> +	return tag_id;
> +}
> +
>  /**
>   * The splitting for metadata feature.
>   *
> @@ -4137,6 +4270,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  static int
>  flow_create_split_meter(struct rte_eth_dev *dev,
>  			   struct rte_flow *flow,
> +			   uint64_t prefix_layers,
>  			   const struct rte_flow_attr *attr,
>  			   const struct rte_flow_item items[],
>  			   const struct rte_flow_action actions[],
> @@ -4183,8 +4317,9 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  			goto exit;
>  		}
>  		/* Add the prefix subflow. */
> -		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
> -					      items, pre_actions, external,
> +		ret = flow_create_split_inner(dev, flow, &dev_flow,
> +					      prefix_layers, attr, items,
> +					      pre_actions, external,
>  					      flow_idx, error);
>  		if (ret) {
>  			ret = -rte_errno;
> @@ -4199,7 +4334,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  	/* Add the prefix subflow. */
>  	ret = flow_create_split_metadata(dev, flow, dev_flow ?
> 
> flow_get_prefix_layer_flags(dev_flow) :
> -					 0, &sfx_attr,
> +					 prefix_layers, &sfx_attr,
>  					 sfx_items ? sfx_items : items,
>  					 sfx_actions ? sfx_actions : actions,
>  					 external, flow_idx, error);
> @@ -4210,6 +4345,117 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  }
> 
>  /**
> + * The splitting for sample feature.
> + *
> + * The sample flow will be split to two flows as prefix and
> + * suffix flow.
> + *
> + * @param dev
> + *   Pointer to Ethernet device.
> + * @param[in] flow
> + *   Parent flow structure pointer.
> + * @param[in] attr
> + *   Flow rule attributes.
> + * @param[in] items
> + *   Pattern specification (list terminated by the END pattern item).
> + * @param[in] actions
> + *   Associated actions (list terminated by the END action).
> + * @param[in] external
> + *   This flow rule is created by request external to PMD.
> + * @param[in] flow_idx
> + *   This memory pool index to the flow.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + * @return
> + *   0 on success, negative value otherwise
> + */
> +static int
> +flow_create_split_sample(struct rte_eth_dev *dev,
> +			   struct rte_flow *flow,
> +			   const struct rte_flow_attr *attr,
> +			   const struct rte_flow_item items[],
> +			   const struct rte_flow_action actions[],
> +			   bool external, uint32_t flow_idx,
> +			   struct rte_flow_error *error)
> +{
> +	struct mlx5_priv *priv = dev->data->dev_private;
> +	struct rte_flow_action *sfx_actions = NULL;
> +	struct rte_flow_action *pre_actions = NULL;
> +	struct rte_flow_item *sfx_items = NULL;
> +	struct mlx5_flow *dev_flow = NULL;
> +	struct rte_flow_attr sfx_attr = *attr;
> +	struct mlx5_flow_dv_sample_resource *sample_res;
> +	struct mlx5_flow_tbl_data_entry *sfx_tbl_data;
> +	struct mlx5_flow_tbl_resource *sfx_tbl;
> +	union mlx5_flow_tbl_key sfx_table_key;
> +	size_t act_size;
> +	size_t item_size;
> +	uint32_t tag_id = 0;
> +	int actions_n = 0;
> +	int ret = 0;
> +
> +	if (priv->sampler_en)
> +		actions_n = flow_check_match_action(actions,
> +					RTE_FLOW_ACTION_TYPE_SAMPLE);
> +	if (actions_n) {
> +		/* The prefix actions must includes sample, tag, end. */
> +		act_size = sizeof(struct rte_flow_action) * (actions_n * 2) +
> +			   sizeof(struct mlx5_rte_flow_action_set_tag);
> +		/* tag, end. */
> +#define SAMPLE_SUFFIX_ITEM 2
> +		item_size = sizeof(struct rte_flow_item) *
> SAMPLE_SUFFIX_ITEM +
> +			    sizeof(struct mlx5_rte_flow_item_tag) * 2;
> +		sfx_actions = rte_zmalloc(__func__, (act_size + item_size), 0);
> +		if (!sfx_actions)
> +			return rte_flow_error_set(error, ENOMEM,
> +
> RTE_FLOW_ERROR_TYPE_ACTION,
> +						  NULL, "no memory to split "
> +						  "sample flow");
> +		if (!attr->transfer)
> +			sfx_items = (struct rte_flow_item *)((char
> *)sfx_actions
> +					+ act_size);
> +		pre_actions = sfx_actions + actions_n;
> +		tag_id = flow_sample_split_prep(dev, attr, sfx_items,
> +						   actions, sfx_actions,
> +						   pre_actions);
> +		if (!tag_id) {
> +			ret = -rte_errno;
> +			goto exit;
> +		}
> +		/* Add the prefix subflow. */
> +		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
> +					      items, pre_actions, external,
> +					      flow_idx, error);
> +		if (ret) {
> +			ret = -rte_errno;
> +			goto exit;
> +		}
> +		dev_flow->handle->split_flow_id = tag_id;
> +		/* Set the sfx group attr. */
> +		sample_res = (struct mlx5_flow_dv_sample_resource *)
> +					dev_flow->dv.sample_res;
> +		sfx_tbl = (struct mlx5_flow_tbl_resource *)
> +					sample_res->normal_path_tbl;
> +		sfx_tbl_data = container_of(sfx_tbl,
> +					struct mlx5_flow_tbl_data_entry, tbl);
> +		sfx_table_key.v64 = sfx_tbl_data->entry.key;
> +		sfx_attr.group = sfx_attr.transfer ?
> +					(sfx_table_key.table_id - 1) :
> +					sfx_table_key.table_id;
> +	}
> +	/* Add the suffix subflow. */
> +	ret = flow_create_split_meter(dev, flow, dev_flow ?
> +				 flow_get_prefix_layer_flags(dev_flow) : 0,
> +				 &sfx_attr, sfx_items ? sfx_items : items,
> +				 sfx_actions ? sfx_actions : actions,
> +				 external, flow_idx, error);
> +exit:
> +	if (sfx_actions)
> +		rte_free(sfx_actions);
> +	return ret;
> +}
> +
> +/**
>   * Split the flow to subflow set. The splitters might be linked
>   * in the chain, like this:
>   * flow_create_split_outer() calls:
> @@ -4257,7 +4503,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  {
>  	int ret;
> 
> -	ret = flow_create_split_meter(dev, flow, attr, items,
> +	ret = flow_create_split_sample(dev, flow, attr, items,
>  					 actions, external, flow_idx, error);
>  	MLX5_ASSERT(ret <= 0);
>  	return ret;
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index 902380b..941de5f 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -79,6 +79,7 @@ enum mlx5_feature_name {
>  	MLX5_COPY_MARK,
>  	MLX5_MTR_COLOR,
>  	MLX5_MTR_SFX,
> +	MLX5_SAMPLE_FDB,
>  };
> 
>  /* Pattern outer Layer bits. */
> @@ -498,6 +499,38 @@ struct mlx5_flow_tbl_data_entry {
>  	uint32_t idx; /**< index for the indexed mempool. */
>  };
> 
> +/* Sub rdma-core actions list. */
> +struct mlx5_flow_sub_actions_list {
> +	uint32_t actions_num; /**< Number of sample actions. */
> +	uint64_t action_flags;
> +	void *dr_queue_action;
> +	void *dr_tag_action;
> +	void *dr_cnt_action;
> +};
> +
> +/* Sample sub-actions resource list. */
> +struct mlx5_flow_sub_actions_idx {
> +	uint32_t rix_hrxq; /**< Hash Rx queue object index. */
> +	uint32_t rix_tag; /**< Index to the tag action. */
> +	uint32_t cnt;
> +};
> +
> +/* Sample action resource structure. */
> +struct mlx5_flow_dv_sample_resource {
> +	ILIST_ENTRY(uint32_t)next; /**< Pointer to next element. */
> +	rte_atomic32_t refcnt; /**< Reference counter. */
> +	void *verbs_action; /**< Verbs sample action object. */
> +	uint8_t ft_type; /** Flow Table Type */
> +	uint32_t ft_id; /** Flow Table Level */
> +	void *normal_path_tbl; /** Flow Table pointer */
> +	void *default_miss; /** default_miss dr_action. */
> +	uint32_t ratio;   /** Sample Ratio */
> +	struct mlx5_flow_sub_actions_idx sample_idx;
> +	/**< Action index resources. */
> +	struct mlx5_flow_sub_actions_list sample_act;
> +	/**< Action resources. */
> +};
> +
>  /* Verbs specification header. */
>  struct ibv_spec_header {
>  	enum ibv_flow_spec_type type;
> @@ -526,6 +559,8 @@ struct mlx5_flow_handle_dv {
>  	/**< Index to push VLAN action resource in cache. */
>  	uint32_t rix_tag;
>  	/**< Index to the tag action. */
> +	uint32_t rix_sample;
> +	/**< Index to sample action resource in cache. */
>  } __rte_packed;
> 
>  /** Device flow handle structure: used both for creating & destroying. */
> @@ -589,6 +624,8 @@ struct mlx5_flow_dv_workspace {
>  	/**< Pointer to the jump action resource. */
>  	struct mlx5_flow_dv_match_params value;
>  	/**< Holds the value that the packet is compared to. */
> +	struct mlx5_flow_dv_sample_resource *sample_res;
> +	/**< Pointer to the sample action resource. */
>  };
> 
>  /*
> --
> 1.8.3.1



* Re: [dpdk-dev] [PATCH 6/8] net/mlx5: update translate function for sample action
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 6/8] net/mlx5: update translate function for sample action Jiawei Wang
@ 2020-06-30 19:54   ` Ori Kam
  2020-07-01 15:06     ` Jiawei(Jonny) Wang
  0 siblings, 1 reply; 129+ messages in thread
From: Ori Kam @ 2020-06-30 19:54 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Jiawei(Jonny) Wang

Hi Jiawei,
PSB,

Thanks,
Ori

> -----Original Message-----
> From: Jiawei Wang <jiaweiw@mellanox.com>
> Sent: Thursday, June 25, 2020 7:26 PM
> Subject: [PATCH 6/8] net/mlx5: update translate function for sample action
> 
> Translate the attribute of sample action that include sample ratio
> and sub actions list, then create the sample DR action.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c    |  16 +-
>  drivers/net/mlx5/mlx5_flow.h    |  14 +-
>  drivers/net/mlx5/mlx5_flow_dv.c | 502
> +++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 511 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 7c65a9a..73ef290 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -4569,10 +4569,14 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  	int hairpin_flow;
>  	uint32_t hairpin_id = 0;
>  	struct rte_flow_attr attr_tx = { .priority = 0 };
> +	struct rte_flow_attr attr_factor = {0};
>  	int ret;
> 
> -	hairpin_flow = flow_check_hairpin_split(dev, attr, actions);
> -	ret = flow_drv_validate(dev, attr, items, p_actions_rx,
> +	memcpy((void *)&attr_factor, (const void *)attr, sizeof(*attr));
> +	if (external)
> +		attr_factor.group *= MLX5_FLOW_TABLE_FACTOR;
> +	hairpin_flow = flow_check_hairpin_split(dev, &attr_factor, actions);
> +	ret = flow_drv_validate(dev, &attr_factor, items, p_actions_rx,
>  				external, hairpin_flow, error);
>  	if (ret < 0)
>  		return 0;
> @@ -4591,7 +4595,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  		rte_errno = ENOMEM;
>  		goto error_before_flow;
>  	}
> -	flow->drv_type = flow_get_drv_type(dev, attr);
> +	flow->drv_type = flow_get_drv_type(dev, &attr_factor);
>  	if (hairpin_id != 0)
>  		flow->hairpin_flow_id = hairpin_id;
>  	MLX5_ASSERT(flow->drv_type > MLX5_FLOW_TYPE_MIN &&
> @@ -4637,7 +4641,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  		 * depending on configuration. In the simplest
>  		 * case it just creates unmodified original flow.
>  		 */
> -		ret = flow_create_split_outer(dev, flow, attr,
> +		ret = flow_create_split_outer(dev, flow, &attr_factor,
>  					      buf->entry[i].pattern,
>  					      p_actions_rx, external, idx,
>  					      error);
> @@ -4674,8 +4678,8 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  	 * the egress Flows belong to the different device and
>  	 * copy table should be updated in peer NIC Rx domain.
>  	 */
> -	if (attr->ingress &&
> -	    (external || attr->group != MLX5_FLOW_MREG_CP_TABLE_GROUP))
> {
> +	if (attr_factor.ingress &&
> +	    (external || attr_factor.group !=
> MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
>  		ret = flow_mreg_update_copy_table(dev, flow, actions, error);
>  		if (ret)
>  			goto error;
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index 941de5f..4163183 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -369,6 +369,13 @@ enum mlx5_flow_fate_type {
>  	MLX5_FLOW_FATE_MAX,
>  };
> 
> +/*
> + * Max number of actions per DV flow.
> + * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
> + * in rdma-core file providers/mlx5/verbs.c.
> + */
> +#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
> +
>  /* Matcher PRM representation */
>  struct mlx5_flow_dv_match_params {
>  	size_t size;
> @@ -599,13 +606,6 @@ struct mlx5_flow_handle {
>  #define MLX5_FLOW_HANDLE_VERBS_SIZE (sizeof(struct mlx5_flow_handle))
>  #endif
> 
> -/*
> - * Max number of actions per DV flow.
> - * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
> - * in rdma-core file providers/mlx5/verbs.c.
> - */
> -#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
> -
>  /** Device flow structure only for DV flow creation. */
>  struct mlx5_flow_dv_workspace {
>  	uint32_t group; /**< The group index. */
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c
> b/drivers/net/mlx5/mlx5_flow_dv.c
> index 710c0f3..62a4a3b 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -79,6 +79,10 @@
>  flow_dv_tbl_resource_release(struct rte_eth_dev *dev,
>  			     struct mlx5_flow_tbl_resource *tbl);
> 
> +static int
> +flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
> +				     uint32_t encap_decap_idx);
> +
>  /**
>   * Initialize flow attributes structure according to flow items' types.
>   *
> @@ -7897,6 +7901,385 @@ struct field_modify_info modify_tcp[] = {
>  }
> 
>  /**
> + * Create an Rx Hash queue.
> + *
> + * @param dev
> + *   Pointer to Ethernet device.
> + * @param[in] dev_flow
> + *   Pointer to the mlx5_flow.
> + * @param[in] rss_desc
> + *   Pointer to the mlx5_flow_rss_desc.
> + * @param[in, out] hrxq_idx

I think this is only used as an output parameter, so it should be @param[out].

> + *   Hash Rx queue index.
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   The Verbs/DevX object initialised, NULL otherwise and rte_errno is set.
> + */
> +static struct mlx5_hrxq *
> +flow_dv_handle_rx_queue(struct rte_eth_dev *dev,
> +			  struct mlx5_flow *dev_flow,
> +			  struct mlx5_flow_rss_desc *rss_desc,
> +			  uint32_t *hrxq_idx,
> +			  struct rte_flow_error *error)
> +{
> +	struct mlx5_priv *priv = dev->data->dev_private;
> +	struct mlx5_flow_handle *dh = dev_flow->handle;
> +	struct mlx5_hrxq *hrxq;
> +
> +	MLX5_ASSERT(rss_desc->queue_num);
> +	*hrxq_idx = mlx5_hrxq_get(dev, rss_desc->key,
> +				 MLX5_RSS_HASH_KEY_LEN,
> +				 dev_flow->hash_fields,
> +				 rss_desc->queue,
> +				 rss_desc->queue_num);
> +	if (!*hrxq_idx) {
> +		*hrxq_idx = mlx5_hrxq_new
> +				(dev, rss_desc->key,
> +				MLX5_RSS_HASH_KEY_LEN,
> +				dev_flow->hash_fields,
> +				rss_desc->queue,
> +				rss_desc->queue_num,
> +				!!(dh->layers &
> +				MLX5_FLOW_LAYER_TUNNEL));
> +	}
> +	hrxq = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_HRXQ],
> +			      *hrxq_idx);

Why do you need this line? You can compare the hrxq_idx to check for error.
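
To illustrate what I mean, here is a rough sketch with hypothetical demo_*
helpers (not the real mlx5_hrxq_get()/mlx5_hrxq_new() signatures). It
assumes, as the code above already does, that an index of 0 means "no
object", so the error check needs no extra pool lookup:

```c
#include <stdint.h>

#define DEMO_POOL_SIZE 4

/* Hypothetical hash Rx queue pool; indexes are 1-based, 0 means none. */
struct demo_hrxq {
	int in_use;
	uint32_t queue;
};

static struct demo_hrxq demo_pool[DEMO_POOL_SIZE];

/* Look up an existing object; return its index or 0 if not found. */
uint32_t
demo_hrxq_get(uint32_t queue)
{
	uint32_t i;

	for (i = 0; i < DEMO_POOL_SIZE; i++)
		if (demo_pool[i].in_use && demo_pool[i].queue == queue)
			return i + 1;
	return 0;
}

/* Allocate a new object; return its index or 0 if the pool is full. */
uint32_t
demo_hrxq_new(uint32_t queue)
{
	uint32_t i;

	for (i = 0; i < DEMO_POOL_SIZE; i++) {
		if (!demo_pool[i].in_use) {
			demo_pool[i].in_use = 1;
			demo_pool[i].queue = queue;
			return i + 1;
		}
	}
	return 0;
}

/* Return 0 on success with *hrxq_idx set, -1 on failure (*hrxq_idx is 0). */
int
demo_handle_rx_queue(uint32_t queue, uint32_t *hrxq_idx)
{
	*hrxq_idx = demo_hrxq_get(queue);
	if (!*hrxq_idx)
		*hrxq_idx = demo_hrxq_new(queue);
	/* The index alone tells us whether get/new succeeded. */
	if (!*hrxq_idx)
		return -1;
	return 0;
}
```

If the caller really needs the object pointer, the lookup can still be done
once after the index check has passed.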

> +	if (!hrxq) {
> +		rte_flow_error_set
> +			(error, rte_errno,
> +			 RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
> +			 "cannot get hash queue");
> +		goto error;
> +	}
> +	dh->rix_hrxq = *hrxq_idx;
> +	return hrxq;
> +error:
> +	/* hrxq is union, don't clear it if the flag is not set. */
> +	if (dh->rix_hrxq) {
> +		mlx5_hrxq_release(dev, dh->rix_hrxq);
> +		dh->rix_hrxq = 0;
> +	}
> +	return NULL;
> +}
> +


[snip...]


* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow Jiawei Wang
  2020-06-25 17:55   ` Jerin Jacob
  2020-06-28  8:27   ` Andrew Rybchenko
@ 2020-07-01  9:37   ` Ori Kam
  2 siblings, 0 replies; 129+ messages in thread
From: Ori Kam @ 2020-07-01  9:37 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Jiawei(Jonny) Wang

Hi Jiawei,

Please note that the documentation changes are missing from this patch.

Please add them.

Best,
Ori

> -----Original Message-----
> From: Jiawei Wang <jiaweiw@mellanox.com>
> Sent: Thursday, June 25, 2020 7:26 PM
> To: Ori Kam <orika@mellanox.com>; Slava Ovsiienko
> <viacheslavo@mellanox.com>; Matan Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com; fbl@redhat.com;
> Jiawei(Jonny) Wang <jiaweiw@mellanox.com>
> Subject: [PATCH 1/8] ethdev: introduce sample action for rte flow
> 
> When using full offload, all traffic will be handled by the HW, and
> directed to the requested vf or wire, the control application loses
> visibility on the traffic.
> So there's a need for an action that will enable the control application
> some visibility.
> 
> The solution is introduced a new action that will sample the incoming
> traffic and send a duplicated traffic in some predefined ratio to the
> application, while the original packet will continue to the target
> destination.
> 
> The packets sampled equals is '1/ratio', if the ratio value be set to 1
> , means that the packets would be completely mirrored. The sample packet
> can be assigned with different set of actions from the original packet.
> 
> In order to support the sample packet in rte_flow, new rte_flow action
> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> rte_flow_action_sample
> will be introduced.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> ---
>  lib/librte_ethdev/rte_flow.c |  1 +
>  lib/librte_ethdev/rte_flow.h | 29 +++++++++++++++++++++++++++++
>  2 files changed, 30 insertions(+)
> 
> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index 1685be5..733871d 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -173,6 +173,7 @@ struct rte_flow_desc_data {
>  	MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct
> rte_flow_action_set_dscp)),
>  	MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct
> rte_flow_action_set_dscp)),
>  	MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
> +	MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
>  };
> 
>  int
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index b0e4199..71dd82c 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -2099,6 +2099,13 @@ enum rte_flow_action_type {
>  	 * see enum RTE_ETH_EVENT_FLOW_AGED
>  	 */
>  	RTE_FLOW_ACTION_TYPE_AGE,
> +
> +	/**
> +	 * Redirects specific ratio of packets to vport or queue.
> +	 *
> +	 * See struct rte_flow_action_sample.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_SAMPLE,
>  };
> 
>  /**
> @@ -2709,6 +2716,28 @@ struct rte_flow_action {
>  struct rte_flow;
> 
>  /**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SAMPLE
> + *
> + * Adds a sample action to a matched flow.
> + *
> + * The matching packets will be duplicated to a special queue or vport
> + * in the predefined probabiilty, All the packets continues processing
> + * on the default flow path.
> + *
> + * When the sample ratio is set to 1 then the packets will be 100% mirrored.
> + * Additional action list be supported to add for sampled or mirrored packets.
> + */
> +struct rte_flow_action_sample {
> +	/* packets sampled equals to '1/ratio' */
> +	const uint32_t ratio;
> +	/* sub-action list specific for the sampling hit cases */
> +	const struct rte_flow_action *actions;
> +};
> +
> +/**
>   * Verbose error types.
>   *
>   * Most of them provide the type of the object referenced by struct
> --
> 1.8.3.1



* Re: [dpdk-dev] [PATCH 4/8] net/mlx5: add the validate sample action
  2020-06-30 17:59   ` Ori Kam
@ 2020-07-01 13:55     ` Jiawei(Jonny) Wang
  0 siblings, 0 replies; 129+ messages in thread
From: Jiawei(Jonny) Wang @ 2020-07-01 13:55 UTC (permalink / raw)
  To: Ori Kam, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl



> -----Original Message-----
> From: Ori Kam <orika@mellanox.com>
> Sent: Wednesday, July 1, 2020 2:00 AM
> To: Jiawei(Jonny) Wang <jiaweiw@mellanox.com>; Slava Ovsiienko
> <viacheslavo@mellanox.com>; Matan Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com;
> fbl@redhat.com; Jiawei(Jonny) Wang <jiaweiw@mellanox.com>
> Subject: RE: [PATCH 4/8] net/mlx5: add the validate sample action
> 
> Hi Jiawei,
> 
> PSB.
> 
> Best,
> Ori
> 
> > -----Original Message-----
> > From: Jiawei Wang <jiaweiw@mellanox.com>
> > Sent: Thursday, June 25, 2020 7:26 PM
> > Subject: [PATCH 4/8] net/mlx5: add the validate sample action
> >
> > Add sample action validate function.
> >
> > For Sample flow support NIC-RX and FDB domain, must include an action
> > of a dest TIR in NIC_RX or DEFAULT_MISS in FDB.
> 
> What is the DEFAULT_MISS action?
> I think from reading the code that you mean that no action is allowed and it
> is always goes to e-switch manager / go to PF, am I correct?
> 
Yes, you're right. For FDB, no additional actions are allowed for sampling; the
default action is to go to the e-switch manager port.
DEFAULT_MISS is an rdma-core action that steers the packet to the default miss
of the steering domain; for the FDB domain, that is the e-switch manager port.

I'll update the commit log description.

Thanks.

> >
> > Only NIC_RX support with addition optinal actions.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> > ---
> >  drivers/net/mlx5/linux/mlx5_os.c |  14 +++++
> >  drivers/net/mlx5/mlx5.h          |   1 +
> >  drivers/net/mlx5/mlx5_flow.h     |   1 +
> >  drivers/net/mlx5/mlx5_flow_dv.c  | 130
> > +++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 146 insertions(+)
> >
> > diff --git a/drivers/net/mlx5/linux/mlx5_os.c
> > b/drivers/net/mlx5/linux/mlx5_os.c
> > index f0147e6..5c057d3 100644
> > --- a/drivers/net/mlx5/linux/mlx5_os.c
> > +++ b/drivers/net/mlx5/linux/mlx5_os.c
> > @@ -878,6 +878,20 @@
> >  			}
> >  		}
> >  #endif
> > +#if defined(HAVE_MLX5DV_DR) &&
> > defined(HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE)
> > +		if (config.hca_attr.log_max_ft_sampler_num > 0  &&
> > +		    config.dv_flow_en) {
> > +			priv->sampler_en = 1;
> > +			DRV_LOG(DEBUG, "The Sampler enabled!\n");
> > +		} else {
> > +			priv->sampler_en = 0;
> > +			if (!config.hca_attr.log_max_ft_sampler_num)
> > +				DRV_LOG(WARNING, "No available register
> > for"
> > +						" Sampler.");
> > +			else
> > +				DRV_LOG(DEBUG, "DV flow is not
> > supported!\n");
> > +		}
> > +#endif
> >  	}
> >  	if (config.mprq.enabled && mprq) {
> >  		if (config.mprq.stride_num_n &&
> > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> > 8a09ebc..c2a875c 100644
> > --- a/drivers/net/mlx5/mlx5.h
> > +++ b/drivers/net/mlx5/mlx5.h
> > @@ -607,6 +607,7 @@ struct mlx5_priv {
> >  	unsigned int counter_fallback:1; /* Use counter fallback management.
> > */
> >  	unsigned int mtr_en:1; /* Whether support meter. */
> >  	unsigned int mtr_reg_share:1; /* Whether support meter REG_C
> share.
> > */
> > +	unsigned int sampler_en:1; /* Whether support sampler. */
> >  	uint16_t domain_id; /* Switch domain identifier. */
> >  	uint16_t vport_id; /* Associated VF vport index (if any). */
> >  	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG.
> > */ diff --git a/drivers/net/mlx5/mlx5_flow.h
> > b/drivers/net/mlx5/mlx5_flow.h index 2c96677..902380b 100644
> > --- a/drivers/net/mlx5/mlx5_flow.h
> > +++ b/drivers/net/mlx5/mlx5_flow.h
> > @@ -200,6 +200,7 @@ enum mlx5_feature_name {  #define
> > MLX5_FLOW_ACTION_SET_IPV4_DSCP (1ull << 32)  #define
> > MLX5_FLOW_ACTION_SET_IPV6_DSCP (1ull << 33)  #define
> > MLX5_FLOW_ACTION_AGE (1ull << 34)
> > +#define MLX5_FLOW_ACTION_SAMPLE (1ull << 35)
> >
> >  #define MLX5_FLOW_FATE_ACTIONS \
> >  	(MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE | \ diff
> --git
> > a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> > index f174009..710c0f3 100644
> > --- a/drivers/net/mlx5/mlx5_flow_dv.c
> > +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> > @@ -3925,6 +3925,127 @@ struct field_modify_info modify_tcp[] = {  }
> >
> >  /**
> > + * Validate the sample action.
> > + *
> > + * @param[in] action_flags
> > + *   Holds the actions detected until now.
> > + * @param[in] action
> > + *   Pointer to the sample action.
> > + * @param[in] dev
> > + *   Pointer to the Ethernet device structure.
> > + * @param[in] attr
> > + *   Attributes of flow that includes this action.
> > + * @param[out] error
> > + *   Pointer to error structure.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +static int
> > +flow_dv_validate_action_sample(uint64_t action_flags,
> > +			      const struct rte_flow_action *action,
> > +			      struct rte_eth_dev *dev,
> > +			      const struct rte_flow_attr *attr,
> > +			      struct rte_flow_error *error) {
> > +	struct mlx5_priv *priv = dev->data->dev_private;
> > +	struct mlx5_dev_config *dev_conf = &priv->config;
> > +	const struct rte_flow_action_sample *sample = action->conf;
> > +	const struct rte_flow_action *act = sample->actions;
> > +	uint64_t sub_action_flags = 0;
> > +	int actions_n = 0;
> > +	int ret;
> > +
> > +	if (!attr->group)
> > +		return rte_flow_error_set(error, ENOTSUP,
> > +
> > RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
> > +					  NULL, "root table is not supported");
> > +	if (!priv->config.devx || !priv->sampler_en)
> > +		return rte_flow_error_set(error, ENOTSUP,
> > +
> > RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> > +					  NULL,
> > +					  "sample action not supported");
> > +	if (!(action->conf))
> > +		return rte_flow_error_set(error, EINVAL,
> > +					  RTE_FLOW_ERROR_TYPE_ACTION,
> > action,
> > +					  "configuration cannot be null");
> > +	if (sample->ratio == 0)
> > +		return rte_flow_error_set(error, EINVAL,
> > +					  RTE_FLOW_ERROR_TYPE_ACTION,
> > action,
> > +					  "ratio value start from 1");
> > +	if (action_flags & MLX5_FLOW_ACTION_SAMPLE)
> > +		return rte_flow_error_set(error, EINVAL,
> > +					  RTE_FLOW_ERROR_TYPE_ACTION,
> > NULL,
> > +					  "Duplicate sample actions set");
> > +	if (action_flags & MLX5_FLOW_ACTION_METER)
> > +		return rte_flow_error_set(error, EINVAL,
> > +					  RTE_FLOW_ERROR_TYPE_ACTION,
> > action,
> > +					  "wrong action order, meter should "
> > +					  "be after sample action");
> > +	for (; act->type != RTE_FLOW_ACTION_TYPE_END; act++) {
> > +		if (actions_n == MLX5_DV_MAX_NUMBER_OF_ACTIONS)
> > +			return rte_flow_error_set(error, ENOTSUP,
> > +
> > RTE_FLOW_ERROR_TYPE_ACTION,
> > +						  act, "too many actions");
> > +		switch (act->type) {
> > +		case RTE_FLOW_ACTION_TYPE_QUEUE:
> > +			ret = mlx5_flow_validate_action_queue(act,
> > +							      sub_action_flags,
> > +							      dev,
> > +							      attr, error);
> > +			if (ret < 0)
> > +				return ret;
> > +			sub_action_flags |= MLX5_FLOW_ACTION_QUEUE;
> > +			break;
> > +		case RTE_FLOW_ACTION_TYPE_MARK:
> > +			ret = flow_dv_validate_action_mark(dev, act,
> > +							   sub_action_flags,
> > +							   attr, error);
> > +			if (ret < 0)
> > +				return ret;
> > +			if (dev_conf->dv_xmeta_en !=
> > MLX5_XMETA_MODE_LEGACY)
> > +				sub_action_flags |=
> > MLX5_FLOW_ACTION_MARK |
> > +
> > 	MLX5_FLOW_ACTION_MARK_EXT;
> > +			else
> > +				sub_action_flags |=
> > MLX5_FLOW_ACTION_MARK;
> > +			break;
> > +		case RTE_FLOW_ACTION_TYPE_COUNT:
> > +			ret = flow_dv_validate_action_count(dev, error);
> > +			if (ret < 0)
> > +				return ret;
> > +			sub_action_flags |= MLX5_FLOW_ACTION_COUNT;
> > +			break;
> > +		default:
> > +			return rte_flow_error_set(error, ENOTSUP,
> > +
> > RTE_FLOW_ERROR_TYPE_ACTION,
> > +						  NULL,
> > +						  "Doesn't support optional "
> > +						  "action");
> > +		}
> > +	}
> > +	if (attr->ingress && !attr->transfer) {
> > +		if (!(sub_action_flags & MLX5_FLOW_ACTION_QUEUE))
> > +			return rte_flow_error_set(error, EINVAL,
> > +
> > RTE_FLOW_ERROR_TYPE_ACTION,
> > +						  NULL,
> > +						  "Ingress must has a dest "
> > +						  "QUEUE for Sample");
> > +	} else if (attr->egress && !attr->transfer) {
> > +		return rte_flow_error_set(error, ENOTSUP,
> > +					  RTE_FLOW_ERROR_TYPE_ACTION,
> > +					  NULL,
> > +					  "Sample Only support Ingress "
> > +					  "or E-Switch");
> > +	} else if (sample->actions->type != RTE_FLOW_ACTION_TYPE_END) {
> > +		return rte_flow_error_set(error, ENOTSUP,
> > +					  RTE_FLOW_ERROR_TYPE_ACTION,
> > NULL,
> > +					  "E-Switch doesn't support any "
> > +					  "optinal action for sampling");
> > +	}
> > +	return 0;
> > +}
> > +
> > +/**
> >   * Find existing modify-header resource or create and register a new one.
> >   *
> >   * @param dev[in, out]
> > @@ -5539,6 +5660,15 @@ struct field_modify_info modify_tcp[] = {
> >  			action_flags |=
> MLX5_FLOW_ACTION_SET_IPV6_DSCP;
> >  			rw_act_num += MLX5_ACT_NUM_SET_DSCP;
> >  			break;
> > +		case RTE_FLOW_ACTION_TYPE_SAMPLE:
> > +			ret = flow_dv_validate_action_sample(action_flags,
> > +							     actions, dev,
> > +							     attr, error);
> > +			if (ret < 0)
> > +				return ret;
> > +			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
> > +			++actions_n;
> > +			break;
> >  		default:
> >  			return rte_flow_error_set(error, ENOTSUP,
> >
> > RTE_FLOW_ERROR_TYPE_ACTION,
> > --
> > 1.8.3.1
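
For reference, the domain rules in the quoted validate function can be
condensed into the sketch below. It is a simplified model, not the mlx5
code: the sub_action enum, struct flow_attr and validate_sample() are
hypothetical names for this example, and the root-table, duplicate-sample
and meter-ordering checks from the patch are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical sub-action kinds; SUB_OTHER stands for anything unsupported. */
enum sub_action { SUB_END = 0, SUB_QUEUE, SUB_MARK, SUB_COUNT, SUB_OTHER };

/* Hypothetical subset of the flow attributes used by the check. */
struct flow_attr {
	unsigned int ingress:1;
	unsigned int egress:1;
	unsigned int transfer:1;
};

/* Return 0 when the sample action is acceptable, a negative errno otherwise. */
static int
validate_sample(uint32_t ratio, const enum sub_action *subs,
		const struct flow_attr *attr)
{
	const enum sub_action *s;
	int has_queue = 0;

	if (ratio == 0)
		return -EINVAL; /* ratio values start from 1 */
	for (s = subs; *s != SUB_END; s++) {
		switch (*s) {
		case SUB_QUEUE:
			has_queue = 1;
			break;
		case SUB_MARK:
		case SUB_COUNT:
			break;
		default:
			return -ENOTSUP; /* unsupported sub-action */
		}
	}
	if (attr->ingress && !attr->transfer)
		return has_queue ? 0 : -EINVAL; /* NIC Rx needs a dest QUEUE */
	if (attr->egress && !attr->transfer)
		return -ENOTSUP; /* NIC Tx is not supported */
	if (subs[0] != SUB_END)
		return -ENOTSUP; /* E-Switch (FDB) takes no sub-actions */
	return 0;
}
```

In this model an ingress rule with MARK and QUEUE passes, while the same
list on a transfer rule is rejected, which matches the answer above that
FDB sampling always goes to the e-switch manager port.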



* Re: [dpdk-dev] [PATCH 6/8] net/mlx5: update translate function for sample action
  2020-06-30 19:54   ` Ori Kam
@ 2020-07-01 15:06     ` Jiawei(Jonny) Wang
  0 siblings, 0 replies; 129+ messages in thread
From: Jiawei(Jonny) Wang @ 2020-07-01 15:06 UTC (permalink / raw)
  To: Ori Kam, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl



> -----Original Message-----
> From: Ori Kam <orika@mellanox.com>
> Sent: Wednesday, July 1, 2020 3:55 AM
> To: Jiawei(Jonny) Wang <jiaweiw@mellanox.com>; Slava Ovsiienko
> <viacheslavo@mellanox.com>; Matan Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com;
> fbl@redhat.com; Jiawei(Jonny) Wang <jiaweiw@mellanox.com>
> Subject: RE: [PATCH 6/8] net/mlx5: update translate function for sample
> action
> 
> Hi Jiawei,
> PSB,
> 
> Thanks,
> Ori
> 
> > -----Original Message-----
> > From: Jiawei Wang <jiaweiw@mellanox.com>
> > Sent: Thursday, June 25, 2020 7:26 PM
> > Subject: [PATCH 6/8] net/mlx5: update translate function for sample
> > action
> >
> > Translate the attribute of sample action that include sample ratio and
> > sub actions list, then create the sample DR action.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5_flow.c    |  16 +-
> >  drivers/net/mlx5/mlx5_flow.h    |  14 +-
> >  drivers/net/mlx5/mlx5_flow_dv.c | 502
> > +++++++++++++++++++++++++++++++++++++++-
> >  3 files changed, 511 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > b/drivers/net/mlx5/mlx5_flow.c index 7c65a9a..73ef290 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -4569,10 +4569,14 @@ uint32_t mlx5_flow_adjust_priority(struct
> > rte_eth_dev *dev, int32_t priority,
> >  	int hairpin_flow;
> >  	uint32_t hairpin_id = 0;
> >  	struct rte_flow_attr attr_tx = { .priority = 0 };
> > +	struct rte_flow_attr attr_factor = {0};
> >  	int ret;
> >
> > -	hairpin_flow = flow_check_hairpin_split(dev, attr, actions);
> > -	ret = flow_drv_validate(dev, attr, items, p_actions_rx,
> > +	memcpy((void *)&attr_factor, (const void *)attr, sizeof(*attr));
> > +	if (external)
> > +		attr_factor.group *= MLX5_FLOW_TABLE_FACTOR;
> > +	hairpin_flow = flow_check_hairpin_split(dev, &attr_factor, actions);
> > +	ret = flow_drv_validate(dev, &attr_factor, items, p_actions_rx,
> >  				external, hairpin_flow, error);
> >  	if (ret < 0)
> >  		return 0;
> > @@ -4591,7 +4595,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> > rte_eth_dev *dev, int32_t priority,
> >  		rte_errno = ENOMEM;
> >  		goto error_before_flow;
> >  	}
> > -	flow->drv_type = flow_get_drv_type(dev, attr);
> > +	flow->drv_type = flow_get_drv_type(dev, &attr_factor);
> >  	if (hairpin_id != 0)
> >  		flow->hairpin_flow_id = hairpin_id;
> >  	MLX5_ASSERT(flow->drv_type > MLX5_FLOW_TYPE_MIN && @@ -
> 4637,7
> > +4641,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev,
> > int32_t priority,
> >  		 * depending on configuration. In the simplest
> >  		 * case it just creates unmodified original flow.
> >  		 */
> > -		ret = flow_create_split_outer(dev, flow, attr,
> > +		ret = flow_create_split_outer(dev, flow, &attr_factor,
> >  					      buf->entry[i].pattern,
> >  					      p_actions_rx, external, idx,
> >  					      error);
> > @@ -4674,8 +4678,8 @@ uint32_t mlx5_flow_adjust_priority(struct
> > rte_eth_dev *dev, int32_t priority,
> >  	 * the egress Flows belong to the different device and
> >  	 * copy table should be updated in peer NIC Rx domain.
> >  	 */
> > -	if (attr->ingress &&
> > -	    (external || attr->group != MLX5_FLOW_MREG_CP_TABLE_GROUP))
> > {
> > +	if (attr_factor.ingress &&
> > +	    (external || attr_factor.group !=
> > MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
> >  		ret = flow_mreg_update_copy_table(dev, flow, actions,
> error);
> >  		if (ret)
> >  			goto error;
> > diff --git a/drivers/net/mlx5/mlx5_flow.h
> > b/drivers/net/mlx5/mlx5_flow.h index 941de5f..4163183 100644
> > --- a/drivers/net/mlx5/mlx5_flow.h
> > +++ b/drivers/net/mlx5/mlx5_flow.h
> > @@ -369,6 +369,13 @@ enum mlx5_flow_fate_type {
> >  	MLX5_FLOW_FATE_MAX,
> >  };
> >
> > +/*
> > + * Max number of actions per DV flow.
> > + * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
> > + * in rdma-core file providers/mlx5/verbs.c.
> > + */
> > +#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
> > +
> >  /* Matcher PRM representation */
> >  struct mlx5_flow_dv_match_params {
> >  	size_t size;
> > @@ -599,13 +606,6 @@ struct mlx5_flow_handle {  #define
> > MLX5_FLOW_HANDLE_VERBS_SIZE (sizeof(struct mlx5_flow_handle))
> #endif
> >
> > -/*
> > - * Max number of actions per DV flow.
> > - * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
> > - * in rdma-core file providers/mlx5/verbs.c.
> > - */
> > -#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
> > -
> >  /** Device flow structure only for DV flow creation. */  struct
> > mlx5_flow_dv_workspace {
> >  	uint32_t group; /**< The group index. */ diff --git
> > a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> > index 710c0f3..62a4a3b 100644
> > --- a/drivers/net/mlx5/mlx5_flow_dv.c
> > +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> > @@ -79,6 +79,10 @@
> >  flow_dv_tbl_resource_release(struct rte_eth_dev *dev,
> >  			     struct mlx5_flow_tbl_resource *tbl);
> >
> > +static int
> > +flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
> > +				     uint32_t encap_decap_idx);
> > +
> >  /**
> >   * Initialize flow attributes structure according to flow items' types.
> >   *
> > @@ -7897,6 +7901,385 @@ struct field_modify_info modify_tcp[] = {  }
> >
> >  /**
> > + * Create an Rx Hash queue.
> > + *
> > + * @param dev
> > + *   Pointer to Ethernet device.
> > + * @param[in] dev_flow
> > + *   Pointer to the mlx5_flow.
> > + * @param[in] rss_desc
> > + *   Pointer to the mlx5_flow_rss_desc.
> > + * @param[in, out] hrxq_idx
> 
> I think this is only used as out.
> 
right, will change it.
> > + *   Hash Rx queue index.
> > + * @param[out] error
> > + *   Pointer to error structure.
> > + *
> > + * @return
> > + *   The Verbs/DevX object initialised, NULL otherwise and rte_errno is set.
> > + */
> > +static struct mlx5_hrxq *
> > +flow_dv_handle_rx_queue(struct rte_eth_dev *dev,
> > +			  struct mlx5_flow *dev_flow,
> > +			  struct mlx5_flow_rss_desc *rss_desc,
> > +			  uint32_t *hrxq_idx,
> > +			  struct rte_flow_error *error)
> > +{
> > +	struct mlx5_priv *priv = dev->data->dev_private;
> > +	struct mlx5_flow_handle *dh = dev_flow->handle;
> > +	struct mlx5_hrxq *hrxq;
> > +
> > +	MLX5_ASSERT(rss_desc->queue_num);
> > +	*hrxq_idx = mlx5_hrxq_get(dev, rss_desc->key,
> > +				 MLX5_RSS_HASH_KEY_LEN,
> > +				 dev_flow->hash_fields,
> > +				 rss_desc->queue,
> > +				 rss_desc->queue_num);
> > +	if (!*hrxq_idx) {
> > +		*hrxq_idx = mlx5_hrxq_new
> > +				(dev, rss_desc->key,
> > +				MLX5_RSS_HASH_KEY_LEN,
> > +				dev_flow->hash_fields,
> > +				rss_desc->queue,
> > +				rss_desc->queue_num,
> > +				!!(dh->layers &
> > +				MLX5_FLOW_LAYER_TUNNEL));
> > +	}
> > +	hrxq = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_HRXQ],
> > +			      *hrxq_idx);
> 
> Why do you need this line? You can compare the hrxq_idx to check for error.
> 
Yes, we can check for an error with *hrxq_idx == 0, and return the corresponding hash Rx queue object if there is no error.
Thanks.
 
> > +	if (!hrxq) {
> > +		rte_flow_error_set
> > +			(error, rte_errno,
> > +			 RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
> > +			 "cannot get hash queue");
> > +		goto error;
> > +	}
> > +	dh->rix_hrxq = *hrxq_idx;
> > +	return hrxq;
> > +error:
> > +	/* hrxq is union, don't clear it if the flag is not set. */
> > +	if (dh->rix_hrxq) {
> > +		mlx5_hrxq_release(dev, dh->rix_hrxq);
> > +		dh->rix_hrxq = 0;
> > +	}
> > +	return NULL;
> > +}
> > +
> 
> 
> [snap...]


* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-06-28 15:52                 ` Jiawei(Jonny) Wang
@ 2020-07-02  0:18                   ` Stephen Hemminger
  2020-07-02  7:16                     ` Ori Kam
  0 siblings, 1 reply; 129+ messages in thread
From: Stephen Hemminger @ 2020-07-02  0:18 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang
  Cc: Jerin Jacob, Thomas Monjalon, Ori Kam, Slava Ovsiienko,
	Matan Azrad, dpdk-dev, Raslan Darawsheh, ian.stokes, fbl,
	Ferruh Yigit, Andrew Rybchenko

On Sun, 28 Jun 2020 15:52:27 +0000
"Jiawei(Jonny) Wang" <jiaweiw@mellanox.com> wrote:

> > -----Original Message-----
> > From: Jerin Jacob <jerinjacobk@gmail.com>
> > Sent: Sunday, June 28, 2020 9:38 PM
> > To: Jiawei(Jonny) Wang <jiaweiw@mellanox.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Ori Kam
> > <orika@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>;
> > Matan Azrad <matan@mellanox.com>; dpdk-dev <dev@dpdk.org>; Raslan
> > Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com;
> > fbl@redhat.com; Ferruh Yigit <ferruh.yigit@intel.com>; Andrew Rybchenko
> > <arybchenko@solarflare.com>
> > Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte
> > flow
> > 
> > On Sun, Jun 28, 2020 at 6:46 PM Jiawei(Jonny) Wang
> > <jiaweiw@mellanox.com> wrote:  
> > >
> > >
> > > On Friday, June 26, 2020 7:10 PM Jerin Jacob <jerinjacobk@gmail.com>  
> > Wrote:  
> > > >
> > > > On Fri, Jun 26, 2020 at 4:16 PM Thomas Monjalon
> > > > <thomas@monjalon.net>
> > > > wrote:  
> > > > >
> > > > > 26/06/2020 12:35, Jerin Jacob:  
> > > > > > On Fri, Jun 26, 2020 at 12:59 AM Thomas Monjalon  
> > > > <thomas@monjalon.net> wrote:  
> > > > > > >
> > > > > > > 25/06/2020 19:55, Jerin Jacob:  
> > > > > > > > On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang  
> > > > <jiaweiw@mellanox.com> wrote:  
> > > > > > > > >
> > > > > > > > > When using full offload, all traffic will be handled by
> > > > > > > > > the HW, and directed to the requested vf or wire, the
> > > > > > > > > control application loses visibility on the traffic.
> > > > > > > > > So there's a need for an action that will enable the
> > > > > > > > > control application some visibility.
> > > > > > > > >
> > > > > > > > > The solution is introduced a new action that will sample
> > > > > > > > > the incoming traffic and send a duplicated traffic in some
> > > > > > > > > predefined ratio to the application, while the original
> > > > > > > > > packet will continue to the target destination.
> > > > > > > > >
> > > > > > > > > The packets sampled equals is '1/ratio', if the ratio
> > > > > > > > > value be set to 1 , means that the packets would be
> > > > > > > > > completely mirrored. The sample packet can be assigned
> > > > > > > > > with different set of  
> > > > actions from the original packet.  
> > > > > > > > >
> > > > > > > > > In order to support the sample packet in rte_flow, new
> > > > > > > > > rte_flow action definition RTE_FLOW_ACTION_TYPE_SAMPLE  
> > and  
> > > > > > > > > structure rte_flow_action_sample  
> > > > > > > >
> > > > > > > > Isn't mirroring the packet? How about,
> > > > > > > > RTE_FLOW_ACTION_TYPE_MIRROR I am not able to understand,  
> > Why  
> > > > it is called sample.  
> > > > > > >
> > > > > > > Sampling is a partial mirroring.  
> > > > > >
> > > > > > I think, By definition, _sampling_ is the _selection_ of items
> > > > > > from a specific group.
> > > > > > I think, _sampling_ is not dictating, what is the real action
> > > > > > for the "selected"  items.
> > > > > > One can get confused with the selected ones can be for forward,
> > > > > > drop any other action.  
> > > > >
> > > > > I see. Good design question (I will let others reply).
> > > > >  
> > > > > > So IMO, explicit mirror keyword usage makes it is clear.  
> > >
> > > Sampled packet is duplicated from incoming traffic at specific ratio
> > > and will go to different sample actions;
> > > ratio=1 is 100% duplication or mirroring.
> > > All packets will continue to go to default flow actions.  
> > 
> > Functionality is clear from the git commit log(Not from action name).
> > The only question is what would be the appropriate name for this action.
> > RTE_FLOW_ACTION_TYPE_SAMPLE vs RTE_FLOW_ACTION_TYPE_MIRROR
> >   
> > >  
> > > > > >
> > > > > > Some more related questions:
> > > > > > 1) What is the real use case for ratio? I am not against adding
> > > > > > a ratio attribute if the MLX hardware supports it. It will be
> > > > > > good to know the use case from the application perspective? And
> > > > > > what basics application set ratio != 1?  
> > > > >
> > > > > If I understand well, some applications want to check, by picking
> > > > > random packets, that the processing is not failing.  
> > > >
> > > > Not clear to me. I will wait for another explanation if any.
> > > > In what basics application set .1 vs .8?  
> > >
> > > The real case is like monitor the traffic with full-offload.
> > > While packet hit the sample flow, the matching packets will be sampled
> > > and sent to specific Queue, align with OVS sflow probability, user  
> > application can set it different value.
> > 
> > I understand the use case for mirror and supported in a lot of HW.
> > What I would like to understand is the use case for "ratio"?
> > Is the "ratio" part of OpenFlow spec? Or Is it an MLX hardware feature?
> >   
> The same usage of the 'probability' variable of ovs sample action;
> MLX HW implemented it.
> > 
> >   
> > >  
> > > >  
> > > > >  
> > > > > > 2) If it is for "rate-limiting" or "policing", why not use
> > > > > > rte_mtr object (rte_mtr.h) via rte_flow action.  
> > >
> > > The sample ratio isn’t the same as “meter’, the ratio of sampling will be  
> > calculated with incoming packets mask (every some packets sampled 1).
> > Then the packets will be duplicated and go to do the other sample actions.
> > 
> > What I meant here is , If the ratio is used for rate-limiting then having a
> > cascade rule like RTE_FLOW_ACTION_TYPE_MIRROR,
> > RTE_FLOW_ACTION_TYPE_MTR will do the job.
> >   
> The ratio means the probability with packet replication, we don't need add METER action here.
> > >
> > >  
> > > > > > 3) One of the issue for driver developers and application
> > > > > > writers are overlapping APIs. This would overlap with
> > > > > > rte_eth_mirror_rule_set() API.
> > > > > >
> > > > > > Can we deprecate rte_eth_mirror_rule_set() API? It will be a
> > > > > > pain for all to have overlapping APIs. We have not fixed the
> > > > > > VLAN filter API overlap with rte_flow in ethdev. Its being TODO
> > > > > > for multiple releases now.  
> > > > >
> > > > > Ooooooooh yes!
> > > > > I think flow-based API is more powerful, and should deprecate old
> > > > > port-based API.  
> > > >
> > > > +1 from me.
> > > >
> > > > it is taking too much effort and time to make support duplicate APIs.
> > > >  
> > > > > I want to help deprecating such API in 20.11 if possible.  
> > > >
> > > > Please start that discussion. In this case, it is clear API overlap
> > > > with rte_eth_mirror_rule_set().
> > > > We should not have two separate paths for the same function in the
> > > > same ethdev library.
> > > >
> > > >
> > > >  
> > > > >  
> > > > > > > Full mirroring is sampling 100% packets (ratio = 1).
> > > > > > > That's why only one action is enough.  
> > > > >
> > > > >
> > > > >  

One use case would be simulating packet loss or duplication (like netem does).
In that use case, the sample action would have to not have an implicit copy.

Could sample be defined as "if a random number hits the ratio, then execute
this alternative rule, otherwise go to the next rule"?

The alternative rule could mirror if it wanted, or drop, or count, or ...

It could even be used as a form of transmit load balancing.
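
A minimal sketch of that definition, for illustration only: the sample
action becomes a probabilistic branch with no implicit copy. The names
next_u32() and sample_hit() and the xorshift generator are assumptions
made for this example and are not part of rte_flow; real hardware would
make the decision itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PRNG (xorshift32); stands in for whatever the device uses. */
static uint32_t
next_u32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Return 1 when the packet should take the alternative rule, 0 when it
 * should fall through to the next rule. On average one packet in 'ratio'
 * hits; ratio == 1 always hits, ratio == 0 is invalid and never hits.
 */
static int
sample_hit(uint32_t ratio, uint32_t *state)
{
	assert(state != NULL && *state != 0); /* xorshift needs a non-zero seed */
	if (ratio == 0)
		return 0;
	return next_u32(state) % ratio == 0;
}
```

The caller would then run the alternative rule (mirror, drop, count, ...)
on a hit and continue with the next rule otherwise.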



* Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow
  2020-07-02  0:18                   ` Stephen Hemminger
@ 2020-07-02  7:16                     ` Ori Kam
  0 siblings, 0 replies; 129+ messages in thread
From: Ori Kam @ 2020-07-02  7:16 UTC (permalink / raw)
  To: Stephen Hemminger, Jiawei(Jonny) Wang
  Cc: Jerin Jacob, Thomas Monjalon, Slava Ovsiienko, Matan Azrad,
	dpdk-dev, Raslan Darawsheh, ian.stokes, fbl, Ferruh Yigit,
	Andrew Rybchenko

Hi Stephen,

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Thursday, July 2, 2020 3:18 AM
> To: Jiawei(Jonny) Wang <jiaweiw@mellanox.com>
> Cc: Jerin Jacob <jerinjacobk@gmail.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ori Kam <orika@mellanox.com>; Slava Ovsiienko
> <viacheslavo@mellanox.com>; Matan Azrad <matan@mellanox.com>; dpdk-
> dev <dev@dpdk.org>; Raslan Darawsheh <rasland@mellanox.com>;
> ian.stokes@intel.com; fbl@redhat.com; Ferruh Yigit <ferruh.yigit@intel.com>;
> Andrew Rybchenko <arybchenko@solarflare.com>
> Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte
> flow
> 
> On Sun, 28 Jun 2020 15:52:27 +0000
> "Jiawei(Jonny) Wang" <jiaweiw@mellanox.com> wrote:
> 
> > > -----Original Message-----
> > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > Sent: Sunday, June 28, 2020 9:38 PM
> > > To: Jiawei(Jonny) Wang <jiaweiw@mellanox.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Ori Kam
> > > <orika@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>;
> > > Matan Azrad <matan@mellanox.com>; dpdk-dev <dev@dpdk.org>; Raslan
> > > Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com;
> > > fbl@redhat.com; Ferruh Yigit <ferruh.yigit@intel.com>; Andrew Rybchenko
> > > <arybchenko@solarflare.com>
> > > Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte
> > > flow
> > >
> > > On Sun, Jun 28, 2020 at 6:46 PM Jiawei(Jonny) Wang
> > > <jiaweiw@mellanox.com> wrote:
> > > >
> > > >
> > > > On Friday, June 26, 2020 7:10 PM Jerin Jacob <jerinjacobk@gmail.com>
> > > Wrote:
> > > > >
> > > > > On Fri, Jun 26, 2020 at 4:16 PM Thomas Monjalon
> > > > > <thomas@monjalon.net>
> > > > > wrote:
> > > > > >
> > > > > > 26/06/2020 12:35, Jerin Jacob:
> > > > > > > On Fri, Jun 26, 2020 at 12:59 AM Thomas Monjalon
> > > > > <thomas@monjalon.net> wrote:
> > > > > > > >
> > > > > > > > 25/06/2020 19:55, Jerin Jacob:
> > > > > > > > > On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang
> > > > > <jiaweiw@mellanox.com> wrote:
> > > > > > > > > >
> > > > > > > > > > When using full offload, all traffic will be handled by
> > > > > > > > > > the HW, and directed to the requested vf or wire, the
> > > > > > > > > > control application loses visibility on the traffic.
> > > > > > > > > > So there's a need for an action that will enable the
> > > > > > > > > > control application some visibility.
> > > > > > > > > >
> > > > > > > > > > The solution is introduced a new action that will sample
> > > > > > > > > > the incoming traffic and send a duplicated traffic in some
> > > > > > > > > > predefined ratio to the application, while the original
> > > > > > > > > > packet will continue to the target destination.
> > > > > > > > > >
> > > > > > > > > > The packets sampled equals is '1/ratio', if the ratio
> > > > > > > > > > value be set to 1 , means that the packets would be
> > > > > > > > > > completely mirrored. The sample packet can be assigned
> > > > > > > > > > with different set of
> > > > > actions from the original packet.
> > > > > > > > > >
> > > > > > > > > > In order to support the sample packet in rte_flow, new
> > > > > > > > > > rte_flow action definition RTE_FLOW_ACTION_TYPE_SAMPLE
> > > and
> > > > > > > > > > structure rte_flow_action_sample
> > > > > > > > >
> > > > > > > > > Isn't mirroring the packet? How about,
> > > > > > > > > RTE_FLOW_ACTION_TYPE_MIRROR I am not able to understand,
> > > Why
> > > > > it is called sample.
> > > > > > > >
> > > > > > > > Sampling is a partial mirroring.
> > > > > > >
> > > > > > > I think, By definition, _sampling_ is the _selection_ of items
> > > > > > > from a specific group.
> > > > > > > I think, _sampling_ is not dictating, what is the real action
> > > > > > > for the "selected"  items.
> > > > > > > One can get confused with the selected ones can be for forward,
> > > > > > > drop any other action.
> > > > > >
> > > > > > I see. Good design question (I will let others reply).
> > > > > >
> > > > > > > So IMO, explicit mirror keyword usage makes it is clear.
> > > >
> > > > Sampled packet is duplicated from incoming traffic at specific ratio
> > > > and will go to different sample actions;
> > > > ratio=1 is 100% duplication or mirroring.
> > > > All packets will continue to go to default flow actions.
> > >
> > > Functionality is clear from the git commit log(Not from action name).
> > > The only question is what would be the appropriate name for this action.
> > > RTE_FLOW_ACTION_TYPE_SAMPLE vs RTE_FLOW_ACTION_TYPE_MIRROR
> > >
> > > >
> > > > > > >
> > > > > > > Some more related questions:
> > > > > > > 1) What is the real use case for ratio? I am not against adding
> > > > > > > a ratio attribute if the MLX hardware supports it. It will be
> > > > > > > good to know the use case from the application perspective? And
> > > > > > > what basics application set ratio != 1?
> > > > > >
> > > > > > If I understand well, some applications want to check, by picking
> > > > > > random packets, that the processing is not failing.
> > > > >
> > > > > Not clear to me. I will wait for another explanation if any.
> > > > > In what basics application set .1 vs .8?
> > > >
> > > > The real case is like monitor the traffic with full-offload.
> > > > While packet hit the sample flow, the matching packets will be sampled
> > > > and sent to specific Queue, align with OVS sflow probability, user
> > > application can set it different value.
> > >
> > > I understand the use case for mirror and supported in a lot of HW.
> > > What I would like to understand is the use case for "ratio"?
> > > Is the "ratio" part of OpenFlow spec? Or Is it an MLX hardware feature?
> > >
> > The same usage of the 'probability' variable of ovs sample action;
> > MLX HW implemented it.
> > >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > 2) If it is for "rate-limiting" or "policing", why not use
> > > > > > > rte_mtr object (rte_mtr.h) via rte_flow action.
> > > >
> > > > The sample ratio isn't the same as a "meter"; the sampling ratio is
> > > > calculated with an incoming packets mask (one packet is sampled out of every so many).
> > > > Then the packets will be duplicated and go through the other sample actions.
> > >
> > > What I meant here is, if the ratio is used for rate-limiting then having a
> > > cascade rule like RTE_FLOW_ACTION_TYPE_MIRROR,
> > > RTE_FLOW_ACTION_TYPE_MTR will do the job.
> > >
> > The ratio means the probability of packet replication; we don't need to add
> > a METER action here.
> > > >
> > > >
> > > > > > > 3) One of the issues for driver developers and application
> > > > > > > writers is overlapping APIs. This would overlap with the
> > > > > > > rte_eth_mirror_rule_set() API.
> > > > > > >
> > > > > > > Can we deprecate rte_eth_mirror_rule_set() API? It will be a
> > > > > > > pain for all to have overlapping APIs. We have not fixed the
> > > > > > > VLAN filter API overlap with rte_flow in ethdev. It has been a TODO
> > > > > > > for multiple releases now.
> > > > > >
> > > > > > Ooooooooh yes!
> > > > > > I think flow-based API is more powerful, and should deprecate old
> > > > > > port-based API.
> > > > >
> > > > > +1 from me.
> > > > >
> > > > > It is taking too much effort and time to support duplicate APIs.
> > > > >
> > > > > > I want to help deprecating such API in 20.11 if possible.
> > > > >
> > > > > Please start that discussion. In this case, it is clear API overlap
> > > > > with rte_eth_mirror_rule_set().
> > > > > We should not have two separate paths for the same function in the
> > > > > same ethdev library.
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > > > Full mirroring is sampling 100% packets (ratio = 1).
> > > > > > > > That's why only one action is enough.
> > > > > >
> > > > > >
> > > > > >
> 
> One use case would be simulating packet loss or duplication. (like netem does).
> In that use case, the sample action would have to not have an implicit copy.
> 
> Could sample be defined as "if random number hits the ratio" then execute
> this alternative rule, otherwise go to next rule.
> 
> Then the alternative rule could mirror if it wanted, or drop, or count, or ...
> 
> It could even be used as a form of transmit load balancing.

I think what you are suggesting is a different action. By definition (at least in the kernel),
a sample is a duplication of a packet; it is meant to solve the lack of visibility when doing full offload.
What you are suggesting amounts to adding two flows, which, as discussed in a different thread, will have a major impact on performance
(it duplicates the matching and wastes two flows, and a number of HW devices have limited rule capacity).
Also, it seems like a bug if the application can't rely on where the packet will end up. It would break applications that
count on traffic reaching a specific queue.

Best,
Ori




^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH 0/7] support the flow-based traffic sampling
  2020-06-25 16:26 [dpdk-dev] [PATCH 0/8] support the flow-based traffic sampling Jiawei Wang
                   ` (7 preceding siblings ...)
  2020-06-25 16:26 ` [dpdk-dev] [PATCH 8/8] app/testpmd: add testpmd command for sample action Jiawei Wang
@ 2020-07-02 17:43 ` Jiawei Wang
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
                     ` (6 more replies)
  2020-07-02 18:43 ` [dpdk-dev] [PATCH v2 0/7] [v2] support the flow-based traffic sampling Jiawei Wang
  9 siblings, 7 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 17:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

This patch set implements flow sampling for the mlx5 driver.

The solution introduces a new rte_flow action that samples the
incoming traffic and sends a duplicate of the traffic, at a predefined
ratio, to the application, while the original packet continues to
the target destination.

If the sample ratio is set to 1, the packets are completely
mirrored. The sampled packet can be assigned an additional
set of actions, different from those of the original packet.

The MLX5 PMD driver is responsible for validating and translating the
sample action while creating a flow.

v2:
* Rebase patches based on the latest code.
* Update rte_flow and release documents.
* Fix the compile error.
* Removed unnecessary change in [PATCH 7/8] net/mlx5: update the metadata register c0 support since FDB will use 5-tuple to do match.
* Update changes based on the comments.

Jiawei Wang (7):
  ethdev: introduce sample action for rte flow
  common/mlx5: glue for sample action
  common/mlx5: query sampler object capability via DevX
  net/mlx5: add the validate sample action
  net/mlx5: split sample flow into two sub flows
  net/mlx5: update translate function for sample action
  app/testpmd: add testpmd command for sample action

 app/test-pmd/cmdline_flow.c            | 285 ++++++++++++++-
 doc/guides/prog_guide/rte_flow.rst     |  25 ++
 doc/guides/rel_notes/release_20_08.rst |   6 +
 drivers/common/mlx5/Makefile           |   5 +
 drivers/common/mlx5/linux/meson.build  |   2 +
 drivers/common/mlx5/linux/mlx5_glue.c  |  15 +
 drivers/common/mlx5/linux/mlx5_glue.h  |  12 +
 drivers/common/mlx5/mlx5_devx_cmds.c   |  27 ++
 drivers/common/mlx5/mlx5_devx_cmds.h   |   1 +
 drivers/common/mlx5/mlx5_prm.h         |  51 +++
 drivers/net/mlx5/linux/mlx5_os.c       |  14 +
 drivers/net/mlx5/mlx5.c                |  11 +
 drivers/net/mlx5/mlx5.h                |   4 +
 drivers/net/mlx5/mlx5_flow.c           | 274 +++++++++++++-
 drivers/net/mlx5/mlx5_flow.h           |  51 ++-
 drivers/net/mlx5/mlx5_flow_dv.c        | 627 ++++++++++++++++++++++++++++++++-
 lib/librte_ethdev/rte_flow.c           |   1 +
 lib/librte_ethdev/rte_flow.h           |  28 ++
 18 files changed, 1404 insertions(+), 35 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH 1/7] ethdev: introduce sample action for rte flow
  2020-07-02 17:43 ` [dpdk-dev] [PATCH 0/7] support the flow-based traffic sampling Jiawei Wang
@ 2020-07-02 17:43   ` Jiawei Wang
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 2/7] common/mlx5: glue for sample action Jiawei Wang
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 17:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

When using full offload, all traffic is handled by the HW and
directed to the requested VF or wire, so the control application loses
visibility of the traffic.
So there's a need for an action that gives the control application
some visibility.

The solution introduces a new action that samples the incoming
traffic and sends a duplicate of the traffic, at a predefined ratio, to
the application, while the original packet continues to the target
destination.

The fraction of packets sampled equals '1/ratio'. If the ratio is set
to 1, the packets are completely mirrored. The sampled packet
can be assigned a different set of actions from the original packet.

To support packet sampling in rte_flow, a new rte_flow action
definition, RTE_FLOW_ACTION_TYPE_SAMPLE, and a new structure,
rte_flow_action_sample, are introduced.
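
For illustration only, the '1/ratio' semantics described above can be
modelled in plain C. This is a sketch under stated assumptions, not the
mlx5 hardware algorithm: it samples deterministically with a packet
counter, whereas real hardware may well sample probabilistically. The
names sampler, sampler_hit() and count_sampled() are hypothetical and
are not part of rte_flow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of a 1-in-ratio sampler (not a DPDK API). */
struct sampler {
	uint32_t ratio;   /* sample one packet out of every 'ratio' packets */
	uint32_t counter; /* packets seen since the last sampled packet */
};

/* Return 1 if the current packet should be duplicated, 0 otherwise. */
static int
sampler_hit(struct sampler *s)
{
	assert(s->ratio != 0); /* the patch set rejects ratio == 0 */
	if (++s->counter < s->ratio)
		return 0;
	s->counter = 0;
	return 1;
}

/* Count how many of 'n_pkts' packets would be sampled at 'ratio'. */
static uint32_t
count_sampled(uint32_t ratio, uint32_t n_pkts)
{
	struct sampler s = { .ratio = ratio, .counter = 0 };
	uint32_t hits = 0;
	uint32_t i;

	for (i = 0; i < n_pkts; i++)
		hits += sampler_hit(&s);
	return hits;
}
```

In this model a ratio of 1 duplicates every packet (full mirroring),
while every original packet still continues on its normal path.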

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
 doc/guides/rel_notes/release_20_08.rst |  6 ++++++
 lib/librte_ethdev/rte_flow.c           |  1 +
 lib/librte_ethdev/rte_flow.h           | 28 ++++++++++++++++++++++++++++
 4 files changed, 60 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index d5dd18c..50dfe1f 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -2645,6 +2645,31 @@ timeout passed without any matching on the flow.
    | ``context``  | user input flow context         |
    +--------------+---------------------------------+
 
+Action: ``SAMPLE``
+^^^^^^^^^^^^^^^^^^
+
+Adds a sample action to a matched flow.
+
+The matching packets will be duplicated to a special queue or vport
+with the predefined ``ratio``; the fraction of packets sampled is '1/ratio'.
+All the packets continue to the target destination.
+
+When the ``ratio`` is set to 1, the packets are 100% mirrored.
+``actions`` represents a separate set of actions for the sampled or mirrored
+packets.
+
+.. _table_rte_flow_action_sample:
+
+.. table:: SAMPLE
+
+   +--------------+---------------------------------+
+   | Field        | Value                           |
+   +==============+=================================+
+   | ``ratio``    | 32 bits sample ratio value      |
+   +--------------+---------------------------------+
+   | ``actions``  | sub-action list for sampling    |
+   +--------------+---------------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
index 5cbc4ce..313e8d3 100644
--- a/doc/guides/rel_notes/release_20_08.rst
+++ b/doc/guides/rel_notes/release_20_08.rst
@@ -81,6 +81,12 @@ New Features
   * Added support for virtio queue statistics.
   * Added support for MTU update.
 
+* **Added flow-based traffic sampling support.**
+
+  Added new action ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate the matching
+  packets at a given ratio and redirect them to a vport or queue. The sampled
+  packets can also be assigned additional optional actions.
+
 * **Updated Marvell octeontx2 ethdev PMD.**
 
   Updated Marvell octeontx2 driver with cn98xx support.
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 1685be5..733871d 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -173,6 +173,7 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
+	MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
 };
 
 int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index b0e4199..c9cd80d 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -2099,6 +2099,13 @@ enum rte_flow_action_type {
 	 * see enum RTE_ETH_EVENT_FLOW_AGED
 	 */
 	RTE_FLOW_ACTION_TYPE_AGE,
+
+	/**
+	 * Duplicates a specific ratio of packets to a vport or queue.
+	 *
+	 * See struct rte_flow_action_sample.
+	 */
+	RTE_FLOW_ACTION_TYPE_SAMPLE,
 };
 
 /**
@@ -2709,6 +2716,27 @@ struct rte_flow_action {
 struct rte_flow;
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SAMPLE
+ *
+ * Adds a sample action to a matched flow.
+ *
+ * The matching packets will be duplicated to a special queue or vport
+ * with the predefined probability. All the packets continue processing
+ * on the default flow path.
+ *
+ * When the sample ratio is set to 1, the packets are 100% mirrored.
+ * An additional action list can be supplied for sampled or mirrored packets.
+ */
+struct rte_flow_action_sample {
+	const uint32_t ratio; /**< fraction of packets sampled is '1/ratio'. */
+	const struct rte_flow_action *actions;
+		/**< sub-action list specific for the sampling hit cases. */
+};
+
+/**
  * Verbose error types.
  *
  * Most of them provide the type of the object referenced by struct
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH 2/7] common/mlx5: glue for sample action
  2020-07-02 17:43 ` [dpdk-dev] [PATCH 0/7] support the flow-based traffic sampling Jiawei Wang
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
@ 2020-07-02 17:43   ` Jiawei Wang
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 3/7] common/mlx5: query sampler object capability via DevX Jiawei Wang
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 17:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

rdma-core introduces a new DR sample action.

Add the rdma-core commands in glue to create this action.

The sample action is used to create the sample object that implements
the sampling/mirroring function.
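
As a rough illustration of the glue pattern used here, the sketch below
shows a wrapper that calls the library when a build-time feature macro is
defined and otherwise fails with ENOTSUP. All names in it
(HAVE_SAMPLER_SUPPORT, sampler_attr, backend_create_sampler(),
glue_create_sampler(), sampler_create_status()) are hypothetical
stand-ins, not rdma-core or mlx5 symbols.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the sampler attribute structure. */
struct sampler_attr {
	unsigned int sample_ratio;
};

/*
 * HAVE_SAMPLER_SUPPORT is a hypothetical macro playing the role of
 * HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE. It is not defined here, so
 * the fallback branch is the one that gets compiled.
 */
static void *
glue_create_sampler(struct sampler_attr *attr)
{
#ifdef HAVE_SAMPLER_SUPPORT
	return backend_create_sampler(attr); /* hypothetical library call */
#else
	(void)attr;
	errno = ENOTSUP;
	return NULL;
#endif
}

/* Caller-side view: 0 on success, a negative errno value otherwise. */
static int
sampler_create_status(unsigned int ratio)
{
	struct sampler_attr attr = { .sample_ratio = ratio };

	return glue_create_sampler(&attr) != NULL ? 0 : -errno;
}
```

The point of the pattern is that callers never need their own #ifdef:
they test the returned pointer and read errno.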

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 drivers/common/mlx5/Makefile          |  5 +++++
 drivers/common/mlx5/linux/meson.build |  2 ++
 drivers/common/mlx5/linux/mlx5_glue.c | 15 +++++++++++++++
 drivers/common/mlx5/linux/mlx5_glue.h | 12 ++++++++++++
 4 files changed, 34 insertions(+)

diff --git a/drivers/common/mlx5/Makefile b/drivers/common/mlx5/Makefile
index f6c762b..4c1484c 100644
--- a/drivers/common/mlx5/Makefile
+++ b/drivers/common/mlx5/Makefile
@@ -192,6 +192,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
 		func mlx5dv_dump_dr_domain \
 		$(AUTOCONF_OUTPUT)
 	$Q sh -- '$<' '$@' \
+		HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE \
+		infiniband/mlx5dv.h \
+		func mlx5dv_dr_action_create_flow_sampler \
+		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
 		HAVE_MLX5DV_MMAP_GET_NC_PAGES_CMD \
 		infiniband/mlx5dv.h \
 		enum MLX5_MMAP_GET_NC_PAGES_CMD \
diff --git a/drivers/common/mlx5/linux/meson.build b/drivers/common/mlx5/linux/meson.build
index 2294213..0f08318 100644
--- a/drivers/common/mlx5/linux/meson.build
+++ b/drivers/common/mlx5/linux/meson.build
@@ -162,6 +162,8 @@ has_sym_args = [
 	'RDMA_NLDEV_ATTR_NDEV_INDEX' ],
 	[ 'HAVE_MLX5_DR_FLOW_DUMP', 'infiniband/mlx5dv.h',
 	'mlx5dv_dump_dr_domain'],
+	[ 'HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE', 'infiniband/mlx5dv.h',
+	'mlx5dv_dr_action_create_flow_sampler'],
 	[ 'HAVE_MLX5DV_DR_MEM_RECLAIM', 'infiniband/mlx5dv.h',
 	'mlx5dv_dr_domain_set_reclaim_device_memory'],
 	[ 'HAVE_DEVLINK', 'linux/devlink.h', 'DEVLINK_GENL_NAME' ],
diff --git a/drivers/common/mlx5/linux/mlx5_glue.c b/drivers/common/mlx5/linux/mlx5_glue.c
index 048207e..98b3e71 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.c
+++ b/drivers/common/mlx5/linux/mlx5_glue.c
@@ -1059,6 +1059,19 @@
 #endif
 }
 
+static void *
+mlx5_glue_dr_create_flow_action_sampler(
+			struct mlx5dv_dr_flow_sampler_attr *attr)
+{
+#ifdef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+	return mlx5dv_dr_action_create_flow_sampler(attr);
+#else
+	(void)attr;
+	errno = ENOTSUP;
+	return NULL;
+#endif
+}
+
 static int
 mlx5_glue_devx_query_eqn(struct ibv_context *ctx, uint32_t cpus,
 			 uint32_t *eqn)
@@ -1308,6 +1321,8 @@
 	.devx_port_query = mlx5_glue_devx_port_query,
 	.dr_dump_domain = mlx5_glue_dr_dump_domain,
 	.dr_reclaim_domain_memory = mlx5_glue_dr_reclaim_domain_memory,
+	.dr_create_flow_action_sampler =
+		mlx5_glue_dr_create_flow_action_sampler,
 	.devx_query_eqn = mlx5_glue_devx_query_eqn,
 	.devx_create_event_channel = mlx5_glue_devx_create_event_channel,
 	.devx_destroy_event_channel = mlx5_glue_devx_destroy_event_channel,
diff --git a/drivers/common/mlx5/linux/mlx5_glue.h b/drivers/common/mlx5/linux/mlx5_glue.h
index 069d854..11b95c5 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.h
+++ b/drivers/common/mlx5/linux/mlx5_glue.h
@@ -77,6 +77,7 @@
 #ifndef HAVE_MLX5DV_DR
 enum  mlx5dv_dr_domain_type { unused, };
 struct mlx5dv_dr_domain;
+struct mlx5dv_dr_action;
 #endif
 
 #ifndef HAVE_MLX5DV_DR_DEVX_PORT
@@ -87,6 +88,15 @@
 struct mlx5dv_dr_flow_meter_attr;
 #endif
 
+#ifndef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+struct mlx5dv_dr_flow_sampler_attr {
+	uint32_t sample_ratio;
+	void *default_next_table;
+	size_t num_sample_actions;
+	struct mlx5dv_dr_action **sample_actions;
+};
+#endif
+
 #ifndef HAVE_IBV_DEVX_EVENT
 struct mlx5dv_devx_event_channel { int fd; };
 struct mlx5dv_devx_async_event_hdr;
@@ -304,6 +314,8 @@ struct mlx5_glue {
 			 struct mlx5dv_devx_async_event_hdr *event_data,
 			 size_t event_resp_len);
 	void (*dr_reclaim_domain_memory)(void *domain, uint32_t enable);
+	void *(*dr_create_flow_action_sampler)
+			(struct mlx5dv_dr_flow_sampler_attr *attr);
 };
 
 extern const struct mlx5_glue *mlx5_glue;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH 3/7] common/mlx5: query sampler object capability via DevX
  2020-07-02 17:43 ` [dpdk-dev] [PATCH 0/7] support the flow-based traffic sampling Jiawei Wang
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 2/7] common/mlx5: glue for sample action Jiawei Wang
@ 2020-07-02 17:43   ` Jiawei Wang
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 4/7] net/mlx5: add the validate sample action Jiawei Wang
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 17:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Update function mlx5_devx_cmd_query_hca_attr() to add the NIC Flow
Table attributes query, then get log_max_ft_sampler_num from the
flow table properties.

Add the related struct definitions in mlx5_prm.h.
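
A small sketch of how the query selector and the returned capability
might be interpreted. It rests on two assumptions that the patch itself
does not spell out: that bit 0 of op_mod selects "current" (1) rather
than "maximum" (0) values, and that the "log_" prefix means the field is
a base-2 logarithm of the object limit. The macro and function names are
hypothetical.

```c
#include <stdint.h>

/* Value taken from the patch: NIC flow table capability selector. */
#define CAP_OP_MOD_NIC_FLOW_TABLE (0x7 << 1)
/* Assumption: bit 0 set means "query current values". */
#define CAP_OPMOD_GET_CUR 0x1

/* Build the op_mod field for a "current NIC flow table caps" query. */
static uint16_t
nic_ft_cap_op_mod(void)
{
	return CAP_OP_MOD_NIC_FLOW_TABLE | CAP_OPMOD_GET_CUR;
}

/*
 * Assumption: the field holds log2 of the maximum number of sampler
 * objects. Zero is treated as "not supported", in line with the driver
 * enabling the sampler only when the value is greater than 0.
 */
static uint64_t
max_samplers(uint8_t log_max_ft_sampler_num)
{
	if (log_max_ft_sampler_num == 0)
		return 0;
	if (log_max_ft_sampler_num >= 64)
		return UINT64_MAX; /* saturate instead of overflowing */
	return UINT64_C(1) << log_max_ft_sampler_num;
}
```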

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c | 27 +++++++++++++++++++
 drivers/common/mlx5/mlx5_devx_cmds.h |  1 +
 drivers/common/mlx5/mlx5_prm.h       | 51 ++++++++++++++++++++++++++++++++++++
 3 files changed, 79 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index ec92eb6..6b551f1 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -496,6 +496,33 @@ struct mlx5_devx_obj *
 	if (!attr->eth_net_offloads)
 		return 0;
 
+	/* Query flow sampler capability from flow table properties layout. */
+	memset(in, 0, sizeof(in));
+	memset(out, 0, sizeof(out));
+	MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
+	MLX5_SET(query_hca_cap_in, in, op_mod,
+		 MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE |
+		 MLX5_HCA_CAP_OPMOD_GET_CUR);
+
+	rc = mlx5_glue->devx_general_cmd(ctx,
+					 in, sizeof(in),
+					 out, sizeof(out));
+	if (rc)
+		goto error;
+	status = MLX5_GET(query_hca_cap_out, out, status);
+	syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
+	if (status) {
+		DRV_LOG(DEBUG, "Failed to query devx HCA capabilities, "
+			"status %x, syndrome = %x",
+			status, syndrome);
+		attr->log_max_ft_sampler_num = 0;
+		return -1;
+	}
+	hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
+	attr->log_max_ft_sampler_num =
+			MLX5_GET(flow_table_nic_cap,
+			hcattr, flow_table_properties.log_max_ft_sampler_num);
+
 	/* Query HCA offloads for Ethernet protocol. */
 	memset(in, 0, sizeof(in));
 	memset(out, 0, sizeof(out));
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 25704ef..a9cfe6d 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -90,6 +90,7 @@ struct mlx5_hca_attr {
 	uint32_t vhca_id:16;
 	uint32_t relaxed_ordering_write:1;
 	uint32_t relaxed_ordering_read:1;
+	uint32_t log_max_ft_sampler_num:8;
 	struct mlx5_hca_qos_attr qos;
 	struct mlx5_hca_vdpa_attr vdpa;
 };
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index c63795f..e7d0a65 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -944,6 +944,7 @@ enum {
 	MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE = 0x0 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_ETHERNET_OFFLOAD_CAPS = 0x1 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP = 0xc << 1,
+	MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE = 0x7 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_VDPA_EMULATION = 0x13 << 1,
 };
 
@@ -1365,12 +1366,62 @@ struct mlx5_ifc_virtio_emulation_cap_bits {
 	u8 reserved_at_1c0[0x620];
 };
 
+struct mlx5_ifc_flow_table_prop_layout_bits {
+	u8 ft_support[0x1];
+	u8 flow_tag[0x1];
+	u8 flow_counter[0x1];
+	u8 flow_modify_en[0x1];
+	u8 modify_root[0x1];
+	u8 identified_miss_table[0x1];
+	u8 flow_table_modify[0x1];
+	u8 reformat[0x1];
+	u8 decap[0x1];
+	u8 reset_root_to_default[0x1];
+	u8 pop_vlan[0x1];
+	u8 push_vlan[0x1];
+	u8 fpga_vendor_acceleration[0x1];
+	u8 pop_vlan_2[0x1];
+	u8 push_vlan_2[0x1];
+	u8 reformat_and_vlan_action[0x1];
+	u8 modify_and_vlan_action[0x1];
+	u8 sw_owner[0x1];
+	u8 reformat_l3_tunnel_to_l2[0x1];
+	u8 reformat_l2_to_l3_tunnel[0x1];
+	u8 reformat_and_modify_action[0x1];
+	u8 reserved_at_15[0x9];
+	u8 sw_owner_v2[0x1];
+	u8 reserved_at_1f[0x1];
+	u8 reserved_at_20[0x2];
+	u8 log_max_ft_size[0x6];
+	u8 log_max_modify_header_context[0x8];
+	u8 max_modify_header_actions[0x8];
+	u8 max_ft_level[0x8];
+	u8 reserved_at_40[0x8];
+	u8 log_max_ft_sampler_num[0x8];
+	u8 metadata_reg_b_width[0x8];
+	u8 metadata_reg_a_width[0x8];
+	u8 reserved_at_60[0x18];
+	u8 log_max_ft_num[0x8];
+	u8 reserved_at_80[0x10];
+	u8 log_max_flow_counter[0x8];
+	u8 log_max_destination[0x8];
+	u8 reserved_at_a0[0x18];
+	u8 log_max_flow[0x8];
+	u8 reserved_at_c0[0x140];
+};
+
+struct mlx5_ifc_flow_table_nic_cap_bits {
+	u8	   reserved_at_0[0x200];
+	struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties;
+};
+
 union mlx5_ifc_hca_cap_union_bits {
 	struct mlx5_ifc_cmd_hca_cap_bits cmd_hca_cap;
 	struct mlx5_ifc_per_protocol_networking_offload_caps_bits
 	       per_protocol_networking_offload_caps;
 	struct mlx5_ifc_qos_cap_bits qos_cap;
 	struct mlx5_ifc_virtio_emulation_cap_bits vdpa_caps;
+	struct mlx5_ifc_flow_table_nic_cap_bits flow_table_nic_cap;
 	u8 reserved_at_0[0x8000];
 };
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH 4/7] net/mlx5: add the validate sample action
  2020-07-02 17:43 ` [dpdk-dev] [PATCH 0/7] support the flow-based traffic sampling Jiawei Wang
                     ` (2 preceding siblings ...)
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 3/7] common/mlx5: query sampler object capability via DevX Jiawei Wang
@ 2020-07-02 17:43   ` Jiawei Wang
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 5/7] net/mlx5: split sample flow into two sub flows Jiawei Wang
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 17:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add the sample action validation function.

The sample flow is supported in the NIC-RX and FDB domains; in NIC-RX
it must include a dest TIR action.

Only NIC-RX supports additional optional actions. FDB doesn't
support any optional action; the sampled packets always go
to the e-switch manager port.
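
The domain rules above can be summarised in a few lines of C. This is a
simplified sketch, not the driver code: sample_ctx and
sample_domain_check() are hypothetical, the attribute bits are reduced
to plain flags, and a flow that is neither ingress, egress nor transfer
is simply rejected.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the flow attributes. */
struct sample_ctx {
	unsigned int ingress:1;
	unsigned int egress:1;
	unsigned int transfer:1;    /* FDB (E-Switch) domain */
	unsigned int has_queue:1;   /* sub-action list contains a QUEUE */
	unsigned int n_sub_actions; /* number of sample sub-actions */
	uint32_t ratio;             /* must be >= 1 */
};

/* Return 0 if the combination is accepted, a negative errno otherwise. */
static int
sample_domain_check(const struct sample_ctx *c)
{
	if (c->ratio == 0)
		return -EINVAL;
	if (c->transfer) /* FDB: no optional sub-actions allowed */
		return c->n_sub_actions == 0 ? 0 : -ENOTSUP;
	if (c->ingress)  /* NIC-RX: a destination queue is mandatory */
		return c->has_queue ? 0 : -EINVAL;
	return -ENOTSUP; /* egress-only flows are not supported */
}
```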

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
 drivers/net/mlx5/linux/mlx5_os.c |  14 +++++
 drivers/net/mlx5/mlx5.h          |   1 +
 drivers/net/mlx5/mlx5_flow.h     |   1 +
 drivers/net/mlx5/mlx5_flow_dv.c  | 133 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 149 insertions(+)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 2dc57b2..6dfacf2 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -878,6 +878,20 @@
 			}
 		}
 #endif
+#if defined(HAVE_MLX5DV_DR) && defined(HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE)
+		if (config.hca_attr.log_max_ft_sampler_num > 0  &&
+		    config.dv_flow_en) {
+			priv->sampler_en = 1;
+			DRV_LOG(DEBUG, "The Sampler enabled!");
+		} else {
+			priv->sampler_en = 0;
+			if (!config.hca_attr.log_max_ft_sampler_num)
+				DRV_LOG(WARNING, "No available register for"
+						" Sampler.");
+			else
+				DRV_LOG(DEBUG, "DV flow is not supported!");
+		}
+#endif
 	}
 	if (config.mprq.enabled && mprq) {
 		if (config.mprq.stride_num_n &&
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 46e66eb..6790738 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -617,6 +617,7 @@ struct mlx5_priv {
 	unsigned int counter_fallback:1; /* Use counter fallback management. */
 	unsigned int mtr_en:1; /* Whether support meter. */
 	unsigned int mtr_reg_share:1; /* Whether support meter REG_C share. */
+	unsigned int sampler_en:1; /* Whether support sampler. */
 	uint16_t domain_id; /* Switch domain identifier. */
 	uint16_t vport_id; /* Associated VF vport index (if any). */
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 43cbda8..45a073c 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -202,6 +202,7 @@ enum mlx5_feature_name {
 #define MLX5_FLOW_ACTION_SET_IPV6_DSCP (1ull << 33)
 #define MLX5_FLOW_ACTION_AGE (1ull << 34)
 #define MLX5_FLOW_ACTION_DEFAULT_MISS (1ull << 35)
+#define MLX5_FLOW_ACTION_SAMPLE (1ull << 36)
 
 #define MLX5_FLOW_FATE_ACTIONS \
 	(MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE | \
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 0bd1c99..002e075 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -3958,6 +3958,130 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Validate the sample action.
+ *
+ * @param[in] action_flags
+ *   Holds the actions detected until now.
+ * @param[in] action
+ *   Pointer to the sample action.
+ * @param[in] dev
+ *   Pointer to the Ethernet device structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this action.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_action_sample(uint64_t action_flags,
+			      const struct rte_flow_action *action,
+			      struct rte_eth_dev *dev,
+			      const struct rte_flow_attr *attr,
+			      struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_config *dev_conf = &priv->config;
+	const struct rte_flow_action_sample *sample = action->conf;
+	const struct rte_flow_action *act;
+	uint64_t sub_action_flags = 0;
+	int actions_n = 0;
+	int ret;
+
+	if (!attr->group)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
+					  NULL, "root table is not supported");
+	if (!priv->config.devx || !priv->sampler_en)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, "sample action not supported");
+	if (!sample)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "configuration cannot be null");
+	if (sample->ratio == 0)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "ratio value starts from 1");
+	if (action_flags & MLX5_FLOW_ACTION_SAMPLE)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "Duplicate sample actions set");
+	if (action_flags & MLX5_FLOW_ACTION_METER)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "wrong action order, meter should "
+					  "be after sample action");
+	for (act = sample->actions;
+	     act->type != RTE_FLOW_ACTION_TYPE_END; act++) {
+		if (actions_n == MLX5_DV_MAX_NUMBER_OF_ACTIONS)
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  act, "too many actions");
+		switch (act->type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			ret = mlx5_flow_validate_action_queue(act,
+							      sub_action_flags,
+							      dev,
+							      attr, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			++actions_n;
+			break;
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			ret = flow_dv_validate_action_mark(dev, act,
+							   sub_action_flags,
+							   attr, error);
+			if (ret < 0)
+				return ret;
+			if (dev_conf->dv_xmeta_en != MLX5_XMETA_MODE_LEGACY)
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK |
+						MLX5_FLOW_ACTION_MARK_EXT;
+			else
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK;
+			++actions_n;
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			ret = flow_dv_validate_action_count(dev, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_COUNT;
+			++actions_n;
+			break;
+		default:
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "Doesn't support optional "
+						  "action");
+		}
+	}
+	if (attr->ingress && !attr->transfer) {
+		if (!(sub_action_flags & MLX5_FLOW_ACTION_QUEUE))
+			return rte_flow_error_set(error, EINVAL,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "Ingress must have a dest "
+						  "QUEUE for Sample");
+	} else if (attr->egress && !attr->transfer) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL,
+					  "Sample only supports Ingress "
+					  "or E-Switch");
+	} else if (sample->actions->type != RTE_FLOW_ACTION_TYPE_END) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "E-Switch doesn't support any "
+					  "optional action for sampling");
+	}
+	return 0;
+}
+
+/**
  * Find existing modify-header resource or create and register a new one.
  *
  * @param dev[in, out]
@@ -5591,6 +5715,15 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			rw_act_num += MLX5_ACT_NUM_SET_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			ret = flow_dv_validate_action_sample(action_flags,
+							     actions, dev,
+							     attr, error);
+			if (ret < 0)
+				return ret;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			++actions_n;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ACTION,
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH 5/7] net/mlx5: split sample flow into two sub flows
  2020-07-02 17:43 ` [dpdk-dev] [PATCH 0/7] support the flow-based traffic sampling Jiawei Wang
                     ` (3 preceding siblings ...)
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 4/7] net/mlx5: add the validate sample action Jiawei Wang
@ 2020-07-02 17:43   ` Jiawei Wang
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 6/7] net/mlx5: update translate function for sample action Jiawei Wang
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 7/7] app/testpmd: add testpmd command " Jiawei Wang
  6 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 17:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add the sampler action resource struct definitions.

A flow with a sample action will be split into two sub flows:
the prefix flow with the sample action, and the suffix flow with the
remaining actions.

For the prefix flow, add an extra tag action that writes a unique id
to a metadata register; the suffix flow adds an extra tag item
to match that unique id.
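
The split can be pictured with a toy model. The sketch below is only an
illustration under assumptions: the types act, split_flow and the
function split_sample_flow() are hypothetical, and it assumes that
actions preceding the sample action stay in the prefix flow while
everything after it moves to the suffix flow.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced action model (not the rte_flow types). */
enum act_type { ACT_END, ACT_COUNT, ACT_SAMPLE, ACT_TAG, ACT_QUEUE };

struct act {
	enum act_type type;
	uint32_t value; /* tag id for ACT_TAG, queue index for ACT_QUEUE */
};

#define MAX_ACTS 16

struct split_flow {
	struct act prefix[MAX_ACTS]; /* up to and including SAMPLE, plus TAG */
	struct act suffix[MAX_ACTS]; /* the remaining actions */
	uint32_t suffix_match_tag;   /* tag item the suffix flow matches on */
};

/* Return 0 on success, -1 if there is no SAMPLE or the list is too long. */
static int
split_sample_flow(const struct act *acts, uint32_t tag_id,
		  struct split_flow *out)
{
	size_t i, p = 0, s = 0;
	int seen = 0;

	for (i = 0; acts[i].type != ACT_END; i++) {
		/* Leave room for the extra TAG and the END terminators. */
		if (p >= MAX_ACTS - 2 || s >= MAX_ACTS - 1)
			return -1;
		if (seen) {
			out->suffix[s++] = acts[i];
			continue;
		}
		out->prefix[p++] = acts[i];
		if (acts[i].type == ACT_SAMPLE) {
			/* Tag the packet so the suffix flow can match it. */
			out->prefix[p++] = (struct act){ ACT_TAG, tag_id };
			seen = 1;
		}
	}
	if (!seen)
		return -1;
	out->prefix[p] = (struct act){ ACT_END, 0 };
	out->suffix[s] = (struct act){ ACT_END, 0 };
	out->suffix_match_tag = tag_id;
	return 0;
}
```

Because the suffix flow matches on the same unique id that the prefix
flow writes, only packets that went through this particular prefix flow
reach the remaining actions.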

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 drivers/net/mlx5/mlx5.c      |  11 ++
 drivers/net/mlx5/mlx5.h      |   3 +
 drivers/net/mlx5/mlx5_flow.c | 258 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_flow.h |  36 ++++++
 4 files changed, 304 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 07c6add..db55545 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -241,6 +241,17 @@ static LIST_HEAD(, mlx5_dev_ctx_shared) mlx5_dev_ctx_list =
 		.free = rte_free,
 		.type = "mlx5_jump_ipool",
 	},
+	{
+		.size = sizeof(struct mlx5_flow_dv_sample_resource),
+		.trunk_size = 64,
+		.grow_trunk = 3,
+		.grow_shift = 2,
+		.need_lock = 0,
+		.release_mem_en = 1,
+		.malloc = rte_malloc_socket,
+		.free = rte_free,
+		.type = "mlx5_sample_ipool",
+	},
 #endif
 	{
 		.size = sizeof(struct mlx5_flow_meter),
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 6790738..756bd68 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -51,6 +51,7 @@ enum mlx5_ipool_index {
 	MLX5_IPOOL_TAG, /* Pool for tag resource. */
 	MLX5_IPOOL_PORT_ID, /* Pool for port id resource. */
 	MLX5_IPOOL_JUMP, /* Pool for jump resource. */
+	MLX5_IPOOL_SAMPLE, /* Pool for sample resource. */
 #endif
 	MLX5_IPOOL_MTR, /* Pool for meter resource. */
 	MLX5_IPOOL_MCP, /* Pool for metadata resource. */
@@ -518,6 +519,7 @@ struct mlx5_flow_tbl_resource {
 /* Tables for metering splits should be added here. */
 #define MLX5_MAX_TABLES_EXTERNAL (MLX5_MAX_TABLES - 3)
 #define MLX5_MAX_TABLES_FDB UINT16_MAX
+#define MLX5_FLOW_TABLE_FACTOR 10
 
 /* ID generation structure. */
 struct mlx5_flow_id_pool {
@@ -566,6 +568,7 @@ struct mlx5_dev_ctx_shared {
 	struct mlx5_hlist *tag_table;
 	uint32_t port_id_action_list; /* List of port ID actions. */
 	uint32_t push_vlan_action_list; /* List of push VLAN actions. */
+	uint32_t sample_action_list; /* List of sample actions. */
 	struct mlx5_flow_counter_mng cmng; /* Counters management structure. */
 	struct mlx5_flow_default_miss_resource default_miss;
 	/* Default miss action resource structure. */
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index ae5ccc2..7ed9ba3 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -3917,6 +3917,139 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	return 0;
 }
 
+
+/**
+ * Check whether the action list contains an action of the given type.
+ *
+ * @param[in] actions
+ *   Pointer to the list of actions.
+ * @param[in] action
+ *   The action type to look for.
+ *
+ * @return
+ *   > 0 the total number of actions, including the END action.
+ *   0 if no matching action is found in the action list.
+ */
+static int
+flow_check_match_action(const struct rte_flow_action actions[],
+					enum rte_flow_action_type action)
+{
+	int actions_n = 0;
+	int flag = 0;
+
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		if (actions->type == action)
+			flag = 1;
+		actions_n++;
+	}
+	/* Count RTE_FLOW_ACTION_TYPE_END. */
+	return flag ? actions_n + 1 : 0;
+}
+
+/**
+ * Split the sample flow.
+ *
+ * A sample flow is split into two sub flows: the prefix flow keeps the
+ * sample action, and the other actions move to the new suffix flow.
+ *
+ * For non-transfer flows, a tag action with a unique tag id is also added
+ * to the prefix flow, and the suffix flow matches on the same tag id.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[out] sfx_items
+ *   Suffix flow match items (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] actions_sfx
+ *   Suffix flow actions.
+ * @param[out] actions_pre
+ *   Prefix flow actions.
+ *
+ * @return
+ *   Unique tag id for a non-transfer flow, 0 otherwise.
+ */
+static int
+flow_sample_split_prep(struct rte_eth_dev *dev,
+		 const struct rte_flow_attr *attr,
+		 struct rte_flow_item sfx_items[],
+		 const struct rte_flow_action actions[],
+		 struct rte_flow_action actions_sfx[],
+		 struct rte_flow_action actions_pre[])
+{
+	struct mlx5_rte_flow_action_set_tag *set_tag;
+	struct mlx5_rte_flow_item_tag *tag_spec;
+	struct mlx5_rte_flow_item_tag *tag_mask;
+	struct rte_flow_item *tag_item;
+	struct rte_flow_action *tag_action = NULL;
+	bool pre_sample = true;
+	struct rte_flow_error error;
+	uint32_t tag_id = 0;
+
+	/* Prepare the actions for prefix and suffix flow. */
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		struct rte_flow_action **action_cur = NULL;
+
+		switch (actions->type) {
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			if (!attr->transfer) {
+				/* Add the extra tag action first for NIC-RX. */
+				tag_action = actions_pre;
+				tag_action->type = (enum rte_flow_action_type)
+						MLX5_RTE_FLOW_ACTION_TYPE_TAG;
+				actions_pre++;
+			}
+			break;
+		case RTE_FLOW_ACTION_TYPE_JUMP:
+		case RTE_FLOW_ACTION_TYPE_METER:
+			action_cur = &actions_sfx;
+			break;
+		default:
+			break;
+		}
+		if (pre_sample && !action_cur)
+			action_cur = &actions_pre;
+		else
+			action_cur = &actions_sfx;
+		memcpy(*action_cur, actions, sizeof(struct rte_flow_action));
+		(*action_cur)++;
+		if (actions->type == RTE_FLOW_ACTION_TYPE_SAMPLE)
+			pre_sample = false;
+	}
+	/* Add end action to the actions. */
+	actions_sfx->type = RTE_FLOW_ACTION_TYPE_END;
+	actions_pre->type = RTE_FLOW_ACTION_TYPE_END;
+	if (!attr->transfer) {
+		actions_pre++;
+		/* Set the tag. */
+		set_tag = (void *)actions_pre;
+		set_tag->id = mlx5_flow_get_reg_id(dev, MLX5_APP_TAG,
+						   0, &error);
+		tag_id = flow_qrss_get_id(dev);
+		set_tag->data = tag_id;
+		assert(tag_action);
+		tag_action->conf = set_tag;
+		/* Prepare the suffix subflow items. */
+		tag_item = sfx_items++;
+		sfx_items->type = RTE_FLOW_ITEM_TYPE_END;
+		sfx_items++;
+		tag_spec = (struct mlx5_rte_flow_item_tag *)sfx_items;
+		tag_spec->data = tag_id;
+		tag_spec->id = set_tag->id;
+		tag_mask = tag_spec + 1;
+		tag_mask->data = UINT32_MAX;
+		tag_mask->id = UINT16_MAX;
+		tag_item->type = (enum rte_flow_item_type)
+				MLX5_RTE_FLOW_ITEM_TYPE_TAG;
+		tag_item->spec = tag_spec;
+		tag_item->last = NULL;
+		tag_item->mask = tag_mask;
+	}
+	return tag_id;
+}
+
 /**
  * The splitting for metadata feature.
  *
@@ -4176,6 +4309,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 static int
 flow_create_split_meter(struct rte_eth_dev *dev,
 			   struct rte_flow *flow,
+			   uint64_t prefix_layers,
 			   const struct rte_flow_attr *attr,
 			   const struct rte_flow_item items[],
 			   const struct rte_flow_action actions[],
@@ -4222,8 +4356,9 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 			goto exit;
 		}
 		/* Add the prefix subflow. */
-		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
-					      items, pre_actions, external,
+		ret = flow_create_split_inner(dev, flow, &dev_flow,
+					      prefix_layers, attr, items,
+					      pre_actions, external,
 					      flow_idx, error);
 		if (ret) {
 			ret = -rte_errno;
@@ -4238,7 +4373,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	/* Add the prefix subflow. */
 	ret = flow_create_split_metadata(dev, flow, dev_flow ?
 					 flow_get_prefix_layer_flags(dev_flow) :
-					 0, &sfx_attr,
+					 prefix_layers, &sfx_attr,
 					 sfx_items ? sfx_items : items,
 					 sfx_actions ? sfx_actions : actions,
 					 external, flow_idx, error);
@@ -4249,6 +4384,121 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 }
 
 /**
+ * The splitting for sample feature.
+ *
+ * The sample flow is split into two flows: the prefix flow and
+ * the suffix flow.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] flow
+ *   Parent flow structure pointer.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] items
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[in] external
+ *   This flow rule is created by request external to PMD.
+ * @param[in] flow_idx
+ *   This memory pool index to the flow.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ * @return
+ *   0 on success, negative value otherwise
+ */
+static int
+flow_create_split_sample(struct rte_eth_dev *dev,
+			   struct rte_flow *flow,
+			   const struct rte_flow_attr *attr,
+			   const struct rte_flow_item items[],
+			   const struct rte_flow_action actions[],
+			   bool external, uint32_t flow_idx,
+			   struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct rte_flow_action *sfx_actions = NULL;
+	struct rte_flow_action *pre_actions = NULL;
+	struct rte_flow_item *sfx_items = NULL;
+	struct mlx5_flow *dev_flow = NULL;
+	struct rte_flow_attr sfx_attr = *attr;
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	struct mlx5_flow_tbl_data_entry *sfx_tbl_data;
+	struct mlx5_flow_tbl_resource *sfx_tbl;
+	union mlx5_flow_tbl_key sfx_table_key;
+#endif
+	size_t act_size;
+	size_t item_size;
+	uint32_t tag_id = 0;
+	int actions_n = 0;
+	int ret = 0;
+
+	if (priv->sampler_en)
+		actions_n = flow_check_match_action(actions,
+					RTE_FLOW_ACTION_TYPE_SAMPLE);
+	if (actions_n) {
+		/* The prefix actions must include sample, tag, end. */
+		act_size = sizeof(struct rte_flow_action) * (actions_n * 2) +
+			   sizeof(struct mlx5_rte_flow_action_set_tag);
+		/* tag, end. */
+#define SAMPLE_SUFFIX_ITEM 2
+		item_size = sizeof(struct rte_flow_item) * SAMPLE_SUFFIX_ITEM +
+			    sizeof(struct mlx5_rte_flow_item_tag) * 2;
+		sfx_actions = rte_zmalloc(__func__, (act_size + item_size), 0);
+		if (!sfx_actions)
+			return rte_flow_error_set(error, ENOMEM,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL, "no memory to split "
+						  "sample flow");
+		if (!attr->transfer)
+			sfx_items = (struct rte_flow_item *)((char *)sfx_actions
+					+ act_size);
+		pre_actions = sfx_actions + actions_n;
+		tag_id = flow_sample_split_prep(dev, attr, sfx_items,
+						   actions, sfx_actions,
+						   pre_actions);
+		if (!attr->transfer && !tag_id) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		/* Add the prefix subflow. */
+		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
+					      items, pre_actions, external,
+					      flow_idx, error);
+		if (ret) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		dev_flow->handle->split_flow_id = tag_id;
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+		/* Set the sfx group attr. */
+		sample_res = (struct mlx5_flow_dv_sample_resource *)
+					dev_flow->dv.sample_res;
+		sfx_tbl = (struct mlx5_flow_tbl_resource *)
+					sample_res->normal_path_tbl;
+		sfx_tbl_data = container_of(sfx_tbl,
+					struct mlx5_flow_tbl_data_entry, tbl);
+		sfx_table_key.v64 = sfx_tbl_data->entry.key;
+		sfx_attr.group = sfx_attr.transfer ?
+					(sfx_table_key.table_id - 1) :
+					sfx_table_key.table_id;
+#endif
+	}
+	/* Add the suffix subflow. */
+	ret = flow_create_split_meter(dev, flow, dev_flow ?
+				 flow_get_prefix_layer_flags(dev_flow) : 0,
+				 &sfx_attr, sfx_items ? sfx_items : items,
+				 sfx_actions ? sfx_actions : actions,
+				 external, flow_idx, error);
+exit:
+	if (sfx_actions)
+		rte_free(sfx_actions);
+	return ret;
+}
+
+/**
  * Split the flow to subflow set. The splitters might be linked
  * in the chain, like this:
  * flow_create_split_outer() calls:
@@ -4296,7 +4546,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 {
 	int ret;
 
-	ret = flow_create_split_meter(dev, flow, attr, items,
+	ret = flow_create_split_sample(dev, flow, attr, items,
 					 actions, external, flow_idx, error);
 	MLX5_ASSERT(ret <= 0);
 	return ret;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 45a073c..51826f8 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -502,6 +502,38 @@ struct mlx5_flow_tbl_data_entry {
 	uint32_t idx; /**< index for the indexed mempool. */
 };
 
+/* Sub rdma-core actions list. */
+struct mlx5_flow_sub_actions_list {
+	uint32_t actions_num; /**< Number of sample actions. */
+	uint64_t action_flags;
+	void *dr_queue_action;
+	void *dr_tag_action;
+	void *dr_cnt_action;
+};
+
+/* Sample sub-actions resource list. */
+struct mlx5_flow_sub_actions_idx {
+	uint32_t rix_hrxq; /**< Hash Rx queue object index. */
+	uint32_t rix_tag; /**< Index to the tag action. */
+	uint32_t cnt;
+};
+
+/* Sample action resource structure. */
+struct mlx5_flow_dv_sample_resource {
+	ILIST_ENTRY(uint32_t)next; /**< Pointer to next element. */
+	rte_atomic32_t refcnt; /**< Reference counter. */
+	void *verbs_action; /**< Verbs sample action object. */
+	uint8_t ft_type; /**< Flow table type. */
+	uint32_t ft_id; /**< Flow table level. */
+	void *normal_path_tbl; /**< Flow table pointer. */
+	void *default_miss; /**< default_miss dr_action. */
+	uint32_t ratio; /**< Sample ratio. */
+	struct mlx5_flow_sub_actions_idx sample_idx;
+	/**< Action index resources. */
+	struct mlx5_flow_sub_actions_list sample_act;
+	/**< Action resources. */
+};
+
 /* Verbs specification header. */
 struct ibv_spec_header {
 	enum ibv_flow_spec_type type;
@@ -530,6 +562,8 @@ struct mlx5_flow_handle_dv {
 	/**< Index to push VLAN action resource in cache. */
 	uint32_t rix_tag;
 	/**< Index to the tag action. */
+	uint32_t rix_sample;
+	/**< Index to sample action resource in cache. */
 } __rte_packed;
 
 /** Device flow handle structure: used both for creating & destroying. */
@@ -595,6 +629,8 @@ struct mlx5_flow_dv_workspace {
 	/**< Pointer to the jump action resource. */
 	struct mlx5_flow_dv_match_params value;
 	/**< Holds the value that the packet is compared to. */
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	/**< Pointer to the sample action resource. */
 };
 
 /*
-- 
1.8.3.1



* [dpdk-dev] [PATCH 6/7] net/mlx5: update translate function for sample action
  2020-07-02 17:43 ` [dpdk-dev] [PATCH 0/7] support the flow-based traffic sampling Jiawei Wang
                     ` (4 preceding siblings ...)
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 5/7] net/mlx5: split sample flow into two sub flows Jiawei Wang
@ 2020-07-02 17:43   ` Jiawei Wang
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 7/7] app/testpmd: add testpmd command " Jiawei Wang
  6 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 17:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Translate the attributes of the sample action, which include the sample
ratio and the sub-action list, then create the sample DR action.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
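The "find an existing resource or register a new one" pattern used for the
sample resource in this patch can be sketched as below. This is a
simplified illustration under stated assumptions, not the driver code:
`struct sample_res` and `sample_res_get()` are hypothetical, a plain linked
list and `calloc()` stand in for the indexed pool, the reference counter is
not atomic, and the comparison of the sub-action list is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for mlx5_flow_dv_sample_resource. */
struct sample_res {
	struct sample_res *next;
	uint32_t refcnt;
	uint32_t ratio;
	uint32_t ft_id;
	uint8_t ft_type;
};

/*
 * Return a cached resource with the same key and take a reference on it,
 * or allocate a new one, link it at the head of the list and return it.
 * Returns NULL on allocation failure.
 */
static struct sample_res *
sample_res_get(struct sample_res **list, uint32_t ratio,
	       uint8_t ft_type, uint32_t ft_id)
{
	struct sample_res *res;

	assert(list != NULL);
	/* Look up a matching resource in the cache. */
	for (res = *list; res != NULL; res = res->next) {
		if (res->ratio == ratio && res->ft_type == ft_type &&
		    res->ft_id == ft_id) {
			res->refcnt++;
			return res;
		}
	}
	/* No match: register a new resource holding one reference. */
	res = calloc(1, sizeof(*res));
	if (res == NULL)
		return NULL;
	res->ratio = ratio;
	res->ft_type = ft_type;
	res->ft_id = ft_id;
	res->refcnt = 1;
	res->next = *list;
	*list = res;
	return res;
}
```

Two flows that request the same ratio, table type and table level share
one resource with a reference count of 2, while a different ratio yields
a separate resource.
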
 drivers/net/mlx5/mlx5_flow.c    |  16 +-
 drivers/net/mlx5/mlx5_flow.h    |  14 +-
 drivers/net/mlx5/mlx5_flow_dv.c | 494 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 502 insertions(+), 22 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 7ed9ba3..c91ae7d 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -4612,10 +4612,14 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	int hairpin_flow;
 	uint32_t hairpin_id = 0;
 	struct rte_flow_attr attr_tx = { .priority = 0 };
+	struct rte_flow_attr attr_factor = {0};
 	int ret;
 
-	hairpin_flow = flow_check_hairpin_split(dev, attr, actions);
-	ret = flow_drv_validate(dev, attr, items, p_actions_rx,
+	memcpy((void *)&attr_factor, (const void *)attr, sizeof(*attr));
+	if (external)
+		attr_factor.group *= MLX5_FLOW_TABLE_FACTOR;
+	hairpin_flow = flow_check_hairpin_split(dev, &attr_factor, actions);
+	ret = flow_drv_validate(dev, &attr_factor, items, p_actions_rx,
 				external, hairpin_flow, error);
 	if (ret < 0)
 		return 0;
@@ -4634,7 +4638,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 		rte_errno = ENOMEM;
 		goto error_before_flow;
 	}
-	flow->drv_type = flow_get_drv_type(dev, attr);
+	flow->drv_type = flow_get_drv_type(dev, &attr_factor);
 	if (hairpin_id != 0)
 		flow->hairpin_flow_id = hairpin_id;
 	MLX5_ASSERT(flow->drv_type > MLX5_FLOW_TYPE_MIN &&
@@ -4680,7 +4684,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 		 * depending on configuration. In the simplest
 		 * case it just creates unmodified original flow.
 		 */
-		ret = flow_create_split_outer(dev, flow, attr,
+		ret = flow_create_split_outer(dev, flow, &attr_factor,
 					      buf->entry[i].pattern,
 					      p_actions_rx, external, idx,
 					      error);
@@ -4717,8 +4721,8 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	 * the egress Flows belong to the different device and
 	 * copy table should be updated in peer NIC Rx domain.
 	 */
-	if (attr->ingress &&
-	    (external || attr->group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
+	if (attr_factor.ingress &&
+	    (external || attr_factor.group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
 		ret = flow_mreg_update_copy_table(dev, flow, actions, error);
 		if (ret)
 			goto error;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 51826f8..99e900b 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -372,6 +372,13 @@ enum mlx5_flow_fate_type {
 	MLX5_FLOW_FATE_MAX,
 };
 
+/*
+ * Max number of actions per DV flow.
+ * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
+ * in rdma-core file providers/mlx5/verbs.c.
+ */
+#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
+
 /* Matcher PRM representation */
 struct mlx5_flow_dv_match_params {
 	size_t size;
@@ -604,13 +611,6 @@ struct mlx5_flow_handle {
 #define MLX5_FLOW_HANDLE_VERBS_SIZE (sizeof(struct mlx5_flow_handle))
 #endif
 
-/*
- * Max number of actions per DV flow.
- * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
- * in rdma-core file providers/mlx5/verbs.c.
- */
-#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
-
 /** Device flow structure only for DV flow creation. */
 struct mlx5_flow_dv_workspace {
 	uint32_t group; /**< The group index. */
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 002e075..3d0eaed 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -82,6 +82,10 @@
 static int
 flow_dv_default_miss_resource_release(struct rte_eth_dev *dev);
 
+static int
+flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
+				      uint32_t encap_decap_idx);
+
 /**
  * Initialize flow attributes structure according to flow items' types.
  *
@@ -7955,6 +7959,373 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Create an Rx Hash queue.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] rss_desc
+ *   Pointer to the mlx5_flow_rss_desc.
+ * @param[out] hrxq_idx
+ *   Hash Rx queue index.
+ *
+ * @return
+ *   The Verbs/DevX object initialised, NULL otherwise and rte_errno is set.
+ */
+static struct mlx5_hrxq *
+flow_dv_handle_rx_queue(struct rte_eth_dev *dev,
+			  struct mlx5_flow *dev_flow,
+			  struct mlx5_flow_rss_desc *rss_desc,
+			  uint32_t *hrxq_idx)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_flow_handle *dh = dev_flow->handle;
+	struct mlx5_hrxq *hrxq;
+
+	MLX5_ASSERT(rss_desc->queue_num);
+	*hrxq_idx = mlx5_hrxq_get(dev, rss_desc->key,
+				 MLX5_RSS_HASH_KEY_LEN,
+				 dev_flow->hash_fields,
+				 rss_desc->queue,
+				 rss_desc->queue_num);
+	if (!*hrxq_idx) {
+		*hrxq_idx = mlx5_hrxq_new
+				(dev, rss_desc->key,
+				MLX5_RSS_HASH_KEY_LEN,
+				dev_flow->hash_fields,
+				rss_desc->queue,
+				rss_desc->queue_num,
+				!!(dh->layers &
+				MLX5_FLOW_LAYER_TUNNEL));
+		if (!*hrxq_idx)
+			return NULL;
+	}
+	hrxq = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_HRXQ],
+			      *hrxq_idx);
+	return hrxq;
+}
+
+/**
+ * Find existing sample resource or create and register a new one.
+ *
+ * @param[in, out] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[in] resource
+ *   Pointer to sample resource.
+ * @param[in, out] dev_flow
+ *   Pointer to the dev_flow.
+ * @param[in, out] sample_dv_actions
+ *   Pointer to sample actions list.
+ * @param[out] error
+ *   pointer to error structure.
+ *
+ * @return
+ *   0 on success, otherwise a negative errno value and rte_errno is set.
+ */
+static int
+flow_dv_sample_resource_register(struct rte_eth_dev *dev,
+			 const struct rte_flow_attr *attr,
+			 struct mlx5_flow_dv_sample_resource *resource,
+			 struct mlx5_flow *dev_flow,
+			 void **sample_dv_actions,
+			 struct rte_flow_error *error)
+{
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+	struct mlx5dv_dr_flow_sampler_attr sampler_attr;
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_ctx_shared *sh = priv->sh;
+	struct mlx5_flow_tbl_resource *tbl;
+	uint32_t idx = 0;
+	const uint32_t next_ft_step = 1;
+	uint32_t next_ft_id = resource->ft_id +	next_ft_step;
+
+	/* Lookup a matching resource from cache. */
+	ILIST_FOREACH(sh->ipool[MLX5_IPOOL_SAMPLE], sh->sample_action_list,
+		      idx, cache_resource, next) {
+		if (resource->ratio == cache_resource->ratio &&
+		    resource->ft_type == cache_resource->ft_type &&
+		    resource->ft_id == cache_resource->ft_id &&
+		    !memcmp((void *)&resource->sample_act,
+			    (void *)&cache_resource->sample_act,
+			    sizeof(struct mlx5_flow_sub_actions_list))) {
+			DRV_LOG(DEBUG, "sample resource %p: refcnt %d++",
+				(void *)cache_resource,
+				rte_atomic32_read(&cache_resource->refcnt));
+			rte_atomic32_inc(&cache_resource->refcnt);
+			dev_flow->handle->dvh.rix_sample = idx;
+			dev_flow->dv.sample_res = cache_resource;
+			return 0;
+		}
+	}
+	/* Register new sample resource. */
+	cache_resource = mlx5_ipool_zmalloc(sh->ipool[MLX5_IPOOL_SAMPLE],
+				       &dev_flow->handle->dvh.rix_sample);
+	if (!cache_resource)
+		return rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "cannot allocate resource memory");
+	*cache_resource = *resource;
+	/* Create normal path table level */
+	tbl = flow_dv_tbl_resource_get(dev, next_ft_id,
+					attr->egress, attr->transfer, error);
+	if (!tbl) {
+		rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "fail to create normal path table "
+					  "for sample");
+		goto error;
+	}
+	cache_resource->normal_path_tbl = tbl;
+	if (resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		cache_resource->default_miss =
+				mlx5_glue->dr_create_flow_action_default_miss();
+		if (!cache_resource->default_miss) {
+			rte_flow_error_set(error, ENOMEM,
+						RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+						NULL,
+						"cannot create default miss "
+						"action");
+			goto error;
+		}
+		sample_dv_actions[resource->sample_act.actions_num++] =
+						cache_resource->default_miss;
+	}
+	/* Create a DR sample action */
+	sampler_attr.sample_ratio = cache_resource->ratio;
+	sampler_attr.default_next_table = tbl->obj;
+	sampler_attr.num_sample_actions = resource->sample_act.actions_num;
+	sampler_attr.sample_actions = (struct mlx5dv_dr_action **)
+							&sample_dv_actions[0];
+	cache_resource->verbs_action =
+		mlx5_glue->dr_create_flow_action_sampler(&sampler_attr);
+	if (!cache_resource->verbs_action) {
+		rte_flow_error_set(error, ENOMEM,
+					RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					NULL, "cannot create sample action");
+		goto error;
+	}
+	rte_atomic32_init(&cache_resource->refcnt);
+	rte_atomic32_inc(&cache_resource->refcnt);
+	ILIST_INSERT(sh->ipool[MLX5_IPOOL_SAMPLE], &sh->sample_action_list,
+		     dev_flow->handle->dvh.rix_sample, cache_resource,
+		     next);
+	dev_flow->dv.sample_res = cache_resource;
+	DRV_LOG(DEBUG, "new sample resource %p: refcnt %d++",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	return 0;
+error:
+	if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		if (cache_resource->default_miss)
+			claim_zero(mlx5_glue->destroy_flow_action
+				(cache_resource->default_miss));
+	} else {
+		if (cache_resource->sample_idx.rix_hrxq &&
+		    !mlx5_hrxq_release(dev,
+				cache_resource->sample_idx.rix_hrxq))
+			cache_resource->sample_idx.rix_hrxq = 0;
+		if (cache_resource->sample_idx.rix_tag &&
+		    !flow_dv_tag_release(dev,
+				cache_resource->sample_idx.rix_tag))
+			cache_resource->sample_idx.rix_tag = 0;
+		if (cache_resource->sample_idx.cnt) {
+			flow_dv_counter_release(dev,
+				cache_resource->sample_idx.cnt);
+			cache_resource->sample_idx.cnt = 0;
+		}
+	}
+	if (cache_resource->normal_path_tbl)
+		flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	mlx5_ipool_free(sh->ipool[MLX5_IPOOL_SAMPLE],
+				dev_flow->handle->dvh.rix_sample);
+	dev_flow->handle->dvh.rix_sample = 0;
+	return -rte_errno;
+}
+
+/**
+ * Convert Sample action to DV specification.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] action
+ *   Pointer to action structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] sample_actions
+ *   Pointer to sample actions list.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_translate_action_sample(struct rte_eth_dev *dev,
+				const struct rte_flow_action *action,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				void **sample_actions,
+				struct mlx5_flow_dv_sample_resource *res,
+				struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	const struct rte_flow_action_sample *sample_action;
+	const struct rte_flow_action *sub_actions;
+	const struct rte_flow_action_queue *queue;
+	struct mlx5_flow_sub_actions_list *sample_act;
+	struct mlx5_flow_sub_actions_idx *sample_idx;
+	struct mlx5_flow_rss_desc *rss_desc = &((struct mlx5_flow_rss_desc *)
+					      priv->rss_desc)
+					      [!!priv->flow_nested_idx];
+	uint64_t action_flags = 0;
+
+	sample_act = &res->sample_act;
+	sample_idx = &res->sample_idx;
+	sample_action = (const struct rte_flow_action_sample *)action->conf;
+	res->ratio = sample_action->ratio;
+	sub_actions = sample_action->actions;
+	for (; sub_actions->type != RTE_FLOW_ACTION_TYPE_END; sub_actions++) {
+		int type = sub_actions->type;
+		uint32_t pre_rix = 0;
+		void *pre_r;
+		switch (type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+		{
+			struct mlx5_hrxq *hrxq;
+			uint32_t hrxq_idx;
+
+			queue = sub_actions->conf;
+			rss_desc->queue_num = 1;
+			rss_desc->queue[0] = queue->index;
+			hrxq = flow_dv_handle_rx_queue(dev, dev_flow,
+					rss_desc, &hrxq_idx);
+			if (!hrxq)
+				return rte_flow_error_set
+					(error, rte_errno,
+					 RTE_FLOW_ERROR_TYPE_ACTION,
+					 NULL,
+					 "cannot create fate queue");
+			sample_act->dr_queue_action = hrxq->action;
+			sample_idx->rix_hrxq = hrxq_idx;
+			sample_actions[sample_act->actions_num++] =
+						hrxq->action;
+			action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			if (action_flags & MLX5_FLOW_ACTION_MARK)
+				dev_flow->handle->rix_hrxq = hrxq_idx;
+			dev_flow->handle->fate_action =
+					MLX5_FLOW_FATE_QUEUE;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_MARK:
+		{
+			uint32_t tag_be = mlx5_flow_mark_set
+				(((const struct rte_flow_action_mark *)
+				(sub_actions->conf))->id);
+			dev_flow->handle->mark = 1;
+			pre_rix = dev_flow->handle->dvh.rix_tag;
+			/* Save the mark resource before sample */
+			pre_r = dev_flow->dv.tag_resource;
+			if (flow_dv_tag_resource_register(dev, tag_be,
+						  dev_flow, error))
+				return -rte_errno;
+			MLX5_ASSERT(dev_flow->dv.tag_resource);
+			sample_act->dr_tag_action =
+				dev_flow->dv.tag_resource->action;
+			sample_idx->rix_tag =
+				dev_flow->handle->dvh.rix_tag;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_tag_action;
+			/* Recover the mark resource after sample */
+			dev_flow->dv.tag_resource = pre_r;
+			dev_flow->handle->dvh.rix_tag = pre_rix;
+			action_flags |= MLX5_FLOW_ACTION_MARK;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+		{
+			uint32_t counter;
+
+			counter = flow_dv_translate_create_counter(dev,
+					dev_flow, sub_actions->conf, 0);
+			if (!counter)
+				return rte_flow_error_set
+						(error, rte_errno,
+						 RTE_FLOW_ERROR_TYPE_ACTION,
+						 NULL,
+						 "cannot create counter"
+						 " object.");
+			sample_idx->cnt = counter;
+			sample_act->dr_cnt_action =
+				  (flow_dv_counter_get_by_idx(dev,
+				  counter, NULL))->action;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_cnt_action;
+			action_flags |= MLX5_FLOW_ACTION_COUNT;
+			break;
+		}
+		default:
+			return rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				NULL,
+				"Not support for sampler action");
+		}
+	}
+	sample_act->action_flags = action_flags;
+	res->ft_id = dev_flow->dv.group;
+	if (attr->transfer)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_FDB;
+	else if (attr->ingress)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
+
+	return 0;
+}
+
+/**
+ * Create the DV sample action from the translated sample resource.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[in] sample_actions
+ *   Pointer to sample path actions list.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_create_action_sample(struct rte_eth_dev *dev,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				struct mlx5_flow_dv_sample_resource *res,
+				void **sample_actions,
+				struct rte_flow_error *error)
+{
+	if (flow_dv_sample_resource_register(dev, attr, res, dev_flow,
+						sample_actions, error))
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL, "can't create sample action");
+	return 0;
+}
+
+/**
  * Fill the flow with DV spec, lock free
  * (mutex should be acquired by caller).
  *
@@ -8017,9 +8388,13 @@ struct field_modify_info modify_tcp[] = {
 	void *match_value = dev_flow->dv.value.buf;
 	uint8_t next_protocol = 0xff;
 	struct rte_vlan_hdr vlan = { 0 };
+	struct mlx5_flow_dv_sample_resource sample_res;
+	void *sample_actions[MLX5_DV_MAX_NUMBER_OF_ACTIONS] = {0};
+	uint32_t sample_act_pos = UINT32_MAX;
 	uint32_t table;
 	int ret = 0;
 
+	memset(&sample_res, 0, sizeof(struct mlx5_flow_dv_sample_resource));
 	mhdr_res->ft_type = attr->egress ? MLX5DV_FLOW_TABLE_TYPE_NIC_TX :
 					   MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
 	ret = mlx5_flow_group_to_table(attr, dev_flow->external, attr->group,
@@ -8038,7 +8413,6 @@ struct field_modify_info modify_tcp[] = {
 		const struct rte_flow_action_rss *rss;
 		const struct rte_flow_action *action = actions;
 		const uint8_t *rss_key;
-		const struct rte_flow_action_jump *jump_data;
 		const struct rte_flow_action_meter *mtr;
 		struct mlx5_flow_tbl_resource *tbl;
 		uint32_t port_id = 0;
@@ -8046,6 +8420,7 @@ struct field_modify_info modify_tcp[] = {
 		int action_type = actions->type;
 		const struct rte_flow_action *found_action = NULL;
 		struct mlx5_flow_meter *fm = NULL;
+		uint32_t jump_group = 0;
 
 		if (!mlx5_flow_os_action_supported(action_type))
 			return rte_flow_error_set(error, ENOTSUP,
@@ -8284,9 +8659,13 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_DECAP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_JUMP:
-			jump_data = action->conf;
+			jump_group = ((const struct rte_flow_action_jump *)
+							action->conf)->group;
+			if (dev_flow->external && jump_group <
+					MLX5_MAX_TABLES_EXTERNAL)
+				jump_group *= MLX5_FLOW_TABLE_FACTOR;
 			ret = mlx5_flow_group_to_table(attr, dev_flow->external,
-						       jump_data->group,
+						       jump_group,
 						       !!priv->fdb_def_rule,
 						       &table, error);
 			if (ret)
@@ -8452,6 +8831,19 @@ struct field_modify_info modify_tcp[] = {
 				return -rte_errno;
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			sample_act_pos = actions_n;
+			ret = flow_dv_translate_action_sample(dev,
+							      actions,
+							      dev_flow, attr,
+							      sample_actions,
+							      &sample_res,
+							      error);
+			if (ret < 0)
+				return ret;
+			actions_n++;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			break;
 		case RTE_FLOW_ACTION_TYPE_END:
 			actions_end = true;
 			if (mhdr_res->actions_num) {
@@ -8478,6 +8870,21 @@ struct field_modify_info modify_tcp[] = {
 					  (flow_dv_counter_get_by_idx(dev,
 					  flow->counter, NULL))->action;
 			}
+			if (action_flags & MLX5_FLOW_ACTION_SAMPLE) {
+				ret = flow_dv_create_action_sample(dev,
+							  dev_flow, attr,
+							  &sample_res,
+							  sample_actions,
+							  error);
+				if (ret < 0)
+					return rte_flow_error_set
+						(error, rte_errno,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						NULL,
+						"cannot create sample action");
+				dev_flow->dv.actions[sample_act_pos] =
+					dev_flow->dv.sample_res->verbs_action;
+			}
 			break;
 		default:
 			break;
@@ -8776,7 +9183,8 @@ struct field_modify_info modify_tcp[] = {
 				dh->rix_hrxq = UINT32_MAX;
 				dv->actions[n++] = drop_hrxq->action;
 			}
-		} else if (dh->fate_action == MLX5_FLOW_FATE_QUEUE) {
+		} else if (dh->fate_action == MLX5_FLOW_FATE_QUEUE &&
+			   !dv_h->rix_sample) {
 			struct mlx5_hrxq *hrxq;
 			uint32_t hrxq_idx;
 			struct mlx5_flow_rss_desc *rss_desc =
@@ -8908,18 +9316,18 @@ struct field_modify_info modify_tcp[] = {
  *
  * @param dev
  *   Pointer to Ethernet device.
- * @param handle
- *   Pointer to mlx5_flow_handle.
+ * @param encap_decap_idx
+ *   Index of encap decap resource.
  *
  * @return
  *   1 while a reference on it exists, 0 when freed.
  */
 static int
 flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
-				     struct mlx5_flow_handle *handle)
+				     uint32_t encap_decap_idx)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	uint32_t idx = handle->dvh.rix_encap_decap;
+	uint32_t idx = encap_decap_idx;
 	struct mlx5_flow_dv_encap_decap_resource *cache_resource;
 
 	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_DECAP_ENCAP],
@@ -9165,6 +9573,71 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Release a sample resource.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param handle
+ *   Pointer to mlx5_flow_handle.
+ *
+ * @return
+ *   1 while a reference on it exists, 0 when freed.
+ */
+static int
+flow_dv_sample_resource_release(struct rte_eth_dev *dev,
+				     struct mlx5_flow_handle *handle)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	uint32_t idx = handle->dvh.rix_sample;
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+
+	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			 idx);
+	if (!cache_resource)
+		return 0;
+	MLX5_ASSERT(cache_resource->verbs_action);
+	DRV_LOG(DEBUG, "sample resource %p: refcnt %d--",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	if (rte_atomic32_dec_and_test(&cache_resource->refcnt)) {
+		if (cache_resource->verbs_action)
+			claim_zero(mlx5_glue->destroy_flow_action
+					(cache_resource->verbs_action));
+		if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+			if (cache_resource->default_miss)
+				claim_zero(mlx5_glue->destroy_flow_action
+				  (cache_resource->default_miss));
+		}
+		if (cache_resource->normal_path_tbl)
+			flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	}
+	if (cache_resource->sample_idx.rix_hrxq &&
+		!mlx5_hrxq_release(dev,
+			cache_resource->sample_idx.rix_hrxq))
+		cache_resource->sample_idx.rix_hrxq = 0;
+	if (cache_resource->sample_idx.rix_tag &&
+		!flow_dv_tag_release(dev,
+			cache_resource->sample_idx.rix_tag))
+		cache_resource->sample_idx.rix_tag = 0;
+	if (cache_resource->sample_idx.cnt) {
+		flow_dv_counter_release(dev,
+			cache_resource->sample_idx.cnt);
+		cache_resource->sample_idx.cnt = 0;
+	}
+	if (!rte_atomic32_read(&cache_resource->refcnt)) {
+		ILIST_REMOVE(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			     &priv->sh->sample_action_list, idx,
+			     cache_resource, next);
+		mlx5_ipool_free(priv->sh->ipool[MLX5_IPOOL_SAMPLE], idx);
+		DRV_LOG(DEBUG, "sample resource %p: removed",
+			(void *)cache_resource);
+		return 0;
+	}
+	return 1;
+}
+
+/**
  * Remove the flow from the NIC but keeps it in memory.
  * Lock free, (mutex should be acquired by caller).
  *
@@ -9243,8 +9716,11 @@ struct field_modify_info modify_tcp[] = {
 		flow->dev_handles = dev_handle->next.next;
 		if (dev_handle->dvh.matcher)
 			flow_dv_matcher_release(dev, dev_handle);
+		if (dev_handle->dvh.rix_sample)
+			flow_dv_sample_resource_release(dev, dev_handle);
 		if (dev_handle->dvh.rix_encap_decap)
-			flow_dv_encap_decap_resource_release(dev, dev_handle);
+			flow_dv_encap_decap_resource_release(dev,
+				dev_handle->dvh.rix_encap_decap);
 		if (dev_handle->dvh.modify_hdr)
 			flow_dv_modify_hdr_resource_release(dev_handle);
 		if (dev_handle->dvh.rix_push_vlan)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH 7/7] app/testpmd: add testpmd command for sample action
  2020-07-02 17:43 ` [dpdk-dev] [PATCH 0/7] support the flow-based traffic sampling Jiawei Wang
                     ` (5 preceding siblings ...)
  2020-07-02 17:43   ` [dpdk-dev] [PATCH 6/7] net/mlx5: update translate function for sample action Jiawei Wang
@ 2020-07-02 17:43   ` " Jiawei Wang
  6 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 17:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add a new testpmd command 'set sample_actions' that stores multiple
sample action lists, each selected by an index:
set sample_actions <index> <actions list>

Examples of the sample flow use case and the expected results:

1. set sample_actions 0 mark id 0x8 / queue index 2 / end
.. pattern eth / end actions sample ratio 2 index 0 / jump group 2 ...

With this flow, all matched ingress packets jump to the next flow
table, and every second packet is also marked and sent to queue 2 of
the control application.

2. ...pattern eth / end actions sample ratio 2 / port_id id 2 ...

With this flow, all matched ingress packets are sent to port 2, and
every second packet is also sent to the e-switch manager vport.
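
The index-based storage described above can be sketched roughly as
follows. This is only an illustration, not the testpmd code: the
demo_* names and the simplified action type are hypothetical stand-ins
for struct rte_flow_action, and the explicit length check is an
addition made for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real testpmd code stores struct rte_flow_action. */
#define SAMPLE_ACTIONS_NUM 10   /* mirrors ACTION_SAMPLE_ACTIONS_NUM */
#define SAMPLE_CONFS_MAX_NUM 8  /* mirrors RAW_SAMPLE_CONFS_MAX_NUM */

enum demo_action_type { DEMO_END = 0, DEMO_MARK, DEMO_QUEUE, DEMO_COUNT };

struct demo_action {
	enum demo_action_type type;
	unsigned int value; /* mark id or queue index; unused for COUNT/END */
};

/* One END-terminated action list per index. */
static struct demo_action demo_confs[SAMPLE_CONFS_MAX_NUM][SAMPLE_ACTIONS_NUM];

/* Store an END-terminated list under idx; returns 0 on success, -1 on error. */
static int
demo_set_sample_actions(unsigned int idx, const struct demo_action *list)
{
	size_t n = 0;

	if (idx >= SAMPLE_CONFS_MAX_NUM || list == NULL)
		return -1;
	while (list[n].type != DEMO_END) {
		if (++n >= SAMPLE_ACTIONS_NUM) /* keep room for the END terminator */
			return -1;
	}
	/* Zeroing the slot also writes the END terminator (DEMO_END == 0). */
	memset(demo_confs[idx], 0, sizeof(demo_confs[idx]));
	memcpy(demo_confs[idx], list, n * sizeof(*list));
	return 0;
}

/* Resolve the "index" argument of a sample action to its stored list. */
static const struct demo_action *
demo_get_sample_actions(unsigned int idx)
{
	return idx < SAMPLE_CONFS_MAX_NUM ? demo_confs[idx] : NULL;
}
```

With the list from example 1 stored at index 0, a rule that uses
'sample ratio 2 index 0' would resolve to the mark and queue actions.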

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 285 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 276 insertions(+), 9 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 4e2006c..6b1e515 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,8 @@ enum index {
 	SET_RAW_ENCAP,
 	SET_RAW_DECAP,
 	SET_RAW_INDEX,
+	SET_SAMPLE_ACTIONS,
+	SET_SAMPLE_INDEX,
 
 	/* Top-level command. */
 	FLOW,
@@ -349,6 +351,10 @@ enum index {
 	ACTION_SET_IPV6_DSCP_VALUE,
 	ACTION_AGE,
 	ACTION_AGE_TIMEOUT,
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_SAMPLE_INDEX_VALUE,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -484,6 +490,22 @@ struct action_nvgre_encap_data {
 
 struct mplsoudp_decap_conf mplsoudp_decap_conf;
 
+#define ACTION_SAMPLE_ACTIONS_NUM 10
+#define RAW_SAMPLE_CONFS_MAX_NUM 8
+/** Storage for struct rte_flow_action_sample including external data. */
+struct action_sample_data {
+	struct rte_flow_action_sample conf;
+	uint32_t idx;
+};
+/** Storage for struct rte_flow_action_sample. */
+struct raw_sample_conf {
+	struct rte_flow_action data[ACTION_SAMPLE_ACTIONS_NUM];
+};
+struct raw_sample_conf raw_sample_confs[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_mark sample_mark[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_queue sample_queue[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_count sample_count[RAW_SAMPLE_CONFS_MAX_NUM];
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -1161,6 +1183,7 @@ struct parse_action_priv {
 	ACTION_SET_IPV4_DSCP,
 	ACTION_SET_IPV6_DSCP,
 	ACTION_AGE,
+	ACTION_SAMPLE,
 	ZERO,
 };
 
@@ -1393,9 +1416,28 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_sample[] = {
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index next_action_sample[] = {
+	ACTION_QUEUE,
+	ACTION_MARK,
+	ACTION_COUNT,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
+static int parse_set_sample_action(struct context *, const struct token *,
+				   const char *, unsigned int,
+				   void *, unsigned int);
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
@@ -1460,7 +1502,15 @@ static int parse_vc_action_raw_decap_index(struct context *,
 static int parse_vc_action_set_meta(struct context *ctx,
 				    const struct token *token, const char *str,
 				    unsigned int len, void *buf,
+					unsigned int size);
+static int parse_vc_action_sample(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
 				    unsigned int size);
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -1531,6 +1581,8 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 				    unsigned int, char *, unsigned int);
 static int comp_set_raw_index(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
+static int comp_set_sample_index(struct context *, const struct token *,
+			      unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -3612,11 +3664,13 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	/* Top level command. */
 	[SET] = {
 		.name = "set",
-		.help = "set raw encap/decap data",
-		.type = "set raw_encap|raw_decap <index> <pattern>",
+		.help = "set raw encap/decap/sample data",
+		.type = "set raw_encap|raw_decap <index> <pattern>"
+				" or set sample_actions <index> <action>",
 		.next = NEXT(NEXT_ENTRY
 			     (SET_RAW_ENCAP,
-			      SET_RAW_DECAP)),
+			      SET_RAW_DECAP,
+			      SET_SAMPLE_ACTIONS)),
 		.call = parse_set_init,
 	},
 	/* Sub-level commands. */
@@ -3647,6 +3701,23 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(next_item),
 		.call = parse_port,
 	},
+	[SET_SAMPLE_INDEX] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "index of sample actions",
+		.next = NEXT(next_action_sample),
+		.call = parse_port,
+	},
+	[SET_SAMPLE_ACTIONS] = {
+		.name = "sample_actions",
+		.help = "set sample actions list",
+		.next = NEXT(NEXT_ENTRY(SET_SAMPLE_INDEX)),
+		.args = ARGS(ARGS_ENTRY_ARB_BOUNDED
+				(offsetof(struct buffer, port),
+				 sizeof(((struct buffer *)0)->port),
+				 0, RAW_SAMPLE_CONFS_MAX_NUM - 1)),
+		.call = parse_set_sample_action,
+	},
 	[ACTION_SET_TAG] = {
 		.name = "set_tag",
 		.help = "set tag",
@@ -3750,6 +3821,37 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(action_age, NEXT_ENTRY(UNSIGNED)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SAMPLE] = {
+		.name = "sample",
+		.help = "set a sample action",
+		.next = NEXT(action_sample),
+		.priv = PRIV_ACTION(SAMPLE,
+			sizeof(struct action_sample_data)),
+		.call = parse_vc_action_sample,
+	},
+	[ACTION_SAMPLE_RATIO] = {
+		.name = "ratio",
+		.help = "flow sample ratio value",
+		.next = NEXT(action_sample, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_ARB
+			     (offsetof(struct action_sample_data, conf) +
+			      offsetof(struct rte_flow_action_sample, ratio),
+			      sizeof(((struct rte_flow_action_sample *)0)->
+				     ratio))),
+	},
+	[ACTION_SAMPLE_INDEX] = {
+		.name = "index",
+		.help = "the index of sample actions list",
+		.next = NEXT(NEXT_ENTRY(ACTION_SAMPLE_INDEX_VALUE)),
+	},
+	[ACTION_SAMPLE_INDEX_VALUE] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "unsigned integer value",
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc_action_sample_index,
+		.comp = comp_set_sample_index,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -5207,6 +5309,76 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return len;
 }
 
+static int
+parse_vc_action_sample(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_action *action;
+	struct action_sample_data *action_sample_data = NULL;
+	static struct rte_flow_action end_action = {
+		RTE_FLOW_ACTION_TYPE_END, 0
+	};
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return ret;
+	if (!out->args.vc.actions_n)
+		return -1;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data;
+	ctx->objmask = NULL;
+	/* Copy the headers to the buffer. */
+	action_sample_data = ctx->object;
+	action_sample_data->conf.actions = &end_action;
+	action->conf = &action_sample_data->conf;
+	return ret;
+}
+
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size)
+{
+	struct action_sample_data *action_sample_data;
+	struct rte_flow_action *action;
+	const struct arg *arg;
+	struct buffer *out = buf;
+	int ret;
+	uint16_t idx;
+
+	RTE_SET_USED(token);
+	RTE_SET_USED(buf);
+	RTE_SET_USED(size);
+	if (ctx->curr != ACTION_SAMPLE_INDEX_VALUE)
+		return -1;
+	arg = ARGS_ENTRY_ARB_BOUNDED
+		(offsetof(struct action_sample_data, idx),
+		 sizeof(((struct action_sample_data *)0)->idx),
+		 0, RAW_SAMPLE_CONFS_MAX_NUM - 1);
+	if (push_args(ctx, arg))
+		return -1;
+	ret = parse_int(ctx, token, str, len, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		return -1;
+	}
+	if (!ctx->object)
+		return len;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	action_sample_data = ctx->object;
+	idx = action_sample_data->idx;
+	action_sample_data->conf.actions = raw_sample_confs[idx].data;
+	action->conf = &action_sample_data->conf;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -5971,6 +6143,38 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	if (!out->command)
 		return -1;
 	out->command = ctx->curr;
+	/* For encap/decap all we need is pattern */
+	out->args.vc.pattern = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
+	return len;
+}
+
+/** Parse set command, initialize output buffer for subsequent tokens. */
+static int
+parse_set_sample_action(struct context *ctx, const struct token *token,
+			  const char *str, unsigned int len,
+			  void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	/* Make sure buffer is large enough. */
+	if (size < sizeof(*out))
+		return -1;
+	ctx->objdata = 0;
+	ctx->objmask = NULL;
+	ctx->object = out;
+	if (!out->command)
+		return -1;
+	out->command = ctx->curr;
+	/* For sampler all we need is actions */
+	out->args.vc.actions = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	return len;
 }
 
@@ -6007,11 +6211,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 			return -1;
 		out->command = ctx->curr;
 		out->args.vc.data = (uint8_t *)out + size;
-		/* All we need is pattern */
-		out->args.vc.pattern =
-			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
-					       sizeof(double));
-		ctx->object = out->args.vc.pattern;
+		ctx->object  = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	}
 	return len;
 }
@@ -6162,6 +6363,24 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return nb;
 }
 
+/** Complete index number for set sample_actions command. */
+static int
+comp_set_sample_index(struct context *ctx, const struct token *token,
+		   unsigned int ent, char *buf, unsigned int size)
+{
+	uint16_t idx = 0;
+	uint16_t nb = 0;
+
+	RTE_SET_USED(ctx);
+	RTE_SET_USED(token);
+	for (idx = 0; idx < RAW_SAMPLE_CONFS_MAX_NUM; ++idx) {
+		if (buf && idx == ent)
+			return snprintf(buf, size, "%u", idx);
+		++nb;
+	}
+	return nb;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -6607,7 +6826,53 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return mask;
 }
 
-
+/** Dispatch parsed buffer to function calls. */
+static void
+cmd_set_raw_parsed_sample(const struct buffer *in)
+{
+	uint32_t n = in->args.vc.actions_n;
+	uint32_t i = 0;
+	struct rte_flow_action *action = NULL;
+	struct rte_flow_action *data = NULL;
+	size_t size = 0;
+	uint16_t idx = in->port; /* We borrow port field as index */
+	uint32_t max_size = sizeof(struct rte_flow_action) *
+						ACTION_SAMPLE_ACTIONS_NUM;
+
+	RTE_ASSERT(in->command == SET_SAMPLE_ACTIONS);
+	data = (struct rte_flow_action *)&raw_sample_confs[idx].data;
+	memset(data, 0x00, max_size);
+	for (; i < n; i++) {
+		action = in->args.vc.actions + i;
+		if (action->type == RTE_FLOW_ACTION_TYPE_END)
+			break;
+		switch (action->type) {
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			size = sizeof(struct rte_flow_action_mark);
+			rte_memcpy(&sample_mark[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_mark[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			size = sizeof(struct rte_flow_action_count);
+			rte_memcpy(&sample_count[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_count[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			size = sizeof(struct rte_flow_action_queue);
+			rte_memcpy(&sample_queue[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_queue[idx];
+			break;
+		default:
+			printf("Error - Not supported action\n");
+			return;
+		}
+		rte_memcpy(data, action, sizeof(struct rte_flow_action));
+		data++;
+	}
+}
 
 /** Dispatch parsed buffer to function calls. */
 static void
@@ -6624,6 +6889,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	uint16_t proto = 0;
 	uint16_t idx = in->port; /* We borrow port field as index */
 
+	if (in->command == SET_SAMPLE_ACTIONS)
+		return cmd_set_raw_parsed_sample(in);
 	RTE_ASSERT(in->command == SET_RAW_ENCAP ||
 		   in->command == SET_RAW_DECAP);
 	if (in->command == SET_RAW_ENCAP) {
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v2 0/7] [v2] support the flow-based traffic sampling
  2020-06-25 16:26 [dpdk-dev] [PATCH 0/8] support the flow-based traffic sampling Jiawei Wang
                   ` (8 preceding siblings ...)
  2020-07-02 17:43 ` [dpdk-dev] [PATCH 0/7] support the flow-based traffic sampling Jiawei Wang
@ 2020-07-02 18:43 ` Jiawei Wang
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
                     ` (7 more replies)
  9 siblings, 8 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 18:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Sorry, I forgot to change the subject prefix to PATCH v2 in the previous posting.

This patch set implements flow sampling for the mlx5 driver.

The solution introduces a new rte_flow action that samples the
incoming traffic and sends a duplicate of it, at a predefined ratio,
to the application, while the original packet continues to the
target destination.

If the sample ratio is set to 1, the packets are completely mirrored.
The sampled packet can be assigned an additional set of actions,
separate from those of the original packet.

The MLX5 PMD is responsible for validating and translating the sample
action while creating a flow.

v2:
* Rebase patches based on the latest code.
* Update rte_flow and release documents.
* Fix the compile error.
* Remove the unnecessary change in [PATCH 7/8] net/mlx5: update the metadata register c0 support, since FDB will match on the 5-tuple.
* Update changes based on the comments.

Jiawei Wang (7):
  ethdev: introduce sample action for rte flow
  common/mlx5: glue for sample action
  common/mlx5: query sampler object capability via DevX
  net/mlx5: add the validate sample action
  net/mlx5: split sample flow into two sub flows
  net/mlx5: update translate function for sample action
  app/testpmd: add testpmd command for sample action

 app/test-pmd/cmdline_flow.c            | 285 ++++++++++++++-
 doc/guides/prog_guide/rte_flow.rst     |  25 ++
 doc/guides/rel_notes/release_20_08.rst |   6 +
 drivers/common/mlx5/Makefile           |   5 +
 drivers/common/mlx5/linux/meson.build  |   2 +
 drivers/common/mlx5/linux/mlx5_glue.c  |  15 +
 drivers/common/mlx5/linux/mlx5_glue.h  |  12 +
 drivers/common/mlx5/mlx5_devx_cmds.c   |  27 ++
 drivers/common/mlx5/mlx5_devx_cmds.h   |   1 +
 drivers/common/mlx5/mlx5_prm.h         |  51 +++
 drivers/net/mlx5/linux/mlx5_os.c       |  14 +
 drivers/net/mlx5/mlx5.c                |  11 +
 drivers/net/mlx5/mlx5.h                |   4 +
 drivers/net/mlx5/mlx5_flow.c           | 274 +++++++++++++-
 drivers/net/mlx5/mlx5_flow.h           |  51 ++-
 drivers/net/mlx5/mlx5_flow_dv.c        | 627 ++++++++++++++++++++++++++++++++-
 lib/librte_ethdev/rte_flow.c           |   1 +
 lib/librte_ethdev/rte_flow.h           |  28 ++
 18 files changed, 1404 insertions(+), 35 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-02 18:43 ` [dpdk-dev] [PATCH v2 0/7] [v2] support the flow-based traffic sampling Jiawei Wang
@ 2020-07-02 18:43   ` Jiawei Wang
  2020-07-03  6:39     ` Jerin Jacob
  2020-07-04 13:04     ` Andrew Rybchenko
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 2/7] common/mlx5: glue for sample action Jiawei Wang
                     ` (6 subsequent siblings)
  7 siblings, 2 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 18:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

When using full offload, all traffic is handled by the HW and directed
to the requested VF or wire, so the control application loses
visibility of the traffic.
So there is a need for an action that gives the control application
some visibility.

The solution introduces a new action that samples the incoming
traffic and sends a duplicate of it, at a predefined ratio, to the
application, while the original packet continues to the target
destination.

The fraction of packets sampled is '1/ratio'; if the ratio is set to 1,
the packets are completely mirrored. The sampled packet can be assigned
a different set of actions from the original packet.

To support the sample packet in rte_flow, a new rte_flow action
definition RTE_FLOW_ACTION_TYPE_SAMPLE and a new structure
rte_flow_action_sample are introduced.
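
To make the '1/ratio' semantics concrete, here is a small model of the
behaviour described above. It is a sketch under assumptions: it picks
every ratio-th packet deterministically, whereas real hardware may
select packets differently (for example probabilistically), and
treating a ratio of 0 as "sampling disabled" is a choice made only for
this example. The demo_* names are hypothetical.

```c
#include <stdint.h>

/*
 * Illustrative model only: sample every ratio-th packet, counted from 1.
 * Every packet still continues on its normal path; this function only
 * decides whether a duplicate is produced.
 */
static int
demo_is_sampled(uint64_t pkt_seq, uint32_t ratio)
{
	if (ratio == 0) /* treated here as "sampling disabled" (an assumption) */
		return 0;
	return pkt_seq % ratio == 0;
}

/* Count how many of the first n_pkts packets the model duplicates. */
static uint64_t
demo_sampled_count(uint64_t n_pkts, uint32_t ratio)
{
	uint64_t seq, hits = 0;

	for (seq = 1; seq <= n_pkts; seq++)
		hits += demo_is_sampled(seq, ratio);
	return hits;
}
```

In this model a ratio of 1 duplicates all of the packets (full
mirroring), while a ratio of 2 duplicates half of them.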

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
 doc/guides/rel_notes/release_20_08.rst |  6 ++++++
 lib/librte_ethdev/rte_flow.c           |  1 +
 lib/librte_ethdev/rte_flow.h           | 28 ++++++++++++++++++++++++++++
 4 files changed, 60 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index d5dd18c..50dfe1f 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -2645,6 +2645,31 @@ timeout passed without any matching on the flow.
    | ``context``  | user input flow context         |
    +--------------+---------------------------------+
 
+Action: ``SAMPLE``
+^^^^^^^^^^^^^^^^^^
+
+Adds a sample action to a matched flow.
+
+The matching packets will be duplicated to a special queue or vport
+with the predefined ``ratio``; the fraction of packets sampled is '1/ratio'.
+All the packets continue to the target destination.
+
+When the ``ratio`` is set to 1, the packets will be 100% mirrored.
+``actions`` represents the separate set of actions applied to the sampled or
+mirrored packets.
+
+.. _table_rte_flow_action_sample:
+
+.. table:: SAMPLE
+
+   +--------------+---------------------------------+
+   | Field        | Value                           |
+   +==============+=================================+
+   | ``ratio``    | 32 bits sample ratio value      |
+   +--------------+---------------------------------+
+   | ``actions``  | sub-action list for sampling    |
+   +--------------+---------------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
index 5cbc4ce..313e8d3 100644
--- a/doc/guides/rel_notes/release_20_08.rst
+++ b/doc/guides/rel_notes/release_20_08.rst
@@ -81,6 +81,12 @@ New Features
   * Added support for virtio queue statistics.
   * Added support for MTU update.
 
+* **Added flow-based traffic sampling support.**
+
+  Added new action: ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate the matching
+  packets with a given ratio and redirect them to a vport or queue. The sampled
+  packets can also be assigned additional optional actions.
+
 * **Updated Marvell octeontx2 ethdev PMD.**
 
   Updated Marvell octeontx2 driver with cn98xx support.
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 1685be5..733871d 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -173,6 +173,7 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
+	MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
 };
 
 int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index b0e4199..c9cd80d 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -2099,6 +2099,13 @@ enum rte_flow_action_type {
 	 * see enum RTE_ETH_EVENT_FLOW_AGED
 	 */
 	RTE_FLOW_ACTION_TYPE_AGE,
+
+	/**
+	 * Redirects a specific ratio of packets to a vport or queue.
+	 *
+	 * See struct rte_flow_action_sample.
+	 */
+	RTE_FLOW_ACTION_TYPE_SAMPLE,
 };
 
 /**
@@ -2709,6 +2716,27 @@ struct rte_flow_action {
 struct rte_flow;
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SAMPLE
+ *
+ * Adds a sample action to a matched flow.
+ *
+ * The matching packets will be duplicated to a special queue or vport
+ * with the predefined probability. All the packets continue processing
+ * on the default flow path.
+ *
+ * When the sample ratio is set to 1 then the packets will be 100% mirrored.
+ * An additional action list can be supplied for sampled or mirrored packets.
+ */
+struct rte_flow_action_sample {
+	const uint32_t ratio; /**< fraction of packets sampled is '1/ratio'. */
+	const struct rte_flow_action *actions;
+		/**< sub-action list specific for the sampling hit cases. */
+};
+
+/**
  * Verbose error types.
  *
  * Most of them provide the type of the object referenced by struct
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v2 2/7] common/mlx5: glue for sample action
  2020-07-02 18:43 ` [dpdk-dev] [PATCH v2 0/7] [v2] support the flow-based traffic sampling Jiawei Wang
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
@ 2020-07-02 18:43   ` Jiawei Wang
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 3/7] common/mlx5: query sampler object capability via DevX Jiawei Wang
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 18:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

rdma-core introduces a new DR sample action.

Add the rdma-core commands in glue to create this action.

The sample action is used to create the sample object that implements
the sampling/mirroring function.
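
The glue approach can be illustrated with the following sketch. It is
not the mlx5 glue code itself: DEMO_HAVE_SAMPLER stands in for a
build-time probe result such as HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE,
and the demo_* names, including the library call in the enabled branch,
are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical attribute type; the real one is struct mlx5dv_dr_flow_sampler_attr. */
struct demo_sampler_attr {
	unsigned int sample_ratio;
};

/*
 * DEMO_HAVE_SAMPLER is intentionally not defined in this sketch, so the
 * fallback branch is the one that gets compiled.
 */
static void *
demo_glue_create_sampler(struct demo_sampler_attr *attr)
{
#ifdef DEMO_HAVE_SAMPLER
	return demo_lib_create_sampler(attr); /* hypothetical library call */
#else
	(void)attr;
	errno = ENOTSUP; /* library too old: report "not supported" */
	return NULL;
#endif
}

/* Caller side: translate the NULL/errno pair into a negative error code. */
static int
demo_try_create(struct demo_sampler_attr *attr, void **action)
{
	*action = demo_glue_create_sampler(attr);
	return *action ? 0 : -errno;
}
```

Keeping the feature test inside the wrapper means callers need no
conditional compilation of their own; they only check the return value.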

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 drivers/common/mlx5/Makefile          |  5 +++++
 drivers/common/mlx5/linux/meson.build |  2 ++
 drivers/common/mlx5/linux/mlx5_glue.c | 15 +++++++++++++++
 drivers/common/mlx5/linux/mlx5_glue.h | 12 ++++++++++++
 4 files changed, 34 insertions(+)

diff --git a/drivers/common/mlx5/Makefile b/drivers/common/mlx5/Makefile
index f6c762b..4c1484c 100644
--- a/drivers/common/mlx5/Makefile
+++ b/drivers/common/mlx5/Makefile
@@ -192,6 +192,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
 		func mlx5dv_dump_dr_domain \
 		$(AUTOCONF_OUTPUT)
 	$Q sh -- '$<' '$@' \
+		HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE \
+		infiniband/mlx5dv.h \
+		func mlx5dv_dr_action_create_flow_sampler \
+		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
 		HAVE_MLX5DV_MMAP_GET_NC_PAGES_CMD \
 		infiniband/mlx5dv.h \
 		enum MLX5_MMAP_GET_NC_PAGES_CMD \
diff --git a/drivers/common/mlx5/linux/meson.build b/drivers/common/mlx5/linux/meson.build
index 2294213..0f08318 100644
--- a/drivers/common/mlx5/linux/meson.build
+++ b/drivers/common/mlx5/linux/meson.build
@@ -162,6 +162,8 @@ has_sym_args = [
 	'RDMA_NLDEV_ATTR_NDEV_INDEX' ],
 	[ 'HAVE_MLX5_DR_FLOW_DUMP', 'infiniband/mlx5dv.h',
 	'mlx5dv_dump_dr_domain'],
+	[ 'HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE', 'infiniband/mlx5dv.h',
+	'mlx5dv_dr_action_create_flow_sampler'],
 	[ 'HAVE_MLX5DV_DR_MEM_RECLAIM', 'infiniband/mlx5dv.h',
 	'mlx5dv_dr_domain_set_reclaim_device_memory'],
 	[ 'HAVE_DEVLINK', 'linux/devlink.h', 'DEVLINK_GENL_NAME' ],
diff --git a/drivers/common/mlx5/linux/mlx5_glue.c b/drivers/common/mlx5/linux/mlx5_glue.c
index 048207e..98b3e71 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.c
+++ b/drivers/common/mlx5/linux/mlx5_glue.c
@@ -1059,6 +1059,19 @@
 #endif
 }
 
+static void *
+mlx5_glue_dr_create_flow_action_sampler(
+			struct mlx5dv_dr_flow_sampler_attr *attr)
+{
+#ifdef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+	return mlx5dv_dr_action_create_flow_sampler(attr);
+#else
+	(void)attr;
+	errno = ENOTSUP;
+	return NULL;
+#endif
+}
+
 static int
 mlx5_glue_devx_query_eqn(struct ibv_context *ctx, uint32_t cpus,
 			 uint32_t *eqn)
@@ -1308,6 +1321,8 @@
 	.devx_port_query = mlx5_glue_devx_port_query,
 	.dr_dump_domain = mlx5_glue_dr_dump_domain,
 	.dr_reclaim_domain_memory = mlx5_glue_dr_reclaim_domain_memory,
+	.dr_create_flow_action_sampler =
+		mlx5_glue_dr_create_flow_action_sampler,
 	.devx_query_eqn = mlx5_glue_devx_query_eqn,
 	.devx_create_event_channel = mlx5_glue_devx_create_event_channel,
 	.devx_destroy_event_channel = mlx5_glue_devx_destroy_event_channel,
diff --git a/drivers/common/mlx5/linux/mlx5_glue.h b/drivers/common/mlx5/linux/mlx5_glue.h
index 069d854..11b95c5 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.h
+++ b/drivers/common/mlx5/linux/mlx5_glue.h
@@ -77,6 +77,7 @@
 #ifndef HAVE_MLX5DV_DR
 enum  mlx5dv_dr_domain_type { unused, };
 struct mlx5dv_dr_domain;
+struct mlx5dv_dr_action;
 #endif
 
 #ifndef HAVE_MLX5DV_DR_DEVX_PORT
@@ -87,6 +88,15 @@
 struct mlx5dv_dr_flow_meter_attr;
 #endif
 
+#ifndef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+struct mlx5dv_dr_flow_sampler_attr {
+	uint32_t sample_ratio;
+	void *default_next_table;
+	size_t num_sample_actions;
+	struct mlx5dv_dr_action **sample_actions;
+};
+#endif
+
 #ifndef HAVE_IBV_DEVX_EVENT
 struct mlx5dv_devx_event_channel { int fd; };
 struct mlx5dv_devx_async_event_hdr;
@@ -304,6 +314,8 @@ struct mlx5_glue {
 			 struct mlx5dv_devx_async_event_hdr *event_data,
 			 size_t event_resp_len);
 	void (*dr_reclaim_domain_memory)(void *domain, uint32_t enable);
+	void *(*dr_create_flow_action_sampler)
+			(struct mlx5dv_dr_flow_sampler_attr *attr);
 };
 
 extern const struct mlx5_glue *mlx5_glue;
-- 
1.8.3.1



* [dpdk-dev] [PATCH v2 3/7] common/mlx5: query sampler object capability via DevX
  2020-07-02 18:43 ` [dpdk-dev] [PATCH v2 0/7] [v2] support the flow-based traffic sampling Jiawei Wang
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 2/7] common/mlx5: glue for sample action Jiawei Wang
@ 2020-07-02 18:43   ` Jiawei Wang
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 4/7] net/mlx5: add the validate sample action Jiawei Wang
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 18:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Extend mlx5_devx_cmd_query_hca_attr() to query the NIC flow table
capabilities and read log_max_ft_sampler_num from the flow table
properties.

Add the related struct definitions to mlx5_prm.h.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
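
As a reading aid for the layout added to mlx5_prm.h, here is a minimal
standalone sketch of how an 8-bit capability such as
log_max_ft_sampler_num can be extracted from the capability buffer. It
is not the driver's MLX5_GET() macro: prm_get_bits() and max_samplers()
are hypothetical names, and the sketch assumes the usual PRM convention
of big-endian dwords with bit offset 0 at the most significant bit of
the first dword. The offsets 0x200 and 0x48 are derived from the structs
in this patch. Treating the value as log2 of the number of sampler
objects is an assumption based on the field name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit offset of flow_table_properties inside flow_table_nic_cap. */
#define FT_PROPS_BIT_OFF 0x200u
/* Bit offset of log_max_ft_sampler_num inside the properties layout. */
#define SAMPLER_NUM_BIT_OFF 0x48u

/*
 * Hypothetical helper (not MLX5_GET): read a field of 'width' bits
 * (1..31, not crossing a dword) at 'bit_off' from a buffer of
 * big-endian dwords, where bit offset 0 is the MSB of dword 0.
 */
static uint32_t
prm_get_bits(const uint8_t *buf, size_t bit_off, unsigned int width)
{
	const uint8_t *dw = buf + (bit_off / 32) * 4;
	uint32_t v = ((uint32_t)dw[0] << 24) | ((uint32_t)dw[1] << 16) |
		     ((uint32_t)dw[2] << 8) | (uint32_t)dw[3];
	unsigned int shift = 32 - width - (unsigned int)(bit_off % 32);

	assert(width > 0 && width < 32 && (bit_off % 32) + width <= 32);
	return (v >> shift) & ((1u << width) - 1);
}

/*
 * Assumption: the capability is log2 of the number of sampler objects.
 * A value of 0 is reported as "no samplers".
 */
static uint32_t
max_samplers(const uint8_t *cap)
{
	uint32_t log_num = prm_get_bits(cap,
					FT_PROPS_BIT_OFF + SAMPLER_NUM_BIT_OFF,
					8);

	if (log_num == 0)
		return 0;
	return log_num < 32 ? 1u << log_num : UINT32_MAX;
}
```

With a capability buffer whose byte at offset (0x200 + 0x48) / 8 holds 5,
max_samplers() in this sketch reports 32.
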
 drivers/common/mlx5/mlx5_devx_cmds.c | 27 +++++++++++++++++++
 drivers/common/mlx5/mlx5_devx_cmds.h |  1 +
 drivers/common/mlx5/mlx5_prm.h       | 51 ++++++++++++++++++++++++++++++++++++
 3 files changed, 79 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index ec92eb6..6b551f1 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -496,6 +496,33 @@ struct mlx5_devx_obj *
 	if (!attr->eth_net_offloads)
 		return 0;
 
+	/* Query flow sampler capability from flow table properties. */
+	memset(in, 0, sizeof(in));
+	memset(out, 0, sizeof(out));
+	MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
+	MLX5_SET(query_hca_cap_in, in, op_mod,
+		 MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE |
+		 MLX5_HCA_CAP_OPMOD_GET_CUR);
+
+	rc = mlx5_glue->devx_general_cmd(ctx,
+					 in, sizeof(in),
+					 out, sizeof(out));
+	if (rc)
+		goto error;
+	status = MLX5_GET(query_hca_cap_out, out, status);
+	syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
+	if (status) {
+		DRV_LOG(DEBUG, "Failed to query devx HCA capabilities, "
+			"status %x, syndrome = %x",
+			status, syndrome);
+		attr->log_max_ft_sampler_num = 0;
+		return -1;
+	}
+	hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
+	attr->log_max_ft_sampler_num =
+			MLX5_GET(flow_table_nic_cap,
+			hcattr, flow_table_properties.log_max_ft_sampler_num);
+
 	/* Query HCA offloads for Ethernet protocol. */
 	memset(in, 0, sizeof(in));
 	memset(out, 0, sizeof(out));
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 25704ef..a9cfe6d 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -90,6 +90,7 @@ struct mlx5_hca_attr {
 	uint32_t vhca_id:16;
 	uint32_t relaxed_ordering_write:1;
 	uint32_t relaxed_ordering_read:1;
+	uint32_t log_max_ft_sampler_num:8;
 	struct mlx5_hca_qos_attr qos;
 	struct mlx5_hca_vdpa_attr vdpa;
 };
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index c63795f..e7d0a65 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -944,6 +944,7 @@ enum {
 	MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE = 0x0 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_ETHERNET_OFFLOAD_CAPS = 0x1 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP = 0xc << 1,
+	MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE = 0x7 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_VDPA_EMULATION = 0x13 << 1,
 };
 
@@ -1365,12 +1366,62 @@ struct mlx5_ifc_virtio_emulation_cap_bits {
 	u8 reserved_at_1c0[0x620];
 };
 
+struct mlx5_ifc_flow_table_prop_layout_bits {
+	u8 ft_support[0x1];
+	u8 flow_tag[0x1];
+	u8 flow_counter[0x1];
+	u8 flow_modify_en[0x1];
+	u8 modify_root[0x1];
+	u8 identified_miss_table[0x1];
+	u8 flow_table_modify[0x1];
+	u8 reformat[0x1];
+	u8 decap[0x1];
+	u8 reset_root_to_default[0x1];
+	u8 pop_vlan[0x1];
+	u8 push_vlan[0x1];
+	u8 fpga_vendor_acceleration[0x1];
+	u8 pop_vlan_2[0x1];
+	u8 push_vlan_2[0x1];
+	u8 reformat_and_vlan_action[0x1];
+	u8 modify_and_vlan_action[0x1];
+	u8 sw_owner[0x1];
+	u8 reformat_l3_tunnel_to_l2[0x1];
+	u8 reformat_l2_to_l3_tunnel[0x1];
+	u8 reformat_and_modify_action[0x1];
+	u8 reserved_at_15[0x9];
+	u8 sw_owner_v2[0x1];
+	u8 reserved_at_1f[0x1];
+	u8 reserved_at_20[0x2];
+	u8 log_max_ft_size[0x6];
+	u8 log_max_modify_header_context[0x8];
+	u8 max_modify_header_actions[0x8];
+	u8 max_ft_level[0x8];
+	u8 reserved_at_40[0x8];
+	u8 log_max_ft_sampler_num[0x8];
+	u8 metadata_reg_b_width[0x8];
+	u8 metadata_reg_a_width[0x8];
+	u8 reserved_at_60[0x18];
+	u8 log_max_ft_num[0x8];
+	u8 reserved_at_80[0x10];
+	u8 log_max_flow_counter[0x8];
+	u8 log_max_destination[0x8];
+	u8 reserved_at_a0[0x18];
+	u8 log_max_flow[0x8];
+	u8 reserved_at_c0[0x140];
+};
+
+struct mlx5_ifc_flow_table_nic_cap_bits {
+	u8	   reserved_at_0[0x200];
+	struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties;
+};
+
 union mlx5_ifc_hca_cap_union_bits {
 	struct mlx5_ifc_cmd_hca_cap_bits cmd_hca_cap;
 	struct mlx5_ifc_per_protocol_networking_offload_caps_bits
 	       per_protocol_networking_offload_caps;
 	struct mlx5_ifc_qos_cap_bits qos_cap;
 	struct mlx5_ifc_virtio_emulation_cap_bits vdpa_caps;
+	struct mlx5_ifc_flow_table_nic_cap_bits flow_table_nic_cap;
 	u8 reserved_at_0[0x8000];
 };
 
-- 
1.8.3.1



* [dpdk-dev] [PATCH v2 4/7] net/mlx5: add the validate sample action
  2020-07-02 18:43 ` [dpdk-dev] [PATCH v2 0/7] [v2] support the flow-based traffic sampling Jiawei Wang
                     ` (2 preceding siblings ...)
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 3/7] common/mlx5: query sampler object capability via DevX Jiawei Wang
@ 2020-07-02 18:43   ` Jiawei Wang
  2020-07-05 19:30     ` Ori Kam
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 5/7] net/mlx5: split sample flow into two sub flows Jiawei Wang
                     ` (3 subsequent siblings)
  7 siblings, 1 reply; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 18:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add the validation function for the sample action.

The sample action is supported in the NIC Rx and FDB (E-Switch)
domains only, and is rejected on the root table. In NIC Rx, the
sample action list must include a destination TIR (QUEUE) action
and may carry additional optional actions (MARK, COUNT).

FDB doesn't support any optional actions; the sampled packets
always go to the E-Switch manager port.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
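
The domain rules above can be summarized by the following standalone
model. It is only a sketch for review purposes, not the code added by
this patch: struct sample_ctx, the SUB_ACT_* flags and
sample_validate_model() are hypothetical names, and the device
capability checks (DevX, sampler_en) and ordering checks against other
actions are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's sub-action flags. */
#define SUB_ACT_QUEUE (1u << 0)
#define SUB_ACT_MARK  (1u << 1)
#define SUB_ACT_COUNT (1u << 2)

/* Simplified view of the inputs used by the checks. */
struct sample_ctx {
	uint32_t group;     /* 0 is the root table. */
	bool ingress;
	bool egress;
	bool transfer;      /* E-Switch (FDB) rule. */
	uint32_t ratio;     /* Sample ratio, must be >= 1. */
	uint32_t sub_flags; /* OR of SUB_ACT_* found in the sample list. */
};

/* Return 0 if the combination is accepted, a negative errno otherwise. */
static int
sample_validate_model(const struct sample_ctx *c)
{
	if (c->group == 0)
		return -ENOTSUP; /* Root table is not supported. */
	if (c->ratio == 0)
		return -EINVAL;  /* Ratio values start from 1. */
	if (c->ingress && !c->transfer)
		/* NIC Rx: a destination queue is mandatory. */
		return (c->sub_flags & SUB_ACT_QUEUE) ? 0 : -EINVAL;
	if (c->egress && !c->transfer)
		return -ENOTSUP; /* NIC Tx is not supported. */
	/* E-Switch: no optional sub-actions are accepted. */
	return c->sub_flags ? -ENOTSUP : 0;
}
```

In this model an ingress rule in group 1 with QUEUE and MARK sub-actions
is accepted, while the same rule with the transfer attribute is rejected
because E-Switch sampling takes no sub-actions.
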
 drivers/net/mlx5/linux/mlx5_os.c |  14 +++++
 drivers/net/mlx5/mlx5.h          |   1 +
 drivers/net/mlx5/mlx5_flow.h     |   1 +
 drivers/net/mlx5/mlx5_flow_dv.c  | 133 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 149 insertions(+)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 2dc57b2..6dfacf2 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -878,6 +878,20 @@
 			}
 		}
 #endif
+#if defined(HAVE_MLX5DV_DR) && defined(HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE)
+		if (config.hca_attr.log_max_ft_sampler_num > 0  &&
+		    config.dv_flow_en) {
+			priv->sampler_en = 1;
+			DRV_LOG(DEBUG, "The Sampler enabled!\n");
+		} else {
+			priv->sampler_en = 0;
+			if (!config.hca_attr.log_max_ft_sampler_num)
+				DRV_LOG(WARNING, "No available register for"
+						" Sampler.");
+			else
+				DRV_LOG(DEBUG, "DV flow is not supported!\n");
+		}
+#endif
 	}
 	if (config.mprq.enabled && mprq) {
 		if (config.mprq.stride_num_n &&
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 46e66eb..6790738 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -617,6 +617,7 @@ struct mlx5_priv {
 	unsigned int counter_fallback:1; /* Use counter fallback management. */
 	unsigned int mtr_en:1; /* Whether support meter. */
 	unsigned int mtr_reg_share:1; /* Whether support meter REG_C share. */
+	unsigned int sampler_en:1; /* Whether support sampler. */
 	uint16_t domain_id; /* Switch domain identifier. */
 	uint16_t vport_id; /* Associated VF vport index (if any). */
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 43cbda8..45a073c 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -202,6 +202,7 @@ enum mlx5_feature_name {
 #define MLX5_FLOW_ACTION_SET_IPV6_DSCP (1ull << 33)
 #define MLX5_FLOW_ACTION_AGE (1ull << 34)
 #define MLX5_FLOW_ACTION_DEFAULT_MISS (1ull << 35)
+#define MLX5_FLOW_ACTION_SAMPLE (1ull << 36)
 
 #define MLX5_FLOW_FATE_ACTIONS \
 	(MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE | \
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 0bd1c99..002e075 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -3958,6 +3958,130 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Validate the sample action.
+ *
+ * @param[in] action_flags
+ *   Holds the actions detected until now.
+ * @param[in] action
+ *   Pointer to the sample action.
+ * @param[in] dev
+ *   Pointer to the Ethernet device structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this action.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_action_sample(uint64_t action_flags,
+			      const struct rte_flow_action *action,
+			      struct rte_eth_dev *dev,
+			      const struct rte_flow_attr *attr,
+			      struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_config *dev_conf = &priv->config;
+	const struct rte_flow_action_sample *sample = action->conf;
+	const struct rte_flow_action *act = sample->actions;
+	uint64_t sub_action_flags = 0;
+	int actions_n = 0;
+	int ret;
+
+	if (!attr->group)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
+					  NULL, "root table is not supported");
+	if (!priv->config.devx || !priv->sampler_en)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "sample action not supported");
+	if (!(action->conf))
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "configuration cannot be null");
+	if (sample->ratio == 0)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "ratio value starts from 1");
+	if (action_flags & MLX5_FLOW_ACTION_SAMPLE)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "Duplicate sample actions set");
+	if (action_flags & MLX5_FLOW_ACTION_METER)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "wrong action order, meter should "
+					  "be after sample action");
+	for (; act->type != RTE_FLOW_ACTION_TYPE_END; act++) {
+		if (actions_n == MLX5_DV_MAX_NUMBER_OF_ACTIONS)
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  act, "too many actions");
+		switch (act->type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			ret = mlx5_flow_validate_action_queue(act,
+							      sub_action_flags,
+							      dev,
+							      attr, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			++actions_n;
+			break;
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			ret = flow_dv_validate_action_mark(dev, act,
+							   sub_action_flags,
+							   attr, error);
+			if (ret < 0)
+				return ret;
+			if (dev_conf->dv_xmeta_en != MLX5_XMETA_MODE_LEGACY)
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK |
+						MLX5_FLOW_ACTION_MARK_EXT;
+			else
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK;
+			++actions_n;
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			ret = flow_dv_validate_action_count(dev, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_COUNT;
+			++actions_n;
+			break;
+		default:
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "Doesn't support optional "
+						  "action");
+		}
+	}
+	if (attr->ingress && !attr->transfer) {
+		if (!(sub_action_flags & MLX5_FLOW_ACTION_QUEUE))
+			return rte_flow_error_set(error, EINVAL,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "Ingress must have a dest "
+						  "QUEUE for Sample");
+	} else if (attr->egress && !attr->transfer) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL,
+					  "Sample only supports Ingress "
+					  "or E-Switch");
+	} else if (sample->actions->type != RTE_FLOW_ACTION_TYPE_END) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "E-Switch doesn't support any "
+					  "optional action for sampling");
+	}
+	return 0;
+}
+
+/**
  * Find existing modify-header resource or create and register a new one.
  *
  * @param dev[in, out]
@@ -5591,6 +5715,15 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			rw_act_num += MLX5_ACT_NUM_SET_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			ret = flow_dv_validate_action_sample(action_flags,
+							     actions, dev,
+							     attr, error);
+			if (ret < 0)
+				return ret;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			++actions_n;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ACTION,
-- 
1.8.3.1



* [dpdk-dev] [PATCH v2 5/7] net/mlx5: split sample flow into two sub flows
  2020-07-02 18:43 ` [dpdk-dev] [PATCH v2 0/7] [v2] support the flow-based traffic sampling Jiawei Wang
                     ` (3 preceding siblings ...)
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 4/7] net/mlx5: add the validate sample action Jiawei Wang
@ 2020-07-02 18:43   ` Jiawei Wang
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 6/7] net/mlx5: update translate function for sample action Jiawei Wang
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 18:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add the sampler action resource struct definitions.

A flow with a sample action is split into two sub flows: the prefix
flow carries the sample action, and the suffix flow carries the
remaining actions.

The prefix flow gets an extra tag action that writes a unique ID to
a metadata register, and the suffix flow gets an extra tag item that
matches on that ID.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
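
The split described above can be illustrated with a small standalone
model. This is a sketch only, not the code added by this patch: the
enum and split_sample_model() are hypothetical, action configuration
and the tag ID allocation are omitted, and only the ordering of the
action types is modelled. As in the patch, JUMP and METER are always
placed in the suffix flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical action types standing in for rte_flow action types. */
enum act {
	ACT_END, ACT_COUNT, ACT_SAMPLE, ACT_TAG,
	ACT_QUEUE, ACT_JUMP, ACT_METER,
};

/*
 * Split 'in' (END-terminated) into 'pre' and 'sfx'.
 * Actions up to and including SAMPLE go to the prefix; for NIC Rx
 * (transfer == false) a TAG action is placed right before SAMPLE.
 * JUMP, METER and everything after SAMPLE go to the suffix.
 * Both outputs need room for the input length plus one extra entry.
 * Returns true if a SAMPLE action was found.
 */
static bool
split_sample_model(const enum act *in, bool transfer,
		   enum act *pre, enum act *sfx)
{
	bool before_sample = true;
	bool found = false;

	for (; *in != ACT_END; in++) {
		if (*in == ACT_SAMPLE) {
			if (!transfer)
				*pre++ = ACT_TAG;
			*pre++ = ACT_SAMPLE;
			before_sample = false;
			found = true;
		} else if (before_sample &&
			   *in != ACT_JUMP && *in != ACT_METER) {
			*pre++ = *in;
		} else {
			*sfx++ = *in;
		}
	}
	*pre = ACT_END;
	*sfx = ACT_END;
	return found;
}
```

For the list COUNT, SAMPLE, QUEUE, END on NIC Rx, the model yields the
prefix COUNT, TAG, SAMPLE, END and the suffix QUEUE, END. The suffix
flow additionally matches on the tag written by the prefix flow.
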
 drivers/net/mlx5/mlx5.c      |  11 ++
 drivers/net/mlx5/mlx5.h      |   3 +
 drivers/net/mlx5/mlx5_flow.c | 258 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_flow.h |  36 ++++++
 4 files changed, 304 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 07c6add..db55545 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -241,6 +241,17 @@ static LIST_HEAD(, mlx5_dev_ctx_shared) mlx5_dev_ctx_list =
 		.free = rte_free,
 		.type = "mlx5_jump_ipool",
 	},
+	{
+		.size = sizeof(struct mlx5_flow_dv_sample_resource),
+		.trunk_size = 64,
+		.grow_trunk = 3,
+		.grow_shift = 2,
+		.need_lock = 0,
+		.release_mem_en = 1,
+		.malloc = rte_malloc_socket,
+		.free = rte_free,
+		.type = "mlx5_sample_ipool",
+	},
 #endif
 	{
 		.size = sizeof(struct mlx5_flow_meter),
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 6790738..756bd68 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -51,6 +51,7 @@ enum mlx5_ipool_index {
 	MLX5_IPOOL_TAG, /* Pool for tag resource. */
 	MLX5_IPOOL_PORT_ID, /* Pool for port id resource. */
 	MLX5_IPOOL_JUMP, /* Pool for jump resource. */
+	MLX5_IPOOL_SAMPLE, /* Pool for sample resource. */
 #endif
 	MLX5_IPOOL_MTR, /* Pool for meter resource. */
 	MLX5_IPOOL_MCP, /* Pool for metadata resource. */
@@ -518,6 +519,7 @@ struct mlx5_flow_tbl_resource {
 /* Tables for metering splits should be added here. */
 #define MLX5_MAX_TABLES_EXTERNAL (MLX5_MAX_TABLES - 3)
 #define MLX5_MAX_TABLES_FDB UINT16_MAX
+#define MLX5_FLOW_TABLE_FACTOR 10
 
 /* ID generation structure. */
 struct mlx5_flow_id_pool {
@@ -566,6 +568,7 @@ struct mlx5_dev_ctx_shared {
 	struct mlx5_hlist *tag_table;
 	uint32_t port_id_action_list; /* List of port ID actions. */
 	uint32_t push_vlan_action_list; /* List of push VLAN actions. */
+	uint32_t sample_action_list; /* List of sample actions. */
 	struct mlx5_flow_counter_mng cmng; /* Counters management structure. */
 	struct mlx5_flow_default_miss_resource default_miss;
 	/* Default miss action resource structure. */
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index ae5ccc2..7ed9ba3 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -3917,6 +3917,139 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	return 0;
 }
 
+
+/**
+ * Check whether the given action type is present in the action list.
+ *
+ * @param[in] actions
+ *   Pointer to the list of actions.
+ * @param[in] action
+ *   The action type to look for.
+ *
+ * @return
+ *   > 0 the total number of actions (including END) if found.
+ *   0 if the action is not found in the action list.
+ */
+static int
+flow_check_match_action(const struct rte_flow_action actions[],
+					enum rte_flow_action_type action)
+{
+	int actions_n = 0;
+	int flag = 0;
+
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		if (actions->type == action)
+			flag = 1;
+		actions_n++;
+	}
+	/* Count RTE_FLOW_ACTION_TYPE_END. */
+	return flag ? actions_n + 1 : 0;
+}
+
+/**
+ * Split the sample flow.
+ *
+ * The sample flow is split into two sub flows: the prefix flow keeps
+ * the sample action, the other actions move to the new suffix flow.
+ *
+ * A tag action with a unique tag id is also added to the prefix flow,
+ * and the same tag id is matched in the suffix flow.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[out] sfx_items
+ *   Suffix flow match items (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] actions_sfx
+ *   Suffix flow actions.
+ * @param[out] actions_pre
+ *   Prefix flow actions.
+ *
+ * @return
+ *   Unique tag id (0 on failure) for NIC Rx, always 0 for E-Switch.
+ */
+static int
+flow_sample_split_prep(struct rte_eth_dev *dev,
+		 const struct rte_flow_attr *attr,
+		 struct rte_flow_item sfx_items[],
+		 const struct rte_flow_action actions[],
+		 struct rte_flow_action actions_sfx[],
+		 struct rte_flow_action actions_pre[])
+{
+	struct mlx5_rte_flow_action_set_tag *set_tag;
+	struct mlx5_rte_flow_item_tag *tag_spec;
+	struct mlx5_rte_flow_item_tag *tag_mask;
+	struct rte_flow_item *tag_item;
+	struct rte_flow_action *tag_action = NULL;
+	bool pre_sample = true;
+	struct rte_flow_error error;
+	uint32_t tag_id = 0;
+
+	/* Prepare the actions for prefix and suffix flow. */
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		struct rte_flow_action **action_cur = NULL;
+
+		switch (actions->type) {
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			if (!attr->transfer) {
+				/* Add the extra tag action first for NIC-RX. */
+				tag_action = actions_pre;
+				tag_action->type = (enum rte_flow_action_type)
+						MLX5_RTE_FLOW_ACTION_TYPE_TAG;
+				actions_pre++;
+			}
+			break;
+		case RTE_FLOW_ACTION_TYPE_JUMP:
+		case RTE_FLOW_ACTION_TYPE_METER:
+			action_cur = &actions_sfx;
+			break;
+		default:
+			break;
+		}
+		if (pre_sample && !action_cur)
+			action_cur = &actions_pre;
+		else
+			action_cur = &actions_sfx;
+		memcpy(*action_cur, actions, sizeof(struct rte_flow_action));
+		(*action_cur)++;
+		if (actions->type == RTE_FLOW_ACTION_TYPE_SAMPLE)
+			pre_sample = false;
+	}
+	/* Add end action to the actions. */
+	actions_sfx->type = RTE_FLOW_ACTION_TYPE_END;
+	actions_pre->type = RTE_FLOW_ACTION_TYPE_END;
+	if (!attr->transfer) {
+		actions_pre++;
+		/* Set the tag. */
+		set_tag = (void *)actions_pre;
+		set_tag->id = mlx5_flow_get_reg_id(dev, MLX5_APP_TAG,
+						   0, &error);
+		tag_id = flow_qrss_get_id(dev);
+		set_tag->data = tag_id;
+		assert(tag_action);
+		tag_action->conf = set_tag;
+		/* Prepare the suffix subflow items. */
+		tag_item = sfx_items++;
+		sfx_items->type = RTE_FLOW_ITEM_TYPE_END;
+		sfx_items++;
+		tag_spec = (struct mlx5_rte_flow_item_tag *)sfx_items;
+		tag_spec->data = tag_id;
+		tag_spec->id = set_tag->id;
+		tag_mask = tag_spec + 1;
+		tag_mask->data = UINT32_MAX;
+		tag_mask->id = UINT16_MAX;
+		tag_item->type = (enum rte_flow_item_type)
+				MLX5_RTE_FLOW_ITEM_TYPE_TAG;
+		tag_item->spec = tag_spec;
+		tag_item->last = NULL;
+		tag_item->mask = tag_mask;
+	}
+	return tag_id;
+}
+
 /**
  * The splitting for metadata feature.
  *
@@ -4176,6 +4309,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 static int
 flow_create_split_meter(struct rte_eth_dev *dev,
 			   struct rte_flow *flow,
+			   uint64_t prefix_layers,
 			   const struct rte_flow_attr *attr,
 			   const struct rte_flow_item items[],
 			   const struct rte_flow_action actions[],
@@ -4222,8 +4356,9 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 			goto exit;
 		}
 		/* Add the prefix subflow. */
-		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
-					      items, pre_actions, external,
+		ret = flow_create_split_inner(dev, flow, &dev_flow,
+					      prefix_layers, attr, items,
+					      pre_actions, external,
 					      flow_idx, error);
 		if (ret) {
 			ret = -rte_errno;
@@ -4238,7 +4373,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	/* Add the prefix subflow. */
 	ret = flow_create_split_metadata(dev, flow, dev_flow ?
 					 flow_get_prefix_layer_flags(dev_flow) :
-					 0, &sfx_attr,
+					 prefix_layers, &sfx_attr,
 					 sfx_items ? sfx_items : items,
 					 sfx_actions ? sfx_actions : actions,
 					 external, flow_idx, error);
@@ -4249,6 +4384,121 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 }
 
 /**
+ * The splitting for sample feature.
+ *
+ * The sample flow will be split to two flows as prefix and
+ * suffix flow.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] flow
+ *   Parent flow structure pointer.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] items
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[in] external
+ *   This flow rule is created by request external to PMD.
+ * @param[in] flow_idx
+ *   This memory pool index to the flow.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ * @return
+ *   0 on success, negative value otherwise
+ */
+static int
+flow_create_split_sample(struct rte_eth_dev *dev,
+			   struct rte_flow *flow,
+			   const struct rte_flow_attr *attr,
+			   const struct rte_flow_item items[],
+			   const struct rte_flow_action actions[],
+			   bool external, uint32_t flow_idx,
+			   struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct rte_flow_action *sfx_actions = NULL;
+	struct rte_flow_action *pre_actions = NULL;
+	struct rte_flow_item *sfx_items = NULL;
+	struct mlx5_flow *dev_flow = NULL;
+	struct rte_flow_attr sfx_attr = *attr;
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	struct mlx5_flow_tbl_data_entry *sfx_tbl_data;
+	struct mlx5_flow_tbl_resource *sfx_tbl;
+	union mlx5_flow_tbl_key sfx_table_key;
+#endif
+	size_t act_size;
+	size_t item_size;
+	uint32_t tag_id = 0;
+	int actions_n = 0;
+	int ret = 0;
+
+	if (priv->sampler_en)
+		actions_n = flow_check_match_action(actions,
+					RTE_FLOW_ACTION_TYPE_SAMPLE);
+	if (actions_n) {
+		/* The prefix actions must include sample, tag, end. */
+		act_size = sizeof(struct rte_flow_action) * (actions_n * 2) +
+			   sizeof(struct mlx5_rte_flow_action_set_tag);
+		/* tag, end. */
+#define SAMPLE_SUFFIX_ITEM 2
+		item_size = sizeof(struct rte_flow_item) * SAMPLE_SUFFIX_ITEM +
+			    sizeof(struct mlx5_rte_flow_item_tag) * 2;
+		sfx_actions = rte_zmalloc(__func__, (act_size + item_size), 0);
+		if (!sfx_actions)
+			return rte_flow_error_set(error, ENOMEM,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL, "no memory to split "
+						  "sample flow");
+		if (!attr->transfer)
+			sfx_items = (struct rte_flow_item *)((char *)sfx_actions
+					+ act_size);
+		pre_actions = sfx_actions + actions_n;
+		tag_id = flow_sample_split_prep(dev, attr, sfx_items,
+						   actions, sfx_actions,
+						   pre_actions);
+		if (!attr->transfer && !tag_id) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		/* Add the prefix subflow. */
+		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
+					      items, pre_actions, external,
+					      flow_idx, error);
+		if (ret) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		dev_flow->handle->split_flow_id = tag_id;
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+		/* Set the sfx group attr. */
+		sample_res = (struct mlx5_flow_dv_sample_resource *)
+					dev_flow->dv.sample_res;
+		sfx_tbl = (struct mlx5_flow_tbl_resource *)
+					sample_res->normal_path_tbl;
+		sfx_tbl_data = container_of(sfx_tbl,
+					struct mlx5_flow_tbl_data_entry, tbl);
+		sfx_table_key.v64 = sfx_tbl_data->entry.key;
+		sfx_attr.group = sfx_attr.transfer ?
+					(sfx_table_key.table_id - 1) :
+					sfx_table_key.table_id;
+#endif
+	}
+	/* Add the suffix subflow. */
+	ret = flow_create_split_meter(dev, flow, dev_flow ?
+				 flow_get_prefix_layer_flags(dev_flow) : 0,
+				 &sfx_attr, sfx_items ? sfx_items : items,
+				 sfx_actions ? sfx_actions : actions,
+				 external, flow_idx, error);
+exit:
+	if (sfx_actions)
+		rte_free(sfx_actions);
+	return ret;
+}
+
+/**
  * Split the flow to subflow set. The splitters might be linked
  * in the chain, like this:
  * flow_create_split_outer() calls:
@@ -4296,7 +4546,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 {
 	int ret;
 
-	ret = flow_create_split_meter(dev, flow, attr, items,
+	ret = flow_create_split_sample(dev, flow, attr, items,
 					 actions, external, flow_idx, error);
 	MLX5_ASSERT(ret <= 0);
 	return ret;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 45a073c..51826f8 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -502,6 +502,38 @@ struct mlx5_flow_tbl_data_entry {
 	uint32_t idx; /**< index for the indexed mempool. */
 };
 
+/* Sub rdma-core actions list. */
+struct mlx5_flow_sub_actions_list {
+	uint32_t actions_num; /**< Number of sample actions. */
+	uint64_t action_flags;
+	void *dr_queue_action;
+	void *dr_tag_action;
+	void *dr_cnt_action;
+};
+
+/* Sample sub-actions resource list. */
+struct mlx5_flow_sub_actions_idx {
+	uint32_t rix_hrxq; /**< Hash Rx queue object index. */
+	uint32_t rix_tag; /**< Index to the tag action. */
+	uint32_t cnt;
+};
+
+/* Sample action resource structure. */
+struct mlx5_flow_dv_sample_resource {
+	ILIST_ENTRY(uint32_t)next; /**< Pointer to next element. */
+	rte_atomic32_t refcnt; /**< Reference counter. */
+	void *verbs_action; /**< Verbs sample action object. */
+	uint8_t ft_type; /**< Flow Table Type */
+	uint32_t ft_id; /**< Flow Table Level */
+	void *normal_path_tbl; /**< Flow Table pointer */
+	void *default_miss; /**< default_miss dr_action. */
+	uint32_t ratio;   /**< Sample Ratio */
+	struct mlx5_flow_sub_actions_idx sample_idx;
+	/**< Action index resources. */
+	struct mlx5_flow_sub_actions_list sample_act;
+	/**< Action resources. */
+};
+
 /* Verbs specification header. */
 struct ibv_spec_header {
 	enum ibv_flow_spec_type type;
@@ -530,6 +562,8 @@ struct mlx5_flow_handle_dv {
 	/**< Index to push VLAN action resource in cache. */
 	uint32_t rix_tag;
 	/**< Index to the tag action. */
+	uint32_t rix_sample;
+	/**< Index to sample action resource in cache. */
 } __rte_packed;
 
 /** Device flow handle structure: used both for creating & destroying. */
@@ -595,6 +629,8 @@ struct mlx5_flow_dv_workspace {
 	/**< Pointer to the jump action resource. */
 	struct mlx5_flow_dv_match_params value;
 	/**< Holds the value that the packet is compared to. */
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	/**< Pointer to the sample action resource. */
 };
 
 /*
-- 
1.8.3.1



* [dpdk-dev] [PATCH v2 6/7] net/mlx5: update translate function for sample action
  2020-07-02 18:43 ` [dpdk-dev] [PATCH v2 0/7] [v2] support the flow-based traffic sampling Jiawei Wang
                     ` (4 preceding siblings ...)
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 5/7] net/mlx5: split sample flow into two sub flows Jiawei Wang
@ 2020-07-02 18:43   ` Jiawei Wang
  2020-07-05 19:32     ` Ori Kam
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 7/7] app/testpmd: add testpmd command " Jiawei Wang
  2020-07-06 17:51   ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Jiawei Wang
  7 siblings, 1 reply; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 18:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Translate the sample action attributes, which include the sample
ratio and the sub-action list, then create the sample DR action.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
---
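
Besides the translation itself, the mlx5_flow.c hunk below scales the
group of externally created flows by MLX5_FLOW_TABLE_FACTOR before
validation and splitting. A minimal sketch of that mapping follows;
scaled_group() is a hypothetical helper, not a function in the driver.
The likely purpose, leaving unused table IDs between user groups for
internally created tables, is an assumption and is not stated in the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as MLX5_FLOW_TABLE_FACTOR in this series. */
#define TABLE_FACTOR 10u

/*
 * Hypothetical helper: map a flow group to the group used internally.
 * Only flows created by the application (external) are scaled.
 */
static uint32_t
scaled_group(uint32_t group, bool external)
{
	return external ? group * TABLE_FACTOR : group;
}
```

Group 0 stays 0 after scaling, so the root table restriction of the
sample action still applies to application flows created in group 0.
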
 drivers/net/mlx5/mlx5_flow.c    |  16 +-
 drivers/net/mlx5/mlx5_flow.h    |  14 +-
 drivers/net/mlx5/mlx5_flow_dv.c | 494 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 502 insertions(+), 22 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 7ed9ba3..c91ae7d 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -4612,10 +4612,14 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	int hairpin_flow;
 	uint32_t hairpin_id = 0;
 	struct rte_flow_attr attr_tx = { .priority = 0 };
+	struct rte_flow_attr attr_factor = {0};
 	int ret;
 
-	hairpin_flow = flow_check_hairpin_split(dev, attr, actions);
-	ret = flow_drv_validate(dev, attr, items, p_actions_rx,
+	memcpy((void *)&attr_factor, (const void *)attr, sizeof(*attr));
+	if (external)
+		attr_factor.group *= MLX5_FLOW_TABLE_FACTOR;
+	hairpin_flow = flow_check_hairpin_split(dev, &attr_factor, actions);
+	ret = flow_drv_validate(dev, &attr_factor, items, p_actions_rx,
 				external, hairpin_flow, error);
 	if (ret < 0)
 		return 0;
@@ -4634,7 +4638,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 		rte_errno = ENOMEM;
 		goto error_before_flow;
 	}
-	flow->drv_type = flow_get_drv_type(dev, attr);
+	flow->drv_type = flow_get_drv_type(dev, &attr_factor);
 	if (hairpin_id != 0)
 		flow->hairpin_flow_id = hairpin_id;
 	MLX5_ASSERT(flow->drv_type > MLX5_FLOW_TYPE_MIN &&
@@ -4680,7 +4684,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 		 * depending on configuration. In the simplest
 		 * case it just creates unmodified original flow.
 		 */
-		ret = flow_create_split_outer(dev, flow, attr,
+		ret = flow_create_split_outer(dev, flow, &attr_factor,
 					      buf->entry[i].pattern,
 					      p_actions_rx, external, idx,
 					      error);
@@ -4717,8 +4721,8 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	 * the egress Flows belong to the different device and
 	 * copy table should be updated in peer NIC Rx domain.
 	 */
-	if (attr->ingress &&
-	    (external || attr->group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
+	if (attr_factor.ingress &&
+	    (external || attr_factor.group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
 		ret = flow_mreg_update_copy_table(dev, flow, actions, error);
 		if (ret)
 			goto error;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 51826f8..99e900b 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -372,6 +372,13 @@ enum mlx5_flow_fate_type {
 	MLX5_FLOW_FATE_MAX,
 };
 
+/*
+ * Max number of actions per DV flow.
+ * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
+ * in rdma-core file providers/mlx5/verbs.c.
+ */
+#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
+
 /* Matcher PRM representation */
 struct mlx5_flow_dv_match_params {
 	size_t size;
@@ -604,13 +611,6 @@ struct mlx5_flow_handle {
 #define MLX5_FLOW_HANDLE_VERBS_SIZE (sizeof(struct mlx5_flow_handle))
 #endif
 
-/*
- * Max number of actions per DV flow.
- * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
- * in rdma-core file providers/mlx5/verbs.c.
- */
-#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
-
 /** Device flow structure only for DV flow creation. */
 struct mlx5_flow_dv_workspace {
 	uint32_t group; /**< The group index. */
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 002e075..3d0eaed 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -82,6 +82,10 @@
 static int
 flow_dv_default_miss_resource_release(struct rte_eth_dev *dev);
 
+static int
+flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
+				      uint32_t encap_decap_idx);
+
 /**
  * Initialize flow attributes structure according to flow items' types.
  *
@@ -7955,6 +7959,373 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Create an Rx Hash queue.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] rss_desc
+ *   Pointer to the mlx5_flow_rss_desc.
+ * @param[out] hrxq_idx
+ *   Hash Rx queue index.
+ *
+ * @return
+ *   The Verbs/DevX object initialised, NULL otherwise and rte_errno is set.
+ */
+static struct mlx5_hrxq *
+flow_dv_handle_rx_queue(struct rte_eth_dev *dev,
+			  struct mlx5_flow *dev_flow,
+			  struct mlx5_flow_rss_desc *rss_desc,
+			  uint32_t *hrxq_idx)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_flow_handle *dh = dev_flow->handle;
+	struct mlx5_hrxq *hrxq;
+
+	MLX5_ASSERT(rss_desc->queue_num);
+	*hrxq_idx = mlx5_hrxq_get(dev, rss_desc->key,
+				 MLX5_RSS_HASH_KEY_LEN,
+				 dev_flow->hash_fields,
+				 rss_desc->queue,
+				 rss_desc->queue_num);
+	if (!*hrxq_idx) {
+		*hrxq_idx = mlx5_hrxq_new
+				(dev, rss_desc->key,
+				MLX5_RSS_HASH_KEY_LEN,
+				dev_flow->hash_fields,
+				rss_desc->queue,
+				rss_desc->queue_num,
+				!!(dh->layers &
+				MLX5_FLOW_LAYER_TUNNEL));
+		if (!*hrxq_idx)
+			return NULL;
+	}
+	hrxq = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_HRXQ],
+			      *hrxq_idx);
+	return hrxq;
+}
+
+/**
+ * Find existing sample resource or create and register a new one.
+ *
+ * @param[in, out] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[in] resource
+ *   Pointer to sample resource.
+ * @param[in, out] dev_flow
+ *   Pointer to the dev_flow.
+ * @param[in, out] sample_dv_actions
+ *   Pointer to sample actions list.
+ * @param[out] error
+ *   pointer to error structure.
+ *
+ * @return
+ *   0 on success otherwise -errno and errno is set.
+ */
+static int
+flow_dv_sample_resource_register(struct rte_eth_dev *dev,
+			 const struct rte_flow_attr *attr,
+			 struct mlx5_flow_dv_sample_resource *resource,
+			 struct mlx5_flow *dev_flow,
+			 void **sample_dv_actions,
+			 struct rte_flow_error *error)
+{
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+	struct mlx5dv_dr_flow_sampler_attr sampler_attr;
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_ctx_shared *sh = priv->sh;
+	struct mlx5_flow_tbl_resource *tbl;
+	uint32_t idx = 0;
+	const uint32_t next_ft_step = 1;
+	uint32_t next_ft_id = resource->ft_id +	next_ft_step;
+
+	/* Lookup a matching resource from cache. */
+	ILIST_FOREACH(sh->ipool[MLX5_IPOOL_SAMPLE], sh->sample_action_list,
+		      idx, cache_resource, next) {
+		if (resource->ratio == cache_resource->ratio &&
+		    resource->ft_type == cache_resource->ft_type &&
+		    resource->ft_id == cache_resource->ft_id &&
+		    !memcmp((void *)&resource->sample_act,
+			    (void *)&cache_resource->sample_act,
+			    sizeof(struct mlx5_flow_sub_actions_list))) {
+			DRV_LOG(DEBUG, "sample resource %p: refcnt %d++",
+				(void *)cache_resource,
+				rte_atomic32_read(&cache_resource->refcnt));
+			rte_atomic32_inc(&cache_resource->refcnt);
+			dev_flow->handle->dvh.rix_sample = idx;
+			dev_flow->dv.sample_res = cache_resource;
+			return 0;
+		}
+	}
+	/* Register new sample resource. */
+	cache_resource = mlx5_ipool_zmalloc(sh->ipool[MLX5_IPOOL_SAMPLE],
+				       &dev_flow->handle->dvh.rix_sample);
+	if (!cache_resource)
+		return rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "cannot allocate resource memory");
+	*cache_resource = *resource;
+	/* Create normal path table level */
+	tbl = flow_dv_tbl_resource_get(dev, next_ft_id,
+					attr->egress, attr->transfer, error);
+	if (!tbl) {
+		rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "fail to create normal path table "
+					  "for sample");
+		goto error;
+	}
+	cache_resource->normal_path_tbl = tbl;
+	if (resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		cache_resource->default_miss =
+				mlx5_glue->dr_create_flow_action_default_miss();
+		if (!cache_resource->default_miss) {
+			rte_flow_error_set(error, ENOMEM,
+						RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+						NULL,
+						"cannot create default miss "
+						"action");
+			goto error;
+		}
+		sample_dv_actions[resource->sample_act.actions_num++] =
+						cache_resource->default_miss;
+	}
+	/* Create a DR sample action */
+	sampler_attr.sample_ratio = cache_resource->ratio;
+	sampler_attr.default_next_table = tbl->obj;
+	sampler_attr.num_sample_actions = resource->sample_act.actions_num;
+	sampler_attr.sample_actions = (struct mlx5dv_dr_action **)
+							&sample_dv_actions[0];
+	cache_resource->verbs_action =
+		mlx5_glue->dr_create_flow_action_sampler(&sampler_attr);
+	if (!cache_resource->verbs_action) {
+		rte_flow_error_set(error, ENOMEM,
+					RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					NULL, "cannot create sample action");
+		goto error;
+	}
+	rte_atomic32_init(&cache_resource->refcnt);
+	rte_atomic32_inc(&cache_resource->refcnt);
+	ILIST_INSERT(sh->ipool[MLX5_IPOOL_SAMPLE], &sh->sample_action_list,
+		     dev_flow->handle->dvh.rix_sample, cache_resource,
+		     next);
+	dev_flow->dv.sample_res = cache_resource;
+	DRV_LOG(DEBUG, "new sample resource %p: refcnt %d++",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	return 0;
+error:
+	if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		if (cache_resource->default_miss)
+			claim_zero(mlx5_glue->destroy_flow_action
+				(cache_resource->default_miss));
+	} else {
+		if (cache_resource->sample_idx.rix_hrxq &&
+		    !mlx5_hrxq_release(dev,
+				cache_resource->sample_idx.rix_hrxq))
+			cache_resource->sample_idx.rix_hrxq = 0;
+		if (cache_resource->sample_idx.rix_tag &&
+		    !flow_dv_tag_release(dev,
+				cache_resource->sample_idx.rix_tag))
+			cache_resource->sample_idx.rix_tag = 0;
+		if (cache_resource->sample_idx.cnt) {
+			flow_dv_counter_release(dev,
+				cache_resource->sample_idx.cnt);
+			cache_resource->sample_idx.cnt = 0;
+		}
+	}
+	if (cache_resource->normal_path_tbl)
+		flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	mlx5_ipool_free(sh->ipool[MLX5_IPOOL_SAMPLE],
+				dev_flow->handle->dvh.rix_sample);
+	dev_flow->handle->dvh.rix_sample = 0;
+	return -rte_errno;
+}
+
+/**
+ * Convert Sample action to DV specification.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] action
+ *   Pointer to action structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] sample_actions
+ *   Pointer to sample actions list.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_translate_action_sample(struct rte_eth_dev *dev,
+				const struct rte_flow_action *action,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				void **sample_actions,
+				struct mlx5_flow_dv_sample_resource *res,
+				struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	const struct rte_flow_action_sample *sample_action;
+	const struct rte_flow_action *sub_actions;
+	const struct rte_flow_action_queue *queue;
+	struct mlx5_flow_sub_actions_list *sample_act;
+	struct mlx5_flow_sub_actions_idx *sample_idx;
+	struct mlx5_flow_rss_desc *rss_desc = &((struct mlx5_flow_rss_desc *)
+					      priv->rss_desc)
+					      [!!priv->flow_nested_idx];
+	uint64_t action_flags = 0;
+
+	sample_act = &res->sample_act;
+	sample_idx = &res->sample_idx;
+	sample_action = (const struct rte_flow_action_sample *)action->conf;
+	res->ratio = sample_action->ratio;
+	sub_actions = sample_action->actions;
+	for (; sub_actions->type != RTE_FLOW_ACTION_TYPE_END; sub_actions++) {
+		int type = sub_actions->type;
+		uint32_t pre_rix = 0;
+		void *pre_r;
+		switch (type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+		{
+			struct mlx5_hrxq *hrxq;
+			uint32_t hrxq_idx;
+
+			queue = sub_actions->conf;
+			rss_desc->queue_num = 1;
+			rss_desc->queue[0] = queue->index;
+			hrxq = flow_dv_handle_rx_queue(dev, dev_flow,
+					rss_desc, &hrxq_idx);
+			if (!hrxq)
+				return rte_flow_error_set
+					(error, rte_errno,
+					 RTE_FLOW_ERROR_TYPE_ACTION,
+					 NULL,
+					 "cannot create fate queue");
+			sample_act->dr_queue_action = hrxq->action;
+			sample_idx->rix_hrxq = hrxq_idx;
+			sample_actions[sample_act->actions_num++] =
+						hrxq->action;
+			action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			if (action_flags & MLX5_FLOW_ACTION_MARK)
+				dev_flow->handle->rix_hrxq = hrxq_idx;
+			dev_flow->handle->fate_action =
+					MLX5_FLOW_FATE_QUEUE;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_MARK:
+		{
+			uint32_t tag_be = mlx5_flow_mark_set
+				(((const struct rte_flow_action_mark *)
+				(sub_actions->conf))->id);
+			dev_flow->handle->mark = 1;
+			pre_rix = dev_flow->handle->dvh.rix_tag;
+			/* Save the mark resource before sample */
+			pre_r = dev_flow->dv.tag_resource;
+			if (flow_dv_tag_resource_register(dev, tag_be,
+						  dev_flow, error))
+				return -rte_errno;
+			MLX5_ASSERT(dev_flow->dv.tag_resource);
+			sample_act->dr_tag_action =
+				dev_flow->dv.tag_resource->action;
+			sample_idx->rix_tag =
+				dev_flow->handle->dvh.rix_tag;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_tag_action;
+			/* Recover the mark resource after sample */
+			dev_flow->dv.tag_resource = pre_r;
+			dev_flow->handle->dvh.rix_tag = pre_rix;
+			action_flags |= MLX5_FLOW_ACTION_MARK;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+		{
+			uint32_t counter;
+
+			counter = flow_dv_translate_create_counter(dev,
+					dev_flow, sub_actions->conf, 0);
+			if (!counter)
+				return rte_flow_error_set
+						(error, rte_errno,
+						 RTE_FLOW_ERROR_TYPE_ACTION,
+						 NULL,
+						 "cannot create counter"
+						 " object.");
+			sample_idx->cnt = counter;
+			sample_act->dr_cnt_action =
+				  (flow_dv_counter_get_by_idx(dev,
+				  counter, NULL))->action;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_cnt_action;
+			action_flags |= MLX5_FLOW_ACTION_COUNT;
+			break;
+		}
+		default:
+			return rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				NULL,
+				"Not support for sampler action");
+		}
+	}
+	sample_act->action_flags = action_flags;
+	res->ft_id = dev_flow->dv.group;
+	if (attr->transfer)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_FDB;
+	else if (attr->ingress)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
+
+	return 0;
+}
+
+/**
+ * Create the sample DR action from the translated sample resource.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[in] sample_actions
+ *   Pointer to sample path actions list.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_create_action_sample(struct rte_eth_dev *dev,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				struct mlx5_flow_dv_sample_resource *res,
+				void **sample_actions,
+				struct rte_flow_error *error)
+{
+	if (flow_dv_sample_resource_register(dev, attr, res, dev_flow,
+						sample_actions, error))
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL, "can't create sample action");
+	return 0;
+}
+
+/**
  * Fill the flow with DV spec, lock free
  * (mutex should be acquired by caller).
  *
@@ -8017,9 +8388,13 @@ struct field_modify_info modify_tcp[] = {
 	void *match_value = dev_flow->dv.value.buf;
 	uint8_t next_protocol = 0xff;
 	struct rte_vlan_hdr vlan = { 0 };
+	struct mlx5_flow_dv_sample_resource sample_res;
+	void *sample_actions[MLX5_DV_MAX_NUMBER_OF_ACTIONS] = {0};
+	uint32_t sample_act_pos = UINT32_MAX;
 	uint32_t table;
 	int ret = 0;
 
+	memset(&sample_res, 0, sizeof(struct mlx5_flow_dv_sample_resource));
 	mhdr_res->ft_type = attr->egress ? MLX5DV_FLOW_TABLE_TYPE_NIC_TX :
 					   MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
 	ret = mlx5_flow_group_to_table(attr, dev_flow->external, attr->group,
@@ -8038,7 +8413,6 @@ struct field_modify_info modify_tcp[] = {
 		const struct rte_flow_action_rss *rss;
 		const struct rte_flow_action *action = actions;
 		const uint8_t *rss_key;
-		const struct rte_flow_action_jump *jump_data;
 		const struct rte_flow_action_meter *mtr;
 		struct mlx5_flow_tbl_resource *tbl;
 		uint32_t port_id = 0;
@@ -8046,6 +8420,7 @@ struct field_modify_info modify_tcp[] = {
 		int action_type = actions->type;
 		const struct rte_flow_action *found_action = NULL;
 		struct mlx5_flow_meter *fm = NULL;
+		uint32_t jump_group = 0;
 
 		if (!mlx5_flow_os_action_supported(action_type))
 			return rte_flow_error_set(error, ENOTSUP,
@@ -8284,9 +8659,13 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_DECAP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_JUMP:
-			jump_data = action->conf;
+			jump_group = ((const struct rte_flow_action_jump *)
+							action->conf)->group;
+			if (dev_flow->external && jump_group <
+					MLX5_MAX_TABLES_EXTERNAL)
+				jump_group *= MLX5_FLOW_TABLE_FACTOR;
 			ret = mlx5_flow_group_to_table(attr, dev_flow->external,
-						       jump_data->group,
+						       jump_group,
 						       !!priv->fdb_def_rule,
 						       &table, error);
 			if (ret)
@@ -8452,6 +8831,19 @@ struct field_modify_info modify_tcp[] = {
 				return -rte_errno;
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			sample_act_pos = actions_n;
+			ret = flow_dv_translate_action_sample(dev,
+							      actions,
+							      dev_flow, attr,
+							      sample_actions,
+							      &sample_res,
+							      error);
+			if (ret < 0)
+				return ret;
+			actions_n++;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			break;
 		case RTE_FLOW_ACTION_TYPE_END:
 			actions_end = true;
 			if (mhdr_res->actions_num) {
@@ -8478,6 +8870,21 @@ struct field_modify_info modify_tcp[] = {
 					  (flow_dv_counter_get_by_idx(dev,
 					  flow->counter, NULL))->action;
 			}
+			if (action_flags & MLX5_FLOW_ACTION_SAMPLE) {
+				ret = flow_dv_create_action_sample(dev,
+							  dev_flow, attr,
+							  &sample_res,
+							  sample_actions,
+							  error);
+				if (ret < 0)
+					return rte_flow_error_set
+						(error, rte_errno,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						NULL,
+						"cannot create sample action");
+				dev_flow->dv.actions[sample_act_pos] =
+					dev_flow->dv.sample_res->verbs_action;
+			}
 			break;
 		default:
 			break;
@@ -8776,7 +9183,8 @@ struct field_modify_info modify_tcp[] = {
 				dh->rix_hrxq = UINT32_MAX;
 				dv->actions[n++] = drop_hrxq->action;
 			}
-		} else if (dh->fate_action == MLX5_FLOW_FATE_QUEUE) {
+		} else if (dh->fate_action == MLX5_FLOW_FATE_QUEUE &&
+			   !dv_h->rix_sample) {
 			struct mlx5_hrxq *hrxq;
 			uint32_t hrxq_idx;
 			struct mlx5_flow_rss_desc *rss_desc =
@@ -8908,18 +9316,18 @@ struct field_modify_info modify_tcp[] = {
  *
  * @param dev
  *   Pointer to Ethernet device.
- * @param handle
- *   Pointer to mlx5_flow_handle.
+ * @param encap_decap_idx
+ *   Index of encap decap resource.
  *
  * @return
  *   1 while a reference on it exists, 0 when freed.
  */
 static int
 flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
-				     struct mlx5_flow_handle *handle)
+				     uint32_t encap_decap_idx)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	uint32_t idx = handle->dvh.rix_encap_decap;
+	uint32_t idx = encap_decap_idx;
 	struct mlx5_flow_dv_encap_decap_resource *cache_resource;
 
 	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_DECAP_ENCAP],
@@ -9165,6 +9573,71 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Release a sample resource.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param handle
+ *   Pointer to mlx5_flow_handle.
+ *
+ * @return
+ *   1 while a reference on it exists, 0 when freed.
+ */
+static int
+flow_dv_sample_resource_release(struct rte_eth_dev *dev,
+				     struct mlx5_flow_handle *handle)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	uint32_t idx = handle->dvh.rix_sample;
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+
+	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			 idx);
+	if (!cache_resource)
+		return 0;
+	MLX5_ASSERT(cache_resource->verbs_action);
+	DRV_LOG(DEBUG, "sample resource %p: refcnt %d--",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	if (rte_atomic32_dec_and_test(&cache_resource->refcnt)) {
+		if (cache_resource->verbs_action)
+			claim_zero(mlx5_glue->destroy_flow_action
+					(cache_resource->verbs_action));
+		if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+			if (cache_resource->default_miss)
+				claim_zero(mlx5_glue->destroy_flow_action
+				  (cache_resource->default_miss));
+		}
+		if (cache_resource->normal_path_tbl)
+			flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	}
+	if (cache_resource->sample_idx.rix_hrxq &&
+		!mlx5_hrxq_release(dev,
+			cache_resource->sample_idx.rix_hrxq))
+		cache_resource->sample_idx.rix_hrxq = 0;
+	if (cache_resource->sample_idx.rix_tag &&
+		!flow_dv_tag_release(dev,
+			cache_resource->sample_idx.rix_tag))
+		cache_resource->sample_idx.rix_tag = 0;
+	if (cache_resource->sample_idx.cnt) {
+		flow_dv_counter_release(dev,
+			cache_resource->sample_idx.cnt);
+		cache_resource->sample_idx.cnt = 0;
+	}
+	if (!rte_atomic32_read(&cache_resource->refcnt)) {
+		ILIST_REMOVE(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			     &priv->sh->sample_action_list, idx,
+			     cache_resource, next);
+		mlx5_ipool_free(priv->sh->ipool[MLX5_IPOOL_SAMPLE], idx);
+		DRV_LOG(DEBUG, "sample resource %p: removed",
+			(void *)cache_resource);
+		return 0;
+	}
+	return 1;
+}
+
+/**
  * Remove the flow from the NIC but keeps it in memory.
  * Lock free, (mutex should be acquired by caller).
  *
@@ -9243,8 +9716,11 @@ struct field_modify_info modify_tcp[] = {
 		flow->dev_handles = dev_handle->next.next;
 		if (dev_handle->dvh.matcher)
 			flow_dv_matcher_release(dev, dev_handle);
+		if (dev_handle->dvh.rix_sample)
+			flow_dv_sample_resource_release(dev, dev_handle);
 		if (dev_handle->dvh.rix_encap_decap)
-			flow_dv_encap_decap_resource_release(dev, dev_handle);
+			flow_dv_encap_decap_resource_release(dev,
+				dev_handle->dvh.rix_encap_decap);
 		if (dev_handle->dvh.modify_hdr)
 			flow_dv_modify_hdr_resource_release(dev_handle);
 		if (dev_handle->dvh.rix_push_vlan)
-- 
1.8.3.1



* [dpdk-dev] [PATCH v2 7/7] app/testpmd: add testpmd command for sample action
  2020-07-02 18:43 ` [dpdk-dev] [PATCH v2 0/7] [v2] support the flow-based traffic sampling Jiawei Wang
                     ` (5 preceding siblings ...)
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 6/7] net/mlx5: update translate function for sample action Jiawei Wang
@ 2020-07-02 18:43   ` " Jiawei Wang
  2020-07-06 17:51   ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Jiawei Wang
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-02 18:43 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add a new testpmd command 'set sample_actions' that supports configuring
multiple sample action lists, each selected by an index:
set sample_actions <index> <actions list>

Examples of sample flow use cases and their results:

1. set sample_actions 0 mark id 0x8 / queue index 2 / end
.. pattern eth / end actions sample ratio 2 index 0 / jump group 2 ...

With this flow, all matched ingress packets jump to the next flow
table, and every second packet is also marked and sent to queue 2 of
the control application.

2. ...pattern eth / end actions sample ratio 2 / port_id id 2 ...

With this flow, all matched ingress packets are sent to port 2, and
every second packet is also sent to the e-switch manager vport.
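
To make the meaning of "ratio" in the examples above concrete, here is
a small standalone model of 1-in-N selection. It is a sketch under
assumptions and not how the hardware or the PMD selects packets: the
struct sampler type and both helpers are hypothetical, the selection is
modelled as a deterministic counter, and treating ratio 0 as "never
sample" is a choice made for this model only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a sampler: one packet out of every `ratio`. */
struct sampler {
	uint32_t ratio; /* 1 mirrors every packet, 2 every second one */
	uint32_t seen;  /* packets seen since the last sampled packet */
};

/* Return true when the current packet is also duplicated to the sample path. */
static bool
sampler_hit(struct sampler *s)
{
	if (s->ratio == 0)
		return false; /* assumption of this model: 0 disables sampling */
	if (++s->seen < s->ratio)
		return false;
	s->seen = 0;
	return true;
}

/* Count how many of n_packets would be sampled at the given ratio. */
static uint32_t
count_sampled(uint32_t ratio, uint32_t n_packets)
{
	struct sampler s = { .ratio = ratio, .seen = 0 };
	uint32_t hits = 0;
	uint32_t i;

	for (i = 0; i < n_packets; i++)
		if (sampler_hit(&s))
			hits++;
	return hits;
}
```

Every packet still follows the normal path (jump or port_id in the
examples); the model only decides which packets get an extra copy.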

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 285 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 276 insertions(+), 9 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 4e2006c..6b1e515 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,8 @@ enum index {
 	SET_RAW_ENCAP,
 	SET_RAW_DECAP,
 	SET_RAW_INDEX,
+	SET_SAMPLE_ACTIONS,
+	SET_SAMPLE_INDEX,
 
 	/* Top-level command. */
 	FLOW,
@@ -349,6 +351,10 @@ enum index {
 	ACTION_SET_IPV6_DSCP_VALUE,
 	ACTION_AGE,
 	ACTION_AGE_TIMEOUT,
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_SAMPLE_INDEX_VALUE,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -484,6 +490,22 @@ struct action_nvgre_encap_data {
 
 struct mplsoudp_decap_conf mplsoudp_decap_conf;
 
+#define ACTION_SAMPLE_ACTIONS_NUM 10
+#define RAW_SAMPLE_CONFS_MAX_NUM 8
+/** Storage for struct rte_flow_action_sample including external data. */
+struct action_sample_data {
+	struct rte_flow_action_sample conf;
+	uint32_t idx;
+};
+/** Storage for struct rte_flow_action_sample. */
+struct raw_sample_conf {
+	struct rte_flow_action data[ACTION_SAMPLE_ACTIONS_NUM];
+};
+struct raw_sample_conf raw_sample_confs[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_mark sample_mark[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_queue sample_queue[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_count sample_count[RAW_SAMPLE_CONFS_MAX_NUM];
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -1161,6 +1183,7 @@ struct parse_action_priv {
 	ACTION_SET_IPV4_DSCP,
 	ACTION_SET_IPV6_DSCP,
 	ACTION_AGE,
+	ACTION_SAMPLE,
 	ZERO,
 };
 
@@ -1393,9 +1416,28 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_sample[] = {
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index next_action_sample[] = {
+	ACTION_QUEUE,
+	ACTION_MARK,
+	ACTION_COUNT,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
+static int parse_set_sample_action(struct context *, const struct token *,
+				   const char *, unsigned int,
+				   void *, unsigned int);
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
@@ -1460,7 +1502,15 @@ static int parse_vc_action_raw_decap_index(struct context *,
 static int parse_vc_action_set_meta(struct context *ctx,
 				    const struct token *token, const char *str,
 				    unsigned int len, void *buf,
+					unsigned int size);
+static int parse_vc_action_sample(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
 				    unsigned int size);
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -1531,6 +1581,8 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 				    unsigned int, char *, unsigned int);
 static int comp_set_raw_index(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
+static int comp_set_sample_index(struct context *, const struct token *,
+			      unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -3612,11 +3664,13 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	/* Top level command. */
 	[SET] = {
 		.name = "set",
-		.help = "set raw encap/decap data",
-		.type = "set raw_encap|raw_decap <index> <pattern>",
+		.help = "set raw encap/decap/sample data",
+		.type = "set raw_encap|raw_decap <index> <pattern>"
+				" or set sample_actions <index> <action>",
 		.next = NEXT(NEXT_ENTRY
 			     (SET_RAW_ENCAP,
-			      SET_RAW_DECAP)),
+			      SET_RAW_DECAP,
+			      SET_SAMPLE_ACTIONS)),
 		.call = parse_set_init,
 	},
 	/* Sub-level commands. */
@@ -3647,6 +3701,23 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(next_item),
 		.call = parse_port,
 	},
+	[SET_SAMPLE_INDEX] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "index of sample actions",
+		.next = NEXT(next_action_sample),
+		.call = parse_port,
+	},
+	[SET_SAMPLE_ACTIONS] = {
+		.name = "sample_actions",
+		.help = "set sample actions list",
+		.next = NEXT(NEXT_ENTRY(SET_SAMPLE_INDEX)),
+		.args = ARGS(ARGS_ENTRY_ARB_BOUNDED
+				(offsetof(struct buffer, port),
+				 sizeof(((struct buffer *)0)->port),
+				 0, RAW_SAMPLE_CONFS_MAX_NUM - 1)),
+		.call = parse_set_sample_action,
+	},
 	[ACTION_SET_TAG] = {
 		.name = "set_tag",
 		.help = "set tag",
@@ -3750,6 +3821,37 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(action_age, NEXT_ENTRY(UNSIGNED)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SAMPLE] = {
+		.name = "sample",
+		.help = "set a sample action",
+		.next = NEXT(action_sample),
+		.priv = PRIV_ACTION(SAMPLE,
+			sizeof(struct action_sample_data)),
+		.call = parse_vc_action_sample,
+	},
+	[ACTION_SAMPLE_RATIO] = {
+		.name = "ratio",
+		.help = "flow sample ratio value",
+		.next = NEXT(action_sample, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_ARB
+			     (offsetof(struct action_sample_data, conf) +
+			      offsetof(struct rte_flow_action_sample, ratio),
+			      sizeof(((struct rte_flow_action_sample *)0)->
+				     ratio))),
+	},
+	[ACTION_SAMPLE_INDEX] = {
+		.name = "index",
+		.help = "the index of sample actions list",
+		.next = NEXT(NEXT_ENTRY(ACTION_SAMPLE_INDEX_VALUE)),
+	},
+	[ACTION_SAMPLE_INDEX_VALUE] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "unsigned integer value",
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc_action_sample_index,
+		.comp = comp_set_sample_index,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -5207,6 +5309,76 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return len;
 }
 
+static int
+parse_vc_action_sample(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_action *action;
+	struct action_sample_data *action_sample_data = NULL;
+	static struct rte_flow_action end_action = {
+		RTE_FLOW_ACTION_TYPE_END, 0
+	};
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return ret;
+	if (!out->args.vc.actions_n)
+		return -1;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data;
+	ctx->objmask = NULL;
+	/* Copy the headers to the buffer. */
+	action_sample_data = ctx->object;
+	action_sample_data->conf.actions = &end_action;
+	action->conf = &action_sample_data->conf;
+	return ret;
+}
+
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size)
+{
+	struct action_sample_data *action_sample_data;
+	struct rte_flow_action *action;
+	const struct arg *arg;
+	struct buffer *out = buf;
+	int ret;
+	uint16_t idx;
+
+	RTE_SET_USED(token);
+	RTE_SET_USED(buf);
+	RTE_SET_USED(size);
+	if (ctx->curr != ACTION_SAMPLE_INDEX_VALUE)
+		return -1;
+	arg = ARGS_ENTRY_ARB_BOUNDED
+		(offsetof(struct action_sample_data, idx),
+		 sizeof(((struct action_sample_data *)0)->idx),
+		 0, RAW_SAMPLE_CONFS_MAX_NUM - 1);
+	if (push_args(ctx, arg))
+		return -1;
+	ret = parse_int(ctx, token, str, len, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		return -1;
+	}
+	if (!ctx->object)
+		return len;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	action_sample_data = ctx->object;
+	idx = action_sample_data->idx;
+	action_sample_data->conf.actions = raw_sample_confs[idx].data;
+	action->conf = &action_sample_data->conf;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -5971,6 +6143,38 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	if (!out->command)
 		return -1;
 	out->command = ctx->curr;
+	/* For encap/decap, all we need is the pattern. */
+	out->args.vc.pattern = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
+	return len;
+}
+
+/** Parse set command, initialize output buffer for subsequent tokens. */
+static int
+parse_set_sample_action(struct context *ctx, const struct token *token,
+			  const char *str, unsigned int len,
+			  void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	/* Make sure buffer is large enough. */
+	if (size < sizeof(*out))
+		return -1;
+	ctx->objdata = 0;
+	ctx->objmask = NULL;
+	ctx->object = out;
+	if (!out->command)
+		return -1;
+	out->command = ctx->curr;
+	/* For the sampler all we need is the actions. */
+	out->args.vc.actions = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	return len;
 }
 
@@ -6007,11 +6211,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 			return -1;
 		out->command = ctx->curr;
 		out->args.vc.data = (uint8_t *)out + size;
-		/* All we need is pattern */
-		out->args.vc.pattern =
-			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
-					       sizeof(double));
-		ctx->object = out->args.vc.pattern;
+		ctx->object  = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	}
 	return len;
 }
@@ -6162,6 +6363,24 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return nb;
 }
 
+/** Complete index number for the SET_SAMPLE_ACTIONS command. */
+static int
+comp_set_sample_index(struct context *ctx, const struct token *token,
+		   unsigned int ent, char *buf, unsigned int size)
+{
+	uint16_t idx = 0;
+	uint16_t nb = 0;
+
+	RTE_SET_USED(ctx);
+	RTE_SET_USED(token);
+	for (idx = 0; idx < RAW_SAMPLE_CONFS_MAX_NUM; ++idx) {
+		if (buf && idx == ent)
+			return snprintf(buf, size, "%u", idx);
+		++nb;
+	}
+	return nb;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -6607,7 +6826,53 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return mask;
 }
 
-
+/** Store the parsed sample action list in raw_sample_confs[]. */
+static void
+cmd_set_raw_parsed_sample(const struct buffer *in)
+{
+	uint32_t n = in->args.vc.actions_n;
+	uint32_t i = 0;
+	struct rte_flow_action *action = NULL;
+	struct rte_flow_action *data = NULL;
+	size_t size = 0;
+	uint16_t idx = in->port; /* We borrow port field as index */
+	uint32_t max_size = sizeof(struct rte_flow_action) *
+						ACTION_SAMPLE_ACTIONS_NUM;
+
+	RTE_ASSERT(in->command == SET_SAMPLE_ACTIONS);
+	data = (struct rte_flow_action *)&raw_sample_confs[idx].data;
+	memset(data, 0x00, max_size);
+	for (; i < n; i++) {
+		action = in->args.vc.actions + i;
+		if (action->type == RTE_FLOW_ACTION_TYPE_END)
+			break;
+		switch (action->type) {
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			size = sizeof(struct rte_flow_action_mark);
+			rte_memcpy(&sample_mark[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_mark[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			size = sizeof(struct rte_flow_action_count);
+			rte_memcpy(&sample_count[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_count[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			size = sizeof(struct rte_flow_action_queue);
+			rte_memcpy(&sample_queue[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_queue[idx];
+			break;
+		default:
+			printf("Error - Not supported action\n");
+			return;
+		}
+		rte_memcpy(data, action, sizeof(struct rte_flow_action));
+		data++;
+	}
+}
 
 /** Dispatch parsed buffer to function calls. */
 static void
@@ -6624,6 +6889,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	uint16_t proto = 0;
 	uint16_t idx = in->port; /* We borrow port field as index */
 
+	if (in->command == SET_SAMPLE_ACTIONS)
+		return cmd_set_raw_parsed_sample(in);
 	RTE_ASSERT(in->command == SET_RAW_ENCAP ||
 		   in->command == SET_RAW_DECAP);
 	if (in->command == SET_RAW_ENCAP) {
-- 
1.8.3.1
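
Review note on the action-copy loop in cmd_set_raw_parsed_sample(): the
counter and actions_n are both uint32_t, so a bound computed as "n - 1"
wraps around to UINT32_MAX when the list is empty. Below is a minimal
standalone sketch of the difference. The helper names are hypothetical and
are not part of testpmd; the `cap` argument exists only so the demo
terminates.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: counts the iterations a loop bounded by
 * "i <= n - 1" would attempt, stopping at `cap` so the demo terminates.
 * With n == 0 the unsigned subtraction wraps to UINT32_MAX.
 */
static uint32_t
iters_le_n_minus_1(uint32_t n, uint32_t cap)
{
	uint32_t i, count = 0;

	for (i = 0; i <= (uint32_t)(n - 1) && count < cap; i++)
		count++;
	return count;
}

/* Hypothetical helper: the same loop bounded by "i < n". */
static uint32_t
iters_lt_n(uint32_t n, uint32_t cap)
{
	uint32_t i, count = 0;

	for (i = 0; i < n && count < cap; i++)
		count++;
	return count;
}
```

For a non-empty list both bounds behave the same; only the empty case
differs, where the "n - 1" form keeps iterating until something else stops it.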


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
@ 2020-07-03  6:39     ` Jerin Jacob
  2020-07-03 14:55       ` Matan Azrad
  2020-07-04 14:44       ` Ajit Khaparde
  2020-07-04 13:04     ` Andrew Rybchenko
  1 sibling, 2 replies; 129+ messages in thread
From: Jerin Jacob @ 2020-07-03  6:39 UTC (permalink / raw)
  To: Jiawei Wang
  Cc: Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev, Thomas Monjalon,
	Raslan Darawsheh, ian.stokes, fbl

On Fri, Jul 3, 2020 at 12:13 AM Jiawei Wang <jiaweiw@mellanox.com> wrote:
>
> When using full offload, all traffic will be handled by the HW, and
> directed to the requested vf or wire, the control application loses
> visibility on the traffic.
> So there's a need for an action that will enable the control application
> some visibility.
>
> The solution is introduced a new action that will sample the incoming
> traffic and send a duplicated traffic in some predefined ratio to the
> application, while the original packet will continue to the target
> destination.
>
> The packets sampled equals is '1/ratio', if the ratio value be set to 1
> , means that the packets would be completely mirrored. The sample packet
> can be assigned with different set of actions from the original packet.
>
> In order to support the sample packet in rte_flow, new rte_flow action
> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
> will be introduced.
>
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> Acked-by: Ori Kam <orika@mellanox.com>

When adding an overlapping API (rte_eth_mirror_rule_set()) in the same
library (ethdev), please deprecate the old API.
We should not have two separate paths for the same function in the
same ethdev library. It is a pain for app and driver developers.

With the above deprecation notice,
Acked-by: Jerin Jacob <jerinj@marvell.com>


> ---
>  doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
>  doc/guides/rel_notes/release_20_08.rst |  6 ++++++
>  lib/librte_ethdev/rte_flow.c           |  1 +
>  lib/librte_ethdev/rte_flow.h           | 28 ++++++++++++++++++++++++++++
>  4 files changed, 60 insertions(+)
>
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index d5dd18c..50dfe1f 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -2645,6 +2645,31 @@ timeout passed without any matching on the flow.
>     | ``context``  | user input flow context         |
>     +--------------+---------------------------------+
>
> +Action: ``SAMPLE``
> +^^^^^^^^^^^^^^^^^^
> +
> +Adds a sample action to a matched flow.
> +
> +The matching packets will be duplicated to a special queue or vport
> +with the predefined ``ratio``, the packets sampled equals is '1/ratio'.
> +All the packets continues to the target destination.
> +
> +When the ``ratio`` is set to 1 then the packets will be 100% mirrored.
> +``actions`` represent the different set of actions for the sampled or mirrored
> +packets.
> +
> +.. _table_rte_flow_action_sample:
> +
> +.. table:: SAMPLE
> +
> +   +--------------+---------------------------------+
> +   | Field        | Value                           |
> +   +==============+=================================+
> +   | ``ratio``    | 32 bits sample ratio value      |
> +   +--------------+---------------------------------+
> +   | ``actions``  | sub-action list for sampling    |
> +   +--------------+---------------------------------+
> +
>  Negative types
>  ~~~~~~~~~~~~~~
>
> diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
> index 5cbc4ce..313e8d3 100644
> --- a/doc/guides/rel_notes/release_20_08.rst
> +++ b/doc/guides/rel_notes/release_20_08.rst
> @@ -81,6 +81,12 @@ New Features
>    * Added support for virtio queue statistics.
>    * Added support for MTU update.
>
> +* **Added flow-based traffic sampling support.**
> +
> +  Added new action: ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate the matching
> +  packets with given ratio and redirects to vport or queue. The sampled packets
> +  also can be assigned with an additional optional actions.
> +
>  * **Updated Marvell octeontx2 ethdev PMD.**
>
>    Updated Marvell octeontx2 driver with cn98xx support.
> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index 1685be5..733871d 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -173,6 +173,7 @@ struct rte_flow_desc_data {
>         MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct rte_flow_action_set_dscp)),
>         MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct rte_flow_action_set_dscp)),
>         MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
> +       MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
>  };
>
>  int
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index b0e4199..c9cd80d 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -2099,6 +2099,13 @@ enum rte_flow_action_type {
>          * see enum RTE_ETH_EVENT_FLOW_AGED
>          */
>         RTE_FLOW_ACTION_TYPE_AGE,
> +
> +       /**
> +        * Redirects specific ratio of packets to vport or queue.
> +        *
> +        * See struct rte_flow_action_sample.
> +        */
> +       RTE_FLOW_ACTION_TYPE_SAMPLE,
>  };
>
>  /**
> @@ -2709,6 +2716,27 @@ struct rte_flow_action {
>  struct rte_flow;
>
>  /**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SAMPLE
> + *
> + * Adds a sample action to a matched flow.
> + *
> + * The matching packets will be duplicated to a special queue or vport
> + * in the predefined probabiilty, All the packets continues processing
> + * on the default flow path.
> + *
> + * When the sample ratio is set to 1 then the packets will be 100% mirrored.
> + * Additional action list be supported to add for sampled or mirrored packets.
> + */
> +struct rte_flow_action_sample {
> +       const uint32_t ratio; /**< packets sampled equals to '1/ratio'. */
> +       const struct rte_flow_action *actions;
> +               /**< sub-action list specific for the sampling hit cases. */
> +};
> +
> +/**
>   * Verbose error types.
>   *
>   * Most of them provide the type of the object referenced by struct
> --
> 1.8.3.1
>
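
For readers following the "1/ratio" wording in the commit message, here is a
minimal software model of the semantics as described: one packet out of every
`ratio` is duplicated, and ratio == 1 mirrors everything. This is only a
sketch under assumptions: it uses a deterministic every-Nth counter, while a
device may select packets differently (for example probabilistically), and the
patch does not say how ratio == 0 is handled, so the model simply never
samples in that case. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of "1 out of every `ratio` packets". */
struct sampler_model {
	uint32_t ratio; /* same meaning as the `ratio` field in the patch */
	uint32_t seen;  /* packets seen since the last sampled one */
};

/*
 * Returns true when the current packet would also be duplicated to the
 * sample path. Every packet continues to its original destination either
 * way, so this result only decides whether a copy is made.
 */
static bool
sampler_model_hit(struct sampler_model *s)
{
	if (s->ratio == 0) /* assumption: 0 means "never sample" */
		return false;
	if (++s->seen < s->ratio)
		return false;
	s->seen = 0;
	return true;
}

/* Count how many of `n_packets` packets the model would sample. */
static uint32_t
sampler_model_count(uint32_t ratio, uint32_t n_packets)
{
	struct sampler_model s = { .ratio = ratio, .seen = 0 };
	uint32_t i, hits = 0;

	for (i = 0; i < n_packets; i++)
		if (sampler_model_hit(&s))
			hits++;
	return hits;
}
```

With this model, 100 packets at ratio 4 yield 25 copies, and ratio 1 yields a
copy of every packet, which matches the "completely mirrored" case.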


* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-03  6:39     ` Jerin Jacob
@ 2020-07-03 14:55       ` Matan Azrad
  2020-07-03 15:08         ` Jerin Jacob
  2020-07-04 14:44       ` Ajit Khaparde
  1 sibling, 1 reply; 129+ messages in thread
From: Matan Azrad @ 2020-07-03 14:55 UTC (permalink / raw)
  To: Jerin Jacob, Jiawei(Jonny) Wang
  Cc: Ori Kam, Slava Ovsiienko, dpdk-dev, Thomas Monjalon,
	Raslan Darawsheh, ian.stokes, fbl


Hi Jerin

From: Jerin Jacob:
> On Fri, Jul 3, 2020 at 12:13 AM Jiawei Wang <jiaweiw@mellanox.com> wrote:
> >
> > When using full offload, all traffic will be handled by the HW, and
> > directed to the requested vf or wire, the control application loses
> > visibility on the traffic.
> > So there's a need for an action that will enable the control
> > application some visibility.
> >
> > The solution is introduced a new action that will sample the incoming
> > traffic and send a duplicated traffic in some predefined ratio to the
> > application, while the original packet will continue to the target
> > destination.
> >
> > The packets sampled equals is '1/ratio', if the ratio value be set to
> > 1 , means that the packets would be completely mirrored. The sample
> > packet can be assigned with different set of actions from the original
> packet.
> >
> > In order to support the sample packet in rte_flow, new rte_flow action
> > definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> > rte_flow_action_sample will be introduced.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> > Acked-by: Ori Kam <orika@mellanox.com>
> 
> When adding overlapping API(rte_eth_mirror_rule_set()) in the same
> library(ethdev).
> Please depreciate the old API.
> We should not have two separate paths for the same function in the same
> ethdev library. It is pain for app and driver developers.

What about all the other ethdev configuration APIs that parallel rte_flow:
promiscuous_enable;
promiscuous_disable;
allmulticast_enable;
allmulticast_disable;
mac_addr_remove;
mac_addr_add;
mac_addr_set;
set_mc_addr_list;
vlan_filter_set;
vlan_tpid_set;
vlan_strip_queue_set;
vlan_offload_set;
vlan_pvid_set;        
udp_tunnel_port_add;
udp_tunnel_port_del;
...

These APIs can easily be replaced by the rte_flow API.
Do you think we need to deprecate all of them?

> With the above deprecation notice,
> Acked-by: Jerin Jacob <jerinj@marvell.com>
> 
> 
> > ---
> >  doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
> >  doc/guides/rel_notes/release_20_08.rst |  6 ++++++
> >  lib/librte_ethdev/rte_flow.c           |  1 +
> >  lib/librte_ethdev/rte_flow.h           | 28 ++++++++++++++++++++++++++++
> >  4 files changed, 60 insertions(+)
> >
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > b/doc/guides/prog_guide/rte_flow.rst
> > index d5dd18c..50dfe1f 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -2645,6 +2645,31 @@ timeout passed without any matching on the flow.
> >     | ``context``  | user input flow context         |
> >     +--------------+---------------------------------+
> >
> > +Action: ``SAMPLE``
> > +^^^^^^^^^^^^^^^^^^
> > +
> > +Adds a sample action to a matched flow.
> > +
> > +The matching packets will be duplicated to a special queue or vport
> > +with the predefined ``ratio``, the packets sampled equals is '1/ratio'.
> > +All the packets continues to the target destination.
> > +
> > +When the ``ratio`` is set to 1 then the packets will be 100% mirrored.
> > +``actions`` represent the different set of actions for the sampled or
> > +mirrored packets.
> > +
> > +.. _table_rte_flow_action_sample:
> > +
> > +.. table:: SAMPLE
> > +
> > +   +--------------+---------------------------------+
> > +   | Field        | Value                           |
> > +   +==============+=================================+
> > +   | ``ratio``    | 32 bits sample ratio value      |
> > +   +--------------+---------------------------------+
> > +   | ``actions``  | sub-action list for sampling    |
> > +   +--------------+---------------------------------+
> > +
> >  Negative types
> >  ~~~~~~~~~~~~~~
> >
> > diff --git a/doc/guides/rel_notes/release_20_08.rst
> > b/doc/guides/rel_notes/release_20_08.rst
> > index 5cbc4ce..313e8d3 100644
> > --- a/doc/guides/rel_notes/release_20_08.rst
> > +++ b/doc/guides/rel_notes/release_20_08.rst
> > @@ -81,6 +81,12 @@ New Features
> >    * Added support for virtio queue statistics.
> >    * Added support for MTU update.
> >
> > +* **Added flow-based traffic sampling support.**
> > +
> > +  Added new action: ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate the matching
> > +  packets with given ratio and redirects to vport or queue. The sampled packets
> > +  also can be assigned with an additional optional actions.
> > +
> >  * **Updated Marvell octeontx2 ethdev PMD.**
> >
> >    Updated Marvell octeontx2 driver with cn98xx support.
> > diff --git a/lib/librte_ethdev/rte_flow.c
> > b/lib/librte_ethdev/rte_flow.c index 1685be5..733871d 100644
> > --- a/lib/librte_ethdev/rte_flow.c
> > +++ b/lib/librte_ethdev/rte_flow.c
> > @@ -173,6 +173,7 @@ struct rte_flow_desc_data {
> >         MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct rte_flow_action_set_dscp)),
> >         MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct rte_flow_action_set_dscp)),
> >         MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
> > +       MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
> >  };
> >
> >  int
> > diff --git a/lib/librte_ethdev/rte_flow.h
> > b/lib/librte_ethdev/rte_flow.h index b0e4199..c9cd80d 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -2099,6 +2099,13 @@ enum rte_flow_action_type {
> >          * see enum RTE_ETH_EVENT_FLOW_AGED
> >          */
> >         RTE_FLOW_ACTION_TYPE_AGE,
> > +
> > +       /**
> > +        * Redirects specific ratio of packets to vport or queue.
> > +        *
> > +        * See struct rte_flow_action_sample.
> > +        */
> > +       RTE_FLOW_ACTION_TYPE_SAMPLE,
> >  };
> >
> >  /**
> > @@ -2709,6 +2716,27 @@ struct rte_flow_action {
> >  struct rte_flow;
> >
> >  /**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_SAMPLE
> > + *
> > + * Adds a sample action to a matched flow.
> > + *
> > + * The matching packets will be duplicated to a special queue or vport
> > + * in the predefined probabiilty, All the packets continues processing
> > + * on the default flow path.
> > + *
> > + * When the sample ratio is set to 1 then the packets will be 100% mirrored.
> > + * Additional action list be supported to add for sampled or mirrored packets.
> > + */
> > +struct rte_flow_action_sample {
> > +       const uint32_t ratio; /**< packets sampled equals to '1/ratio'. */
> > +       const struct rte_flow_action *actions;
> > +               /**< sub-action list specific for the sampling hit cases. */
> > +};
> > +
> > +/**
> >   * Verbose error types.
> >   *
> >   * Most of them provide the type of the object referenced by struct
> > --
> > 1.8.3.1
> >


* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-03 14:55       ` Matan Azrad
@ 2020-07-03 15:08         ` Jerin Jacob
  2020-07-03 15:27           ` Matan Azrad
  2020-07-03 15:27           ` Thomas Monjalon
  0 siblings, 2 replies; 129+ messages in thread
From: Jerin Jacob @ 2020-07-03 15:08 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Jiawei(Jonny) Wang, Ori Kam, Slava Ovsiienko, dpdk-dev,
	Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl

On Fri, Jul 3, 2020 at 8:25 PM Matan Azrad <matan@mellanox.com> wrote:
>
>
> Hi Jerin

Hi Matan,

>
> From: Jerin Jacob:
> > On Fri, Jul 3, 2020 at 12:13 AM Jiawei Wang <jiaweiw@mellanox.com> wrote:
> > >
> > > When using full offload, all traffic will be handled by the HW, and
> > > directed to the requested vf or wire, the control application loses
> > > visibility on the traffic.
> > > So there's a need for an action that will enable the control
> > > application some visibility.
> > >
> > > The solution is introduced a new action that will sample the incoming
> > > traffic and send a duplicated traffic in some predefined ratio to the
> > > application, while the original packet will continue to the target
> > > destination.
> > >
> > > The packets sampled equals is '1/ratio', if the ratio value be set to
> > > 1 , means that the packets would be completely mirrored. The sample
> > > packet can be assigned with different set of actions from the original
> > packet.
> > >
> > > In order to support the sample packet in rte_flow, new rte_flow action
> > > definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> > > rte_flow_action_sample will be introduced.
> > >
> > > Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> > > Acked-by: Ori Kam <orika@mellanox.com>
> >
> > When adding overlapping API(rte_eth_mirror_rule_set()) in the same
> > library(ethdev).
> > Please depreciate the old API.
> > We should not have two separate paths for the same function in the same
> > ethdev library. It is pain for app and driver developers.
>
> What are about all the other rte_flow parallel configuration APIs in ethdev:
>  promiscuous_enable;
> promiscuous_disable;
> allmulticast_enable;
> allmulticast_disable;
> mac_addr_remove;
> mac_addr_add;
> mac_addr_set;
> set_mc_addr_list;
> vlan_filter_set;
> vlan_tpid_set;
> vlan_strip_queue_set;
> vlan_offload_set;
> vlan_pvid_set;
> udp_tunnel_port_add;
> udp_tunnel_port_del;
> ...
>
> These APIs can be replaced easily by rte_flow API.
> Do you think we need to deprecate all?

I think basic stuff like the below can have a separate API.
1) promiscuous_enable;
2) promiscuous_disable;
3) allmulticast_enable;
4) allmulticast_disable;
5) mac_addr_remove;
6) mac_addr_add;
7) mac_addr_set;
8) set_mc_addr_list;

But the VLAN and UDP related ones should be rte_flow candidates (IMO).

>
> > With the above deprecation notice,
> > Acked-by: Jerin Jacob <jerinj@marvell.com>


* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-03 15:08         ` Jerin Jacob
@ 2020-07-03 15:27           ` Matan Azrad
  2020-07-03 15:27           ` Thomas Monjalon
  1 sibling, 0 replies; 129+ messages in thread
From: Matan Azrad @ 2020-07-03 15:27 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Jiawei(Jonny) Wang, Ori Kam, Slava Ovsiienko, dpdk-dev,
	Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl



From: Jerin Jacob
> On Fri, Jul 3, 2020 at 8:25 PM Matan Azrad <matan@mellanox.com> wrote:
> >
> >
> > Hi Jerin
> 
> Hi Matan,
> 
> >
> > From: Jerin Jacob:
> > > On Fri, Jul 3, 2020 at 12:13 AM Jiawei Wang <jiaweiw@mellanox.com>
> wrote:
> > > >
> > > > When using full offload, all traffic will be handled by the HW,
> > > > and directed to the requested vf or wire, the control application
> > > > loses visibility on the traffic.
> > > > So there's a need for an action that will enable the control
> > > > application some visibility.
> > > >
> > > > The solution is introduced a new action that will sample the
> > > > incoming traffic and send a duplicated traffic in some predefined
> > > > ratio to the application, while the original packet will continue
> > > > to the target destination.
> > > >
> > > > The packets sampled equals is '1/ratio', if the ratio value be set
> > > > to
> > > > 1 , means that the packets would be completely mirrored. The
> > > > sample packet can be assigned with different set of actions from
> > > > the original
> > > packet.
> > > >
> > > > In order to support the sample packet in rte_flow, new rte_flow
> > > > action definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> > > > rte_flow_action_sample will be introduced.
> > > >
> > > > Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> > > > Acked-by: Ori Kam <orika@mellanox.com>
> > >
> > > When adding overlapping API(rte_eth_mirror_rule_set()) in the same
> > > library(ethdev).
> > > Please depreciate the old API.
> > > We should not have two separate paths for the same function in the
> > > same ethdev library. It is pain for app and driver developers.
> >
> > What are about all the other rte_flow parallel configuration APIs in ethdev:
> >  promiscuous_enable;
> > promiscuous_disable;
> > allmulticast_enable;
> > allmulticast_disable;
> > mac_addr_remove;
> > mac_addr_add;
> > mac_addr_set;
> > set_mc_addr_list;
> > vlan_filter_set;
> > vlan_tpid_set;
> > vlan_strip_queue_set;
> > vlan_offload_set;
> > vlan_pvid_set;
> > udp_tunnel_port_add;
> > udp_tunnel_port_del;
> > ...
> >
> > These APIs can be replaced easily by rte_flow API.
> > Do you think we need to deprecate all?
> 
> I think, basic stuff like below can have separate API.
> 1)  promiscuous_enable;
> 2) promiscuous_disable;
> 3) allmulticast_enable;
> 4) allmulticast_disable;
> 5) mac_addr_remove;
> 6) mac_addr_add;
> 7) mac_addr_set;
> 8) set_mc_addr_list;
> 
> But The VLAN and UDP related should be rte_flow candidates.(IMO)

Can you explain why?
Each one of them can be configured by rte_flow, so why do we need duplicate ways to configure the same thing?

What is the expected behavior if there are conflicts? Is it documented?

It is probably a different discussion, but I think that rte_flow is becoming more popular and it would make sense to concentrate all the traffic filtering/actions in rte_flow.

 
> >
> > > With the above deprecation notice,
> > > Acked-by: Jerin Jacob <jerinj@marvell.com>


* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-03 15:08         ` Jerin Jacob
  2020-07-03 15:27           ` Matan Azrad
@ 2020-07-03 15:27           ` Thomas Monjalon
  2020-07-03 15:36             ` Jerin Jacob
  2020-07-04 14:35             ` Ajit Khaparde
  1 sibling, 2 replies; 129+ messages in thread
From: Thomas Monjalon @ 2020-07-03 15:27 UTC (permalink / raw)
  To: Matan Azrad, Jerin Jacob, Jiawei(Jonny) Wang
  Cc: Ori Kam, Slava Ovsiienko, dpdk-dev, Raslan Darawsheh, ian.stokes,
	fbl, ferruh.yigit, arybchenko

03/07/2020 17:08, Jerin Jacob:
> On Fri, Jul 3, 2020 at 8:25 PM Matan Azrad <matan@mellanox.com> wrote:
> > From: Jerin Jacob:
> > > When adding overlapping API(rte_eth_mirror_rule_set()) in the same
> > > library(ethdev).
> > > Please depreciate the old API.
> > > We should not have two separate paths for the same function in the same
> > > ethdev library. It is pain for app and driver developers.
> >
> > What are about all the other rte_flow parallel configuration APIs in ethdev:
> >  promiscuous_enable;
> > promiscuous_disable;
> > allmulticast_enable;
> > allmulticast_disable;
> > mac_addr_remove;
> > mac_addr_add;
> > mac_addr_set;
> > set_mc_addr_list;
> > vlan_filter_set;
> > vlan_tpid_set;
> > vlan_strip_queue_set;
> > vlan_offload_set;
> > vlan_pvid_set;
> > udp_tunnel_port_add;
> > udp_tunnel_port_del;
> > ...
> >
> > These APIs can be replaced easily by rte_flow API.
> > Do you think we need to deprecate all?
> 
> I think, basic stuff like below can have separate API.
> 1)  promiscuous_enable;
> 2) promiscuous_disable;
> 3) allmulticast_enable;
> 4) allmulticast_disable;
> 5) mac_addr_remove;
> 6) mac_addr_add;
> 7) mac_addr_set;
> 8) set_mc_addr_list;

"Basic" is not a precise definition :)
I would say port-level configuration should remain
out of rte_flow API.

> But The VLAN and UDP related should be rte_flow candidates.(IMO)

Yes definitely, tunneling is better managed with rte_flow.

This is an important discussion, I Cc other ethdev maintainers.
Note: this is an ethdev patch, all ethdev maintainers should be Cc'ed.
Aren't you using --cc-cmd devtools/get-maintainer.sh ?




* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-03 15:27           ` Thomas Monjalon
@ 2020-07-03 15:36             ` Jerin Jacob
  2020-07-04 19:26               ` Matan Azrad
  2020-07-04 14:35             ` Ajit Khaparde
  1 sibling, 1 reply; 129+ messages in thread
From: Jerin Jacob @ 2020-07-03 15:36 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Jiawei(Jonny) Wang, Ori Kam, Slava Ovsiienko,
	dpdk-dev, Raslan Darawsheh, ian.stokes, fbl, Ferruh Yigit,
	Andrew Rybchenko

On Fri, Jul 3, 2020 at 8:57 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 03/07/2020 17:08, Jerin Jacob:
> > On Fri, Jul 3, 2020 at 8:25 PM Matan Azrad <matan@mellanox.com> wrote:
> > > From: Jerin Jacob:
> > > > When adding overlapping API(rte_eth_mirror_rule_set()) in the same
> > > > library(ethdev).
> > > > Please depreciate the old API.
> > > > We should not have two separate paths for the same function in the same
> > > > ethdev library. It is pain for app and driver developers.
> > >
> > > What are about all the other rte_flow parallel configuration APIs in ethdev:
> > >  promiscuous_enable;
> > > promiscuous_disable;
> > > allmulticast_enable;
> > > allmulticast_disable;
> > > mac_addr_remove;
> > > mac_addr_add;
> > > mac_addr_set;
> > > set_mc_addr_list;
> > > vlan_filter_set;
> > > vlan_tpid_set;
> > > vlan_strip_queue_set;
> > > vlan_offload_set;
> > > vlan_pvid_set;
> > > udp_tunnel_port_add;
> > > udp_tunnel_port_del;
> > > ...
> > >
> > > These APIs can be replaced easily by rte_flow API.
> > > Do you think we need to deprecate all?
> >
> > I think, basic stuff like below can have separate API.
> > 1)  promiscuous_enable;
> > 2) promiscuous_disable;
> > 3) allmulticast_enable;
> > 4) allmulticast_disable;
> > 5) mac_addr_remove;
> > 6) mac_addr_add;
> > 7) mac_addr_set;
> > 8) set_mc_addr_list;
>
> "Basic" is not a precise definition :)

Yep.

> I would say port-level configuration should remain
> out of rte_flow API.

+1.
In addition to that, I would say anything that needs to be configured at
"per-flow" granularity should use rte_flow.

>
> > But The VLAN and UDP related should be rte_flow candidates.(IMO)
>
> Yes definitely, tunneling is better managed with rte_flow.
>
> This is an important discussion, I Cc other ethdev maintainers.
> Note: this is an ethdev patch, all ethdev maintainers should be Cc'ed.
> Aren't you using --cc-cmd devtools/get-maintainer.sh ?
>
>


* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
  2020-07-03  6:39     ` Jerin Jacob
@ 2020-07-04 13:04     ` Andrew Rybchenko
  2020-07-05 10:18       ` Ori Kam
  1 sibling, 1 reply; 129+ messages in thread
From: Andrew Rybchenko @ 2020-07-04 13:04 UTC (permalink / raw)
  To: Jiawei Wang, orika, viacheslavo, matan
  Cc: dev, thomas, rasland, ian.stokes, fbl

On 7/2/20 9:43 PM, Jiawei Wang wrote:
> When using full offload, all traffic will be handled by the HW, and
> directed to the requested vf or wire, the control application loses

vf->VF

> visibility on the traffic.
> So there's a need for an action that will enable the control application
> some visibility.
> 
> The solution is introduced a new action that will sample the incoming
> traffic and send a duplicated traffic in some predefined ratio to the
> application, while the original packet will continue to the target
> destination.
> 

Maybe 1 packet per second is a better sampling approach?
Or just a different one.

> The packets sampled equals is '1/ratio', if the ratio value be set to 1
> , means that the packets would be completely mirrored. The sample packet

Comma on the next line looks bad.

> can be assigned with different set of actions from the original packet.
> 
> In order to support the sample packet in rte_flow, new rte_flow action
> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
> will be introduced.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> Acked-by: Ori Kam <orika@mellanox.com>
> ---
>  doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
>  doc/guides/rel_notes/release_20_08.rst |  6 ++++++
>  lib/librte_ethdev/rte_flow.c           |  1 +
>  lib/librte_ethdev/rte_flow.h           | 28 ++++++++++++++++++++++++++++
>  4 files changed, 60 insertions(+)
> 
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index d5dd18c..50dfe1f 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -2645,6 +2645,31 @@ timeout passed without any matching on the flow.
>     | ``context``  | user input flow context         |
>     +--------------+---------------------------------+
>  
> +Action: ``SAMPLE``
> +^^^^^^^^^^^^^^^^^^
> +
> +Adds a sample action to a matched flow.
> +
> +The matching packets will be duplicated to a special queue or vport

what is vport above?

> +with the predefined ``ratio``, the packets sampled equals is '1/ratio'.
> +All the packets continues to the target destination.

continues -> continue (if I'm not mistaken)

> +
> +When the ``ratio`` is set to 1 then the packets will be 100% mirrored.
> +``actions`` represent the different set of actions for the sampled or mirrored
> +packets.
> +
> +.. _table_rte_flow_action_sample:
> +
> +.. table:: SAMPLE
> +
> +   +--------------+---------------------------------+
> +   | Field        | Value                           |
> +   +==============+=================================+
> +   | ``ratio``    | 32 bits sample ratio value      |
> +   +--------------+---------------------------------+
> +   | ``actions``  | sub-action list for sampling    |
> +   +--------------+---------------------------------+
> +
>  Negative types
>  ~~~~~~~~~~~~~~
>  
> diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
> index 5cbc4ce..313e8d3 100644
> --- a/doc/guides/rel_notes/release_20_08.rst
> +++ b/doc/guides/rel_notes/release_20_08.rst
> @@ -81,6 +81,12 @@ New Features
>    * Added support for virtio queue statistics.
>    * Added support for MTU update.
>  
> +* **Added flow-based traffic sampling support.**
> +
> +  Added new action: ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate the matching
> +  packets with given ratio and redirects to vport or queue. The sampled packets

What is vport above?

> +  also can be assigned with an additional optional actions.

May the actions list be empty or NULL? If not, it does not look
optional.

> +
>  * **Updated Marvell octeontx2 ethdev PMD.**
>  
>    Updated Marvell octeontx2 driver with cn98xx support.
> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index 1685be5..733871d 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -173,6 +173,7 @@ struct rte_flow_desc_data {
>  	MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct rte_flow_action_set_dscp)),
>  	MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct rte_flow_action_set_dscp)),
>  	MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
> +	MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
>  };
>  
>  int
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index b0e4199..c9cd80d 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -2099,6 +2099,13 @@ enum rte_flow_action_type {
>  	 * see enum RTE_ETH_EVENT_FLOW_AGED
>  	 */
>  	RTE_FLOW_ACTION_TYPE_AGE,
> +
> +	/**
> +	 * Redirects specific ratio of packets to vport or queue.
> +	 *
> +	 * See struct rte_flow_action_sample.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_SAMPLE,
>  };
>  
>  /**
> @@ -2709,6 +2716,27 @@ struct rte_flow_action {
>  struct rte_flow;
>  
>  /**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SAMPLE
> + *
> + * Adds a sample action to a matched flow.
> + *
> + * The matching packets will be duplicated to a special queue or vport

again 'vport' here
It sounds misleading and too restrictive to say "be duplicated
to a special queue or vport". There is no specification of the
queue or vport in the control structure.
You should either describe it in a generic way, like "be
duplicated and have their own set of actions, including a fate
action, applied", or put a restriction that a QUEUE, RSS or
"vport"-related action must be present in the sub-actions list.

> + * in the predefined probabiilty, All the packets continues processing

probabiilty -> probability
I think 'predefined' is misleading here, 'specified' is better.
Also, strictly speaking, it is not a predefined probability (as
Stephen suggested); it is a defined ratio.

> + * on the default flow path.
> + *
> + * When the sample ratio is set to 1 then the packets will be 100% mirrored.
> + * Additional action list be supported to add for sampled or mirrored packets.
> + */
> +struct rte_flow_action_sample {
> +	const uint32_t ratio; /**< packets sampled equals to '1/ratio'. */

The const is still there above and it is meaningless (other
actions do not use 'const' for plain fields).

> +	const struct rte_flow_action *actions;
> +		/**< sub-action list specific for the sampling hit cases. */

Is a fate action required?
May I use it to MARK some packets without duplicating them?
I guess not. Or COUNT and DROP? Just COUNT?

What I'm trying to say is that you're adding a generic packet
selection mechanism whose usage is very restricted by design.

Anyway, if you go with it, please, process other notes above.

> +};
> +
> +/**
>   * Verbose error types.
>   *
>   * Most of them provide the type of the object referenced by struct
> 
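The fate-action question raised in this review can be made concrete with a small validator. The following is only a sketch under stated assumptions: the `sketch_*` types are hypothetical stand-ins for the rte_flow structures, `ratio` is declared as a plain field as the review requests, and the rules that `ratio` must be non-zero and that the sub-action list must contain a fate action are one of the options suggested above, not behaviour confirmed by the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_flow action types (illustration only). */
enum sketch_action_type {
	SKETCH_ACTION_END = 0,
	SKETCH_ACTION_MARK,
	SKETCH_ACTION_COUNT,
	SKETCH_ACTION_QUEUE,
	SKETCH_ACTION_RSS,
	SKETCH_ACTION_PORT_ID,
};

struct sketch_action {
	enum sketch_action_type type;
	const void *conf;
};

/* Sample action with a plain (non-const) ratio field. */
struct sketch_action_sample {
	uint32_t ratio;                      /* 1 out of 'ratio' packets is sampled */
	const struct sketch_action *actions; /* END-terminated sub-action list */
};

/* Assumption: only these types decide where the duplicate goes. */
static bool sketch_is_fate(enum sketch_action_type t)
{
	return t == SKETCH_ACTION_QUEUE ||
	       t == SKETCH_ACTION_RSS ||
	       t == SKETCH_ACTION_PORT_ID;
}

/* Accept the action only if the duplicate has a destination. */
static bool sketch_sample_is_valid(const struct sketch_action_sample *s)
{
	const struct sketch_action *a;

	if (s == NULL || s->ratio == 0 || s->actions == NULL)
		return false;
	for (a = s->actions; a->type != SKETCH_ACTION_END; a++)
		if (sketch_is_fate(a->type))
			return true;
	return false;
}
```

With this rule a list such as COUNT / QUEUE / END is accepted, while MARK / END is rejected, which is the restrictive reading the reviewer describes.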


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-03 15:27           ` Thomas Monjalon
  2020-07-03 15:36             ` Jerin Jacob
@ 2020-07-04 14:35             ` Ajit Khaparde
  1 sibling, 0 replies; 129+ messages in thread
From: Ajit Khaparde @ 2020-07-04 14:35 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Jerin Jacob, Jiawei(Jonny) Wang, Ori Kam,
	Slava Ovsiienko, dpdk-dev, Raslan Darawsheh, ian.stokes, fbl,
	Ferruh Yigit, Andrew Rybchenko

On Fri, Jul 3, 2020 at 8:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:

> 03/07/2020 17:08, Jerin Jacob:
> > On Fri, Jul 3, 2020 at 8:25 PM Matan Azrad <matan@mellanox.com> wrote:
> > > From: Jerin Jacob:
> > > > When adding overlapping API(rte_eth_mirror_rule_set()) in the same
> > > > library(ethdev).
> > > > Please depreciate the old API.
> > > > We should not have two separate paths for the same function in the
> same
> > > > ethdev library. It is pain for app and driver developers.
> > >
> > > What are about all the other rte_flow parallel configuration APIs in
> ethdev:
> > >  promiscuous_enable;
> > > promiscuous_disable;
> > > allmulticast_enable;
> > > allmulticast_disable;
> > > mac_addr_remove;
> > > mac_addr_add;
> > > mac_addr_set;
> > > set_mc_addr_list;
> > > vlan_filter_set;
> > > vlan_tpid_set;
> > > vlan_strip_queue_set;
> > > vlan_offload_set;
> > > vlan_pvid_set;
> > > udp_tunnel_port_add;
> > > udp_tunnel_port_del;
> > > ...
> > >
> > > These APIs can be replaced easily by rte_flow API.
> > > Do you think we need to deprecate all?
> >
> > I think, basic stuff like below can have separate API.
> > 1)  promiscuous_enable;
> > 2) promiscuous_disable;
> > 3) allmulticast_enable;
> > 4) allmulticast_disable;
> > 5) mac_addr_remove;
> > 6) mac_addr_add;
> > 7) mac_addr_set;
> > 8) set_mc_addr_list;
>
> "Basic" is not a precise definition :)
> I would say port-level configuration should remain
> out of rte_flow API.
>
+1


>
> > But The VLAN and UDP related should be rte_flow candidates.(IMO)
>
I do not have a strong opinion on VLAN in port-level or rte_flow list.
But isn't the UDP port number for tunnels a port-level setting for HW?


>
> Yes definitely, tunneling is better managed with rte_flow.
>
> This is an important discussion, I Cc other ethdev maintainers.
> Note: this is an ethdev patch, all ethdev maintainers should be Cc'ed.
> Aren't you using --cc-cmd devtools/get-maintainer.sh ?
>
>
>

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-03  6:39     ` Jerin Jacob
  2020-07-03 14:55       ` Matan Azrad
@ 2020-07-04 14:44       ` Ajit Khaparde
  2020-07-05  8:55         ` Thomas Monjalon
  1 sibling, 1 reply; 129+ messages in thread
From: Ajit Khaparde @ 2020-07-04 14:44 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Jiawei Wang, Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev,
	Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl

On Thu, Jul 2, 2020 at 11:40 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:

> On Fri, Jul 3, 2020 at 12:13 AM Jiawei Wang <jiaweiw@mellanox.com> wrote:
> >
> > When using full offload, all traffic will be handled by the HW, and
> > directed to the requested vf or wire, the control application loses
> > visibility on the traffic.
> > So there's a need for an action that will enable the control application
> > some visibility.
> >
> > The solution is introduced a new action that will sample the incoming
> > traffic and send a duplicated traffic in some predefined ratio to the
> > application, while the original packet will continue to the target
> > destination.
> >
> > The packets sampled equals is '1/ratio', if the ratio value be set to 1
> > , means that the packets would be completely mirrored. The sample packet
> > can be assigned with different set of actions from the original packet.
> >
> > In order to support the sample packet in rte_flow, new rte_flow action
> > definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> rte_flow_action_sample
> > will be introduced.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> > Acked-by: Ori Kam <orika@mellanox.com>
>
> When adding overlapping API(rte_eth_mirror_rule_set()) in the same
> library(ethdev).
> Please depreciate the old API.
> We should not have two separate paths for the same function in the
> same ethdev library. It is pain for app and driver developers.
>
> With the above deprecation notice,
> Acked-by: Jerin Jacob <jerinj@marvell.com>
>
I am fine with the proposed RTE_FLOW_ACTION_TYPE_SAMPLE. But..

When rte_eth_mirror_rule_set() is deprecated, are we going to add
RTE_FLOW_ACTION_TYPE_MIRROR for a full-fledged mirror action?
Or are we proposing to use RTE_FLOW_ACTION_TYPE_SAMPLE with a
ratio of 1 to mirror all packets, thereby doing away with the need for
a separate RTE_FLOW_ACTION_TYPE_MIRROR?



>
>
> > ---
> >  doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
> >  doc/guides/rel_notes/release_20_08.rst |  6 ++++++
> >  lib/librte_ethdev/rte_flow.c           |  1 +
> >  lib/librte_ethdev/rte_flow.h           | 28 ++++++++++++++++++++++++++++
> >  4 files changed, 60 insertions(+)
> >
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> b/doc/guides/prog_guide/rte_flow.rst
> > index d5dd18c..50dfe1f 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -2645,6 +2645,31 @@ timeout passed without any matching on the flow.
> >     | ``context``  | user input flow context         |
> >     +--------------+---------------------------------+
> >
> > +Action: ``SAMPLE``
> > +^^^^^^^^^^^^^^^^^^
> > +
> > +Adds a sample action to a matched flow.
> > +
> > +The matching packets will be duplicated to a special queue or vport
> > +with the predefined ``ratio``, the packets sampled equals is '1/ratio'.
> > +All the packets continues to the target destination.
> > +
> > +When the ``ratio`` is set to 1 then the packets will be 100% mirrored.
> > +``actions`` represent the different set of actions for the sampled or
> mirrored
> > +packets.
> > +
> > +.. _table_rte_flow_action_sample:
> > +
> > +.. table:: SAMPLE
> > +
> > +   +--------------+---------------------------------+
> > +   | Field        | Value                           |
> > +   +==============+=================================+
> > +   | ``ratio``    | 32 bits sample ratio value      |
> > +   +--------------+---------------------------------+
> > +   | ``actions``  | sub-action list for sampling    |
> > +   +--------------+---------------------------------+
> > +
> >  Negative types
> >  ~~~~~~~~~~~~~~
> >
> > diff --git a/doc/guides/rel_notes/release_20_08.rst
> b/doc/guides/rel_notes/release_20_08.rst
> > index 5cbc4ce..313e8d3 100644
> > --- a/doc/guides/rel_notes/release_20_08.rst
> > +++ b/doc/guides/rel_notes/release_20_08.rst
> > @@ -81,6 +81,12 @@ New Features
> >    * Added support for virtio queue statistics.
> >    * Added support for MTU update.
> >
> > +* **Added flow-based traffic sampling support.**
> > +
> > +  Added new action: ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate the
> matching
> > +  packets with given ratio and redirects to vport or queue. The sampled
> packets
> > +  also can be assigned with an additional optional actions.
> > +
> >  * **Updated Marvell octeontx2 ethdev PMD.**
> >
> >    Updated Marvell octeontx2 driver with cn98xx support.
> > diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> > index 1685be5..733871d 100644
> > --- a/lib/librte_ethdev/rte_flow.c
> > +++ b/lib/librte_ethdev/rte_flow.c
> > @@ -173,6 +173,7 @@ struct rte_flow_desc_data {
> >         MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct
> rte_flow_action_set_dscp)),
> >         MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct
> rte_flow_action_set_dscp)),
> >         MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
> > +       MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
> >  };
> >
> >  int
> > diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > index b0e4199..c9cd80d 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -2099,6 +2099,13 @@ enum rte_flow_action_type {
> >          * see enum RTE_ETH_EVENT_FLOW_AGED
> >          */
> >         RTE_FLOW_ACTION_TYPE_AGE,
> > +
> > +       /**
> > +        * Redirects specific ratio of packets to vport or queue.
> > +        *
> > +        * See struct rte_flow_action_sample.
> > +        */
> > +       RTE_FLOW_ACTION_TYPE_SAMPLE,
> >  };
> >
> >  /**
> > @@ -2709,6 +2716,27 @@ struct rte_flow_action {
> >  struct rte_flow;
> >
> >  /**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_SAMPLE
> > + *
> > + * Adds a sample action to a matched flow.
> > + *
> > + * The matching packets will be duplicated to a special queue or vport
> > + * in the predefined probabiilty, All the packets continues processing
> > + * on the default flow path.
> > + *
> > + * When the sample ratio is set to 1 then the packets will be 100%
> mirrored.
> > + * Additional action list be supported to add for sampled or mirrored
> packets.
> > + */
> > +struct rte_flow_action_sample {
> > +       const uint32_t ratio; /**< packets sampled equals to '1/ratio'.
> */
> > +       const struct rte_flow_action *actions;
> > +               /**< sub-action list specific for the sampling hit
> cases. */
> > +};
> > +
> > +/**
> >   * Verbose error types.
> >   *
> >   * Most of them provide the type of the object referenced by struct
> > --
> > 1.8.3.1
> >
>

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-03 15:36             ` Jerin Jacob
@ 2020-07-04 19:26               ` Matan Azrad
  2020-07-05  1:21                 ` Jerin Jacob
  0 siblings, 1 reply; 129+ messages in thread
From: Matan Azrad @ 2020-07-04 19:26 UTC (permalink / raw)
  To: Jerin Jacob, Thomas Monjalon
  Cc: Jiawei(Jonny) Wang, Ori Kam, Slava Ovsiienko, dpdk-dev,
	Raslan Darawsheh, ian.stokes, fbl, Ferruh Yigit,
	Andrew Rybchenko

Hi all

From: Jerin Jacob:
> On Fri, Jul 3, 2020 at 8:57 PM Thomas Monjalon <thomas@monjalon.net>
> wrote:
> >
> > 03/07/2020 17:08, Jerin Jacob:
> > > On Fri, Jul 3, 2020 at 8:25 PM Matan Azrad <matan@mellanox.com>
> wrote:
> > > > From: Jerin Jacob:
> > > > > When adding overlapping API(rte_eth_mirror_rule_set()) in the
> > > > > same library(ethdev).
> > > > > Please depreciate the old API.
> > > > > We should not have two separate paths for the same function in
> > > > > the same ethdev library. It is pain for app and driver developers.
> > > >
> > > > What are about all the other rte_flow parallel configuration APIs in
> ethdev:
> > > >  promiscuous_enable;
> > > > promiscuous_disable;
> > > > allmulticast_enable;
> > > > allmulticast_disable;
> > > > mac_addr_remove;
> > > > mac_addr_add;
> > > > mac_addr_set;
> > > > set_mc_addr_list;
> > > > vlan_filter_set;
> > > > vlan_tpid_set;
> > > > vlan_strip_queue_set;
> > > > vlan_offload_set;
> > > > vlan_pvid_set;
> > > > udp_tunnel_port_add;
> > > > udp_tunnel_port_del;
> > > > ...
> > > >
> > > > These APIs can be replaced easily by rte_flow API.
> > > > Do you think we need to deprecate all?
> > >
> > > I think, basic stuff like below can have separate API.
> > > 1)  promiscuous_enable;
> > > 2) promiscuous_disable;
> > > 3) allmulticast_enable;
> > > 4) allmulticast_disable;
> > > 5) mac_addr_remove;
> > > 6) mac_addr_add;
> > > 7) mac_addr_set;
> > > 8) set_mc_addr_list;
> >
> > "Basic" is not a precise definition :)
> 
> Yep.
> 
> > I would say port-level configuration should remain out of rte_flow
> > API.

Thomas, can you explain what port-level means?
Everything in rte_flow is per port...

Also, can you give reasons for your claim?

> +1.
> In addition that, I would say anything needs to configured at "per-flow"
> granularity use rte_flow.

Jerin, what do you mean by "per-flow"?
Everything in traffic filtering/actions is per flow, for example:

Promiscuous: flow create 0 ingress pattern eth / end actions queue index 0 / end
Multicast/MAC related: flow create 0 ingress pattern eth dst is X / end actions queue index 0 / end
(in case of legacy RSS, the queue action will be replaced by rss).
....

All of these are flows.

> >
> > > But The VLAN and UDP related should be rte_flow candidates.(IMO)
> >
> > Yes definitely, tunneling is better managed with rte_flow.

Can you give reasons for your claim?
Why should Vlan\Tunnel be in rte_flow and promiscuous\Multicast\mac not? 

> > This is an important discussion, I Cc other ethdev maintainers.

Agree, this is a good trigger to open this important discussion.

> > Note: this is an ethdev patch, all ethdev maintainers should be Cc'ed.
> > Aren't you using --cc-cmd devtools/get-maintainer.sh ?


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-04 19:26               ` Matan Azrad
@ 2020-07-05  1:21                 ` Jerin Jacob
  2020-07-05  4:52                   ` Matan Azrad
  0 siblings, 1 reply; 129+ messages in thread
From: Jerin Jacob @ 2020-07-05  1:21 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Thomas Monjalon, Jiawei(Jonny) Wang, Ori Kam, Slava Ovsiienko,
	dpdk-dev, Raslan Darawsheh, ian.stokes, fbl, Ferruh Yigit,
	Andrew Rybchenko

On Sun, Jul 5, 2020 at 12:56 AM Matan Azrad <matan@mellanox.com> wrote:
>
> Hi all
>
> From: Jerin Jacob:
> > On Fri, Jul 3, 2020 at 8:57 PM Thomas Monjalon <thomas@monjalon.net>
> > wrote:
> > >
> > > 03/07/2020 17:08, Jerin Jacob:
> > > > On Fri, Jul 3, 2020 at 8:25 PM Matan Azrad <matan@mellanox.com>
> > wrote:
> > > > > From: Jerin Jacob:
> > > > > > When adding overlapping API(rte_eth_mirror_rule_set()) in the
> > > > > > same library(ethdev).
> > > > > > Please depreciate the old API.
> > > > > > We should not have two separate paths for the same function in
> > > > > > the same ethdev library. It is pain for app and driver developers.
> > > > >
> > > > > What are about all the other rte_flow parallel configuration APIs in
> > ethdev:
> > > > >  promiscuous_enable;
> > > > > promiscuous_disable;
> > > > > allmulticast_enable;
> > > > > allmulticast_disable;
> > > > > mac_addr_remove;
> > > > > mac_addr_add;
> > > > > mac_addr_set;
> > > > > set_mc_addr_list;
> > > > > vlan_filter_set;
> > > > > vlan_tpid_set;
> > > > > vlan_strip_queue_set;
> > > > > vlan_offload_set;
> > > > > vlan_pvid_set;
> > > > > udp_tunnel_port_add;
> > > > > udp_tunnel_port_del;
> > > > > ...
> > > > >
> > > > > These APIs can be replaced easily by rte_flow API.
> > > > > Do you think we need to deprecate all?
> > > >
> > > > I think, basic stuff like below can have separate API.
> > > > 1)  promiscuous_enable;
> > > > 2) promiscuous_disable;
> > > > 3) allmulticast_enable;
> > > > 4) allmulticast_disable;
> > > > 5) mac_addr_remove;
> > > > 6) mac_addr_add;
> > > > 7) mac_addr_set;
> > > > 8) set_mc_addr_list;
> > >
> > > "Basic" is not a precise definition :)
> >
> > Yep.
> >
> > > I would say port-level configuration should remain out of rte_flow
> > > API.
>
> Thomas, Can you explain what is port-level?
> Everything in rte_flow is per port...
>
> Also, can you give reasons for your claim?
>
> > +1.
> > In addition that, I would say anything needs to configured at "per-flow"
> > granularity use rte_flow.
>
> Jerin, What do you mean "per-flow" ?

In terms of NIC HW features, typical HW will have:
a) Basic "port"-level configuration, like
- enable/disable promiscuous
b) Advanced HW will have CAM-based flow filtering. IMO, CAM-related
stuff should go to rte_flow.

This is so that a very basic PMD (without advanced features) can work
with the basic port-level APIs (i.e. without rte_flow).

I have seen promiscuous and MAC address handling as part of basic NIC
HW (i.e. NICs without advanced CAM filters).
That's my reasoning for the split.

> Everything in traffic filtering\actions is per flow, for example:
> Promiscuous: flow create 0 ingress pattern eth / end actions queue index 0 / end

IMO, it is not an accurate representation of promiscuous enable. It
needs to send the traffic to all queues, and the pattern is not just eth.

> Multicast\mac related: flow create 0 ingress pattern eth dst is X /end actions queue 0/ end
> (in case of legacy RSS queue action will be replaced by rss).
> ....
>
> Everything here are flows.
>
> > >
> > > > But The VLAN and UDP related should be rte_flow candidates.(IMO)
> > >
> > > Yes definitely, tunneling is better managed with rte_flow.
>
> Can you give reasons for your claim?
> Why should Vlan\Tunnel be in rte_flow and promiscuous\Multicast\mac not?
>
> > > This is an important discussion, I Cc other ethdev maintainers.
>
> Agree, this is a good trigger to open this important discussion.
>
> > > Note: this is an ethdev patch, all ethdev maintainers should be Cc'ed.
> > > Aren't you using --cc-cmd devtools/get-maintainer.sh ?
>

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-05  1:21                 ` Jerin Jacob
@ 2020-07-05  4:52                   ` Matan Azrad
  2020-07-06  8:37                     ` Jerin Jacob
  0 siblings, 1 reply; 129+ messages in thread
From: Matan Azrad @ 2020-07-05  4:52 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Thomas Monjalon, Jiawei(Jonny) Wang, Ori Kam, Slava Ovsiienko,
	dpdk-dev, Raslan Darawsheh, ian.stokes, fbl, Ferruh Yigit,
	Andrew Rybchenko



From: Jerin Jacob:
> On Sun, Jul 5, 2020 at 12:56 AM Matan Azrad <matan@mellanox.com> wrote:
> >
> > Hi all
> >
> > From: Jerin Jacob:
> > > On Fri, Jul 3, 2020 at 8:57 PM Thomas Monjalon <thomas@monjalon.net>
> > > wrote:
> > > >
> > > > 03/07/2020 17:08, Jerin Jacob:
> > > > > On Fri, Jul 3, 2020 at 8:25 PM Matan Azrad <matan@mellanox.com>
> > > wrote:
> > > > > > From: Jerin Jacob:
> > > > > > > When adding overlapping API(rte_eth_mirror_rule_set()) in
> > > > > > > the same library(ethdev).
> > > > > > > Please depreciate the old API.
> > > > > > > We should not have two separate paths for the same function
> > > > > > > in the same ethdev library. It is pain for app and driver developers.
> > > > > >
> > > > > > What are about all the other rte_flow parallel configuration
> > > > > > APIs in
> > > ethdev:
> > > > > >  promiscuous_enable;
> > > > > > promiscuous_disable;
> > > > > > allmulticast_enable;
> > > > > > allmulticast_disable;
> > > > > > mac_addr_remove;
> > > > > > mac_addr_add;
> > > > > > mac_addr_set;
> > > > > > set_mc_addr_list;
> > > > > > vlan_filter_set;
> > > > > > vlan_tpid_set;
> > > > > > vlan_strip_queue_set;
> > > > > > vlan_offload_set;
> > > > > > vlan_pvid_set;
> > > > > > udp_tunnel_port_add;
> > > > > > udp_tunnel_port_del;
> > > > > > ...
> > > > > >
> > > > > > These APIs can be replaced easily by rte_flow API.
> > > > > > Do you think we need to deprecate all?
> > > > >
> > > > > I think, basic stuff like below can have separate API.
> > > > > 1)  promiscuous_enable;
> > > > > 2) promiscuous_disable;
> > > > > 3) allmulticast_enable;
> > > > > 4) allmulticast_disable;
> > > > > 5) mac_addr_remove;
> > > > > 6) mac_addr_add;
> > > > > 7) mac_addr_set;
> > > > > 8) set_mc_addr_list;
> > > >
> > > > "Basic" is not a precise definition :)
> > >
> > > Yep.
> > >
> > > > I would say port-level configuration should remain out of rte_flow
> > > > API.
> >
> > Thomas, Can you explain what is port-level?
> > Everything in rte_flow is per port...
> >
> > Also, can you give reasons for your claim?
> >
> > > +1.
> > > In addition that, I would say anything needs to configured at "per-flow"
> > > granularity use rte_flow.
> >
> > Jerin, What do you mean "per-flow" ?
> 
> In Terms of  NIC HW features, Typical HW will have
> a) Basic "port" level configuration like
> - enable/disable promiscuous

What is "port level"? Everything in rte_flow is also per port...

> b) Advance HW's will have CAM based flow filtering. IMO, CAM related stuff
> should go to rte_flow.

It is HW-internal; I don't think all HWs use the same logic here.
Since rte_flow is generic for all filtering methods, it is a good candidate API for all HWs.

> This is to enable,  The very basic PMD(without advanced features) should
> work with port level basic APIs(i.e without rte_flow)

What is "basic"? Do you mean simple match and simple action?
As I said, rte_flow is also a port-level API, so "port level" is not a good term here.

As you said: "When adding overlapping API(rte_eth_mirror_rule_set()) in the same library(ethdev). Please depreciate the old API."

Having different APIs that do the same thing is not good, especially in packet filtering:
What should we do if we have conflicts?
For example: the legacy filtering APIs say to receive packet A and rte_flow says to drop it.

Don't you think it further complicates the user's understanding of the API, and also the PMD implementations?


> I have seen promiscuous, mac address handling is part of basic NIC HW(i.e
> NICs without advanced CAM filters).
> That's my reasoning for the split.

As I said, the NIC HW-specific implementation should not affect the API.
I don't think we need to split the API and complicate things for the user because of it.

IMO, it is better to have one generic API for packet filtering:
it is clearer, simpler, generic and classic.
The user and the PMD need to understand only one filtering API and that's it (no need to combine multiple filtering APIs).

I know this is a big change, but we can do it in a modular way.
It reminds me of the big change that was done in the Rx/Tx offload configuration:
when offloads became more popular, we gradually forced users to move to the new offload API.
Likewise here, flow filtering is becoming popular, so maybe this is the time (20.08-20.11) to force changes in the old APIs.

> > Everything in traffic filtering\actions is per flow, for example:
> > Promiscuous: flow create 0 ingress pattern eth / end actions queue
> > index 0 / end
> 
> IMO, it is not an accurate representation of promiscuous enable. It needs to
> send the traffic to all queues and patterns is not just eth.

Yes, if legacy RSS is configured we need to replace the above queue action with an rss action, as I wrote before (did you read it just below?).

So, we can add the legacy RSS APIs to the list above...

> > Multicast\mac related: flow create 0 ingress pattern eth dst is X /end
> > actions queue 0/ end (in case of legacy RSS queue action will be replaced by
> rss).
> > ....
> >
> > Everything here are flows.
> >
> > > >
> > > > > But The VLAN and UDP related should be rte_flow candidates.(IMO)
> > > >
> > > > Yes definitely, tunneling is better managed with rte_flow.
> >
> > Can you give reasons for your claim?
> > Why should Vlan\Tunnel be in rte_flow and promiscuous\Multicast\mac
> not?
> >
> > > > This is an important discussion, I Cc other ethdev maintainers.
> >
> > Agree, this is a good trigger to open this important discussion.
> >
> > > > Note: this is an ethdev patch, all ethdev maintainers should be Cc'ed.
> > > > Aren't you using --cc-cmd devtools/get-maintainer.sh ?
> >

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-04 14:44       ` Ajit Khaparde
@ 2020-07-05  8:55         ` Thomas Monjalon
  2020-07-05 23:54           ` Ajit Khaparde
  0 siblings, 1 reply; 129+ messages in thread
From: Thomas Monjalon @ 2020-07-05  8:55 UTC (permalink / raw)
  To: Jerin Jacob, Ajit Khaparde
  Cc: Jiawei Wang, Ori Kam, Slava Ovsiienko, Matan Azrad, dpdk-dev,
	Raslan Darawsheh, ian.stokes, fbl

04/07/2020 16:44, Ajit Khaparde:
> On Thu, Jul 2, 2020 at 11:40 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > When adding overlapping API(rte_eth_mirror_rule_set()) in the same
> > library(ethdev).
> > Please depreciate the old API.
> > We should not have two separate paths for the same function in the
> > same ethdev library. It is pain for app and driver developers.
> >
> > With the above deprecation notice,
> > Acked-by: Jerin Jacob <jerinj@marvell.com>
> >
> I am fine with the proposed RTE_FLOW_ACTION_TYPE_SAMPLE. But..
> 
> When rte_eth_mirror_rule_set() is deprecated, are we going to add
> RTE_FLOW_ACTION_TYPE_MIRROR for full fledged mirror action?
> Or we are proposing to use RTE_FLOW_ACTION_TYPE_SAMPLE with
> ratio of 1 to mirror all packets, thereby doing away with the need for
> a separate RTE_FLOW_ACTION_TYPE_MIRROR?

The idea is to use RTE_FLOW_ACTION_TYPE_SAMPLE with ratio=1 for mirroring.
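The "1/ratio" semantics, including the ratio=1 mirroring case, can be sketched with a simple counter. This is an illustration under assumptions only: `sketch_sampler` and `sketch_sampler_hit` are hypothetical names, the selection is deterministic (every ratio-th packet), and the thread does not say whether real hardware picks packets this way or probabilistically.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-flow sampler state (not part of rte_flow). */
struct sketch_sampler {
	uint32_t ratio; /* 1 out of 'ratio' packets is duplicated */
	uint32_t seen;  /* packets seen since the last duplicate */
};

/*
 * Returns true when the current packet should be duplicated.
 * The original packet always continues on its normal path;
 * this only decides whether a copy is also produced.
 */
static bool sketch_sampler_hit(struct sketch_sampler *s)
{
	if (s->ratio == 0)
		return false; /* assumption: 0 means "never sample" */
	s->seen++;
	if (s->seen == s->ratio) {
		s->seen = 0;
		return true;
	}
	return false;
}
```

With ratio set to 1 the function returns true for every packet, which is the full-mirroring behaviour described above; with ratio set to 4 it returns true for every fourth packet.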




^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-04 13:04     ` Andrew Rybchenko
@ 2020-07-05 10:18       ` Ori Kam
  2020-07-05 23:54         ` Ajit Khaparde
  2020-07-06  6:53         ` Jiawei(Jonny) Wang
  0 siblings, 2 replies; 129+ messages in thread
From: Ori Kam @ 2020-07-05 10:18 UTC (permalink / raw)
  To: Andrew Rybchenko, Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl

Hi Andrew,

I replied to some of your comments. 
Best,
Ori
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Andrew Rybchenko
> Sent: Saturday, July 4, 2020 4:05 PM
> Subject: Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte
> flow
> 
> On 7/2/20 9:43 PM, Jiawei Wang wrote:
> > When using full offload, all traffic will be handled by the HW, and
> > directed to the requested vf or wire, the control application loses
> 
> vf->VF
> 
> > visibility on the traffic.
> > So there's a need for an action that will enable the control application
> > some visibility.
> >
> > The solution is introduced a new action that will sample the incoming
> > traffic and send a duplicated traffic in some predefined ratio to the
> > application, while the original packet will continue to the target
> > destination.
> >
> 
> May be 1 packet per second is a better sampling approach?
> Or just different.
> 
Those are two different things. Take a packet that arrives once every two seconds:
if we ask to sample once every second, we will always get that packet.
Also, as far as I understand, the use case is to have some visibility into the traffic,
so if a sampled packet is sent only once per second, the application will get packets
with very high delay and very low visibility. Take a use case where the hypervisor
wants to check whether one of the VMs is abusing the system (sending DDoS packets, or just
trying to understand the network). In this case we can assume that the VM will send a large
amount of traffic. If we only check once per second, the application will not be able to
understand what the traffic means, but if we sample 1% of the traffic, the application will
very quickly see the type of traffic the VM is sending and whether it is trying to abuse the system.
So I vote in favor of keeping it as is.
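The difference between ratio-based and time-based sampling can be put in numbers. The sketch below is a rough model under assumptions: the function names are hypothetical, the ratio sampler takes exactly one of every `ratio` packets, and the time-based figure is an upper bound that assumes at most one sample per elapsed second.

```c
#include <stdint.h>

/* Ratio-based: one sample for every 'ratio' packets seen. */
static uint64_t sketch_ratio_samples(uint64_t packets, uint32_t ratio)
{
	return ratio ? packets / ratio : 0;
}

/*
 * Time-based: at most one sample per second, so the count is
 * bounded by both the elapsed seconds and the packets that arrived.
 */
static uint64_t sketch_rate_samples(uint64_t packets, uint64_t seconds)
{
	return packets < seconds ? packets : seconds;
}
```

For a VM sending 1,000,000 packets in 10 seconds, a ratio of 100 (1%) yields 10,000 samples while one-per-second yields at most 10. For a slow flow of 5 packets in 10 seconds, the 1% ratio yields none, while one-per-second can capture all 5, which is the first case described above.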


> > The packets sampled equals is '1/ratio', if the ratio value be set to 1
> > , means that the packets would be completely mirrored. The sample packet
> 
> Comma on the next line looks bad.
> 
> > can be assigned with different set of actions from the original packet.
> >
> > In order to support the sample packet in rte_flow, new rte_flow action
> > definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> rte_flow_action_sample
> > will be introduced.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> > Acked-by: Ori Kam <orika@mellanox.com>
> > ---
> >  doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
> >  doc/guides/rel_notes/release_20_08.rst |  6 ++++++
> >  lib/librte_ethdev/rte_flow.c           |  1 +
> >  lib/librte_ethdev/rte_flow.h           | 28 ++++++++++++++++++++++++++++
> >  4 files changed, 60 insertions(+)
> >
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> b/doc/guides/prog_guide/rte_flow.rst
> > index d5dd18c..50dfe1f 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -2645,6 +2645,31 @@ timeout passed without any matching on the flow.
> >     | ``context``  | user input flow context         |
> >     +--------------+---------------------------------+
> >
> > +Action: ``SAMPLE``
> > +^^^^^^^^^^^^^^^^^^
> > +
> > +Adds a sample action to a matched flow.
> > +
> > +The matching packets will be duplicated to a special queue or vport
> 
> what is vport above?
> 
I think it should be port (when using E-Switch)

> > +with the predefined ``ratio``, the packets sampled equals is '1/ratio'.
> > +All the packets continues to the target destination.
> 
> continues -> continue (if I'm not mistaken)
> 
> > +
> > +When the ``ratio`` is set to 1 then the packets will be 100% mirrored.
> > +``actions`` represent the different set of actions for the sampled or mirrored
> > +packets.
> > +
> > +.. _table_rte_flow_action_sample:
> > +
> > +.. table:: SAMPLE
> > +
> > +   +--------------+---------------------------------+
> > +   | Field        | Value                           |
> > +   +==============+=================================+
> > +   | ``ratio``    | 32 bits sample ratio value      |
> > +   +--------------+---------------------------------+
> > +   | ``actions``  | sub-action list for sampling    |
> > +   +--------------+---------------------------------+
> > +
> >  Negative types
> >  ~~~~~~~~~~~~~~
> >
> > diff --git a/doc/guides/rel_notes/release_20_08.rst
> b/doc/guides/rel_notes/release_20_08.rst
> > index 5cbc4ce..313e8d3 100644
> > --- a/doc/guides/rel_notes/release_20_08.rst
> > +++ b/doc/guides/rel_notes/release_20_08.rst
> > @@ -81,6 +81,12 @@ New Features
> >    * Added support for virtio queue statistics.
> >    * Added support for MTU update.
> >
> > +* **Added flow-based traffic sampling support.**
> > +
> > +  Added new action: ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate the
> matching
> > +  packets with given ratio and redirects to vport or queue. The sampled
> packets
> 
> What is vport above?
> 
See comment above.

> > +  also can be assigned with an additional optional actions.
> 
> May actions list be empty or NULL? If no, it does not look
> optional.
> 
I think that the action list can't be NULL or empty. An empty list has no meaning.
I agree it should be stated.

> > +
> >  * **Updated Marvell octeontx2 ethdev PMD.**
> >
> >    Updated Marvell octeontx2 driver with cn98xx support.
> > diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> > index 1685be5..733871d 100644
> > --- a/lib/librte_ethdev/rte_flow.c
> > +++ b/lib/librte_ethdev/rte_flow.c
> > @@ -173,6 +173,7 @@ struct rte_flow_desc_data {
> >  	MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct
> rte_flow_action_set_dscp)),
> >  	MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct
> rte_flow_action_set_dscp)),
> >  	MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
> > +	MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
> >  };
> >
> >  int
> > diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > index b0e4199..c9cd80d 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -2099,6 +2099,13 @@ enum rte_flow_action_type {
> >  	 * see enum RTE_ETH_EVENT_FLOW_AGED
> >  	 */
> >  	RTE_FLOW_ACTION_TYPE_AGE,
> > +
> > +	/**
> > +	 * Redirects specific ratio of packets to vport or queue.
> > +	 *
> > +	 * See struct rte_flow_action_sample.
> > +	 */
> > +	RTE_FLOW_ACTION_TYPE_SAMPLE,
> >  };
> >
> >  /**
> > @@ -2709,6 +2716,27 @@ struct rte_flow_action {
> >  struct rte_flow;
> >
> >  /**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_SAMPLE
> > + *
> > + * Adds a sample action to a matched flow.
> > + *
> > + * The matching packets will be duplicated to a special queue or vport
> 
> again 'vport' here
> It sounds misleading and too restrictive to say "be duplicated
> to a special queue or vport". There is no specification of the
> queue or vport in the control structure.
> You should either describe it in a generic way like "be
> duplicated and own set of actions with a fate action applied"
> or put a restriction about QUEUE, RSS or "vport"-related action
> to be present in the sub-actions list.
> 
> > + * in the predefined probabiilty, All the packets continues processing
> 
> probabiilty -> probability
> I think 'predefined' is misleading here, 'specified' is better.
> Also strictly speaking it is not a predefined probability (as
> Stephen suggested), it is defined ratio.
> 
> > + * on the default flow path.
> > + *
> > + * When the sample ratio is set to 1 then the packets will be 100% mirrored.
> > + * Additional action list be supported to add for sampled or mirrored
> packets.
> > + */
> > +struct rte_flow_action_sample {
> > +	const uint32_t ratio; /**< packets sampled equals to '1/ratio'. */
> 
> const is still above and it is meaningless (other actions do
> not have 'const' for plain fields).
> 
+1

> > +	const struct rte_flow_action *actions;
> > +		/**< sub-action list specific for the sampling hit cases. */
> 
> Is it required to have fate action?
> May I use it to MARK some packets and do not duplicate?
> I guess no. Or COUNT and DROP? Just COUNT?
>
From my understanding, you may use mark and count, but the list must also
have a fate action.
 
> What I'm trying to say that you're adding a generic packet
> selection mechanism with very restricted usage by design.
> 
> Anyway, if you go with it, please, process other notes above.
> 

> > +};
> > +
> > +/**
> >   * Verbose error types.
> >   *
> >   * Most of them provide the type of the object referenced by struct
> >



* Re: [dpdk-dev] [PATCH v2 4/7] net/mlx5: add the validate sample action
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 4/7] net/mlx5: add the validate sample action Jiawei Wang
@ 2020-07-05 19:30     ` Ori Kam
  0 siblings, 0 replies; 129+ messages in thread
From: Ori Kam @ 2020-07-05 19:30 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Jiawei(Jonny) Wang

Hi Jiawei

> -----Original Message-----
> From: Jiawei Wang <jiaweiw@mellanox.com>
> Subject: [PATCH v2 4/7] net/mlx5: add the validate sample action
> 
> Add sample action validate function.
> 
> For Sample flow support NIC-RX and FDB domain, must include an
> action of a dest TIR in NIC_RX.
> 
> Only NIC_RX support with addition optional actions. FDB doesn't
> support any optional action, the sampled packets is always goes
> to e-switch manager port.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> ---
Acked-by: Ori Kam <orika@mellanox.com>
Thanks,
Ori



* Re: [dpdk-dev] [PATCH v2 6/7] net/mlx5: update translate function for sample action
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 6/7] net/mlx5: update translate function for sample action Jiawei Wang
@ 2020-07-05 19:32     ` Ori Kam
  0 siblings, 0 replies; 129+ messages in thread
From: Ori Kam @ 2020-07-05 19:32 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Jiawei(Jonny) Wang

Hi Jiawei,

> -----Original Message-----
> From: Jiawei Wang <jiaweiw@mellanox.com>
> Subject: [PATCH v2 6/7] net/mlx5: update translate function for sample action
> 
> Translate the attribute of sample action that include sample ratio
> and sub actions list, then create the sample DR action.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> ---

Acked-by: Ori Kam <orika@mellanox.com>
Thanks,
Ori


* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-05 10:18       ` Ori Kam
@ 2020-07-05 23:54         ` Ajit Khaparde
  2020-07-06  6:53         ` Jiawei(Jonny) Wang
  1 sibling, 0 replies; 129+ messages in thread
From: Ajit Khaparde @ 2020-07-05 23:54 UTC (permalink / raw)
  To: Ori Kam
  Cc: Andrew Rybchenko, Jiawei(Jonny) Wang, Slava Ovsiienko,
	Matan Azrad, dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes,
	fbl

On Sun, Jul 5, 2020 at 3:19 AM Ori Kam <orika@mellanox.com> wrote:

> Hi Andrew,
>
> I replied to some of your comments.
> Best,
> Ori
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Andrew Rybchenko
> > Sent: Saturday, July 4, 2020 4:05 PM
> > Subject: Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action
> for rte
> > flow
> >
> > On 7/2/20 9:43 PM, Jiawei Wang wrote:
> > > When using full offload, all traffic will be handled by the HW, and
> > > directed to the requested vf or wire, the control application loses
> >
> > vf->VF
> >
> > > visibility on the traffic.
> > > So there's a need for an action that will enable the control
> application
> > > some visibility.
> > >
> > > The solution is introduced a new action that will sample the incoming
> > > traffic and send a duplicated traffic in some predefined ratio to the
> > > application, while the original packet will continue to the target
> > > destination.
> > >
> >
> > May be 1 packet per second is a better sampling approach?
> > Or just different.
> >
> Those are two different things, lets take a packet that arrives once every
> two seconds
> and we ask to sample once every second, this means that we will always get
> that packet.
> Also as far as I understand the use case is to have some visibility about
> the traffic.
> so you can assume that if a packet is sent once per second the application
> will get the packet
> with very high delay and very low visibility. Lets take a use case that
> the hyprivor
> wants to check if one of the VM is abusing the system (sends DDOS packets,
> or just
> trying to understand the network) in this case we can assume that the VM
> will send large
> amount of traffic. and if we only check once per second the application
> will not be able to
> understand the traffic meaning, but if we sample 1% of the traffic then
> the application will
> see very fast the type of the traffic the VM is sending and if it is
> trying to abuse the system.
> So I vote in favor of keeping as is.
>

Thanks for bringing this up.
So an application may want to sample either a "finite" number of packets per
second, "a percentage" of packets per second, or "all" packets.
So a "uint32_t ratio" may not be enough by itself.
Maybe we also need to couple it with a unit.

uint32_t sampling_unit; /* Specifies one of the units to use while sampling. */
RTE_FLOW_NUM_PACKETS_PER_SEC     /* Samples a specific number of packets per second. */
RTE_FLOW_PERCENT_PACKETS_PER_SEC /* Samples a percentage of packets per second. */
RTE_FLOW_ALL_PACKETS             /* Samples all packets - equivalent to mirror. */
This may be redundant if percentage is specified and ratio is 100.
In that case instead of "uint32_t ratio", just use "uint32_t sample"?
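
As a rough illustration of the idea, here is a minimal C sketch. The unit
names are the ones proposed above; they are not part of rte_flow.h, and the
enum, struct and helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unit names come from the proposal in this thread; they are NOT part of
 * rte_flow.h. The enum and struct names are hypothetical. */
enum sampling_unit_sketch {
	RTE_FLOW_NUM_PACKETS_PER_SEC,     /* sample N packets per second */
	RTE_FLOW_PERCENT_PACKETS_PER_SEC, /* sample a percentage of packets */
	RTE_FLOW_ALL_PACKETS,             /* sample everything (mirror) */
};

struct sample_conf_sketch {
	enum sampling_unit_sketch sampling_unit;
	uint32_t sample; /* meaning depends on sampling_unit */
};

/* Return 0 if the value is meaningful for the chosen unit, -1 otherwise. */
static int
sample_conf_check(const struct sample_conf_sketch *conf)
{
	switch (conf->sampling_unit) {
	case RTE_FLOW_NUM_PACKETS_PER_SEC:
		return conf->sample > 0 ? 0 : -1;
	case RTE_FLOW_PERCENT_PACKETS_PER_SEC:
		return (conf->sample >= 1 && conf->sample <= 100) ? 0 : -1;
	case RTE_FLOW_ALL_PACKETS:
		return 0; /* the value is ignored for this unit */
	}
	return -1; /* unknown unit */
}

/* True when the configuration amounts to mirroring every packet. */
static bool
sample_conf_is_mirror(const struct sample_conf_sketch *conf)
{
	return conf->sampling_unit == RTE_FLOW_ALL_PACKETS ||
	       (conf->sampling_unit == RTE_FLOW_PERCENT_PACKETS_PER_SEC &&
		conf->sample == 100);
}
```

The second helper shows the redundancy mentioned above: a percentage of 100
and the "all packets" unit describe the same behaviour.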



>
> > > The packets sampled equals is '1/ratio', if the ratio value be set to 1
> > > , means that the packets would be completely mirrored. The sample
> packet
> >
> > Comma on the next line looks bad.
> >
> > > can be assigned with different set of actions from the original packet.
> > >
> > > In order to support the sample packet in rte_flow, new rte_flow action
> > > definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> > rte_flow_action_sample
> > > will be introduced.
> > >
> > > Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> > > Acked-by: Ori Kam <orika@mellanox.com>
> > > ---
> > >  doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
> > >  doc/guides/rel_notes/release_20_08.rst |  6 ++++++
> > >  lib/librte_ethdev/rte_flow.c           |  1 +
> > >  lib/librte_ethdev/rte_flow.h           | 28
> ++++++++++++++++++++++++++++
> > >  4 files changed, 60 insertions(+)
> > >
> > > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > b/doc/guides/prog_guide/rte_flow.rst
> > > index d5dd18c..50dfe1f 100644
> > > --- a/doc/guides/prog_guide/rte_flow.rst
> > > +++ b/doc/guides/prog_guide/rte_flow.rst
> > > @@ -2645,6 +2645,31 @@ timeout passed without any matching on the flow.
> > >     | ``context``  | user input flow context         |
> > >     +--------------+---------------------------------+
> > >
> > > +Action: ``SAMPLE``
> > > +^^^^^^^^^^^^^^^^^^
> > > +
> > > +Adds a sample action to a matched flow.
> > > +
> > > +The matching packets will be duplicated to a special queue or vport
> >
> > what is vport above?
> >
> I think it should be port (when using E-Switch)
>
> > > +with the predefined ``ratio``, the packets sampled equals is
> '1/ratio'.
> > > +All the packets continues to the target destination.
> >
> > continues -> continue (if I'm not mistaken)
> >
> > > +
> > > +When the ``ratio`` is set to 1 then the packets will be 100% mirrored.
> > > +``actions`` represent the different set of actions for the sampled or
> mirrored
> > > +packets.
> > > +
> > > +.. _table_rte_flow_action_sample:
> > > +
> > > +.. table:: SAMPLE
> > > +
> > > +   +--------------+---------------------------------+
> > > +   | Field        | Value                           |
> > > +   +==============+=================================+
> > > +   | ``ratio``    | 32 bits sample ratio value      |
> > > +   +--------------+---------------------------------+
> > > +   | ``actions``  | sub-action list for sampling    |
> > > +   +--------------+---------------------------------+
> > > +
> > >  Negative types
> > >  ~~~~~~~~~~~~~~
> > >
> > > diff --git a/doc/guides/rel_notes/release_20_08.rst
> > b/doc/guides/rel_notes/release_20_08.rst
> > > index 5cbc4ce..313e8d3 100644
> > > --- a/doc/guides/rel_notes/release_20_08.rst
> > > +++ b/doc/guides/rel_notes/release_20_08.rst
> > > @@ -81,6 +81,12 @@ New Features
> > >    * Added support for virtio queue statistics.
> > >    * Added support for MTU update.
> > >
> > > +* **Added flow-based traffic sampling support.**
> > > +
> > > +  Added new action: ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate the
> > matching
> > > +  packets with given ratio and redirects to vport or queue. The
> sampled
> > packets
> >
> > What is vport above?
> >
> See comment above.
>
> > > +  also can be assigned with an additional optional actions.
> >
> > May actions list be empty or NULL? If no, it does not look
> > optional.
> >
> I think that the action list can't be NULL or empty. There is no meaning
> to empty list.
> I agree it should be stated.
>
> > > +
> > >  * **Updated Marvell octeontx2 ethdev PMD.**
> > >
> > >    Updated Marvell octeontx2 driver with cn98xx support.
> > > diff --git a/lib/librte_ethdev/rte_flow.c
> b/lib/librte_ethdev/rte_flow.c
> > > index 1685be5..733871d 100644
> > > --- a/lib/librte_ethdev/rte_flow.c
> > > +++ b/lib/librte_ethdev/rte_flow.c
> > > @@ -173,6 +173,7 @@ struct rte_flow_desc_data {
> > >     MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct
> > rte_flow_action_set_dscp)),
> > >     MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct
> > rte_flow_action_set_dscp)),
> > >     MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
> > > +   MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
> > >  };
> > >
> > >  int
> > > diff --git a/lib/librte_ethdev/rte_flow.h
> b/lib/librte_ethdev/rte_flow.h
> > > index b0e4199..c9cd80d 100644
> > > --- a/lib/librte_ethdev/rte_flow.h
> > > +++ b/lib/librte_ethdev/rte_flow.h
> > > @@ -2099,6 +2099,13 @@ enum rte_flow_action_type {
> > >      * see enum RTE_ETH_EVENT_FLOW_AGED
> > >      */
> > >     RTE_FLOW_ACTION_TYPE_AGE,
> > > +
> > > +   /**
> > > +    * Redirects specific ratio of packets to vport or queue.
> > > +    *
> > > +    * See struct rte_flow_action_sample.
> > > +    */
> > > +   RTE_FLOW_ACTION_TYPE_SAMPLE,
> > >  };
> > >
> > >  /**
> > > @@ -2709,6 +2716,27 @@ struct rte_flow_action {
> > >  struct rte_flow;
> > >
> > >  /**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice
> > > + *
> > > + * RTE_FLOW_ACTION_TYPE_SAMPLE
> > > + *
> > > + * Adds a sample action to a matched flow.
> > > + *
> > > + * The matching packets will be duplicated to a special queue or vport
> >
> > again 'vport' here
> > It sounds misleading and too restrictive to say "be duplicated
> > to a special queue or vport". There is no specification of the
> > queue or vport in the control structure.
> > You should either describe it in a generic way like "be
> > duplicated and own set of actions with a fate action applied"
> > or put a restriction about QUEUE, RSS or "vport"-related action
> > to be present in the sub-actions list.
> >
> > > + * in the predefined probabiilty, All the packets continues processing
> >
> > probabiilty -> probability
> > I think 'predefined' is misleading here, 'specified' is better.
> > Also strictly speaking it is not a predefined probability (as
> > Stephen suggested), it is defined ratio.
> >
> > > + * on the default flow path.
> > > + *
> > > + * When the sample ratio is set to 1 then the packets will be 100%
> mirrored.
> > > + * Additional action list be supported to add for sampled or mirrored
> > packets.
> > > + */
> > > +struct rte_flow_action_sample {
> > > +   const uint32_t ratio; /**< packets sampled equals to '1/ratio'. */
> >
> > const is still above and it is meaningless (other actions do
> > not have 'const' for plain fields).
> >
> +1
>
> > > +   const struct rte_flow_action *actions;
> > > +           /**< sub-action list specific for the sampling hit cases.
> */
> >
> > Is it required to have fate action?
> > May I use it to MARK some packets and do not duplicate?
> > I guess no. Or COUNT and DROP? Just COUNT?
> >
> I from my understanding you may use mark and count but it also must
> have a fate action.
>
> > What I'm trying to say that you're adding a generic packet
> > selection mechanism with very restricted usage by design.
> >
> > Anyway, if you go with it, please, process other notes above.
> >
>
> > > +};
> > > +
> > > +/**
> > >   * Verbose error types.
> > >   *
> > >   * Most of them provide the type of the object referenced by struct
> > >
>
>


* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-05  8:55         ` Thomas Monjalon
@ 2020-07-05 23:54           ` Ajit Khaparde
  0 siblings, 0 replies; 129+ messages in thread
From: Ajit Khaparde @ 2020-07-05 23:54 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Jerin Jacob, Jiawei Wang, Ori Kam, Slava Ovsiienko, Matan Azrad,
	dpdk-dev, Raslan Darawsheh, ian.stokes, fbl

On Sun, Jul 5, 2020 at 1:55 AM Thomas Monjalon <thomas@monjalon.net> wrote:

> 04/07/2020 16:44, Ajit Khaparde:
> > On Thu, Jul 2, 2020 at 11:40 PM Jerin Jacob <jerinjacobk@gmail.com>
> wrote:
> > > When adding overlapping API(rte_eth_mirror_rule_set()) in the same
> > > library(ethdev).
> > > Please depreciate the old API.
> > > We should not have two separate paths for the same function in the
> > > same ethdev library. It is pain for app and driver developers.
> > >
> > > With the above deprecation notice,
> > > Acked-by: Jerin Jacob <jerinj@marvell.com>
> > >
> > I am fine with the proposed RTE_FLOW_ACTION_TYPE_SAMPLE. But..
> >
> > When rte_eth_mirror_rule_set() is deprecated, are we going to add
> > RTE_FLOW_ACTION_TYPE_MIRROR for full fledged mirror action?
> > Or we are proposing to use RTE_FLOW_ACTION_TYPE_SAMPLE with
> > ratio of 1 to mirror all packets, thereby doing away with the need for
> > a separate RTE_FLOW_ACTION_TYPE_MIRROR?
>
> The idea is to use RTE_FLOW_ACTION_TYPE_SAMPLE with ratio=1 for mirroring.
>
Thanks for clarifying.


* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-05 10:18       ` Ori Kam
  2020-07-05 23:54         ` Ajit Khaparde
@ 2020-07-06  6:53         ` Jiawei(Jonny) Wang
  1 sibling, 0 replies; 129+ messages in thread
From: Jiawei(Jonny) Wang @ 2020-07-06  6:53 UTC (permalink / raw)
  To: Ori Kam, Andrew Rybchenko, Slava Ovsiienko, Matan Azrad
  Cc: dev, Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl

Thanks Andrew and Ori,

> -----Original Message-----
> From: Ori Kam <orika@mellanox.com>
> Sent: Sunday, July 5, 2020 6:19 PM
> To: Andrew Rybchenko <arybchenko@solarflare.com>; Jiawei(Jonny) Wang
> <jiaweiw@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>;
> Matan Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Raslan
> Darawsheh <rasland@mellanox.com>; ian.stokes@intel.com;
> fbl@redhat.com
> Subject: RE: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for
> rte flow
> 
> Hi Andrew,
> 
> I replied to some of your comments.
> Best,
> Ori
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Andrew Rybchenko
> > Sent: Saturday, July 4, 2020 4:05 PM
> > Subject: Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action
> > for rte flow
> >
> > On 7/2/20 9:43 PM, Jiawei Wang wrote:
> > > When using full offload, all traffic will be handled by the HW, and
> > > directed to the requested vf or wire, the control application loses
> >
> > vf->VF
Will change in the v3 patch.
> >
> > > visibility on the traffic.
> > > So there's a need for an action that will enable the control
> > > application some visibility.
> > >
> > > The solution is introduced a new action that will sample the
> > > incoming traffic and send a duplicated traffic in some predefined
> > > ratio to the application, while the original packet will continue to
> > > the target destination.
> > >
> >
> > May be 1 packet per second is a better sampling approach?
> > Or just different.
> >
> Those are two different things, lets take a packet that arrives once every two
> seconds and we ask to sample once every second, this means that we will
> always get that packet.
> Also as far as I understand the use case is to have some visibility about the
> traffic.
> so you can assume that if a packet is sent once per second the application
> will get the packet with very high delay and very low visibility. Lets take a use
> case that the hyprivor wants to check if one of the VM is abusing the system
> (sends DDOS packets, or just trying to understand the network) in this case
> we can assume that the VM will send large amount of traffic. and if we only
> check once per second the application will not be able to understand the
> traffic meaning, but if we sample 1% of the traffic then the application will
> see very fast the type of the traffic the VM is sending and if it is trying to
> abuse the system.
> So I vote in favor of keeping as is.
> 
> 
> > > The packets sampled equals is '1/ratio', if the ratio value be set
> > > to 1 , means that the packets would be completely mirrored. The
> > > sample packet
> >
> > Comma on the next line looks bad.
OK, will move the comma to the same line.
> >
> > > can be assigned with different set of actions from the original packet.
> > >
> > > In order to support the sample packet in rte_flow, new rte_flow
> > > action definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> > rte_flow_action_sample
> > > will be introduced.
> > >
> > > Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> > > Acked-by: Ori Kam <orika@mellanox.com>
> > > ---
> > >  doc/guides/prog_guide/rte_flow.rst     | 25
> +++++++++++++++++++++++++
> > >  doc/guides/rel_notes/release_20_08.rst |  6 ++++++
> > >  lib/librte_ethdev/rte_flow.c           |  1 +
> > >  lib/librte_ethdev/rte_flow.h           | 28 ++++++++++++++++++++++++++++
> > >  4 files changed, 60 insertions(+)
> > >
> > > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > b/doc/guides/prog_guide/rte_flow.rst
> > > index d5dd18c..50dfe1f 100644
> > > --- a/doc/guides/prog_guide/rte_flow.rst
> > > +++ b/doc/guides/prog_guide/rte_flow.rst
> > > @@ -2645,6 +2645,31 @@ timeout passed without any matching on the
> flow.
> > >     | ``context``  | user input flow context         |
> > >     +--------------+---------------------------------+
> > >
> > > +Action: ``SAMPLE``
> > > +^^^^^^^^^^^^^^^^^^
> > > +
> > > +Adds a sample action to a matched flow.
> > > +
> > > +The matching packets will be duplicated to a special queue or vport
> >
> > what is vport above?
> >
> I think it should be port (when using E-Switch)
Yes, the destination port works in E-Switch mode; will change vport->port.
> 
> > > +with the predefined ``ratio``, the packets sampled equals is '1/ratio'.
> > > +All the packets continues to the target destination.
> >
> > continues -> continue (if I'm not mistaken)
ok, will change.
> >
> > > +
> > > +When the ``ratio`` is set to 1 then the packets will be 100% mirrored.
> > > +``actions`` represent the different set of actions for the sampled
> > > +or mirrored packets.
> > > +
> > > +.. _table_rte_flow_action_sample:
> > > +
> > > +.. table:: SAMPLE
> > > +
> > > +   +--------------+---------------------------------+
> > > +   | Field        | Value                           |
> > > +   +==============+=================================+
> > > +   | ``ratio``    | 32 bits sample ratio value      |
> > > +   +--------------+---------------------------------+
> > > +   | ``actions``  | sub-action list for sampling    |
> > > +   +--------------+---------------------------------+
> > > +
> > >  Negative types
> > >  ~~~~~~~~~~~~~~
> > >
> > > diff --git a/doc/guides/rel_notes/release_20_08.rst
> > b/doc/guides/rel_notes/release_20_08.rst
> > > index 5cbc4ce..313e8d3 100644
> > > --- a/doc/guides/rel_notes/release_20_08.rst
> > > +++ b/doc/guides/rel_notes/release_20_08.rst
> > > @@ -81,6 +81,12 @@ New Features
> > >    * Added support for virtio queue statistics.
> > >    * Added support for MTU update.
> > >
> > > +* **Added flow-based traffic sampling support.**
> > > +
> > > +  Added new action: ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate
> > > + the
> > matching
> > > +  packets with given ratio and redirects to vport or queue. The
> > > + sampled
> > packets
> >
> > What is vport above?
> >
> See comment above.
> 
> > > +  also can be assigned with an additional optional actions.
> >
> > May actions list be empty or NULL? If no, it does not look optional.
> >
> I think that the action list can't be NULL or empty. There is no meaning to
> empty list.
> I agree it should be stated.
> 
'optional' means that, besides a fate action, the list can also include additional actions if needed,
for example mark together with queue for the sampled packets.
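
A minimal C sketch of that rule follows. The action types here are stand-ins
defined only for illustration (real code would use enum rte_flow_action_type
from rte_flow.h), and which actions count as fate actions for sampled packets
is an assumption that depends on the PMD.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in action types for illustration only. */
enum sk_action_type {
	SK_ACTION_END,     /* terminates the list, as in rte_flow */
	SK_ACTION_MARK,    /* optional */
	SK_ACTION_COUNT,   /* optional */
	SK_ACTION_QUEUE,   /* assumed fate action */
	SK_ACTION_RSS,     /* assumed fate action */
	SK_ACTION_PORT_ID, /* assumed fate action (E-Switch) */
};

struct sk_action {
	enum sk_action_type type;
};

/* Assumption: QUEUE, RSS and PORT_ID decide where the sampled copy goes. */
static bool
sk_is_fate(enum sk_action_type type)
{
	return type == SK_ACTION_QUEUE || type == SK_ACTION_RSS ||
	       type == SK_ACTION_PORT_ID;
}

/* Return 0 if the END-terminated sub-action list is usable for sampling:
 * it must be non-NULL and contain a fate action. Return -1 otherwise. */
static int
sk_sample_actions_check(const struct sk_action *actions)
{
	bool has_fate = false;

	if (actions == NULL)
		return -1;
	for (; actions->type != SK_ACTION_END; actions++)
		if (sk_is_fate(actions->type))
			has_fate = true;
	return has_fate ? 0 : -1;
}
```

With this check, a list of mark plus queue is accepted, while a NULL list, an
empty list, or a list holding only mark or count is rejected.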

> > > +
> > >  * **Updated Marvell octeontx2 ethdev PMD.**
> > >
> > >    Updated Marvell octeontx2 driver with cn98xx support.
> > > diff --git a/lib/librte_ethdev/rte_flow.c
> > > b/lib/librte_ethdev/rte_flow.c index 1685be5..733871d 100644
> > > --- a/lib/librte_ethdev/rte_flow.c
> > > +++ b/lib/librte_ethdev/rte_flow.c
> > > @@ -173,6 +173,7 @@ struct rte_flow_desc_data {
> > >  	MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct
> > rte_flow_action_set_dscp)),
> > >  	MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct
> > rte_flow_action_set_dscp)),
> > >  	MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
> > > +	MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
> > >  };
> > >
> > >  int
> > > diff --git a/lib/librte_ethdev/rte_flow.h
> > > b/lib/librte_ethdev/rte_flow.h index b0e4199..c9cd80d 100644
> > > --- a/lib/librte_ethdev/rte_flow.h
> > > +++ b/lib/librte_ethdev/rte_flow.h
> > > @@ -2099,6 +2099,13 @@ enum rte_flow_action_type {
> > >  	 * see enum RTE_ETH_EVENT_FLOW_AGED
> > >  	 */
> > >  	RTE_FLOW_ACTION_TYPE_AGE,
> > > +
> > > +	/**
> > > +	 * Redirects specific ratio of packets to vport or queue.
> > > +	 *
> > > +	 * See struct rte_flow_action_sample.
> > > +	 */
> > > +	RTE_FLOW_ACTION_TYPE_SAMPLE,
> > >  };
> > >
> > >  /**
> > > @@ -2709,6 +2716,27 @@ struct rte_flow_action {  struct rte_flow;
> > >
> > >  /**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice
> > > + *
> > > + * RTE_FLOW_ACTION_TYPE_SAMPLE
> > > + *
> > > + * Adds a sample action to a matched flow.
> > > + *
> > > + * The matching packets will be duplicated to a special queue or
> > > + vport
> >
> > again 'vport' here
> > It sounds misleading and too restrictive to say "be duplicated to a
> > special queue or vport". There is no specification of the queue or
> > vport in the control structure.
> > You should either describe it in a generic way like "be duplicated and
> > own set of actions with a fate action applied"
> > or put a restriction about QUEUE, RSS or "vport"-related action to be
> > present in the sub-actions list.
> >
I'll change the description to "The matching packets will be duplicated and applied with own set of actions with a fate action"
> > > + * in the predefined probabiilty, All the packets continues
> > > + processing
> >
> > probabiilty -> probability
> > I think 'predefined' is misleading here, 'specified' is better.
> > Also strictly speaking it is not a predefined probability (as Stephen
> > suggested), it is defined ratio.
> >
Thanks for pointing out the typo; I will use 'specified ratio' instead.
> > > + * on the default flow path.
> > > + *
> > > + * When the sample ratio is set to 1 then the packets will be 100%
> mirrored.
> > > + * Additional action list be supported to add for sampled or
> > > + mirrored
> > packets.
> > > + */
> > > +struct rte_flow_action_sample {
> > > +	const uint32_t ratio; /**< packets sampled equals to '1/ratio'. */
> >
> > const is still above and it is meaningless (other actions do not have
> > 'const' for plain fields).
> >
> +1
Agreed, will remove 'const'.
> 
> > > +	const struct rte_flow_action *actions;
> > > +		/**< sub-action list specific for the sampling hit cases. */
> >
> > Is it required to have fate action?
> > May I use it to MARK some packets and do not duplicate?
> > I guess no. Or COUNT and DROP? Just COUNT?
> >
> I from my understanding you may use mark and count but it also must have
> a fate action.
> 
Yes, sampling needs a fate action; mark and count are optional actions.
> > What I'm trying to say that you're adding a generic packet selection
> > mechanism with very restricted usage by design.
> >
> > Anyway, if you go with it, please, process other notes above.
> >
> 
> > > +};
> > > +
> > > +/**
> > >   * Verbose error types.
> > >   *
> > >   * Most of them provide the type of the object referenced by struct
> > >



* Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow
  2020-07-05  4:52                   ` Matan Azrad
@ 2020-07-06  8:37                     ` Jerin Jacob
  0 siblings, 0 replies; 129+ messages in thread
From: Jerin Jacob @ 2020-07-06  8:37 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Thomas Monjalon, Jiawei(Jonny) Wang, Ori Kam, Slava Ovsiienko,
	dpdk-dev, Raslan Darawsheh, ian.stokes, fbl, Ferruh Yigit,
	Andrew Rybchenko

On Sun, Jul 5, 2020 at 10:22 AM Matan Azrad <matan@mellanox.com> wrote:
>
>
>
> From: Jerin Jacob:
> > On Sun, Jul 5, 2020 at 12:56 AM Matan Azrad <matan@mellanox.com> wrote:
> > >
> > > Hi all
> > >
> > > From: Jerin Jacob:
> > > > On Fri, Jul 3, 2020 at 8:57 PM Thomas Monjalon <thomas@monjalon.net>
> > > > wrote:
> > > > >
> > > > > 03/07/2020 17:08, Jerin Jacob:
> > > > > > On Fri, Jul 3, 2020 at 8:25 PM Matan Azrad <matan@mellanox.com>
> > > > wrote:
> > > > > > > From: Jerin Jacob:
> > > > > > > > When adding overlapping API(rte_eth_mirror_rule_set()) in
> > > > > > > > the same library(ethdev).
> > > > > > > > Please depreciate the old API.
> > > > > > > > We should not have two separate paths for the same function
> > > > > > > > in the same ethdev library. It is pain for app and driver developers.
> > > > > > >
> > > > > > > What are about all the other rte_flow parallel configuration
> > > > > > > APIs in
> > > > ethdev:
> > > > > > >  promiscuous_enable;
> > > > > > > promiscuous_disable;
> > > > > > > allmulticast_enable;
> > > > > > > allmulticast_disable;
> > > > > > > mac_addr_remove;
> > > > > > > mac_addr_add;
> > > > > > > mac_addr_set;
> > > > > > > set_mc_addr_list;
> > > > > > > vlan_filter_set;
> > > > > > > vlan_tpid_set;
> > > > > > > vlan_strip_queue_set;
> > > > > > > vlan_offload_set;
> > > > > > > vlan_pvid_set;
> > > > > > > udp_tunnel_port_add;
> > > > > > > udp_tunnel_port_del;
> > > > > > > ...
> > > > > > >
> > > > > > > These APIs can be replaced easily by rte_flow API.
> > > > > > > Do you think we need to deprecate all?
> > > > > >
> > > > > > I think, basic stuff like below can have separate API.
> > > > > > 1)  promiscuous_enable;
> > > > > > 2) promiscuous_disable;
> > > > > > 3) allmulticast_enable;
> > > > > > 4) allmulticast_disable;
> > > > > > 5) mac_addr_remove;
> > > > > > 6) mac_addr_add;
> > > > > > 7) mac_addr_set;
> > > > > > 8) set_mc_addr_list;
> > > > >
> > > > > "Basic" is not a precise definition :)
> > > >
> > > > Yep.
> > > >
> > > > > I would say port-level configuration should remain out of rte_flow
> > > > > API.
> > >
> > > Thomas, Can you explain what is port-level?
> > > Everything in rte_flow is per port...
> > >
> > > Also, can you give reasons for your claim?
> > >
> > > > +1.
> > > > In addition that, I would say anything needs to configured at "per-flow"
> > > > granularity use rte_flow.
> > >
> > > Jerin, What do you mean "per-flow" ?
> >
> > In Terms of  NIC HW features, Typical HW will have
> > a) Basic "port" level configuration like
> > - enable/disable promiscuous
>
> What is "port level", everything in rte_flow is also per port...
>
> > b) Advance HW's will have CAM based flow filtering. IMO, CAM related stuff
> > should go to rte_flow.
>
> It is HW internal, I don't think all HWs use the same logic here.
> Since rte_flow is generic for all filtering methods, It is good candidate API for all HWs.
>
> > This is to enable,  The very basic PMD(without advanced features) should
> > work with port level basic APIs(i.e without rte_flow)
>
> What is "basic"? Do you mean simple match and simple action?
> As I said, Also rte_flow is port level API - so "port level" is not good term here.
>
> As you said " When adding overlapping API(rte_eth_mirror_rule_set()) in the same library(ethdev). Please depreciate the old API."
>
> Different APIs to do the same thing is not good, especially in packet filtering:
> What should we do if we have conflicts?
> For example: legacy filtering APIs say to receive packet A and rte_flow says to drop it.
>
> Don't you think it complicates more the user API understanding, also the PMD implementations?
>
>
> > I have seen promiscuous, mac address handling is part of basic NIC HW(i.e
> > NICs without advanced CAM filters).
> > That's my reasoning for the split.
>
> As I said, the nic HW specific implementation should not affect the API.
> I don't think we need to split API and to complicate the user because of it.
>
> IMO, It is better to have one generic API for packet filtering:
> It is clearer, simpler, generic and classic.
> The user and PMD need to understand only one filtering API and that’s it (no need to combine between multiple filtering APIs).
>
> I know this is big change but we can do it in modular way.
> It reminds me the big change that was done in Rx\Tx offload configurations:
> So, when offload became more popular we modularly forced users to replace the offload API.
> Also here, flow filtering becomes popular so maybe this is the time(20.08-20.11) to force changes in the old APIs.
>
> > > Everything in traffic filtering\actions is per flow, for example:
> > > Promiscuous: flow create 0 ingress pattern eth / end actions queue
> > > index 0 / end
> >
> > IMO, it is not an accurate representation of promiscuous enable. It needs to
> > send the traffic to all queues and patterns is not just eth.
>
> Yes, if legacy RSS is configured we need to replace the above queue action by rss action as I wrote before.(did you read it just below?)
>
> So, we can add legacy RSS APIs to the list above...

I meant: if promiscuous mode is enabled, what would the pattern be? Should
it be limited to just "eth"?
I leave it up to the ethdev maintainers to decide whether promiscuous is part
of the rte_flow API or not. I don't have a very strong objection.
For me, VLAN (rte_vlan_*) and MIRROR (rte_eth_mirror_rule_set()) are the
most worrisome, as each PMD needs to duplicate that work,
since both are CAM-based APIs.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling
  2020-07-02 18:43 ` [dpdk-dev] [PATCH v2 0/7] [v2] support the flow-based traffic sampling Jiawei Wang
                     ` (6 preceding siblings ...)
  2020-07-02 18:43   ` [dpdk-dev] [PATCH v2 7/7] app/testpmd: add testpmd command " Jiawei Wang
@ 2020-07-06 17:51   ` Jiawei Wang
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
                       ` (8 more replies)
  7 siblings, 9 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-06 17:51 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

This patch set implements flow sampling for the mlx5 driver.

The solution introduces a new rte_flow action that samples the incoming traffic and sends a duplicate of the traffic, at the specified ratio, to the application, while the original packet continues to the target destination.

If the sample ratio is set to 1, the packets are completely mirrored. The sampled packet can be assigned a different set of actions from the original packet.

The MLX5 PMD is responsible for validating and translating the sample action when creating a flow.

v3:
* Remove 'const' of ratio field.
* Update description and commit messages.

v2:
* Rebase patches based on the latest code.
* Update rte_flow and release documents.
* Fix the compile error.
* Removed unnecessary change in [PATCH 7/8] net/mlx5: update the metadata register c0 support since FDB will use 5-tuple to do match.
* Update changes based on the comments.

Jiawei Wang (7):
  ethdev: introduce sample action for rte flow
  common/mlx5: glue for sample action
  common/mlx5: query sampler object capability via DevX
  net/mlx5: add the validate sample action
  net/mlx5: split sample flow into two sub flows
  net/mlx5: update translate function for sample action
  app/testpmd: add testpmd command for sample action

 app/test-pmd/cmdline_flow.c            | 285 ++++++++++++++-
 doc/guides/prog_guide/rte_flow.rst     |  25 ++
 doc/guides/rel_notes/release_20_08.rst |   6 +
 drivers/common/mlx5/Makefile           |   5 +
 drivers/common/mlx5/linux/meson.build  |   2 +
 drivers/common/mlx5/linux/mlx5_glue.c  |  15 +
 drivers/common/mlx5/linux/mlx5_glue.h  |  12 +
 drivers/common/mlx5/mlx5_devx_cmds.c   |  27 ++
 drivers/common/mlx5/mlx5_devx_cmds.h   |   1 +
 drivers/common/mlx5/mlx5_prm.h         |  51 +++
 drivers/net/mlx5/linux/mlx5_os.c       |  14 +
 drivers/net/mlx5/mlx5.c                |  11 +
 drivers/net/mlx5/mlx5.h                |   4 +
 drivers/net/mlx5/mlx5_flow.c           | 274 +++++++++++++-
 drivers/net/mlx5/mlx5_flow.h           |  51 ++-
 drivers/net/mlx5/mlx5_flow_dv.c        | 627 ++++++++++++++++++++++++++++++++-
 lib/librte_ethdev/rte_flow.c           |   1 +
 lib/librte_ethdev/rte_flow.h           |  30 ++
 18 files changed, 1406 insertions(+), 35 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v3 1/7] ethdev: introduce sample action for rte flow
  2020-07-06 17:51   ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Jiawei Wang
@ 2020-07-06 17:51     ` Jiawei Wang
  2020-07-07 10:26       ` Andrew Rybchenko
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 2/7] common/mlx5: glue for sample action Jiawei Wang
                       ` (7 subsequent siblings)
  8 siblings, 1 reply; 129+ messages in thread
From: Jiawei Wang @ 2020-07-06 17:51 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

When using full offload, all traffic is handled by the HW and
directed to the requested VF or wire, so the control application loses
visibility of the traffic.
So there's a need for an action that gives the control application
some visibility.

The solution introduces a new action that samples the incoming
traffic and sends a duplicate of the traffic, at the specified ratio,
to the application, while the original packet continues to the target
destination.

The fraction of packets sampled is '1/ratio'; if the ratio is set to 1,
the packets are completely mirrored. The sampled packet can be
assigned a different set of actions from the original packet.

To support the sample packet in rte_flow, a new rte_flow action
definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
are introduced.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
---
 doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
 doc/guides/rel_notes/release_20_08.rst |  6 ++++++
 lib/librte_ethdev/rte_flow.c           |  1 +
 lib/librte_ethdev/rte_flow.h           | 30 ++++++++++++++++++++++++++++++
 4 files changed, 62 insertions(+)
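
For readers following the thread, here is a minimal sketch of how the configuration described above might be filled in. It is not part of the patch: every sketch_* name is a hypothetical stand-in so the fragment compiles without DPDK headers, and a real application would use struct rte_flow_action_sample and RTE_FLOW_ACTION_TYPE_* from rte_flow.h instead. The helper that computes the expected sampled count only illustrates the '1/ratio' wording as an average; how the hardware picks individual packets is not stated in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the rte_flow definitions; hypothetical names, not DPDK API. */
enum sketch_action_type {
	SKETCH_ACTION_END = 0,
	SKETCH_ACTION_MARK,
	SKETCH_ACTION_QUEUE,
	SKETCH_ACTION_SAMPLE,
};

struct sketch_action {
	enum sketch_action_type type;
	const void *conf;
};

/* Mirrors the layout of struct rte_flow_action_sample from the patch. */
struct sketch_action_sample {
	uint32_t ratio;
	const struct sketch_action *actions;
};

/* Sub-actions for the sampled copy: optional MARK, then the QUEUE fate. */
static const struct sketch_action sketch_sub_actions[] = {
	{ SKETCH_ACTION_MARK, NULL },
	{ SKETCH_ACTION_QUEUE, NULL },
	{ SKETCH_ACTION_END, NULL },
};

/* Build a sample configuration that duplicates one packet in 'ratio'. */
static struct sketch_action_sample
sketch_make_sample(uint32_t ratio)
{
	struct sketch_action_sample conf = {
		.ratio = ratio,
		.actions = sketch_sub_actions,
	};

	return conf;
}

/* Average number of sampled packets out of 'total'; 0 for an invalid ratio. */
static uint64_t
sketch_expected_sampled(uint64_t total, uint32_t ratio)
{
	return ratio ? total / ratio : 0;
}
```

With ratio 1 every packet is duplicated (full mirroring), while ratio 4 would duplicate roughly a quarter of the matching packets.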

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index d5dd18c..e384d40 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -2645,6 +2645,31 @@ timeout passed without any matching on the flow.
    | ``context``  | user input flow context         |
    +--------------+---------------------------------+
 
+Action: ``SAMPLE``
+^^^^^^^^^^^^^^^^^^
+
+Adds a sample action to a matched flow.
+
+The matching packets are duplicated at the specified ``ratio`` and the
+duplicates get their own set of actions, including a fate action; the fraction
+of packets sampled is '1/ratio'. All packets continue to the target destination.
+
+When the ``ratio`` is set to 1, the packets are 100% mirrored.
+``actions`` is the separate set of actions applied to the sampled or mirrored
+packets, and must include a fate action.
+
+.. _table_rte_flow_action_sample:
+
+.. table:: SAMPLE
+
+   +--------------+---------------------------------+
+   | Field        | Value                           |
+   +==============+=================================+
+   | ``ratio``    | 32 bits sample ratio value      |
+   +--------------+---------------------------------+
+   | ``actions``  | sub-action list for sampling    |
+   +--------------+---------------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
index 5cbc4ce..87c719d 100644
--- a/doc/guides/rel_notes/release_20_08.rst
+++ b/doc/guides/rel_notes/release_20_08.rst
@@ -81,6 +81,12 @@ New Features
   * Added support for virtio queue statistics.
   * Added support for MTU update.
 
+* **Added flow-based traffic sampling support.**
+
+  Added new action ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate the matching
+  packets at the specified ratio and apply a separate set of actions, including
+  a fate action. When the ratio is set to 1, the packets are 100% mirrored.
+
 * **Updated Marvell octeontx2 ethdev PMD.**
 
   Updated Marvell octeontx2 driver with cn98xx support.
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 1685be5..733871d 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -173,6 +173,7 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
+	MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
 };
 
 int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index b0e4199..e0d6264 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -2099,6 +2099,14 @@ enum rte_flow_action_type {
 	 * see enum RTE_ETH_EVENT_FLOW_AGED
 	 */
 	RTE_FLOW_ACTION_TYPE_AGE,
+
+	/**
+	 * The matching packets are duplicated at the specified ratio and the
+	 * duplicates get their own set of actions, including a fate action.
+	 *
+	 * See struct rte_flow_action_sample.
+	 */
+	RTE_FLOW_ACTION_TYPE_SAMPLE,
 };
 
 /**
@@ -2709,6 +2717,28 @@ struct rte_flow_action {
 struct rte_flow;
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SAMPLE
+ *
+ * Adds a sample action to a matched flow.
+ *
+ * The matching packets are duplicated at the specified ratio and the
+ * duplicates get their own set of actions, including a fate action; a sampled
+ * packet can be redirected to a queue or port. All packets continue
+ * processing on the default flow path.
+ *
+ * When the sample ratio is set to 1, the packets are 100% mirrored.
+ * An additional action list can be supplied for sampled or mirrored packets.
+ */
+struct rte_flow_action_sample {
+	uint32_t ratio; /**< Fraction of packets sampled is '1/ratio'. */
+	const struct rte_flow_action *actions;
+		/**< sub-action list specific for the sampling hit cases. */
+};
+
+/**
  * Verbose error types.
  *
  * Most of them provide the type of the object referenced by struct
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v3 2/7] common/mlx5: glue for sample action
  2020-07-06 17:51   ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Jiawei Wang
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
@ 2020-07-06 17:51     ` Jiawei Wang
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 3/7] common/mlx5: query sampler object capability via DevX Jiawei Wang
                       ` (6 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-06 17:51 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

rdma-core introduces a new DR sample action.

Add the rdma-core commands in glue to create this action.

The sample action is used to create the sample object that implements
the sampling/mirroring function.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 drivers/common/mlx5/Makefile          |  5 +++++
 drivers/common/mlx5/linux/meson.build |  2 ++
 drivers/common/mlx5/linux/mlx5_glue.c | 15 +++++++++++++++
 drivers/common/mlx5/linux/mlx5_glue.h | 12 ++++++++++++
 4 files changed, 34 insertions(+)
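
The glue function in this patch follows the usual compile-time guard pattern: call the library when the build detected the symbol, otherwise fail with ENOTSUP. A minimal sketch of that pattern is below. It is not the patch code: SKETCH_HAVE_SAMPLER, sketch_sampler_attr and sketch_lib_create_sampler are hypothetical stand-ins for HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE, struct mlx5dv_dr_flow_sampler_attr and mlx5dv_dr_action_create_flow_sampler().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical attribute type standing in for the rdma-core structure. */
struct sketch_sampler_attr {
	unsigned int sample_ratio;
};

#ifdef SKETCH_HAVE_SAMPLER
/* Only declared when the build system found the symbol in the library. */
void *sketch_lib_create_sampler(struct sketch_sampler_attr *attr);
#endif

/*
 * Glue wrapper: forwards to the library when available, otherwise reports
 * "not supported" through errno and a NULL return.
 */
static void *
sketch_glue_create_sampler(struct sketch_sampler_attr *attr)
{
#ifdef SKETCH_HAVE_SAMPLER
	return sketch_lib_create_sampler(attr);
#else
	(void)attr;
	errno = ENOTSUP;
	return NULL;
#endif
}
```

Built without SKETCH_HAVE_SAMPLER, the wrapper always returns NULL with errno set to ENOTSUP, so a caller can distinguish a missing feature from a runtime failure.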

diff --git a/drivers/common/mlx5/Makefile b/drivers/common/mlx5/Makefile
index f6c762b..4c1484c 100644
--- a/drivers/common/mlx5/Makefile
+++ b/drivers/common/mlx5/Makefile
@@ -192,6 +192,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
 		func mlx5dv_dump_dr_domain \
 		$(AUTOCONF_OUTPUT)
 	$Q sh -- '$<' '$@' \
+		HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE \
+		infiniband/mlx5dv.h \
+		func mlx5dv_dr_action_create_flow_sampler \
+		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
 		HAVE_MLX5DV_MMAP_GET_NC_PAGES_CMD \
 		infiniband/mlx5dv.h \
 		enum MLX5_MMAP_GET_NC_PAGES_CMD \
diff --git a/drivers/common/mlx5/linux/meson.build b/drivers/common/mlx5/linux/meson.build
index 2294213..0f08318 100644
--- a/drivers/common/mlx5/linux/meson.build
+++ b/drivers/common/mlx5/linux/meson.build
@@ -162,6 +162,8 @@ has_sym_args = [
 	'RDMA_NLDEV_ATTR_NDEV_INDEX' ],
 	[ 'HAVE_MLX5_DR_FLOW_DUMP', 'infiniband/mlx5dv.h',
 	'mlx5dv_dump_dr_domain'],
+	[ 'HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE', 'infiniband/mlx5dv.h',
+	'mlx5dv_dr_action_create_flow_sampler'],
 	[ 'HAVE_MLX5DV_DR_MEM_RECLAIM', 'infiniband/mlx5dv.h',
 	'mlx5dv_dr_domain_set_reclaim_device_memory'],
 	[ 'HAVE_DEVLINK', 'linux/devlink.h', 'DEVLINK_GENL_NAME' ],
diff --git a/drivers/common/mlx5/linux/mlx5_glue.c b/drivers/common/mlx5/linux/mlx5_glue.c
index 048207e..98b3e71 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.c
+++ b/drivers/common/mlx5/linux/mlx5_glue.c
@@ -1059,6 +1059,19 @@
 #endif
 }
 
+static void *
+mlx5_glue_dr_create_flow_action_sampler(
+			struct mlx5dv_dr_flow_sampler_attr *attr)
+{
+#ifdef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+	return mlx5dv_dr_action_create_flow_sampler(attr);
+#else
+	(void)attr;
+	errno = ENOTSUP;
+	return NULL;
+#endif
+}
+
 static int
 mlx5_glue_devx_query_eqn(struct ibv_context *ctx, uint32_t cpus,
 			 uint32_t *eqn)
@@ -1308,6 +1321,8 @@
 	.devx_port_query = mlx5_glue_devx_port_query,
 	.dr_dump_domain = mlx5_glue_dr_dump_domain,
 	.dr_reclaim_domain_memory = mlx5_glue_dr_reclaim_domain_memory,
+	.dr_create_flow_action_sampler =
+		mlx5_glue_dr_create_flow_action_sampler,
 	.devx_query_eqn = mlx5_glue_devx_query_eqn,
 	.devx_create_event_channel = mlx5_glue_devx_create_event_channel,
 	.devx_destroy_event_channel = mlx5_glue_devx_destroy_event_channel,
diff --git a/drivers/common/mlx5/linux/mlx5_glue.h b/drivers/common/mlx5/linux/mlx5_glue.h
index 069d854..11b95c5 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.h
+++ b/drivers/common/mlx5/linux/mlx5_glue.h
@@ -77,6 +77,7 @@
 #ifndef HAVE_MLX5DV_DR
 enum  mlx5dv_dr_domain_type { unused, };
 struct mlx5dv_dr_domain;
+struct mlx5dv_dr_action;
 #endif
 
 #ifndef HAVE_MLX5DV_DR_DEVX_PORT
@@ -87,6 +88,15 @@
 struct mlx5dv_dr_flow_meter_attr;
 #endif
 
+#ifndef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+struct mlx5dv_dr_flow_sampler_attr {
+	uint32_t sample_ratio;
+	void *default_next_table;
+	size_t num_sample_actions;
+	struct mlx5dv_dr_action **sample_actions;
+};
+#endif
+
 #ifndef HAVE_IBV_DEVX_EVENT
 struct mlx5dv_devx_event_channel { int fd; };
 struct mlx5dv_devx_async_event_hdr;
@@ -304,6 +314,8 @@ struct mlx5_glue {
 			 struct mlx5dv_devx_async_event_hdr *event_data,
 			 size_t event_resp_len);
 	void (*dr_reclaim_domain_memory)(void *domain, uint32_t enable);
+	void *(*dr_create_flow_action_sampler)
+			(struct mlx5dv_dr_flow_sampler_attr *attr);
 };
 
 extern const struct mlx5_glue *mlx5_glue;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v3 3/7] common/mlx5: query sampler object capability via DevX
  2020-07-06 17:51   ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Jiawei Wang
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 2/7] common/mlx5: glue for sample action Jiawei Wang
@ 2020-07-06 17:51     ` Jiawei Wang
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 4/7] net/mlx5: add the validate sample action Jiawei Wang
                       ` (5 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-06 17:51 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Update function mlx5_devx_cmd_query_hca_attr() to add the NIC Flow
Table attributes query, then get log_max_ft_sampler_num from the
flow table properties.

Add the related struct definitions in mlx5_prm.h.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c | 27 +++++++++++++++++++
 drivers/common/mlx5/mlx5_devx_cmds.h |  1 +
 drivers/common/mlx5/mlx5_prm.h       | 51 ++++++++++++++++++++++++++++++++++++
 3 files changed, 79 insertions(+)
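
As a reading aid for the mlx5_prm.h layout added below, this sketch works out where log_max_ft_sampler_num lands in the capability buffer. The offsets come from summing the field widths in the two structs of this patch: flow_table_properties starts at bit 0x200 of flow_table_nic_cap, and log_max_ft_sampler_num starts at bit 0x48 of the properties layout. The helper is hypothetical and assumes the usual mlx5 ifc convention that bit 0 is the most significant bit of byte 0; the driver itself uses MLX5_GET() rather than anything like this.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit offsets derived from the struct definitions in this patch. */
#define SKETCH_FT_PROPS_OFF    0x200u /* flow_table_properties in nic cap */
#define SKETCH_SAMPLER_NUM_OFF 0x48u  /* log_max_ft_sampler_num in props */

/*
 * Read an 8-bit, byte-aligned field at 'bit_off' from a capability buffer.
 * Assumption: bits are numbered from the most significant bit of byte 0,
 * so a byte-aligned 8-bit field is simply the byte at bit_off / 8.
 */
static uint8_t
sketch_get_u8(const uint8_t *buf, size_t bit_off)
{
	return buf[bit_off / 8];
}
```

Under that assumption the field is byte (0x200 + 0x48) / 8 = 73 of the capability area that MLX5_ADDR_OF(query_hca_cap_out, out, capability) points at.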

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index ec92eb6..6b551f1 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -496,6 +496,33 @@ struct mlx5_devx_obj *
 	if (!attr->eth_net_offloads)
 		return 0;
 
+	/* Query flow sampler capability from flow table properties layout. */
+	memset(in, 0, sizeof(in));
+	memset(out, 0, sizeof(out));
+	MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
+	MLX5_SET(query_hca_cap_in, in, op_mod,
+		 MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE |
+		 MLX5_HCA_CAP_OPMOD_GET_CUR);
+
+	rc = mlx5_glue->devx_general_cmd(ctx,
+					 in, sizeof(in),
+					 out, sizeof(out));
+	if (rc)
+		goto error;
+	status = MLX5_GET(query_hca_cap_out, out, status);
+	syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
+	if (status) {
+		DRV_LOG(DEBUG, "Failed to query devx HCA capabilities, "
+			"status %x, syndrome = %x",
+			status, syndrome);
+		attr->log_max_ft_sampler_num = 0;
+		return -1;
+	}
+	hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
+	attr->log_max_ft_sampler_num =
+			MLX5_GET(flow_table_nic_cap,
+			hcattr, flow_table_properties.log_max_ft_sampler_num);
+
 	/* Query HCA offloads for Ethernet protocol. */
 	memset(in, 0, sizeof(in));
 	memset(out, 0, sizeof(out));
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 25704ef..a9cfe6d 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -90,6 +90,7 @@ struct mlx5_hca_attr {
 	uint32_t vhca_id:16;
 	uint32_t relaxed_ordering_write:1;
 	uint32_t relaxed_ordering_read:1;
+	uint32_t log_max_ft_sampler_num:8;
 	struct mlx5_hca_qos_attr qos;
 	struct mlx5_hca_vdpa_attr vdpa;
 };
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index c63795f..e7d0a65 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -944,6 +944,7 @@ enum {
 	MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE = 0x0 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_ETHERNET_OFFLOAD_CAPS = 0x1 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP = 0xc << 1,
+	MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE = 0x7 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_VDPA_EMULATION = 0x13 << 1,
 };
 
@@ -1365,12 +1366,62 @@ struct mlx5_ifc_virtio_emulation_cap_bits {
 	u8 reserved_at_1c0[0x620];
 };
 
+struct mlx5_ifc_flow_table_prop_layout_bits {
+	u8 ft_support[0x1];
+	u8 flow_tag[0x1];
+	u8 flow_counter[0x1];
+	u8 flow_modify_en[0x1];
+	u8 modify_root[0x1];
+	u8 identified_miss_table[0x1];
+	u8 flow_table_modify[0x1];
+	u8 reformat[0x1];
+	u8 decap[0x1];
+	u8 reset_root_to_default[0x1];
+	u8 pop_vlan[0x1];
+	u8 push_vlan[0x1];
+	u8 fpga_vendor_acceleration[0x1];
+	u8 pop_vlan_2[0x1];
+	u8 push_vlan_2[0x1];
+	u8 reformat_and_vlan_action[0x1];
+	u8 modify_and_vlan_action[0x1];
+	u8 sw_owner[0x1];
+	u8 reformat_l3_tunnel_to_l2[0x1];
+	u8 reformat_l2_to_l3_tunnel[0x1];
+	u8 reformat_and_modify_action[0x1];
+	u8 reserved_at_15[0x9];
+	u8 sw_owner_v2[0x1];
+	u8 reserved_at_1f[0x1];
+	u8 reserved_at_20[0x2];
+	u8 log_max_ft_size[0x6];
+	u8 log_max_modify_header_context[0x8];
+	u8 max_modify_header_actions[0x8];
+	u8 max_ft_level[0x8];
+	u8 reserved_at_40[0x8];
+	u8 log_max_ft_sampler_num[0x8];
+	u8 metadata_reg_b_width[0x8];
+	u8 metadata_reg_a_width[0x8];
+	u8 reserved_at_60[0x18];
+	u8 log_max_ft_num[0x8];
+	u8 reserved_at_80[0x10];
+	u8 log_max_flow_counter[0x8];
+	u8 log_max_destination[0x8];
+	u8 reserved_at_a0[0x18];
+	u8 log_max_flow[0x8];
+	u8 reserved_at_c0[0x140];
+};
+
+struct mlx5_ifc_flow_table_nic_cap_bits {
+	u8	   reserved_at_0[0x200];
+	struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties;
+};
+
 union mlx5_ifc_hca_cap_union_bits {
 	struct mlx5_ifc_cmd_hca_cap_bits cmd_hca_cap;
 	struct mlx5_ifc_per_protocol_networking_offload_caps_bits
 	       per_protocol_networking_offload_caps;
 	struct mlx5_ifc_qos_cap_bits qos_cap;
 	struct mlx5_ifc_virtio_emulation_cap_bits vdpa_caps;
+	struct mlx5_ifc_flow_table_nic_cap_bits flow_table_nic_cap;
 	u8 reserved_at_0[0x8000];
 };
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v3 4/7] net/mlx5: add the validate sample action
  2020-07-06 17:51   ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Jiawei Wang
                       ` (2 preceding siblings ...)
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 3/7] common/mlx5: query sampler object capability via DevX Jiawei Wang
@ 2020-07-06 17:51     ` Jiawei Wang
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 5/7] net/mlx5: split sample flow into two sub flows Jiawei Wang
                       ` (4 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-06 17:51 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add the sample action validation function.

Sample flows are supported in the NIC-RX and FDB domains. In NIC_RX the
sample actions must include a dest TIR action.

Only NIC_RX supports additional optional actions. FDB doesn't
support any optional action; the sampled packets always go
to the e-switch manager port.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 drivers/net/mlx5/linux/mlx5_os.c |  14 +++++
 drivers/net/mlx5/mlx5.h          |   1 +
 drivers/net/mlx5/mlx5_flow.h     |   1 +
 drivers/net/mlx5/mlx5_flow_dv.c  | 133 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 149 insertions(+)
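
The rules that flow_dv_validate_action_sample() enforces in the diff below can be condensed into a small standalone model. This is only a reading aid under stated assumptions: the sketch_* types are hypothetical stand-ins for rte_flow_attr and the action list, the device capability checks (DevX, sampler_en) and the per-action validators are left out, and the duplicate-sample and meter-ordering checks are omitted.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the rte_flow action types used as sub-actions. */
enum sketch_act {
	SKETCH_END = 0,
	SKETCH_QUEUE,
	SKETCH_MARK,
	SKETCH_COUNT,
	SKETCH_OTHER,
};

/* Hypothetical stand-in for the relevant fields of struct rte_flow_attr. */
struct sketch_attr {
	uint32_t group;
	unsigned int ingress:1;
	unsigned int egress:1;
	unsigned int transfer:1;
};

/* Returns 0 when the sample action is accepted, a negative errno otherwise. */
static int
sketch_validate_sample(const struct sketch_attr *attr, uint32_t ratio,
		       const enum sketch_act *acts)
{
	const enum sketch_act *a;
	int has_queue = 0;

	if (attr->group == 0)
		return -ENOTSUP; /* root table is not supported */
	if (ratio == 0)
		return -EINVAL; /* ratio values start from 1 */
	for (a = acts; *a != SKETCH_END; a++) {
		switch (*a) {
		case SKETCH_QUEUE:
			has_queue = 1;
			break;
		case SKETCH_MARK:
		case SKETCH_COUNT:
			break;
		default:
			return -ENOTSUP; /* any other sub-action is rejected */
		}
	}
	if (attr->ingress && !attr->transfer)
		return has_queue ? 0 : -EINVAL; /* NIC Rx needs a QUEUE fate */
	if (attr->egress && !attr->transfer)
		return -ENOTSUP; /* NIC Tx is not supported */
	/* E-Switch: the sub-action list must be empty. */
	return acts[0] == SKETCH_END ? 0 : -ENOTSUP;
}
```

For example, an ingress rule in group 1 with sub-actions MARK and QUEUE is accepted, while the same sub-actions on a transfer rule are rejected because the E-Switch path takes no optional actions.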

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 2dc57b2..6dfacf2 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -878,6 +878,20 @@
 			}
 		}
 #endif
+#if defined(HAVE_MLX5DV_DR) && defined(HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE)
+		if (config.hca_attr.log_max_ft_sampler_num > 0  &&
+		    config.dv_flow_en) {
+			priv->sampler_en = 1;
+			DRV_LOG(DEBUG, "The Sampler enabled!\n");
+		} else {
+			priv->sampler_en = 0;
+			if (!config.hca_attr.log_max_ft_sampler_num)
+				DRV_LOG(WARNING, "No available register for"
+						" Sampler.");
+			else
+				DRV_LOG(DEBUG, "DV flow is not supported!\n");
+		}
+#endif
 	}
 	if (config.mprq.enabled && mprq) {
 		if (config.mprq.stride_num_n &&
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 46e66eb..6790738 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -617,6 +617,7 @@ struct mlx5_priv {
 	unsigned int counter_fallback:1; /* Use counter fallback management. */
 	unsigned int mtr_en:1; /* Whether support meter. */
 	unsigned int mtr_reg_share:1; /* Whether support meter REG_C share. */
+	unsigned int sampler_en:1; /* Whether support sampler. */
 	uint16_t domain_id; /* Switch domain identifier. */
 	uint16_t vport_id; /* Associated VF vport index (if any). */
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 43cbda8..45a073c 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -202,6 +202,7 @@ enum mlx5_feature_name {
 #define MLX5_FLOW_ACTION_SET_IPV6_DSCP (1ull << 33)
 #define MLX5_FLOW_ACTION_AGE (1ull << 34)
 #define MLX5_FLOW_ACTION_DEFAULT_MISS (1ull << 35)
+#define MLX5_FLOW_ACTION_SAMPLE (1ull << 36)
 
 #define MLX5_FLOW_FATE_ACTIONS \
 	(MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE | \
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 8b5b683..b5c94d0 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -3965,6 +3965,130 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Validate the sample action.
+ *
+ * @param[in] action_flags
+ *   Holds the actions detected until now.
+ * @param[in] action
+ *   Pointer to the sample action.
+ * @param[in] dev
+ *   Pointer to the Ethernet device structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this action.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_action_sample(uint64_t action_flags,
+			      const struct rte_flow_action *action,
+			      struct rte_eth_dev *dev,
+			      const struct rte_flow_attr *attr,
+			      struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_config *dev_conf = &priv->config;
+	const struct rte_flow_action_sample *sample = action->conf;
+	const struct rte_flow_action *act = sample ? sample->actions : NULL;
+	uint64_t sub_action_flags = 0;
+	int actions_n = 0;
+	int ret;
+
+	if (!attr->group)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
+					  NULL, "root table is not supported");
+	if (!priv->config.devx || !priv->sampler_en)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "sample action not supported");
+	if (!(action->conf))
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "configuration cannot be null");
+	if (sample->ratio == 0)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "ratio value starts from 1");
+	if (action_flags & MLX5_FLOW_ACTION_SAMPLE)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "Duplicate sample actions set");
+	if (action_flags & MLX5_FLOW_ACTION_METER)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "wrong action order, meter should "
+					  "be after sample action");
+	for (; act->type != RTE_FLOW_ACTION_TYPE_END; act++) {
+		if (actions_n == MLX5_DV_MAX_NUMBER_OF_ACTIONS)
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  act, "too many actions");
+		switch (act->type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			ret = mlx5_flow_validate_action_queue(act,
+							      sub_action_flags,
+							      dev,
+							      attr, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			++actions_n;
+			break;
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			ret = flow_dv_validate_action_mark(dev, act,
+							   sub_action_flags,
+							   attr, error);
+			if (ret < 0)
+				return ret;
+			if (dev_conf->dv_xmeta_en != MLX5_XMETA_MODE_LEGACY)
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK |
+						MLX5_FLOW_ACTION_MARK_EXT;
+			else
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK;
+			++actions_n;
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			ret = flow_dv_validate_action_count(dev, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_COUNT;
+			++actions_n;
+			break;
+		default:
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "Doesn't support optional "
+						  "action");
+		}
+	}
+	if (attr->ingress && !attr->transfer) {
+		if (!(sub_action_flags & MLX5_FLOW_ACTION_QUEUE))
+			return rte_flow_error_set(error, EINVAL,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "Ingress must have a dest "
+						  "QUEUE for Sample");
+	} else if (attr->egress && !attr->transfer) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL,
+					  "Sample Only support Ingress "
+					  "or E-Switch");
+	} else if (sample->actions->type != RTE_FLOW_ACTION_TYPE_END) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "E-Switch doesn't support any "
+					  "optional action for sampling");
+	}
+	return 0;
+}
+
+/**
  * Find existing modify-header resource or create and register a new one.
  *
  * @param dev[in, out]
@@ -5598,6 +5722,15 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			rw_act_num += MLX5_ACT_NUM_SET_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			ret = flow_dv_validate_action_sample(action_flags,
+							     actions, dev,
+							     attr, error);
+			if (ret < 0)
+				return ret;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			++actions_n;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ACTION,
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v3 5/7] net/mlx5: split sample flow into two sub flows
  2020-07-06 17:51   ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Jiawei Wang
                       ` (3 preceding siblings ...)
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 4/7] net/mlx5: add the validate sample action Jiawei Wang
@ 2020-07-06 17:51     ` Jiawei Wang
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 6/7] net/mlx5: update translate function for sample action Jiawei Wang
                       ` (3 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-06 17:51 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add the definitions of the sample action resource structures.

A flow with a sample action is split into two sub flows: the prefix
flow, which carries the sample action, and the suffix flow, which
carries the remaining actions.

The prefix flow gets an extra tag action that writes a unique id to a
metadata register, and the suffix flow gets an extra tag item that
matches that unique id.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
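Note, below the cut line so not part of the commit: a minimal sketch
of the prefix/suffix split described above, assuming simplified
stand-in types. enum act_type and split_sample_actions() are
hypothetical names that only model the ordering rules visible in
flow_sample_split_prep(); they are not the PMD code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rte_flow action types. */
enum act_type { ACT_END, ACT_TAG, ACT_SAMPLE, ACT_JUMP, ACT_METER,
		ACT_MARK, ACT_QUEUE };

/*
 * Split the END-terminated list src into pre and sfx.
 * Actions up to and including the first SAMPLE go to pre, except JUMP
 * and METER, which always go to sfx. Everything after SAMPLE goes to
 * sfx. If add_tag is set, a TAG action is placed right before SAMPLE,
 * as the patch does for NIC Rx flows.
 * Both outputs are END-terminated. In this sketch each output must
 * have room for the number of source actions plus two entries.
 */
static void
split_sample_actions(const enum act_type *src, int add_tag,
		     enum act_type *pre, enum act_type *sfx)
{
	int pre_sample = 1;

	assert(src != NULL && pre != NULL && sfx != NULL);
	for (; *src != ACT_END; src++) {
		int to_sfx = !pre_sample ||
			     *src == ACT_JUMP || *src == ACT_METER;

		if (*src == ACT_SAMPLE && pre_sample && add_tag)
			*pre++ = ACT_TAG;
		if (to_sfx)
			*sfx++ = *src;
		else
			*pre++ = *src;
		if (*src == ACT_SAMPLE)
			pre_sample = 0;
	}
	*pre = ACT_END;
	*sfx = ACT_END;
}
```

With add_tag set, the list { MARK, SAMPLE, QUEUE, END } yields the
prefix { MARK, TAG, SAMPLE, END } and the suffix { QUEUE, END }. In the
patch the TAG action carries the unique id, and the suffix flow matches
on that id with an extra tag item.
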
 drivers/net/mlx5/mlx5.c      |  11 ++
 drivers/net/mlx5/mlx5.h      |   3 +
 drivers/net/mlx5/mlx5_flow.c | 258 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_flow.h |  36 ++++++
 4 files changed, 304 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 07c6add..db55545 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -241,6 +241,17 @@ static LIST_HEAD(, mlx5_dev_ctx_shared) mlx5_dev_ctx_list =
 		.free = rte_free,
 		.type = "mlx5_jump_ipool",
 	},
+	{
+		.size = sizeof(struct mlx5_flow_dv_sample_resource),
+		.trunk_size = 64,
+		.grow_trunk = 3,
+		.grow_shift = 2,
+		.need_lock = 0,
+		.release_mem_en = 1,
+		.malloc = rte_malloc_socket,
+		.free = rte_free,
+		.type = "mlx5_sample_ipool",
+	},
 #endif
 	{
 		.size = sizeof(struct mlx5_flow_meter),
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 6790738..756bd68 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -51,6 +51,7 @@ enum mlx5_ipool_index {
 	MLX5_IPOOL_TAG, /* Pool for tag resource. */
 	MLX5_IPOOL_PORT_ID, /* Pool for port id resource. */
 	MLX5_IPOOL_JUMP, /* Pool for jump resource. */
+	MLX5_IPOOL_SAMPLE, /* Pool for sample resource. */
 #endif
 	MLX5_IPOOL_MTR, /* Pool for meter resource. */
 	MLX5_IPOOL_MCP, /* Pool for metadata resource. */
@@ -518,6 +519,7 @@ struct mlx5_flow_tbl_resource {
 /* Tables for metering splits should be added here. */
 #define MLX5_MAX_TABLES_EXTERNAL (MLX5_MAX_TABLES - 3)
 #define MLX5_MAX_TABLES_FDB UINT16_MAX
+#define MLX5_FLOW_TABLE_FACTOR 10
 
 /* ID generation structure. */
 struct mlx5_flow_id_pool {
@@ -566,6 +568,7 @@ struct mlx5_dev_ctx_shared {
 	struct mlx5_hlist *tag_table;
 	uint32_t port_id_action_list; /* List of port ID actions. */
 	uint32_t push_vlan_action_list; /* List of push VLAN actions. */
+	uint32_t sample_action_list; /* List of sample actions. */
 	struct mlx5_flow_counter_mng cmng; /* Counters management structure. */
 	struct mlx5_flow_default_miss_resource default_miss;
 	/* Default miss action resource structure. */
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index ae5ccc2..7ed9ba3 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -3917,6 +3917,139 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	return 0;
 }
 
+
+/**
+ * Check whether the action list contains the given action type.
+ *
+ * @param[in] actions
+ *   Pointer to the list of actions.
+ * @param[in] action
+ *   The action type to look for.
+ *
+ * @return
+ *   > 0 the total number of actions, END included, if found.
+ *   0 if the action type is not in the action list.
+ */
+static int
+flow_check_match_action(const struct rte_flow_action actions[],
+					enum rte_flow_action_type action)
+{
+	int actions_n = 0;
+	int flag = 0;
+
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		if (actions->type == action)
+			flag = 1;
+		actions_n++;
+	}
+	/* Count RTE_FLOW_ACTION_TYPE_END. */
+	return flag ? actions_n + 1 : 0;
+}
+
+/**
+ * Split the sample flow.
+ *
+ * The sample flow is split into two sub flows: the prefix flow keeps
+ * the sample action, the actions that follow it move to the suffix flow.
+ *
+ * A tag action with a unique tag id is also added to the prefix flow,
+ * and the suffix flow matches on the same tag id.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[out] sfx_items
+ *   Suffix flow match items (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] actions_sfx
+ *   Suffix flow actions.
+ * @param[out] actions_pre
+ *   Prefix flow actions.
+ *
+ * @return
+ *   Unique tag id for NIC Rx flows, 0 for E-Switch flows or on failure.
+ */
+static int
+flow_sample_split_prep(struct rte_eth_dev *dev,
+		 const struct rte_flow_attr *attr,
+		 struct rte_flow_item sfx_items[],
+		 const struct rte_flow_action actions[],
+		 struct rte_flow_action actions_sfx[],
+		 struct rte_flow_action actions_pre[])
+{
+	struct mlx5_rte_flow_action_set_tag *set_tag;
+	struct mlx5_rte_flow_item_tag *tag_spec;
+	struct mlx5_rte_flow_item_tag *tag_mask;
+	struct rte_flow_item *tag_item;
+	struct rte_flow_action *tag_action = NULL;
+	bool pre_sample = true;
+	struct rte_flow_error error;
+	uint32_t tag_id = 0;
+
+	/* Prepare the actions for prefix and suffix flow. */
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		struct rte_flow_action **action_cur = NULL;
+
+		switch (actions->type) {
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			if (!attr->transfer) {
+				/* Add the extra tag action first for NIC-RX. */
+				tag_action = actions_pre;
+				tag_action->type = (enum rte_flow_action_type)
+						MLX5_RTE_FLOW_ACTION_TYPE_TAG;
+				actions_pre++;
+			}
+			break;
+		case RTE_FLOW_ACTION_TYPE_JUMP:
+		case RTE_FLOW_ACTION_TYPE_METER:
+			action_cur = &actions_sfx;
+			break;
+		default:
+			break;
+		}
+		if (pre_sample && !action_cur)
+			action_cur = &actions_pre;
+		else
+			action_cur = &actions_sfx;
+		memcpy(*action_cur, actions, sizeof(struct rte_flow_action));
+		(*action_cur)++;
+		if (actions->type == RTE_FLOW_ACTION_TYPE_SAMPLE)
+			pre_sample = false;
+	}
+	/* Add end action to the actions. */
+	actions_sfx->type = RTE_FLOW_ACTION_TYPE_END;
+	actions_pre->type = RTE_FLOW_ACTION_TYPE_END;
+	if (!attr->transfer) {
+		actions_pre++;
+		/* Set the tag. */
+		set_tag = (void *)actions_pre;
+		set_tag->id = mlx5_flow_get_reg_id(dev, MLX5_APP_TAG,
+						   0, &error);
+		tag_id = flow_qrss_get_id(dev);
+		set_tag->data = tag_id;
+		assert(tag_action);
+		tag_action->conf = set_tag;
+		/* Prepare the suffix subflow items. */
+		tag_item = sfx_items++;
+		sfx_items->type = RTE_FLOW_ITEM_TYPE_END;
+		sfx_items++;
+		tag_spec = (struct mlx5_rte_flow_item_tag *)sfx_items;
+		tag_spec->data = tag_id;
+		tag_spec->id = set_tag->id;
+		tag_mask = tag_spec + 1;
+		tag_mask->data = UINT32_MAX;
+		tag_mask->id = UINT16_MAX;
+		tag_item->type = (enum rte_flow_item_type)
+				MLX5_RTE_FLOW_ITEM_TYPE_TAG;
+		tag_item->spec = tag_spec;
+		tag_item->last = NULL;
+		tag_item->mask = tag_mask;
+	}
+	return tag_id;
+}
+
 /**
  * The splitting for metadata feature.
  *
@@ -4176,6 +4309,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 static int
 flow_create_split_meter(struct rte_eth_dev *dev,
 			   struct rte_flow *flow,
+			   uint64_t prefix_layers,
 			   const struct rte_flow_attr *attr,
 			   const struct rte_flow_item items[],
 			   const struct rte_flow_action actions[],
@@ -4222,8 +4356,9 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 			goto exit;
 		}
 		/* Add the prefix subflow. */
-		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
-					      items, pre_actions, external,
+		ret = flow_create_split_inner(dev, flow, &dev_flow,
+					      prefix_layers, attr, items,
+					      pre_actions, external,
 					      flow_idx, error);
 		if (ret) {
 			ret = -rte_errno;
@@ -4238,7 +4373,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	/* Add the prefix subflow. */
 	ret = flow_create_split_metadata(dev, flow, dev_flow ?
 					 flow_get_prefix_layer_flags(dev_flow) :
-					 0, &sfx_attr,
+					 prefix_layers, &sfx_attr,
 					 sfx_items ? sfx_items : items,
 					 sfx_actions ? sfx_actions : actions,
 					 external, flow_idx, error);
@@ -4249,6 +4384,121 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 }
 
 /**
+ * The splitting for sample feature.
+ *
+ * The sample flow will be split to two flows as prefix and
+ * suffix flow.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] flow
+ *   Parent flow structure pointer.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] items
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[in] external
+ *   This flow rule is created by request external to PMD.
+ * @param[in] flow_idx
+ *   Memory pool index of the flow.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ * @return
+ *   0 on success, negative value otherwise
+ */
+static int
+flow_create_split_sample(struct rte_eth_dev *dev,
+			   struct rte_flow *flow,
+			   const struct rte_flow_attr *attr,
+			   const struct rte_flow_item items[],
+			   const struct rte_flow_action actions[],
+			   bool external, uint32_t flow_idx,
+			   struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct rte_flow_action *sfx_actions = NULL;
+	struct rte_flow_action *pre_actions = NULL;
+	struct rte_flow_item *sfx_items = NULL;
+	struct mlx5_flow *dev_flow = NULL;
+	struct rte_flow_attr sfx_attr = *attr;
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	struct mlx5_flow_tbl_data_entry *sfx_tbl_data;
+	struct mlx5_flow_tbl_resource *sfx_tbl;
+	union mlx5_flow_tbl_key sfx_table_key;
+#endif
+	size_t act_size;
+	size_t item_size;
+	uint32_t tag_id = 0;
+	int actions_n = 0;
+	int ret = 0;
+
+	if (priv->sampler_en)
+		actions_n = flow_check_match_action(actions,
+					RTE_FLOW_ACTION_TYPE_SAMPLE);
+	if (actions_n) {
+		/* The prefix actions must include sample, tag, end. */
+		act_size = sizeof(struct rte_flow_action) * (actions_n * 2) +
+			   sizeof(struct mlx5_rte_flow_action_set_tag);
+		/* tag, end. */
+#define SAMPLE_SUFFIX_ITEM 2
+		item_size = sizeof(struct rte_flow_item) * SAMPLE_SUFFIX_ITEM +
+			    sizeof(struct mlx5_rte_flow_item_tag) * 2;
+		sfx_actions = rte_zmalloc(__func__, (act_size + item_size), 0);
+		if (!sfx_actions)
+			return rte_flow_error_set(error, ENOMEM,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL, "no memory to split "
+						  "sample flow");
+		if (!attr->transfer)
+			sfx_items = (struct rte_flow_item *)((char *)sfx_actions
+					+ act_size);
+		pre_actions = sfx_actions + actions_n;
+		tag_id = flow_sample_split_prep(dev, attr, sfx_items,
+						   actions, sfx_actions,
+						   pre_actions);
+		if (!attr->transfer && !tag_id) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		/* Add the prefix subflow. */
+		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
+					      items, pre_actions, external,
+					      flow_idx, error);
+		if (ret) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		dev_flow->handle->split_flow_id = tag_id;
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+		/* Set the sfx group attr. */
+		sample_res = (struct mlx5_flow_dv_sample_resource *)
+					dev_flow->dv.sample_res;
+		sfx_tbl = (struct mlx5_flow_tbl_resource *)
+					sample_res->normal_path_tbl;
+		sfx_tbl_data = container_of(sfx_tbl,
+					struct mlx5_flow_tbl_data_entry, tbl);
+		sfx_table_key.v64 = sfx_tbl_data->entry.key;
+		sfx_attr.group = sfx_attr.transfer ?
+					(sfx_table_key.table_id - 1) :
+					sfx_table_key.table_id;
+#endif
+	}
+	/* Add the suffix subflow. */
+	ret = flow_create_split_meter(dev, flow, dev_flow ?
+				 flow_get_prefix_layer_flags(dev_flow) : 0,
+				 &sfx_attr, sfx_items ? sfx_items : items,
+				 sfx_actions ? sfx_actions : actions,
+				 external, flow_idx, error);
+exit:
+	if (sfx_actions)
+		rte_free(sfx_actions);
+	return ret;
+}
+
+/**
  * Split the flow to subflow set. The splitters might be linked
  * in the chain, like this:
  * flow_create_split_outer() calls:
@@ -4296,7 +4546,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 {
 	int ret;
 
-	ret = flow_create_split_meter(dev, flow, attr, items,
+	ret = flow_create_split_sample(dev, flow, attr, items,
 					 actions, external, flow_idx, error);
 	MLX5_ASSERT(ret <= 0);
 	return ret;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 45a073c..51826f8 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -502,6 +502,38 @@ struct mlx5_flow_tbl_data_entry {
 	uint32_t idx; /**< index for the indexed mempool. */
 };
 
+/* Sub rdma-core actions list. */
+struct mlx5_flow_sub_actions_list {
+	uint32_t actions_num; /**< Number of sample actions. */
+	uint64_t action_flags;
+	void *dr_queue_action;
+	void *dr_tag_action;
+	void *dr_cnt_action;
+};
+
+/* Sample sub-actions resource list. */
+struct mlx5_flow_sub_actions_idx {
+	uint32_t rix_hrxq; /**< Hash Rx queue object index. */
+	uint32_t rix_tag; /**< Index to the tag action. */
+	uint32_t cnt;
+};
+
+/* Sample action resource structure. */
+struct mlx5_flow_dv_sample_resource {
+	ILIST_ENTRY(uint32_t)next; /**< Pointer to next element. */
+	rte_atomic32_t refcnt; /**< Reference counter. */
+	void *verbs_action; /**< Verbs sample action object. */
+	uint8_t ft_type; /**< Flow Table Type */
+	uint32_t ft_id; /**< Flow Table Level */
+	void *normal_path_tbl; /**< Flow Table pointer */
+	void *default_miss; /**< default_miss dr_action. */
+	uint32_t ratio;   /**< Sample Ratio */
+	struct mlx5_flow_sub_actions_idx sample_idx;
+	/**< Action index resources. */
+	struct mlx5_flow_sub_actions_list sample_act;
+	/**< Action resources. */
+};
+
 /* Verbs specification header. */
 struct ibv_spec_header {
 	enum ibv_flow_spec_type type;
@@ -530,6 +562,8 @@ struct mlx5_flow_handle_dv {
 	/**< Index to push VLAN action resource in cache. */
 	uint32_t rix_tag;
 	/**< Index to the tag action. */
+	uint32_t rix_sample;
+	/**< Index to sample action resource in cache. */
 } __rte_packed;
 
 /** Device flow handle structure: used both for creating & destroying. */
@@ -595,6 +629,8 @@ struct mlx5_flow_dv_workspace {
 	/**< Pointer to the jump action resource. */
 	struct mlx5_flow_dv_match_params value;
 	/**< Holds the value that the packet is compared to. */
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	/**< Pointer to the sample action resource. */
 };
 
 /*
-- 
1.8.3.1



* [dpdk-dev] [PATCH v3 6/7] net/mlx5: update translate function for sample action
  2020-07-06 17:51   ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Jiawei Wang
                       ` (4 preceding siblings ...)
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 5/7] net/mlx5: split sample flow into two sub flows Jiawei Wang
@ 2020-07-06 17:51     ` Jiawei Wang
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 7/7] app/testpmd: add testpmd command " Jiawei Wang
                       ` (2 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-06 17:51 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Translate the attributes of the sample action, which include the
sample ratio and the sub-action list, then create the sample DR action.

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
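Note, below the cut line so not part of the commit: a minimal sketch
of the find-or-create pattern that flow_dv_sample_resource_register()
follows, assuming a plain linked list instead of the indexed pool.
struct sample_res and sample_res_get() are hypothetical names. The
patch also compares the sub-action list with memcmp() and creates the
DR objects; both are left out here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the sample resource. */
struct sample_res {
	struct sample_res *next;
	uint32_t refcnt;
	uint32_t ratio;
	uint32_t ft_id;
	uint8_t ft_type;
};

/*
 * Return a cached resource with the same key (ratio, ft_type, ft_id)
 * after taking one more reference on it, or allocate a new one with
 * refcnt 1 and link it at the head of the list.
 * Returns NULL only if the allocation fails.
 */
static struct sample_res *
sample_res_get(struct sample_res **head, uint32_t ratio,
	       uint8_t ft_type, uint32_t ft_id)
{
	struct sample_res *res;

	assert(head != NULL);
	for (res = *head; res != NULL; res = res->next) {
		if (res->ratio == ratio && res->ft_type == ft_type &&
		    res->ft_id == ft_id) {
			res->refcnt++;
			return res;
		}
	}
	res = calloc(1, sizeof(*res));
	if (res == NULL)
		return NULL;
	res->ratio = ratio;
	res->ft_type = ft_type;
	res->ft_id = ft_id;
	res->refcnt = 1;
	res->next = *head;
	*head = res;
	return res;
}
```

Two flows that request the same ratio on the same table share one
entry and only bump its reference count; a different ratio gets its own
entry.
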
 drivers/net/mlx5/mlx5_flow.c    |  16 +-
 drivers/net/mlx5/mlx5_flow.h    |  14 +-
 drivers/net/mlx5/mlx5_flow_dv.c | 494 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 502 insertions(+), 22 deletions(-)
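
A small sketch of the group arithmetic in this patch. It assumes the
intent is to leave unused table ids between application groups, so the
sampler's normal path table (ft_id + 1) does not land on another
application group. That intent is an inference from the diff; the
commit message does not state it. scale_group() and
normal_path_table() are hypothetical names, and any further offset
applied by mlx5_flow_group_to_table() is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_FACTOR 10u /* Same value as MLX5_FLOW_TABLE_FACTOR. */

/* Scale a group requested by the application; keep internal ones. */
static uint32_t
scale_group(uint32_t group, int external)
{
	return external ? group * TABLE_FACTOR : group;
}

/* Table used by the normal (non-sampled) path of a sample flow. */
static uint32_t
normal_path_table(uint32_t ft_id)
{
	const uint32_t next_ft_step = 1;

	assert(ft_id < UINT32_MAX);
	return ft_id + next_ft_step;
}
```

Under these assumptions application group 1 maps to table 10 and its
normal path table is 11, which stays below table 20 used by application
group 2.
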

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 7ed9ba3..c91ae7d 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -4612,10 +4612,14 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	int hairpin_flow;
 	uint32_t hairpin_id = 0;
 	struct rte_flow_attr attr_tx = { .priority = 0 };
+	struct rte_flow_attr attr_factor = {0};
 	int ret;
 
-	hairpin_flow = flow_check_hairpin_split(dev, attr, actions);
-	ret = flow_drv_validate(dev, attr, items, p_actions_rx,
+	memcpy((void *)&attr_factor, (const void *)attr, sizeof(*attr));
+	if (external)
+		attr_factor.group *= MLX5_FLOW_TABLE_FACTOR;
+	hairpin_flow = flow_check_hairpin_split(dev, &attr_factor, actions);
+	ret = flow_drv_validate(dev, &attr_factor, items, p_actions_rx,
 				external, hairpin_flow, error);
 	if (ret < 0)
 		return 0;
@@ -4634,7 +4638,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 		rte_errno = ENOMEM;
 		goto error_before_flow;
 	}
-	flow->drv_type = flow_get_drv_type(dev, attr);
+	flow->drv_type = flow_get_drv_type(dev, &attr_factor);
 	if (hairpin_id != 0)
 		flow->hairpin_flow_id = hairpin_id;
 	MLX5_ASSERT(flow->drv_type > MLX5_FLOW_TYPE_MIN &&
@@ -4680,7 +4684,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 		 * depending on configuration. In the simplest
 		 * case it just creates unmodified original flow.
 		 */
-		ret = flow_create_split_outer(dev, flow, attr,
+		ret = flow_create_split_outer(dev, flow, &attr_factor,
 					      buf->entry[i].pattern,
 					      p_actions_rx, external, idx,
 					      error);
@@ -4717,8 +4721,8 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
 	 * the egress Flows belong to the different device and
 	 * copy table should be updated in peer NIC Rx domain.
 	 */
-	if (attr->ingress &&
-	    (external || attr->group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
+	if (attr_factor.ingress &&
+	    (external || attr_factor.group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
 		ret = flow_mreg_update_copy_table(dev, flow, actions, error);
 		if (ret)
 			goto error;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 51826f8..99e900b 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -372,6 +372,13 @@ enum mlx5_flow_fate_type {
 	MLX5_FLOW_FATE_MAX,
 };
 
+/*
+ * Max number of actions per DV flow.
+ * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
+ * in rdma-core file providers/mlx5/verbs.c.
+ */
+#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
+
 /* Matcher PRM representation */
 struct mlx5_flow_dv_match_params {
 	size_t size;
@@ -604,13 +611,6 @@ struct mlx5_flow_handle {
 #define MLX5_FLOW_HANDLE_VERBS_SIZE (sizeof(struct mlx5_flow_handle))
 #endif
 
-/*
- * Max number of actions per DV flow.
- * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
- * in rdma-core file providers/mlx5/verbs.c.
- */
-#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
-
 /** Device flow structure only for DV flow creation. */
 struct mlx5_flow_dv_workspace {
 	uint32_t group; /**< The group index. */
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index b5c94d0..1a359e4 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -82,6 +82,10 @@
 static int
 flow_dv_default_miss_resource_release(struct rte_eth_dev *dev);
 
+static int
+flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
+				      uint32_t encap_decap_idx);
+
 /**
  * Initialize flow attributes structure according to flow items' types.
  *
@@ -7962,6 +7966,373 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Create an Rx Hash queue.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] rss_desc
+ *   Pointer to the mlx5_flow_rss_desc.
+ * @param[out] hrxq_idx
+ *   Hash Rx queue index.
+ *
+ * @return
+ *   The Verbs/DevX object initialised, NULL otherwise and rte_errno is set.
+ */
+static struct mlx5_hrxq *
+flow_dv_handle_rx_queue(struct rte_eth_dev *dev,
+			  struct mlx5_flow *dev_flow,
+			  struct mlx5_flow_rss_desc *rss_desc,
+			  uint32_t *hrxq_idx)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_flow_handle *dh = dev_flow->handle;
+	struct mlx5_hrxq *hrxq;
+
+	MLX5_ASSERT(rss_desc->queue_num);
+	*hrxq_idx = mlx5_hrxq_get(dev, rss_desc->key,
+				 MLX5_RSS_HASH_KEY_LEN,
+				 dev_flow->hash_fields,
+				 rss_desc->queue,
+				 rss_desc->queue_num);
+	if (!*hrxq_idx) {
+		*hrxq_idx = mlx5_hrxq_new
+				(dev, rss_desc->key,
+				MLX5_RSS_HASH_KEY_LEN,
+				dev_flow->hash_fields,
+				rss_desc->queue,
+				rss_desc->queue_num,
+				!!(dh->layers &
+				MLX5_FLOW_LAYER_TUNNEL));
+		if (!*hrxq_idx)
+			return NULL;
+	}
+	hrxq = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_HRXQ],
+			      *hrxq_idx);
+	return hrxq;
+}
+
+/**
+ * Find existing sample resource or create and register a new one.
+ *
+ * @param[in, out] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[in] resource
+ *   Pointer to sample resource.
+ * @param[in, out] dev_flow
+ *   Pointer to the dev_flow.
+ * @param[in, out] sample_dv_actions
+ *   Pointer to sample actions list.
+ * @param[out] error
+ *   pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_sample_resource_register(struct rte_eth_dev *dev,
+			 const struct rte_flow_attr *attr,
+			 struct mlx5_flow_dv_sample_resource *resource,
+			 struct mlx5_flow *dev_flow,
+			 void **sample_dv_actions,
+			 struct rte_flow_error *error)
+{
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+	struct mlx5dv_dr_flow_sampler_attr sampler_attr;
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_ctx_shared *sh = priv->sh;
+	struct mlx5_flow_tbl_resource *tbl;
+	uint32_t idx = 0;
+	const uint32_t next_ft_step = 1;
+	uint32_t next_ft_id = resource->ft_id + next_ft_step;
+
+	/* Lookup a matching resource from cache. */
+	ILIST_FOREACH(sh->ipool[MLX5_IPOOL_SAMPLE], sh->sample_action_list,
+		      idx, cache_resource, next) {
+		if (resource->ratio == cache_resource->ratio &&
+		    resource->ft_type == cache_resource->ft_type &&
+		    resource->ft_id == cache_resource->ft_id &&
+		    !memcmp((void *)&resource->sample_act,
+			    (void *)&cache_resource->sample_act,
+			    sizeof(struct mlx5_flow_sub_actions_list))) {
+			DRV_LOG(DEBUG, "sample resource %p: refcnt %d++",
+				(void *)cache_resource,
+				rte_atomic32_read(&cache_resource->refcnt));
+			rte_atomic32_inc(&cache_resource->refcnt);
+			dev_flow->handle->dvh.rix_sample = idx;
+			dev_flow->dv.sample_res = cache_resource;
+			return 0;
+		}
+	}
+	/* Register new sample resource. */
+	cache_resource = mlx5_ipool_zmalloc(sh->ipool[MLX5_IPOOL_SAMPLE],
+				       &dev_flow->handle->dvh.rix_sample);
+	if (!cache_resource)
+		return rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "cannot allocate resource memory");
+	*cache_resource = *resource;
+	/* Create normal path table level */
+	tbl = flow_dv_tbl_resource_get(dev, next_ft_id,
+					attr->egress, attr->transfer, error);
+	if (!tbl) {
+		rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "fail to create normal path table "
+					  "for sample");
+		goto error;
+	}
+	cache_resource->normal_path_tbl = tbl;
+	if (resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		cache_resource->default_miss =
+				mlx5_glue->dr_create_flow_action_default_miss();
+		if (!cache_resource->default_miss) {
+			rte_flow_error_set(error, ENOMEM,
+						RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+						NULL,
+						"cannot create default miss "
+						"action");
+			goto error;
+		}
+		sample_dv_actions[resource->sample_act.actions_num++] =
+						cache_resource->default_miss;
+	}
+	/* Create a DR sample action */
+	sampler_attr.sample_ratio = cache_resource->ratio;
+	sampler_attr.default_next_table = tbl->obj;
+	sampler_attr.num_sample_actions = resource->sample_act.actions_num;
+	sampler_attr.sample_actions = (struct mlx5dv_dr_action **)
+							&sample_dv_actions[0];
+	cache_resource->verbs_action =
+		mlx5_glue->dr_create_flow_action_sampler(&sampler_attr);
+	if (!cache_resource->verbs_action) {
+		rte_flow_error_set(error, ENOMEM,
+					RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					NULL, "cannot create sample action");
+		goto error;
+	}
+	rte_atomic32_init(&cache_resource->refcnt);
+	rte_atomic32_inc(&cache_resource->refcnt);
+	ILIST_INSERT(sh->ipool[MLX5_IPOOL_SAMPLE], &sh->sample_action_list,
+		     dev_flow->handle->dvh.rix_sample, cache_resource,
+		     next);
+	dev_flow->dv.sample_res = cache_resource;
+	DRV_LOG(DEBUG, "new sample resource %p: refcnt %d++",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	return 0;
+error:
+	if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		if (cache_resource->default_miss)
+			claim_zero(mlx5_glue->destroy_flow_action
+				(cache_resource->default_miss));
+	} else {
+		if (cache_resource->sample_idx.rix_hrxq &&
+		    !mlx5_hrxq_release(dev,
+				cache_resource->sample_idx.rix_hrxq))
+			cache_resource->sample_idx.rix_hrxq = 0;
+		if (cache_resource->sample_idx.rix_tag &&
+		    !flow_dv_tag_release(dev,
+				cache_resource->sample_idx.rix_tag))
+			cache_resource->sample_idx.rix_tag = 0;
+		if (cache_resource->sample_idx.cnt) {
+			flow_dv_counter_release(dev,
+				cache_resource->sample_idx.cnt);
+			cache_resource->sample_idx.cnt = 0;
+		}
+	}
+	if (cache_resource->normal_path_tbl)
+		flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	mlx5_ipool_free(sh->ipool[MLX5_IPOOL_SAMPLE],
+				dev_flow->handle->dvh.rix_sample);
+	dev_flow->handle->dvh.rix_sample = 0;
+	return -rte_errno;
+}
+
+/**
+ * Convert Sample action to DV specification.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] action
+ *   Pointer to action structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] sample_actions
+ *   Pointer to sample actions list.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_translate_action_sample(struct rte_eth_dev *dev,
+				const struct rte_flow_action *action,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				void **sample_actions,
+				struct mlx5_flow_dv_sample_resource *res,
+				struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	const struct rte_flow_action_sample *sample_action;
+	const struct rte_flow_action *sub_actions;
+	const struct rte_flow_action_queue *queue;
+	struct mlx5_flow_sub_actions_list *sample_act;
+	struct mlx5_flow_sub_actions_idx *sample_idx;
+	struct mlx5_flow_rss_desc *rss_desc = &((struct mlx5_flow_rss_desc *)
+					      priv->rss_desc)
+					      [!!priv->flow_nested_idx];
+	uint64_t action_flags = 0;
+
+	sample_act = &res->sample_act;
+	sample_idx = &res->sample_idx;
+	sample_action = (const struct rte_flow_action_sample *)action->conf;
+	res->ratio = sample_action->ratio;
+	sub_actions = sample_action->actions;
+	for (; sub_actions->type != RTE_FLOW_ACTION_TYPE_END; sub_actions++) {
+		int type = sub_actions->type;
+		uint32_t pre_rix = 0;
+		void *pre_r;
+		switch (type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+		{
+			struct mlx5_hrxq *hrxq;
+			uint32_t hrxq_idx;
+
+			queue = sub_actions->conf;
+			rss_desc->queue_num = 1;
+			rss_desc->queue[0] = queue->index;
+			hrxq = flow_dv_handle_rx_queue(dev, dev_flow,
+					rss_desc, &hrxq_idx);
+			if (!hrxq)
+				return rte_flow_error_set
+					(error, rte_errno,
+					 RTE_FLOW_ERROR_TYPE_ACTION,
+					 NULL,
+					 "cannot create fate queue");
+			sample_act->dr_queue_action = hrxq->action;
+			sample_idx->rix_hrxq = hrxq_idx;
+			sample_actions[sample_act->actions_num++] =
+						hrxq->action;
+			action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			if (action_flags & MLX5_FLOW_ACTION_MARK)
+				dev_flow->handle->rix_hrxq = hrxq_idx;
+			dev_flow->handle->fate_action =
+					MLX5_FLOW_FATE_QUEUE;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_MARK:
+		{
+			uint32_t tag_be = mlx5_flow_mark_set
+				(((const struct rte_flow_action_mark *)
+				(sub_actions->conf))->id);
+			dev_flow->handle->mark = 1;
+			pre_rix = dev_flow->handle->dvh.rix_tag;
+			/* Save the mark resource before sample */
+			pre_r = dev_flow->dv.tag_resource;
+			if (flow_dv_tag_resource_register(dev, tag_be,
+						  dev_flow, error))
+				return -rte_errno;
+			MLX5_ASSERT(dev_flow->dv.tag_resource);
+			sample_act->dr_tag_action =
+				dev_flow->dv.tag_resource->action;
+			sample_idx->rix_tag =
+				dev_flow->handle->dvh.rix_tag;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_tag_action;
+			/* Recover the mark resource after sample */
+			dev_flow->dv.tag_resource = pre_r;
+			dev_flow->handle->dvh.rix_tag = pre_rix;
+			action_flags |= MLX5_FLOW_ACTION_MARK;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+		{
+			uint32_t counter;
+
+			counter = flow_dv_translate_create_counter(dev,
+					dev_flow, sub_actions->conf, 0);
+			if (!counter)
+				return rte_flow_error_set
+						(error, rte_errno,
+						 RTE_FLOW_ERROR_TYPE_ACTION,
+						 NULL,
+						 "cannot create counter"
+						 " object.");
+			sample_idx->cnt = counter;
+			sample_act->dr_cnt_action =
+				  (flow_dv_counter_get_by_idx(dev,
+				  counter, NULL))->action;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_cnt_action;
+			action_flags |= MLX5_FLOW_ACTION_COUNT;
+			break;
+		}
+		default:
+			return rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				NULL,
+				"Not supported for sampler action");
+		}
+	}
+	sample_act->action_flags = action_flags;
+	res->ft_id = dev_flow->dv.group;
+	if (attr->transfer)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_FDB;
+	else if (attr->ingress)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
+
+	return 0;
+}
+
+/**
+ * Create the Sample action from the translated sample resource.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[in] sample_actions
+ *   Pointer to sample path actions list.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_create_action_sample(struct rte_eth_dev *dev,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				struct mlx5_flow_dv_sample_resource *res,
+				void **sample_actions,
+				struct rte_flow_error *error)
+{
+	if (flow_dv_sample_resource_register(dev, attr, res, dev_flow,
+						sample_actions, error))
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL, "can't create sample action");
+	return 0;
+}
+
+/**
  * Fill the flow with DV spec, lock free
  * (mutex should be acquired by caller).
  *
@@ -8024,9 +8395,13 @@ struct field_modify_info modify_tcp[] = {
 	void *match_value = dev_flow->dv.value.buf;
 	uint8_t next_protocol = 0xff;
 	struct rte_vlan_hdr vlan = { 0 };
+	struct mlx5_flow_dv_sample_resource sample_res;
+	void *sample_actions[MLX5_DV_MAX_NUMBER_OF_ACTIONS] = {0};
+	uint32_t sample_act_pos = UINT32_MAX;
 	uint32_t table;
 	int ret = 0;
 
+	memset(&sample_res, 0, sizeof(struct mlx5_flow_dv_sample_resource));
 	mhdr_res->ft_type = attr->egress ? MLX5DV_FLOW_TABLE_TYPE_NIC_TX :
 					   MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
 	ret = mlx5_flow_group_to_table(attr, dev_flow->external, attr->group,
@@ -8045,7 +8420,6 @@ struct field_modify_info modify_tcp[] = {
 		const struct rte_flow_action_rss *rss;
 		const struct rte_flow_action *action = actions;
 		const uint8_t *rss_key;
-		const struct rte_flow_action_jump *jump_data;
 		const struct rte_flow_action_meter *mtr;
 		struct mlx5_flow_tbl_resource *tbl;
 		uint32_t port_id = 0;
@@ -8053,6 +8427,7 @@ struct field_modify_info modify_tcp[] = {
 		int action_type = actions->type;
 		const struct rte_flow_action *found_action = NULL;
 		struct mlx5_flow_meter *fm = NULL;
+		uint32_t jump_group = 0;
 
 		if (!mlx5_flow_os_action_supported(action_type))
 			return rte_flow_error_set(error, ENOTSUP,
@@ -8291,9 +8666,13 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_DECAP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_JUMP:
-			jump_data = action->conf;
+			jump_group = ((const struct rte_flow_action_jump *)
+							action->conf)->group;
+			if (dev_flow->external && jump_group <
+					MLX5_MAX_TABLES_EXTERNAL)
+				jump_group *= MLX5_FLOW_TABLE_FACTOR;
 			ret = mlx5_flow_group_to_table(attr, dev_flow->external,
-						       jump_data->group,
+						       jump_group,
 						       !!priv->fdb_def_rule,
 						       &table, error);
 			if (ret)
@@ -8459,6 +8838,19 @@ struct field_modify_info modify_tcp[] = {
 				return -rte_errno;
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			sample_act_pos = actions_n;
+			ret = flow_dv_translate_action_sample(dev,
+							      actions,
+							      dev_flow, attr,
+							      sample_actions,
+							      &sample_res,
+							      error);
+			if (ret < 0)
+				return ret;
+			actions_n++;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			break;
 		case RTE_FLOW_ACTION_TYPE_END:
 			actions_end = true;
 			if (mhdr_res->actions_num) {
@@ -8485,6 +8877,21 @@ struct field_modify_info modify_tcp[] = {
 					  (flow_dv_counter_get_by_idx(dev,
 					  flow->counter, NULL))->action;
 			}
+			if (action_flags & MLX5_FLOW_ACTION_SAMPLE) {
+				ret = flow_dv_create_action_sample(dev,
+							  dev_flow, attr,
+							  &sample_res,
+							  sample_actions,
+							  error);
+				if (ret < 0)
+					return rte_flow_error_set
+						(error, rte_errno,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						NULL,
+						"cannot create sample action");
+				dev_flow->dv.actions[sample_act_pos] =
+					dev_flow->dv.sample_res->verbs_action;
+			}
 			break;
 		default:
 			break;
@@ -8783,7 +9190,8 @@ struct field_modify_info modify_tcp[] = {
 				dh->rix_hrxq = UINT32_MAX;
 				dv->actions[n++] = drop_hrxq->action;
 			}
-		} else if (dh->fate_action == MLX5_FLOW_FATE_QUEUE) {
+		} else if (dh->fate_action == MLX5_FLOW_FATE_QUEUE &&
+			   !dv_h->rix_sample) {
 			struct mlx5_hrxq *hrxq;
 			uint32_t hrxq_idx;
 			struct mlx5_flow_rss_desc *rss_desc =
@@ -8915,18 +9323,18 @@ struct field_modify_info modify_tcp[] = {
  *
  * @param dev
  *   Pointer to Ethernet device.
- * @param handle
- *   Pointer to mlx5_flow_handle.
+ * @param encap_decap_idx
+ *   Index of encap decap resource.
  *
  * @return
  *   1 while a reference on it exists, 0 when freed.
  */
 static int
 flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
-				     struct mlx5_flow_handle *handle)
+				     uint32_t encap_decap_idx)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	uint32_t idx = handle->dvh.rix_encap_decap;
+	uint32_t idx = encap_decap_idx;
 	struct mlx5_flow_dv_encap_decap_resource *cache_resource;
 
 	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_DECAP_ENCAP],
@@ -9172,6 +9580,71 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Release a sample resource.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param handle
+ *   Pointer to mlx5_flow_handle.
+ *
+ * @return
+ *   1 while a reference on it exists, 0 when freed.
+ */
+static int
+flow_dv_sample_resource_release(struct rte_eth_dev *dev,
+				     struct mlx5_flow_handle *handle)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	uint32_t idx = handle->dvh.rix_sample;
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+
+	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			 idx);
+	if (!cache_resource)
+		return 0;
+	MLX5_ASSERT(cache_resource->verbs_action);
+	DRV_LOG(DEBUG, "sample resource %p: refcnt %d--",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	if (rte_atomic32_dec_and_test(&cache_resource->refcnt)) {
+		if (cache_resource->verbs_action)
+			claim_zero(mlx5_glue->destroy_flow_action
+					(cache_resource->verbs_action));
+		if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+			if (cache_resource->default_miss)
+				claim_zero(mlx5_glue->destroy_flow_action
+				  (cache_resource->default_miss));
+		}
+		if (cache_resource->normal_path_tbl)
+			flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	}
+	if (cache_resource->sample_idx.rix_hrxq &&
+		!mlx5_hrxq_release(dev,
+			cache_resource->sample_idx.rix_hrxq))
+		cache_resource->sample_idx.rix_hrxq = 0;
+	if (cache_resource->sample_idx.rix_tag &&
+		!flow_dv_tag_release(dev,
+			cache_resource->sample_idx.rix_tag))
+		cache_resource->sample_idx.rix_tag = 0;
+	if (cache_resource->sample_idx.cnt) {
+		flow_dv_counter_release(dev,
+			cache_resource->sample_idx.cnt);
+		cache_resource->sample_idx.cnt = 0;
+	}
+	if (!rte_atomic32_read(&cache_resource->refcnt)) {
+		ILIST_REMOVE(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			     &priv->sh->sample_action_list, idx,
+			     cache_resource, next);
+		mlx5_ipool_free(priv->sh->ipool[MLX5_IPOOL_SAMPLE], idx);
+		DRV_LOG(DEBUG, "sample resource %p: removed",
+			(void *)cache_resource);
+		return 0;
+	}
+	return 1;
+}
+
+/**
  * Remove the flow from the NIC but keeps it in memory.
  * Lock free, (mutex should be acquired by caller).
  *
@@ -9250,8 +9723,11 @@ struct field_modify_info modify_tcp[] = {
 		flow->dev_handles = dev_handle->next.next;
 		if (dev_handle->dvh.matcher)
 			flow_dv_matcher_release(dev, dev_handle);
+		if (dev_handle->dvh.rix_sample)
+			flow_dv_sample_resource_release(dev, dev_handle);
 		if (dev_handle->dvh.rix_encap_decap)
-			flow_dv_encap_decap_resource_release(dev, dev_handle);
+			flow_dv_encap_decap_resource_release(dev,
+				dev_handle->dvh.rix_encap_decap);
 		if (dev_handle->dvh.modify_hdr)
 			flow_dv_modify_hdr_resource_release(dev_handle);
 		if (dev_handle->dvh.rix_push_vlan)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v3 7/7] app/testpmd: add testpmd command for sample action
  2020-07-06 17:51   ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Jiawei Wang
                       ` (5 preceding siblings ...)
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 6/7] net/mlx5: update translate function for sample action Jiawei Wang
@ 2020-07-06 17:51     ` " Jiawei Wang
  2020-07-06 18:23     ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Stephen Hemminger
  2020-08-26 16:01     ` [dpdk-dev] [PATCH v4 " Jiawei Wang
  8 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-07-06 17:51 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add a new testpmd command 'set sample_actions' that configures multiple
sample action lists, each selected by an index:
set sample_actions <index> <actions list>

Examples of sample flow use cases and their results:

1. set sample_actions 0 mark id 0x8 / queue index 2 / end
.. pattern eth / end actions sample ratio 2 index 0 / jump group 2 ...

With this flow, all matched ingress packets jump to the next flow
table, and every second packet is also marked and sent to queue 2 of
the control application.

2. ...pattern eth / end actions sample ratio 2 / port_id id 2 ...

With this flow, all matched ingress packets are sent to port 2, and
every second packet is also sent to the e-switch manager vport.
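
The index-based lookup used by the two commands above can be modelled
roughly as follows. This is a hypothetical, self-contained sketch and not
the testpmd code: the model_* names and the simplified action struct are
assumptions made for illustration. Only the limits (8 lists, 10 actions per
list) mirror RAW_SAMPLE_CONFS_MAX_NUM and ACTION_SAMPLE_ACTIONS_NUM from
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_CONFS_MAX   8  /* mirrors RAW_SAMPLE_CONFS_MAX_NUM */
#define MODEL_ACTIONS_MAX 10 /* mirrors ACTION_SAMPLE_ACTIONS_NUM */

/* Hypothetical stand-in for the action types accepted in a sample list. */
enum model_action_type { MODEL_END = 0, MODEL_MARK, MODEL_QUEUE, MODEL_COUNT };

/* Hypothetical stand-in for struct rte_flow_action plus its argument. */
struct model_action {
	enum model_action_type type;
	unsigned int arg; /* mark id or queue index; unused for COUNT */
};

/* One slot per index, as in "set sample_actions <index> ...". */
static struct model_action model_confs[MODEL_CONFS_MAX][MODEL_ACTIONS_MAX];

/*
 * Store a list of n actions at slot idx, replacing the previous content.
 * Returns 0 on success, -1 if idx is out of range or the list leaves no
 * room for the END terminator.
 */
static int
model_set_sample_actions(unsigned int idx, const struct model_action *list,
			 size_t n)
{
	assert(list != NULL || n == 0);
	if (idx >= MODEL_CONFS_MAX || n >= MODEL_ACTIONS_MAX)
		return -1;
	/* Zeroing the slot also writes the MODEL_END terminator. */
	memset(model_confs[idx], 0, sizeof(model_confs[idx]));
	if (n)
		memcpy(model_confs[idx], list, n * sizeof(*list));
	return 0;
}

/* Look up the list that "sample ratio <r> index <idx>" would refer to. */
static const struct model_action *
model_get_sample_actions(unsigned int idx)
{
	return idx < MODEL_CONFS_MAX ? model_confs[idx] : NULL;
}
```

In this model, example 1 corresponds to storing { MARK 0x8, QUEUE 2 } at
index 0 and later fetching that list by index when the flow rule is parsed.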

Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 285 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 276 insertions(+), 9 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 4e2006c..6b1e515 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,8 @@ enum index {
 	SET_RAW_ENCAP,
 	SET_RAW_DECAP,
 	SET_RAW_INDEX,
+	SET_SAMPLE_ACTIONS,
+	SET_SAMPLE_INDEX,
 
 	/* Top-level command. */
 	FLOW,
@@ -349,6 +351,10 @@ enum index {
 	ACTION_SET_IPV6_DSCP_VALUE,
 	ACTION_AGE,
 	ACTION_AGE_TIMEOUT,
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_SAMPLE_INDEX_VALUE,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -484,6 +490,22 @@ struct action_nvgre_encap_data {
 
 struct mplsoudp_decap_conf mplsoudp_decap_conf;
 
+#define ACTION_SAMPLE_ACTIONS_NUM 10
+#define RAW_SAMPLE_CONFS_MAX_NUM 8
+/** Storage for struct rte_flow_action_sample including external data. */
+struct action_sample_data {
+	struct rte_flow_action_sample conf;
+	uint32_t idx;
+};
+/** Storage for sample action lists set by 'set sample_actions'. */
+struct raw_sample_conf {
+	struct rte_flow_action data[ACTION_SAMPLE_ACTIONS_NUM];
+};
+struct raw_sample_conf raw_sample_confs[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_mark sample_mark[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_queue sample_queue[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_count sample_count[RAW_SAMPLE_CONFS_MAX_NUM];
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -1161,6 +1183,7 @@ struct parse_action_priv {
 	ACTION_SET_IPV4_DSCP,
 	ACTION_SET_IPV6_DSCP,
 	ACTION_AGE,
+	ACTION_SAMPLE,
 	ZERO,
 };
 
@@ -1393,9 +1416,28 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_sample[] = {
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index next_action_sample[] = {
+	ACTION_QUEUE,
+	ACTION_MARK,
+	ACTION_COUNT,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
+static int parse_set_sample_action(struct context *, const struct token *,
+				   const char *, unsigned int,
+				   void *, unsigned int);
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
@@ -1460,7 +1502,15 @@ static int parse_vc_action_raw_decap_index(struct context *,
 static int parse_vc_action_set_meta(struct context *ctx,
 				    const struct token *token, const char *str,
 				    unsigned int len, void *buf,
+					unsigned int size);
+static int parse_vc_action_sample(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
 				    unsigned int size);
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -1531,6 +1581,8 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 				    unsigned int, char *, unsigned int);
 static int comp_set_raw_index(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
+static int comp_set_sample_index(struct context *, const struct token *,
+			      unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -3612,11 +3664,13 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	/* Top level command. */
 	[SET] = {
 		.name = "set",
-		.help = "set raw encap/decap data",
-		.type = "set raw_encap|raw_decap <index> <pattern>",
+		.help = "set raw encap/decap/sample data",
+		.type = "set raw_encap|raw_decap <index> <pattern>"
+				" or set sample_actions <index> <action>",
 		.next = NEXT(NEXT_ENTRY
 			     (SET_RAW_ENCAP,
-			      SET_RAW_DECAP)),
+			      SET_RAW_DECAP,
+			      SET_SAMPLE_ACTIONS)),
 		.call = parse_set_init,
 	},
 	/* Sub-level commands. */
@@ -3647,6 +3701,23 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(next_item),
 		.call = parse_port,
 	},
+	[SET_SAMPLE_INDEX] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "index of sample actions",
+		.next = NEXT(next_action_sample),
+		.call = parse_port,
+	},
+	[SET_SAMPLE_ACTIONS] = {
+		.name = "sample_actions",
+		.help = "set sample actions list",
+		.next = NEXT(NEXT_ENTRY(SET_SAMPLE_INDEX)),
+		.args = ARGS(ARGS_ENTRY_ARB_BOUNDED
+				(offsetof(struct buffer, port),
+				 sizeof(((struct buffer *)0)->port),
+				 0, RAW_SAMPLE_CONFS_MAX_NUM - 1)),
+		.call = parse_set_sample_action,
+	},
 	[ACTION_SET_TAG] = {
 		.name = "set_tag",
 		.help = "set tag",
@@ -3750,6 +3821,37 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(action_age, NEXT_ENTRY(UNSIGNED)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SAMPLE] = {
+		.name = "sample",
+		.help = "set a sample action",
+		.next = NEXT(action_sample),
+		.priv = PRIV_ACTION(SAMPLE,
+			sizeof(struct action_sample_data)),
+		.call = parse_vc_action_sample,
+	},
+	[ACTION_SAMPLE_RATIO] = {
+		.name = "ratio",
+		.help = "flow sample ratio value",
+		.next = NEXT(action_sample, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_ARB
+			     (offsetof(struct action_sample_data, conf) +
+			      offsetof(struct rte_flow_action_sample, ratio),
+			      sizeof(((struct rte_flow_action_sample *)0)->
+				     ratio))),
+	},
+	[ACTION_SAMPLE_INDEX] = {
+		.name = "index",
+		.help = "the index of sample actions list",
+		.next = NEXT(NEXT_ENTRY(ACTION_SAMPLE_INDEX_VALUE)),
+	},
+	[ACTION_SAMPLE_INDEX_VALUE] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "unsigned integer value",
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc_action_sample_index,
+		.comp = comp_set_sample_index,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -5207,6 +5309,76 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return len;
 }
 
+static int
+parse_vc_action_sample(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_action *action;
+	struct action_sample_data *action_sample_data = NULL;
+	static struct rte_flow_action end_action = {
+		RTE_FLOW_ACTION_TYPE_END, 0
+	};
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return ret;
+	if (!out->args.vc.actions_n)
+		return -1;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data;
+	ctx->objmask = NULL;
+	/* Default to an empty sample action list. */
+	action_sample_data = ctx->object;
+	action_sample_data->conf.actions = &end_action;
+	action->conf = &action_sample_data->conf;
+	return ret;
+}
+
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size)
+{
+	struct action_sample_data *action_sample_data;
+	struct rte_flow_action *action;
+	const struct arg *arg;
+	struct buffer *out = buf;
+	int ret;
+	uint16_t idx;
+
+	RTE_SET_USED(token);
+	RTE_SET_USED(buf);
+	RTE_SET_USED(size);
+	if (ctx->curr != ACTION_SAMPLE_INDEX_VALUE)
+		return -1;
+	arg = ARGS_ENTRY_ARB_BOUNDED
+		(offsetof(struct action_sample_data, idx),
+		 sizeof(((struct action_sample_data *)0)->idx),
+		 0, RAW_SAMPLE_CONFS_MAX_NUM - 1);
+	if (push_args(ctx, arg))
+		return -1;
+	ret = parse_int(ctx, token, str, len, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		return -1;
+	}
+	if (!ctx->object)
+		return len;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	action_sample_data = ctx->object;
+	idx = action_sample_data->idx;
+	action_sample_data->conf.actions = raw_sample_confs[idx].data;
+	action->conf = &action_sample_data->conf;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -5971,6 +6143,38 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	if (!out->command)
 		return -1;
 	out->command = ctx->curr;
+	/* For encap/decap, all we need is pattern */
+	out->args.vc.pattern = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
+	return len;
+}
+
+/** Parse set command, initialize output buffer for subsequent tokens. */
+static int
+parse_set_sample_action(struct context *ctx, const struct token *token,
+			  const char *str, unsigned int len,
+			  void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	/* Make sure buffer is large enough. */
+	if (size < sizeof(*out))
+		return -1;
+	ctx->objdata = 0;
+	ctx->objmask = NULL;
+	ctx->object = out;
+	if (!out->command)
+		return -1;
+	out->command = ctx->curr;
+	/* For sampler, all we need is actions */
+	out->args.vc.actions = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	return len;
 }
 
@@ -6007,11 +6211,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 			return -1;
 		out->command = ctx->curr;
 		out->args.vc.data = (uint8_t *)out + size;
-		/* All we need is pattern */
-		out->args.vc.pattern =
-			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
-					       sizeof(double));
-		ctx->object = out->args.vc.pattern;
+		ctx->object  = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	}
 	return len;
 }
@@ -6162,6 +6363,24 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return nb;
 }
 
+/** Complete index number for the sample action 'index' argument. */
+static int
+comp_set_sample_index(struct context *ctx, const struct token *token,
+		   unsigned int ent, char *buf, unsigned int size)
+{
+	uint16_t idx = 0;
+	uint16_t nb = 0;
+
+	RTE_SET_USED(ctx);
+	RTE_SET_USED(token);
+	for (idx = 0; idx < RAW_SAMPLE_CONFS_MAX_NUM; ++idx) {
+		if (buf && idx == ent)
+			return snprintf(buf, size, "%u", idx);
+		++nb;
+	}
+	return nb;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -6607,7 +6826,53 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return mask;
 }
 
-
+/** Store the parsed sample action list at the given index. */
+static void
+cmd_set_raw_parsed_sample(const struct buffer *in)
+{
+	uint32_t n = in->args.vc.actions_n;
+	uint32_t i = 0;
+	struct rte_flow_action *action = NULL;
+	struct rte_flow_action *data = NULL;
+	size_t size = 0;
+	uint16_t idx = in->port; /* We borrow port field as index */
+	uint32_t max_size = sizeof(struct rte_flow_action) *
+						ACTION_SAMPLE_ACTIONS_NUM;
+
+	RTE_ASSERT(in->command == SET_SAMPLE_ACTIONS);
+	data = (struct rte_flow_action *)&raw_sample_confs[idx].data;
+	memset(data, 0x00, max_size);
+	for (; i < n; i++) {
+		action = in->args.vc.actions + i;
+		if (action->type == RTE_FLOW_ACTION_TYPE_END)
+			break;
+		switch (action->type) {
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			size = sizeof(struct rte_flow_action_mark);
+			rte_memcpy(&sample_mark[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_mark[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			size = sizeof(struct rte_flow_action_count);
+			rte_memcpy(&sample_count[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_count[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			size = sizeof(struct rte_flow_action_queue);
+			rte_memcpy(&sample_queue[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_queue[idx];
+			break;
+		default:
+			printf("Error - Not supported action\n");
+			return;
+		}
+		rte_memcpy(data, action, sizeof(struct rte_flow_action));
+		data++;
+	}
+}
 
 /** Dispatch parsed buffer to function calls. */
 static void
@@ -6624,6 +6889,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	uint16_t proto = 0;
 	uint16_t idx = in->port; /* We borrow port field as index */
 
+	if (in->command == SET_SAMPLE_ACTIONS)
+		return cmd_set_raw_parsed_sample(in);
 	RTE_ASSERT(in->command == SET_RAW_ENCAP ||
 		   in->command == SET_RAW_DECAP);
 	if (in->command == SET_RAW_ENCAP) {
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling
  2020-07-06 17:51   ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Jiawei Wang
                       ` (6 preceding siblings ...)
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 7/7] app/testpmd: add testpmd command " Jiawei Wang
@ 2020-07-06 18:23     ` Stephen Hemminger
  2020-07-06 19:14       ` Ori Kam
  2020-08-26 16:01     ` [dpdk-dev] [PATCH v4 " Jiawei Wang
  8 siblings, 1 reply; 129+ messages in thread
From: Stephen Hemminger @ 2020-07-06 18:23 UTC (permalink / raw)
  To: Jiawei Wang
  Cc: orika, viacheslavo, matan, dev, thomas, rasland, ian.stokes, fbl

On Mon,  6 Jul 2020 20:51:01 +0300
Jiawei Wang <jiaweiw@mellanox.com> wrote:

> This patch set implement the flow sampling for mlx5 driver.
> 
> The solution is introduced a new rte_flow action that will sample the incoming traffic and send a duplicated traffic with the specified ratio to the application, while the original packet will continue to the target destination.
> 
> If the sample ratio value be set to 1, means that the packets would be completely mirrored. The sample packet can be assigned with different set of actions from the original packet.
> 
> MLX5 PMD driver will be responsible for validate and translate the sample action while creating a flow.
> 

You seem to have ignored my feedback that this could be more useful if it
didn't just support duplication. It should allow sampling and then make
the other rule chain (the one that gets hit after sampling) run.

By allowing a more general form of sampling it could be used for doing
network emulation (or packet manipulation) as well as simple netflow/ipfix style sampling.



^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling
  2020-07-06 18:23     ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Stephen Hemminger
@ 2020-07-06 19:14       ` Ori Kam
  0 siblings, 0 replies; 129+ messages in thread
From: Ori Kam @ 2020-07-06 19:14 UTC (permalink / raw)
  To: Stephen Hemminger, Jiawei(Jonny) Wang
  Cc: Slava Ovsiienko, Matan Azrad, dev, Thomas Monjalon,
	Raslan Darawsheh, ian.stokes, fbl

Hi Stephen,

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Subject: Re: [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling
> 
> On Mon,  6 Jul 2020 20:51:01 +0300
> Jiawei Wang <jiaweiw@mellanox.com> wrote:
> 
> > This patch set implement the flow sampling for mlx5 driver.
> >
> > The solution is introduced a new rte_flow action that will sample the
> incoming traffic and send a duplicated traffic with the specified ratio to the
> application, while the original packet will continue to the target destination.
> >
> > If the sample ratio value be set to 1, means that the packets would be
> completely mirrored. The sample packet can be assigned with different set of
> actions from the original packet.
> >
> > MLX5 PMD driver will be responsible for validate and translate the sample
> action while creating a flow.
> >
> 
> You seem to have ignored my feedback that this could be more useful if it
> didn't just support duplication. It should allow sampling and then make
> the other rule chain (the one that gets hit after sampling) run.
> 
> By allowing a more general form of sampling it could be used for doing
> network emulation (or packet manipulation) as well as simple netflow/ipfix
> style sampling.
> 
What is network emulation?
I think what you are suggesting is a different feature. There is a high penalty for
rule duplication, as I stated in my previous reply.
As far as I can see, there is no plan, at least in Mellanox, to support the kind of feature you
are suggesting. Since this API is experimental in any case, I suggest that when the need
arises, or when some other vendor wishes to implement such a feature, we
can either update the API or create a new one.
What do you think?

Best,
Ori 



^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/7] ethdev: introduce sample action for rte flow
  2020-07-06 17:51     ` [dpdk-dev] [PATCH v3 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
@ 2020-07-07 10:26       ` Andrew Rybchenko
  0 siblings, 0 replies; 129+ messages in thread
From: Andrew Rybchenko @ 2020-07-07 10:26 UTC (permalink / raw)
  To: Jiawei Wang, orika, viacheslavo, matan
  Cc: dev, thomas, rasland, ian.stokes, fbl

On 7/6/20 8:51 PM, Jiawei Wang wrote:
> When using full offload, all traffic will be handled by the HW, and
> directed to the requested VF or wire, the control application loses
> visibility on the traffic.
> So there's a need for an action that will enable the control application
> some visibility.
> 
> The solution is introduced a new action that will sample the incoming
> traffic and send a duplicated traffic with the specified ratio to the
> application, while the original packet will continue to the target
> destination.
> 
> The packets sampled equals is '1/ratio', if the ratio value be set to 1,
> means that the packets would be completely mirrored. The sample packet
> can be assigned with different set of actions from the original packet.
> 
> In order to support the sample packet in rte_flow, new rte_flow action
> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
> will be introduced.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
> Acked-by: Ori Kam <orika@mellanox.com>
> Acked-by: Jerin Jacob <jerinj@marvell.com>

Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v4 0/7] support the flow-based traffic sampling
  2020-07-06 17:51   ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Jiawei Wang
                       ` (7 preceding siblings ...)
  2020-07-06 18:23     ` [dpdk-dev] [PATCH v3 0/7] support the flow-based traffic sampling Stephen Hemminger
@ 2020-08-26 16:01     ` " Jiawei Wang
  2020-08-26 16:01       ` [dpdk-dev] [PATCH v4 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
                         ` (7 more replies)
  8 siblings, 8 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-26 16:01 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

This patch set implements flow sampling for the mlx5 driver.

The solution introduces a new rte_flow action that samples the incoming traffic and sends a duplicate of the traffic, at the specified ratio, to the application, while the original packet continues to the target destination.

If the sample ratio is set to 1, the packets are completely mirrored. The sampled packet can be assigned a different set of actions from the original packet.

The MLX5 PMD is responsible for validating and translating the sample action when creating a flow.

v4:
* Rebase.
* Fix the coding style issue.

v3:
* Remove 'const' of ratio field.
* Update description and commit messages.

v2:
* Rebase patches based on the latest code.
* Update rte_flow and release documents.
* Fix the compile error.
* Removed unnecessary change in [PATCH 7/8] net/mlx5: update the metadata register c0 support since FDB will use 5-tuple to do match.
* Update changes based on the comments.

Jiawei Wang (7):
  ethdev: introduce sample action for rte flow
  common/mlx5: glue for sample action
  common/mlx5: query sampler object capability via DevX
  net/mlx5: add the validate sample action
  net/mlx5: split sample flow into two sub flows
  net/mlx5: update translate function for sample action
  app/testpmd: add testpmd command for sample action

 app/test-pmd/cmdline_flow.c           | 285 +++++++++++++++-
 doc/guides/prog_guide/rte_flow.rst    |  25 ++
 drivers/common/mlx5/Makefile          |   5 +
 drivers/common/mlx5/linux/meson.build |   2 +
 drivers/common/mlx5/linux/mlx5_glue.c |  15 +
 drivers/common/mlx5/linux/mlx5_glue.h |  12 +
 drivers/common/mlx5/mlx5_devx_cmds.c  |  27 ++
 drivers/common/mlx5/mlx5_devx_cmds.h  |   1 +
 drivers/common/mlx5/mlx5_prm.h        |  51 +++
 drivers/net/mlx5/linux/mlx5_os.c      |  14 +
 drivers/net/mlx5/mlx5.c               |  11 +
 drivers/net/mlx5/mlx5.h               |   4 +
 drivers/net/mlx5/mlx5_flow.c          | 274 ++++++++++++++-
 drivers/net/mlx5/mlx5_flow.h          |  51 ++-
 drivers/net/mlx5/mlx5_flow_dv.c       | 627 +++++++++++++++++++++++++++++++++-
 lib/librte_ethdev/rte_flow.c          |   1 +
 lib/librte_ethdev/rte_flow.h          |  30 ++
 17 files changed, 1400 insertions(+), 35 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v4 1/7] ethdev: introduce sample action for rte flow
  2020-08-26 16:01     ` [dpdk-dev] [PATCH v4 " Jiawei Wang
@ 2020-08-26 16:01       ` Jiawei Wang
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 2/7] common/mlx5: glue for sample action Jiawei Wang
                         ` (6 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-26 16:01 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

When using full offload, all traffic is handled by the HW and directed
to the requested VF or wire, so the control application loses
visibility of the traffic.
So there's a need for an action that gives the control application
some visibility.

The solution introduces a new action that samples the incoming
traffic and sends a duplicate of the traffic, at the specified ratio,
to the application, while the original packet continues to the target
destination.

The fraction of packets sampled equals '1/ratio'. If the ratio is set
to 1, the packets are completely mirrored. The sampled packet can be
assigned a different set of actions from the original packet.

To support packet sampling in rte_flow, a new rte_flow action
definition RTE_FLOW_ACTION_TYPE_SAMPLE and a structure
rte_flow_action_sample are introduced.
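
The '1/ratio' rule described above can be illustrated with a small software
model. This is only a sketch under stated assumptions: the sampler_model
names are hypothetical and not part of DPDK, a non-zero ratio is assumed,
and a deterministic counter is used for clarity, whereas real hardware may
select the sampled packets differently (for example, probabilistically).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a 1/ratio sampler; not a DPDK type. */
struct sampler_model {
	uint32_t ratio;   /* duplicate one packet out of every 'ratio' */
	uint64_t seen;    /* matched packets so far */
	uint64_t sampled; /* packets duplicated to the sample path */
};

static void
sampler_model_init(struct sampler_model *s, uint32_t ratio)
{
	assert(ratio != 0); /* assumption of this model: ratio is non-zero */
	s->ratio = ratio;
	s->seen = 0;
	s->sampled = 0;
}

/*
 * Account for one matched packet. Returns true when the packet is also
 * duplicated to the sample path. Every packet, sampled or not, still
 * continues to its normal destination.
 */
static bool
sampler_model_hit(struct sampler_model *s)
{
	bool duplicate = (s->seen % s->ratio) == 0;

	s->seen++;
	if (duplicate)
		s->sampled++;
	return duplicate;
}
```

With ratio 2 the model duplicates every second packet, and with ratio 1 it
duplicates every packet, which is the full-mirroring case.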

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/prog_guide/rte_flow.rst | 25 +++++++++++++++++++++++++
 lib/librte_ethdev/rte_flow.c       |  1 +
 lib/librte_ethdev/rte_flow.h       | 30 ++++++++++++++++++++++++++++++
 3 files changed, 56 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e5cd1e..f8f3f51 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -2653,6 +2653,31 @@ timeout passed without any matching on the flow.
    | ``context``  | user input flow context         |
    +--------------+---------------------------------+
 
+Action: ``SAMPLE``
+^^^^^^^^^^^^^^^^^^
+
+Adds a sample action to a matched flow.
+
+The matching packets are duplicated at the specified ``ratio`` and the copies
+get their own set of actions, including a fate action; the fraction of packets
+sampled is '1/ratio'. All the packets continue to the target destination.
+
+When the ``ratio`` is set to 1, the packets are 100% mirrored.
+``actions`` is the separate set of actions applied to the sampled or mirrored
+packets, and must include a fate action.
+
+.. _table_rte_flow_action_sample:
+
+.. table:: SAMPLE
+
+   +--------------+---------------------------------+
+   | Field        | Value                           |
+   +==============+=================================+
+   | ``ratio``    | 32 bits sample ratio value      |
+   +--------------+---------------------------------+
+   | ``actions``  | sub-action list for sampling    |
+   +--------------+---------------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index f8fdd68..035671d 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -174,6 +174,7 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
+	MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
 };
 
 int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5..fa70d40 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -2132,6 +2132,14 @@ enum rte_flow_action_type {
 	 * see enum RTE_ETH_EVENT_FLOW_AGED
 	 */
 	RTE_FLOW_ACTION_TYPE_AGE,
+
+	/**
+	 * The matching packets are duplicated at the specified ratio and the
+	 * copies get their own set of actions, including a fate action.
+	 *
+	 * See struct rte_flow_action_sample.
+	 */
+	RTE_FLOW_ACTION_TYPE_SAMPLE,
 };
 
 /**
@@ -2742,6 +2750,28 @@ struct rte_flow_action {
 struct rte_flow;
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SAMPLE
+ *
+ * Adds a sample action to a matched flow.
+ *
+ * The matching packets are duplicated at the specified ratio and the copies
+ * get their own set of actions, including a fate action; a sampled packet can
+ * be redirected to a queue or port. All the packets continue processing on
+ * the default flow path.
+ *
+ * When the sample ratio is set to 1, the packets are 100% mirrored.
+ * An additional action list can be attached to sampled or mirrored packets.
+ */
+struct rte_flow_action_sample {
+	uint32_t ratio; /**< fraction of packets sampled is '1/ratio'. */
+	const struct rte_flow_action *actions;
+		/**< sub-action list specific for the sampling hit cases. */
+};
+
+/**
  * Verbose error types.
  *
  * Most of them provide the type of the object referenced by struct
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v4 2/7] common/mlx5: glue for sample action
  2020-08-26 16:01     ` [dpdk-dev] [PATCH v4 " Jiawei Wang
  2020-08-26 16:01       ` [dpdk-dev] [PATCH v4 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
@ 2020-08-26 16:02       ` Jiawei Wang
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 3/7] common/mlx5: query sampler object capability via DevX Jiawei Wang
                         ` (5 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-26 16:02 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

rdma-core introduces a new DR sample action.

Add the rdma-core command to the glue layer to create this action.

The sample action creates the sampler object that implements the
sampling/mirroring function.
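
The glue pattern used here can be sketched in isolation as follows.
This is not the mlx5 code: every demo_* name and the DEMO_HAVE_SAMPLER
macro are hypothetical stand-ins for the real symbols. The point is
only the shape: a function-pointer table whose entry falls back to
ENOTSUP when the underlying library lacks the call at build time.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mlx5dv_dr_flow_sampler_attr. */
struct demo_sampler_attr {
	unsigned int sample_ratio;
};

#ifdef DEMO_HAVE_SAMPLER /* hypothetical stand-in for the autoconf macro */
/* Would be provided by the backend library when it is available. */
void *demo_backend_create_sampler(struct demo_sampler_attr *attr);
#endif

static void *
demo_glue_create_sampler(struct demo_sampler_attr *attr)
{
#ifdef DEMO_HAVE_SAMPLER
	return demo_backend_create_sampler(attr);
#else
	/* Library too old: report "not supported" instead of failing to link. */
	(void)attr;
	errno = ENOTSUP;
	return NULL;
#endif
}

/* Callers only ever go through this table, never the backend directly. */
struct demo_glue {
	void *(*create_sampler)(struct demo_sampler_attr *attr);
};

static const struct demo_glue demo_glue_ops = {
	.create_sampler = demo_glue_create_sampler,
};
```

With the macro undefined, demo_glue_ops.create_sampler() returns NULL
and sets errno to ENOTSUP, so callers can detect the missing feature at
run time.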

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/common/mlx5/Makefile          |  5 +++++
 drivers/common/mlx5/linux/meson.build |  2 ++
 drivers/common/mlx5/linux/mlx5_glue.c | 15 +++++++++++++++
 drivers/common/mlx5/linux/mlx5_glue.h | 12 ++++++++++++
 4 files changed, 34 insertions(+)

diff --git a/drivers/common/mlx5/Makefile b/drivers/common/mlx5/Makefile
index 4edd541..24b932c 100644
--- a/drivers/common/mlx5/Makefile
+++ b/drivers/common/mlx5/Makefile
@@ -205,6 +205,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
 		func mlx5dv_dump_dr_domain \
 		$(AUTOCONF_OUTPUT)
 	$Q sh -- '$<' '$@' \
+		HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE \
+		infiniband/mlx5dv.h \
+		func mlx5dv_dr_action_create_flow_sampler \
+		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
 		HAVE_MLX5DV_MMAP_GET_NC_PAGES_CMD \
 		infiniband/mlx5dv.h \
 		enum MLX5_MMAP_GET_NC_PAGES_CMD \
diff --git a/drivers/common/mlx5/linux/meson.build b/drivers/common/mlx5/linux/meson.build
index 48e8ad6..1aa137d 100644
--- a/drivers/common/mlx5/linux/meson.build
+++ b/drivers/common/mlx5/linux/meson.build
@@ -172,6 +172,8 @@ has_sym_args = [
 	'RDMA_NLDEV_ATTR_NDEV_INDEX' ],
 	[ 'HAVE_MLX5_DR_FLOW_DUMP', 'infiniband/mlx5dv.h',
 	'mlx5dv_dump_dr_domain'],
+	[ 'HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE', 'infiniband/mlx5dv.h',
+	'mlx5dv_dr_action_create_flow_sampler'],
 	[ 'HAVE_MLX5DV_DR_MEM_RECLAIM', 'infiniband/mlx5dv.h',
 	'mlx5dv_dr_domain_set_reclaim_device_memory'],
 	[ 'HAVE_DEVLINK', 'linux/devlink.h', 'DEVLINK_GENL_NAME' ],
diff --git a/drivers/common/mlx5/linux/mlx5_glue.c b/drivers/common/mlx5/linux/mlx5_glue.c
index fcf03e8..771a47c 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.c
+++ b/drivers/common/mlx5/linux/mlx5_glue.c
@@ -1063,6 +1063,19 @@
 #endif
 }
 
+static void *
+mlx5_glue_dr_create_flow_action_sampler(
+			struct mlx5dv_dr_flow_sampler_attr *attr)
+{
+#ifdef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+	return mlx5dv_dr_action_create_flow_sampler(attr);
+#else
+	(void)attr;
+	errno = ENOTSUP;
+	return NULL;
+#endif
+}
+
 static int
 mlx5_glue_devx_query_eqn(struct ibv_context *ctx, uint32_t cpus,
 			 uint32_t *eqn)
@@ -1339,6 +1352,8 @@
 	.devx_port_query = mlx5_glue_devx_port_query,
 	.dr_dump_domain = mlx5_glue_dr_dump_domain,
 	.dr_reclaim_domain_memory = mlx5_glue_dr_reclaim_domain_memory,
+	.dr_create_flow_action_sampler =
+		mlx5_glue_dr_create_flow_action_sampler,
 	.devx_query_eqn = mlx5_glue_devx_query_eqn,
 	.devx_create_event_channel = mlx5_glue_devx_create_event_channel,
 	.devx_destroy_event_channel = mlx5_glue_devx_destroy_event_channel,
diff --git a/drivers/common/mlx5/linux/mlx5_glue.h b/drivers/common/mlx5/linux/mlx5_glue.h
index 734ace2..85b43b9 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.h
+++ b/drivers/common/mlx5/linux/mlx5_glue.h
@@ -77,6 +77,7 @@
 #ifndef HAVE_MLX5DV_DR
 enum  mlx5dv_dr_domain_type { unused, };
 struct mlx5dv_dr_domain;
+struct mlx5dv_dr_action;
 #endif
 
 #ifndef HAVE_MLX5DV_DR_DEVX_PORT
@@ -87,6 +88,15 @@
 struct mlx5dv_dr_flow_meter_attr;
 #endif
 
+#ifndef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+struct mlx5dv_dr_flow_sampler_attr {
+	uint32_t sample_ratio;
+	void *default_next_table;
+	size_t num_sample_actions;
+	struct mlx5dv_dr_action **sample_actions;
+};
+#endif
+
 #ifndef HAVE_IBV_DEVX_EVENT
 struct mlx5dv_devx_event_channel { int fd; };
 struct mlx5dv_devx_async_event_hdr;
@@ -309,6 +319,8 @@ struct mlx5_glue {
 					 const void *pp_context,
 					 uint32_t flags);
 	void (*dv_free_pp)(struct mlx5dv_pp *pp);
+	void *(*dr_create_flow_action_sampler)
+			(struct mlx5dv_dr_flow_sampler_attr *attr);
 };
 
 extern const struct mlx5_glue *mlx5_glue;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v4 3/7] common/mlx5: query sampler object capability via DevX
  2020-08-26 16:01     ` [dpdk-dev] [PATCH v4 " Jiawei Wang
  2020-08-26 16:01       ` [dpdk-dev] [PATCH v4 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 2/7] common/mlx5: glue for sample action Jiawei Wang
@ 2020-08-26 16:02       ` Jiawei Wang
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 4/7] net/mlx5: add the validate sample action Jiawei Wang
                         ` (4 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-26 16:02 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Update mlx5_devx_cmd_query_hca_attr() to also query the NIC flow table
attributes, and read log_max_ft_sampler_num from the flow table
properties.

Add the related struct definitions to mlx5_prm.h.
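
For readers unfamiliar with the bit-layout structs, the sketch below
shows how an 8-bit field can be pulled out of a capability buffer laid
out as big-endian dwords. It is a simplified, hypothetical helper
(demo_be32, demo_get_field and DEMO_SAMPLER_NUM_BIT_OFF are not driver
symbols) and assumes the usual convention that bit offsets count from
the most significant bit of each dword. The offset 0x200 + 0x48 is
derived from the struct definitions in this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian 32-bit word from a byte buffer. */
static uint32_t
demo_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Extract a 'width'-bit field (1..31 bits, not crossing a dword) that
 * starts 'bit_off' bits from the beginning of the buffer.
 */
static uint32_t
demo_get_field(const uint8_t *buf, size_t bit_off, unsigned int width)
{
	uint32_t dw = demo_be32(buf + (bit_off / 32) * 4);
	unsigned int shift = 32 - width - (unsigned int)(bit_off % 32);

	return (dw >> shift) & ((1u << width) - 1);
}

/* 0x200 reserved bits in flow_table_nic_cap, then 0x48 into the layout. */
#define DEMO_SAMPLER_NUM_BIT_OFF (0x200 + 0x48)
```

Assuming the "log_" prefix means a base-2 logarithm, a returned value
of 5 would correspond to at most 1 << 5 = 32 sampler objects.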

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c | 27 +++++++++++++++++++
 drivers/common/mlx5/mlx5_devx_cmds.h |  1 +
 drivers/common/mlx5/mlx5_prm.h       | 51 ++++++++++++++++++++++++++++++++++++
 3 files changed, 79 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 7c81ae1..fd4e3f2 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -751,6 +751,33 @@ struct mlx5_devx_obj *
 	if (!attr->eth_net_offloads)
 		return 0;
 
+	/* Query Flow Sampler Capability From Flow Table Properties Layout. */
+	memset(in, 0, sizeof(in));
+	memset(out, 0, sizeof(out));
+	MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
+	MLX5_SET(query_hca_cap_in, in, op_mod,
+		 MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE |
+		 MLX5_HCA_CAP_OPMOD_GET_CUR);
+
+	rc = mlx5_glue->devx_general_cmd(ctx,
+					 in, sizeof(in),
+					 out, sizeof(out));
+	if (rc)
+		goto error;
+	status = MLX5_GET(query_hca_cap_out, out, status);
+	syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
+	if (status) {
+		DRV_LOG(DEBUG, "Failed to query devx HCA capabilities, "
+			"status %x, syndrome = %x",
+			status, syndrome);
+		attr->log_max_ft_sampler_num = 0;
+		return -1;
+	}
+	hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
+	attr->log_max_ft_sampler_num =
+			MLX5_GET(flow_table_nic_cap,
+			hcattr, flow_table_properties.log_max_ft_sampler_num);
+
 	/* Query HCA offloads for Ethernet protocol. */
 	memset(in, 0, sizeof(in));
 	memset(out, 0, sizeof(out));
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 1c84cea..cfa7a7b 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -102,6 +102,7 @@ struct mlx5_hca_attr {
 	uint32_t scatter_fcs_w_decap_disable:1;
 	uint32_t regex:1;
 	uint32_t regexp_num_of_engines;
+	uint32_t log_max_ft_sampler_num:8;
 	struct mlx5_hca_qos_attr qos;
 	struct mlx5_hca_vdpa_attr vdpa;
 };
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index e0ebe12..4a624e1 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -1036,6 +1036,7 @@ enum {
 	MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE = 0x0 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_ETHERNET_OFFLOAD_CAPS = 0x1 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP = 0xc << 1,
+	MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE = 0x7 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_VDPA_EMULATION = 0x13 << 1,
 };
 
@@ -1470,12 +1471,62 @@ struct mlx5_ifc_virtio_emulation_cap_bits {
 	u8 reserved_at_1c0[0x620];
 };
 
+struct mlx5_ifc_flow_table_prop_layout_bits {
+	u8 ft_support[0x1];
+	u8 flow_tag[0x1];
+	u8 flow_counter[0x1];
+	u8 flow_modify_en[0x1];
+	u8 modify_root[0x1];
+	u8 identified_miss_table[0x1];
+	u8 flow_table_modify[0x1];
+	u8 reformat[0x1];
+	u8 decap[0x1];
+	u8 reset_root_to_default[0x1];
+	u8 pop_vlan[0x1];
+	u8 push_vlan[0x1];
+	u8 fpga_vendor_acceleration[0x1];
+	u8 pop_vlan_2[0x1];
+	u8 push_vlan_2[0x1];
+	u8 reformat_and_vlan_action[0x1];
+	u8 modify_and_vlan_action[0x1];
+	u8 sw_owner[0x1];
+	u8 reformat_l3_tunnel_to_l2[0x1];
+	u8 reformat_l2_to_l3_tunnel[0x1];
+	u8 reformat_and_modify_action[0x1];
+	u8 reserved_at_15[0x9];
+	u8 sw_owner_v2[0x1];
+	u8 reserved_at_1f[0x1];
+	u8 reserved_at_20[0x2];
+	u8 log_max_ft_size[0x6];
+	u8 log_max_modify_header_context[0x8];
+	u8 max_modify_header_actions[0x8];
+	u8 max_ft_level[0x8];
+	u8 reserved_at_40[0x8];
+	u8 log_max_ft_sampler_num[0x8];
+	u8 metadata_reg_b_width[0x8];
+	u8 metadata_reg_a_width[0x8];
+	u8 reserved_at_60[0x18];
+	u8 log_max_ft_num[0x8];
+	u8 reserved_at_80[0x10];
+	u8 log_max_flow_counter[0x8];
+	u8 log_max_destination[0x8];
+	u8 reserved_at_a0[0x18];
+	u8 log_max_flow[0x8];
+	u8 reserved_at_c0[0x140];
+};
+
+struct mlx5_ifc_flow_table_nic_cap_bits {
+	u8	   reserved_at_0[0x200];
+	struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties;
+};
+
 union mlx5_ifc_hca_cap_union_bits {
 	struct mlx5_ifc_cmd_hca_cap_bits cmd_hca_cap;
 	struct mlx5_ifc_per_protocol_networking_offload_caps_bits
 	       per_protocol_networking_offload_caps;
 	struct mlx5_ifc_qos_cap_bits qos_cap;
 	struct mlx5_ifc_virtio_emulation_cap_bits vdpa_caps;
+	struct mlx5_ifc_flow_table_nic_cap_bits flow_table_nic_cap;
 	u8 reserved_at_0[0x8000];
 };
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v4 4/7] net/mlx5: add the validate sample action
  2020-08-26 16:01     ` [dpdk-dev] [PATCH v4 " Jiawei Wang
                         ` (2 preceding siblings ...)
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 3/7] common/mlx5: query sampler object capability via DevX Jiawei Wang
@ 2020-08-26 16:02       ` Jiawei Wang
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 5/7] net/mlx5: split sample flow into two sub flows Jiawei Wang
                         ` (3 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-26 16:02 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add the sample action validation function.

A sample flow is supported in the NIC Rx and FDB domains. In NIC Rx,
the sample action list must include a destination TIR (QUEUE) action.

Only NIC Rx supports additional optional actions. FDB does not support
any optional action; the sampled packets always go to the E-Switch
manager port.
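
The domain rules can be summarised by the standalone sketch below. It
is a simplified model, not the driver function: struct demo_attr,
struct demo_sample and demo_validate_sample() are hypothetical, and
only the group, ratio and domain checks are modelled (capability and
action-order checks are left out).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of the flow attributes. */
struct demo_attr {
	uint32_t group;
	bool ingress;
	bool egress;
	bool transfer;
};

/* Hypothetical, reduced view of the sample action configuration. */
struct demo_sample {
	uint32_t ratio;
	unsigned int n_sub_actions; /* sub-actions before END */
	bool has_queue;             /* a QUEUE sub-action is present */
};

/* Return 0 if the sample action is acceptable, a negative errno if not. */
static int
demo_validate_sample(const struct demo_attr *attr,
		     const struct demo_sample *s)
{
	if (attr->group == 0)
		return -ENOTSUP; /* root table is not supported */
	if (s->ratio == 0)
		return -EINVAL; /* ratio starts from 1 */
	if (attr->ingress && !attr->transfer)
		return s->has_queue ? 0 : -EINVAL; /* NIC Rx needs QUEUE */
	if (attr->egress && !attr->transfer)
		return -ENOTSUP; /* NIC Tx is not supported */
	/* E-Switch (FDB): no optional sub-actions allowed. */
	return s->n_sub_actions ? -ENOTSUP : 0;
}
```

For example, an ingress rule in group 1 with a QUEUE sub-action passes,
while a transfer rule carrying any sub-action is rejected.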

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c |  14 +++++
 drivers/net/mlx5/mlx5.h          |   1 +
 drivers/net/mlx5/mlx5_flow.h     |   1 +
 drivers/net/mlx5/mlx5_flow_dv.c  | 133 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 149 insertions(+)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index db955ae..5b663a1 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1015,6 +1015,20 @@
 			}
 		}
 #endif
+#if defined(HAVE_MLX5DV_DR) && defined(HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE)
+		if (config->hca_attr.log_max_ft_sampler_num > 0  &&
+		    config->dv_flow_en) {
+			priv->sampler_en = 1;
+			DRV_LOG(DEBUG, "The Sampler enabled!");
+		} else {
+			priv->sampler_en = 0;
+			if (!config->hca_attr.log_max_ft_sampler_num)
+				DRV_LOG(WARNING, "No available register for"
+						" Sampler.");
+			else
+				DRV_LOG(DEBUG, "DV flow is not supported!");
+		}
+#endif
 	}
 	if (config->tx_pp) {
 		DRV_LOG(DEBUG, "Timestamp counter frequency %u kHz",
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 1880a82..ae0c7cc 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -698,6 +698,7 @@ struct mlx5_priv {
 	unsigned int counter_fallback:1; /* Use counter fallback management. */
 	unsigned int mtr_en:1; /* Whether support meter. */
 	unsigned int mtr_reg_share:1; /* Whether support meter REG_C share. */
+	unsigned int sampler_en:1; /* Whether support sampler. */
 	uint16_t domain_id; /* Switch domain identifier. */
 	uint16_t vport_id; /* Associated VF vport index (if any). */
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 92301e4..41404de 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -196,6 +196,7 @@ enum mlx5_feature_name {
 #define MLX5_FLOW_ACTION_SET_IPV6_DSCP (1ull << 33)
 #define MLX5_FLOW_ACTION_AGE (1ull << 34)
 #define MLX5_FLOW_ACTION_DEFAULT_MISS (1ull << 35)
+#define MLX5_FLOW_ACTION_SAMPLE (1ull << 36)
 
 #define MLX5_FLOW_FATE_ACTIONS \
 	(MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE | \
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index dd35959..a8db8ab 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -3992,6 +3992,130 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Validate the sample action.
+ *
+ * @param[in] action_flags
+ *   Holds the actions detected until now.
+ * @param[in] action
+ *   Pointer to the sample action.
+ * @param[in] dev
+ *   Pointer to the Ethernet device structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this action.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_action_sample(uint64_t action_flags,
+			      const struct rte_flow_action *action,
+			      struct rte_eth_dev *dev,
+			      const struct rte_flow_attr *attr,
+			      struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_config *dev_conf = &priv->config;
+	const struct rte_flow_action_sample *sample = action->conf;
+	const struct rte_flow_action *act;
+	uint64_t sub_action_flags = 0;
+	int actions_n = 0;
+	int ret;
+
+	if (!attr->group)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
+					  NULL, "root table is not supported");
+	if (!priv->config.devx || !priv->sampler_en)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, "sample action not supported");
+	if (!sample)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "configuration cannot be null");
+	if (sample->ratio == 0)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "ratio value starts from 1");
+	if (action_flags & MLX5_FLOW_ACTION_SAMPLE)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "Duplicate sample actions set");
+	if (action_flags & MLX5_FLOW_ACTION_METER)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "wrong action order, meter should "
+					  "be after sample action");
+	act = sample->actions; /* Safe: sample was checked for NULL above. */
+	for (; act->type != RTE_FLOW_ACTION_TYPE_END; act++) {
+		if (actions_n == MLX5_DV_MAX_NUMBER_OF_ACTIONS)
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  act, "too many actions");
+		switch (act->type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			ret = mlx5_flow_validate_action_queue(act,
+							      sub_action_flags,
+							      dev,
+							      attr, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			++actions_n;
+			break;
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			ret = flow_dv_validate_action_mark(dev, act,
+							   sub_action_flags,
+							   attr, error);
+			if (ret < 0)
+				return ret;
+			if (dev_conf->dv_xmeta_en != MLX5_XMETA_MODE_LEGACY)
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK |
+						MLX5_FLOW_ACTION_MARK_EXT;
+			else
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK;
+			++actions_n;
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			ret = flow_dv_validate_action_count(dev, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_COUNT;
+			++actions_n;
+			break;
+		default:
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "Doesn't support optional "
+						  "action");
+		}
+	}
+	if (attr->ingress && !attr->transfer) {
+		if (!(sub_action_flags & MLX5_FLOW_ACTION_QUEUE))
+			return rte_flow_error_set(error, EINVAL,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "Ingress must have a dest "
+						  "QUEUE for Sample");
+	} else if (attr->egress && !attr->transfer) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL,
+					  "Sample only supports ingress "
+					  "or E-Switch");
+	} else if (sample->actions->type != RTE_FLOW_ACTION_TYPE_END) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "E-Switch doesn't support any "
+					  "optional action for sampling");
+	}
+	return 0;
+}
+
+/**
  * Find existing modify-header resource or create and register a new one.
  *
  * @param dev[in, out]
@@ -5753,6 +5877,15 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			rw_act_num += MLX5_ACT_NUM_SET_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			ret = flow_dv_validate_action_sample(action_flags,
+							     actions, dev,
+							     attr, error);
+			if (ret < 0)
+				return ret;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			++actions_n;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ACTION,
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v4 5/7] net/mlx5: split sample flow into two sub flows
  2020-08-26 16:01     ` [dpdk-dev] [PATCH v4 " Jiawei Wang
                         ` (3 preceding siblings ...)
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 4/7] net/mlx5: add the validate sample action Jiawei Wang
@ 2020-08-26 16:02       ` Jiawei Wang
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 6/7] net/mlx5: update translate function for sample action Jiawei Wang
                         ` (2 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-26 16:02 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add the sampler action resource struct definitions.

A flow with a sample action is split into two sub flows: the prefix
flow carries the sample action, and the suffix flow carries the
remaining actions.

The prefix flow gets an extra tag action that writes a unique ID to a
metadata register, and the suffix flow gets an extra tag item that
matches that unique ID.
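
The split can be pictured with the small model below. It is a sketch
only: enum demo_act and demo_split() are hypothetical, action payloads
and the suffix tag item are left out, and the caller is assumed to
provide output arrays large enough for the input plus the extra tag
and END entries. Following the code in this patch, actions up to and
including SAMPLE go to the prefix list (except JUMP and METER, which
always go to the suffix), and a tag action is placed before SAMPLE for
non-transfer flows.

```c
#include <stdbool.h>

/* Hypothetical action identifiers for this model. */
enum demo_act {
	DEMO_END,
	DEMO_COUNT,
	DEMO_MARK,
	DEMO_SAMPLE,
	DEMO_JUMP,
	DEMO_METER,
	DEMO_QUEUE,
	DEMO_TAG,
};

/* Split an END-terminated action list into prefix and suffix lists. */
static void
demo_split(const enum demo_act *in, bool transfer,
	   enum demo_act *pre, enum demo_act *sfx)
{
	bool pre_sample = true;

	for (; *in != DEMO_END; in++) {
		bool to_sfx = !pre_sample ||
			      *in == DEMO_JUMP || *in == DEMO_METER;

		/* NIC Rx only: tag the packet right before sampling it. */
		if (*in == DEMO_SAMPLE && !transfer)
			*pre++ = DEMO_TAG;
		if (to_sfx)
			*sfx++ = *in;
		else
			*pre++ = *in;
		if (*in == DEMO_SAMPLE)
			pre_sample = false;
	}
	*pre = DEMO_END;
	*sfx = DEMO_END;
}
```

For the list COUNT, SAMPLE, QUEUE on a NIC Rx flow, this model yields
the prefix COUNT, TAG, SAMPLE and the suffix QUEUE.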

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5.c      |  11 ++
 drivers/net/mlx5/mlx5.h      |   3 +
 drivers/net/mlx5/mlx5_flow.c | 258 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_flow.h |  36 ++++++
 4 files changed, 304 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1e4c695..7b33a3e 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -244,6 +244,17 @@ static LIST_HEAD(, mlx5_dev_ctx_shared) mlx5_dev_ctx_list =
 		.free = mlx5_free,
 		.type = "mlx5_jump_ipool",
 	},
+	{
+		.size = sizeof(struct mlx5_flow_dv_sample_resource),
+		.trunk_size = 64,
+		.grow_trunk = 3,
+		.grow_shift = 2,
+		.need_lock = 0,
+		.release_mem_en = 1,
+		.malloc = mlx5_malloc,
+		.free = mlx5_free,
+		.type = "mlx5_sample_ipool",
+	},
 #endif
 	{
 		.size = sizeof(struct mlx5_flow_meter),
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index ae0c7cc..a76c161 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -41,6 +41,7 @@ enum mlx5_ipool_index {
 	MLX5_IPOOL_TAG, /* Pool for tag resource. */
 	MLX5_IPOOL_PORT_ID, /* Pool for port id resource. */
 	MLX5_IPOOL_JUMP, /* Pool for jump resource. */
+	MLX5_IPOOL_SAMPLE, /* Pool for sample resource. */
 #endif
 	MLX5_IPOOL_MTR, /* Pool for meter resource. */
 	MLX5_IPOOL_MCP, /* Pool for metadata resource. */
@@ -514,6 +515,7 @@ struct mlx5_flow_tbl_resource {
 /* Tables for metering splits should be added here. */
 #define MLX5_MAX_TABLES_EXTERNAL (MLX5_MAX_TABLES - 3)
 #define MLX5_MAX_TABLES_FDB UINT16_MAX
+#define MLX5_FLOW_TABLE_FACTOR 10
 
 /* ID generation structure. */
 struct mlx5_flow_id_pool {
@@ -642,6 +644,7 @@ struct mlx5_dev_ctx_shared {
 	struct mlx5_hlist *tag_table;
 	uint32_t port_id_action_list; /* List of port ID actions. */
 	uint32_t push_vlan_action_list; /* List of push VLAN actions. */
+	uint32_t sample_action_list; /* List of sample actions. */
 	struct mlx5_flow_counter_mng cmng; /* Counters management structure. */
 	struct mlx5_flow_default_miss_resource default_miss;
 	/* Default miss action resource structure. */
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 7150173..49f49e7 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -3908,6 +3908,139 @@ struct mlx5_flow_tunnel_info {
 	return 0;
 }
 
+
+/**
+ * Check whether the action list contains the given action type.
+ *
+ * @param[in] actions
+ *   Pointer to the list of actions.
+ * @param[in] action
+ *   The action type to look for.
+ *
+ * @return
+ *   > 0 the total number of actions (END included) if the action is found.
+ *   0 if the action is not found in the action list.
+ */
+static int
+flow_check_match_action(const struct rte_flow_action actions[],
+					enum rte_flow_action_type action)
+{
+	int actions_n = 0;
+	int flag = 0;
+
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		if (actions->type == action)
+			flag = 1;
+		actions_n++;
+	}
+	/* Count RTE_FLOW_ACTION_TYPE_END. */
+	return flag ? actions_n + 1 : 0;
+}
+
+/**
+ * Split the sample flow.
+ *
+ * The sample flow is split into two sub flows: the prefix flow keeps the
+ * sample action, and the remaining actions move to a new suffix flow.
+ *
+ * A tag action with a unique tag id is also added to the prefix flow,
+ * and the same tag id is used as a match item in the suffix flow.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[out] sfx_items
+ *   Suffix flow match items (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] actions_sfx
+ *   Suffix flow actions.
+ * @param[out] actions_pre
+ *   Prefix flow actions.
+ *
+ * @return
+ *   The unique tag id for NIC Rx flows (0 on failure); 0 for E-Switch flows.
+ */
+static int
+flow_sample_split_prep(struct rte_eth_dev *dev,
+		 const struct rte_flow_attr *attr,
+		 struct rte_flow_item sfx_items[],
+		 const struct rte_flow_action actions[],
+		 struct rte_flow_action actions_sfx[],
+		 struct rte_flow_action actions_pre[])
+{
+	struct mlx5_rte_flow_action_set_tag *set_tag;
+	struct mlx5_rte_flow_item_tag *tag_spec;
+	struct mlx5_rte_flow_item_tag *tag_mask;
+	struct rte_flow_item *tag_item;
+	struct rte_flow_action *tag_action = NULL;
+	bool pre_sample = true;
+	struct rte_flow_error error;
+	uint32_t tag_id = 0;
+
+	/* Prepare the actions for prefix and suffix flow. */
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		struct rte_flow_action **action_cur = NULL;
+
+		switch (actions->type) {
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			if (!attr->transfer) {
+				/* Add the extra tag action first for NIC-RX. */
+				tag_action = actions_pre;
+				tag_action->type = (enum rte_flow_action_type)
+						MLX5_RTE_FLOW_ACTION_TYPE_TAG;
+				actions_pre++;
+			}
+			break;
+		case RTE_FLOW_ACTION_TYPE_JUMP:
+		case RTE_FLOW_ACTION_TYPE_METER:
+			action_cur = &actions_sfx;
+			break;
+		default:
+			break;
+		}
+		if (pre_sample && !action_cur)
+			action_cur = &actions_pre;
+		else
+			action_cur = &actions_sfx;
+		memcpy(*action_cur, actions, sizeof(struct rte_flow_action));
+		(*action_cur)++;
+		if (actions->type == RTE_FLOW_ACTION_TYPE_SAMPLE)
+			pre_sample = false;
+	}
+	/* Add end action to the actions. */
+	actions_sfx->type = RTE_FLOW_ACTION_TYPE_END;
+	actions_pre->type = RTE_FLOW_ACTION_TYPE_END;
+	if (!attr->transfer) {
+		actions_pre++;
+		/* Set the tag. */
+		set_tag = (void *)actions_pre;
+		set_tag->id = mlx5_flow_get_reg_id(dev, MLX5_APP_TAG,
+						   0, &error);
+		tag_id = flow_qrss_get_id(dev);
+		set_tag->data = tag_id;
+		assert(tag_action);
+		tag_action->conf = set_tag;
+		/* Prepare the suffix subflow items. */
+		tag_item = sfx_items++;
+		sfx_items->type = RTE_FLOW_ITEM_TYPE_END;
+		sfx_items++;
+		tag_spec = (struct mlx5_rte_flow_item_tag *)sfx_items;
+		tag_spec->data = tag_id;
+		tag_spec->id = set_tag->id;
+		tag_mask = tag_spec + 1;
+		tag_mask->data = UINT32_MAX;
+		tag_mask->id = UINT16_MAX;
+		tag_item->type = (enum rte_flow_item_type)
+				MLX5_RTE_FLOW_ITEM_TYPE_TAG;
+		tag_item->spec = tag_spec;
+		tag_item->last = NULL;
+		tag_item->mask = tag_mask;
+	}
+	return tag_id;
+}
+
 /**
  * The splitting for metadata feature.
  *
@@ -4169,6 +4302,7 @@ struct mlx5_flow_tunnel_info {
 static int
 flow_create_split_meter(struct rte_eth_dev *dev,
 			   struct rte_flow *flow,
+			   uint64_t prefix_layers,
 			   const struct rte_flow_attr *attr,
 			   const struct rte_flow_item items[],
 			   const struct rte_flow_action actions[],
@@ -4216,8 +4350,9 @@ struct mlx5_flow_tunnel_info {
 			goto exit;
 		}
 		/* Add the prefix subflow. */
-		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
-					      items, pre_actions, external,
+		ret = flow_create_split_inner(dev, flow, &dev_flow,
+					      prefix_layers, attr, items,
+					      pre_actions, external,
 					      flow_idx, error);
 		if (ret) {
 			ret = -rte_errno;
@@ -4232,7 +4367,7 @@ struct mlx5_flow_tunnel_info {
 	/* Add the prefix subflow. */
 	ret = flow_create_split_metadata(dev, flow, dev_flow ?
 					 flow_get_prefix_layer_flags(dev_flow) :
-					 0, &sfx_attr,
+					 prefix_layers, &sfx_attr,
 					 sfx_items ? sfx_items : items,
 					 sfx_actions ? sfx_actions : actions,
 					 external, flow_idx, error);
@@ -4243,6 +4378,121 @@ struct mlx5_flow_tunnel_info {
 }
 
 /**
+ * The splitting for sample feature.
+ *
+ * The sample flow is split into two flows: a prefix flow and a
+ * suffix flow.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] flow
+ *   Parent flow structure pointer.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] items
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[in] external
+ *   This flow rule is created by request external to PMD.
+ * @param[in] flow_idx
+ *   This memory pool index to the flow.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ * @return
+ *   0 on success, negative value otherwise
+ */
+static int
+flow_create_split_sample(struct rte_eth_dev *dev,
+			   struct rte_flow *flow,
+			   const struct rte_flow_attr *attr,
+			   const struct rte_flow_item items[],
+			   const struct rte_flow_action actions[],
+			   bool external, uint32_t flow_idx,
+			   struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct rte_flow_action *sfx_actions = NULL;
+	struct rte_flow_action *pre_actions = NULL;
+	struct rte_flow_item *sfx_items = NULL;
+	struct mlx5_flow *dev_flow = NULL;
+	struct rte_flow_attr sfx_attr = *attr;
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	struct mlx5_flow_tbl_data_entry *sfx_tbl_data;
+	struct mlx5_flow_tbl_resource *sfx_tbl;
+	union mlx5_flow_tbl_key sfx_table_key;
+#endif
+	size_t act_size;
+	size_t item_size;
+	uint32_t tag_id = 0;
+	int actions_n = 0;
+	int ret = 0;
+
+	if (priv->sampler_en)
+		actions_n = flow_check_match_action(actions,
+					RTE_FLOW_ACTION_TYPE_SAMPLE);
+	if (actions_n) {
+		/* The prefix actions must include sample, tag, end. */
+		act_size = sizeof(struct rte_flow_action) * (actions_n * 2) +
+			   sizeof(struct mlx5_rte_flow_action_set_tag);
+		/* tag, end. */
+#define SAMPLE_SUFFIX_ITEM 2
+		item_size = sizeof(struct rte_flow_item) * SAMPLE_SUFFIX_ITEM +
+			    sizeof(struct mlx5_rte_flow_item_tag) * 2;
+		sfx_actions = rte_zmalloc(__func__, (act_size + item_size), 0);
+		if (!sfx_actions)
+			return rte_flow_error_set(error, ENOMEM,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL, "no memory to split "
+						  "sample flow");
+		if (!attr->transfer)
+			sfx_items = (struct rte_flow_item *)((char *)sfx_actions
+					+ act_size);
+		pre_actions = sfx_actions + actions_n;
+		tag_id = flow_sample_split_prep(dev, attr, sfx_items,
+						   actions, sfx_actions,
+						   pre_actions);
+		if (!attr->transfer && !tag_id) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		/* Add the prefix subflow. */
+		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
+					      items, pre_actions, external,
+					      flow_idx, error);
+		if (ret) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		dev_flow->handle->split_flow_id = tag_id;
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+		/* Set the sfx group attr. */
+		sample_res = (struct mlx5_flow_dv_sample_resource *)
+					dev_flow->dv.sample_res;
+		sfx_tbl = (struct mlx5_flow_tbl_resource *)
+					sample_res->normal_path_tbl;
+		sfx_tbl_data = container_of(sfx_tbl,
+					struct mlx5_flow_tbl_data_entry, tbl);
+		sfx_table_key.v64 = sfx_tbl_data->entry.key;
+		sfx_attr.group = sfx_attr.transfer ?
+					(sfx_table_key.table_id - 1) :
+					sfx_table_key.table_id;
+#endif
+	}
+	/* Add the suffix subflow. */
+	ret = flow_create_split_meter(dev, flow, dev_flow ?
+				 flow_get_prefix_layer_flags(dev_flow) : 0,
+				 &sfx_attr, sfx_items ? sfx_items : items,
+				 sfx_actions ? sfx_actions : actions,
+				 external, flow_idx, error);
+exit:
+	if (sfx_actions)
+		rte_free(sfx_actions);
+	return ret;
+}
+
+/**
  * Split the flow to subflow set. The splitters might be linked
  * in the chain, like this:
  * flow_create_split_outer() calls:
@@ -4290,7 +4540,7 @@ struct mlx5_flow_tunnel_info {
 {
 	int ret;
 
-	ret = flow_create_split_meter(dev, flow, attr, items,
+	ret = flow_create_split_sample(dev, flow, attr, items,
 					 actions, external, flow_idx, error);
 	MLX5_ASSERT(ret <= 0);
 	return ret;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 41404de..9d2493a 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -507,6 +507,38 @@ struct mlx5_flow_tbl_data_entry {
 	uint32_t idx; /**< index for the indexed mempool. */
 };
 
+/* Sub rdma-core actions list. */
+struct mlx5_flow_sub_actions_list {
+	uint32_t actions_num; /**< Number of sample actions. */
+	uint64_t action_flags;
+	void *dr_queue_action;
+	void *dr_tag_action;
+	void *dr_cnt_action;
+};
+
+/* Sample sub-actions resource list. */
+struct mlx5_flow_sub_actions_idx {
+	uint32_t rix_hrxq; /**< Hash Rx queue object index. */
+	uint32_t rix_tag; /**< Index to the tag action. */
+	uint32_t cnt;
+};
+
+/* Sample action resource structure. */
+struct mlx5_flow_dv_sample_resource {
+	ILIST_ENTRY(uint32_t)next; /**< Pointer to next element. */
+	rte_atomic32_t refcnt; /**< Reference counter. */
+	void *verbs_action; /**< Verbs sample action object. */
+	uint8_t ft_type; /**< Flow Table Type */
+	uint32_t ft_id; /**< Flow Table Level */
+	void *normal_path_tbl; /**< Flow Table pointer */
+	void *default_miss; /**< default_miss dr_action. */
+	uint32_t ratio;   /**< Sample Ratio */
+	struct mlx5_flow_sub_actions_idx sample_idx;
+	/**< Action index resources. */
+	struct mlx5_flow_sub_actions_list sample_act;
+	/**< Action resources. */
+};
+
 /* Verbs specification header. */
 struct ibv_spec_header {
 	enum ibv_flow_spec_type type;
@@ -539,6 +571,8 @@ struct mlx5_flow_handle_dv {
 	/**< Index to push VLAN action resource in cache. */
 	uint32_t rix_tag;
 	/**< Index to the tag action. */
+	uint32_t rix_sample;
+	/**< Index to sample action resource in cache. */
 } __rte_packed;
 
 /** Device flow handle structure: used both for creating & destroying. */
@@ -604,6 +638,8 @@ struct mlx5_flow_dv_workspace {
 	/**< Pointer to the jump action resource. */
 	struct mlx5_flow_dv_match_params value;
 	/**< Holds the value that the packet is compared to. */
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	/**< Pointer to the sample action resource. */
 };
 
 /*
-- 
1.8.3.1



* [dpdk-dev] [PATCH v4 6/7] net/mlx5: update translate function for sample action
  2020-08-26 16:01     ` [dpdk-dev] [PATCH v4 " Jiawei Wang
                         ` (4 preceding siblings ...)
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 5/7] net/mlx5: split sample flow into two sub flows Jiawei Wang
@ 2020-08-26 16:02       ` Jiawei Wang
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 7/7] app/testpmd: add testpmd command " Jiawei Wang
  2020-08-27 15:01       ` [dpdk-dev] [PATCH v5 0/7] support the flow-based traffic sampling Jiawei Wang
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-26 16:02 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Translate the attributes of the sample action, which include the sample
ratio and the sub-actions list, then create the sample DR action.
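
As a rough illustration of the translation step described above, the
sketch below walks an END-terminated sub-action list and records one
flag per supported action, rejecting anything else. The types, flag
names and the function sketch_collect_sample_flags() are hypothetical
stand-ins written for this note; they are not the rte_flow or mlx5
definitions, and the real translate function also creates the
per-action resources.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the rte_flow action types (assumption). */
enum sketch_action_type {
	SKETCH_ACTION_END = 0,
	SKETCH_ACTION_QUEUE,
	SKETCH_ACTION_MARK,
	SKETCH_ACTION_COUNT,
	SKETCH_ACTION_RSS, /* Example of a type the sample path rejects. */
};

struct sketch_action {
	enum sketch_action_type type;
};

/* Hypothetical flag bits, one per supported sub-action. */
#define SKETCH_FLAG_QUEUE (UINT64_C(1) << 0)
#define SKETCH_FLAG_MARK  (UINT64_C(1) << 1)
#define SKETCH_FLAG_COUNT (UINT64_C(1) << 2)

/*
 * Walk an END-terminated sub-action list and collect one flag per
 * supported action. Returns 0 on success and stores the flags, or
 * -EINVAL on the first unsupported action (flags left untouched).
 */
static int
sketch_collect_sample_flags(const struct sketch_action *acts, uint64_t *flags)
{
	uint64_t f = 0;

	assert(acts != NULL && flags != NULL);
	for (; acts->type != SKETCH_ACTION_END; acts++) {
		switch (acts->type) {
		case SKETCH_ACTION_QUEUE:
			f |= SKETCH_FLAG_QUEUE;
			break;
		case SKETCH_ACTION_MARK:
			f |= SKETCH_FLAG_MARK;
			break;
		case SKETCH_ACTION_COUNT:
			f |= SKETCH_FLAG_COUNT;
			break;
		default:
			return -EINVAL;
		}
	}
	*flags = f;
	return 0;
}
```

The single pass mirrors the shape of the loop in the patch: only
queue, mark and count are accepted as sample sub-actions, and any
other type fails the whole translation.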

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow.c    |  16 +-
 drivers/net/mlx5/mlx5_flow.h    |  14 +-
 drivers/net/mlx5/mlx5_flow_dv.c | 494 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 502 insertions(+), 22 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 49f49e7..d2b79f0 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -4606,10 +4606,14 @@ struct mlx5_flow_tunnel_info {
 	int hairpin_flow;
 	uint32_t hairpin_id = 0;
 	struct rte_flow_attr attr_tx = { .priority = 0 };
+	struct rte_flow_attr attr_factor = {0};
 	int ret;
 
-	hairpin_flow = flow_check_hairpin_split(dev, attr, actions);
-	ret = flow_drv_validate(dev, attr, items, p_actions_rx,
+	memcpy((void *)&attr_factor, (const void *)attr, sizeof(*attr));
+	if (external)
+		attr_factor.group *= MLX5_FLOW_TABLE_FACTOR;
+	hairpin_flow = flow_check_hairpin_split(dev, &attr_factor, actions);
+	ret = flow_drv_validate(dev, &attr_factor, items, p_actions_rx,
 				external, hairpin_flow, error);
 	if (ret < 0)
 		return 0;
@@ -4628,7 +4632,7 @@ struct mlx5_flow_tunnel_info {
 		rte_errno = ENOMEM;
 		goto error_before_flow;
 	}
-	flow->drv_type = flow_get_drv_type(dev, attr);
+	flow->drv_type = flow_get_drv_type(dev, &attr_factor);
 	if (hairpin_id != 0)
 		flow->hairpin_flow_id = hairpin_id;
 	MLX5_ASSERT(flow->drv_type > MLX5_FLOW_TYPE_MIN &&
@@ -4674,7 +4678,7 @@ struct mlx5_flow_tunnel_info {
 		 * depending on configuration. In the simplest
 		 * case it just creates unmodified original flow.
 		 */
-		ret = flow_create_split_outer(dev, flow, attr,
+		ret = flow_create_split_outer(dev, flow, &attr_factor,
 					      buf->entry[i].pattern,
 					      p_actions_rx, external, idx,
 					      error);
@@ -4711,8 +4715,8 @@ struct mlx5_flow_tunnel_info {
 	 * the egress Flows belong to the different device and
 	 * copy table should be updated in peer NIC Rx domain.
 	 */
-	if (attr->ingress &&
-	    (external || attr->group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
+	if (attr_factor.ingress &&
+	    (external || attr_factor.group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
 		ret = flow_mreg_update_copy_table(dev, flow, actions, error);
 		if (ret)
 			goto error;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 9d2493a..f3c0406 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -366,6 +366,13 @@ enum mlx5_flow_fate_type {
 	MLX5_FLOW_FATE_MAX,
 };
 
+/*
+ * Max number of actions per DV flow.
+ * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
+ * in rdma-core file providers/mlx5/verbs.c.
+ */
+#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
+
 /* Matcher PRM representation */
 struct mlx5_flow_dv_match_params {
 	size_t size;
@@ -613,13 +620,6 @@ struct mlx5_flow_handle {
 #define MLX5_FLOW_HANDLE_VERBS_SIZE (sizeof(struct mlx5_flow_handle))
 #endif
 
-/*
- * Max number of actions per DV flow.
- * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
- * in rdma-core file providers/mlx5/verbs.c.
- */
-#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
-
 /** Device flow structure only for DV flow creation. */
 struct mlx5_flow_dv_workspace {
 	uint32_t group; /**< The group index. */
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index a8db8ab..3d85140 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -76,6 +76,10 @@
 static int
 flow_dv_default_miss_resource_release(struct rte_eth_dev *dev);
 
+static int
+flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
+				      uint32_t encap_decap_idx);
+
 /**
  * Initialize flow attributes structure according to flow items' types.
  *
@@ -8233,6 +8237,373 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Create an Rx Hash queue.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] rss_desc
+ *   Pointer to the mlx5_flow_rss_desc.
+ * @param[out] hrxq_idx
+ *   Hash Rx queue index.
+ *
+ * @return
+ *   The Verbs/DevX object initialised, NULL otherwise and rte_errno is set.
+ */
+static struct mlx5_hrxq *
+flow_dv_handle_rx_queue(struct rte_eth_dev *dev,
+			  struct mlx5_flow *dev_flow,
+			  struct mlx5_flow_rss_desc *rss_desc,
+			  uint32_t *hrxq_idx)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_flow_handle *dh = dev_flow->handle;
+	struct mlx5_hrxq *hrxq;
+
+	MLX5_ASSERT(rss_desc->queue_num);
+	*hrxq_idx = mlx5_hrxq_get(dev, rss_desc->key,
+				 MLX5_RSS_HASH_KEY_LEN,
+				 dev_flow->hash_fields,
+				 rss_desc->queue,
+				 rss_desc->queue_num);
+	if (!*hrxq_idx) {
+		*hrxq_idx = mlx5_hrxq_new
+				(dev, rss_desc->key,
+				MLX5_RSS_HASH_KEY_LEN,
+				dev_flow->hash_fields,
+				rss_desc->queue,
+				rss_desc->queue_num,
+				!!(dh->layers &
+				MLX5_FLOW_LAYER_TUNNEL));
+		if (!*hrxq_idx)
+			return NULL;
+	}
+	hrxq = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_HRXQ],
+			      *hrxq_idx);
+	return hrxq;
+}
+
+/**
+ * Find existing sample resource or create and register a new one.
+ *
+ * @param[in, out] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[in] resource
+ *   Pointer to sample resource.
+ * @param[in, out] dev_flow
+ *   Pointer to the dev_flow.
+ * @param[in, out] sample_dv_actions
+ *   Pointer to sample actions list.
+ * @param[out] error
+ *   pointer to error structure.
+ *
+ * @return
+ *   0 on success, otherwise a negative errno value and rte_errno is set.
+ */
+static int
+flow_dv_sample_resource_register(struct rte_eth_dev *dev,
+			 const struct rte_flow_attr *attr,
+			 struct mlx5_flow_dv_sample_resource *resource,
+			 struct mlx5_flow *dev_flow,
+			 void **sample_dv_actions,
+			 struct rte_flow_error *error)
+{
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+	struct mlx5dv_dr_flow_sampler_attr sampler_attr;
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_ctx_shared *sh = priv->sh;
+	struct mlx5_flow_tbl_resource *tbl;
+	uint32_t idx = 0;
+	const uint32_t next_ft_step = 1;
+	uint32_t next_ft_id = resource->ft_id +	next_ft_step;
+
+	/* Lookup a matching resource from cache. */
+	ILIST_FOREACH(sh->ipool[MLX5_IPOOL_SAMPLE], sh->sample_action_list,
+		      idx, cache_resource, next) {
+		if (resource->ratio == cache_resource->ratio &&
+		    resource->ft_type == cache_resource->ft_type &&
+		    resource->ft_id == cache_resource->ft_id &&
+		    !memcmp((void *)&resource->sample_act,
+			    (void *)&cache_resource->sample_act,
+			    sizeof(struct mlx5_flow_sub_actions_list))) {
+			DRV_LOG(DEBUG, "sample resource %p: refcnt %d++",
+				(void *)cache_resource,
+				rte_atomic32_read(&cache_resource->refcnt));
+			rte_atomic32_inc(&cache_resource->refcnt);
+			dev_flow->handle->dvh.rix_sample = idx;
+			dev_flow->dv.sample_res = cache_resource;
+			return 0;
+		}
+	}
+	/* Register new sample resource. */
+	cache_resource = mlx5_ipool_zmalloc(sh->ipool[MLX5_IPOOL_SAMPLE],
+				       &dev_flow->handle->dvh.rix_sample);
+	if (!cache_resource)
+		return rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "cannot allocate resource memory");
+	*cache_resource = *resource;
+	/* Create normal path table level */
+	tbl = flow_dv_tbl_resource_get(dev, next_ft_id,
+					attr->egress, attr->transfer, error);
+	if (!tbl) {
+		rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "fail to create normal path table "
+					  "for sample");
+		goto error;
+	}
+	cache_resource->normal_path_tbl = tbl;
+	if (resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		cache_resource->default_miss =
+				mlx5_glue->dr_create_flow_action_default_miss();
+		if (!cache_resource->default_miss) {
+			rte_flow_error_set(error, ENOMEM,
+						RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+						NULL,
+						"cannot create default miss "
+						"action");
+			goto error;
+		}
+		sample_dv_actions[resource->sample_act.actions_num++] =
+						cache_resource->default_miss;
+	}
+	/* Create a DR sample action */
+	sampler_attr.sample_ratio = cache_resource->ratio;
+	sampler_attr.default_next_table = tbl->obj;
+	sampler_attr.num_sample_actions = resource->sample_act.actions_num;
+	sampler_attr.sample_actions = (struct mlx5dv_dr_action **)
+							&sample_dv_actions[0];
+	cache_resource->verbs_action =
+		mlx5_glue->dr_create_flow_action_sampler(&sampler_attr);
+	if (!cache_resource->verbs_action) {
+		rte_flow_error_set(error, ENOMEM,
+					RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					NULL, "cannot create sample action");
+		goto error;
+	}
+	rte_atomic32_init(&cache_resource->refcnt);
+	rte_atomic32_inc(&cache_resource->refcnt);
+	ILIST_INSERT(sh->ipool[MLX5_IPOOL_SAMPLE], &sh->sample_action_list,
+		     dev_flow->handle->dvh.rix_sample, cache_resource,
+		     next);
+	dev_flow->dv.sample_res = cache_resource;
+	DRV_LOG(DEBUG, "new sample resource %p: refcnt %d++",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	return 0;
+error:
+	if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		if (cache_resource->default_miss)
+			claim_zero(mlx5_glue->destroy_flow_action
+				(cache_resource->default_miss));
+	} else {
+		if (cache_resource->sample_idx.rix_hrxq &&
+		    !mlx5_hrxq_release(dev,
+				cache_resource->sample_idx.rix_hrxq))
+			cache_resource->sample_idx.rix_hrxq = 0;
+		if (cache_resource->sample_idx.rix_tag &&
+		    !flow_dv_tag_release(dev,
+				cache_resource->sample_idx.rix_tag))
+			cache_resource->sample_idx.rix_tag = 0;
+		if (cache_resource->sample_idx.cnt) {
+			flow_dv_counter_release(dev,
+				cache_resource->sample_idx.cnt);
+			cache_resource->sample_idx.cnt = 0;
+		}
+	}
+	if (cache_resource->normal_path_tbl)
+		flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	mlx5_ipool_free(sh->ipool[MLX5_IPOOL_SAMPLE],
+				dev_flow->handle->dvh.rix_sample);
+	dev_flow->handle->dvh.rix_sample = 0;
+	return -rte_errno;
+}
+
+/**
+ * Convert Sample action to DV specification.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] action
+ *   Pointer to action structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] sample_actions
+ *   Pointer to sample actions list.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_translate_action_sample(struct rte_eth_dev *dev,
+				const struct rte_flow_action *action,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				void **sample_actions,
+				struct mlx5_flow_dv_sample_resource *res,
+				struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	const struct rte_flow_action_sample *sample_action;
+	const struct rte_flow_action *sub_actions;
+	const struct rte_flow_action_queue *queue;
+	struct mlx5_flow_sub_actions_list *sample_act;
+	struct mlx5_flow_sub_actions_idx *sample_idx;
+	struct mlx5_flow_rss_desc *rss_desc = &((struct mlx5_flow_rss_desc *)
+					      priv->rss_desc)
+					      [!!priv->flow_nested_idx];
+	uint64_t action_flags = 0;
+
+	sample_act = &res->sample_act;
+	sample_idx = &res->sample_idx;
+	sample_action = (const struct rte_flow_action_sample *)action->conf;
+	res->ratio = sample_action->ratio;
+	sub_actions = sample_action->actions;
+	for (; sub_actions->type != RTE_FLOW_ACTION_TYPE_END; sub_actions++) {
+		int type = sub_actions->type;
+		uint32_t pre_rix = 0;
+		void *pre_r;
+		switch (type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+		{
+			struct mlx5_hrxq *hrxq;
+			uint32_t hrxq_idx;
+
+			queue = sub_actions->conf;
+			rss_desc->queue_num = 1;
+			rss_desc->queue[0] = queue->index;
+			hrxq = flow_dv_handle_rx_queue(dev, dev_flow,
+					rss_desc, &hrxq_idx);
+			if (!hrxq)
+				return rte_flow_error_set
+					(error, rte_errno,
+					 RTE_FLOW_ERROR_TYPE_ACTION,
+					 NULL,
+					 "cannot create fate queue");
+			sample_act->dr_queue_action = hrxq->action;
+			sample_idx->rix_hrxq = hrxq_idx;
+			sample_actions[sample_act->actions_num++] =
+						hrxq->action;
+			action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			if (action_flags & MLX5_FLOW_ACTION_MARK)
+				dev_flow->handle->rix_hrxq = hrxq_idx;
+			dev_flow->handle->fate_action =
+					MLX5_FLOW_FATE_QUEUE;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_MARK:
+		{
+			uint32_t tag_be = mlx5_flow_mark_set
+				(((const struct rte_flow_action_mark *)
+				(sub_actions->conf))->id);
+			dev_flow->handle->mark = 1;
+			pre_rix = dev_flow->handle->dvh.rix_tag;
+			/* Save the mark resource before sample */
+			pre_r = dev_flow->dv.tag_resource;
+			if (flow_dv_tag_resource_register(dev, tag_be,
+						  dev_flow, error))
+				return -rte_errno;
+			MLX5_ASSERT(dev_flow->dv.tag_resource);
+			sample_act->dr_tag_action =
+				dev_flow->dv.tag_resource->action;
+			sample_idx->rix_tag =
+				dev_flow->handle->dvh.rix_tag;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_tag_action;
+			/* Recover the mark resource after sample */
+			dev_flow->dv.tag_resource = pre_r;
+			dev_flow->handle->dvh.rix_tag = pre_rix;
+			action_flags |= MLX5_FLOW_ACTION_MARK;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+		{
+			uint32_t counter;
+
+			counter = flow_dv_translate_create_counter(dev,
+					dev_flow, sub_actions->conf, 0);
+			if (!counter)
+				return rte_flow_error_set
+						(error, rte_errno,
+						 RTE_FLOW_ERROR_TYPE_ACTION,
+						 NULL,
+						 "cannot create counter"
+						 " object.");
+			sample_idx->cnt = counter;
+			sample_act->dr_cnt_action =
+				  (flow_dv_counter_get_by_idx(dev,
+				  counter, NULL))->action;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_cnt_action;
+			action_flags |= MLX5_FLOW_ACTION_COUNT;
+			break;
+		}
+		default:
+			return rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				NULL,
+				"Not support for sampler action");
+		}
+	}
+	sample_act->action_flags = action_flags;
+	res->ft_id = dev_flow->dv.group;
+	if (attr->transfer)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_FDB;
+	else if (attr->ingress)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
+
+	return 0;
+}
+
+/**
+ * Create the sample action by registering the translated sample resource.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[in] sample_actions
+ *   Pointer to sample path actions list.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_create_action_sample(struct rte_eth_dev *dev,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				struct mlx5_flow_dv_sample_resource *res,
+				void **sample_actions,
+				struct rte_flow_error *error)
+{
+	if (flow_dv_sample_resource_register(dev, attr, res, dev_flow,
+						sample_actions, error))
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL, "can't create sample action");
+	return 0;
+}
+
+/**
  * Fill the flow with DV spec, lock free
  * (mutex should be acquired by caller).
  *
@@ -8296,9 +8667,13 @@ struct field_modify_info modify_tcp[] = {
 	void *match_value = dev_flow->dv.value.buf;
 	uint8_t next_protocol = 0xff;
 	struct rte_vlan_hdr vlan = { 0 };
+	struct mlx5_flow_dv_sample_resource sample_res;
+	void *sample_actions[MLX5_DV_MAX_NUMBER_OF_ACTIONS] = {0};
+	uint32_t sample_act_pos = UINT32_MAX;
 	uint32_t table;
 	int ret = 0;
 
+	memset(&sample_res, 0, sizeof(struct mlx5_flow_dv_sample_resource));
 	mhdr_res->ft_type = attr->egress ? MLX5DV_FLOW_TABLE_TYPE_NIC_TX :
 					   MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
 	ret = mlx5_flow_group_to_table(attr, dev_flow->external, attr->group,
@@ -8317,7 +8692,6 @@ struct field_modify_info modify_tcp[] = {
 		const struct rte_flow_action_rss *rss;
 		const struct rte_flow_action *action = actions;
 		const uint8_t *rss_key;
-		const struct rte_flow_action_jump *jump_data;
 		const struct rte_flow_action_meter *mtr;
 		struct mlx5_flow_tbl_resource *tbl;
 		uint32_t port_id = 0;
@@ -8325,6 +8699,7 @@ struct field_modify_info modify_tcp[] = {
 		int action_type = actions->type;
 		const struct rte_flow_action *found_action = NULL;
 		struct mlx5_flow_meter *fm = NULL;
+		uint32_t jump_group = 0;
 
 		if (!mlx5_flow_os_action_supported(action_type))
 			return rte_flow_error_set(error, ENOTSUP,
@@ -8563,9 +8938,13 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_DECAP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_JUMP:
-			jump_data = action->conf;
+			jump_group = ((const struct rte_flow_action_jump *)
+							action->conf)->group;
+			if (dev_flow->external && jump_group <
+					MLX5_MAX_TABLES_EXTERNAL)
+				jump_group *= MLX5_FLOW_TABLE_FACTOR;
 			ret = mlx5_flow_group_to_table(attr, dev_flow->external,
-						       jump_data->group,
+						       jump_group,
 						       !!priv->fdb_def_rule,
 						       &table, error);
 			if (ret)
@@ -8731,6 +9110,19 @@ struct field_modify_info modify_tcp[] = {
 				return -rte_errno;
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			sample_act_pos = actions_n;
+			ret = flow_dv_translate_action_sample(dev,
+							      actions,
+							      dev_flow, attr,
+							      sample_actions,
+							      &sample_res,
+							      error);
+			if (ret < 0)
+				return ret;
+			actions_n++;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			break;
 		case RTE_FLOW_ACTION_TYPE_END:
 			actions_end = true;
 			if (mhdr_res->actions_num) {
@@ -8757,6 +9149,21 @@ struct field_modify_info modify_tcp[] = {
 					  (flow_dv_counter_get_by_idx(dev,
 					  flow->counter, NULL))->action;
 			}
+			if (action_flags & MLX5_FLOW_ACTION_SAMPLE) {
+				ret = flow_dv_create_action_sample(dev,
+							  dev_flow, attr,
+							  &sample_res,
+							  sample_actions,
+							  error);
+				if (ret < 0)
+					return rte_flow_error_set
+						(error, rte_errno,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						NULL,
+						"cannot create sample action");
+				dev_flow->dv.actions[sample_act_pos] =
+					dev_flow->dv.sample_res->verbs_action;
+			}
 			break;
 		default:
 			break;
@@ -9068,7 +9475,8 @@ struct field_modify_info modify_tcp[] = {
 				dh->rix_hrxq = UINT32_MAX;
 				dv->actions[n++] = drop_hrxq->action;
 			}
-		} else if (dh->fate_action == MLX5_FLOW_FATE_QUEUE) {
+		} else if (dh->fate_action == MLX5_FLOW_FATE_QUEUE &&
+			   !dv_h->rix_sample) {
 			struct mlx5_hrxq *hrxq;
 			uint32_t hrxq_idx;
 			struct mlx5_flow_rss_desc *rss_desc =
@@ -9200,18 +9608,18 @@ struct field_modify_info modify_tcp[] = {
  *
  * @param dev
  *   Pointer to Ethernet device.
- * @param handle
- *   Pointer to mlx5_flow_handle.
+ * @param encap_decap_idx
+ *   Index of encap decap resource.
  *
  * @return
  *   1 while a reference on it exists, 0 when freed.
  */
 static int
 flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
-				     struct mlx5_flow_handle *handle)
+				     uint32_t encap_decap_idx)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	uint32_t idx = handle->dvh.rix_encap_decap;
+	uint32_t idx = encap_decap_idx;
 	struct mlx5_flow_dv_encap_decap_resource *cache_resource;
 
 	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_DECAP_ENCAP],
@@ -9462,6 +9870,71 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Release a sample resource.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param handle
+ *   Pointer to mlx5_flow_handle.
+ *
+ * @return
+ *   1 while a reference on it exists, 0 when freed.
+ */
+static int
+flow_dv_sample_resource_release(struct rte_eth_dev *dev,
+				     struct mlx5_flow_handle *handle)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	uint32_t idx = handle->dvh.rix_sample;
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+
+	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			 idx);
+	if (!cache_resource)
+		return 0;
+	MLX5_ASSERT(cache_resource->verbs_action);
+	DRV_LOG(DEBUG, "sample resource %p: refcnt %d--",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	if (rte_atomic32_dec_and_test(&cache_resource->refcnt)) {
+		if (cache_resource->verbs_action)
+			claim_zero(mlx5_glue->destroy_flow_action
+					(cache_resource->verbs_action));
+		if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+			if (cache_resource->default_miss)
+				claim_zero(mlx5_glue->destroy_flow_action
+				  (cache_resource->default_miss));
+		}
+		if (cache_resource->normal_path_tbl)
+			flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	}
+	if (cache_resource->sample_idx.rix_hrxq &&
+		!mlx5_hrxq_release(dev,
+			cache_resource->sample_idx.rix_hrxq))
+		cache_resource->sample_idx.rix_hrxq = 0;
+	if (cache_resource->sample_idx.rix_tag &&
+		!flow_dv_tag_release(dev,
+			cache_resource->sample_idx.rix_tag))
+		cache_resource->sample_idx.rix_tag = 0;
+	if (cache_resource->sample_idx.cnt) {
+		flow_dv_counter_release(dev,
+			cache_resource->sample_idx.cnt);
+		cache_resource->sample_idx.cnt = 0;
+	}
+	if (!rte_atomic32_read(&cache_resource->refcnt)) {
+		ILIST_REMOVE(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			     &priv->sh->sample_action_list, idx,
+			     cache_resource, next);
+		mlx5_ipool_free(priv->sh->ipool[MLX5_IPOOL_SAMPLE], idx);
+		DRV_LOG(DEBUG, "sample resource %p: removed",
+			(void *)cache_resource);
+		return 0;
+	}
+	return 1;
+}
+
+/**
  * Remove the flow from the NIC but keeps it in memory.
  * Lock free, (mutex should be acquired by caller).
  *
@@ -9540,8 +10013,11 @@ struct field_modify_info modify_tcp[] = {
 		flow->dev_handles = dev_handle->next.next;
 		if (dev_handle->dvh.matcher)
 			flow_dv_matcher_release(dev, dev_handle);
+		if (dev_handle->dvh.rix_sample)
+			flow_dv_sample_resource_release(dev, dev_handle);
 		if (dev_handle->dvh.rix_encap_decap)
-			flow_dv_encap_decap_resource_release(dev, dev_handle);
+			flow_dv_encap_decap_resource_release(dev,
+				dev_handle->dvh.rix_encap_decap);
 		if (dev_handle->dvh.modify_hdr)
 			flow_dv_modify_hdr_resource_release(dev, dev_handle);
 		if (dev_handle->dvh.rix_push_vlan)
-- 
1.8.3.1



* [dpdk-dev] [PATCH v4 7/7] app/testpmd: add testpmd command for sample action
  2020-08-26 16:01     ` [dpdk-dev] [PATCH v4 " Jiawei Wang
                         ` (5 preceding siblings ...)
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 6/7] net/mlx5: update translate function for sample action Jiawei Wang
@ 2020-08-26 16:02       ` " Jiawei Wang
  2020-08-27 15:01       ` [dpdk-dev] [PATCH v5 0/7] support the flow-based traffic sampling Jiawei Wang
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-26 16:02 UTC (permalink / raw)
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw

Add a new testpmd command 'set sample_actions' that configures multiple
sample action lists, each one selected by an index:
set sample_actions <index> <actions list>

Example sample flow use cases and their results:

1. set sample_actions 0 mark id 0x8 / queue index 2 / end
.. pattern eth / end actions sample ratio 2 index 0 / jump group 2 ...

With this flow, all matched ingress packets jump to the next flow
table, and every second packet is also marked and sent to queue 2 of
the control application.

2. ...pattern eth / end actions sample ratio 2 / port_id id 2 ...

With this flow, all matched ingress packets are sent to port 2, and
every second packet is also sent to the e-switch manager vport.
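
To make the meaning of the ratio concrete, here is a minimal sketch
that treats "ratio N" as "one packet out of every N is sampled", so
ratio 1 mirrors every packet and ratio 2 picks every second one. The
helper names are hypothetical, the deterministic every-Nth choice is
an assumption made for illustration (the hardware may select packets
differently while keeping the same average), and treating ratio 0 as
"never sample" is likewise an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether the packet with 1-based sequence number seq is
 * sampled under the given ratio. Illustration only: the choice is
 * deterministic here, and ratio 0 is treated as "never sample".
 */
static bool
sketch_should_sample(uint64_t seq, uint32_t ratio)
{
	assert(seq >= 1);
	if (ratio == 0)
		return false;
	return (seq % ratio) == 0;
}

/* Count how many of the first total packets would be sampled. */
static uint64_t
sketch_count_sampled(uint64_t total, uint32_t ratio)
{
	uint64_t n = 0;
	uint64_t seq;

	for (seq = 1; seq <= total; seq++)
		if (sketch_should_sample(seq, ratio))
			n++;
	return n;
}
```

Under these assumptions, ten matched packets with ratio 2 yield five
sampled copies, while all ten originals still follow the normal path
(the jump or port_id action in the examples above).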

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 app/test-pmd/cmdline_flow.c | 285 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 276 insertions(+), 9 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 6263d30..27fa294 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,8 @@ enum index {
 	SET_RAW_ENCAP,
 	SET_RAW_DECAP,
 	SET_RAW_INDEX,
+	SET_SAMPLE_ACTIONS,
+	SET_SAMPLE_INDEX,
 
 	/* Top-level command. */
 	FLOW,
@@ -358,6 +360,10 @@ enum index {
 	ACTION_SET_IPV6_DSCP_VALUE,
 	ACTION_AGE,
 	ACTION_AGE_TIMEOUT,
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_SAMPLE_INDEX_VALUE,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -493,6 +499,22 @@ struct action_nvgre_encap_data {
 
 struct mplsoudp_decap_conf mplsoudp_decap_conf;
 
+#define ACTION_SAMPLE_ACTIONS_NUM 10
+#define RAW_SAMPLE_CONFS_MAX_NUM 8
+/** Storage for struct rte_flow_action_sample including external data. */
+struct action_sample_data {
+	struct rte_flow_action_sample conf;
+	uint32_t idx;
+};
+/** Storage for struct rte_flow_action_sample. */
+struct raw_sample_conf {
+	struct rte_flow_action data[ACTION_SAMPLE_ACTIONS_NUM];
+};
+struct raw_sample_conf raw_sample_confs[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_mark sample_mark[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_queue sample_queue[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_count sample_count[RAW_SAMPLE_CONFS_MAX_NUM];
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -1189,6 +1211,7 @@ struct parse_action_priv {
 	ACTION_SET_IPV4_DSCP,
 	ACTION_SET_IPV6_DSCP,
 	ACTION_AGE,
+	ACTION_SAMPLE,
 	ZERO,
 };
 
@@ -1421,9 +1444,28 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_sample[] = {
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index next_action_sample[] = {
+	ACTION_QUEUE,
+	ACTION_MARK,
+	ACTION_COUNT,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
+static int parse_set_sample_action(struct context *, const struct token *,
+				   const char *, unsigned int,
+				   void *, unsigned int);
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
@@ -1491,7 +1533,15 @@ static int parse_vc_action_raw_decap_index(struct context *,
 static int parse_vc_action_set_meta(struct context *ctx,
 				    const struct token *token, const char *str,
 				    unsigned int len, void *buf,
+					unsigned int size);
+static int parse_vc_action_sample(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
 				    unsigned int size);
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -1562,6 +1612,8 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 				    unsigned int, char *, unsigned int);
 static int comp_set_raw_index(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
+static int comp_set_sample_index(struct context *, const struct token *,
+			      unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -3703,11 +3755,13 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	/* Top level command. */
 	[SET] = {
 		.name = "set",
-		.help = "set raw encap/decap data",
-		.type = "set raw_encap|raw_decap <index> <pattern>",
+		.help = "set raw encap/decap/sample data",
+		.type = "set raw_encap|raw_decap <index> <pattern>"
+				" or set sample_actions <index> <action>",
 		.next = NEXT(NEXT_ENTRY
 			     (SET_RAW_ENCAP,
-			      SET_RAW_DECAP)),
+			      SET_RAW_DECAP,
+			      SET_SAMPLE_ACTIONS)),
 		.call = parse_set_init,
 	},
 	/* Sub-level commands. */
@@ -3738,6 +3792,23 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(next_item),
 		.call = parse_port,
 	},
+	[SET_SAMPLE_INDEX] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "index of sample actions",
+		.next = NEXT(next_action_sample),
+		.call = parse_port,
+	},
+	[SET_SAMPLE_ACTIONS] = {
+		.name = "sample_actions",
+		.help = "set sample actions list",
+		.next = NEXT(NEXT_ENTRY(SET_SAMPLE_INDEX)),
+		.args = ARGS(ARGS_ENTRY_ARB_BOUNDED
+				(offsetof(struct buffer, port),
+				 sizeof(((struct buffer *)0)->port),
+				 0, RAW_SAMPLE_CONFS_MAX_NUM - 1)),
+		.call = parse_set_sample_action,
+	},
 	[ACTION_SET_TAG] = {
 		.name = "set_tag",
 		.help = "set tag",
@@ -3841,6 +3912,37 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(action_age, NEXT_ENTRY(UNSIGNED)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SAMPLE] = {
+		.name = "sample",
+		.help = "set a sample action",
+		.next = NEXT(action_sample),
+		.priv = PRIV_ACTION(SAMPLE,
+			sizeof(struct action_sample_data)),
+		.call = parse_vc_action_sample,
+	},
+	[ACTION_SAMPLE_RATIO] = {
+		.name = "ratio",
+		.help = "flow sample ratio value",
+		.next = NEXT(action_sample, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_ARB
+			     (offsetof(struct action_sample_data, conf) +
+			      offsetof(struct rte_flow_action_sample, ratio),
+			      sizeof(((struct rte_flow_action_sample *)0)->
+				     ratio))),
+	},
+	[ACTION_SAMPLE_INDEX] = {
+		.name = "index",
+		.help = "the index of sample actions list",
+		.next = NEXT(NEXT_ENTRY(ACTION_SAMPLE_INDEX_VALUE)),
+	},
+	[ACTION_SAMPLE_INDEX_VALUE] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "unsigned integer value",
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc_action_sample_index,
+		.comp = comp_set_sample_index,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -5351,6 +5453,76 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return len;
 }
 
+static int
+parse_vc_action_sample(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_action *action;
+	struct action_sample_data *action_sample_data = NULL;
+	static struct rte_flow_action end_action = {
+		RTE_FLOW_ACTION_TYPE_END, 0
+	};
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return ret;
+	if (!out->args.vc.actions_n)
+		return -1;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data;
+	ctx->objmask = NULL;
+	/* Default to an empty sub-action list until an index is given. */
+	action_sample_data = ctx->object;
+	action_sample_data->conf.actions = &end_action;
+	action->conf = &action_sample_data->conf;
+	return ret;
+}
+
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size)
+{
+	struct action_sample_data *action_sample_data;
+	struct rte_flow_action *action;
+	const struct arg *arg;
+	struct buffer *out = buf;
+	int ret;
+	uint16_t idx;
+
+	RTE_SET_USED(token);
+	RTE_SET_USED(buf);
+	RTE_SET_USED(size);
+	if (ctx->curr != ACTION_SAMPLE_INDEX_VALUE)
+		return -1;
+	arg = ARGS_ENTRY_ARB_BOUNDED
+		(offsetof(struct action_sample_data, idx),
+		 sizeof(((struct action_sample_data *)0)->idx),
+		 0, RAW_SAMPLE_CONFS_MAX_NUM - 1);
+	if (push_args(ctx, arg))
+		return -1;
+	ret = parse_int(ctx, token, str, len, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		return -1;
+	}
+	if (!ctx->object)
+		return len;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	action_sample_data = ctx->object;
+	idx = action_sample_data->idx;
+	action_sample_data->conf.actions = raw_sample_confs[idx].data;
+	action->conf = &action_sample_data->conf;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -6115,6 +6287,38 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	if (!out->command)
 		return -1;
 	out->command = ctx->curr;
+	/* For encap/decap, all we need is the pattern. */
+	out->args.vc.pattern = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
+	return len;
+}
+
+/** Parse set command, initialize output buffer for subsequent tokens. */
+static int
+parse_set_sample_action(struct context *ctx, const struct token *token,
+			  const char *str, unsigned int len,
+			  void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	/* Make sure buffer is large enough. */
+	if (size < sizeof(*out))
+		return -1;
+	ctx->objdata = 0;
+	ctx->objmask = NULL;
+	ctx->object = out;
+	if (!out->command)
+		return -1;
+	out->command = ctx->curr;
+	/* For the sampler, all we need is the actions. */
+	out->args.vc.actions = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	return len;
 }
 
@@ -6151,11 +6355,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 			return -1;
 		out->command = ctx->curr;
 		out->args.vc.data = (uint8_t *)out + size;
-		/* All we need is pattern */
-		out->args.vc.pattern =
-			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
-					       sizeof(double));
-		ctx->object = out->args.vc.pattern;
+		ctx->object = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	}
 	return len;
 }
@@ -6306,6 +6507,24 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return nb;
 }
 
+/** Complete index number for the sample action index argument. */
+static int
+comp_set_sample_index(struct context *ctx, const struct token *token,
+		   unsigned int ent, char *buf, unsigned int size)
+{
+	uint16_t idx = 0;
+	uint16_t nb = 0;
+
+	RTE_SET_USED(ctx);
+	RTE_SET_USED(token);
+	for (idx = 0; idx < RAW_SAMPLE_CONFS_MAX_NUM; ++idx) {
+		if (buf && idx == ent)
+			return snprintf(buf, size, "%u", idx);
+		++nb;
+	}
+	return nb;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -6751,7 +6970,53 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return mask;
 }
 
-
+/** Store the parsed sample actions list at the given index. */
+static void
+cmd_set_raw_parsed_sample(const struct buffer *in)
+{
+	uint32_t n = in->args.vc.actions_n;
+	uint32_t i = 0;
+	struct rte_flow_action *action = NULL;
+	struct rte_flow_action *data = NULL;
+	size_t size = 0;
+	uint16_t idx = in->port; /* We borrow port field as index */
+	uint32_t max_size = sizeof(struct rte_flow_action) *
+						ACTION_SAMPLE_ACTIONS_NUM;
+
+	RTE_ASSERT(in->command == SET_SAMPLE_ACTIONS);
+	data = (struct rte_flow_action *)&raw_sample_confs[idx].data;
+	memset(data, 0x00, max_size);
+	for (; i < n; i++) {
+		action = in->args.vc.actions + i;
+		if (action->type == RTE_FLOW_ACTION_TYPE_END)
+			break;
+		switch (action->type) {
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			size = sizeof(struct rte_flow_action_mark);
+			rte_memcpy(&sample_mark[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_mark[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			size = sizeof(struct rte_flow_action_count);
+			rte_memcpy(&sample_count[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_count[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			size = sizeof(struct rte_flow_action_queue);
+			rte_memcpy(&sample_queue[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_queue[idx];
+			break;
+		default:
+			printf("Error - Not supported action\n");
+			return;
+		}
+		rte_memcpy(data, action, sizeof(struct rte_flow_action));
+		data++;
+	}
+}
 
 /** Dispatch parsed buffer to function calls. */
 static void
@@ -6768,6 +7033,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	uint16_t proto = 0;
 	uint16_t idx = in->port; /* We borrow port field as index */
 
+	if (in->command == SET_SAMPLE_ACTIONS)
+		return cmd_set_raw_parsed_sample(in);
 	RTE_ASSERT(in->command == SET_RAW_ENCAP ||
 		   in->command == SET_RAW_DECAP);
 	if (in->command == SET_RAW_ENCAP) {
-- 
1.8.3.1



* [dpdk-dev] [PATCH v5 0/7] support the flow-based traffic sampling
  2020-08-26 16:01     ` [dpdk-dev] [PATCH v4 " Jiawei Wang
                         ` (6 preceding siblings ...)
  2020-08-26 16:02       ` [dpdk-dev] [PATCH v4 7/7] app/testpmd: add testpmd command " Jiawei Wang
@ 2020-08-27 15:01       ` Jiawei Wang
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
                           ` (7 more replies)
  7 siblings, 8 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-27 15:01 UTC (permalink / raw)
  To: orika, viacheslavo, matan
  Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw, asafp

This patch set implements flow sampling for the mlx5 driver.

The solution introduces a new rte_flow action that samples the incoming traffic and sends a duplicate of the sampled packets, at the specified ratio, to the application, while the original packet continues to the target destination.

If the sample ratio is set to 1, the packets are completely mirrored. The sampled packet can be assigned a different set of actions from the original packet.

The MLX5 PMD is responsible for validating and translating the sample action while creating a flow.
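
To make the '1/ratio' semantics concrete, here is a minimal sketch of a counter-based sampling model. It is an illustration only: the demo_* names are hypothetical and not part of this series, and the hardware is not claimed to select packets this way.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model, not part of the patch set: decide whether the
 * packet with 0-based sequence number pkt_seq is sampled.
 */
static int
demo_should_sample(uint64_t pkt_seq, uint32_t ratio)
{
	assert(ratio != 0); /* the model assumes a ratio of at least 1 */
	return (pkt_seq % ratio) == 0;
}

/* Count how many of the first n_pkts packets the model samples. */
static uint64_t
demo_count_sampled(uint64_t n_pkts, uint32_t ratio)
{
	uint64_t seq;
	uint64_t hits = 0;

	for (seq = 0; seq < n_pkts; seq++)
		hits += demo_should_sample(seq, ratio);
	return hits;
}
```

With ratio 1 the model samples every packet, which matches the full mirroring case; with ratio 4 it samples one packet in four.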

v5:
* Add the release note.
* Remove Make changes since it's deprecated.

v4:
* Rebase.
* Fix the coding style issue.

v3:
* Remove 'const' of ratio field.
* Update description and commit messages.

v2:
* Rebase patches based on the latest code.
* Update rte_flow and release documents.
* Fix the compile error.
* Removed unnecessary change in [PATCH 7/8] net/mlx5: update the metadata register c0 support since FDB will use 5-tuple to do match.
* Update changes based on the comments.

Jiawei Wang (7):
  ethdev: introduce sample action for rte flow
  common/mlx5: glue for sample action
  common/mlx5: query sampler object capability via DevX
  net/mlx5: add the validate sample action
  net/mlx5: split sample flow into two sub flows
  net/mlx5: update translate function for sample action
  app/testpmd: add testpmd command for sample action

 app/test-pmd/cmdline_flow.c            | 285 ++++++++++++++-
 doc/guides/prog_guide/rte_flow.rst     |  25 ++
 doc/guides/rel_notes/release_20_11.rst |   6 +
 drivers/common/mlx5/linux/meson.build  |   2 +
 drivers/common/mlx5/linux/mlx5_glue.c  |  15 +
 drivers/common/mlx5/linux/mlx5_glue.h  |  12 +
 drivers/common/mlx5/mlx5_devx_cmds.c   |  27 ++
 drivers/common/mlx5/mlx5_devx_cmds.h   |   1 +
 drivers/common/mlx5/mlx5_prm.h         |  51 +++
 drivers/net/mlx5/linux/mlx5_os.c       |  14 +
 drivers/net/mlx5/mlx5.c                |  11 +
 drivers/net/mlx5/mlx5.h                |   4 +
 drivers/net/mlx5/mlx5_flow.c           | 274 +++++++++++++-
 drivers/net/mlx5/mlx5_flow.h           |  51 ++-
 drivers/net/mlx5/mlx5_flow_dv.c        | 627 ++++++++++++++++++++++++++++++++-
 lib/librte_ethdev/rte_flow.c           |   1 +
 lib/librte_ethdev/rte_flow.h           |  30 ++
 17 files changed, 1401 insertions(+), 35 deletions(-)

-- 
1.8.3.1



* [dpdk-dev] [PATCH v5 1/7] ethdev: introduce sample action for rte flow
  2020-08-27 15:01       ` [dpdk-dev] [PATCH v5 0/7] support the flow-based traffic sampling Jiawei Wang
@ 2020-08-27 15:01         ` Jiawei Wang
  2020-09-04  4:17           ` Ajit Khaparde
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 2/7] common/mlx5: glue for sample action Jiawei Wang
                           ` (6 subsequent siblings)
  7 siblings, 1 reply; 129+ messages in thread
From: Jiawei Wang @ 2020-08-27 15:01 UTC (permalink / raw)
  To: orika, viacheslavo, matan
  Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw, asafp

When using full offload, all traffic is handled by the HW and directed
to the requested VF or wire, so the control application loses
visibility of the traffic.
There is therefore a need for an action that gives the control
application some visibility.

The solution introduces a new action that samples the incoming traffic
and sends a duplicate of the sampled packets, at the specified ratio,
to the application, while the original packet continues to the target
destination.

The fraction of packets sampled is '1/ratio'; if the ratio is set to 1,
the packets are completely mirrored. The sampled packet can be assigned
a different set of actions from the original packet.

To support packet sampling in rte_flow, a new rte_flow action
definition RTE_FLOW_ACTION_TYPE_SAMPLE and a new structure
rte_flow_action_sample are introduced.
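
As a rough illustration of how the new structure is meant to be filled in, the sketch below uses stand-in types so that it compiles without DPDK headers. The demo_* names are hypothetical; only the two fields, ratio and actions, mirror struct rte_flow_action_sample from this patch, and the sub-action list is assumed to be END-terminated like any other rte_flow action list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types (assumption): the real definitions live in rte_flow.h. */
enum demo_action_type {
	DEMO_ACTION_TYPE_END,
	DEMO_ACTION_TYPE_MARK,
	DEMO_ACTION_TYPE_QUEUE,
};

struct demo_flow_action {
	enum demo_action_type type;
	const void *conf;
};

/* Mirrors the two fields of struct rte_flow_action_sample. */
struct demo_action_sample {
	uint32_t ratio; /* fraction of packets sampled is 1/ratio */
	const struct demo_flow_action *actions; /* END-terminated list */
};

/* Count the sub-actions that precede the END terminator. */
static size_t
demo_sample_sub_actions_n(const struct demo_action_sample *sample)
{
	size_t n = 0;

	assert(sample != NULL && sample->actions != NULL);
	while (sample->actions[n].type != DEMO_ACTION_TYPE_END)
		n++;
	return n;
}

/* Sub-actions for the sampled copy: mark it, then steer it to a queue. */
static const struct demo_flow_action demo_sub_actions[] = {
	{ DEMO_ACTION_TYPE_MARK, NULL },
	{ DEMO_ACTION_TYPE_QUEUE, NULL },
	{ DEMO_ACTION_TYPE_END, NULL },
};

/* ratio = 2: about every second matching packet is duplicated. */
static const struct demo_action_sample demo_sample = {
	.ratio = 2,
	.actions = demo_sub_actions,
};
```

The queue entry plays the role of the fate action for the sampled copy; the original packet is not affected by this list.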

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
 doc/guides/rel_notes/release_20_11.rst |  6 ++++++
 lib/librte_ethdev/rte_flow.c           |  1 +
 lib/librte_ethdev/rte_flow.h           | 30 ++++++++++++++++++++++++++++++
 4 files changed, 62 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e5cd1e..f8f3f51 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -2653,6 +2653,31 @@ timeout passed without any matching on the flow.
    | ``context``  | user input flow context         |
    +--------------+---------------------------------+
 
+Action: ``SAMPLE``
+^^^^^^^^^^^^^^^^^^
+
+Adds a sample action to a matched flow.
+
+The matching packets will be duplicated at the specified ``ratio`` and have
+their own set of actions, including a fate action, applied. The fraction of
+packets sampled is '1/ratio'. All packets continue to the target destination.
+
+When the ``ratio`` is set to 1, the packets will be 100% mirrored.
+``actions`` represents the separate set of actions for the sampled or mirrored
+packets, and must include a fate action.
+
+.. _table_rte_flow_action_sample:
+
+.. table:: SAMPLE
+
+   +--------------+---------------------------------+
+   | Field        | Value                           |
+   +==============+=================================+
+   | ``ratio``    | 32-bit sample ratio value       |
+   +--------------+---------------------------------+
+   | ``actions``  | sub-action list for sampling    |
+   +--------------+---------------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index df227a1..7f99563 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added flow-based traffic sampling support.**
+
+  Added new action ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate the matching
+  packets at the specified ratio and apply their own set of actions, including
+  a fate action. When the ratio is set to 1, the packets will be 100% mirrored.
+
 
 Removed Items
 -------------
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index f8fdd68..035671d 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -174,6 +174,7 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct rte_flow_action_set_dscp)),
 	MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
+	MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
 };
 
 int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5..fa70d40 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -2132,6 +2132,14 @@ enum rte_flow_action_type {
 	 * see enum RTE_ETH_EVENT_FLOW_AGED
 	 */
 	RTE_FLOW_ACTION_TYPE_AGE,
+
+	/**
+	 * The matching packets will be duplicated at the specified ratio and
+	 * have their own set of actions, including a fate action, applied.
+	 *
+	 * See struct rte_flow_action_sample.
+	 */
+	RTE_FLOW_ACTION_TYPE_SAMPLE,
 };
 
 /**
@@ -2742,6 +2750,28 @@ struct rte_flow_action {
 struct rte_flow;
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SAMPLE
+ *
+ * Adds a sample action to a matched flow.
+ *
+ * The matching packets will be duplicated at the specified ratio and have
+ * their own set of actions, including a fate action, applied; the sampled
+ * packet can be redirected to a queue or port. All the packets continue
+ * processing on the default flow path.
+ *
+ * When the sample ratio is set to 1, the packets will be 100% mirrored.
+ * An additional action list can be supplied for sampled or mirrored packets.
+ */
+struct rte_flow_action_sample {
+	uint32_t ratio; /**< Fraction of packets sampled is '1/ratio'. */
+	const struct rte_flow_action *actions;
+		/**< sub-action list specific for the sampling hit cases. */
+};
+
+/**
  * Verbose error types.
  *
  * Most of them provide the type of the object referenced by struct
-- 
1.8.3.1



* [dpdk-dev] [PATCH v5 2/7] common/mlx5: glue for sample action
  2020-08-27 15:01       ` [dpdk-dev] [PATCH v5 0/7] support the flow-based traffic sampling Jiawei Wang
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
@ 2020-08-27 15:01         ` Jiawei Wang
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 3/7] common/mlx5: query sampler object capability via DevX Jiawei Wang
                           ` (5 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-27 15:01 UTC (permalink / raw)
  To: orika, viacheslavo, matan
  Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw, asafp

rdma-core introduces a new DR sample action.

Add the rdma-core commands in glue to create this action.

The sample action is used to create the sample object that implements
the sampling/mirroring function.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/common/mlx5/linux/meson.build |  2 ++
 drivers/common/mlx5/linux/mlx5_glue.c | 15 +++++++++++++++
 drivers/common/mlx5/linux/mlx5_glue.h | 12 ++++++++++++
 3 files changed, 29 insertions(+)

diff --git a/drivers/common/mlx5/linux/meson.build b/drivers/common/mlx5/linux/meson.build
index 48e8ad6..1aa137d 100644
--- a/drivers/common/mlx5/linux/meson.build
+++ b/drivers/common/mlx5/linux/meson.build
@@ -172,6 +172,8 @@ has_sym_args = [
 	'RDMA_NLDEV_ATTR_NDEV_INDEX' ],
 	[ 'HAVE_MLX5_DR_FLOW_DUMP', 'infiniband/mlx5dv.h',
 	'mlx5dv_dump_dr_domain'],
+	[ 'HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE', 'infiniband/mlx5dv.h',
+	'mlx5dv_dr_action_create_flow_sampler'],
 	[ 'HAVE_MLX5DV_DR_MEM_RECLAIM', 'infiniband/mlx5dv.h',
 	'mlx5dv_dr_domain_set_reclaim_device_memory'],
 	[ 'HAVE_DEVLINK', 'linux/devlink.h', 'DEVLINK_GENL_NAME' ],
diff --git a/drivers/common/mlx5/linux/mlx5_glue.c b/drivers/common/mlx5/linux/mlx5_glue.c
index fcf03e8..771a47c 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.c
+++ b/drivers/common/mlx5/linux/mlx5_glue.c
@@ -1063,6 +1063,19 @@
 #endif
 }
 
+static void *
+mlx5_glue_dr_create_flow_action_sampler(
+			struct mlx5dv_dr_flow_sampler_attr *attr)
+{
+#ifdef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+	return mlx5dv_dr_action_create_flow_sampler(attr);
+#else
+	(void)attr;
+	errno = ENOTSUP;
+	return NULL;
+#endif
+}
+
 static int
 mlx5_glue_devx_query_eqn(struct ibv_context *ctx, uint32_t cpus,
 			 uint32_t *eqn)
@@ -1339,6 +1352,8 @@
 	.devx_port_query = mlx5_glue_devx_port_query,
 	.dr_dump_domain = mlx5_glue_dr_dump_domain,
 	.dr_reclaim_domain_memory = mlx5_glue_dr_reclaim_domain_memory,
+	.dr_create_flow_action_sampler =
+		mlx5_glue_dr_create_flow_action_sampler,
 	.devx_query_eqn = mlx5_glue_devx_query_eqn,
 	.devx_create_event_channel = mlx5_glue_devx_create_event_channel,
 	.devx_destroy_event_channel = mlx5_glue_devx_destroy_event_channel,
diff --git a/drivers/common/mlx5/linux/mlx5_glue.h b/drivers/common/mlx5/linux/mlx5_glue.h
index 734ace2..85b43b9 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.h
+++ b/drivers/common/mlx5/linux/mlx5_glue.h
@@ -77,6 +77,7 @@
 #ifndef HAVE_MLX5DV_DR
 enum  mlx5dv_dr_domain_type { unused, };
 struct mlx5dv_dr_domain;
+struct mlx5dv_dr_action;
 #endif
 
 #ifndef HAVE_MLX5DV_DR_DEVX_PORT
@@ -87,6 +88,15 @@
 struct mlx5dv_dr_flow_meter_attr;
 #endif
 
+#ifndef HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE
+struct mlx5dv_dr_flow_sampler_attr {
+	uint32_t sample_ratio;
+	void *default_next_table;
+	size_t num_sample_actions;
+	struct mlx5dv_dr_action **sample_actions;
+};
+#endif
+
 #ifndef HAVE_IBV_DEVX_EVENT
 struct mlx5dv_devx_event_channel { int fd; };
 struct mlx5dv_devx_async_event_hdr;
@@ -309,6 +319,8 @@ struct mlx5_glue {
 					 const void *pp_context,
 					 uint32_t flags);
 	void (*dv_free_pp)(struct mlx5dv_pp *pp);
+	void *(*dr_create_flow_action_sampler)
+			(struct mlx5dv_dr_flow_sampler_attr *attr);
 };
 
 extern const struct mlx5_glue *mlx5_glue;
-- 
1.8.3.1



* [dpdk-dev] [PATCH v5 3/7] common/mlx5: query sampler object capability via DevX
  2020-08-27 15:01       ` [dpdk-dev] [PATCH v5 0/7] support the flow-based traffic sampling Jiawei Wang
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 2/7] common/mlx5: glue for sample action Jiawei Wang
@ 2020-08-27 15:01         ` Jiawei Wang
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 4/7] net/mlx5: add the validate sample action Jiawei Wang
                           ` (4 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-27 15:01 UTC (permalink / raw)
  To: orika, viacheslavo, matan
  Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw, asafp

Update function mlx5_devx_cmd_query_hca_attr() to add the NIC Flow
Table attributes query, then get log_max_ft_sampler_num from the
flow table properties.

Add the related struct definitions in mlx5_prm.h.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c | 27 +++++++++++++++++++
 drivers/common/mlx5/mlx5_devx_cmds.h |  1 +
 drivers/common/mlx5/mlx5_prm.h       | 51 ++++++++++++++++++++++++++++++++++++
 3 files changed, 79 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 7c81ae1..fd4e3f2 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -751,6 +751,33 @@ struct mlx5_devx_obj *
 	if (!attr->eth_net_offloads)
 		return 0;
 
+	/* Query Flow Sampler Capability From Flow Table Properties Layout. */
+	memset(in, 0, sizeof(in));
+	memset(out, 0, sizeof(out));
+	MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
+	MLX5_SET(query_hca_cap_in, in, op_mod,
+		 MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE |
+		 MLX5_HCA_CAP_OPMOD_GET_CUR);
+
+	rc = mlx5_glue->devx_general_cmd(ctx,
+					 in, sizeof(in),
+					 out, sizeof(out));
+	if (rc)
+		goto error;
+	status = MLX5_GET(query_hca_cap_out, out, status);
+	syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
+	if (status) {
+		DRV_LOG(DEBUG, "Failed to query devx HCA capabilities, "
+			"status %x, syndrome = %x",
+			status, syndrome);
+		attr->log_max_ft_sampler_num = 0;
+		return -1;
+	}
+	hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
+	attr->log_max_ft_sampler_num =
+			MLX5_GET(flow_table_nic_cap,
+			hcattr, flow_table_properties.log_max_ft_sampler_num);
+
 	/* Query HCA offloads for Ethernet protocol. */
 	memset(in, 0, sizeof(in));
 	memset(out, 0, sizeof(out));
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 1c84cea..cfa7a7b 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -102,6 +102,7 @@ struct mlx5_hca_attr {
 	uint32_t scatter_fcs_w_decap_disable:1;
 	uint32_t regex:1;
 	uint32_t regexp_num_of_engines;
+	uint32_t log_max_ft_sampler_num:8;
 	struct mlx5_hca_qos_attr qos;
 	struct mlx5_hca_vdpa_attr vdpa;
 };
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index e0ebe12..4a624e1 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -1036,6 +1036,7 @@ enum {
 	MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE = 0x0 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_ETHERNET_OFFLOAD_CAPS = 0x1 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP = 0xc << 1,
+	MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE = 0x7 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_VDPA_EMULATION = 0x13 << 1,
 };
 
@@ -1470,12 +1471,62 @@ struct mlx5_ifc_virtio_emulation_cap_bits {
 	u8 reserved_at_1c0[0x620];
 };
 
+struct mlx5_ifc_flow_table_prop_layout_bits {
+	u8 ft_support[0x1];
+	u8 flow_tag[0x1];
+	u8 flow_counter[0x1];
+	u8 flow_modify_en[0x1];
+	u8 modify_root[0x1];
+	u8 identified_miss_table[0x1];
+	u8 flow_table_modify[0x1];
+	u8 reformat[0x1];
+	u8 decap[0x1];
+	u8 reset_root_to_default[0x1];
+	u8 pop_vlan[0x1];
+	u8 push_vlan[0x1];
+	u8 fpga_vendor_acceleration[0x1];
+	u8 pop_vlan_2[0x1];
+	u8 push_vlan_2[0x1];
+	u8 reformat_and_vlan_action[0x1];
+	u8 modify_and_vlan_action[0x1];
+	u8 sw_owner[0x1];
+	u8 reformat_l3_tunnel_to_l2[0x1];
+	u8 reformat_l2_to_l3_tunnel[0x1];
+	u8 reformat_and_modify_action[0x1];
+	u8 reserved_at_15[0x9];
+	u8 sw_owner_v2[0x1];
+	u8 reserved_at_1f[0x1];
+	u8 reserved_at_20[0x2];
+	u8 log_max_ft_size[0x6];
+	u8 log_max_modify_header_context[0x8];
+	u8 max_modify_header_actions[0x8];
+	u8 max_ft_level[0x8];
+	u8 reserved_at_40[0x8];
+	u8 log_max_ft_sampler_num[0x8];
+	u8 metadata_reg_b_width[0x8];
+	u8 metadata_reg_a_width[0x8];
+	u8 reserved_at_60[0x18];
+	u8 log_max_ft_num[0x8];
+	u8 reserved_at_80[0x10];
+	u8 log_max_flow_counter[0x8];
+	u8 log_max_destination[0x8];
+	u8 reserved_at_a0[0x18];
+	u8 log_max_flow[0x8];
+	u8 reserved_at_c0[0x140];
+};
+
+struct mlx5_ifc_flow_table_nic_cap_bits {
+	u8	   reserved_at_0[0x200];
+	struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties;
+};
+
 union mlx5_ifc_hca_cap_union_bits {
 	struct mlx5_ifc_cmd_hca_cap_bits cmd_hca_cap;
 	struct mlx5_ifc_per_protocol_networking_offload_caps_bits
 	       per_protocol_networking_offload_caps;
 	struct mlx5_ifc_qos_cap_bits qos_cap;
 	struct mlx5_ifc_virtio_emulation_cap_bits vdpa_caps;
+	struct mlx5_ifc_flow_table_nic_cap_bits flow_table_nic_cap;
 	u8 reserved_at_0[0x8000];
 };
 
-- 
1.8.3.1



* [dpdk-dev] [PATCH v5 4/7] net/mlx5: add the validate sample action
  2020-08-27 15:01       ` [dpdk-dev] [PATCH v5 0/7] support the flow-based traffic sampling Jiawei Wang
                           ` (2 preceding siblings ...)
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 3/7] common/mlx5: query sampler object capability via DevX Jiawei Wang
@ 2020-08-27 15:01         ` Jiawei Wang
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 5/7] net/mlx5: split sample flow into two sub flows Jiawei Wang
                           ` (3 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-27 15:01 UTC (permalink / raw)
  To: orika, viacheslavo, matan
  Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw, asafp

Add the sample action validation function.

The sample flow is supported in the NIC-RX and FDB domains and, in
NIC_RX, must include a dest TIR action.

Only NIC_RX supports additional optional actions. FDB doesn't
support any optional action; the sampled packets always go
to the e-switch manager port.
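
The first checks performed by the validation can be sketched roughly as follows. This is a simplified model under stated assumptions: the demo_* names and the flattened request structure are hypothetical, and the single sampler_en flag stands in for the DevX and sampler capability checks done by the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the inputs the validation looks at (illustrative names). */
struct demo_sample_request {
	uint32_t group;        /* flow attribute group; 0 is the root table */
	int sampler_en;        /* device reports sampler support */
	const uint32_t *ratio; /* NULL models a missing action configuration */
};

/* Return 0 on success, a negative errno value otherwise. */
static int
demo_validate_sample(const struct demo_sample_request *req)
{
	assert(req != NULL);
	if (req->group == 0)
		return -ENOTSUP; /* root table is not supported */
	if (!req->sampler_en)
		return -ENOTSUP; /* sample action not supported */
	if (req->ratio == NULL)
		return -EINVAL;  /* configuration cannot be null */
	if (*req->ratio == 0)
		return -EINVAL;  /* ratio values start from 1 */
	return 0;
}
```

The sketch tests the configuration pointer before reading the ratio, so a missing configuration is reported as an error instead of being dereferenced.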

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c |  14 +++++
 drivers/net/mlx5/mlx5.h          |   1 +
 drivers/net/mlx5/mlx5_flow.h     |   1 +
 drivers/net/mlx5/mlx5_flow_dv.c  | 133 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 149 insertions(+)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index db955ae..5b663a1 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1015,6 +1015,20 @@
 			}
 		}
 #endif
+#if defined(HAVE_MLX5DV_DR) && defined(HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE)
+		if (config->hca_attr.log_max_ft_sampler_num > 0  &&
+		    config->dv_flow_en) {
+			priv->sampler_en = 1;
+			DRV_LOG(DEBUG, "The Sampler enabled!\n");
+		} else {
+			priv->sampler_en = 0;
+			if (!config->hca_attr.log_max_ft_sampler_num)
+				DRV_LOG(WARNING, "No available register for"
+						" Sampler.");
+			else
+				DRV_LOG(DEBUG, "DV flow is not supported!\n");
+		}
+#endif
 	}
 	if (config->tx_pp) {
 		DRV_LOG(DEBUG, "Timestamp counter frequency %u kHz",
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 1880a82..ae0c7cc 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -698,6 +698,7 @@ struct mlx5_priv {
 	unsigned int counter_fallback:1; /* Use counter fallback management. */
 	unsigned int mtr_en:1; /* Whether support meter. */
 	unsigned int mtr_reg_share:1; /* Whether support meter REG_C share. */
+	unsigned int sampler_en:1; /* Whether support sampler. */
 	uint16_t domain_id; /* Switch domain identifier. */
 	uint16_t vport_id; /* Associated VF vport index (if any). */
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 92301e4..41404de 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -196,6 +196,7 @@ enum mlx5_feature_name {
 #define MLX5_FLOW_ACTION_SET_IPV6_DSCP (1ull << 33)
 #define MLX5_FLOW_ACTION_AGE (1ull << 34)
 #define MLX5_FLOW_ACTION_DEFAULT_MISS (1ull << 35)
+#define MLX5_FLOW_ACTION_SAMPLE (1ull << 36)
 
 #define MLX5_FLOW_FATE_ACTIONS \
 	(MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE | \
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index dd35959..a8db8ab 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -3992,6 +3992,130 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Validate the sample action.
+ *
+ * @param[in] action_flags
+ *   Holds the actions detected until now.
+ * @param[in] action
+ *   Pointer to the sample action.
+ * @param[in] dev
+ *   Pointer to the Ethernet device structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this action.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_action_sample(uint64_t action_flags,
+			      const struct rte_flow_action *action,
+			      struct rte_eth_dev *dev,
+			      const struct rte_flow_attr *attr,
+			      struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_config *dev_conf = &priv->config;
+	const struct rte_flow_action_sample *sample = action->conf;
+	const struct rte_flow_action *act = sample->actions;
+	uint64_t sub_action_flags = 0;
+	int actions_n = 0;
+	int ret;
+
+	if (!attr->group)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
+					  NULL, "root table is not supported");
+	if (!priv->config.devx || !priv->sampler_en)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "sample action not supported");
+	if (!(action->conf))
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "configuration cannot be null");
+	if (sample->ratio == 0)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "ratio value starts from 1");
+	if (action_flags & MLX5_FLOW_ACTION_SAMPLE)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "Duplicate sample actions set");
+	if (action_flags & MLX5_FLOW_ACTION_METER)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION, action,
+					  "wrong action order, meter should "
+					  "be after sample action");
+	for (; act->type != RTE_FLOW_ACTION_TYPE_END; act++) {
+		if (actions_n == MLX5_DV_MAX_NUMBER_OF_ACTIONS)
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  act, "too many actions");
+		switch (act->type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			ret = mlx5_flow_validate_action_queue(act,
+							      sub_action_flags,
+							      dev,
+							      attr, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			++actions_n;
+			break;
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			ret = flow_dv_validate_action_mark(dev, act,
+							   sub_action_flags,
+							   attr, error);
+			if (ret < 0)
+				return ret;
+			if (dev_conf->dv_xmeta_en != MLX5_XMETA_MODE_LEGACY)
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK |
+						MLX5_FLOW_ACTION_MARK_EXT;
+			else
+				sub_action_flags |= MLX5_FLOW_ACTION_MARK;
+			++actions_n;
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			ret = flow_dv_validate_action_count(dev, error);
+			if (ret < 0)
+				return ret;
+			sub_action_flags |= MLX5_FLOW_ACTION_COUNT;
+			++actions_n;
+			break;
+		default:
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "optional action is not "
+						  "supported");
+		}
+	}
+	if (attr->ingress && !attr->transfer) {
+		if (!(sub_action_flags & MLX5_FLOW_ACTION_QUEUE))
+			return rte_flow_error_set(error, EINVAL,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL,
+						  "Ingress must have a dest "
+						  "QUEUE for Sample");
+	} else if (attr->egress && !attr->transfer) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL,
+					  "Sample only supports Ingress "
+					  "or E-Switch");
+	} else if (sample->actions->type != RTE_FLOW_ACTION_TYPE_END) {
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
+					  "E-Switch doesn't support any "
+					  "optional action for sampling");
+	}
+	return 0;
+}
+
+/**
  * Find existing modify-header resource or create and register a new one.
  *
  * @param dev[in, out]
@@ -5753,6 +5877,15 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			rw_act_num += MLX5_ACT_NUM_SET_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			ret = flow_dv_validate_action_sample(action_flags,
+							     actions, dev,
+							     attr, error);
+			if (ret < 0)
+				return ret;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			++actions_n;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ACTION,
-- 
1.8.3.1



* [dpdk-dev] [PATCH v5 5/7] net/mlx5: split sample flow into two sub flows
  2020-08-27 15:01       ` [dpdk-dev] [PATCH v5 0/7] support the flow-based traffic sampling Jiawei Wang
                           ` (3 preceding siblings ...)
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 4/7] net/mlx5: add the validate sample action Jiawei Wang
@ 2020-08-27 15:01         ` Jiawei Wang
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 6/7] net/mlx5: update translate function for sample action Jiawei Wang
                           ` (2 subsequent siblings)
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-27 15:01 UTC (permalink / raw)
  To: orika, viacheslavo, matan
  Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw, asafp

Add the definitions of the sampler action resource structures.

A flow with a sample action is split into two sub flows: the prefix
flow carries the sample action, and the suffix flow carries the
remaining actions.

The prefix flow gets an extra tag action that writes a unique ID to a
metadata register, and the suffix flow gets an extra tag item that
matches that unique ID.
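
The split described above can be sketched roughly as follows. This is
a simplified illustration and not the driver code: the act_type enum,
struct act and split_sample_actions() are hypothetical stand-ins for
the rte_flow action types and for the split helper, and the tag is
always added here, whereas the patch only adds it for non-transfer
flows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical action types, standing in for enum rte_flow_action_type. */
enum act_type {
	ACT_END,
	ACT_MARK,
	ACT_COUNT,
	ACT_SAMPLE,
	ACT_JUMP,
	ACT_METER,
	ACT_QUEUE,
	ACT_SET_TAG,
};

/* Hypothetical, minimal action descriptor. */
struct act {
	enum act_type type;
	uint32_t tag; /* Only meaningful for ACT_SET_TAG. */
};

/*
 * Split the END-terminated list `in` into a prefix and a suffix list.
 * A SET_TAG action carrying tag_id is placed right before the sample
 * action in the prefix. JUMP and METER always go to the suffix, and so
 * does every action that follows the sample action.
 * Each output array needs room for all input actions plus two entries.
 */
static void
split_sample_actions(const struct act *in, uint32_t tag_id,
		     struct act *pre, struct act *sfx)
{
	int seen_sample = 0;

	assert(in != NULL && pre != NULL && sfx != NULL);
	for (; in->type != ACT_END; in++) {
		if (in->type == ACT_SAMPLE) {
			*pre++ = (struct act){ ACT_SET_TAG, tag_id };
			*pre++ = *in;
			seen_sample = 1;
		} else if (seen_sample || in->type == ACT_JUMP ||
			   in->type == ACT_METER) {
			*sfx++ = *in;
		} else {
			*pre++ = *in;
		}
	}
	/* Terminate both lists. */
	*pre = (struct act){ ACT_END, 0 };
	*sfx = (struct act){ ACT_END, 0 };
}
```

For an input of MARK, SAMPLE, QUEUE, END and a tag ID of 7, the prefix
list becomes MARK, SET_TAG(7), SAMPLE, END and the suffix list becomes
QUEUE, END. The suffix flow would then match on tag 7.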

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5.c      |  11 ++
 drivers/net/mlx5/mlx5.h      |   3 +
 drivers/net/mlx5/mlx5_flow.c | 258 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_flow.h |  36 ++++++
 4 files changed, 304 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1e4c695..7b33a3e 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -244,6 +244,17 @@ static LIST_HEAD(, mlx5_dev_ctx_shared) mlx5_dev_ctx_list =
 		.free = mlx5_free,
 		.type = "mlx5_jump_ipool",
 	},
+	{
+		.size = sizeof(struct mlx5_flow_dv_sample_resource),
+		.trunk_size = 64,
+		.grow_trunk = 3,
+		.grow_shift = 2,
+		.need_lock = 0,
+		.release_mem_en = 1,
+		.malloc = mlx5_malloc,
+		.free = mlx5_free,
+		.type = "mlx5_sample_ipool",
+	},
 #endif
 	{
 		.size = sizeof(struct mlx5_flow_meter),
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index ae0c7cc..a76c161 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -41,6 +41,7 @@ enum mlx5_ipool_index {
 	MLX5_IPOOL_TAG, /* Pool for tag resource. */
 	MLX5_IPOOL_PORT_ID, /* Pool for port id resource. */
 	MLX5_IPOOL_JUMP, /* Pool for jump resource. */
+	MLX5_IPOOL_SAMPLE, /* Pool for sample resource. */
 #endif
 	MLX5_IPOOL_MTR, /* Pool for meter resource. */
 	MLX5_IPOOL_MCP, /* Pool for metadata resource. */
@@ -514,6 +515,7 @@ struct mlx5_flow_tbl_resource {
 /* Tables for metering splits should be added here. */
 #define MLX5_MAX_TABLES_EXTERNAL (MLX5_MAX_TABLES - 3)
 #define MLX5_MAX_TABLES_FDB UINT16_MAX
+#define MLX5_FLOW_TABLE_FACTOR 10
 
 /* ID generation structure. */
 struct mlx5_flow_id_pool {
@@ -642,6 +644,7 @@ struct mlx5_dev_ctx_shared {
 	struct mlx5_hlist *tag_table;
 	uint32_t port_id_action_list; /* List of port ID actions. */
 	uint32_t push_vlan_action_list; /* List of push VLAN actions. */
+	uint32_t sample_action_list; /* List of sample actions. */
 	struct mlx5_flow_counter_mng cmng; /* Counters management structure. */
 	struct mlx5_flow_default_miss_resource default_miss;
 	/* Default miss action resource structure. */
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 7150173..49f49e7 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -3908,6 +3908,139 @@ struct mlx5_flow_tunnel_info {
 	return 0;
 }
 
+
+/**
+ * Check whether the given action is present in the action list.
+ *
+ * @param[in] actions
+ *   Pointer to the list of actions.
+ * @param[in] action
+ *   The action type to look for.
+ *
+ * @return
+ *   > 0 the total number of actions (including END) if the action is found.
+ *   0 if the action is not found in the action list.
+ */
+static int
+flow_check_match_action(const struct rte_flow_action actions[],
+					enum rte_flow_action_type action)
+{
+	int actions_n = 0;
+	int flag = 0;
+
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		if (actions->type == action)
+			flag = 1;
+		actions_n++;
+	}
+	/* Count RTE_FLOW_ACTION_TYPE_END. */
+	return flag ? actions_n + 1 : 0;
+}
+
+/**
+ * Split the sample flow.
+ *
+ * The sample flow is split into two sub flows: the prefix flow keeps
+ * the sample action, and the other actions move to the new suffix flow.
+ *
+ * A tag action with a unique tag ID is also added to the prefix flow,
+ * and the suffix flow matches on the same tag ID.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[out] sfx_items
+ *   Suffix flow match items (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] actions_sfx
+ *   Suffix flow actions.
+ * @param[out] actions_pre
+ *   Prefix flow actions.
+ *
+ * @return
+ *   The unique tag ID for a NIC Rx flow, 0 for an E-Switch flow or on failure.
+ */
+static int
+flow_sample_split_prep(struct rte_eth_dev *dev,
+		 const struct rte_flow_attr *attr,
+		 struct rte_flow_item sfx_items[],
+		 const struct rte_flow_action actions[],
+		 struct rte_flow_action actions_sfx[],
+		 struct rte_flow_action actions_pre[])
+{
+	struct mlx5_rte_flow_action_set_tag *set_tag;
+	struct mlx5_rte_flow_item_tag *tag_spec;
+	struct mlx5_rte_flow_item_tag *tag_mask;
+	struct rte_flow_item *tag_item;
+	struct rte_flow_action *tag_action = NULL;
+	bool pre_sample = true;
+	struct rte_flow_error error;
+	uint32_t tag_id = 0;
+
+	/* Prepare the actions for prefix and suffix flow. */
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		struct rte_flow_action **action_cur = NULL;
+
+		switch (actions->type) {
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			if (!attr->transfer) {
+				/* Add the extra tag action first for NIC-RX. */
+				tag_action = actions_pre;
+				tag_action->type = (enum rte_flow_action_type)
+						MLX5_RTE_FLOW_ACTION_TYPE_TAG;
+				actions_pre++;
+			}
+			break;
+		case RTE_FLOW_ACTION_TYPE_JUMP:
+		case RTE_FLOW_ACTION_TYPE_METER:
+			action_cur = &actions_sfx;
+			break;
+		default:
+			break;
+		}
+		if (pre_sample && !action_cur)
+			action_cur = &actions_pre;
+		else
+			action_cur = &actions_sfx;
+		memcpy(*action_cur, actions, sizeof(struct rte_flow_action));
+		(*action_cur)++;
+		if (actions->type == RTE_FLOW_ACTION_TYPE_SAMPLE)
+			pre_sample = false;
+	}
+	/* Add end action to the actions. */
+	actions_sfx->type = RTE_FLOW_ACTION_TYPE_END;
+	actions_pre->type = RTE_FLOW_ACTION_TYPE_END;
+	if (!attr->transfer) {
+		actions_pre++;
+		/* Set the tag. */
+		set_tag = (void *)actions_pre;
+		set_tag->id = mlx5_flow_get_reg_id(dev, MLX5_APP_TAG,
+						   0, &error);
+		tag_id = flow_qrss_get_id(dev);
+		set_tag->data = tag_id;
+		assert(tag_action);
+		tag_action->conf = set_tag;
+		/* Prepare the suffix subflow items. */
+		tag_item = sfx_items++;
+		sfx_items->type = RTE_FLOW_ITEM_TYPE_END;
+		sfx_items++;
+		tag_spec = (struct mlx5_rte_flow_item_tag *)sfx_items;
+		tag_spec->data = tag_id;
+		tag_spec->id = set_tag->id;
+		tag_mask = tag_spec + 1;
+		tag_mask->data = UINT32_MAX;
+		tag_mask->id = UINT16_MAX;
+		tag_item->type = (enum rte_flow_item_type)
+				MLX5_RTE_FLOW_ITEM_TYPE_TAG;
+		tag_item->spec = tag_spec;
+		tag_item->last = NULL;
+		tag_item->mask = tag_mask;
+	}
+	return tag_id;
+}
+
 /**
  * The splitting for metadata feature.
  *
@@ -4169,6 +4302,7 @@ struct mlx5_flow_tunnel_info {
 static int
 flow_create_split_meter(struct rte_eth_dev *dev,
 			   struct rte_flow *flow,
+			   uint64_t prefix_layers,
 			   const struct rte_flow_attr *attr,
 			   const struct rte_flow_item items[],
 			   const struct rte_flow_action actions[],
@@ -4216,8 +4350,9 @@ struct mlx5_flow_tunnel_info {
 			goto exit;
 		}
 		/* Add the prefix subflow. */
-		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
-					      items, pre_actions, external,
+		ret = flow_create_split_inner(dev, flow, &dev_flow,
+					      prefix_layers, attr, items,
+					      pre_actions, external,
 					      flow_idx, error);
 		if (ret) {
 			ret = -rte_errno;
@@ -4232,7 +4367,7 @@ struct mlx5_flow_tunnel_info {
 	/* Add the prefix subflow. */
 	ret = flow_create_split_metadata(dev, flow, dev_flow ?
 					 flow_get_prefix_layer_flags(dev_flow) :
-					 0, &sfx_attr,
+					 prefix_layers, &sfx_attr,
 					 sfx_items ? sfx_items : items,
 					 sfx_actions ? sfx_actions : actions,
 					 external, flow_idx, error);
@@ -4243,6 +4378,121 @@ struct mlx5_flow_tunnel_info {
 }
 
 /**
+ * The splitting for sample feature.
+ *
+ * The sample flow will be split to two flows as prefix and
+ * suffix flow.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] flow
+ *   Parent flow structure pointer.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] items
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[in] external
+ *   This flow rule is created by request external to PMD.
+ * @param[in] flow_idx
+ *   This memory pool index to the flow.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ * @return
+ *   0 on success, negative value otherwise
+ */
+static int
+flow_create_split_sample(struct rte_eth_dev *dev,
+			   struct rte_flow *flow,
+			   const struct rte_flow_attr *attr,
+			   const struct rte_flow_item items[],
+			   const struct rte_flow_action actions[],
+			   bool external, uint32_t flow_idx,
+			   struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct rte_flow_action *sfx_actions = NULL;
+	struct rte_flow_action *pre_actions = NULL;
+	struct rte_flow_item *sfx_items = NULL;
+	struct mlx5_flow *dev_flow = NULL;
+	struct rte_flow_attr sfx_attr = *attr;
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	struct mlx5_flow_tbl_data_entry *sfx_tbl_data;
+	struct mlx5_flow_tbl_resource *sfx_tbl;
+	union mlx5_flow_tbl_key sfx_table_key;
+#endif
+	size_t act_size;
+	size_t item_size;
+	uint32_t tag_id = 0;
+	int actions_n = 0;
+	int ret = 0;
+
+	if (priv->sampler_en)
+		actions_n = flow_check_match_action(actions,
+					RTE_FLOW_ACTION_TYPE_SAMPLE);
+	if (actions_n) {
+		/* The prefix actions must include sample, tag, end. */
+		act_size = sizeof(struct rte_flow_action) * (actions_n * 2) +
+			   sizeof(struct mlx5_rte_flow_action_set_tag);
+		/* tag, end. */
+#define SAMPLE_SUFFIX_ITEM 2
+		item_size = sizeof(struct rte_flow_item) * SAMPLE_SUFFIX_ITEM +
+			    sizeof(struct mlx5_rte_flow_item_tag) * 2;
+		sfx_actions = rte_zmalloc(__func__, (act_size + item_size), 0);
+		if (!sfx_actions)
+			return rte_flow_error_set(error, ENOMEM,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  NULL, "no memory to split "
+						  "sample flow");
+		if (!attr->transfer)
+			sfx_items = (struct rte_flow_item *)((char *)sfx_actions
+					+ act_size);
+		pre_actions = sfx_actions + actions_n;
+		tag_id = flow_sample_split_prep(dev, attr, sfx_items,
+						   actions, sfx_actions,
+						   pre_actions);
+		if (!attr->transfer && !tag_id) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		/* Add the prefix subflow. */
+		ret = flow_create_split_inner(dev, flow, &dev_flow, 0, attr,
+					      items, pre_actions, external,
+					      flow_idx, error);
+		if (ret) {
+			ret = -rte_errno;
+			goto exit;
+		}
+		dev_flow->handle->split_flow_id = tag_id;
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+		/* Set the sfx group attr. */
+		sample_res = (struct mlx5_flow_dv_sample_resource *)
+					dev_flow->dv.sample_res;
+		sfx_tbl = (struct mlx5_flow_tbl_resource *)
+					sample_res->normal_path_tbl;
+		sfx_tbl_data = container_of(sfx_tbl,
+					struct mlx5_flow_tbl_data_entry, tbl);
+		sfx_table_key.v64 = sfx_tbl_data->entry.key;
+		sfx_attr.group = sfx_attr.transfer ?
+					(sfx_table_key.table_id - 1) :
+					sfx_table_key.table_id;
+#endif
+	}
+	/* Add the suffix subflow. */
+	ret = flow_create_split_meter(dev, flow, dev_flow ?
+				 flow_get_prefix_layer_flags(dev_flow) : 0,
+				 &sfx_attr, sfx_items ? sfx_items : items,
+				 sfx_actions ? sfx_actions : actions,
+				 external, flow_idx, error);
+exit:
+	if (sfx_actions)
+		rte_free(sfx_actions);
+	return ret;
+}
+
+/**
  * Split the flow to subflow set. The splitters might be linked
  * in the chain, like this:
  * flow_create_split_outer() calls:
@@ -4290,7 +4540,7 @@ struct mlx5_flow_tunnel_info {
 {
 	int ret;
 
-	ret = flow_create_split_meter(dev, flow, attr, items,
+	ret = flow_create_split_sample(dev, flow, attr, items,
 					 actions, external, flow_idx, error);
 	MLX5_ASSERT(ret <= 0);
 	return ret;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 41404de..9d2493a 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -507,6 +507,38 @@ struct mlx5_flow_tbl_data_entry {
 	uint32_t idx; /**< index for the indexed mempool. */
 };
 
+/* Sub rdma-core actions list. */
+struct mlx5_flow_sub_actions_list {
+	uint32_t actions_num; /**< Number of sample actions. */
+	uint64_t action_flags;
+	void *dr_queue_action;
+	void *dr_tag_action;
+	void *dr_cnt_action;
+};
+
+/* Sample sub-actions resource list. */
+struct mlx5_flow_sub_actions_idx {
+	uint32_t rix_hrxq; /**< Hash Rx queue object index. */
+	uint32_t rix_tag; /**< Index to the tag action. */
+	uint32_t cnt;
+};
+
+/* Sample action resource structure. */
+struct mlx5_flow_dv_sample_resource {
+	ILIST_ENTRY(uint32_t)next; /**< Pointer to next element. */
+	rte_atomic32_t refcnt; /**< Reference counter. */
+	void *verbs_action; /**< Verbs sample action object. */
+	uint8_t ft_type; /**< Flow Table Type. */
+	uint32_t ft_id; /**< Flow Table Level. */
+	void *normal_path_tbl; /**< Flow Table pointer. */
+	void *default_miss; /**< default_miss dr_action. */
+	uint32_t ratio; /**< Sample Ratio. */
+	struct mlx5_flow_sub_actions_idx sample_idx;
+	/**< Action index resources. */
+	struct mlx5_flow_sub_actions_list sample_act;
+	/**< Action resources. */
+};
+
 /* Verbs specification header. */
 struct ibv_spec_header {
 	enum ibv_flow_spec_type type;
@@ -539,6 +571,8 @@ struct mlx5_flow_handle_dv {
 	/**< Index to push VLAN action resource in cache. */
 	uint32_t rix_tag;
 	/**< Index to the tag action. */
+	uint32_t rix_sample;
+	/**< Index to sample action resource in cache. */
 } __rte_packed;
 
 /** Device flow handle structure: used both for creating & destroying. */
@@ -604,6 +638,8 @@ struct mlx5_flow_dv_workspace {
 	/**< Pointer to the jump action resource. */
 	struct mlx5_flow_dv_match_params value;
 	/**< Holds the value that the packet is compared to. */
+	struct mlx5_flow_dv_sample_resource *sample_res;
+	/**< Pointer to the sample action resource. */
 };
 
 /*
-- 
1.8.3.1



* [dpdk-dev] [PATCH v5 6/7] net/mlx5: update translate function for sample action
  2020-08-27 15:01       ` [dpdk-dev] [PATCH v5 0/7] support the flow-based traffic sampling Jiawei Wang
                           ` (4 preceding siblings ...)
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 5/7] net/mlx5: split sample flow into two sub flows Jiawei Wang
@ 2020-08-27 15:01         ` Jiawei Wang
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 7/7] app/testpmd: add testpmd command " Jiawei Wang
  2020-09-09  6:48         ` [dpdk-dev] [PATCH v6 00/12] support the flow-based traffic sampling Jiawei Wang
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-27 15:01 UTC (permalink / raw)
  To: orika, viacheslavo, matan
  Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw, asafp

Translate the attributes of the sample action, which include the
sample ratio and the sub-action list, then create the sample DR action.
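
The translated resource is looked up in a cache before a new one is
created, so that flows with the same sample attributes share one
object. A minimal sketch of that find-or-create pattern, assuming a
fixed-size table: struct sample_res, struct sample_cache and
sample_res_register() are hypothetical simplifications, and the
sub-action comparison and the DR action creation are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the sample resource. */
struct sample_res {
	uint32_t ratio;   /* Sample ratio. */
	uint32_t ft_id;   /* Flow table level. */
	uint8_t ft_type;  /* Flow table type. */
	uint32_t refcnt;  /* 0 means the slot is free. */
};

#define SAMPLE_CACHE_SIZE 8

/* Hypothetical fixed-size cache, zero-initialized by the caller. */
struct sample_cache {
	struct sample_res slot[SAMPLE_CACHE_SIZE];
};

/*
 * Return a cached resource matching (ratio, ft_type, ft_id) and take a
 * reference on it, or claim a free slot for a new resource.
 * Returns NULL when the cache is full.
 */
static struct sample_res *
sample_res_register(struct sample_cache *cache, uint32_t ratio,
		    uint8_t ft_type, uint32_t ft_id)
{
	struct sample_res *free_slot = NULL;
	size_t i;

	for (i = 0; i < SAMPLE_CACHE_SIZE; i++) {
		struct sample_res *r = &cache->slot[i];

		if (r->refcnt == 0) {
			/* Remember the first free slot for a later insert. */
			if (free_slot == NULL)
				free_slot = r;
			continue;
		}
		if (r->ratio == ratio && r->ft_type == ft_type &&
		    r->ft_id == ft_id) {
			r->refcnt++;
			return r;
		}
	}
	if (free_slot == NULL)
		return NULL;
	free_slot->ratio = ratio;
	free_slot->ft_type = ft_type;
	free_slot->ft_id = ft_id;
	free_slot->refcnt = 1;
	return free_slot;
}
```

Registering the same (ratio, ft_type, ft_id) twice returns the same
entry with a reference count of 2, while a different ratio claims a
new slot.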

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow.c    |  16 +-
 drivers/net/mlx5/mlx5_flow.h    |  14 +-
 drivers/net/mlx5/mlx5_flow_dv.c | 494 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 502 insertions(+), 22 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 49f49e7..d2b79f0 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -4606,10 +4606,14 @@ struct mlx5_flow_tunnel_info {
 	int hairpin_flow;
 	uint32_t hairpin_id = 0;
 	struct rte_flow_attr attr_tx = { .priority = 0 };
+	struct rte_flow_attr attr_factor = {0};
 	int ret;
 
-	hairpin_flow = flow_check_hairpin_split(dev, attr, actions);
-	ret = flow_drv_validate(dev, attr, items, p_actions_rx,
+	memcpy((void *)&attr_factor, (const void *)attr, sizeof(*attr));
+	if (external)
+		attr_factor.group *= MLX5_FLOW_TABLE_FACTOR;
+	hairpin_flow = flow_check_hairpin_split(dev, &attr_factor, actions);
+	ret = flow_drv_validate(dev, &attr_factor, items, p_actions_rx,
 				external, hairpin_flow, error);
 	if (ret < 0)
 		return 0;
@@ -4628,7 +4632,7 @@ struct mlx5_flow_tunnel_info {
 		rte_errno = ENOMEM;
 		goto error_before_flow;
 	}
-	flow->drv_type = flow_get_drv_type(dev, attr);
+	flow->drv_type = flow_get_drv_type(dev, &attr_factor);
 	if (hairpin_id != 0)
 		flow->hairpin_flow_id = hairpin_id;
 	MLX5_ASSERT(flow->drv_type > MLX5_FLOW_TYPE_MIN &&
@@ -4674,7 +4678,7 @@ struct mlx5_flow_tunnel_info {
 		 * depending on configuration. In the simplest
 		 * case it just creates unmodified original flow.
 		 */
-		ret = flow_create_split_outer(dev, flow, attr,
+		ret = flow_create_split_outer(dev, flow, &attr_factor,
 					      buf->entry[i].pattern,
 					      p_actions_rx, external, idx,
 					      error);
@@ -4711,8 +4715,8 @@ struct mlx5_flow_tunnel_info {
 	 * the egress Flows belong to the different device and
 	 * copy table should be updated in peer NIC Rx domain.
 	 */
-	if (attr->ingress &&
-	    (external || attr->group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
+	if (attr_factor.ingress &&
+	    (external || attr_factor.group != MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
 		ret = flow_mreg_update_copy_table(dev, flow, actions, error);
 		if (ret)
 			goto error;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 9d2493a..f3c0406 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -366,6 +366,13 @@ enum mlx5_flow_fate_type {
 	MLX5_FLOW_FATE_MAX,
 };
 
+/*
+ * Max number of actions per DV flow.
+ * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
+ * in rdma-core file providers/mlx5/verbs.c.
+ */
+#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
+
 /* Matcher PRM representation */
 struct mlx5_flow_dv_match_params {
 	size_t size;
@@ -613,13 +620,6 @@ struct mlx5_flow_handle {
 #define MLX5_FLOW_HANDLE_VERBS_SIZE (sizeof(struct mlx5_flow_handle))
 #endif
 
-/*
- * Max number of actions per DV flow.
- * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
- * in rdma-core file providers/mlx5/verbs.c.
- */
-#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
-
 /** Device flow structure only for DV flow creation. */
 struct mlx5_flow_dv_workspace {
 	uint32_t group; /**< The group index. */
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index a8db8ab..3d85140 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -76,6 +76,10 @@
 static int
 flow_dv_default_miss_resource_release(struct rte_eth_dev *dev);
 
+static int
+flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
+				      uint32_t encap_decap_idx);
+
 /**
  * Initialize flow attributes structure according to flow items' types.
  *
@@ -8233,6 +8237,373 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Create an Rx Hash queue.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[in] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] rss_desc
+ *   Pointer to the mlx5_flow_rss_desc.
+ * @param[out] hrxq_idx
+ *   Hash Rx queue index.
+ *
+ * @return
+ *   The Verbs/DevX object initialised, NULL otherwise and rte_errno is set.
+ */
+static struct mlx5_hrxq *
+flow_dv_handle_rx_queue(struct rte_eth_dev *dev,
+			  struct mlx5_flow *dev_flow,
+			  struct mlx5_flow_rss_desc *rss_desc,
+			  uint32_t *hrxq_idx)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_flow_handle *dh = dev_flow->handle;
+	struct mlx5_hrxq *hrxq;
+
+	MLX5_ASSERT(rss_desc->queue_num);
+	*hrxq_idx = mlx5_hrxq_get(dev, rss_desc->key,
+				 MLX5_RSS_HASH_KEY_LEN,
+				 dev_flow->hash_fields,
+				 rss_desc->queue,
+				 rss_desc->queue_num);
+	if (!*hrxq_idx) {
+		*hrxq_idx = mlx5_hrxq_new
+				(dev, rss_desc->key,
+				MLX5_RSS_HASH_KEY_LEN,
+				dev_flow->hash_fields,
+				rss_desc->queue,
+				rss_desc->queue_num,
+				!!(dh->layers &
+				MLX5_FLOW_LAYER_TUNNEL));
+		if (!*hrxq_idx)
+			return NULL;
+	}
+	hrxq = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_HRXQ],
+			      *hrxq_idx);
+	return hrxq;
+}
+
+/**
+ * Find existing sample resource or create and register a new one.
+ *
+ * @param[in, out] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[in] resource
+ *   Pointer to sample resource.
+ * @param[in, out] dev_flow
+ *   Pointer to the dev_flow.
+ * @param[in, out] sample_dv_actions
+ *   Pointer to sample actions list.
+ * @param[out] error
+ *   pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_sample_resource_register(struct rte_eth_dev *dev,
+			 const struct rte_flow_attr *attr,
+			 struct mlx5_flow_dv_sample_resource *resource,
+			 struct mlx5_flow *dev_flow,
+			 void **sample_dv_actions,
+			 struct rte_flow_error *error)
+{
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+	struct mlx5dv_dr_flow_sampler_attr sampler_attr;
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_ctx_shared *sh = priv->sh;
+	struct mlx5_flow_tbl_resource *tbl;
+	uint32_t idx = 0;
+	const uint32_t next_ft_step = 1;
+	uint32_t next_ft_id = resource->ft_id +	next_ft_step;
+
+	/* Lookup a matching resource from cache. */
+	ILIST_FOREACH(sh->ipool[MLX5_IPOOL_SAMPLE], sh->sample_action_list,
+		      idx, cache_resource, next) {
+		if (resource->ratio == cache_resource->ratio &&
+		    resource->ft_type == cache_resource->ft_type &&
+		    resource->ft_id == cache_resource->ft_id &&
+		    !memcmp((void *)&resource->sample_act,
+			    (void *)&cache_resource->sample_act,
+			    sizeof(struct mlx5_flow_sub_actions_list))) {
+			DRV_LOG(DEBUG, "sample resource %p: refcnt %d++",
+				(void *)cache_resource,
+				rte_atomic32_read(&cache_resource->refcnt));
+			rte_atomic32_inc(&cache_resource->refcnt);
+			dev_flow->handle->dvh.rix_sample = idx;
+			dev_flow->dv.sample_res = cache_resource;
+			return 0;
+		}
+	}
+	/* Register new sample resource. */
+	cache_resource = mlx5_ipool_zmalloc(sh->ipool[MLX5_IPOOL_SAMPLE],
+				       &dev_flow->handle->dvh.rix_sample);
+	if (!cache_resource)
+		return rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "cannot allocate resource memory");
+	*cache_resource = *resource;
+	/* Create normal path table level */
+	tbl = flow_dv_tbl_resource_get(dev, next_ft_id,
+					attr->egress, attr->transfer, error);
+	if (!tbl) {
+		rte_flow_error_set(error, ENOMEM,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL,
+					  "fail to create normal path table "
+					  "for sample");
+		goto error;
+	}
+	cache_resource->normal_path_tbl = tbl;
+	if (resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		cache_resource->default_miss =
+				mlx5_glue->dr_create_flow_action_default_miss();
+		if (!cache_resource->default_miss) {
+			rte_flow_error_set(error, ENOMEM,
+						RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+						NULL,
+						"cannot create default miss "
+						"action");
+			goto error;
+		}
+		sample_dv_actions[resource->sample_act.actions_num++] =
+						cache_resource->default_miss;
+	}
+	/* Create a DR sample action */
+	sampler_attr.sample_ratio = cache_resource->ratio;
+	sampler_attr.default_next_table = tbl->obj;
+	sampler_attr.num_sample_actions = resource->sample_act.actions_num;
+	sampler_attr.sample_actions = (struct mlx5dv_dr_action **)
+							&sample_dv_actions[0];
+	cache_resource->verbs_action =
+		mlx5_glue->dr_create_flow_action_sampler(&sampler_attr);
+	if (!cache_resource->verbs_action) {
+		rte_flow_error_set(error, ENOMEM,
+					RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					NULL, "cannot create sample action");
+		goto error;
+	}
+	rte_atomic32_init(&cache_resource->refcnt);
+	rte_atomic32_inc(&cache_resource->refcnt);
+	ILIST_INSERT(sh->ipool[MLX5_IPOOL_SAMPLE], &sh->sample_action_list,
+		     dev_flow->handle->dvh.rix_sample, cache_resource,
+		     next);
+	dev_flow->dv.sample_res = cache_resource;
+	DRV_LOG(DEBUG, "new sample resource %p: refcnt %d++",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	return 0;
+error:
+	if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+		if (cache_resource->default_miss)
+			claim_zero(mlx5_glue->destroy_flow_action
+				(cache_resource->default_miss));
+	} else {
+		if (cache_resource->sample_idx.rix_hrxq &&
+		    !mlx5_hrxq_release(dev,
+				cache_resource->sample_idx.rix_hrxq))
+			cache_resource->sample_idx.rix_hrxq = 0;
+		if (cache_resource->sample_idx.rix_tag &&
+		    !flow_dv_tag_release(dev,
+				cache_resource->sample_idx.rix_tag))
+			cache_resource->sample_idx.rix_tag = 0;
+		if (cache_resource->sample_idx.cnt) {
+			flow_dv_counter_release(dev,
+				cache_resource->sample_idx.cnt);
+			cache_resource->sample_idx.cnt = 0;
+		}
+	}
+	if (cache_resource->normal_path_tbl)
+		flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	mlx5_ipool_free(sh->ipool[MLX5_IPOOL_SAMPLE],
+				dev_flow->handle->dvh.rix_sample);
+	dev_flow->handle->dvh.rix_sample = 0;
+	return -rte_errno;
+}
+
+/**
+ * Convert Sample action to DV specification.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in] action
+ *   Pointer to action structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] sample_actions
+ *   Pointer to sample actions list.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_translate_action_sample(struct rte_eth_dev *dev,
+				const struct rte_flow_action *action,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				void **sample_actions,
+				struct mlx5_flow_dv_sample_resource *res,
+				struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	const struct rte_flow_action_sample *sample_action;
+	const struct rte_flow_action *sub_actions;
+	const struct rte_flow_action_queue *queue;
+	struct mlx5_flow_sub_actions_list *sample_act;
+	struct mlx5_flow_sub_actions_idx *sample_idx;
+	struct mlx5_flow_rss_desc *rss_desc = &((struct mlx5_flow_rss_desc *)
+					      priv->rss_desc)
+					      [!!priv->flow_nested_idx];
+	uint64_t action_flags = 0;
+
+	sample_act = &res->sample_act;
+	sample_idx = &res->sample_idx;
+	sample_action = (const struct rte_flow_action_sample *)action->conf;
+	res->ratio = sample_action->ratio;
+	sub_actions = sample_action->actions;
+	for (; sub_actions->type != RTE_FLOW_ACTION_TYPE_END; sub_actions++) {
+		int type = sub_actions->type;
+		uint32_t pre_rix = 0;
+		void *pre_r;
+		switch (type) {
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+		{
+			struct mlx5_hrxq *hrxq;
+			uint32_t hrxq_idx;
+
+			queue = sub_actions->conf;
+			rss_desc->queue_num = 1;
+			rss_desc->queue[0] = queue->index;
+			hrxq = flow_dv_handle_rx_queue(dev, dev_flow,
+					rss_desc, &hrxq_idx);
+			if (!hrxq)
+				return rte_flow_error_set
+					(error, rte_errno,
+					 RTE_FLOW_ERROR_TYPE_ACTION,
+					 NULL,
+					 "cannot create fate queue");
+			sample_act->dr_queue_action = hrxq->action;
+			sample_idx->rix_hrxq = hrxq_idx;
+			sample_actions[sample_act->actions_num++] =
+						hrxq->action;
+			action_flags |= MLX5_FLOW_ACTION_QUEUE;
+			if (action_flags & MLX5_FLOW_ACTION_MARK)
+				dev_flow->handle->rix_hrxq = hrxq_idx;
+			dev_flow->handle->fate_action =
+					MLX5_FLOW_FATE_QUEUE;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_MARK:
+		{
+			uint32_t tag_be = mlx5_flow_mark_set
+				(((const struct rte_flow_action_mark *)
+				(sub_actions->conf))->id);
+			dev_flow->handle->mark = 1;
+			pre_rix = dev_flow->handle->dvh.rix_tag;
+			/* Save the mark resource before sample */
+			pre_r = dev_flow->dv.tag_resource;
+			if (flow_dv_tag_resource_register(dev, tag_be,
+						  dev_flow, error))
+				return -rte_errno;
+			MLX5_ASSERT(dev_flow->dv.tag_resource);
+			sample_act->dr_tag_action =
+				dev_flow->dv.tag_resource->action;
+			sample_idx->rix_tag =
+				dev_flow->handle->dvh.rix_tag;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_tag_action;
+			/* Recover the mark resource after sample */
+			dev_flow->dv.tag_resource = pre_r;
+			dev_flow->handle->dvh.rix_tag = pre_rix;
+			action_flags |= MLX5_FLOW_ACTION_MARK;
+			break;
+		}
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+		{
+			uint32_t counter;
+
+			counter = flow_dv_translate_create_counter(dev,
+					dev_flow, sub_actions->conf, 0);
+			if (!counter)
+				return rte_flow_error_set
+						(error, rte_errno,
+						 RTE_FLOW_ERROR_TYPE_ACTION,
+						 NULL,
+						 "cannot create counter"
+						 " object.");
+			sample_idx->cnt = counter;
+			sample_act->dr_cnt_action =
+				  (flow_dv_counter_get_by_idx(dev,
+				  counter, NULL))->action;
+			sample_actions[sample_act->actions_num++] =
+						sample_act->dr_cnt_action;
+			action_flags |= MLX5_FLOW_ACTION_COUNT;
+			break;
+		}
+		default:
+			return rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				NULL,
+				"action not supported in sample action list");
+		}
+	}
+	sample_act->action_flags = action_flags;
+	res->ft_id = dev_flow->dv.group;
+	if (attr->transfer)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_FDB;
+	else if (attr->ingress)
+		res->ft_type = MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
+
+	return 0;
+}
+
+/**
+ * Create the sample action by registering the sample resource.
+ *
+ * @param[in] dev
+ *   Pointer to rte_eth_dev structure.
+ * @param[in, out] dev_flow
+ *   Pointer to the mlx5_flow.
+ * @param[in] attr
+ *   Pointer to the flow attributes.
+ * @param[in, out] res
+ *   Pointer to sample resource.
+ * @param[in] sample_actions
+ *   Pointer to sample path actions list.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_create_action_sample(struct rte_eth_dev *dev,
+				struct mlx5_flow *dev_flow,
+				const struct rte_flow_attr *attr,
+				struct mlx5_flow_dv_sample_resource *res,
+				void **sample_actions,
+				struct rte_flow_error *error)
+{
+	if (flow_dv_sample_resource_register(dev, attr, res, dev_flow,
+						sample_actions, error))
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION,
+					  NULL, "can't create sample action");
+	return 0;
+}
+
+/**
  * Fill the flow with DV spec, lock free
  * (mutex should be acquired by caller).
  *
@@ -8296,9 +8667,13 @@ struct field_modify_info modify_tcp[] = {
 	void *match_value = dev_flow->dv.value.buf;
 	uint8_t next_protocol = 0xff;
 	struct rte_vlan_hdr vlan = { 0 };
+	struct mlx5_flow_dv_sample_resource sample_res;
+	void *sample_actions[MLX5_DV_MAX_NUMBER_OF_ACTIONS] = {0};
+	uint32_t sample_act_pos = UINT32_MAX;
 	uint32_t table;
 	int ret = 0;
 
+	memset(&sample_res, 0, sizeof(struct mlx5_flow_dv_sample_resource));
 	mhdr_res->ft_type = attr->egress ? MLX5DV_FLOW_TABLE_TYPE_NIC_TX :
 					   MLX5DV_FLOW_TABLE_TYPE_NIC_RX;
 	ret = mlx5_flow_group_to_table(attr, dev_flow->external, attr->group,
@@ -8317,7 +8692,6 @@ struct field_modify_info modify_tcp[] = {
 		const struct rte_flow_action_rss *rss;
 		const struct rte_flow_action *action = actions;
 		const uint8_t *rss_key;
-		const struct rte_flow_action_jump *jump_data;
 		const struct rte_flow_action_meter *mtr;
 		struct mlx5_flow_tbl_resource *tbl;
 		uint32_t port_id = 0;
@@ -8325,6 +8699,7 @@ struct field_modify_info modify_tcp[] = {
 		int action_type = actions->type;
 		const struct rte_flow_action *found_action = NULL;
 		struct mlx5_flow_meter *fm = NULL;
+		uint32_t jump_group = 0;
 
 		if (!mlx5_flow_os_action_supported(action_type))
 			return rte_flow_error_set(error, ENOTSUP,
@@ -8563,9 +8938,13 @@ struct field_modify_info modify_tcp[] = {
 			action_flags |= MLX5_FLOW_ACTION_DECAP;
 			break;
 		case RTE_FLOW_ACTION_TYPE_JUMP:
-			jump_data = action->conf;
+			jump_group = ((const struct rte_flow_action_jump *)
+							action->conf)->group;
+			if (dev_flow->external && jump_group <
+					MLX5_MAX_TABLES_EXTERNAL)
+				jump_group *= MLX5_FLOW_TABLE_FACTOR;
 			ret = mlx5_flow_group_to_table(attr, dev_flow->external,
-						       jump_data->group,
+						       jump_group,
 						       !!priv->fdb_def_rule,
 						       &table, error);
 			if (ret)
@@ -8731,6 +9110,19 @@ struct field_modify_info modify_tcp[] = {
 				return -rte_errno;
 			action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP;
 			break;
+		case RTE_FLOW_ACTION_TYPE_SAMPLE:
+			sample_act_pos = actions_n;
+			ret = flow_dv_translate_action_sample(dev,
+							      actions,
+							      dev_flow, attr,
+							      sample_actions,
+							      &sample_res,
+							      error);
+			if (ret < 0)
+				return ret;
+			actions_n++;
+			action_flags |= MLX5_FLOW_ACTION_SAMPLE;
+			break;
 		case RTE_FLOW_ACTION_TYPE_END:
 			actions_end = true;
 			if (mhdr_res->actions_num) {
@@ -8757,6 +9149,21 @@ struct field_modify_info modify_tcp[] = {
 					  (flow_dv_counter_get_by_idx(dev,
 					  flow->counter, NULL))->action;
 			}
+			if (action_flags & MLX5_FLOW_ACTION_SAMPLE) {
+				ret = flow_dv_create_action_sample(dev,
+							  dev_flow, attr,
+							  &sample_res,
+							  sample_actions,
+							  error);
+				if (ret < 0)
+					return rte_flow_error_set
+						(error, rte_errno,
+						RTE_FLOW_ERROR_TYPE_ACTION,
+						NULL,
+						"cannot create sample action");
+				dev_flow->dv.actions[sample_act_pos] =
+					dev_flow->dv.sample_res->verbs_action;
+			}
 			break;
 		default:
 			break;
@@ -9068,7 +9475,8 @@ struct field_modify_info modify_tcp[] = {
 				dh->rix_hrxq = UINT32_MAX;
 				dv->actions[n++] = drop_hrxq->action;
 			}
-		} else if (dh->fate_action == MLX5_FLOW_FATE_QUEUE) {
+		} else if (dh->fate_action == MLX5_FLOW_FATE_QUEUE &&
+			   !dv_h->rix_sample) {
 			struct mlx5_hrxq *hrxq;
 			uint32_t hrxq_idx;
 			struct mlx5_flow_rss_desc *rss_desc =
@@ -9200,18 +9608,18 @@ struct field_modify_info modify_tcp[] = {
  *
  * @param dev
  *   Pointer to Ethernet device.
- * @param handle
- *   Pointer to mlx5_flow_handle.
+ * @param encap_decap_idx
+ *   Index of encap decap resource.
  *
  * @return
  *   1 while a reference on it exists, 0 when freed.
  */
 static int
 flow_dv_encap_decap_resource_release(struct rte_eth_dev *dev,
-				     struct mlx5_flow_handle *handle)
+				     uint32_t encap_decap_idx)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	uint32_t idx = handle->dvh.rix_encap_decap;
+	uint32_t idx = encap_decap_idx;
 	struct mlx5_flow_dv_encap_decap_resource *cache_resource;
 
 	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_DECAP_ENCAP],
@@ -9462,6 +9870,71 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Release a sample resource.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param handle
+ *   Pointer to mlx5_flow_handle.
+ *
+ * @return
+ *   1 while a reference on it exists, 0 when freed.
+ */
+static int
+flow_dv_sample_resource_release(struct rte_eth_dev *dev,
+				     struct mlx5_flow_handle *handle)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	uint32_t idx = handle->dvh.rix_sample;
+	struct mlx5_flow_dv_sample_resource *cache_resource;
+
+	cache_resource = mlx5_ipool_get(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			 idx);
+	if (!cache_resource)
+		return 0;
+	MLX5_ASSERT(cache_resource->verbs_action);
+	DRV_LOG(DEBUG, "sample resource %p: refcnt %d--",
+		(void *)cache_resource,
+		rte_atomic32_read(&cache_resource->refcnt));
+	if (rte_atomic32_dec_and_test(&cache_resource->refcnt)) {
+		if (cache_resource->verbs_action)
+			claim_zero(mlx5_glue->destroy_flow_action
+					(cache_resource->verbs_action));
+		if (cache_resource->ft_type == MLX5DV_FLOW_TABLE_TYPE_FDB) {
+			if (cache_resource->default_miss)
+				claim_zero(mlx5_glue->destroy_flow_action
+				  (cache_resource->default_miss));
+		}
+		if (cache_resource->normal_path_tbl)
+			flow_dv_tbl_resource_release(dev,
+				cache_resource->normal_path_tbl);
+	}
+	if (cache_resource->sample_idx.rix_hrxq &&
+		!mlx5_hrxq_release(dev,
+			cache_resource->sample_idx.rix_hrxq))
+		cache_resource->sample_idx.rix_hrxq = 0;
+	if (cache_resource->sample_idx.rix_tag &&
+		!flow_dv_tag_release(dev,
+			cache_resource->sample_idx.rix_tag))
+		cache_resource->sample_idx.rix_tag = 0;
+	if (cache_resource->sample_idx.cnt) {
+		flow_dv_counter_release(dev,
+			cache_resource->sample_idx.cnt);
+		cache_resource->sample_idx.cnt = 0;
+	}
+	if (!rte_atomic32_read(&cache_resource->refcnt)) {
+		ILIST_REMOVE(priv->sh->ipool[MLX5_IPOOL_SAMPLE],
+			     &priv->sh->sample_action_list, idx,
+			     cache_resource, next);
+		mlx5_ipool_free(priv->sh->ipool[MLX5_IPOOL_SAMPLE], idx);
+		DRV_LOG(DEBUG, "sample resource %p: removed",
+			(void *)cache_resource);
+		return 0;
+	}
+	return 1;
+}
+
+/**
  * Remove the flow from the NIC but keeps it in memory.
  * Lock free, (mutex should be acquired by caller).
  *
@@ -9540,8 +10013,11 @@ struct field_modify_info modify_tcp[] = {
 		flow->dev_handles = dev_handle->next.next;
 		if (dev_handle->dvh.matcher)
 			flow_dv_matcher_release(dev, dev_handle);
+		if (dev_handle->dvh.rix_sample)
+			flow_dv_sample_resource_release(dev, dev_handle);
 		if (dev_handle->dvh.rix_encap_decap)
-			flow_dv_encap_decap_resource_release(dev, dev_handle);
+			flow_dv_encap_decap_resource_release(dev,
+				dev_handle->dvh.rix_encap_decap);
 		if (dev_handle->dvh.modify_hdr)
 			flow_dv_modify_hdr_resource_release(dev, dev_handle);
 		if (dev_handle->dvh.rix_push_vlan)
-- 
1.8.3.1



* [dpdk-dev] [PATCH v5 7/7] app/testpmd: add testpmd command for sample action
  2020-08-27 15:01       ` [dpdk-dev] [PATCH v5 0/7] support the flow-based traffic sampling Jiawei Wang
                           ` (5 preceding siblings ...)
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 6/7] net/mlx5: update translate function for sample action Jiawei Wang
@ 2020-08-27 15:01         ` " Jiawei Wang
  2020-09-09  6:48         ` [dpdk-dev] [PATCH v6 00/12] support the flow-based traffic sampling Jiawei Wang
  7 siblings, 0 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-08-27 15:01 UTC (permalink / raw)
  To: orika, viacheslavo, matan
  Cc: dev, thomas, rasland, ian.stokes, fbl, jiaweiw, asafp

Add a new testpmd command 'set sample_actions' that configures multiple
sample action lists, each selected by its index:
set sample_actions <index> <actions list>

Examples of sample flows and their results:

1. set sample_actions 0 mark id 0x8 / queue index 2 / end
.. pattern eth / end actions sample ratio 2 index 0 / jump group 2 ...

With this flow, all matched ingress packets jump to the next flow
table, and every second packet is also marked and sent to queue 2 of
the control application.

2. ...pattern eth / end actions sample ratio 2 / port_id id 2 ...

With this flow, all matched ingress packets are sent to port 2, and
every second packet is also sent to the e-switch manager vport.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 app/test-pmd/cmdline_flow.c | 285 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 276 insertions(+), 9 deletions(-)
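
To make the 1/ratio semantics of the two examples in the commit message
concrete, here is a minimal, self-contained C sketch. It does not use the
DPDK headers: the sketch_* types and functions are hypothetical stand-ins
for struct rte_flow_action and struct rte_flow_action_sample, and the
deterministic "every ratio-th packet" selection is an assumption based on
the description of ratio 2 above. Real hardware may pick the sampled
packets differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the rte_flow types; NOT the DPDK definitions. */
enum sketch_action_type {
	SKETCH_ACTION_END = 0,
	SKETCH_ACTION_MARK,
	SKETCH_ACTION_QUEUE,
};

struct sketch_action {
	enum sketch_action_type type;
	uint32_t value; /* mark id or queue index */
};

struct sketch_sample {
	uint32_t ratio;                      /* 1/ratio of the packets are sampled */
	const struct sketch_action *actions; /* END-terminated sub-action list */
};

/* Count the sub-actions in front of the END terminator. */
static size_t
sketch_actions_count(const struct sketch_action *actions)
{
	size_t n = 0;

	while (actions[n].type != SKETCH_ACTION_END)
		n++;
	return n;
}

/*
 * Assumed deterministic model: packet number seq (counted from 1) is
 * sampled when seq is a multiple of ratio. A ratio of 1 samples everything.
 */
static int
sketch_is_sampled(const struct sketch_sample *sample, uint64_t seq)
{
	assert(sample->ratio != 0);
	return seq % sample->ratio == 0;
}

/* Number of duplicates produced for the first nb_pkts matched packets. */
static uint64_t
sketch_sampled_count(const struct sketch_sample *sample, uint64_t nb_pkts)
{
	uint64_t seq, hits = 0;

	for (seq = 1; seq <= nb_pkts; seq++)
		hits += (uint64_t)sketch_is_sampled(sample, seq);
	return hits;
}

/* Example 1: "set sample_actions 0 mark id 0x8 / queue index 2 / end". */
static const struct sketch_action sketch_sample_list0[] = {
	{ SKETCH_ACTION_MARK, 0x8 },
	{ SKETCH_ACTION_QUEUE, 2 },
	{ SKETCH_ACTION_END, 0 },
};

/* "... actions sample ratio 2 index 0 ..." refers to the list above. */
static const struct sketch_sample sketch_example1 = {
	.ratio = 2,
	.actions = sketch_sample_list0,
};
```

Under this model, sketch_sampled_count(&sketch_example1, 10) yields 5
duplicates for 10 matched packets, while a ratio of 1 duplicates all 10,
which is the full-mirroring case.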

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 6263d30..27fa294 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,8 @@ enum index {
 	SET_RAW_ENCAP,
 	SET_RAW_DECAP,
 	SET_RAW_INDEX,
+	SET_SAMPLE_ACTIONS,
+	SET_SAMPLE_INDEX,
 
 	/* Top-level command. */
 	FLOW,
@@ -358,6 +360,10 @@ enum index {
 	ACTION_SET_IPV6_DSCP_VALUE,
 	ACTION_AGE,
 	ACTION_AGE_TIMEOUT,
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_SAMPLE_INDEX_VALUE,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -493,6 +499,22 @@ struct action_nvgre_encap_data {
 
 struct mplsoudp_decap_conf mplsoudp_decap_conf;
 
+#define ACTION_SAMPLE_ACTIONS_NUM 10
+#define RAW_SAMPLE_CONFS_MAX_NUM 8
+/** Storage for struct rte_flow_action_sample including external data. */
+struct action_sample_data {
+	struct rte_flow_action_sample conf;
+	uint32_t idx;
+};
+/** Storage for struct rte_flow_action_sample. */
+struct raw_sample_conf {
+	struct rte_flow_action data[ACTION_SAMPLE_ACTIONS_NUM];
+};
+struct raw_sample_conf raw_sample_confs[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_mark sample_mark[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_queue sample_queue[RAW_SAMPLE_CONFS_MAX_NUM];
+struct rte_flow_action_count sample_count[RAW_SAMPLE_CONFS_MAX_NUM];
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -1189,6 +1211,7 @@ struct parse_action_priv {
 	ACTION_SET_IPV4_DSCP,
 	ACTION_SET_IPV6_DSCP,
 	ACTION_AGE,
+	ACTION_SAMPLE,
 	ZERO,
 };
 
@@ -1421,9 +1444,28 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_sample[] = {
+	ACTION_SAMPLE,
+	ACTION_SAMPLE_RATIO,
+	ACTION_SAMPLE_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index next_action_sample[] = {
+	ACTION_QUEUE,
+	ACTION_MARK,
+	ACTION_COUNT,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
+static int parse_set_sample_action(struct context *, const struct token *,
+				   const char *, unsigned int,
+				   void *, unsigned int);
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
@@ -1491,7 +1533,15 @@ static int parse_vc_action_raw_decap_index(struct context *,
 static int parse_vc_action_set_meta(struct context *ctx,
 				    const struct token *token, const char *str,
 				    unsigned int len, void *buf,
+					unsigned int size);
+static int parse_vc_action_sample(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
 				    unsigned int size);
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -1562,6 +1612,8 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 				    unsigned int, char *, unsigned int);
 static int comp_set_raw_index(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
+static int comp_set_sample_index(struct context *, const struct token *,
+			      unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -3703,11 +3755,13 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	/* Top level command. */
 	[SET] = {
 		.name = "set",
-		.help = "set raw encap/decap data",
-		.type = "set raw_encap|raw_decap <index> <pattern>",
+		.help = "set raw encap/decap/sample data",
+		.type = "set raw_encap|raw_decap <index> <pattern>"
+				" or set sample_actions <index> <action>",
 		.next = NEXT(NEXT_ENTRY
 			     (SET_RAW_ENCAP,
-			      SET_RAW_DECAP)),
+			      SET_RAW_DECAP,
+			      SET_SAMPLE_ACTIONS)),
 		.call = parse_set_init,
 	},
 	/* Sub-level commands. */
@@ -3738,6 +3792,23 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(next_item),
 		.call = parse_port,
 	},
+	[SET_SAMPLE_INDEX] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "index of sample actions",
+		.next = NEXT(next_action_sample),
+		.call = parse_port,
+	},
+	[SET_SAMPLE_ACTIONS] = {
+		.name = "sample_actions",
+		.help = "set sample actions list",
+		.next = NEXT(NEXT_ENTRY(SET_SAMPLE_INDEX)),
+		.args = ARGS(ARGS_ENTRY_ARB_BOUNDED
+				(offsetof(struct buffer, port),
+				 sizeof(((struct buffer *)0)->port),
+				 0, RAW_SAMPLE_CONFS_MAX_NUM - 1)),
+		.call = parse_set_sample_action,
+	},
 	[ACTION_SET_TAG] = {
 		.name = "set_tag",
 		.help = "set tag",
@@ -3841,6 +3912,37 @@ static int comp_set_raw_index(struct context *, const struct token *,
 		.next = NEXT(action_age, NEXT_ENTRY(UNSIGNED)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SAMPLE] = {
+		.name = "sample",
+		.help = "set a sample action",
+		.next = NEXT(action_sample),
+		.priv = PRIV_ACTION(SAMPLE,
+			sizeof(struct action_sample_data)),
+		.call = parse_vc_action_sample,
+	},
+	[ACTION_SAMPLE_RATIO] = {
+		.name = "ratio",
+		.help = "flow sample ratio value",
+		.next = NEXT(action_sample, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_ARB
+			     (offsetof(struct action_sample_data, conf) +
+			      offsetof(struct rte_flow_action_sample, ratio),
+			      sizeof(((struct rte_flow_action_sample *)0)->
+				     ratio))),
+	},
+	[ACTION_SAMPLE_INDEX] = {
+		.name = "index",
+		.help = "the index of sample actions list",
+		.next = NEXT(NEXT_ENTRY(ACTION_SAMPLE_INDEX_VALUE)),
+	},
+	[ACTION_SAMPLE_INDEX_VALUE] = {
+		.name = "{index}",
+		.type = "UNSIGNED",
+		.help = "unsigned integer value",
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc_action_sample_index,
+		.comp = comp_set_sample_index,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -5351,6 +5453,76 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return len;
 }
 
+static int
+parse_vc_action_sample(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_action *action;
+	struct action_sample_data *action_sample_data = NULL;
+	static struct rte_flow_action end_action = {
+		RTE_FLOW_ACTION_TYPE_END, 0
+	};
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return ret;
+	if (!out->args.vc.actions_n)
+		return -1;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data;
+	ctx->objmask = NULL;
+	/* Copy the headers to the buffer. */
+	action_sample_data = ctx->object;
+	action_sample_data->conf.actions = &end_action;
+	action->conf = &action_sample_data->conf;
+	return ret;
+}
+
+static int
+parse_vc_action_sample_index(struct context *ctx, const struct token *token,
+				const char *str, unsigned int len, void *buf,
+				unsigned int size)
+{
+	struct action_sample_data *action_sample_data;
+	struct rte_flow_action *action;
+	const struct arg *arg;
+	struct buffer *out = buf;
+	int ret;
+	uint16_t idx;
+
+	RTE_SET_USED(token);
+	RTE_SET_USED(buf);
+	RTE_SET_USED(size);
+	if (ctx->curr != ACTION_SAMPLE_INDEX_VALUE)
+		return -1;
+	arg = ARGS_ENTRY_ARB_BOUNDED
+		(offsetof(struct action_sample_data, idx),
+		 sizeof(((struct action_sample_data *)0)->idx),
+		 0, RAW_SAMPLE_CONFS_MAX_NUM - 1);
+	if (push_args(ctx, arg))
+		return -1;
+	ret = parse_int(ctx, token, str, len, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		return -1;
+	}
+	if (!ctx->object)
+		return len;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	action_sample_data = ctx->object;
+	idx = action_sample_data->idx;
+	action_sample_data->conf.actions = raw_sample_confs[idx].data;
+	action->conf = &action_sample_data->conf;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -6115,6 +6287,38 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	if (!out->command)
 		return -1;
 	out->command = ctx->curr;
+	/* For encap/decap, all we need is the pattern. */
+	out->args.vc.pattern = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
+	return len;
+}
+
+/** Parse set command, initialize output buffer for subsequent tokens. */
+static int
+parse_set_sample_action(struct context *ctx, const struct token *token,
+			  const char *str, unsigned int len,
+			  void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	/* Make sure buffer is large enough. */
+	if (size < sizeof(*out))
+		return -1;
+	ctx->objdata = 0;
+	ctx->objmask = NULL;
+	ctx->object = out;
+	if (!out->command)
+		return -1;
+	out->command = ctx->curr;
+	/* For sample_actions, all we need is the action list. */
+	out->args.vc.actions = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	return len;
 }
 
@@ -6151,11 +6355,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 			return -1;
 		out->command = ctx->curr;
 		out->args.vc.data = (uint8_t *)out + size;
-		/* All we need is pattern */
-		out->args.vc.pattern =
-			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
-					       sizeof(double));
-		ctx->object = out->args.vc.pattern;
+		ctx->object  = (void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+						       sizeof(double));
 	}
 	return len;
 }
@@ -6306,6 +6507,24 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return nb;
 }
 
+/** Complete index number for the sample actions list. */
+static int
+comp_set_sample_index(struct context *ctx, const struct token *token,
+		   unsigned int ent, char *buf, unsigned int size)
+{
+	uint16_t idx = 0;
+	uint16_t nb = 0;
+
+	RTE_SET_USED(ctx);
+	RTE_SET_USED(token);
+	for (idx = 0; idx < RAW_SAMPLE_CONFS_MAX_NUM; ++idx) {
+		if (buf && idx == ent)
+			return snprintf(buf, size, "%u", idx);
+		++nb;
+	}
+	return nb;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -6751,7 +6970,53 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	return mask;
 }
 
-
+/** Store the parsed sample actions list at the selected index. */
+static void
+cmd_set_raw_parsed_sample(const struct buffer *in)
+{
+	uint32_t n = in->args.vc.actions_n;
+	uint32_t i = 0;
+	struct rte_flow_action *action = NULL;
+	struct rte_flow_action *data = NULL;
+	size_t size = 0;
+	uint16_t idx = in->port; /* We borrow port field as index */
+	uint32_t max_size = sizeof(struct rte_flow_action) *
+						ACTION_SAMPLE_ACTIONS_NUM;
+
+	RTE_ASSERT(in->command == SET_SAMPLE_ACTIONS);
+	data = (struct rte_flow_action *)&raw_sample_confs[idx].data;
+	memset(data, 0x00, max_size);
+	for (; i < n; i++) {
+		action = in->args.vc.actions + i;
+		if (action->type == RTE_FLOW_ACTION_TYPE_END)
+			break;
+		switch (action->type) {
+		case RTE_FLOW_ACTION_TYPE_MARK:
+			size = sizeof(struct rte_flow_action_mark);
+			rte_memcpy(&sample_mark[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_mark[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_COUNT:
+			size = sizeof(struct rte_flow_action_count);
+			rte_memcpy(&sample_count[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_count[idx];
+			break;
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			size = sizeof(struct rte_flow_action_queue);
+			rte_memcpy(&sample_queue[idx],
+				(const void *)action->conf, size);
+			action->conf = &sample_queue[idx];
+			break;
+		default:
+			printf("Error - action not supported\n");
+			return;
+		}
+		rte_memcpy(data, action, sizeof(struct rte_flow_action));
+		data++;
+	}
+}
 
 /** Dispatch parsed buffer to function calls. */
 static void
@@ -6768,6 +7033,8 @@ static int comp_set_raw_index(struct context *, const struct token *,
 	uint16_t proto = 0;
 	uint16_t idx = in->port; /* We borrow port field as index */
 
+	if (in->command == SET_SAMPLE_ACTIONS)
+		return cmd_set_raw_parsed_sample(in);
 	RTE_ASSERT(in->command == SET_RAW_ENCAP ||
 		   in->command == SET_RAW_DECAP);
 	if (in->command == SET_RAW_ENCAP) {
-- 
1.8.3.1



* Re: [dpdk-dev] [PATCH v5 1/7] ethdev: introduce sample action for rte flow
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 1/7] ethdev: introduce sample action for rte flow Jiawei Wang
@ 2020-09-04  4:17           ` Ajit Khaparde
  2020-09-08 14:38             ` Ori Kam
  0 siblings, 1 reply; 129+ messages in thread
From: Ajit Khaparde @ 2020-09-04  4:17 UTC (permalink / raw)
  To: Jiawei Wang
  Cc: orika, viacheslavo, matan, dpdk-dev, Thomas Monjalon, rasland,
	ian.stokes, fbl, asafp, JP Lee, Michael Baucom, Samik Gupta

On Thu, Aug 27, 2020 at 8:23 AM Jiawei Wang <jiaweiw@nvidia.com> wrote:

> When using full offload, all traffic will be handled by the HW, and
> directed to the requested VF or wire, the control application loses
> visibility on the traffic.
> So there's a need for an action that will enable the control application
> some visibility.
>
> The solution is introduced a new action that will sample the incoming
> traffic and send a duplicated traffic with the specified ratio to the
> application, while the original packet will continue to the target
> destination.
>
> The packets sampled equals is '1/ratio', if the ratio value be set to 1,
> means that the packets would be completely mirrored. The sample packet
> can be assigned with different set of actions from the original packet.
>
> In order to support the sample packet in rte_flow, new rte_flow action
> definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
> will be introduced.
>

In use cases where sampling/mirroring is enabled to monitor security/policy
breaches and network connectivity/performance, mirroring copies traffic from
the mirrored sources and sends it to a collector destination where the
monitoring applications run.

At any given time, the number of flows to be mirrored could be high. However,
the number of collector destinations is limited, because DC operators would
monitor the copied traffic using a handful of monitoring applications.

It would therefore improve scalability if the sampling/mirroring could be
configured in two steps (similar to meter configuration). In other words, the
sampling action is configured via one API, and sampling is then enabled on a
flow via the rte_flow_create API.

We could send the proposal in the next couple of days for review.

Thanks
Ajit



> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> Acked-by: Ori Kam <orika@nvidia.com>
> Acked-by: Jerin Jacob <jerinj@marvell.com>
> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
> ---
>  doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
>  doc/guides/rel_notes/release_20_11.rst |  6 ++++++
>  lib/librte_ethdev/rte_flow.c           |  1 +
>  lib/librte_ethdev/rte_flow.h           | 30 ++++++++++++++++++++++++++++++
>  4 files changed, 62 insertions(+)
>
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 3e5cd1e..f8f3f51 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -2653,6 +2653,31 @@ timeout passed without any matching on the flow.
>     | ``context``  | user input flow context         |
>     +--------------+---------------------------------+
>
> +Action: ``SAMPLE``
> +^^^^^^^^^^^^^^^^^^
> +
> +Adds a sample action to a matched flow.
> +
> +The matching packets will be duplicated with the specified ``ratio`` and
> +applied with own set of actions with a fate action, the packets sampled
> +equals is '1/ratio'. All the packets continue to the target destination.
> +
> +When the ``ratio`` is set to 1 then the packets will be 100% mirrored.
> +``actions`` represent the different set of actions for the sampled or mirrored
> +packets, and must have a fate action.
> +
> +.. _table_rte_flow_action_sample:
> +
> +.. table:: SAMPLE
> +
> +   +--------------+---------------------------------+
> +   | Field        | Value                           |
> +   +==============+=================================+
> +   | ``ratio``    | 32 bits sample ratio value      |
> +   +--------------+---------------------------------+
> +   | ``actions``  | sub-action list for sampling    |
> +   +--------------+---------------------------------+
> +
>  Negative types
>  ~~~~~~~~~~~~~~
>
> diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
> index df227a1..7f99563 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -55,6 +55,12 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
>
> +* **Added flow-based traffic sampling support.**
> +
> +  Added new action: ``RTE_FLOW_ACTION_TYPE_SAMPLE`` to duplicate the matching
> +  packets with specified ratio, and apply with own set of actions with a fate
> +  action. When the ratio is set to 1 then the packets will be 100% mirrored.
> +
>
>  Removed Items
>  -------------
> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index f8fdd68..035671d 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -174,6 +174,7 @@ struct rte_flow_desc_data {
>         MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct rte_flow_action_set_dscp)),
>         MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct rte_flow_action_set_dscp)),
>         MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
> +       MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
>  };
>
>  int
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index da8bfa5..fa70d40 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -2132,6 +2132,14 @@ enum rte_flow_action_type {
>          * see enum RTE_ETH_EVENT_FLOW_AGED
>          */
>         RTE_FLOW_ACTION_TYPE_AGE,
> +
> +       /**
> +        * The matching packets will be duplicated with specified ratio and
> +        * applied with own set of actions with a fate action.
> +        *
> +        * See struct rte_flow_action_sample.
> +        */
> +       RTE_FLOW_ACTION_TYPE_SAMPLE,
>  };
>
>  /**
> @@ -2742,6 +2750,28 @@ struct rte_flow_action {
>  struct rte_flow;
>
>  /**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SAMPLE
> + *
> + * Adds a sample action to a matched flow.
> + *
> + * The matching packets are duplicated at the specified ratio, and the
> + * duplicates get their own set of actions with a fate action; a sampled packet
> + * can be redirected to a queue or port. All packets continue processing on the
> + * default flow path.
> + *
> + * When the sample ratio is set to 1, the packets are 100% mirrored.
> + * An additional action list can be supplied for sampled or mirrored packets.
> + */
> +struct rte_flow_action_sample {
> +       uint32_t ratio; /**< fraction of packets sampled is '1/ratio'. */
> +       const struct rte_flow_action *actions;
> +               /**< sub-action list specific for the sampling hit cases. */
> +};
> +
> +/**
>   * Verbose error types.
>   *
>   * Most of them provide the type of the object referenced by struct
> --
> 1.8.3.1
>
>

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [dpdk-dev] [PATCH v5 1/7] ethdev: introduce sample action for rte flow
  2020-09-04  4:17           ` Ajit Khaparde
@ 2020-09-08 14:38             ` Ori Kam
  0 siblings, 0 replies; 129+ messages in thread
From: Ori Kam @ 2020-09-08 14:38 UTC (permalink / raw)
  To: Ajit Khaparde, Jiawei(Jonny) Wang
  Cc: Slava Ovsiienko, Matan Azrad, dpdk-dev,
	NBU-Contact-Thomas Monjalon, Raslan Darawsheh, ian.stokes, fbl,
	Asaf Penso, JP Lee, Michael Baucom, Samik Gupta

Hi Ajit,

Sorry for not replying inline, but for some reason this thread is in HTML format.
I think the issue you are raising is a good one, but it can be solved using the new
context API that we are going to push into 20.11:
https://patches.dpdk.org/cover/73555/

This will enable creating the sample action once and reusing it later.

Best,
Ori

From: Ajit Khaparde <ajit.khaparde@broadcom.com>
Sent: Friday, September 4, 2020 7:17 AM




On Thu, Aug 27, 2020 at 8:23 AM Jiawei Wang <jiaweiw@nvidia.com> wrote:
When using full offload, all traffic is handled by the HW and directed
to the requested VF or wire, so the control application loses visibility
of the traffic.
An action is therefore needed to give the control application some
visibility.

The solution is to introduce a new action that samples the incoming
traffic and sends a duplicate of it, at the specified ratio, to the
application, while the original packet continues to the target
destination.

The fraction of packets sampled is '1/ratio'; if the ratio is set to 1,
the packets are completely mirrored. The sampled packet can be assigned
a different set of actions from the original packet.

To support the sampled packet in rte_flow, a new rte_flow action
definition, RTE_FLOW_ACTION_TYPE_SAMPLE, and a new structure,
rte_flow_action_sample, are introduced.
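
For illustration only, here is a minimal sketch of how an application might
fill in the new structure. It uses local stand-in types that mirror the two
fields added by this patch (ratio and actions) so that it compiles without
the DPDK headers; a real application would use the definitions from
rte_flow.h. The helper names has_fate_action() and build_sample_conf(), the
rejection of ratio 0, and the treatment of QUEUE as the only fate action
are assumptions of this sketch, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins mirroring the rte_flow definitions touched by this patch.
 * A real application would include <rte_flow.h> instead. */
enum flow_action_type {
	ACTION_TYPE_END,
	ACTION_TYPE_QUEUE,
	ACTION_TYPE_SAMPLE,
};

struct flow_action {
	enum flow_action_type type;
	const void *conf;
};

struct flow_action_queue {
	uint16_t index;
};

struct flow_action_sample {
	uint32_t ratio;                    /* 1/ratio of packets are sampled */
	const struct flow_action *actions; /* sub-actions for sampled packets */
};

/* Hypothetical helper: returns 1 if the END-terminated list contains a
 * fate action. Only QUEUE is recognized in this sketch. */
static int
has_fate_action(const struct flow_action *a)
{
	for (; a->type != ACTION_TYPE_END; a++)
		if (a->type == ACTION_TYPE_QUEUE)
			return 1;
	return 0;
}

/* Hypothetical helper: fills 's' and returns 0, or returns -1 if the
 * ratio is 0 (assumed invalid here) or the list lacks a fate action. */
static int
build_sample_conf(struct flow_action_sample *s, uint32_t ratio,
		  const struct flow_action *sub)
{
	assert(s != NULL);
	if (ratio == 0 || sub == NULL || !has_fate_action(sub))
		return -1;
	s->ratio = ratio;
	s->actions = sub;
	return 0;
}
```

With a sub-action list of { QUEUE, END } and ratio 1, the resulting
configuration describes full mirroring to the given queue; the sample action
itself would then be placed in the flow's main action list.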

In use cases where sampling/mirroring is enabled for monitoring security/policy breaches
and network connectivity/performance, mirroring copies traffic from mirrored sources
and sends it to a collector destination where monitoring applications run.

At any given time, the number of flows to be mirrored could be high; however,
the number of collector destinations is limited because DC operators would monitor
the copied traffic using a handful of monitoring applications.

Therefore it would improve scalability if we could configure the sampling/mirroring in 2 steps
(something similar to meter configuration). In other words, the sampling action is configured via
one API and the sampling is enabled on a flow via the rte_flow_create API.
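
The two-step idea above could be sketched roughly as follows. This is not
the API from the proposal (which is not shown in this thread); every name
here (struct sampler, sampler_create(), flow_attach_sampler(),
sampler_refcnt(), MAX_SAMPLERS) is hypothetical and only illustrates many
flows sharing a small table of pre-configured samplers.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SAMPLERS 8 /* assumption: only a handful of collectors */

/* Hypothetical shared sampler object, configured once. */
struct sampler {
	uint32_t ratio;      /* 1/ratio of packets are sampled */
	uint16_t dest_queue; /* collector destination (hypothetical) */
	uint32_t refcnt;     /* number of flows using this sampler */
	int in_use;
};

static struct sampler samplers[MAX_SAMPLERS];

/* Step 1: configure a sampler once; returns its ID, or -1 on failure. */
static int
sampler_create(uint32_t ratio, uint16_t dest_queue)
{
	if (ratio == 0)
		return -1;
	for (int i = 0; i < MAX_SAMPLERS; i++) {
		if (!samplers[i].in_use) {
			samplers[i].ratio = ratio;
			samplers[i].dest_queue = dest_queue;
			samplers[i].refcnt = 0;
			samplers[i].in_use = 1;
			return i;
		}
	}
	return -1;
}

/* Step 2: at flow creation, reference an existing sampler by ID. */
static int
flow_attach_sampler(int id)
{
	if (id < 0 || id >= MAX_SAMPLERS || !samplers[id].in_use)
		return -1;
	samplers[id].refcnt++;
	return 0;
}

/* Number of flows currently sharing the sampler. */
static uint32_t
sampler_refcnt(int id)
{
	assert(id >= 0 && id < MAX_SAMPLERS);
	return samplers[id].refcnt;
}
```

In this sketch the per-flow cost is a single ID lookup, while the sampler
configuration itself is stored once per collector destination.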

We could send the proposal in the next couple of days for review.

Thanks
Ajit



Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/prog_guide/rte_flow.rst     | 25 +++++++++++++++++++++++++
 doc/guides/rel_notes/release_20_11.rst |  6 ++++++
 lib/librte_ethdev/rte_flow.c           |  1 +
 lib/librte_ethdev/rte_flow.h           | 30 ++++++++++++++++++++++++++++++
 4 files changed, 62 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e5cd1e..f8f3f51 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -2653,6 +2653,31 @@ timeout passed without any matching on the flow.
    | ``context``  | user input flow context         |
    +--------------+---------------------------------+

+Action: ``SAMPLE``
+^^^^^^^^^^^^^^^^^^
+
+Adds a sample action to a matched flow.
+
+The matching packets are duplicated at the specified ``ratio``, and the
+duplicates get their own set of actions with a fate action. The fraction of
+packets sampled is '1/ratio'. All packets continue to the target destination.
+
+When the ``ratio`` is set to 1, the packets are 100% mirrored.
+``actions`` is the separate set of actions applied to the sampled or mirrored
+packets, and it must include a fate action.
+
+.. _table_rte_flow_action_sample:
+
+.. table:: SAMPLE
+
+   +--------------+---------------------------------+
+   | Field        | Value                           |
+   +==============+=================================+
+   | ``ratio``    | 32 bits sample ratio value      |
+   +--------------+---------------------------------+
+   | ``actions``  | sub-action list for sampling    |
+   +--------------+---------------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~

diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index df227a1..7f99563 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================

+* **Added flow-based traffic sampling support.**
+
+  Added a new action, ``RTE_FLOW_ACTION_TYPE_SAMPLE``, to duplicate matching
+  packets at the specified ratio and apply a separate set of actions, with a
+  fate action, to the copies. When the ratio is 1, packets are 100% mirrored.
+

 Removed Items
 -------------
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index f8fdd68..035671d 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -174,6 +174,7 @@ struct rte_flow_desc_data {
        MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct rte_flow_action_set_dscp)),
        MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct rte_flow_action_set_dscp)),
        MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
+       MK_FLOW_ACTION(SAMPLE, sizeof(struct rte_flow_action_sample)),
 };

 int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5..fa70d40 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -2132,6 +2132,14 @@ enum rte_flow_action_type {
         * see enum RTE_ETH_EVENT_FLOW_AGED
         */
        RTE_FLOW_ACTION_TYPE_AGE,
+
+       /**
+        * The matching packets are duplicated at the specified ratio, and the
+        * duplicates get their own set of actions with a fate action.
+        *
+        * See struct rte_flow_action_sample.
+        */
+       RTE_FLOW_ACTION_TYPE_SAMPLE,
 };

 /**
@@ -2742,6 +2750,28 @@ struct rte_flow_action {
 struct rte_flow;

 /**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SAMPLE
+ *
+ * Adds a sample action to a matched flow.
+ *
+ * The matching packets are duplicated at the specified ratio, and the
+ * duplicates get their own set of actions with a fate action; a sampled packet
+ * can be redirected to a queue or port. All packets continue processing on the
+ * default flow path.
+ *
+ * When the sample ratio is set to 1, the packets are 100% mirrored.
+ * An additional action list can be supplied for sampled or mirrored packets.
+ */
+struct rte_flow_action_sample {
+       uint32_t ratio; /**< fraction of packets sampled is '1/ratio'. */
+       const struct rte_flow_action *actions;
+               /**< sub-action list specific for the sampling hit cases. */
+};
+
+/**
  * Verbose error types.
  *
  * Most of them provide the type of the object referenced by struct
--
1.8.3.1

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [dpdk-dev] [PATCH v6 00/12] support the flow-based traffic sampling
  2020-08-27 15:01       ` [dpdk-dev] [PATCH v5 0/7] support the flow-based traffic sampling Jiawei Wang
                           ` (6 preceding siblings ...)
  2020-08-27 15:01         ` [dpdk-dev] [PATCH v5 7/7] app/testpmd: add testpmd command " Jiawei Wang
@ 2020-09-09  6:48         ` Jiawei Wang
  2020-09-09  6:48           ` [dpdk-dev] [PATCH v6 01/12] ethdev: introduce sample action for rte flow Jiawei Wang
                             ` (12 more replies)
  7 siblings, 13 replies; 129+ messages in thread
From: Jiawei Wang @ 2020-09-09  6:48 UTC (permalink / raw)
  To: orika, viacheslavo, matan, thomas, ferruh.yigit, marko.kovacevic,
	arybchenko
  Cc: dev, rasland, ian.stokes, fbl, asafp

This patch set implements flow sampling for the mlx5 driver.

The solution introduces a new rte_flow action that samples the incoming traffic and sends a duplicate of it, at the specified ratio, to the application, while the original packet continues to the target destination.

If the sample ratio is set to 1, the packets are completely mirrored. The sampled packet can be assigned a different set of actions from the original packet.

The MLX5 PMD is responsible for validating and translating the sample action when a flow is created.

v6:
* Update the function that restores the vport through metadata register c0 for the FDB sampler.
* Add multiple destination support.
* Support the remote mirroring with different encapsulation header.
* Fix coverity error.

v5:
* Add the release note.
* Remove Make changes since it's deprecated.

v4:
* Rebase.
* Fix the coding style issue.

v3:
* Remove 'const' of ratio field.
* Update description and commit messages.

v2:
* Rebase patches based on the latest code.
* Update rte_flow and release documents.
* Fix the compile error.
* Removed unnecessary change in [PATCH 7/8] net/mlx5: update the metadata register c0 support since FDB will use 5-tuple to do match.
* Update changes based on the comments.

Jiawei Wang (12):
  ethdev: introduce sample action for rte flow
  common/mlx5: glue for sample action
  common/mlx5: query sampler object capability via DevX
  net/mlx5: add the validate sample action
  net/mlx5: split sample flow into two sub flows
  net/mlx5: update translate function for sample action
  app/testpmd: add testpmd command for sample action
  common/mlx5: add glue function for mirroring
  net/mlx5: update validation for mirroring flow
  net/mlx5: update translate function for mirror
  app/testpmd: add port and encap support for sample action
  net/mlx5: support the native port id actions for mirroring

 app/test-pmd/cmdline_flow.c            |  301 ++++++++-
 doc/guides/prog_guide/rte_flow.rst     |   25 +
 doc/guides/rel_notes/release_20_11.rst |    6 +
 drivers/common/mlx5/linux/meson.build  |    4 +
 drivers/common/mlx5/linux/mlx5_glue.c  |   52 ++
 drivers/common/mlx5/linux/mlx5_glue.h  |   18 +
 drivers/common/mlx5/mlx5_devx_cmds.c   |   27 +
 drivers/common/mlx5/mlx5_devx_cmds.h   |    1 +
 drivers/common/mlx5/mlx5_prm.h         |   61 ++
 drivers/net/mlx5/linux/mlx5_os.c       |   14 +
 drivers/net/mlx5/mlx5.c                |   22 +
 drivers/net/mlx5/mlx5.h                |    6 +
 drivers/net/mlx5/mlx5_flow.c           |  344 +++++++++-
 drivers/net/mlx5/mlx5_flow.h           |   76 ++-
 drivers/net/mlx5/mlx5_flow_dv.c        | 1125 +++++++++++++++++++++++++++++++-
 lib/librte_ethdev/rte_flow.c           |    1 +
 lib/librte_ethdev/rte_flow.h           |   30 +
 17 files changed, 2060 insertions(+), 53 deletions(-)

-- 
1.8.3.1