* [RFC 0/5] add new port affinity item and affinity in Tx queue API
@ 2022-12-21 10:29 Jiawei Wang
  2022-12-21 10:29 ` [RFC 1/5] ethdev: add port affinity match item Jiawei Wang
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Jiawei Wang @ 2022-12-21 10:29 UTC
  To: viacheslavo, orika, thomas; +Cc: dev, rasland
When multiple hardware ports are connected to a single DPDK port (mhpsdp),
there is currently no information indicating which hardware port a
received packet came from.
This series introduces a new port affinity item in the rte_flow API; the
port affinity value identifies the physical port that received the packet.
Example of matching the affinity item:
	testpmd> flow create 0 ingress group 0 pattern port_affinity affinity is 1 /
	end actions queue index 0 / end
This series also adds a tx_affinity setting to the Tx queue API; the
affinity value selects the hardware port from which packets are sent.
The testpmd command format is as follows:
	testpmd> port config (port_id) txq (queue_id) affinity (value)
When the port affinity is used as a matching item in a flow and the same
affinity is set on a Tx queue, a packet can be sent from the same hardware
port on which it was received.
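To illustrate the intended use, here is a minimal standalone sketch (not
DPDK code) of how an application might pick a Tx queue bound to the same
hardware port a packet arrived on. The per-queue table and the helper name
pick_txq_for_affinity are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: tx_affinity[q] holds the affinity configured on
 * TxQ q. Return the first queue bound to the hardware port identified by
 * rx_affinity, or -1 if no queue is bound to that port.
 */
static int
pick_txq_for_affinity(const uint8_t *tx_affinity, size_t nb_txq,
		      uint8_t rx_affinity)
{
	size_t q;

	assert(tx_affinity != NULL);
	for (q = 0; q < nb_txq; q++)
		if (tx_affinity[q] == rx_affinity)
			return (int)q;
	return -1;
}
```

With queues 0 and 1 set to affinity 1 and queues 2 and 3 set to affinity 2,
a packet matched with affinity 2 would be forwarded on TxQ 2 in this sketch.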
Jiawei Wang (5):
  ethdev: add port affinity match item
  ethdev: introduce the affinity field in Tx queue API
  drivers: add lag Rx port affinity in PRM
  net/mlx5: add port affinity item support
  drivers: enhance the Tx queue affinity
 app/test-pmd/cmdline.c                      | 84 ++++++++++++++++++
 app/test-pmd/cmdline_flow.c                 | 29 +++++++
 devtools/libabigail.abignore                |  5 ++
 doc/guides/nics/features/mlx5.ini           |  1 +
 doc/guides/nics/mlx5.rst                    |  4 +-
 doc/guides/prog_guide/rte_flow.rst          |  7 ++
 doc/guides/rel_notes/release_22_03.rst      |  9 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 17 ++++
 drivers/common/mlx5/mlx5_devx_cmds.c        |  3 +
 drivers/common/mlx5/mlx5_devx_cmds.h        |  1 +
 drivers/common/mlx5/mlx5_prm.h              | 15 ++--
 drivers/net/mlx5/linux/mlx5_os.c            |  6 ++
 drivers/net/mlx5/mlx5.c                     | 43 +++++-----
 drivers/net/mlx5/mlx5.h                     |  3 +
 drivers/net/mlx5/mlx5_devx.c                | 21 +++--
 drivers/net/mlx5/mlx5_flow.h                |  3 +
 drivers/net/mlx5/mlx5_flow_dv.c             | 95 +++++++++++++++++++++
 drivers/net/mlx5/mlx5_flow_hw.c             | 14 +++
 drivers/net/mlx5/mlx5_tx.h                  |  1 +
 drivers/net/mlx5/mlx5_txq.c                 |  9 ++
 lib/ethdev/rte_ethdev.h                     |  1 +
 lib/ethdev/rte_flow.c                       |  1 +
 lib/ethdev/rte_flow.h                       | 28 ++++++
 23 files changed, 357 insertions(+), 43 deletions(-)
-- 
2.18.1
* [RFC 1/5] ethdev: add port affinity match item
  2022-12-21 10:29 [RFC 0/5] add new port affinity item and affinity in Tx queue API Jiawei Wang
@ 2022-12-21 10:29 ` Jiawei Wang
  2023-01-11 16:41   ` Ori Kam
  2023-01-18 11:07   ` Thomas Monjalon
  2022-12-21 10:29 ` [RFC 2/5] ethdev: introduce the affinity field in Tx queue API Jiawei Wang
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Jiawei Wang @ 2022-12-21 10:29 UTC
  To: viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
	Ferruh Yigit, Andrew Rybchenko
  Cc: dev, rasland
When multiple hardware ports are connected to a single DPDK port (mhpsdp),
there is currently no information indicating which hardware port a
received packet came from.
This patch introduces a new port affinity item in the rte_flow API; the
port affinity value identifies the physical port that received the packet.
When the port affinity is used as a matching item in a flow and the same
affinity is set on a Tx queue, a packet can be sent from the same hardware
port on which it was received.
This patch also adds the testpmd command line to match the new item:
	flow create 0 ingress group 0 pattern port_affinity affinity is 1 /
	end actions queue index 0 / end
The above command creates a flow on a single DPDK port that matches
packets from the first physical port (assuming affinity 1 stands for the
first port) and redirects them to RxQ 0.
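The matching rule can be modelled with the usual rte_flow spec/mask
semantics: only the bits set in the mask are compared. The sketch below is
a standalone model, not the PMD implementation; struct port_affinity_item
is a local stand-in for the proposed rte_flow_item_port_affinity, and
port_affinity_matches is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in mirroring the structure proposed in this patch. */
struct port_affinity_item {
	uint8_t affinity;
};

/* Same value as the proposed default mask: all 8 bits are significant. */
static const struct port_affinity_item default_mask = { .affinity = 0xff };

/* Return true if the packet's affinity matches spec under mask. */
static bool
port_affinity_matches(uint8_t pkt_affinity,
		      const struct port_affinity_item *spec,
		      const struct port_affinity_item *mask)
{
	assert(spec != NULL);
	if (mask == NULL)
		mask = &default_mask; /* fall back to the default mask */
	return (pkt_affinity & mask->affinity) ==
	       (spec->affinity & mask->affinity);
}
```

With the default mask, "affinity is 1" matches only packets whose affinity
is exactly 1.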
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
---
 app/test-pmd/cmdline_flow.c                 | 29 +++++++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst          |  7 +++++
 doc/guides/rel_notes/release_22_03.rst      |  5 ++++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +++
 lib/ethdev/rte_flow.c                       |  1 +
 lib/ethdev/rte_flow.h                       | 28 ++++++++++++++++++++
 6 files changed, 74 insertions(+)
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 426585387f..3bc19e112a 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -514,6 +514,8 @@ enum index {
 	ITEM_QUOTA,
 	ITEM_QUOTA_STATE,
 	ITEM_QUOTA_STATE_NAME,
+	ITEM_PORT_AFFINITY,
+	ITEM_PORT_AFFINITY_VALUE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -1490,6 +1492,7 @@ static const enum index next_item[] = {
 	ITEM_PPP,
 	ITEM_METER,
 	ITEM_QUOTA,
+	ITEM_PORT_AFFINITY,
 	END_SET,
 	ZERO,
 };
@@ -1976,6 +1979,12 @@ static const enum index item_quota[] = {
 	ZERO,
 };
 
+static const enum index item_port_affinity[] = {
+	ITEM_PORT_AFFINITY_VALUE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -7239,6 +7248,23 @@ static const struct token token_list[] = {
 				ARGS_ENTRY(struct buffer, port)),
 		.call = parse_mp,
 	},
+	[ITEM_PORT_AFFINITY] = {
+		.name = "port_affinity",
+		.help = "match on the physical port affinity of the"
+			" received packet.",
+		.priv = PRIV_ITEM(PORT_AFFINITY,
+				  sizeof(struct rte_flow_item_port_affinity)),
+		.next = NEXT(item_port_affinity),
+		.call = parse_vc,
+	},
+	[ITEM_PORT_AFFINITY_VALUE] = {
+		.name = "affinity",
+		.help = "port affinity value",
+		.next = NEXT(item_port_affinity, NEXT_ENTRY(COMMON_UNSIGNED),
+			     item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port_affinity,
+					affinity)),
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -12329,6 +12355,9 @@ flow_item_default_mask(const struct rte_flow_item *item)
 	case RTE_FLOW_ITEM_TYPE_METER_COLOR:
 		mask = &rte_flow_item_meter_color_mask;
 		break;
+	case RTE_FLOW_ITEM_TYPE_PORT_AFFINITY:
+		mask = &rte_flow_item_port_affinity_mask;
+		break;
 	default:
 		break;
 	}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 59932e82a6..dbf0e9a41f 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1558,6 +1558,13 @@ Matches Color Marker set by a Meter.
 
 - ``color``: Metering color marker.
 
+Item: ``PORT_AFFINITY``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Matches on the physical port affinity of the received packet.
+
+- ``affinity``: Physical port affinity.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_22_03.rst b/doc/guides/rel_notes/release_22_03.rst
index 0923707cb8..8acd3174f6 100644
--- a/doc/guides/rel_notes/release_22_03.rst
+++ b/doc/guides/rel_notes/release_22_03.rst
@@ -58,6 +58,11 @@ New Features
   Added ``gre_option`` item in rte_flow to support checksum/key/sequence
   matching in GRE packets.
 
+* **Added rte_flow support for matching Port Affinity fields.**
+
+  Added ``port_affinity`` item in rte_flow to support matching on the
+  hardware port affinity of packets.
+
 * **Added new RSS offload types for L2TPv2 in RSS flow.**
 
   Added ``RTE_ETH_RSS_L2TPV2`` macro so that he L2TPv2 session ID field can be used as
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index f497bba26d..c0ace56c1f 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3722,6 +3722,10 @@ This section lists supported pattern items and their attributes, if any.
 
   - ``color {value}``: meter color value (green/yellow/red).
 
+- ``port_affinity``: match port affinity.
+
+  - ``affinity {value}``: port affinity value.
+
 - ``send_to_kernel``: send packets to kernel.
 
 
diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 07b9ea48a9..645f392b24 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -162,6 +162,7 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(PPP, sizeof(struct rte_flow_item_ppp)),
 	MK_FLOW_ITEM(METER_COLOR, sizeof(struct rte_flow_item_meter_color)),
 	MK_FLOW_ITEM(QUOTA, sizeof(struct rte_flow_item_quota)),
+	MK_FLOW_ITEM(PORT_AFFINITY, sizeof(struct rte_flow_item_port_affinity)),
 };
 
 /** Generate flow_action[] entry. */
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 21f7caf540..7907b7c0c2 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -667,6 +667,13 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_sft.
 	 */
 	RTE_FLOW_ITEM_TYPE_SFT,
+
+	/**
+	 * Matches on the physical port affinity of the received packet.
+	 *
+	 * See struct rte_flow_item_port_affinity.
+	 */
+	RTE_FLOW_ITEM_TYPE_PORT_AFFINITY,
 };
 
 /**
@@ -2227,6 +2234,27 @@ static const struct rte_flow_item_meter_color rte_flow_item_meter_color_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_PORT_AFFINITY
+ *
+ * For the multiple hardware ports connect to a single DPDK port (mhpsdp),
+ * use this item to match the hardware port affinity of the packets.
+ */
+struct rte_flow_item_port_affinity {
+	uint8_t affinity; /**< port affinity value. */
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_PORT_AFFINITY. */
+#ifndef __cplusplus
+static const struct rte_flow_item_port_affinity
+rte_flow_item_port_affinity_mask = {
+	.affinity = 0xff,
+};
+#endif
+
 /**
  * Action types.
  *
-- 
2.18.1
* [RFC 2/5] ethdev: introduce the affinity field in Tx queue API
  2022-12-21 10:29 [RFC 0/5] add new port affinity item and affinity in Tx queue API Jiawei Wang
  2022-12-21 10:29 ` [RFC 1/5] ethdev: add port affinity match item Jiawei Wang
@ 2022-12-21 10:29 ` Jiawei Wang
  2023-01-11 16:47   ` Ori Kam
  2023-01-18 11:37   ` Thomas Monjalon
  2022-12-21 10:29 ` [RFC 3/5] drivers: add lag Rx port affinity in PRM Jiawei Wang
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Jiawei Wang @ 2022-12-21 10:29 UTC
  To: viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
	Ferruh Yigit, Andrew Rybchenko
  Cc: dev, rasland
When multiple hardware ports are connected to a single DPDK port (mhpsdp),
the previous patch introduces a new rte_flow item to match the port
affinity of received packets.
This patch adds a tx_affinity setting to the Tx queue API; the affinity
value selects the hardware port from which packets are sent.
The new tx_affinity field is added in the padding hole of the
rte_eth_txconf structure, so the size of rte_eth_txconf stays the same.
A suppression rule for the structure change is added to the ABI check file.
This patch adds the testpmd command line:
testpmd> port config (port_id) txq (queue_id) affinity (value)
For example, if two hardware ports are connected to a single DPDK
port (port id 0), affinity 1 stands for hardware port 1, and affinity
2 stands for hardware port 2, the commands below configure the
Tx affinity of each TxQ:
	port config 0 txq 0 affinity 1
	port config 0 txq 1 affinity 1
	port config 0 txq 2 affinity 2
	port config 0 txq 3 affinity 2
These commands configure TxQ 0 and TxQ 1 with affinity 1, so packets
sent on TxQ 0 or TxQ 1 leave through hardware port 1. Likewise, packets
sent on TxQ 2 or TxQ 3 leave through hardware port 2.
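The padding-hole argument can be checked at compile time. The sketch below
uses two simplified stand-in structures (not the real rte_eth_txconf
layout) to show that a uint8_t placed after another uint8_t and before a
uint64_t consumes existing padding, so neither the structure size nor the
offset of the following field changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the structure before the change. */
struct txconf_before {
	uint16_t tx_rs_thresh;
	uint16_t tx_free_thresh;
	uint8_t tx_deferred_start;
	/* compiler-inserted padding up to the alignment of uint64_t */
	uint64_t offloads;
};

/* Same structure with the new field placed in the former padding. */
struct txconf_after {
	uint16_t tx_rs_thresh;
	uint16_t tx_free_thresh;
	uint8_t tx_deferred_start;
	uint8_t tx_affinity; /* occupies one byte of the padding hole */
	uint64_t offloads;
};

static_assert(sizeof(struct txconf_before) == sizeof(struct txconf_after),
	      "structure size must not change");
static_assert(offsetof(struct txconf_before, offloads) ==
	      offsetof(struct txconf_after, offloads),
	      "offset of the following field must not change");
```

The libabigail rule is still needed because the ABI checker reports any
inserted data member, even one that only fills padding.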
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
---
 app/test-pmd/cmdline.c                      | 84 +++++++++++++++++++++
 devtools/libabigail.abignore                |  5 ++
 doc/guides/rel_notes/release_22_03.rst      |  4 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 13 ++++
 lib/ethdev/rte_ethdev.h                     |  1 +
 5 files changed, 107 insertions(+)
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b32dc8bfd4..683cae1cab 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -764,6 +764,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 
 			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
 			"    Cleanup txq mbufs for a specific Tx queue\n\n"
+
+			"port config <port_id> txq <queue_id> affinity <value>\n"
+			"    Set the port affinity value "
+			"on a specific Tx queue\n\n"
 		);
 	}
 
@@ -12621,6 +12625,85 @@ static cmdline_parse_inst_t cmd_show_port_flow_transfer_proxy = {
 	}
 };
 
+/* *** configure port txq affinity value *** */
+struct cmd_config_tx_affinity {
+	cmdline_fixed_string_t port;
+	cmdline_fixed_string_t config;
+	portid_t portid;
+	cmdline_fixed_string_t txq;
+	uint16_t qid;
+	cmdline_fixed_string_t affinity;
+	uint16_t value;
+};
+
+static void
+cmd_config_tx_affinity_parsed(void *parsed_result,
+				__rte_unused struct cmdline *cl,
+				 __rte_unused void *data)
+{
+	struct cmd_config_tx_affinity *res = parsed_result;
+	struct rte_port *port;
+
+	if (port_id_is_invalid(res->portid, ENABLED_WARN))
+		return;
+
+	if (res->portid == (portid_t)RTE_PORT_ALL) {
+		printf("Invalid port id\n");
+		return;
+	}
+
+	port = &ports[res->portid];
+
+	if (strcmp(res->txq, "txq")) {
+		printf("Unknown parameter\n");
+		return;
+	}
+	if (tx_queue_id_is_invalid(res->qid))
+		return;
+
+	port->txq[res->qid].conf.tx_affinity = res->value;
+
+	cmd_reconfig_device_queue(res->portid, 0, 1);
+}
+
+cmdline_parse_token_string_t cmd_config_tx_affinity_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity,
+				 port, "port");
+cmdline_parse_token_string_t cmd_config_tx_affinity_config =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity,
+				 config, "config");
+cmdline_parse_token_num_t cmd_config_tx_affinity_portid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity,
+				 portid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_affinity_txq =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity,
+				 txq, "txq");
+cmdline_parse_token_num_t cmd_config_tx_affinity_qid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity,
+			      qid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_affinity_affine =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity,
+				 affinity, "affinity");
+cmdline_parse_token_num_t cmd_config_tx_affinity_value =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity,
+			      value, RTE_UINT16);
+
+static cmdline_parse_inst_t cmd_config_tx_affinity = {
+	.f = cmd_config_tx_affinity_parsed,
+	.data = (void *)0,
+	.help_str = "port config <port_id> txq <queue_id> affinity <value>",
+	.tokens = {
+		(void *)&cmd_config_tx_affinity_port,
+		(void *)&cmd_config_tx_affinity_config,
+		(void *)&cmd_config_tx_affinity_portid,
+		(void *)&cmd_config_tx_affinity_txq,
+		(void *)&cmd_config_tx_affinity_qid,
+		(void *)&cmd_config_tx_affinity_affine,
+		(void *)&cmd_config_tx_affinity_value,
+		NULL,
+	},
+};
+
 /* ******************************************************************************** */
 
 /* list of instructions */
@@ -12851,6 +12934,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_capability,
 	(cmdline_parse_inst_t *)&cmd_set_flex_is_pattern,
 	(cmdline_parse_inst_t *)&cmd_set_flex_spec_pattern,
+	(cmdline_parse_inst_t *)&cmd_config_tx_affinity,
 	NULL,
 };
 
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 7a93de3ba1..cbbde4ef05 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -20,6 +20,11 @@
 [suppress_file]
         soname_regexp = ^librte_.*mlx.*glue\.
 
+; Ignore fields inserted in middle padding of rte_eth_txconf
+[suppress_type]
+        name = rte_eth_txconf
+        has_data_member_inserted_between = {offset_after(tx_deferred_start), offset_of(offloads)}
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Experimental APIs exceptions ;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
diff --git a/doc/guides/rel_notes/release_22_03.rst b/doc/guides/rel_notes/release_22_03.rst
index 8acd3174f6..0d81ae7bc3 100644
--- a/doc/guides/rel_notes/release_22_03.rst
+++ b/doc/guides/rel_notes/release_22_03.rst
@@ -212,6 +212,10 @@ API Changes
 * ethdev: Old public macros and enumeration constants without ``RTE_ETH_`` prefix,
   which are kept for backward compatibility, are marked as deprecated.
 
+* ethdev: added a new field:
+
+  - Tx affinity per-queue ``rte_eth_txconf.tx_affinity``
+
 * cryptodev: The asymmetric session handling was modified to use a single
   mempool object. An API ``rte_cryptodev_asym_session_pool_create`` was added
   to create a mempool with element size big enough to hold the generic asymmetric
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index c0ace56c1f..0c3317ee06 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only on a specific Tx queue::
 
 This command should be run when the port is stopped, or else it will fail.
 
+config per queue Tx affinity
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configure the affinity value of a specific Tx queue::
+
+   testpmd> port config (port_id) txq (queue_id) affinity (value)
+
+* ``affinity``: the hardware port from which packets are sent.
+                Used when multiple hardware ports are connected to
+                a single DPDK port (mhpsdp).
+
+This command should be run when the port is stopped, or else it will fail.
+
 Config VXLAN Encap outer layers
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 882ca585f2..813dbb34b5 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1138,6 +1138,7 @@ struct rte_eth_txconf {
 				      less free descriptors than this value. */
 
 	uint8_t tx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
+	uint8_t tx_affinity; /**< Drives the setting of affinity per-queue. */
 	/**
 	 * Per-queue Tx offloads to be set  using RTE_ETH_TX_OFFLOAD_* flags.
 	 * Only offloads set on tx_queue_offload_capa or tx_offload_capa
-- 
2.18.1
* [RFC 3/5] drivers: add lag Rx port affinity in PRM
  2022-12-21 10:29 [RFC 0/5] add new port affinity item and affinity in Tx queue API Jiawei Wang
  2022-12-21 10:29 ` [RFC 1/5] ethdev: add port affinity match item Jiawei Wang
  2022-12-21 10:29 ` [RFC 2/5] ethdev: introduce the affinity field in Tx queue API Jiawei Wang
@ 2022-12-21 10:29 ` Jiawei Wang
  2022-12-21 10:29 ` [RFC 4/5] net/mlx5: add port affinity item support Jiawei Wang
  2022-12-21 10:29 ` [RFC 5/5] drivers: enhance the Tx queue affinity Jiawei Wang
  4 siblings, 0 replies; 16+ messages in thread
From: Jiawei Wang @ 2022-12-21 10:29 UTC
  To: viacheslavo, orika, thomas, Matan Azrad; +Cc: dev, rasland
This patch adds a query of the HCA capability lag_rx_port_affinity
via DevX.
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c | 3 +++
 drivers/common/mlx5/mlx5_devx_cmds.h | 1 +
 drivers/common/mlx5/mlx5_prm.h       | 7 +++++--
 drivers/net/mlx5/linux/mlx5_os.c     | 4 ++++
 drivers/net/mlx5/mlx5.h              | 2 ++
 5 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 9e0b26fa11..16e9e38b0b 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -1244,6 +1244,9 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 	attr->outer_ipv4_ihl = MLX5_GET
 		(flow_table_nic_cap, hcattr,
 		 ft_field_support_2_nic_receive.outer_ipv4_ihl);
+	attr->lag_rx_port_affinity = MLX5_GET
+		(flow_table_nic_cap, hcattr,
+		 ft_field_support_2_nic_receive.lag_rx_port_affinity);
 	/* Query HCA offloads for Ethernet protocol. */
 	hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
 			MLX5_GET_HCA_CAP_OP_MOD_ETHERNET_OFFLOAD_CAPS |
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 1c86426e71..4fe0915a65 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -318,6 +318,7 @@ struct mlx5_hca_attr {
 	uint32_t flow_counter_access_aso:1;
 	uint32_t flow_access_aso_opc_mod:8;
 	uint32_t cross_vhca:1;
+	uint32_t lag_rx_port_affinity:1;
 };
 
 /* LAG Context. */
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 5b84657e08..9098b0fe0b 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -905,7 +905,8 @@ struct mlx5_ifc_fte_match_set_misc_bits {
 	u8 vxlan_vni[0x18];
 	u8 reserved_at_b8[0x8];
 	u8 geneve_vni[0x18];
-	u8 reserved_at_e4[0x6];
+	u8 lag_rx_port_affinity[0x4];
+	u8 reserved_at_e8[0x2];
 	u8 geneve_tlv_option_0_exist[0x1];
 	u8 geneve_oam[0x1];
 	u8 reserved_at_e0[0xc];
@@ -2055,7 +2056,9 @@ struct mlx5_ifc_ft_fields_support_bits {
  * Table 1872 - Flow Table Fields Supported 2 Format
  */
 struct mlx5_ifc_ft_fields_support_2_bits {
-	u8 reserved_at_0[0xd];
+	u8 reserved_at_0[0xa];
+	u8 lag_rx_port_affinity[0x1];
+	u8 reserved_at_c[0x2];
 	u8 hash_result[0x1];
 	u8 reserved_at_e[0x1];
 	u8 tunnel_header_2_3[0x1];
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index d48b9b68ac..3fea72013f 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1402,6 +1402,10 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 				DRV_LOG(DEBUG, "DV flow is not supported!");
 		}
 #endif
+		if (hca_attr->lag_rx_port_affinity) {
+			sh->lag_rx_port_affinity_en = 1;
+			DRV_LOG(DEBUG, "LAG Rx Port Affinity enabled");
+		}
 	}
 	/* Process parameters and store port configuration on priv structure. */
 	err = mlx5_port_args_config(priv, mkvlist, &priv->config);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 761b5ac572..dc4d1a8686 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1399,6 +1399,8 @@ struct mlx5_dev_ctx_shared {
 	uint32_t hws_tags:1; /* Check if tags info for HWS initialized. */
 	uint32_t shared_mark_enabled:1;
 	/* If mark action is enabled on Rxqs (shared E-Switch domain). */
+	uint32_t lag_rx_port_affinity_en:1;
+	/* lag_rx_port_affinity is supported. */
 	uint32_t hws_max_log_bulk_sz:5;
 	/* Log of minimal HWS counters created hard coded. */
 	uint32_t hws_max_nb_counters; /* Maximal number for HWS counters. */
-- 
2.18.1
* [RFC 4/5] net/mlx5: add port affinity item support
  2022-12-21 10:29 [RFC 0/5] add new port affinity item and affinity in Tx queue API Jiawei Wang
                   ` (2 preceding siblings ...)
  2022-12-21 10:29 ` [RFC 3/5] drivers: add lag Rx port affinity in PRM Jiawei Wang
@ 2022-12-21 10:29 ` Jiawei Wang
  2022-12-21 10:29 ` [RFC 5/5] drivers: enhance the Tx queue affinity Jiawei Wang
  4 siblings, 0 replies; 16+ messages in thread
From: Jiawei Wang @ 2022-12-21 10:29 UTC
  To: viacheslavo, orika, thomas, Matan Azrad; +Cc: dev, rasland
This patch adds PMD support for the new port affinity item,
RTE_FLOW_ITEM_TYPE_PORT_AFFINITY.
It also adds a validation function for the new item, which is
supported for NIC Rx rules on the root table only.
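The validation rules can be summarised in a standalone model. This sketch
is not the driver code: attr_model and affinity_model are hypothetical
reduced types, the firmware capability check is left out, and only the
order of checks and errno values follow the validation function in the
diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced view of the flow rule attributes. */
struct attr_model {
	unsigned int ingress:1;
	unsigned int transfer:1;
	uint32_t group;
};

/* Hypothetical reduced view of the port affinity item. */
struct affinity_model {
	uint8_t affinity;
};

/* Return 0 if the item is acceptable, a negative errno value otherwise. */
static int
validate_port_affinity(const struct attr_model *attr,
		       const struct affinity_model *spec,
		       const struct affinity_model *mask,
		       uint8_t num_lag_ports)
{
	assert(attr != NULL);
	if (!attr->ingress || attr->transfer || attr->group != 0)
		return -ENOTSUP; /* NIC Rx rules on the root table only */
	if (spec == NULL)
		return -EINVAL;  /* a value to match is required */
	if (spec->affinity == 0 || spec->affinity > num_lag_ports)
		return -ENOTSUP; /* valid values are 1..num_lag_ports */
	if (mask != NULL && mask->affinity == 0)
		return -EINVAL;  /* an explicit all-zero mask is rejected */
	return 0;
}
```

For a device with two LAG ports, affinity 1 and 2 pass in group 0, while
affinity 3 or any rule in a non-zero group is rejected.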
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
---
 doc/guides/nics/features/mlx5.ini |  1 +
 doc/guides/nics/mlx5.rst          |  4 +-
 drivers/net/mlx5/linux/mlx5_os.c  |  2 +
 drivers/net/mlx5/mlx5.h           |  1 +
 drivers/net/mlx5/mlx5_flow.h      |  3 +
 drivers/net/mlx5/mlx5_flow_dv.c   | 95 +++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_flow_hw.c   | 14 +++++
 7 files changed, 119 insertions(+), 1 deletion(-)
diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini
index 62fd330e2b..a7766221c9 100644
--- a/doc/guides/nics/features/mlx5.ini
+++ b/doc/guides/nics/features/mlx5.ini
@@ -87,6 +87,7 @@ vlan                 = Y
 vxlan                = Y
 vxlan_gpe            = Y
 represented_port     = Y
+port_affinity        = Y
 
 [rte_flow actions]
 age                  = I
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 85a2b422c5..fe33c2a895 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -106,6 +106,7 @@ Features
 - Sub-Function representors.
 - Sub-Function.
 - Matching on represented port.
+- Matching on port affinity.
 
 
 Limitations
@@ -595,10 +596,11 @@ Limitations
   - key
   - sequence
 
-  Matching on checksum and sequence needs MLNX_OFED 5.6+.
+- Matching on checksum and sequence needs MLNX_OFED 5.6+.
 
 - The NIC egress flow rules on representor port are not supported.
 
+- Matching on port affinity is supported for NIC ingress flows in group 0 only.
 
 Statistics
 ----------
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 3fea72013f..babc7d2f94 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1406,6 +1406,8 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 			sh->lag_rx_port_affinity_en = 1;
 			DRV_LOG(DEBUG, "LAG Rx Port Affinity enabled");
 		}
+		priv->num_lag_ports = hca_attr->num_lag_ports;
+		DRV_LOG(DEBUG, "The number of lag ports is %d", priv->num_lag_ports);
 	}
 	/* Process parameters and store port configuration on priv structure. */
 	err = mlx5_port_args_config(priv, mkvlist, &priv->config);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index dc4d1a8686..52f1592035 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1739,6 +1739,7 @@ struct mlx5_priv {
 	unsigned int mtr_reg_share:1; /* Whether support meter REG_C share. */
 	unsigned int lb_used:1; /* Loopback queue is referred to. */
 	uint32_t mark_enabled:1; /* If mark action is enabled on rxqs. */
+	uint32_t num_lag_ports:4; /* Number of ports can be bonded. */
 	uint16_t domain_id; /* Switch domain identifier. */
 	uint16_t vport_id; /* Associated VF vport index (if any). */
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 7148c10e96..13cf9b7d76 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -264,6 +264,9 @@ enum mlx5_feature_name {
 /* IPSEC syndrome item */
 #define MLX5_FLOW_ITEM_IPSEC_SYNDROME (UINT64_C(1) << 46)
 
+/* Port affinity item */
+#define MLX5_FLOW_ITEM_PORT_AFFINITY (UINT64_C(1) << 47)
+
 /* Outer Masks. */
 #define MLX5_FLOW_LAYER_OUTER_L3 \
 	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 310fb7c5c3..62a6fb496d 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -3888,6 +3888,74 @@ flow_dv_validate_item_ipsec_syndrome(struct rte_eth_dev *dev,
 	return 0;
 }
 
+/**
+ * Validate Port affinity item.
+ *
+ * @param[in] dev
+ *   Pointer to the rte_eth_dev structure.
+ * @param[in] item
+ *   Item specification.
+ * @param[in] attr
+ *   Attributes of flow that includes this item.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_item_port_affinity(struct rte_eth_dev *dev,
+				    const struct rte_flow_item *item,
+				    const struct rte_flow_attr *attr,
+				    struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	const struct rte_flow_item_port_affinity *spec = item->spec;
+	const struct rte_flow_item_port_affinity *mask = item->mask;
+	struct rte_flow_item_port_affinity nic_mask = {
+		.affinity = UINT8_MAX
+	};
+	int ret;
+
+	if (!priv->sh->lag_rx_port_affinity_en)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+					  "Unsupported port affinity with Older FW");
+	if (!attr->ingress || attr->transfer || attr->group)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  item->spec,
+					  "port affinity is supported with NIC-RX on Root");
+	if (!spec)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  item->spec,
+					  "data cannot be empty");
+	if (spec->affinity == 0)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  item->spec,
+					  "zero affinity number not supported");
+	if (spec->affinity > priv->num_lag_ports)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
+					  item->spec,
+					  "exceed max affinity number in lag ports");
+	if (!mask)
+		mask = &rte_flow_item_port_affinity_mask;
+	if (!mask->affinity)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC, NULL,
+					  "mask cannot be zero");
+	ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
+				(const uint8_t *)&nic_mask,
+				sizeof(struct rte_flow_item_port_affinity),
+				MLX5_ITEM_RANGE_NOT_ACCEPTED, error);
+	if (ret < 0)
+		return ret;
+	return 0;
+}
+
 int
 flow_dv_encap_decap_match_cb(void *tool_ctx __rte_unused,
 			     struct mlx5_list_entry *entry, void *cb_ctx)
@@ -7679,6 +7747,13 @@ flow_dv_validate(struct rte_eth_dev *dev, const struct rte_flow_attr *attr,
 				return ret;
 			last_item = MLX5_FLOW_ITEM_IPSEC_SYNDROME;
 			break;
+		case RTE_FLOW_ITEM_TYPE_PORT_AFFINITY:
+			ret = flow_dv_validate_item_port_affinity(dev, items,
+								  attr, error);
+			if (ret < 0)
+				return ret;
+			last_item = MLX5_FLOW_ITEM_PORT_AFFINITY;
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ITEM,
@@ -10978,6 +11053,22 @@ flow_dv_translate_item_ipsec_syndrome(void *key,
 		 spec->syndrome & mask->syndrome);
 }
 
+static void
+flow_dv_translate_item_port_affinity(void *key,
+				     const struct rte_flow_item *item,
+				     uint32_t key_type)
+{
+	const struct rte_flow_item_port_affinity *affinity_v;
+	const struct rte_flow_item_port_affinity *affinity_m;
+	void *misc_v;
+
+	MLX5_ITEM_UPDATE(item, key_type, affinity_v, affinity_m,
+			 &rte_flow_item_port_affinity_mask);
+	misc_v = MLX5_ADDR_OF(fte_match_param, key, misc_parameters);
+	MLX5_SET(fte_match_set_misc, misc_v, lag_rx_port_affinity,
+		 affinity_v->affinity & affinity_m->affinity);
+}
+
 static uint32_t matcher_zero[MLX5_ST_SZ_DW(fte_match_param)] = { 0 };
 
 #define HEADER_IS_ZERO(match_criteria, headers)				     \
@@ -13779,6 +13870,10 @@ flow_dv_translate_items(struct rte_eth_dev *dev,
 		flow_dv_translate_item_ipsec_syndrome(key, items, key_type);
 		last_item = MLX5_FLOW_ITEM_IPSEC_SYNDROME;
 		break;
+	case RTE_FLOW_ITEM_TYPE_PORT_AFFINITY:
+		flow_dv_translate_item_port_affinity(key, items, key_type);
+		last_item = MLX5_FLOW_ITEM_PORT_AFFINITY;
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 0705002d99..2e93dcf801 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -5015,6 +5015,20 @@ flow_hw_pattern_validate(struct rte_eth_dev *dev,
 							  "Unsupported meter color register");
 			break;
 		}
+		case RTE_FLOW_ITEM_TYPE_PORT_AFFINITY:
+		{
+			if (!priv->sh->lag_rx_port_affinity_en)
+				return rte_flow_error_set(error, EINVAL,
+							  RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+							  "Unsupported port affinity with Older FW");
+			if (!attr->ingress || attr->transfer)
+				return rte_flow_error_set(error, EINVAL,
+							  RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+							  "Port affinity item not supported"
+							  " with egress or transfer"
+							  " attribute");
+			break;
+		}
 		case RTE_FLOW_ITEM_TYPE_VOID:
 		case RTE_FLOW_ITEM_TYPE_ETH:
 		case RTE_FLOW_ITEM_TYPE_VLAN:
-- 
2.18.1
^ permalink raw reply	[flat|nested] 16+ messages in thread
* [RFC 5/5] drivers: enhance the Tx queue affinity
  2022-12-21 10:29 [RFC 0/5] add new port affinity item and affinity in Tx queue API Jiawei Wang
                   ` (3 preceding siblings ...)
  2022-12-21 10:29 ` [RFC 4/5] net/mlx5: add port affinity item support Jiawei Wang
@ 2022-12-21 10:29 ` Jiawei Wang
  4 siblings, 0 replies; 16+ messages in thread
From: Jiawei Wang @ 2022-12-21 10:29 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, Matan Azrad; +Cc: dev, rasland
The previous patch added the tx affinity configuration to the Tx
queue API, which allows setting an affinity value on each queue.
This patch updates TIS creation to use the tx_affinity value of the
Tx queue: TIS index 1 goes to port 1, TIS index 2 goes to port 2, and
TIS index 0 is reserved for the default HWS hash mode.
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
---
 drivers/common/mlx5/mlx5_prm.h |  8 -------
 drivers/net/mlx5/mlx5.c        | 43 +++++++++++++++-------------------
 drivers/net/mlx5/mlx5_devx.c   | 21 ++++++++++-------
 drivers/net/mlx5/mlx5_tx.h     |  1 +
 drivers/net/mlx5/mlx5_txq.c    |  9 +++++++
 5 files changed, 42 insertions(+), 40 deletions(-)
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 9098b0fe0b..778c97b059 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -2362,14 +2362,6 @@ struct mlx5_ifc_query_nic_vport_context_in_bits {
 	u8 reserved_at_68[0x18];
 };
 
-/*
- * lag_tx_port_affinity: 0 auto-selection, 1 PF1, 2 PF2 vice versa.
- * Each TIS binds to one PF by setting lag_tx_port_affinity (>0).
- * Once LAG enabled, we create multiple TISs and bind each one to
- * different PFs, then TIS[i] gets affinity i+1 and goes to PF i+1.
- */
-#define MLX5_IFC_LAG_MAP_TIS_AFFINITY(index, num) ((num) ? \
-						    (index) % (num) + 1 : 0)
 struct mlx5_ifc_tisc_bits {
 	u8 strict_lag_tx_port_affinity[0x1];
 	u8 reserved_at_1[0x3];
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index fe9897f83d..e547fa0219 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1172,9 +1172,9 @@ mlx5_dev_ctx_shared_mempool_subscribe(struct rte_eth_dev *dev)
 static int
 mlx5_setup_tis(struct mlx5_dev_ctx_shared *sh)
 {
-	int i;
 	struct mlx5_devx_lag_context lag_ctx = { 0 };
 	struct mlx5_devx_tis_attr tis_attr = { 0 };
+	int i;
 
 	tis_attr.transport_domain = sh->td->id;
 	if (sh->bond.n_port) {
@@ -1188,35 +1188,30 @@ mlx5_setup_tis(struct mlx5_dev_ctx_shared *sh)
 			DRV_LOG(ERR, "Failed to query lag affinity.");
 			return -1;
 		}
-		if (sh->lag.affinity_mode == MLX5_LAG_MODE_TIS) {
-			for (i = 0; i < sh->bond.n_port; i++) {
-				tis_attr.lag_tx_port_affinity =
-					MLX5_IFC_LAG_MAP_TIS_AFFINITY(i,
-							sh->bond.n_port);
-				sh->tis[i] = mlx5_devx_cmd_create_tis(sh->cdev->ctx,
-						&tis_attr);
-				if (!sh->tis[i]) {
-					DRV_LOG(ERR, "Failed to TIS %d/%d for bonding device"
-						" %s.", i, sh->bond.n_port,
-						sh->ibdev_name);
-					return -1;
-				}
-			}
+		if (sh->lag.affinity_mode == MLX5_LAG_MODE_TIS)
 			DRV_LOG(DEBUG, "LAG number of ports : %d, affinity_1 & 2 : pf%d & %d.\n",
 				sh->bond.n_port, lag_ctx.tx_remap_affinity_1,
 				lag_ctx.tx_remap_affinity_2);
-			return 0;
-		}
-		if (sh->lag.affinity_mode == MLX5_LAG_MODE_HASH)
+		else if (sh->lag.affinity_mode == MLX5_LAG_MODE_HASH)
 			DRV_LOG(INFO, "Device %s enabled HW hash based LAG.",
 					sh->ibdev_name);
 	}
-	tis_attr.lag_tx_port_affinity = 0;
-	sh->tis[0] = mlx5_devx_cmd_create_tis(sh->cdev->ctx, &tis_attr);
-	if (!sh->tis[0]) {
-		DRV_LOG(ERR, "Failed to TIS 0 for bonding device"
-			" %s.", sh->ibdev_name);
-		return -1;
+	for (i = 0; i <= sh->bond.n_port; i++) {
+		/*
+		 * lag_tx_port_affinity: 0 auto-selection, 1 PF1, 2 PF2 vice versa.
+		 * Each TIS binds to one PF by setting lag_tx_port_affinity (> 0).
+		 * Once LAG enabled, we create multiple TISs and bind each one to
+		 * different PFs, then TIS[i+1] gets affinity i+1 and goes to PF i+1.
+		 * TIS[0] is reserved for HW Hash mode.
+		 */
+		tis_attr.lag_tx_port_affinity = i;
+		sh->tis[i] = mlx5_devx_cmd_create_tis(sh->cdev->ctx, &tis_attr);
+		if (!sh->tis[i]) {
+			DRV_LOG(ERR, "Failed to create TIS %d/%d for [bonding] device"
+				" %s.", i, sh->bond.n_port,
+				sh->ibdev_name);
+			return -1;
+		}
 	}
 	return 0;
 }
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index f6e1943fd7..6da6e9c2ee 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -1191,16 +1191,21 @@ mlx5_get_txq_tis_num(struct rte_eth_dev *dev, uint16_t queue_idx)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	int tis_idx;
+	struct mlx5_txq_data *txq_data = (*priv->txqs)[queue_idx];
 
-	if (priv->sh->bond.n_port && priv->sh->lag.affinity_mode ==
-			MLX5_LAG_MODE_TIS) {
-		tis_idx = (priv->lag_affinity_idx + queue_idx) %
-			priv->sh->bond.n_port;
-		DRV_LOG(INFO, "port %d txq %d gets affinity %d and maps to PF %d.",
-			dev->data->port_id, queue_idx, tis_idx + 1,
-			priv->sh->lag.tx_remap_affinity[tis_idx]);
+	if (txq_data->tx_affinity) {
+		tis_idx = txq_data->tx_affinity;
 	} else {
-		tis_idx = 0;
+		if (priv->sh->bond.n_port && priv->sh->lag.affinity_mode ==
+				MLX5_LAG_MODE_TIS) {
+			tis_idx = (priv->lag_affinity_idx + queue_idx) %
+				   priv->sh->bond.n_port + 1;
+			DRV_LOG(INFO, "port %d txq %d gets affinity %d and maps to PF %d.",
+				dev->data->port_id, queue_idx, tis_idx,
+				priv->sh->lag.tx_remap_affinity[tis_idx - 1]);
+		} else {
+			tis_idx = 0;
+		}
 	}
 	MLX5_ASSERT(priv->sh->tis[tis_idx]);
 	return priv->sh->tis[tis_idx]->id;
diff --git a/drivers/net/mlx5/mlx5_tx.h b/drivers/net/mlx5/mlx5_tx.h
index a44050a1ce..394e9b8d4f 100644
--- a/drivers/net/mlx5/mlx5_tx.h
+++ b/drivers/net/mlx5/mlx5_tx.h
@@ -144,6 +144,7 @@ struct mlx5_txq_data {
 	uint16_t inlen_send; /* Ordinary send data inline size. */
 	uint16_t inlen_empw; /* eMPW max packet size to inline. */
 	uint16_t inlen_mode; /* Minimal data length to inline. */
+	uint8_t tx_affinity; /* TXQ affinity configuration. */
 	uint32_t qp_num_8s; /* QP number shifted by 8. */
 	uint64_t offloads; /* Offloads for Tx Queue. */
 	struct mlx5_mr_ctrl mr_ctrl; /* MR control descriptor. */
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 7ef7c5f43e..b96a45060f 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -392,9 +392,17 @@ mlx5_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 		container_of(txq, struct mlx5_txq_ctrl, txq);
 	int res;
 
+	if (conf->tx_affinity > priv->num_lag_ports) {
+		rte_errno = EINVAL;
+		DRV_LOG(ERR, "port %u unable to setup Tx queue index %u"
+			" affinity is %u exceed the maximum %u", dev->data->port_id,
+			idx, conf->tx_affinity, priv->num_lag_ports);
+		return -rte_errno;
+	}
 	res = mlx5_tx_queue_pre_setup(dev, idx, &desc);
 	if (res)
 		return res;
+
 	txq_ctrl = mlx5_txq_new(dev, idx, desc, socket, conf);
 	if (!txq_ctrl) {
 		DRV_LOG(ERR, "port %u unable to allocate queue index %u",
@@ -1095,6 +1103,7 @@ mlx5_txq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	tmpl->txq.elts_m = desc - 1;
 	tmpl->txq.port_id = dev->data->port_id;
 	tmpl->txq.idx = idx;
+	tmpl->txq.tx_affinity = conf->tx_affinity;
 	txq_set_params(tmpl);
 	if (txq_adjust_params(tmpl))
 		goto error;
-- 
2.18.1
^ permalink raw reply	[flat|nested] 16+ messages in thread
* RE: [RFC 1/5] ethdev: add port affinity match item
  2022-12-21 10:29 ` [RFC 1/5] ethdev: add port affinity match item Jiawei Wang
@ 2023-01-11 16:41   ` Ori Kam
  2023-01-18 11:07   ` Thomas Monjalon
  1 sibling, 0 replies; 16+ messages in thread
From: Ori Kam @ 2023-01-11 16:41 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Aman Singh, Yuying Zhang, Ferruh Yigit, Andrew Rybchenko
  Cc: dev, Raslan Darawsheh
Hi Jiawei,
> -----Original Message-----
> From: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>
> Sent: Wednesday, 21 December 2022 12:30
> 
> For the multiple hardware ports connect to a single DPDK port (mhpsdp),
> currently there is no information to indicate the packet belongs to
> which hardware port.
> 
> This patch introduces a new port affinity item in rte flow API, and
> the port affinity value reflects the physical port affinity of the
> received packets.
> 
> While uses the port affinity as a matching item in the flow, and sets the
> same affinity on the tx queue, then the packet can be sent from the same
> hardware port with received.
> 
> This patch also adds the testpmd command line to match the new item:
> 	flow create 0 ingress group 0 pattern port_affinity affinity is 1 /
> 	end actions queue index 0 / end
> 
> The above command means that creates a flow on a single DPDK port and
> matches the packet from the first physical port (assumes the affinity 1
> stands for the first port) and redirects these packets into RxQ 0.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> ---
Acked-by: Ori Kam <orika@nvidia.com>
Best,
Ori
^ permalink raw reply	[flat|nested] 16+ messages in thread
* RE: [RFC 2/5] ethdev: introduce the affinity field in Tx queue API
  2022-12-21 10:29 ` [RFC 2/5] ethdev: introduce the affinity field in Tx queue API Jiawei Wang
@ 2023-01-11 16:47   ` Ori Kam
  2023-01-18 11:37   ` Thomas Monjalon
  1 sibling, 0 replies; 16+ messages in thread
From: Ori Kam @ 2023-01-11 16:47 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Aman Singh, Yuying Zhang, Ferruh Yigit, Andrew Rybchenko
  Cc: dev, Raslan Darawsheh
Hi Jiawei,
> -----Original Message-----
> From: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>
> Sent: Wednesday, 21 December 2022 12:30
> 
> For the multiple hardware ports connect to a single DPDK port (mhpsdp),
> the previous patch introduces the new rte flow item to match the port
> affinity of the received packets.
> 
> This patch adds the tx_affinity setting in Tx queue API, the affinity value
> reflects packets be sent to which hardware port.
> 
> Adds the new tx_affinity field into the padding hole of rte_eth_txconf
> structure, the size of rte_eth_txconf keeps the same. Adds a suppress
> type for structure change in the ABI check file.
> 
> This patch adds the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) affinity (value)
> 
> For example, there're two hardware ports connects to a single DPDK
> port (port id 0), and affinity 1 stood for hard port 1 and affinity
> 2 stood for hardware port 2, used the below command to config
> tx affinity for each TxQ:
> 	port config 0 txq 0 affinity 1
> 	port config 0 txq 1 affinity 1
> 	port config 0 txq 2 affinity 2
> 	port config 0 txq 3 affinity 2
> 
> These commands config the TxQ index 0 and TxQ index 1 with affinity 1,
> uses TxQ 0 or TxQ 1 send packets, these packets will be sent from the
> hardware port 1, and similar with hardware port 2 if sending packets
> with TxQ 2 or TxQ 3.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> ---
Acked-by: Ori Kam <orika@nvidia.com>
Best,
Ori
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [RFC 1/5] ethdev: add port affinity match item
  2022-12-21 10:29 ` [RFC 1/5] ethdev: add port affinity match item Jiawei Wang
  2023-01-11 16:41   ` Ori Kam
@ 2023-01-18 11:07   ` Thomas Monjalon
  2023-01-18 14:41     ` Jiawei(Jonny) Wang
  1 sibling, 1 reply; 16+ messages in thread
From: Thomas Monjalon @ 2023-01-18 11:07 UTC (permalink / raw)
  To: Jiawei Wang
  Cc: viacheslavo, orika, Aman Singh, Yuying Zhang, Ferruh Yigit,
	Andrew Rybchenko, dev, rasland, Ivan Malov, jerinj
21/12/2022 11:29, Jiawei Wang:
> +	/**
> +	 * Matches on the physical port affinity of the received packet.
> +	 *
> +	 * See struct rte_flow_item_port_affinity.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_PORT_AFFINITY,
>  };
I'm not sure about the word "affinity".
I think you want to match on a physical port.
It could be a global physical port id or
an index in the group of physical ports connected to a single DPDK port.
In first case, the name of the item could be RTE_FLOW_ITEM_TYPE_PHY_PORT,
in the second case, the name could be RTE_FLOW_ITEM_TYPE_MHPSDP_PHY_PORT,
"MHPSDP" meaning "Multiple Hardware Ports - Single DPDK Port".
We could replace "PHY" with "HW" as well.
Note that we cannot use the new item RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT
because we are in a case where multiple hardware ports are merged
in a single software represented port.
[...]
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ITEM_TYPE_PORT_AFFINITY
> + *
> + * For the multiple hardware ports connect to a single DPDK port (mhpsdp),
> + * use this item to match the hardware port affinity of the packets.
> + */
> +struct rte_flow_item_port_affinity {
> +	uint8_t affinity; /**< port affinity value. */
> +};
We need to define how the port numbering is done.
Is it driver-dependent?
Does it start at 0? etc...
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [RFC 2/5] ethdev: introduce the affinity field in Tx queue API
  2022-12-21 10:29 ` [RFC 2/5] ethdev: introduce the affinity field in Tx queue API Jiawei Wang
  2023-01-11 16:47   ` Ori Kam
@ 2023-01-18 11:37   ` Thomas Monjalon
  2023-01-18 14:44     ` Jiawei(Jonny) Wang
  1 sibling, 1 reply; 16+ messages in thread
From: Thomas Monjalon @ 2023-01-18 11:37 UTC (permalink / raw)
  To: Jiawei Wang
  Cc: viacheslavo, orika, Aman Singh, Yuying Zhang, Ferruh Yigit,
	Andrew Rybchenko, dev, rasland, jerinj
21/12/2022 11:29, Jiawei Wang:
> For the multiple hardware ports connect to a single DPDK port (mhpsdp),
> the previous patch introduces the new rte flow item to match the port
> affinity of the received packets.
> 
> This patch adds the tx_affinity setting in Tx queue API, the affinity value
> reflects packets be sent to which hardware port.
I think "affinity" means we would like packet to be sent
on a specific hardware port, but it is not mandatory.
Is it the meaning you want? Or should it be a mandatory port?
> Adds the new tx_affinity field into the padding hole of rte_eth_txconf
> structure, the size of rte_eth_txconf keeps the same. Adds a suppress
> type for structure change in the ABI check file.
> 
> This patch adds the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) affinity (value)
> 
> For example, there're two hardware ports connects to a single DPDK
connects -> connected
> port (port id 0), and affinity 1 stood for hard port 1 and affinity
> 2 stood for hardware port 2, used the below command to config
> tx affinity for each TxQ:
> 	port config 0 txq 0 affinity 1
> 	port config 0 txq 1 affinity 1
> 	port config 0 txq 2 affinity 2
> 	port config 0 txq 3 affinity 2
> 
> These commands config the TxQ index 0 and TxQ index 1 with affinity 1,
> uses TxQ 0 or TxQ 1 send packets, these packets will be sent from the
> hardware port 1, and similar with hardware port 2 if sending packets
> with TxQ 2 or TxQ 3.
[...]
> @@ -212,6 +212,10 @@ API Changes
> +* ethdev: added a new field:
> +
> +  - Tx affinity per-queue ``rte_eth_txconf.tx_affinity``
Adding a new field is not an API change because
existing applications don't need to update their code
if they don't care about this new field.
I think you can remove this note.
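On the related ABI point from the quoted commit message (the new field goes into a padding hole, so the structure size stays the same), here is a rough sketch of the idea. The two structs are simplified, hypothetical stand-ins and not the real rte_eth_txconf layout; the assumption is a common ABI where uint64_t is aligned to at least 4 bytes:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical, simplified layout before the change. */
struct txconf_before {
	uint16_t tx_free_thresh;
	uint8_t  tx_deferred_start;
	/* compiler-inserted padding here (size depends on the ABI) */
	uint64_t offloads;
};

/* Same layout with the new one-byte field placed in the former padding. */
struct txconf_after {
	uint16_t tx_free_thresh;
	uint8_t  tx_deferred_start;
	uint8_t  tx_affinity;
	uint64_t offloads;
};

/* The struct size and the offset of the following member do not move. */
static_assert(sizeof(struct txconf_before) == sizeof(struct txconf_after),
	      "size changed");
static_assert(offsetof(struct txconf_before, offloads) ==
	      offsetof(struct txconf_after, offloads),
	      "later member moved");
```

If either assertion fails to compile on a given target, the field did not fit in the hole there and the change would alter the layout.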
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1138,6 +1138,7 @@ struct rte_eth_txconf {
>  				      less free descriptors than this value. */
>  
>  	uint8_t tx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
> +	uint8_t tx_affinity; /**< Drives the setting of affinity per-queue. */
Why "Drives"? It is the setting, right?
rte_eth_txconf is per-queue so no need to repeat.
I think a good comment here would be to mention it is a physical port index for mhpsdp.
Another good comment would be to specify how ports are numbered.
^ permalink raw reply	[flat|nested] 16+ messages in thread
* RE: [RFC 1/5] ethdev: add port affinity match item
  2023-01-18 11:07   ` Thomas Monjalon
@ 2023-01-18 14:41     ` Jiawei(Jonny) Wang
  2023-01-18 16:26       ` Thomas Monjalon
  0 siblings, 1 reply; 16+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-01-18 14:41 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon (EXTERNAL)
  Cc: Slava Ovsiienko, Ori Kam, Aman Singh, Yuying Zhang, Ferruh Yigit,
	Andrew Rybchenko, dev, Raslan Darawsheh, Ivan Malov, jerinj
Hi,
> 
> 21/12/2022 11:29, Jiawei Wang:
> > +	/**
> > +	 * Matches on the physical port affinity of the received packet.
> > +	 *
> > +	 * See struct rte_flow_item_port_affinity.
> > +	 */
> > +	RTE_FLOW_ITEM_TYPE_PORT_AFFINITY,
> >  };
> 
> I'm not sure about the word "affinity".
> I think you want to match on a physical port.
> It could be a global physical port id or an index in the group of physical ports
> connected to a single DPDK port.
> In first case, the name of the item could be RTE_FLOW_ITEM_TYPE_PHY_PORT,
> in the second case, the name could be
> RTE_FLOW_ITEM_TYPE_MHPSDP_PHY_PORT,
> "MHPSDP" meaning "Multiple Hardware Ports - Single DPDK Port".
> We could replace "PHY" with "HW" as well.
>
Since DPDK only probes/attaches a single port, the first case does not seem to apply here.
Here, 'affinity' stands for the packet's association with the actual physical port.
 
> Note that we cannot use the new item
> RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT
> because we are in a case where multiple hardware ports are merged in a single
> software represented port.
> 
> 
> [...]
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ITEM_TYPE_PORT_AFFINITY
> > + *
> > + * For the multiple hardware ports connect to a single DPDK port
> > +(mhpsdp),
> > + * use this item to match the hardware port affinity of the packets.
> > + */
> > +struct rte_flow_item_port_affinity {
> > +	uint8_t affinity; /**< port affinity value. */ };
> 
> We need to define how the port numbering is done.
> Is it driver-dependent?
> Does it start at 0? etc...
> 
> 
Users can define any value they want; one use case is that a packet is received from and
sent to the same port, in which case they set the same 'affinity' value in the flow and in the queue configuration.
The flow behavior is driver dependent.
Thanks.
^ permalink raw reply	[flat|nested] 16+ messages in thread
* RE: [RFC 2/5] ethdev: introduce the affinity field in Tx queue API
  2023-01-18 11:37   ` Thomas Monjalon
@ 2023-01-18 14:44     ` Jiawei(Jonny) Wang
  2023-01-18 16:31       ` Thomas Monjalon
  0 siblings, 1 reply; 16+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-01-18 14:44 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon (EXTERNAL)
  Cc: Slava Ovsiienko, Ori Kam, Aman Singh, Yuying Zhang, Ferruh Yigit,
	Andrew Rybchenko, dev, Raslan Darawsheh, jerinj
Hi,
> 
> 21/12/2022 11:29, Jiawei Wang:
> > For the multiple hardware ports connect to a single DPDK port
> > (mhpsdp), the previous patch introduces the new rte flow item to match
> > the port affinity of the received packets.
> >
> > This patch adds the tx_affinity setting in Tx queue API, the affinity
> > value reflects packets be sent to which hardware port.
> 
> I think "affinity" means we would like packet to be sent on a specific hardware
> port, but it is not mandatory.
> Is it the meaning you want? Or should it be a mandatory port?
> 
Right, it's an optional setting, not mandatory.
> > Adds the new tx_affinity field into the padding hole of rte_eth_txconf
> > structure, the size of rte_eth_txconf keeps the same. Adds a suppress
> > type for structure change in the ABI check file.
> >
> > This patch adds the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) affinity (value)
> >
> > For example, there're two hardware ports connects to a single DPDK
> 
> connects -> connected
> 
OK, will fix in next version.
> > port (port id 0), and affinity 1 stood for hard port 1 and affinity
> > 2 stood for hardware port 2, used the below command to config tx
> > affinity for each TxQ:
> > 	port config 0 txq 0 affinity 1
> > 	port config 0 txq 1 affinity 1
> > 	port config 0 txq 2 affinity 2
> > 	port config 0 txq 3 affinity 2
> >
> > These commands config the TxQ index 0 and TxQ index 1 with affinity 1,
> > uses TxQ 0 or TxQ 1 send packets, these packets will be sent from the
> > hardware port 1, and similar with hardware port 2 if sending packets
> > with TxQ 2 or TxQ 3.
> 
> [...]
> > @@ -212,6 +212,10 @@ API Changes
> > +* ethdev: added a new field:
> > +
> > +  - Tx affinity per-queue ``rte_eth_txconf.tx_affinity``
> 
> Adding a new field is not an API change because existing applications don't
> need to update their code if they don't care this new field.
> I think you can remove this note.
> 
OK, will remove in next version.
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -1138,6 +1138,7 @@ struct rte_eth_txconf {
> >  				      less free descriptors than this value. */
> >
> >  	uint8_t tx_deferred_start; /**< Do not start queue with
> > rte_eth_dev_start(). */
> > +	uint8_t tx_affinity; /**< Drives the setting of affinity per-queue.
> > +*/
> 
> Why "Drives"? It is the setting, right?
> rte_eth_txconf is per-queue so no need to repeat.
> I think a good comment here would be to mention it is a physical port index for
> mhpsdp.
> Another good comment would be to specify how ports are numbered.
> 
OK, will update the comment for this new setting.
Thanks.
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [RFC 1/5] ethdev: add port affinity match item
  2023-01-18 14:41     ` Jiawei(Jonny) Wang
@ 2023-01-18 16:26       ` Thomas Monjalon
  2023-01-24 14:00         ` Jiawei(Jonny) Wang
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Monjalon @ 2023-01-18 16:26 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang
  Cc: Slava Ovsiienko, Ori Kam, Aman Singh, Yuying Zhang, Ferruh Yigit,
	Andrew Rybchenko, dev, Raslan Darawsheh, Ivan Malov, jerinj
18/01/2023 15:41, Jiawei(Jonny) Wang:
> Hi,
> 
> > 
> > 21/12/2022 11:29, Jiawei Wang:
> > > +	/**
> > > +	 * Matches on the physical port affinity of the received packet.
> > > +	 *
> > > +	 * See struct rte_flow_item_port_affinity.
> > > +	 */
> > > +	RTE_FLOW_ITEM_TYPE_PORT_AFFINITY,
> > >  };
> > 
> > I'm not sure about the word "affinity".
> > I think you want to match on a physical port.
> > It could be a global physical port id or an index in the group of physical ports
> > connected to a single DPDK port.
> > In first case, the name of the item could be RTE_FLOW_ITEM_TYPE_PHY_PORT,
> > in the second case, the name could be
> > RTE_FLOW_ITEM_TYPE_MHPSDP_PHY_PORT,
> > "MHPSDP" meaning "Multiple Hardware Ports - Single DPDK Port".
> > We could replace "PHY" with "HW" as well.
> >
> 
> Since DPDK only probe/attach the single port, seems first case does not meet this case.
> Here, 'affinity' stands for the packet association with actual physical port.
I think it is more than affinity because the packet is effectively
received from this port.
And the other concern is that this name does not give any clue
that we are talking about multiple ports merged in a single one.
> > Note that we cannot use the new item
> > RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT
> > because we are in a case where multiple hardware ports are merged in a single
> > software represented port.
> > 
> > 
> > [...]
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice
> > > + *
> > > + * RTE_FLOW_ITEM_TYPE_PORT_AFFINITY
> > > + *
> > > + * For the multiple hardware ports connect to a single DPDK port
> > > +(mhpsdp),
> > > + * use this item to match the hardware port affinity of the packets.
> > > + */
> > > +struct rte_flow_item_port_affinity {
> > > +	uint8_t affinity; /**< port affinity value. */ };
> > 
> > We need to define how the port numbering is done.
> > Is it driver-dependent?
> > Does it start at 0? etc...
> 
> User can define any value they want; one use case is the packet could be received and
> sent to same port, then they can set the same 'affinity' value in flow and queue configuration.
No it does not work.
If ports are numbered 1 and 2, and user thinks it is 0 and 1,
the port 2 won't be matched at all.
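To make the concern concrete, here is a small model of the mismatch. It is only a sketch under assumptions: the device is assumed to tag received packets with a 1-based hardware port index, and both function names are hypothetical, not part of rte_flow:

```c
#include <stdbool.h>
#include <stdint.h>

/* A rule matches when its affinity equals the index the device reports. */
bool
affinity_rule_matches(uint8_t pkt_hw_port, uint8_t rule_affinity)
{
	return pkt_hw_port == rule_affinity;
}

/* True if at least one rule in the set matches packets from pkt_hw_port. */
bool
port_is_covered(uint8_t pkt_hw_port, const uint8_t *rules, unsigned int n_rules)
{
	for (unsigned int i = 0; i < n_rules; i++)
		if (affinity_rule_matches(pkt_hw_port, rules[i]))
			return true;
	return false;
}
```

With rules written for affinities {0, 1} on a device that numbers its ports 1 and 2, port_is_covered(2, ...) is false, so traffic from the second port never hits a rule. That is why the numbering has to be part of the API definition.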
> The flow behavior is driver dependent.
> 
> Thanks.
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [RFC 2/5] ethdev: introduce the affinity field in Tx queue API
  2023-01-18 14:44     ` Jiawei(Jonny) Wang
@ 2023-01-18 16:31       ` Thomas Monjalon
  2023-01-24 13:32         ` Jiawei(Jonny) Wang
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Monjalon @ 2023-01-18 16:31 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang
  Cc: Slava Ovsiienko, Ori Kam, Aman Singh, Yuying Zhang, Ferruh Yigit,
	Andrew Rybchenko, dev, Raslan Darawsheh, jerinj
18/01/2023 15:44, Jiawei(Jonny) Wang:
> > 21/12/2022 11:29, Jiawei Wang:
> > > For the multiple hardware ports connect to a single DPDK port
> > > (mhpsdp), the previous patch introduces the new rte flow item to match
> > > the port affinity of the received packets.
> > >
> > > This patch adds the tx_affinity setting in Tx queue API, the affinity
> > > value reflects packets be sent to which hardware port.
> > 
> > I think "affinity" means we would like packet to be sent on a specific hardware
> > port, but it is not mandatory.
> > Is it the meaning you want? Or should it be a mandatory port?
> 
> Right, it's optional setting not mandatory.
I think there is a misunderstanding.
I mean that "affinity" with port 0 may suggest that we try
to send to port 0 but sometimes the packet will be sent to port 1.
And I think you want the packet to be always sent to port 0
if affinity is 0, right?
If yes, I think the word "affinity" does not convey the right idea.
And again, the naming should give the idea that we are talking about
multiple ports merged in one DPDK port.
> > > Adds the new tx_affinity field into the padding hole of rte_eth_txconf
> > > structure, the size of rte_eth_txconf keeps the same. Adds a suppress
> > > type for structure change in the ABI check file.
> > >
> > > This patch adds the testpmd command line:
> > > testpmd> port config (port_id) txq (queue_id) affinity (value)
> > >
> > > For example, there're two hardware ports connects to a single DPDK
> > 
> > connects -> connected
> 
> OK, will fix in next version.
> 
> > > port (port id 0), and affinity 1 stood for hard port 1 and affinity
> > > 2 stood for hardware port 2, used the below command to config tx
> > > affinity for each TxQ:
> > > 	port config 0 txq 0 affinity 1
> > > 	port config 0 txq 1 affinity 1
> > > 	port config 0 txq 2 affinity 2
> > > 	port config 0 txq 3 affinity 2
> > >
> > > These commands config the TxQ index 0 and TxQ index 1 with affinity 1,
> > > uses TxQ 0 or TxQ 1 send packets, these packets will be sent from the
> > > hardware port 1, and similar with hardware port 2 if sending packets
> > > with TxQ 2 or TxQ 3.
> > 
> > [...]
> > > @@ -212,6 +212,10 @@ API Changes
> > > +* ethdev: added a new field:
> > > +
> > > +  - Tx affinity per-queue ``rte_eth_txconf.tx_affinity``
> > 
> > Adding a new field is not an API change because existing applications don't
> > need to update their code if they don't care this new field.
> > I think you can remove this note.
> 
> OK, will remove in next version.
> 
> > > --- a/lib/ethdev/rte_ethdev.h
> > > +++ b/lib/ethdev/rte_ethdev.h
> > > @@ -1138,6 +1138,7 @@ struct rte_eth_txconf {
> > >  				      less free descriptors than this value. */
> > >
> > >  	uint8_t tx_deferred_start; /**< Do not start queue with
> > > rte_eth_dev_start(). */
> > > +	uint8_t tx_affinity; /**< Drives the setting of affinity per-queue.
> > > +*/
> > 
> > Why "Drives"? It is the setting, right?
> > rte_eth_txconf is per-queue so no need to repeat.
> > I think a good comment here would be to mention it is a physical port index for
> > mhpsdp.
> > Another good comment would be to specify how ports are numbered.
> 
> OK, will update the comment for this new setting.
> 
> Thanks.
^ permalink raw reply	[flat|nested] 16+ messages in thread
* RE: [RFC 2/5] ethdev: introduce the affinity field in Tx queue API
  2023-01-18 16:31       ` Thomas Monjalon
@ 2023-01-24 13:32         ` Jiawei(Jonny) Wang
  0 siblings, 0 replies; 16+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-01-24 13:32 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon (EXTERNAL)
  Cc: Slava Ovsiienko, Ori Kam, Aman Singh, Yuying Zhang, Ferruh Yigit,
	Andrew Rybchenko, dev, Raslan Darawsheh, jerinj
Hi,
> 18/01/2023 15:44, Jiawei(Jonny) Wang:
> > > 21/12/2022 11:29, Jiawei Wang:
> > > > For the multiple hardware ports connect to a single DPDK port
> > > > (mhpsdp), the previous patch introduces the new rte flow item to
> > > > match the port affinity of the received packets.
> > > >
> > > > This patch adds the tx_affinity setting in Tx queue API, the
> > > > affinity value reflects packets be sent to which hardware port.
> > >
> > > I think "affinity" means we would like packet to be sent on a
> > > specific hardware port, but it is not mandatory.
> > > Is it the meaning you want? Or should it be a mandatory port?
> >
> > Right, it's optional setting not mandatory.
> 
> I think there is a misunderstanding.
> I mean that "affinity" with port 0 may suggest that we try to send to port 0 but
> sometimes the packet will be sent to port 1.
>
> And I think you want the packet to be always sent to port 0 if affinity is 0, right?
>
These packets should always be sent to hardware port 0 if 'affinity' is set to that port.
An 'affinity' value of 0, however, means that no affinity is set and the traffic keeps the same behavior
as before, for example, routing between different hardware ports.
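A minimal sketch of that selection logic, modelled on the mlx5_get_txq_tis_num() change in patch 5/5. The enum, the function names and the flattened parameters are hypothetical simplifications for illustration, not driver API:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's LAG affinity modes. */
enum lag_mode { LAG_MODE_NONE, LAG_MODE_TIS, LAG_MODE_HASH };

/*
 * Pick the TIS index for a Tx queue:
 *  - non-zero tx_affinity: use it directly, TIS[k] is bound to hardware port k;
 *  - zero in TIS mode: keep the previous round-robin spread over ports 1..n_port;
 *  - zero otherwise: TIS[0], no affinity, the hardware selects the port.
 */
unsigned int
pick_tis_idx(uint8_t tx_affinity, unsigned int n_port, enum lag_mode mode,
	     unsigned int lag_affinity_idx, unsigned int queue_idx)
{
	if (tx_affinity != 0)
		return tx_affinity;
	if (n_port != 0 && mode == LAG_MODE_TIS)
		return (lag_affinity_idx + queue_idx) % n_port + 1;
	return 0;
}

/* Mirrors the range check added to Tx queue setup: 0..n_port are accepted. */
int
tx_affinity_is_valid(uint8_t tx_affinity, unsigned int n_port)
{
	return tx_affinity <= n_port;
}
```

In this model the value 0 is never a port number; valid hardware ports start at 1, which is the numbering the API comment would need to state.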
 
> If yes, I think the word "affinity" does not convey the right idea.
> And again, the naming should give the idea that we are talking about multiple
> ports merged in one DPDK port.
> 
OK, how about 'tx_mhpsdp_hwport'?
'mhpsdp' is as mentioned before, and 'hwport' stands for one hardware port.
> > > > Adds the new tx_affinity field into the padding hole of
> > > > rte_eth_txconf structure, the size of rte_eth_txconf keeps the
> > > > same. Adds a suppress type for structure change in the ABI check file.
> > > >
> > > > This patch adds the testpmd command line:
> > > > testpmd> port config (port_id) txq (queue_id) affinity (value)
> > > >
> > > > For example, there're two hardware ports connects to a single DPDK
> > >
> > > connects -> connected
> >
> > OK, will fix in next version.
> >
> > > > port (port id 0), and affinity 1 stood for hard port 1 and
> > > > affinity
> > > > 2 stood for hardware port 2, used the below command to config tx
> > > > affinity for each TxQ:
> > > > 	port config 0 txq 0 affinity 1
> > > > 	port config 0 txq 1 affinity 1
> > > > 	port config 0 txq 2 affinity 2
> > > > 	port config 0 txq 3 affinity 2
> > > >
> > > > These commands config the TxQ index 0 and TxQ index 1 with
> > > > affinity 1, uses TxQ 0 or TxQ 1 send packets, these packets will
> > > > be sent from the hardware port 1, and similar with hardware port 2
> > > > if sending packets with TxQ 2 or TxQ 3.
> > >
> > > [...]
> > > > @@ -212,6 +212,10 @@ API Changes
> > > > +* ethdev: added a new field:
> > > > +
> > > > +  - Tx affinity per-queue ``rte_eth_txconf.tx_affinity``
> > >
> > > Adding a new field is not an API change because existing
> > > applications don't need to update their code if they don't care this new field.
> > > I think you can remove this note.
> >
> > OK, will remove in next version.
> >
> > > > --- a/lib/ethdev/rte_ethdev.h
> > > > +++ b/lib/ethdev/rte_ethdev.h
> > > > @@ -1138,6 +1138,7 @@ struct rte_eth_txconf {
> > > >  				      less free descriptors than this value. */
> > > >
> > > >  	uint8_t tx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
> > > > +	uint8_t tx_affinity; /**< Drives the setting of affinity per-queue. */
> > >
> > > Why "Drives"? It is the setting, right?
> > > rte_eth_txconf is per-queue so no need to repeat.
> > > I think a good comment here would be to mention it is a physical
> > > port index for mhpsdp.
> > > Another good comment would be to specify how ports are numbered.
> >
> > OK, will update the comment for this new setting.
> >
> > Thanks.
> 
> 
* RE: [RFC 1/5] ethdev: add port affinity match item
  2023-01-18 16:26       ` Thomas Monjalon
@ 2023-01-24 14:00         ` Jiawei(Jonny) Wang
  0 siblings, 0 replies; 16+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-01-24 14:00 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon (EXTERNAL), Ori Kam
  Cc: Slava Ovsiienko, Aman Singh, Yuying Zhang, Ferruh Yigit,
	Andrew Rybchenko, dev, Raslan Darawsheh, Ivan Malov, jerinj
Hi,
> > >
> > > 21/12/2022 11:29, Jiawei Wang:
> > > > +	/**
> > > > +	 * Matches on the physical port affinity of the received packet.
> > > > +	 *
> > > > +	 * See struct rte_flow_item_port_affinity.
> > > > +	 */
> > > > +	RTE_FLOW_ITEM_TYPE_PORT_AFFINITY,
> > > >  };
> > >
> > > I'm not sure about the word "affinity".
> > > I think you want to match on a physical port.
> > > It could be a global physical port id or an index in the group of
> > > physical ports connected to a single DPDK port.
> > > In first case, the name of the item could be
> > > RTE_FLOW_ITEM_TYPE_PHY_PORT, in the second case, the name could be
> > > RTE_FLOW_ITEM_TYPE_MHPSDP_PHY_PORT,
> > > "MHPSDP" meaning "Multiple Hardware Ports - Single DPDK Port".
> > > We could replace "PHY" with "HW" as well.
> > >
> >
> > Since DPDK only probe/attach the single port, seems first case does not meet
> > this case.
> > Here, 'affinity' stands for the packet association with actual physical port.
> 
> I think it is more than affinity because the packet is effectively received from
> this port.
> And the other concern is that this name does not give any clue that we are
> talking about multiple ports merged in a single one.
> 
Would RTE_FLOW_ITEM_TYPE_MHPSDP_HW_PORT be better? @Ori Kam, what do you think?
> > > Note that we cannot use the new item
> > > RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT
> > > because we are in a case where multiple hardware ports are merged in
> > > a single software represented port.
> > >
> > >
> > > [...]
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this structure may change without prior notice
> > > > + *
> > > > + * RTE_FLOW_ITEM_TYPE_PORT_AFFINITY
> > > > + *
> > > > + * For the multiple hardware ports connect to a single DPDK port (mhpsdp),
> > > > + * use this item to match the hardware port affinity of the packets.
> > > > + */
> > > > +struct rte_flow_item_port_affinity {
> > > > +	uint8_t affinity; /**< port affinity value. */
> > > > +};
> > >
> > > We need to define how the port numbering is done.
> > > Is it driver-dependent?
> > > Does it start at 0? etc...
> >
> > User can define any value they want; one use case is the packet could
> > be received and sent to same port, then they can set the same 'affinity' value
> > in flow and queue configuration.
> 
> No it does not work.
> If ports are numbered 1 and 2, and user thinks it is 0 and 1, the port 2 won't be
> matched at all.
> 
OK, I can update the documentation to state that affinity 0 means no affinity
on the Tx side, and that matching on affinity 0 will result in an error.
In the case above, the user should use 1 and 2 to match.
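A minimal sketch of that numbering convention, assuming hardware ports are counted from 1 and that 0 is reserved for "no affinity". The names `EX_AFFINITY_NONE`, `ex_hwport_to_affinity()` and `ex_match_affinity_is_valid()` are hypothetical and only illustrate the rule; the real numbering may still be driver dependent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 0 is reserved and means "no affinity" on the Tx side. */
#define EX_AFFINITY_NONE 0u

/* Convert a zero-based hardware port index into a one-based affinity
 * value. The caller is assumed to pass an index below 255. */
static inline uint8_t ex_hwport_to_affinity(uint8_t hwport_index)
{
	return (uint8_t)(hwport_index + 1u);
}

/* A value is valid for matching only if it names one of the n_hwports
 * hardware ports; matching on 0 is rejected. */
static inline bool ex_match_affinity_is_valid(uint8_t affinity,
					      uint8_t n_hwports)
{
	return affinity != EX_AFFINITY_NONE && affinity <= n_hwports;
}
```

With two hardware ports, only 1 and 2 pass the check, so a user who assumes the ports are numbered 0 and 1 would get an error for 0 instead of a rule that silently never matches.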
> > The flow behavior is driver dependent.
> >
> > Thanks.
> 
> 
> 
> 
Thread overview: 16+ messages
2022-12-21 10:29 [RFC 0/5] add new port affinity item and affinity in Tx queue API Jiawei Wang
2022-12-21 10:29 ` [RFC 1/5] ethdev: add port affinity match item Jiawei Wang
2023-01-11 16:41   ` Ori Kam
2023-01-18 11:07   ` Thomas Monjalon
2023-01-18 14:41     ` Jiawei(Jonny) Wang
2023-01-18 16:26       ` Thomas Monjalon
2023-01-24 14:00         ` Jiawei(Jonny) Wang
2022-12-21 10:29 ` [RFC 2/5] ethdev: introduce the affinity field in Tx queue API Jiawei Wang
2023-01-11 16:47   ` Ori Kam
2023-01-18 11:37   ` Thomas Monjalon
2023-01-18 14:44     ` Jiawei(Jonny) Wang
2023-01-18 16:31       ` Thomas Monjalon
2023-01-24 13:32         ` Jiawei(Jonny) Wang
2022-12-21 10:29 ` [RFC 3/5] drivers: add lag Rx port affinity in PRM Jiawei Wang
2022-12-21 10:29 ` [RFC 4/5] net/mlx5: add port affinity item support Jiawei Wang
2022-12-21 10:29 ` [RFC 5/5] drivers: enhance the Tx queue affinity Jiawei Wang